Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] NodeVersionAllocationDecider should get Replication Type from index meta instead of node settings #12744

Open
KunjueYu opened this issue Mar 19, 2024 · 2 comments · May be fixed by #12811
Labels
bug Something isn't working Cluster Manager Indexing:Replication Issues and PRs related to core replication framework eg segrep Storage:Remote

Comments

@KunjueYu
Copy link

KunjueYu commented Mar 19, 2024

Describe the bug

The class org.opensearch.cluster.routing.allocation.decider.NodeVersionAllocationDecider misuses the node settings to get the replication type value. As INDEX_REPLICATION_TYPE_SETTING is an indexScope setting, we should get the value from the index metadata.

public class NodeVersionAllocationDecider extends AllocationDecider {

    public static final String NAME = "node_version";

    private final ReplicationType replicationType;

    public NodeVersionAllocationDecider(Settings settings) {
        replicationType = IndexMetadata.INDEX_REPLICATION_TYPE_SETTING.get(settings);
    }

    @Override
    public Decision canAllocate(ShardRouting shardRouting, RoutingNode node, RoutingAllocation allocation) {
        if (shardRouting.primary()) {
            if (replicationType == ReplicationType.SEGMENT) {

Related component

Cluster Manager

To Reproduce

  1. build a cluster with different node versions.
  2. create a index with segment replication enabled.
  3. allocate the primary shard to a higher-version node while the corresponding replica shard on a lower-version node.
  4. Contrary to expectations, the allocation will succeed.

Expected behavior

I believe the correct code should be like

public class NodeVersionAllocationDecider extends AllocationDecider {

    public static final String NAME = "node_version";

    public NodeVersionAllocationDecider() {
    }

    @Override
    public Decision canAllocate(ShardRouting shardRouting, RoutingNode node, RoutingAllocation allocation) {
        if (shardRouting.primary()) {
            IndexMetadata indexMd = allocation.metadata().getIndexSafe(shardRouting.index());
            final ReplicationType replicationType = IndexMetadata.INDEX_REPLICATION_TYPE_SETTING.get(indexMd.getSettings());
            if (replicationType == ReplicationType.SEGMENT) {
                List<ShardRouting> replicas = allocation.routingNodes()

Additional Details

Plugins
Please list all plugins currently enabled.

Screenshots
If applicable, add screenshots to help explain your problem.

Host/Environment (please complete the following information):

  • OS: [e.g. iOS]
  • Version [e.g. 22]

Additional context
Add any other context about the problem here.

@andrross
Copy link
Member

[Triage - attendees 1 2 3 4 5 6]
@KunjueYu Thanks for filing! Any interest in putting together a PR to fix this issue along with a test that replicates it?

@KunjueYu
Copy link
Author

I have opened a pr. I'd appreciate it if you could review it.

[Triage - attendees 1 2 3 4 5 6] @KunjueYu Thanks for filing! Any interest in putting together a PR to fix this issue along with a test that replicates it?

@rajiv-kv rajiv-kv moved this from 🆕 New to 🏗 In progress in Cluster Manager Project Board Nov 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Cluster Manager Indexing:Replication Issues and PRs related to core replication framework eg segrep Storage:Remote
Projects
Status: 🏗 In progress
Status: 🆕 New
Development

Successfully merging a pull request may close this issue.

3 participants