Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add conflict markers to the lock file #9370

Merged
merged 5 commits into from
Dec 10, 2024
Merged

add conflict markers to the lock file #9370

merged 5 commits into from
Dec 10, 2024

Conversation

BurntSushi
Copy link
Member

@BurntSushi BurntSushi commented Nov 22, 2024

This PR adds a notion of "conflict markers" to the lock file as an
attempt to address #9289. The idea is to encode a new kind of boolean
expression indicating how to choose dependencies based on which extras
are activated.

As an example of what conflict markers look like, consider one of the cases
brought up in #9289, where anyio had unconditional dependencies on
two different versions of idna. Now, those are gated by markers, like this:

        [[package]]
        name = "anyio"
        version = "4.3.0"
        source = { registry = "https://pypi.org/simple" }
        dependencies = [
            { name = "idna", version = "3.5", source = { registry = "https://pypi.org/simple" }, marker = "extra == 'extra-7-project-foo'" },
            { name = "idna", version = "3.6", source = { registry = "https://pypi.org/simple" }, marker = "extra == 'extra-7-project-bar' or extra != 'extra-7-project-foo'" },
            { name = "sniffio" },
        ]

The odd extra values like extra-7-project-foo are an encoding of not just the conflicting extra (foo) but also the package it's declared for (project). We need both bits of information because different packages may have the same extra name, even if they are completely unrelated. The extra- part is a prefix to distinguish it from groups (which, in this case, would be encoded as group-7-project-foo if foo were a dependency group). And the 7 part indicates the length of the package name which makes it possible to parse out the package and extra name from this encoding. (We don't actually utilize that property, but it seems like good sense to do it in case we do need to extra information from these markers.)

While this preserves PEP 508 compatibility at a surface level, it does require utilizing this encoding scheme in order
to evaluate them when they're present (which only occurs when conflicting extras/groups are declared).

My sense is that the most complex part of this change is not just adding conflict markers, but their simplification. I tried to address this in the code comments and commit messages.

Reviewers should look at this commit-by-commit.

Fixes #9289, Fixes #9546, Fixes #9640, Fixes #9622, Fixes #9498, Fixes #9701, Fixes #9734

@BurntSushi
Copy link
Member Author

The relevant change to the lock file from #9289 (comment) is here:

[[package]]
name = "anyio"
version = "4.6.2.post1"
source = { registry = "https://pypi.org/simple" }
dependencies = [
    { name = "idna", version = "3.9", source = { registry = "https://pypi.org/simple" }, conflict-marker = "extra == 'bar' or extra != 'foo'" },
    { name = "idna", version = "3.10", source = { registry = "https://pypi.org/simple" }, conflict-marker = "extra == 'foo'" },
    { name = "sniffio" },
]

Notice the new conflict-marker field. It doesn't appear on any other entry in the lock file. (That was one of the hardest parts of this PR.)

@@ -3867,7 +3867,7 @@ fn lock_conflicting_mixed() -> Result<()> {

[package.dev-dependencies]
project1 = [
{ name = "sortedcontainers", version = "2.3.0", source = { registry = "https://pypi.org/simple" } },
{ name = "sortedcontainers", version = "2.3.0", source = { registry = "https://pypi.org/simple" }, conflict-marker = "extra != 'project2'" },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So these are going to show up in the lockfile? In that case, I remain somewhat biased towards using group = rather than encoding the names in the string value.

Copy link
Member Author

@BurntSushi BurntSushi Nov 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Encoding the names in the string value would be an internal detail not exposed to the lock file. It would be written like it is here, and when groups are added, it would be group != 'foobar'.

(My three choices given on Discord all have the same behavior with respect to the lock file and how the conflict markers are written. The only differences between them are how we implement it.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh ok, ack.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To follow up here: when I wrote the above comment, I was still thinking we'd have "normal PEP 508 marker" and "conflict marker" as separate things in the lock file. And in that scenario, it seemed more justifiable to have our own little custom syntax for the conflict marker that we mapped back to PEP 508 markers internally. But since then, I've come to understand that these markers need to be combined since the conflict markers themselves can be dependent on the PEP 508 markers.

I didn't want to make the markers no longer parseable by PEP 508... Maybe that's something we should do in the future, but I felt it was good not to break that property at least for now. But that does mean that we end up with some funny looking markers that I was originally trying to avoid. (The comments and commit messages give more detail.)

@@ -19421,8 +19421,6 @@ fn lock_multiple_sources_index_disjoint_extras_with_marker() -> Result<()> {
version = "3.1.3"
source = { registry = "https://download.pytorch.org/whl/cu124" }
resolution-markers = [
"sys_platform == 'darwin'",
"sys_platform != 'darwin'",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a bug, perhaps? Why write it when it's empty?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think writing it when it's empty is probably an extant bug, perhaps exacerbated by conflicting extras. (Or maybe even only made possible by it.) I can look into it.

But as for deleting the markers here, I believe it's correct. I wrote up an explanation for it in the commit message.

/// For example, given an edge like `foo[x1] -> bar`, then it is known that
/// `x1` is activated. This in turn can be used to simplify any downstream
/// conflict markers with `extra == "x1"` in them.
pub(crate) fn simplify_conflict_markers(graph: &mut Graph<ResolutionGraphNode, UniversalMarker>) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I need to see an example of this.

@BurntSushi
Copy link
Member Author

OK, this PR should actually be ready for some very rough testing, but still only with extras, not groups. At the very least, this PR does currently fix #9289:

[andrew@duff mre]$ run-uv pr1 -- sync
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.13s
Using CPython 3.11.9
Creating virtual environment at: .venv
Resolved 5 packages in 10ms
Audited in 0.01ms
[andrew@duff mre]$ run-uv pr1 -- sync --extra baz
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.13s
Resolved 5 packages in 2ms
Installed 3 packages in 1ms
 + anyio==4.6.2.post1
 + idna==3.9
 + sniffio==1.3.1
[andrew@duff mre]$ run-uv pr1 -- sync --extra baz --extra bar
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.13s
Resolved 5 packages in 2ms
Audited 3 packages in 0.02ms
[andrew@duff mre]$ run-uv pr1 -- sync --extra baz --extra foo
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.13s
Resolved 5 packages in 2ms
Uninstalled 1 package in 0.62ms
Installed 1 package in 0.85ms
 - idna==3.9
 + idna==3.10
[andrew@duff mre]$ run-uv pr1 -- sync --extra bar
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.13s
Resolved 5 packages in 2ms
Uninstalled 3 packages in 0.91ms
Installed 1 package in 0.83ms
 - anyio==4.6.2.post1
 - idna==3.10
 + idna==3.9
 - sniffio==1.3.1
[andrew@duff mre]$ run-uv pr1 -- sync --extra foo
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.12s
Resolved 5 packages in 2ms
Uninstalled 1 package in 0.61ms
Installed 1 package in 0.86ms
 - idna==3.9
 + idna==3.10

If we can get some confidence in this technique with extras, then I'm pretty sure I know how to extend it to groups.

I think the main thing I'm worried about right now is that the logic in this PR might be treating extras as a global resource. So if your top-level package foo has an extra x1 and some random transitive dependency bar also has an extra named x1 that isn't enabled (directly or indirectly by enabling foo[x1]), then I think my current logic will falsely assume that bar[x1] is active. So that could result in following dependency edges that aren't actually activated.

I believe this can be resolved by enhancing conflict markers to include package names instead of just extra and group names. But this also doesn't feel right to me, for reasons that I'm finding difficult to articulate. Another hiccup here is ExtrasSpecification and DevGroupsManifest. As far as I can tell, I am likely misusing it (in the last commit in this PR). It seems like it is only intended to be used with the root package, but in this PR, I am using it for any arbitrary dependency to check which extras are active or not. So there are probably still some knots here to untangle, but I think it's probably an evolution on what I've done so far.

@BurntSushi
Copy link
Member Author

OK, I think I found a case where treating extras as more "global" than they ought to be leads to incorrect behavior.

Consider this pyproject.toml:

[project]
name = "project"
version = "0.1.0"
requires-python = ">=3.11,<3.12"
dependencies = [
  "proxy1",
  "anyio>=4",
]

[tool.uv.workspace]
members = ["proxy1"]

[project.optional-dependencies]
x1 = ["idna==3.9"]

[tool.uv.sources]
proxy1 = { workspace = true }

[tool.uv]
conflicts = [
  [
    {package = "project", extra = "x1"},
    {package = "proxy1", extra = "x2"},
    {package = "proxy1", extra = "x3"},
  ],
]

And this at proxy1/pyproject.toml:

[project]
name = "proxy1"
version = "0.1.0"
requires-python = ">=3.11,<3.12"
dependencies = []

[project.optional-dependencies]
x2 = ["idna==3.8"]
x3 = ["idna==3.7"]

And now do a sync:

$ rm -rf .venv uv.lock && run-uv pr1 -- sync
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.13s
Using CPython 3.11.9
Creating virtual environment at: .venv
Resolved 7 packages in 13ms
Installed 3 packages in 2ms
 + anyio==4.6.2.post1
 + idna==3.7
 + sniffio==1.3.1

and notice that idna==3.7 gets installed. I think it would be correct for either idna==3.7 or idna==3.8 to be installed here, but the point is to notice the version that gets picked. Now let's watch what happens when we enable extra x2:

$ rm -rf .venv uv.lock && run-uv pr1 -- sync --extra x2
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.13s
Using CPython 3.11.9
Creating virtual environment at: .venv
Resolved 7 packages in 54ms
Installed 3 packages in 1ms
 + anyio==4.6.2.post1
 + idna==3.8
 + sniffio==1.3.1

In this case, idna==3.8 gets installed, because that's the dependency attached to the x2 extra in the proxy1 package. But, crucially, notice that the top-level package does not have an extra named x2. So I believe --extra x2 should have no effect whatsoever. So the fact that --extra x2 influences which package is installed here seems wrong. I believe it should have no effect.

So I guess our conflict markers need to include not just the extra/group name, but the package activating it. I think that would solve this.

@charliermarsh
Copy link
Member

Nice.

@BurntSushi BurntSushi added the internal A refactor or improvement that is not user-facing label Nov 23, 2024
@BurntSushi BurntSushi changed the base branch from main to ag/moar-refactoring November 23, 2024 15:59
Base automatically changed from ag/moar-refactoring to main November 23, 2024 18:14
@BurntSushi BurntSushi force-pushed the ag/i9289 branch 4 times, most recently from e3ef041 to f0fa231 Compare November 26, 2024 19:58
@BurntSushi
Copy link
Member Author

OK, I finally have #9289 fixed in this PR (including all examples in that issue and some other examples that I came up with). I do not believe there are any remaining known cases of ambiguity for extras (I haven't handled groups yet), so if anyone is willing to try and "break" this PR with just conflicting extras, that would be great.

This PR is not in a mergeable state. So my next focus is to get it into one.

One major change in this PR is that, while the markers being written to the lock file are still PEP 508 compatible, they may now include some very odd extra values. I haven't fully fleshed it out in this PR yet, but they will ultimately look something like platform_system == 'Darwin' and (extra == 'extra-0-7-8-11-pytorch-cpu' or extra == 'group-0-7-8-11-pytorch-wat'. The reasoning for this (and the precise format shown here) is pretty complicated. There are other points in the design space, but I believe they all require abdicating PEP 508 for markers in the lock file. The high level reasoning here is that some edges in the lock file, when conflicting extras/groups are specified, must be conditionally walked at installation time depending on which extras/groups are activated. And that conditional logic can actually vary depending on platform, which means that the conditional logic for extras/groups has to be married to the conditional logic for the current platform.

/// then its "universal" marker looks like this:
///
/// ```text
/// sys_platform == 'Darwin' and extra == 'extra-3-foo-x1'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is "3" here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought I explained this somewhere, but it might have only been in a commit message. So I added a code comment here too:

    /// Why `extra-3-foo-x1`?
    ///
    /// * The `extra` prefix is there to distinguish it from `group`.
    /// * The `3` is there to indicate the length of the package name,
    /// in bytes. This isn't strictly necessary for encoding, but is required
    /// if we were ever to need to decode a package and extra/group name from a
    /// conflict marker.
    /// * The `foo` package name ensures we namespace the extra/group name,
    /// since multiple packages can have the same extra/group name.

Does that help?

// character in `package` or `extra` values. But if we know the length of
// the package name, we can always parse each field unambiguously.
let package_len = package.as_str().len();
ExtraName::new(format!("extra-{package_len}-{package}-{extra}")).unwrap()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh ok. I see now. It's sort of unfortunate that this representation is "user-facing" in the lockfile.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know that I have a better answer. We could do something that's legal but not normalized...? Like use dashes for one piece?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And yes, I 100% agree that this is unfortunate. I just don't know of a better way to go about it without inventing our own marker system.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we bound to PEP 508, or could we do something like adding a sigil that only allowed here and in the lockfile but gets rejected everywhere else, or even as a specific new marker type? I see the problem with the MarkerTree being nested and such, but this looks like a potential collision and it's user facing

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but this looks like a potential collision and it's user facing

I find this very difficult to reason about. On the one hand, it is certainly true that someone could use extra-3-pkg-foo as an extra name, and thus potentially collide with a separately defined extra named foo for the package pkg. (I think such cases would be pathological given the baroqueness of extra-3-pkg-foo. You not only need an extra with that name but it also has to be declared as a conflicting extra.) On the other hand, even if a collision does occur, it's actually not totally clear to me that it would cause any problems. That's because we tend to strip the "normal" extra == 'foo' from markers at various points.

Are we bound to PEP 508, or could we do something like adding a sigil that only allowed here and in the lockfile but gets rejected everywhere else, or even as a specific new marker type?

A new marker type is possible (I think I suggested a new marker type or even something that isn't a marker somewhere), but I believe it's costly in terms of implementation effort. And even if we do have a new marker type, there is enough inter-operation going on that we'd need to be able to convert between the marker types. I think it's doable, but yeah, a fair bit of work.

Some extra sigil that isn't typically allowed in extra names is also possible... I think it's probably less work than adding a new marker type. And the implementation kinda-sorta already makes room for it. There's definitely some refactoring cost involved here because we wouldn't be able to use ExtraName or GroupName. So that means new APIs for evaluating markers with something looser than &[ExtraName]. And new parsing APIs to "allow" the sigil in certain cases but continue to reject it in others. Overall possible, but I'd say it's at least a couple days of work.

Since I think that, in order for a collision to occur, it has to be pathological, I'm not sure it's worth doing. One thing I can do as a mitigation is add a test that specifically provokes a collision and ensure behavior is reasonable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we bound to PEP 508

I think we are bound to PEP 508 insomuch as we want to be. While these conflict markers are technically valid PEP 508 markers, I think they are definitely a step toward PEP 508 incompatibility since they require an encoding scheme to use properly.

Copy link
Member Author

@BurntSushi BurntSushi Dec 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a test that instigates a collision and things seem to be okay:

/// This tests a somewhat pathological case where a package has an extra whose
/// name corresponds to uv's conflicting extra encoding of another extra. That
/// is, an extra `foo` and an extra `extra-3-pkg-foo`.
///
/// In theory these could collide and cause problems. But in practice, we don't
/// involve the `extra == "foo"` marker in the same places, I believe, as we do
/// `extra == "extra-3-pkg-foo"`.
///
/// Ref: https://github.com/astral-sh/uv/pull/9370#discussion_r1876083284
#[test]
fn collision_extra() -> Result<()> {
let context = TestContext::new("3.12");
let pyproject_toml = context.temp_dir.child("pyproject.toml");
pyproject_toml.write_str(
r#"
[project]
name = "pkg"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.12"
dependencies = ["anyio"]
[project.optional-dependencies]
foo = ["idna==3.5"]
bar = ["idna==3.6"]
extra-3-pkg-foo = ["sortedcontainers>=2"]
[tool.uv]
conflicts = [
[
{ extra = "foo" },
{ extra = "bar" },
],
]
"#,
)?;
uv_snapshot!(context.filters(), context.lock(), @r###"
success: true
exit_code: 0
----- stdout -----
----- stderr -----
Resolved 6 packages in [TIME]
"###);
let lock = context.read("uv.lock");
insta::with_settings!({
filters => context.filters(),
}, {
assert_snapshot!(
lock,
@r###"
version = 1
requires-python = ">=3.12"
resolution-markers = [
]
conflicts = [[
{ package = "pkg", extra = "foo" },
{ package = "pkg", extra = "bar" },
]]
[options]
exclude-newer = "2024-03-25T00:00:00Z"
[[package]]
name = "anyio"
version = "4.3.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "idna", version = "3.5", source = { registry = "https://pypi.org/simple" }, marker = "extra == 'extra-3-pkg-foo'" },
{ name = "idna", version = "3.6", source = { registry = "https://pypi.org/simple" }, marker = "extra == 'extra-3-pkg-bar' or extra != 'extra-3-pkg-foo'" },
{ name = "sniffio" },
]
sdist = { url = "https://files.pythonhosted.org/packages/db/4d/3970183622f0330d3c23d9b8a5f52e365e50381fd484d08e3285104333d3/anyio-4.3.0.tar.gz", hash = "sha256:f75253795a87df48568485fd18cdd2a3fa5c4f7c5be8e5e36637733fce06fed6", size = 159642 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/14/fd/2f20c40b45e4fb4324834aea24bd4afdf1143390242c0b33774da0e2e34f/anyio-4.3.0-py3-none-any.whl", hash = "sha256:048e05d0f6caeed70d731f3db756d35dcc1f35747c8c403364a8332c630441b8", size = 85584 },
]
[[package]]
name = "idna"
version = "3.5"
source = { registry = "https://pypi.org/simple" }
resolution-markers = [
]
sdist = { url = "https://files.pythonhosted.org/packages/9b/c4/db3e4b22ebc18ee797dae8e14b5db68e5826ae6337334c276f1cb4ff84fb/idna-3.5.tar.gz", hash = "sha256:27009fe2735bf8723353582d48575b23c533cc2c2de7b5a68908d91b5eb18d08", size = 64640 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/ea/65/9c7a31be86861d43da3d4f8661f677b38120320540773a04979ad6fa9ecd/idna-3.5-py3-none-any.whl", hash = "sha256:79b8f0ac92d2351be5f6122356c9a592c96d81c9a79e4b488bf2a6a15f88057a", size = 61566 },
]
[[package]]
name = "idna"
version = "3.6"
source = { registry = "https://pypi.org/simple" }
resolution-markers = [
]
sdist = { url = "https://files.pythonhosted.org/packages/bf/3f/ea4b9117521a1e9c50344b909be7886dd00a519552724809bb1f486986c2/idna-3.6.tar.gz", hash = "sha256:9ecdbbd083b06798ae1e86adcbfe8ab1479cf864e4ee30fe4e46a003d12491ca", size = 175426 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/c2/e7/a82b05cf63a603df6e68d59ae6a68bf5064484a0718ea5033660af4b54a9/idna-3.6-py3-none-any.whl", hash = "sha256:c05567e9c24a6b9faaa835c4821bad0590fbb9d5779e7caa6e1cc4978e7eb24f", size = 61567 },
]
[[package]]
name = "pkg"
version = "0.1.0"
source = { virtual = "." }
dependencies = [
{ name = "anyio" },
]
[package.optional-dependencies]
bar = [
{ name = "idna", version = "3.6", source = { registry = "https://pypi.org/simple" } },
]
extra-3-pkg-foo = [
{ name = "sortedcontainers" },
]
foo = [
{ name = "idna", version = "3.5", source = { registry = "https://pypi.org/simple" } },
]
[package.metadata]
requires-dist = [
{ name = "anyio" },
{ name = "idna", marker = "extra == 'bar'", specifier = "==3.6" },
{ name = "idna", marker = "extra == 'foo'", specifier = "==3.5" },
{ name = "sortedcontainers", marker = "extra == 'extra-3-pkg-foo'", specifier = ">=2" },
]
[[package]]
name = "sniffio"
version = "1.3.1"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/a2/87/a6771e1546d97e7e041b6ae58d80074f81b7d5121207425c964ddf5cfdbd/sniffio-1.3.1.tar.gz", hash = "sha256:f4324edc670a0f49750a81b895f35c3adb843cca46f0530f79fc1babb23789dc", size = 20372 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/e9/44/75a9c9421471a6c4805dbf2356f7c181a29c1879239abab1ea2cc8f38b40/sniffio-1.3.1-py3-none-any.whl", hash = "sha256:2f6da418d1f1e0fddd844478f41680e794e6051915791a034ff65e5f100525a2", size = 10235 },
]
[[package]]
name = "sortedcontainers"
version = "2.4.0"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/e8/c4/ba2f8066cceb6f23394729afe52f3bf7adec04bf9ed2c820b39e19299111/sortedcontainers-2.4.0.tar.gz", hash = "sha256:25caa5a06cc30b6b83d11423433f65d1f9d76c4c6a0c90e3379eaa43b9bfdb88", size = 30594 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/32/46/9cb0e58b2deb7f82b84065f37f3bffeb12413f947f9388e4cac22c4621ce/sortedcontainers-2.4.0-py2.py3-none-any.whl", hash = "sha256:a163dcaede0f1c021485e957a39245190e74249897e2ae4b2aa38595db237ee0", size = 29575 },
]
"###
);
});
uv_snapshot!(context.filters(), context.sync().arg("--frozen"), @r###"
success: true
exit_code: 0
----- stdout -----
----- stderr -----
Prepared 3 packages in [TIME]
Installed 3 packages in [TIME]
+ anyio==4.3.0
+ idna==3.6
+ sniffio==1.3.1
"###);
// The extra `extra-3-pkg-foo` is meant to collide with the encoded
// extra name generated by the extra `foo`. When `foo` is enabled,
// we expect to see `idna==3.5`, but when `extra-3-pkg-foo` is enabled,
// we don't. Instead, we should just see `anyio` and `sortedcontainers`
// installed.
uv_snapshot!(
context.filters(),
context.sync().arg("--frozen").arg("--extra=extra-3-pkg-foo"),
@r###"
success: true
exit_code: 0
----- stdout -----
----- stderr -----
Prepared 1 package in [TIME]
Installed 1 package in [TIME]
+ sortedcontainers==2.4.0
"###
);
// Verify that activating `foo` does result in `idna==3.5`.
uv_snapshot!(
context.filters(),
context.sync().arg("--frozen").arg("--extra=foo"),
@r###"
success: true
exit_code: 0
----- stdout -----
----- stderr -----
Prepared 1 package in [TIME]
Uninstalled 2 packages in [TIME]
Installed 1 package in [TIME]
- idna==3.6
+ idna==3.5
- sortedcontainers==2.4.0
"###
);
// And that activating both is fine and dandy. We get `idna==3.5`
// and `sortedcontainers`.
uv_snapshot!(
context.filters(),
context.sync().arg("--frozen").arg("--extra=extra-3-pkg-foo").arg("--extra=foo"),
@r###"
success: true
exit_code: 0
----- stdout -----
----- stderr -----
Installed 1 package in [TIME]
+ sortedcontainers==2.4.0
"###
);
Ok(())
}

I think it's because we don't tend to have both extra == "foo" and extra == "pkg-3-foo-extra" in the same marker at the same time. (And I use the weasel words "tend to" because I find it very hard to reason about this.)

@BurntSushi
Copy link
Member Author

@konstin I'm going to hold off merging this until you're able to review it since this is making user visible changes to the lock file. I want to give folks a chance to give input before this gets merged, because once this goes out to end users, making changes at that point will become more annoying (or provoke churn).

@NellyWhads
Copy link

Confirmed to resolve #9701 as well.

Copy link
Member

@konstin konstin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should fix the graph traversal order (if my example is correct), other looks good.

)?;

let mut cmd = context.sync();
// I guess --exclude-newer doesn't work with the torch indices?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The torch indices are missing the upload date

.collect();

let mut seen: FxHashSet<NodeIndex> = FxHashSet::default();
while let Some(parent_index) = queue.pop() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't the propagated value set order dependent? Below, the green and the red items from A and C are propagated through, but B -> D happens before E -> D, so B -> D isn't propagated. Numbers indicate the order of edge traversals.

PXL_20241209_130608932~2

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be easier if we started at the conflicting nodes and propagated knowledge about the "side" of the conflict through from them? E.g.

[tool.uv]
conflicts = [
    [
      { extra = "extra1" },
      { extra = "extra2" },
    ],
]

we could start at project[extra1] and project[extra2] respectively, mark every node with (project, extra1) or (project, extra2) respectively and then simplify the outgoing edges of all nodes that only have either marker. This might make the world-knowledge easier to move in? (Caveat: That's from reading code only)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be fixed in 8a5ce95

I opted to fix this by revisiting nodes when the set of extras gets updated.

// character in `package` or `extra` values. But if we know the length of
// the package name, we can always parse each field unambiguously.
let package_len = package.as_str().len();
ExtraName::new(format!("extra-{package_len}-{package}-{extra}")).unwrap()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we bound to PEP 508, or could we do something like adding a sigil that only allowed here and in the lockfile but gets rejected everywhere else, or even as a specific new marker type? I see the problem with the MarkerTree being nested and such, but this looks like a potential collision and it's user facing

@BurntSushi BurntSushi force-pushed the ag/i9289 branch 3 times, most recently from ba47e18 to b5ef4c6 Compare December 9, 2024 15:37
Some of these are regression tests or related regression tests
for #9289. Others are regression tests for bugs I found while working
on #9289.

Most of the outputs for these tests are wrong. For example, some tests
are asserting that multiple versions of the same package are installed.
But we add them now before our fixes so that we can see how they
change.
This commit adds a new concept to uv: conflict markers. Conflict markers
resolve the ambiguity created in lock files, in some cases, when
conflicting extras/groups are activated. Conflict markers allow us to
make edges in the dependency graph conditional based only on which
extras are and aren't enabled.

Adding conflict markers themselves turned out to be somewhat easy. But
what was hard was simplifying them. A good portion of the change in this
commit is dedicated to simplification. We do two different kinds:

1. We imbibe "world knowledge" into the marker. That is, we take what we
   know about which extras/groups cannot be enabled simultaneously, then
   we simplify the markers based on this.
2. We walk the dependency graph and annotate which extras/groups are
   activated for each node for all possible paths into that node. When
   an extra (e.g., `foo`) is activated (or not activated) for all
   possible such paths, then we can replace `extra == 'foo'` (or
   `extra != 'foo'`) with `true` in any corresponding conflict marker
   expessions.

Without these simplifications, it's very easy for conflict markers to
appear almost everywhere in a lock file (when conflicting extras/groups
are declared). Many of which are trivially true and aren't actually
resolving any ambiguity and thus aren't needed.
These are all of the fixes to what's installed and the lock files that
we expected.
These are mostly just copying the existing tests for extras, but
replacing extras with groups. And then creating another set (where it
makes sense to) for tests with a conflict between an extra and a group.
BurntSushi added a commit that referenced this pull request Dec 10, 2024
This fixes a bug found by Konsti[1] where not all extras would propagate
correctly. This could lead to improper simplification steps.

The fix here is to revisit nodes in the graph unless no changes have
been made to the set of enabled extras.

We also add a regression test whose snapshot changes without this fix. I
tried writing a test case by hand but couldn't figure it out. The
original pytorch MRE in #9289 also has a different lock file with this
fix, but we really shouldn't be adding more pytorch tests given how
beefy they are. So I found this case using airflow, which is also beefy,
but hopefully good enough.

[1]: #9370 (comment)
@BurntSushi BurntSushi requested a review from konstin December 10, 2024 15:08
This fixes a bug found by Konsti[1] where not all extras would propagate
correctly. This could lead to improper simplification steps.

The fix here is to revisit nodes in the graph unless no changes have
been made to the set of enabled extras.

We also add a regression test whose snapshot changes without this fix. I
tried writing a test case by hand but couldn't figure it out. The
original pytorch MRE in #9289 also has a different lock file with this
fix, but we really shouldn't be adding more pytorch tests given how
beefy they are. So I found this case using airflow, which is also beefy,
but hopefully good enough.

[1]: #9370 (comment)
@konstin
Copy link
Member

konstin commented Dec 10, 2024

We probably want to replace extra_inferences with something slimmer in the future, but it's no so bad that's it's blocking this.

@BurntSushi BurntSushi merged commit edf875e into main Dec 10, 2024
64 checks passed
@BurntSushi BurntSushi deleted the ag/i9289 branch December 10, 2024 15:57
@BurntSushi
Copy link
Member Author

Note: I squashed and merged this PR because the commits were more optimized for review and didn't individually each pass tests.

tmeijn pushed a commit to tmeijn/dotfiles that referenced this pull request Dec 12, 2024
This MR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [astral-sh/uv](https://github.com/astral-sh/uv) | patch | `0.5.7` -> `0.5.8` |

MR created with the help of [el-capitano/tools/renovate-bot](https://gitlab.com/el-capitano/tools/renovate-bot).

**Proposed changes to behavior should be submitted there as MRs.**

---

### Release Notes

<details>
<summary>astral-sh/uv (astral-sh/uv)</summary>

### [`v0.5.8`](https://github.com/astral-sh/uv/blob/HEAD/CHANGELOG.md#058)

[Compare Source](astral-sh/uv@0.5.7...0.5.8)

**This release does not include the `powerpc64le-unknown-linux-musl` target due to a build issue. See [#&#8203;9793](astral-sh/uv#9793) for details. If this change affects you, please file an issue with your use-case.**

##### Enhancements

-   Omit empty resolution markers in lockfile  ([#&#8203;9738](astral-sh/uv#9738))
-   Add `--install-dir` to to `uv python install` and `uninstall` commands ([#&#8203;7920](astral-sh/uv#7920))
-   Add `--show-urls` and `--only-downloads` to `uv python list` ([#&#8203;8062](astral-sh/uv#8062))
-   Add `uv python list --all-arches` ([#&#8203;9782](astral-sh/uv#9782))
-   Add `uv run --gui-script` flag for running Python scripts with `pythonw.exe` ([#&#8203;9152](astral-sh/uv#9152))
-   Allow `--gui-script` on Unix ([#&#8203;9787](astral-sh/uv#9787))
-   Allow download of Python distribution variants optimized for newer x86\_64 microarchitectures ([#&#8203;9781](astral-sh/uv#9781))
-   Allow execution of `pyw` files on Unix ([#&#8203;9759](astral-sh/uv#9759))
-   Allow users to specify URLs in `project.dependencies` and `tool.uv.sources` ([#&#8203;9718](astral-sh/uv#9718))
-   Encode mutually-incompatible pairs of markers ([#&#8203;9444](astral-sh/uv#9444))
-   Improve the error message when a Python install request is not valid ([#&#8203;9783](astral-sh/uv#9783))
-   Preserve directory-level standalone build symlinks ([#&#8203;9723](astral-sh/uv#9723))
-   Add support for `uv publish --index <name>` ([#&#8203;9694](astral-sh/uv#9694))
-   Reframe `--locked` and `--frozen` as `--check` operations for `uv lock` ([#&#8203;9662](astral-sh/uv#9662))
-   Rename Python install scratch directory from `.cache` -> `.temp` ([#&#8203;9756](astral-sh/uv#9756))
-   Enable `uv tool uninstall uv` on Windows ([#&#8203;8963](astral-sh/uv#8963))
-   Improve self-dependency hint to make shadowing clear ([#&#8203;9716](astral-sh/uv#9716))
-   Refactor unavailable metadata to shrink the resolver ([#&#8203;9769](astral-sh/uv#9769))
-   Show 'depends on itself' for proxy packages ([#&#8203;9717](astral-sh/uv#9717))
-   Show a dedicated error for missing subdirectories ([#&#8203;9761](astral-sh/uv#9761))
-   Show a dedicated hint for missing `git+` prefixes ([#&#8203;9789](astral-sh/uv#9789))

##### Performance

-   Eagerly error when parsing `pyproject.toml` requirements ([#&#8203;9704](astral-sh/uv#9704))
-   Use copy-on-write when normalizing paths ([#&#8203;9710](astral-sh/uv#9710))

##### Bug fixes

-   Avoid enforcing non-conflicts in `uv export` ([#&#8203;9751](astral-sh/uv#9751))
-   Don't drop comments between items in TOML tables ([#&#8203;9784](astral-sh/uv#9784))
-   Don't fail with `--no-build` when static metadata is available ([#&#8203;9785](astral-sh/uv#9785))
-   Don't filter non-patch registry version ([#&#8203;9736](astral-sh/uv#9736))
-   Don't read metadata from stale `.egg-info` files ([#&#8203;9760](astral-sh/uv#9760))
-   Enforce correctness of self-dependencies ([#&#8203;9705](astral-sh/uv#9705))
-   Fix projects's typo in resolver error messages ([#&#8203;9708](astral-sh/uv#9708))
-   Ignore `.` prefixed directories during managed Python installation discovery ([#&#8203;9786](astral-sh/uv#9786))
-   Improve handling of invalid virtual environments during interpreter discovery ([#&#8203;8086](astral-sh/uv#8086))
-   Normalize relative paths when `--project` is specified ([#&#8203;9709](astral-sh/uv#9709))
-   Respect self-constraints on recursive extras ([#&#8203;9714](astral-sh/uv#9714))
-   Respect user settings for tracing coloring ([#&#8203;9733](astral-sh/uv#9733))
-   Retry on tar extraction errors ([#&#8203;9753](astral-sh/uv#9753))
-   Add conflict markers to the lock file ([#&#8203;9370](astral-sh/uv#9370))
-   De-duplicate resolution markers ([#&#8203;9780](astral-sh/uv#9780))
-   Avoid 403 error hint for PyTorch URLs ([#&#8203;9750](astral-sh/uv#9750))
-   Avoid treating non-existent `--find-links` as relative URLs ([#&#8203;9720](astral-sh/uv#9720))
-   Omit Windows Store `python3.13.exe` et al ([#&#8203;9679](astral-sh/uv#9679))
-   Replace executables with broken symlinks during `uv python install` ([#&#8203;9706](astral-sh/uv#9706))

##### Documentation

-   Fix build failure links ([#&#8203;9740](astral-sh/uv#9740))

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever MR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this MR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this MR, check this box

---

This MR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy40NDAuNyIsInVwZGF0ZWRJblZlciI6IjM3LjQ0MC43IiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJSZW5vdmF0ZSBCb3QiXX0=-->
nathanpainchaud added a commit to nathanpainchaud/lightning-hydra-template that referenced this pull request Dec 13, 2024
uv v0.5.8 lands [this PR](astral-sh/uv#9370), which fixes issues where the `uv.lock` file generated when syncing could change between invocations because of the use of conflicting dependencies to manage CUDA versions (as recommended in uv's documentation)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment