Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert kpts in Kpoints to Sequence[tuple] and set it as property #3758

Merged
merged 42 commits into from
Apr 28, 2024

Conversation

DanielYang59
Copy link
Contributor

@DanielYang59 DanielYang59 commented Apr 14, 2024

Summary

As discussed in 6a06f3c#r140944147.

  • Convert kpts in Kpoints to Sequence[tuple[float, float, float]], and related type fixes
  • Set kpts as property and add its setter, also add unit tests for this property
  • Relocate custom types (Vector3D Matrix3D SitePropsType) to util.typing

Copy link

coderabbitai bot commented Apr 14, 2024

Warning

Rate Limit Exceeded

@DanielYang59 has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 9 minutes and 39 seconds before requesting another review.

How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.
Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.
Please see our FAQ for further information.

Commits Files that changed from the base of the PR and between 596d256 and 879530c.

Walkthrough

The updates across various files in the pymatgen library primarily involve reorganization and clarification of type annotations and imports, particularly concerning Vector3D and Matrix3D types. These types are now centralized under pymatgen.util.typing. Enhancements in type safety and consistency, such as using tuples instead of lists for fixed-length sequences and improving method signatures with explicit type hints, are notable. These changes aim to increase the maintainability and clarity of the codebase.

Changes

Files Change Summary
interface.py, lattice.py, cif.py, res.py Moved Vector3D import from pymatgen.core.trajectory to pymatgen.util.typing.
trajectory.py, io/aims/outputs.py, io/aims/parsers.py Reorganized and introduced new type aliases like Matrix3D, SitePropsType. Adjusted imports related to these types.
io/cp2k/inputs.py, io/lobster/inputs.py, io/lobster/outputs.py Enhanced type annotations and handling of tuples. Improved conditional checks and error handling.
io/vasp/inputs.py, io/vasp/outputs.py, io/vasp/sets.py Extensive updates to type annotations, handling of Kpoints and Vector3D. Improved conditions and error handling in label processing.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@DanielYang59
Copy link
Contributor Author

I just opened this early PR @janosh. Seems a lot of work and I'm quite concerned this might break many others (in typing, I don't expect much functional change thought, except for mutability).

Would you suggest tuple[float, float, float] or tuple[int, int, int] for each single kpoint (compatibility for 0-decimal float? )?

@janosh janosh added io Input/output functionality vasp Vienna Ab initio Simulation Package types Type all the things api Application programming interface labels Apr 14, 2024
@janosh
Copy link
Member

janosh commented Apr 14, 2024

I would prefer typing as int since float will still work and only cause a type error

@DanielYang59
Copy link
Contributor Author

DanielYang59 commented Apr 15, 2024

I would prefer typing as int since float will still work and only cause a type error

I agree with you on this, we should not cater too much to edge cases (0-decimal float in this case) , int is much clearer for KPoint.

Shouldn't we add another custom type to util.typing for simplicity (which also ensures that we don't need to change types everywhere like we're doing now, in case we need to change the type of Kpoint in the future)?

Kpoint = tuple[int, int, int]

Maybe do the same to coordinates in the near future?

Coordinate = tuple[float, float, float]

Meanwhile, maybe we should consider using a more general Sequence[Kpoint] instead of tuple[Kpoint]? Currently we just want to hint developers/users that kpts is a "collection" of Kpoint, and using tuple for Kpoint could provide fixed length. Do we have a good reason for immutability for the collection itself (I thought it might be natural to add/remove Kpoint from the collection)?

@janosh
Copy link
Member

janosh commented Apr 15, 2024

Shouldn't we add another custom type to util.typing for simplicity (which also ensures that we don't need to change types everywhere like we're doing now, in case we need to change the type of Kpoint in the future)?

good idea!

Maybe do the same to coordinates in the near future?

yes, i forgot to add could you move the Vector3D type to util/typing.py? not sure about renaming to Coordinate. Vector3D sounds more general to me.

Vector3D = tuple[float, float, float]

Meanwhile, maybe we should consider using a more general Sequence[Kpoint] instead of tuple[Kpoint]? Currently we just want to hint developers/users that kpts is a "collection" of Kpoint, and using tuple for Kpoint could provide fixed length. Do we have a good reason for immutability for the collection itself (I thought it might be natural to add/remove Kpoint from the collection)?

that's a good point. let's do it your way

@DanielYang59
Copy link
Contributor Author

DanielYang59 commented Apr 15, 2024

yes, i forgot to add could you move the Vector3D type to util/typing.py? not sure about renaming to Coordinate. Vector3D sounds more general to me.

Vector3D = tuple[float, float, float]

Yes sure. Vector3D sounds better (more general), and could be also used on types like velocity which is not actually a "coordinate".

Meanwhile, maybe we should consider using a more general Sequence[Kpoint] instead of tuple[Kpoint]? Currently we just want to hint developers/users that kpts is a "collection" of Kpoint, and using tuple for Kpoint could provide fixed length. Do we have a good reason for immutability for the collection itself (I thought it might be natural to add/remove Kpoint from the collection)?

that's a good point. let's do it your way

Great. I would try implement this. Thanks for the feedback.

Meanwhile, I don't know what is the best place to put the type Kpoint. Contrary to Vector3D, this sounds more specific to vasp. But if we put it under io.vasp, it might be confused with Kpoints (the data class). Though don't need to worry about this too much now, it could be easily relocated (but might break other's type checking).

@DanielYang59 DanielYang59 marked this pull request as ready for review April 28, 2024 08:48
@DanielYang59
Copy link
Contributor Author

DanielYang59 commented Apr 28, 2024

Please review this if you have time @janosh :)

I ended up adding type alias as Kpoint = Union[tuple[float, float, float], tuple[int,]]:

  • tuple[int,] for "automatic" mode:
    @classmethod
    def automatic(cls, subdivisions) -> Self:
    """
    Constructor for a fully automatic Kpoint grid, with
    gamma centered Monkhorst-Pack grids and the number of subdivisions
    along each reciprocal lattice vector determined by the scheme in the
    VASP manual.
    Args:
    subdivisions: Parameter determining number of subdivisions along
    each reciprocal lattice vector.
    Returns:
    Kpoints object
    """
    return cls("Fully automatic kpoint scheme", 0, style=Kpoints.supported_modes.Automatic, kpts=[[subdivisions]])
  • I have to use float for tuple[float, float, float] to consider the line mode (where each Kpoint is actually a Vector3D).

Any suggestion would be appreciated :)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 13

Out of diff range and nitpick comments (4)
pymatgen/io/vasp/inputs.py (1)

Line range hint 1083-1099: Consider adding type hints for the constructor parameters of the Kpoints class to improve code readability and maintainability.

- def __init__(
+ def __init__(self,
      comment: str = "Default gamma",
      num_kpts: int = 0,
      style: KpointsSupportedModes = supported_modes.Gamma,
      kpts: Sequence[Kpoint] = ((1, 1, 1),),
      kpts_shift: Vector3D = (0, 0, 0),
      kpts_weights: list[float] | None = None,
      coord_type: Literal["Reciprocal", "Cartesian"] | None = None,
      labels: list[str] | None = None,
      tet_number: int = 0,
      tet_weight: float = 0,
      tet_connections: list[tuple] | None = None,
  ) -> None:
pymatgen/io/vasp/sets.py (3)

Line range hint 235-235: Add type hints to the class definition for better clarity and consistency with modern Python practices.

- class DictSet(VaspInputSet):
+ class DictSet(VaspInputSet):

Line range hint 297-297: Add type hints to the class definition for better clarity and consistency with modern Python practices.

- class MITRelaxSet(DictSet):
+ class MITRelaxSet(DictSet):

Line range hint 315-315: Add type hints to the class definition for better clarity and consistency with modern Python practices.

- class MPRelaxSet(DictSet):
+ class MPRelaxSet(DictSet):

tests/io/vasp/test_sets.py Show resolved Hide resolved
tests/io/vasp/test_sets.py Show resolved Hide resolved
tests/io/vasp/test_sets.py Show resolved Hide resolved
pymatgen/util/typing.py Show resolved Hide resolved
tests/io/vasp/test_inputs.py Show resolved Hide resolved
pymatgen/io/cp2k/inputs.py Show resolved Hide resolved
pymatgen/io/vasp/inputs.py Show resolved Hide resolved
pymatgen/io/vasp/inputs.py Outdated Show resolved Hide resolved
pymatgen/io/vasp/inputs.py Show resolved Hide resolved
pymatgen/io/vasp/sets.py Show resolved Hide resolved
Copy link
Member

@janosh janosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @DanielYang59, this looks good! just 2 minor comments 👍

pymatgen/util/typing.py Outdated Show resolved Hide resolved
pymatgen/io/vasp/inputs.py Show resolved Hide resolved
@DanielYang59 DanielYang59 changed the title Convert kpts in Kpoints to tuple and set it as property Convert kpts in Kpoints to Sequence[tuple] and set it as property Apr 28, 2024
@DanielYang59 DanielYang59 requested a review from janosh April 28, 2024 13:12
Copy link
Member

@janosh janosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great stuff as usual, thanks @DanielYang59! 👍

you're writing excellent tests 🧪

@janosh janosh merged commit bf2f421 into materialsproject:master Apr 28, 2024
22 checks passed
@DanielYang59 DanielYang59 deleted the kpoints-tuple-type branch April 29, 2024 01:48
@DanielYang59
Copy link
Contributor Author

you're writing excellent tests 🧪

Thanks for reviewing and saying that! Appreciate it.

@DanielYang59
Copy link
Contributor Author

@janosh Also worth noting it appears uv is less reliable than native pip, as you might have noticed in tests for previous several commits where tests failed randomly. I'm not sure how much speedup uv gives over native pip, but if not that much, I would personally prefer a "slower but more reliable" method than a "faster but less reliable" method (because I assume CI is not performance critical?).

I was always wondering, does GitHub provide some dedicated container for CI tests (like codespace for CI)? As I notice environment preparation time is almost as much as actual test running time, it would be good to avoid building a new test env every time when there is no dependency change?

@janosh
Copy link
Member

janosh commented Apr 29, 2024

@janosh Also worth noting it appears uv is less reliable than native pip,

are you referring to this error?

E       _tkinter.TclError: Can't find a usable tk.tcl in the following directories: 

That also used to happen with pip occasionally. I wasn't keeping a close eye on CI last week so it's possible i missed other issues but didn't notice anything specific to uv other than the pytorch installation time out.

I was always wondering, does GitHub provide some dedicated container for CI tests (like codespace for CI)?

The setup-python GH action, supports caching which we used before with pip. But I read in here that it's slower than installing from scratch with uv. I didn't test this myself, but I also noticed that the post-setup step for creating the pip cache could be quite slow so it's possible that for uv it's slower than not using a cache

@DanielYang59
Copy link
Contributor Author

are you referring to this error?

E       _tkinter.TclError: Can't find a usable tk.tcl in the following directories: 

That also used to happen with pip occasionally. I wasn't keeping a close eye on CI last week so it's possible i missed other issues

Yes I did notice this error multiple times within this PR, and sometimes the following:

>> Run micromamba activate pmg
/home/runner/work/_temp/225b20ec-6d19-4bb8-b69d-17e4bb02f7d0.sh: line 2: pytest: command not found

for example here. I would let you know if I notice any others.

The reason seems to be: in cases where uv failed to install some package, it would not interrupt the workflow (not sure if intended), and thus the "dependency installation" stage would pass without any issue, for example this.

didn't notice anything specific to uv other than the pytorch installation time out.

For some reason uv cannot install boltztrap2 in #3786 too, other than this, no other issues so far.

I was always wondering, does GitHub provide some dedicated container for CI tests (like codespace for CI)?

The setup-python GH action, supports caching which we used before with pip. But I read in here that it's slower than installing from scratch with uv. I didn't test this myself, but I also noticed that the post-setup step for creating the pip cache could be quite slow so it's possible that for uv it's slower than not using a cache

That sounds interesting and worth some experiments on it. I might give it a try in the near future and let you know how everything goes :) Thanks for the input!

@janosh
Copy link
Member

janosh commented Apr 29, 2024

>> Run micromamba activate pmg
/home/runner/work/_temp/225b20ec-6d19-4bb8-b69d-17e4bb02f7d0.sh: line 2: pytest: command not found

you're right, i did see that one as well. are you sure this is a uv issue? could also be a micromamba or a uv-micromamba interop issue

@DanielYang59
Copy link
Contributor Author

DanielYang59 commented Apr 29, 2024

I'm not 100% sure. But as we could see from the log here inside the "install pymatgen and dependencies" section:

error: Failed to download distributions
  Caused by: Failed to fetch wheel: chgnet==0.3.5
  Caused by: Failed to extract archive
  Caused by: error decoding response body
  Caused by: request or response body error
  Caused by: error reading a body from connection
  Caused by: end of file before message length reached

I would assume it's likely a uv issue (uv randomly failed to install some package), as the venv creates and activates just fine. But I'm not sure if micromamba is in play here.

@Andrew-S-Rosen
Copy link
Member

Andrew-S-Rosen commented May 7, 2024

FYI that this is a slight breaking change re: kpts no longer being a list but a tuple, although you likely are aware of this due to the change in the test suite.

Edit: nvm, I see the comment here 6a06f3c#r140944147

@DanielYang59
Copy link
Contributor Author

Hi @Andrew-S-Rosen, yes we're fully aware it might be breaking to change Kpoint from a list to a tuple. Janosh said he prefer it be to immutable wherever possible, and I agree using tuple would bring more convenience when typing (list[float, float, float] would not work).

But frankly I'm not quite sure if there is any unexpected side effect. Ideally things should only rely on the values of Kpoint instead of the type. Any suggestion would be appreciated.

@Andrew-S-Rosen
Copy link
Member

I don't think there will be too many practical side effects. I just noticed it failing my CI runs because I was checking for exact matches on [kpt, kpt, kpt] instead of (kpt, kpt, kpt).

@DanielYang59
Copy link
Contributor Author

DanielYang59 commented May 7, 2024

Yes the original unit tests were doing the same and we have to change the types too. Hopefully there would not be many operations that replies on Kpoint to be exactly list or mutable. Thanks for the input!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api Application programming interface io Input/output functionality types Type all the things vasp Vienna Ab initio Simulation Package
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants