Use dataclasses to store and document parameters #68

Open
lukethehuman opened this issue Sep 3, 2024 · 2 comments
Labels
code quality documentation Improvements or additions to documentation enhancement New feature or request

Comments

@lukethehuman
Collaborator

Use Python dataclasses instead of dicts to store and access the various large sets of parameters required in the Hypnos workflow.

Documentation here: https://docs.python.org/3/library/dataclasses.html

Dataclasses give us a convenient place to document the required parameters and options (in the dataclass docstring) and to add type hints, in a much more convenient format than the equivalent hand-written class __init__ we would otherwise need to write and maintain. A dataclass can be instantiated directly from a dictionary by unpacking, i.e. DataClass(**dict), provided the keys match the field names.

For example:

from dataclasses import dataclass

@dataclass
class ComponentMaterials:
    """Dataclass for a given component's material parameters.

    Parameters
    ----------
    body : str
        The body material. Can be either "steel" or "super steel".
    widgets : str
        The widget material. Can be one of "raspberry jam", "blackcurrant jam", or "strawberry jam".
    """

    body: str
    widgets: str
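
As a minimal sketch of the unpacking step (the dictionary contents and the extra "colour" key are made up for illustration), matching keys map straight onto fields, while an unexpected key is rejected with a TypeError at construction time:

```python
from dataclasses import dataclass, asdict


@dataclass
class ComponentMaterials:
    """Dataclass for a given component's material parameters."""

    body: str
    widgets: str


# Hypothetical parameter dict, e.g. as loaded from a JSON file.
materials_dict = {"body": "steel", "widgets": "raspberry jam"}

# Keys are matched to field names by keyword unpacking.
materials = ComponentMaterials(**materials_dict)

# asdict() converts back to a plain dict if one is still needed.
round_trip = asdict(materials)

# A key that is not a field is rejected immediately.
try:
    ComponentMaterials(**materials_dict, colour="red")
    unexpected_key_rejected = False
except TypeError:
    unexpected_key_rejected = True
```

This means a typo in a parameter file surfaces when the dataclass is created rather than later, when the missing value is first used.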

Another benefit is the __post_init__ method, which gives us a sensible place to compute derived parameters automatically upon instantiation. It is especially useful to do this when multiple components of an assembly can access the same dataclass instance for their parameters.

@dataclass
class MyAssemblyGeometry:
    """Dataclass for a given assembly's geometric parameters.
    """

    radius: float
    length: float
    num_widgets: int
    reticulate_splines: bool

    def __post_init__(self):
        """Calculate derived parameters.
        """
        self.widget_radius = self.radius/10
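
One possible refinement, offered as an assumption rather than a settled design, is to declare the derived value as a field with field(init=False), so that it is listed in the class body and shows up in repr() and IDE autocompletion. The Widget and Housing classes below are hypothetical stand-ins for two components sharing one geometry instance:

```python
from dataclasses import dataclass, field


@dataclass
class MyAssemblyGeometry:
    """Dataclass for a given assembly's geometric parameters."""

    radius: float
    length: float
    num_widgets: int
    reticulate_splines: bool
    # Derived parameter: excluded from __init__, set in __post_init__.
    widget_radius: float = field(init=False)

    def __post_init__(self):
        """Calculate derived parameters."""
        self.widget_radius = self.radius / 10


class Widget:  # hypothetical component
    def __init__(self, geometry: MyAssemblyGeometry):
        self.geometry = geometry


class Housing:  # hypothetical component
    def __init__(self, geometry: MyAssemblyGeometry):
        self.geometry = geometry


# Made-up values for illustration.
geometry = MyAssemblyGeometry(
    radius=5.0, length=10.0, num_widgets=7, reticulate_splines=True
)
widget = Widget(geometry)
housing = Housing(geometry)
```

Both components see the same widget_radius because it is computed once, on the shared instance, rather than separately in each component.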

Instantiating the classes would be relatively seamless from the JSON format, with something like this in the MyAssembly.__init__():

import json  # at module level

with open(json_file, "r") as file:
    data = json.load(file)

# Unpack each section of the file into its dataclass.
self.materials = MyAssemblyMaterials(**data["materials"])
self.geometry = MyAssemblyGeometry(**data["geometry"])

Or, to go a step further, we could have the components/assemblies take their corresponding dataclasses as arguments, which would let readers of the code easily find the dataclass docstrings describing the parameters a component requires. (This would also enable things like IDE autocompletion for developers.)

The existing ease of instantiation from JSON could be maintained with a from_json classmethod which unpacks the file, while keeping dict as a valid argument type.

Something like:

# User's code
component_instance = MyAssembly.from_json(json_file)

# Hypnos code
import json

class MyAssembly(CreatedComponentAssembly):

    def __init__(
        self,
        materials: MyAssemblyMaterials | dict,
        geometry: MyAssemblyGeometry | dict,
    ):
        # Plain dicts are still accepted and converted on the way in.
        if isinstance(materials, dict):
            materials = MyAssemblyMaterials(**materials)
        if isinstance(geometry, dict):
            geometry = MyAssemblyGeometry(**geometry)
        self.materials = materials
        self.geometry = geometry
        super().__init__()

    @classmethod
    def from_json(cls, json_file):
        with open(json_file, "r") as file:
            data = json.load(file)
        # Assumes the top-level JSON keys are "materials" and "geometry".
        return cls(**data)

An alternative version of the above code block, which avoids the union-typed arguments, would be:

# Hypnos code
import json

class MyAssembly(CreatedComponentAssembly):

    def __init__(
        self,
        materials: MyAssemblyMaterials,
        geometry: MyAssemblyGeometry,
    ):
        self.materials = materials
        self.geometry = geometry
        super().__init__()

    @classmethod
    def from_json(cls, json_file):
        with open(json_file, "r") as file:
            data = json.load(file)
        # Build the dataclasses here so __init__ only accepts one type.
        materials = MyAssemblyMaterials(**data["materials"])
        geometry = MyAssemblyGeometry(**data["geometry"])
        return cls(materials, geometry)
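
For reference, here is a self-contained sketch of the second variant that can be run end to end. The real CreatedComponentAssembly is not shown in this issue, so it is replaced by an empty stand-in (assumed to take no constructor arguments), and the parameter values and file name are made up:

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class MyAssemblyMaterials:
    body: str
    widgets: str


@dataclass
class MyAssemblyGeometry:
    radius: float
    length: float
    num_widgets: int
    reticulate_splines: bool


class CreatedComponentAssembly:
    """Stand-in for the real Hypnos base class (assumed to take no args)."""


class MyAssembly(CreatedComponentAssembly):
    def __init__(
        self,
        materials: MyAssemblyMaterials,
        geometry: MyAssemblyGeometry,
    ):
        self.materials = materials
        self.geometry = geometry
        super().__init__()

    @classmethod
    def from_json(cls, json_file):
        with open(json_file, "r") as file:
            data = json.load(file)
        return cls(
            MyAssemblyMaterials(**data["materials"]),
            MyAssemblyGeometry(**data["geometry"]),
        )


# Made-up parameter values, written to a temporary file for the demo.
params = {
    "materials": {"body": "steel", "widgets": "strawberry jam"},
    "geometry": {
        "radius": 1.0,
        "length": 5.0,
        "num_widgets": 3,
        "reticulate_splines": False,
    },
}
with tempfile.TemporaryDirectory() as tmp_dir:
    json_file = Path(tmp_dir) / "my_assembly.json"
    json_file.write_text(json.dumps(params))
    assembly = MyAssembly.from_json(json_file)
```

The user-facing call stays a one-liner, while the constructor signature now points directly at the two dataclasses that document the parameters.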
@lukethehuman
Collaborator Author

Note that dataclasses support default field values, so we can store parameter defaults directly in the class definition, like so:

from dataclasses import dataclass

@dataclass
class MyAssemblyGeometry:
    """Dataclass for a given assembly's geometric parameters.
    """

    radius: float = 6e-3  # m
    length: float = 19.94  # m
    num_widgets: int = 7
    reticulate_splines: bool = True
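
A short sketch of how those defaults behave in practice: a partial dict overrides only the fields it names, and dataclasses.replace() copies an instance with selected fields changed. The override values here are made up for illustration:

```python
from dataclasses import dataclass, replace


@dataclass
class MyAssemblyGeometry:
    """Dataclass for a given assembly's geometric parameters."""

    radius: float = 6e-3  # m
    length: float = 19.94  # m
    num_widgets: int = 7
    reticulate_splines: bool = True


# No arguments: every parameter takes its default.
default_geometry = MyAssemblyGeometry()

# A partial dict (hypothetical user input) overrides only the named fields.
custom_geometry = MyAssemblyGeometry(**{"num_widgets": 12})

# replace() returns a new instance; the original is left unchanged.
longer_geometry = replace(default_geometry, length=25.0)
```

With this in place, a user's JSON file only needs to list the parameters that differ from the defaults.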

@lukethehuman
Collaborator Author

I suggest we use a standard __post_init__ on the geometry dataclass, like so:

    def __post_init__(self):
        self.calculate_derived_parameters()
        self.validate_parameters()

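We can also use pydantic's PositiveFloat type to automatically validate parameters that must be positive floats, without needing to check them explicitly in the validate_parameters() method. Note that standard-library dataclasses do not enforce type hints at runtime, so this would mean using pydantic's own dataclass decorator (or a pydantic model) for those classes.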
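
A minimal standard-library sketch of that __post_init__ pattern follows. The specific checks are illustrative assumptions, since the real constraints are not specified in this issue; with pydantic, annotating radius and length as PositiveFloat would make the explicit positivity check unnecessary.

```python
from dataclasses import dataclass, field


@dataclass
class MyAssemblyGeometry:
    """Dataclass for a given assembly's geometric parameters."""

    radius: float = 6e-3  # m
    length: float = 19.94  # m
    num_widgets: int = 7
    # Derived parameter, filled in by calculate_derived_parameters().
    widget_radius: float = field(init=False)

    def __post_init__(self):
        self.calculate_derived_parameters()
        self.validate_parameters()

    def calculate_derived_parameters(self):
        self.widget_radius = self.radius / 10

    def validate_parameters(self):
        # Illustrative checks only; the real constraints are assumptions.
        for name in ("radius", "length"):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be positive")
        if self.num_widgets < 0:
            raise ValueError("num_widgets must not be negative")


# A valid instance gets its derived parameter computed automatically.
geometry = MyAssemblyGeometry(radius=0.5)

# An invalid value is rejected as soon as the instance is created.
try:
    MyAssemblyGeometry(radius=-1.0)
    error_message = None
except ValueError as error:
    error_message = str(error)
```

Keeping the two steps as separate methods means subclasses or individual assemblies can override one without touching the other.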
