Infrastructure for saving/loading hls4ml models #1158

Open
wants to merge 2 commits into
base: main


vloncar
Contributor

@vloncar vloncar commented Dec 22, 2024

Description

Adds the ability to save and load hls4ml models (serialize/deserialize them). Given a ModelGraph, this serializes it into a single file that can be loaded at a later stage. The saved model doesn't depend on the original Keras/PyTorch/ONNX model in any way.

The feature is in part inspired by Keras' model saving feature. The main format used for serialization is JSON: all objects save their state in dictionaries, which are then serialized into JSON. Assuming disk space is not a problem, the generated JSON is pretty-printed when written to file. No objects are pickled, as this is way too unsafe. The numpy arrays (weights) are saved in npz format. We save the model graph (list of layers), the model information and the config into separate files. This (along with some versioning information) is packaged into a .fml file, which is just a .tar.gz with a different name.
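The packaging described above can be sketched with the standard library alone. This is a minimal illustration, not the code from this PR: the member names (`version.json`, `model_graph.json`, `model_info.json`, `config.json`), the version string and the helper functions are all assumptions. The weights are left out; in the scheme described they would be one more archive member in npz format (for example written with `numpy.savez`).

```python
import io
import json
import tarfile

FORMAT_VERSION = "1.0"  # hypothetical version string, not taken from the PR


def _add_json(tar, name, obj):
    """Add `obj` to the archive as a pretty-printed JSON member."""
    data = json.dumps(obj, indent=2).encode("utf-8")
    info = tarfile.TarInfo(name=name)
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))


def save_archive(fileobj, layers, model_info, config):
    """Write graph, info, config and version data into one gzipped tar."""
    with tarfile.open(fileobj=fileobj, mode="w:gz") as tar:
        _add_json(tar, "version.json", {"format_version": FORMAT_VERSION})
        _add_json(tar, "model_graph.json", layers)
        _add_json(tar, "model_info.json", model_info)
        _add_json(tar, "config.json", config)


def load_archive(fileobj):
    """Read every JSON member back into a dict keyed by member name."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as tar:
        return {
            member.name: json.load(tar.extractfile(member))
            for member in tar.getmembers()
        }


# Round trip through an in-memory buffer instead of a real .fml file.
buffer = io.BytesIO()
save_archive(
    buffer,
    layers=[{"class_name": "Dense", "name": "fc1"}],
    model_info={"backend": "Vivado"},
    config={"Precision": "ap_fixed<16,6>"},
)
buffer.seek(0)
restored = load_archive(buffer)
```

Because every member is plain JSON (plus npz for arrays), loading never executes arbitrary code, which is the safety argument against pickling.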

Internally, this works by adding a few methods to types, quantizers, layers and the model graph itself. The interface is defined by the Serializable class. Classes would typically implement the serialize_state() method, which should return a dictionary of the current state of the object. Additionally, there's also serialize_class_name(), which is needed to know what class we are saving an instance of, but most classes won't need to deal with this. Deserialization is done with a class method, deserialize().

To support this feature some restructuring had to be done. ModelGraph was intended to be created only with a layer list from a converter, which is not compatible with (de)serialization, so it was split into initialization of an empty ModelGraph and conversion of the layer list from converters to Layer objects. Furthermore, Layer's initialization has to be skipped, as we're basically restoring a post-initialization state. Types and quantizers are more straightforward to save/load.

A loaded model should be indistinguishable from the original, but there may be corner cases where hacks of the internal state of layers (or partially optimized models) don't work on loaded models; we can catch these over time. But for "final" models (ones you're happy enough with to call write()/compile()/build() on) saving/loading should always work.
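As a rough illustration of the interface described above, here is a self-contained sketch. The method names serialize_state(), serialize_class_name() and deserialize() come from the description; their signatures, the `serialize()` helper and the `ToyLayer` class are assumptions made for this example and may differ from the actual hls4ml code. It also shows one way to restore an object without running its initializer, via `cls.__new__`.

```python
import json


class Serializable:
    """Toy mix-in; the real hls4ml Serializable class may look different."""

    def serialize_state(self):
        """Return a JSON-friendly dict describing the current state."""
        raise NotImplementedError

    def serialize_class_name(self):
        """Return the fully qualified name of the class being saved."""
        cls = type(self)
        return f"{cls.__module__}.{cls.__qualname__}"

    def serialize(self):
        # Hypothetical helper that bundles the class name with the state.
        return {
            "class_name": self.serialize_class_name(),
            "state": self.serialize_state(),
        }

    @classmethod
    def deserialize(cls, state):
        # Skip __init__: we restore a post-initialization state directly.
        obj = cls.__new__(cls)
        obj.__dict__.update(state)
        return obj


class ToyLayer(Serializable):
    init_calls = 0  # counts how often __init__ actually runs

    def __init__(self, name, n_in, n_out):
        ToyLayer.init_calls += 1
        self.name = name
        self.attributes = {"n_in": n_in, "n_out": n_out}

    def serialize_state(self):
        return {"name": self.name, "attributes": dict(self.attributes)}


layer = ToyLayer("fc1", 16, 8)
payload = json.loads(json.dumps(layer.serialize()))  # survives a JSON round trip
restored = ToyLayer.deserialize(payload["state"])
```

After the round trip `restored` carries the same name and attributes as `layer`, while `ToyLayer.init_calls` stays at 1 because the initializer was not run a second time.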

One somewhat ugly part of the current implementation is that, due to the creation of dynamic wrapper classes, we cannot directly deserialize to them. Instead we create the original types and then have to run the <backend>:specific_types optimizer to truly get an object that is identical to the original one. Running that optimizer for a given backend looks a bit hacky, but is OK for now since all backends have an optimizer by that name.
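The wrapper issue can be shown with a toy example. Everything here is hypothetical (the `Dense` class, the `REGISTRY` lookup, `wrap_for_backend` and `class_name_for_saving`); it only illustrates why a class created at runtime cannot be looked up by name when loading, and how saving the original class name and re-applying the wrapping afterwards gets around that. In the PR the re-wrapping step is the backend's specific_types optimizer.

```python
class Dense:
    def __init__(self, units):
        self.units = units


# Classes that can be looked up by name at load time (hypothetical registry).
REGISTRY = {"Dense": Dense}


def wrap_for_backend(obj, backend):
    """Swap obj's class for a backend-specific subclass created at runtime."""
    base = type(obj)
    wrapper = type(f"{backend}{base.__name__}", (base,), {"backend": backend})
    obj.__class__ = wrapper
    return obj


def class_name_for_saving(obj):
    """Return the name of the first registered class in the object's MRO."""
    for cls in type(obj).__mro__:
        if cls.__name__ in REGISTRY:
            return cls.__name__
    raise TypeError(f"no registered base class for {type(obj).__name__}")


original = wrap_for_backend(Dense(32), "Vivado")

# "VivadoDense" exists only at runtime, so the saved name is the base class.
saved = {"class_name": class_name_for_saving(original), "state": dict(vars(original))}

# Load: rebuild the original type first, without running __init__ ...
restored = REGISTRY[saved["class_name"]].__new__(REGISTRY[saved["class_name"]])
restored.__dict__.update(saved["state"])
plain_type_name = type(restored).__name__

# ... then re-apply the backend-specific wrapping as a separate pass.
restored = wrap_for_backend(restored, "Vivado")
```

Only after the second step does `restored` have the same class name and backend attribute as `original`; before it, the object is a plain `Dense`.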

Type of change

  • New feature (non-breaking change which adds functionality)

Tests

Included is a test in test_serialization.py that tests saving/loading QKeras and QONNX models. These cover serialization of most types and quantizers that can appear in a model, but obviously not all possible layers. Maybe a more thorough test would be to extend most existing tests to save and load a model and then continue working with a loaded model. But I'll leave that to a future PR.

Checklist

I've done all the usual checks prior to opening this PR.

@vloncar vloncar added the please test Trigger testing by creating local PR branch label Dec 22, 2024
@bo3z bo3z added this to the v1.1.0 milestone Jan 7, 2025
@JanFSchulte JanFSchulte added please test Trigger testing by creating local PR branch and removed please test Trigger testing by creating local PR branch labels Jan 7, 2025
@jmitrevs
Contributor

Even though the running time exceeded the limits, there were failures in test_serialization beforehand.

@JanFSchulte
Contributor

> Even though the running time exceeded the limits, there were failures in test_serialization beforehand.

Indeed. I am just rerunning the tests to see which part is taking so long, and the QONNX test runs really fast, so it is the serialization test itself that is very slow.

@jmitrevs
Contributor

Running locally on my Linux machine, the serialization tests are pretty quick (1-2 min), but I get:

FAILED test_serialization.py::test_qkeras_model[io_stream-oneAPI] - subprocess.CalledProcessError: Command 'make lib' returned non-zero exit status 2.
FAILED test_serialization.py::test_qonnx_model[oneAPI] - subprocess.CalledProcessError: Command 'make lib' returned non-zero exit status 2.
FAILED test_serialization.py::test_qkeras_model[io_parallel-oneAPI] - subprocess.CalledProcessError: Command 'make lib' returned non-zero exit status 2.

The exact failure is:

icpx: error: fpga compiler command failed with exit code 14 (use -v to see invocation)

I will investigate. One thing that is kind of annoying is that I get more failures on my Mac since ac_math doesn't really support clang:

firmware/ac_math/include/ac_math/ac_pow_pwl.h:300:70: error: typedef 'pit_t' cannot be referenced with a class specifier
  300 |     typedef class comp_pii_exp<W, I, S, n_frac_bits + extra_f_bits>::pit_t input_inter_type;
      |                                                                      ^

But that's unrelated to this PR.

Labels
please test Trigger testing by creating local PR branch

4 participants