NMRcraft is a project that predicts ligands of complexes from their chemical shift tensors.
See installation instructions
Docker Desktop 🐳
First you need to install Docker.
Download Docker Image
You can download the image by typing 'tiaguinho/nmrcraft_arch' into the search bar at the top of Docker Desktop and clicking on pull.
Running the Image
To run the image, go to the 'Images' tab and click the "play" button on the tiaguinho/nmrcraft_arch image you pulled. It should then appear as running in the 'Containers' tab. There, click on the ⋮ symbol and select '> open in terminal'. A terminal window should pop up, in which you type the command `zsh`.
Console 🐧
Download Docker Image
Make sure that Docker is installed, then pull the image from Docker Hub with this command:
docker pull tiaguinho/nmrcraft_arch
(If running on Windows, you might need to call docker.exe instead of just docker.)
Running the Image
docker run -it tiaguinho/nmrcraft_arch
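The two console steps can also be scripted. The following is a minimal Python sketch, not part of the repository: the helper names `docker_executable`, `build_commands` and `main` are hypothetical, and it assumes the image is referenced by its full Docker Hub name, as in the pull command. It falls back to `docker.exe` when `docker` is not on the PATH, mirroring the Windows note above.

```python
import shutil
import subprocess

# Image name taken from the pull command in this section.
IMAGE = "tiaguinho/nmrcraft_arch"


def docker_executable(which=shutil.which):
    """Return the first Docker CLI name found on PATH ("docker", then "docker.exe")."""
    for name in ("docker", "docker.exe"):
        if which(name):
            return name
    raise FileNotFoundError("Docker CLI not found on PATH; install Docker first.")


def build_commands(docker, image=IMAGE):
    """Build the pull and interactive-run commands as argument lists."""
    return [[docker, "pull", image], [docker, "run", "-it", image]]


def main():
    """Pull the image, then start an interactive container from it."""
    docker = docker_executable()
    for cmd in build_commands(docker):
        # check=True stops at the first command that fails.
        subprocess.run(cmd, check=True)
```

Calling `main()` from a terminal session performs the pull and then drops you into the container.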
Visual Studio Code 🪟
To download the image, follow the same steps as in either the Console or the Docker Desktop section.
Running the Docker Image
Using Docker in VS Code
- Open VS Code and install the extensions for Docker and Dev Containers.
- Go to the newly added Docker Tab. Here you should now see three sections: Containers, Images and Registries. Under Images, the tiaguinho/nmrcraft_arch image should be visible.
- To keep the container from being deleted every time you stop it, remove the --rm flag. Open the settings (Ctrl + , or Cmd + , on Mac) and search for `docker run`. Select 'Edit in settings.json' for the 'Run Interactive' command and remove --rm to get: "docker.commands.runInteractive": "${containerCommand} run -it ${exposedPorts} ${tag}", "docker.commands.run": "${containerCommand} run -d ${exposedPorts} ${tag}". Save the file.
- In the Docker Tab on the right, right click on the image and select run interactive. Now a container should appear in the Containers section. Right click on it and select stop.
- Right click again on the container and select start to start it back up.
- Right click again on the container and select attach Visual Studio Code. A new VS Code window should appear; this window is now fully in the container. If necessary, switch to `/home/steve/NMRcraft`.
- Pull the latest changes to the repository with `git pull origin main`.
- Have fun developing.
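The `settings.json` change from the steps above can be illustrated with a small Python sketch. The `ASSUMED_DEFAULTS` templates are an assumption about what the Docker extension ships with (check your own settings before relying on them), and `strip_rm` is a hypothetical helper. The output matches the two entries given in the instructions.

```python
import json

# Assumed default templates of the VS Code Docker extension (not confirmed by this README).
ASSUMED_DEFAULTS = {
    "docker.commands.runInteractive": "${containerCommand} run --rm -it ${exposedPorts} ${tag}",
    "docker.commands.run": "${containerCommand} run --rm -d ${exposedPorts} ${tag}",
}


def strip_rm(template: str) -> str:
    """Remove the --rm flag from a command template, keeping all other tokens in order."""
    return " ".join(token for token in template.split() if token != "--rm")


# Entries to paste into settings.json so stopped containers are kept.
settings = {key: strip_rm(value) for key, value in ASSUMED_DEFAULTS.items()}
print(json.dumps(settings, indent=2))
```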
Getting Access to the Dataset 💾
For the script to be able to access the dataset, you must log in to Hugging Face using the following commands:
pip install -U "huggingface_hub[cli]" # if not installed already
huggingface-cli login # log in after generating an authentication token for Hugging Face
We include the link to be authenticated in the report appendix. If you run into issues accessing the dataset, contact [email protected].
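Before starting a long run, it can help to check that a login token is actually present. The sketch below uses only the standard library and rests on two assumptions about `huggingface_hub` conventions: that a token may be supplied through the `HF_TOKEN` environment variable, and that `huggingface-cli login` stores it in a `token` file under `HF_HOME` (by default `~/.cache/huggingface`). The function name `has_hf_token` is hypothetical.

```python
import os
from pathlib import Path


def has_hf_token(env=None, home=None):
    """Return True if a Hugging Face token seems to be configured.

    Checks the HF_TOKEN variable first, then a non-empty token file under
    HF_HOME (assumed default: ~/.cache/huggingface).
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    if env.get("HF_TOKEN"):
        return True
    hf_home = Path(env.get("HF_HOME", home / ".cache" / "huggingface"))
    token_file = hf_home / "token"
    return token_file.is_file() and token_file.read_text().strip() != ""
```

A result of `False` suggests running `huggingface-cli login` first. A result of `True` only means a token exists, not that it grants access to the private dataset.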
To reproduce all results shown in the report, run the following commands:
poetry shell
python scripts/reproduce_results.py
This script runs the following steps in order:
- plot dataset statistics and PCA plots (stored in `./plots/dataset`)
- train and evaluate all single-output models (stored in `./metrics/results_one_targets.csv`)
- train and evaluate all multi-output models (stored in `./metrics/results_multi_target.csv`)
- train and evaluate all baseline models (stored in `./metrics/results_baselines.csv`)
- create the plots (stored in `./plots/{models,baselines,dataset_statistics,results}`)
- print the table of experiment 3 to the terminal.
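After a run, the output locations listed above can be checked with a short Python sketch. It is not part of the repository: `missing_outputs` is a hypothetical helper, and the sketch assumes it is run from the repository root with the brace pattern expanded into four plot directories.

```python
from pathlib import Path

# Output locations named in the list above, relative to the repository root.
EXPECTED_OUTPUTS = [
    "plots/dataset",
    "metrics/results_one_targets.csv",
    "metrics/results_multi_target.csv",
    "metrics/results_baselines.csv",
    "plots/models",
    "plots/baselines",
    "plots/dataset_statistics",
    "plots/results",
]


def missing_outputs(root="."):
    """Return the expected output paths that do not exist under root."""
    root = Path(root)
    return [path for path in EXPECTED_OUTPUTS if not (root / path).exists()]
```

An empty list from `missing_outputs()` means every expected file or directory exists. It does not validate their contents.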
When the parameter `max_eval` is set to a high value such as 50, expect the whole process to take about two hours. Alternatively, `max_eval` can be set to a low value such as 2 for testing, at the cost of worse model performance. To run individual pipelines, use `scripts/training/{one_target,multi_targets}.sh` (although running `scripts/reproduce_results.py` is recommended). Results are also accessible via the polybox here.
If you were not able to visit our beautiful poster at ETH Zurich on May 30th 2024, you can access our poster here!
See developer instructions
To use the packages installed via poetry you need to execute the following command:
poetry shell
This will put you into the poetry shell from where you have direct access to all packages managed by poetry.
To authenticate with GitHub, the Docker image comes with the GitHub CLI. To log in, execute this command:
gh auth login
and follow the interactive instructions using Enter and the arrow keys. Once logged in, you should be able to push changes to the repo.
If you added a new feature that requires a new package/library, you can add it by running `poetry add <package-name>` and then run `make install` to install the new dependencies. (You might need to run `poetry lock` to update the `poetry.lock` file if you added a dependency manually in the `pyproject.toml` file.)
The dataset is stored in a private repository on HuggingFace.
To download the dataset from the Hub in Python, you need to log in to your Hugging Face account:
huggingface-cli login
@software{nmrcraft2024,
author = {Magdalena Lederbauer and Karolina Biniek and Tiago Würthner and Samuel Stricker and Yingnan Wang},
title = {{mlederbauer/NMRcraft: Crafting Catalysts from NMR Features}},
month = may,
year = 2024
}
Repository initiated with fpgmaas/cookiecutter-poetry.