Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Installation #93

Open
wachsylon opened this issue Nov 11, 2021 · 8 comments
Open

Installation #93

wachsylon opened this issue Nov 11, 2021 · 8 comments
Labels
documentation Improvements or additions to documentation

Comments

@wachsylon
Copy link
Contributor

wachsylon commented Nov 11, 2021

Hi, in order to simplify the installation, I would suggest that

  1. You merge
environment.yml
mamba_requirements.txt
requirements.txt

so that users can decide what package it uses for installation. I also suggest that mamba_requirements.txt is not needed at all as mamba is able to install environment.yml. mamba and conda again can install pip packages by putting a section

  - pip:
    - pyessv
    - git+https://github.com/cedadev/cc-yaml
    - git+https://github.com/cedadev/compliance-check-lib

in the environment.yml

If you do so, your 6 installation steps can be shortened to two:
1. Install conda or pip
2. Run conda env create -n atmodatenv -f environment.yml or pip install git+https://github.com/AtMoDat/atmodat_data_checker
This should do it all...

  1. I think it is also hard to create reproducible software when @master is in the installation because you cannot know how the development strategy is in the required package repository. Maybe they push things into master that they should not. You already created tags. I got atmodat-check-lib 1.1.0 depends on cc-yaml 0.0.0 (from git+https://github.com/cedadev/cc-yaml@master) when I install atmodat. I wonder how you will reproduce issues if someone pushes to cc-yaml's master after 1.1.0 was created?
@wachsylon
Copy link
Contributor Author

wachsylon commented Nov 11, 2021

Sorry, I edited that a lot. When I opened it, I thought it was more straight forward but it was not.... I am not sure how good conda and pip interact with each other. Maybe only one of pip install or conda env create can work and you cannot specify the same dependencies in both .yml and .txt. I hope you can find that out :)

@wachsylon
Copy link
Contributor Author

What already worked for me was to put

  - pip:
    - git+https://github.com/AtMoDat/atmodat_data_checker

and all dependencies from mamba_requirements.txt into the environment.yml and run conda env create -n atmodatenv -f environment.yml

@jkretz jkretz added the documentation Improvements or additions to documentation label Nov 11, 2021
@wachsylon
Copy link
Contributor Author

Ok it works only for the installation. run_checks seems not be installed correctly. If I activate the environment and run run_checks, I get

  File "/home/dkrz/k204210/miniconda3/envs/mambaenv/envs/taucenv/bin/run_checks", line 5, in <module>
    from run_checks import main
ModuleNotFoundError: No module named 'run_checks'

It also looks different:

cat /home/dkrz/k204210/miniconda3/envs/mambaenv/envs/taucenv/bin/run_checks
#!/home/dkrz/k204210/miniconda3/envs/mambaenv/envs/taucenv/bin/python
# -*- coding: utf-8 -*-
import re
import sys
from run_checks import main
if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
    sys.exit(main())

while a version that works looks like

cat /work/bm0021/conda-envs/quality-assurance/bin/run_checks
#!/work/bm0021/conda-envs/quality-assurance/bin/python3.9
# EASY-INSTALL-ENTRY-SCRIPT: 'atmodat-check-lib','console_scripts','run_checks'
import re
import sys

# for compatibility with easy_install; see #2198
__requires__ = 'atmodat-check-lib'

try:
    from importlib.metadata import distribution
except ImportError:
    try:
        from importlib_metadata import distribution
    except ImportError:
        from pkg_resources import load_entry_point


def importlib_load_entry_point(spec, group, name):
    dist_name, _, _ = spec.partition('==')
    matches = (
        entry_point
        for entry_point in distribution(dist_name).entry_points
        if entry_point.group == group and entry_point.name == name
    )
    return next(matches).load()


globals().setdefault('load_entry_point', importlib_load_entry_point)


if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
    sys.exit(load_entry_point('atmodat-check-lib', 'console_scripts', 'run_checks')())

But even if I use the working run_checks in my installation, I get

  File "/home/dkrz/k204210/miniconda3/envs/mambaenv/envs/taucenv/bin/run_checks", line 33, in <module>
    sys.exit(load_entry_point('atmodat-check-lib', 'console_scripts', 'run_checks')())
  File "/home/dkrz/k204210/miniconda3/envs/mambaenv/envs/taucenv/bin/run_checks", line 25, in importlib_load_entry_point
    return next(matches).load()
  File "/home/dkrz/k204210/miniconda3/envs/mambaenv/envs/taucenv/lib/python3.9/importlib/metadata.py", line 77, in load
    module = import_module(match.group('module'))
  File "/home/dkrz/k204210/miniconda3/envs/mambaenv/envs/taucenv/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'run_checks'

It seems like
pip install -U -e .
installs a module run_checks while the

  - pip:
    - git+https://github.com/AtMoDat/atmodat_data_checker

does not. I suspect the run_checks.py must be inside a source directory in the repo but it is just a guess..

@jkretz
Copy link
Collaborator

jkretz commented Nov 11, 2021

Thanks for all your comments.

Regarding the splitting up of the requirements/environments into different files. We recently decided to use mamba to install the conda packages, which gave us a significant speed up in the installation process. This, unfortunately, meant that the creation of the conda environment got a little more elaborate as instead of two command (i.e. conda env create -f environment.yml and conda activate atmodat) four commands were needed. Using mamba gave us a speed up of more than 30 seconds in the installation process at the expense of two more commands and the mamba_requirements.txt, which seemed acceptable for me.

I was not sure how well the dependencies between conda and pip can be resolved when installing them both via conda, that's why this is split up into two steps.

Regarding the installation of the checker itself via pip will definitely be something that we will address. At the moment, one has to point pip to the github repos, but in the long run, we aim to put the checker itself into a package. For that, we need to wait for cc-yaml and compliance-check-lib into be packaged first. The way that you proposed is definitely a first step into that direction. Nevertheless, one would then have two different codebases, one installed via pip and the other in the repo which is not related to the one installed via pip. While this might not be an issue for users, it definitely not ideal for development. We plan to release a new version of the checker next week that will have separate installation procedures for users and developers and maybe I can include your suggestion.

Fixing the respective version of the packages in the requirements was something I wanted to add for long but I always forgot to do it. I will open an issue for that so I won't forget.

@jkretz
Copy link
Collaborator

jkretz commented Nov 11, 2021

Interesting that this way of installing doesn't work. In the setup.py, run_checks.py is set as an entry point and the pip install then creates a link to the conda binaries so that it can be run from any directory. As I have never used your way of installing the checker, I never ran into this issue. As this might cause problems in a later pip package, I will have a closer look at that.

@wachsylon
Copy link
Contributor Author

Thanks for the detailed answer! One comment:

mamba and conda

I understand that you want to support mamba but I would only recommend it and not include that as a part of your program. You also do not provide a install script for conda in your repo. Even if you do, I would include mamba there instead of in a conda environment. In addition, your program does work with conda as well so why require mamba?

So all in all, I would split mamba from the repo - especially to have only one environment.yml which can be used by both mamba and conda.

@jkretz
Copy link
Collaborator

jkretz commented Nov 12, 2021

Well, the major reason for using mamba was to get a speedup of the whole installation process. The 30-45 second longer installation process via conda might not be too big of an issue for users, as they only need to do it once. The reason why I eventually decided to use mamba was that it speeded up the creation of the conda environment for the automated tests via the github actions.
We indeed had the idea to provide an installation script and one is already available in the repo for Linux/MacOS (see ancillary/install_adc.sh) and we would need to provide another one for Windows. Nevertheless, I am not in favor of offering such scripts as we would somehow break the OS independence that conda/mamba offers. Eventually, we hope to have everything nicely packaged, which will further simplify the installation process.
Overall, it is a matter of taste what you want to use as both, conda and mamba, install the same packages, only that solving the dependencies is much quicker via mamba as it is based on C++. Having only one envirnment.yml would be a nice thing and I will have a look if one can do so while still retaining the mamba compatibility. I like your idea of offering the two installation options so that the user has the choice.

@jkretz
Copy link
Collaborator

jkretz commented Nov 12, 2021

Did a quick test and sharing the envirnment.yml works if one installs mamba into the base environment. I usually like to keep the base environment as clean possible but it would be okay for me.
Another option would be to get rid of the envirnment.yml altogether and only provide a requirements.txt that installs the needed packages via a conda install or mamba install. With that mamba does not have to be installed in the base environment. This would then require an extra step in the installation procedure as we need to set up the conda environment first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

No branches or pull requests

2 participants