-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ci: add full stack NLP regression test #295
Conversation
6af91ae
to
0c47a2a
Compare
tags: smartonfhir/cumulus-etl:latest | ||
|
||
- name: Download NLP images | ||
run: docker compose --profile covid-symptom up -d --quiet-pull |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear how to best cache the pulled images...? Do you think GitHub does some network-side caching? There are some rando actions in the github action marketplace that promise to cache pulled images... But it might be easy enough to do on our own too.
On the other hand... it's just GitHub's resources we're using - which makes me suspect they do some their-side caching, just to save their own money and maybe we don't have to be clever.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would assume they've got something akin to artifactory instance(s) inside their infrastructure. Since it's open source and we don't pay for time, I don't think we need to worry about optimizing - but if we were, it would probably be something similar - add a cache box and update docker hosts to point at it?
uses: actions/checkout@v3 | ||
uses: actions/checkout@v4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated a lot of actions here, but nothing interesting - just node 20 upgrades.
matches = list(filter(is_covid_match, matches)) | ||
matches = filter(is_covid_match, matches) | ||
|
||
# For better reliability when regression/unit testing, sort matches by begin / first code. | ||
# (With stable sorting, we want the primary sort to be done last.) | ||
matches = sorted(matches, key=lambda x: x.conceptAttributes and x.conceptAttributes[0].code) | ||
matches = sorted(matches, key=lambda x: x.begin) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only actual code change. Just sorting the ctakes results so we can safely compare over time.
This is overdue. We want to make sure that any NLP changes are not suprises. Questions that we now have PR-time answers for: - Does our Dockerfile build? (was only checked after merge before) - Does our built Docker work even a little bit? - Do the current NLP dependent images work even a little bit? - Are there unexpected regressions in our NLP pipeline? There might still be errors that could creep up in our NLP that this quick smoketest don't uncover. But it's a lot better than nothing!
0c47a2a
to
b143b5d
Compare
tags: smartonfhir/cumulus-etl:latest | ||
|
||
- name: Download NLP images | ||
run: docker compose --profile covid-symptom up -d --quiet-pull |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would assume they've got something akin to artifactory instance(s) inside their infrastructure. Since it's open source and we don't pay for time, I don't think we need to worry about optimizing - but if we were, it would probably be something similar - add a cache box and update docker hosts to point at it?
sed -i 's/"generated_on": "[^"]*", //g' $OUTDIR/*.ndjson | ||
diff -upr $DATADIR/expected-output $OUTDIR | ||
|
||
echo "All Good!" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no emojis? i am SHOCKED
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about it! 😄
This is overdue. We want to make sure that any NLP changes are not suprises.
Checklist
docs/
) needs to be updated