Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added average grain width measurement to disordered tracing #999

Merged
merged 28 commits into from
Dec 4, 2024

Conversation

tcatley
Copy link
Collaborator

@tcatley tcatley commented Nov 14, 2024

Added function calculate_dna_width() into class disorderedTrace which uses the smoothed mask and the pruned trace to calculate the average width of the mask using a distance transform. Outputs a float value of the width in metres to the all_statistics.csv.

Copy link
Collaborator

@ns-rse ns-rse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling this @tcatley appreciated.

The tests are failing because you've added in the grain_width column to the resulting output files.

In order to update the regression test files you will have to install the tests dependencies and whilst at it may as well install the dev dependencies then you'll have the ability to run pre-commit locally.

pip install -e .[tests,dev]

To install pre-commit you can then...

pre-commit install

To update the test files so they include the additional field you then...

pytest --regtest-reset tests/test_processing.py

This will update files in the tests/_regtests_outputs/. You can add this to git with...

git add -u
git commit -m "tests(processing): Update regression test targets to include dna width"
git push

This will add the updated files, make a commit with the above message and push to the branch here on GitHub (it will be included in this Pull Request automagically).

Tests

Ideally we should have a test in place to check the method returns the correct values in isolation from the global regression test.

This involves adding a test to tests/tracing/test_disordered_tracing.py but if you look through you will see many are marked as skipped as they are awaiting being written.

I can write this up in the existing issue (or create a new one) if you don't fancy having a stab at it.

topostats/tracing/disordered_tracing.py Outdated Show resolved Hide resolved
@ns-rse
Copy link
Collaborator

ns-rse commented Nov 15, 2024

Thanks for the quick update/fixes @tcatley almost there but your second commit #09e10d1 change the column name to grain_width_mean whilst your first commit updated the regression test files with grain_width as the new column.

Thus you need to run the regeneration of the regression test files again. We can avoid duplicating commit messages by ammending and force pushing...

pytest --regtest-reset tests/test_processing.py
git add -u
git commit --amend
git push --force

@ns-rse
Copy link
Collaborator

ns-rse commented Nov 18, 2024

Hi @tcatley,

Thanks for addressing the changes. Unfortunately the tests still fail, but in a different location (this is progress!!! 😁 ).

Here its tests/tracing/test_disordered_tracing.py::test_trace_image_disordered that fails because again we have a dataframe with new columns.

Unfortunately these checks are made outside of the pytest-regtest frame work so we can't just run pytest --regtest-reset tests/tracing/test_disordered_tracing.py::test_trace_image_disordered. Rather we have to uncomment a section of code that writes the target files to disk, run the test to update the target files and include those. This is a bit more involved and for that reason I'll undertake the changes on a separate branch which we can merge back into this branch once I've fixed them.

@tcatley
Copy link
Collaborator Author

tcatley commented Nov 18, 2024

Great, thanks @ns-rse! Let me know if you need anything else from me

ns-rse added a commit that referenced this pull request Nov 18, 2024
Splits `test_trace_image_disordered` into two tests, the original compares dictionaries, the newer (suffixed with
`_dataframes`) compares the dataframes/csv files that are produced, of which there are two and one needed updating to
include the `grain_width_mean` column that #999 introduces.
ns-rse and others added 11 commits November 18, 2024 14:35
As [suggested](#996 (comment)) by @SylviaWhittle we could all do
with an _aide memoire_ when making Pull Requests that we have covered everything we should.

This Pull Request therefore introduces just such a template to aid contributors in making sure we have covered
everything.

---

- [ ] Existing tests pass. **N/A** no changes to code base introduced.
- [ ] Documentation has been updated and builds. **N/A** no changes to code base introduced.
- [x] Pre-commit checks pass.
- [ ] New functions/methods have typehints and docstrings. **N/A** no changes to code base introduced.
- [ ] New functions/methods have tests which check the intended behaviour is correct. **N/A** no changes to code base
  introduced.
Splits `test_trace_image_disordered` into two tests, the original compares dictionaries, the newer (suffixed with
`_dataframes`) compares the dataframes/csv files that are produced, of which there are two and one needed updating to
include the `grain_width_mean` column that #999 introduces.
Two final tests needed updating to account for new `grain_width_mean` columns.

Also required rebasing onto `main` and `ns-rse/994-unpin-topoly` so the changes look larger than they might.
@ns-rse
Copy link
Collaborator

ns-rse commented Nov 18, 2024

It was a bit more involved than I thought @tcatley and involved another set of regression tests being updated and rebasing onto existing branches that are out for pull requests (#995) which fix the tests in light of a newer version of topoly.

#1001 looks big but that is a consequence of "rebasing" which brings the changes from the other branches into yours.

Once #1001 is merged into this we should wait for #995 to be merged to main first (in essence this PR now branches from #995 and my #1001 branches from this #999, for those interested an article I found useful on this is Using stacked pull requests in GitHub - LogRocket Blog).

@ns-rse
Copy link
Collaborator

ns-rse commented Nov 18, 2024

For other reviewers...

I split the failing test into two, one for the nested dictionaries that are loaded from pickles and one for comparing dataframes via pytest-regtest.

I did some more reading on syrupy and think it might be a viable alternative to replace pytest-regtest as tests inherit/have the snapshot function passed in which is used to compare against the returned value of whatever is being tested. These are updated with the command line flag --snapshot-update (akin to --regtest-reset) but objects are stored as serialised objects rather than requiring to be print() and written to file.

Copy link
Collaborator

@ns-rse ns-rse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Existing tests pass.

Thanks for your contribution @tcatley, feel free to make your first merge to the main branch. 🎉

@MaxGamill-Sheffield
Copy link
Collaborator

@tcatley already had a simple test which I've quickly implemented and expanded slightly.
Might need a new reviewer as I've contributed now.

@MaxGamill-Sheffield MaxGamill-Sheffield linked an issue Nov 19, 2024 that may be closed by this pull request
Copy link
Collaborator

@ns-rse ns-rse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nomenclature and a query about the returned value for "jagged square" which to my naive mind should be wider than "simple square".

tests/tracing/test_disordered_tracing.py Show resolved Hide resolved
tests/tracing/test_disordered_tracing.py Outdated Show resolved Hide resolved
@ns-rse ns-rse added the v2.3.0 label Nov 19, 2024
@ns-rse ns-rse added v2.4.0 and removed v2.3.0 labels Nov 27, 2024
@ns-rse ns-rse mentioned this pull request Dec 4, 2024
2 tasks
- `px2nm` > `pixel_to_nm_scaling` so this keyword is consistent across all functions/methods.
- Removed an unnecessary comment line `# Code will go here`
- Already import `ndimage` from `scipy` so prefixed `distance_transform_edit()` rather than importing it in isolation.

I _haven't_ updated `docs/data_dictionary.md` with the new `grain_width_mean` field that is written to the output files
and will include that in #1001.
@ns-rse ns-rse merged commit 85d65c3 into main Dec 4, 2024
11 checks passed
@ns-rse ns-rse deleted the tcatley/dna-width branch December 4, 2024 13:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[feature] : Calculate width of DNA
3 participants