-
Notifications
You must be signed in to change notification settings - Fork 127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added new infofilestyle compatible with BIDS #12
Conversation
@chrisfilo - this should be checked against philips and ge dicoms as well. some of the things @Hanke worked on was to make heudiconv work with philips and siemens dicoms. (i still think that bids should just use dicom terms instead of using supposedly "common" vocabulary that brain imagers use) |
def get_slice_direction(dcm_file): | ||
|
||
|
||
ret_dirs = [("x","x-"),("y","y-"),("z-","z")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this should be dimension not direction. i don't think you want think about orientation in terms of x, y, and z except to relate to the scanner space. for example, if i'm collecting coronal slices the slice direction would be ~"y", while slice dimension would be 3.
ditto for phase encode - there is a reason why dicom uses row or col.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move the discussion about the spec away from the implementation. I'm happy to address your concerns here: https://docs.google.com/document/d/1HFUkAEE-pB-angVcYe6pf_-fVf4sCpOHKesUvfb8Grc/edit#
Re: phillips and GE - my implementation is Siemens specific, but it will just not set some fields if the Siemens specific headers are not set. I'm happy to extend it with phillips and GE specific heuristics when we figure them out, but that can be in another PR. Re: BIDS spec/DICOM names. Happy to address this on the spec discussion: https://docs.google.com/document/d/1HFUkAEE-pB-angVcYe6pf_-fVf4sCpOHKesUvfb8Grc/edit# |
I am torn. Advantage: A LOT faster than dcmstack. Disadvantage: (Over-)simplified. At first glance, I don't think this could work reliably with Philips (one slice per DICOM) files. I am personally leaning towards a consolidation on a single method to get DICOM meta data. I have given test files to the dcmstack author. He already has an implementation the uses 1/25 of the memory the current one needs for Philips datasets. Implementing a whitelist of meta data fields would speed things up in a similar fashion as this PR, but without loosing the generality. Conversion into BIDS-style info files (hence keeping the mode switch) could be done as a subsequent step, after the meta data extraction. Makes sense? |
+1 |
I agree that this code will extract fewer fields from phillips dicoms than siemens ones (it should not crash though), but I don't see this as a deal breaker since since we can add a philips specific implementation (which BTW might have to look at more than one file). Mind that dcmstack does not interpret the metadata from dicom files or try to conform how different vendors save certain fields. Even if we refactor the code to work on the metadata dictionary extracted by dcmstack it will still extract fewer fields from phillips files, because we simply have not figured out what are the right vendor specific heuristics. The true bottleneck is not whether we use dcmstack output or dicom files directly to generate BIDS JSON files, but figuring out quirks of different scanner vendors. I'm happy to add a third option "bids+dcmstack" that basically merges the two dictionaries, but I think users should have an option to choose "bids" alone as well since the extraction of metadata using dcmstack is so inefficient (currently if you choose "dcm2nii" converter you convert the files twice: once to write them down to disk and the second time to get "dcmstack" formatted metadata). |
@chrisfilo - some quirks - not all. in general dicom is pretty well defined. it's the individual research sequences and others that do not conform to it. for example there is a public field for diffusion gradient info, but few sequences use it. it has always been the case that private fields are problematic, but 90+% of the info is in the public fields. so dicomstack on any other would mostly have to spend the effort to extract the private pieces, which i completely agree happen to be important in brain imaging. |
fp.writelines(meta) | ||
elif infofile_style == 'bids': | ||
# we are not taking the first file because some multiband sequences report wrong slice timing in the first volume | ||
extract_metadata(dcmfiles[-1], infofile) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the other issue with this statement is that it assumes that all files have consistent info. something dcmstack clearly separates into fixed and lists. i think that should be maintained.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. I can add a check throwing a warning or exception if the parameters are not the same over all dicom files.
I would like to move forward with this PR. I can test it on Phillis data if you can send me some. In the meantime I started a conversation with dcmstack about adding BIDS fields into the JSON they embed: moloney/dcmstack#39 |
I still feel like this should be a step that is done after conversion to nifti and after extraction of DICOM meta data -- as it doesn't seem to impact any of the two procedures. I would rather not have to decide which ONE style of meta information STORAGE to use when I can have MULTIPLE for different purposes. It looks to me as if the desired content can be generated from the meta data extracted by dcmstack. If that is the case, I think we should have a dedicated to the turn dcmstack JSON into BIDS meta data. If that is not the case, dcmstack is not extracting some critical meta data -- which it should. |
# Conflicts: # bin/heudiconv
This has been made obsolete by switching to dcm2niix |
BF: In `find_compatible_fmaps_for_run`, actually check if the first element of the parameters list is a string. Thanks, @neurorepro
🚀 PR was released in |
Now users can choose between dcmstack and bids style JSON files.
We should consider combining those in the future, but this should serve as a working draft.
Note that I turned off provenance tracking for metadata extraction. I did this for simplicty and I can bring it back, but I wasn't sure how useful it is since the operations are performed by unversioned external python code (either dcmstack or the heuristics I just added).