-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Promoting INS to DUP issue #488
Comments
Hi Tom, Not as a Mutalyzer developer but as a long-time user; you can make your life much easier by combining steps 2-4 (or even 1-4 if you want) in only one Mutalyzer call, which also prevents you from having this issue. Your steps would then look like:
Cheers, |
Hi Tom, Thanks for posting this issue. Indeed, it seems like Mutalyzer should not normalize Best regards, |
Hi @ifokkema, I'm going through your suggestion in detail and I tried sending the variant with an NC name (i.e. I'm just a bit confused about your second note on point 2, about rolling in both directions. My understanding, which now might be local dogma not canonical at all, was that 3' shifting should be applied only at the transcript level. It certainly makes sense in terms of our use case. We rewrite the VCF so that the variants line up with the c. description, so that when the curators in Molecular Pathology look at them in IGV, the c., g. and VCF descriptions all line up. (We don't however do any kind of realignment to the BAM, so occasionally the pileup doesn't line up.) T. |
In the recommendations the following is mentioned: "the 3’rule applies to ALL descriptions (genome, gene, transcript and protein) of a given variant". I hope this answers your question. |
Hi @drtconway,
Yeah, you do need to reason over the JSON reply a bit; open
No, like @jfjlaros already mentions, it should be applied on all levels. This makes the variant "misalign" in the case of shifting in genes on the reverse strands, and makes genomic HGVS descriptions misalign with VCF notation always, as VCF likes to 5'-align the variant. Sources don't always fully 5'-shift their VCF files, so for comparison between data sets on a genomic level, both sources should be fully normalized to correctly compare them. |
Yeah, I totally understand and agree about the necessity for shifting to compare genomic descriptions. Possibly, we should compute both - for the reasons I cited. Thanks for the link - I'll take a look at your paper (I'm all for shamelessly plugging one's own work. :) ). |
We actually do calculate both, but at the moment this information is not exposed via the API, we will do so in an upcoming version of Mutalyzer. |
Hi Mutalyzer,
Thanks for your awesome work! We use mutalyzer on well over 1000 research & clinical assays per month and it's great!
We've run in to an issue which I think is actually a flaw in the HGVS specification, which Mutalyzer is faithfully implementing.
We have an annotation pipeline which has the following steps (mixed in with other stuff):
We've recently had a couple of cases where this blows up in our faces.
Consider the transilterated HGVSg for a 4bp insertion:
This is converted by Mutalyzer to the following HGVSc:
Which in turn is normalised by Mutalyzer to:
And the corresponding HGVSg produced by Mutalyzer for this is:
What started out as a 4bp insertion has turned into a 1124bp duplication (insertion)!
The reason is clear when we examine (e.g. in the web interface) the HGVSc form. In particular, the coding positions of the 5th and 6th exons:
The original HGVSc is an insertion just after the splice junction joining exons 5 & 6. Just by coincidence, the inserted sequence happens be the same as the preceding 4bp of the transcript (i.e. the last 2bp of exon 5 and the first 2bp of exon 6) (1/(4^4)=1/256, so I guess we shouldn't be too shocked). Accordingly, and following the prioritisation rules, the INS variant is promoted to a DUP variant. However, like 3' shifting variants onto or across splice junctions, this is misleading.
It seems to me that just as the 3' shifting rule has explicit exceptions relating to splice junctions, the prioritisation rules for duplications and inversions should also disallow promotion from insertion or deletion-insertion. Both duplications and inversions describe variants in terms of the sequence content of a reference sequence, not just position in the reference. As such, it is problematic when the implied sequence contains a splice junction.
Would you consider modifying Mutalyzer to avoid promoting ins->dup and delins->inv in such circumstances.
Thanks,
Tom.
The text was updated successfully, but these errors were encountered: