failed to get modbase info for record 6bcfdea5-e9eb-4fc9-a72e-81744a77df80, Skipped: AUX data not found #335
Comments
Hello @Tang-pro, The modified base information is contained in MM/ML auxiliary information tags, usually in SAM format. There is a specification here, see page 7. Although it is possible to preserve this information in FASTQ files, it tends to be brittle and I don't recommend it. I would try and get an unaligned BAM file from Dorado and align that with |
Thank you! The Dorado step was performed by a partner company. However, I still have a question. I have two species, both allotetraploids belonging to the same genus. If I want to compare the two species later, should I use their respective reference genomes or a unified reference genome?
Hello @Tang-pro, That's an interesting question! For most of the functions in |
Thank you for your patient reply! I have another question. The pass.fq.gz file provided by the sequencing company already contains methylation modification information and polyA tail length. Can I follow this workflow: first, align the fq file (with these modifications) to the reference genome to extract a reference transcriptome, and then align the same fq file to that reference transcriptome? The problem is that after the second alignment step, the resulting BAM file no longer retains the polyA tail length or the methylation information. Some people have suggested that this happens because I did not use the -y parameter in minimap2. Could the errors mentioned earlier be related to the missing -y parameter? As for why I need such a roundabout approach: unfortunately, the company did not provide a BAM file, and I do not have sufficient GPU resources to rerun Dorado.
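For anyone checking the same thing: minimap2's -y option copies the FASTQ header comment into the output, so the MM/ML tags can only reach the BAM if they are already present in the FASTQ headers as SAM-style fields. Below is a minimal Python sketch for inspecting that before aligning. It assumes plain 4-line FASTQ records and tab-separated tags of the form TAG:TYPE:VALUE in the header (the layout that samtools fastq -T writes). The function names and the example header are hypothetical, made up for illustration, and are not part of modkit or Dorado.

```python
import gzip
import re

# SAM-style optional field: two-character tag, one-letter type, then the value.
SAM_TAG = re.compile(r"^([A-Za-z][A-Za-z0-9]):([AifZHB]):(.*)$")


def header_tags(header_line):
    """Return {tag: value} for SAM-style fields found in a FASTQ header."""
    fields = header_line.rstrip("\n").split("\t")
    tags = {}
    for field in fields[1:]:  # fields[0] is "@read_id" plus any free text
        match = SAM_TAG.match(field)
        if match:
            tags[match.group(1)] = match.group(3)
    return tags


def has_modbase_tags(header_line):
    """True when both MM and ML are present in the header."""
    tags = header_tags(header_line)
    return "MM" in tags and "ML" in tags


def count_tagged_reads(fastq_path, limit=1000):
    """Count how many of the first `limit` reads carry MM/ML (assumes 4-line records)."""
    tagged = total = 0
    with gzip.open(fastq_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 != 0:  # the header is every fourth line
                continue
            total += 1
            tagged += has_modbase_tags(line)
            if total >= limit:
                break
    return tagged, total
```

Calling count_tagged_reads("pass.fq.gz") returns a (tagged, total) pair for the first 1000 reads. If tagged is 0, the tags are not in the headers in this form, and -y alone would not be expected to bring them into the aligned BAM.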
Hello @Tang-pro,
It's hard for me to recommend a method to use data in this form. Can you request the unaligned BAM from the sequencing provider? If so, you can pass this file directly to |
@ArtRand @rmp
Hey,
I received a file named pass.fq.gz, which the company basecalled using Dorado. I then mapped the reads from the .fq.gz file to the transcriptome. When I run this command:
modkit pileup ../reftrans/Y1_5_1.bam --ref /public/home/DRS/data_241224/Isoquant/Y1/Y1quant/Ref_trans/Y1_transcripts.fa Y1_5_1/Y1_5_1.bed --with-header -t 20 --motif DRACH 2
the following error occurs:
calculated chunk size: 30, interval size 100000, processing 3000000 positions concurrently