Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

secondary collapse fails on down/sub-sampled fastq #105

Open
sand60 opened this issue Oct 9, 2017 · 10 comments
Open

secondary collapse fails on down/sub-sampled fastq #105

sand60 opened this issue Oct 9, 2017 · 10 comments

Comments

@sand60
Copy link

sand60 commented Oct 9, 2017

Hi Daniel,
I am unable to run collapse on subset of downsampled fastq, it fails without any useful message. I used seqtk to downsample. All other tools can use those fastq without any errors,.
Thanks
Sukhinder

@dnbaker
Copy link
Contributor

dnbaker commented Oct 9, 2017

seqtk probably mangled the fastq and/or metadata somehow. Can you post a small sample of your seqtk-processed fastq?

@sand60
Copy link
Author

sand60 commented Oct 9, 2017

attached is sample fastq with 5000 reads,.thanks
seqtk_sample_5k_R1.fastq.gz

@dnbaker
Copy link
Contributor

dnbaker commented Oct 9, 2017

  1. Your reads are different lengths. BMFtools requires uniform read length.
  2. seqtk has mangled the names such that there is no comment field, which might cause some problems.

The first is probably your real problem.

@sand60
Copy link
Author

sand60 commented Oct 9, 2017

1 has usually worked for me,.and since the only difference during failing was seqtk. I think it is 2nd. Is that comment filed is mandatory for bmftools? I can try adding it back,.because I can see that is the only difference from my original fastq which worked. Thanks

@sand60
Copy link
Author

sand60 commented Oct 9, 2017

and that "Description" field is optional for fastq format,.so not sure why bmf must want it,.

@dnbaker
Copy link
Contributor

dnbaker commented Oct 9, 2017

What release are you using? Something related to this issue was fixed recently. #98 (comment)

@sand60
Copy link
Author

sand60 commented Oct 9, 2017

I have this, I guess I should re-pull?
commit 7279788
Merge: b40d9d6 cca362d
Author: Brett Kennedy [email protected]
Date: Wed Feb 22 15:33:22 2017 -0700
Merge pull request #95 from dnbh/master
Bug fixes, portability.

@dnbaker
Copy link
Contributor

dnbaker commented Oct 9, 2017

If you update to master, it should handle problem 2. And you shouldn't be working with varying read lengths as that assumption is made throughout the code base, but if you think it's working, then you can do what you want.

@sand60
Copy link
Author

sand60 commented Oct 10, 2017

Should this fix it? Do I need to do anything different to implement the change? Thanks

commit f9483ea
Merge: 7279788 7a6a3e4
Author: Brett Kennedy [email protected]
Date: Mon May 15 11:33:59 2017 -0600
Merge pull request #99 from dnbh/master
Fix issue 98

@dnbaker
Copy link
Contributor

dnbaker commented Oct 10, 2017

Yes, that commit does have the fix. (See

std::strcpy(ret->comment, seq->comment.s ? seq->comment.s: "");
).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants