Assertion right_sib[last_root] == NULL_NODE failed
#983
Comments
Hmmm, "interesting"!
The reason all ancestors end up in a single group seems to be that every ancestor has >0 incoming edges, so this loop (lines 124 to 126 in 10b3c81) exits immediately.
Ping @benjeffery, as this is a linesweep thing.
Here's what the ancestor age/length distribution looks like, for those ancestors that actually get passed to [...]. So I should probably be truncating lengths. Here's a heatmap of where ancestors are located spatially (in terms of site index) and temporally; this also looks reasonable aside from the really long old ancestors. So maybe this is a bug rather than an issue with the ancestors themselves.
There aren't any NaN or negative times that you are passing as |
Nope, everything looks good up to line 176 in 10b3c81, whereafter things start to look off.
Thanks @nspope, I'll try to recreate.
@nspope I think the root of the issue here is that there are so many unique times. Could you try discretising the time array as a quick workaround? The code here should deal better with the original array, but that is more extensive work.
Thanks Ben, it does work when I round times to the nearest integer (which collapses a lot of the recent stuff, resulting in maybe 65K unique time points). Shall I leave this issue open as a reminder about the underlying issue with linesweep?
I think we should leave this open (and maybe change the title to reflect what needs to be done). What is the exact issue with having a huge number of unique times? Is the linesweep running out of space to store separate execution paths or something? I'm not sure I quite get what the underlying problem is, but it's very likely that reinference will involve every ancestor having a unique time.
You don't have to go as far as integers; rounding to the nearest 0.1 should do the trick too.
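The discretising workaround discussed above can be sketched in a few lines. This is a minimal illustration only: the function name `discretise_times` and its `step` parameter are hypothetical, and it operates on a plain list of floats rather than on whatever array the inference code actually consumes.

```python
def discretise_times(times, step=0.1):
    """Snap each time to the nearest multiple of `step` (hypothetical helper).

    A coarser `step` collapses more distinct times into the same value,
    which reduces the number of unique time points.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    # round() picks the nearest whole number of steps (ties go to the even
    # neighbour); multiplying by `step` maps the bin index back to a time.
    return [round(t / step) * step for t in times]


times = [0.01, 0.04, 0.12, 0.14, 1.96, 2.04]
print(len(set(times)))                         # unique times before rounding
print(len(set(discretise_times(times, 0.1))))  # after rounding to 0.1
print(len(set(discretise_times(times, 1.0))))  # after rounding to integers
```

Note that very recent times can round down to zero; whether that is acceptable depends on how the downstream code treats zero times, which this sketch does not address.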
The grouping algorithm proceeds by building the DAG of dependencies and then topologically sorting it. If all ancestors are considered at the scale of Nate's data, you get a DAG with 4,096,554,905 edges, which is too many to process in a reasonable time. Most of the edges are at the bottom of the DAG, so we just use the top. We look at the ancestors binned by time, and once the groups by time are large enough, we use the time groupings instead of the DAG.
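As a rough illustration of grouping by topological sort, here is a level-by-level version of Kahn's algorithm. It is a sketch under stated assumptions, not the project's actual implementation: the function name `group_by_dependencies`, the `(parent, child)` edge format, and the integer node ids are all hypothetical, and the switch to time-based groups is left out.

```python
from collections import defaultdict


def group_by_dependencies(num_nodes, edges):
    """Peel a dependency DAG into groups (hypothetical helper).

    `edges` holds (parent, child) pairs meaning the child depends on the
    parent. Each returned group contains only nodes whose dependencies all
    sit in earlier groups. Also returns how many nodes were never placed.
    """
    children = defaultdict(list)
    incoming = [0] * num_nodes
    for parent, child in edges:
        children[parent].append(child)
        incoming[child] += 1

    # Start from the nodes that depend on nothing.
    current = [n for n in range(num_nodes) if incoming[n] == 0]
    groups = []
    while current:
        groups.append(current)
        next_group = []
        for node in current:
            for child in children[node]:
                incoming[child] -= 1
                if incoming[child] == 0:
                    next_group.append(child)
        current = next_group

    placed = sum(len(group) for group in groups)
    return groups, num_nodes - placed


print(group_by_dependencies(4, [(0, 2), (1, 2), (2, 3)]))
print(group_by_dependencies(2, [(0, 1), (1, 0)]))
```

The second call shows the failure mode described earlier in the thread: when every node has at least one incoming edge, the peeling loop has nothing to start from and exits without producing any groups.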
I think in this case the number of edges overflowed; the code clearly needs to cope with that and error out.
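To make the suspected overflow concrete, the sketch below assumes the edge counter is a signed 32-bit integer. That width is an assumption, since the thread only says the count probably overflowed. Both `wrap_int32` and `checked_edge_count` are hypothetical names used for illustration.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(value):
    """Return what `value` becomes if stored in a signed 32-bit integer."""
    return (value + 2**31) % 2**32 - 2**31


def checked_edge_count(count, max_value=INT32_MAX):
    """Fail loudly instead of letting an edge count wrap around."""
    if count > max_value:
        raise OverflowError(
            f"{count} edges exceeds the maximum of {max_value}"
        )
    return count


num_edges = 4_096_554_905      # edge count quoted in the thread
print(num_edges > INT32_MAX)   # does it exceed the signed 32-bit range?
print(wrap_int32(num_edges))   # value after silent wraparound
```

Under this assumption the count wraps to a negative number, which is the kind of silent corruption an explicit check such as `checked_edge_count` would turn into a clear error.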
I can't recreate this locally as I don't have enough RAM. Will see if I can get on a bigger box and do it.
If it'd help, I can stick the ancestors somewhere where you can get them (the zarr store is 10 GB or so).
Thanks, but I've got the ancestors; it's the line sweep that is the issue. I think I might be able to do it if I truncate the samples enough to fit in RAM, but not so much that it suppresses the issue.
I wanted to reinfer an inferred tree sequence using site times, but am hitting this error (with current development version):
Here's code to reproduce, and here is the tree sequence. It takes 8-9 hours to get through ancestor matching with 24 threads, but only 30 min or so to generate ancestors:
This ran fine when trimming to a smaller interval (5Mb) and produced reasonable-looking results. A few odd things pop out here relative to the well-behaved test run: all ancestors are getting stuck into one group by the linesweep (as opposed to very many small groups with the test run), and after running ancestor matching separately to investigate, it turns out there's a single edge per node and a huge number of mutations (300 million).