different way to calculate approximate interaction strength in spatial sims? #360
Replies: 4 comments 3 replies
-
So, this was originally a chat on Slack between @andrewkern and myself. I will paste below my comments from there: Interesting. One aspect of this, the division of space into nearby versus distant cells for efficient spatial proximity searches, is already in SLiM in the form of the k-d tree (and did speed it up considerably!). The other aspect of this, lumping distant interactions together as a “single large particle centered at the cell’s center of mass”, is interesting but is typically handled in SLiM by simply neglecting those long-distance interactions altogether (via the maxDistance cutoff for interactions). That’s an approximation, and it sounds like a Barnes-Hut approach would provide a closer approximation, but really this seems like an area where physics and biology differ. In physics, the Andromeda galaxy’s gravitational effect on the Milky Way cannot be neglected entirely even though it is many light-years away; in biology, beyond a certain distance organisms are really just not going to interact at all; no matter how large a cluster of organisms might exist at some very distant location, that cluster is not going to exert any interaction effect upon a distant individual at all if the cluster’s distance is larger than the distance that the organisms might plausibly range over the course of their lifetimes. So SLiM’s maxDistance cutoff seems fine to me, and in a sense should be even more efficient than Barnes-Hut (since we just neglect those long-distance effects completely). All that said, Sam Champer in the Messer Lab is working on a very interesting and clever approach to optimizing spatial interaction calculations that does speed things up by perhaps 2x-3x. I can’t say more right now since it’s not my work. :-> @petrelharp this will interest you I’m sure. You know the current design well; do you see a way to leverage these ideas for further gains? I guess even within the maxDistance, the Barnes-Hut approach could be used to approximate relatively distant interactions via the “center of mass” and “total mass” as the Wikipedia page says, avoiding some of the interaction calculations that presently have to get calculated separately. Hmm, interesting. But to approximate a bunch of interactions as a single interaction with a “center of mass” would only work for some interaction kernel functions, right? Which ones? The math for that is beyond me. But it does seem like it would have the potential to considerably accelerate spatial simulations in which individuals interact with a large number of other individuals. This is an idea worth pursuing, perhaps. If N is the number of individuals that a focal individuals interacts with, within maxDistance, on average, then I wonder how large N has to be before this optimization would start to pay off? It might be larger than the typical biological value of N, since there is a fair bit of overhead involved in this algorithm – calculating the center of mass and total mass for the internal nodes of the spatial tree, and the additional logic for deciding when to calculate an individual interaction versus when to use the approximation with the center of mass. |
Beta Was this translation helpful? Give feedback.
-
For one focal individual in a population of size N, searching the k-d tree for nearby neighbors is O(log N) (https://en.wikipedia.org/wiki/K-d_tree), so it is O(N log N) to find nearby neighbors for everyone; so at that stage we are getting more or less the same performance as Barnes-Hut, which uses the very similar octree data structure for spatial searches (https://en.wikipedia.org/wiki/Octree). Then, if an average individual has M nearby neighbors (within maxDistance), evaluating the total interaction strength upon a focal individual is O(M), so evaluating that for all individuals is O(NM). Typically, M can be treated as a constant; it does not grow as N grows, assuming the "local neighborhood" for an individual stays at constant density (i.e., as N grows the size of the landscape grows proportionally to keep a constant density across the landscape). So in effect, evaluating the total interaction strength for all individuals, after nearby neighbors have been found, can be thought of as O(N) – constant time per individual. Given that, I think a Barnes-Hut type approach where the interaction strength of distant "clusters" of individuals gets approximated by a "center of mass" and "total mass" would basically just reduce M; the overall time for totalling the interactions would remain O(N) but would get smaller by some constant factor. For models in which an average individual interacts with many other individuals, the amount of that constant speedup might be large enough to be interesting; but as I wrote above, setting up for the Barnes-Hut approach would entail a substantial amount of overhead, and I think that setup overhead might be O(N log N), so it might not pay off. |
Beta Was this translation helpful? Give feedback.
-
@samchamper this thread might interest you... |
Beta Was this translation helpful? Give feedback.
-
This thread does indeed interest me! Those two algorithms (k-d trees and Octrees) look like they have a lot in common. I guess you'd call them siblings? The innovation of the Barnes-Hut seems to be the logic about when to not dig down all the way to the leaves of far away nodes. And as Ben said, with interactions between individuals we're pretty much always truncating the interaction at some distance anyway. And I'd hazard that in most systems we're modeling, the truncated area of the interaction is probably a relatively small portion of the entire modeled area, which I think would mean that this exact method isn't necessarily applicable here. But are there other optimization tricks to be done which are domain appropriate for us? I believe there are! As Ben mentioned, my current project is a new method to speed up spatial interactions. I'm particularly interested in simulating very large populations (i.e., tens of millions of individuals; I see this as quite critical for modeling gene drive). I've seen performance increases while running locally of up to 4x or so, but that number is pending validation after running the simulations on the cluster. Of course the speed isn't completely free - this method involves some spatial approximations in some ways. But I'm quite excited by the potential applications. The paper-writing has begun, but I think the work is still not quite ready to share. But at the rate the project is progressing, I'd say that should change within a month or so, if you're interested in hearing more! |
Beta Was this translation helpful? Give feedback.
-
During some random Saturday reading I came across the Barnes-Hut simulation algorithm for (approximately) doing n-body calculations in physics.
Given that calculating interaction strengths among individuals is a computational bottleneck in spatial sims, I wonder if some of the ideas here might be useful for SLiM?
As a relevant aside, the complexity of Barnes-Hut is O(n log n)-- what is the complexity of the current SLiM interaction strength calculation? Is that known?
Beta Was this translation helpful? Give feedback.
All reactions