-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
United-atom Types #220
Comments
@ramess101 -- I don't understand, what is the purpose of having a virtual site which is located ON the carbon atoms? Or are you just saying we want a single interaction site on the carbon, where the hydrogens are not used? @jchodera and @bannanc - thoughts on how we would want to do this with SMARTS/SMIRKS? Perhaps just a SMIRNOFF tag for "united atom" forcefields which allows specific SMIRKS patterns to be specified for atoms which are going to be dropped? Initially I thought that we should simply say that particles which do not get assigned LJ parameters get dropped, but this would be difficult given the hierarchical nature of the force fields (e.g. it would be difficult to ensure that a generic for a particular element doesn't cover atoms which should be treated as united atoms). This would allow us to approach the problem in several ways; for example, we might say that |
Sorry, I was just trying to be consistent with your terminology of "virtual site" but clearly that created more confusion than clarity. Yes, the united-atom model just has a single interaction site on the carbon whereas the hydrogens have no interaction site. |
Sorry, @ramess101 - I should clarify "virtual site" then. Say we are simulating |
@davidlmobley Yes, that makes sense. Thanks! |
Are united-atoms always the same combination of atoms or are they something we're going to learn? I'm not very familiar with united atoms so I'm trying to understand when they'd be applied. From the call today it sounds like you're assigning parameters (specifically Lennard Jones, non-bonding?) to a group of atoms rather than a single atom. In the SMIRNOFF format we don't really have
as those atoms would be assigned the generic non-bonding SMIRKS So far I've thought about two options for assign UA parameters.
If this is a hard and fast rule then you could imagine doing no.1 more genericly assigning null parameters to any hydrogen bound to a carbon (i.e.
|
The united-atom types have historically been assigned with chemical intuition. For example, distinguishing between CH3, CH2, CH2=, CH, CH=, etc. UA types make sense based on the notion of group contributions and what makes two groups different. Although I am not sure that learning the appropriate UA types is in the future plans it is certainly an interesting idea. For example, we could learn which hydrogens are not necessary to have explicitly in the model and which ones are required to predict a given property. My plan is to stick with the traditional approach of pre-assigning the UA types and then use Bayesian approaches to determine which non-bonded potentials are required to predict the desired properties. |
@bannanc's suggestions are a good starting point here! We first have to decide, however, which of two potential representations we are going to use:
I tend to favor approach 1, though I realize that may differ from how things are traditionally done. If we're OK with that, we can discuss the next steps that @bannanc has already proposed some ideas for---namely, how we tell SMIRNOFF to skip checks that ensure all atoms have nonbonded parameters when only a subset of sites will have interactions defined. |
@jchodera - do you have any thoughts on the likely relative performance of the two approaches? I agree that Option 2 is likely incompatible with a SMIRKS-based representation if we START with a reduced representation, though there could be an explicit reduction step. Probably this would require some sort of custom SMIRKS which would define groups of atoms to be replaced by an individual atom type. I find Option 1 conceptually more appealing because it recognizes the reality that there are still atoms involved and would make it possible to go back to an atomistic representation if we wanted. However, I worry that it might become slow to run, as well as cumbersome to build in the ways of keeping groups rigid, etc. On the other hand, Option 2 also sounds cumbersome to fit into the SMIRKS framework, so that still leaves me leaning towards Option 1. @ramess101 , @bannanc ? |
I don't think the performance of Option 1 will be very much degraded compared to Option 2, and it does have big advantages that make it easier to compute various properties directly from the simulations. The main issue is that Option 1 requires more valence parameters than Option 2, though this might be mitigated if groups of atoms could be rigidly constrained. |
OK, I'm still more inclined towards Option 1 then. It's an excellent point about the analysis -- this would make analysis via "standard tools" work well, whereas other approaches will require a mapping procedure before analysis can be done. |
In that case, we're back to the question of how to tell SMIRNOFF to assign nonbonded types to real or virtual sites for a subset of atoms, and ignore the fact that some atoms won't have nonbonded types. There are several options on the table (slightly modified from @bannanc's suggestions):
|
Sorry I didn't respond to this sooner.
I've been thinking about how we might want to use these in the long run.
I'm assuming that eventually we might want to learn which atoms are grouped
together. One way of checking learned types is that all atoms are labeled.
So everything in your system should have a non bonding force assigned to
it.
That was the thought process that initially led to my two thoughts about
labeling above. I don't think we want options where "missing nonbonded
types will be ignored." However, the SMIRKS pattern with multiple indices
in option B/C could still be used to explicitly label the hydrogens, even
though the group is sharing an interaction site.
Given that we're want options for real or virtual sites I think it makes
more sense to keep the groups of atoms labeled together (I.e. not option
A). Otherwise a virtual site would have to be defined with one of the
explicit atoms.
I'm not sure I understand the difference between options B and C wouldn't
labeling the group have to " assign[ing] real or virtual sites a nonbonded
parameter."
On Thu, Mar 23, 2017 at 14:19 John Chodera ***@***.***> wrote:
In that case, we're back to the question of how to tell SMIRNOFF to assign
nonbonded types to real or virtual sites for a subset of atoms, and ignore
the fact that some atoms won't have nonbonded types.
There are several options on the table (slightly modified from @bannanc
<https://github.com/bannanc>'s suggestions):
- *Option A:* Explicitly match atoms that are subsumed into parent
atoms, and assign null parameters. For hydrogens in a united-atom
methyl group, this might be "[#1
<#1
<#1>]".
- *Option B:* Create a special tag to mark groups of atoms that will
be represented by a single (real or virtual) site, and so missing nonbonded
types will be ignored. For example, for a methyl group, this could be
[#1:2]-[#6:1](-[#1:3])-[#1:4], where all 4 atoms in a methyl group are
grouped together because we will separately assign either a real or virtual
site in this group a parameter.
- *Option C:* Same as B, but also let this tag assign real or virtual
sites a nonbonded parameter. This could either be a modified form of
<NonbondedForce/> tags or a new type of <UnitedAtomNonbondedForce/>
tag.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#220 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ALagHXlRLZFhRXUrM2hnIjLIcr_hUu61ks5rouEEgaJpZM4Mbjtd>
.
--
********************************
Caitlin C. Bannan
Ph.D. Candidate
Mobley Group
Chemistry Department at UC Irvine
[email protected]
|
Since "labeling" atoms that are subsumed into a single interaction site isn't something that would appear in a
Option B would not change the existing type definitions for Option C would ether modify the way we use |
Yes, I think we're on the same page here @jchodera Thanks for clarifying with B/C I still don't have a preference for which one of those makes more sense. @davidlmobley do you? |
I guess the biggest question is whether we want to extend NonbondedForce to allow additionally tagged atoms (2, 3, 4...) to be "subsumed" into a single interaction site, or whether we want to create a whole new tag and generator for this. I can see advantages to simply extending NonbondedForce if it doesn't break current functionality, since otherwise we need to engineer communication between different generators so the type checking doesn't complain that atoms are untyped. Alternatively, we can engineer a simpler type checking scheme that runs separately at the end of the process to flag missing types, but this involves rearchitecting things with a new idea of how to do the checking (which may not be bad, just more effort now). We should also finalize the specification of both virtual sites and how we will assign interaction types to virtual sites. |
Assuming it doesn't break current uses it seems like it makes more sense to extend the NonbondedForce. This seems more logical to me than having multiple generators that create specify non-bonding interactions. Could we define subtypes similar to the proper and improper options for torsions. |
@davidlmobley Any more thoughts on this? I think we're leaning toward extending the For example, to create united-atom methyls and methylenes, we would go from <NonbondedForce coulomb14scale="0.833333" lj14scale="0.5" sigma_unit="angstroms" epsilon_unit="kilocalories_per_mole">
<Atom smirks="[#1:1]" rmin_half="1.1870" epsilon="0.0157"/> # hydrogen
<Atom smirks="[#6:1]" rmin_half="1.9080" epsilon="0.1094 "/> # carbon
</NonbondedForce> to <NonbondedForce coulomb14scale="0.833333" lj14scale="0.5" sigma_unit="angstroms" epsilon_unit="kilocalories_per_mole">
<Atom smirks="[#1:1]" rmin_half="1.1870" epsilon="0.0157"/> # hydrogen
<Atom smirks="[#6:1]" rmin_half="1.9080" epsilon="0.1094 "/> # carbon
<Atom smirks="[#1:2]-[#6:1]-[#1:3]" rmin_half="1.9080" epsilon="0.1094 "/> # -CH2-
<Atom smirks="[#1:2]-[#6:1]([#1:3])-[#1:4]" rmin_half="1.9080" epsilon="0.1094 "/> # -CH3
</NonbondedForce> where the labeled atoms numbered greater than 1 in the We could also add an optional attribute that would explicitly say that this atom should be treated as a united-atom site, like |
Yes, @jchodera - I didn't have further thoughts on this because I like the plan. |
I also like the idea of having BUT - does this create any complications for the other sections, such as bonds and angles? |
We still need to solidify the spec changes.
|
I'm happy with the above proposed spec, but I haven't had time to think through the broader implications for what would have to change in the code, nor do I understand what this would mean for bonds, angles, and torsions, and how we would manage "virtual bonds, angles, and torsions", or if we even need such things. It seems to me we need input from people who use these types of force fields ( @ramess101 ?) on what they want to be able to support, as I don't have any recent experience with coarse grained FFs.
It seems like we should punt until later, since we don't yet even have virtual sites for charges, which is a higher priority. I'll have to digest the rest a bit more and respond again soon. |
OK, I've thought through this a bit more, @jchodera , and, to respond to your comment:
I'm happy with it. I do have questions though - for example I'm not understanding how this would interplay with the assignment of the other parameters (bonds, angles, torsions); I'm thinking we would still assign these parameters to the atoms which are subsumed into other atoms, but then we wouldn't actually be using them when running dynamics because we'd be keeping the groups rigid? Is that what you have in mind?
I would punt. I'm tagging Richard on Slack though to see if he wants to weigh in on this.
This sort of gets at my question above. I don't really know because I have so little experience with CG force fields. I think my approach would be to keep the atomic ones (though they might need to be different in a CG FF) and reduce the number of nonbonded interaction sites. Effectively we'd be carrying dummy atoms around, like in an alchemical calculation, I think.
Right, though presumably computing the bonds, angles, and torsions for these extra sites will have some cost. But perhaps we can keep them rigid without too much trouble? Not sure which would be more costly though -- applying constraints or just computing the full bonded interactions for what are effectively "dummy" atoms. I do very much like that this makes it trivial to compute properties for a full all-atom simulation even from a united atom approach. |
Sorry for not seeing this earlier. The general approach for united-atom models is to utilize the same bond, angle, and torsional potentials as the all-atom cases but only for the heavy atoms. In other words, for a molecule like n-butane that is represented as CH3-CH2-CH2-CH3 we would use the same C-C bond lengths (and harmonic force constants) as the all-atom C-C bonds and the same for the C-C-C angles and C-C-C-C dihedrals. That being said, sometimes an all-atom model will use slightly more precise bond lengths and angles. For example, the TraPPE-UA model uses a 0.154 nm C-C length whereas the TraPPE-EH (explicit hydrogen) model uses a 0.1535 nm C-C length. Usually you do not waste computational time computing bonds, angles, and torsions for the extra hydrogen sites that are not included in the nonbonded calculations. But if the C-H bonds, H-C-H bends, etc. are kept rigid with little effort that is also a viable option. And, like you mentioned, this can make it quite simple to convert the united-atom model to an all-atom model. As far as the nonbonded potentials, I do think the Lennard-Jones m-n potential is gaining a lot of attention in force field development. The Exp-6 model is much less common in the recent literature, so if I had to pick one I would say that including the LJ m-n model would be the higher priority. As for virtual site based united-atoms, I agree that we should punt that until later. |
Thanks, @ramess101 . So, @jchodera - I think we agree on the path forward for now. |
United-atom types are quite popular in "data driven" force fields. For hydrocarbons the "virtual site" is located on the carbon atoms while the hydrogens are not accounted for explicitly. Hydrogens attached to oxygen, nitrogen, etc. are typically still included explicitly which means that for non-hydrocarbons there is a mix of explicit and implicit hydrogens. This might complicate matters since you cannot just ignore all hydrogens. Fortunately, in the immediate future (i.e. the next 2 years) the NIST force field will be focusing on hydrocarbons so we just need some way for smarty to ignore the hydrogens attached to carbon.
The text was updated successfully, but these errors were encountered: