Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cross-check SMIRFF energies vs parm@frosst on full AlkEthOH set (finished, tests added) #60

Merged
merged 13 commits into from
Jul 12, 2016

Conversation

davidlmobley
Copy link
Contributor

@davidlmobley davidlmobley commented Jul 12, 2016

Continuing my energy validation by cross-checking full set.

Found another parm@frosst/parm99 set of bugs where H2-CT-CT-O* and H3-CT-CT-O* were erroneously inheriting a generic torsion (X-CT-CT-X). Checking continues.

@davidlmobley
Copy link
Contributor Author

Resolves #58

@davidlmobley
Copy link
Contributor Author

davidlmobley commented Jul 12, 2016

I still have an issue with c1161 torsion C2 - O1 - C5 - O2 which is CT-OS-CT-OS; I get the error:

Exception: Error:  (proper) PeriodicTorsionForce entry (atoms 1, 5, 4, 6) has different numbers of terms (3 and 1, respectively).

It seems the expected SMARTS is not being matched in this case:

[#6X4:1]-[#8X4:2]-[#6X4:3]-[O&X2&H0:4] :        0 matches

Troubleshooting. Ah, yes, SMARTS typo in XML and prototype file. We can't have [#8X4]; even Christopher makes mistakes occasionally. Should be [#8X2]

@jchodera
Copy link
Member

jchodera commented Jul 12, 2016

Are you sure the O in [O&X2&H0:4] is valid, and that you don't actually want [#8X2&H0:4]?

@davidlmobley
Copy link
Contributor Author

@jchodera : Both [O&X2&H0:4] and [#8X2&H0:4] are valid. It was the X4 that was the issue, I believe, since oxygen can't have four connections. Testing.

@jchodera
Copy link
Member

Ha! Excellent point.

@davidlmobley
Copy link
Contributor Author

Now 253 out of ~900 in the "chains" subset pass, so we're making real headway. Next up, AlkEthOH_c1302, where the error is

AssertionError: Error: Harmonic Bond distance (0, 1) has equilibrium distances of 0.957200 and 0.960000 angstroms, respectively.

(now with units, thanks to debugging I did earlier on the system checker in openmoltools - choderalab/openmoltools#228 ).

OK, but this is water, so our tests should bypass this as the AMBER number is drawing on TIP3P water and our AlkEthOH parameter file is not attempting to reproduce that. Skipping in tests.

@davidlmobley
Copy link
Contributor Author

All 900+ chains now pass. Onward to rings. First up, r0:

AssertionError: Systems have different (proper) PeriodicTorsionForce entries: extra keys are:
set([(0, 3, 2, 1), (0, 1, 2, 3)])

Well, that's interesting. This is a fun compound, SMILES OC1C(O)CO1. I have no idea so far what's going on here. I believe this corresponds to the CT-CT-CT-OS torsion, so the torsion within the ring.

It's perhaps worth noting that C1-C2-C3-O1 (0, 1, 2, 3) and C3-C2-C1-O1 (2, 1, 0, 3) ought to be symmetry-equivalent torsions I think, but that is NOT what's going on here. The (0, 3, 2, 1) torsion is C1-O1-C3-C2, or CT-OS-CT-CT. Weird.

Torsional matches here:

PeriodicTorsionGenerator:

                               [a,A:1]-[#6X4:2]-[#6X4:3]-[a,A:4] :       17 matches
                                [a,A:1]-[#6X4:2]-[#8X2:3]-[#1:4] :        6 matches
                               [a,A:1]-[#6X4:2]-[#8X2:3]-[!#1:4] :        5 matches
                                 [#1:1]-[#6X4:2]-[#6X4:3]-[#1:4] :        3 matches
                               [#1:1]-[#6X4:2]-[#6X4:3]-[#6X4:4] :        3 matches
                               [#6X4:1]-[#6X4:2]-[#8X2:3]-[#1:4] :        3 matches
                             [#6X4:1]-[#6X4:2]-[#6X4:3]-[#6X4:4] :        0 matches
                             [#6X4:1]-[#6X4:2]-[#8X2:3]-[#6X4:4] :        1 matches
                          [#6X4:1]-[#8X2:2]-[#6X4:3]-[O&X2&H0:4] :        0 matches
                             [#8X2:1]-[#6X4:2]-[#6X4:3]-[#8X2:4] :        3 matches
                               [#8X2:1]-[#6X4:2]-[#6X4:3]-[#1:4] :        6 matches
                          [#1:1]-[#6X4:2]-[#6X4:3](-[#8])-[#1:4] :        7 matches
                        [#1:1]-[#6X4:2](-[#8])-[#6X4:3]-[#6X4:4] :        4 matches
                                [#1:1]-[#6X4:2]-[#6X4:3]-[OX2:4] :        6 matches
                  [#1:1]-[#6X4:2](-[#8])(-[#8])-[#6X4:3]-[OX2:4] :        1 matches
                               [a,A:1]~[#6X3:2]([a,A:3])~[OX1:4] :        0 matches

@davidlmobley
Copy link
Contributor Author

davidlmobley commented Jul 12, 2016

Seems like the relevant term should be

<Proper smirks="[#6X4:1]-[#6X4:2]-[#8X2:3]-[#6X4:4]" idivf1="1" periodicity1="3" phase1="0.0" k1="0.383" idivf2="1" periodicity2="2" phase2="180.0" k2="0.1"/> <!-- CT-CT-OS-CT from frcmod.Frosst_AlkEthOH -->

the CT-CT-OS-CT torsion, which is not consistent with either ordering here. This DOES get one match here, so possibly this is just a problem with handling of torsions in cyclic molecules in the system_checker... ??

Relevant code is here: https://github.com/choderalab/openmoltools/blob/master/openmoltools/system_checker.py#L388

@davidlmobley
Copy link
Contributor Author

So if I manually look at this SMIRKS ([#6X4:1]-[#6X4:2]-[#8X2:3]-[#6X4:4]) and the molecule, I conclude it should either match C2-C3-O1-C1 (1, 2, 3, 0) or the symmetry-equivalent C2-C1-O1-C3 (1, 0, 3, 2). Neither of these matches EITHER set of orderings above. So, I'm at a loss for now. I'll have to revisit tomorrow.

Maybe AMBER and/or OpenMM is doing something internally to reorder torsions for rings.

I'll have to revisit tomorrow/soon, as I'm at a loss for now.

@davidlmobley
Copy link
Contributor Author

I suppose the above atoms could be covered by the generic X-CT-CT-X, so maybe it's just an issue with pairing of torsional terms in the tests - i.e. since there are multiple torsional terms involving some of the same atoms in this ring, maybe the checker is not picking the correct pair for comparison.

@jchodera
Copy link
Member

I think you're right---I expect the problem is with rendering the (i,j,k,l) quartet into a set in which the order is invariant here to check equivalence.

Perhaps you could instead use the same ValenceDict I wrote for SMIRFF that checks a key forward and backwards to find a match?

@davidlmobley
Copy link
Contributor Author

I'll check that out, @jchodera . Thanks!

@davidlmobley
Copy link
Contributor Author

Hmm, @jchodera - it's not actually looking to me like that's the issue. I've output the entire table of torsions identified for this molecule and the AMBER-originated system really is picking up an 0-3-2-1 (C1-O1-C3-C2 / CT-OS-CT-CT) torsion which is not being picked up when generating from SMIRFF - i.e. for SMIRFF we don't get an 0-3-2-1 torsion NOR an 1-2-3-0 torsion.

(units degrees and kilocalories_per_mole):

Num (type)   Num (type)      Num (type)      Num (type)      per     phase   k0
...
  0 ( C1)-   3 ( O1)-    2 ( C3)-      1 ( C2)-      2.000000    180.000077      0.418400
  0 ( C1)-   3 ( O1)-    2 ( C3)-      1 ( C2)-      3.000000    0.000000    1.602472
..

I'm seeing O1 appearing in 19 torsions in the AMBER case and 16 torsions in OpenMM. Here's the full output (AMBER at the top). I'm walking through it to see if I can tell what's wrong.

Num (type)   Num (type)      Num (type)      Num (type)      per     phase   k0
  0 ( C1)-   1 ( C2)-    2 ( C3)-      9 ( H4)-      3.000000    0.000000    0.650844
  0 ( C1)-   1 ( C2)-    4 ( O2)-     10 ( H5)-      1.000000    0.000000    1.046000
  0 ( C1)-   1 ( C2)-    4 ( O2)-     10 ( H5)-      3.000000    0.000000    0.669440
  0 ( C1)-   3 ( O1)-    2 ( C3)-      9 ( H4)-      3.000000    0.000000    1.603867
  1 ( C2)-   2 ( C3)-    5 ( O3)-     11 ( H6)-      1.000000    0.000000    1.046000
  1 ( C2)-   2 ( C3)-    5 ( O3)-     11 ( H6)-      3.000000    0.000000    0.669440
  6 ( H1)-   0 ( C1)-    1 ( C2)-      2 ( C3)-      3.000000    0.000000    0.650844
  7 ( H2)-   0 ( C1)-    1 ( C2)-      2 ( C3)-      3.000000    0.000000    0.650844
  2 ( C3)-   1 ( C2)-    4 ( O2)-     10 ( H5)-      1.000000    0.000000    1.046000
  2 ( C3)-   1 ( C2)-    4 ( O2)-     10 ( H5)-      3.000000    0.000000    0.669440
  6 ( H1)-   0 ( C1)-    3 ( O1)-      2 ( C3)-      3.000000    0.000000    1.603867
  7 ( H2)-   0 ( C1)-    3 ( O1)-      2 ( C3)-      3.000000    0.000000    1.603867
  3 ( O1)-   0 ( C1)-    1 ( C2)-      8 ( H3)-      1.000000    0.000000    1.046000
  3 ( O1)-   0 ( C1)-    1 ( C2)-      8 ( H3)-      3.000000    0.000000    0.000000
  3 ( O1)-   2 ( C3)-    1 ( C2)-      8 ( H3)-      1.000000    0.000000    1.046000
  3 ( O1)-   2 ( C3)-    1 ( C2)-      8 ( H3)-      3.000000    0.000000    0.000000
  3 ( O1)-   2 ( C3)-    5 ( O3)-     11 ( H6)-      3.000000    0.000000    0.697333
  6 ( H1)-   0 ( C1)-    1 ( C2)-      4 ( O2)-      1.000000    0.000000    1.046000
  6 ( H1)-   0 ( C1)-    1 ( C2)-      4 ( O2)-      3.000000    0.000000    0.000000
  7 ( H2)-   0 ( C1)-    1 ( C2)-      4 ( O2)-      1.000000    0.000000    1.046000
  7 ( H2)-   0 ( C1)-    1 ( C2)-      4 ( O2)-      3.000000    0.000000    0.000000
  4 ( O2)-   1 ( C2)-    2 ( C3)-      9 ( H4)-      3.000000    0.000000    0.650844
  5 ( O3)-   2 ( C3)-    1 ( C2)-      8 ( H3)-      1.000000    0.000000    1.046000
  5 ( O3)-   2 ( C3)-    1 ( C2)-      8 ( H3)-      3.000000    0.000000    0.000000
  6 ( H1)-   0 ( C1)-    1 ( C2)-      8 ( H3)-      3.000000    0.000000    0.650844
  7 ( H2)-   0 ( C1)-    1 ( C2)-      8 ( H3)-      3.000000    0.000000    0.650844
  8 ( H3)-   1 ( C2)-    2 ( C3)-      9 ( H4)-      3.000000    0.000000    0.650844
  8 ( H3)-   1 ( C2)-    4 ( O2)-     10 ( H5)-      3.000000    0.000000    0.697333
  9 ( H4)-   2 ( C3)-    5 ( O3)-     11 ( H6)-      3.000000    0.000000    0.697333
  0 ( C1)-   1 ( C2)-    2 ( C3)-      3 ( O1)-      3.000000    0.000000    0.650844
  0 ( C1)-   1 ( C2)-    2 ( C3)-      5 ( O3)-      3.000000    0.000000    0.650844
  0 ( C1)-   3 ( O1)-    2 ( C3)-      1 ( C2)-      2.000000    180.000077      0.418400
  0 ( C1)-   3 ( O1)-    2 ( C3)-      1 ( C2)-      3.000000    0.000000    1.602472
  0 ( C1)-   3 ( O1)-    2 ( C3)-      5 ( O3)-      3.000000    0.000000    1.603867
  1 ( C2)-   0 ( C1)-    3 ( O1)-      2 ( C3)-      2.000000    180.000077      0.418400
  1 ( C2)-   0 ( C1)-    3 ( O1)-      2 ( C3)-      3.000000    0.000000    1.602472
  3 ( O1)-   0 ( C1)-    1 ( C2)-      2 ( C3)-      3.000000    0.000000    0.650844
  3 ( O1)-   0 ( C1)-    1 ( C2)-      4 ( O2)-      2.000000    0.000000    4.916200
  3 ( O1)-   0 ( C1)-    1 ( C2)-      4 ( O2)-      3.000000    0.000000    0.602496
  3 ( O1)-   2 ( C3)-    1 ( C2)-      4 ( O2)-      2.000000    0.000000    4.916200
  3 ( O1)-   2 ( C3)-    1 ( C2)-      4 ( O2)-      3.000000    0.000000    0.602496
  4 ( O2)-   1 ( C2)-    2 ( C3)-      5 ( O3)-      2.000000    0.000000    4.916200
  4 ( O2)-   1 ( C2)-    2 ( C3)-      5 ( O3)-      3.000000    0.000000    0.602496
Num (type)   Num (type)      Num (type)      Num (type)      per     phase   k0
  1 ( C2)-     0 ( C1)-        3 ( O1)-        2 ( C3) -     3.000000    0.000000    1.602472
  1 ( C2)-     0 ( C1)-        3 ( O1)-        2 ( C3) -     2.000000    180.000000      0.418400
  4 ( O2)-     1 ( C2)-        2 ( C3)-        5 ( O3) -     3.000000    0.000000    0.602496
  4 ( O2)-     1 ( C2)-        2 ( C3)-        5 ( O3) -     2.000000    0.000000    4.916200
  5 ( O3)-     2 ( C3)-        1 ( C2)-        8 ( H3) -     3.000000    0.000000    0.000000
  5 ( O3)-     2 ( C3)-        1 ( C2)-        8 ( H3) -     1.000000    0.000000    1.046000
  0 ( C1)-     1 ( C2)-        2 ( C3)-        5 ( O3) -     3.000000    0.000000    0.650844
  8 ( H3)-     1 ( C2)-        4 ( O2)-       10 ( H5) -     3.000000    0.000000    0.697333
  4 ( O2)-     1 ( C2)-        0 ( C1)-        6 ( H1) -     3.000000    0.000000    0.000000
  4 ( O2)-     1 ( C2)-        0 ( C1)-        6 ( H1) -     1.000000    0.000000    1.046000
  8 ( H3)-     1 ( C2)-        2 ( C3)-        9 ( H4) -     3.000000    0.000000    0.650844
  2 ( C3)-     1 ( C2)-        4 ( O2)-       10 ( H5) -     3.000000    0.000000    0.669440
  2 ( C3)-     1 ( C2)-        4 ( O2)-       10 ( H5) -     1.000000    0.000000    1.046000
  3 ( O1)-     0 ( C1)-        1 ( C2)-        8 ( H3) -     3.000000    0.000000    0.000000
  3 ( O1)-     0 ( C1)-        1 ( C2)-        8 ( H3) -     1.000000    0.000000    1.046000
  6 ( H1)-     0 ( C1)-        1 ( C2)-        8 ( H3) -     3.000000    0.000000    0.650844
  2 ( C3)-     1 ( C2)-        0 ( C1)-        7 ( H2) -     3.000000    0.000000    0.650844
  0 ( C1)-     1 ( C2)-        4 ( O2)-       10 ( H5) -     3.000000    0.000000    0.669440
  0 ( C1)-     1 ( C2)-        4 ( O2)-       10 ( H5) -     1.000000    0.000000    1.046000
  9 ( H4)-     2 ( C3)-        5 ( O3)-       11 ( H6) -     3.000000    0.000000    0.697333
  0 ( C1)-     1 ( C2)-        2 ( C3)-        9 ( H4) -     3.000000    0.000000    0.650844
  2 ( C3)-     1 ( C2)-        0 ( C1)-        6 ( H1) -     3.000000    0.000000    0.650844
  2 ( C3)-     3 ( O1)-        0 ( C1)-        6 ( H1) -     3.000000    0.000000    1.603867
  4 ( O2)-     1 ( C2)-        0 ( C1)-        7 ( H2) -     3.000000    0.000000    0.000000
  4 ( O2)-     1 ( C2)-        0 ( C1)-        7 ( H2) -     1.000000    0.000000    1.046000
  3 ( O1)-     0 ( C1)-        1 ( C2)-        4 ( O2) -     3.000000    0.000000    0.602496
  3 ( O1)-     0 ( C1)-        1 ( C2)-        4 ( O2) -     2.000000    0.000000    4.916200
  1 ( C2)-     2 ( C3)-        5 ( O3)-       11 ( H6) -     3.000000    0.000000    0.669440
  1 ( C2)-     2 ( C3)-        5 ( O3)-       11 ( H6) -     1.000000    0.000000    1.046000
  0 ( C1)-     3 ( O1)-        2 ( C3)-        5 ( O3) -     3.000000    0.000000    1.603867
  3 ( O1)-     2 ( C3)-        1 ( C2)-        8 ( H3) -     3.000000    0.000000    0.000000
  3 ( O1)-     2 ( C3)-        1 ( C2)-        8 ( H3) -     1.000000    0.000000    1.046000
  2 ( C3)-     1 ( C2)-        0 ( C1)-        3 ( O1) -     3.000000    0.000000    0.650844
  2 ( C3)-     3 ( O1)-        0 ( C1)-        7 ( H2) -     3.000000    0.000000    1.603867
  0 ( C1)-     3 ( O1)-        2 ( C3)-        9 ( H4) -     3.000000    0.000000    1.603867
  7 ( H2)-     0 ( C1)-        1 ( C2)-        8 ( H3) -     3.000000    0.000000    0.650844
  4 ( O2)-     1 ( C2)-        2 ( C3)-        9 ( H4) -     3.000000    0.000000    0.650844
  3 ( O1)-     2 ( C3)-        1 ( C2)-        4 ( O2) -     3.000000    0.000000    0.602496
  3 ( O1)-     2 ( C3)-        1 ( C2)-        4 ( O2) -     2.000000    0.000000    4.916200
  3 ( O1)-     2 ( C3)-        5 ( O3)-       11 ( H6) -     3.000000    0.000000    0.697333

@davidlmobley
Copy link
Contributor Author

davidlmobley commented Jul 12, 2016

OK, well this is interesting. Focusing in on all torsions involving O1 and parsing down to only what is different (i.e. removing all torsions which are the same) I get these for O1:

Num (type)   Num (type)      Num (type)      Num (type)      per     phase   k0
  0 ( C1)-   1 ( C2)-    2 ( C3)-      3 ( O1)-      3.000000    0.000000    0.650844
  0 ( C1)-   3 ( O1)-    2 ( C3)-      1 ( C2)-      2.000000    180.000077      0.418400
  0 ( C1)-   3 ( O1)-    2 ( C3)-      1 ( C2)-      3.000000    0.000000    1.602472
  3 ( O1)-   2 ( C3)-    1 ( C2)-      4 ( O2)-      2.000000    0.000000    4.916200
  3 ( O1)-   2 ( C3)-    1 ( C2)-      4 ( O2)-      3.000000    0.000000    0.602496
Num (type)   Num (type)      Num (type)      Num (type)      per     phase   k0
  3 ( O1)-     0 ( C1)-        1 ( C2)-        4 ( O2) -     3.000000    0.000000    0.602496
  3 ( O1)-     0 ( C1)-        1 ( C2)-        4 ( O2) -     2.000000    0.000000    4.916200

Top is AMBER prmtop, bottom is SMIRFF.

Top corresponds to CT-CT-CT-OS, CT-OS-CT-CT (two torsions), and OS-CT-CT-OH (two torsions).

This coudl be a bug relating to sets -- specifically, Christopher suggests that ONLY for four membered rings can you have non-equivalent torsions which correspond to the same set. Specifically, here, the AMBER (apparently correct) CT-CT-CT-OS torsion (or OS-CT-CT-CT) is atoms [ 0, 1, 2, 3 ] AND ALSO [ 2, 1, 0, 3] or [ 3, 0, 1, 2] (both should be present, as they are alternative paths around the molecule). These correspond to the same set. (It's also the case that there are other torsions which also correspond to the same set here, i.e. CT-OS-CT-CT is for example [ 1, 0, 3, 2 ] (C2, C1, O1, C3). Not all of these (but some of these) get dropped, so it's not clear to me that it's definitely a problem with sets.)

@davidlmobley
Copy link
Contributor Author

OK, I just tested on cyclobutane and it fails as well.

@davidlmobley
Copy link
Contributor Author

Yup, specific four-membered ring bug, @jchodera , as C1CCC1 fails but C1CCCC1 and C1CCCCC1 pass. Will troubleshoot code looking for problematic set comparisons.

@davidlmobley
Copy link
Contributor Author

Not seeing any apparently problematic set comparisons. The only thing I see which I wonder about is the use of the "unique" flag when finding SMIRKS matches, but I can't find the docs on that. Investigating.

@davidlmobley
Copy link
Contributor Author

If I switch https://github.com/open-forcefield-group/smarty/blob/master/smarty/forcefield.py#L244 to unique = False, cyclobutane passes tests. Per @cbayly13 , from the docs:

Match
OESystem::OEIterBase *Match(const OEMolBase &,
bool uniquematch=false) const
Performs search in order to identify subgraphs of OEMolBase which are isomorphic to the graph with which an OESubSearch object was initialized. It returns an iterator pointer (OEIterBase) over graph matches (OEMatchBase), and should be assigned to an OEIter < OEMatchBase > in order to prevent memory leaks.

By default all possible subgraphs up to and including the maximum number of matches are returned by these methods. If the ‘uniquematch’ argument passed to the methods is true then only the unique matches will be included in the iterator over the matches. A match or subgraph is considered unique if it differs from all other subgraphs found previously by at least one atom. Two subgraph matches which cover the same atoms, albeit in different orders, will be called duplicates and the subgraph found later in the search will be discarded.

I'm going to blast through the rest of the molecules again with this set to False to see if it creates any other problems.

@davidlmobley
Copy link
Contributor Author

Now everything passes.

Next stop: implement some energy comparison checks as part of the standard tests, then this can be closed and SMIRFF v1 will be done!

@davidlmobley davidlmobley changed the title [WIP] Cross-check SMIRFF energies vs parm@frosst on full AlkEthOH set Cross-check SMIRFF energies vs parm@frosst on full AlkEthOH set (finished, tests added) Jul 12, 2016
@davidlmobley
Copy link
Contributor Author

Added new Travis/nose tests to cross-check OpenMM energies beginning from SMIRFF parm@frosst against AMBER .prmtop and .crd parm@frosst. The entire AlkEthOH set now passes locally on my machine, and several molecules are now automatically tested as part of the tests.

@davidlmobley
Copy link
Contributor Author

Going ahead and merging because it passed before I updated the README.md. Travis is stuck for some reason I think.

@davidlmobley
Copy link
Contributor Author

@bannanc - you were curious about non-unique torsion matches. See particularly this thread, and especially #60 (comment) and #60 (comment). Basically if we set unique=True in the OpenEye tools we run into problems for torsions which involve all of the same four atoms but in different orders even if they ARE a different torsion. This happens I think only for four-membered rings.

@davidlmobley
Copy link
Contributor Author

(To be more specific, @bannanc , imagine a four-membered ring involving carbons 1, 2, 3, and 4. There is a torsion 1-2-3-4 and a torsion 4-3-2-1 which of course are the same torsion. But there are also other torsions involving the same four atoms (2-3-4-1, 3-4-1-2) which are not the same torsion. Per the OpenEye docs, since these all involve the same four atoms, only one of them is unique.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants