Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

orangutan model how to add adaptive introgression? #1554

Open
h-pawar opened this issue Mar 18, 2024 · 1 comment
Open

orangutan model how to add adaptive introgression? #1554

h-pawar opened this issue Mar 18, 2024 · 1 comment

Comments

@h-pawar
Copy link

h-pawar commented Mar 18, 2024

Short description of the problem:

Hello, building on the orangutan “TwoSpecies_2L11” demographic model, I would like to add adaptive introgression (from bornean into sumatran). Mutation arises any time from the species split to 1000 generations ago, passes to sumatran & is beneficial with s=0.1.

Following the tutorial - https://github.com/popsim-consortium/stdpopsim/blob/main/docs/selection_example.py, I

  • draw mutation in source popn
  • condition on allele freq >0 in source popn
  • condition on allele freq >0 in recipient popn (lines 84-90) # but there has not been an explicit call to an introgression pulse, so, I think migration happens in the next continuous migration rate instance.
  • mutation undergoes selection in recipient popn
  • condition on allele freq > 0.05 in recipient popn at end of simulation # line 106, I do not understand why the start time for this call (stdpopsim.ext.ConditionOnAlleleFrequency) is 0, should it be start_time=T_sel?

Would you expect this model to take such a long time to run (~6 days)? Or have I specified the introgression pulse incorrectly, does this need a separate call to be added?

Is there a way to check that the mutation has introgressed & correctly undergone selection in the recipient population?

How to reproduce the problem:

#!/usr/bin/python
#Pongo, adaptive introgression : test bornean into sumatran
#adaptive introgression, following https://github.com/popsim-consortium/stdpopsim/blob/main/docs/selection_example.py
import stdpopsim
import tskit
import random
import numpy
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("rep", type=int, help="rep number")
parser.add_argument("seln", type=float, help="selection coeff")
args = parser.parse_args()
repnum = args.rep
selcoeff = args.seln
#------------------------------------------------------------------------------------------------------------------------
species = stdpopsim.get_species("PonAbe")
demography = species.get_demographic_model("TwoSpecies_2L11")
contig = species.get_contig(length=100000,mutation_rate=demography.mutation_rate)
engine = stdpopsim.get_engine("slim")
samples = {"Sumatran": 11, "Bornean": 14}
#------------------------------------------------------------------------------------------------------------------------
#time species split in generations -https://popsim-consortium.github.io/stdpopsim-docs/stable/catalog.html#sec_catalog_PonAbe_genome
T_Abe_Pyg_split = 20157
#migrations occur per generation in ponabe locke model

t_delta = 1000 / demography.generation_time
#time of mutation
T_mut = random.uniform(1000, T_Abe_Pyg_split)

#have mutation segregating in source pop for a few generations
T_mig = random.uniform(T_mut - 100, T_mut - 5)

#but need to ensure mutation is passed before selection acts *
T_sel = random.uniform(t_delta, T_mig)
s = selcoeff

print(f'Parameters of rep:{repnum}, selcoeff:{s}, T_mut:{T_mut}, T_mig:{T_mig}, T_sel:{T_sel}')
#------------------------------------------------------------------------------------------------------------------------
    # Place the drawn mutation in the middle of the contig.
locus_id = "introgressed_locus"
contig.add_single_site(id=locus_id, coordinate=round(contig.length / 2))

extended_events = [
        # Draw mutation in source pop
        stdpopsim.ext.DrawMutation(
            time=T_mut,
            single_site_id=locus_id,
            population="Bornean",
        ),
        # Because the drawn mutation is neutral at the time of introduction,
        # it's likely to be lost due to drift. To avoid this, we condition on
        # the mutation having AF > 0 in source pop. If this condition is false at any
        # time between start_time and end_time, the simulation will be
        # returned to the point where the mutation was introduced.
        # Conditioning should start one generation after T_mut (not at T_mut!),
        # to avoid checking for the mutation before SLiM can introduce it.
            # have mutation segregating in source pop for a few generations
        stdpopsim.ext.ConditionOnAlleleFrequency(
            # Note: if T_mut ~= T_Den_split, then we end up with:
            #       GenerationAfter(T_mut) < T_Den_split,
            #       which will give an error due to "start_time < end_time".
            start_time=stdpopsim.ext.GenerationAfter(T_mut),
            end_time=T_mig,
            single_site_id=locus_id,
            population="Bornean",
            op=">",
            allele_frequency=0,
        ),
        # The source lineage has migrants entering the recipient lineage at T_mig,
        # so condition on AF > 0 in recipient pop
        stdpopsim.ext.ConditionOnAlleleFrequency(
            start_time=stdpopsim.ext.GenerationAfter(T_mig),
            end_time=0,
            single_site_id=locus_id,
            population="Sumatran",
            op=">",
            allele_frequency=0,
        ),
        # The mutation is positively selected in recipient pop at T_sel.
        # Note that this will have no effect, unless/until a mutation with the
        # specified mutation_type_id is found in the population.
        stdpopsim.ext.ChangeMutationFitness(
            start_time=T_sel,
            end_time=0,
            single_site_id=locus_id,
            population="Sumatran",
            selection_coeff=s,
            dominance_coeff=1,
        ),
        # Condition on AF > 0.05 in recipient pop at the end of the simulation.
          # Q - here do not understand why start time should be 0??
        stdpopsim.ext.ConditionOnAlleleFrequency(
            start_time=0,
            end_time=0,
            single_site_id=locus_id,
            population="Sumatran",
            op=">",
            allele_frequency=0.05,
        )
    ]

#Simulate.
ts = engine.simulate(
    demography,
    contig,
    samples,
    extended_events=extended_events,
    slim_burn_in=10,
    )

filename_abe=f"/scratch/admixlab/harvi/slim_out/orang/ai/testai.intoabe.{repnum}.{selcoeff}.trees"
ts.dump(filename_abe)
@petrelharp
Copy link
Contributor

Hello, @h-pawar! Sorry we didn't notice this earlier. I can't dig in deep here right now, but if this is still an issue:

Would you expect this model to take such a long time to run (~6 days)? Or have I specified the introgression pulse incorrectly, does this need a separate call to be added?

Well, how long does the model without adaptive introgression take to run? If it is much shorter than 6 days, probably what you are seeing is that the events you're conditioning on are very unlikely, so it is restarting a lot. I guess the way I would debug this is to start simple - so, remove some of the conditions and see if that helps? I would also do this debugging using a very strong rescaling factor, so it runs in minutes not days.

Is there a way to check that the mutation has introgressed & correctly undergone selection in the recipient population?

The short answer is: no, not an easy way? We do have pretty rigorous checks that these things function as they should, though. You could come up with ad-hoc ways like turn the selection coefficient way up and check that this increases the ending frequency?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants