Parallel PopGen Package
GO_Fish Namespace Reference

Namespace for single-locus, forward, Monte-Carlo Wright-Fisher simulation and output data structures.

Classes

struct  allele_trajectories
 control and output data structure for GO_Fish simulation
 
struct  mutID
 structure specifying the ID for a mutation in a GO_Fish simulation
 

Functions

template<typename Functor_mutation , typename Functor_demography , typename Functor_migration , typename Functor_selection , typename Functor_inbreeding , typename Functor_dominance , typename Functor_preserve , typename Functor_timesample >
__host__ void run_sim (allele_trajectories &all_results, const Functor_mutation mu_rate, const Functor_demography demography, const Functor_migration mig_prop, const Functor_selection sel_coeff, const Functor_inbreeding FI, const Functor_dominance dominance, const Functor_preserve preserve_mutations, const Functor_timesample take_sample)
 runs a single-locus Wright-Fisher simulation specified by the given simulation functions and sim_constants, storing the results into all_results
 
template<typename Functor_mutation , typename Functor_demography , typename Functor_migration , typename Functor_selection , typename Functor_inbreeding , typename Functor_dominance , typename Functor_preserve , typename Functor_timesample >
__host__ void run_sim (allele_trajectories &all_results, const Functor_mutation mu_rate, const Functor_demography demography, const Functor_migration mig_prop, const Functor_selection sel_coeff, const Functor_inbreeding FI, const Functor_dominance dominance, const Functor_preserve preserve_mutations, const Functor_timesample take_sample, const allele_trajectories &prev_sim)
 runs a single-locus Wright-Fisher simulation specified by the given simulation functions and sim_constants, storing the results into all_results
 
std::ostream & operator<< (std::ostream &stream, const mutID &id)
 insertion operator: sends mutID id into the ostream stream
 
std::ostream & operator<< (std::ostream &stream, allele_trajectories &A)
 insertion operator: sends allele_trajectories A into the ostream stream
 
void swap (allele_trajectories &a, allele_trajectories &b)
 swaps data held by allele_trajectories a and b
 

Detailed Description

Namespace for single-locus, forward, Monte-Carlo Wright-Fisher simulation and output data structures.

GO_Fish is a single-locus Wright-Fisher forward simulation in which individual sites are assumed to be independent of each other and mutations are irreversible (Poisson Random Field model). Mutations are the “unit” of simulation for the single-locus Wright-Fisher algorithm. Thus a generation of organisms is represented by an array of mutations and their frequencies in each population (when the simulation has more than one). There are several options for initializing the mutation array at the start of a simulation: a blank mutation array, the output of a previous simulation run, or mutation-selection equilibrium. Simulating each discrete generation consists of calculating the new allele frequency of each mutation after a round of migration, selection, and drift. Concurrently, new mutations are added to the array. Mutations that become lost or fixed are discarded in a compact step. The resulting offspring array of mutation frequencies becomes the parent array of the next generation, and the cycle repeats until the end of the simulation, when the final mutation array is output. The user can also sample individual generations during the simulation.
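The generation cycle described above can be illustrated with a small CPU-only sketch. It is not the GO_Fish implementation: it assumes a single population, haploid (genic) selection with a constant coefficient, and binomial drift, and every name in it (mutation_array, apply_selection, next_generation, add_new_mutations) is hypothetical.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical stand-in for one population's mutation array: each entry is
// the frequency of one independent mutation (not the GO_Fish data layout).
using mutation_array = std::vector<double>;

// Deterministic frequency change under genic selection with coefficient s
// (assumed model for this sketch: no dominance, no inbreeding, s > -1).
inline double apply_selection(double x, double s) {
    return x * (1.0 + s) / (1.0 + s * x);
}

// Advance one discrete generation: selection, then binomial drift among
// n_chromosomes, then a compact step that drops lost and fixed mutations.
template <typename Rng>
mutation_array next_generation(const mutation_array& parents, double s,
                               unsigned n_chromosomes, Rng& rng) {
    assert(n_chromosomes > 0);
    mutation_array offspring;
    offspring.reserve(parents.size());
    for (double x : parents) {
        const double expected = apply_selection(x, s);
        std::binomial_distribution<unsigned> drift(n_chromosomes, expected);
        const unsigned copies = drift(rng);
        if (copies == 0 || copies == n_chromosomes) continue;  // lost or fixed
        offspring.push_back(static_cast<double>(copies) / n_chromosomes);
    }
    return offspring;
}

// Append new mutations, each starting as a single copy. The count is drawn
// from a Poisson distribution with mean mu * n_sites * n_chromosomes.
template <typename Rng>
void add_new_mutations(mutation_array& pop, double mu, double n_sites,
                       unsigned n_chromosomes, Rng& rng) {
    const double mean = mu * n_sites * n_chromosomes;
    if (mean <= 0.0) return;  // the Poisson distribution needs a positive mean
    std::poisson_distribution<unsigned> n_new(mean);
    pop.insert(pop.end(), n_new(rng), 1.0 / n_chromosomes);
}
```

Calling next_generation followed by add_new_mutations in a loop, with the offspring array fed back in as the parent array, reproduces the cycle in miniature. Migration and time sampling are left out to keep the sketch short.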

The function run_sim runs a GO_Fish simulation (see documentation for run_sim below). The sampled and final generations are stored in allele_trajectories all_results. all_results stores the frequency(ies) in the population(s) and mutID of every mutation in the sample in RAM, from which population genetics statistics can be calculated or which can be manipulated and output as the user sees fit.

To use all GO_Fish functions and objects, include header file: go_fish.cuh.
Optionally, to use only the GO_Fish data structures, include header file: go_fish_data_struct.h.

Function Documentation

§ run_sim() [1/2]

template<typename Functor_mutation , typename Functor_demography , typename Functor_migration , typename Functor_selection , typename Functor_inbreeding , typename Functor_dominance , typename Functor_preserve , typename Functor_timesample >
__host__ void GO_Fish::run_sim ( allele_trajectories & all_results,
const Functor_mutation  mu_rate,
const Functor_demography  demography,
const Functor_migration  mig_prop,
const Functor_selection  sel_coeff,
const Functor_inbreeding  FI,
const Functor_dominance  dominance,
const Functor_preserve  preserve_mutations,
const Functor_timesample  take_sample 
)

runs a single-locus Wright-Fisher simulation specified by the given simulation functions and sim_constants, storing the results into all_results

Calls run_sim(..., const allele_trajectories & prev_sim) with prev_sim set to a blank allele_trajectories object. This saves some unnecessary typing when starting from mutation-selection equilibrium or a blank simulation.


Below is the description for function run_sim(..., const allele_trajectories & prev_sim):

A simulation run is controlled by the template functions and allele_trajectories.sim_input_constants (which are then accessible from allele_trajectories.last_run_constants() even if the input constants are later changed). The user can write their own simulation functions to input into run_sim or use those provided in namespace Sim_Model. For details on how to write your own simulation functions, go to the Modules page, click on the simulation function group which describes the function you wish to write, and read its detailed description. They can be standard functions, functors, or (coming with C++11 support) lambdas.

Pro Tip: For extra speed, it is desirable that the simulation functions input into run_sim are known at compile-time (i.e. avoid function pointers and non-inline functions unless necessary). The parameters input into the constructors of functors (as used by Sim_Model) may be set at runtime, but the function itself (the structure/operator in the case of a functor) should be known at compile-time. The functions are input into run_sim via templates, so that, at compile-time, known functions can be compiled directly into run_sim's code (fast) as opposed to called from the function stack (slow). This is especially important for Selection, Migration, and, to a lesser extent, Demographic functions, which are run on the GPU many times over for every mutation, every generation (for Demography, on the GPU for every mutation in every compact generation).
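The difference between the two calling styles can be sketched in plain host-side C++. Everything here is hypothetical and only illustrates the template mechanism: constant_selection, neutral, mean_selection, and mean_selection_ptr are not part of GO_Fish or Sim_Model, and the (population, generation, frequency) argument list is assumed from the sel_coeff parameter description.

```cpp
#include <cassert>
#include <vector>

// Hypothetical selection functor: the coefficient is set at run time through
// the constructor, but the type (and so operator()) is known at compile time.
struct constant_selection {
    float s;
    explicit constant_selection(float s_) : s(s_) {}
    float operator()(int /*population*/, int /*generation*/, float /*freq*/) const {
        return s;
    }
};

// A plain function with the same shape, used below through a pointer.
inline float neutral(int, int, float) { return 0.0f; }

// Template parameter: the compiler sees the concrete callable type and can
// inline the call into the loop body.
template <typename Functor_selection>
float mean_selection(const std::vector<float>& freqs, const Functor_selection sel_coeff) {
    assert(!freqs.empty());
    float total = 0.0f;
    for (float x : freqs) total += sel_coeff(0, 0, x);
    return total / static_cast<float>(freqs.size());
}

// Function-pointer parameter: the target is only known at run time, so each
// iteration may pay for an indirect call.
using selection_fn = float (*)(int, int, float);
inline float mean_selection_ptr(const std::vector<float>& freqs, selection_fn sel_coeff) {
    assert(!freqs.empty());
    float total = 0.0f;
    for (float x : freqs) total += sel_coeff(0, 0, x);
    return total / static_cast<float>(freqs.size());
}
```

A call such as mean_selection(freqs, constant_selection(0.5f)) still takes its coefficient at run time, yet the call target is fixed when the template is instantiated, which is the property the tip asks for.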

Parameters
all_results  all_results.sim_input_constants help control the simulation run whose results are stored in all_results
mu_rate  Function specifying the mutation rate per site for a given population, generation
demography  Function specifying the population size (individuals) for a given population, generation
mig_prop  Function specifying the migration rate, which is the proportion of chromosomes in population pop_TO from population pop_FROM for a given generation
sel_coeff  Function specifying the selection coefficient for a given population, generation, frequency
FI  Function specifying the inbreeding coefficient for a given population, generation
dominance  Function specifying the dominance coefficient for a given population, generation
preserve_mutations  Function specifying if the mutations extant in a generation should be preserved for the rest of the simulation run
take_sample  Function specifying if a time sample should be taken in a generation - note this will automatically preserve those mutations for the rest of the simulation run
prev_sim  If prev_sim_sample in all_results.sim_input_constants is greater than 0 (and less than the number of time samples in prev_sim), then run_sim will use the corresponding time sample in prev_sim to initialize the new simulation, provided that the number of populations and number of sites match those in all_results.sim_input_constants; otherwise an error will be thrown.
Examples:
Example1-Speed, Example2-DaDi, and Example3-Compilation.

Definition at line 525 of file go_fish_impl.cuh.

§ run_sim() [2/2]

template<typename Functor_mutation , typename Functor_demography , typename Functor_migration , typename Functor_selection , typename Functor_inbreeding , typename Functor_dominance , typename Functor_preserve , typename Functor_timesample >
__host__ void GO_Fish::run_sim ( allele_trajectories & all_results,
const Functor_mutation  mu_rate,
const Functor_demography  demography,
const Functor_migration  mig_prop,
const Functor_selection  sel_coeff,
const Functor_inbreeding  FI,
const Functor_dominance  dominance,
const Functor_preserve  preserve_mutations,
const Functor_timesample  take_sample,
const allele_trajectories & prev_sim 
)

runs a single-locus Wright-Fisher simulation specified by the given simulation functions and sim_constants, storing the results into all_results

A simulation run is controlled by the template functions and allele_trajectories.sim_input_constants (which are then accessible from allele_trajectories.last_run_constants() even if the input constants are later changed). The user can write their own simulation functions to input into run_sim or use those provided in namespace Sim_Model. For details on how to write your own simulation functions, go to the Modules page, click on the simulation function group which describes the function you wish to write, and read its detailed description. They can be standard functions, functors, or (coming with C++11 support) lambdas.

Pro Tip: For extra speed, it is desirable that the simulation functions input into run_sim are known at compile-time (i.e. avoid function pointers and non-inline functions unless necessary). The parameters input into the constructors of functors (as used by Sim_Model) may be set at runtime, but the function itself (the structure/operator in the case of a functor) should be known at compile-time. The functions are input into run_sim via templates, so that, at compile-time, known functions can be compiled directly into run_sim's code (fast) as opposed to called from the function stack (slow). This is especially important for Selection, Migration, and, to a lesser extent, Demographic functions, which are run on the GPU many times over for every mutation, every generation (for Demography, on the GPU for every mutation in every compact generation).

Parameters
all_results  all_results.sim_input_constants help control the simulation run whose results are stored in all_results
mu_rate  Function specifying the mutation rate per site for a given population, generation
demography  Function specifying the population size (individuals) for a given population, generation
mig_prop  Function specifying the migration rate, which is the proportion of chromosomes in population pop_TO from population pop_FROM for a given generation
sel_coeff  Function specifying the selection coefficient for a given population, generation, frequency
FI  Function specifying the inbreeding coefficient for a given population, generation
dominance  Function specifying the dominance coefficient for a given population, generation
preserve_mutations  Function specifying if the mutations extant in a generation should be preserved for the rest of the simulation run
take_sample  Function specifying if a time sample should be taken in a generation - note this will automatically preserve those mutations for the rest of the simulation run
prev_sim  If prev_sim_sample in all_results.sim_input_constants is greater than 0 (and less than the number of time samples in prev_sim), then run_sim will use the corresponding time sample in prev_sim to initialize the new simulation, provided that the number of populations and number of sites match those in all_results.sim_input_constants; otherwise an error will be thrown.

Definition at line 554 of file go_fish_impl.cuh.

§ operator<<() [1/2]

std::ostream & GO_Fish::operator<< ( std::ostream &  stream,
const mutID & id 
)
inline

insertion operator: sends mutID id into the ostream stream

returns ostream stream containing string id.toString()

string format: (origin_generation,origin_population,origin_thread,reserved)
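
The string format above can be mimicked with a small stand-in type. This is only a sketch: mut_id_sketch and to_string are hypothetical names, the int field types are an assumption, and the field names are taken from the format string rather than from the mutID definition.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a mutation ID; not the GO_Fish::mutID struct.
struct mut_id_sketch {
    int origin_generation;
    int origin_population;
    int origin_thread;
    int reserved;

    // Builds "(origin_generation,origin_population,origin_thread,reserved)".
    std::string to_string() const {
        std::ostringstream out;
        out << '(' << origin_generation << ',' << origin_population << ','
            << origin_thread << ',' << reserved << ')';
        return out.str();
    }
};

// Insertion operator in the same spirit: writes the formatted ID and returns
// the stream so that calls can be chained.
inline std::ostream& operator<<(std::ostream& stream, const mut_id_sketch& id) {
    return stream << id.to_string();
}
```

Because the operator returns the stream, an ID can be written inline with other output, for example as one field of a tab-delimited row.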

Stream can be fed into terminal output, file output, or into an istream for extraction with the >> operator.

Definition at line 198 of file inline_go_fish_data_struct.hpp.

§ operator<<() [2/2]

std::ostream & GO_Fish::operator<< ( std::ostream &  stream,
allele_trajectories & A 
)
inline

insertion operator: sends allele_trajectories A into the ostream stream

returns ostream stream containing the last simulation run information stored by allele_trajectories A

First, the function inserts the run constants (not input constants) held by A into the output stream, with each variable name tab-delimited from its value. This is followed by the feature information (e.g. generation, number of mutations, population size, population extinction) from each time sample (if any). Each feature of a time sample is a row in the stream, while each time sample is a major column and each population is a minor column. Finally, the allele trajectory of each mutation (if any) is added to the stream. The allele trajectories are mutation row-ordered (by origin_generation then origin_population then origin_threadID), where each major column is a time sample and each minor column is a population. All columns are tab-delimited. An example is provided in example_compilation/bfile.dat.

Stream can be fed into terminal output, file output, or into an istream for extraction with the >> operator.

Definition at line 208 of file inline_go_fish_data_struct.hpp.

§ swap()

void GO_Fish::swap ( allele_trajectories & a,
allele_trajectories & b 
)
inline

swaps data held by allele_trajectories a and b

Definition at line 269 of file inline_go_fish_data_struct.hpp.