Simulations

This module contains functions for generating the simulated data used in the LineageOT paper. Most of this is not required for applying LineageOT to experimental data, and none of it needs to be used directly.

class lineageot.simulation.Cell(x, barcode, seed=None)

Bases: object

Wrapper for (RNA expression, barcode) arrays

deepcopy()
reset_seed()
class lineageot.simulation.SimulationParameters(timestep=0.01, diffusion_constant=0.001, mean_division_time=10, division_time_distribution='normal', division_time_std=0, target_num_cells=inf, mutation_rate=1, flow_type='bifurcation', x0_speed=1, barcode_length=15, back_mutations=False, num_genes=3, initial_distribution_std=0, alphabet_size=200, relative_mutation_likelihoods=array([0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]), keep_tree=True, enforce_barcode_reproducibility=True, keep_cell_seeds=True)

Bases: object

Stores the parameters for simulated data

lineageot.simulation.center(barcode, params)

Returns the center of the distribution p(x0|barcode)

lineageot.simulation.convergent_flow(x, params)

Single bifurcation followed by convergence of the two clusters

lineageot.simulation.convert_data_to_arrays(data)

Converts a list of cells to two ndarrays, one for expression and one for barcodes

lineageot.simulation.evolve_b(initial_barcode, time, params)

Returns the new barcode after mutations have occurred for some time

lineageot.simulation.evolve_cell(initial_cell, time, params)

Returns a new cell after both barcode and x have evolved for some time

lineageot.simulation.evolve_x(initial_x, time, params)

Returns a sample from Langevin dynamics following potential_gradient
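As a rough illustration of what sampling from Langevin dynamics involves, the sketch below applies an Euler-Maruyama update in plain Python. It is not LineageOT's implementation: the function name `langevin_sketch`, the quadratic potential, and the exact form of the update are assumptions made for this example. Only the argument names `timestep` and `diffusion_constant` mirror fields of `SimulationParameters`.

```python
import math
import random


def langevin_sketch(initial_x, time, gradient, timestep=0.01,
                    diffusion_constant=0.001, rng=None):
    """Euler-Maruyama integration of dx = -grad U(x) dt + sqrt(2 D) dW.

    Hypothetical helper for illustration; not part of lineageot.
    """
    rng = rng or random.Random()
    x = list(initial_x)
    num_steps = int(round(time / timestep))
    # Standard deviation of the Gaussian noise added at each step.
    noise_scale = math.sqrt(2 * diffusion_constant * timestep)
    for _ in range(num_steps):
        grad = gradient(x)
        x = [xi - timestep * gi + noise_scale * rng.gauss(0.0, 1.0)
             for xi, gi in zip(x, grad)]
    return x


# Assumed potential U(x) = |x|^2 / 2, whose gradient is x itself.
final_x = langevin_sketch([1.0, -1.0, 0.5], time=1.0,
                          gradient=lambda x: x, rng=random.Random(0))
```

With `diffusion_constant=0` the update is deterministic gradient descent, which is a convenient way to sanity-check a potential before adding noise.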

lineageot.simulation.flatten_list_of_lists(tree_data)

Converts a dataset of cells with their ancestral tree structure to a list of cells (with ancestor and time information dropped)

lineageot.simulation.mask_barcode(barcode, p)

Replaces a subset of the entries of barcode with -1 to simulate missing data

Entries are masked independently with probability p

Also works for an array of barcodes
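The masking rule described above can be sketched in a few lines. This is a standalone illustration on a plain Python list, not the library's code; `mask_barcode_sketch` and its `rng` argument are hypothetical names.

```python
import random


def mask_barcode_sketch(barcode, p, rng=None):
    """Return a copy of barcode with each entry replaced by -1 with probability p."""
    rng = rng or random.Random()
    # Each entry gets its own independent draw.
    return [-1 if rng.random() < p else entry for entry in barcode]


barcode = [0, 3, 0, 7, 12]
masked = mask_barcode_sketch(barcode, p=0.4, rng=random.Random(1))
```

Every entry of `masked` is either the original value or -1, so downstream code can treat -1 as "not observed".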

lineageot.simulation.mismatched_clusters_flow(x, params)

Single bifurcation followed by bifurcation of each cluster

lineageot.simulation.mutate_barcode(barcode, params)

Randomly changes one entry of the barcode
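One way to picture a single mutation event is sketched below. It assumes that the position is chosen uniformly and that the new symbol is drawn with weights like those in `relative_mutation_likelihoods`; both choices, and the name `mutate_barcode_sketch`, are assumptions for illustration rather than a description of the library's behavior.

```python
import random


def mutate_barcode_sketch(barcode, relative_likelihoods, rng=None):
    """Return a copy of barcode with one uniformly chosen entry redrawn."""
    rng = rng or random.Random()
    mutated = list(barcode)
    position = rng.randrange(len(mutated))
    symbols = list(range(len(relative_likelihoods)))
    # Draw the replacement symbol in proportion to its weight.
    mutated[position] = rng.choices(symbols, weights=relative_likelihoods, k=1)[0]
    return mutated


# Assumed weights: symbol 0 has weight 0, so a mutation never produces it
# (this imitates the leading 0.0 in the default relative_mutation_likelihoods).
weights = [0.0] + [1.0] * 9
new_barcode = mutate_barcode_sketch([0] * 15, weights, rng=random.Random(2))
```

Starting from an all-zero barcode, exactly one entry of `new_barcode` is nonzero after the call.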

lineageot.simulation.partial_convergent_flow(x, params)

Single bifurcation followed by bifurcation of each cluster, where two of the new clusters subsequently merge

lineageot.simulation.reproducible_poisson(rate)

Samples a single Poisson random variable in a way that is reproducible, i.e., after

    np.random.seed(s)
    a = reproducible_poisson(r1)
    np.random.seed(s)
    b = reproducible_poisson(r2)

with r1 > r2, b ~ binomial(n = a, p = r2/r1)

This is the standard numpy Poisson sampling algorithm for rate <= 10.

Note that this is relatively slow, running in O(rate) time.
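The coupling between two rates under a shared seed can be illustrated with the multiplication method often attributed to Knuth. The sketch assumes this is the small-rate algorithm referred to above; `poisson_multiplication_sketch` is a hypothetical name, and the code uses Python's `random` module rather than NumPy.

```python
import math
import random


def poisson_multiplication_sketch(rate, rng):
    """Count how many uniforms can be multiplied before the product falls to exp(-rate)."""
    threshold = math.exp(-rate)  # underflows to 0.0 for very large rates
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


# Same seed, two rates: both runs see the same uniforms, and the larger
# rate has the lower threshold, so it can only stop later.
a = poisson_multiplication_sketch(5.0, random.Random(7))
b = poisson_multiplication_sketch(2.0, random.Random(7))
```

Because the expected number of uniforms consumed grows linearly with `rate`, the running time is O(rate), which matches the note above.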

lineageot.simulation.sample_barcode(params)

Samples an initial barcode

lineageot.simulation.sample_cell(params)

Samples an initial cell

lineageot.simulation.sample_descendants(initial_cell, time, params, target_num_cells=None)

Samples the descendants of an initial cell

lineageot.simulation.sample_division_time(params)

Samples the time until a cell divides

lineageot.simulation.sample_pop(num_initial_cells, time, params)

Samples a population after some intervening time

  • num_initial_cells: Number of cells in the population at time 0

  • time: Time when population is measured

  • params: Simulation parameters

lineageot.simulation.sample_population_descendants(pop, time, params)

Samples the descendants of each cell in a population

pop: list of (expression, barcode) tuples

lineageot.simulation.sample_x0(barcode, params)

Samples the initial position in gene expression space

lineageot.simulation.single_bifurcation_flow(x)

lineageot.simulation.split_targets_between_daughters(time_remaining, target_num_cells, params)

Given a target number of cells to sample, divides the samples between daughters assuming both have the expected number of descendants at the sampling time

lineageot.simulation.subsample_list(sample, target_num_cells)

Randomly samples target_num_cells from the sample

If there are fewer than target_num_cells in the sample, returns the whole sample
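The behavior described above, including the fallback when the sample is too small, can be sketched with the standard library. `subsample_list_sketch` and its `rng` argument are hypothetical names, and sampling without replacement is an assumption.

```python
import random


def subsample_list_sketch(sample, target_num_cells, rng=None):
    """Return target_num_cells random items, or the whole sample if it is smaller."""
    rng = rng or random.Random()
    if len(sample) <= target_num_cells:
        return list(sample)
    # Sampling without replacement (assumed).
    return rng.sample(sample, target_num_cells)


cells = list(range(10))
picked = subsample_list_sketch(cells, 4, rng=random.Random(0))
everything = subsample_list_sketch(cells, float("inf"))
```

Passing an infinite target, as in the default `target_num_cells=inf`, simply returns every cell.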

lineageot.simulation.subsample_pop(sample, target_num_cells, params, num_cells=None)

Randomly samples target_num_cells from the sample. Subsampling during the simulation by setting params.target_num_cells is a more efficient approximation of this.

If there are fewer than target_num_cells in the sample, returns the whole sample

sample should be either:

  • a list of cells, if params.keep_tree is False

  • nested lists of lists of cells encoding the tree structure, if params.keep_tree is True

(i.e., it should match the output of sample_descendants with the same params)

lineageot.simulation.vector_field(x, params)

Selects a vector field and returns its value at x