gpmap.errors module

class gpmap.errors.BaseErrorMap(Map)

Bases: object

Object to attach to seqspace objects for managing errors, standard deviations, and their log transforms.

If a lower bound is given, it is used in place of the negated variances.

lower

Get lower error bound.

upper

Get upper error bound.

wrapper(bound, **kwargs)

Wrapper function that converts variances to the desired bound.

class gpmap.errors.StandardDeviationMap(Map)

Bases: gpmap.errors.BaseErrorMap

wrapper(bounds, **kwargs)

Wrapper function that converts variances if necessary.

class gpmap.errors.StandardErrorMap(Map)

Bases: gpmap.errors.BaseErrorMap

wrapper(bounds)

Wrapper function that converts variances if necessary.

gpmap.errors.lower_transform(mean, bound, logbase)

Log transformation scaling for the lower error bound.

Examples

Untransformed data look like this:

    Yupper = Ymean + bound
    Ylower = Ymean - bound

We want the bounds on a log scale, i.e. the distance from log(Ymean) to each log-transformed bound:

    log(Yupper) - log(Ymean) = log(1 + bound/Ymean)
    log(Ylower) - log(Ymean) = log(1 - bound/Ymean)
gpmap.errors.upper_transform(mean, bound, logbase)

Log transformation scaling for the upper error bound.

Examples

Untransformed data look like this:

    Yupper = Ymean + bound
    Ylower = Ymean - bound

We want the bounds on a log scale, i.e. the distance from log(Ymean) to each log-transformed bound:

    log(Yupper) - log(Ymean) = log(1 + bound/Ymean)
    log(Ylower) - log(Ymean) = log(1 - bound/Ymean)
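
The scaling described for the two transforms can be sketched with the standard library alone. The helper names `upper_log_bound` and `lower_log_bound` are hypothetical and are not claimed to match gpmap's implementation; the sketch assumes `mean > bound > 0` and returns the lower distance as a magnitude.

```python
import math

def upper_log_bound(mean, bound, logbase=math.log10):
    """Log-space distance from the mean up to mean + bound (hypothetical helper)."""
    return logbase(1 + bound / mean)

def lower_log_bound(mean, bound, logbase=math.log10):
    """Log-space distance from the mean down to mean - bound (hypothetical helper)."""
    # log(1 - bound/mean) is negative, so its magnitude is returned.
    return abs(logbase(1 - bound / mean))

mean, bound = 100.0, 10.0
print(upper_log_bound(mean, bound))  # log10(1.1)
print(lower_log_bound(mean, bound))  # |log10(0.9)|
```

A symmetric error on the linear scale becomes asymmetric on the log scale: the lower distance is larger than the upper one.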

gpmap.stats module

gpmap.stats.c4_correction(n_samples)

Return the correction scalar for calculating standard deviation from a normal distribution.

gpmap.stats.corrected_std(var, n_samples=2)

Calculate the unbiased standard deviation from a biased standard deviation.

gpmap.stats.corrected_sterror(var, n_samples=2)

Calculate an unbiased standard error from a BIASED standard deviation.
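
As a minimal sketch of how these three corrections might fit together, the code below uses the textbook c4 factor for normally distributed samples. The function names are hypothetical, and the assumption that the input is a biased (divide-by-n) variance is mine rather than something the documentation states.

```python
import math

def c4(n):
    """c4(n) = sqrt(2 / (n - 1)) * Gamma(n / 2) / Gamma((n - 1) / 2)."""
    # lgamma avoids overflow of the gamma function for large n.
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)

def corrected_std_sketch(biased_var, n):
    """Standard deviation estimate from a biased variance (assumed input)."""
    sample_var = biased_var * n / (n - 1)  # Bessel's correction
    return math.sqrt(sample_var) / c4(n)   # remove the bias left by the square root

def corrected_sterror_sketch(biased_var, n):
    """Standard error of the mean built on the corrected standard deviation."""
    return corrected_std_sketch(biased_var, n) / math.sqrt(n)

print(c4(2))    # about 0.798
print(c4(100))  # close to 1, so the correction matters mostly for small n
```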

gpmap.stats.coverage(gpm)
gpmap.stats.unbiased_std(x, axis=None)

A correction to numpy’s standard deviation calculation. Calculates the unbiased estimate of the standard deviation, which includes a correction factor for sample sizes < 100.

gpmap.stats.unbiased_sterror(x, axis=None)

Calculate the unbiased standard error.

gpmap.stats.unbiased_var(x, axis=None)

Ensures that the unbiased estimate of the variance is calculated.
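
The difference between the biased and unbiased variance estimates can be illustrated with Python's `statistics` module. This is only a sketch of the underlying idea, not gpmap's code; in NumPy the same choice is made through the `ddof` argument of `numpy.var`.

```python
import statistics

samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(samples)

biased = statistics.pvariance(samples)   # divides by n
unbiased = statistics.variance(samples)  # divides by n - 1 (Bessel's correction)

print(biased)    # 4.0
print(unbiased)  # 32 / 7, about 4.571
```

The two estimates differ by exactly the factor n / (n - 1), which is why the distinction matters most for small sample sizes.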

gpmap.utils module

Utility functions for managing genotype-phenotype map data and conversions.

Glossary:

mutations : dict
keys are site numbers in the genotypes. Values are the alphabet of mutations at each site.
encoding : dict
keys are site numbers in genotype. Values are dictionaries mapping each mutation to its binary representation.
gpmap.utils.farthest_genotype(reference, genotypes)

Find the genotype in the system that differs from the reference at the most sites.

gpmap.utils.find_differences(s1, s2)

Return the indices of the differences between two sequences.
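
A minimal sketch of both ideas, assuming equal-length string genotypes. The `_sketch` names are hypothetical stand-ins and do not reproduce the library's code.

```python
def find_differences_sketch(s1, s2):
    """Indices at which two equal-length sequences differ."""
    return [i for i, (a, b) in enumerate(zip(s1, s2)) if a != b]

def farthest_genotype_sketch(reference, genotypes):
    """Genotype that differs from the reference at the most sites."""
    # On ties, max() returns the first genotype with the largest count.
    return max(genotypes, key=lambda g: len(find_differences_sketch(reference, g)))

genotypes = ["AAA", "AAT", "ATT", "TTT"]
print(find_differences_sketch("AAA", "ATT"))       # [1, 2]
print(farthest_genotype_sketch("AAA", genotypes))  # TTT
```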

gpmap.utils.genotypes_to_binary(genotypes, encoding_table)

Using an encoding table (see get_encoding_table function), build a set of binary genotypes.

Parameters:
  • genotypes – List of the genotypes to encode.
  • encoding_table – DataFrame that encodes the binary representation of each mutation in the list of genotypes. (See get_encoding_table.)
gpmap.utils.genotypes_to_mutations(genotypes)

Create a mutations dictionary from a list of genotypes.
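
One way to picture this conversion is to collect, site by site, the letters observed across the genotypes. The sketch below is an illustration under that assumption: it keeps single-letter sites as one-element lists, which may differ from how the library represents unmutated sites.

```python
def genotypes_to_mutations_sketch(genotypes):
    """Collect, for each site index, the letters observed at that site."""
    mutations = {}
    for genotype in genotypes:
        for site, letter in enumerate(genotype):
            alphabet = mutations.setdefault(site, [])
            if letter not in alphabet:
                alphabet.append(letter)
    return mutations

print(genotypes_to_mutations_sketch(["AA", "AT", "TA"]))
# {0: ['A', 'T'], 1: ['A', 'T']}
```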

gpmap.utils.get_base(logbase)

Get base from logbase.

Parameters:
  • logbase (callable) – logarithm function
Returns:

base – base of the logarithm.

Return type:

float
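
The base of a logarithm function can be recovered from a single probe value, since log_b(x) = ln(x) / ln(b). The sketch below shows that identity at work; the name `get_base_sketch` and the `probe` parameter are hypothetical.

```python
import math

def get_base_sketch(logbase, probe=10.0):
    """Recover b from a callable that computes log_b(x)."""
    # log_b(x) = ln(x) / ln(b)  =>  b = exp(ln(x) / log_b(x))
    return math.exp(math.log(probe) / logbase(probe))

print(get_base_sketch(math.log10))  # approximately 10
print(get_base_sketch(math.log2))   # approximately 2
print(get_base_sketch(math.log))    # approximately e
```
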
gpmap.utils.get_encoding_table(wildtype, mutations, site_labels=None)

This function constructs a lookup table (pandas.DataFrame) for mutations in a given mutations dictionary. This table encodes mutations with a binary representation.

gpmap.utils.get_missing_genotypes(genotypes, mutations=None)

Get a list of genotypes not found in the given genotypes list.

Parameters:
  • genotypes (list) – List of genotypes.
  • mutations (dict, optional) – Mutation dictionary
Returns:

missing_genotypes – List of genotypes not found in genotypes list.

Return type:

list
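
Conceptually, the missing genotypes are the full combinatorial set spanned by the mutations dictionary minus the observed genotypes. The sketch below assumes the mutations dictionary is supplied and that every site has a list of letters; the function name is hypothetical.

```python
from itertools import product

def missing_genotypes_sketch(genotypes, mutations):
    """Genotypes spanned by `mutations` that are absent from `genotypes`."""
    observed = set(genotypes)
    alphabets = [mutations[site] for site in sorted(mutations)]
    every = ("".join(letters) for letters in product(*alphabets))
    return [g for g in every if g not in observed]

mutations = {0: ["A", "T"], 1: ["A", "T"]}
print(missing_genotypes_sketch(["AA", "TT"], mutations))  # ['AT', 'TA']
```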

gpmap.utils.hamming_distance(s1, s2)

Return the Hamming distance between equal-length sequences.
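
For reference, the Hamming distance is the number of positions at which two sequences differ. A minimal sketch follows; raising `ValueError` on unequal lengths is an assumption here, not documented behavior.

```python
def hamming_distance_sketch(s1, s2):
    """Count the positions at which two equal-length sequences differ."""
    if len(s1) != len(s2):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(s1, s2))

print(hamming_distance_sketch("karolin", "kathrin"))  # 3
```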

gpmap.utils.ipywidgets_missing(function)

Wrapper that checks that IPython widgets are installed before trying to render them.
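
A guard of this kind is commonly written as a decorator that checks for the optional dependency before calling the wrapped function. The sketch below generalizes the idea to any module name; `requires_module` and `render_widget` are hypothetical and do not mirror the library's own wrapper.

```python
import functools
import importlib.util

def requires_module(name):
    """Decorator factory: fail with a clear message if `name` is not importable."""
    def decorator(function):
        @functools.wraps(function)
        def wrapper(*args, **kwargs):
            if importlib.util.find_spec(name) is None:
                raise ImportError(
                    f"{name} must be installed to use {function.__name__}()"
                )
            return function(*args, **kwargs)
        return wrapper
    return decorator

@requires_module("ipywidgets")
def render_widget():
    """Placeholder for a function that would build and display a widget."""
    return "rendered"
```

Checking at call time rather than at import time keeps the rest of the module usable when the optional package is absent.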

gpmap.utils.length_to_mutations(length, alphabet=['0', '1'])

Build a mutations dictionary for a given alphabet.

Parameters:
  • length (int) – length of the genotypes
  • alphabet (list) – List of mutations at each site.
gpmap.utils.list_binary(length)

List all binary strings with given length.
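
Both helpers amount to a few lines with the standard library. The sketch below uses hypothetical `_sketch` names and a tuple as the default alphabet, which avoids sharing one mutable default list between calls.

```python
from itertools import product

def length_to_mutations_sketch(length, alphabet=("0", "1")):
    """Give every site index 0..length-1 its own copy of the same alphabet."""
    return {site: list(alphabet) for site in range(length)}

def list_binary_sketch(length):
    """All binary strings of the given length, in lexicographic order."""
    return ["".join(bits) for bits in product("01", repeat=length)]

print(length_to_mutations_sketch(2))  # {0: ['0', '1'], 1: ['0', '1']}
print(list_binary_sketch(2))          # ['00', '01', '10', '11']
```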

gpmap.utils.mutations_to_encoding(wildtype, mutations)

Build the encoding map used for genotype-to-binary conversion.

Parameters:
  • wildtype (str) – Wildtype sequence.
  • mutations (dict) – Mapping of each site’s mutation alphabet. {site-number: [alphabet]}
Returns:

encode – Encoding dictionary that maps site number to mutation-binary map

Return type:

OrderedDict of OrderedDicts

Examples

{ <site-number> : {<mutation>: <binary>} }
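
To make the `{ <site-number> : {<mutation>: <binary>} }` structure concrete, the sketch below builds one plausible encoding: the wildtype letter at a site maps to all zeros and each other letter gets a single 1 bit. This scheme and the function names are assumptions for illustration, and the library's actual bit layout may differ.

```python
from collections import OrderedDict

def encoding_sketch(wildtype, mutations):
    """Map each site to {letter: bit string}, with the wildtype letter as all zeros."""
    encoding = OrderedDict()
    for site, alphabet in mutations.items():
        wt = wildtype[site]
        others = [letter for letter in alphabet if letter != wt]
        width = len(others)
        site_map = OrderedDict({wt: "0" * width})
        for i, letter in enumerate(others):
            # One bit per non-wildtype letter (assumed scheme).
            site_map[letter] = "0" * i + "1" + "0" * (width - i - 1)
        encoding[site] = site_map
    return encoding

def encode_genotype(genotype, encoding):
    """Concatenate the per-site bit strings of a genotype."""
    return "".join(encoding[site][letter] for site, letter in enumerate(genotype))

enc = encoding_sketch("AA", {0: ["A", "T"], 1: ["A", "C", "G"]})
print(dict(enc[1]))                # {'A': '00', 'C': '10', 'G': '01'}
print(encode_genotype("TG", enc))  # 101
```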

gpmap.utils.mutations_to_genotypes(mutations, wildtype=None)

Use a mutations dictionary to construct an array of genotypes composed of those mutations.

Parameters:
  • mutations (dict) – A mapping dict with site numbers as keys and lists of mutations as values.
  • wildtype (str) – wildtype genotype (as string).
Returns:

genotypes – list of genotypes comprised of mutations in given dictionary.

Return type:

list
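
Enumerating genotypes from a mutations dictionary is a Cartesian product over the per-site alphabets. In the sketch below, a site whose alphabet is `None` is held at its wildtype letter; that use of `wildtype` is an assumption, and the function name is hypothetical.

```python
from itertools import product

def mutations_to_genotypes_sketch(mutations, wildtype=None):
    """Enumerate every genotype spanned by a mutations dictionary."""
    alphabets = []
    for site in sorted(mutations):
        alphabet = mutations[site]
        if alphabet is None:
            # Assumption: a site with no listed mutations stays at its wildtype letter.
            alphabet = [wildtype[site]]
        alphabets.append(alphabet)
    return ["".join(letters) for letters in product(*alphabets)]

mutations = {0: ["A", "T"], 1: None, 2: ["C", "G"]}
print(mutations_to_genotypes_sketch(mutations, wildtype="AAC"))
# ['AAC', 'AAG', 'TAC', 'TAG']
```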

gpmap.utils.sample_phenotypes(phenotypes, errors, n=1)

Generate n phenotypes from normal distributions.
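
A sketch of the sampling step, assuming each entry of `errors` is the standard deviation of a normal distribution centered on the matching phenotype. The function name and the `seed` parameter are hypothetical additions for reproducibility.

```python
import random

def sample_phenotypes_sketch(phenotypes, errors, n=1, seed=None):
    """Draw n samples per phenotype from Normal(phenotype, error)."""
    rng = random.Random(seed)
    return [
        [rng.gauss(mu, sigma) for mu, sigma in zip(phenotypes, errors)]
        for _ in range(n)
    ]

# Three replicate draws for two phenotypes; each row is one replicate.
samples = sample_phenotypes_sketch([1.0, 2.0], [0.1, 0.2], n=3, seed=0)
print(len(samples), len(samples[0]))  # 3 2
```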