API Documentation

The GenotypePhenotypeMap class is the main entry point to the gpmap package. Load your data using the read methods attached to this class. The following subpackages provide various objects for analyzing a GenotypePhenotypeMap.

GenotypePhenotypeMap

class gpmap.gpm.GenotypePhenotypeMap(wildtype, genotypes, phenotypes=None, stdeviations=None, mutations=None, site_labels=None, n_replicates=1, **kwargs)

Bases: object

Object for containing genotype-phenotype map data.

Parameters:
  • wildtype (string) – wildtype sequence.
  • genotypes (array-like) – list of all genotypes
  • phenotypes (array-like) – List of phenotypes in the same order as genotypes. If None, all genotypes are assigned a phenotype = np.nan.
  • mutations (dict) – Dictionary that maps each site index to its possible substitution alphabet.
  • site_labels (array-like) – list of labels to apply to sites. If this is not specified, the first site is assigned a label 0, the next 1, etc. If specified, sites are assigned labels in the order given. For example, if the genotypes specify mutations at positions 12 and 75, this would be a list [12,75].
  • n_replicates (int) – number of replicate measurements comprising the mean phenotypes
  • include_binary (bool (default=True)) – Construct a binary representation of the space.
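
The defaults described in the parameter list can be illustrated with a minimal sketch. The helper fill_defaults below is hypothetical and is not part of gpmap; it only mirrors the documented behaviour, assuming phenotypes default to NaN and sites are labelled 0, 1, 2, ... when no labels are given.

```python
import math


def fill_defaults(wildtype, genotypes, phenotypes=None, site_labels=None):
    """Hypothetical helper (not gpmap API) mirroring the documented defaults."""
    if phenotypes is None:
        # No measurements supplied: every genotype is assigned NaN.
        phenotypes = [math.nan] * len(genotypes)
    if site_labels is None:
        # Sites are labelled 0, 1, 2, ... in order.
        site_labels = list(range(len(wildtype)))
    return {
        "wildtype": wildtype,
        "genotypes": list(genotypes),
        "phenotypes": list(phenotypes),
        "site_labels": list(site_labels),
    }


# Defaults only: phenotypes are NaN, sites are labelled 0 and 1.
inputs = fill_defaults("AA", ["AA", "AT", "TA", "TT"])

# Explicit labels for genotypes that describe positions 12 and 75.
labelled = fill_defaults(
    "AA",
    ["AA", "AT", "TA", "TT"],
    phenotypes=[0.1, 0.5, 0.2, 0.9],
    site_labels=[12, 75],
)
```
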
data

The core data object. Columns are ‘genotypes’, ‘phenotypes’, ‘n_replicates’, ‘stdeviations’, and (optionally) ‘binary’.

Type: pandas.DataFrame

complete_data

A dataframe mapping the complete set of genotypes possible, given the mutations dictionary. Contains all columns in data. Any missing data is reported as NaN.

Type: pandas.DataFrame (optional, created by BinaryMap)

missing_data

A dataframe containing the set of missing genotypes, i.e. complete_data minus data. Two columns: ‘genotypes’ and ‘binary’.

Type: pandas.DataFrame (optional, created by BinaryMap)

binary

Object that gives access to the binary representation of the map.

Type: BinaryMap

encoding_table

Pandas DataFrame showing how mutations map to binary representation.

add_binary()

Build a binary representation of the set of genotypes.

Add as a column to the main DataFrame.
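
One way to picture a binary representation is the sketch below. The encoding scheme is an assumption chosen for illustration (the wildtype letter at a site maps to all zeros, and each alternative letter gets its own bit); it is not necessarily the scheme gpmap uses, and the function names are hypothetical.

```python
def encoding_for_site(wildtype_letter, alphabet):
    """Map each letter at one site to a bit string (assumed scheme)."""
    mutant_letters = [a for a in alphabet if a != wildtype_letter]
    width = len(mutant_letters)
    # The wildtype letter is encoded as all zeros.
    table = {wildtype_letter: "0" * width}
    for i, letter in enumerate(mutant_letters):
        bits = ["0"] * width
        bits[i] = "1"  # one dedicated bit per alternative letter
        table[letter] = "".join(bits)
    return table


def to_binary(genotype, wildtype, mutations):
    """Concatenate the per-site bit strings for one genotype."""
    return "".join(
        encoding_for_site(wildtype[i], mutations[i])[letter]
        for i, letter in enumerate(genotype)
    )


wildtype = "AA"
mutations = {0: ["A", "T"], 1: ["A", "T", "G"]}
binary = [to_binary(g, wildtype, mutations) for g in ["AA", "TA", "AT", "AG"]]
```

With this scheme, site 0 uses one bit and site 1 uses two, so every genotype in the example becomes a three-character string.
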

add_n_mutations()

Build a column with the number of mutations in each genotype.

Add as a column to the main DataFrame.
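
As an illustration of what such a column holds, the number of mutations can be sketched as the count of sites at which a genotype differs from the wildtype. The function count_mutations is a hypothetical stand-in, not the library's implementation.

```python
def count_mutations(genotype, wildtype):
    """Count the sites where genotype differs from wildtype (Hamming distance)."""
    return sum(1 for g, w in zip(genotype, wildtype) if g != w)


wildtype = "AAA"
genotypes = ["AAA", "AAT", "TAT", "TTT"]

# One entry per genotype, in the same order as the genotypes list.
n_mutations = [count_mutations(g, wildtype) for g in genotypes]
```
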

binary

Binary representation of genotypes.

classmethod from_dict(metadata)
classmethod from_json(json_str)

Load a genotype-phenotype map directly from a JSON string. The JSON metadata must include the following attributes

Note

Keyword arguments override input that is loaded from the JSON file.
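
The override rule in the note can be sketched with the standard json module. The function load_metadata is hypothetical, and the keys in the example are illustrative only; they are not a statement of which attributes the JSON must contain.

```python
import json


def load_metadata(json_str, **kwargs):
    """Hypothetical sketch: parse JSON metadata, then let kwargs take precedence."""
    metadata = json.loads(json_str)
    # Keyword arguments replace any value of the same name loaded from JSON.
    metadata.update(kwargs)
    return metadata


# Illustrative keys only.
json_str = json.dumps(
    {"wildtype": "AA", "genotypes": ["AA", "AT"], "n_replicates": 1}
)

merged = load_metadata(json_str, n_replicates=3)
```

Here n_replicates comes from the keyword argument, while wildtype and genotypes keep the values stored in the JSON.
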

genotypes

Get the genotypes of the system.

get_all_possible_genotypes()

Get the complete set of genotypes possible. There is no particular order to the genotypes. Consider sorting.
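
Conceptually, the complete set is the Cartesian product of the per-site alphabets in the mutations dictionary. The sketch below uses itertools.product and sorts the result, as suggested above; all_possible_genotypes is a hypothetical name, and this is not the library's implementation.

```python
from itertools import product


def all_possible_genotypes(mutations):
    """Enumerate every genotype implied by a {site index: alphabet} dict."""
    sites = sorted(mutations)  # visit sites in index order
    combos = product(*(mutations[s] for s in sites))
    # Sort so the result has a reproducible order.
    return sorted("".join(combo) for combo in combos)


mutations = {0: ["A", "T"], 1: ["A", "T"]}
complete = all_possible_genotypes(mutations)
```
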

get_missing_genotypes()

Get all genotypes missing from the complete genotype-phenotype map.
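
The idea can be sketched as a set difference between the complete set of genotypes and the observed ones. The function missing_genotypes is a hypothetical illustration under that assumption, not the library's code.

```python
from itertools import product


def missing_genotypes(observed, mutations):
    """Return genotypes implied by the mutations dict but absent from observed."""
    sites = sorted(mutations)
    complete = {"".join(c) for c in product(*(mutations[s] for s in sites))}
    # Set difference: complete minus observed, sorted for a stable order.
    return sorted(complete - set(observed))


missing = missing_genotypes(["AA", "AT"], {0: ["A", "T"], 1: ["A", "T"]})
```
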

index

Return a numpy array of genotype positions.

length

Get length of the genotypes.

map(attr1, attr2)

Dictionary that maps attr1 to attr2.
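
A minimal sketch of the idea, assuming the two attributes are equal-length columns in the same order. A plain dict stands in for the map's data here, and make_map is a hypothetical name.

```python
def make_map(table, attr1, attr2):
    """Pair two equal-length columns into a dictionary (attr1 -> attr2)."""
    return dict(zip(table[attr1], table[attr2]))


# Stand-in for the map's data: column name -> list of values.
table = {
    "genotypes": ["AA", "AT", "TA", "TT"],
    "phenotypes": [0.1, 0.5, 0.2, 0.9],
}

geno_to_pheno = make_map(table, "genotypes", "phenotypes")
```
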

mutant

Get the farthest mutant in the genotype-phenotype map.

mutations

Get the mutations dictionary, which maps each site index to its possible substitution alphabet.

n

Get number of genotypes, i.e. size of the genotype-phenotype map.

n_replicates

Return the number of replicate measurements made of the phenotype.

phenotypes

Get the phenotypes of the system.

classmethod read_csv(fname, wildtype, **kwargs)
classmethod read_dataframe(dataframe, wildtype, **kwargs)

Construct a GenotypePhenotypeMap from a dataframe.

classmethod read_excel(fname, wildtype, **kwargs)
classmethod read_json(filename, **kwargs)

Load a genotype-phenotype map directly from a JSON file. The JSON metadata must include the following attributes

Note

Keyword arguments override input that is loaded from the JSON file.

classmethod read_pickle(filename, **kwargs)

Read a GenotypePhenotypeMap from a pickle file.

stdeviations

Get the standard deviations of the phenotypes.

to_csv(filename=None, **kwargs)

Write the genotype-phenotype map to a CSV spreadsheet.

Keyword arguments are passed directly to Pandas dataframe to_csv method.

Parameters: filename (str) – Name of file to write out.

to_dict(complete=False)

Write the genotype-phenotype map to a dict.

to_excel(filename=None, **kwargs)

Write the genotype-phenotype map to an Excel spreadsheet.

Keyword arguments are passed directly to Pandas dataframe to_excel method.

Parameters: filename (str) – Name of file to write out.

to_json(filename=None, complete=False)

Write genotype-phenotype map to json file. If no filename is given returns

to_pickle(filename, **kwargs)

Write GenotypePhenotypeMap object to a pickle file.

wildtype

Get the wildtype genotype, the reference genotype for interactions.