API Documentation
The GenotypePhenotypeMap is the main entry point to the gpmap package. Load your data using the read methods attached to this object. The subpackages that follow provide various objects for analyzing it.
GenotypePhenotypeMap
class gpmap.gpm.GenotypePhenotypeMap(wildtype, genotypes, phenotypes=None, stdeviations=None, mutations=None, site_labels=None, n_replicates=1, **kwargs)
Bases: object
Object for containing genotype-phenotype map data.
Parameters:
- wildtype (string) – Wildtype sequence.
- genotypes (array-like) – List of all genotypes.
- phenotypes (array-like) – List of phenotypes in the same order as genotypes. If None, all genotypes are assigned a phenotype of np.nan.
- mutations (dict) – Dictionary that maps each site index to its possible substitution alphabet.
- site_labels (array-like) – List of labels to apply to sites. If not specified, the first site is labeled 0, the next 1, and so on. If specified, sites are assigned labels in the order given. For example, if the genotypes specify mutations at positions 12 and 75, this would be the list [12, 75].
- n_replicates (int) – Number of replicate measurements comprising the mean phenotypes.
- include_binary (bool, default=True) – Construct a binary representation of the space.
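As a rough illustration of how these arguments fit together, the sketch below assembles inputs for a two-site map using only the standard library. The sequences, phenotype values, and the helper `default_site_labels` are made up for illustration; the commented-out constructor call simply follows the signature documented above and assumes gpmap is installed.

```python
# Hypothetical inputs for a two-site map; all values are illustrative.
wildtype = "AA"
genotypes = ["AA", "AT", "TA", "TT"]
phenotypes = [0.1, 0.4, 0.5, 1.0]  # must be in the same order as genotypes

# Each site index maps to the letters allowed at that site.
mutations = {0: ["A", "T"], 1: ["A", "T"]}


def default_site_labels(wildtype):
    """Labels described for the unspecified case: 0, 1, 2, ... (hypothetical helper)."""
    return list(range(len(wildtype)))


# Basic consistency checks worth running before building a map.
assert len(genotypes) == len(phenotypes)
assert all(len(g) == len(wildtype) for g in genotypes)
assert all(g[i] in mutations[i] for g in genotypes for i in range(len(g)))

site_labels = default_site_labels(wildtype)

# With gpmap installed, the call would follow the documented signature:
# from gpmap.gpm import GenotypePhenotypeMap
# gpm = GenotypePhenotypeMap(wildtype, genotypes, phenotypes=phenotypes,
#                            mutations=mutations, site_labels=site_labels)
```

Checking lengths and alphabets up front is a design choice of this sketch, not something the documentation says the constructor does.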
- data – The core data object. Columns are ‘genotypes’, ‘phenotypes’, ‘n_replicates’, ‘stdeviations’, and (optionally) ‘binary’.
  Type: pandas.DataFrame
- complete_data – A dataframe mapping the complete set of possible genotypes, given the mutations dictionary. Contains all columns in data. Any missing data is reported as NaN.
  Type: pandas.DataFrame (optional, created by BinaryMap)
- missing_data – A dataframe containing the set of missing genotypes, i.e. complete_data minus data. Two columns: ‘genotypes’ and ‘binary’.
  Type: pandas.DataFrame (optional, created by BinaryMap)
- binary – Object that gives the user access to the binary representation of the map.
  Type: BinaryMap
- encoding_table – Pandas DataFrame showing how mutations map to the binary representation.
- add_binary() – Build a binary representation of the set of genotypes and add it as a column to the main DataFrame.
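One way to picture a binary representation is to give each non-wildtype letter at a site its own bit, so that the wildtype is all zeros. The sketch below implements that scheme purely as an assumption for illustration; it is not taken from the gpmap source, and the library's actual encoding may differ. `encode_binary` and the example data are hypothetical.

```python
def encode_binary(genotype, wildtype, mutations):
    """Encode a genotype as a bit string relative to the wildtype.

    Assumed scheme: a site with k allowed letters gets k - 1 bits,
    one per non-wildtype letter; the wildtype letter is all zeros.
    """
    bits = []
    for site, (letter, wt_letter) in enumerate(zip(genotype, wildtype)):
        alternatives = [a for a in mutations[site] if a != wt_letter]
        bits.extend("1" if letter == alt else "0" for alt in alternatives)
    return "".join(bits)


wildtype = "AA"
mutations = {0: ["A", "T"], 1: ["A", "C", "G"]}

# Map a few example genotypes to their assumed binary form.
encoded = {g: encode_binary(g, wildtype, mutations)
           for g in ["AA", "TA", "AC", "TG"]}
```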
- add_n_mutations() – Build a column with the number of mutations in each genotype and add it as a column to the main DataFrame.
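Assuming "number of mutations" means the number of sites at which a genotype differs from the wildtype (the Hamming distance), the column could be computed along these lines. `count_mutations` is a hypothetical helper, not a gpmap function.

```python
def count_mutations(genotype, wildtype):
    """Count sites where the genotype differs from the wildtype."""
    return sum(g != w for g, w in zip(genotype, wildtype))


wildtype = "AAA"
genotypes = ["AAA", "ATA", "TAT", "TTT"]

# One count per genotype, in the same order as the genotypes list.
n_mutations = [count_mutations(g, wildtype) for g in genotypes]
```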
- binary – Binary representation of genotypes.
- classmethod from_dict(metadata)
- classmethod from_json(json_str) – Load a genotype-phenotype map directly from a JSON string. The JSON metadata must include the following attributes
  Note: Keyword arguments override input that is loaded from the JSON file.
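The override rule in the note can be pictured as a dictionary merge in which keyword arguments win over values parsed from the JSON. The sketch below shows that merge with the standard `json` module; the key names and the helper `load_metadata` are assumptions chosen to mirror the constructor parameters, not the library's documented JSON schema.

```python
import json


def load_metadata(json_str, **kwargs):
    """Parse JSON metadata, letting keyword arguments take precedence."""
    metadata = json.loads(json_str)
    metadata.update(kwargs)  # keyword arguments override loaded values
    return metadata


# Hypothetical metadata; key names are illustrative only.
json_str = json.dumps({
    "wildtype": "AA",
    "genotypes": ["AA", "AT"],
    "phenotypes": [0.1, 0.4],
    "n_replicates": 1,
})

metadata = load_metadata(json_str, n_replicates=3)
```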
- genotypes – Get the genotypes of the system.
- get_all_possible_genotypes() – Get the complete set of possible genotypes. The genotypes are returned in no particular order; consider sorting them.
- get_missing_genotypes() – Get all genotypes missing from the complete genotype-phenotype map.
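Conceptually, the complete set is every combination of the allowed letters at each site, and the missing set is whatever the observed genotypes do not cover. A minimal standard-library sketch of that idea follows; the helper names and data are hypothetical, and it assumes every site has a list of allowed letters in the mutations dictionary.

```python
from itertools import product


def all_possible_genotypes(mutations):
    """Every combination of allowed letters, one letter per site."""
    sites = sorted(mutations)
    return ["".join(letters)
            for letters in product(*(mutations[s] for s in sites))]


def missing_genotypes(observed, mutations):
    """Possible genotypes that were not observed, sorted for stable output."""
    observed = set(observed)
    return sorted(g for g in all_possible_genotypes(mutations)
                  if g not in observed)


mutations = {0: ["A", "T"], 1: ["A", "T"]}
observed = ["AA", "TT"]

complete = sorted(all_possible_genotypes(mutations))
missing = missing_genotypes(observed, mutations)
```

Sorting both results makes them reproducible, in the spirit of the advice to sort the returned genotypes.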
- index – Return a numpy array of genotype positions.
- length – Get the length of the genotypes.
- map(attr1, attr2) – Dictionary that maps attr1 to attr2.
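Assuming the two attributes are equal-length sequences paired element by element, the resulting dictionary can be pictured as below. `attribute_map` is a hypothetical stand-in that takes the sequences directly; how the library's method resolves attr1 and attr2 is not described here.

```python
def attribute_map(keys, values):
    """Pair two equal-length sequences into a dictionary."""
    return dict(zip(keys, values))


genotypes = ["AA", "AT", "TA", "TT"]
phenotypes = [0.1, 0.4, 0.5, 1.0]

# Look up a phenotype by its genotype.
geno_to_pheno = attribute_map(genotypes, phenotypes)
```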
- mutant – Get the farthest mutant in the genotype-phenotype map.
- mutations – Get the mutations dictionary, which maps each site index to its possible substitution alphabet.
- n – Get the number of genotypes, i.e. the size of the genotype-phenotype map.
- n_replicates – Return the number of replicate measurements made of the phenotype.
- phenotypes – Get the phenotypes of the system.
- classmethod read_csv(fname, wildtype, **kwargs)
- classmethod read_dataframe(dataframe, wildtype, **kwargs) – Construct a GenotypePhenotypeMap from a dataframe.
- classmethod read_excel(fname, wildtype, **kwargs)
- classmethod read_json(filename, **kwargs) – Load a genotype-phenotype map directly from a JSON file. The JSON metadata must include the following attributes
  Note: Keyword arguments override input that is loaded from the JSON file.
- classmethod read_pickle(filename, **kwargs) – Read a GenotypePhenotypeMap from a pickle file.
- stdeviations – Get the standard deviations of the phenotypes.
- to_csv(filename=None, **kwargs) – Write the genotype-phenotype map to a CSV spreadsheet. Keyword arguments are passed directly to the pandas DataFrame to_csv method.
  Parameters: filename (str) – Name of file to write out.
- to_dict(complete=False) – Write the genotype-phenotype map to a dict.
- to_excel(filename=None, **kwargs) – Write the genotype-phenotype map to an Excel spreadsheet. Keyword arguments are passed directly to the pandas DataFrame to_excel method.
  Parameters: filename (str) – Name of file to write out.
- to_json(filename=None, complete=False) – Write the genotype-phenotype map to a JSON file. If no filename is given returns
- to_pickle(filename, **kwargs) – Write the GenotypePhenotypeMap object to a pickle file.
- wildtype – Get reference genotypes for interactions.