Reading/Writing
===============

The ``GenotypePhenotypeMap`` object is a Pandas DataFrame at its core. Most
tabular formats (e.g. Excel files, csv, tsv, ...) can be read/written.

Excel Spreadsheets
------------------

Excel files are supported through the ``read_excel`` method. This method requires
`genotypes` and `phenotypes` columns, and can include `n_replicates` and
`stdeviations` as optional columns. All other columns are ignored.

**Example**: Excel spreadsheet file ("data.xlsx")

.. csv-table::
   :header: "", "genotypes", "phenotypes", "stdeviations", "n_replicates"

   0, PTEE, 0.243937, 0.013269, 1
   1, PTEY, 0.657831, 0.055803, 1
   2, PTFE, 0.104741, 0.013471, 1
   3, PTFY, 0.683304, 0.081887, 1
   4, PIEE, 0.774680, 0.069631, 1
   5, PIEY, 0.975995, 0.059985, 1
   6, PIFE, 0.500215, 0.098893, 1
   7, PIFY, 0.501697, 0.025082, 1
   8, RTEE, 0.233230, 0.052265, 1
   9, RTEY, 0.057961, 0.036845, 1
   10, RTFE, 0.365238, 0.050948, 1
   11, RTFY, 0.891505, 0.033239, 1
   12, RIEE, 0.156193, 0.085638, 1
   13, RIEY, 0.837269, 0.070373, 1
   14, RIFE, 0.599639, 0.050125, 1
   15, RIFY, 0.277137, 0.072571, 1

Read the spreadsheet directly into the GenotypePhenotypeMap.

.. code-block:: python

    from gpmap import GenotypePhenotypeMap

    gpm = GenotypePhenotypeMap.read_excel(wildtype="PTEE", filename="data.xlsx")

CSV File
--------

CSV files are supported through the ``read_csv`` method. This method requires
`genotypes` and `phenotypes` columns, and can include `n_replicates` and
`stdeviations` as optional columns. All other columns are ignored.

**Example**: CSV File

.. csv-table::
   :header: "", "genotypes", "phenotypes", "stdeviations", "n_replicates"

   0, PTEE, 0.243937, 0.013269, 1
   1, PTEY, 0.657831, 0.055803, 1
   2, PTFE, 0.104741, 0.013471, 1
   3, PTFY, 0.683304, 0.081887, 1
   4, PIEE, 0.774680, 0.069631, 1
   5, PIEY, 0.975995, 0.059985, 1
   6, PIFE, 0.500215, 0.098893, 1
   7, PIFY, 0.501697, 0.025082, 1
   8, RTEE, 0.233230, 0.052265, 1
   9, RTEY, 0.057961, 0.036845, 1
   10, RTFE, 0.365238, 0.050948, 1
   11, RTFY, 0.891505, 0.033239, 1
   12, RIEE, 0.156193, 0.085638, 1
   13, RIEY, 0.837269, 0.070373, 1
   14, RIFE, 0.599639, 0.050125, 1
   15, RIFY, 0.277137, 0.072571, 1
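
The column rules described for this file can be checked before loading it. The
sketch below uses only Python's standard ``csv`` module to sort a header into
missing required columns, optional columns that are present, and columns that
would be ignored. The helper ``classify_columns`` is hypothetical and is not
part of ``gpmap``; it only mirrors the rules stated in this section.

.. code-block:: python

    import csv
    import io

    # Column names taken from the rules stated in this section.
    REQUIRED = ("genotypes", "phenotypes")
    OPTIONAL = ("n_replicates", "stdeviations")


    def classify_columns(header):
        """Return (missing required, optional present, ignored) column names."""
        # Skip empty names, e.g. an unnamed index column.
        names = [name.strip() for name in header if name.strip()]
        missing = [name for name in REQUIRED if name not in names]
        optional = [name for name in OPTIONAL if name in names]
        ignored = [name for name in names if name not in REQUIRED + OPTIONAL]
        return missing, optional, ignored


    # In-memory stand-in for a CSV file; the "notes" column is a made-up extra.
    text = "genotypes,phenotypes,stdeviations,notes\nPTEE,0.243937,0.013269,wildtype\n"
    header = next(csv.reader(io.StringIO(text)))
    missing, optional, ignored = classify_columns(header)
    print(missing, optional, ignored)

An empty ``missing`` list means that both required columns are present.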

Read the csv directly into the GenotypePhenotypeMap.

.. code-block:: python

    from gpmap import GenotypePhenotypeMap

    gpm = GenotypePhenotypeMap.read_csv(wildtype="PTEE", filename="data.csv")

JSON Format
-----------

The only keys recognized by the json reader are:

1. `genotypes`
2. `phenotypes`
3. `stdeviations`
4. `mutations`
5. `n_replicates`

All other keys are ignored in the epistasis models. You can keep other metadata
stored in the JSON, but it won't be appended to the epistasis model object.

.. code-block:: javascript

    {
        "genotypes" : [
            "000",
            "001",
            "010",
            "011",
            "100",
            "101",
            "110",
            "111"
        ],
        "phenotypes" : [
            0.62344582,
            0.87943151,
            -0.11075798,
            -0.59754471,
            1.4314798,
            1.12551439,
            1.04859722,
            -0.27145593
        ],
        "stdeviations" : [
            0.01,
            0.01,
            0.01,
            0.01,
            0.01,
            0.01,
            0.01,
            0.01
        ],
        "mutations" : {
            "0" : ["0", "1"],
            "1" : ["0", "1"],
            "2" : ["0", "1"]
        },
        "n_replicates" : 12,
        "title" : "my data",
        "description" : "a really hard experiment"
    }
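
To illustrate how recognized keys differ from extra metadata, the sketch below
parses a shortened version of the JSON above with Python's standard ``json``
module and splits it into the two groups. It does not call the ``gpmap`` reader,
and the names ``RECOGNIZED_KEYS``, ``recognized`` and ``metadata`` are
assumptions made for this example.

.. code-block:: python

    import json

    # Keys listed in this section as recognized by the json reader.
    RECOGNIZED_KEYS = (
        "genotypes",
        "phenotypes",
        "stdeviations",
        "mutations",
        "n_replicates",
    )

    # Shortened copy of the example above (first two genotypes only).
    raw = """
    {
        "genotypes": ["000", "001"],
        "phenotypes": [0.62344582, 0.87943151],
        "mutations": {"0": ["0", "1"], "1": ["0", "1"], "2": ["0", "1"]},
        "n_replicates": 12,
        "title": "my data",
        "description": "a really hard experiment"
    }
    """

    data = json.loads(raw)

    # Split the parsed object into recognized keys and leftover metadata.
    recognized = {k: v for k, v in data.items() if k in RECOGNIZED_KEYS}
    metadata = {k: v for k, v in data.items() if k not in RECOGNIZED_KEYS}

    print(sorted(recognized))
    print(sorted(metadata))

JSON object keys are always strings, so the site indices under ``mutations``
come back from ``json.loads`` as ``"0"``, ``"1"`` and ``"2"`` rather than as
integers.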