kithairon.SurveyData#

class kithairon.SurveyData(data: ~polars.dataframe.frame.DataFrame = <factory>)#

Bases: object

A container for Echo survey data, potentially from many plates.

SurveyData holds Echo survey data, potentially from many individual surveys and sources, in a Polars DataFrame. It is intended to allow for easy access and use of individual surveys, while allowing for extensive analysis when required. It is primarily intended to ingest PlateSurvey XML files from Echo Liquid Handler software (accessible directly via platesurvey.EchoPlateSurveyXML). The format is Kithairon-specific, but can export back to EchoPlateSurveyXML format. It can be easily and compactly written to and read from Parquet files, with compression making them smaller than the originals despite increased verbosity.

All data is held in a single DataFrame, data, and every row is self-contained, with all survey metadata duplicated for each well, and all well, signal, and feature data included. This allows for easy multi-survey analyses and selections of data. Like a DataFrame, SurveyData is immutable: manipulation and selection operations efficiently return new SurveyData objects, only copying data when required.
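As a rough illustration of this row-per-well layout, the plain-Python sketch below stands in for the DataFrame with a list of dictionaries. The column names (timestamp, plate_name, well, volume) and all values are hypothetical, chosen only for illustration; they are not taken from Kithairon's actual schema.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical rows: each well row repeats its survey's metadata
# (timestamp, plate_name), so no row depends on any other row.
rows = [
    {"timestamp": datetime(2024, 1, 1, 9, 0), "plate_name": "src1", "well": "A1", "volume": 50.0},
    {"timestamp": datetime(2024, 1, 1, 9, 0), "plate_name": "src1", "well": "A2", "volume": 48.5},
    {"timestamp": datetime(2024, 1, 2, 9, 0), "plate_name": "src1", "well": "A1", "volume": 45.0},
    {"timestamp": datetime(2024, 1, 2, 9, 0), "plate_name": "src2", "well": "A1", "volume": 60.0},
]

# Because metadata is duplicated per row, a multi-survey selection is a plain filter.
a1_history = [(r["timestamp"], r["plate_name"], r["volume"]) for r in rows if r["well"] == "A1"]

# Individual surveys are recovered by grouping on the duplicated metadata.
surveys = defaultdict(list)
for r in rows:
    surveys[(r["timestamp"], r["plate_name"])].append(r)
```

The cost of this design is repeated metadata, which is the verbosity that columnar compression in Parquet is said above to offset.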

__init__(data: ~polars.dataframe.frame.DataFrame = <factory>) None#

Methods

__init__([data])

extend(other)

Extend this survey data with another survey data object or an iterable of survey data objects.

extend_read_parquet(path)

Extend the current SurveyData object with data from a Parquet file.

extend_read_xml(path)

Extend the current SurveyData object with the data from an XML file located at the given path.

find_latest_survey(*args, **kwargs)

Find the latest survey based on the given criteria, returning a SurveyData.

find_survey(**kwargs)

Find a single survey based on the given criteria, returning a SurveyData.

find_survey_timestamp(**kwargs)

Find a single survey timestamp based on the given criteria.

find_survey_timestamps(*args, **kwargs)

Find survey timestamps based on the given criteria.

from_json_dict(d)

from_platesurvey(ps)

Create a new instance of SurveyData from an EchoPlateSurveyXML object.

from_xml(xml_str)

Create a new instance of SurveyData from an XML string.

from_xml_tree(xml_tree)

Create a new instance of SurveyData from an XML tree.

heatmap([value, sel, axs, title, fill_value])

Generate a heatmap for each survey in the SurveyData.

read_parquet(path[, polars_options])

Read SurveyData from a Parquet file.

read_xml(path)

Read survey data from an Echo-produced XML file.

to_json_dict()

to_platesurveys()

Convert survey data to a list of EchoPlateSurveyXML objects.

volumes_array(*[, full_plate, fill_value])

Generate a 2D array of the volumes in each well.

with_columns(*args, **kwargs)

with_comment(comment[, overwrite])

with_plate_name(name[, overwrite])

write_parquet(path[, polars_options])

Write the survey data to a Parquet file.

write_platesurveys(paths[, path_str_format])

Write plate surveys to disk as Echo PlateSurvey format.

Attributes

is_single_survey

True if the SurveyData contains a single survey, False otherwise.

latest_single_surveys

A list of SurveyData instances, for the latest survey of each plate in the data.

lazy_data

plate_name

Name of the plate.

plate_shape

Shape of the full plate, in (rows, columns).

plate_total_wells

Total number of wells in the plate (not survey).

single_surveys

A list of SurveyData instances, for each individual plate survey in the data.

survey_columns

Number of columns in the survey.

survey_offset

Top left (row, column) offset from (0, 0) ("A1") of the survey data.

survey_rows

Number of rows in the survey.

survey_shape

Shape of the survey, which may not be the full plate, in (rows, columns).

surveys

A DataFrame listing the surveys in the SurveyData.

timestamp

Timestamp of the survey.

data

extend(
other: Self | Iterable[Self],
) Self#

Extend this survey data with another survey data object or an iterable of survey data objects.

Parameters:
other : Self | Iterable[Self]

The survey data object or iterable of survey data objects to extend with.

Returns:
Self

The extended survey data object.

Raises:
TypeError

If other is not an instance of Self or an iterable of Self.
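The calling convention described above (one object or an iterable of objects, a new object returned, TypeError otherwise) can be sketched with a small stand-in class. ToySurveyData is hypothetical and stores rows in a tuple rather than a Polars DataFrame; it is not Kithairon's implementation.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple, Union


@dataclass(frozen=True)
class ToySurveyData:
    """Hypothetical stand-in for SurveyData; rows are kept in a tuple."""

    rows: Tuple[dict, ...] = ()

    def extend(self, other: Union["ToySurveyData", Iterable["ToySurveyData"]]) -> "ToySurveyData":
        # Normalize a single object into a one-element list.
        if isinstance(other, ToySurveyData):
            others = [other]
        else:
            try:
                others = list(other)
            except TypeError:
                raise TypeError("other must be a ToySurveyData or an iterable of them") from None
            if not all(isinstance(o, ToySurveyData) for o in others):
                raise TypeError("other must be a ToySurveyData or an iterable of them")
        combined = self.rows + tuple(r for o in others for r in o.rows)
        # Return a new object; self is left untouched.
        return ToySurveyData(combined)


a = ToySurveyData(({"well": "A1"},))
b = ToySurveyData(({"well": "A2"},))
c = a.extend([b, b])
```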

extend_read_parquet(
path: str | Path | BinaryIO | BytesIO | bytes,
) Self#

Extend the current SurveyData object with data from a Parquet file.

Parameters:
path : str or Path or BinaryIO or BytesIO or bytes

The path to the Parquet file or a file-like object containing the data.

Returns:
SurveyData

A new SurveyData object that includes the data from the Parquet file.

extend_read_xml(path: str | PathLike) Self#

Extend the current SurveyData object with the data from an XML file located at the given path.

Parameters:
path : str or os.PathLike

The path to the XML file to read.

Returns:
SurveyData

A new SurveyData object that contains the data from the current object as well as the data from the XML file.

find_latest_survey(
*args,
**kwargs: _SurveySelectorArgs,
) Self#

Find the latest survey based on the given criteria, returning a SurveyData.

Parameters:
plate_name : str or None, optional

Name of the plate.

plate_type : str or None, optional

Type of the plate.

plate_barcode : str or None, optional

Barcode of the plate.

expr : pl.Expr or None, optional

Additional expression to filter the surveys.

Returns:
SurveyData

The data for the latest survey that matches the given criteria.
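The selection logic can be pictured as "filter by the given criteria, then take the newest timestamp". The sketch below does this over a list of dictionaries; find_latest, the field values, and returning None when nothing matches are all assumptions of the sketch, since the behavior of the real method for an empty match is not stated above.

```python
from datetime import datetime

# Hypothetical survey listing: one dict per survey.
surveys = [
    {"timestamp": datetime(2024, 1, 1, 9, 0), "plate_name": "src1", "plate_barcode": "0001"},
    {"timestamp": datetime(2024, 1, 3, 9, 0), "plate_name": "src1", "plate_barcode": "0001"},
    {"timestamp": datetime(2024, 1, 2, 9, 0), "plate_name": "src2", "plate_barcode": "0002"},
]


def find_latest(surveys, **criteria):
    """Return the most recent survey matching every non-None criterion, or None."""
    matches = [
        s for s in surveys
        if all(s.get(key) == wanted for key, wanted in criteria.items() if wanted is not None)
    ]
    # max() on timestamps picks the latest; default=None covers the no-match case.
    return max(matches, key=lambda s: s["timestamp"], default=None)


latest_src1 = find_latest(surveys, plate_name="src1")
```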

find_survey(
**kwargs: _SurveySelectorArgs,
) Self#

Find a single survey based on the given criteria, returning a SurveyData.

Parameters:
plate_name : str or None, optional

Name of the plate.

plate_type : str or None, optional

Type of the plate.

plate_barcode : str or None, optional

Barcode of the plate.

expr : pl.Expr or None, optional

Additional expression to filter the surveys.

Returns:
SurveyData

The data for the survey that matches the given criteria.

Raises:
ValueError

If no or multiple timestamps are found.

find_survey_timestamp(
**kwargs: _SurveySelectorArgs,
) datetime#

Find a single survey timestamp based on the given criteria.

Parameters:
plate_name : str or None, optional

Name of the plate.

plate_type : str or None, optional

Type of the plate.

plate_barcode : str or None, optional

Barcode of the plate.

expr : pl.Expr or None, optional

Additional expression to filter the surveys.

Returns:
datetime

The timestamp that matches the given criteria.

Raises:
ValueError

If no or multiple timestamps are found.
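The exactly-one rule can be sketched as follows: collect the distinct timestamps that match, and raise ValueError unless there is precisely one. single_timestamp and the sample data are hypothetical and only illustrate the documented contract.

```python
from datetime import datetime

# Hypothetical survey listing: src1 was surveyed twice, src2 once.
surveys = [
    {"timestamp": datetime(2024, 1, 1), "plate_name": "src1"},
    {"timestamp": datetime(2024, 1, 3), "plate_name": "src1"},
    {"timestamp": datetime(2024, 1, 2), "plate_name": "src2"},
]


def single_timestamp(surveys, **criteria):
    """Return the one timestamp whose survey matches all non-None criteria."""
    found = {
        s["timestamp"] for s in surveys
        if all(s.get(key) == wanted for key, wanted in criteria.items() if wanted is not None)
    }
    if len(found) != 1:
        raise ValueError(f"expected exactly one matching timestamp, found {len(found)}")
    return next(iter(found))


src2_time = single_timestamp(surveys, plate_name="src2")
```

In this sketch, asking for src1 raises because two timestamps match; narrowing the criteria or using a latest-survey lookup would be the way around that.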

find_survey_timestamps(
*args,
**kwargs: _SurveySelectorArgs,
) Series#

Find survey timestamps based on the given criteria.

Parameters:
plate_name : str or None, optional

Name of the plate.

plate_type : str or None, optional

Type of the plate.

plate_barcode : str or None, optional

Barcode of the plate.

expr : pl.Expr or None, optional

Additional expression to filter the surveys.

Returns:
pl.Series

A series of timestamps that match the given criteria.

classmethod from_platesurvey(
ps: EchoPlateSurveyXML,
) Self#

Create a new instance of SurveyData from an EchoPlateSurveyXML object.

Parameters:
ps : EchoPlateSurveyXML

The EchoPlateSurveyXML object to create the new instance from.

Returns:
SurveyData

A new instance of SurveyData created from the EchoPlateSurveyXML object.

classmethod from_xml(xml_str: str | bytes) Self#

Create a new instance of SurveyData from an XML string.

Parameters:
xml_str : str or bytes

The XML string to parse.

Returns:
SurveyData

A new instance of SurveyData created from the parsed XML.

Raises:
ParsingError

If the XML string cannot be parsed.

classmethod from_xml_tree(xml_tree: etree._Element) Self#

Create a new instance of SurveyData from an XML tree.

Parameters:
xml_tree : lxml.etree._Element

The XML tree to parse.

Returns:
SurveyData

A new instance of SurveyData created from the parsed XML.

Raises:
ParsingError

If the XML tree cannot be parsed.

heatmap(
value: str | Expr = 'volume',
sel: Expr | None = None,
axs: Axes | Iterable[Axes | None] | None = None,
title: str | Callable | None = None,
*,
fill_value: Any = nan,
**kwargs,
) list[Axes]#

Generate a heatmap for each survey in the SurveyData.

Parameters:
value : str | pl.Expr, optional

Value to use. May be a string (for a column) or a Polars expression. By default “volume”.

sel : pl.Expr | None, optional

Selector for surveys, by default None (use all surveys).

axs : Axes | Iterable[Axes | None] | None, optional

Axes to use, by default None (generate new axes).

title : str | Callable | None, optional

Title for the heatmap. If a callable, it is called with the survey-specific SurveyData for each survey heatmap. By default None (generate a default title).

fill_value : Any, optional

Fill value for missing values, by default np.nan.

Returns:
list[Axes]

Each Axes used: one per survey.

Raises:
ValueError

If the provided axes run out before every survey has been plotted.

property is_single_survey: bool#

True if the SurveyData contains a single survey, False otherwise.

property latest_single_surveys: list[Self]#

A list of SurveyData instances, for the latest survey of each plate in the data.

property plate_name: int#

Name of the plate. Single survey only.

property plate_shape: tuple[int, int]#

Shape of the full plate, in (rows, columns).

Calculated based on the standard Echo source plate name format, which includes the number of wells at the beginning. May fail for unusual plates. Single survey only.

Returns:
tuple[int, int]
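A minimal sketch of the idea, assuming the string begins with the total well count and the plate uses one of the standard microplate layouts. shape_from_name and the lookup table are hypothetical, and "384PP_AQ_BP" is used only as an example value.

```python
import re

# Standard microplate layouts as (rows, columns), keyed by total well count.
STANDARD_SHAPES = {96: (8, 12), 384: (16, 24), 1536: (32, 48)}


def shape_from_name(name):
    """Derive (rows, columns) from a name that starts with the well count."""
    match = re.match(r"\d+", name)
    if match is None:
        raise ValueError(f"no leading well count in {name!r}")
    wells = int(match.group())
    try:
        return STANDARD_SHAPES[wells]
    except KeyError:
        raise ValueError(f"unrecognized well count {wells} in {name!r}") from None


shape = shape_from_name("384PP_AQ_BP")
```

Names without a leading count, or with a non-standard count, fail in this sketch, which mirrors the caveat above about unusual plates.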

See also

plate_size

property plate_total_wells#

Total number of wells in the plate (not survey). Single survey only.

classmethod read_parquet(
path: str | Path | BinaryIO | BytesIO | bytes,
polars_options: dict[str, Any] | None = None,
) Self#

Read SurveyData from a Parquet file.

Parameters:
path : str | Path | BinaryIO | BytesIO | bytes

Path to read from, or file-like object.

polars_options : dict[str, Any] | None, optional

Options to pass to polars.read_parquet, by default None.

Returns:
SurveyData

classmethod read_xml(path: str | PathLike) Self#

Read survey data from an Echo-produced XML file.

Parameters:
path : str or os.PathLike

The path to the XML file.

Returns:
SurveyData

The survey data read from the XML file.

Raises:
ParsingError

If the XML file cannot be parsed.

property single_surveys: list[Self]#

A list of SurveyData instances, for each individual plate survey in the data.

property survey_columns: int#

Number of columns in the survey. Single survey only.

property survey_offset: tuple[int, int]#

Top left (row, column) offset from (0, 0) (“A1”) of the survey data. Single survey only.

Returns:
tuple[int, int]

Raises:
ValueError

If multiple offsets are found.
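To make the meaning of the offset (and of the survey_shape property documented on this page) concrete, the sketch below derives both from zero-based (row, column) well indices, assuming the surveyed wells form one rectangular block. offset_and_shape is a hypothetical helper; how Kithairon itself obtains these values is not shown here.

```python
def offset_and_shape(wells):
    """Return ((row_offset, col_offset), (n_rows, n_cols)) for zero-based (row, col) pairs."""
    wells = list(wells)
    if not wells:
        raise ValueError("no wells given")
    rows = [r for r, _ in wells]
    cols = [c for _, c in wells]
    # The offset is the top-left surveyed well, measured from A1 at (0, 0).
    offset = (min(rows), min(cols))
    # The shape spans from the top-left to the bottom-right surveyed well.
    shape = (max(rows) - offset[0] + 1, max(cols) - offset[1] + 1)
    return offset, shape


# Wells B2 to C4: rows B..C are indices 1..2, columns 2..4 are indices 1..3.
surveyed = [(r, c) for r in range(1, 3) for c in range(1, 4)]
offset, shape = offset_and_shape(surveyed)
```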

property survey_rows: int#

Number of rows in the survey. Single survey only.

property survey_shape: tuple[int, int]#

Shape of the survey, which may not be the full plate, in (rows, columns). Single survey only.

Returns:
tuple[int, int]

property surveys: DataFrame#

A DataFrame listing the surveys in the SurveyData.

Returns:
pl.DataFrame

property timestamp: datetime#

Timestamp of the survey. Single survey only.

to_platesurveys() list[EchoPlateSurveyXML]#

Convert survey data to a list of EchoPlateSurveyXML objects.

Returns:
list[EchoPlateSurveyXML]

A list of EchoPlateSurveyXML objects representing the survey data.

volumes_array(
*,
full_plate: bool = False,
fill_value: Any = nan,
) ndarray#

Generate a 2D array of the volumes in each well. Single survey only.

Parameters:
full_plate : bool, optional

Return the full plate if True (filling unsurveyed wells with fill_value), by default False.

fill_value : Any, optional

Value to fill unsurveyed wells and wells with no value in the survey, by default np.nan
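The effect of the two parameters can be sketched with nested lists in place of a NumPy array. volumes_grid is a hypothetical helper that takes a mapping from zero-based (row, column) indices to volumes; it illustrates the described behavior and is not the library's implementation.

```python
import math


def volumes_grid(volumes, plate_shape, full_plate=False, fill_value=math.nan):
    """Lay out {(row, col): volume} as a 2D list, padding gaps with fill_value."""
    if full_plate:
        # Cover the whole plate, starting at A1.
        row0, col0 = 0, 0
        n_rows, n_cols = plate_shape
    else:
        # Cover only the bounding box of the surveyed wells.
        row0 = min(r for r, _ in volumes)
        col0 = min(c for _, c in volumes)
        n_rows = max(r for r, _ in volumes) - row0 + 1
        n_cols = max(c for _, c in volumes) - col0 + 1
    grid = [[fill_value] * n_cols for _ in range(n_rows)]
    for (r, c), volume in volumes.items():
        grid[r - row0][c - col0] = volume
    return grid


# Three surveyed wells (B2, B3, C3) on a hypothetical 4 x 6 plate.
measured = {(1, 1): 10.0, (1, 2): 20.0, (2, 2): 30.0}
survey_only = volumes_grid(measured, (4, 6), fill_value=0.0)
whole_plate = volumes_grid(measured, (4, 6), full_plate=True, fill_value=0.0)
```

A fill value of 0.0 is used in the usage lines only to keep the result easy to compare; the documented default is NaN.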

Returns:
np.ndarray

write_parquet(
path: str | Path | BytesIO,
polars_options: dict[str, Any] | None = None,
) None#

Write the survey data to a Parquet file.

Parameters:
path : str or Path or BytesIO

The path to write the Parquet file to, or a BytesIO object to write to.

polars_options : dict[str, Any] or None, optional

Options to pass to the Polars DataFrame.write_parquet method.

Examples

>>> survey_data = SurveyData(...)
>>> survey_data.write_parquet("survey_data.parquet")

write_platesurveys(
paths: str | PathLike[str] | Iterable[str | PathLike[str]] | Callable[[EchoPlateSurveyXML], str],
path_str_format=True,
) None#

Write plate surveys to disk as Echo PlateSurvey format.

Parameters:
paths : str or os.PathLike or iterable of str or os.PathLike or callable

The path(s) to write the plate surveys to. If a callable is provided, it should take an EchoPlateSurveyXML object as input and return a string path.

path_str_format : bool, optional

Whether to format the path(s) using the format method of the paths argument, by default True.

Returns:
None

Raises:
ValueError

If a duplicate path is encountered.
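One plausible reading of these options is sketched below: a string is treated as a template filled in per survey, a callable is invoked per survey, an iterable supplies one path per survey, and a repeated path raises ValueError. resolve_paths, the use of dictionaries for surveys, and the template field plate_name are all assumptions made for illustration; the fields actually available to the format call are not documented above.

```python
def resolve_paths(surveys, paths, path_str_format=True):
    """Return one output path per survey, raising ValueError on duplicates."""
    if callable(paths):
        resolved = [paths(s) for s in surveys]
    elif isinstance(paths, str):
        # One template for all surveys, optionally filled from each survey's fields.
        resolved = [paths.format(**s) if path_str_format else paths for s in surveys]
    else:
        # An iterable of paths, matched to surveys in order.
        resolved = [str(p) for p, _ in zip(paths, surveys)]
    seen = set()
    for path in resolved:
        if path in seen:
            raise ValueError(f"duplicate path: {path}")
        seen.add(path)
    return resolved


# Hypothetical surveys, represented here as plain dictionaries.
surveys = [{"plate_name": "src1"}, {"plate_name": "src2"}]
from_template = resolve_paths(surveys, "{plate_name}.xml")
from_callable = resolve_paths(surveys, lambda s: s["plate_name"] + "_out.xml")
```

In this sketch a fixed name such as "out.xml" with more than one survey resolves to the same path twice and therefore raises.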