kithairon.SurveyData
- class kithairon.SurveyData(data: polars.dataframe.frame.DataFrame = <factory>)
Bases: object
A container for Echo survey data, potentially from many plates.
SurveyData holds Echo survey data, potentially from many individual surveys and sources, in a Polars DataFrame. It is intended to allow easy access to and use of individual surveys, while still supporting extensive analysis when required. It is primarily intended to ingest PlateSurvey XML files from Echo Liquid Handler software (accessible directly via platesurvey.EchoPlateSurveyXML). The format is Kithairon-specific, but can be exported back to EchoPlateSurveyXML format. It can be written to and read from Parquet files easily and compactly, with compression making them smaller than the originals despite the increased verbosity.
All data is held in a single DataFrame, data, and every row is self-contained: all survey metadata is duplicated for each well, and all well, signal, and feature data is included. This allows for easy multi-survey analyses and selections of data. Like a DataFrame, SurveyData is immutable: manipulation and selection operations efficiently return new SurveyData objects, only copying data when required.
Methods
__init__([data])
extend(other): Extend this survey data with another survey data object or an iterable of survey data objects.
extend_read_parquet(path): Extend the current SurveyData object with data from a Parquet file.
extend_read_xml(path): Extend the current SurveyData object with the data from an XML file located at the given path.
find_latest_survey(*args, **kwargs): Find the latest survey based on the given criteria, returning a SurveyData.
find_survey(**kwargs): Find a single survey based on the given criteria, returning a SurveyData.
find_survey_timestamp(**kwargs): Find a single survey timestamp based on the given criteria.
find_survey_timestamps(*args, **kwargs): Find survey timestamps based on the given criteria.
from_json_dict(d)
from_platesurvey(ps): Create a new instance of SurveyData from an EchoPlateSurveyXML object.
from_xml(xml_str): Create a new instance of SurveyData from an XML string.
from_xml_tree(xml_tree): Create a new instance of SurveyData from an XML tree.
heatmap([value, sel, axs, title, fill_value]): Generate a heatmap for each survey in the SurveyData.
read_parquet(path[, polars_options]): Read SurveyData from a Parquet file.
read_xml(path): Read survey data from an Echo-produced XML file.
to_json_dict()
to_platesurveys(): Convert survey data to a list of EchoPlateSurveyXML objects.
volumes_array(*[, full_plate, fill_value]): Generate a 2D array of the volumes in each well.
with_columns(*args, **kwargs)
with_comment(comment[, overwrite])
with_plate_name(name[, overwrite])
write_parquet(path[, polars_options]): Write the survey data to a Parquet file.
write_platesurveys(paths[, path_str_format]): Write plate surveys to disk in Echo PlateSurvey format.
Attributes
True if the SurveyData contains a single survey, False otherwise.
A list of SurveyData instances, for the latest survey of each plate in the data.
lazy_data
Number of rows in the survey.
Shape of the full plate, in (rows, columns).
Total number of wells in the plate (not survey).
A list of SurveyData instances, for each individual plate survey in the data.
Number of columns in the survey.
Top left (row, column) offset from (0, 0) ("A1") of the survey data.
Number of rows in the survey.
Shape of the survey, which may not be the full plate, in (rows, columns).
A DataFrame listing the surveys in the SurveyData.
Timestamp of the survey.
data
- extend(other) → Self
Extend this survey data with another survey data object or an iterable of survey data objects.
- Parameters:
- other : Self | Iterable[Self]
The survey data object or iterable of survey data objects to extend with.
- Returns:
- Self
The extended survey data object.
- Raises:
- TypeError
If other is not an instance of Self or an iterable of Self.
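The contract described above (accept one object or an iterable of them, return a new object, raise TypeError otherwise) can be sketched with the standard library alone. This is an illustration, not kithairon's implementation: `RowStore` and its `rows` field are hypothetical stand-ins for SurveyData and its DataFrame.

```python
from __future__ import annotations

from collections.abc import Iterable
from dataclasses import dataclass


@dataclass(frozen=True)
class RowStore:
    """Hypothetical immutable container of rows (stand-in for SurveyData)."""

    rows: tuple[dict, ...] = ()

    def extend(self, other: RowStore | Iterable[RowStore]) -> RowStore:
        # Normalise the argument to a list of containers.
        if isinstance(other, RowStore):
            others = [other]
        else:
            try:
                others = list(other)
            except TypeError as exc:
                raise TypeError("other must be a RowStore or an iterable of RowStore") from exc
            if not all(isinstance(o, RowStore) for o in others):
                raise TypeError("other must be a RowStore or an iterable of RowStore")
        # Build a new object; the receiver is left unchanged.
        combined = self.rows + tuple(row for o in others for row in o.rows)
        return RowStore(combined)


a = RowStore(({"plate_name": "P1", "well": "A1", "volume": 50.0},))
b = RowStore(({"plate_name": "P2", "well": "A1", "volume": 48.5},))
c = a.extend([b, b])  # a still holds one row; c holds three
```

Returning a new object rather than mutating the receiver matches the immutability described for SurveyData, so callers would need to keep the return value.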
- extend_read_parquet(path) → Self
Extend the current SurveyData object with data from a Parquet file.
- Parameters:
- path : str or Path or BinaryIO or BytesIO or bytes
The path to the Parquet file or a file-like object containing the data.
- Returns:
- SurveyData
A new SurveyData object that includes the data from the Parquet file.
- extend_read_xml(path: str | PathLike) → Self
Extend the current SurveyData object with the data from an XML file located at the given path.
- Parameters:
- path : str or os.PathLike
The path to the XML file to read.
- Returns:
- SurveyData
A new SurveyData object that contains the data from the current object as well as the data from the XML file.
- find_latest_survey(
- *args,
- **kwargs: _SurveySelectorArgs,
- )
Find the latest survey based on the given criteria, returning a SurveyData.
- Parameters:
- plate_name : str or None, optional
Name of the plate.
- plate_type : str or None, optional
Type of the plate.
- plate_barcode : str or None, optional
Barcode of the plate.
- expr : pl.Expr or None, optional
Additional expression to filter the surveys.
- Returns:
- SurveyData
The data for the latest survey that matches the given criteria.
- find_survey(
- **kwargs: _SurveySelectorArgs,
- )
Find a single survey based on the given criteria, returning a SurveyData.
- Parameters:
- plate_name : str or None, optional
Name of the plate.
- plate_type : str or None, optional
Type of the plate.
- plate_barcode : str or None, optional
Barcode of the plate.
- expr : pl.Expr or None, optional
Additional expression to filter the surveys.
- Returns:
- SurveyData
The data for the survey that matches the given criteria.
- Raises:
- ValueError
If no timestamp or more than one timestamp is found.
- find_survey_timestamp(
- **kwargs: _SurveySelectorArgs,
- )
Find a single survey timestamp based on the given criteria.
- Parameters:
- plate_name : str or None, optional
Name of the plate.
- plate_type : str or None, optional
Type of the plate.
- plate_barcode : str or None, optional
Barcode of the plate.
- expr : pl.Expr or None, optional
Additional expression to filter the surveys.
- Returns:
- datetime
The timestamp that matches the given criteria.
- Raises:
- ValueError
If no timestamp or more than one timestamp is found.
- find_survey_timestamps(
- *args,
- **kwargs: _SurveySelectorArgs,
- )
Find survey timestamps based on the given criteria.
- Parameters:
- plate_name : str or None, optional
Name of the plate.
- plate_type : str or None, optional
Type of the plate.
- plate_barcode : str or None, optional
Barcode of the plate.
- expr : pl.Expr or None, optional
Additional expression to filter the surveys.
- Returns:
- pl.Series
A series of timestamps that match the given criteria.
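The selector methods above share one pattern: filter surveys by whichever criteria are given, then return all matches, exactly one match, or the most recent match. The following is a minimal standard-library sketch of that pattern, not kithairon's code. The records, the function names, and the choice to raise when the "latest" lookup finds nothing are all assumptions made for illustration; the `expr` selector is omitted.

```python
from datetime import datetime

# Hypothetical survey records; the keys mirror the documented selector parameters.
SURVEYS = [
    {"plate_name": "Source1", "plate_type": "384PP_AQ_BP", "plate_barcode": "B001",
     "timestamp": datetime(2024, 1, 5, 9, 0)},
    {"plate_name": "Source1", "plate_type": "384PP_AQ_BP", "plate_barcode": "B001",
     "timestamp": datetime(2024, 1, 6, 14, 30)},
    {"plate_name": "Source2", "plate_type": "384PP_AQ_BP", "plate_barcode": "B002",
     "timestamp": datetime(2024, 1, 6, 15, 0)},
]


def find_timestamps(surveys, plate_name=None, plate_type=None, plate_barcode=None):
    """Return sorted timestamps of surveys matching every criterion that is not None."""
    criteria = {"plate_name": plate_name, "plate_type": plate_type,
                "plate_barcode": plate_barcode}
    active = {key: value for key, value in criteria.items() if value is not None}
    return sorted(
        s["timestamp"] for s in surveys
        if all(s[key] == value for key, value in active.items())
    )


def find_timestamp(surveys, **criteria):
    """Return the single matching timestamp; raise ValueError for zero or several."""
    matches = find_timestamps(surveys, **criteria)
    if len(matches) != 1:
        raise ValueError(f"expected exactly one survey, found {len(matches)}")
    return matches[0]


def find_latest_timestamp(surveys, **criteria):
    """Return the most recent matching timestamp (raising on no match is an assumption)."""
    matches = find_timestamps(surveys, **criteria)
    if not matches:
        raise ValueError("no survey matches the given criteria")
    return matches[-1]
```

With these records, selecting by `plate_name="Source1"` yields two timestamps, so the single-match lookup would raise ValueError while the latest-match lookup would return the later of the two.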
- classmethod from_platesurvey(
- ps: EchoPlateSurveyXML,
- )
Create a new instance of SurveyData from an EchoPlateSurveyXML object.
- Parameters:
- ps : EchoPlateSurveyXML
The EchoPlateSurveyXML object to create the new instance from.
- Returns:
- SurveyData
A new instance of SurveyData created from the EchoPlateSurveyXML object.
- classmethod from_xml(xml_str: str | bytes) → Self
Create a new instance of SurveyData from an XML string.
- Parameters:
- xml_str : str or bytes
The XML string to parse.
- Returns:
- SurveyData
A new instance of SurveyData created from the parsed XML.
- Raises:
- ParsingError
If the XML string cannot be parsed.
- classmethod from_xml_tree(xml_tree: etree._Element) → Self
Create a new instance of SurveyData from an XML tree.
- Parameters:
- xml_tree : lxml.etree._Element
The XML tree to parse.
- Returns:
- SurveyData
A new instance of SurveyData created from the parsed XML.
- Raises:
- ParsingError
If the XML tree cannot be parsed.
- heatmap(
- value: str | Expr = 'volume',
- sel: Expr | None = None,
- axs: Axes | Iterable[Axes | None] | None = None,
- title: str | Callable | None = None,
- *,
- fill_value: Any = nan,
- **kwargs,
- )
Generate a heatmap for each survey in the SurveyData.
- Parameters:
- value : str | pl.Expr, optional
Value to use. May be a string (for a column) or a Polars expression. By default "volume".
- sel : pl.Expr | None, optional
Selector for surveys, by default None (use all surveys).
- axs : Axes | Iterable[Axes | None] | None, optional
Axes to use, by default None (generate new axes).
- title : str | Callable | None, optional
Title for the heatmap. If a callable, it is called with the survey-specific SurveyData for each survey heatmap. By default None (generate a default title).
- fill_value : Any, optional
Fill value for missing values, by default np.nan.
- Returns:
- list[Axes]
Each Axes used: one per survey.
- Raises:
- ValueError
If the provided axes run out before every survey has been plotted.
- property latest_single_surveys: list[Self]
A list of SurveyData instances, for the latest survey of each plate in the data.
- property plate_shape: tuple[int, int]
Shape of the full plate, in (rows, columns).
Calculated based on the standard Echo source plate name format, which includes the number of wells at the beginning. May fail for unusual plates. Single survey only.
- Returns:
- tuple[int, int]
See also
plate_size
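As an illustration of deriving a plate shape from a name that starts with the well count, the sketch below parses the leading integer and looks it up in a table of conventional microplate layouts. It is not kithairon's code: the function name, the lookup table, and the example plate type strings are assumptions.

```python
import re

# Conventional microplate layouts as (rows, columns), keyed by well count.
STANDARD_SHAPES = {
    6: (2, 3),
    24: (4, 6),
    96: (8, 12),
    384: (16, 24),
    1536: (32, 48),
}


def plate_shape_from_type(plate_type: str) -> tuple[int, int]:
    """Infer (rows, columns) from a plate type name that begins with the well count."""
    match = re.match(r"\d+", plate_type)
    if match is None:
        raise ValueError(f"no leading well count in plate type {plate_type!r}")
    wells = int(match.group())
    if wells not in STANDARD_SHAPES:
        raise ValueError(f"unrecognised well count {wells} in {plate_type!r}")
    return STANDARD_SHAPES[wells]


shape = plate_shape_from_type("384PP_AQ_BP")  # example plate type string
```

A name without a leading number, or with an unconventional well count, raises ValueError in this sketch, which is one way the "may fail for unusual plates" caveat could surface.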
- property plate_total_wells
Total number of wells in the plate (not survey). Single survey only.
- classmethod read_parquet(path[, polars_options]) → Self
Read SurveyData from a Parquet file.
- Parameters:
- path : str | Path | BinaryIO | BytesIO | bytes
Path to read from, or file-like object.
- polars_kw : dict[str, Any] | None, optional
Options to pass to polars.read_parquet, by default None
- Returns:
- SurveyData
- classmethod read_xml(path: str | PathLike) → Self
Read survey data from an Echo-produced XML file.
- Parameters:
- path : str or os.PathLike
The path to the XML file.
- Returns:
- SurveyData
The survey data read from the XML file.
- Raises:
- ParsingError
If the XML file cannot be parsed.
- property single_surveys: list[Self]
A list of SurveyData instances, for each individual plate survey in the data.
- property survey_offset: tuple[int, int]
Top left (row, column) offset from (0, 0) (“A1”) of the survey data. Single survey only.
- Returns:
- tuple[int, int]
- Raises:
- ValueError
Multiple offsets were returned.
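To make the (row, column) convention concrete, the sketch below converts well names to zero-based indices, so that "A1" maps to (0, 0), and derives the top-left offset and shape of a surveyed region from its well names. The helper names are hypothetical and the survey is assumed to be a rectangular block; this is not how kithairon computes these properties.

```python
import re


def well_to_index(well: str) -> tuple[int, int]:
    """Convert a well name such as 'A1' or 'AF48' to a zero-based (row, column)."""
    match = re.fullmatch(r"([A-Z]+)(\d+)", well)
    if match is None:
        raise ValueError(f"not a well name: {well!r}")
    letters, number = match.groups()
    row = 0
    for ch in letters:
        # Row letters behave like base-26 digits: A..Z, then AA, AB, ...
        row = row * 26 + (ord(ch) - ord("A") + 1)
    return row - 1, int(number) - 1


def offset_and_shape(wells):
    """Return ((row_offset, col_offset), (n_rows, n_cols)) for a rectangular survey."""
    indices = [well_to_index(w) for w in wells]
    rows = [r for r, _ in indices]
    cols = [c for _, c in indices]
    offset = (min(rows), min(cols))
    shape = (max(rows) - min(rows) + 1, max(cols) - min(cols) + 1)
    return offset, shape


offset, shape = offset_and_shape(["C5", "C6", "D5", "D6"])
```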
- property survey_shape: tuple[int, int]
Shape of the survey, which may not be the full plate, in (rows, columns). Single survey only.
- Returns:
- tuple[int, int]
- property surveys: DataFrame
A DataFrame listing the surveys in the SurveyData.
- Returns:
- pl.DataFrame
- to_platesurveys() → list[EchoPlateSurveyXML]
Convert survey data to a list of EchoPlateSurveyXML objects.
- Returns:
- list[EchoPlateSurveyXML]
A list of EchoPlateSurveyXML objects representing the survey data.
- volumes_array(*[, full_plate, fill_value]) → ndarray
Generate a 2D array of the volumes in each well. Single survey only.
- Parameters:
- full_plate : bool, optional
Return the full plate if true (filling unsurveyed wells with fill_value), by default False
- fill_value : Any, optional
Value to fill unsurveyed wells and wells with no value in the survey, by default np.nan
- Returns:
- np.ndarray
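The interaction between full_plate and fill_value can be sketched without NumPy by building a nested list. Everything here is an assumption made for illustration: the function name, the input format (a dict mapping zero-based plate coordinates to volumes), and the explicit offset and shape arguments. The real method returns an ndarray and reads these values from the survey itself.

```python
import math


def volumes_grid(wells, survey_offset, survey_shape, plate_shape,
                 full_plate=False, fill_value=math.nan):
    """Lay out well volumes on a 2D grid.

    wells maps zero-based (row, column) plate coordinates to a volume.
    """
    if full_plate:
        # Cover the whole plate; unsurveyed wells keep fill_value.
        (row0, col0), (n_rows, n_cols) = (0, 0), plate_shape
    else:
        # Cover only the surveyed region, shifted so its top-left well is [0][0].
        (row0, col0), (n_rows, n_cols) = survey_offset, survey_shape
    grid = [[fill_value] * n_cols for _ in range(n_rows)]
    for (row, col), volume in wells.items():
        grid[row - row0][col - col0] = volume
    return grid


# A 2x2 survey starting at well B2, with one well lacking a value.
wells = {(1, 1): 10.0, (1, 2): 20.0, (2, 1): 30.0}
survey_only = volumes_grid(wells, (1, 1), (2, 2), (4, 6))
whole_plate = volumes_grid(wells, (1, 1), (2, 2), (4, 6), full_plate=True)
```

In the survey-only grid the missing well is filled with fill_value; in the full-plate grid, so are all wells outside the surveyed region.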
- write_parquet(path[, polars_options]) → None
Write the survey data to a Parquet file.
- Parameters:
- path : str or Path or BytesIO
The path to write the Parquet file to, or a BytesIO object to write to.
- polars_options : dict[str, Any] or None, optional
Options to pass to the Polars DataFrame.write_parquet method.
Examples
>>> survey_data = SurveyData(...)
>>> survey_data.write_parquet("survey_data.parquet")
- write_platesurveys(
- paths: str | PathLike[str] | Iterable[str | PathLike[str]] | Callable[[EchoPlateSurveyXML], str],
- path_str_format=True,
- )
Write plate surveys to disk in Echo PlateSurvey format.
- Parameters:
- paths : str or os.PathLike or iterable of str or os.PathLike or callable
The path(s) to write the plate surveys to. If a callable is provided, it should take an EchoPlateSurveyXML object as input and return a string path.
- path_str_format : bool, optional
Whether to format the path(s) using the format method of the paths argument, by default True.
- Returns:
- None
- Raises:
- ValueError
If a duplicate path is encountered.
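One way to picture how a single template, a list of paths, or a callable could each yield one output path per survey, with duplicates rejected, is the sketch below. It is not kithairon's implementation: `resolve_paths` is a hypothetical helper, surveys are represented as plain dicts, and the fields passed to `str.format` (here `plate_name`) are assumptions, since the page does not say which fields are available for formatting.

```python
def resolve_paths(paths, surveys, path_str_format=True):
    """Return one output path per survey, raising ValueError on duplicates."""
    if callable(paths):
        # A callable receives each survey and returns its path.
        resolved = [paths(survey) for survey in surveys]
    else:
        if isinstance(paths, str):
            # A single string is reused as the template for every survey.
            templates = [paths] * len(surveys)
        else:
            templates = list(paths)
            if len(templates) != len(surveys):
                raise ValueError("need exactly one path per survey")
        resolved = [
            template.format(**survey) if path_str_format else template
            for template, survey in zip(templates, surveys)
        ]
    seen = set()
    for path in resolved:
        if path in seen:
            raise ValueError(f"duplicate path: {path}")
        seen.add(path)
    return resolved


# Hypothetical survey records used only to fill the template.
surveys = [{"plate_name": "Source1"}, {"plate_name": "Source2"}]
out = resolve_paths("{plate_name}.xml", surveys)
```

In this sketch, a fixed string with formatting turned off maps every survey to the same path, which triggers the duplicate-path error as soon as there is more than one survey.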