DataFrame Processing Tools

The DataFrame Processing submodule contains functions used internally (e.g., by the Match module) to operate on pandas DataFrames, such as adding new columns or filtering rows by value.

Functions

add_columns_from_one_row_to_another(...)

Function that adds data from one row to another using a list of column names.

check_dict_keys(dictionary, permitted)

Function that checks all keys in a passed dictionary against a list of permissible keys.

column_adjust(dataframe[, add_col, ...])

Function used to add, remove, or rename columns in a passed DataFrame.

row_filter(dataframe, filter_dict)

Function used to filter a passed DataFrame such that it only contains certain values in specific columns.

test_for_column_values(dataframe, ...)

Function used to test for the presence or absence of passed values in some column of a given DataFrame.

verify_column_not_empty(dataframe, column)

Function used to test whether a column in a passed DataFrame is empty.

chromaquant.utils.dataframe_processing.add_columns_from_one_row_to_another(first_row: Series, second_row: Series, add_columns: list[str]) -> Series

Function that adds data from one row to another using a list of column names.

Parameters

first_row: pandas.Series

A row (or Series) to have data appended to it.

second_row: pandas.Series

A row (or Series) containing data to be added to another row (or Series).

add_columns: list[str]

A list of column names indicating which data from second_row to add to first_row.

Returns

new_first: pandas.Series

A copy of first_row containing added data.
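As an illustration of the behaviour described above, the following is a minimal sketch written with plain pandas. It is not ChromaQuant's implementation: the name add_columns_sketch and the sample data are hypothetical, and it assumes that every name in add_columns exists in second_row and that an existing value in first_row would be overwritten.

```python
import pandas as pd


def add_columns_sketch(
    first_row: pd.Series, second_row: pd.Series, add_columns: list[str]
) -> pd.Series:
    """Hypothetical stand-in: copy selected entries of second_row into first_row."""
    # Work on a copy so the caller's first_row is left untouched.
    new_first = first_row.copy()
    for column in add_columns:
        # Assumption: each requested column exists in second_row.
        new_first.loc[column] = second_row[column]
    return new_first


# Illustrative data only.
first = pd.Series({"compound": "hexane", "area": 1520.0})
second = pd.Series({"formula": "C6H14", "rt": 3.42})
merged = add_columns_sketch(first, second, ["formula"])
```

Only the columns named in add_columns are carried over, so "rt" stays out of the result here.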

chromaquant.utils.dataframe_processing.check_dict_keys(dictionary: dict[Any, Any], permitted: list[Any])

Function that checks all keys in a passed dictionary against a list of permissible keys.

Parameters

dictionary: dict[Any, Any]

The dictionary containing keys to be checked against permitted.

permitted: list[Any]

A list of permitted keys.

Returns

None.

Raises

ValueError

If at least one key in the dictionary is not in the list of permitted keys.
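The check described above can be sketched in a few lines of standard Python. This is an assumption-laden illustration rather than the library's code: check_dict_keys_sketch is a hypothetical name, and the wording of the error message is invented for the example.

```python
from typing import Any


def check_dict_keys_sketch(dictionary: dict[Any, Any], permitted: list[Any]) -> None:
    """Hypothetical stand-in: raise ValueError if any key is not permitted."""
    unexpected = [key for key in dictionary if key not in permitted]
    if unexpected:
        # The message text is illustrative, not the library's.
        raise ValueError(f"Keys not permitted: {unexpected}")


# All keys permitted: returns None without raising.
check_dict_keys_sketch({"a": 1}, ["a", "b"])

# One key outside the permitted list: ValueError is raised.
try:
    check_dict_keys_sketch({"a": 1, "z": 2}, ["a", "b"])
except ValueError as error:
    message = str(error)
```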

chromaquant.utils.dataframe_processing.column_adjust(dataframe: DataFrame, add_col: list[str] = [], remove_col: list[str] = [], rename_dict: dict[str, str] = {}) -> DataFrame

Function used to add, remove, or rename columns in a passed DataFrame.

Parameters

dataframe: pandas.DataFrame

DataFrame to have columns adjusted.

add_col: list[str], optional

List of column headers to add, by default [].

remove_col: list[str], optional

List of column headers to remove, by default [].

rename_dict: dict[str, str], optional

Dictionary of headers to rename as keys and new headers as values, by default {}.

Returns

new_dataframe: pandas.DataFrame

DataFrame post-adjustments.
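The three adjustments can be sketched with standard pandas calls. This is a hypothetical stand-in, not the library's code: the name column_adjust_sketch, the add-then-remove-then-rename order, and filling new columns with pd.NA are all assumptions, since the documentation above does not specify them.

```python
import pandas as pd


def column_adjust_sketch(dataframe, add_col=None, remove_col=None, rename_dict=None):
    """Hypothetical stand-in: add, remove, then rename columns on a copy."""
    new_dataframe = dataframe.copy()
    # Assumption: new columns start out filled with missing values.
    for header in add_col or []:
        new_dataframe[header] = pd.NA
    # Assumption: removal happens before renaming.
    new_dataframe = new_dataframe.drop(columns=remove_col or [])
    new_dataframe = new_dataframe.rename(columns=rename_dict or {})
    return new_dataframe


# Illustrative data only.
df = pd.DataFrame({"RT": [1.2, 3.4], "Area": [100, 200], "Notes": ["", ""]})
adjusted = column_adjust_sketch(
    df,
    add_col=["Formula"],
    remove_col=["Notes"],
    rename_dict={"RT": "Retention Time"},
)
```

The sketch uses None defaults instead of the [] and {} defaults shown in the signature, purely to avoid sharing mutable default objects between calls; the observable behaviour for omitted arguments is the same.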

chromaquant.utils.dataframe_processing.row_filter(dataframe: DataFrame, filter_dict: dict) -> DataFrame

Function used to filter a passed DataFrame such that it only contains certain values in specific columns.

Parameters

dataframe: pandas.DataFrame

DataFrame to have rows filtered.

filter_dict: dict

Dictionary containing column names as keys and desired cell values as values.

Returns

new_dataframe: pandas.DataFrame

DataFrame post-filtering.
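One way to picture this kind of filtering is a boolean mask built column by column. The sketch below is hypothetical and makes assumptions the documentation does not confirm: row_filter_sketch is an invented name, a filter value may be either a single value or a list of acceptable values, and conditions on different columns are combined with a logical AND.

```python
import pandas as pd


def row_filter_sketch(dataframe: pd.DataFrame, filter_dict: dict) -> pd.DataFrame:
    """Hypothetical stand-in: keep rows matching every column condition."""
    mask = pd.Series(True, index=dataframe.index)
    for column, wanted in filter_dict.items():
        # Assumption: a bare value is treated as a one-item list.
        if isinstance(wanted, (list, tuple, set)):
            wanted_values = list(wanted)
        else:
            wanted_values = [wanted]
        # Assumption: conditions across columns are ANDed together.
        mask = mask & dataframe[column].isin(wanted_values)
    return dataframe[mask].copy()


# Illustrative data only.
df = pd.DataFrame(
    {
        "Compound": ["hexane", "heptane", "octane"],
        "Phase": ["gas", "liquid", "liquid"],
        "Carbons": [6, 7, 8],
    }
)
filtered = row_filter_sketch(df, {"Phase": "liquid", "Carbons": [7, 9]})
```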

chromaquant.utils.dataframe_processing.test_for_column_values(dataframe: DataFrame, column_name: str, test_values: list[Any]) -> dict[Any, bool]

Function used to test for the presence or absence of passed values in some column of a given DataFrame.

Parameters

dataframe: pandas.DataFrame

A DataFrame containing a column with a name matching column_name.

column_name: str

The name of a column of interest.

test_values: list[Any]

A list of values to be searched for in the column of interest.

Returns

test_result_dict: dict[Any, bool]

A dictionary containing passed test_values as keys and bools indicating the presence or absence of each test value in the column of interest.
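The shape of the returned dictionary can be illustrated with a short pandas sketch. It is not the library's implementation: column_values_present_sketch is a hypothetical name, and the sketch assumes that the test values are hashable and that missing values in the column are ignored.

```python
import pandas as pd


def column_values_present_sketch(
    dataframe: pd.DataFrame, column_name: str, test_values: list
) -> dict:
    """Hypothetical stand-in: map each test value to True if found in the column."""
    # Assumption: missing entries are dropped before the membership test.
    present = set(dataframe[column_name].dropna().tolist())
    return {value: value in present for value in test_values}


# Illustrative data only.
df = pd.DataFrame({"Compound": ["hexane", "heptane"]})
result = column_values_present_sketch(df, "Compound", ["hexane", "decane"])
```

Here result maps "hexane" to True and "decane" to False.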

chromaquant.utils.dataframe_processing.verify_column_not_empty(dataframe: DataFrame, column: str) -> bool

Function used to test whether a column in a passed DataFrame is empty.

Parameters

dataframe: pandas.DataFrame

DataFrame containing column to be tested.

column: str

Header of a column which may or may not contain values.

Returns

test_result: bool

Result of the conditional test: True if the column is not empty and False if the column is empty.
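The documentation does not define what counts as "empty", so the sketch below picks one plausible reading: a column is empty when it holds no non-missing values. The name verify_column_not_empty_sketch is hypothetical, and under this assumption an empty string still counts as a value.

```python
import pandas as pd


def verify_column_not_empty_sketch(dataframe: pd.DataFrame, column: str) -> bool:
    """Hypothetical stand-in: True if the column has at least one non-missing value."""
    # Assumption: "empty" means every entry is missing (None/NaN).
    return bool(dataframe[column].notna().any())


# Illustrative data only.
df = pd.DataFrame({"Area": [1.0, None], "Notes": [None, None]})
area_has_values = verify_column_not_empty_sketch(df, "Area")
notes_has_values = verify_column_not_empty_sketch(df, "Notes")
```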