DataFrame Processing Tools
The DataFrame Processing submodule contains functions used internally (e.g., by the Match module) to operate on pandas DataFrames, for example adding new columns or filtering rows by value.
Functions
add_columns_from_one_row_to_another | Function that adds data from one row to another using a list of column names.
check_dict_keys | Function that checks all keys in a passed dictionary against a list of permissible keys.
column_adjust | Function used to add, remove, or rename columns in a passed DataFrame.
row_filter | Function used to filter a passed DataFrame such that it only contains certain values in specific columns.
test_for_column_values | Function used to test for the presence or absence of passed values in some column of a given DataFrame.
verify_column_not_empty | Function used to test whether a column in a passed DataFrame is empty.
- chromaquant.utils.dataframe_processing.add_columns_from_one_row_to_another(first_row: Series, second_row: Series, add_columns: list[str]) -> Series
Function that adds data from one row to another using a list of column names.
Parameters
- first_row: pandas.Series
A row (or Series) to have data appended to it.
- second_row: pandas.Series
A row (or Series) containing data to be added to another row (or Series).
- add_columns: list[str]
A list of column names indicating which data from second_row to add to first_row.
Returns
- new_first: pandas.Series
A copy of first_row containing added data.
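The behavior described above can be approximated with plain pandas. The following is a minimal sketch, not ChromaQuant's implementation: the name sketch_add_columns and the sample rows are hypothetical, and raising KeyError for a label missing from second_row is an assumption.

```python
import pandas as pd


def sketch_add_columns(first_row: pd.Series, second_row: pd.Series,
                       add_columns: list[str]) -> pd.Series:
    """Hypothetical stand-in for add_columns_from_one_row_to_another."""
    # Work on a copy so the caller's row is left untouched.
    new_first = first_row.copy()
    for column in add_columns:
        # Assumption: a label missing from second_row raises KeyError.
        new_first[column] = second_row[column]
    return new_first


# Made-up sample rows for illustration only.
first_row = pd.Series({"Compound": "toluene", "RT": 5.21})
second_row = pd.Series({"Formula": "C7H8", "CAS": "108-88-3", "RT": 5.19})

new_first = sketch_add_columns(first_row, second_row, ["Formula", "CAS"])
print(new_first)
```

Only the columns named in add_columns are copied, so the RT value in first_row is kept even though second_row has its own RT entry.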
- chromaquant.utils.dataframe_processing.check_dict_keys(dictionary: dict[Any, Any], permitted: list[Any])
Function that checks all keys in a passed dictionary against a list of permissible keys.
Parameters
- dictionary: dict[Any, Any]
The dictionary containing keys to be checked against permitted.
- permitted: list[Any]
A list of permitted keys.
Returns
None.
Raises
- ValueError
If at least one key in the dictionary is not in the list of permitted keys.
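A key check of this kind might look like the sketch below. It is an illustration under assumptions: the name sketch_check_dict_keys, the error message text, and the sample keys are hypothetical and not taken from ChromaQuant.

```python
from typing import Any


def sketch_check_dict_keys(dictionary: dict[Any, Any], permitted: list[Any]) -> None:
    """Hypothetical stand-in for check_dict_keys."""
    # Collect every key that does not appear in the permitted list.
    invalid = [key for key in dictionary if key not in permitted]
    if invalid:
        # Assumption: the message format is illustrative only.
        raise ValueError(f"Keys not permitted: {invalid}")


permitted = ["add_col", "remove_col", "rename_dict"]

# All keys are permitted, so the call returns None without raising.
sketch_check_dict_keys({"add_col": ["Area"]}, permitted)

# One key ("drop") is not permitted, so ValueError is raised.
try:
    sketch_check_dict_keys({"add_col": ["Area"], "drop": ["RT"]}, permitted)
    rejected = False
except ValueError:
    rejected = True
```

Because the function returns None on success, callers rely on the absence of an exception rather than on a return value.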
- chromaquant.utils.dataframe_processing.column_adjust(dataframe: DataFrame, add_col: list[str] = [], remove_col: list[str] = [], rename_dict: dict[str, str] = {}) -> DataFrame
Function used to add, remove, or rename columns in a passed DataFrame.
Parameters
- dataframe: pandas.DataFrame
DataFrame to have columns adjusted.
- add_col: list[str], optional
List of column headers to add, by default [].
- remove_col: list[str], optional
List of column headers to remove, by default [].
- rename_dict: dict, optional
Dictionary of headers to rename as keys and new headers as values, by default {}.
Returns
- new_dataframe: pandas.DataFrame
DataFrame post-adjustments.
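The three adjustments can be sketched with standard pandas calls. This is not ChromaQuant's code: the name sketch_column_adjust is hypothetical, and the order of operations (add, then remove, then rename) and the fill value for new columns (pd.NA) are assumptions that the documentation above does not specify.

```python
import pandas as pd


def sketch_column_adjust(dataframe: pd.DataFrame, add_col=None,
                         remove_col=None, rename_dict=None) -> pd.DataFrame:
    """Hypothetical stand-in for column_adjust."""
    # Copy first so the passed DataFrame is not modified.
    new_dataframe = dataframe.copy()
    for header in add_col or []:
        # Assumption: added columns start out filled with missing values.
        new_dataframe[header] = pd.NA
    # Assumption: removal happens before renaming.
    new_dataframe = new_dataframe.drop(columns=remove_col or [])
    new_dataframe = new_dataframe.rename(columns=rename_dict or {})
    return new_dataframe


# Made-up sample data for illustration only.
df = pd.DataFrame({"Name": ["benzene", "toluene"],
                   "Area": [1200.0, 950.0],
                   "Notes": ["", ""]})

adjusted = sketch_column_adjust(df,
                                add_col=["Formula"],
                                remove_col=["Notes"],
                                rename_dict={"Area": "Peak Area"})
print(adjusted.columns.tolist())
```

The sketch uses None defaults rather than the mutable [] and {} shown in the signature, a common Python precaution against state shared between calls.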
- chromaquant.utils.dataframe_processing.row_filter(dataframe: DataFrame, filter_dict: dict) -> DataFrame
Function used to filter a passed DataFrame such that it only contains certain values in specific columns.
Parameters
- dataframe: pandas.DataFrame
DataFrame to have rows filtered.
- filter_dict: dict
Dictionary containing column names as keys and desired cell values as values.
Returns
- new_dataframe: pandas.DataFrame
DataFrame post-filtering.
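One way to read the filter_dict description is as a set of exact-equality conditions that must all hold. The sketch below follows that reading. It is an assumption rather than ChromaQuant's documented behavior, and the name sketch_row_filter and the sample data are hypothetical.

```python
import pandas as pd


def sketch_row_filter(dataframe: pd.DataFrame, filter_dict: dict) -> pd.DataFrame:
    """Hypothetical stand-in for row_filter."""
    # Start with every row selected.
    mask = pd.Series(True, index=dataframe.index)
    for column, value in filter_dict.items():
        # Assumption: each entry is an exact match, and entries are combined with AND.
        mask = mask & (dataframe[column] == value)
    # Return a copy so later edits do not touch the original DataFrame.
    return dataframe[mask].copy()


# Made-up sample data for illustration only.
df = pd.DataFrame({"Compound Type": ["aromatic", "alkane", "aromatic"],
                   "Carbon Number": [6, 6, 7],
                   "Name": ["benzene", "hexane", "toluene"]})

filtered = sketch_row_filter(df, {"Compound Type": "aromatic", "Carbon Number": 6})
print(filtered)
```

With both conditions applied, only the benzene row remains, and the original DataFrame keeps all three rows.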
- chromaquant.utils.dataframe_processing.test_for_column_values(dataframe: DataFrame, column_name: str, test_values: list[Any]) -> dict[Any, bool]
Function used to test for the presence or absence of passed values in some column of a given DataFrame.
Parameters
- dataframe: pandas.DataFrame
A DataFrame containing a column with a name matching column_name.
- column_name: str
The name of a column of interest.
- test_values: list[Any]
A list of values to be searched for in the column of interest.
Returns
- test_result_dict: dict[Any, bool]
A dictionary containing passed test_values as keys and bools indicating the presence or absence of each test value in the column of interest.
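The returned mapping can be illustrated with a short sketch. The name sketch_test_for_column_values and the sample data are hypothetical, and the sketch assumes that the column values are hashable and that matching is by exact equality.

```python
from typing import Any

import pandas as pd


def sketch_test_for_column_values(dataframe: pd.DataFrame, column_name: str,
                                  test_values: list[Any]) -> dict[Any, bool]:
    """Hypothetical stand-in for test_for_column_values."""
    # Assumption: cell values are hashable, so a set gives fast membership tests.
    present = set(dataframe[column_name])
    # Map each test value to True (found) or False (not found).
    return {value: value in present for value in test_values}


# Made-up sample data for illustration only.
df = pd.DataFrame({"Name": ["benzene", "toluene"]})

result = sketch_test_for_column_values(df, "Name", ["toluene", "xylene"])
print(result)  # {'toluene': True, 'xylene': False}
```

Each test value appears exactly once as a key, so a caller can look up any value directly instead of scanning the column again.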
- chromaquant.utils.dataframe_processing.verify_column_not_empty(dataframe: DataFrame, column: str) -> bool
Function used to test whether a column in a passed DataFrame is empty.
Parameters
- dataframe: pandas.DataFrame
DataFrame containing column to be tested.
- column: str
Header of a column which may or may not contain values.
Returns
- test_result: bool
Result of the conditional test: True if the column is not empty and False if the column is empty.
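The documentation does not state what counts as an empty column. The sketch below assumes a column is empty when every cell is missing (NaN or None), which also covers a column with no rows. Under that assumption a column of empty strings would count as not empty. The name sketch_verify_column_not_empty and the sample data are hypothetical.

```python
import pandas as pd


def sketch_verify_column_not_empty(dataframe: pd.DataFrame, column: str) -> bool:
    """Hypothetical stand-in for verify_column_not_empty."""
    # Assumption: "empty" means no cell holds a non-missing value.
    return bool(dataframe[column].notna().any())


# Made-up sample data for illustration only.
df = pd.DataFrame({"Name": ["benzene", "toluene"],
                   "Formula": [None, None]})

name_has_values = sketch_verify_column_not_empty(df, "Name")
formula_has_values = sketch_verify_column_not_empty(df, "Formula")
print(name_has_values, formula_has_values)
```

Wrapping the result in bool() returns a plain Python bool rather than a NumPy boolean, which matches the bool return type in the signature.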