Module qsarmodelingpy.Utils

Functions

def detect_header_and_indices(path: str) -> tuple

Detect whether the CSV file at path has a header line and an index column, returning the result in the form pandas expects. Detection works only if the header line or the index column contains strings. Numeric values are treated as data.

Args

path : str
The path to the csv file.

Returns

tuple
(index_col, header), where each is either None (not found) or 0 (first column for index_col, first line for header).
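
As an illustration of the kind of check described above, here is a minimal sketch. It is not the library's actual implementation: the names sketch_detect and _is_number are hypothetical, and the rule used (a first line or first column counts as labels when any of its cells, ignoring the top-left corner, fails to parse as a number) is an assumption.

```python
import io

import pandas as pd


def _is_number(value) -> bool:
    """Return True if value can be parsed as a float (hypothetical helper)."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def sketch_detect(source) -> tuple:
    """Hypothetical stand-in for detect_header_and_indices (assumed logic)."""
    # Read a small sample as raw strings so pandas performs no type inference.
    raw = pd.read_csv(source, header=None, dtype=str, nrows=5)
    # The top-left corner cell is skipped in both checks because it is ambiguous.
    header = None if all(_is_number(v) for v in raw.iloc[0, 1:]) else 0
    index_col = None if all(_is_number(v) for v in raw.iloc[1:, 0]) else 0
    return index_col, header


# Made-up sample data with both a header line and a label column.
SAMPLE = "name,x,y\nm1,1.0,2.0\nm2,3.0,4.0\n"

index_col, header = sketch_detect(io.StringIO(SAMPLE))
# The returned pair can be passed straight to pandas.read_csv.
frame = pd.read_csv(io.StringIO(SAMPLE), index_col=index_col, header=header)
print(index_col, header)
print(frame)
```

Returning None or 0 means the tuple needs no translation before being handed to pandas.read_csv, which accepts both values for index_col and header.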

def load_matrix(path: str, usecols: Union[list, Callable, NoneType] = None) -> pandas.core.frame.DataFrame

Load a matrix, automatically detecting whether to use header and index_col (in pandas terms).

See detect_header_and_indices().

Args

path : str
Filename to be fed to pandas.
usecols : Union[list, Callable, None], optional
Columns to load (passed to the usecols parameter of pandas.read_csv). Loading only the columns you need can dramatically improve performance for large matrices. Defaults to None.

Returns

pandas.DataFrame
The DataFrame with the detected header/index_col, containing only the columns selected by usecols (or all columns if usecols=None).
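
Since usecols is described as being forwarded to pandas.read_csv, the sketch below shows the two accepted forms (a list and a callable) using pandas directly. It does not call load_matrix, the sample data and column names are made up, and whether load_matrix requires the label column to be listed in usecols is an assumption not confirmed by this page.

```python
import io

import pandas as pd

# Illustrative data only: one label column followed by three data columns.
CSV_TEXT = "name,x,y,z\nm1,1.0,2.0,3.0\nm2,4.0,5.0,6.0\n"

# List form: keep the label column plus a single data column.
by_list = pd.read_csv(
    io.StringIO(CSV_TEXT), header=0, index_col=0, usecols=["name", "x"]
)

# Callable form: pandas calls it with each column name and keeps the
# columns for which it returns True.
by_callable = pd.read_csv(
    io.StringIO(CSV_TEXT),
    header=0,
    index_col=0,
    usecols=lambda col: col in {"name", "z"},
)

print(by_list.columns.tolist())
print(by_callable.columns.tolist())
```

In both calls the label column is kept in the selection so that index_col=0 still refers to it; pandas resolves index_col against the selected subset, not the full file.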