Module qsarmodelingpy.cross_validation_class

Cross-Validation routines.

From Wikipedia:

Cross-validation is any of various similar model validation techniques for assessing how the results of a statistical analysis will generalize to an independent data set. It is mainly used in settings where the goal is prediction, and one wants to estimate how accurately a predictive model will perform in practice. In a prediction problem, a model is usually given a dataset of known data on which training is run (training dataset), and a dataset of unknown data (or first seen data) against which the model is tested (called the validation dataset or testing set). The goal of cross-validation is to test the model's ability to predict new data that was not used in estimating it, in order to flag problems like overfitting or selection bias and to give an insight on how the model will generalize to an independent dataset (i.e., an unknown dataset, for instance from a real problem).
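The leave-one-out variant of the procedure described above can be sketched in plain Python. This is an illustration only, not this module's implementation: it uses a one-descriptor least-squares line as a stand-in model (the module itself works with latent variables), and the helper names `fit_line`, `loo_predictions` and `q2` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def loo_predictions(xs, ys):
    """Predict each sample from a model fitted on all the other samples."""
    preds = []
    for i in range(len(xs)):
        # Hold out sample i and train on the remaining samples.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        preds.append(a + b * xs[i])
    return preds


def q2(ys, preds):
    """Q2 = 1 - PRESS / SS_total, with PRESS taken from held-out predictions."""
    my = sum(ys) / len(ys)
    press = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - press / ss_tot


# Made-up toy data, roughly y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
preds = loo_predictions(xs, ys)
print(round(q2(ys, preds), 3))
```

Because every prediction comes from a model that never saw the sample it predicts, a Q2 close to 1 suggests the model generalizes, while a Q2 far below the fitted R2 points to overfitting.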

Classes

class CrossValidation (X: pandas.core.frame.DataFrame, y, nLVMax: int = None, scale: bool = True)

Perform Leave-One-Out cross-validation.

Args

X : DataFrame
The matrix with descriptors.
y : DataFrame, list, array
The dependent variable vector.
nLVMax : int, optional
Maximum number of Latent Variables. Defaults to None.
scale : bool, optional
Whether to scale the data before modeling. Defaults to True.

Methods

def Q2(self)
def R2(self)
def RMSEC(self)
def RMSECV(self)
def press(self)
def presscv(self)
def rcal(self)
def rcv(self)
def returnParameters(self, nLV=None)
def saveParameters(self, fileName, nLV=None) ‑> NoneType

Save the parameters calculated by CrossValidation to a file.

Args

fileName : str, path, file-like, io
The file name, path, or file-like object to write to.
nLV : int, optional
Number of Latent Variables. Defaults to None.
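The statistic names in the method list above come in calibration and cross-validation pairs (press and presscv, RMSEC and RMSECV, rcal and rcv, R2 and Q2). Assuming they follow the usual chemometrics convention, each pair applies the same formula to two sets of predictions: fitted values from the full model, and held-out values from cross-validation. The sketch below shows that pairing with hypothetical helper functions and made-up toy values; it does not reproduce the class's own code, and the RMSE denominator (the number of samples) is an assumption.

```python
import math


def press(observed, predicted):
    """Sum of squared differences between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def rmse(observed, predicted):
    """Root mean square error, dividing by the number of samples (assumed)."""
    return math.sqrt(press(observed, predicted) / len(observed))


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Made-up toy values: observed y, fitted values, and held-out predictions.
y = [2.1, 3.9, 6.2, 7.8, 10.1]
y_cal = [2.04, 4.03, 6.02, 8.01, 10.00]  # predictions from the full model
y_cv = [1.95, 4.09, 5.98, 8.10, 9.85]    # predictions with each sample held out

calibration = {"press": press(y, y_cal), "rmse": rmse(y, y_cal), "r": pearson_r(y, y_cal)}
validation = {"press": press(y, y_cv), "rmse": rmse(y, y_cv), "r": pearson_r(y, y_cv)}
print(calibration)
print(validation)
```

The cross-validated errors are normally larger than the calibration errors, since held-out samples are harder to predict than samples the model was fitted on.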