eva_toolkit module

eva_toolkit.classify_data(hist, data1, data2, xedges, yedges, annotation=None, plot=False)

Classifies the given data with the classifier.

Parameters

hist: array: The PDF values.
data1: array: The skewness values of a signal.
data2: array: The kurtosis values of a signal.
xedges: array: The x-edges of the bins.
yedges: array: The y-edges of the bins.
annotation: array, optional: The annotation. It is used only to plot the data into the PDF. The default is None.
plot: bool, optional: If True, the PDFs are plotted with the annotated data. The default is False.

Returns

scores: array: The likelihood of each value being good.
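
To illustrate how a histogram-based classifier can assign a score, the following is a minimal sketch and not the module's implementation. It assumes that hist is a 2D array indexed as [x-bin, y-bin] (the convention of numpy.histogram2d) and that the score of a sample is simply the value of the bin it falls into. The helper name lookup_scores is hypothetical.

```python
import numpy as np

def lookup_scores(hist, data1, data2, xedges, yedges):
    """Return the histogram value of the bin each (data1, data2) pair falls into."""
    # np.digitize gives 1-based bin indices for in-range values; shift to 0-based
    ix = np.digitize(data1, xedges) - 1
    iy = np.digitize(data2, yedges) - 1
    # Clip so that points outside the edges are assigned to the nearest border bin
    ix = np.clip(ix, 0, hist.shape[0] - 1)
    iy = np.clip(iy, 0, hist.shape[1] - 1)
    return hist[ix, iy]

# Toy 2x2 PDF over skewness (x) and kurtosis (y); values are made up
hist = np.array([[0.1, 0.2],
                 [0.3, 0.4]])
xedges = np.array([0.0, 1.0, 2.0])
yedges = np.array([0.0, 1.0, 2.0])

scores = lookup_scores(hist, np.array([0.5, 1.5]), np.array([1.5, 0.5]), xedges, yedges)
```

Clipping is one possible way to handle samples outside the binned range; the real function may treat them differently.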

eva_toolkit.import_training_data(training_values_path)

Imports an Excel sheet containing all data. This sheet must be created beforehand. Furthermore, the data is randomized and subdivided into test and training sets.

Parameters

training_values_path: string: The path of the Excel sheet. The path is provided in config.ini. For more information, see the PPG_EVA_GUI.set_values() function.

Returns

validation_sets: dict: Five subsets with a roughly equal amount of good and bad data.
training_sets: dict: Five subsets, each containing four of the validation_sets. Each training_set omits a different validation_set.
test_set: array: One set which contains 20% of the whole dataset, with respect to the subdivision into good and bad data.
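
The relationship between the three return values can be sketched as follows. This is an illustration under assumptions, not the module's code: it works on sample indices rather than on the Excel data, treats labels 1 and 0 as good and bad, and the helper name split_indices is hypothetical.

```python
import numpy as np

def split_indices(labels, n_folds=5, test_frac=0.2, seed=0):
    """Split sample indices into a test set and n_folds validation/training sets."""
    rng = np.random.default_rng(seed)
    test_idx = []
    folds = [[] for _ in range(n_folds)]
    for cls in np.unique(labels):
        # Shuffle the indices of this class, then reserve a share for the test set
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_test = int(round(test_frac * len(idx)))
        test_idx.extend(idx[:n_test])
        # Deal the remaining samples of this class evenly across the folds
        for k, part in enumerate(np.array_split(idx[n_test:], n_folds)):
            folds[k].extend(part)
    validation_sets = {k: np.array(f) for k, f in enumerate(folds)}
    # Training set k is the union of all validation sets except set k
    training_sets = {
        k: np.concatenate([validation_sets[j] for j in range(n_folds) if j != k])
        for k in range(n_folds)
    }
    return validation_sets, training_sets, np.array(test_idx)

labels = np.array([1] * 50 + [0] * 50)  # made-up labels: 50 good, 50 bad
validation_sets, training_sets, test_set = split_indices(labels)
```

Splitting per class keeps the share of good and bad data similar in every subset.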

eva_toolkit.lin_reg(signal_chunks, cross_pos)

Performs a linear regression for every given signal chunk.

Parameters

signal_chunks: 2D-array: The sequenced signal.
cross_pos: 2D-list: The positions of the zero crossings for every signal chunk.

Returns

slope: list: The slope of every linear regression.
intersect: list: The y-intercept of every linear regression.
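
A per-chunk line fit can be sketched with numpy.polyfit. The sketch assumes that the regression is made over the zero-crossing positions against their running index (0, 1, 2, ...); the documentation does not state which variables are regressed, and the real function additionally receives signal_chunks. The helper name fit_crossings is hypothetical.

```python
import numpy as np

def fit_crossings(cross_pos):
    """Fit a straight line to the zero-crossing positions of every chunk."""
    slopes, intercepts = [], []
    for positions in cross_pos:
        index = np.arange(len(positions))  # running index of each crossing
        # For deg=1, polyfit returns the coefficients as (slope, intercept)
        slope, intercept = np.polyfit(index, positions, deg=1)
        slopes.append(slope)
        intercepts.append(intercept)
    return slopes, intercepts

# Two made-up chunks with perfectly regular crossings
cross_pos = [[2, 12, 22, 32], [5, 10, 15]]
slopes, intercepts = fit_crossings(cross_pos)
```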

eva_toolkit.mean_hists(hists, thresholds)

Calculates the mean from all PDFs as final classifier.

Parameters

hists: dict: All five PDFs from the training sets.
thresholds: dict: The best threshold for every subset.

Returns

mean_hist: array: The mean PDF.
xedges: array: The x-edges of the bins.
yedges: array: The y-edges of the bins.
mean_thresh: float: The mean of the best thresholds.
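
The averaging step can be sketched as below. The dictionary layout is an assumption: each entry of hists is taken to be a plain 2D array with identical binning, which may differ from what the module actually stores.

```python
import numpy as np

# Five made-up 2x2 PDFs and thresholds, keyed by subset number
hists = {k: np.full((2, 2), float(k)) for k in range(5)}
thresholds = {k: 0.1 * k for k in range(5)}

# Element-wise mean over the five histograms
mean_hist = np.mean(np.stack(list(hists.values())), axis=0)
# Mean of the per-subset best thresholds
mean_thresh = float(np.mean(list(thresholds.values())))
```

An element-wise mean is only meaningful if all histograms share the same xedges and yedges.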

eva_toolkit.read_signal(mod_path, signal_key, time_key, sep=',', skiprows=0, date_format=None, header='infer')

Reads the signal stored in the given path.

Parameters

mod_path: string

The modified file path to the signal file, with the file name as its last part.

Syntax: path + 'filename'

signal_key: string

The keyword/number of the signal column in the pandas.DataFrame.

Somno: signal_key=2

Corsano: signal_key='value'

time_key: string

The keyword/number of the timestamp column in the pandas.DataFrame.

Somno: time_key=0

Corsano: time_key='date'

sep: string, optional

The separator used to separate the columns. Only necessary for Somno; Corsano files use the default. The default is ','.

sep_somno = ';'

skiprows: integer, optional

Deprecated. The number of rows that should be skipped. The default is 0.

date_format: string, optional

Date format for the timestamps. Only necessary for Somno ("%d.%m.%Y %H:%M:%S,%f"). The default is None.

Returns

signals: array: The signal as a time series.
timestamps: Series: The timestamps as a pandas.Series.
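
How the Somno settings listed above map onto a file can be sketched directly with pandas. This is not the body of read_signal: the two sample rows are invented, and reading with header=None is an assumption made so that the columns can be addressed by number.

```python
import io
import pandas as pd

# Invented Somno-style rows: timestamp ; unused column ; signal value
somno_text = (
    "01.02.2023 12:00:00,500;x;1.5\n"
    "01.02.2023 12:00:01,000;x;1.7\n"
)

# sep=';' as documented for Somno; header=None is an assumption of this sketch
df = pd.read_csv(io.StringIO(somno_text), sep=";", header=None)

signal = df[2].to_numpy()  # signal_key=2
# time_key=0, parsed with the documented Somno date format
timestamps = pd.to_datetime(df[0], format="%d.%m.%Y %H:%M:%S,%f")
```

For Corsano files the documented keys are 'value' and 'date', and the default separator applies.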

eva_toolkit.test_subsets(validation_sets, mean_hist, xedges, yedges, best_thresh, plot=False)

This function applies the final classifier to the five subsets to check the consistency of the classifier. The performances should have similar values; otherwise something went wrong and a redo is recommended.

Parameters

validation_sets: dict: A dictionary containing the data of the subsets (see import_training_data()).
mean_hist: array: The classifying PDF.
xedges: array: The x-edges of the bins.
yedges: array: The y-edges of the bins.
best_thresh: float: The threshold.
plot: bool, optional: If True, the performance of the classifier on the subsets is plotted in ROC space. The default is False.

Returns

tuple: A tuple containing the mean fpr, mean tpr, standard deviation of fpr, standard deviation of tpr, on the respective indices.
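
The returned tuple can be illustrated with a small sketch that computes the false positive rate (fpr) and true positive rate (tpr) per subset and then aggregates them. It assumes that label 1 means good, that a score at or above the threshold is classified as good, and that the standard deviation is the population form; the scores and labels are invented and the helper name rates is hypothetical.

```python
import numpy as np

def rates(scores, labels, thresh):
    """Return (fpr, tpr) when scores >= thresh are classified as good."""
    pred = scores >= thresh
    good = labels == 1
    tpr = np.sum(pred & good) / np.sum(good)
    fpr = np.sum(pred & ~good) / np.sum(~good)
    return fpr, tpr

# Two made-up subsets as (scores, labels) pairs
subsets = {
    0: (np.array([0.9, 0.8, 0.2, 0.1]), np.array([1, 1, 0, 0])),
    1: (np.array([0.9, 0.3, 0.6, 0.1]), np.array([1, 1, 0, 0])),
}

results = np.array([rates(s, l, 0.5) for s, l in subsets.values()])
# Order: mean fpr, mean tpr, std of fpr, std of tpr
val_tuple = (results[:, 0].mean(), results[:, 1].mean(),
             results[:, 0].std(), results[:, 1].std())
```

A large standard deviation in such a tuple would indicate the inconsistency between subsets that the function is meant to detect.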

eva_toolkit.test_testset(test_set, mean_hist, xedges, yedges, best_thresh, val_tupel=(0, 0, 0, 0), plot=False)

This function applies the final classifier to the test set, i.e. unseen data. This serves as the final performance test.

Parameters

test_set: dict: A dictionary containing the data of the test set (see import_training_data()).
mean_hist: array: The classifying PDF.
xedges: array: The x-edges of the bins.
yedges: array: The y-edges of the bins.
best_thresh: float: The threshold.
val_tupel: tuple, optional: A tuple containing the mean fpr, mean tpr, standard deviation of fpr, standard deviation of tpr, on the respective indices. Only necessary if the performance is plotted.
plot: bool, optional: If True, the performance of the classifier on the test set is plotted in ROC space with the data of the subsets as error bars. The default is False.

Returns

test_fpr: float: The fpr of the classifier on the test set.
test_tpr: float: The tpr of the classifier on the test set.

eva_toolkit.train_hist(validation_sets, hists, plot=False)

This function iterates over the subsets and determines the threshold for every subset using ROC analysis.

Parameters

validation_sets: dict: A dictionary containing the data of the subsets (see import_training_data()).
hists: dict: A dictionary containing the PDFs of every training set.
plot: bool, optional: If True, all five ROC curves are plotted. The default is False.

Returns

master_thresholds: dict: A dictionary containing the best threshold for every subset.
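
One common way to pick a "best" threshold from a ROC curve is to maximize tpr - fpr (Youden's J statistic). The documentation does not say which criterion train_hist uses, so the sketch below is only an example of the general technique, with invented scores, label 1 assumed to mean good, and a hypothetical helper name best_threshold.

```python
import numpy as np

def best_threshold(scores, labels):
    """Return the candidate threshold that maximizes tpr - fpr (Youden's J)."""
    good = labels == 1
    best_j, best_t = -np.inf, None
    # Every distinct score is tried as a threshold
    for t in np.unique(scores):
        pred = scores >= t
        tpr = np.sum(pred & good) / np.sum(good)
        fpr = np.sum(pred & ~good) / np.sum(~good)
        if tpr - fpr > best_j:
            best_j, best_t = tpr - fpr, float(t)
    return best_t

scores = np.array([0.1, 0.2, 0.6, 0.5, 0.7, 0.8, 0.9])
labels = np.array([0, 0, 0, 1, 1, 1, 1])
thresh = best_threshold(scores, labels)
```

Running this once per subset and collecting the results in a dictionary would give a structure comparable to master_thresholds.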

eva_toolkit.variance(slopes, intersect, cross_pos)

Calculates the variance of the zero crossings relative to the linear regression from the eva_toolkit.lin_reg() function.

Parameters

slopes: list: The slope of every linear regression.
intersect: list: The y-intercept of every linear regression.
cross_pos: list: The positions of the zero crossings, as an array for every signal chunk.

Returns

var: list: The variance of the data relative to the linear regression.
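
The variance relative to a fitted line can be read as the variance of the residuals. The sketch below assumes, as one plausible interpretation, that the line is evaluated at the running index of each zero crossing; the helper name residual_variance and the data are hypothetical.

```python
import numpy as np

def residual_variance(slopes, intercepts, cross_pos):
    """Variance of the crossing positions around their fitted line, per chunk."""
    variances = []
    for slope, intercept, positions in zip(slopes, intercepts, cross_pos):
        positions = np.asarray(positions, dtype=float)
        # Line evaluated at the running index of each crossing
        fitted = slope * np.arange(len(positions)) + intercept
        variances.append(float(np.var(positions - fitted)))
    return variances

# First chunk lies exactly on the line; second deviates by +/-1 sample
var = residual_variance([10.0, 10.0], [0.0, 0.0],
                        [[0, 10, 20, 30], [1, 9, 21, 29]])
```

Regularly spaced crossings give a variance near zero, while irregular spacing increases it.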

eva_toolkit.wrapping_results(prediction)

Counts the sequences labeled good and bad.

Parameters

prediction: array: The array with the predicted labels.

Returns

tuple: A tuple containing the number of good labels, bad labels and overall number of labels.
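
The counting can be sketched in a few lines. The label encoding is an assumption (1 for good, 0 for bad); the documentation does not state how labels are represented.

```python
import numpy as np

prediction = np.array([1, 0, 1, 1, 0])  # made-up predicted labels

n_good = int(np.sum(prediction == 1))
n_bad = int(np.sum(prediction == 0))
# Order: good labels, bad labels, overall number of labels
result = (n_good, n_bad, prediction.size)
```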