eva_toolkit module

eva_toolkit.classify_data(hist, data1, data2, xedges, yedges, annotation=None, plot=False)

Classifies the given data with the classifier.

Parameters

hist: array

The PDF-values.

data1: array

The skewness values of a signal.

data2: array

The kurtosis values of a signal.

xedges: array

The xedges of the bins.

yedges: array

The yedges of the bins.

annotation: array, optional

The annotation. It is only used to plot the data into the PDF. The default is None.

plot: bool, optional

If True, the PDFs will be plotted with the annotated data. The default is False.

Returns

scores: array

The likelihood of a value being good.
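The lookup described above can be sketched as follows. This is only an illustration, not the toolkit's implementation: it assumes that hist is a 2D histogram indexed as hist[x_bin, y_bin] (the layout returned by numpy.histogram2d) and that points outside the binned range receive a score of 0. The helper name lookup_pdf_scores is hypothetical.

```python
import numpy as np

def lookup_pdf_scores(hist, data1, data2, xedges, yedges):
    """Hypothetical helper: return the histogram value of the bin each pair falls into."""
    data1 = np.asarray(data1, dtype=float)
    data2 = np.asarray(data2, dtype=float)
    # np.digitize returns i with edges[i-1] <= x < edges[i]; subtract 1 for a 0-based bin index.
    ix = np.digitize(data1, xedges) - 1
    iy = np.digitize(data2, yedges) - 1
    # Assumption: points outside the binned range get a score of 0.
    # Note that a value exactly on the last edge also counts as outside here.
    inside = (ix >= 0) & (ix < hist.shape[0]) & (iy >= 0) & (iy < hist.shape[1])
    scores = np.zeros(data1.shape, dtype=float)
    scores[inside] = hist[ix[inside], iy[inside]]
    return scores

# Toy 2x2 histogram over [0, 2) x [0, 2); the values are made up for illustration.
hist = np.array([[0.1, 0.2], [0.3, 0.4]])
xedges = np.array([0.0, 1.0, 2.0])
yedges = np.array([0.0, 1.0, 2.0])
scores = lookup_pdf_scores(hist, [0.5, 1.5, 5.0], [1.5, 0.5, 0.5], xedges, yedges)
```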

eva_toolkit.import_training_data(training_values_path)

Imports an Excel sheet containing all data. This sheet must be created beforehand. Furthermore, the data is randomized and subdivided into test and training sets.

Parameters

training_values_path: string

The path to the Excel sheet. The path is provided in config.ini. For more information, see the PPG_EVA_GUI.set_values() function.

Returns

validation_sets: dict

Five subsets with a roughly equal amount of good and bad data.

training_sets: dict

Five subsets, each of which contains four of the validation sets. In every training set, one validation set is left out.

test_set: array

One set that contains 20% of the whole dataset, taking the subdivision into good and bad data into account.
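The subdivision described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the toolkit's code: it works on row indices rather than on the Excel data, it assumes a per-class (stratified) split, and the function name stratified_split and its parameters are hypothetical.

```python
import numpy as np

def stratified_split(labels, test_fraction=0.2, n_folds=5, seed=0):
    """Hypothetical helper: shuffle per class, hold out a test set, build five folds."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    test_idx = []
    folds = [[] for _ in range(n_folds)]
    for cls in np.unique(labels):
        # Randomize the rows of this class.
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_test = int(round(test_fraction * len(idx)))
        test_idx.extend(idx[:n_test])
        # Deal the remaining rows of this class evenly across the folds.
        for k, part in enumerate(np.array_split(idx[n_test:], n_folds)):
            folds[k].extend(part)
    validation_sets = {k: np.sort(np.array(f, dtype=int)) for k, f in enumerate(folds)}
    # Training set k is the union of all validation sets except validation set k.
    training_sets = {
        k: np.sort(np.concatenate([validation_sets[j] for j in range(n_folds) if j != k]))
        for k in range(n_folds)
    }
    return validation_sets, training_sets, np.sort(np.array(test_idx, dtype=int))

# Made-up labels: 50 good (1) and 50 bad (0) sequences.
labels = np.array([1] * 50 + [0] * 50)
validation_sets, training_sets, test_set = stratified_split(labels)
```

With these made-up labels the test set holds 20 rows, each validation set 16 rows, and each training set 64 rows.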

eva_toolkit.lin_reg(signal_chunks, cross_pos)

A function that performs a linear regression for every given signal chunk.

Parameters

signal_chunks: 2D-array

The sequenced signal.

cross_pos: 2D-list

The position of each zero crossing for every signal chunk.

Returns

slope: list

The slope of every linear regression.

intersect: list

The y-intercept of every linear regression.
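One plausible reading of this function is sketched below. It assumes that the regression is fitted to the zero-crossing positions of each chunk against their running index (0, 1, 2, ...); how the toolkit actually uses signal_chunks is not stated here, so that argument is omitted. The name fit_crossing_lines is hypothetical.

```python
import numpy as np

def fit_crossing_lines(cross_pos):
    """Hypothetical helper: fit one straight line per chunk through its zero-crossing positions."""
    slopes, intercepts = [], []
    for positions in cross_pos:
        positions = np.asarray(positions, dtype=float)
        # Assumption: the independent variable is the running index of the crossing.
        index = np.arange(len(positions))
        slope, intercept = np.polyfit(index, positions, deg=1)
        slopes.append(float(slope))
        intercepts.append(float(intercept))
    return slopes, intercepts

# Made-up zero-crossing positions for two chunks.
cross_pos = [[0, 10, 20, 30], [5, 9, 13]]
slopes, intercepts = fit_crossing_lines(cross_pos)
```

Evenly spaced crossings, as in this toy input, lie exactly on the fitted line, so the slope equals the spacing between crossings.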

eva_toolkit.mean_hists(hists, thresholds)

Calculates the mean of all PDFs to obtain the final classifier.

Parameters

hists: dict

All five PDFs from the training sets.

thresholds: dict

The best threshold for every subset.

Returns

mean_hist: array

The mean PDF.

xedges: array

The xedges of the bins.

yedges: array

The yedges of the bins.

mean_thresh: float

The mean of the best thresholds.
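The averaging step can be sketched as follows. The sketch assumes that every value in hists is a plain 2D array on the same bin grid and that thresholds maps each subset to a single float; the bin edges that the real function also returns are left out because their source is not described here. The name average_hists is hypothetical.

```python
import numpy as np

def average_hists(hists, thresholds):
    """Hypothetical helper: element-wise mean of the PDFs and mean of the thresholds."""
    # Assumption: all histograms share the same shape and bin edges.
    mean_hist = np.mean([np.asarray(hists[k], dtype=float) for k in hists], axis=0)
    mean_thresh = float(np.mean([thresholds[k] for k in thresholds]))
    return mean_hist, mean_thresh

# Made-up inputs with two subsets instead of five to keep the example short.
hists = {0: [[0, 2], [4, 6]], 1: [[2, 4], [6, 8]]}
thresholds = {0: 0.2, 1: 0.4}
mean_hist, mean_thresh = average_hists(hists, thresholds)
```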

eva_toolkit.read_signal(mod_path, signal_key, time_key, sep=',', skiprows=0, date_format=None, header='infer')

Reads the signal stored in the given path.

Parameters

mod_path: string

The modified file path to the signal file, with the file name as its last part.

Syntax: path + 'filename'

signal_key: string

The key (column name or number) of the signal column in the pandas.DataFrame.

Somno: signal_key=2

Corsano: signal_key='value'

time_key: string

The key (column name or number) of the timestamp column in the pandas.DataFrame.

Somno: time_key=0

Corsano: time_key='date'

sep: string, optional

The separator used to separate the columns. Only necessary for Somno; Corsano files use the default. The default is ','.

sep_somno = ';'

skiprows: integer, optional

Deprecated. The number of rows that should be skipped. The default is 0.

date_format: string, optional

Date format for the timestamps. Only necessary for Somno ("%d.%m.%Y %H:%M:%S,%f"). The default is None.

Returns

signals: array

The signal as a time series.

timestamps: Series

The timestamps as a pandas.Series.
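The parameter combinations listed above for Somno and Corsano files can be illustrated with plain pandas. This is a sketch of how such a reader could work, not the toolkit's code: the function read_signal_sketch, the in-memory CSV text, and the column headers time, aux and ppg are made up for the example. Only the keys, the separator and the date format are taken from the parameter descriptions.

```python
import io
import pandas as pd

def read_signal_sketch(source, signal_key, time_key, sep=",", date_format=None):
    """Hypothetical reader: select the signal and timestamp columns by name or position."""
    frame = pd.read_csv(source, sep=sep)
    # Integer keys are treated as column positions, string keys as column names.
    signal_col = frame.iloc[:, signal_key] if isinstance(signal_key, int) else frame[signal_key]
    time_col = frame.iloc[:, time_key] if isinstance(time_key, int) else frame[time_key]
    timestamps = pd.to_datetime(time_col, format=date_format)
    return signal_col.to_numpy(), timestamps

# Corsano-like file (made-up content): comma-separated, named columns.
corsano_csv = "date,value\n2021-01-01 00:00:00,1.5\n2021-01-01 00:00:01,2.5\n"
corsano_signal, corsano_time = read_signal_sketch(
    io.StringIO(corsano_csv), signal_key="value", time_key="date"
)

# Somno-like file (made-up content): semicolon-separated, positional columns.
somno_csv = "time;aux;ppg\n01.02.2021 10:00:00,500;0;3.0\n01.02.2021 10:00:01,000;0;4.0\n"
somno_signal, somno_time = read_signal_sketch(
    io.StringIO(somno_csv), signal_key=2, time_key=0,
    sep=";", date_format="%d.%m.%Y %H:%M:%S,%f",
)
```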

eva_toolkit.test_subsets(validation_sets, mean_hist, xedges, yedges, best_thresh, plot=False)

This function applies the final classifier to the five subsets to check the consistency of the classifier. The performance values should be similar across the subsets; otherwise something went wrong and a redo is recommended.

Parameters

validation_sets: dict

A dictionary containing the data of the subsets (see import_training_data()).

mean_hist: array

The classifying PDF.

xedges: array

The xedges of the bins.

yedges: array

The yedges of the bins.

best_thresh: float

The threshold.

plot: bool, optional

If True, the performance of the classifier on the subsets will be plotted in ROC space. The default is False.

Returns

tuple

A tuple containing the mean FPR, mean TPR, standard deviation of the FPR, and standard deviation of the TPR, in that order.
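The returned statistics can be sketched as follows. The sketch assumes that each subset provides binary labels (1 = good) together with the classifier scores, that a score at or above the threshold is predicted as good, and that the standard deviation is the population standard deviation. The helper names rates and summarize_subsets are hypothetical.

```python
import numpy as np

def rates(labels, scores, threshold):
    """Hypothetical helper: false-positive and true-positive rate at one threshold."""
    labels = np.asarray(labels, dtype=bool)
    # Assumption: a score at or above the threshold is predicted as good.
    predicted = np.asarray(scores, dtype=float) >= threshold
    tpr = np.sum(predicted & labels) / np.sum(labels)
    fpr = np.sum(predicted & ~labels) / np.sum(~labels)
    return fpr, tpr

def summarize_subsets(subsets, threshold):
    """Mean and standard deviation of FPR and TPR across all subsets."""
    results = np.array([rates(lab, sc, threshold) for lab, sc in subsets.values()])
    mean_fpr, mean_tpr = results.mean(axis=0)
    std_fpr, std_tpr = results.std(axis=0)
    return mean_fpr, mean_tpr, std_fpr, std_tpr

# Two made-up subsets of (labels, scores) instead of five to keep the example short.
subsets = {
    0: ([1, 1, 0, 0], [0.9, 0.8, 0.7, 0.1]),
    1: ([1, 1, 0, 0], [0.9, 0.4, 0.2, 0.1]),
}
summary = summarize_subsets(subsets, threshold=0.5)
```

A large standard deviation relative to the means indicates that the classifier behaves inconsistently across the subsets.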

eva_toolkit.test_testset(test_set, mean_hist, xedges, yedges, best_thresh, val_tupel=(0, 0, 0, 0), plot=False)

This function applies the final classifier to the test set, i.e. unseen data. This is the final performance test.

Parameters

test_set: dict

A dictionary containing the data of the test set (see import_training_data()).

mean_hist: array

The classifying PDF.

xedges: array

The xedges of the bins.

yedges: array

The yedges of the bins.

best_thresh: float

The threshold.

val_tupel: tuple, optional

A tuple containing the mean FPR, mean TPR, standard deviation of the FPR, and standard deviation of the TPR, in that order. Only necessary if the performance is plotted.

plot: bool, optional

If True, the performance of the classifier on the test set will be plotted in ROC space, with the data of the subsets shown as error bars. The default is False.

Returns

test_fpr: float

The fpr of the classifier on the testset.

test_tpr: float

The tpr of the classifier on the testset.

eva_toolkit.train_hist(validation_sets, hists, plot=False)

This function iterates over the subsets and determines the threshold for every subset using ROC analysis.

Parameters

validation_sets: dict

A dictionary containing the data of the subsets (see import_training_data()).

hists: dict

A dictionary containing the PDFs of every training set.

plot: bool, optional

If True, all five ROC-Curves are plotted. The default is False.

Returns

master_thresholds: dict

A dictionary containing the best threshold for every subset.
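How a best threshold can be read off a ROC curve is sketched below. The selection criterion used by the toolkit is not stated here, so this sketch substitutes Youden's J statistic (TPR minus FPR), which is one common choice. It further assumes binary labels (1 = good) and that a score at or above the threshold is predicted as good. The name best_threshold is hypothetical.

```python
import numpy as np

def best_threshold(labels, scores):
    """Hypothetical helper: pick the threshold that maximizes Youden's J = TPR - FPR."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    best_j, best_t = -np.inf, None
    # Every distinct score is a candidate threshold, i.e. one point on the ROC curve.
    for t in np.unique(scores):
        predicted = scores >= t
        tpr = np.sum(predicted & labels) / np.sum(labels)
        fpr = np.sum(predicted & ~labels) / np.sum(~labels)
        j = tpr - fpr
        if j > best_j:
            best_j, best_t = float(j), float(t)
    return best_t, best_j

# Made-up subset: four good and two bad sequences with their scores.
labels = [1, 1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.5, 0.6, 0.1]
best_t, best_j = best_threshold(labels, scores)
```

Applying such a helper to each of the five subsets would yield one threshold per subset, matching the shape of master_thresholds.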

eva_toolkit.variance(slopes, intersect, cross_pos)

A function that calculates the variance of the zero crossings relative to the linear regression from the eva_toolkit.lin_reg() function.

Parameters

slopes: list

The slope of every linear regression.

intersect: list

The y-intercept of every linear regression.

cross_pos: list

The position of every zero crossing as an array for every signal chunk.

Returns

var: list

The variance of the data relative to the linear regression.
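A possible reading of this computation is sketched below. It assumes that each regression line maps the running index of a zero crossing to its position, and that the variance is taken over the residuals (position minus fitted value) with numpy.var. The toolkit may define the variance differently, for example as the mean squared residual. The name crossing_variance and the line parameters in the example are hypothetical.

```python
import numpy as np

def crossing_variance(slopes, intercepts, cross_pos):
    """Hypothetical helper: variance of the residuals between crossings and their line."""
    variances = []
    for slope, intercept, positions in zip(slopes, intercepts, cross_pos):
        positions = np.asarray(positions, dtype=float)
        # Assumption: the line is evaluated at the running index of each crossing.
        fitted = slope * np.arange(len(positions)) + intercept
        variances.append(float(np.var(positions - fitted)))
    return variances

# Made-up line parameters and zero-crossing positions for two chunks.
var = crossing_variance([10, 4], [0, 5], [[0, 10, 20, 30], [5, 10, 13]])
```

The first chunk lies exactly on its line and yields a variance of 0, while the single displaced crossing in the second chunk yields a positive variance.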

eva_toolkit.wrapping_results(prediction)

Counts the sequences labeled as good and as bad.

Parameters

prediction: array

The array with the predicted labels.

Returns

tuple

A tuple containing the number of good labels, bad labels and overall number of labels.
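The counting can be sketched as follows, assuming that good sequences are encoded as 1 and every other value counts as bad. The label encoding used by the toolkit is not stated here, and the name count_labels is hypothetical.

```python
import numpy as np

def count_labels(prediction, good_label=1):
    """Hypothetical helper: return (number of good, number of bad, total number) of labels."""
    prediction = np.asarray(prediction)
    # Assumption: good_label marks a good sequence; everything else counts as bad.
    n_good = int(np.sum(prediction == good_label))
    n_total = int(prediction.size)
    return n_good, n_total - n_good, n_total

# Made-up predictions for five sequences.
counts = count_labels([1, 0, 1, 1, 0])
```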