eva_toolkit module
- eva_toolkit.classify_data(hist, data1, data2, xedges, yedges, annotation=None, plot=False)
Classifies the given data with the classifier.
Parameters
- hist: array
The PDF values.
- data1: array
The skewness values of a signal.
- data2: array
The kurtosis values of a signal.
- xedges: array
The xedges of the bins.
- yedges: array
The yedges of the bins.
- annotation: array, optional
The annotation. It is only used to plot the data into the PDF. The default is None.
- plot: bool, optional
If True, the PDFs will be plotted with the annotated data. The default is False.
Returns
- scores: array
The likelihood of a value being good.
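The bin lookup behind such a classification can be sketched as follows. This is a minimal illustration, not the toolkit's implementation: it assumes that hist is laid out like the output of numpy.histogram2d and that the score of a (skewness, kurtosis) pair is simply the PDF value of the bin it falls into. The helper name lookup_scores is hypothetical.

```python
import numpy as np

def lookup_scores(hist, data1, data2, xedges, yedges):
    """Return the PDF value of the bin each (data1, data2) pair falls into.

    Hypothetical helper. Assumes hist has shape
    (len(xedges) - 1, len(yedges) - 1). Points outside the bins score 0.
    """
    data1 = np.asarray(data1, dtype=float)
    data2 = np.asarray(data2, dtype=float)
    # side="right" gives i with edges[i-1] <= v < edges[i]; minus 1 is the bin index.
    # Note: a value exactly on the last edge is treated as outside here.
    ix = np.searchsorted(xedges, data1, side="right") - 1
    iy = np.searchsorted(yedges, data2, side="right") - 1
    inside = (ix >= 0) & (ix < hist.shape[0]) & (iy >= 0) & (iy < hist.shape[1])
    scores = np.zeros(data1.shape, dtype=float)
    scores[inside] = hist[ix[inside], iy[inside]]
    return scores

# Toy 2x2 PDF on unit-width bins.
xedges = np.array([0.0, 1.0, 2.0])
yedges = np.array([0.0, 1.0, 2.0])
hist = np.array([[0.1, 0.2],
                 [0.3, 0.4]])
scores = lookup_scores(hist, [0.5, 1.5, 5.0], [1.5, 0.5, 0.5], xedges, yedges)
```

The third point lies outside the histogram and therefore receives a score of 0.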
- eva_toolkit.import_training_data(training_values_path)
Imports an Excel sheet containing all data. This sheet must be created beforehand. Furthermore, the data is randomized and subdivided into test and training sets.
Parameters
- training_values_path: string
The path of the Excel sheet. The path is provided in config.ini. For more information, see the PPG_EVA_GUI.set_values() function.
Returns
- validation_sets: dict
Five subsets with a roughly equal amount of good and bad data.
- training_sets: dict
Five subsets, each of which contains four validation sets. In every training set, one validation set is left out.
- test_set: array
One set containing 20% of the whole dataset, taking the subdivision into good and bad data into account.
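A split of this kind might look like the sketch below. It is an illustration under assumptions, not the toolkit's code: the helper split_indices and its parameters are hypothetical, labels are assumed to be 1 for good and 0 for bad, and index arrays are returned instead of the data itself.

```python
import numpy as np

def split_indices(labels, n_folds=5, test_fraction=0.2, seed=0):
    """Shuffle, hold out a stratified test set, and build n_folds subsets.

    Hypothetical helper: labels is a 1D array with 1 = good and 0 = bad.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    test_parts = []
    fold_parts = [[] for _ in range(n_folds)]
    for cls in np.unique(labels):
        # Randomize the samples of this class.
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_test = int(round(test_fraction * len(idx)))
        test_parts.append(idx[:n_test])
        # Distribute the remaining samples of this class evenly over the folds.
        for k, part in enumerate(np.array_split(idx[n_test:], n_folds)):
            fold_parts[k].append(part)
    test_set = np.concatenate(test_parts)
    validation_sets = {k: np.concatenate(p) for k, p in enumerate(fold_parts)}
    # Training set k is the union of all validation sets except set k.
    training_sets = {
        k: np.concatenate([validation_sets[j] for j in range(n_folds) if j != k])
        for k in range(n_folds)
    }
    return validation_sets, training_sets, test_set

labels = np.array([1] * 50 + [0] * 50)
validation_sets, training_sets, test_set = split_indices(labels)
```

Splitting each class separately keeps the ratio of good to bad data the same in the test set and in every subset.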
- eva_toolkit.lin_reg(signal_chunks, cross_pos)
Performs a linear regression for every given signal chunk.
Parameters
- signal_chunks: 2D-array
The sequenced signal.
- cross_pos: 2D-list
The position of each zero crossing for every signal chunk.
Returns
- slope: list
The slope of every linear regression.
- intersect: list
The y-intercept of every linear regression.
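A per-chunk regression could be sketched as follows. The source does not state which variables are regressed, so this sketch assumes that the zero-crossing positions are fitted against their running index k. The helper name fit_crossings is hypothetical, and signal_chunks is not needed under this assumption.

```python
import numpy as np

def fit_crossings(cross_pos):
    """Fit position = slope * k + intercept to the k-th zero crossing of each chunk.

    Assumption for illustration: the regression runs over the crossing index k.
    Each chunk needs at least two crossings for the fit to be defined.
    """
    slopes, intercepts = [], []
    for positions in cross_pos:
        k = np.arange(len(positions))
        slope, intercept = np.polyfit(k, positions, deg=1)
        slopes.append(float(slope))
        intercepts.append(float(intercept))
    return slopes, intercepts

# Two toy chunks with perfectly regular zero crossings.
cross_pos = [[10, 30, 50, 70], [5, 20, 35]]
slopes, intercepts = fit_crossings(cross_pos)
```

For perfectly periodic crossings, the slope equals the spacing between crossings and the intercept equals the position of the first crossing.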
- eva_toolkit.mean_hists(hists, thresholds)
Calculates the mean of all PDFs as the final classifier.
Parameters
- hists: dict
All five PDFs from the training sets.
- thresholds: dict
The best threshold for every subset.
Returns
- mean_hist: array
The mean PDF.
- xedges: array
The xedges of the bins.
- yedges: array
The yedges of the bins.
- mean_thresh: float
The mean of the best thresholds.
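The averaging step can be illustrated with toy data. The sketch assumes that every dictionary value in hists is a plain 2D array defined on the same bins. The actual layout of the dictionaries in the toolkit is not documented here and may differ.

```python
import numpy as np

# Toy inputs: five constant 2x2 "PDFs" and five thresholds.
hists = {k: np.full((2, 2), float(k)) for k in range(5)}   # values 0.0 .. 4.0
thresholds = {k: 0.1 * (k + 1) for k in range(5)}          # 0.1 .. 0.5

# Element-wise mean over the five histograms.
mean_hist = np.mean(np.stack(list(hists.values())), axis=0)
# Arithmetic mean of the per-subset thresholds.
mean_thresh = float(np.mean(list(thresholds.values())))
```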
- eva_toolkit.read_signal(mod_path, signal_key, time_key, sep=',', skiprows=0, date_format=None, header='infer')
Reads the signal stored in the given path.
Parameters
- mod_path: string
The modified file path to the signal file, with the file name as its last part.
Syntax: path + 'filename'
- signal_key: string
The keyword/number of the signal column in the pandas.DataFrame.
Somno: signal_key=2
Corsano: signal_key='value'
- time_key: string
The keyword/number of the timestamp column in the pandas.DataFrame.
Somno: time_key=0
Corsano: time_key='date'
- sep: string, optional
The separator used to separate the columns. Only necessary for Somno; Corsano files use the default. The default is ','.
sep_somno = ';'
- skiprows: integer, optional
!!!Deprecated!!! The number of rows that should be skipped. The default is 0.
- date_format: string, optional
Date format for the timestamps. Only necessary for Somno (“%d.%m.%Y %H:%M:%S,%f”). The default is None.
Returns
- signals: array
The signal as a time series.
- timestamps: Series
The timestamps as a pandas.Series.
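How the Somno settings fit together can be sketched with pandas directly. This is an assumption-laden illustration, not the body of read_signal: the two-row sample is made up, the file is assumed to have no header row, and the timestamps are parsed with pandas.to_datetime.

```python
import io
import pandas as pd

# Hypothetical file content in a Somno-like layout: timestamp; unused; signal
raw = io.StringIO(
    "01.02.2023 10:00:00,000;x;0.5\n"
    "01.02.2023 10:00:00,500;x;0.7\n"
)

# sep=';' as for Somno; header=None is an assumption so that columns are numbered.
frame = pd.read_csv(raw, sep=";", header=None)

signals = frame[2].to_numpy()                      # signal_key=2
timestamps = pd.to_datetime(frame[0],              # time_key=0
                            format="%d.%m.%Y %H:%M:%S,%f")
```

The comma inside the timestamp is the fractional-second separator, which is why a different column separator is needed for these files.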
- eva_toolkit.test_subsets(validation_sets, mean_hist, xedges, yedges, best_thresh, plot=False)
This function applies the final classifier to the five subsets to check the consistency of the classifier. The performances should have similar values; otherwise something went wrong and a redo is recommended.
Parameters
- validation_sets: dict
A dictionary containing the data of the subsets (see import_training_data()).
- mean_hist: array
The classifying PDF.
- xedges: array
The xedges of the bins.
- yedges: array
The yedges of the bins.
- best_thresh: float
The threshold.
- plot: bool, optional
If True, the performance of the classifier on the subsets will be plotted in ROC space. The default is False.
Returns
- tuple
A tuple containing the mean false positive rate (fpr), the mean true positive rate (tpr), the standard deviation of the fpr, and the standard deviation of the tpr, at indices 0 to 3 respectively.
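The returned tuple can be reproduced on toy data as sketched below. The helper fpr_tpr and the layout of subsets are hypothetical. The sketch further assumes that scores at or above the threshold are classified as good and that the population standard deviation (NumPy's default) is used.

```python
import numpy as np

def fpr_tpr(scores, labels, thresh):
    """Return (fpr, tpr) when scores >= thresh are classified as good."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    predicted_good = scores >= thresh
    fpr = np.sum(predicted_good & ~labels) / np.sum(~labels)
    tpr = np.sum(predicted_good & labels) / np.sum(labels)
    return float(fpr), float(tpr)

# Toy subsets: (scores, labels) with 1 = good and 0 = bad.
subsets = {
    0: ([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]),
    1: ([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]),
}
results = np.array([fpr_tpr(s, l, thresh=0.5) for s, l in subsets.values()])

# Order: mean fpr, mean tpr, std of fpr, std of tpr.
val_tuple = (results[:, 0].mean(), results[:, 1].mean(),
             results[:, 0].std(), results[:, 1].std())
```

A large standard deviation in either rate indicates that the classifier behaves inconsistently across the subsets.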
- eva_toolkit.test_testset(test_set, mean_hist, xedges, yedges, best_thresh, val_tupel=(0, 0, 0, 0), plot=False)
This function applies the final classifier to the test set, i.e. unseen data. This represents the final performance test.
Parameters
- test_set: dict
A dictionary containing the data of the test set (see import_training_data()).
- mean_hist: array
The classifying PDF.
- xedges: array
The xedges of the bins.
- yedges: array
The yedges of the bins.
- best_thresh: float
The threshold.
- val_tupel: tuple, optional
A tuple containing the mean fpr, mean tpr, standard deviation of the fpr, and standard deviation of the tpr, at indices 0 to 3 respectively. Only necessary if the performance is plotted. The default is (0, 0, 0, 0).
- plot: bool, optional
If True, the performance of the classifier on the test set will be plotted in ROC space with the data of the subsets as error bars. The default is False.
Returns
- test_fpr: float
The fpr of the classifier on the test set.
- test_tpr: float
The tpr of the classifier on the test set.
- eva_toolkit.train_hist(validation_sets, hists, plot=False)
This function iterates over the subsets and determines the threshold for every subset using ROC analysis.
Parameters
- validation_sets: dict
A dictionary containing the data of the subsets (see import_training_data()).
- hists: dict
A dictionary containing the PDFs of every training set.
- plot: bool, optional
If True, all five ROC curves are plotted. The default is False.
Returns
- master_thresholds: dict
A dictionary containing the best threshold for every subset.
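One common way to pick a threshold from a ROC curve is sketched below. The source does not state which criterion the toolkit uses. As an assumption, this sketch maximizes Youden's J (tpr - fpr) over all observed scores. The helper name best_threshold is hypothetical.

```python
import numpy as np

def best_threshold(scores, labels):
    """Pick the threshold that maximizes tpr - fpr (Youden's J).

    Assumed criterion for illustration; scores >= threshold count as good.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best_j, best_t = -np.inf, None
    # Every observed score is a candidate threshold (one point on the ROC curve).
    for t in np.unique(scores):
        predicted_good = scores >= t
        tpr = np.sum(predicted_good & labels) / np.sum(labels)
        fpr = np.sum(predicted_good & ~labels) / np.sum(~labels)
        if tpr - fpr > best_j:
            best_j, best_t = tpr - fpr, float(t)
    return best_t

# Toy subset: two bad samples with low scores, two good samples with high scores.
scores = [0.1, 0.2, 0.8, 0.9]
labels = [0, 0, 1, 1]
thresh = best_threshold(scores, labels)
```

Running this once per subset would yield one threshold per subset, matching the shape of the returned dictionary.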
- eva_toolkit.variance(slopes, intersect, cross_pos)
Calculates the variance of the zero crossings relative to the linear regression from the eva_toolkit.lin_reg() function.
Parameters
- slopes: list
The slope of every linear regression.
- intersect: list
The y-intercept of every linear regression.
- cross_pos: list
The position of every zero crossing as an array for every signal chunk.
Returns
- var: list
The variance of the data relative to the linear regression.
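The variance relative to a fitted line can be sketched as the variance of the residuals. As before, this assumes that the regression runs over the crossing index k, which the source does not confirm. The helper name residual_variance is hypothetical, and the population variance (NumPy's default) is used.

```python
import numpy as np

def residual_variance(slopes, intercepts, cross_pos):
    """Variance of the zero-crossing positions around each fitted line.

    Assumption for illustration: the fitted line is slope * k + intercept,
    where k is the running index of the crossing within its chunk.
    """
    variances = []
    for slope, intercept, positions in zip(slopes, intercepts, cross_pos):
        positions = np.asarray(positions, dtype=float)
        k = np.arange(len(positions))
        residuals = positions - (slope * k + intercept)
        variances.append(float(np.var(residuals)))
    return variances

# First chunk lies exactly on the line; second chunk deviates by 0, +2, 0, -2.
var = residual_variance(
    [20.0, 20.0], [10.0, 10.0],
    [[10, 30, 50, 70], [10, 32, 50, 68]],
)
```

A perfectly regular chunk gives a variance of 0, while irregular zero crossings increase the value.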