DistanceResult DoDistance(DataSet, DataGrouping, FilteredScore) - Runs distance on the provided training dataset.
DistanceResult DoDistance(DataSet, DataGrouping, FilteredScore, int) - Runs distance on the provided training dataset and calculates the permutation p value.
DistanceResult DoDistance(DataSet, DataGrouping, FilteredScore, DataSet, DataGrouping) - Runs distance on the provided training and testing datasets.
DistanceResult DoDistance(DataSet, DataGrouping, FilteredScore, DataSet, DataGrouping, int) - Runs distance on the provided training and testing datasets and calculates the permutation p value.
Protected Methods
DistanceOutput buildOutput(DistanceOutput, grouping) - Fills in the output object.
validateTest(DataSet, DataGrouping, FilteredScore) - Checks the validity of the information for use as testing data.
validateTrain(DataSet, DataGrouping, FilteredScore) - Checks the validity of the information for use as training data.
fillTrainOutput(DistanceOutput) - Fills the columns of the training DistanceOutput.
fillTestOutput(DistanceOutput) - Fills the columns of the testing DistanceOutput.
generatePvalue(DistanceOutput, int) - Generates the p value based on the number of permutations indicated.
Exceptions
Testing data set does not have the same ids as the training data set.
Incorrect groupings
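The two exception conditions above can be illustrated with a minimal Python sketch. This is not the library's own validation code: the function name `check_test_data`, its parameters, the set-based id comparison, and the use of `ValueError` are all assumptions made for illustration, and the allowed labels are passed in rather than hard-coded.

```python
def check_test_data(train_ids, test_ids, test_labels, allowed_labels):
    """Hypothetical check mirroring the two documented exception cases."""
    # Case 1: the testing data must carry the same ids as the training data.
    if set(test_ids) != set(train_ids):
        raise ValueError(
            "Testing data set does not have the same ids as the training data set."
        )
    # Case 2: every group label must belong to the allowed set.
    unexpected = set(test_labels) - set(allowed_labels)
    if unexpected:
        raise ValueError(f"Incorrect groupings: {sorted(unexpected)}")


# Passes silently: ids match and all labels are allowed.
check_test_data(["a", "b", "c"], ["c", "b", "a"], [0, 1, 1], {0, 1, 2})
```

Comparing ids as sets ignores ordering; a stricter check would also compare order if the underlying implementation depends on it.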
Remarks
Training DataGrouping must consist of 0,1,2
Testing DataGrouping may consist of 0,1,2 or 0,1
The p value is generated by permuting the values while leaving the labels the same. For every permutation, the accuracy is compared against the accuracy on the real data. The number of permutations whose accuracy is equal to or better than that of the real data is divided by the total number of permutations, and this ratio is returned as the p value. 2,000 is the smallest recommended number of permutations; 10,000 is recommended. This process is not very computationally intensive: 10,000 permutations complete in a few seconds on a fast system.
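The permutation procedure described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the library's implementation: `permutation_p_value` and `nearest_centroid_accuracy` are hypothetical names, and the nearest-centroid scorer on one-dimensional values is only a stand-in for whatever accuracy measure the distance method actually uses.

```python
import random


def nearest_centroid_accuracy(values, labels):
    """Stand-in scorer (assumption): fraction of samples whose nearest
    group mean belongs to their own group."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    centroids = {g: sum(vs) / len(vs) for g, vs in groups.items()}
    correct = 0
    for value, label in zip(values, labels):
        predicted = min(centroids, key=lambda g: abs(value - centroids[g]))
        if predicted == label:
            correct += 1
    return correct / len(values)


def permutation_p_value(values, labels, score_fn, n_permutations, seed=None):
    """Permute the values, keep the labels fixed, and count how often the
    permuted accuracy is equal to or better than the real accuracy."""
    rng = random.Random(seed)
    observed = score_fn(values, labels)
    shuffled = list(values)  # copy so the caller's data is left untouched
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if score_fn(shuffled, labels) >= observed:
            hits += 1
    return hits / n_permutations


values = [0.10, 0.20, 0.15, 5.00, 5.10, 4.90]
labels = [0, 0, 0, 1, 1, 1]
p = permutation_p_value(values, labels, nearest_centroid_accuracy, 2000, seed=1)
print(p)
```

Because the result is a count divided by the number of permutations, its resolution is limited to multiples of 1/n, which is one reason a larger permutation count gives a more useful p value.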