DistanceColumn GetByColName(string name) - returns the column with the indicated name
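A minimal sketch of how the lookup might work, assuming a linear scan over the column list. The class name DistanceResult, the field name Columns, and the choice to return null when nothing matches are all assumptions; the notes here do not specify them.

```csharp
using System;
using System.Collections;

// Trimmed-down column type: only the member needed for the lookup.
public class DistanceColumn
{
    public string Name;
}

// "DistanceResult" is a placeholder name for the owning class (assumption).
public class DistanceResult
{
    // The notes list an "ArrayList of DistanceColumn" without naming it;
    // "Columns" is a hypothetical name.
    public ArrayList Columns = new ArrayList();

    // Linear scan by column name. Returning null when no column matches
    // is an assumed behavior, not something the notes state.
    public DistanceColumn GetByColName(string name)
    {
        foreach (DistanceColumn column in Columns)
        {
            if (column.Name == name)
            {
                return column;
            }
        }
        return null;
    }
}
```

If lookups by name are frequent, a dictionary keyed on Name would avoid the scan, at the cost of keeping it in sync with the list.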
Properties
NumFlagged = NumFlagged1 + NumFlagged2 - Total number of flagged columns
NumColumns = NumColumns1 + NumColumns2 - Total number of columns
NumMissing = NumMissing1 + NumMissing2 - Total number of columns that have all null data (NaN)
PercentCorrect = (NumColumns - NumFlagged) / NumColumns - returns columns not flagged divided by total columns
Public Members
double PValue - Permutation P value for this run.
double Mean1 - mean of group 1
double Mean2 - mean of group 2
int NumFlagged1 - number of flagged columns from group 1
int NumFlagged2 - number of flagged columns from group 2
int NumColumns1 - Total number of columns in group 1
int NumColumns2 - Total number of columns in group 2
int NumMissing1 - Total number of columns in group 1 that have all null data (NaN)
int NumMissing2 - Total number of columns in group 2 that have all null data (NaN)
ArrayList of DistanceColumn
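The totals above could be written as read-only properties computed from the per-group members. This is only a sketch: the class name DistanceResult is a placeholder, and the cast to double in PercentCorrect is my assumption to avoid integer division.

```csharp
// "DistanceResult" is a placeholder name for the owning class (assumption).
public class DistanceResult
{
    // Per-group counts, stored as plain public members.
    public int NumFlagged1, NumFlagged2;
    public int NumColumns1, NumColumns2;
    public int NumMissing1, NumMissing2;

    // Totals are recomputed from the per-group counts on every read.
    public int NumFlagged => NumFlagged1 + NumFlagged2;
    public int NumColumns => NumColumns1 + NumColumns2;
    public int NumMissing => NumMissing1 + NumMissing2;

    // Columns not flagged divided by total columns. The cast forces
    // floating-point division; with zero columns the result is NaN.
    public double PercentCorrect => (double)(NumColumns - NumFlagged) / NumColumns;
}
```

For example, with 10 columns in each group and 2 and 3 of them flagged, NumFlagged is 5 and PercentCorrect is 15/20 = 0.75.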
Class DistanceColumn
Properties
bool Flag = (RealGroup != AssignedGroup) - true denotes that the given group does not equal the assigned group
double Dist1 = abs(ModScore - ModMean1) - distance to the group 1 mean
double Dist2 = abs(ModScore - ModMean2) - distance to the group 2 mean
Public Members
string Descriptor - Group descriptor
int RealGroup - given group #
int AssignedGroup - assigned group #
string Name - column name
double ModScore - modified score
double ModMean1 - mean of group 1 without this patient
double ModMean2 - mean of group 2 without this patient
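A possible shape for DistanceColumn, with the three derived values as computed properties. Flag follows its prose description (true when the given and assigned groups differ). How AssignedGroup gets its value is not covered here, so the sketch leaves it as a plain member.

```csharp
using System;

public class DistanceColumn
{
    // Stored members, as listed above.
    public string Descriptor;
    public int RealGroup;
    public int AssignedGroup;
    public string Name;
    public double ModScore;
    public double ModMean1;
    public double ModMean2;

    // True when the given group does not equal the assigned group.
    public bool Flag => RealGroup != AssignedGroup;

    // Absolute distance from the modified score to each group mean.
    public double Dist1 => Math.Abs(ModScore - ModMean1);
    public double Dist2 => Math.Abs(ModScore - ModMean2);
}
```

With ModScore = 4.0, ModMean1 = 3.0 and ModMean2 = 7.5, Dist1 is 1.0 and Dist2 is 3.5.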
Remarks
The items listed under Properties can be derived from other parts of the data. This can be dealt with in three ways:
1: Leave them out and let the user of the class derive them when needed.
2: Store them in the class, which will use more memory.
3: Make everything a property and calculate the derived values from the set values.
I believe the 3rd option to be the best. While it will be slightly slower, if something that a derived value relies on changes, the derived value changes as well.
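To illustrate the trade-off between storing a derived value and computing it, here is a small sketch using NumFlagged. The class names StoredTotals and ComputedTotals and the Refresh method are hypothetical and exist only for this comparison.

```csharp
// Derived value stored as a member: it costs an extra field and
// must be refreshed by hand whenever an input changes.
public class StoredTotals
{
    public int NumFlagged1, NumFlagged2;
    public int NumFlagged; // copy of the sum; goes stale if an input changes

    public void Refresh()
    {
        NumFlagged = NumFlagged1 + NumFlagged2;
    }
}

// Derived value as a computed property: nothing extra is stored,
// and every read reflects the current inputs.
public class ComputedTotals
{
    public int NumFlagged1, NumFlagged2;
    public int NumFlagged => NumFlagged1 + NumFlagged2;
}
```

If NumFlagged1 changes from 2 to 4 while NumFlagged2 stays at 3, StoredTotals.NumFlagged still reads 5 until Refresh is called, while ComputedTotals.NumFlagged reads 7 immediately.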