Score

Parent: DataContainer

Public Methods

  • DONE Score() - Default Constructor.
  • DONE Score(Score) - Copy Constructor.
  • DONE Score(string) - Loads the Score from a file.
  • DONE Copy(Score) - Copy a Score.
  • DONE SetDataPoint(int, int, double) - Sets a data point.
  • DONE GetDataPoint(int, int) - Gets a data point.
  • DONE Load(string) - Loads the object from the file given in the file path.
  • DONE Save(string) - Writes the data to the file given in the file path.
  • DONE AppendSave(string) - Appends the Score to the file given in the file path.
  • DONE AppendSave(string, char) - Appends the Score to the file given in the file path that uses the given delimiter.
  • DONE Filter(string) - Filters out rows based on a valid function.
  • DONE SignChange(string, ArrayList) - Checks that "lead" is a valid column name. For each row, gets the sign of the lead column and sets the specified columns to the same sign. Will modify the data!
  • DONE Standardize(ArrayList) - For each column, creates a new column "columnName_std". For each new column, computes the mean and the standard deviation, then replaces each row's value in the "_std" column with $\frac{value - mean}{standard\ deviation}$ (the mean and standard deviation are column-specific). Will modify the data!
  • DONE GenerateRankings(CriteriaSet) - The CriteriaSet holds the columns to rank, with asc or desc as each column's operator. Defaults to no cross-validation.
  • DONE GenerateRankings(CriteriaSet, int) - The CriteriaSet holds the columns to rank, with asc or desc as each column's operator. Takes an integer for cross-validation.
  • DONE GenerateRankings(ArrayList, ArrayList) - Ranks scores ascending or descending. Defaults to no cross-validation.
  • DONE GenerateRankings(ArrayList, ArrayList, int) - For each column that is listed, sorts the score ascending or descending according to the corresponding specification. Creates a new column "columnName_rank" for each score that is given and assigns a numeric value 1 - n to each row id. If there is a tie, the row ids get the same rank until the tie is broken. Numbering then resumes as though there had been no interruption in the ranks. Ex: 1, 2, 3, 3, 3, 6, 7. After the individual columns have been ranked, creates a new column "rank_sum", which is the sum of the ranks. Sorts rank_sum and creates a new column "rank", which is the rank of rank_sum. Ties in rank_sum are broken by the first column in the scoreList. Ties that remain are broken by the row_id. Takes an integer for cross-validation.
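
The SignChange and Standardize entries above describe column-wise transformations. The following is a minimal Python sketch of both, not the class's actual code. It assumes a table stored as a dictionary mapping column names to lists of floats, and the function names `sign_change` and `standardize` are hypothetical. It also assumes that "the same sign" means each value keeps its magnitude and takes the sign of the lead value (a lead of zero counts as positive here), and that the sample standard deviation (n - 1) is intended; the page does not say which.

```python
import math
import statistics


def sign_change(table, lead, columns):
    """Give each listed column the sign of the lead column, row by row."""
    # Assumption: a missing column is an error, as in the Exceptions list.
    for name in [lead] + list(columns):
        if name not in table:
            raise KeyError("column does not exist: " + name)
    for i, lead_value in enumerate(table[lead]):
        for name in columns:
            # copysign keeps the magnitude and copies the sign of lead_value.
            table[name][i] = math.copysign(table[name][i], lead_value)
    return table


def standardize(table, columns):
    """Add a "<name>_std" column holding (value - mean) / standard deviation."""
    for name in columns:
        if name not in table:
            raise KeyError("column does not exist: " + name)
        values = table[name]
        # Assumption: missing values (NaN) are left out of the statistics.
        present = [v for v in values if not math.isnan(v)]
        mean = statistics.mean(present)
        sd = statistics.stdev(present)  # sample SD; an assumption
        table[name + "_std"] = [(v - mean) / sd for v in values]
    return table
```

In this sketch both functions modify the table in place, matching the "Will modify the data!" warning; a NaN input stays NaN in the "_std" column.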

Private Methods

  • DONE RankByColumn(int, bool, int) - Generates the ranks for a column and stores them in another column. There is no tie-breaking.
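
The ranking procedure described for GenerateRankings, including the per-column step without tie-breaking, can be sketched as follows. This is an illustrative Python sketch under stated assumptions rather than the class's implementation: the table is a dictionary of column name to list of floats, the row id is the row's index, there are no missing values, and ties in rank_sum are broken by the rank of the first listed column. The function names are hypothetical, and the cross-validation argument is omitted because the page does not describe its effect.

```python
def rank_by_column(values, descending=False):
    """Rank values 1..n; tied values share a rank (1, 2, 3, 3, 3, 6, 7)."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=descending)
    ranks = [0] * len(values)
    for position, row in enumerate(order):
        previous = order[position - 1]
        if position > 0 and values[row] == values[previous]:
            ranks[row] = ranks[previous]   # tie: reuse the earlier rank
        else:
            ranks[row] = position + 1      # resume as if uninterrupted
    return ranks


def generate_rankings(table, columns, descending):
    """Add "<name>_rank", "rank_sum" and "rank" columns to the table."""
    n = len(table[columns[0]])
    for name, desc in zip(columns, descending):
        table[name + "_rank"] = rank_by_column(table[name], desc)
    table["rank_sum"] = [
        sum(table[name + "_rank"][i] for name in columns) for i in range(n)
    ]
    first = table[columns[0] + "_rank"]
    # Sort by rank_sum, then by the first column's rank, then by row id.
    order = sorted(range(n), key=lambda i: (table["rank_sum"][i], first[i], i))
    table["rank"] = [0] * n
    for position, row in enumerate(order):
        table["rank"][row] = position + 1
    return table
```

Because the final sort key ends with the row id, the "rank" column in this sketch never contains ties, while the per-column "_rank" columns can.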

Virtual Methods

  • GenerateScores(DataSet, DataGrouping) - Placeholder for child class implementations. Defaults to no cross-validation.
  • GenerateScores(DataSet, DataGrouping, int) - Placeholder for child class implementations. Takes an integer for cross-validation.

Exceptions

  • Standardize/SignChange tries to modify a column that does not exist.
  • DONE The grouping is not correctly set up, e.g. group 1 or 2 is missing, or a group other than 1, 2, or 0 is present.
  • DONE A file is not formatted correctly.
  • DONE A file to read does not exist.
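
The grouping condition in the list above can be checked up front. A minimal Python sketch, assuming the grouping is available as a list with one integer group label per column; the name `validate_grouping` and the use of `ValueError` are hypothetical, not taken from the class:

```python
def validate_grouping(groups):
    """Raise ValueError unless labels are in {0, 1, 2} and 1 and 2 both occur."""
    unknown = sorted(set(groups) - {0, 1, 2})
    if unknown:
        raise ValueError("grouping other than 1, 2 or 0: %s" % unknown)
    missing = [g for g in (1, 2) if g not in groups]
    if missing:
        raise ValueError("missing group(s): %s" % missing)
```

Group 0 is deliberately not required, since it is optional.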

Remarks

  • All columns must be group 1, 2, or 0. The data grouping must have groups 1, 2. Group 0 is optional.
  • Missing values: stored as Double.NaN, written as ".".
  • Delimiters: tab (default) or space.
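
The missing-value and delimiter conventions above can be illustrated with a short Python sketch. It assumes one row of numeric values per line; the helper names `format_row` and `parse_row` are hypothetical, and the actual file layout (headers, row ids) is not described on this page.

```python
import math


def format_row(values, delimiter="\t"):
    """Join one row of floats, writing missing values (NaN) as "."."""
    return delimiter.join("." if math.isnan(v) else repr(v) for v in values)


def parse_row(line, delimiter="\t"):
    """Split one line into floats, reading "." back as NaN."""
    return [float("nan") if field == "." else float(field)
            for field in line.split(delimiter)]
```

For a space-delimited file, pass `" "` as the delimiter to both helpers.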

Topic revision: r28 - 06 May 2009, WikiGuest