DataSet

Parent: DataContainer, Interface: IDataSet

Public Methods

  • DONE DataSet(DataSet) - Copy Constructor overload that handles the name list.
  • DONE SetDataPoint(int, int, double) - Sets the data at that point.
  • DONE GetDataPoint(int, int) - Gets the data at that point.
  • DONE SetRowName(int, string) - Sets the row name for the row.
  • DONE GetRowName(int) - Gets the row name for the row given.
  • DONE AddRow(int, string) - Overload to include row name.
  • DONE RemoveRows(string, ...) - Removes multiple rows of data by name.
  • DONE RemoveRows(ArrayList) - Removes multiple rows of data by id or name. Do not mix ids and names in the same list.
  • DONE Filter(string, ...) - Removes all rows except the ones specified by name.
  • DONE Filter(ArrayList) - Removes all rows except the ones specified by id or name. Do not mix ids and names in the same list.
  • DONE Copy() - Returns a new DataSet that is a deep copy of this object.
  • DONE Clear() - Overload that also clears the row names.
  • DONE Append(DataSet) - Takes another DataSet and adds its rows to this object. Both data sets must have the same number of columns and identical column names. Row ids must be unique across the two data sets, i.e. the same row id can't appear in both. If there is a row id conflict, an error will be thrown and no changes are made to the data.
  • DONE Merge(DataSet) - Takes another DataSet and adds its columns to this object. Rows that are in one data set and not the other are added, with null values inserted in the columns that come from the other data set. Column names must be unique across the two data sets, i.e. the same column name can't appear in both. If there is a column name conflict, an error will be thrown and no changes are made to the data.
  • DONE Log(int) - Transforms every value in the data set to its logarithm in base n.
  • DONE Average(DataGrouping, int) - Combines columns in the DataSet based on groups.
  • DONE Load(string) - Loads the object from the file given in the file path.
  • DONE Save(string, string) - Writes the DataSet to the file given, using the specified delimiter.
  • DONE SaveEisen(string, DataGrouping) - Writes the data set with the column headers as the group number concatenated with "---" and the column name. The column names should be padded with "_" at the end to make the column names the same length.
  • DONE GeneratePercentage(DataGrouping, int) - Calculates the percentage of each group that has non-null or non-zero data. The flag determines whether nulls or zeros are counted.
  • DONE GenerateBasicStats() - Calculates the basic statistics (min, first quartile, median, third quartile, max, mean, std dev, number of nulls, number of zeros) for each column and returns an InformationSet. The quartiles are computed by the weighted average at $x_{(n+1)p}$.
  • DONE MeanStdev() - Generates the mean and standard deviation separately for the positive values and for the negative values, and returns them as "out" parameters.
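
The all-or-nothing behaviour described for Append and Merge can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the DataSet implementation: the `Table` class and its fields `columns` and `rows` (a dict from row id to a list of values) are hypothetical stand-ins, and `None` plays the role of a null value. The point it shows is that every check runs before any data is changed.

```python
class Table:
    """Hypothetical stand-in for DataSet: column names plus rows keyed by row id."""

    def __init__(self, columns, rows):
        self.columns = list(columns)
        self.rows = {rid: list(vals) for rid, vals in rows.items()}

    def append(self, other):
        """Add the rows of `other`; validate everything before changing anything."""
        if self.columns != other.columns:
            raise ValueError("column names differ")
        clash = set(self.rows) & set(other.rows)
        if clash:
            raise ValueError(f"row id conflict: {sorted(clash, key=str)}")
        for rid, vals in other.rows.items():
            self.rows[rid] = list(vals)

    def merge(self, other):
        """Add the columns of `other`; rows missing on either side get nulls."""
        clash = set(self.columns) & set(other.columns)
        if clash:
            raise ValueError(f"column name conflict: {sorted(clash)}")
        left_nulls = [None] * len(self.columns)
        right_nulls = [None] * len(other.columns)
        # Keep this object's row order, then add rows found only in `other`.
        row_ids = list(self.rows) + [r for r in other.rows if r not in self.rows]
        merged = {}
        for rid in row_ids:
            left = self.rows.get(rid, left_nulls)
            right = other.rows.get(rid, right_nulls)
            merged[rid] = list(left) + list(right)
        self.columns = self.columns + other.columns
        self.rows = merged
```

Because the conflict checks come first, a failed call leaves the object exactly as it was, which is the guarantee both method descriptions ask for.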
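
The quartile rule named in GenerateBasicStats, the weighted average at $x_{(n+1)p}$, can be illustrated with a short Python sketch. The function name `quantile_n_plus_1` is hypothetical. Skipping missing values and clamping to the smallest or largest value when the position falls outside 1..n are assumptions made here, since the page does not say how those cases are handled.

```python
import math

def quantile_n_plus_1(values, p):
    """Weighted average of the order statistics around position (n + 1) * p."""
    # Assumption: missing values (NaN) are left out of the calculation.
    xs = sorted(v for v in values if not math.isnan(v))
    n = len(xs)
    if n == 0:
        return math.nan
    h = (n + 1) * p        # 1-based fractional position
    j = math.floor(h)      # integer part of the position
    g = h - j              # fractional part, used as the weight
    # Assumption: positions outside 1..n are clamped to the extremes.
    if j < 1:
        return xs[0]
    if j >= n:
        return xs[-1]
    return (1 - g) * xs[j - 1] + g * xs[j]
```

For the eight values 1 through 8 and p = 0.25, the position is 9 × 0.25 = 2.25, so the first quartile is 0.75 × 2 + 0.25 × 3 = 2.25.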

Protected Methods

  • DONE Copy(DataSet) - Copies a DataSet over the current object.

Private Members

  • ArrayList rowNames - The row names.

Exceptions

  • Averaging a data set with a grouping that does not reduce the number of columns.
  • Averaging a data set with the 0 group.
  • DONE Logging a data set that has negative values.
  • DONE A file is not formatted correctly.
  • DONE A file to read does not exist.

Remarks

  • When averaging, the default behaviour is to ignore zeros, treating them as if they do not exist. The function can be flagged to include zeros in the average. Caveat: a null should be returned only if the average set contains nothing but nulls. The new column name will be the DataGrouping's group descriptor. The data set is modified in place to reflect the changes, and its old state is lost.
  • Log transformations must be performed on data that is >= 0. If any data point is < 0, the transformation must fail without changing any data. Values equal to 0 remain 0. When the only function available is log base 10, a value v can be log transformed to base b with $\log_b v = \frac{\log_{10} v}{\log_{10} b}$.
  • Missing values: stored as Double.NaN, written as ".".
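
The log transformation rules in the remarks above can be sketched as follows. This is a minimal Python illustration working on a flat list rather than on a DataSet; the function name `log_transform` is hypothetical. Passing missing values (NaN) through unchanged and rejecting bases that are not positive or are equal to 1 are assumptions that the page does not state.

```python
import math

def log_transform(values, base):
    """Return the base-`base` logs of `values`, or raise without changing anything."""
    # Assumption: the base must be positive and different from 1.
    if base <= 0 or base == 1:
        raise ValueError("base must be positive and not equal to 1")
    # Validate first so that a failure leaves the caller's data untouched.
    if any(v < 0 for v in values if not math.isnan(v)):
        raise ValueError("cannot log-transform negative values")
    result = []
    for v in values:
        if math.isnan(v) or v == 0:
            # Zeros stay zero; missing values are passed through (assumption).
            result.append(v)
        else:
            # Change of base using only log10: log_b(v) = log10(v) / log10(b)
            result.append(math.log10(v) / math.log10(base))
    return result
```

The sketch returns a new list instead of modifying its input, which is one simple way to make sure a failed transformation changes nothing.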
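
The averaging rule for a single cell, meaning one row's values across the columns of one group, can be sketched like this. The function name `average_cell` and the `include_zeros` parameter are hypothetical. Returning 0 when zeros are ignored and nothing but zeros (and possibly nulls) is present is one reading of the caveat that a null is returned only when the set holds nothing but nulls.

```python
import math

def average_cell(values, include_zeros=False):
    """Average one row's values across the columns of a single group."""
    # Nulls (NaN) never take part in the average.
    present = [v for v in values if not math.isnan(v)]
    if not present:
        return math.nan    # only nulls in the set, so the result is null
    if include_zeros:
        used = present
    else:
        used = [v for v in present if v != 0]
        if not used:
            # Only zeros remain: assumed to average to zero rather than null.
            return 0.0
    return sum(used) / len(used)
```

With the default setting, the values 2, 0 and 4 average to 3; with zeros included they average to 2.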

Topic revision: r32 - 19 Apr 2005, WillGray
 
