DataContainer

Public Methods

  • DONE DataContainer(DataContainer) - Copy constructor.
  • DONE AddColumn(string) - Appends a new column of data, sets the name, and returns the new column's index. Initializes with null values.
  • DONE RemoveColumn(int) - Removes a column of data.
  • DONE RemoveColumns(string, ...) - Removes multiple columns of data by name.
  • DONE GetColumnName(int) - Gets the column name.
  • DONE SetColumnName(int, string) - Sets the column name.
  • DONE GetColIndexFromName(string) - Returns the index for a given column name.
  • DONE AddRow(int) - Appends a new row of data, sets the id, and returns the new row's index. Initializes with null values.
  • DONE RemoveRow(int) - Removes a row of data by index.
  • DONE RemoveRows(int, ...) - Removes multiple rows of data by id.
  • DONE RemoveRows(ArrayList) - Removes multiple rows of data by id.
  • DONE Filter(int, ...) - Removes all rows except the ones specified.
  • DONE Filter(ArrayList) - Removes all rows except the ones specified.
  • DONE GetRowId(int) - Gets a row id.
  • DONE GetRowIndexFromId(int) - Returns the index for a given row id.
  • DONE Copy() - Returns a new DataContainer that is a deep copy of this object.
  • DONE Clear() - Removes all the data, row ids, and column names.
  • DONE Trim() - Trims the data to the minimum size and runs the garbage collector.
  • DONE Append(DataContainer) - Adds the rows of another DataContainer to this object. Both data sets must have the same number of columns and identical column names. Row ids must be unique across the two data sets, i.e. the same row id cannot appear in both. If a row id conflicts, an error is thrown and no changes are made to the data.
  • DONE Merge(DataContainer) - Adds the columns of another DataContainer to this object. Rows that appear in only one data set are kept, with null values filling the columns that come from the other data set. Column names must be unique across the two data sets, i.e. the same column name cannot appear in both. If a column name conflicts, an error is thrown and no changes are made to the data.
  • DONE Save(string) - Gets the default delimiter and calls the appropriate Save(string, string).
  • DONE GetDataHeader(string) - Returns an ArrayList that contains the header (the column info) for the data set.
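
The Append and Merge rules above can be illustrated with a minimal sketch. It is written in Python rather than the class's own language, and the class name ColumnTable, its constructor, and the use of ValueError are assumptions made for illustration, not the actual implementation. It also assumes that "identical column names" means the same names in the same order. Both operations validate first and change the data only afterwards, so a conflict leaves the object untouched.

```python
class ColumnTable:
    """Hypothetical stand-in for DataContainer: data[c][r] is column c, row r."""

    def __init__(self, col_names, row_ids, data):
        self.col_names = list(col_names)
        self.row_ids = list(row_ids)
        self.data = [list(col) for col in data]

    def append(self, other):
        """Add the rows of `other`; column names must match, row ids must not clash."""
        # Validate everything before mutating, so an error changes nothing.
        if self.col_names != other.col_names:
            raise ValueError("column names differ")
        clash = set(self.row_ids) & set(other.row_ids)
        if clash:
            raise ValueError(f"duplicate row ids: {sorted(clash)}")
        self.row_ids.extend(other.row_ids)
        for mine, theirs in zip(self.data, other.data):
            mine.extend(theirs)

    def merge(self, other):
        """Add the columns of `other`; rows missing on either side are null-filled."""
        clash = set(self.col_names) & set(other.col_names)
        if clash:
            raise ValueError(f"duplicate column names: {sorted(clash)}")
        known = set(self.row_ids)
        extra_ids = [rid for rid in other.row_ids if rid not in known]
        merged_ids = self.row_ids + extra_ids
        other_index = {rid: i for i, rid in enumerate(other.row_ids)}
        # Build the incoming columns aligned to the merged row order.
        new_cols = [
            [col[other_index[rid]] if rid in other_index else None
             for rid in merged_ids]
            for col in other.data
        ]
        # Pad the existing columns for rows that only `other` had.
        for col in self.data:
            col.extend([None] * len(extra_ids))
        self.row_ids = merged_ids
        self.col_names.extend(other.col_names)
        self.data.extend(new_cols)
```

In this sketch, rows that exist only in the other data set are placed after the existing rows; the page does not say what order the real Merge uses.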

Properties

  • DONE Name - Gets/sets the name.
  • DONE NumRows - Gets the number of rows.
  • DONE NumCols - Gets the number of columns.

Protected Methods

  • DONE Copy(DataContainer) - Copies a DataContainer over the current object.
  • DONE GetDataObject(int, int) - Accesses a data item.
  • DONE SetDataObject(int, int, object) - Sets a data item.
  • DONE Delimiter(string) - Determines the default delimiter based on filename.
  • DONE RebuildColIndices() - Rebuilds the column name to index map.
  • DONE RebuildRowIndices() - Rebuilds the row id to index map.

Virtual Methods

  • Load(string) - Place holder for child class implementations.
  • Save(string, string) - Place holder for child class implementations.
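
How Save(string) hands off to the placeholder Save(string, string) can be sketched as follows. This is an illustrative Python sketch, not the real class: the names ContainerBase, RecordingContainer, and save_with_delimiter are hypothetical (Python has no overloading, so the two-argument Save gets its own name), and the rule that a ".csv" extension selects the comma is an assumption. The page only says the delimiter is chosen from the filename, with tab as the default.

```python
import os


class ContainerBase:
    """Hypothetical base class mirroring the Load/Save placeholders."""

    def delimiter(self, filename):
        # Assumption: ".csv" selects a comma; anything else falls back to tab.
        ext = os.path.splitext(filename)[1].lower()
        return "," if ext == ".csv" else "\t"

    def load(self, filename):
        # Placeholder for child class implementations.
        raise NotImplementedError

    def save_with_delimiter(self, filename, delimiter):
        # Placeholder for child class implementations.
        raise NotImplementedError

    def save(self, filename):
        # Pick the default delimiter, then defer to the child's implementation.
        self.save_with_delimiter(filename, self.delimiter(filename))


class RecordingContainer(ContainerBase):
    """Toy child class that records calls instead of writing files."""

    def __init__(self):
        self.calls = []

    def save_with_delimiter(self, filename, delimiter):
        self.calls.append((filename, delimiter))
```

A child class only has to supply the two-argument save; callers that do not care about the delimiter keep using the one-argument form.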

Protected Members

  • ArrayList rowIds - Row index to row id mapping.
  • SortedList rowIndices - Row id to row index mapping.
  • ArrayList colNames - Column index to column name mapping.
  • SortedList colIndices - Column name to column index mapping.
  • ArrayList data - Set of data columns.
  • string name - The name of the data set.
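
The members above form two pairs of forward and reverse lookups around column-oriented data. The Python sketch below shows one way the pairs can be kept consistent; it is an assumption-laden illustration, with plain lists standing in for ArrayList and dicts standing in for SortedList, and the class name IndexedTable is hypothetical. Removing a row shifts every later index, which is why the reverse map is rebuilt afterwards.

```python
class IndexedTable:
    """Hypothetical sketch of the member layout; not the real DataContainer."""

    def __init__(self, col_names, row_ids, data):
        self.row_ids = list(row_ids)             # row index -> row id
        self.col_names = list(col_names)         # column index -> column name
        self.data = [list(col) for col in data]  # one list per column
        self.row_indices = {}                    # row id -> row index
        self.col_indices = {}                    # column name -> column index
        self.rebuild_row_indices()
        self.rebuild_col_indices()

    def rebuild_row_indices(self):
        self.row_indices = {rid: i for i, rid in enumerate(self.row_ids)}

    def rebuild_col_indices(self):
        self.col_indices = {name: i for i, name in enumerate(self.col_names)}

    def remove_row(self, index):
        if not 0 <= index < len(self.row_ids):
            raise IndexError("invalid row index")
        del self.row_ids[index]
        for col in self.data:
            del col[index]
        # Every row after `index` moved up by one, so the reverse map is stale.
        self.rebuild_row_indices()

    def get(self, row_id, col_name):
        # Look up a value by id and name through the reverse maps.
        return self.data[self.col_indices[col_name]][self.row_indices[row_id]]
```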

Exceptions

  • Merging data with different number of rows.
  • Merging data with different row ids.
  • DONE Merging data with duplicate column names.
  • DONE Appending data with different number of columns.
  • DONE Appending data with different column names.
  • DONE Appending data with duplicate row ids.
  • DONE Adding a column with a duplicate name.
  • DONE Adding a row with a duplicate id.
  • DONE Setting a column with a duplicate name.
  • DONE Setting a row with a duplicate id.
  • DONE Invalid indices.

Rationale

  • Using columns of data: The data is stored as columns because there are more rows than columns. This should save some memory.
  • Whenever a row or column is added, its values are initialized to null. This removes the need for a separate integrity check.
  • Whenever a row or column is added, its id or name must not duplicate an id or name already in use.
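
The second and third points can be sketched together. The Python below is a minimal illustration under assumptions: the class name NullFilledTable, the is_rectangular helper, and the use of ValueError are hypothetical, and None stands in for the null value. Because every new row or column is filled with nulls on creation, all columns always have one entry per row, so no later check is needed.

```python
class NullFilledTable:
    """Hypothetical sketch: adds rows and columns, null-filled, rejecting duplicates."""

    def __init__(self):
        self.row_ids = []
        self.col_names = []
        self.data = []  # one list per column

    def add_column(self, name):
        if name in self.col_names:
            raise ValueError(f"duplicate column name: {name!r}")
        self.col_names.append(name)
        # One null per existing row keeps the new column the right length.
        self.data.append([None] * len(self.row_ids))
        return len(self.col_names) - 1

    def add_row(self, row_id):
        if row_id in self.row_ids:
            raise ValueError(f"duplicate row id: {row_id!r}")
        self.row_ids.append(row_id)
        # One null per existing column keeps every column the same length.
        for col in self.data:
            col.append(None)
        return len(self.row_ids) - 1

    def is_rectangular(self):
        # Holds by construction; shown only to make the invariant explicit.
        return all(len(col) == len(self.row_ids) for col in self.data)
```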

Remarks

  • Delimiters: tab (default) or comma.

Topic revision: r26 - 09 Dec 2004, WillGray
 
