Requirements Statement
The software to be designed will control the generation of scores to be used in the distance analysis of genes and proteins via WFCCM. The program will generate data sets and score files for use in the distance program (WFCCM).
The software will load data sets from files. The files may be text or Excel files. The data sets may reside in more than one file or Excel worksheet. The data to be loaded will be either patient data or descriptive information that describes other data. The user will be able to transform the data by taking the log base 10 or log base 2. The user will be able to average the data using additional information specified in the descriptive data files. When the data is transformed, the user will be able to enter a new name for the transformed data set. The user will be able to give each data set a name and description.
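The transform and averaging steps described above can be sketched as follows. This is only an illustration under assumptions: the dictionary layout of a data set, the function names `log_transform` and `average_by_info`, and the sample and patient identifiers are all hypothetical and are not part of this requirements statement.

```python
import math


def log_transform(data_set, base, new_name):
    """Return a new, renamed data set holding the log of every value."""
    if base not in (2, 10):
        raise ValueError("only log 2 and log 10 are supported")
    log = math.log2 if base == 2 else math.log10
    values = {
        row: {sample: log(v) for sample, v in samples.items()}
        for row, samples in data_set["values"].items()
    }
    return {"name": new_name, "description": data_set["description"], "values": values}


def average_by_info(data_set, info, new_name):
    """Average the columns that share the same key in the descriptive info."""
    values = {}
    for row, samples in data_set["values"].items():
        buckets = {}
        for sample, v in samples.items():
            buckets.setdefault(info[sample], []).append(v)
        values[row] = {key: sum(vs) / len(vs) for key, vs in buckets.items()}
    return {"name": new_name, "description": data_set["description"], "values": values}


# Hypothetical data: one gene measured in three samples from two patients.
raw = {
    "name": "raw",
    "description": "example intensities",
    "values": {"geneA": {"s1": 100.0, "s2": 1000.0, "s3": 10.0}},
}
sample_to_patient = {"s1": "patient1", "s2": "patient1", "s3": "patient2"}

logged = log_transform(raw, 10, "raw_log10")
averaged = average_by_info(logged, sample_to_patient, "raw_log10_avg")
```

Both functions return a new data set instead of changing the input, which matches the requirement that a transformed data set receives its own name.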
The software will create groupings that will divide data sets into separate parts. The groupings will be built from data sets or from additional data information. Groupings may have a null (ignore) group. There will be no limit to the number of individual groups that can be in a grouping set. The user will be able to give each grouping set a name and description.
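A minimal sketch of a grouping set with a null (ignore) group is shown below. The function name `build_grouping`, the `ignore` parameter, and the sample labels are assumptions made for illustration; the requirements only state that groupings divide a data set, may contain a null group, and carry a name and description.

```python
def build_grouping(name, description, labels, ignore=()):
    """Split sample ids into named groups; ignored labels go to the null group."""
    groups = {}
    null_group = []
    for sample, label in labels.items():
        if label in ignore:
            null_group.append(sample)
        else:
            # No limit is placed on how many distinct groups may appear.
            groups.setdefault(label, []).append(sample)
    return {
        "name": name,
        "description": description,
        "groups": groups,
        "null": null_group,
    }


# Hypothetical labels taken from one column of a descriptive info table.
labels = {"s1": "tumor", "s2": "normal", "s3": "unknown", "s4": "tumor"}
grouping = build_grouping(
    "tumor_vs_normal", "tissue type", labels, ignore=("unknown",)
)
```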
Individual scores will use a grouping and a data set to generate a data set that contains scores for each protein or gene. At score generation time, the user will be allowed to choose which score methods will be used. If more than one score type is requested, the scores will be generated and merged into a single score set without user intervention. Scores must be generated from groupings that have exactly two groups (besides the null group).
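The score generation step might look like the sketch below. The two score methods (`mean_diff` and `mean_ratio`) are placeholders chosen for illustration, not the score methods that WFCCM uses, and the data layout and function names are likewise assumptions. The sketch reads the two-group rule as requiring exactly two groups besides the null group.

```python
from statistics import mean


def mean_difference(a, b):
    return mean(a) - mean(b)


def mean_ratio(a, b):
    return mean(a) / mean(b)


# Placeholder score methods; the real methods are not specified here.
SCORE_METHODS = {"mean_diff": mean_difference, "mean_ratio": mean_ratio}


def generate_scores(values, groups, methods):
    """Score every row with each requested method and merge the results."""
    if len(groups) != 2:
        raise ValueError("scores need exactly two groups besides the null group")
    # Sort by group name so the order of the comparison is reproducible.
    (_, first), (_, second) = sorted(groups.items())
    score_set = {}
    for row, samples in values.items():
        a = [samples[s] for s in first]
        b = [samples[s] for s in second]
        score_set[row] = {m: SCORE_METHODS[m](a, b) for m in methods}
    return score_set


# Hypothetical data: samples in the null group are simply never listed.
values = {"geneA": {"s1": 4.0, "s2": 2.0, "s3": 6.0}}
groups = {"normal": ["s2"], "tumor": ["s1", "s3"]}
scores = generate_scores(values, groups, ["mean_diff", "mean_ratio"])
```

Each row ends up with one entry per requested method, so the merged score set is produced in a single pass without user intervention.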
Users will be able to add additional scores. Scores may come from text files or Excel files. The external scores will be merged into the existing score set.
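Merging external scores into an existing score set could be sketched as below. The function name `merge_scores` and the layout (row id mapped to a dictionary of named scores) are assumptions. The requirements do not say how a name clash should be handled; this sketch lets the external value win.

```python
def merge_scores(score_set, external):
    """Return a new score set with the external scores added row by row."""
    merged = {row: dict(scores) for row, scores in score_set.items()}
    for row, scores in external.items():
        # Assumption: rows found only in the external file are added as well.
        merged.setdefault(row, {}).update(scores)
    return merged


# Hypothetical scores; "lab_score" stands in for a column from an external file.
existing = {"geneA": {"mean_diff": -3.0}, "geneB": {"mean_diff": 1.0}}
external = {"geneA": {"lab_score": 0.7}, "geneC": {"lab_score": 0.1}}
merged = merge_scores(existing, external)
```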
The user will be able to save the program's state to a file and load it again later. The state file will be human readable in a text editor, so that the program is not required to understand how the scores were generated. The state will save all grouping, data set, and score set information.
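One possible way to meet the human-readable state requirement is an indented JSON file, sketched below. JSON is an assumption; the requirements only ask for a file that can be read in a text editor. The function names, the state layout, and the file name `project_state.json` are hypothetical.

```python
import json
import os
import tempfile


def save_state(state, path):
    """Write the project state as indented JSON so it reads well in an editor."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(state, handle, indent=2, sort_keys=True)


def load_state(path):
    """Read a project state previously written by save_state."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Hypothetical state holding grouping, data set, and score set information.
state = {
    "data_sets": [{"name": "raw_log10", "description": "log 10 of raw"}],
    "groupings": [{"name": "tumor_vs_normal", "groups": {"tumor": ["s1"], "normal": ["s2"]}}],
    "score_sets": [{"name": "scores1", "scores": {"geneA": {"mean_diff": -3.0}}}],
}

path = os.path.join(tempfile.mkdtemp(), "project_state.json")
save_state(state, path)
restored = load_state(path)
```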
The program will generate summary tables for the score files from a set of preset summaries. The user will be able to save the summary information to an Excel or text file.
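A summary table for a score set could be produced along the lines of the sketch below. The preset statistics (count, mean, minimum, maximum) and the tab-delimited text layout are assumptions, since the requirements do not list which summaries are preset. Writing an Excel file is not shown.

```python
import csv
import io
from statistics import mean

FIELDS = ["score", "n", "mean", "min", "max"]


def summarize(score_set):
    """Build one summary row per score method across all genes or proteins."""
    columns = {}
    for scores in score_set.values():
        for method, value in scores.items():
            columns.setdefault(method, []).append(value)
    return [
        {"score": m, "n": len(v), "mean": mean(v), "min": min(v), "max": max(v)}
        for m, v in sorted(columns.items())
    ]


def write_summary(rows, handle):
    """Write the summary rows as tab-delimited text."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical score set; a real file handle would be used in place of StringIO.
score_set = {"geneA": {"mean_diff": -3.0}, "geneB": {"mean_diff": 1.0}}
rows = summarize(score_set)
buffer = io.StringIO()
write_summary(rows, buffer)
```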
Data sets must follow the convention that is defined in the data submission guidelines.
Second phase goals: Integrate all of the functionality from the distance program into this program.
- Flexible function: text parser to evaluate the function.
- Sign and standardize.
- Pre-filter.
- Top N distance.
- Training/testing system.
Third phase goals: Add new functionality.
- Graphics generated by the program.
- Binning of raw data by the program.
Interface Requirements:
- Load Data
- From text files
- From Excel files
- Merge Data
- One or more data files may contribute
- Merge by row id
- Append data sets
- One or more data files may contribute
- Merge by column id
- Load Data Info
- From text files
- From Excel files
- Transform Data
- Attach Info to dataSets
- Average based on data Info
- Groupings
- From data sets
- From information sets
- View
- Modify
- Link multiple groupings to data set
- Write State
- Score Sets
- Create from groupings
- From external files
- Summary tables
Interface
+ DataSet1
+ DataSet2
- DataSet3
- Groups
| x Gp1
| x Gp2
| x Gp3
x InfoTable
x Distance
x Summary
DataSet Display
DataSet file location
ScoreSet file location
Remove
Add Gp
Create new data set based on transform
Groups Display
add/remove/show Scores
Generate all scores
Gp Context
Infotable column to use to create grouping
live view/edit grouping
remove
Gp Display
Summary statistics
Score file location
Summary Display
Summary statistics for all groups
Modify statistics
Write summary file
Distance Display
Show graph
Setup
Infotable display
File path
Show the table with the additional columns
File menu
save project
load project
Options menu
-- JeremyRoberts - 30 Jun 2004