DistanceCheckSheet (19 Apr 2004, JeremyRoberts)
   * Microarray or mass spec?
   * Average by sample location?
   * Run PROC MEANS on each patient to verify normalization
   * Create/verify peak definition file for each dataset (if not provided by investigator)
      * Peak ID: 1-n
      * Patient ID
      * Sample location # for patient
      * Spectra # for sample location
      * Group
      * Easy ID for investigator (uniform length)
   * Create/verify patient definition file for each dataset (if not provided by investigator)
      * Patient ID: 1-n
      * Patient name: a name the investigator can use to easily identify the patient
      * Number of sample locations
      * Number of spectra
   * If averaging, create averaged peak definition file (SasAverageProgramTemplate)
      * Peak ID: 1-n
      * Patient ID
      * Group
      * Easy ID for investigator (uniform length)
   * Create/verify gene/protein definition file
      * Gene/protein ID: 1-n
      * Name/label given by investigator
   * Create group definition file
      * Show number of patients/sample locations
      * If more than one data set, define and name each
      * Define each grouping for each data set
      * Verify with investigator
   * Group definition file
      * Number of patients
      * Number of sample locations per patient
      * Number of spectra
      * Number of patients/sample locations in each group
      * Number of groupings
   * Generate scores
      * Mass spec (log 10)
         * WGA - log
         * SAM - log
         * Info - log
         * TTest - log
         * Fisher - 0 = 0, >0 = 1
         * Wilcoxon - raw
      * Microarray (log 2)
         * WGA - log
         * SAM - log
         * Info - log
         * TTest - log
   * Create summary chart for scores
   * Distance run
      * Verify sign function
      * Verify distance function
      * Distance graph labels for each grouping
      * Verify training data sizes against group definition
      * Verify testing data sizes against group definition
   * Training error checking
      * Sample detailed distance output, 1-2 per grouping
      * Verify sign function
      * Verify distance function
      * Verify grouping
      * Verify groups
   * General tips
      * SAS
         * Make sure that the ttest/fisher/wilcoxon script has the correct number of genes/proteins
      * Averaging
         * Ignore 0 (average only peaks)
      * Logging
         * Log only peaks > 0; 0s (non-peaks) stay 0
         * Average before taking the log. This will protect the data distribution (2004-04-19).

-- Main.JeremyRoberts - 09 Mar 2004
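The averaging, logging, and Fisher rules in the checklist above can be illustrated with a small sketch. It is written in Python rather than SAS, and the function names (`average_peaks`, `log_peak`, `fisher_indicator`, `averaged_log_score`) are hypothetical, not taken from the project's scripts. It assumes a value of 0 marks a non-peak and that intensities are non-negative.

```python
import math


def average_peaks(intensities):
    """Average only the peaks (values > 0); zeros are ignored.

    Returns 0.0 when no value is a peak, so a non-peak stays 0.
    """
    peaks = [x for x in intensities if x > 0]
    return sum(peaks) / len(peaks) if peaks else 0.0


def log_peak(value, log_fn=math.log10):
    """Log-transform a peak; a non-peak (0) stays 0.

    log_fn is math.log10 for mass spec and math.log2 for microarray,
    following the score table in the checklist.
    """
    return log_fn(value) if value > 0 else 0.0


def fisher_indicator(value):
    """Binarize for the Fisher score: 0 stays 0, anything > 0 becomes 1."""
    return 1 if value > 0 else 0


def averaged_log_score(intensities, log_fn=math.log10):
    """Average first, then take the log (not the other way around)."""
    return log_peak(average_peaks(intensities), log_fn)


if __name__ == "__main__":
    spectra = [0, 100, 100]  # made-up intensities for one sample location
    print(average_peaks(spectra))            # mean of the two peaks only
    print(averaged_log_score(spectra))       # log10 of that mean
    print([fisher_indicator(x) for x in spectra])
```

One thing this sketch makes visible: a peak with intensity exactly 1 also maps to 0 after logging, so it becomes indistinguishable from a non-peak. Whether that matters depends on the intensity scale of the dataset.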