Data Quality Control



  • Ability to generate ICC, Kappa value, agreement rate, and correlation values from a data set.
    • Ability to generate Kappa value, agreement rate, and correlation values from a data set that is accompanied by an information set.

TODO - July 7th

  • add a control statement for the case where the information set doesn't match the data set


  • Kappa : Kappa measures the percentage of data values in the main diagonal of the two-way table of ratings and then adjusts this percentage for the amount of agreement that could be expected due to chance alone.
    Kappa is always less than or equal to 1. A value of 1 implies perfect agreement and values less than 1 imply less than perfect agreement.
    In rare situations, Kappa can be negative. This is a sign that the two observers agreed less than would be expected just by chance.
  • Agreement : Measures the strength of agreement between the two raters; a common interpretation scale:
    • Poor agreement = Less than 0.20
    • Fair agreement = 0.20 to 0.40
    • Moderate agreement = 0.40 to 0.60
    • Good agreement = 0.60 to 0.80
    • Very good agreement = 0.80 to 1.00
  • Pearson Correlation (parametric): It measures the strength of a linear relationship between two variables.
    • -1.0 to -0.7 strong negative association.
    • -0.7 to -0.3 weak negative association.
    • -0.3 to +0.3 little or no association.
    • +0.3 to +0.7 weak positive association.
    • +0.7 to +1.0 strong positive association.
  • R^2 : R^2 is the square of the correlation coefficient. It equals the proportion of the variation in one variable that is related to the variation in the other.
  • Spearman Correlation (Nonparametric): It measures the strength of a linear relationship between the ranks of the two variables (i.e., a monotonic relationship).
  • 95% confidence interval : Confidence intervals are constructed at a confidence level, such as 95%. It means that if the same population is sampled on numerous occasions and interval estimates are made on each occasion, the resulting intervals would bracket the true population parameter in approximately 95% of the cases.
  • ICC : ICC stands for Intraclass Correlation. It assesses rating reliability by comparing the variability of different ratings of the same subject to the total variation across all ratings and all subjects. There are 3 types of ICC:
    • Case 1: Raters for each subject are selected at random.
    • Case 2: The same raters rate each case; they are a random sample of raters.
    • Case 3: The same raters rate each case; they are the only raters.
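The agreement and correlation measures above can be sketched in plain Python. This is a minimal illustration, not the Wfccm/SAS implementation, and the function names are our own: agreement rate as the fraction of matching pairs, Cohen's Kappa as chance-corrected agreement, Pearson correlation on raw values, and Spearman correlation as Pearson applied to average ranks.

```python
from collections import Counter

def agreement_rate(x, y):
    """Fraction of paired observations on which the two raters agree."""
    assert len(x) == len(y)
    return sum(a == b for a, b in zip(x, y)) / len(x)

def cohens_kappa(x, y):
    """Observed agreement adjusted for the agreement expected by chance."""
    n = len(x)
    po = agreement_rate(x, y)                       # observed agreement
    cx, cy = Counter(x), Counter(y)
    pe = sum(cx[c] * cy[c] for c in set(x) | set(y)) / (n * n)  # chance agreement
    return (po - pe) / (1 - pe)

def pearson_r(x, y):
    """Strength of the linear relationship between two variables."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def ranks(x):
    """Average ranks (ties share the mean rank), as Spearman requires."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1                                  # extend over tied values
        avg = (i + j) / 2 + 1                       # mean of the tied positions
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman_rho(x, y):
    """Pearson correlation applied to the ranks of the two variables."""
    return pearson_r(ranks(x), ranks(y))
```

For example, with ratings `[1, 1, 2, 2, 3]` and `[1, 1, 2, 3, 3]` the observed agreement is 0.8, while Kappa is lower (about 0.71) because part of that agreement would occur by chance.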

Comparison Rationale of Agreement and Correlation

  • Input files: DataSet, InformationSet(optional)
  • Conditions of comparison
    • Condition 1 (default; without grouping, no additional InformationSet is required):
      • overall detailed agreement rate
      • average of overall agreement rates
    • Condition 2 (within group; InformationSet is required):
      • detailed agreement rate
      • average of agreement rates
    • Condition 3 (one subject in one group crossed with all subjects in other groups; InformationSet is required):
      • detailed agreement rate
      • average of agreement rates
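Conditions 1 and 2 can be sketched as follows. This is a hedged illustration under an assumed data layout (two raters' labels in parallel lists, with the InformationSet reduced to a per-observation group label), not the actual Wfccm code; it also includes the control statement from the TODO for a mismatched information set.

```python
from collections import defaultdict

def agreement_rate(x, y):
    """Fraction of paired observations on which the two raters agree."""
    return sum(a == b for a, b in zip(x, y)) / len(x)

def overall_agreement(rater1, rater2):
    """Condition 1: one overall detailed agreement rate, no grouping."""
    return agreement_rate(rater1, rater2)

def within_group_agreement(rater1, rater2, groups):
    """Condition 2: detailed agreement rate per group, plus the average
    across groups. `groups` plays the role of the optional InformationSet."""
    if not (len(rater1) == len(rater2) == len(groups)):
        # control statement: the InformationSet must match the DataSet
        raise ValueError("InformationSet does not match DataSet")
    by_group = defaultdict(lambda: ([], []))
    for a, b, g in zip(rater1, rater2, groups):
        by_group[g][0].append(a)
        by_group[g][1].append(b)
    detailed = {g: agreement_rate(xs, ys) for g, (xs, ys) in by_group.items()}
    average = sum(detailed.values()) / len(detailed)
    return detailed, average
```

For example, with raters `[1, 1, 2, 2]` and `[1, 2, 2, 2]` and groups `['A', 'A', 'B', 'B']`, group A agrees at 0.5, group B at 1.0, and the average of agreement rates is 0.75.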

Wfccm Quality Control Design

Agreement/Kappa Algorithm

Correlation Coefficient Algorithm

ICC related information

Quality Control How-To

Progress and Goal

| TimeLine | Joan | Bashar & Shuo | Status |
| April 18 - April 29 | 1. finish specification of Kappa/Agreement 2. implement Kappa/Agreement 3. testing | 1. doing research on Correlation 2. doing research on Mixed Model | |
| May 2 - May 13 (13 days) | 1. finish specification of Correlation 2. implement Correlation 3. testing | doing research on Mixed Model | 100% |
| May 16 - May 27 (10 days) | 1. adding Kappa/Agreement and Correlation into interface 2. testing through interface | 1. find related information on R and, hopefully, sample R code that generates the same results as SAS 2. specification of ICC | |
| May 31 - June 15 (15 days) | preparation of ICC implementation | 1. ICC algorithm 2. testing Kappa/Correlation through interface | |
| June 6 - July 13 (28 days) | 1. implement ICC 2. improve interface | | |
| July 14 - July 29 (12 days) | 1. adding ICC to interface 2. testing | | due to the difficulty of the ICC algorithm, we use either SAS or R instead of adding it into Wfccm |

Topic revision: r20 - 18 Jul 2005, JoanZhang
