PATHO dataset

Local lab pathology results. It should contain entry biopsy, for-cause biopsy, and surgery dates.

BIOPSY dataset

Should contain the 24-month scheduled biopsies (Visit 6) and the 48-month scheduled biopsies (Visit 10).

PATHOLOG dataset

Central lab pathology results. It is important to note that the Central Lab pathology results for all data in PATHO and BIOPSY should be in the PATHOLOG dataset. These Central Lab pathology results are of primary interest (as opposed to the local pathology lab results provided in the PATHO dataset). Hence, in theory, you should be able to match every biopsy or surgery date in PATHO and BIOPSY with the corresponding Central Lab pathology results in PATHOLOG by subject number, procedure type and procedure date.
  1. Cancer data
  2. Gleason data
  3. HGPIN
  4. % core involved and # of cores with cancer
  5. Treatment alteration scores
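
The matching described above (by subject number, procedure type and procedure date) can be checked with a short script. The following is only a minimal sketch: PRCDT is the date variable named on this page, while the field names `subject` and `proc_type` and the sample records are hypothetical and would need to be replaced with the real variable names.

```python
from datetime import date

def match_key(record):
    # Key named in the text: subject number, procedure type, procedure date.
    # "subject" and "proc_type" are hypothetical field names.
    return (record["subject"], record["proc_type"], record["PRCDT"])

def unmatched_records(local_records, central_records):
    """Return PATHO/BIOPSY records that have no PATHOLOG counterpart."""
    central_keys = {match_key(r) for r in central_records}
    return [r for r in local_records if match_key(r) not in central_keys]

# Made-up example rows, for illustration only.
patho = [
    {"subject": 101, "proc_type": "entry biopsy", "PRCDT": date(2003, 5, 1)},
    {"subject": 102, "proc_type": "for-cause biopsy", "PRCDT": date(2004, 2, 9)},
]
patholog = [
    {"subject": 101, "proc_type": "entry biopsy", "PRCDT": date(2003, 5, 1)},
]

missing = unmatched_records(patho, patholog)
for r in missing:
    print("No Central Lab result for", r["subject"], r["proc_type"], r["PRCDT"])
```

Records listed by such a check are the ones for which Central Lab results are still expected in PATHOLOG.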

postbaseBiop dataset

This dataset includes the biopsy results (unscheduled, scheduled, surgery) for each subject. Subjects may have multiple records, but each record will have a unique biopsy date (PRCDT).

Data Reduction Rules for Creating postbaseBiop Dataset and for Reporting Biopsy

As the data show, a biopsy procedure involves examining several cores, so a subject may have different results for different cores. Moreover, a subject may have several unscheduled biopsies. Reporting frequencies of cancer occurrence and other results without preliminary data processing would be incorrect, since one subject's biopsy result would be counted several times. Therefore the following data reduction rules will be applied:
  1. If a subject has several biopsy results for the same date, the worst biopsy result will be chosen.
  2. If a subject has more than one unscheduled biopsy, the latest biopsy result will be chosen.

These are general rules; their application varies across variables. For example, for Gleason score the first rule should be worded differently, but the idea is the same.
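
The two general rules can be sketched as follows. This is an illustration under stated assumptions, not the actual reduction program: `severity` is a hypothetical numeric ranking in which a higher value means a worse result, and `subject` and `scheduled` are hypothetical field names; only PRCDT comes from this page.

```python
from datetime import date

def worst_per_date(rows):
    """Rule 1: collapse rows sharing (subject, PRCDT) to the worst result.

    "severity" is a hypothetical ranking; higher means worse.
    """
    worst = {}
    for row in rows:
        key = (row["subject"], row["PRCDT"])
        if key not in worst or row["severity"] > worst[key]["severity"]:
            worst[key] = row
    return list(worst.values())

def latest_unscheduled(rows):
    """Rule 2: keep scheduled rows as they are and, per subject,
    only the unscheduled row with the latest PRCDT."""
    kept = []
    latest = {}
    for row in rows:
        if row["scheduled"]:
            kept.append(row)
            continue
        subj = row["subject"]
        if subj not in latest or row["PRCDT"] > latest[subj]["PRCDT"]:
            latest[subj] = row
    return kept + list(latest.values())

# Made-up rows: two cores from one biopsy date, plus a later unscheduled biopsy.
rows = [
    {"subject": 1, "PRCDT": date(2004, 3, 1), "scheduled": False, "severity": 0},
    {"subject": 1, "PRCDT": date(2004, 3, 1), "scheduled": False, "severity": 2},
    {"subject": 1, "PRCDT": date(2004, 9, 15), "scheduled": False, "severity": 1},
]

reduced = latest_unscheduled(worst_per_date(rows))
```

Applying rule 1 before rule 2 means each biopsy date is first reduced to a single record, and only then is the latest unscheduled biopsy selected.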

(Date: Wed, 24 Aug 2005 15:40:54) Your item #1 refers to multiple slides/cores for a given biopsy, and your item #2 refers to multiple unscheduled biopsies for a given subject. As you note, various assessments will have differing considerations regarding data summarization. Here is my current understanding regarding the appropriate approach to summarize the endpoints:
  1. prostate cancer diagnosis - this is straightforward (subject considered diagnosed with prostate cancer if any one slide or core is diagnosed accordingly)

  2. Gleason sum (sum of Gleason primary and secondary scores) - the database contains both the per-core/per-slide value (GLSC) and the overall biopsy value (OVGLSM). OVGLSM is the variable of primary interest. Unfortunately sometimes OVGLSM is not populated but GLSC is (I think that we have already discussed this). It could be possible to see two biopsies for one subject, both positive for prostate cancer and both with a Gleason sum (this is not likely to happen often but could if first biopsy showed low Gleason sum and then the subject was rebiopsied later to see if things worsened). Interest is focused on Gleason grade at initial diagnosis.

  3. % core involved and number of cancer positive cores - values available on a per-core/per-slide basis. Interest is focused on values at initial diagnosis.

  4. HGPIN occurrence and ASAP occurrence - values available on a per-core/per-slide basis. Interest is focused on subjects who have any HGPIN and who have any ASAP in the given time period.

  5. treatment alteration scores - similar to Gleason sum, values are available on both a per-core/per-slide basis and an overall basis, with interest focusing on the latter. For multiple biopsies use the values from the latest biopsy in the time period of interest.

  6. acute inflammation, chronic inflammation and atrophy - values available on a per-core/per-slide basis only. For multiple values use the maximum score in the time period of interest.
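
Three of the summarization approaches listed above (any positive core, latest overall value, maximum per-core score) can be sketched as below. This is a rough illustration only: the field names `cancer`, `acute_infl` and `tx_alt` and the sample values are hypothetical, the rows are assumed to be already restricted to one subject and one time period, and the analysis plan itself is still in draft.

```python
from datetime import date

def cancer_diagnosed(core_rows):
    """Diagnosed if any one slide or core is positive for cancer."""
    return any(row["cancer"] for row in core_rows)

def max_score(core_rows, field):
    """Maximum per-core score, skipping missing values (None if all missing)."""
    values = [row[field] for row in core_rows if row[field] is not None]
    return max(values) if values else None

def latest_overall(biopsies, field):
    """Overall value taken from the biopsy with the latest PRCDT."""
    if not biopsies:
        return None
    return max(biopsies, key=lambda b: b["PRCDT"])[field]

# Made-up per-core rows for one subject in one time period.
cores = [
    {"cancer": False, "acute_infl": 1},
    {"cancer": True, "acute_infl": None},
    {"cancer": False, "acute_infl": 2},
]
# Made-up per-biopsy rows carrying an overall treatment alteration score.
biopsies = [
    {"PRCDT": date(2004, 1, 10), "tx_alt": 1},
    {"PRCDT": date(2005, 6, 2), "tx_alt": 3},
]

diagnosed = cancer_diagnosed(cores)
worst_inflammation = max_score(cores, "acute_infl")
last_tx_alt = latest_overall(biopsies, "tx_alt")
```

Gleason sum and the values at initial diagnosis are deliberately left out, because the handling of an unpopulated OVGLSM is not settled on this page.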

Please note that the analysis plan is in draft form and so the above is subject to change. We plan to summarize the post-baseline pathology data overall and by time period (years 1-2, years 3-4), combining the scheduled and unscheduled data together. There will likely also be interest in looking at these by biopsy type (scheduled vs. unscheduled) - especially prostate cancer occurrence and Gleason sum.
  1. To be continued...

The For-Cause Biopsy spreadsheet

It is provided by the Central Lab (Bostwick Labs) on an irregular basis and is essentially a log-in record. That is, it is an operational record of when biopsy material is checked into the Central Lab, with various other data attached. It does not replace PATHOLOG and is to be used only as supplemental information to indicate how many biopsies have been received.


"ASAP" indicates presence of atypical small acinar proliferation, "CAG" means cancer diagnosed.

Biopsy Reporting

Final Reporting

Final reporting of the biopsy data should be based on the PATHOLOG dataset alone, since it contains the Central Lab pathology results (protocol section 6.3.2).

Interim Reporting

PATHOLOG should be the primary source of the data (since it contains the Central Lab pathology results), supplemented by PATHO, BIOPSY and FOR-CAUSE-BIOPSY. The for-cause biopsy spreadsheet can be used if there is no corresponding PATHOLOG data, with the understanding that data used in this way should be considered tentative. If any data inconsistencies arise, PATHOLOG is preferred to PATHO, BIOPSY and FOR-CAUSE-BIOPSY, and FOR-CAUSE-BIOPSY is preferred to PATHO.
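
The source preference for interim reporting can be sketched as a simple lookup. This is a hedged illustration: the text does not say where BIOPSY ranks relative to FOR-CAUSE-BIOPSY and PATHO, so its position in the list below is an arbitrary assumption, and the record contents are made up.

```python
# Order of preference. PATHOLOG first and FOR-CAUSE-BIOPSY ahead of PATHO
# follow the text; the position of BIOPSY is an assumption.
PRECEDENCE = ["PATHOLOG", "FOR-CAUSE-BIOPSY", "BIOPSY", "PATHO"]

def pick_source(candidates):
    """Pick the preferred record for one biopsy.

    candidates maps a source name to that source's record.
    Returns (source, record, tentative); tentative is True when the
    record comes from the for-cause biopsy spreadsheet.
    """
    for source in PRECEDENCE:
        if source in candidates:
            return source, candidates[source], source == "FOR-CAUSE-BIOPSY"
    return None, None, False

# Made-up example: no PATHOLOG record yet, so the spreadsheet entry is used.
source, record, tentative = pick_source({
    "PATHO": {"result": "CAG"},
    "FOR-CAUSE-BIOPSY": {"result": "ASAP"},
})
```

Carrying the tentative flag with each record makes it easy to list which interim results still await Central Lab data.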
Topic revision: r7 - 05 Jan 2006, SvetlanaEden
