1 Summary of Project Requests


Title: Epidemiological characterization and clinical outcomes of LAC-Caribbean (LAC) Patients Living with HIV and Viro–Immunological Discordance

Principal Investigators: Claudia Cortes Moncada, Gabriel Castillo Rozas

Database: CCASAnet Database

Biostatisticians: Shengxin Tu, Bryan Shepherd

1.1 Aims

  • Aim 1: Estimate the proportion of patients on ART under medical surveillance at LAC HIV centers who meet criteria for viro-immunological discordance. Describe clinical hallmarks of LAC patients living with HIV and viro-immunological discordance.
  • Aim 2: Describe risk factors for failing to achieve immune recovery during follow-up after ART initiation
  • Aim 3: Assess overall survival during follow-up in LAC patients living with HIV and viro-immunological discordance.
  • Aim 4: Estimate risk of specific diseases, including cardiovascular events, non-AIDS defining cancer, AIDS defining cancer, opportunistic infections

1.2 Hypotheses

LAC patients living with HIV and viro-immunological discordance are at higher risk for all-cause mortality in comparison to patients living with HIV without viro-immunological discordance

1.3 Variables

  • Dependent variables (outcomes) of interest
    • All-cause mortality

    • Viro-immunological concordance/discordance

      There is a lack of consensus to define VIC/VID. Here we use absolute CD4 T-cell count and increase in CD4 T-cell count after two years of ART treatment to characterize immunological performance.

    • Diseases of interest

      • Cardiovascular events (including acute myocardial infarction, stroke and heart failure)
      • Non–AIDS defining cancer
      • AIDS defining cancer
      • Opportunistic infections

Notes: CD4 T cell count at ART initiation was defined as the CD4 T cell count of the day which was closest to ART initiation and within 180 days before or 30 days after ART initiation. CD4 T cell count at 2 years was defined as the CD4 T cell count of the day which was closest to 2 years of ART treatment and within 3 months before or 3 months after 2 years.

  • Predictors of interest
    • Epidemiological information
      • Sex
      • Age at ART initiation
      • HIV transmission route
      • History of tuberculosis
      • HIV center
    • HIV-related information
      • CDC stage at ART initiation
      • AIDS-defining event prior to ART initiation
      • Viral load at ART initiation, defined as the viral load of the day closest to ART initiation and between 180 days before and 1 week after ART initiation
      • CD4 T cell lymphocytes count at ART initiation, defined as the CD4 T cell count of the day which was closest to ART initiation and within 180 days before or 30 days after ART initiation.
      • Days from diagnosis to ART initiation

2 Statistical Analysis Plan


  • Aim 1
    • Table 1 Baseline characteristics

      In order to describe the baseline characteristics in Table 1, we will divide patients into four groups by quartiles of increase in CD4 count after 2-year treatment. All the baseline characteristics will be summarized by groups. Mean with standard deviation and median with 25th and 75 percentiles will be provided for continuous variables, proportions and frequency will be calculated for categorical variables. The percentages of missingness will also be shown in the table. Further, p-values for testing the difference in the variables between groups will be presented. Kruskal-Wallis rank sum test will be used for for continuous variables and Chi-square test will be used for categorical variables.

    • Figure 1 Longitudinal CD4 T cell counts within 2 years after ART initiation by groups

  • Aim 2
    • We will build a cumulative probability model for increase in CD4 count. The explanatory variables will be chosen from epidemiological variables and HIV-related variables, including sex, HIV center, HIV transmission route, age at ART initiation, CD4 T cell count at ART initiation (cell/ml), history of TB, history of AIDS-defining events, days from HIV diagnosis to ART initiation, viral load at ART initiation.
    • Table 2 Odds ratios of the cumulative probability model
  • Aim 3
    • Figure 2 Survival Probability within 10 years after ART initiation by groups

      Kaplan-Meier estimators will be used to estimate the survival probability of each group within 10 years after ART initiation. Log-rank test will be used to test for the difference in survival probability between the four groups.

    • We will build a Cox proportional hazard model for survival on increase in CD4 count. The covariates will be chosen from the factors identified by the analysis in Aim 2

  • Aim 4
    • Table 2 The number of censor, event and death will be summarized for each disease by groups.
    • We will use cumulative incidence analysis to estimate marginal probability of the diseases in the presence of death. The difference in the cumulative incidence of each disease across the four groups will be tested by Gray’s test. The estimated probability of the diseases will be visualized.
    • Figure 3 Risk of specific diseases over time
  • The significant level is 0.05.

3 Apply Inclusion/Exclusion Criteria


  • Inclusion criteria:
    1. Age \(\geq\) 18 years old with confirmed HIV infection and under medical follow–up at an HIV center for at least 18 months after treatment initiation.
    2. Receiving a 3 drug-based ART regime for at least 6 months without modifications.
    3. Having sustained undetectable viral loads (< 200 copies/mL) during follow–up once ART is initiated (detection limit <= 200 copies/ml)
  • Exclusion criteria:
    1. History of a 1 or 2 drug-based ART regimes.
    2. Patients without CD4 lab between 180 days before and 30 days after treatment initiation.
  • Sample size:
    1. The total sample size of the whole CCASAnet Database was 49449. After the inclusion/exclusion criteria was applied, the sample size for this study was 4756.
Consort diagram of applying inclusion/exclusion criteria
(Scroll down for more content)
#read in datasets 

basic <- read.csv("ccasanetdatabase_2021jan21/basic.csv", stringsAsFactors = F)
#This is the main table and contains one row per patient. All patients in the cohort must be represented in this table. (49449 patients)

follow <- read.csv("ccasanetdatabase_2021jan21/follow.csv", stringsAsFactors = F)
#This table is used for death and loss-to-follow-up information. Patients have no more than one record in this table. (49449 patients)

art <- read.csv("ccasanetdatabase_2021jan21/art.csv", stringsAsFactors = F)
#This table captures ARV Therapy. (45332 patients)

visit <- read.csv("ccasanetdatabase_2021jan21/visit.csv", stringsAsFactors = F)
#This table documents all the clinical encounters (visits) with patient. Longitudinal clinical information is stored in this table. (44111 patients)

rna <-  read.csv("ccasanetdatabase_2021jan21/lab_rna.csv", stringsAsFactors = F) #37132 patients 

cd4 <-  read.csv("ccasanetdatabase_2021jan21/lab_cd4.csv", stringsAsFactors = F) #43093 patients 

ce <-  read.csv("ccasanetdatabase_2021jan21/ce.csv", stringsAsFactors = F)
#This table is used to capture in one place all the patients’ clinical endpoints (opportunistic infections or other co-morbid conditions including Non-AIDS Defining Events). 21394 patients

# Inclusion criteria:
# 1.    Age ≥ 18 years old with confirmed HIV infection and under medical follow – up at an HIV center for at least 18 months after treatment initiation
# 2.    Receiving a 3 drug-based ART regime for at least 6 months without modifications
# 3.    Having sustained undetectable viral loads during follow – up once ART (3 and more drug-based) is initiated (< 40 copies/mL); only blips (< 200 copies/ml) are allowed
# 
# Exclusion criteria:
# 1.    History of a 1- or 2 drug-based ART regimes

dat <- basic
#join tblfollow and tblbasic
dat <- follow %>% right_join(dat, by = "patient")
#obtain the last visit day from tblvisit 
dat <- visit %>% 
  group_by(patient) %>% 
  slice(n()) %>% 
  rename(l_visit_d = visit_d) %>% 
  right_join(dat, by = "patient") 
#select patients without history of a 1 or 2 drug-based ART regimes
idx <- unique(art[art$art_class == "non-HAART",]$patient)
dat <- dat[!(dat$patient %in% idx),]
#got the first date of HAART from tblART
dat <- art[!(art$patient %in% idx), ] %>% 
  mutate(art_sdate = strptime(art_sd, format = "%Y-%m-%d")) %>% 
  filter(art_class == "HAART") %>% 
  group_by(patient) %>% 
  arrange(art_sdate,.by_group = T) %>%
  slice(1) %>% 
  select(patient, art_sdate, art_sd) %>% 
  right_join(dat, by = "patient")
dat <- dat[!is.na(dat$art_sdate),]
#age = 1st HAART treatment date - birth_d
dat$age_trt <- difftime(dat$art_sdate, strptime(dat$birth_d, format = "%Y-%m-%d"), units = "days")/365
#select age >=18 
dat <- dat[dat$age_trt >= 18, ]
#select patients under follow-up for >= 18 months (18*30), follow-up = last visit date - haart start date
idx <- which(difftime(strptime(dat$l_visit_d, format = "%Y-%m-%d"), dat$art_sdate, units = "days") >= 18*30)
dat <- dat[idx, ]
#calculate the time of taking drugs, art_ed - art_sd and l_alive_d - art_sd if art_ed=""
idx <- unique(art[art$art_class == "non-HAART",]$patient)
art_sub <- art[!(art$patient %in% idx),] %>% 
            filter(patient %in% dat$patient)%>% 
            left_join(dat, by = "patient")
art_sub$haart_len <- difftime(strptime(art_sub$art_ed, format = "%Y-%m-%d"), strptime(art_sub$art_sd.x, format = "%Y-%m-%d"), units = "days")
art_sub[art_sub$art_ed== "", ]$haart_len <- difftime(strptime(art_sub[art_sub$art_ed== "", ]$l_alive_d, format = "%Y-%m-%d"), strptime(art_sub[art_sub$art_ed== "", ]$art_sd.x, format = "%Y-%m-%d"), units = "days")
#select patients receiving HAART for at least 6 months (6*30) without modifications
art_sub <- art_sub %>% filter(haart_len >= 6*30)
dat <- dat[dat$patient %in% unique(art_sub$patient),]

#tblRNA, <200 copies/ml, maximum detection limits = 500 
## Having sustained undetectable viral loads during follow – up once HAART is initiated 
idx <- rna %>%
  filter(patient %in% dat$patient) %>%
  left_join(dat, by = "patient") %>% 
  mutate(rna_date = strptime(rna_d, format = "%Y-%m-%d")) %>%
  group_by(patient) %>% 
  filter(rna_date > art_sdate) %>% 
  mutate(viro = sum((rna_v < 200) & (rna_v >= -200)), count = n(), test = viro - count) %>%
  filter(test == 0) %>% 
  select(patient)
dat <- dat[dat$patient %in% idx$patient,]

#include patients with CD4 lab at most 180 days before or at most 30 days after treatment initiation
idx <- cd4 %>% 
  filter(patient %in% dat$patient) %>% 
  left_join(dat, by ="patient") %>% 
  mutate(d = difftime(strptime(cd4_d, format = "%Y-%m-%d"), art_sdate, units = "days")) %>%
  filter((d <= 30 & d > 0) | (d >= -180 & d <=0 )) %>% 
  select(patient)
dat <- dat[dat$patient %in% idx$patient,]
#merge tblcd4 with dat, keep CD4 lab before last visit
#CD4 lab between art start date -180/+30 as baseline 
dat <- cd4 %>% 
  filter(patient %in% dat$patient) %>% 
  left_join(dat, by ="patient") %>% 
  mutate(cd4_date = strptime(cd4_d, format = "%Y-%m-%d")) %>%
  filter(cd4_date <= strptime(l_visit_d, format = "%Y-%m-%d")) %>% 
  mutate(cd4_times = difftime(cd4_date, art_sdate, units = "days")) %>% 
  filter(cd4_times >= -180) 
#obtain the closest date to baseline
dat_sub <- dat %>% 
  filter(cd4_times < 0) %>% 
  group_by(patient) %>% 
  arrange(cd4_times,.by_group = T) %>%
  slice(n())
dat <- rbind(dat_sub, dat %>% filter(cd4_times >=0)) %>% 
    group_by(patient) %>% 
    arrange(cd4_times,.by_group = T)

#obtained cd4 at 24+/-3 months of HAART initiation 
# For the patients having CD4 lab at the same days before and after 24 months, the CD4 lab after 24 months is used.
dat_sub2yr <- dat %>% 
  group_by(patient) %>% 
  mutate(d = abs(cd4_times - 365*2))%>% 
  filter(d <= 90) %>% 
  arrange(d, desc(cd4_times),.by_group = T) %>%
  slice(1) %>% 
  select(-d)
#calculate CD4 increase after 24+/-3 months 
dat_sub <- dat %>% 
  group_by(patient) %>%
  arrange(cd4_date,.by_group = T) %>%
  slice(1) %>% 
  select(patient, cd4_v, cd4_per) %>%
  rename(f_cd4_v = cd4_v, f_cd4_per = cd4_per) %>% 
  right_join(dat_sub2yr, by = "patient") %>% 
  mutate(cd4_incr = cd4_v - f_cd4_v) %>% 
  rename(cd4_v_2yr = cd4_v)
#use quartiles (138, 238, 362) of increase in CD4 after 2 years to divide patients into four groups
lbs <- paste("Increase",c("<150","=[150,250)","=[250,350]",">350"),sep="")
dat_sub <- dat_sub %>% mutate(Group = cut(cd4_incr, 
                            breaks = c(-Inf, 150, 250, 350, Inf), right=F, include.lowest= T,
                            labels = lbs))
dat_sub[dat_sub$cd4_incr == 350,]$Group <- "Increase=[250,350]"
dat_sub <- dat %>% 
  filter(patient %in% dat_sub$patient) %>% 
  left_join(dat_sub[,c("patient","Group", "f_cd4_v", "cd4_incr", "cd4_v_2yr")], by = "patient")


########################################baseline characteristics
##################epidemiological information (sex, age at diagnosis, HIV transmission route, sexual behavior, history of tuberculosis)
b <- dat_sub
#sex
b$sex <- factor(b$male_y, levels = c(0, 1), labels = c("Female", "Male"))
label(b$sex) <- "Sex"

#age at treatment initiation 
b$age_trt <- as.numeric(b$age_trt)
label(b$age_trt) <- "Age at HAART initiation (years)"

#HIV transmission route
#collapse levels of HIV transmission route 
b$hivtrans <- "Other"
b$hivtrans[b$mode %in% c("Homosexual contact","Bisexual")] <- "Homosexual/Bisexual contact"
b$hivtrans[b$mode == "Heterosexual contact"] <- "Heterosexual contact"
b$hivtrans[b$mode == "Unknown"] <- "Unknown"
label(b$hivtrans) <- "HIV transmission route"

#history of tuberculosis before treatment initiation 
##no missing dates of clinical outcomes for ade_tuberculosis
idx <- ce %>% 
  filter(patient %in% b$patient) %>% 
  filter(ce_id == "ade_tuberculosis") %>% 
  left_join(b[,c("patient","art_sdate")] %>% distinct(), by = "patient") %>% 
  mutate(tb_d = strptime(ce_d, format = "%Y-%m-%d")) %>% 
  filter(tb_d < art_sdate) %>% 
  select(patient) %>% 
  distinct()
b <- b %>% 
  mutate(tb_history = 0) %>% 
  mutate(tb_history = replace(tb_history, patient %in% idx$patient, 1))
b$tb_history <- factor(b$tb_history, levels = c(0, 1), labels = c("No", "Yes"))
label(b$tb_history) <- "History of tuberculosis"

#HIV-related information (CDC stage at treatment initiation, history of opportunistic infections before treatment initiation, viral load at treatment initiation, CD4-T cell lymphocytes count prior to therapy initiation, time from diagnosis to HAART initiation).

#CDC stage at treatment initiation
#obtain the visit date closest to treatment initiation
#196 patients only had visit date after treatment initiation 
b <- visit %>% 
  filter(patient %in% b$patient) %>% 
  left_join(b[,c("patient","art_sdate")] %>% distinct(), by = "patient") %>% 
  mutate(visit_date = strptime(visit_d, format = "%Y-%m-%d")) %>% 
  mutate(d = difftime(art_sdate, visit_date, units = "days"), i = ifelse(d >= 0, 0, 1)) %>% 
  group_by(patient) %>% 
  arrange(i, abs(d), .by_group = T) %>% 
  slice(1) %>% 
  rename(cdc_history = cdcstage, cdc_history_date = visit_date) %>% 
  select(patient, cdc_history, cdc_history_date) %>% 
  right_join(b, by = "patient")
b[b$cdc_history == "",]$cdc_history <- "Unknown"

b$cdc_history <- factor(b$cdc_history, levels=c("A", paste("A",1:3, sep=""), 
                                                "B",paste("B",1:3, sep=""),
                                                "C", paste("C",1:3, sep=""), "Unknown"))
label(b$cdc_history) <- "CDC stage at HAART initiation"


#history of opportunistic infections at treatment initiation 
##AIDS Defining Events
#One representing clinical aids only (preARTAIDS.clinical)
#One representing clinical aids + CD4 < 200 (preARTAIDS.all)
load("ccasanetdatabase_2021jan21/CCASAnetAIDSstatus.Rda")
b <- AIDSstatus %>% 
  filter(patient %in% b$patient) %>% 
  right_join(b, by = "patient")
b$preARTAIDS.clinical <- factor(b$preARTAIDS.clinical, levels = c("not AIDS","AIDS", "Unknown"), labels = c("No", "Yes", "Unknown"))
label(b$preARTAIDS.clinical) <- "AIDS defining events prior to HAART initiation" 

#viral load at treatment initiation 
#<=180 days before and <7 days after HAART initiation (1806 patients)
b <- rna %>%
  filter(patient %in% b$patient) %>%
  left_join(b[,c("patient","art_sdate")] %>% distinct(), by = "patient") %>% 
  mutate(rna_date = strptime(rna_d, format = "%Y-%m-%d")) %>% 
  mutate(d = difftime(rna_date, art_sdate, units = "days")) %>% 
  filter(d >= -180 & d <= 7) %>% 
  mutate(i = ifelse(d < 0, 0, 1)) %>% 
  group_by(patient) %>% 
  arrange(i, abs(d), .by_group = T) %>%
  slice(1) %>% 
  select(patient, rna_d, rna_d_a, rna_v) %>% 
  rename(rna_d_trt = rna_d, rna_d_a_trt = rna_d_a, rna_v_trt = rna_v) %>% 
  right_join(b, by ="patient")
#53/3698 patients with undetectable viral load
b$rna_v_trt <- b$rna_v_trt/1e5
b$catrna_v_trt <- "Yes"
b$catrna_v_trt[b$rna_v_trt < 0] <- "No"
label(b$catrna_v_trt) <- "Detectable viral load at HAART initiation"
b$catrna_v_trt[is.na(b$rna_v_trt)] <- NA
b$rna_v_trt <- abs(b$rna_v_trt) #undetectable values as detection limits 
label(b$rna_v_trt) <-"Viral load at HAART initiation (10^5 copies/ml)"

#CD4-T cell lymphocytes count prior to therapy initiation
##Baseline CD4 -180/+30 days after treatment initiation 
label(b$f_cd4_v) <- "CD4 T cell count at HAART initiation (cells/ml)" 

#Days from diagnosis to HAART initiation
b$hivdiag_date <- b$hivdiagnosis_d#HIV diagnosis date 
b$hivdiag_date[b$hivdiagnosis_d_a == "U"] <- NA
b$hivdiag_date <- strptime(b$hivdiag_date, format = "%Y-%m-%d")
b <- b %>% 
  mutate(diag_to_trt = difftime(art_sdate, hivdiag_date, unit="days"))
b$diag_to_trt <- as.numeric(b$diag_to_trt)
#force negative values to be 0, 20 patients with negative days 
b$diag_to_trt[(b$diag_to_trt < 0) & !is.na(b$diag_to_trt)] <- 0
label(b$diag_to_trt) <- "Days from HIV diagnosis to HAART initiation"

#HIV center 
b[b$center.x %in% c("hn-ss","hn-he"),]$center.x <- "hn-ss/he"
label(b$center.x) <- "HIV center"
b$center.x <- toupper(b$center.x)

#follow-up times
# b <-  b %>% mutate(follow_up_time = interval(ymd(art_sd), ymd(l_visit_d)) %/% years(1)) 
b <- b %>% mutate(follow_up_time = difftime(strptime(l_visit_d, format = "%Y-%m-%d"), art_sdate, units = "days") / 365.25)
b$follow_up_time <- as.numeric(b$follow_up_time)
label(b$follow_up_time) <- "Follow-up years"

4 Aim 1 Statistical Analysis


The quartiles of increase in CD4 were 138 (25%), 238 (50%), 362 (75%). Based on the quartiles, we divided the 4756 patients into four groups; increase<150, [150,250), [250, 350], >350 cells/ml. This grouping was used for baseline description and it was not used for any regression analysis. The baseline characteristics were summarized in Table 1. All the baseline variables, except history of tuberculosis, significantly differed between the four groups.

#Table for baseline characteristics by Increase 
b.bsl <- b %>% 
  group_by(patient) %>% 
  arrange(cd4_times) %>% 
  slice(1) 
# add p-value as a level 
lbs <- paste("Increase",c("<150","=[150,250)","=[250,350]",">350"),sep="")
b.bsl$otc <- factor(as.integer(b.bsl$Group), levels=1:5,
                       labels=c(lbs, "P-value"))
# functions to add p-value to table 1
rndr <- function(x, name, ...) {
    if (length(x) == 0) {
        y <- b.bsl[[name]]
        s <- rep("", length(render.default(x=y, name=name, ...)))
        if (is.numeric(y)) {
          # modify to use non-parametric kruskal test for continuous variables
            p <- kruskal.test(y ~ b.bsl$otc)$p.value
        } else {
            p <- ifelse(length(unique(y[!is.na(y)])) == 1, NA, chisq.test(table(y, droplevels(b.bsl$otc)), simulate.p.value=F)$p.value)
        }
        s[2] <- sub("<", "&lt;", format.pval(p, digits=3, eps=0.001))
        s
    } else {
        render.default(x=x, name=name, render.continuous = c(.="Mean (SD)", .="Median [Q1, Q3]"),...)
    }
}

rndr.strat <- function(label, n, ...) {
    ifelse(n==0, label, render.strat.default(label, n,...))
}


tbl_bs <- table1::table1( ~ sex + age_trt + center.x + hivtrans + tb_history + preARTAIDS.clinical + catrna_v_trt + rna_v_trt + f_cd4_v + diag_to_trt + follow_up_time| otc, data = b.bsl, droplevels = F, render = rndr, render.strat = rndr.strat, overall = "Total",caption="Table Baseline Characteristics")
tbl_bs
Table Baseline Characteristics
Increase<150
(N=1326)
Increase=[150,250)
(N=1127)
Increase=[250,350]
(N=972)
Increase>350
(N=1331)
P-value Total
(N=4756)
Sex
Female 427 (32.2%) 357 (31.7%) 327 (33.6%) 517 (38.8%) <0.001 1628 (34.2%)
Male 899 (67.8%) 770 (68.3%) 645 (66.4%) 814 (61.2%) 3128 (65.8%)
Age at HAART initiation (years)
Mean (SD) 39.3 (11.2) 38.4 (10.2) 37.7 (9.85) 36.1 (9.81) <0.001 37.9 (10.4)
Median [Q1, Q3] 38.4 [31.0, 46.7] 37.0 [31.0, 45.1] 36.3 [30.4, 44.0] 34.9 [28.5, 42.3] 36.6 [30.1, 44.7]
HIV center
BR-IPEC 174 (13.1%) 148 (13.1%) 156 (16.0%) 329 (24.7%) <0.001 807 (17.0%)
CL-FA 138 (10.4%) 146 (13.0%) 94 (9.7%) 113 (8.5%) 491 (10.3%)
HN-SS/HE 86 (6.5%) 50 (4.4%) 38 (3.9%) 37 (2.8%) 211 (4.4%)
HT-GHESKIO 518 (39.1%) 381 (33.8%) 350 (36.0%) 457 (34.3%) 1706 (35.9%)
MX-INCMNSZ 170 (12.8%) 152 (13.5%) 107 (11.0%) 113 (8.5%) 542 (11.4%)
PE-IMTAVH 240 (18.1%) 250 (22.2%) 227 (23.4%) 282 (21.2%) 999 (21.0%)
HIV transmission route
Heterosexual contact 310 (23.4%) 302 (26.8%) 242 (24.9%) 358 (26.9%) 0.01 1212 (25.5%)
Homosexual/Bisexual contact 431 (32.5%) 407 (36.1%) 340 (35.0%) 474 (35.6%) 1652 (34.7%)
Other 12 (0.9%) 4 (0.4%) 8 (0.8%) 4 (0.3%) 28 (0.6%)
Unknown 573 (43.2%) 414 (36.7%) 382 (39.3%) 495 (37.2%) 1864 (39.2%)
History of tuberculosis
No 1198 (90.3%) 1029 (91.3%) 859 (88.4%) 1185 (89.0%) 0.101 4271 (89.8%)
Yes 128 (9.7%) 98 (8.7%) 113 (11.6%) 146 (11.0%) 485 (10.2%)
AIDS defining events prior to HAART initiation
No 870 (65.6%) 694 (61.6%) 577 (59.4%) 810 (60.9%) 0.002 2951 (62.0%)
Yes 303 (22.9%) 317 (28.1%) 300 (30.9%) 374 (28.1%) 1294 (27.2%)
Unknown 153 (11.5%) 116 (10.3%) 95 (9.8%) 147 (11.0%) 511 (10.7%)
Detectable viral load at HAART initiation
No 61 (4.6%) 20 (1.8%) 12 (1.2%) 19 (1.4%) <0.001 112 (2.4%)
Yes 616 (46.5%) 615 (54.6%) 503 (51.7%) 735 (55.2%) 2469 (51.9%)
Missing 649 (48.9%) 492 (43.7%) 457 (47.0%) 577 (43.4%) 2175 (45.7%)
Viral load at HAART initiation (10^5 copies/ml)
Mean (SD) 1.77 (8.53) 1.80 (5.21) 2.72 (6.15) 3.18 (10.3) <0.001 2.38 (8.04)
Median [Q1, Q3] 0.365 [0.0566, 1.08] 0.637 [0.190, 1.53] 0.858 [0.209, 2.30] 0.750 [0.162, 2.79] 0.640 [0.140, 1.79]
Missing 649 (48.9%) 492 (43.7%) 457 (47.0%) 577 (43.4%) 2175 (45.7%)
CD4 T cell count at HAART initiation (cells/ml)
Mean (SD) 300 (245) 207 (169) 200 (173) 225 (180) <0.001 237 (201)
Median [Q1, Q3] 246 [142, 370] 181 [69.5, 301] 175 [58.0, 290] 202 [77.0, 322] 204 [85.0, 326]
Days from HIV diagnosis to HAART initiation
Mean (SD) 722 (1190) 556 (991) 595 (939) 622 (999) 0.013 629 (1040)
Median [Q1, Q3] 174 [47.0, 943] 125 [39.0, 598] 136 [40.0, 750] 139 [40.0, 806] 143 [41.0, 790]
Missing 8 (0.6%) 6 (0.5%) 4 (0.4%) 3 (0.2%) 21 (0.4%)
Follow-up years
Mean (SD) 6.94 (3.50) 7.34 (3.36) 7.09 (3.48) 6.85 (3.22) 0.002 7.04 (3.39)
Median [Q1, Q3] 6.16 [4.20, 9.21] 6.71 [4.57, 9.85] 6.43 [4.10, 9.58] 6.47 [4.15, 8.85] 6.45 [4.25, 9.38]

Undetectable viral loads were set as their detection limits. The less frequent levels of HIV transmission were collapsed, including injecting drug user, homo/bisexual and injecting drug user, transfusion nonhemophilia related, generic sexual, perinatal and other. There are 20 patients who started ART treatment prior to their HIV diagnosis, their days from HIV diagnosis to ART initiation were forced to 0.


HIV transmission route and baseline viral load were missing for 99.9% (1705/1706) and 99.8% (1702/1706), respectively, of patients from Haiti. Table 4 summarizes the baseline characteristics of the data with Haiti excluded.

b.bsl <- b.bsl[b.bsl$country != "HTI",]
tbl_bs2 <- table1::table1( ~ sex + age_trt + center.x + hivtrans + tb_history + preARTAIDS.clinical + catrna_v_trt + rna_v_trt + f_cd4_v + diag_to_trt + follow_up_time | otc, data = b.bsl, droplevels = F, render = rndr, render.strat = rndr.strat, overall = "Total",caption="Table Baseline Characteristics (Haiti excluded)")
tbl_bs2
Table Baseline Characteristics (Haiti excluded)
Increase<150
(N=808)
Increase=[150,250)
(N=746)
Increase=[250,350]
(N=622)
Increase>350
(N=874)
P-value Total
(N=3050)
Sex
Female 153 (18.9%) 148 (19.8%) 125 (20.1%) 219 (25.1%) 0.009 645 (21.1%)
Male 655 (81.1%) 598 (80.2%) 497 (79.9%) 655 (74.9%) 2405 (78.9%)
Age at HAART initiation (years)
Mean (SD) 38.3 (11.7) 37.0 (10.1) 36.2 (10.0) 35.0 (9.73) <0.001 36.6 (10.5)
Median [Q1, Q3] 36.9 [29.3, 45.8] 35.3 [29.5, 43.4] 34.8 [29.0, 41.9] 33.9 [27.4, 41.2] 34.9 [28.7, 43.1]
HIV center
BR-IPEC 174 (21.5%) 148 (19.8%) 156 (25.1%) 329 (37.6%) <0.001 807 (26.5%)
CL-FA 138 (17.1%) 146 (19.6%) 94 (15.1%) 113 (12.9%) 491 (16.1%)
HN-SS/HE 86 (10.6%) 50 (6.7%) 38 (6.1%) 37 (4.2%) 211 (6.9%)
MX-INCMNSZ 170 (21.0%) 152 (20.4%) 107 (17.2%) 113 (12.9%) 542 (17.8%)
PE-IMTAVH 240 (29.7%) 250 (33.5%) 227 (36.5%) 282 (32.3%) 999 (32.8%)
HIV transmission route
Heterosexual contact 310 (38.4%) 302 (40.5%) 240 (38.6%) 358 (41.0%) 0.114 1210 (39.7%)
Homosexual/Bisexual contact 431 (53.3%) 407 (54.6%) 340 (54.7%) 474 (54.2%) 1652 (54.2%)
Other 12 (1.5%) 4 (0.5%) 8 (1.3%) 4 (0.5%) 28 (0.9%)
Unknown 55 (6.8%) 33 (4.4%) 34 (5.5%) 38 (4.3%) 160 (5.2%)
History of tuberculosis
No 743 (92.0%) 671 (89.9%) 547 (87.9%) 786 (89.9%) 0.094 2747 (90.1%)
Yes 65 (8.0%) 75 (10.1%) 75 (12.1%) 88 (10.1%) 303 (9.9%)
AIDS defining events prior to HAART initiation
No 501 (62.0%) 405 (54.3%) 334 (53.7%) 499 (57.1%) 0.005 1739 (57.0%)
Yes 199 (24.6%) 241 (32.3%) 211 (33.9%) 267 (30.5%) 918 (30.1%)
Unknown 108 (13.4%) 100 (13.4%) 77 (12.4%) 108 (12.4%) 393 (12.9%)
Detectable viral load at HAART initiation
No 61 (7.5%) 20 (2.7%) 12 (1.9%) 18 (2.1%) <0.001 111 (3.6%)
Yes 614 (76.0%) 615 (82.4%) 503 (80.9%) 734 (84.0%) 2466 (80.9%)
Missing 133 (16.5%) 111 (14.9%) 107 (17.2%) 122 (14.0%) 473 (15.5%)
Viral load at HAART initiation (10^5 copies/ml)
Mean (SD) 1.78 (8.54) 1.80 (5.21) 2.72 (6.15) 3.19 (10.3) <0.001 2.39 (8.05)
Median [Q1, Q3] 0.373 [0.0565, 1.08] 0.637 [0.190, 1.53] 0.858 [0.209, 2.30] 0.750 [0.165, 2.80] 0.642 [0.141, 1.79]
Missing 133 (16.5%) 111 (14.9%) 107 (17.2%) 122 (14.0%) 473 (15.5%)
CD4 T cell count at HAART initiation (cells/ml)
Mean (SD) 318 (259) 214 (171) 217 (186) 247 (193) <0.001 252 (210)
Median [Q1, Q3] 265 [138, 402] 189 [67.0, 312] 195 [57.3, 319] 230 [86.3, 345] 220 [88.0, 345]
Days from HIV diagnosis to HAART initiation
Mean (SD) 714 (1270) 551 (1020) 620 (994) 638 (1010) 0.091 633 (1090)
Median [Q1, Q3] 152 [55.0, 780] 126 [47.0, 553] 147 [48.0, 721] 150 [48.0, 851] 144 [50.0, 759]
Missing 3 (0.4%) 5 (0.7%) 2 (0.3%) 1 (0.1%) 11 (0.4%)
Follow-up years
Mean (SD) 6.63 (3.47) 6.99 (3.40) 6.75 (3.54) 6.66 (3.30) 0.076 6.75 (3.42)
Median [Q1, Q3] 5.93 [3.95, 8.42] 6.37 [4.24, 9.30] 5.97 [3.94, 8.91] 6.19 [4.02, 8.38] 6.10 [4.03, 8.73]


The CD4 T cell count of each group within 2 years after HAART initiation is visualized for each group. The black solid curves are natural cubic splines with 5 degrees of freedom.

ggplot(data = b, aes(x = as.numeric(cd4_times), y = cd4_v, group = Group, col=Group)) +
  facet_wrap(~ Group)+ 
  geom_line() + 
  geom_smooth(method = lm, formula = y ~ splines::ns(x, 5), color="grey28", size=0.6,se=T)+
  scale_colour_manual(values = c("red", "lightcoral", "dodgerblue2", "deepskyblue2"))+
  ylab("CD4 T cell count (cells/ml)")+
  xlab("Days since ART initiation")+
  xlim(c(0,365*2))+ylim(c(0,2000))+
  theme_classic()+theme(legend.position = "None")+
  labs(caption = "(Y axis is limited between 0 and 2000)") + theme(text = element_text(size=15))
<center>Figure 2 Longitudinal CD4 T cell counts in 2 years after HAART initiation</center>

Figure 2 Longitudinal CD4 T cell counts in 2 years after HAART initiation

t <- b %>% group_by(patient) %>% slice(1)
# t$hivdiag_date <- t$hivdiagnosis_d#HIV diagnosis date 
# t$hivdiag_date[t$hivdiagnosis_d_a == "U"] <- NA
#summary(t$hivdiag_date) 
#4592 patients were diagnosed prior to 2016-7
#sum(as.Date(t$hivdiag_date) <= "2016-07-01", na.rm=T)
ggplot(t,aes(x=as.Date(hivdiag_date)))+
 stat_bin(colour="black", binwidth=130, alpha=0.5,position="identity") + 
 scale_x_date(breaks=date_breaks("2 years"), labels=date_format("%Y")) + ylab("Frequency") + xlab("Year of HIV diagnosis")+theme_classic() + theme(axis.title = element_text(size=15))

ggplot(t,aes(x=as.Date(art_sd)))+
 stat_bin(colour="black", binwidth=100, alpha=0.5,position="identity") + 
 scale_x_date(breaks=date_breaks("2 years"), labels=date_format("%Y")) + ylab("Frequency") + xlab("Year of treatment initiation")+theme_classic()+theme(axis.title = element_text(size=15))

t <- t %>% mutate(cat_diag_to_trt = case_when(diag_to_trt>365~">1 year", 
                                              diag_to_trt<=365~"<=1 year"))

ggplot(t,aes(x=as.Date(hivdiag_date), group=cat_diag_to_trt,fill=cat_diag_to_trt))+
 stat_bin(colour="black", binwidth=130, alpha=0.55, position="identity") + 
 scale_x_date(breaks=date_breaks("2 years"), labels=date_format("%Y")) + ylab("Frequency") + xlab("Year of HIV diagnosis")+ labs(fill="Time from HIV diagnosis to treatment")+theme_classic()+theme(legend.position = "bottom")

ggplot(t %>% filter(age_trt > 50)) + xlab(latex2exp::TeX("Increase in CD4 count (cells/mm$^3$)")) + 
  geom_histogram(aes(x = cd4_incr, y = ..count..),binwidth = 40) + ylab("Frequency") + 
  facet_grid(~sex) + theme_classic() + labs(title="Histogram of increase in CD4 count of patients >50 years old")

p <- b %>% 
  select(patient, death_y, death_d, Group, art_sdate, age_trt, sex, center.x, l_visit_d, cd4_incr,f_cd4_v) %>% 
  rename("Sex" = sex, "Age" = age_trt, "Center" = center.x) %>%
  distinct() %>% 
  filter(Age > 50)
p <- p %>% 
    mutate(time = ifelse(death_y==1, 
                         difftime(strptime(death_d, format = "%Y-%m-%d"), art_sdate, units = "days"),
                         difftime(strptime(l_visit_d, format = "%Y-%m-%d"), art_sdate, units = "days")))

p$year <- as.numeric(p$time)/365
mod <- survfit(Surv(year, death_y) ~ Sex, data = p)
ggsurv <- ggsurvplot(mod,
          conf.int = FALSE,
          risk.table = TRUE,
          censor = F,
          pval = T,
          pval.method = F,
          pval.coord = c(1, 0.945),
          pval.method.coord = c(1, 0.95),
          risk.table.col = "strata", # increase risk table color by groups
          ggtheme = theme_classic(), # increase ggplot2 theme
          xlab="Years since treatment initiation",
          break.time.by = 1, xlim=c(0, 10), ylim=c(0.90,1),
          title = "Kaplan-Meier Curves of Patients >50 years old"
          )
ggsurv$plot <- ggsurv$plot + theme(legend.position = "none", text = element_text(size=15)) 
ggsurv

5 Aim 2 Statistical Analysis


We built cumulative probability models (CPM) to identify factors associated with immunological discordance and concordance. The potential factors are baseline epidemiological variables and baseline HIV-related variables, including sex, HIV center, age at treament initiation, CD4 T cell count at treament initiation (cell/ml), history of TB, history of AIDS-defining events, time from HIV diagnosis to treatment initiation, HIV transmission route, viral load at treatment initiation. To avoid linearity assumptions, we expanded the continuous variables (i.e., age, baseline CD4 count, time from HIV diagnosis to treatment initiation, viral load at treatment initiation) using restricted cubic splines (three knots for each variable). Also, we applied square root transformation to the baseline CD4 count to reduce its skewness.

Because Haiti had 99.9% patients with missing values of HIV transmission route and 99.8% patients with missing values of viral load at treatment initiation, Haiti’s data should be excluded for analyses which use the two variables. Here we did two sets of analyses: one analysis with Haiti excluded, one analysis with Haiti included and the two variables excluded.

5.1 Cumulative Probability Model with Haiti included

In the following analysis, Haiti’s data was included, viral load at treatment initiation and HIV transmission route were excluded. There were two variables with missing values, history of AIDS defining events (10.7%), days from HIV diagnosis to treatment initiation (0.4%). We used multiple imputation (10 times) to impute the missing values of them.

## Data for modelling 
vn <-  Cs(cd4_incr, f_cd4_v, sex, age_trt, tb_history, center.x, preARTAIDS.clinical, diag_to_trt)
datmod <- b %>% 
  group_by(patient) %>% 
  arrange(cd4_times) %>% 
  slice(1) %>% 
  select(all_of(vn))
datmod$preARTAIDS.clinical[datmod$preARTAIDS.clinical == "Unknown"] <- NA
datmod$preARTAIDS.clinical <- factor(datmod$preARTAIDS.clinical, levels = c("No","Yes"))
## 10 times multiple imputations
set.seed(1234)
dat.impute <- aregImpute( ~ cd4_incr + sex +  age_trt + sqrt(f_cd4_v)  + tb_history + 
                        center.x  + preARTAIDS.clinical + diag_to_trt,
                        data=datmod, n.impute=10, pr = F)
dd <- datadist(datmod)
options(datadist = "dd")

#set reference for HIV center 
dd$limits["Adjust to","center.x"] <- "BR-IPEC"
#Fit the CPM 
mod.HTI <- fit.mult.impute(cd4_incr ~  sex +  rcs(age_trt,3) + rcs(sqrt(f_cd4_v),3)  + tb_history + 
                        center.x  + preARTAIDS.clinical+ rcs(diag_to_trt,3),
                        fitter = orm, data = datmod, xtrans = dat.impute, pr = F)
#Format a table of estimates
pval <- anova(mod.HTI)[,"P"][c("age_trt", "f_cd4_v", "diag_to_trt","sex", "tb_history","center.x", "preARTAIDS.clinical")]
pval <- format.pval(pval,digits = 3, eps = 1e-3)
pval <- c(pval[1:(length(pval)-2)], rep("<0.001", 5), tail(pval,1))
or <- summary(mod.HTI)
or <- or[startsWith(rownames(or), " Odds"),]
or <- or[4:nrow(or),]
or <- rbind(NA,summary(mod.HTI,  age_trt=c(35, 20), est.all = F)[" Odds Ratio", ],
summary(mod.HTI,  age_trt=c(35, 30), est.all = F)[" Odds Ratio", ],
summary(mod.HTI,  age_trt=c(35, 40), est.all = F)[" Odds Ratio", ],
summary(mod.HTI,  age_trt=c(35, 50), est.all = F)[" Odds Ratio", ],
NA,
summary(mod.HTI,  f_cd4_v=c(200, 50), est.all = F)[" Odds Ratio", ],
summary(mod.HTI,  f_cd4_v=c(200, 100), est.all = F)[" Odds Ratio", ],
summary(mod.HTI,  f_cd4_v=c(200, 300), est.all = F)[" Odds Ratio", ],
summary(mod.HTI,  f_cd4_v=c(200, 500), est.all = F)[" Odds Ratio", ],
NA,
summary(mod.HTI,  diag_to_trt=c(180, 0), est.all = F)[" Odds Ratio", ],
summary(mod.HTI,  diag_to_trt=c(180, 90), est.all = F)[" Odds Ratio", ],
summary(mod.HTI,  diag_to_trt=c(180, 365), est.all = F)[" Odds Ratio", ],
summary(mod.HTI,  diag_to_trt=c(180, 365 * 2), est.all = F)[" Odds Ratio", ],
or)[,c("Effect", "Lower 0.95", "Upper 0.95")]

or <- formatC(or, digits = 2, format="f")
or <- cbind(or[,1], paste("[",or[,2]," ,", or[,3],"]",sep=""))
or[seq(1,15,5),] <- " "
pred.vars <- c("Age (years)",
               "20 vs. 35", 
               "30 vs. 35",
               "40 vs. 35",
               "50 vs. 35",
               "Baseline CD4 count (cells/ml)",
               "50 vs. 200",
               "100 vs. 200",
               "300 vs. 200",
               "500 vs. 200",
               "Time from diagnosis to treatment initiation",
               "0 months vs. 6 months",
               "3 months vs. 6 months",
               "1 years vs. 6 months",
               "2 years vs. 6 months",
               "Female vs. Male",
               "TB history Yes versus No",
               "HIV Center:CL-FA vs. BR-IPEC",
               "HIV Center:HN-HE/SS vs. BR-IPEC",
               "HIV Center:HT-GHESKIO vs. BR-IPEC",
               "HIV Center:MX-INCMNSZ vs. BR-IPEC",
               "HIV Center:PE-IMTAVH vs. BR-IPEC",
               "AIDS-defining events: Yes vs. No")
pval <- c(pval[1], rep(" ",4), 
          pval[2], rep(" ",4), 
          pval[3], rep(" ",4), 
          pval[4:length(pval)])
output <- cbind(pred.vars, or, pval)
rownames(output) <- NULL
colnames(output) <- c("Predictors", 'Odds Ratio', "95% CI", "P-value")
output %>%
  kbl(caption = "<center> Table Odds Ratio of the Cumulative Probability Model (Haiti included)", align = "l")  %>% kable_classic(html_font = "Cambria") %>% 
    add_indent(setdiff(1:15, seq(1,15,5)))
Table Odds Ratio of the Cumulative Probability Model (Haiti included)
Predictors Odds Ratio 95% CI P-value
Age (years) <0.001
20 vs. 35 1.29 [1.08 ,1.54]
30 vs. 35 1.10 [1.04 ,1.15]
40 vs. 35 0.89 [0.87 ,0.92]
50 vs. 35 0.69 [0.63 ,0.75]
Baseline CD4 count (cells/ml) <0.001
50 vs. 200 1.32 [1.20 ,1.45]
100 vs. 200 1.23 [1.17 ,1.28]
300 vs. 200 0.76 [0.73 ,0.79]
500 vs. 200 0.44 [0.38 ,0.49]
Time from diagnosis to treatment initiation 0.298
0 months vs. 6 months 0.97 [0.93 ,1.02]
3 months vs. 6 months 0.99 [0.97 ,1.01]
1 years vs. 6 months 1.02 [0.98 ,1.06]
2 years vs. 6 months 1.04 [0.96 ,1.13]
Female vs. Male 1.40 [1.25 ,1.57] <0.001
TB history Yes versus No 0.93 [0.77 ,1.11] 0.423
HIV Center:CL-FA vs. BR-IPEC 0.43 [0.36 ,0.53] <0.001
HIV Center:HN-HE/SS vs. BR-IPEC 0.23 [0.17 ,0.30] <0.001
HIV Center:HT-GHESKIO vs. BR-IPEC 0.42 [0.35 ,0.49] <0.001
HIV Center:MX-INCMNSZ vs. BR-IPEC 0.39 [0.32 ,0.47] <0.001
HIV Center:PE-IMTAVH vs. BR-IPEC 0.54 [0.45 ,0.63] <0.001
AIDS-defining events: Yes vs. No 1.00 [0.87 ,1.14] 0.976

Note: The odds ratio compares two populations with respect to the odds of high increase relative to low increase in CD4 count after two-year treatment. For the first population in the comparison, an OR>1 means a higher odds of high CD4 count increase relative to low increase, an OR<1 means a lower odds of high increase relative to low increase.

There were four factors significantly associated with increase in CD4 count, including age, baseline CD4 count, sex, and HIV site. Old age, high baseline CD4 count, and male were associated with a higher risk of low CD4 count increase after two years. Compared with HIV centers in Brazil, the HIV centers in other countries had a greater risk of low increase.

#ANOVA table
t <- anova(mod.HTI)
nvars <- c("sex", "cd4_incr", "f_cd4_v", "center.x", "age_trt", "tb_history", "preARTAIDS.clinical", "diag_to_trt")
nlbs <- c("Sex", "Increase in CD4", "Baseline CD4 count", "HIV center", "Age", "TB history", "AIDS defining events", "Time from diagnosis to treatment initiation")
for(i in seq_along(nvars)){
  rownames(t) <- gsub(nvars[i],nlbs[i],rownames(t))
}
t <- cbind(formatC(t[,1:2], digits = 2, format = "f"), format.pval(t[,"P"],digits = 3, eps = 1e-3))
t %>% 
  kbl(caption = "<center> Table Wald tests of predictors in the CPM", digits = 2, align = "l")  %>% kable_classic(html_font = "Cambria") %>%  add_indent(c(3, 5, 10))
Table Wald tests of predictors in the CPM
Chi-Square d.f.
Sex 32.75 1.00 <0.001
Age 85.10 2.00 <0.001
Nonlinear 1.32 1.00 0.250
Baseline CD4 count 217.74 2.00 <0.001
Nonlinear 31.01 1.00 <0.001
TB history 0.64 1.00 0.423
HIV center 177.34 5.00 <0.001
AIDS defining events 0.00 1.00 0.976
Time from diagnosis to treatment initiation 2.42 2.00 0.298
Nonlinear 1.83 1.00 0.176
TOTAL NONLINEAR 35.91 3.00 <0.001
TOTAL 445.02 14.00 <0.001

The nonlinear terms of baseline CD4 count were significant, suggesting that baseline CD4 count had a non-linear association with increase in CD4 count.

The following figures visualize the effect of age and baseline CD4 count on the median increase in CD4 count. Other covariates are fixed at median or mode. The grey area is the 95% confidence interval.

# m <- Mean(mod.HTI)
# mage <- Predict(mod.HTI, age_trt, fun= function(x) m(x))
# q <- Quantile(mod.HTI)
# qest[[i]] <- Predict(mod.HTI, age_trt, fun= function(x) q(0.5, x))
# pr <- ExProb(mod.HTI)
# prage <- Predict(mod.HTI, age_trt, fun= function(x) ExProb(x,y=250))

q <- Quantile(mod.HTI)
nvars <- c("age_trt", "f_cd4_v")
lvars <- c("Age (years)", "Baseline CD4 count (cells/ml)")
qest <- list()
qest[[1]] <- Predict(mod.HTI, age_trt, fun= function(x) q(0.5, x))
qest[[2]] <- Predict(mod.HTI, f_cd4_v=seq(0, 700,1), fun= function(x) q(0.5, x))
lylim <- list(c(150,400), c(0, 400))

par(mfrow=c(1,2))
for(i in seq_along(nvars)){
  v <- nvars[i]
  l <- lvars[i]
  if(i == 1) yl <- "Median increase in CD4 Count (cells/ml)"
  else yl = " "
  plot(qest[[i]][, v],qest[[i]]$yhat, cex= 1, cex.lab=1.5, ylab=yl, xlab=l, ylim=lylim[[i]], type="n")
  polygon(c(qest[[i]][, v], rev(qest[[i]][, v])), c(qest[[i]]$lower, rev(qest[[i]]$upper)),col = "grey", border = NA)
  lines(qest[[i]][, v], qest[[i]]$yhat, cex=1)
  lines(qest[[i]][, v], qest[[i]]$lower, cex=1,lty=2)
  lines(qest[[i]][, v], qest[[i]]$upper, cex=1,lty=2)
}

5.1.1 Sensitivity analysis

A sensitivity analysis was done to check if using only patients with a complete case for all the variables of interest significantly affected the regression. The sample size was 4226 under complete cases.

#Set datadist as the data with complete cases 
dd <- datadist(datmod[complete.cases(datmod),])
options(datadist = "dd")
#set reference for HIV center 
dd$limits["Adjust to","center.x"] <- "BR-IPEC"
#Run the model with complete cases 
mod.complete <- orm(cd4_incr ~  sex +  rcs(age_trt,3) + rcs(sqrt(f_cd4_v),3)  + tb_history + 
                        center.x  + preARTAIDS.clinical+ rcs(diag_to_trt,3), data=datmod[complete.cases(datmod),])
#Format a table of estimates
pval <- anova(mod.complete)[,"P"][c("age_trt", "f_cd4_v", "diag_to_trt","sex", "tb_history","center.x", "preARTAIDS.clinical")]
pval <- format.pval(pval,digits = 3, eps = 1e-3)
pval <- c(pval[1:(length(pval)-2)], rep("<0.001", 5), tail(pval,1))
or <- summary(mod.complete)
or <- or[startsWith(rownames(or), " Odds"),]
or <- or[4:nrow(or),]
or <- rbind(NA,summary(mod.complete,  age_trt=c(35, 20), est.all = F)[" Odds Ratio", ],
summary(mod.complete,  age_trt=c(35, 30), est.all = F)[" Odds Ratio", ],
summary(mod.complete,  age_trt=c(35, 40), est.all = F)[" Odds Ratio", ],
summary(mod.complete,  age_trt=c(35, 50), est.all = F)[" Odds Ratio", ],
NA,
summary(mod.complete,  f_cd4_v=c(200, 50), est.all = F)[" Odds Ratio", ],
summary(mod.complete,  f_cd4_v=c(200, 100), est.all = F)[" Odds Ratio", ],
summary(mod.complete,  f_cd4_v=c(200, 300), est.all = F)[" Odds Ratio", ],
summary(mod.complete,  f_cd4_v=c(200, 500), est.all = F)[" Odds Ratio", ],
NA,
summary(mod.complete,  diag_to_trt=c(180, 0), est.all = F)[" Odds Ratio", ],
summary(mod.complete,  diag_to_trt=c(180, 90), est.all = F)[" Odds Ratio", ],
summary(mod.complete,  diag_to_trt=c(180, 365), est.all = F)[" Odds Ratio", ],
summary(mod.complete,  diag_to_trt=c(180, 365 * 2), est.all = F)[" Odds Ratio", ],
or)[,c("Effect", "Lower 0.95", "Upper 0.95")]

or <- formatC(or, digits = 2, format="f")
or <- cbind(or[,1], paste("[",or[,2]," ,", or[,3],"]",sep=""))
or[seq(1,15,5),] <- " "
pred.vars <- c("Age (years)",
               "20 vs. 35", 
               "30 vs. 35",
               "40 vs. 35",
               "50 vs. 35",
               "Baseline CD4 count (cells/ml)",
               "50 vs. 200",
               "100 vs. 200",
               "300 vs. 200",
               "500 vs. 200",
               "Time from diagnosis to treatment initiation",
               "0 months vs. 6 months",
               "3 months vs. 6 months",
               "1 years vs. 6 months",
               "2 years vs. 6 months",
               "Female vs. Male",
               "TB history Yes versus No",
               "HIV Center:CL-FA vs. BR-IPEC",
               "HIV Center:HN-HE/SS vs. BR-IPEC",
               "HIV Center:HT-GHESKIO vs. BR-IPEC",
               "HIV Center:MX-INCMNSZ vs. BR-IPEC",
               "HIV Center:PE-IMTAVH vs. BR-IPEC",
               "AIDS-defining events: Yes vs. No")
pval <- c(pval[1], rep(" ",4), 
          pval[2], rep(" ",4), 
          pval[3], rep(" ",4), 
          pval[4:length(pval)])
output <- cbind(pred.vars, or, pval)
rownames(output) <- NULL
colnames(output) <- c("Predictors", 'Odds Ratio', "95% CI", "P-value")
output %>%
  kbl(caption = "<center> Table Odds Ratio of the Cumulative Probability Model (Haiti included and complete cases)", align = "l")  %>% kable_classic(html_font = "Cambria") %>% 
    add_indent(setdiff(1:15, seq(1,15,5)))
Table Odds Ratio of the Cumulative Probability Model (Haiti included and complete cases)
Predictors Odds Ratio 95% CI P-value
Age (years) <0.001
20 vs. 35 1.32 [1.10 ,1.59]
30 vs. 35 1.10 [1.05 ,1.16]
40 vs. 35 0.89 [0.86 ,0.91]
50 vs. 35 0.67 [0.61 ,0.73]
Baseline CD4 count (cells/ml) <0.001
50 vs. 200 1.38 [1.25 ,1.52]
100 vs. 200 1.26 [1.20 ,1.32]
300 vs. 200 0.75 [0.72 ,0.78]
500 vs. 200 0.43 [0.37 ,0.50]
Time from diagnosis to treatment initiation 0.876
0 months vs. 6 months 0.99 [0.94 ,1.04]
3 months vs. 6 months 0.99 [0.97 ,1.02]
1 years vs. 6 months 1.01 [0.97 ,1.05]
2 years vs. 6 months 1.02 [0.93 ,1.12]
Female vs. Male 1.40 [1.24 ,1.58] <0.001
TB history Yes versus No 0.90 [0.75 ,1.09] 0.300
HIV Center:CL-FA vs. BR-IPEC 0.42 [0.34 ,0.52] <0.001
HIV Center:HN-HE/SS vs. BR-IPEC 0.23 [0.17 ,0.30] <0.001
HIV Center:HT-GHESKIO vs. BR-IPEC 0.41 [0.34 ,0.48] <0.001
HIV Center:MX-INCMNSZ vs. BR-IPEC 0.35 [0.28 ,0.43] <0.001
HIV Center:PE-IMTAVH vs. BR-IPEC 0.51 [0.43 ,0.61] <0.001
AIDS-defining events: Yes vs. No 1.00 [0.87 ,1.15] 0.986
#ANOVA table
t <- anova(mod.complete)
nvars <- c("sex", "cd4_incr", "f_cd4_v", "center.x", "age_trt", "tb_history", "preARTAIDS.clinical", "diag_to_trt")
nlbs <- c("Sex", "Increase in CD4", "Baseline CD4 count", "HIV center", "Age", "TB history", "AIDS defining events", "Time from diagnosis to treatment initiation")
for(i in seq_along(nvars)){
  rownames(t) <- gsub(nvars[i],nlbs[i],rownames(t))
}
t <- cbind(formatC(t[,1:2], digits = 2, format = "f"), format.pval(t[,"P"],digits = 3, eps = 1e-3))
t %>% 
  kbl(caption = "<center> Table Wald tests of predictors in the CPM (Haiti included and complete cases)", digits = 2, align = "l")  %>% kable_classic(html_font = "Cambria") %>%  add_indent(c(3, 5, 10))
Table Wald tests of predictors in the CPM (Haiti included and complete cases)
Chi-Square d.f.
Sex 30.09 1.00 <0.001
Age 87.70 2.00 <0.001
Nonlinear 1.34 1.00 0.248
Baseline CD4 count 199.91 2.00 <0.001
Nonlinear 23.18 1.00 <0.001
TB history 1.07 1.00 0.300
HIV center 175.84 5.00 <0.001
AIDS defining events 0.00 1.00 0.986
Time from diagnosis to treatment initiation 0.26 2.00 0.876
Nonlinear 0.26 1.00 0.608
TOTAL NONLINEAR 25.61 3.00 <0.001
TOTAL 415.19 14.00 <0.001

We got the similar results in the model under complete cases.

5.2 Cumulative Probability Model with Haiti excluded

There were four variables with missing values, viral load at HAART initiation (15.5%), history of AIDS defining events (12.9%), HIV transmission route (5.2%), days from HIV diagnosis to HAART initiation (0.4%). We used multiple imputation (10 times) to impute the missing values of them. The sample size was 3050 after Haiti was excluded.

#construct data for modelling 
vn <-  Cs(cd4_incr, f_cd4_v, sex, hivtrans, age_trt, rna_v_trt, tb_history, center.x, preARTAIDS.clinical, diag_to_trt)
datmod <- b %>% 
  filter(country != "HTI") %>%
  group_by(patient) %>% 
  arrange(cd4_times) %>% 
  slice(1) %>% 
  select(all_of(vn))
datmod$hivtrans[datmod$hivtrans == "Unknown"] <- NA
datmod$hivtrans[datmod$hivtrans %in% c("Homosexual/Bisexual contact","Other")] <- "Homosexual/Bisexual/ Other contact" 
datmod$preARTAIDS.clinical[datmod$preARTAIDS.clinical == "Unknown"] <- NA
datmod$preARTAIDS.clinical <- factor(datmod$preARTAIDS.clinical, levels = c("No","Yes"))

## 10 times multiple imputations
set.seed(1234)
dat.impute=aregImpute(~ cd4_incr + sex + hivtrans + age_trt + sqrt(f_cd4_v) +
                        rna_v_trt + tb_history + 
                        center.x  + preARTAIDS.clinical+ diag_to_trt,
                        data=datmod, n.impute=10, pr = F, nk = 3)

dd <- datadist(datmod)
options(datadist = "dd")
#set values for comparison in each covariate  
dd$limits["Adjust to","center.x"] <- "BR-IPEC"
#Fit a CPM model 
mod.noHTI <- fit.mult.impute(cd4_incr ~ sex + hivtrans + rcs(age_trt,3) + rcs(sqrt(f_cd4_v),3) +
                        rcs(rna_v_trt,3) + tb_history + 
                        center.x  + preARTAIDS.clinical+ rcs(diag_to_trt,3),
                        fitter = orm, data = datmod, xtrans = dat.impute, pr = F)
#Format a table of estimates
pval <- anova(mod.noHTI)[,"P"][c("age_trt", "f_cd4_v", "rna_v_trt", "diag_to_trt","sex", "hivtrans", "tb_history","center.x", "preARTAIDS.clinical")]
pval <- format.pval(pval,digits = 3, eps = 1e-3)
pval <- c(pval[1:(length(pval)-2)], rep("<0.001", 4), tail(pval,1))
or <- summary(mod.noHTI)
or <- or[startsWith(rownames(or), " Odds"),]
or <- or[5:nrow(or),]
or <- rbind(NA,
            summary(mod.noHTI,  age_trt=c(35, 20), est.all = F)[" Odds Ratio", ],
summary(mod.noHTI,  age_trt=c(35, 30), est.all = F)[" Odds Ratio", ],
summary(mod.noHTI,  age_trt=c(35, 40), est.all = F)[" Odds Ratio", ],
summary(mod.noHTI,  age_trt=c(35, 50), est.all = F)[" Odds Ratio", ],
NA,
summary(mod.noHTI,  f_cd4_v=c(200, 50), est.all = F)[" Odds Ratio", ],
summary(mod.noHTI,  f_cd4_v=c(200, 100), est.all = F)[" Odds Ratio", ],
summary(mod.noHTI,  f_cd4_v=c(200, 300), est.all = F)[" Odds Ratio", ],
summary(mod.noHTI,  f_cd4_v=c(200, 500), est.all = F)[" Odds Ratio", ],
NA,
summary(mod.noHTI, rna_v_trt=c(0.6, 0.01), est.all = F)[" Odds Ratio", ],
summary(mod.noHTI, rna_v_trt=c(0.6, 1), est.all = F)[" Odds Ratio", ],
summary(mod.noHTI, rna_v_trt=c(0.6, 3), est.all = F)[" Odds Ratio", ],
summary(mod.noHTI, rna_v_trt=c(0.6, 5), est.all = F)[" Odds Ratio", ],
NA,
summary(mod.noHTI,  diag_to_trt=c(180, 0), est.all = F)[" Odds Ratio", ],
summary(mod.noHTI,  diag_to_trt=c(180, 90), est.all = F)[" Odds Ratio", ],
summary(mod.noHTI,  diag_to_trt=c(180, 365), est.all = F)[" Odds Ratio", ],
summary(mod.noHTI,  diag_to_trt=c(180, 365 * 2), est.all = F)[" Odds Ratio", ],
or)[,c("Effect", "Lower 0.95", "Upper 0.95")]

or <- formatC(or, digits = 2, format="f")
or <- cbind(or[,1], paste("[",or[,2]," ,", or[,3],"]",sep=""))
or[seq(1,20,5),] <- " "
pred.vars <- c("Age (years)",
               "20 vs. 35", 
               "30 vs. 35",
               "40 vs. 35",
               "50 vs. 35",
               "Baseline CD4 count (cells/ml)",
               "50 vs. 200",
               "100 vs. 200",
               "300 vs. 200",
               "500 vs. 200",
               "Baseline viral load (10^5 copies/ml)",
               "0.01 vs. 0.6",
               "1 vs. 0.6",
               "3 vs. 0.6",
               "5 vs. 0.6",
               "Time from diagnosis to treatment initiation",
               "0 months vs. 6 months",
               "3 months vs. 6 months",
               "1 years vs. 6 months",
               "2 years vs. 6 months",
               "Female vs. Male",
               "Heterosexual contact vs. Homo/Bisexual/Other contact",
               "TB history Yes versus No",
               "HIV Center:CL-FA vs. BR-IPEC",
               "HIV Center:HN-HE/SS vs. BR-IPEC",
               "HIV Center:MX-INCMNSZ vs. BR-IPEC",
               "HIV Center:PE-IMTAVH vs. BR-IPEC",
               "AIDS-defining events: Yes vs. No")
pval <- c(pval[1], rep(" ",4), 
          pval[2], rep(" ",4), 
          pval[3], rep(" ",4),
          pval[4], rep(" ",4), 
          pval[5:length(pval)])
output <- cbind(pred.vars, or, pval)
rownames(output) <- NULL
colnames(output) <- c("Predictors", 'Odds Ratio', "95% CI", "P-value")
output %>%
  kbl(caption = "<center> Table Odds Ratio of the Cumulative Probability Model (Haiti excluded)", align = "l")  %>% kable_classic(html_font = "Cambria") %>% 
  add_indent(setdiff(1:20, seq(1,20,5)))
Table Odds Ratio of the Cumulative Probability Model (Haiti excluded)
Predictors Odds Ratio 95% CI P-value
Age (years) <0.001
20 vs. 35 1.27 [1.02 ,1.58]
30 vs. 35 1.10 [1.04 ,1.16]
40 vs. 35 0.88 [0.85 ,0.91]
50 vs. 35 0.64 [0.57 ,0.72]
Baseline CD4 count (cells/ml) <0.001
50 vs. 200 1.06 [0.93 ,1.20]
100 vs. 200 1.08 [1.01 ,1.16]
300 vs. 200 0.81 [0.78 ,0.85]
500 vs. 200 0.48 [0.42 ,0.57]
Baseline viral load (10^5 copies/ml) <0.001
0.01 vs. 0.6 0.78 [0.73 ,0.84]
1 vs. 0.6 1.15 [1.10 ,1.20]
3 vs. 0.6 1.69 [1.44 ,1.98]
5 vs. 0.6 1.81 [1.52 ,2.16]
Time from diagnosis to treatment initiation <0.001
0 months vs. 6 months 0.87 [0.82 ,0.92]
3 months vs. 6 months 0.94 [0.91 ,0.96]
1 years vs. 6 months 1.12 [1.07 ,1.17]
2 years vs. 6 months 1.29 [1.15 ,1.43]
Female vs. Male 1.55 [1.28 ,1.88] <0.001
Heterosexual contact vs. Homo/Bisexual/Other contact 0.90 [0.76 ,1.06] 0.208
TB history Yes versus No 0.80 [0.64 ,1.01] 0.065
HIV Center:CL-FA vs. BR-IPEC 0.39 [0.32 ,0.48] <0.001
HIV Center:HN-HE/SS vs. BR-IPEC 0.24 [0.18 ,0.32] <0.001
HIV Center:MX-INCMNSZ vs. BR-IPEC 0.37 [0.30 ,0.45] <0.001
HIV Center:PE-IMTAVH vs. BR-IPEC 0.49 [0.41 ,0.58] <0.001
AIDS-defining events: Yes vs. No 1.08 [0.89 ,1.31] 0.414

There were six factors significantly associated with increase in CD4 count, including age, baseline CD4 count, baseline viral load, time from diagnosis to treatment initiation, sex, and HIV site. Old age, high baseline CD4 count, low baseline viral load, shorter time from diagnosis to treatment initiation, and male were associated with a higher risk of low CD4 count increase after two years. Compared with HIV centers Brazil, HIV centers in other countries had a higher risk of low CD4 count increase.

#ANOVA table
t <- anova(mod.noHTI)
nvars <- c("sex", "cd4_incr", "f_cd4_v", "center.x", "age_trt", "tb_history", "preARTAIDS.clinical", "diag_to_trt", "rna_v_trt", "hivtrans")
nlbs <- c("Sex", "Increase in CD4", "Baseline CD4 count", "HIV center", "Age", "TB history", "AIDS defining events", "Time from diagnosis to treatment initiation", "Baseline viral load", "HIV transmission route")
for(i in seq_along(nvars)){
  rownames(t) <- gsub(nvars[i],nlbs[i],rownames(t))
}
t <- cbind(formatC(t[,1:2], digits = 2, format = "f"), format.pval(t[,"P"],digits = 3, eps = 1e-3))
t %>% 
  kbl(caption = "<center> Table Wald tests of predictors in the CPM (Haiti excluded)", digits = 2, align = "l")  %>% kable_classic(html_font = "Cambria") %>%  add_indent(c(4, 6, 8, 13))
Table Wald tests of predictors in the CPM (Haiti excluded)
Chi-Square d.f.
Sex 20.56 1.00 <0.001
HIV transmission route 1.58 1.00 0.208
Age 65.45 2.00 <0.001
Nonlinear 2.18 1.00 0.139
Baseline CD4 count 86.54 2.00 <0.001
Nonlinear 34.56 1.00 <0.001
Baseline viral load 47.81 2.00 <0.001
Nonlinear 36.89 1.00 <0.001
TB history 3.40 1.00 0.065
HIV center 159.93 4.00 <0.001
AIDS defining events 0.67 1.00 0.414
Time from diagnosis to treatment initiation 21.10 2.00 <0.001
Nonlinear 20.73 1.00 <0.001
TOTAL NONLINEAR 95.05 4.00 <0.001
TOTAL 371.48 16.00 <0.001

The nonlinear terms of baseline CD4 count, viral load, and time from diagnosis to treatment initiation were significant, which implies that the three variables had a non-linear association with increase in CD4 count.

The following figures show the effect of the four continuous variables on increase in CD4 count. Other covariates were fixed at median or mode.

q <- Quantile(mod.noHTI)
nvars <- c("age_trt", "f_cd4_v", "rna_v_trt", "diag_to_trt")
lvars <- c("Age (years)", "Baseline CD4 count (cells/ml)", " ", "Years from diagnosis to treament initiation")
qest <- list()
qest[[1]] <- Predict(mod.noHTI, age_trt, fun= function(x) q(0.5, x))
qest[[2]] <- Predict(mod.noHTI, f_cd4_v=seq(0,700,1), fun= function(x) q(0.5, x))
qest[[3]] <- Predict(mod.noHTI, rna_v_trt=seq(0,6,0.1), fun= function(x) q(0.5, x))
qest[[4]] <- Predict(mod.noHTI, diag_to_trt=seq(0, 5*365, 1), fun= function(x) q(0.5, x))
qest[[4]]$diag_to_trt <- qest[[4]]$diag_to_trt/365

par(mfrow=c(2,2))
lylim <- list(c(90,400), c(100,400), c(200,450), c(200,400))
for(i in seq_along(nvars)){
  v <- nvars[i]
  l <- lvars[i]
  if(i %in% c(1,3)) yl="Median increase in CD4 Count (cells/ml)"
  else yl=" "
  plot(qest[[i]][, v],qest[[i]]$yhat, cex.axis=1.5, cex.lab=1.5, ylab=yl, xlab=l, ylim=lylim[[i]], type="n")
  if(nvars[i] == "rna_v_trt") title(xlab=latex2exp::TeX("Baseline viral load ($$10^5$$ copies/ml)"), cex.lab=1.5)
  polygon(c(qest[[i]][, v], rev(qest[[i]][, v])), c(qest[[i]]$lower, rev(qest[[i]]$upper)),col = "grey", border = NA)
  lines(qest[[i]][, v], qest[[i]]$yhat, cex=1.5)
  lines(qest[[i]][, v], qest[[i]]$lower, cex=1.5,lty=2)
  lines(qest[[i]][, v], qest[[i]]$upper, cex=1.5,lty=2)
}

5.2.1 Sensitivity analysis

A sensitivity analysis was done to check if using only patients with a complete case for all the variables of interest significantly affected the regression. The sample size was 2140 under complete cases.

dd <- datadist(datmod[complete.cases(datmod),])
options(datadist = "dd")
#set values for comparison in each covariate  
dd$limits["Adjust to","center.x"] <- "BR-IPEC"
#Run the model with complete cases 
mod.complete <- orm(cd4_incr ~ sex + hivtrans + rcs(age_trt,3) + rcs(sqrt(f_cd4_v),3) +
                        rcs(rna_v_trt,3) + tb_history + 
                        center.x  + preARTAIDS.clinical+ rcs(diag_to_trt,3), data=datmod[complete.cases(datmod),])
#Format a table of estimates
pval <- anova(mod.complete)[,"P"][c("age_trt", "f_cd4_v", "rna_v_trt", "diag_to_trt","sex", "hivtrans", "tb_history","center.x", "preARTAIDS.clinical")]
pval <- format.pval(pval,digits = 3, eps = 1e-3)
pval <- c(pval[1:(length(pval)-2)], rep("<0.0001", 4), tail(pval,1))
or <- summary(mod.complete)
or <- or[startsWith(rownames(or), " Odds"),]
or <- or[5:nrow(or),]
or <- rbind(NA,
            summary(mod.complete,  age_trt=c(35, 20), est.all = F)[" Odds Ratio", ],
summary(mod.complete,  age_trt=c(35, 30), est.all = F)[" Odds Ratio", ],
summary(mod.complete,  age_trt=c(35, 40), est.all = F)[" Odds Ratio", ],
summary(mod.complete,  age_trt=c(35, 50), est.all = F)[" Odds Ratio", ],
NA,
summary(mod.complete,  f_cd4_v=c(200, 50), est.all = F)[" Odds Ratio", ],
summary(mod.complete,  f_cd4_v=c(200, 100), est.all = F)[" Odds Ratio", ],
summary(mod.complete,  f_cd4_v=c(200, 300), est.all = F)[" Odds Ratio", ],
summary(mod.complete,  f_cd4_v=c(200, 500), est.all = F)[" Odds Ratio", ],
NA,
summary(mod.complete, rna_v_trt=c(0.6, 0.01), est.all = F)[" Odds Ratio", ],
summary(mod.complete, rna_v_trt=c(0.6, 1), est.all = F)[" Odds Ratio", ],
summary(mod.complete, rna_v_trt=c(0.6, 3), est.all = F)[" Odds Ratio", ],
summary(mod.complete, rna_v_trt=c(0.6, 5), est.all = F)[" Odds Ratio", ],
NA,
summary(mod.complete,  diag_to_trt=c(180, 0), est.all = F)[" Odds Ratio", ],
summary(mod.complete,  diag_to_trt=c(180, 90), est.all = F)[" Odds Ratio", ],
summary(mod.complete,  diag_to_trt=c(180, 365), est.all = F)[" Odds Ratio", ],
summary(mod.complete,  diag_to_trt=c(180, 365 * 2), est.all = F)[" Odds Ratio", ],
or)[,c("Effect", "Lower 0.95", "Upper 0.95")]

or <- formatC(or, digits = 2, format="f")
or <- cbind(or[,1], paste("[",or[,2]," ,", or[,3],"]",sep=""))
or[seq(1,20,5),] <- " "
pred.vars <- c("Age (years)",
               "20 vs. 35", 
               "30 vs. 35",
               "40 vs. 35",
               "50 vs. 35",
               "Baseline CD4 count (cells/ml)",
               "50 vs. 200",
               "100 vs. 200",
               "300 vs. 200",
               "500 vs. 200",
               "Baseline viral load (10^5 copies/ml)",
               "0.01 vs. 0.6",
               "1 vs. 0.6",
               "3 vs. 0.6",
               "5 vs. 0.6",
               "Time from diagnosis to treatment initiation",
               "0 months vs. 6 months",
               "3 months vs. 6 months",
               "1 years vs. 6 months",
               "2 years vs. 6 months",
               "Female vs. Male",
               "Heterosexual contact vs. Homo/Bisexual/Other contact",
               "TB history Yes versus No",
               "HIV Center:CL-FA vs. BR-IPEC",
               "HIV Center:HN-HE/SS vs. BR-IPEC",
               "HIV Center:MX-INCMNSZ vs. BR-IPEC",
               "HIV Center:PE-IMTAVH vs. BR-IPEC",
               "AIDS-defining events: Yes vs. No")
pval <- c(pval[1], rep(" ",4), 
          pval[2], rep(" ",4), 
          pval[3], rep(" ",4),
          pval[4], rep(" ",4), 
          pval[5:length(pval)])
output <- cbind(pred.vars, or, pval)
rownames(output) <- NULL
colnames(output) <- c("Predictors", 'Odds Ratio', "95% CI", "P-value")
output %>%
  kbl(caption = "<center> Table Odds Ratio of the Cumulative Probability Model (Haiti excluded and complete cases)", align = "l")  %>% kable_classic(html_font = "Cambria") %>%  
   add_indent(setdiff(1:20, seq(1,20,5)))
Table Odds Ratio of the Cumulative Probability Model (Haiti excluded and complete cases)
Predictors Odds Ratio 95% CI P-value
Age (years) <0.001
20 vs. 35 1.43 [1.10 ,1.86]
30 vs. 35 1.14 [1.07 ,1.22]
40 vs. 35 0.86 [0.83 ,0.89]
50 vs. 35 0.60 [0.52 ,0.69]
Baseline CD4 count (cells/ml) <0.001
50 vs. 200 1.11 [0.95 ,1.29]
100 vs. 200 1.11 [1.03 ,1.20]
300 vs. 200 0.81 [0.77 ,0.85]
500 vs. 200 0.48 [0.40 ,0.58]
Baseline viral load (10^5 copies/ml) <0.001
0.01 vs. 0.6 0.77 [0.71 ,0.83]
1 vs. 0.6 1.17 [1.12 ,1.22]
3 vs. 0.6 1.79 [1.52 ,2.10]
5 vs. 0.6 1.95 [1.63 ,2.32]
Time from diagnosis to treatment initiation 0.001
0 months vs. 6 months 0.86 [0.79 ,0.93]
3 months vs. 6 months 0.93 [0.89 ,0.97]
1 years vs. 6 months 1.13 [1.06 ,1.21]
2 years vs. 6 months 1.30 [1.13 ,1.49]
Female vs. Male 1.60 [1.28 ,2.02] <0.001
Heterosexual contact vs. Homo/Bisexual/Other contact 0.91 [0.75 ,1.10] 0.330
TB history Yes versus No 0.75 [0.58 ,0.98] 0.033
HIV Center:CL-FA vs. BR-IPEC 0.37 [0.29 ,0.47] <0.0001
HIV Center:HN-HE/SS vs. BR-IPEC 0.19 [0.09 ,0.39] <0.0001
HIV Center:MX-INCMNSZ vs. BR-IPEC 0.32 [0.26 ,0.41] <0.0001
HIV Center:PE-IMTAVH vs. BR-IPEC 0.46 [0.37 ,0.56] <0.0001
AIDS-defining events: Yes vs. No 1.08 [0.88 ,1.33] 0.461
#ANOVA table
t <- anova(mod.complete)
nvars <- c("sex", "cd4_incr", "f_cd4_v", "center.x", "age_trt", "tb_history", "preARTAIDS.clinical", "diag_to_trt", "rna_v_trt", "hivtrans")
nlbs <- c("Sex", "Increase in CD4", "Baseline CD4 count", "HIV center", "Age", "TB history", "AIDS defining events", "Time from diagnosis to treatment initiation", "Baseline viral load", "HIV transmission route")
for(i in seq_along(nvars)){
  rownames(t) <- gsub(nvars[i],nlbs[i],rownames(t))
}
t <- cbind(formatC(t[,1:2], digits = 2, format = "f"), format.pval(t[,"P"],digits = 3, eps = 1e-3))
t %>% 
  kbl(caption = "<center> Table Wald tests of predictors in the CPM (Haiti excluded and complete cases)", digits = 2, align = "l")  %>% kable_classic(html_font = "Cambria") %>%  add_indent(c(4, 6, 8, 13))
Table Wald tests of predictors in the CPM (Haiti excluded and complete cases)
Chi-Square d.f.
Sex 16.40 1.00 <0.001
HIV transmission route 0.95 1.00 0.330
Age 66.47 2.00 <0.001
Nonlinear 0.85 1.00 0.356
Baseline CD4 count 62.78 2.00 <0.001
Nonlinear 20.85 1.00 <0.001
Baseline viral load 60.58 2.00 <0.001
Nonlinear 41.42 1.00 <0.001
TB history 4.55 1.00 0.033
HIV center 115.72 4.00 <0.001
AIDS defining events 0.54 1.00 0.461
Time from diagnosis to treatment initiation 13.44 2.00 0.001
Nonlinear 10.92 1.00 <0.001
TOTAL NONLINEAR 73.79 4.00 <0.001
TOTAL 306.98 16.00 <0.001

We additionally identified TB history as a significant factor of increase in CD4 count under complete cases. TB history was associated with a higher risk of low increase in CD4 count.

5.3 Summary of the two analysis sets

  • Both analysis sets concluded that old age, high baseline CD4 count, and male were associated with a higher risk of low CD4 count increase after two-year treatment. Also, compared with HIV centers Brazil, HIV centers in other countries had a higher risk of low CD4 count increase.

  • In the analysis set excluding Haiti, high baseline CD4 count, low baseline viral load, and longer time from diagnosis to treatment initiation were associated with a higher risk of low CD4 count increase.

6 Aim 3 Statistical Analysis


6.1 Kaplan-Meier Curves

The overall survival probability over time was estimated for the four groups and is shown below.

p <- b %>% 
  select(patient, death_y, death_d, Group, art_sdate, age_trt, sex, center.x, l_visit_d, cd4_incr,f_cd4_v) %>% 
  rename("Sex" = sex, "Age" = age_trt, "Center" = center.x) %>%
  distinct()
p <- p %>% 
    mutate(time = ifelse(death_y==1, 
                         difftime(strptime(death_d, format = "%Y-%m-%d"), art_sdate, units = "days"),
                         difftime(strptime(l_visit_d, format = "%Y-%m-%d"), art_sdate, units = "days")))

p$year <- as.numeric(p$time)/365
mod <- survfit(Surv(year, death_y) ~ Group, data = p)
ggsurv <- ggsurvplot(mod,
          conf.int = FALSE,
          risk.table = TRUE,
          censor = F,
          pval = T,
          pval.method = F,
          pval.coord = c(1, 0.945),
          pval.method.coord = c(1, 0.95),
          risk.table.col = "strata", # increase risk table color by groups
          ggtheme = theme_classic(), # increase ggplot2 theme
          palette = c("red", "lightcoral","dodgerblue2","deepskyblue2"),
          xlab="Years since treatment initiation",
          legend.labs = paste("Increase", c("<150","[150,250)","[250,350]",">350"),sep=""),
          break.time.by = 1, xlim=c(0, 10), ylim=c(0.90,1)
          )
ggsurv$plot <- ggsurv$plot + theme(legend.position = "none", text = element_text(size=15)) 
ggsurv
<center>Figure Survival Probability in 10 years after ART initiation (Y axis=[0.9, 1])</center>

Figure Survival Probability in 10 years after ART initiation (Y axis=[0.9, 1])

The survival probabilities of the four groups were significantly different (log-rank test p-value=0.0061).

6.2 Cox Proportional Hazard Model

The sample size was 4756 and the number of failures was 147. We built the same Cox proportional hazard model for the 4756 patients. The predictors were increase in CD4 count and baseline CD4 count. To avoid the linearity assumption, we added restrict cubic splines of the two variables in the model (three knots for each variable). Because the increase in CD4 count might have different impacts on patients with different baseline CD4 count, the interaction terms of the predictors were also included in the model. Also, we applied square root transformation to the baseline CD4 count to reduce its skewness. The covariates were age, sex, and HIV center.

dd <- datadist(p)
options(datadist = "dd")
#set values for comparison in each covariate  
dd$limits["Adjust to","Center"] <- "BR-IPEC"
mod <- cph(Surv(time, death_y) ~ rcs(cd4_incr,3)*rcs(sqrt(f_cd4_v),3) + Age + Center + Sex, data = p, surv=T, x= T,y=T)
#Format a table of estimates
pval <- anova(mod)[,"P"][c("cd4_incr  (Factor+Higher Order Factors)", "f_cd4_v  (Factor+Higher Order Factors)", "Age", "Sex")]
pval <- format.pval(pval,digits = 3, eps = 1e-3)
#p-values for each HIV center 
idx <- startsWith(rownames(mod$var), "Center")
st <- coef(mod)[idx]/sqrt(diag(mod$var[idx, idx]))
p2 <- pnorm(abs(st), lower.tail = F) * 2
p2 <-  format.pval(p2,digits = 3, eps = 1e-3)
pval <- c(pval[1:(length(pval)-1)], p2, tail(pval,1))
hr <- summary(mod)
hr <- hr[startsWith(rownames(hr), " Hazard"),]
hr <- hr[4:nrow(hr),]
hr <- rbind(NA,
            summary(mod,  cd4_incr=c(200, 0), est.all = F)[" Hazard Ratio", ],
summary(mod,  cd4_incr=c(200, 100), est.all = F)[" Hazard Ratio", ],
summary(mod,  cd4_incr=c(200, 300), est.all = F)[" Hazard Ratio", ],
summary(mod,  cd4_incr=c(200, 500), est.all = F)[" Hazard Ratio", ],
NA,
summary(mod,  f_cd4_v=c(200, 50), est.all = F)[" Hazard Ratio", ],
summary(mod,  f_cd4_v=c(200, 100), est.all = F)[" Hazard Ratio", ],
summary(mod,  f_cd4_v=c(200, 300), est.all = F)[" Hazard Ratio", ],
summary(mod,  f_cd4_v=c(200, 500), est.all = F)[" Hazard Ratio", ],
NA,
summary(mod,  Age=c(35, 20), est.all = F)[" Hazard Ratio", ],
summary(mod,  Age=c(35, 30), est.all = F)[" Hazard Ratio", ],
summary(mod,  Age=c(35, 40), est.all = F)[" Hazard Ratio", ],
summary(mod,  Age=c(35, 50), est.all = F)[" Hazard Ratio", ],
hr)[,c("Effect", "Lower 0.95", "Upper 0.95")]
hr <- formatC(hr, digits = 2, format="f")
hr <- cbind(hr[,1], paste("[",hr[,2]," ,", hr[,3],"]",sep=""))
hr[seq(1,15,5),] <- " "
pred.vars <- c("Increase in CD4 count (cells/ml)",
               "0 vs. 200",
               "100 vs. 200",
               "300 vs. 200",
               "500 vs. 200",
               "Baseline CD4 count (cells/ml)",
               "50 vs. 200",
               "100 vs. 200",
               "300 vs. 200",
               "500 vs. 200",
               "Age (years)",
               "20 vs. 35", 
               "30 vs. 35",
               "40 vs. 35",
               "50 vs. 35",
               "HIV Center:CL-FA vs. BR-IPEC",
               "HIV Center:HN-HE/SS vs. BR-IPEC",
               "HIV Center:HT-GHESKIO vs. BR-IPEC",
               "HIV Center:MX-INCMNSZ vs. BR-IPEC",
               "HIV Center:PE-IMTAVH vs. BR-IPEC",
               "Female vs. Male")
pval <- c(pval[1], rep(" ",4), 
          pval[2], rep(" ",4), 
          pval[3], rep(" ",4),
          pval[4:length(pval)])
output <- cbind(pred.vars, hr, pval)
rownames(output) <- NULL
colnames(output) <- c("Predictors", 'Hazard Ratio', "95% CI", "P-value")
output %>%
  kbl(caption = "<center> Table Hazard Ratio in the Cox Proportional Model", align = "l")  %>% kable_classic(html_font = "Cambria") %>% 
  add_indent(setdiff(1:15, seq(1,15,5)))
Table Hazard Ratio in the Cox Proportional Model
Predictors Hazard Ratio 95% CI P-value
Increase in CD4 count (cells/ml) 0.019
0 vs. 200 2.10 [1.38 ,3.18]
100 vs. 200 1.41 [1.16 ,1.70]
300 vs. 200 0.84 [0.74 ,0.95]
500 vs. 200 0.90 [0.54 ,1.49]
Baseline CD4 count (cells/ml) 0.385
50 vs. 200 1.09 [0.77 ,1.54]
100 vs. 200 1.01 [0.85 ,1.20]
300 vs. 200 1.12 [0.97 ,1.30]
500 vs. 200 1.52 [0.92 ,2.51]
Age (years) <0.001
20 vs. 35 0.37 [0.30 ,0.46]
30 vs. 35 0.72 [0.67 ,0.77]
40 vs. 35 1.39 [1.30 ,1.50]
50 vs. 35 2.71 [2.19 ,3.34]
HIV Center:CL-FA vs. BR-IPEC 1.53 [0.86 ,2.74] 0.148
HIV Center:HN-HE/SS vs. BR-IPEC 0.34 [0.14 ,0.78] 0.011
HIV Center:HT-GHESKIO vs. BR-IPEC 0.67 [0.43 ,1.07] 0.092
HIV Center:MX-INCMNSZ vs. BR-IPEC 0.35 [0.17 ,0.75] 0.007
HIV Center:PE-IMTAVH vs. BR-IPEC 0.84 [0.46 ,1.54] 0.567
Female vs. Male 1.07 [0.75 ,1.54] 0.710

Note: a HR=1 means lack of association, a HR>1 suggests an higher risk, and a HR<1 suggests a lower risk.

Old age and low increase in CD4 count were associated with a high risk of all-cause mortality.

t <- anova(mod)
rownames(t) <- gsub("cd4_incr","Increase in CD4 count",rownames(t))
rownames(t) <- gsub("f_cd4_v","Baseline CD4 count",rownames(t))
rownames(t) <- gsub("Center","HIV center",rownames(t))
t <- cbind(formatC(t[,1:2], digits = 2, format = "f"), format.pval(t[,"P"],digits = 3, eps = 1e-3))
t %>% 
  kbl(caption = "<center> Table Wald tests of predictors in the Cox Proportional Model",digits = 2, align = "l")  %>% kable_classic(html_font = "Cambria") %>%  add_indent(c(2, 3, 5, 6, 11, 12, 13, 14, 15))
Table Wald tests of predictors in the Cox Proportional Model
Chi-Square d.f.
Increase in CD4 count (Factor+Higher Order Factors) 15.20 6.00 0.019
All Interactions 6.30 4.00 0.178
Nonlinear (Factor+Higher Order Factors) 6.85 3.00 0.077
Baseline CD4 count (Factor+Higher Order Factors) 6.35 6.00 0.385
All Interactions 6.30 4.00 0.178
Nonlinear (Factor+Higher Order Factors) 2.35 3.00 0.504
Age 85.15 1.00 <0.001
HIV center 21.77 5.00 <0.001
Sex 0.14 1.00 0.710
Increase in CD4 count * Baseline CD4 count (Factor+Higher Order Factors) 6.30 4.00 0.178
Nonlinear 4.59 3.00 0.205
Nonlinear Interaction : f(A,B) vs. AB 4.59 3.00 0.205
f(A,B) vs. Af(B) + Bg(A) 0.46 1.00 0.499
Nonlinear Interaction in Increase in CD4 count vs. Af(B) 1.86 2.00 0.395
Nonlinear Interaction in Baseline CD4 count vs. Bg(A) 0.65 2.00 0.722
TOTAL NONLINEAR 9.54 5.00 0.090
TOTAL NONLINEAR + INTERACTION 9.56 6.00 0.144
TOTAL 129.04 15.00 <0.001
#estimate survival probability at 10 years 
pred10 <- Predict(mod, cd4_incr, f_cd4_v = seq(100,600,100), 
                time=10*365, conf.int = F)

ggplot(pred10, xlim. = c(-50,400),ylim. = c(0.92,1)) + 
  ylab("Survival probability at 10 years after ART initiation") + 
  xlab("Increases in CD4 T cell count after 2 years of treatment (cells/ml)") +
  scale_color_manual(name="Baseline CD4 T cell count (cells/ml)",
                      values = c("red", "orange", "yellowgreen", "royalblue", "hotpink","purple"))+
  theme_classic()+theme(legend.position = "bottom")+
  guides(color=guide_legend(nrow=2,byrow=TRUE))+
  theme(text = element_text(size = 13))
<center>Figure Estimated overall survival probability at 10 years after treatment initiation </center>

Figure Estimated overall survival probability at 10 years after treatment initiation

Note: Covariates are fixed at median or mode.

7 Aim 4 Statistical Analysis


The diseases of interest are cardiovascular events (cardiovascular disorder, cerebrovascular disease), non–AIDS defining cancer, AIDS defining cancer, opportunistic infections. Because the number of events of each disease was very small, we combined cardiovascular events with non-AIDS defining cancer and combined opportunistic infections with AIDS defining cancer. Also, for each disease group, we excluded the patients who had any of the diseases during the first 2 years of treatment from the data. The sample size was 3553 for OIs and AIDS defining cancer, and was 4697 for cardiovascular events and non-AIDS defining cancer. The following table summarizes the number of censor, event and death for each disease group.

##########  Opportunistic Infections
#create data frame for clinical endpoints data
#last date is death date or last visit date (censored)
sdb <- b %>% 
  select(patient, Group, death_y, death_d, art_sdate, l_visit_d, country) %>% 
  distinct() %>% 
  group_by(patient) %>% 
  mutate(last_date = ifelse(death_y==1, death_d, l_visit_d)) %>% 
  mutate(last_date = strptime(last_date, format = "%Y-%m-%d"))
#after 2-year follow-up, time to 1st AIDS defining event(since treatment initiation)
#exclude patients having OIs in the 2-year follow-up or before treatment initiation
idx <- sdb %>%
  left_join(ce[ce$patient %in% b$patient,], by = "patient") %>% 
  filter(startsWith(ce_id, "ade_")) %>% 
  filter(ce_d_a != "U") %>% # 1 patient with ADE and unknown date 
  mutate(ce_date = strptime(ce_d, format = "%Y-%m-%d")) %>% 
  group_by(patient) %>% 
  arrange(ce_date) %>% 
  slice(1) %>% 
  mutate(time_to_event = difftime(ce_date, art_sdate, units = "days")) %>%
  mutate(oi = ifelse(time_to_event > 365*2, 1, 0)) %>% 
  select(patient, Group, last_date, death_y, art_sdate, time_to_event, ce_date,ce_id, oi, country)
tlbce <- sdb %>% 
  filter(!(patient %in% idx$patient)) %>% 
  mutate(time_to_event=difftime(last_date, art_sdate, units = "days"),
         ce_date=last_date,ce_id="none") %>%
  select(patient, Group, last_date, death_y, art_sdate, time_to_event, ce_date,ce_id, country) %>% 
  rbind(idx[idx$oi==1,] %>% select(-oi))
tlbce$event <- "Censored"
tlbce[tlbce$death_y == 1, ]$event <- "Death"
tlbce[tlbce$ce_id != "none", ]$event <- "Event"
tlbce$event <- factor(tlbce$event, labels =c("Censored","Event","Death"), levels = c("Censored","Event","Death"))

############## Cardiovascular Events
cds <- c("nade_cardiovascular","nade_cerebrovascular","nade_cancer")
idx <- sdb %>%
  left_join(ce[ce$patient %in% b$patient,], by = "patient") %>% 
  filter(ce_id %in% cds) %>% 
  filter(ce_d_a != "U") %>% # no patient with unknown date 
  mutate(ce_date = strptime(ce_d, format = "%Y-%m-%d")) %>% 
  group_by(patient) %>% 
  arrange(ce_date) %>% 
  slice(1) %>% 
  mutate(time_to_event = difftime(ce_date, art_sdate, units = "days")) %>%
  mutate(cd = ifelse(time_to_event > 365*2, 1, 0)) %>% 
  select(patient, Group, last_date, death_y, art_sdate, time_to_event, ce_date,ce_id, cd, country)
tlbcrd <- sdb %>% 
  filter(!(patient %in% idx$patient)) %>% 
  mutate(time_to_event=difftime(last_date, art_sdate, units = "days"),
         ce_date=last_date,ce_id="none") %>%
  select(patient, Group, last_date, death_y, art_sdate, time_to_event, ce_date,ce_id, country) %>% 
  rbind(idx[idx$cd==1,] %>% select(-cd))
tlbcrd$event <- "Censored"
tlbcrd[tlbcrd$death_y == 1, ]$event <- "Death"
tlbcrd[tlbcrd$ce_id != "none", ]$event <- "Event"
tlbcrd$event <- factor(tlbcrd$event, labels =c("Censored","Event","Death"), levels = c("Censored","Event","Death"))

##Show counts of patients with events 
t1 <- rbind(table(tlbce$event,tlbce$Group),
            colSums(table(tlbce$event,tlbce$Group)),
            table(tlbcrd$event,tlbcrd$Group),
            colSums(table(tlbcrd$event,tlbcrd$Group)))
t1 <- cbind(t1, rowSums(t1))
p1 <- t(apply(t1[1:3,],1, function(x) paste(x," (",round(x/t1[4,]*100,2),"%)",sep="")))
p2 <- t(apply(t1[5:7,],1, function(x) paste(x," (",round(x/t1[8,]*100,2),"%)",sep="")))
t1[1:3,] <- p1
t1[5:7,] <- p2
rownames(t1) <- rep(c("Censor","Event","Death","Total"), 2)
colnames(t1) <- c(paste("Increase",c("<150","=[150,250)","=[250,350]",">350"),sep=""),"Total")
kable(t1,booktabs = T, caption = "<center> Table Number of patients with specific diseases in each group", align = "c") %>% kable_classic(html_font = "Cambria") %>%
  pack_rows("Opportunistic infections and AIDS defining cancer", 1, 4) %>%
  pack_rows("Cardiovascular events and Non-AIDS defining cancer", 5, 8)
Table Number of patients with specific diseases in each group
Increase<150 Increase=[150,250) Increase=[250,350] Increase>350 Total
Opportunistic infections and AIDS defining cancer
Censor 955 (93.54%) 800 (95.35%) 699 (96.81%) 942 (97.01%) 3396 (95.58%)
Event 29 (2.84%) 20 (2.38%) 9 (1.25%) 15 (1.54%) 73 (2.05%)
Death 37 (3.62%) 19 (2.26%) 14 (1.94%) 14 (1.44%) 84 (2.36%)
Total 1021 839 722 971 3553
Cardiovascular events and Non-AIDS defining cancer
Censor 1240 (95.17%) 1064 (95.34%) 928 (96.57%) 1282 (97.34%) 4514 (96.1%)
Event 11 (0.84%) 17 (1.52%) 14 (1.46%) 13 (0.99%) 55 (1.17%)
Death 52 (3.99%) 35 (3.14%) 19 (1.98%) 22 (1.67%) 128 (2.73%)
Total 1303 1116 961 1317 4697
##Show counts of patients with events 
t1 <- rbind(table(tlbce$event,tlbce$country),
            colSums(table(tlbce$event,tlbce$country)),
            table(tlbcrd$event,tlbcrd$country),
            colSums(table(tlbcrd$event,tlbcrd$country)))
t1 <- cbind(t1, rowSums(t1))
p1 <- t(apply(t1[1:3,],1, function(x) paste(x," (",round(x/t1[4,]*100,2),"%)",sep="")))
p2 <- t(apply(t1[5:7,],1, function(x) paste(x," (",round(x/t1[8,]*100,2),"%)",sep="")))
t1[1:3,] <- p1
t1[5:7,] <- p2
rownames(t1) <- rep(c("Censor","Event","Death","Total"), 2)
colnames(t1) <- Cs(BRA,CHL,HND,HTI,MEX,PER,Total)

kable(t1,booktabs = T, caption = "<center> Table Number of patients with specific diseases in each country", align = "c") %>% kable_classic(html_font = "Cambria") %>%
  pack_rows("Opportunistic infections and AIDS defining cancer", 1, 4) %>%
  pack_rows("Cardiovascular events and Non-AIDS defining cancer", 5, 8)
Table Number of patients with specific diseases in each country
BRA CHL HND HTI MEX PER Total
Opportunistic infections and AIDS defining cancer
Censor 539 (95.91%) 349 (96.14%) 73 (85.88%) 1365 (94.14%) 354 (96.72%) 716 (98.49%) 3396 (95.58%)
Event 6 (1.07%) 4 (1.1%) 9 (10.59%) 40 (2.76%) 10 (2.73%) 4 (0.55%) 73 (2.05%)
Death 17 (3.02%) 10 (2.75%) 3 (3.53%) 45 (3.1%) 2 (0.55%) 7 (0.96%) 84 (2.36%)
Total 562 363 85 1450 366 727 3553
Cardiovascular events and Non-AIDS defining cancer
Censor 735 (93.87%) 458 (95.22%) 204 (96.68%) 1647 (96.54%) 496 (94.66%) 974 (98.19%) 4514 (96.1%)
Event 24 (3.07%) 6 (1.25%) 1 (0.47%) 0 (0%) 20 (3.82%) 4 (0.4%) 55 (1.17%)
Death 24 (3.07%) 17 (3.53%) 6 (2.84%) 59 (3.46%) 8 (1.53%) 14 (1.41%) 128 (2.73%)
Total 783 481 211 1706 524 992 4697

The 40 patients in Haiti all had tuberculosis and did not have any other AIDS defining events. Because Haiti did not submit clinical endpoints data except for tuberculosis. Therefore, we excluded Haiti’s data in the following analyses.

After Haiti was excluded, the sample size was 2103 for OIs and AIDS defining cancer, and was 2991 for cardiovascular events and non-AIDS defining cancer.

#exclude Haiti 
tlbce <- tlbce[tlbce$country != "HTI",]
tlbcrd <- tlbcrd[tlbcrd$country != "HTI",]
##Show counts of patients with events 
t1 <- rbind(table(tlbce$event,tlbce$Group),
            colSums(table(tlbce$event,tlbce$Group)),
            table(tlbcrd$event,tlbcrd$Group),
            colSums(table(tlbcrd$event,tlbcrd$Group)))
t1 <- cbind(t1, rowSums(t1))
p1 <- t(apply(t1[1:3,],1, function(x) paste(x," (",round(x/t1[4,]*100,2),"%)",sep="")))
p2 <- t(apply(t1[5:7,],1, function(x) paste(x," (",round(x/t1[8,]*100,2),"%)",sep="")))
t1[1:3,] <- p1
t1[5:7,] <- p2
rownames(t1) <- rep(c("Censor","Event","Death","Total"), 2)
colnames(t1) <- c(paste("Increase",c("<150","=[150,250)","=[250,350]",">350"),sep=""),"Total")
kable(t1,booktabs = T, caption = "<center> Table 7 Number of patients with specific diseases in each group (Haiti excluded)", align = "c") %>% kable_classic(html_font = "Cambria") %>%
  pack_rows("Opportunistic infections and AIDS defining cancer", 1, 4) %>%
  pack_rows("Cardiovascular events and Non-AIDS defining cancer", 5, 8)
Table 7 Number of patients with specific diseases in each group (Haiti excluded)
Increase<150 Increase=[150,250) Increase=[250,350] Increase>350 Total
Opportunistic infections and AIDS defining cancer
Censor 557 (94.41%) 482 (97.18%) 417 (98.35%) 575 (96.96%) 2031 (96.58%)
Event 16 (2.71%) 7 (1.41%) 2 (0.47%) 8 (1.35%) 33 (1.57%)
Death 17 (2.88%) 7 (1.41%) 5 (1.18%) 10 (1.69%) 39 (1.85%)
Total 590 496 424 593 2103
Cardiovascular events and Non-AIDS defining cancer
Censor 747 (95.16%) 699 (95.1%) 590 (96.56%) 831 (96.63%) 2867 (95.85%)
Event 11 (1.4%) 17 (2.31%) 14 (2.29%) 13 (1.51%) 55 (1.84%)
Death 27 (3.44%) 19 (2.59%) 7 (1.15%) 16 (1.86%) 69 (2.31%)
Total 785 735 611 860 2991

There were some patients who dead before having an event, so death was a competing event of the diseases. We used cumulative incidence analysis to estimate marginal probability of the diseases in the presence of death.

##########  Model for Opportunistic Infections and AIDS defining cancer
tlbce$year_to_event <- tlbce$time_to_event/365
cif.oi <- cuminc(ftime = tlbce$year_to_event, fstatus = tlbce$event, group = tlbce$Group)
lbs <- paste("Increase",c("<150","=[150,250)","=[250,350]",">350"),sep="")
plot(cif.oi[5:8], col = c("red", "lightcoral", "dodgerblue2","deepskyblue2"), ylab = "Probability", xlab = "Years since treatment initiation",
     main = "Opportunistic infections and AIDS defining cancer", lty = 1, ylim = c(0,0.04), xlim = c(2, 10),lwd = 2,wh=1:2, cex=1.5)
legend(1.8,0.04, legend=lbs, ncol=2,
       col=c("red", "lightcoral", "dodgerblue2","deepskyblue2"), lty=1, cex=0.8, box.lty=0)
<center>Figure Probability of opportunistic infections and AIDS defining cancer in 10 years of HAART treatment</center>

Figure Probability of opportunistic infections and AIDS defining cancer in 10 years of HAART treatment

######Gray's test
cif.oi$Tests
##              stat         pv df
## Censored 3.744073 0.29045872  3
## Event    8.563881 0.03568808  3
## Death    5.119224 0.16327227  3

By Gray’s test, the cumulative incidence of opportunistic infections and AIDS defining cancer was significantly different between the four groups (p-value=0.036).

######### Model for cardiovascular events and non-AIDS defining neoplasms
tlbcrd$year_to_event <- tlbcrd$time_to_event/365
cif.naid <- cuminc(ftime =tlbcrd$year_to_event, fstatus = tlbcrd$event, group = tlbcrd$Group)
plot(cif.naid[5:8], col = c("red", "lightcoral", "dodgerblue2","deepskyblue2"), ylab = "Probability", xlab = "Years since treatment initiation",
     main = "Cardiovascular events and non-AIDS defining cancer", lty = 1, ylim = c(0,0.03), xlim = c(2, 10),lwd = 2,wh=1:2, cex=1.5)
<center>Figure 7 Probability of cardiovascular events and non-AIDS defining cancer in 10 years of HAART treatment</center>

Figure 7 Probability of cardiovascular events and non-AIDS defining cancer in 10 years of HAART treatment

# legend(1.8,0.03, legend=lbs, ncol=2,
#        col=c("red", "lightcoral", "dodgerblue2","deepskyblue2"), lty=1, cex=0.8, box.lty=0)
######Gray's test
cif.naid$Tests
##              stat         pv df
## Censored 6.711760 0.08167505  3
## Event    2.940320 0.40091852  3
## Death    9.194762 0.02681043  3

According to Gray’s test, we did not find sufficient evidence for the difference in the cumulative incidence of cardiovascular events and non-AIDS defining neoplasms between the four groups (p-value=0.40).

8 Summary


  • The CCASAnet database included 49,449 patients, of which 4,756 patients were enrolled in this study. The 4,756 eligible patients were from Brazil (17%), Chile (10%), Honduras (4%), Haiti (36%), Mexico (11%), and Peru (21%). Eligible patients were 66% male and had a median age of 38 years at ART initiation (interquartile range 30-45). A total of 28% (n=1326) had increase in CD4 count <150 cells/ml after two years of treatment, 24% (1127) had increase between 150 and 250 cells/ml (150 included), 20% (972) had increase between 250 and 350 cells/ml (both values included), and 28% (1331) had increase >350 cells/ml. The four groups of patients had significant differences in all the baseline characteristics except history of tuberculosis (Table 1).

  • Old age, high baseline CD4 count, and male were associated with a high risk of low CD4 count increase after two years.

  • Patients with low increase in CD4 count or/and old age would have a high risk of all-cause mortality.

  • Low increase in CD4 count after two years was associated with a high risk of opportunistic infections and AIDS defining events.

9 Software information


sessionInfo()
## R version 4.0.3 (2020-10-10)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur 10.16
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] cmprsk_2.2-10    scales_1.1.1     table1_1.2.1     rms_6.2-0       
##  [5] SparseM_1.78     Hmisc_4.4-2      Formula_1.2-4    lattice_0.20-41 
##  [9] survminer_0.4.8  ggpubr_0.4.0     survival_3.2-7   kableExtra_1.3.1
## [13] ggplot2_3.3.3    dplyr_1.0.3      tidyr_1.1.2     
## 
## loaded via a namespace (and not attached):
##  [1] nlme_3.1-149        matrixStats_0.58.0  webshot_0.5.2      
##  [4] RColorBrewer_1.1-2  httr_1.4.2          latex2exp_0.4.0    
##  [7] tools_4.0.3         backports_1.2.1     R6_2.5.0           
## [10] rpart_4.1-15        mgcv_1.8-33         colorspace_2.0-0   
## [13] nnet_7.3-14         withr_2.4.1         tidyselect_1.1.0   
## [16] gridExtra_2.3       curl_4.3            compiler_4.0.3     
## [19] rvest_0.3.6         quantreg_5.83       htmlTable_2.1.0    
## [22] xml2_1.3.2          sandwich_3.0-0      labeling_0.4.2     
## [25] checkmate_2.0.0     survMisc_0.5.5      polspline_1.1.19   
## [28] mvtnorm_1.1-1       stringr_1.4.0       digest_0.6.27      
## [31] foreign_0.8-80      rmarkdown_2.6       rio_0.5.16         
## [34] base64enc_0.1-3     jpeg_0.1-8.1        pkgconfig_2.0.3    
## [37] htmltools_0.5.1.1   highr_0.8           htmlwidgets_1.5.3  
## [40] rlang_0.4.10        readxl_1.3.1        rstudioapi_0.13    
## [43] farver_2.0.3        generics_0.1.0      zoo_1.8-8          
## [46] zip_2.1.1           car_3.0-10          magrittr_2.0.1     
## [49] Matrix_1.2-18       Rcpp_1.0.8          munsell_0.5.0      
## [52] abind_1.4-5         lifecycle_1.0.0     stringi_1.5.3      
## [55] multcomp_1.4-15     yaml_2.2.1          carData_3.0-4      
## [58] MASS_7.3-53         grid_4.0.3          forcats_0.5.1      
## [61] crayon_1.3.4        haven_2.3.1         splines_4.0.3      
## [64] hms_1.0.0           knitr_1.31          pillar_1.4.7       
## [67] ggsignif_0.6.1      codetools_0.2-16    glue_1.4.2         
## [70] evaluate_0.14       latticeExtra_0.6-29 data.table_1.13.6  
## [73] png_0.1-7           vctrs_0.3.6         MatrixModels_0.4-1 
## [76] cellranger_1.1.0    gtable_0.3.0        purrr_0.3.4        
## [79] km.ci_0.5-2         xfun_0.20           openxlsx_4.2.3     
## [82] xtable_1.8-4        broom_0.7.4         rstatix_0.7.0      
## [85] viridisLite_0.3.0   tibble_3.0.5        conquer_1.0.2      
## [88] KMsurv_0.1-5        cluster_2.1.0       TH.data_1.0-10     
## [91] ellipsis_0.3.1