This report shows source data, statistical code and outputs of Weibull regression model in a Bayesian framework. The regression model is applied to the time-to-event data, which are shown in Figure 4A, B in the manuscript. The resulting effect size is shown in Figure 4D utilizing full posterior distributions.
The Weibull model allows us to assume a monotonically increasing hazard function h_W (t). In other words, the chance of wound closure increases with time. Results of the Weibull regression model in Figure 4A-B (smooth curves) are shown in terms of the survival function S_W (t), which describes the probability that wound closure has not occurred by the elapsed time.
For the definitions of h_W (t) and S_W (t) see the manuscript.
For a complete discussion of statistical analysis, see the Supporting Information, section Statistical analysis.
All analyses were done using R sofware in RStudio interface.
### Clear R environment:
rm(list=ls())
R.version
## _
## platform x86_64-w64-mingw32
## arch x86_64
## os mingw32
## system x86_64, mingw32
## status
## major 4
## minor 1.2
## year 2021
## month 11
## day 01
## svn rev 81115
## language R
## version.string R version 4.1.2 (2021-11-01)
## nickname Bird Hippie
rstudioapi::versionInfo()$version
## [1] '2022.12.0.353'
### Uploading R packages:
library(here) # file referencing
library(dplyr) # for data manipulation
library(ggplot2) # plot data
library(brms) # Bayesian model fit
library(bayesplot) # plot posterior distributions
library(loo) # leave-one-out cross-validation
library(tidyr) # data manipulation
library(tibble) # related to tidyr
library(ggdist) # plot posterior: stat_halfeye(), stat_slab()
library(wesanderson) # set posterior palette: wes_palette()
treat, group_id, status should be coded as
factors
day_of_scar_formation, censoring_status are
numerical variables
### Load data
rm(list=ls())
### Upload (! Use your own path to the data table !)
file.exists("c:/Users/KindermannM/Documents/OneDrive_hotmail_gmail/4_biomedical_center/R2/1_uochb/Projects/HPMA_patent/data/Figure_4.txt")
## [1] TRUE
### Upload (! Use your own path to the data table !)
data <- read.delim("c:/Users/KindermannM/Documents/OneDrive_hotmail_gmail/4_biomedical_center/R2/1_uochb/Projects/HPMA_patent/data/Figure_4.txt")
attach(data)
treat <- as.factor(treat)
group_ID <- as.factor(group_ID)
status <- as.factor(status)
day_of_scar_formation <- as.numeric(day_of_scar_formation)
censoring_status <- as.numeric(censoring_status)
data
## group_ID glyc_pre glyc_post diff day_of_scar_formation status treat
## 1 1 8.4 8.7 0.3 13 healthy C
## 2 1 7.7 19.3 11.6 14 diabetic B
## 3 1 5.2 19.1 13.9 19 diabetic A
## 4 1 6.7 20.3 13.6 19 diabetic D
## 5 1 6.8 16.4 9.6 20 diabetic E
## 6 2 6.4 5.6 -0.8 13 healthy C
## 7 2 7.8 14.5 6.7 14 diabetic B
## 8 2 8.4 10.7 2.3 20 diabetic A
## 9 2 6.5 10.7 4.2 19 diabetic D
## 10 2 5.9 9.4 3.5 20 diabetic E
## 11 3 7.8 8.3 0.5 12 healthy C
## 12 3 6.8 23.2 16.4 13 diabetic B
## 13 3 7.2 22.8 15.6 19 diabetic A
## 14 3 7.6 22.4 14.8 19 diabetic D
## 15 3 6.2 21.6 15.4 19 diabetic E
## censoring_status treat_name
## 1 1 helthy_control
## 2 1 NF_CopFNDsiRNA
## 3 1 diabetic_control
## 4 1 NF
## 5 1 NF_CopFNDctrlRNA
## 6 1 helthy_control
## 7 1 NF_CopFNDsiRNA
## 8 1 diabetic_control
## 9 1 NF
## 10 1 NF_CopFNDctrlRNA
## 11 1 helthy_control
## 12 1 NF_CopFNDsiRNA
## 13 1 diabetic_control
## 14 1 NF
## 15 1 NF_CopFNDctrlRNA
The weakly informative priors for the coefficients 𝛽 are
centered around 0, reflecting the conservative assumption that the
difference between the reference treatment and the others can be
positive, negative or equal to zero. Standard deviation (sd) of this
prior was estimated as sd = 2 × standard
deviation(day_of_scar_formation); prior(normal(0, 6)).
See section Statistical analysis in manuscript for the formal
model notation.
### Bayesian model
set.seed = 17
bform_1 <- bf(day_of_scar_formation ~ 0 + treat)
prior_1 <- c(prior(normal(0, 6), class = b))
fit_1 <- brm(bform_1,
data = data,
family = brmsfamily("weibull"),
prior = prior_1,
warmup = 2000,
iter = 5000,
chains = 4,
cores = 4,
control = list(adapt_delta = 0.95, max_treedepth = 15))
fit_1 <- add_criterion(fit_1, "loo")
### Summary of a fitted model
summary(fit_1)
## Family: weibull
## Links: mu = log; shape = identity
## Formula: day_of_scar_formation ~ 0 + treat
## Data: data (Number of observations: 15)
## Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
## total post-warmup draws = 12000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## treatA 2.96 0.02 2.92 3.00 1.00 8144 7341
## treatB 2.62 0.02 2.58 2.66 1.00 8572 6874
## treatC 2.54 0.02 2.51 2.58 1.00 7694 6916
## treatD 2.93 0.02 2.90 2.97 1.00 8513 6071
## treatE 2.98 0.02 2.94 3.01 1.00 8837 6922
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## shape 38.41 10.33 20.54 61.05 1.00 4005 5993
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
### Display conditional effects
conditional_effects(fit_1)
### Distribution of outcome vs. newly simulated data distribution
### (dark line should be close to light lines)
pp_check(fit_1, ndraws = 50)
### Bayesian version of R-squared for regression model
bayes_R2(fit_1)
## Estimate Est.Error Q2.5 Q97.5
## R2 0.9710928 0.00994422 0.9454639 0.9804937
### "Leave-one-out extreme value diagnostic"
loo(fit_1, 'moment_match = TRUE')
##
## Computed from 12000 by 15 log-likelihood matrix
##
## Estimate SE
## elpd_loo -14.5 2.5
## p_loo 4.5 0.9
## looic 29.0 4.9
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 12 80.0% 2194
## (0.5, 0.7] (ok) 2 13.3% 2282
## (0.7, 1] (bad) 1 6.7% 125
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
plot(loo(fit_1, 'moment_match = TRUE'))
### Trace rank plot: these histograms should be reasonable uniform, with no
### chain spiking above or below the others.
mcmc_rank_overlay(fit_1)
### Check whether the priors were informative:
### Is the posterior standard deviation greater than 10% of the prior
### standard deviation?
bayestestR::check_prior(fit_1)
## Parameter Prior_Quality
## 1 b_treatA uninformative
## 2 b_treatB uninformative
## 3 b_treatC uninformative
## 4 b_treatD uninformative
## 5 b_treatE uninformative
Prior sensitivity analysis: Try to check the more narrower prior.
### Bayesian model
set.seed = 17
bform_1.1 <- bf(day_of_scar_formation ~ 0 + treat)
prior_1.1 <- c(prior(normal(0, 1), class = b))
fit_1.1 <- brm(bform_1.1,
data = data,
family = brmsfamily("weibull"),
prior = prior_1.1,
warmup = 2000,
iter = 5000,
chains = 4,
cores = 4,
control = list(adapt_delta = 0.95, max_treedepth = 15))
fit_1.1 <- add_criterion(fit_1.1, "loo")
### Summary of a fitted model
summary(fit_1.1)
## Family: weibull
## Links: mu = log; shape = identity
## Formula: day_of_scar_formation ~ 0 + treat
## Data: data (Number of observations: 15)
## Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
## total post-warmup draws = 12000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## treatA 2.96 0.02 2.92 3.00 1.00 7993 6348
## treatB 2.62 0.02 2.58 2.66 1.00 7514 6184
## treatC 2.54 0.02 2.50 2.58 1.00 7509 5994
## treatD 2.93 0.02 2.90 2.97 1.00 8921 6266
## treatE 2.97 0.02 2.94 3.02 1.00 8367 6220
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## shape 37.30 10.27 19.81 59.85 1.00 3293 5596
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
### Display conditional effects
conditional_effects(fit_1.1)
### Distribution of outcome vs. newly simulated data distribution
### (dark line should be close to light lines)
pp_check(fit_1.1, ndraws = 50)
### Bayesian version of R-squared for regression model
bayes_R2(fit_1.1)
## Estimate Est.Error Q2.5 Q97.5
## R2 0.9706032 0.01113044 0.9422147 0.9805153
### "Leave-one-out extreme value diagnostic"
loo(fit_1.1, 'moment_match = TRUE')
##
## Computed from 12000 by 15 log-likelihood matrix
##
## Estimate SE
## elpd_loo -14.7 2.3
## p_loo 4.5 0.8
## looic 29.3 4.6
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 10 66.7% 1378
## (0.5, 0.7] (ok) 4 26.7% 1813
## (0.7, 1] (bad) 1 6.7% 233
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
plot(loo(fit_1.1, 'moment_match = TRUE'))
### Trace rank plot: these histograms should be reasonable uniform, with no
### chain spiking above or below the others.
mcmc_rank_overlay(fit_1.1)
### Check whether the priors were informative:
### Is the posterior standard deviation greater than 10% of the prior
### standard deviation?
bayestestR::check_prior(fit_1.1)
## Parameter Prior_Quality
## 1 b_treatA uninformative
## 2 b_treatB uninformative
## 3 b_treatC uninformative
## 4 b_treatD uninformative
## 5 b_treatE uninformative
Prior sensitivity analysis: Try to check the more wider prior.
### Bayesian model
set.seed = 17
bform_1.2 <- bf(day_of_scar_formation ~ 0 + treat)
prior_1.2 <- c(prior(normal(0, 12), class = b))
fit_1.2 <- brm(bform_1.2,
data = data,
family = brmsfamily("weibull"),
prior = prior_1.2,
warmup = 2000,
iter = 5000,
chains = 4,
cores = 4,
control = list(adapt_delta = 0.95, max_treedepth = 15))
fit_1.2 <- add_criterion(fit_1.2, "loo")
### Summary of a fitted model
summary(fit_1.2)
## Family: weibull
## Links: mu = log; shape = identity
## Formula: day_of_scar_formation ~ 0 + treat
## Data: data (Number of observations: 15)
## Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
## total post-warmup draws = 12000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## treatA 2.96 0.02 2.92 3.00 1.00 8461 6713
## treatB 2.62 0.02 2.58 2.66 1.00 8261 6486
## treatC 2.54 0.02 2.51 2.58 1.00 7253 6329
## treatD 2.93 0.02 2.90 2.97 1.00 8537 6291
## treatE 2.98 0.02 2.94 3.02 1.00 8206 6330
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## shape 37.81 10.30 20.37 60.49 1.00 4606 5890
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
### Display conditional effects
conditional_effects(fit_1.2)
### Distribution of outcome vs. newly simulated data distribution
### (dark line should be close to light lines)
pp_check(fit_1.2, ndraws = 50)
### Bayesian version of R-squared for regression model
bayes_R2(fit_1.2)
## Estimate Est.Error Q2.5 Q97.5
## R2 0.9707025 0.01056215 0.9430931 0.9804991
### "Leave-one-out extreme value diagnostic"
loo(fit_1.2, 'moment_match = TRUE')
##
## Computed from 12000 by 15 log-likelihood matrix
##
## Estimate SE
## elpd_loo -14.7 2.4
## p_loo 4.6 0.9
## looic 29.5 4.9
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 14 93.3% 2011
## (0.5, 0.7] (ok) 0 0.0% <NA>
## (0.7, 1] (bad) 1 6.7% 61
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
plot(loo(fit_1.2, 'moment_match = TRUE'))
### Trace rank plot: these histograms should be reasonable uniform, with no
### chain spiking above or below the others.
mcmc_rank_overlay(fit_1.2)
### Check whether the priors were informative:
### Is the posterior standard deviation greater than 10% of the prior
### standard deviation?
bayestestR::check_prior(fit_1.2)
## Parameter Prior_Quality
## 1 b_treatA uninformative
## 2 b_treatB uninformative
## 3 b_treatC uninformative
## 4 b_treatD uninformative
## 5 b_treatE uninformative
### See chapter 11.5 in Statistical rethinking with brms, ggplot2, and the
### tidyverse: Second edition version 0.4.0
### A Solomon Kurz; 2023-01-26
### Bonus: Survival analysis
### Back-transformation of estimated β parameters into the λ metric
### See equation (4) in manuscript
est_shape <- 37.85
est_scaleA <- exp(fixef(fit_1)[1, 1]) / gamma(1 + 1/est_shape)
est_scaleB <- exp(fixef(fit_1)[2, 1]) / gamma(1 + 1/est_shape)
est_scaleC <- exp(fixef(fit_1)[3, 1]) / gamma(1 + 1/est_shape)
est_scaleD <- exp(fixef(fit_1)[4, 1]) / gamma(1 + 1/est_shape)
est_scaleE <- exp(fixef(fit_1)[5, 1]) / gamma(1 + 1/est_shape)
ll_scaleA <- exp(fixef(fit_1)[1, 3]) / gamma(1 + 1/est_shape)
ll_scaleB <- exp(fixef(fit_1)[2, 3]) / gamma(1 + 1/est_shape)
ll_scaleC <- exp(fixef(fit_1)[3, 3]) / gamma(1 + 1/est_shape)
ll_scaleD <- exp(fixef(fit_1)[4, 3]) / gamma(1 + 1/est_shape)
ll_scaleE <- exp(fixef(fit_1)[5, 3]) / gamma(1 + 1/est_shape)
hh_scaleA <- exp(fixef(fit_1)[1, 4]) / gamma(1 + 1/est_shape)
hh_scaleB <- exp(fixef(fit_1)[2, 4]) / gamma(1 + 1/est_shape)
hh_scaleC <- exp(fixef(fit_1)[3, 4]) / gamma(1 + 1/est_shape)
hh_scaleD <- exp(fixef(fit_1)[4, 4]) / gamma(1 + 1/est_shape)
hh_scaleE <- exp(fixef(fit_1)[5, 4]) / gamma(1 + 1/est_shape)
### Get survival functions
### See equation (2) in manuscript
est <-
data.frame(days = seq(0, 30, 0.001)) %>%
mutate(est_SA = exp(-(days/est_scaleA)^est_shape)) %>%
mutate(est_SB = exp(-(days/est_scaleB)^est_shape)) %>%
mutate(est_SC = exp(-(days/est_scaleC)^est_shape)) %>%
mutate(est_SD = exp(-(days/est_scaleD)^est_shape)) %>%
mutate(est_SE = exp(-(days/est_scaleE)^est_shape))
ll <-
data.frame(days = seq(0, 30, 0.001)) %>%
mutate(ll_SA = exp(-(days/ll_scaleA)^est_shape)) %>%
mutate(ll_SB = exp(-(days/ll_scaleB)^est_shape)) %>%
mutate(ll_SC = exp(-(days/ll_scaleC)^est_shape)) %>%
mutate(ll_SD = exp(-(days/ll_scaleD)^est_shape)) %>%
mutate(ll_SE = exp(-(days/ll_scaleE)^est_shape))
hh <-
data.frame(days = seq(0, 30, 0.001)) %>%
mutate(hh_SA = exp(-(days/hh_scaleA)^est_shape)) %>%
mutate(hh_SB = exp(-(days/hh_scaleB)^est_shape)) %>%
mutate(hh_SC = exp(-(days/hh_scaleC)^est_shape)) %>%
mutate(hh_SD = exp(-(days/hh_scaleD)^est_shape)) %>%
mutate(hh_SE = exp(-(days/hh_scaleE)^est_shape))
### Get hazard functions
### See equation (3) in manuscript
est_hazard <-
data.frame(days = seq(0, 20, 0.001)) %>%
mutate(est_SA = (est_shape/est_scaleA) * (days/est_scaleA)^(est_shape - 1)) %>%
mutate(est_SA = est_SA / max(est_SA)) %>%
mutate(est_SB = (est_shape/est_scaleB) * (days/est_scaleB)^(est_shape - 1)) %>%
mutate(est_SB = est_SB / max(est_SB)) %>%
mutate(est_SC = (est_shape/est_scaleC) * (days/est_scaleC)^(est_shape - 1)) %>%
mutate(est_SC = est_SC / max(est_SC)) %>%
mutate(est_SD = (est_shape/est_scaleD) * (days/est_scaleD)^(est_shape - 1)) %>%
mutate(est_SD = est_SD / max(est_SD)) %>%
mutate(est_SE = (est_shape/est_scaleE) * (days/est_scaleE)^(est_shape - 1)) %>%
mutate(est_SE = est_SE / max(est_SE))
### Plot the full posterior distributions
post <-
as_draws_df(fit_1) %>%
pivot_longer(starts_with("b_")) %>%
mutate(days = qweibull(p = .5, shape, scale = exp(value) / gamma( 1 + 1/shape))) %>%
mutate(days_diff = days - days[name == "b_treatC" ])
(g1 <-
post %>%
ggplot(aes(x = days, y = name)) +
stat_halfeye(aes(fill = after_stat(level)), # area under curve colors
point_interval = mean_qi, # circle = mean + quantile intervals
.width = c(0.67, .89, 1), # areas under curve
slab_color = "#000000", # posterior curve color
slab_linewidth = 0.7,
height = 3) +
stat_spike(at = c(mean)) +
scale_fill_brewer(type = "seq", # collor pallete - area under curve
palette = "Greys",
direction = 1,
aesthetics = "fill") +
scale_colour_manual(values = c("#000000")) + # posterior curve color (must be here)
labs(x="Time until 50% of mice experienced wound closure") +
labs(y="") +
scale_x_continuous(limits = c(11, 22),
breaks = (c(10, 12, 14, 16, 18, 20, 22))) +
theme(axis.ticks.y = element_blank()) +
theme(legend.position="none", plot.title = element_text(size=10),
aspect.ratio = 0.8,
plot.margin = margin(1, 1, 1, 1, "cm"),
axis.text=element_text(size=10),
axis.title=element_text(size=10),
plot.background = element_rect(fill='transparent', color=NA),
panel.border = element_rect(colour = "black", fill=NA, size=0.5),
panel.grid.major = element_line(color = "grey", size = 0.5),
panel.background = element_rect(fill = "white", colour = "white"),
legend.background = element_rect(fill='transparent'))
)