Introduction

This report shows source data, statistical code and outputs of Weibull regression model in a Bayesian framework. The regression model is applied to the time-to-event data, which are shown in Figure 4A, B in the manuscript. The resulting effect size is shown in Figure 4D utilizing full posterior distributions.

The Weibull model allows us to assume a monotonically increasing hazard function h_W (t). In other words, the chance of wound closure increases with time. Results of the Weibull regression model in Figure 4A-B (smooth curves) are shown in terms of the survival function S_W (t), which describes the probability that wound closure has not occurred by the elapsed time.

For the definitions of h_W (t) and S_W (t) see the manuscript.

For a complete discussion of statistical analysis, see the Supporting Information, section Statistical analysis.

All analyses were done using R sofware in RStudio interface.


Software version

### Clear R environment:
rm(list=ls())
R.version
##                _                           
## platform       x86_64-w64-mingw32          
## arch           x86_64                      
## os             mingw32                     
## system         x86_64, mingw32             
## status                                     
## major          4                           
## minor          1.2                         
## year           2021                        
## month          11                          
## day            01                          
## svn rev        81115                       
## language       R                           
## version.string R version 4.1.2 (2021-11-01)
## nickname       Bird Hippie
rstudioapi::versionInfo()$version
## [1] '2022.12.0.353'

Libraries

### Uploading R packages:
library(here)           # file referencing
library(dplyr)          # for data manipulation
library(ggplot2)        # plot data
library(brms)           # Bayesian model fit
library(bayesplot)      # plot posterior distributions
library(loo)            # leave-one-out cross-validation
library(tidyr)          # data manipulation
library(tibble)         # related to tidyr
library(ggdist)         # plot posterior: stat_halfeye(), stat_slab()
library(wesanderson)    # set posterior palette: wes_palette()

Load data

treat, group_id, status should be coded as factors
day_of_scar_formation, censoring_status are numerical variables

### Load data
rm(list=ls())
### Upload (! Use your own path to the data table !)
file.exists("c:/Users/KindermannM/Documents/OneDrive_hotmail_gmail/4_biomedical_center/R2/1_uochb/Projects/HPMA_patent/data/Figure_4.txt")
## [1] TRUE
### Upload (! Use your own path to the data table !)
data <- read.delim("c:/Users/KindermannM/Documents/OneDrive_hotmail_gmail/4_biomedical_center/R2/1_uochb/Projects/HPMA_patent/data/Figure_4.txt")
attach(data)

treat <- as.factor(treat)
group_ID <- as.factor(group_ID)
status <- as.factor(status)
day_of_scar_formation <- as.numeric(day_of_scar_formation)
censoring_status <- as.numeric(censoring_status)

data
##    group_ID glyc_pre glyc_post diff day_of_scar_formation   status treat
## 1         1      8.4       8.7  0.3                    13  healthy     C
## 2         1      7.7      19.3 11.6                    14 diabetic     B
## 3         1      5.2      19.1 13.9                    19 diabetic     A
## 4         1      6.7      20.3 13.6                    19 diabetic     D
## 5         1      6.8      16.4  9.6                    20 diabetic     E
## 6         2      6.4       5.6 -0.8                    13  healthy     C
## 7         2      7.8      14.5  6.7                    14 diabetic     B
## 8         2      8.4      10.7  2.3                    20 diabetic     A
## 9         2      6.5      10.7  4.2                    19 diabetic     D
## 10        2      5.9       9.4  3.5                    20 diabetic     E
## 11        3      7.8       8.3  0.5                    12  healthy     C
## 12        3      6.8      23.2 16.4                    13 diabetic     B
## 13        3      7.2      22.8 15.6                    19 diabetic     A
## 14        3      7.6      22.4 14.8                    19 diabetic     D
## 15        3      6.2      21.6 15.4                    19 diabetic     E
##    censoring_status       treat_name
## 1                 1   helthy_control
## 2                 1   NF_CopFNDsiRNA
## 3                 1 diabetic_control
## 4                 1               NF
## 5                 1 NF_CopFNDctrlRNA
## 6                 1   helthy_control
## 7                 1   NF_CopFNDsiRNA
## 8                 1 diabetic_control
## 9                 1               NF
## 10                1 NF_CopFNDctrlRNA
## 11                1   helthy_control
## 12                1   NF_CopFNDsiRNA
## 13                1 diabetic_control
## 14                1               NF
## 15                1 NF_CopFNDctrlRNA

Bayesian model 1 (BM1)

The weakly informative priors for the coefficients 𝛽 are centered around 0, reflecting the conservative assumption that the difference between the reference treatment and the others can be positive, negative or equal to zero. Standard deviation (sd) of this prior was estimated as sd = 2 × standard deviation(day_of_scar_formation); prior(normal(0, 6)).
See section Statistical analysis in manuscript for the formal model notation.

### Bayesian model
set.seed = 17
bform_1 <- bf(day_of_scar_formation ~ 0 + treat)
prior_1 <- c(prior(normal(0, 6), class = b))

fit_1 <- brm(bform_1,
            data = data,
            family = brmsfamily("weibull"),
            prior = prior_1,
            warmup = 2000,
            iter = 5000,
            chains = 4,
            cores = 4,
            control = list(adapt_delta = 0.95, max_treedepth = 15))
            fit_1 <- add_criterion(fit_1, "loo")
###  Summary of a fitted model 
summary(fit_1)
##  Family: weibull 
##   Links: mu = log; shape = identity 
## Formula: day_of_scar_formation ~ 0 + treat 
##    Data: data (Number of observations: 15) 
##   Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
##          total post-warmup draws = 12000
## 
## Population-Level Effects: 
##        Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## treatA     2.96      0.02     2.92     3.00 1.00     8144     7341
## treatB     2.62      0.02     2.58     2.66 1.00     8572     6874
## treatC     2.54      0.02     2.51     2.58 1.00     7694     6916
## treatD     2.93      0.02     2.90     2.97 1.00     8513     6071
## treatE     2.98      0.02     2.94     3.01 1.00     8837     6922
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## shape    38.41     10.33    20.54    61.05 1.00     4005     5993
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
### Display conditional effects
conditional_effects(fit_1) 

### Distribution of outcome vs. newly simulated data distribution
### (dark line should be close to light lines)
pp_check(fit_1, ndraws = 50)

### Bayesian version of R-squared for regression model
bayes_R2(fit_1)
##     Estimate  Est.Error      Q2.5     Q97.5
## R2 0.9710928 0.00994422 0.9454639 0.9804937
### "Leave-one-out extreme value diagnostic"
loo(fit_1, 'moment_match = TRUE')
## 
## Computed from 12000 by 15 log-likelihood matrix
## 
##          Estimate  SE
## elpd_loo    -14.5 2.5
## p_loo         4.5 0.9
## looic        29.0 4.9
## ------
## Monte Carlo SE of elpd_loo is NA.
## 
## Pareto k diagnostic values:
##                          Count Pct.    Min. n_eff
## (-Inf, 0.5]   (good)     12    80.0%   2194      
##  (0.5, 0.7]   (ok)        2    13.3%   2282      
##    (0.7, 1]   (bad)       1     6.7%   125       
##    (1, Inf)   (very bad)  0     0.0%   <NA>      
## See help('pareto-k-diagnostic') for details.
plot(loo(fit_1, 'moment_match = TRUE'))

### Trace rank plot: these histograms should be reasonable uniform, with no
### chain spiking above or below the others.
mcmc_rank_overlay(fit_1) 

### Check whether the priors were informative: 
### Is the posterior standard deviation greater than 10% of the prior 
### standard deviation?
bayestestR::check_prior(fit_1)
##   Parameter Prior_Quality
## 1  b_treatA uninformative
## 2  b_treatB uninformative
## 3  b_treatC uninformative
## 4  b_treatD uninformative
## 5  b_treatE uninformative

Bayesian model 1.1 (BM1.1)

Prior sensitivity analysis: Try to check the more narrower prior.

### Bayesian model
set.seed = 17
bform_1.1 <- bf(day_of_scar_formation ~ 0 + treat)
prior_1.1 <- c(prior(normal(0, 1), class = b))

fit_1.1 <- brm(bform_1.1,
            data = data,
            family = brmsfamily("weibull"),
            prior = prior_1.1,
            warmup = 2000,
            iter = 5000,
            chains = 4,
            cores = 4,
            control = list(adapt_delta = 0.95, max_treedepth = 15))
            fit_1.1 <- add_criterion(fit_1.1, "loo")
###  Summary of a fitted model 
summary(fit_1.1)
##  Family: weibull 
##   Links: mu = log; shape = identity 
## Formula: day_of_scar_formation ~ 0 + treat 
##    Data: data (Number of observations: 15) 
##   Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
##          total post-warmup draws = 12000
## 
## Population-Level Effects: 
##        Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## treatA     2.96      0.02     2.92     3.00 1.00     7993     6348
## treatB     2.62      0.02     2.58     2.66 1.00     7514     6184
## treatC     2.54      0.02     2.50     2.58 1.00     7509     5994
## treatD     2.93      0.02     2.90     2.97 1.00     8921     6266
## treatE     2.97      0.02     2.94     3.02 1.00     8367     6220
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## shape    37.30     10.27    19.81    59.85 1.00     3293     5596
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
### Display conditional effects
conditional_effects(fit_1.1) 

### Distribution of outcome vs. newly simulated data distribution
### (dark line should be close to light lines)
pp_check(fit_1.1, ndraws = 50)

### Bayesian version of R-squared for regression model
bayes_R2(fit_1.1)
##     Estimate  Est.Error      Q2.5     Q97.5
## R2 0.9706032 0.01113044 0.9422147 0.9805153
### "Leave-one-out extreme value diagnostic"
loo(fit_1.1, 'moment_match = TRUE')
## 
## Computed from 12000 by 15 log-likelihood matrix
## 
##          Estimate  SE
## elpd_loo    -14.7 2.3
## p_loo         4.5 0.8
## looic        29.3 4.6
## ------
## Monte Carlo SE of elpd_loo is NA.
## 
## Pareto k diagnostic values:
##                          Count Pct.    Min. n_eff
## (-Inf, 0.5]   (good)     10    66.7%   1378      
##  (0.5, 0.7]   (ok)        4    26.7%   1813      
##    (0.7, 1]   (bad)       1     6.7%   233       
##    (1, Inf)   (very bad)  0     0.0%   <NA>      
## See help('pareto-k-diagnostic') for details.
plot(loo(fit_1.1, 'moment_match = TRUE'))

### Trace rank plot: these histograms should be reasonable uniform, with no
### chain spiking above or below the others.
mcmc_rank_overlay(fit_1.1) 

### Check whether the priors were informative: 
### Is the posterior standard deviation greater than 10% of the prior 
### standard deviation?
bayestestR::check_prior(fit_1.1)
##   Parameter Prior_Quality
## 1  b_treatA uninformative
## 2  b_treatB uninformative
## 3  b_treatC uninformative
## 4  b_treatD uninformative
## 5  b_treatE uninformative

Bayesian model 1.2 (BM1.2)

Prior sensitivity analysis: Try to check the more wider prior.

### Bayesian model
set.seed = 17
bform_1.2 <- bf(day_of_scar_formation ~ 0 + treat)
prior_1.2 <- c(prior(normal(0, 12), class = b))

fit_1.2 <- brm(bform_1.2,
            data = data,
            family = brmsfamily("weibull"),
            prior = prior_1.2,
            warmup = 2000,
            iter = 5000,
            chains = 4,
            cores = 4,
            control = list(adapt_delta = 0.95, max_treedepth = 15))
            fit_1.2 <- add_criterion(fit_1.2, "loo")
###  Summary of a fitted model 
summary(fit_1.2)
##  Family: weibull 
##   Links: mu = log; shape = identity 
## Formula: day_of_scar_formation ~ 0 + treat 
##    Data: data (Number of observations: 15) 
##   Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
##          total post-warmup draws = 12000
## 
## Population-Level Effects: 
##        Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## treatA     2.96      0.02     2.92     3.00 1.00     8461     6713
## treatB     2.62      0.02     2.58     2.66 1.00     8261     6486
## treatC     2.54      0.02     2.51     2.58 1.00     7253     6329
## treatD     2.93      0.02     2.90     2.97 1.00     8537     6291
## treatE     2.98      0.02     2.94     3.02 1.00     8206     6330
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## shape    37.81     10.30    20.37    60.49 1.00     4606     5890
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
### Display conditional effects
conditional_effects(fit_1.2) 

### Distribution of outcome vs. newly simulated data distribution
### (dark line should be close to light lines)
pp_check(fit_1.2, ndraws = 50)

### Bayesian version of R-squared for regression model
bayes_R2(fit_1.2)
##     Estimate  Est.Error      Q2.5     Q97.5
## R2 0.9707025 0.01056215 0.9430931 0.9804991
### "Leave-one-out extreme value diagnostic"
loo(fit_1.2, 'moment_match = TRUE')
## 
## Computed from 12000 by 15 log-likelihood matrix
## 
##          Estimate  SE
## elpd_loo    -14.7 2.4
## p_loo         4.6 0.9
## looic        29.5 4.9
## ------
## Monte Carlo SE of elpd_loo is NA.
## 
## Pareto k diagnostic values:
##                          Count Pct.    Min. n_eff
## (-Inf, 0.5]   (good)     14    93.3%   2011      
##  (0.5, 0.7]   (ok)        0     0.0%   <NA>      
##    (0.7, 1]   (bad)       1     6.7%   61        
##    (1, Inf)   (very bad)  0     0.0%   <NA>      
## See help('pareto-k-diagnostic') for details.
plot(loo(fit_1.2, 'moment_match = TRUE'))

### Trace rank plot: these histograms should be reasonable uniform, with no
### chain spiking above or below the others.
mcmc_rank_overlay(fit_1.2) 

### Check whether the priors were informative: 
### Is the posterior standard deviation greater than 10% of the prior 
### standard deviation?
bayestestR::check_prior(fit_1.2)
##   Parameter Prior_Quality
## 1  b_treatA uninformative
## 2  b_treatB uninformative
## 3  b_treatC uninformative
## 4  b_treatD uninformative
## 5  b_treatE uninformative

Get survival and hazard function - BM1

### See chapter 11.5 in Statistical rethinking with brms, ggplot2, and the 
### tidyverse: Second edition version 0.4.0 
### A Solomon Kurz; 2023-01-26
### Bonus: Survival analysis

### Back-transformation of estimated β parameters into the λ metric 
### See equation (4) in manuscript
est_shape <- 37.85
est_scaleA <- exp(fixef(fit_1)[1, 1]) / gamma(1 + 1/est_shape)
est_scaleB <- exp(fixef(fit_1)[2, 1]) / gamma(1 + 1/est_shape)
est_scaleC <- exp(fixef(fit_1)[3, 1]) / gamma(1 + 1/est_shape)
est_scaleD <- exp(fixef(fit_1)[4, 1]) / gamma(1 + 1/est_shape)
est_scaleE <- exp(fixef(fit_1)[5, 1]) / gamma(1 + 1/est_shape)

ll_scaleA <- exp(fixef(fit_1)[1, 3]) / gamma(1 + 1/est_shape)
ll_scaleB <- exp(fixef(fit_1)[2, 3]) / gamma(1 + 1/est_shape)
ll_scaleC <- exp(fixef(fit_1)[3, 3]) / gamma(1 + 1/est_shape)
ll_scaleD <- exp(fixef(fit_1)[4, 3]) / gamma(1 + 1/est_shape)
ll_scaleE <- exp(fixef(fit_1)[5, 3]) / gamma(1 + 1/est_shape)

hh_scaleA <- exp(fixef(fit_1)[1, 4]) / gamma(1 + 1/est_shape)
hh_scaleB <- exp(fixef(fit_1)[2, 4]) / gamma(1 + 1/est_shape)
hh_scaleC <- exp(fixef(fit_1)[3, 4]) / gamma(1 + 1/est_shape)
hh_scaleD <- exp(fixef(fit_1)[4, 4]) / gamma(1 + 1/est_shape)
hh_scaleE <- exp(fixef(fit_1)[5, 4]) / gamma(1 + 1/est_shape)

### Get survival functions
### See equation (2) in manuscript
est <-
  data.frame(days = seq(0, 30, 0.001)) %>%
  mutate(est_SA = exp(-(days/est_scaleA)^est_shape)) %>%
  mutate(est_SB = exp(-(days/est_scaleB)^est_shape)) %>%
  mutate(est_SC = exp(-(days/est_scaleC)^est_shape)) %>%
  mutate(est_SD = exp(-(days/est_scaleD)^est_shape)) %>%
  mutate(est_SE = exp(-(days/est_scaleE)^est_shape))

ll <-
  data.frame(days = seq(0, 30, 0.001)) %>%
  mutate(ll_SA = exp(-(days/ll_scaleA)^est_shape)) %>%
  mutate(ll_SB = exp(-(days/ll_scaleB)^est_shape)) %>%
  mutate(ll_SC = exp(-(days/ll_scaleC)^est_shape)) %>%
  mutate(ll_SD = exp(-(days/ll_scaleD)^est_shape)) %>%
  mutate(ll_SE = exp(-(days/ll_scaleE)^est_shape))

hh <-
  data.frame(days = seq(0, 30, 0.001)) %>%
  mutate(hh_SA = exp(-(days/hh_scaleA)^est_shape)) %>%
  mutate(hh_SB = exp(-(days/hh_scaleB)^est_shape)) %>%
  mutate(hh_SC = exp(-(days/hh_scaleC)^est_shape)) %>%
  mutate(hh_SD = exp(-(days/hh_scaleD)^est_shape)) %>%
  mutate(hh_SE = exp(-(days/hh_scaleE)^est_shape))

### Get hazard functions
### See equation (3) in manuscript
est_hazard <-
  data.frame(days = seq(0, 20, 0.001)) %>%
  mutate(est_SA = (est_shape/est_scaleA) * (days/est_scaleA)^(est_shape - 1)) %>%
  mutate(est_SA = est_SA / max(est_SA)) %>%
  mutate(est_SB = (est_shape/est_scaleB) * (days/est_scaleB)^(est_shape - 1)) %>%
  mutate(est_SB = est_SB / max(est_SB)) %>%
  mutate(est_SC = (est_shape/est_scaleC) * (days/est_scaleC)^(est_shape - 1)) %>%
  mutate(est_SC = est_SC / max(est_SC)) %>%
  mutate(est_SD = (est_shape/est_scaleD) * (days/est_scaleD)^(est_shape - 1)) %>%
  mutate(est_SD = est_SD / max(est_SD)) %>%
  mutate(est_SE = (est_shape/est_scaleE) * (days/est_scaleE)^(est_shape - 1)) %>% 
  mutate(est_SE = est_SE / max(est_SE)) 

Plot Bayesian posteriors - BM1

### Plot the full posterior distributions
post <-
  as_draws_df(fit_1)  %>% 
  pivot_longer(starts_with("b_")) %>%
  mutate(days = qweibull(p = .5, shape, scale = exp(value) / gamma( 1 + 1/shape))) %>%
  mutate(days_diff = days - days[name == "b_treatC" ])

(g1 <-
  post %>%
  ggplot(aes(x = days, y = name)) +
  stat_halfeye(aes(fill = after_stat(level)),    # area under curve colors
               point_interval = mean_qi,         # circle = mean + quantile intervals
               .width = c(0.67, .89, 1),         # areas under curve
               slab_color = "#000000",           # posterior curve color
               slab_linewidth = 0.7,
               height = 3) +
               stat_spike(at = c(mean)) +
  scale_fill_brewer(type = "seq",                # collor pallete - area under curve
                    palette = "Greys",
                    direction = 1,
                    aesthetics = "fill") +
  scale_colour_manual(values = c("#000000")) +   # posterior curve color (must be here)
  labs(x="Time until 50% of mice experienced wound closure") +
  labs(y="") +
  scale_x_continuous(limits = c(11, 22),
                     breaks = (c(10, 12, 14, 16, 18, 20, 22))) +
  theme(axis.ticks.y = element_blank()) +
  theme(legend.position="none", plot.title = element_text(size=10),
  aspect.ratio = 0.8,
  plot.margin = margin(1, 1, 1, 1, "cm"),
  axis.text=element_text(size=10),
  axis.title=element_text(size=10),
  plot.background = element_rect(fill='transparent', color=NA),
  panel.border = element_rect(colour = "black", fill=NA, size=0.5),
  panel.grid.major = element_line(color = "grey", size = 0.5),
  panel.background = element_rect(fill = "white", colour = "white"),
  legend.background = element_rect(fill='transparent'))
)