This vignette is largely based on the PharmaSUG 2023 conference paper.
When using a maraca plot in a regulatory setting, for example to visualize clinical study results for regulatory submissions, there will be strict validation demands. For example, the results might need to be double programmed for validation purposes.
In order to facilitate the validation of the graphic output, the
maraca package includes the function validate_maraca_plot()
that allows the user to extract important metrics from the plot itself.
This allows to programmatically compare the results of a plot produced
using the maraca package with other programmatic approaches. The paper
linked above walks through a validation example where the double
programming was done in SAS.
To use the validation functionality, we need to first create a maraca object.
library(maraca)
data(hce_scenario_a)
maraca_dat <- maraca(
data = hce_scenario_a,
step_outcomes = c("Outcome I", "Outcome II", "Outcome III", "Outcome IV"),
last_outcome = "Continuous outcome",
fixed_followup_days = 3 * 365,
column_names = c(outcome = "GROUP", arm = "TRTP", value = "AVAL0"),
arm_levels = c(active = "Active", control = "Control"),
compute_win_odds = TRUE
)
We then create a maraca plot and save the actual plot as an object.
# Save plot as its own object
maraca_plot <- plot(maraca_dat)
# The plot has its own class called "maracaPlot"
class(maraca_plot)
## [1] "maracaPlot" "gg" "ggplot"
# Display plot
maraca_plot
Now we can validate the plot using the
validate_maraca_plot()
function.
validation_list <- validate_maraca_plot(maraca_plot)
# Display which metrics are included
str(validation_list)
## List of 9
## $ plot_type : chr "GeomViolin+GeomBoxplot"
## $ proportions : Named num [1:5] 13.8 18.1 12.7 9.8 45.6
## ..- attr(*, "names")= chr [1:5] "Outcome I" "Outcome II" "Outcome III" "Outcome IV" ...
## $ tte_data :'data.frame': 544 obs. of 3 variables:
## ..$ x : num [1:544] 0.125 0.159 0.463 0.591 0.621 ...
## ..$ y : num [1:544] 0.2 0.4 0.6 0.2 0.4 0.6 0.8 0.8 1 1.2 ...
## ..$ group: Factor w/ 2 levels "Active","Control": 2 2 2 1 1 1 2 1 2 2 ...
## $ binary_step_data: NULL
## $ binary_last_data: NULL
## $ scatter_data : NULL
## $ boxstat_data :'data.frame': 2 obs. of 9 variables:
## ..$ group : Factor w/ 2 levels "Active","Control": 1 2
## ..$ x_lowest : num [1:2] 60.8 54.4
## ..$ whisker_lower: num [1:2] 63.8 57.8
## ..$ hinge_lower : num [1:2] 74.9 71
## ..$ median : num [1:2] 79.7 75.9
## ..$ hinge_upper : num [1:2] 83.9 80.7
## ..$ whisker_upper: num [1:2] 95.5 94.5
## ..$ x_highest : num [1:2] 95.5 100
## ..$ outliers :List of 2
## .. ..$ : num 60.8
## .. ..$ : num [1:5] 54.4 96.4 97.2 99.7 100
## $ violin_data :'data.frame': 1024 obs. of 5 variables:
## ..$ group : Factor w/ 2 levels "Active","Control": 1 1 1 1 1 1 1 1 1 1 ...
## ..$ x : num [1:1024] 60.8 60.8 60.9 61 61 ...
## ..$ y : num [1:1024] 44 44 44 44 44 44 44 44 44 44 ...
## ..$ density: num [1:1024] 0.00117 0.00121 0.00124 0.00127 0.0013 ...
## ..$ width : num [1:1024] 18.7 18.7 18.7 18.7 18.7 ...
## $ wo_stats : Named num [1:4] 1.64 1.42 1.91 9.35e-12
## ..- attr(*, "names")= chr [1:4] "winodds" "lowerCI" "upperCI" "p_value"
Running the validate_maraca_plot()
function on a maraca
plot object returns a list with the following items:
plot_type
: depending on which
density_plot_type
was selected for the plot either
GeomPoint
, GeomViolin
and/or
GeomBoxplot
proportions
: the proportions of the HCE componentstte_data
: time-to-event data if part of the step
outcomes has type tte, otherwise NULL
binary_step_data
: binary data if part of the step
outcomes has type binary, otherwise NULL
binary_step_data
: if last endpoint was binary then
contains the data for the minimum, maximum and middle point x values
displayed in the ellipsis, otherwise NULL
scatter_data
: if last endpoint was continuous and plot
was created with density_plot_type = "scatter"
then
contains dataset that was plotted in scatter plot, otherwise
NULL
boxstat_data
: if last endpoint was continuous and if
plot was created with density_plot_type = "box"
or
density_plot_type = "default"
then contains the boxplot
statistics, otherwise NULL
violin_data
: if last endpoint was continuous and if
plot was created with density_plot_type = "violin"
or
density_plot_type = "default"
then contains the violin
distribution data, otherwise NULL
wo_stats
: if maraca object was created with
compute_win_odds = TRUE
then contains the win odds
statistics, otherwise NULL
These can then be converted to a convenient format for validation, such as as individual data.frames.
library(dplyr)
library(tidyr)
validation_list$proportions %>%
as.data.frame() %>%
rename("proportion" = ".")
## proportion
## Outcome I 13.8
## Outcome II 18.1
## Outcome III 12.7
## Outcome IV 9.8
## Continuous outcome 45.6
head(validation_list$tte_data)
## x y group
## 3 0.1251495 0.2 Control
## 4 0.1593700 0.4 Control
## 5 0.4625571 0.6 Control
## 6 0.5905571 0.2 Active
## 7 0.6209704 0.4 Active
## 8 0.6478802 0.6 Active
validation_list$boxstat_data %>%
unnest_wider(outliers, names_sep = "") %>%
pivot_longer(., cols = -group, names_to = "stat_name", values_to = "values") %>%
filter(!is.na(values)) %>%
as.data.frame()
## group stat_name values
## 1 Active x_lowest 60.77490
## 2 Active whisker_lower 63.82323
## 3 Active hinge_lower 74.87904
## 4 Active median 79.68861
## 5 Active hinge_upper 83.87907
## 6 Active whisker_upper 95.53322
## 7 Active x_highest 95.53322
## 8 Active outliers1 60.77490
## 9 Control x_lowest 54.40000
## 10 Control whisker_lower 57.76833
## 11 Control hinge_lower 70.99354
## 12 Control median 75.90131
## 13 Control hinge_upper 80.73315
## 14 Control whisker_upper 94.45711
## 15 Control x_highest 100.00000
## 16 Control outliers1 54.40000
## 17 Control outliers2 96.39603
## 18 Control outliers3 97.24586
## 19 Control outliers4 99.68534
## 20 Control outliers5 100.00000
head(validation_list$violin_data)
## group x y density width
## 1 Active 60.77490 44 0.001174508 18.72
## 2 Active 60.84292 44 0.001205014 18.72
## 3 Active 60.91094 44 0.001236189 18.72
## 4 Active 60.97896 44 0.001268532 18.72
## 5 Active 61.04698 44 0.001301859 18.72
## 6 Active 61.11500 44 0.001336067 18.72
validation_list$wo_stats
## winodds lowerCI upperCI p_value
## 1.643265e+00 1.416117e+00 1.906848e+00 9.354073e-12