---
title: "Maraca Plots - Validation"
author: "Monika Huhn"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Maraca Plots - Validation}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---

This vignette is largely based on the PharmaSUG 2023 conference paper.

```{r setup, include = FALSE}
knitr::opts_chunk$set(echo = TRUE, collapse = TRUE)
library(dplyr)
library(ggplot2)
library(maraca)
```

When a maraca plot is used in a regulatory setting, for example to
visualize clinical study results for a regulatory submission, strict
validation requirements apply. For example, the results might need to be
double programmed for validation purposes.

To facilitate the validation of the graphic output, the maraca package
includes the function `validate_maraca_plot()`, which allows the user to
extract important metrics from the plot itself. This makes it possible to
programmatically compare the results of a plot produced with the maraca
package against results from other programmatic approaches. The paper
mentioned above walks through a validation example where the double
programming was done in SAS.

To use the validation functionality, we first need to create a maraca
object.

```{r maraca1, eval = TRUE}
library(maraca)

data(hce_scenario_a)

maraca_dat <- maraca(
  data = hce_scenario_a,
  step_outcomes = c("Outcome I", "Outcome II", "Outcome III", "Outcome IV"),
  last_outcome = "Continuous outcome",
  fixed_followup_days = 3 * 365,
  column_names = c(outcome = "GROUP", arm = "TRTP", value = "AVAL0"),
  arm_levels = c(active = "Active", control = "Control"),
  compute_win_odds = TRUE
)
```

We then create a maraca plot and save the actual plot as an object.

```{r fig.width=9, fig.height=6}
# Save plot as its own object
maraca_plot <- plot(maraca_dat)
# The plot has its own class called "maracaPlot"
class(maraca_plot)
# Display plot
maraca_plot
```

Now we can validate the plot using the `validate_maraca_plot()` function.

```{r}
validation_list <- validate_maraca_plot(maraca_plot)
# Display which metrics are included
str(validation_list)
```

Running the `validate_maraca_plot()` function on a maraca plot object
returns a list with the following items:

1. `plot_type`: `GeomPoint`, `GeomViolin` and/or `GeomBoxplot`, depending
   on which `density_plot_type` was selected for the plot
2. `proportions`: the proportions of the HCE components
3. `tte_data`: time-to-event data if any of the step outcomes has type
   tte, otherwise `NULL`
4. `binary_step_data`: binary data if any of the step outcomes has type
   binary, otherwise `NULL`
5. `binary_last_data`: if the last endpoint was binary, contains the data
   for the minimum, maximum and middle point x values displayed in the
   ellipse, otherwise `NULL`
6. `scatter_data`: if the last endpoint was continuous and the plot was
   created with `density_plot_type = "scatter"`, contains the dataset that
   was plotted in the scatter plot, otherwise `NULL`
7. `boxstat_data`: if the last endpoint was continuous and the plot was
   created with `density_plot_type = "box"` or
   `density_plot_type = "default"`, contains the boxplot statistics,
   otherwise `NULL`
8. `violin_data`: if the last endpoint was continuous and the plot was
   created with `density_plot_type = "violin"` or
   `density_plot_type = "default"`, contains the violin distribution data,
   otherwise `NULL`
9. `wo_stats`: if the maraca object was created with
   `compute_win_odds = TRUE`, contains the win odds statistics, otherwise
   `NULL`

These can then be converted to a convenient format for validation, such as
individual data.frames.

```{r}
library(dplyr)
library(tidyr)

validation_list$proportions %>%
  as.data.frame() %>%
  rename("proportion" = ".")

head(validation_list$tte_data)

validation_list$boxstat_data %>%
  unnest_wider(outliers, names_sep = "") %>%
  pivot_longer(., cols = -group, names_to = "stat_name", values_to = "values") %>%
  filter(!is.na(values)) %>%
  as.data.frame()

head(validation_list$violin_data)

validation_list$wo_stats
```
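
Once the metrics are available as data.frames, the comparison against
double-programmed results can itself be scripted. The following is a
minimal sketch of one way this could look, not part of the maraca package:
`compare_proportions()` is a hypothetical helper, the numbers are
illustrative placeholders rather than results from `hce_scenario_a`, and we
assume that `proportions` is a named numeric vector with one entry per HCE
component. The block is shown as a plain code block and is not evaluated
when the vignette is built.

```r
# Illustrative values only: a stand-in for validation_list$proportions.
# Assumption: a named numeric vector, one entry per HCE component.
maraca_props <- c(
  "Outcome I" = 10.0,
  "Outcome II" = 20.0,
  "Continuous outcome" = 70.0
)

# Hypothetical double-programmed results, e.g. read from a CSV file
# exported by the independent (SAS) program. Values are made up and
# differ from the ones above only by small rounding differences.
independent <- data.frame(
  component = c("Outcome I", "Outcome II", "Continuous outcome"),
  proportion = c(10.0000001, 20.0, 69.9999999),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not a maraca function): match the two sources by
# component name and flag every component whose absolute difference
# exceeds the tolerance or that is missing in one of the sources.
compare_proportions <- function(plot_props, reference, tolerance = 1e-6) {
  plot_df <- data.frame(
    component = names(plot_props),
    maraca = unname(plot_props),
    stringsAsFactors = FALSE
  )
  # Full outer join so that unmatched components show up as NA
  merged <- merge(plot_df, reference, by = "component", all = TRUE)
  merged$abs_diff <- abs(merged$maraca - merged$proportion)
  merged$match <- !is.na(merged$abs_diff) & merged$abs_diff <= tolerance
  merged
}

report <- compare_proportions(maraca_props, independent)
print(report)

# Single overall flag that can be logged or used to stop a pipeline
all_match <- all(report$match)
```

Comparing within a tolerance rather than with `==` avoids false mismatches
caused by floating point or rounding differences between the two programs;
the appropriate tolerance depends on the validation plan.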