Superspreader model analysis¶
CAS 520, Spring B 2025¶
Load required packages¶
tidyverseprovides multiple packages for data reading, analyzing, and plottingggthemesprovides the Tufte theme for the plotsreprmakes it easier to change the appearance of assorted objects; specifically, changing the size of the data plots
library(tidyverse)
library(ggthemes)
library(repr)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ── ✔ dplyr 1.1.4 ✔ readr 2.1.5 ✔ forcats 1.0.0 ✔ stringr 1.5.1 ✔ ggplot2 3.5.2 ✔ tibble 3.2.1 ✔ lubridate 1.9.4 ✔ tidyr 1.3.1 ✔ purrr 1.0.4 ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ── ✖ dplyr::filter() masks stats::filter() ✖ dplyr::lag() masks stats::lag() ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Custom color palette for the multi-line graphs¶
palette_ordered <- c('darkred', 'deepskyblue2', 'cadetblue', 'red3',
'steelblue3', 'lightpink2', 'darkslategrey', 'paleturquoise3',
'royalblue3', 'darkorange', 'palegreen3', 'navajowhite3')
Load the BehaviorSpace results from M6_yes_rewiring_table.csv file¶
Variables varied as follows:
["rewire-networks" true false]
["heterogenous" true false]
["travel-radius" 2 10 20]
["rewire-links-rate" 10 20 50 80]
Runs measured by these reporters:
average-path-length
mean-clustering-coefficient
mean-clustering-coefficient-not-dead
percent-susceptible
percent-infected
percent-resistant
percent-dead
Stop each run after this condition is met:
all? turtles [ not infected? ]
Run each configuration 4000 times
results_raw <- read_csv('./M6_yes_rewiring_table.csv', skip = 6, show_col_types = FALSE)
Select variables by location, reorder for logical flow of information, rename for simplicity¶
results_named <- results_raw %>% select(B_run = 1, rewire_networks = 3, heterogenous = 7, travel_radius = 9, rewire_links_rate = 16,
mean_path_length = 18, mean_clustering_coefficient = 19, percent_susceptible = 21, percent_resistant = 23,
percent_dead = 24)
head(results_named)
| B_run | rewire_networks | heterogenous | travel_radius | rewire_links_rate | mean_path_length | mean_clustering_coefficient | percent_susceptible | percent_resistant | percent_dead |
|---|---|---|---|---|---|---|---|---|---|
| <dbl> | <lgl> | <lgl> | <dbl> | <dbl> | <chr> | <dbl> | <dbl> | <dbl> | <dbl> |
| 4 | TRUE | TRUE | 2 | 10 | false | 0.5538908 | 2.666667 | 84.66667 | 12.66667 |
| 7 | TRUE | TRUE | 2 | 10 | false | 0.5716712 | 2.333333 | 88.66667 | 9.00000 |
| 1 | TRUE | TRUE | 2 | 10 | 9.301962095875139 | 0.5715161 | 3.000000 | 87.00000 | 10.00000 |
| 10 | TRUE | TRUE | 2 | 10 | 8.810501672240802 | 0.5607492 | 1.666667 | 86.66667 | 11.66667 |
| 5 | TRUE | TRUE | 2 | 10 | 9.087959866220736 | 0.5533254 | 3.333333 | 84.33333 | 12.33333 |
| 11 | TRUE | TRUE | 2 | 10 | 9.262140468227425 | 0.5676970 | 8.000000 | 82.00000 | 10.00000 |
Group runs by parameter values and resequence the rows 1 through 4000 for each parameter group¶
results_runs <- results_named %>% group_by(rewire_networks, heterogenous, travel_radius, rewire_links_rate) %>% mutate(run = row_number())
head(results_runs)
tail(results_runs)
| B_run | rewire_networks | heterogenous | travel_radius | rewire_links_rate | mean_path_length | mean_clustering_coefficient | percent_susceptible | percent_resistant | percent_dead | run |
|---|---|---|---|---|---|---|---|---|---|---|
| <dbl> | <lgl> | <lgl> | <dbl> | <dbl> | <chr> | <dbl> | <dbl> | <dbl> | <dbl> | <int> |
| 4 | TRUE | TRUE | 2 | 10 | false | 0.5538908 | 2.666667 | 84.66667 | 12.66667 | 1 |
| 7 | TRUE | TRUE | 2 | 10 | false | 0.5716712 | 2.333333 | 88.66667 | 9.00000 | 2 |
| 1 | TRUE | TRUE | 2 | 10 | 9.301962095875139 | 0.5715161 | 3.000000 | 87.00000 | 10.00000 | 3 |
| 10 | TRUE | TRUE | 2 | 10 | 8.810501672240802 | 0.5607492 | 1.666667 | 86.66667 | 11.66667 | 4 |
| 5 | TRUE | TRUE | 2 | 10 | 9.087959866220736 | 0.5533254 | 3.333333 | 84.33333 | 12.33333 | 5 |
| 11 | TRUE | TRUE | 2 | 10 | 9.262140468227425 | 0.5676970 | 8.000000 | 82.00000 | 10.00000 | 6 |
| B_run | rewire_networks | heterogenous | travel_radius | rewire_links_rate | mean_path_length | mean_clustering_coefficient | percent_susceptible | percent_resistant | percent_dead | run |
|---|---|---|---|---|---|---|---|---|---|---|
| <dbl> | <lgl> | <lgl> | <dbl> | <dbl> | <chr> | <dbl> | <dbl> | <dbl> | <dbl> | <int> |
| 191993 | FALSE | FALSE | 20 | 80 | 9.043901895206243 | 0.5432066 | 0.3333333 | 88.00000 | 11.666667 | 3995 |
| 191996 | FALSE | FALSE | 20 | 80 | 9.600557413600892 | 0.5629702 | 1.0000000 | 91.66667 | 7.333333 | 3996 |
| 191994 | FALSE | FALSE | 20 | 80 | 8.578060200668896 | 0.5455832 | 0.6666667 | 86.33333 | 13.000000 | 3997 |
| 191998 | FALSE | FALSE | 20 | 80 | 8.869364548494984 | 0.5572436 | 1.0000000 | 89.66667 | 9.333333 | 3998 |
| 191997 | FALSE | FALSE | 20 | 80 | 9.723522853957636 | 0.5630450 | 1.6666667 | 87.33333 | 11.000000 | 3999 |
| 192000 | FALSE | FALSE | 20 | 80 | 9.126711259754739 | 0.5469290 | 0.3333333 | 88.33333 | 11.333333 | 4000 |
Calculate cumulative means for the path lengths, clustering coefficients, and percents susceptible/resistant/dead after each run concludes.¶
Select the relevant variables.
Note the filter(mean_path_length != 'false'): in NetLogo's nw extension, nw:mean_path_length returns false if any nodes have zero links. I don't know what the edge cases are that lead to that.
results <- results_runs %>% filter(mean_path_length != 'false') %>%
mutate(cum_mean_path_length = cummean(mean_path_length), cum_mean_clustering_coefficient = cummean(mean_clustering_coefficient),
cum_percent_susceptible = cummean(percent_susceptible), cum_percent_resistant = cummean(percent_resistant),
cum_percent_dead = cummean(percent_dead)) %>%
select(run, heterogenous, rewire_networks, rewire_links_rate, travel_radius, cum_mean_path_length, cum_mean_clustering_coefficient,
cum_percent_susceptible, cum_percent_resistant, cum_percent_dead, percent_dead) %>% ungroup
head(results)
tail(results)
| run | heterogenous | rewire_networks | rewire_links_rate | travel_radius | cum_mean_path_length | cum_mean_clustering_coefficient | cum_percent_susceptible | cum_percent_resistant | cum_percent_dead | percent_dead |
|---|---|---|---|---|---|---|---|---|---|---|
| <int> | <lgl> | <lgl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> |
| 3 | TRUE | TRUE | 10 | 2 | 9.301962 | 0.5715161 | 3.000000 | 87.00000 | 10.00000 | 10.000000 |
| 4 | TRUE | TRUE | 10 | 2 | 9.056232 | 0.5661326 | 2.333333 | 86.83333 | 10.83333 | 11.666667 |
| 5 | TRUE | TRUE | 10 | 2 | 9.066808 | 0.5618636 | 2.666667 | 86.00000 | 11.33333 | 12.333333 |
| 6 | TRUE | TRUE | 10 | 2 | 9.115641 | 0.5633219 | 4.000000 | 85.00000 | 11.00000 | 10.000000 |
| 7 | TRUE | TRUE | 10 | 2 | 9.037717 | 0.5659198 | 4.000000 | 85.20000 | 10.80000 | 10.000000 |
| 8 | TRUE | TRUE | 10 | 2 | 9.104965 | 0.5662062 | 4.444444 | 84.94444 | 10.61111 | 9.666667 |
| run | heterogenous | rewire_networks | rewire_links_rate | travel_radius | cum_mean_path_length | cum_mean_clustering_coefficient | cum_percent_susceptible | cum_percent_resistant | cum_percent_dead | percent_dead |
|---|---|---|---|---|---|---|---|---|---|---|
| <int> | <lgl> | <lgl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> |
| 3995 | FALSE | FALSE | 80 | 20 | 9.064987 | 0.5625082 | 1.548372 | 88.64692 | 9.804709 | 11.666667 |
| 3996 | FALSE | FALSE | 80 | 20 | 9.065139 | 0.5625083 | 1.548216 | 88.64778 | 9.804005 | 7.333333 |
| 3997 | FALSE | FALSE | 80 | 20 | 9.065001 | 0.5625035 | 1.547965 | 88.64712 | 9.804915 | 13.000000 |
| 3998 | FALSE | FALSE | 80 | 20 | 9.064945 | 0.5625020 | 1.547809 | 88.64741 | 9.804781 | 9.333333 |
| 3999 | FALSE | FALSE | 80 | 20 | 9.065132 | 0.5625021 | 1.547843 | 88.64704 | 9.805121 | 11.000000 |
| 4000 | FALSE | FALSE | 80 | 20 | 9.065150 | 0.5624977 | 1.547497 | 88.64695 | 9.805556 | 11.333333 |
These plots show the effect that travel radius and the link-rewiring rate have on the percent of the population that dies from the virus in a superspreader context.
It looks like a dramatically small travel radius has a significant effect on the percent dead, but then when the travel radius doubles from 10 to 20, the percent_dead decreases. I cannot explain this.
results %>% filter(run == 4000, rewire_networks == 'TRUE', heterogenous == 'TRUE') %>%
ggplot(aes(x = rewire_links_rate, y = cum_percent_dead, color = as.factor(travel_radius))) +
theme_tufte(base_size = 18) +
geom_line(linewidth = 1, linetype = 'dashed') + geom_point(size = 4, shape = 16) +
scale_color_manual(values = palette_ordered) +
scale_x_continuous(breaks = seq(0, 100, 20)) +
labs(x = 'rewire links rate', y = 'percent dead', color = 'travel radius')
This is the same plot but without superspreaders involved.
results %>% filter(run == 4000, rewire_networks == 'TRUE', heterogenous == 'FALSE') %>%
ggplot(aes(x = rewire_links_rate, y = cum_percent_dead, color = as.factor(travel_radius))) +
theme_tufte(base_size = 18) +
geom_line(linewidth = 1, linetype = 'dashed') + geom_point(size = 4, shape = 16) +
scale_color_manual(values = palette_ordered) +
scale_x_continuous(breaks = seq(0, 100, 20)) +
labs(x = 'rewire links rate', y = 'percent dead', color = 'travel radius')
The clustering coefficient of the system decreases as the travel radius increases.
When the nodes are free to make connections at greater distances, then they aren't increasing the degrees of nodes within their immediate vicinity. It behaves identically with and without superspreaders.
results %>% filter(run == 4000, rewire_networks == 'TRUE', heterogenous == 'FALSE') %>%
ggplot(aes(x = rewire_links_rate, y = cum_mean_clustering_coefficient, color = as.factor(travel_radius))) +
theme_tufte(base_size = 18) +
geom_line(linewidth = 1, linetype = 'dashed') + geom_point(size = 4, shape = 16) +
scale_color_manual(values = palette_ordered) +
scale_x_continuous(breaks = seq(0, 100, 20)) +
labs(x = 'rewire links rate', y = 'mean clustering coefficient (system)', color = 'travel radius')
This plot shows the relationship between travel radius, rewire links rate, and the mean path length.
Similar to the clustering coefficient, the mean path length decreases as travel radius and rewire links rate increase. When nodes are free to make connections farther away from home, they are reducing path distances across the system.
results %>% filter(run == 4000, rewire_networks == 'TRUE', heterogenous == 'TRUE') %>%
ggplot(aes(x = rewire_links_rate, y = cum_mean_path_length, color = as.factor(travel_radius))) +
theme_tufte(base_size = 18) +
geom_line(linewidth = 1, linetype = 'dashed') + geom_point(size = 4, shape = 16) +
scale_color_manual(values = palette_ordered) +
scale_x_continuous(breaks = seq(0, 100, 20)) +
labs(x = 'rewire links rate', y = 'mean path length', color = 'travel radius')