Superspreader model analysis¶

CAS 520, Spring B 2025¶

Load required packages¶

  • tidyverse provides multiple packages for data reading, analyzing, and plotting
  • ggthemes provides the Tufte theme for the plots
  • repr makes it easier to change the appearance of assorted objects; specifically, changing the size of the data plots
In [1]:
library(tidyverse)
library(ggthemes)
library(repr)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.2     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

Custom color palette for the multi-line graphs¶

In [8]:
palette_ordered <- c('darkred', 'deepskyblue2', 'cadetblue', 'red3', 
                    'steelblue3', 'lightpink2', 'darkslategrey', 'paleturquoise3', 
                    'royalblue3', 'darkorange', 'palegreen3', 'navajowhite3')

Load the BehaviorSpace results from M6_yes_rewiring_table.csv file¶

Variables varied as follows:

["rewire-networks" true false]
["heterogenous" true false]
["travel-radius" 2 10 20]
["rewire-links-rate" 10 20 50 80]

Runs measured by these reporters:

average-path-length
mean-clustering-coefficient
mean-clustering-coefficient-not-dead
percent-susceptible
percent-infected
percent-resistant
percent-dead

Stop each run after this condition is met:

all? turtles [ not infected? ]

Run each configuration 4000 times

In [152]:
results_raw <- read_csv('./M6_yes_rewiring_table.csv', skip = 6, show_col_types = FALSE)

Select variables by location, reorder for logical flow of information, rename for simplicity¶

In [40]:
results_named <- results_raw %>% select(B_run = 1, rewire_networks = 3, heterogenous = 7, travel_radius = 9, rewire_links_rate = 16,
                                        mean_path_length = 18, mean_clustering_coefficient = 19, percent_susceptible = 21, percent_resistant = 23, 
                                        percent_dead = 24)
head(results_named)
A tibble: 6 × 10
B_runrewire_networksheterogenoustravel_radiusrewire_links_ratemean_path_lengthmean_clustering_coefficientpercent_susceptiblepercent_resistantpercent_dead
<dbl><lgl><lgl><dbl><dbl><chr><dbl><dbl><dbl><dbl>
4TRUETRUE210false 0.55389082.66666784.6666712.66667
7TRUETRUE210false 0.57167122.33333388.66667 9.00000
1TRUETRUE2109.3019620958751390.57151613.00000087.0000010.00000
10TRUETRUE2108.8105016722408020.56074921.66666786.6666711.66667
5TRUETRUE2109.0879598662207360.55332543.33333384.3333312.33333
11TRUETRUE2109.2621404682274250.56769708.00000082.0000010.00000

Group runs by parameter values and resequence the rows 1 through 4000 for each parameter group¶

In [42]:
results_runs <- results_named %>% group_by(rewire_networks, heterogenous, travel_radius, rewire_links_rate) %>% mutate(run = row_number())
head(results_runs)
tail(results_runs)
A grouped_df: 6 × 11
B_runrewire_networksheterogenoustravel_radiusrewire_links_ratemean_path_lengthmean_clustering_coefficientpercent_susceptiblepercent_resistantpercent_deadrun
<dbl><lgl><lgl><dbl><dbl><chr><dbl><dbl><dbl><dbl><int>
4TRUETRUE210false 0.55389082.66666784.6666712.666671
7TRUETRUE210false 0.57167122.33333388.66667 9.000002
1TRUETRUE2109.3019620958751390.57151613.00000087.0000010.000003
10TRUETRUE2108.8105016722408020.56074921.66666786.6666711.666674
5TRUETRUE2109.0879598662207360.55332543.33333384.3333312.333335
11TRUETRUE2109.2621404682274250.56769708.00000082.0000010.000006
A grouped_df: 6 × 11
B_runrewire_networksheterogenoustravel_radiusrewire_links_ratemean_path_lengthmean_clustering_coefficientpercent_susceptiblepercent_resistantpercent_deadrun
<dbl><lgl><lgl><dbl><dbl><chr><dbl><dbl><dbl><dbl><int>
191993FALSEFALSE20809.0439018952062430.54320660.333333388.0000011.6666673995
191996FALSEFALSE20809.6005574136008920.56297021.000000091.66667 7.3333333996
191994FALSEFALSE20808.5780602006688960.54558320.666666786.3333313.0000003997
191998FALSEFALSE20808.8693645484949840.55724361.000000089.66667 9.3333333998
191997FALSEFALSE20809.7235228539576360.56304501.666666787.3333311.0000003999
192000FALSEFALSE20809.1267112597547390.54692900.333333388.3333311.3333334000

Calculate cumulative means for the path lengths, clustering coefficients, and percents susceptible/resistant/dead after each run concludes.¶

Select the relevant variables.

Note the filter(mean_path_length != 'false'): in NetLogo's nw extension, nw:mean_path_length returns false if any nodes have zero links. I don't know what the edge cases are that lead to that.

In [100]:
results <- results_runs %>% filter(mean_path_length != 'false') %>% 
    mutate(cum_mean_path_length = cummean(mean_path_length), cum_mean_clustering_coefficient = cummean(mean_clustering_coefficient),
           cum_percent_susceptible = cummean(percent_susceptible), cum_percent_resistant = cummean(percent_resistant), 
           cum_percent_dead = cummean(percent_dead)) %>% 
    select(run, heterogenous, rewire_networks,  rewire_links_rate, travel_radius, cum_mean_path_length, cum_mean_clustering_coefficient,
          cum_percent_susceptible, cum_percent_resistant, cum_percent_dead, percent_dead) %>% ungroup
head(results)
tail(results)
A tibble: 6 × 11
runheterogenousrewire_networksrewire_links_ratetravel_radiuscum_mean_path_lengthcum_mean_clustering_coefficientcum_percent_susceptiblecum_percent_resistantcum_percent_deadpercent_dead
<int><lgl><lgl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl>
3TRUETRUE1029.3019620.57151613.00000087.0000010.0000010.000000
4TRUETRUE1029.0562320.56613262.33333386.8333310.8333311.666667
5TRUETRUE1029.0668080.56186362.66666786.0000011.3333312.333333
6TRUETRUE1029.1156410.56332194.00000085.0000011.0000010.000000
7TRUETRUE1029.0377170.56591984.00000085.2000010.8000010.000000
8TRUETRUE1029.1049650.56620624.44444484.9444410.61111 9.666667
A tibble: 6 × 11
runheterogenousrewire_networksrewire_links_ratetravel_radiuscum_mean_path_lengthcum_mean_clustering_coefficientcum_percent_susceptiblecum_percent_resistantcum_percent_deadpercent_dead
<int><lgl><lgl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl>
3995FALSEFALSE80209.0649870.56250821.54837288.646929.80470911.666667
3996FALSEFALSE80209.0651390.56250831.54821688.647789.804005 7.333333
3997FALSEFALSE80209.0650010.56250351.54796588.647129.80491513.000000
3998FALSEFALSE80209.0649450.56250201.54780988.647419.804781 9.333333
3999FALSEFALSE80209.0651320.56250211.54784388.647049.80512111.000000
4000FALSEFALSE80209.0651500.56249771.54749788.646959.80555611.333333

These plots show the effect that travel radius and the link-rewiring rate have on the percent of the population that dies from the virus in a superspreader context.

It looks like a dramatically small travel radius has a significant effect on the percent dead, but then when the travel radius doubles from 10 to 20, the percent_dead decreases. I cannot explain this.

In [119]:
results %>% filter(run == 4000, rewire_networks == 'TRUE', heterogenous == 'TRUE') %>% 
    ggplot(aes(x = rewire_links_rate, y = cum_percent_dead, color = as.factor(travel_radius))) + 
    theme_tufte(base_size = 18) +
    geom_line(linewidth = 1, linetype = 'dashed') + geom_point(size = 4, shape = 16) + 
    scale_color_manual(values = palette_ordered) +
    scale_x_continuous(breaks = seq(0, 100, 20)) +
    labs(x = 'rewire links rate', y = 'percent dead', color = 'travel radius')
No description has been provided for this image

This is the same plot but without superspreaders involved.

In [130]:
results %>% filter(run == 4000, rewire_networks == 'TRUE', heterogenous == 'FALSE') %>% 
    ggplot(aes(x = rewire_links_rate, y = cum_percent_dead, color = as.factor(travel_radius))) + 
    theme_tufte(base_size = 18) +
    geom_line(linewidth = 1, linetype = 'dashed') + geom_point(size = 4, shape = 16) + 
    scale_color_manual(values = palette_ordered) +
    scale_x_continuous(breaks = seq(0, 100, 20)) +
    labs(x = 'rewire links rate', y = 'percent dead', color = 'travel radius')
No description has been provided for this image

The clustering coefficient of the system decreases as the travel radius increases.

When the nodes are free to make connections at greater distances, then they aren't increasing the degrees of nodes within their immediate vicinity. It behaves identically with and without superspreaders.

In [132]:
results %>% filter(run == 4000, rewire_networks == 'TRUE', heterogenous == 'FALSE') %>% 
    ggplot(aes(x = rewire_links_rate, y = cum_mean_clustering_coefficient, color = as.factor(travel_radius))) + 
    theme_tufte(base_size = 18) +
    geom_line(linewidth = 1, linetype = 'dashed') + geom_point(size = 4, shape = 16) + 
    scale_color_manual(values = palette_ordered) +
    scale_x_continuous(breaks = seq(0, 100, 20)) +
    labs(x = 'rewire links rate', y = 'mean clustering coefficient (system)', color = 'travel radius')
No description has been provided for this image

This plot shows the relationship between travel radius, rewire links rate, and the mean path length.

Similar to the clustering coefficient, the mean path length decreases as travel radius and rewire links rate increase. When nodes are free to make connections farther away from home, they are reducing path distances across the system.

In [140]:
results %>% filter(run == 4000, rewire_networks == 'TRUE', heterogenous == 'TRUE') %>% 
    ggplot(aes(x = rewire_links_rate, y = cum_mean_path_length, color = as.factor(travel_radius))) + 
    theme_tufte(base_size = 18) +
    geom_line(linewidth = 1, linetype = 'dashed') + geom_point(size = 4, shape = 16) + 
    scale_color_manual(values = palette_ordered) +
    scale_x_continuous(breaks = seq(0, 100, 20)) +
    labs(x = 'rewire links rate', y = 'mean path length', color = 'travel radius')
No description has been provided for this image
In [ ]: