> In particular, why choose to study robustness against doubling time (which seems intuitively like it wouldn't affect the shape of the tree much)
As I understand it, the doubling times observed in the simulations were primarily the result of the ascertainment and transmission rate parameters.
Care to elaborate why you think the robustness of the model with respect to transmission rate should be assumed? I don't share your intuition here, and note that the authors observe, "that sensitivity analyses with longer doubling times increase the support for multiple introductions."
You really fault them for robustness analysis here?
To be clear I don't fault them for studying robustness against doubling time; I fault them for not studying robustness against connectivity of the infection network, since that seems like it would be more important than any of the parameters that they did study. My intuition is that when spread is highly deterministic (e.g. if R0 = 2 and each patient infects exactly two others), it's easy to make inferences about past spread from the present. For example, in that case it really would be near-impossible for a later lineage to outcompete an earlier one.
But we know the spread of SARS-CoV-2 is actually stochastic, with most lineages dying out but a few exploding due to super-spreader events. In that case it's much harder to judge whether a clade is big because it had more generations to grow, or just big because of a few (un)lucky founder effects. In Pekar's epi simulation, that stochasticity is modeled by their connectivity network. I expect that a more overdispersed network (i.e. greater variance in the number of edges at each vertex, keeping the same average) would make non-modal outcomes--like the real pandemic's phylogeny, if it arose from a single introduction--more likely.
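To make that concrete, here's a toy branching-process sketch (entirely my own construction, not the authors' actual simulation pipeline; the R0, dispersion values, generation count, and replicate count are illustrative assumptions). Offspring counts are drawn from a negative binomial with a fixed mean but varying dispersion k, so you can watch the spread of early clade sizes blow up as k shrinks:

```python
# Toy negative-binomial branching process: same mean R0 in every case, but
# dispersion k varies (small k = superspreading-dominated transmission).
# All numbers are illustrative assumptions, not values from the paper.
import numpy as np

rng = np.random.default_rng(0)

def clade_size(r0, k, generations=8, cap=100_000):
    """Total infections after a fixed number of generations, from one index case."""
    active, total = 1, 1
    for _ in range(generations):
        if active == 0 or total > cap:
            break
        # Gamma-Poisson mixture == negative binomial with mean r0, variance r0*(1 + r0/k).
        rates = rng.gamma(shape=k, scale=r0 / k, size=active)
        offspring = int(rng.poisson(rates).sum())
        active = offspring
        total += offspring
    return total

r0 = 2.0
for k in (10.0, 1.0, 0.1):  # near-Poisson -> heavily overdispersed
    sizes = np.array([clade_size(r0, k) for _ in range(2_000)])
    no_spread = (sizes == 1).mean()  # index case infected no one
    print(f"k={k:>4}: index case infects nobody {no_spread:.0%}, "
          f"median clade size {np.median(sizes):.0f}, "
          f"95th percentile {np.percentile(sizes, 95):.0f}")
```

The mean is identical in all three runs; only the dispersion changes, and the growing gap between the median outcome and the tail is exactly what makes "this clade is bigger, so it must be older" a shaky inference.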
The results of their simulations are stochastic. They discuss this in depth, as it complicates their analysis.
I don't understand what you're trying to say. Everyone agrees that the spread is stochastic. Why are you starting with a hypothetical misinterpretation of an R value to make a deterministic strawman? You think that their simulations were too deterministic because of their connectivity network?
> --like the real pandemic's phylogeny, if it arose from a single introduction--
> You think that their simulations were too deterministic because of their connectivity network?
Yeah, pretty much; and it's what other critics, including well-credentialed mathematical biologists, are saying too. There's a continuum of dispersion, with my perfectly deterministic strawman at one extreme and extending out to infinity at the other. Their power-law network adds some dispersion, but how do we know it's enough? I believe they chose that distribution because it's been shown to fit some real data (including the spread of HIV) reasonably well; but how do we know it fits the early spread of SARS-CoV-2, in the earliest lineages of a virus with unknown biology, spreading among an unknown group of people with unknown behaviors?
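To put a rough number on that continuum, here's another sketch (again my own toy; the exponents, mean degree, and truncation point are arbitrary assumptions, not the paper's settings). Discrete power-law degree samples at a few exponents, rescaled to the same mean number of contacts, differ enormously in how concentrated those contacts are:

```python
# Same mean degree in every case; only the power-law exponent changes.
# All values are illustrative assumptions, not taken from the paper.
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
target_mean = 16.0  # arbitrary; only the relative dispersion matters here

for a in (3.0, 2.5, 2.2):
    deg = rng.zipf(a=a, size=n).astype(float)
    deg = np.minimum(deg, 10_000)      # truncate the extreme tail for stability
    deg *= target_mean / deg.mean()    # rescale so every case has the same mean degree
    top1 = np.sort(deg)[-n // 100:].sum() / deg.sum()
    print(f"exponent {a}: variance/mean {deg.var() / deg.mean():7.1f}, "
          f"contacts held by top 1% of nodes {top1:.0%}")
```

Same average connectivity in every row, yet the share of contacts concentrated in the most-connected 1% of people swings wildly with the exponent--which is the parameter I'd most want to see a sensitivity analysis over.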
I don't know how to root the phylogeny, and I'm mistrustful of anyone who claims they can based on the limited information available. Anyone who's built and tried to validate mathematical models knows that sometimes there's simply not enough information to confidently reach any useful conclusion. Absent validation of the approaches used here (e.g. evidence that they've successfully made predictions in similar situations in the past), I believe that's the situation we're in.