Review of Examine.com’s vitamin write-ups

There are a lot of vitamins and other supplements in the world, way more than I have time to investigate. Examine.com has a pretty good reputation for its reports on vitamins and supplements. It would be extremely convenient for me if this reputation were merited. So I asked Martin Bernstorff to spot-check some of their reports.

We originally wanted a fairly thorough review of multiple Examine write-ups. Alas, Martin felt the press of grad school after two shallow reviews and had to step back. This is still enough to be useful, so we wanted to share it, but please keep its limitations in mind. And if you feel motivated to contribute checks of more articles, please reach out to me (elizabeth@acesounderglass.com).

My (Elizabeth’s) tentative conclusion is that it would take tens of hours to beat an Examine general write-up, but their write-ups are complete in neither their list of topics nor their investigation of individual topics. If a particular effect is important to you, you will still need to do your own research.

Photo credit DALL-E

Write-Ups

Vitamin B12

Claim: “The actual rate of deficiency [of B12] is quite variable and it isn’t fully known what it is, but elderly persons (above 65), vegetarians, or those with digestion or intestinal complications are almost always at a higher risk than otherwise healthy and omnivorous youth”

Verdict: True but not well cited. Their citation merely asserts that these groups have shortages rather than providing measurements, but Martin found a meta-analysis making the same claim for vegetarians (the only group he looked for).

Toxicology

Verdict: Very brief. Couldn’t find much on my own. Seems reasonable.

Claim: “Vitamin B12 can be measured in the blood by serum B12 concentrations, which is reproducible and reliable but may not accurately reflect bodily vitamin B12 stores (as low B12 concentrations in plasma or vitamin B12 deficiencies do not always coexist in a reliable manner[19][26][27]) with a predictive value being reported to be as low as 22%”

Verdict: True; the positive predictive value was 22%, but with a negative predictive value of 100% at the chosen threshold. Those are the numbers at only one threshold, though. To know whether this is good or bad, we’d need numbers at different thresholds (or, preferably, a ROC-AUC).
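For readers who want the mechanics: predictive values fall straight out of a confusion matrix. A minimal sketch, with counts invented for illustration (chosen to reproduce the 22%/100% figures, not taken from the cited study):

```python
# Hypothetical screening results -- illustrative counts only, NOT from the study.
true_pos = 22    # truly deficient, flagged by low serum B12
false_pos = 78   # not deficient, but flagged anyway
false_neg = 0    # truly deficient, missed by the test
true_neg = 100   # not deficient, correctly not flagged

# PPV: of everyone the test flags, what fraction is truly deficient?
ppv = true_pos / (true_pos + false_pos)
# NPV: of everyone the test clears, what fraction is truly fine?
npv = true_neg / (true_neg + false_neg)

print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")  # PPV = 22%, NPV = 100%
```

Moving the threshold trades one number against the other, which is why a single-threshold pair tells you little on its own.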

Claim: B12 supplements can improve depression

Examine reviews a handful of observational studies showing a correlation, but includes no RCTs. This is in spite of there actually being RCTs, like Koning et al. 2016, and a full meta-analysis, neither of which find an effect.

The lack of effect in RCTs is less damning than it sounds. I (Elizabeth) haven’t checked all of the studies, but the Koning study didn’t confine itself to subjects with low B12 and only tested serum B12 at baseline, not after treatment. So they have ruled out neither “low B12 can cause depression, but so do a lot of other things” nor “B12 can work but they used the wrong form”.

I still find it concerning that Examine didn’t even mention the RCTs, and I don’t have any reason to believe their correlational studies are any better. 

Interactions with pregnancy

Examine cites only one study, on acute lymphoblastic leukemia, which seems a weird choice. Large meta-analyses exist for pre-term birth and low birth weight (Rogne et al. 2016), which are likely much more important.

Overall

They don’t seem to be saying much that’s wrong, but the write-up is not nearly as comprehensive as we had hoped. To give Examine its best shot, we decided the next vitamin should be their best write-up. We tried asking Examine which article they were especially confident in. Unfortunately, whoever handles their public email address didn’t get the point after 3 emails, so Martin made his best guess.

Vitamin D

Upper respiratory tract infections

They summarize several studies but miss a very large RCT published in JAMA, the VIDARIS trial. All studies (including the VIDARIS trial) show no effect, so they might’ve considered the matter settled and stopped looking for more trials, which seems reasonable.

Claim: Vitamin D helps premenstrual syndrome

”Most studies have found a decrease in general symptoms when given to women with vitamin D deficiency, some finding notable reductions and some finding small reductions. It’s currently not known why studies differ, and more research is needed”

This summary seemed optimistic once Martin looked into the studies:

  • Abdollahi 2019:
    • No statistically significant differences between groups.
    • The authors highlight statistically significant decreases for a handful of symptoms in the Vitamin D group, but the decrease is similar in magnitude to placebo. Vitamin D and placebo both have 5 outcomes which were statistically significant.
  • Dadkhah 2016:
    • No statistically significant differences between treatment groups
  • Bahrami 2018:
    • No control group
  • Heidari 2019:
    • Marked differences between groups, but absolutely terrible reporting by the authors – they don’t even mention this difference in the abstract. This makes me (Martin) somewhat worried about the results – if they knew what they were doing, they’d focus the abstract on the difference in differences.
  • Tartagni 2015:
    • Appears to show notable differences between groups, but terrible reporting: it tests change relative to baseline (?!), rather than differences in trends or differences in differences.

In conclusion, only the poorest research finds effects – not a great indicator of a promising intervention. But Examine didn’t miss any obvious studies.

Claim: “There is some evidence that vitamin D may improve inflammation and clinical symptoms in COVID-19 patients, but this may not hold true with all dosing regimens. So far, a few studies have shown that high dosages for 8–14 days may work, but a single high dose isn’t likely to have the same benefit.”

The evidence Martin found seems to support their conclusions. They’re missing one relatively large, recent study (De Niet 2022). More importantly, all included studies are of hospital patients given vitamin D after admission, which makes them useless for determining whether vitamin D is a good preventative, especially because some forms of vitamin D take days to be converted into a useful form in the body.

  • Murai 2021:
    • The regimen was a single, high dose at admission.
    • No statistically significant differences between groups, all the effect sizes are tiny or non-existent.
  • Sabico 2021:
    • Compares Vitamin D 5000 IU/daily to 1000 IU/daily in hospitalized patients.
    • In the Vitamin D group, they show faster
      • Time to recovery (6.2 ± 0.8 versus 9.1 ± 0.8; p = 0.039)
      • Time to restoration of taste (11.4 ± 1.0 versus 16.9 ± 1.7; p = 0.035)
        • The Kaplan-Meier Plot looks weird here, though. What happens on day 14?!
    • All symptom durations, except sore throat, were lower in the 5000 IU group.

All analyses were adjusted for age, BMI and type of vitamin D – which is a good thing, because it appears the 5000 IU group was healthier at baseline.

  • Castillo 2020:
    • Huge effect – half of the control group had to go to the ICU, whereas only one person in the intervention group did so (OR 0.02).
    • Nothing apparently wrong, but I’m still highly suspicious of the study:
      • An apparently well-done randomized pilot trial, early on, published in “The Journal of Steroid Biochemistry and Molecular Biology”. Very worrying that it isn’t published somewhere more prestigious.
      • They gave hydroxychloroquine as the “best available treatment”, even though there was no evidence of effect at the time of the study.
      • They call the study “double masked” – I hope this means double-blinded, because otherwise the study is close to worthless since their primary outcomes are based on doctor’s behavior.
      • The follow-up study is still recruiting.

Conclusion

I don’t know of a better comprehensive resource than Examine.com. It is, alas, still not comprehensive enough for important use cases, but it is still a useful shortcut for smaller problems.

Thanks to the FTX Regrant program for funding this post, and Martin for doing most of the work.

Cognitive Risks of Adolescent Binge Drinking

The takeaway

Our goal was to quantify the cognitive risks of heavy but not abusive alcohol consumption. This is an inherently difficult task: the world is noisy, humans are highly variable, and institutional review boards won’t let us run challenge trials of known poisons. This makes strong inference or quantification of small risks incredibly difficult. We know for a fact that enough alcohol can damage you, and even levels that aren’t inherently dangerous can cause dumb decisions with long-term consequences. All that said… when we tried to quantify the level of cognitive damage caused by college-level binge drinking, we couldn’t demonstrate an effect. This doesn’t mean there isn’t one (if nothing else, “here, hold my beer” moments are real), just that it is below the threshold detectable with current methods and levels of variation in the population.

Motivation

In discussions with recent college graduates I (Elizabeth) casually mentioned that alcohol is obviously damaging to cognition. They were shocked and dismayed to find their friends were poisoning themselves, and wanted the costs quantified so they could reason with them (I hang around a very specific set of college students). Martin Bernstorff and I set out to research this together. Ultimately, 90-95% of the research was done by him, with me mostly contributing strategic guidance and somewhere between editing and co-writing this post. 

I spent an hour getting DALL-E to draw this

Problems with research on drinking during adolescence

Literature on the causal medium- to long-term effects of non-alcoholism-level drinking on cognition is, to our strong surprise, extremely lacking. This isn’t just our poor research skills; in 2019, the Danish Ministry of Health attempted a comprehensive review and concluded that:

“We actually know relatively little about which specific biological consequences a high level of alcohol intake during adolescence will have on youth”.

And it isn’t because scientists are ignoring the problem, either. Studying medium- and long-term effects on brain development is difficult because of the myriad confounders and/or colliders for both cognition and alcohol consumption, and because more mechanistic experiments would be very difficult and are institutionally forbidden anyway (“Dear IRB: we would like to violently poison half of these teenagers for four years, while forbidding the other half to engage in standard college socialization”). You could randomize abstinence, but we’ll get back to that.

One problem highly prevalent in alcohol literature is the abstinence bias. People who abstain from alcohol intake are likely to do so for a reason, for example chronic disease, being highly conscientious and religious, or a bad family history with alcohol. Even if you factor out all of the known confounders, it’s still vanishingly unlikely the drinking and non-drinking samples are identical. Whatever the differences, they’re likely to affect cognitive (and other) outcomes. 

Any analysis comparing “no drinking” to “drinking” will suffer from this by estimating the effect of no alcohol + confounders, rather than the effect of alcohol. Unfortunately, this rules out a surprising number of studies (code available upon request). 

Confounding can be mitigated if we have accurate intuitions about the causal network and can estimate the effects of confounders accurately. We have to draw a directed acyclic graph with the relevant causal factors and adjust analyses or design accordingly. This is essential, but it has not permeated all of epidemiology (yet); especially in older literature, it is not done. For a primer, Martin recommends “Draw Your Assumptions” on edX here.

Additionally, alcohol consumption is a politically live topic, and papers are likely to be biased. Which direction the bias runs is a coin flip: public health wants to make alcohol seem scarier, alcohol companies want to make it seem safer. Unfortunately, these biases don’t cancel out; they just obfuscate everything.

What can we do when we know much of the literature is likely biased, but we do not have a strong idea about the size or direction?

Triangulation

If we aggregate multiple estimates that are wrong, but in different (and overall uncorrelated) directions, we will approximate the true effect. For health, we have a few dimensions that we can vary over: observational/interventional, age, and species.

Randomized abstinence studies

Ideally, we would have strong evidence from randomized controlled trials of abstinence. In experimental studies like this, there is no doubt about the direction of causality. And, since participants are randomized, confounders are evenly distributed between intervention and control groups. This means that our estimate of the intervention effect is unbiased by confounders, both measured and unmeasured.

However, we were only able to find two such studies, both from the 80s, among light drinkers (mean 3 standard units per week), and of a duration of only 2-6 weeks (Birnbaum et al., 1983; Hannon et al., 1987).

Birnbaum et al. did not stick to the randomisation when analyzing their data, opening the door to confounding, which should decrease our confidence in their study. They found no effect of abstinence on their 7 cognitive measures.

In Hannon et al., instruction to abstain vs. maintain resulted in a difference in alcohol intake of 12.5 units per week over 2 weeks. On the WAIS-R vocabulary test, abstaining women scored 55.5 ± 6.7 and maintaining women scored 51.0 ± 8.8 (both mean ± SD). On the 3 other cognitive tests performed, they found no difference.

Especially due to the short duration, we should be very wary of extrapolating too much from these studies. However, it appears that for moderate amounts of drinking over a short time period, total abstinence does not provide a meaningful benefit in the above studies.

Observational studies on humans

Due to their observational nature (as opposed to being an experiment), these studies are extremely vulnerable to confounders, colliders, reverse causality etc. However, they are relatively cheap ways of getting information, and are performed in naturalistic settings.

One meta-analysis (Neafsey & Collins, 2011) compared moderate social drinking (< 4 drinks/day) to non-drinkers (note: the definition of moderate varies a lot between studies). They partially compensated for the abstinence bias by excluding “former drinkers” from their reference group, i.e. removing people who’ve stopped drinking for medical (or other) reasons. This should provide a less biased estimate of the true effect. They found a protective effect of social drinking on a composite endpoint, “cognitive decline/dementia” (Odds Ratio 0.79 [0.75; 0.84]).

Interestingly, they also found that studies adjusting for age, education, sex and smoking status did not have markedly different estimates from those that did not (adjusted OR 0.75 vs. unadjusted OR 0.79). This should decrease our worry about confounding overall.

Observational studies on alcohol for infants

Another angle for triangulation is the effect of moderate maternal alcohol intake during pregnancy on the offspring’s IQ. The brain is never more vulnerable than during fetal development. There are obviously large differences between fetal and adolescent brains, so any generalization should be accompanied with large error bars. However, this might give us an upper bound.

Zuccolo et al. (2013) perform an elegant example of what’s called Mendelian randomization.

A SNP variant in a gene (ADH1B) is associated with decreased alcohol consumption. Since SNPs are near-randomly assigned (but see the examination of assumptions below), one can interpret it as the SNP causing decreased alcohol consumption. If some assumptions are met, that’s essentially a randomized controlled trial! Alas, these assumptions are extremely strong and unlikely to be totally true – but it can still be much better than merely comparing two groups with differing alcohol consumption.

As the authors very explicitly state, this analysis assumes that:

1. The SNP variant (rs1229984) decreases maternal alcohol consumption. This is confirmed in the data. Unfortunately, the authors do this by chi-square test (“does this alter consumption at all?”) rather than estimating the effect size. However, we can do our own calculations using Table 5:

If we round each alcohol consumption category to the mean of its bounds (0, 0.5, 3.5, 9), we get a mean intake in the SNP-variant group of 0.55 units/week and a mean intake in the non-carrier group of 0.88 units/week (math). This means that SNP-carrier mothers drink, on average, 0.33 units/week less. That’s a pretty small difference! We would’ve liked the authors to do this calculation themselves, and use it to report IQ difference per unit of alcohol per week. (A sketch of this midpoint calculation appears after the list below.)

2. There is no association between the genotype and confounding factors, including other genes. This assumption is satisfied for all factors examined in the study, like maternal age, parity, education, smoking in 1st trimester etc. (Table 4), but unmeasured confounding is totally a thing! E.g. a SNP which correlates with the current variant and causes a change in the offspring’s IQ/KS2-score.

3. The genotype does not affect the outcome by any path other than maternal alcohol consumption, for example through affecting metabolism of alcohol.
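Here is the midpoint calculation from assumption 1 as a sketch. The category proportions below are placeholders chosen to land near the figures quoted above; the real counts are in the paper’s Table 5.

```python
# Alcohol categories (units/week), each rounded to the mean of its bounds.
midpoints = [0, 0.5, 3.5, 9]

# Placeholder category proportions -- substitute the real Table 5 counts.
p_carrier = [0.70, 0.20, 0.08, 0.02]      # mothers with the rs1229984 variant
p_noncarrier = [0.60, 0.25, 0.11, 0.04]   # non-carrier mothers

def mean_intake(proportions):
    """Weighted mean intake, treating each category as its midpoint."""
    return sum(p * m for p, m in zip(proportions, midpoints))

print(f"carrier: {mean_intake(p_carrier):.2f} units/week")         # ~0.55
print(f"non-carrier: {mean_intake(p_noncarrier):.2f} units/week")  # ~0.88
print(f"difference: {mean_intake(p_noncarrier) - mean_intake(p_carrier):.2f}")
```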

If we believe these assumptions to be true, the authors are estimating the effect of 0.33 maternal alcohol units per week on the offspring’s IQ and KS2-score. KS2-score is a test of intellectual achievement (similar to the SAT) for 11-year-olds with a mean of 100 points and a standard deviation of ~15 points. 

They find that the 0.33 unit/week decrease does not affect IQ (mean difference -0.01 [-2.8; 2.7]) and causes a 1.7 point (with a 95% confidence interval of between 0.4 and 3.0) increase in KS2 score. 

This is extremely interesting. Additionally, the authors complete a classical epidemiological study, adjusting for typical confounders.

That analysis shows that the children of pre-pregnancy heavy drinkers, on average, scored 8.62 (standard error 1.12) points higher on IQ than those of non-drinkers, 2.99 points (SE 1.06) after adjusting for confounders. However, the authors didn’t adjust for alcohol intake in other parts of the pregnancy! Puzzlingly, first-trimester drinking has an effect in the opposite direction: -3.14 points (SE 1.64) on IQ. However, this was also not adjusted for previous alcohol intake. This means that the estimates in Table 1 (pre-pregnancy and first trimester) aren’t independent, but we don’t know how they’re correlated. Good luck teasing out the causal effect of maternal alcohol intake and timing from that.

Either way, the authors (and I) interpret the effects as being highly confounded; either residual (the confounder was measured with insufficient accuracy for complete adjustment) or unknown (confounders that weren’t measured). For example, pre-pregnancy alcohol intake was strongly associated with professional social class and education (upper-class wine-drinkers?), whereas the opposite was true for first trimester alcohol intake. Perhaps drinking while you know you’re pregnant is low social status?

If you’re like Elizabeth you’re probably surprised that drinking increases with social class. I didn’t dig into this deeply, but a quick search found that it does appear to hold up.

This result is in conflict with that of the Mendelian randomization, but it makes sense. Mendelian randomization is less sensitive to confounding, so maybe there is no true effect. Also, the study only estimated the genetic effect of a 0.33 units/week difference, so the analyses are probably not sufficiently powered. 

Taken together, the study should probably update us towards a lack of harm from moderate (whatever that means) levels of alcohol intake, although how big an update that is depends on your previous position. We say “moderate” because fetal alcohol syndrome is definitely a thing, so at sufficient alcohol intake it’s obviously harmful!

Rodents

There is a decently sized, pretty well-conducted literature on adolescent intermittent ethanol exposure (science speak for “binge drinking on the weekend”). Rat adolescence is somewhat similar to human adolescence; it’s marked by sexual maturation, increased risk-taking and increased social play (Sengupta, 2013). The following is largely based on a deeper dive into the linked references from (Seemiller & Gould, 2020).

Adolescent intermittent ethanol exposure is typically operationalised as a blood-alcohol concentration equivalent to ~10 standard alcohol units, 0.5-3 times/day every 1-2 days during adolescence.

To interpret this, we make some big assumptions. Namely:

  1. Rodent blood-alcohol content can be translated 1:1 to human
  2. Effects on rodent cognition at a given alcohol concentration are similar to those on human cognition 
  3. Rodent adolescence can mimic human adolescence

Now, let’s dive in!

Two primary tasks are used in the literature:

The 5-choice serial reaction time task. 

Rodents are placed in a small box, and one of 5 holes is lit up. Rodents are measured on how reliably they touch the lit hole.

Training in the 5-CSRTT varies between studies, but in the two studies below it consisted of 6 training sessions at age 60 days. Initially, rats were rewarded with pellets from the feeder in the box to alert them to the possibility of reward.

Afterwards, training sessions had gradually increasing difficulty: the light stays on for 30 seconds to start, but the duration gradually decreases to 1 second. Rats progressed to the next training schedule based on any of 3 predefined criteria: 100 trials completed, >80% accuracy, or <20% omissions.

Naturally, you can measure a ton of stuff here! Generally, focus is on accuracy and omissions, but there are a ton of others:

From (Boutros et al., 2017) sup. table 1, congruent with (Semenova, 2012)

Now we know how they measured performance; but how did they imitate adolescent drinking?

Boutros et al. administered 5 g/kg of 25% ethanol through the mouth once per day in a 2-day on/off pattern, from age 28 days to 57 days – a total of 14 administrations. Based on blood alcohol content, this is equivalent to 10 standard units at each administration – quite a dose! Surprisingly, they found a decrease in omissions with the standard task, but no other systematic changes, in spite of 50+ analyses on variations of the measures (accuracy, omissions, correct responses, incorrect responses etc.) and task difficulty (length of the light staying on, whether they got the rats drunk etc.). We’d chalk this up to a chance finding.

Semenova used the same training schedule, but administered 5 g/kg of 25% ethanol through the mouth every 8 hours for 4 days – a total of 12 administrations. They found small differences in different directions on different measures, but with the same multiple-comparisons problem. Looks like noise to us.

The Barnes Maze 

Rodents are placed in the middle of an approximately 1m circle with 20-40 holes at the perimeter and are timed on how quickly they arrive at the hole with a reward (and escape box) below it. For timing spatial learning, the location of the hole is held constant. In (Coleman et al., 2014) and (Vetreno & Crews, 2012), rodents were timed once a day for 5 days. They were then given 4 days of rest, and the escape hole was relocated exactly 180° from the initial location. They were then timed again once a day, measuring relearning.


Figure: Tracing of the route taken by a control mouse right after the location was reversed, from Coleman et al., 2014.

Both studies found no effect of adolescent intermittent ethanol exposure on initial learning rate or errors. 

Vetreno found alcohol-exposed rats took longer to escape on their first trial but did equally well in all subsequent trials, whereas Coleman found a ~3x difference in performance on the relearning task, with similar half-times.

Somewhat suspiciously, even though Coleman et al. was published 2 years after Vetreno et al. and they share the same lab, it does not reference Vetreno et al.

This does, technically, show an effect. However, given the small effect size, the number of metrics measured, file-drawer effects, and the disagreement with the rest of the literature, we believe this is best treated as a null result.

Conclusion

So, what should we do? From the epidemiological literature, if you care about dementia risk, it looks like social drinking (i.e. excluding alcoholics) reduces your risk by ~20% compared to not drinking. All other effects were part of a heterogeneous literature with small effect sizes on cognition. Taken together, the long-term cognitive effects of conventional alcohol intake during adolescence should play only a minor role in determining alcohol intake.

Thanks to an FTX Future Fund regrantor for funding this work.

Birnbaum, I. M., Taylor, T. H., & Parker, E. S. (1983). Alcohol and Sober Mood State in Female Social Drinkers. Alcoholism: Clinical and Experimental Research, 7(4), 362–368. https://doi.org/10.1111/j.1530-0277.1983.tb05483.x

Boutros, N., Der-Avakian, A., Markou, A., & Semenova, S. (2017). Effects of early life stress and adolescent ethanol exposure on adult cognitive performance in the 5-choice serial reaction time task in Wistar male rats. Psychopharmacology, 234(9), 1549–1556. https://doi.org/10.1007/s00213-017-4555-3

Coleman, L. G., Liu, W., Oguz, I., Styner, M., & Crews, F. T. (2014). Adolescent binge ethanol treatment alters adult brain regional volumes, cortical extracellular matrix protein and behavioral flexibility. Pharmacology Biochemistry and Behavior, 116, 142–151. https://doi.org/10.1016/j.pbb.2013.11.021

Hannon, R., Butler, C. P., Day, C. L., Khan, S. A., Quitoriano, L. A., Butler, A. M., & Meredith, L. A. (1987). Social drinking and cognitive functioning in college students: A replication and reversibility study. Journal of Studies on Alcohol, 48(5), 502–506. https://doi.org/10.15288/jsa.1987.48.502

Neafsey, E. J., & Collins, M. A. (2011). Moderate alcohol consumption and cognitive risk. Neuropsychiatric Disease and Treatment, 7, 465–484. https://doi.org/10.2147/NDT.S23159

Seemiller, L. R., & Gould, T. J. (2020). The effects of adolescent alcohol exposure on learning and related neurobiology in humans and rodents. Neurobiology of Learning and Memory, 172, 107234. https://doi.org/10.1016/j.nlm.2020.107234

Semenova, S. (2012). Attention, impulsivity, and cognitive flexibility in adult male rats exposed to ethanol binge during adolescence as measured in the five-choice serial reaction time task: The effects of task and ethanol challenges. Psychopharmacology, 219(2), 433–442. https://doi.org/10.1007/s00213-011-2458-2

Sengupta, P. (2013). The Laboratory Rat: Relating Its Age With Human’s. International Journal of Preventive Medicine, 4(6), 624–630. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3733029/

Vetreno, R. P., & Crews, F. T. (2012). Adolescent binge drinking increases expression of the danger signal receptor agonist HMGB1 and toll-like receptors in the adult prefrontal cortex. Neuroscience, 226, 475–488. https://doi.org/10.1016/j.neuroscience.2012.08.046

Zuccolo, L., Lewis, S. J., Davey Smith, G., Sayal, K., Draper, E. S., Fraser, R., Barrow, M., Alati, R., Ring, S., Macleod, J., Golding, J., Heron, J., & Gray, R. (2013). Prenatal alcohol exposure and offspring cognition and school performance. A ‘Mendelian randomization’ natural experiment. International Journal of Epidemiology, 42(5), 1358–1370. https://doi.org/10.1093/ije/dyt172

Quick Poll: Booster Reactions

Lots of people are getting covid boosters now. To help myself and others plan, I ran an extremely informal poll on Twitter and Facebook about how people’s booster side effects compared to their second dose. Take-home message: boosters are typically easier than second shots, but they’re bad often enough that you should have a plan for that.

The poll was a mess for a number of reasons, including:

  • I didn’t describe the options very well, so two-thirds of the responses were freeform, which I collapsed into a few categories.
  • There was a tremendous variation in what combination of shots people got.
  • It’s self-reported. I have unusually data-minded friends which minimizes the typical problem of extreme responses getting disproportionate attention, but it doesn’t eliminate it, and self-report data has other issues.
  • I only sampled people who follow me on social media, who are predominantly <45 years old, reasonably healthy, reasonably high income, and mostly working desk jobs. 
  • I specified mRNA but not the manufacturer; Moderna but not Pfizer boosters are smaller than the original dose.

Nonetheless, the trend was pretty clear.

Of people who received three mRNA shots from the same manufacturer, comparing their second shot to their third:

  • 12 had no major symptoms either time (where major is defined as “affected what you could do in your day.” It specifically does not include arm soreness, including soreness that limited range of motion)
  • 2 had no major symptoms for their second shot but had major for their third
    • Not included in data: one person who got pregnant between their second and third shot
  • 23 had major symptoms for their second shot, and the third was easier
    • This includes at least one case where the third was still extremely bad and 2-3 “still pretty bad, just not as bad as the second”
    • Three cases fell short of  “major symptoms” for the second, but had an even easier third shot
  • 11 people had similar major symptoms both times
  • 2 had major symptoms for second shot, and third was worse

Of people who mixed and matched doses:

  • 2 had no major symptoms either time
  • 4 had no major symptoms for their second shot but had major symptoms for their third
    • Not included: 1 reported no symptoms for the first two and mild symptoms for the third
  • 4 had major symptoms for their second shot, and their third was easier
  • 2 people had major symptoms both times
  • 1 had major symptoms for their second shot, and their third was worse

Quick Look: Altitude and Child Development

A client came to me to investigate the effect of high altitude on child development and has given me permission to share the results. This post bears the usual marks of preliminary client work: I focused on the aspects of the question they cared about the most, not necessarily my favorite or the most important in general. The investigation stops when the client no longer wants to pay for more, not when I’ve achieved a particular level of certainty I’m satisfied with. Etc. In this particular case they were satisfied with the answer after only a few hours, and I did not pursue beyond that.

That out of the way: I investigated the impact of altitude on childhood outcomes, focusing on cognition. I ultimately focused mostly on effects visible at birth, because birth weight is such a hard-to-manipulate piece of data. What I found in <3 hours of research is that altitude has an effect on birth weight that is very noticeable statistically, although the material impact is likely to be very small unless you are living in the Andes.

Children gestated at higher altitudes have lower birth weights

This seems to be generally supported by studies which are unusually rigorous for the field of fetal development. Even better, it’s supported in both South America (where higher altitudes correlate with lower income and lower density, and I suspect very different child-rearing practices) and Colorado (where the income relationship reverses and while I’m sure childhoods still differ somewhat, I suspect less so). The relationship also holds in Austria, which I know less about culturally but did produce the nicest graph.

This is a big deal because until you reach truly ridiculous numbers, higher birth weight is correlated with every good thing, although there’s reason to believe a loss due to high altitude is less bad than a loss caused by most other causes, which I’ll discuss later. 

[Also for any of you wondering if this is caused by a decrease in gestation time: good question, the answer appears to be no.]

Children raised at higher altitudes do worse on developmental tests 

There is a fair amount of data supporting this, and some studies even attempt to control for things like familial wealth, prematurity, etc. I’m not convinced. The effects are modest, and I expect families living at very high altitudes (typically rural) to differ from those at lower altitudes (typically urban) in ways that cause their children to score differently on tests without it making a meaningful impact on their lives (and unlike birth weight, I didn’t find studies based in CO, where some trends reverse). Additionally, none of the studies looked specifically at children who were born at a lower altitude and moved, so some of the effects may be left over from the gestational effects discussed earlier.

Hypoxia may not be your only problem

I went into this primed to believe reduced oxygen availability was the problem. However, there’s additional evidence that UV radiation, which rises with altitude, may also be a concern. UV radiation is higher in some areas for other reasons, and it does indeed seem to correlate with reductions in cognition.

How much does this matter? (not much)

Based on a very cursory look at graphs on GIS (to be clear: I didn’t even check the papers, and their axes were shoddily labeled), 100 grams of birth weight corresponds to 0.2 IQ points for full term babies.

The studies consistently showed ~0.09 to 0.1 grams lower birth weight per meter of altitude. Studies showed this to be surprisingly linear; I’m skeptical and expect the reality to be more exponential or S-shaped, but let’s use that rule of thumb for now. 0.1 g/m means gestating in Denver rather than at sea level would shrink your baby by ~170 grams (where 2500g-4500g is considered normal and healthy). If this were identical to other forms of fetal weight loss, which I don’t think it is, it would very roughly correspond to 0.35 IQ points lost.
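Spelling out that back-of-the-envelope arithmetic (Denver’s elevation here is my added assumption; the official figure is about 1,609 m, which the ~170 g rounds up slightly):

```python
# Rules of thumb from the studies above; both are rough and assumed linear.
grams_per_meter = 0.1   # birth weight lost per meter of altitude
iq_per_100g = 0.2       # IQ points per 100 g of birth weight

denver_m = 1700         # assumed elevation; officially ~1,609 m

weight_deficit_g = grams_per_meter * denver_m            # ~170 g
iq_points_lost = (weight_deficit_g / 100) * iq_per_100g  # ~0.34 points

print(f"~{weight_deficit_g:.0f} g lighter, ~{iq_points_lost:.2f} IQ points")
```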

However, there’s reason to believe high-altitude fetal weight loss is less concerning than other forms. High altitude babies tend to have a higher brain mass percentage and are tall for their weight, suggesting they’ve prioritized growth amidst scarce resources rather than being straight out poisoned. So that small effect is even smaller than it first appears.

There was also evidence out of Austria that higher altitude increased risk of SIDS, but that disappeared when babies slept on their backs, which is standard practice now.

So gestating in Denver is definitely bad then? (No)

There are a billion things influencing gestation and childhood outcomes, and this is looking at exactly one of them, for not very long. If you are making a decision please look at all the relevant factors, and then factor in the streetlight effect that there may be harder to measure things pointing in the other direction. Do not overweight the last thing I happened to read.

In particular, Slime Mold Time Mold has some interesting data (which I haven’t verified but am hoping to at least run an epistemic spot check on) that suggests higher altitudes within the US have fewer environmental contaminants, which you would expect to have all sorts of good effects.

Full notes available here.

Thanks to anonymous client for commissioning this research and Miranda Dixon-Luinenburg for copyediting.

Long Covid Informal Study Results

Introduction

Yesterday* I talked about a potential treatment for Long Covid and referenced an informal study I’d analyzed that tried to test it, which had seemed promising but was ultimately a letdown. That analysis was too long for its own post, so it’s going here instead.

Gez Medinger ran an excellent-for-its-type study of interventions for long covid, with a focus on niacin, the center of the stack I took. I want to emphasize both how very good for its type this study was, and how limited the type is. Surveys of people in support groups who chose their own interventions are not a great way to determine anything. But really rigorous information will take a long time, and some of us have to make decisions now, so I thought this was worth looking into.

Medinger does a great analysis in this youtube video. He very proactively owns all the limitations of the study (all of which should be predictable to regular readers of mine) and does what he can to make up for them in the analysis, while owning where that’s not possible. But he delivers the analysis in a video rather than a text post (ugh, why would you do that? Answer: he was a professional filmmaker before he got long covid). I found this deeply hard to follow, so I wanted to play with the data directly. Medinger generously shared the data, at which point this snowballed into a full-blown analysis.

I think Medinger attributes his statistics to a medical doctor, but I couldn’t find it on a relisten and I’m not watching that damn video again. My statistical analysis was done by my dad/Ph.D. statistician R. Craig Van Nostrand. His primary work is in industrial statistics, but the math all transfers, and the biology-related judgment calls were made by me (for those of you just tuning in, I have a BA in biology and no other relevant credentials or accreditations).

The Study

As best I can determine, Medinger sent a survey to a variety of long covid support groups, asking what interventions people had tried in the last month, when they’d tried them, and how they felt relative to a month ago. Obviously this has a lot of limitations – it will exclude people who got better or worse enough they didn’t engage with support groups, it was in no way blinded, people chose their own interventions, it relied entirely on self-assessment, etc. 

Differences in Analysis

You can see Medinger’s analysis here. He compared the rate of improvement and decline among groups based on treatments. I instead transformed the improvement bucket to a number and did a multivariate analysis. 

| Response | Score |
|---|---|
| Much better (near or at pre-covid) | 1 |
| Significantly better | 0.5 |
| A little better | 0.1 |
| No change | 0 |
| A little worse | -0.2 |
| Significantly worse | (curiously unused) |
| Much worse | -1.2 |

You may notice that the numerical values of the statements are not symmetric: being “a little worse” is twice as bad as “a little better” is good. This was deliberate, based on my belief that people with chronic illness on average overestimate their improvement over short periods of time. We initially planned on doing a sensitivity analysis to see how this changed the results; in practice the treatment groups had very few people who got worse, so this would only affect the no-treatment control, and it was obvious that fiddling with the numbers would not change the overall conclusion.

Also, no one checked “significantly worse”, and when asked Medinger couldn’t remember if it was an option at all. This suggests to me that “Much worse” should have a less bad value and “a little worse” a more bad value. However, we judged this wouldn’t affect the outcome enough to be worth the effort, and ignored it. 
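In code, the transformation is just a lookup table applied before averaging; a minimal sketch using the values above:

```python
# Map each survey response bucket to a recovery score (1 = complete recovery).
IMPROVEMENT_SCORE = {
    "Much better (near or at pre-covid)": 1.0,
    "Significantly better": 0.5,
    "A little better": 0.1,
    "No change": 0.0,
    "A little worse": -0.2,
    # "Significantly worse" was never selected, so it needs no value.
    "Much worse": -1.2,
}

def mean_score(responses):
    """Average recovery score for a treatment group."""
    scores = [IMPROVEMENT_SCORE[r] for r in responses]
    return sum(scores) / len(scores)

print(mean_score(["A little better", "No change", "Much worse"]))
```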

We tossed all the data where people had made a change less than two weeks ago (this was slightly more than half of it), except for the no-change control group (140 people). Most things take time to have an effect and even more things take time to have an effect you can be sure isn’t random fluctuation. The original analysis attempted to fix this by looking at who had a sudden improvement or worsening, but I don’t necessarily expect a sudden improvement with these treatments.

We combined prescription and non-prescription antihistamines because the study was focused on the UK which classifies several antihistamines differently than the US. 

On row 410, a user used slightly nonstandard answers, which we corrected to being equivalent to “much improved”, since they said they were basically back to normal.

Medinger uses both “no change” and “new supplements but not niacin” as control groups, in order to compensate for selection and placebo effects from trying new things. I think that was extremely reasonable but felt I’d covered it by limiting myself to subjects with >2 weeks on a treatment and devaluing mild improvement. 

Results

I put my poor statistician through many rounds on this before settling on exactly which interventions we should focus on. In the end we picked five: niacin, antihistamines, and low-histamine diet, which the original analysis evaluated; vitamin D (because it’s generally popular); and selenium (because it had the strongest evidence of the substances prescribed in the larger protocol, which we’ll discuss soon).

Unfortunately, people chose their vitamins themselves, and there was a lot of correlation between the treatments. Below is the average result for people with no focal treatments, everyone with a given focal treatment, and everyone who did that and none of the other focal treatments for two weeks (but may have done other interventions). I also threw in a few other analyses we did along the way. These sample sizes get really pitifully small, and so should be taken as preliminary at best. 

| Treatment | Niacin >2 wks | Selenium >2 wks | Vit. D >2 wks | Antihistamines >2 wks | Low-histamine diet >2 wks | Change (1 = complete recovery) | 95% CI | n |
|---|---|---|---|---|---|---|---|---|
| No change | 0 | 0 | 0 | 0 | 0 | 0.04 | ±0.07 | 140 |
| Niacin, >2 weeks | 1 | – | – | – | – | 0.23 | ±0.07 | 91 |
| Selenium, >2 weeks | – | 1 | – | – | – | 0.24 | ±0.07 | 88 |
| Vitamin D, >2 weeks | – | – | 1 | – | – | 0.15 | ±0.05 | 261 |
| Antihistamines, >2 weeks | – | – | – | 1 | – | 0.18 | ±0.06 | 164 |
| Low-histamine diet, >2 weeks | – | – | – | – | 1 | 0.18 | ±0.06 | 195 |
| Niacin, >2 weeks, no other focal treatments | 1 | 0 | 0 | 0 | 0 | 0.15 | ±0.2 | 11 |
| Selenium, >2 weeks, no other focal treatments | 0 | 1 | 0 | 0 | 0 | 0.05 | ±0.06 | 4 |
| Vitamin D, >2 weeks, no other focal treatments | 0 | 0 | 1 | 0 | 0 | 0.07 | ±0.08 | 106 |
| Antihistamines, >2 weeks, no other focal treatments | 0 | 0 | 0 | 1 | 0 | 0.08 | ±0.13 | 26 |
| Low-histamine diet, >2 weeks, no other focal treatments | 0 | 0 | 0 | 0 | 1 | 0.13 | ±0.14 | 44 |
| All focal treatments | 1 | 1 | 1 | 1 | 1 | – | – | 0 |
| Niacin + antihistamines, >2 weeks | 1 | – | – | 1 | 0 | 0.33 | ±0.07 | 38 |
| Niacin + low-histamine diet, >2 weeks | 1 | 0 | 0 | 0 | 1 | 0.29 | ±0.10 | 36 |
| Selenium + niacin, no histamine interventions | 1 | 1 | – | 0 | 0 | 0.05 | ±0.19 | 17 |
| Niacin, >2 weeks, no other focal treatments, ignore D | 1 | 0 | – | 0 | 0 | 0.13 | ±0.12 | 19 |
| Selenium, >2 weeks, no other focal treatments, ignore D | 0 | 1 | – | 0 | 0 | 0.16 | ±0.12 | 18 |

1 = treatment used

0 = treatment definitely not used

– = treatment not excluded

Confidence interval calculation assumes a normal distribution, which is a stretch for data this lumpy and sparse, but there’s nothing better available.
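For reference, that interval is presumably the usual normal-approximation one, mean ± 1.96 standard errors; a sketch:

```python
import math

def normal_ci(scores, z=1.96):
    """Mean and 95% CI half-width under a normal approximation."""
    n = len(scores)
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)  # sample variance
    halfwidth = z * math.sqrt(var / n)                    # z * standard error
    return mean, halfwidth

mean, hw = normal_ci([0.1, 0.0, 0.5, 0.0, 1.0, -0.2])
print(f"{mean:.2f} ± {hw:.2f}")
```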

[I wanted to share the raw data with you but Medinger asked me not to. He was very fast to share with me though, so maybe if you ask nicely he’ll share with you too]

You may also be wondering how the improvements were distributed. The raw count isn’t high enough for really clean curves, but the results were clumped rather than bifurcated, suggesting the treatment helps many people some rather than a few people lots. Here’s a sample graph from niacin (>2 weeks, no exclusions).

Reasons this analysis could be wrong

  • All the normal reasons this kind of study or analysis can be wrong.
  • Any of the choices I made that I outlined in “Differences…”
  • There were a lot of potential treatments with moderate correlations with each other, which makes it impossible to truly track the cause of improvements.
  • Niacin comes in several forms, and the protocol I analyze later requires a specific form of niacin (I still don’t understand why). The study didn’t ask people what form of niacin they took. I had to actively work to get the correct form in the US (where 15% of respondents live); it’s more popular but not overwhelmingly so in the UK (75% of respondents), and who knows what other people took. If the theory is correct and if a significant number of people took the wrong form of niacin, it could severely underestimate the improvement.
  • This study only looked at people who’d changed things in the last month. People could get better or worse after that.
  • There was no attempt to look at dosage.

Conclusion

For a small sample of self-chosen interventions and opt-in participation, this study shows modest improvements from niacin and low-histamine diets, though their confidence intervals overlap with the no-treatment group’s if you exclude people using other focal interventions. The overall results suggest that either something in the stack is helping, or trying lots of things is downstream of feeling better, which I would easily believe.

Thank you to Gez Medinger for running the study and sharing his data with me, R. Craig Van Nostrand for statistical analysis, and Miranda Dixon-Luinenburg⁩ for copyediting.

* I swear I scheduled this to publish the day after the big post, but here we are three days later with it still unpublished, so…

Consider Taking Zinc Every Time You Travel

Zinc lozenges are pretty well established to prevent or shorten the duration of colds. People are more likely to get colds while travelling, especially if doing so by plane and/or to a destination full of other people who also travelled by plane. I have a vague sense you shouldn’t take zinc 100% of the time, but given the risks it might make sense to take zinc prophylactically while travelling.

How much does zinc help? A meta-analysis I didn’t drill into further says it shortens colds by 33%, and that’s implied to be for people who waited until they were symptomatic to take it; taken preemptively, I’m going to ballpark it at 50% shorter (including some colds never coming into existence at all). This is about 4 days, depending on which study you ask.

[Note: only a few forms of zinc work for this. You want acetate if possible, gluconate if not, and it needs to be a lozenge, not something you swallow. Zinc works by physically coating your throat to prevent infection; it’s not acting as a nutrient in this case. You need much more than you think to achieve the effect; the brand I use barely fits in my tiny mouth.]

Some risk factors for illness in general are “being around a lot of people”, “poor sleep” and “poor diet”. These factors compound: being around people who have been around a lot of people, or who have poor sleep or diet, is worse than being around a lot of well-rested, well-fed hermits. Travel often involves all of these things, especially by air and especially for large gatherings like conferences and weddings (people driving to camp in the wilderness: you are off the hook).

I struggled to find hard numbers for risk of infection during travel. It’s going to vary a lot by season, and of course covid has confused everything. Hocking and Foster gives a 20% chance of catching a cold after a flight during flu season, which seems high to me, but multiple friends reported a 50% chance of illness after travel, so fill in your own number here. Mine is probably 10%.

If my overall risk of a cold is 10%, and I lower the duration by 50%/4 days, I’ve in expectation saved myself 0.4 days of a cold, plus whatever damage I would have done spreading the cold to others, plus the remaining days are milder. Carrying around the lozenges, remembering to take them, and working eating and drinking around them is kind of inconvenient, so this isn’t a slam dunk for me, but it is worth best effort (while writing this I ordered a second bottle of zinc to sit in my travel toiletry bag). It’s probably worth a lot for my friends who have a 50% risk of illness after travel, have unusually long colds, or live with small children who get cranky when sick. You know better than me where you fall.
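The expected-value arithmetic, spelled out so you can substitute your own numbers:

```python
p_cold = 0.10            # my guess at the chance of a cold from one trip
days_saved_if_sick = 4   # ~50% of a cold's duration, per the ballpark above

expected_days_saved = p_cold * days_saved_if_sick
print(expected_days_saved)        # 0.4 days saved per trip, in expectation

# A friend with a 50% post-travel illness rate gets 5x the benefit:
print(0.50 * days_saved_if_sick)  # 2.0 days per trip
```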

Things that would change this cost-benefit estimate:

  • Seasonality
  • Personal reaction to zinc, or beliefs about its long term effects
  • Covid (all the numbers I used were pre-covid)
  • Different estimates for risk of illness during travel
  • Different estimates for the benefit of zinc
  • Personal susceptibility to illness

Caveats: anything that does anything real can cause damage. The side effects we know about for zinc lozenges are typically low, but pay attention to your own reaction in case you are unlucky. I remain an internet person with no medical credentials or accreditation. I attempt to follow my own advice and I’ve advised my parents to do this as well, but sometimes I’m rushed and forget.

ETA: I originally wrote this aimed at friends who already believed zinc was useful but hadn’t considered prophylactic use, and as such didn’t work very hard on it. I mistook some rando meta-analysis for a Cochrane review and didn’t look further. There’s a pre-registered study that has come out since showing no effect from zinc. There could be other studies showing the opposite; I haven’t looked very closely. Plausibly that makes publishing this irresponsible: you definitely should judge me for mistaking a review that mentioned Cochrane for an actual Cochrane review. OTOH, writing too defensively inhibits learning, and I want to think my readers in particular are well calibrated on how much to trust off-the-cuff writing (but I hindered that by mislabeling the review as from Cochrane).

Long Covid Is Not Necessarily Your Biggest Problem

Introduction

At this point, people I know are not that worried about dying from covid. We’re all vaccinated, we’re mostly young and healthy(ish), and it turns out the odds were always low for us. We’re also not that worried about hospitalization: it’s much more likely than death, but maintaining covid precautions indefinitely is very costly so by and large we’re willing to risk it.

The big unknown here has been long covid. Losing a few weeks to being extremely sick might be worth the risk, but a lifetime of fatigue and reduced cognition is a very big deal. With that in mind, I set out to do some math on what risks we were running. Unfortunately baseline covid has barely been around long enough to have data on long covid, most of it is still terrible, and the vaccine and Delta variant have not been widespread long enough to have much data at all. 

In the end, the conclusion I came to was that for vaccinated people under 40 with <=1 comorbidity, the cognitive risks of long covid are lost in the noise of other risks they commonly take. Coming to this conclusion involved reading a number of papers, but also a lot of emotional processing around risk and health. I’ve included that processing under a “personal stuff” section, which you can skip if you just want the info, but I encourage you to read it if you feel yourself starting to yell that I’m not taking small risks of great suffering seriously. I do encourage you to read the caveats section before deciding how much weight to put on my conclusions.

Personal Stuff

This post took a long time to write, much longer than I wanted, because this is not an abstract topic to me. I have chronic pain from nerve damage in my jaw caused by medical incompetence, and my attempts to seek treatment for this continually run into the brick wall of a medical system that doesn’t consider my pain important (tangent: if you have a pain specialist you trust, anywhere in the US, please e-mail me (elizabeth@acesounderglass.com)). I empathize very much with the long covid sufferers who are being told their suffering doesn’t exist because it’s too hard to measure and we can’t prove what caused it.

Additionally, I’m still suffering from side effects from my covid vaccine in April. It’s very minor, chest congestion that doesn’t seem to affect my lung capacity (but I don’t have a clear before picture, so hard to say for sure). But it’s getting worse and while my medical practitioners are taking it seriously, this + the experience with dental pain make me very sensitive to the possibility they might stop if it becomes too much work for them. As I type this, I am taking a supplement stack from a high end internet crackpot because first line treatment failed and there aren’t a lot of other options. And that’s just from the vaccine; I imagine if I actually had covid I would not be one of the people who shakes it off the way I describe later in this post. 

All this is to say that when I describe the long term cognitive impact of covid as being too small to measure with our current tools against our current noise levels, that is very much not the same as saying it’s zero. It’s much worse than that. What I’m saying is that you are taking risks of similar levels of suffering and impairment constantly, which our health system is very bad at measuring, and against that background long covid does not make much of a difference for people within certain age and health parameters. 

A common complaint when people say “X isn’t dangerous to the young and healthy” is that it implies the death and suffering of those who aren’t young and healthy don’t matter. I’m not saying that. It matters a lot, and it’s impossible for me to forget that because I’m very unlikely to be one of the people who gets to totally walk covid off if I catch it. But from looking at the data, there don’t seem to be very many of us in my age group.

Caveats

Medical research in general is really bad, research of a live issue in a pandemic is worse, you should assume these are low quality studies unless I indicate otherwise.

This research was compiled for LessWrong and Redwood Research, with the goal of assessing safety for their office spaces populated by mostly-but-not-entirely-healthy people 25-40, who were much more interested in the cognitive and fatigue sequelae than the physical. Much of this research is applicable outside that group or the sources can be used in that way, but you should know that’s what I focused on.

There isn’t any data on long covid in vaccinated people with breakthrough delta-variant infections. Neither vaccines nor delta have been around long enough for that to exist. Baseline covid has barely been around long enough to have long-term data. What I have here is:

  • Data showing that strength of acute infection correlates with long term impact, although not perfectly
  • Data on the long term impact of baseline covid, given the strength of an initial infection
  • Data on how the vaccine impacts the strength of acute infections
  • Data on how delta impacts the strength of acute infections

Data

Long term outcomes correlate with short term outcomes

By far the best study (best does not mean good) comes out of the UK, where the BBC coincidentally started an online intelligence test in January 2020 (giving them a pre-covid baseline) and in May began asking participants if they’d had covid and if so how bad a case. When I said “assume the studies are terrible unless I note otherwise”, this is the study I wanted to highlight as reasonably good. Because they can compare test-takers in a given time period with and without covid they can control for some of the effects of changing a study population over time, which would be the biggest concern. Additionally, my statistical consultant described the paper as “not having any errors that affect the conclusion”, which is extremely good for a medical paper. This study was not ideal for determining sequelae persistence, but they did check if size of effect was correlated with time since symptom onset, and it wasn’t (but their average was only 2 months).

This study showed a very direct correlation between the severity of the acute infection and cognitive decline. I don’t trust its absolute numbers, but the pattern that more severe disease -> more severe persistent effects is very clear.

A second study in Wuhan, China (hat tip Connor Flexman) examined long-term outcomes of hospitalized patients based on the intensity of their care (hospitalization, supplemental oxygen, ventilation), and found that an increase in acute severity was correlated with an increase in sequelae, although this didn’t hold for every symptom (there are a lot of symptoms and the highest-intervention group is small), and they barely looked at cognitive symptoms.

Taquet et al. used electronic health records to get a relatively unbiased six-figure sample size, and also found a strong correlation between acute and long-term outcomes, which we’ll talk about more below.

From this I conclude that your overall risk of long covid is strongly correlated with the strength of the initial infection.

Odds of acute outcomes

Sah et al estimate that 35% of covid cases (implied to be baseline and pre-vaccination) are asymptomatic, with large variation by age. Children (<18) are 46% likely to be asymptomatic, adults 18-59 are 32% likely, adults >=60 are 20% likely. I’m going to round the non-elderly adult number to ⅓ to make the math easier.

The Economist has a great calculator showing your pre-vaccine, pre-Delta risk of hospitalization and death, given your age, sex, and comorbidities. Note that this calculator only includes diagnosed cases, so it excludes both asymptomatic cases and those that did have symptoms but didn’t drive people to seek medical care. Here’s a few sample people:

  • A healthy 30 year old man has a 2.7% chance of hospitalization, and <0.1% risk of death
  • A healthy 30 year old woman has a 1.7% chance of hospitalization, and <0.1% risk of death
  • A 25 year old man with asthma has a 4.2% risk of hospitalization, and <0.1% risk of death
  • A 40 year old woman with obesity has a 6.5% risk of hospitalization, and 0.1% risk of death.
  • Risk of hospitalization rises steadily with age, but the risk of death doesn’t really take off until 50, at which point our healthy man has a 0.4% risk of death and our healthy woman a 0.2% risk.

If you’d like, you can use your own numbers in this guesstimate sheet.

And again, that’s only for officially diagnosed and registered cases. If you assume ⅓ of infections in that age group are asymptomatic, the risk drops by ⅓ (i.e., multiply by ⅔).

If you are hospitalized, your risk of being ventilated is currently very, very low, even if you’re in a high-risk category. The overall percentage of hospitalized patients who were ventilated was 2.0% in the last week for which data was available (2021-03-24), after dropping steadily for most of the plague. We can assume that’s disproportionately among the elderly and people with severe comorbidities, so if that’s not you, your odds are better still. I’m going to count the risk of intubation, given hospitalization, for our cohort as 0.5%, although that’s likely still an overestimate.
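
To make the arithmetic concrete, here’s a minimal Python sketch of the pre-vaccine, pre-delta pipeline; the function and constant names are mine, and the values are the estimates from the text:

```python
# Pre-vaccine, pre-delta risk sketch, using the estimates above.
ASYMPTOMATIC_FRACTION = 1 / 3          # Sah et al., rounded, non-elderly adults
INTUBATION_GIVEN_HOSPITALIZED = 0.005  # my 0.5% estimate above

def per_infection_risk(diagnosed_hospitalization_risk):
    """Fold asymptomatic infections into a diagnosed-case hospitalization
    risk, then apply the intubation-given-hospitalization estimate."""
    hospitalized = diagnosed_hospitalization_risk * (1 - ASYMPTOMATIC_FRACTION)
    intubated = hospitalized * INTUBATION_GIVEN_HOSPITALIZED
    return hospitalized, intubated

# Healthy 30yo man: 2.7% diagnosed-case hospitalization risk
# -> ~1.8% hospitalization per infection, ~0.009% intubation
print(per_infection_risk(0.027))
```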

How do vaccines change these odds? According to CDC data from a time period ending 2021-05-01 (so before delta took off), 27% of breakthrough infections that reached the attention of the CDC were asymptomatic, and only 7% were hospitalized due to covid (another 3% were hospitalized for non-covid reasons). It’s very likely that the CDC is undercounting asymptomatic cases, so we’ll continue using our ⅓ number for now. The minimum age of reported breakthrough-infection deaths was 71, so we’ll continue to treat the risk of death as 0% for our sample subjects. Additionally, given the timing, most vaccinated participants would have been elderly or front-line workers, raising their risk considerably. A CDC press release goes much farther, saying vaccinated people over 65 had 7% of the hospitalizations of age-matched controls.

How does delta change these odds? A Scottish study estimated that delta carries 2x the hospitalization risk of alpha, and a Danish study estimated that alpha carries 1.42x the hospitalization risk of baseline covid. Multiplying those out (2 × 1.42 ≈ 2.8), we’re very roughly looking at 3x the risk of hospitalization from delta, relative to baseline.

So for our sample cases above, we have the following odds (note I updated these on the night it was posted, due to a math error. Thanks to Rob Bensinger for catching it):

Risk given vaccine + delta:

  • Healthy 30yo man: 0.38% hospitalized (= 2.7% × 0.07 × 3 × ⅔), 0.002% intubated (= 0.38% × 0.005)
  • Healthy 30yo woman: 0.24% hospitalized (= 1.7% × 0.07 × 3 × ⅔), 0.001% intubated (= 0.24% × 0.005)
  • Asthmatic 25yo man: 0.59% hospitalized (= 4.2% × 0.07 × 3 × ⅔), 0.003% intubated (= 0.59% × 0.005)
  • Obese 40yo woman: 0.91% hospitalized (= 6.5% × 0.07 × 3 × ⅔), 0.005% intubated (= 0.91% × 0.005)
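
If you want to check or tweak these numbers, here’s a minimal Python sketch reproducing the table; the constant names are mine, and the values are the estimates assembled above:

```python
VACCINE_FACTOR = 0.07         # ~7% of CDC-reported breakthroughs hospitalized
DELTA_FACTOR = 3              # ~2x (delta vs alpha) * ~1.42x (alpha vs baseline)
SYMPTOMATIC_FRACTION = 2 / 3  # assumes 1/3 of infections are asymptomatic
INTUBATION_GIVEN_HOSPITALIZED = 0.005  # my 0.5% estimate above

# Diagnosed-case hospitalization risk (%) from the Economist calculator
cases = {
    "Healthy 30yo man":   2.7,
    "Healthy 30yo woman": 1.7,
    "Asthmatic 25yo man": 4.2,
    "Obese 40yo woman":   6.5,
}

for name, base_pct in cases.items():
    hospitalized = base_pct * VACCINE_FACTOR * DELTA_FACTOR * SYMPTOMATIC_FRACTION
    intubated = hospitalized * INTUBATION_GIVEN_HOSPITALIZED
    print(f"{name}: {hospitalized:.2f}% hospitalized, {intubated:.3f}% intubated")
```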

That’s not so far from the flu’s rate of hospitalization in that age range (0.6%), with some caveats (the CDC flu sample includes unvaccinated people, and its bucket is 18-49 years old, with the higher end presumably carrying more of the disease burden).

There is concern that vaccine effectiveness wanes over time, which I haven’t incorporated here.

Odds of long term outcomes

In general I ignored studies that merely tracked the number of persistent sequelae but not their severity or type, which made it impossible to distinguish “sense of smell still iffy” from “permanent intellectual crippling”, and studies that didn’t track how long the sequelae persisted. This was, unfortunately, most of them.

We talked about the Great British Intelligence Test above. I initially found this study quite scary. The study used its own tests rather than a standard IQ test, but if you assume a standard deviation on their tests is equivalent to a standard deviation on an IQ test (15 points), the worst category (ventilation) suffered the equivalent of a 7-point IQ loss. That’s twice as bad as a stroke in the same study (although I suspect sampling bias). I suspect the truth is worse still, because the worse your recently acquired cognitive and health issues are, the less likely you are to take a fun internet test advertised as measuring your intellectual strengths. However, as I noted above, you are extremely unlikely to be put on a ventilator.

For people with “symptoms, but not respiratory symptoms”, the cognitive damage is equivalent to roughly 0.6 IQ points; for “medical assistance at home”, 1.8 points. Both are likely to be overestimates, given that the study only included known (although not necessarily formally diagnosed) cases. Additionally, while the paper claims to control for education, income, etc., bad things are more likely to happen to people in worse environments, and it’s impossible to entirely back that out.
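
For concreteness, here’s a sketch of the conversion being done here, under the assumption above that one standard deviation on the study’s tests equals one IQ standard deviation (15 points); the effect sizes in SD units are backed out from the IQ-point figures in the text, not quoted from the paper:

```python
IQ_SD = 15  # one standard deviation on a standard IQ scale

def sd_to_iq_points(effect_size_sd):
    """Convert an effect size in test standard deviations to IQ points,
    assuming 1 test SD == 1 IQ SD (the assumption made above)."""
    return effect_size_sd * IQ_SD

print(sd_to_iq_points(0.47))  # ~7 points: ventilated group
print(sd_to_iq_points(0.12))  # 1.8 points: "medical assistance at home"
print(sd_to_iq_points(0.04))  # 0.6 points: symptomatic, non-respiratory
```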

Taquet et al used electronic health records to get a relatively unbiased six-figure sample size, and found that unhospitalized diagnosed covid patients (pre-delta, pre-vaccine) had an 11% likelihood of a new neurological or psychiatric diagnosis after their covid diagnosis, hospitalized patients a 15% likelihood, and ICU patients a 26% likelihood. The majority of these were mood disorders (3.86%/4.49%/5.82% for home/hospitalized/ICU) and anxiety (6.81%/6.91%/9.79%). This seems quite bad, until you compare it to the overall numbers for depression over the same period, a naive reading of which suggests that covid had a protective effect.

These numbers aren’t directly comparable; the second study is much lower quality and includes rediagnoses (although the total depression-diagnosis numbers for the covid patients, 13.10%/14.69%/15.43%, are still under the total increase in depression in the general-population study).

Overall this seems well within what you’d expect from getting a scary disease at a scary time, and not evidence of widespread neuro or psych impact of covid. Even if you take the numbers at face value, they exclude most people who were asymptomatic or treated at home without a formal diagnosis.

A UK metareview found that the prevalence at 12 weeks of symptoms affecting daily life ranged from 1.2% (average age 20, minimum 18) to 4.8% (average age 63). The cohort with average age 31 had a mean prevalence of 2.8%, which is well within the Lizardman Constant. This is based on self-reported survey data, which will again exclude asymptomatic cases, so even if you treat the 2.8% as real, you should discount it by ⅓ (to roughly 1.9%) to get a per-infection risk.

On the other hand, medicine is notoriously bad at measuring persistent, low-level, amorphous-yet-real effects. The Lizardman Constant doesn’t mean prevalences below 4% don’t exist, it means they’re impossible to measure using naive tools.

Comparison to other diseases

The Taquet study did compare covid patients to those with other respiratory diseases (including the flu), though without controlling for disease severity or patient age, and found covid to be modestly worse, except for myoneural junction and other muscular diseases, where covid 5xed the risk (although it’s still quite low in absolute terms). Dementia risk also doubled, presumably mostly among the elderly.

Additionally, cognitive impairment following critical illness, and especially following intubation, is a well-known phenomenon. This puts the Great British Intelligence Test numbers in perspective: needing to be ventilated is quite bad, but it has always been that bad; there doesn’t appear to be any unique-to-covid badness.

Conclusion

My tentative conclusion is that the risks to me of cognitive, mood, or fatigue side effects lasting >12 weeks from long covid are small relative to risks I was already taking, including the risk of similar long term issues from other common infectious diseases. Being hospitalized would create a risk of noticeable side effects, but is very unlikely post-vaccine (although immunity persistence is a major unresolved concern).

I want to emphasize again that “small relative to risks you were already taking” doesn’t necessarily mean “too small to worry about”. For comparison, Josh Jacobson did a quick survey of the risks of driving and came to roughly the same conclusion: the risks are very small compared to the overall riskiness of life for people in their 30s. Josh isn’t stupid, so he obviously doesn’t mean “car accidents don’t happen” or “car accidents aren’t dangerous when they happen”. What he means is that if you’re 35 with 15 years driving experience and not currently impaired, the marginal returns to improvements are minor. 

And yet. I have a close friend who somehow got in three or four moderate car accidents in < 7 years, giving her maybe-permanent soft tissue damage (to answer the obvious question: no, the accidents weren’t her fault. Sometimes she wasn’t even driving). Statistically, that friend doesn’t exist. No one gets in that many car accidents that quickly without it being their fault. And yet the law of large numbers has to catch up with someone. Too small to measure can be very large.


What this means is not that covid is safe, but that you should think about covid in the context of your overall risk portfolio. Depending on who you are that could include other contagious diseases, driving, drugs-n-alcohol, skydiving, camping, poor diet, insufficient exercise, too much exercise, and breathing outside. If you decide your current risk level is too high, or are suddenly realizing you were too risk-tolerant in the past, reducing covid risk in particular might not be the best bang for your buck. Paying for a personal trainer, higher quality food, or a HEPA filter should be on your radar as much as reducing social contact, although for all I know that will end up being the best choice for you personally. 

Change my mind

My own behavior and plans have changed a lot based on this research, so I’m extremely interested in counterarguments. To make that easy, here’s a non-exhaustive list of things that would change my mind:

  1. Evidence that long covid gets worse over time rather than slowly improving (note that I did look at data from SARS 1 and failed to find this).
  2. Evidence that new variants increase the risk to what it was, or was feared to be, in April 2020.
  3. Evidence of more severe vaccine attenuation than we’re currently seeing.
  4. Credible paths through which the risk could drop sharply in the next six months.

Thanks to LessWrong and Redwood Research for funding this research, Connor Flexman and Ray Arnold for comments on drafts, and Rob Bensinger and Lanrian for catching errors post-publication that did not affect my overall conclusion.