Recently a client commissioned me to look at the potential cognitive impacts of general anesthesia. I was surprised to find out that it’s not obvious general anesthesia does more damage than spinal or local anesthesia, and my guess is most but not all of the damage is done by the illness or surgery themselves.
Caveats and difficulties
I’m not a doctor. The following represents something like 5 hours of work, which obviously is not enough time to process even a fraction of the literature. I was focused on the dangers of median uses of anesthesia, where nothing goes obviously wrong and the anesthesiologist considers it a success; I didn’t even attempt to look at the rate of accidents, which can be pretty severe. My friend’s dad’s life was ruined by a fungal contaminant in a spinal injection. And of course, people die from excess general anesthesia. But for this post I only looked at damage done by routine anesthetic usage.
Like all client research, this was tailored to a particular person’s needs and budget, and shouldn’t be considered a general-purpose survey.
It’s pretty hard to tease out the difference between damage done by anesthesia, damage done by whatever necessitated the surgery, and damage done by having your body ripped open and bits moved around. Bodies hate that sort of thing. The few RCTs that exist by necessity focus on a narrow range of minimally invasive surgeries for which there exists a choice in type of anesthesia, and animal studies tended to focus on developing animals rather than adults. Even for procedures where multiple types are possible, patients tend to be pretty opinionated about what they want; one paper even announced they’d given up on reaching their sample size goal because recruiting was too hard.
Studies also often focused on cognition within a few hours of surgery (when people are still at the hospital to test). I think that’s less likely to be “damage” and more likely to be “it’s still wearing off” or “I’m sorry, I just had minor surgery and you want me to take an IQ test?”. This made me throw out a lot of studies.
Few if any of the papers attempted to control for post-operative condition or pain med usage, which seems like an enormous oversight to me.
My overall take home is that:
Little or nothing that necessitates surgery is good for cognition and that needs to be factored into assessments.
Surgery itself is enormously stressful, physically and emotionally, and that stress impairs cognition, sometimes in lasting ways. This includes procedures that are not cutting new holes in you, like kidney stone treatments, although presumably it’s worse for open heart surgery.
Probably there are additional effects from anesthesia. At least general and spinal, maybe including local. On priors I still believe general and spinal are worse on a purely physical level.
Probably a lot of whatever damage there is heals in most people, although people who need surgery are already under heavy load and will be the worst at healing.
There may be treatments that can prevent damage but they’re still in rodent trials right now.
I also believe that being awake and aware during surgery can be emotionally traumatic, and trauma is also bad for cognition, so include that in your math.
But I’m not trustworthy on this, seeing as I was terrorized by a series of dentists and now can’t get myself through simple teeth cleaning without some sort of bribe, a human to guard me from the bad dentist monster, and a sedative.
I didn’t rigorously track correlational studies, but my sense was they tended to show faster recovery from local and spinal anesthetic relative to general, presumably because milder cases get milder anesthesia even when the procedures use the same billing code. Additionally, a lot of studies tested patients too soon after surgery, which I don’t expect to predict long-term damage.
In the few studies that randomly assigned patients to spinal, local, or general anesthesia, and surveyed at least 7 days out, it’s really hard to pick a winner.
Incidence of postoperative cognitive dysfunction after general or spinal anaesthesia for extracorporeal shock wave lithotripsy tries really hard to claim that spinal and general anesthetic are equally damaging to cognition, despite finding a roughly 3x higher rate of cognitive issues after general anesthetic. I showed this paper to my statistician father and he gave a rant I wish I had recorded, because it would make me famous in the right corner of Twitter. Hell hath no fury like a statistician forced to read a medical paper. He agreed with me that 19.6% (the rate of complications in the general group) was much larger than 6.8% (the rate in the spinal group), but dismissed that as merely a felony next to the war crimes against statistics they committed by using the wrong test for statistical significance.
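For what it's worth, the complaint about the wrong significance test can be made concrete: for a 2x2 outcome table with small cell counts, Fisher's exact test on the raw counts is a standard choice (large-sample approximations misbehave when cells are small). A sketch in Python, with hypothetical group sizes chosen only to match the reported rates; the paper's actual counts aren't reproduced here:

```python
# Hypothetical counts chosen only to match the reported rates
# (~19.6% vs ~6.8%); these are NOT the paper's actual group sizes.
from math import comb

def fisher_exact_one_sided(a, b, c, d):
    """One-sided Fisher exact test (upper tail) for the 2x2 table
    [[a, b], [c, d]]: probability of seeing cell `a` at least this
    large under the null hypothesis of no group difference."""
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, c1)  # total ways to distribute the c1 "events"
    k_max = min(r1, c1)
    return sum(comb(r1, k) * comb(r2, c1 - k)
               for k in range(a, k_max + 1)) / denom

# 20/102 impaired after general (~19.6%) vs 3/44 after spinal (~6.8%)
p = fisher_exact_one_sided(20, 82, 3, 41)
print(round(p, 3))
```

With these made-up sample sizes the difference comes out significant at conventional thresholds; with smaller groups at the same rates it might not, which is exactly why the choice of test and the raw counts matter.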
Two meta-analyses both find a small difference in favor of spinal over general, with confidence intervals that overlap no-difference. One found spinal to be ~5% better (26 studies), the other ~50% better (but with only 5 studies, so the interval still overlapped 0). The latter analysis is tiny in part because it is restricted to tests within a week of surgery. The meta-analysis that examined local anesthesia also failed to find improvements from it.
On the other hand, animal studies of anesthesia without surgery regularly show impairment, although they can’t agree if post-anesthesia animals start off worse but catch up, or start off the same but fall behind. Also they found other medications could mediate the effects. I summarize the animal results in this spreadsheet. These are effect sizes we would clearly notice in humans so I assume they’re using much more anesthetic (although they claim it’s proportionate) or the animals, primarily rodents, are much more sensitive. Also the studies tended to be within days of anesthesia application, removing a chance to heal.
The original commission was to investigate kidney stone treatments, and what I can say there is that the general medical site UpToDate is pretty good. Every claim I investigated checked out, and I didn’t find any established claims they had missed.
Thank you to Claire Zabel for commissioning the research and encouraging me to share the findings, and to my Patreon patrons for supporting the public write-up.
You know those health books with “miracle cure” in the subtitle? The ones that always start with a preface about a particular patient who was completely hopeless until they tried the supplement/meditation technique/healing crystal that the book is based on? These people always start broken and miserable, unable to work or enjoy life, perhaps even suicidal from the sheer hopelessness of getting their body to stop betraying them. They’ve spent decades trying everything and nothing has worked until their friend makes them see the book’s author, who prescribes the same thing they always prescribe, and the patient immediately stands up and starts dancing because their problem is entirely fixed (more conservative books will say it took two sessions). You know how those are completely unbelievable, because anything that worked that well would go mainstream, so basically the book is starting you off with a shit test to make sure you don’t challenge its bullshit later?
Well 5 months ago I became one of those miraculous stories, except worse, because my doctor didn’t even do it on purpose. This finalized some already fermenting changes in how I view medical interventions and research. Namely: sometimes knowledge doesn’t work and then you have to optimize for luck.
I assure you I’m at least as unhappy about this as you are.
Preface to the Preface
I’ve had nonspecific digestive issues since before I have memories. In pre-school my family joked that I would die as a caveman because there were so few things I would eat, and they were mostly grains. This caused a bunch of subclinical malnutrition issues that took a lot of time to manage and never got completely better. And while I couldn’t articulate this until it went away, food felt gross all the time.
It’s hard to convey just how bad this was for me, because it feels like it undermines everything I did to work around it. I’ve always been functional but decidedly less healthy than my friends. I got sick more often and it hit me harder. I was slower to heal from injuries and scrapes and that limited my interest in the more athletic sort of hobbies. I couldn’t work the same hours, and working hours traded off really sharply against energetic hobbies. I had to spend a lot of time managing food where other people can just show up and eat, which was a constant source of social stress. My genetics say I was destined to have anxiety issues, but the low level malnutrition and justified feelings of food insecurity despite apparent abundance did not help anything.
Eventually, in my late 20s, I saw a nutrition-focused psychiatrist who listened to my observations (I could only eat protein with soda), immediately formed a hypothesis (I produced insufficient stomach acid), asked questions to rule it out (which I no longer remember), suggested a test (take stomach acid pills and see if they gave me heartburn), and when it came back positive (no heartburn) suggested a course of action (keep taking stomach acid pills) that showed immediate benefits in practice (indigestion removed, but only when I took the pills). My protein and produce intake increased enormously, and I felt overall much better.
This is exactly how I want medicine to work. I gathered good data and took it to an expert who immediately formed a model, definitively tested it, and prescribed a course of action that made mechanistic sense. If you forget that it took almost 30 years and I took those exact same symptoms to other doctors beforehand, it’s a stunning success.
But it was not a total success. My protein intake maxed out at 50 grams/day, and that was if I made consuming protein a hobby and nothing went wrong. I was doing much better than I had been, but my nutrient tests showed I still had a lot of issues. Eventually the stomach acid pills stopped working, although that seems to be “my stomach started producing more acid and a different problem became the bottleneck” rather than the pills ceasing to contain acid. But the problem was not solved, and more of the existing treatment did not help.
I worked with a number of doctors on fixing the remaining digestive issues for roughly another decade. I had a lot of conversations like the following:
Me (over 20 pages of medical history and 30 minutes of conversation): I can’t digest protein or fiber, when I try it feels like something died inside me.
Them: Oh that’s no good, you need to eat so much protein and vitamins
Me: Yes! Exactly! That’s why I made an appointment with you, an expensive doctor I had to drive very far to get to. I’m so excited you see the problem and for the solution you’re definitely about to propose.
Them: What if you took a slab of protein and chewed it and swallowed it. But like a lot of that.
Me: Then I’d feel like something died inside me, and would still fail to absorb the nutrients which is the actual thing we want me to get from food.
Them: I can’t help you if you’re not willing to help yourself.
Me (over 20 pages of medical history and 30 minutes of conversation): I can’t digest protein or fiber, when I try it feels like something died inside me. If I make it my top priority I can get maybe 50 grams of protein a day.
Them: Oh that’s no good, you need 70 minimum, and really more like 100. Also because I’m a naturopath I’m morally obligated to tell you to give up eggs, dairy, and wheat.
Me: That’s gonna be hard seeing as those three are 90% of my protein intake and by far the easiest forms of protein to digest.
Them: What if you ate pea protein?
Me: Well that’s harder so…worse.
Them: What about hemp?
Me: That is even harder than pea protein.
Them: If you’re not going to try why are you even here?
These exchanges were incredibly draining for me, so I didn’t have them that often. Every year or two I’d get my hopes up for a new doctor, pay a shitton of money (these doctors are never covered by insurance) for several emotionally draining appointments, and then get told they couldn’t help me and this was a failure on my part.
After several years of that pattern I gave up and went back to my old PCP. She hadn’t solved the problem either, but she had solved other problems, had ideas to try for this one, and believed it was a physical rather than moral problem. Unfortunately she is very busy, and sometimes pawns me off on her assistant doctors, who are idiots. That second conversation was with one of those, although in the real conversation I was less witty, and was more like “*sob* no *sob* I told you *sob* I CAN’T”.
I refused to see that doctor again, but this left me little leverage when they assigned me a different sub-doctor to handle a post-covid rash back in April. You know how naturopaths complain about western medicine being mechanical and reactive and not taking the time to reach a systemic understanding? Well this guy, who we will call Dr. Spray-n-pray, was determined to fight for equality by taking the same approach with unregulated supplements. He guessed I had an allergic reaction and threw 5 different antihistamines of varying legitimacy at me, with no mention of testing the hypothesis, monitoring my progress, expected changes, duration of treatment…
And it worked.
Not on the rash; I eventually had to go to urgent care for that. But shortly after I started the pills, I found myself eating 50 grams of protein in a sitting and then going back for more the next meal. I also started chowing down on produce, and at some point realized I couldn’t remember the last time I’d had dessert. I had known I had some aversion issues with food but didn’t realize how gross I found it until the feeling went away and I could just eat without feeling contaminated. About here is when I started a food diary and found I was regularly hitting 100g of protein/day. When I crashed my scooter I ate 350 grams of protein over two days, which suggests I could have done that any time I wanted but chose not to, and that my body was getting all the protein it felt it needed, all of the time.
I’m not sure I can convey what a big deal this is either. I would have paid several years’ salary for this cure without thinking. It is now possible for me to feel okay at an emotional level it wasn’t before. Plus, you know, I can actually get the nutrients I need to run my body and stuff. My injuries after that scooter accident healed noticeably faster than past injuries. The fact that I haven’t caught an illness since April’s covid isn’t conclusive, since it’s summer and I haven’t done anything high risk, but it is interesting.
[I do have covid antibody results from December (8 months after my vaccine) and August (4 months after catching covid), and my levels have gone way up, but that’s more likely due to the more recent and stronger immune stimulus.]
But that evidence came later. Back in May the timing of the miracle suggested that one of Dr. Spray-n-pray’s pills was responsible. This was more or less confirmed when I weaned off the various pills and the subtle grossness around food started to return. I could also feel growing sugar cravings. So it was important to figure out what the miracle pill was and get back on it immediately.
[If any of you are thinking “well it could have been a coincidence”: no it fucking couldn’t. I did not carry this around for 35 years and try everything to fix it only to have it suddenly go into remission for no reason. I’ll believe covid fixed it before I believe that.]
I had always assumed the reason doctors turned on me was that it was easier than accepting that they couldn’t solve my problem. But this one had fixed my problem! Not on purpose or anything, but I was fully prepared to pretend it was. Now we just had to figure out what had worked and why, in case it suggested any additional actions. I made a spreadsheet tracking the changes as best I could – when my diet changed (using grocery order data), when I’d started and stopped which pills. Surely my data plus his doctor ego would help us get to the bottom of this.
At the time of my follow-up appointment I had a strong guess which supplement had helped based on timing, but it didn’t make any sense. The active ingredient was Boswellia (specifically the BosPro brand (affiliate link); I’m afraid to try another in case it breaks the spell). Boswellia is sometimes described by alt medicine websites as helping digestive issues, but in the same way they describe every supplement as helping digestive issues. “Helps anxiety, allergies, autoimmune disorders, inflammation, and digestion” should just be a stamp. This isn’t even necessarily illegitimate – the body is complicated and lots of things are entangled, especially with inflammation.
But I’ve tried a lot of these supplements at one point or another and there was absolutely no reason to predict this one would be different, even if I had researched it ahead of time. Examine.com is pretty positive on Boswellia but doesn’t list digestion as an issue it solves. Everything is connected to everything else in the body, and it was still pretty hard for me to make a causal chain between Boswellia’s alleged mechanisms and improvements in my digestion. So I was extremely excited for Dr. Spray-n-pray to explain why it had worked.
All this was on my mind when I finally got to ask Dr. Spray-n-pray why his treatment had worked. He mumbled something about inflammation and moved on. He had zero interest in my spreadsheet or a more mechanistic understanding of what had changed. I confirmed the miracle was from BosPro when I resumed taking it and the digestive improvements returned (including the creeping feeling of grossness going away). It’s now 5 months since I started taking it and it still works but I have no idea why.
This is not how I want medicine to work, at all. A doctor who clearly was not trying for a systemic understanding recommended a lot of stuff, and one of them happened to fix a problem as unrelated as could be, one I’d spent over a decade searching for a solution to without success? Even knowing definitively that it works, we have no idea why, or what would help or hinder it? And there’s ~0 evidence this would help other people with the same condition?
This is bullshit. But bullshit is working where logic feared to tread.
This experience isn’t what got me on the path of luck-based medicine though. I was already at that point when the supplements were prescribed, which is why I took them instead of doing 5 hours of research and ignoring Dr. Spray-n-pray’s suggestions as the ravings of an idiot. There were a lot of contributors to my shift, but a few stand out.
A few years ago I ran a series of epistemic spot checks on various self-help books, and found that how helpful they were had no correlation with how rigorous or true their theoretical backing was.
Then last year I ran that ketone ester study. I and a handful of people I know get insane gains from using ketone esters – better than Ritalin with none of the side effects – but when I ran an RCT (n=8-12 depending on how you count) no one reported any benefits.
Then there’s SMTM’s potato diet study. They failed to contextualize it as a monodiet and discuss the classic monodiet problems.
Potatoes aren’t nutritionally complete and don’t have enough protein for people to thrive. They gestured at some of the nutritional deficiencies, but I think not hard enough; they believe potatoes have more protein than reported but have not pointed to any evidence to that effect.
They tracked weight loss over 28 days but will not be doing a follow-up for six months. Since the default after rapid weight loss caused by an unsustainable diet is immediate regain, this is unconscionable.
I haven’t had time to dig into the object-level facts in the argument between SMTM and a persistent critic, but with my monkey social brain it sure does look like SMTM is blowing off well-founded criticism (given in a super aggressive manner).
They treat weight loss as an unalloyed good no matter how fast or what the person’s starting weight was.
I have not looked into the popular “weight loss not safe above 2 pounds per week” claim and it wouldn’t shock me if it were made up, but if I had an intervention with double that impact I’d spend an hour investigating the claim.
Weight loss beyond a certain body fat percentage is bad. You need that stuff.
They did warn people about solanine poisoning but I think they should be more concerned about it.
Their analysis featured a lot of stories along the lines of “Did X on Wednesday and lost 2 pounds on Thursday”, and fat loss does not work like that. Two pounds overnight is either water weight or has a lookback period longer than 24 hours.
I’m deeply confused about that second part, I don’t understand why or how weight-loss-that-is-definitely-not-changes-in-water-retention comes in chunks. If you have an answer I’m quite curious.
That’s a lot of epistemic sins. OTOH, their potato diet results inspired me to try the minimal potato diet, which consists of eating some potatoes every day (I started with ~100g of baby potatoes), and I’ve lost 15 pounds in 3 months. That level of weight loss with zero sacrifices buys you a lot of epistemic forgiveness, especially when my miraculous dramatic dietary improvements did fuck all to the number on the scale.
[People already writing their “potatoes can’t possibly be the cause it must be psychosomatic” comments in their head: I see you. Your hypothesis is perfectly reasonable; in your position it would be my first reaction too. But in this particular case you’re going to need to explain why potatoes caused that magic mental shift when giving up soda, a dramatic improvement in diet and removal of dessert entirely, complete emotional reorientation to food, a mild prescription stimulant, and varying levels of exercise did nothing, and ketone esters worked better than all of those but much worse than potatoes. Comments not attempting this will be deleted or mocked as I see fit.]
If you are thinking “ah, but clearly those all did contribute and the potatoes were just the last step”: I agree that’s likely. If I’d started the minimal potato diet before BosPro it either wouldn’t have worked or would have been extremely bad for me. But since it seems to work for at least some other people who didn’t have all this baggage, I think we need to update in that direction.
Or take every person who got a second opinion on their cancer and was recommended diametrically opposing treatment plans. Doctors as a class are not as epistemically virtuous as I’d like, but that’s not (always) why they propose wildly divergent treatment plans. In most cases it’s because the answer isn’t obvious, or at best has only been obvious for a few years.
And then there’s the absolute shitshow that is nutrition research. No one knows what the average optimum nutrient level is and even if we did it wouldn’t be that helpful for figuring out the optimum level for a given individual, because humans are so unbelievably variable.
I could go on here, but if you’re reading my blog you’re probably already on board with shit being extremely complicated and I don’t want to belabor the point.
Moral of the story: when intellect fails, try luck guided by intuition
Some medicine is very deterministic. Antibiotics, most of the time. That daylong IV drip when I had norovirus that probably turned the infection from deadly to a kind of annoying 36 hours. We may not know the optimum level of a given nutrient but most severe deficiency diseases can be solved by giving you the thing you’re severely deficient in. My impression is statins work pretty reliably.
But a lot of medicine just seems to be kind of random. People go through 10 antidepressants and then somehow the 11th one works great. Ketone esters increase my energy level so much I gave up soda and caffeine entirely, but do nothing for most people. All those books where the cure was a miracle for someone; it can’t just be a placebo, because there’s no reason for the 35th placebo to be the one that works, but nothing else makes sense either.
All of which leads me to conclude that once you have exhausted the reliable part of medicine without solving your problem, looking for a mechanistic understanding or even empirical validation of potential solutions is a waste of time. The best use of energy is to try shit until you get lucky.
Not at random or anything. My guess is the world contains metis and you do better-than-chance preferentially trying things that helped one guy on a message board for your condition (even though it was shown to make no difference in real studies) or going to alt-modality practitioners (even the one with proactively stupid justifications they insist on sharing). The latter is especially true if you can find a practitioner that accepts that their treatments don’t always work and have a system to notice that and change course, but I think maybe even the really gung-ho ones sometimes have good ideas (you just have to set up your own system for deciding when to quit). Just don’t get hung up on “do we understand why this works?” or “does this work for other people?”
Also please remember that side effects and drug interactions are a thing. Anything with a real effect can hurt you. I gave a very caveated suggestion of BosPro to someone on Twitter and it caused something akin to niacin flush in them. This is the same brand that does nothing to me but makes me better at digestion and uninterested in sugar.
So I guess the full and accurate statement of my beliefs is “Try solving problems with understanding first, but accept when you’ve hit diminishing returns and consider if your energy isn’t better spent increasing your surface area to luck”.
Fuck you every doctor who told me my digestive problems were in my head or my fault for being a bad patient and you couldn’t help me until I solved the problem that drove me to you. You were factually incorrect and you should feel terrible.
For potential clients in particular
People sometimes approach me for medical literature reviews aimed at their specific problem. There are forms of these I will do, but those forms do not include producing a mechanistic model and high-probability treatment for someone’s persistent, sub-clinical, amorphous problem that medicine has failed to solve. There are a few reasons accepting these commissions would be wasting the clients’ money, and one of them is that by the time they come to me they have found all the low-hanging deterministic fruit. The best I can do is spend a ton of time generating lists of things that might work. Sometimes I do offer that, but people tend to prefer my other offer of a referral to a researcher that’s better at individualized treatment.
There are a lot of vitamins and other supplements in the world, way more than I have time to investigate. Examine.com has a pretty good reputation for its reports on vitamins and supplements. It would be extremely convenient for me if this reputation was merited. So I asked Martin Bernstoff to spot check some of their reports.
We originally wanted a fairly thorough review of multiple Examine write-ups. Alas, Martin felt the press of grad school after two shallow reviews and had to step back. This is still enough to be useful so we wanted to share, but please keep in mind its limitations. And if you feel motivated to contribute checks of more articles, please reach out to me (firstname.lastname@example.org).
My (Elizabeth’s) tentative conclusion is that it would take tens of hours to beat an Examine general write-up, but they are complete in neither their list of topics nor their investigation into individual topics. If a particular effect is important to you, you will still need to do your own research.
Claim: “The actual rate of deficiency [of B12] is quite variable and it isn’t fully known what it is, but elderly persons (above 65), vegetarians, or those with digestion or intestinal complications are almost always at a higher risk than otherwise healthy and omnivorous youth”
Verdict: True but not well cited. Their citation merely asserts that these groups have shortages rather than providing measurements, but Martin found a meta-analysis making the same claim for vegetarians (the only group he looked for).
Verdict: Very brief. Couldn’t find much on my own. Seems reasonable.
Claim: “Vitamin B12 can be measured in the blood by serum B12 concentrations, which is reproducible and reliable but may not accurately reflect bodily vitamin B12 stores (as low B12 concentrations in plasma or vitamin B12 deficiencies do not always coexist in a reliable manner) with a predictive value being reported to be as low as 22%”
Verdict: True, the positive predictive value was 22%, but with a negative predictive value of 100% at the chosen threshold. But those are only the numbers at one threshold. To know whether this is good or bad, we’d have to get numbers at different thresholds (or, preferably, a ROC-AUC).
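For intuition, predictive values are just ratios of confusion-matrix counts. A minimal sketch, with hypothetical counts chosen only to reproduce PPV = 22% and NPV = 100% (these are not the study's actual numbers):

```python
# Illustrative only: made-up confusion-matrix counts that happen to
# give PPV = 22% and NPV = 100% at a single screening threshold.
def ppv_npv(tp, fp, tn, fn):
    """Positive and negative predictive values from confusion counts."""
    ppv = tp / (tp + fp)  # of those flagged positive, how many truly are
    npv = tn / (tn + fn)  # of those flagged negative, how many truly are
    return ppv, npv

# e.g. 50 people flag as "low serum B12" but only 11 are truly deficient,
# while nobody deficient is missed (fn = 0), so NPV is perfect.
ppv, npv = ppv_npv(tp=11, fp=39, tn=50, fn=0)
print(ppv, npv)  # 0.22 1.0
```

This shows why a single threshold is uninformative: lowering the cutoff trades PPV against NPV, so the tradeoff curve (or its area, ROC-AUC) is what you'd want to judge the test by.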
Claim: B12 supplements can improve depression
Examine reviews a handful of observational studies showing a correlation, but includes no RCTs. This is in spite of there actually being RCTs like Koning et al. 2016 and a full meta-analysis, neither of which find an effect.
The lack of effect in RCTs is less damning than it sounds. I (Elizabeth) haven’t checked all of the studies, but the Koning study didn’t confine itself to subjects with low B12 and only tested serum B12 at baseline, not after treatment. So they have ruled out neither “low B12 can cause depression, but so do a lot of other things” nor “B12 can work but they used the wrong form”.
I still find it concerning that Examine didn’t even mention the RCTs, and I don’t have any reason to believe their correlational studies are any better.
Interactions with pregnancy
Only one study, on acute lymphoblastic leukemia, which seems a weird choice. Large meta-analyses exist for pre-term birth and low birth weight (Rogne et al. 2016), which are likely much more important.
They don’t seem to be saying much wrong but the write-up is not nearly as comprehensive as we had hoped. To give Examine its best shot, we decided the next vitamin should be on their best write-up. We tried asking Examine which article they are especially confident in. Unfortunately, whoever handles their public email address didn’t get the point after 3 emails, so Martin made his best guess.
They summarize several studies but miss a very large RCT published in JAMA, the VIDARIS trial. All studies (including the VIDARIS trial) show no effect, so they might’ve considered the matter settled and stopped looking for more trials, which seems reasonable.
Claim: Vitamin D helps premenstrual syndrome
“Most studies have found a decrease in general symptoms when given to women with vitamin D deficiency, some finding notable reductions and some finding small reductions. It’s currently not known why studies differ, and more research is needed”
This summary seemed optimistic after Martin looked into the studies:
No statistically significant differences between groups.
The authors highlight statistically significant decreases for a handful of symptoms in the Vitamin D group, but the decrease is similar in magnitude to placebo. Vitamin D and placebo both have 5 outcomes which were statistically significant.
Marked differences between groups, but absolutely terrible reporting by the authors – they don’t even mention this difference in the abstract. This makes me (Martin) somewhat worried about the results – if they knew what they were doing, they’d focus the abstract on the difference in differences.
Appears to show notable differences between groups, but terrible reporting. Tests change relative to baseline (?!), rather than differences in trends or differences in differences.
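The baseline-change complaint can be made concrete with toy numbers (entirely hypothetical): both arms can show a large "change from baseline" while the between-arm difference, the quantity that actually estimates the treatment effect, is negligible.

```python
# Toy, made-up symptom scores (lower = better); not from any real study.
treatment_baseline, treatment_end = 30.0, 20.0
placebo_baseline, placebo_end = 30.0, 21.0

treatment_change = treatment_end - treatment_baseline  # -10 points
placebo_change = placebo_end - placebo_baseline        # -9 points

# Both arms "improved markedly from baseline", but the
# difference-in-differences, which is what isolates the effect of the
# treatment over placebo, is tiny:
did = treatment_change - placebo_change
print(did)  # -1.0
```

This is why testing each arm against its own baseline, rather than against the other arm, is a red flag: regression to the mean and placebo response improve both arms.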
In conclusion, only the poorest research finds effects – not a great indicator of a promising intervention. But Examine didn’t miss any obvious studies.
Claim: “There is some evidence that vitamin D may improve inflammation and clinical symptoms in COVID-19 patients, but this may not hold true with all dosing regimens. So far, a few studies have shown that high dosages for 8–14 days may work, but a single high dose isn’t likely to have the same benefit.”
The evidence Martin found seems to support their conclusions. They’re missing one relatively large, recent study (De Niet 2022). More importantly, all included studies are about hospital patients given vitamin D after admission, which are useless for determining if Vitamin D is a good preventative, especially because some forms of vitamin D take days to be turned into a useful form in the body.
Our goal was to quantify the cognitive risks of heavy but not abusive alcohol consumption. This is an inherently difficult task: the world is noisy, humans are highly variable, and institutional review boards won’t let us do challenge trials of known poisons. This makes strong inference or quantification of small risks incredibly difficult. We know for a fact that enough alcohol can damage you, and even levels that aren’t inherently dangerous can cause dumb decisions with long-term consequences. All that said… when we tried to quantify the level of cognitive damage caused by college-level binge drinking, we couldn’t demonstrate an effect. This doesn’t mean there isn’t one (if nothing else, “here, hold my beer” moments are real), just that it is below the threshold detectable with current methods and levels of variation in the population.
In discussions with recent college graduates I (Elizabeth) casually mentioned that alcohol is obviously damaging to cognition. They were shocked and dismayed to find their friends were poisoning themselves, and wanted the costs quantified so they could reason with them (I hang around a very specific set of college students). Martin Bernstorff and I set out to research this together. Ultimately, 90-95% of the research was done by him, with me mostly contributing strategic guidance and somewhere between editing and co-writing this post.
Problems with research on drinking during adolescence
Literature on the causal medium- to long-term effects of non-alcoholism-level drinking on cognition is, to our strong surprise, extremely lacking. This isn’t just our poor research skills; in 2019, the Danish Ministry of Health attempted a comprehensive review and concluded that:
“We actually know relatively little about which specific biological consequences a high level of alcohol intake during adolescence will have on youth”.
And it isn’t because scientists are ignoring the problem either. Studying medium- and long-term effects on brain development is difficult because of the myriad of confounders and/or colliders for both cognition and alcohol consumption, and because more mechanistic experiments would be very difficult and are institutionally forbidden anyway (“Dear IRB: we would like to violently poison half our teenage subjects for four years, while forbidding the other half to engage in standard college socialization”). You could randomize abstinence, but we’ll get back to that.
One problem highly prevalent in alcohol literature is the abstinence bias. People who abstain from alcohol intake are likely to do so for a reason, for example chronic disease, being highly conscientious and religious, or a bad family history with alcohol. Even if you factor out all of the known confounders, it’s still vanishingly unlikely the drinking and non-drinking samples are identical. Whatever the differences, they’re likely to affect cognitive (and other) outcomes.
Any analysis comparing “no drinking” to “drinking” will suffer from this by estimating the effect of no alcohol + confounders, rather than the effect of alcohol. Unfortunately, this rules out a surprising number of studies (code available upon request).
Confounding is possible to mitigate if we have accurate intuition about the causal network, and we can estimate the effects of confounders accurately. We have to draw a directed acyclic graph with the relevant causal factors and adjust analyses or design accordingly. This is essential, but has not permeated all of epidemiology (yet), and especially for older literature, this is not done. For a primer, Martin recommends “Draw Your Assumptions” on edX here.
Additionally, alcohol consumption is a politically live topic, and papers are likely to be biased. Which direction is a coin flip: public health wants to make it seem scarier, alcohol companies want to make it seem safer. Unfortunately, these biases don’t cancel out, they just obfuscate everything.
What can we do when we know much of the literature is likely biased, but we do not have a strong idea about the size or direction?
If we aggregate multiple estimates that are wrong, but in different (and overall uncorrelated) directions, we will approximate the true effect. For health, we have a few dimensions that we can vary over: observational/interventional, age, and species.
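The intuition above can be sketched as a toy simulation. This is illustrative only: it assumes each study's bias is drawn independently with mean zero, which is exactly the "wrong in different, uncorrelated directions" condition the text describes.

```python
import random

random.seed(0)

TRUE_EFFECT = 1.0  # hypothetical true effect size

# Each estimate is the true effect plus a study-specific bias. We assume
# the biases are independent with mean zero -- i.e. wrong in different,
# uncorrelated directions.
estimates = [TRUE_EFFECT + random.gauss(0, 0.5) for _ in range(1000)]

pooled = sum(estimates) / len(estimates)
print(round(pooled, 2))  # close to the true effect of 1.0
```

If the biases instead shared a common direction (say, every study inflated the effect), no amount of pooling would remove it, which is why varying the study design along several dimensions matters.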
Randomized abstinence studies
Ideally, we would have strong evidence from randomized controlled trials of abstinence. In experimental studies like this, there is no doubt about the direction of causality. And, since participants are randomized, confounders are evenly distributed between intervention and control groups. This means that our estimate of the intervention effect is unbiased by confounders, both measured and unmeasured.
Bimbaum et al. did not stick to the randomisation when analyzing their data, opening the door to confounding. This should decrease our confidence in their study. They found no effect of abstinence on their 7 cognitive measures.
In Hannon et al., instruction to abstain vs. maintain resulted in a difference in alcohol intake of 12.5 units per week over 2 weeks. On the WAIS-R vocabulary test, abstaining women scored 55.5 ± 6.7 and maintaining women scored 51.0 ± 8.8 (both mean ± SD). On the 3 other cognitive tests performed, they found no difference.
Especially due to the short duration, we should be very wary of extrapolating too much from these studies. However, it appears that for moderate amounts of drinking over a short time period, total abstinence does not provide a meaningful benefit in the above studies.
Observational studies on humans
Due to their observational nature (as opposed to being an experiment), these studies are extremely vulnerable to confounders, colliders, reverse causality etc. However, they are relatively cheap ways of getting information, and are performed in naturalistic settings.
One meta-analysis (Neafsey & Collins, 2011) compared moderate social drinking (< 4 drinks/day) to non-drinkers (note: the definition of moderate varies a lot between studies). They partially compensated for the abstinence bias by excluding “former drinkers” from their reference group, i.e. removing people who’ve stopped drinking for medical (or other) reasons. This should provide a less biased estimate of the true effect. They found a protective effect of social drinking on a composite endpoint, “cognitive decline/dementia” (Odds Ratio 0.79 [0.75; 0.84]).
Interestingly, they also found that studies adjusting for age, education, sex, and smoking status did not have markedly different estimates from those that did not (adjusted OR 0.75 vs. unadjusted OR 0.79). This should decrease our worry about confounding overall.
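For readers less used to odds ratios: to the extent the outcome is uncommon over the follow-up period, an odds ratio approximates a risk ratio, which is where the "~20% lower risk" framing used later comes from. A minimal sketch:

```python
# For a reasonably rare outcome, an odds ratio approximates a risk
# ratio, so OR 0.79 corresponds to roughly a 21% relative risk
# reduction (the "~20%" figure quoted in the conclusion).
odds_ratio = 0.79

risk_reduction_pct = (1 - odds_ratio) * 100
print(round(risk_reduction_pct))  # 21
```

Note this rare-outcome approximation degrades for common outcomes like "cognitive decline/dementia" over long follow-ups, so treat the ~20% as a rough figure.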
Observational studies on alcohol for infants
Another angle for triangulation is the effect of moderate maternal alcohol intake during pregnancy on the offspring’s IQ. The brain is never more vulnerable than during fetal development. There are obviously large differences between fetal and adolescent brains, so any generalization should be accompanied with large error bars. However, this might give us an upper bound.
A SNP variant in a gene (ADH1B) is associated with decreased alcohol consumption. Since SNPs are near-randomly assigned (but see the examination of assumptions below), one can interpret it as the SNP causing decreased alcohol consumption. If some assumptions are met, that’s essentially a randomized controlled trial! Alas, these assumptions are extremely strong and unlikely to be totally true – but it can still be much better than merely comparing two groups with differing alcohol consumption.
As the authors very explicitly state, this analysis assumes that:
1. The SNP variant (rs1229984) decreases maternal alcohol consumption. This is confirmed in the data. Unfortunately, the authors do this by chi-square test (“does this alter consumption at all?”) rather than estimating the effect size. However, we can do our own calculations using Table 5:
If we round each alcohol consumption category to the mean of its bounds (0, 0.5, 3.5, 9), we get a mean intake in the SNP variant group of 0.55 units/week and a mean intake in the non-carrier group of 0.88 units/week (math). This means that SNP-carrier mothers drink, on average, 0.33 units/week less. That’s a pretty small difference! We would’ve liked the authors to do this calculation themselves, and use it to report IQ difference per unit of alcohol per week.
2. There is no association between the genotype and confounding factors, including other genes. This assumption is satisfied for all factors examined in the study, like maternal age, parity, education, smoking in 1st trimester etc. (Table 4), but unmeasured confounding is totally a thing! E.g. a SNP which correlates with the current variant and causes a change in the offspring’s IQ/KS2-score.
3. The genotype does not affect the outcome by any path other than maternal alcohol consumption, for example through affecting metabolism of alcohol.
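The back-of-envelope intake calculation in assumption 1 is just a weighted mean over category midpoints. The midpoints below come from the post; the category shares are made-up placeholders purely for illustration, since the real ones live in the paper's Table 5.

```python
# Category midpoints in units/week, as in the post: 0, 0.5, 3.5, 9.
MIDPOINTS = [0, 0.5, 3.5, 9]

def mean_intake(shares, midpoints=MIDPOINTS):
    """Weighted mean intake given each category's share of mothers."""
    assert abs(sum(shares) - 1) < 1e-9, "shares must sum to 1"
    return sum(s * m for s, m in zip(shares, midpoints))

# Hypothetical shares for illustration only; the real numbers are in
# the paper's Table 5.
carriers = mean_intake([0.80, 0.12, 0.06, 0.02])      # 0.45 units/week
non_carriers = mean_intake([0.70, 0.16, 0.10, 0.04])  # 0.79 units/week
print(round(non_carriers - carriers, 2))  # 0.34 units/week difference
```

With these placeholder shares the gap comes out near the 0.33 units/week the post derives from the real table, but that is a coincidence of the chosen numbers, not a reproduction of the paper.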
If we believe these assumptions to be true, the authors are estimating the effect of 0.33 maternal alcohol units per week on the offspring’s IQ and KS2-score. KS2-score is a test of intellectual achievement (similar to the SAT) for 11-year-olds with a mean of 100 points and a standard deviation of ~15 points.
They find that the 0.33 unit/week decrease does not affect IQ (mean difference -0.01 [-2.8; 2.7]) and causes a 1.7 point (with a 95% confidence interval of between 0.4 and 3.0) increase in KS2 score.
This is extremely interesting. Additionally, the authors ran a classical epidemiological study, adjusting for typical confounders.
This shows that the children of pre-pregnancy heavy drinkers, on average, scored 8.62 (with a standard error of 1.12) points higher on IQ than non-drinkers, 2.99 points (SE 1.06) after adjusting for confounders. However, they didn’t adjust for alcohol intake in other parts of the pregnancy! Puzzlingly, first trimester drinking has an effect in the opposite direction: -3.14 points (SE 1.64) on IQ. However, this was also not adjusted for previous alcohol intake. This means that the estimates in table 1 (pre-pregnancy and first trimester) aren’t independent, but we don’t know how they’re correlated. Good luck teasing out the causal effect of maternal alcohol intake and timing from that.
Either way, the authors (and I) interpret the effects as being highly confounded; either residual (the confounder was measured with insufficient accuracy for complete adjustment) or unknown (confounders that weren’t measured). For example, pre-pregnancy alcohol intake was strongly associated with professional social class and education (upper-class wine-drinkers?), whereas the opposite was true for first trimester alcohol intake. Perhaps drinking while you know you’re pregnant is low social status?
If you’re like Elizabeth you’re probably surprised that drinking increases with social class. I didn’t dig into this deeply, but a quick search found that it does appear to hold up.
This result is in conflict with that of the Mendelian randomization, but it makes sense. Mendelian randomization is less sensitive to confounding, so maybe there is no true effect. Also, the study only estimated the genetic effect of a 0.33 units/week difference, so the analyses are probably not sufficiently powered.
Taken together, the study should probably update you towards a lack of harm from moderate (whatever that means) levels of alcohol intake, although how big an update that is depends on your previous position. We say “moderate” because fetal alcohol syndrome is definitely a thing, so at sufficient alcohol intake it’s obviously harmful!
Experimental studies on adolescent rodents
There is a decently sized, pretty well-conducted literature on adolescent intermittent ethanol exposure (science speak for “binge drinking on the weekend”). Rat adolescence is somewhat similar to human adolescence; it’s marked by sexual maturation, increased risk-taking and increased social play (Sengupta, 2013). The following is largely based on a deeper dive into the linked references from (Seemiller & Gould, 2020).
Adolescent intermittent ethanol exposure is typically operationalised as doses producing blood-alcohol concentrations equivalent to ~10 standard alcohol units, 0.5-3 times/day every 1-2 days during adolescence.
To interpret this, we make some big assumptions. Namely:
Rodent blood-alcohol content can be translated 1:1 to human
Effects on rodent cognition at a given alcohol concentration are similar to those on human cognition
Rodent adolescence can mimic human adolescence
Now, let’s dive in!
Two primary tasks are used in the literature:
The 5-choice serial reaction time task.
Rodents are placed in a small box where one of 5 holes lights up, and they are scored on how reliably they touch the lit hole.
Training in the 5-CSRTT varies between studies, but the two studies below consist of 6 training sessions at age 60 days. Initially, rats were rewarded with pellets from the feeder in the box to alert them to the possibility of reward.
Afterwards, training sessions had gradually increasing difficulty: the light stays on for 30 seconds to start, but the duration gradually decreases to 1 second. Rats progressed to the next training schedule based on any of 3 predefined criteria: 100 trials completed, >80% accuracy, or <20% omissions.
Naturally, you can measure a ton of stuff here! Generally, focus is on accuracy and omissions, but there are a ton of other options.
Now we know how they measured performance; but how did they imitate adolescent drinking?
Boutros et al. administered 5 g/kg of 25% ethanol through the mouth once per day in a 2-day on/off pattern, from age 28 days to 57 days – a total of 14 administrations. Based on blood alcohol content, this is equivalent to 10 standard units at each administration – quite a dose! Surprisingly, they found a decrease in omissions with the standard task, but no other systematic changes, in spite of 50+ analyses on variations of the measures (accuracy, omissions, correct responses, incorrect responses etc.) and task difficulty (length of the light staying on, whether they got the rats drunk etc.). We’d chalk this up to a chance finding.
Semenova et al. used the same training schedule, but administered 5 g/kg of 25% ethanol through the mouth every 8h for 4 days – a total of 12 administrations. They found small differences in different directions on different measures, but have the same multiple comparisons problem. Looks like noise to us.
The Barnes Maze
Rodents are placed in the middle of an approximately 1m circle with 20-40 holes at the perimeter and are timed on how quickly they arrive at the hole with a reward (and escape box) below it. For timing spatial learning, the location of the hole is held constant. In (Coleman et al., 2014) and (Vetreno & Crews, 2012), rodents were timed once a day for 5 days. They were then given 4 days of rest, and the escape hole was relocated exactly 180° from the initial location. They were then timed again once a day, measuring relearning.
Figure: Tracing of the route taken by a control mouse right after the location was reversed, from Coleman et al., 2014.
Both studies found no effect of adolescent intermittent ethanol exposure on initial learning rate or errors.
Vetreno found alcohol-exposed rats took longer to escape on their first trial but did equally well in all subsequent trials, whereas Coleman found a ~3x difference in performance on the relearning task, with similar half-times.
Somewhat suspiciously, even though Vetreno et al. was published 2 years after Coleman et al. and they share the same lab, they do not reference Coleman et al.
This does, technically, show an effect. However given the small size of effect, the number of metrics measured, file drawer effects, and the disagreement with the rest of the literature, we believe this is best treated as a null result.
So, what should we do? From the epidemiological literature, if you care about dementia risk, it looks like social drinking (i.e. excluding alcoholics) reduces your risk by ~20% compared to not drinking. All other effects were part of a heterogeneous literature with small effect sizes on cognition. Taken together, the long-term cognitive effects of conventional alcohol intake during adolescence should play only a minor role in determining alcohol intake.
Thanks to an FTX Future Fund regrantor for funding this work.
Lots of people are getting covid boosters now. To help myself and others plan I did an extremely informal poll on Twitter and Facebook about how people’s booster side effects compared to their second dose. Take home message: boosters are typically easier than second shots, but they’re bad often enough you should have a plan for that.
The poll was a mess for a number of reasons, including:
I didn’t describe the options very well, so about 2/3 of the responses were freeform and I collapsed them into a few categories.
There was a tremendous variation in what combination of shots people got.
It’s self-reported. I have unusually data-minded friends which minimizes the typical problem of extreme responses getting disproportionate attention, but it doesn’t eliminate it, and self-report data has other issues.
I only sampled people who follow me on social media, who are predominantly <45 years old, reasonably healthy, reasonably high income, and mostly working desk jobs.
I specified mRNA but not the manufacturer; Moderna but not Pfizer boosters are smaller than the original dose.
Nonetheless, the trend was pretty clear.
Of people who received three mRNA shots from the same manufacturer, comparing their second shot to their third:
12 had no major symptoms either time (where major is defined as “affected what you could do in your day.” It specifically does not include arm soreness, including soreness that limited range of motion)
2 had no major symptoms for their second shot but had major for their third
Not included in data: one person who got pregnant between their second and third shot
23 had major symptoms for their second shot, and the third was easier
This includes at least one case where the third was still extremely bad and 2-3 “still pretty bad, just not as bad as the second”
3 cases fell short of “major symptoms” for the second shot, but had an even easier third shot
11 people had similar major symptoms both times
2 had major symptoms for second shot, and third was worse
Of people who mixed and matched doses:
2 had no major symptoms either time
4 had no major symptoms for their second shot but had major symptoms for their third
Not included: 1 reported no symptoms for the first two and mild symptoms for the third
4 had major symptoms for their second shot, and their third was easier
2 people had major symptoms both times
1 had major symptoms for their second shot, and their third was worse
A client came to me to investigate the effect of high altitude on child development and has given me permission to share the results. This post bears the usual marks of preliminary client work: I focused on the aspects of the question they cared about the most, not necessarily my favorite or the most important in general. The investigation stops when the client no longer wants to pay for more, not when I’ve achieved a particular level of certainty I’m satisfied with. Etc. In this particular case they were satisfied with the answer after only a few hours, and I did not pursue beyond that.
That out of the way: I investigated the impact of altitude on childhood outcomes, focusing on cognition. I ultimately focused mostly on effects visible at birth, because birth weight is such a hard-to-manipulate piece of data. What I found in < 3 hours of research is that altitude has an effect on birth weight that is very noticeable statistically, although the material impact is likely to be very small unless you are living in the Andes.
Children gestated at higher altitudes have lower birth weights
This seems to be generally supported by studies which are unusually rigorous for the field of fetal development. Even better, it’s supported in both South America (where higher altitudes correlate with lower income and lower density, and I suspect very different child-rearing practices) and Colorado (where the income relationship reverses, and while I’m sure childhoods still differ somewhat, I suspect less so). The relationship also holds in Austria, which I know less about culturally but which did produce the nicest graph.
This is a big deal because until you reach truly ridiculous numbers, higher birth weight is correlated with every good thing, although there’s reason to believe a loss due to high altitude is less bad than a loss caused by most other causes, which I’ll discuss later.
[Also for any of you wondering if this is caused by a decrease in gestation time: good question, the answer appears to be no.]
Children raised at higher altitudes do worse on developmental tests
There is a fair amount of data supporting this, and some studies even attempt to control for things like familial wealth, prematurity, etc. I’m not convinced: the effects are modest, and I expect families living at very high altitudes (typically rural) to differ from those at lower altitudes (typically urban) in ways that change children’s test scores without a meaningful impact on their lives (and unlike birth weight, I didn’t find studies based in CO, where some trends reverse). Additionally, none of the studies looked specifically at children who were born at a lower altitude and moved, so some of the effects may be left over from the gestational effects discussed earlier.
Hypoxia may not be your only problem
I went into this primed to believe reduced oxygen availability was the problem. However, there’s additional evidence that UV radiation, which rises with altitude, may also be a concern. UV radiation is higher in some areas for other reasons as well, and it does seem to correlate with reductions in cognition.
How much does this matter? (not much)
Based on a very cursory look at graphs on GIS (to be clear: I didn’t even check the papers, and their axes were shoddily labeled), 100 grams of birth weight corresponds to 0.2 IQ points for full term babies.
The studies consistently showed ~0.09 to 0.1 grams lower birth weight per meter of altitude. Studies showed this to be surprisingly linear; I’m skeptical and expect the reality to be more exponential or S-shaped, but let’s use that rule of thumb for now. 0.1 g/m means gestating in Denver rather than at sea level would shrink your baby by 170 grams (where 2500g-4500g is considered normal and healthy). If this were identical to other forms of fetal weight loss, which I don’t think it is, it would very roughly correspond to 0.35 IQ points lost.
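For the curious, the arithmetic above can be laid out explicitly. The per-meter and per-100g figures are the post's rules of thumb, and Denver's elevation is rounded to 1700 m (it's roughly 1600 m) to match the post's 170 gram figure:

```python
# Rule-of-thumb numbers from the post (rough assumptions, not precise
# science):
GRAMS_PER_METER = 0.1     # birth-weight reduction per meter of altitude
IQ_PER_100_G = 0.2        # IQ points per 100 g of birth weight
DENVER_ALTITUDE_M = 1700  # rounded from ~1600 m to match the post's 170 g

weight_loss_g = GRAMS_PER_METER * DENVER_ALTITUDE_M
iq_loss = (weight_loss_g / 100) * IQ_PER_100_G
print(round(weight_loss_g), round(iq_loss, 2))  # 170 g, ~0.34 IQ points
```

The ~0.34 matches the "very roughly 0.35 IQ points" above; given the shaky linearity assumption, the second decimal place is not meaningful.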
However, there’s reason to believe high-altitude fetal weight loss is less concerning than other forms. High altitude babies tend to have a higher brain mass percentage and are tall for their weight, suggesting they’ve prioritized growth amidst scarce resources rather than being straight out poisoned. So that small effect is even smaller than it first appears.
There was also evidence out of Austria that higher altitude increased risk of SIDS, but that disappeared when babies slept on their backs, which is standard practice now.
So gestating in Denver is definitely bad then? (No)
There are a billion things influencing gestation and childhood outcomes, and this is looking at exactly one of them, for not very long. If you are making a decision, please look at all the relevant factors, and then factor in the streetlight effect: there may be harder-to-measure things pointing in the other direction. Do not overweight the last thing I happened to read.
In particular, Slime Mold Time Mold has some interesting data (which I haven’t verified but am hoping to at least ESC the series) that suggests higher altitudes within the US have fewer environmental contaminants, which you would expect to have all sorts of good effects.
Yesterday* I talked about a potential treatment for Long Covid, and referenced an informal study I’d analyzed that tried to test it, which had seemed promising but was ultimately a let down. That analysis was too long for its own post, so it’s going here instead.
Gez Medinger ran an excellent-for-its-type study of interventions for long covid, with a focus on niacin, the center of the stack I took. I want to emphasize both how very good for its type this study was, and how limited the type is. Surveys of people in support groups who chose their own interventions is not a great way to determine anything. But really rigorous information will take a long time and some of us have to make decisions now, so I thought this was worth looking into.
Medinger does a great analysis in this youtube video. He very proactively owns all the limitations of the study (all of which should be predictable to regular readers of mine) and does what he can to make up for them in the analysis, while owning where that’s not possible. But he delivers the analysis in a video rather than a text post ugh why would you do that (answer: he was a professional filmmaker before he got long covid). I found this deeply hard to follow, so I wanted to play with the data directly. Medinger generously shared the data, at which point this snowballed into a full-blown analysis.
I think Medinger attributes his statistics to a medical doctor, but I couldn’t find it on relisten and I’m not watching that damn video again. My statistical analysis was done by my dad/Ph.D. statistician R. Craig Van Nostrand. His primary work is in industrial statistics but the math all transfers, and the biology-related judgment calls were made by me (for those of you just tuning in, I have a BA in biology and no other relevant credentials or accreditations).
As best I can determine, Medinger sent a survey to a variety of long covid support groups, asking what interventions people had tried in the last month, when they’d tried them, and how they felt relative to a month ago. Obviously this has a lot of limitations – it will exclude people who got better or worse enough they didn’t engage with support groups, it was in no way blinded, people chose their own interventions, it relied entirely on self-assessment, etc.
Differences in Analysis
You can see Medinger’s analysis here. He compared the rate of improvement and decline among groups based on treatments. I instead transformed the improvement bucket to a number and did a multivariate analysis.
Much better (near or at pre-covid)
A little better
A little worse
You may notice that the numerical values of the statements are not symmetric: being “a little worse” is twice as bad as “a little better” is good. This was deliberate, based on my belief that people with chronic illness on average overestimate their improvement over short periods of time. We initially planned on doing a sensitivity analysis to see how this changed the results; in practice the treatment groups had very few people who got worse, so this would only affect the no-treatment control, and it was obvious that fiddling with the numbers would not change the overall conclusion.
Also, no one checked “significantly worse”, and when asked Medinger couldn’t remember if it was an option at all. This suggests to me that “Much worse” should have a less bad value and “a little worse” a more bad value. However, we judged this wouldn’t affect the outcome enough to be worth the effort, and ignored it.
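The transformation described above amounts to a lookup table plus an average. The post states only two constraints: 1 means complete recovery, and "a little worse" counts twice as heavily (in magnitude) as "a little better". The exact numeric values below are illustrative placeholders consistent with those constraints, not the ones actually used:

```python
# Hypothetical scoring consistent with the constraints in the post:
# 1 = complete recovery, and "a little worse" counts twice as much
# (in magnitude) as "a little better". Exact values are placeholders.
SCORES = {
    "much better": 1.0,
    "a little better": 0.25,
    "no change": 0.0,
    "a little worse": -0.5,
    "much worse": -1.0,
}

def mean_change(responses):
    """Average transformed score for one treatment group."""
    return sum(SCORES[r] for r in responses) / len(responses)

group = ["a little better", "no change", "much better", "a little worse"]
print(mean_change(group))  # (0.25 + 0.0 + 1.0 - 0.5) / 4 = 0.1875
```

One nice property of this setup is that the sensitivity analysis mentioned above is just rerunning `mean_change` with a different `SCORES` table.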
We tossed all the data where people had made a change less than two weeks ago (this was slightly more than half of it), except for the no-change control group (140 people). Most things take time to have an effect and even more things take time to have an effect you can be sure isn’t random fluctuation. The original analysis attempted to fix this by looking at who had a sudden improvement or worsening, but I don’t necessarily expect a sudden improvement with these treatments.
We combined prescription and non-prescription antihistamines because the study was focused on the UK which classifies several antihistamines differently than the US.
On row 410, a user used slightly nonstandard answers, which we corrected to being equivalent to “much improved”, since they said they were basically back to normal.
Medinger uses both “no change” and “new supplements but not niacin” as control groups, in order to compensate for selection and placebo effects from trying new things. I think that was extremely reasonable but felt I’d covered it by limiting myself to subjects with >2 weeks on a treatment and devaluing mild improvement.
I put my poor statistician through many rounds on this before settling on exactly which interventions we should focus on. In the end we picked five: niacin, antihistamines, and low-histamine diet, which the original analysis evaluated; vitamin D (because it’s generally popular); and selenium (because it had the strongest evidence of the substances prescribed in the larger protocol, which we’ll discuss soon).
Unfortunately, people chose their vitamins themselves, and there was a lot of correlation between the treatments. Below is the average result for people with no focal treatments, everyone with a given focal treatment, and everyone who did that and none of the other focal treatments for two weeks (but may have done other interventions). I also threw in a few other analyses we did along the way. These sample sizes get really pitifully small, and so should be taken as preliminary at best.
[Table not reproduced cleanly here: for each analysis it gave the mean change (1 = complete recovery) and its 95% confidence interval. Analyses: niacin, selenium, vitamin D, antihistamines, and low-histamine diet (each > 2 weeks); the same five restricted to people using no other focal treatments; all focal treatments combined; niacin + antihistamines; niacin + low-histamine diet; selenium + niacin with no histamine interventions; and niacin or selenium with no other focal treatments, ignoring vitamin D. In the underlying table, 1 = treatment used, 0 = treatment definitely not used, – = treatment not excluded.]
Confidence interval calculation assumes a normal distribution, which is a stretch for data this lumpy and sparse, but there’s nothing better available.
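For concreteness, the normal-approximation interval mentioned above is just mean ± 1.96 standard errors. The scores below are hypothetical, standing in for one treatment group's transformed improvement values:

```python
from math import sqrt
from statistics import mean, stdev

def normal_ci(scores, z=1.96):
    """95% confidence interval for the mean, assuming normality --
    a stretch for lumpy, sparse survey data, as noted above."""
    half_width = z * stdev(scores) / sqrt(len(scores))
    return mean(scores) - half_width, mean(scores) + half_width

# Hypothetical transformed improvement scores for one treatment group:
scores = [0.25, 0.0, 0.25, 1.0, -0.5, 0.25, 0.0, 1.0]
low, high = normal_ci(scores)
print(round(low, 2), round(high, 2))  # -0.07 0.63
```

Note how wide the interval is for a group of 8: with sample sizes this small, an interval straddling zero (as here) is the expected outcome even for a treatment with a real, modest effect.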
[I wanted to share the raw data with you but Medinger asked me not to. He was very fast to share with me though, so maybe if you ask nicely he’ll share with you too]
You may also be wondering how the improvements were distributed. The raw count isn’t high enough for really clean curves, but the results were clumped rather than bifurcated, suggesting the treatments help many people some rather than a few people lots. Here’s a sample graph for niacin (> 2 weeks, no exclusions).
Reasons this analysis could be wrong
All the normal reasons this kind of study or analysis can be wrong.
Any of the choices I made that I outlined in “Differences…”
There were a lot of potential treatments with moderate correlations with each other, which makes it impossible to truly track the cause of improvements.
Niacin comes in several forms, and the protocol I analyze later requires a specific form of niacin (I still don’t understand why). The study didn’t ask people what form of niacin they took. I had to actively work to get the correct form in the US (where 15% of respondents live); it’s more popular but not overwhelmingly so in the UK (75% of respondents), and who knows what other people took. If the theory is correct and if a significant number of people took the wrong form of niacin, it could severely underestimate the improvement.
This study only looked at people who’d changed things in the last month. People could get better or worse after that.
There was no attempt to look at dosage.
For a small sample of self-chosen interventions and opt-in participation, this study shows modest improvements from niacin and low histamine diets, although their confidence intervals overlap with the no-treatment group's if you exclude people using other focal interventions. The overall results suggest either that something in the stack is helping, or that trying lots of things is downstream of feeling better, which I would easily believe.
Thank you to Gez Medinger for running the study and sharing his data with me, R. Craig Van Nostrand for statistical analysis, and Miranda Dixon-Luinenburg for copyediting.
* I swear I scheduled this to publish the day after the big post, but here we are three days later and it still hasn't gone up, so…
Zinc lozenges are pretty well established to prevent or shorten the duration of colds. People are more likely to get colds while travelling, especially if doing so by plane and/or to a destination full of other people who also travelled by plane. I have a vague sense you shouldn’t take zinc 100% of the time, but given the risks it might make sense to take zinc prophylactically while travelling.
How much does zinc help? A meta-analysis I didn’t drill into further says it shortens colds by 33%, and that’s implied to be for people who waited until they were symptomatic to take it: taken preemptively I’m going to ballpark it at 50% shorter (including some colds never coming into existence at all). This is about 4 days, depending on which study you ask.
[Note: only a few forms of zinc work for this. You want acetate if possible, gluconate if not, and it needs to be a lozenge, not something you swallow. Zinc works by physically coating your throat to prevent infection; it's not acting as a nutrient in this case. You need much more than you think to achieve the effect; the brand I use barely fits in my tiny mouth.]
Some risk factors for illness in general are “being around a lot of people”, “poor sleep” and “poor diet”. These factors compound: being around people who have been around a lot of people, or who have poor sleep or diet, is worse than being around a lot of well-rested, well-fed hermits. Travel often involves all of these things, especially by air and especially for large gatherings like conferences and weddings (people driving to camp in the wilderness: you are off the hook).
I struggled to find hard numbers for risk of infection during travel. It’s going to vary a lot by season, and of course covid has confused everything. Hocking and Foster gives a 20% chance of catching a cold after a flight during flu season, which seems high to me, but multiple friends reported a 50% chance of illness after travel, so fill in your own number here. Mine is probably 10%.
If my overall risk of a cold is 10%, and zinc cuts the duration by 50% (about 4 days), then in expectation I've saved myself 0.4 days of a cold, plus whatever damage I would have done spreading it to others, plus the remaining days being milder. Carrying around the lozenges, remembering to take them, and working eating and drinking around them is kind of inconvenient, so this isn't a slam dunk for me, but it's worth a best effort (while writing this I ordered a second bottle of zinc to sit in my travel toiletry bag). It's probably worth a lot for my friends who face a 50% risk of illness, get unusually long colds, or live with small children who get cranky when sick. You know better than I do where you fall.
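The back-of-the-envelope expectation above can be written out explicitly. All the numbers here are this post's rough estimates (10% risk, ~8-day colds, ~50% reduction), not established figures, and the function name is mine:

```python
def expected_cold_days_saved(p_cold, cold_days, reduction):
    """Expected days of cold avoided by prophylactic zinc:
    probability of catching a cold * days per cold * fractional reduction."""
    return p_cold * cold_days * reduction

# My numbers: 10% risk per trip, ~8-day colds, ~50% reduction -> 0.4 days.
mine = expected_cold_days_saved(0.10, 8, 0.5)
# A friend with a 50% post-travel illness rate -> 2.0 days.
friend = expected_cold_days_saved(0.50, 8, 0.5)
```

Plugging in your own risk of illness and typical cold length is the whole calculation; the hard part is the estimates, not the arithmetic.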
Things that would change this cost-benefit estimate:
Personal reaction to zinc, or beliefs about its long term effects
Covid (all the numbers I used were pre-covid)
Different estimates for risk of illness during travel
Different estimates for the benefit of zinc
Personal susceptibility to illness
Caveats: anything that does anything real can cause damage. The side effects we know about for zinc lozenges are typically low, but pay attention to your own reaction in case you are unlucky. I remain an internet person with no medical credentials or accreditation. I attempt to follow my own advice and I’ve advised my parents to do this as well, but sometimes I’m rushed and forget.
ETA: I originally wrote this aimed at friends who already believed zinc was useful but hadn't considered prophylactic use, and as such didn't work very hard on it. I mistook some rando meta-analysis for a Cochrane review, and didn't look further. A pre-registered study has come out since showing no effect from zinc. There could be other studies showing the opposite; I haven't looked very closely. Plausibly that makes publishing this irresponsible: you definitely should judge me for mistaking a review that mentioned Cochrane for an actual Cochrane review. OTOH, writing too defensively inhibits learning, and I want to think my readers in particular are well calibrated on how much to trust off-the-cuff writing (but I hindered that by mislabeling the review as from Cochrane).
At this point, people I know are not that worried about dying from covid. We’re all vaccinated, we’re mostly young and healthy(ish), and it turns out the odds were always low for us. We’re also not that worried about hospitalization: it’s much more likely than death, but maintaining covid precautions indefinitely is very costly so by and large we’re willing to risk it.
The big unknown here has been long covid. Losing a few weeks to being extremely sick might be worth the risk, but a lifetime of fatigue and reduced cognition is a very big deal. With that in mind, I set out to do some math on what risks we were running. Unfortunately baseline covid has barely been around long enough to have data on long covid, most of it is still terrible, and the vaccine and Delta variant have not been widespread long enough to have much data at all.
In the end, the conclusion I came to was that for vaccinated people under 40 with <=1 comorbidity, the cognitive risks of long covid are lost in the noise of other risks they commonly take. Coming to this conclusion involved reading a number of papers, but also a lot of emotional processing around risk and health. I've included that processing under a "personal stuff" section, which you can skip if you just want the info, but I encourage you to read it if you feel yourself starting to yell that I'm not taking small risks of great suffering seriously. I do encourage you to read the caveats section before deciding how much weight to put on my conclusions.
This post took a long time to write, much longer than I wanted, because this is not an abstract topic to me. I have chronic pain from nerve damage in my jaw caused by medical incompetence, and my attempts to seek treatment for this continually run into the brick wall of a medical system that doesn’t consider my pain important (tangent: if you have a pain specialist you trust, anywhere in the US, please e-mail me (email@example.com)). I empathize very much with the long covid sufferers who are being told their suffering doesn’t exist because it’s too hard to measure and we can’t prove what caused it.
Additionally, I’m still suffering from side effects from my covid vaccine in April. It’s very minor, chest congestion that doesn’t seem to affect my lung capacity (but I don’t have a clear before picture, so hard to say for sure). But it’s getting worse and while my medical practitioners are taking it seriously, this + the experience with dental pain make me very sensitive to the possibility they might stop if it becomes too much work for them. As I type this, I am taking a supplement stack from a high end internet crackpot because first line treatment failed and there aren’t a lot of other options. And that’s just from the vaccine; I imagine if I actually had covid I would not be one of the people who shakes it off the way I describe later in this post.
All this is to say that when I describe the long term cognitive impact of covid as being too small to measure with our current tools against our current noise levels, that is very much not the same as saying it’s zero. It’s much worse than that. What I’m saying is that you are taking risks of similar levels of suffering and impairment constantly, which our health system is very bad at measuring, and against that background long covid does not make much of a difference for people within certain age and health parameters.
A common complaint when people say “X isn’t dangerous to the young and healthy” is that it implies the death and suffering of those who aren’t young and healthy don’t matter. I’m not saying that. It matters a lot, and it’s impossible for me to forget that because I’m very unlikely to be one of the people who gets to totally walk covid off if I catch it. But from looking at the data, there don’t seem to be very many of us in my age group.
Medical research in general is really bad, research of a live issue in a pandemic is worse, you should assume these are low quality studies unless I indicate otherwise.
This research was compiled for LessWrong and Redwood Research, with the goal of assessing safety for their office spaces populated by mostly-but-not-entirely-healthy people 25-40, who were much more interested in the cognitive and fatigue sequelae than the physical. Much of this research is applicable outside that group or the sources can be used in that way, but you should know that’s what I focused on.
There isn’t any data on long covid in vaccinated people with breakthrough delta-variant infections. Neither vaccines nor delta have been around long enough for that to exist. Baseline covid has barely been around long enough to have long-term data. What I have here is:
Data showing that strength of acute infection correlates with long term impact, although not perfectly
Data on the long term impact of baseline covid, given the strength of an initial infection
Data on how the vaccine impacts the strength of acute infections
Data on how delta impacts the strength of acute infections
Long term outcomes correlate with short term outcomes
By far the best study (best does not mean good) comes out of the UK, where the BBC coincidentally started an online intelligence test in January 2020 (giving them a pre-covid baseline) and in May began asking participants if they’d had covid and if so how bad a case. When I said “assume the studies are terrible unless I note otherwise”, this is the study I wanted to highlight as reasonably good. Because they can compare test-takers in a given time period with and without covid they can control for some of the effects of changing a study population over time, which would be the biggest concern. Additionally, my statistical consultant described the paper as “not having any errors that affect the conclusion”, which is extremely good for a medical paper. This study was not ideal for determining sequelae persistence, but they did check if size of effect was correlated with time since symptom onset, and it wasn’t (but their average was only 2 months).
This study showed a very direct correlation between the severity of the acute infection and cognitive decline. I don't trust its absolute numbers, but the pattern that more severe disease -> more severe persistent effects is very clear.
A second study in Wuhan, China (hat tip Connor Flexman) examined long term outcomes of hospitalized patients based on the intensity of their care (hospitalization, supplemental oxygen, ventilation), and found that greater acute severity was correlated with more sequelae, although this didn't hold for every symptom (there are a lot of symptoms and the highest-intervention group is small), and they barely looked at cognitive symptoms.
Taquet et al. used electronic health records to get a relatively unbiased six-figure sample size, which also showed a strong correlation between acute and long term outcomes; we'll talk about it more below.
From this I conclude that your overall risk of long covid is strongly correlated with the strength of the initial infection.
Odds of acute outcomes
Sah et al estimate that 35% of covid cases (implied to be baseline and pre-vaccination) are asymptomatic, with large variation by age. Children (<18) are 46% likely to be asymptomatic, adults 18-59 are 32% likely, adults >=60 are 20% likely. I’m going to round the non-elderly adult number to ⅓ to make the math easier.
The Economist has a great calculator showing your pre-vaccine, pre-Delta risk of hospitalization and death, given your age, sex, and comorbidities. Note that this calculator only includes diagnosed cases, so it excludes both asymptomatic cases and those that did have symptoms but didn’t drive people to seek medical care. Here’s a few sample people:
A healthy 30 year old man has a 2.7% chance of hospitalization, and <0.1% risk of death
A healthy 30 year old woman has a 1.7% chance of hospitalization, and <0.1% risk of death
A 25 year old man with asthma has a 4.2% risk of hospitalization, and <0.1% risk of death
A 40 year old woman with obesity has a 6.5% risk of hospitalization, and 0.1% risk of death.
Risk of hospitalization rises steadily with age, but the risk of death doesn't really take off until 50, at which point our healthy man has a death risk of 0.4% and our healthy woman a risk of 0.2%.
If you’d like, you can use your own numbers in this guesstimate sheet.
And again, that’s only for officially diagnosed and registered cases. If you assume ⅓ of infections in that age group are asymptomatic, the risk drops by ⅓.
If you are hospitalized, your risk of being ventilated is currently very, very low even if you’re in a high risk category. The overall average percent of hospitalized patients who were ventilated was 2.0% in the last week for which data was available (2021-03-24), after dropping steadily for most of the plague. We can assume that’s disproportionately among the elderly and people with severe comorbidities, so if that’s not you your odds are better still. I’m going to count the risk of intubation for our cohort as 0.5%, although that’s likely still an overestimate.
How do vaccines change these odds? According to CDC data from a time period ending 2021-05-01 (so before delta took off), 27% of breakthrough infections that reached the attention of the CDC were asymptomatic, and only 7% were hospitalized due to covid (another 3% were hospitalized for non-covid reasons). It's very likely that the CDC is undercounting asymptomatic cases, so we'll continue using our ⅓ number for now. The minimum age of reported breakthrough infection deaths was 71, so we'll continue to treat the risk of death as 0% for our sample subjects. Additionally, given the timing, most vaccinated participants would be elderly or front line workers, raising their risk considerably. A CDC press release goes much farther, saying vaccinated people > 65 had 7% of the hospitalizations of age-matched controls.
How does delta change these odds? A Scottish study estimated delta carries 2x the hospitalization risk of alpha, and a Danish study estimated alpha at 1.42x the hospitalization risk of baseline covid. So very roughly, we're looking at 3x the risk of hospitalization from delta, relative to baseline.
So for our sample cases above, we have the following odds (note I updated these on the night it was posted, due to a math error. Thanks to Rob Bensinger for catching it):
Risk given vaccine + delta (hospitalization; ventilation):
Healthy 30yo man: 0.38% hospitalization (= 2.7 * .07 * 3 * 2/3); .002% ventilation (= 0.38 * .005)
Healthy 30yo woman: 0.24% hospitalization (= 1.7 * .07 * 3 * 2/3); .002% ventilation (= 0.24 * .005)
Asthmatic 25yo man: 0.58% hospitalization (= 4.2 * .07 * 3 * 2/3); .003% ventilation (= 0.58 * .005)
Obese 40yo woman: 0.92% hospitalization (= 6.5 * .07 * 3 * 2/3); .005% ventilation (= 0.92 * .005)
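Spelled out, the arithmetic above is just a chain of multipliers. The 0.07 vaccine factor, 3x delta multiplier, ⅔ symptomatic share, and 0.5% ventilation rate are the rough estimates derived in the preceding sections, not established constants, and the function names are mine:

```python
VACCINE_FACTOR = 0.07   # breakthrough hospitalization share (CDC estimate above)
DELTA_FACTOR = 3        # ~2x delta vs alpha, times ~1.42x alpha vs baseline
SYMPTOMATIC = 2 / 3     # ~1/3 of cases in this age group assumed asymptomatic
VENTILATION = 0.005     # assumed ventilation rate given hospitalization

def hospitalization_risk(base_pct):
    """Adjust the Economist's pre-vaccine, pre-delta hospitalization risk (%)."""
    return base_pct * VACCINE_FACTOR * DELTA_FACTOR * SYMPTOMATIC

def ventilation_risk(base_pct):
    return hospitalization_risk(base_pct) * VENTILATION

# Healthy 30yo man: 2.7% baseline -> ~0.38% hospitalization.
h = hospitalization_risk(2.7)
```

Since everything is multiplicative, you can substitute your own Economist-calculator baseline, or your own beliefs about any of the four factors, without redoing the rest.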
That’s not so far from the rate of hospitalization in that age range for the flu (0.6%), with some caveats (the CDC sample includes unvaccinated people and the bucket is 18-49 years old, with the higher end presumably carrying more of the disease burden).
There is concern that vaccine effectiveness wanes over time, which I haven’t incorporated here.
Odds of long term outcomes
In general I ignored studies that merely tracked the number of persistent sequelae but not their severity or type, which made it impossible to distinguish "sense of smell still iffy" from "permanent intellectual crippling", and studies that didn't track how long the sequelae persisted. This was, unfortunately, most of them.
We talked about the Great British Intelligence Test above. I initially found this study quite scary. The study used its own tests rather than IQ, but if you assume a standard deviation in their tests is equivalent to a standard deviation in an IQ test, the worst category (ventilation) is equivalent to a 7 point IQ loss. That’s twice as bad as a stroke in this study (although I suspect sampling bias). I suspect the truth is worse still, because the worse your recently acquired cognitive and health issues are, the less likely you are to take a fun internet test advertised as measuring your intellectual strengths. However as I noted above, you are extremely unlikely to be put on a ventilator.
For people with “symptoms, but not respiratory symptoms”, the cognitive damage is ~equivalent to 0.6 IQ points. For “medical assistance at home”, it’s 1.8 points. These are both likely to be overestimates given that the study only included known (although not necessarily formally diagnosed) cases. Additionally, while the paper claims to control for education, income, etc, bad things are more likely to happen to people in worse environments, and it’s impossible to entirely back that out.
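The conversion being used in these two paragraphs assumes one standard deviation on the study's tests maps onto the IQ scale's standard deviation of 15 points. The effect sizes in SD units below are back-derived from the point values quoted above, and the function name is mine:

```python
IQ_SD = 15  # standard deviation of the conventional IQ scale

def sd_to_iq_points(effect_sd):
    """Convert an effect size in test standard deviations to IQ points,
    assuming one test SD == one IQ SD (this post's simplifying assumption)."""
    return effect_sd * IQ_SD

# Ventilated group: ~0.47 SD deficit -> ~7 IQ points.
ventilated = sd_to_iq_points(0.47)
# Symptomatic but non-respiratory: ~0.04 SD -> ~0.6 points.
mild = sd_to_iq_points(0.04)
# Medical assistance at home: ~0.12 SD -> ~1.8 points.
at_home = sd_to_iq_points(0.12)
```

If the study's tests have a different spread than IQ tests among the same population, these point values scale accordingly, which is one more reason not to trust the absolute numbers.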
Taquet et al. used electronic health records to get a relatively unbiased six-figure sample size, and found that unhospitalized diagnosed covid patients (pre-delta, pre-vaccine) had an 11% likelihood of a new neurological or psychiatric diagnosis after their covid diagnosis, hospitalized patients a 15% likelihood, and ICU patients a 26% likelihood. The majority of these were mood disorders (3.86%/4.49%/5.82% for home/hospitalized/ICU) and anxiety (6.81%/6.91%/9.79%). This seems quite bad, until you compare it to the overall numbers for depression in the time period, a naive reading of which suggests that covid had a protective effect.
These numbers aren't directly comparable. The second study is much lower quality and includes rediagnoses (although the total depression diagnosis numbers for the covid patients, 13.10%/14.69%/15.43%, are still under the total increase in depression in the general-population study).
Overall this seems well within what you’d expect from getting a scary disease at a scary time, and not evidence of widespread neuro or psych impact of covid. Even if you take the numbers at face value, they exclude most people who were asymptomatic or treated at home without a formal diagnosis.
A UK metareview found the prevalence at 12 weeks of symptoms affecting daily life ranged from 1.2% (average age 20, minimum 18) to 4.8% (average age 63). The cohort with average age 31 had a mean prevalence of 2.8%, which is well within the Lizardman Constant. This is based on self-reported survey data, which will again exclude asymptomatic cases, so even if you treat it as real you need to discount it down from 2.8%.
On the other hand, medicine is notoriously bad at measuring persistent, low-level, amorphous-yet-real effects. The Lizardman Constant doesn’t mean prevalences below 4% don’t exist, it means they’re impossible to measure using naive tools.
Comparison to other diseases
The Taquet study did compare covid patients to those with other respiratory diseases (including the flu, not controlling for disease severity or patient age), and found covid to be modestly worse except for myoneural junction and other muscular diseases, where covid 5xed the risk (although it’s still quite low in absolute terms). Dementia risk is also doubled, presumably mostly among the elderly.
My tentative conclusion is that the risks to me of cognitive, mood, or fatigue side effects lasting >12 weeks from long covid are small relative to risks I was already taking, including the risk of similar long term issues from other common infectious diseases. Being hospitalized would create a risk of noticeable side effects, but is very unlikely post-vaccine (although immunity persistence is a major unresolved concern).
I want to emphasize again that “small relative to risks you were already taking” doesn’t necessarily mean “too small to worry about”. For comparison, Josh Jacobson did a quick survey of the risks of driving and came to roughly the same conclusion: the risks are very small compared to the overall riskiness of life for people in their 30s. Josh isn’t stupid, so he obviously doesn’t mean “car accidents don’t happen” or “car accidents aren’t dangerous when they happen”. What he means is that if you’re 35 with 15 years driving experience and not currently impaired, the marginal returns to improvements are minor.
And yet. I have a close friend who somehow got in three or four moderate car accidents in < 7 years, giving her maybe-permanent soft tissue damage (to answer the obvious question: no, the accidents weren’t her fault. Sometimes she wasn’t even driving). Statistically, that friend doesn’t exist. No one gets in that many car accidents that quickly without it being their fault. And yet the law of large numbers has to catch up with someone. Too small to measure can be very large.
What this means is not that covid is safe, but that you should think about covid in the context of your overall risk portfolio. Depending on who you are that could include other contagious diseases, driving, drugs-n-alcohol, skydiving, camping, poor diet, insufficient exercise, too much exercise, and breathing outside. If you decide your current risk level is too high, or are suddenly realizing you were too risk-tolerant in the past, reducing covid risk in particular might not be the best bang for your buck. Paying for a personal trainer, higher quality food, or a HEPA filter should be on your radar as much as reducing social contact, although for all I know that will end up being the best choice for you personally.
Change my mind
My own behavior and plans have changed a lot based on this research, so I’m extremely interested in counterarguments. To make that easy, here’s a non-exhaustive list of things that would change my mind:
Evidence that long covid gets worse over time, rather than slowly improving (note that I did look at data from SARS 1 and failed to find this).
New variants increase the risk to what it was or was feared to be in April 2020
Evidence of more severe vaccine attenuation than we’re currently seeing.
Credible paths through which the risk could drop sharply in the next six months.
Thanks to LessWrong and Redwood Research for funding this research, Connor Flexman and Ray Arnold for comments on drafts, and Rob Bensinger and Lanrian for catching errors post-publication that did not affect my overall conclusion.