Criticism as Entertainment

Media Reviews

There is a popular genre of video that consists of shitting on other people’s work without any generative content. Let me provide some examples.

First, Cinema Sins. This is the first video I found when looking for a movie I’d seen whose Cinema Sins video I hadn’t watched (i.e., the selection wasn’t random, but it also wasn’t chosen for being particularly good or bad).

The first ten sins are:

  1. Use of a consistent brand for props in the movie they’d have to have anyway, unobtrusively enough that I never noticed until Cinema Sins pointed it out.
  2. A character being mildly unreasonable to provoke exposition.
  3. The logo.
  4. Exposition that wasn’t perfectly justified in-story.
  5. A convenience about what was shown on screen.
  6. A font choice (from an entity that in-universe would plausibly make bad font choices).
  7. An omission that will nag at you if you think about it long enough or expect the MCU to be a completely different thing, with some information about why it happened.
  8. In-character choices that would be concerning in the real world and I would argue are treated appropriately by the movie, although reasonable people could disagree.
  9. An error by a character that was extremely obviously intentional on the part of the filmmakers. There is no reasonable disagreement on this point.
  10. An error perfectly in keeping with what we know about the character.

Of those, three to four could even plausibly be called sins of the movie, and if those bother you, maybe the MCU is not for you. The rest are deliberate choices by filmmakers to have characters do imperfect things. Everyone gets to draw their own line on characters being dumb (mine is after this movie but before 90s sitcoms running on miscommunication), but that’s irrelevant to this post, because Cinema Sins is not helping you determine where a particular movie falls relative to your line. Every video makes the movie sound exactly as bad as the last, regardless of the quality of the underlying movie. It’s like they analyze the dialogue sentence by sentence and look for anything that could be criticized about it.

Pitch Meeting is roughly as useful, but instead of reacting to sentences, it’s reading the plot summary in a sarcastic tone of voice.

Pitch Meeting at least brings up actual problems with Game of Thrones season 8. But I dare you to tell, from the Pitch Meetings alone, whether early Game of Thrones was better or worse than season 8.

I keep harping on “You can’t judge movie quality by the review”, but I don’t actually think that’s the biggest problem. Or rather, it’s a subset of the problem, which is that you don’t learn anything from the review: not whether the reviewer considered the movie “good” or not, and not what could be changed to make it better. Contrast with Zero Punctuation, a video game review series notorious for being criticism-as-entertainment, which nonetheless occasionally likes things, and at least once per episode displays a deep understanding of the problems of a game and what might be done to fix it.

Why Are You Talking About This?

It’s really, really easy to make something look bad, and the short-term rewards for doing so are high. You never risk looking stupid or having to issue a correction. It’s easier to make criticism funny than praise. You get to feel superior. Not to mention the sheer joy of punishing bad things. But it’s corrosive. I’ve already covered (harped on) how useless shitting-on videos are for learning or improvement, but it goes deeper than that. Going in with those intentions changes how you watch the movie. It makes flaws more salient and good parts less so. You become literally less able to enjoy or learn from the original work.

Maybe this isn’t universal, but for me there is definitely a trade-off between “grokking the author’s concepts” and “interrogating the author’s concepts and evidence”. Grokking is a good word here: it mostly means understanding, but includes playing with the idea and applying it to what I know. That’s very difficult to do while simultaneously looking for flaws.

Should it be, though? Generating testable hypotheses should lead to greater understanding, and to either more or less trust depending on the correctness of the book. So at least one of my investigation or grokking procedures is wrong.

What we Know vs. How we Know it?

Two weeks ago I said:

The other concept I’m playing with is that “what we know” is inextricable from “how we know it”. This is dangerously close to logical positivism, which I disagree with, to the extent of my limited understanding of it. And yet it’s really improved my thinking when doing historical research.

I have some more clarity on what I meant now. Let’s say you’re considering my ex-roommate, person P, as a roommate, and ask me for information. I have a couple of options.

Scenario 1: I turn over chat logs and video recordings of my interactions with P.

E.g., recordings of P playing music loudly and chat logs showing I’d asked them to stop.

Trust required: that the evidence is representative and not an elaborate deep fake.

Scenario 2: I report representative examples of my interactions with P.

E.g., “On these dates P played music really loudly even when I asked them to stop.”

Trust required: that from scenario 1, plus that I’m not making up the examples.

Scenario 3: I report summaries of patterns with P.

E.g., “P often played loud music, even when I asked them to stop.”

Trust required: that from scenario 2, plus my ability to accurately infer and report patterns from data.

Scenario 4: I report what a third party told me.

E.g., “Mark told me they played loud music a lot.”

Trust required: that from scenario 3, plus my ability to evaluate other people’s evidence.

Scenario 5: I give a flat “yes good” or “no bad” answer.

E.g., “P was a bad roommate.”

Trust required: that from scenario 3 (and perhaps 4), plus that I have the same heuristics for roommate goodness that you do.

The earlier the scenario, the more you can draw your own conclusions and the less trust you need to have in me. Maybe you don’t care about loud music, and a flat yes/no would drive you away from a roommate that would be fine for you. Maybe I thought I was clear about asking for music to stop but my chat logs reveal I was merely hinting, and you are confident you’ll be able to ask more directly. The more specifics I give you, the better an assessment you’ll be able to make.

Here’s what this looks like applied to recent reading:

Scenario 5: Rome fell in the 500s AD.

Even if I trust your judgement, I have no idea why you think this or what it means to you.

Scenario 4: In Rome: The Book, Bob Loblaw says Rome fell in the 500s AD.

At least I can look up why Bob thinks this.

Scenario 3: Pottery says Rome fell between 300 and 500 AD.

Useful to experts who already know the power of pottery, but leaves newbies lost.

Scenario 2: Here are 20 dig sites in England. Those dated before 323 AD (via METHOD) contain pottery made in Greece (which we can identify by METHOD); those dated after 500 AD show cruder pottery made locally.

Great. Now my questions are “Can pottery evidence give that much precision?” and “Are you interpreting it correctly?”

Scenario 1: Please enjoy this pile of 3 million pottery shards.

Too far, too far.


In this particular example (from The Fall of Rome), 2-3 was the sweet spot. It allowed me to learn as much as possible with a minimum of trust. But there’s definitely room in life for 4; you can’t prove everything in every paper and sometimes it’s more efficient to offload it.

I don’t view 5 as acceptable for anything claiming to be evidence-based, or at least evidence-based on anything besides “Try this and see if it helps you” (which is a perfectly fine basis if it’s cheap).

ESC Process Notes: Detail-Focused Books

When I started doing epistemic spot checks, I would pick focal claims and work to verify them. That meant finding other sources and skimming them as quickly as possible to get their judgement on the particular claim. This was not great for my overall learning, and it’s not even really good for claim evaluation: it flattens complexity and focuses me on claims with obvious binary answers that can be evaluated without context. It also privileges the hypothesis by focusing on “is this claim right?” rather than “what is the truth?”.

So I moved towards reading all of my sources deeply, even if my selection was inspired by a particular book’s particular claim. But this has its own problems.

In both The Oxford Handbook of Childhood and Education in the Classical World and Children and Childhood in Roman Italy, my notes sometimes degenerate into “and then a bunch of specifics”. “Specifics” might mean a bunch of individual art pieces, or a list of books that subtly changed a field’s framing. This happens because I’m not sure what’s important and get overwhelmed.

Knowledge of importance comes from having a model I’m trying to test. The model can be external to the focal book (either from me or from another book), or come from the book itself. E.g., I didn’t have a particular frame on the evolution of states before starting Against the Grain, but James C. Scott is very clear on what he believes, so I can assess how relevant various facts he presents are to evaluating that claim.

[I’m not perfect at this; e.g., in The Unbound Prometheus, the author claims that Europeans were more rational than Asians, and that their lower birth rate was evidence of this. I went along with that at the time because of the frame I was in, but looking back, I think that even assuming Europe did have a lower birth rate, it wouldn’t have proved Europeans were more rational or scientifically minded. This is a post in itself.]

If I’d come into The Oxford Handbook of Childhood and Education in the Classical World or Children and Childhood in Roman Italy with a hypothesis to test, it would have been obvious what information was relevant and what wasn’t. But I didn’t, so it wasn’t, and that was very tiring.

The obvious answer is “just write down everything”, and I think that would work with certain books. In particular, it would work with books that could be rewritten in Workflowy: those with crisp points that can be encapsulated in a sentence or two and stored linearly or hierarchically. There’s a particular thing both books did that necessitated copying entire paragraphs because I couldn’t break it down into individual points.

Here’s an example from Oxford Handbook…

“Pietas was the term that encompassed the dutiful respect shown by the Romans towards their gods, the state, and members of their family (Cicero Nat. Deor. 1.116; Rep. 6.16; Off. 2.46; Saller 1991: 146–51; 1998). This was a concept that children would have been socialized to understand and respect from a young age. Between parent and child pietas functioned as a form of reciprocal dutiful affection (Saller 1994: 102–53; Bradley 2000: 297–8; Evans Grubbs 2011), and this combination of “duty” and “affection” helps us to understand how the Roman elite viewed and expressed their relationship with their children.”

And from Children and Childhood…

“No doubt families often welcomed new babies and cherished their children, but Roman society was still struggling to establish itself even in the second century and many military, political, and economic problems preoccupied the thoughts and activities of adult Romans”

I summarized that second one as “Families were distracted by war and such up through 0000 BC”, which loses a lot of nuance. It’s not impossible to break these paragraphs down into constituent thoughts, but it’s ugly and messy and would involve a lot of repetition. The first mixes up what pietas is with how, and to whom, it was expressed. The second combines a claim about the state of Rome with the effects of that state.

This reveals that calling the two books “lists of facts” was incomplete. Lists of facts would be easier to take notes on. These authors clearly have some concepts they are trying to convey, but because the concepts aren’t cleanly encapsulated in the authors’ own minds, it’s hard for me to encapsulate them. It’s like trying to lay out the threads of a Gordian knot in an organized fashion.

So we have two problems: books which have laid out all their facts in a row but not connected them, and books which have entwined their facts too roughly for them to be disentangled. These feel very similar to me, but when I write them out the descriptions sure sound like two completely different problems.

Let me know how much sense this makes; I can’t tell if I’ve written something terribly unpolished-but-deep or screamingly shallow.

Epistemic Spot Check: The Oxford Handbook of Childhood and Education in the Classical World

Introduction

Once upon a time I started fact checking books I read, to see if they were trustworthy. I called this epistemic spot checking because it was not a rigorous or thorough investigation; just following up on things I thought were interesting, likely to be wrong, or easy to check. Eventually I became dissatisfied with this. It placed too much emphasis on a binary decision about a particular book’s trustworthiness, and not enough on building models. So I started working on something better. Something that used multiple sources to build robust models of the world.

The Oxford Handbook of Childhood and Education in the Classical World (editors Judith Evans Grubbs and Tim Parkin) (affiliate link) is part of that attempt, but not a very big part, because it failed to be the kind of book I wanted. It was not as bad as Children and Childhood in Roman Italy at being just a bunch of facts with no organizing thesis, but it’s on that scale. And honestly it might be just as bad; I just find literature more interesting than visual arts. Like Children and Childhood…, I’m going to write it up anyway, because learning from this kind of book is important.

Typically I read a book in order, but this was a collection of papers from different authors, so one chapter’s epistemics didn’t have much predictive value for the next and they didn’t build on each other the way a single-author book might. I started with chapter 15 (Children and Childhood in Roman Commemorative Art) because I was checking Children and Childhood…, and then chapter 13 (The Socialization of Roman Citizens) because it looked the most interesting.

You can see the entirety of my notes here.

Claims

Claim: Soranus advised that swaddling is good for babies because it keeps them from rubbing their eyes (bad for eyesight) and leads to a healthy, strong body (p290)
Verdict: Directly confirmed by a translation of Soranus’s Gynecology.

Claim: Seneca’s de Ira (On Anger) recommended:

  • Guiding young children to avoid high-anger personalities in adulthood
  • Not crushing children’s spirits
  • Not spoiling children

(p290)
Verdict: Directly confirmed by a translation of On Anger (or, as they call it, Of Anger).

Claim: Beryl Rawson called children the explicit aim of marriage in Rome.
Verdict: Yup, I remember that from last week.

Claim: Soranus, Laes, and Rawson all say a typical Roman birth would be witnessed by women from outside the home (p290-291)
Verdict: The report on Rawson is clearly true, and I believe she was quoting Soranus.

Claim: “Juvenal suggests that celebrations were held in the narrow streets outside dwellings ”
Verdict: True (see translation).

Claim:  “Although Cicero did not want to govern the province for an extended period of time, he would have stressed to [his son and nephew] that this was an important duty and that his own dedication to the task was an excellent example of his virtue and self-control (Cic. Att. 5.10.2–3, 5.14.2, 5.15.1).” (p296)
Verdict: Cicero’s letters make it abundantly clear he did not want to be there, but if the author has evidence of his motives for doing so, she doesn’t share it.

Claim: When Cicero went off to war he left his son and nephew with King Deiotarus of Galatia (p296).
Verdict: Confirmed in Cicero’s letters.

Claim: Most Roman girls experienced their first marriage in their mid-to-late teens (p298).
Verdict: Likely but not proven. You can see everything I’ve gathered on this question here. The summary is: the usual view was that Roman girls got married in early-to-mid teens, then someone went through and checked tombstones looking to see who died when and if they mentioned a surviving spouse, and found that Roman women married in their late teens (excellent summary of both sides). Tombstone demographics have their own issues so I don’t consider this proven, but it is suggestive.

Claim: In the 000s, the representation of children in art increased substantially, continuing through the early 200s
Verdict: Rawson says the same thing, with some quibbling about dates.

Claim: The toga was a mark of Roman citizenship and forbidden to slaves. (p329)
Verdict: Confirmed by Wikipedia.

Claim: Quintus Sulpicius Maximus was an 11-year-old boy who performed well in a poetry competition and got a nice funerary altar. (p336)
Verdict: The exact same data is in Children and Childhood in Roman Italy (Beryl Rawson).

Claim: “Funerary reliefs as well as altars were most frequently commissioned by freedmen. To them a freeborn child, especially a son, was a mark of success. It was of particular importance to demonstrate the existence of a freeborn child, even one who had not lived to adulthood, and to show the family’s financial capacity to raise a memorial to a deceased child.” (p343)
Verdict: Likely, but I haven’t seen a census. When I was reading Children and Childhood in Roman Italy, enough of the funerary art was about the freeborn children of ex-slaves that I noticed and wondered about it. But that could have been because Rawson cherrypicked examples, or because freed couples chose more durable forms of art for their children than citizens did.

Verdict

Like Children and Childhood in Roman Italy, The Oxford Handbook of Childhood and Education in the Classical World isn’t really interesting or ambitious enough to get things wrong. It is nonetheless useful as a repository of facts with which to check more ambitious books (which is in fact why I’m reading it) or generate your own theses.


ESC Process Notes: Claim Evaluation vs. Syntheses

Forgive me if some of this is repetitive; I can’t remember what I’ve written in which draft and what’s actually been published, much less tell what’s actually novel. Eventually there will be a polished master post describing my overall note-taking method and leaving out most of how it was developed, but it also feels useful to discuss the journey.

When I started taking notes in Roam (a Workflowy/wiki hybrid), I would:

  1. Create a page for the book (called a Source page), with some information like author and subject (example)
  2. Record every claim the book made on that Source page
  3. Tag each claim so it got its own page
  4. When investigating a claim, gather evidence from various sources and list it on the claim page, grouped by source

This didn’t make sense though: why did some sources get their own page and some a bullet point on a claims page? Why did some claims get their own page and some not? What happened if a piece of evidence was useful in multiple claims?

Around this time I coincidentally had a call with Roam CEO Conor White-Sullivan to demo a bug I thought I had found. There was no bug (I had misremembered the intended behavior), but it meant that he saw my system and couldn’t hide his flinch. Aside from wrecking performance, there was no need to give each claim its own page: Roam has block references, so you can point to individual bullet points, not just pages.

When Conor said this, something clicked. I had already identified one of the problems with epistemic spot checks as being too binary, too focused on evaluating a particular claim or book rather than building knowledge. The original way of note-taking was a continuation of that. What I should be doing was gathering multiple sources, taking notes on them on equal footing, and then combining them into an actual belief using references to the claims’ bullet points. I call that a Synthesis (example). Once I had an actual belief, I could assess the focal claim in context and give it a credence (a slider from 1-10), which could be used to inform my overall assessment of the book.

Sometimes there isn’t enough information to create a Synthesis, so something is left as a Question instead (example).
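To make the structure concrete, here is a minimal sketch of the shape this takes, written as Python dataclasses. To be clear, this is not Roam’s actual data model, and the titles, block IDs, claim text, and credence below are made up; it’s only meant to illustrate the idea that claims live as blocks on Source pages, and a Synthesis references those blocks (rather than copying them) and attaches a credence once the evidence has been combined.

```python
# A rough schematic of the Source / Claim / Synthesis structure described
# above. NOT Roam's data model; IDs, titles, claim text, and credence are
# invented purely for illustration.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Claim:
    block_id: str  # stands in for a Roam block-reference target
    text: str

@dataclass
class SourcePage:
    title: str
    claims: List[Claim] = field(default_factory=list)

@dataclass
class Synthesis:
    question: str
    evidence_refs: List[str]        # block_ids drawn from any number of sources
    credence: Optional[int] = None  # the 1-10 slider, set once a belief forms

# Claims are recorded once, on the page of the source that made them...
handbook = SourcePage("Oxford Handbook", [
    Claim("block-001", "Most Roman girls first married in their mid-to-late teens."),
])
rawson = SourcePage("Children and Childhood in Roman Italy", [
    Claim("block-002", "In a typical Roman marriage the man was ~10 years older."),
])

# ...and a Synthesis combines references to those claim blocks into an
# actual belief, instead of copying the text around.
marriage_age = Synthesis(
    question="When did Roman women typically first marry?",
    evidence_refs=["block-001", "block-002"],
    credence=7,
)
```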

Once I’d proceduralized this a bit, it felt so natural and informative that I assumed everyone else would find it similarly so. Finally you didn’t have to take my word for what was important: you could see all the evidence I’d gathered and then click through to see the context on anything you thought deserved a closer look. Surely everyone will be overjoyed that I am providing this.

The feedback was overwhelmingly that this was much worse, no one wanted to read my Roam DB, and I should keep presenting evidence linearly.

I refuse to accept that my old way is the best way of presenting evidence and conclusions about a book or a claim. It’s too linear and contextless. I do accept that “here’s my Roam, have fun” is worse. Part of my current project is to identify a third way that shares the information I want to share in a way that is actually readable.

Epistemic Spot Check: Children and Childhood in Roman Italy (Beryl Rawson)

Introduction

Once upon a time I started fact checking books I read, to see if they were trustworthy. I called this epistemic spot checking because it was not a rigorous or thorough investigation; just following up on things I thought were interesting, likely to be wrong, or easy to check. Eventually I became dissatisfied with this. It placed too much emphasis on a binary decision about a particular book’s trustworthiness, and not enough on building models. So I started working on something better. Something that used multiple sources to build robust models of the world.

Children and Childhood in Roman Italy (affiliate link) was supposed to be a step in that process of learning how to extract and test models from books. Unfortunately it turned out to be poor soil. I can tell you what Children and Childhood is about, although not more than you are already capable of guessing, but ⅓ of the way through it I can’t name a thesis. It’s just a collection of facts. So this is going to look a lot more like old epistemic spot checks than I’d hoped. I’m publishing anyway because it’s good writing practice, and because I’ve received grant money for this project and that carries with it an obligation to share as much of my work as is practical.

Claims

Claim: Quintus Sulpicius Maximus is a nice boy who placed beyond his years (11) at a poetry competition and got a very nice funerary altar depicting him as a little scholar
Verdict: Yup, way to not screw up some basic facts (read: The Oxford Handbook of Childhood and Education in the Classical World (my notes in Roam) (affiliate link) agrees with them).

Claim: Children first started appearing in (preserved) Roman art in the first century BC, and their representation steadily increased through 0200 AD
Verdict: Plausible, and a weird thing to get wrong, but not proven. I found a similar claim in The Oxford Handbook of Childhood and Education in the Classical World, which cites three sources: Currie 1996 (inaccessible), Rawson 2001 (that’s the same author as the focal book and thus not independent verification), and Uzzi 2005 (inaccessible).

Claim: Representations of children picked up about this time because Augustus Caesar was trying to establish his descendants as rightful rulers after his death, and other people copied him.
Verdict: Yeah, sure seems plausible, but I don’t know how we would know that it was that as opposed to…

Claim: Augustus Caesar passed a number of laws (Lex Aelia Sentia and Lex Papia Poppaea) incentivizing marriage and children, most notably giving privileges to people who had sufficient children (3 for citizens, 4 for freedmen). Dead children counted, creating an incentive to have art commemorating your dead offspring.
Verdict: The legal claims are easily verified on Wikipedia.

Or perhaps representation of children went up because…

Claim: Rome got significantly wealthier in the first century BC.
Verdict: Unknown and sensitive to definitions. I had the impression this was correct from reading other books and I expected to knock it out in five minutes, but I couldn’t actually find any data clearly laying out the case. The Greenland Ice Core data doesn’t line up

[Graph: Roman-era lead and antimony levels]

Although how it fails to line up depends on who you ask,

and while I found some very cool graphs on construction in Rome, they were sourceless.

Or maybe representation of children went up because there were more freed slaves and…

Claim: Former slaves produced art of their citizen children at greatly increased rates to advertise upward mobility.
Verdict: Plausible, but still impossible to distinguish from other explanations. Funerary art of ex-slaves does seem to be overrepresented, but perhaps the upper classes produced just as much in a less durable form.

Claim: Funerary reliefs came into fashion in the first century BC, followed by altars in the first century AD and sarcophagi in the third.
Verdict: Confirmed by The Oxford Handbook of Childhood and Education in the Classical World, with some quibbling about dates.

Claim: In a typical Roman marriage, the man was 10 years older than the woman.
Verdict: Probably true. This was another one I expected to knock out in two minutes but was surprisingly difficult. The best source I found was this blog post, which is a summary of one very old paper (1896) that did a demographic survey of graves, plus two papers, one of which relies on that same paper and the other of which I didn’t check. It seems that the exact ages at which Roman men and women got married are in dispute, but the relative ages are agreed to be about 10 years apart.

Claim: Contraception and Abortion from the Ancient World to the Renaissance concludes, from a variety of evidence (literary, skeletal, comparative), that ‘classical peoples were somehow regulating their family sizes’.
Verdict: Accurate citation, but ignores the author’s skepticism of his own evidence.

Claim: “Scott (2000) has recently queried some of the assumptions in discussions of infanticide and has suggested that the evidence for alleged bias against females and the disabled in infant deaths is weak.”
Verdict: Accurate citation.

Conclusion

Children and Childhood in Roman Italy seems too boring to have made any major mistakes. If it gets overturned, I expect it to be because new evidence becomes available, which is a risk inherent to the topic.

If you’re interested in my process, you can see my notes in Roam here. Any claim with a number to its right

[Screenshot: a claim in Roam with a reference count displayed to its right]

is cited by another page, which you can get to by clicking on the number.  

Any claim with a slider bar has been investigated and assigned a credence. 


Many thanks to my Patreon Patrons and the Long Term Future Fund for financial support of this post.


How’s that Epistemic Spot Check Project Coming?

Quick context: Epistemic spot checks started as a process in which I did quick investigations of a few of a book’s early claims to see if it was trustworthy before continuing to read it, in order to avoid wasting time on books that would teach me wrong things. Epistemic spot checks worked well enough for catching obvious flaws (*cou*Carol Dweck*ugh*), but have a number of problems. They emphasize a trust/don’t trust binary over model building, and provability over importance. They don’t handle “severely flawed but deeply insightful” well at all. So I started trying to create something better.

Below are some scattered ideas I’m playing with that relate to this project. They’re by no means fully baked, but it seemed like it might be helpful to share them. This kind of assumes you’ve been following my journey with epistemic spot checks at least a little. If you haven’t, that’s fine; a more polished version of these ideas will come out eventually.

A Parable in Three Books

I’m currently attempting to write up an investigation of Children and Childhood in Roman Italy (Beryl Rawson) (affiliate link) (Roam notes). This is very slow going, because ChCiRI doesn’t seem to have a thesis. At least, I haven’t found one, and I’ve read almost half of the content. It’s just a bunch of facts. Often not even syntheses, just “Here is one particular statue and some things about it.” I recognize that this is important work, even the kind of work I’d use to verify another book’s claims. But as a focal source, it’s deadly boring to take notes on and very hard to write anything interesting about. What am I supposed to say? “Yes, that 11-year-old did do well (without winning) in a poetry competition and it was mentioned on his funerary altar, good job reporting that.” I want to label this sin “weed-based publishing” (as in, “lost in the weeds”, although the fact that I have to explain that is a terrible sign for it as a name).

One particularly bad sign for Children and Childhood in Roman Italy was that I found myself copying multiple sentences at once into my notes. Direct quoting can sometimes mean “there are only so many ways to arrange these words and the author did a perfectly good job, so why bother”, but when it’s frequent, and long, it often means “I can’t summarize or distill what the author is saying”, which can mean the author is being vague, eliding important points, or letting implications do work that should be made explicit. This was easier to notice when I was taking notes in Roam (a Workflowy/wiki hybrid), because Roam pushes me to make my bullet points as self-contained as possible (so that when you reference them in isolation nothing is lost), so it became obvious and unpleasant when I couldn’t split a paragraph into self-contained assertions. Obviously real life is context-dependent and you shouldn’t try to make things more self-contained than they are, but I’m comfortable saying frequent long quotes are a bad sign about a book.

On the other side you have The Unbound Prometheus (David S. Landes) (affiliate link) (Roam notes), which made several big, interesting, important, systemic claims (e.g., “Britain had a legal system more favorable to industrialization than continental Europe’s”, “Europe had a more favorable climate for science than Islamic regions”), none of which it provided support for (in the sections I read; a friend tells me he gets more specific later). I tried to investigate these myself and ended up even more confused: scholars can’t even agree on whether Britain’s patent protections were strong or weak. I want to label this sin “making me make your case for you”.

A Goldilocks book is The Fate of Rome (Kyle Harper) (affiliate link) (Roam notes). Fate of Rome’s thesis is that the peak of the Roman Empire corresponds with unusually favorable weather conditions in the Mediterranean. It backs this up with claims about climate archeology, e.g., ice core data (claim 1, 2). This prompted natural and rewarding follow-up questions like “What is ice core data capable of proving?” and “What does it actually show?”. My note-taking system in Roam was superb at enabling investigations of questions like these (my answer).

Based on claim creation, Against the Grain (James Scott) (affiliate link) (Roam notes) is even better. It has both interesting high-level models (“settlement and states are different things that arose very far apart in time”, “states are entangled with grains in particular”) and very specific claims to back them up (“X was permanently settled in year Y but didn’t develop statehood hallmarks A, B, and C until year Z”). It is very easy to see how that claim supports that model, and the claim is about as easy to investigate as it can be. It is still quite possible that the claim is wrong or more controversial than the author is admitting, but it’s something I’ll be able to determine in a reasonable amount of time. As opposed to Unbound Prometheus, where I still worry there’s a trove of data somewhere that answers all of the questions conclusively and I just failed to find it.

[Against the Grain was started as part of the Forecasting project, which is currently being reworked. I can’t research its claims because that would ruin our ability to use it for the next round, should we choose to do so, so evaluation is on hold.]

If you asked me to rate these books purely on ease-of-reading, the ordering (starting with the easiest) would be:

  • Against the Grain
  • The Fate of Rome
  • Children and Childhood in Roman Italy
  • The Unbound Prometheus

Which is also very nearly the order they were published in, newest first (Against the Grain came out six weeks before Fate of Rome; the others are separated by decades). It’s possible that the two modern books were no better epistemically but felt so because they were easier to read. It’s also possible it’s a coincidence, or that epistemics have gotten better in the last 50 years.

Model Based Reading

As is kind of implied in the parable above, one shift in Epistemic Spot Checks is a new emphasis on extracting and evaluating the author’s models, which includes an emphasis on finding load-bearing facts. I feel dumb for not emphasizing this sooner, but better late than never. I think the real trick here is not recognizing that it’s valuable to know a book’s models, but creating techniques for actually extracting and evaluating them.

How do we Know This?

The other concept I’m playing with is that “what we know” is inextricable from “how we know it”. This is dangerously close to logical positivism, which I disagree with, to the extent of my limited understanding of it. And yet it’s really improved my thinking when doing historical research.

This is a pretty strong reversal for me. I remember strongly wanting to just be told what we knew in my science classes in college, not the experiments that revealed it. I’m now pretty sure that’s scientism, not science.

How’s it Going with Roam?

When I first started taking notes with Roam (note spelling), I was pretty high on it. Two months later, I’m predictably loving it less than I did (it no longer drives me to do real-life chores), but I still find it indispensable. The big discovery is that the delight it brings me is somewhat book-dependent: it’s great for Against the Grain or The Fate of Rome, but it didn’t help nearly so much with Children and Childhood in Roman Italy, because that book was mostly very on-the-ground facts that didn’t benefit from my verification system, plus long paragraphs that couldn’t be disentangled.

I was running into a ton of problems with Roam’s search not handling non-sequential words, but they seem to have fixed that. Search is still not ideal, but it’s at least usable.

Roam is pretty slow. It’s currently a race between their performance improvements and my increasing hoard of #Claims.

Epistemic Spot Check: The Unbound Prometheus

Introduction

One of the challenging things about learning is knowing which sources you should learn from: if you already knew what was correct, you wouldn’t be trying to learn it in the first place. Epistemic spot checks started as a process in which I did quick investigations of a few of a book’s early claims to see if it was trustworthy before continuing to read it, in order to avoid wasting time on books that would teach me wrong things. Friends indicated they found these useful, so I started sharing them, and even got a small Patreon running.

Epistemic spot checks worked well enough for catching obvious flaws (*cou*Carol Dweck*ugh*), but have a number of problems. They emphasize a trust/don’t trust binary over model building, and provability over importance. They don’t handle “severely flawed but deeply insightful” well at all. So I started trying to create something better. This post is part of that attempt, and as such contains both checks of a book’s claims and introspection on the process of checking those claims.

But before I started that improvement process, there was another project. Even the quick versions of epistemic spot checks are time-consuming, and I am only one person, who will not be unemployed forever. I started working with Foretold and Parallel on a project to amplify my spot checks by having people predict how I would evaluate claims, the idea being that if the masses got good at it, prediction markets could be a partial substitute for my investigations. This blog post is also part of that project, which entails some extra steps. If you’re interested in how this has worked in the past, check out The Fate of Rome and The Fall of Rome.

Today’s book is The Unbound Prometheus (affiliate link), which aims to explain why the Industrial Revolution happened when and where it did. Spoiler alert: I didn’t like it.


Process

I had three phases of action: in the first, I read the book, created claims in Foretold, and entered my priors for the claims as predictions. Phase two was much like the only phase of previous checks: I had a set period of time (six hours, twice what I had for The Fall of Rome) to investigate randomly selected claims for as long as felt useful, after which I predicted what my credence in each claim would be after 10 hours of research. This prediction was submitted as a resolution in Foretold. In practice there were several claims where I stopped an hour in, even though I still had very high uncertainty, because it seemed like it would take a lot of additional time to shrink my confidence bars.

Answers, here and in the prediction market, are given in Foretold syntax.

In phase three, I had three hours each to answer two questions, randomly selected from those I had answered in phase two. The goal here was to see how good I was at predicting my own answers. The gods of fate were not kind on this one, and I drew the two questions that least benefited from additional time, being fairly strict factual questions with an exhaustible amount of relevant material. These evaluations were not entered in Foretold, but I’ve included them here.

As is my new custom, I took my notes in Roam, a Workflowy/wiki hybrid. Until recently I thought Roam was so magic that my raw notes were better formatted there than I could ever hope to make them in a linear document like this, so I could just share my conclusions here, and let people read my notes in Roam if they were especially curious. In between writing most of this post and publishing it, I learned that many people find my Roam notes too difficult to read and prefer having them written out linearly. I didn’t have the time or energy to fix this post, but rest assured I’m thinking about how to do this better. In the meantime, Roam notes are formatted as follows:

  • The target source gets its own page
  • On this page I list some details about the book and claims it makes. If the claim is citing another source, I may include a link to the source.
  • If I investigate a claim or have an opinion so strong it doesn’t seem worth verifying (“Parenting is hard”), I’ll mark it with a credence slider. The meaning of each credence will eventually be explained here, although I’m still working out the system.
    • Then I’ll hand-type a number for the credence in a bullet point, because sliders are changeable even by people who otherwise have only read privileges. If the slider and text number disagree, believe the text.
  • You may see a number to the side of a claim. That means it’s been cited by another page. It is likely a synthesis page, where I have drawn a conclusion from a variety of sources. The synthesis pages are also what I’ll be linking to in this post.

Another thing that changed this time around is that I learned to use the multimodal function of Foretold, which lets you combine distribution functions. This is great for hedging your bets but not great for comprehensibility, so I’ll include the graphs it generated. Unfortunately Foretold still doesn’t have a graphical export, so I’m using hideous screenshots.
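For anyone unfamiliar with the idea, “combining distribution functions” here just means taking a weighted mixture. The sketch below is plain numpy/scipy rather than Foretold’s syntax, and the locations, spreads, and weights are invented for the example, but it shows the kind of two-humped curve that results when you hedge between two scenarios:

```python
# A minimal illustration of a "multimodal" prediction as a weighted mixture
# of two distributions. Generic numpy/scipy, not Foretold syntax; all the
# numbers are made up for the example.
import numpy as np
from scipy import stats

x = np.linspace(0, 10, 500)  # a rough 0-10 credence scale

# Scenario A: further research mostly undermines the claim (credence near 3).
# Scenario B: further research vindicates it (credence near 8).
low = stats.norm(loc=3, scale=0.8)
high = stats.norm(loc=8, scale=0.6)

# A 70/30 hedge between the two scenarios; the weights must sum to 1.
mixture_pdf = 0.7 * low.pdf(x) + 0.3 * high.pdf(x)

# The resulting curve has two peaks, which is exactly why these graphs are
# harder to read at a glance than a single bell curve.
```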

Even more unfortunately, WordPress, which has happily accepted these screenshots in the past, will not tolerate them now. So to see my claims and conclusion, please continue to the Google Doc. Please let me know what was useful, useless, or high-friction for you. I’m especially interested in how comprehensible/usable my Roam database is.

Many thanks to my Patreon patrons and Parallel Forecast for financial support for this post.