Last year, in my end-of-year review, I wrote:
What is the plan for 2021? The year will start with some posts in the Fund People, not Projects series that I just started. Having an answer to "What's the best way to structure science" would be nice. Maybe by the end of year we'll have that answer. I am also working on some longevity-related stuff but that will take a while to be made public.
Indeed, 2021 started with that: I added a few more entries to the Fund People, not Projects series. Some of that work ended up leading to two essays elsewhere, one for the Tony Blair Institute–The Entrepreneurs Network joint report The Way of the Future and another for Works in Progress. The former expands on an idea from the latter: the fact that we don't have "good" evidence that many proposals to fix science "work", and so we should run some randomized controlled trials. I'm now less sure of this.
Science funding: diversity and randomization
When I wrote it I already acknowledged some obvious critiques: sure, measuring what works in science is hard; this wouldn't be like a clinical trial, where there is a crisply defined endpoint; rather, there is a varied range of metrics for measuring success. At the same time, I acknowledged that we don't need to RCT all the things (That would be engaging in the same flawed thinking that led to the claim that “there is no evidence that masks work”).
I think, however, that I didn't pay enough attention to the costs and benefits that a hypothetical funder would face when implementing these policies, nor to how hard it actually is to run these comparisons. Regarding costs and benefits: a funder that is allocating science funding in any one given way is implicitly asserting that their method does better than chance, but they are not doing this on the basis of past RCTs, rather on a mix of all sorts of pieces of knowledge that someone else may weigh differently. For example, I came to the conclusion that top science is done by a small elite in disproportionate amounts, and that this is probably not due to a Matthew effect. Conversely, there is a current of thought within NIH that wants to spread funding more widely, because they think that concentrating funding in elite labs won't lead to more or better science. There is also the position that everything should be a funding lottery (because radical uncertainty or something, but here we don't believe in that). But in that case, as a funder, you would be facing the choice between funding what you think is best, or running an RCT that would allocate funding to a funding mechanism you don't quite believe in.

To be more concrete, the recently launched Arc Institute is making a bet on a very specific (HHMI-inspired) model of funding and doing science. Should the funders have instead started two institutes, with the second one doing something different altogether? The problem here is that a) knowing whether this second institute was actually worth it would take decades, and b) it would be very hard to compare both institutes and say that one did substantially better than the other. There is a reasonable case to be made for both! The particular managerial talent involved may matter more than the nominal charter that sets the course of each entity, making it even harder to compare both models fairly.
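To make point (b) concrete, here is a toy simulation with entirely made-up numbers (the lognormal yearly "output" distribution and the true ~20% edge for one institute are my assumptions for illustration, not data from the post): when research outputs are noisy and heavy-tailed, even a genuinely better institute quite often just looks worse, and the comparison stays murky over horizons of decades.

```python
# Toy model (made-up numbers): how often would the truly better of two
# research institutes also *look* better, after a given number of years?
import random
import statistics

random.seed(0)

def yearly_output(mu: float) -> float:
    # Research output is heavy-tailed: a few big hits dominate.
    return random.lognormvariate(mu, 1.0)

def prob_better_looks_better(years: int, trials: int = 2000) -> float:
    """Fraction of simulated histories in which the institute with a
    true ~20% edge (mu=0.2 vs 0.0) also has the higher observed mean."""
    wins = 0
    for _ in range(trials):
        baseline = [yearly_output(0.0) for _ in range(years)]
        better = [yearly_output(0.2) for _ in range(years)]
        if statistics.mean(better) > statistics.mean(baseline):
            wins += 1
    return wins / trials

for years in (5, 10, 30):
    print(years, round(prob_better_looks_better(years), 2))
```

Even after 30 simulated years, under these assumptions the better institute shows up on top well short of 100% of the time, which is the sense in which the comparison is hard.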
Randomization is a way to gain information, but one needs ideas to allocate funds to. Starting novel institutions is also a way to generate information. Arc wouldn't have taken the form it took without HHMI and its Janelia campus, or the Broad Institute. Sure, there isn't incontrovertible econometric evidence that HHMI in particular leads to better research relative to a counterfactual, but if one puts together various strands of evidence (Something that requires judgement, not mechanical aggregation of results in a meta-analysis), then that conclusion starts to make more sense. When I proposed a draft for a hypothetical funding institution, its three-pronged design (May be found at the bottom of this) reflected some beliefs: the importance of funding "people, not projects" (Parts A and B), that focused, engineering-heavy efforts are also required (Part C), that some randomization is good (Part A), and that it may be good to free up a small intellectual elite to pursue very long term projects without worrying about grant renewals, and without having to be extremely detailed about what they want to do (Part B). I didn't spend a lot of time thinking about this structure, and I wouldn't necessarily do it this way if I were to do it tomorrow, but it serves as an illustration of how beliefs influence institutional design; i.e. I am certain enough of some things in science funding to propose a concrete structure vs randomizing everything and trying to learn from that.
The Works in Progress essay points to one benefit of randomizing: that it would introduce more diversity:
There is one more argument in favor of trying more things out through this experimental approach: it will increase the diversity of funding mechanisms available at any given time. By most measures, the US innovation ecosystem is the world’s leading engine of technical and scientific progress. Part of this success may be due to the diversity of funding: rather than coordinating or planning the entire nation’s scientific investments centrally, the US historically has enabled a menagerie of entities to thrive, from philanthropies, privately-run federally funded research centers, to university and industrial labs. This makes it easier for a given researcher to find a research home that suits her and her ideas. Diversity could be further pursued: a large agency like NIH or one of its member institutes like the National Cancer Institute could be split into two or more funding mechanisms internally, and their performance could be assessed every few years.
But one does not really need to randomize to get this, one can just go and start institutions. These shouldn't strive to be maximally diverse just for the sake of diversity: I think some competition is good.
Longevity: the Rejuvenome
Last year's "I am also working on some longevity-related stuff but that will take a while to be made public" ended up being made public in a series of tweets in August (More here, here). In 2020 the Rejuvenome was just a piece of paper; now it has become a project at the Astera Institute. Seeing a name I coined come to refer to something tangible has been great!
Besides these main themes, I wrote two posts on wildfires in California, which got condensed into a single piece for a16z Future. Out of that, I got some emails offering me jobs in wildfire-related startups (!), a big tech company used the posts in an internal slide deck to help justify a wildfire-related project, and, I hope, everyone that read the piece got better informed about the wildfire situation in California. As I say in the articles, there is a rising trend, but the trend is quite noisy: while the 2021 fire season was poised to be the worst ever, that didn't happen; wildfires in 2021 devastated 0.6x the area they did in 2020.
Tools for thought
I wrote an essay on what a better Google Scholar would look like. In the process I found a large number of failed attempts at building tools (in the context of the essay, search engines and social annotation) for scientists. In other contexts, lots of seemingly well-funded work failed to translate into useful tools, though, as Michael Nielsen points out, some things do stick. What sticks and what doesn't? Nielsen and Andy Matuschak's own Quantum Country is a beautifully designed website and an experiment in a novel way to teach: embedding carefully curated SRS prompts in the body of the essay. Yet these essays remain rare (Two examples in the wild are my own essay on whether scientists' performance drops with age and David Chapman's Maps, the territory, and meta-rationality). What kinds of tools for thought exist? (Surely note-taking apps, SRS prompts, hypothes.is, and the idea of retweeting deserve their own subcategory!) Thought includes memory, but also idea generation, improved information acquisition, richer understanding, and more.
Learning: reverse engineering how the brain learns vs leveraging it
A while back I wrote a review of Bloom's two sigma problem: the finding that one-on-one tutoring is extremely effective, the problem being how to achieve that at scale. The way this has been approached is to try to reverse engineer tutors: by building models of how students learn, and of how best to give them cues to help them overcome specific problems they may have ("Intelligent Tutoring Systems", ITS), the hope is to solve this problem; but for now it remains unsolved.
A substantial chunk of the literature I reviewed there focuses on educating children, with some particular focus on children with learning disabilities. The most promising case study I discuss there is the DARPA Digital Tutor, but again, for now it remains a one-off curiosity. The reason is that to build effective ITSs one needs to find domain experts and distill their knowledge, and this is notoriously hard. Experts often don't know how much they know, so they may skip or gloss over details that are obvious to them but not to beginners. Sometimes they won't even be able to explain their actions! An example is this fire commander who says that "I don't make decisions": he has internalized the domain of firefighting so much that acting on well-honed instinct is all he needs. The ITS approach would be to sit down with the fire commander and extract said knowledge, and this works to some extent.
But there is another way, the one described in my Scaling tacit knowledge essay: by being exposed to a large library of examples (e.g. multiple hours of filmed real-world firefighting operations), one may be able to learn without having the domain explained. Ideally, of course, one would have both explanations and "show, don't tell" videos. People like me who are STEM-brained, when given a problem, take some joy in finding solutions to it without stopping as much as we ought to, to ask if the problem is the right one. Rather than asking how to model knowledge acquisition and using that to build a tutor, leave it up to the brain to make sense of the library of examples: knowledge distillation from examples is a problem our brain naturally solves!
I've started recording myself reading through papers, and in the essay itself I point at a test of the usefulness of this perspective on learning: One could just build that library of examples and see how students fare. In chess and language learning this already seems to work!
I also recorded a lot of podcasts, the most in a single year, covering mostly meta-science and longevity.
And co-organized the Bottlenecks 2021 conference.
One development during 2021 is that I now think biology can be understood more than I initially thought (for reasons at the last link).
What I will be doing in 2022
2022 looks less clear than 2021! Unlike last year there is no ongoing series of posts that calls for a continuation. I expect I'll be doing something in the tacit knowledge/video direction, and maybe I'll publish more aging-related essays.
I'll still be based in the Bay Area for the foreseeable future (with a planned visit to Miami and then Boston in January), that is not changing.