Back in 1945, Vannevar Bush wrote Science, The Endless Frontier. The monograph aims to justify the establishment of what would later become the NSF (although it was not the first time such a thing had been proposed), and Bush even proposes the exact structure and budget it should have.

But, despite the title, the essay does not talk at all, implicitly or explicitly, about science being an endless frontier. There is talk of science being an open frontier ripe for exploration, but that's as close as it gets.

Science would be an endless frontier if scientific knowledge could be accreted ad infinitum. This is the case for culture: Despite there being a bounded set of building blocks of culture, there are so many ways to recombine plots, characters or media that we will have new novels, films, and anime for millennia to come. And despite many of those bearing some resemblance to past work, we will continue enjoying them. Thus culture is a true endless frontier.

For Science one could make a case, where

science is an endless frontier, where there are always new phenomena to be discovered, and major new questions to be answered. The possibility of an endless frontier is a consequence of an idea known as emergence. Consider, for example, water. It’s one thing to have equations describing the way a single molecule of water behaves. It’s quite another to understand why rainbows form in the sky, or the crashing of ocean waves, or the origins of the dirty snowballs in space that we call comets. All these are “water,” but at different levels of complexity. Each emerges out of the basic equations describing water, but who would ever have suspected from those equations something so intricate as a rainbow or the crashing of waves? [..] So the optimistic view is that science is an endless frontier, and we will continue to discover and even create entirely new fields, with their own fundamental questions. If we see a slowing today, it is because science has remained too focused on established fields, where it’s becoming ever harder to make progress. We hope the future will see a more rapid proliferation of new fields, giving rise to major new questions. This is an opportunity for science to accelerate.

rather than

science—the exploration of nature—as similar to the exploration of a new continent. In the early days, little is known. Explorers set out and discover major new features with ease. But gradually they fill in knowledge of the new continent. To make significant discoveries explorers must go to ever-more-remote areas, under ever-more-difficult conditions. Exploration gets harder. In this view, science is a limited frontier, requiring ever more effort to “fill in the map.” One day the map will be near complete, and science will largely be exhausted. In this view, any increase in the difficulty of discovery is intrinsic to the structure of scientific knowledge itself.

Here at Nintil I said that

Under the assumption of bounded useful (i.e. productivity enhancing) knowledge, useful knowledge would be like a mine. There would be a fixed amount, and exploiting some of it would initially lead to faster returns, but then over time, as the mine gets depleted, one has to work harder and harder to get more ideas out of it. If at the beginning you are not learning or you are not learning much, the initial growth rate is zero. As your knowledge grows, ideas beget new ideas and the growth rate increases. After this point, the growth rate could either continue increasing, stabilise, or decrease in a similar way as it went up. In the two first cases, growth will continue indefinitely, but because we have assumed bounded knowledge, it has to be the latter. This naturally then leads to sigmoid-like models. [...]

if one takes a technology at any given point in time, and one assumes that current techniques will be used in the future, then one is constrained by the limitations of those techniques: One would be setting the amount of gold in the mine to be that allowed by current technology. But technology evolves. A similar phenomenon happened a few years ago in US oil production. The production curves of oil fields also tend to be logistic, thus the rate of extraction per year is broadly a Hubbert's curve. People predicted an imminent Peak Oil, which would have catastrophic consequences, etc. But then shale oil happened. At the end of the day, however, the underlying prediction is still correct. If one takes away shale oil from the curve, one still sees a decline. The reason that peak suddenly appears out of nowhere is that a "new mine" opened up: the technology that allows for economically profitable fracking matured enough to be used and used it was. Thus, instead of modeling the rate of growth of TFP as one big Hubbert's curve (or similar), we could model it as a succession of curves, each representing a new technology. Discovering a new technology is adding a new term to the model, a term that didn't make sense before.

So there are two questions to be addressed. The first is how scientific advance is structured. Here I think there will probably be agreement: A given field or paradigm will yield a sigmoid-like pattern and will eventually be exhausted. New fields can then pick up and keep the pace of improvement, and so a stacking of sigmoids can sustain continuous progress.
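The stacking of sigmoids can be made concrete with a small sketch. The code below is only an illustration: it assumes each field follows a logistic curve, and the three fields with their ceilings, rates, and midpoints are made-up numbers, not estimates of anything.

```python
import math

def logistic(t, ceiling, rate, midpoint):
    """Cumulative knowledge extracted from one field at time t (logistic curve)."""
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))

def stacked_knowledge(t, fields):
    """Total knowledge at time t: the sum of one logistic curve per field."""
    return sum(logistic(t, *field) for field in fields)

def growth_rate(t, fields, dt=1e-3):
    """Central-difference estimate of how fast total knowledge grows at time t."""
    ahead = stacked_knowledge(t + dt, fields)
    behind = stacked_knowledge(t - dt, fields)
    return (ahead - behind) / (2 * dt)

# Hypothetical fields as (ceiling, rate, midpoint); each new field "opens a new mine".
FIELDS = [(1.0, 0.5, 10.0), (1.0, 0.5, 30.0), (1.0, 0.5, 50.0)]

if __name__ == "__main__":
    for t in (10, 20, 30, 40, 50):
        print(t, round(stacked_knowledge(t, FIELDS), 3), round(growth_rate(t, FIELDS), 4))
```

With these numbers the growth rate dips between fields and recovers as each new one takes off, whereas a single field on its own would simply flatten out. Whether the recoveries keep coming is exactly the second question.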

[1]. There are also fields that seem to continue accreting knowledge at a good pace, yet it is not knowledge I would think is as useful as that of other fields. Advances in astronomical equipment have enabled the detection of, say, an increasing number of exoplanets, but it's hard to make a case for its usefulness, especially relative to alternative investments.

The second question is about what science looks like in general. Or perhaps general is not the right framing. As Patrick and Michael note in their essay, some fields seem more inexhaustible than others. Importantly, the inexhaustibility I care about is that of useful knowledge; of course one can keep proving mathematical theorems forever but that won't by itself stop aging[1]. Computer Science is an example they note where we have had lots of progress over the last decades, without signs of it stopping. On the other hand we have particle physics, which after completing the puzzle of the Standard Model with the Higgs Boson does not seem like a healthy field.

The right view is that it is a spectrum. The two factors that seem to influence how inexhaustible a field is are the simplicity of the object of study and how removed one is from it[2]. The simpler and closer one gets, the more constrained it is. In biology for example one can zoom in on a protein. A closer look reveals the atoms that make up the protein and how they are connected together, or what different conformations it can adopt. Once we know a few facts that "field" is closed and exhausted: We know everything there is to know about said protein in isolation. But then we add protein-protein interactions, water, and transcription pathways and suddenly we have a combinatorial explosion of possibilities. Add then multiple tissues and organs and there is enough material to keep us busy for a while.

[2]. One could think of this as the density and volume of complexity.

The difference between the two models of scientific progress is chiefly one of mood: How much progress in general remains? So much that it's in practice endless, or a lot, but we are starting to run up against its limits? I admit I lean towards the bounded view because the former sounds overly optimistic.

Once one has admitted that it's more of a spectrum, the question shifts to where different fields fall on that spectrum. I've said in the past that physics is over, but that's a broad claim (I'm really talking about some subfields in particular). Could this argument be made more rigorous? This would be of interest for allocating scientific funding.

Back in 1998, Bill Clinton said

"I do believe that in scientific terms, the last 50 years will be seen as an age of physics and an age of space exploration," Clinton said. "I think the next 50 years will very likely be characterized predominantly as an age of biology and the exploration of the human organism, especially with the completion of the human genome project, which I think will literally explode what we know about how to deal with health issues."

Now maybe the HGP was overhyped, but the point that we transitioned from the age of physics to the age of biology as the core engine of progress seems remarkably prescient. I do think that having a National Institute for Particle Physics take up the funding, instead of doubling the NIH's own, would have been a worse outcome. But other cases are not as clear. What about a National Institutes for Energy and Materials Research (with the same budget)? Maybe it doesn't make sense on closer examination, but one would need some numbers to even start considering the decision.

To quantify how close a field is to ending, or conversely how promising it is, some indicators come to mind, each with their own problems:

| Indicator | Advantage | Problem |
| --- | --- | --- |
| Number of publications | Easy to measure | Volume doesn't make for relevance |
| Number of researchers in the field | Easy to measure | Just because it's popular doesn't mean it matters |
| Funding the field gets | Easy to measure, a measure of how much funding bodies think the field is worth it | Science funding has inertia and is enmeshed in politics, so it is not directly connected to expectations |
| Subjective sense among researchers that progress is being made or not | Easy to measure | Biased responses |
| Increase in the time between breakthroughs in the field | Directly related to relevant knowledge production and how hard it is to get | Hard to determine what is a breakthrough; a field can make steady progress without key step changes |
| Patents citing fewer papers from that field | Direct measure of how relevant knowledge is to inventions | Harder to use for basic science |
| Rhetoric shift in calls for funding from "Fund us because it is useful" to "Fund us because knowledge is intrinsically valuable" or "Fund us because we did great in the past" | Indirect but useful metric of researchers' true beliefs about their discipline | Not always a good heuristic (See Deep Learning). Hard to sample. |
| Number of research paths being explored | Direct measure of field exhaustion, especially if restricted to experimentally testable paths | Hard to measure and compare; there may be hundreds of promising drugs to cure cancer, but only 6 ways to do brain-computer interfaces |

Ok, suppose we somehow manage to compile the above for two arbitrarily defined fields, then what? One could perhaps just take the metrics as input to a function that outputs shares adding up to 100%, like softmax, or just take each field's proportion of the total for the metric of interest.
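A minimal sketch of those two options, to show how differently they behave. The field names and scores are made up, and the `temperature` parameter is my own addition for illustration, not something proposed above.

```python
import math

def proportional_shares(scores):
    """Each field's share is its score divided by the total (scores assumed positive)."""
    total = sum(scores.values())
    return {field: value / total for field, value in scores.items()}

def softmax_shares(scores, temperature=1.0):
    """Softmax over the scores; a lower temperature concentrates more on the leader."""
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {f: math.exp((v - top) / temperature) for f, v in scores.items()}
    total = sum(exps.values())
    return {field: e / total for field, e in exps.items()}

# Hypothetical composite scores for two arbitrarily defined fields.
SCORES = {"field_a": 3.0, "field_b": 1.0}

if __name__ == "__main__":
    print(proportional_shares(SCORES))
    print(softmax_shares(SCORES))
```

One caveat: softmax depends on the absolute scale of the scores, so the same two fields get different shares if the metric is measured in units versus thousands of units. The plain proportion does not have that problem, which may be a reason to prefer it unless the metrics are normalised first.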

A problem with the whole approach is that one has to define what fields are and which field discoveries, funding, or researchers fall into. What is to be mapped is nebulous and there are probably many reasonable answers to these problems. One way to overcome this is to do something like an exponentially weighted average of contributions to a discovery. So if discovery X happened in genomics but was enabled by Y in computer science, then X would be imputed to both genomics and CS, with some decay ratio as one gets further away from the discovery.
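As a rough sketch of that imputation, assume the enabling work can be laid out as a simple chain ordered by distance from the discovery, and that credit falls off geometrically with distance. The function name, the chain representation, and the decay of 0.5 are all assumptions made for illustration.

```python
from collections import defaultdict

def impute_credit(chain, decay=0.5):
    """Split credit for one discovery across the fields that enabled it.

    chain[0] is the field where the discovery happened; chain[i] is the field
    of the enabling work i steps removed. Weights decay geometrically with
    distance and are normalised so the credit for the discovery sums to 1.
    """
    weights = [decay ** distance for distance in range(len(chain))]
    total = sum(weights)
    credit = defaultdict(float)
    for field, weight in zip(chain, weights):
        credit[field] += weight / total
    return dict(credit)

if __name__ == "__main__":
    # Discovery X in genomics, enabled by Y in computer science.
    print(impute_credit(["genomics", "computer science"]))
```

With a decay of 0.5, genomics gets two thirds of the credit for X and computer science one third. The choice of decay is doing a lot of work here, and I don't know of a principled way to pick it.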

Take this cool paper: In vivo amelioration of Age-associated hallmarks by partial reprogramming (Ocampo et al., 2016) from the Izpisua-Belmonte lab at the Salk Institute. The point of the paper is to take genetically engineered mice that express the OSKM transcription factors when dosed with doxycycline (an antibiotic), which in turn shows rejuvenation across various metrics. To get to this development, the paper itself refers us to the following enabling discoveries

  1. Induced pluripotent stem cells + The OSKM/Yamanaka factors
  2. Knowledge that epigenetic dysregulation is involved in aging
  3. The development of Tetracycline-controlled transcriptional activation
  4. Past work on OSKM in mice that unfortunately led to tumorigenesis
  5. Past work on OSKM in vitro

These are the ones that come to mind, but one could go as far as making every citation into a point in this list, and perhaps that's a way: just crawl N levels through the citation tree assigning scores. Fields may not be easily identifiable, but departments and scientists are. That's an interesting research project that I can now do with the Semantic Scholar dataset (and a suitable graph database): For a given paper, estimate the scores of everyone who contributed at N levels of depth in the tree. That way, for a given set of agreed breakthroughs, one can run this algorithm on all of them, group by author, department or field and get a ranked list of contributions. In turn that can be used to distribute funding. I wonder if this would coincide with our priors about what ought to be funded.
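To make the crawl concrete, here is a minimal sketch over a toy in-memory citation graph. It does not touch the Semantic Scholar dataset or any graph database. The paper IDs, author names, the decay of 0.5, and the choice to split a paper's score evenly among its authors are all assumptions for illustration.

```python
from collections import defaultdict, deque

def score_contributions(root, cites, authors, max_depth=2, decay=0.5):
    """Breadth-first crawl of the citation tree, up to max_depth levels from root.

    Each paper is scored decay ** depth, using its shortest distance from root,
    and a paper's score is split evenly among its authors. Returns a list of
    (author, score) pairs ranked from highest to lowest score.
    """
    paper_score = {}
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        paper, depth = queue.popleft()
        paper_score[paper] = decay ** depth
        if depth == max_depth:
            continue  # do not crawl below the requested depth
        for ref in cites.get(paper, []):
            if ref not in seen:
                seen.add(ref)
                queue.append((ref, depth + 1))

    author_score = defaultdict(float)
    for paper, score in paper_score.items():
        names = authors.get(paper, [])
        for name in names:
            author_score[name] += score / len(names)
    return sorted(author_score.items(), key=lambda pair: pair[1], reverse=True)

# Toy data: P0 is the breakthrough; it cites P1 and P2, which both cite P3.
CITES = {"P0": ["P1", "P2"], "P1": ["P3"], "P2": ["P3"]}
AUTHORS = {"P0": ["A"], "P1": ["B"], "P2": ["B", "C"], "P3": ["D"]}

if __name__ == "__main__":
    for author, score in score_contributions("P0", CITES, AUTHORS):
        print(author, score)
```

Grouping by department or field instead of author would only change the `authors` mapping. Running it over a set of agreed breakthroughs would mean summing the per-author scores across roots.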

There should be literature on this problem already, but I wanted to focus on a different topic before going into that.