AI, Agents, Night Science
This week: AI productivity, data team struggles, agents that matter, and night science (hot hot hot)!
Artificial Intelligence, Scientific Discovery, and Product Innovation
AI assistance is making the best researchers better…but at a cost?
“AI-assisted researchers discover 44% more materials, resulting in a 39% increase in patent filings and a 17% rise in downstream product innovation. These compounds possess more novel chemical structures and lead to more radical inventions. However, the technology has strikingly disparate effects across the productivity distribution: while the bottom third of scientists see little benefit, the output of top researchers nearly doubles. Investigating the mechanisms behind these results, I show that AI automates 57% of “idea-generation” tasks, reallocating researchers to the new task of evaluating model-produced candidate materials…Survey evidence reveals that these gains come at a cost, however, as 82% of scientists report reduced satisfaction with their work due to decreased creativity and skill underutilization.”
Toner-Rodgers, A. (2024). Artificial Intelligence, Scientific Discovery, and Product Innovation.
https://aidantr.github.io/files/AI_innovation.pdf
Data teams struggle
1 in 4 struggle with inter-team comms, 1 in 5 with conflicting priorities, and 2.5% report that nobody is primarily responsible for designing and maintaining data and ML pipelines.
“The study, conducted in September 2024, gathered insights from over 200 participants across various organizations. The results highlight the critical hurdles teams face in building, orchestrating, and maintaining data and ML pipelines at scale.”
“24.5% of respondents struggle with communication breakdowns across different teams.
20.6% reported conflicts due to competing priorities between teams.”
AI Agents That Matter
The benchmarking problem that plagued ML in the 2010s persists…only now with AI agents.
“AI agent benchmarking is new and best practices haven’t yet been established, making it hard to distinguish genuine advances from hype. Our thesis is that agents are sufficiently different from models that benchmarking practices need to be rethought. We have taken the first steps toward a principled approach to agent benchmarking, resulting in recommendations including cost-controlled comparisons, separating model and downstream evaluation, preventing shortcuts using appropriate hold-outs, and greater standardization of evaluation practices.”
Kapoor, S., Stroebl, B., Siegel, Z. S., Nadgir, N., & Narayanan, A. (2024). AI agents that matter. arXiv preprint arXiv:2407.01502.
https://arxiv.org/abs/2407.01502
Night Science
Rourrrrrrrrrrrrrr.
“Night science is where we explore the unstructured realm of possible hypotheses, of ideas not yet fully fleshed out. In day science, we falsify hypotheses and observe which are left standing; in night science, we create them.”
“We were not sure exactly what we were looking for, but we were sure something interesting was hiding in that data. Multiple times during the day, we would come up with a plan for a new or modified analysis. We each coded it, one of us in Matlab, one in R. When we would compare notes, we almost inevitably first had different answers, and so we each debugged and compared again until our answers converged. What did the answer mean?”
“We kept jumping to the question: what was it, precisely, that we were asking?”
“But one can pop out to the much more open night science world, where leaps among ideas are made possible by intuition, associative thinking, unexplained observations, and loosely applied principles from other fields. When a new idea has been generated, one can pop back into the day below and test it efficiently using day science methods.”
“By the time the artist knows what she wants to paint, a majority of the creative process may already have happened.”
Yanai, I., & Lercher, M. (2019). Night science. Genome Biology, 20, 1-3.
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1800-6
Reader Feedback
Got time for three questions on a janky Google Form?
Each issue, we’ll compare the survey results with what my agents predicted so we can all have a good laugh.
Footnotes
I’ve started this newsletter wide, and I’ll let the feedback nudge what is shared here one way or another. Thank you for your attention.