{"found":50496,"hits":[{"document":{"authors":[{"contributor_roles":[],"family":"Marcum","given":"Christopher Steven","url":"https://orcid.org/0000-0002-0899-6143"}],"blog":{"authors":null,"community_id":"8bdb1ae7-4621-4fa5-ad1a-3a639417dfd5","created":1768694400,"current_feed_url":null,"description":"Perspectives on science, data, and technology that don't fit anywhere else.","favicon":"https://rogue-scholar.org/api/communities/8bdb1ae7-4621-4fa5-ad1a-3a639417dfd5/logo","feed_format":"application/atom+xml","feed_url":"http://chrismarcum.com/marcum-blog/feed.atom","filter":null,"generator":"Jekyll","home_page_url":"https://www.chrismarcum.com/marcum-blog/","issn":null,"language":"eng","license":"https://creativecommons.org/licenses/by/4.0/legalcode","prefix":"10.59350","relative_url":null,"secure":true,"slug":"chrismarcum","status":"active","subfield":"3312","title":"Open Evidence","updated":1781278875,"use_api":null},"blog_name":"Open Evidence","blog_slug":"chrismarcum","content_html":"<p>Late last month, the US Census Bureau released a really cool new data product: the <a href=\"https://www.census.gov/data/experimental-data-products/lace.html\">Local Air Conditioning Estimates</a> or LACE. An experimental data product, LACE provides insights into air conditioning prevalence across the United States. This new data product is significant because it fills a critical information gap in our knowledge about energy use potential and vulnerability to extreme heat.</p>\n<p>One of the things I'm really excited about is the underlying methodology. The LACE estimates were derived using cross-survey modeling and leveraged machine learning to integrate detailed housing data from the American Housing Survey (AHS) with the comprehensive geographic coverage of the American Community Survey (ACS). Census is innovating here!</p>\n<p>I am still a gerontologist at heart and I know one of the major perennial issues elders face each year is the challenge of summer heat. 
Summer heat is especially dangerous for older adults because aging bodies lose several of the systems that normally protect people from overheating. I merged the new LACE estimates with ACS 5-year data regarding the population aged 65 and older at the county-level. By combining these datasets, we can visualize the distribution of households without air conditioning alongside the concentration of older residents. The code <a href=\"https://github.com/cmarcum/talks-and-posts/tree/main/2026-06-12-LACE-and-Age\">is available here</a> (and requires a free Census API key).</p>\n<p>The map below visualizes these metrics by representing the percentage of occupied households without air conditioning. Darker tones indicate a higher proportion of homes lacking cooling systems. If you hover over a county with your cursor (or finger if you're on a mobile device), a pop-up will display the percentage of households without AC and the percentage of the local population aged 65 or older. While I did not look at the bivariate correlation between the two, one thing I did notice in the viz is Appalachia looks particularly exposed due to its combination of high elder population and low AC coverage. 
It can get HOT in them hollers (today's <a href=\"https://www.wpc.ncep.noaa.gov/heatrisk/\">heat index is extreme</a> in many of those places).</p>\n<div class=\"map-container\" style=\"margin: 20px 0;\">\n<iframe height=\"600px\" src=\"/marcum-blog/assets/leaflets/lace_o65_map.html\" style=\"border: none; border-radius: 8px; box-shadow: 0 4px 8px rgba(0,0,0,0.1);\" title=\"Choropleth / Heatmap of County-Level Percent of Households without Air Conditioning\" width=\"100%\">\n</iframe>\n</div>","doi":"https://doi.org/10.59350/d0nmw-5wf08","guid":"https://www.chrismarcum.com/marcum-blog/2026/06/12/LACE-and-Age","language":"en","license":"https://creativecommons.org/licenses/by/4.0/legalcode","published_at":1781222400,"rid":"9h3az-r0n56","summary":"Late last month, the US Census Bureau released a really cool new data product: the Local Air Conditioning Estimates or LACE. An experimental data product, LACE provides insights into air conditioning prevalence across the United States. This new data product is significant because it fills a critical information gap in our knowledge about energy use potential and vulnerability to extreme heat.","tags":["General","Open Data","Government"],"title":"A Really `Cool` New Data Set from Census","updated_at":1781280738,"url":"https://www.chrismarcum.com/marcum-blog/2026/06/12/LACE-and-Age.html","version":"v1"}},{"document":{"authors":[{"affiliation":[{"id":"https://ror.org/013meh722","name":"University of 
Cambridge"}],"contributor_roles":[],"family":"Madhavapeddy","given":"Anil","url":"https://orcid.org/0000-0001-8954-2428"}],"blog":{"authors":null,"community_id":"472a49be-dc61-4a17-97f0-d1ff17b0dadd","created":1760313600,"current_feed_url":null,"description":null,"favicon":"https://rogue-scholar.org/api/communities/472a49be-dc61-4a17-97f0-d1ff17b0dadd/logo","feed_format":"application/feed+json","feed_url":"https://anil.recoil.org/perma.json","filter":null,"generator":"Other","home_page_url":"https://anil.recoil.org/notes","issn":null,"language":"eng","license":"https://creativecommons.org/licenses/by/4.0/legalcode","prefix":"10.59350","relative_url":null,"secure":true,"slug":"anil","status":"active","subfield":"1702","title":"Anil Madhavapeddy's feed","updated":1781222400,"use_api":null},"blog_name":"Anil Madhavapeddy's feed","blog_slug":"anil","content_html":"<p><a href=\"https://www.cst.cam.ac.uk/people/zf281\">Frank Feng</a> and I <a href=\"https://geotessera.org/blog/2026-06-09-tessera-v1-1\">announced TESSERA v1.1</a>\non behalf of <a href=\"https://geotessera.org/about#:~:text=for%2520Science%2520%C2%B7%2520Isambard-,People,-Lead%2520Faculty\">the team</a> earlier this week, and I wanted to follow up here with a more\nvisual explanation of what changed as I got quite a few questions about it!</p>\n<p>v1.1 is a retrained successor to the <a href=\"https://anil.recoil.org/papers/2025-tessera\">original v1.0 model</a> that\n<a href=\"https://www.cst.cam.ac.uk/people/zf281\">Frank Feng</a> and the team have been hammering on for months. Crucially, since we\npre-generate embedding 'map tiles', the new release is a drop-in replacement if\nyou just swap tiles; the basic format of 128 dimensions is unchanged.  
Accuracy\nof your tasks should improve in all cases (a trend which will continue as we\ntrain better models with more data and training FLOPS).</p>\n<h2 id=\"fewer-artefacts-in-low-observation-areas\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/tessera-v11-out/#fewer-artefacts-in-low-observation-areas\"></a>Fewer artefacts in low observation areas</h2>\n<p>Tessera v1.0 could sometimes produce noisier tiles in regions with few clear\nsatellite observations (e.g. due to persistent cloud or satellite sensor gaps).\nThis showed up as boundary-like seams in the tiles where the inferred\nembeddings didn't quite align; e.g. along Sentinel-1 ascending/descending\ncoverage edges where one side of the line might have ~50 valid observations and\nthe other ~150.</p>\n<p>Tessera v1.1 now handles both sparse and imbalanced observation patterns\ngracefully! If your region of interest was small and didn't straddle a\nproblematic tile you'll see no difference, but large-scale analyses should get\ncleaner.</p>\n<p>The easiest way to see all this is to look at the embeddings themselves in the\n<a href=\"https://tze.geotessera.org\">TZE explorer</a>. In this video I flip between the\nv1.0 and v1.1 embeddings over the same regions, visualised in false colour:</p>\n<p><div class=\"video-center\"><iframe allowfullscreen=\"\" frameborder=\"0\" height=\"315px\" sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\" src=\"https://crank.recoil.org/videos/embed/297de7c9-9cea-4051-8b27-041fffa90e72\" title=\"Tessera 1.0 to 1.1 embeddings\" width=\"100%\"></iframe></div></p>\n<p>What you're looking at in the v1.0 layer are the grid-like seams running\nthrough an otherwise homogeneous landscape (Ireland doesn't really have those\njagged lines, you can confirm by visiting my lovely home).</p>\n<p>What's happening is that the number of valid observations jumps across the\nline, and the old v1.0 model let that difference show up in the embeddings. 
The\nspeckly patches are areas where persistent cloud left the model with too few\nclean observations to produce a stable representation.</p>\n<p>We then switch to the v1.1 layer, and the seams are gone and the noisy patches\nresolve into a smooth structure that follows the actual land cover. It's <em>very</em>\nsatisfying to click around the 10m\u00b2 pixels and watch embeddings that used to\nflicker between years settle down into stable trajectories in <a href=\"https://tze.geotessera.org\">the explorer</a>!</p>\n<h2 id=\"temporal-stability\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/tessera-v11-out/#temporal-stability\"></a>Temporal stability</h2>\n<p>If you're doing analysis over a long period of time, then the 128-dimensional\nembeddings are now much more consistent year-on-year for the same location.\nThis is a big deal for tasks like change detection, trend analysis, and even\njust convenience since training a classifier on one year and applying it to\nanother is now much more accurate.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/tessera11-temporal-drift.webp\" title=\"Differences in the same region across years with Tessera v1.0 and v1.1 (credit: Jovana Knezevic)\"/></p>\n<p>This feature won't affect most users,\nbut we're pretty pleased with how well change detection now works.</p>\n<h2 id=\"expanded-coastal-coverage-worldwide\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/tessera-v11-out/#expanded-coastal-coverage-worldwide\"></a>Expanded coastal coverage worldwide</h2>\n<p>The v1.0 land mask we used to mask out ocean areas was too aggressive, and\ndropped legitimate land pixels along coastlines or on small islands. 
We've\nlistened to our <a href=\"https://www.plantsci.cam.ac.uk/staff/dr-thomas-worthington\">mangrove-loving friends</a> and extended the inference\nbuffer to 20km, which brings coastlines and remote islands properly into\ncoverage.</p>\n<p><a href=\"https://tze.geotessera.org/?store=v1.1\"> <img alt=\"%c\" src=\"https://anil.recoil.org/images/tze-explorer-v1.1-ss-1.webp\" title=\"The green false colour is the expanded coastal tiles, which now captures all of the UK including islands\"/> </a></p>\n<h2 id=\"our-coverage-maps-now-include-v10-and-v11\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/tessera-v11-out/#our-coverage-maps-now-include-v10-and-v11\"></a>Our coverage maps now include v1.0 and v1.1</h2>\n<p>I updated the <a href=\"https://ucam-eo.github.io/tessera-coverage-map/\">live coverage map</a> to now\ntrack both generations side-by-side, so you can see exactly which tiles exist\nfor v1.0 and v1.1 in any given year:</p>\n<p><div class=\"video-center\"><iframe allowfullscreen=\"\" frameborder=\"0\" height=\"315px\" sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\" src=\"https://crank.recoil.org/videos/embed/97d422a2-af9c-47b5-947a-c136ad7093b6\" title=\"Tessera v1 and v1.1 coverage map\" width=\"100%\"></iframe></div></p>\n<p>This is all updated via a <a href=\"https://github.com/ucam-eo/tessera-coverage-map/blob/main/.github/workflows/map.yml\">GitHub Action on ucam-eo/tessera-coverage-map</a>\nthat also updates an index Parquet file of all available manifests.</p>\n<h3 id=\"getting-the-v11-embeddings\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/tessera-v11-out/#getting-the-v11-embeddings\"></a>Getting the v1.1 embeddings</h3>\n<p>To get the new embeddings, grab the <a href=\"https://github.com/ucam-eo/geotessera/releases/tag/v0.9.0\">geotessera 0.9.0+ release</a> of the\n<a href=\"https://anil.recoil.org/notes/geotessera-python\">Python library</a> which went out 
alongside v1.1. It has a new\n<code>--dataset-version</code> flag to pick v1.0 or v1.1, and a <code>--dataset-variant</code> flag\nnow that multiple parties are generating embeddings for the community:</p>\n<ul>\n<li><code>vultr</code> is the original <a href=\"https://geotessera.org/blog/2026-03-30-training-and-inference-at-scale\">v1.0 global run</a></li>\n<li><code>cambridge</code> is our <a href=\"https://www.tunbury.org/2026/05/20/processing-uk-azure-spot/\">OxCaml-generated</a> v1.1 run for early adopters</li>\n<li>We're working on a Zarr-native full global v1.1 with <a href=\"https://www.cyclops.ai/\">Cyclops.ai</a>, covering 2017-2025 that will become the default once it lands.</li>\n</ul>\n<p>Use <a href=\"https://docs.astral.sh/uv/\">uvx</a> to try this without any installation:</p>\n<pre><code class=\"language-bash\">uvx geotessera download \\\n  --country \"United Kingdom\" \\\n  --year 2024 \\\n  --dataset-version v1.1 \\\n  --dataset-variant cambridge \\\n  --format npy \\\n  --output ./uk-v1.1\n</code></pre>\n<p>All the embeddings (both versions) are also now in the <code>s3://tessera-embeddings</code>\npublic bucket on AWS Open Data, which geotessera 0.9 switches to by default.\nSpare a kind thought for \"okavango\", our single overworked Cambridge server that served every\nTESSERA embedding for the first six months without falling over (much)!\nBut seriously, at some point, we're going to have to turn off \"okavango\" as it's\ntaking up a significant amount of the egress bandwidth for Cambridge, so I encourage\nusers to upgrade to geotessera 0.9 as soon as possible just to change the source\nof your embeddings download. 
Let me know if you have any problems!</p>\n<h3 id=\"also-on-hugging-face-now\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/tessera-v11-out/#also-on-hugging-face-now\"></a>Also on Hugging Face now</h3>\n<p>We're also now on <a href=\"https://huggingface.co/geotessera/TESSERA-V-1.1\">Hugging Face</a>\nwith the full v1.1 (and <a href=\"https://huggingface.co/geotessera/TESSERA-V-1.0\">v1.0</a>)\nmodel weights, with checkpoints for both the Microsoft Planetary Computer and\nAWS Open Data preprocessing backends. If you'd rather run inference yourself\nor fine-tune on your own data, everything you need is there, all under CC0 as\nusual. Do <a href=\"https://eeg.zulipchat.com\">let us know</a> if you fine-tune a model as\nwe'd love to see how it goes.</p>\n<p>If there's a region of the world you need for your own research urgently,\nplease do <a href=\"https://github.com/ucam-eo/geotessera/issues\">request an ROI</a> on the\ngeotessera issue tracker and we'll prioritise it in the generation queue.\nOtherwise, sit tight as we'll have full global 2017-2025 coverage within a few\nmonths!</p>\n<p>See also coverage from the <a href=\"https://www.meteorologicaltechnologyinternational.com/news/satellites/cambridge-ai-tool-converts-satellite-archives-into-accessible-earth-intelligence.html\">Meteorological Technology trade magazine</a> about the release.</p><h1>References</h1><ul><li>Feng et al (2025). TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2506.20380\" target=\"_blank\"><i>10.48550/arXiv.2506.20380</i></a></li>\n<li>Madhavapeddy (2025). GeoTessera Python library released for geospatial embeddings. 
<a href=\"https://doi.org/10.59350/7hy6m-1rq76\" target=\"_blank\"><i>10.59350/7hy6m-1rq76</i></a></li></ul>","doi":"https://doi.org/10.59350/vcqjp-24y05","guid":"https://doi.org/10.59350/vcqjp-24y05","language":"en","license":"https://creativecommons.org/licenses/by/4.0/legalcode","published_at":1781222400,"reference":[{"id":"https://doi.org/10.48550/arxiv.2506.20380","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.59350/7hy6m-1rq76","unstructured":"<b>[cito:citesAsRelated]</b>"}],"rid":"h0wq3-hpy81","summary":"Frank Feng and I announced TESSERA v1.1 on behalf of the team earlier this week, and I wanted to follow up here with a more visual explanation of what changed as I got quite a few questions about it! v1.1 is a retrained successor to the original v1.0 model that Frank Feng and the team have been hammering on for months. Crucially, since we pre-generate embedding 'map tiles', the new release is a drop-in replacement if you just swap tiles;","tags":["Tessera","Spatial","Ai","Satellite"],"title":"Tessera v1.1 released, with smoother and temporally stable embeddings","updated_at":1781276156,"url":"https://anil.recoil.org/notes/tessera-v11-out","version":"v1"}},{"document":{"authors":[{"contributor_roles":[],"family":"Turner","given":"Stephen D."}],"blog":{"authors":[{"name":"Stephen Turner"}],"community_id":"382941a7-2ffa-41df-8bbb-5f772188517f","created":1780876800,"current_feed_url":null,"description":"A practicing data scientist's take on AI, genomics, biosecurity, and the ways AI is reshaping how science gets done. Weekly updates from the field. 
Occasional notes on programming.","favicon":"https://rogue-scholar.org/api/communities/382941a7-2ffa-41df-8bbb-5f772188517f/logo","feed_format":"application/rss+xml","feed_url":"https://blog.stephenturner.us/feed","filter":null,"generator":"Substack","home_page_url":"https://blog.stephenturner.us","issn":null,"language":"eng","license":"https://creativecommons.org/licenses/by/4.0/legalcode","prefix":"10.59350","relative_url":null,"secure":true,"slug":"stephenturner","status":"active","subfield":"1311","title":"Paired Ends","updated":1781270487,"use_api":null},"blog_name":"Paired Ends","blog_slug":"stephenturner","content_html":"<p>I've had a busy week, taking the day off today, and I haven't had a chance to do much reading. I've been spending a ton of time lately developing a new <a href=\"https://hooslist.virginia.edu/ClassSchedule/ClassHistory?subject=DS&amp;catalogNumber=5080\">course</a> I'll be teaching this fall, and preparing a <a href=\"https://ai.provost.virginia.edu/ai-upskilling\">workshop</a> on AI-powered literature review and synthesis I'll be teaching next week (if you're at UVA, <a href=\"https://www.eventbrite.com/e/in-person-smarter-literature-reviews-with-ai-powered-tools-tickets-1987394833446?aff=oddtdtcreator\">register</a> and attend for the in-person event if you can \u2014 it'll be much more engaging than Zooming in, trust me).</p><p>Here are the browser tabs I have open that I hope to catch up on soon.</p><p class=\"button-wrapper\" data-attrs=\"{&quot;url&quot;:&quot;https://blog.stephenturner.us/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}\" data-component-name=\"ButtonCreateButton\"><a class=\"button primary\" href=\"https://blog.stephenturner.us/subscribe?\"><span>Subscribe now</span></a></p><h3>Blogs/newsletters/etc</h3><ol><li><p><a href=\"https://darioamodei.com/post/policy-on-the-ai-exponential\">Dario Amodei \u2014&nbsp;Policy on the AI 
Exponential</a></p></li><li><p><a href=\"https://www.anthropic.com/research/agents-in-biology\">Paving the way for agents in biology \\ Anthropic</a><a class=\"footnote-anchor\" data-component-name=\"FootnoteAnchorToDOM\" id=\"footnote-anchor-1\" href=\"#footnote-1\" target=\"_self\">1</a></p></li><li><p><a href=\"https://grants.nih.gov/grants/guide/notice-files/NOT-OD-26-086.html\">NIH RFI on limiting the number of grants per PI</a></p></li><li><p><a href=\"https://www.anthropic.com/news/claude-fable-5-mythos-5\">Claude Fable 5 and Claude Mythos 5 \\ Anthropic</a><a class=\"footnote-anchor\" data-component-name=\"FootnoteAnchorToDOM\" id=\"footnote-anchor-2\" href=\"#footnote-2\" target=\"_self\">2</a></p></li><li><p><a href=\"https://www.newyorker.com/news/fault-lines/eight-predictions-for-the-future-of-higher-education\">Eight Predictions for the Future of Higher Education</a></p></li><li><p><a href=\"https://mattsbiodefense.substack.com/p/five-things-june-7-2026\">Matt Lubin: Five Things: June 7, 2026</a></p></li><li><p><a href=\"https://www.profgmedia.com/p/is-ai-more-expensive-than-the-employees\">Is AI More Expensive Than the Employees It's Replacing?</a></p></li><li><p><a href=\"https://liangchang.substack.com/p/the-anti-scaling-law-in-biology-and\">The Anti-Scaling Law in Biology, and Why AI Could Make Crowding Worse Before Making Drug Development Better</a></p></li><li><p><a href=\"https://theinfinitesimal.substack.com/p/thoughts-on-ai-in-academia\">Sasha Gusev: Thoughts on AI in academia</a></p></li><li><p><a href=\"https://www.0xkato.xyz/how-llms-actually-work/\">How LLMs Actually Work | 0xkato</a></p></li><li><p><a href=\"https://evgenykiner.substack.com/p/a-cell-is-not-a-spreadsheet-why-virtual\">A cell is not a spreadsheet- why \"Virtual Cells\" are still mostly hype</a></p></li><li><p><a href=\"https://www.anthropic.com/institute/recursive-self-improvement\">Anthropic: When AI builds itself</a></p></li><li><p><a 
href=\"https://www.newyorker.com/news/fault-lines/can-ai-produce-writing-that-we-actually-want-to-read\">Can A.I. Produce Writing That We Actually Want to Read?</a></p></li><li><p><a href=\"https://epochai.substack.com/p/is-a-compute-crunch-coming\">Is a compute crunch coming?</a></p></li><li><p><a href=\"https://openai.com/index/built-to-benefit-everyone-our-plan/\">Built to benefit everyone: our plan | OpenAI</a></p></li><li><p><a href=\"https://www.nature.com/articles/d41586-026-01689-0?utm_source=x&amp;utm_medium=social&amp;utm_campaign=nature&amp;linkId=62230411\">Bots are scraping open data \u2014 how should researchers respond?</a></p></li><li><p><a href=\"https://letter.nikomc.com/p/small\">Why Are Cells Small? - Niko McCarty</a></p></li><li><p><a href=\"https://www.owlposting.com/p/how-to-build-a-cancer-vaccine-and\">How to build a cancer vaccine, and whether they will work this time</a></p></li></ol><h3>Papers</h3><ol><li><p><a href=\"https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1014287\">The total eclipse of bioinformatics: From disruption to convention, and a gentle warning</a></p></li><li><p><a href=\"https://www.frontiersin.org/journals/microbiology/articles/10.3389/fmicb.2026.1832974/full\">Dual-use artificial intelligence and biology: upstream risk-benefit reviews</a></p></li><li><p><a href=\"https://www.pnas.org/doi/10.1073/pnas.2615114123\">Molecular de-extinction looks to the past to find the molecules of the future</a></p></li><li><p><a href=\"https://arxiv.org/abs/2605.28655v1\">AutoScientists: Self-Organizing Agent Teams for Long-Running Scientific Experimentation</a></p></li><li><p><a href=\"https://www.nature.com/articles/s41588-026-02607-w\">Pleiotropic shared heritability quantifies the shared genetic variance of common diseases</a></p></li><li><p><a href=\"https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1014338\">Ten simple rules for teaching data science</a></p></li><li><p><a 
href=\"https://www.biorxiv.org/content/10.1101/2022.05.06.490859v3\">Depth normalization for single-cell genomics count data</a> and <a href=\"https://xcancel.com/lpachter/status/2064795978264432988\">Lior's explainer</a></p></li></ol><p class=\"button-wrapper\" data-attrs=\"{&quot;url&quot;:&quot;https://blog.stephenturner.us/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}\" data-component-name=\"ButtonCreateButton\"><a class=\"button primary\" href=\"https://blog.stephenturner.us/subscribe?\"><span>Subscribe now</span></a></p><div class=\"footnote\" data-component-name=\"FootnoteToDOM\"><a id=\"footnote-1\" href=\"#footnote-anchor-1\" class=\"footnote-number\" contenteditable=\"false\" target=\"_self\">1</a><div class=\"footnote-content\"><p>I just read this one right before posting. The post describes the difficulty agents have at retrieving biological data. Which isn't limited to agents! It's difficult for a human to navigate the disparate databases and web interfaces and NCBI Virus search incantations to get the thing you're looking for. If this problem were solved for agents, it'd make life easier for us humans as well. A conclusion from the post: <em>\"We want models to be creative when they generate hypotheses, design experiments, or reason about mechanisms. But the layer underneath that creativity\u2014gene identifiers, schemas, retrieval logic, coordinate systems, metadata conventions, and data access paths\u2014has to be boringly reliable (or in other words, deterministic)\"</em>. </p></div></div><div class=\"footnote\" data-component-name=\"FootnoteToDOM\"><a id=\"footnote-2\" href=\"#footnote-anchor-2\" class=\"footnote-number\" contenteditable=\"false\" target=\"_self\">2</a><div class=\"footnote-content\"><p>I haven't had a chance to do anything with Fable yet, mostly because I work in AIxBio, and Bio is off limits. 
And because I'm a biologist, Fable refuses to talk to me (\"Who am I?\" leads to safety flags and demotion of the rest of the conversation to Opus). The precautionary principle is probably the right move here given the benchmarks, and I think managed access will likely be the way these models are released from here out.</p><div class=\"captioned-image-container\"><figure><a class=\"image-link image2 is-viewable-img\" target=\"_blank\" href=\"https://substackcdn.com/image/fetch/$s_!8PjN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg\" data-component-name=\"Image2ToDOM\"><div class=\"image2-inset\"><picture><source type=\"image/webp\" srcset=\"https://substackcdn.com/image/fetch/$s_!8PjN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg 424w, https://substackcdn.com/image/fetch/$s_!8PjN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg 848w, https://substackcdn.com/image/fetch/$s_!8PjN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!8PjN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg 1456w\" sizes=\"100vw\"><img src=\"https://substackcdn.com/image/fetch/$s_!8PjN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg\" width=\"356\" height=\"772.3189368770765\" 
data-attrs=\"{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1306,&quot;width&quot;:602,&quot;resizeWidth&quot;:356,&quot;bytes&quot;:157698,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.stephenturner.us/i/201151842?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}\" class=\"sizing-normal\" alt=\"\" srcset=\"https://substackcdn.com/image/fetch/$s_!8PjN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg 424w, https://substackcdn.com/image/fetch/$s_!8PjN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg 848w, https://substackcdn.com/image/fetch/$s_!8PjN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!8PjN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg 1456w\" sizes=\"100vw\" loading=\"lazy\"></picture><div class=\"image-link-expand\"><div class=\"pencraft pc-display-flex pc-gap-8 pc-reset\"><button tabindex=\"0\" type=\"button\" class=\"pencraft pc-reset pencraft icon-container restack-image\"><svg 
role=\"img\" width=\"20\" height=\"20\" viewBox=\"0 0 20 20\" fill=\"none\" stroke-width=\"1.5\" stroke=\"var(--color-fg-primary)\" stroke-linecap=\"round\" stroke-linejoin=\"round\" xmlns=\"http://www.w3.org/2000/svg\"><g><title></title><path d=\"M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882\"></path></g></svg></button><button tabindex=\"0\" type=\"button\" class=\"pencraft pc-reset pencraft icon-container view-image\"><svg xmlns=\"http://www.w3.org/2000/svg\" width=\"20\" height=\"20\" viewBox=\"0 0 24 24\" fill=\"none\" stroke=\"currentColor\" stroke-width=\"2\" stroke-linecap=\"round\" stroke-linejoin=\"round\" class=\"lucide lucide-maximize2 lucide-maximize-2\"><polyline points=\"15 3 21 3 21 9\"></polyline><polyline points=\"9 21 3 21 3 15\"></polyline><line x1=\"21\" x2=\"14\" y1=\"3\" y2=\"10\"></line><line x1=\"3\" x2=\"10\" y1=\"21\" y2=\"14\"></line></svg></button></div></div></div></a></figure></div><p><br></p></div></div>","doi":"https://doi.org/10.59350/6z1rs-ner26","guid":"201151842","image":"https://substackcdn.com/image/fetch/$s_!8PjN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg","language":"en","license":"https://creativecommons.org/licenses/by/4.0/legalcode","published_at":1781222400,"rid":"78n6a-47133","summary":"TBR in AIxBio, AIxEdu, AIxLabor, AIxWriting, and other essays &amp;","tags":["Biosecurity","AI"],"title":"Open tabs (June 12, 
2026)","updated_at":1781272366,"url":"https://blog.stephenturner.us/p/open-tabs-june-12-2026","version":"v1"}},{"document":{"authors":[{"affiliation":[{"id":"https://ror.org/013meh722","name":"University of Cambridge"}],"contributor_roles":[],"family":"Madhavapeddy","given":"Anil","url":"https://orcid.org/0000-0001-8954-2428"}],"blog":{"authors":null,"community_id":"472a49be-dc61-4a17-97f0-d1ff17b0dadd","created":1760313600,"current_feed_url":null,"description":null,"favicon":"https://rogue-scholar.org/api/communities/472a49be-dc61-4a17-97f0-d1ff17b0dadd/logo","feed_format":"application/feed+json","feed_url":"https://anil.recoil.org/perma.json","filter":null,"generator":"Other","home_page_url":"https://anil.recoil.org/notes","issn":null,"language":"eng","license":"https://creativecommons.org/licenses/by/4.0/legalcode","prefix":"10.59350","relative_url":null,"secure":true,"slug":"anil","status":"active","subfield":"1702","title":"Anil Madhavapeddy's feed","updated":1781222400,"use_api":null},"blog_name":"Anil Madhavapeddy's feed","blog_slug":"anil","content_html":"<p>I spent a couple of days at the <a href=\"https://www.nationalacademies.org/home\">National Academy of Sciences</a> in the USA at the invitation of the <a href=\"https://royalsociety.org\">Royal Society</a>, who held a forum on \"<a href=\"https://www.nasonline.org/wp-content/uploads/2024/10/US-UK-Forum-2025-program-web.pdf\">Measuring Biodiversity for Addressing the Global Crisis</a>\". It was a <a href=\"https://www.nasonline.org/wp-content/uploads/2024/10/US-UK-Forum-2025-program-web.pdf\">packed program</a> for those working in evidence-driven conservation:</p>\n<blockquote>\n<p>Assessing biodiversity is fundamental to understanding the distribution of biodiversity, the changes that are occurring and, crucially, the effectiveness of actions to address the ongoing biodiversity crisis. 
Such assessments face multiple challenges, not least the great complexity of natural systems, but also a lack of standardized approaches to measurement, a plethora of measurement technologies with their own strengths and weaknesses, and different data needs depending on the purpose\nfor which the information is being gathered.</p>\n<p>Other sectors have faced similar challenges, and the forum will look to learn from these precedents with a view to building momentum toward standardized methods for using environmental monitoring technologies, including new technologies, for particular purposes.\n<cite>-- NAS/Royal Society <a href=\"https://www.nasonline.org/wp-content/uploads/2024/10/US-UK-Forum-2025-program-web.pdf\">US-UK Scientific Forum on Measuring Biodiversity</a></cite></p>\n</blockquote>\n<p>I was honoured to talk about our work on using AI to \"connect the dots\" between disparate data like the academic literature and remote observations at scale. But before that, here's some of the bigger picture stuff I learnt...</p>\n<p><a href=\"https://www.nasonline.org/wp-content/uploads/2024/10/US-UK-Forum-2025-program-web.pdf\"> <img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-cover.webp\" title=\"Identifying the bird is an exercise for the reader!\"/> </a></p>\n<h2 id=\"shifting-conservation-to-a-winning-stance\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/nas-rs-biodiversity/#shifting-conservation-to-a-winning-stance\"></a>Shifting conservation to a winning stance</h2>\n<p>The need for urgent, additional action came across loud and clear from all the top actors in biodiversity. On the bright side, we have made stellar progress in measuring more dimensions of biodiversity accurately than ever before in human history. But, the field of biodiversity does not have a single \"simple question\" that needs answering, unlike many other science challenges in physics or chemistry. 
The ecosystem of nature measurements needs to span scales ranging from the micro (fungi and soil health) to the macro (species richness and diversity), with geographical coverage across the planet but also hyperlocal accuracy for ecosystem services.</p>\n<p>One key question asked at the forum was how we can get to interoperable, pragmatic tools that enable all the actors involved in conservation actions (from the governments that set policy, to the private sector that controls the supply chains, to the people who have to live in and depend on natural services) to work together more effectively on gathering all the data needed.</p>\n<p>This interoperability has to emerge during a rapid shift towards digital methods, which are vulnerable to being <a href=\"https://www.bbc.com/future/article/20250422-usa-scientists-race-to-save-climate-data-before-its-deleted-by-the-trump-administration\">deleted and edited at scale</a>, with decades of painstaking observations at risk at the moment. And in the middle of all this, machine learning is swooping in to perform data interpolation at scale, but also risks <a href=\"https://anil.recoil.org/notes/ai-should-unite-conservation\">dividing</a> and polluting observations with inaccurate projections.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-2.webp\"/></p>\n<h2 id=\"what-is-an-optimistic-future-for-conservation\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/nas-rs-biodiversity/#what-is-an-optimistic-future-for-conservation\"></a>What is an optimistic future for conservation?</h2>\n<p>This is all quite the challenge even for a gung-ho computer scientist like me, and I was struggling with the enormity of it all! But things really clicked into place after the inspirational <a href=\"https://www.bangor.ac.uk/staff/sens/julia-patricia-gordon-jones-010356/en\">Julia P.G. 
Jones</a> pointed me at a <a href=\"https://academic.oup.com/bioscience/article/68/6/412/4976422\">fantastic big-picture paper</a>:</p>\n<blockquote>\n<p>Drawing reasonable inferences from current patterns, we can predict that 100 years from now, the Earth could be inhabited by between 6-8 billion people, with very few remaining in extreme poverty, most living in towns and cities, and nearly all participating in a technologically driven, interconnected market economy.</p>\n<p>[...] we articulate a theory of social\u2013environmental change that describes the simultaneous and interacting effects of urban lifestyles on fertility, poverty alleviation, and ideation.</p>\n<p><cite><a href=\"https://academic.oup.com/bioscience/article/68/6/412/4976422\">From Bottleneck to Breakthrough: Urbanization and the Future of Biodiversity Conservation</a></cite></p>\n</blockquote>\n<p>They observe that the field of conservation has often \"succumbed to jeremiad, bickering, and despair\". Much of this angst springs from the (failed) bets made by <a href=\"https://en.wikipedia.org/wiki/Paul_R._Ehrlich\">Paul Ehrlich</a>, who thinks <a href=\"https://www.nature.com/articles/d41586-024-03592-y\">humans are going to be wiped out</a> because of unbounded expansion. In response, conservation has become \"the art of slowing declines\" rather than achieving long-term wins. 
But instead of being moribund, the paper paints an optimistic, practical endgame for conservation:</p>\n<blockquote>\n<p>We suggest that lasting conservation success can best be realized when:</p>\n<ul>\n<li>the human population stabilizes and begins to decrease</li>\n<li>extreme poverty is alleviated</li>\n<li>the majority of the world's people and institutions act on a shared belief that it is in their best interest to care for rather than destroy the natural bases of life on Earth.</li>\n</ul>\n</blockquote>\n<p>It turns out that most of these conditions can be reasonably projected to happen in the next fifty years or so. Population is projected to <a href=\"https://en.wikipedia.org/wiki/Human_population_projections\">peak by the turn of the century</a>, <a href=\"https://openknowledge.worldbank.org/entities/publication/9d0fb27a-3afe-5999-8d8e-baf90b4331c0/full\">extreme poverty might reasonably be eradicated by 2050</a>, and <a href=\"https://iopscience.iop.org/article/10.1088/1748-9326/8/1/014025\">urban landuse will stabilise at 6% of terrestrial land</a> by 2030-ish.</p>\n<p><a href=\"https://academic.oup.com/view-large/figure/118140827/biy039fig4.jpeg\"> <img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-6.webp\" title=\"Connecting demographic and economic trends in the 21st century to the environment\"/> </a></p>\n<p>Given this projection, the paper then points out that conservation doesn't need to save nature \"forever\". Instead, we have to save enough nature now to \"breakthrough\" from the <a href=\"https://en.wikipedia.org/wiki/Great_Acceleration\">great acceleration</a> of WWII until we stabilise landuse.</p>\n<blockquote>\n<p>The profound danger is that by the time the foundations of recovery are in place, little of wildlife and wild places will be left. 
If society focuses only on economic development and technological innovation as a mechanism to pass through the bottleneck as fast as possible, then what remains of nature could well be sacrificed.\nIf society were to focus only on limiting economic growth to protect nature, then terrible poverty and population growth could overwhelm what remains.</p>\n<p>Either extreme risks narrowing the bottleneck to such an extent that our world passes through without its tigers, elephants, rainforests, coral reefs, or a life-sustaining climate. Therefore, the only sensible path for conservation is to continue its efforts to protect biodiversity while engaging in cities to build the foundations for a lasting recovery of nature.\n<cite>-- <a href=\"https://academic.oup.com/bioscience/article/68/6/412/4976422\">From Bottleneck to Breakthrough</a></cite></p>\n</blockquote>\n<p>This puts what we need to achieve today in a far, far more pragmatic light:</p>\n<blockquote>\n<p>[...] it means that conservation faces another 30\u201350 years of extreme difficulty, when more losses can be expected. However, if we can sustain enough nature through the bottleneck\u2014despite climate change, growth in the population and economy, and urban expansion\u2014then we can see the future of nature in a dramatically more positive light.</p>\n</blockquote>\n<p>Conservation is all about solving difficult opportunity-cost decisions in society.\nScience can help calculate <a href=\"https://anil.recoil.org/papers/2023-pact-tmf\">credible counterfactuals</a> that allow policymakers to balance\nlimited resources to minimise nature harm while maximising benefit to humans. We can also develop new <a href=\"https://anil.recoil.org/papers/2023-ncc-permanence\">economic methods</a> to work out the value of future actions. When combined, this can help conservation break through the bottleneck of the next fifty years of nature loss... 
and computer science can make a serious <a href=\"https://fivetimesfaster.org/\">accelerative</a> impact here (yay!).</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-5.webp\" title=\"What does one call a group of ecology legends? A committee!\"/></p>\n<h2 id=\"topics-relevant-to-our-planetary-computing-research\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/nas-rs-biodiversity/#topics-relevant-to-our-planetary-computing-research\"></a>Topics relevant to our planetary computing research</h2>\n<p>Having got my existential big-picture crisis under control, here are some more concrete thoughts about some of the joint ideas that emerged from the NAS meeting.</p>\n<h3 id=\"resilience-in-biodiversity-data\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/nas-rs-biodiversity/#resilience-in-biodiversity-data\"></a>Resilience in biodiversity data</h3>\n<p>We've been doing a <a href=\"https://digitalflapjack.com/blog/yirgacheffe/\">lot</a> of <a href=\"https://digitalflapjack.com/weeknotes/2025-04-22/\">work</a> on mechanisms to <a href=\"https://anil.recoil.org/papers/2024-planetary-computing\">process and ingest</a> remote sensing data. All of our techniques also apply to biodiversity, except that the pipelines are even more complex due to the multi-modal nature of the data being stored. 
This can be clearly seen in this <a href=\"https://www.science.org/doi/10.1126/science.adq2110\">review on the decline of insect biodiversity</a> that speaker Nick Isaac and my colleague <a href=\"https://www.zoo.cam.ac.uk/directory/prof-lynn-dicks\">Lynn Dicks</a> published last month.</p>\n<p><a href=\"https://www.science.org/doi/10.1126/science.adq2110\"> <img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-1.webp\" title=\"(source: Science, 10.1126/science.adq2110)\"/> </a></p>\n<p>The data itself isn't just from one source; instead, we need a pipeline of spatial (at different resolutions) measurements, of different types (visual, acoustic, occurrence), of different provenance (experts, crowdsourced, museum), and from different hypothesis tests (evidence bases).</p>\n<p>Once the ingestion pipeline is in place, there's a full range of validation, combination and extrapolation involved, often using AI methods these days. The output from all of this is then tested to determine which <a href=\"https://anil.recoil.org/projects/ce\">conservation actions</a> to take.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-3.webp\" title=\"Nick Isaac explains how different lines of biodiversity evidence are necessary\"/></p>\n<p><a href=\"https://www.thegonzalezlab.org/\">Andrew Gonzalez</a> also talked about the ambitious <a href=\"https://www.nature.com/articles/s41559-023-02171-0\">global biodiversity observing system</a> that he's been assembling a coalition for in recent years. 
They are using Docker as part of this via their <a href=\"https://boninabox.geobon.org/\">Bon in a Box</a> product but hitting scaling issues (a common problem due to the size of geospatial tiles).</p>\n<p><a href=\"https://www.nature.com/articles/s41559-023-02171-0\"> <img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-7.webp\" title=\"Andrew Gonzalez explains the GBioS concept\"/> </a></p>\n<p>There's a good tie in for collaboration with us here via the next-generation <a href=\"https://patrick.sirref.org/weekly-2025-05-12/index.xml\">time-travelling shell</a> that <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> is developing that can handle this via <a href=\"https://www.tunbury.org/zfs-system-concept/\">ZFS snapshots</a>.  <a href=\"https://mynameismwd.org\">Michael Dales</a> has been applying this to scaling the <a href=\"https://anil.recoil.org/papers/2024-life\">LIFE</a> and <a href=\"https://anil.recoil.org/papers/2024-food-life\">FOOD</a> pipelines recently with <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\">Alison Eyres</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a>. And meanwhile <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> and <a href=\"https://haddadi.github.io/\">Hamed Haddadi</a> have been researching <a href=\"https://anil.recoil.org/papers/2024-terracorder\">embedded biodiversity sensors</a>. 
The overall theme is that we need to make the hardware and software stack involved far easier to <a href=\"https://anil.recoil.org/papers/2024-planetary-computing\">use for non-expert programmers</a>.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-8.webp\" title=\"A key part of the GBioS vision is to have a federated system\"/></p>\n<h3 id=\"observing-the-earth-through-geospatial-foundation-models\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/nas-rs-biodiversity/#observing-the-earth-through-geospatial-foundation-models\"></a>Observing the earth through geospatial foundation models</h3>\n<p>Another problem that several speakers discussed was how complex biodiversity observations are to manage since they span multiple scales. In my talk, I described the new <a href=\"https://github.com/FrankFeng-23/btfm_project\">TESSERA</a> geospatial foundation model that <a href=\"https://www.cst.cam.ac.uk/people/zf281\">Frank Feng</a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a> have been leading in Cambridge. As this is a pre-trained foundation model, it needs to be finetuned to specific downstream tasks. A number of people came up after my talk with suggestions for collaborations here!</p>\n<p>Firstly, <a href=\"https://earthshotprize.org/winners-finalists/naturemetrics/\">Kat Bruce</a> (fresh from <a href=\"https://www.bbc.com/news/articles/cre8xxd7xl8o\">spraying pondwater</a> with Prince William) explained how <a href=\"https://www.naturemetrics.com/\">NatureMetrics</a> are gathering <a href=\"https://en.wikipedia.org/wiki/Environmental_DNA\">eDNA</a> from many diverse sources. 
The data is of varying licenses depending on which customer paid for the acquisition, but overall there is a lot of information about species presence that's very orthogonal to the kind of data gathered from satellite observations.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-4.webp\" title=\"Kat Bruce showing how much information is packed into eDNA measurements\"/></p>\n<p>Secondly, <a href=\"https://darulab.org/\">Barnabas Daru</a> from Stanford described his efforts to map plant traits to species distribution models. This complements some work <a href=\"https://coomeslab.org\">David Coomes</a> has been leading recently in our group with <a href=\"https://www.kew.org/science/our-science/people/ian-ondo\">Ian Ondo</a> and <a href=\"https://www.cambridgeconservation.org/about/people/professor-neil-burgess/\">Neil Burgess</a> on mapping rare plants globally. The basic problem here is that plant occurrence data is <em>extremely</em> data deficient and spatially biased for 100k+ species, and so we'll need cunning interpolation techniques to fill in the data gaps.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-12.webp\" title=\"Barnabas Daru shows his maps on gathering plant samples from all over the world\"/></p>\n<p>When back in Cambridge, I'm going to arrange for all of us to chat to see if we can somehow combine eDNA, fungal biodiversity, plant traits and satellite foundation models into a comprehensive global plant species map!</p>\n<h3 id=\"evidence-synthesis-from-the-literature\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/nas-rs-biodiversity/#evidence-synthesis-from-the-literature\"></a>Evidence synthesis from the literature</h3>\n<p>There was also huge enthusiasm for another of our projects on <a href=\"https://anil.recoil.org/projects/ce\">analysing the academic literature</a> at scale. 
While we've been using it initially to accelerate the efficacy and accuracy of <a href=\"https://en.wikipedia.org/wiki/Systematic_review\">systematic reviews</a> for <a href=\"https://conservationevidence.com\">Conservation Evidence</a>, there are a huge number of follow-up benefits for having a comprehensive data corpus.</p>\n<p>Firstly, <a href=\"http://elphick.lab.uconn.edu/\">Chris Elphick</a> pointed out a metasynthesis where they manually integrate recent <a href=\"https://academic.oup.com/bioscience/advance-article-abstract/doi/10.1093/biosci/biaf034/8115312\">hypotheses about insect stressors and responses</a> into a network (3385 edges / 108 nodes). It found that the network is highly interconnected, with agricultural intensification often identified as a root cause for insect decline. Much like the CE manually labeled dataset, it should be possible to do hypothesis searches in our LLM pipeline to expand this search and make it more dynamic.</p>\n<p>Secondly, <a href=\"http://oisin.info\">Oisin Mac Aodha</a>, fresh from a <a href=\"https://watch.eeg.cl.cam.ac.uk/w/7aqBd2Nn9E6QpMvnoBPxuQ\">recent talk</a> in Cambridge, discussed his <a href=\"https://arxiv.org/abs/2502.14977\">recent work</a> on few-shot species range estimation and also <a href=\"https://arxiv.org/abs/2412.14428\">WildSAT text/image encoding</a>. His example showed how you could not only spot a species from images, but also use text prompts to refine the search. 
An obvious extension for us to have a go at here is to combine our large corpus of academic papers with these models to see how good the search/range estimation could get with a much larger corpus of data.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-13.webp\" title=\"I am proud to have pronounced Oisin's name correctly while introducing his recent CCI seminar\"/></p>\n<p>And thirdly, I finally met my coauthor <a href=\"https://environment.leeds.ac.uk/see/staff/2720/david-williams\">David Williams</a> in the flesh for the first time! We've worked together recently on the <a href=\"https://anil.recoil.org/papers/2024-food-life\">biodiversity impact of food</a>, and we had a long discussion over dinner about whether we could glean more behavioural data about how people react from the wider literature. This would require us expanding our literature corpus into <a href=\"https://anil.recoil.org/ideas/grey-lit-crawl\">grey literature</a> and policy documents, but this is something that <a href=\"https://toao.com\">Sadiq Jaffer</a> and I want to do soon anyway.</p>\n<p>The connective tissue across these seemingly disparate projects is that there is a strong connection between what you can observe from space (the canopies of trees) to the traits expressed via knowledge of plant physiology and their DNA. If we could figure out how to connect the dots between the observed species to the physiological traits to the bioclimatic range variables, we could figure out where the (many) data-deficient plant species in the world are! 
I'll be hosting a meeting in Cambridge soon on this since we're already <a href=\"https://anil.recoil.org/notes/ukri-grant-terra\">working on it</a>.</p>\n<h3 id=\"visualisations-in-biodiversity\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/nas-rs-biodiversity/#visualisations-in-biodiversity\"></a>Visualisations in biodiversity</h3>\n<p>The most unexpectedly cool talk was <a href=\"https://www.weizmann.ac.il/plants/Milo/home\">Ron Milo</a> showing us visualisations of the <a href=\"https://www.pnas.org/doi/10.1073/pnas.1711842115\">mass distribution of all life on earth</a>. His work really puts our overall challenge into context, as it shows just how utterly dominated wildlife is by domesticated animals.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-11.webp\" title=\"The dominant mammal biomass on the planet are domesticated animals\"/></p>\n<p>It struck me just how important these sorts of high-level visualisations are in putting detailed numbers into context. For example, he also showed a breakdown of global biomass in which plants are by far the \"heaviest\" living things on earth, and ocean organisms still dominate animal biomass.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-9.webp\"/></p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-10.webp\"/></p>\n<p>My favourite new animation library on the block is <a href=\"https://animejs.com/\">AnimeJS</a>, and so I plan to try to do some nice animations for <a href=\"https://anil.recoil.org/papers/2024-life\">LIFE</a> and <a href=\"https://anil.recoil.org/papers/2024-food-life\">FOOD</a> along these lines after the academic term finishes.</p>\n<p>And that's a wrap on my notes for now! 
I'm still hanging out in the US for a bunch more meetings (including one at <a href=\"https://www.nationalgeographic.com/\">National Geographic HQ</a>), so I'll update this note when the official RS/NAS videos and writeup come out.</p>\n<p><em>(Update 5th June: the <a href=\"https://www.youtube.com/watch?v=gDTQ1rIEaYo&amp;list=PLlKst-jESy-8t7lg429Movg6Fmsq2DU7y\">full talk video series</a> is now online at the National Academy of Sciences channel. Enjoy!)</em></p><h1>References</h1><ul><li>Balmford et al (2024). PACT Tropical Moist Forest Accreditation Methodology v2.1. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2024-gvslq\" target=\"_blank\"><i>10.33774/coe-2024-gvslq</i></a></li>\n<li>Eyres et al (2025). LIFE: A metric for mapping the impact of land-cover change on global extinctions. <a href=\"https://doi.org/10.1098/rstb.2023.0327\" target=\"_blank\"><i>10.1098/rstb.2023.0327</i></a></li>\n<li>Ball et al (2025). Food impacts on species extinction risks can vary by three orders of magnitude. <a href=\"https://doi.org/10.1038/s43016-025-01224-w\" target=\"_blank\"><i>10.1038/s43016-025-01224-w</i></a></li>\n<li>Balmford et al (2023). Realizing the social value of impermanent carbon credits. <a href=\"https://doi.org/10.1038/s41558-023-01815-0\" target=\"_blank\"><i>10.1038/s41558-023-01815-0</i></a></li>\n<li>Millar et al (2024). Terracorder: Sense Long and Prosper. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2408.02407\" target=\"_blank\"><i>10.48550/arXiv.2408.02407</i></a></li>\n<li>Ferris et al (2024). Planetary computing for data-driven environmental policy-making. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2303.04501\" target=\"_blank\"><i>10.48550/arXiv.2303.04501</i></a></li>\n<li>Madhavapeddy (2025). Technology needs to unite conservation, not divide it. <a href=\"https://doi.org/10.59350/vwrvd-3sg08\" target=\"_blank\"><i>10.59350/vwrvd-3sg08</i></a></li>\n<li>Sanderson et al (2018). 
From Bottleneck to Breakthrough: Urbanization and the Future of Biodiversity Conservation. BioScience. <a href=\"https://doi.org/10.1093/biosci/biy039\" target=\"_blank\"><i>10.1093/biosci/biy039</i></a></li>\n<li>Jones (2024). The scale of the biodiversity crisis laid bare. Nature. <a href=\"https://doi.org/10.1038/d41586-024-03592-y\" target=\"_blank\"><i>10.1038/d41586-024-03592-y</i></a></li>\n<li>Gonzalez et al (2023). A global biodiversity observing system to unite monitoring and guide action. Nature Ecology &amp; Evolution. <a href=\"https://doi.org/10.1038/s41559-023-02171-0\" target=\"_blank\"><i>10.1038/s41559-023-02171-0</i></a></li>\n<li>Halsch et al (2025). Meta-synthesis reveals interconnections among apparent drivers of insect biodiversity loss. BioScience. <a href=\"https://doi.org/10.1093/biosci/biaf034\" target=\"_blank\"><i>10.1093/biosci/biaf034</i></a></li>\n<li>Lange et al (2025). Feedforward Few-shot Species Range Estimation. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2502.14977\" target=\"_blank\"><i>10.48550/arXiv.2502.14977</i></a></li>\n<li>Daroya et al (2025). WildSAT: Learning Satellite Image Representations from Wildlife Observations. arXiv. 
<a href=\"https://doi.org/10.48550/arXiv.2412.14428\" target=\"_blank\"><i>10.48550/arXiv.2412.14428</i></a></li></ul>","doi":"https://doi.org/10.59350/j6zkp-n7t82","guid":"https://doi.org/10.59350/j6zkp-n7t82","language":"en","license":"https://creativecommons.org/licenses/by/4.0/legalcode","published_at":1748044800,"reference":[{"id":"https://doi.org/10.33774/coe-2024-gvslq","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.1098/rstb.2023.0327","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.1038/s43016-025-01224-w","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.1038/s41558-023-01815-0","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.48550/arxiv.2408.02407","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.48550/arxiv.2303.04501","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.59350/vwrvd-3sg08","unstructured":"<b>[cito:citesAsRelated]</b>"},{"id":"https://doi.org/10.1093/biosci/biy039","unstructured":"<b>[cito:cites]</b>"},{"id":"https://doi.org/10.1038/d41586-024-03592-y","unstructured":"<b>[cito:cites]</b>"},{"id":"https://doi.org/10.1038/s41559-023-02171-0","unstructured":"<b>[cito:cites]</b>"},{"id":"https://doi.org/10.1093/biosci/biaf034","unstructured":"<b>[cito:cites]</b>"},{"id":"https://doi.org/10.48550/arxiv.2502.14977","unstructured":"<b>[cito:cites]</b>"},{"id":"https://doi.org/10.48550/arxiv.2412.14428","unstructured":"<b>[cito:cites]</b>"}],"rid":"zmtp7-j5q82","summary":"I spent a couple of days at the National Academy of Sciences in the USA at the invitation of the Royal Society, who held a forum on \"Measuring Biodiversity for Addressing the Global Crisis\". 
It was a packed program for those working in evidence-driven conservation: I was honoured to talk about our work on using AI to \"connect the dots\" between disparate data like the academic literature and remote observations at scale.","tags":["Biodiversity","Conservation","Policy","Royalsociety","Usa"],"title":"What I learnt at the National Academy of Sciences US-UK Forum on Biodiversity","updated_at":1781259301,"url":"https://anil.recoil.org/notes/nas-rs-biodiversity","version":"v1"}},{"document":{"authors":[{"affiliation":[{"id":"https://ror.org/013meh722","name":"University of Cambridge"}],"contributor_roles":[],"family":"Madhavapeddy","given":"Anil","url":"https://orcid.org/0000-0001-8954-2428"}],"blog":{"authors":null,"community_id":"472a49be-dc61-4a17-97f0-d1ff17b0dadd","created":1760313600,"current_feed_url":null,"description":null,"favicon":"https://rogue-scholar.org/api/communities/472a49be-dc61-4a17-97f0-d1ff17b0dadd/logo","feed_format":"application/feed+json","feed_url":"https://anil.recoil.org/perma.json","filter":null,"generator":"Other","home_page_url":"https://anil.recoil.org/notes","issn":null,"language":"eng","license":"https://creativecommons.org/licenses/by/4.0/legalcode","prefix":"10.59350","relative_url":null,"secure":true,"slug":"anil","status":"active","subfield":"1702","title":"Anil Madhavapeddy's feed","updated":1781222400,"use_api":null},"blog_name":"Anil Madhavapeddy's feed","blog_slug":"anil","content_html":"<p>I stayed on for a few days extra in Washington DC after the <a href=\"https://anil.recoil.org/notes/nas-rs-biodiversity\">biodiversity extravaganza</a> to attend a workshop at legendary <a href=\"https://www.nationalgeographic.org/society/visit-base-camp/\">National Geographic Basecamp</a>. While I've been to several NatGeo <a href=\"https://www.nationalgeographic.org/society/national-geographic-explorers/\">Explorers</a> meetups in California, I've never had the chance to visit their HQ. 
The purpose of this was to attend a workshop organised by <a href=\"https://www.st-andrews.ac.uk/biology/people/cr68\">Christian Rutz</a> from St Andrews about the \"Urban Exploration Project\":</p>\n<blockquote>\n<p>[The UEP is a...] global-scale, community-driven initiative will collaboratively track animals across gradients of urbanization worldwide, to produce a holistic understanding of animal behaviour in human-modified landscapes that can, in turn, be used to develop evidence-based approaches to achieving sustainable human-wildlife coexistence.\n<cite>-- <a href=\"https://www.st-andrews.ac.uk/biology/people/cr68\">Christian Rutz's homepage</a></cite></p>\n</blockquote>\n<p>This immediately grabbed my interest, since it's a very different angle of biodiversity measurements to my usual. I've so far been mainly involved in efforts that use <a href=\"https://anil.recoil.org/projects/rsn\">remote sensing</a> or expert <a href=\"https://anil.recoil.org/projects/life\">range maps</a>, but the UEP program is more concerned with the dynamic <em>movements</em> of species. Wildlife movements are extremely relevant to conservation efforts since there is a large tension between human/wildlife coexistence in areas where both communities are under spatial pressure. 
<a href=\"https://ratsakatika.com/\">Tom Ratsakatika</a>, for example, did his <a href=\"https://ai4er-cdt.esc.cam.ac.uk/\">AI4ER</a> <a href=\"https://github.com/ratsakatika/camera-traps\">project</a> on the tensions in the <a href=\"https://www.endangeredlandscapes.org/news/advancing-human-wildlife-coexistence-in-the-carpathian-mountains/\">Romanian Carpathian mountains</a>, and <a href=\"https://www.ifaw.org/journal/human-elephant-conflict-major-threat\">elephant/human conflicts</a> and <a href=\"https://www.bbc.co.uk/news/articles/cx2j43e2j5ro\">tiger/human conflicts</a> are also well known.</p>\n<p>The core challenge posed at the workshop was how to build momentum for the UEP's vision of fostering human\u2013wildlife coexistence in the world's <em>unprotected</em> areas (often, these are areas near urban expansion zones such as cities). The UEP idea sprang from Christian's earlier efforts after the pandemic on the <a href=\"https://bio-logging.net/wg/covid19-biologging/\">COVID-19 Bio-Logging</a> initiative, which built up a database of over 1 billion satellite fixes for ~13,000 tagged animals across ~200 species. 
The lead student on that <a href=\"https://www.nature.com/articles/s41559-023-02125-6\">work</a>, <a href=\"https://diegoellissoto.org/\">Diego Ellis Soto</a>, has since graduated and was also at the UEP workshop sitting beside me!</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/ngs-2.webp\" title=\"NatGeo Chief Scientist Ian Miller kicks off proceedings\"/></p>\n<p>The workshop itself wasn't fully public (not because it's secret, but just because the details are still being iterated on), so here are some high-level takeaways from my conversations there...</p>\n<h2 id=\"movebank-for-gps-tracking\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/natgeo-urban-wildlife/#movebank-for-gps-tracking\"></a>Movebank for GPS tracking</h2>\n<p>I've used <a href=\"https://inaturalist.org\">iNaturalist</a> and <a href=\"https://www.openstreetmap.org/\">OpenStreetMap</a> extensively for wildlife occurrence and urban data, but I'm less familiar with how animal movement data is recorded. <a href=\"https://www.ab.mpg.de/person/98226\">Martin Wikelski</a> was at the workshop and explained the <a href=\"https://www.humboldt-foundation.de/en/entdecken/magazin-humboldt-kosmos/humboldt-today-the-secret-of-an-eternal-idol/the-high-flyer\">ICARUS</a> project to me, which collected data via GPS transmitters fitted to animals. This is then fed into the <a href=\"https://www.movebank.org/cms/movebank-main\">Movebank</a> service that is custom-designed for movement data.</p>\n<p>Unlike most other biodiversity data services though, Movebank data is not immediately made public (due to the sensitivity of animal movements), but is licensed to the user that made it. For that reason, it's less of a \"social\" service than iNaturalist, but still has a staggering <a href=\"https://www.movebank.org/cms/movebank-content/february-2024-newsletter\">11 million records added every day</a>. 
This data is then <a href=\"https://www.movebank.org/cms/movebank-content/archiving-animal-movements-as-biodiversity-2023-01-04\">fed into GBIF</a>, although it is downsampled to a single record per day. Martin also indicated to me that they're considering federating Movebank to other countries, which is important as <a href=\"https://www.youtube.com/watch?v=gDTQ1rIEaYo&amp;list=PLlKst-jESy-8t7lg429Movg6Fmsq2DU7y\">biodiversity data resilience</a> was a hot topic in our <a href=\"https://anil.recoil.org/notes/nas-rs-biodiversity\">meeting</a> a few days before.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/ngs-3.webp\" title=\"The workshop was highly interactive through the 1.5 days. No laptops needed!\"/></p>\n<h2 id=\"storytelling-about-conservation-actions\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/natgeo-urban-wildlife/#storytelling-about-conservation-actions\"></a>Storytelling about conservation actions</h2>\n<p>I was really struck by how deeply the National Geographic staff were thinking about and co-designing solutions along with the academics involved. I got chatting to <a href=\"https://www.nationalgeographic.org/society/our-leadership/\">Ian Miller</a>, the chief scientist at NatGeo, about his scientific background (he's worked on all seven continents!) and how our <a href=\"https://anil.recoil.org/projects/ce\">conservation evidence database</a> might be of use to help the Society figure out the long-term impacts of their projects. I also met the person with the coolest job title there: <a href=\"https://www.linkedin.com/in/alextait/\">Alex Tait</a>, who is <a href=\"https://education.nationalgeographic.org/resource/mapping-change-roof-world/\">The Geographer</a> at the NGS. 
Alex, along with <a href=\"https://theorg.com/org/national-geographic-society/org-chart/lindsay-anderson\">Lindsay Anderson</a> and other NGS staff who participated, all had infectious enthusiasm about exploration combined with an encyclopedic knowledge of specific projects that they support involving explorers across the world.</p>\n<p>These projects ranged from the <a href=\"https://www.nationalgeographic.com/into-the-amazon/pink-dolphins-tricksters-and-thieves/\">Amazon River Dolphins</a> (to understand <a href=\"https://www.nationalgeographic.com/impact/article/fernando-trujillo-explorer-story\">aquatic health</a>) over to <a href=\"https://www.nationalgeographic.com/impact/article/alex-schnell-explorer-story\">cephalopod empathy</a> and <a href=\"https://www.nationalgeographic.com/impact/article\">many more</a>. These gave me a new perspective on the importance of <em>storytelling</em> as a key mechanism to help connect the dots from conservation actions to people; something that I've been learning from <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a>'s <a href=\"https://anil.recoil.org/notes/junior-rangers\">video series</a> as well!</p>\n<p><a href=\"https://www.nationalgeographic.com/impact\"> <img alt=\"%c\" src=\"https://anil.recoil.org/images/ngs-5.webp\" title=\"I spent the whole return trip reading the impact stories. So very, very, very inspiring.\"/> </a></p>\n<p>It's also worth noting that the NGS support goes beyond \"just\" filmmaking. 
Our own <a href=\"https://charlesemogor.com\">Charles Emogor</a> is also an <a href=\"https://explorers.nationalgeographic.org/directory/charles-agbor-emogor\">Explorer</a>, and recently received support from their <a href=\"https://www.nationalgeographic.org/society/our-programs/lab/\">Exploration Technology Lab</a> to get a bunch of <a href=\"https://www.wildlifeacoustics.com/products/song-meter-mini-2-aa\">biologgers</a> to support his research on <a href=\"https://anil.recoil.org/ideas/mapping-hunting-risks-for-wild-meat\">mapping hunting pressures</a>. Rather than placing a few big bets, the Society seems to focus on investing widely in a diverse range of people and geographies.</p>\n<h2 id=\"the-importance-of-hedgehogs\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/natgeo-urban-wildlife/#the-importance-of-hedgehogs\"></a>The importance of hedgehogs</h2>\n<p>A lot of the discussion at the workshop naturally focussed on charismatic mammals such as the amazing work done by the <a href=\"https://www.zambiacarnivores.org/\">Zambian Carnivore programme</a>. However, I also had in mind the importance of addressing issues closer to home in the UK as well so that we didn't ignore Europe.</p>\n<p>Luckily, before the workshop, I had grabbed a coffee with <a href=\"https://www.cambridgeconservation.org/about/people/dr-silviu-o-petrovan/\">Silviu Petrovan</a> from the CCI, who has been bringing me up to speed on the <a href=\"https://www.mammalweb.org/en/nhmp\">National Hedgehog Monitoring programme</a> (did you know that British hedgehogs are now <a href=\"https://www.britishhedgehogs.org.uk/british-hedgehog-now-officially-classified-as-vulnerable-to-extinction/\">vulnerable to extinction</a>?). 
This particular effort seems to tick a lot of boxes; it's a local and beloved species in the UK, it requires <a href=\"https://www.conservationevidence.com/individual-study/1018\">evidence-based interventions</a> to avoid making the problems worse, and also requires combining data sources (from camera traps to species distribution models to urban planning to the GPS Movebank data) to build up a really accurate high res picture of what's going on.</p>\n<p>I brought up UK hedgehog conservation at the NatGeo workshop, and then while down at <a href=\"https://earthfest.world/\">Earthfest</a> at Google a few days later I learnt from <a href=\"https://www.cfse.cam.ac.uk/directory/drew_purves\">Drew Purves</a> that they've developed an extremely high-res map of <a href=\"https://eoscience-external.projects.earthengine.app/view/farmscapes\">woodland and hedgerows</a> in the UK.  I've therefore created a new student project on <a href=\"https://anil.recoil.org/ideas/hedgehog-mapping\">hedgehog mapping</a> and hope to recruit a summer internship for this. It would be extremely cool to put the pieces together with a very concrete project such as this as a first small step for the UEP.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/ngs-1.webp\" title=\"NatGeo Basecamp is under construction, but still epic\"/></p>\n<p>I found the whole experience of visiting National Geographic inspirational, and not just because of the projects discussed. The walls of their HQ are full of incredible photographs of explorers all over the world, and a seemingly unbounded enthusiasm for exploring the unknown. 
I kind of thought I'd aged out on applying to become an explorer, but <a href=\"https://totalkatastrophe.blogspot.com/\">Kathy Ho</a> has been encouraging me to apply, and the same was echoed by the lovely conversations with NatGeo staffers.</p>\n<p>I'm therefore putting my thinking hat on for what my Explorers project proposal should be, as I am on academic sabbatical next year and have more freedom to travel; suggestions are welcome if you see me at the pub!</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/ngs-4.webp\" title=\"I might have deliberately gone the wrong way a few times while exploring the HQ\"/></p><h1>References</h1><ul><li>Madhavapeddy (2025). What I learnt at the National Academy of Sciences US-UK Forum on Biodiversity. <a href=\"https://doi.org/10.59350/j6zkp-n7t82\" target=\"_blank\"><i>10.59350/j6zkp-n7t82</i></a></li>\n<li>Madhavapeddy (2025). We become Junior Rangers at Shenandoah. <a href=\"https://doi.org/10.59350/d27v1-5tk68\" target=\"_blank\"><i>10.59350/d27v1-5tk68</i></a></li>\n<li>Ellis-Soto et al (2023). A vision for incorporating human mobility in the study of human\u2013wildlife interactions. Nature Ecology &amp; Evolution. 
<a href=\"https://doi.org/10.1038/s41559-023-02125-6\" target=\"_blank\"><i>10.1038/s41559-023-02125-6</i></a></li></ul>","doi":"https://doi.org/10.59350/7cpwj-d4161","guid":"https://doi.org/10.59350/7cpwj-d4161","language":"en","license":"https://creativecommons.org/licenses/by/4.0/legalcode","published_at":1749254400,"reference":[{"id":"https://doi.org/10.59350/j6zkp-n7t82","unstructured":"<b>[cito:citesAsRelated]</b>"},{"id":"https://doi.org/10.59350/d27v1-5tk68","unstructured":"<b>[cito:citesAsRelated]</b>"},{"id":"https://doi.org/10.1038/s41559-023-02125-6","unstructured":"<b>[cito:cites]</b>"}],"rid":"pd2qp-yxd12","summary":"I stayed on for a few days extra in Washington DC after the biodiversity extravaganza to attend a workshop at legendary National Geographic Basecamp.","tags":["Natgeo","Usa","Biodiversity","Urban"],"title":"Visiting National Geographic HQ and the Urban Exploration Project","updated_at":1781259299,"url":"https://anil.recoil.org/notes/natgeo-urban-wildlife","version":"v1"}},{"document":{"authors":[{"affiliation":[{"id":"https://ror.org/013meh722","name":"University of Cambridge"}],"contributor_roles":[],"family":"Madhavapeddy","given":"Anil","url":"https://orcid.org/0000-0001-8954-2428"}],"blog":{"authors":null,"community_id":"472a49be-dc61-4a17-97f0-d1ff17b0dadd","created":1760313600,"current_feed_url":null,"description":null,"favicon":"https://rogue-scholar.org/api/communities/472a49be-dc61-4a17-97f0-d1ff17b0dadd/logo","feed_format":"application/feed+json","feed_url":"https://anil.recoil.org/perma.json","filter":null,"generator":"Other","home_page_url":"https://anil.recoil.org/notes","issn":null,"language":"eng","license":"https://creativecommons.org/licenses/by/4.0/legalcode","prefix":"10.59350","relative_url":null,"secure":true,"slug":"anil","status":"active","subfield":"1702","title":"Anil Madhavapeddy's feed","updated":1781222400,"use_api":null},"blog_name":"Anil Madhavapeddy's feed","blog_slug":"anil","content_html":"<p>Apple made a 
notable <a href=\"https://developer.apple.com/videos/play/wwdc2025/346/\">announcement</a> in <a href=\"https://developer.apple.com/wwdc25/\">WWDC 2025</a> that they've got a new containerisation framework in the new Tahoe beta. This took me right back to the early <a href=\"https://docs.docker.com/desktop/setup/install/mac-install/\">Docker for Mac</a> days in 2016 when we <a href=\"https://www.docker.com/blog/docker-unikernels-open-source/\">announced</a> the first mainstream use of the <a href=\"https://developer.apple.com/documentation/hypervisor\">hypervisor framework</a>, so I couldn't resist taking a quick peek under the hood.</p>\n<p>There were two separate things announced: a <a href=\"https://github.com/apple/containerization\">Containerization framework</a> and also a <a href=\"https://github.com/apple/container\">container</a> CLI tool that aims to be an <a href=\"https://opencontainers.org/\">OCI</a> compliant tool to manipulate and execute container images. The former is a general-purpose framework that could be used by Docker, but it wasn't clear to me where the new CLI tool fits in among the existing layers of <a href=\"https://github.com/opencontainers/runc\">runc</a>, <a href=\"https://containerd.io/\">containerd</a> and of course Docker itself. 
The only way to find out is to take the new release for a spin, since Apple open-sourced everything (well done!).</p>\n<h2 id=\"getting-up-and-running\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/apple-containerisation/#getting-up-and-running\"></a>Getting up and running</h2>\n<p>To get the full experience, I chose to install the <a href=\"https://www.apple.com/uk/newsroom/2025/06/macos-tahoe-26-makes-the-mac-more-capable-productive-and-intelligent-than-ever/\">macOS Tahoe beta</a>, as there have been improvements to the networking frameworks<sup id=\"fnref:1\"><a class=\"footnote\" href=\"https://anil.recoil.org/notes/apple-containerisation/#fn:1\">[1]</a></sup> that are only present in the new beta. It's essential you only use the <a href=\"https://developer.apple.com/news/releases/?id=06092025g\">Xcode 26 beta</a> as otherwise you'll get Swift link errors against vmnet. I had to force my installation to use the right toolchain via:</p>\n<pre><code>sudo xcode-select --switch /Applications/Xcode-beta.app/Contents/Developer\n</code></pre>\n<p>Once that was done, it was simple to clone and install the <a href=\"https://github.com/apple/container\">container\nrepo</a> with a <code>make install</code>. The first\nthing I noticed is that everything is written in Swift with no Go in sight.\nThey still use Protobuf for communication among the daemons, as most of the\nwider Docker ecosystem does.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/macos-ss-1.webp\" title=\"I have mixed feelings about the new glass UI in macOS Tahoe. 
The tabs in the terminal are so low contrast they're impossible to distinguish!\"/></p>\n<h2 id=\"starting-our-first-apple-container\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/apple-containerisation/#starting-our-first-apple-container\"></a>Starting our first Apple container</h2>\n<p>Let's start our daemon up and take the <code>container</code> CLI for a spin.</p>\n<pre><code class=\"language-sh\">$ container system start\nVerifying apiserver is running...\nInstalling base container filesystem...\nNo default kernel configured.\nInstall the recommended default kernel from [https://github.com/kata-containers/kata-containers/releases/download/3.17.0/kata-static-3.17.0-arm64.tar.xz]? [Y/n]: y\nInstalling kernel... \n\u2819 [1/2] Downloading kernel 33% (93.4/277.1 MB, 14.2 MB/s) [5s]\n</code></pre>\n<p>The first thing we notice is it downloading a full Linux kernel from the <a href=\"https://github.com/kata-containers/kata-containers\">Kata Containers</a> project. This system spins up a VM per container in order to provide more isolation. Although I haven't tracked Kata closely since its <a href=\"https://techcrunch.com/2017/12/05/intel-and-hyper-partner-with-the-openstack-foundation-to-launch-the-kata-containers-project/\">launch</a> in 2017, I did notice it being used to containerise <a href=\"https://confidentialcomputing.io/\">confidential computing enclaves</a> while <a href=\"https://zatkh.github.io/\">Zahra Tarkhani</a> and I were working on <a href=\"https://anil.recoil.org/projects/difc-tee\">TEE programming models</a> a few years ago.</p>\n<p>The use of Kata tells us that <code>container</code> spins up a new kernel using the\nmacOS <a href=\"https://developer.apple.com/documentation/virtualization\">Virtualization framework</a> every time a new container is started. 
This\nis ok for production use (where extra isolation may be appropriate in a\nmultitenant cloud environment) but very memory inefficient for development\n(where it's usual to spin up 4-5 VMs for a development environment with a\ndatabase etc). In contrast, Docker for Mac <a href=\"https://speakerdeck.com/avsm/the-functional-innards-of-docker-for-mac-and-windows\">uses</a> a single Linux kernel and runs\nthe containers within that instead.</p>\n<p>It's not quite clear to me why Apple chose the extra overheads of a\nVM-per-container, but I suspect this might be something to do with running code securely\ninside the <a href=\"https://support.apple.com/en-gb/guide/security/sec59b0b31ff/web\">many hardware enclaves</a>\npresent in modern Apple hardware, a usecase that is on the rise with <a href=\"https://www.apple.com/uk/apple-intelligence/\">Apple\nIntelligence</a>.</p>\n<h2 id=\"peeking-under-the-hood-of-the-swift-code\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/apple-containerisation/#peeking-under-the-hood-of-the-swift-code\"></a>Peeking under the hood of the Swift code</h2>\n<p>Once the container daemon is running, we can spin up our first container using Alpine, which uses the familiar Docker-style <code>run</code>:</p>\n<pre><code class=\"language-sh\">$ time container run alpine uname -a \nLinux 3c555c19-b235-4956-bed8-27bcede642a6 6.12.28 #1 SMP\nTue May 20 15:19:05 UTC 2025 aarch64 Linux\n0.04s user 0.01s system 6% cpu 0.733 total\n</code></pre>\n<p>The container spinup time is noticeable, but still less than a second and pretty acceptable for day to day use. This is possible thanks to a custom userspace they implement via a Swift init process that's run by the Linux kernel as the <em>sole</em> binary in the filesystem, and that provides an RPC interface to manage other services. 
The <a href=\"https://github.com/apple/containerization/tree/main/vminitd/Sources/vminitd\">vminitd</a> is built using the Swift static Linux SDK, which links <a href=\"https://musl.libc.org/\">musl libc</a> under the hood (the same one used by <a href=\"https://www.alpinelinux.org/\">Alpine Linux</a>).</p>\n<p>We can see the processes running by using <a href=\"https://man7.org/linux/man-pages/man1/pstree.1.html\">pstree</a>:</p>\n<pre><code>|- 29203 avsm /System/Library/Frameworks/Virtualization.framework/\n   Versions/A/XPCServices/com.apple.Virtualization.VirtualMachine.xpc/\n   Contents/MacOS/com.apple.Virtualization.VirtualMachine\n|- 29202 avsm &lt;..&gt;/plugins/container-runtime-linux/\n   bin/container-runtime-linux\n   --root &lt;..&gt;/f82d3a52-c89b-4ff0-9e71-c7127cb5eee1\n   --uuid f82d3a52-c89b-4ff0-9e71-c7127cb5eee1 --debug\n|- 28896 avsm &lt;..&gt;/bin/container-network-vmnet\n   start --id default\n   --service-identifier &lt;..&gt;network.container-network-vmnet.default\n|- 28899 avsm &lt;..&gt;/bin/container-core-images start\n|- 29202 avsm &lt;..&gt;/bin/container-runtime-linux\n   --root &lt;..&gt;/f82d3a52-c89b-4ff0-9e71-c7127cb5eee1\n   --uuid f82d3a52-c89b-4ff0-9e71-c7127cb5eee1 --debug\n|- 28896 avsm &lt;..&gt;/container-network-vmnet start --id default\n   --service-identifier &lt;..&gt;network.container-network-vmnet.default\n</code></pre>\n<p>You can start to see the overheads of a VM-per-container now, as each container\nneeds the host process infrastructure to not only run the computation, but also to\nfeed it with networking and storage IO (which have to be translated from the\nhost).  Still, it's a drop in the ocean for macOS these days, as I'm running 850\nprocesses in the background on my Macbook Air from an otherwise fresh\ninstallation! 
This isn't the lean, fast MacOS X Cheetah I used on my G4 Powerbook anymore,\nsadly.</p>\n<h3 id=\"finding-the-userspace-ext4-in-swift\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/apple-containerisation/#finding-the-userspace-ext4-in-swift\"></a>Finding the userspace ext4 in Swift</h3>\n<p>I then tried to run a more interesting container for my local dev environment:\nthe <a href=\"https://hub.docker.com/r/ocaml/opam\">ocaml/opam</a> Docker images that we use\nin OCaml development.  This showed up an interesting new twist in the Apple\nrewrite: they have an entire <a href=\"https://en.wikipedia.org/wiki/Ext4\">ext4</a> filesystem <a href=\"https://github.com/apple/containerization/tree/main/Sources/ContainerizationEXT4\">implementation written in\nSwift</a>!\nThis is used to extract the OCI images from the Docker registry and then\nconstruct a new filesystem.</p>\n<pre><code class=\"language-sh\">$ container run ocaml/opam opam list\n\u2826 [2/6] Unpacking image for platform linux/arm64 (112,924 entries, 415.9 MB, Zero KB/s) [9m 22s] \n\u2839 [2/6] Unpacking image for platform linux/arm64 (112,972 entries, 415.9 MB, Zero KB/s) [9m 23s] \n\u2807 [2/6] Unpacking image for platform linux/arm64 (113,012 entries, 415.9 MB, Zero KB/s) [9m 23s] \n\u283c [2/6] Unpacking image for platform linux/arm64 (113,059 entries, 415.9 MB, Zero KB/s) [9m 23s] \n\u280b [2/6] Unpacking image for platform linux/arm64 (113,104 entries, 415.9 MB, Zero KB/s) [9m 24s] \n# Packages matching: installed                                                                      \n# Name                # Installed # Synopsis\nbase-bigarray         base\nbase-domains          base\nbase-effects          base\nbase-threads          base\nbase-unix             base\nocaml                 5.3.0       The OCaml compiler (virtual package)\nocaml-base-compiler   5.3.0       pinned to version 5.3.0\nocaml-compiler        5.3.0       Official release of OCaml 
5.3.0\nocaml-config          3           OCaml Switch Configuration\nopam-depext           1.2.3       Install OS distribution packages\n</code></pre>\n<p>The only hitch here is how slow this process is. The OCaml images do have a lot of individual\nfiles within the layers (not unusual for a package manager), but I was surprised that this took\n10 minutes on my modern M4 Macbook Air, versus a few seconds on Docker for Mac.  I <a href=\"https://github.com/apple/container/issues/136\">filed a bug</a> upstream to investigate further since (as with any new implementation) there are many <a href=\"https://anil.recoil.org/papers/2015-sosp-sibylfs\">edge cases</a> when handling filesystems in userspace, and the Apple code seems to have <a href=\"https://github.com/apple/container/issues/134\">other limitations</a> as well.  I'm sure this will all shake out as the framework gets more users, but it's worth bearing in mind if you're thinking of using it in the near term in a product.</p>\n<h2 id=\"whats-conspicuously-missing\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/apple-containerisation/#whats-conspicuously-missing\"></a>What's conspicuously missing?</h2>\n<p>I was super excited when this announcement first happened, since I thought it might be the beginning of a few features I've needed for years and years. But they're missing...</p>\n<h3 id=\"running-macos-containers-nope\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/apple-containerisation/#running-macos-containers-nope\"></a>Running macOS containers: nope</h3>\n<p>In OCaml-land, we have gone to ridiculous lengths to be able to run macOS CI on our own infrastructure. 
<a href=\"https://patrick.sirref.org\">Patrick Ferris</a> first wrote a <a href=\"https://tarides.com/blog/2023-08-02-obuilder-on-macos/\">custom snapshotting builder</a> using undocumented interfaces like userlevel sandboxing, subsequently taken over and maintained by <a href=\"https://www.tunbury.org/\">Mark Elvers</a>. This is a tremendous amount of work to maintain, but the alternative is to depend on very expensive hosted services to spin up individual macOS VMs which are slow and energy hungry.</p>\n<p>What we <em>really</em> need are macOS containers! We have dozens of mechanisms to run Linux ones already, and only a few <a href=\"https://github.com/dockur/macos\">heavyweight alternatives</a> to run macOS itself within macOS. However, the VM-per-container mechanism chosen by Apple might be the gateway to supporting macOS itself in the future. I will be first in line to test this if it happens!</p>\n<h3 id=\"running-ios-containers-nope\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/apple-containerisation/#running-ios-containers-nope\"></a>Running iOS containers: nope</h3>\n<p>Waaaay back when we were <a href=\"https://speakerdeck.com/avsm/the-functional-innards-of-docker-for-mac-and-windows\">first writing</a> Docker for Mac, there were no mainstream users of the Apple Hypervisor framework at all (that's why we built and released <a href=\"https://github.com/moby/hyperkit\">Hyperkit</a>). The main benefit we hoped to derive from using Apple-blessed frameworks is that they would make our app App-Store friendly for distribution via those channels.</p>\n<p>But while there do exist <a href=\"https://developer.apple.com/documentation/bundleresources/entitlements/com.apple.security.hypervisor\">entitlements</a> to support virtualisation on macOS, there is <em>no</em> support for iOS or iPadOS to this day! 
All of the trouble to sign binaries and deal with entitlements and opaque Apple tooling only gets it onto the Mac App store, which is a little bit of a graveyard compared to the iOS ecosystem.\nThis thus remains on my wishlist for Apple: the hardware on modern iPad devices <em>easily</em> supports virtualisation, but Apple is choosing to cripple these devices from having a decent development experience by not unlocking the software capability by allowing the hypervisor, virtualisation and container frameworks to run on there.</p>\n<h3 id=\"running-linux-containers-yeah-but-no-gpu\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/apple-containerisation/#running-linux-containers-yeah-but-no-gpu\"></a>Running Linux containers: yeah but no GPU</h3>\n<p>One reason to run Linux containers on macOS is to handle machine learning workloads. Actually getting this to be performant is tricky, since macOS has its own custom <a href=\"https://github.com/ml-explore/mlx\">MLX-based</a> approach to handling tensor computations. Meanwhile, the rest of the world mostly uses nVidia or AMD interfaces for those GPUs, which is reflected in container images that are distributed.</p>\n<p>There is some chatter on the <a href=\"https://github.com/apple/container/discussions/62#discussioncomment-13414483\">apple/container GitHub</a> about getting GPU passthrough working, but I'm still unclear on how to get a more portable GPU ABI. The reason Linux containers work so well is that the Linux kernel provides a very stable ABI, but this breaks down with GPUs badly.</p>\n<h1 id=\"does-this-threaten-dockers-dominance\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/apple-containerisation/#does-this-threaten-dockers-dominance\"></a>Does this threaten Docker's dominance?</h1>\n<p>I have mixed feelings about the Containerization framework release. 
On one hand, it's always fun to see more systems code in a new language like Swift, and this is an elegant and clean reimplementation of classic containerisation techniques in macOS. But the release <strong>fails to unlock any real new end-user capabilities</strong>, such as running a decent development environment on my iPad without using cloud services. Come on Apple, you can make that happen; you're getting ever closer every release!</p>\n<p>I don't believe that Docker or Orbstack are too threatened by this release at this stage either, despite some reports that <a href=\"https://appleinsider.com/articles/25/06/09/sorry-docker-macos-26-adds-native-support-for-linux-containers\">they're being Sherlocked</a>. The Apple container CLI is quite low-level, and there's a ton of quality-of-life features in the full Docker for Mac app that'll keep me using it, and there seems to be no real blocker from Docker adopting the Containerization framework as one of its optional backends. I prefer having a single VM for my devcontainers to keep my laptop battery life going, so I think Docker's current approach is better for that usecase.</p>\n<p>Apple has been a very good egg here by open sourcing all their code, so I believe this will overall help the Linux container ecosystem by adding choice to how we deploy software containers. Well done <a href=\"https://github.com/crosbymichael\">Michael Crosby</a>, <a href=\"https://github.com/mavenugo\">Madhu Venugopal</a> and many of my other former colleagues who are all merrily hacking away on this for doing so!  As an aside, I'm also just revising a couple of papers about the history of using OCaml in several Docker components, and a retrospective look back at the hypervisor architecture backing Docker for Desktop, which will appear in print in the next couple of months (I'll update this post when they appear). 
But for now, back to my day job of marking undergraduate exam scripts...</p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>vmnet is a networking framework for VMs/containers that I had to <a href=\"https://github.com/mirage/ocaml-vmnet\">reverse engineer</a> back in 2014 to use with OCaml/MirageOS.</p>\n<a class=\"reversefootnote\" href=\"https://anil.recoil.org/notes/apple-containerisation/#fnref:1\">\u21a9</a></p></li></ol></div><h1>References</h1><ul><li>Ridge et al (2015). SibylFS: formal specification and oracle-based testing for POSIX and real-world file systems. ACM. <a href=\"https://doi.org/10.1145/2815400.2815411\" target=\"_blank\"><i>10.1145/2815400.2815411</i></a></li></ul>","doi":"https://doi.org/10.59350/70ynk-ves20","guid":"https://doi.org/10.59350/70ynk-ves20","language":"en","license":"https://creativecommons.org/licenses/by/4.0/legalcode","published_at":1749600000,"reference":[{"id":"https://doi.org/10.1145/2815400.2815411","unstructured":"<b>[cito:citesAsSourceDocument]</b>"}],"rid":"5gf2r-ag171","summary":"Apple made a notable announcement in WWDC 2025 that they've got a new containerisation framework in the new Tahoe beta. 
This took me right back to the early Docker for Mac days in 2016 when we announced the first mainstream use of the hypervisor framework, so I couldn't resist taking a quick peek under the hood.","tags":["Docker","Containers","Systems","Networking","Macos"],"title":"Under the hood with Apple's new Containerization framework","updated_at":1781259298,"url":"https://anil.recoil.org/notes/apple-containerisation","version":"v1"}},{"document":{"authors":[{"affiliation":[{"id":"https://ror.org/013meh722","name":"University of Cambridge"}],"contributor_roles":[],"family":"Madhavapeddy","given":"Anil","url":"https://orcid.org/0000-0001-8954-2428"}],"blog":{"authors":null,"community_id":"472a49be-dc61-4a17-97f0-d1ff17b0dadd","created":1760313600,"current_feed_url":null,"description":null,"favicon":"https://rogue-scholar.org/api/communities/472a49be-dc61-4a17-97f0-d1ff17b0dadd/logo","feed_format":"application/feed+json","feed_url":"https://anil.recoil.org/perma.json","filter":null,"generator":"Other","home_page_url":"https://anil.recoil.org/notes","issn":null,"language":"eng","license":"https://creativecommons.org/licenses/by/4.0/legalcode","prefix":"10.59350","relative_url":null,"secure":true,"slug":"anil","status":"active","subfield":"1702","title":"Anil Madhavapeddy's feed","updated":1781222400,"use_api":null},"blog_name":"Anil Madhavapeddy's feed","blog_slug":"anil","content_html":"<p>The <a href=\"https://www.esa.int/Applications/Observing_the_Earth/FutureEO/Biomass\">BIOMASS</a> forest mission satellite was <a href=\"https://www.bbc.co.uk/newsround/articles/c0jzy3g0zx2o\">successfully</a> boosted into space a couple of days ago, after decades of development from just down the road in <a href=\"https://www.gov.uk/government/news/british-built-satellite-to-map-earths-forests-in-3d-for-the-first-time\">Stevenage</a>. 
I'm excited by this because it's the first global-scale <a href=\"https://www.esa.int/Applications/Observing_the_Earth/FutureEO/Biomass/The_instrument\">P-band SAR</a> instrument that can penetrate forest canopies to look underneath. This, when combined with <a href=\"https://anil.recoil.org/papers/2024-hyper-tropical-mapping\">hyperspectral mapping</a> will give us a lot more <a href=\"https://anil.recoil.org/projects/rsn\">insight</a> into global tree health.</p>\n<p>Weirdly, the whole thing almost never happened because permission to use the <a href=\"https://ieeexplore.ieee.org/document/9048581\">P-band</a> was blocked because it might <a href=\"https://spacenews.com/us-missile-warning-radars-could-squelch-esas-proposed-biomass-mission/\">interfere with US nuclear missile warning radars</a> back in 2013.</p>\n<blockquote>\n<p>Meeting in Graz, Austria, to select the the 7th Earth Explorer mission to be flown by the 20-nation European Space Agency (ESA), backers of the Biomass mission were pelted with questions about how badly the U.S. network of missile warning and space-tracking radars in North America, Greenland and Europe would undermine Biomass' global carbon-monitoring objectives.</p>\n<p>Europe's Earth observation satellite system may be the world's most dynamic, but as it pushes its operating envelope into new areas, it is learning a lesson long ago taught to satellite telecommunications operators: Radio frequency is scarce, and once users have a piece of it they hold fast.\n<cite>-- <a href=\"https://spacenews.com/us-missile-warning-radars-could-squelch-esas-proposed-biomass-mission/\">Spacenews</a> (2013)</cite></p>\n</blockquote>\n<p>Luckily, all this got sorted by international frequency negotiators, and after\n<a href=\"https://www.thecomet.net/news/25125302.satellite-built-stevenage-airbus-launches-space/\">being built by Airbus in Stevenage</a>\n(and Germany and France, as it's a complex instrument!) it took off without a hitch. 
Looking forward to getting my hands on the first results later in the year over at the <a href=\"https://eo.conservation.cam.ac.uk\">Centre for Earth Observation</a>.</p>\n<p>Check out this cool <a href=\"https://www.esa.int/Applications/Observing_the_Earth/FutureEO/Biomass/The_instrument\">ESA video</a> about the instrument to learn more, and congratulations to the team at ESA. Looking forward to the next <a href=\"https://anil.recoil.org/notes/biospace-25\">BIOSPACE</a> where there will no doubt be initial buzz about this.</p>\n<p><div class=\"video-center\"><iframe allowfullscreen=\"\" frameborder=\"0\" height=\"315px\" sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\" src=\"https://crank.recoil.org/videos/embed/c3981e1f-3f2d-439a-924d-6d29de33cfe4\" title=\"BIOMASS p-band mirror\" width=\"100%\"></iframe></div></p>\n<p><em>Update 28th June 2025:</em> See also this <a href=\"https://www.bbc.co.uk/news/resources/idt-d7353b50-0fea-46ba-8495-ae9e25192cfe\">beautiful BBC article</a> about the satellite, via <a href=\"https://coomeslab.org\">David Coomes</a>.</p><h1>References</h1><ul><li>Madhavapeddy (2025). ESA's first BioSpace conference seems a huge success. <a href=\"https://doi.org/10.59350/vd6af-4bc83\" target=\"_blank\"><i>10.59350/vd6af-4bc83</i></a></li>\n<li>Ball et al (2024). Harnessing temporal &amp; spectral dimensionality to identify individual trees in tropical forests. bioRxiv. <a href=\"https://doi.org/10.1101/2024.06.24.600405\" target=\"_blank\"><i>10.1101/2024.06.24.600405</i></a></li>\n<li>Li et al (2019). The P-band SAR Satellite: Opportunities and Challenges. 2019 6th Asia-Pacific Conference on Synthetic Aperture Radar (APSAR). 
<a href=\"https://doi.org/10.1109/APSAR46974.2019.9048581\" target=\"_blank\"><i>10.1109/APSAR46974.2019.9048581</i></a></li></ul>","doi":"https://doi.org/10.59350/53zjq-ft509","guid":"https://doi.org/10.59350/53zjq-ft509","language":"en","license":"https://creativecommons.org/licenses/by/4.0/legalcode","published_at":1746057600,"reference":[{"id":"https://doi.org/10.59350/vd6af-4bc83","unstructured":"<b>[cito:citesAsRelated]</b>"},{"id":"https://doi.org/10.1101/2024.06.24.600405","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.1109/apsar46974.2019.9048581","unstructured":"<b>[cito:cites]</b>"}],"rid":"46t51-7zq39","summary":"The BIOMASS forest mission satellite was successfully boosted into space a couple of days ago, after decades of development from just down the road in Stevenage. I'm excited by this because it's the first global-scale P-band SAR instrument that can penetrate forest canopys to look underneath. This, when combined with hyperspectral mapping will give us a lot more insight into global tree health.","tags":["Sensing","Space","Satellite","Forests","Biodiversity"],"title":"BIOMASS launches to measure forest carbon flux from space","updated_at":1781259297,"url":"https://anil.recoil.org/notes/biomass-launches","version":"v1"}},{"document":{"authors":[{"affiliation":[{"id":"https://ror.org/013meh722","name":"University of 
Cambridge"}],"contributor_roles":[],"family":"Madhavapeddy","given":"Anil","url":"https://orcid.org/0000-0001-8954-2428"}],"blog":{"authors":null,"community_id":"472a49be-dc61-4a17-97f0-d1ff17b0dadd","created":1760313600,"current_feed_url":null,"description":null,"favicon":"https://rogue-scholar.org/api/communities/472a49be-dc61-4a17-97f0-d1ff17b0dadd/logo","feed_format":"application/feed+json","feed_url":"https://anil.recoil.org/perma.json","filter":null,"generator":"Other","home_page_url":"https://anil.recoil.org/notes","issn":null,"language":"eng","license":"https://creativecommons.org/licenses/by/4.0/legalcode","prefix":"10.59350","relative_url":null,"secure":true,"slug":"anil","status":"active","subfield":"1702","title":"Anil Madhavapeddy's feed","updated":1781222400,"use_api":null},"blog_name":"Anil Madhavapeddy's feed","blog_slug":"anil","content_html":"<p>The exam marking is over, and a glorious Cambridge summer awaits! This year, we\nhave a sizeable cohort of undergraduate and graduate interns joining us from\nnext week.</p>\n<p>This note serves as a point of coordination to keep track of what's\ngoing on, and I'll update it as we get ourselves organised.\nIf you're an intern, then I highly recommend you take the time to carefully\nread through all of this, starting with <a href=\"https://anil.recoil.org/notes/eeg-interns-2025/#who-we-all-are-this-summer\">who we are</a>,\nsome <a href=\"https://anil.recoil.org/notes/eeg-interns-2025/#ground-rules\">ground rules</a>, <a href=\"https://anil.recoil.org/notes/eeg-interns-2025/#where-we-will-work\">where we will work</a>,\n<a href=\"https://anil.recoil.org/notes/eeg-interns-2025/#registering-on-chat-channels\">how we chat</a>, <a href=\"https://anil.recoil.org/notes/eeg-interns-2025/#how-you-will-get-paid\">how to get paid</a>, and of course <a href=\"https://anil.recoil.org/notes/eeg-interns-2025/#summer-social-activities\">social activities</a> to make sure we have some fun!</p>\n<h2 
id=\"who-we-all-are-this-summer\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/eeg-interns-2025/#who-we-all-are-this-summer\"></a>Who we all are this summer</h2>\n<p>We're working on quite the diversity of projects this summer, ranging from classic\ncomputer systems and programming problems all the way through to environmental\nscience. Here's a recap of what's going on.</p>\n<p>First we're working against the <a href=\"https://anil.recoil.org/projects/ce\">evidence database</a> we've been building for the past couple of years:</p>\n<ul>\n<li><em>\"<a href=\"https://anil.recoil.org/ideas/ai-assisted-inclusion-criteria\">Evaluating a human-in-the-loop AI framework to improve inclusion criteria for evidence synthesis</a>\"</em> with <a href=\"mailto:ra684@cam.ac.uk\">Radhika Agrawal</a>, supervised by <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a></li>\n<li><em>\"<a href=\"https://anil.recoil.org/ideas/accurate-summarisation-for-ce\">Accurate summarisation of threats for conservation evidence literature</a>\"</em> with <a href=\"mailto:kh807@cam.ac.uk\">Kittson Hamill</a>, supervised by <a href=\"https://toao.com\">Sadiq Jaffer</a> following up her successful MPhil submission.</li>\n</ul>\n<p>We're then heading into <a href=\"https://anil.recoil.org/projects/rsn\">remote sensing</a> and working on some mapping projects:</p>\n<ul>\n<li><em>\"<a href=\"https://anil.recoil.org/ideas/cairngorms-connect-habitats\">Habitat mapping of the Cairngormes Connect restoration area</a>\"</em> with <a href=\"https://github.com/Isabel-Mansley\">Isabel Mansley</a>, supervised by <a href=\"https://coomeslab.org\">David Coomes</a> and <a href=\"https://eo.conservation.cam.ac.uk/people/aland-chan/\">Aland Chan</a></li>\n<li><em>\"<a href=\"https://anil.recoil.org/ideas/hedgehog-mapping\">Mapping urban and rural British hedgehogs</a>\"</em> with <a 
href=\"https://www.theboatrace.org/athletes/gabriel-mahler\">Gabriel Mahler</a>, supervised by <a href=\"https://www.cambridgeconservation.org/about/people/dr-silviu-o-petrovan/\">Silviu Petrovan</a>, as well as writing up his MPhil dissertation on <em>\"<a href=\"https://anil.recoil.org/ideas/walkability-for-osm\">Enhancing Navigation Algorithms with Semantic Embeddings</a>\"</em></li>\n<li><em>\"<a href=\"https://anil.recoil.org/ideas/validating-anti-poaching-predictions\">Validating predictions with ranger insights to enhance anti-poaching patrol strategies in protected areas</a>\"</em> with <a href=\"mailto:hm708@cam.ac.uk\">Hannah McLoone</a>, supervised by <a href=\"https://charlesemogor.com\">Charles Emogor</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/professor-rob-fletcher\">Rob Fletcher</a></li>\n</ul>\n<p>Dropping down towards <a href=\"https://anil.recoil.org/projects/osmose\">embedded systems</a> and fun \"real-world\" projects, we have:</p>\n<ul>\n<li><em>\"<a href=\"https://anil.recoil.org/ideas/digitisation-of-insects\">Affordable digitisation of insect collections using photogrammetry</a>\"</em> with <a href=\"mailto:bsys2@cam.ac.uk\">Beatrice Spence</a>, <a href=\"mailto:ntay2@cam.ac.uk\">Anna Yiu</a> and <a href=\"mailto:aer82@cam.ac.uk\">Arissa-Elena Rotunjanu</a>, supervised by <a href=\"https://www.cambridgephilosophicalsociety.org/funding/henslow-fellows/dr-tiffany-ki%0A\">Tiffany Ki</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/dr-edgar-turner\">Edgar Turner</a></li>\n<li><em>\"<a href=\"https://anil.recoil.org/ideas/3d-print-world\">3D printing the planet (or bits of it)</a>\"</em> with <a href=\"mailto:fs618@cam.ac.uk\">Finley Stirk</a>, supervised by <a href=\"https://mynameismwd.org\">Michael Dales</a></li>\n<li><em>\"<a href=\"https://anil.recoil.org/ideas/embedded-whisper\">Low power audio transcription with Whisper</a>\"</em> with <a href=\"mailto:dk729@cam.ac.uk\">Dan Kvit</a> and <em>\"<a 
href=\"https://anil.recoil.org/ideas/battery-free-riotee\">Battery-free wildlife monitoring with Riotee</a>\"</em> with <a href=\"mailto:dp717@cam.ac.uk\">Dominico Parish</a>, both supervised by <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a></li>\n</ul>\n<p>Going back to classic computer science, we have a few programming language and systems projects:</p>\n<ul>\n<li><em>\"<a href=\"https://anil.recoil.org/ideas/hazel-to-ocaml-to-hazel\">Bidirectional Hazel to OCaml programming</a>\"</em> with <a href=\"https://maxcarroll0.github.io/blog/\">Max Carroll</a>, supervised by <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> and <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a></li>\n<li><em>\"<a href=\"https://anil.recoil.org/ideas/effects-scheduling-ocaml-compiler\">Effects based scheduling for the OCaml compiler pipeline</a>\"</em> with <a href=\"mailto:khm39@cam.ac.uk\">Lucas Ma</a> and <em>\"<a href=\"https://anil.recoil.org/ideas/ocaml-bytecode-native-ffi\">Runtimes \u00e0 la carte: crossloading native and bytecode OCaml</a>\"</em> with <a href=\"mailto:jc2483@cam.ac.uk\">Jeremy Chen</a>, both supervised by <a href=\"https://www.dra27.uk\">David Allsopp</a></li>\n<li><em>\"<a href=\"https://anil.recoil.org/ideas/zfs-filesystem-perf\">ZFS replication strategies with encryption</a>\"</em> with <a href=\"mailto:btt31@cam.ac.uk\">Becky Terefe-Zenebe</a>, supervised by <a href=\"https://www.tunbury.org/\">Mark Elvers</a></li>\n</ul>\n<h2 id=\"ground-rules\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/eeg-interns-2025/#ground-rules\"></a>Ground rules</h2>\n<p>Since there are so many of us this summer, it's imperative that you're all\n<strong>proactive about communicating</strong> any problems or clarifications you need. 
If something\nhere doesn't make sense, or you have a better idea, then just reach out to any\nof the supervisors or me directly!</p>\n<p>Do also take time to <strong>learn from each other</strong>. Read up on not just your own project in the\nlist above, but take some time to read the remainder so that you have a sense of what everyone\nis working on. When you see each other, it'll be much easier to chat about what's going\non and find opportunities for commonality.</p>\n<p>The projects above have been carefully selected to <strong>not be on the critical path</strong> for any\ndeadlines. If it's not going well from your perspective, then it's ok to take a step back\nand figure out why! We're here to learn and discover things, so take the time to do so.</p>\n<h2 id=\"where-we-will-work\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/eeg-interns-2025/#where-we-will-work\"></a>Where we will work</h2>\n<p>This will be different for everyone, since it depends on which home department will house the project.\nSome of us will be in the David Attenborough Building, on the third floor where the <a href=\"https://www.conservation.cam.ac.uk\">CRI</a> is:</p>\n<ul>\n<li><a href=\"mailto:ra684@cam.ac.uk\">Radhika Agrawal</a> and <a href=\"mailto:kh807@cam.ac.uk\">Kittson Hamill</a> will be with the <a href=\"https://anil.recoil.org/projects/ce\">CE</a> crew near <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a>'s office</li>\n<li><a href=\"https://github.com/Isabel-Mansley\">Isabel Mansley</a> and <a href=\"https://www.theboatrace.org/athletes/gabriel-mahler\">Gabriel Mahler</a> will hang out with <a href=\"https://coomeslab.org\">David Coomes</a>'s group</li>\n<li><a href=\"mailto:hm708@cam.ac.uk\">Hannah McLoone</a> can work near <a href=\"https://www.zoo.cam.ac.uk/directory/professor-rob-fletcher\">Rob Fletcher</a>'s office where <a href=\"https://charlesemogor.com\">Charles Emogor</a> works</li>\n</ul>\n<p>Those 
working in the Zoology Museum itself (<a href=\"mailto:aer82@cam.ac.uk\">Arissa-Elena Rotunjanu</a>, <a href=\"mailto:bsys2@cam.ac.uk\">Beatrice Spence</a> and <a href=\"mailto:ntay2@cam.ac.uk\">Anna Yiu</a>) will have a health and safety induction on Monday with <a href=\"https://www.cambridgephilosophicalsociety.org/funding/henslow-fellows/dr-tiffany-ki%0A\">Tiffany Ki</a> and find offices there.</p>\n<p>The rest of us will be in the Computer Lab over in West Cambridge:</p>\n<ul>\n<li><a href=\"mailto:khm39@cam.ac.uk\">Lucas Ma</a> and <a href=\"mailto:jc2483@cam.ac.uk\">Jeremy Chen</a> will work out of FW15 with <a href=\"https://www.dra27.uk\">David Allsopp</a> and <a href=\"https://jon.recoil.org\">Jon Ludlam</a></li>\n<li><a href=\"mailto:dk729@cam.ac.uk\">Dan Kvit</a>, <a href=\"mailto:fs618@cam.ac.uk\">Finley Stirk</a>, <a href=\"mailto:btt31@cam.ac.uk\">Becky Terefe-Zenebe</a> and <a href=\"mailto:dp717@cam.ac.uk\">Dominico Parish</a> will be in FW15/14.  We may need to clear out one desk in FW15 to make room here (just put the stuff in my office in FW16). <a href=\"https://mynameismwd.org\">Michael Dales</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a> will work out of my office (FW16) for the summer, and <a href=\"https://www.cst.cam.ac.uk/people/og309\">Onkar Gulati</a> is away for an internship in the USA.</li>\n<li>We'll find somewhere for <a href=\"https://maxcarroll0.github.io/blog/\">Max Carroll</a> either in West Cambridge or in Pembroke soon, depending on preferences and heat!</li>\n</ul>\n<p>It'll probably take a week to let this all shake out, so please do shout if you find yourself stuck in your room and without an office! 
You should of course arrange to meet your immediate supervisors regularly according to whatever schedule and location works for you.</p>\n<h2 id=\"how-you-will-get-paid\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/eeg-interns-2025/#how-you-will-get-paid\"></a>How you will get paid</h2>\n<p>The way you get paid weekly is via the <a href=\"https://www.hrsystems.admin.cam.ac.uk/systems/systems-overview/ccws\">Cambridge Casual Worker</a> system. This has a few important steps that you <strong>must</strong> pay attention to, or you will not get paid!</p>\n<ul>\n<li><strong>Before starting work</strong> you must go find <a href=\"https://www.cst.cam.ac.uk/people/ac733\">Alicja Zavros</a> in the Computer Lab with your passport or other proof of your right to work in the UK.  I've told Alicja that many of you will show up on the morning of Monday 30th June. It won't take more than a few minutes, as she'll take a photocopy of your ID. You should also have registered on the <a href=\"https://www.hrsystems.admin.cam.ac.uk/systems/systems-overview/ccws\">CCWS</a> and gotten a login.</li>\n<li><strong>Every Friday</strong> that you do some work, fill in a timesheet on the CCWS. Round this off to a full day (8 hours) and don't do fine-grained timekeeping; just the number of days you've worked is fine. If you don't fill in a timesheet promptly, you won't get paid.</li>\n<li><strong>You must keep a research log with weeknotes</strong> that record what you've been up to. The exact style of weeknotes is entirely up to you, but it's vital that you get in the habit of keeping a log. If you have your own homepage, then send an <a href=\"https://en.wikipedia.org/wiki/Atom_(web_standard)\">Atom feed</a> to me. If you don't, then we have a <a href=\"https://github.com/ucam-eo/interns-2025\">github/ucam-eo/interns-2025</a> which I can give you write access to.  
It's typical to store your weeknotes in Markdown format, and just a simple subdirectory with a date-based convention is fine. The primary use of weeknotes is to highlight things you've accomplished, areas where you are blocked, and interesting things you have run across. Try to make it a record to your future self, and also a way to let those around you know what's going on. While missing the occasional weeknote is just fine, missing them all will be a problem, so plan your time accordingly.  Weeknotes are also <em>not</em> a mechanism to assess anything to do with your progress, but a simple form of communication.</li>\n</ul>\n<h2 id=\"registering-on-chat-channels\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/eeg-interns-2025/#registering-on-chat-channels\"></a>Registering on chat channels</h2>\n<p>Since we're all going to spread around Cambridge physically, it's important to have a chat channel. <a href=\"mailto:hm708@cam.ac.uk\">Hannah McLoone</a> is setting up a WhatsApp group for social things (see below), but we also use <a href=\"https://matrix.org\">Matrix</a> as our \"hackers choice\" for day-to-day messaging.</p>\n<p>We host a Computer Lab <a href=\"https://matrix.org\">Matrix</a> server on which anyone with a valid Raven account can create an account. Since Matrix is a decentralised chat system, it is also possible to use other accounts from third-party servers, and also to join channels elsewhere.</p>\n<p>To create an account:</p>\n<ul>\n<li>In your Matrix client (we most commonly use <a href=\"https://element.io\">Element</a>), select <code>eeg.cl.cam.ac.uk</code> as your homeserver.</li>\n<li>Login with SSO (Single Sign On)</li>\n<li>You should see a Cambridge authentication screen for your CRSID.</li>\n</ul>\n<p>Once you create your account, you will be in the \"EEG\" Matrix space.  
A <a href=\"https://matrix.org/blog/2021/05/17/the-matrix-space-beta/\">Matrix space</a> is a collection of channels, and you should join \"EEGeneral\" as the overall channel for the group. We'll create a separate room just for intern chats. We also have a bot in the room that posts our blogs to the channel, so you can keep up with what the group members are all chattering about. <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> runs the CL matrix server, and there are occasional quirks, so just let us know if you run into any problems.  I am <code>@avsm:recoil.org</code> on there, not <code>avsm2</code> as I use my personal Matrix for a bunch of stuff.</p>\n<h2 id=\"summer-social-activities\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/eeg-interns-2025/#summer-social-activities\"></a>Summer social activities</h2>\n<p>It's important to get some downtime this summer and recharge. <a href=\"mailto:hm708@cam.ac.uk\">Hannah McLoone</a> has been setting up a social group for the interns to hang out together, and we'll organise a punting excursion at some point to get us out to the river.  Of course, many of us will be travelling this summer (I'm heading off to Botswana in late July for instance), so please do also make suggestions.</p>","doi":"https://doi.org/10.59350/tf22g-p1822","guid":"https://doi.org/10.59350/tf22g-p1822","language":"en","license":"https://creativecommons.org/licenses/by/4.0/legalcode","published_at":1751068800,"rid":"0wpqs-a7079","summary":"The exam marking is over, and a glorious Cambridge summer awaits! This year, we have a sizeable cohort of undergraduate and graduate interns joining us from next week. 
This note serves as a point of coordination to keep track of what's going on, and I'll update it as we get ourselves organised.","tags":["Urop"],"title":"EEG internships for the summer of 2025","updated_at":1781259296,"url":"https://anil.recoil.org/notes/eeg-interns-2025","version":"v1"}},{"document":{"authors":[{"affiliation":[{"id":"https://ror.org/013meh722","name":"University of Cambridge"}],"contributor_roles":[],"family":"Madhavapeddy","given":"Anil","url":"https://orcid.org/0000-0001-8954-2428"}],"blog":{"authors":null,"community_id":"472a49be-dc61-4a17-97f0-d1ff17b0dadd","created":1760313600,"current_feed_url":null,"description":null,"favicon":"https://rogue-scholar.org/api/communities/472a49be-dc61-4a17-97f0-d1ff17b0dadd/logo","feed_format":"application/feed+json","feed_url":"https://anil.recoil.org/perma.json","filter":null,"generator":"Other","home_page_url":"https://anil.recoil.org/notes","issn":null,"language":"eng","license":"https://creativecommons.org/licenses/by/4.0/legalcode","prefix":"10.59350","relative_url":null,"secure":true,"slug":"anil","status":"active","subfield":"1702","title":"Anil Madhavapeddy's feed","updated":1781222400,"use_api":null},"blog_name":"Anil Madhavapeddy's feed","blog_slug":"anil","content_html":"<p>For the past few years, <a href=\"https://toao.com\">Sadiq Jaffer</a> and I have been working with our colleagues in\n<a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence</a> to do <a href=\"https://anil.recoil.org/papers/2024-ce-llm\">analysis at scale</a> on the\nacademic literature. Getting local access to millions of fulltext papers has not\nbeen without drama, but was made possible thanks to huge amounts of help from our\n<a href=\"https://www.lib.cam.ac.uk/\">University Library</a> who helped us navigate our\nrelationships with scientific publishers. 
We have just <strong><a href=\"https://rdcu.be/evkfj\">published a comment\nin Nature</a></strong> about the next phase\nof our research, where we are looking into the impact of AI advances on evidence synthesis.</p>\n<p><a href=\"https://rdcu.be/evkfj\"> <img alt=\"%c\" src=\"https://anil.recoil.org/images/davidparkins-ai-poison.webp\" title=\"AI poisoning the literature in a legendary cartoon. Credit: David Parkins, Nature\"/> </a></p>\n<p>Our work on literature reviews led us into assessing methods for <a href=\"https://royalsociety.org/news-resources/projects/evidence-synthesis/\">evidence\nsynthesis</a>\n(which is crucial to rational policymaking!) and specifically how recent advances in AI may\nimpact it.  The current methods for <a href=\"https://en.wikipedia.org/wiki/Systematic_review\">rigorous systematic literature review</a> are expensive and slow, and authors are already struggling to keep up with the <a href=\"https://ourworldindata.org/grapher/scientific-and-technical-journal-articles?time=latest\">rapidly expanding</a>\nnumber of legitimate papers. Adding to this, <a href=\"https://retractionwatch.com/2025/\">paper retractions</a> are increasing near\n<a href=\"https://doi.org/10.1038/d41586-023-03974-8\">exponentially</a> and\nsystematic reviews already <a href=\"https://retractionwatch.com/the-retraction-watch-leaderboard/top-10-most-highly-cited-retracted-papers/\">unknowingly cite</a>\nretracted papers, with most remaining uncorrected even a year after notification!</p>\n<p>This is all made much more complex as LLMs are flooding the landscape with\nconvincing, fake manuscripts and doctored data, potentially overwhelming our\ncurrent ability to distinguish fact from fiction.  
Just this March, the <a href=\"https://sakana.ai/ai-scientist/\">AI\nScientist</a> formulated hypotheses, designed and\nran experiments, analysed the results, generated the figures and produced a\nmanuscript that <a href=\"https://sakana.ai/ai-scientist-first-publication/\">passed human peer\nreview</a> for an ICLR\nworkshop! Distinguishing genuine papers from those produced by LLMs isn't just\na problem for review authors; it's a threat to the very foundation of\nscientific knowledge. And meanwhile, Google is taking a different tack with a\ncollaborative <a href=\"https://research.google/blog/accelerating-scientific-breakthroughs-with-an-ai-co-scientist/\">AI co-scientist</a> that acts as a multi-agent assistant.</p>\n<p>So the landscape is moving <em>really</em> quickly! Our proposal for the future of\nliterature reviews builds on our desire to move towards a more regional,\nfederated network approach. Instead of having giant repositories of knowledge\nthat <a href=\"https://en.wikipedia.org/wiki/2025_United_States_government_online_resource_removals\">may be erased unilaterally</a>,\nwe're aiming for a more bilateral network of \"living evidence databases\".\nEvery government, especially those in the Global South, should have the ability to build their\nown \"<a href=\"https://anil.recoil.org/notes/uk-national-data-lib\">national data libraries</a>\" which represent the body\nof digital data that affects their own regional needs.</p>\n<p>This system of living evidence databases can be incremental and dynamically\nupdated, and AI assistance can be used as long as humans remain in the loop.\nSuch a system can continuously gather, screen, and index literature,\nautomatically removing compromised studies and recalculating results.  
We're\nworking on this on multiple fronts this year; ranging from the computer science\nto figure out the distributed-nitty-gritty <sup id=\"fnref:1\"><a class=\"footnote\" href=\"https://anil.recoil.org/notes/ai-poisoning/#fn:1\">[1]</a></sup>, over to working with the\n<a href=\"https://anil.recoil.org/notes/nas-rs-biodiversity\">GEOBON folk</a> on global biodiversity <a href=\"https://www.tunbury.org/2025/07/02/bon-in-a-box/\">data\nmanagement</a>, and continuing\nto drive the core LED design at Conservation Evidence. It feels like a</p>\n<p>Read our <a href=\"https://www.nature.com/articles/d41586-025-02069-w\">Nature Comment piece</a> (<a href=\"https://www.linkedin.com/posts/anilmadhavapeddy_will-ai-speed-up-literature-reviews-or-derail-activity-7348317711002705920-Y5UT?rcm=ACoAAAB0Kb0BNo1v6ylsGU2NtPa95mj-w1VcaJA\">comment on LI</a>) to learn more about how we think we can safeguard evidence synthesis against the rising tide of \"AI-poisoned literature\" and ensure the continued integrity of scientific discovery. As a random bit of trivia, the incredibly cool artwork in the piece was drawn by the legendary <a href=\"https://www.davidparkins.com/\">David Parkins</a>, who also drew <a href=\"https://www.beano.com/\">Beano</a> and <a href=\"https://en.wikipedia.org/wiki/Dennis_the_Menace_and_Gnasher\">Dennis the Menace</a>!</p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>My instinct is that we'll end up with something <a href=\"https://arxiv.org/abs/2402.03239\">ATProto based</a> as it's so convenient for <a href=\"https://www.tunbury.org/2025/04/25/bluesky-ssh-authentication/\">distributed system authentication</a>.</p>\n<a class=\"reversefootnote\" href=\"https://anil.recoil.org/notes/ai-poisoning/#fnref:1\">\u21a9</a></p></li></ol></div><h1>References</h1><ul><li>Reynolds et al (2025). Will AI speed up literature reviews or derail them entirely?. Nature Publishing Group. 
<a href=\"https://doi.org/10.1038/d41586-025-02069-w\" target=\"_blank\"><i>10.1038/d41586-025-02069-w</i></a></li>\n<li>Madhavapeddy (2025). What I learnt at the National Academy of Sciences US-UK Forum on Biodiversity. <a href=\"https://doi.org/10.59350/j6zkp-n7t82\" target=\"_blank\"><i>10.59350/j6zkp-n7t82</i></a></li>\n<li>Madhavapeddy (2025). Thoughts on the National Data Library and private research data. <a href=\"https://doi.org/10.59350/fk6vy-5q841\" target=\"_blank\"><i>10.59350/fk6vy-5q841</i></a></li>\n<li>Iyer et al (2025). Careful design of Large Language Model pipelines enables expert-level retrieval of evidence-based information from syntheses and databases. <a href=\"https://doi.org/10.1371/journal.pone.0323563\" target=\"_blank\"><i>10.1371/journal.pone.0323563</i></a></li>\n<li>Noorden (2023). More than 10,000 research papers were retracted in 2023 \u2014 a new record. Nature. <a href=\"https://doi.org/10.1038/d41586-023-03974-8\" target=\"_blank\"><i>10.1038/d41586-023-03974-8</i></a></li>\n<li>Kleppmann et al (2024). Bluesky and the AT Protocol: Usable Decentralized Social Media. Proceedings of the ACM Conext-2024 Workshop on the Decentralization of the Internet. 
<a href=\"https://doi.org/10.1145/3694809.3700740\" target=\"_blank\"><i>10.1145/3694809.3700740</i></a></li></ul>","doi":"https://doi.org/10.59350/pbxew-d2j78","guid":"https://doi.org/10.59350/pbxew-d2j78","language":"en","license":"https://creativecommons.org/licenses/by/4.0/legalcode","published_at":1751932800,"reference":[{"id":"https://doi.org/10.1038/d41586-025-02069-w","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.59350/j6zkp-n7t82","unstructured":"<b>[cito:citesAsRelated]</b>"},{"id":"https://doi.org/10.59350/fk6vy-5q841","unstructured":"<b>[cito:citesAsRelated]</b>"},{"id":"https://doi.org/10.1371/journal.pone.0323563","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.1038/d41586-023-03974-8","unstructured":"<b>[cito:cites]</b>"},{"id":"https://doi.org/10.1145/3694809.3700740","unstructured":"<b>[cito:cites]</b>"}],"rid":"exrf3-3m363","summary":"For the past few years, Sadiq Jaffer and I been working with our colleagues in Conservation Evidence to do analysis at scale on the academic literature. Getting local access to millions of fulltext papers has not been without drama, but made possible thanks to huge amounts of help from our University Library who helped us navigate our relationships with scientific publishers.","tags":["Evidence","Llms","Ai","Federation","Networks"],"title":"Is AI poisoning the scientific literature? 
Our comment in Nature","updated_at":1781259295,"url":"https://anil.recoil.org/notes/ai-poisoning","version":"v1"}},{"document":{"authors":[{"affiliation":[{"id":"https://ror.org/013meh722","name":"University of Cambridge"}],"contributor_roles":[],"family":"Madhavapeddy","given":"Anil","url":"https://orcid.org/0000-0001-8954-2428"}],"blog":{"authors":null,"community_id":"472a49be-dc61-4a17-97f0-d1ff17b0dadd","created":1760313600,"current_feed_url":null,"description":null,"favicon":"https://rogue-scholar.org/api/communities/472a49be-dc61-4a17-97f0-d1ff17b0dadd/logo","feed_format":"application/feed+json","feed_url":"https://anil.recoil.org/perma.json","filter":null,"generator":"Other","home_page_url":"https://anil.recoil.org/notes","issn":null,"language":"eng","license":"https://creativecommons.org/licenses/by/4.0/legalcode","prefix":"10.59350","relative_url":null,"secure":true,"slug":"anil","status":"active","subfield":"1702","title":"Anil Madhavapeddy's feed","updated":1781222400,"use_api":null},"blog_name":"Anil Madhavapeddy's feed","blog_slug":"anil","content_html":"<p>I was a bit sleepy getting into the Royal Society <a href=\"https://royalsociety.org/science-events-and-lectures/2025/07/future-of-scientific-publishing/\">Future of Scientific\nPublishing</a>\nconference early this morning, but was quickly woken up by the dramatic passion\non show as publishers, librarians, academics and funders all got together for a\n\"frank exchange of views\" at a meeting that didn't pull any punches!</p>\n<p>These are my hot-off-the-press livenotes and only lightly edited; a more cleaned up version will be available\nfrom the RS in due course.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-1.webp\" title=\"Sir Mark Walport FRS opens up the conference\"/></p>\n<h2 id=\"mark-walport-sets-the-scene\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/rs-future-of-publishing/#mark-walport-sets-the-scene\"></a>Mark Walport sets the 
scene</h2>\n<p>Sir Mark Walport was a delightful emcee for the proceedings of the day, and\nopened by stressing how important the moment is for the future of how we conduct science.\nAcademic publishing faces a perfect storm: peer review is buckling under\nenormous volume, funding models are broken and replete with perverse\nincentives, and the entire system groans with inefficiency.</p>\n<p>The Royal Society is the publisher of the world's oldest continuously published\nscientific journal <a href=\"https://royalsocietypublishing.org/journal/rstb\">Philosophical Transactions</a>\n(since 1665) and has convened this conference for academies worldwide. The\noverall question is: what <em>is</em> a scientific journal in 2025 and beyond?\nWalport traced the economic evolution of publishing: for centuries, readers\npaid through subscriptions (I hadn't realised that the <a href=\"https://royalsociety.org/blog/2015/03/philosophical-transactions-the-early-years/\">early editions of the RS</a>\nused to be sent for free to libraries worldwide until the current commercial\nmodel arrived about 80 years ago). Now, the pendulum has swung to open access,\nwhich creates perverse incentives that prioritize volume over quality. He called\nit a \"smoke and mirrors\" era where diamond open access models obscure who\n<em>actually</em> pays for the infrastructure of knowledge dissemination: is it the\npublishers, the governments, the academics, the libraries, or some combination\nof the above?  
The profit margins of the commercial publishers answer that\nquestion for me...</p>\n<p>He then identified the transformative forces that are acting as a forcing function:</p>\n<ul>\n<li>LLMs have <a href=\"https://anil.recoil.org/papers/2025-ai-poison\">entered</a> the publishing ecosystem</li>\n<li>The proliferation of journals has created an attention economy rather than a knowledge economy</li>\n<li><a href=\"https://openreview.net/\">Preprint</a> archives are reshaping how research is shared quickly</li>\n</ul>\n<p>The challenges ahead while dealing with these are maintaining metadata\nintegrity, preserving the scholarly archive into the long term, and ensuring\nsystematic access for meta-analyses that advance human knowledge.</p>\n<h2 id=\"historical-perspectives-350-years-of-evolution\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/rs-future-of-publishing/#historical-perspectives-350-years-of-evolution\"></a>Historical Perspectives: 350 Years of Evolution</h2>\n<p>The opening pair of speakers were unexpected: they brought a historical and\nlinguistic perspective to the problem. I found both of these talks the\nhighlights of the day!  Firstly <a href=\"https://www.st-andrews.ac.uk/history/people/akf\">Professor Aileen\nFyfe</a> drew upon her research\nfrom 350 years of the Royal Society archives. Back in the day, there was no\nreal fixed entity called a \"scientific journal\". Over the centuries, everything\nfrom editorial practices to publication methods to means of dissemination\nhas transformed repeatedly, so we shouldn't view the status quo as set in stone.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-2.webp\" title=\"Professor Aileen Fyfe talks publishing history\"/></p>\n<p>While the early days of science were essentially people writing letters to each\nother, the post-WWII era of journals marked the shift to \"scale\". The tools for\ndistance communication (i.e. 
publishing collected issues) and universities\nswitching from being teaching focused over to today's research-centric\npublishing ecosystem were both key factors. University scientists used to\nproduce 30% of published articles in 1900; by 2020, that figure exceeded 80%.\nThis parallels the globalization of science itself in the past century;\nresearch has expanded well beyond its European origins to encompass almost all\ninstitutions and countries worldwide.</p>\n<p>Amusingly, Prof Fyfe pointed out that a 1960 Nature editorial asked <em>\"<a href=\"https://www.nature.com/articles/186018a0\">How many more new\njournals?</a>\"</em> even back then! The 1950s\ndid bring some standardization efforts (nomenclature, units, symbols),\nalthough citation formats robustly seem to resist uniformity. English was also\nexplicitly selected as the \"<a href=\"https://en.wikipedia.org/wiki/Languages_of_science\">default language for\nscience</a>\", and peer review\nwas also formalised via papers like <em>\"<a href=\"https://journals.sagepub.com/doi/10.1177/000456327901600179\">Uniform requirements for manuscripts submitted to biomedical journals</a>\"</em> (in 1979). <a href=\"https://nsf-gov-resources.nsf.gov/pubs/1977/nsb77468/nsb77468.pdf\">US Congressional hearings</a>\nwith the NSF began distinguishing peer review from other evaluation methods.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-3.webp\" title=\"Professor Aileen Fyfe shows the globalisation of research over the years\"/></p>\n<p>All of this scale was then \"solved\" by financialisation after WWII. At the turn of the\n20th century, almost no journals generated any profit (the Royal Society\ndistributed its publications freely). By 1955, financial pressures and growing scale of submissions forced a\n<a href=\"https://journals.sagepub.com/doi/10.1177/0073275321999901\">reckoning</a>, leading\nto more self-supporting models by the 1960s. 
An era of mergers and acquisitions\namong journals followed, reshaping the <a href=\"https://serials.uksg.org/articles/259/files/submission/proof/259-1-259-1-10-20150210.pdf\">scientific information system</a>.</p>\n<p><a href=\"https://www.universiteitleiden.nl/en/staffmembers/vincent-lariviere#tab-1\">Professor Vincent Larivi\u00e8re</a> then took the stage to dispel some myths of English monolingualism in scientific publishing. While <a href=\"https://garfield.library.upenn.edu/essays/V1p019y1962-73.pdf\">English offers some practical benefits</a>, the reality at non-Anglophone institutions (like his own Universit\u00e9 de Montr\u00e9al) reveals that researchers spend significantly more time reading, writing, and processing papers as non-native language speakers, and often face higher rejection rates as a result of this.\nThis wasn't always the case though; Einstein published primarily in German, not English!</p>\n<p>He went on to note that today's landscape for paper language choices is more\ndiverse than is commonly assumed. English represents only 67% of publications,\na figure which itself has been inflated by non-English papers that are commonly\npublished with English abstracts. Initiatives like the <a href=\"https://pkp.sfu.ca/2025/03/05/ojs-workshops-indonesia/\">Public Knowledge\nProject</a> have enabled\ngrowth in Indonesia and Latin America, for example.  Chinese journals now\npublish twice the volume of English-language publishers, but are difficult to\nindex which makes Larivi\u00e8re's numbers even more interesting: a growing majority\nof the world is no longer publishing in English! 
I also heard this in my trip\nin 2023 to China with the Royal Society; the scholars we met had a sequence of\nChinese language journals they submitted to, often before \"translating\" the\noutputs to English journals.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-4.webp\" title=\"Professor Lariviere uses OpenAlex to show non-English linguistic breakdowns\"/></p>\n<p>All this leads us to believe that the major publishers' market share is smaller than commonly believed, which gives us reason to hope for change! Open access adoption worldwide currently varies fairly dramatically by per-capita <a href=\"https://ourworldindata.org/grapher/scientific-publications-per-million\">wealth and geography</a>, but reveals substantive greenspace for publishing beyond the major commercial publishers. Crucially, Larivi\u00e8re argued that research \"prestige\" is a socially constructed phenomenon, and not intrinsic to quality.</p>\n<p>In the Q&amp;A, Magdalena Skipper (Nature's Editor-in-Chief) noted that the private sector is reentering academic publishing (especially <a href=\"https://www.science.org/content/article/china-tops-world-artificial-intelligence-publications-database-analysis-reveals\">in AI topics</a>). Fyfe noted the challenge of tracking private sector activities; e.g. varying corporate policies on patenting and disclosure mean they are hard to index. 
A plug from <a href=\"https://coherentdigital.net/\">Coherent Digital</a> noted they have catalogued 20 million reports from non-academic research; this is an exciting direction (we've got <a href=\"https://anil.recoil.org/ideas/grey-lit-crawl\">30TB of grey literature</a> on our servers, still waiting to be categorised).</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-5.webp\" title=\"Professor Lariviere shows how uneven citations are across languages and geographies\"/></p>\n<h2 id=\"what-researchers-actually-need-from-stem-publishing\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/rs-future-of-publishing/#what-researchers-actually-need-from-stem-publishing\"></a>What researchers actually need from STEM publishing</h2>\n<p>Our very own <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> opened with a sobering demonstration of \"AI\npoisoning\" in the literature, referencing <a href=\"https://anil.recoil.org/static/papers/2025-ai-poison.pdf\">our recent Nature\ncomment</a>. 
He did the risky-but-catchy\ngeneration of a plausible-sounding but entirely fabricated conservation study\nusing an LLM and noted how economically motivated rational actors might quite\nreasonably use these tools to advance their agendas via the scientific record.\nAnd recovering from this will be very difficult indeed once it mixes up with\nreal science.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-6.webp\" title=\"Bill talks about our recent AI poisoning piece\"/></p>\n<p>Bill then outlined our <a href=\"https://anil.recoil.org/projects/ce\">emerging approach to subject-wide synthesis</a> via:</p>\n<ul>\n<li><strong>Systematic reviews</strong>: Slow, steady, comprehensive</li>\n<li><strong>Rapid reviews</strong>: Sprint-based approaches for urgent needs</li>\n<li><strong>Subject-wide evidence synthesis</strong>: Focused sectoral analyses</li>\n<li><strong>Ultrafast bespoke reviews</strong>: AI-accelerated with human-in-the-loop</li>\n</ul>\n<p>Going back to what journals are <em>for</em> in 2025, Bill then discussed how they were\noriginally vehicles for exchanging information through letters, but now serve\nprimarily as stamps of authority and quality assurance. In an \"AI slop world,\"\nthis quality assurance function becomes existentially important, but shouldn't\nnecessarily be implemented in the current system of incentives. So then, how do\nwe maintain trust when the vast majority of submissions may soon be\nAI-generated? 
<em>(Bill and I scribbled down a plan on the back of a napkin for\nthis; more on that soon!)</em></p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-7.webp\" title=\"Bill also does a cheeky advert for his Conservation Concepts channel!\"/></p>\n<h3 id=\"early-career-researcher-perspectives\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/rs-future-of-publishing/#early-career-researcher-perspectives\"></a>Early Career Researcher perspectives</h3>\n<p><a href=\"https://www.york.ac.uk/psychology/staff/postdocs/meekings,-sophie/\">Dr. Sophie Meekings</a> then took the stage to discuss the many barriers facing early career researchers (ECRs). They're on short-term contracts, are dependent on other people's grant funding, and yet are the ones conducting the frontline research that drives scientific progress. And this is <em>after</em> years spent on poorly paid PhD stipends!</p>\n<p>ECRs require:</p>\n<ul>\n<li>clear, accessible guidelines spelling out each publishing stage without requiring implicit knowledge of the \"system\"</li>\n<li>constructive, blinded peer review that educates rather than gatekeeps</li>\n<li>consistent authorship conventions like <a href=\"https://www.elsevier.com/researcher/author/policies-and-guidelines/credit-author-statement\">CRediT</a> (Contributor Roles Taxonomy)</li>\n</ul>\n<p>Dr. Meekings then noted how the precarious nature of most ECR positions creates cascading complications for individuals. When job-hopping between short-term contracts, who funds the publication of work from previous positions? How do ECRs balance completing past research with new employers' priorities? 
<a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a> also had this issue when joining my group a few years ago, as it took a significant portion of her time in the first year to finish up her previous publication from her last research contract.</p>\n<p>If we're going to fix the system itself, then ECRs need better incentives for PIs to publish null results and exploratory work, the councils need to improve support for interdisciplinary research that doesn't fit traditional journal boundaries (as these are the frontiers between \"conventional\" science where many ECRs will work), and recognition that ECRs often lack the networks for navigating journal politics where editors rule supreme.</p>\n<p>Dr. Meekings summarized ECR needs with an excellent new acronym (SCARF) that drew a round of applause!</p>\n<ul>\n<li><strong>S</strong>peed in publication processes</li>\n<li><strong>C</strong>larity in requirements and decisions</li>\n<li><strong>A</strong>ffordability of publication fees</li>\n<li><strong>R</strong>ecognition of contributions</li>\n<li><strong>F</strong>airness in review and credit</li>\n</ul>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-8.webp\" title=\"Dr Sophie Meekings' SCARF principles for ECRs\"/></p>\n<p>The audience Q&amp;A was quite robust at this point. The first question was about how we might extend the evidence synthesis approach more widely.\n<a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> noted that we are currently extending this to education working with <a href=\"https://www.educ.cam.ac.uk/people/staff/gibson/\">Jenny Gibson</a>. Interconnected datasets <em>across</em> subjects are an obvious future path for evidence datasets, with common technology for handling (e.g.) retracted datasets that can be applied consistently. 
<a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> are supervising <a href=\"https://anil.recoil.org/notes/eeg-interns-2025\">projects on evidence synthesis</a> this summer on just this topic here in Cambridge.</p>\n<p>Another question was why ECRs feel that double blind review is important. Dr. Meekings noted that reviewers may not take ECR peer reviews as seriously, but this could be fixed by opening up peer review and assigning credit <em>after</em> the process is completed and not during. Interestingly, the panel all liked double-blind, which is the norm in computer science but not in other science journals. Someone from the BMJ noted there exists a lot of research into blinding; they summarised it that blinding doesn't work on the whole (people know who it is anyway) and open review doesn't cause any of the problems that people think it causes.</p>\n<p>A really interesting comment from Mark Walport was that a grand scale community project could work for the future of evidence collation, but this critically depends on breaking down the current silos since it doesn't work unless everyone makes their literature available. There was much nodding from the audience in support of this line of thinking.</p>\n<h2 id=\"charting-the-future-for-scientific-publishing\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/rs-future-of-publishing/#charting-the-future-for-scientific-publishing\"></a>Charting the future for scientific publishing</h2>\n<p>The next panel brought together folks from across the scientific\npublishing ecosystem, moderated by Clive Cookson of the Financial Times. 
This\nwas a particularly frank and pointed panel, with lots of quite direct messages\nbeing sent between the representatives of libraries, publishers and funders!</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-9.webp\" title=\"Amy Brand from MIT Press opens the panel\"/></p>\n<p>Amy Brand (MIT Press) started by delivering a warning about conflating \"open to\nread\" with \"open to train on\". She pointed out that when MIT Press did a survey\nacross their authors, many of them raised concerns about the reinforcement of\nbias through AI training on scientific literature. While many of the authors\nacknowledged a moral imperative to make science available for LLM training,\nthey also wanted the <em>choice</em> of whether their own work is used for this. She urged\nthe community to pause and ask fundamental questions like \"AI training, at what\ncost?\" and \"to whose benefit?\". I did think she made a good point by drawing\nparallels with the early internet, where Brand pointed out that lack of\nregulation accelerated the decline of non-advertising-driven models. Her\nclosing question asked if search engines merely lead to AI-generated summaries,\nwhy serve the original content at all? This is something we discuss in our\n<a href=\"https://anil.recoil.org/papers/2025-internet-ecology\">upcoming Aarhus paper on an Internet ecology</a>.</p>\n<p><a href=\"https://experts.deakin.edu.au/66981-danny-kingsley\">Danny Kingsley</a> from Deakin University Library then delivered a biting perspective as a representative of libraries. She said that libraries are \"the ones that sign the cheques that keeps the system running\", which the rest of the panel all disagreed with in the subsequent discussion (they all claimed to be responsible, from the government to the foundations).  
Her survey of librarians was interesting; they all asked for:</p>\n<ul>\n<li>Transparent peer review processes</li>\n<li>Unified expectations around AI declarations and disclosures</li>\n<li>Licensing as open as possible, resisting the \"salami slicing\" of specific use. We also ran across this problem of overly precise restrictions on use while <a href=\"https://anil.recoil.org/papers/2025-ai-poison\">building our paper corpus</a> for <a href=\"https://anil.recoil.org/projects/ce\">CE</a>.</li>\n</ul>\n<p>Kingsley had a great line that \"publishers are monetizing the funding mandate\",\nwhich <a href=\"https://www.stats.ox.ac.uk/~deane/\">Charlotte Deane</a> later also said was the most succinct way she had heard\nto describe the annoyance we all have with the vast profit margins of\ncommercial publishers.  Kingsley highlighted this via the troubling practices\nin the IEEE and the American Chemical Society by charging to place repositories\nunder green open access. Her blunt assessment was that publishers are not\nnegotiating in good faith. Her talk drew the biggest applause of the day by\nfar.</p>\n<p>After this, <a href=\"https://wellcome.org/about-us/our-people/staff/john-arne-rottingen\">John-Arne\nR\u00f8ttingen</a>\n(CEO of the Wellcome Trust) emphasised that funders depend on scientific\ndiscourse as a continuous process of refutations and discussions. He expressed\nconcern about overly depending on brand value as a proxy for quality, calling\nit eventually misleading even if it works sometimes in the short term. Key\npriorities for the WT are ensuring that reviewers have easy access to all\nliterature, supporting evidence synthesis initiatives to translate research\ninto impact, and controlling the open body of research outputs through digital\ninfrastructure to manage the new scale.  
However, his challenge lies in\nmaintaining sustainable financing models for all this research data; he noted\nexplicitly that the Wellcome would not cover open access costs for commercial\npublishers.</p>\n<p>R\u00f8ttingen further highlighted the concerns of the Global Biodata Coalition (of which he was a\nmember) about US data resilience and framed research infrastructure\nas \"a global public good\" requiring collective investment and fair financing\nacross nations. Interestingly, he explicitly called out UNESCO as a weak force\nin global governance for this from the UN; I hadn't even realised that UNESCO\nwas responsible for this stuff!</p>\n<p>Finally, <a href=\"https://www.stats.ox.ac.uk/~deane/\">Prof Charlotte Deane</a> from the EPSRC also discussed what a scientific\njournal is for these days. It's not for proofreading or typesetting anymore and\n(as <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> also noted earlier), the stamp of quality is key. Deane\nargued that \"research completion\" doesn't happen until someone else can read it\nand reasonably verify the methods are sound; not something that can happen\nwithout more open access.  Deane also warned of the existential threat of <a href=\"https://anil.recoil.org/notes/ai-poisoning\">AI poisoning</a> since \"AI can make fake papers at a rate humans can't\nimagine. It won't be long before most of the content on the Internet will be AI\ngenerated\".</p>\n<p>The audience Q&amp;A was <em>very</em> blunt here.  <a href=\"https://uniweb.uottawa.ca/view/profile/members/2846\">Stefanie Haustein</a> pointed out that we\nare pumping billions of dollars into the publishing industry, many of which\nare shareholder companies, and so we are losing a significant percentage of\neach dollar spent. 
There is enough money in the system, but it's very\ninefficiently deployed right now!</p>\n<p><a href=\"https://www.linkedin.com/in/richardsever\">Richard Sever</a> from openRxiv asked\nhow we pay for this when major funders like the NIH have issued a series of\n<em>unfunded</em> open data mandates over recent years. John-Arne R\u00f8ttingen noted that\nUNESCO is a very weak global body and not influential here, but that we need\ncoalitions of the willing to build such open data approaches from the bottom\nup. Challenging the publisher hegemony can only be done as a pack, which led\nnicely onto the next session after lunch where the founder of\n<a href=\"https://openalex.org/\">OpenAlex</a> would be present!</p>\n<h2 id=\"who-are-the-stewards-of-knowledge-\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/rs-future-of-publishing/#who-are-the-stewards-of-knowledge-\"></a>Who are the stewards of knowledge?</h2>\n<p>After lunch (where sadly, the vegetarian options were terrible but\nluckily I had my trusty Huel bar!), we reconvened with a panel debating\nwho the stewards of the scientific record should be. This brought together\nperspectives from commercial publishers (Elsevier), open infrastructure advocates (OpenAlex),\nfunders (MRC), and university leadership (pro-VC of Birmingham).</p>\n<p><a href=\"https://www.elsevier.com/people/victoria-eva\">Victoria Eva</a> (<a href=\"https://researcheracademy.elsevier.com/publication-process/open-science/open-access-end-user-licenses\">SVP from\nElsevier</a>)\nopened by describing the \"perfect storm\" facing their academic publishing\nbusiness as they had 600k more submissions this year than the previous year.\nThere was a high level view on how their digital pipeline \"aims to insert\nsafeguards\" throughout the publication process to maintain integrity. 
She\nargued in general terms to view GenAI through separate lenses of trust and\ndiscoverability and argued that Elsevier's substantial technological investments\nposition them to manage both challenges well. I was\n<a href=\"https://www.theguardian.com/science/2017/jun/27/profitable-business-scientific-publishing-bad-for-science\">predisposed</a>\nto dislike excuses from staggeringly profitable commercial publishers, but I\ndid find her answers to providing bulk access to their corpus unsatisfying.\nWhile she highlighted their growing open access base of papers, she also noted\nthat the transition to open access cannot happen overnight (my personal\ntranslation is that this means slow-walking). She mentioned special cases in\nplace for\n<a href=\"https://www.elsevier.com/en-gb/about/open-science/research-data/text-and-data-mining\">TDM</a>\nin the Global South and healthcare access (presumably at the commercial\ndiscretion of Elsevier).</p>\n<p><a href=\"https://jasonpriem.org/\">Jason Priem</a> from <a href=\"https://openalex.org/\">OpenAlex</a>\n(part of <a href=\"https://ourresearch.org/\">OurResearch</a>) then offered a radically\ndifferent perspective. I'm a huge fan of OpenAlex, as we use it extensively in\nthe <a href=\"https://anil.recoil.org/projects/ce\">CE</a> infrastructure. He disagreed with the conference framing of\npublishers as \"custodians\" or \"stewards,\" noting that these evoke someone\nmaintaining a static, old lovely house. Science <em>isn't</em> a static edifice but a\ngrowing ecosystem, with more scientists alive today than at any point in\nhistory. He instead proposed a \"gardener\" as a better metaphor; the science\necosystem needs to nourish growth rather than merely preserving what exists.\nExtending the metaphor, Priem contrasted French and English garden styles:\nFrench gardens constrain nature into platonic geometric forms, while English\ngardens embrace a more rambling style that better represents nature's inherent\ndiversity. 
He argued that science needs to adopt the \"English garden\" approach\nand that we don't have an information overload problem but rather \"<a href=\"https://www.cnet.com/culture/shirky-problem-is-filter-failure-not-info-overload/\">bad\nfilters</a>\"\n(to quote Clay Shirky).</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-11.webp\" title=\"Jason Priem (OpenAlex), Victoria Eva (Elsevier) and Mark Walport in the panel\"/></p>\n<p>Priem advocated <em>strongly</em> for open infrastructures since communities don't just produce papers: also software, datasets, abstracts, and things we don't envision yet. If we provide them with the \"digital soil\" (open infrastructure) then they will prosper. OpenAlex and <a href=\"https://zenodo.org/\">Zenodo</a> are great examples of how such open infrastructure holds up here. I use both all the time; I'm a huge fan of Jason's work and talk.</p>\n<p><a href=\"https://www.ukri.org/people/patrick-chinnery/\">Patrick Chinnery</a> from the Medical Research Council brought the funder perspective with some numbers: publishing consumes 1 to 2% of total research turnover funds (roughly \u00a324 million for UKRI). He noted that during the pandemic, decision-makers were reviewing preprint data in real-time to determine which treatments should proceed to clinical trials, and that decisions had to be reversed after peer review revealed flaws. He emphasised the need for more real time quality assurance in rapid decision-making contexts.</p>\n<p><a href=\"https://en.wikipedia.org/wiki/Adam_Tickell\">Adam Tickell</a> from the University of Birmingham declared the current model \"broken\", and noted that each attempt at reform fails to solve the <em>basic problem of literature access</em> (something I've faced myself). He noted that David Willetts (former UK Minister for Science) couldn't access paywalled material while minister of science in government (!) 
which significantly influenced <a href=\"https://www.gov.uk/government/news/government-to-open-up-publicly-funded-research\">subsequent government policy</a> towards open access.\nTickell was scathing about the oligopolies of Elsevier and Springer, arguing their <a href=\"https://www.researchprofessionalnews.com/rr-news-world-2025-2-elsevier-parent-company-reports-10-rise-in-profit-to-3-2bn/\">profit margins</a> are out of proportion with the public funding for science. He noted that early open access attempts from the <a href=\"https://ioppublishing.org/news/spotlight-on-the-finch-report/\">Finch Report</a> were well-intentioned but ultimately insufficient to break the hegemony. Perhaps an opportunity for a future UK <a href=\"https://anil.recoil.org/notes/uk-national-data-lib\">National Data Library</a>...\nTickell closed his talk with an observation about the current crisis of confidence in science. This did make me think of a <a href=\"https://bsky.app/profile/hetanshah.bsky.social/post/3lttyexntps2y\">recent report on British confidence in science</a>, which shows the British public still retains belief in scientific institutions. 
So at least we're doing better than the US in this regard for now!</p>\n<p><a href=\"https://www.linkedin.com/feed/update/urn:li:activity:7350547427319275520?commentUrn=urn%3Ali%3Acomment%3A%28activity%3A7350547427319275520%2C7350886618490130433%29&amp;replyUrn=urn%3Ali%3Acomment%3A%28activity%3A7350547427319275520%2C7350908587134644225%29&amp;dashCommentUrn=urn%3Ali%3Afsd_comment%3A%287350886618490130433%2Curn%3Ali%3Aactivity%3A7350547427319275520%29&amp;dashReplyUrn=urn%3Ali%3Afsd_comment%3A%287350908587134644225%2Curn%3Ali%3Aactivity%3A7350547427319275520%29\"> <img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-ss-1.webp\" title=\"Stefanie Haustein points out ChatGPT-related content in response to Elsevier's comments on stage.\"/> </a></p>\n<p>The Q&amp;A session opened with Mark Walport asking how Elsevier manages to publish so many articles. Victoria Eva from Elsevier responded that they receive 3.5m articles annually with ~750k published. Eva mentioned something about \"digital screening throughout the publication process\" but acknowledged that this was a challenge due to the surge from paper mills. A suggestion of paying peer reviewers was raised from the audience but not substantively addressed. <a href=\"https://www.scholcommlab.ca/stefanie-haustein/\">Stefanie Haustein</a> once again made a great point from the audience about how Elsevier could let through <a href=\"https://www.vice.com/en/article/scientific-journal-frontiers-publishes-ai-generated-rat-with-gigantic-penis-in-worrying-incident/\">AI generated rats with giant penises</a> with all this protection in place; clearly, some papers have been published by them with no humans ever reading them. 
This generated a laugh from the audience, and an acknowledgement from the Elsevier rep that they needed to invest more and improve.</p>\n<h2 id=\"how-to-make-open-infrastructure-sustainable\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/rs-future-of-publishing/#how-to-make-open-infrastructure-sustainable\"></a>How to make open infrastructure sustainable</h2>\n<p>My laptop power ran out at this point, but the next panel was an absolute treat as it had both <a href=\"https://kaythaney.com/\">Kaitlin Thaney</a> and <a href=\"https://en.wikipedia.org/wiki/Jimmy_Wales\">Jimmy Wales</a> of Wikipedia fame on it!</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-12.webp\" title=\"Hylke Koers, Kaitlin Thaney, Jimmy Wales and Ian Mulvany\"/></p>\n<p>Jimmy Wales made an interesting point from his \"seven rules of trust\": a key one is to be personal with human-to-human contact and not run too quickly to technological solutions. Rather than, for example, asking what percentage of academic papers showed evidence of language from ChatGPT, it's more fruitful to ask whether the science contained within the paper is good instead of how it's written. There are many reasons why someone might have used ChatGPT (non-native speakers etc) but also many unrelated reasons why the science might be bad.</p>\n<p>Kaitlin Thaney pointed out the importance of openness given <a href=\"https://www.motherjones.com/politics/2025/07/trump-war-assault-national-science-foundation-american-innovation-greatness-education/\">the US assault on\nscience</a>,\nsince it means that the open data repositories can be replicated reasonably well.</p>\n<p>Ian Mulvany pointed out that Nature claims to have invested $240m in research\ninfrastructure, and this is a struggle for a medium sized publisher (like his\nown <a href=\"https://www.bmj.com/\">BMJ</a>). 
Open infrastructure allows sharing and\ncreation of value to make it possible to let these smaller organisations\nsurvive.</p>\n<p>When it comes to policy recommendations, what did the panel have to say about a more trustworthy literature?</p>\n<ul>\n<li>The <a href=\"https://www.ccsd.cnrs.fr/en/posi-principles/\">POSI principles</a> came up as important levels.</li>\n<li>Kaitlin mentioned the <a href=\"https://www.nextgenlibpub.org/forest-framework\">FOREST framework</a> funded by Arcadia and how they need to manifest in concrete infrastructure. There's an implicit reliance on infrastructure that you only notice when it's taken away! Affordability of open is a key consideration as well.</li>\n<li>Jimmy talked about open source software, and what generally works is not one-size-fits-all. Some are run by companies (their main product and they sell services), and others by individuals.  If we bring this back to policy, we need to look at preserving what's already working sustainably and supporting it. Don't try to find a general solution but adopt targeted, well thought through interventions instead.</li>\n</ul>\n<p><em>I'm updating this as I go along but running out of laptop battery too!</em></p><h1>References</h1><ul><li>Madhavapeddy et al (2025). Steps towards an Ecology for the Internet. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3744169.3744180\" target=\"_blank\"><i>10.1145/3744169.3744180</i></a></li>\n<li>Madhavapeddy (2025). Thoughts on the National Data Library and private research data. <a href=\"https://doi.org/10.59350/fk6vy-5q841\" target=\"_blank\"><i>10.59350/fk6vy-5q841</i></a></li>\n<li>Reynolds et al (2025). Will AI speed up literature reviews or derail them entirely?. Nature Publishing Group. <a href=\"https://doi.org/10.1038/d41586-025-02069-w\" target=\"_blank\"><i>10.1038/d41586-025-02069-w</i></a></li>\n<li>Madhavapeddy (2025). Is AI poisoning the scientific literature? Our comment in Nature. 
<a href=\"https://doi.org/10.59350/pbxew-d2j78\" target=\"_blank\"><i>10.59350/pbxew-d2j78</i></a></li>\n<li>Madhavapeddy (2025). EEG internships for the summer of 2025. <a href=\"https://doi.org/10.59350/tf22g-p1822\" target=\"_blank\"><i>10.59350/tf22g-p1822</i></a></li>\n<li>Richter (1960). How Many More New Journals?. Nature. <a href=\"https://doi.org/10.1038/186018a0\" target=\"_blank\"><i>10.1038/186018a0</i></a></li>\n<li>Editors (1979). Uniform Requirements for Manuscripts Submitted to Biomedical Journals. Annals of Clinical Biochemistry. <a href=\"https://doi.org/10.1177/000456327901600179\" target=\"_blank\"><i>10.1177/000456327901600179</i></a></li>\n<li>Fyfe (2022). Self-help for learned journals: Scientific societies and the commerce of publishing in the 1950s. History of Science. <a href=\"https://doi.org/10.1177/0073275321999901\" target=\"_blank\"><i>10.1177/0073275321999901</i></a></li></ul>","doi":"https://doi.org/10.59350/nmcab-py710","guid":"https://doi.org/10.59350/nmcab-py710","language":"en","license":"https://creativecommons.org/licenses/by/4.0/legalcode","published_at":1752451200,"reference":[{"id":"https://doi.org/10.1145/3744169.3744180","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.59350/fk6vy-5q841","unstructured":"<b>[cito:citesAsRelated]</b>"},{"id":"https://doi.org/10.1038/d41586-025-02069-w","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.59350/pbxew-d2j78","unstructured":"<b>[cito:citesAsRelated]</b>"},{"id":"https://doi.org/10.59350/tf22g-p1822","unstructured":"<b>[cito:citesAsRelated]</b>"},{"id":"https://doi.org/10.1038/186018a0","unstructured":"<b>[cito:cites]</b>"},{"id":"https://doi.org/10.1177/000456327901600179","unstructured":"<b>[cito:cites]</b>"},{"id":"https://doi.org/10.1177/0073275321999901","unstructured":"<b>[cito:cites]</b>"}],"rid":"7p1xb-30w84","summary":"I was a bit sleepy getting into the Royal Society Future of Scientific Publishing conference 
early this morning, but was quickly woken up by the dramatic passion on show as publishers, librarians, academics and funders all got together for a \"frank exchange of views\" at a meeting that didn't pull any punches! These are my hot-off-the-press livenotes and only lightly edited; a more cleaned up version will be available from the RS in due course.","tags":["Royalsociety","Evidence","Publishing","Ai","Livenotes"],"title":"Royal Society's Future of Scientific Publishing meeting","updated_at":1781259294,"url":"https://anil.recoil.org/notes/rs-future-of-publishing","version":"v1"}}],"items":[{"authors":[{"contributor_roles":[],"family":"Marcum","given":"Christopher Steven","url":"https://orcid.org/0000-0002-0899-6143"}],"blog":{"authors":null,"community_id":"8bdb1ae7-4621-4fa5-ad1a-3a639417dfd5","created":1768694400,"current_feed_url":null,"description":"Perspectives on science, data, and technology that don't fit anywhere else.","favicon":"https://rogue-scholar.org/api/communities/8bdb1ae7-4621-4fa5-ad1a-3a639417dfd5/logo","feed_format":"application/atom+xml","feed_url":"http://chrismarcum.com/marcum-blog/feed.atom","filter":null,"generator":"Jekyll","home_page_url":"https://www.chrismarcum.com/marcum-blog/","issn":null,"language":"eng","license":"https://creativecommons.org/licenses/by/4.0/legalcode","prefix":"10.59350","relative_url":null,"secure":true,"slug":"chrismarcum","status":"active","subfield":"3312","title":"Open Evidence","updated":1781278875,"use_api":null},"blog_name":"Open Evidence","blog_slug":"chrismarcum","content_html":"<p>Late last month, the US Census Bureau released a really cool new data product: the <a href=\"https://www.census.gov/data/experimental-data-products/lace.html\">Local Air Conditioning Estimates</a> or LACE. An experimental data product, LACE provides insights into air conditioning prevalence across the United States. 
This new data product is significant because it fills a critical information gap in our knowledge about energy use potential and vulnerability to extreme heat.</p>\n<p>One of the things I'm really excited about is the underlying methodology. The LACE estimates were derived using cross-survey modeling and leveraged machine learning to integrate detailed housing data from the American Housing Survey (AHS) with the comprehensive geographic coverage of the American Community Survey (ACS). Census is innovating here!</p>\n<p>I am still a gerontologist at heart and I know one of the major perennial issues elders face each year is the challenge of summer heat. Summer heat is especially dangerous for older adults because aging bodies lose several of the systems that normally protect people from overheating. I merged the new LACE estimates with ACS 5-year data regarding the population aged 65 and older at the county level. By combining these datasets, we can visualize the distribution of households without air conditioning alongside the concentration of older residents. The code <a href=\"https://github.com/cmarcum/talks-and-posts/tree/main/2026-06-12-LACE-and-Age\">is available here</a> (and requires a free Census API key).</p>\n<p>The map below visualizes these metrics by representing the percentage of occupied households without air conditioning. Darker tones indicate a higher proportion of homes lacking cooling systems. If you hover over a county with your cursor (or finger if you're on a mobile device), a pop-up will display the percentage of households without AC and the percentage of the local population aged 65 or older. While I did not look at the bivariate correlation between the two, one thing I did notice in the viz is Appalachia looks particularly exposed due to its combination of high elder population and low AC coverage. 
It can get HOT in them hollers (today's <a href=\"https://www.wpc.ncep.noaa.gov/heatrisk/\">heat index is extreme</a> in many of those places).</p>\n<div class=\"map-container\" style=\"margin: 20px 0;\">\n<iframe height=\"600px\" src=\"/marcum-blog/assets/leaflets/lace_o65_map.html\" style=\"border: none; border-radius: 8px; box-shadow: 0 4px 8px rgba(0,0,0,0.1);\" title=\"Choropleth / Heatmap of County-Level Percent of Households without Air Conditioning\" width=\"100%\">\n</iframe>\n</div>","doi":"https://doi.org/10.59350/d0nmw-5wf08","guid":"https://www.chrismarcum.com/marcum-blog/2026/06/12/LACE-and-Age","language":"en","license":"https://creativecommons.org/licenses/by/4.0/legalcode","published_at":1781222400,"rid":"9h3az-r0n56","summary":"Late last month, the US Census Bureau released a really cool new data product: the Local Air Conditioning Estimates or LACE. An experimental data product, LACE provides insights into air conditioning prevalence across the United States. This new data product is significant because it fills a critical information gap in our knowledge about energy use potential and vulnerability to extreme heat.","tags":["General","Open Data","Government"],"title":"A Really `Cool` New Data Set from Census","updated_at":1781280738,"url":"https://www.chrismarcum.com/marcum-blog/2026/06/12/LACE-and-Age.html","version":"v1"},{"authors":[{"affiliation":[{"id":"https://ror.org/013meh722","name":"University of 
Cambridge"}],"contributor_roles":[],"family":"Madhavapeddy","given":"Anil","url":"https://orcid.org/0000-0001-8954-2428"}],"blog":{"authors":null,"community_id":"472a49be-dc61-4a17-97f0-d1ff17b0dadd","created":1760313600,"current_feed_url":null,"description":null,"favicon":"https://rogue-scholar.org/api/communities/472a49be-dc61-4a17-97f0-d1ff17b0dadd/logo","feed_format":"application/feed+json","feed_url":"https://anil.recoil.org/perma.json","filter":null,"generator":"Other","home_page_url":"https://anil.recoil.org/notes","issn":null,"language":"eng","license":"https://creativecommons.org/licenses/by/4.0/legalcode","prefix":"10.59350","relative_url":null,"secure":true,"slug":"anil","status":"active","subfield":"1702","title":"Anil Madhavapeddy's feed","updated":1781222400,"use_api":null},"blog_name":"Anil Madhavapeddy's feed","blog_slug":"anil","content_html":"<p><a href=\"https://www.cst.cam.ac.uk/people/zf281\">Frank Feng</a> and I <a href=\"https://geotessera.org/blog/2026-06-09-tessera-v1-1\">announced TESSERA v1.1</a>\non behalf of <a href=\"https://geotessera.org/about#:~:text=for%2520Science%2520%C2%B7%2520Isambard-,People,-Lead%2520Faculty\">the team</a> earlier this week, and I wanted to follow up here with a more\nvisual explanation of what changed as I got quite a few questions about it!</p>\n<p>v1.1 is a retrained successor to the <a href=\"https://anil.recoil.org/papers/2025-tessera\">original v1.0 model</a> that\n<a href=\"https://www.cst.cam.ac.uk/people/zf281\">Frank Feng</a> and the team have been hammering on for months. Crucially, since we\npre-generate embedding 'map tiles', the new release is a drop-in replacement if\nyou just swap tiles; the basic format of 128 dimensions is unchanged.  
Accuracy\non your tasks should improve in all cases (a trend which will continue as we\ntrain better models with more data and training FLOPS).</p>\n<h2 id=\"fewer-artefacts-in-low-observation-areas\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/tessera-v11-out/#fewer-artefacts-in-low-observation-areas\"></a>Fewer artefacts in low observation areas</h2>\n<p>Tessera v1.0 could sometimes produce noisier tiles in regions with few clear\nsatellite observations (e.g. due to persistent cloud or satellite sensor gaps).\nThis showed up as boundary-like seams in the tiles where the inferred\nembeddings didn't quite align; e.g. along Sentinel-1 ascending/descending\ncoverage edges where one side of the line might have ~50 valid observations and\nthe other ~150.</p>\n<p>Tessera v1.1 now handles both sparse and imbalanced observation patterns\ngracefully! If your region of interest was small and didn't straddle a\nproblematic tile you'll see no difference, but large-scale analyses should get\ncleaner.</p>\n<p>The easiest way to see all this is to look at the embeddings themselves in the\n<a href=\"https://tze.geotessera.org\">TZE explorer</a>. In this video I flip between the\nv1.0 and v1.1 embeddings over the same regions, visualised in false colour:</p>\n<p><div class=\"video-center\"><iframe allowfullscreen=\"\" frameborder=\"0\" height=\"315px\" sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\" src=\"https://crank.recoil.org/videos/embed/297de7c9-9cea-4051-8b27-041fffa90e72\" title=\"Tessera 1.0 to 1.1 embeddings\" width=\"100%\"></iframe></div></p>\n<p>What you're looking at in the v1.0 layer are the grid-like seams running\nthrough an otherwise homogeneous landscape (Ireland doesn't really have those\njagged lines, as you can confirm by visiting my lovely home).</p>\n<p>What's happening is that the number of valid observations jumps across the\nline, and the old v1.0 model carried that difference through into the embeddings. The
The\nspeckly patches are areas where persistent cloud left the model with too few\nclean observations to produce a stable representation.</p>\n<p>We then switch to the v1.1 layer, and the seams are gone and the noisy patches\nresolve into a smooth structure that follows the actual land cover. It's <em>very</em>\nsatisfying to click around the 10m\u00b2 pixels and watch embeddings that used to\nflicker between years settle down into stable trajectories in <a href=\"https://tze.geotessera.org\">the explorer</a>!</p>\n<h2 id=\"temporal-stability\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/tessera-v11-out/#temporal-stability\"></a>Temporal stability</h2>\n<p>If you're doing analysis over a long period of time, then the 128-dimensional\nembeddings are now much more consistent year-on-year for the same location.\nThis is a big deal for tasks like change detection, trend analysis, and even\njust convenience since training a classifier on one year and applying it to\nanother is now much more accurate.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/tessera11-temporal-drift.webp\" title=\"Differences in the same region across years with Tessera v1.0 and v1.1 (credit: Jovana Knezevic)\"/></p>\n<p>This feature won't affect most users,\nbut we're pretty pleased with how well change detection now works.</p>\n<h2 id=\"expanded-coastal-coverage-worldwide\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/tessera-v11-out/#expanded-coastal-coverage-worldwide\"></a>Expanded coastal coverage worldwide</h2>\n<p>The v1.0 land mask we used to mask out ocean areas was too aggressive, and\ndropped legitimate land pixels along coastlines or on small islands. 
We've\nlistened to our <a href=\"https://www.plantsci.cam.ac.uk/staff/dr-thomas-worthington\">mangrove-loving friends</a> and extended the inference\nbuffer to 20km, which brings coastlines and remote islands properly into\ncoverage.</p>\n<p><a href=\"https://tze.geotessera.org/?store=v1.1\"> <img alt=\"%c\" src=\"https://anil.recoil.org/images/tze-explorer-v1.1-ss-1.webp\" title=\"The green false colour is the expanded coastal tiles, which now captures all of the UK including islands\"/> </a></p>\n<h2 id=\"our-coverage-maps-now-include-v10-and-v11\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/tessera-v11-out/#our-coverage-maps-now-include-v10-and-v11\"></a>Our coverage maps now include v1.0 and v1.1</h2>\n<p>I updated the <a href=\"https://ucam-eo.github.io/tessera-coverage-map/\">live coverage map</a> to now\ntrack both generations side-by-side, so you can see exactly which tiles exist\nfor v1.0 and v1.1 in any given year:</p>\n<p><div class=\"video-center\"><iframe allowfullscreen=\"\" frameborder=\"0\" height=\"315px\" sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\" src=\"https://crank.recoil.org/videos/embed/97d422a2-af9c-47b5-947a-c136ad7093b6\" title=\"Tessera v1 and v1.1 coverage map\" width=\"100%\"></iframe></div></p>\n<p>This is all updated via a <a href=\"https://github.com/ucam-eo/tessera-coverage-map/blob/main/.github/workflows/map.yml\">GitHub Action on ucam-eo/tessera-coverage-map</a>\nthat also updates an index Parquet file of all available manifests.</p>\n<h3 id=\"getting-the-v11-embeddings\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/tessera-v11-out/#getting-the-v11-embeddings\"></a>Getting the v1.1 embeddings</h3>\n<p>To get the new embeddings, grab the <a href=\"https://github.com/ucam-eo/geotessera/releases/tag/v0.9.0\">geotessera 0.9.0+ release</a> of the\n<a href=\"https://anil.recoil.org/notes/geotessera-python\">Python library</a> which went out 
alongside v1.1. It has a new\n<code>--dataset-version</code> flag to pick v1.0 or v1.1, and a <code>--dataset-variant</code> flag\nnow that multiple parties are generating embeddings for the community:</p>\n<ul>\n<li><code>vultr</code> is the original <a href=\"https://geotessera.org/blog/2026-03-30-training-and-inference-at-scale\">v1.0 global run</a></li>\n<li><code>cambridge</code> is our <a href=\"https://www.tunbury.org/2026/05/20/processing-uk-azure-spot/\">OxCaml-generated</a> v1.1 run for early adopters</li>\n<li>We're working on a Zarr-native full global v1.1 with <a href=\"https://www.cyclops.ai/\">Cyclops.ai</a>, covering 2017-2025 that will become the default once it lands.</li>\n</ul>\n<p>Use <a href=\"https://docs.astral.sh/uv/\">uvx</a> to try this without any installation:</p>\n<pre><code class=\"language-bash\">uvx geotessera download \\\n  --country \"United Kingdom\" \\\n  --year 2024 \\\n  --dataset-version v1.1 \\\n  --dataset-variant cambridge \\\n  --format npy \\\n  --output ./uk-v1.1\n</code></pre>\n<p>All the embeddings (both versions) are also now in the <code>s3://tessera-embeddings</code>\npublic bucket on AWS Open Data, which geotessera 0.9 switches to by default.\nSpare a kind thought for \"okavango\", our single overworked Cambridge server that served every\nTESSERA embedding for the first six months without falling over (much)!\nBut seriously, at some point, we're going to have to turn off \"okavango\" as it's\ntaking up a significant amount of the egress bandwidth for Cambridge, so I encourage\nusers to upgrade to geotessera 0.9 as soon as possible just to change the source\nof your embeddings download. 
Let me know if you have any problems!</p>\n<h3 id=\"also-on-hugging-face-now\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/tessera-v11-out/#also-on-hugging-face-now\"></a>Also on Hugging Face now</h3>\n<p>We're also now on <a href=\"https://huggingface.co/geotessera/TESSERA-V-1.1\">Hugging Face</a>\nwith the full v1.1 (and <a href=\"https://huggingface.co/geotessera/TESSERA-V-1.0\">v1.0</a>)\nmodel weights, with checkpoints for both the Microsoft Planetary Computer and\nAWS Open Data preprocessing backends. If you'd rather run inference yourself\nor fine-tune on your own data, everything you need is there, all under CC0 as\nusual. Do <a href=\"https://eeg.zulipchat.com\">let us know</a> if you fine-tune a model as\nwe'd love to see how it goes.</p>\n<p>If there's a region of the world you need for your own research urgently,\nplease do <a href=\"https://github.com/ucam-eo/geotessera/issues\">request an ROI</a> on the\ngeotessera issue tracker and we'll prioritise it in the generation queue.\nOtherwise, sit tight as we'll have full global 2017-2025 coverage within a few\nmonths!</p>\n<p>See also coverage from the <a href=\"https://www.meteorologicaltechnologyinternational.com/news/satellites/cambridge-ai-tool-converts-satellite-archives-into-accessible-earth-intelligence.html\">Meteorological Technology trade magazine</a> about the release.</p><h1>References</h1><ul><li>Feng et al (2025). TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2506.20380\" target=\"_blank\"><i>10.48550/arXiv.2506.20380</i></a></li>\n<li>Madhavapeddy (2025). GeoTessera Python library released for geospatial embeddings. 
<a href=\"https://doi.org/10.59350/7hy6m-1rq76\" target=\"_blank\"><i>10.59350/7hy6m-1rq76</i></a></li></ul>","doi":"https://doi.org/10.59350/vcqjp-24y05","guid":"https://doi.org/10.59350/vcqjp-24y05","language":"en","license":"https://creativecommons.org/licenses/by/4.0/legalcode","published_at":1781222400,"reference":[{"id":"https://doi.org/10.48550/arxiv.2506.20380","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.59350/7hy6m-1rq76","unstructured":"<b>[cito:citesAsRelated]</b>"}],"rid":"h0wq3-hpy81","summary":"Frank Feng and I announced TESSERA v1.1 on behalf of the team earlier this week, and I wanted to follow up here with a more visual explanation of what changed as I got quite a few questions about it! v1.1 is a retrained successor to the original v1.0 model that Frank Feng and the team have been hammering on for months. Crucially, since we pre-generate embedding 'map tiles', the new release is a drop-in replacement if you just swap tiles;","tags":["Tessera","Spatial","Ai","Satellite"],"title":"Tessera v1.1 released, with smoother and temporally stable embeddings","updated_at":1781276156,"url":"https://anil.recoil.org/notes/tessera-v11-out","version":"v1"},{"authors":[{"contributor_roles":[],"family":"Turner","given":"Stephen D."}],"blog":{"authors":[{"name":"Stephen Turner"}],"community_id":"382941a7-2ffa-41df-8bbb-5f772188517f","created":1780876800,"current_feed_url":null,"description":"A practicing data scientist's take on AI, genomics, biosecurity, and the ways AI is reshaping how science gets done. Weekly updates from the field. 
Occasional notes on programming.","favicon":"https://rogue-scholar.org/api/communities/382941a7-2ffa-41df-8bbb-5f772188517f/logo","feed_format":"application/rss+xml","feed_url":"https://blog.stephenturner.us/feed","filter":null,"generator":"Substack","home_page_url":"https://blog.stephenturner.us","issn":null,"language":"eng","license":"https://creativecommons.org/licenses/by/4.0/legalcode","prefix":"10.59350","relative_url":null,"secure":true,"slug":"stephenturner","status":"active","subfield":"1311","title":"Paired Ends","updated":1781270487,"use_api":null},"blog_name":"Paired Ends","blog_slug":"stephenturner","content_html":"<p>I've had a busy week, taking the day off today, and I haven't had a chance to do much reading. I've been spending a ton of time lately developing a new <a href=\"https://hooslist.virginia.edu/ClassSchedule/ClassHistory?subject=DS&amp;catalogNumber=5080\">course</a> I'll be teaching this fall, and preparing a <a href=\"https://ai.provost.virginia.edu/ai-upskilling\">workshop</a> on AI-powered literature review and synthesis I'll be teaching next week (if you're at UVA, <a href=\"https://www.eventbrite.com/e/in-person-smarter-literature-reviews-with-ai-powered-tools-tickets-1987394833446?aff=oddtdtcreator\">register</a> and attend for the in-person event if you can \u2014 it'll be much more engaging than Zooming in, trust me).</p><p>Here are the browser tabs I have open that I hope to catch up on soon.</p><p class=\"button-wrapper\" data-attrs=\"{&quot;url&quot;:&quot;https://blog.stephenturner.us/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}\" data-component-name=\"ButtonCreateButton\"><a class=\"button primary\" href=\"https://blog.stephenturner.us/subscribe?\"><span>Subscribe now</span></a></p><h3>Blogs/newsletters/etc</h3><ol><li><p><a href=\"https://darioamodei.com/post/policy-on-the-ai-exponential\">Dario Amodei \u2014&nbsp;Policy on the AI 
Exponential</a></p></li><li><p><a href=\"https://www.anthropic.com/research/agents-in-biology\">Paving the way for agents in biology \\ Anthropic</a><a class=\"footnote-anchor\" data-component-name=\"FootnoteAnchorToDOM\" id=\"footnote-anchor-1\" href=\"#footnote-1\" target=\"_self\">1</a></p></li><li><p><a href=\"https://grants.nih.gov/grants/guide/notice-files/NOT-OD-26-086.html\">NIH RFI on limiting the number of grants per PI</a></p></li><li><p><a href=\"https://www.anthropic.com/news/claude-fable-5-mythos-5\">Claude Fable 5 and Claude Mythos 5 \\ Anthropic</a><a class=\"footnote-anchor\" data-component-name=\"FootnoteAnchorToDOM\" id=\"footnote-anchor-2\" href=\"#footnote-2\" target=\"_self\">2</a></p></li><li><p><a href=\"https://www.newyorker.com/news/fault-lines/eight-predictions-for-the-future-of-higher-education\">Eight Predictions for the Future of Higher Education</a></p></li><li><p><a href=\"https://mattsbiodefense.substack.com/p/five-things-june-7-2026\">Matt Lubin: Five Things: June 7, 2026</a></p></li><li><p><a href=\"https://www.profgmedia.com/p/is-ai-more-expensive-than-the-employees\">Is AI More Expensive Than the Employees It's Replacing?</a></p></li><li><p><a href=\"https://liangchang.substack.com/p/the-anti-scaling-law-in-biology-and\">The Anti-Scaling Law in Biology, and Why AI Could Make Crowding Worse Before Making Drug Development Better</a></p></li><li><p><a href=\"https://theinfinitesimal.substack.com/p/thoughts-on-ai-in-academia\">Sasha Gusev: Thoughts on AI in academia</a></p></li><li><p><a href=\"https://www.0xkato.xyz/how-llms-actually-work/\">How LLMs Actually Work | 0xkato</a></p></li><li><p><a href=\"https://evgenykiner.substack.com/p/a-cell-is-not-a-spreadsheet-why-virtual\">A cell is not a spreadsheet- why \"Virtual Cells\" are still mostly hype</a></p></li><li><p><a href=\"https://www.anthropic.com/institute/recursive-self-improvement\">Anthropic: When AI builds itself</a></p></li><li><p><a 
href=\"https://www.newyorker.com/news/fault-lines/can-ai-produce-writing-that-we-actually-want-to-read\">Can A.I. Produce Writing That We Actually Want to Read?</a></p></li><li><p><a href=\"https://epochai.substack.com/p/is-a-compute-crunch-coming\">Is a compute crunch coming?</a></p></li><li><p><a href=\"https://openai.com/index/built-to-benefit-everyone-our-plan/\">Built to benefit everyone: our plan | OpenAI</a></p></li><li><p><a href=\"https://www.nature.com/articles/d41586-026-01689-0?utm_source=x&amp;utm_medium=social&amp;utm_campaign=nature&amp;linkId=62230411\">Bots are scraping open data \u2014 how should researchers respond?</a></p></li><li><p><a href=\"https://letter.nikomc.com/p/small\">Why Are Cells Small? - Niko McCarty</a></p></li><li><p><a href=\"https://www.owlposting.com/p/how-to-build-a-cancer-vaccine-and\">How to build a cancer vaccine, and whether they will work this time</a></p></li></ol><h3>Papers</h3><ol><li><p><a href=\"https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1014287\">The total eclipse of bioinformatics: From disruption to convention, and a gentle warning</a></p></li><li><p><a href=\"https://www.frontiersin.org/journals/microbiology/articles/10.3389/fmicb.2026.1832974/full\">Dual-use artificial intelligence and biology: upstream risk-benefit reviews</a></p></li><li><p><a href=\"https://www.pnas.org/doi/10.1073/pnas.2615114123\">Molecular de-extinction looks to the past to find the molecules of the future</a></p></li><li><p><a href=\"https://arxiv.org/abs/2605.28655v1\">AutoScientists: Self-Organizing Agent Teams for Long-Running Scientific Experimentation</a></p></li><li><p><a href=\"https://www.nature.com/articles/s41588-026-02607-w\">Pleiotropic shared heritability quantifies the shared genetic variance of common diseases</a></p></li><li><p><a href=\"https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1014338\">Ten simple rules for teaching data science</a></p></li><li><p><a 
href=\"https://www.biorxiv.org/content/10.1101/2022.05.06.490859v3\">Depth normalization for single-cell genomics count data</a> and <a href=\"https://xcancel.com/lpachter/status/2064795978264432988\">Lior's explainer</a></p></li></ol><p class=\"button-wrapper\" data-attrs=\"{&quot;url&quot;:&quot;https://blog.stephenturner.us/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}\" data-component-name=\"ButtonCreateButton\"><a class=\"button primary\" href=\"https://blog.stephenturner.us/subscribe?\"><span>Subscribe now</span></a></p><div class=\"footnote\" data-component-name=\"FootnoteToDOM\"><a id=\"footnote-1\" href=\"#footnote-anchor-1\" class=\"footnote-number\" contenteditable=\"false\" target=\"_self\">1</a><div class=\"footnote-content\"><p>I just read this one right before posting. The post describes the difficulty agents have at retrieving biological data. Which isn't limited to agents! It's difficult for a human to navigate the disparate databases and web interfaces and NCBI Virus search incantations to get the thing you're looking for. If this problem were solved for agents, it'd make life easier for us humans as well. A conclusion from the post: <em>\"We want models to be creative when they generate hypotheses, design experiments, or reason about mechanisms. But the layer underneath that creativity\u2014gene identifiers, schemas, retrieval logic, coordinate systems, metadata conventions, and data access paths\u2014has to be boringly reliable (or in other words, deterministic)\"</em>. </p></div></div><div class=\"footnote\" data-component-name=\"FootnoteToDOM\"><a id=\"footnote-2\" href=\"#footnote-anchor-2\" class=\"footnote-number\" contenteditable=\"false\" target=\"_self\">2</a><div class=\"footnote-content\"><p>I haven't had a chance to do anything with Fable yet, mostly because I work in AIxBio, and Bio is off limits. 
And because I'm a biologist, Fable refuses to talk to me (\"Who am I?\" leads to safety flags and demotion of the rest of the conversation to Opus). Precautionary principle is probably the right move here given the benchmarks, and I think managed access will likely be the way these models are released from here out.</p><div class=\"captioned-image-container\"><figure><a class=\"image-link image2 is-viewable-img\" target=\"_blank\" href=\"https://substackcdn.com/image/fetch/$s_!8PjN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg\" data-component-name=\"Image2ToDOM\"><div class=\"image2-inset\"><picture><source type=\"image/webp\" srcset=\"https://substackcdn.com/image/fetch/$s_!8PjN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg 424w, https://substackcdn.com/image/fetch/$s_!8PjN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg 848w, https://substackcdn.com/image/fetch/$s_!8PjN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!8PjN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg 1456w\" sizes=\"100vw\"><img src=\"https://substackcdn.com/image/fetch/$s_!8PjN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg\" width=\"356\" height=\"772.3189368770765\" 
data-attrs=\"{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1306,&quot;width&quot;:602,&quot;resizeWidth&quot;:356,&quot;bytes&quot;:157698,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.stephenturner.us/i/201151842?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}\" class=\"sizing-normal\" alt=\"\" srcset=\"https://substackcdn.com/image/fetch/$s_!8PjN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg 424w, https://substackcdn.com/image/fetch/$s_!8PjN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg 848w, https://substackcdn.com/image/fetch/$s_!8PjN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!8PjN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg 1456w\" sizes=\"100vw\" loading=\"lazy\"></picture><div class=\"image-link-expand\"><div class=\"pencraft pc-display-flex pc-gap-8 pc-reset\"><button tabindex=\"0\" type=\"button\" class=\"pencraft pc-reset pencraft icon-container restack-image\"><svg 
role=\"img\" width=\"20\" height=\"20\" viewBox=\"0 0 20 20\" fill=\"none\" stroke-width=\"1.5\" stroke=\"var(--color-fg-primary)\" stroke-linecap=\"round\" stroke-linejoin=\"round\" xmlns=\"http://www.w3.org/2000/svg\"><g><title></title><path d=\"M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882\"></path></g></svg></button><button tabindex=\"0\" type=\"button\" class=\"pencraft pc-reset pencraft icon-container view-image\"><svg xmlns=\"http://www.w3.org/2000/svg\" width=\"20\" height=\"20\" viewBox=\"0 0 24 24\" fill=\"none\" stroke=\"currentColor\" stroke-width=\"2\" stroke-linecap=\"round\" stroke-linejoin=\"round\" class=\"lucide lucide-maximize2 lucide-maximize-2\"><polyline points=\"15 3 21 3 21 9\"></polyline><polyline points=\"9 21 3 21 3 15\"></polyline><line x1=\"21\" x2=\"14\" y1=\"3\" y2=\"10\"></line><line x1=\"3\" x2=\"10\" y1=\"21\" y2=\"14\"></line></svg></button></div></div></div></a></figure></div><p><br></p></div></div>","doi":"https://doi.org/10.59350/6z1rs-ner26","guid":"201151842","image":"https://substackcdn.com/image/fetch/$s_!8PjN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db0bbd0-87b4-4b71-9782-61e004635f6e_602x1306.jpeg","language":"en","license":"https://creativecommons.org/licenses/by/4.0/legalcode","published_at":1781222400,"rid":"78n6a-47133","summary":"TBR in AIxBio, AIxEdu, AIxLabor, AIxWriting, and other essays &amp;","tags":["Biosecurity","AI"],"title":"Open tabs (June 12, 
2026)","updated_at":1781272366,"url":"https://blog.stephenturner.us/p/open-tabs-june-12-2026","version":"v1"},{"authors":[{"affiliation":[{"id":"https://ror.org/013meh722","name":"University of Cambridge"}],"contributor_roles":[],"family":"Madhavapeddy","given":"Anil","url":"https://orcid.org/0000-0001-8954-2428"}],"blog":{"authors":null,"community_id":"472a49be-dc61-4a17-97f0-d1ff17b0dadd","created":1760313600,"current_feed_url":null,"description":null,"favicon":"https://rogue-scholar.org/api/communities/472a49be-dc61-4a17-97f0-d1ff17b0dadd/logo","feed_format":"application/feed+json","feed_url":"https://anil.recoil.org/perma.json","filter":null,"generator":"Other","home_page_url":"https://anil.recoil.org/notes","issn":null,"language":"eng","license":"https://creativecommons.org/licenses/by/4.0/legalcode","prefix":"10.59350","relative_url":null,"secure":true,"slug":"anil","status":"active","subfield":"1702","title":"Anil Madhavapeddy's feed","updated":1781222400,"use_api":null},"blog_name":"Anil Madhavapeddy's feed","blog_slug":"anil","content_html":"<p>I spent a couple of days at the <a href=\"https://www.nationalacademies.org/home\">National Academy of Sciences</a> in the USA at the invitation of the <a href=\"https://royalsociety.org\">Royal Society</a>, who held a forum on \"<a href=\"https://www.nasonline.org/wp-content/uploads/2024/10/US-UK-Forum-2025-program-web.pdf\">Measuring Biodiversity for Addressing the Global Crisis</a>\". It was a <a href=\"https://www.nasonline.org/wp-content/uploads/2024/10/US-UK-Forum-2025-program-web.pdf\">packed program</a> for those working in evidence-driven conservation:</p>\n<blockquote>\n<p>Assessing biodiversity is fundamental to understanding the distribution of biodiversity, the changes that are occurring and, crucially, the effectiveness of actions to address the ongoing biodiversity crisis. 
Such assessments face multiple challenges, not least the great complexity of natural systems, but also a lack of standardized approaches to measurement, a plethora of measurement technologies with their own strengths and weaknesses, and different data needs depending on the purpose\nfor which the information is being gathered.</p>\n<p>Other sectors have faced similar challenges, and the forum will look to learn from these precedents with a view to building momentum toward standardized methods for using environmental monitoring technologies, including new technologies, for particular purposes.\n<cite>-- NAS/Royal Society <a href=\"https://www.nasonline.org/wp-content/uploads/2024/10/US-UK-Forum-2025-program-web.pdf\">US-UK Scientific Forum on Measuring Biodiversity</a></cite></p>\n</blockquote>\n<p>I was honoured to talk about our work on using AI to \"connect the dots\" between disparate data like the academic literature and remote observations at scale. But before that, here's some of the bigger picture stuff I learnt...</p>\n<p><a href=\"https://www.nasonline.org/wp-content/uploads/2024/10/US-UK-Forum-2025-program-web.pdf\"> <img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-cover.webp\" title=\"Identifying the bird is an exercise for the reader!\"/> </a></p>\n<h2 id=\"shifting-conservation-to-a-winning-stance\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/nas-rs-biodiversity/#shifting-conservation-to-a-winning-stance\"></a>Shifting conservation to a winning stance</h2>\n<p>The need for urgent, additional action came across loud and clear from all the top actors in biodiversity. On the bright side, we have made stellar progress in measuring more dimensions of biodiversity accurately than ever before in human history. But, the field of biodiversity does not have a single \"simple question\" that needs answering, unlike many other science challenges in physics or chemistry. 
The ecosystem of nature measurements needs to span scales ranging from the micro (fungi and soil health) to the macro (species richness and diversity), with geographical coverage across the planet but also hyperlocal accuracy for ecosystem services.</p>\n<p>One key question asked at the forum was how we can get to interoperable, pragmatic tools that enable all the actors involved in conservation actions (from the governments that set policy, to the private sector that controls the supply chains, to the people who have to live in and depend on natural services) to work together more effectively on gathering all the data needed.</p>\n<p>This interoperability has to emerge during a rapid shift towards digital methods, which are vulnerable to being <a href=\"https://www.bbc.com/future/article/20250422-usa-scientists-race-to-save-climate-data-before-its-deleted-by-the-trump-administration\">deleted and edited at scale</a>, with decades of painstaking observations currently at risk.  And in the middle of all this, machine learning is swooping in to perform data interpolation at scale, but also risks <a href=\"https://anil.recoil.org/notes/ai-should-unite-conservation\">dividing</a> and polluting observations with inaccurate projections.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-2.webp\"/></p>\n<h2 id=\"what-is-an-optimistic-future-for-conservation\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/nas-rs-biodiversity/#what-is-an-optimistic-future-for-conservation\"></a>What is an optimistic future for conservation?</h2>\n<p>This is all quite the challenge even for a gung-ho computer scientist like me, and I was struggling with the enormity of it all! But things really clicked into place after the inspirational <a href=\"https://www.bangor.ac.uk/staff/sens/julia-patricia-gordon-jones-010356/en\">Julia P.G. 
Jones</a> pointed me at a <a href=\"https://academic.oup.com/bioscience/article/68/6/412/4976422\">fantastic big-picture paper</a>:</p>\n<blockquote>\n<p>Drawing reasonable inferences from current patterns, we can predict that 100 years from now, the Earth could be inhabited by between 6-8 billion people, with very few remaining in extreme poverty, most living in towns and cities, and nearly all participating in a technologically driven, interconnected market economy.</p>\n<p>[...] we articulate a theory of social\u2013environmental change that describes the simultaneous and interacting effects of urban lifestyles on fertility, poverty alleviation, and ideation.</p>\n<p><cite><a href=\"https://academic.oup.com/bioscience/article/68/6/412/4976422\">From Bottleneck to Breakthrough: Urbanization and the Future of Biodiversity Conservation</a></cite></p>\n</blockquote>\n<p>They observe that the field of conservation has often \"succumbed to jeremiad, bickering, and despair\". Much of this angst springs from the (failed) bets made by <a href=\"https://en.wikipedia.org/wiki/Paul_R._Ehrlich\">Paul Ehrlich</a>, who thinks <a href=\"https://www.nature.com/articles/d41586-024-03592-y\">humans are going to be wiped out</a> because of unbounded expansion. In response, conservation has become \"the art of slowing declines\" rather than achieving long-term wins. 
But instead of being moribund, the paper paints an optimistic, practical endgame for conservation:</p>\n<blockquote>\n<p>We suggest that lasting conservation success can best be realized when:</p>\n<ul>\n<li>the human population stabilizes and begins to decrease</li>\n<li>extreme poverty is alleviated</li>\n<li>the majority of the world's people and institutions act on a shared belief that it is in their best interest to care for rather than destroy the natural bases of life on Earth.</li>\n</ul>\n</blockquote>\n<p>It turns out that most of these conditions can be reasonably projected to happen in the next fifty years or so. Population is projected to <a href=\"https://en.wikipedia.org/wiki/Human_population_projections\">peak by the turn of the century</a>, <a href=\"https://openknowledge.worldbank.org/entities/publication/9d0fb27a-3afe-5999-8d8e-baf90b4331c0/full\">extreme poverty might reasonably be eradicated by 2050</a>, and <a href=\"https://iopscience.iop.org/article/10.1088/1748-9326/8/1/014025\">urban landuse will stabilise at 6% of terrestrial land</a> by 2030-ish.</p>\n<p><a href=\"https://academic.oup.com/view-large/figure/118140827/biy039fig4.jpeg\"> <img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-6.webp\" title=\"Connecting demographic and economic trends in the 21st century to the environment\"/> </a></p>\n<p>Given this projection, the paper then points out that conservation doesn't need to save nature \"forever\". Instead, we have to save enough nature now to \"breakthrough\" from the <a href=\"https://en.wikipedia.org/wiki/Great_Acceleration\">great acceleration</a> of WWII until we stabilise landuse.</p>\n<blockquote>\n<p>The profound danger is that by the time the foundations of recovery are in place, little of wildlife and wild places will be left. 
If society focuses only on economic development and technological innovation as a mechanism to pass through the bottleneck as fast as possible, then what remains of nature could well be sacrificed.\nIf society were to focus only on limiting economic growth to protect nature, then terrible poverty and population growth could overwhelm what remains.</p>\n<p>Either extreme risks narrowing the bottleneck to such an extent that our world passes through without its tigers, elephants, rainforests, coral reefs, or a life-sustaining climate. Therefore, the only sensible path for conservation is to continue its efforts to protect biodiversity while engaging in cities to build the foundations for a lasting recovery of nature.\n<cite>-- <a href=\"https://academic.oup.com/bioscience/article/68/6/412/4976422\">From Bottleneck to Breakthrough</a></cite></p>\n</blockquote>\n<p>This puts what we need to achieve today in a far, far more pragmatic light:</p>\n<blockquote>\n<p>[...] it means that conservation faces another 30\u201350 years of extreme difficulty, when more losses can be expected. However, if we can sustain enough nature through the bottleneck\u2014despite climate change, growth in the population and economy, and urban expansion\u2014then we can see the future of nature in a dramatically more positive light.</p>\n</blockquote>\n<p>Conservation is all about solving difficult opportunity-cost decisions in society.\nScience can help calculate <a href=\"https://anil.recoil.org/papers/2023-pact-tmf\">credible counterfactuals</a> that allow policymakers to balance\nlimited resources to minimise nature harm while maximising benefit to humans. We can also figure out new <a href=\"https://anil.recoil.org/papers/2023-ncc-permanence\">economic methods</a> to figure out the value of future actions. When combined, this can help conservation break through the bottleneck of the next fifty years of nature loss... 
and computer science can make a serious <a href=\"https://fivetimesfaster.org/\">accelerative</a> impact here (yay!).</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-5.webp\" title=\"What does one call a group of ecology legends? A committee!\"/></p>\n<h2 id=\"topics-relevant-to-our-planetary-computing-research\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/nas-rs-biodiversity/#topics-relevant-to-our-planetary-computing-research\"></a>Topics relevant to our planetary computing research</h2>\n<p>Having got my existential big-picture crisis under control, here are some more concrete thoughts about some of the joint ideas that emerged from the NAS meeting.</p>\n<h3 id=\"resilience-in-biodiversity-data\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/nas-rs-biodiversity/#resilience-in-biodiversity-data\"></a>Resilience in biodiversity data</h3>\n<p>We've been doing a <a href=\"https://digitalflapjack.com/blog/yirgacheffe/\">lot</a> of <a href=\"https://digitalflapjack.com/weeknotes/2025-04-22/\">work</a> on mechanisms to <a href=\"https://anil.recoil.org/papers/2024-planetary-computing\">process and ingest</a> remote sensing data. All of our techniques also apply to biodiversity, except that the pipelines are even more complex due to the multi-modal nature of the data being stored. 
This can be clearly seen in this <a href=\"https://www.science.org/doi/10.1126/science.adq2110\">review on the decline of insect biodiversity</a> that speaker Nick Isaac and my colleague <a href=\"https://www.zoo.cam.ac.uk/directory/prof-lynn-dicks\">Lynn Dicks</a> published last month.</p>\n<p><a href=\"https://www.science.org/doi/10.1126/science.adq2110\"> <img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-1.webp\" title=\"(source: Science, 10.1126/science.adq2110)\"/> </a></p>\n<p>The data itself isn't just from one source; instead, we need a pipeline of spatial (at different resolution) measurements, of different types (visual, acoustic, occurrence), of different provenance (experts, crowdsourced, museum), and from different hypotheses tests (evidence bases).</p>\n<p>Once the ingestion pipeline is in place, there's a full range of validation and combination and extrapolation involved, often involving AI methods these days.  The output from all of this is then tested to determine which <a href=\"https://anil.recoil.org/projects/ce\">conservation actions</a> to take.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-3.webp\" title=\"Nick Isaac explains how different lines of biodiversity evidence are necessary\"/></p>\n<p><a href=\"https://www.thegonzalezlab.org/\">Andrew Gonzalez</a> also talked about the ambitious <a href=\"https://www.nature.com/articles/s41559-023-02171-0\">global biodiversity observing system</a> that he's been assembling a coalition for in recent years.  
They are using Docker as part of this via their <a href=\"https://boninabox.geobon.org/\">Bon in a Box</a> product but hitting scaling issues (a common problem due to the size of geospatial tiles).</p>\n<p><a href=\"https://www.nature.com/articles/s41559-023-02171-0\"> <img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-7.webp\" title=\"Andrew Gonzalez explains the GBioS concept\"/> </a></p>\n<p>There's a good tie in for collaboration with us here via the next-generation <a href=\"https://patrick.sirref.org/weekly-2025-05-12/index.xml\">time-travelling shell</a> that <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> is developing that can handle this via <a href=\"https://www.tunbury.org/zfs-system-concept/\">ZFS snapshots</a>.  <a href=\"https://mynameismwd.org\">Michael Dales</a> has been applying this to scaling the <a href=\"https://anil.recoil.org/papers/2024-life\">LIFE</a> and <a href=\"https://anil.recoil.org/papers/2024-food-life\">FOOD</a> pipelines recently with <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\">Alison Eyres</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a>. And meanwhile <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> and <a href=\"https://haddadi.github.io/\">Hamed Haddadi</a> have been researching <a href=\"https://anil.recoil.org/papers/2024-terracorder\">embedded biodiversity sensors</a>. 
The overall theme is that we need to make the hardware and software stack involved far easier to <a href=\"https://anil.recoil.org/papers/2024-planetary-computing\">use for non-expert programmers</a>.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-8.webp\" title=\"A key part of the GBioS vision is to have a federated system\"/></p>\n<h3 id=\"observing-the-earth-through-geospatial-foundation-models\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/nas-rs-biodiversity/#observing-the-earth-through-geospatial-foundation-models\"></a>Observing the earth through geospatial foundation models</h3>\n<p>Another problem that several speakers discussed was how complex biodiversity observations are to manage since they span multiple scales. In my talk, I described the new <a href=\"https://github.com/FrankFeng-23/btfm_project\">TESSERA</a> geospatial foundation model that <a href=\"https://www.cst.cam.ac.uk/people/zf281\">Frank Feng</a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a> have been leading in Cambridge. As this is a pre-trained foundation model, it needs to be finetuned to specific downstream tasks. A number of people came up after my talk with suggestions for collaborations here!</p>\n<p>Firstly, <a href=\"https://earthshotprize.org/winners-finalists/naturemetrics/\">Kat Bruce</a> (fresh from <a href=\"https://www.bbc.com/news/articles/cre8xxd7xl8o\">spraying pondwater</a> with Prince William) explained how <a href=\"https://www.naturemetrics.com/\">NatureMetrics</a> are gathering <a href=\"https://en.wikipedia.org/wiki/Environmental_DNA\">eDNA</a> from many diverse sources. 
The data is of varying licenses depending on which customer paid for the acquisition, but overall there is a lot of information about species presence that's very orthogonal to the kind of data gathered from satellite observations.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-4.webp\" title=\"Kat Bruce showing how much information is packed into eDNA measurements\"/></p>\n<p>Secondly, <a href=\"https://darulab.org/\">Barnabas Daru</a> from Stanford described his efforts to map plant traits to species distribution models. This complements some work <a href=\"https://coomeslab.org\">David Coomes</a> has been leading recently in our group with <a href=\"https://www.kew.org/science/our-science/people/ian-ondo\">Ian Ondo</a> and <a href=\"https://www.cambridgeconservation.org/about/people/professor-neil-burgess/\">Neil Burgess</a> on mapping rare plants globally. The basic problem here is that plant occurrence data is <em>extremely</em> data deficient and spatially biased for 100k+ species, and so we'll need cunning interpolation techniques to fill in the data gaps.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-12.webp\" title=\"Barnabas Daru shows his maps on gathering plant samples from all over the world\"/></p>\n<p>When back in Cambridge, I'm going to arrange for all of us to chat to see if we can somehow combine eDNA, fungal biodiversity, plant traits and satellite foundation models into a comprehensive global plant species map!</p>\n<h3 id=\"evidence-synthesis-from-the-literature\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/nas-rs-biodiversity/#evidence-synthesis-from-the-literature\"></a>Evidence synthesis from the literature</h3>\n<p>There was also huge enthusiasm for another of our projects on <a href=\"https://anil.recoil.org/projects/ce\">analysing the academic literature</a> at scale. 
While we've been using it initially to accelerate the efficacy and accuracy of <a href=\"https://en.wikipedia.org/wiki/Systematic_review\">systematic reviews</a> for <a href=\"https://conservationevidence.com\">Conservation Evidence</a>, there are a huge number of follow-up benefits to having a comprehensive data corpus.</p>\n<p>Firstly, <a href=\"http://elphick.lab.uconn.edu/\">Chris Elphick</a> pointed out a metasynthesis where they manually integrate recent <a href=\"https://academic.oup.com/bioscience/advance-article-abstract/doi/10.1093/biosci/biaf034/8115312\">hypotheses about insect stressors and responses</a> into a network (3385 edges / 108 nodes). It found that the network is highly interconnected, with agricultural intensification often identified as a root cause for insect decline. Much like the CE manually labeled dataset, it should be possible to do hypothesis searches in our LLM pipeline to expand this search and make it more dynamic.</p>\n<p>Secondly, <a href=\"http://oisin.info\">Oisin Mac Aodha</a>, fresh from a <a href=\"https://watch.eeg.cl.cam.ac.uk/w/7aqBd2Nn9E6QpMvnoBPxuQ\">recent talk</a> in Cambridge, discussed his <a href=\"https://arxiv.org/abs/2502.14977\">recent work</a> on few-shot species range estimation and also <a href=\"https://arxiv.org/abs/2412.14428\">WildSAT text/image encoding</a>. His example showed how you could not only spot a species from images, but also use text prompts to refine the search. 
An obvious extension for us to have a go at here is to combine our large corpus of academic papers with these models to see how good the search/range estimation could get with a much larger corpus of data.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-13.webp\" title=\"I am proud to have pronounced Oisin's name correctly while introducing his recent CCI seminar\"/></p>\n<p>And thirdly, I finally met my coauthor <a href=\"https://environment.leeds.ac.uk/see/staff/2720/david-williams\">David Williams</a> in the flesh for the first time! We've worked together recently on the <a href=\"https://anil.recoil.org/papers/2024-food-life\">biodiversity impact of food</a>, and we had a long discussion over dinner about whether we could glean more behavioural data about how people react from the wider literature. This would require us expanding our literature corpus into <a href=\"https://anil.recoil.org/ideas/grey-lit-crawl\">grey literature</a> and policy documents, but this is something that <a href=\"https://toao.com\">Sadiq Jaffer</a> and I want to do soon anyway.</p>\n<p>The connective tissue across these seemingly disparate projects is that there is a strong connection between what you can observe from space (the canopies of trees) to the traits expressed via knowledge of plant physiology and their DNA. If we could figure out how to connect the dots between the observed species to the physiological traits to the bioclimatic range variables, we could figure out where the (many) data-deficient plant species in the world are! 
I'll be hosting a meeting in Cambridge soon on this since we're already <a href=\"https://anil.recoil.org/notes/ukri-grant-terra\">working on it</a>.</p>\n<h3 id=\"visualisations-in-biodiversity\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/nas-rs-biodiversity/#visualisations-in-biodiversity\"></a>Visualisations in biodiversity</h3>\n<p>The most unexpectedly cool talk was <a href=\"https://www.weizmann.ac.il/plants/Milo/home\">Ron Milo</a> showing us visualisations of the <a href=\"https://www.pnas.org/doi/10.1073/pnas.1711842115\">mass distribution of all life on earth</a>. His work really puts our overall challenge into context, as it shows just how utterly dominated wildlife is by domesticated animals.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-11.webp\" title=\"The dominant mammal biomass on the planet are domesticated animals\"/></p>\n<p>It struck me just how important these sorts of high-level visualisations are in putting detailed numbers into context. For example, his breakdown of global biomass showed that plants are by far the \"heaviest\" living thing on earth, and that ocean organisms still dominate animal biomass.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-9.webp\"/></p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/nas-rs-10.webp\"/></p>\n<p>My favourite new animation library on the block is <a href=\"https://animejs.com/\">AnimeJS</a>, and so I plan to try to do some nice animations for <a href=\"https://anil.recoil.org/papers/2024-life\">LIFE</a> and <a href=\"https://anil.recoil.org/papers/2024-food-life\">FOOD</a> along these lines after the academic term finishes.</p>\n<p>And that's a wrap on my notes for now! 
I'm still hanging out in the US for a bunch more meetings (including one at <a href=\"https://www.nationalgeographic.com/\">National Geographic HQ</a>), so I'll update this note when the official RS/NAS videos and writeup come out.</p>\n<p><em>(Update 5th June: the <a href=\"https://www.youtube.com/watch?v=gDTQ1rIEaYo&amp;list=PLlKst-jESy-8t7lg429Movg6Fmsq2DU7y\">full talk video series</a> is now online at the National Academy of Sciences channel. Enjoy!)</em></p><h1>References</h1><ul><li>Balmford et al (2024). PACT Tropical Moist Forest Accreditation Methodology v2.1. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2024-gvslq\" target=\"_blank\"><i>10.33774/coe-2024-gvslq</i></a></li>\n<li>Eyres et al (2025). LIFE: A metric for mapping the impact of land-cover change on global extinctions. <a href=\"https://doi.org/10.1098/rstb.2023.0327\" target=\"_blank\"><i>10.1098/rstb.2023.0327</i></a></li>\n<li>Ball et al (2025). Food impacts on species extinction risks can vary by three orders of magnitude. <a href=\"https://doi.org/10.1038/s43016-025-01224-w\" target=\"_blank\"><i>10.1038/s43016-025-01224-w</i></a></li>\n<li>Balmford et al (2023). Realizing the social value of impermanent carbon credits. <a href=\"https://doi.org/10.1038/s41558-023-01815-0\" target=\"_blank\"><i>10.1038/s41558-023-01815-0</i></a></li>\n<li>Millar et al (2024). Terracorder: Sense Long and Prosper. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2408.02407\" target=\"_blank\"><i>10.48550/arXiv.2408.02407</i></a></li>\n<li>Ferris et al (2024). Planetary computing for data-driven environmental policy-making. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2303.04501\" target=\"_blank\"><i>10.48550/arXiv.2303.04501</i></a></li>\n<li>Madhavapeddy (2025). Technology needs to unite conservation, not divide it. <a href=\"https://doi.org/10.59350/vwrvd-3sg08\" target=\"_blank\"><i>10.59350/vwrvd-3sg08</i></a></li>\n<li>Sanderson et al (2018). 
From Bottleneck to Breakthrough: Urbanization and the Future of Biodiversity Conservation. BioScience. <a href=\"https://doi.org/10.1093/biosci/biy039\" target=\"_blank\"><i>10.1093/biosci/biy039</i></a></li>\n<li>Jones (2024). The scale of the biodiversity crisis laid bare. Nature. <a href=\"https://doi.org/10.1038/d41586-024-03592-y\" target=\"_blank\"><i>10.1038/d41586-024-03592-y</i></a></li>\n<li>Gonzalez et al (2023). A global biodiversity observing system to unite monitoring and guide action. Nature Ecology &amp; Evolution. <a href=\"https://doi.org/10.1038/s41559-023-02171-0\" target=\"_blank\"><i>10.1038/s41559-023-02171-0</i></a></li>\n<li>Halsch et al (2025). Meta-synthesis reveals interconnections among apparent drivers of insect biodiversity loss. BioScience. <a href=\"https://doi.org/10.1093/biosci/biaf034\" target=\"_blank\"><i>10.1093/biosci/biaf034</i></a></li>\n<li>Lange et al (2025). Feedforward Few-shot Species Range Estimation. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2502.14977\" target=\"_blank\"><i>10.48550/arXiv.2502.14977</i></a></li>\n<li>Daroya et al (2025). WildSAT: Learning Satellite Image Representations from Wildlife Observations. arXiv. 
<a href=\"https://doi.org/10.48550/arXiv.2412.14428\" target=\"_blank\"><i>10.48550/arXiv.2412.14428</i></a></li></ul>","doi":"https://doi.org/10.59350/j6zkp-n7t82","guid":"https://doi.org/10.59350/j6zkp-n7t82","language":"en","license":"https://creativecommons.org/licenses/by/4.0/legalcode","published_at":1748044800,"reference":[{"id":"https://doi.org/10.33774/coe-2024-gvslq","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.1098/rstb.2023.0327","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.1038/s43016-025-01224-w","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.1038/s41558-023-01815-0","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.48550/arxiv.2408.02407","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.48550/arxiv.2303.04501","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.59350/vwrvd-3sg08","unstructured":"<b>[cito:citesAsRelated]</b>"},{"id":"https://doi.org/10.1093/biosci/biy039","unstructured":"<b>[cito:cites]</b>"},{"id":"https://doi.org/10.1038/d41586-024-03592-y","unstructured":"<b>[cito:cites]</b>"},{"id":"https://doi.org/10.1038/s41559-023-02171-0","unstructured":"<b>[cito:cites]</b>"},{"id":"https://doi.org/10.1093/biosci/biaf034","unstructured":"<b>[cito:cites]</b>"},{"id":"https://doi.org/10.48550/arxiv.2502.14977","unstructured":"<b>[cito:cites]</b>"},{"id":"https://doi.org/10.48550/arxiv.2412.14428","unstructured":"<b>[cito:cites]</b>"}],"rid":"zmtp7-j5q82","summary":"I spent a couple of days at the National Academy of Sciences in the USA at the invitation of the Royal Society, who held a forum on \"Measuring Biodiversity for Addressing the Global Crisis\". 
It was a packed program for those working in evidence-driven conservation: I was honoured to talk about our work on using AI to \"connect the dots\" between disparate data like the academic literature and remote observations at scale.","tags":["Biodiversity","Conservation","Policy","Royalsociety","Usa"],"title":"What I learnt at the National Academy of Sciences US-UK Forum on Biodiversity","updated_at":1781259301,"url":"https://anil.recoil.org/notes/nas-rs-biodiversity","version":"v1"},{"authors":[{"affiliation":[{"id":"https://ror.org/013meh722","name":"University of Cambridge"}],"contributor_roles":[],"family":"Madhavapeddy","given":"Anil","url":"https://orcid.org/0000-0001-8954-2428"}],"blog":{"authors":null,"community_id":"472a49be-dc61-4a17-97f0-d1ff17b0dadd","created":1760313600,"current_feed_url":null,"description":null,"favicon":"https://rogue-scholar.org/api/communities/472a49be-dc61-4a17-97f0-d1ff17b0dadd/logo","feed_format":"application/feed+json","feed_url":"https://anil.recoil.org/perma.json","filter":null,"generator":"Other","home_page_url":"https://anil.recoil.org/notes","issn":null,"language":"eng","license":"https://creativecommons.org/licenses/by/4.0/legalcode","prefix":"10.59350","relative_url":null,"secure":true,"slug":"anil","status":"active","subfield":"1702","title":"Anil Madhavapeddy's feed","updated":1781222400,"use_api":null},"blog_name":"Anil Madhavapeddy's feed","blog_slug":"anil","content_html":"<p>I stayed on for a few days extra in Washington DC after the <a href=\"https://anil.recoil.org/notes/nas-rs-biodiversity\">biodiversity extravaganza</a> to attend a workshop at legendary <a href=\"https://www.nationalgeographic.org/society/visit-base-camp/\">National Geographic Basecamp</a>. While I've been to several NatGeo <a href=\"https://www.nationalgeographic.org/society/national-geographic-explorers/\">Explorers</a> meetups in California, I've never had the chance to visit their HQ. 
The purpose of this was to attend a workshop organised by <a href=\"https://www.st-andrews.ac.uk/biology/people/cr68\">Christian Rutz</a> from St Andrews about the \"Urban Exploration Project\":</p>\n<blockquote>\n<p>[The UEP is a...] global-scale, community-driven initiative will collaboratively track animals across gradients of urbanization worldwide, to produce a holistic understanding of animal behaviour in human-modified landscapes that can, in turn, be used to develop evidence-based approaches to achieving sustainable human-wildlife coexistence.\n<cite>-- <a href=\"https://www.st-andrews.ac.uk/biology/people/cr68\">Christian Rutz's homepage</a></cite></p>\n</blockquote>\n<p>This immediately grabbed my interest, since it's a very different angle of biodiversity measurements to my usual. I've so far been mainly involved in efforts that use <a href=\"https://anil.recoil.org/projects/rsn\">remote sensing</a> or expert <a href=\"https://anil.recoil.org/projects/life\">range maps</a>, but the UEP program is more concerned with the dynamic <em>movements</em> of species. Wildlife movements are extremely relevant to conservation efforts since there is a large tension between human/wildlife coexistence in areas where both communities are under spatial pressure. 
<a href=\"https://ratsakatika.com/\">Tom Ratsakatika</a>, for example, did his <a href=\"https://ai4er-cdt.esc.cam.ac.uk/\">AI4ER</a> <a href=\"https://github.com/ratsakatika/camera-traps\">project</a> on the tensions in the <a href=\"https://www.endangeredlandscapes.org/news/advancing-human-wildlife-coexistence-in-the-carpathian-mountains/\">Romanian Carpathian mountains</a>, and <a href=\"https://www.ifaw.org/journal/human-elephant-conflict-major-threat\">elephant/human conflicts</a> and <a href=\"https://www.bbc.co.uk/news/articles/cx2j43e2j5ro\">tiger/human conflicts</a> are also well known.</p>\n<p>The core challenge posed at the workshop was how to build momentum for the UEP's vision of fostering human\u2013wildlife coexistence in the world's <em>unprotected</em> areas (often, these are areas near urban expansion zones like cities).  The UEP idea sprang from Christian's earlier efforts after the pandemic on the <a href=\"https://bio-logging.net/wg/covid19-biologging/\">COVID-19 Bio-Logging</a> initiative, which built up a database of over 1 billion satellite fixes for ~13,000 tagged animals across ~200 species. 
The lead student on that <a href=\"https://www.nature.com/articles/s41559-023-02125-6\">work</a>, <a href=\"https://diegoellissoto.org/\">Diego Ellis Soto</a>, has since graduated and was also at the UEP workshop sitting beside me!</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/ngs-2.webp\" title=\"NatGeo Chief Scientist Ian Miller kicks off proceedings\"/></p>\n<p>The workshop itself wasn't fully public (not because it's secret, but just because the details are still being iterated on), so here are some high-level takeaways from my conversations there...</p>\n<h2 id=\"movebank-for-gps-tracking\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/natgeo-urban-wildlife/#movebank-for-gps-tracking\"></a>Movebank for GPS tracking</h2>\n<p>I've used <a href=\"https://inaturalist.org\">iNaturalist</a> and <a href=\"https://www.openstreetmap.org/\">OpenStreetMap</a> extensively for wildlife occurrence and urban data, but I'm less familiar with how animal movement data is recorded. <a href=\"https://www.ab.mpg.de/person/98226\">Martin Wikelski</a> was at the workshop and explained the <a href=\"https://www.humboldt-foundation.de/en/entdecken/magazin-humboldt-kosmos/humboldt-today-the-secret-of-an-eternal-idol/the-high-flyer\">ICARUS</a> project to me, which collected data via GPS transmitters fitted to animals. This is then fed into the <a href=\"https://www.movebank.org/cms/movebank-main\">Movebank</a> service that is custom-designed for movement data.</p>\n<p>Unlike most other biodiversity data services though, Movebank data is not immediately made public (due to the sensitivity of animal movements), but is licensed to the user that made it. For that reason, it's less of a \"social\" service than iNaturalist, but still has a staggering <a href=\"https://www.movebank.org/cms/movebank-content/february-2024-newsletter\">11 million records added every day</a>.  
This data is then <a href=\"https://www.movebank.org/cms/movebank-content/archiving-animal-movements-as-biodiversity-2023-01-04\">fed into GBIF</a>, although it is downsampled to a single record per day. Martin also indicated to me that they're considering federating Movebank to other countries, which is important as <a href=\"https://www.youtube.com/watch?v=gDTQ1rIEaYo&amp;list=PLlKst-jESy-8t7lg429Movg6Fmsq2DU7y\">biodiversity data resilience</a> was a hot topic in our <a href=\"https://anil.recoil.org/notes/nas-rs-biodiversity\">meeting</a> a few days before.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/ngs-3.webp\" title=\"The workshop was highly interactive throughout the 1.5 days. No laptops needed!\"/></p>\n<h2 id=\"storytelling-about-conservation-actions\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/natgeo-urban-wildlife/#storytelling-about-conservation-actions\"></a>Storytelling about conservation actions</h2>\n<p>I was really struck by how deeply the National Geographic staff were thinking about and co-designing solutions along with the academics involved. I got chatting to <a href=\"https://www.nationalgeographic.org/society/our-leadership/\">Ian Miller</a>, the chief scientist at NatGeo, about his scientific background (he's worked on all seven continents!) and how our <a href=\"https://anil.recoil.org/projects/ce\">conservation evidence database</a> might be of use to help the Society figure out the long-term impacts of their projects. I also met the person with the coolest job title there: <a href=\"https://www.linkedin.com/in/alextait/\">Alex Tait</a>, who is <a href=\"https://education.nationalgeographic.org/resource/mapping-change-roof-world/\">The Geographer</a> at the NGS. 
Alex, along with <a href=\"https://theorg.com/org/national-geographic-society/org-chart/lindsay-anderson\">Lindsay Anderson</a> and other NGS staff who participated, all had infectious enthusiasm about exploration combined with an encyclopedic knowledge of specific projects that they support involving explorers across the world.</p>\n<p>These projects ranged from the <a href=\"https://www.nationalgeographic.com/into-the-amazon/pink-dolphins-tricksters-and-thieves/\">Amazon River Dolphins</a> (to understand <a href=\"https://www.nationalgeographic.com/impact/article/fernando-trujillo-explorer-story\">aquatic health</a>) over to <a href=\"https://www.nationalgeographic.com/impact/article/alex-schnell-explorer-story\">cephalopod empathy</a> and <a href=\"https://www.nationalgeographic.com/impact/article\">many more</a>. These gave me a new perspective on the importance of <em>storytelling</em> as a key mechanism to help connect the dots from conservation actions to people; something that I've been learning from <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a>'s <a href=\"https://anil.recoil.org/notes/junior-rangers\">video series</a> as well!</p>\n<p><a href=\"https://www.nationalgeographic.com/impact\"> <img alt=\"%c\" src=\"https://anil.recoil.org/images/ngs-5.webp\" title=\"I spent the whole return trip reading the impact stories. So very, very, very inspiring.\"/> </a></p>\n<p>It's also worth noting that the NGS support goes beyond \"just\" filmmaking. 
Our own <a href=\"https://charlesemogor.com\">Charles Emogor</a> is also an <a href=\"https://explorers.nationalgeographic.org/directory/charles-agbor-emogor\">Explorer</a>, and recently received support from their <a href=\"https://www.nationalgeographic.org/society/our-programs/lab/\">Exploration Technology Lab</a> to get a bunch of <a href=\"https://www.wildlifeacoustics.com/products/song-meter-mini-2-aa\">biologgers</a> to support his research on <a href=\"https://anil.recoil.org/ideas/mapping-hunting-risks-for-wild-meat\">mapping hunting pressures</a>. Rather than placing a few big bets, the Society seems to focus on investing widely in a diverse range of people and geographies.</p>\n<h2 id=\"the-importance-of-hedgehogs\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/natgeo-urban-wildlife/#the-importance-of-hedgehogs\"></a>The importance of hedgehogs</h2>\n<p>A lot of the discussion at the workshop naturally focussed on charismatic mammals such as the amazing work done by the <a href=\"https://www.zambiacarnivores.org/\">Zambian Carnivore programme</a>. However, I also had in mind the importance of addressing issues closer to home in the UK as well so that we didn't ignore Europe.</p>\n<p>Luckily, before the workshop, I had grabbed a coffee with <a href=\"https://www.cambridgeconservation.org/about/people/dr-silviu-o-petrovan/\">Silviu Petrovan</a> from the CCI, who has been bringing me up to speed on the <a href=\"https://www.mammalweb.org/en/nhmp\">National Hedgehog Monitoring programme</a> (did you know that British hedgehogs are now <a href=\"https://www.britishhedgehogs.org.uk/british-hedgehog-now-officially-classified-as-vulnerable-to-extinction/\">vulnerable to extinction</a>?). 
This particular effort seems to tick a lot of boxes; it's a local and beloved species in the UK, it requires <a href=\"https://www.conservationevidence.com/individual-study/1018\">evidence-based interventions</a> to avoid making the problems worse, and also requires combining data sources (from camera traps to species distribution models to urban planning to the GPS Movebank data) to build up a really accurate high-res picture of what's going on.</p>\n<p>I brought up UK hedgehog conservation at the NatGeo workshop, and then while down at <a href=\"https://earthfest.world/\">Earthfest</a> at Google a few days later I learnt from <a href=\"https://www.cfse.cam.ac.uk/directory/drew_purves\">Drew Purves</a> that they've developed an extremely high-res map of <a href=\"https://eoscience-external.projects.earthengine.app/view/farmscapes\">woodland and hedgerows</a> in the UK.  I've therefore created a new student project on <a href=\"https://anil.recoil.org/ideas/hedgehog-mapping\">hedgehog mapping</a> and hope to recruit a summer intern for this. It would be extremely cool to put the pieces together with a very concrete project such as this as a first small step for the UEP.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/ngs-1.webp\" title=\"NatGeo Basecamp is under construction, but still epic\"/></p>\n<p>I found the whole experience of visiting National Geographic inspirational, and not just because of the projects discussed. The walls of their HQ are full of incredible photographs of explorers all over the world, and a seemingly unbounded enthusiasm for exploring the unknown. 
I kind of thought I'd aged out on applying to become an explorer, but <a href=\"https://totalkatastrophe.blogspot.com/\">Kathy Ho</a> has been encouraging me to apply, and the same was echoed by the lovely conversations with NatGeo staffers.</p>\n<p>I'm therefore putting my thinking hat on for what my Explorers project proposal should be, as I am on academic sabbatical next year and have more freedom to travel; suggestions are welcome if you see me at the pub!</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/ngs-4.webp\" title=\"I might have deliberately gone the wrong way a few times while exploring the HQ\"/></p><h1>References</h1><ul><li>Madhavapeddy (2025). What I learnt at the National Academy of Sciences US-UK Forum on Biodiversity. <a href=\"https://doi.org/10.59350/j6zkp-n7t82\" target=\"_blank\"><i>10.59350/j6zkp-n7t82</i></a></li>\n<li>Madhavapeddy (2025). We become Junior Rangers at Shenandoah. <a href=\"https://doi.org/10.59350/d27v1-5tk68\" target=\"_blank\"><i>10.59350/d27v1-5tk68</i></a></li>\n<li>Ellis-Soto et al (2023). A vision for incorporating human mobility in the study of human\u2013wildlife interactions. Nature Ecology &amp; Evolution. 
<a href=\"https://doi.org/10.1038/s41559-023-02125-6\" target=\"_blank\"><i>10.1038/s41559-023-02125-6</i></a></li></ul>","doi":"https://doi.org/10.59350/7cpwj-d4161","guid":"https://doi.org/10.59350/7cpwj-d4161","language":"en","license":"https://creativecommons.org/licenses/by/4.0/legalcode","published_at":1749254400,"reference":[{"id":"https://doi.org/10.59350/j6zkp-n7t82","unstructured":"<b>[cito:citesAsRelated]</b>"},{"id":"https://doi.org/10.59350/d27v1-5tk68","unstructured":"<b>[cito:citesAsRelated]</b>"},{"id":"https://doi.org/10.1038/s41559-023-02125-6","unstructured":"<b>[cito:cites]</b>"}],"rid":"pd2qp-yxd12","summary":"I stayed on for a few days extra in Washington DC after the biodiversity extravaganza to attend a workshop at legendary National Geographic Basecamp.","tags":["Natgeo","Usa","Biodiversity","Urban"],"title":"Visiting National Geographic HQ and the Urban Exploration Project","updated_at":1781259299,"url":"https://anil.recoil.org/notes/natgeo-urban-wildlife","version":"v1"},{"authors":[{"affiliation":[{"id":"https://ror.org/013meh722","name":"University of Cambridge"}],"contributor_roles":[],"family":"Madhavapeddy","given":"Anil","url":"https://orcid.org/0000-0001-8954-2428"}],"blog":{"authors":null,"community_id":"472a49be-dc61-4a17-97f0-d1ff17b0dadd","created":1760313600,"current_feed_url":null,"description":null,"favicon":"https://rogue-scholar.org/api/communities/472a49be-dc61-4a17-97f0-d1ff17b0dadd/logo","feed_format":"application/feed+json","feed_url":"https://anil.recoil.org/perma.json","filter":null,"generator":"Other","home_page_url":"https://anil.recoil.org/notes","issn":null,"language":"eng","license":"https://creativecommons.org/licenses/by/4.0/legalcode","prefix":"10.59350","relative_url":null,"secure":true,"slug":"anil","status":"active","subfield":"1702","title":"Anil Madhavapeddy's feed","updated":1781222400,"use_api":null},"blog_name":"Anil Madhavapeddy's feed","blog_slug":"anil","content_html":"<p>Apple made a notable <a 
href=\"https://developer.apple.com/videos/play/wwdc2025/346/\">announcement</a> in <a href=\"https://developer.apple.com/wwdc25/\">WWDC 2025</a> that they've got a new containerisation framework in the new Tahoe beta. This took me right back to the early <a href=\"https://docs.docker.com/desktop/setup/install/mac-install/\">Docker for Mac</a> days in 2016 when we <a href=\"https://www.docker.com/blog/docker-unikernels-open-source/\">announced</a> the first mainstream use of the <a href=\"https://developer.apple.com/documentation/hypervisor\">hypervisor framework</a>, so I couldn't resist taking a quick peek under the hood.</p>\n<p>There were two separate things announced: a <a href=\"https://github.com/apple/containerization\">Containerization framework</a> and also a <a href=\"https://github.com/apple/container\">container</a> CLI tool that aims to be an <a href=\"https://opencontainers.org/\">OCI</a> compliant tool to manipulate and execute container images. The former is a general-purpose framework that could be used by Docker, but it wasn't clear to me where the new CLI tool fits in among the existing layers of <a href=\"https://github.com/opencontainers/runc\">runc</a>, <a href=\"https://containerd.io/\">containerd</a> and of course Docker itself. 
The only way to find out is to take the new release for a spin, since Apple open-sourced everything (well done!).</p>\n<h2 id=\"getting-up-and-running\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/apple-containerisation/#getting-up-and-running\"></a>Getting up and running</h2>\n<p>To get the full experience, I chose to install the <a href=\"https://www.apple.com/uk/newsroom/2025/06/macos-tahoe-26-makes-the-mac-more-capable-productive-and-intelligent-than-ever/\">macOS Tahoe beta</a>, as there have been improvements to the networking frameworks<sup id=\"fnref:1\"><a class=\"footnote\" href=\"https://anil.recoil.org/notes/apple-containerisation/#fn:1\">[1]</a></sup> that are only present in the new beta. It's essential you only use the <a href=\"https://developer.apple.com/news/releases/?id=06092025g\">Xcode 26 beta</a> as otherwise you'll get Swift link errors against vmnet. I had to force my installation to use the right toolchain via:</p>\n<pre><code>sudo xcode-select --switch /Applications/Xcode-beta.app/Contents/Developer\n</code></pre>\n<p>Once that was done, it was simple to clone and install the <a href=\"https://github.com/apple/container\">container\nrepo</a> with a <code>make install</code>. The first\nthing I noticed is that everything is written in Swift with no Go in sight.\nThey still use Protobuf for communication among the daemons, as most of the\nwider Docker ecosystem does.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/macos-ss-1.webp\" title=\"I have mixed feelings about the new glass UI in macOS Tahoe. 
The tabs in the terminal are so low contrast they're impossible to distinguish!\"/></p>\n<h2 id=\"starting-our-first-apple-container\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/apple-containerisation/#starting-our-first-apple-container\"></a>Starting our first Apple container</h2>\n<p>Let's start our daemon up and take the <code>container</code> CLI for a spin.</p>\n<pre><code class=\"language-sh\">$ container system start\nVerifying apiserver is running...\nInstalling base container filesystem...\nNo default kernel configured.\nInstall the recommended default kernel from [https://github.com/kata-containers/kata-containers/releases/download/3.17.0/kata-static-3.17.0-arm64.tar.xz]? [Y/n]: y\nInstalling kernel... \n\u2819 [1/2] Downloading kernel 33% (93.4/277.1 MB, 14.2 MB/s) [5s]\n</code></pre>\n<p>The first thing we notice is it downloading a full Linux kernel from the <a href=\"https://github.com/kata-containers/kata-containers\">Kata Containers</a> project. This system spins up a VM per container in order to provide more isolation. Although I haven't tracked Kata closely since its <a href=\"https://techcrunch.com/2017/12/05/intel-and-hyper-partner-with-the-openstack-foundation-to-launch-the-kata-containers-project/\">launch</a> in 2017, I did notice it being used to containerise <a href=\"https://confidentialcomputing.io/\">confidential computing enclaves</a> while <a href=\"https://zatkh.github.io/\">Zahra Tarkhani</a> and I were working on <a href=\"https://anil.recoil.org/projects/difc-tee\">TEE programming models</a> a few years ago.</p>\n<p>The use of Kata tells us that <code>container</code> spins up a new kernel using the\nmacOS <a href=\"https://developer.apple.com/documentation/virtualization\">Virtualization framework</a> every time a new container is started. 
This\nis ok for production use (where extra isolation may be appropriate in a\nmultitenant cloud environment) but very memory inefficient for development\n(where it's usual to spin up 4-5 VMs for a development environment with a\ndatabase etc). In contrast, Docker for Mac <a href=\"https://speakerdeck.com/avsm/the-functional-innards-of-docker-for-mac-and-windows\">uses</a> a single Linux kernel and runs\nthe containers within that instead.</p>\n<p>It's not quite clear to me why Apple chose the extra overheads of a\nVM-per-container, but I suspect this might be something to do with running code securely\ninside the <a href=\"https://support.apple.com/en-gb/guide/security/sec59b0b31ff/web\">many hardware enclaves</a>\npresent in modern Apple hardware, a usecase that is on the rise with <a href=\"https://www.apple.com/uk/apple-intelligence/\">Apple\nIntelligence</a>.</p>\n<h2 id=\"peeking-under-the-hood-of-the-swift-code\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/apple-containerisation/#peeking-under-the-hood-of-the-swift-code\"></a>Peeking under the hood of the Swift code</h2>\n<p>Once the container daemon is running, we can spin up our first container using Alpine, which uses the familiar Docker-style <code>run</code>:</p>\n<pre><code class=\"language-sh\">$ time container run alpine uname -a \nLinux 3c555c19-b235-4956-bed8-27bcede642a6 6.12.28 #1 SMP\nTue May 20 15:19:05 UTC 2025 aarch64 Linux\n0.04s user 0.01s system 6% cpu 0.733 total\n</code></pre>\n<p>The container spinup time is noticeable, but still less than a second and pretty acceptable for day to day use. This is possible thanks to a custom userspace they implement via a Swift init process that's run by the Linux kernel as the <em>sole</em> binary in the filesystem, and that provides an RPC interface to manage other services. 
The <a href=\"https://github.com/apple/containerization/tree/main/vminitd/Sources/vminitd\">vminitd</a> is built using the Swift static Linux SDK, which links <a href=\"https://musl.libc.org/\">musl libc</a> under the hood (the same one used by <a href=\"https://www.alpinelinux.org/\">Alpine Linux</a>).</p>\n<p>We can see the processes running by using <a href=\"https://man7.org/linux/man-pages/man1/pstree.1.html\">pstree</a>:</p>\n<pre><code>|- 29203 avsm /System/Library/Frameworks/Virtualization.framework/\n   Versions/A/XPCServices/com.apple.Virtualization.VirtualMachine.xpc/\n   Contents/MacOS/com.apple.Virtualization.VirtualMachine\n|- 29202 avsm &lt;..&gt;/plugins/container-runtime-linux/\n   bin/container-runtime-linux\n   --root &lt;..&gt;/f82d3a52-c89b-4ff0-9e71-c7127cb5eee1\n   --uuid f82d3a52-c89b-4ff0-9e71-c7127cb5eee1 --debug\n|- 28896 avsm &lt;..&gt;/bin/container-network-vmnet\n   start --id default\n   --service-identifier &lt;..&gt;network.container-network-vmnet.default\n|- 28899 avsm &lt;..&gt;/bin/container-core-images start\n|- 29202 avsm &lt;..&gt;/bin/container-runtime-linux\n   --root &lt;..&gt;/f82d3a52-c89b-4ff0-9e71-c7127cb5eee1\n   --uuid f82d3a52-c89b-4ff0-9e71-c7127cb5eee1 --debug\n|- 28896 avsm &lt;..&gt;/container-network-vmnet start --id default\n   --service-identifier &lt;..&gt;network.container-network-vmnet.default\n</code></pre>\n<p>You can start to see the overheads of a VM-per-container now, as each container\nneeds the host process infrastructure to not only run the computation, but also to\nfeed it with networking and storage IO (which have to be translated from the\nhost).  Still, it's a drop in the ocean for macOS these days, as I'm running 850\nprocesses in the background on my MacBook Air from an otherwise fresh\ninstallation! 
This isn't the lean, fast MacOS X Cheetah I used on my G4 Powerbook anymore,\nsadly.</p>\n<h3 id=\"finding-the-userspace-ext4-in-swift\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/apple-containerisation/#finding-the-userspace-ext4-in-swift\"></a>Finding the userspace ext4 in Swift</h3>\n<p>I then tried to run a more interesting container for my local dev environment:\nthe <a href=\"https://hub.docker.com/r/ocaml/opam\">ocaml/opam</a> Docker images that we use\nin OCaml development.  This showed up an interesting new twist in the Apple\nrewrite: they have an entire <a href=\"https://en.wikipedia.org/wiki/Ext4\">ext4</a> filesystem <a href=\"https://github.com/apple/containerization/tree/main/Sources/ContainerizationEXT4\">implementation written in\nSwift</a>!\nThis is used to extract the OCI images from the Docker registry and then\nconstruct a new filesystem.</p>\n<pre><code class=\"language-sh\">$ container run ocaml/opam opam list\n\u2826 [2/6] Unpacking image for platform linux/arm64 (112,924 entries, 415.9 MB, Zero KB/s) [9m 22s] \n\u2839 [2/6] Unpacking image for platform linux/arm64 (112,972 entries, 415.9 MB, Zero KB/s) [9m 23s] \n\u2807 [2/6] Unpacking image for platform linux/arm64 (113,012 entries, 415.9 MB, Zero KB/s) [9m 23s] \n\u283c [2/6] Unpacking image for platform linux/arm64 (113,059 entries, 415.9 MB, Zero KB/s) [9m 23s] \n\u280b [2/6] Unpacking image for platform linux/arm64 (113,104 entries, 415.9 MB, Zero KB/s) [9m 24s] \n# Packages matching: installed                                                                      \n# Name                # Installed # Synopsis\nbase-bigarray         base\nbase-domains          base\nbase-effects          base\nbase-threads          base\nbase-unix             base\nocaml                 5.3.0       The OCaml compiler (virtual package)\nocaml-base-compiler   5.3.0       pinned to version 5.3.0\nocaml-compiler        5.3.0       Official release of OCaml 
5.3.0\nocaml-config          3           OCaml Switch Configuration\nopam-depext           1.2.3       Install OS distribution packages\n</code></pre>\n<p>The only hitch here is how slow this process is. The OCaml images do have a lot of individual\nfiles within the layers (not unusual for a package manager), but I was surprised that this took\n10 minutes on my modern M4 Macbook Air, versus a few seconds on Docker for Mac.  I <a href=\"https://github.com/apple/container/issues/136\">filed a bug</a> upstream to investigate further since (as with any new implementation) there are many <a href=\"https://anil.recoil.org/papers/2015-sosp-sibylfs\">edge cases</a> when handling filesystems in userspace, and the Apple code seems to have <a href=\"https://github.com/apple/container/issues/134\">other limitations</a> as well.  I'm sure this will all shake out as the framework gets more users, but it's worth bearing in mind if you're thinking of using it in the near term in a product.</p>\n<h2 id=\"whats-conspicuously-missing\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/apple-containerisation/#whats-conspicuously-missing\"></a>What's conspicuously missing?</h2>\n<p>I was super excited when this announcement first happened, since I thought it might be the beginning of a few features I've needed for years and years. But they're missing...</p>\n<h3 id=\"running-macos-containers-nope\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/apple-containerisation/#running-macos-containers-nope\"></a>Running macOS containers: nope</h3>\n<p>In OCaml-land, we have gone to ridiculous lengths to be able to run macOS CI on our own infrastructure. 
<a href=\"https://patrick.sirref.org\">Patrick Ferris</a> first wrote a <a href=\"https://tarides.com/blog/2023-08-02-obuilder-on-macos/\">custom snapshotting builder</a> using undocumented interfaces like userlevel sandboxing, subsequently taken over and maintained by <a href=\"https://www.tunbury.org/\">Mark Elvers</a>. This is a tremendous amount of work to maintain, but the alternative is to depend on very expensive hosted services to spin up individual macOS VMs which are slow and energy hungry.</p>\n<p>What we <em>really</em> need are macOS containers! We have dozens of mechanisms to run Linux ones already, and only a few <a href=\"https://github.com/dockur/macos\">heavyweight alternatives</a> to run macOS itself within macOS. However, the VM-per-container mechanism chosen by Apple might be the gateway to supporting macOS itself in the future. I will be first in line to test this if it happens!</p>\n<h3 id=\"running-ios-containers-nope\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/apple-containerisation/#running-ios-containers-nope\"></a>Running iOS containers: nope</h3>\n<p>Waaaay back when we were <a href=\"https://speakerdeck.com/avsm/the-functional-innards-of-docker-for-mac-and-windows\">first writing</a> Docker for Mac, there were no mainstream users of the Apple Hypervisor framework at all (that's why we built and released <a href=\"https://github.com/moby/hyperkit\">Hyperkit</a>). The main benefit we hoped to derive from using Apple-blessed frameworks is that they would make our app App-Store friendly for distribution via those channels.</p>\n<p>But while there do exist <a href=\"https://developer.apple.com/documentation/bundleresources/entitlements/com.apple.security.hypervisor\">entitlements</a> to support virtualisation on macOS, there is <em>no</em> support for iOS or iPadOS to this day! 
All of the trouble to sign binaries and deal with entitlements and opaque Apple tooling only gets it onto the Mac App store, which is a little bit of a graveyard compared to the iOS ecosystem.\nThis thus remains on my wishlist for Apple: the hardware on modern iPad devices <em>easily</em> supports virtualisation, but Apple is choosing to cripple these devices from having a decent development experience by not unlocking the software capability that would allow the hypervisor, virtualisation and container frameworks to run on them.</p>\n<h3 id=\"running-linux-containers-yeah-but-no-gpu\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/apple-containerisation/#running-linux-containers-yeah-but-no-gpu\"></a>Running Linux containers: yeah but no GPU</h3>\n<p>One reason to run Linux containers on macOS is to handle machine learning workloads. Actually getting this to be performant is tricky, since macOS has its own custom <a href=\"https://github.com/ml-explore/mlx\">MLX-based</a> approach to handling tensor computations. Meanwhile, the rest of the world mostly uses NVIDIA or AMD interfaces for those GPUs, which is reflected in container images that are distributed.</p>\n<p>There is some chatter on the <a href=\"https://github.com/apple/container/discussions/62#discussioncomment-13414483\">apple/container GitHub</a> about getting GPU passthrough working, but I'm still unclear on how to get a more portable GPU ABI. The reason Linux containers work so well is that the Linux kernel provides a very stable ABI, but this breaks down with GPUs badly.</p>\n<h1 id=\"does-this-threaten-dockers-dominance\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/apple-containerisation/#does-this-threaten-dockers-dominance\"></a>Does this threaten Docker's dominance?</h1>\n<p>I have mixed feelings about the Containerization framework release. 
On one hand, it's always fun to see more systems code in a new language like Swift, and this is an elegant and clean reimplementation of classic containerisation techniques in macOS. But the release <strong>fails to unlock any real new end-user capabilities</strong>, such as running a decent development environment on my iPad without using cloud services. Come on Apple, you can make that happen; you're getting ever closer every release!</p>\n<p>I don't believe that Docker or Orbstack are too threatened by this release at this stage either, despite some reports that <a href=\"https://appleinsider.com/articles/25/06/09/sorry-docker-macos-26-adds-native-support-for-linux-containers\">they're being Sherlocked</a>. The Apple container CLI is quite low-level, and there's a ton of quality-of-life features in the full Docker for Mac app that'll keep me using it, and there seems to be no real blocker to Docker adopting the Containerization framework as one of its optional backends. I prefer having a single VM for my devcontainers to keep my laptop battery life going, so I think Docker's current approach is better for that usecase.</p>\n<p>Apple has been a very good egg here by open sourcing all their code, so I believe this will overall help the Linux container ecosystem by adding choice to how we deploy software containers. Well done <a href=\"https://github.com/crosbymichael\">Michael Crosby</a>, <a href=\"https://github.com/mavenugo\">Madhu Venugopal</a> and many of my other former colleagues who are all merrily hacking away on this for doing so!  As an aside, I'm also just revising a couple of papers about the history of using OCaml in several Docker components, and a retrospective look back at the hypervisor architecture backing Docker for Desktop, which will appear in print in the next couple of months (I'll update this post when they appear). 
But for now, back to my day job of marking undergraduate exam scripts...</p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>vmnet is a networking framework for VMs/containers that I had to <a href=\"https://github.com/mirage/ocaml-vmnet\">reverse engineer</a> back in 2014 to use with OCaml/MirageOS.</p>\n<a class=\"reversefootnote\" href=\"https://anil.recoil.org/notes/apple-containerisation/#fnref:1\">\u21a9</a></p></li></ol></div><h1>References</h1><ul><li>Ridge et al (2015). SibylFS: formal specification and oracle-based testing for POSIX and real-world file systems. ACM. <a href=\"https://doi.org/10.1145/2815400.2815411\" target=\"_blank\"><i>10.1145/2815400.2815411</i></a></li></ul>","doi":"https://doi.org/10.59350/70ynk-ves20","guid":"https://doi.org/10.59350/70ynk-ves20","language":"en","license":"https://creativecommons.org/licenses/by/4.0/legalcode","published_at":1749600000,"reference":[{"id":"https://doi.org/10.1145/2815400.2815411","unstructured":"<b>[cito:citesAsSourceDocument]</b>"}],"rid":"5gf2r-ag171","summary":"Apple made a notable announcement in WWDC 2025 that they've got a new containerisation framework in the new Tahoe beta. 
This took me right back to the early Docker for Mac days in 2016 when we announced the first mainstream use of the hypervisor framework, so I couldn't resist taking a quick peek under the hood.","tags":["Docker","Containers","Systems","Networking","Macos"],"title":"Under the hood with Apple's new Containerization framework","updated_at":1781259298,"url":"https://anil.recoil.org/notes/apple-containerisation","version":"v1"},{"authors":[{"affiliation":[{"id":"https://ror.org/013meh722","name":"University of Cambridge"}],"contributor_roles":[],"family":"Madhavapeddy","given":"Anil","url":"https://orcid.org/0000-0001-8954-2428"}],"blog":{"authors":null,"community_id":"472a49be-dc61-4a17-97f0-d1ff17b0dadd","created":1760313600,"current_feed_url":null,"description":null,"favicon":"https://rogue-scholar.org/api/communities/472a49be-dc61-4a17-97f0-d1ff17b0dadd/logo","feed_format":"application/feed+json","feed_url":"https://anil.recoil.org/perma.json","filter":null,"generator":"Other","home_page_url":"https://anil.recoil.org/notes","issn":null,"language":"eng","license":"https://creativecommons.org/licenses/by/4.0/legalcode","prefix":"10.59350","relative_url":null,"secure":true,"slug":"anil","status":"active","subfield":"1702","title":"Anil Madhavapeddy's feed","updated":1781222400,"use_api":null},"blog_name":"Anil Madhavapeddy's feed","blog_slug":"anil","content_html":"<p>The <a href=\"https://www.esa.int/Applications/Observing_the_Earth/FutureEO/Biomass\">BIOMASS</a> forest mission satellite was <a href=\"https://www.bbc.co.uk/newsround/articles/c0jzy3g0zx2o\">successfully</a> boosted into space a couple of days ago, after decades of development from just down the road in <a href=\"https://www.gov.uk/government/news/british-built-satellite-to-map-earths-forests-in-3d-for-the-first-time\">Stevenage</a>. 
I'm excited by this because it's the first global-scale <a href=\"https://www.esa.int/Applications/Observing_the_Earth/FutureEO/Biomass/The_instrument\">P-band SAR</a> instrument that can penetrate forest canopies to look underneath. This, when combined with <a href=\"https://anil.recoil.org/papers/2024-hyper-tropical-mapping\">hyperspectral mapping</a>, will give us a lot more <a href=\"https://anil.recoil.org/projects/rsn\">insight</a> into global tree health.</p>\n<p>Weirdly, the whole thing almost never happened: back in 2013, permission to use the <a href=\"https://ieeexplore.ieee.org/document/9048581\">P-band</a> was blocked because it might <a href=\"https://spacenews.com/us-missile-warning-radars-could-squelch-esas-proposed-biomass-mission/\">interfere with US nuclear missile warning radars</a>.</p>\n<blockquote>\n<p>Meeting in Graz, Austria, to select the 7th Earth Explorer mission to be flown by the 20-nation European Space Agency (ESA), backers of the Biomass mission were pelted with questions about how badly the U.S. network of missile warning and space-tracking radars in North America, Greenland and Europe would undermine Biomass' global carbon-monitoring objectives.</p>\n<p>Europe's Earth observation satellite system may be the world's most dynamic, but as it pushes its operating envelope into new areas, it is learning a lesson long ago taught to satellite telecommunications operators: Radio frequency is scarce, and once users have a piece of it they hold fast.\n<cite>-- <a href=\"https://spacenews.com/us-missile-warning-radars-could-squelch-esas-proposed-biomass-mission/\">Spacenews</a> (2013)</cite></p>\n</blockquote>\n<p>Luckily, all this got sorted by international frequency negotiators, and after\n<a href=\"https://www.thecomet.net/news/25125302.satellite-built-stevenage-airbus-launches-space/\">being built by Airbus in Stevenage</a>\n(and Germany and France, as it's a complex instrument!) it took off without a hitch. 
Looking forward to getting my hands on the first results later in the year over at the <a href=\"https://eo.conservation.cam.ac.uk\">Centre for Earth Observation</a>.</p>\n<p>Check out this cool <a href=\"https://www.esa.int/Applications/Observing_the_Earth/FutureEO/Biomass/The_instrument\">ESA video</a> about the instrument to learn more, and congratulations to the team at ESA. Looking forward to the next <a href=\"https://anil.recoil.org/notes/biospace-25\">BIOSPACE</a> where there will no doubt be initial buzz about this.</p>\n<p><div class=\"video-center\"><iframe allowfullscreen=\"\" frameborder=\"0\" height=\"315px\" sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\" src=\"https://crank.recoil.org/videos/embed/c3981e1f-3f2d-439a-924d-6d29de33cfe4\" title=\"BIOMASS p-band mirror\" width=\"100%\"></iframe></div></p>\n<p><em>Update 28th June 2025:</em> See also this <a href=\"https://www.bbc.co.uk/news/resources/idt-d7353b50-0fea-46ba-8495-ae9e25192cfe\">beautiful BBC article</a> about the satellite, via <a href=\"https://coomeslab.org\">David Coomes</a>.</p><h1>References</h1><ul><li>Madhavapeddy (2025). ESA's first BioSpace conference seems a huge success. <a href=\"https://doi.org/10.59350/vd6af-4bc83\" target=\"_blank\"><i>10.59350/vd6af-4bc83</i></a></li>\n<li>Ball et al (2024). Harnessing temporal &amp; spectral dimensionality to identify individual trees in tropical forests. bioRxiv. <a href=\"https://doi.org/10.1101/2024.06.24.600405\" target=\"_blank\"><i>10.1101/2024.06.24.600405</i></a></li>\n<li>Li et al (2019). The P-band SAR Satellite: Opportunities and Challenges. 2019 6th Asia-Pacific Conference on Synthetic Aperture Radar (APSAR). 
<a href=\"https://doi.org/10.1109/APSAR46974.2019.9048581\" target=\"_blank\"><i>10.1109/APSAR46974.2019.9048581</i></a></li></ul>","doi":"https://doi.org/10.59350/53zjq-ft509","guid":"https://doi.org/10.59350/53zjq-ft509","language":"en","license":"https://creativecommons.org/licenses/by/4.0/legalcode","published_at":1746057600,"reference":[{"id":"https://doi.org/10.59350/vd6af-4bc83","unstructured":"<b>[cito:citesAsRelated]</b>"},{"id":"https://doi.org/10.1101/2024.06.24.600405","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.1109/apsar46974.2019.9048581","unstructured":"<b>[cito:cites]</b>"}],"rid":"46t51-7zq39","summary":"The BIOMASS forest mission satellite was successfully boosted into space a couple of days ago, after decades of development from just down the road in Stevenage. I'm excited by this because it's the first global-scale P-band SAR instrument that can penetrate forest canopies to look underneath. This, when combined with hyperspectral mapping, will give us a lot more insight into global tree health.","tags":["Sensing","Space","Satellite","Forests","Biodiversity"],"title":"BIOMASS launches to measure forest carbon flux from space","updated_at":1781259297,"url":"https://anil.recoil.org/notes/biomass-launches","version":"v1"},{"authors":[{"affiliation":[{"id":"https://ror.org/013meh722","name":"University of 
Cambridge"}],"contributor_roles":[],"family":"Madhavapeddy","given":"Anil","url":"https://orcid.org/0000-0001-8954-2428"}],"blog":{"authors":null,"community_id":"472a49be-dc61-4a17-97f0-d1ff17b0dadd","created":1760313600,"current_feed_url":null,"description":null,"favicon":"https://rogue-scholar.org/api/communities/472a49be-dc61-4a17-97f0-d1ff17b0dadd/logo","feed_format":"application/feed+json","feed_url":"https://anil.recoil.org/perma.json","filter":null,"generator":"Other","home_page_url":"https://anil.recoil.org/notes","issn":null,"language":"eng","license":"https://creativecommons.org/licenses/by/4.0/legalcode","prefix":"10.59350","relative_url":null,"secure":true,"slug":"anil","status":"active","subfield":"1702","title":"Anil Madhavapeddy's feed","updated":1781222400,"use_api":null},"blog_name":"Anil Madhavapeddy's feed","blog_slug":"anil","content_html":"<p>The exam marking is over, and a glorious Cambridge summer awaits! This year, we\nhave a sizeable cohort of undergraduate and graduate interns joining us from\nnext week.</p>\n<p>This note serves as a point of coordination to keep track of what's\ngoing on, and I'll update it as we get ourselves organised.\nIf you're an intern, then I highly recommend you take the time to carefully\nread through all of this, starting with <a href=\"https://anil.recoil.org/notes/eeg-interns-2025/#who-we-all-are-this-summer\">who we are</a>,\nsome <a href=\"https://anil.recoil.org/notes/eeg-interns-2025/#ground-rules\">ground rules</a>, <a href=\"https://anil.recoil.org/notes/eeg-interns-2025/#where-we-will-work\">where we will work</a>,\n<a href=\"https://anil.recoil.org/notes/eeg-interns-2025/#registering-on-chat-channels\">how we chat</a>, <a href=\"https://anil.recoil.org/notes/eeg-interns-2025/#how-you-will-get-paid\">how to get paid</a>, and of course <a href=\"https://anil.recoil.org/notes/eeg-interns-2025/#summer-social-activities\">social activities</a> to make sure we have some fun!</p>\n<h2 
id=\"who-we-all-are-this-summer\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/eeg-interns-2025/#who-we-all-are-this-summer\"></a>Who we all are this summer</h2>\n<p>We're working on quite the diversity of projects this summer, ranging from classic\ncomputer systems and programming problems all the way through to environmental\nscience. Here's a recap of what's going on.</p>\n<p>First we're working against the <a href=\"https://anil.recoil.org/projects/ce\">evidence database</a> we've been building for the past couple of years:</p>\n<ul>\n<li><em>\"<a href=\"https://anil.recoil.org/ideas/ai-assisted-inclusion-criteria\">Evaluating a human-in-the-loop AI framework to improve inclusion criteria for evidence synthesis</a>\"</em> with <a href=\"mailto:ra684@cam.ac.uk\">Radhika Agrawal</a>, supervised by <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a></li>\n<li><em>\"<a href=\"https://anil.recoil.org/ideas/accurate-summarisation-for-ce\">Accurate summarisation of threats for conservation evidence literature</a>\"</em> with <a href=\"mailto:kh807@cam.ac.uk\">Kittson Hamill</a>, supervised by <a href=\"https://toao.com\">Sadiq Jaffer</a> following up her successful MPhil submission.</li>\n</ul>\n<p>We're then heading into <a href=\"https://anil.recoil.org/projects/rsn\">remote sensing</a> and working on some mapping projects:</p>\n<ul>\n<li><em>\"<a href=\"https://anil.recoil.org/ideas/cairngorms-connect-habitats\">Habitat mapping of the Cairngorms Connect restoration area</a>\"</em> with <a href=\"https://github.com/Isabel-Mansley\">Isabel Mansley</a>, supervised by <a href=\"https://coomeslab.org\">David Coomes</a> and <a href=\"https://eo.conservation.cam.ac.uk/people/aland-chan/\">Aland Chan</a></li>\n<li><em>\"<a href=\"https://anil.recoil.org/ideas/hedgehog-mapping\">Mapping urban and rural British hedgehogs</a>\"</em> with <a 
href=\"https://www.theboatrace.org/athletes/gabriel-mahler\">Gabriel Mahler</a>, supervised by <a href=\"https://www.cambridgeconservation.org/about/people/dr-silviu-o-petrovan/\">Silviu Petrovan</a>, as well as writing up his MPhil dissertation on <em>\"<a href=\"https://anil.recoil.org/ideas/walkability-for-osm\">Enhancing Navigation Algorithms with Semantic Embeddings</a>\"</em></li>\n<li><em>\"<a href=\"https://anil.recoil.org/ideas/validating-anti-poaching-predictions\">Validating predictions with ranger insights to enhance anti-poaching patrol strategies in protected areas</a>\"</em> with <a href=\"mailto:hm708@cam.ac.uk\">Hannah McLoone</a>, supervised by <a href=\"https://charlesemogor.com\">Charles Emogor</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/professor-rob-fletcher\">Rob Fletcher</a></li>\n</ul>\n<p>Dropping down towards <a href=\"https://anil.recoil.org/projects/osmose\">embedded systems</a> and fun \"real-world\" projects, we have:</p>\n<ul>\n<li><em>\"<a href=\"https://anil.recoil.org/ideas/digitisation-of-insects\">Affordable digitisation of insect collections using photogrammetry</a>\"</em> with <a href=\"mailto:bsys2@cam.ac.uk\">Beatrice Spence</a>, <a href=\"mailto:ntay2@cam.ac.uk\">Anna Yiu</a> and <a href=\"mailto:aer82@cam.ac.uk\">Arissa-Elena Rotunjanu</a>, supervised by <a href=\"https://www.cambridgephilosophicalsociety.org/funding/henslow-fellows/dr-tiffany-ki%0A\">Tiffany Ki</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/dr-edgar-turner\">Edgar Turner</a></li>\n<li><em>\"<a href=\"https://anil.recoil.org/ideas/3d-print-world\">3D printing the planet (or bits of it)</a>\"</em> with <a href=\"mailto:fs618@cam.ac.uk\">Finley Stirk</a>, supervised by <a href=\"https://mynameismwd.org\">Michael Dales</a></li>\n<li><em>\"<a href=\"https://anil.recoil.org/ideas/embedded-whisper\">Low power audio transcription with Whisper</a>\"</em> with <a href=\"mailto:dk729@cam.ac.uk\">Dan Kvit</a> and <em>\"<a 
href=\"https://anil.recoil.org/ideas/battery-free-riotee\">Battery-free wildlife monitoring with Riotee</a>\"</em> with <a href=\"mailto:dp717@cam.ac.uk\">Dominico Parish</a>, both supervised by <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a></li>\n</ul>\n<p>Going back to classic computer science, we have a few programming language and systems projects:</p>\n<ul>\n<li><em>\"<a href=\"https://anil.recoil.org/ideas/hazel-to-ocaml-to-hazel\">Bidirectional Hazel to OCaml programming</a>\"</em> with <a href=\"https://maxcarroll0.github.io/blog/\">Max Carroll</a>, supervised by <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> and <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a></li>\n<li><em>\"<a href=\"https://anil.recoil.org/ideas/effects-scheduling-ocaml-compiler\">Effects based scheduling for the OCaml compiler pipeline</a>\"</em> with <a href=\"mailto:khm39@cam.ac.uk\">Lucas Ma</a> and <em>\"<a href=\"https://anil.recoil.org/ideas/ocaml-bytecode-native-ffi\">Runtimes \u00e0 la carte: crossloading native and bytecode OCaml</a>\"</em> with <a href=\"mailto:jc2483@cam.ac.uk\">Jeremy Chen</a>, both supervised by <a href=\"https://www.dra27.uk\">David Allsopp</a></li>\n<li><em>\"<a href=\"https://anil.recoil.org/ideas/zfs-filesystem-perf\">ZFS replication strategies with encryption</a>\"</em> with <a href=\"mailto:btt31@cam.ac.uk\">Becky Terefe-Zenebe</a>, supervised by <a href=\"https://www.tunbury.org/\">Mark Elvers</a></li>\n</ul>\n<h2 id=\"ground-rules\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/eeg-interns-2025/#ground-rules\"></a>Ground rules</h2>\n<p>Since there are so many of us this summer, it's imperative that you're all\n<strong>proactive about communicating</strong> any problems or clarifications you need. 
If something\nhere doesn't make sense, or you have a better idea, then just reach out to any\nof the supervisors or me directly!</p>\n<p>Do also take time to <strong>learn from each other</strong>. Read up not just on your own project in the\nlist above, but take some time to read the remainder so that you have a sense of what everyone\nis working on. When you see each other, it'll be much easier to chat about what's going\non and find opportunities for commonality.</p>\n<p>The projects above have been carefully selected to <strong>not be on the critical path</strong> for any\ndeadlines. If it's not going well from your perspective, then it's ok to take a step back\nand figure out why! We're here to learn and discover things, so take the time to do so.</p>\n<h2 id=\"where-we-will-work\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/eeg-interns-2025/#where-we-will-work\"></a>Where we will work</h2>\n<p>This will be different for everyone, since it depends on which home department will house the project.\nSome of us will be in the David Attenborough Building, on the third floor where the <a href=\"https://www.conservation.cam.ac.uk\">CRI</a> is:</p>\n<ul>\n<li><a href=\"mailto:ra684@cam.ac.uk\">Radhika Agrawal</a> and <a href=\"mailto:kh807@cam.ac.uk\">Kittson Hamill</a> will be with the <a href=\"https://anil.recoil.org/projects/ce\">CE</a> crew near <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a>'s office</li>\n<li><a href=\"https://github.com/Isabel-Mansley\">Isabel Mansley</a> and <a href=\"https://www.theboatrace.org/athletes/gabriel-mahler\">Gabriel Mahler</a> will hang out with <a href=\"https://coomeslab.org\">David Coomes</a>'s group</li>\n<li><a href=\"mailto:hm708@cam.ac.uk\">Hannah McLoone</a> can work near <a href=\"https://www.zoo.cam.ac.uk/directory/professor-rob-fletcher\">Rob Fletcher</a>'s office where <a href=\"https://charlesemogor.com\">Charles Emogor</a> works</li>\n</ul>\n<p>Those 
working on the Zoology Museum itself (<a href=\"mailto:aer82@cam.ac.uk\">Arissa-Elena Rotunjanu</a>, <a href=\"mailto:bsys2@cam.ac.uk\">Beatrice Spence</a> and <a href=\"mailto:ntay2@cam.ac.uk\">Anna Yiu</a>) will have a health and safety induction on Monday with <a href=\"https://www.cambridgephilosophicalsociety.org/funding/henslow-fellows/dr-tiffany-ki%0A\">Tiffany Ki</a> and find offices there.</p>\n<p>The rest of us will be in the Computer Lab over in West Cambridge:</p>\n<ul>\n<li><a href=\"mailto:khm39@cam.ac.uk\">Lucas Ma</a> and <a href=\"mailto:jc2483@cam.ac.uk\">Jeremy Chen</a> will work out of FW15 with <a href=\"https://www.dra27.uk\">David Allsopp</a> and <a href=\"https://jon.recoil.org\">Jon Ludlam</a></li>\n<li><a href=\"mailto:dk729@cam.ac.uk\">Dan Kvit</a>, <a href=\"mailto:fs618@cam.ac.uk\">Finley Stirk</a>, <a href=\"mailto:btt31@cam.ac.uk\">Becky Terefe-Zenebe</a> and <a href=\"mailto:dp717@cam.ac.uk\">Dominico Parish</a> will be in FW15/14.  We may need to clear out one desk in FW15 to make room here (just put the stuff in my office in FW16). <a href=\"https://mynameismwd.org\">Michael Dales</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a> will work out of my office (FW16) for the summer, and <a href=\"https://www.cst.cam.ac.uk/people/og309\">Onkar Gulati</a> is away for an internship in the USA.</li>\n<li>We'll find somewhere for <a href=\"https://maxcarroll0.github.io/blog/\">Max Carroll</a> either in West Cambridge or in Pembroke soon, depending on preferences and heat!</li>\n</ul>\n<p>It'll probably take a week to let this all shake out, so please do shout if you find yourself stuck in your room and without an office! 
You should of course arrange to meet your immediate supervisors regularly according to whatever schedule and location work for you.</p>\n<h2 id=\"how-you-will-get-paid\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/eeg-interns-2025/#how-you-will-get-paid\"></a>How you will get paid</h2>\n<p>The way you get paid weekly is via the <a href=\"https://www.hrsystems.admin.cam.ac.uk/systems/systems-overview/ccws\">Cambridge Casual Worker</a> system. This has a few important steps that you <strong>must</strong> pay attention to, or you will not get paid!</p>\n<ul>\n<li><strong>Before starting work</strong> you must go find <a href=\"https://www.cst.cam.ac.uk/people/ac733\">Alicja Zavros</a> in the Computer Lab with your passport or other proof of your right to work in the UK.  I've told Alicja that many of you will show up on the morning of Monday 30th June. It won't take more than a few minutes, as she'll take a photocopy of your ID. You should also have registered on the <a href=\"https://www.hrsystems.admin.cam.ac.uk/systems/systems-overview/ccws\">CCWS</a> and gotten a login.</li>\n<li><strong>Every Friday</strong> that you do some work, fill in a timesheet on the CCWS. Round this off to a full day (8 hours) and don't do fine-grained timekeeping; just the number of days you've worked is fine. If you don't fill in a timesheet promptly, you won't get paid.</li>\n<li><strong>You must keep a research log with weeknotes</strong> that record what you've been up to. The exact style of weeknotes is entirely up to you, but it's vital that you get in the habit of keeping a log. If you have your own homepage, then send an <a href=\"https://en.wikipedia.org/wiki/Atom_(web_standard)\">Atom feed</a> to me. If you don't, then we have a <a href=\"https://github.com/ucam-eo/interns-2025\">github/ucam-eo/interns-2025</a> which I can give you write access to.  
It's typical to store your weeknotes in Markdown format, and just a simple subdirectory with a date-based convention is fine. The primary use of weeknotes is to highlight things you've accomplished, areas where you are blocked, and interesting things you have run across. Try to make it a record for your future self, and also a way to let those around you know what's going on. While missing the occasional weeknote is just fine, missing them all will be a problem, so plan your time accordingly.  Weeknotes are also <em>not</em> a mechanism to assess anything to do with your progress, but a simple form of communication.</li>\n</ul>\n<h2 id=\"registering-on-chat-channels\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/eeg-interns-2025/#registering-on-chat-channels\"></a>Registering on chat channels</h2>\n<p>Since we're all going to be spread around Cambridge physically, it's important to have a chat channel. <a href=\"mailto:hm708@cam.ac.uk\">Hannah McLoone</a> is setting up a WhatsApp group for social things (see below), but we also use <a href=\"https://matrix.org\">Matrix</a> as our \"hackers choice\" for day-to-day messaging.</p>\n<p>We host a Computer Lab <a href=\"https://matrix.org\">Matrix</a> server on which anyone with a valid Raven account can create an account. Since Matrix is a decentralised chat system, it is also possible to use other accounts from third-party servers, and also to join channels elsewhere.</p>\n<p>To create an account:</p>\n<ul>\n<li>In your Matrix client (we most commonly use <a href=\"https://element.io\">Element</a>), select <code>eeg.cl.cam.ac.uk</code> as your homeserver.</li>\n<li>Log in with SSO (Single Sign On)</li>\n<li>You should see a Cambridge authentication screen for your CRSID.</li>\n</ul>\n<p>Once you create your account, you will be in the \"EEG\" Matrix space.  
A <a href=\"https://matrix.org/blog/2021/05/17/the-matrix-space-beta/\">Matrix space</a> is a collection of channels, and you should join \"EEGeneral\" as the overall channel for the group. We'll create a separate room just for intern chats. We also have a bot in the room that posts our blogs to the channel, so you can keep up with what the group members are all chattering about. <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> runs the CL matrix server, and there are occasional quirks, so just let us know if you run into any problems.  I am <code>@avsm:recoil.org</code> on there, not <code>avsm2</code> as I use my personal Matrix for a bunch of stuff.</p>\n<h2 id=\"summer-social-activities\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/eeg-interns-2025/#summer-social-activities\"></a>Summer social activities</h2>\n<p>It's important to get some downtime this summer and recharge. <a href=\"mailto:hm708@cam.ac.uk\">Hannah McLoone</a> has been setting up a social group for the interns to hang out together, and we'll organise a punting excursion at some point to get us out to the river.  Of course, many of us will be travelling this summer (I'm heading off to Botswana in late July for instance), so please do also make suggestions.</p>","doi":"https://doi.org/10.59350/tf22g-p1822","guid":"https://doi.org/10.59350/tf22g-p1822","language":"en","license":"https://creativecommons.org/licenses/by/4.0/legalcode","published_at":1751068800,"rid":"0wpqs-a7079","summary":"The exam marking is over, and a glorious Cambridge summer awaits! This year, we have a sizeable cohort of undergraduate and graduate interns joining us from next week. 
This note serves as a point of coordination to keep track of what's going on, and I'll update it as we get ourselves organised.","tags":["Urop"],"title":"EEG internships for the summer of 2025","updated_at":1781259296,"url":"https://anil.recoil.org/notes/eeg-interns-2025","version":"v1"},{"authors":[{"affiliation":[{"id":"https://ror.org/013meh722","name":"University of Cambridge"}],"contributor_roles":[],"family":"Madhavapeddy","given":"Anil","url":"https://orcid.org/0000-0001-8954-2428"}],"blog":{"authors":null,"community_id":"472a49be-dc61-4a17-97f0-d1ff17b0dadd","created":1760313600,"current_feed_url":null,"description":null,"favicon":"https://rogue-scholar.org/api/communities/472a49be-dc61-4a17-97f0-d1ff17b0dadd/logo","feed_format":"application/feed+json","feed_url":"https://anil.recoil.org/perma.json","filter":null,"generator":"Other","home_page_url":"https://anil.recoil.org/notes","issn":null,"language":"eng","license":"https://creativecommons.org/licenses/by/4.0/legalcode","prefix":"10.59350","relative_url":null,"secure":true,"slug":"anil","status":"active","subfield":"1702","title":"Anil Madhavapeddy's feed","updated":1781222400,"use_api":null},"blog_name":"Anil Madhavapeddy's feed","blog_slug":"anil","content_html":"<p>For the past few years, <a href=\"https://toao.com\">Sadiq Jaffer</a> and I have been working with our colleagues in\n<a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence</a> to do <a href=\"https://anil.recoil.org/papers/2024-ce-llm\">analysis at scale</a> on the\nacademic literature. Getting local access to millions of fulltext papers has not\nbeen without drama, but was made possible thanks to huge amounts of help from our\n<a href=\"https://www.lib.cam.ac.uk/\">University Library</a> who helped us navigate our\nrelationships with scientific publishers. 
We have just <strong><a href=\"https://rdcu.be/evkfj\">published a comment\nin Nature</a></strong> about the next phase\nof our research, where we are looking into the impact of AI advances on evidence synthesis.</p>\n<p><a href=\"https://rdcu.be/evkfj\"> <img alt=\"%c\" src=\"https://anil.recoil.org/images/davidparkins-ai-poison.webp\" title=\"AI poisoning the literature in a legendary cartoon. Credit: David Parkins, Nature\"/> </a></p>\n<p>Our work on literature reviews led us into assessing methods for <a href=\"https://royalsociety.org/news-resources/projects/evidence-synthesis/\">evidence\nsynthesis</a>\n(which is crucial to rational policymaking!) and specifically how recent advances in AI may\nimpact it.  The current methods for <a href=\"https://en.wikipedia.org/wiki/Systematic_review\">rigorous systematic literature review</a> are expensive and slow, and authors are already struggling to keep up with the <a href=\"https://ourworldindata.org/grapher/scientific-and-technical-journal-articles?time=latest\">rapidly expanding</a>\nnumber of legitimate papers. Adding to this, <a href=\"https://retractionwatch.com/2025/\">paper retractions</a> are increasing near\n<a href=\"https://doi.org/10.1038/d41586-023-03974-8\">exponentially</a>, and\nsystematic reviews already <a href=\"https://retractionwatch.com/the-retraction-watch-leaderboard/top-10-most-highly-cited-retracted-papers/\">unknowingly cite</a>\nretracted papers, with most remaining uncorrected even a year after notification!</p>\n<p>This is all made much more complex as LLMs are flooding the landscape with\nconvincing, fake manuscripts and doctored data, potentially overwhelming our\ncurrent ability to distinguish fact from fiction.  
Just this March, the <a href=\"https://sakana.ai/ai-scientist/\">AI\nScientist</a> formulated hypotheses, designed and\nran experiments, analysed the results, generated the figures and produced a\nmanuscript that <a href=\"https://sakana.ai/ai-scientist-first-publication/\">passed human peer\nreview</a> for an ICLR\nworkshop! Distinguishing genuine papers from those produced by LLMs isn't just\na problem for review authors; it's a threat to the very foundation of\nscientific knowledge. And meanwhile, Google is taking a different tack with a\ncollaborative <a href=\"https://research.google/blog/accelerating-scientific-breakthroughs-with-an-ai-co-scientist/\">AI co-scientist</a> that acts as a multi-agent assistant.</p>\n<p>So the landscape is moving <em>really</em> quickly! Our proposal for the future of\nliterature reviews builds on our desire to move towards a more regional,\nfederated network approach. Instead of having giant repositories of knowledge\nthat <a href=\"https://en.wikipedia.org/wiki/2025_United_States_government_online_resource_removals\">may be erased unilaterally</a>,\nwe're aiming for a more bilateral network of \"living evidence databases\".\nEvery government, especially those in the Global South, should have the ability to build their\nown \"<a href=\"https://anil.recoil.org/notes/uk-national-data-lib\">national data libraries</a>\" which represent the body\nof digital data that affects their own regional needs.</p>\n<p>This system of living evidence databases can be incremental and dynamically\nupdated, and AI assistance can be used as long as humans remain in the loop.\nSuch a system can continuously gather, screen, and index literature,\nautomatically removing compromised studies and recalculating results.  
We're\nworking on this on multiple fronts this year; ranging from the computer science\nto figure out the distributed-nitty-gritty <sup id=\"fnref:1\"><a class=\"footnote\" href=\"https://anil.recoil.org/notes/ai-poisoning/#fn:1\">[1]</a></sup>, over to working with the\n<a href=\"https://anil.recoil.org/notes/nas-rs-biodiversity\">GEOBON folk</a> on global biodiversity <a href=\"https://www.tunbury.org/2025/07/02/bon-in-a-box/\">data\nmanagement</a>, and continuing\nto drive the core LED design at Conservation Evidence. It feels like a</p>\n<p>Read our <a href=\"https://www.nature.com/articles/d41586-025-02069-w\">Nature Comment piece</a> (<a href=\"https://www.linkedin.com/posts/anilmadhavapeddy_will-ai-speed-up-literature-reviews-or-derail-activity-7348317711002705920-Y5UT?rcm=ACoAAAB0Kb0BNo1v6ylsGU2NtPa95mj-w1VcaJA\">comment on LI</a>) to learn more about how we think we can safeguard evidence synthesis against the rising tide of \"AI-poisoned literature\" and ensure the continued integrity of scientific discovery. As a random bit of trivia, the incredibly cool artwork in the piece was drawn by the legendary <a href=\"https://www.davidparkins.com/\">David Parkins</a>, who also drew <a href=\"https://www.beano.com/\">Beano</a> and <a href=\"https://en.wikipedia.org/wiki/Dennis_the_Menace_and_Gnasher\">Dennis the Menace</a>!</p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>My instinct is that we'll end up with something <a href=\"https://arxiv.org/abs/2402.03239\">ATProto based</a> as it's so convenient for <a href=\"https://www.tunbury.org/2025/04/25/bluesky-ssh-authentication/\">distributed system authentication</a>.</p>\n<a class=\"reversefootnote\" href=\"https://anil.recoil.org/notes/ai-poisoning/#fnref:1\">\u21a9</a></p></li></ol></div><h1>References</h1><ul><li>Reynolds et al (2025). Will AI speed up literature reviews or derail them entirely?. Nature Publishing Group. 
<a href=\"https://doi.org/10.1038/d41586-025-02069-w\" target=\"_blank\"><i>10.1038/d41586-025-02069-w</i></a></li>\n<li>Madhavapeddy (2025). What I learnt at the National Academy of Sciences US-UK Forum on Biodiversity. <a href=\"https://doi.org/10.59350/j6zkp-n7t82\" target=\"_blank\"><i>10.59350/j6zkp-n7t82</i></a></li>\n<li>Madhavapeddy (2025). Thoughts on the National Data Library and private research data. <a href=\"https://doi.org/10.59350/fk6vy-5q841\" target=\"_blank\"><i>10.59350/fk6vy-5q841</i></a></li>\n<li>Iyer et al (2025). Careful design of Large Language Model pipelines enables expert-level retrieval of evidence-based information from syntheses and databases. <a href=\"https://doi.org/10.1371/journal.pone.0323563\" target=\"_blank\"><i>10.1371/journal.pone.0323563</i></a></li>\n<li>Noorden (2023). More than 10,000 research papers were retracted in 2023 \u2014 a new record. Nature. <a href=\"https://doi.org/10.1038/d41586-023-03974-8\" target=\"_blank\"><i>10.1038/d41586-023-03974-8</i></a></li>\n<li>Kleppmann et al (2024). Bluesky and the AT Protocol: Usable Decentralized Social Media. Proceedings of the ACM Conext-2024 Workshop on the Decentralization of the Internet. 
<a href=\"https://doi.org/10.1145/3694809.3700740\" target=\"_blank\"><i>10.1145/3694809.3700740</i></a></li></ul>","doi":"https://doi.org/10.59350/pbxew-d2j78","guid":"https://doi.org/10.59350/pbxew-d2j78","language":"en","license":"https://creativecommons.org/licenses/by/4.0/legalcode","published_at":1751932800,"reference":[{"id":"https://doi.org/10.1038/d41586-025-02069-w","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.59350/j6zkp-n7t82","unstructured":"<b>[cito:citesAsRelated]</b>"},{"id":"https://doi.org/10.59350/fk6vy-5q841","unstructured":"<b>[cito:citesAsRelated]</b>"},{"id":"https://doi.org/10.1371/journal.pone.0323563","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.1038/d41586-023-03974-8","unstructured":"<b>[cito:cites]</b>"},{"id":"https://doi.org/10.1145/3694809.3700740","unstructured":"<b>[cito:cites]</b>"}],"rid":"exrf3-3m363","summary":"For the past few years, Sadiq Jaffer and I have been working with our colleagues in Conservation Evidence to do analysis at scale on the academic literature. Getting local access to millions of fulltext papers has not been without drama, but was made possible thanks to huge amounts of help from our University Library who helped us navigate our relationships with scientific publishers.","tags":["Evidence","Llms","Ai","Federation","Networks"],"title":"Is AI poisoning the scientific literature? 
Our comment in Nature","updated_at":1781259295,"url":"https://anil.recoil.org/notes/ai-poisoning","version":"v1"},{"authors":[{"affiliation":[{"id":"https://ror.org/013meh722","name":"University of Cambridge"}],"contributor_roles":[],"family":"Madhavapeddy","given":"Anil","url":"https://orcid.org/0000-0001-8954-2428"}],"blog":{"authors":null,"community_id":"472a49be-dc61-4a17-97f0-d1ff17b0dadd","created":1760313600,"current_feed_url":null,"description":null,"favicon":"https://rogue-scholar.org/api/communities/472a49be-dc61-4a17-97f0-d1ff17b0dadd/logo","feed_format":"application/feed+json","feed_url":"https://anil.recoil.org/perma.json","filter":null,"generator":"Other","home_page_url":"https://anil.recoil.org/notes","issn":null,"language":"eng","license":"https://creativecommons.org/licenses/by/4.0/legalcode","prefix":"10.59350","relative_url":null,"secure":true,"slug":"anil","status":"active","subfield":"1702","title":"Anil Madhavapeddy's feed","updated":1781222400,"use_api":null},"blog_name":"Anil Madhavapeddy's feed","blog_slug":"anil","content_html":"<p>I was a bit sleepy getting into the Royal Society <a href=\"https://royalsociety.org/science-events-and-lectures/2025/07/future-of-scientific-publishing/\">Future of Scientific\nPublishing</a>\nconference early this morning, but was quickly woken up by the dramatic passion\non show as publishers, librarians, academics and funders all got together for a\n\"frank exchange of views\" at a meeting that didn't pull any punches!</p>\n<p>These are my hot-off-the-press livenotes and only lightly edited; a more cleaned up version will be available\nfrom the RS in due course.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-1.webp\" title=\"Sir Mark Walport FRS opens up the conference\"/></p>\n<h2 id=\"mark-walport-sets-the-scene\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/rs-future-of-publishing/#mark-walport-sets-the-scene\"></a>Mark Walport sets the 
scene</h2>\n<p>Sir Mark Walport was a delightful emcee for the proceedings of the day, and\nopened how important the moment is for the future of how we conduct science.\nAcademic publishing faces a perfect storm: peer review is buckling under\nenormous volume, funding models are broken and replete with perverse\nincentives, and the entire system groans with inefficiency.</p>\n<p>The Royal Society is the publisher of the world's oldest continuously published\nscientific journal <a href=\"https://royalsocietypublishing.org/journal/rstb\">Philosophical Transactions</a>\n(since 1665) and has convened this conference for academies worldwide. The\noverall question is: what <em>is</em> a scientific journal in 2025 and beyond?\nWalport traced the economic evolution of publishing: for centuries, readers\npaid through subscriptions (I hadn't realised that the <a href=\"https://royalsociety.org/blog/2015/03/philosophical-transactions-the-early-years/\">early editions of the RS</a>\nused to be sent for free to libraries worldwide until the current commercial\nmodel arrived about 80 years ago).. Now, the pendulum has swung to open access\nthat creates perverse incentives that prioritize volume over quality. He called\nit a \"smoke and mirrors\" era where diamond open access models obscure who\n<em>actually</em> pays for the infrastructure of knowledge dissemination: is it the\npublishers, the governments, the academics, the libraries, or some combination\nof the above?  
The profit margins of the commercial publishers answers that\nquestion for me...</p>\n<p>He then identified the transformative forces that are a forcing function:</p>\n<ul>\n<li>LLMs have <a href=\"https://anil.recoil.org/papers/2025-ai-poison\">entered</a> the publishing ecosystem</li>\n<li>The proliferation of journals has created an attention economy rather than a knowledge economy</li>\n<li><a href=\"https://openreview.net/\">Preprint</a> archives are reshaping how research is shared quickly</li>\n</ul>\n<p>The challenges ahead while dealing with these are maintaining metadata\nintegrity, preserving the scholarly archive into the long term, and ensuring\nsystematic access for meta-analyses that advance human knowledge.</p>\n<h2 id=\"historical-perspectives-350-years-of-evolution\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/rs-future-of-publishing/#historical-perspectives-350-years-of-evolution\"></a>Historical Perspectives: 350 Years of Evolution</h2>\n<p>The opening pair of speakers were unexpected: they brought a historical and\nlinguistic perspective to the problem. I found both of these talks the\nhighlights of the day!  Firstly <a href=\"https://www.st-andrews.ac.uk/history/people/akf\">Professor Aileen\nFyfe</a> drew upon her research\nfrom 350 years of the Royal Society archives. Back in the day, there was no\nreal fixed entity called a \"scientific journal\". Over the centuries, everything\nfrom editorial practices to publication methods over to dissemination means\nhave transformed repeatedly, so we shouldn't view the status quo as set in stone.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-2.webp\" title=\"Professor Aileen Fyfe talks publishing history\"/></p>\n<p>While the early days of science were essentially people writing letters to each\nother, the post-WWII era of journals marked the shift to \"scale\". The tools for\ndistance communication (i.e. 
publishing collected issues) and universities\nswitching from being teaching focused over to today's research-centric\npublishing ecosystem were both key factors. University scientists used to\nproduce 30% of published articles in 1900; by 2020, that figure exceeded 80%.\nThis parallels the globalization of science itself in the past century;\nresearch has expanded well beyond its European origins to encompass almost all\ninstitutions and countries worldwide.</p>\n<p>Amusingly, Prof Fyfe pointed out that a 1960 Nature editorial asked <em>\"<a href=\"https://www.nature.com/articles/186018a0\">How many more new\njournals?</a>\"</em> even back then! The 1950s\ndid bring some standardization efforts (nomenclature, units, symbols) also\nthough citation formats robustly seem to resist uniformity. English was also\nexplicitly selected as the \"<a href=\"https://en.wikipedia.org/wiki/Languages_of_science\">default language for\nscience</a>, and peer review\nwas also formalised via papers like <em>\"<a href=\"https://journals.sagepub.com/doi/10.1177/000456327901600179\">Uniform requirements for manuscripts submitted to biomedical journals</a>\"</em> (in 1979). <a href=\"https://nsf-gov-resources.nsf.gov/pubs/1977/nsb77468/nsb77468.pdf\">US Congressional hearings</a>\nwith the NSF began distinguishing peer review from other evaluation methods.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-3.webp\" title=\"Professor Aileen Fyfe shows the globalisation of research over the years\"/></p>\n<p>All of this scale was then \"solved\" by financialisation after WWII. At the turn of the\n20th century, almost no journals generated any profit (the Royal Society\ndistributed its publications freely). By 1955, financial pressures and growing scale of submissions forced a\n<a href=\"https://journals.sagepub.com/doi/10.1177/0073275321999901\">reckoning</a>, leading\nto more self-supporting models by the 1960s. 
An era of mergers and acquisitions\namong journals followed, reshaping the <a href=\"https://serials.uksg.org/articles/259/files/submission/proof/259-1-259-1-10-20150210.pdf\">scientific information system</a>.</p>\n<p><a href=\"https://www.universiteitleiden.nl/en/staffmembers/vincent-lariviere#tab-1\">Professor Vincent Larivi\u00e8re</a> then took the stage to dispel some myths of English monolingualism in scientific publishing. While <a href=\"https://garfield.library.upenn.edu/essays/V1p019y1962-73.pdf\">English offers some practical benefits</a>, the reality at non-Anglophone institutions (like his own Universit\u00e9 de Montr\u00e9al) reveals that researchers spend significantly more time reading, writing, and processing papers as non-native language speakers, and often face higher rejection rates as a result of this.\nThis wasn't always the case though; Einstein published primarily in German, not English!</p>\n<p>He went on to note that today's landscape for paper language choices is more\ndiverse than is commonly assumed. English represents only 67% of publications,\na figure which itself has been inflated by non-English papers that are commonly\npublished with English abstracts. Initiatives like the <a href=\"https://pkp.sfu.ca/2025/03/05/ojs-workshops-indonesia/\">Public Knowledge\nProject</a> have enabled\ngrowth in Indonesia and Latin America, for example.  Chinese journals now\npublish twice the volume of English-language publishers, but are difficult to\nindex, which makes Larivi\u00e8re's numbers even more interesting: a growing majority\nof the world is no longer publishing in English! 
I also heard this on my trip\nin 2023 to China with the Royal Society; the scholars we met had a sequence of\nChinese language journals they submitted to, often before \"translating\" the\noutputs to English journals.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-4.webp\" title=\"Professor Lariviere uses OpenAlex to show non-English linguistic breakdowns\"/></p>\n<p>All this suggests that the major publishers' market share is smaller than commonly believed, which gives us reason to hope for change! Open access adoption worldwide currently varies fairly dramatically by per-capita <a href=\"https://ourworldindata.org/grapher/scientific-publications-per-million\">wealth and geography</a>, but reveals substantive greenspace for publishing beyond the major commercial publishers. Crucially, Larivi\u00e8re argued that research \"prestige\" is a socially constructed phenomenon, and not intrinsic to quality.</p>\n<p>In the Q&amp;A, Magdalena Skipper (Nature's Editor-in-Chief) noted that the private sector is reentering academic publishing (especially <a href=\"https://www.science.org/content/article/china-tops-world-artificial-intelligence-publications-database-analysis-reveals\">in AI topics</a>). Fyfe noted the challenge of tracking private sector activities; e.g. varying corporate policies on patenting and disclosure mean they are hard to index. 
A plug from <a href=\"https://coherentdigital.net/\">Coherent Digital</a> noted they have catalogued 20 million reports from non-academic research; this is an exciting direction (we've got <a href=\"https://anil.recoil.org/ideas/grey-lit-crawl\">30TB of grey literature</a> on our servers, still waiting to be categorised).</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-5.webp\" title=\"Professor Lariviere shows how uneven citations are across languages and geographies\"/></p>\n<h2 id=\"what-researchers-actually-need-from-stem-publishing\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/rs-future-of-publishing/#what-researchers-actually-need-from-stem-publishing\"></a>What researchers actually need from STEM publishing</h2>\n<p>Our very own <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> opened with a sobering demonstration of \"AI\npoisoning\" in the literature, referencing <a href=\"https://anil.recoil.org/static/papers/2025-ai-poison.pdf\">our recent Nature\ncomment</a>. 
He did the risky-but-catchy\ngeneration of a plausible-sounding but entirely fabricated conservation study\nusing an LLM and noted how economically motivated rational actors might quite\nreasonably use these tools to advance their agendas via the scientific record.\nAnd recovering from this will be very difficult indeed once it is mixed in with\nreal science.</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-6.webp\" title=\"Bill talks about our recent AI poisoning piece\"/></p>\n<p>Bill then outlined our <a href=\"https://anil.recoil.org/projects/ce\">emerging approach to subject-wide synthesis</a> via:</p>\n<ul>\n<li><strong>Systematic reviews</strong>: Slow, steady, comprehensive</li>\n<li><strong>Rapid reviews</strong>: Sprint-based approaches for urgent needs</li>\n<li><strong>Subject-wide evidence synthesis</strong>: Focused sectoral analyses</li>\n<li><strong>Ultrafast bespoke reviews</strong>: AI-accelerated with human-in-the-loop</li>\n</ul>\n<p>Going back to what journals are <em>for</em> in 2025, Bill then discussed how they were\noriginally vehicles for exchanging information through letters, but now serve\nprimarily as stamps of authority and quality assurance. In an \"AI slop world,\"\nthis quality assurance function becomes existentially important, but shouldn't\nnecessarily be implemented in the current system of incentives. So then, how do\nwe maintain trust when the vast majority of submissions may soon be\nAI-generated? 
<em>(Bill and I scribbled down a plan on the back of a napkin for\nthis; more on that soon!)</em></p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-7.webp\" title=\"Bill also does a cheeky advert for his Conservation Concepts channel!\"/></p>\n<h3 id=\"early-career-researcher-perspectives\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/rs-future-of-publishing/#early-career-researcher-perspectives\"></a>Early Career Researcher perspectives</h3>\n<p><a href=\"https://www.york.ac.uk/psychology/staff/postdocs/meekings,-sophie/\">Dr. Sophie Meekings</a> then took the stage to discuss the many barriers facing early career researchers (ECRs). They're on short-term contracts, are dependent on other people's grant funding, and yet are the ones conducting the frontline research that drives scientific progress. And this is <em>after</em> years spent on poorly paid PhD stipends!</p>\n<p>ECRs require:</p>\n<ul>\n<li>clear, accessible guidelines spelling out each publishing stage without requiring implicit knowledge of the \"system\"</li>\n<li>constructive, blinded peer review that educates rather than gatekeeps</li>\n<li>consistent authorship conventions like <a href=\"https://www.elsevier.com/researcher/author/policies-and-guidelines/credit-author-statement\">CRediT</a> (Contributor Roles Taxonomy)</li>\n</ul>\n<p>Dr. Meekings then noted how the precarious nature of most ECR positions creates cascading complications for individuals. When job-hopping between short-term contracts, who funds the publication of work from previous positions? How do ECRs balance completing past research with new employers' priorities? 
<a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a> also had this issue when joining my group a few years ago, as it took a significant portion of her time in the first year to finish up her previous publication from her last research contract.</p>\n<p>If we're going to fix the system itself, then ECRs need better incentives for PIs to publish null results and exploratory work, the councils need to improve support for interdisciplinary research that doesn't fit traditional journal boundaries (as these as frontiers between \"conventional\" science where many ECRs will work), and recognition that ECRs often lack the networks for navigating journal politics where editors rule supreme.</p>\n<p>Dr. Meekings summarized ECR needs with an excellent new acronym (SCARF) that drew a round of applause!</p>\n<ul>\n<li><strong>S</strong>peed in publication processes</li>\n<li><strong>C</strong>larity in requirements and decisions</li>\n<li><strong>A</strong>ffordability of publication fees</li>\n<li><strong>R</strong>ecognition of contributions</li>\n<li><strong>F</strong>airness in review and credit</li>\n</ul>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-8.webp\" title=\"Dr Sophie Meekings' SCARF principles for ECRs\"/></p>\n<p>The audience Q&amp;A was quite robust at this point. The first question was about how might we extend the evidence synthesis approach widely?\n<a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> noted that we are currently extending this to education working with <a href=\"https://www.educ.cam.ac.uk/people/staff/gibson/\">Jenny Gibson</a>. Interconnected datasets <em>across</em> subjects are an obvious future path for evidence datasets, with common technology for handling (e.g.) retracted datasets that can be applied consistently. 
<a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> are supervising <a href=\"https://anil.recoil.org/notes/eeg-interns-2025\">projects on evidence synthesis</a> this summer on just this topic here in Cambridge.</p>\n<p>Another question was why ECRs feel that double-blind review is important. Dr. Meekings noted that reviewers may not take ECR peer reviews as seriously, but this could be fixed by opening up peer review and assigning credit <em>after</em> the process is completed and not during. Interestingly, the panel all liked double-blind, which is the norm in computer science but not in other science journals. Someone from the BMJ noted there exists a lot of research into blinding; they summarised it as showing that blinding doesn't work on the whole (people know who it is anyway) and open review doesn't cause any of the problems that people think it causes.</p>\n<p>A really interesting comment from Mark Walport was that a grand scale community project could work for the future of evidence collation, but this critically depends on breaking down the current silos since it doesn't work unless everyone makes their literature available. There was much nodding from the audience in support of this line of thinking.</p>\n<h2 id=\"charting-the-future-for-scientific-publishing\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/rs-future-of-publishing/#charting-the-future-for-scientific-publishing\"></a>Charting the future for scientific publishing</h2>\n<p>The next panel brought together folks from across the scientific\npublishing ecosystem, moderated by Clive Cookson of the Financial Times. 
This\nwas a particularly frank and pointed panel, with lots of quite direct messages\nbeing sent between the representatives of libraries, publishers and funders!</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-9.webp\" title=\"Amy Brand from MIT Press opens the panel\"/></p>\n<p>Amy Brand (MIT Press) started by delivering a warning about conflating \"open to\nread\" with \"open to train on\". She pointed out that when MIT Press did a survey\nacross their authors, many of them raised concerns about the reinforcement of\nbias through AI training on scientific literature. While many of the authors\nacknowledged a moral imperative to make science available for LLM training,\nthey also wanted the <em>choice</em> of whether their own work is used for this. She urged\nthe community to pause and ask fundamental questions like \"AI training, at what\ncost?\" and \"to whose benefit?\". I did think she made a good point by drawing\nparallels with the early internet, where Brand pointed out that lack of\nregulation accelerated the decline of non-advertising-driven models. Her\nclosing question asked: if search engines merely lead to AI-generated summaries,\nwhy serve the original content at all? This is something we discuss in our\n<a href=\"https://anil.recoil.org/papers/2025-internet-ecology\">upcoming Aarhus paper on an Internet ecology</a>.</p>\n<p><a href=\"https://experts.deakin.edu.au/66981-danny-kingsley\">Danny Kingsley</a> from Deakin University Library then delivered a biting perspective as a representative of libraries. She said that libraries are \"the ones that sign the cheques that keep the system running\", which the rest of the panel all disagreed with in the subsequent discussion (they all claimed to be responsible, from the government to the foundations).  
Her survey of librarians was interesting; they all asked for:</p>\n<ul>\n<li>Transparent peer review processes</li>\n<li>Unified expectations around AI declarations and disclosures</li>\n<li>Licensing as open as possible, resisting the \"salami slicing\" of specific use. We also ran across this problem of overly precise restrictions on use while <a href=\"https://anil.recoil.org/papers/2025-ai-poison\">building our paper corpus</a> for <a href=\"https://anil.recoil.org/projects/ce\">CE</a>.</li>\n</ul>\n<p>Kingsley had a great line that \"publishers are monetizing the funding mandate\",\nwhich <a href=\"https://www.stats.ox.ac.uk/~deane/\">Charlotte Deane</a> later also said was the most succinct way she had heard\nto describe the annoyance we all have with the vast profit margins of\ncommercial publishers.  Kingsley highlighted this via the troubling practices\nof the IEEE and the American Chemical Society, which charge for depositing in repositories\nunder green open access. Her blunt assessment was that publishers are not\nnegotiating in good faith. Her talk drew the biggest applause of the day by\nfar.</p>\n<p>After this, <a href=\"https://wellcome.org/about-us/our-people/staff/john-arne-rottingen\">John-Arne\nR\u00f8ttingen</a>\n(CEO of the Wellcome Trust) emphasised that funders depend on scientific\ndiscourse as a continuous process of refutations and discussions. He expressed\nconcern about overly depending on brand value as a proxy for quality, calling\nit eventually misleading even if it works sometimes in the short term. Key\npriorities for the WT are ensuring that reviewers have easy access to all\nliterature, supporting evidence synthesis initiatives to translate research\ninto impact, and controlling the open body of research outputs through digital\ninfrastructure to manage the new scale.  
However, his challenge lies in\nmaintaining sustainable financing models for all this research data; he noted\nexplicitly that the Wellcome would not cover open access costs for commercial\npublishers.</p>\n<p>R\u00f8ttingen further highlighted the concerns of the Global Biodata Coalition (of which he was a\nmember) about US data resilience and framed research infrastructure\nas \"a global public good\" requiring collective investment and fair financing\nacross nations. Interestingly, he explicitly called out UNESCO as a weak force\nin global governance for this from the UN; I hadn't even realised that UNESCO\nwas responsible for this stuff!</p>\n<p>Finally, <a href=\"https://www.stats.ox.ac.uk/~deane/\">Prof Charlotte Deane</a> from the EPSRC also discussed what a scientific\njournal is for these days. It's not for proofreading or typesetting anymore and\n(as <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> also noted earlier), the stamp of quality is key. Deane\nargued that \"research completion\" doesn't happen until someone else can read it\nand reasonably verify the methods are sound; not something that can happen\nwithout more open access.  Deane also warned of the existential threat of <a href=\"https://anil.recoil.org/notes/ai-poisoning\">AI poisoning</a> since \"AI can make fake papers at a rate humans can't\nimagine. It won't be long before most of the content on the Internet will be AI\ngenerated\".</p>\n<p>The audience Q&amp;A was <em>very</em> blunt here.  <a href=\"https://uniweb.uottawa.ca/view/profile/members/2846\">Stefanie Haustein</a> pointed out that we\nare pumping billions of dollars into the publishing industry, where many publishers\nare shareholder companies, and so we are losing a significant percentage of\neach dollar spent. 
There is enough money in the system, but it's very\ninefficiently deployed right now!</p>\n<p><a href=\"https://www.linkedin.com/in/richardsever\">Richard Sever</a> from openRxiv asked\nhow we pay for this when major funders like the NIH have issued a series of\n<em>unfunded</em> open data mandates over recent years. John-Arne R\u00f8ttingen noted that\nUNESCO is a very weak global body and not influential here, but that we need\ncoalitions of the willing to build such open data approaches from the bottom\nup. Challenging the publisher hegemony can only be done as a pack, which led\nnicely onto the next session after lunch where the founder of\n<a href=\"https://openalex.org/\">OpenAlex</a> would be present!</p>\n<h2 id=\"who-are-the-stewards-of-knowledge-\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/rs-future-of-publishing/#who-are-the-stewards-of-knowledge-\"></a>Who are the stewards of knowledge?</h2>\n<p>After lunch (where sadly, the vegetarian options were terrible but\nluckily I had my trusty Huel bar!), we reconvened with a panel debating\nwho the stewards of the scientific record should be. This brought together\nperspectives from commercial publishers (Elsevier), open infrastructure advocates (OpenAlex),\nfunders (MRC), and university leadership (pro-VC of Birmingham).</p>\n<p><a href=\"https://www.elsevier.com/people/victoria-eva\">Victoria Eva</a> (<a href=\"https://researcheracademy.elsevier.com/publication-process/open-science/open-access-end-user-licenses\">SVP from\nElsevier</a>)\nopened by describing the \"perfect storm\" facing their academic publishing\nbusiness as they had 600k more submissions this year than the previous year.\nThere was a high-level view on how their digital pipeline \"aims to insert\nsafeguards\" throughout the publication process to maintain integrity. 
She\nargued in general terms for viewing GenAI through separate lenses of trust and\ndiscoverability and claimed that Elsevier's substantial technological investments\nposition them to manage both challenges well. I was\n<a href=\"https://www.theguardian.com/science/2017/jun/27/profitable-business-scientific-publishing-bad-for-science\">predisposed</a>\nto dislike excuses from staggeringly profitable commercial publishers, and I\ndid find her answers on providing bulk access to their corpus unsatisfying.\nWhile she highlighted their growing open access base of papers, she also noted\nthat the transition to open access cannot happen overnight (my personal\ntranslation is that this means slow-walking). She mentioned special cases in\nplace for\n<a href=\"https://www.elsevier.com/en-gb/about/open-science/research-data/text-and-data-mining\">TDM</a>\nin the Global South and healthcare access (presumably at the commercial\ndiscretion of Elsevier).</p>\n<p><a href=\"https://jasonpriem.org/\">Jason Priem</a> from <a href=\"https://openalex.org/\">OpenAlex</a>\n(part of <a href=\"https://ourresearch.org/\">OurResearch</a>) then offered a radically\ndifferent perspective. I'm a huge fan of OpenAlex, as we use it extensively in\nthe <a href=\"https://anil.recoil.org/projects/ce\">CE</a> infrastructure. He disagreed with the conference framing of\npublishers as \"custodians\" or \"stewards,\" noting that these evoke someone\nmaintaining a static, lovely old house. Science <em>isn't</em> a static edifice but a\ngrowing ecosystem, with more scientists alive today than at any point in\nhistory. He instead proposed a \"gardener\" as a better metaphor; the science\necosystem needs to nourish growth rather than merely preserving what exists.\nExtending the metaphor, Priem contrasted French and English garden styles:\nFrench gardens constrain nature into platonic geometric forms, while English\ngardens embrace a more rambling style that better represents nature's inherent\ndiversity. 
He argued that science needs to adopt the \"English garden\" approach\nand that we don't have an information overload problem but rather \"<a href=\"https://www.cnet.com/culture/shirky-problem-is-filter-failure-not-info-overload/\">bad\nfilters</a>\"\n(to quote Clay Shirky).</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-11.webp\" title=\"Jason Priem (OpenAlex), Victoria Eva (Elsevier) and Mark Walport in the panel\"/></p>\n<p>Priem advocated <em>strongly</em> for open infrastructures since communities don't just produce papers but also software, datasets, abstracts, and things we don't envision yet. If we provide them with the \"digital soil\" (open infrastructure) then they will prosper. OpenAlex and <a href=\"https://zenodo.org/\">Zenodo</a> are great examples of how such open infrastructure holds up here. I use both all the time; I'm a huge fan of Jason's work and talk.</p>\n<p><a href=\"https://www.ukri.org/people/patrick-chinnery/\">Patrick Chinnery</a> from the Medical Research Council brought the funder perspective with some numbers: publishing consumes 1 to 2% of total research turnover funds (roughly \u00a324 million for UKRI). He noted that during the pandemic, decision-makers were reviewing preprint data in real time to determine which treatments should proceed to clinical trials, and that decisions had to be reversed after peer review revealed flaws. He emphasised the need for more real-time quality assurance in rapid decision-making contexts.</p>\n<p><a href=\"https://en.wikipedia.org/wiki/Adam_Tickell\">Adam Tickell</a> from the University of Birmingham declared the current model \"broken\", and noted that each attempt at reform fails to solve the <em>basic problem of literature access</em> (something I've faced myself). He noted that David Willetts (former UK Minister for Science) couldn't access paywalled material while minister of science in government (!) 
which significantly influenced <a href=\"https://www.gov.uk/government/news/government-to-open-up-publicly-funded-research\">subsequent government policy</a> towards open access.\nTickell was scathing about the oligopolies of Elsevier and Springer, arguing their <a href=\"https://www.researchprofessionalnews.com/rr-news-world-2025-2-elsevier-parent-company-reports-10-rise-in-profit-to-3-2bn/\">profit margins</a> are out of proportion with the public funding for science. He noted that early open access attempts from the <a href=\"https://ioppublishing.org/news/spotlight-on-the-finch-report/\">Finch Report</a> were well-intentioned but ultimately insufficient to break the hegemony. Perhaps an opportunity for a future UK <a href=\"https://anil.recoil.org/notes/uk-national-data-lib\">National Data Library</a>...\nTickell closed his talk with an observation about the current crisis of confidence in science. This did make me think of a <a href=\"https://bsky.app/profile/hetanshah.bsky.social/post/3lttyexntps2y\">recent report on British confidence in science</a>, which shows the British public still retains belief in scientific institutions. 
So at least we're doing better than the US in this regard for now!</p>\n<p><a href=\"https://www.linkedin.com/feed/update/urn:li:activity:7350547427319275520?commentUrn=urn%3Ali%3Acomment%3A%28activity%3A7350547427319275520%2C7350886618490130433%29&amp;replyUrn=urn%3Ali%3Acomment%3A%28activity%3A7350547427319275520%2C7350908587134644225%29&amp;dashCommentUrn=urn%3Ali%3Afsd_comment%3A%287350886618490130433%2Curn%3Ali%3Aactivity%3A7350547427319275520%29&amp;dashReplyUrn=urn%3Ali%3Afsd_comment%3A%287350908587134644225%2Curn%3Ali%3Aactivity%3A7350547427319275520%29\"> <img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-ss-1.webp\" title=\"Stefanie Haustein points out ChatGPT-related content in response to Elsevier's comments on stage.\"/> </a></p>\n<p>The Q&amp;A session opened with Mark Walport asking how Elsevier manages to publish so many articles. Victoria Eva from Elsevier responded that they receive 3.5m articles annually with ~750k published. Eva mentioned something about \"digital screening throughout the publication process\" but acknowledged that this was a challenge due to the surge from paper mills. A suggestion of paying peer reviewers was raised from the audience but not substantively addressed. <a href=\"https://www.scholcommlab.ca/stefanie-haustein/\">Stefanie Haustein</a> once again made a great point from the audience about how Elsevier could let through <a href=\"https://www.vice.com/en/article/scientific-journal-frontiers-publishes-ai-generated-rat-with-gigantic-penis-in-worrying-incident/\">AI-generated rats with giant penises</a> with all this protection in place; clearly, some papers have been published by them with no humans ever reading them. 
This generated a laugh from the audience, and an acknowledgement from the Elsevier rep that they needed to invest more and improve.</p>\n<h2 id=\"how-to-make-open-infrastructure-sustainable\"><a aria-hidden=\"true\" class=\"anchor\" href=\"https://anil.recoil.org/notes/rs-future-of-publishing/#how-to-make-open-infrastructure-sustainable\"></a>How to make open infrastructure sustainable</h2>\n<p>My laptop power ran out at this point, but the next panel was an absolute treat as it had both <a href=\"https://kaythaney.com/\">Kaitlin Thaney</a> and <a href=\"https://en.wikipedia.org/wiki/Jimmy_Wales\">Jimmy Wales</a> of Wikipedia fame on it!</p>\n<p><img alt=\"%c\" src=\"https://anil.recoil.org/images/rspub-12.webp\" title=\"Hylke Koers, Kaitlin Thaney, Jimmy Wales and Ian Mulvany\"/></p>\n<p>Jimmy Wales noted that a key one of his \"seven rules of trust\" is to be personal with human-to-human contact and not run too quickly to technological solutions. Rather than, for example, asking what percentage of academic papers showed evidence of language from ChatGPT, it's more fruitful to ask whether the science contained within the paper is good instead of how it's written. There are many reasons why someone might have used ChatGPT (non-native speakers etc) but also many unrelated reasons why the science might be bad.</p>\n<p>Kaitlin Thaney pointed out the importance of openness given <a href=\"https://www.motherjones.com/politics/2025/07/trump-war-assault-national-science-foundation-american-innovation-greatness-education/\">the US assault on\nscience</a>;\nopenness means that the open data repositories can be replicated reasonably well.</p>\n<p>Ian Mulvany pointed out that Nature claims to have invested $240m in research\ninfrastructure, and that this level of investment is a struggle for a medium-sized publisher (like his\nown <a href=\"https://www.bmj.com/\">BMJ</a>). 
Open infrastructure allows sharing and\ncreation of value, making it possible for these smaller organisations\nto survive.</p>\n<p>When it comes to policy recommendations, what did the panel have to say about a more trustworthy literature?</p>\n<ul>\n<li>The <a href=\"https://www.ccsd.cnrs.fr/en/posi-principles/\">POSI principles</a> came up as important levels.</li>\n<li>Kaitlin mentioned the <a href=\"https://www.nextgenlibpub.org/forest-framework\">FOREST framework</a> funded by Arcadia and how they need to manifest in concrete infrastructure. There's an implicit reliance on infrastructure that you only notice when it's taken away! Affordability of open is a key consideration as well.</li>\n<li>Jimmy talked about open source software, where what generally works is not one-size-fits-all. Some projects are run by companies (it is their main product and they sell services), and others by individuals. If we bring this back to policy, we need to look at preserving what's already working sustainably and supporting it. Don't try to find a general solution but adopt targeted, well-thought-through interventions instead.</li>\n</ul>\n<p><em>I'm updating this as I go along but running out of laptop battery too!</em></p><h1>References</h1><ul><li>Madhavapeddy et al (2025). Steps towards an Ecology for the Internet. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3744169.3744180\" target=\"_blank\"><i>10.1145/3744169.3744180</i></a></li>\n<li>Madhavapeddy (2025). Thoughts on the National Data Library and private research data. <a href=\"https://doi.org/10.59350/fk6vy-5q841\" target=\"_blank\"><i>10.59350/fk6vy-5q841</i></a></li>\n<li>Reynolds et al (2025). Will AI speed up literature reviews or derail them entirely? Nature Publishing Group. <a href=\"https://doi.org/10.1038/d41586-025-02069-w\" target=\"_blank\"><i>10.1038/d41586-025-02069-w</i></a></li>\n<li>Madhavapeddy (2025). Is AI poisoning the scientific literature? Our comment in Nature. 
<a href=\"https://doi.org/10.59350/pbxew-d2j78\" target=\"_blank\"><i>10.59350/pbxew-d2j78</i></a></li>\n<li>Madhavapeddy (2025). EEG internships for the summer of 2025. <a href=\"https://doi.org/10.59350/tf22g-p1822\" target=\"_blank\"><i>10.59350/tf22g-p1822</i></a></li>\n<li>Richter (1960). How Many More New Journals?. Nature. <a href=\"https://doi.org/10.1038/186018a0\" target=\"_blank\"><i>10.1038/186018a0</i></a></li>\n<li>Editors (1979). Uniform Requirements for Manuscripts Submitted to Biomedical Journals. Annals of Clinical Biochemistry. <a href=\"https://doi.org/10.1177/000456327901600179\" target=\"_blank\"><i>10.1177/000456327901600179</i></a></li>\n<li>Fyfe (2022). Self-help for learned journals: Scientific societies and the commerce of publishing in the 1950s. History of Science. <a href=\"https://doi.org/10.1177/0073275321999901\" target=\"_blank\"><i>10.1177/0073275321999901</i></a></li></ul>","doi":"https://doi.org/10.59350/nmcab-py710","guid":"https://doi.org/10.59350/nmcab-py710","language":"en","license":"https://creativecommons.org/licenses/by/4.0/legalcode","published_at":1752451200,"reference":[{"id":"https://doi.org/10.1145/3744169.3744180","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.59350/fk6vy-5q841","unstructured":"<b>[cito:citesAsRelated]</b>"},{"id":"https://doi.org/10.1038/d41586-025-02069-w","unstructured":"<b>[cito:citesAsSourceDocument]</b>"},{"id":"https://doi.org/10.59350/pbxew-d2j78","unstructured":"<b>[cito:citesAsRelated]</b>"},{"id":"https://doi.org/10.59350/tf22g-p1822","unstructured":"<b>[cito:citesAsRelated]</b>"},{"id":"https://doi.org/10.1038/186018a0","unstructured":"<b>[cito:cites]</b>"},{"id":"https://doi.org/10.1177/000456327901600179","unstructured":"<b>[cito:cites]</b>"},{"id":"https://doi.org/10.1177/0073275321999901","unstructured":"<b>[cito:cites]</b>"}],"rid":"7p1xb-30w84","summary":"I was a bit sleepy getting into the Royal Society Future of Scientific Publishing conference 
early this morning, but was quickly woken up by the dramatic passion on show as publishers, librarians, academics and funders all got together for a \"frank exchange of views\" at a meeting that didn't pull any punches! These are my hot-off-the-press livenotes and only lightly edited; a more cleaned up version will be available from the RS in due course.","tags":["Royalsociety","Evidence","Publishing","Ai","Livenotes"],"title":"Royal Society's Future of Scientific Publishing meeting","updated_at":1781259294,"url":"https://anil.recoil.org/notes/rs-future-of-publishing","version":"v1"}],"out_of":50496,"page":1,"per_page":10,"total-results":50496}
