A model is only as good as its numbers.
Decarbonisation has quietly turned a simple question – how much does this voyage emit, and what does it cost? – into a standing research problem. We encountered this while building the emissions model behind EcoRouter, a weather-routing system for shipping.
The surprise was that routing itself was never the hard part. The challenge was keeping the software aligned with a body of external knowledge that is large, scattered, and constantly changing.
The routing algorithm barely changed. The assumptions beneath it changed continuously.

Why this is a hard problem
Working out the climate cost of a voyage sounds simple – burn this much fuel, apply emissions factors, add the charges. In practice, the calculation sits at the intersection of physics, chemistry, economics, geography and law, and depends on a surprisingly large number of changing elements.
None of it is permanent. A fuel type is not one thing: methanol from fossil gas and methanol from renewable electricity are chemically alike but worlds apart in lifecycle emissions and cost. So the production pathway has to travel with the fuel. Gases beyond CO₂ – methane, nitrous oxide – are folded into a single figure using global-warming-potential values that are themselves regulatory settings, and those are revised. A voyage crosses many overlapping regulatory zones, each with its own rules and its own arithmetic (Figure 2).
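As a rough illustration of the arithmetic described above, the sketch below folds three gases into one CO₂-equivalent figure and then applies a charge. It is not EcoRouter's code. The class and function names, the emission factors, the price and the `covered_share` parameter are all hypothetical. The two GWP sets are the 100-year values commonly cited from the IPCC's Fourth and Fifth Assessment Reports; which set a given regulation requires is not stated here and would need to be checked against that regulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmissionFactors:
    """Tonnes of each gas emitted per tonne of fuel, for one fuel AND one pathway."""
    fuel: str
    pathway: str  # the production pathway travels with the fuel
    co2: float
    ch4: float
    n2o: float


@dataclass(frozen=True)
class GwpSet:
    """Global-warming potentials relative to CO2, tied to a named edition."""
    edition: str
    ch4: float
    n2o: float


def co2e_tonnes(fuel_tonnes: float, ef: EmissionFactors, gwp: GwpSet) -> float:
    """Fold CO2, CH4 and N2O into a single CO2-equivalent mass."""
    per_tonne_fuel = ef.co2 + ef.ch4 * gwp.ch4 + ef.n2o * gwp.n2o
    return fuel_tonnes * per_tonne_fuel


def carbon_charge(co2e: float, price_per_tonne: float, covered_share: float) -> float:
    """Charge for the share of emissions that a (hypothetical) zone covers."""
    return co2e * price_per_tonne * covered_share


# Placeholder factors for illustration only; not real fuel data.
EXAMPLE_FACTORS = EmissionFactors(
    fuel="example-fuel", pathway="example-pathway",
    co2=3.0, ch4=0.001, n2o=0.0002,
)

# Commonly cited 100-year GWP values from two IPCC assessment reports.
GWP_AR4 = GwpSet(edition="IPCC AR4, 100-year", ch4=25.0, n2o=298.0)
GWP_AR5 = GwpSet(edition="IPCC AR5, 100-year", ch4=28.0, n2o=265.0)

if __name__ == "__main__":
    for gwp in (GWP_AR4, GWP_AR5):
        total = co2e_tonnes(100.0, EXAMPLE_FACTORS, gwp)
        print(f"{gwp.edition}: {total:.2f} t CO2e")
```

With these placeholder factors, the same 100 tonnes of fuel yields totals about 0.36 tonnes of CO₂-equivalent apart depending on the GWP edition. Nothing in the physics changed, only the setting, and that kind of difference looks reasonable either way.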

The deeper obstacle is not just that the data changes, but that the sources carry different kinds of authority. A market feed is not a regulation. One regulation may amend another. A value can be perfectly accurate yet drawn from a superseded edition, or correct under one instrument and wrong under another. These are the hardest errors to catch, because the numbers still look reasonable.
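One way to make that distinction explicit is to store every value together with its instrument, its authority and its validity window, and to look values up by date and instrument rather than by name alone. The sketch below assumes a simple three-level authority ranking and invented field names; the instruments, dates and numbers in the usage example are made up for illustration.

```python
from dataclasses import dataclass
from datetime import date
from enum import IntEnum
from typing import Iterable, Optional


class Authority(IntEnum):
    """Hypothetical ranking: a higher number is a more authoritative source."""
    MARKET_FEED = 1
    GUIDANCE = 2
    REGULATION = 3


@dataclass(frozen=True)
class SourcedValue:
    name: str
    value: float
    instrument: str                       # the rule set this value belongs to
    authority: Authority
    effective_from: date
    superseded_on: Optional[date] = None  # None means still in force

    def in_force(self, on: date) -> bool:
        """True if the value applies on the given date."""
        if on < self.effective_from:
            return False
        return self.superseded_on is None or on < self.superseded_on


def select_value(
    candidates: Iterable[SourcedValue], name: str, instrument: str, on: date
) -> Optional[SourcedValue]:
    """Pick the in-force value for one instrument, or None if there is none."""
    matches = [
        c for c in candidates
        if c.name == name and c.instrument == instrument and c.in_force(on)
    ]
    if not matches:
        return None  # report the gap; do not borrow from another instrument
    # Prefer higher authority, then the most recent effective date.
    return max(matches, key=lambda c: (c.authority, c.effective_from))


# Made-up records: the same setting under one instrument, amended once.
RECORDS = [
    SourcedValue("gwp_ch4", 25.0, "instrument-A", Authority.REGULATION,
                 date(2015, 1, 1), superseded_on=date(2024, 1, 1)),
    SourcedValue("gwp_ch4", 28.0, "instrument-A", Authority.REGULATION,
                 date(2024, 1, 1)),
]

if __name__ == "__main__":
    old = select_value(RECORDS, "gwp_ch4", "instrument-A", date(2023, 6, 1))
    new = select_value(RECORDS, "gwp_ch4", "instrument-A", date(2025, 1, 1))
    print(old.value, new.value)
```

Both values are "accurate" in isolation. Only the date and the instrument decide which one is right for a given voyage, and a lookup under an instrument with no record returns `None` rather than a plausible-looking number.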
When we audited our own data, we found a stale carbon price, a missing NOₓ control area that had quietly left Baltic and North Sea traffic unflagged, and an out-of-date emissions basis. None of these errors was in the routing algorithm; all of them were in the assumptions around it.

Why it matters
For shipping, this is not academic. Carbon now carries a price and a compliance cost, and the gap between a plausible answer and a correct one shows up directly in fuel bills, penalties, and reported emissions.
EcoRouter exists to help operators choose routes that are genuinely cheaper, safer and more carbon-efficient. But that promise is only as good as the inputs underneath it. Keep the inputs honest and the advice can be trusted; let them drift and every number downstream drifts with them.

The solution: An LLM assistant
This is where a large-language-model (LLM) agent earns its place. No person can monitor sixteen settings, ten fuel pathways, eighteen zones and twenty-two sources – each changing on its own schedule – and catch the moment one of them quietly goes out of date or a new one appears. The volume alone defeats manual control.
So, we built an agent, connected through the Model Context Protocol (MCP) to the data behind EcoRouter, and gave it a narrow, careful job. Not to run the model, but to help keep the inputs real.
The point is not that the agent can change values. It is that it checks its context first, reading the current value, where it came from, when it was last changed, and which sources stand behind it. It then follows standing instructions a careful analyst would recognise: prefer official, in-force sources; treat a stale citation as a defect; never invent a missing number; record uncertainty rather than hide it.
Importantly, the agent can suggest a change but never make one (Figure 3). Whatever the change, the agent reports what should change, attaches the evidence, and states how confident it is; a human reviews and approves. Approval is the only route into the live model. Keeping a human in the loop is a deliberate choice made for transparency, not a sign that the agent cannot be trusted.
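The propose-then-approve pattern can be sketched in a few lines. This is a minimal illustration under assumed names (`Proposal`, `LiveModel`, `propose`, `approve`), not EcoRouter's implementation and not an MCP API: the agent-facing method can only queue a proposal with evidence and a stated confidence, and the human-facing method is the only code path that writes to the live values.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Proposal:
    key: str
    current_value: Optional[float]  # what the agent read before proposing
    proposed_value: float
    evidence: List[str]             # citations to the sources relied on
    confidence: float               # agent's stated confidence, 0.0 to 1.0


class LiveModel:
    """Hypothetical store of live inputs with a review queue in front of it."""

    def __init__(self, values: Dict[str, float]) -> None:
        self._values = dict(values)
        self.pending: List[Proposal] = []
        self.audit_log: List[Tuple[str, Proposal]] = []

    def get(self, key: str) -> Optional[float]:
        return self._values.get(key)

    def propose(self, proposal: Proposal) -> None:
        """Agent-facing: queue a change. Never touches the live values."""
        if not proposal.evidence:
            raise ValueError("a proposal must cite at least one source")
        if not 0.0 <= proposal.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")
        self.pending.append(proposal)

    def approve(self, index: int, reviewer: str) -> None:
        """Human-facing: the only route by which a value becomes live."""
        if not reviewer:
            raise ValueError("approval requires a named reviewer")
        proposal = self.pending.pop(index)
        self._values[proposal.key] = proposal.proposed_value
        self.audit_log.append((reviewer, proposal))


if __name__ == "__main__":
    # Made-up key, values and citation, for illustration only.
    model = LiveModel({"carbon_price": 50.0})
    model.propose(Proposal("carbon_price", 50.0, 60.0,
                           ["hypothetical official notice"], 0.8))
    print(model.get("carbon_price"))  # still the old value until approved
    model.approve(0, reviewer="analyst")
    print(model.get("carbon_price"))
```

Rejecting a proposal with no evidence at the queue, rather than at review time, mirrors the standing instruction never to invent a missing number: an unsupported change does not even reach the reviewer.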

What we learned
The most useful lesson had little to do with emissions. Most discussion of AI systems centres on access – connecting tools, retrieving data, automating workflows. Access turned out to be the easy part. The hard part was twofold. There is simply too much changing data, from too many independent sources, for any person to monitor. And even with a number in front of you, deciding whether it should currently be treated as real is a matter of judgement.
A regulatory value carries provenance, applicability, effective dates and amendment history, so the work is weighing evidence, not just gathering it.
We set out to build a routing system. What really needed tending was the model of reality beneath it – the fuels, the rules, the boundaries, the prices – and unlike the algorithm, that work never ends.
