Automated Emissions Calculation System

MSc dissertation (Distinction). Python app that uses LLMs to extract and classify emissions data from unstructured documents, then calculates the emissions deterministically.

year: 2025
role: Solo, MSc dissertation
stack: Python · LLMs · Document Parsing
status: Shipped

What it is

My MSc dissertation at Nottingham Trent University, graded Distinction. It is a Python system that takes unstructured documents (invoices, supplier reports, utility statements) and returns a structured emissions breakdown.

What the LLMs actually do

  • Extract: pull the fields that matter (quantity, unit, supplier, activity type) out of messy PDFs and scans
  • Classify: map free-text line items onto emission-factor categories
  • Calculate: feed structured output into deterministic emissions math — the LLM doesn’t do the arithmetic

The split matters. LLMs are the right tool for “read a weird document and normalise it”; they are the wrong tool for “multiply two numbers.” Keeping those jobs separate is most of what made the system reliable.
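
A minimal sketch of that split, for illustration only. It assumes a hypothetical `LineItem` record standing in for the LLM's structured output and a made-up `EMISSION_FACTORS` table; the names and factor values are placeholders, not the dissertation's actual schema or data.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical structured record: what the extract and classify steps
# might hand over once the LLM has read the document.
@dataclass(frozen=True)
class LineItem:
    quantity: Decimal
    unit: str
    supplier: str
    category: str  # emission-factor category chosen by the classify step

# Placeholder factors in kg CO2e per unit. Illustrative values only,
# not real published emission factors.
EMISSION_FACTORS = {
    ("electricity", "kWh"): Decimal("0.2"),
    ("natural_gas", "kWh"): Decimal("0.18"),
}

def calculate_emissions(item: LineItem) -> Decimal:
    """Deterministic step: plain lookup and multiplication, no LLM involved."""
    key = (item.category, item.unit)
    if key not in EMISSION_FACTORS:
        # Fail loudly rather than guess a factor for an unknown category/unit.
        raise KeyError(f"no emission factor for {key}")
    return item.quantity * EMISSION_FACTORS[key]

item = LineItem(Decimal("1500"), "kWh", "Example Energy Ltd", "electricity")
print(calculate_emissions(item))  # 300.0
```

Using `Decimal` rather than `float` keeps the arithmetic exact and reproducible, which fits the point that the calculation should be boring and auditable.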

Why it’s interesting

Emissions calculation is usually a manual, spreadsheet-driven process, and the bottleneck is document variety, not the math. Putting an LLM at the extraction layer turns a multi-hour manual job into a pipeline step, and it gives you citations back to the source document, which auditors need.
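
One way the citation idea could look in code, as a sketch under stated assumptions: each extracted field carries a hypothetical `Citation` (document, page, verbatim snippet), and a deterministic check confirms the snippet really appears on the cited page. The class and function names here are invented for illustration and are not taken from the dissertation.

```python
from dataclasses import dataclass

# Hypothetical provenance record attached to every extracted value.
@dataclass(frozen=True)
class Citation:
    document: str
    page: int     # 1-based page number
    snippet: str  # verbatim text the value was read from

@dataclass(frozen=True)
class ExtractedField:
    name: str
    value: str
    citation: Citation

def _normalise(text: str) -> str:
    # Collapse whitespace so line wrapping in the page text does not matter.
    return " ".join(text.split())

def citation_is_grounded(field: ExtractedField, pages: list[str]) -> bool:
    """Return True only if the quoted snippet occurs on the cited page."""
    index = field.citation.page - 1
    if not 0 <= index < len(pages):
        return False
    return _normalise(field.citation.snippet) in _normalise(pages[index])

# Assumed input: one string of already-extracted text per page.
pages = ["Invoice 0042\nElectricity supplied:  1,500 kWh", "Terms and conditions"]
field = ExtractedField(
    "quantity",
    "1500",
    Citation("invoice.pdf", 1, "Electricity supplied: 1,500 kWh"),
)
print(citation_is_grounded(field, pages))  # True
```

A check like this does not prove the value is right, but it does catch a citation that points at text which is not in the document.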