How TSK works
Each section is labeled Live (capability complete), Partial (capability works end-to-end with named limitations enumerated below), or Deferred (capability on roadmap, not yet implemented). Five of the six sections below are Partial — read the limitations to understand exactly which surfaces are scoped today.
A methodology document for sustainability practitioners. Each section explains what TSK does today, names the source-of-truth code path, and calls out honest limits.
1. Extraction logic
Live
TSK processes uploaded supplier documents in two passes. The first pass uses regex patterns to extract structured values (energy readings, invoice totals, meter references) from plain text and native-PDF text layers. A pre-validation step rejects documents that contain no parseable text before any extraction begins, so unusable documents are discarded early. When an API key is configured (via OPENAI_API_KEY or TSK_LLM_PROVIDER), a second, LLM-based extraction pass runs in parallel: the provider chain tries Ollama (local, free) first, then OpenAI (cloud), then falls back to regex if both fail. The best result from any provider is selected for the evidence record.
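The fallback behaviour described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the regex, the function names, and the result shape are hypothetical and are not taken from tools/ingestion/llm_extract_v1.py. It also simplifies selection by returning the first usable result, whereas TSK is described as selecting the best result from any provider.

```python
import re
from typing import Callable, Optional, Sequence

# Hypothetical pattern: a number followed by "kWh" (not TSK's real regex).
KWH_PATTERN = re.compile(r"(\d[\d,]*(?:\.\d+)?)\s*kWh", re.IGNORECASE)

Provider = Callable[[str], Optional[dict]]


def regex_extract(text: str) -> Optional[dict]:
    """Regex pass: pull one energy reading out of plain text, if present."""
    match = KWH_PATTERN.search(text)
    if match is None:
        return None
    value = float(match.group(1).replace(",", ""))
    return {"energy_kwh": value, "provider": "regex"}


def extract_with_fallback(text: str, providers: Sequence[Provider]) -> Optional[dict]:
    """Try each provider in order and return the first usable result."""
    if not text.strip():
        return None  # pre-validation: no parseable text, reject early
    for provider in providers:
        try:
            result = provider(text)
        except Exception:
            continue  # provider unavailable or failed: move down the chain
        if result is not None:
            return result
    return None
```

In this sketch, an LLM provider would be any callable with the same signature placed ahead of regex_extract in the list; if it raises or returns nothing, the regex result is used.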
Honest limits
LLM extraction is OFF by default. Without an API key the system uses regex only. Regex covers the most common UK utility bill formats; documents with non-standard layouts or heavily scanned content may produce partial extractions, which are flagged for review rather than silently accepted.
Source trace
tools/ingestion/extraction_chooser_v1.py - native vs OCR routing decision (lines 1–40)
tools/ingestion/llm_extract_v1.py - provider chain: Ollama → OpenAI → regex fallback
config/feature_flags.py:110-114 - FF_INGEST_LLM_EXTRACT_V1 (default False; auto-enabled when an API key is detected)
2. Confidence scoring
Partial
Each extracted evidence item carries a source_status signal (unconfirmed → confirmed) that reflects how the value was obtained and whether a human reviewer has validated it. A user-confirmed flag is set when a supplier explicitly accepts a value in the review interface. Before any item reaches the supplier pack, a dimensional sanity gate checks that extracted quantities are unit-consistent: for example, that a reading expressed in kWh is not paired with a price denominated in cubic metres. Items that fail this gate are quarantined into 3_AUDIT/dimensional_quarantine_v1.json and marked for manual review rather than being silently dropped or promoted.
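A dimensional gate of this kind might look roughly like the sketch below. The compatibility table, field names, and function names are hypothetical and chosen only to illustrate the idea; the real logic lives in tools/extraction/dimensional_sanity_gate_v1.py and may differ substantially.

```python
# Hypothetical table: which price units make sense for each quantity unit.
COMPATIBLE_PRICE_UNITS = {
    "kWh": {"GBP/kWh", "p/kWh"},
    "m3": {"GBP/m3"},
}


def passes_dimensional_gate(item: dict) -> bool:
    """Return True when the price unit is compatible with the quantity unit."""
    allowed = COMPATIBLE_PRICE_UNITS.get(item["quantity_unit"], set())
    return item["price_unit"] in allowed


def partition_items(items: list) -> tuple:
    """Split items into (accepted, quarantined) without dropping any of them."""
    accepted, quarantined = [], []
    for item in items:
        if passes_dimensional_gate(item):
            accepted.append(item)
        else:
            quarantined.append(item)
    return accepted, quarantined
```

The point of returning both lists is that a failing item is kept and routed to review, which mirrors the quarantine behaviour described above rather than discarding data.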
Honest limits
The dual-LLM verification signal — produced when two independent AI models are run on the same extraction — is not yet surfaced in supplier-facing confidence badges. The field dual_llm_verified exists in the evidence schema but is never populated in the current output. When this wiring ships it will add a second layer of machine-validation to confidence scoring; until then, badges reflect regex and single-LLM extraction quality plus the dimensional sanity gate only.
Source trace
tools/evidence/evidence_summary_v1.py - source_status enum definition and dual_llm_verified field (near top of file)
tools/extraction/dimensional_sanity_gate_v1.py:32-150 - dimensional quarantine logic; writes to 3_AUDIT/dimensional_quarantine_v1.json
3. Emission factor sources
Partial
TSK uses the DEFRA 2025 conversion factors for all emissions calculations: electricity, natural gas, liquid fuels, water, and waste. The factor set is pinned in contracts/emission_factors/uk_defra_2025_v1.json, where the vintage year is hard-coded in the file header (line 6: "year": 2025) and the calculation method is declared as "method": "location-based". The calculate_emissions() function selects this factor set as its default at line 27 of tools/calc/emissions_calculator_v1.py.
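To make the pinned-factor idea concrete, here is a minimal sketch of an activity-based calculation (quantity multiplied by a factor). Only the "year" and "method" keys come from the description above. The "factors" layout, the key names, and the function name are assumptions, and the factor value 0.25 is a placeholder, not a DEFRA 2025 figure.

```python
import json

# Illustrative factor file. The 0.25 value is a PLACEHOLDER, not a DEFRA number,
# and the "factors" structure is an assumed layout.
FACTOR_SET_JSON = """
{
  "year": 2025,
  "method": "location-based",
  "factors": {
    "electricity_kwh": {"unit": "kWh", "kgco2e_per_unit": 0.25}
  }
}
"""


def calculate_emissions_sketch(quantity: float, factor_key: str, factor_set: dict) -> dict:
    """Multiply an activity quantity by its pinned factor and record provenance."""
    factor = factor_set["factors"][factor_key]
    return {
        "kgco2e": quantity * factor["kgco2e_per_unit"],
        "factor_year": factor_set["year"],
        "method": factor_set["method"],
    }


factor_set = json.loads(FACTOR_SET_JSON)
result = calculate_emissions_sketch(1000.0, "electricity_kwh", factor_set)
```

Carrying the factor year and method alongside the result is one way to keep a calculation reproducible: the same pinned file always yields the same output for the same input.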
Honest limits
The factor library is DEFRA 2025, UK only. Non-UK geographies fall back to the UK factor today; this is a known limitation. Calculations applied to non-UK consumption data using the UK factor will misrepresent location-based emissions and should be re-run when a regional factor set ships. We pin a specific DEFRA edition deliberately so that historical calculations are reproducible: emissions calculated today against DEFRA 2025 will calculate identically a year from now. Refreshing the pinned factor set when DEFRA releases its next annual update is on the post-launch roadmap. Only location-based Scope 2 is supported; market-based calculation is deferred.
Source trace
contracts/emission_factors/uk_defra_2025_v1.json - year hard-coded in header (lines 1–16); "method": "location-based" declared here
tools/calc/emissions_calculator_v1.py:27 - default factor set selection
4. GHG Protocol scope rules
Partial
TSK follows GHG Protocol scope boundaries for all emissions derivation. Scope 1 covers direct combustion from owned or controlled sources: natural gas, diesel, petrol, LPG, and fuel oil are all supported with live factor lookups. Scope 2 uses a location-based approach for purchased electricity: the grid emission factor is applied to metered kWh consumption, following the "method": "location-based" declaration in contracts/emission_factors/uk_defra_2025_v1.json. Scope 3 activity keys are mapped from supplier document data using KEYWORD_METRIC_MAP in tools/ingestion/mapping_candidates_v1.py so that category assignments are ready when calculation is added.
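A keyword-to-metric mapping of the kind described can be sketched as a simple lookup. The table below is hypothetical: its keywords, metric keys, and scope labels are invented for illustration and do not reproduce the contents of KEYWORD_METRIC_MAP.

```python
# Hypothetical mapping: keyword -> (metric key, scope label).
KEYWORD_METRIC_MAP_SKETCH = {
    "natural gas": ("natural_gas_kwh", "scope_1"),
    "diesel": ("diesel_litres", "scope_1"),
    "electricity": ("electricity_kwh", "scope_2"),
    "waste": ("operational_waste", "scope_3_cat_5"),
    "business travel": ("business_travel", "scope_3_cat_6"),
}


def map_candidates(text: str) -> list:
    """Return (keyword, metric_key, scope) for every keyword found in the text."""
    lowered = text.lower()
    return [
        (keyword, metric_key, scope)
        for keyword, (metric_key, scope) in KEYWORD_METRIC_MAP_SKETCH.items()
        if keyword in lowered
    ]
```

Note that this sketch only assigns categories. It performs no emissions calculation, which matches the schema-ready status of Scope 3 in this section.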
Honest limits
Scope 3 category mapping is schema-ready; per-category emissions calculation is NOT yet implemented. The mapping schema currently addresses 9 of the 15 GHG Protocol Scope 3 categories: Cat 1 (purchased goods & services), Cat 2 (capital goods), Cat 3 (fuel & energy-related), Cat 4 (upstream transport), Cat 5 (operational waste), Cat 6 (business travel), Cat 7 (employee commuting), Cat 9 (downstream transport), and Cat 12 (end-of-life treatment). The remaining 6 categories are not addressed in the current schema: Cat 8 (upstream leased assets), Cat 10 (processing of sold products), Cat 11 (use of sold products), Cat 13 (downstream leased assets), Cat 14 (franchises), and Cat 15 (investments). Scope 2 market-based calculation is deferred. Scope 1 coverage is limited to the five fuels listed above; refrigerants, fugitive emissions, and process-combustion sources are not yet in scope.
Source trace
tools/calc/emissions_derivation_v1.py:64-86 - METRIC_KEY_TO_FACTOR mapping table: metric keys for Scope 1 fuels (natural gas, diesel, petrol, LPG, fuel oil) and Scope 3 water/waste keys, mapped to their DEFRA factor names and expected units
contracts/emission_factors/uk_defra_2025_v1.json - Scope 2 location-based method declared at line 16 ("method": "location-based"); tools/calc/emissions_calculator_v1.py surfaces this via factor.get("method")
KEYWORD_METRIC_MAP in tools/ingestion/mapping_candidates_v1.py (lines 62–83) - Scope 3 activity key mapping schema (no emissions calculation)
5. Dual-AI verification
Partial
TSK's production profile runs two independent AI models on extractions where the dual-LLM adjudication module is engaged. Today, model disagreements are logged internally for operations review; the verification result is not yet surfaced in supplier-facing confidence badges. Surfacing the signal in badges is the next step in dual-LLM visibility. This module is implemented in tools/ingestion/dual_llm_adjudication_v1.py and is enabled in production via the FF_DUAL_LLM_ADJUDICATION feature flag.
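One plausible shape for comparing two model outputs is sketched below. The 1% relative tolerance, the logger name, and the function name are assumptions made for illustration; the actual adjudication rules in tools/ingestion/dual_llm_adjudication_v1.py are not described in this document.

```python
import logging
import math

logger = logging.getLogger("dual_llm_sketch")  # hypothetical logger name


def adjudicate(value_a: float, value_b: float, rel_tol: float = 0.01) -> dict:
    """Compare two independently extracted values and log any disagreement."""
    agreed = math.isclose(value_a, value_b, rel_tol=rel_tol)
    if not agreed:
        # Disagreements are only logged here, mirroring the internal-review behaviour.
        logger.warning("model disagreement: %s vs %s", value_a, value_b)
    return {"agreed": agreed, "values": (value_a, value_b)}
```

In this sketch the "agreed" result is returned to the caller but written nowhere else, which corresponds to the current state in which the signal is not yet surfaced in supplier-facing badges.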
Honest limits
The dual_llm_verified field exists in tools/evidence/evidence_summary_v1.py but is never populated in the current supplier output. When this wiring ships, Section 2 (Confidence scoring) will also be updated to reflect the added signal.
Source trace
config/feature_flags.py:171-175 - FF_DUAL_LLM_ADJUDICATION (default False; enabled via the "quality" group, so it does not need to be listed by name in production.json; group membership is sufficient for ON status)
tools/ingestion/dual_llm_adjudication_v1.py - adjudication logic (~180 lines)
config/beta_profiles/production.json - includes "quality" in its groups array, which enables all flags in that group including FF_DUAL_LLM_ADJUDICATION
tools/evidence/evidence_summary_v1.py - dual_llm_verified field defined but not yet populated in supplier-facing output
6. Chain of custody
Partial
Every supplier pack produced by TSK is tamper-evident. When the pipeline finalises a run, it writes a manifest to 3_AUDIT/manifest.json inside the supplier_pack.zip archive. Each file in the pack receives a SHA256 hash entry in this manifest, so any post-delivery modification to an individual evidence file is detectable by recomputing and comparing hashes. The manifest is written by tools/pack/supplier_pack_zip_v1.py.
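Recomputing and comparing hashes can be sketched with the Python standard library alone. The manifest layout assumed here (a flat JSON object mapping file name to hex digest) and both function names are hypothetical; the real manifest written by tools/pack/supplier_pack_zip_v1.py may be structured differently.

```python
import hashlib
import io
import json
import zipfile

MANIFEST_PATH = "3_AUDIT/manifest.json"


def build_pack(files: dict) -> bytes:
    """Build an in-memory zip with a manifest of SHA256 digests (assumed layout)."""
    manifest = {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data in files.items():
            archive.writestr(name, data)
        archive.writestr(MANIFEST_PATH, json.dumps(manifest))
    return buffer.getvalue()


def verify_pack(pack_bytes: bytes) -> list:
    """Return the names of files whose current hash differs from the manifest."""
    with zipfile.ZipFile(io.BytesIO(pack_bytes)) as archive:
        manifest = json.loads(archive.read(MANIFEST_PATH))
        return [
            name
            for name, expected in manifest.items()
            if hashlib.sha256(archive.read(name)).hexdigest() != expected
        ]
```

An empty list from verify_pack means every listed file still matches its recorded digest. This detects modification of files but says nothing about who produced the manifest, which is the gap that cryptographic signing would close.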
Honest limits
The manifest provides integrity (SHA256 hashes per file). Cryptographic signing — auditor-verifiable authenticity with a private key — is on the post-launch roadmap. Your completed pack benefits from terminal-state durability (H2 Item 2.5): once a run is finalised, it survives a server restart. Mid-pipeline resume is not promised — if a run is interrupted before reaching the 5-artifact threshold, it may be left in an incomplete state and will not be automatically re-completed. The orphan-recovery mechanism is a heuristic, not a guarantee — runs with fewer than 5 artifacts are not recovered.
Source trace
tools/pack/supplier_pack_zip_v1.py - manifest written to 3_AUDIT/manifest.json; SHA256 per file computed at ~line 42+
web_service/run_manager.py:217 - orphan recovery threshold (5+ artifacts → terminal-state completion; fewer than 5 → not recovered)