01 CONTEXT
Resolutiion: conflict & dispute management for contract delivery.
Early-stage. The product was still being defined.
At an early company the first product is the foundation. Build only for disputes and you anchor everything to a symptom, while incumbents keep their head start.
02 THE REFRAME
Disputes are a symptom. The job was upstream.
This surfaced in root-cause conversations with clients: nobody wants to get good at managing disputes. By the time there is a dispute, the damage is already done, in cost, in time, and in the relationship.
Draft & Sign
Handover
Deliver & Manage
Renew
Global events (tariffs, regional conflict)
Dispute
03 INSIGHTS
…
We didn't assume we knew where risk formed, so we ran workshops with the people who live inside these contracts (contract managers, procurement, and supply chain) to map where things actually go wrong and to pressure-test the "avoid, don't manage" reframe.
Data sits in multiple sources - I don't have a holistic view or one source of truth
I don't have the time to read contracts when something happens, so I don't know what the consequences are
Sometimes events like tariffs don't impact us directly, but they impact our supplier, and we are affected
What about my manager? They'll want to see a horizontal view of the entire portfolio
People with different titles are responsible for capturing risk, so they won't always know what that risk looks like (tonality in an email)
04 PROBLEM DEFINITION & HYPOTHESIS
Hypothesis
The shift was in what counted as value. These users already had tools that showed them data. What none of them had was anything that interpreted it. The expected output stopped being raw data and became the “so what.”
If we surface interpreted risk with a recommended action rather than more data, the daily ops user will act on it, because they don't have the time or training to interpret raw data themselves, and what they want is for the dispute not to happen.
We'd know we were wrong if users kept reaching for raw data anyway, dismissed interpreted flags at high rates, or valued only the market-news feed while ignoring the risk that sat closer to home.
Success would mean the daily ops user acting on what the system surfaces — not because we can count the disputes that never happened, but because acting on a flag is the closest early signal that one is being avoided rather than waited for.
05 PRIORITISATION
Deciding what to build
Users looked at risk on two levels, macro and micro, and both were real needs. We didn't compromise; we sequenced.
Prioritised
Macro-level
Doesn't depend on internal integrations and IT approvals
Faster to demo & ship, equally valuable
Weaker moat: we risked it being read as a news feed
Second in sequence
Micro-level
Stronger moat - competitors can't easily copy
Sticky - once embedded, it's difficult to get rid of the solution
Not just pulling data: matching events to contracts and proposing an action is the USP
Longer time-to-value with high risk of stalling
More expensive and complex (engineering effort)
Higher responsibility to flag high-stakes events
Draft & Sign
Handover
Deliver & Manage
Renew
Global events (tariffs, regional conflict)
Dispute
06 THE SYSTEM
Designing for an AI that's sometimes wrong (unhappy path)
Users wanted the system to use judgement for them: to decide what mattered and act on it. But that wish carried an unspoken assumption: that the AI would always be right, which we can't promise.
system surfaces the risk and the AI assigns itself a confidence score
system shows reasoning/context to support judgement
users can act on it, dismiss it, or leave it
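As a rough illustration only, the three behaviours above could be modelled along these lines. This is a minimal sketch, not the product's actual data model: the names `RiskFlag`, `Response` and `acted_rate`, and the 0 to 1 confidence scale, are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Response(Enum):
    """What the user did with a flag (hypothetical names)."""
    ACTED = "acted"
    DISMISSED = "dismissed"


@dataclass
class RiskFlag:
    summary: str        # the interpreted "so what", not raw data
    reasoning: str      # context shown to support the user's judgement
    confidence: float   # score the AI assigns itself; 0 to 1 scale is assumed
    response: Optional[Response] = None  # None means the user left the flag

    def __post_init__(self) -> None:
        # Reject scores outside the assumed 0 to 1 range.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


def acted_rate(flags: list[RiskFlag]) -> float:
    """Share of surfaced flags the user acted on (0.0 if none were surfaced)."""
    if not flags:
        return 0.0
    return sum(f.response is Response.ACTED for f in flags) / len(flags)
```

Keeping "left" as the absence of a response, rather than a third explicit choice, keeps a flag the user never touched distinct from one they actively dismissed.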

07 THE SYSTEM IN CONTEXT
Safety vs agency; highlighting what matters
The aim was a surface where the next thing worth attention is obvious without the user having to go looking for it — because looking for it is exactly what their job doesn't leave room for.
09 SUCCESS
How we'd know it's working
The problem with our product: the thing we're trying to create is the absence of disputes. We can't directly measure the disputes that didn't happen. So the outcome is, in the short term, structurally unmeasurable, and anchoring on it would have been wishful. What we can measure:
Data connected
If we could connect 2+ sources that would normally live separately
Flags acted on
///
Trust
///
10 OUTCOME
….

11 REFLECTION
…
Separating a flag that was ignored because the user didn't trust it from one that was ignored because they didn't have the capacity to deal with it that day.