283 vs. 38: Why You Need More Than Memory

Introducing our first Pulse Intelligence Partner vs. Claude performance benchmark.

We asked the same question in one single Claude session — to ensure consistency:

Name every lithium company in the US and Canada, from grassroots exploration through to definitive feasibility. Pin each one to its exact stage.

One method was our platform, connected via Claude MCP. The other was a general AI model — Claude Opus 4.8 — answering from memory, with no live mining data underneath it.

Benchmark overview: one question, two methods. Pulse platform returned 283 companies. AI model from memory returned 38.

Pulse returned 283 companies. Every one tagged to its stage, every stage traceable to the filing it came from. In about five seconds.

The model named 38. Roughly one in seven. And several of those were mis-staged or already in construction, which puts them outside the question entirely.

Measurement 01 — Companies found. Pulse mapped the whole market at 283. AI model recall saw one in seven at 38. 245 companies never surfaced — 87% of the market, invisible to recall.

The 245 it never surfaced are not obscure. They are the junior explorers and single-asset developers that make up most of the lithium market in North America, and most of the movement in it.

Measurement 02 — Time to answer. Pulse returned one structured query in ~5 seconds, fully sourced. Rebuilding the same 283 names by hand would take hours to days of reading filings with no audit trail.

This is the gap people underestimate. A capable model is not the same thing as a complete, auditable data foundation. Speed was never the hard part. Coverage you can stand behind is.

Measurement 03 — Tokens and compute. Pulse used ~5K tokens for one query, one table out. Web-research parity requires 100K–1M+ tokens across dozens of search and fetch calls — 200x more compute to reach the same answer.

Now imagine adding actual complexity — 10 technical metrics, permitting status, leadership track record. It is simply not possible to achieve with a generic LLM if your definition of success is accuracy and data you can actually use without checking everything.

Why the gap matters. Completeness: 283 vs 38 — Pulse returns the full universe, recall sees the majors and misses the long tail. Stage accuracy: verified vs guessed — every Pulse stage traces to a filing. Provenance: audit trail vs none — one click to source on Pulse.

One honest qualifier: the model was answering cold, from training data, with no tools attached. That is exactly how most people are using AI to screen mining markets right now.

If your screening still depends on what a model can recall, there is a chance you are working from about 13% of the market.

From a week of searching to one query. 283 companies, every stage sourced to the filing, in seconds. That is the difference between searching the data and owning it.

Perhaps worth some reflection.

Less searching. More strategising.™

AI Readiness Diagnostic

Where does your team's data infrastructure sit today?

Answer 10 questions. Get a private diagnostic on your AI readiness — in minutes.

Take the Diagnostic → Request a demo

Pulse Intelligence

Less Searching. More Strategising.™

See the platform running on real mining data. Book a demo to see what this looks like for your team.

Request a demo → Take the Diagnostic