I’ve spent a lot of time using LLMs for finance work and watching them get better at grounding claims and avoiding hallucinations. So I wanted to see whether GPT-5.1 could compare roughly 500 pages of 2026 macro outlooks from 14 banks while keeping citations back to the source PDFs.

What I learned

  • Requiring page references in the prompt and putting page markers in the text makes the citations usable.
  • A simple multi-pass pipeline beats a single giant prompt.
  • Cost and run time were surprisingly reasonable at this scale.
  • The system is text-only, so it is blind to charts, and that leaves gaps.
  • Spot-checking still matters, even with citations.
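
To make the page-marker and spot-checking points concrete, here is a minimal sketch of one way to do it. It is not the pipeline from the write-up: the `[[doc p.N]]` marker format, the function names, and the `page_counts` mapping are all assumptions made for illustration. The idea is to tag each page of extracted text with a marker the model can quote back, then mechanically flag any citation that points at an unknown document or an out-of-range page.

```python
import re

# Hypothetical marker format: [[<doc_id> p.<page>]]. Doc ids must not contain spaces.
CITATION_RE = re.compile(r"\[\[([^\]\s]+) p\.(\d+)\]\]")


def add_page_markers(doc_id, pages):
    """Prefix each page's text with a marker the model can cite verbatim."""
    return "\n\n".join(
        f"[[{doc_id} p.{number}]]\n{text}"
        for number, text in enumerate(pages, start=1)
    )


def extract_citations(answer):
    """Return the (doc_id, page) pairs cited in a model answer, in order."""
    return [(doc, int(page)) for doc, page in CITATION_RE.findall(answer)]


def invalid_citations(answer, page_counts):
    """Return citations naming an unknown doc or a page outside its range.

    page_counts maps doc_id -> number of pages in that PDF (assumed known).
    """
    return [
        (doc, page)
        for doc, page in extract_citations(answer)
        if doc not in page_counts or not 1 <= page <= page_counts[doc]
    ]
```

A check like this only catches citations that cannot exist. It says nothing about whether the cited page actually supports the claim, which is why manual spot-checking still matters.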

Output: 2026 Macro Analysis

Write-up: Methods + notes