Phase 7: ground-truth export (JSONL + stats) + CLI tool

- GET /api/v1/ground-truth/export: streaming JSONL with filters
  approved_only, since, until, has_corrections, limit
- GET /api/v1/ground-truth/stats: total / approved / corrections
  counts + top-N most-corrected field paths
- python -m ocr_sprint.tools.export_ground_truth: operator CLI with
  the same filters + optional --print-stats
- Each ground-truth sample reconstructs the pipeline's original output
  by replaying job_corrections in reverse
- docs/ground-truth-format.md: schema + fine-tuning guidance
- 17 new tests (service replay, endpoint filters, CLI)
- 201 total tests passing, ruff / mypy --strict clean
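
Illustrative sketch of the reverse replay noted above, for reviewers. It is
not the code in this commit: the record keys field_path and old_value, the
dotted-path format, and both function names are assumptions made for the
example. It only shows why corrections are undone newest-first.

```python
from copy import deepcopy
from typing import Any


def set_path(doc: dict[str, Any], path: str, value: Any) -> None:
    """Set a dotted path such as 'invoice.total' inside a nested dict."""
    *parents, leaf = path.split(".")
    node = doc
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def reconstruct_original(
    corrected: dict[str, Any],
    corrections: list[dict[str, Any]],
) -> dict[str, Any]:
    """Undo corrections newest-first to recover the pre-correction output.

    `corrections` is assumed to be ordered oldest-first (hypothetical shape).
    """
    original = deepcopy(corrected)  # leave the corrected document untouched
    for correction in reversed(corrections):
        set_path(original, correction["field_path"], correction["old_value"])
    return original


# The same field was corrected twice; only a newest-first undo lands on
# the value the pipeline originally produced.
corrected = {"invoice": {"total": "120.00", "date": "2024-01-05"}}
corrections = [
    {"field_path": "invoice.total", "old_value": "12O.00", "new_value": "120.50"},
    {"field_path": "invoice.total", "old_value": "120.50", "new_value": "120.00"},
]
original = reconstruct_original(corrected, corrections)
```

Undoing the same two records oldest-first would leave total at "120.50",
an intermediate human edit rather than the pipeline output.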

Co-Authored-By: adrian kuman firmansah <adriancuman@gmail.com>