OCR-SPRIN-SERVICE

adrian/OCR-SPRIN-SERVICE

Fork 0

Commit Graph

Author	SHA1	Message	Date
Devin AI	0755fbebda	Fix empty-dict consistency in ground-truth export Devin Review (post-merge on PR #6) flagged that the `final_result` assignment used a truthiness check (`if job_row.result`) while `build_initial_result` used an identity check (`is None`). For a job whose result is an empty dict (`{}`), the emitted `GroundTruthSample` ended up with `initial_result={}` but `final_result=None` — logically inconsistent. Switch the final-result assignment to the same `is None` check so both fields agree. Added `test_empty_dict_result_stays_consistent` to lock the invariant in, and fixed the test helper so callers can pass `{}` without the helper's `or` fallback replacing it. Co-Authored-By: adrian kuman firmansah <adriancuman@gmail.com>	2026-04-25 20:33:26 +00:00
Devin AI	6003d96a94	Phase 7: ground-truth export (JSONL + stats) + CLI tool - GET /api/v1/ground-truth/export streaming JSONL (approved_only, since, until, has_corrections, limit) - GET /api/v1/ground-truth/stats total / approved / corrections counts + top-N most-corrected field paths - python -m ocr_sprint.tools.export_ground_truth operator CLI with the same filters + optional --print-stats - Ground-truth sample reconstructs the pipeline's original output by replaying job_corrections in reverse - docs/ground-truth-format.md schema + fine-tuning guidance - 17 new tests (service replay, endpoint filters, CLI) - 201 total tests passing, ruff / mypy --strict clean Co-Authored-By: adrian kuman firmansah <adriancuman@gmail.com>	2026-04-25 20:24:40 +00:00

Author

SHA1

Message

Date

Devin AI

0755fbebda

Fix empty-dict consistency in ground-truth export

Devin Review (post-merge on PR #6) flagged that the `final_result`
assignment used a truthiness check (`if job_row.result`) while
`build_initial_result` used an identity check (`is None`). For a
job whose result is an empty dict (`{}`), the emitted
`GroundTruthSample` ended up with `initial_result={}` but
`final_result=None` — logically inconsistent.

Switch the final-result assignment to the same `is None` check so
both fields agree. Added `test_empty_dict_result_stays_consistent`
to lock the invariant in, and fixed the test helper so callers can
pass `{}` without the helper's `or` fallback replacing it.

Co-Authored-By: adrian kuman firmansah <adriancuman@gmail.com>

2026-04-25 20:33:26 +00:00

Devin AI

6003d96a94

Phase 7: ground-truth export (JSONL + stats) + CLI tool

- GET /api/v1/ground-truth/export  streaming JSONL (approved_only,
  since, until, has_corrections, limit)
- GET /api/v1/ground-truth/stats   total / approved / corrections
  counts + top-N most-corrected field paths
- python -m ocr_sprint.tools.export_ground_truth  operator CLI with
  the same filters + optional --print-stats
- Ground-truth sample reconstructs the pipeline's original output by
  replaying job_corrections in reverse
- docs/ground-truth-format.md    schema + fine-tuning guidance
- 17 new tests (service replay, endpoint filters, CLI)
- 201 total tests passing, ruff / mypy --strict clean

Co-Authored-By: adrian kuman firmansah <adriancuman@gmail.com>

2026-04-25 20:24:40 +00:00

2 Commits