New OCR model; better OCR heuristics
- New OCR model that is better all-around, but particularly at math
- Improved OCR heuristics that now prioritize accuracy over speed
- Drop `format_lines`, since `force_ocr` is generally a bit more accurate and less error-prone
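
If you keep conversion options in a plain Python dict, the removal of `format_lines` could be handled with a small migration step like the sketch below. The helper name `migrate_ocr_options` and the dict layout are hypothetical, not part of the library. Treating `force_ocr` as the stand-in for `format_lines` is an assumption based on the note above, so check the option against your own setup.

```python
def migrate_ocr_options(config: dict) -> dict:
    """Return a copy of config with format_lines removed.

    Hypothetical helper: if format_lines was enabled, force_ocr is
    switched on instead (an assumption drawn from the release note).
    """
    migrated = dict(config)  # shallow copy, the caller's dict is left untouched
    had_format_lines = bool(migrated.pop("format_lines", False))
    if had_format_lines:
        migrated["force_ocr"] = True
    return migrated


# Illustrative config; the keys other than format_lines/force_ocr are placeholders.
old_config = {"output_format": "markdown", "format_lines": True}
new_config = migrate_ocr_options(old_config)
print(new_config)
```

Configs that never set `format_lines` pass through unchanged, so the helper can be applied unconditionally.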
What's Changed
- feat: allow option to keep tables split across pages by @zanussbaum in #813
- New OCR Model by @tarun-menta in #820
- Bump surya version by @VikParuchuri in #821
Full Changelog: v1.8.2...v1.8.3