01 — Problem
What was hard about this
Research papers frequently get desk-rejected on formatting alone (margins, fonts, figure resolution, citation style). Authors lose days to resubmissions over issues that an automated checker could surface in seconds.
02 — Solution
How it works
Built an OCR and computer-vision audit pipeline: PyTesseract extracts text, OpenCV measures layout properties (margins, line spacing, font sizes), and a rules engine validates each property against a configurable submission template. The pipeline then generates an annotated PDF report that flags each violation with its location and a suggested fix.
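To make the measure-then-validate step concrete, here is a minimal sketch of how margins could be measured and checked against a template. It is not the project's actual code: `Rule`, `Violation`, `measure_margins`, `validate`, and the `STRICT_TEMPLATE` values are all hypothetical names and numbers chosen for illustration. A plain-Python grid of 0/1 pixels stands in for the binarized page image that OpenCV would produce in the real pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One template constraint on a measured property (hypothetical schema)."""
    prop: str        # measured property name, e.g. "margin_left_in"
    minimum: float   # smallest allowed value
    maximum: float   # largest allowed value
    fix: str         # suggested fix shown in the report


@dataclass(frozen=True)
class Violation:
    prop: str
    measured: float
    allowed: tuple[float, float]
    fix: str


def measure_margins(page: list[list[int]], dpi: int) -> dict[str, float]:
    """Measure margins in inches on a binarized page (1 = ink, 0 = blank).

    Pure-Python stand-in for an OpenCV bounding-box measurement.
    """
    height, width = len(page), len(page[0])
    rows = [y for y, row in enumerate(page) if any(row)]
    cols = [x for x in range(width) if any(row[x] for row in page)]
    if not rows:
        raise ValueError("page contains no ink")
    return {
        "margin_top_in": rows[0] / dpi,
        "margin_bottom_in": (height - 1 - rows[-1]) / dpi,
        "margin_left_in": cols[0] / dpi,
        "margin_right_in": (width - 1 - cols[-1]) / dpi,
    }


def validate(measured: dict[str, float], template: list[Rule]) -> list[Violation]:
    """Return one Violation per rule whose measured value is out of range."""
    violations = []
    for rule in template:
        value = measured.get(rule.prop)
        if value is None:
            continue  # property not measured; this sketch skips it silently
        if not (rule.minimum <= value <= rule.maximum):
            violations.append(
                Violation(rule.prop, value, (rule.minimum, rule.maximum), rule.fix)
            )
    return violations


# Hypothetical venue template: at least 1 inch of margin on every side.
STRICT_TEMPLATE = [
    Rule(f"margin_{side}_in", 1.0, float("inf"), f"Increase the {side} margin to 1 in")
    for side in ("top", "bottom", "left", "right")
]
```

Swapping venues in this sketch means passing a different list of `Rule` objects to `validate`; the measurement code stays the same. Each returned `Violation` carries the property name, the measured value, and a fix string, which is roughly the information an annotated report would need.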
03 — Impact
What shipped
- 40% reduction in research paper rejection rates among test users
- Annotated PDF reports pinpoint every compliance violation
- Configurable rule engine — swap submission templates per venue
- 3rd place at MINeD 2025 Hackathon