Hybrid PDF → Excel Invoice Extraction (OCR + Digital)

Work Details

I built a hybrid pipeline that auto-detects PDF type and selects the best path:

Digital PDFs: text is parsed directly with layout-aware extraction.

Scanned PDFs: pages are rendered to images with Poppler, processed with Tesseract OCR, and then reassembled into searchable PDFs.
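The routing described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `min_chars` and `scanned_ratio` thresholds, all function names, and the choice of the `pdfplumber`, `pdf2image` (a Poppler wrapper), and `pytesseract` packages are assumptions made for the example.

```python
from typing import List, Optional, Tuple


def classify_pages(page_texts: List[Optional[str]], min_chars: int = 20) -> List[str]:
    """Label a page 'digital' when it carries enough extractable text, else 'scanned'."""
    labels = []
    for text in page_texts:
        stripped = (text or "").strip()  # extractors may return None for empty pages
        labels.append("digital" if len(stripped) >= min_chars else "scanned")
    return labels


def detect_pdf_type(page_texts: List[Optional[str]],
                    min_chars: int = 20,
                    scanned_ratio: float = 0.5) -> str:
    """Treat the whole file as scanned when enough pages lack a text layer.

    Both thresholds are hypothetical defaults chosen for illustration.
    """
    labels = classify_pages(page_texts, min_chars)
    if not labels:
        return "scanned"  # nothing to parse directly, so fall back to OCR
    scanned_share = labels.count("scanned") / len(labels)
    return "scanned" if scanned_share >= scanned_ratio else "digital"


def read_text_layer(pdf_path: str) -> List[Optional[str]]:
    """Digital path: read each page's embedded text (assumes pdfplumber is installed)."""
    import pdfplumber  # imported lazily so the detection logic has no hard dependency

    with pdfplumber.open(pdf_path) as pdf:
        return [page.extract_text() for page in pdf.pages]


def ocr_pages_to_pdf_bytes(pdf_path: str, dpi: int = 300) -> List[bytes]:
    """Scanned path: render pages via Poppler, then OCR each one into a searchable PDF page.

    Assumes pdf2image and pytesseract are installed, along with the Poppler
    and Tesseract binaries they call.
    """
    from pdf2image import convert_from_path
    import pytesseract

    images = convert_from_path(pdf_path, dpi=dpi)
    return [pytesseract.image_to_pdf_or_hocr(img, extension="pdf") for img in images]


def process(pdf_path: str) -> Tuple[str, list]:
    """Detect the PDF type and dispatch to the matching path."""
    texts = read_text_layer(pdf_path)
    if detect_pdf_type(texts) == "digital":
        return "digital", texts
    return "scanned", ocr_pages_to_pdf_bytes(pdf_path)
```

Keeping the detection step as a pure function over per-page text makes the thresholds easy to tune and test without any PDF tooling; the per-page PDF bytes returned by the OCR path would still need to be merged into one file by a separate step.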

Date added

04/09/2025

Completion date

28/08/2025

Skills