This project presents a machine learning system that analyzes passport images. It extracts textual information with Optical Character Recognition (OCR) and classifies each image as genuine or potentially fake based on features extracted from the image.
The system allows users to upload a passport image through an interactive interface, processes the image, extracts important text fields, and performs classification using a trained machine learning model.
Project features include:
Uploading passport images through a simple interface
Image preprocessing for better recognition accuracy
Extracting text using OCR (Tesseract)
Feature extraction using Histogram of Oriented Gradients (HOG)
Passport classification using an SVM model
Displaying extracted results clearly to the user
Technologies and libraries used:
Python
OpenCV
pytesseract (OCR)
scikit-learn
matplotlib
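As an illustration of how OpenCV and pytesseract from the list above might be combined for the OCR step, here is a hedged sketch. The function names `ocr_passport` and `clean_ocr_lines` are hypothetical, Otsu thresholding is an assumed preprocessing choice, and the Tesseract binary must be installed separately for `pytesseract` to work.

```python
import re


def clean_ocr_lines(raw_text):
    """Collapse runs of whitespace and drop empty lines from raw OCR output."""
    lines = (re.sub(r"\s+", " ", line).strip() for line in raw_text.splitlines())
    return [line for line in lines if line]


def ocr_passport(image_path):
    """Read an image, binarize it, and return the cleaned OCR text lines."""
    # Imported here so clean_ocr_lines stays usable without the OCR stack installed.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Otsu thresholding is an assumption; the project's preprocessing may differ.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return clean_ocr_lines(pytesseract.image_to_string(binary))
```

Calling `ocr_passport("passport.jpg")` (a placeholder path) would return a list of text lines, which could then be matched against the fields the interface displays.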
This system can be extended and applied in real-world scenarios such as:
Identity verification systems
Airport security automation
Government document validation
Intelligent document processing solutions
The project demonstrates a practical implementation of Computer Vision and Machine Learning techniques.