I will build a desktop data normalization tool in Python with Tkinter. It cleans and standardizes vendor, company, and facility records from TXT/TSV files, validates vendors against a reference list, filters special and invalid characters, and normalizes records against a trusted lookup table (LUT) using both exact matching and similarity (fuzzy) matching.
It is aimed at messy operational datasets where vendor names and facility details are inconsistent and you need clean, standardized output for reporting or database loading.
Normalization_Tool_V2_GUI
What you’ll get
Tkinter GUI to select input/output paths (no CLI needed)
Vendor validation against an approved manufacturer/vendor list (Excel)
Special-character / invalid-character detection
Support for “Mixed / Not Clear / Re-Ask” exception lists (Excel)
LUT-based normalization:
  Exact match
  Similarity match using SequenceMatcher thresholds
Export:
  Clean normalized output TXT/TSV (UTF-8)
  Checked LUT export
Log output shown live inside the GUI
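To make the special-character check and the two-stage LUT lookup concrete, here is a minimal sketch of how they could work. It is not the tool's actual code: the function names, the allowed-character set, the 0.85 threshold, and the dict-shaped LUT are all assumptions chosen for illustration. Only the use of SequenceMatcher (from Python's difflib) comes from the feature list above.

```python
import re
from difflib import SequenceMatcher

# Hypothetical threshold; the real tool's threshold values are not stated here.
SIMILARITY_THRESHOLD = 0.85

# Hypothetical rule: anything outside ASCII letters, digits, whitespace, and a
# few common punctuation marks is reported as a special character.
_ALLOWED = re.compile(r"[A-Za-z0-9\s.,&'()/-]")


def find_special_chars(text):
    """Return the characters in text that fall outside the allowed set."""
    return [ch for ch in text if not _ALLOWED.fullmatch(ch)]


def _key(name):
    """Build a case-insensitive, whitespace-collapsed comparison key."""
    return " ".join(name.split()).casefold()


def normalize(name, lut, threshold=SIMILARITY_THRESHOLD):
    """Map a raw name to its canonical form using the LUT.

    lut is assumed to be a dict of {raw_variant: canonical_name}.
    Returns (result, match_type, score), where match_type is
    "exact", "similar", or "unmatched".
    """
    key = _key(name)
    keyed = {_key(raw): canonical for raw, canonical in lut.items()}

    # Stage 1: exact match on the comparison key.
    if key in keyed:
        return keyed[key], "exact", 1.0

    # Stage 2: best similarity match, accepted only at or above the threshold.
    best_score, best_canonical = 0.0, None
    for raw_key, canonical in keyed.items():
        score = SequenceMatcher(None, key, raw_key).ratio()
        if score > best_score:
            best_score, best_canonical = score, canonical
    if best_canonical is not None and best_score >= threshold:
        return best_canonical, "similar", best_score

    # No acceptable match: return the input unchanged so it can be reviewed.
    return name, "unmatched", best_score
```

With a LUT such as {"Acme Corp": "ACME Corporation"}, the input "acme  corp" resolves through the exact stage, while "Acme Corp." falls through to the similarity stage and is still mapped to "ACME Corporation". Returning the match type and score alongside the result makes it straightforward to log each decision and to route unmatched records to a review list.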