Project Details

Transformer-Based English-to-French Translation Model

This project implements a custom Transformer model from scratch to perform machine translation from English to French using TensorFlow and Keras. The entire workflow is contained within a Jupyter Notebook, showcasing data preprocessing, model architecture, training, and inference.

Key Features:

Custom Transformer Implementation: Manually built the encoder-decoder architecture using multi-head attention, positional encoding, and feed-forward layers without relying on high-level APIs.
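
As a rough illustration of two of these building blocks, the NumPy sketch below shows sinusoidal positional encoding and scaled dot-product attention as defined by Vaswani et al. It is not the notebook's TensorFlow code: the function names, the array shapes, and the mask convention (1 means "may attend") are assumptions made for this example.

```python
import numpy as np

def positional_encoding(max_len, d_model):
    """Sinusoidal positional encoding: sin on even dims, cos on odd dims."""
    positions = np.arange(max_len)[:, np.newaxis]   # (max_len, 1)
    dims = np.arange(d_model)[np.newaxis, :]        # (1, d_model)
    # Each pair of dimensions (2i, 2i+1) shares one frequency.
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates                # (max_len, d_model)
    encoding = np.zeros((max_len, d_model))
    encoding[:, 0::2] = np.sin(angles[:, 0::2])
    encoding[:, 1::2] = np.cos(angles[:, 1::2])
    return encoding

def scaled_dot_product_attention(q, k, v, mask=None):
    """Compute softmax(q k^T / sqrt(d_k)) v and return (output, weights)."""
    d_k = q.shape[-1]
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d_k)
    if mask is not None:
        # Masked-out positions get a large negative score before softmax.
        scores = np.where(mask == 1, scores, -1e9)
    # Subtract the row maximum for a numerically stable softmax.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights
```

Multi-head attention applies this same attention function in parallel to several learned projections of the queries, keys, and values, then concatenates the results.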

Dataset Preprocessing: Loaded parallel English-French sentence pairs, applied tokenization using TextVectorization, and padded sequences for consistent input lengths.
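
The preprocessing idea can be sketched in plain Python. The snippet below only mimics the default behavior of a text-vectorization step (lowercasing, stripping punctuation, splitting on whitespace, reserving id 0 for padding and id 1 for unknown words); it is not the Keras layer itself, and the helper names and the tiny example corpus are hypothetical.

```python
import string
from collections import Counter

PAD_ID, UNK_ID = 0, 1  # assumed convention: 0 = padding, 1 = unknown token

def standardize(text):
    """Lowercase the text and strip ASCII punctuation."""
    return text.lower().translate(str.maketrans("", "", string.punctuation))

def build_vocab(sentences, max_tokens=None):
    """Map each token to an integer id, most frequent tokens first."""
    counts = Counter(tok for s in sentences for tok in standardize(s).split())
    # Ties are broken alphabetically so the mapping is deterministic.
    ordered = sorted(counts, key=lambda t: (-counts[t], t))
    if max_tokens is not None:
        ordered = ordered[: max(max_tokens - 2, 0)]  # two ids are reserved
    return {tok: i + 2 for i, tok in enumerate(ordered)}

def vectorize(sentence, vocab, seq_len):
    """Convert a sentence to exactly seq_len ids, truncating or padding."""
    ids = [vocab.get(tok, UNK_ID) for tok in standardize(sentence).split()]
    ids = ids[:seq_len]
    return ids + [PAD_ID] * (seq_len - len(ids))

vocab = build_vocab(["I am here.", "I am cold."])
padded = vectorize("I am hungry!", vocab, seq_len=5)
```

Padding every sentence to the same length lets a batch be stored as one rectangular integer tensor; the padding id is what the masks later key on.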

Training Pipeline: Trained the model with the Adam optimizer and a sparse categorical cross-entropy loss, using masks so that padding tokens are ignored in both the loss calculation and the attention layers.
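
One way to picture the loss masking is the NumPy sketch below. It computes sparse categorical cross-entropy from raw logits and averages it over non-padding positions only. The function name, the assumption that padding uses id 0, and the choice to average over real tokens are illustrative, not taken from the notebook.

```python
import numpy as np

def masked_sparse_categorical_crossentropy(y_true, logits, pad_id=0):
    """Mean cross-entropy over positions whose target is not pad_id.

    y_true: integer array of shape (batch, seq_len)
    logits: float array of shape (batch, seq_len, vocab_size)
    """
    # Log-softmax, shifted by the row maximum for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Pick the log-probability assigned to the correct token at each position.
    picked = np.take_along_axis(log_probs, y_true[..., np.newaxis], axis=-1)
    token_loss = -picked.squeeze(-1)
    # Zero out the loss wherever the target is padding.
    mask = (y_true != pad_id).astype(log_probs.dtype)
    return (token_loss * mask).sum() / np.maximum(mask.sum(), 1.0)
```

Without the mask, the model would be rewarded for predicting padding, which inflates apparent accuracy on short sentences.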

Inference Mechanism: Developed a custom translate function that uses greedy decoding to generate French translations from English input sentences.
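
Greedy decoding can be sketched independently of any framework. In the snippet below, `next_token_logits` is a hypothetical stand-in for the trained model: it receives the source ids and the tokens decoded so far and returns one score per vocabulary entry. The start and end token ids and the toy copy model are likewise assumptions for illustration.

```python
def greedy_decode(next_token_logits, source_ids, start_id, end_id, max_len=20):
    """Pick the highest-scoring token at each step until end_id or max_len."""
    generated = []
    for _ in range(max_len):
        logits = next_token_logits(source_ids, [start_id] + generated)
        # Greedy choice: the index of the largest score.
        next_id = max(range(len(logits)), key=lambda i: logits[i])
        if next_id == end_id:
            break
        generated.append(next_id)
    return generated

def toy_copy_model(source_ids, decoded_so_far, vocab_size=10, end_id=2):
    """Stand-in model that copies the source, then emits the end token."""
    step = len(decoded_so_far) - 1  # decoded_so_far begins with the start token
    target = source_ids[step] if step < len(source_ids) else end_id
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]

translated_ids = greedy_decode(toy_copy_model, [5, 7, 4], start_id=1, end_id=2)
```

Greedy decoding is simple and fast, but it commits to one token per step; beam search is the usual alternative when that proves too short-sighted.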

Performance Insight: The model learns simple translations, and its architecture closely follows the design described in Vaswani et al.'s "Attention Is All You Need."
