This project involves the development of a deep learning model to detect emojis in images, predict their
class, and localize them using bounding box regression. The model is trained on 144x144 RGB images with
a dataset of 9 unique emojis. It is a multi-task learning setup, combining classification and localization tasks
using a convolutional neural network (CNN). The project also implements custom metrics like Intersection
over Union (IoU) for evaluating bounding box predictions and features a custom data generator for on-the-fly
example creation.