What are we trying to do?
In drug discovery, a drug molecule needs to bind to a protein target in the body to have an effect. The strength of this binding is called affinity, measured here as pKd: the negative base-10 logarithm of the dissociation constant Kd (in molar units), so pKd = -log10(Kd).
Higher pKd = lower Kd = stronger binding, which generally means a more potent drug against that target
Our dataset covers 68 drugs and 379 protein targets, with 30,056 drug-target pairs
Each pair has a measured pKd value we want to predict
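To make the log scale concrete, here is a minimal sketch of converting a dissociation constant to pKd, using the standard definition pKd = -log10(Kd in molar). The function name kd_nm_to_pkd and the choice of nanomolar input are assumptions for illustration; the dataset's raw units are not stated here.

```python
import math

def kd_nm_to_pkd(kd_nm: float) -> float:
    """Convert a dissociation constant in nanomolar to pKd = -log10(Kd in molar)."""
    if kd_nm <= 0:
        raise ValueError("Kd must be positive")
    return -math.log10(kd_nm * 1e-9)  # 1 nM = 1e-9 M

# Illustrative Kd values only (not taken from the dataset).
for kd_nm in (10_000.0, 100.0, 1.0):
    print(f"Kd = {kd_nm:>8.1f} nM  ->  pKd = {kd_nm_to_pkd(kd_nm):.2f}")
```

Each unit of pKd corresponds to a tenfold change in Kd, so a pair with pKd 9 binds roughly 10,000 times more tightly than a pair with pKd 5.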
Our goal: Build a machine learning model that, given a drug's molecular structure and a protein's amino acid sequence, predicts how strongly they bind.
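As a rough sketch of what this task looks like in code, the example below defines one data record, a simple character-level encoder, and a trivial baseline that always predicts the mean training pKd. Everything here is an assumption for illustration: the names DrugTargetPair, encode and MeanBaseline are hypothetical, the drug structure is assumed to be given as a SMILES string, and the toy sequences and pKd values are invented, not taken from the dataset. It is not the model built in this project, only a reference point that any real model should beat.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DrugTargetPair:
    smiles: str    # assumed text encoding of the drug's molecular structure
    sequence: str  # protein amino acid sequence, one letter per residue
    pkd: float     # measured affinity: the value we want to predict

def encode(text: str, vocab: dict[str, int], max_len: int) -> list[int]:
    """Map each character to an integer id, truncate to max_len, pad with 0."""
    ids = [vocab.setdefault(ch, len(vocab) + 1) for ch in text[:max_len]]
    return ids + [0] * (max_len - len(ids))

class MeanBaseline:
    """Ignores its inputs and predicts the mean pKd seen during fitting."""
    def fit(self, pairs: list[DrugTargetPair]) -> "MeanBaseline":
        self.mean_ = sum(p.pkd for p in pairs) / len(pairs)
        return self

    def predict(self, smiles: str, sequence: str) -> float:
        return self.mean_

# Invented toy records (placeholder sequences and pKd values).
pairs = [
    DrugTargetPair("CC(=O)OC1=CC=CC=C1C(=O)O", "MKTAYIAK", 5.0),
    DrugTargetPair("CCO", "MGSSHHHH", 7.0),
]

smiles_vocab: dict[str, int] = {}
print(encode(pairs[1].smiles, smiles_vocab, max_len=5))
print(MeanBaseline().fit(pairs).predict("CCO", "MKTAYIAK"))
```

A learned model would replace MeanBaseline with something that actually uses the encoded drug and protein inputs, while keeping the same two-inputs-to-one-number shape.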