The diagram above illustrates the end-to-end flow of the SmartLend system:
Client Layer: External applications send loan application data via REST API requests with API key authentication.
API Gateway (FastAPI):
Receives incoming requests at the /predict endpoint
Validates input data against Pydantic schemas (RequestDataModel)
Enforces API key security via the Loan-API-Key header
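The API key check can be sketched independently of any web framework. This is a minimal illustration using only the standard library, not SmartLend's actual gateway code: only the header name `Loan-API-Key` comes from the description above, while the function name `is_authorized` and the way the expected key is supplied are assumptions.

```python
from __future__ import annotations

import hmac

API_KEY_HEADER = "Loan-API-Key"  # header name taken from the description above


def is_authorized(headers: dict[str, str], expected_key: str) -> bool:
    """Return True when the request headers carry the expected API key.

    Hypothetical helper: the real service may rely on FastAPI's own
    security dependencies instead.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    supplied = normalised.get(API_KEY_HEADER.lower())
    if supplied is None:
        return False
    # compare_digest compares in constant time, which avoids leaking
    # information about the key through response timing.
    return hmac.compare_digest(supplied.encode(), expected_key.encode())
```

In a FastAPI service a check like this would typically sit in a dependency that rejects the request before it reaches the /predict handler.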
Data Processing Pipeline:
Raw loan application data passes through the fitted preprocessor.pkl
Handles categorical encoding, numerical scaling, and feature transformation
Outputs processed features aligned with the trained model's expectations
SHAP Feature Alignment: The pipeline outputs exactly the subset of features the model was trained on, which was selected by SHAP importance during the training phase.
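The alignment step amounts to selecting the trained feature subset, in training order, from whatever the preprocessor emits. The sketch below shows that idea with plain dictionaries. The function name `align_features` and the feature names in the usage note are hypothetical, and the real pipeline presumably operates on arrays or DataFrames rather than dicts.

```python
from __future__ import annotations


def align_features(processed: dict[str, float], selected: list[str]) -> list[float]:
    """Return feature values in the exact order the model was trained on.

    `selected` stands in for the SHAP-selected feature list saved at
    training time (an assumption about how that list is stored).
    """
    # Fail loudly if the preprocessor did not produce a required feature,
    # rather than silently feeding the model a misaligned vector.
    missing = [name for name in selected if name not in processed]
    if missing:
        raise KeyError(f"missing features: {missing}")
    # Extra features not used by the model are dropped here.
    return [processed[name] for name in selected]
```

For example, `align_features({"income": 1.0, "age": 2.0, "unused": 3.0}, ["age", "income"])` keeps only the two selected features and reorders them to match the training order.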
Prediction Engine (LightGBM):
Loads the trained lgbm_model.pkl for inference
Computes the default probability and applies a business-configured threshold (default: 0.45)
Classifies loans as DEFAULT or NO DEFAULT
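The thresholding step can be illustrated in a few lines. The 0.45 default and the two labels come from the description above. Whether a probability exactly equal to the threshold counts as DEFAULT is not stated, so the `>=` comparison here is an assumption, as is the function name `classify`.

```python
DEFAULT_THRESHOLD = 0.45  # business-configured default from the description


def classify(probability: float, threshold: float = DEFAULT_THRESHOLD) -> str:
    """Map a predicted default probability to a class label."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability must be in [0, 1], got {probability}")
    # Assumption: the boundary case (probability == threshold) is DEFAULT.
    return "DEFAULT" if probability >= threshold else "NO DEFAULT"
```

Keeping the threshold as a parameter lets the business adjust the trade-off between missed defaults and rejected good loans without retraining the model.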
Explainability Module (SHAP):
Generates SHAP values for each prediction using shap_explainer.pkl
Identifies top 5 risk factors (positive SHAP impact → increases default probability)
Identifies top 5 protective factors (negative SHAP impact → decreases default probability)
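Splitting SHAP values into risk and protective factors is a matter of partitioning by sign and ranking by magnitude. The following sketch assumes the per-prediction SHAP values are already available as a mapping from feature name to value. The function name `top_factors` and the feature names in the usage note are hypothetical.

```python
from __future__ import annotations


def top_factors(
    shap_values: dict[str, float], k: int = 5
) -> tuple[list[tuple[str, float]], list[tuple[str, float]]]:
    """Return (risk_factors, protective_factors), each with at most k entries."""
    # Positive SHAP values push the prediction towards default.
    positive = [(name, value) for name, value in shap_values.items() if value > 0]
    # Negative SHAP values push the prediction away from default.
    negative = [(name, value) for name, value in shap_values.items() if value < 0]
    # Largest positive first; most negative first.
    risk = sorted(positive, key=lambda item: item[1], reverse=True)[:k]
    protective = sorted(negative, key=lambda item: item[1])[:k]
    return risk, protective
```

Calling `top_factors({"late_payments": 0.5, "dti": 0.3, "income": -0.4}, k=2)` returns the two strongest risk factors and the single protective factor. Features with a SHAP value of exactly zero appear in neither list.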
Response Builder:
Constructs a structured JSON response (ResponseDataModel)
Includes prediction result, probability, risk level, and feature explanations
Returns interpretable insights for downstream decision-making
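To make the response shape concrete, here is a standard-library sketch of assembling the JSON payload. The field names are illustrative guesses based on the items listed above and are not the actual ResponseDataModel schema. The risk level is taken as an input because the rule that derives it is not described here.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass


@dataclass
class ExampleResponse:
    """Hypothetical stand-in for ResponseDataModel; field names are assumed."""

    prediction: str
    probability: float
    risk_level: str
    risk_factors: list[dict]
    protective_factors: list[dict]


def build_response(
    prediction: str,
    probability: float,
    risk_level: str,
    risk: list[tuple[str, float]],
    protective: list[tuple[str, float]],
) -> str:
    """Serialise a prediction and its explanations to a JSON string."""
    response = ExampleResponse(
        prediction=prediction,
        probability=round(probability, 4),
        risk_level=risk_level,
        # Each factor becomes a small object so clients need not rely on order.
        risk_factors=[{"feature": n, "shap_value": v} for n, v in risk],
        protective_factors=[{"feature": n, "shap_value": v} for n, v in protective],
    )
    return json.dumps(asdict(response))
```

In the FastAPI service itself, a Pydantic model would normally handle this serialisation and also validate the outgoing payload.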