EXECUTIVE SUMMARY
Developed and compared three ensemble machine learning models to predict credit card attrition for Pinnacle Bank, producing a model that identifies at-risk customers and supports targeted retention campaigns.
Goal:
Build a predictive model that flags customers likely to churn so retention teams can intervene and reduce revenue loss.
Approach:
Conducted exploratory data analysis (EDA) on more than 10,000 customer records and addressed class imbalance with resampling. Trained and evaluated three ensemble classifiers: Random Forest, Gradient Boosting, and XGBoost. The evaluation balanced precision and recall so that the model captures true churners without flagging too many customers who would have stayed.
Outcome:
The XGBoost Classifier performed best, achieving 85% recall and 92% precision for the churn class on the test set, narrowly ahead of Gradient Boosting (84% recall, 92% precision) and Random Forest (82% recall, 91% precision). It gives the retention team an actionable basis for proactive outreach.
THE CHALLENGE
Customer attrition is a persistent challenge for credit card issuers. Because acquiring a new customer is costly, even a small increase in the churn rate has a significant revenue impact. Pinnacle Bank needed a model that could reliably predict which customers were likely to close their accounts so that it could focus retention resources efficiently.
MY APPROACH
1. Data Preparation & Feature Engineering:
- Worked with a structured dataset of over 10,000 customer records containing demographic, transactional, and account activity variables.
- Conducted EDA to identify key churn drivers and visualize relationships within the data.
- Handled the inherent class imbalance using resampling techniques to ensure the models could effectively learn from the minority (churn) class.
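The write-up does not name the resampling method, so the following is only a minimal sketch of one common option: random oversampling of the minority class, written with NumPy. The function name `oversample_minority`, the synthetic data, and the 16% churn rate are illustrative assumptions, not details from the project. Resampling would normally be applied to the training split only, so that the test set keeps its original class balance.

```python
import numpy as np


def oversample_minority(X, y, random_state=0):
    """Randomly duplicate rows of smaller classes until every class matches the largest."""
    rng = np.random.default_rng(random_state)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    keep = []
    for cls, count in zip(classes, counts):
        cls_idx = np.flatnonzero(y == cls)
        # Draw extra rows with replacement (zero extra rows for the largest class).
        extra = rng.choice(cls_idx, size=target - count, replace=True)
        keep.append(np.concatenate([cls_idx, extra]))
    idx = np.concatenate(keep)
    rng.shuffle(idx)
    return X[idx], y[idx]


# Synthetic stand-in for a training split (hypothetical 16% churn rate).
rng = np.random.default_rng(42)
X_train = rng.normal(size=(1000, 5))
y_train = (rng.random(1000) < 0.16).astype(int)

X_bal, y_bal = oversample_minority(X_train, y_train)
print("before:", np.bincount(y_train), "after:", np.bincount(y_bal))
```

Synthetic-sample methods such as SMOTE, or undersampling the majority class, are alternatives with the same goal; which one the project used is not stated here.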
2. Model Design & Training:
- Trained and evaluated three ensemble models: Random Forest, Gradient Boosting, and XGBoost.
- Focused the evaluation on finding the best balance between precision (minimizing false positives) and recall (capturing the highest proportion of true churners).
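As a rough illustration of this step, the sketch below fits two of the three model families with scikit-learn on synthetic data and reports churn-class precision and recall. The dataset, hyperparameters, and variable names are assumptions for illustration; the project's actual features and settings are not given in this write-up. XGBoost is left out so the example depends on scikit-learn only; `xgboost.XGBClassifier` exposes the same `fit`/`predict` interface and could be added to the `models` dictionary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the customer data (class 1 = churn).
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5,
    weights=[0.84, 0.16], random_state=0,
)

# Stratify so both splits keep the same churn rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        # Precision: of the customers flagged as churners, how many really churn.
        "precision": precision_score(y_test, y_pred, pos_label=1, zero_division=0),
        # Recall: of the customers who really churn, how many were flagged.
        "recall": recall_score(y_test, y_pred, pos_label=1, zero_division=0),
    }

for name, scores in results.items():
    print(f"{name}: precision={scores['precision']:.2f}, recall={scores['recall']:.2f}")
```

The numbers printed for synthetic data say nothing about the project's results; the point is only the shape of the comparison loop.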
3. Model Evaluation:
- Compared models on precision, recall, and F1-score to determine the most robust solution.
- Key test set results for the "Churn" class:
  - Random Forest: 82% recall, 91% precision
  - Gradient Boosting: 84% recall, 92% precision
  - XGBoost: 85% recall, 92% precision (best performer)
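The results above list precision and recall but not F1. Because F1 is the harmonic mean of the two, an approximate value can be derived from the rounded figures, as in this small sketch (the function and dictionary names are illustrative):

```python
def f1_from(precision, recall):
    """F1 score: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Churn-class test results as reported above: (recall, precision).
reported = {
    "Random Forest": (0.82, 0.91),
    "Gradient Boosting": (0.84, 0.92),
    "XGBoost": (0.85, 0.92),
}

f1 = {name: f1_from(p, r) for name, (r, p) in reported.items()}

# Print models from highest to lowest derived F1.
for name, score in sorted(f1.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: F1 ~ {score:.3f}")

best = max(f1, key=f1.get)
```

Since the inputs are rounded to whole percentages, the derived F1 values are approximate. The resulting order (XGBoost, then Gradient Boosting, then Random Forest) matches the reported ranking.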
PERFORMANCE & VALIDATION
- Best Model: The XGBoost Classifier delivered the strongest results, with 85% recall and 92% precision on the held-out test set.
- The model identifies 85% of the customers who actually churn, and 92% of the customers it flags are true churners.
- High recall limits the number of at-risk customers the bank misses, which is crucial for effective retention, while high precision keeps outreach from being spent on customers who were not going to leave.
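To make the two percentages concrete, the arithmetic below converts them into counts for a hypothetical group of 1,000 customers who actually churn. The group size is an assumption chosen for round numbers, not a figure from the project.

```python
recall = 0.85     # share of actual churners the model flags
precision = 0.92  # share of flagged customers who actually churn

actual_churners = 1000  # hypothetical group size

caught = recall * actual_churners   # true positives
missed = actual_churners - caught   # false negatives
flagged = caught / precision        # every customer the model flags
false_alarms = flagged - caught     # false positives

print(f"caught: {caught:.0f}, missed: {missed:.0f}")
print(f"flagged in total: {flagged:.0f}, false alarms: {false_alarms:.0f}")
```

Under these assumptions the retention team would contact roughly 924 customers to reach 850 real churners, with about 150 churners going undetected.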
IMPACT & BUSINESS RELEVANCE
- Retention ROI: The model lets Pinnacle Bank intervene before at-risk customers leave, which can reduce attrition and protect long-term revenue.
- Operational Efficiency: Improves the targeting of retention campaigns, focusing resources on customers most likely to leave.
- Scalability: The methodology is generalizable to other financial institutions or subscription-based businesses facing similar churn challenges.
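One way the campaign targeting described under Operational Efficiency could work in practice is to rank customers by predicted churn probability and contact only the top of the list. The sketch below shows the idea with scikit-learn on synthetic data; the model choice, the `campaign_size` of 50, and all variable names are assumptions for illustration rather than the bank's actual process.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the customer data (class 1 = churn).
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5,
    weights=[0.84, 0.16], random_state=0,
)
X_train, X_score, y_train, _ = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Column 1 of predict_proba holds the probability of the churn class.
churn_prob = model.predict_proba(X_score)[:, 1]

# Row indices of the customers with the highest predicted churn risk.
campaign_size = 50  # hypothetical retention budget
top_idx = np.argsort(churn_prob)[::-1][:campaign_size]

print("lowest probability in campaign list:", round(float(churn_prob[top_idx].min()), 3))
```

Ranking by probability lets the campaign size follow the available budget instead of a fixed classification threshold.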