PORTFOLIO

ML: Pinnacle Bank Credit Card Retention Model

Ensemble machine learning model to predict credit card churn with attention to class imbalance and overfitting controls.

Machine Learning, Scikit-learn, Ensemble Learning, Gradient Boosting, Decision Trees, Random Forest, Imblearn, SMOTE, XGBoost


EXECUTIVE SUMMARY

Developed and compared multiple ensemble machine learning models to predict credit card attrition for Pinnacle Bank, delivering a high-recall solution to identify at-risk customers and support targeted retention campaigns.


Goal:

Build a predictive model that flags customers likely to churn so retention teams can intervene and reduce revenue loss.


Approach:

Conducted exploratory data analysis (EDA) on 10,000+ customer records and addressed class imbalance with resampling techniques. Trained and evaluated three ensemble classifiers: Random Forest, Gradient Boosting, and XGBoost. The evaluation prioritized a balance of precision and recall so that flagged customers are mostly true churners and few true churners are missed.


Outcome:

The XGBoost Classifier performed best, achieving 85% recall and 92% precision for the churn class on the test set. It edged out Gradient Boosting (84% recall, 92% precision) and Random Forest (82% recall, 91% precision), and gives retention teams a ranked list of at-risk customers to act on.

THE CHALLENGE

Customer attrition is a persistent challenge for credit card issuers. Even small increases in churn have a significant revenue impact because acquiring replacement customers is costly. Pinnacle Bank required a model that could reliably predict which customers were likely to close their accounts so the bank could focus retention resources efficiently.

MY APPROACH

1. Data Preparation & Feature Engineering:

  • Worked with a structured dataset of over 10,000 customer records containing demographic, transactional, and account activity variables.
  • Conducted EDA to identify key churn drivers and visualize relationships within the data.
  • Handled the inherent class imbalance using resampling techniques to ensure the models could effectively learn from the minority (churn) class.
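
The resampling step can be illustrated with a minimal sketch. The write-up does not say which resampler was used (the project tags mention SMOTE via Imblearn), so this example swaps in plain random oversampling from scikit-learn as a simpler stand-in. The data, the 16% churn rate, and the helper name `oversample_minority` are all hypothetical. The point it shows is ordering: split first, then resample only the training portion.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.utils import resample


def oversample_minority(X, y, minority_label=1, random_state=0):
    """Randomly duplicate minority-class rows until both classes match in size.

    Assumes a binary target. This is random oversampling, not SMOTE.
    """
    X, y = np.asarray(X), np.asarray(y)
    minority = y == minority_label
    n_majority = int((~minority).sum())
    # Sample minority rows with replacement up to the majority count.
    X_min_up, y_min_up = resample(
        X[minority], y[minority],
        replace=True, n_samples=n_majority, random_state=random_state,
    )
    X_bal = np.vstack([X[~minority], X_min_up])
    y_bal = np.concatenate([y[~minority], y_min_up])
    return X_bal, y_bal


# Hypothetical data: 1,000 customers, roughly 16% churners (label 1).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (rng.random(1000) < 0.16).astype(int)

# Split first, then resample only the training portion so the test set
# keeps the original class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
X_train_bal, y_train_bal = oversample_minority(X_train, y_train)
print(np.bincount(y_train), "->", np.bincount(y_train_bal))
```

Resampling before the split would leak duplicated churn rows into the test set and inflate the reported metrics, which is why the test split is left untouched here.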

2. Model Design & Training:

  • Trained and evaluated three ensemble models: Random Forest, Gradient Boosting, and XGBoost.
  • Focused the evaluation on finding the best balance between precision (minimizing false positives) and recall (capturing the highest proportion of true churners).
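
A training-and-comparison loop along these lines might look like the sketch below. It is not the project's actual configuration: the data is synthetic, the hyperparameters are illustrative rather than tuned, and only the two scikit-learn models are included to keep the example dependency-free. `xgboost.XGBClassifier` follows the same fit/predict interface and could be added to the same dictionary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate

# Hypothetical stand-in data with an imbalanced positive ("churn") class.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5,
    weights=[0.84, 0.16], random_state=0,
)

# Illustrative settings only; shallow trees and a capped tree count are
# simple overfitting controls, not tuned values.
models = {
    "Random Forest": RandomForestClassifier(
        n_estimators=50, max_depth=6, class_weight="balanced", random_state=0
    ),
    "Gradient Boosting": GradientBoostingClassifier(
        n_estimators=50, max_depth=3, learning_rate=0.1, random_state=0
    ),
}

# Stratified folds keep the churn ratio the same in every split.
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
results = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=("precision", "recall", "f1"))
    results[name] = {
        metric: scores[f"test_{metric}"].mean()
        for metric in ("precision", "recall", "f1")
    }

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Scoring every model on the same stratified folds keeps the precision and recall comparison like-for-like.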

3. Model Evaluation:

  • Compared models on precision, recall, and F1-score to determine the most robust solution.
  • Key test set results for the "Churn" class:
    • Random Forest: 82% Recall, 91% Precision
    • Gradient Boosting: 84% Recall, 92% Precision
    • XGBoost: 85% Recall, 92% Precision (Best Performer)
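
The F1-scores are not listed, but they can be approximated from the reported precision and recall. The sketch below does that arithmetic; because the inputs are rounded percentages, the outputs are approximate, and the helper name `f1_score_from` is just for illustration.

```python
def f1_score_from(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Churn-class test results as reported above: (recall, precision).
reported = {
    "Random Forest": (0.82, 0.91),
    "Gradient Boosting": (0.84, 0.92),
    "XGBoost": (0.85, 0.92),
}

f1_by_model = {
    name: f1_score_from(precision, recall)
    for name, (recall, precision) in reported.items()
}
best = max(f1_by_model, key=f1_by_model.get)

for name, f1 in f1_by_model.items():
    print(f"{name}: F1 = {f1:.3f}")
print("Highest F1:", best)
```

The three models land within about two F1 points of each other, so the ranking is consistent with the reported recall figures but the margin is small.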

PERFORMANCE & VALIDATION

  • Best Model: The XGBoost Classifier delivered the strongest results, with 85% recall and 92% precision for the churn class on the held-out test set.
  • In practice, the model catches about 85 of every 100 customers who go on to churn, and about 92% of the customers it flags are true churners.
  • High recall limits the risk of missing at-risk customers, which is crucial for effective retention, while high precision keeps outreach to customers who were not going to leave low.
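
To make these rates concrete, the sketch below builds a confusion matrix from made-up counts chosen to roughly match 85% recall and 92% precision. The counts (a 2,000-customer test set with 200 churners) are hypothetical and are not the project's actual test-set sizes.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical test-set counts chosen to roughly match the reported rates.
tp, fn, fp, tn = 170, 30, 15, 1785

# Build label arrays that realize those counts (1 = churn, 0 = retained).
y_true = np.array([1] * tp + [1] * fn + [0] * fp + [0] * tn)
y_pred = np.array([1] * tp + [0] * fn + [1] * fp + [0] * tn)

recall = recall_score(y_true, y_pred)        # share of churners that were flagged
precision = precision_score(y_true, y_pred)  # share of flagged customers who churned

print(confusion_matrix(y_true, y_pred))  # rows: actual, columns: predicted
print(f"recall={recall:.2f}, precision={precision:.2f}")
print(f"missed churners: {fn}, unnecessary contacts: {fp}")
```

Under these assumed counts, the retention team would contact 185 customers, reach 170 real churners, and miss 30.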

IMPACT & BUSINESS RELEVANCE

  • Retention ROI: The model enables Pinnacle Bank to proactively intervene, reducing customer attrition and protecting long-term revenue.
  • Operational Efficiency: Improves the targeting of retention campaigns, focusing resources on customers most likely to leave.
  • Scalability: The methodology is generalizable to other financial institutions or subscription-based businesses facing similar churn challenges.