Fairness and Bias in Credit Risk Models
Part 4 of the Modern Model Management Series
Ensuring that credit risk models are fair and non-discriminatory presents an ongoing challenge, and an opportunity, for lenders. With heightened regulatory attention and social awareness, lenders recognise the importance of avoiding models that may exhibit undesirable bias, placing certain demographic groups and those with protected characteristics at a systematic disadvantage in their access to credit. The challenge is multifaceted: bias can be subtle, deeply embedded in historical data, and difficult to detect. Also problematic is that well-intentioned efforts to create "fair" models can sometimes inadvertently introduce new forms of bias or reduce model performance in ways that ultimately harm the very populations they were designed to protect.
The Hidden Sources of Bias
Bias in credit risk models rarely stems from malicious intent. Instead, it emerges from historical patterns, incomplete data, and flawed assumptions that seemed reasonable during model development. Credit risk models trained on historical lending data inevitably reflect past discriminatory practices. Even when these practices were legal, they create patterns that models learn to perpetuate. Geographic bias exemplifies this challenge: models may learn that certain postcodes are "high risk" without recognising that this correlation stems from historical "redlining" rather than from genuine creditworthiness factors.
Perhaps more problematic is proxy discrimination, where models achieve biased outcomes through seemingly neutral variables that correlate with protected characteristics. The challenge is that these variables may also have genuine predictive value, creating a difficult trade-off between predictive power and discriminatory impact.
Representation Gaps
Credit risk models can suffer from significant representation gaps where certain populations are underrepresented in the training (development) data. This leads to models that perform poorly for these groups - not because of inherent differences in creditworthiness, but because there is insufficient data from which to learn appropriate patterns. When deployed to serve underrepresented populations, these models' predictions may not accurately reflect actual risk profiles, leading to inappropriate risk assessments.
The Fairness Paradox
The quest for fairness has spawned numerous mathematical definitions, each capturing different aspects of what is considered "fair". However, research has demonstrated that many fairness criteria are mathematically incompatible - satisfying one necessarily violates others.
Statistical parity requires equal approval rates across demographic groups, which seems intuitively fair but can be problematic if genuine risk differences exist between groups. Equalized odds allows approval rates to differ where risk profiles justify it, but requires that creditworthy applicants - and, equally, non-creditworthy applicants - have the same chance of approval in every group.
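To make these group-level criteria concrete, the sketch below computes approval rates and equalized-odds components from a small, entirely hypothetical set of scored applications; the column names and figures are illustrative assumptions rather than real portfolio data.

```python
import pandas as pd

# Hypothetical scored applications: group label, approval decision, observed outcome.
# Column names and values are illustrative, not drawn from any real portfolio.
apps = pd.DataFrame({
    "group":     ["A", "A", "A", "A", "B", "B", "B", "B"],
    "approved":  [1,   1,   0,   1,   1,   0,   0,   1],
    "defaulted": [0,   1,   0,   0,   0,   0,   1,   0],
})

# Statistical parity: approval rates should be (approximately) equal across groups.
approval_rates = apps.groupby("group")["approved"].mean()
print("Approval rate by group:\n", approval_rates)

# Equalized odds: true positive and false positive rates should match across groups,
# where a "positive" decision is an approval and a "good" applicant is a non-defaulter.
def tpr_fpr(df):
    good = df[df["defaulted"] == 0]
    bad = df[df["defaulted"] == 1]
    return pd.Series({
        "TPR (approved | good)": good["approved"].mean(),
        "FPR (approved | bad)":  bad["approved"].mean(),
    })

print("\nEqualized-odds components by group:\n",
      apps.groupby("group")[["approved", "defaulted"]].apply(tpr_fpr))
```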
Individual fairness suggests similar individuals should receive similar treatment regardless of group membership, while group fairness focuses on equitable treatment across demographic groups but may allow for individual disparities within groups.
The impossibility of satisfying every fairness criterion at once doesn't mean abandoning the goal, but rather being explicit about trade-offs and making conscious choices about which aspects to prioritise. Organisations must move beyond seeking perfect solutions to making informed decisions about acceptable compromises.
Possible Solutions
Organisations have several approaches for bias mitigation, from data pre-processing to post-modelling adjustments. Each has trade-offs in effectiveness, interpretability, and model performance impact.
Pre-processing Approaches attempt to remove bias from the training data before model development, for example by re-weighting samples, removing biased features, or applying adversarial debiasing techniques. While standard modelling techniques can be used once the data has been adjusted, aggressive pre-processing may significantly reduce model performance.
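As a simple illustration of the re-weighting idea, the sketch below derives sample weights that make group membership and the outcome statistically independent in the training data. The column names and records are hypothetical, and most standard estimators can accept such weights via a `sample_weight` argument.

```python
import pandas as pd

def reweighting_weights(df, group_col, label_col):
    """Weights that make group membership and outcome independent in the training data.

    Each record in (group g, label y) receives weight P(g) * P(y) / P(g, y),
    so over-represented combinations are down-weighted and vice versa.
    """
    p_group = df[group_col].value_counts(normalize=True)
    p_label = df[label_col].value_counts(normalize=True)
    p_joint = df.groupby([group_col, label_col]).size() / len(df)

    def weight(row):
        g, y = row[group_col], row[label_col]
        return (p_group[g] * p_label[y]) / p_joint[(g, y)]

    return df.apply(weight, axis=1)

# Illustrative training data (hypothetical columns and values).
train = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "B", "B", "B"],
    "good":  [1,   1,   0,   1,   0,   0,   0,   1],
})
train["w"] = reweighting_weights(train, "group", "good")
print(train)
# The weights can then be passed to most estimators,
# e.g. model.fit(X, y, sample_weight=train["w"]).
```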
In-Processing Methods modify model training itself to incorporate fairness constraints, for example through fairness-aware regularisation or multi-objective optimisation that balances accuracy against fairness. These can achieve better trade-offs than pre-processing but require specialised techniques.
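One minimal way to picture fairness-aware regularisation is to add a penalty on the gap in mean predicted score between groups to an ordinary logistic-regression loss, and fit the model by gradient descent. The sketch below uses simulated data, an illustrative penalty weight `lam`, and a deliberately simple proxy feature; it is intended only to show the mechanics.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: one genuinely predictive feature and one "neutral" feature
# that is correlated with group membership (a proxy). All values are illustrative.
n = 500
group = rng.integers(0, 2, size=n)
x0 = rng.normal(size=n)                       # genuinely predictive feature
x1 = rng.normal(size=n) + 0.8 * group         # proxy feature correlated with group
X = np.column_stack([x0, x1])
y = (x0 + 0.5 * group + rng.normal(scale=0.5, size=n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_grad(w, lam):
    """Logistic loss plus a statistical-parity penalty on mean scores per group."""
    p = sigmoid(X @ w)
    log_loss = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

    gap = p[group == 1].mean() - p[group == 0].mean()        # parity gap
    grad_ll = X.T @ (p - y) / n                              # logistic-loss gradient
    dp = p * (1 - p)                                         # d p / d (X @ w)
    grad_gap = (X[group == 1] * dp[group == 1][:, None]).mean(axis=0) \
             - (X[group == 0] * dp[group == 0][:, None]).mean(axis=0)

    return log_loss + lam * gap ** 2, grad_ll + lam * 2 * gap * grad_gap

w = np.zeros(2)
for _ in range(2000):                                        # plain gradient descent
    _, g = loss_and_grad(w, lam=5.0)
    w -= 0.1 * g

p = sigmoid(X @ w)
print("Parity gap after training:", p[group == 1].mean() - p[group == 0].mean())
```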
Post-Processing Techniques adjust model outputs through calibration or threshold optimisation. While they can be applied to existing models without re-training, they may not address the underlying bias.
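A minimal sketch of threshold optimisation, assuming scores from an already-fitted model and hypothetical group labels, might set a separate cut-off per group so that each group is approved at the same (illustrative) target rate:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical model scores (higher = lower risk) and group labels, for illustration only.
scores = np.concatenate([rng.normal(0.55, 0.15, 300), rng.normal(0.45, 0.15, 300)])
group = np.array(["A"] * 300 + ["B"] * 300)

target_approval_rate = 0.40  # desired approval rate in every group (illustrative)

# Group-specific thresholds: the score quantile that approves the target fraction.
thresholds = {
    g: np.quantile(scores[group == g], 1 - target_approval_rate)
    for g in np.unique(group)
}

per_record_threshold = np.array([thresholds[g] for g in group])
approved = scores >= per_record_threshold

for g in np.unique(group):
    print(g, "threshold:", round(thresholds[g], 3),
          "approval rate:", round(approved[group == g].mean(), 3))
```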
Conclusion
Addressing bias in credit risk models represents both an ethical imperative and a business opportunity. Organisations that view fairness as merely a compliance burden miss opportunities to build more robust models, serve new markets, and establish competitive advantages.
Fairness should be recognised as a continuous improvement journey rather than a destination. Organisations that embed fairness considerations into their modelling and risk culture will build credit risk systems that are not just fairer, but more robust, inclusive, and capable of serving all of society.
Paragon Business Solutions is currently partnering with the University of Southampton to explore and research these possible solutions further. At the Edinburgh Credit Scoring conference in August, we will present and discuss insights gained thus far from an ongoing project in which we set out to explore a series of different options for incorporating fairness into the various stages of credit scoring. In so doing, we seek to consider the role of reject inference and data debiasing, and trial different solutions on real-life application scoring data. Our goal is to formulate suggestions for analyses to include in the various steps of scorecard development and validation.
This article is part of a series exploring modern approaches to credit risk management. For more insights on how these principles can be applied in your organisation, we'd be happy to discuss your specific challenges and objectives. Please get in touch.