The Discovery of Discrimination

Finding Hidden Biases in Decisions

Based on the paper by Dino Pedreschi, Salvatore Ruggieri, and Franco Turini

Presented by:

Youssef Amjahdi & Abdelmounaim Sadir

GISMA University of Applied Sciences

For:

Professor Dr. Reza Babaei

Finding Direct Bias

Direct bias means treating someone unfairly because of who they are, such as their gender or origin. This often happens in decisions made by computers, like giving out loans or job offers, because these systems can accidentally learn unfair patterns from past data.

  • What it is: When a decision rule clearly shows a protected group is treated worse.
  • Why it matters: Even if not intended, the outcome is unfair.
  • How we find it: We look for specific rules that directly link a protected group to a negative outcome (see the sketch below).
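
A minimal sketch of this kind of check, assuming a small pandas DataFrame with invented columns `gender` and `loan_denied` (none of these names or numbers come from the paper): compare how often the protected group receives the negative outcome with how often everyone does.

```python
import pandas as pd

# Hypothetical decision records (columns and values invented for illustration).
df = pd.DataFrame({
    "gender":      ["F", "F", "F", "F", "M", "M", "M", "M", "M", "M"],
    "loan_denied": [  1,   1,   1,   0,   0,   1,   0,   0,   0,   1],
})

# Denial rate for the protected group vs. the overall denial rate.
overall_rate   = df["loan_denied"].mean()
protected_rate = df.loc[df["gender"] == "F", "loan_denied"].mean()

# If the ratio is well above 1, the rule "gender = F -> denied" treats the
# protected group worse than the population as a whole.
print(f"overall denial rate:   {overall_rate:.2f}")
print(f"protected denial rate: {protected_rate:.2f}")
print(f"ratio (a rough elift): {protected_rate / overall_rate:.2f}")
```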

Finding Indirect Bias

Indirect bias is hidden unfairness. A rule might seem fair, but it still harms a specific group more than others, even if it never mentions a protected characteristic directly. This is tricky because the bias isn't obvious. A common example is 'redlining'.

  • What it is: A neutral rule (like based on address) that leads to unfair results for a protected group.
  • Why it matters: Bias can sneak in through indirect connections, even if no one means to be unfair.
  • How we find it: We link seemingly neutral information to protected groups using other data (like census info) to reveal the hidden bias.

How "Redlining" Works (Simplified)

  • Fair-looking rule: "People from ZIP 1234 are denied jobs" (a 99% denial rate).
  • Plus local info: "80% of people in ZIP 1234 are Black" (census data provides the link).
  • Hidden bias found: the "fair" rule actually harms Black job seekers, who are more than 10x as likely to be denied.

This example shows how we combine decision data with extra information (like census data) to reveal hidden bias. Even if the original decision records never mention race, we can still see whether a rule leads to unfair results for a specific group; the sketch below works through these same numbers.
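
A minimal sketch of that linking step, with invented ZIP codes, denial counts, and census percentages (and the simplifying assumption that both ZIP areas have equal populations): join the decision records, which never mention race, with census-style background data and estimate the denial rate Black residents face.

```python
import pandas as pd

# Hypothetical decision records: no race column, only ZIP codes (all numbers invented).
decisions = pd.DataFrame({
    "zip":    ["1234"] * 100 + ["5678"] * 100,
    "denied": [1] * 99 + [0] * 1 + [1] * 9 + [0] * 91,
})

# Census-style background knowledge: share of Black residents per ZIP.
census = pd.DataFrame({
    "zip":       ["1234", "5678"],
    "pct_black": [0.80,   0.10],
})

# Link the "neutral" ZIP attribute to the protected group, then estimate the
# denial rate experienced by Black residents (equal ZIP populations assumed).
per_zip = decisions.groupby("zip")["denied"].mean().reset_index().merge(census, on="zip")
black_rate   = (per_zip["denied"] * per_zip["pct_black"]).sum() / per_zip["pct_black"].sum()
overall_rate = decisions["denied"].mean()

print(f"overall denial rate:            {overall_rate:.2f}")   # 0.54
print(f"estimated rate for Black group: {black_rate:.2f}")     # 0.89
```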

More About Bias & Fairness

What If an Organization Says "No Bias"?

If we find a rule that seems unfair, an organization can sometimes explain why. This is called "argumentation."

  • Example: Fewer women might be hired for a specific job. But if that job truly needs heavy lifting that few women can do, then the rule might be fair.
  • Called a "business need": It's a real job requirement that explains the outcome, not an unfair bias.

What are "Affirmative Actions"?

We can also check for "affirmative actions." These are special steps taken to help groups that faced unfairness in the past.

  • Goal: To balance past unfairness and create more equal chances.
  • How we check: We look for rules where a protected group *receives* a benefit more often.
  • Why: This helps us see if these helpful programs are doing their job and making things fairer.

Paper Overview: "The Discovery of Discrimination"

Research Context

  • Authors: Dino Pedreschi, Salvatore Ruggieri, Franco Turini
  • Institution: University of Pisa, Italy
  • Published: 2008, KDD Conference
  • Citations: 1000+ (highly influential)

Key Contributions

  • First systematic approach to discrimination discovery
  • Introduction of the α (alpha) and β (beta) measures
  • Extended lift (elift) metric for quantifying bias (sketched below)
  • A framework for both direct and indirect discrimination
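
A hedged sketch of the elift idea: following the paper, the extended lift of a rule is its confidence when the protected condition is added to the premise, divided by its confidence without that condition, and a rule whose elift reaches a chosen threshold α is flagged as α-discriminatory. The numbers below are invented for illustration.

```python
def elift(conf_with_protected: float, conf_without_protected: float) -> float:
    """Extended lift of a rule A,B -> C: conf(A,B -> C) / conf(B -> C),
    where A is the protected (potentially discriminatory) condition and B the context."""
    return conf_with_protected / conf_without_protected

# Invented example numbers: 75% of women applying in a given city are denied,
# versus 40% of all applicants in that same city.
ratio = elift(0.75, 0.40)      # 1.875
alpha = 1.5                    # threshold chosen by the analyst, for illustration
print("alpha-discriminatory" if ratio >= alpha else "alpha-protective")
```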

Why This Research Matters Today

2008 (Original Context):

  • Data mining was becoming mainstream
  • Legal frameworks addressing digital discrimination were emerging
  • Need for automated bias detection tools

2024/2025 (Current Relevance):

  • AI/ML bias in hiring, lending, healthcare
  • EU AI Act and algorithmic accountability
  • Explainable AI and fairness requirements

Methodology Overview

  1. Classification Rules: extract decision rules from the data.
  2. Bias Measurement: calculate the α, β, and elift metrics.
  3. Discrimination Detection: identify problematic patterns (a sketch of this pipeline follows).
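
A compact sketch of the three steps on toy data, assuming a hypothetical table of past decisions (all column names, values, and the α threshold are invented): enumerate simple candidate rule contexts, measure each with an elift-style confidence ratio, and flag those above the threshold.

```python
import pandas as pd

# Hypothetical table of past decisions (all names and values invented):
# "denied" is the negative outcome, "gender" is the protected attribute.
df = pd.DataFrame({
    "gender": ["F", "F", "F", "F", "M", "M", "M", "M"],
    "city":   ["X", "X", "Y", "Y", "X", "X", "Y", "Y"],
    "job":    ["a", "b", "a", "b", "a", "b", "a", "b"],
    "denied": [1,   1,   1,   0,   0,   0,   1,   0],
})

protected = ("gender", "F")
contexts  = ["city", "job"]   # attributes allowed in the rule context B
alpha     = 1.5               # illustration threshold

flagged = []
for col in contexts:                        # Step 1: enumerate candidate contexts B
    for val in df[col].unique():
        ctx = df[df[col] == val]            # records matching the context B
        sub = ctx[ctx[protected[0]] == protected[1]]
        if len(sub) == 0 or ctx["denied"].mean() == 0:
            continue
        ratio = sub["denied"].mean() / ctx["denied"].mean()   # Step 2: elift-style ratio
        if ratio >= alpha:                  # Step 3: flag problematic patterns
            flagged.append((f"{col}={val}", round(ratio, 2)))

print(flagged)    # [('city=X', 2.0), ('job=b', 2.0)] on this toy data
```

In the paper itself, step 1 uses association-rule mining to extract classification rules from frequent itemsets; the simple group-by above only stands in for that step.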

Real-World Applications

🏦 Financial Services

Credit Scoring:

Detecting bias against protected groups in loan approvals, interest rate setting, and credit limit decisions.

Insurance:

Identifying unfair pricing based on demographics, redlining in coverage areas.

Example: A bank's algorithm systematically denies loans to applicants from certain ZIP codes, indirectly discriminating against minorities.

👔 Human Resources

Recruitment:

AI resume screening tools showing bias against female names, ethnic minorities, or certain universities.

Promotion Decisions:

Performance evaluation systems that systematically underrate certain groups.

Example: Amazon's AI recruiting tool showed bias against women by downgrading resumes with words like "women's" (e.g., "women's chess club captain").

🏥 Healthcare

Treatment Recommendations:

Clinical decision support systems showing racial or gender bias in treatment suggestions.

Resource Allocation:

Hospital admission algorithms that may discriminate based on socioeconomic status.

Example: A widely-used healthcare algorithm was found to systematically recommend lower levels of care for Black patients than for white patients with the same health conditions.

⚖️ Criminal Justice

Risk Assessment:

COMPAS and similar tools used for bail, sentencing, and parole decisions showing racial bias.

Predictive Policing:

Algorithms that direct police resources based on historical data with embedded biases.

Example: ProPublica's investigation found that COMPAS incorrectly flagged Black defendants as future criminals at almost twice the rate of white defendants.

🔧 Technical Implementation Challenges

Data Challenges:

  • Incomplete protected attribute data
  • Historical bias embedded in training data
  • Proxy variables that correlate with protected attributes

Algorithmic Challenges:

  • Trade-offs between different fairness metrics (see the sketch below)
  • Intersectionality (multiple protected attributes)
  • Temporal bias (discrimination patterns changing over time)
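
A small illustration of the fairness-metric trade-off mentioned above, on invented predictions: two common criteria, demographic parity (equal selection rates) and equal opportunity (equal true-positive rates), can disagree on the same model output.

```python
import numpy as np

# Invented predictions for two groups, A and B.
group  = np.array(["A"] * 6 + ["B"] * 6)
y_true = np.array([1, 1, 1, 0, 0, 0,   1, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 1, 0, 0, 0,   1, 1, 0, 0, 0, 0])

def selection_rate(mask):
    """Share of the group that receives the positive prediction."""
    return y_pred[mask].mean()

def true_positive_rate(mask):
    """Share of the group's true positives that the model actually catches."""
    positives = mask & (y_true == 1)
    return y_pred[positives].mean()

a, b = group == "A", group == "B"
print("selection rates:     ", selection_rate(a), selection_rate(b))          # 0.50 vs ~0.33
print("true positive rates: ", true_positive_rate(a), true_positive_rate(b))  # 1.0 vs 1.0
# The criteria disagree: true positive rates are equal (satisfying equal
# opportunity) while selection rates are not (violating demographic parity).
```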

Conclusion & Future Directions

✅ Key Takeaways

  • Systematic Detection: Discrimination can be automatically detected using data mining techniques
  • Hidden Bias: Indirect discrimination is often more prevalent and harder to detect than direct bias
  • Quantifiable Metrics: Bias can be measured objectively using α, β, and elift measures
  • Legal Compliance: These tools help organizations meet anti-discrimination regulations

🚀 Future Research Directions

  • Intersectionality: Better handling of multiple protected attributes simultaneously
  • Causal Inference: Moving beyond correlation to understand causal mechanisms of bias
  • Dynamic Fairness: Addressing how bias evolves over time and contexts
  • Counterfactual Fairness: What would have happened in a fair world?

⚖️ Current Legal & Regulatory Landscape

🇪🇺 European Union

  • EU AI Act (2024)
  • GDPR algorithmic decision-making rights
  • Digital Services Act

🇺🇸 United States

  • Equal Credit Opportunity Act
  • Fair Housing Act
  • State-level AI bias auditing laws

🌍 Global Trends

  • Algorithmic accountability frameworks
  • Mandatory bias testing
  • Right to explanation

🛠️ For Data Scientists & Researchers

Best Practices:

  • Always test for bias before model deployment
  • Use multiple fairness metrics, not just accuracy
  • Document dataset biases and limitations
  • Implement continuous monitoring for bias drift
  • Involve domain experts and affected communities

Tools & Libraries:

  • Python: Fairlearn, AIF360, Themis-ml (Fairlearn usage sketched below)
  • R: fairness, fairmodels
  • Platforms: Google What-If Tool, IBM AI Fairness 360
  • Open Source: Aequitas, FairTest
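
As a hedged usage sketch for one of the listed tools, Fairlearn (API names as in recent Fairlearn releases; check the docs for your installed version), on invented labels and predictions:

```python
import numpy as np
from sklearn.metrics import accuracy_score
from fairlearn.metrics import MetricFrame, selection_rate, demographic_parity_difference

# Invented labels, predictions, and a sensitive feature per sample.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 0, 1, 0])
sex    = np.array(["F", "F", "F", "F", "M", "M", "M", "M"])

# Per-group accuracy and selection rate in one table.
mf = MetricFrame(
    metrics={"accuracy": accuracy_score, "selection_rate": selection_rate},
    y_true=y_true, y_pred=y_pred, sensitive_features=sex,
)
print(mf.by_group)       # metrics broken down by group
print(mf.difference())   # largest between-group gap for each metric

# Single-number demographic parity gap across groups.
print(demographic_parity_difference(y_true, y_pred, sensitive_features=sex))
```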

The Future of Fair AI

Discrimination discovery is not just about detecting bias—it's about building a more equitable future through responsible data science.

"The price of freedom is eternal vigilance" - and in the age of AI, vigilance means constant monitoring for algorithmic bias.