The emergence of algorithmic decision-making systems has become a hallmark of the 21st century, influencing both public and private sectors. These algorithms are utilized for a variety of purposes, including assessing loan eligibility, screening job applications, shaping policing strategies, and distributing social welfare benefits, thereby automating and enhancing complex processes. Because these systems leverage extensive datasets and sophisticated mathematical models, they are often regarded as impartial and efficient tools. Nonetheless, the design and data underlying these algorithms can introduce the risk of algorithmic bias (O'Neil, 2016). ‘Algorithmic bias is defined as the systematic and unjust outcomes produced by computer algorithms, stemming from flawed or biased training data or the design choices made by their developers’ (O'Neil, 2016).
For example, in law enforcement, predictive algorithms based on historical crime data—which may already reflect biased policing practices—can result in the disproportionate targeting of marginalized communities (Danaher, 2019). In the realm of recruitment, algorithms trained on datasets that lack diverse representation may unintentionally disadvantage qualified candidates from underrepresented backgrounds. Similarly, credit scoring algorithms that utilize data reflecting existing economic inequalities can reinforce cycles of financial exclusion. Additionally, social welfare algorithms that allocate resources may exhibit bias against certain demographic groups if the foundational data or design emphasizes specific criteria that are not universally accessible. In India, these issues are further complicated by the country's intricate social dynamics, including caste, religious, and linguistic diversity, which can be easily mirrored and intensified by biased algorithms. The growing dependence on these systems calls for a thorough evaluation of their potential to worsen existing inequalities.
What does marginalized community mean?
‘A marginalized community refers to a group of people who experience social exclusion, discrimination, and limited access to resources and opportunities due to unequal power relationships across economic, political, social, and cultural dimensions’ (National Collaborating Centre for Determinants of Health). ‘These groups are often pushed to the periphery of society and face systemic disadvantages based on characteristics such as race, ethnicity, gender, sexual orientation, socioeconomic status, disability, or religion’ (ACTEC).
‘Marginalized communities facing algorithmic bias are groups more likely to experience unfair outcomes from algorithms due to underrepresentation, reflected societal biases in training data, or a failure to consider their specific needs’ (O'Neil, 2016; Eubanks, 2018; Benjamin, 2019).
What is Algorithmic Bias?
Algorithmic bias denotes the systematic and unjust distortion of results that favors or disadvantages specific individuals or groups. It is essential to recognize that this bias typically does not arise from an intentional desire to discriminate. Rather, it emerges from a combination of factors inherent in the AI development process.
There are several primary avenues through which bias can infiltrate AI systems. One major source is biased training data. Machine learning algorithms derive patterns and relationships from the data provided to them. If this data reflects existing societal biases or historical discrimination, the algorithm is likely to perpetuate and even exacerbate these biases. This issue is particularly pronounced in the Indian context. For example, datasets that inadequately represent marginalized groups, such as Dalits or Adivasis, can result in algorithms that perform poorly or discriminate against these communities. Additionally, language datasets that disproportionately favor certain languages and dialects may lead to biased outcomes in natural language processing applications.
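A minimal, synthetic sketch of this mechanism is shown below. It assumes scikit-learn and NumPy are available; the data-generating process, group sizes, and numbers are hypothetical, chosen only to show how under-representation in training data can translate into unequal error rates.

```python
# Illustrative sketch (synthetic data, hypothetical parameters): a model
# trained mostly on one group tends to make more errors on an
# under-represented group whose feature-to-label relationship differs.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_group(n, shift):
    """Generate a synthetic group; `shift` controls how its feature-to-label
    relationship differs from the majority group's."""
    X = rng.normal(loc=shift, scale=1.0, size=(n, 2))
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.5, n) > shift).astype(int)
    return X, y

# The majority group dominates the training set; the minority is scarce.
X_maj, y_maj = make_group(5000, shift=0.0)
X_min, y_min = make_group(200, shift=1.5)   # only 200 training examples

model = LogisticRegression().fit(
    np.vstack([X_maj, X_min]), np.concatenate([y_maj, y_min])
)

# Evaluate on fresh samples from each group.
for name, shift in [("majority", 0.0), ("minority", 1.5)]:
    X_test, y_test = make_group(2000, shift)
    print(f"{name:8s} accuracy: {model.score(X_test, y_test):.2f}")
# The minority group's accuracy is typically noticeably lower, because the
# decision boundary is fitted almost entirely to the majority's pattern.
```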
Another pathway for bias introduction is through the design of the algorithms themselves. The decisions made by developers regarding feature selection, variable weighting, and algorithm choice can unintentionally embed bias. For instance, if an algorithm intended to assess loan eligibility disproportionately prioritizes factors like land ownership or family background, it may unfairly disadvantage individuals from marginalized communities who have historically lacked access to such resources.
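The design choice described above can be made concrete with a small, hypothetical scoring sketch. The weights, approval cut-off, and land-ownership rates below are assumptions for illustration, not figures from any real lender.

```python
# Illustrative sketch (hypothetical weights and synthetic applicants): how
# heavily weighting land ownership in a loan score can disadvantage a group
# that historically lacks access to land, even when repayment ability is
# identical by construction.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Two synthetic applicant pools with the same repayment-ability distribution
# but very different assumed rates of land ownership.
repayment_ability = rng.normal(0.6, 0.15, size=(2, n)).clip(0, 1)
land_ownership = np.stack([
    rng.random(n) < 0.70,   # group A: 70% own land (assumed)
    rng.random(n) < 0.15,   # group B: 15% own land (assumed)
]).astype(float)

def loan_score(ability, owns_land, w_land):
    """Weighted score; w_land controls how much land ownership matters."""
    return (1 - w_land) * ability + w_land * owns_land

for w_land in (0.0, 0.6):
    approved = loan_score(repayment_ability, land_ownership, w_land) > 0.55
    rate_a, rate_b = approved.mean(axis=1)   # hypothetical cut-off of 0.55
    print(f"w_land={w_land}: approval A={rate_a:.0%}, B={rate_b:.0%}")
# When land ownership is ignored, the two groups are approved at nearly
# identical rates; when it dominates the score, group B's approvals collapse
# despite identical repayment ability.
```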
Thirdly, feedback loops can intensify algorithmic bias. When a biased algorithm is implemented, its outputs can generate new data that reinforces the initial bias, resulting in a self-sustaining cycle of discrimination. ‘For instance, a biased policing algorithm may result in heightened surveillance and arrests in specific neighborhoods, which subsequently strengthens the algorithm's assumption that those areas are prone to high crime rates’ (O'Neil, 2017, p. 84; Richardson, Schultz and Crawford, 2019, pp. 195–196). This issue is particularly concerning in India, where certain communities are disproportionately represented in crime statistics due to systemic issues.
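A toy simulation can make this feedback loop visible. Everything below is hypothetical: both areas have the same underlying offence rate, incidents are only recorded where patrols are sent, and patrols are sent wherever the record count is highest.

```python
# Illustrative sketch (toy simulation, hypothetical numbers): a
# predictive-policing feedback loop in which recorded data, not underlying
# crime, drives where patrols go next.
import numpy as np

rng = np.random.default_rng(2)

true_rate = 5.0                      # same expected offences per patrol-day
records = np.array([55.0, 45.0])     # area 0 starts slightly over-recorded

for day in range(1, 501):
    target = int(np.argmax(records))      # "predict" the hot spot
    observed = rng.poisson(true_rate)     # offences are found where we look
    records[target] += observed           # only those offences get recorded
    if day % 100 == 0:
        share = records / records.sum()
        print(f"day {day}: record share = {share.round(2)}")
# Area 0's small initial lead means it is patrolled every day; its records
# keep growing while area 1's stay frozen, so the "high-crime" label becomes
# self-fulfilling even though the underlying rates are identical.
```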
Various forms of bias have been recognized in machine learning (a brief illustrative sketch follows this list):
- Statistical bias occurs when the training data fails to accurately represent the real-world population, leading to distorted probabilities and predictions.
- Label bias emerges when the labels used for training the algorithm are biased themselves, reflecting pre-existing prejudices or inaccuracies.
- Measurement bias arises when the techniques employed to gather and assess data systematically favor certain groups over others.
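The sketch below (synthetic data, hypothetical group names and population shares) shows two quick checks that surface these kinds of bias: a representation check against assumed population shares, and a comparison of decision rates across groups using the common 80% screening heuristic, which is a rule of thumb rather than any legal standard in India.

```python
# Illustrative sketch: (1) does the training set mirror the population?
# (2) do positive decisions differ sharply across groups?
from collections import Counter

# (1) Representation check against assumed population shares.
population_share = {"group_A": 0.55, "group_B": 0.30, "group_C": 0.15}
training_groups = ["group_A"] * 820 + ["group_B"] * 150 + ["group_C"] * 30

counts = Counter(training_groups)
total = sum(counts.values())
for g, pop in population_share.items():
    data_share = counts[g] / total
    flag = "  <-- under-represented" if data_share < 0.8 * pop else ""
    print(f"{g}: population {pop:.0%}, dataset {data_share:.0%}{flag}")

# (2) Compare positive-decision rates across groups (hypothetical decisions).
decisions = {"group_A": [1] * 70 + [0] * 30, "group_B": [1] * 40 + [0] * 60}
rates = {g: sum(d) / len(d) for g, d in decisions.items()}
ratio = min(rates.values()) / max(rates.values())
print(f"selection rates: {rates}, ratio = {ratio:.2f}"
      f" ({'below' if ratio < 0.8 else 'meets'} the 80% screening heuristic)")
```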
‘In India, the tech sector is predominantly comprised of upper-caste, urban, English-educated men, with women, religious minorities, and marginalised caste groups significantly underrepresented, and people from rural areas largely excluded’ (D’Cruz and Noronha, 2016; Shah, 2017; Upadhya, 2009).
Impacts in the Real World
The effects of algorithmic bias extend beyond theoretical discussions; they have tangible repercussions for individuals and communities. Numerous case studies demonstrate how biased algorithms can sustain and intensify existing inequalities, particularly impacting marginalized populations. In India, these biases can emerge in ways that mirror and worsen the country's distinct social dynamics.
a. Criminal Justice
In the United States, the COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) algorithm has been shown to exhibit racial bias, disproportionately labeling Black defendants as higher risk for recidivism than white defendants with similar histories. This leads to higher false positive rates for Black individuals and higher false negative rates for white individuals. ‘The bias likely stems from biased training data and proxy variables reflecting systemic inequalities. Consequently, this can result in harsher sentencing and perpetuate racial disparities in the criminal justice system.’
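The disparity described above is usually expressed through group-wise false positive and false negative rates. The sketch below uses hypothetical confusion-matrix counts, not the actual COMPAS data, to show how the metric is computed.

```python
# Illustrative sketch (hypothetical counts): measuring the error-rate
# asymmetry described above for two groups labelled "high risk" or not.
def error_rates(tp, fp, tn, fn):
    """FPR = FP / (FP + TN); FNR = FN / (FN + TP)."""
    return fp / (fp + tn), fn / (fn + tp)

# Assumed counts per group: (true positives, false positives,
# true negatives, false negatives) for the "high risk" label.
groups = {
    "group_1": (300, 450, 550, 200),  # many non-reoffenders flagged high risk
    "group_2": (280, 220, 780, 320),  # many reoffenders flagged low risk
}

for name, counts in groups.items():
    fpr, fnr = error_rates(*counts)
    print(f"{name}: false positive rate = {fpr:.0%}, "
          f"false negative rate = {fnr:.0%}")
# group_1 bears far more false "high risk" labels, while group_2 receives
# more false "low risk" ones, the asymmetry described in the text.
```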
Although the COMPAS algorithm is specific to the United States, similar challenges are present in India's criminal justice framework. The implementation of predictive policing algorithms could easily reflect and exacerbate existing biases within law enforcement. For instance, if historical crime data disproportionately captures arrests in specific neighborhoods or among particular caste groups, algorithms developed from this data are likely to continue this bias. This poses a risk of heightened surveillance and targeting of already marginalized communities, thereby deepening social inequalities. Additionally, ‘facial recognition technology, which is increasingly utilized in India, has been found to demonstrate bias, showing lower accuracy rates for individuals with darker skin tones, which could result in wrongful identifications and arrests.’
Surveillance technology can reinforce pre-existing biases embedded in its technical design, making it more likely to target racial and religious minorities. ‘In India, a 2021 report by the Centre for Legal Policy suggested that the Delhi Police’s use of facial recognition technology (FRT) could disproportionately target Muslims.’
In India, a man’s life was upended by AI-powered surveillance used by the Delhi Police to investigate protests. According to a report by The Wire, arrests were made using advanced technologies, including video and image enhancement tools (Amped FIVE by Amped Software) and facial recognition software (AI Vision by Innefu Labs).
Speaking to The Wire, he also described systemic discrimination against Muslim prisoners in jail. He alleged that they were routinely humiliated, asked their names, and if identified as Muslim, were assigned degrading tasks. "We were forced to scrub toilets and mop floors with our bare hands, denied even basic cleaning tools like wipers," he recalled. The abuse extended beyond physical violence. "I was constantly humiliated, called a terrorist, and subjected to unbearable psychological torment. I spent countless days crying and praying – as did my mother," he alleged.
Further, an Internet Freedom Foundation report highlighted that ‘despite the limitations of such technology, including the inaccuracy of biometric facial recognition and the lack of diversity in its databases leading to racial bias, the Delhi Police considers an accuracy threshold of merely 80% sufficient to positively identify suspects.’
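Why an 80% match threshold is so consequential can be sketched with simulated similarity scores. The genuine and impostor score distributions below are assumptions for illustration only, not measurements of any real facial recognition system.

```python
# Illustrative sketch (simulated scores, hypothetical distributions): lowering
# the match threshold trades missed matches for false matches, and every
# false match against a watchlist is a person wrongly flagged.
import numpy as np

rng = np.random.default_rng(3)

impostor_scores = rng.normal(0.55, 0.12, 100_000)  # unrelated people (assumed)
genuine_scores = rng.normal(0.90, 0.05, 5_000)     # true matches (assumed)

for threshold in (0.95, 0.90, 0.80):
    false_match = (impostor_scores >= threshold).mean()
    missed_match = (genuine_scores < threshold).mean()
    print(f"threshold {threshold:.2f}: false match rate {false_match:.3%}, "
          f"missed match rate {missed_match:.1%}")
# Dropping the threshold sharply increases false matches; if the system is
# also less accurate for some demographic groups, those groups absorb a
# disproportionate share of the wrongful flags.
```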
b. Hiring Algorithms
The issue of bias in hiring algorithms is equally pertinent in India. When companies employ AI-driven tools to screen resumes or conduct interviews, and these tools are trained on data that reflects existing biases in the job market (such as the underrepresentation of women or certain caste groups in specific professions), they may inadvertently reinforce these biases. For example, an algorithm might disadvantage resumes that list experience from institutions predominantly attended by students from marginalized communities (Dastin, 2018).
One such example is reflected in the LedBy Foundation’s research on hiring bias in entry-level jobs for Muslim women. ‘Their study involved sending out identical resumes with Hindu and Muslim-sounding names for entry-level jobs and found a significant disparity in callback rates, indicating a strong bias against Muslim women in the hiring process. This research highlights how deeply ingrained biases can affect employment opportunities, even when candidates have the same qualifications.’
c. Healthcare Algorithms
In the Indian healthcare system, the presence of algorithmic bias can lead to significant repercussions. When algorithms designed for resource allocation or disease diagnosis are developed using data that does not accurately reflect the country's diverse demographics, they risk resulting in inequitable treatment. For instance, a diagnostic algorithm predominantly based on data from urban populations may struggle to effectively identify conditions in rural patients, where healthcare access and disease prevalence can vary greatly (Obermeyer et al., 2019).
‘Predictive COVID‑19 models in India were found to have substantial inaccuracies over the long term due to outdated or incomplete demographic inputs, with only short‑term forecasts showing reasonable reliability.’
d. Social Welfare Programs
In India, algorithms are increasingly utilized to assess eligibility for social welfare initiatives. However, if these algorithms are not meticulously crafted and regularly evaluated, they may reinforce existing disparities. For example, an algorithm that allocates benefits based on criteria such as land ownership or formal employment could disadvantage marginalized groups, who are often less likely to own land or are more likely to be involved in informal employment. Additionally, AI models relying on outdated census data in India risk producing inaccurate or biased results due to unrepresentative demographic information.
4. Legal, Ethical, and Regulatory Landscape
The increasing recognition of algorithmic bias has prompted legal, ethical, and regulatory responses, though these responses vary significantly across jurisdictions, including India.
The General Data Protection Regulation (GDPR) in the European Union includes Article 22, which grants individuals the right not to be subject to decisions based solely on automated processing, including profiling, that produces legal effects or similarly significant effects on them. While Article 22 provides some protection against purely automated decision-making, its scope is limited, and it does not explicitly address the issue of algorithmic bias.
In India, there is no specific legislation that directly addresses algorithmic bias. However, the Digital Personal Data Protection Act, 2023, aims to regulate the processing of digital personal data and includes provisions related to fairness and transparency, which could have implications for mitigating bias. The Act emphasizes data minimization, purpose limitation, and data accuracy, which are important principles in preventing biased outcomes. However, some scholars argue that the DPDP Act may not be sufficient to address the complexities of algorithmic discrimination, particularly in contexts where sensitive social categories like caste and religion are involved.
The DPDPA 2023 does not fully address the hazards associated with automated decision-making and profiling, and its consent mechanisms cannot keep pace with AI's massive data requirements. Clear standards for cross-border data transfers for AI, and for how such transfers interact with existing regulations, are also lacking, as are transparency and accountability requirements for AI systems. To strike a successful balance between innovation and privacy protection, the DPDPA needs more precise AI-focused measures. Otherwise, it would be necessary to enact dedicated legislation to regulate AI, similar to the EU AI Act in Europe, which specifically addresses the risks, including privacy concerns, that arise from artificial intelligence.
The Indian Constitution, with its emphasis on equality and non-discrimination (Articles 14, 15, and 16), provides a framework for challenging discriminatory practices, including those perpetuated by algorithms. However, applying these constitutional principles to algorithmic decision-making can be challenging, particularly in cases where the bias is indirect or unintentional.
The ethical debate surrounding algorithmic bias centers on principles such as accountability, transparency, and explainability. Accountability requires that those who design and deploy algorithms be held responsible for their outcomes. Transparency demands that algorithms be open to scrutiny, so that their decision-making processes can be understood. Explainability calls for algorithms to provide clear and understandable reasons for their decisions, particularly when those decisions have significant consequences for individuals.
Various organizations and bodies have proposed AI ethics guidelines. India has also made efforts in this direction, with NITI Aayog releasing a National Strategy for Artificial Intelligence that outlines principles for responsible AI development, including fairness and non-discrimination. However, these guidelines are often non-binding, and their effectiveness depends on their implementation and enforcement.
Disproportionate Impact on Marginalized Communities and How to Mitigate the Bias
Algorithmic bias disproportionately affects marginalized communities, exacerbating existing inequalities and creating new forms of discrimination. In India, this is particularly concerning due to the deeply entrenched nature of social hierarchies and discrimination.
Caste, religious, and gender biases are often reflected in the design and deployment of technology in India. Algorithms can perpetuate and amplify these biases, leading to unequal outcomes in various domains. For example, language processing algorithms may be biased towards dominant languages, disadvantaging those who speak marginalized languages or dialects. Facial recognition technology may exhibit lower accuracy rates for individuals from certain ethnic groups or regions.
Intersectionality plays a crucial role in understanding the impact of algorithmic bias in India. Individuals who belong to multiple marginalized groups, such as Dalit women, Muslim transgender individuals, or persons with disabilities from lower socioeconomic backgrounds, often experience the compounding effects of discrimination. They may face unique challenges and barriers due to the intersection of their identities, which are often overlooked in the design of algorithmic systems.
Algorithmic bias can lead to unequal access to justice, healthcare, and economic opportunities in India. As discussed earlier, biased algorithms can result in wrongful arrests, denial of loans, and limited access to healthcare, perpetuating cycles of poverty and marginalization.
The digital divide also contributes to algorithmic exclusion in India. Marginalized communities often have limited access to technology and digital literacy, making them less aware of the potential risks of algorithmic bias and less able to challenge unfair outcomes. This lack of access can further disadvantage them in an increasingly digital world. Addressing algorithmic bias in India requires a multi-faceted approach involving technical, legal, and social interventions, tailored to the specific context of the country.
Mandatory bias audits and algorithmic impact assessments are crucial for identifying and mitigating potential sources of bias. These assessments should be conducted throughout the algorithm lifecycle, from data collection and design to deployment and monitoring. In India, these audits should specifically address the potential for bias related to caste, religion, gender, and other relevant social categories.
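A bias audit of this kind boils down to reporting decision and error rates separately for each social group and comparing them. The sketch below is a minimal, hypothetical audit helper; the group labels and toy inputs are made up, and a real audit would add caste, religion, gender, and intersectional categories along with statistical significance checks.

```python
# Illustrative sketch (hypothetical helper, synthetic inputs): the core of a
# group-wise bias audit, reporting per-group selection rate and error rates
# side by side so disparities are visible.
from dataclasses import dataclass

@dataclass
class GroupReport:
    selection_rate: float
    false_positive_rate: float
    false_negative_rate: float

def audit(y_true, y_pred, groups):
    """Return a GroupReport for every group label present in `groups`."""
    report = {}
    for g in sorted(set(groups)):
        idx = [i for i, grp in enumerate(groups) if grp == g]
        t = [y_true[i] for i in idx]
        p = [y_pred[i] for i in idx]
        fp = sum(1 for a, b in zip(t, p) if a == 0 and b == 1)
        fn = sum(1 for a, b in zip(t, p) if a == 1 and b == 0)
        negatives = sum(1 for a in t if a == 0) or 1
        positives = sum(1 for a in t if a == 1) or 1
        report[g] = GroupReport(
            selection_rate=sum(p) / len(p),
            false_positive_rate=fp / negatives,
            false_negative_rate=fn / positives,
        )
    return report

# Tiny synthetic example: labels, predictions, and group names are made up.
y_true = [1, 0, 1, 0, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
groups = ["X", "X", "X", "X", "X", "Y", "Y", "Y", "Y", "Y"]
for g, rep in audit(y_true, y_pred, groups).items():
    print(g, rep)
```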
Diverse and inclusive tech design teams are essential for creating algorithms that are fair and equitable. Tech companies and government agencies in India should prioritize hiring and retaining individuals from diverse backgrounds, including different castes, religions, genders, linguistic groups, and socioeconomic statuses.
Transparent algorithmic decision-making is necessary to ensure accountability and enable individuals to challenge unfair outcomes. This includes providing individuals with a right to explanation, or at least some understanding of how an algorithm arrived at a particular decision, especially when that decision has significant consequences.
Stronger data protection frameworks are needed to regulate the collection, use, and sharing of data used to train algorithms. The Digital Personal Data Protection Act, 2023 (DPDPA) primarily focuses on regulating the processing of personal data, and while a crucial step, it exhibits limitations in directly addressing algorithmic bias. Its emphasis on data privacy, individual rights, and obligations of data fiduciaries does not explicitly target the design, functioning, or discriminatory outcomes of algorithms. The absence of definitions for "algorithmic bias" and related concepts, alongside a lack of mandates for bias audits or impact assessments, highlights this gap. A data protection regulation alone is inadequate to address algorithmic bias: such a law may scratch the surface, but it cannot supply rules that govern how AI models actually function or that enforce guardrails such as transparency, fairness, accountability, and accuracy.
Although principles like data accuracy and purpose limitation could indirectly mitigate bias, the Act lacks specific mechanisms to ensure algorithmic fairness. The complexity of identifying and proving algorithmic bias, often stemming from intricate data and algorithmic interactions, necessitates specialized expertise potentially beyond the scope of a general data protection law. Furthermore, exemptions for government agencies and the focus on "personal data" might overlook biases in anonymized datasets and limit the Act's reach. The absence of a dedicated AI ethics regulatory body further suggests that the DPDPA alone may not suffice to comprehensively tackle algorithmic discrimination. Addressing this challenge effectively likely requires supplementary measures focusing explicitly on the ethical design and deployment of AI systems (The Digital Personal Data Protection Act, 2023 (India)).
Some civil society organizations (CSOs) already empower individuals against algorithmic bias by providing digital literacy training, especially to marginalized communities, and by conducting awareness campaigns about the risks and implications of biased algorithms. They also advocate for policy changes based on their research into local contexts of algorithmic discrimination. However, far more emphasis on public awareness and digital literacy is needed, as both are essential for empowering individuals to understand and challenge algorithmic bias. Governments, educational institutions, and more CSOs in India should invest in programs that promote digital literacy and raise awareness about the potential risks of biased algorithms, particularly among marginalized communities.
Engaging communities in algorithm development through participatory design and advisory boards is crucial for ensuring technologies reflect local values and needs in India, addressing bias and privacy concerns.
Real-life examples in India, like Digital Green's farmer videos and Internet Saathi's digital literacy program, demonstrate the power of community involvement in technology development. By including local users in the design process, these initiatives ensure that technologies, including algorithms and data handling practices, are relevant, trusted, and address specific community needs and privacy concerns. This participatory approach leads to more effective and ethically sound solutions compared to top-down development.
Conclusion
Algorithms are not neutral; they reflect the values, assumptions, and biases of those who create them and the data they are trained on. While designed to bring efficiency and objectivity, algorithmic systems often reflect and amplify existing social inequalities, disproportionately affecting marginalized communities. In India, this is particularly concerning due to the country's complex social landscape. Without careful oversight, these systems can deepen structural inequalities, perpetuate discrimination, and erode trust in institutions. There is an urgent need for legal frameworks, ethical guidelines, and inclusive design practices, tailored to the Indian context, to prevent algorithmic harm and ensure that these technologies are used to promote fairness, justice, and equality. The fight against algorithmic bias is not merely a technical challenge; it is a social and ethical imperative for India to uphold its constitutional values and build a more equitable future.