Chapter 124: Humans In The Loop

"AI is like a child."

Last week, some of us at the DEF team caught a screening of Humans In The Loop, a film on the labour that builds everyday AI tools, at the Habitat Film Festival in Delhi. This week's TypeRight is partly a review of the movie, directed by Aranya Sahay.


"Human-in-the-loop (HITL) generally refers to the need for human interaction, intervention, and judgment to control or change the outcome of a process, and it is a practice that is being increasingly emphasized in machine learning, generative AI." The AI tools that are available to us are not just the result of engineers and coders sitting at offices in Silicon Valley or Bengaluru, powering the code infrastructure are an army of labourers who sort and label raw data that will train and teach the AI. Data Labelling is an important, process that is the backend of the AI infrastructure, as the director points out in an interview with Konya Tomar. And quite a bit of this work happens in India. Data from 2021 shows around 70,000 people employed in the country and potentially hitting a million by the end of this decade.

“One day, I realised this is not for gaming. We are teaching machines to see like a human. We teach a robot how to understand things on their own. But no one knows that these human processes are happening in the background, and they don’t know that such a workforce is coming from our rural area. They just see the final application.”

(From Karishma Mehrotra's Interview with her respondent, in her article on Fifty-Two.)

Sonal Madhushankar plays Nehma, an Adivasi woman from Jharkhand who lands a job in an AI data-labelling centre. She has to balance returning to her village after the end of an inter-caste marriage, facing the community's ostracism, caring for her one-year-old baby, and fighting for the custody of her older daughter.

Nehma bringing her child to work (Still from Trailer)

In three parts, Sahay covers several nuances of this new employment and the technology it powers: the labour exploitation of what could be termed digital sweatshops, but also the independence this work grants a female workforce; the issue of access, too often assumed to be ubiquitous; and the question of epistemic violence, where knowledge has to be seen beyond the Western gaze.

AI is like a Child

Nehma is seen sifting through videos, labelling joints and body parts to teach the AI model to recognise similar patterns. Just as the data she feeds teaches the child on the screen to get up and walk, she returns in the evening to find her one-year-old slowly taking their first steps.
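What does a labelled frame of the kind Nehma produces actually look like? A rough sketch, with entirely hypothetical field and joint names (real annotation tools each have their own schema), is a record mapping named joints to pixel coordinates:

```python
# Illustrative sketch of a pose-estimation annotation record: named body
# joints mapped to (x, y) pixel coordinates in one video frame.
# All field and joint names here are hypothetical, not from any real tool.

JOINTS = ["head", "shoulder_l", "shoulder_r", "hip", "knee_l", "knee_r"]

def make_annotation(frame_id, points):
    """Bundle one frame's joint labels, checking every joint is present."""
    missing = [j for j in JOINTS if j not in points]
    if missing:
        raise ValueError(f"unlabelled joints: {missing}")
    return {"frame": frame_id, "keypoints": points}

annotation = make_annotation(
    "video_042_frame_0013",
    {
        "head": (212, 80), "shoulder_l": (190, 140), "shoulder_r": (236, 142),
        "hip": (214, 260), "knee_l": (198, 350), "knee_r": (230, 352),
    },
)
print(len(annotation["keypoints"]))  # → 6, one (x, y) pair per joint
```

Thousands of such records, clicked out frame by frame, are what a pose model trains on: the "seeing like a human" in the quote above is assembled from exactly this kind of tedium.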

Labelling (Still from Trailer)

Just as she delights in seeing her one-year-old walk, Nehma is intrigued by her engagement with the software. The relationship is somewhat playful, akin to how Seth Giddings reformulates the history of the development of techniques through play, prioritizing the imaginative element of play over the instrumental notion of the tool.

But just as environment, behaviour, and imitation shape a child, this engagement also shapes the AI.

Pillars of Data and AI Justice

The Global Partnership on AI, in its research on data justice, identifies six pillars that form the cornerstones of a just system (more or less similar markers form the basis of most other studies too). To point out a few: democratize data and data work, and embrace and acknowledge multiple forms of knowledge. Just as AI is curious like a child, it also moulds like one. "If you teach it the wrong thing, it will learn the wrong thing."

When labelling pests and weeds for agricultural detection software, knowledge systems clash. Are all worms pests?

Stock Image of Worms

No. As Nehma shows, some worms eat only the rotten parts of a leaf, which actually saves the rest of the plant from rotting. But this is not something the white man overseeing the project, nor the middle manager, knows; Nehma and her co-workers might, because of the way they have experienced nature: it is part of their everyday knowledge. Wrongly trained data could kill crops and have other adverse effects, yet the data labellers are not considered experts. Beyond their skills in operating a computer, and despite several real-life labellers being graduates themselves, they are categorized as unskilled labour, much like the BPO/call-centre workers of the previous decade. The diversity of knowledge, and the vantage points these perspectives offer, are erased in mass-outsourced data work.
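The clash of knowledge systems can be made concrete as a labelling rule. This is a hypothetical sketch, not the film's actual software: a blanket rule labels every worm a pest, while a rule informed by Nehma's everyday knowledge exempts worms that only eat the already-rotten parts of a leaf.

```python
# Sketch of how everyday knowledge changes a label (illustrative only).
# A blanket rule calls every worm a pest; the informed rule, assumed here
# to encode Nehma's observation, labels rot-eating worms as helpers.

def blanket_label(creature):
    """The outsourced guideline: all worms are pests."""
    return "pest" if creature["kind"] == "worm" else "unknown"

def informed_label(creature):
    """The same rule, amended with local agricultural knowledge."""
    if creature["kind"] == "worm" and creature.get("eats") == "rotten_leaf_only":
        return "helper"  # protects the plant by removing rot
    return blanket_label(creature)

worm = {"kind": "worm", "eats": "rotten_leaf_only"}
print(blanket_label(worm), informed_label(worm))  # → pest helper
```

Scaled across millions of annotations, the gap between those two functions is the gap between a model that protects a crop and one that tells a farmer to spray a beneficial species.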

Identities and Imaginations

When Black Panther came out back in 2018, former US first lady Michelle Obama tweeted about how "young people will finally see superheroes that look like them on the big screen." Indian cinema has represented its less visible communities even less. Training data reinforces cultural misrepresentations and stereotypes, and this coded caste and race shows up in the outputs, as this article by Dhiraj shows: "AI changed my surname. That's how I found out how caste-coded it is."

Still from the movie

When Nehma prompts her AI engine to generate her own image, she instead gets a mix of white-skinned women (keyword: beautiful) wearing Native American gowns and feathered headdresses (keyword: tribal). Quite directly, the movie points out this nuance of the identities that are keyed in and represented. By engaging playfully with the algorithm, feeding it new data and images her daughter has clicked, Nehma teaches the AI to learn anew.

Labour: Material, Emotional, Epistemic

While important, the issue goes beyond representation in the AI's "mind" while generating content. Several issues are at play here, and some come through more subtly than the ones above, even when they are not emphasized.

It has been 20 years since Amazon launched its Mechanical Turk crowd-work platform, which enlists workers on an online gig basis for on-demand tasks like labelling. Today, companies have realized this work is cheaper in the smaller towns of the Global South, where marginally rising levels of higher education have not been met with employment opportunities. The second half of the previous century told us that women have 'nimble fingers' and are therefore increasingly hired for repetitive work like garment export; more likely, it has to do with the precarity, docility, and discipline that patriarchy brings. While the job allows Nehma to be independent and look after her children, its nature gives her little opportunity to move up or to question things, even as she teaches an ML tool how to replace the worker. The movie does not paint an optimistic or romantic picture of this work; it leaves the viewer to reflect on the irony. The AI learning to create diversity in its art points to the same irony: just as indigenous art was commercially appropriated by more dominant groups, one tension that could be gleaned from the movie is between authenticity, the artist, and profit, when the AI will inevitably learn to paint indigenous art the way it now does Ghibli.

On the boundaries of playing/teaching, child/AI, representation/appropriation, Nehma (and perhaps many of the workers Sahay would have witnessed in his research) goes beyond the labour of clicking on labels for money: she is emotionally invested in the process. And as the director pointed out in his conversation after the screening, he deliberately portrays the overarching idea of colonialism in newer digital forms, seeping into the labour inside 'data sweatshops'.


In Other News

Some highlights of the week: this news on the other side of the AI boom for India's IT jobs; this news on the effects of the 10-minute-delivery app boom on smaller retailers; our rural schools continue to be digitally excluded; it is interesting to think how Indian digital models can benefit African countries; and the high-intensity hardware needed to support the real strength of AI will be a huge challenge amidst environmental concerns.

Other DEF Updates

DEF attended the Roundtable on 'WSIS+20: A Multi-Stakeholder Deliberation on India's Priorities', hosted by the Centre for Communication Governance at NLU, Delhi, and held at the Habitat Centre. The event brought together policymakers, academics, civil society, and industry experts to discuss India's evolving digital governance priorities ahead of the WSIS+20 review at the UN later this year.



TypeRight - The Digital Nukkad

Show your support

Kindly support the fight against the digital divide and help connect marginalised people. Donate here: https://www.defindia.org/donate-page/


TypeRight - The Digital Nukkad is a weekly conversational sharing of developments through the prism of a "digital citizen".