Chapter Sixty-Nine: The Different Dialects of AI Tools

First, Happy New Year to our readers! This week, we would like to begin by sharing several New Year posts from DEF and its associated handles. Then we will get on to the real issue: the phenomenon called ChatGPT.

A post and wish from Smartpur team…

DEF's Abner Manzar has travelled to over 200 villages across fifteen states in the country to document twenty years of the organisation's work. Read about his travels here, through his Twitter posts in pictures:

Once there, also check out his new series, Daily Digital Story, in which Abner promises to post one small digital story a day for all 365 days of 2023.

In this New Year issue, we would like to remind our readers of the Digital Swaraj Fellowship, which has an application deadline on the fifteenth of this month.

And a reminder for our SoochnaPreneur Awards enthusiasts as well:


Now, on to this week's cover story:

In an earlier chapter of TypeRight, we talked about a pilot research project we had done along with the Alan Turing Institute and the Global Partnership on AI. In one of the conversations, we spoke to a developer who worked on an AI speech recognition tool. The following is an excerpt from our published Data Justice Report:

There are issues around inclusivity. For instance, in developing AI tools for farmers on speech to text assistance, there are gaps in linguistic inclusivity. According to the 2001 census, there are 30 languages in India, it is estimated that there are 1599 dialects within these main languages. India also has a complicated history of linguistic politics, where the official languages are the tongues spoken by the dominant communities and several regional dialects are considered inferior. The speech to text conversion software built did not identify these variations and catered to only a few dominant languages. This requires the collection of data from a large number of speakers to process it into efficient speech recognition tools.  The issue is the lack of commercial interest to do something like this, even by the larger companies funding or behind such projects. The project this particular developer worked on, for example, did not include languages from the northeast, which is one of the neglected communities in India that have been facing systemic racism and exclusion.

While our conversation problematised the focus on a single mainstream dialect of each language, we were also told that broader inclusivity would entail a much higher development cost.

Here is Google's announcement of Vaani, its collaboration project with IISc Bangalore.

The AI and Robotics Technology Park (ARTPARK) is a not-for-profit foundation set up in 2020 by the Bengaluru-based Indian Institute of Science (IISc) and AI Foundry in a public-private model. The Department of Science and Technology has announced funding of 22 million dollars for the venture.

The importance of the initiative, and of expanding the scope of languages and dialects, comes from the state of connectivity in India. As the ARTPARK president says, “Over the past decade, most apps for frontline health and agriculture workers have failed because digital interfaces feel alien to them. More than 1 billion Indians still cannot speak or type in English.” If people who are only now being introduced to technology can access it in their own language, or through speech, it would make a world of difference to the digital divide in the country.

In the end, the question becomes one of how far the corporate profit model can be pushed. OpenAI's charter initially pledged to advance technology for the benefit of humanity rather than corporate profit, and promised to abandon the race to develop artificial general intelligence. As critics point out, this did not last long.

Following Microsoft's investment, Mr. Altman pushed OpenAI to bring in more revenue to attract funding and support the computational resources needed to train its algorithms. The deal also gave Microsoft a strategic foothold in the arms race to capitalize on advancements in AI. Microsoft became OpenAI's preferred partner for commercializing its technologies, an arrangement that allows Microsoft to easily integrate OpenAI's models into products such as Bing.

While not directly related, here is an article on the potential cybersecurity risks of the AI model: it can help someone with malicious intent write malware simply by asking questions in natural language.

This leaves the fate of AI tools and language processors open-ended. They have a lot of potential to transform the connectivity landscape, but perhaps only if there are enough audits?


Until next week, we wish all our readers a great start to the new year!

TypeRight - The Digital Nukkad

TypeRight - The Digital Nukkad is a weekly conversational bulletin curated from the news and discussions on social media, as well as from what is happening on the ground. Through the eyes and ears of the Digital Empowerment Foundation across rural India and the Global South, TypeRight aspires to bring out the contextual relevance of digital technologies and developments for society, both connected and unconnected.