Pakistan to Develop Urdu LLM for Generative AI

National University of Sciences and Technology (NUST), National Information Technology Board (NITB) and telecom network operator Jazz have signed a Memorandum of Understanding (MOU) to develop Pakistan’s first indigenous Large Language Model (LLM) with a focus on Urdu, including datasets for the Pashto and Punjabi languages. It is aimed at empowering individuals, businesses, and organizations with advanced AI tools in their native languages. The envisioned LLM is expected to drive innovation in Generative AI applications, boosting productivity and accessibility in critical sectors like healthcare, education, and agriculture.

GPT-4 Accuracy Scores. Source: The Economist

Generative AI tools such as ChatGPT are powered by large language models, or LLMs. These models need to be trained on vast amounts of data in specific languages to be useful. Unfortunately, Urdu accounts for less than 0.1% of the content on the Internet. This will present a challenge for the developers of Urdu LLMs.
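One rough way to see how little of a given text collection is usable for an Urdu model is to measure what share of its letters are written in Arabic script. The sketch below is only an illustration: the function name and the sample strings are hypothetical, and the check cannot tell Urdu apart from Arabic, Persian, or Pashto, which use the same Unicode blocks. It is not how W3Techs measures language share.

```python
# Hypothetical helper: estimates the share of letters in Arabic-script
# Unicode blocks. This is a script check, not an Urdu language detector.
ARABIC_SCRIPT_RANGES = [
    (0x0600, 0x06FF),  # Arabic
    (0x0750, 0x077F),  # Arabic Supplement
    (0xFB50, 0xFDFF),  # Arabic Presentation Forms-A
    (0xFE70, 0xFEFF),  # Arabic Presentation Forms-B
]


def arabic_script_share(text: str) -> float:
    """Return the fraction of alphabetic characters in Arabic-script blocks."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    hits = sum(
        1
        for ch in letters
        if any(lo <= ord(ch) <= hi for lo, hi in ARABIC_SCRIPT_RANGES)
    )
    return hits / len(letters)


if __name__ == "__main__":
    # Made-up sample documents, for illustration only.
    samples = ["Pakistan to develop an LLM", "اردو زبان", "AI اور اردو"]
    for sample in samples:
        print(f"{arabic_script_share(sample):.2f}  {sample}")
```

A real data pipeline would need a proper language identifier on top of a script filter like this one.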

Online Content of Various Languages. Source: W3Techs 

The lack of Urdu content available for training ChatGPT affects the accuracy of the results for Urdu language users. For example, the GPT-4 accuracy score in question-answer tests in Urdu is just over 70%, compared with an 85% accuracy score in the English language, according to data from OpenAI. Other South Asian languages, including Hindi, Bengali, Punjabi, Marathi and Telugu, suffer from the same problem.

It's not just a South Asian problem. These challenges exist across the developing world. Non-European languages are generally poorly represented online, which is a major obstacle for non-European nations in developing their own generative artificial intelligence (AI) models, since these rely on vast amounts of training data. Generative AI can also produce biased results due to a number of factors, including the data it is trained on, the algorithms used, and how it is deployed.

The use of AI in developing nations such as Pakistan will remain limited to a small number of people proficient in the use of the English language. Broadening the adoption of AI applications will require LLMs trained on local language content. Without this development, Pakistan could lose the opportunity to take full advantage of the AI Revolution.


Comment by Riaz Haq on November 8, 2024 at 8:52am

VEON’s Jazz Launches FikrFree: An AI-Powered Digital


https://www.globenewswire.com/news-release/2024/10/24/2968536/0/en/...

VEON Ltd. (Nasdaq: VEON, Euronext Amsterdam: VEON), a global digital operator (“VEON” or the “Company”), today announces that Jazz, its digital operator in Pakistan, has launched FikrFree, a new AI-powered digital marketplace for insurance and healthcare. The platform aims to bridge a significant gap in Pakistan, where insurance sector penetration is less than 1% of GDP according to the Securities and Exchange Commission of Pakistan, and millions lack access to essential healthcare. In comparison, insurance penetration in other countries is significantly higher (over 7% of GDP in the US and more than 9% of GDP in the UK, according to the World Bank). FikrFree helps users find accessible and affordable coverage through personalized insurance plans and healthcare services.

FikrFree aims to reach the underserved healthcare market in Pakistan through an innovative platform that seamlessly integrates insurance, healthcare, and financial services all in one mobile app. FikrFree also leverages artificial intelligence to recommend personalized insurance plans for customers. The new digital service builds on VEON’s commitment to creating innovative digital solutions as part of its Digital Operator 1440 strategy, offering customers a portfolio of connected services that are relevant for each of the 1,440 minutes in a day. In 2Q24, direct digital revenues represented over 10% of VEON Group’s total revenues.

"Access to affordable healthcare is a fundamental need. In Pakistan, where millions struggle to find suitable insurance coverage and healthcare services, VEON is addressing this challenge with connected digital services. With the launch of FikrFree, we are empowering customers to access personalized insurance plans, specialist doctors, and on-demand medicine delivery—all in one seamless platform. Our digital operator strategy focuses on investing in services that enhance lives, and with FikrFree, we aim to make affordable healthcare accessible to all Pakistanis," says Kaan Terzioglu, CEO of VEON Group.

Comment by Riaz Haq on November 8, 2024 at 8:58am

UNODC Pakistan provided Law Enforcement with Cutting-Edge Training on Crime Analytics and AI Models to Counter Terrorism


https://www.unodc.org/copak/en/Stories/SP4/unodc-pakistan-provided-...


28 September 2024, Islamabad - UNODC Pakistan organized a comprehensive workshop aimed at building the capacity of National Counter Terrorism Authority analysts in using advanced crime analytics and artificial intelligence (AI) to combat terrorism. The workshop covered a wide range of critical topics, equipping participants with the skills and knowledge needed to analyze data and counter terrorism through innovative AI techniques. In total, 25 analysts, including 7 women, participated in the training session.

The participants were introduced to the fundamentals of intelligence gathering, the intelligence cycle, and the development of intelligence products. Practical discussions were held around strategic intelligence and its pivotal role in decision-making. Participants also reviewed products developed in earlier training sessions on i2 Analyst's Notebook and Power BI, enabling them to grasp how past learnings integrate with the current focus on terrorism prevention. The workshop covered data analysis, beginning with an introduction to various data forms and their relevance in crime intelligence. Sessions covered both qualitative and quantitative data, with participants learning how to distinguish between structured and unstructured data and their real-world applications in intelligence work.

The hands-on segment included Textalyser, an online tool used to analyze qualitative data, especially for sentiment analysis, allowing participants to experiment with real-world examples. Participants were engaged through thought-provoking case studies, including analyses of social media sentiment and notable incidents such as the Al Qaeda network and the Sialkot lynching case. These examples highlighted the practical value of AI tools like Voyant in unraveling criminal networks and understanding public sentiment related to terrorist activities.

The overall workshop was dedicated to hands-on sessions with low-code and no-code AI platforms, empowering participants to leverage AI without the need for extensive programming knowledge. Practical exercises included case studies using Google Teachable Machines for image classification and Google Cloud AutoML for predictive crime analytics, both of which offer powerful tools for identifying criminal patterns and behaviors in complex datasets.

The workshop concluded with a closing session that recapped the key learnings and allowed participants to discuss the next steps in their professional development.

Comment by Riaz Haq on November 8, 2024 at 7:35pm

Generalists vs. Specialists: Evaluating Large Language Models for Urdu


https://arxiv.org/html/2407.04459v1

In this paper, we compare general-purpose pretrained models, (OpenAI's) GPT-4-Turbo and (Meta/Facebook) Llama-3-8b-Instruct with special-purpose models fine-tuned on specific tasks, XLM-Roberta-large, mT5-large, and Llama-3-8b-Instruct. We focus on seven classification and six generation tasks to evaluate the performance of these models on Urdu language. Urdu has 70 million native speakers, yet it remains underrepresented in Natural Language Processing (NLP). Despite the frequent advancements in Large Language Models (LLMs), their performance in low-resource languages, including Urdu, still needs to be explored. We also conduct a human evaluation for the generation tasks and compare the results with the evaluations performed by GPT-4-Turbo and Llama-3-8b-Instruct. We find that special-purpose models consistently outperform general-purpose models across various tasks. We also find that the evaluation done by GPT-4-Turbo for generation tasks aligns more closely with human evaluation compared to the evaluation by Llama-3-8b-Instruct. This paper contributes to the NLP community by providing insights into the effectiveness of general and specific-purpose LLMs for low-resource languages.
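The abstract's claim that one automatic evaluator "aligns more closely with human evaluation" than another can be illustrated with a simple agreement measure. The sketch below uses mean absolute difference between human ratings and each judge's ratings. Both the metric and the scores are assumptions made for illustration: the abstract does not say which agreement measure the authors used, and the numbers are invented, not taken from the paper.

```python
from statistics import mean


def mean_abs_diff(human: list[float], judge: list[float]) -> float:
    """Average absolute gap between human and judge scores (lower = closer)."""
    if not human or len(human) != len(judge):
        raise ValueError("score lists must be non-empty and of equal length")
    return mean(abs(h - j) for h, j in zip(human, judge))


# Invented 1-5 ratings for five generated Urdu outputs (illustration only).
human_scores = [4, 3, 5, 2, 4]
judge_a_scores = [4, 3, 4, 2, 5]  # hypothetical evaluator A
judge_b_scores = [5, 5, 5, 4, 5]  # hypothetical evaluator B

if __name__ == "__main__":
    gap_a = mean_abs_diff(human_scores, judge_a_scores)
    gap_b = mean_abs_diff(human_scores, judge_b_scores)
    closer = "A" if gap_a < gap_b else "B"
    print(f"gap A = {gap_a:.2f}, gap B = {gap_b:.2f}, closer to humans: {closer}")
```

With these invented scores, evaluator A tracks the human ratings more closely. A rank correlation would be a reasonable alternative when only the ordering of outputs matters.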

Comment by Riaz Haq on November 24, 2024 at 9:09pm

Labelers training AI say they're overworked, underpaid and exploited by big American tech companies - CBS News

https://www.cbsnews.com/news/labelers-training-ai-say-theyre-overwo...

Naftali Wambalo: I did labeling for videos and images.

Naftali, and digital workers like him, spent eight hours a day in front of a screen studying photos and videos, drawing boxes around objects and labeling them, teaching the AI algorithms to recognize them.

Naftali Wambalo: You'd label, let's say, furniture in a house. And you say "This is a TV. This is a microwave." So you are teaching the AI to identify these items. And then there was one for faces of people. The color of the face. "If it looks like this, this is white. If it looks like this, it's Black. This is Asian." You're teaching the AI to identify them automatically.

Humans tag cars and pedestrians to teach autonomous vehicles not to hit them. Humans circle abnormalities to teach AI to recognize diseases. Even as AI is getting smarter, humans in the loop will always be needed because there will always be new devices and inventions that'll need labeling.

Lesley Stahl: You find these humans in the loop not only here in Kenya but in other countries thousands of miles from Silicon Valley. In India, the Philippines, Venezuela - often countries with large low wage populations - well educated but unemployed.

Nerima Wako-Ojiwa: Honestly, it's like modern-day slavery. Because it's cheap labor–

Lesley Stahl: Whoa. What do you –

Nerima Wako-Ojiwa: It's cheap labor.

Like modern day slavery, says Nerima Wako-Ojiwa, a Kenyan civil rights activist, because big American tech companies come here and advertise the jobs as a ticket to the future. But really, she says, it's exploitation.

Nerima Wako-Ojiwa: What we're seeing is an inequality.

Lesley Stahl: It sounds so good. An AI job! Is there any job security?

Nerima Wako-Ojiwa: The contracts that we see are very short-term. And I've seen people who have contracts that are monthly, some of them weekly, some of them days. Which is ridiculous.

She calls the workspaces AI sweatshops with computers instead of sewing machines.

Nerima Wako-Ojiwa: I think that we're so concerned with "creating opportunities," but we're not asking, "Are they good opportunities?"

Because every year a million young people enter the job market, the government has been courting tech giants like Microsoft, Google, Apple, and Intel to come here, promoting Kenya's reputation as the Silicon Savannah: tech savvy and digitally connected.

Nerima Wako-Ojiwa: The president has been really pushing for opportunities in AI –

Lesley Stahl: President?

Nerima Wako-Ojiwa: Yes.

--------------

Fasica: I was basically reviewing content which are very graphic, very disturbing contents. I was watching dismembered bodies or drone attack victims. You name it. You know, whenever I talk about this, I still have flashbacks.

Lesley Stahl: Are any of you a different person than they were before you had this job?

Fasica: Yeah. I find it hard now to even have conversations with people. It's just that I find it easier to cry than to speak.

Nathan: You continue isolating you-- yourself from people. You don't want to socialize with others. It's you and it's you alone.

Lesley Stahl: Are you a different person?

Naftali Wambalo: Yeah. I'm a different person. I used to enjoy my marriage, especially when it comes to bedroom fireworks. But after the job I hate sex.

Lesley Stahl: You hated sex?

---------
These three and nearly 200 other digital workers are suing SAMA and Meta over "unreasonable working conditions" that caused psychiatric problems.

Comment by Riaz Haq on January 26, 2025 at 11:25am

How China’s new AI model DeepSeek is threatening U.S. dominance


https://www.cnbc.com/2025/01/24/how-chinas-new-ai-model-deepseek-is...

A little-known AI lab out of China has ignited panic throughout Silicon Valley after releasing AI models that can outperform America's best despite being built more cheaply and with less-powerful chips.

DeepSeek, as the lab is called, unveiled a free, open-source large-language model in late December that it says took only two months and less than $6 million to build, using reduced-capability chips from Nvidia called H800s.

------------------
China’s cheap, open AI model DeepSeek thrills scientists


https://www.nature.com/articles/d41586-025-00229-6


A Chinese-built large language model called DeepSeek-R1 is thrilling scientists as an affordable and open rival to ‘reasoning’ models such as OpenAI’s o1.

These models generate responses step-by-step, in a process analogous to human reasoning. This makes them more adept than earlier language models at solving scientific problems and could make them useful in research. Initial tests of R1, released on 20 January, show that its performance on certain tasks in chemistry, mathematics and coding is on par with that of o1 — which wowed researchers when it was released by OpenAI in September.

“This is wild and totally unexpected,” Elvis Saravia, an AI researcher and co-founder of the UK-based AI consulting firm DAIR.AI, wrote on X.

R1 stands out for another reason. DeepSeek, the start-up in Hangzhou that built the model, has released it as ‘open-weight’, meaning that researchers can study and build on the algorithm. Published under an MIT licence, the model can be freely reused but is not considered fully open source, because its training data has not been made available.

“The openness of DeepSeek is quite remarkable,” says Mario Krenn, leader of the Artificial Scientist Lab at the Max Planck Institute for the Science of Light in Erlangen, Germany. By comparison, o1 and other models built by OpenAI in San Francisco, California, including its latest effort o3 are “essentially black boxes”, he says.

--------------

China’s AI industry has almost caught up with America’s. And it is more open and more efficient, too

https://www.economist.com/briefing/2025/01/23/chinas-ai-industry-ha...

The world’s first “reasoning model”, an advanced form of artificial intelligence, was released in September by OpenAI, an American firm. o1, as it is called, uses a “chain of thought” to answer difficult questions in science and mathematics, breaking down problems to their constituent steps and testing various approaches to the task behind the scenes before presenting a conclusion to the user. Its unveiling set off a race to copy this method. Google came up with a reasoning model called “Gemini Flash Thinking” in December. OpenAI responded with o3, an update of o1, a few days later.

Comment by Riaz Haq on March 27, 2025 at 7:58pm

Pakistani tech firm launches first ‘home grown’ GPT platform | Arab News

https://www.arabnews.com/node/2594353/pakistan


Zahanat AI is a text-based generative AI model that enables users to engage in human-like conversations, answer queries, and assist in various domains
Its key differentiator is its hosting and local training on Pakistani culture and localized issues, which makes it equipped to address regional challenges

---------------
Meet the woman who made Pakistan's first AI chatbot

https://theprint.in/go-to-pakistan/woman-pakistans-first-ai-chatbot...

The platform is designed to empower Pakistani citizens, particularly in sectors like education and healthcare. One of Zahanat’s most anticipated developments is the upcoming Z2 model, which will support Urdu and multiple regional languages. This is a game-changer for more than half of Pakistan’s population, a large part of which struggles with English or even Urdu.

Ali imagines the platform being used by a rural student to access world-class education in their native Sindhi or Pashto. She dreams of Zahanat helping an elderly woman receive her healthcare diagnosis in Balochi.

“We’re not just enabling access to AI, we’re redefining who gets to be part of the ecosystem. We’re moving from a digital divide to digital empowerment. This isn’t just tech progress. It’s social progress,” Ali said.

Zahanat is a personal mission for Ali to break the gender barriers that persist in tech. She has faced bias in the male-dominated industry, both spoken and unspoken.

“When I lead a project like Zahanat, it’s not just innovation, it’s disruption. It’s proof that women can lead tech.”

------------

Kineto: K-Electric Becomes Pakistan’s First Power Utility to launch Generative AI Chatbot To Enhance Customer Experience

https://propakistani.pk/2025/03/26/kineto-k-electric-becomes-pakist...

“Users of the KE Live App have grown by 21% annually over the last 5 years and now stand at 1.3 million digitally connected customers. This is over one-third of KE’s total customer base and conveys our digital-savvy population. We then heralded another innovation when we launched the WhatsApp platform back in 2021, and now this platform caters to over 2.0 million people. Additionally, nearly another half a million subscribe to our e-billing feature, a step that helps save Pakistan paper and reduce its import bill.”

“Now, leading the way with digital transformation in customer engagement, Kineto was just the next logical step forward reflecting our investment in future-ready digital platforms, further transforming the way Karachi’s customers interact with its power utility.”

The chatbot has been developed in collaboration with Convex Interactive, KE’s technology partner for this initiative.

“This partnership with K-Electric aligns with our mission to revolutionize customer engagement through AI,” said Aamir Irfan Siddiqui, CEO & Founder of Convex Interactive. “By leveraging generative AI, we’re making customer interactions faster, smarter, and more intuitive.”

Comment by Riaz Haq on March 29, 2025 at 8:49am

TCF set to bring AI-powered learning to teachers with Khanmigo

https://www.thenews.com.pk/print/1296015-tcf-set-to-bring-ai-powere...

The Citizens Foundation (TCF) and Khan Academy Pakistan have announced an innovative AI-powered collaboration to support teachers and enhance classroom learning in selected TCF schools.

This pilot initiative aims to empower teachers by enhancing their lesson delivery, fostering critical thinking, and improving classroom engagement for students in Grades 6-8. Under this collaboration, Khanmigo will be integrated into selected TCF schools to enhance mathematics and science instruction.

Unlike traditional AI, Khanmigo acts as an interactive teaching assistant, helping educators enhance their knowledge, craft lesson hooks, develop quizzes, and foster deeper student engagement.

The pilot programme will equip teachers with AI-driven teacher tools, provide structured prompts to guide teachers to develop learning material relevant to their students, and offer bilingual support in English and Urdu.

Additionally, Khan Academy Pakistan will train school leaders on effective AI integration, offering guidance on best practices for using Khanmigo in classrooms. This initiative will empower TCF teachers to refine their teaching methods, personalise learning experiences, and drive meaningful classroom discussions, making AI-driven learning more accessible, structured, and engaging for students. “At TCF, we want to ensure that technology serves as a bridge to better learning opportunities rather than a barrier,” shared Syed Asaad Ayub Ahmad, the president and CEO of TCF.

“We are hopeful that Khanmigo will be useful in serving as a thinking partner for TCF teachers in the classroom and a transformative step towards making high-quality education accessible and engaging.”

One of Khanmigo’s most promising features is its bilingual support, allowing teachers to instruct in both English and Urdu. This ensures that educators from diverse backgrounds can fully engage with the content. As the programme progresses, regional language support will be explored, further broadening its accessibility.

“Khanmigo aims to give every child in Pakistan access to world-class education,” said Zeeshan Hasan, CEO of Khan Academy Pakistan. “By empowering teachers, we are ensuring that AI becomes a tool for empowerment rather than a shortcut. This partnership with TCF is a step forward towards transforming how education is delivered in classrooms.”

“TCF strongly believes in the power of good teachers, and there is an undeniable social aspect of learning from a teacher. We are hopeful that Khanmigo will augment teacher skills to make classroom experience fun, engaging, and meaningful for the students,” shared Shazia Kamal, executive vice president, Outcomes at TCF.

With Pakistan facing a critical education crisis and a shortage of trained teachers, AI-powered solutions like Khanmigo offer a scalable and cost-effective way to enhance teaching quality.

While this initiative is currently in its pilot phase, TCF and Khan Academy Pakistan envision expanding the programme to more schools.

As AI continues to reshape global education, this partnership reaffirms TCF’s commitment to equipping teachers with the best tools to inspire and educate the next generation of Pakistan’s changemakers.

TCF is a non-profit organisation set up in 1995 by a group of citizens who wanted to bring about positive social change through education.

The 30-year-old organisation is among Pakistan’s leading organisations in the field of education, educating 301,000 students across 2,033 school units in the country.
