Quality Check Executive

Location: Remote

Type: Full-time Consultant

Overview of the role

As part of an ambitious nationwide program, you will help create unique, high-quality open-source speech and text datasets spanning every district to advance the state of the art in NLP (Natural Language Processing) for Indian languages.

With several large projects under BhashaSetu, ARTPARK's vision is to spearhead the creation of an inclusive digital India by propelling AI advancements in Indic languages, with projects spanning speech data collection, curation and advanced language modelling.

You will be part of the operations team, which drives data collection and curation across all projects in the BhashaSetu program, working closely with the ARTPARK team.

Key Responsibilities

  1. You will be responsible for the data from one or more of the following states:

    • Punjab

    • Haryana

    • Delhi

    • Himachal Pradesh

    • Uttarakhand

    • Uttar Pradesh

    • Bihar

  2. Understand the requirements for data curation and their implications for the AI models built, and study them thoroughly as and when guideline documentation is received.

  3. Find and recruit suitable curation experts as required by the project, through your own search and through contacts (e.g., NGOs, local institutes) in the districts/areas that you are managing

    • Design task flyers and identify every channel for reaching individuals and language experts (local to a district) who could be interested in data curation and quality checking

    • Contact applicants, as well as potential candidates who have not applied, by phone call and WhatsApp

    • Host project awareness calls with potential experts to drive understanding of their tasks

  4. Manage day-to-day curation operations for audio and transcription data

    • Train the recruited experts in the required tasks and provide relevant training documentation

    • Assign daily workloads based on their availability and coordinate closely to get the work done

    • Supervise their performance and review their work on a daily basis

Skills and background

  1. Should be a native speaker of the local language of the following districts:

    Punjabi - (primarily from Kapurthala, Fazilka, Pathankot)

    Haryanvi, Jatu - (primarily from Charkhi Dadri, Jhajjar, Rohtak)

    Hindi - (primarily from Chandigarh, Delhi, Lucknow)

  2. Should have good verbal and written communication skills in both English and the local language

  3. Should be good at managing multiple people working remotely and getting tasks done through them

  4. Skills: Microsoft Office (Excel, Word, PowerPoint) and Google Workspace (Docs, Sheets, Slides)

Good to have

  1. Experience in data curation

  2. Experience in speech data annotation and labelling

  3. Experience in working with data sourcing and annotation companies

Why Join us?

You will be a part of the Language Data & AI team at ARTPARK and IISc.

We are focused on building an ecosystem for AI in the Indian languages space. To this end, we create datasets and models that enable applications with broad societal impact. We are running some of the largest data initiatives in the world. These ambitious India-wide programs are creating high-quality open-source speech and text datasets spanning every district to advance the state of the art in NLP (Natural Language Processing). You will be part of this core team at ARTPARK.

ARTPARK at IISc drives impact through innovations in AI & Robotics by harnessing the best of research/academia, startups/industry, and government/nonprofits.

Our pioneering platform initiatives in language data & AI and health data & AI are driving national-scale impact with stakeholders such as MeitY's Bhashini, Office of PSA, ICMR, States and Cities. At ARTPARK, you will work with the best researchers in the country and around the world in a strong data-driven environment and have the opportunity to address systemic issues and implement solutions.

These platforms are in pursuit of our vision – AI for All.
