China’s AI boom depends on an army of exploited student interns
Related topics: wages and remuneration, internships, vocational education
In March 2021, during her third year at a vocational school in Shandong province, Lucy worked as an intern at a data annotation center. For four months, she spent eight hours a day sitting in her office, sorting through audio files, tagging images of children on surveillance camera footage, and differentiating trees from pedestrians in videos used to develop automated driving systems. She stayed at a four-person dorm provided by the company, and earned over 1,000 yuan ($137) a month — about 80% of the local minimum wage — which was just enough to cover her daily expenses.
When Lucy enrolled in the school’s computer science program, she thought she would learn to code and become a programmer. But in reality, her work was no different from an assembly line job. “It was very boring. We didn’t learn anything,” Lucy told Rest of World, speaking under a pseudonym to avoid being identified. But the school demanded that the whole class complete the internship, or else they would not be allowed to graduate.
Lucy is part of China’s new digital underclass — one of hundreds of thousands of data annotators fuelling the country’s booming artificial intelligence industry. Data annotators label vast quantities of raw data — tagging images of cars, screening videos for violent content, and filtering audio for keywords — to train machine learning models. Their labor, often underpaid and overlooked, is crucial to the development of new AI applications — from intelligent chatbots to autonomous vehicles.
In recent years, China’s data labeling companies have partnered with vocational schools to recruit student interns, who must do this tedious and labor-intensive work, often for subminimum wages and under poor conditions, in order to fulfill their graduation requirements, a Rest of World investigation has found. New regulations published by the Ministry of Education in January 2022 require employers to pay interns minimum wage, and ban schools from taking commissions. They also prohibit educational institutions from making students do “simple, repetitive work.” Under these guidelines, some data labeling internships could be considered a violation.
In China, vocational school students are required to do mandatory internships, which have long served as a source of cheap labor for factories, call centers, content moderation companies, and amusement parks. Students enrolled in vocational school programs titled “Computer Science,” “Big Data,” and “Artificial Intelligence” now take on the lowest-paid jobs in China’s lucrative AI industry, according to public reports and interviews with students, data vendors, vocational school staff, a recruitment agency, and labor researchers.
Although vocational schools advertise data annotation internships as a way for students to improve their career prospects and acquire AI-related skills, many students say the jobs seem more like a form of manual labor, Xia Bingqing, a researcher at East China Normal University, told Rest of World. Xia interviewed students who interned in the industry between 2018 and 2019. Some of them were unpaid, while others were paid by the amount of data they processed — for example, 0.2 yuan (three cents) for each “bounding box” they used to label an image.
A recent vocational school graduate in southern China, who spoke on condition of anonymity, told Rest of World she had enrolled in a self-driving program for her internship. She said she spent her final semester drawing bounding boxes on street footage for a car producer, working 60-hour weeks for around $500 a month. Her classmates who had interned at a car repair service had better experiences than she did, she said, and she regretted applying for the internship. “You learn more by fixing cars than by drawing boxes,” she said.
Globally, tech companies have outsourced data annotation work to developing countries in Africa and Southeast Asia. In China, companies have built annotation centers in poorer, inland regions, often backed by local governments eager to attract investments and boost employment rates.
Many of China’s tech giants have partnered with vocational schools in these less-developed regions to create data annotation internships. Last March, search giant Baidu established an annotation center with a vocational school in Jiuquan, Gansu, one of China’s poorest provinces. The company received 30 million yuan ($4.1 million) from the local city government. According to a student intern’s post on the Jiuquan mayor’s online message board in 2022, the school forced more than 160 students to annotate data for Baidu, or they would not receive their degree.
In an email reply to Rest of World, Baidu said it was not aware of the situation. The company said it prioritized labor rights and “the dignity of employees,” and urged its service providers to do the same.
The Guizhou-based data labeling firm Mengdong has worked with tech giants such as Baidu, Alibaba, and JD.com. Its founder also runs a vocational school called Forerunner College. In 2021, Mengdong employed 1,461 students, who generated more than 19 million yuan ($2.6 million) in revenue, according to a state media report. “Going to school means going to work. Their teachers are their managers,” said Hu Dingxiang, a teacher who also worked for Mengdong. The firm said students were paid, and that rural students could offset school fees by working.
Managers at data annotation firms in three different regions told Rest of World that companies use student interns because they can pay them less, and do not need to pay for their social security. Two of them said schools often took commissions. A recruiter at an employment agency in Jiangsu said the cut could be up to 50% of students’ salaries. One annotation firm owner said he hired the students through online job portals, and did not give commissions to schools.
One data firm manager, who requested anonymity for fear of being identified by others in the industry, told Rest of World 60% of his company’s annotators were vocational school students. They worked eight hours a day, six days a week, for more than six months, and were paid 3,000 to 4,000 yuan ($409–$545) a month, of which the schools took a cut of 600 to 1,000 yuan ($82–$136), he said. Another business owner, who also asked to remain anonymous, estimated student interns made up 20% to 30% of China’s data annotation workforce.
A school official at a vocational school in Zhejiang province told Rest of World their students worked as interns for a data labeling company based in Shenzhen, annotating handwritten math problems to train an AI program that helps students with homework. The official, who requested anonymity for fear of repercussions, said students were paid 0.32 to 0.38 yuan (4 to 5 cents) per annotation. Every day, students were ranked according to the amount of data they processed. The school eventually halted the program in 2021 because students complained of low pay and overwork, the official said.
Vocational school students in China, many of whom come from lower-class and rural backgrounds, are particularly vulnerable to labor abuse. Cases of such abuse and student suicides in vocational school internships have sparked public outrage in the country. Students have been forced to work grueling shifts, take on jobs unrelated to their studies, and share their wages with their schools. Xia, the researcher, cited the case of a female student who lived in a dormitory for her data annotation internship and was banned from going home for six months.
Data annotators who work in content moderation are also often exposed to traumatic content. According to Ryan, a former manager at a data labeling firm in Zhejiang, their team of annotators — mostly female vocational school students — reviewed data sets from April to June 2020 to train Chinese tech giant NetEase’s AI content moderation system. The work involved screening out violent and pornographic content, such as hate speech, bloody images and nude photos. “After a day of work, you just wanted to wash out your eyes,” said Ryan, requesting to use a pseudonym for fear of repercussions.
NetEase, Alibaba, JD.com, Mengdong, and iFlytek did not respond to Rest of World’s request for comment.
Without formal contracts and channels to voice grievances, students are easily subject to exploitation and abuse, such as long work hours and safety hazards, according to Jenny Chan, a sociologist at Hong Kong Polytechnic University who studied student interns in the manufacturing sector. “They are unfree labor,” Chan told Rest of World. “Can they talk to their teacher about problems about work, if the teacher was the one who sent them in?”
In a tough job market with record-high youth unemployment rates, some vocational school and university students voluntarily take on data annotation jobs to beef up their resumes or earn extra cash to make ends meet.
Xu, a communications major from Hefei, in Anhui province, told Rest of World she applied for an internship at AI company iFlytek this summer. She had wanted to gain work experience and add a reputable company to her resume, but ended up annotating data to train learning devices. “My brain feels stiff and my eyes hurt from staring at the same stuff on a computer all day,” said Xu, who preferred to use only her last name for fear of being identified. “We are workers at the bottom of the AI industry.”
After her internship, Lucy from Shandong stayed on as an employee at the data company for another year. She eventually left the industry in 2022 because the pay was too low, and took up a marketing position at an internet company instead. She said that while the annotation gig helped her understand the AI industry, it didn’t bring her any closer to her dream job of becoming a software engineer. “It wasn’t even a stepping stone,” she said.