Like a number of big technology companies, Microsoft recently admitted that humans sometimes hear your sensitive voice conversations, but that doesn’t mean it’s going to stop. Rather than abandoning the use of human contractors to improve its AI accuracy, the company has simply decided to be more transparent about it.
Earlier this month, Microsoft was found to be sharing with contractors audio from conversations held over its Skype Translator product, an AI-powered system that translates between 10 languages in near real time. It also let contractors listen to audio from users' conversations with its Cortana voice assistant, making it the latest in a series of companies embarrassed by similar revelations.
Microsoft has since updated its privacy statement to spell out that human review. The new text reads:

Our processing of personal data for these purposes includes both automated and manual (human) methods of processing. Our automated methods often are related to and supported by our manual methods. For example, our automated methods include artificial intelligence (AI), which we think of as a set of technologies that enable computers to perceive, learn, reason, and assist in decision-making to solve problems in ways that are similar to what people do.
To build, train, and improve the accuracy of our automated methods of processing (including AI), we manually review some of the predictions and inferences produced by the automated methods against the underlying data from which the predictions and inferences were made. For example, we manually review short snippets of a small sampling of voice data we have taken steps to de-identify to improve our speech services, such as recognition and translation.
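In rough outline, the workflow Microsoft describes is: take a small sample of voice snippets, strip the fields that tie them to a user, and queue the result for human review. The following sketch is purely illustrative; the field names, the 1% sampling rate, and the function itself are invented, not anything from Microsoft's actual pipeline:

```python
import random

def prepare_review_batch(snippets, sample_rate=0.01, seed=0):
    """Sample a small fraction of snippets and drop user-linked fields.

    Hypothetical illustration of 'a small sampling of voice data we have
    taken steps to de-identify' -- not Microsoft's real process.
    """
    rng = random.Random(seed)  # seeded so the example is deterministic
    sampled = [s for s in snippets if rng.random() < sample_rate]
    # "De-identify": keep only the audio and the machine transcript,
    # discarding anything that links the snippet back to an account.
    return [{"audio": s["audio"], "machine_transcript": s["transcript"]}
            for s in sampled]

# Invented data standing in for a store of recorded snippets.
snippets = [{"user_id": i, "audio": f"clip_{i}.wav", "transcript": "..."}
            for i in range(1000)]

batch = prepare_review_batch(snippets)
assert all("user_id" not in item for item in batch)  # identifiers gone
```

Note that this kind of de-identification only removes direct identifiers; as the re-identification research discussed below shows, the content of a recording can itself give a user away.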
The company also updated its Skype Translator Privacy FAQ, adding the following text to an existing paragraph explaining how it analyzes conversation transcripts:
This may include transcription of audio recordings by Microsoft employees and vendors, subject to procedures designed to protect users’ privacy, including taking steps to de-identify data, requiring non-disclosure agreements with vendors and their employees, and requiring that vendors meet the high privacy standards set out in European law and elsewhere.
Microsoft is hardly alone. In April 2019, Amazon admitted that it shared Alexa recordings with thousands of contractors so that they could improve the AI's accuracy.
Google was next on the list in July 2019, when a whistleblower revealed that it, too, was sharing recordings with contractors. That same month, an Apple contractor revealed that third-party workers were listening to Siri's accidental recordings of drug deals and people having sex.
After Microsoft was caught doing the same thing, contractors revealed this week that they had been transcribing audio from users' Facebook Messenger voice conversations.
So, everyone’s been at it. The interesting thing is how the different companies reacted. Earlier this month, both Google and Apple said that they would suspend contractor access to voice recordings, but both announcements came with caveats. Apple’s suspension wasn’t permanent, and it didn’t say when it might resume the practice. Google only suspended the sharing of voice recordings for three months, and only in the EU.
Facebook said this week that it had already discontinued the practice, but gave no indication of whether, or when, it might start again. For any more information on that, users may be forced to monitor updates to the companies’ privacy policies.
These companies generally claim that the data is anonymous, but at least one Siri contractor questioned that, arguing:
These recordings are accompanied by user data showing location, contact details, and app data.
In any case, researchers have also made considerable progress in re-identifying supposedly anonymous data sets.
These companies are busy trying to find a balance between the need for more data to enhance their AI and what the law – and customers – will tolerate. How comfortable are you with their sharing of your recordings?