How Apple Will Analyze Your Data to Train Its AI (While Protecting Your Privacy)

Apple said it will begin analyzing on-device user data as part of a broader push to strengthen its AI platform.

In a blog post, the company outlined a new approach designed to expand its AI capabilities while safeguarding user privacy, especially as competitors like OpenAI and Google advance more quickly with fewer restrictions. Apple said it will train its AI models using synthetic data, meaning information that mimics the format and characteristics of real-world messages without including any actual user-generated content.

«When creating synthetic data, our goal is to produce synthetic sentences or emails that are similar enough in topic or style to the real thing to help improve our models for summarization, but without Apple collecting emails from the device,» the company said.

For Apple Intelligence features including summarization and writing tools that handle longer content, the company said its usual methods, like those used for short-form prompts in Genmoji, aren’t effective.

Instead, its new approach will generate a large set of synthetic emails on various topics – such as «Want to play tennis tomorrow?» – without referencing any actual user data. Each message is converted into what Apple calls an «embedding,» a numerical summary capturing attributes including topic and length. The embeddings are sent only to opted-in devices, which then compare them to a small, private sample of recent user emails stored locally.
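To make the on-device comparison concrete, here is a minimal Python sketch. It is an illustration only, not Apple's implementation: the three-number embeddings are made up, the function names are hypothetical, and the use of cosine similarity as the comparison measure is an assumption (Apple's description above does not say how the comparison is scored).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def closest_synthetic(synthetic_embeddings, local_embeddings):
    """For each local email embedding, return the index of the most
    similar synthetic embedding. Runs entirely on the device."""
    picks = []
    for local in local_embeddings:
        scores = [cosine_similarity(local, s) for s in synthetic_embeddings]
        picks.append(max(range(len(scores)), key=scores.__getitem__))
    return picks

# Hypothetical embeddings of three synthetic emails (sent to the device).
synthetic = [[1.0, 0.0, 0.2], [0.0, 1.0, 0.1], [0.1, 0.1, 1.0]]
# Hypothetical embeddings of two recent emails (never leave the device).
local_sample = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.9]]

picks = closest_synthetic(synthetic, local_sample)
print(picks)
```

In this sketch the real emails and their embeddings stay on the device; only the indexes of the best-matching synthetic messages come out of the comparison.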

«This process allows us to improve the topics and language of our synthetic emails, which helps us train our models to create better text outputs in features like email summaries, while protecting privacy,» the company said.

Apple said it will start using this approach «soon» with users who opt in to sharing device analytics.

A «sophisticated» approach to privacy

Jason Hong, a computer science professor at Carnegie Mellon University, said this type of «differential privacy» is a sophisticated approach for analyzing and using data aggregated from large numbers of people.

«Apple could have taken the easy approach of just taking everyone’s data and using it to build their AI models,» he said. «Instead, Apple chose to deploy these differential privacy approaches for Apple Intelligence, and they should be applauded for putting their customers’ privacy first.»
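For readers unfamiliar with the term, the general idea behind differential privacy can be sketched with a textbook technique called randomized response. This is not Apple's mechanism, which the company's post does not detail here; the function names and the 50% truth probability are assumptions chosen for illustration. Each device adds random noise to its own answer, so no single report can be trusted, yet the overall rate can still be estimated across many devices.

```python
import random

def randomized_response(true_answer, p_truth, rng):
    """Report the true yes/no answer with probability p_truth;
    otherwise report a fair coin flip."""
    if rng.random() < p_truth:
        return true_answer
    return rng.random() < 0.5

def estimate_true_rate(observed_rate, p_truth):
    """Undo the noise in aggregate:
    observed = p_truth * true + (1 - p_truth) * 0.5"""
    return (observed_rate - (1.0 - p_truth) * 0.5) / p_truth

rng = random.Random(0)   # fixed seed so the sketch is repeatable
p_truth = 0.5            # assumed noise level
n_devices = 10_000
true_rate = 0.3          # assumed share of devices whose real answer is "yes"

reports = [
    randomized_response(i < true_rate * n_devices, p_truth, rng)
    for i in range(n_devices)
]
observed = sum(reports) / n_devices
estimate = estimate_true_rate(observed, p_truth)
print(round(estimate, 3))
```

The tradeoff is visible in the sketch: the more noise each device adds (a lower `p_truth`), the stronger the individual protection and the less precise the aggregate estimate.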

However, he said there will likely be tradeoffs, including the possibility that Apple Intelligence may not be as effective as some competitors because rivals will have more access to people’s data. He also said Apple’s models would likely be harder to debug and might take more battery power to deploy.

«It’s hard to say at this point,» he said.
