The Globe and Mail reports in its Saturday, Sept. 20, edition that Meta Platforms used public Facebook and Instagram posts to train parts of its new Meta AI virtual assistant, but excluded private posts shared only with family and friends.
A Reuters dispatch to The Globe reports that Meta also filtered private details from public datasets used for training. Meta's Nick Clegg said datasets with lots of personal information were excluded. He said the "vast majority" of the data used by Meta for training were publicly available.
He cited LinkedIn as an example of a website whose content Meta deliberately chose not to use because of privacy concerns.
Mr. Clegg's comments come as tech companies including Meta, Microsoft-backed OpenAI and Alphabet's Google have been criticized for using information scraped from the Internet without permission to train their AI models, which ingest massive amounts of data in order to summarize information and generate imagery.
The companies are weighing how to handle private or copyrighted materials that were vacuumed up in that process and that their AI systems may reproduce, while facing lawsuits from authors accusing them of copyright infringement.
© 2024 Canjex Publishing Ltd. All rights reserved.