Proof Ingredients: How AI companies stole your YouTube videos

Author: Proof News

Uploaded: 2024-07-24

Views: 334

Description:

YouTubers have long wondered whether their work has been scraped by AI companies to train their models. Now Proof News investigative reporter Annie Gilbertson has proven it. She found that big companies, including Apple, Anthropic, Nvidia, and Bloomberg, have all used a dataset containing the transcripts of more than 170,000 YouTube videos, including videos by megastars like MrBeast, Marques Brownlee, and PewDiePie.

In this interview, Proof founder Julia Angwin talks to Annie about the investigation and what went into it. The interview is the first in our new series, Proof Ingredients. In this series, Julia will talk to journalists, researchers and content creators about what their investigations are made of, walking through the hypothesis, sample size, techniques, key findings, and limitations. Hopefully these ingredients help you evaluate our work and give you a framework for judging other news, too.

Ingredients

Hypothesis: AI companies are using YouTube videos to build models that may come to compete against YouTube creators.

Sample size: A 5.7 GB (489-million-word) training dataset called YouTube Subtitles.

Techniques: We linked subtitles in the dataset to videos on YouTube in order to determine whose creative material was used to train AI models. In white papers and online posts, we found evidence that AI companies used the data.

Key findings: The training data contained subtitles from 173,536 YouTube videos, more than 12,000 of which have since been deleted from the platform but were still ingested by AI models.

Limitations: AI companies do not often disclose what data they use to train their models, so we are unable to produce a comprehensive list of companies that used this dataset.
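
The linking step described under Techniques could be sketched roughly as follows. This is only an illustration, not the code Proof News used: the function name, the input shapes (a list of video IDs taken from the dataset and a dictionary mapping video ID to channel name), and the sample values are all assumptions made for the example.

```python
from collections import Counter


def count_videos_per_channel(dataset_video_ids, channel_by_video_id):
    """Tally unique dataset videos per channel.

    Hypothetical helper: `dataset_video_ids` is an iterable of video IDs
    found in the subtitle dataset, and `channel_by_video_id` maps a video
    ID to its channel name. IDs with no metadata (for example, videos that
    have been deleted) are returned separately as unresolved.
    """
    per_channel = Counter()
    unresolved = []
    # dict.fromkeys drops duplicate IDs while preserving their order.
    for video_id in dict.fromkeys(dataset_video_ids):
        channel = channel_by_video_id.get(video_id)
        if channel is None:
            unresolved.append(video_id)
        else:
            per_channel[channel] += 1
    return per_channel, unresolved


# Made-up sample data, for illustration only.
ids = ["a1", "b2", "a1", "c3", "d4"]
metadata = {"a1": "Channel X", "b2": "Channel X", "c3": "Channel Y"}
per_channel, unresolved = count_videos_per_channel(ids, metadata)
print(per_channel.most_common())
print(unresolved)
```

Keeping unresolved IDs in their own list, rather than discarding them, makes it possible to count videos that can no longer be matched to a channel.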

Why we think news needs an ingredients label
   • What's in your news?  

Links

Full story on Proof News
https://www.proofnews.org/apple-nvidi...

Search tool — see if you or your favorite YouTuber were used by AI giants
https://www.proofnews.org/youtube-ai-...

Research paper about The Pile published by EleutherAI
https://arxiv.org/abs/2101.00027

NYT article about OpenAI and Google’s use of YouTube transcripts in AI training
https://www.nytimes.com/2024/04/06/te...

WSJ interview with OpenAI CTO Mira Murati
   • OpenAI's Sora Made Me Crazy AI Videos—Then...  

https://www.proofnews.org/
  / proof_news  
  / proof__news  

Join us in making trustworthy, verifiable information the new baseline:
https://www.proofnews.org/donate/
