Content ID (algorithm)

Content ID is a digital fingerprinting system developed by Google to identify and manage copyrighted content on YouTube. Videos uploaded to YouTube are compared against audio and video files that content owners have registered with Content ID, and any matches are flagged. Content owners can choose to have matching content taken down or to monetize it. The system began to be implemented around 2007. By 2016, it had cost $60 million to develop and had led to around $2 billion in payments to copyright holders.[1]

Overview

Content ID[2] creates an ID File for copyrighted audio and video material and stores it in a database. When a video is uploaded, the system checks it against the database and flags the video as a copyright violation if a match is found.[3] When this occurs, the content owner can choose to block the video so that it cannot be viewed, to track the video's viewing statistics, or to add advertisements to the "infringing" video, with the advertising revenue going automatically to the content owner.
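The flow described above (register an ID File, check each upload against the database, then apply the owner's chosen policy) can be sketched roughly as follows. This is a minimal illustration in Python, not YouTube's implementation: the actual fingerprinting method is not described in this article, so a toy hash over sliding windows of numeric samples stands in for the ID File. The names Policy, Reference, make_id_file and check_upload, and the 0.5 match threshold, are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    """The three owner choices named in the text."""
    BLOCK = "block"        # make the video unviewable
    TRACK = "track"        # only collect viewing statistics
    MONETIZE = "monetize"  # run ads; revenue goes to the content owner


@dataclass
class Reference:
    owner: str
    policy: Policy
    id_file: frozenset  # toy stand-in for a real ID File


def make_id_file(samples, window=4):
    """Toy fingerprint (assumption): hash every sliding window of samples."""
    hashes = set()
    for i in range(len(samples) - window + 1):
        chunk = ",".join(str(s) for s in samples[i:i + window])
        hashes.add(hashlib.sha1(chunk.encode()).hexdigest())
    return frozenset(hashes)


def check_upload(samples, database, threshold=0.5):
    """Return the best-matching Reference, or None if nothing reaches threshold.

    The score is the fraction of the reference's window hashes that also
    appear in the upload.
    """
    upload = make_id_file(samples)
    best, best_score = None, 0.0
    for ref in database:
        if not ref.id_file:
            continue
        score = len(upload & ref.id_file) / len(ref.id_file)
        if score >= threshold and score > best_score:
            best, best_score = ref, score
    return best


# Usage: one registered work, one upload that reuses most of it.
song = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
database = [Reference("Example Records", Policy.MONETIZE, make_id_file(song))]

upload = [7, 7] + song[2:] + [0, 0]
match = check_upload(upload, database)
print(match.policy.value if match else "no match")  # prints "monetize"
```

A set-overlap score is used here only because it is easy to follow. It tolerates extra material before and after the matched section, but unlike a production fingerprint it would not survive re-encoding, pitch shifts, or cropping.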

Only uploaders who meet specific criteria can use Content ID.[4][5] These criteria make it difficult to use Content ID without the backing of a major partner, de facto limiting its use to big corporations.[6]

Context

Between 2007 and 2009, organizations including Viacom, Mediaset, and the English Premier League filed lawsuits against YouTube, claiming that it had done too little to prevent the uploading of copyrighted material.[7][8][9] Viacom, demanding $1 billion in damages, said that it had found more than 150,000 unauthorized clips of its material on YouTube that had been viewed "an astounding 1.5 billion times".

During the same court battle, Viacom won a court ruling requiring YouTube to hand over 12 terabytes of data detailing the viewing habits of every user who had watched videos on the site. On March 18, 2014, after seven years, the lawsuit was settled with an undisclosed agreement.[10]

History

In June 2007, YouTube began trials of a system for automatic detection of uploaded videos that infringe copyright. Google CEO Eric Schmidt regarded this system as necessary for resolving lawsuits such as the one from Viacom, which alleged that YouTube profited from content that it did not have the right to distribute.[11] The system was initially called "Video Identification"[12][13] and later became known as Content ID.[2] By 2010, YouTube had "already invested tens of millions of dollars in this technology".[13] In 2011, YouTube described Content ID as "very accurate in finding uploads that look similar to reference files that are of sufficient length and quality to generate an effective ID File".[3]

By 2012, Content ID accounted for over a third of the monetized views on YouTube.[14]

In 2016, Google stated that Content ID had paid out around $2 billion to copyright holders (compared to around $1 billion by 2014), and had cost $60 million to develop.[1]

Since mid-2018, Google has been beta testing a new tool called Copyright Match, a simplified version of Content ID with more limited options, which would be available to uploaders with more than 100,000 views.[6][15] Unlike Content ID, which sends copyright notices automatically, Copyright Match takes no action until the creator chooses to do so.

Trademark lawsuit

In 2006, YouTube and Audible Magic signed an agreement to license the use of Audible Magic's own "Content ID" fingerprinting technology. When Google bought YouTube in November of the same year, the license was transferred to Google.[16] The agreement was terminated in 2009, but in 2014 Google obtained a trademark for its own "Content ID" implementation.[17] Audible Magic sued Google the same year on the basis that it owned the "Content ID" trademark and that Google's trademarking of its implementation was therefore fraudulent.

Criticisms

An independent test in 2009, in which multiple versions of the same song were uploaded to YouTube, concluded that while the system was "surprisingly resilient" in finding copyright violations in the audio tracks of videos, it was not infallible.[18] The use of Content ID to remove material automatically has led to controversy in some cases, as the videos have not been checked by a human for fair use.[19]

If a YouTube user disagrees with a decision by Content ID, they can fill in a form disputing the decision.[20] Prior to 2016, videos were not monetized until the dispute was resolved.

In December 2013, Google changed the way the system worked (seemingly to cover YouTube in case of lawsuits), leading to numerous copyright notices being sent to YouTube content creators who had uploaded gameplay videos. Those notices caused ad revenue to be automatically diverted to third parties, some of which had no connection to the games.[21][22]

Since April 2016, videos have continued to be monetized while a dispute is in progress, and the money goes to whoever wins the dispute.[23] Should the uploader want to monetize the video again, they may remove the disputed audio in the "Video Manager".[24] YouTube has cited the effectiveness of Content ID as one of the reasons why the site's rules were modified in December 2010 to allow some users to upload videos of unlimited length.[25]

The music industry has criticized Content ID as inefficient, with Universal Music Publishing Group (UMPG) estimating in a 2015 filing to the US Copyright Office "that Content ID fails to identify upwards of 40 percent of the use of UMPG’s compositions on YouTube".[1][26] Google has countered these assertions by stating that (as of 2016) Content ID detected over 98% of known copyright infringement on YouTube, with humans filing removal notices accounting for only 2%.[1]

In January 2018, a YouTube uploader who had created a white noise generator received copyright notices about a video he had made with the tool, which therefore contained only white noise.[27]

In September 2018, a German university professor uploaded videos of several classical music performances that were no longer under copyright: the composers had died long ago, and the copyright on the performances themselves had expired. After receiving several copyright violation notices from YouTube, he was able to have most of them lifted, but Deutsche Grammophon refused to lift two of them even though the copyright had expired.[28][29][30] In other cases, copyright violation notices were even sent to uploaders who had recorded themselves playing public domain classical music, with Sony Music asserting copyright over more than 1,100 compositions by Johann Sebastian Bach via Content ID.[31] Commentators noted that this was also the case on other platforms such as Facebook.[32]

References

  1. Popper, Ben (2016-07-13). "YouTube to the music industry: here's the money". The Verge. Retrieved 2018-09-20.
  2. "YouTube Content ID". YouTube. September 28, 2010. Retrieved May 25, 2015.
  3. "More about Content ID". YouTube. Retrieved December 4, 2011.
  4. "Qualifying for Content ID". Google. Retrieved 2018-09-09.
  5. "Content eligible for Content ID". Google. Retrieved 2018-09-09.
  6. "YouTube Beta Testing Content ID for Everyone". plagiarismtoday.com. 2018-05-02. Retrieved 2018-09-09.
  7. "Viacom will sue YouTube for $1bn". BBC News. March 13, 2007. Retrieved May 26, 2008.
  8. "Mediaset Files EUR500 Million Suit Vs Google's YouTube". CNNMoney.com. July 30, 2008. Retrieved August 19, 2009.
  9. "Premier League to take action against YouTube". The Daily Telegraph. Telegraph Media Group. May 5, 2007. Retrieved March 26, 2017.
  10. "Google and Viacom settle seven-year YouTube row". BBC News. March 18, 2014. Retrieved March 18, 2014.
  11. Delaney, Kevin J. (June 12, 2007). "YouTube to Test Software To Ease Licensing Fights". Wall Street Journal. Retrieved December 4, 2011.
  12. YouTube Advertisers (February 4, 2008). "Video Identification". Retrieved August 29, 2018.
  13. King, David (December 2, 2010). "Content ID turns three". Official YouTube Blog. Retrieved August 29, 2018.
  14. "Press Statistics". YouTube. Retrieved March 13, 2012.
  15. "YouTube to Launch Tool to Detect Re-Uploaded Videos Automatically". Variety. 2018-07-11. Retrieved 2018-09-09.
  16. "Audible Magic Accuses YouTube of Fraud Over Content ID Trademark". torrentfreak.com. 2017-01-11. Retrieved 2018-09-09.
  17. "Audible Magic Accuses YouTube of Fraud Over Content ID Trademark". digitalmusicnews.com. 2017-01-12. Retrieved 2018-09-09. "However, in 2013, Google signed a declaration stating that it knew of no other company entitled to use the Content ID brand"
  18. Von Lohmann, Fred (April 23, 2009). "Testing YouTube's Audio Content ID System". Retrieved December 4, 2011.
  19. Von Lohmann, Fred (February 3, 2009). "YouTube's January Fair Use Massacre". Retrieved December 4, 2011.
  20. "Content ID disputes". YouTube. Retrieved December 4, 2011.
  21. "YouTube video game shows hit with copyright blitz". Polygon. 2013-12-10. Retrieved 2018-09-09.
  22. "YouTube Responds To Content ID Crackdown, Plot Thickens". Forbes. 2013-12-17. Retrieved 2018-09-09.
  23. Hernandez, Patricia. "YouTube's Content ID System Gets One Much-Needed Fix". Kotaku. Retrieved September 16, 2017.
  24. "Remove Content ID claimed songs from my videos – YouTube Help". support.google.com. Retrieved September 17, 2017.
  25. Siegel, Joshua; Mayle, Doug (December 9, 2010). "Up, Up and Away – Long videos for more users". Official YouTube Blog. Google. Retrieved March 25, 2017.
  26. "Comments of Universal Music Group". Scribd. 2015. Retrieved 2018-09-20.
  27. "YouTube's problematic Content ID says white noise is copyrighted". The Next Web. 2018-01-05. Retrieved 2018-09-09.
  28. Kaiser, Ulrich (2018-09-03). "Google: Sorry professor, old Beethoven recordings on YouTube are copyrighted". Ars Technica. Retrieved 2018-09-09.
  29. "YouTube's Content-ID Flags Music Prof's Public Domain Beethoven and Wagner Uploads". torrentfreak.com. 2018-09-03. Retrieved 2018-09-09.
  30. "How The EU May Be About To Kill The Public Domain: Copyright Filters Takedown Beethoven". Techdirt. 2018-08-28. Retrieved 2018-09-09.
  31. "The Empire Strikes Bach". freebeacon.com. 2018-09-08. Retrieved 2018-09-09.
  32. "The future is here today: you can't play Bach on Facebook because Sony says they own his compositions". Boing Boing. 2018-09-05. Retrieved 2018-09-09.