Your Home for Data Science
|Founder||Anthony Goldbloom, Ben Hamner|
|Headquarters||San Francisco, United States|
|Key people||Anthony Goldbloom (CEO), Ben Hamner (CTO), Jeff Moser (Chief Architect)|
|Products||Competitions, Kaggle Kernels, Kaggle Datasets, Kaggle Learn, Jobs Board|
Kaggle is an online community of data scientists and machine learners, owned by Google LLC. Kaggle allows users to find and publish data sets, explore and build models in a web-based data-science environment, work with other data scientists and machine learning engineers, and enter competitions to solve data science challenges. Kaggle got its start by offering machine learning competitions and now also offers a public data platform, a cloud-based workbench for data science, and short-form AI education. On 8 March 2017, Google announced that it was acquiring Kaggle.
In June 2017, Kaggle announced that it had passed 1,000,000 registered users, or Kagglers. The community spans 194 countries. It is the largest and most diverse data community in the world, ranging from those just starting out to many of the world's best-known researchers.
Kaggle competitions regularly attract over a thousand teams and individuals. Kaggle's community has thousands of public datasets and code snippets (called "kernels" on Kaggle). Many participants publish papers in peer-reviewed journals based on their performance in Kaggle competitions.
- Machine learning competitions: this was Kaggle's first product and is still what the site is most famous for. Companies post problems and machine learners compete to build the best algorithm.
- Kaggle Kernels: a cloud-based workbench for data science and machine learning. It allows data scientists to share code and analysis in Python and R. Over 150K "kernels" (code snippets) have been shared on Kaggle, covering everything from sentiment analysis to object detection.
- Public datasets platform: community members share datasets with each other. It has datasets on everything from bone x-rays to results from boxing bouts.
- Kaggle Learn: for short-form AI education.
- Jobs board: employers post machine learning and AI jobs.
How Kaggle competitions work
- The competition host prepares the data and a description of the problem.
- Participants experiment with different techniques and compete against each other to produce the best models. Work is shared publicly through Kaggle Kernels to achieve a better benchmark and to inspire new ideas. Submissions can be made through Kaggle Kernels, through manual upload or using the Kaggle API. For most competitions, submissions are scored immediately (based on their predictive accuracy relative to a hidden solution file) and summarized on a live leaderboard.
- After the deadline passes, the competition host pays the prize money in exchange for "a worldwide, perpetual, irrevocable and royalty-free license [...] to use the winning Entry", i.e. the algorithm, software and related intellectual property developed, which is "non-exclusive unless otherwise specified".
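The scoring step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Kaggle's actual implementation: the metric is plain classification accuracy, and the names `Entry`, `accuracy`, `leaderboard`, the team names, and the row IDs are all hypothetical. Real competitions each define their own evaluation metric and submission format.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One leaderboard row (hypothetical structure, for illustration only)."""
    team: str
    score: float


def accuracy(predictions: dict[str, int], solution: dict[str, int]) -> float:
    """Fraction of rows in the hidden solution that the submission labels correctly.

    Rows missing from the submission count as wrong.
    """
    if not solution:
        raise ValueError("solution file is empty")
    correct = sum(
        1 for row_id, label in solution.items()
        if predictions.get(row_id) == label
    )
    return correct / len(solution)


def leaderboard(submissions: dict[str, dict[str, int]],
                solution: dict[str, int]) -> list[Entry]:
    """Score every team's submission and sort from best to worst."""
    entries = [
        Entry(team, accuracy(preds, solution))
        for team, preds in submissions.items()
    ]
    return sorted(entries, key=lambda e: e.score, reverse=True)


# Made-up data: the solution is known only to the host, never to participants.
hidden_solution = {"r1": 1, "r2": 0, "r3": 1, "r4": 0}
submissions = {
    "team_a": {"r1": 1, "r2": 0, "r3": 0, "r4": 0},
    "team_b": {"r1": 1, "r2": 0, "r3": 1, "r4": 0},
    "team_c": {"r1": 0, "r2": 1, "r3": 1},  # omits row r4
}

board = leaderboard(submissions, hidden_solution)
for rank, entry in enumerate(board, start=1):
    print(rank, entry.team, entry.score)
```

Because participants only ever see their score and rank, the solution file itself stays hidden while the leaderboard still gives immediate feedback on each submission.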
Alongside its public competitions, Kaggle also offers private competitions limited to Kaggle's top participants. Kaggle offers a free tool for data science teachers to run academic machine learning competitions, Kaggle In Class. Kaggle also hosts recruiting competitions in which data scientists compete for a chance to interview at leading data science companies like Facebook, Winton Capital, and Walmart.
Impact of Kaggle competitions
Kaggle has run hundreds of machine learning competitions since the company was founded. Competitions have ranged from improving gesture recognition for Microsoft Kinect to improving the search for the Higgs boson at CERN.
Competitions have resulted in many successful projects, including furthering the state of the art in HIV research, chess ratings and traffic forecasting. Most famously, Geoffrey Hinton and George Dahl used deep neural networks to win a competition hosted by Merck, and Vlad Mnih (one of Hinton's students) used deep neural networks to win a competition hosted by Adzuna. These wins helped show the power of deep neural networks and resulted in the technique being taken up by others in the Kaggle community. Tianqi Chen from the University of Washington also used Kaggle to show the power of XGBoost, which has since taken over from Random Forest as one of the main methods used to win Kaggle competitions.
Several academic papers have been published on the basis of findings made in Kaggle competitions. A key to this is the effect of the live leaderboard, which encourages participants to continue innovating beyond existing best practice. The winning methods are frequently written up on the Kaggle blog, No Free Hunch.