Datasets

  • UCI Machine Learning Repository: More than 400 ML datasets.

  • Kaggle: More than 8,000 datasets of varying quality covering numerous topics.

  • Rdatasets: A collection of 1161 datasets that were originally distributed alongside the statistical software environment R and some of its add-on packages. Curated by Vincent Arel-Bundock.

  • Awesome Public Datasets: A VERY large list of tidied public datasets. Most of the datasets are free, but some are not.

  • IMDb Datasets: Lots and lots of movie data from IMDb.

  • Gun Violence Database: A crowdsourced database of gun violence incidents in the US.

  • UN Data: Data from diverse UN sub-organizations.

  • Eurostat: EU data and statistics.

  • Gapminder: An independent Swedish foundation dedicated to fighting misconceptions about global development, Gapminder offers datasets related to various development indicators.

  • Open Source Psychometrics Project: A website providing a collection of interactive personality tests with detailed results that can be taken for personal entertainment or to learn more about personality assessment. The tests range from very serious to not so much. Special focus is given to the strengths, weaknesses and validity of the various systems.

  • OpenStreetMap: Collaborative project to create a free editable map of the world. Geographic data can be downloaded as XML files.

  • The Stanford Open Policing Project: Standardized data on interactions between police and the public, e.g. vehicle and pedestrian stops, from law enforcement departments across the USA.

  • Our World in Data: Online publication giving an overview of global living conditions. Topics covered: health, food provision, the growth and distribution of incomes, violence, rights, wars, culture, energy use, education, and environmental changes. Charts generally include an option to download the underlying data.

  • Open Food Facts: Collaborative database of food products from around the world.

  • Climate Data Online: Global climate data from NOAA's National Centers for Environmental Information (formerly the National Climatic Data Center), U.S. Department of Commerce.

  • Data Sets: From the book Applied Regression Analysis and Generalized Linear Models.

  • United States Census Bureau: Demographic data from the USA.
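
The OpenStreetMap entry above mentions that geographic data can be downloaded as XML files. The following is a minimal sketch, assuming Python and its standard-library xml.etree.ElementTree module, of how point features might be read from such a file: each node element carries id, lat, and lon attributes, and optional tag children hold key/value pairs in their k and v attributes. The sample string is hand-written for illustration (not real map data), and the helper name parse_nodes is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the OSM XML layout; illustrative only, not real map data.
SAMPLE_OSM = """<osm version="0.6">
  <node id="1" lat="52.5200" lon="13.4050">
    <tag k="name" v="Sample Cafe"/>
    <tag k="amenity" v="cafe"/>
  </node>
  <node id="2" lat="52.5210" lon="13.4060"/>
</osm>
"""


def parse_nodes(osm_xml):
    """Return one dict per <node> element with its id, coordinates, and tags."""
    root = ET.fromstring(osm_xml)
    nodes = []
    for node in root.iter("node"):
        # Each <tag k="..." v="..."/> child becomes one entry in the tags dict.
        tags = {tag.get("k"): tag.get("v") for tag in node.findall("tag")}
        nodes.append(
            {
                "id": int(node.get("id")),
                "lat": float(node.get("lat")),
                "lon": float(node.get("lon")),
                "tags": tags,
            }
        )
    return nodes


if __name__ == "__main__":
    for entry in parse_nodes(SAMPLE_OSM):
        print(entry)
```

For a downloaded file, ET.parse(path).getroot() could replace ET.fromstring; for large extracts, a streaming approach such as ET.iterparse is likely a better fit than loading the whole document into memory.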