Public Data Sets
Data sets are so integral to data science and algorithms that it’s almost impossible to talk about H2O without talking about data. This page is a small but growing collection of links to publicly available data sets.
Open City
| Type/Source | Link |
|---|---|
| Palo Alto Open Data | http://www.cityofpaloalto.org/gov/depts/it/open_data/default.asp |
| Chicago | https://data.cityofchicago.org/ |
| Chicago crime data (2001 to present) | https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2 |
| NYC | https://nycopendata.socrata.com/ |
| Rents & Neighborhoods | http://www.huduser.org/portal/datasets/HUD_data_matrix.html |
Transportation and Travel
| Type/Source | Link |
|---|---|
| Airlines (1987-2007) | http://stat-computing.org/dataexpo/2009/the-data.html (based on RHIPE’s dataset) |
| Open flights | http://openflights.org/data.html |
| Capital Bike Share | https://www.capitalbikeshare.com/trip-history-data |
Sciences and Engineering
| Type/Source | Link |
|---|---|
| Elements of Statistical Learning Data | http://www-stat.stanford.edu/~tibs/ElemStatLearn/data.html |
| NASA | http://data.nasa.gov/ |
| Seismic | http://sioseis.ucsd.edu/segy.header.html |
| Weather | http://OpenWeatherMap.org, http://OpenMeteoData.org |
| NIST | http://srdata.nist.gov/gateway/gateway?dblist=0 |
| GitHub Archive | http://www.githubarchive.org |
Diverse Data Sets
Public Policy Data
| Type/Source | Link |
|---|---|
| European Open Data (6098 datasets) | http://open-data.europa.eu/en/ |
| US Open Data | http://www.data.gov/, http://www.data.gov/opendatasites |
| WorldBank | http://data.worldbank.org/data-catalog |
| Guardian | http://www.guardian.co.uk/news/datablog/interactive/2013/jan/14/all-our-datasets-index |
| Statistics Netherlands | http://www.cbs.nl/en-GB/menu/home/default.htm?Languageswitch=on |
| Quandl (6M financial, economic, and social datasets) | http://www.quandl.com/ |
Other
| Type/Source | Link |
|---|---|
| MovieLens film recommendations | http://grouplens.org/datasets/movielens/ |
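Before importing one of these data sets into an analysis tool, it can help to check its column names and a few rows. The following is a minimal sketch using only the Python standard library. It assumes the source exposes a plain, uncompressed CSV file over HTTP, which is not true of every link above. The function names `preview_csv` and `preview_csv_url` are hypothetical helpers introduced for illustration, and the sample rows are made up.

```python
import csv
import io
import itertools
import urllib.request


def preview_csv(lines, limit=5):
    """Return the header row and up to `limit` data rows.

    `lines` is any iterable of text lines, such as an open file
    or an io.StringIO object.
    """
    reader = csv.reader(lines)
    header = next(reader, [])  # empty list if the input has no rows
    rows = list(itertools.islice(reader, limit))
    return header, rows


def preview_csv_url(url, limit=5, encoding="utf-8"):
    """Stream a CSV over HTTP and preview it without reading the whole file.

    Assumes `url` points directly at an uncompressed CSV file.
    """
    with urllib.request.urlopen(url) as response:
        text = io.TextIOWrapper(response, encoding=encoding, newline="")
        return preview_csv(text, limit)


# Offline usage with made-up sample data (no network access needed).
sample = io.StringIO("city,year,count\nA,2001,3\nB,2002,5\n")
header, rows = preview_csv(sample, limit=1)
print(header)
print(rows)
```

Separating the parsing step from the download step keeps the preview logic usable on local files as well, and makes it easy to check without a network connection.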