Datasets
Data.gov: The home of the U.S. Government’s open data. Here you will find data, tools, and resources to conduct research, develop web and mobile applications, design data visualizations, and more.
https://www.data.gov
Awesome Public Datasets
https://github.com/awesomedata/awesome-public-datasets
Kaggle Datasets: Kaggle supports a range of dataset publication formats and, more importantly, strongly recommends that publishers share their data in an accessible, non-proprietary format.
https://www.kaggle.com/datasets
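As a rough illustration of why an accessible, non-proprietary format matters, the sketch below reads CSV text using only Python's standard library. CSV is an assumption here, since the entry above does not name a specific format, and `SAMPLE_CSV` and `load_rows` are made-up names with made-up values, not part of any Kaggle dataset or tool.

```python
import csv
import io

# Hypothetical sample data, standing in for a downloaded CSV file.
SAMPLE_CSV = """id,label
1,cat
2,dog
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


rows = load_rows(SAMPLE_CSV)
for row in rows:
    # Every value is read as a string; convert types as needed.
    print(row["id"], row["label"])
```

For a file on disk, the same `csv.DictReader` call works on an object returned by `open(path, newline="")`.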
Google’s Open Images Dataset
https://storage.googleapis.com/openimages/web/index.html
UCI Machine Learning Dataset Repository
https://archive.ics.uci.edu/ml/index.php
COVID-19 Open Research Dataset
https://www.semanticscholar.org/cord19
Best FREE Datasets | Open-Source data for machine learning projects
10 Popular Machine Learning Datasets, Explained
How to Create a Dataset for Machine Learning
APIs
Instagram API
https://developers.facebook.com/docs/instagram-api
Twitter API
https://developer.twitter.com/en/docs/twitter-api
Discogs API
https://www.discogs.com/developers/
NASA API
https://api.nasa.gov
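To give a sense of what calling one of these APIs can look like, here is a minimal sketch that builds a request for NASA's Astronomy Picture of the Day (APOD) endpoint using only the standard library. The helper names `build_apod_url` and `fetch_apod` are hypothetical. `DEMO_KEY` is NASA's rate-limited demo key, so for regular use you would sign up for your own key on the site above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

APOD_ENDPOINT = "https://api.nasa.gov/planetary/apod"


def build_apod_url(api_key="DEMO_KEY", date=None):
    """Return the request URL for the APOD endpoint."""
    params = {"api_key": api_key}
    if date is not None:
        params["date"] = date  # expected as YYYY-MM-DD
    return f"{APOD_ENDPOINT}?{urlencode(params)}"


def fetch_apod(api_key="DEMO_KEY", date=None, timeout=10):
    """Fetch APOD metadata as a dict (performs a network request)."""
    with urlopen(build_apod_url(api_key, date), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only builds and prints the URL; call fetch_apod() to hit the network.
    print(build_apod_url(date="2020-01-01"))
```

Keeping URL construction separate from the network call makes the request easy to inspect before sending it.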
YouTube Data API
https://developers.google.com/youtube/v3
Flickr API
https://www.flickr.com/services/api/
Instagram-Scraper: A command-line application written in Python that scrapes and downloads an Instagram user’s photos and videos.
https://github.com/arc298/instagram-scraper
Web Scraping With Python 101
Google Images Download: Python Script for ‘searching’ and ‘downloading’ hundreds of Google images to the local hard disk!
https://github.com/hardikvasa/google-images-download
Memory of the World Library
https://library.memoryoftheworld.org
EasyOCR: Ready-to-use optical character recognition with 70+ languages supported including Chinese, Japanese, Korean and Thai.
https://github.com/JaidedAI/EasyOCR
Sound Maker, made using NSynth: a research project that trained a neural network on over 300,000 instrument sounds.
https://experiments.withgoogle.com/ai/sound-maker/view/
NSynth Dataset
https://magenta.tensorflow.org/datasets/nsynth
How Machine Learning Is Generating Strange, New Sounds
Listening to data from the Large Hadron Collider