Using Kaggle datasets in Colab
Here's how I got Kaggle datasets into Colab.
The book I'm following referred to the Dogs vs Cats dataset. After creating an account, I went to the dataset page.
One way would be to download it to my laptop and then upload it to Colab, but it is more straightforward to download it directly in Colab.
It turns out Kaggle provides a CLI tool for downloading datasets from the command line.
In Colab, I ran:
!pip install kaggle
Then I went to my account settings on Kaggle and created an API key, as the docs suggest.
Finally I set up the .kaggle/kaggle.json file on Colab:
!mkdir .kaggle
!echo '{"username":"...","key":"..."}' > .kaggle/kaggle.json
!cat .kaggle/kaggle.json
But running
!kaggle competitions download -c dogs-vs-cats
would still complain that the file couldn't be found. The error message also suggests an alternative based on environment variables, which the GitHub repo documents as well: https://github.com/Kaggle/kaggle-api#api-credentials.
However, in Colab, environment variables are set with the %env magic instead:
%env KAGGLE_USERNAME=...
%env KAGGLE_KEY=...
That did the trick!
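The file-based route may have failed only because of where the file ended up. The Kaggle README describes kaggle.json as living in a .kaggle folder under the home directory, while !mkdir .kaggle creates the folder relative to the notebook's working directory (in Colab that is typically /content, not the home directory). Here is a minimal sketch under that assumption, not verified against the Kaggle client: write_kaggle_config is a hypothetical helper name, and the username and key are whatever your own API key contains.

```python
import json
import os
from pathlib import Path


def write_kaggle_config(username, key, config_dir=None):
    """Write kaggle.json and restrict it to the current user (hypothetical helper)."""
    # Assumption: the client reads ~/.kaggle/kaggle.json unless told otherwise.
    config_dir = Path(config_dir) if config_dir else Path.home() / ".kaggle"
    config_dir.mkdir(parents=True, exist_ok=True)
    config_path = config_dir / "kaggle.json"
    config_path.write_text(json.dumps({"username": username, "key": key}))
    # Owner-only read/write, the equivalent of `chmod 600`.
    os.chmod(config_path, 0o600)
    return config_path
```

Calling write_kaggle_config("...", "...") in a cell before the download command would replace the mkdir and echo steps. The permissions are tightened because the file holds a secret.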
Once the download finishes, unzip the archive and you're good to go.
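For the unzip step, the unzip shell command works from a cell, and Python's standard zipfile module is an alternative. A minimal sketch: extract_archive is a hypothetical helper, and the archive name in the usage note below is an assumption, since the actual file name depends on what the download produces.

```python
import zipfile
from pathlib import Path


def extract_archive(archive_path, target_dir):
    """Extract a zip archive into target_dir and return the archive's member names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        return archive.namelist()
```

If the download were saved as dogs-vs-cats.zip, the call would be extract_archive("dogs-vs-cats.zip", "data"). The returned list of names is a quick way to check what came out of the archive.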