5 Ways to Load a Dataset in a Google Colab Python Notebook
Method 1: Load CSV File in Google Colab
How to load a dataset from a CSV file on your local computer into Google Colab for data analysis using Python and pandas.
There are two ways to load a CSV file in Google Colab:
Option 1: Upload CSV file manually
Option 2: Upload CSV file using Code
In Google Colab, you can upload a CSV file with a single line of code.
When you run the cell, you are prompted to “browse” your files and choose the one you want to upload.
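The upload line described above is usually written with Colab's `files` helper. Here is a minimal sketch, assuming the notebook is running inside Google Colab (the `google.colab` package is not available anywhere else). The wrapper name `upload_from_computer` is made up for illustration.

```python
def upload_from_computer():
    """Open Colab's file picker and return the uploaded files.

    The result is a dict that maps each file name to its contents as bytes.
    """
    # Imported inside the function because google.colab only exists in Colab.
    from google.colab import files
    return files.upload()


# In a Colab cell:
# uploaded = upload_from_computer()
```

If you don't want the wrapper, the two lines inside the function can go straight into a cell.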
After you have uploaded the CSV file, it is time to read it. Reading the file takes two lines of code.
The first line reads the file (make sure pandas is imported as pd first).
The second line returns the first 5 rows of the dataset, which is a quick way to verify that the data is loaded correctly.
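Here is a minimal sketch of the read step. It assumes the upload gave you a dict that maps file names to file contents as bytes, which is what Colab's `files.upload()` returns. The helper name `read_uploaded_csv` and the file name `my_data.csv` are placeholders, not names from this post.

```python
import io

import pandas as pd


def read_uploaded_csv(uploaded, name):
    """Build a DataFrame from one entry of the uploaded-files dict."""
    return pd.read_csv(io.BytesIO(uploaded[name]))


# In a Colab cell, after the upload:
# data = read_uploaded_csv(uploaded, "my_data.csv")
# data.head()  # first 5 rows, to check the data loaded correctly
```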
WARNING: Files that you upload this way are not kept with the notebook. The next time you open the notebook in Google Colab, you will need to upload them again. That is why data stored at a URL is ideal for data analysis in tools like Google Colab. Go to Method 2 below to learn how to load your dataset from a URL.
Method 2: Load dataset from URL in Google Colab
How to load a dataset from a URL into Google Colab for data analysis using Python and pandas.
Loading data from a URL is quite simple. There are two ways to read data from a URL.
Option 1: You could do…
STEP 1: Get the URL that contains the data
STEP 2: Import Pandas as pd.
STEP 3: Read the data and check it.
`url = "url link"`
`data = pd.read_csv(url)`
`data.head()` to verify it is loaded correctly
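Put together as a single cell, the steps above could look like the sketch below. The function name `load_csv_from_url` and the example address are placeholders. It also assumes the URL points directly at a raw CSV file, not at a web page that displays one.

```python
import pandas as pd


def load_csv_from_url(url):
    """Read a CSV straight from a URL into a DataFrame."""
    return pd.read_csv(url)


# Placeholder address, replace it with the URL of your own dataset:
# data = load_csv_from_url("https://example.com/my_data.csv")
# data.head()
```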
OR…
Option 2: you could do…
Method 3: Load Dataset from Google drive in Google Colab
How to load a dataset from Google Drive into Google Colab for data analysis using Python and pandas.
To load data from Google Drive in Google Colab, you can type in the code manually, but I have found that using the Google Colab code snippet is the easiest way to do this.
Step 1: Click on the arrow on the top left side of the page.
Step 2: Click on “Code Snippets”
Step 3: Type “DRIVE” in the search bar, then click the arrow pointing to the right or the INSERT button to insert the code snippet into your Google Colab notebook.
Step 4: This is what the code snippet is supposed to look like.
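The snippet that Colab inserts is not reproduced here, and its exact wording may differ between Colab versions. The sketch below shows the same idea: mount your Drive, then read a file from it. It assumes the notebook is running inside Colab and that your files appear under a `MyDrive` folder inside the mount point. The helper names and `my_data.csv` are placeholders.

```python
import posixpath


def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive in Colab; asks for authorization when run."""
    # Imported here because google.colab only exists inside Colab.
    from google.colab import drive
    drive.mount(mount_point)
    return mount_point


def drive_file(relative_path, mount_point="/content/drive"):
    """Build the full path of a file stored in your Drive."""
    return posixpath.join(mount_point, "MyDrive", relative_path)


# In a Colab cell (the file name is a placeholder):
# import pandas as pd
# mount_drive()
# data = pd.read_csv(drive_file("my_data.csv"))
# data.head()
```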
Method 4: Load dataset from ZIP file in Google Colab
How to load a dataset from a ZIP file in Google Colab for data analysis using Python and pandas.
Loading data from a ZIP file in Google Colab takes one extra step. There are two options.
Option 1: Unzip File(s) On Local Computer
Unzip and extract the files on your local computer, then upload the CSV file to Google Colab. Alternatively, make the CSV file available online and use its URL to access the dataset. Read this blog post to learn how to convert your CSV file into a URL link that you can use for data analysis.
Option 2: Unzip and Extract Zip files inside Google Colab
You can also unzip and extract the files inside Google Colab using code. This is how you do it.
Step 1: Get the data from the URL containing the zip file.
Step 2 A: Unzip the zipped file.
Step 2 B: Click the small arrow in the top left corner, then click “Files”, to check that the files are there.
Step 3: Take a quick look at the data and verify that it has been unzipped by using the `!head` and `!tail` shell commands, which show the first and last lines of the file without loading the dataset with pandas first.
Step 4: Load the CSV file using `data = pd.read_csv('File name')`.
Step 5: Verify that the data is loaded correctly by running `data.head()`.
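The steps above can be sketched in plain Python. This is a swapped-in approach: it uses Python's standard library (`urllib` and `zipfile`) where the steps use shell commands, and `peek` is a rough stand-in for `!head`. All function names, the URL, and the file names are placeholders.

```python
import glob
import os
import urllib.request
import zipfile

import pandas as pd


def download(url, filename):
    """Save the file at `url` under `filename`."""
    urllib.request.urlretrieve(url, filename)
    return filename


def extract_zip(zip_path, dest_dir="."):
    """Unzip everything into dest_dir and return the CSV files found there."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
    pattern = os.path.join(dest_dir, "**", "*.csv")
    return sorted(glob.glob(pattern, recursive=True))


def peek(path, n=5):
    """Return the first n lines of a text file without using pandas."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for _, line in zip(range(n), handle)]


# In a Colab cell (URL and file names are placeholders):
# download("https://example.com/my_data.zip", "my_data.zip")
# csv_files = extract_zip("my_data.zip", "unzipped")
# print(peek(csv_files[0]))
# data = pd.read_csv(csv_files[0])
# data.head()
```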
When your dataset is stored in a TAR file instead of a ZIP file, you can still extract it and use the CSV file inside it.
Method 5: Load .TAR Files in Google Colab
How to load a dataset from a TAR file in Google Colab for data analysis using Python and pandas.
There are two options for extracting a TAR file.
Option 1: Extract the file locally
Extract the file locally, then upload the CSV file to Google Colab. Alternatively, make the CSV file available online and use its URL to access the dataset. Read this blog post to learn how to convert your CSV file into a URL link that you can use for data analysis.
Not every computer can open TAR files out of the box, so this is how to extract them locally.
On Windows, this is how you extract a TAR file.
Step 1: Download the 7-Zip software.
Step 2: Right-click the TAR file and choose to extract it with 7-Zip.
On a Mac, this is how you extract a TAR file.
Option 2: Extract .TAR File(s) inside Google Colab
You can also extract the TAR file inside Google Colab using code. This is how you do it.
Step 1: Get the data from the URL containing the TAR file.
Step 2: Click the small arrow in the top left corner, then click “Files”, to check that the downloaded file is there.
Step 3: Use the following line of code to extract the data from the TAR file.
Step 4: Change into the directory that holds the CSV files using this line of code.
This is the directory you are switching to.
Step 5: Get a list of the CSV files inside the directory you just switched to by using this line of code.
Step 6: Load and read any CSV file from the TAR file to verify that you have successfully extracted your files.
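The extract, list, and load steps above can be sketched with Python's built-in `tarfile` module. This is a swapped-in approach and not necessarily the commands the steps have in mind: it searches the extraction folder for CSV files, so no directory change is needed. The function name `extract_tar` and the file names are placeholders, and the sketch assumes the TAR file has already been downloaded.

```python
import glob
import os
import tarfile

import pandas as pd


def extract_tar(tar_path, dest_dir="."):
    """Extract a TAR archive and return the CSV files found in dest_dir."""
    # tarfile.open detects compression such as gzip automatically.
    with tarfile.open(tar_path) as archive:
        archive.extractall(dest_dir)
    pattern = os.path.join(dest_dir, "**", "*.csv")
    return sorted(glob.glob(pattern, recursive=True))


# In a Colab cell (file names are placeholders):
# csv_files = extract_tar("my_data.tar", "untarred")
# print(csv_files)                  # list the CSV files that were extracted
# data = pd.read_csv(csv_files[0])  # load one to confirm the extraction worked
# data.head()
```

Only extract archives from sources you trust, since a TAR file can contain unexpected paths.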
I hope you enjoyed this post. Let me know in the comment section below if there are any other methods that I should include.