How to upload your CSV file online for data analysis
This tutorial shows you how to put a CSV file online so that you can run your data analysis from a URL that points to the data instead of from a local CSV file.
There are a couple of ways to make a CSV file available online. For data scientists, the easiest way I have found is GitHub. Here is how you do it.
Step 2: Click on create new repository in the upper right corner of your GitHub account.
Step 3: Fill out the information for your new repository and remember to add a README file by ticking the checkbox.
Step 4: Click on upload (or do this step from your terminal instead).
Step 5: Drag and drop the CSV file, or the folder containing it, into GitHub (or click on upload and pick it).
Step 6: Write a useful commit message to yourself and commit your changes/upload.
Step 7: Click on the CSV file you just uploaded.
Step 8: Click on Raw at the top of the page.
Step 9: Copy the link/URL of the raw data page.
Step 10: Paste the raw GitHub file link into the pandas read_csv function.
Step 11: Verify that the data loaded correctly and that the link works by calling the pandas head() function.
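Steps 8 to 11 can be sketched in a few lines of Python. This is a minimal sketch, not the only way to do it. The helper to_raw_url is a hypothetical name, and the repository URL is a made-up placeholder. The rewrite it performs assumes the way GitHub addresses raw files at the time of writing, so the link you copy from the Raw button remains the one to trust. An in-memory sample stands in for the download so the code runs without a connection.

```python
import io
from urllib.parse import urlsplit

import pandas as pd


def to_raw_url(page_url):
    """Convert a github.com file-page URL into its raw.githubusercontent.com form."""
    parts = urlsplit(page_url)
    segments = parts.path.strip("/").split("/", 3)
    if parts.netloc != "github.com" or len(segments) < 4 or segments[2] != "blob":
        raise ValueError(f"not a GitHub file page URL: {page_url}")
    user, repo, _, branch_and_path = segments
    return f"https://raw.githubusercontent.com/{user}/{repo}/{branch_and_path}"


# Hypothetical file page; replace it with your own repository and file.
page_url = "https://github.com/example-user/example-repo/blob/main/data/sales.csv"
raw_url = to_raw_url(page_url)
print(raw_url)

# With a real link, Steps 10 and 11 are just:
#     df = pd.read_csv(raw_url)
#     print(df.head())
# The in-memory sample below stands in for the download so the sketch runs offline.
sample = io.StringIO("name,score\nAda,90\nGrace,95\n")
df = pd.read_csv(sample)
print(df.head())   # first rows: do the columns and values look right?
print(df.shape)    # (rows, columns): does the size match your file?
```

If head() shows HTML tags or a single jumbled column instead of your data, the link most likely points at the normal file page and not at the raw file.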
What is an alternative for people who don't have GitHub?
There is an alternative to using GitHub. I have found the ShareCSV website to be one of the better ones where you can upload a CSV file and get a link for it without dealing with a bunch of ads.
Step 13: Upload your CSV file. You will get a preview of the data.
Step 14: Share the CSV file link with somebody, or download the data back as a CSV file if you want.
- Sometimes the ShareCSV website doesn’t work when you try to download the data back into CSV format.
- Important: you cannot use the ShareCSV URL to do data analysis. Pandas cannot read the data from a ShareCSV link.
- ShareCSV is ideal:
  - when you don’t want the data on GitHub,
  - when you don’t have a GitHub account, and
  - when you just want an easy way to get a URL for your data so that you can quickly view or share it without sending a CSV file.
Why would you want to use a URL containing your data instead of a CSV file for data analysis? There are a few reasons.
- When you are working on another computer, it is easier to access the data.
- When you share your data science notebook with somebody, that person can access the data used in the analysis without you having to send CSV files for them to load locally.
- When you are working in Google Colab, you don’t have to load your CSV file manually every time you open your notebook. Google Colab deletes anything that is uploaded to it when it refreshes, so if your analysis depends on an uploaded CSV file, you have to re-upload it whenever you open that notebook. A link containing the data avoids this headache and lets you access your data even after Google Colab refreshes.
The downside of using data stored at a URL is that you need internet access to retrieve it. But most people are online most of the time, so this shouldn’t be a problem.
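If you do expect to work offline now and then, one possible workaround is to keep a local copy of the last successful download and fall back to it when the URL is unreachable. The sketch below is only one way to do this. The function name read_csv_with_cache and the file names are hypothetical, and a temporary local file addressed with a file:// URL stands in for the online link so the example runs without a connection.

```python
import tempfile
import urllib.request
from pathlib import Path

import pandas as pd


def read_csv_with_cache(url, cache_path):
    """Download the CSV at `url` when possible; otherwise reuse the saved copy."""
    cache_path = Path(cache_path)
    try:
        # Refresh the local copy whenever the URL is reachable.
        with urllib.request.urlopen(url, timeout=10) as response:
            cache_path.write_bytes(response.read())
    except OSError:
        # urllib's URLError is a subclass of OSError, so this covers a lost
        # connection or a dead link. Without a saved copy there is nothing to
        # fall back to, so the error is passed on.
        if not cache_path.exists():
            raise
    return pd.read_csv(cache_path)


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source.csv"   # stand-in for the file behind the URL
    source.write_text("name,score\nAda,90\nGrace,95\n")
    cache = Path(tmp) / "cache.csv"

    first = read_csv_with_cache(source.as_uri(), cache)    # reachable: download and save
    source.unlink()                                        # simulate losing access
    second = read_csv_with_cache(source.as_uri(), cache)   # unreachable: read saved copy

print(first.equals(second))
```

With a real link you would pass the raw GitHub URL as url and any local file name as cache_path. Keep in mind that the saved copy can be out of date if the file behind the URL has changed since the last download.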