I'm trying to import a file into Colab. I've tried several variations from https://buomsoo-kim.github.io/colab/2018/04/15/Colab-Importing-CSV-and-JSON-files-in-Google-Colab.md/
#import packages
import pandas as pd
pd.plotting.register_matplotlib_converters()
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
import io
print("Setup Complete")
from google.colab import files
uploaded = files.upload()
# Read the file into a variable power_data
#power_data = pd.read("DE_power prices historical.csv")
power_data = pd.read_csv('DE_power prices historical.csv', on_bad_lines='skip')  # error_bad_lines was removed in pandas 2.0
I keep getting an error (screenshot attached).
Try this method instead; it is a bit easier:
Upload .csv files to your Google Drive
Run the following code in your Colab cell:
from google.colab import drive
drive.mount('/content/drive')
Follow the link in the output cell and authorize your Google account
Import using Pandas like:
power_data = pd.read_csv('/content/drive/My Drive/<filename>.csv')
Mount Google Drive in Google Colab:
from google.colab import drive
drive.mount('/content/drive')
Copy the file path and assign it to the url variable:
import pandas as pd
url = '/content/drive/My Drive/your_file.csv'  # paste the copied file path here
df = pd.read_csv(url)
df.head()
(Screenshots attached: the errors on my Colab notebook when calling .dtype on pd_data and nd_data.)
I have loaded one CSV file into my Colab notebook in two different ways: with pd.read_csv() and with np.loadtxt(), assigning the results to pd_data and nd_data respectively. After that I printed the shape of each. I got two different shapes even though I loaded the same CSV file.
My question is: why do I get two different shapes from the same data?
This is the link to the ThoraricSurgery.csv file I used.
'''
from google.colab import drive
drive.mount('/content/drive')
import pandas as pd
pd_data = pd.read_csv('/content/drive/MyDrive/딥러닝과실습1/ThoraricSurgery.csv')
print(pd_data.shape)
print(type(pd_data))
import numpy as np
nd_data = np.loadtxt('/content/drive/MyDrive/딥러닝과실습1/ThoraricSurgery.csv', delimiter=",")
print(nd_data.shape)
print(type(nd_data))
'''
This is the mentioned result (screenshot attached).
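The shape difference most likely comes from header handling: pd.read_csv treats the first row as column names by default, while np.loadtxt treats every row as data. A minimal sketch with made-up numbers (not the ThoraricSurgery data) that reproduces the off-by-one row count:

```python
import io

import numpy as np
import pandas as pd

# Four rows of made-up CSV data.
text = "1,2,3\n4,5,6\n7,8,9\n10,11,12\n"

# pd.read_csv consumes the first row as the header by default.
pd_data = pd.read_csv(io.StringIO(text))
# np.loadtxt parses every row as numeric data.
nd_data = np.loadtxt(io.StringIO(text), delimiter=",")

print(pd_data.shape)  # (3, 3) -- one row "lost" to the header
print(nd_data.shape)  # (4, 3)
```

Passing header=None to pd.read_csv (when the file really has no header row) makes the two shapes agree.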
I need to download the results of a for loop on Google Colab to a csv file, but I haven't been able to do it.
This is my for loop:
for num in range(1, 101):
    if (num % 2 == 0 and num % 6 != 0) or (num % 3 == 0 and num % 6 != 0):
        print(num)
The Notebook is called AHW1.ipynb
I tried:
from google.colab import files
files.download("AHW1.csv")
What can I do to download the results of this for loop as a csv file?
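One way to approach this (a sketch; AHW1.csv is just the filename the question uses): collect the numbers in a list instead of printing them, write the list to a file with the csv module, and only then call files.download:

```python
import csv

# Collect the loop results instead of printing them.
results = []
for num in range(1, 101):
    if (num % 2 == 0 and num % 6 != 0) or (num % 3 == 0 and num % 6 != 0):
        results.append(num)

# Write one number per row to a CSV file.
with open("AHW1.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for num in results:
        writer.writerow([num])

# In Colab, this then triggers a browser download:
# from google.colab import files
# files.download("AHW1.csv")
```

The original files.download call failed because AHW1.csv never existed; the loop only printed to the output cell, so there was no file on disk to download.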
# data analysis libraries
import numpy as np
import pandas as pd
# visualization libraries
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
# ignore warnings
import warnings
warnings.filterwarnings('ignore')
I have CSV files in an S3 bucket, and I want to use them to train a model in SageMaker.
I am using this code, but it gives an error (file not found):
import boto3
import pandas as pd
region = boto3.Session().region_name
train_data_location = 's3://taggingu-{}/train.csv'.format(region)
df=pd.read_csv(train_data_location, header = None)
print(df.head())
What can be the solution to this ?
Not sure, but this Stack Overflow answer could cover it: Load S3 Data into AWS SageMaker Notebook.
To quote @Chhoser:
import boto3
import pandas as pd
from sagemaker import get_execution_role
role = get_execution_role()
bucket='my-bucket'
data_key = 'train.csv'
data_location = 's3://{}/{}'.format(bucket, data_key)
pd.read_csv(data_location)
You can use AWS SDK for Pandas, a library that extends Pandas to work smoothly with AWS data stores.
import awswrangler as wr
df = wr.s3.read_csv("s3://bucket/file.csv")
Most notebook kernels have it preinstalled; if it's missing, it can be installed with pip install awswrangler.
Using Python, can I open a text file, read it into an array, and then save the file as a NetCDF?
The following script I wrote was not successful.
import os
import pandas as pd
import numpy as np
import PIL.Image as im
path = r'C:\path\to\data'  # raw string so backslashes aren't treated as escape sequences
grb = [[]]
for fn in os.listdir(path):
    file = os.path.join(path, fn)
    if os.path.isfile(file):
        df = pd.read_table(file, skiprows=6)
        grb.append(df)
df2 = np.array(grb)  # pd.np is deprecated; use numpy directly
#imarray = im.fromarray(df2) ##cannot handle this data type
#imarray.save('Save_Array_as_TIFF.tif')
I once used xray, or xarray (they renamed themselves), to get a NetCDF file into an ASCII dataframe. I just googled it, and apparently it has a to_netcdf function.
Import xarray and it allows you to treat the data just like pandas dataframes.
So give this a try:
df.to_netcdf(file_path)
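For context, a pandas DataFrame has no to_netcdf method of its own; the conversion goes through xarray. A minimal sketch, using a hypothetical DataFrame standing in for the parsed text data (writing the file requires a NetCDF backend such as netcdf4 or scipy):

```python
import pandas as pd
import xarray as xr  # pip install xarray netcdf4

# Hypothetical stand-in for the data parsed from the text files.
df = pd.DataFrame({"temp": [1.0, 2.0, 3.0],
                   "pressure": [10.0, 20.0, 30.0]})

# Convert the DataFrame to an xarray Dataset, then write it out.
ds = xr.Dataset.from_dataframe(df)
ds.to_netcdf("output.nc")  # needs the netcdf4 or scipy package installed
```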
What am I doing wrong? Here is what I am trying to do:
import pandas as pd
url='http://data.octo.dc.gov/feeds/crime_incidents/archive/crime_incidents_2013_CSV.zip'
df = pd.read_csv(url, compression='gzip',
                 header=0, sep=',', quotechar='"',
                 engine='python')
@Abbas, thanks so much. Indeed I ran it step by step, and here is what I came up with. Not the fastest indeed, but it works fine.
I ran it with pandas 0.18.1 on Python 3.5.1 on a Mac.
from zipfile import ZipFile
from urllib.request import urlopen
import pandas as pd
import os
URL = \
'http://data.octo.dc.gov/feeds/crime_incidents/archive/crime_incidents_2013_CSV.zip'
# open and save the zip file onto computer
url = urlopen(URL)
output = open('zipFile.zip', 'wb') # note the flag: "wb"
output.write(url.read())
output.close()
# read the zip file as a pandas dataframe
df = pd.read_csv('zipFile.zip') # pandas version 0.18.1 takes zip files
# if keeping the zip file on disk is not wanted:
os.remove('zipFile.zip')  # remove the copy of the zip file on disk
I hope this helps. Thanks!
The answer by Cy Bu didn't quite work for me in Python 3.6 on Windows. I was getting an invalid argument error when trying to open the file. I modified it slightly:
import os
import pandas as pd
from urllib.request import urlopen, Request

url = 'http://data.octo.dc.gov/feeds/crime_incidents/archive/crime_incidents_2013_CSV.zip'
r = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
b2 = [z for z in url.split('/') if '.zip' in z][0] #gets just the '.zip' part of the url
with open(b2, "wb") as target:
    target.write(urlopen(r).read())  # saves the file to disk
data = pd.read_csv(b2, compression='zip')  # opens the saved zip file
os.remove(b2)  # removes the zip file
IIUC, here is a solution: instead of passing the zip file directly to pandas, first unzip it and then pass the CSV file (updated for Python 3, using io.BytesIO and the public ZipFile.namelist() instead of the internal NameToInfo dict):
from io import BytesIO
from zipfile import ZipFile
from urllib.request import urlopen
import pandas as pd
url = urlopen("http://data.octo.dc.gov/feeds/crime_incidents/archive/crime_incidents_2013_CSV.zip")
zipfile = ZipFile(BytesIO(url.read()))
name = zipfile.namelist()[0]  # name of the first file inside the archive
with open(name, 'wb') as f:
    f.write(zipfile.open(name).read())
df = pd.read_csv(name)
And it will produce a DataFrame with the contents of the CSV file.