pandas read_csv creating incorrect columns - pandas

I am using pandas to read a CSV file:
import pandas as pd
data = pd.read_csv('file_name.csv', "ISO-8859-1")
Output:
col1,col2,col3
"sample1","sample2","sample3"
but the DataFrame ends up with just one column instead of the three it is supposed to create. I checked the CSV and it's fine.
Any suggestions on why this could be happening would be helpful.
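A likely cause (assuming the intent was to set the file encoding): in pd.read_csv the second positional parameter is sep, not encoding, so "ISO-8859-1" is being used as the delimiter and the rows are never split. A sketch reproducing this with an in-memory CSV:

```python
import pandas as pd
from io import StringIO

csv_text = 'col1,col2,col3\n"sample1","sample2","sample3"\n'

# What the original call does: "ISO-8859-1" becomes the separator, and
# since that string never occurs in the data, everything lands in one column.
wrong = pd.read_csv(StringIO(csv_text), sep="ISO-8859-1", engine="python")
print(len(wrong.columns))  # 1

# Passing the encoding by keyword leaves the default comma separator intact,
# i.e. pd.read_csv('file_name.csv', encoding="ISO-8859-1"):
right = pd.read_csv(StringIO(csv_text))
print(len(right.columns))  # 3
```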

Related

reading csv file into python pandas

I want to read a CSV file into a pandas DataFrame, but I get an error when executing the code below:
filepath = "https://drive.google.com/file/d/1bUTjF-iM4WW7g_Iii62Zx56XNTkF2-I1/view"
df = pd.read_csv(filepath)
df.head(5)
To retrieve a file from Google Drive, you first need to identify the file id and build a direct-download URL from it:
import pandas as pd
url='https://drive.google.com/file/d/0B6GhBwm5vaB2ekdlZW5WZnppb28/view?usp=sharing'
file_id=url.split('/')[-2]
dwn_url='https://drive.google.com/uc?id=' + file_id
df = pd.read_csv(dwn_url)
print(df.head())
Try the following code snippet to read the CSV from Google Drive into the pandas DataFrame:
import pandas as pd
url = "https://drive.google.com/uc?id=1bUTjF-iM4WW7g_Iii62Zx56XNTkF2-I1"
df = pd.read_csv(url)
df.head(5)

Loading CSV file with pandas - Error Tokenizing

I am trying to load a CSV file:
import pandas as pd
dfc = pd.read_csv('data/Vehicles0515.csv', sep=',')
but I get the following error:
ParserError: Error tokenizing data. C error: Expected 22 fields in line 3004427, saw 23
I have read that I should include error_bad_lines=False, but it doesn't solve the problem.
Thanks a lot
Sometimes the parser gets confused by the head of the CSV file.
Try this:
dfc = pd.read_csv('data/Vehicles0515.csv', header=None)
or
dfc = pd.read_csv('data/Vehicles0515.csv', skiprows=2)
Also, you don't need to provide a comma separator, owing to the fact that comma is the default value of sep in pandas' read_csv method.
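Worth noting for recent pandas versions: error_bad_lines was deprecated in pandas 1.3 in favor of on_bad_lines, which can skip rows with extra fields like the one in line 3004427. A sketch on a small in-memory file with one malformed row:

```python
import pandas as pd
from io import StringIO

# Second data row has four fields where the header declares three,
# mimicking "Expected 22 fields ... saw 23".
csv_text = "a,b,c\n1,2,3\n4,5,6,7\n8,9,10\n"

# pandas >= 1.3: on_bad_lines="skip" drops rows with too many fields
# instead of raising ParserError.
df = pd.read_csv(StringIO(csv_text), on_bad_lines="skip")
print(df.shape)  # (2, 3) -- the malformed row is gone
```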

Pandas groupby. Grouping covid19 cases by continents

File
https://www.dropbox.com/sh/cx9kasx83qmsi33/AABfOzVgzBuQe2ORU_t65J4Ta?dl=0
What I have done — here is the code I used:
import pandas as pd
Then I read the CSV file, set the resulting DataFrame as 'covid', and ran:
covid.groupby('continent').TotalCases()
which generates
KeyError: 'continent'
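A KeyError from groupby means no column is named exactly 'continent' — column names from such exports are often capitalized or carry stray whitespace, so print covid.columns to check. A sketch, assuming the file has columns like "Continent" and "TotalCases" (an assumption; the actual CSV is behind the Dropbox link):

```python
import pandas as pd

# Stand-in for the CSV from the question, with assumed column names.
covid = pd.DataFrame({
    "Continent": ["Asia", "Asia", "Europe"],
    "TotalCases": [100, 50, 30],
})

# First check the real column names (case and whitespace matter):
print(list(covid.columns))

# Aggregate by selecting the column and applying a reducer;
# covid.groupby(...).TotalCases() is not valid aggregation syntax.
cases = covid.groupby("Continent")["TotalCases"].sum()
print(cases)
```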

Beckhoff TwinCat Scope CSV Format into pandas dataframe

After recording data in Beckhoff TwinCAT Scope, one can export this data to a CSV file. Said CSV file, however, has a rather complicated format. Can anyone suggest the most effective way to import such a file into a pandas DataFrame so I can perform analysis?
An example of the format can be found here:
https://infosys.beckhoff.com/english.php?content=../content/1033/tcscope2/html/TwinCATScopeView2_Tutorial_SaveExport.htm&id=
No need to write a custom parser. Using the example data scope_data.csv:
Name,fasd,,,,
File,C;\,,,,
Start,dfsd,,,,
,,,,,
,,,,,
Name,Peak,Name,PULS1,Name,SINUS_FAST
Net id,123.123.123,Net id,123.123.124,Net Id,123.123.125
Port,801,Port,801,Port,801
,,,,,
0,0.6113936598,0,0.07994111349,0,0.08425652468
0,0.524852539,0,0.2051963401,0,0.4391185847
0,0.4993723482,0,0.2917317117,0,0.4583736263
0,0.5976553194,0,0.8675482865,0,0.8435987898
0,0.06087224998,0,0.7933980583,0,0.5614294705
0,0.1967968423,0,0.3923966599,0,0.1951608414
0,0.9723649064,0,0.5187276782,0,0.7646786192
You can import as follows:
import pandas as pd
scope_data = pd.read_csv(
"scope_data.csv",
skiprows=[*range(5), *range(6, 9)],
usecols=[*range(1, 6, 2)]
)
Then you get
>>> scope_data.head()
Peak PULS1 SINUS_FAST
0 0.611394 0.079941 0.084257
1 0.524853 0.205196 0.439119
2 0.499372 0.291732 0.458374
3 0.597655 0.867548 0.843599
4 0.060872 0.793398 0.561429
I don't have the original scope CSV, but a little adjustment of skiprows and usecols should give you the desired result.
To read the bulk of the file (ignoring the header material) use the skiprows keyword argument to read_csv:
import pandas as pd
df = pd.read_csv('data.csv', skiprows=18)
For the header material, I think you'd have to write a custom parser.
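Such a parser can stay small. A sketch, assuming the channel metadata rows look like the Name/Net id/Port lines in the example data above (key,value pairs repeated across the row, one pair per channel):

```python
import csv
from io import StringIO

# The metadata rows from the example scope_data.csv above.
header_text = (
    "Name,Peak,Name,PULS1,Name,SINUS_FAST\n"
    "Net id,123.123.123,Net id,123.123.124,Net Id,123.123.125\n"
    "Port,801,Port,801,Port,801\n"
)

# Each row alternates key,value,key,value,...; collect one dict per
# channel, keyed by the channel's position in the row.
channels = {}
for row in csv.reader(StringIO(header_text)):
    for i in range(0, len(row), 2):
        key, value = row[i].strip().lower(), row[i + 1]
        channels.setdefault(i // 2, {})[key] = value

print(channels[0])  # {'name': 'Peak', 'net id': '123.123.123', 'port': '801'}
```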

pandas read_sas "ValueError: Length of values does not match length of index"

I need some help... I am having trouble reading my SAS table in Python using the pandas function read_sas. I get the following error:
"ValueError: Length of values does not match length of index".
Here is the code I run:
import pandas as pd
data=pd.read_sas("my_table.sas7bdat")
data.head()
My SAS table is pretty big, with 505 columns and 100,000 rows.
Thanks all for your help.
I had the same problem with a few SAS files. I solved it in two ways:
1. Reading a CSV export of the table with an explicit encoding:
df = pd.read_csv('foo.sas7bdat.csv', encoding='iso-8859-1')
2. With the sas7bdat library, installed in Anaconda with:
conda install -c prometeia/label/pytho sas7bdat
Then, in the Python file:
from sas7bdat import SAS7BDAT
f = SAS7BDAT('foo.sas7bdat').to_data_frame()
A solution I found is to export my SAS table as a CSV file with the SAS code below:
proc export data=my_table
outfile='c:\myfiles\my_table.csv'
dbms=csv
replace;
run;
After that, I use the pandas function read_csv to read the CSV file I have just created:
import pandas as pd
data=pd.read_csv("my_table.csv")
data.head()
Hope this helps.
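If you'd rather stay with read_sas, two of its parameters may be worth trying (a sketch, not a confirmed fix for the ValueError; my_table.sas7bdat is the file from the question):

```python
from pathlib import Path
import pandas as pd

path = Path("my_table.sas7bdat")  # the file from the question
if path.exists():
    # encoding decodes byte columns to str on read; chunksize returns an
    # iterator of DataFrames, so a 505-column, 100,000-row table is read
    # in pieces rather than materialized in one go.
    reader = pd.read_sas(path, encoding="iso-8859-1", chunksize=10_000)
    data = pd.concat(reader, ignore_index=True)
    print(data.shape)
```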