Error 'utf-8' codec can't decode byte 0xf4 - pandas

I'm trying to open a CSV file, but I get the error 'utf-8' codec can't decode byte 0xf4 in position 1: invalid continuation byte. This is what I did:
df = pd.read_csv("covidrisks.csv")

I had to re-save the file in the "CSV UTF-8" format, and that solved my problem.

Related

Error: 'utf-8' codec can't decode byte 0xb2 in position 181: invalid start byte

I'm trying to run the following code:
file_list = ['TC20_test1.csv', 'TC20_test2.csv', 'TC20_test3.csv', 'TC20_test4.csv', 'TC20_test5.csv']
main_dataframe = pd.DataFrame(pd.read_csv(file_list[0]))  # Read data
for i in range(1, len(file_list)):
    data = pd.read_csv(file_list[i])
    df = pd.DataFrame(data)
    main_dataframe = pd.concat([main_dataframe, df], axis=1)
print(main_dataframe)
The main idea is to extract data from the same place in various csv files and add them to a data frame, however I keep getting this error code every time I try to run it.
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb2 in position 181: invalid start byte.
Any help would be greatly appreciated :))

pandas read .txt doc with specific separator

I'm trying to read a .txt file using pandas that has a ^ type separator.
I keep running into the error UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf8 in position 7: invalid start byte
I tried to use pd.read_csv(txt.file , sep = '^' , header = None)
txt.file has no headers.
Am I missing an argument?
UPDATE:
13065^000000000^aaaaa^test , conditions^123455^^01.01.01:Date^^^^^^
77502^000000123^aaaaa^test, conditions^123456^^^^^^^^
It seems there is an uneven number of ^ separators on each row.
How could I fix this?

File csv has "ñ" in headers, I can't read it with pandas

I'm trying to read a CSV file with pandas; it has a header "año".
This is the unicode error
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd1 in position 1: invalid continuation byte
How can I read this csv file? I have a lot of files with this problem.
The file is not UTF-8 encoded. You need to tell pandas that the encoding is ISO-8859-1, for example through the encoding argument of read_csv.
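A minimal sketch of that suggestion, assuming the file really is ISO-8859-1 (Latin-1). The sample bytes and the "valor" column are made up for illustration; with a real file you would pass its path instead of the in-memory buffer.

```python
import io

import pandas as pd

# Hypothetical sample: a tiny CSV with an "año" header, encoded as ISO-8859-1.
raw = "año,valor\n2020,1\n2021,2\n".encode("iso-8859-1")

# Decoding as UTF-8 (the pandas default) fails on the byte for "ñ".
try:
    pd.read_csv(io.BytesIO(raw))
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# Naming the actual encoding lets pandas decode the header correctly.
df = pd.read_csv(io.BytesIO(raw), encoding="ISO-8859-1")
print(list(df.columns))
```

If many files share the problem, the same encoding argument can be passed in a loop over the file names, provided they were all written with the same encoding.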
You should post the pandas code where it's specifying UTF-8

'utf-8' codec can't decode byte 0xdb in position 1:

The following code gives me a UnicodeDecodeError:
import pandas as pd
path = 'C:\\Users\\vlac284\\Desktop\\Lighting\\sample_Myer\\sample_2.xlsx'
df = pd.read_csv(path)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xdb in position 1: invalid continuation byte
Similar postings did not help.
You have an Excel file (.xlsx) and you are trying to read it as CSV with read_csv. These are two different formats, so that will not work.
Either save your Excel file as .csv or use a function that reads Excel files, such as pandas' read_excel.
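As a rough sketch of the second option, a small helper can pick the reader from the file extension. The name read_table is hypothetical, and the sketch assumes an Excel engine such as openpyxl is installed for the .xlsx branch.

```python
from pathlib import Path

import pandas as pd


def read_table(path):
    """Pick a pandas reader based on the file extension (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix in {".xlsx", ".xls"}:
        # Excel workbooks are binary files, so they need read_excel
        # (which in turn needs an engine such as openpyxl for .xlsx).
        return pd.read_excel(path)
    # Anything else is treated as delimited text.
    return pd.read_csv(path)
```

Calling read_table(path) with the .xlsx path from the question would then go through read_excel instead of read_csv.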

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcc in position 3: invalid continuation byte

I'm trying to load a csv file using pd.read_csv but I get the following unicode error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcc in position 3: invalid continuation byte
Unfortunately, CSV files have no built-in method of signalling character encoding.
read_csv assumes by default that the bytes in the CSV file are text encoded as UTF-8. This results in a UnicodeDecodeError if the file uses some other encoding whose bytes don't happen to form a valid UTF-8 sequence. (If, by luck, they did also happen to be valid UTF-8, you wouldn't get the error, but you'd still get wrong input for non-ASCII characters, which would be worse really.)
It's up to you to specify what encoding is in play, which requires some knowledge (or guessing) of where it came from. For example if it came from MS Excel on a western install of Windows, it would probably be Windows code page 1252 and you could read it with:
pd.read_csv('../filename.csv', encoding='cp1252')
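Both cases described above can be reproduced with plain byte strings, without pandas. This is only an illustration: the sample text is made up, and cp1252 is assumed as the "other" encoding.

```python
# Text written by a cp1252 writer: "é" is stored as the single byte 0xE9.
data = "café au lait".encode("cp1252")

# Decoding those bytes as UTF-8 fails, as in the question.
try:
    data.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc.reason

# Decoding with the encoding that was actually used recovers the text.
decoded = data.decode("cp1252")

# The silent case: these cp1252 bytes happen to be valid UTF-8 as well,
# so decoding as UTF-8 raises nothing but yields a different character.
lucky = "Ã©".encode("cp1252")    # two bytes: 0xC3 0xA9
misread = lucky.decode("utf-8")  # a single "é", not the original "Ã©"
```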
I got the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 51: invalid continuation byte
This was because I had made changes to the file and its encoding. You could try changing the file's encoding to UTF-8, either with some code or with the nqq editor on Ubuntu, which provides a direct option to change the encoding. If the problem remains, try undoing all the changes made to the file, or change the directory.
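Converting a file to UTF-8 "using some code" could look roughly like this. The function name is hypothetical, and you have to know (or guess) the file's current encoding to pass as source_encoding.

```python
def reencode_to_utf8(src_path, dst_path, source_encoding):
    """Rewrite a text file as UTF-8, given its current encoding."""
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding=source_encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)
```

Writing to a separate destination path keeps the original file intact in case the guessed encoding turns out to be wrong.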
Hope this helps
Copy the code, paste it into a new .py file, and save it.
I had this same issue recently. This is what I did:
import pandas as pd
data = pd.read_csv(filename, encoding= 'unicode_escape')
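One caveat worth checking before relying on this: unicode_escape makes the error go away, but it does not detect the file's real encoding. A small standard-library sketch with made-up sample strings shows how Python's unicode_escape codec treats bytes:

```python
# Non-ASCII bytes are mapped one-to-one to code points, as in Latin-1.
latin1_bytes = "é".encode("latin-1")                  # the byte 0xE9
from_latin1 = latin1_bytes.decode("unicode_escape")   # "é"

# UTF-8 input raises no error, but each byte becomes its own character.
utf8_bytes = "é".encode("utf-8")                      # the bytes 0xC3 0xA9
from_utf8 = utf8_bytes.decode("unicode_escape")       # "Ã©"

# Backslash sequences in the data are interpreted as escapes.
with_backslash = rb"C:\new".decode("unicode_escape")  # "\n" becomes a newline
```

So the approach gives correct text only when the file is Latin-1 compatible and contains no backslashes; otherwise naming the real encoding (for example cp1252) is the safer choice.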