I am getting an error SyntaxError: invalid character in identifier - pandas

These are the two paths
market_overview_preprocessed_path='C://Users/anubhav/Downloads/a.xlsx'
crop_amiga_mapping_path='C://Users/anubhav/Downloads/b.xlsx'
Then I start writing my code:
input_dfs_names=[str(market_overview_preprocessed_path),str(crop_amiga_mapping_path)]
df1 = pd.DataFrame()
for files in input_dfs_names:
    df = pd.read_excel(files)
I am getting the error shown in the figure.

I think your path definitions are wrong. Check if this works:
market_overview_preprocessed_path='C:\\Users\\anubhav\\Downloads\\a.xlsx'
crop_amiga_mapping_path='C:\\Users\\anubhav\\Downloads\\b.xlsx'
input_dfs_names=[market_overview_preprocessed_path, crop_amiga_mapping_path]
# rest of your code here...
Alternatively, use raw strings so the backslashes do not need escaping:
market_overview_preprocessed_path=r'C:\Users\anubhav\Downloads\a.xlsx'
crop_amiga_mapping_path=r'C:\Users\anubhav\Downloads\b.xlsx'
# Rest of your code
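For what it's worth, the different spellings of a Windows path can be compared without touching the file system. The sketch below uses pathlib.PureWindowsPath from the standard library; the helper name same_windows_path and the path in risky are made up for illustration. It suggests that forward slashes, doubled backslashes and raw strings all name the same file, so the spelling alone may not explain a SyntaxError.

```python
from pathlib import PureWindowsPath


def same_windows_path(*spellings):
    """Return True if every spelling parses to the same Windows path."""
    parsed = {PureWindowsPath(s) for s in spellings}
    return len(parsed) == 1


forward = 'C:/Users/anubhav/Downloads/a.xlsx'
escaped = 'C:\\Users\\anubhav\\Downloads\\a.xlsx'
raw = r'C:\Users\anubhav\Downloads\a.xlsx'

# All three spellings should compare equal once parsed.
print(same_windows_path(forward, escaped, raw))

# The spelling to avoid is single backslashes in a normal string:
# here \t and \n are read as a tab and a newline (hypothetical path).
risky = 'C:\temp\new.xlsx'
print('\t' in risky, '\n' in risky)
```

The one case that really changes the string is a single backslash followed by a letter that forms an escape sequence, which is what raw strings or doubled backslashes protect against.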

I solved the error; it was due to an indentation error.
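In Python 3 this SyntaxError message is typically reported when the source contains a character that is not allowed in a name, for example a non-breaking space or a curly quote pasted from a web page. Whether that was the case here is an assumption, since the offending line is not shown. A minimal sketch for locating such characters follows; find_non_ascii is a hypothetical helper and the sample text is invented.

```python
import unicodedata


def find_non_ascii(source):
    """Return (line, column, character, Unicode name) for each non-ASCII character."""
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((line_no, col, ch, unicodedata.name(ch, "UNKNOWN")))
    return hits


# Invented sample: the second line is indented with non-breaking spaces
# (U+00A0) instead of ordinary spaces.
sample = "for files in input_dfs_names:\n\u00a0\u00a0\u00a0\u00a0df = pd.read_excel(files)\n"
for hit in find_non_ascii(sample):
    print(hit)
```

To check a real script, read it with open(path, encoding='utf-8').read() and pass the text to the function. Legitimate non-ASCII text inside string literals will also be listed, so the output needs a quick manual look.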

Related

Spark read multiple csv into one dataframe - error with path

I'm trying to read all the CSV files under an HDFS directory into a DataFrame, but I get an error saying it's "not a valid DFS filename". Could someone point out what I did wrong? I tried without the hdfs:// part as well, but then it says the path could not be found. Many thanks.
val filelist = "hdfs://path/to/file/file1.csv,hdfs://path/to/file/file2.csv "
val df = spark.read.csv(filelist)
val df = spark.read.csv(filelist:_*)

Pandas - No Quote Character saved to file

I am having a difficult time getting any quote character to appear in the output of the to_csv function in pandas.
import pandas as pd
final = pd.DataFrame(dataset.loc[::])
final.to_csv(r'c:\temp\temp2.dat', doublequote=True, mode='w',
sep='\x14', quotechar='\xFE', index=False)
print (final)
I have tried various options without success, and I am not sure what I am missing. Wondering if anyone can point me in the right direction. Thank you in advance.
Finally! It appears the documentation has changed or has not been updated on this. Adding the option quoting=1 cures the issue. Apparently quoting=csv.QUOTE_ALL did not work for me (1 is the value of csv.QUOTE_ALL, so a missing import csv may have been the cause).
The complete command is:
import pandas as pd
final = pd.DataFrame(dataset.loc[::])
final.to_csv(r'c:\temp\temp2.dat', index=False, doublequote=True,sep='\x14', quoting=1, quotechar='\xFE')
print (final)
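The quoting argument of to_csv takes the constants of the standard csv module, and csv.QUOTE_ALL has the value 1. A minimal sketch with the csv module alone (no pandas, so this is an illustration of the quoting rule and not of to_csv itself) shows what quoting every field with these separators looks like. The function name write_all_quoted and the sample row are made up.

```python
import csv
import io


def write_all_quoted(rows, sep="\x14", quotechar="\xfe"):
    """Write rows to a string, wrapping every field in the quote character."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=sep, quotechar=quotechar,
                        quoting=csv.QUOTE_ALL)
    writer.writerows(rows)
    return buffer.getvalue()


# csv.QUOTE_ALL is the integer 1, which is why quoting=1 has the same effect.
print(csv.QUOTE_ALL)
print(repr(write_all_quoted([["a", 1]])))
```

With the default csv.QUOTE_MINIMAL, a field is only quoted when it contains the separator, the quote character or a line break, which would explain why no quote character appeared in the original output.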

Python Pandas Series.any() ==

Am I using pandas correctly? I am trying to loop through files and find if any value in a series matches
import pandas as pd
path = user/Desktop/New Folder
for file in path:
    df = pd.read_excel(file)
    if df[Series].any() == "string value"
        do_something()
Please check whether this addresses your problem. Compare the column first, then call any() on the resulting boolean Series:
if (df['your column'] == "string value").any():
    do_something()
I think you should also fix your file iteration; please check this: https://www.newbedev.com/python/howto/how-to-iterate-over-files-in-a-given-directory/
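On the file iteration: looping over a string such as a folder path yields its characters one at a time, not the files inside the folder. A minimal sketch of one way to list the workbooks with pathlib follows; list_excel_files is a hypothetical helper, and restricting to the .xlsx suffix is an assumption about which files are wanted.

```python
from pathlib import Path


def list_excel_files(folder):
    """Return the .xlsx files directly inside folder, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".xlsx"
    )


# Usage idea: each returned path can be handed to pd.read_excel.
# for file in list_excel_files("user/Desktop/New Folder"):
#     df = pd.read_excel(file)
```

Sorting is only there to make the order predictable; iterdir() itself returns entries in arbitrary order.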

EOF error using input Python 3

I keep getting an EOF error but am unsure why. I have tried with and without int(), but it makes no difference. I'm using PyCharm 3.4 and Python 3.
Thanks,
Chris
while True:
    try:
        number = int(input("what's your favourite number?"))
        print (number)
        break
A try statement must be followed by an except (or finally) clause, because you are declaring that an error might occur and that you want to handle it:
while True:
    try:
        number = int(input("what's your favourite number?"))
        print(number)
        break
    except ValueError as e:
        print("Woah, there is an error: {0}".format(e))
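The same retry loop can be wrapped in a function so that it is easy to try out without typing at a prompt. This is only a sketch: the name ask_number and the read parameter are made up for illustration, and read defaults to the built-in input.

```python
def ask_number(prompt="what's your favourite number? ", read=input):
    """Keep asking until the reply can be converted to an int, then return it."""
    while True:
        try:
            return int(read(prompt))
        except ValueError as e:
            # int() raises ValueError for replies such as "abc" or "".
            print("Woah, there is an error: {0}".format(e))


# Simulated replies instead of keyboard input: the first one is rejected.
replies = iter(["abc", "7"])
print(ask_number(read=lambda prompt: next(replies)))
```

Returning from inside the try block plays the role of break in the original loop.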

Puzzling Python I/O error: [Errno 2] No such file or directory

I'm trying to grab an XML file from a server (using Python 3.2.3), but I keep getting this error that there's "no such file or directory". I'm sure the URL is correct, since it outputs the URL in the error message, and I can copy-n-paste it and load it in my browser. So I'm very puzzled how this could be happening. Here's my code:
import xml.etree.ElementTree as etree
class Blah(object):
    def getXML(self,xmlurl):
        tree = etree.parse(xmlurl)
        return tree.getroot()
    def pregameData(self,url):
        try:
            x = self.getXML('{0}linescore.xml'.format(url))
        except IOError as err:
            x = "I/O error: {0}".format(err)
        return x
if __name__ == '__main__':
    x = Blah()
    l = ['http://gd2.mlb.com/components/game/mlb/year_2013/month_04/day_15/gid_2013_04_15_anamlb_minmlb_1/',
         'http://gd2.mlb.com/components/game/mlb/year_2013/month_04/day_15/gid_2013_04_15_phimlb_cinmlb_1/',
         'http://gd2.mlb.com/components/game/mlb/year_2013/month_04/day_15/gid_2013_04_15_slnmlb_pitmlb_1/'
         ]
    for url in l:
        pre = x.pregameData(url)
        print(pre)
And it always returns this error:
I/O error: [Errno 2] No such file or directory: 'http://gd2.mlb.com/components/game/mlb/year_2013/month_04/day_15/gid_2013_04_15_anamlb_minmlb_1/linescore.xml'
I/O error: [Errno 2] No such file or directory: 'http://gd2.mlb.com/components/game/mlb/year_2013/month_04/day_15/gid_2013_04_15_phimlb_cinmlb_1/linescore.xml'
I/O error: [Errno 2] No such file or directory: 'http://gd2.mlb.com/components/game/mlb/year_2013/month_04/day_15/gid_2013_04_15_slnmlb_pitmlb_1/linescore.xml'
You can copy-n-paste those URLs and see that the files do exist in those locations. I even copied the files and directories to localhost and tried this against localhost, in case the foreign server had some kind of block. It gave me the same errors, so that's not the issue. I wondered if Etree's parse() can't handle HTTP, but the documentation doesn't say anything about that, so I'm guessing that's not the issue either.
UPDATE: As suggested in the comments, I went with using open(), but it still returned the error. Importing urllib and trying urllib.request.urlopen(url) raises AttributeError: 'module' object has no attribute 'request'.
You're correct, xml.etree doesn't automatically download and parse URLs. If you want to do that, you'll need to download the file yourself first (using urllib or requests...).
The documentation explicitly states that parse() takes a filename or file object; if it supported a URL, I'm sure it would say so explicitly. lxml.etree.parse(), for example, does.
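A minimal sketch of the download-then-parse approach with the standard library follows. The function name fetch_xml_root and its opener parameter are made up for illustration, and the demo uses an in-memory stand-in instead of a real server so it runs offline. Note the explicit import urllib.request: a plain import urllib does not load the request submodule, which matches the AttributeError in the update above.

```python
import io
import urllib.request
import xml.etree.ElementTree as etree


def fetch_xml_root(url, opener=urllib.request.urlopen):
    """Open url, parse the response body as XML and return the root element."""
    # etree.parse accepts a file object, and the response behaves like one.
    with opener(url) as response:
        return etree.parse(response).getroot()


if __name__ == '__main__':
    # Stand-in for the network: returns a small invented XML document.
    def fake_opener(url):
        return io.BytesIO(b"<game status='Final'/>")

    root = fetch_xml_root('http://example.invalid/linescore.xml',
                          opener=fake_opener)
    print(root.tag, root.get('status'))
```

In real use, call fetch_xml_root(url) with the default opener. A failed download then surfaces as urllib.error.URLError, so the except clause in pregameData would need to catch that as well.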