Problems with selenium 2 and python 3.4.1 - testing

I have a simple automation script that fills in the fields of a login form. It runs fine, but there is one problem: after the script has filled in the fields, I need to see the actual result in my console, such as "Logged in successfully" or "Username not found". I tried many things and none of them worked. My last attempt was a while loop, which works great, but only when the result is positive. I wrote a second condition, but when I enter incorrect data I get a flood of errors in my console. Here is the code and part of the output.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select
from selenium.common.exceptions import NoSuchElementException
baseurl = "http://www.somesite/login"
email = input("Type an email: ")
password = input("Type a password: ")
xpaths = { 'loginBox' : "//input[@id='session_email']",
           'passwordBox' : "//input[@id='session_password']",
           'submitButton' : "//input[@class='ufs-but']",
           'success' : "//div[@class='flash-message success']",
           'error' : "//span[@class='form_error']"
         }
mydriver = webdriver.Firefox()
mydriver.get(baseurl)
mydriver.find_element_by_xpath(xpaths['loginBox']).send_keys(email)
mydriver.find_element_by_xpath(xpaths['passwordBox']).send_keys(password)
mydriver.find_element_by_xpath(xpaths['submitButton']).click()
while mydriver.find_element_by_xpath(xpaths['success']):
    print("Success")
if mydriver.find_element_by_xpath(xpaths['error']):
    print("No")
And here is what I get when I try to trigger the error case:
File "ab.py", line 32, in <module>
while mydriver.find_element_by_xpath(xpaths['success']):
File "/usr/local/lib/python3.4/site-packages/selenium-2.43.0-py3.4.egg/selenium/webdriver/remote/webdriver.py", line 230, in find_element_by_xpath
return self.find_element(by=By.XPATH, value=xpath)
File "/usr/local/lib/python3.4/site-packages/selenium-2.43.0-py3.4.egg/selenium/webdriver/remote/webdriver.py", line 662, in find_element
{'using': by, 'value': value})['value']
File "/usr/local/lib/python3.4/site-packages/selenium-2.43.0-py3.4.egg/selenium/webdriver/remote/webdriver.py", line 173, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python3.4/site-packages/selenium-2.43.0-py3.4.egg/selenium/webdriver/remote/errorhandler.py", line 166, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: 'Unable to locate element: {"method":"xpath","selector":"//div[@class=\'flash-message success\']"}' ; Stacktrace:
at FirefoxDriver.prototype.findElementInternal_ (file:///tmp/tmpjax8kj1u/extensions/fxdriver@googlecode.com/components/driver-component.js:9618:26)
at FirefoxDriver.prototype.findElement (file:///tmp/tmpjax8kj1u/extensions/fxdriver@googlecode.com/components/driver-component.js:9627:3)
at DelayedCommand.prototype.executeInternal_/h (file:///tmp/tmpjax8kj1u/extensions/fxdriver@googlecode.com/components/command-processor.js:11612:16)
at DelayedCommand.prototype.executeInternal_ (file:///tmp/tmpjax8kj1u/extensions/fxdriver@googlecode.com/components/command-processor.js:11617:7)
at DelayedCommand.prototype.execute/< (file:///tmp/tmpjax8kj1u/extensions/fxdriver@googlecode.com/components/command-processor.js:11559:5)
As I said, a successful result isn't a problem.
UPD. I corrected the last part of my code a little, and now I have this:
while mydriver.find_element_by_xpath(xpaths['success']):
    print("Success")
    break
while mydriver.find_element_by_xpath(xpaths['error']):
    print("No")
    break
It works, but not the way I want. This is the output when I expect a negative result:
Type an email: w
Type a password: wer
Success
No
As you can see, I want to see 'Success' when the result is positive and 'No' when it's negative, but I don't want to see both at the same time.
UPD. Props to Macro Giancarli for the huge help. This is how I got exactly what I want:
try:
    success = True
    success_element = mydriver.find_element_by_xpath(xpaths['success'])
except NoSuchElementException:
    success = False
    print("Can't log in. Check email and/or password")
try:
    failure = True
    failure_element = mydriver.find_element_by_xpath(xpaths['error'])
except NoSuchElementException:
    failure = False
    print("Logged in successfully")

The problem looks like it's in the way you structure your while loop at the end. You shouldn't need to loop in order to check for success or failure.
Consider that there are four outcomes, assuming that you input the login data. You could either find the element that determines success, find the element that determines failure, find both (should be impossible), or find neither (probably in the case of an unexpected screen, or a failure to load the page).
Instead of expecting the webdriver queries to return some value, put each one in a try block, catch NoSuchElementException, and record whether the element was found. Also, handle each of the four cases so that your program crashes less often.
Edit:
Try this.
try:
    success = True
    success_element = mydriver.find_element_by_xpath(xpaths['success'])
except NoSuchElementException:
    success = False
try:
    failure = True
    failure_element = mydriver.find_element_by_xpath(xpaths['error'])
except NoSuchElementException:
    failure = False
# now handle the four possibilities
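The last step, handling the four possibilities, could be sketched roughly like this. The function name and the two messages for the ambiguous and unknown cases are made up for illustration; success and failure are the flags set by the try blocks, and no browser is needed to run it.

```python
def describe_login_outcome(success, failure):
    """Map the two flags from the try/except blocks to one console message."""
    if success and not failure:
        return "Logged in successfully"
    if failure and not success:
        return "Can't log in. Check email and/or password"
    if success and failure:
        # Should be impossible; report it instead of guessing.
        return "Ambiguous result: both success and error elements found"
    # Neither element found: unexpected screen, or the page failed to load.
    return "Unknown result: neither success nor error element found"


# Example: the success element was found and the error element was not.
print(describe_login_outcome(True, False))
```

Because exactly one branch runs, only one message is printed per login attempt.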

Related

Add Proxy to Selenium & export dataframe to CSV

I'm trying to make a scraper for Capterra. I keep getting blocked, so I think I need a proxy for my driver.get calls. I am also having trouble exporting a dataframe to a CSV. The first half of my code (not attached) collects all the links and stores them in a list, which I then try to visit with Selenium to get the information I want. The second part is where I am having trouble.
For example, these are the kinds of links stored in the plinks list that the driver visits:
https://www.capterra.com/p/212448/Blackbaud-Altru/
https://www.capterra.com/p/80509/Volgistics-Volunteer-Management/
https://www.capterra.com/p/179048/One-Earth/
for link in plinks:
    driver.get(link)
    #driver.implicitly_wait(20)
    companyProfile = bs(driver.page_source, 'html.parser')
    try:
        name = companyProfile.find("h1", class_="sm:nb-type-2xl nb-type-xl").text
    except AttributeError:
        name = "couldn't find"
    try:
        reviews = companyProfile.find("div", class_="nb-ml-3xs").text
    except AttributeError:
        reviews = "couldn't find"
    try:
        location = driver.find_element(By.XPATH, "//*[starts-with(., 'Located in')]").text
    except NoSuchElementException:
        location = "couldn't find"
    try:
        url = driver.find_element(By.XPATH, "//*[starts-with(., 'http')]").text
    except NoSuchElementException:
        url = "couldn't find"
    try:
        features = [x.get_text() for x in companyProfile.select('[id="LoadableProductFeaturesSection"] li span')]
    except AttributeError:
        features = "couldn't find"
    companyInfo.append([name, reviews, location, url, features])
companydf = pd.DataFrame(companyInfo, columns = ["Name", "Reviews", "Location", "URL", "Features"])
companydf.to_csv(wmtest.csv, sep='\t')
driver.close()
I am using Mozilla for the webdriver, and I am happy to change to Chrome if it works better, but is it possible to have the webdriver pick from a random set of proxies for each get request?
Thanks!
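One possible way to approach the random-proxy part is sketched below, under stated assumptions. The proxy addresses are placeholders from a documentation-only IP range, and both function names are made up for illustration. The sketch only builds the Firefox proxy preferences for a randomly chosen proxy; it does not start a browser.

```python
import random

# Hypothetical proxy pool (documentation-only addresses); replace with real ones.
PROXIES = ["203.0.113.10:8080", "203.0.113.11:3128", "203.0.113.12:8000"]


def firefox_proxy_prefs(proxy):
    """Return Firefox preferences that route HTTP and HTTPS through host:port."""
    host, port = proxy.rsplit(":", 1)
    return {
        "network.proxy.type": 1,  # 1 = manual proxy configuration
        "network.proxy.http": host,
        "network.proxy.http_port": int(port),
        "network.proxy.ssl": host,
        "network.proxy.ssl_port": int(port),
    }


def random_proxy_prefs(proxies, rng=random):
    """Pick one proxy at random and build its preference dict."""
    return firefox_proxy_prefs(rng.choice(proxies))


prefs = random_proxy_prefs(PROXIES)
print(prefs)
```

With Selenium, each name/value pair could be applied with options.set_preference(name, value) on a Firefox Options object before the driver is created. As far as I know these preferences are read when the browser starts, so a different proxy per request would probably mean starting a new driver per proxy. Separately, to_csv expects the file name as a string, so wmtest.csv in the code above would need quotes ("wmtest.csv").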

how to read the console output in python without executing any command

I have an API that prints a success or error message to the console. I am new to Python and am trying to read that response. Googling turns up many examples that use subprocess, but I don't want to run or call any command or subprocess. I just want to read the output after the API call below.
This is the console output on success:
17:50:52 | Logged in!!
This is the github link for the sdk and documentation
https://github.com/5paisa/py5paisa
This is the code
from py5paisa import FivePaisaClient
email = "myemailid@gmail.com"
pw = "mypassword"
dob = "mydateofbirth"
cred={
    "APP_NAME":"app-name",
    "APP_SOURCE":"app-src",
    "USER_ID":"user-id",
    "PASSWORD":"pw",
    "USER_KEY":"user-key",
    "ENCRYPTION_KEY":"enc-key"
}
client = FivePaisaClient(email=email, passwd=pw, dob=dob,cred=cred)
client.login()
In general, it is bad practice to read a value from STDOUT. There are ways to do it, but they are tricky because STDOUT is not made for that. The problem is not with you but with the API, which is poorly designed: it should return a value, at least True or False, to tell you whether you logged in, and it doesn't.
So, according to their documentation, there is no way to know if you're logged in, but you may be able to tell by checking the client_code attribute of the client object.
If client.client_code holds one value after a successful login and a different one after a failed login (wrong credentials, for instance), you can tell the two apart. Compare its value in both cases, then add a condition: if it is None, False, or 0 (you will have to check which yourself), the login failed.
Can you try the following with both a successful and a failed login:
client.login()
print(client.client_code)
Source of the API:
# Login function :
# (...)
message = res["body"]["Message"]
if message == "":
    log_response("Logged in!!")
else:
    log_response(message)
self._set_client_code(res["body"]["ClientCode"])
# (...)
# _set_client_code function :
def _set_client_code(self, client_code):
    try:
        self.client_code = client_code # <<<< That's what we want
    except Exception as e:
        log_response(e)
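For completeness, here is a minimal sketch of the "tricky" route of reading STDOUT, assuming the library writes its message with a plain print. fake_login is a hypothetical stand-in for client.login(), and the helper name is made up; nothing here is part of py5paisa.

```python
import contextlib
import io


def fake_login():
    """Hypothetical stand-in for a library call that only prints its result."""
    print("17:50:52 | Logged in!!")


def logged_in_according_to_stdout(call):
    """Run call() while capturing whatever it prints, then inspect the text."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        call()
    return "Logged in!!" in buffer.getvalue()


print(logged_in_according_to_stdout(fake_login))
```

This only sees text that goes through sys.stdout while the with block is active, so it is fragile: a change in the message wording, or output written some other way, would break it. Checking client_code, if it turns out to be reliable, is the cleaner option.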
Since this question asks how to capture "stdout", one way to accomplish this is to intercept the log message before it hits stdout.
The minimum code to capture a log message within a Python script looks like this:
#!/usr/bin/env python3
import logging
logger = logging.getLogger(__name__)
class RequestHandler(logging.Handler):
    def emit(self, record):
        if record.getMessage().startswith("Hello"):
            print("hello detected")
handler = RequestHandler()
logger.addHandler(handler)
logger.warning("Hello world")
Putting it all together you may be able to do something like this:
import logging
from py5paisa import FivePaisaClient
email = "myemailid@gmail.com"
pw = "mypassword"
dob = "mydateofbirth"
cred={
    "APP_NAME":"app-name",
    "APP_SOURCE":"app-src",
    "USER_ID":"user-id",
    "PASSWORD":"pw",
    "USER_KEY":"user-key",
    "ENCRYPTION_KEY":"enc-key"
}
client = FivePaisaClient(email=email, passwd=pw, dob=dob,cred=cred)
class PaisaClient(logging.Handler):
    def __init__(self):
        super().__init__()  # set up the base logging.Handler
        self.loggedin = False # this is the variable we can use to see if we are "logged in"
    def emit(self, record):
        if record.getMessage().startswith("Logged in!!"):
            self.loggedin = True
    def login(self):
        client.login()
logging.getLogger("py5paisa") # get the logger for the py5paisa library
# tutorial here: https://betterstack.com/community/questions/how-to-disable-logging-from-python-request-library/
c = PaisaClient()
# register the same instance we query later, so its loggedin flag is the one that gets set
logging.basicConfig(handlers=[c], level=0, force=True)
c.login()
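The handler idea can be tried without the SDK installed. The sketch below uses a hypothetical fake_login function and a made-up logger name ("fake_sdk") in place of py5paisa. Whether the real library sends its messages through the logging module at all is an assumption you would need to verify.

```python
import logging


class LoginWatcher(logging.Handler):
    """Logging handler that remembers whether a 'Logged in!!' record was seen."""

    def __init__(self):
        super().__init__()
        self.loggedin = False

    def emit(self, record):
        if record.getMessage().startswith("Logged in!!"):
            self.loggedin = True


def fake_login(logger):
    """Hypothetical stand-in for client.login(); it only logs its result."""
    logger.info("Logged in!!")


logger = logging.getLogger("fake_sdk")  # hypothetical logger name
logger.setLevel(logging.INFO)
watcher = LoginWatcher()
logger.addHandler(watcher)
try:
    fake_login(logger)
finally:
    logger.removeHandler(watcher)  # detach so later calls are unaffected

print(watcher.loggedin)
```

Attaching the handler to one named logger, rather than reconfiguring the root logger, keeps the rest of the program's logging untouched.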

How to handle "flood wait" errors when using telethon.sync?

It looks like the sync version of the client doesn't raise any errors.
What is the right way to handle errors when working with telethon.sync?
The code below results in the client going to "sleep", but no errors are caught.
I tried the same with an explicit except for FloodWaitError, and it doesn't solve the issue.
from telethon.sync import TelegramClient
from telethon.tl.functions.channels import GetFullChannelRequest
if __name__ == '__main__':
    setup_logging(level=logging.INFO)
    tg = TelegramClient(
        'anon',
        api_id=config.API_ID,
        api_hash=config.API_HASH,
    )
    with tg as client:
        try:
            result = client(GetFullChannelRequest(-1001100118939))
        except ValueError as e:
            print(e)
            break;
            # print('Flood wait for ', e.seconds)
            # time.sleep(e.seconds)
        print(result)
telethon.sync doesn't change the behavior of exceptions. However, FloodWaitError is not a ValueError, so your except won't catch it. The following will work:
from telethon import errors
try:
    ...
except errors.FloodWaitError as e:
    print('Flood wait for ', e.seconds)
Note that, by default, the library sleeps automatically when the flood wait is shorter than a minute; in that case it waits for you instead of raising.
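For waits long enough to be raised, a retry wrapper is one option. The sketch below runs without Telethon: FakeFloodWaitError is a hypothetical stand-in for errors.FloodWaitError (only its seconds attribute is assumed), and call_with_flood_retry and its parameters are made-up names.

```python
import time


class FakeFloodWaitError(Exception):
    """Hypothetical stand-in for telethon.errors.FloodWaitError (has .seconds)."""

    def __init__(self, seconds):
        super().__init__(f"A wait of {seconds} seconds is required")
        self.seconds = seconds


def call_with_flood_retry(request, flood_error=FakeFloodWaitError,
                          max_attempts=3, sleep=time.sleep):
    """Call request(); on a flood wait, sleep for e.seconds and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except flood_error as e:
            if attempt == max_attempts:
                raise  # give up and let the caller see the error
            print('Flood wait for', e.seconds)
            sleep(e.seconds)


# Demo: a request that fails once with a flood wait, then succeeds.
calls = []


def flaky_request():
    calls.append(1)
    if len(calls) == 1:
        raise FakeFloodWaitError(2)
    return "ok"


waits = []  # record the requested sleeps instead of actually sleeping
result = call_with_flood_retry(flaky_request, sleep=waits.append)
print(result, waits)
```

With Telethon itself, the idea would be to pass flood_error=errors.FloodWaitError and wrap the real call in a lambda. The client also has a flood_sleep_threshold setting, which, as I understand it, controls the automatic-sleep cutoff mentioned above.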

python - urllib.request.urlretrieve throws unexpected exception unknown url type: ' '

I am trying to download files using urllib.request.urlretrieve().
I am using Python 3 and the downloads are successful, but for some reason it also throws an exception, and I don't know why.
This is the main file:
import os
import urllib.request
zip_file_open = open("urls.txt")
if not os.path.exists('zip'):
    os.makedirs('zip')
num=1
true = True
b = true
for i in zip_file_open.read().splitlines():
    try:
        print(str(i))
        #response = urllib.request.urlopen(str(i))
        #print(response)
        #html = response.read()
        urllib.request.urlretrieve(i, "zip/code"+str(num)+".zip")
        if(b):
            num+=1
            b=False
        else:
            b=true
    except Exception as e:
        print("Exception: "+str(e))
        if(b):
            num+=1
            b=False
        else:
            b=true
This is urls.txt:
http://media.wiley.com/product_ancillary/50/11188580/DOWNLOAD/c01_code.zip
http://media.wiley.com/product_ancillary/50/11188580/DOWNLOAD/c02_code.zip
........
http://media.wiley.com/product_ancillary/50/11188580/DOWNLOAD/c25_code.zip
http://media.wiley.com/product_ancillary/50/11188580/DOWNLOAD/c26_code.zip
Here is how I create the txt file:
f = open("urls.txt","w")
k = """http://media.wiley.com/product_ancillary/50/11188580/DOWNLOAD/c"""
k1 = """_code.zip"""
import os
for i in range(26):
    if(i<9):
        f.write(k+str(0)+str(i+1)+k1+os.linesep)
    else:
        f.write(k+str(i+1)+k1+os.linesep)
f.close()
Here is the output
http://media.wiley.com/product_ancillary/50/11188580/DOWNLOAD/c01_code.zip
Exception2: unknown url type: ''
http://media.wiley.com/product_ancillary/50/11188580/DOWNLOAD/c02_code.zip
Exception3: unknown url type: ''
http://media.wiley.com/product_ancillary/50/11188580/DOWNLOAD/c03_code.zip
Exception3: HTTP Error 404: Not Found
........
Exception26: unknown url type: ''
http://media.wiley.com/product_ancillary/50/11188580/DOWNLOAD/c26_code.zip
Exception27: unknown url type: ''
I didn't include all the lines of output, as they were the same. The code is functional, but I would like to know whether we can get rid of the exception.
It looks like you have some blank lines in your file, so urllib throws a ValueError exception when you try to fetch '', which is clearly not a url.
You can fix this error if you add a condition in the loop to check for empty strings.
for i in zip_file_open.read().splitlines():
    if not i.strip():
        continue
    ...
But this won't work for non-empty strings that are not urls, for example 'not a url'.
A better approach would be to check the url scheme with urlparse.
for i in zip_file_open.read().splitlines():
    if not urllib.parse.urlparse(i).scheme:
        continue
    ...
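A self-contained version of the scheme check might look like this. The helper name is made up, and requiring http or https plus a host is a slightly stricter test than the one above, chosen here for illustration.

```python
from urllib.parse import urlparse


def is_downloadable_url(line):
    """True when the line has an http(s) scheme and a host part."""
    parts = urlparse(line.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


lines = [
    "http://media.wiley.com/product_ancillary/50/11188580/DOWNLOAD/c01_code.zip",
    "",           # blank line, as found in urls.txt
    "not a url",  # non-empty, but has no scheme
]
urls = [line for line in lines if is_downloadable_url(line)]
print(urls)
```

As for where the blank lines come from: one likely source, assuming the file was generated on Windows, is writing os.linesep to a file opened in text mode. The "\n" inside "\r\n" is translated again, giving "\r\r\n", which splitlines() reads as a line followed by an empty line. Writing a plain "\n" instead avoids that.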

f.open() issue resulting in Unbound error for f.close()

I don't quite have an answer but I'm narrowing it down. Somehow I'm mixing/confusing types, I believe, between what is provided by commands like 'os.path' and type str().
I assigned the logfile path(s) globally, and although I can print the value inside the function, when the variable is used in fout = open(...) it seems that a null is being referenced, i.e. open() doesn't like or can't use the type it finds.
The error:
UnboundLocalError: local variable 'fout' referenced before assignment
I am simply writing a log of dot files (left on USB drives by OSX) for deletion, but the try/except is now falling over. First the original version.
working code:
logFile = "/Users/dee/Desktop/dotFile_names.txt"
try:
    fout = open(logFile, 'w')
    for line in dotFile_names:
        fout.write(line)
except IOError as e:
    print ("Error : %s not found." % fout)
finally:
    fout.close()
Attempting better practice, I moved the log file name and path into variables so they can be modified if need be; I hope to make this work cross-platform. These variables are at the head of the program, i.e. not in main(), but I pass them in, and print() statements show they are being referenced successfully, i.e. I get this printed:
/Users/dee/Desktop/dotFile_names.txt
Despite this the error I get is:
UnboundLocalError: local variable 'fout' referenced before assignment -
error points at the "fout.close()" line
Error producing code
logFilespec = "dotFile_names.txt"
fullLogFileSpec = []
userDesktop = os.path.join(os.path.expanduser('~'), 'Desktop')
fullLogFilespec = os.path.join(userDesktop, logFilespec)
try:
    print "opening " + fullLogFilespec
    fout = open(fullLogFileSpec, 'w')
    for line in dotFile_names:
        print "..", # are we executing this line..?
        fout.write(line)
except IOError as e:
    print ("Error : %s not found." % fout)
finally:
    print "\nclosing " + fullLogFilespec
    fout.close()
I've found that if I modify this line by converting to a string
fout = open(fullLogFileSpec, 'w')
fout = open(str(fullLogFileSpec), 'w')
the error goes away, BUT NO file is created on the Desktop!
At the very least I guess that I am passing something unrecognisable to fout = open() but it is not being caught by the except. Then when I pass something that does seem to allow fout =open() to work it seems to be a ghost?
So I figure I am lost between a String and whatever kind of reference/pointer os.path.expanduser() gives me.
I'm sure it's insanely simple. Before adding the str() code I also checked all indentation, removing them all and adding back using the editor indent hotkeys, just in case that was affecting things somehow.
OK, it looks like I was wearing my dumb glasses, I think declaring
fullLogFileSpec = []
as a list instead of a string was my error.
Similar as it is, having re-written it without that list declaration this code is working fine:
logfile_directory = os.path.join(os.path.expanduser('~'),'Desktop')
log_bf_file_spec = 'ItemsFoundByFolder_' + Deez_1.current_datetime() + '.txt'
log_by_folder = os.path.join(logfile_directory, log_bf_file_spec)
The function later calls this, with no error:
fout_by_folder = open(log_by_folder, 'w')
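As I read the failing code, fullLogFileSpec (capital S, the empty list) and fullLogFilespec (lowercase s, the joined path) are two different names. open() is handed the list and raises a TypeError, which except IOError does not catch, so the finally block runs while fout is still unassigned. With str() the list becomes the text "[]", so a file by that name is created in the current working directory rather than on the Desktop. A with block sidesteps the unassigned fout entirely. The sketch below is written for Python 3; write_log and the temporary directory are illustrative choices.

```python
import os
import tempfile


def write_log(lines, log_path):
    """Write lines to log_path; return True on success, False on failure."""
    try:
        # 'with' closes the file itself, so no finally/close() is needed and
        # nothing refers to fout when open() fails.
        with open(log_path, 'w') as fout:
            for line in lines:
                fout.write(line)
    except (OSError, TypeError) as e:
        # OSError covers IOError; TypeError covers a non-path argument like [].
        print("Error : could not write %s (%s)" % (log_path, e))
        return False
    return True


# Demo in a temporary directory (assumption: any writable folder would do).
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "dotFile_names.txt")
ok = write_log([".DS_Store\n", "._notes\n"], log_path)
bad = write_log([".DS_Store\n"], [])  # reproduces the list-instead-of-path mistake
print(ok, bad)
```

Catching TypeError here is only to show the list mistake without crashing; in real code it may be better to let that one propagate, since it signals a programming error rather than a missing file.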