Scraping an authentication page with multiple forms - Scrapy

The site I want to scrape has an authentication page.
There is no username or password to enter; instead there are about twenty buttons, and you click one to select a location.
Here is an example of the form for one button:
<form id="liliForm-2519" name="liliForm-2519" method="post" action="http://exemple.com/rat/body.ayers.verlayer/57">
<input type="hidden" name="t:formdata" value="ERERAAADFDFvzloEVAN3OqfcAA">
<input class="submit" type="submit" value="Acces">
</form>
Is it possible to simulate the click on the submit button? If so, could someone help me? Thank you in advance. STEF

You need to pass the clickdata parameter to FormRequest.from_response:
http://doc.scrapy.org/en/latest/topics/request-response.html#formrequest-objects
def parse_page(self, response):
    return FormRequest.from_response(response, clickdata={'value': 'Acces'})
Alternatively, you can use the formxpath parameter to select the form with an XPath expression.
If you want me to look into it further, feel free to post a link.

With Scrapy you can do something like this:
def parse_page(self, response):
    # from_response reads the form from the response; url and method override what the form declares
    return FormRequest.from_response(
        response,
        url="http://exemple.com/rat/body.ayers.verlayer/57",
        method="POST",
    )
It retrieves the actual values from your HTML response to fill in the form's input fields that you do not set yourself.
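To make the idea of "filling in the form's fields from the response" concrete, here is a minimal standard-library sketch. It is not how Scrapy implements from_response; the FormCollector class is a hypothetical helper, and the page text is just the form from the question. It collects the named inputs of the form with a given id and encodes them as a POST body.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class FormCollector(HTMLParser):
    """Hypothetical helper: record each <form> with its named input values."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "id": attrs.get("id"),
                "action": attrs.get("action"),
                "method": (attrs.get("method") or "get").upper(),
                "fields": {},
            }
            self.forms.append(self._current)
        elif tag == "input" and self._current is not None:
            name = attrs.get("name")
            if name:  # inputs without a name are not submitted
                self._current["fields"][name] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


# The form from the question, used here as sample input.
PAGE = """
<form id="liliForm-2519" name="liliForm-2519" method="post"
      action="http://exemple.com/rat/body.ayers.verlayer/57">
<input type="hidden" name="t:formdata" value="ERERAAADFDFvzloEVAN3OqfcAA">
<input class="submit" type="submit" value="Acces">
</form>
"""

collector = FormCollector()
collector.feed(PAGE)

# Pick one form out of many by its id, then encode its hidden fields.
form = next(f for f in collector.forms if f["id"] == "liliForm-2519")
body = urlencode(form["fields"])
print(form["method"], form["action"])
print(body)
```

With about twenty forms on the page, selecting by id (or by position) is what decides which location gets submitted; the hidden t:formdata value travels along unchanged.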

Related

How to match xpath for id element that changes each time page loads?

I have these two XPaths, which are different each time I load the web page.
The XPaths were recorded by Selenium IDE. The id string always contains mainForm_view, but the text before and after it changes every time.
xpath=//input[@id='abc_hyd_wuu2_8333nd_mainForm_view_jjd_uueue2_jjd_11_jkdhd']
xpath=//div[@id='abc_hyd_wuu2_8333nd_mainForm_view_kcjjcs_sjsjs_jjdj_994_kkk']/div/div[2]/div/div/div/a[1]/h2
I've tried to locate the id as shown below, but it doesn't work.
xpath=//input[contains(@id,'mainForm_view')]
xpath=//div[contains(@id,'mainForm_view')]
What would be the correct way to do it?
Thanks in advance.
UPDATE
I've tried a CSS selector like the one below, but it seems to pick up a different id, one that belongs to another input element:
document.querySelector("input[id*='mainForm_view']").id
Examining the HTML, I see that the id I need belongs to an element with a unique class. The code looks like this:
<div class="Class_p2">
<div class="Class_p3" style="...">
<input name="8333nd$mainForm$view$jjd$uueue2" type="text" class="class a1 n1-Control" value="xyz" id="8333nd_mainForm_view_jjd_uueue2" disabled="disabled" style="..">
</div>
<input name="8333nd$mainForm$view$ttyi" type="text" disabled="disabled">
</div>
I've tried the following JavaScript in the Chrome console, but it doesn't work:
document.getElementsByClassName("class a1 n1-Control").id
How can I get the id 8333nd_mainForm_view_jjd_uueue2 of the element with class="class a1 n1-Control"?
UPDATE2
I was finally able to do it with
document.getElementsByClassName("class a1 n1-Control")[0].id
Thanks for all the help and time.
You can write the CSS selector as:
input[id*='mainForm_view']
For the div it would be:
div[id*='mainForm_view']
The *= operator matches any id that contains the given substring.
Note that every element whose id contains mainForm_view will be selected, so check the match in the browser's developer tools before proceeding.
You can also look for another element whose XPath or CSS locator stays the same and then navigate to the target element from there, using XPath axes such as parent, ancestor, preceding-sibling and following-sibling. Hope it helps :)

Flask - user input (login/password) to a Python variable

I'm trying to learn about login, password and user session handling in Flask.
I found this link and have been trying to understand the code it provides (the largest piece of code, at the bottom of the page).
http://thecircuitnerd.com/flask-login-tokens/
The link doesn't provide the contents of the login.html file, though.
So far, the way I've been handling forms in Flask requires me to tell the render_template function which user input is assigned to which Python variable. Since the author didn't do that, I suppose his method of getting the user input must be different.
If you look at the login route handler in the code you linked you'll see that it uses request.form to get out two variables, 'username' and 'password':
@app.route("/login/", methods=["GET", "POST"])
def login_page():
    """
    Web Page to Display Login Form and process form.
    """
    if request.method == "POST":
        user = User.get(request.form['username'])
        # If we found a user based on username then compare that the submitted
        # password matches the password in the database. The password is stored
        # in a salted hash format, so you must hash the password before comparing
        # it.
        if user and hash_pass(request.form['password']) == user.password:
            login_user(user, remember=True)
            return redirect(request.args.get("next") or "/")
    return render_template("login.html")
The simplest way to do this would be with the following HTML:
<form action="/login/" method="POST">
<input name="username" placeholder="username">
<input type="password" name="password" placeholder="password">
<input type="submit" value="Login">
</form>
This will not re-populate the username if the user mis-types their username or password, nor will it give the user any indication that they failed to login. They will just see the login form again. However, this is just some example code, so it's understandable that the author chose to leave out useful code that would obscure the point he was trying to make.
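To see why the name attributes in that HTML matter, here is a small standard-library sketch of the round trip. It assumes the browser's default form encoding (application/x-www-form-urlencoded); parse_qs only stands in for the parsing Flask does before it exposes request.form, and the credentials are made up.

```python
from urllib.parse import parse_qs, urlencode

# What the browser would send for the form above (hypothetical credentials).
# Each key is the name attribute of an <input>; the submit button has no
# name, so it contributes nothing to the body.
body = urlencode({"username": "stef", "password": "s3cret"})

# Stand-in for Flask's parsing: the keys of request.form are those same names.
form = {key: values[0] for key, values in parse_qs(body).items()}

print(body)
print(form["username"], form["password"])
```

This is why the route handler can read request.form['username'] and request.form['password'] without render_template being told anything about them.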

How to select the correct submit button with mechanize for Python?

I'm trying to submit a form (using mechanize in Python) that has two submit buttons as shown below.
<input type="submit" value="Save Changes " name="SaveChanges">
<input type="submit" value="Reboot" name="SaveChanges">
Mechanize "print control" shows this...
<SubmitControl(SaveChanges=Save Changes )>
<SubmitControl(SaveChanges=Reboot)>
How do I select the "Reboot" submit button with mechanize? I've tried:
br.submit()
br.submit("Reboot")
br.submit("SaveChanges=Reboot")
The correct form is selected, but none of these submit options are working. I'm new to Python and would appreciate any help.
I just figured it out.
br.submit(nr=1)
selects the second submit button (the nr count starts at zero).

Selenium WebDriver: unable to find an XPath for the Proceed button on Paytm.com

I was trying to automate the paytm.com site.
The Proceed button has a name attribute, but when I checked an XPath on that attribute with XPath Checker, it reported 13 matches. In the web page itself, however, I can see only one Proceed button, not 13.
I also tried building the XPath from other attributes, but those returned multiple matches as well.
Below is the HTML code for Proceed
<div class="msg-container">
<div class="btn-spinner" alt="Proceed to Recharge">
<div class="spinner hidden"></div>
<input class="btn proceed active" type="submit" data-express-text="Recharge Now" data-soft-block-text="Proceed anyway" data-default-text="Proceed" name="Proceed" value="Proceed" alt="Proceed to Recharge">
Can you please tell me where I am going wrong?
This XPath returns 1 match for me:
//form[@id='prepaidMobile']//input[@name='Proceed']
Also, if you want to use only //input[@name='Proceed'], you can take the first element from the list of WebElements:
WebElement firstInput = driver.findElements(By.xpath("//input[@name='Proceed']")).get(0);
This will work for you, I think:
driver.findElement(By.xpath("(//input[@name='Proceed'])[1]"));

Can't select label by text when label contains more than text

I am driving myself bonkers with this.
I have three form fields in a form:
Customer: required dropdown field
Weight: required text field
Status: optional text field
Each element has that label. The required fields' labels contain a span with an asterisk.
I'm using Xpather in Chrome. When I search for this, I receive 2 results, when I should get 3:
//*[contains(text(),'t')]
This makes no sense to me At All.
Customer, which is working:
<label for='customer-field'>
<span class='required-marker'>*</span>
Customer
<input id='customer-field' type='text' />
</label>
Weight, which is not working:
<label class='control-label'>
<span id='ctl01_requiredMarker' class='required-marker'>*</span>
Weight
</label>
Status, which is working:
<label class='control-label'>
Status
</label>
The only workaround that works for me is removing the required marker from the Weight label container. However, that doesn't explain how "Customer" gets matched at all.
Noteworthy: I'm trying to automate testing this page, so I can't really remove that span tag.
What's going on? And/or what do I do?
Try changing your XPath to the following:
//*[text()[contains(.,'t')]]
The source of this fix breaks it down far better than I could've done, so refer to that for detailed explanation! I've tested it myself using the XPath Checker extension for Firefox, and it matches your three items.
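The difference between the two expressions comes from XPath 1.0 string conversion: contains(text(),'t') converts the node-set text() to the string value of its first node only, while text()[contains(.,'t')] tests every text node. The sketch below mimics both predicates with the standard library rather than running real XPath; the helper names are hypothetical, and the whitespace layout of the labels is assumed to be exactly as posted in the question.

```python
import xml.etree.ElementTree as ET

# Two labels from the question, with the line breaks as posted (assumption).
FRAGMENT = """
<form>
<label class='control-label'>
<span id='ctl01_requiredMarker' class='required-marker'>*</span>
Weight
</label>
<label class='control-label'>
Status
</label>
</form>
"""


def text_nodes(elem):
    """Direct text-node children of elem, in document order."""
    nodes = []
    if elem.text:
        nodes.append(elem.text)
    for child in elem:
        if child.tail:
            nodes.append(child.tail)
    return nodes


def first_text_contains(elem, needle):
    """Mimics contains(text(), needle): only the first text node is tested."""
    nodes = text_nodes(elem)
    return bool(nodes) and needle in nodes[0]


def any_text_contains(elem, needle):
    """Mimics text()[contains(., needle)]: every text node is tested."""
    return any(needle in node for node in text_nodes(elem))


weight, status = ET.fromstring(FRAGMENT.strip()).findall("label")
for name, label in (("Weight", weight), ("Status", status)):
    print(name, first_text_contains(label, "t"), any_text_contains(label, "t"))
```

For the Weight label the first text node is just the line break before the span, so the first predicate fails even though a later text node contains "Weight". The result depends on the exact whitespace in the markup, which is why moving or removing the span changes what matches.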
Try the method below:
driver.findElement(By.xpath("//span[@class='required-marker']/..")).getText().contains("Weight")
Please let me know whether this works.
I think your html is where the issue lies.
This is probably what your html should look like:
<span class='required-marker'>*
<label for='customer-field'>Customer</label>
<input id='customer-field' type='text' />
</span>
<span id='ctl01_requiredMarker' class='required-marker'>*
<label class='control-label'>Weight</label>
</span>
<label class='control-label'>Status</label>
Are you using Selenium or WebDriver? What does WebDriver return as a response? Also make sure you add a "." before the XPath, like .//*[contains(text(),'t')]
What does this print?
List<WebElement> elements = driver.findElements(By.xpath(".//*[contains(text(),'t')]"));
System.out.println(elements.size());