Unable to access pdf document via requests or selenium

Unable to access pdf document via requests or selenium - selenium

I have a huge list of URLs and each one loads a different PDF document. This is one of them:
https://ccmspa.pinellascounty.org/PublicAccess/ViewDocumentFragment.aspx?DocumentFragmentID=74223655&CheckDocumentGroups=0
It will most likely open the website home page in the first try, but if you paste the link again it will open a pdf document.
I'm trying to write a python script to download those documents locally to extract contnet using tika, but this behavior where it opens the home page the first time is throwing a wrench in anything I try.
1. I tried requests, but expectedly it just returns the HTML content of home page
import requests
from tika import parser
link = "https://ccmspa.pinellascounty.org/PublicAccess/ViewDocumentFragment.aspx DocumentFragmentID=74223655&CheckDocumentGroups=0"
resp = requests.get(link)
with open('metadata.pdf', 'wb') as f:
f.write(resp.content)
raw = parser.from_file('metadata.pdf', xmlContent=False)
print(raw['content'])
output:
\n\n\n\n\n\n\n\n\n\n \n \t\t\n\n\t\tSkip to Main Content\xa0\xa0\xa0\xa0Logout\xa0\xa0\xa0\xa0My
Account\xa0\xa0\xa0\xa0\t\t\tHelp\n\n\n\n\n\n\n\t\t\t\nSelect a location\nPinellas County\n\n\xa0\nAll Case
Records Search\nCivil, Family Case Records\nCriminal & Traffic Case Records\nProbate Case Records\nCourt
Calendar\n\nAttorney Login\nRegistered User Login\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n
\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\t\n\n\n\t\t\
t\xa0\t\n\t\n\t\tClerk of the Circuit Court|Mortgage Foreclosure Sales|Pinellas County Government|Pinellas
County Sheriff's Office|Public Defender|Sixth Judicial Circuit|State of Florida|State Attorney|Self Help
Center|Court Forms|How-To Videos|Florida Courts eFiling Portal Video|Attorney Account Setup|Reports and
Statistics|Terms of Use|Contact UsCopyright 2003 Tyler Technologies. All rights Reserved.\n\t\n\n\n\n
\n
2. I tried to open the home page using Selenium, and transfer cookies from the webdriver to requests following this answer .
url = "https://ccmspa.pinellascounty.org/PublicAccess/ViewDocumentFragment.aspx?DocumentFragmentID=74223655&CheckDocumentGroups=0"
driver.get(url)
cookies = driver.get_cookies()
s = requests.Session()
for cookie in cookies:
s.cookies.set(cookie['name'], cookie['value'])
resp = s.get(url)
It did not work, and when I checked the CookieJar of the response object it came out empty.
I have to admit I have so little understanding of how cookies work, but it was just a desperate attempt. What am I misunderstanding here? I appreciate any input.
3. My last resort (for obvious reasons) was to open each document via webdriver and download the content, but even this did not work.
#opens a new window and assigns it as the working window
def open_window(driver, link):
driver.execute_script(f"window.open('{link}')")
new_window = driver.window_handles[-1]
driver.switch_to.window(new_window)
url = "https://ccmspa.pinellascounty.org/PublicAccess/ViewDocumentFragment.aspx?DocumentFragmentID=74223655&CheckDocumentGroups=0"
driver.get(url)
open_window(driver, url)
#print source of new window
print(driver.page_source)
The output is just this:
<html><head></head><body></body></html>

After a little more tinkering, solution #2 worked. But instead of getting cookies from the driver after accessing the main page only, I had the browser start another query (with little extra steps specific to this website) then I used the cookies. It looks like this
[{'domain': 'ccmspa.pinellascounty.org',
'expiry': 1670679832, #this is the time the cookie expires in epoch time
'httpOnly': True,
'name': '.ASPXFORMSPUBLICACCESS',
'path': '/',
'secure': True,
'value': '1DBB1EADBA199D246E84CCE7243202DCA6BBD7E383FE360ECBFC2E6150102C79F3EC2F6B232B85589C51976AF20EF7EBDF52CF74122A7A6E78B4C6F31434C58AB57E10005C41DE019814B704F12B150A0818585E85F0237EFCF1A11B205414325CA1850605FF932BC43CC5B36395488F40D58DA594899C4D62FF3ECCBE729C6BC001194225B6653CB89C1305C7FBCB26E1BCFCFF75476784D24ADFCA0AFF679A3BAA3131'},
{'domain': 'ccmspa.pinellascounty.org',
'httpOnly': True,
'name': 'ASP.NET_SessionId',
'path': '/',
'secure': True,
'value': '24552pqtb1tomjbw2gkzko55'},
{'domain': 'ccmspa.pinellascounty.org',
'httpOnly': False,
'name': 'EDLFDCVM',
'path': '/',
'sameSite': 'None',
'secure': True,
'value': '02282de498-9595-48s0hGpl59SkUKRZpRrS_b1TKJfXlz_3dGN9xGZ2tcTXrHuDsR5rN90I_Rp192pX48C1k'}]

Related

Scrapy won't let me login to asp.net page (ASPX)

Hi I'm having trouble getting my scrapy spider script to login to a aspx (asp.net) website
The script is supposed to crawl a website for product information (it's a suppliers website so we are allowed to do this) but for whatever reason the script is not able to login to the webpage using the script below, there is a username and password field along with a image button but when the script runs it simply doesn't work and we are redirected to the main page... I believe it has something to do with the page being asp.net and apparently i need to pass more information but i've honestly tried everything and im at a loss of what to do next!
What am I doing wrong?
import scrapy
class LeedaB2BSpider(scrapy.Spider):
name = 'leedab2b'
start_urls = [
'https://www.leedab2b.co.uk/customerlogin.aspx'
]
def parse(self, response):
return scrapy.FormRequest.from_response(response=response,
formdata={'ctl00$ContentPlaceHolder1$tbUsername': 'emailaddress#gmail.com', 'ctl00$ContentPlaceHolder1$tbPassword': 'yourpassword'},
clickdata={'id': 'ctl00_ContentPlaceHolder1_lbcustomerloginbutton'},
callback=self.after_login)
def after_login(self, response):
self.logger.info("you are at %s" % response.url)

FormRequest.from_response doesn't seem to send __EVENTTARGET, __EVENTARGUMENT in formdata, try to add them manually:
formdata={
'__EVENTTARGET': 'ctl00$ContentPlaceHolder1$lbcustomerloginbutton',
'__EVENTARGUMENT': '',
'ctl00$ContentPlaceHolder1$tbUsername': 'emailaddress#gmail.com',
'ctl00$ContentPlaceHolder1$tbPassword': 'yourpassword'
}

Getting cookies from my original chrome app and passing it into selenium chrome browser [duplicate]

I am trying to load cookies into my request session in Python from selenium exported cookies, however when I do it returns the following error:
"'list' object has no attribute 'extract_cookies'"
def load_cookies(filename):
with open(filename, 'rb') as f:
return pickle.load(f)
initial_state= requests.Session()
initial_state.cookies=load_cookies(time_cookie_file)
search_requests = initial_state.get(search_url)
Everywhere I see this should work, however my cookies are a list of dictionaries, which is what I understand all cookies are, and why I assume this works with Selenium. However for some reason it does not work with requests, any and all help in this regard would be really great, it feels like I am missing something obvious!
Cookies have been dumped from Selenium using:
with open("Filepath.pkl", 'wb') as f:
pickle.dump(driver.get_cookies(), f)
An example of the cookies would be (slightly obfuscated):
[{'domain': '.website.com',
'expiry': 1640787949,
'httpOnly': False,
'name': '_ga',
'path': '/',
'secure': False,
'value': 'GA1.2.1111111111.1111111111'},
{'domain': 'website.com',
'expiry': 1585488346,
'httpOnly': False,
'name': '__pnahc',
'path': '/',
'secure': False,
'value': '0'}]
I have now managed to load in the cookies as per the answer below, however it does not seem like the cookies are loaded in properly as they do not remember anything, however if I load the cookies in when browsing through Selenium they work fine.

Cookie
The Cookie HTTP request header contains stored HTTP cookie previously sent by the server with the Set-Cookie header. A HTTP cookie is a small piece of data that a server sends to the user's web browser. The browser may store the cookies and send it back with the next request to the same server. Typically, cookies to tell if two requests came from the same browser, keeping the user logged in.
Demonstration using Selenium
To demonstrate the usage of cookies using Selenium we have stored the cookies using pickle once the user had logged into the website http://demo.guru99.com/test/cookie/selenium_aut.php. In the next step, we opened the same website, adding the cookies and was able to land as a logged in user.
Code Block to store the cookies:
from selenium import webdriver
import pickle
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get('http://demo.guru99.com/test/cookie/selenium_aut.php')
driver.find_element_by_name("username").send_keys("abc123")
driver.find_element_by_name("password").send_keys("123xyz")
driver.find_element_by_name("submit").click()
pickle.dump( driver.get_cookies() , open("cookies.pkl","wb"))
Code Block to use the stored cookies for automatic authentication:
from selenium import webdriver
import pickle
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get('http://demo.guru99.com/test/cookie/selenium_aut.php')
cookies = pickle.load(open("cookies.pkl", "rb"))
for cookie in cookies:
driver.add_cookie(cookie)
driver.get('http://demo.guru99.com/test/cookie/selenium_cookie.php')
Demonstration using Requests
To demonstrate usage of cookies using session and requests we have accessed the site https://www.google.com, added a new dictionary of cookies:
{'name':'my_own_cookie','value': 'debanjan' ,'domain':'.stackoverflow.com'}
Next, we have used the same requests session to send another request which was successful as follows:
Code Block:
import requests
s1 = requests.session()
s1.get('https://www.google.com')
print("Original Cookies")
print(s1.cookies)
print("==========")
cookie = {'name':'my_own_cookie','value': 'debanjan' ,'domain':'.stackoverflow.com'}
s1.cookies.update(cookie)
print("After new Cookie added")
print(s1.cookies)
Console Output:
Original Cookies
<RequestsCookieJar[<Cookie 1P_JAR=2020-01-21-14 for .google.com/>, <Cookie NID=196=NvZMMRzKeV6VI1xEqjgbzJ4r_3WCeWWjitKhllxwXUwQcXZHIMRNz_BPo6ujQduYCJMOJgChTQmXSs6yKX7lxcfusbrBMVBN_qLxLIEah5iSBlkdBxotbwfaFHMd-z5E540x02-YZtCm-rAIx-MRCJeFGK2E_EKdZaxTw-StRYg for .google.com/>]>
==========
After new Cookie added
<RequestsCookieJar[<Cookie domain=.stackoverflow.com for />, <Cookie name=my_own_cookie for />, <Cookie value=debanjan for />, <Cookie 1P_JAR=2020-01-21-14 for .google.com/>, <Cookie NID=196=NvZMMRzKeV6VI1xEqjgbzJ4r_3WCeWWjitKhllxwXUwQcXZHIMRNz_BPo6ujQduYCJMOJgChTQmXSs6yKX7lxcfusbrBMVBN_qLxLIEah5iSBlkdBxotbwfaFHMd-z5E540x02-YZtCm-rAIx-MRCJeFGK2E_EKdZaxTw-StRYg for .google.com/>]>
Conclusion
Clearly, the newly added dictionary of cookies {'name':'my_own_cookie','value': 'debanjan' ,'domain':'.stackoverflow.com'} is pretty much in use within the second request.
Passing Selenium Cookies to Python Requests
Now, if your usecase is to passing Selenium Cookies to Python Requests, you can use the following solution:
from selenium import webdriver
import pickle
import requests
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get('http://demo.guru99.com/test/cookie/selenium_aut.php')
driver.find_element_by_name("username").send_keys("abc123")
driver.find_element_by_name("password").send_keys("123xyz")
driver.find_element_by_name("submit").click()
# Storing cookies through Selenium
pickle.dump( driver.get_cookies() , open("cookies.pkl","wb"))
driver.quit()
# Passing cookies to Session
session = requests.session() # or an existing session
with open('cookies.pkl', 'rb') as f:
session.cookies.update(pickle.load(f))
search_requests = session.get('https://www.google.com/')
print(session.cookies)

Since you are replacing session.cookies (RequestsCookieJar) with a list which don't have those attributes, it won't work.
You can import those cookies one by one by using:
for c in your_cookies_list:
initial_state.cookies.set(name=c['name'], value=c['value'])
I've tried loading the whole cookie but it seems like requests doesn't recognize those ones and returns:
TypeError: create_cookie() got unexpected keyword arguments: ['expiry', 'httpOnly']
requests accepts expires instead and HttpOnly comes nested within rest
Update:
We can also change the dict keys for expiry and httpOnly so that requests correctly load them instead of throwing an exception, by using dict.pop() which deletes an item from dict by the key and returns the value of deleted key so after we add a new key with deleted item value then unpack & pass them as kwargs:
for c in your_cookies_list:
c['expires'] = c.pop('expiry')
c['rest'] = {'HttpOnly': c.pop('httpOnly')}
initial_state.cookies.set(**c)

You can get cookies and use only name/value. You'll need headers also. You can get them from dev tools or by using proxy.
Basic example:
driver.get('https://website.com/')
# ... login or do anything
cookies = {}
for cookie in driver.get_cookies():
cookies[cookie['name']] = cookie['value']
# Write to a file if need or do something
# import json
# with open("cookies.txt", 'w') as f:
# f.write(json.dumps(cookies))
And usage:
# Read cookies from file as Dict
# with open('cookies.txt') as reader:
# cookies = json.loads(reader.read())
# use cookies
response = requests.get('https://website.com/', headers=headers, cookies=cookies)
Stackoverflow headers example, some headers can be required some not. You can find information here and here. You can get request headers using dev tools Network tab:
headers = {
'authority': 'stackoverflow.com',
'pragma': 'no-cache',
'cache-control': 'no-cache',
'dnt': '1',
'upgrade-insecure-requests': '1',
'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.117 Safari/537.36',
'sec-fetch-user': '?1',
'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'sec-fetch-site': 'same-origin',
'sec-fetch-mode': 'navigate',
'referer': 'https://stackoverflow.com/questions/tagged?sort=Newest&tagMode=Watched&uqlId=8338',
'accept-encoding': 'gzip, deflate, br',
'accept-language': 'ru,en-US;q=0.9,en;q=0.8,tr;q=0.7',
}

You can create a session. The session class handles cookies between requests.
s = requests.Session()
login_resp = s.post('https://example.com/login', login_data)
self.cookies = self.login_resp.cookies
cookiedictreceived = {}
cookiedictreceived=requests.utils.dict_from_cookiejar(self.login_resp.cookies)

So requests wants all "values" in your cookie to be a string. Possibly the same with the "key". Cookies also does not want a list as your function load_cookies returns. Cookies can be created for the request.utils with cookies = requests.utils.cookiejar_from_dict(....
Lets say I go to "https://stackoverflow.com/" with selenium and save the cookies as you have done.
from selenium import webdriver
import pickle
import requests
#Go to the website
driver = webdriver.Chrome(executable_path=r'C:\Path\\To\\Your\\chromedriver.exe')
driver.get('https://stackoverflow.com/')
#Save the cookies in a file
with open("C:\Path\To\Your\Filepath.pkl", 'wb') as f:
pickle.dump(driver.get_cookies(), f)
driver.quit()
#you function to get the cookies from the file.
def load_cookies(filename):
with open(filename, 'rb') as f:
return pickle.load(f)
saved_cookies_list = load_cookies("C:\Path\To\Your\Filepath.pkl")
#Set request session
initial_state = requests.Session()
#Function to fix cookie values and add cookies to request_session
def fix_cookies_and_load_to_requests(cookie_list, request_session):
for index in range(len(cookie_list)):
for item in cookie_list[index]:
if type(cookie_list[index][item]) != str:
print("Fix cookie value: ", cookie_list[index][item])
cookie_list[index][item] = str(cookie_list[index][item])
cookies = requests.utils.cookiejar_from_dict(cookie_list[index])
request_session.cookies.update(cookies)
return request_session
initial_state_with_cookies = fix_cookies_and_load_to_requests(cookie_list=saved_cookies_list, request_session=initial_state)
search_requests = initial_state_with_cookies.get("https://stackoverflow.com/")
print("search_requests:", search_requests)

Requests also accept http.cookiejar.CookieJar objects:
https://docs.python.org/3.8/library/http.cookiejar.html#cookiejar-and-filecookiejar-objects

How to fill JavaScript form using Python?

I want to use Python to fill this form.
I tried using Mechanize but this is a Microsoft Form which uses JavaScript and has no form tag and no GET/POST URL. Maybe BeautifulSoup/Selenium can do this, but I do not have any experience in scraping JS forms. Can anyone help me out and suggest how to go about this?
Here's what I've tried, Mechanize is unable to recognize any form on the page:
import mechanize
def main():
br = mechanize.Browser()
br.set_handle_robots(False)
br.set_handle_refresh(False)
br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]
response = br.open("https://forms.office.com/Pages/ResponsePage.aspx?id=8Pm7rtoj40mYvzIXGrvJvCxQDveyljlCrKN2Teo3EHFUQVNaWDlYRkhYR09JRTZWRFpKTTNIQU9HUC4u")
for form in br.forms():
print("Form name:", form.name) #prints nothing
print(form) #prints nothing
if __name__ == '__main__':
main()

Selenium works fine.
You'll need to install the components
install selenium pip install selenium
You need to ensure you download the correct chromedriver (or other driver) for your browser and OS versions and add it to path
Then this runs:
from selenium import webdriver
driver = webdriver.Chrome()
url = "https://forms.office.com/Pages/ResponsePage.aspx?id=8Pm7rtoj40mYvzIXGrvJvCxQDveyljlCrKN2Teo3EHFUQVNaWDlYRkhYR09JRTZWRFpKTTNIQU9HUC4u"
driver.get(url)
name = driver.find_element_by_xpath("//div[#class='question-title-box'][.//span[text()='NAME']]/following-sibling::*//input")
name.send_keys("hello, World")
setionSelection = "F"
section = driver.find_element_by_xpath("//div[#class='question-title-box'][.//span[text()='Section']]/following-sibling::*//input[#value='" + setionSelection + "']")
section.click()
date = driver.find_element_by_xpath("//input[contains(#placeholder, 'Please input date')]")
date.send_keys("01/12/2020")
submit = driver.find_element_by_xpath("//div[text()='Submit']")
submit.click()
The xapths are a little long but they're based on the question text so potentially stable
For an alternative approach - When you say there is no POST url, did you check devtools? - That exposes the destination of the form:
Request URL: https://forms.office.com/formapi/api/aebbf9f0-23da-49e3-98bf-32171abbc9bc/users/f70e502c-96b2-4239-aca3-764dea371071/forms('8Pm7rtoj40mYvzIXGrvJvCxQDveyljlCrKN2Teo3EHFUQVNaWDlYRkhYR09JRTZWRFpKTTNIQU9HUC4u')/responses
Request Method: POST
it also exposes the payload... This is the first submit:
{startDate: "2020-08-17T10:40:18.504Z", submitDate: "2020-08-17T10:40:18.507Z",…}
answers: "[{"questionId":"r8f09d63e6f6f42feb2f8f4f8ed3f9389","answer1":"Hello, World"},{"questionId":"r28fe12073dfa47399f8ce95ae679dccf","answer1":"G"},{"questionId":"r8f9e9fedcc2e410c80bfa1e0e3ef9750","answer1":"2020-08-28"}]"
startDate: "2020-08-17T10:40:18.504Z"
submitDate: "2020-08-17T10:40:18.507Z"
Those post URL UUID/GUIDs questions IDs seem to be satic for this form. Every time i run form they're not chaning. This is the second run:
{startDate: "2020-08-17T10:43:48.544Z", submitDate: "2020-08-17T10:43:48.546Z",…}
answers: "[{"questionId":"r8f09d63e6f6f42feb2f8f4f8ed3f9389","answer1":"test me"},{"questionId":"r28fe12073dfa47399f8ce95ae679dccf","answer1":"G"},{"questionId":"r8f9e9fedcc2e410c80bfa1e0e3ef9750","answer1":"2020-08-12"}]"
startDate: "2020-08-17T10:43:48.544Z"
submitDate: "2020-08-17T10:43:48.546Z"
Once you capture this once you'll probably be able to do it through the API without a GUI.
... Just to make sure, i tried it and i get success...
import requests
url = "https://forms.office.com/formapi/api/aebbf9f0-23da-49e3-98bf-32171abbc9bc/users/f70e502c-96b2-4239-aca3-764dea371071/forms('8Pm7rtoj40mYvzIXGrvJvCxQDveyljlCrKN2Teo3EHFUQVNaWDlYRkhYR09JRTZWRFpKTTNIQU9HUC4u')/responses"
myobj = {"startDate":"2020-08-17T10:48:40.118Z","submitDate":"2020-08-17T10:48:40.121Z","answers":"[{\"questionId\":\"r8f09d63e6f6f42feb2f8f4f8ed3f9389\",\"answer1\":\"Hello again, World\"},{\"questionId\":\"r28fe12073dfa47399f8ce95ae679dccf\",\"answer1\":\"F\"},{\"questionId\":\"r8f9e9fedcc2e410c80bfa1e0e3ef9750\",\"answer1\":\"2020-08-26\"}]"}
x = requests.post(url, data = myobj)
My answers are just hard coded into the data object but it seems to work.
Remember to pip install requests if you don't already have it

Interact with password protected Jupyter /api

A friend is trying to run a script to check which notebooks are using the most memory, but their server is password protected. I'm trying to figure out how to configure authentication using urllib2 since I don't believe there is a username, only a password.

The #aiguofer answer did not work for me because jupyter now uses '_xsrf' in cookie. The follwoing woked for me:
s = requests.Session()
url='http://127.0.0.1:8888/login/'
resp=s.get(url)
xsrf_cookie = resp.cookies['_xsrf']
params={'_xsrf':xsrf_cookie,'password': password}
s.post(url, data=params)
After that s can be used to call the apis.

After digging into the notebook code and through some trial and error, I figured out how to do this (and I switched to using requests).
I can't guarantee that this is the best way to do it, but it certainly worked for me. I actually set my vars elsewhere in the code but included here for completeness
import requests
hostname = '127.0.0.1'
port = '8888'
password = 'mypassword'
base_url = 'http://{0}:{1}/'.format(hostname, port)
h = {}
if password:
r = requests.post(base_url + 'login', params={
'password': password
})
h = r.request.headers
sessions = requests.get(base_url + 'api/sessions', headers=h).json()
I believe this works because when you hit the /login endpoint, it redirects you with the right headers set. I guess requests keeps the headers of the redirect, so we can reuse those for the other call. It might be better to extract only the cookies and use those, but this works for now :)

It seems there are some changes with new version. url '/login' does not work for me, I need to add next parameters
url='http://localhost:8050/login?next=%2F'
For the login request. The rest just like Hassan answer

i found when use jupyter put api upload file response 403,
need add "X-XSRFToken" header can solve it..
data= json.dumps({
"name": "test.jpg",
"path": "path",
"type":"file",
"format": "base64",
"content": "base64 data"
})
headers["X-XSRFToken"] = xsrf_cookie
s.put(url, data=data, headers=headers)

Basic authentication with Selenium in Internet Explorer 11

I read Basic authentication with Selenium in Internet Explorer 10
And I change my register key and when I use the user and pass in the url I don't see the basic authentication popup, but actually the page is not load. I see blank page!
I see my url in the IE but nothing happened - I see white page.
Must I change somethin in IE too?

It is not possible without some workarounds.
I also needed the same feature and previous SO answer confirms, that is it either impossible or possible with high probability of failure.
One thing I learned about Protrator is not to try to make too complicated stuff with it, or I'll have a bad time.
As for the feature- I ended up making Protractor to initiate Node.js task, which use request to make the authentication and provide back the data.
Taken straight from request module:
request.get('http://some.server.com/').auth('username', 'password', false);
// or
request.get('http://some.server.com/', {
'auth': {
'user': 'username',
'pass': 'password',
'sendImmediately': false
}
});
// or
request.get('http://some.server.com/').auth(null, null, true, 'bearerToken');
// or
request.get('http://some.server.com/', {
'auth': {
'bearer': 'bearerToken'
}
});

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Unable to access pdf document via requests or selenium - selenium

Related

Scrapy won't let me login to asp.net page (ASPX)

Getting cookies from my original chrome app and passing it into selenium chrome browser [duplicate]

How to fill JavaScript form using Python?

Interact with password protected Jupyter /api

Basic authentication with Selenium in Internet Explorer 11

Categories

Resources