is there a Way to Grab the Filename and location from a File that is being downloaded by selenium and webdriver - selenium

I have a script that automates downloading a file using selenium and webdriver as chrome
basically, it logs in to a work website, and clicks on some settings to prepare to download a report file, and then clicks the download button
Is there a way for selenium or any other library to grab the file location and name of that file that is downloading or has just downloaded so I can store it in a variable to use later in the script
I do not know if the relative path will work or if it needs to be a full windows path name for it to work... probably safe to assume full path is more guaranteed to work
example being C:\Users\FunnyUserName\Downloads\report.xls
Adding Code to show what is happening better
#For Report Pull
#-----------------------------------------------
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import datetime
import os
import glob
###############################################################
#Pull Report #
###############################################################
#Open Web Driver
browser = webdriver.Chrome()
#Open Website and Log in
print('Open Website and Log In')
browser.get(('https://SomeWebsite.com'))
print('Working on Getting the QC Report')
print('Please Stand By')
#####
#I Removed a lot of stuff not necessary to this question
#Get the File
WebDriverWait(browser,10).until(EC.element_to_be_clickable((By.XPATH,'//*[#id="btnGenerateReport"]'))).click()
time.sleep(4)
#Working on getting the last downloaded Filename
# get the user download folder (dynamic so will work on any machine)
downLoadFolder =os.path.join( os.getenv('USERPROFILE'), 'Downloads')
print(downLoadFolder)
#This shows the correct folder....
#In My Case C:\Users\My UserName\Downloads
# get the list of files
list_of_files = glob.glob(downLoadFolder+"/*.*") # * means all if need specific formats (if you are looking for any specific format then specify eg: "/*.xls" to filter)
print (list_of_files)
#Always Shows ['C:\\Users\\My UserName\\Downloads\\desktop.ini']
# get the latest file name
#Forced the Folder and file type as a test
latest_file = max(glob.glob("C:/Users/My Username/Downloads/*.xls"), key=os.path.getctime)
#print the latest file name
print(latest_file)
#Returns:latest_file = max(glob.glob("C:/Users/My Username/Downloads/*.xls"), key=os.path.getctime)
#ValueError: max() arg is an empty sequence

I might have figured out where the problem is
I need it to grab the actual downloads directory, not the system default
I'm testing on my home computer, it default downloads to my dropbox download directory that I share with my work computer as well
so its not C:\Users\My Username\Downloads\
its actually D:\Dropbox\Downloads... which is where my chrome default is set currently
how do I get the chrome download directory?

Here is the solution in the python.
Imports Needed:
import glob
import os
Script:
# get the user download folder (dynamic so will work on any machine)
downLoadFolder =os.path.join( os.getenv('USERPROFILE'), 'Downloads')
# get the list of files
list_of_files = glob.glob(downLoadFolder+"/*") # * means all if need specific formats (if you are looking for any specific format then specify eg: "/*.xlsx" to filter)
# get the latest file name
latest_file = max(list_of_files, key=os.path.getctime)
#print the latest file name
print(latest_file)

Related

Can I install chromedriver on python anywhere and not use it headless?

I am trying to use this code that works on my local machine on python anywhere and i want to understand if it is even possible:
from selenium import webdriver
from bs4 import BeautifulSoup
import time
# Initialize webdriver
driver = webdriver.Chrome(executable_path="/Users/matteo/Downloads/chromedriver")
# Navigate to website
driver.get("https://apnews.com/article/prince-harry-book-meghan-royals-4141be64bcd1521d1d5cf0f9b65e20b5")
time.sleep(5)
# Parse page source
soup = BeautifulSoup(driver.page_source, "html.parser")
# Find desired elements using Beautiful Soup
elements = soup.find_all("p")
# Print element text
for element in elements:
print(element.text)
# Close webdriver
driver.quit()
Do i need to have installed chrome to make that work or is chromium enough? Because when i run that code on my local machine a chrome page opens up. How does that work on python anywhere? Would it crush?
I am wondering if the code i am using only works if someone is on a GUI with Chrome installed or if it can work on python anywhere too.
The short answer is no. ChromeDriver must have chrome installed. You can run your tests headless for time save, but chrome still must be installed.

Is there a way that you can use the browser profiles on your desktop as the Selenium webdriver? [duplicate]

So whenever I try to use my Chrome settings (the settings I use in the default browser) by adding
options = webdriver.ChromeOptions()
options.add_argument("user-data-dir=C:\Users\... (my webdriver path)")
driver = webdriver.Chrome(executable_path="myPath", options=options)
it shows me the error code
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes n 16-17: truncated \UXXXXXXXX escape
in my bash. I don't know what that means and I'd be happy for any kind of help I can get. Thanks in advance!
The accepted answer is wrong. This is the official and correct way to do it:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = webdriver.ChromeOptions()
options.add_argument(r"--user-data-dir=C:\path\to\chrome\user\data") #e.g. C:\Users\You\AppData\Local\Google\Chrome\User Data
options.add_argument(r'--profile-directory=YourProfileDir') #e.g. Profile 3
driver = webdriver.Chrome(executable_path=r'C:\path\to\chromedriver.exe', chrome_options=options)
driver.get("https://www.google.co.in")
To find the profile folder on Windows right-click the desktop shortcut of the Chrome profile you want to use and go to properties -> shortcut and you will find it in the "target" text box.
To get the path, follow the steps below.
In the search bar type the following and press enter
This will then show all the metadata. There find the path to the profile
As per your question and your code trials if you want to open a Chrome Browsing Session here are the following options:
To use the default Chrome Profile:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = webdriver.ChromeOptions()
options.add_argument("user-data-dir=C:\\Users\\AtechM_03\\AppData\\Local\\Google\\Chrome\\User Data\\Default")
driver = webdriver.Chrome(executable_path=r'C:\path\to\chromedriver.exe', chrome_options=options)
driver.get("https://www.google.co.in")
Note: Your default chrome profile would contain a lot of bookmarks, extensions, theme, cookies etc. Selenium may fail to load it. So as per the best practices create a new chrome profile for your #Test and store/save/configure within the profile the required data.
To use the customized Chrome Profile:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument("user-data-dir=C:\\Users\\AtechM_03\\AppData\\Local\\Google\\Chrome\\User Data\\Profile 2")
driver = webdriver.Chrome(executable_path=r'C:\path\to\chromedriver.exe', chrome_options=options)
driver.get("https://www.google.co.in")
Here you will find a detailed discussion on How to open a Chrome Profile through Python
This is how I managed to use EXISTING CHROME PROFILE in php selenium webdriver.
Profile 6 is NOT my default profile. I dont know how to run default profile. It is IMPORTANT not to add -- before chrome option arguments! All other variants of options didnt work!
<?php
//...
$chromeOptions = new ChromeOptions();
$chromeOptions->addArguments([
'user-data-dir=C:/Users/MyUser/AppData/Local/Google/Chrome/User Data',
'profile-directory=Profile 6'
]);
$host = 'http://localhost:4444/wd/hub'; // this is the default
$capabilities = DesiredCapabilities::chrome();
$capabilities->setCapability(ChromeOptions::CAPABILITY, $chromeOptions);
$driver = RemoteWebDriver::create($host, $capabilities, 100000, 100000);
To get name of your chrome profile, go to chrome://settings/manageProfile, click on profile icon, click "Show profile shortcut on my desktop". After that right click on desktop profile icon and go to properties, here you will see something like "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" --profile-directory="Profile 6".
Also I recommend you to close all chrome instances before running this code. Also maybe you need to TURN OFF chrome settings > advanced > system > "Continue running background apps when Google Chrome is closed".
None of the given answers were working for me so I researched a bit and now the working code is for is this one. I copied the user dir folder from Profile Path from chrome://version/ and made another argument for the profile as shown below:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = webdriver.ChromeOptions()
options.add_argument('user-data-dir=C:\\Users\\gupta\\AppData\\Local\\Google\\Chrome\\User Data')
options.add_argument('profile-directory=Profile 1')
driver = webdriver.Chrome(executable_path=r'C:\Program Files (x86)\chromedriver.exe', options=options)
driver.get('https://google.com')
Are you sure you are meant to be putting in the webdriver path in the user-data-dir argument? That's usually where you put your chrome profile e.g. "C:\Users\yourusername\AppData\Local\Google\Chrome\User Data\Profile 1\". Also you will need to use either double backslashes or forward slashes in your directory path (both work). You can test if your path works by using the os library
e.g.
import os
os.list("C:\\Users\\yourusername\\AppData\\Local\\Google\\Chrome\\User Data\\Profile 1")
will give you the directory listing.
I might also add that occasionally if you manage to crash chrome while running webdriver with a nominated user profile, that it seems to record the crash in the profile and the next time you open chrome, you get the Chrome prompt to restore pages after it exited abnormally. For me personally this had been a bit of headache to deal with and I no longer use a user profile with chromedriver because of it. I could not find a way around it. Other people have reported it here, but none of their solutions seemed to work for me, or were not suitable for my test cases. https://superuser.com/questions/237608/how-to-hide-chrome-warning-after-crash
If you don't nominate a user profile it seems to create a new (blank) temporary one each time it runs
Make sure you've got the path to the profile right, and that you double escape backslashes in said path.
For example, typically the default profile on windows is located at:
"C:\\Users\\user\\AppData\\Local\\Google\\Chrome\\User Data\\Default"
I managed to launch my chrome profile using these arguments:
ChromeOptions options = new ChromeOptions();
options.addArguments("--user-data-dir=C:\\Users\\user\\AppData\\Local\\Google\\Chrome\\User Data");
options.addArguments("--profile-directory=Profile 2");
WebDriver driver = new ChromeDriver(options);
You can find out more about the web driver here
You simply have to replace the '\' to '/' in your paths and that'll resolve it.
Get profile name by navigating to chrome://version from your chrome browser (You'll see Profile Path, but you only want the profile name from it (e.g. Profile 1)
Close out all Chrome sessions using the profile you want to use. (or else you will get the following error: InvalidArgumentException)
Now make sure you have the code below (Make sure you replace UserFolder with the name of your userfolder.
options = webdriver.ChromeOptions()
options.add_argument("user-data-dir=C:\\Users\\EnterYourUserFolder\\AppData\\Local\\Google\\Chrome\\User Data") #leave out the profile
options.add_argument("profile-directory=Profile 1") #enter profile here
driver = webdriver.Chrome(executable_path="C:\\chromedriver.exe", chrome_options=options)
this worked for me 100% and it showed up my selected profile.
import os
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
# path of your chrome webdriver
dir_path = os.getcwd()
user_profile_path = os.environ[ 'USERPROFILE' ]
#if "frtkpr" which is ll be your custom profile does not exist it will be created.
option.add_argument( "user-data-dir=" + user_profile_path + "/AppData/Local/Google/Chrome/User Data/frtkpr" )
driver = webdriver.Chrome( dir_path + "/chromedriver.exe",chrome_options=option )
baseUrl = "https://www.facebook.com/"
driver.maximize_window()
driver.get( baseUrl )

How to use selenium in pandas to read a webpage?

I want to collect the information of a webpage using chromedriver. How do I install it and use it?
You have to install selenium first if you don't have it already. Then to use selenium:
from selenium.webdriver import Chrome
url="URL of the webpage you want to read"
setting up the driver
webdriver = "path of the chromedriver.exe file saved in your pc"
driver.get(url)
using css selector
y = driver.find_element_by_css_selector('css selector of the data you want to read from the webpage').text
print(y)
You don't install the chromedriver - you download the .exe (from here) and use the path to it in webdriver.Chrome(). This getting started page has a comprehensive guide:
from selenium import webdriver
driver = webdriver.Chrome('/path/to/chromedriver') # refers to the path where you saved the exe
driver.get('http://www.google.com/');
time.sleep(5) # Let the user actually see something!
search_box = driver.find_element_by_name('q')
search_box.send_keys('ChromeDriver')
search_box.submit()
time.sleep(5) # Let the user actually see something!
driver.quit()
Note: download the .exe that matches with your version of chrome!
(In Help > About Google Chrome)
As mentioned by #Patha_Mondal, you need to download the driver and select the elements you want to read. However, as your original question asks "How to use selenium in pandas to read a webpage?", I would say instead consider using Scrapy along with Selenium to create a ".csv" file from the Webpage Data.
Read the ".csv" data into pandas using pandas.read_csv() .
The data from the Webpage might not be clean or properly formatted. Using Scrapy to create a dataset out of it would be beneficial for reading it into pandas. Avoid using pandas directly in the same script as Selenium and Scrapy.
Hope it Helped.

Failed- Path too long error while downloading file in Chrome using selenium

I want to download file in my current working directory using selenium automation. But I am getting 'Path too long' error. The code I have written so far is:
os.chdir(os.path.dirname(__file__))
current_directory = os.getcwd()
windows_cwd = current_directory.replace('\\','\\\\')+'\\\\'
chrome_options = webdriver.ChromeOptions()
prefs = {'download.default_directory': windows_cwd,
'download.directory_upgrade': True,
'safebrowsing.enabled': False,
'safebrowsing.disable_download_protection': True
}
chrome_options.add_experimental_option('prefs',prefs)
browser = webdriver.Chrome(options=chrome_options)
My current working directory is:
C:\Users\US177\PycharmProjects\Plugin
where the path is too long.
But it successfully downloads to
C:\Users\US177\Desktop
failed-long path
I'm not exactly sure what your question is based on the information provided, but I'm guessing it's along the lines of "Why is this happening?", so I will address that question.
The maximum length of a file name in Windows is 260 characters. The file is able to download to your desktop because the name of the file (when appended to your path) does not exceed this limit. When trying to download to PycharmProjects\Plugin\ folder, the path has become too long.
While setting your download path, try using double backslash (ie. path\\to\\directory).
See this Github issue about programatically downloading from chrome

Program using selenium fails after building with cx_freeze

I'm developing an automatic web tester using Selenium (v2.37.2). Program works properly until I run the test built with cxfreeze (there is also tkinter gui).
there is the init function
def initDriver(self):
if self.browser == FIREFOX:
profile = webdriver.FirefoxProfile(profile_directory=self.profile);
self._driver = webdriver.Firefox(firefox_profile=profile)
elif self.browser == CHROME:
self._driver = webdriver.Chrome(self.executable, chrome_options=profile)
elif self.browser == IEXPLORER:
self._driver = webdriver.Ie(self.executable)
Now when I build it using Cx_freeze I get this error
method redirectToBlank(...) calls initDriver(..) as the first thingSo how I pack the .xpi file to the library.zip file - which option in setup.py I have to use? And do I even have to this?
And the second strange thing is, that the other browsers work fine, when I execute the .exe file in by clicking on its icon, but when I run it from command line, I get errors even for chrome and IE. (Sorry that the traceback isn't complete)
All paths are relative from the executed file (no matter from where you run it),
Thank you for any ideas to solve this problem.
(method redirectToBlank(...) calls initDriver(..) as the first thing)
First issue solved
It's problem with selenium - FirefoxProfile - class, which tries to load webdriver.xpi as a normal file, but selenium pack all libraries to a zip file, so selenium can't find it.
Even forcing cx_freeze in setup file to add webdriver.xpi to a proper directory in zip won't help.
It is necessary to edit FirefoxProfile (in firefox_profile module) class for example like this
def _install_extension(self, addon, unpack=True):
"""
Installs addon from a filepath, url
or directory of addons in the profile.
- path: url, path to .xpi, or directory of addons
- unpack: whether to unpack unless specified otherwise in the install.rdf
"""
if addon == WEBDRIVER_EXT:
# altered lines
import sdi.env
WEBDRIVER_SUBSTITUTE = "path/to/unpacked/webdrive.xpi"
addon = os.path.join(os.path.dirname(__file__), WEBDRIVER_SUBSTITUTE)
# Original lines:
# addon = os.path.join(os.path.dirname(__file__), WEBDRIVER_EXT)
< the rest of the method >
Issue 2
OSError: win error 6: the handle is invalid problem wasn't caused by either cxfreeze or selenium. I run the final exe file from git bash. There's the problem. For some reason git bash doesn't open stdin for the program and that's why it fails. When I run it in standard windows command line, everything is ok or if i run it from git bash like program.exe < empty_file
what i did was remove selenium form packages list.
and put it inside includefiles, then it works.
like this :
includefiles = [(seleniumPackage,'')]
...
options = {'build_exe': {'includes':includes,
'excludes':excludes,
'optimize':2,
'packages':packages,
'include_files':includefiles,
...