Download file after authenticated through WebBrowser - vb.net

Background
So I am creating a VB.NET program that logs in to a web page and does a bunch of button clicking, which results in the website generating an Excel report that can be downloaded. I have successfully gone through all the steps to produce the file, so now I am trying to create a method that will download the file behind the scenes without the "Save as" dialog appearing.
Details
I have managed to trap the download through the Navigating event of the Webbrowser control:
Public Sub a(sender As Object, e As WebBrowserNavigatingEventArgs) Handles WebBrowser1.Navigating
'intercept the excel download. Retrieve the url but cancel dialog
If e.Url.AbsoluteUri.Contains("fmsdownload") Then
Label3.Text = e.Url.AbsoluteUri
e.Cancel = True
'e.Url.AbsoluteUri = the temporarily generated file URL to download from
'INSERT DOWNLOAD METHOD HERE
End If
End Sub
I verified that the e.Url.AbsoluteUri is indeed the correct path. If I copy / paste this URL into Chrome, it downloads.
Question
So ultimately I am simply trying to find a way to download the file after the download link has been generated. Please take a look in the section below for what I have tried as I believe I am close to achieving success.
What I have tried (Please read before posting)
Method 1: My.Computer.Network.DownloadFile(URL,SAVEPATH). This results in the server kicking back "The remote server returned an error: (403) Forbidden." This tells me that the authentication isn't being passed, which makes sense.
Method 2: I read on a stackoverflow post to try the URLMON to initiate the download (http://www.pinvoke.net/default.aspx/urlmon/URLDownloadToFile%20.html). I thought that this would have some promise but results in the error Unable to find an entry point named URLDownloadToFile in DLL 'URLMON.dll' Here is the code I have used for this method as it may be something simple I am missing:
Private Declare Sub URLDownloadToFile _
Lib "URLMON.dll" (
ByVal lpCaller As Long,
ByVal szUrl As String,
ByVal szFilename As String,
ByVal dwReserved As Long,
ByVal lpBindStatusCallback As Long)
Public Sub a(sender As Object, e As WebBrowserNavigatingEventArgs) Handles WebBrowser1.Navigating
'intercept the excel download. Retrieve the url but cancel dialog
If e.Url.AbsoluteUri.Contains("fmsdownload") Then
Label3.Text = e.Url.AbsoluteUri
e.Cancel = True
Try
Kill(My.Computer.FileSystem.SpecialDirectories.MyDocuments & "\download.xls")
Catch
End Try
URLDownloadToFile(0, e.Url.AbsoluteUri, My.Computer.FileSystem.SpecialDirectories.MyDocuments & "\download.xls", 0, 0)
End If
End Sub
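A note on the entry-point error in Method 2: urlmon.dll exports URLDownloadToFileA and URLDownloadToFileW rather than a bare URLDownloadToFile, and the VB6-style Long parameters are 64-bit in VB.NET. The declaration below is a minimal sketch that binds to the Unicode export explicitly; the module name UrlMonNative is made up for illustration. Whether the download then passes the server's authentication is an assumption that would need testing.

```vbnet
Imports System
Imports System.Runtime.InteropServices

Module UrlMonNative
    ' Bind explicitly to the Unicode export. Pointer parameters become IntPtr
    ' and the reserved DWORD becomes a 32-bit Integer.
    <DllImport("urlmon.dll", EntryPoint:="URLDownloadToFileW", CharSet:=CharSet.Unicode, ExactSpelling:=True)> _
    Public Function URLDownloadToFile( _
        ByVal pCaller As IntPtr, _
        ByVal szUrl As String, _
        ByVal szFileName As String, _
        ByVal dwReserved As Integer, _
        ByVal lpfnCB As IntPtr) As Integer
    End Function
End Module
```

A call would then look like `Dim hr As Integer = URLDownloadToFile(IntPtr.Zero, url, savePath, 0, IntPtr.Zero)`, where a return value of 0 (S_OK) indicates success.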
Method 3: After some research it seems like the authentication is stored as cookies so I tried to retrieve the cookies and then provide them back to the WebClient since WebClient supports downloading files. Here is where I capture the cookies:
Dim cookie_collection() As String
Public Sub webbrowser1_documentcompleted(sender As Object, e As
WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
If WebBrowser1.Document.Cookie Is Nothing Then
Else
Dim cookies As String() = WebBrowser1.Document.Cookie.Split({";"c}, StringSplitOptions.None)
For Each cookie As String In cookies
Dim name As String = cookie.Substring(0, cookie.IndexOf("="c)).TrimStart(" "c)
Dim value As String = cookie.Substring(cookie.IndexOf("="c) + 1)
If cookie_collection Is Nothing Then
ReDim cookie_collection(0)
Else
ReDim Preserve cookie_collection(cookie_collection.Length)
End If
cookie_collection(cookie_collection.Length - 1) = cookie
' MsgBox(cookie)
Next cookie
End If
End Sub
I verified that two cookies are captured during the authentication process.
I then try to reapply the cookies to my WebClient before downloading:
Public Sub a(sender As Object, e As WebBrowserNavigatingEventArgs) Handles WebBrowser1.Navigating
'intercept the excel download. Retrieve the url but cancel dialog
If e.Url.AbsoluteUri.Contains("fmsdownload") Then
Label3.Text = e.Url.AbsoluteUri
e.Cancel = True
Dim client As New System.Net.WebClient
For Each cookie As String In cookie_collection
client.Headers.Add(Net.HttpRequestHeader.Cookie, cookie)
Next
client.DownloadFile(e.Url.AbsoluteUri, My.Computer.FileSystem.SpecialDirectories.MyDocuments & "\download.xls")
End If
End Sub
This unfortunately results in the same error as before, "The remote server returned an error: (403) Forbidden", which makes me realize the authentication still isn't being passed.
I know this is a big post but I feel like method 2 or method 3 should work so it may be possible I am missing something small (I hope).

In your last bit of code, where you use WebClient to retrieve the file, you currently add the cookies to the request as follows:
For Each cookie As String In cookie_collection
client.Headers.Add(Net.HttpRequestHeader.Cookie, cookie)
Next
However, HTTP sends all cookies in a single Cookie header, separated by semicolons, so adding the header once per cookie does not produce the value the server expects. You therefore do not need to split the cookies with String.Split; pass the semicolon-separated string as it is. Hope this solves your issue!
Edit: Here is what I think might work out:
Retrieve the cookies in the DocumentCompleted event like this (third bit of code):
Dim cookie_collection As String
Public Sub webbrowser1_documentcompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
If WebBrowser1.Document.Cookie IsNot Nothing Then
Dim cookies As String = WebBrowser1.Document.Cookie
If cookie_collection = "" Then
cookie_collection = cookies
Else
cookie_collection &= ";" & cookies
End If
End If
End Sub
Then proceed by simply passing the cookie_collection field to the header:
...
Dim client As New System.Net.WebClient
client.Headers.Add(Net.HttpRequestHeader.Cookie, cookie_collection)
client.DownloadFile(e.Url.AbsoluteUri, My.Computer.FileSystem.SpecialDirectories.MyDocuments & "\download.xls")
...
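If the server still answers 403 with the header above, one possible cause is that the session cookie is marked HttpOnly, because WebBrowser1.Document.Cookie only exposes cookies that page scripts can see. The sketch below reads the cookies from WinINet instead, through InternetGetCookieEx with the INTERNET_COOKIE_HTTPONLY flag. The module name CookieReader and the helper GetAllCookies are hypothetical names, and the idea that HttpOnly is the culprit here is an assumption, not something the question confirms.

```vbnet
Imports System
Imports System.Runtime.InteropServices
Imports System.Text

Module CookieReader
    ' Asks WinINet to include HttpOnly cookies in the returned string.
    Private Const INTERNET_COOKIE_HTTPONLY As Integer = &H2000

    <DllImport("wininet.dll", EntryPoint:="InternetGetCookieExW", CharSet:=CharSet.Unicode, ExactSpelling:=True, SetLastError:=True)> _
    Private Function InternetGetCookieEx( _
        ByVal url As String, _
        ByVal cookieName As String, _
        ByVal cookieData As StringBuilder, _
        ByRef size As Integer, _
        ByVal flags As Integer, _
        ByVal reserved As IntPtr) As Boolean
    End Function

    ' Returns "name=value; name2=value2" for the given URL, or Nothing
    ' when no cookies could be read.
    Public Function GetAllCookies(ByVal url As String) As String
        Dim size As Integer = 512
        Dim buffer As New StringBuilder(size)
        If Not InternetGetCookieEx(url, Nothing, buffer, size, INTERNET_COOKIE_HTTPONLY, IntPtr.Zero) Then
            ' The first buffer was too small (size now holds the required
            ' length) or no cookies exist for this URL.
            If size <= 0 Then Return Nothing
            buffer = New StringBuilder(size)
            If Not InternetGetCookieEx(url, Nothing, buffer, size, INTERNET_COOKIE_HTTPONLY, IntPtr.Zero) Then
                Return Nothing
            End If
        End If
        Return buffer.ToString()
    End Function
End Module
```

In the Navigating handler this would replace the collected string, for example `client.Headers.Add(Net.HttpRequestHeader.Cookie, GetAllCookies(e.Url.AbsoluteUri))`. It has to run in the same process as the WebBrowser control so that session cookies are visible.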

Related

How to wait until no more "404" is coming from WebClient in vb.net?

I need to download a large file (about 3-5 GB) in my application. The file is generated dynamically on request, so I can't predict when it will be ready for download. I need to try the download, and when I get a 404, I have to wait and retry later.
The download is async because I have a progress bar.
I also tried putting a "normal" download (WC.DownloadFile(...)) in a Try...Catch, but that didn't solve my problem.
Private Sub DownloadUpdate()
Dim RndName As String = IO.Path.GetRandomFileName
UpdateTmpPath = IO.Directory.CreateDirectory(IO.Path.Combine(IO.Path.GetTempPath, RndName)).FullName
UpdateTmpFile = UpdateTmpPath & "\update.zip"
UpdateUnzipDir = IO.Directory.CreateDirectory(UpdateTmpPath & "\update").FullName
Log(UpdateTmpFile)
WC.DownloadFileAsync(New Uri(url), UpdateTmpFile)
End Sub
btw sorry for my english, it's not my first language :)
Found the answer :)
By adding the handler "DownloadFileCompleted" i'm able to check the Http-Status:
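Private Sub AfterDownload(ByVal sender As Object, ByVal e As System.ComponentModel.AsyncCompletedEventArgs) Handles WC.DownloadFileCompleted
' e.Error is Nothing when the download succeeded, so cast defensively
Dim webEx As WebException = TryCast(e.Error, WebException)
Dim response As HttpWebResponse = Nothing
If webEx IsNot Nothing Then response = TryCast(webEx.Response, HttpWebResponse)
If response IsNot Nothing AndAlso response.StatusCode = HttpStatusCode.NotFound Then
' ...wait and retry download
ElseIf e.Error Is Nothing Then
' ...do something with downloaded file
End If
End Sub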
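The "wait and retry" step could be sketched as a blocking helper like the one below. This is only an illustration under assumptions: DownloadWhenReady, maxAttempts and delay are made-up names, and it uses the synchronous DownloadFile, so it would have to run on a background thread and does not raise progress events the way DownloadFileAsync does.

```vbnet
Imports System
Imports System.Net
Imports System.Threading

Module RetryDownload
    ' Tries the download up to maxAttempts times, waiting between attempts
    ' while the server answers 404. Any other failure is rethrown.
    Public Function DownloadWhenReady(ByVal url As String, ByVal targetPath As String, _
                                      ByVal maxAttempts As Integer, ByVal delay As TimeSpan) As Boolean
        For attempt As Integer = 1 To maxAttempts
            Try
                Using client As New WebClient()
                    client.DownloadFile(url, targetPath)
                End Using
                Return True
            Catch ex As WebException
                Dim response As HttpWebResponse = TryCast(ex.Response, HttpWebResponse)
                If response Is Nothing OrElse response.StatusCode <> HttpStatusCode.NotFound Then
                    Throw
                End If
                ' File not generated yet: wait before the next attempt.
                If attempt < maxAttempts Then Thread.Sleep(delay)
            End Try
        Next
        Return False
    End Function
End Module
```

For example, `DownloadWhenReady(url, UpdateTmpFile, 20, TimeSpan.FromSeconds(30))` would poll for roughly ten minutes before giving up.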
Hope it helps someone.
Daniel

VB.NET 2008 - Input to data to website controls and download results

this is my first Q on this website so let me know if I have missed any important details, and thanks in advance.
I have been asked to access a website and download the results from a user-inputted form. The website asks for a username/password and once accepted, several questions which are used to generate several answers.
Since I am unfamiliar with this area I have set up a simple windows form to tinker around with websites and try to pick things up. I have used a webbrowser control and a button to use it to view the website in question.
When I try to view the website through the control, I just get script errors and nothing loads up. I am guessing I am missing certain plug-ins on my form that IE can handle without errors. Is there anyway I can identify what these are and figure out what to do next? I am stumped.
The script errors are:
"Expected identifier, string or number" and
"The value of the property 'setsection' is null or undefined"
Both ask if I want to continue running scripts on the page. But it works in IE and I cannot see why my control is so different. It actually request a username and password which works fine, it is the next step that errors.
I can provide screenies or an extract from the website source html if needed.
Thanks,
Fwiw my code is:
Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
'WebBrowser1.ScriptErrorsSuppressed = True
WebBrowser1.Navigate("http://website.com")
'WebBrowser1.Navigate("http://www.google.com")
End Sub
Thanks to Noseratio I have managed to get somewhere with this.
Even though the errors I was getting seemed to be related to some XML/Java/whatever functionality going askew, they were actually caused by my WebBrowser control running in IE 7 mode.
I forced it into IE 9 mode and all is now well. So, using my above example I basically did something like this:
Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
'WebBrowser1.ScriptErrorsSuppressed = True
BrowserUpdate()
WebBrowser1.Navigate("http://website.com")
'WebBrowser1.Navigate("http://www.google.com")
End Sub
Sub BrowserUpdate()
Try
' Emulation mode, written as a DWORD. Can be: 9999, 9000, 8888, 8000, 7000
Dim IEValue As Integer = 9000
' The value name must be the executable's file name, e.g. "MyApp.exe".
' Process.ToString() would return "System.Diagnostics.Process (MyApp)" instead.
Dim targetApplication As String = Process.GetCurrentProcess().ProcessName & ".exe"
' Writing under HKEY_LOCAL_MACHINE normally requires administrator rights
Dim localMachine As Microsoft.Win32.RegistryKey = Microsoft.Win32.Registry.LocalMachine
Dim parentKeyLocation As String = "SOFTWARE\Microsoft\Internet Explorer\MAIN\FeatureControl"
Dim keyName As String = "FEATURE_BROWSER_EMULATION"
Dim subKey As Microsoft.Win32.RegistryKey = localMachine.CreateSubKey(parentKeyLocation & "\" & keyName)
subKey.SetValue(targetApplication, IEValue, Microsoft.Win32.RegistryValueKind.DWord)
Catch ex As Exception
' Typically a permissions failure; log or report it here
End Try
End Sub

VB.NET WebBrowser Control Programmatically Filling Form After Changing User-Agent (Object reference not set to an instance of an object.)

I'm working on a project where I have a WebBrowser control which needs to have a custom user-agent set, then go to Google and fill out the search box, click the search button, then click a link from the search results. Unfortunately I can't use HTTPWebRequest, it has to be done with the WebBrowser control.
Before I added the code to change the user-agent, everything worked fine. Here's the code that I have:
Imports System.Runtime.InteropServices
Public Class Form1
<DllImport("urlmon.dll", CharSet:=CharSet.Ansi)> _
Private Shared Function UrlMkSetSessionOption(dwOption As Integer, pBuffer As String, dwBufferLength As Integer, dwReserved As Integer) As Integer
End Function
Const URLMON_OPTION_USERAGENT As Integer = &H10000001
Public Sub ChangeUserAgent(Agent As String)
UrlMkSetSessionOption(URLMON_OPTION_USERAGENT, Agent, Agent.Length, 0)
End Sub
Private Sub btnGo_Click(sender As Object, e As EventArgs) Handles btnGo.Click
ChangeUserAgent("Fake User-Agent")
wb.Navigate("http://www.google.com", "_self", Nothing, "User-Agent: Fake User-Agent")
End Sub
Private Sub wb_DocumentCompleted(ByVal sender As Object, ByVal e As WebBrowserDocumentCompletedEventArgs) Handles wb.DocumentCompleted
Dim Source As String = wb.Document.Body.OuterHtml
Dim Uri As String = wb.Document.Url.AbsoluteUri
If Uri = "http://www.google.com/" Then
wb.Document.GetElementById("lst-ib").SetAttribute("value", "browser info")
wb.Document.All("btnK").InvokeMember("click")
End If
If Uri.Contains("http://www.google.com/search?") Then
Dim TheDocument = wb.Document.All
For Each curElement As HtmlElement In TheDocument
Dim ctrlIdentity = curElement.GetAttribute("innerText").ToString
If ctrlIdentity = "BROWSER-INFO" Then
curElement.InvokeMember("click")
End If
Next
End If
End Sub
End Class
The problem lies in the following code:
wb.Document.GetElementById("lst-ib").SetAttribute("value", "browser info")
wb.Document.All("btnK").InvokeMember("click")
I thought the problem might be that the page not being fully loaded (frame issue) but I put the offending code in a timer to test, and got the same error. Any help would be much appreciated.
Do you realize .All("btnK") returns a collection? So, you are doing .InvokeMember("click") on a Collection :). You cannot do that, you can only do .InvokeMember("click") on an element for obvious reasons!
Try this:
wb.Document.All("btnK").Item(0).InvokeMember("click")
The .Item(0) returns the first element in the collection returned by .All("btnK"), and since there will only probably be one item returned, since there is only one on the page, you want to do the InvokeMember on the first item, being .Item(0).
May I ask what it is you are developing?
Since you're a new user, please up-vote and/or accept if this answered your question.
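An "Object reference not set to an instance of an object" error on these two lines usually means one of the lookups returned Nothing, for example because the page served to the custom user-agent does not contain the same element ids. In the WinForms API, GetElementById returns Nothing when no element matches, and the string indexer on Document.All yields a single HtmlElement (or Nothing). The helper below is a defensive sketch; the names SearchFormHelper and TrySubmitSearch are hypothetical, and the ids "lst-ib" and "btnK" are simply taken from the question.

```vbnet
Imports System.Windows.Forms

Module SearchFormHelper
    ' Returns True only if both controls were found and the click was sent.
    Public Function TrySubmitSearch(ByVal doc As HtmlDocument, ByVal query As String) As Boolean
        If doc Is Nothing Then Return False

        Dim searchBox As HtmlElement = doc.GetElementById("lst-ib")
        Dim searchButton As HtmlElement = doc.All("btnK")

        ' Either lookup may come back as Nothing if the page layout differs.
        If searchBox Is Nothing OrElse searchButton Is Nothing Then Return False

        searchBox.SetAttribute("value", query)
        searchButton.InvokeMember("click")
        Return True
    End Function
End Module
```

Calling `TrySubmitSearch(wb.Document, "browser info")` from DocumentCompleted and logging a False result would show which page actually arrived instead of throwing.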

WebBrowser ignores the code

I am trying to use Mibbit IRC in my project, and so far it is working well, but there is a flaw. Links pasted in the chat open in Internet Explorer when clicked, instead of in the user's default web browser. I tried implementing some simple code, but half of it seems to get ignored.
http://i.stack.imgur.com/FKGGr.jpg
WebBrowser Component Startup page: http://widget.mibbit.com/?settings=4abcd3a5f0bf25306d4c6d1968e28cb2&server=irc.mibbit.net&channel=%23Mytestchannel12345
Ignore if the URL contains: mibbit.com (the chat widget) and ad4game.com (the stupid banner...)
I match on "contains" because the banner network serves different banners, and thus different links. The same goes for the widget: it is obviously hosted on several servers and redirects to one of them, like widged1.mibbit.com, widged2.mibbit.com, etc.
Open in the user's default browser: everything except those two above.
Public Class Form1
Private Sub WebBrowser1_Navigating(ByVal sender As Object, ByVal e As System.Windows.Forms.WebBrowserNavigatingEventArgs) Handles WebBrowser1.Navigating
Dim navTo As String = e.Url.ToString
If Not (navTo.ToLower.Contains("mibbit.com") OrElse navTo.ToLower.Contains("ad4game.com") OrElse navTo.ToLower.Contains("about:blank")) Then
e.Cancel = True
System.Diagnostics.Process.Start(e.Url.ToString())
End If
End Sub
End Class
Nothing so far worked...
Okay, I've updated your code sample:
Add a new function to find out what the path to the default browser is:
Public Class Form1
Private Sub WebBrowser1_Navigating(ByVal sender As Object, ByVal e As System.Windows.Forms.WebBrowserNavigatingEventArgs) Handles WebBrowser1.Navigating
Dim navTo As String = e.Url.ToString
If Not (navTo.ToLower.Contains("mibbit.com") OrElse navTo.ToLower.Contains("ad4game.com") OrElse navTo.ToLower.Contains("about:blank")) Then
e.Cancel = True
System.Diagnostics.Process.Start(GetDefaultBrowserPath, e.Url.ToString())
End If
End Sub
' Get the default browser's executable path from the registry
Public Function GetDefaultBrowserPath() As String
Dim defaultbrowser As String = My.Computer.Registry.GetValue("HKEY_CLASSES_ROOT\HTTP\shell\open\command", "", "Not Found")
Return Split(defaultbrowser, """")(1)
End Function
End Class

How to re-write url in asp.net 3.5

I converted a project from HTML to ASPX.
The issue is that all the file extensions changed,
e.g. "www.example.com\index.html" changed to "www.example.com\index.aspx",
which causes problems for SEO.
Now when I search the web I get the link www.example.com\index.html, and if I try to open it I get a 404 "file not found" error.
I tried a couple of URL-rewriting methods; they work fine locally but fail on the server.
I tried both in Global.asax:
1.
Protected Overloads Sub Application_BeginRequest(ByVal sender As Object, ByVal e As System.EventArgs)
Dim CurrentPath As String = Request.Path.ToLower()
Dim strPageName As String = CurrentPath.Substring(CurrentPath.LastIndexOf("/") + 1, (CurrentPath.LastIndexOf(".") - CurrentPath.LastIndexOf("/")) + 4)
If strPageName.EndsWith(".html") Then
Select Case strPageName
Case "index.html"
RewriteUrl(CurrentPath)
End Select
End If
End Sub
Protected Sub RewriteUrl(ByVal URL As String)
Dim CurrentPath As String = URL
CurrentPath = CurrentPath.Replace(".html", ".aspx")
Dim MyContext As HttpContext = HttpContext.Current
MyContext.RewritePath(CurrentPath)
End Sub
2.
Sub Application_Start(ByVal sender As Object, ByVal e As EventArgs)
RegisterRoutes(RouteTable.Routes)
End Sub
Shared Sub RegisterRoutes(ByVal routes As RouteCollection)
routes.Add("Index", New Route _
( _
"index.html", New CustomRouteHandler("~/index.aspx") _
))
End Sub
Do you have control of the web server? What you most likely need SEO-wise is to create a permanent (301) URL redirect which you can do at the web server level...often in the server settings, or in a script file.
See: http://en.wikipedia.org/wiki/URL_redirection
I realize this doesn't answer your question directly, but if all you are trying to do is get the search results to the correct new page, this is the best way to do it.
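If the redirect has to be issued from the application rather than from the server settings, a permanent redirect can be sketched in Global.asax as below. ASP.NET 3.5 has no Response.RedirectPermanent, so the status and Location header are set by hand. This assumes IIS is configured to pass .html requests to ASP.NET; if it is not, Application_BeginRequest never runs for those URLs, which may also explain why rewriting works locally but not on the server.

```vbnet
<%@ Application Language="VB" %>
<script runat="server">
    Sub Application_BeginRequest(ByVal sender As Object, ByVal e As EventArgs)
        Dim path As String = Request.Path
        If path.EndsWith(".html", StringComparison.OrdinalIgnoreCase) Then
            ' Swap the extension and keep any query string.
            Dim target As String = path.Substring(0, path.Length - ".html".Length) & ".aspx" & Request.Url.Query
            ' 301 tells search engines the move is permanent.
            Response.Status = "301 Moved Permanently"
            Response.AddHeader("Location", target)
            Response.End()
        End If
    End Sub
</script>
```

Unlike RewritePath, this changes the URL the browser and the search engine see, so old index.html links are gradually replaced by the .aspx ones.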