Im tring to webparse some website but what im getting is a warn that tells me to turn on
javascript.
In order to bypass it I need to use WebBrowser but this method taking a lot of RAM when doing 18 requests it can go up to 3GB of ram .
Try
Dim Request As HttpWebRequest = HttpWebRequest.Create(TextBox1.Text)
Request.Proxy = Nothing
Request.UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:92.0) Gecko/20100101 Firefox/92.0"
Dim Response As HttpWebResponse = Request.GetResponse
Dim ResponseStream As System.IO.Stream = Response.GetResponseStream
Dim StreamReader As New System.IO.StreamReader(ResponseStream)
Dim Data As String = StreamReader.ReadToEnd
StreamReader.Close()
RichTextBox1.Text = Data
Catch ex As Exception
MsgBox("Error")
End Try
Output :
<!DOCTYPE html>
<html>
<script src="/vddosw3data.js"></script>
<body>
<div w3-include-html="/5s.html"></div>
<noscript><h1 style="text-align:center;color:red;"><strong>Please turn JavaScript on and
reload the page.</strong></h1></noscript>
<script>
Is there anyway to baypass it without using a webbrowser?
Related
hello i was using the code for a while in my vb.net app but today when i tried it it returned a 403 error.
Shared Function GetHtmlPage(ByVal http://mmo-stream.net/AiaSpecto/tester.php As String) As String
Dim strResult As String
Dim objResponse As WebResponse
Dim objRequest As WebRequest
objRequest = HttpWebRequest.Create(strURL)
objResponse = objRequest.GetResponse()
Using sr As New StreamReader(objResponse.GetResponseStream())
strResult = sr.ReadToEnd()
sr.Close()
End Using
Return strResult
End Function
I really cant understand why i get a 403 error because when i go on the link itself i get the OK message.
The request wants a useragent header.
If you change the object to HttpWebResponse and HttpWebRequest and add a generic useragent it should return OK :
Dim objResponse As HttpWebResponse
Dim objRequest As HttpWebRequest
objRequest = HttpWebRequest.Create(strURL)
objRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.115 Safari/537.36"
objResponse = objRequest.GetResponse()
i have a vb.net console application that logged into a website (POST form) by using Webclient:
Dim responsebytes = myWebClient.UploadValues("https:!!xxx.com/mysession/create", "POST", myNameValueCollection)
Last friday this suddenly stopped working, it worked without a problem for about 2-3 years. With Fiddler I got a HTTP 504 error but without Fiddler I got the error message:
The underlying connection was closed: The connection was closed unexpectedly.
I assume that something on the server-side has changed, but I have no influence on that. It's a commercial website, where I want to login automatically on my account to fetch some data.
As Fiddler can't help me much further I decided to built a basic HttpWebRequest example to rule out it was caused by the WebClient.
The example does:
navigate to the homepage of the company and read out an securityToken (this goes ok!)
post the securityToken + username + password to get logged in.
Public Class Form1
Const ConnectURL = "https:!!member.company.com/homepage/index"
Const LoginURL = "https:!!member.company.com/account/logn"
Private Function RegularPage(ByVal URL As String, ByVal CookieJar As CookieContainer) As String
Dim reader As StreamReader
Dim Request As HttpWebRequest = HttpWebRequest.Create(URL)
Request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.36"
Request.AllowAutoRedirect = False
Request.CookieContainer = CookieJar
Dim Response As HttpWebResponse = Request.GetResponse()
reader = New StreamReader(Response.GetResponseStream())
Return reader.ReadToEnd()
reader.Close()
Response.Close()
End Function
Private Function LogonPage(ByVal URL As String, ByRef CookieJar As CookieContainer, ByVal PostData As String) As String
Dim reader As StreamReader
Dim Request As HttpWebRequest = HttpWebRequest.Create(URL)
Request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.36"
Request.CookieContainer = CookieJar
Request.AllowAutoRedirect = False
Request.ContentType = "application/x-www-form-urlencoded"
Request.Method = "POST"
Request.ContentLength = PostData.Length
Dim requestStream As Stream = Request.GetRequestStream()
Dim postBytes As Byte() = Encoding.ASCII.GetBytes(PostData)
requestStream.Write(postBytes, 0, postBytes.Length)
requestStream.Close()
Dim Response As HttpWebResponse = Request.GetResponse()
For Each tempCookie In Response.Cookies
CookieJar.Add(tempCookie)
Next
reader = New StreamReader(Response.GetResponseStream())
Return reader.ReadToEnd()
reader.Close()
Response.Close()
End Function
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
Dim CookieJar As New CookieContainer
Dim PostData As String
Try
Dim homePage As String = (RegularPage(ConnectURL, CookieJar))
Dim securityToken = homePage.Substring(homePage.IndexOf("securityToken") + 22, 36) 'momenteel 36 characters lang
PostData = "securityToken=" + securityToken + "&accountId=123456789&password=mypassword"
MsgBox(PostData)
Dim accountPage As String = (LogonPage(LoginURL, CookieJar, PostData))
Catch ex As Exception
MsgBox(ex.Message.ToString)
End Try
End Sub
End Class
This line causes the connection to be closed:
Dim requestStream As Stream = Request.GetRequestStream()
Is it possible that this company doesnt like the automated login and somehow notices that a application is used for logging in? How can I debug this? Fiddler doesn't seem to work. Is my only option WireShark as this seems kind of difficult to me.
Also is it weird that the connection is already is closed before I do the Post?
Are there other languages I can program this "easily" to rule out it's VB.net / .NET problem?
Have you attempted to capture the request using something like your browser's networking tools?
The auth process may have changed. Could even be some name or post data changes.
I got this fixed by:
double checking all the headers to be sent when using a browser
made sure all those headers where sent by the VB.NET application.
Not sure which one did the trick, but just always make sure you replicate all the headers that the browser would sent!
VB2012: I'm trying to login to my company website to parse out some info. It has a typical page
portal.mycompany.com
which redirects eventually to
security.mycompany.com/login.jsp?TYPE=xxx&METHOD=GET&{more parameters}
and there we are presented with textboxes for user name and password. I have Fiddler running and am a little lost as to what to look for when setting up my POST. My example looks to be the same from various coding sites. I am mainly looking for help on what to look for in Fiddler to use as a basis for the POST request to login to my site programmatically.
I took a look at some of the Fiddler entries and added what I thought was the enrty with the credentials. But when i tried adding this to the POST request it just responded with the original page.
Dim cookieJar As New Net.CookieContainer()
Dim request As Net.HttpWebRequest
Dim response As Net.HttpWebResponse
Dim strURL As String
Try
'Get Cookies
strURL = "http://portal.mycompany.com"
request = Net.HttpWebRequest.Create(strURL)
request.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3"
request.Method = "GET"
request.CookieContainer = cookieJar
response = request.GetResponse()
For Each tempCookie As Net.Cookie In response.Cookies
cookieJar.Add(tempCookie)
Next
'Send the post data now
request = Net.HttpWebRequest.Create(strURL)
request.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3"
request.Method = "POST"
request.AllowAutoRedirect = True
request.CookieContainer = cookieJar
Dim writer As StreamWriter = New StreamWriter(request.GetRequestStream())
writer.Write("email=username&pass=password") 'where do I get this in Fiddler?
writer.Close()
response = request.GetResponse()
'Get the data from the page
Dim stream As StreamReader = New StreamReader(response.GetResponseStream())
Dim data As String = stream.ReadToEnd()
response.Close()
If data.Contains("<title>MyCompany") = True Then
'LOGGED IN SUCCESSFULLY
End If
Catch ex As Exception
MsgBox(ex.Message)
End Try
I have a url:
test.com/Search/NumberSearch.aspx
On the page there are a number of controls, one of them is textbox. When the user enters a six digit (approximately) number into the textbox and hits enter, the page goes to another page:
test.com/Data/DetailsPage.aspx?mynum=123456
on that page there are a number of textboxes and other controls from which I need to scrape the data in addition to a number of links that I need to caputure in my code.
I have tried using VB.NET WebRequest:
Dim wreq As WebRequest = WebRequest.Create("test.com/Data/DetailsPage.aspx?mynum=" & num)
Dim wresp As HttpWebResponse = CType(wreq.GetResponse(), HttpWebResponse)
Dim dStream As Stream = wresp.GetResponseStream()
Dim rdr As New StreamReader(dStream)
Dim respStr As String = rdr.ReadToEnd()
As a result my respStr contains a string with html code but that code is for
test.com/Search/NumberSearch.aspx
not for the resulting
test.com/Data/DetailsPage.aspx?mynum=123456
page with details.
My goal is to get the details page html programmatically.
I also tried using
WebClient.DownloadString
but gotten the same result. Can anyone help?
I would try setting the User-Agent header, as that is what many sites key off of:
Dim wreq As WebRequest = WebRequest.Create("test.com/Data/DetailsPage.aspx?mynum=" & num)
wreq.Headers.Add(HttpRequestHeader.UserAgent, "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36")
Dim wresp As HttpWebResponse = CType(wreq.GetResponse(), HttpWebResponse)
Dim dStream As Stream = wresp.GetResponseStream()
Dim rdr As New StreamReader(dStream)
Dim respStr As String = rdr.ReadToEnd()
I need to use HTTPWebRequest to login to an external website and redirect me to the default page. My code below is behind a button - when clicked it currently tries to do some processing but stays on the same page. I need it to redirect me to the default page of the external website without seeing the login page. Any help on what I'm doing wrong?
Dim loginURL As String = "https://www.example.com/login.aspx"
Dim cookies As CookieContainer = New CookieContainer
Dim myRequest As HttpWebRequest = CType(WebRequest.Create(loginURL), HttpWebRequest)
myRequest.CookieContainer = cookies
myRequest.AllowAutoRedirect = True
myRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20100101 Firefox/13.0.1"
Dim myResponse As HttpWebResponse = CType(myRequest.GetResponse(), HttpWebResponse)
Dim responseReader As StreamReader
responseReader = New StreamReader(myResponse.GetResponseStream())
Dim responseData As String = responseReader.ReadToEnd()
responseReader.Close()
'call a function to extract the viewstate needed to login
Dim ViewState As String = ExtractViewState(responseData)
Dim postData As String = String.Format("__VIEWSTATE={0}&txtUsername={1}&txtPassword={2}&btnLogin.x=27&btnLogin.y=9", ViewState, "username", "password")
Dim encoding As UTF8Encoding = New UTF8Encoding()
Dim data As Byte() = encoding.GetBytes(postData)
'POST to login page
Dim postRequest As HttpWebRequest = CType(WebRequest.Create(loginURL), HttpWebRequest)
postRequest.Method = "POST"
postRequest.AllowAutoRedirect = True
postRequest.ContentLength = data.Length
postRequest.CookieContainer = cookies
postRequest.ContentType = "application/x-www-form-urlencoded"
postRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20100101 Firefox/13.0.1"
Dim newStream = postRequest.GetRequestStream()
newStream.Write(data, 0, data.Length)
newStream.Close()
Dim postResponse As HttpWebResponse = CType(postRequest.GetResponse(), HttpWebResponse)
'using GET request on default page
Dim getRequest As HttpWebRequest = CType(WebRequest.Create("https://www.example.com/default.aspx"), HttpWebRequest)
getRequest.CookieContainer = cookies
getRequest.AllowAutoRedirect = True
Dim getResponse As HttpWebResponse = CType(getRequest.GetResponse(), HttpWebResponse)
'returns statuscode = 200
FYI - when i add in this code at the end, i get the HTML of the default page I'm trying to redirect to
Dim responseReader1 As StreamReader
responseReader1 = New StreamReader(getRequest.GetResponse().GetResponseStream())
responseData = responseReader1.ReadToEnd()
responseReader1.Close()
Response.Write(responseData)
Any help on whats missing to get the redirect working?
Cheers
The HttpWebRequest only automatically redirects you if the server sends an HTTP 3xx redirection status with a Location field in the response. Otherwise you are supposed to manually navigate to the page by using Response.Redirect, for example. Also keep in mind that the automatic redirection IGNORES ANY COOKIES sent by the server. That may be the problem in your case if the server is actually sending a redirection status.