VB.net Parsing HTML 100 times. Will it work?

Imports System.IO
Imports System.Web
Imports System.Net
Imports System.Net.ServicePointManager
Public Class GetSource
    Function GetHtml(ByVal strPage As String) As String
tryAgain:
        ServicePointManager.UseNagleAlgorithm = True
        ServicePointManager.Expect100Continue = True
        ServicePointManager.CheckCertificateRevocationList = True
        ServicePointManager.DefaultConnectionLimit = 100
        Dim strReply As String = "NULL"
        Try
            Dim objhttprequest As System.Net.HttpWebRequest
            Dim objhttpresponse As System.Net.HttpWebResponse
            objhttprequest = System.Net.HttpWebRequest.Create(strPage)
            ' proxyObject is not shown in the question
            objhttprequest.Proxy = proxyObject
            objhttprequest.AllowAutoRedirect = True
            objhttprequest.Timeout = 100000
            objhttpresponse = objhttprequest.GetResponse
            Dim objstrmreader As New StreamReader(objhttpresponse.GetResponseStream)
            strReply = objstrmreader.ReadToEnd()
        Catch ex2 As System.Net.WebException
            GoTo tryAgain
        Catch ex As Exception
            strReply = "ERROR! " + ex.Message.ToString
            GoTo tryAgain
        End Try
        Return strReply
    End Function
End Class
What I have here is VB.NET code that downloads a website's HTML so I can parse it.
This function works fine.
The question is this:
1. If I run 100 threads with this function at the same time, will it work?
2. Won't it affect my internet connection as well?
I don't want to waste time creating a hundred threads and the code for them, so if you know the answer, please advise me on what I should do instead.

One thing I see that could cause you problems is the GoTo. You retry if you get an error, but there is no way to break out of the method if an error occurs every time you request the page, so you end up in an infinite loop. You should add a check so that it only tries again if some cancel flag has not been set.
Second, there could be issues with the number of threads you run, depending on how much work each thread must do. There is a CPU and memory cost for each thread, and it could peg your machine, especially if one of them gets stuck in an infinite loop.
Everything else gets an "it depends." Your PC and internet connection will determine everything else. There are tools available to monitor this, and I would suggest using them to see what works. I found this page with a lot of information; it might have what you are looking for: http://www.slac.stanford.edu/xorg/nmtf/nmtf-tools.html. Hope this helps.
Wade
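The check suggested in the answer above could take the form of a bounded retry loop. This is only a minimal sketch, not the asker's code: the module name, the maxAttempts parameter, and the back-off delay are assumptions added for illustration, and the proxy setup from the question is left out.

```vbnet
Imports System.IO
Imports System.Net
Imports System.Threading

Module GetHtmlSketch
    ' Hypothetical variant of GetHtml: gives up after maxAttempts failures
    ' instead of retrying forever with GoTo.
    Public Function GetHtml(strPage As String, Optional maxAttempts As Integer = 3) As String
        Dim lastError As String = "no attempt made"
        For attempt As Integer = 1 To maxAttempts
            Try
                Dim request As HttpWebRequest = DirectCast(WebRequest.Create(strPage), HttpWebRequest)
                request.AllowAutoRedirect = True
                request.Timeout = 100000
                ' Using disposes the response and reader even when reading fails
                Using response As WebResponse = request.GetResponse()
                    Using reader As New StreamReader(response.GetResponseStream())
                        Return reader.ReadToEnd()
                    End Using
                End Using
            Catch ex As WebException
                lastError = ex.Message
                ' Assumed back-off: wait a little longer after each failure
                Thread.Sleep(1000 * attempt)
            End Try
        Next
        Return "ERROR! " & lastError
    End Function
End Module
```

With this shape, a page that fails every time costs at most maxAttempts requests per thread rather than looping indefinitely.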

Related

How to set variable in second process from first process

First of all: I'm a newbie in this forum and not a professional programmer; this is just a hobby.
This is what I have: one solution in Visual Studio with three projects in VB.NET.
The first project contains common functions etc.
The second project is a Windows service.
The third project is a Windows Forms UI.
The second and third projects import the first project.
My problem: when the service from the second project is running and the UI from the third project is started, I want to set a variable (e.g. by pressing a button) that will be set in the service too, so that the service is informed to do some special things.
I've tried to declare this variable in the common project, but this doesn't work. After searching a bit I now know this can't work, because the service and the UI are separate processes.
There are several ways to communicate between processes, e.g. IPC, shared memory, named pipes...
But isn't there a simple way to solve my problem?
Thanks a lot and with best regards,
Matthias
OK, thanks for your answers!! :-)
I decided to try it with memory mapped file.
This is my making function:
Public Function MakeMemoryMappedFile(ByVal pValue As String) As String
    Dim Buffer As Byte() = ASCIIEncoding.ASCII.GetBytes(pValue)
    Try
        Dim mmf As MemoryMappedFile = MemoryMappedFile.CreateOrOpen("test", 10000)
        Dim accessor As MemoryMappedViewAccessor = mmf.CreateViewAccessor()
        accessor.Write(54, CType(Buffer.Length, UShort))
        accessor.WriteArray(54 + 2, Buffer, 0, Buffer.Length)
        Return "ok"
    Catch ex As Exception
        Return "Error = " & ex.Message
    End Try
End Function
And this is my reading function:
Public Function ReadMemoryMappedFile() As String
    Try
        Dim mmf As MemoryMappedFile = MemoryMappedFile.OpenExisting("test")
        Dim accessor As MemoryMappedViewAccessor = mmf.CreateViewAccessor()
        Dim Size As UShort = accessor.ReadUInt16(54)
        Dim Buffer As Byte() = New Byte(Size - 1) {}
        accessor.ReadArray(54 + 2, Buffer, 0, Buffer.Length)
        Return ASCIIEncoding.ASCII.GetString(Buffer)
    Catch noFile As FileNotFoundException
        Return "No File found ..."
    Catch ex As Exception
        Return "Error = " & ex.Message
    End Try
End Function
My application has one button for setting the value and one for reading it, and that works fine.
If I start the application twice, set the value in the first instance and then read it from the second instance, that works fine too.
But if I set the value in the application, my Windows service can't read it. I always get a FileNotFoundException.
Could you please tell me what's wrong?
Thanks!
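One possible explanation, offered as an assumption rather than a confirmed diagnosis: a Windows service runs in a different session from the desktop application, and a mapping named plain "test" is only visible within the session that created it. Prefixing the name with "Global\" places it in the machine-wide namespace. The sketch below also keeps the mapping in a field, because a mapping held only by a local variable can disappear once that object is collected. The module and member names are hypothetical.

```vbnet
Imports System.IO
Imports System.IO.MemoryMappedFiles
Imports System.Text

Module GlobalMapSketch
    ' Assumption: the "Global\" prefix is what makes the name visible across sessions
    Private Const MapName As String = "Global\test"
    Private Const Offset As Long = 54

    ' Held in a field so the mapping stays alive while this process runs
    Private m_map As MemoryMappedFile

    Public Sub WriteValue(value As String)
        If m_map Is Nothing Then
            m_map = MemoryMappedFile.CreateOrOpen(MapName, 10000)
        End If
        Dim buffer As Byte() = Encoding.ASCII.GetBytes(value)
        Using accessor As MemoryMappedViewAccessor = m_map.CreateViewAccessor()
            ' Length prefix first, then the bytes, as in the question
            accessor.Write(Offset, CUShort(buffer.Length))
            accessor.WriteArray(Offset + 2, buffer, 0, buffer.Length)
        End Using
    End Sub

    Public Function ReadValue() As String
        Try
            Using map As MemoryMappedFile = MemoryMappedFile.OpenExisting(MapName)
                Using accessor As MemoryMappedViewAccessor = map.CreateViewAccessor()
                    Dim size As UShort = accessor.ReadUInt16(Offset)
                    Dim buffer As Byte() = New Byte(size - 1) {}
                    accessor.ReadArray(Offset + 2, buffer, 0, buffer.Length)
                    Return Encoding.ASCII.GetString(buffer)
                End Using
            End Using
        Catch ex As FileNotFoundException
            ' No process currently holds the mapping open
            Return Nothing
        End Try
    End Function
End Module
```

In practice the service would probably have to create the mapping first, since creating objects in the Global namespace needs a privilege that ordinary desktop processes usually lack, and the mapping's access permissions may have to allow the UI's user account. Neither step is shown here.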

Filestream read only locking PC

I'm trying to read the Windows update log on remote PCs on my LAN. Most of the time I can successfully read the file but at times the program locks up. Likely due to one issue or another - doesn't really matter. What I do need is a way to recover when the Filestream/Streamreader locks up - I'm not sure which is causing the lock. Some streams can set a timeout but the filestream below returns False on a .CanTimeout call.
How can I break out if the stream locks up? (Sometimes the lock is so tight a power off is needed to recover.)
Is there a way to test if the stream will fail before I actually attempt the read?
Is there an alternate way to read a remote log file that another program has open? (I'm using the stream method because the regular File.IO was blocked because the file is open on the remote PC.)
I'm getting closer (I think) with this code. I borrowed the pathExists code from the referenced post, although it came from the original question and not from an answer.
Imports System.IO
Imports System.Threading
...
Function GetAULog(PCName As String) As String
    Try
        Dim sLogPath As String = String.Format("\\{0}\c$\Windows\SoftwareDistribution\ReportingEvents.log", PCName)
        If PCName = My.Computer.Name Then
            sLogPath = "C:\Windows\SoftwareDistribution\ReportingEvents.log"
        End If
        ' read file open by another process
        If Not pathExists(sLogPath) Then
            MsgBox("AU log file not found - PC on?")
            Return "NA"
        End If
        Using fs As New FileStream(sLogPath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)
            Using sr As New StreamReader(fs)
                Dim s As String = sr.ReadToEnd
                Return s
            End Using
        End Using
    Catch ex As Exception
        MsgBox(ex.Message)
        Return ""
    End Try
End Function
Public Function pathExists(path As String) As Boolean
    Dim exists As Boolean = True
    Dim t As New Thread(New ThreadStart(Sub() exists = System.IO.File.Exists(path)))
    t.Start()
    Dim completed As Boolean = t.Join(500)
    'half a sec of timeout
    If Not completed Then
        exists = False
        t.Abort()
    End If
    t = Nothing
    Return exists
End Function
At least when the PC is off the pathExists() code returns False in short order.
My problem now is the process does not end when the program exits - at least in the IDE, didn't check runtime.
I added t = Nothing but that didn't help. I couldn't figure out the proper Using syntax to test that. How do I properly cleanup after a thread timeout?
I've run into this locks-until-restart problem as well. It seems to be caused by the TCP/IP auto-tuning feature. You can cure this issue by running
netsh interface tcp set global autotuninglevel=disabled
Run this on both machines if you have access. I tried a few workarounds for this issue, such as checking for locks, but the only way I could solve it was to disable this feature. The issue is not really with locking but with something at a lower level in the file sharing protocol.
See this article for more detail
"Final" code shown below. The exceptions are not firing when the timeout occurs so the .Abort was evidently OK.
When the timeout does occur because the remote PC did not respond, a process is left hanging, which goes away after 30 seconds or so. I notice this when using the IDE: I run the program and test a PC that is off. If I then exit the program, the form closes but the IDE hangs for ~30 seconds. I can click Stop Debugging at this point and it works, but otherwise the IDE continues on its own after the ~30 second timeout.
I guess the t = Nothing in the Finally block does not dispose of the thread. t.Dispose does not exist.
So, things are working OK with the exception of the dangling thread that eventually clears itself up. The program is no longer hanging to the point where it cannot be stopped.
'Imports System.IO
'Imports System.Threading
Public Function pathExists(path As String) As Boolean
    ' check for file exists on remote PC
    Dim exists As Boolean = False
    Dim t As New Thread(New ThreadStart(Sub() exists = System.IO.File.Exists(path)))
    Try
        t.Start()
        Dim completed As Boolean = t.Join(500)
        'half a sec of timeout
        If Not completed Then
            exists = False
            t.Abort()
        End If
    Catch ex2 As ThreadInterruptedException
        MsgBox("timeout on AU log exists test" & vbNewLine & ex2.Message,, "ThreadInterruptedException")
    Catch exAbort As ThreadAbortException
        MsgBox("timeout on AU log exists test" & vbNewLine & exAbort.Message,, "ThreadAbortException")
    Catch ex As Exception
        MsgBox("exception on AU log exists test" & vbNewLine & ex.Message)
    Finally
        t = Nothing
    End Try
    Return exists
End Function
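A possible alternative to Thread.Abort, sketched under the assumption that .NET Framework 4.5 or later is available: run the File.Exists probe as a thread-pool task and simply stop waiting after the timeout. Thread-pool threads are background threads, so a probe that is still blocked should not keep the process alive at exit. The names PathCheckSketch, PathExistsWithTimeout and timeoutMs are hypothetical.

```vbnet
Imports System.IO
Imports System.Threading.Tasks

Module PathCheckSketch
    ' Returns False when the file is missing or the check does not finish in time.
    Public Function PathExistsWithTimeout(path As String, timeoutMs As Integer) As Boolean
        ' File.Exists returns False on errors instead of throwing,
        ' so the task is not expected to fault.
        Dim probe As Task(Of Boolean) = Task.Run(Function() File.Exists(path))
        If probe.Wait(timeoutMs) Then
            Return probe.Result
        End If
        ' Timed out: the probe keeps running in the background and is abandoned.
        Return False
    End Function
End Module
```

A call such as PathExistsWithTimeout(sLogPath, 500) would stand in for pathExists(sLogPath). The abandoned probe still occupies a thread-pool thread until the underlying network call returns, so this hides the wait rather than cancelling it.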

How do I solve 'System.OutOfMemoryException'

I have a Windows Service application. It is a very busy application. It is supposed to run continuously looking for things to do. After it runs for a while I get
Exception of type 'System.OutOfMemoryException' was thrown.
It can happen at different times, but usually in this routine:
Private Shared Function GetUnprocessedQueue() As Boolean
    Try
        Dim l_svcOOA As New svcITGOOA.IsvcITGOOAClient(OOAProcessing.cGlobals.EndPoint_ITGOOA)
        Dim l_iFilter As New svcITGOOA.clsFilter
        With l_svcOOA
            With l_iFilter
                .FilingType = OOAProcessing.cGlobals.FilingType
            End With
            m_ReturnClass = .itgWcfOOA(1, cGlobals.DatabaseIndicator, svcITGOOA.eOOAAction.GetUnprocessedQueue, l_iFilter, 71)
            Return CompletedGetUnprocessedQueue(m_ReturnClass)
        End With
    Catch ex As Exception
        ExceptionHandling(ex, "GetUnprocessedQueue " & m_Application)
        Return False
    End Try
End Function
This is using a wcf service to read a queue. It reads the queue every two minutes to see if new records have been added to it.
Please help me solve this. I don’t know where to start.
The OutOfMemoryException exception occurs when the GC has completed a cycle of collection but the memory is not available even after that. I couldn't make out what the above code snippet does, but I think using Weak References for objects could be useful.
I found the cause: I was creating a timer inside the same routine that handles its Elapsed event.
For example,
m_svcTimer = New Timers.Timer With {.Interval = m_Interval, .Enabled = True}
AddHandler m_svcTimer.Elapsed, AddressOf StartTheQueueIfTime
m_svcTimer.Enabled = True
m_svcTimer.Start()
was inside the routine StartTheQueueIfTime. I thought this would be a way to change the time interval. Instead, every call created a new timer with a new event handler. Eventually there were too many of them, and that caused my crash.
Bob
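One way to avoid the pile-up described above is to create the timer and attach its handler exactly once, and afterwards only change the Interval property. This is a minimal sketch under that assumption; the module name and the StartTimer and ChangeInterval helpers are hypothetical, and the handler body stands in for the real queue work.

```vbnet
Imports System.Timers

Module QueueTimerSketch
    Private m_svcTimer As System.Timers.Timer

    ' Call once at service start: creates the timer and attaches one handler.
    Public Sub StartTimer(intervalMs As Double)
        If m_svcTimer Is Nothing Then
            m_svcTimer = New System.Timers.Timer(intervalMs)
            AddHandler m_svcTimer.Elapsed, AddressOf StartTheQueueIfTime
        End If
        m_svcTimer.Start()
    End Sub

    ' Changes the period without creating another timer or handler.
    Public Sub ChangeInterval(intervalMs As Double)
        If m_svcTimer IsNot Nothing Then
            m_svcTimer.Interval = intervalMs
        End If
    End Sub

    Private Sub StartTheQueueIfTime(sender As Object, e As ElapsedEventArgs)
        ' Placeholder for the queue check; no timer is created in here.
    End Sub
End Module
```

Inside StartTheQueueIfTime, a call to ChangeInterval would then replace the block that built a new timer.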

Check if folder on web exists or not

I'm creating a desktop application. When I write a username in TextBox1 and click Button1, it should check whether a folder on the web exists.
So far I've tried this:
username = Me.TextBox1.Text
password = Me.TextBox2.Text
Dim dir As Boolean = IO.Directory.Exists("http://www.mywebsite.com/" + username)
If dir = True Then
    Dim response As String = web.DownloadString("http://www.mywebsite.com/" + username + "/Password.txt")
    If response.Contains(password) Then
        MsgBox("You've logged in successfully", MsgBoxStyle.Information)
        Exit Sub
    Else
        MsgBox("Password is incorrect!")
    End If
Else
    MsgBox("Username is wrong, try again!")
End If
First problem is that my boolean is giving FALSE as answer (directory exists for sure and all permissions are granted to see folder). I tried to solve that with setting dir = false and after that I go into first IF (but that's not what I want, since it should be TRUE, not FALSE)
There we come to second problem, in this line: Dim response As String=web.DownloadString("http://www.mywebsite.com/" + username + "/Password.txt") I get this error message: The remote server returned an error: (404) Not Found.
Is there anyone more experienced with this kind of thing who can help me?
IO.Directory.Exists will not work in this case. That method only works to check for a folder on a disk somewhere (locally or network) ; you can't use it to check for the existence of a resource over HTTP. (i.e a URI)
But even if it did work this way, it's actually pointless to call it before attempting to download - the method DownloadString will throw an exception if something goes wrong - as you have seen, in this case it's telling you 404 Not Found which means "This resource does not exist as far as you are concerned". **
So you should wrap the operation in a Try/Catch: catch exceptions of type WebException, cast the exception's Response member to HttpWebResponse, and check its StatusCode property.
A good example (albeit in C#) is here
** I say "as far as you are concerned" because for all you know, the resource may very well exist on the server, but it has decided to hide it from you because you do not have access to it, etc, and the developer of that site decided to return 404 in this case instead of 401 Unauthorised. The point being that from your point of view, the resource is not available.
Update:
here is the code from the answer I linked to, translated via this online tool because my VB is dodgy enough :). This code runs just fine for me in LinqPad, and produces the output "testlozinka"
Sub Main()
    Try
        Dim myString As String
        Using wc As New WebClient()
            myString = wc.DownloadString("http://dota2world.org/HDS/Klijenti/TestKlijent/password.txt")
            Console.WriteLine(myString)
        End Using
    Catch ex As WebException
        Console.WriteLine(ex.ToString())
        If ex.Status = WebExceptionStatus.ProtocolError AndAlso ex.Response IsNot Nothing Then
            Dim resp = DirectCast(ex.Response, HttpWebResponse)
            If resp.StatusCode = HttpStatusCode.NotFound Then
                ' HTTP 404: the page was not found, so report it and stop here
                Console.WriteLine("Page not found")
                Return
            End If
        End If
        ' rethrow any other exception - this should not occur
        Throw
    End Try
End Sub
Hope that helps.

My url checker function is hanging application in vb.net

Here is my VB.NET 2008 code:
Public Function CheckURL(ByVal URL As String) As Boolean
    Try
        Dim Response As Net.WebResponse = Nothing
        Dim WebReq As Net.HttpWebRequest = Net.HttpWebRequest.Create(URL)
        Response = WebReq.GetResponse
        Response.Close()
        Return True
    Catch ex As Exception
        ' Any failure counts as an invalid URL
        Return False
    End Try
End Function
While a URL is being checked, my application hangs for a while. Is it possible to check the whole URL list smoothly without hanging my application?
Is there a faster way to check URLs?
Note: I have about 800 URLs in a file, and I need to check whether each link is valid based on the website's response.
If an exception occurs, the WebResponse object isn't properly disposed of. This can lead to your app running out of connections. Something like this will work better:
Try
    Dim WebReq As Net.HttpWebRequest = Net.HttpWebRequest.Create(URL)
    Using Response = WebReq.GetResponse()
        Return True
    End Using
Catch ex As WebException
    Return False
End Try
The Using keyword ensures that the response is closed and disposed whenever that block exits.
If it's the server itself that's taking a while to respond, look into the BeginGetResponse method on HttpWebRequest. Check MSDN for a sample on how to use it. But be warned, that way also lies madness if you are not careful.
The answer is twofold:
1. Most of the waiting time is due to downloading content you don't need. If you request only the header, you will receive substantially less data, which will make your process faster.
2. As Matt identified, you aren't disposing of your connections, which may slow your process.
Expanding on Matt's answer, do the following:
Try
    Dim WebReq As Net.HttpWebRequest = Net.HttpWebRequest.Create(URL)
    WebReq.Method = "HEAD" 'This is the important line.
    Using Response = WebReq.GetResponse()
        Return True
    End Using
Catch ex As WebException
    Return False
End Try
GetResponse delivers the whole content for your request. If that is what you want, there is not much room to speed up the request on the client side, since it mostly depends on the URL's server how fast it replies and how much data is sent over. If you just want to check whether the URL is valid (or responding at all), it might be better to just ping it.
Keep in mind that the response isn't disposed when GetResponse runs into an error, so use the code posted by Matt to avoid this.
For your other problem, the hanging application, you might avoid it by running this code on a separate thread.
This works basically like this (from here):
' at the top of the code
Imports System.Threading
...
' your event handler, e.g. a button click or whatever
' (trd is assumed to be declared elsewhere as a Thread)
trd = New Thread(AddressOf ThreadTask)
trd.IsBackground = True
trd.Start()
' your code
Private Sub ThreadTask()
    Dim i As Long
    Do
        i += 1
        Thread.Sleep(100)
    Loop
End Sub
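Putting the suggestions from the answers together, the 800 URLs could be checked with HEAD requests on worker threads so the UI thread is never blocked. This is a sketch under several assumptions: .NET Framework 4.5 or later, http or https URLs only, and a concurrency limit of 8 and a 10 second timeout chosen arbitrarily. The module and function names are hypothetical.

```vbnet
Imports System.Collections.Concurrent
Imports System.Collections.Generic
Imports System.Net
Imports System.Threading.Tasks

Module UrlCheckSketch
    ' True when the server answers a HEAD request without an error status.
    Public Function CheckUrl(url As String) As Boolean
        Try
            ' Assumes an http/https URL; other schemes would fail the cast
            Dim req As HttpWebRequest = DirectCast(WebRequest.Create(url), HttpWebRequest)
            req.Method = "HEAD"
            req.Timeout = 10000
            Using resp As WebResponse = req.GetResponse()
                Return True
            End Using
        Catch ex As WebException
            Return False
        Catch ex As UriFormatException
            Return False
        End Try
    End Function

    ' Checks all URLs on worker threads, a few at a time.
    Public Function CheckAll(urls As IEnumerable(Of String)) As ConcurrentDictionary(Of String, Boolean)
        Dim results As New ConcurrentDictionary(Of String, Boolean)()
        Dim options As New ParallelOptions With {.MaxDegreeOfParallelism = 8}
        Parallel.ForEach(urls, options, Sub(u) results(u) = CheckUrl(u))
        Return results
    End Function
End Module
```

From a button click handler, wrapping the call as Task.Run(Function() CheckAll(urls)) would keep the whole batch off the UI thread. Note that some servers reject HEAD requests, so a False result here does not always mean the page is missing.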