Out Of Memory Exception - unmanaged memory - vb.net

I am working on a web scraper, and it generally works quite well. It will go through thousands of pages on most sites and complete successfully with no issues.
On a few sites, I am repeatedly seeing the same issue.
Insufficient memory to continue the execution of the program.
Edit:
I used perfmon to determine that the leak is happening in unmanaged memory.
I know because "private bytes" keeps increasing as the program runs, while bytes in all heaps stays stable.
(Actually, it goes up and down, but gradually climbs. It usually runs out of memory in the code section I listed above, but I don't think that section is the cause; it is more likely the first victim because it uses a lot of memory, although I think it releases it afterwards.)
Edit 2:
I followed the directions on this site:
http://www.codeproject.com/Articles/42721/Best-Practices-No-5-Detecting-NET-application-memo
and I used DebugDiag to inspect the program.
After analyzing the data, DebugDiag told me what was responsible for the leak:
jscript.dll is responsible for 1.10 GBytes worth of outstanding allocations. The following are the top 2 memory consuming functions:
jscript!Parser::GenerateCode+167: 498.19 MBytes worth of outstanding allocations.
jscript!NoRelAlloc::PvAlloc+96: 292.99 MBytes worth of outstanding allocations.
I am not referencing jscript.dll in my application, so it must be used by the web browser controls I am using:
System.Windows.Forms.WebBrowser
That's my guess, at least.
I am also getting a message box that pops up with the title "Message From webpage" that says something to the effect of "out of memory at line X."
So I figured that I could just dispose of the WebBrowser objects and get my memory back, and I added a button with the following code:
Me.wbMain.Dispose() 'dispose all of the web browsers
frmDebugger.wbDebugMain.Dispose()
Me.WBNewWin.Dispose()
GC.Collect() 'just for the heck of it
So, after running it for a while, I stopped scraping and clicked my new button. It didn't make any difference at all. I was watching the total "Private Bytes" in perfmon, and it didn't even move.
Any ideas, anyone?
Edit 3:
I have tried a number of the recommended solutions, but none of them seem to work.
Someone suggested that it may be due to images not being cleared from the cache, but I disabled image loading, so I know that is not the problem.
I also heard that IE7 had an issue, and that upgrading to IE8 would resolve it. I have IE8 and it still leaks memory.
Someone suggested that minimizing the form with the WebBrowser control would release some memory. I tried, and it does not make a difference.
I have also been told that I should not expect the memory use to just drop, as I will have to wait for the garbage collector. But this is not a leak in managed code, so GC.Collect() won't do anything; it is in unmanaged memory. Apparently the JavaScript engine uses different memory, and there is no manual way to force a collection. It is getting to the point where the program crashes, so obviously there is a problem.
I am adding a bounty of 50 to this question, and I will award it to anyone who helps me solve the leak. I wanted to try this solution:
http://www.codeproject.com/Questions/322884/WPF-WebBrowser-control-vs-Internet-Explorer-browse
but I am unable to figure out what the VB.NET equivalent would be. I have tried online converters, and they fail when converting this code (though they work fine for other code I have converted in the past).
If I am unable to solve the leak, I will award the bounty to anyone who converts the code on the page mentioned above from C# to VB.NET.
My fallback plan is to create a separate application that only contains the WebBrowser, and communicate with that process until it runs low on memory, at which point I will restart it (memory is released when I close my application completely). This is far from ideal for my application, as the WebBrowser is woven pretty tightly into my project.
Edit 4
I tried to implement the suggested JavaScript injection. Here is my code:
(I fire it just before navigating to a new page)
Public Shared Sub Clean_JS(ByRef wb As System.Windows.Forms.WebBrowser)
Dim args As Object() = {"document.body"}
Dim head As HtmlElement = wb.Document.GetElementsByTagName("head")(0)
Dim scriptEl0 As HtmlElement = wb.Document.CreateElement("script")
Dim element0 As mshtml.IHTMLScriptElement = DirectCast(scriptEl0.DomElement, mshtml.IHTMLScriptElement)
element0.text = "function ReleaseHandler() {" + vbCrLf + " var EvtMgr = (function() {" + vbCrLf + " var listenerMap = {};" + vbCrLf + " " + vbCrLf + " // Public interface" + vbCrLf + " return {" + vbCrLf + " addListener: function(evtName, node, handler) {" + vbCrLf + " node[""on"" + evtName] = handler;" + vbCrLf + " var eventList = listenerMap[evtName];" + vbCrLf + " if (!eventList) {" + vbCrLf + " eventList = listenerMap[evtName] = [];" + vbCrLf + " }" + vbCrLf + " eventList.push(node);" + vbCrLf + " }," + vbCrLf + " " + vbCrLf + " removeAllListeners: function() {" + vbCrLf + " for (var evtName in listenerMap) {" + vbCrLf + " var nodeList = listenerMap[evtName];" + vbCrLf + " for (var i = 0, node; node = nodeList[i]; i++) {" + vbCrLf + " node[""on"" + evtName] = null;" + vbCrLf + " }" + vbCrLf + " }" + vbCrLf + " }" + vbCrLf + " }" + vbCrLf + " })();" + vbCrLf + " }"
head.AppendChild(scriptEl0)
Dim scriptEl1 As HtmlElement = wb.Document.CreateElement("script")
Dim element1 As mshtml.IHTMLScriptElement = DirectCast(scriptEl1.DomElement, mshtml.IHTMLScriptElement)
element1.text = "function ReleaseHandler() {" + vbCrLf + " var EvtMgr = (function() {" + vbCrLf + " var listenerMap = {};" + vbCrLf + " " + vbCrLf + " // Public interface" + vbCrLf + " return {" + vbCrLf + " addListener: function(evtName, node, handler) {" + vbCrLf + " node[""on"" + evtName] = handler;" + vbCrLf + " var eventList = listenerMap[evtName];" + vbCrLf + " if (!eventList) {" + vbCrLf + " eventList = listenerMap[evtName] = [];" + vbCrLf + " }" + vbCrLf + " eventList.push(node);" + vbCrLf + " }," + vbCrLf + " " + vbCrLf + " removeAllListeners: function() {" + vbCrLf + " for (var evtName in listenerMap) {" + vbCrLf + " var nodeList = listenerMap[evtName];" + vbCrLf + " for (var i = 0, node; node = nodeList[i]; i++) {" + vbCrLf + " node[""on"" + evtName] = null;" + vbCrLf + " }" + vbCrLf + " }" + vbCrLf + " }" + vbCrLf + " }" + vbCrLf + " })();" + vbCrLf + " }"
head.AppendChild(scriptEl1)
wb.Document.InvokeScript("ReleaseHandler")
wb.Document.InvokeScript("purge", args)
End Sub
Unfortunately, I am still seeing private bytes increasing in perfmon.
Can anyone see any flaws in my logic? I am trying to implement this fix:
http://www.codeproject.com/Questions/322884/WPF-WebBrowser-control-vs-Internet-Explorer-browse
By the way, I tested it using simple code such as this:
object[] args = {"my important message"};
webBrowser1.Document.InvokeScript("alert",args);
and this:
Dim head As HtmlElement = wb.Document.GetElementsByTagName("head")(0)
Dim scriptEl As HtmlElement = wb.Document.CreateElement("script")
Dim element As mshtml.IHTMLScriptElement = DirectCast(scriptEl.DomElement, mshtml.IHTMLScriptElement)
element.text = "function sayHello() { alert('hello') }"
head.AppendChild(scriptEl)
wb.Document.InvokeScript("sayHello")
and it showed the message in both test cases.
Curiously, when I tried to test the script injection by doing this:
Dim head As HtmlElement = wbMain.Document.GetElementsByTagName("head")(0)
Dim scriptEl As HtmlElement = wbMain.Document.CreateElement("script")
Dim element As mshtml.IHTMLScriptElement = DirectCast(scriptEl.DomElement, mshtml.IHTMLScriptElement)
element.text = "function sayHello() { alert('hello') }"
head.AppendChild(scriptEl)
wbMain.Document.InvokeScript("sayHello")
RTB_RawHTML.Text = "TEST" + vbCrLf + wbMain.DocumentText
I didn't see the injected code reflected in the text box. The only change I saw was the word "TEST" appearing. (I run the code RTB_RawHTML.Text = wbMain.DocumentText when a page finishes loading, from the DocumentCompleted event.)

The code in your referenced article is not C#, it is JavaScript. I believe the idea is to inject the JS into your HTML page so that it can run when the page unloads, which will clean out the existing JS event handlers.
You can check out this article for adding JS to a page in your WebBrowser control:
http://www.codeproject.com/Articles/94777/Adding-a-Javascript-Block-Into-a-Form-Hosted-by-We
' The script is wrapped in CDATA because it contains "<" and ">",
' which would otherwise be parsed as XML inside the literal.
Dim scriptText As String =
<string><![CDATA[
function ReleaseHandler() {
    var EvtMgr = (function() {
        var listenerMap = {};
        // Public interface
        return {
            addListener: function(evtName, node, handler) {
                node["on" + evtName] = handler;
                var eventList = listenerMap[evtName];
                if (!eventList) {
                    eventList = listenerMap[evtName] = [];
                }
                eventList.push(node);
            },
            removeAllListeners: function() {
                for (var evtName in listenerMap) {
                    var nodeList = listenerMap[evtName];
                    for (var i = 0, node; node = nodeList[i]; i++) {
                        node["on" + evtName] = null;
                    }
                }
            }
        }
    })();
}
function purge(d){
    var a = d.attributes, i, l, n;
    if (a) {
        for (i = a.length - 1; i >= 0 ; i -= 1) {
            n = a[i].name;
            if (typeof d[n] === 'function') {
                d[n] = null;
            }
        }
    }
    a = d.childNodes;
    if (a) {
        l = a.length;
        for (i = 0; i < l; i += 1) {
            purge(d.childNodes[i]);
        }
    }
}
]]></string>.Value
Dim head As HtmlElement = webBrowser1.Document.GetElementsByTagName("head")(0)
Dim script As HtmlElement = webBrowser1.Document.CreateElement("script")
Dim domElement As IHTMLScriptElement = CType(script.DomElement, IHTMLScriptElement)
domElement.text = scriptText
head.AppendChild(script)
I've not tested this code (I'm not really sure how I'd go about doing that since you've offered no example code yourself)... this is more of a suggestion for how you might proceed. I've never tried to insert JS into a WebBrowser control, so I'm not quite sure how you'd go about executing it (since, in theory, the JS will have already executed after loading the page, thus your injected JS would be "late to the party").
You'll also need to find a way to wire-up the document so that it calls both of these functions when it unloads. The idea is to eliminate JS memory leaks by eliminating JS objects and events, so simply having the functions declared is insufficient. I've seen a lot of articles online discussing how the OnBeforeUnload event is broken in the WebBrowser control (it doesn't fire correctly), so you may have quite a bit of work cut out for you.

Maybe you can try code that avoids saving cookies to the user's computer, because temporary items can cause several issues on the user's computer.

Related

Processing TIF images with Marshal.Copy ends in a Fatal Error

I have a function that inserts one image into another byte-wise:
Public Function InsertPIP(ByRef BmpSrc As Bitmap, ByRef BmpTgt As Bitmap, InsertPoint As Point, Optional bpp As Int16 = 1) As Bitmap
If BmpSrc.Width + InsertPoint.X <= BmpTgt.Width And BmpSrc.Height + InsertPoint.Y <= BmpTgt.Height Then
Try
Dim DataSrc As Imaging.BitmapData = BmpSrc.LockBits(New Rectangle(0, 0, BmpSrc.Width, BmpSrc.Height), Imaging.ImageLockMode.[WriteOnly], Imaging.PixelFormat.Format1bppIndexed)
Dim BytesSrc As Byte() = New Byte(DataSrc.Height * DataSrc.Stride) {}
If VerboseLevel >= 3 Then LW.WriteLine("InsertPIP: Stride=" & DataSrc.Stride.ToString & " Scan=" & DataSrc.Scan0.ToInt64 & " DataSRC.Size=" & DataSrc.Width & "x" & DataSrc.Height & " BytesSRC.len=" & BytesSrc.Length.ToString & " Height*Stride=" & (DataSrc.Height * DataSrc.Stride).ToString)
Marshal.Copy(DataSrc.Scan0, BytesSrc, 0, BytesSrc.Length)
'^^^^^^^^^^^^ ERROR OCURS HERE ^^^^^^^^^^^^
Dim DataTgt As Imaging.BitmapData = BmpTgt.LockBits(New Rectangle(0, 0, BmpTgt.Width, BmpTgt.Height), Imaging.ImageLockMode.[WriteOnly], Imaging.PixelFormat.Format1bppIndexed)
Dim BytesTgt As Byte() = New Byte(DataTgt.Height * DataTgt.Stride - 1) {}
Marshal.Copy(DataTgt.Scan0, BytesTgt, 0, BytesTgt.Length)
BytesTgt = InsertPictureIntoPicture(BytesSrc, BytesTgt, BmpSrc.Width, 1, New Rectangle(InsertPoint.X, InsertPoint.Y, BmpSrc.Width, BmpSrc.Height), DataSrc.Stride * 8 / bpp, DataTgt.Stride * 8 / bpp)
Marshal.Copy(BytesTgt, 0, DataTgt.Scan0, BytesTgt.Length)
BmpSrc.UnlockBits(DataSrc)
BmpTgt.UnlockBits(DataTgt)
Catch ex As Exception
If VerboseLevel >= 3 Then LW.WriteLine("InsertPIP ERROR: " & ex.Message & vbCrLf)
End Try
Return BmpTgt
Else
LW.WriteLine("WARNING: Inserting bitmap into bitmap: Source doesn't fit the target. BmpSrc=[" & BmpSrc.Width & "," & BmpSrc.Height & "] BmpTgt=[" & BmpTgt.Width & "," & BmpTgt.Height & "] pt=[" & InsertPoint.X & "," & InsertPoint.Y & "]")
Return BmpTgt
End If
End Function
It has been working fine for more than a year in 99+% of cases. But some rare images cause a fatal error on the line Marshal.Copy(...) ("Fatal Error Detected - unable to continue. Unhandled operating system exception: e0434352"). I am trying to find the cause of the problem, but:
the debugger seems useless to me; it just says System.AccessViolationException and System.Reflection.TargetInvocationException with no other details
try-catch does not catch the fatal error
printing the inputs didn't help me:
InsertPIP: Stride=512 Scan=2562876506112 DataSRC.Size=4088x712 BytesSRC.len=364545 Height*Stride=364544
EDIT: I tried to isolate it into a standalone application and debug it there, and now I get a slightly more detailed exception message (which is what I expected):
System.AccessViolationException: 'Attempted to read or write protected memory. This is often an indication that other memory is corrupt.'
I generate all the TIF images, they're generated with the same methods and they have all `PixelFormat.Format1bppIndexed`.
I haven't been able to find much on the internet; suggestions are appreciated. If I can improve my question, please let me know how.
Link to the original TIF image: https://drive.google.com/file/d/1xTIQi5M9UxK7A98992W3PUNMOmODP8p2/view?usp=sharing
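One thing the logged values hint at is the buffer arithmetic: BytesSRC.len is 364545 while Height*Stride is 364544. In VB.NET, `New Byte(n) {}` takes the upper bound, so the array holds n + 1 elements, and passing `BytesSrc.Length` to Marshal.Copy then asks for one byte more than the locked bitmap data covers. The sketch below only reproduces that arithmetic (in Python, used here as a neutral language); whether this over-read is the actual cause of the crash is an assumption, and the helper names are hypothetical.

```python
def stride_bytes(width_px, bits_per_pixel):
    """Bytes per scan line, rounded up to a 4-byte boundary (GDI+ convention)."""
    bits = width_px * bits_per_pixel
    return ((bits + 31) // 32) * 4


def vb_array_length(upper_bound):
    """VB.NET `New Byte(n) {}` takes the upper bound, so it holds n + 1 items."""
    return upper_bound + 1


width, height = 4088, 712          # DataSRC.Size from the log line
stride = stride_bytes(width, 1)    # 1 bpp, as in Format1bppIndexed
locked = stride * height           # bytes covered by the locked bitmap data

src_len = vb_array_length(locked)      # New Byte(Height * Stride) {}
tgt_len = vb_array_length(locked - 1)  # New Byte(Height * Stride - 1) {}

print(stride, locked, src_len, tgt_len)
```

Note that the target buffer in the function subtracts 1 and therefore matches the locked size exactly, while the source buffer does not. A one-byte over-read would usually land in readable memory and only occasionally cross into a protected page, which would be consistent with the failure being rare.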

PrintDialog ToPage should be less than the page count

I recently upgraded my computer to Windows 10, and now one of the programs is acting weird since the upgrade.
I am trying to print a page range out of a PDF, and when I print pages 1 to 100 (out of 477 pages) I get an error saying ToPage should be less than the page count, even though 100 is less than 477.
If I skip the page range part and just have it print all of the pages, it works fine.
Sub PrintToPaperSync(ByVal InputfilePath As String, Optional ByVal DeleteAfter As Boolean = False)
On Error GoTo sError
Dim tError As String = ""
Dim toPage As Integer = 0
Console.WriteLine("PrintToPaper " & InputfilePath)
Dim File As String = Split(InputfilePath, "\")(Split(InputfilePath, "\").Length - 1)
Dim viewer As New Syncfusion.Windows.Forms.PdfViewer.PdfDocumentView
viewer.Load(InputfilePath)
Dim print As New System.Windows.Forms.PrintDialog()
print.Document = viewer.PrintDocument
print.Document.DocumentName = File
'print 100 pages at a time
Do While toPage < viewer.PageCount
print.Document.PrinterSettings.PrintRange = Drawing.Printing.PrintRange.SomePages
print.Document.PrinterSettings.FromPage = toPage + 1
toPage += 100
print.Document.PrinterSettings.ToPage = IIf(toPage < viewer.PageCount, toPage, viewer.PageCount)
tError = "From: " & print.Document.PrinterSettings.FromPage & " | To: " & print.Document.PrinterSettings.ToPage & " | PageCount: " & viewer.PageCount & " - "
print.Document.Print()
Application.DoEvents()
Loop
viewer.Unload()
viewer.Dispose()
Console.WriteLine("Printing: " & InputfilePath)
Exit Sub
sError:
Dim ErrorStr As String = ErrorToString()
WriteLine("PrintToPaper " & tError & " " & InputfilePath & " - " & ErrorStr)
End Sub
Full text of the error:
PrintToPaper From: 1 | To: 100 | PageCount: 477 - F:\Process\LogTag-10-30-15-104122.pdf - ToPage should be less than the page count
We want to only print 100 pages at a time, because the printer begins to slow down after 100 pages for some reason.
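For reference, the ranges the loop is meant to produce can be sketched as follows (in Python, as a neutral illustration; `page_ranges` is a hypothetical helper, not part of the Syncfusion API). It shows that every ToPage value stays within the page count, so the reported error is not explained by the range arithmetic itself.

```python
def page_ranges(page_count, chunk=100):
    """Yield 1-based, inclusive (from_page, to_page) pairs covering every page."""
    for start in range(1, page_count + 1, chunk):
        yield start, min(start + chunk - 1, page_count)


ranges = list(page_ranges(477))
for from_page, to_page in ranges:
    print(f"From: {from_page} | To: {to_page}")
```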
This printing problem is caused by an issue in the Syncfusion control. You can contact the Syncfusion support team to get it resolved.

Why is the file being overwritten?

I'm designing a new server application, which includes a subroutine that parses the input into the console window, for example
LogAlways("--- CPU detection ---")
will be written as:
[net 21:8:38.939] --- CPU detection ---
This is the subroutine:
Public Sub LogAlways(ByVal input As String)
Dim dm As String = "[net " + Date.Now.Hour.ToString + ":" + Date.Now.Minute.ToString + ":" + Date.Now.Second.ToString + "." + Date.Now.Millisecond.ToString + "] "
Console.WriteLine(dm + input)
Dim fName As String = Application.StartupPath() + "\LogBackups\" + Date.Now.Day.ToString + Date.Now.Month.ToString + "" + Date.Now.Year.ToString + ".log"
Dim stWt As New Global.System.IO.StreamWriter(fName)
stWt.Write(dm + input)
stWt.Close()
End Sub
This works, but only the last line of my desired input is written to the file.
Why is this happening, and how can I make it so that it does not overwrite the log file?
This is using the Wildfire Server API.
This is not a duplicate, as the destination question has a different answer which would otherwise not answer this question.
This occurs because the StreamWriter has not been told to append its output to the end of the file. The constructor has an overload with an append parameter that must be set to True; Visual Studio lists it among the StreamWriter overloads.
To declare it correctly:
Dim stWt As New Global.System.IO.StreamWriter(fName, True)
or in the subroutine:
Public Sub LogAlways(ByVal input As String)
Dim dm As String = "[net " + Date.Now.Hour.ToString + ":" + Date.Now.Minute.ToString + ":" + Date.Now.Second.ToString + "." + Date.Now.Millisecond.ToString + "] "
Console.WriteLine(dm + input)
Dim fName As String = Application.StartupPath() + "\LogBackups\" + Date.Now.Day.ToString + Date.Now.Month.ToString + "" + Date.Now.Year.ToString + ".log"
Dim stWt As New Global.System.IO.StreamWriter(fName, True)
stWt.Write(dm + input)
stWt.Close()
End Sub
This requires the following Imports:
System.IO
System.Windows.Forms
It will now correctly write to the end of the file. However, opening and closing the file with stWt.Close() on every call may cause issues, so a queuing system may be better:
Desired log output is inserted into a single-dimensional array
A Timer dumps this array to the log file every five to ten seconds, say
When this is done, the array is cleared
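The queue-and-flush idea above can be sketched as follows (in Python, as a neutral illustration rather than the VB.NET implementation). `BufferedLog` and the file name are hypothetical; a real program would call `flush()` from a timer, whereas here it is called by hand so the example stays deterministic.

```python
import os
import tempfile
from datetime import datetime


class BufferedLog:
    """Collects log lines in memory and appends them to a file on flush()."""

    def __init__(self, path):
        self.path = path
        self.pending = []

    def log(self, message):
        # Read the clock once so every part of the stamp agrees.
        stamp = datetime.now().strftime("%H:%M:%S.%f")[:-3]
        self.pending.append(f"[net {stamp}] {message}")

    def flush(self):
        if not self.pending:
            return
        # Mode "a" appends; mode "w" would truncate the file on every open.
        with open(self.path, "a", encoding="utf-8") as fh:
            fh.write("\n".join(self.pending) + "\n")
        self.pending.clear()


log_path = os.path.join(tempfile.mkdtemp(), "demo.log")
logger = BufferedLog(log_path)
logger.log("--- CPU detection ---")
logger.flush()
logger.log("second entry")
logger.flush()

with open(log_path, encoding="utf-8") as fh:
    lines = fh.read().splitlines()
print(len(lines))
```

Because the file is opened in append mode, the second flush adds to the first entry instead of replacing it, which is the same effect as passing True to the StreamWriter constructor.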

Running Multiple Processes

The code I'm about to post works fine; however, I need to be able to kick off multiple processes at the same time.
To give some background: the listbox contains files that will be run through another process to create PDF files (essentially passing arguments to the other process, which is the exe listed in StartInfo.FileName). Currently, if the listbox contains 10 files, each file is processed separately before the next one starts. I'd like to kick off all 10 files at the same time instead of waiting. Some files may take longer than others, so I'm wasting time waiting for each file to finish.
Suggestions?
Dim UPSFiles = (From i In ListBoxUPSFiles.Items).ToArray()
For Each Item In UPSFiles
Dim UPSFiles2 = Item.ToString
Using psinfo As New Process
psinfo.StartInfo.FileName = "\\dgrvdp1\ClientServices\APPS\Printtrack\HeliosPNetExecuter\HeliosPNetExecuter.exe "
psinfo.StartInfo.Arguments = Arg2 + Arg3 + Arg4 + (Chr(34) + DATA_PATH + "\" + UPSFiles2 + Chr(34) + " ") + Arg6 + Arg7 + Arg8 + Arg9a + Arg10 + Arg11 + Arg13
psinfo.StartInfo.WindowStyle = ProcessWindowStyle.Hidden
psinfo.Start()
'psinfo.WaitForExit()
End Using
Next
EDIT
Here's my current code, based on the Parallel.ForEach suggestion. It appears to sort of work, but it submitted 10x the number of files I need to run. In my case I have two files to process, but as I mentioned, the code started 10x the number of processes I actually need.
Dim SequentialFiles = (From i In ListBoxSequentialFiles.Items).ToString
For Each Item In SequentialFiles
Dim SequentialFiles2 = Item.ToString
Parallel.ForEach(SequentialFiles2, Sub(processFiles)
Using psinfo As New Process
psinfo.StartInfo.FileName = "\\dgrvdp1\ClientServices\APPS\Printtrack\HeliosPNetExecuter\HeliosPNetExecuter.exe "
psinfo.StartInfo.Arguments = Arg2 + Arg3 + Arg4 + (Chr(34) + DATA_PATH + "\" + SequentialFiles2 + Chr(34) + " ") + Arg6 + Arg7 + Arg8 + Arg9c + Arg10 + Arg11 + Arg12
psinfo.StartInfo.WindowStyle = ProcessWindowStyle.Normal
psinfo.Start()
psinfo.WaitForExit()
End Using
End Sub)
Next
It depends on how the PDF processor works. You can avoid creating threads and simply launch 10 processes, feeding one file to each process. There is no need for multithreading, at this stage at least.
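The "launch everything, then wait" pattern this answer describes can be sketched as follows (in Python, as a neutral illustration). The child commands are placeholders that simply exit with a code; in the question they would be the converter executable with one file's arguments each.

```python
import subprocess
import sys


def run_all(commands):
    """Start every command first, then wait for all of them to finish."""
    procs = [subprocess.Popen(cmd) for cmd in commands]  # returns immediately
    return [p.wait() for p in procs]                     # exit codes, in order


# Placeholder jobs: each child just exits with a distinct code.
jobs = [[sys.executable, "-c", f"import sys; sys.exit({n})"] for n in range(3)]
exit_codes = run_all(jobs)
print(exit_codes)
```

The key point is that starting a process does not block, so one loop that starts a process per file already runs them concurrently; waiting is done afterwards, once, over the whole list.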

SharpPcap Encoding.UTF8.GetBytes

Does anyone know the right way to get the actual text in these bytes?
I am doing something wrong here.
And another question: is UTF-8 the most generic encoding, one that will show most of the characters correctly?
TY
private void device_OnPacketArrival(object sender, SharpPcap.CaptureEventArgs e)
{
string str = string.Empty;
var time = e.Packet.Timeval.Date;
var len = e.Packet.Data.Length;
str = "time.Hour: " + time.Hour + " time.Minute: " + time.Minute + " time.Second: " + time.Second + " time.Millisecond: " + time.Millisecond + "len: " + len;
str += Environment.NewLine + e.Packet.ToString();
str += Environment.NewLine + " Message: " + BitConverter.ToString(e.Packet.Data);
//str += e.Packet.Data + Environment.NewLine + Environment.NewLine;
Packet p = Packet.ParsePacket(e.Packet);
str += e.Packet.Data + Environment.NewLine + Environment.NewLine;
byte[] utf8Bytes = Encoding.Convert(Encoding.Unicode, Encoding.UTF8, e.Packet.Data);
str += Encoding.UTF8.GetBytes(utf8Bytes.ToString()).ToString();
//txtOutput.Text += "time.Hour: " + time.Hour + "time.Minute: " + time.Minute + "time.Second: " + time.Second + "time.Millisecond:" + time.Millisecond + "len:" + len;
//txtOutput.Text += e.Packet.ToString();
//txtOutput.Text += Environment.NewLine;
WriteToFile(str,null);
// WriteToFile("",c);
Packets contain binary data and not textual data.
Parts of a packet can contain text, but you should only try to translate those parts to text (not the entire packet data), and you need to know which text encoding they use.
There is no "generic" encoding. UTF-8 is more generic than ASCII in the sense that all ASCII text decodes correctly as UTF-8, but in general you need to know the encoding of your data.
What you're looking for is Encoding.UTF8.GetString() if the data is in fact UTF-8.
It all depends on what the Data field contains. If it's another protocol payload, you'll need to parse it either using a SharpPcap/Packet.Net parser or, if the specific parser doesn't exist yet then you'll need to look up the protocol spec and build your own parser from that.
All incoming packets are nothing but a big byte array of meaningless data until they can be parsed. Sometimes it's easy to write a parser, sometimes it can take many weeks (depending on the protocol's complexity or what tools already exist to parse the specific protocol). SharpPcap/Packet.Net is a pretty extensive library for parsing packet data, but it is far from covering all of the commonly known and used protocols that exist.
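The difference between decoding bytes and merely stringifying a byte array, which is what the answers above are pointing at, can be sketched as follows (in Python, as a neutral illustration; the payload is made up). Decoding corresponds to Encoding.UTF8.GetString() in .NET, while calling ToString() on a byte[] only yields the type name.

```python
payload = "GET / HTTP/1.1\r\n".encode("utf-8")   # stand-in for a text payload

# Decoding: bytes -> text. This is what Encoding.UTF8.GetString() does in .NET.
text = payload.decode("utf-8")

# Converting the bytes object itself to a string only gives its repr,
# much like calling ToString() on a byte[] yields "System.Byte[]".
not_text = str(payload)

# Arbitrary binary data is usually not valid UTF-8; errors="replace"
# substitutes U+FFFD for undecodable bytes instead of raising.
binary = bytes([0xFF, 0xFE, 0x41])
lossy = binary.decode("utf-8", errors="replace")

print(repr(text), not_text, repr(lossy))
```

The last case is why decoding a whole packet rarely gives useful output: headers and binary fields turn into replacement characters, so only the part of the payload known to be text should be decoded.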