VB.NET, .NET 4
Hello all,
I have a List(Of Byte) that is filled with bytes from the serial buffer on a SerialPort.DataReceived event. I then try to parse the data. Part of the parsing process involves deleting elements of the List(Of Byte). Should I be concerned about the List being modified by a DataReceived event that might be raised during the parsing process? I realize that probably depends on what I'm trying to do, but, assuming I should be concerned (e.g., the parsing process needs List.Count to not change until parsing is finished), how should I go about making sure any Add calls wait until the parser is done? I guess the answer is something like SyncLock, but I've never really understood how SyncLock works. Any basic help would be appreciated!
Thanks in advance,
Brian
Well, it's not the greatest use of CPU cycles: removing bytes from a List(Of Byte) is an O(n) operation, which makes the overall processing step O(n^2). It is still quite difficult to put any kind of pressure on the CPU doing so, since serial ports are glacially slow. You should only ever modify working code if you have measured it to be a perf problem.
If you're not there yet then consider creating a new array or List from the old one. That's O(n), the extra storage cannot hurt considering the slow data rates. The code should be cleaner too.
As far as threading goes, be sure to do the parsing in the DataReceived handler. That's thread-safe and avoids putting undue pressure on the UI thread in case you invoke.
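If the parsing does have to run on a different thread than the receive handler, a lock is the usual tool: whoever holds it works on the buffer, and the other side waits. Below is a minimal sketch of that pattern, written in Python for brevity (its `with lock:` block plays the role of VB.NET's SyncLock ... End SyncLock). The class name `FrameBuffer`, the method names, and the newline frame delimiter are assumptions made for illustration, not part of the original code. The parser also builds a new buffer from the leftover bytes instead of removing elements one at a time.

```python
import threading


class FrameBuffer:
    """Byte buffer shared between a receive callback and a parser (hypothetical)."""

    def __init__(self, delimiter=b"\n"):
        self._lock = threading.Lock()
        self._data = bytearray()
        self._delimiter = delimiter  # assumed frame terminator

    def on_data_received(self, chunk):
        # Called from the receive thread; waits while the parser holds the lock.
        with self._lock:
            self._data.extend(chunk)

    def take_frames(self):
        # Called by the parser: split off all complete frames in one pass and
        # keep only the unfinished tail for the next call.
        with self._lock:
            *frames, tail = bytes(self._data).split(self._delimiter)
            self._data = bytearray(tail)
        return frames
```

The lock is held only while the buffer is copied and swapped, so the receive side is never blocked for the duration of the actual parsing.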
First of all, thanks in advance for your help.
I've decided to ask for help in forums like this one because after several months of hard working, I couldn't find a solution for my problem.
This can be described as: 'Why isn't an object created in VB.NET released by the GC when it is disposed, even when the GC is forced to run?'
Please consider the following piece of code. Obviously my project is much more complex, but I was able to isolate the problem:
Imports System.Data.Odbc
Imports System.Threading
Module Module1
    Sub Main()
        'Declarations-------------------------------------------------
        Dim connex As OdbcConnection 'Connection to the DB
        Dim db_Str As String 'ODBC connection String
        'Sentences----------------------------------------------------
        db_Str = "My ODBC connection String to my MySQL database"
        While True
            'Condition: Infinite loop.
            connex = New OdbcConnection(db_Str)
            connex.Open()
            connex.Close()
            'Release created objects
            connex.Dispose()
            'Force the GC to be launched
            GC.Collect()
            'Send the application to sleep half a second
            System.Threading.Thread.Sleep(500)
        End While
    End Sub
End Module
This simulates a multithreaded application making connections to a MySQL database. As you can see, the connection is created as a new object, then released. Finally, the GC was forced to be launched. I've seen this algorithm in several forums but also in the MSDN online help, so as far as I am concerned, I am not doing anything wrong.
The problem begins when the application is launched. The object created is disposed within the code, but after a while, the available memory is exhausted and the application crashes.
Of course, this problem is hard to see in this little version, but in the real project, the application runs out of memory very quickly (due to the number of connections made over time) and as a result, the uptime is only two days. Then I need to restart the application again.
I installed a memory profiler on my machine (Scitech .Net Memory profiler 4.5, downloadable trial version here). There is a section called 'Investigate memory leaks'. I was absolutely astonished when I saw this on the 'Real Time' tab. If I am correct, this graphic is telling me that none of the objects created on the code have been actually released:
The surprise was even bigger when I saw this other screen. According to this, all undisposed objects are System.Transactions type, which I assume are internally managed within the .Net libraries as I am not creating any object of this type on my code. Does it mean there is a bug on the VB.net Standard libraries???:
Please notice that in my code, I am not executing any query. If I do, the ODBCDataReader object won't be released either, even if I call the .Close() method (surprisingly enough, the number of unreleased objects of this type is exactly the same as the unreleased objects of type System.Transactions)
Another important thing is the statement GC.Collect(). This is used by the memory profiler to refresh the information to be displayed. If you remove it from the code, the profiler won't update the real time diagram properly, giving you the false impression that everything is correct.
Finally, if you omit the connex.Open() statement, screenshot #1 will render a flat line (that means all the objects created have been successfully released), but unfortunately, we can't make any query against the database if the connection hasn't been opened.
Can someone find a logical explanation to this and also, a workaround for effectively releasing the objects?
Thank you all folks.
Nico
Dispose has nothing to do with garbage collection. Garbage collection is exclusively about managed resources (memory). Dispose has no bearing on memory at all, and is only relevant for unmanaged resources (database connections, file handles, GDI resources, sockets... anything not memory). The only relationship between the two has to do with how an object is finalized, because many objects are implemented such that disposing them will suppress finalization and finalizing them will call .Dispose(). Explicitly calling .Dispose() on an object will never cause it to be collected (see footnote 1).
Explicitly calling the garbage collector is almost always a bad idea. .Net uses a generational garbage collector, and so the main effect of calling it yourself is that you'll hold onto memory longer, because by forcing the collection earlier you're likely to check the items before they are eligible for collection at all, which sends them into a higher-order generation that is collected less often. These items otherwise would have stayed in the lower generation and been eligible for collection when the GC next ran on its own. You may need to use GC.Collect() now for the profiler, but you should try to remove it for your production code.
You mention your app runs for two days before crashing, and are not profiling (or showing results for) your actual production code, so I also think the profiler is in part misleading you here. You've pared down the code to something that produced a memory leak, but I'm not sure it's the memory leak you are seeing in production. This is partly because of the difference in time to reproduce the error, but it's also "instinct". I mention that because some of what I'm going to suggest might not make sense immediately in light of your profiler results. That out of the way, I don't know for sure what is going on with your lost memory, but I can make a few guesses.
The first guess is that your real code has a try/catch block. An exception is thrown... perhaps not on every connection, but sometimes. When that happens, the catch block allows your program to keep running, but you skipped over the connex.Dispose() line, and therefore leave open connections hanging around. These connections will eventually create a denial of service situation for the database, which can manifest itself in a number of ways. The correction here is to make sure you always use a finally block for anything you .Dispose(). This is true whether or not you currently have a try/catch block, and it's important enough that I would say the code you've posted so far is fundamentally wrong: you need a try/finally. There is a shortcut for this, via a using block.
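To make the try/finally point concrete, here is a minimal sketch in Python (chosen for brevity; in VB.NET the shortcut mentioned above is a Using ... End Using block). `FakeConnection` is a hypothetical stand-in for a real connection, with a counter that makes leaked connections visible. Both functions release the connection even when the query throws.

```python
from contextlib import closing


class FakeConnection:
    """Hypothetical stand-in for a database connection."""

    open_count = 0  # connections opened but not yet closed

    def __init__(self):
        FakeConnection.open_count += 1

    def query(self, fail=False):
        if fail:
            raise RuntimeError("simulated query failure")
        return "ok"

    def close(self):
        FakeConnection.open_count -= 1


def query_with_finally(fail=False):
    conn = FakeConnection()
    try:
        return conn.query(fail)
    finally:
        # Runs on success and on exception alike, so nothing is left open.
        conn.close()


def query_with_block(fail=False):
    # closing() calls conn.close() when the block exits, however it exits.
    with closing(FakeConnection()) as conn:
        return conn.query(fail)
```

The second form is the closer analogue of a using block: the cleanup is tied to the block itself, so it cannot be skipped by a later edit that adds an early return or a new exception path.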
The next guess is that some of your real commands end up fairly large, possibly with large strings or image (byte[]) data involved. In this case, items end up on a special garbage collector generation called the Large Object Heap (LOH). The LOH is rarely collected, and almost never compacted. Think of compaction as analogous to what happens when you defrag a hard drive. If you have items going to the LOH, you can end up in a situation where the physical memory itself is freed (collected), but the address space within your process (normally limited to 2GB in a 32-bit process) is not freed (compacted). You have holes in your memory address space that will not be reclaimed. The physical RAM is available to your system for other processes, but over time this still results in the same kind of OutOfMemory exception you're seeing. Most of the time this doesn't matter: most .Net programs are short-lived user-facing apps, or ASP.Net apps where the entire thread can be torn down after a page is served. Since you're building something like a service that should run for days, you have to be more careful. The fix may involve significantly re-working some code, to avoid creating the large objects at all. That may mean re-using a single byte array or a small set of byte arrays over and over, or using streaming techniques instead of string concatenation or string builders for very large SQL queries or SQL query data. It may also mean you find this easier to do as a scheduled task that runs daily and shuts itself down at the end of the day, or a program that is invoked on demand.
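The buffer re-use idea can be sketched as follows. This is only an illustration of the pattern, written in Python; Python's memory manager is not the .NET LOH, and the function name `checksum_stream` and the 64 KB chunk size are assumptions. The point is that one buffer is allocated up front and refilled on every pass, instead of a fresh large object being created per read.

```python
import io


def checksum_stream(stream, chunk_size=64 * 1024):
    """Sum all bytes of a binary stream while allocating only one buffer."""
    buffer = bytearray(chunk_size)   # allocated once, reused on every pass
    view = memoryview(buffer)
    total = 0
    while True:
        n = stream.readinto(buffer)  # fills the existing buffer in place
        if not n:                    # 0 (or None) means no more data
            break
        total += sum(view[:n])       # only the first n bytes are valid
    return total
```

For example, `checksum_stream(io.BytesIO(data))` processes `data` of any size with a working set of one chunk.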
A final guess is that something you are doing results in your connection objects still being in some way reachable by your program. Event handlers are a common source of mistakes of this sort, though I would find it strange to have event handlers on your connections, especially as this is not part of your example.
Footnote 1: I suppose I could contrive a scenario that would make this happen. A simple way would be to build an object that assumes a global collection for all objects of that type... the objects add themselves to the collection at construction and remove themselves at disposal. In this way, the object could not be collected before disposal, because before that point it would still be reachable... but that would be a very flawed program design.
Thank you all guys for your very helpful answers.
Joel, you're right. This code produces 'a leak' which is not necessarily the same as 'the leak' I have in my real project, though they show the same symptoms: the number of unreleased objects keeps growing (and will eventually exhaust the memory) in the code above. So I wonder what's wrong with it, as everything seems to be properly coded. I don't understand why the objects are not disposed/collected. According to the profiler, they are still in memory and will eventually prevent new objects from being created.
One of your guesses about my 'real' project hit the nail on the head. I've realized that my 'catch' blocks didn't call for object disposal, and this has now been fixed. Thanks for your valuable suggestion. However, I implemented the 'using' clause in my example above and it didn't actually fix the problem.
Hans, you are also right. After posting the question, I've changed the libraries on the code above to make connections to MySQL.
The old libraries (in the example):
System.Data.Odbc
The new libraries:
System.Data
Microsoft.Data.Odbc
With the new ones, the profiler rendered a flat line without any further changes to the code, which is what I had been looking for. So my conclusion is the same as yours: there may be some internal error in the old libraries that causes this, which makes them a real 'troublemaker'.
Now I remember that I originally used the new ones on my project (System.Data and Microsoft.Data.Odbc) but soon changed to the old ones (System.Data.Odbc) because the new ones don't allow Multiple Active Result Sets (MARS). My application makes a huge number of queries against the MySQL database, but unfortunately the number of connections is limited. So I initially implemented my real code in such a way that it made only a few connections, which were shared across the code (passing the connection between functions as a parameter). This was great because, for example, I needed to retrieve a recordset (let's say clients) and make a lot of checks at the same time (e.g. the client has at least one invoice, the client has a duplicated email address, etc., which involves a lot of side queries). With the 'old' libraries, the same connection could create multiple commands and execute different queries.
The 'new' libraries don't allow MARS. I can only create one command (that is, to execute a query) per session/connection. If I need to execute another one, I need to close the previous recordset (which isn't actually possible as I am iterating over it), and then to make the new query.
I had to find a balance between both problems. So I ended up using the 'new' libraries because of the memory problems, and I recoded my application to not share connections (so each procedure creates a new one when needed), and reduced the number of connections the application can open at the same time so as not to exhaust the connection pool.
The solution is far from ideal as it introduces spurious logic into the application (the ideal scenario would be to migrate to SQL Server), but it is giving me better results and the application is more stable, at least in the early stages of the new version.
Thanks again for your suggestions; I hope you find mine useful too.
Cheers.
Nico
When you have a function that accepts an array as an argument and calls another function with that array, which calls another function with it, and so forth, the stack will contain many copies of the pointer to that array. I just thought of an interesting way to alleviate this problem but I'm wondering whether or not it is worth implementing.
Does anyone have any idea how often stacks contain duplicate pointers in practice?
EDIT
Just to clarify, I am not optimizing a given program but, rather, am considering writing a new kind of optimization pass for my VM. My benchmarks have indicated that my current solution causes up to 70% of the total running time to be spent in stack manipulations. The optimization pass I am thinking of would generate code at compile time that would perform the same actions but pointers would (potentially) be duplicated on the stack less often. I am interested in any prior studies that have measured the number of duplicates on the stack because this would help me to quantify my optimization's potential. For example, if it is known that real programs do not push pointers already on the stack in practice then my optimization is worthless.
Moreover, these stack manipulations are due to the code generated by my VM making sure locally-held pointers are visible to the garbage collector and not due only to function parameters as both answerers have currently assumed. And they are actually operations on a shadow stack rather than the main stack.
First of all, the answer will depend on your application.
Secondly, even with high duplication, I doubt there is much sense in implementing the mechanism you describe, or even that it is possible in a general case. If you call a method and you pass it parameters, you must do it either one way or another.
There may be advantages to doing it in some specific way - for example there are several function calling conventions and many C/C++ compilers (e.g. gcc) let you choose between passing parameters on the stack or via registers. In certain cases, the latter may be faster - you can try and benchmark if it helps your application.
But in a general case, the cost of detecting duplicated values on the stack and "reusing" them would probably far exceed any gains from having a smaller stack. The code for pushing and popping values is really simple (just a few CPU instructions in an optimized case); the code for finding and reusing duplicates, hardly so. You would also have to somehow store the information about which values are already on the stack and how to find them, which needs a nontrivial data structure. Except for some really weird cases, I don't think this would be smaller than the actual copied data itself.
What you could do is rewrite your algorithm in such a way that some function calls are eliminated. For example, if your function's result only depends on the input arguments, you could cache or memoize the results, thus avoiding repeated calls with the same values. This may indeed bring some gains, though it's usually a memory vs CPU time tradeoff. Getting an advantage both in memory and in CPU time is rarely possible. Also, rewriting your algorithm is not really "avoiding duplication of data on the stack".
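As a small illustration of the memoization idea, here is a sketch in Python using the standard `functools.lru_cache` decorator. The Fibonacci function is only a stand-in for any function whose result depends solely on its arguments; it is not taken from the question.

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # unbounded cache: trades memory for CPU time
def fib(n):
    """Naive recursive Fibonacci; repeated sub-calls are served from the cache."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)
```

Without the cache, `fib(30)` makes over a million calls; with it, each distinct argument is computed once, which is the memory-for-time tradeoff described above.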
Anyway, for the original question, I think the idea is not viable and you should look at optimizations elsewhere.
PS: Your use case may somewhat resemble tail-call optimization, so perhaps that's a direction worth looking at, but if you implement it yourself, I would also consider this to fall into the "change your algorithm" category. Maybe changing from a recursive algorithm to an iterative one could also help.
Can I suggest getting some exposure to actual performance tuning?
(Here's my canonical example.)
Between the time a program starts and the time it ends, of the cycles it uses, it obviously uses 100% of those cycles.
If it goes in and out of functions, and passes pointers to an array, but does nothing else, then there's no surprise that a high percent of time goes into function entry and exit, and passing arguments.
If a program P is written to do task T, there are a multitude of other programs P' which could also do task T. Some of them take fewer cycles than all the others, and those are the optimal ones.
The way the optimal ones differ from the non-optimal ones is that the non-optimal ones are doing things that can be done without.
So, to optimize any program, find out what cycles are being spent that don't have to be, and get rid of those activities. That link shows in great detail how I do it.
Trying to pass fewer arguments to functions might or might not be necessary, depending on what your diagnostics tell you.
I have a reasonable number of records in an Azure Table that I'm attempting to do some one time data encryption on. I thought that I could speed things up by using a Parallel.ForEach. Also because there are more than 1K records and I don't want to mess around with continuation tokens myself I'm using a CloudTableQuery to get my enumerator.
My problem is that some of my records have been double encrypted and I realised that I'm not sure how thread safe the enumerator returned by CloudTableQuery.Execute() is. Has anyone else out there had any experience with this combination?
I would be willing to bet that Execute returning a thread-safe IEnumerator implementation is highly unlikely. That said, this sounds like yet another case for the producer-consumer pattern.
In your specific scenario I would have the original thread that called Execute read the results off sequentially and stuff them into a BlockingCollection<T>. Before you start doing that though, you want to start a separate Task that will control the consumption of those items using Parallel.ForEach. Now, you will probably also want to look into using the GetConsumingPartitioner method of the ParallelExtensions library in order to be most efficient, since the default partitioner will create more overhead than you want in this case. You can read more about this from this blog post.
An added bonus of using BlockingCollection<T> over a raw ConcurrentQueue<T> is that it offers the ability to set bounds, which can help block the producer from adding more items to the collection than the consumers can keep up with. You will of course need to do some performance testing to find the sweet spot for your application.
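The shape of that producer-consumer arrangement can be sketched as follows, in Python rather than .NET: `queue.Queue(maxsize=...)` stands in for a bounded BlockingCollection<T>, and plain worker threads stand in for Parallel.ForEach. The function name `process_all` and its parameters are hypothetical. Only the producing thread ever touches the enumerator, and each item is handed to exactly one worker.

```python
import queue
import threading


def process_all(items, handle, workers=4, bound=100):
    """Feed items from one producer thread to several consumer threads."""
    q = queue.Queue(maxsize=bound)  # put() blocks when consumers fall behind
    done = object()                 # sentinel telling a worker to stop
    results = []
    results_lock = threading.Lock()

    def consume():
        while True:
            item = q.get()
            if item is done:
                break
            value = handle(item)
            with results_lock:
                results.append(value)

    threads = [threading.Thread(target=consume) for _ in range(workers)]
    for t in threads:
        t.start()
    for item in items:              # only this thread iterates the source
        q.put(item)
    for _ in threads:               # one sentinel per worker
        q.put(done)
    for t in threads:
        t.join()
    return results
```

This sketch does no error handling: if `handle` raises, that worker stops, so real code would want to catch and report failures inside `consume`.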
Despite my best efforts I've been unable to replicate my original problem. My conclusion is therefore that it is perfectly OK to use Parallel.ForEach loops with CloudTableQuery.Execute().
I'm using VB.Net, and I have a set of data which I have to be able to filter through fairly quickly. Basically, the program is like Google Suggest, but instead of a drop-down menu, I'm using a listbox. When a user enters a word, I compare the word using LINQ and keep the entries that contain the user's input. The data are all strings of variable length (from 0 to 200 characters, most around the 150-character mark), and I have 240,000+ of these strings and counting, all stored in an XML file.
A colleague of mine told me that loading all of that to memory (using VB.Net's XML serializer plus collections of string/objects) is not practical, and would slow the 'startup' time of the program. I haven't finished building the program yet and I'm having second thoughts about continuing this path.
So, my question is: Should I continue with my current approach on the problem (which is load everything to memory on startup), or is there a better way of solving my dilemma?
If you want to avoid the startup delay and keeping the data in memory isn't a performance issue, then load it asynchronously. That said, loading 240,000+ strings from XML and keeping them in memory doesn't sound like the greatest idea. A database would probably be the better approach, or at least a format like JSON that's faster to parse.
Depends on a number of things:
If
    ((you know the strings will not hugely increase in number) &&
     (you know the spec of the machines that will run your app) &&
     (you are able to test that the load time is *good enough* on the above spec))
{
    **don't bother changing approach.**
}
else
{
    **change approach.**
}
The alternative approach is obviously some kind of asynch lazy-load.
You're talking about loading roughly 36MB of strings (240,000 strings at around 150 characters each). While this isn't a daunting amount by any means (though you could probably load it faster reading the XML yourself... I wouldn't go with the serialization engine if I was worried about performance), it's also a non-trivial amount. You're looking at adding a couple of seconds to your startup time, assuming you don't do it asynchronously as Mircea suggests.
If you do do it asynchronously, you'll have to ensure that any UI process that relies on the data doesn't occur until after it has loaded. That may be a difficult thing to ensure.
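One way to keep the UI from using the data before it has loaded is to put a "ready" flag in front of it. The sketch below is in Python for brevity (in VB.NET a BackgroundWorker with its completion event would play a similar role); the class name `SuggestionData` and the `loader` callable are hypothetical. Searching before the load finishes returns None, which the UI can render as a "loading..." state.

```python
import threading


class SuggestionData:
    """Loads a list of strings in the background and gates access to it."""

    def __init__(self, loader):
        self._ready = threading.Event()
        self._items = []
        worker = threading.Thread(target=self._load, args=(loader,), daemon=True)
        worker.start()

    def _load(self, loader):
        self._items = loader()  # the slow part, e.g. parsing the XML file
        self._ready.set()       # publish only after the data is complete

    def wait_until_loaded(self, timeout=None):
        # Returns True once loading has finished, False on timeout.
        return self._ready.wait(timeout)

    def search(self, text):
        # Never blocks: None means "still loading".
        if not self._ready.is_set():
            return None
        return [s for s in self._items if text in s]
```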
The question seems to imply an online application. A few suggestions if that is the case:
The data could / should be zipped. I suspect it would compress very nicely.
Maybe the data could be cached across multiple sessions, possibly delivered as HTML content with an expiry cache date as appropriate. This would save systematic loading, and may be feasible if the data isn't updated frequently.
The suggestion feature could be initially disabled (i.e. say showing a "loading..." message while the application initializes the cache, asynchronously). In this fashion the application would be quickly available upon startup, even though the suggest feature may lag by up to say 30 seconds or so.
Edit: Independently of how the data gets downloaded and cached, I second the opinion of Mircea Grelus that an xml file of this size is a poor substitute for a database.
It may not be a bad idea to load the XML into memory when the app starts up. But if you go this route I'd look into using the BackgroundWorker thread. The idea would be to load the XML into memory asynchronously so the UI is still responsive as this is going on. As far as the user is concerned the app shouldn't appear to start any slower, and yet once done the Google-suggest-like feature should be significantly faster.
I must say that even in memory this is an inherently inefficient operation since you have no advantage of using an index when querying an XML file in this way. This is something that would be 10X faster in SQL with full-text searching.
Of course XML has the advantage of being self-contained and requiring no additional components. And that makes it a decent choice for small desktop apps that query small amounts of data. Otherwise I would consider using a database for better performance.
You might be better served by using binary serialization rather than XML serialization to persist the data that your app reads on startup, particularly if you end up implementing a data structure that's faster to search than a StringCollection. You'd still maintain the XML version of the data somewhere, of course.
And by all means, use a BackgroundWorker to load the data asynchronously if that'll make your application feel more responsive.
I need to optimize code to get room for some new code. I do not have the space for all the changes. I can not use code bank switching (80c31 with 64k).
You haven't really given a lot to go on here, but there are two main levels of optimizations you can consider:
Micro-Optimizations:
eg. XOR A instead of MOV A,0
Adam has covered some of these nicely earlier.
Macro-Optimizations:
Look at the structure of your program, the data structures and algorithms used, the tasks performed, and think VERY hard about how these could be rearranged or even removed. Are there whole chunks of code that actually aren't used? Is your code full of debug output statements that the user never sees? Are there functions specific to a single customer that you could leave out of a general release?
To get a good handle on that, you'll need to work out WHERE your memory is being used up. The Linker map is a good place to start with this. Macro-optimizations are where the BIG wins can be made.
As an aside, you could - seriously- try rewriting parts of your code with a good optimizing C compiler. You may be amazed at how tight the code can be. A true assembler hotshot may be able to improve on it, but it can easily be better than most coders. I used the IAR one about 20 years ago, and it blew my socks off.
With assembly language, you'll have to optimize by hand. Here are a few techniques:
Note: IANA8051P (I am not an 8051 programmer but I have done lots of assembly on other 8 bit chips).
Go through the code looking for any duplicated bits, no matter how small and make them functions.
Learn some of the more unusual instructions and see if you can use them to optimize, e.g. a nice trick is to use XOR A to clear the accumulator instead of MOV A,0; it saves a byte. (On the 8051 itself, the equivalent is the one-byte CLR A in place of the two-byte MOV A,#0.)
Another neat trick is if you call a function before returning, just jump to it, e.g. instead of:
    CALL otherfunc
    RET
Just do:
    JMP otherfunc
Always make sure you are doing relative jumps and branches wherever possible, they use less memory than absolute jumps.
That's all I can think of off the top of my head for the moment.
Sorry I am coming to this late, but I once had exactly the same problem, and it became a repeated problem that kept coming back to me. In my case the project was a telephone, on an 8051 family processor, and I had totally maxed out the ROM (code) memory. It kept coming back to me because management kept requesting new features, so each new feature became a two step process. 1) Optimize old stuff to make room 2) Implement the new feature, using up the room I just made.
There are two approaches to optimization: tactical and strategic. Tactical optimizations save a few bytes at a time with a micro-optimization idea. I think you need strategic optimizations, which involve a more radical rethinking of how you are doing things.
Something I remember worked for me and could work for you:
Look at the essence of what your code has to do and try to distill out some really strong, flexible primitive operations. Then rebuild your top level code so that it does nothing low level at all except call on the primitives. Ideally use a table based approach, where your table contains stuff like: input state, event, output state, primitives.... In other words when an event happens, look up a cell in the table for that event in the current state. That cell tells you what new state to change to (optionally) and what primitive(s) (if any) to execute. You might need multiple sets of states/events/tables/primitives for different layers/subsystems.
One of the many benefits of this approach is that you can think of it as building a custom language for your particular problem, in which you can very efficiently (i.e. with minimal extra code) create new functionality simply by modifying the table.
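A toy version of that table-driven structure is sketched below in Python, purely to show the shape; on the 8051 the table would be a block of bytes in code memory walked by a small dispatcher routine. The telephone states, events, and primitive names are all made up for illustration and do not come from the original project.

```python
# Primitives: small reusable operations. Here they only record that they ran.
def start_ringer(log): log.append("start_ringer")
def stop_ringer(log): log.append("stop_ringer")
def connect_audio(log): log.append("connect_audio")
def disconnect_audio(log): log.append("disconnect_audio")

# (current state, event) -> (next state, primitives to execute)
TABLE = {
    ("idle", "incoming_call"): ("ringing", [start_ringer]),
    ("ringing", "off_hook"): ("talking", [stop_ringer, connect_audio]),
    ("ringing", "caller_hangs_up"): ("idle", [stop_ringer]),
    ("talking", "on_hook"): ("idle", [disconnect_audio]),
}


def dispatch(state, event, log):
    """Look up the cell for this event in the current state and act on it."""
    # Events with no table entry leave the state unchanged and run nothing.
    next_state, primitives = TABLE.get((state, event), (state, []))
    for primitive in primitives:
        primitive(log)
    return next_state
```

Adding a feature then means adding rows to `TABLE` (and occasionally a new primitive) rather than writing new control flow, which is where the space saving comes from.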
Sorry I am months late and you probably didn't have time to do something this radical anyway. For all I know you were already using a similar approach! But my answer might help someone else someday, who knows.
In the whacked-out department, you could also consider compressing part of your code and only keeping the part that is actively used decompressed at any particular point in time. I have a hard time believing that the code required for the compress/decompress system would be a small enough portion of the tiny memory of the 8051 to make this worthwhile, but it has worked wonders on slightly larger systems.
Yet another approach is to turn to a byte-code format or the kind of table-driven code that some state machine tools output -- having a machine understand what your app is doing and generating a completely incomprehensible implementation can be a great way to save room :)
Finally, if the code is indeed compiled in C, I would suggest compiling with a range of different options to see what happens. Also, I wrote a piece on compact C coding for the ESC back in 2001 that is still pretty current. See that text for other tricks for small machines.
1) Where possible save your variables in Idata not in xdata
2) Look at your Jmp statements – make use of SJmp and AJmp
I assume you know it won't fit because you wrote/compiled and got the "out of memory" error. :) It appears the answers address your question pretty accurately, short of giving code examples.
I would, however, recommend a few additional thoughts;
1) Make sure all the code is really being used -- code coverage test? An unused sub is a big win. This is a tough step; if you're the original author, it may be easier (well, maybe) :)
2) Check the level of "verification" and initialization. Sometimes we have a tendency to be overzealous in ensuring we have initialized variables/memory, and rightly so: how many times have we been bitten by it? Not saying don't initialize (duh), but if we're doing a memory move, the destination doesn't need to be zeroed first. This dovetails with item 1.
3) Evaluate the new features: can an existing sub be enhanced to cover both functions, or perhaps an existing feature replaced?
4) Break up big code if a piece of the big code can save creating a new little code.
or perhaps there's an argument for hardware version 2.0 on the table now ... :)
regards
Besides the already mentioned (more or less) obvious optimizations, here is a really weird (and almost impossible to achieve) one: code reuse. And by code reuse I don't mean the normal kind, but a) reusing your code as data or b) reusing your code as other code. Maybe you can create a LUT (or whatever static data) that can be represented by the asm hex opcodes (here you have to consider Harvard vs. von Neumann architecture).
The other would reuse code by giving it a different meaning when you enter it at a different address. Here is an example to make clear what I mean. If the bytes for your code look like this: AABCCCDDEEFFGGHH at address X, where each run of the same letter stands for one instruction, imagine you now jump to X+1. Maybe you get completely different functionality, where the bytes, now separated by spaces, form the new opcodes: ABC CCD DE EF FG GH H.
But beware: this is not only tricky to achieve (maybe it's impossible), but it's also a horror to maintain. So if you are not a demo coder (or something similarly exotic), I would recommend using the other ways already mentioned to save memory.