My .NET 3.5 applications cause the IIS worker process to keep consuming memory without ever releasing it, until the applications start throwing memory-related errors and I have to recycle the worker process. I've also noticed that the connection to the Oracle DB server doesn't close; it stays open until I recycle the IIS worker process (as far as I can tell, I'm closing the Oracle connections properly). From what I've read in similar posts, the GC is supposed to clean up unused memory and allow it to be reallocated, but that is clearly not happening here (I'm observing the same problem on both the remote host and localhost). I'm going to assume this isn't related to IIS settings but rather that I'm not doing proper housekeeping in my code; what things should I be looking at? Thanks.
Here is my code related to querying the Oracle DB:
Using conn As New OracleConnection(oradb)
    Try
        cmd.Connection = conn
        daData = New OracleDataAdapter(cmd)
        cbData = New OracleCommandBuilder(daData)
        dtData = New DataTable()
        dtDADPLIs = New DataTable()
        conn.Open()
        ' First query: load the whole table into dtData
        cmd.CommandText = "SELECT * FROM TABLE"
        daData.Fill(dtData)
        ' Second query: load the whole table into dtDADPLIs
        cmd.CommandText = "SELECT * FROM TABLE2"
        daData.Fill(dtDADPLIs)
        QueryName = "SD_TIER_REPORT"
        WriteQueryLog(QueryName)
    Catch ex As OracleException
        'MessageBox.Show(ex.Message.ToString())
    Finally
        conn.Close()
        conn.Dispose()
    End Try
End Using
Once I ran into the same issue and I bumped into this article and this one.
I exchanged a few emails with the author (Paul Wilson), and he helped me understand the problem with large objects: they are allocated in a separate "Large Object Heap", which never gets compacted.
This is what he told me:
Larger objects are indeed allocated separately, where large is
something around 60-90 KB or larger (I don't remember exactly, and its
not officially documented anyhow). So if your byte arrays, and other
objects for that matter, are larger than that threshold then they will
be allocated separately. When does the large object heap get
collected? You may have ran into statements about there being several
generations of normal memory allocation (0, 1, and 2 in the current
frameworks) -- well the large object heap is basically considered to
be generation 2 automatically. That means that it will not be
collected until there isn't enough memory left after collecting gen 0
and gen 1 -- so basically it only happens on a full GC collection. So
to answer your question -- there is no way to make sure objects in the
large object heap get collected any sooner. The problem is that I'm
talking about garbage collection, which assumes that your objects
(large objects in this case) are no longer referenced anywhere and
thus available to be collected. If they are still referenced
somewhere, then it simply doesn't matter how much the GC runs -- your
memory usage is simply going to go up and up. So do you have all
references gone? It may seem you do, and you might be right -- all I
can tell you is that its very easy to be wrong, and its a terrible
amount of work with memory profilers and no shortcuts to prove it one
way or the other. I can tell you that if a manual GC.Collect reliably
does reduce your memory usage, then you've obviously got your objects
de-referenced -- else a GC.Collect wouldn't help. So the question may
simply be what makes you think you are having a memory problem? There
may be no reason for a GC to collect memory if you have plenty
available on a big server system!
Another article which is worth reading is this.
Solution?
Fetch only data you need
Avoid using datasets when possible and choose a datareader.
UPDATE:
If you're using a reporting tool like MS ReportViewer, see if you can bind your report to a "business object" instead of a dataset.
Related
How may I create and read a packet in VB.NET?
I desire to create an application that sends an object of some sort, and then have the client de-serialize that object, and perhaps establish a 2-way communication where the client sends a piece of info and the server replies with an apt object for it.
Check out ProtoBuf-Net. Fast, small, robust, somewhat easy (sparse docs) and free. Lots of info here on SO and at this link. It will serialize something to a file or memory stream in less than 10 lines of code (plus some Class/Property attributes) and output something much, MUCH smaller than the .NET binary serializer. The basic code is simple:
Try
    ' Using closes and disposes the stream even if Serialize throws
    Using fs As New FileStream(mUserFile, FileMode.Create, FileAccess.Write)
        Serializer.Serialize(fs, _Profiles)
    End Using
Catch ex As Exception
    MessageBox.Show("PBN Error", MsgTitle, MessageBoxButtons.OK, _
                    MessageBoxIcon.Exclamation)
End Try
In this case, a collection of 5 or 6 List(Of T) items was serialized (i.e. nested), but it could just as easily have been a class. Loading/deserializing is just as easy.
There might be a way around it which I never found, but when I tried something like what you describe, the .NET binary serializer would only deserialize into the same assembly-class-culture type which created it. This is good for making the output proprietary to your project, very bad for data exchange. Output was also gigantic (serializing an empty dictionary in .NET results in 3000 bytes, while PBN needed 300). The ONLY place that the .NET serializer is a little better suited is when the assembly is obfuscated; MS knows how to get the data and is not sharing with the rest of the class. Even then, it only adds a few steps to the process.
PBN works with all the collection types like List(Of T), Dictionary etc. but won't natively handle things like Rectangle, Point and Size. It is not hard to write a converter to feed it something that will work (I wrote one for Bitmap yesterday).
The biggest downside to VB developers is that all the docs, examples and talk/help are from/for C#. That not only makes some VB people's eyes glaze over, but makes it look like it is a C#-specific solution. Likewise, the info (wire types, packets etc) makes it sound like a network data exchange solution. In reality, it will work just as well with VB for a variety of situations.
First of all, thanks in advance for your help.
I've decided to ask for help in forums like this one because after several months of hard working, I couldn't find a solution for my problem.
This can be described as "Why isn't an object created in VB.NET released by the GC when it is disposed, even when the GC is forced to run?"
Please consider the following piece of code. Obviously my project is much more complex, but I was able to isolate the problem:
Imports System.Data.Odbc
Imports System.Threading
Module Module1
    Sub Main()
        'Declarations-------------------------------------------------
        Dim connex As OdbcConnection 'Connection to the DB
        Dim db_Str As String 'ODBC connection String
        'Sentences----------------------------------------------------
        db_Str = "My ODBC connection String to my MySQL database"
        While True
            'Condition: Infinite loop.
            connex = New OdbcConnection(db_Str)
            connex.Open()
            connex.Close()
            'Release created objects
            connex.Dispose()
            'Force the GC to be launched
            GC.Collect()
            'Send the application to sleep half a second
            System.Threading.Thread.Sleep(500)
        End While
    End Sub
End Module
This simulates a multithreaded application making connections to a MySQL database. As you can see, the connection is created as a new object, then released. Finally, the GC was forced to be launched. I've seen this algorithm in several forums but also in the MSDN online help, so as far as I am concerned, I am not doing anything wrong.
The problem begins when the application is launched. The object created is disposed within the code, but after a while, the available memory is exhausted and the application crashes.
Of course, this problem is hard to see in this little version, but on the real project, the application runs out of memory very quickly (due to the amount of connections made over the time) and as result, the uptime is only two days. Then I need to restart the application again.
I installed a memory profiler on my machine (Scitech .Net Memory profiler 4.5, downloadable trial version here). There is a section called 'Investigate memory leaks'. I was absolutely astonished when I saw this on the 'Real Time' tab. If I am correct, this graphic is telling me that none of the objects created on the code have been actually released:
The surprise was even bigger when I saw this other screen. According to this, all undisposed objects are System.Transactions type, which I assume are internally managed within the .Net libraries as I am not creating any object of this type on my code. Does it mean there is a bug on the VB.net Standard libraries???:
Please notice that in my code, I am not executing any query. If I do, the ODBCDataReader object won't be released either, even if I call the .Close() method (surprisingly enough, the number of unreleased objects of this type is exactly the same as the unreleased objects of type System.Transactions)
Another important thing is the statement GC.Collect(). This is used by the memory profiler to refresh the information to be displayed. If you remove it from the code, the profiler won't update the real-time diagram properly, giving you the false impression that everything is correct.
Finally, if you omit the connex.Open() statement, screenshot #1 will render a flat line (meaning all the objects created have been successfully released), but unfortunately we can't run any query against the database if the connection hasn't been opened.
Can someone find a logical explanation to this and also, a workaround for effectively releasing the objects?
Thank you all folks.
Nico
Dispose has nothing to do with garbage collection. Garbage collection is exclusively about managed resources (memory). Dispose has no bearing on memory at all, and is only relevant for unmanaged resources (database connections, file handles, GDI resources, sockets... anything that is not memory). The only relationship between the two has to do with how an object is finalized, because many objects are implemented such that disposing them will suppress finalization and finalizing them will call .Dispose(). Explicitly calling Dispose() on an object will never cause it to be collected (see footnote 1).
Explicitly calling the garbage collector is almost always a bad idea. .Net uses a generational garbage collector, and so the main effect of calling it yourself is that you'll hold onto memory longer, because by forcing the collection earlier you're likely to check the items before they are eligible for collection at all, which sends them into a higher-order generation that is collected less often. These items otherwise would have stayed in the lower generation and been eligible for collection when the GC next ran on its own. You may need to use GC.Collect() now for the profiler, but you should try to remove it for your production code.
You mention your app runs for two days before crashing, and are not profiling (or showing results for) your actual production code, so I also think the profiler is in part misleading you here. You've pared down the code to something that produced a memory leak, but I'm not sure it's the memory leak you are seeing in production. This is partly because of the difference in time to reproduce the error, but it's also "instinct". I mention that because some of what I'm going to suggest might not make sense immediately in light of your profiler results. That out of the way, I don't know for sure what is going on with your lost memory, but I can make a few guesses.
The first guess is that your real code has a try/catch block. An exception is thrown... perhaps not on every connection, but sometimes. When that happens, the catch block allows your program to keep running, but you skipped over the connex.Dispose() line, and therefore leave open connections hanging around. These connections will eventually create a denial of service situation for the database, which can manifest itself in a number of ways. The correction here is to make sure you always use a finally block for anything you .Dispose(). This is true whether or not you currently have a try/catch block, and it's important enough that I would say the code you've posted so far is fundamentally wrong: you need a try/finally. There is a shortcut for this, via a Using block.
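The try/finally point is not specific to .NET. As a language-neutral sketch (Python, with the standard-library sqlite3 module standing in for the ODBC driver; the helper names run_query and is_closed and the table name are made up for illustration), the cleanup runs whether the query succeeds or throws:

```python
import sqlite3
from contextlib import closing

def is_closed(conn):
    """Return True if the sqlite3 connection has already been closed."""
    try:
        conn.execute("SELECT 1")
        return False
    except sqlite3.ProgrammingError:
        return True

def run_query(conn, sql):
    # closing() plays the role of VB's Using block: conn.close() is called
    # when the block exits, whether normally or through an exception.
    with closing(conn):
        return conn.execute(sql).fetchall()

good = sqlite3.connect(":memory:")
rows = run_query(good, "SELECT 1")

bad = sqlite3.connect(":memory:")
try:
    run_query(bad, "SELECT * FROM no_such_table")  # raises OperationalError
except sqlite3.OperationalError:
    pass  # the caller may swallow the error; the connection is closed anyway

print(is_closed(good), is_closed(bad))
```

Both connections end up closed, including the one whose query failed. That is the behaviour a catch block on its own does not give you.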
The next guess is that some of your real commands end up fairly large, possibly with large strings or image (byte[]) data involved. In this case, items end up on a special garbage collector generation called the Large Object Heap (LOH). The LOH is rarely collected, and almost never compacted. Think of compaction as analogous to what happens when you defrag a hard drive. If you have items going to the LOH, you can end up in a situation where the physical memory itself is freed (collected), but the address space within your process (normally limited to 2GB in a 32-bit process) is not freed (compacted). You have holes in your memory address space that will not be reclaimed. The physical RAM is available to your system for other processes, but over time this still results in the same kind of OutOfMemory exception you're seeing. Most of the time this doesn't matter: most .Net programs are short-lived user-facing apps, or ASP.Net apps where the entire thread can be torn down after a page is served. Since you're building something like a service that should run for days, you have to be more careful. The fix may involve significantly re-working some code, to avoid creating the large objects at all. That may mean re-using a single byte array or a small set of byte arrays over and over, or using streaming techniques instead of string concatenation or string builders for very large SQL queries or SQL query data. It may also mean you find this easier to do as a scheduled task that runs daily and shuts itself down at the end of the day, or a program that is invoked on demand.
A final guess is that something you are doing results in your connection objects still being in some way reachable by your program. Event handlers are a common source of mistakes of this sort, though I would find it strange to have event handlers on your connections, especially as this is not part of your example.
1 I suppose I could contrive a scenario that would make this happen. A simple way would be to build an object that assumes a global collection for all objects of that type... the objects add themselves to the collection at construction and remove themselves at disposal. In this way, the object could not be collected before disposal, because before that point it would still be reachable... but that would be a very flawed program design.
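The contrived design in the footnote is easy to reproduce in any garbage-collected runtime. A minimal sketch in Python rather than .NET (the class name Tracked and its registry are hypothetical, and CPython's collector differs in detail from the CLR's) shows an object staying reachable until it is disposed:

```python
import gc
import weakref

class Tracked:
    """Hypothetical class that registers every instance in a global set."""
    registry = set()

    def __init__(self):
        Tracked.registry.add(self)      # strong reference held globally

    def dispose(self):
        Tracked.registry.discard(self)  # drop the global reference

obj = Tracked()
probe = weakref.ref(obj)   # observes the object without keeping it alive
del obj                    # the program drops its own reference
gc.collect()
alive_before_dispose = probe() is not None   # still reachable via the registry

probe().dispose()
gc.collect()
alive_after_dispose = probe() is not None    # nothing references it any more

print(alive_before_dispose, alive_after_dispose)
```

The same mechanism is behind the event-handler case mentioned above: as long as some long-lived collection or delegate list still points at the object, no amount of collecting will free it.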
Thank you all guys for your very helpful answers.
Joel, you're right. This code produces 'a leak' which is not necessarily the same as 'the leak' I have in my real project, though they show the same symptoms: the number of unreleased objects keeps growing (and will eventually exhaust the memory) in the code above. So I wonder what's wrong with it, as everything seems to be properly coded. I don't understand why the objects are not disposed/collected. But according to the profiler, they are still in memory and will eventually prevent new objects from being created.
One of your guesses about my 'real' project hit the nail on the head. I've realized that my 'catch' blocks didn't dispose of the objects, and this has now been fixed. Thanks for your valuable suggestion. However, I added the 'Using' clause to the code in my example above and it didn't actually fix the problem.
Hans, you are also right. After posting the question, I've changed the libraries on the code above to make connections to MySQL.
The old libraries (in the example):
System.Data.Odbc
The new libraries:
System.Data
Microsoft.Data.Odbc
With the new ones, the profiler rendered a flat line, without any further changes to the code, which is what I had been looking for. So my conclusion is the same as yours: there may be some internal error in the old ones that causes this, which makes them a real 'troublemaker'.
Now I remember that I originally used the new ones in my project (System.Data and Microsoft.Data.Odbc), but I soon switched to the old one (System.Data.Odbc) because the new ones don't allow Multiple Active Result Sets (MARS). My application makes a huge number of queries against the MySQL database, but unfortunately the number of connections is limited. So I initially implemented my real code in such a way that it made only a few connections, which were shared across the code (passing the connection between functions as a parameter). This was great because, for example, I needed to retrieve a recordset (let's say clients) and make a lot of checks at the same time (the client has at least one invoice, the client has a duplicated email address, etc., which involves a lot of side queries). With the 'old' libraries, the same connection allowed me to create multiple commands and execute different queries.
The 'new' libraries don't allow MARS. I can only create one command (that is, execute one query) per session/connection. If I need to execute another one, I need to close the previous recordset (which isn't actually possible, as I am iterating over it) and then make the new query.
I had to find a balance between both problems. So I ended up using the 'new' libraries because of the memory problems, and I recoded my application to not share connections (so each procedure creates a new one when needed), as well as reducing the number of connections the application can open at the same time so as not to exhaust the connection pool.
The solution is far from ideal, as it introduces spurious logic into the application (the ideal scenario would be to migrate to SQL Server), but it is giving me better results and the application is more stable, at least in the early stages of the new version.
Thanks again for your suggestions; I hope you will find mine useful too.
Cheers.
Nico
I currently use a singleton to access my database (see related question), but now that I'm trying to add some background processing, everything falls apart. I read the SQLite docs and found that SQLite can work thread-safely, but each thread must have its own db connection. I tried EGODatabase, which promises a thread-safe SQLite wrapper, but it is very buggy, so I returned to my old FMDB library and started looking at how to use it in a multi-threaded way.
Because all my code is built around the singleton, changing everything would be expensive (and a lot of opening and closing of connections could become slow), so I wonder whether, as the SQLite docs hint, building a pool of connections would help. If so, how do I build it? How do I know which connection to get from the pool (given that 2 threads can't share a connection)?
I wonder if somebody has already used SQLite in a multi-threaded setup with NSOperation or similar; my searching only returns "yeah, it's possible" but leaves the details to my imagination...
You should look at using thread-local variables to hold the connection; if the variable is empty (i.e., holding something like a NULL) you know you can safely open a connection at that point to serve the thread and store the connection back in the variable. Don't know how to do this with Obj-C though.
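The thread-local idea can be sketched without Objective-C. The example below is an illustration in Python using the standard-library threading.local and sqlite3 modules; the names get_connection, worker and DB_PATH are made up here, and an Objective-C version would need its own per-thread storage mechanism:

```python
import os
import sqlite3
import tempfile
import threading

DB_PATH = os.path.join(tempfile.mkdtemp(), "demo.db")  # hypothetical database file
_local = threading.local()   # each thread sees its own attributes on this object

def get_connection():
    """Return this thread's connection, opening it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:                     # empty slot: safe to open a new one
        conn = sqlite3.connect(DB_PATH)
        _local.conn = conn
    return conn

seen = {}

def worker(name):
    first = get_connection()
    second = get_connection()            # same thread, so the same connection
    first.execute("SELECT 1").fetchall()
    seen[name] = (first, first is second)
    first.close()                        # a real app would close on thread exit

threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Callers keep using a single accessor, much like the existing singleton, but the accessor hands each thread its own connection, so no pool lookup logic is needed.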
Also be aware that SQLite is not tuned for concurrent writes. Writer locks are expensive, so keep any time in a writing transaction (i.e., one that includes an INSERT, UPDATE or DELETE) to a minimum in all threads. Transaction commits are expensive too.
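One way to act on both points is to prepare the data before taking the write lock and to commit once per batch instead of once per row. A minimal sketch, again in Python with the standard-library sqlite3 module (the log table and its contents are invented for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (id INTEGER PRIMARY KEY, msg TEXT)")
conn.commit()

# Build the rows first, outside any transaction ...
rows = [("event %d" % i,) for i in range(1000)]

# ... then hold the write lock only for the inserts themselves and
# commit once for the whole batch rather than once per row.
with conn:  # commits on success, rolls back if an exception escapes
    conn.executemany("INSERT INTO log (msg) VALUES (?)", rows)

count = conn.execute("SELECT COUNT(*) FROM log").fetchone()[0]
print(count)
```

How large a batch should be is a trade-off: bigger batches mean fewer commits but a longer period during which other writers are blocked.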
Does anyone know how DbDataReaders actually work? We can use SqlDataReader as an example.
When you do the following
cmd.CommandText = "SELECT * FROM Customers";
var rdr = cmd.ExecuteReader();
while(rdr.Read())
{
//Do something
}
Does the data reader have all of the rows in memory, or does it just grab one, and then when Read is called, does it go to the db and grab the next one? It seems just bringing one into memory would be bad performance, but bringing all of them would make it take a while on the call to ExecuteReader.
I know I'm the consumer of the object and it doesn't really matter how they implement it, but I'm just curious, and I think that I would probably spend a couple hours in Reflector to get an idea of what it's doing, so thought I'd ask someone that might know.
I'm just curious if anyone has an idea.
As stated here :
Using the DataReader can increase
application performance both by
retrieving data as soon as it is
available, and (by default) storing
only one row at a time in memory,
reducing system overhead.
And as far as I know that's the way every reader works in the .NET framework.
Rhapsody is correct.
Results are returned as the query
executes, and are stored in the
network buffer on the client until you
request them using the Read method of
the DataReader
I ran a test using DataReaders vs DataAdapters on equal 10,000-record data sets, and I found that the DataAdapter was consistently 3-4 milliseconds faster than the DataReader, but the DataAdapter will end up holding onto more memory.
When I ran the same test on equal 50,000-record data sets, the DataReader came out ahead by about 50 milliseconds.
With that said, if you had a long running query or a huge result set, I think you may be better off with a DataReader since you get your results sooner and don't have to hold onto all of that data in memory. It is also important to keep in mind that a DataReader is forward only, so if you need to move around in your results set, then it is not the best choice.
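The memory trade-off between the two styles can be illustrated outside ADO.NET. The sketch below uses Python's standard-library sqlite3 module as a stand-in (the customers table is invented, and SqlDataReader's network buffering is not modelled): fetchall() materialises every row like a filled DataTable, while iterating the cursor pulls rows one at a time, forward only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customers (name) VALUES (?)",
                 [("customer %d" % i,) for i in range(5000)])

# Adapter-style: the whole result set is held in memory as one list.
all_rows = conn.execute("SELECT id, name FROM customers").fetchall()

# Reader-style: rows are fetched one at a time as the loop advances;
# nothing is retained unless the loop body stores it.
longest = 0
for row_id, name in conn.execute("SELECT id, name FROM customers"):
    longest = max(longest, len(name))

print(len(all_rows), longest)
```

The second loop can compute its result over an arbitrarily large table with roughly constant memory, but it cannot go back to an earlier row without re-running the query.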
Hi, I have a question about my server performance. I have a Classic ASP CMS hosting ~250 websites, and for each website we build a Classic ASP dictionary using
set dict = CreateObject("Scripting.Dictionary")
dict.add "test1","Value Test1"
dict.add "test2","Value Test2"
dict.add "test3","Value Test3"
That dictionary is then loaded on every page for every user.
Let's say we have about ~150,000 users visiting those websites monthly, each page load building a dictionary of about ~100k.
Should I use Application variables as the dictionary instead of loading my dictionary every time?
And is it really going to improve my server performance?
Loading a dictionary for every ASP request is definitely a bad idea: it hurts your performance and also fragments your virtual memory.
Using an array instead still has much the same problem, each request would need to allocate all the memory needed to hold it and it still needs populating on each request.
The simple answer is yes: use the Application object as the dictionary. This will cost you much less in memory and CPU. The downside is that your keys might collide with existing Application object usage, so you may need to prefix your keys to avoid this problem.
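The load-once pattern with prefixed keys looks roughly like this. It is only a sketch in Python, with a plain dict standing in for the ASP Application object and a lock standing in for Application.Lock / Application.Unlock; the function names and the "site<id>:" key prefix are made up for illustration:

```python
import threading

application = {}            # stand-in for the shared ASP Application object
_lock = threading.Lock()    # stand-in for Application.Lock / Application.Unlock
load_count = 0              # counts how often the expensive load actually runs

SITE_VALUES = (("test1", "Value Test1"),
               ("test2", "Value Test2"),
               ("test3", "Value Test3"))

def ensure_loaded(site_id):
    """Populate the shared store once per site instead of once per request."""
    global load_count
    marker = "site%d:loaded" % site_id
    with _lock:
        if marker not in application:
            load_count += 1
            for key, value in SITE_VALUES:
                # The per-site prefix keeps keys from colliding with other uses.
                application["site%d:%s" % (site_id, key)] = value
            application[marker] = True

def lookup(site_id, key):
    ensure_loaded(site_id)
    return application["site%d:%s" % (site_id, key)]

for _ in range(100):        # simulate 100 page requests for the same site
    value = lookup(7, "test2")

print(load_count, value)
```

Every request after the first pays only for a key lookup, which is where the memory and CPU savings come from.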
I'd absolutely suggest loading the dictionary only once, as the Dictionary object is heavy in terms of memory, slow in terms of lookup and, the big one, isn't always destroyed in memory when you think it should be. Thus even after a user has left the page, this object can still linger in memory waiting to be disposed of (even if you explicitly "destroy" it). Now multiply that by the number of page hits per visit per user...
An alternative and more memory-light method would be to use an array -- one-dimensional if you can keep track of the index somewhere (best), or two-dimensional with a lookup function if you need to (certainly if others are maintaining the code now or in the future).
I'm pretty sure that instantiating a single Scripting.Dictionary on each page shouldn't be a problem on any website. If performance is an issue, I suggest profiling your page first to see where the problem is. Chances are there is an unoptimised query somewhere taking 100+ ms to finish.
We run a Classic ASP site that handles 200k pageviews a day and use Scripting.Dictionary extensively on every page (25+ instances). We use it as a base for all kinds of things. Do you have any example script to show that the dictionaries aren't always destroyed by the garbage collector? Or that its lookups are slow compared to any alternative? The only inconvenience we encountered is the lack of a 'clone' method.