I am using django to build an app that is slightly heavy on functionality. Consequently my templates are a bit heavy (nested loops, if conditions etc). I am noticing that 70% to 80% of time is spent in render_to_response step of my views.
I haven't found anything indicating performance issues around template engine in django on google. Anyone here faced similar issues / any suggestions to solve this problem?
One thing to keep in mind is that QuerySet objects are lazy. They won't actually hit the database until you a perform an operation that requires the query to be executed (looping, counting, etc).
If you pass a QuerySet object in your template context, often a loop (or some other operation) inside your template will be what triggers the database call. When this happens, the template rendering is "penalized" for the database I/O, but the overall response time should be unaffected by whether it happens in the view or in the template rendering.
Related
This seems to be a pretty common problem: I load an NHibernate object that has a lazily loaded collection.
At some later point, I access the collection to do something.
I still have the nhibernate session open (as it's managed per view or whatever) so it does actually work but the transaction is closed so in NHprof I get 'use of implicit transactions is discouraged'.
I understand this message and since I'm using a unit of work implementation, I can fix it simply by creating a new transaction and wrapping the call to the lazy loaded collection within it.
My problem is that this doesn't feel right...
I have this great NHibernate framework that gives me nice lazy loading but I can't use it without wrapping every property access in a transaction.
I've googled this a lot, read plenty of blog posts, questions on SO, etc, but can't seem to find a complete solution.
This is what I've considered:
Turn off lazy loading. I think this is silly, it's like getting a full on sports car and then only ever driving it in eco mode. Eager loading everything would hurt performance and if I just had ids instead of references then why bother with Nhibernate at all?
Keep the transaction open longer. Transactions should not be long lived and keeping one open as long as a view is open would just be asking for trouble.
Wrap every lazy load property access in a transaction. Works but is bloaty and error prone. (i.e. if I forget to wrap an accessor then it will still work fine. Only using NHProf will tell me the problem)
Always load all the data for the properties I might need when I load the initial object. Again, this is error prone, both with loading data that you don't need (because the later call to access it has been removed at some point) or with not loading data that you do
So is there a better way?
Any help/thoughts appreciated.
I has had the same feelings when I first encountered this warning in NHProf. In web applications I think the most popular way is to have opened transaction (and unit of work) for the whole duration of request. For desktop applications managing transactions (as well as sessions) may be painful. You can use automatic transaction management frameworks (e.g. Castle) and declare with attributes service methods that should be run within transaction. With this approach you can wrap multiple operations into single transaction denending on your requirements. Also, I was using session-per-view approach with one opened session per view and manual transaction management (in this case I just ignored profiler warnings about implicit transactions).
As for your considerations: I strongly don't recommend 2) and 3). 1) and 4) are points to consider. But the general advice is: think, then try different approaches and find a solution that suits better for your particular situation.
We've hooked up the ISaveOrUpdateEventListener event and hoped we could tie it to a progress bar update for each node being visited during the save traversal of a pretty big model, BUT the event only fires once when the save operations starts (only on the node on which the Save( ) was inititated and not on any subnodes).
Are there any other events that are more appropriate to listen to for this?
We've also tried breaking up the save operation (of a hierarchical model) by doing the traversal ourselves, but that seems to degrade the performance even further.
Perhaps we're trying to solve a problem for which FNH wasn't aimed to be used. We're new to it.
We've also set up an alternative solution using SqlBulkCopy, as recommended elsewhere.
We've seen the comments that FNH is primarily supposed for smaller transactions (OLTP) and not the type of exhaustive model we're bound to by our problem (signal processing of huge data volumes).
Background:
We're trying to use Fluent NHibernate on a larger database project with data gathered from fairly complex real time analysis (high frequency, multiple input signals, long experiment times etc). In a prototype we've built we see pretty scary wait times for the moment, and need to hook in some sort of reliable progress indicator.
Yes, now confirmed - as mentioned in my comment above. One (possible) solution to this is to simply turn of Cascades and do the model traversal manually and do explicit Save( ) calls.
This works, although it's not as neat as just handling an event. Still, given the genuin design of NHibernate, I bet there's certainly an event somewhere that could be intercepted - the question is just under what name. ... I bet someone on here knows more.
Also to improve performance we used a Stateless Session, experiemented with differnet batch size, and periodically/explicitly call Flush() and Clear(). See articles below for further details:
http://davybrion.com/blog/2008/10/bulk-data-operations-with-nhibernates-stateless-sessions/
http://ideas-net.blogspot.com/2009/03/nhibernate-update-performance-issue.html
Hope this helps.
Most of what I hear about NHibernate's lazy-loading, is that it's better to use it, than not to use it. It seems like it just makes sense to minimize database access, in an effort to reduce bottlenecks. But few things come without trade-offs, certainly it slightly limits design by forcing you to have virtual properties. But I've also noticed that some developers turn lazy-loading off on certain often-used objects.
This makes me wonder if there are some definite situations where data-access performance is hurt by using lazy-loading.
So I wonder, when and in what situations should I avoid lazy-loading one of my NHibernate-persisted objects?
Is the downside to lazy-loading merely in additional processing time, or can nhibernate lazy-loading also increase the data-access time (for instance, by making additional round-trips to the database)?
Thanks!
There are clear performance tradeoffs between eager and lazy loading objects from a database.
If you use eager loading, you suck a ton of data in a single query, which you can then cache. This is most common on application startup. You are trading memory consumption for database round trips.
If you use lazy loading, you suck a minimal amount of data in a single query, but any time you need more information related to that initial data it requires more queries to the database and database performance hits are very often the major performance bottleneck in most applications.
So, in general, you always want to retrieve exactly the data you will need for the entire "unit of work", no more, no less. In some cases, you may not know exactly what you need (because the user is working through a wizard or something similar) and in that case it probably makes sense to lazy load as you go.
If you are using an ORM and focused on adding features quickly and will come back and optimize performance later (which is extremely common and a good way to do things), having lazy loading being the default is the correct way to go. If you later find (through performance profiling/analysis) that you have one query to get an object and then N queries to get the N objects related to that original object, you can change that piece of code to use eager loading to only hit the database once instead of N+1 times (the N+1 problem is a well known downside of using lazy loading).
The usual tradeoff for lazy loading is that you make a smaller hit on the database up front, but you end up making more hits on it long-term.
Without lazy loading, you'll grab an entire object graph up front, sucking down a large chunk of data at once. This could, potentially, cause lag in your UI, and so it is often discouraged. However, if you have a common object graph (not just single object - otherwise it wouldn't matter!) that you know will be accessed frequently, and top to bottom, then it makes sense to pull it down at once.
As an example, if you're doing an order management system, you probably won't pull down all the lines of every order, or all the customer information, on a summary screen. Lazy loading prevents this from happening.
I can't think of a good example for not using it offhand, but I'm sure there are cases where you'd want to do a big load of an object graph, say, on application initialization, in order to avoid lags in processing further down the line.
The short version is this:
Development is simpler if you use lazy loading. You just traverse object relationships in a natural OO way, and you get what you need when you ask for it.
Performance is generally better if you figure out what you need before you ask for it, and ask for it in one trip to the database.
For the past few years we've been focusing on quick development times. Now that we have a solid app and userbase, we're optimizing our data access.
If you are using a webservice between the client and server handling the database access using nhibernate it might be problematic using lazy loading since the object will be serialized and sent over the webservice and subsequent usage of "objects" further down in the object relationship needs a new trip to the database server using additional webservices. In such an instance it might not be too good using lazy loading. A word of caution, be careful in what you fetch if you turn lazy loading of, its way to easy to not think this through and through and end up fetching almost the whole database...
I have seen many performance problems aring from wrong loading behaviour configuration in Hibernate. The situation is quite the same with NHibernate I think. My recommendation is to always use lazy relations and then use eager fetching statemetns in your query - like fetch joins - . This ensures you are not loading to much data and you can avoid to many SQL queries.
It is easy to make a lazy releation eager by a query. It is nearly impossible the other way round.
I'm using VB.Net, and I have a set of data which I have to able to filter through fairly quickly. Basically, the program is like google sugest, but instead of a drop-down menu, I'm using a listbox. When a user enters a word, I compare the word using LINQ and filter those that contain the user's input. The data are all strings of variable length (from 0 to 200 characters, most on 150 character mark), and I have 240,000+ of this strings and counting- all stored in an XML file.
A colleague of mine told me that loading all of that to memory (using VB.Net's XML serializer plus collections of string/objects) is not practical, and would slow the 'startup' time of the program. I haven't finished building the program yet and I'm having second thoughts about continuing this path.
So, my question is: Should I continue with my current approach on the problem (which is load everything to memory on startup), or is there a better way of solving my dilemma?
If you want to prevent startup time and keeping it in memory isn't an issue on performance, then load it asynchronously. Although loading 240.000+ strings from an XML and keeping it in memory doesn't sound like the greatest idea. Probably a database would be the better approach. Or at least some format like JSON that's faster to parse.
Depends on a number of things:
If
((you know the strings will not hugely increase in number) &&
(you know the spec of the machines that will run your app) &&
(you are able to test that the load time is *good enough* on the above spec))
{
**don't bother changing approach.**
}
else
{
**change approach.**
}
The alternative approach is obviously some kind of asynch lazy-load.
You're talking about loading roughly 36MB of strings. While this isn't a daunting amount by any means (though you could probably load it faster reading the XML yourself...I wouldn't go with the serialization engine if I was worried about performance), it's also a non-trivial amount. You're looking a adding a couple of seconds to your startup time, assuming you don't do it asynchronously as Mircea suggests.
If you do do it asynchronously, you'll have to ensure that any UI process that relies on the data doesn't occur until after it has loaded. That may be a difficult thing to ensure.
The question seems to imply an online application. A few suggestions if that is the case:
The data could / should be zipped. I suspect it would compress very nicely.
Maybe the data could be cached accross multiple sessions, possibly be delivered as html content with a expiry cache date as appropriate. This would save systematic loading, and may be feasible if the data isn't updated frequently.
The suggestion feature feature could be initially disabled (i.e. say showing a "loading..." message while the application initializes the cache, asynchronously). In this fashion the application would be quickly available upon startup, even though the suggest feature may lag by up to say 30 seconds or so.
Edit: Independently of how the data gets downloaded and cached, I second the opinion of Mircea Grelus that an xml file of this size is a poor substitute for a database.
It may not be a bad idea to load the XML into memory when the app starts up. But if you go this route I'd look into using the BackgroundWorker thread. The idea would be to load the XML into memory asynchronously so the UI is still responsive as this is going on. As far as the user is concerned the app shouldn't appear to start any slower, and yet once done the Google-suggest-like feature should be significantly faster.
I must say that even in memory this is an inherently inefficient operation since you have no advantage of using an index when querying an XML file in this way. This is something that would be 10X faster in SQL with full-text searching.
Of course XML has the advantage of being self-contained and requiring no additional components. And that makes it a decent choice for small desktop apps that query small amounts of data. Otherwise I would consider using a database for better performance.
You might be better served by using binary serialization rather than XML serialization to persist the data that your app reads on startup, particularly if you end up implementing a data structure that's faster to search than a `StringCollection. You'd still maintain the XML version of the data somewhere, of course.
And by all means, use a BackgroundWorker to load the data asynchronously if that'll make your application feel more responsive.
I am using NHibernate for ORM and have consolidated the loading of lots of entities into one big query.
I am actually loading a word dictionary, around 500K entries, and each word relates to others. Running the loading process in the background could be very tricky in our application, as we would have to manually load an entry that has not been loaded on time, as any word could be asked for at any time. Our only requirements are that all the data be loaded as fast as possible.
I also tried using a stateless session, but got an exception that stateless sessions can't fetch collections (for some reason, maybe it has to do with the fact there is no cache for stateless sessions?)
The problem is that although the query takes no more than 25 seconds in SQLServer, it takes well over 3 minutes for ICriteria.List().
I used NHProf to profile the loading process and found that the creation of the entities is a costly affair, which takes up most of the loading time in NHibernate.
Is there anything I could do to reduce this latency? Is the memory allocation expensive, or is it the "filling in" of the data?
Thanks!
Perhaps you should consider the fact that NHibernate (like most ORMs) is not particularly suited (or intended) for these types of bulk-loading scenarios. How many rows are you trying to load, give or take? What are you trying to do? Pre-populate a cache? Do batch-like processing?
My gut feeling is that you should seriously consider the purpose of your app and choose the underlying technologies accordingly. Perhaps you can shed some light on your intentions/requirements?
EDIT OK, from your comments I understand what it is you're trying to do here. The first thing I'd do is create a simple prototype using raw ADO.NET to load the same data, to get a feel for the best performance attainable using standard data access and in-memory collections. Next, fiddle around with different collection types to see what performs well when populating and searching. If loading data like this is still too slow, it's time to start looking at other methods of loading the data: file-based from a local data file, hydrating pre-serialized objects, some form of fast on-demand loading, etc.
Loading 500k entities into an NHibernate session is not a good idea. The session is made to be short lived and hold a relatively small number of entities.
If you want to do this kind of batch processing in NHibernate you should take a look at the StatelessSession instead of the ordinary session. Using a stateless session would most likely drastically improve performance in this scenario. However, when using a stateless session you lose the benefits of the NHibernate first level cache, such as change tracking.
More information about the StatelessSession can be found in this article and in the NH docs at nhibernate.info.
In this scenario I would also recommend that you consider using straight ADO.NET instead of NHibernate. I am not saying that you should switch you whole data access strategy to ADO.NET but you might want to consider using ADO.NET for the batch operations and using NHibernate for the other cases.
Profiling the creation process (for example with the VS performance analyser) should tell you exactly what is the costly operation. If you have played already with lazy loading tuning then I think the only good solution is to encapsulate the returned list to enable paging an return smaller chunks in a few iterations. I am not sure whether NHibernate support lazy result lists like JPA does (i.e. not loading entities from data reader until needed).