In Entity Framework, I want to use a Roslyn analyzer to find the places where lazy loading might occur on some huge tables. Any idea how to achieve this? - lazy-loading

In Entity Framework, I want to use a Roslyn analyzer to find the places where lazy loading might occur on some huge tables. Any idea how to achieve this?
I want a warning to be raised when building or running the solution if code is written in a way that will trigger lazy loading on certain huge tables.

Related

What includes EclipseLink internal optimization via weaving

I am new to EclipseLink and am getting to know it step by step. Right now I am working on performance optimizations via weaving in order to use lazy loading for ***ToOne relationships, fetch groups for partial loading of entity instances, change tracking for commit performance optimizations and internal optimizations for ... And here is my question: unfortunately, googling has not turned up a clear explanation of what performance gains this approach brings.
Could somebody explain what kind of internal optimizations EclipseLink performs via this weaving setting?
Thanks in advance,
Simeon
I'd recommend you break up your question to make it more specific on what exactly you are looking for, but I'll try to add information.
Weaving allows EclipseLink to change the bytecode of your entities to add provider-specific methods and the like, so that you do not need to introduce a dependency within your model. Each of the terms listed in the doc you found - lazy loading, fetch groups etc. - is a performance enhancement that you would need to look up individually. All can be used without weaving, but would require changes to your entity to implement EclipseLink interfaces and methods.
Lazy loading delays fetching a relationship until your application accesses it. getEmployee() in your entity, for instance, will just return the referenced employee attribute - without weaving, the employee must have been fetched already or a null will be returned incorrectly. With weaving, code can be added to the entity so that it goes to the database to fetch it on demand.
Fetch groups are a similar concept that applies to basic mappings instead of relationships, while change tracking is more advanced and allows EclipseLink to be notified when you make a change to the entity, rather than having to compare the entity with a prebuilt backup copy on commit. Each has its own section in the EclipseLink documentation.
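The on-demand fetch described above can be illustrated with a small language-neutral sketch. It is written in Python rather than Java, it does not use EclipseLink, and every name in it (`LazyReference`, `Project`, `fetch_employee`) is hypothetical. It only shows the idea that weaving automates: the getter triggers the load on first access and reuses the result afterwards.

```python
class LazyReference:
    """Holds a loader callable and runs it only on first access."""

    def __init__(self, loader):
        self._loader = loader
        self._loaded = False
        self._value = None

    def get(self):
        if not self._loaded:
            # This call stands in for the trip to the database.
            self._value = self._loader()
            self._loaded = True
        return self._value


class Project:
    """Hypothetical entity with a lazily loaded to-one relationship."""

    def __init__(self, name, employee_loader):
        self.name = name
        self._employee = LazyReference(employee_loader)

    def get_employee(self):
        # Without the lazy wrapper this getter could only return whatever
        # was already stored in the attribute, possibly None.
        return self._employee.get()


load_calls = []


def fetch_employee():
    # Stand-in for a SELECT against the employee table (assumption).
    load_calls.append("SELECT employee")
    return {"id": 7, "name": "Ada"}


project = Project("Weaving demo", fetch_employee)
calls_before_access = len(load_calls)   # nothing has been fetched yet
employee = project.get_employee()       # first access triggers the load
project.get_employee()                  # second access reuses the value
calls_after_access = len(load_calls)
```

In this sketch the entity has to be written around the wrapper by hand, which is the kind of model change that weaving is meant to spare you.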

When should one avoid using NHibernate's lazy-loading feature?

Most of what I hear about NHibernate's lazy-loading is that it's better to use it than not to use it. It seems like it just makes sense to minimize database access, in an effort to reduce bottlenecks. But few things come without trade-offs; for one, it slightly limits design by forcing you to have virtual properties. I've also noticed that some developers turn lazy-loading off on certain often-used objects.
This makes me wonder if there are some definite situations where data-access performance is hurt by using lazy-loading.
So I wonder, when and in what situations should I avoid lazy-loading one of my NHibernate-persisted objects?
Is the downside to lazy-loading merely additional processing time, or can NHibernate lazy-loading also increase the data-access time (for instance, by making additional round-trips to the database)?
Thanks!
There are clear performance tradeoffs between eager and lazy loading objects from a database.
If you use eager loading, you suck a ton of data in a single query, which you can then cache. This is most common on application startup. You are trading memory consumption for database round trips.
If you use lazy loading, you suck a minimal amount of data in a single query, but any time you need more information related to that initial data it requires more queries to the database. Database round trips are very often the major performance bottleneck in an application.
So, in general, you always want to retrieve exactly the data you will need for the entire "unit of work", no more, no less. In some cases, you may not know exactly what you need (because the user is working through a wizard or something similar) and in that case it probably makes sense to lazy load as you go.
If you are using an ORM and focused on adding features quickly and will come back and optimize performance later (which is extremely common and a good way to do things), having lazy loading being the default is the correct way to go. If you later find (through performance profiling/analysis) that you have one query to get an object and then N queries to get the N objects related to that original object, you can change that piece of code to use eager loading to only hit the database once instead of N+1 times (the N+1 problem is a well known downside of using lazy loading).
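The N+1 pattern mentioned above can be made concrete with a small sketch. It uses Python's built-in `sqlite3` module instead of NHibernate, and the `orders` / `order_lines` tables and the `run` helper are made up for illustration. It loads the same data twice, once with one query per parent row and once with a single join, and counts the queries issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE order_lines (id INTEGER PRIMARY KEY, order_id INTEGER, item TEXT);
    INSERT INTO orders VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO order_lines VALUES
        (1, 1, 'pen'), (2, 1, 'ink'), (3, 2, 'paper'), (4, 3, 'stapler');
""")

query_count = 0


def run(sql, params=()):
    """Execute a query and count it, so both strategies can be compared."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def load_lazily():
    # One query for the orders, then one more query per order: N+1 in total.
    orders = run("SELECT id, customer FROM orders ORDER BY id")
    result = {}
    for order_id, customer in orders:
        lines = run(
            "SELECT item FROM order_lines WHERE order_id = ? ORDER BY id",
            (order_id,),
        )
        result[customer] = [item for (item,) in lines]
    return result


def load_eagerly():
    # A single join fetches the orders and their lines in one round trip.
    rows = run(
        "SELECT o.customer, l.item FROM orders o "
        "LEFT JOIN order_lines l ON l.order_id = o.id "
        "ORDER BY o.id, l.id"
    )
    result = {}
    for customer, item in rows:
        result.setdefault(customer, [])
        if item is not None:
            result[customer].append(item)
    return result


query_count = 0
lazy_result = load_lazily()
lazy_queries = query_count

query_count = 0
eager_result = load_eagerly()
eager_queries = query_count
```

With three orders, the lazy version issues four queries and the join issues one, and both return the same data. The gap grows linearly with the number of parent rows, which is why this pattern tends to show up only once real data volumes are profiled.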
The usual tradeoff for lazy loading is that you make a smaller hit on the database up front, but you end up making more hits on it long-term.
Without lazy loading, you'll grab an entire object graph up front, sucking down a large chunk of data at once. This could, potentially, cause lag in your UI, and so it is often discouraged. However, if you have a common object graph (not just single object - otherwise it wouldn't matter!) that you know will be accessed frequently, and top to bottom, then it makes sense to pull it down at once.
As an example, if you're doing an order management system, you probably won't pull down all the lines of every order, or all the customer information, on a summary screen. Lazy loading prevents this from happening.
I can't think of a good example for not using it offhand, but I'm sure there are cases where you'd want to do a big load of an object graph, say, on application initialization, in order to avoid lags in processing further down the line.
The short version is this:
Development is simpler if you use lazy loading. You just traverse object relationships in a natural OO way, and you get what you need when you ask for it.
Performance is generally better if you figure out what you need before you ask for it, and ask for it in one trip to the database.
For the past few years we've been focusing on quick development times. Now that we have a solid app and userbase, we're optimizing our data access.
If you are using a web service between the client and the server that handles database access through NHibernate, lazy loading can be problematic: the object is serialized and sent over the web service, so any later use of objects further down the object graph needs a new trip to the database server through additional web service calls. In such a case lazy loading might not be a good fit. A word of caution: be careful about what you fetch if you turn lazy loading off. It is far too easy to not think this through and end up fetching almost the whole database...
I have seen many performance problems arising from wrong loading-behaviour configuration in Hibernate. The situation is much the same with NHibernate, I think. My recommendation is to always map relations as lazy and then use eager fetching statements, such as fetch joins, in your queries. This ensures you are not loading too much data and lets you avoid issuing too many SQL queries.
It is easy to make a lazy relation eager in a query. It is nearly impossible the other way round.

Preserving data integrity in Drupal:

I happened to develop a module in Drupal and, due to some apparent Views limitations, had to use custom SQL. This led to some problems with node revisions, and I came to the conclusion that in Drupal it's best to use its native methods for working with any data. Otherwise, data integrity problems may arise.
And even if you want to optimize SQL queries in Drupal, this should apparently be done only in rare cases, for real bottlenecks.
What are your experiences with this dilemma - direct SQL queries vs. Drupal modules/functions?
When updating data you should always use the Drupal default, even if you need to do other queries afterwards for custom tables etc. It is not obvious (without digging into the code) what Drupal does on various actions and if you copy the code for an action and put it in your function you will have to watch for changes in the core from then on.
One trick with Views which may help you: if Views gets you almost what you want, you can look at the query it generates, copy that, and put it in your own code. This removes the rest of the overhead of Views and can be a big performance boost.

NHibernate latency is very high

I am using NHibernate for ORM and have consolidated the loading of lots of entities into one big query.
I am actually loading a word dictionary, around 500K entries, and each word relates to others. Running the loading process in the background could be very tricky in our application, as we would have to manually load an entry that has not been loaded on time, as any word could be asked for at any time. Our only requirements are that all the data be loaded as fast as possible.
I also tried using a stateless session, but got an exception that stateless sessions can't fetch collections (for some reason, maybe it has to do with the fact there is no cache for stateless sessions?)
The problem is that although the query takes no more than 25 seconds in SQL Server, ICriteria.List() takes well over 3 minutes.
I used NHProf to profile the loading process and found that the creation of the entities is a costly affair, which takes up most of the loading time in NHibernate.
Is there anything I could do to reduce this latency? Is the memory allocation expensive, or is it the "filling in" of the data?
Thanks!
Perhaps you should consider the fact that NHibernate (like most ORMs) is not particularly suited (or intended) for these types of bulk-loading scenarios. How many rows are you trying to load, give or take? What are you trying to do? Pre-populate a cache? Do batch-like processing?
My gut feeling is that you should seriously consider the purpose of your app and choose the underlying technologies accordingly. Perhaps you can shed some light on your intentions/requirements?
EDIT: OK, from your comments I understand what it is you're trying to do here. The first thing I'd do is create a simple prototype using raw ADO.NET to load the same data, to get a feel for the best performance attainable using standard data access and in-memory collections. Next, fiddle around with different collection types to see what performs well when populating and searching. If loading data like this is still too slow, it's time to start looking at other methods of loading the data: file-based from a local data file, hydrating pre-serialized objects, some form of fast on-demand loading, etc.
Loading 500k entities into an NHibernate session is not a good idea. The session is made to be short lived and hold a relatively small number of entities.
If you want to do this kind of batch processing in NHibernate you should take a look at the StatelessSession instead of the ordinary session. Using a stateless session would most likely drastically improve performance in this scenario. However, when using a stateless session you lose the benefits of the NHibernate first level cache, such as change tracking.
More information about the StatelessSession can be found in this article and in the NH docs at nhibernate.info.
In this scenario I would also recommend that you consider using straight ADO.NET instead of NHibernate. I am not saying that you should switch your whole data access strategy to ADO.NET, but you might want to consider using ADO.NET for the batch operations and NHibernate for the other cases.
Profiling the creation process (for example with the VS performance analyser) should tell you exactly which operation is the costly one. If you have already played with lazy loading tuning, then I think the only good solution is to wrap the returned list to enable paging and return smaller chunks over a few iterations. I am not sure whether NHibernate supports lazy result lists like JPA does (i.e. not loading entities from the data reader until needed).
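The idea of returning smaller chunks over a few iterations can be sketched as follows. This is not NHibernate code: it uses Python's built-in `sqlite3` module, and the `words` table and the `iter_chunks` helper are assumptions made up for illustration. The point is only that rows are pulled from the cursor in fixed-size batches instead of being materialized in one large list.

```python
import sqlite3


def iter_chunks(conn, sql, chunk_size):
    """Yield the rows of a query in lists of at most chunk_size rows."""
    cursor = conn.execute(sql)
    while True:
        rows = cursor.fetchmany(chunk_size)
        if not rows:
            break
        yield rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (id INTEGER PRIMARY KEY, text TEXT)")
conn.executemany(
    "INSERT INTO words (text) VALUES (?)",
    [(f"word{i}",) for i in range(10)],
)

chunk_sizes = []
dictionary = {}
for chunk in iter_chunks(conn, "SELECT id, text FROM words ORDER BY id", 4):
    chunk_sizes.append(len(chunk))
    for word_id, text in chunk:
        # Each chunk can be processed (or handed to the caller) before
        # the next one is read from the cursor.
        dictionary[word_id] = text
```

Chunking bounds how much is held per step and lets the application start using early rows sooner, but it does not by itself reduce the total cost of creating the objects, so it is worth measuring against the raw-loading baseline suggested in the earlier answer.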

NHibernate or FluentNHibernate or ActiveRecord?

I am at the stage of mapping my CSharp classes to database tables. I have decided to use NHibernate as my ORM tool after comparing it with other tools. I have never done a real project with NHibernate before and am now considering alternatives for the mapping:
ActiveRecord: according to the project's web site, using ActiveRecord can boost productivity dramatically. However, I do not like the idea of adding attributes to my CSharp classes. After all, my classes should NOT have any knowledge of database relationships. Using ActiveRecord would bind my nicely separated classes to ActiveRecord, and give me a hard time if I ever want to switch the underlying DAO layer implementation in the future.
FluentNHibernate: FluentNHibernate was my first attempt when I started mapping. But I also have a few issues with this approach. 1) I don't like my mapping strategies being compiled into binary files. I would like to be able to change mappings by modifying XML files. 2) The maturity of FluentNHibernate. NHibernate has been around for a long time and has LOTS of users, so I am quite comfortable with its maturity. By contrast, FluentNHibernate is relatively young and has not been tested by as many users. Even though I could dive into the source to fix whatever issue comes up, I am not comfortable enough with my skills to touch the low-level implementation. 3) There is much less documentation available for FluentNHibernate than for NHibernate. I would like to have a place to go when I hit a hard wall.
NHibernate: Currently, I am using plain NHibernate XML to do the mapping. To be honest, working with XML gives me massive headaches. Several times a day I literally have to resist the impulse to throw away the .hbm.xml files and grab ActiveRecord or FluentNHibernate.
So, here is my dilemma: Should I go with my heart of "Just get this damn thing done!"; Or, should I follow the "Good practice guideline" to suffer the pain now and get relatively easy time later on?
Any comments?
Please note that classes related to an ORM should not necessarily be treated as "business object" classes or exposed to your UI. They should be considered part of your data layer. This pattern is not really unique to ActiveRecord. In general, you want your business layer to know as little as possible about the fact that there is an ORM beneath it, and you don't want your UI to know about your data layer. You may also want to consider DTOs.
Fluent NHibernate solves the problem of having weakly typed XML which can be error prone to refactor.
While there can be downsides of adopting something like ActiveRecord, it seems like an appropriate solution in your case.
The best reason to use .hbm.xml files is if you are going to generate them from your database (using something like CodeSmith). Hand-coding the .hbm.xml files is rarely the best option.