I'm trying to insert around 16000 records into a single table. Because this is very slow, I'd like to batch insert them. However, I'm always getting a timeout error when I attempt to do this. So I have two questions:
What is the best way to speed up inserts?
How do I increase the timeout value of the connection?
First you have to use a stateless session. Instead of calling OpenSession() on the session factory, you call OpenStatelessSession(). It has much the same API as the normal session, but there is no first-level cache or change tracking, which makes it a lot quicker for bulk data operations. Then you need to set the batch size by calling .AdoNetBatchSize([[batch size]]) where you set up the database in your configuration.
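A minimal sketch of both pieces, assuming Fluent NHibernate and SQL Server (the Record entity, connectionString and records variables are placeholders; the XML equivalent of the batch size setting is the adonet.batch_size property):

// Configure the database with an ADO.NET batch size so inserts are sent in batches
var sessionFactory = Fluently.Configure()
    .Database(MsSqlConfiguration.MsSql2008
        .ConnectionString(connectionString)
        .AdoNetBatchSize(100))
    .Mappings(m => m.FluentMappings.AddFromAssemblyOf<Record>())
    .BuildSessionFactory();

// Bulk insert through a stateless session: no first-level cache, no dirty checking
using (var session = sessionFactory.OpenStatelessSession())
using (var tx = session.BeginTransaction())
{
    foreach (var record in records)
        session.Insert(record);
    tx.Commit();
}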
This might do the trick. But you should know that this isn't really what NHibernate (or any other ORM) is built for, so don't count on getting great performance.
What is the fastest way to send batch requests to a Postgres database using Golang? Each request contains 500-200,000 rows.
Methods I know about are:
1. Using database/sql package's transaction Begin, Prepare, Commit.
2. Sending all data in one statement.
3. Sending a list of statements using sql.Exec() method.
Is there some other way to send batch requests without opening a new connection for every statement? If not, which of these is the best way?
This question is similar to the question at: Golang how do I batch sql statements with package database.sql
There is a somewhat old depesz blog post on that. His programs are Perl scripts, but if you concentrate on the SQL... Anyway, from the DB perspective, you can use COPY, or INSERT with many rows in the VALUES clause. It looks like around 20 rows per statement is a good choice, but it is worth testing in your case. If performance is the key factor, I would put around 2000-5000 rows per transaction. Also, from the DB perspective, a transaction and a session are two separate things, so you can open one session and run many transactions in it.
For PostgreSQL, starting a new session per operation is a really bad idea - the DB spawns a new process for each session. One of the answers to the question you referenced covers this. So you open a connection, and then a transaction, as it should be done.
I am currently implementing a server system which has both an SQL database and a Redis datastore. The data written to Redis heavily depends on the SQL data (cache, objects representing logic entities defined by a number of SQL models and their relationships).
While looking for an error handling methodology to wrap client requests, something similar to SQL's transaction commit & rollback (Redis doesn't support rollbacks), I thought of a mechanism which can serve this purpose and I'd appreciate input regarding it.
Basically, I intend to wrap (with before/after middleware) every client request with an SQL transaction and a Redis multi command (pipes commands until exec or discard command is invoked), and allow both transactions to occur only if the request was processed successfully.
The problem is that once you start a Redis multi command, you are not able to perform any reads/writes and actually use their values while processing a request. I reduced the problem to reads only, since depending on just-written values can usually be optimized away.
My (simple) solution: split the Redis connection into two - a writer and a reader. The writer connection will be the one initialized with the multi command and executed/discarded at the end. Of course, all writing will be performed through it, while reading is done using the reader (and executed instantly).
The downside: as opposed to SQL, you can't rely on values written during the same request (transaction). Then again, that's usually quite easy to overcome.
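A minimal sketch of that split, assuming a .NET stack with StackExchange.Redis (the library choice and the key names are my assumptions, not part of the question):

using StackExchange.Redis;

var mux = ConnectionMultiplexer.Connect("localhost");

// Reader: plain database handle, commands execute immediately
IDatabase reader = mux.GetDatabase();

// Writer: MULTI-style transaction, commands are queued until Execute() (EXEC)
ITransaction writer = mux.GetDatabase().CreateTransaction();

// Reads during request processing return real values
string name = reader.StringGet("user:42:name");

// Writes are queued; the returned tasks complete only after Execute()
writer.StringSetAsync("user:42:name", "Alice");

// At the end of the request: commit the SQL transaction, then the Redis one.
// To discard, simply never call Execute().
bool committed = writer.Execute();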
I have a data conversion and caching service running as a self-hosted WCF service.
Currently it polls the database at short, constant intervals to update its data.
I think that's unnecessary. The data can change only if one of the tables is changed, and when that happens depends on system users' actions.
There is no problem in setting up a trigger for the specific tables; however, I would need an action outside SQL Server to update my cache. My WCF service could perform the update when it receives a specific URI via HTTP. So all I need is a command in a table trigger that would send such a request. Is that even possible?
I'm thinking about a hack I used back in the day with HTTP requests: I held the HTTP response at the server until a data packet arrived from somewhere else. There was no delay between polling requests, and I achieved fully asynchronous, "real-time" updates.
Maybe this approach can be applied to SQL? I'm thinking of a query that blocks until it receives a signal. It would eventually time out, but that's good enough to try. Then - how do I signal and wait in SQL? By locking and unlocking a shared resource, like a cursor or a dummy table?
Any other options?
I need the cache update to run as rarely as possible (it's pretty expensive, so once per minute is great), but I need an immediate update when the data changes.
To answer your question, have you looked at xp_cmdshell?
https://msdn.microsoft.com/en-us/library/ms175046.aspx
However, the security/performance implications of such a decision could be non-trivial depending on your use case.
We are trying to implement retry logic to recover from transient errors in Azure environment.
We are using long-running sessions to keep track of changes and commit the whole batch at the end of the application transaction (which may span several web requests). Along the way we need to fetch additional data from the database. Our main problem is that we can't easily recover from a DB error, because we can't "replay" all user actions.
So far we have used a straightforward recovery algorithm:
Try to perform the operation in the long-running session
In case of an error, close the session, open a new one and merge the entities into it
Retry the operation
This approach is very expensive in terms of time (the merge takes a long time for big entity hierarchies), so we'd like to optimize things a little.
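For reference, a bare-bones sketch of the current retry-with-merge pattern (NHibernate; longSession, trackedEntities and operation are placeholders, not names from our code):

try
{
    operation(longSession);
}
catch (Exception)
{
    // The failed session cannot be reused; rebuild it and merge everything back in
    longSession.Dispose();
    longSession = sessionFactory.OpenSession();
    foreach (var entity in trackedEntities)
        longSession.Merge(entity);   // expensive for big entity graphs
    operation(longSession);          // retry the operation once
}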
We'd like to perform query operations in a separate session (to keep the long-running one untouched and safe) and, on success, merge the results back into the long-running session. Retry is relatively simple here - we just need to open a new session and run the query once more. However, with this approach we have an issue with initializing lazy properties/collections:
If we do this in a separate session, we need to merge the results back (a lot of entities), but the merge could fail and break the long-running session
We tried different ways of "moving" the original entity to a different session, loading the details and moving it back, but without success (Evict, Replicate, etc.)
There is a well-known statement that the session should be discarded in case of an exception. However, the example shows a write operation. Is that still true for reads? I mean, if I guarantee that no data is written back to the database, can I reuse the same session to run the query again?
Do you have any other suggestions about retry logic with long-running sessions?
IMO there's no way to solve your issue. It's gonna take a lot of time to commit everything or you will have to do a lot of work to break it up into smaller sessions and handle every error that can occur while merging.
To answer your question about using the session after an exception: you cannot trust ANYTHING anymore inside this session, not even loaded entities.
Read this paragraph from Ayende's article about building a simple todo app with a recovery plan in case of an exception in the session:
Then there is the problem of error handling. If you get an exception (such as StaleObjectStateException, because of concurrency conflict), your session and its loaded entities are toast, because with NHibernate, an exception thrown from a session moves that session into an undefined state. You can no longer use that session or any loaded entities. If you have only a single global session, it means that you probably need to restart the application, which is probably not a good idea.
I am using NHibernate to create a collection of immutable domain objects from a legacy Oracle DB. Some simple lookups using the Criteria API take over 60 seconds. Subsequent runs of the same lookup are very fast, usually less than 300ms (100ms in the DB and the rest in NHibernate; I don't have the second-level cache or query cache enabled, and I checked with NHibernate Profiler that all queries do go to the DB). However, if I leave the app idle for a couple of minutes and run the lookup again, it usually takes 50-60 seconds.
I have used NHibernate Profiler, and in every case it clearly shows that at most 100ms is spent in the database. I figure the rest of the time must be taken by NHibernate, but I can't understand why.
Some background info:
I am using dynamic-component in the mapping to map 20 columns into key/value pairs.
I am using NHibernate 2.1.
Once retrieved, the data is never modified; the mapping uses the mutable=false flag.
It's a legacy DB, so I am using a composite key in the mapping.
I am only retrieving around 50 objects in each lookup.
When I open the session I set FlushMode=Never.
I also tried a stateless session (the initial lookup is still slow).
I do not define or use any custom user types in the mapping.
I am clearly doing something wrong or have missed something. Any ideas?
I suggest downloading a C# performance profiler such as dotTrace. You will be able to quickly get a more accurate understanding of where your performance problem is. I'm pretty sure it is not an NHibernate mapping issue.
How is the lifetime of your SessionFactory being managed? Is it possible that your SessionFactory is being disposed of after some period of inactivity?
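If that turns out to be the case, the usual fix is to build the factory once and keep it for the lifetime of the application. A minimal sketch (the static holder class is just one common pattern, not something from the question):

using NHibernate;
using NHibernate.Cfg;

public static class NHibernateHelper
{
    // Built once, on first use, and kept for the whole application lifetime.
    // Building an ISessionFactory is expensive; opening sessions from it is cheap.
    private static readonly ISessionFactory Factory =
        new Configuration().Configure().BuildSessionFactory();

    public static ISession OpenSession()
    {
        return Factory.OpenSession();
    }
}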
It is most likely not an NHibernate issue.
Use the code below to measure the amount of time it takes to get your data back (DB + network latency + NHibernate execution).
Once you are positive that there is no app-related latency involved, check the database by looking at query plan caching and query result caching. The first time the query runs (a cache miss), your DB will perform time-consuming, intensive operations to generate the result set.
If 1 and 2 don't yield any useful information, check your network. Maybe some network pressure is causing heavy latency.
As mentioned by JeffreyABecker, study how your session factories get created and disposed. Find usages of ISessionFactory.Dispose() or configuration.BuildSessionFactory(). Building ISessionFactory objects is an expensive operation and, typically, you should create them on application start and dispose of them on application stop/shutdown. Even so, 60s is a plausible amount of time for ISessionFactory instantiation.
//Codez
using System.Diagnostics;

Stopwatch stopwatch = new Stopwatch();

// Begin timing
stopwatch.Start();

// NHibernate-specific stuff ONLY in here
// Depending on your setup, do a session.Flush(); if possible.

// End timing
stopwatch.Stop();

// Write the result - console/log4net/Diagnostics.Debug/etc.
Console.WriteLine("Time elapsed: {0}", stopwatch.Elapsed);