Is there any way to get the query text - i.e. plain SQL - of an NHibernate query from a DetachedCriteria object (or any NHibernate object, I just want to be pointed in the right direction) BEFORE it is sent to my server? If so, can I prevent it from executing?
I don't know if there is an easy way of doing this. There might be a listener you can use to display the sql and then abort the execution. I've never used one for this purpose.
If you simply want to debug your queries and don't want to hit your database then write some tests using an in-memory database. In my opinion this is a much better strategy.
You can observe the queries that are being generated by tailing your log files or using NHProf.
Related
What is the fastest way to send batch requests to Postgres database using Golang? Each request contains 500-200000 rows.
Methods I know about are-
1. Using database/sql package's transaction Begin, Prepare, Commit.
2. Sending all data in one statement.
3. Sending a list of statements using sql.Exec() method.
Is there some other way to send batch requests without making a connection at every statement? If not which is the best way of these?
This question is similar to question at- Golang how do I batch sql statements with package database.sql
There is a bit old depesz blog post on that. His programs are Perl scripts, but if you concentrate on SQL... Anyway - from DB perspective, you can use COPY, or INSERT with many rows in VALUES. It looks that around 20 is good choice, but it is worth to test that in your case. If performance is key factor, I would put around 2000-5000 rows per transaction. Also, from DB perspective transaction, and session are two separate things. So you can open session, and to many transactions in it.
For PostgreSQL starting new session per operation is really bad idea - DB spawns new process for each session. One of answers for the question you referred contains this. So you open connection, and then transaction, as it should be done.
I have a postgresql database and I want to send E-Mail notifications if the results of specific queries have changed.
For SQL-Server there is a C# class called SqlDependency which allows me to do this in a very easy way. I'ts possible to say: "Hey notify me if SELECT * FROM a WHERE d changes".
But I couldn't find any solution for postgresql. I've often seen NOTIFY, but as far as I understand it, it's not as powerful as this SQL-Server mechanism, because I have to build lot of triggers.
My additional problem is, that the queries can potentially be very complex :/
So: Has postgresql any mechanism for this scenario?
I recently also went into that problem,
I'm currently evaluating the listen/notify functionality of psql.
It seems the suiting thing, but you have to implement your db-observer your self.
How to fire NOTIFICATION event in front end when table data gets changed
I am creating an application that allows users to construct complex SELECT statements. The SQL that is generated cannot be trusted, and is totally arbitrary.
I need a way to execute the untrusted SQL in relative safety. My plan is to create a database user who only has SELECT privileges on the relevant schemas and tables. The untrusted SQL would be executed as that user.
What could possibility go wrong with that? :)
If we assume postgres itself does not have critical vulnerabilities, the user could do a bunch of cross joins and overload the database. That could be mitigated with a session timeout.
I feel like there is a lot more than could go wrong, but I'm having trouble coming up with a list.
EDIT:
Based on the comments/answers so far, I should note that the number of people using this tool at any given time will be very near 0.
SELECT queries can not change anything in databse. Lack of dba privileges guarantee that any global settings can not be changed. So, overload is truely the only concern.
Onerload can be result of complex queryies or too much simple queries.
Too complex queryies can be ruled out by setting statement_timeout in postgresql.conf
Receiving plenties of simple queryies can be avoided too. Firstly, you can set parallel connection limit per user (alter user with CONNECTION LIMIT). And if you have some interface program between user and postgresql, you can additionally (1) add some extra wait after each query completion, (2) introduce CAPTCHA to avoid automated DOS-attack
ADDITION: PostgreSQL public system functions give many possible attack vectors. They can be called like select pg_advisory_lock(1) and every user have privilege to call them. So, you should restrict access to them. Good option is creating whitelist of all "callable words" or, more precisely, identifiers that can be used with ( after them. And rule out all queryies that include call-like construct identifier ( with an identifier not in white list.
Things that come to mind, in addition to having the user SELECT-only and revoking privileges on functions:
Read-only transaction. When a transaction is started by BEGIN READ ONLY, or SET TRANSACTION READ ONLY as its first instruction, it cannot write anything, independantly of the user permissions.
At the client side, if you want to restrict it to one SELECT, better use a SQL submission function that does not accept several queries bundled into one. For instance, the swiss-knife PQexec method of the libpq API does accept such queries and so does every driver function that is built on top of it, like PHP's pg_query.
http://sqlfiddle.com/ is a service dedicated to running arbitrary SQL statements which may be seen somehow as a proof-of-concept that it's doable without being hacked or DDos'ed all day long.
The problem with this, is i'm not sure if the sql itself will still continue to run in the background after a session timeout (can't really find much evidence either way via google and haven't had any real experience where I've attempted it myself either). If you're limiting to just select access, i think this is about the worst that could happen though. The real issue would be what happens if you got a hundred users trying to do complex cross joins? Session timeout dropping the query or not, it'll put a real heavy load on the database (could very easily be enough to pull the database down entirely)
The only way (from my point of view) to protect yourself against DoS on main server with crafted queries is to set up a read only replica of the Postgres DB and a special limited user on this replica DB. This way the main Postgres server wont be affected by queries on replica.
Also you will get hot standby / continuous replication DB for the case, when main DB fails for some reason.
When doing a criteria query with NHibernate, I want to get fresh results and not old ones from a cache.
The process is basically:
Query persistent objects into NHibernate application.
Change database entries externally (another program, manual edit in SSMS / MSSQL etc.).
Query persistence objects (with same query code), previously loaded objects shall be refreshed from database.
Here's the code (slightly changed object names):
public IOrder GetOrderByOrderId(int orderId)
{
...
IList result;
var query =
session.CreateCriteria(typeof(Order))
.SetFetchMode("Products", FetchMode.Eager)
.SetFetchMode("Customer", FetchMode.Eager)
.SetFetchMode("OrderItems", FetchMode.Eager)
.Add(Restrictions.Eq("OrderId", orderId));
query.SetCacheMode(CacheMode.Ignore);
query.SetCacheable(false);
result = query.List();
...
}
The SetCacheMode and SetCacheable have been added by me to disable the cache. Also, the NHibernate factory is set up with config parameter UseQueryCache=false:
Cfg.SetProperty(NHibernate.Cfg.Environment.UseQueryCache, "false");
No matter what I do, including Put/Refresh cache modes, for query or session: NHibernate keeps returning me outdated objects the second time the query is called, without the externally committed changes. Info btw.: the outdated value in this case is the value of a Version column (to test if a stale object state can be detected before saving). But I need fresh query results for multiple reasons!
NHibernate even generates an SQL query, but it is never used for the values returned.
Keeping the sessions open is neccessary to do dynamic updates on dirty columns only (also no stateless sessions for solution!); I don't want to add Clear(), Evict() or such everywhere in code, especially since the query is on a lower level and doesn't remember the objects previously loaded. Pessimistic locking would kill performance (multi-user environment!)
Is there any way to force NHibernate, by configuration, to send queries directly to the DB and get fresh results, not using unwanted caching functions?
First of all: this doesn't have anything to do with second-level caching (which is what SetCacheMode and SetCacheable control). Even if it did, those control caching of the query, not caching of the returned entities.
When an object has already been loaded into the current session (also called "first-level cache" by some people, although it's not a cache but an Identity Map), querying it again from the DB using any method will never override its value.
This is by design and there are good reasons for it behaving this way.
If you need to update potentially changed values in multiple records with a query, you will have to Evict them previously.
Alternatively, you might want to read about Stateless Sessions.
Is this code running in a transaction? Or is that external process running in a transaction? If one of those two is still in a transaction, you will not see any updates.
If that is not the case, you might be able to find the problem in the log messages that NHibernate is creating. These are very informative and will always tell you exactly what it is doing.
Keeping the sessions open is neccessary to do dynamic updates on dirty columns only
This is either the problem or it will become a problem in the future. NHibernate is doing all it can to make your life better, but you are trying to do as much as possible to prevent NHibernate to do it's job properly.
If you want NHibernate to update the dirty columns only, you could look at the dynamic-update-attribute in your class mapping file.
I am using Action Filter Attributes for loging user activity on certain action which has SQL database interaction. Similarly I can log the activity in the SQL tables using triggers on tables during each activity on the tables. I would like to know which of the above two methods is a best practice ( perfomance wise )
I think that the actionfilter is certainly the cleanest and best practice appraoch since it is in the application layer. Part of the benefit of being there is its managed code and if something breaks you can easily locate the problem. There is also the benefit that all your code is in one spot too.
Database triggers are a big no no in many companies since they have a habit of causing infinite loop well an unknowing programmer creates some logic that steps on the trigger over and over again causing the database to fail. Some companies do allow triggers but very well documented and very lightly used. Hope this helps.
Performance of logging depends greatly on the system architecture. If you have 3 load balanced web servers hitting one main database, triggers would have to handle all the load while Action Filters would split the load in three. In that scenario, Action Filters would be better.
In terms of best practices, I wouldn't use either of those approaches. I would set up Transactional Replication to another SQL server. This approach would run without impacting performance at all. The transaction log is already being generated and replication would just spin up a separate process that's reading that log.