LINQ to SQL updates - sql

Does anyone have an idea how to run the following statement with LINQ?
UPDATE FileEntity SET DateDeleted = GETDATE() WHERE ID IN (1,2,3)
I've come to both love and hate LINQ, but so far there's been little that hasn't worked out well. The obvious solution which I want to avoid is to enumerate all file entities and set them manually.
foreach (var file in db.FileEntities.Where(x => ids.Contains(x.ID)))
{
file.DateDeleted = DateTime.Now;
}
db.SubmitChanges();
The problem with the above code, apart from the sizable overhead, is that each entity has a Data field which can be rather large, so for a large update a lot of data crosses the database connection for no particular reason. (The LINQ solution is to delay-load the Data property, but this wouldn't be necessary if there were some way to just update the field with LINQ to SQL.)
I'm thinking some query expression provider thing that would result in the above T-SQL...

LINQ cannot perform in-store updates: it is language integrated query, not update. Most (maybe even all) OR mappers will generate a select statement to fetch the data, modify it in memory, and perform the update using a separate update statement. Smart OR mappers will only fetch the primary key until additional data is required, but then they will usually fetch the whole rest because it would be much too expensive to fetch only a single attribute at a time.
If you really care about this optimization, use a stored procedure or a hand-written SQL statement. If you want more compact code, you can use the following.
db.FileEntities
    .Where(x => ids.Contains(x.ID))
    .ToList()   // materialize first; a lazy Select would never run the assignment
    .ForEach(x => x.DateDeleted = DateTime.Now);
db.SubmitChanges();
I don't like this because I find it less readable, but some prefer such a solution.
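As a rough sketch of the hand-written SQL route: LINQ to SQL's DataContext exposes ExecuteCommand, which takes a command string with {0}-style placeholders and sends the remaining arguments as parameters. The helper below only builds the command text, so it runs without a database. BulkUpdateSketch and BuildSoftDeleteCommand are hypothetical names, the table and column names are taken from the question, and the commented-out call assumes a live DataContext called db.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BulkUpdateSketch
{
    // Builds one UPDATE statement with a {n} placeholder per ID,
    // in the form DataContext.ExecuteCommand expects.
    public static string BuildSoftDeleteCommand(IReadOnlyList<int> ids)
    {
        if (ids == null || ids.Count == 0)
            throw new ArgumentException("At least one ID is required.", nameof(ids));

        string placeholders = string.Join(
            ", ", Enumerable.Range(0, ids.Count).Select(i => "{" + i + "}"));

        return "UPDATE FileEntity SET DateDeleted = GETDATE() WHERE ID IN ("
               + placeholders + ")";
    }

    public static void Main()
    {
        var ids = new[] { 1, 2, 3 };
        string sql = BuildSoftDeleteCommand(ids);
        Console.WriteLine(sql);

        // Assumption: db is the LINQ to SQL DataContext from the question.
        // int affected = db.ExecuteCommand(sql, ids.Cast<object>().ToArray());
    }
}
```

This issues a single UPDATE and never loads the Data column. Rows changed this way bypass change tracking, so entities already loaded in the same context will not see the new DateDeleted value until they are refreshed.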

LINQ to SQL is an ORM, like any other, and as such, it was not designed to handle bulk updates/inserts/deletes. The general idea with L2S, EF, NHibernate, LLBLGen and the rest is to handle the mapping of relational data to your object graphs for you, eliminating the need to manage a large library of stored procs which, ultimately, limit your flexibility and adaptability.
When it comes to bulk updates, those are best left to the thing that does them best: the database server. L2S and EF both provide the ability to map stored procedures to your model, which allows your stored procs to be somewhat entity-oriented. Since you're using L2S, just write a proc that takes the set of identities as input and executes the SQL statement at the beginning of your question. Drag that stored proc onto your L2S model, and call it.
It's the best solution for the problem at hand, which is a bulk update. As with reporting, object graphs and object-relational mapping are not the best solution for bulk processes.

Related

How to handle a big result set without loading it into memory

Using NHibernate and getNamedQuery, how can I handle a huge result set without loading all rows into a list?
Let's suppose I need to do some logic with each row, but I don't need the row after using it.
I don't want to use pagination, because I can't change the source (a stored procedure).
When you say you don't want to use pagination, do you consider using setMaxResults() and setFirstResult() pagination?
Query query = session.getNamedQuery("storedProc");
query.setFirstResult(fromIndex);
query.setMaxResults(pageSize);
I'm not really sure you have any other option with Hibernate. There are all sorts of ways you can incorporate partitioning to split the work, but that's only after you've loaded the data that you want to partition.
Assuming you're doing some sort of logic against a row and storing in the DB, and seeing as you're already using stored procs, your best bet may be another stored proc. Alternatively, if you want to keep your logic outside of your DB, then you should be working off of tables rather than data from a stored proc.

webapi sql performance of query comparison

In terms of speed of execution, from webapi2 all things being equal, which method would be run faster?
OPTION 1
var result = db.Database.SqlQuery<PortfolioPerfomance>("dbo.myStoredProcedure").ToList();
OPTION 2
var query = db.table1.Where(x => x.email == targetEmail).Select(c => new PortfolioPerfomance());
Basically, in option 1 I wrote the SQL stored procedure myself, and in option 2 I let the EF models handle the appropriate relations. The queries are much more complicated than this, but I am just trying to understand which approach would be faster.
thanks!!
This is an EF thing, no matter if you are using web api or not.
But for bulk updates, where you're updating massive amounts of data, a stored procedure will generally perform better than an ORM solution because you don't have to marshal the data over the wire to the ORM to perform the updates.
Also, if the queries are complex, stored procedures are often faster than Entity Framework when massive amounts of data need to be retrieved.
Keep in mind that both options will run faster if you have a proper DB design.
Check this article for some tips to improve WEB API performance: https://www.c-sharpcorner.com/article/Tips-And-Tricks-To-Improve-WEB-API-Performance/
Also check this one to improve stored procedures: https://www.c-sharpcorner.com/article/improve-stored-procedure-performance-in-sql-server-store-procedure-performance/

SQL or LINQ for this functionality?

I'm writing an SQL script to update the database of a deployed application (add a few tables and copy/update some data) to accommodate new functionality being added to the application (.NET using C#). I've written some code in the application to handle such updates (whenever a new update is available, run its .sql), which is reusable for all DB updates (simply provide a new .sql).
The problem I'm running into now is that I am fairly new to SQL, and have a pretty short timeline to get this update ready. The more complex part of the update could easily be written in the application, but for consistency's sake I'd like to do it in SQL.
What I need to do, I've summarized as the following algorithm/pseudocode:
for each category in allCategories
    Select * from unrankedEntities
        where Category = category
        order by (score)
        take top Category.cutoff
    insert each of these into rankedEntities with rank = 0
    Select * from unrankedEntities
        where Category = category
        order by (score) DESC
        take top 16 - Category.cutoff
    insert each of these into rankedEntities with rank = null, belowcutoff=true
This is the general algorithm I am trying to implement (with a couple more tweaks to be made after this part is complete). Would it be more efficient to write this in SQL or just write the LINQ to do it (resulting in changing the application's database update code for each DB update)? If SQL is the best choice, where can I find a good reference/tutorial covering the topics I would need to do this? (I've looked through a few different ones online, and some seem pretty simple, so I would appreciate a recommendation of a good one.)
Additionally, if you have any tips on improving the algorithm, I'd be happy to hear your advice.
EDIT:
I forgot to mention, but (score) is not a value stored in unrankedEntities. It needs to be calculated for each entity by counting rows in another table that match a specific condition, and multiplying by a constant and by numbers pulled from other tables. Additionally, it needs to be stored in rankedEntities to avoid repeating all this calculation every time it is needed (often).
EDIT:
It has been pointed out to me that parts of the new additions to the application already contain the code necessary to do all of the data manipulation, so I will go with the simple solution of recycling that code, although the result is a less tidy update-via-sql class, as it now has an update-via-sql section and update-via-code section.
When using LINQ to SQL in debug mode you can see the SQL it's generating, which you could copy into a script, if you need to hurry and not learn all about SQL right at this moment. See http://weblogs.asp.net/scottgu/archive/2007/07/31/linq-to-sql-debug-visualizer.aspx
EDIT: I didn't realize you were using Entity Framework. See this response to a similar question: How do I get the raw SQL underlying a LINQ query when using Entity Framework CTP 5 "code only"?.
LINQ is perfectly fine for this, but it inserts and updates entities one at a time: in SQL you could update in bulk with one query, whereas LINQ generates a separate statement for each entity. So SQL has some efficiency gains. If you are more comfortable with LINQ, do it in LINQ. If you are talking about thousands of objects, you may want to turn off change tracking (in EF, by changing the merge option), as the in-memory set of tracked objects will grow rather large.
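For reference, the pseudocode from the question could be sketched in LINQ roughly as follows. This runs over in-memory collections and follows the pseudocode literally, including its sort directions. All type and member names here (Category, UnrankedEntity, RankedEntity, RankingSketch) are hypothetical, and the score is assumed to have been computed already.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Category
{
    public string Name { get; set; }
    public int Cutoff { get; set; }
}

public sealed class UnrankedEntity
{
    public int Id { get; set; }
    public string Category { get; set; }
    public double Score { get; set; }   // assumed to be precomputed
}

public sealed class RankedEntity
{
    public int Id { get; set; }
    public double Score { get; set; }
    public int? Rank { get; set; }
    public bool BelowCutoff { get; set; }
}

public static class RankingSketch
{
    public const int SlotsPerCategory = 16;   // the "16" in the pseudocode

    public static List<RankedEntity> Rank(
        IEnumerable<Category> categories, IEnumerable<UnrankedEntity> unranked)
    {
        var byCategory = unranked.ToLookup(e => e.Category);
        var result = new List<RankedEntity>();

        foreach (var category in categories)
        {
            var members = byCategory[category.Name];

            // First block: order by score, take top cutoff, rank = 0.
            result.AddRange(members
                .OrderBy(e => e.Score)
                .Take(category.Cutoff)
                .Select(e => new RankedEntity
                    { Id = e.Id, Score = e.Score, Rank = 0, BelowCutoff = false }));

            // Second block: order by score DESC, take top 16 - cutoff,
            // rank = null, belowcutoff = true.
            result.AddRange(members
                .OrderByDescending(e => e.Score)
                .Take(SlotsPerCategory - category.Cutoff)
                .Select(e => new RankedEntity
                    { Id = e.Id, Score = e.Score, Rank = null, BelowCutoff = true }));
        }
        return result;
    }

    public static void Main()
    {
        var categories = new[] { new Category { Name = "A", Cutoff = 1 } };
        var unranked = new[]
        {
            new UnrankedEntity { Id = 1, Category = "A", Score = 1.0 },
            new UnrankedEntity { Id = 2, Category = "A", Score = 2.0 },
            new UnrankedEntity { Id = 3, Category = "A", Score = 3.0 },
        };
        Console.WriteLine(Rank(categories, unranked).Count);
    }
}
```

Note that, exactly as in the pseudocode, an entity in a small category can be selected by both blocks; whether that is intended is one of the tweaks left to the author.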

Which is better, filtering results at db or application?

A simple question. There are cases when I fetch the data and then process it in my BLL. But I realized that the same processing/filtering can be done in my stored procedure and the filtered results returned to BLL.
Which is better, processing at DB or processing in BLL? And Why?
Consider this scenario: I want to check whether a product exists in my db and, if it exists, add it to the order (example taken from the answer by Nour Sabony below). I can do this check in my BLL, or I can do it in the stored procedure as well. If I combine things into one procedure, I reduce the whole operation to one db call. Is that better?
At the database level.
Databases should be optimized for querying, so why go through all the effort to return a large dataset, and then do filtering in your application afterwards?
As a general rule: Anything that reasonably belongs to the db's responsibilities and can be done at your db, do it at your db.
Well, the fastest answer is at the database, but you may consider something like Linq2Sql. I mean writing an expression at the presentation layer and having it parsed into a SQL statement at the data access layer.
Of course, there are some situations where the BLL should get some data from the DAL, process it, and then return it to the DAL.
Take an example: a PutOrder(Order value) procedure, which should check the availability of the ordered product.
public void PutOrder(Order _order)
{
    foreach (OrderDetail _orderDetail in _order.Details)
    {
        int count = dalOrder.GetProductCount(_orderDetail.Product.ProductID);
        if (count == 0)
            throw new Exception(string.Format("Product {0} is not available", _orderDetail.Product.Name));
    }
    dalOrder.PutOrder(_order);
}
But if you are making a Browse view, it is not a good idea (from a performance viewpoint) to bring all the data from the DAL and then choose what to display in the Browse view.
Maybe the following could help:
public List<Product> SearchProduts(Criteria _criteria)
{
    string sql = Parser.Parse(_criteria);
    // code to pass the sql statement to the database procedure and get the corresponding data.
}
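One way to picture the "expression written in an upper layer, turned into SQL in the data access layer" idea is with IQueryable. The sketch below is only an illustration: FilterSketch and Search are hypothetical names, and an in-memory list stands in for the table so that the code runs on its own. With a real LINQ to SQL table passed in instead, the same Where call would be translated into a WHERE clause and executed by the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public sealed class Product
{
    public int ProductID { get; set; }
    public string Name { get; set; }
    public int Stock { get; set; }
}

public static class FilterSketch
{
    // Hypothetical DAL method: the caller supplies the filter as an
    // expression tree, and the query provider decides how to run it.
    public static List<Product> Search(
        IQueryable<Product> products, Expression<Func<Product, bool>> criteria)
    {
        return products.Where(criteria).ToList();
    }

    public static void Main()
    {
        // In-memory stand-in for a database table (assumption for the demo).
        IQueryable<Product> table = new List<Product>
        {
            new Product { ProductID = 1, Name = "Widget", Stock = 5 },
            new Product { ProductID = 2, Name = "Gadget", Stock = 0 },
        }.AsQueryable();

        List<Product> inStock = Search(table, p => p.Stock > 0);
        Console.WriteLine(inStock.Count);
    }
}
```

The point of taking an Expression rather than a plain delegate is that the filter stays inspectable, so a database-backed provider can ship it to the server instead of pulling every row into the application first.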
At the database.
I have consistently been taught that data should be handled in the data layer whenever possible. Filtering data is what a DB is specifically there to do and is optimized to do. Therefore that is where it should occur.
This also prevents the duplication of effort. If the data filtering occurs in a data layer then other code/systems could benefit from the same stored procedure instead of duplicating the work in your BLL.
I have found that people like to put the primary logic in the section that they are most comfortable working in. Regardless of a developer's strengths, data processing should go in the data layer whenever possible.
To throw a differing viewpoint out there: it also really depends upon your data abstraction model. If you have a large second-level cache that sits on top of your database, and the majority (or entirety) of the data you will be filtering on is in the cache, then yes, stay inside the application and use the cached data.
The difference is not very important when you consider tables that will never contain more than a few dozen rows.
Filtering at the db becomes the obvious choice when you consider tables that will grow to thousands, hundreds of thousands or millions of rows. It would make zero sense to transfer that much data to filter for a subset of records.
I asked a similar question in a SSRS class earlier this year when I wanted to know "best practice" for having a stored procedure retrieve queried data instead of executing TSQL in the report's data set. The response I received was: Allow the database engine to do the "heavy lifting." That is, do all the selecting, grouping, sorting in the stored procedure and allow SSRS to concentrate on delivering the results.
The same is true for your scenario. By all means, my vote would be to do all the filtering in your stored procedure. Call the sproc from your application and let the database do the work.
If you filter at the database, then only the data that is required by the application is actually sent on the wire between the database and the application server. For this reason, I suggest that it is better to filter at the database.

transactions and delete using fluent nhibernate

I am starting to play with (Fluent) nHibernate and I am wondering if someone can help with the following. I'm sure it's a total noob question.
I want to do:
delete from TABX where name = 'abc'
where table TABX is defined as:
ID int
name varchar(32)
...
I built the code based on internet samples:
using (ITransaction transaction = session.BeginTransaction())
{
    IQuery query = session.CreateQuery("FROM TABX WHERE name = :uid")
        .SetString("uid", "abc");
    session.Delete(query.List<Person>()[0]);
    transaction.Commit();
}
but alas, it's generating two queries (one select and one delete). I want to do this in a single statement, as in my original SQL. What is the correct way of doing this?
Also, I noticed that in most samples on the internet, people tend to always wrap all queries in transactions. Why is that? If I'm only running a single statement, that seems like overkill. Do people tend to just mindlessly cut and paste, or is there a reason beyond that? For example, in my query above, if I do manage to get it from two queries down to one, I should be able to remove the begin/commit transaction, no?
If it matters, I'm using PostgreSQL for experimenting.
You can do a delete in one step with the following code:
session.CreateQuery("DELETE TABX WHERE name = :uid")
.SetString("uid", "abc")
.ExecuteUpdate();
However, by doing it that way you avoid event listener calls (it's just mapped to a simple SQL call), cache updates, etc.
Your first query comes from query.List<Person>().
Your actual delete statement comes from session.Delete(...)
Usually, when you are dealing with only one object, you will use Load() or Get().
Session.Load(type, id) will create the object for you without looking it up in the database. However, as soon as you access one of the object's properties, it will hydrate the object.
Session.Get(type, id) will actually look up the data for you.
As for transactions, this is a good article explaining why it is good to wrap all of your NHibernate queries in transactions.
http://nhprof.com/Learn/Alerts/DoNotUseImplicitTransactions
In NHibernate, I've noticed it is most common to do a delete with two queries like you see. I believe this is expected behavior. The only way around it off the top of my head is to use caching, then the first query could be loaded from the cache if it happened to be run earlier.
As far as wrapping everything in a transaction: in most databases, transactions are implicit for every query anyways. The explicit transactions are just a guarantee that the data won't be changed out from under you mid-operation.