I could not find a similar question, so I have to ask:
I use LINQ to SQL, and all was fine until I started using stored procedures to update database entries. (My stored procedure does something like "update all entries with group ID x".)
The update runs fine and the values in the database change, but the DataContext ignores these changes.
I should mention that the DataContext is a singleton. I know that is not the usual approach, but I have several reasons for doing it this way.
So calling
db.Refresh(System.Data.Linq.RefreshMode.OverwriteCurrentValues);
doesn't help.
Why doesn't the DataContext know about the changes in the database?
What you are trying to do goes very much against how LinqToSql works.
Using a long lived DataContext is very difficult to do correctly, especially if you need to call stored procedures, where LinqToSql can't easily track the data changes.
Changes you make through the DataContext are generally tracked automatically, so the DataContext can properly manage its cache and keep track of changes being made to the database from that DataContext. That isn't always the case however. The DataContext doesn't (and can't easily) understand what your stored procedure is doing, so it has no idea how to keep its cache correct. At that point, after calling the stored procedure, your very best option is to get rid of that DataContext and create a new one. That effectively blows away your cache, which may or may not be a significant performance hit, but data integrity should be your primary concern.
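To make the caching problem concrete, here is a minimal sketch in Python with sqlite3. It is a toy model, not how LINQ to SQL is actually implemented: TinyContext, bulk_update and the items table are all made-up names, and bulk_update simply plays the role of the stored procedure. It only shows the general shape of the problem: a context that caches rows by key keeps serving the old object after something changes the table behind its back, while a fresh context re-reads the row.

```python
import sqlite3


class TinyContext:
    """Hypothetical stand-in for a tracking context: caches rows by primary key."""

    def __init__(self, conn):
        self.conn = conn
        self._cache = {}  # id -> row dict, filled on first load

    def get(self, item_id):
        # Serve the cached object if this context has already loaded that key.
        if item_id not in self._cache:
            row = self.conn.execute(
                "SELECT id, group_id, status FROM items WHERE id = ?", (item_id,)
            ).fetchone()
            self._cache[item_id] = {"id": row[0], "group_id": row[1], "status": row[2]}
        return self._cache[item_id]


def bulk_update(conn, group_id, status):
    # Plays the role of the stored procedure: changes rows without telling any context.
    conn.execute("UPDATE items SET status = ? WHERE group_id = ?", (status, group_id))
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, group_id INTEGER, status TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?, ?)", [(1, 7, "old"), (2, 7, "old")])
conn.commit()

long_lived = TinyContext(conn)
before = long_lived.get(1)["status"]        # loads and caches the row
bulk_update(conn, 7, "new")                 # the database changes, the cache does not
stale = long_lived.get(1)["status"]         # still served from the cache
fresh = TinyContext(conn).get(1)["status"]  # a new context re-reads the row
```

In this toy model the only reliable fix is the one suggested above: throw the context away after the bulk update and start with a new one.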
If your Singleton DataContext isn't the only thing modifying the database (for example, your database could be modified by things like: triggers, batch processing, other applications, etc.), your DataContext may also contain inaccurate data in its cache, which is yet another reason to have a short lived DataContext.
So, while you could possibly succeed with a long lived Singleton DataContext, you will be fighting the system the entire way and the system is likely to win in the end.
You have to decide: How important is data integrity?
Because the DataContext is caching the values. Here's an article on how to clear the cache. But now you have the problem of implementing a notification system so you know when to clear it.
Microsoft recommends that a DataContext be used only for a single unit of work. Hanging onto it as a singleton probably isn't a good idea.
I have a medium-sized app written in Ruby, which makes pretty heavy use of an RDBMS. As our code has grown, I have found ugly SQL statements spreading to all modules and methods in my app, embedded in a lot of application logic. I am not sure if this is bad, but my gut tells me it is quite ugly...
So, in any language, how do you manage your SQL statements? Or do you think it harms maintainability to have many SQL statements embedded in the application logic? Why or why not?
Thanks.
SQL is a language for accessing databases. Often, it gets confused as being the API into the data store for a larger application. In fact, you should design a real API between the data store and the app.
This means several things.
For accessing data stored in tables, you want to go through views in the database, rather than directly access the tables.
For data modification steps, you want to wrap insert/update/delete in stored procedures. This has secondary benefits, where you can handle constraints and triggers in the stored procedure and better log what is happening.
For security, you want to include database security as part of your security architecture. Giving all users full access may not be the best approach.
Unfortunately, it is easy to write a simple app that uses a database directly, whether in Java or Ruby or VBA or whatever. This grows into a bigger app, and then the maintenance problems arise.
I would suggest an incremental approach to fixing this. Go through the code and create views where you have nasty select statements. You'll probably find you need many fewer views than selects (the views can be re-used -- a good thing).
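As a rough illustration of what the view buys you, here is a small Python/sqlite3 sketch. The table, view and column names (customers, people, active_customers) are invented for the example. The application function only ever selects from the view, so when the underlying table is restructured, only the view definition has to change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, full_name TEXT, active INTEGER);
    INSERT INTO customers VALUES (1, 'Ada', 1), (2, 'Bob', 0);
    CREATE VIEW active_customers AS
        SELECT id, full_name AS name FROM customers WHERE active = 1;
""")


def active_customer_names(conn):
    # Application code talks to the view only, never to the base table.
    return [row[0] for row in conn.execute("SELECT name FROM active_customers ORDER BY id")]


names_before = active_customer_names(conn)

# Schema change: the data moves to a new table with different column names.
# Only the view is redefined; active_customer_names() is untouched.
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, display_name TEXT, is_active INTEGER);
    INSERT INTO people SELECT id, full_name, active FROM customers;
    DROP VIEW active_customers;
    DROP TABLE customers;
    CREATE VIEW active_customers AS
        SELECT id, display_name AS name FROM people WHERE is_active = 1;
""")

names_after = active_customer_names(conn)
```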
Find places where data is being modified, and change these to stored procedures. I always return a status from the stored procedure for error checking and put log information into a table called something like splog or _spcalls.
If you want to limit permissions for different users of your app, then you might be interested in this.
Leaving the raw SQL statements in the code is a problem. Just wait until you want to rename a column and you have to find all the places where this breaks the code.
Yes, this is not optimal - maintenance becomes a nightmare; it's hard to forecast and determine which code must change when underlying DB changes occur. This is why it is good practice to create a data access layer (DAL) to encapsulate CRUD operations away from the application logic. There is often a business logic layer (BLL) between the application logic and the DAL to enforce business rules.
Google "data access layer" "business logic layer" and even "n-tier architecture" to learn more.
If you are concerned about the SQL statements littered around your application logic, maybe consider implementing them as Stored Procedures?
That way you will only be including the procedure name and any parameters that need to be passed to it in your code.
It has other benefits too, a common one being that the procedures are easier to re-use from multiple files.
There is much debate about the speed and security of stored procedures, and you will never get a definitive answer, so I won't even open that can of worms.
Here is how you do this with Java: Create a class that encapsulates all access to the database. Add a method to the class for each query you need to run.
The answer for ruby will be similar to this.
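The same shape can be sketched in Python with sqlite3 (the answer talks about Java, and a Ruby version would look much alike). OrderStore, its methods and the orders table are invented for illustration. The point is only that every query string lives inside one class and callers see plain methods.

```python
import sqlite3


class OrderStore:
    """Hypothetical data access class: all SQL for orders lives here."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS orders "
            "(id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
        )

    def add_order(self, customer, total):
        # One method per statement the application needs.
        cur = self.conn.execute(
            "INSERT INTO orders (customer, total) VALUES (?, ?)", (customer, total)
        )
        self.conn.commit()
        return cur.lastrowid

    def orders_for(self, customer):
        rows = self.conn.execute(
            "SELECT id, total FROM orders WHERE customer = ? ORDER BY id", (customer,)
        )
        return rows.fetchall()

    def total_spent(self, customer):
        row = self.conn.execute(
            "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer = ?", (customer,)
        ).fetchone()
        return row[0]


store = OrderStore(sqlite3.connect(":memory:"))
store.add_order("ada", 10.0)
store.add_order("ada", 5.5)
```

Application code then calls store.orders_for("ada") and never builds SQL itself, so a column rename means editing this one class.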
It depends on the architecture of your application, but a simple solution is to keep each SQL statement in its own file, e.g. qry.sql. For each Ruby module (or whatever is used in Ruby to aggregate related code) you can keep an SQL folder with these files. The collection of SQL folders and files then forms the data access layer of your application, and the Ruby code provides the business layer. If your data model changes (field names, etc.), you can grep to identify the SQL files that need changes. Anyway, definitely separate SQL from your logic code.
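A minimal sketch of the one-file-per-query idea, written in Python rather than Ruby. The helper load_queries, the file names and the users table are all made up for the example, and a temporary folder stands in for a module's SQL directory.

```python
import sqlite3
import tempfile
from pathlib import Path


def load_queries(folder):
    """Map each *.sql file name (without extension) to its text."""
    return {path.stem: path.read_text() for path in sorted(Path(folder).glob("*.sql"))}


# Demo setup: a throwaway folder standing in for a module's SQL directory.
sql_dir = Path(tempfile.mkdtemp())
(sql_dir / "create_users.sql").write_text(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)"
)
(sql_dir / "insert_user.sql").write_text("INSERT INTO users (email) VALUES (:email)")
(sql_dir / "find_user_by_email.sql").write_text(
    "SELECT id, email FROM users WHERE email = :email"
)

queries = load_queries(sql_dir)
conn = sqlite3.connect(":memory:")
conn.execute(queries["create_users"])
conn.execute(queries["insert_user"], {"email": "ada@example.com"})
found = conn.execute(
    queries["find_user_by_email"], {"email": "ada@example.com"}
).fetchone()
```

The logic code refers to queries by name only, so a grep over the .sql files is enough to find everything touched by a schema change.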
I will not go into the details of why I am exploring the use of Micro ORMs at this stage - except to say that I feel powerless when I use a full blown ORM. There are too many things going on in the background that happen automatically, and not all of them are the best possible choices. I was quite ready to go back to raw database access, but I found out about the three new guys on the block: Dapper, PetaPoco and Massive. So I decided to give the low-level approach a go with a pet project. It is not relevant, but so far, I am using PetaPoco.
In any case, I am having trouble deciding how to go about maintaining the SQL strings that I will use from the higher levels. There are three main solutions that I can think of:
Sprinkle the SQL queries wherever I need them. This is the least infrastructure heavy method. However, it suffers in both maintainability and testability areas.
Limit the query usage to some service classes. This helps maintainability and still requires little infrastructure. It may also be possible to build these service classes so that they are easy to mock for testing purposes.
Prepare some classes to make the system somewhat flexible. I have started on this path. I implemented a Repository interface, and a database dependent Repository class. I have also built some tiny interfaces to capture SQL queries that can be passed to my Repository's GetMany() method. All the queries are implemented as individual classes right now, and I will probably need a little more interface around this to add some level of database independence - and maybe for some flexibility in decorating queries into paged and sorted queries (again, this would also make them a little bit more flexible in handling different databases).
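For readers trying to picture the third option, here is one possible shape, sketched in Python with sqlite3 instead of C# and PetaPoco. Everything here (ProductsCheaperThan, Paged, SqliteRepository, get_many, the products table) is a hypothetical name chosen for the example, not the asker's actual code. Each query is a small object carrying its SQL and parameters, and a decorator wraps any query to add paging.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class ProductsCheaperThan:
    """A query object: keeps the SQL text and its parameters together."""
    limit: float

    def sql(self):
        return "SELECT id, name, price FROM products WHERE price < ?"

    def params(self):
        return (self.limit,)


@dataclass
class Paged:
    """Decorator: wraps any query object and adds ordering plus LIMIT/OFFSET."""
    inner: object
    page: int
    size: int

    def sql(self):
        return f"SELECT * FROM ({self.inner.sql()}) ORDER BY id LIMIT ? OFFSET ?"

    def params(self):
        return self.inner.params() + (self.size, self.page * self.size)


class SqliteRepository:
    """Database-specific repository; callers pass query objects, not strings."""

    def __init__(self, conn):
        self.conn = conn

    def get_many(self, query):
        return self.conn.execute(query.sql(), query.params()).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, "pen", 1.5), (2, "book", 12.0), (3, "cup", 4.0), (4, "mug", 6.0)],
)

repo = SqliteRepository(conn)
cheap = repo.get_many(ProductsCheaperThan(10))
second_page = repo.get_many(Paged(ProductsCheaperThan(10), page=1, size=2))
```

The SQL stays hand-written and visible, which was one of the stated goals, while the repository remains easy to replace with a fake in tests.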
What I am mainly worried about right now is that I have entered the slippery slope of writing all the functions needed for a full blown ORM, but badly. For example, it feels sensible right now that I write or find a library to convert linq calls into SQL statements so that I can massage my queries easily or write extenders that can decorate any query I pass to it, etc. But that is a large task, and is already done by the big guys, so I am resisting the urge to go there. I also want to retain control over what queries I send to the database - by explicitly writing them.
So what is the suggestion? Should I go #2 option, or try to stumble along on option #3? I am certain I cannot show any code written in the first option to anyone without blushing. Is there any other approach you can recommend?
EDIT: After I asked the question, I realized there is another option, somewhat orthogonal to these three: stored procedures. There seem to be a few advantages to putting all your queries inside the database as stored procedures. They are kept in a central location, and not spread through the code (though maintenance is an issue - the parameters may get out of sync). The reliance on database dialect is solved automatically: if you move databases, you port all your stored procedures, and you are done. And there are also the security benefits.
With the stored procedure option, alternatives 1 and 2 seem a little more suitable. There do not seem to be enough entities to warrant option 3 - but it is still possible to separate the procedure call commands from the database access code.
I've implemented option 3 without stored procedures, and option 2 with stored procedures, and it seems like the latter is more suitable for me (in case anyone is interested in the outcome of the question).
I would say put the sql where you would have put the equivalent LINQ query, or the sql for DataContext.ExecuteQuery. As for where that is... well, that is up to you and depends on how much separation you want. - Marc Gravell, creator of Dapper
See Marc's opinion on the matter
I think the key point is, you shouldn't really be re-using the SQL. If your logic is re-used, then it should be wrapped in a method that can be called from multiple places.
I know you've accepted your answer already but I still wanted to show you a nice alternative that may be helpful in your case as well. Now or in the future.
When using stored procedures it's wise to use T4
I tend to use stored procedures on my project even though it's not using PetaPoco, Dapper or Massive (project started before these were here). It uses BLToolkit instead. Anyway. Instead of writing my methods to run stored procedures and write code to provide stored procedure parameters, I've written a T4 template that generates the code for me.
Whenever stored procedures change (some may be added/removed, parameters added/removed/renamed/retyped), my code will break on compilation because method calls will not match their signature any more.
I keep my stored procedures in a file (so they get version controlled). If you work in a multi-developer team it may be sensible to keep each stored procedure in its own file. It makes updates much less painful. I've done that on some projects and it worked OK as long as the number of SPs was not huge. You can organize them into folders based on the entity they relate to.
Anyway, the maintenance effort stays with the stored procedures; the code change is just a click of a button in Visual Studio that regenerates all T4 templates at once. You don't have to search for the methods that use those procedures. Errors are reported at compile time. One thing less to worry about.
So instead of writing
using (var db = new DbManager())
{
    return db
        .SetSpCommand(
            "Person_SaveWithRelations",
            db.Parameter("#Name", name),
            db.Parameter("#Email", email),
            db.Parameter("#Birth", birth),
            db.Parameter("#ExternalID", exId))
        .ExecuteObject<Person>();
}
and having a bunch of magic strings I can just simply write:
using (var db = new DataManager())
{
    return db
        .Person
        .SaveWithRelations(name, email, birth, exId)
        .ExecuteObject<Person>();
}
This is nicer and cleaner, breaks at compile time, and provides IntelliSense, so it's also faster while developing.
The good thing is that stored procedures may become very complex and may do many things. In my example above I check some data, insert a person record and some related ones as well, and in the end return the newly inserted Person record. Inserts and updates should usually return the data that was added or changed to reflect the actual state.
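The generation step itself is not shown above, so here is a rough sketch of the general idea in Python. This is not T4 and not BLToolkit: the PROCEDURES table is hypothetical metadata (the kind of thing a generator might read from the database catalog), and generate_wrappers is a made-up helper. It emits one function per stored procedure, so a changed parameter list changes the generated signature. Python has no compile step, so a mismatched call fails when it runs rather than at build time.

```python
PROCEDURES = {
    # Hypothetical metadata: procedure name -> ordered parameter names.
    "Person_SaveWithRelations": ["name", "email", "birth", "external_id"],
    "Person_Delete": ["person_id"],
}


def generate_wrappers(procedures):
    """Return Python source containing one wrapper function per stored procedure."""
    lines = []
    for proc_name, params in sorted(procedures.items()):
        func_name = proc_name.lower()
        arg_list = ", ".join(params)
        pairs = ", ".join(f'"{p}": {p}' for p in params)
        lines.append(f"def {func_name}({arg_list}):")
        # Each wrapper just returns (procedure name, parameter dict); a real one
        # would hand these to the database layer.
        lines.append(f'    return ("{proc_name}", {{{pairs}}})')
        lines.append("")
    return "\n".join(lines)


source = generate_wrappers(PROCEDURES)
generated = {}
exec(source, generated)  # a real setup would write the source to a file instead

call = generated["person_savewithrelations"]("Ada", "ada@example.com", "1815-12-10", 42)

try:
    generated["person_delete"]()  # wrong number of arguments
    arity_checked = False
except TypeError:
    arity_checked = True
```

The magic strings now exist in exactly one generated place, and regenerating after a procedure change updates every signature at once.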
I have been working with NHibernate, LINQ to SQL, and Entity Framework for quite some time. And while I see the benefits to using an ORM to keep the development effort moving quickly, the code simple, and the object relational impedance mismatch to a minimum, I still find it very difficult to convince a die hard SQL dba of an ORM's strengths. From my point of view an ORM can be used for at least 90-95% of all of your data access leaving those really hairy things to be done in procedures or functions where appropriate. I am by no means the guy that says we must do everything in the ORM!
Question: What are some of the better arguments for convincing an old school DBA that the use of an ORM is not the absolute worst idea ever conceived by a programmer?
If you want to convince him, first you need to understand what his problem is with use of an ORM. Giving you a list of generic benefits is unlikely to help if it does not address the issues he has.
However, my first guess as to his issue would be that it prevents him from doing any optimisation because you're accessing tables directly so he has no layer of abstraction behind which to work, so if a table needs altering or (de)normalizing then he can't do it without breaking your application.
If you're wondering why a DBA would feel like this, and how to respond to it, then it's roughly the same as him coming up to you and saying he wants you to make all the private fields in your classes public, and that you can't change any of them without asking him first. Imagine what it would take for him to convince you that's a good idea, and then use the same argument on him.
Explain to them that creating a stored procedure for every action taken by an application is unmaintainable on several levels.
If the schema changes, it's difficult to track down all the stored procedures that are affected.
It's impossible to ensure that multiple stored procedures aren't created to do the same thing, or to know whether slightly altering an existing stored procedure is going to have serious ramifications.
It's difficult to make sure that the application and database are in sync after a deploy.
Dynamic SQL has all these issues and more.
I guess, my first question to "Convincing a die hard DBA to use an ORM" would be: Is the DBA also a programmer that also works outside the DB so that he/she would "use an ORM"? If not then why would the DBA give up a major part of their job to someone else and thereby significantly reduce their overall usefulness to the company? They wouldn't.
In any case, the best way to convince any engineer of anything is with empirical data. Set up a prototype with a few parts of the real application ported to the ORM for the purpose of your demonstration and actually prove your points.
On another point, I think you don't get the object relational impedance dilemma if you're trying to use that as an argument for using an object-relational mapper. The DBA could quote from that link you posted where it says "Mapping such private object representation to database tables makes such databases fragile according to OOP philosophy" and that the issue is further pronounced "particularly when objects or class definitions are mapped (ORM) in a straightforward way to database tables or relational schemata". So according to your own link, by promoting ORM you are promoting the problem.
By using sprocs the DBA is free to make changes to the underlying schema, so long as the sproc still returns the same columns with the same types. Thusly with this abstraction that sprocs add, the direct schema mapping issues become nought. This does not mean however that you need to give up your beloved EF since EF can now be used quite happily with sprocs.
Procedures used to be more efficient because of predictable caching mechanisms. However, many DBAs overdo the procedures, introducing lots of branching logic with IF commands, resulting in scenarios where they become uncacheable.
Next, procedures are only useful if you plan to span data logic across multiple platforms; a website and separate client application, for example. If you're only making a web application, the procedures introduce an unnecessary level of abstraction and more things to juggle. Having to adjust a table, then a procedure, then a data model is a lot of work when adjusting a single model via the ORM would suffice.
Lastly, procedures couple your code to your database very tightly. If you want to migrate to a different database you have to migrate all the procedures, some of which may need to be heavily rewritten. This sort of migration is significantly easier with an ORM since you can yank out the backend and install a new one without the frontend application knowing the difference.
I work in a software and hardware development firm. Today one of my colleagues told me that NHibernate is only useful for small projects, and that it must be avoided for complex or large scale projects. He also said that it makes code harder to change.
Are those statements true?
eBay uses Hibernate (the Java original that NHibernate is ported from). I don't consider that a small project.
As far as changing code goes, consider this: Let's assume we need to add a new property to an object.
Here is what has to be done with a hand-rolled data access layer:
Add the column to the db table.
Change every stored procedure that deals with that object / table. This is usually several stored procedures in my experience.
Change the code in the mapping layer.
Add the property to the object.
Here is what has to be done with NHibernate:
Add the column to the db table.
Add the property to the HBM file
Add the property to the object.
Have to agree with Daniel Augur on the first point.
On the second, "does it make code harder to change?", I'll provide a general view. Any time you use something ready-rolled you're going to run into restrictions that might not be easy to overcome. Even when the source is available, you may not wish to modify it for fear of deviating to the point of a breaking change.
Part of a software developer's job is determining whether the merits outweigh the drawbacks with 3rd party code.
Should queries live inside the classes that need the data? Should queries live in stored procedures in the database so that they are reusable?
In the first situation, changing your queries won't affect other people (either other code or people generating reports, etc). In the second, the queries are reusable by many and only exist in one place, but if someone breaks them, they're broken for everyone.
I used to be big on the stored proc solution but have changed my tune in the last year as databases have gotten faster and ORMs have entered the main stream. There is certainly a place for stored procs. But when it comes to simple CRUD sql: one liner insert/update/select/delete, using an ORM to handle this is the most elegant solution in my opinion, but I'm sure many will argue the other way. An ORM will take the SQL and DB connection building plumbing out of your code and make the persistence logic much more fluidly integrated with your app.
I suggest placing them in the database as stored procedures. Not only will you have the re-use advantage, but you'll also guarantee that a given query (be it a select, update, insert, etc.) is the same everywhere, because every caller uses the same stored procedure. If you need to change it, you only change it in one place. Also, you'll be taking advantage of the database server's processing power instead of that of the server or computer where your application resides. That is my suggestion.
Good luck!
It depends / it's situational. There are very strong arguments for and against either option. Personally, I really don't like seeing business logic get split between the database (in stored procedures) and in code, so I usually want all the queries in code to the maximum extent possible.
In the Microsoft world, there are things like Reporting Services that can retrieve data from classes/objects instead of (and in addition to) a database / stored procedures. Also, there are things like Linq that give you more strongly typed queries in code. There are also strong, mature ORMs like NHibernate that allow for writing pretty much all types of queries from code.
That said, there are still times when you are doing "rowset" types of things with your queries that work much better in a stored procedure than they work from code.
In the majority of situations, either/both options work just fine.
From my perspective, stored procs are the way to go. First, they are easier to maintain, as a quick change to one means just running the script, not recompiling the application. Second, they are far better for security. You can set permissions at the stored procedure level and not directly on the tables and views. This helps prevent fraud because users cannot do anything directly to the database that isn't specified in a stored proc. They are also easier to performance tune. When you use stored procs, you can use the database dependency metadata to help determine the effect of database changes on the code base. In many systems, not all data access or even CRUD operations will take place through the application, so having the code there seems counterproductive to me. If all the data access is in one place (an idea I support), it should be in the database, where it is accessible to all applications or processes that might need to use it.
I've found that application programmers often don't consider the best way for a database to process information, as they are focused on the application, not the backend. By putting the code for the database in the database where it belongs, it is more likely to be seen and reviewed by database specialists who do consider the database and its performance. We support hundreds of databases and applications here. I can look in any database and find the code that I need to find when something is slow. I don't have to upload the application code for each of hundreds of different applications just to see the part I need to do my job.