Code legibility/maintenance: Where to put SQL statements?

In the language file itself?
In a language file with all SQL statements?
In different .sql files?
Another way?
Share your code style. :)

Even if you don't use a framework, MVC helps keep you sane by separating your data access, logic and presentation into separate language files.

I guess you will get as many answers as there are answerers. Anyway, let's ask why one could
be preferable over the other:
In the language file itself (I don't know what you mean by "language file", but let me assume you meant the programming language itself): this approach was taken by Microsoft with LINQ. It was also taken, e.g., in GemStone, where the "query" language is Smalltalk (not SQL).
If you put it in some .sql file, then there must be a way to address the code. I think this is what is done with stored procedures. Examples of that can be found, e.g., in the PostgreSQL database software.
Whether to put it in one file or in many is probably an open question. E.g., you could have one query per file. Is that better or worse than having a hash table with diverse SQL statements identified by some key?
I see the following approaches every day in Access software
1) embedded in VBA as "just strings"
2) put into the queries section of access
3) I even read about putting these SQL statements in an extra SQL statement table.
Regards
Friedrich

It all depends - e.g:
in small DB maintenance scripts with a simple sequential control flow it's nice to see the statements where they are used
programs with loops/callbacks should prepare the statements early; then a list of all statements near an init/prepare function makes sense
a special case: I use a set of tool scripts written in different languages that do 'the same thing'; they all get their statements from .txt files containing SQL statements tagged by name
'big' applications (should) use stored procedures - then the problem vanishes
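The "statements tagged by name" idea from the list above can be sketched as follows. This is only a minimal illustration in Python using the standard-library sqlite3 module; the tag format (`-- name: <key>`), the helper `load_statements`, and the sample table are assumptions made for the example, not the format used by the tool scripts mentioned above.

```python
import sqlite3

# Hypothetical tag format: a line "-- name: <key>" starts a new statement.
SQL_TEXT = """\
-- name: create_users
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- name: insert_user
INSERT INTO users (name) VALUES (?);

-- name: count_users
SELECT COUNT(*) FROM users;
"""


def load_statements(text):
    """Parse tagged SQL text into a dict mapping statement name -> SQL."""
    statements = {}
    name = None
    lines = []
    for line in text.splitlines():
        if line.startswith("-- name:"):
            # A new tag closes the previous statement, if there was one.
            if name is not None:
                statements[name] = "\n".join(lines).strip()
            name = line[len("-- name:"):].strip()
            lines = []
        elif name is not None:
            lines.append(line)
    if name is not None:
        statements[name] = "\n".join(lines).strip()
    return statements


# In a real script the text would be read from a shared .txt or .sql file.
SQL = load_statements(SQL_TEXT)

conn = sqlite3.connect(":memory:")
conn.execute(SQL["create_users"])
conn.execute(SQL["insert_user"], ("alice",))
conn.execute(SQL["insert_user"], ("bob",))
user_count = conn.execute(SQL["count_users"]).fetchone()[0]
print(user_count)
```

Because the statements end up in a plain dictionary, this is also one way to realise the "hash table of SQL statements identified by some key" mentioned in an earlier answer.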

In Cobol, I put the SQL in the language file, but in separate procedures. That way, I separated the business logic from the data base logic.
In Python, I put the SQL in its own .py module. That way, I separated the business logic from the data base logic.
In Java, I put the SQL in a separate package. That way, I separated the business logic from the data base logic.
I've not used other languages, but I'd probably separate the business logic from the data base logic.

Related

How to include SQL code from one script into another

I'm looking for something that lets me better organize my SQL scripts.
I want to be able to include SQL code from one script into another, similar to how in C++ you can do #include "foo.c" to import the contents of foo.c into your program.
Is that possible with SQL?
(FYI, I'm using SQL Server)
SQL is not designed to work like structured or object oriented programming languages.
In case you want to re-use scripts you built, I suggest you create functions and/or stored procedures which you will then be able to call, so you would avoid having to rewrite the code (this would be like "importing").
Functions are the basis of returning data in a custom format. You can read more about them here. You will find tips on when/where and how to use functions.
If you think functions are not enough, try reading about stored procedures here.

Using extract method on stored procedure

Extract method is a common refactoring pattern when writing code in general-purpose programming languages.
When I try to do some refactorings on my stored procedures, I am wondering if it is also a good practice to use extract method when writing stored procedures (SP)/User-defined functions (UDF) since we can call other SPs/UDFs on a SP/UDF?
Does it affect performance?
Thanks in advance.
Just my opinion (working for several years with databases now):
Stored procedures should be used for database tasks only. Examples are migrating data (I'm currently working on a process that transforms a database structure), some dynamic queries (where a SQL statement is built on the fly), or maybe a procedure that builds a table (for example a table that holds dates for a specific date range).
Not for anything else! For everything that gets more complicated than the above examples, consider coding it in the application layer.
Also, you may have heard that it's wise to put as much business logic into the database as possible. That's true for the database design, but it does not mean that you should code almost everything in it. Databases are not good at that (I'm talking, for example, about data transformation). A programming language like PHP or whatever is faster!
So, for everything that I used stored procedures for, I never felt the need to put anything in extra procedures. The exception is, for example, the restructuring of a database (in my case it's an ETL process that denormalizes data into a star schema for better performance): there I wrote a procedure for every table, and these procedures are called from a procedure that manages the whole process. But again, it's nothing like a programming language.
Also, take this example of the extract method pattern: http://www.refactoring.com/catalog/extractMethod.html
Having something like this in your database will become a debugging nightmare, and you will spend way too much time coding. And again, the cases where a stored procedure should be used are not cases where it makes sense to apply the extract method pattern.

How to avoid SQL statements spreading everywhere in your app?

I have a medium-sized app written in Ruby, which makes pretty heavy use of an RDBMS. As our code grows, I have found that ugly SQL statements are spreading to all modules and methods in my app and are embedded in much of the application logic. I am not sure if this is bad; however, my gut tells me this is quite ugly...
So generally, in any language, how do you manage your SQL statements? Or do you think it is harmful for maintainability to have many SQL statements embedded in the application logic? Why or why not?
Thanks.
SQL is a language for accessing databases. Often, it gets confused as being the API into the data store for a larger application. In fact, you should design a real API between the data store and the app.
This means several things.
For accessing data stored in tables, you want to go through views in the database, rather than directly access the tables.
For data modification steps, you want to wrap insert/update/delete in stored procedures. This has secondary benefits, where you can handle constraints and triggers in the stored procedure and better log what is happening.
For security, you want to include database security as part of your security architecture. Giving all users full access may not be the best approach.
Unfortunately, it is easy to write a simple app that uses a database directly, whether in java or ruby or VBA or whatever. This grows into a bigger app, and then the maintenance problems arise.
I would suggest an incremental approach to fixing this. Go through the code and create views where you have nasty select statements. You'll probably find you need many fewer views than selects (the views can be re-used -- a good thing).
Find places where data is being modified, and change these to stored procedures. I always return status from the stored procedure for error checking and put log information into a table called something like splog or _spcalls.
If you want to limit permissions for different users of your app, then you might be interested in this.
Leaving the raw SQL statements in the code is a problem. Just wait until you want to rename a column and you have to find all the places where this breaks the code.
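The view-based half of this advice can be sketched as below. It is a minimal sketch in Python with the standard-library sqlite3 module; the table `cust_tbl`, its column names, the view `active_customers`, and the function `list_active_customers` are all hypothetical names chosen for illustration. SQLite has no stored procedures, so the data-modification half is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Physical table with (hypothetical) legacy column names.
conn.execute(
    "CREATE TABLE cust_tbl (c_id INTEGER PRIMARY KEY, c_nm TEXT, c_active INTEGER)"
)
conn.executemany(
    "INSERT INTO cust_tbl (c_id, c_nm, c_active) VALUES (?, ?, ?)",
    [(1, "Ada", 1), (2, "Bob", 0), (3, "Cy", 1)],
)

# The view is the interface the application sees.
conn.execute(
    """
    CREATE VIEW active_customers AS
    SELECT c_id AS id, c_nm AS name
    FROM cust_tbl
    WHERE c_active = 1
    """
)


def list_active_customers(connection):
    """Application code: knows only the view and its column names."""
    rows = connection.execute("SELECT id, name FROM active_customers ORDER BY id")
    return rows.fetchall()


active = list_active_customers(conn)
print(active)
```

If a column of `cust_tbl` is later renamed, only the view definition has to follow; the query inside `list_active_customers` keeps working as long as the view still exposes `id` and `name`.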
Yes, this is not optimal - maintenance becomes a nightmare; it's hard to forecast and determine which code must change when underlying DB changes occur. This is why it is good practice to create a data access layer (DAL) to encapsulate CRUD operations and keep them out of the application logic. There is often a business logic layer (BLL) between the application logic and the DAL to enforce business rules/logic.
Google "data access layer" "business logic layer" and even "n-tier architecture" to learn more.
If you are concerned about the SQL statements littered around your application logic, maybe consider implementing them as Stored Procedures?
That way you will only be including the procedure name and any parameters that need to be passed to it in your code.
It has other benefits too, a common one being easier to re-use in multiple files.
There is much debate about speed and security of Stored Procedure and you will never get a definitive answer about that so I won't even open that can of worms.
Here is how you do this with Java: Create a class that encapsulates all access to the database. Add a method to the class for each query you need to run.
The answer for ruby will be similar to this.
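The same shape, one class that owns all database access with one method per query, might look like this. The sketch uses Python and the standard-library sqlite3 module purely for illustration (the answer above is about Java and Ruby); the class name `OrderStore`, the `orders` table, and the method names are hypothetical.

```python
import sqlite3


class OrderStore:
    """Single place where the application's order-related SQL lives."""

    def __init__(self, connection):
        self._conn = connection

    def create_schema(self):
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS orders ("
            "id INTEGER PRIMARY KEY, customer_id INTEGER NOT NULL, total REAL NOT NULL)"
        )

    def add_order(self, customer_id, total):
        """Insert one order and return its generated id."""
        cur = self._conn.execute(
            "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
            (customer_id, total),
        )
        return cur.lastrowid

    def orders_for_customer(self, customer_id):
        """Return (id, total) pairs for one customer, oldest first."""
        cur = self._conn.execute(
            "SELECT id, total FROM orders WHERE customer_id = ? ORDER BY id",
            (customer_id,),
        )
        return cur.fetchall()


# The rest of the application calls methods and never sees SQL text.
store = OrderStore(sqlite3.connect(":memory:"))
store.create_schema()
store.add_order(876, 19.5)
store.add_order(12, 5.0)
store.add_order(876, 7.25)
customer_orders = store.orders_for_customer(876)
print(customer_orders)
```

With this layout, a schema change means searching one class rather than the whole code base.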
It depends on the architecture of your application but a simple solution is to keep each sql in a file, qry.sql. For each Ruby module (or whatever is used in Ruby to aggregate related code) you can keep a folder SQL with these files. So, the collection of SQL folder/files form the data access layer of your application. The Ruby code provides the business layer. If your data model changes (field names, etc), you can do greps to identify the sql files that need changes. Anyway, definitely separate SQL from your logic code.

Confused about the role of a query language

So, I haven't had any luck finding any articles or forum posts that explain how exactly a query language works in conjunction with a general-purpose programming language like C++ or VB. So I guess it won't hurt to ask >.<
Basically, I've been having a hard time understanding what the roles of the query language are (we'll use SQL as the example query language and VB6 as the normal language) if I'm creating a simple database query that fills a table with normal information (first name, last name, address etc). I somewhat know the steps in setting up a program like this using ADO objects for the connection and whatnot, but how do we decide which of the two languages gets used for certain things? Does VB6 specifically handle the basics like loops, if/else, and variable declarations, while SQL specifically handles things like connecting to the database and doing the searching, filtering and sorting? Is it possible to do certain general-purpose VB6 actions (loops or conditionals) in SQL syntax instead? Any help would be GREATLY appreciated.
SQL is a language to query a database. SQL is an ISO standard; relational database vendors implement the ISO standard and then add their own customizations. For example, in SQL Server the dialect is called T-SQL and in Oracle it is called PL/SQL. They both implement the ISO standard, so each will accept an identical query for a simple select like
select columnname from tablename where columnname=1
However, each has different syntax for string functions, date functions, etc.
The ISO SQL standard is by design not a full procedural language with looping, subroutines, etc., the way a language like VB is.
However, each vendor has added capabilities to its version to provide some of this functionality.
For example, both T-SQL and PL/SQL can "loop" through records using various constructs in their language.
There is also a difference in how you work with data that many developers are not well attuned to: set-based operations vs. procedural operations.
Databases can work with procedural constructs but are often more performant with set-based ones. A developer who is not versed in this concept may end up creating a very inefficient query. Here's an example of this discussion.
In any situation you have to weigh the pros and cons of where it is best to do this work.
I tend to favor using procedural constructs such as loops in the language I am using over SQL. I find it easier to maintain and the language I am using offers more powerful syntax for me to get the job done.
However, I keep both options as a tool in the toolbox. For example, I have written data conversion scripts in SQL and in this case I have used the looping constructs in SQL.
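To make the set-based vs. procedural distinction concrete, here is a small sketch in Python with the standard-library sqlite3 module. The `products` table and the helper names are hypothetical; both functions produce the same result, but the first issues one UPDATE per row while the second describes the whole change in a single statement. No timing claim is made here: on a tiny in-memory table the difference is not measurable.

```python
import sqlite3


def make_db():
    """Build a small in-memory database with sample products."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (id INTEGER PRIMARY KEY, category TEXT, price REAL)"
    )
    conn.executemany(
        "INSERT INTO products (category, price) VALUES (?, ?)",
        [("book", 10.0), ("toy", 20.0), ("book", 30.0)],
    )
    return conn


def raise_prices_row_by_row(conn):
    """Procedural style: fetch rows, then issue one UPDATE per row."""
    rows = conn.execute(
        "SELECT id, price FROM products WHERE category = 'book'"
    ).fetchall()
    for product_id, price in rows:
        conn.execute(
            "UPDATE products SET price = ? WHERE id = ?", (price * 2, product_id)
        )


def raise_prices_set_based(conn):
    """Set-based style: one statement describes the whole change."""
    conn.execute("UPDATE products SET price = price * 2 WHERE category = 'book'")


def prices(conn):
    return [row[0] for row in conn.execute("SELECT price FROM products ORDER BY id")]


a, b = make_db(), make_db()
raise_prices_row_by_row(a)
raise_prices_set_based(b)
print(prices(a), prices(b))
```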
Usually programming languages are executed on the client side (or on an app server), and query languages are executed on the DB server, so in the end it depends on where you want to put the work. Sometimes you can put a lot of work on the client side by doing all the calculations in the programming language; other times you want to lean more on the DB server, and you end up using the query language or, even better, T-SQL, PL/SQL or whatever your database offers.
Relational databases are designed to manage data. In particular, they provide an efficient mechanism for managing memory, disk, and processors for large quantities of data. In addition, relational databases can handle multiple clients, guarantee transactional integrity, security, backups, persistence, and numerous other functions.
In general, if you are using an RDBMS with another language, you want to design the data structure first and then think about the API (applications programming interface) between the two. This is particularly true when you have an app/server relationship.
For a "simple" type of application, which uses a lot of data but with minimal or batch changes to it, you want to move as much of the processing into the database as is reasonable. Here are things you do not want to do:
Use queries to load things into arrays, and then do array manipulations at the language level. SQL provides joins for this.
Load data into an array and do manipulations and summaries on the array. SQL provides aggregations for this.
Save data into a file to have a backup. Databases provide backup mechanisms.
If your data fits into an array or an Excel spreadsheet, it is often sufficient to get started with the data stored there. Only when you start to expand the needs (multiple clients, security, integration with other data) do the advantages of a database become more apparent.
These are just for guidance and to give you some ideas.
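As an illustration of letting the database do the join and the summary instead of loading arrays into the client, here is a minimal sketch in Python with the standard-library sqlite3 module. The `customers` and `orders` tables and their contents are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);
    INSERT INTO customers (id, name) VALUES (1, 'Ada'), (2, 'Bob');
    INSERT INTO orders (customer_id, total) VALUES (1, 10), (1, 15), (2, 7);
    """
)

# One query does the join and the summary; no intermediate arrays in the client.
totals = conn.execute(
    """
    SELECT c.name, COUNT(o.id) AS order_count, SUM(o.total) AS total_spent
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.name
    """
).fetchall()
print(totals)
```

The client receives one row per customer rather than every order row, which is the point of pushing joins and aggregations into SQL.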
In terms of doing what where: do as much as is sensible in SQL (given it runs on a server).
So, for instance, don't do stuff like this (pseudocode):
foreach (row in "Select * from Orders")
    if (row[CustomerID] == 876)
        Display(row)
Do this instead:
foreach (row in "Select * from Orders where CustomerID = 876")
    Display(row)
First, it's likely that Orders is indexed by CustomerID, so the database will find all of customer 876's orders much quicker.
Second, to do the first one you just sucked every record in that table into the client's memory space, probably across your network.
What language is used is essentially irrelevant; you could invent your own DBMS with its own language.
It's where you do what processing that matters. It's a rule with exceptions, but the essential idea is to let your backend do as much as it can.
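A runnable version of the pseudocode above, as a minimal sketch in Python with the standard-library sqlite3 module. The `Orders` table layout and sample rows are assumptions for the example, and since the database here is in-memory, the network cost described above is not actually incurred; the sketch only shows the two query shapes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER, Item TEXT)"
)
conn.execute("CREATE INDEX idx_orders_customer ON Orders (CustomerID)")
conn.executemany(
    "INSERT INTO Orders (CustomerID, Item) VALUES (?, ?)",
    [(876, "lamp"), (12, "desk"), (876, "chair")],
)

# Avoid: pull every row to the client, then filter there.
client_filtered = [
    row
    for row in conn.execute(
        "SELECT OrderID, CustomerID, Item FROM Orders ORDER BY OrderID"
    )
    if row[1] == 876
]

# Prefer: let the database filter (it can use the index on CustomerID).
# The value is passed as a parameter rather than pasted into the SQL text.
server_filtered = conn.execute(
    "SELECT OrderID, CustomerID, Item FROM Orders "
    "WHERE CustomerID = ? ORDER BY OrderID",
    (876,),
).fetchall()

print(server_filtered)
```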

Migrating from MySQL to arbitrary standards-compliant SQL2003 server

Is there an incantation of mysqldump or a similar tool that will produce a piece of SQL2003 code to create and fill the same databases in an arbitrary SQL2003 compliant RDBMS?
(The one I'm trying right now is MonetDB)
DDL statements are inherently database-vendor specific. Although they have the same basic structure, each vendor has their own take on how to define types, indexes, constraints, etc.
DML statements on the other hand are fairly portable. Therefore I suggest:
Dump the database without any data (mysqldump --no-data) to get the schema
Make necessary changes to get the schema loaded on the other DB - these need to be done by hand (but some search/replace may be possible)
Dump the data with extended inserts off and no create table (--extended-insert=0 --no-create-info)
Run the resulting script against the other DB.
This should do what you want.
However, when porting an application to a different database vendor, many other things will be required; moving the schema and data is the easy bit. Checking for bugs introduced, different behaviour and performance testing is the hard bit.
At the very least test every single query in your application for validity on the new database. Ideally do a lot more.
This one is kind of tough. Unless you've got a very simple DB structure with vanilla types (varchar, integer, etc), you're probably going to get the best results writing a migration tool. In a language like Perl (via the DBI), this is pretty straight-forward. The program is basically an echo loop that reads from one database and inserts into the other. There are examples of this sort of code that Google knows about.
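The "echo loop" idea can be sketched as follows. This uses Python rather than Perl/DBI, and two in-memory SQLite databases stand in for the real source and target servers; the function `copy_table`, the `people` table, and the batch size are hypothetical. With real MySQL or MonetDB drivers the connection setup and the parameter placeholder style (`?` here) would differ, and type conversions would need handling.

```python
import sqlite3


def copy_table(source, target, table, columns, batch_size=500):
    """Read rows from `source` and insert them into `target` in batches.

    `table` and `columns` are trusted identifiers supplied by the migration
    script itself, not user input, so they are interpolated directly.
    """
    column_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    select_sql = f"SELECT {column_list} FROM {table}"
    insert_sql = f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})"

    copied = 0
    cursor = source.execute(select_sql)
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        target.executemany(insert_sql, rows)
        copied += len(rows)
    target.commit()
    return copied


# Two in-memory SQLite databases stand in for the source and target servers.
src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
for db in (src, dst):
    db.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
src.executemany(
    "INSERT INTO people (id, name) VALUES (?, ?)",
    [(1, "Ada"), (2, "Bob"), (3, "Cy")],
)

copied_count = copy_table(src, dst, "people", ["id", "name"], batch_size=2)
print(copied_count)
```

The schema on the target side is assumed to exist already, which matches the advice elsewhere in this thread to port the DDL by hand first.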
Aside from the obvious problem of moving the data, there is the more subtle problem of how some datatypes are represented. For instance, MS SQL's datetime field is not in the same format as MySQL's. Other datatypes like BLOBs may have a different capacity in one RDBMS than in another. You should make sure that you understand the datatype definitions of the target DB system very well before porting.
The last problem, of course, is getting application-level SQL statements to work against the new system. In my work, that's by far the hardest part. Date math seems especially DB-specific, while annoying things like quoting rules are a constant source of irritation.
Good luck with your project.
From SQL Server 2000 or 2005 you can have it generate scripts for your objects, but I am not sure how well they will transfer to other RDBMS.
The generate script option is probably the easiest way to go. You'll undoubtedly have to do some search/replace on a few data types though.