Everyone has been a great help so far. My next question is how best to approach this: if I have 7 fields that a user can search on, what is the best way to run the search? They can use any combination of the 7 fields, which gives 2^7 = 128 possible combinations, far too many to code by hand. So how do I account for the user selecting field 1 and field 3, or field 1, field 2, and field 7? Is there an easy way to do this with SQL? I don't know whether I should approach this with an IF statement or a CASE in the SELECT statement, or go in a completely different direction. If anyone has any helpful pointers I would greatly appreciate it.
Thank You
You'll probably want to look into using dynamic SQL for this. See: Dynamic Search Conditions in T-SQL and Catch-all queries for good articles on this topic.
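A minimal sketch of the dynamic SQL approach, assuming a hypothetical table dbo.MyTable with columns Column1 and Column2 (the procedure, table, and column names are illustrative and not from the question). Only the predicates for supplied parameters are appended, and the values are passed to sp_executesql as parameters rather than concatenated into the string:

```sql
-- Hypothetical names: dbo.SearchMyTable, dbo.MyTable, Column1, Column2.
CREATE PROCEDURE dbo.SearchMyTable
    @param1 VARCHAR(32) = NULL,
    @param2 INT = NULL
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @sql NVARCHAR(4000);
    SET @sql = N'SELECT Column1, Column2 FROM dbo.MyTable WHERE 1 = 1';

    -- Append a predicate only for the parameters that were supplied.
    IF @param1 IS NOT NULL
        SET @sql = @sql + N' AND Column1 = @param1';
    IF @param2 IS NOT NULL
        SET @sql = @sql + N' AND Column2 = @param2';

    -- The values travel as parameters; nothing user-supplied is concatenated.
    -- Declaring a parameter that the statement does not reference is allowed.
    EXEC sp_executesql
        @sql,
        N'@param1 VARCHAR(32), @param2 INT',
        @param1 = @param1,
        @param2 = @param2;
END
GO
```

With 7 fields this needs 7 IF blocks rather than one query per combination, and each distinct combination gets its own cached plan.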
Select f1,f2 from table where f1 like '%val%' or f2 like '%val%'
You could write a stored procedure that accepts each parameter as null and then write your WHERE clause like:
WHERE (field1 = @param1 or @param1 is null)
AND (field2 = @param2 or @param2 is null) etc...
But I wouldn't recommend it. It can definitely affect performance doing it this way depending on the number of parameters you have. I second Joe Stefanelli's answer with looking into dynamic SQL in this case.
It depends on:
what your data looks like,
how big it is,
how exact the result needs to be (all matching records, or is the top 100 enough),
how many resources your database has.
You can try something like:
CREATE PROC dbo.Search(
    @param1 INT = NULL,
    @param2 VARCHAR(3) = NULL
)
AS
BEGIN
    SET NOCOUNT ON
    -- create temporary table to keep the (primary) keys of matching records from the searched table
    -- [table] and [key] are placeholders; both are reserved words, hence the brackets
    CREATE TABLE #results (k INT)
    INSERT INTO #results(k)
    SELECT -- you can use TOP here to narrow results
        [key]
    FROM
        [table]
    -- you can use WHERE if there are some default conditions
    PRINT @@ROWCOUNT
    -- if @param1 is set, filter #results
    IF @param1 IS NOT NULL BEGIN
        PRINT '@param1'
        ;WITH d AS (
            SELECT [key]
            FROM [table]
            WHERE param1 <> @param1
        )
        -- remove the keys of rows that do not match
        DELETE FROM #results
        WHERE k IN (SELECT [key] FROM d)
        PRINT @@ROWCOUNT
    END
    -- if @param2 is set, filter #results
    IF @param2 IS NOT NULL BEGIN
        PRINT '@param2'
        ;WITH d AS (
            SELECT [key]
            FROM [table]
            WHERE param2 <> @param2
        )
        DELETE FROM #results
        WHERE k IN (SELECT [key] FROM d)
        PRINT @@ROWCOUNT
    END
    -- return what is left in the #results table
    SELECT
        t.* -- or better, only the columns you need
    FROM #results r
    JOIN [table] t
        ON t.[key] = r.k
END
I use this technique on a large database (millions of records, but running on a large server) to filter data from a predefined set, and it works pretty well.
However, I don't need all matching records: depending on the query, 10 to 3000 matching records is enough.
If you are using a stored procedure you can use this method:
CREATE PROCEDURE dbo.foo
    @param1 VARCHAR(32) = NULL,
    @param2 INT = NULL
AS
BEGIN
    SET NOCOUNT ON
    SELECT * FROM MyTable AS t
    WHERE (@param1 IS NULL OR t.Column1 = @param1)
    AND (@param2 IS NULL OR t.Column2 = @param2)
END
GO
These are usually called optional parameters. The idea is that if you don't pass one in it gets the default value (null) and that section of your where clause always returns true.
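As a usage sketch for the procedure above (the literal values 'abc' and 42 are made up for illustration), each call supplies only the filters it cares about and leaves the rest at their NULL defaults:

```sql
EXEC dbo.foo;                                -- no filters: every row
EXEC dbo.foo @param1 = 'abc';                -- filter on Column1 only
EXEC dbo.foo @param2 = 42;                   -- filter on Column2 only
EXEC dbo.foo @param1 = 'abc', @param2 = 42;  -- both filters
```

Passing parameters by name keeps the calls readable once there are seven of them.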
Related
I have a Table Tab1. I want to make a stored procedure, in which I will take up to 3 parameters from the user and select data from the table using the AND operator. Like:
Select * from Tab1
Where Para1=1 AND Para2=1 AND Para3=4
But I have a condition: the user can pass one, two, or all three parameters. I want to write a single query such that if he passes all three parameters, it selects data according to these 3 parameters using the AND operator; if he passes any two parameters, it selects data according to those two parameters using the AND operator. Lastly, he may pass a single parameter, in which case it selects data according to this single parameter.
Is there any way to write a single query for the above requirement?
SELECT *
FROM Tab1
WHERE (@Para1 IS NULL OR (Para1 = @Para1))
AND (@Para2 IS NULL OR (Para2 = @Para2))
AND (@Para3 IS NULL OR (Para3 = @Para3))
OPTION (RECOMPILE);
This works because of how OR evaluates: when @Para1 is NULL (assuming NULL is the default when no value is passed), @Para1 IS NULL is already true, so the whole condition is true whatever Para1 = @Para1 returns, and that filter is effectively skipped. SQL Server does not guarantee short-circuit evaluation, but the result is the same either way. The same logic applies to the other parameters. Alternatively, you can use a dynamic query.
Adding to the comment below by KM:
It is better to use OPTION (RECOMPILE). The best plan here depends heavily on which parameters are supplied, so instead of reusing a cached execution plan, OPTION (RECOMPILE) regenerates the plan on each execution.
Try something like:
CREATE PROCEDURE usp_Test
    @param1 int = NULL
    , @param2 int = NULL
    , @param3 int = NULL
AS
BEGIN
    SELECT * FROM Tab1
    WHERE (Para1 = @param1 OR @param1 IS NULL)
    AND (Para2 = @param2 OR @param2 IS NULL)
    AND (Para3 = @param3 OR @param3 IS NULL)
END
I have a base stored procedure simply returning a select from the database, like this:
CREATE PROCEDURE MyProcedure
AS
BEGIN
SELECT * FROM MyTable
END
GO
But now I need to execute some logic for every row of my select. According to the result I need to return or not this row. I would have my select statement running with a cursor, checking the rule and return or not the row. Something like this:
CREATE PROCEDURE MyProcedure
AS
BEGIN
DECLARE CURSOR_MYCURSOR CURSOR FOR SELECT Id, Name FROM MyTable
OPEN CURSOR_MYCURSOR
FETCH NEXT FROM CURSOR_MYCURSOR INTO @OUTPUT1, @OUTPUT2
WHILE (@@FETCH_STATUS=0)
BEGIN
IF (SOME_CHECK)
SELECT @OUTPUT1, @OUTPUT2
ELSE
--WILL RETURN SOMETHING ELSE
END
END
GO
The first problem is that every time I do SELECT @OUTPUT1, @OUTPUT2 the rows are sent back as different result sets and not in a single table as I would need.
Sure, applying some logic to a row sounds like a "FUNCTION" job. But I can't use the result of the function to filter the results being selected. That is because when my check returns false I need to select something else to replace the faulty row. So, I need to return the faulty rows so I can be aware of them and replace by some other row.
The other problem with this method is that I would need to declare quite a few variables so that I can output them through the cursor iteration. And those variables would need to follow the data types for the original table attributes and somehow not getting out of sync if something changes on the original tables.
So, what is the best approach to return a single result set based on a criteria?
Thanks in advance.
If you stay with a cursor, an easy solution to your question is to use a table variable or a temp table:
-- table variable
DECLARE @MyTable TABLE
(
ColumnOne VARCHAR(20)
,ColumnTwo VARCHAR(20)
)
-- temp table
CREATE TABLE #MyTable
(
ColumnOne VARCHAR(20)
,ColumnTwo VARCHAR(20)
)
Then, inside your cursor, you can insert the records that match your logic:
INSERT INTO @MyTable VALUES (@Output1, @Output2) -- table variable
INSERT INTO #MyTable VALUES (@Output1, @Output2) -- temp table
After you are done with the cursor, just select everything from the table:
SELECT * FROM @MyTable -- table variable
SELECT * FROM #MyTable -- temp table
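Put together, the cursor and table variable could look like the sketch below. It assumes MyTable has columns Id INT and Name VARCHAR(20); the Id % 2 test and the 'replacement' value are stand-ins for the question's SOME_CHECK and its replacement row, not anything from the original post:

```sql
-- Assumed schema: MyTable(Id INT, Name VARCHAR(20)).
DECLARE @Result TABLE (Id INT, Name VARCHAR(20));
DECLARE @Id INT, @Name VARCHAR(20);

DECLARE MyCursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT Id, Name FROM MyTable;

OPEN MyCursor;
FETCH NEXT FROM MyCursor INTO @Id, @Name;
WHILE @@FETCH_STATUS = 0
BEGIN
    IF @Id % 2 = 0  -- placeholder for SOME_CHECK
        INSERT INTO @Result (Id, Name) VALUES (@Id, @Name);
    ELSE            -- placeholder for the replacement row
        INSERT INTO @Result (Id, Name) VALUES (@Id, 'replacement');

    -- Advance the cursor, otherwise the loop never ends.
    FETCH NEXT FROM MyCursor INTO @Id, @Name;
END
CLOSE MyCursor;
DEALLOCATE MyCursor;

-- A single result set, however many rows were processed.
SELECT Id, Name FROM @Result;
```

The variables still have to mirror the column types, which is the maintenance cost the question mentions.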
I have a two column table with a primary key (int) and a unique value (nvarchar(255))
When I insert a value to this table, I can use Scope_identity() to return the primary key for the value I just inserted. However, if the value already exists, I have to perform an additional select to return the primary key for a follow up operation (inserting that primary key into a second table)
I'm thinking there must be a better way to do this - I considered using covered indexes but the table only has two columns, most of what I've read on covered indexes suggests they only help where the table is significantly larger than the index.
Is there any faster way to do this? Would a covered index be faster even if its the same size as the table?
Building an index won't gain you anything since you have already created your value column as unique (which builds an index in the background). Effectively, a full table scan is no different from an index scan in your scenario.
I assume you want some sort of insert-if-not-already-exists behaviour. There is no way of getting around a second select:
if not exists (select ID from ... where name = @...)
begin
insert into ...
select SCOPE_IDENTITY()
end
else
(select ID from ... where name = @...)
If the value happens to exist, the query will usually have been cached, so there should be no performance hit for the second ID select.
[Update statement here]
IF (@@ROWCOUNT = 0)
BEGIN
[Insert statement here]
SELECT Scope_Identity()
END
ELSE
BEGIN
[SELECT id statement here]
END
I don't know about performance but it has no big overhead
As has already been mentioned this really shouldn't be a slow operation, especially if you index both columns. However if you are determined to reduce the expense of this operation then I see no reason why you couldn't remove the table entirely and just use the unique value directly rather than looking it up in this table. A 1-1 mapping like this is (theoretically) redundant. I say theoretically because there may be performance implications to using an nvarchar instead of an int.
I'll post this answer since everyone else seems to say you have to query the table twice in the event that the record exists... that's not true.
Step 1) Create a unique-index on the other column:
I recommend this as the index:
-- We're including the "ID" column so that SQL will not have to look far once the "WHERE" clause is finished.
CREATE INDEX MyLilIndex ON dbo.MyTable (Column2) INCLUDE (ID)
Step 2)
DECLARE @TheID INT
SELECT @TheID = ID from MyTable WHERE Column2 = 'blah blah'
IF (@TheID IS NOT NULL)
BEGIN
-- See, you don't have to query the table twice!
SELECT @TheID AS TheIDYouWanted
END
ELSE
BEGIN
INSERT...
SELECT SCOPE_IDENTITY() AS TheIDYouWanted
END
Create a unique index for the second entry, then:
if not exists (select null from ...)
insert into ...
else
select x from ...
You can't get away from the index, and it isn't really much overhead: SQL Server supports index key columns of up to 900 bytes, and does not discriminate.
The needs of your model are more important than any perceived performance issues, symbolising a string (which is what you are doing) is a common method to reduce database size, and this indirectly (and generally) means better performance.
-- edit --
To appease timothy :
declare @x int = (select x from ...)
if (@x is not null)
return @x
else
...
You could use OUTPUT clause to return the value in the same statement. Here is an example.
DDL:
CREATE TABLE ##t (
id int PRIMARY KEY IDENTITY(1,1),
val varchar(255) NOT NULL
)
GO
-- no need for INCLUDE as PK column is always included in the index
CREATE UNIQUE INDEX AK_t_val ON ##t (val)
DML:
DECLARE @id int, @val varchar(255)
SET @val = 'test' -- or whatever you need here
SELECT @id = id FROM ##t WHERE val = @val
IF (@id IS NULL)
BEGIN
DECLARE @new TABLE (id int)
INSERT INTO ##t (val)
OUTPUT inserted.id INTO @new -- put new ID into table variable immediately
VALUES (@val)
SELECT @id = id FROM @new
END
PRINT @id
One of the "best practices" is accessing data via stored procedures. I understand why this approach is good.
My motivation is to separate database and application logic (the tables can be changed as long as the behaviour of the stored procedures stays the same), defence against SQL injection (users cannot execute "select * from some_tables", they can only call stored procedures), and security (a stored procedure can contain whatever logic is needed to ensure that users cannot select/insert/update/delete data that is not theirs).
What I don't know is how to access data with dynamic filters.
I'm using MSSQL 2005.
If I have table:
CREATE TABLE tblProduct (
ProductID uniqueidentifier -- PK
, IDProductType uniqueidentifier -- FK to another table
, ProductName nvarchar(255) -- name of product
, ProductCode nvarchar(50) -- code of product for quick search
, Weight decimal(18,4)
, Volume decimal(18,4)
)
then I should create 4 stored procedures ( create / read / update / delete ).
The stored procedure for "create" is easy.
CREATE PROC Insert_Product ( @ProductID uniqueidentifier, @IDProductType uniqueidentifier, ... etc ... ) AS BEGIN
INSERT INTO tblProduct ( ProductID, IDProductType, ... etc .. ) VALUES ( @ProductID, @IDProductType, ... etc ... )
END
The stored procedure for "delete" is easy too.
CREATE PROC Delete_Product ( @ProductID uniqueidentifier, @IDProductType uniqueidentifier, ... etc ... ) AS BEGIN
DELETE tblProduct WHERE ProductID = @ProductID AND IDProductType = @IDProductType AND ... etc ...
END
The stored procedure for "update" is similar to the one for "delete", but I'm not sure this is the right way to do it. I think that updating all columns is not efficient.
CREATE PROC Update_Product( @ProductID uniqueidentifier, @Original_ProductID uniqueidentifier, @IDProductType uniqueidentifier, @Original_IDProductType uniqueidentifier, ... etc ... ) AS BEGIN
UPDATE tblProduct SET ProductID = @ProductID, IDProductType = @IDProductType, ... etc ...
WHERE ProductID = @Original_ProductID AND IDProductType = @Original_IDProductType AND ... etc ...
END
And the last one, the stored procedure for "read", is a bit of a mystery to me. How do I pass filter values for complex conditions? I have a few suggestions:
Using XML parameter for passing where condition:
CREATE PROC Read_Product ( @WhereCondition XML ) AS BEGIN
DECLARE @SELECT nvarchar(4000)
SET @SELECT = 'SELECT ProductID, IDProductType, ProductName, ProductCode, Weight, Volume FROM tblProduct'
DECLARE @WHERE nvarchar(4000)
SET @WHERE = dbo.CreateSqlWherecondition( @WhereCondition ) --dbo.CreateSqlWherecondition is some function which returns text with WHERE condition from passed XML
DECLARE @LEN_SELECT int
SET @LEN_SELECT = LEN( @SELECT )
DECLARE @LEN_WHERE int
SET @LEN_WHERE = LEN( @WHERE )
DECLARE @LEN_TOTAL int
SET @LEN_TOTAL = @LEN_SELECT + @LEN_WHERE
IF @LEN_TOTAL > 4000 BEGIN
-- RAISE SOME CONCRETE ERROR, BECAUSE DYNAMIC SQL ACCEPTS MAX 4000 chars
END
DECLARE @SQL nvarchar(4000)
SET @SQL = @SELECT + @WHERE
EXEC sp_executesql @SQL
END
But, I think the limitation of "4000" characters for one query is ugly.
The next suggestion is using filter tables for every column. Insert filter values into the filter table and then call stored procedure with ID of filters:
CREATE TABLE tblFilter (
PKID uniqueidentifier -- PK
, IDFilter uniqueidentifier -- identification of filter
, FilterType tinyint -- 0 = ignore, 1 = equals, 2 = not equals, 3 = greater than, etc ...
, BitValue bit , TinyIntValue tinyint , SmallIntValue smallint, IntValue int
, BigIntValue bigint, DecimalValue decimal(19,4), NVarCharValue nvarchar(4000)
, GuidValue uniqueidentifier, etc ... )
CREATE PROC Read_Product ( @Filter_ProductID uniqueidentifier, @Filter_IDProductType uniqueidentifier, @Filter_ProductName uniqueidentifier, ... etc ... ) AS BEGIN
SELECT ProductID, IDProductType, ProductName, ProductCode, Weight, Volume
FROM tblProduct
WHERE ( @Filter_ProductID IS NULL
OR ( ( ProductID IN ( SELECT GuidValue FROM tblFilter WHERE IDFilter = @Filter_ProductID AND FilterType = 1 ) ) AND NOT ( ProductID IN ( SELECT GuidValue FROM tblFilter WHERE IDFilter = @Filter_ProductID AND FilterType = 2 ) ) ) )
AND ( @Filter_IDProductType IS NULL
OR ( ( IDProductType IN ( SELECT GuidValue FROM tblFilter WHERE IDFilter = @Filter_IDProductType AND FilterType = 1 ) ) AND NOT ( IDProductType IN ( SELECT GuidValue FROM tblFilter WHERE IDFilter = @Filter_IDProductType AND FilterType = 2 ) ) ) )
AND ( @Filter_ProductName IS NULL OR ( ... etc ... ) )
END
But I think this suggestion is a bit complicated.
Is there some "best practice" to do this type of stored procedures?
For reading data, you do not need a stored procedure for security or to separate out logic, you can use views.
Just grant only select on the view.
You can limit the records shown, change field names, join many tables into one logical "table", etc.
First: for your delete routine, your where clause should only include the primary key.
Second: for your update routine, do not try to optimize before you have working code. In fact, do not try to optimize until you can profile your application and see where the bottlenecks are. I can tell you for sure that updating one column of one row and updating all columns of one row are nearly identical in speed. What takes time in a DBMS is (1) finding the disk block where you will write the data and (2) locking out other writers so that your write will be consistent. Finally, writing the code necessary to update only the columns that need to change will generally be harder to do and harder to maintain. If you really wanted to get picky, you'd have to compare the speed of figuring out which columns changed compared with just updating every column. If you update them all, you don't have to read any of them.
Third: I tend to write one stored procedure for each retrieval path. In your example, I'd make one by primary key, one by each foreign key and then I'd add one for each new access path as I needed them in the application. Be agile; don't write code you don't need. I also agree with using views instead of stored procedures, however, you can use a stored procedure to return multiple result sets (in some version of MSSQL) or to change rows into columns, which can be useful.
If you need to get, for example, 7 rows by primary key, you have some options. You can call the stored procedure that gets one row by primary key seven times. This may be fast enough if you keep the connection opened between all the calls. If you know you never need more than a certain number (say 10) of IDs at a time, you can write a stored procedure that includes a where clause like "and ID in (arg1, arg2, arg3...)" and make sure that unused arguments are set to NULL. If you decide you need to generate dynamic SQL, I wouldn't bother with a stored procedure because TSQL is just as easy to make a mistake as any other language. Also, you gain no benefit from using the database to do string manipulation -- it's almost always your bottleneck, so there is no point in giving the DB any more work than necessary.
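The fixed-size IN list described above might be sketched as follows, using the tblProduct table from the question. The procedure name Read_ProductsByIds and the choice of three slots are assumptions for illustration. An unused slot left at NULL never matches, because ProductID = NULL evaluates to unknown rather than true:

```sql
-- Hypothetical procedure; extend the parameter list to the maximum you need.
CREATE PROCEDURE dbo.Read_ProductsByIds
    @id1 uniqueidentifier = NULL,
    @id2 uniqueidentifier = NULL,
    @id3 uniqueidentifier = NULL
AS
BEGIN
    SET NOCOUNT ON;

    SELECT ProductID, IDProductType, ProductName, ProductCode, Weight, Volume
    FROM tblProduct
    -- NULL entries in the list are harmless: they can never compare equal.
    WHERE ProductID IN (@id1, @id2, @id3);
END
GO
```

Calling it with no arguments returns no rows, since every slot is NULL.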
I disagree that creating Insert/Update/Select stored procedures is a "best practice". Unless your entire application is written in SPs, use a database layer in your application to handle these CRUD activities. Better yet, use an ORM technology to handle them for you.
My suggestion is that you don't try to create a stored procedure that does everything that you might now or ever need to do. If you need to retrieve a row based on the table's primary key then write a stored procedure to do that. If you need to search for all rows meeting a set of criteria then find out what that criteria might be and write a stored procedure to do that.
If you try to write software that solves every possible problem rather than a specific set of problems you will usually fail at providing anything useful.
your select stored procedure can be done as follows to require only one stored proc but any number of different items in the where clause. Pass in any one or combination of the parameters and you will get ALL items which match - so you only need one stored proc.
CREATE PROCEDURE sp_ProductSelect
(
@ProductID int = null,
@IDProductType int = null,
@ProductName varchar(50) = null,
@ProductCode varchar(10) = null,
...
@Volume int = null
)
AS
SELECT ProductID, IDProductType, ProductName, ProductCode, Weight, Volume FROM tblProduct
Where
((@ProductID is null) or (ProductID = @ProductID)) AND
((@ProductName is null) or (ProductName = @ProductName)) AND
...
((@Volume is null) or (Volume = @Volume))
SQL Server 2005 supports nvarchar(max), which can hold up to 2 GB and accepts virtually all the string operations available on regular nvarchar. You may want to test whether this fits what you need in the first approach.
I'm not sure if this is something I should do in T-SQL or not, and I'm pretty sure using the word 'iterate' was wrong in this context, since you should never iterate anything in sql. It should be a set based operation, correct? Anyway, here's the scenario:
I have a stored proc that returns many uniqueidentifiers (single column results). These ids are the primary keys of records in a another table. I need to set a flag on all the corresponding records in that table.
How do I do this without the use of cursors? Should be an easy one for you sql gurus!
This may not be the most efficient, but I would create a temp table to hold the results of the stored proc and then use that in a join against the target table. For example:
CREATE TABLE #t (uniqueid uniqueidentifier)
INSERT INTO #t EXEC p_YourStoredProc
UPDATE a
SET FlagColumn = 1
FROM TargetTable a JOIN #t b
ON a.uniqueid = b.uniqueid
DROP TABLE #t
You could also change your stored proc to a user-defined function that returns a table with your uniqueidentifiers. You can join directly to the UDF and treat it like a table, which avoids having to create the extra temp table explicitly. Also, you can pass parameters into the function as you're calling it, making this a very flexible solution.
CREATE FUNCTION dbo.udfGetUniqueIDs
()
RETURNS TABLE
AS
RETURN
(
SELECT uniqueid FROM dbo.SomeWhere
)
GO
UPDATE a
SET FlagColumn = 1
FROM dbo.TargetTable a INNER JOIN dbo.udfGetUniqueIDs() b
ON a.uniqueid = b.uniqueid
Edit:
This will work on SQL Server 2000 and up...
Insert the results of the stored proc into a temporary table and join this to the table you want to update:
INSERT INTO #WorkTable
EXEC usp_WorkResults
UPDATE DataTable
SET Flag = Whatever
FROM DataTable
INNER JOIN #WorkTable
ON DataTable.[Key] = #WorkTable.[Key]
If you upgrade to SQL 2008 then you can pass table parameters I believe. Otherwise, you're stuck with a global temporary table or creating a permanent table that includes a column for some sort of process ID to identify which call to the stored procedure is relevant.
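A rough sketch of the table-valued parameter route on SQL Server 2008 or later. The type name dbo.IdList and the procedure name dbo.FlagRows are invented for illustration; TargetTable, FlagColumn, uniqueid, and p_YourStoredProc are borrowed from the earlier answer:

```sql
-- Hypothetical table type holding the IDs returned by the first procedure.
CREATE TYPE dbo.IdList AS TABLE (uniqueid uniqueidentifier);
GO

-- Hypothetical procedure that flags every row whose key is in the list.
CREATE PROCEDURE dbo.FlagRows
    @ids dbo.IdList READONLY  -- table-valued parameters must be READONLY
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE a
    SET FlagColumn = 1
    FROM dbo.TargetTable a
    JOIN @ids b ON a.uniqueid = b.uniqueid;
END
GO

-- Usage: capture the IDs, then hand them over in one call.
DECLARE @ids dbo.IdList;
INSERT INTO @ids (uniqueid) EXEC p_YourStoredProc;
EXEC dbo.FlagRows @ids = @ids;
```

This keeps the update set-based and avoids a global temporary table.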
How much room do you have in changing the stored procedure that generates the IDs? You could add code in there to handle it or have a parameter that lets you optionally flag the rows when it is called.
Use temporary tables or a table variable (you are using SS2005).
Although, that's not nestable: if a stored proc uses that method, then you can't dump its output into a temp table.
An ugly solution would be to have your procedure return the "next" id each time it is called by using the other table (or some flag on the existing table) to filter out the rows that it has already returned
You can use a temp table or table variable with an additional column:
DECLARE @MyTable TABLE (
Column1 uniqueidentifier,
...,
Checked bit
)
INSERT INTO @MyTable
SELECT [...], 0 FROM MyTable WHERE [...]
DECLARE @Continue bit
SET @Continue = 1
WHILE (@Continue = 1)
BEGIN
SET @var1 = NULL -- reset so that an empty result ends the loop
SELECT TOP 1 @var1 = Column1,
@var2 = Column2,
...
FROM @MyTable
WHERE Checked = 0 -- pick a row that has not been processed yet
IF @var1 IS NULL
SET @Continue = 0
ELSE
BEGIN
...
UPDATE @MyTable SET Checked = 1 WHERE Column1 = @var1
END
END
Edit: Actually, in your situation a join will be better; the code above is a cursorless iteration, which is overkill for your situation.