Use INNER JOIN only if temporary table has values - sql

I have a dynamic query where I first create a temporary table and then fill it:
CREATE TABLE [#SearchKeys] ([DesignKey] INT);
INSERT INTO [#SearchKeys]
SELECT
[pd].[DesignKey]
FROM
[Project] AS [p]
....
Once it has data, I use it in the INNER JOIN section of my dynamic query, like this:
INNER JOIN
(SELECT DesignKey FROM #SearchKeys) AS [S] ON [S].[DesignKey] = [PD].[DesignKey]
The problem is that I only want to add this INNER JOIN if the temporary table has values; if it is empty, the join should not be executed at all. How can I achieve that? Regards

I don't see the dynamic query in your question, so here is just pseudocode:
--Check whether any data exists in #SearchKeys
IF EXISTS (SELECT 1 FROM #SearchKeys) --true when the temp table has at least one row
BEGIN
-- query with the INNER JOIN
INNER JOIN (SELECT DesignKey FROM #SearchKeys) AS [S] ON [S].[DesignKey] = [PD].[DesignKey]
END
ELSE
BEGIN
-- query without the INNER JOIN
END
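Since the query is already dynamic, the same EXISTS check can decide whether the join text is appended at all. The following is only a sketch: #ProjectDesign and its Name column are hypothetical stand-ins, because the real query is not shown in the question.

```sql
-- Hypothetical stand-in for the table aliased [PD] in the question.
CREATE TABLE #ProjectDesign (DesignKey INT, Name VARCHAR(50));
INSERT INTO #ProjectDesign VALUES (1, 'A'), (2, 'B'), (3, 'C');

CREATE TABLE #SearchKeys (DesignKey INT);
INSERT INTO #SearchKeys VALUES (2);   -- comment this out to run without the filter

DECLARE @sql NVARCHAR(MAX) =
    N'SELECT [PD].[DesignKey], [PD].[Name] FROM #ProjectDesign AS [PD]';

-- Append the join only when there is at least one search key.
IF EXISTS (SELECT 1 FROM #SearchKeys)
    SET @sql += N' INNER JOIN #SearchKeys AS [S] ON [S].[DesignKey] = [PD].[DesignKey]';

-- Temp tables created in this session are visible inside the dynamic batch.
EXEC sys.sp_executesql @sql;
```

With the key inserted, only the row for DesignKey 2 comes back; with an empty #SearchKeys the join is never added and all rows are returned.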

My suggestion, which is fully inlined, was this:
DECLARE @mockupTable TABLE(ID INT IDENTITY,Content INT);
INSERT INTO @mockupTable VALUES(10),(20),(30);
DECLARE @SearchKeys TABLE(DesignKey INT);
--Keep it empty in the first run, then uncomment the insert to see the difference
--INSERT INTO @SearchKeys VALUES(20)
SELECT *
FROM @mockupTable t
LEFT JOIN @SearchKeys k ON t.Content=k.DesignKey
WHERE ((SELECT COUNT(*) FROM @SearchKeys)=0 OR DesignKey IS NOT NULL);
The LEFT JOIN returns all rows in any case. The WHERE clause then checks whether the SearchKeys table contains any filter keys; if it does, only rows with a corresponding key are returned.
Hint: If needed, you can easily turn your keys into an anti-pattern (an exclusion list) by using IS NULL instead of IS NOT NULL. In that case you'd introduce a variable and use something like OR ((@antipattern=0 AND ...) OR (@antipattern=1 AND ...))
The other answer by Developer_29 will be better optimized and thus faster, but in many cases we don't want multi-statement approaches.
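The include/exclude switch from the hint could be sketched as follows. The variable name @antipattern comes from the hint; its BIT type and the way the elided conditions are filled in are assumptions.

```sql
DECLARE @mockupTable TABLE(ID INT IDENTITY, Content INT);
INSERT INTO @mockupTable VALUES (10),(20),(30);

DECLARE @SearchKeys TABLE(DesignKey INT);
INSERT INTO @SearchKeys VALUES (20);

-- Assumed switch: 0 = keep rows that match a key, 1 = keep rows that match no key.
DECLARE @antipattern BIT = 1;

SELECT t.ID, t.Content
FROM @mockupTable t
LEFT JOIN @SearchKeys k ON t.Content = k.DesignKey
WHERE NOT EXISTS (SELECT 1 FROM @SearchKeys)            -- no keys: no filtering
   OR (@antipattern = 0 AND k.DesignKey IS NOT NULL)    -- inclusion list
   OR (@antipattern = 1 AND k.DesignKey IS NULL);       -- exclusion list
```

With the key 20 and @antipattern = 1, the rows with Content 10 and 30 remain; setting @antipattern = 0 keeps only the row with Content 20.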

Related

Where am I going wrong with this SQL query?

I am attempting to do the following:
Check to see if the table does not exist and if so, create the TABLE 'tmpTriangleTransfer'.
Check to see if the table exists and if so, DROP the TABLE 'tmpTriangleTransfer'.
Insert the data being pulled from the other tables into the 2nd - 5th columns of the TABLE 'tmpTriangleTransfer'.
Loop and for each row that exists in the TABLE 'tmpTriangleTransfer' update the 1st column with the declared information.
Return all of the information from that table (to be formatted into a report).
Can someone please help me figure out what I am doing wrong? I'm getting no results even though I know for a fact there are records (when I run just the SELECT statement on the last line, it shows records and when I run the SELECT DISTINCT statement in the middle, it shows the same records).
IF OBJECT_ID('tmpTriangleTransfer') IS NOT NULL
DROP TABLE tmpTriangleTransfer;
IF OBJECT_ID('tmpTriangleTransfer') IS NULL
CREATE TABLE tmpTriangleTransfer
(
CompanyName varchar(max),
OrderID decimal(19,2) NULL,
DriverID int NULL,
VehicleID int NULL,
Phone varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
BOL varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
);
INSERT INTO tmpTriangleTransfer (OrderID, BOL, DriverID, VehicleID, Phone)
SELECT DISTINCT tblOrder.OrderID AS OrderID, tblOrder.BOL AS BOL, tblOrderDrivers.DriverID AS DriverID, tblDrivers.VehicleID AS VehicleID, tblWorker.Phone AS Phone
FROM tblOrder WITH (NOLOCK)
INNER JOIN tblActiveOrders
ON tblOrder.OrderID = tblActiveOrders.OrderID
INNER JOIN tblOrderDrivers
ON tblOrder.OrderID = tblOrderDrivers.OrderID
INNER JOIN tblDrivers
ON tblOrderDrivers.DriverID = tblDrivers.DriverID
INNER JOIN tblWorker
ON tblDrivers.WorkerID = tblWorker.WorkerID
WHERE tblOrder.CustID = 7317
ORDER BY tblOrder.OrderID
DECLARE @MaxRownum INT
SET @MaxRownum = (SELECT MAX(OrderID) FROM tmpTriangleTransfer)
DECLARE @Iter INT
SET @Iter = (SELECT MIN(OrderID) FROM tmpTriangleTransfer)
WHILE @Iter <= @MaxRownum
BEGIN
UPDATE tmpTriangleTransfer
SET tmpTriangleTransfer.CompanyName = 'Triangle'
WHERE tmpTriangleTransfer.CompanyName IS NULL;
SET @Iter = @Iter + 1
END
SELECT * from tmpTriangleTransfer WITH (NOLOCK)
Your existing query is far too complicated. In fact, you don't need a temporary table, the WHILE loop, or anything - just a single SELECT is all you need:
SELECT
'Triangle' AS CompanyName,
tblOrder.OrderId,
tblOrder.BOL,
tblOrderDrivers.DriverID,
tblDrivers.VehicleID,
tblWorker.Phone
FROM
tblOrder
LEFT OUTER JOIN tblActiveOrders ON tblOrder.OrderID = tblActiveOrders.OrderID
LEFT OUTER JOIN tblOrderDrivers ON tblOrder.OrderID = tblOrderDrivers.OrderID
LEFT OUTER JOIN tblDrivers ON tblOrderDrivers.DriverID = tblDrivers.DriverID
LEFT OUTER JOIN tblWorker ON tblDrivers.WorkerID = tblWorker.WorkerID
WHERE
tblOrder.CustID = 7317
ORDER BY
tblOrder.OrderID
I've changed your query to use LEFT OUTER JOIN instead of INNER JOIN because I suspect this is the main reason for no data being returned. INNER JOIN requires matching rows to exist in both tables (relations), and I suspect that you have Orders without Drivers or that not every Order is in ActiveOrders. Change the joins back to INNER JOIN if you know that related rows will always be present.
You can return literals in queries directly, like I'm doing in the SELECT 'Triangle' AS CompanyName part, whereas you were seemingly manually adding it to the output temporary-table.
Your code didn't seem to be doing anything that would require the WITH (NOLOCK) modifier - the fact it was repeated everywhere makes it look like a case of Cargo-Cult Programming.
Tip: In SQL, a SELECT statement, as written, is not representative of its logical execution order. It should instead be read in this order: FROM > WHERE > [GROUP BY >] SELECT > ORDER BY.
This is why in .NET Linq the .Select() call is often at the end, not the beginning, because previous Linq expressions define the data sources.
This query can be parameterised by converting it to a table-valued function that accepts CustID as a parameter. I also assume you have the company name "Triangle" stored in a table somewhere; embedding it as a literal value for a single query is a code smell. What's so special about 7317 / "Triangle"?
Related note: Generally speaking, queries that only SELECT data (and don't perform any INSERT/UPDATE/DELETE/ALTER/CREATE statements) should be Table-valued UDFs or Views and not Stored Procedures - so that they can benefit from function-composition, query-composition and runtime execution plan optimizations that you cannot get with Stored Procedures.
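As a rough illustration of the parameterised form, here is a sketch of an inline table-valued function. The Demo* table and function names are hypothetical, cut-down stand-ins for the question's schema, and GO is the batch separator used by SSMS and sqlcmd (CREATE FUNCTION has to be the first statement in its batch).

```sql
-- Hypothetical, cut-down versions of the question's tables (demo only).
CREATE TABLE dbo.DemoOrder (OrderID INT PRIMARY KEY, CustID INT, BOL VARCHAR(50));
CREATE TABLE dbo.DemoOrderDrivers (OrderID INT, DriverID INT);
INSERT INTO dbo.DemoOrder VALUES (1, 7317, 'BOL-1'), (2, 7317, 'BOL-2'), (3, 42, 'BOL-3');
INSERT INTO dbo.DemoOrderDrivers VALUES (1, 100);
GO

-- Inline table-valued function: the customer is a parameter, not a literal.
CREATE FUNCTION dbo.DemoOrdersForCustomer (@CustID INT)
RETURNS TABLE
AS
RETURN
(
    SELECT o.OrderID, o.BOL, od.DriverID
    FROM dbo.DemoOrder AS o
    LEFT OUTER JOIN dbo.DemoOrderDrivers AS od ON od.OrderID = o.OrderID
    WHERE o.CustID = @CustID
);
GO

-- The function composes like a table: it can be filtered, joined or sorted further.
SELECT OrderID, BOL, DriverID
FROM dbo.DemoOrdersForCustomer(7317)
ORDER BY OrderID;
```

For customer 7317 this returns orders 1 and 2; order 2 has no driver row, so its DriverID is NULL.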
If you're able to, see if you can remove the tbl prefix from the table names. (Using "tbl" as a prefix has its defenders, but my own personal opinion is that it's an obsolete developer aid, as today's database tooling shows type information, and it makes database refactoring harder, e.g. converting a table to a view.)
Taken from a combination of the suggestion from Dai and the requirements of my employer:
SELECT 'Triangle' AS CompanyName, tblOrder.OrderId AS OrderID, tblOrder.BOL AS BOL, tblOrderDrivers.DriverID AS DriverID, tblDrivers.VehicleID AS VehicleID, tblWorker.Phone AS Phone
FROM tblOrder WITH (NOLOCK)
INNER JOIN tblActiveOrders WITH (NOLOCK)
ON tblOrder.OrderID = tblActiveOrders.OrderID
INNER JOIN tblOrderDrivers WITH (NOLOCK)
ON tblOrder.OrderID = tblOrderDrivers.OrderID
INNER JOIN tblDrivers WITH (NOLOCK)
ON tblOrderDrivers.DriverID = tblDrivers.DriverID
INNER JOIN tblWorker WITH (NOLOCK)
ON tblDrivers.WorkerID = tblWorker.WorkerID
WHERE
tblOrder.CustID = 7317
ORDER BY
tblOrder.OrderID desc

SQL IN Clause replacement Temp table

My query is :
Select *
from Person
where KV_CODE IN('','',''......1000 values here)
How can I write this list of values to a temp table so that I can replace the IN clause with a join to increase performance?
The list of values comes from an arbitrary selection made by the user and is stored in a Java collection.
Select * from Person P INNER JOIN #Temp T ON T.KV_CODE = P.KV_CODE
You can apply an INNER JOIN to a temp table just as you would to a normal table.
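To make the join above concrete, here is a minimal sketch that assumes SQL Server (as the #Temp name suggests). #Person and the sample codes are hypothetical stand-ins for the real Person table and the user's selection.

```sql
-- Hypothetical stand-in for the Person table from the question.
CREATE TABLE #Person (PersonId INT, KV_CODE VARCHAR(20));
INSERT INTO #Person VALUES (1, 'K1'), (2, 'K2'), (3, 'K3');

-- One row per selected value; the primary key gives the optimizer a unique index.
CREATE TABLE #Temp (KV_CODE VARCHAR(20) PRIMARY KEY);

-- A single VALUES list accepts at most 1000 rows in SQL Server,
-- so a larger selection would be sent as several INSERT statements.
INSERT INTO #Temp (KV_CODE) VALUES ('K1'), ('K3');

SELECT P.PersonId, P.KV_CODE
FROM #Person AS P
INNER JOIN #Temp AS T ON T.KV_CODE = P.KV_CODE;
```

From Java, the inserts would typically be issued as a batched prepared statement on the same connection that later runs the SELECT, since a #temp table is only visible to the session that created it.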
select *
from Person
where KV_CODE in (
select KV_CODE
from TempTable
)
This will compare the KV_CODE of Person to the KV_CODE of TempTable. If there are matches, it will intersect them, meaning your first select will print only those rows.
select *
from Person p
join TempTable tmp
on p.KV_CODE = tmp.KV_CODE
This will join the two tables and display the matching rows from both tables.
That is, unless your question is really "how do I populate a temporary table from a Java collection?"
Newer database engines already optimize this. There is not much difference between an IN clause and a join, because the optimizer works this out and executes the query along the appropriate path.
If you still wish to do it, here is what you have to do.
If you are using Oracle:
You can create a session-scoped or transaction-scoped temporary table.
Insert the data into this temporary table.
Run a query like
SELECT *
FROM PERSONS T1,
TEMP_TABLE_NAME_HERE T2
WHERE T1.KV_CODE = T2.KV_CODE
However, a better solution is to avoid temp table creation during run time. You can actually create a permanent table in your application database with some unique key to identify your session (to isolate cross session impacts, i.e 2 users using same functionality).
CREATE TABLE KV_TEMP_TABLE
(SESSION_ID VARCHAR2(100),
KV_CODE VARCHAR2(100)
);
--Adjust datatypes suitably
Then your query should look like
SELECT * FROM PERSONS T1, KV_TEMP_TABLE T2
WHERE T2.SESSION_ID = ?
AND T1.KV_CODE = T2.KV_CODE
--Bind your session id
Before issuing this query, you have to populate data into KV_TEMP_TABLE using normal insert statements (along with a session id).
Use this example to populate a table variable:
DECLARE @t TABLE
(
EmployeeID INT,
Certs VARCHAR(8000)
)
INSERT @t VALUES (1,'B.E.,MCA, MCDBA, PGDCA'), (2,'M.Com.,B.Sc.'), (3,'M.Sc.,M.Tech.')
SELECT EmployeeID,
LTRIM(RTRIM(m.n.value('.[1]','varchar(8000)'))) AS Certs
FROM
(
SELECT EmployeeID,CAST('<XMLRoot><RowData>' + REPLACE(Certs,',','</RowData><RowData>') + '</RowData></XMLRoot>' AS XML) AS x
FROM @t
)t
CROSS APPLY x.nodes('/XMLRoot/RowData')m(n)
Once the table variable @t is populated, it can be used with JOINs to get the desired output.
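On SQL Server 2016 or later (database compatibility level 130 or higher), the built-in STRING_SPLIT function may be a simpler alternative to the XML trick. This sketch swaps in that different technique and reuses the sample data from the example above.

```sql
DECLARE @t TABLE (EmployeeID INT, Certs VARCHAR(8000));
INSERT @t VALUES (1,'B.E.,MCA, MCDBA, PGDCA'), (2,'M.Com.,B.Sc.'), (3,'M.Sc.,M.Tech.');

-- STRING_SPLIT returns one row per list element in a column named value.
SELECT t.EmployeeID,
       LTRIM(RTRIM(s.value)) AS Certs
FROM @t AS t
CROSS APPLY STRING_SPLIT(t.Certs, ',') AS s;
```

Unlike the XML version, this does not break when a value contains characters such as & or <.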

SQL Server Table-Value Function and Except Combination performance

I have a table (Resources with about 18000 records) and a Table-Value Function with this body :
ALTER FUNCTION [dbo].[tfn_GetPackageResources]
(
@packageId int=null,
@resourceTypeId int=null,
@resourceCategoryId int=null,
@resourceGroupId int=null,
@resourceSubGroupId int=null
)
RETURNS TABLE
AS
RETURN
(
SELECT Resources.*
FROM Resources
INNER JOIN ResourceSubGroups ON Resources.ResourceSubGroupId=ResourceSubGroups.Id
INNER JOIN ResourceGroups ON ResourceSubGroups.ResourceGroupId=ResourceGroups.Id
INNER JOIN ResourceCategories ON ResourceGroups.ResourceCategoryId=ResourceCategories.Id
INNER JOIN ResourceTypes ON ResourceCategories.ResourceTypeId=ResourceTypes.Id
WHERE
(@resourceSubGroupId IS NULL OR ResourceSubGroupId=@resourceSubGroupId) AND
(@resourceGroupId IS NULL OR ResourceGroupId=@resourceGroupId) AND
(@resourceCategoryId IS NULL OR ResourceCategoryId=@resourceCategoryId) AND
(@resourceTypeId IS NULL OR ResourceTypeId=@resourceTypeId) AND
(@packageId IS NULL OR PackageId=@packageId)
)
Now I run a query like this:
SELECT id
FROM dbo.tfn_GetPackageResources(@sourcePackageId,null,null,null,null)
WHERE id not in(
SELECT a.Id
FROM dbo.tfn_GetPackageResources(@sourcePackageId,null,null,null,null) a INNER JOIN
dbo.tfn_GetPackageResources(@comparePackageId,null,null,null,null) b
ON a.No = b.No AND
a.UnitCode=b.UnitCode AND
a.IsCompound=b.IsCompound AND
a.Title=b.Title
)
This query takes about 10 seconds! Each part of the query runs extremely fast on its own, but the whole query is slow. I also tried it with LEFT JOIN and NOT EXISTS, but the result was the same.
However, if I run the query on the Resources table directly, it takes one second or less. The fast query is:
select * from resources where id not in (select id from resources)
How can I solve it?
Your UDF is expanded like a macro.
So your complete query has
9 INNER JOINs in the IN clause
4 INNER JOINs in the main SELECT.
You apply (... IS NULL OR ...) 15 times in total for each of your WHERE clauses.
Your idea of clever code reuse fails because of this expansion. SQL does not usually lend itself to this reuse.
Keep it simple:
SELECT
R.id
FROM
Resources R
WHERE
R.PackageId = @sourcePackageId
AND
R.id not in (
SELECT a.Id
FROM Resources a
INNER JOIN
Resources b
ON a.No = b.No AND
a.UnitCode=b.UnitCode AND
a.IsCompound=b.IsCompound AND
a.Title=b.Title
WHERE
a.PackageId = @sourcePackageId
AND
b.PackageId = @comparePackageId
)
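The same comparison can also be written with NOT EXISTS, which needs one self-join fewer. This is a variant of the query above and not the answer's exact form; the table variable and its sample rows are hypothetical, and only the columns used in the comparison are included.

```sql
-- Hypothetical sample data with only the columns the comparison uses.
DECLARE @Resources TABLE (Id INT PRIMARY KEY, PackageId INT, [No] INT,
                          UnitCode VARCHAR(10), IsCompound BIT, Title VARCHAR(50));
INSERT INTO @Resources VALUES
    (1, 1, 100, 'm',  0, 'Cable'),   -- has a counterpart in package 2
    (2, 1, 101, 'kg', 0, 'Cement'),  -- exists only in package 1
    (3, 2, 100, 'm',  0, 'Cable');

DECLARE @sourcePackageId INT = 1, @comparePackageId INT = 2;

-- Source-package resources with no matching resource in the compare package.
SELECT a.Id
FROM @Resources AS a
WHERE a.PackageId = @sourcePackageId
  AND NOT EXISTS (
        SELECT 1
        FROM @Resources AS b
        WHERE b.PackageId = @comparePackageId
          AND b.[No] = a.[No]
          AND b.UnitCode = a.UnitCode
          AND b.IsCompound = a.IsCompound
          AND b.Title = a.Title);
```

On this sample only Id 2 is returned. Whether it is faster than the NOT IN form on the real table is something to check in the execution plan.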
For more, see my other answers here:
Why is a UDF so much slower than a subquery?
Profiling statements inside a User-Defined Function
Does query plan optimizer works well with joined/filtered table-valued functions?
Table Valued Function where did my query plan go?
In your function, declare the type of the table it returns, and include a primary key. This way, the ID filter will be able to look up the IDs more efficiently.
See http://msdn.microsoft.com/en-us/library/ms191165(v=sql.105).aspx for the syntax.
One thing you could try is to break the complicated query into several simple ones that store their results in temporary tables. That way one complicated execution plan is replaced by several simple plans whose total execution time might be shorter than that of the complicated plan:
SELECT *
INTO #temp1
FROM dbo.tfn_GetPackageResources(@sourcePackageId,null,null,null,null)
SELECT *
INTO #temp2
FROM dbo.tfn_GetPackageResources(@comparePackageId,null,null,null,null)
SELECT a.Id
INTO #ids
FROM #temp1 a
INNER JOIN
#temp2 b ON
a.No = b.No
AND a.UnitCode=b.UnitCode
AND a.IsCompound=b.IsCompound
AND a.Title=b.Title
SELECT id
FROM #temp1
WHERE id not in(
SELECT Id
FROM #ids
)
-- you can also try replacing the above query with this one if it performs faster
SELECT id
FROM #temp1 t
WHERE NOT EXISTS
(
SELECT Id FROM #ids i WHERE i.Id = t.id
)

Converting a nested sql where-in pattern to joins

I have a query that is returning the correct data to me, but being a developer rather than a DBA I'm wondering if there is any reason to convert it to joins rather than nested selects and if so, what it would look like.
My code currently is
select * from adjustments where store_id in (
select id from stores where original_id = (
select original_id from stores where name ='abcd'))
Any references to the better use of joins would be appreciated too.
Besides any likely performance improvements, I find the following much easier to read.
SELECT *
FROM adjustments a
INNER JOIN stores s ON s.id = a.store_id
INNER JOIN stores s2 ON s2.original_id = s.original_id
WHERE s2.name = 'abcd'
Test script showing my original fault in omitting original_id
DECLARE @Adjustments TABLE (store_id INTEGER)
DECLARE @Stores TABLE (id INTEGER, name VARCHAR(32), original_id INTEGER)
INSERT INTO @Adjustments VALUES (1), (2), (3)
INSERT INTO @Stores VALUES (1, 'abcd', 1), (2, '2', 1), (3, '3', 1)
/*
OP's Original statement returns store_id's 1, 2 & 3
due to original_id being all the same
*/
SELECT * FROM @Adjustments WHERE store_id IN (
SELECT id FROM @Stores WHERE original_id = (
SELECT original_id FROM @Stores WHERE name ='abcd'))
/*
Faulty first attempt with removing original_id from the equation
only returns store_id 1
*/
SELECT a.store_id
FROM @Adjustments a
INNER JOIN @Stores s ON s.id = a.store_id
WHERE s.name = 'abcd'
If you would use joins, it would look like this:
select *
from adjustments
inner join stores on stores.id = adjustments.store_id
inner join stores as stores2 on stores2.original_id = stores.original_id
where stores2.name = 'abcd'
(Apparently you can omit the second SELECT on the stores table (I left it out of my query) because if I'm interpreting your table structure correctly,
select id from stores where original_id = (select original_id from stores where name ='abcd')
is the same as
select * from stores where name ='abcd'.)
--> edited my query back to the original form, thanks to Lieven for pointing out my mistake in his answer!
I prefer using joins, but for simple queries like that, there is normally no performance difference. SQL Server treats both queries the same internally.
If you want to be sure, you can look at the execution plan.
If you run both queries together, SQL Server will also tell you which query took more resources than the other (in percent).
A slightly different approach:
select * from adjustments a where exists
(select null from stores s1, stores s2
where a.store_id = s1.id and s1.original_id = s2.original_id and s2.name ='abcd')
As Microsoft says here:
Many Transact-SQL statements that include subqueries can be
alternatively formulated as joins. Other questions can be posed only
with subqueries. In Transact-SQL, there is usually no performance
difference between a statement that includes a subquery and a
semantically equivalent version that does not. However, in some cases
where existence must be checked, a join yields better performance.
Otherwise, the nested query must be processed for each result of the
outer query to ensure elimination of duplicates. In such cases, a join
approach would yield better results.
Your case is exactly one where a join and a subquery give the same performance.
An example where a subquery cannot be converted to a "simple" JOIN:
select Country.Country,TR_Country.Name as Country_Translated_Name,TR_Country.Language_Code
from Country
JOIN TR_Country ON Country.Country=Tr_Country.Country
where Country.Country =
(select top 1 country
from Northwind.dbo.Customers C
join
Northwind.dbo.Orders O
on C.CustomerId = O.CustomerID
group by country
order by count(*))
As you can see, every country can have several name translations, so we cannot just join and count records (in that case, countries with more translations would have higher record counts).
Of course, you can transform this example to:
JOIN with derived table
CTE
but that is another story :-)

SQL Server Update with Inner Join

I have 3 tables (simplified):
tblOrder(OrderId INT)
tblVariety(VarietyId INT,Stock INT)
tblOrderItem(OrderId,VarietyId,Quantity INT)
If I place an order, I drop the stock level using this:
UPDATE tblVariety
SET tblVariety.Stock = tblVariety.Stock - tblOrderItem.Quantity
FROM tblVariety
INNER JOIN tblOrderItem ON tblVariety.VarietyId = tblOrderItem.VarietyId
INNER JOIN tblOrder ON tblOrderItem.OrderId = tblOrder.OrderId
WHERE tblOrder.OrderId = 1
All fine, until there are two rows in tblOrderItem with the same VarietyId for the same OrderId. In this case, only one of the rows is used for the stock update. It seems to be doing a GROUP BY VarietyId in there somehow.
Can anyone shed some light? Many thanks.
My guess is that, because you have shown us a simplified schema, some info is missing that would explain why you have repeated VarietyID values for a given OrderID.
When you have multiple rows, SQL Server will arbitrarily pick one of them for the update.
If this is the case, you need to group first:
UPDATE V
SET
Stock = Stock - foo.SumQuantity
FROM
tblVariety V
JOIN
(SELECT SUM(Quantity) AS SumQuantity, VarietyID
FROM tblOrderItem
JOIN tblOrder ON tblOrderItem.OrderId = tblOrder.OrderId
WHERE tblOrder.OrderId = 1
GROUP BY VarietyID
) foo ON V.VarietyId = foo.VarietyId
If not, then the OrderItems table PK is wrong, because it allows duplicate OrderID/VarietyID combinations. (The PK should be OrderID/VarietyID, or these should be constrained unique.)
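If duplicates are not supposed to occur, a composite key would reject them at insert time. This is a minimal sketch on a temp table; #OrderItem is a hypothetical stand-in for tblOrderItem.

```sql
-- Hypothetical stand-in for tblOrderItem with a composite primary key.
CREATE TABLE #OrderItem (
    OrderId   INT NOT NULL,
    VarietyId INT NOT NULL,
    Quantity  INT NOT NULL,
    PRIMARY KEY (OrderId, VarietyId)
);

INSERT INTO #OrderItem VALUES (1, 10, 2);

BEGIN TRY
    -- Same OrderId/VarietyId pair again: violates the primary key.
    INSERT INTO #OrderItem VALUES (1, 10, 3);
END TRY
BEGIN CATCH
    PRINT ERROR_MESSAGE();
END CATCH;
```

With the key in place, the application has to add to the existing row's Quantity instead of inserting a second row for the same variety.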
From the documentation for UPDATE:
The results of an UPDATE statement are
undefined if the statement includes a
FROM clause that is not specified in
such a way that only one value is
available for each column occurrence
that is updated (in other words, if
the UPDATE statement is not
deterministic). For example, given the
UPDATE statement in the following
script, both rows in table s meet the
qualifications of the FROM clause in
the UPDATE statement, but it is
undefined which row from s is used to
update the row in table t.
CREATE TABLE s (ColA INT, ColB DECIMAL(10,3))
GO
CREATE TABLE t (ColA INT PRIMARY KEY, ColB DECIMAL(10,3))
GO
INSERT INTO s VALUES(1, 10.0)
INSERT INTO s VALUES(1, 20.0)
INSERT INTO t VALUES(1, 0.0)
GO
UPDATE t
SET t.ColB = t.ColB + s.ColB
FROM t INNER JOIN s ON (t.ColA = s.ColA)
GO
You are doing an update. It will update once.
Edit: to solve, you can add in a subquery that will group your orderitems by orderid and varietyid, with a sum on the amount.
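The grouped update can be tried end to end with table variables. The names mirror the question's tables, but the sample rows and the starting stock of 100 are made up for illustration.

```sql
DECLARE @tblVariety TABLE (VarietyId INT PRIMARY KEY, Stock INT);
DECLARE @tblOrderItem TABLE (OrderId INT, VarietyId INT, Quantity INT);

INSERT INTO @tblVariety VALUES (10, 100);
-- Two rows for the same variety in the same order.
INSERT INTO @tblOrderItem VALUES (1, 10, 2), (1, 10, 3);

-- Aggregate first, so each variety joins to exactly one row.
UPDATE V
SET Stock = V.Stock - I.SumQuantity
FROM @tblVariety AS V
INNER JOIN (
    SELECT OrderId, VarietyId, SUM(Quantity) AS SumQuantity
    FROM @tblOrderItem
    WHERE OrderId = 1
    GROUP BY OrderId, VarietyId
) AS I ON I.VarietyId = V.VarietyId;

SELECT VarietyId, Stock FROM @tblVariety;
```

Both order lines are now counted, so the stock drops by 5 (from 100 to 95) instead of by only one line's quantity.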