I need some help with this query. I just want to know whether what I am doing is fine or whether I need a JOIN to improve it. Sorry if this is a silly question, but I am a little worried because I query the same table three times. Thanks in advance.
Select *
from TableA
where (A_id in (1, 2, 3, 4)
and flag = 'Y') or
(A_id in
(select A_id from TableB
where A_id in
(Select A_id from TableA
where (A_id in (1, 2, 3, 4)
and flag = 'N')
group by A_id
having sum(qty) > 0)
))
The relation between TableA and TableB is one-to-many.
Condition or logic:
If the flag is true ('Y'), the data can be selected without further checks.
If the flag is false ('N'), we have to refer to TableB to see whether the sum of the qty column is greater than 0.
Your approach is indeed way too complicated. Select from A where flag = Y or the sum of related B > 0. Do the latter in a subquery.
select *
from a
where a_id in (1,2,3,4)
and
(
flag = 'Y'
or
(select sum(qty) from b where b.a_id = a.a_id) > 0
)
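To sanity-check the query above, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The sample rows are made up for illustration, and it assumes qty lives in b (the "many" side); the table names a and b follow the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (a_id INTEGER PRIMARY KEY, flag TEXT);
    CREATE TABLE b (a_id INTEGER, qty INTEGER);
    -- Hypothetical sample data, not from the original question
    INSERT INTO a VALUES (1, 'Y'), (2, 'N'), (3, 'N'), (4, 'N'), (5, 'Y');
    INSERT INTO b VALUES (2, 5), (2, -1), (3, 0), (3, 0);
""")

query = """
    SELECT a_id, flag
    FROM a
    WHERE a_id IN (1, 2, 3, 4)
      AND (
            flag = 'Y'
            OR (SELECT SUM(qty) FROM b WHERE b.a_id = a.a_id) > 0
          )
    ORDER BY a_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Row 4 has no rows in b, so SUM(qty) is NULL and the comparison is not true; that row is left out, which matches the stated logic.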
There's nothing badly wrong with the query you've presented, but there are improvements that can be made. If you move the test for Flag='N' into your first select from TableA and correlate your select from TableB with your first select from TableA, then you can dispense with the second select from TableA:
Select *
from TableA A
where A_id in (1, 2, 3, 4)
and (flag = 'Y'
or (flag = 'N'
and A_id in (select A_id
from TableB B
where b.A_id = a.A_id
group by A_id
having sum(qty) > 0))
);
This will eliminate an extra lookup on TableA for information that should already be known. Second, since TableA.A_Id is now correlated with TableB.A_Id, the A_Id in (...) can be changed to an exists clause:
Select *
from TableA A
where A_id in (1, 2, 3, 4)
and (flag = 'Y'
or (flag = 'N'
and exists (select A_id
from TableB B
where b.A_id = a.A_id
group by A_id
having sum(qty) > 0))
);
This may (depending on the database type) inform the database's query optimizer that it can stop retrieving rows from TableB after the first row is found.
In an Oracle database on a small unindexed sample dataset these two changes shaved 25% off of the cost of the query, so the performance increases could be significant.
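If you want to convince yourself that the IN and EXISTS forms return the same rows before comparing plans, a quick sketch like this one can help. It uses Python's sqlite3 module with made-up sample data, so it says nothing about Oracle's cost numbers; it only checks that the two queries agree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (A_id INTEGER PRIMARY KEY, flag TEXT);
    CREATE TABLE TableB (A_id INTEGER, qty INTEGER);
    -- Hypothetical sample data
    INSERT INTO TableA VALUES (1, 'Y'), (2, 'N'), (3, 'N'), (4, 'N');
    INSERT INTO TableB VALUES (2, 3), (2, 4), (3, 0), (4, -2), (4, 1);
""")

# Template shared by both variants; only the membership test differs.
template = """
    SELECT A.A_id
    FROM TableA A
    WHERE A.A_id IN (1, 2, 3, 4)
      AND (A.flag = 'Y'
           OR (A.flag = 'N'
               AND {test} (SELECT B.A_id
                           FROM TableB B
                           WHERE B.A_id = A.A_id
                           GROUP BY B.A_id
                           HAVING SUM(B.qty) > 0)))
    ORDER BY A.A_id
"""
in_rows = conn.execute(template.format(test="A.A_id IN")).fetchall()
exists_rows = conn.execute(template.format(test="EXISTS")).fetchall()
print(in_rows, exists_rows)
```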
Would it be possible for you to split this query into a stored function?
For example:
DELIMITER $$
CREATE FUNCTION flaggedSelection ( my_flag varchar(1) )
RETURNS varchar(255) -- TODO: change to appropriate output
BEGIN
DECLARE return_value varchar(255); -- TODO: change to appropriate output
IF my_flag = 'Y'
THEN
-- Perform select without further checks
-- SET return_value = QUERY;
SET return_value = NULL; -- placeholder so the branch is not empty
ELSE
-- Refer to TableB to see if sum of the qty column is greater than 0
-- SET return_value = QUERY;
SET return_value = NULL; -- placeholder so the branch is not empty
END IF;
RETURN return_value;
END $$
DELIMITER ;
Below I have a SQL select to retrieve values from a table. I want to retrieve the values from tableA regardless of whether or not there are matching rows in tableB. The query below gives me both non-null values and null values. How do I filter out the null values if non-null rows exist, but otherwise keep the null values?
SELECT a.* FROM
(
SELECT
id,
col1,
coll2
FROM tableA a LEFT OUTER JOIN tableB b ON b.col1=a.col1 and b.col2='value'
WHERE a.id= #id
AND a.col2= #arg
) AS a
ORDER BY col1 ASC
You can do this by counting the number of matches using a window function. Then, either return all rows in A if there are no matching B rows, or only return the rows that do match:
select id, col1, col2
from (SELECT a.id, a.col1, a.col2,
b.id as b_id, -- expose b.id so the outer query can filter on it
count(b.id) over () as numbs
FROM tableA a LEFT OUTER JOIN tableB b ON b.col1=a.col1 and b.col2='value'
WHERE a.id = #id AND a.col2= #arg
) ab
where numbs = 0 or b_id is not null;
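Here is a small sketch of the window-count idea that you can run from Python with sqlite3 (window functions need SQLite 3.25 or later). The sample rows and the fetch helper are hypothetical, and the inner query exposes b.id as b_id so that the outer filter can see it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INTEGER, col1 INTEGER, col2 TEXT);
    CREATE TABLE tableB (id INTEGER, col1 INTEGER, col2 TEXT);
    -- Hypothetical sample data: id 1 has one matching B row, id 2 has none
    INSERT INTO tableA VALUES (1, 10, 'x'), (1, 20, 'x'), (2, 30, 'x');
    INSERT INTO tableB VALUES (100, 10, 'value');
""")

def fetch(a_id, arg):
    """Return only matched rows if any exist, otherwise all rows from tableA."""
    sql = """
        SELECT id, col1, col2
        FROM (SELECT a.id, a.col1, a.col2,
                     b.id AS b_id,
                     COUNT(b.id) OVER () AS numbs  -- total matches in the result
              FROM tableA a
              LEFT OUTER JOIN tableB b
                ON b.col1 = a.col1 AND b.col2 = 'value'
              WHERE a.id = ? AND a.col2 = ?) ab
        WHERE numbs = 0 OR b_id IS NOT NULL
        ORDER BY col1
    """
    return conn.execute(sql, (a_id, arg)).fetchall()

with_match = fetch(1, "x")
without_match = fetch(2, "x")
print(with_match, without_match)
```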
Filter them out in the WHERE clause:
SELECT a.* FROM
(
SELECT
id,
col1,
coll2
FROM tableA a LEFT OUTER JOIN tableB b ON b.col1=a.col1 and b.col2='value'
WHERE a.id= #id
AND a.col2= #arg
AND A.Col1 IS NOT NULL -- HERE
) AS a
ORDER BY col1 ASC
For some reason, people write code like Marko's that puts a filter (b.col2 = 'value') in the JOIN clause. While this works, it is not good practice.
Also, you should get in the habit of writing the ON clause in the right sequence. We are joining table A to table B, so why write it backwards as B.col1 = A.col1?
While the above statement works, it could definitely be improved.
I created the following test tables.
-- Just playing
use tempdb;
go
-- Table A
if object_id('A') > 0 drop table A
go
create table A
(
id1 int,
col1 int,
col2 varchar(16)
);
go
-- Add data
insert into A
values
(1, 1, 'Good data'),
(2, 2, 'Good data'),
(3, 3, 'Good data');
-- Table B
if object_id('B') > 0 drop table B
go
create table B
(
id1 int,
col1 int,
col2 varchar(16)
);
-- Add data
insert into B
values
(1, 1, 'Good data'),
(2, 2, 'Good data'),
(3, NULL, 'Null data');
Here is the improved statement. I chose literals instead of variables; you can change them to suit your example.
-- Filter non matching records
SELECT
A.*
FROM A LEFT OUTER JOIN B ON
A.col1 = B.col1
WHERE
B.col1 IS NOT NULL AND
A.id1 in (1, 2) AND
A.col2 = 'Good data'
ORDER BY
A.id1 DESC
Let's say I have the following statement and the inner join results in 3 rows where a.Id = b.Id, but each of the 3 rows have different b.Value's. Since only one row from tableA is being updated, which of the 3 values is used in the update?
UPDATE a
SET a.Value = b.Value
FROM tableA AS a
INNER JOIN tableB as b
ON a.Id = b.Id
I don't think there are rules for this case and you cannot depend on a particular outcome.
If you're after a specific row, say the latest one, you can use apply, like:
UPDATE a
SET a.Value = b.Value
FROM tableA AS a
CROSS APPLY
(
select top 1 *
from tableB as b
where b.id = a.id
order by
DateColumn desc
) as b
Usually what you end up with in this scenario is the first row that appears in the order of the physical index on the table. In actual practice, you should treat this as non-deterministic and include something that narrows your result to one row.
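As a rough illustration of "narrowing to one row", the sketch below picks the latest matching row explicitly. It is not the CROSS APPLY form (that is SQL Server syntax); it swaps in a correlated subquery with ORDER BY ... LIMIT 1 so it can run in SQLite from Python. The sample data and the DateColumn values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (Id INTEGER PRIMARY KEY, Value INTEGER);
    CREATE TABLE tableB (Id INTEGER, Value INTEGER, DateColumn TEXT);
    -- Hypothetical sample data: three candidate rows for Id = 1
    INSERT INTO tableA VALUES (1, 2);
    INSERT INTO tableB VALUES
        (1, 5, '2012-01-01'),
        (1, 3, '2012-03-01'),
        (1, 6, '2012-02-01');
""")

# The subquery states which of the matching rows wins: the latest one.
conn.execute("""
    UPDATE tableA
    SET Value = (SELECT b.Value
                 FROM tableB b
                 WHERE b.Id = tableA.Id
                 ORDER BY b.DateColumn DESC
                 LIMIT 1)
    WHERE EXISTS (SELECT 1 FROM tableB b WHERE b.Id = tableA.Id)
""")
updated = conn.execute("SELECT Id, Value FROM tableA").fetchall()
print(updated)
```

The WHERE EXISTS guard keeps rows of tableA without any match from being set to NULL.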
Here is what I came up with using SQL Server 2008
--drop table #b
--drop table #a
select 1 as id, 2 as value
into #a
select 1 as id, 5 as value
into #b
insert into #b
select 1, 3
insert into #b
select 1, 6
select * from #a
select * from #b
UPDATE #a
SET #a.Value = #b.Value
FROM #a
INNER JOIN #b
ON #a.Id = #b.Id
It appears that it uses the top value of a basic select each time (row 1 of select * from #b). So, it possibly depends on indexing. However, I would not rely on the implementation set by SQL, as that has the possibility of changing. Instead, I would suggest using the solution presented by Andomar to make sure you know what value you are going to choose.
In short, do not trust the default implementation, create your own. But, this was an interesting academic question :)
The best option in my case for updating multiple records is to use a MERGE query (supported from SQL Server 2008); with this query you have complete control over what you are updating.
You can also use the OUTPUT clause to do further processing.
Example: without OUTPUT clause (update only)
;WITH cteB AS
( SELECT Id, Col1, Col2, Col3
FROM B WHERE Id > 10 ---- Select Multiple records
)
MERGE A
USING cteB
ON(A.Id = cteB.Id) -- Update condition
WHEN MATCHED THEN UPDATE
SET
A.Col1 = cteB.Col1, --Note: Update condition i.e; A.Id = cteB.Id cant appear here again.
A.Col2 = cteB.Col2,
A.Col3 = cteB.Col3;
Example: with OUTPUT clause
CREATE TABLE #TempOutPutTable
(
PkId INT NOT NULL,
Col1 VARCHAR(50),
Col2 VARCHAR(50)
)
;WITH cteB AS
( SELECT Id, Col1, Col2, Col3
FROM B WHERE Id > 10
)
MERGE A
USING cteB
ON(A.Id = cteB.Id)
WHEN MATCHED THEN UPDATE
SET
A.Col1 = cteB.Col1,
A.Col2 = cteB.Col2,
A.Col3 = cteB.Col3
OUTPUT
INSERTED.Id, cteB.Col1, INSERTED.Col2 INTO #TempOutPutTable; -- target columns must use the INSERTED/DELETED prefix
--Do what ever you want with the data in temporary table
SELECT * FROM #TempOutPutTable; -- you can check here which records are updated.
Yes, I came up with a similar experiment to Justin Pihony:
IF OBJECT_ID('tempdb..#test') IS NOT NULL DROP TABLE #test ;
SELECT
1 AS Name, 0 AS value
INTO #test
IF OBJECT_ID('tempdb..#compare') IS NOT NULL DROP TABLE #compare ;
SELECT 1 AS name, 1 AS value
INTO #compare
INSERT INTO #compare
SELECT 1 AS name, 0 AS value;
SELECT * FROM #test
SELECT * FROM #compare
UPDATE t
SET t.value = c.value
FROM #test t
INNER JOIN #compare c
ON t.Name = c.name
The update takes the topmost row of the comparison (right-side) table. You can reverse the #compare.value values to 0 and 1 and you'll get the reverse. I agree with the posters above: it's very strange that this operation does not throw an error, because it silently ignores the other matching values.
I have a tableA:
ID value
1 100
2 101
2 444
3 501
Also TableB
ID Code
1
2
Now I want to populate the Code column of TableB if ID = 2 exists in tableA. If there are multiple values, take the max value.
Otherwise, populate it with '123'. Here is what I used:
if exists (select MAX(value) from #A where id = 2)
BEGIN
update #B
set code = (select MAX(value) from #A where id = 2)
from #A
END
ELSE
update #B
set code = 123
from #B
I am sure there is some problem with BEGIN...END or with IF EXISTS...ELSE.
Basically I want to bypass the ELSE part if the select statement in the IF part returns something, and vice versa. For example, if the select statement of the IF part is:
(select MAX(value) from #A where id = 4)
It should just populate 123, because ID = 4 does not exist!
EDIT
I want to add the reason that your IF statement seems not to work. When you do an EXISTS on an aggregate, it's always going to be true. The aggregate returns a row even if the ID doesn't exist. Sure, the value is NULL, but it's still returning a row. Instead, do this:
if exists(select 1 from #A where id = 4)
and you'll get to the ELSE portion of your IF statement.
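You can see the difference for yourself with a small sketch. This one uses Python's sqlite3 module rather than SQL Server, but an aggregate without GROUP BY returns exactly one row there too; the table contents are copied from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER, value INTEGER);
    INSERT INTO A VALUES (1, 100), (2, 101), (2, 444), (3, 501);
""")

# The aggregate still produces one row (holding NULL) when nothing matches.
agg_rows = conn.execute("SELECT MAX(value) FROM A WHERE id = 4").fetchall()

# EXISTS only asks "is there a row?", so the aggregate version is always true.
exists_on_aggregate = conn.execute(
    "SELECT EXISTS (SELECT MAX(value) FROM A WHERE id = 4)"
).fetchone()[0]
exists_on_rows = conn.execute(
    "SELECT EXISTS (SELECT 1 FROM A WHERE id = 4)"
).fetchone()[0]

print(agg_rows, exists_on_aggregate, exists_on_rows)
```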
Now, here's a better, set-based solution:
update b
set code = isnull(a.value, 123)
from #b b
left join (select id, max(value) as value from #a group by id) a
on b.id = a.id
where
b.id = yourid
This has the benefit of being able to run on the entire table rather than individual ids.
Try this:
Update TableB Set
Code = Coalesce(
(Select Max(Value)
From TableA
Where Id = b.Id), 123)
From TableB b
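A quick way to try the COALESCE approach is the sketch below, which runs an equivalent update in SQLite from Python. SQLite's plain UPDATE has no T-SQL style alias in a FROM clause here, so the subquery correlates on the table name instead; the extra TableB row with Id = 4 is a made-up case with no match in TableA.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Id INTEGER, Value INTEGER);
    CREATE TABLE TableB (Id INTEGER PRIMARY KEY, Code INTEGER);
    INSERT INTO TableA VALUES (1, 100), (2, 101), (2, 444), (3, 501);
    -- Id 4 is hypothetical: it has no rows in TableA
    INSERT INTO TableB (Id) VALUES (1), (2), (4);
""")

# MAX over no rows is NULL, so COALESCE falls back to the default 123.
conn.execute("""
    UPDATE TableB
    SET Code = COALESCE(
        (SELECT MAX(Value) FROM TableA WHERE TableA.Id = TableB.Id),
        123)
""")
codes = conn.execute("SELECT Id, Code FROM TableB ORDER BY Id").fetchall()
print(codes)
```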
I know it's been a while since the original post, but I like using CTEs and this worked for me:
WITH cte_table_a
AS
(
SELECT [id] [id]
, MAX([value]) [value]
FROM table_a
GROUP BY [id]
)
UPDATE table_b
SET table_b.code = CASE WHEN cte_table_a.[value] IS NOT NULL THEN cte_table_a.[value] ELSE 123 END
FROM table_b
LEFT OUTER JOIN cte_table_a
ON table_b.id = cte_table_a.id
Let's say I have a database that looks like this:
tblA:
ID, Name, Sequence, tblBID
1 a 5 14
2 b 3 15
3 c 3 16
4 d 3 17
tblB:
ID, Group
14 1
15 1
16 2
17 3
I would like to sequence A so that the sequences go 1...n for each group of B.
So in this case, the sequences going down should be 1,2,1,1.
The ordering needs to be consistent with the current ordering, but there are no guarantees as to the current ordering.
I am not exactly a sql master and I am sure there is a fairly easy way to do this, but I really don't know the right route to take. Any hints?
If you are using SQL Server 2005 or higher, you can use a ranking function:
Select tblA.Id, tblA.Name
, Row_Number() Over ( Partition By tblB.[Group] Order By tblA.Id ) As Sequence
, tblA.tblBID
From tblA
Join tblB
On tblB.ID = tblA.tblBID
Row_Number ranking function.
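For anyone who wants to try the ranking approach without a SQL Server instance, here is a minimal sketch using Python's sqlite3 module (window functions need SQLite 3.25 or later). The data is copied from the question; the column name Group is quoted because it is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblA (ID INTEGER PRIMARY KEY, Name TEXT,
                       Sequence INTEGER, tblBID INTEGER);
    CREATE TABLE tblB (ID INTEGER PRIMARY KEY, "Group" INTEGER);
    INSERT INTO tblA VALUES (1, 'a', 5, 14), (2, 'b', 3, 15),
                            (3, 'c', 3, 16), (4, 'd', 3, 17);
    INSERT INTO tblB VALUES (14, 1), (15, 1), (16, 2), (17, 3);
""")

# Number the rows 1..n inside each group of tblB, ordered by tblA.ID.
rows = conn.execute("""
    SELECT a.ID, a.Name,
           ROW_NUMBER() OVER (PARTITION BY b."Group" ORDER BY a.ID) AS Sequence
    FROM tblA a
    JOIN tblB b ON b.ID = a.tblBID
    ORDER BY a.ID
""").fetchall()
print(rows)
```

Going down by ID, the sequences come out as 1, 2, 1, 1, which is what the question asked for.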
Here's another solution that would work in SQL Server 2000 and prior.
Select A.Id, A.Name
, (Select Count(*)
From tblB As B1
Where B1.[Group] = B.[Group]
And B1.Id < B.ID) + 1 As Sequence
, A.tblBID
From tblA As A
Join tblB As B
On B.Id = A.tblBID
EDIT
Also want to make it clear that I want to actually update tblA to reflect the proper sequences.
In SQL Server, you can use their proprietary From clause in an Update statement like so:
Update tblA
Set Sequence = (
Select Count(*)
From tblB As B1
Where B1.[Group] = B.[Group]
And B1.Id < B.ID
) + 1
From tblA As A
Join tblB As B
On B.Id = A.tblBID
The Hoyle ANSI solution might be something like:
Update tblA
Set Sequence = (
Select (Select Count(*)
From tblB As B1
Where B1.[Group] = B.[Group]
And B1.Id < B.ID) + 1
From tblA As A
Join tblB As B
On B.Id = A.tblBID
Where A.Id = tblA.Id
)
EDIT
Can we do that [the inner group] comparison based on A.Sequence instead of B.ID?
Select A1.*
, (Select Count(*)
From tblB As B2
Join tblA As A2
On A2.tblBID = B2.Id
Where B2.[Group] = B1.[Group]
And A2.Sequence < A1.Sequence) + 1
From tblA As A1
Join tblB As B1
On B1.Id = A1.tblBID
Because it's SQL 2000, we can't use a windowing function. That's okay.
Thomas's queries are good and will work. However, they will get worse and worse as the number of rows increases, with different characteristics depending on how wide the data is (the number of groups) and how deep it is (the number of items per group). This is because those queries use a partial cross-join. Perhaps we could call it a "pyramidal cross-join", where the crossing part is limited to right-side values less than left-side values, rather than the left side crossing to all right-side values.
What to do?
I think you will be surprised to find that the following long and painful-looking script will outperform the pyramidal join at a certain size of data (which may not be all that big) and eventually, with really large data sets must be considered a screaming performer:
CREATE TABLE #tblA (
ID int identity(1,1) NOT NULL,
Name varchar(1) NOT NULL,
Sequence int NOT NULL,
tblBID int NOT NULL,
PRIMARY KEY CLUSTERED (ID)
)
INSERT #tblA VALUES ('a', 5, 14)
INSERT #tblA VALUES ('b', 3, 15)
INSERT #tblA VALUES ('c', 3, 16)
INSERT #tblA VALUES ('d', 3, 17)
CREATE TABLE #tblB (
ID int NOT NULL PRIMARY KEY CLUSTERED,
GroupID int NOT NULL
)
INSERT #tblB VALUES (14, 1)
INSERT #tblB VALUES (15, 1)
INSERT #tblB VALUES (16, 2)
INSERT #tblB VALUES (17, 3)
CREATE TABLE #seq (
seq int identity(1,1) NOT NULL,
ID int NOT NULL,
GroupID int NOT NULL,
PRIMARY KEY CLUSTERED (ID)
)
INSERT #seq
SELECT
A.ID,
B.GroupID
FROM
#tblA A
INNER JOIN #tblB B ON A.tblBID = b.ID
ORDER BY B.GroupID, A.Sequence
UPDATE A
SET A.Sequence = S.seq - X.MinSeq + 1
FROM
#tblA A
INNER JOIN #seq S ON A.ID = S.ID
INNER JOIN (
SELECT GroupID, MinSeq = Min(seq)
FROM #seq
GROUP BY GroupID
) X ON S.GroupID = X.GroupID
SELECT * FROM #tblA
DROP TABLE #seq
DROP TABLE #tblB
DROP TABLE #tblA
If I understood you correctly, then ORDER BY B.GroupID, A.Sequence is correct. If not, you can switch A.Sequence to B.ID.
Also, my index on the temp table should be experimented with. For a certain quantity of rows, and also the width and depth characteristics of those rows, clustering on one of the other two columns in the #seq table could be helpful.
Last, there is another possible data organization: leaving GroupID out of the #seq table and joining again. I suspect it would be worse, but am not 100% sure.
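The identity-table technique can be sketched in SQLite as well, driven from Python. This is only an illustration under some assumptions: an INTEGER PRIMARY KEY AUTOINCREMENT column stands in for identity(1,1), the rows are assumed to be inserted in the ORDER BY order (worth verifying on your own engine), and a correlated subquery replaces the T-SQL UPDATE ... FROM.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblA (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL,
                       Sequence INTEGER NOT NULL, tblBID INTEGER NOT NULL);
    CREATE TABLE tblB (ID INTEGER PRIMARY KEY, GroupID INTEGER NOT NULL);
    INSERT INTO tblA VALUES (1, 'a', 5, 14), (2, 'b', 3, 15),
                            (3, 'c', 3, 16), (4, 'd', 3, 17);
    INSERT INTO tblB VALUES (14, 1), (15, 1), (16, 2), (17, 3);

    -- seq plays the role of the identity column: one running number
    -- across all groups, assigned in (GroupID, Sequence) order.
    CREATE TABLE seq (seq INTEGER PRIMARY KEY AUTOINCREMENT,
                      ID INTEGER NOT NULL, GroupID INTEGER NOT NULL);
    INSERT INTO seq (ID, GroupID)
    SELECT A.ID, B.GroupID
    FROM tblA A
    JOIN tblB B ON A.tblBID = B.ID
    ORDER BY B.GroupID, A.Sequence;

    -- Subtract each group's smallest running number to restart at 1.
    UPDATE tblA
    SET Sequence = (
        SELECT S.seq - X.MinSeq + 1
        FROM seq S
        JOIN (SELECT GroupID, MIN(seq) AS MinSeq
              FROM seq
              GROUP BY GroupID) X ON S.GroupID = X.GroupID
        WHERE S.ID = tblA.ID);
""")
result = conn.execute("SELECT ID, Name, Sequence FROM tblA ORDER BY ID").fetchall()
print(result)
```

Because the ordering here is by the old Sequence within each group, row b (old sequence 3) comes before row a (old sequence 5) in group 1.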
Something like:
SELECT a.id, a.name, row_number() over (partition by b.[Group] order by a.id)
FROM tblA a
JOIN tblB b on a.tblBID = b.ID;
I have a scenario for which a SQL query needs to be built. I tried to come up with an efficient query, but could not find a clear way of doing it. My scenario is as follows:
I have TABLE_A and TABLE_B, where FIELD_AB will definitely be a field of TABLE_A; however, FIELD_AB may also exist in TABLE_B.
I need to retrieve the value of FIELD_AB from TABLE_B if such a field exists; if it does not, then retrieve the value of FIELD_AB from TABLE_A.
I'm looking for a single query to retrieve the value of FIELD_AB. As far as I know, a CASE statement can be used to accomplish this, but I am not clear on the best way of using it.
EDIT:
Please do not misunderstand the question. What I mean by "FIELD_AB may exist" is that FIELD_AB itself (the column) may not exist in TABLE_B, not that a value for FIELD_AB may be missing.
Any help appreciated
Thank You
You probably need to use an outer join to link the two tables:
select a.id
, case when b.col_ab is null then a.col_ab
else b.col_ab end as ab
from table_a a
left outer join table_b b
on ( b.id = a.id )
/
Oracle has some alternative ways of testing for NULL. A simpler, if non-standard, way of testing for AB would be:
nvl2(b.col_ab, b.col_ab, a.col_ab) as ab
This is logically identical to the more verbose CASE() statement.
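To check that equivalence on something runnable, here is a small sketch with Python's sqlite3 module. SQLite has no NVL2, so the sketch swaps in the standard COALESCE(b.col_ab, a.col_ab), which expresses the same fallback; the id and col_ab columns follow the answer above and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, col_ab TEXT);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, col_ab TEXT);
    -- Hypothetical sample data: id 1 has no B row, id 3 has a NULL in B
    INSERT INTO table_a VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO table_b VALUES (2, 'b2'), (3, NULL);
""")

rows = conn.execute("""
    SELECT a.id,
           CASE WHEN b.col_ab IS NULL THEN a.col_ab
                ELSE b.col_ab END            AS via_case,
           COALESCE(b.col_ab, a.col_ab)      AS via_coalesce
    FROM table_a a
    LEFT OUTER JOIN table_b b ON b.id = a.id
    ORDER BY a.id
""").fetchall()
print(rows)
```

Both columns agree on every row: the value from table_b wins when it is present, and table_a is the fallback otherwise.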
create table table_b ( field_ab int not null, value varchar(20) not null )
create table table_a ( field_ab int not null, value varchar(20) not null )
insert into table_a values( 1, '1 from a')
insert into table_a values( 2, '2 from a')
insert into table_a values( 3, '3 from a')
insert into table_b values( 2, '2 from b')
-- result is '2 from b'
select
case when b.field_ab is null then a.value
else b.value
end
from table_a a left outer join table_b b on a.field_ab = b.field_ab
where a.field_ab = 2
-- result is '1 from a'
select
case when b.field_ab is null then a.value
else b.value
end
from table_a a left outer join table_b b on a.field_ab = b.field_ab
where a.field_ab = 1