Need an efficient query in the following case - sql

I have 3 tables:
Table A
id | value
----------
Table B
id | value | A_id (FK to A)
---------------------------
Table C
id | value | B_id (FK to B) | timestamp
---------------------------------------
I have written the following query to find the latest row for each distinct C value:
select A.id, B.id, C.timestamp, C.value
from A, B, C
where A.id = B.A_id
and B.id = C.B_id
and C.value in (select distinct value from C c2 where c2.value = C.value and c2.value is not null)
and C.timestamp = (select max(timestamp) from C c3 where c3.value = C.value);
Apart from the ID columns, none of the columns are indexed. Right now this query takes 2 hours or more to run, because C contains about 221,000 distinct values. Is there a more efficient way to do this?

SELECT DISTINCT A.id, B.id, c.timestamp, c.value
FROM (
    SELECT c.value, MAX(c.timestamp) AS max_timestamp
    FROM c
    WHERE c.value IS NOT NULL
    GROUP BY c.value
) c1
INNER JOIN c ON c1.value = c.value AND c1.max_timestamp = c.timestamp
INNER JOIN b ON b.id = c.B_id
INNER JOIN a ON a.id = b.A_id

A correlated sub-query may be run once for each row of the main query.
When the main query returns a lot of rows, that becomes a performance anti-pattern (and you have 2 such sub-queries).
What you need is a group-wise maximum, which can be achieved with a self left join:
SELECT A.id a_id, B.id b_id, C1.timestamp, C1.value
FROM C C1
INNER JOIN B ON B.id = C1.B_id
INNER JOIN A ON A.id = B.A_id
LEFT JOIN C C2 ON C1.value = C2.value
    AND C1.timestamp < C2.timestamp
WHERE C1.value IS NOT NULL
AND C2.id IS NULL
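The two rewrites above (aggregate first then join back, and the self left join) can be compared on a tiny data set. The following is only a sketch: it assumes SQLite via Python's sqlite3 module, integer timestamps, made-up sample rows, and a hypothetical composite index named idx_c_value_timestamp, none of which come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE B (id INTEGER PRIMARY KEY, value TEXT, A_id INTEGER REFERENCES A(id));
    CREATE TABLE C (id INTEGER PRIMARY KEY, value TEXT,
                    B_id INTEGER REFERENCES B(id), timestamp INTEGER);
    -- Assumption: an index on (value, timestamp) so per-value lookups need not scan all of C.
    CREATE INDEX idx_c_value_timestamp ON C (value, timestamp);

    INSERT INTO A VALUES (1, 'a1');
    INSERT INTO B VALUES (10, 'b1', 1);
    INSERT INTO C VALUES (1, 'x', 10, 100), (2, 'x', 10, 200),
                         (3, 'y', 10, 150), (4, NULL, 10, 300);
""")

# Form 1: compute MAX(timestamp) per value once, then join back to C.
derived = conn.execute("""
    SELECT A.id, B.id, C.timestamp, C.value
    FROM (SELECT value, MAX(timestamp) AS max_timestamp
          FROM C WHERE value IS NOT NULL GROUP BY value) c1
    INNER JOIN C ON c1.value = C.value AND c1.max_timestamp = C.timestamp
    INNER JOIN B ON B.id = C.B_id
    INNER JOIN A ON A.id = B.A_id
    ORDER BY C.value
""").fetchall()

# Form 2: self left join; keep rows for which no later row with the same value exists.
anti = conn.execute("""
    SELECT A.id, B.id, C1.timestamp, C1.value
    FROM C C1
    INNER JOIN B ON B.id = C1.B_id
    INNER JOIN A ON A.id = B.A_id
    LEFT JOIN C C2 ON C1.value = C2.value AND C1.timestamp < C2.timestamp
    WHERE C1.value IS NOT NULL AND C2.id IS NULL
    ORDER BY C1.value
""").fetchall()

print(derived)
print(anti)
```

Both forms return the same rows here. Which one is faster on the real 221,000-value table depends on the database engine and its indexes, so it is worth checking the execution plan for each.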

Try one of these; I think I understood your table relationships. Hope it helps. The second query might be faster, since it gathers the max timestamp per value first and only then joins to the other tables' IDs.
select A.id, B.id, C.timestamp, C.value
from A
inner join B on B.A_id = A.id
inner join C on C.B_id = B.id
where C.value in (select distinct value from C c2 where c2.value = C.value and c2.value is not null)
and C.timestamp = (select max(timestamp) from C c3 where c3.value = C.value);
WITH
A_B AS (
    SELECT A.id AS a_id, B.id AS b_id
    FROM A
    INNER JOIN B ON B.A_id = A.id
)
, C_LATEST AS (
    SELECT value, MAX(timestamp) AS max_timestamp
    FROM C
    WHERE value IS NOT NULL
    GROUP BY value
)
SELECT ab.a_id, ab.b_id, c.timestamp, c.value
FROM C_LATEST cl
INNER JOIN C c ON c.value = cl.value AND c.timestamp = cl.max_timestamp
INNER JOIN A_B ab ON ab.b_id = c.B_id
ORDER BY ab.a_id

Related

How to replace Union in Sql

I need to concatenate results from 2 tables without using UNION.
Example: I have 4 tables a, b, c, d. See the snippets below:
Table a:
Table b:
Table c:
Table d:
I am concatenating the a and d results using UNION ALL, like below:
select a.id,a.seq,a.item,b.des,c.qty from a
left join b on a.item = b.item
left join c on a.id = c.id and a.seq = c.seq
UNION ALL
select d.id,d.seq,d.item,d.des,c.qty from d
join c on d.id = c.id and d.seq = c.seq
My output:
But I need the same result without using UNION ALL.
Is it possible, and if so, how?
You can use a FULL OUTER JOIN instead of UNION; it behaves similarly to UNION in some respects:
SELECT a.id,a.seq,a.item,b.des,c.qty
FROM a left join b on a.item = b.item
left join c on a.id = c.id and a.seq = c.seq
FULL OUTER JOIN
(
SELECT d.id,d.seq,d.item,d.des,c.qty
FROM d join c on d.id = c.id and d.seq = c.seq
)x ON a.id = x.id
You can replace UNION ALL with a FULL JOIN on a "false" condition and lots of COALESCE()s.
The logic looks like this:
SELECT COALESCE(abc.id, dc.id) AS id,
       COALESCE(abc.seq, dc.seq) AS seq,
       COALESCE(abc.item, dc.item) AS item,
       COALESCE(abc.des, dc.des) AS des,
       COALESCE(abc.qty, dc.qty) AS qty
FROM (SELECT a.id, a.seq, a.item, b.des, c.qty
      FROM a LEFT JOIN
           b
           ON a.item = b.item LEFT JOIN
           c
           ON a.id = c.id AND a.seq = c.seq
     ) abc FULL JOIN
     (SELECT d.id, d.seq, d.item, d.des, c.qty
      FROM d JOIN
           c
           ON d.id = c.id AND d.seq = c.seq
     ) dc
     ON 1 = 0; -- never evaluates to true
There is a LEFT JOIN from table 'a' to table 'c' (on 'id' and 'seq' columns) in the upper query and INNER JOIN from table 'd' to table 'c' in the lower query. Therefore I think you could LEFT JOIN from table 'c' to table 'd' and it would produce the correct output.
select a.id,a.seq,a.item,b.des,c.qty
from a
left join b on a.item = b.item
left join c on a.id = c.id and a.seq = c.seq
left join d on c.id = d.id and c.seq = d.seq;
If you want, you can also INSERT the rows INTO a (temporary) table and get the same result from that table.

How to apply conditions in inner joins in a SQL query

I have three different tables, let's say A, B and C.
From table A, the query I am using is
select a.status, a.resolution, a.ID
from A a ;
From table B I have to fetch B.destID when A.ID matches
For that I am using
select b.destID
from B b
where b.ID = A.ID ;
if b.destID exists then
select newstatus
from C
where C.id = B.destID
else
select newstatus
from C
where C.id = A.ID
Can somebody help me in combining all these three queries into one?
Any help is appreciated.. Thanks in advance
Sure:
select a.status,a.resolution,
Coalesce(c2.newstatus, c1.newstatus) newStatus
from A Left Join B On B.ID = A.ID
Left Join C c1 On c1.id = A.Id
Left Join C c2 On c2.id = B.destID
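To see how the two left joins to C and the COALESCE pick the right newstatus, here is a small sketch. It assumes SQLite through Python's sqlite3 module and invented sample rows; only the table and column names come from the question. The extra a.ID column is added just to make the output easy to read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER PRIMARY KEY, status TEXT, resolution TEXT);
    CREATE TABLE B (ID INTEGER, destID INTEGER);
    CREATE TABLE C (id INTEGER PRIMARY KEY, newstatus TEXT);

    -- Hypothetical data: A.ID = 1 has a destID in B, A.ID = 2 does not.
    INSERT INTO A VALUES (1, 'open', 'none'), (2, 'closed', 'fixed');
    INSERT INTO B VALUES (1, 100);
    INSERT INTO C VALUES (1, 'own-1'), (2, 'own-2'), (100, 'via-dest');
""")

rows = conn.execute("""
    SELECT a.ID, a.status, a.resolution,
           -- Prefer the status reached through B.destID, else fall back to A.ID.
           COALESCE(c2.newstatus, c1.newstatus) AS newStatus
    FROM A a
    LEFT JOIN B b ON b.ID = a.ID
    LEFT JOIN C c1 ON c1.id = a.ID
    LEFT JOIN C c2 ON c2.id = b.destID
    ORDER BY a.ID
""").fetchall()

for row in rows:
    print(row)
```

Row 1 takes its status through destID and row 2 falls back to its own ID. Note that if B has a destID with no matching row in C, this query also falls back to the A.ID lookup, which may or may not be what the original if/else intended.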

SQL summations with multiple outer joins

I have tables a, b, c, and d whereby:
There are 0 or more b rows for each a row
There are 0 or more c rows for each a row
There are 0 or more d rows for each a row
If I try a query like the following:
SELECT a.id, SUM(b.debit), SUM(c.credit), SUM(d.other)
FROM a
LEFT JOIN b on a.id = b.a_id
LEFT JOIN c on a.id = c.a_id
LEFT JOIN d on a.id = d.a_id
GROUP BY a.id
I notice that I have created a cartesian product and therefore my sums are incorrect (much too large).
I see that there are other SO questions and answers, however I'm still not grasping how I can accomplish what I want to do in a single query. Is it possible in SQL to write a query which aggregates all of the following data:
SELECT a.id, SUM(b.debit)
FROM a
LEFT JOIN b on a.id = b.a_id
GROUP BY a.id
SELECT a.id, SUM(c.credit)
FROM a
LEFT JOIN c on a.id = c.a_id
GROUP BY a.id
SELECT a.id, SUM(d.other)
FROM a
LEFT JOIN d on a.id = d.a_id
GROUP BY a.id
in a single query?
Your analysis is correct: joining several unrelated child tables to the same parent creates a cartesian product of the child rows.
You have to compute the sums separately and then combine them. This is doable in one query, and you have several options:
- Sub-queries in your SELECT: SELECT a.id, (SELECT SUM(b.debit) FROM b WHERE b.a_id = a.id) + ...
- CROSS APPLY with a query similar to the first bullet, then SELECT a.id, b_sum + c_sum + d_sum
- UNION ALL as you suggested, with an outer SUM and GROUP BY on top of that.
- LEFT JOIN to similar subqueries as above.
And probably more... The performance of these solutions may differ depending on how many rows of A you want to select.
SELECT a.ID, debit, credit, other
FROM a
LEFT JOIN (SELECT a_id, SUM(b.debit) AS debit
           FROM b
           GROUP BY a_id) b ON a.ID = b.a_id
LEFT JOIN (SELECT a_id, SUM(c.credit) AS credit
           FROM c
           GROUP BY a_id) c ON a.ID = c.a_id
LEFT JOIN (SELECT a_id, SUM(d.other) AS other
           FROM d
           GROUP BY a_id) d ON a.ID = d.a_id
Can also be done with correlated subqueries:
SELECT a.id
, (SELECT SUM(debit) FROM b WHERE a.id = b.a_id)
, (SELECT SUM(credit) FROM c WHERE a.id = c.a_id)
, (SELECT SUM(other) FROM d WHERE a.id = d.a_id)
FROM a
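The inflation caused by the triple join, and how pre-aggregating each child table avoids it, can be checked on a few rows. This is a sketch under assumptions: SQLite via Python's sqlite3 module, and sample amounts invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (a_id INTEGER, debit INTEGER);
    CREATE TABLE c (a_id INTEGER, credit INTEGER);
    CREATE TABLE d (a_id INTEGER, other INTEGER);

    -- Hypothetical data: a=1 has 2 b rows and 3 c rows; a=2 has a single d row.
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (1, 10), (1, 20);
    INSERT INTO c VALUES (1, 5), (1, 5), (1, 5);
    INSERT INTO d VALUES (2, 7);
""")

# Naive form: for a=1 every b row pairs with every c row (2 x 3 = 6 rows), inflating both sums.
naive = conn.execute("""
    SELECT a.id, SUM(b.debit), SUM(c.credit), SUM(d.other)
    FROM a
    LEFT JOIN b ON a.id = b.a_id
    LEFT JOIN c ON a.id = c.a_id
    LEFT JOIN d ON a.id = d.a_id
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()

# Pre-aggregated form: each child table collapses to one row per a_id before joining.
fixed = conn.execute("""
    SELECT a.id, b.debit, c.credit, d.other
    FROM a
    LEFT JOIN (SELECT a_id, SUM(debit) AS debit FROM b GROUP BY a_id) b ON a.id = b.a_id
    LEFT JOIN (SELECT a_id, SUM(credit) AS credit FROM c GROUP BY a_id) c ON a.id = c.a_id
    LEFT JOIN (SELECT a_id, SUM(other) AS other FROM d GROUP BY a_id) d ON a.id = d.a_id
    ORDER BY a.id
""").fetchall()

print("naive:", naive)
print("fixed:", fixed)
```

In the naive result the debit for a=1 is counted three times and the credit twice. Sums over missing child rows come back as NULL (None in Python); wrapping them in COALESCE(..., 0) is an option if zeros are preferred.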

Left Join Multiple Tables and Avoid Duplicates

I have two tables with a 1:n relationship to my base table, both of which I want to LEFT JOIN.
-------------------------------
Table A Table B Table C
-------------------------------
|ID|DATA| |ID|DATA| |ID|DATA|
-------------------------------
1 A1 1 B1 1 C1
- - 1 C2
I'm using:
SELECT * FROM TableA a
LEFT JOIN TableB b
ON a.Id = b.Id
LEFT JOIN TableC c
ON a.Id = c.Id
But this is showing duplicates for TableB:
1 A1 B1 C1
1 A1 B1 C2
How can I write this join to ignore the duplicates? Such as:
1 A1 B1 C1
1 A1 null C2
I think you need some extra logic to get what you want: when the same b row is repeated across several output rows, you want to show it only once. You can identify the repeats using row_number() and then use case logic to make the subsequent values NULL:
select a.id, a.val,
       (case when row_number() over (partition by b.id, b.seqnum order by b.id) = 1 then b.val
        end) as bval,
       c.val as cval
from TableA a left join
     (select b.*, row_number() over (partition by b.id order by b.id) as seqnum
      from TableB b
     ) b
     on a.id = b.id left join
     TableC c
     on a.id = c.id
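A runnable sketch of the row_number() idea, using the sample rows from the question. The assumptions: SQLite (3.25 or later, for window functions) via Python's sqlite3 module, the question's DATA column name instead of val, and an ORDER BY on c.data inside the window so the result is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (id INTEGER, data TEXT);
    CREATE TABLE TableB (id INTEGER, data TEXT);
    CREATE TABLE TableC (id INTEGER, data TEXT);

    INSERT INTO TableA VALUES (1, 'A1');
    INSERT INTO TableB VALUES (1, 'B1');
    INSERT INTO TableC VALUES (1, 'C1'), (1, 'C2');
""")

rows = conn.execute("""
    SELECT a.id, a.data,
           -- Show each b row only on its first output row; later repeats become NULL.
           CASE WHEN ROW_NUMBER() OVER (PARTITION BY b.id, b.seqnum
                                        ORDER BY c.data) = 1
                THEN b.data END AS bdata,
           c.data AS cdata
    FROM TableA a
    LEFT JOIN (SELECT b.*, ROW_NUMBER() OVER (PARTITION BY b.id ORDER BY b.data) AS seqnum
               FROM TableB b) b ON a.id = b.id
    LEFT JOIN TableC c ON a.id = c.id
    ORDER BY a.id, c.data
""").fetchall()

for row in rows:
    print(row)
```

The second output row carries None for the B column, which matches the layout the question asks for.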
I don't think you want a plain join between B and C on id alone, because you will get multiple rows. If B has 2 rows for an id and C has 3, then you will get 6. I suspect that you just want 3. To achieve this, you want to do something like:
select *
from (select b.*, row_number() over (partition by b.id order by b.id) as seqnum
      from TableB b
     ) b full outer join
     (select c.*, row_number() over (partition by c.id order by c.id) as seqnum
      from TableC c
     ) c
     on b.id = c.id and
        b.seqnum = c.seqnum join
     TableA a
     on a.id = coalesce(b.id, c.id)
This is enumerating the "B" and "C" lists, and then joining them by position on the list. It uses a full outer join to get the full length of the longer list.
The last join references both tables so TableA can be used as a filter. Extra ids in B and C won't appear in the results.
Do you want to use DISTINCT?
SELECT distinct * FROM TableA a
LEFT JOIN TableB b
ON a.Id = b.Id
LEFT JOIN TableC c
ON a.Id = c.Id
Do it as a UNION, i.e.
SELECT a.ID, b.ID, c.ID
FROM TableA a
INNER JOIN TableB b ON a.Id = b.Id
LEFT JOIN TableC c ON a.Id = c.Id
UNION
SELECT a.ID, NULL, c.ID
FROM TableA a
LEFT JOIN TableC c ON a.Id = c.Id
i.e. one SELECT to bring back the first row and another to bring back the second row. It's a bit rough because I don't know anything about the data you are trying to read, but the principle is sound. You may need to rework it a bit.

How many B and C has A?

I have these tables:
A:
id
1
2
B:
id a_id
1 1
2 1
3 1
C:
id a_id
1 1
2 1
3 2
I need this result:
A, CountB, CountC
1, 3, 2
2, 0, 1
This attempt doesn't work correctly:
SELECT
A.id, COUNT(B.id), COUNT(C.id)
FROM
A
LEFT JOIN
B ON A.id = B.a_id
LEFT JOIN
C ON A.id = C.a_id
GROUP BY A.id
What should the SQL statement look like, without using correlated subqueries?
The following variation on yours should work:
SELECT A.id, COUNT(distinct B.id), COUNT(distinct C.id)
FROM A LEFT JOIN
B
ON A.id = B.a_id LEFT JOIN
C
ON A.id = C.a_id
GROUP BY A.id
However, there are those (such as myself) who feel that using count distinct is a cop-out. The problem is that the rows from B and from C are interfering with each other, multiplying in the join. So, you can also do each join independently, and then put the results together:
select ab.id, cntB, cntC
from (select a.id, count(B.id) as cntB
      from A left outer join
           B
           on A.id = B.a_id
      group by a.id
     ) ab join
     (select a.id, count(C.id) as cntC
      from A left outer join
           C
           on A.id = C.a_id
      group by a.id
     ) ac
     on ab.id = ac.id
For just counting, the first form is fine. If you need to do other summarizations (say, summing a value), then you generally need to split into the component queries.
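Both approaches can be checked against the sample data from the question. This sketch assumes SQLite via Python's sqlite3 module. The per-table counts use COUNT(B.id) and COUNT(C.id) rather than COUNT(*), so that an A row with no children counts as 0 instead of 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY);
    CREATE TABLE B (id INTEGER PRIMARY KEY, a_id INTEGER);
    CREATE TABLE C (id INTEGER PRIMARY KEY, a_id INTEGER);

    INSERT INTO A VALUES (1), (2);
    INSERT INTO B VALUES (1, 1), (2, 1), (3, 1);
    INSERT INTO C VALUES (1, 1), (2, 1), (3, 2);
""")

# The question's attempt: B and C rows multiply each other (3 x 2 = 6 for A.id = 1).
naive = conn.execute("""
    SELECT A.id, COUNT(B.id), COUNT(C.id)
    FROM A
    LEFT JOIN B ON A.id = B.a_id
    LEFT JOIN C ON A.id = C.a_id
    GROUP BY A.id ORDER BY A.id
""").fetchall()

# Variation 1: COUNT(DISTINCT ...) undoes the multiplication.
distinct_counts = conn.execute("""
    SELECT A.id, COUNT(DISTINCT B.id), COUNT(DISTINCT C.id)
    FROM A
    LEFT JOIN B ON A.id = B.a_id
    LEFT JOIN C ON A.id = C.a_id
    GROUP BY A.id ORDER BY A.id
""").fetchall()

# Variation 2: count each child table independently, then join the two results.
separate = conn.execute("""
    SELECT ab.id, cntB, cntC
    FROM (SELECT A.id, COUNT(B.id) AS cntB
          FROM A LEFT JOIN B ON A.id = B.a_id GROUP BY A.id) ab
    JOIN (SELECT A.id, COUNT(C.id) AS cntC
          FROM A LEFT JOIN C ON A.id = C.a_id GROUP BY A.id) ac
      ON ab.id = ac.id
    ORDER BY ab.id
""").fetchall()

print(naive, distinct_counts, separate)
```

Both variations return the result requested in the question, while the original attempt reports 6 and 6 for A.id = 1.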