SQL Query Refactoring Possible? - sql

Assume this is part of the result set
AND
Assume Dob,Name,Adress,Postcode,Telephone,EmailAddress are the same for each ID - and these columns are used in the group by clause
Sample data:
ID date Amount
---------------------------
12345 1/1/2017 100
12345 1/2/2017 200
12345 1/3/2017 300
With the outer query included I get the following which is what I want to achieve
ID date Amount
--------------------------
12345 1/1/2017 600
I want to confirm if there's a better way in terms of performance for this code. I feel like I could do a join, or a shorter version of the query but I can't get the logic right.
When I remove the outer query and do the MIN and SUM aggregate functions inside, the results doesn't group by correctly. It'll show more than one result for each id.
Also is it possible for a shorter group by?
Here's the partial version of the final code
SELECT
a.id, a.dob, a.claim_id,
a.name, a.Address, a.postcode,
a.Telephone, a.EmailAdress,
MIN(a.date), SUM(a.amount) as Amount
FROM
(SELECT DISTINCT
i.date, i.id, cl.name, cl.address,
cl.postcode, cl.telephone, cl.dob,
cl.EmailAdress, i.amount, cm.claim_id
FROM
testdb.dbo.invoice i
JOIN
testdb.dbo.claim cm with (nolock) ON i.id = cm.id
JOIN
testdb.dbo.clients cl with (nolock) ON cm.clientid = cl.id
JOIN
(....) c ON i.id = c.id
WHERE
.....) AS a
GROUP BY
a.id, a.dob, a.claim, a.name, a.Address,
a.postcode, a.Telephone, a.EmailAdress
ORDER BY
1

SELECT DISTINCT
i.date ,i.id ,cl.name ,cl.address
,cl.postcode ,cl.telephone,cl.dob
,cl.EmailAdress ,i.amount ,cm.claim_id
FROM
testdb.dbo.invoice i
JOIN
testdb.dbo.claim cm with (nolock) on i.id = cm.id
JOIN
testdb.dbo.clients cl with (nolock) on cm.clientid = cl.id
JOIN
( .... ) c on i.id = c.id
WHERE
.....
GROUP BY
i.id,i.dob,cm.claim_id,cl.name,cl.Address,cl.postcode,
cl.Telephone,cl.EmailAdress
ORDER BY 1
Is pretty much the previous code. With the outer query removed. I'm not sure what happened previously and as to why it still gave me multiple records(I'm not sure what differed now and then). But it isn't doing that anymore with this code.

Why not do the calculation inline and then join the detail tables afterwards,
something like:
SELECT
a.id, a.dob, claimDetails.claim_id,
a.name, a.Address, a.postcode,
a.Telephone, a.EmailAdress,
claimDetails.FirstDate, claimDetails.Amount
FROM a
LEFT JOIN
(
SELECT i.id, cm.claim_id, MIN(i.date) as FirstDate, SUM(i.amount) as Amount
FROM testdb.dbo.invoice i
JOIN testdb.dbo.claim cm ON i.id = cm.id
GROUP BY i.id, cm.claim_id
) claimDetails
ON claimDetails.id = a.id
LEFT JOIN Clients....

Related

Multiple joins with group by (Sum)

When I using multiple JOIN, I hope to get the sum of some column in joined tables.
SELECT
A.*,
SUM(C.purchase_price) AS purcchase_total,
SUM(D.sales_price) AS sales_total,
B.user_name
FROM
PROJECT AS A
LEFT JOIN
USER AS B ON A.user_idx = B.user_idx
LEFT JOIN
PURCHASE AS C ON A.project_idx = C.project_idx
LEFT JOIN
SALES AS D ON A.project_idx = D.project_idx
GROUP BY
????
You need to use subquery as follows:
SELECT A.project_idx,
a.project_name,
A.project_category,
sum(C.purchase_price) AS purcchase_total,
sum(D.sales_price) as sales_total,
B.user_name
FROM PROJECT AS A
LEFT JOIN USER AS B ON A.user_idx = B.user_idx
LEFT JOIN (select project_idx, sum(purchase_price) as purchase_price
from PURCHASE group by project_idx ) AS C ON A.project_idx = C.project_idx
LEFT JOIN (select project_idx, sum(sale_price) as sale_price
from SALES group by project_idx) AS D ON A.project_idx = D.project_idx
I am not sure but you can use inner join of project with user instead of left join.
SELECT A.project_idx,
a.project_name,
A.project_category,
purcchase_total,
sales_total,
B.user_name
FROM PROJECT AS A
LEFT JOIN USER AS B ON A.user_idx = B.user_idx
LEFT JOIN (select project_idx, sum(purchase_price) as purchase_total
from PURCHASE group by project_idx ) AS C ON A.project_idx = C.project_idx
LEFT JOIN (select project_idx, sum(sale_price) as sale_total
from SALES group by project_idx) AS D ON A.project_idx = D.project_idx
This is working correctly on MS-SQL Server.
Thanks to Popeye
You are attempting to aggregate over two unrelated dimensions, and that throws off all the calculations.
Correlated subqueries are an alternative:
SELECT p.*,
(SELECT SUM(pu.purchase_price)
FROM PURCHASE pu
WHERE p.project_idx = pu.project_idx
) as purchase_total,
(SELECT SUM(s.sales_price)
FROM SALES s
WHERE p.project_idx = s.project_idx
) as sales_total,
u.user_name
FROM PROJECT p LEFT JOIN
USER u
ON p.user_idx = u.user_idx ;
Note that this uses meaningful table aliases so the query is easier to read. Arbitrary letters are really no better (and perhaps worse) than using the entire table name.
Correlated subqueries avoid the outer aggregation as well -- and let you select all the columns from the first table, which is what you want. They also often have better performance with the right indexes.

SQL Server Circular Query

I have 4 tables, in that I want to fetch records from all 4 and aggregate the values
I have these tables
I am expecting this output
but getting this output as a Cartesian product
It is multiplying the expenses and allocation
Here is my query
select
a.NAME, b.P_NAME,
sum(a.DURATION) DURATION,
sum(b.[EXP]) EXPEN
from
(select
e.ID, a.P_ID, e.NAME, a.DURATION DURATION
from
EMPLOYEE e
inner join
ALLOCATION a ON e.ID = a.E_ID) a
inner join
(select
p.P_ID, e.E_ID, p.P_NAME, e.amt [EXP]
from
PROJECT p
inner join
EXPENSES e ON p.P_ID = e.P_ID) b ON a.ID = b.E_ID
and a.P_ID = b.P_ID
group by
a.NAME, b.P_NAME
Can anyone suggest something about this.
The following should work:
SELECT e.Name,p.Name,COALESCE(d.Duration,0),COALESCE(exp.Expen,0)
FROM
Employee e
CROSS JOIN
Project p
LEFT JOIN
(SELECT E_ID,P_ID,SUM(Duration) as Duration FROM Allocation
GROUP BY E_ID,P_ID) d
ON
e.E_ID = d.E_ID and
p.P_ID = d.P_ID
LEFT JOIN
(SELECT E_ID,P_ID,SUM(AMT) as Expen FROM Expenses
GROUP BY E_ID,P_ID) exp
ON
e.E_ID = exp.E_ID and
p.P_ID = exp.P_ID
WHERE
d.E_ID is not null or
exp.E_ID is not null
I've tried to write a query that will produce results where e.g. there are rows in Expenses but no rows in Allocations (or vice versa) for some particular E_ID,P_ID combination.
Use left join in select query by passing common id for all table
Hi I got the answer what I want from some modification in the query
The above query is also working like a charm and have done some modification to the original query and got the answer
Just have to group by the inner queries and then join the queries it will then not showing Cartesian product
Here is the updated one
select a.NAME,b.P_NAME,sum(a.DURATION) DURATION,sum(b.[EXP]) EXPEN from
(select e.ID,a.P_ID, e.NAME,sum(a.DURATION) DURATION from EMPLOYEE e inner join ALLOCATION a
ON e.ID=a.E_ID group by e.ID,e.NAME,a.P_ID) a
inner join
(select p.P_ID,e.E_ID, p.P_NAME,sum(e.amt) [EXP] from PROJECT p inner join EXPENSES e
ON p.P_ID=e.P_ID group by p.P_ID,p.P_NAME,e.E_ID) b
ON a.ID=b.e_ID and a.P_ID=b.P_ID group by a.NAME,b.P_NAME
Showing the correct output

Combining two join queries into one

I have two queries that I want to combine into one.
First query is
select
a.desc as desc
,sum(bdd.amount) as amount
from
t_main c
left outer join
t_direct bds on (bds.mainId = c.id)
left outer join
tm_defination a on (a.id = bds.defId)
where
c.descId = 1000000134
group by
a.desc;
It returns following result
desc amount
NW 12.00
SW 10
Second query I have
select
a.desc as desc
,sum(bdd.newAmt) as amount1
from
t_main c
left outer join
t_newBox b on (b.mainId = c.id)
left outer join
t_transition c on (c.id = b.tranId)
left outer join
tm_defination def a on (a.id = c.defId)
where
c.descId = 1000000134
group by
a.desc;
This query returns this result:
desc amount
NW 4.00
I want to combine these two queries so that I get out put like this..
desc amount amount1
NW l2.00 4.00
SW 10.00
I tried UNION between query 1 and query 2 but the result came out as
desc amountamount1
NW 16.00
SW 10.00
Which is not what I wanted.
Please let me know how I can create a query or expression to achieve this.
Thanks
Instead of union you can use the join.
In this case your code will be like the following:
select coalesce(q1.desc, q2.desc) as desc,
q1.amount as amount, q2.amount1 as amount1
from
(
select
a.desc as desc
,sum(bdd.amount) as amount
from t_main c
left outer join t_direct bds on (bds.mainId=c.id)
left outer join tm_defination a on (a.id =bds.defId)
where c.descId=1000000134
group by a.desc
) q1
full join
(
select
a.desc as desc
,sum(bdd.newAmt) as amount1
from t_main c
left outer join t_newBox b on (b.mainId=c.id)
left outer join t_transition c (c.id=b.tranId)
left outer join tm_defination def a on (a.id =c.defId)
where c.descId=1000000134
group by a.desc
) q2
on q1.desc = q2.desc
order by 1
Because of the same table usage for the desc column source, coalesce function can be used. The result query will be ordered by the result desc column.

How to have SQL INNER JOIN accept null results

I have the following query:
SELECT TOP 25 CLIENT_ID_MD5, COUNT(CLIENT_ID_MD5) TOTAL
FROM dbo.amazonlogs
GROUP BY CLIENT_ID_MD5
ORDER BY COUNT(*) DESC;
Which returns:
283fe255cbc25c804eb0c05f84ee5d52 864458
879100cf8aa8b993a8c53f0137a3a176 126122
06c181de7f35ee039fec84579e82883d 88719
69ffb6c6fd5f52de0d5535ce56286671 68863
703441aa63c0ac1f39fe9e4a4cc8239a 47434
3fd023e7b2047e78c6742e2fc5b66fce 45350
a8b72ca65ba2440e8e4028a832ec2160 39524
...
I want to retrieve the corresponding client name (FIRM) using the returned MD5 from this query, so a row might look like:
879100cf8aa8b993a8c53f0137a3a176 126122 Burger King
So I made this query:
SELECT a.CLIENT_ID_MD5, COUNT(a.CLIENT_ID_MD5) TOTAL, c.FIRM
FROM dbo.amazonlogs a
INNER JOIN dbo.customers c
ON c.CLIENT_ID_MD5 = a.CLIENT_ID_MD5
GROUP BY a.CLIENT_ID_MD5, c.FIRM
ORDER BY COUNT(*) DESC;
This returns something like:
879100cf8aa8b993a8c53f0137a3a176 126122 Burger King
06c181de7f35ee039fec84579e82883d 88719 McDonalds
703441aa63c0ac1f39fe9e4a4cc8239a 47434 Wendy's
3fd023e7b2047e78c6742e2fc5b66fce 45350 Tim Horton's
Which works, except I need to return an empty value for c.FIRM if there is no corresponding FIRM for a given MD5. For example:
879100cf8aa8b993a8c53f0137a3a176 126122 Burger King
06c181de7f35ee039fec84579e82883d 88719 McDonalds
69ffb6c6fd5f52de0d5535ce56286671 68863
703441aa63c0ac1f39fe9e4a4cc8239a 47434 Wendy's
3fd023e7b2047e78c6742e2fc5b66fce 45350 Tim Horton's
How should I modify the query to still return a row even if there is no corresponding c.FIRM?
Replace INNER JOIN with LEFT JOIN
use LEFT JOIN instead of INNER JOIN
Instead of doing an INNER join, you should do a LEFT OUTER join:
SELECT
a.CLIENT_ID_MD5,
COUNT(a.CLIENT_ID_MD5) TOTAL,
ISNULL(c.FIRM,'')
FROM
dbo.amazonlogs a LEFT OUTER JOIN
dbo.customers c ON c.CLIENT_ID_MD5 = a.CLIENT_ID_MD5
GROUP BY
a.CLIENT_ID_MD5,
c.FIRM
ORDER BY COUNT(0) DESC
http://www.w3schools.com/sql/sql_join.asp
An inner join excludes NULLs; you want a LEFT OUTER join.
SELECT a.CLIENT_ID_MD5, COUNT(a.CLIENT_ID_MD5) TOTAL, IsNull(c.FIRM, 'Unknown') as Firm
FROM dbo.amazonlogs a
LEFT JOIN dbo.customers c ON c.CLIENT_ID_MD5 = a.CLIENT_ID_MD5
GROUP BY a.CLIENT_ID_MD5, c.FIRM ORDER BY COUNT(*) DESC;
This will give you a value of "Unknown" when records in the customers table don't exist. You could obviously drop that part and just return c.FIRM if you want to have actual nulls instead.
Change your INNER JOIN to an OUTER JOIN...
SELECT a.CLIENT_ID_MD5, COUNT(a.CLIENT_ID_MD5) TOTAL, c.FIRM
FROM dbo.amazonlogs a
LEFT OUTER JOIN dbo.customers c
ON c.CLIENT_ID_MD5 = a.CLIENT_ID_MD5
GROUP BY a.CLIENT_ID_MD5, c.FIRM
ORDER BY COUNT(*) DESC;
WITH amazonlogs_Tallies
AS
(
SELECT a.CLIENT_ID_MD5, COUNT(a.CLIENT_ID_MD5) TOTAL
FROM dbo.amazonlogs a
GROUP
BY a.CLIENT_ID_MD5
),
amazonlogs_Tallies_Firms
AS
(
SELECT a.CLIENT_ID_MD5, a.TOTAL, c.FIRM
FROM amazonlogs_Tallies a
INNER JOIN dbo.customers c
ON c.CLIENT_ID_MD5 = a.CLIENT_ID_MD5
)
SELECT CLIENT_ID_MD5, TOTAL, FIRM
FROM amazonlogs_Tallies_Firms
UNION
SELECT CLIENT_ID_MD5, TOTAL, '{{NOT_KNOWN}}'
FROM amazonlogs_Tallies
EXCEPT
SELECT CLIENT_ID_MD5, TOTAL, '{{NOT_KNOWN}}'
FROM amazonlogs_Tallies_Firms;

SQL Inner join division

I have issue with my inner join division below. From my oracle, it keep prompt me missing right parenthesis when I have already close it. I'll need to get the names of the patient who have collected all items.
Select P.name
From ((((Select Patientid From Patient) As P
Inner Join (Select Accountno, Patientid From Account) As A1
on P.PatientID = A1.PatientID)
Inner Join (Select Accountno, Itemno From AccountType) As Al
On A1.Accountno = Al.Accountno)
Inner Join (Select Itemno From Item) As I
On Al.Itemno = I.Itemno)
Group By Al.Itemno
Having Count(*) >= (Select Count(*) FROM AccountType);
Here's a simpler approach that I believe is essentially equivalent:
select a.name
from Patient a
inner join Account b on a.PatientID = b.PatientID
inner join AccountType c on b.Accountno = c.Accountno
inner join Item d on c.Itemno = d.Itemno
group by c.Accountno, a.name
having Count(*) >= (Select Count(*) FROM AccountType);
This approach is a bit simpler. It has the added benefit of being much more likely to use indexes on the tables -- if you do joins between what are essentially 'join tables' in memory, you don't get the benefit of the indexes that exist for the physical tables in memory.
I also usually alias table names using sequential letters -- 'a', 'b', 'c', 'd' as you can see. I find that when I'm writing complicated queries it makes it easier for me to follow. 'a' is the first table in the join, 'b' is the second, etc.
It sounds like you just want
SELECT p.name
FROM patient p
INNER JOIN account a ON (a.patientID = p.patientID)
INNER JOIN accountType accTyp ON (accTyp.accountNo = a.accountNo)
INNER JOIN item i ON (i.itemNo = accTyp.itemNo)
GROUP BY accTyp.itemNo
HAVING COUNT(*) = (SELECT COUNT(*)
FROM accountType);
Note that having an alias of A1 and an alias of Al is quite confusing. You want to pick more meaningful and more distinguishing aliases.