Oracle: How to use left outer join to get all entries from left table and satisfying the condition in Where clause - sql

I have the tables below.
Client:
ID | clientName
--------------
1 A1
2 A2
3 A3
Order:
OrdID clientID status_cd
------------------------
100 1 DONE
101 1 SENT
102 3 SENT
Status:
status_cd status_category
DONE COMPL
SENT INPROG
I have to write a query that returns every client together with the count of its orders whose status category is "COMPL", whether or not the client ID exists in the Order table.
I am using the query below, but it filters out the clients that have no such orders. I want to get all clients, so that the result looks like the expected result below.
Query:
select c.ID, count(distinct o.OrdID)
from client c, order o, status s
where c.ID=o.client_id(+)
and o.status_cd=s.status_cd where s.status_category='COMPL'
group by c.ID
Expected result:
C.ID count(distinct o.OrdID)
----------------------------
1 1
2 0
3 0
Can someone please help me with this? I know, in this case, left outer join is behaving like inner join when I am using where clause, but is there any other way to achieve the results above?

This can be dealt with a lot easier when using an explicit join operator:
select c.ID, count(s.status_cd)
from client c
left join orders o on o.clientid = c.id
left join status s on s.status_cd = o.status_cd and s.status_category='COMPL'
group by c.ID;
The above assumes that orders.status_cd is defined as NOT NULL.
Another option is to move the join between orders and status into a derived table:
select c.ID, count(distinct o.ordid)
from client c
left join (
select o.ordid, o.clientid
from orders o
join status s on s.status_cd = o.status_cd
where s.status_category='COMPL'
) o on o.clientid = c.id
group by c.ID;
The above "states" more clearly than the first solution (at least in my eyes) that only orders within that status category are of interest.
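To see the difference between filtering in WHERE and filtering before the outer join, here is a minimal sketch that replays the question's sample data in an in-memory SQLite database. SQLite stands in for Oracle here, and the table name orders and column name clientid are assumptions taken from the answer above, not from the asker's real schema.

```python
import sqlite3

# In-memory SQLite stand-in for the Oracle schema (assumed names: orders, clientid).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client (id INTEGER PRIMARY KEY, clientname TEXT);
    CREATE TABLE status (status_cd TEXT PRIMARY KEY, status_category TEXT);
    CREATE TABLE orders (ordid INTEGER PRIMARY KEY, clientid INTEGER,
                         status_cd TEXT NOT NULL);
    INSERT INTO client VALUES (1, 'A1'), (2, 'A2'), (3, 'A3');
    INSERT INTO status VALUES ('DONE', 'COMPL'), ('SENT', 'INPROG');
    INSERT INTO orders VALUES (100, 1, 'DONE'), (101, 1, 'SENT'), (102, 3, 'SENT');
""")

# Filter in WHERE: rows without a COMPL status are removed after the joins,
# so clients 2 and 3 vanish from the result.
filter_in_where = conn.execute("""
    SELECT c.id, COUNT(DISTINCT o.ordid)
    FROM client c
    LEFT JOIN orders o ON o.clientid = c.id
    LEFT JOIN status s ON s.status_cd = o.status_cd
    WHERE s.status_category = 'COMPL'
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

# Filter inside the derived table: only COMPL orders are joined,
# and every client survives the outer join.
filter_before_join = conn.execute("""
    SELECT c.id, COUNT(DISTINCT o.ordid)
    FROM client c
    LEFT JOIN (
        SELECT o.ordid, o.clientid
        FROM orders o
        JOIN status s ON s.status_cd = o.status_cd
        WHERE s.status_category = 'COMPL'
    ) o ON o.clientid = c.id
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

print(filter_in_where)     # [(1, 1)]
print(filter_before_join)  # [(1, 1), (2, 0), (3, 0)]
```

The second result matches the expected output in the question; the first shows the "outer join behaves like an inner join" effect.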

As usual, there are lots of ways to express this requirement.
Try this (ANSI join people will hate me and vote down this answer ;) ):
select c.ID, count(s.status_cd) -- counts only orders matched to a COMPL status
from client c, order o, status s
where c.ID = o.client_id(+)
and o.status_cd = s.status_cd(+)
and s.status_category(+) = 'COMPL'
group by c.ID
;
or
select c.ID
, nvl((select count(distinct o.OrdID)
from order o, status s
where c.ID = o.client_id
and o.status_cd = s.status_cd
and s.status_category='COMPL'
), 0) as order_count
from client c
group by c.ID
;
or
with ord as
(select client_id, count(distinct o.OrdID) cnt
from order o, status s
where 1=1
and o.status_cd = s.status_cd
and s.status_category='COMPL'
group by client_id
)
select c.ID
, nvl((select cnt from ord o where c.ID = o.client_id ), 0) as order_count
from client c
group by c.ID
;
or
...

The second WHERE should be an AND.
Other than that, you need the plus sign, (+), marking the left outer join, on every condition that touches status: on the second join condition and on the status_category filter as well. It is not enough to left-outer-join the first two tables. Count a status column so that only orders matched to a COMPL status are counted.
Something like
select c.ID, count(s.status_cd)
from client c, order o, status s
where c.ID=o.client_id(+)
and o.status_cd=s.status_cd(+) -- (+) on the second join as well
AND s.status_category(+)='COMPL' -- AND, not WHERE
group by c.ID
Of course, it would be much better if you used proper (SQL Standard) join syntax.

Related

Finding count in multiple tables

I need to find the amount of times a customer appears in the Orders and Requests tables respectively. However this script is producing the same count value for both places where COUNT is used. The value cannot possibly be the same so what am I doing wrong?
SELECT o.CustomerID,
COUNT(o.CustomerID) as OrdersPerCustomer,
COUNT(r.CustomerID) as RequestsPerCustomer
FROM Orders o
INNER JOIN [Customers] c on c.ID = o.CustomerID
INNER JOIN [Request] r on r.CustomerID = c.ID
GROUP BY o.CustomerID
You are multiplying the number of order and request records. By joining the tables you get, for a customer with, say, 3 orders and 4 requests, 12 result rows. As the IDs will never be null in a record, COUNT(o.CustomerID) and COUNT(r.CustomerID) are both just COUNT(*) (12 in my example, and not 3 and 4 as you expected).
The easiest approach:
select
c.id as customer_id,
(select count(*) from orders o where o.customerid = c.id) as o_count,
(select count(*) from request r where r.customerid = c.id) as r_count
from customers c;
The same with subqueries (derived tables) in the from clause:
select
c.id as customer_id,
coalesce(o.total, 0) as o_count,
coalesce(r.total, 0) as r_count
from customers c
left join (select customerid, count(*) as total from orders group by customerid) o
on o.customerid = c.id
left join (select customerid, count(*) as total from request group by customerid) r
on r.customerid = c.id;
When aggregating from multiple tables, always aggregate first and then join.
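The 3 x 4 = 12 effect is easy to reproduce. The following is a small sketch using an in-memory SQLite database instead of SQL Server; the table layout is assumed from the queries above and the data is made up to match the 3 orders / 4 requests example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY);
    CREATE TABLE orders  (id INTEGER PRIMARY KEY, customerid INTEGER);
    CREATE TABLE request (id INTEGER PRIMARY KEY, customerid INTEGER);
    INSERT INTO customers VALUES (1), (2);
    -- customer 1 has 3 orders and 4 requests; customer 2 has none
    INSERT INTO orders  (customerid) VALUES (1), (1), (1);
    INSERT INTO request (customerid) VALUES (1), (1), (1), (1);
""")

# Join first, count afterwards: every order row pairs with every request row.
join_then_count = conn.execute("""
    SELECT o.customerid, COUNT(o.customerid), COUNT(r.customerid)
    FROM orders o
    JOIN customers c ON c.id = o.customerid
    JOIN request r ON r.customerid = c.id
    GROUP BY o.customerid
""").fetchall()

# Aggregate each table on its own, then join the one-row-per-customer results.
count_then_join = conn.execute("""
    SELECT c.id, COALESCE(o.total, 0), COALESCE(r.total, 0)
    FROM customers c
    LEFT JOIN (SELECT customerid, COUNT(*) AS total
               FROM orders GROUP BY customerid) o ON o.customerid = c.id
    LEFT JOIN (SELECT customerid, COUNT(*) AS total
               FROM request GROUP BY customerid) r ON r.customerid = c.id
    ORDER BY c.id
""").fetchall()

print(join_then_count)  # [(1, 12, 12)]
print(count_then_join)  # [(1, 3, 4), (2, 0, 0)]
```

Note that the aggregate-first version also keeps customer 2, who has no orders or requests, because it starts from customers and uses left joins.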

SQL Query Refactoring Possible?

Assume this is part of the result set, and assume Dob, Name, Address, Postcode, Telephone and EmailAddress are the same for each ID; these columns are used in the GROUP BY clause.
Sample data:
ID date Amount
---------------------------
12345 1/1/2017 100
12345 1/2/2017 200
12345 1/3/2017 300
With the outer query included I get the following which is what I want to achieve
ID date Amount
--------------------------
12345 1/1/2017 600
I want to confirm if there's a better way in terms of performance for this code. I feel like I could do a join, or a shorter version of the query but I can't get the logic right.
When I remove the outer query and do the MIN and SUM aggregate functions inside, the results don't group correctly: more than one row is shown for each ID.
Also, is a shorter GROUP BY possible?
Here's the partial version of the final code
SELECT
a.id, a.dob, a.claim_id,
a.name, a.Address, a.postcode,
a.Telephone, a.EmailAdress,
MIN(a.date), SUM(a.amount) as Amount
FROM
(SELECT DISTINCT
i.date, i.id, cl.name, cl.address,
cl.postcode, cl.telephone, cl.dob,
cl.EmailAdress, i.amount, cm.claim_id
FROM
testdb.dbo.invoice i
JOIN
testdb.dbo.claim cm with (nolock) ON i.id = cm.id
JOIN
testdb.dbo.clients cl with (nolock) ON cm.clientid = cl.id
JOIN
(....) c ON i.id = c.id
WHERE
.....) AS a
GROUP BY
a.id, a.dob, a.claim_id, a.name, a.Address,
a.postcode, a.Telephone, a.EmailAdress
ORDER BY
1
SELECT DISTINCT
i.date ,i.id ,cl.name ,cl.address
,cl.postcode ,cl.telephone,cl.dob
,cl.EmailAdress ,i.amount ,cm.claim_id
FROM
testdb.dbo.invoice i
JOIN
testdb.dbo.claim cm with (nolock) on i.id = cm.id
JOIN
testdb.dbo.clients cl with (nolock) on cm.clientid = cl.id
JOIN
( .... ) c on i.id = c.id
WHERE
.....
GROUP BY
i.id,cl.dob,cm.claim_id,cl.name,cl.Address,cl.postcode,
cl.Telephone,cl.EmailAdress
ORDER BY 1
This is pretty much the previous code with the outer query removed. I'm not sure why it gave me multiple records before (I don't know what changed between then and now), but it isn't doing that anymore with this code.
Why not do the calculation inline and then join the detail tables afterwards,
something like:
SELECT
a.id, a.dob, claimDetails.claim_id,
a.name, a.Address, a.postcode,
a.Telephone, a.EmailAdress,
claimDetails.FirstDate, claimDetails.Amount
FROM a
LEFT JOIN
(
SELECT i.id, cm.claim_id, MIN(i.date) as FirstDate, SUM(i.amount) as Amount
FROM testdb.dbo.invoice i
JOIN testdb.dbo.claim cm ON i.id = cm.id
GROUP BY i.id, cm.claim_id
) claimDetails
ON claimDetails.id = a.id
LEFT JOIN Clients....
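As a rough illustration of "aggregate inline, then join the details", here is a sketch on an in-memory SQLite database using the sample invoice rows from the question. The table layouts, the claim and client values, and the ISO date strings are all assumptions made for the example; the real query has more columns and an elided filter that are not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified versions of the tables in the question.
    CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE claim   (id INTEGER, claim_id INTEGER, clientid INTEGER);
    CREATE TABLE invoice (id INTEGER, date TEXT, amount INTEGER);
    INSERT INTO clients VALUES (7, 'Client A');
    INSERT INTO claim   VALUES (12345, 1, 7);
    INSERT INTO invoice VALUES (12345, '2017-01-01', 100),
                               (12345, '2017-01-02', 200),
                               (12345, '2017-01-03', 300);
""")

# MIN and SUM are computed once per invoice id in the derived table, so the
# outer query needs no GROUP BY over the client detail columns.
rows = conn.execute("""
    SELECT cm.id, cm.claim_id, cl.name, d.first_date, d.amount
    FROM claim cm
    JOIN clients cl ON cl.id = cm.clientid
    LEFT JOIN (
        SELECT i.id, MIN(i.date) AS first_date, SUM(i.amount) AS amount
        FROM invoice i
        GROUP BY i.id
    ) d ON d.id = cm.id
    ORDER BY cm.id
""").fetchall()

print(rows)  # [(12345, 1, 'Client A', '2017-01-01', 600)]
```

This also answers the "shorter GROUP BY" part: only the key of the aggregated table appears in the GROUP BY.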

SQL strategy to fetch maximum

Suppose I have these three tables:
I want to get, for every product, its product_id and the client that bought it the most times (the biggest client of the product).
I solved it like this:
SELECT
product_id AS product,
(SELECT TOP 1 client_id FROM Bill_Item, Bill
WHERE Bill_Item.product_id = p.product_id
and Bill_Item.bill_id = Bill.bill_id
GROUP BY
client_id
ORDER BY
COUNT(*) DESC
) AS client
FROM Product p
Do you know a better way?
The inner query will give you the ranking. The outer query will give you the client that purchased the product the most.
SELECT *
FROM (
SELECT i.product_id, b.client_id,
r = row_number() over (partition by i.product_id
order by count(*) desc)
FROM Bill b
INNER JOIN Bill_Item i ON b.bill_id = i.bill_id
GROUP BY i.product_id, b.client_id
) d
WHERE r = 1
I was going to submit pretty much the same thing as @Squirrel, only with a Common Table Expression (CTE) rather than a derived table, so I won't duplicate that. But there are some learning points concerning your query. First, implicit joins such as FROM Bill_Item, Bill can easily have unintended consequences (one of many questions on this: "Queries that implicit SQL joins can't do?"). Next, for the calculated column you can use OUTER APPLY or CROSS APPLY, which is a very useful technique.
So you could re-write your method as follows:
SELECT *
FROM
Product p
OUTER APPLY (SELECT TOP 1 b.client_id
FROM
Bill_Item bi
INNER JOIN Bill b
ON bi.bill_id = b.bill_id
WHERE
bi.product_id = p.product_id
GROUP BY
b.client_id
ORDER BY
COUNT(*) DESC) c
And to show you how Squirrel's answer can still include products that have never been sold: all you need to do is start from Product and LEFT JOIN to the other tables:
;WITH cte AS (
SELECT
p.product_id
,b.client_id
,ROW_NUMBER() OVER (PARTITION BY p.product_id ORDER BY COUNT(*) DESC) as RowNumber
FROM
Product p
LEFT JOIN Bill_Item bi
ON p.product_id = bi.product_id
LEFT JOIN Bill b
ON bi.bill_id = b.bill_id
GROUP BY
p.product_id
,b.client_id
)
SELECT *
FROM
cte
WHERE
RowNumber = 1
Useful techniques used in some of these:
CTE
APPLY (OUTER and CROSS)
Window functions
Squirrel's answer doesn't return products that have never been sold. If you want to include those, then your approach is ok, although I would write the query as:
SELECT product_id as product,
(SELECT TOP 1 b.client_id
FROM Bill_Item bi JOIN
Bill b
ON bi.bill_id = b.bill_id
WHERE bi.product_id = p.product_id
GROUP BY b.client_id
ORDER BY COUNT(*) DESC
) as client
FROM Product p;
You can also express this using APPLY, but a correlated subquery is also fine.
Note the correct use of the explicit JOIN syntax.
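For anyone who wants to try the correlated-subquery approach, here is a minimal sketch on an in-memory SQLite database. SQLite has no TOP, so LIMIT 1 is used instead; the sample rows and the extra tie-breaker on client_id are assumptions added for the example, since the original tables were not posted as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product   (product_id INTEGER PRIMARY KEY);
    CREATE TABLE bill      (bill_id INTEGER PRIMARY KEY, client_id INTEGER);
    CREATE TABLE bill_item (bill_id INTEGER, product_id INTEGER);
    INSERT INTO product VALUES (1), (2), (3);          -- product 3 is never sold
    INSERT INTO bill VALUES (10, 100), (11, 100), (12, 200);
    INSERT INTO bill_item VALUES (10, 1), (11, 1), (12, 1), (12, 2);
""")

# One correlated subquery per product: group that product's bill items by
# client, order by purchase count, keep the first row.
top_clients = conn.execute("""
    SELECT p.product_id,
           (SELECT b.client_id
            FROM bill_item bi
            JOIN bill b ON b.bill_id = bi.bill_id
            WHERE bi.product_id = p.product_id
            GROUP BY b.client_id
            ORDER BY COUNT(*) DESC, b.client_id  -- tie-breaker is an assumption
            LIMIT 1) AS client
    FROM product p
    ORDER BY p.product_id
""").fetchall()

print(top_clients)  # [(1, 100), (2, 200), (3, None)]
```

Product 3 still appears, with no client, which is the behaviour the derived-table ranking loses unless it starts from Product with left joins.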

Counting associations from multiple tables

I want to see how many associations each of my records in a given table has. Some of these associations have conditions attached to them.
So far I have
-- Count app associations
SELECT
distinct a.name,
COALESCE(v.count, 0) as visitors,
COALESCE(am.count, 0) AS auto_messages,
COALESCE(c.count, 0) AS conversations
FROM apps a
LEFT JOIN (SELECT app_id, count(*) AS count FROM visitors GROUP BY 1) v ON a.id = v.app_id
LEFT JOIN (SELECT app_id, count(*) AS count FROM auto_messages GROUP BY 1) am ON a.id = am.app_id
LEFT JOIN (
SELECT DISTINCT c.id, app_id, count(c) AS count
FROM conversations c LEFT JOIN messages m ON m.conversation_id = c.id
WHERE m.visitor_id IS NOT NULL
GROUP BY c.id) c ON a.id = c.app_id
WHERE a.test = false
ORDER BY visitors DESC;
I run into a problem with the last join, the one for conversations. I want to count the number of conversations that have at least 1 message where the visitor_id is not null. For some reason, I get multiple records for each app, i.e. the conversations are not being grouped properly.
Any ideas?
My gut feeling, based on limited understanding of the big picture: in the nested query selecting from conversations,
remove DISTINCT
remove c.id from SELECT list
GROUP BY c.app_id instead of c.id
EDIT: try this
...
LEFT JOIN (
SELECT app_id, count(*) AS count
FROM conversations c1
WHERE
EXISTS (
SELECT *
FROM messages m
WHERE m.conversation_id = c1.id and
M.visitor_id IS NOT NULL
)
GROUP BY c1.app_id) c
ON a.id = c.app_id
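To check that the EXISTS version counts conversations rather than messages, here is a small sketch on an in-memory SQLite database (the question looks like PostgreSQL, so SQLite is only a stand-in). The sample rows are invented, the test column is left out, and the alias cnt replaces count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE apps (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE conversations (id INTEGER PRIMARY KEY, app_id INTEGER);
    CREATE TABLE messages (id INTEGER PRIMARY KEY, conversation_id INTEGER,
                           visitor_id INTEGER);
    INSERT INTO apps VALUES (1, 'alpha'), (2, 'beta');
    INSERT INTO conversations VALUES (1, 1), (2, 1), (3, 2);
    -- conversation 1 has two visitor messages, conversation 2 has none
    INSERT INTO messages VALUES (1, 1, 7), (2, 1, 8), (3, 2, NULL);
""")

# Joining to messages and counting yields one hit per message, not per conversation.
join_count = conn.execute("""
    SELECT a.name, COUNT(m.id)
    FROM apps a
    LEFT JOIN conversations c ON c.app_id = a.id
    LEFT JOIN messages m ON m.conversation_id = c.id
                        AND m.visitor_id IS NOT NULL
    GROUP BY a.name
    ORDER BY a.name
""").fetchall()

# EXISTS keeps each conversation at most once, however many messages qualify.
exists_count = conn.execute("""
    SELECT a.name, COALESCE(c.cnt, 0)
    FROM apps a
    LEFT JOIN (
        SELECT c1.app_id, COUNT(*) AS cnt
        FROM conversations c1
        WHERE EXISTS (SELECT 1 FROM messages m
                      WHERE m.conversation_id = c1.id
                        AND m.visitor_id IS NOT NULL)
        GROUP BY c1.app_id
    ) c ON a.id = c.app_id
    ORDER BY a.name
""").fetchall()

print(join_count)    # [('alpha', 2), ('beta', 0)]
print(exists_count)  # [('alpha', 1), ('beta', 0)]
```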

SQL - Select only last order for every clients who had ordered before

There are 2 tables, and I want to select only last order for each client that has ordered before...
(cid,cname) Clients table
1 , David
2 , Tom
3 , Alex
(oid,cid,title,ordertime) Orders table
1,1,"Tshirt",2013-10-1
2,3,"Ball",2013-10-1
3,3,"Food",2013-11-20
According to the tables, Tom has never ordered, so he will not be listed. Alex has ordered twice, and I want to show only his last order.
The output must look like this:
1,1,"Tshirt",2013-10-1, David
3,3,"Food",2013-11-20, Alex
I tried something like the code below, but Alex was listed twice. I don't understand how to fix it.
select *
from Clients t2
left join Orders t1
on t1.cid=t2.cid
where t1.ordertime<getutcdate()
order by t1.ordertime desc
Probably I must use DISTINCT or GROUP BY, but I don't understand how.
Select
x.oid,
x.cid,
x.title,
x.ordertime,
x.cname
From (
Select
o.oid,
o.cid,
o.title,
o.ordertime,
c.cname,
row_number() over (partition by o.cid order by o.ordertime desc) rn
From
Clients c
inner join
Orders o
on c.cid = o.cid
) x
Where
x.rn = 1
SELECT o.*, c.cname
FROM orders o INNER JOIN clients c on o.cid = c.cid
INNER JOIN (SELECT MAX(oid) as latestOrderid, cid FROM orders GROUP BY cid) as latest
on latest.latestOrderid = o.oid
Try this:
select o.*, c.cname
from clients c
join orders o on
c.cid = o.cid
and not exists (select *
from orders o2
where o2.cid = o.cid
and o2.ordertime > o.ordertime)
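The NOT EXISTS version can be checked against the question's sample data with a short sketch on an in-memory SQLite database. SQLite is only a stand-in for SQL Server here, and the dates are stored as ISO strings (an assumption) so that they compare correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (cid INTEGER PRIMARY KEY, cname TEXT);
    CREATE TABLE orders  (oid INTEGER PRIMARY KEY, cid INTEGER,
                          title TEXT, ordertime TEXT);
    INSERT INTO clients VALUES (1, 'David'), (2, 'Tom'), (3, 'Alex');
    INSERT INTO orders VALUES (1, 1, 'Tshirt', '2013-10-01'),
                              (2, 3, 'Ball',   '2013-10-01'),
                              (3, 3, 'Food',   '2013-11-20');
""")

# Keep an order only if the same client has no later order. The inner join
# drops Tom, who has no orders at all.
latest = conn.execute("""
    SELECT o.oid, o.cid, o.title, o.ordertime, c.cname
    FROM clients c
    JOIN orders o ON c.cid = o.cid
     AND NOT EXISTS (SELECT 1
                     FROM orders o2
                     WHERE o2.cid = o.cid
                       AND o2.ordertime > o.ordertime)
    ORDER BY o.oid
""").fetchall()

for row in latest:
    print(row)
# (1, 1, 'Tshirt', '2013-10-01', 'David')
# (3, 3, 'Food', '2013-11-20', 'Alex')
```

If two orders of one client share the same latest ordertime, this approach returns both of them, whereas the row_number() version picks one.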