Problem with duplicates in distinct when joining

I'm writing this query:
SELECT DISTINCT(o.id), o.status, p.excelID
FROM orders as o
LEFT JOIN (SELECT DISTINCT(orderId) id, excelID FROM parcels) as p on o.id = p.id
WHERE o.id is not null and p.id is not null
This is an example of the query results:
id   status  excelID
145  good    4444
145  good    3215
94   bad     9875
81   bad     5784
81   bad     5631
As you can see, I have duplicates in the id column even though I'm using DISTINCT. How can I write the query to get records like:
id   status  excelID
145  good    4444
94   bad     9875
81   bad     5784

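DISTINCT is not a function: it applies to the whole select list, so two rows with the same id but different excelID values both count as distinct. If you want the maximum excelID when an order has several, group by the order columns instead. The INNER JOIN below also replaces the LEFT JOIN plus the IS NOT NULL filters, which together behave like an inner join:
SELECT o.id, o.status, MAX(p.excelID) AS excelID
FROM orders AS o
INNER JOIN parcels AS p ON o.id = p.orderId
GROUP BY o.id, o.status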
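To see the difference between DISTINCT and GROUP BY on the question's data, here is a minimal sketch that runs both queries in an in-memory SQLite database from Python. The contents of the parcels table are an assumption, inferred from the result rows shown in the question; SQLite stands in for whatever database the asker uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders  (id INTEGER, status TEXT);
    CREATE TABLE parcels (orderId INTEGER, excelID INTEGER);
    -- Sample rows are assumed, reconstructed from the question's output.
    INSERT INTO orders  VALUES (145, 'good'), (94, 'bad'), (81, 'bad');
    INSERT INTO parcels VALUES (145, 4444), (145, 3215), (94, 9875),
                               (81, 5784), (81, 5631);
""")

# DISTINCT compares whole rows, so both parcels of order 145 survive.
distinct_rows = conn.execute("""
    SELECT DISTINCT o.id, o.status, p.excelID
    FROM orders AS o
    JOIN parcels AS p ON p.orderId = o.id
""").fetchall()

# GROUP BY collapses each order to one row; MAX picks a single excelID.
grouped_rows = conn.execute("""
    SELECT o.id, o.status, MAX(p.excelID) AS excelID
    FROM orders AS o
    JOIN parcels AS p ON p.orderId = o.id
    GROUP BY o.id, o.status
    ORDER BY o.id DESC
""").fetchall()

print(len(distinct_rows))  # 5 rows: the duplicates on id remain
print(grouped_rows)        # one row per order
```

The grouped query returns one row per order, which matches the desired output in the question.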

Related

Join on two columns, if null then only join on one

I have the following two tables:
customers:
customer_id  department_id
aaaa         1234
bbbb         3456
status:
department_id  customer_id  status
1234           NULL         silver
3456           bbbb         gold
1234           bbbb         gold
I want to join status on customers, but if it returns NULL I want to give the customer the department default. My ideal output would be the following:
customer_id  department_id  status
aaaa         1234           silver
bbbb         3456           gold
I have tried to do two left joins, but am getting a memory usage error. Is there another way to do this that is efficient?

You can do:
select c.*, coalesce(s.status, d.status) as status
from customers c
left join status d on d.department_id = c.department_id
and d.customer_id is null
left join status s on s.department_id = c.department_id
and s.customer_id = c.customer_id
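As a quick check of the two-LEFT-JOIN approach, the sketch below loads the question's sample rows into an in-memory SQLite database and runs the same query from Python. The column types are assumptions; the data comes from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id TEXT, department_id INTEGER);
    CREATE TABLE status (department_id INTEGER, customer_id TEXT, status TEXT);
    INSERT INTO customers VALUES ('aaaa', 1234), ('bbbb', 3456);
    INSERT INTO status VALUES (1234, NULL,   'silver'),
                              (3456, 'bbbb', 'gold'),
                              (1234, 'bbbb', 'gold');
""")

# d = department default (customer_id IS NULL), s = customer-specific row.
# COALESCE prefers the specific status and falls back to the default.
rows = conn.execute("""
    SELECT c.customer_id, c.department_id,
           COALESCE(s.status, d.status) AS status
    FROM customers c
    LEFT JOIN status d ON d.department_id = c.department_id
                      AND d.customer_id IS NULL
    LEFT JOIN status s ON s.department_id = c.department_id
                      AND s.customer_id = c.customer_id
    ORDER BY c.customer_id
""").fetchall()

print(rows)
```

Customer aaaa has no row of its own and falls back to the department default, while bbbb gets its own row, which is the output the question asks for.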

This might work:
SELECT c.*,
(
SELECT TOP 1 s.status
FROM status s
WHERE s.department_id = c.department_id
AND (s.customer_id = c.customer_id OR s.customer_id IS NULL)
ORDER BY CASE WHEN s.customer_id is NOT NULL THEN 0 ELSE 1 END
) as status
FROM customers c
The kicker is which database you're using. TOP 1 is SQL Server syntax. In MySQL you'd put LIMIT 1 at the end of the subquery instead, and in Oracle you'd look at ROWNUM or, in 12c and later, FETCH FIRST 1 ROW ONLY.
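For comparison, here is a sketch of the correlated-subquery idea in the LIMIT 1 dialect, run against SQLite from Python. It assumes the subquery should look only at rows in the customer's department and accept either an exact customer match or the NULL default row; the ORDER BY then puts the exact match first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id TEXT, department_id INTEGER);
    CREATE TABLE status (department_id INTEGER, customer_id TEXT, status TEXT);
    INSERT INTO customers VALUES ('aaaa', 1234), ('bbbb', 3456);
    INSERT INTO status VALUES (1234, NULL,   'silver'),
                              (3456, 'bbbb', 'gold'),
                              (1234, 'bbbb', 'gold');
""")

# For each customer, take the best matching status row:
# an exact customer match sorts before the department default.
rows = conn.execute("""
    SELECT c.customer_id, c.department_id,
           (SELECT s.status
            FROM status s
            WHERE s.department_id = c.department_id
              AND (s.customer_id = c.customer_id OR s.customer_id IS NULL)
            ORDER BY CASE WHEN s.customer_id IS NOT NULL THEN 0 ELSE 1 END
            LIMIT 1) AS status
    FROM customers c
    ORDER BY c.customer_id
""").fetchall()

print(rows)
```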

Assuming there is always a match on at least department_id, you can use an INNER JOIN and let the FIRST_VALUE() window function pick the proper status:
SELECT DISTINCT
c.customer_id,
c.department_id,
FIRST_VALUE(s.status) OVER (
PARTITION BY c.customer_id, c.department_id
ORDER BY CASE
WHEN s.customer_id = c.customer_id THEN 1
WHEN s.customer_id IS NULL THEN 2
ELSE 3
END
) status
FROM customers c INNER JOIN status s
ON s.department_id = c.department_id;
Depending on the database that you use the code may be simplified.

SQL MAX aggregate function not bringing the latest date

Purpose: I am trying to find each teacher's most recent purchase date, along with the order type of that purchase.
Orders table
ID  Ordertype  Status     TeacherID  PurchaseDate  SchoolID  TeacherassistantID
1   Pencils    Completed  1          1/1/2021      1         1
2   Paper      Completed  1          3/5/2021      1         1
3   Notebooks  Completed  1          4/1/2021      1         1
4   Erasers    Completed  2          2/1/2021      2         2
Teachers table
TeacherID  Teachername
1          Mary Smith
2          Jason Crane
School table
ID  schoolname
1   ABC school
2   PS1
3   PS2
Here is my attempted code:
SELECT o.ordertype, o.status, t.Teachername, s.schoolname
,MAX(o.Purchasedate) OVER (PARTITION by t.ID) last_purchase
FROM orders o
INNER JOIN teachers t ON t.ID=o.TeacherID
INNER JOIN schools s ON s.ID=o.schoolID
WHERE o.status in ('Completed','In-progress')
AND o.ordertype not like 'notebook'
It should look like this:
Ordertype  Status     teachername  last_purchase  schoolname
Paper      Completed  Mary Smith   3/5/2021       ABC School
Erasers    Completed  PS1          2/1/2021       ABC school
It is returning multiple rows instead of just the latest purchase date and its associated row. I think I need a subquery.

Aggregation functions are not appropriate for what you are trying to do. Their purpose is to summarize values across multiple rows, not to choose a particular row.
A window function on its own does not filter any rows.
You want to use a window function together with filtering:
SELECT ordertype, status, Teachername, schoolname, Purchasedate
FROM (SELECT o.ordertype, o.status, t.Teachername, s.schoolname,
o.Purchasedate,
ROW_NUMBER() OVER (PARTITION by t.ID ORDER BY o.PurchaseDate DESC) as seqnum
FROM orders o JOIN
teachers t
ON t.ID = o.TeacherID JOIN
schools s
ON s.ID = o.schoolID
WHERE o.status in ('Completed', 'In-progress') AND
o.ordertype not like 'notebook%'
) o
WHERE seqnum = 1;
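The ROW_NUMBER() approach can be tried end to end with the sketch below, which runs it in SQLite from Python (window functions need SQLite 3.25 or later). Several details are assumptions made for the demo: dates are stored as ISO strings so they sort correctly, the teachers table is joined on TeacherID as shown in the question's table, the TeacherassistantID column is left out, and the pattern is written as 'notebook%' so that 'Notebooks' is excluded, as the expected output implies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (ID INTEGER, Ordertype TEXT, Status TEXT,
                         TeacherID INTEGER, PurchaseDate TEXT, SchoolID INTEGER);
    CREATE TABLE teachers (TeacherID INTEGER, Teachername TEXT);
    CREATE TABLE schools (ID INTEGER, schoolname TEXT);
    INSERT INTO orders VALUES
        (1, 'Pencils',   'Completed', 1, '2021-01-01', 1),
        (2, 'Paper',     'Completed', 1, '2021-03-05', 1),
        (3, 'Notebooks', 'Completed', 1, '2021-04-01', 1),
        (4, 'Erasers',   'Completed', 2, '2021-02-01', 2);
    INSERT INTO teachers VALUES (1, 'Mary Smith'), (2, 'Jason Crane');
    INSERT INTO schools VALUES (1, 'ABC school'), (2, 'PS1'), (3, 'PS2');
""")

# Number each teacher's qualifying orders from newest to oldest,
# then keep only the newest one (seqnum = 1).
latest = conn.execute("""
    SELECT Ordertype, Status, Teachername, PurchaseDate, schoolname
    FROM (SELECT o.Ordertype, o.Status, t.Teachername, o.PurchaseDate,
                 s.schoolname,
                 ROW_NUMBER() OVER (PARTITION BY t.TeacherID
                                    ORDER BY o.PurchaseDate DESC) AS seqnum
          FROM orders o
          JOIN teachers t ON t.TeacherID = o.TeacherID
          JOIN schools s ON s.ID = o.SchoolID
          WHERE o.Status IN ('Completed', 'In-progress')
            AND o.Ordertype NOT LIKE 'notebook%'
         ) ranked
    WHERE seqnum = 1
    ORDER BY Teachername DESC
""").fetchall()

for row in latest:
    print(row)
```

Each teacher ends up with exactly one row, the most recent order that passes the filters.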

You can also approach it differently: group by the other columns, take MAX(Purchasedate) for each group, and order by that date. Note that TOP 1 keeps only the single most recent row overall, not one row per teacher.
SELECT TOP 1 o.ordertype, o.status, t.Teachername, s.schoolname,
MAX(o.Purchasedate) AS last_purchase
FROM orders o
INNER JOIN teachers t ON t.ID = o.TeacherID
INNER JOIN schools s ON s.ID = o.schoolID
WHERE o.status IN ('Completed', 'In-progress')
AND o.ordertype NOT LIKE 'notebook%'
GROUP BY o.ordertype, o.status, t.Teachername, s.schoolname
ORDER BY last_purchase DESC

Counting the amount of relations to one table to another

Very hard to create a good title for this.
Given the table products
productID
---------
892
583
388
And the table purchases
customerID productID
---------- ---------
56 892
97 388
56 583
56 388
97 583
How would I go about getting a table of all the customers that have bought all products?

You can use group by and having:
select customerId
from purchases
group by customerId
having count(distinct productID) = (select count(*) from products);
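To check this against the question's data, here is a small sketch that runs the same GROUP BY / HAVING query in an in-memory SQLite database from Python. The integer column types are an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (productID INTEGER);
    CREATE TABLE purchases (customerID INTEGER, productID INTEGER);
    INSERT INTO products VALUES (892), (583), (388);
    INSERT INTO purchases VALUES (56, 892), (97, 388), (56, 583),
                                 (56, 388), (97, 583);
""")

# A customer has bought everything when the number of distinct products
# they purchased equals the total number of products.
buyers = conn.execute("""
    SELECT customerID
    FROM purchases
    GROUP BY customerID
    HAVING COUNT(DISTINCT productID) = (SELECT COUNT(*) FROM products)
""").fetchall()

print(buyers)
```

Customer 56 bought all three products and is returned; customer 97 bought only two and is filtered out by the HAVING clause.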

Use a GROUP BY clause with HAVING:
SELECT pr.customerID
FROM products p INNER JOIN
purchases pr
on pr.productID = p.productID
GROUP BY pr.customerID
HAVING COUNT(DISTINCT pr.productID) = (SELECT COUNT(*) FROM products);

Access Cross tab Query

I have two tables.
I need to subtract the number of items ordered from the quantity currently recorded.
I can get the count() of the sales of each individual item like so:
SELECT SALES.PRODUCT_ID AS ORDERED_ID
,COUNT(SALES.PRODUCT_ID) AS ORDERED
FROM SALES
GROUP BY SALES.PRODUCT_ID
Which gives me:
ORDERED_ID ORDERED
1201 2
1202 2
1204 2
1205 3
1206 1
1207 2
1208 1
1209 1
1210 3
Getting the quantity is just a matter of
SELECT PRODUCT.PRODUCT_ID AS INVEN_ID
,PRODUCT.QUANTITY AS INVEN
FROM PRODUCT
Which gives me:
INVEN_ID INVEN
1199 5
1200 2
1201 33
1202 44
1203 55
1204 66
1205 77
1206 88
1207 99
1208 110
1209 121
1210 132
I've spent hours on this problem and gave up at what I thought should be the solution:
SELECT SUB1.INVEN - SUB2.ORDERED
FROM
(SELECT PRODUCT.PRODUCT_ID AS INVEN_ID
,PRODUCT.QUANTITY AS INVEN
FROM PRODUCT
)AS SUB1
,(SELECT SALES.PRODUCT_ID AS ORDERED_ID
,COUNT(SALES.PRODUCT_ID) AS ORDERED
FROM SALES
GROUP BY SALES.PRODUCT_ID
)AS SUB2
INNER JOIN SUB1 ON SUB1.INVEN_ID = SUB2.ORDERED_ID
However, Access does not recognize that last join as a valid join, and without it I just get a Cartesian product. If I try to retrieve quantity without a subquery and just SELECT product.quantity - SUB2.ORDERED, Access demands that I put product.quantity - SUB2.ORDERED in an aggregate function. When I do what it says, it then tells me that product.quantity - SUB2.ORDERED can't be in an aggregate function. I'm at a loss.
EDIT:
Final Solution:
SELECT SUB1.INVEN_ID AS PRODUCT_ID
,SUB1.PRODUCT_NAME AS PRODUCT_NAME
,SUB1.PRICE AS PRICE
,SUB1.INVEN - NZ(SUB2.ORDERED,0) AS AVAILABLE
FROM
(SELECT PRODUCT.PRODUCT_ID AS INVEN_ID
,PRODUCT.PRODUCT_NAME AS PRODUCT_NAME
,PRODUCT.PRICE AS PRICE
,PRODUCT.QUANTITY AS INVEN
FROM PRODUCT
)AS SUB1
LEFT JOIN
(SELECT SALES.PRODUCT_ID AS ORDERED_ID
,COUNT(SALES.PRODUCT_ID) AS ORDERED
FROM SALES
GROUP BY SALES.PRODUCT_ID
)AS SUB2 ON SUB1.INVEN_ID = SUB2.ORDERED_ID

Your INNER JOIN should come right after the first subquery, in place of the comma.
I think you are looking for a LEFT JOIN, because PRODUCT should be the master table.
With a LEFT JOIN, SUB2.ORDERED can be NULL for products that have no sales, so wrap it in the NZ function,
or use IIF(ISNULL(SUB2.ORDERED),0,SUB2.ORDERED) to check.
You can try this:
SELECT SUB1.INVEN - NZ(SUB2.ORDERED,0)
FROM
(SELECT PRODUCT.PRODUCT_ID AS INVEN_ID
,PRODUCT.QUANTITY AS INVEN
FROM PRODUCT
)AS SUB1
LEFT JOIN
(SELECT SALES.PRODUCT_ID AS ORDERED_ID
,COUNT(SALES.PRODUCT_ID) AS ORDERED
FROM SALES
GROUP BY SALES.PRODUCT_ID
)AS SUB2 ON SUB1.INVEN_ID = SUB2.ORDERED_ID
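The LEFT JOIN pattern can be sketched outside Access as well. The example below runs it in SQLite from Python, with COALESCE standing in for Access's NZ function (an assumption made for portability, since NZ is Access-specific). Only three of the question's products are loaded, and the individual SALES rows are made up to match the counts shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PRODUCT (PRODUCT_ID INTEGER, QUANTITY INTEGER);
    CREATE TABLE SALES (PRODUCT_ID INTEGER);
    INSERT INTO PRODUCT VALUES (1199, 5), (1201, 33), (1205, 77);
    -- Two sales of 1201 and three of 1205; 1199 was never sold.
    INSERT INTO SALES VALUES (1201), (1201), (1205), (1205), (1205);
""")

# PRODUCT is the master table, so every product is kept;
# products without sales get 0 instead of NULL.
rows = conn.execute("""
    SELECT p.PRODUCT_ID,
           p.QUANTITY - COALESCE(s.ORDERED, 0) AS AVAILABLE
    FROM PRODUCT p
    LEFT JOIN (SELECT PRODUCT_ID, COUNT(*) AS ORDERED
               FROM SALES
               GROUP BY PRODUCT_ID) s
           ON s.PRODUCT_ID = p.PRODUCT_ID
    ORDER BY p.PRODUCT_ID
""").fetchall()

print(rows)
```

Product 1199 keeps its full quantity because the missing count is treated as zero, which is the case an INNER JOIN would have dropped.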

How to join 2 tables with select and count in single query

I need to join 2 tables (Person and PersonLine). The result should contain the id and name columns from the Person table and, for each id, the count of the personlineid column from the PersonLine table. But my SQL query returns the count of all personlineid values for every row. Can anyone help me write the SQL?
Person:
ID NAME AGE
100 John 25
101 James 30
102 Janet 35
PersonLine:
ID NAME PERSONLINEID
100 John 1
100 John 2
100 John 3
101 James 1
101 James 2
102 Janet 1
SQL:
SELECT P.ID, CNT.COUNT_PERSONLINE, P.NAME
FROM PERSON P
LEFT JOIN PERSONLINE PL
ON P.ID = PL.ID,
(SELECT count(PL.PERSONLINEID) cnt FROM PERSON P LEFT JOIN PERSONLINE PL ON P.ID = PL.ID WHERE
P.ID = PL.ID) cnt
JOIN Table (Expected):
ID COUNT_PERSONLINE NAME
100 3 John
101 2 James
102 1 Janet
JOIN Table (Actual):
ID COUNT_PERSONLINE NAME
100 6 John
101 6 James
102 6 Janet

With your sample data, you don't even need the Person table -- because you seem to have redundant data in the two tables. You should probably fix this, but:
select pl.id, pl.name, count(*)
from personline pl
group by pl.id, pl.name;
Your count is just counting all the rows from the join of the tables -- that would be all the rows. A simple aggregation should suffice, even if you decide that the join is still necessary.
EDIT:
You have several choices when person has lots of columns. One method is to put them in the group by:
select pl.id, pl.name, p.col1, p.col2, count(*)
from person p join
personline pl
on p.id = pl.id
group by pl.id, pl.name, p.col1, p.col2
Another method is to do the aggregation before the join:
select p.*, pl.cnt
from person p join
(select pl.id, pl.name, count(*) as cnt
from personline pl
group by pl.id, pl.name
) pl
on p.id = pl.id;
Or, a correlated subquery:
select p.*, (select count(*) from personline pl where p.id = pl.id) as cnt
from person p;
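As an illustration of aggregating before the join, the sketch below loads the question's sample rows into SQLite from Python and counts the PersonLine rows per person. The column types are assumptions; the output column name COUNT_PERSONLINE follows the question's expected table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PERSON (ID INTEGER, NAME TEXT, AGE INTEGER);
    CREATE TABLE PERSONLINE (ID INTEGER, NAME TEXT, PERSONLINEID INTEGER);
    INSERT INTO PERSON VALUES (100, 'John', 25), (101, 'James', 30),
                              (102, 'Janet', 35);
    INSERT INTO PERSONLINE VALUES (100, 'John', 1), (100, 'John', 2),
                                  (100, 'John', 3), (101, 'James', 1),
                                  (101, 'James', 2), (102, 'Janet', 1);
""")

# Count lines per person first, then join the one-row-per-ID result
# back to PERSON so the counts are not multiplied by the join.
rows = conn.execute("""
    SELECT p.ID, pl.cnt AS COUNT_PERSONLINE, p.NAME
    FROM PERSON p
    JOIN (SELECT ID, COUNT(*) AS cnt
          FROM PERSONLINE
          GROUP BY ID) pl
      ON pl.ID = p.ID
    ORDER BY p.ID
""").fetchall()

print(rows)
```

Each person now gets their own count (3, 2 and 1) instead of the total of 6 that the original query repeated on every row.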
