How to join records from a group by - sql

I don't know if the title is as descriptive as I wanted, but I'll try to explain what I want with a real example.
In my table 'Details' I have
Date     | ProductId | Total
---------|-----------|------
17/05/20 | 16788     | 62
19/05/20 | 3789      | 15
So I want the result to be something like this:
17/05/20 - 16788 - 62
17/05/20 - 3789 - NULL (or 0)
19/05/20 - 16788 - NULL (or 0)
19/05/20 - 3789 - 15
I started with a RIGHT JOIN and a GROUP BY on the dates, but it didn't work.
I've run out of ideas; can someone help me?
Thanks in advance.

You can generate the rows with a cross join and then bring in the values using left join:
SELECT d.date,
       p.productid,
       t.total
FROM (
    SELECT DISTINCT date
    FROM details
) d
CROSS JOIN (
    SELECT DISTINCT productid
    FROM details
) p
LEFT JOIN details t
       ON t.date = d.date
      AND t.productid = p.productid
ORDER BY
    d.date,
    p.productid DESC;
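As a quick check of the cross join plus left join idea, here is a minimal sketch that runs the same query against an in-memory SQLite database from Python. The table layout follows the question; storing the dates as ISO text is my assumption, made so that they sort correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE details (date TEXT, productid INTEGER, total INTEGER);
    INSERT INTO details VALUES ('2020-05-17', 16788, 62);
    INSERT INTO details VALUES ('2020-05-19', 3789, 15);
""")

# Build every (date, product) pair first, then attach the totals that exist.
rows = conn.execute("""
    SELECT d.date, p.productid, t.total
    FROM (SELECT DISTINCT date FROM details) d
    CROSS JOIN (SELECT DISTINCT productid FROM details) p
    LEFT JOIN details t
           ON t.date = d.date
          AND t.productid = p.productid
    ORDER BY d.date, p.productid DESC
""").fetchall()

for row in rows:
    print(row)  # missing combinations come back with total = None (SQL NULL)
```

If you want 0 instead of NULL, wrapping the column as COALESCE(t.total, 0) should do it.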

Related

Count current job with a partition

The following code joins users and payments to get each user's last payment.
The query would work if the payments table did not contain several rows with the same maximum date, such as the ones below.
Note that the rows are not exact duplicates; they sometimes differ slightly. We do not care which of them is selected, as long as exactly one is returned.
user_ID | Payment | date     | product | credit_card
1       | 300     | 1/1/2020 | A       | No
1       | 300     | 1/1/2020 | Null    | No
1       | 300     | 1/1/2021 | A       | Yes
1       | 300     | 1/1/2021 | Null    | Yes
This causes the second inner join to duplicate rows, because maxDate (1/1/2021) matches two payment rows:
SELECT a.*, c.*
FROM users a
INNER JOIN payments c
ON a.id = c.user_ID
INNER JOIN
(
SELECT user_ID, MAX(date) maxDate
FROM payments
GROUP BY user_ID
) b ON c.user_ID = b.user_ID AND
c.date = b.maxDate
I'm looking for a way to select only the first match for maxDate. Any clue is welcome; thanks in advance for any help.
You should be using window functions for this. That would be:
SELECT u.*, p.*
FROM users u JOIN
(SELECT p.*,
ROW_NUMBER() OVER (PARTITION BY p.user_id ORDER BY p.date DESC) as seqnum
FROM payments p
) p
ON p.user_ID = u.id AND p.seqnum = 1;
This returns one row per user. When several payments share the latest date, which of them is returned is arbitrary.
Note the use of meaningful table aliases in the query: u for users and p for payments. Don't use meaningless letters. They just make the query hard to read and to maintain.
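To see the ROW_NUMBER() approach in action, here is a small sketch using Python's sqlite3 module with the sample payments from the question. The users table contents and the ISO date strings are assumptions made for the demo, and it needs an SQLite build with window functions (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE payments (user_id INTEGER, payment INTEGER, date TEXT,
                           product TEXT, credit_card TEXT);
    INSERT INTO users VALUES (1, 'alice');  -- hypothetical user row
    INSERT INTO payments VALUES
        (1, 300, '2020-01-01', 'A',  'No'),
        (1, 300, '2020-01-01', NULL, 'No'),
        (1, 300, '2021-01-01', 'A',  'Yes'),
        (1, 300, '2021-01-01', NULL, 'Yes');
""")

# Number each user's payments from newest to oldest and keep only number 1.
latest = conn.execute("""
    SELECT u.id, p.payment, p.date, p.product, p.credit_card
    FROM users u
    JOIN (SELECT p.*,
                 ROW_NUMBER() OVER (PARTITION BY p.user_id
                                    ORDER BY p.date DESC) AS seqnum
          FROM payments p) p
      ON p.user_id = u.id AND p.seqnum = 1
""").fetchall()

print(latest)  # one row for user 1, dated 2021-01-01
```

Which of the two 2021 rows comes back is not defined. If that ever matters, adding a second key to the ORDER BY inside OVER (for example the product column) would make the choice repeatable.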

Select MAX and RIGHT OUTER JOIN

I have a platform that extracts data from SQL tables, and so far all queries were generated by a simple drag-and-drop tool. Now I am trying to change a query manually, but it's not working as expected...
Can you take a look?
Query delivered by generator:
SELECT
repo.MAT.MAT_A_COD,
inventory.INV.MRP_RQMT_DT,
SUM(inventory.INV.MRP_AVL_QTY)
FROM
repo.MAT RIGHT OUTER JOIN inventory.INV ON (inventory.INV.MRP_MAT_A_FK=repo.MAT.MAT_A_PK)
WHERE
( inventory.INV.MRP_COMPANY_COD IN ('01','02') )
GROUP BY
1,
2
Results:
Material A | 2020.01.01 | 100
Material A | 2020.01.02 | 200
Material A | 2020.01.03 | 300
Material B | 2020.01.01 | 10
Material B | 2020.01.02 | 0
What I am looking for: only values for the latest date for each material.
Material A | 2020.01.03 | 300
Material B | 2020.01.02 | 0
I tried with MAX(inventory.INV.MRP_RQMT_DT), but no success. Any help is appreciated!
You can try the query below. The subquery is correlated on the material key, so only the rows on each material's latest date are kept:
SELECT
    repo.MAT.MAT_A_COD,
    inventory.INV.MRP_RQMT_DT,
    SUM(inventory.INV.MRP_AVL_QTY)
FROM
    repo.MAT RIGHT OUTER JOIN inventory.INV ON inventory.INV.MRP_MAT_A_FK=repo.MAT.MAT_A_PK
WHERE
    inventory.INV.MRP_COMPANY_COD IN ('01','02')
    AND inventory.INV.MRP_RQMT_DT = (
        -- the MAX must be taken over the inner alias inv1, not the outer row
        SELECT MAX(inv1.MRP_RQMT_DT)
        FROM inventory.INV inv1
        WHERE inv1.MRP_MAT_A_FK = inventory.INV.MRP_MAT_A_FK
    )
GROUP BY 1, 2
You did not specify the database engine, but the RANK window function works in many major ones (I will use T-SQL syntax).
SELECT * FROM (
    SELECT
        repo.MAT.MAT_A_COD,
        inventory.INV.MRP_RQMT_DT,
        SUM(inventory.INV.MRP_AVL_QTY) AS MRP_AVL_QTY,
        -- DESC so that the latest date of each material gets rank 1
        RANK() OVER (PARTITION BY repo.MAT.MAT_A_COD ORDER BY inventory.INV.MRP_RQMT_DT DESC) AS rn
    FROM repo.MAT RIGHT OUTER JOIN inventory.INV ON (inventory.INV.MRP_MAT_A_FK=repo.MAT.MAT_A_PK)
    WHERE inventory.INV.MRP_COMPANY_COD IN ('01','02')
    -- T-SQL does not accept column positions in GROUP BY
    GROUP BY repo.MAT.MAT_A_COD, inventory.INV.MRP_RQMT_DT
) t
WHERE rn = 1
You can use window functions:
SELECT m.MAT_A_COD, i.MRP_RQMT_DT,
       SUM(i.MRP_AVL_QTY)
FROM repo.MAT m LEFT JOIN
     (SELECT i.*,
             MAX(MRP_RQMT_DT) OVER (PARTITION BY MRP_MAT_A_FK) as max_MRP_RQMT_DT
      FROM inventory.INV i
     ) i
     ON i.MRP_MAT_A_FK = m.MAT_A_PK AND
        i.MRP_RQMT_DT = i.max_MRP_RQMT_DT
WHERE i.MRP_COMPANY_COD IN ('01', '02')
GROUP BY 1, 2;
Note other changes to the query:
Table aliases make the query easier to write and to read.
An outer join does not seem necessary at all. But if you do use one, it is probably on the MAT table, not the inventory table.
If you use an outer join, you should try to take the columns from the table where you are keeping all the rows -- the first table in a LEFT JOIN. I don't recommend RIGHT JOINs in general.
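Here is a rough, runnable sketch of the "latest date per material" idea with a windowed MAX, using Python's sqlite3 module. The unqualified table names mat and inv, the sample rows, and the choice to apply the company filter inside the subquery (so that the maximum only considers the selected companies) are all my assumptions. It needs SQLite 3.25 or later for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mat (mat_a_pk INTEGER PRIMARY KEY, mat_a_cod TEXT);
    CREATE TABLE inv (mrp_mat_a_fk INTEGER, mrp_rqmt_dt TEXT,
                      mrp_avl_qty INTEGER, mrp_company_cod TEXT);
    INSERT INTO mat VALUES (1, 'Material A'), (2, 'Material B');
    INSERT INTO inv VALUES
        (1, '2020-01-01', 100, '01'),
        (1, '2020-01-02', 200, '01'),
        (1, '2020-01-03', 100, '01'),
        (1, '2020-01-03', 200, '02'),
        (2, '2020-01-01',  10, '01'),
        (2, '2020-01-02',   0, '02'),
        (2, '2020-01-05', 999, '03');  -- other company, must be ignored
""")

# Tag every inventory row with its material's latest date, keep only the
# rows on that date, then sum the quantities per material.
latest_qty = conn.execute("""
    SELECT m.mat_a_cod, i.mrp_rqmt_dt, SUM(i.mrp_avl_qty) AS qty
    FROM mat m
    JOIN (SELECT i.*,
                 MAX(mrp_rqmt_dt) OVER (PARTITION BY mrp_mat_a_fk) AS max_dt
          FROM inv i
          WHERE i.mrp_company_cod IN ('01', '02')) i
      ON i.mrp_mat_a_fk = m.mat_a_pk
     AND i.mrp_rqmt_dt = i.max_dt
    GROUP BY m.mat_a_cod, i.mrp_rqmt_dt
    ORDER BY m.mat_a_cod
""").fetchall()

for row in latest_qty:
    print(row)
```

With this sample data the output matches what the question asks for: Material A on 2020-01-03 with 300 and Material B on 2020-01-02 with 0.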

SQL - Tricky Merging

I'm facing a tricky query to do. I hope your expertise will help me to sort it out.
There are 2 tables:
Table1 : Orders
Index ProductName OrderDate
0 a 03/03/1903
1 a 10/03/2014
2 b 01/01/2017
3 c 01/01/2019
Table2 : Product Specs
--> This table shows every change made in the Color of our products
Index ProductName Color ColorUpdatedOn
0 a Blue 01/01/1900
1 a Red 01/01/2014
2 a Yellow 01/01/2017
3 b Pink 01/01/2017
4 c Black 01/01/2018
5 c Black 31/12/2018
I would like to retrieve all the rows from Table1 together with the columns Color and ColorUpdatedOn:
Index ProductName OrderDate Color ColorUpdatedOn
0 a 03/03/1903 Blue 01/01/1900
1 a 10/03/2014 Red 01/01/2014
2 a 01/01/2019 Yellow 01/01/2017
3 c 01/01/2019 Black 31/12/2018
Do you have any idea how I could do this ?
Thank you in advance for your help
Largo
Get the max() date from the Product Specs table for each color,
then join on the year using the year() function. This works in MySQL and SQL Server; I'm not sure about other databases.
select o.Index, o.ProductName, o.OrderDate, t1.color, t1.ColorUpdatedOn
from Orders o
inner join
    (select color, max(colorupdatedon) as ColorUpdatedOn
     from productspecs
     group by color) t1 on year(o.OrderDate) = year(t1.ColorUpdatedOn)
I would prefer the right() function, though, since the year is at the end of your dates (this assumes the dates are stored as text).
select o.Index, o.ProductName, o.OrderDate, t1.color, t1.ColorUpdatedOn
from Orders o
inner join
    (select color, max(colorupdatedon) as ColorUpdatedOn
     from productspecs
     group by color) t1 on right(o.OrderDate, 4) = right(t1.ColorUpdatedOn, 4)
In a database that supports lateral joins (which is quite a few of them now), this is pretty easy:
select o.*, s.* -- select the columns you want
from orders o left join lateral
(select s.*
from specs s
where s.ProductName = o.ProductName and
s.ColorUpdatedOn <= o.OrderDate
order by s.ColorUpdatedOn desc
fetch first 1 row only
) s
on 1=1;
In SQL Server, this would use outer apply rather than left join lateral.
In other databases, I would use lead():
select o.*, s.* -- select the columns you want
from orders o left join
(select s.*,
lead(ColorUpdatedOn) over (partition by ProductName order by ColorUpdatedOn) as next_ColorUpdatedOn
from specs s
) s
on s.ProductName = o.ProductName and
o.OrderDate >= s.ColorUpdatedOn and
(o.OrderDate < s.next_ColorUpdatedOn or s.next_ColorUpdatedOn is null)
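For anyone who wants to try the lead() version, here is a minimal sketch against SQLite from Python, loaded with the sample data from the question. The snake_case column names, the idx column (Index is a reserved word), and the ISO date strings are assumptions made for the demo. Window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (idx INTEGER, product_name TEXT, order_date TEXT);
    CREATE TABLE specs (idx INTEGER, product_name TEXT, color TEXT,
                        color_updated_on TEXT);
    INSERT INTO orders VALUES
        (0, 'a', '1903-03-03'), (1, 'a', '2014-03-10'),
        (2, 'b', '2017-01-01'), (3, 'c', '2019-01-01');
    INSERT INTO specs VALUES
        (0, 'a', 'Blue',   '1900-01-01'), (1, 'a', 'Red',   '2014-01-01'),
        (2, 'a', 'Yellow', '2017-01-01'), (3, 'b', 'Pink',  '2017-01-01'),
        (4, 'c', 'Black',  '2018-01-01'), (5, 'c', 'Black', '2018-12-31');
""")

# Each spec row is valid from its own date until the next change of the same
# product; an order matches the spec whose validity window contains it.
result = conn.execute("""
    SELECT o.idx, o.product_name, o.order_date, s.color, s.color_updated_on
    FROM orders o
    LEFT JOIN (SELECT s.*,
                      LEAD(color_updated_on) OVER (
                          PARTITION BY product_name
                          ORDER BY color_updated_on) AS next_updated_on
               FROM specs s) s
      ON s.product_name = o.product_name
     AND o.order_date >= s.color_updated_on
     AND (o.order_date < s.next_updated_on OR s.next_updated_on IS NULL)
    ORDER BY o.idx
""").fetchall()

for row in result:
    print(row)
```

The comparison works on text here only because ISO dates sort in chronological order. With dd/mm/yyyy strings you would have to convert them to real dates first.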
Assuming the data type of both OrderDate and ColorUpdatedOn is date, we can find the color that was in effect at the time of each order.
For this I have used an analytical (windowing) function. The Hive query would look like this:
SELECT
y.ProductName, y.OrderDate, y.Color, y.ColorUpdatedOn
FROM (
SELECT
x.*,
DENSE_RANK() OVER(PARTITION BY x.ProductName, x.OrderDate ORDER BY x.recency ASC) AS relevance
FROM (
SELECT
a.*, b.color, b.ColorUpdatedOn, DATEDIFF(a.OrderDate, b.ColorUpdatedOn) AS recency
FROM
Order a
INNER JOIN
Product b
ON (
a.ProductName = b.ProductName
AND a.OrderDate >= b.ColorUpdatedOn
)
) x
) y
WHERE
y.relevance = 1;
The query could be made specific if you let me know the database you are using.
Let me know if it helps.

Select all from max date

Good morning,
I am writing a SQL query for the latest metal prices with the latest date they were put into the database. Example table below:
ID Date Created
1 01/01/01 01:01
2 01/01/01 01:02
3 01/01/01 01:03
4 01/01/01 01:04
1 02/01/01 01:01
2 02/01/01 01:02
So from this I want the following result:
ID Date Created
1 02/01/01 01:01
2 02/01/01 01:02
When I run the query below, it only gives me the last row entered into the database, so for the example above that would be ID 2, DateCreated 02/01/01 01:02. The query I am using is:
SELECT mp.MetalSourceID, ROUND(mp.PriceInPounds,2),
mp.UnitPrice, mp.HighUnitPrice, mp.PreviousUnitPrice,
mp.PreviousHighUnitPrice, ms.MetalSourceName,
ms.UnitBasis, cu.Currency
FROM tblMetalPrice AS mp
INNER JOIN tblMetalSource AS ms
ON tblMetalPrice.MetalSourceID = tblMetalSource.MetalSourceID
INNER JOIN tblCurrency AS cu
ON tblMetalSource.CurrencyID = tblCurrency.CurrencyID
WHERE DateCreated = (SELECT MAX (DateCreated) FROM tblMetalPrice)
GROUP BY mp.MetalSourceID;
Could anyone please help? It's driving me crazy not knowing, and my brain is dead this Friday morning.
Thanks
Use a correlated subquery for the where clause:
WHERE DateCreated = (SELECT MAX(DateCreated) FROM tblMetalPrice mp2 WHERE mp2.id = mp.id)
You can join on a subquery, and I don't think you'll need the group by, or indeed the where clause (because that's handled by the join).
SELECT mp.MetalSourceID,
       ROUND(mp.PriceInPounds,2),
       mp.UnitPrice,
       mp.HighUnitPrice,
       mp.PreviousUnitPrice,
       mp.PreviousHighUnitPrice,
       ms.MetalSourceName,
       ms.UnitBasis,
       cu.Currency
FROM tblMetalPrice AS mp
-- refer to the tables by the aliases declared in the FROM clause
INNER JOIN tblMetalSource AS ms
    ON mp.MetalSourceID = ms.MetalSourceID
INNER JOIN tblCurrency AS cu
    ON ms.CurrencyID = cu.CurrencyID
INNER JOIN (SELECT ID, MAX(DateCreated) AS maxdate FROM tblMetalPrice GROUP BY ID) AS md
    ON md.ID = mp.ID
   AND md.maxdate = mp.DateCreated
with maxDates as
(select max(datecreated) maxd, ids grp, count(1) members from s_tableA group by ids having count(1) > 1)
select ids, datecreated from s_tableA, maxDates
where maxd = datecreated and ids = grp;
This query will give your desired result. Correlated subqueries can consume a lot of processing time, because conceptually the inner query is evaluated once for each row of the outer query.
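To compare the two "latest row per ID" approaches suggested above, here is a small sketch with Python's sqlite3 module. The table name metal_price, the reduced two-column layout, and the ISO timestamps (reading 02/01/01 as the later day) are assumptions based on the example table in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE metal_price (id INTEGER, date_created TEXT);
    INSERT INTO metal_price VALUES
        (1, '2001-01-01 01:01'), (2, '2001-01-01 01:02'),
        (3, '2001-01-01 01:03'), (4, '2001-01-01 01:04'),
        (1, '2001-01-02 01:01'), (2, '2001-01-02 01:02');
""")

# Variant 1: correlated subquery, the maximum is looked up per outer row.
correlated = conn.execute("""
    SELECT mp.id, mp.date_created
    FROM metal_price mp
    WHERE mp.date_created = (SELECT MAX(mp2.date_created)
                             FROM metal_price mp2
                             WHERE mp2.id = mp.id)
    ORDER BY mp.id
""").fetchall()

# Variant 2: aggregate once per id, then join back to the detail rows.
joined = conn.execute("""
    SELECT mp.id, mp.date_created
    FROM metal_price mp
    JOIN (SELECT id, MAX(date_created) AS maxdate
          FROM metal_price
          GROUP BY id) md
      ON md.id = mp.id AND md.maxdate = mp.date_created
    ORDER BY mp.id
""").fetchall()

print(correlated)
print(joined)
```

Both variants return the same rows. Note that IDs 3 and 4 are included with their only entry; the HAVING COUNT(1) > 1 in the CTE version is what drops IDs that have a single row.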

Cross Join Query

Rewording my original post for further clarification.
I currently have the tables below:
Product_Ref
product_id
product_name
Products
product_id
so_date (date)
total_sales
Calendar
dt (date field, each row representing a single day for the past/next 10 years)
I am looking to produce a report that tells me, on a daily basis, the number of products sold in the past 6 months (based on SYSDATE). The report should contain every combination of day in the last 6 months and every possible product_id, in the format:
Product id | date | total sales
Even if there were 0 entries in the products table (i.e. no sales), I would still expect a complete report, which would then show 6 months of zeroed data, i.e.
1 | 2012-01-01 | 0
2 | 2012-01-01 | 0
3 | 2012-01-01 | 0
1 | 2012-01-02 | 0
2 | 2012-01-02 | 0
3 | 2012-01-02 | 0
…
This would assume there were 3 products in the product_reference table - my original query (noted below) was my starter for 10, but not sure where to go from here.
SELECT products.product_id, calendar.dt, products.total_sales
FROM products RIGHT JOIN calendar ON (products.so_date = calendar.dt)
WHERE calendar.dt < SYSDATE AND calendar.dt >= ADD_MONTHS(SYSDATE, -7)+1
ORDER BY calendar.dt ASC, products.product_id DESC;
The clue is in the question - you are looking for a CROSS JOIN.
-- take product_id from Product_Ref, which always has a row; COALESCE turns missing sales into 0
SELECT Product_Ref.product_id, calendar.dt, COALESCE(products.total_sales, 0) AS total_sales
FROM Product_Ref
CROSS JOIN calendar
LEFT JOIN products ON products.so_date = calendar.dt
                  AND products.product_id = Product_Ref.product_id
WHERE calendar.dt < SYSDATE AND calendar.dt >= ADD_MONTHS(SYSDATE, -7)+1
ORDER BY calendar.dt ASC, Product_Ref.product_id DESC;
I was confused at first by your table names, where "Products" in fact means "sales" and "Product_Ref" is the product table!
This is very similar to an example of the use of CROSS JOIN I once posted here.
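Here is a small runnable sketch of the calendar × product grid, using SQLite from Python. SQLite has no SYSDATE or ADD_MONTHS, so the date window is passed in as two bound parameters; that substitution, the sample rows, and the ascending product order (to match the sample output in the question) are my assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_ref (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE products (product_id INTEGER, so_date TEXT, total_sales INTEGER);
    CREATE TABLE calendar (dt TEXT);
    INSERT INTO product_ref VALUES (1, 'p1'), (2, 'p2'), (3, 'p3');
    INSERT INTO calendar VALUES ('2012-01-01'), ('2012-01-02'), ('2012-01-03');
    INSERT INTO products VALUES (2, '2012-01-02', 5);  -- the only sale
""")

# Every product on every day in the window; days without a sale show 0.
report = conn.execute("""
    SELECT pr.product_id, c.dt, COALESCE(p.total_sales, 0) AS total_sales
    FROM product_ref pr
    CROSS JOIN calendar c
    LEFT JOIN products p
           ON p.so_date = c.dt
          AND p.product_id = pr.product_id
    WHERE c.dt >= ? AND c.dt < ?
    ORDER BY c.dt, pr.product_id
""", ("2012-01-01", "2012-01-03")).fetchall()

for row in report:
    print(row)
```

Even with an empty products table the grid still comes back complete, just with 0 in every total_sales cell.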
As far as I understand, what you want is to have no result if there were no sales, right?
So I think you just need to change the RIGHT JOIN to an INNER JOIN.
With a RIGHT JOIN, if there are rows in the JOIN table without a match in the FROM table, the query returns the data from the JOIN table, with NULL values in the columns that refer to the FROM table.
With an INNER JOIN, you get results only when there is matching data in both tables.
I hope I understood correctly and that this helps.
Assuming your desired output is only the product sale dates that match those in the calendar table, you should use an INNER JOIN:
SELECT c.dt, p.product_id, p.total_sales
FROM calendar c
INNER JOIN products p on c.dt = p.so_date
WHERE c.dt < SYSDATE and c.dt >= ADD_MONTHS(SYSDATE,-7)+1
ORDER BY c.dt ASC, p.product_id DESC;
A CROSS JOIN would produce results with every combination from your products table and your calendar table and thus not require the use of ON.
--EDIT
See edits below (UNTESTED):
SELECT PR.Product_ID, C.dt, P.total_sales
FROM Calendar C
CROSS JOIN Product_Ref PR
LEFT JOIN Products P ON P.Product_Id = PR.Product_Id and P.so_date = C.dt
WHERE C.dt < SYSDATE and C.dt >= ADD_MONTHS(SYSDATE,-7)+1
ORDER BY C.dt ASC, PR.Product_Id DESC;
Good luck.