Oracle SQL Selecting Most Recent Data - sql

Good morning,
This is a follow-up to "SELECT most recent in Oracle SQL Query".
I am hoping to take my Oracle skills to the next level after learning a lot from this site.
I work for a small construction company, so we buy a lot of smaller parts and materials from our vendors. Sometimes, within the same calendar year, we switch the vendor we buy the SAME part from. I want to grab only the most recent VENDOR for each individual PART NUMBER. Here is an example of what I mean:
The code for my starting query:
WITH
PartNums AS -- Grabs me all of the stuff we "bought", and its vendor, in the construction division since Jan 1 2017
(
SELECT
PO_ITEM AS "PART_NUM",
VEND_NUM,
VEND_NM,
PODiv AS "DIVISION_CD"
FROM
INNER JOIN
(
SELECT MAX(PODate) OVER(PARTITION BY PO_Number, VEND_NUM))
FROM tblPurchases
WHERE PODate > '01-Jan-2017'
) tblTemp INNER JOIN tblPurchases ON tblPurchases.VEND_NUM = tblTemp.VEND_NUM
INNER JOIN tblVendors ON tblPurchases.VEND_NUM = tblVendors.VEND_NUM
WHERE
PODate > '01-Jan-2017'
AND
PODiv = 'C'
),
Defects AS -- Grabs me the listed defects against their stuff
(
SELECT
PartNums.*,
DEFECT_NUM,
DEFECT_CAT
FROM
PartNums
INNER JOIN tblDefects ON PartNums.PART_NUM = tblDefects.DEFECTIVE_PART_NUM
WHERE
DEFECT_DATE > '01-Jan-2017'
),
Names AS
(
SELECT
Defects.*,
PART_NM
FROM
Defects
INNER JOIN tblParts ON Defects.PART_NUM = tblParts.PART_NUM
)
SELECT
VEND_NUM,
VEND_NM,
PART_NUM,
PART_NM,
DEFECT_NUM,
DEFECT_CAT,
DIVISION_CD
FROM Names
This produces the following results:
| Vendor Number | Vendor Name | Part Number | Part Name | Defect Number | Defect Category | Division | Purchase Order Date |
|---------------|------------------------------|-------------|----------------|---------------|-----------------|----------|---------------------|
| 200123 | Push-Button LLC | 54211EW | Faceplate | PROB333211 | WRPT | C | 11-Jan-2017 |
| 200587 | Entirely Concrete | 69474TR | 2in Screw | PROB587412 | WRPT | C | 03-Mar-2017 |
| 200444 | Maaco | 77489GF | Hammer NR | PROB369854 | WRPT | C | 08-Aug-2017 |
| 200100 | Fleischman Contractors | 21110LW | Service | PROB215007 | OPYM | C | 01-Jun-2017 |
| 200664 | Advanced Tool Repair LLC | 47219UZ | Service | PROB9874579 | UPYM | C | 14-Jan-2018 |
| 200999 | AllTech Electronic Equipment | 36654DD | Plastic Casing | PROB326598 | NA | C | 16-Jan-2018 |
| 200321 | ZyotoCard Electronics | 74200ZN | Service | PROB012547 | MISCT | C | 19-Apr-2017 |
| 200331 | Black&Decker | 41122UT | .11mm Drillbit | PROB147741 | BRKN | C | 03-Aug-2017 |
| 200333 | Sears | 41122UT | .11mm Drillbit | PROB147741 | BRKN | C | 11-Mar-2017 |
As you can see, there are 2 vendors for Part Number 41122UT. For this part number, I only want Black & Decker (whose PO Date is 5 months newer than Sears).
I would like for the data to look like this:
| Vendor Number | Vendor Name | Part Number | Part Name | Defect Number | Defect Category | Division | Purchase Order Date |
|---------------|------------------------------|-------------|----------------|---------------|-----------------|----------|---------------------|
| 200123 | Push-Button LLC | 54211EW | Faceplate | PROB333211 | WRPT | C | 11-Jan-2017 |
| 200587 | Entirely Concrete | 69474TR | 2in Screw | PROB587412 | WRPT | C | 03-Mar-2017 |
| 200444 | Maaco | 77489GF | Hammer NR | PROB369854 | WRPT | C | 08-Aug-2017 |
| 200100 | Fleischman Contractors | 21110LW | Service | PROB215007 | OPYM | C | 01-Jun-2017 |
| 200664 | Advanced Tool Repair LLC | 47219UZ | Service | PROB9874579 | UPYM | C | 14-Jan-2018 |
| 200999 | AllTech Electronic Equipment | 36654DD | Plastic Casing | PROB326598 | NA | C | 16-Jan-2018 |
| 200321 | ZyotoCard Electronics | 74200ZN | Service | PROB012547 | MISCT | C | 19-Apr-2017 |
| 200331 | Black&Decker | 41122UT | .11mm Drillbit | PROB147741 | BRKN | C | 03-Aug-2017 |
I found that MAX() OVER (PARTITION BY ...) can return the most recent value, so I tried the query below. It runs, but it gives me the most recent date for each vendor for each part, not just for each part. I need the MOST RECENT VENDOR INFORMATION (found on the Purchase Order, so ultimately I need the most recent Purchase Order) for every PART. Could anyone advise?
WITH
PartNums AS -- Grabs me all of the stuff we "bought", and its vendor, in the construction division since Jan 1 2017
(
SELECT
PO_ITEM AS "PART_NUM",
VEND_NUM,
VEND_NM,
PODiv AS "DIVISION_CD"
FROM
INNER JOIN
(
SELECT PO_NUMBER, VEND_NUM, MAX(PODate) OVER(PARTITION BY PO_NUMBER, VEND_NUM))
FROM tblPurchases
WHERE PODate > '01-Jan-2017'
) tblTemp INNER JOIN tblPurchases ON tblPurchases.VEND_NUM = tblTemp.VEND_NUM
INNER JOIN tblVendors ON tblPurchases.VEND_NUM = tblVendors.VEND_NUM
WHERE
PODate > '01-Jan-2017'
AND
PODiv = 'C'
),
Defects AS -- Grabs me the listed defects against their stuff
(
SELECT
PartNums.*,
DEFECT_NUM,
DEFECT_CAT
FROM
PartNums
INNER JOIN tblDefects ON PartNums.PART_NUM = tblDefects.DEFECTIVE_PART_NUM
WHERE
DEFECT_DATE > '01-Jan-2017'
),
Names AS
(
SELECT
Defects.*,
PART_NM
FROM
Defects
INNER JOIN tblParts ON Defects.PART_NUM = tblParts.PART_NUM
)
SELECT
VEND_NUM,
VEND_NM,
PART_NUM,
PART_NM,
DEFECT_NUM,
DEFECT_CAT,
DIVISION_CD
FROM Names
Thank you very much for your time and help. Sorry if this creates any ambiguity.

Instead of MAX, use DENSE_RANK, RANK, or ROW_NUMBER: partition by PO_NUMBER, VEND_NUM, order by PODate DESC, and keep only the rows ranked 1. If you need one row per part rather than one per part and vendor, leave VEND_NUM out of the PARTITION BY.
Your query could look like the one below, which uses DENSE_RANK:
SELECT *
FROM (SELECT A.*, DENSE_RANK() OVER(PARTITION BY PO_NUMBER, VEND_NUM ORDER BY podate DESC) rank_value
FROM your_table A)
WHERE rank_value = 1;
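As a rough illustration of the ranking approach, the sketch below runs the same pattern through Python's built-in sqlite3 module instead of Oracle (an assumption made only so the example is self-contained). The table `purchases` and its column names are hypothetical stand-ins for tblPurchases, and the rows are taken from the sample output in the question. It partitions by part number alone, so each part keeps only its most recent purchase.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (part_num TEXT, vend_num INTEGER, po_date TEXT)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [
        ("41122UT", 200331, "2017-08-03"),  # Black&Decker
        ("41122UT", 200333, "2017-03-11"),  # Sears
        ("54211EW", 200123, "2017-01-11"),  # Push-Button LLC
    ],
)

# Number the purchases of each part from newest to oldest, keep number 1.
latest_per_part = conn.execute(
    """
    SELECT part_num, vend_num, po_date
    FROM (
        SELECT p.*,
               ROW_NUMBER() OVER (
                   PARTITION BY part_num ORDER BY po_date DESC
               ) AS rn
        FROM purchases p
    )
    WHERE rn = 1
    ORDER BY part_num
    """
).fetchall()
print(latest_per_part)
```

ROW_NUMBER returns exactly one row per part even when two purchases share the same date; DENSE_RANK or RANK would return both tied rows.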

Presumably you are looking for all parts reported defective since a particular date and want to find the corresponding order so that you can contact the supplier.
As of Oracle 12c you can use CROSS APPLY to join only the latest order (which you get with ORDER BY date DESC FETCH FIRST ROW ONLY).
select
o.vend_num as vendor_number,
o.vend_nm as vendor_name,
d.defective_part_num as part_number,
p.part_nm as part_name,
d.defect_num as defect_number,
d.defect_cat as defect_category,
o.podiv as division,
o.podate as purchase_order_date
from tbldefects d
cross apply
(
select *
from tblpurchases pu
where pu.po_number = d.defective_part_num
and pu.podate <= d.defect_date
and pu.podiv = 'C'
order by pu.podate desc
fetch first row only
) o
join tblparts p on p.part_num = d.defective_part_num
where d.defect_date >= date '2017-01-01';
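For engines without CROSS APPLY, the same "latest order on or before the defect date" idea can be approximated with a correlated subquery. The sketch below is not the Oracle query above; it uses Python's sqlite3 module, and the tables `purchases` and `defects`, the surrogate key `po_id`, the defect dates, and the defect PROB000001 are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE purchases (
        po_id INTEGER PRIMARY KEY,  -- hypothetical surrogate key
        part_num TEXT, vend_num INTEGER, po_date TEXT
    );
    CREATE TABLE defects (defect_num TEXT, part_num TEXT, defect_date TEXT);
    INSERT INTO purchases (part_num, vend_num, po_date) VALUES
        ('41122UT', 200333, '2017-03-11'),
        ('41122UT', 200331, '2017-08-03');
    -- Defect dates are invented for this example.
    INSERT INTO defects VALUES
        ('PROB147741', '41122UT', '2017-09-01'),
        ('PROB000001', '41122UT', '2017-05-01');
    """
)

# For every defect, join only the newest purchase of that part
# dated on or before the defect date.
rows = conn.execute(
    """
    SELECT d.defect_num, pu.vend_num, pu.po_date
    FROM defects d
    JOIN purchases pu
      ON pu.po_id = (
           SELECT p2.po_id
           FROM purchases p2
           WHERE p2.part_num = d.part_num
             AND p2.po_date <= d.defect_date
           ORDER BY p2.po_date DESC
           LIMIT 1
         )
    ORDER BY d.defect_num
    """
).fetchall()
print(rows)
```

As with CROSS APPLY, a defect with no earlier purchase drops out of the result; a LEFT JOIN would keep it with empty vendor columns.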

Related

Joining table on two columns only joins it on a single

How do I correctly join a table on two columns? My issue is that the result is not correct, because it only joins on a single column.
This question started off in another question: "SQL query returns product of results instead of sum". I am creating a new question because there is another issue I am trying to solve.
I join a table of materials on a table which contains multiple supply and disposal movements. Each movement references a material id. I would like to join the material on each movement.
My query:
SELECT supply_material_refer, disposal_material_refer, material_id, material_name
FROM "construction_sites"
JOIN projects ON construction_sites.project_refer = projects.project_id
JOIN addresses ON construction_sites.address_refer = addresses.address_id
cross join lateral ( select *
from (select row_number() over () as rn, *
from supplies
where supplies.supply_project_refer = projects.project_id) as supplies
full join (select row_number() over () as rn, *
from disposals
where disposals.disposal_project_refer = projects.project_id
) as disposals
on (supplies.rn = disposals.rn)
) as combined
LEFT JOIN materials material ON combined.disposal_material_refer = material.material_id
OR combined.supply_material_refer = material.material_id
WHERE (projects.project_name = 'Project 15')
ORDER BY construction_site_id asc;
The result of the query:
+-----------------------+-------------------------+-------------+---------------+
| supply_material_refer | disposal_material_refer | material_id | material_name |
+-----------------------+-------------------------+-------------+---------------+
| 1 | 1 | 1 | Materialtest |
| 2 | 1 | 1 | Materialtest |
| 2 | 1 | 2 | Dirt |
| 1 | 1 | 1 | Materialtest |
| 2 | 1 | 1 | Materialtest |
| 2 | 1 | 2 | Dirt |
| 1 | (null) | 1 | Materialtest |
| 4 | (null) | 4 | Stones |
+-----------------------+-------------------------+-------------+---------------+
An example line I have issues with:
+------------------------+-------------------------+-------------+---------------+
| supply_material_refer | disposal_material_refer | material_id | material_name |
+------------------------+-------------------------+-------------+---------------+
| 2 | 1 | 1 | Materialtest |
+------------------------+-------------------------+-------------+---------------+
A preferred output would be:
+------------------------+----------------------+-------------------------+------------------------+
| supply_material_refer | supply_material_name | disposal_material_refer | disposal_material_name |
+------------------------+----------------------+-------------------------+------------------------+
| 2 | Dirt | 1 | Materialtest |
+------------------------+----------------------+-------------------------+------------------------+
I have created a sqlfiddle with dummy data: http://www.sqlfiddle.com/#!17/863d78/2
To my understanding, the solution would be to have a disposal_material column and a supply_material column for the material names. I do not know how to achieve this, though.
Thanks for any help!
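One way to get a separate name column for each reference is to join the materials table twice under different aliases. The sketch below is only an illustration of that idea, not the original query: it uses Python's sqlite3 module, and `combined` is a hypothetical plain table standing in for the lateral subquery, filled with two rows from the result shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE materials (material_id INTEGER PRIMARY KEY, material_name TEXT);
    -- Hypothetical stand-in for the "combined" lateral subquery.
    CREATE TABLE combined (
        supply_material_refer INTEGER,
        disposal_material_refer INTEGER
    );
    INSERT INTO materials VALUES (1, 'Materialtest'), (2, 'Dirt'), (4, 'Stones');
    INSERT INTO combined VALUES (2, 1), (4, NULL);
    """
)

# Join materials once per reference column, each under its own alias.
rows = conn.execute(
    """
    SELECT c.supply_material_refer,
           sm.material_name AS supply_material_name,
           c.disposal_material_refer,
           dm.material_name AS disposal_material_name
    FROM combined c
    LEFT JOIN materials sm ON sm.material_id = c.supply_material_refer
    LEFT JOIN materials dm ON dm.material_id = c.disposal_material_refer
    ORDER BY c.supply_material_refer
    """
).fetchall()
print(rows)
```

Because each join matches on a single column, a movement row is no longer duplicated the way it is with the OR condition.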

Postgresql left join

I have two tables, cars and usages. Once a month I create a record in usages for some of the cars.
Now I want to get a distinct list of cars, each with the latest usage that I saved.
First of all, please look at the tables:
cars:
| id | model | reseller_id |
|----|-------------|-------------|
| 1 | Samand Sall | 324228 |
| 2 | Saba 141 | 92933 |
usages:
| id | car_id | year | month | gas |
|----|--------|------|-------|-----|
| 1 | 2 | 2020 | 2 | 68 |
| 2 | 2 | 2020 | 3 | 94 |
| 3 | 2 | 2020 | 4 | 33 |
| 4 | 2 | 2020 | 5 | 12 |
Here is the problem:
I need only the latest usage by year and month.
I tried a lot of ways, but none of them is good enough, because sometimes this query gives me a usage record that is not the latest one.
SELECT * FROM cars AS c
LEFT JOIN
(select *
from usages
) u on (c.id = u.car_id)
order by u.gas desc
You can do this with a DISTINCT ON in the derived table:
SELECT *
FROM cars AS c
LEFT JOIN (
select distinct on (u.car_id) *
from usages u
order by u.car_id, u.year desc, u.month desc
) lu on c.id = lu.car_id
order by lu.gas desc;
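I think you need the window function row_number(). Partition by the car's own id (u.car_id is NULL for cars without any usage) and order by year and month, so that the latest usage gets rn = 1.
select
id,
model,
reseller_id,
year,
month,
gas
from
(
select
c.id,
c.model,
c.reseller_id,
u.year,
u.month,
u.gas,
row_number() over (partition by c.id order by u.year desc, u.month desc) as rn
from cars c
left join usages u
on c.id = u.car_id
) subq
where rn = 1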
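To check the row_number() idea against the sample data, here is a small sketch that uses Python's sqlite3 module in place of PostgreSQL (an assumption for the sake of a self-contained example). It partitions by the car's own id and orders by year and month, so a car without any usage still appears once, with empty usage columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE cars (id INTEGER PRIMARY KEY, model TEXT, reseller_id INTEGER);
    CREATE TABLE usages (
        id INTEGER PRIMARY KEY, car_id INTEGER,
        year INTEGER, month INTEGER, gas INTEGER
    );
    INSERT INTO cars VALUES (1, 'Samand Sall', 324228), (2, 'Saba 141', 92933);
    INSERT INTO usages VALUES
        (1, 2, 2020, 2, 68), (2, 2, 2020, 3, 94),
        (3, 2, 2020, 4, 33), (4, 2, 2020, 5, 12);
    """
)

# Rank each car's usages from newest to oldest and keep the newest one.
latest_usage = conn.execute(
    """
    SELECT id, model, year, month, gas
    FROM (
        SELECT c.id, c.model, u.year, u.month, u.gas,
               ROW_NUMBER() OVER (
                   PARTITION BY c.id
                   ORDER BY u.year DESC, u.month DESC
               ) AS rn
        FROM cars c
        LEFT JOIN usages u ON u.car_id = c.id
    ) AS ranked
    WHERE rn = 1
    ORDER BY id
    """
).fetchall()
print(latest_usage)
```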

Retrieve the minimal create date with multiple rows

I have an issue with an SQL query that I am trying to write. I am trying to retrieve the row that has the minimal create_dt for each inst (see table) and amount (which isn't unique).
Unfortunately I can't use group by as the amount column isn't unique.
+--------------+--------+------+-------------+
| Company_Name | Amount | inst | Create Date |
+--------------+--------+------+-------------+
| Company A | 1000 | 4545 | 01/10/2018 |
| Company A | 400 | 4545 | 01/11/2018 |
| Company A | 200 | 4545 | 31/10/2018 |
| Company B | 2000 | 4893 | 01/10/2016 |
| Company B | 212 | 4893 | 04/10/2016 |
| Company B | 100 | 4893 | 10/10/2017 |
| Company B | 20 | 4893 | 04/10/2018 |
+--------------+--------+------+-------------+
In the above example I expect to see:
+--------------+--------+------+-------------+
| Company_Name | Amount | inst | Create Date |
+--------------+--------+------+-------------+
| Company A | 1000 | 4545 | 01/10/2018 |
| Company B | 2000 | 4893 | 01/10/2016 |
+--------------+--------+------+-------------+
Code:
SELECT
bill_company, bill_name, account_no
FROM
dbo.customer_information;
SELECT
balance_id, balance_id2, minus_balance,new_balance,
create_date, account_no
FROM
dbo.btr
SELECT
balance_id, balance_id2, expired_Date, amount, balance_type, account_no
FROM
dbo.btr_balance
SELECT
balance_ist, expired_date, account_no, balance_type
FROM
dbo.BALANCE_inst
Retrieve the minimal create date for a balance instance, with the lowest balance for that balance inst.
(SELECT
bill_company,
bill_name,
account_no,
balance_ist,
amount,
MIN(create_date)
FROM
dbo.mtr btr
LEFT JOIN
btr_balance btrb ON btr.balance_id = btrb.balance_id
AND btr.balance_id2 = btrb.balance_id2
LEFT JOIN
balance_inst bali ON btr.account_no = bali.account_no
AND btrb.expired_date = bali.expired_date
GROUP BY
bill_company, bill_name, account_no,amount, balance_ist)
I have seen some solutions that use a correlated query, but I can't seem to get my head around them.
A Common Table Expression (CTE) will help you.
;with cte as (
select *, row_number() over(partition by company_name order by create_date) rn
from dbo.myTable
)
select * from cte
where rn = 1;
Use row_number(). I assumed bill_company is your company name.
select * from
( SELECT bill_company,
bill_name,
account_no,
balance_ist,
amount,
create_date,
row_number() over(partition by bill_company order by create_date) rn
FROM dbo.mtr btr left join btr_balance btrb
on btr.balance_id = btrb.balance_id and btr.balance_id2 = btrb.balance_id2
left join balance_inst bali
on btr.account_no = bali.account_no and btrb.expired_date = bali.expired_date
) t where t.rn=1
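The sample table from the question can be used to try the row_number() pattern. The sketch below uses Python's sqlite3 module rather than SQL Server, and `balances` is a hypothetical flattened table holding the seven sample rows, with the dates rewritten from dd/mm/yyyy to ISO format so they sort correctly as text. It partitions by inst, which is the grouping the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE balances "
    "(company_name TEXT, amount INTEGER, inst INTEGER, create_date TEXT)"
)
conn.executemany(
    "INSERT INTO balances VALUES (?, ?, ?, ?)",
    [
        ("Company A", 1000, 4545, "2018-10-01"),
        ("Company A", 400, 4545, "2018-11-01"),
        ("Company A", 200, 4545, "2018-10-31"),
        ("Company B", 2000, 4893, "2016-10-01"),
        ("Company B", 212, 4893, "2016-10-04"),
        ("Company B", 100, 4893, "2017-10-10"),
        ("Company B", 20, 4893, "2018-10-04"),
    ],
)

# Keep the whole row with the earliest create_date per inst;
# no GROUP BY is needed, so amount can stay in the output.
earliest = conn.execute(
    """
    WITH cte AS (
        SELECT b.*,
               ROW_NUMBER() OVER (PARTITION BY inst ORDER BY create_date) AS rn
        FROM balances b
    )
    SELECT company_name, amount, inst, create_date
    FROM cte
    WHERE rn = 1
    ORDER BY inst
    """
).fetchall()
print(earliest)
```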

Oracle SQL - Select duplicates based on two columns

I need to select duplicate rows based on two columns in a join, and I can't seem to figure out how that is done.
Currently I have this:
SELECT s.name,administrative_site_id as adm_id,s.external_code,si.identifier_value
FROM suppliers s
INNER JOIN suppliers_identifier si
ON s.id = si.supplier_id
And the output is something along the lines of below:
| Name | adm_id | external_code |identifier_value |
|:-----------|------------:|:------------: |:----------------:|
| Warlob | 66323 | ext531 | id444 |
| Ozzy | 53123 | ext632 | id333 |
| Motorhead | 521 | ext733 | id222 |
| Perez | 123 | ext833 | id111 |
| Starlight | 521 | ext934 | id222 |
| Aligned | 123 | ext235 | id111 |
What I am looking for is how to select just these 4 rows, as they are duplicates based on the columns adm_id and identifier_value:
| Name | adm_id | external_code |identifier_value |
|:-----------|------------:|:------------: |:----------------:|
| Motorhead | 521 | ext733 | id222 |
| Perez | 123 | ext833 | id111 |
| Starlight | 521 | ext934 | id222 |
| Aligned | 123 | ext235 | id111 |
First group by ADM_ID, IDENTIFIER_VALUE and find the groups that have more than one row.
Then select all rows that have one of these pairs:
SELECT S.NAME
,ADMINISTRATIVE_SITE_ID AS ADM_ID
,S.EXTERNAL_CODE
,SI.IDENTIFIER_VALUE
FROM SUPPLIERS S INNER JOIN SUPPLIERS_IDENTIFIER SI ON S.ID = SI.SUPPLIER_ID
WHERE (ADMINISTRATIVE_SITE_ID, SI.IDENTIFIER_VALUE) IN (SELECT ADMINISTRATIVE_SITE_ID AS ADM_ID, SI.IDENTIFIER_VALUE
FROM SUPPLIERS S INNER JOIN SUPPLIERS_IDENTIFIER SI ON S.ID = SI.SUPPLIER_ID
GROUP BY ADMINISTRATIVE_SITE_ID, SI.IDENTIFIER_VALUE
HAVING COUNT(*) > 1)
Or an alternate way that may perform better on big datasets:
with t as (
SELECT s.name,administrative_site_id as adm_id,s.external_code,si.identifier_value,
COUNT(*) OVER (PARTITION BY administrative_site_id ,identifier_value ) AS cnt
FROM suppliers s
INNER JOIN suppliers_identifier si
ON s.id = si.supplier_id)
select name, adm_id, external_code, identifier_value
from t
where cnt > 1
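The window-count variant can be tried on the six sample rows. This sketch uses Python's sqlite3 module instead of Oracle, and `supplier_rows` is a hypothetical single table that already holds the joined output of suppliers and suppliers_identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE supplier_rows "
    "(name TEXT, adm_id INTEGER, external_code TEXT, identifier_value TEXT)"
)
conn.executemany(
    "INSERT INTO supplier_rows VALUES (?, ?, ?, ?)",
    [
        ("Warlob", 66323, "ext531", "id444"),
        ("Ozzy", 53123, "ext632", "id333"),
        ("Motorhead", 521, "ext733", "id222"),
        ("Perez", 123, "ext833", "id111"),
        ("Starlight", 521, "ext934", "id222"),
        ("Aligned", 123, "ext235", "id111"),
    ],
)

# Count the rows sharing each (adm_id, identifier_value) pair and
# keep only rows whose pair occurs more than once.
duplicates = conn.execute(
    """
    WITH t AS (
        SELECT r.*,
               COUNT(*) OVER (PARTITION BY adm_id, identifier_value) AS cnt
        FROM supplier_rows r
    )
    SELECT name, adm_id, external_code, identifier_value
    FROM t
    WHERE cnt > 1
    ORDER BY name
    """
).fetchall()
duplicate_names = [row[0] for row in duplicates]
print(duplicate_names)
```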

How to return the SUM value using a distinct value from referenced table?

I have the following structure in a schema:
------------------------------- ----------------------------
| m_user | | person |
------------------------------- ----------------------------
| UUID | ID | PLATFORM | | ID | NAME | EMAIL |
| 456789 | 22222 | TG | | 22222 | JOSEPH | J#CM.CO |
| 987654 | 22222 | MS | | 85858 | MARKUS | M#GM.CO |
| 948576 | 85858 | TG | | 36363 | ANDREA | A#GM.CO |
------------------------------- ----------------------------
-------------------------------------------------
| plan |
-------------------------------------------------
| ID | HOURS | DATE | CLIENT |
| 22222 | 72 | 2017-12-05 | CLIENT11 |
| 22222 | 88 | 2017-12-25 | CLIENT11 |
| 85858 | 48 | 2017-12-05 | CLIENT12 |
-------------------------------------------------
I need to return the SUM of HOURS planned for each user that exists in the m_user table. m_user allows only one ID per platform: the same user can be on two platforms, but has a single unique ID that applies to both.
The problem is that the results show duplicated SUM values, because the ID appears twice in the m_user table. This is the query:
SELECT ps.id,
ps.name,
ps.email,
SUM(pl.hours) AS hours
FROM schema.person AS ps
JOIN schema.m_user AS usr ON ps.id = usr.id
JOIN schema.plan AS pl ON usr.id = pl.id -- Here is the problem, I think
WHERE pl.client = 'CLIENT11' AND
pl.date BETWEEN '2017-12-01' AND '2017-12-31'
GROUP BY id, name, email;
I've tried using DISTINCT and DISTINCT ON (usr.id) but the result given is the same.
Here is the result I get:
--------------------------------------
| ID | NAME | EMAIL | HOURS |
--------------------------------------
| 22222 | JOSEPH | J#CM.CO | 320 | -- <- 320 instead of 160
| ... | .... | .... | ... |
--------------------------------------
I am new to SQL, so I think this is a simple error that I am not able to spot right now. I have also tried OVER (PARTITION BY usr.id) and LIMIT 1, but again I get 320 for every row where 22222 appears. Do I need a CTE for this query? I hope you can help me, thank you. (I am currently using PostgreSQL, but I think this problem applies to SQL in general, so I added the SQL tag.)
Remove the join on m_user and use a subquery on the m_user table to restrict the result to the listed users.
SELECT ps.id,
ps.name,
ps.email,
SUM(pl.hours) AS hours
FROM schema.person AS ps
JOIN schema.plan AS pl ON ps.id = pl.id
WHERE pl.client = 'CLIENT11' AND
pl.date BETWEEN '2017-12-01' AND '2017-12-31'
AND ps.id IN ( SELECT usr.id FROM schema.m_user AS usr )
GROUP BY ps.id, ps.name, ps.email;
sqlfiddle: sqlfiddle.com/#!17/5996e/1
You can always phrase this as:
SELECT ps.id, ps.name, ps.email, SUM(pl.hours) AS hours
FROM schema.person ps JOIN
(SELECT usr.*, ROW_NUMBER() OVER (PARTITION BY id ORDER BY id) as seqnum
FROM schema.m_user usr
) usr
ON ps.id = usr.id JOIN
schema.plan pl
ON usr.id = pl.id AND seqnum = 1
WHERE pl.client = 'CLIENT11' AND
pl.date BETWEEN '2017-12-01' AND '2017-12-31'
GROUP BY ps.id, ps.name, ps.email;
This keeps only one m_user row per id for the join, so the hours are no longer counted once per platform.
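To see where the 320 comes from, the sketch below reproduces the sample data with Python's sqlite3 module in place of PostgreSQL. The table name `work_plan` and the column `plan_date` are hypothetical renames of plan and date, and EXISTS is used as one possible way to test membership in m_user without multiplying rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (id INTEGER, name TEXT, email TEXT);
    CREATE TABLE m_user (uuid INTEGER, id INTEGER, platform TEXT);
    CREATE TABLE work_plan (id INTEGER, hours INTEGER, plan_date TEXT, client TEXT);
    INSERT INTO person VALUES
        (22222, 'JOSEPH', 'J#CM.CO'), (85858, 'MARKUS', 'M#GM.CO'),
        (36363, 'ANDREA', 'A#GM.CO');
    INSERT INTO m_user VALUES
        (456789, 22222, 'TG'), (987654, 22222, 'MS'), (948576, 85858, 'TG');
    INSERT INTO work_plan VALUES
        (22222, 72, '2017-12-05', 'CLIENT11'),
        (22222, 88, '2017-12-25', 'CLIENT11'),
        (85858, 48, '2017-12-05', 'CLIENT12');
    """
)

date_filter = (
    "pl.client = 'CLIENT11' "
    "AND pl.plan_date BETWEEN '2017-12-01' AND '2017-12-31'"
)

# Joining m_user repeats every plan row once per platform (TG and MS).
doubled = conn.execute(
    f"""
    SELECT ps.id, SUM(pl.hours)
    FROM person ps
    JOIN m_user usr ON usr.id = ps.id
    JOIN work_plan pl ON pl.id = usr.id
    WHERE {date_filter}
    GROUP BY ps.id
    """
).fetchall()

# Testing for existence in m_user leaves one row per plan entry.
correct = conn.execute(
    f"""
    SELECT ps.id, SUM(pl.hours)
    FROM person ps
    JOIN work_plan pl ON pl.id = ps.id
    WHERE {date_filter}
      AND EXISTS (SELECT 1 FROM m_user usr WHERE usr.id = ps.id)
    GROUP BY ps.id
    """
).fetchall()
print(doubled, correct)
```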