Get Only One Row For Each ID With the Highest Value - sql

I have a query
SELECT
*
FROM
mgr.MF_AGREEMENT_LGR TABLE1
INNER JOIN
(SELECT
MAX(VALUE_DATE) AS VALUE_DATE,
REGISTRATION_NO AS REGISTRATION_NO
FROM
mgr.MF_AGREEMENT_LGR
GROUP BY
REGISTRATION_NO) AS TABLE2 ON TABLE1.REGISTRATION_NO = TABLE2.REGISTRATION_NO
WHERE
TABLE1.VALUE_DATE = TABLE2.VALUE_DATE
AND TABLE1.TRX_CODE = 'LCLR'
ORDER BY
TABLE1.REGISTRATION_NO
This returns the rows with the latest date for each REGISTRATION_CODE. Some have like three or more results for each REGISTRATION_CODE because it has more than one transaction on the same date.
Also, each row has its DOC_NO field.
My question is, how am I going to get only one row from each REGISTRATION_CODE with the highest DOC_NO.
By the way, DOC_NO is a varchar.
Example value for this field is: Amort 1, Amort 12, Amort 5
If those examples are in one REGISTRATION_CODE, I only need the row with the highest amort which is Amort 12.
I am using a SQL Server 2000.

SQL Server 2000 has not been supported in years. You really should upgrade to supported software.
You can get what you want with not exists:
SELECT al.*
FROM mgr.MF_AGREEMENT_LGR al
WHERE NOT EXISTS (SELECT 1
FROM mgr.MF_AGREEMENT_LGR al2
WHERE al2.registration_no = al.registration_no and
(al2.date > al2.date or
al2.date = al.date and al2.DOC_NO > al.DOC_NO
)
) AND
al.TRX_CODE = 'LCLR';
You probably want the condition on 'LCLR' in the subquery as well. However, that is not in your original query, so I'm leaving it out.

Related

Complex INSERT INTO SELECT statement in SQL

I have two tables in SQL. I need to add rows from one table to another. The table to which I add rows looks like:
timestamp, deviceID, value
2020-10-04, 1, 0
2020-10-04, 2, 0
2020-10-07, 1, 1
2020-10-08, 2, 1
But I have to add a row to this table if a state for a particular deviceID was changed in comparison to the last timestamp.
For example this record "2020-10-09, 2, 1" won't be added because the value wasn't changed for deviceID = 2 and last timestamp = "2020-10-08". In the same time record "2020-10-09, 1, 0" will be added, because the value for deviceID = 1 was changed to 0.
I have a problem with writing a query for this logic. I have written something like this:
insert into output
select *
from values
where value != (
select value
from output
where timestamp = (select max(timestamp) from output) and output.deviceID = values.deviceID)
Of course it doesn't work because of the last part of the query "and output.deviceID = values.deviceID".
Actually the problem is that I don't know how to take the value from "output" table where deviceID is the same as in the row that I try to insert.
I would use order by and something to limit to one row:
insert into output
select *
from values
where value <> (select o2.value
from output o2
where o2.deviceId = v.deviceId
order by o2.timestamp desc
fetch first 1 row only
);
The above is standard SQL. Specific databases may have other ways to express this, such as limit or top (1).

SQL Filtering duplicate rows due to bad ETL

The database is Postgres but any SQL logic should help.
I am retrieving the set of sales quotations that contain a given product within the bill of materials. I'm doing that in two steps: step 1, retrieve all DISTINCT quote numbers which contain a given product (by product number).
The second step, retrieve the full quote, with all products listed for each unique quote number.
So far, so good. Now the tough bit. Some rows are duplicates, some are not. Those that are duplicates (quote number & quote version & line number) might or might not have maintenance on them. I want to pick the row that has maintenance greater than 0. The duplicate rows I want to exclude are those that have a 0 maintenance. The problem is that some rows, which have no duplicates, have 0 maintenance, so I can't just filter on maintenance.
To make this exciting, the database holds quotes over 20+ years. And the data scientists guys have just admitted that maybe the ETL process has some bugs...
--- step 0
--- cleanup the workspace
SET CLIENT_ENCODING TO 'UTF8';
DROP TABLE IF EXISTS product_quotes;
--- step 1
--- get list of Product Quotes
CREATE TEMPORARY TABLE product_quotes AS (
SELECT DISTINCT master_quote_number
FROM w_quote_line_d
WHERE item_number IN ( << model numbers >> )
);
--- step 2
--- Now join on that list
SELECT
d.quote_line_number,
d.item_number,
d.item_description,
d.item_quantity,
d.unit_of_measure,
f.ref_list_price_amount,
f.quote_amount_entered,
f.negtd_discount,
--- need to calculate discount rate based on list price and negtd discount (%)
CASE
WHEN ref_list_price_amount > 0
THEN 100 - (ref_list_price_amount + negtd_discount) / ref_list_price_amount *100
ELSE 0
END AS discount_percent,
f.warranty_months,
f.master_quote_number,
f.quote_version_number,
f.maintenance_months,
f.territory_wid,
f.district_wid,
f.sales_rep_wid,
f.sales_organization_wid,
f.install_at_customer_wid,
f.ship_to_customer_wid,
f.bill_to_customer_wid,
f.sold_to_customer_wid,
d.net_value,
d.deal_score,
f.transaction_date,
f.reporting_date
FROM w_quote_line_d d
INNER JOIN product_quotes pq ON (pq.master_quote_number = d.master_quote_number)
INNER JOIN w_quote_f f ON
(f.quote_line_number = d.quote_line_number
AND f.master_quote_number = d.master_quote_number
AND f.quote_version_number = d.quote_version_number)
WHERE d.net_value >= 0 AND item_quantity > 0
ORDER BY f.master_quote_number, f.quote_version_number, d.quote_line_number
The logic to filter the duplicate rows is like this:
For each master_quote_number / version_number pair, check to see if there are duplicate line numbers. If so, pick the one with maintenance > 0.
Even in a CASE statement, I'm not sure how to write that.
Thoughts? The database is Postgres but any SQL logic should help.
I think you will want to use Window Functions. They are, in a word, awesome.
Here is a query that would "dedupe" based on your criteria:
select *
from (
select
* -- simplifying here to show the important parts
,row_number() over (
partition by master_quote_number, version_number
order by maintenance desc) as seqnum
from w_quote_line_d d
inner join product_quotes pq
on (pq.master_quote_number = d.master_quote_number)
inner join w_quote_f f
on (f.quote_line_number = d.quote_line_number
and f.master_quote_number = d.master_quote_number
and f.quote_version_number = d.quote_version_number)
) x
where seqnum = 1
The use of row_number() and the chosen partition by and order by criteria guarantee that only ONE row for each combination of quote_number/version_number will get the value of 1, and it will be the one with the highest value in maintenance (if your colleagues are right, there would only be one with a value > 0 anyway).
Can you do something like...
select
*
from
w_quote_line_d d
inner join
(
select
...
,max(maintenance)
from
w_quote_line_d
group by
...
) d1
on
d1.id = d.id
and d1.maintenance = d.maintenance;
Am I understanding your problem correctly?
Edit: Forgot the group by!
I'm not sure, but maybe you could Group By all other columns and use MAX(Maintenance) to get only the greatest.
What do you think?

SQL Statement to select row where previous row status = 'C' AS400

This is being run on sql for IBMI Series 7
I have a table which stores info about orders. Each row has an order number (ON), part number(PN), and sequence number(SEQ). Each ON will have multiple PN's linked to them and each part number has multiple SEQ Number. Each sequence number represents the order in which to do work on the part. Somewhere else in the system once the part is at a location and ready to be worked on it shows a flag. What I want to do is get a list of orders for a location that have not yet arrived but have been closed out on the previous location( Which means the part is on it's way).
I have a query listed below that I believe should work but I get the following error: "The column qualifier or table t undefined". Where is my issue at?
Select * From (SELECT M2ON as Order__Number , M2SEQ as Sequence__Number,
M2PN as Product__Number,ML2OQ as Order__Quantity
FROM M2P
WHERE M2pN in (select R1PN FROM R1P WHERE (RTWC = '7411') AND (R1SEQ = M2SEQ)
)
AND M2ON IN (SELECT M1ON FROM M1P WHERE ML1RCF = '')
ORDER BY ML2OSM ASC) as T
WHERE
T.Order__Number in (Select t3.m2on from (SELECT *
FROM(Select * from m2p
where m2on = t.Order__Number and m2pn = t.Product__Number
order by m2seq asc fetch first 2 rows only
)as t1 order by m2seq asc fetch first row only
) as t3 where t3.m2stat = 'C')
EDIT- Answer for anyone else with this issue
Clutton's Answer worked with slight modification so thank you to him for the fast response! I had to name my outer table and specify that in the subquery otherwise the as400 would kick back and tell me it couldn't find the columns. I also had to order by the sequence number descending so that I grabbed the highest record that was below the parameter(otherwise for example if my sequence number was 20 it could grab 5 even though 10 was available and should be shown first. Here is the subquery I now use. Please note the actual query names m2p as T1.
IFNULL((
SELECT
M2STAT
FROM
M2P as M2P_1
WHERE
M2ON = T1.M2ON
AND M2SEQ < T1.M2SEQ
AND M2PN IN (select R1PN FROM R1P WHERE (RTWC = #WC) AND (R1SEQ = T1.M2SEQ))
ORDER BY M2SEQ DESC
FETCH FIRST ROW ONLY
), 'NULL') as PRIOR_M2STAT
Just reading your question, it looks like something I do frequently to emulate RPG READPE op codes. Is the key to M2P Order/Seq? If so, here is a basic piece that may help you build out the rest of the query.
I am assuming that you are trying to get the prior record by key using SQL. In RPG this would be like doing a READPE on the key for a file with Order/Seq key.
Here is an example using a subquery to get the status field of the prior record.
SELECT
M2ON, M2PN, M2OQ, M2STAT,
IFNULL((
SELECT
M2STAT
FROM
M2P as M2P_1
WHERE
M2P_1.M2ON = M2ON
AND M2P_1.M2SEQ < M2SEQ
FETCH FIRST ROW ONLY
), '') as PRIOR_M2STAT
FROM
M2P
Note that this wraps the subquery in an IFNULL to handle the case where it is the first sequence number and no prior sequence exists.

SQL select max(date) and corresponding value [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
How to get the record of a table who contains the maximum value?
I've got an aggregate query like the following:
SELECT TrainingID, Max(CompletedDate) as CompletedDate, Max(Notes) as Notes --This will only return the longest notes entry
FROM HR_EmployeeTrainings ET
WHERE (ET.AvantiRecID IS NULL OR ET.AvantiRecID = #avantiRecID)
GROUP BY AvantiRecID, TrainingID
Which is working, and returns correct data most of the time, but I noticed a problem. The Notes field which gets returned will not necessarily match the record that the max(completedDate) is from. Instead it will be the one with the longest string? Or the one with the highest ASCII value? What does SQL Server do in the event of a tie between two records? I'm not even sure. What I want to get is the notes field from the max(completedDate) record. How should I got about doing this?
You can use a subquery. The subquery will get the Max(CompletedDate). You then take this value and join on your table again to retrieve the note associate with that date:
select ET1.TrainingID,
ET1.CompletedDate,
ET1.Notes
from HR_EmployeeTrainings ET1
inner join
(
select Max(CompletedDate) CompletedDate, TrainingID
from HR_EmployeeTrainings
--where AvantiRecID IS NULL OR AvantiRecID = #avantiRecID
group by TrainingID
) ET2
on ET1.TrainingID = ET2.TrainingID
and ET1.CompletedDate = ET2.CompletedDate
where ET1.AvantiRecID IS NULL OR ET1.AvantiRecID = #avantiRecID
Ah yes, that is how it is intended in SQL. You get the Max of every column seperately. It seems like you want to return values from the row with the max date, so you have to select the row with the max date. I prefer to do this with a subselect, as the queries keep compact easy to read.
SELECT TrainingID, CompletedDate, Notes
FROM HR_EmployeeTrainings ET
WHERE (ET.AvantiRecID IS NULL OR ET.AvantiRecID = #avantiRecID)
AND CompletedDate in
(Select Max(CompletedDate) from HR_EmployeeTrainings B
where B.TrainingID = ET.TrainingID)
If you also want to match by AntiRecID you should include that in the subselect as well.
Each MAX function is evaluated individually. So MAX(CompletedDate) will return the value of the latest CompletedDate column and MAX(Notes) will return the maximum (i.e. alphabeticaly highest) value.
You need to structure your query differently to get what you want. This question had actually already been asked and answered several times, so I won't repeat it:
How to find the record in a table that contains the maximum value?
Finding the record with maximum value in SQL
There's no easy way to do this, but something like this will work:
SELECT ET.TrainingID,
ET.CompletedDate,
ET.Notes
FROM
HR_EmployeeTrainings ET
inner join
(
select TrainingID, Max(CompletedDate) as CompletedDate
FROM HR_EmployeeTrainings
WHERE (ET.AvantiRecID IS NULL OR ET.AvantiRecID = #avantiRecID)
GROUP BY AvantiRecID, TrainingID
) ET2
on ET.TrainingID = ET2.TrainingID
and ET.CompletedDate = ET2.CompletedDate

SQL, Search by Date and not exists

I have two tables and need to search for all entries that exist in one table in another table by idProduct, only if the date (dateStamp) is less than or older than 7 days.
Because the api I'm using is restricted to only processing 3000 results at a time, the application will close and the next time I run the application I only want the idProducts that are say 3000 or greater for that idProduct, this will be run numerous times for the Suppliercode wll most likely already exist in the table.
So I've been looking at the not exists and getdate functions in sql but not been able to get the desired results.
SELECT
*
FROM
products
WHERE
(active = - 1)
AND suppliercode = 'TIT'
and (NOT EXISTS
(SELECT
idProduct
FROM compare
WHERE
(products.idProduct = idProduct)
OR (compare.dateStamp < DATEADD(DAY,-7,GETDATE()))))
Any pointers would be great, I've changed the OR to AND but it doesn't seem to bring back the correct results.
I am guessing you want to match the rows in the two tables by idProduct as right now your inner query (NOT EXISTS (SELECT idProduct FROM compare WHERE (products.idProduct = idProduct) OR (compare.dateStamp < DATEADD(DAY,-7,GETDATE())))) looks like it is finding all rows that don't match. As your subquery finds all rows that match or where the date is older than 7 days and makes sure that they don't exist.
Is this what your want?
SELECT *
FROM products as p
LEFT JOIN compare as c
ON p.idProduct = c.idProduct
WHERE p.active = -1 and p.suppliercode = 'TIT' and c.dateStamp < DATEADD(DAY,-7,GETDATE())
Have you tried this one yet?
SELECT * FROM products
WHERE (active = - 1) AND
suppliercode = 'TIT'
and ipProduct NOT IN
(
SELECT idProduct FROM compare
WHERE
(products.idProduct = idProduct) OR
(compare.dateStamp < DATEADD(DAY,-7,GETDATE()))
)
Try NOT IN instead:
...
and ProductId NOT IN
(SELECT
idProduct
FROM compare
WHERE
(products.idProduct = idProduct)
OR (compare.dateStamp < DATEADD(DAY,-7,GETDATE()))))
....