Show last update date - sql

I am new to this forum and also new to SQL.
I have an Excel sheet linked to a database via "From Microsoft Query". Three tables are joined together: pd_ln, pdcflbrt, and pdlbr.
Using the following query, I get this data:
SELECT pdcflbrt.lbrcod, pdcflbrt.lbrrat, pd_ln.prdnum, pdcflbrt.begeffdat
FROM velocity.dbo.pd_ln pd_ln, velocity.dbo.pdcflbrt pdcflbrt, velocity.dbo.pdlbr pdlbr
WHERE pdlbr.lbrrattky = pdcflbrt.lbrrattky AND pd_ln.pd_ln_tky = pdlbr.pd_ln_tky
+--------------+--------------+-----------+------------------+
| lbrcod | lbrrat | prdnum | begeffdat |
+--------------+--------------+-----------+------------------+
| FC Braselton | 0.11 | 00236 | 7/15/2012 0:00 |
| FC Braselton | 0.11 | 00236 | 7/15/2012 0:00 |
| FC Braselton | 0.1 | 00236 | 12/10/2012 0:00 |
| Sizing | 0.21 | 03103 | 8/28/2015 0:00 |
| Sizing | 0.2 | 03103 | 10/13/2011 0:00 |
+--------------+--------------+-----------+------------------+
How do I query to get the last begeffdat of each prdnum?

Magood's answer may work in this situation. However, if there was a unique identifier for each edit that you were selecting, it wouldn't work. As far as I know, you would have to get involved with row_number() like so:
SELECT s2.lbrcod, s2.lbrrat, s2.prdnum, s2.begeffdat from
(SELECT pdcflbrt.lbrcod
, pdcflbrt.lbrrat
, pd_ln.prdnum
, pdcflbrt.begeffdat
, row_number() over (partition by pd_ln.prdnum order by pdcflbrt.begeffdat desc) as RN
FROM velocity.dbo.pd_ln pd_ln, velocity.dbo.pdcflbrt pdcflbrt, velocity.dbo.pdlbr pdlbr
WHERE pdlbr.lbrrattky = pdcflbrt.lbrrattky AND pd_ln.pd_ln_tky = pdlbr.pd_ln_tky) s2
where s2.rn = 1
This returns only the newest row for each prdnum. The inner query is the same as yours, with row_number() added: the numbering restarts for each distinct prdnum, and rows are ordered by date with the newest first. The outer query keeps only row 1 (the final where clause), which is the newest date.
EDIT: Alternatively, if you only want the OLDEST update, change desc to asc in the order by inside the row_number() call in the inner query.
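As a rough, runnable illustration of the row_number() approach, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server. The single table `joined` is a hypothetical stand-in for the result of the three-table join, and the dates are stored as ISO text (an assumption made so that text ordering matches date ordering).

```python
import sqlite3

# Hypothetical stand-in for the joined result shown in the question.
rows = [
    ("FC Braselton", 0.11, "00236", "2012-07-15"),
    ("FC Braselton", 0.11, "00236", "2012-07-15"),
    ("FC Braselton", 0.10, "00236", "2012-12-10"),
    ("Sizing", 0.21, "03103", "2015-08-28"),
    ("Sizing", 0.20, "03103", "2011-10-13"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE joined (lbrcod TEXT, lbrrat REAL, prdnum TEXT, begeffdat TEXT)"
)
conn.executemany("INSERT INTO joined VALUES (?, ?, ?, ?)", rows)

# Number the rows within each prdnum, newest date first, then keep row 1.
latest = conn.execute(
    """
    SELECT lbrcod, lbrrat, prdnum, begeffdat
    FROM (
        SELECT j.*,
               ROW_NUMBER() OVER (PARTITION BY prdnum
                                  ORDER BY begeffdat DESC) AS rn
        FROM joined AS j
    ) AS s2
    WHERE rn = 1
    ORDER BY prdnum
    """
).fetchall()

for row in latest:
    print(row)
```

Each prdnum appears once, with its most recent begeffdat. Window functions need SQLite 3.25 or newer, which recent Python builds normally bundle.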

-- Only for name and latest date
select lbrcod, max(begeffdat) begeffdat from #table
group by lbrcod
-- For all columns
select * from (
select *, row_number() over (partition by prdnum order by begeffdat desc) rowNum from #table
) data
where rowNum = 1

Related

Self join to create a new column with updated records

I am trying to write a SQL query to get the start date for employees in a store. As seen in the first screenshot, employee number 5041 previously had the number A0EH, but when the number was updated, the employee's start date was updated as well. This affects the metric of total duration in the store.
I am trying to get to the output below but haven't been able to figure out how to get this view.
This is the code I was trying but I am not getting the correct output.
select
esd.employee_number,
(case when esd.old_employee_number is null then es.employee_number else es.old_employee_number end) as old_employee_number,
esd.entity_id,
esd.original_start_date
from earliest_start_date as esd
left join earliest_start_date as es
on (es.employee_number = esd.old_employee_number)
How do I solve this in SQL?
Redshift reportedly supports recursion via the WITH clause, and MariaDB 10.5 has similar support. The example below was run as a fully working test case on MariaDB 10.5. For Redshift specifics, see the Amazon Redshift documentation on the WITH clause and on window functions.
WITH RECURSIVE cte (employee_number, original_no, entity_id, original_start_date, n) AS (
SELECT employee_number, employee_number, entity_id, original_start_date, 1 FROM earliest_start_date WHERE old_employee_number IS NULL UNION ALL
SELECT new_tbl.employee_number, cte.original_no, cte.entity_id, cte.original_start_date, n+1
FROM earliest_start_date new_tbl
JOIN cte
ON cte.employee_number = new_tbl.old_employee_number
)
, xrows AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY entity_id ORDER BY n DESC) AS rn
FROM cte
)
SELECT * FROM xrows WHERE rn = 1
;
Result:
+-----------------+-------------+-----------+---------------------+------+----+
| employee_number | original_no | entity_id | original_start_date | n | rn |
+-----------------+-------------+-----------+---------------------+------+----+
| XXXX | XXXX | 88 | 2021-09-02 | 1 | 1 |
| 5041 | A0EH | 96 | 2021-09-05 | 2 | 1 |
+-----------------+-------------+-----------+---------------------+------+----+
2 rows in set
Raw test data:
SELECT * FROM earliest_start_date;
+-----------------+---------------------+-----------+---------------------+
| employee_number | old_employee_number | entity_id | original_start_date |
+-----------------+---------------------+-----------+---------------------+
| 5041 | A0EH | 96 | 2021-09-10 |
| A0EH | NULL | 96 | 2021-09-05 |
| XXXX | NULL | 88 | 2021-09-02 |
+-----------------+---------------------+-----------+---------------------+
Note that the logic assumes employee_number is unique. In its current form it can't handle cases where an employee_number is reused by the same employee, or used again for a different employee, without adjusting prior data. There may not be enough detail in the current structure to handle those cases.
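The recursive query above can also be tried locally. The following is a minimal sketch using Python's built-in sqlite3 module, on the assumption that SQLite's WITH RECURSIVE behaves like MariaDB's for this case; it has not been checked against Redshift. The table contents are the raw test data from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE earliest_start_date (
           employee_number TEXT,
           old_employee_number TEXT,
           entity_id INTEGER,
           original_start_date TEXT)"""
)
conn.executemany(
    "INSERT INTO earliest_start_date VALUES (?, ?, ?, ?)",
    [
        ("5041", "A0EH", 96, "2021-09-10"),
        ("A0EH", None, 96, "2021-09-05"),
        ("XXXX", None, 88, "2021-09-02"),
    ],
)

query = """
WITH RECURSIVE cte (employee_number, original_no, entity_id,
                    original_start_date, n) AS (
    -- Anchor: rows that have no predecessor start a chain.
    SELECT employee_number, employee_number, entity_id, original_start_date, 1
    FROM earliest_start_date
    WHERE old_employee_number IS NULL
    UNION ALL
    -- Step: follow each renumbering, carrying the original number and date.
    SELECT new_tbl.employee_number, cte.original_no, cte.entity_id,
           cte.original_start_date, cte.n + 1
    FROM earliest_start_date AS new_tbl
    JOIN cte ON cte.employee_number = new_tbl.old_employee_number
),
xrows AS (
    SELECT cte.*,
           ROW_NUMBER() OVER (PARTITION BY entity_id ORDER BY n DESC) AS rn
    FROM cte
)
SELECT employee_number, original_no, entity_id, original_start_date, n
FROM xrows
WHERE rn = 1
ORDER BY entity_id
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The row for entity 96 carries the current number 5041 together with the original number A0EH and the original start date 2021-09-05, matching the result table above.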

Is there a difference between Oracle SQL 'KEEP' for multiple columns and 'KEEP' for one and GROUP BY for the rest?

I'm just now learning about KEEP in Oracle SQL, but I cannot seem to find documentation that explains why their examples use KEEP in all columns that are not indexed.
I have a table with 5 columns
PERSON_ID | BRANCH | YEAR | STATUS | TIME_STAMP
123456 | 0001 | 2017 | 1 | 1-1-2017 (ROW 1)
123456 | 0001 | 2017 | 2 | 2-1-2017 (ROW 2)
123456 | 0002 | 2017 | 3 | 3-1-2017 (ROW 3)
123456 | 0001 | 2017 | 2 | 4-1-2017 (ROW 4)
123456 | 0001 | 2018 | 2 | 1-1-2018 (ROW 5)
123456 | 0001 | 2018 | 3 | 2-1-2018 (ROW 6)
I want to return the row of the most recent timestamp by person, branch, and year, so rows 3, 4, and 6.
RESULTS
PERSON_ID | BRANCH | YEAR | STATUS | TIME_STAMP
123456 | 0002 | 2017 | 3 | 3-1-2017 (ROW 3)
123456 | 0001 | 2017 | 2 | 4-1-2017 (ROW 4)
123456 | 0001 | 2018 | 3 | 2-1-2018 (ROW 6)
To get the entire row, I would normally write something like this:
SELECT *
FROM STATUS_TABLE a
WHERE a.TIME_STAMP =
(
SELECT MAX(sub.TIME_STAMP)
FROM STATUS_TABLE sub
WHERE a.PERSON_ID = sub.PERSON_ID
AND a.YEAR = sub.YEAR
AND a.BRANCH = sub.BRANCH
)
But I'm learning I can write this:
SELECT
a.PERSON_ID,
a.YEAR,
a.BRANCH,
MAX(a.STATUS) KEEP (DENSE_RANK FIRST ORDER BY TIME_STAMP DESC)
FROM STATUS_TABLE a
GROUP BY a.PERSON_ID, a.YEAR, a.BRANCH;
My concern is that a lot of the documentation and examples I'm finding don't put all the group-by columns in GROUP BY, but instead write a KEEP clause for many columns.
Like this:
SELECT
a.PERSON_ID,
MAX(a.YEAR) KEEP (DENSE_RANK FIRST ORDER BY TIME_STAMP DESC),
MAX(a.BRANCH) KEEP (DENSE_RANK FIRST ORDER BY TIME_STAMP DESC),
MAX(a.STATUS) KEEP (DENSE_RANK FIRST ORDER BY TIME_STAMP DESC)
FROM STATUS_TABLE a
GROUP BY a.PERSON_ID;
QUESTION
If I know that there will never be duplicates on TIME_STAMP for an ID, YEAR, and BRANCH, can I write it the first way, or do I still need to write it the second way? Using the first way, I get the results I'm expecting, but I can't find any explanation of this method or of what the differences may be.
Are there any?
Your aggregation queries are different. When you have:
GROUP BY a.PERSON_ID, a.YEAR, a.BRANCH
Your result set will have one row for each combination of the three columns.
If you specify:
GROUP BY a.PERSON_ID
Then there is one row only for each PERSON_ID. Under some circumstances, this is the same as the above version. But only when there is one YEAR and BRANCH per PERSON_ID. That is not true in your data.
The version grouped by all three columns is functionally equivalent, for most practical purposes, to your version with the correlated subquery. One difference is what happens if any of the grouping/correlation columns are NULL. The GROUP BY keeps these groupings. The correlated subquery filters them out.
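The NULL difference can be seen in a small experiment. This is a minimal sketch using Python's built-in sqlite3 module; SQLite has no KEEP clause, so a plain MAX(time_stamp) with GROUP BY stands in for it, and the rows (including the one with a NULL branch) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE status_table (
           person_id INTEGER, branch TEXT, year INTEGER,
           status INTEGER, time_stamp TEXT)"""
)
conn.executemany(
    "INSERT INTO status_table VALUES (?, ?, ?, ?, ?)",
    [
        (123456, "0001", 2017, 1, "2017-01-01"),
        (123456, "0001", 2017, 2, "2017-04-01"),
        (123456, "0002", 2017, 3, "2017-03-01"),
        (123456, None, 2018, 1, "2018-01-01"),  # hypothetical NULL branch
    ],
)

# GROUP BY treats NULL as a group of its own, so the NULL branch survives.
grouped = conn.execute(
    """
    SELECT person_id, year, branch, MAX(time_stamp)
    FROM status_table
    GROUP BY person_id, year, branch
    ORDER BY MAX(time_stamp)
    """
).fetchall()

# a.branch = sub.branch is never true for NULL, so that row is filtered out.
correlated = conn.execute(
    """
    SELECT a.person_id, a.year, a.branch, a.time_stamp
    FROM status_table AS a
    WHERE a.time_stamp = (
        SELECT MAX(sub.time_stamp)
        FROM status_table AS sub
        WHERE a.person_id = sub.person_id
          AND a.year = sub.year
          AND a.branch = sub.branch
    )
    ORDER BY a.time_stamp
    """
).fetchall()

print(len(grouped), "rows from GROUP BY")
print(len(correlated), "rows from the correlated subquery")
```

The GROUP BY query returns a row for the NULL branch, while the correlated subquery drops it because the equality comparison against NULL never matches.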

Display max value of a sum using group by

Cheers everybody,
I've been trying endlessly to display only the max value of COBRANCATOTAL for each year and display the name and nif of the client,
for instance, the query result is:
Current Result
Therefore the result should be
227518698 | Rui | G | 2015 | 100
227518699 | Sara | G | 2016 | 100
227518693 | Paulo Pereira | G | 2014 | 43
227518691 | Diogo Batista | G | 2017 | 2
I can't seem to remove the other values so that only the maximum for each year appears.
You can do this using window functions. Here is one method:
with t as (
<your query here>
)
select t.*
from (select t.*, row_number() over (partition by year order by cobrancatotal desc) as seqnum
from t
) t
where seqnum = 1;
You should also learn to use proper, explicit JOIN syntax. Commas in the FROM clause are from archaic versions of SQL.

How to search max value from group in sql

I am just learning some SQL, so I have a question.
-I have a table with name TABL
-a variable :ccname which has a value "Bottle"
The table is as follows:
+----------+---------+-------+--------+
| Name | Price | QTY | CODE |
+----------+---------+-------+--------+
| Rope | 3.6 | 35 | 236 |
| Chain | 2.8 | 15 | 237 |
| Paper | 1.6 | 45 | 124 |
| Bottle | 4.5 | 41 | 478 |
| Bottle | 1.8 | 12 | 123 |
| Computer | 1450.75 | 71 | 784 |
| Spoon | 0.7 | 10 | 412 |
| Bottle | 1.3 | 15 | 781 |
| Rope | 0.9 | 14 | 965 |
+----------+---------+-------+--------+
Now I want to find the CODE for the name held in the variable :ccname that has the highest quantity. So I translated it like this:
SELECT CODE
FROM TABL
GROUP BY :ccname
WHERE QTY=MAX(QTY)
In a perfect world that would return 478.
In the SQL world what should I write in order to get 478?
You probably want something like this:
SELECT code
FROM TABL
WHERE Name=:ccname
ORDER BY QTY DESC
LIMIT 1
The idea is that we find all rows of the table whose Name column matches the contents of the variable :ccname, order them by quantity in descending order, and finally select the first one, which must be the one with the largest quantity because the rows are sorted in descending order.
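To check the filter, sort, and take-first idea against the question's data, here is a minimal sketch using Python's built-in sqlite3 module (SQLite accepts the LIMIT spelling). The named parameter :ccname is bound from a Python dict, which is an assumption about how the variable is supplied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TABL (Name TEXT, Price REAL, QTY INTEGER, CODE INTEGER)")
conn.executemany(
    "INSERT INTO TABL VALUES (?, ?, ?, ?)",
    [
        ("Rope", 3.6, 35, 236),
        ("Chain", 2.8, 15, 237),
        ("Paper", 1.6, 45, 124),
        ("Bottle", 4.5, 41, 478),
        ("Bottle", 1.8, 12, 123),
        ("Computer", 1450.75, 71, 784),
        ("Spoon", 0.7, 10, 412),
        ("Bottle", 1.3, 15, 781),
        ("Rope", 0.9, 14, 965),
    ],
)

ccname = "Bottle"

# Filter by name, sort by quantity descending, keep only the first row.
row = conn.execute(
    """
    SELECT CODE
    FROM TABL
    WHERE Name = :ccname
    ORDER BY QTY DESC
    LIMIT 1
    """,
    {"ccname": ccname},
).fetchone()

code = row[0]
print(code)
```

For "Bottle" the row with QTY 41 sorts first, so the query yields 478.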
Try this
SELECT CODE
FROM TABL
WHERE Name = :ccname
AND QTY = (SELECT MAX(QTY) FROM TABL WHERE Name = :ccname)
Use ORDER BY, a proper WHERE, and something to limit the result set to one row:
SELECT CODE
FROM TABL
WHERE name = :ccname
ORDER BY QTY DESC
FETCH FIRST 1 ROW ONLY;
Note: Some databases spell the ANSI standard FETCH FIRST 1 ROW ONLY as LIMIT or as SELECT TOP 1.
Depending on your specific database, you can use one of the following options to restrict your result set to a single value after ordering your existing columns through an ORDER BY clause:
SELECT TOP 1
LIMIT 1
FETCH FIRST 1 ROW ONLY
Syntax Examples
SELECT TOP 1 Code
FROM TABL
WHERE Name = :ccname
ORDER BY QTY DESC
or
SELECT Code
FROM TABL
WHERE Name = :ccname
ORDER BY QTY DESC
LIMIT 1
or
SELECT CODE
FROM TABL
WHERE Name = :ccname
ORDER BY QTY DESC
FETCH FIRST 1 ROW ONLY;
Using join can also effectively solve the question:
Select t1.Code
From TABL As t1 Join (
Select Name, Max(QTY) as MaxQTY
From TABL
Where Name = :ccname
Group by Name
) As t2
On t1.QTY = t2.MaxQTY And t1.Name = t2.Name
Explanation:
You first calculate the maximum value for "Bottle" using the subquery and then join the two tables to select corresponding row with MaxQTY and same name.
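One practical difference from the single-row approaches is that a join against the per-name maximum returns every row that ties for that maximum. The sketch below, using Python's built-in sqlite3 module, shows this with a subset of the question's data plus one hypothetical extra row (CODE 999) added to create a tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TABL (Name TEXT, Price REAL, QTY INTEGER, CODE INTEGER)")
conn.executemany(
    "INSERT INTO TABL VALUES (?, ?, ?, ?)",
    [
        ("Rope", 3.6, 35, 236),
        ("Bottle", 4.5, 41, 478),
        ("Bottle", 1.8, 12, 123),
        ("Bottle", 1.3, 15, 781),
    ],
)

JOIN_QUERY = """
    SELECT t1.CODE
    FROM TABL AS t1
    JOIN (
        SELECT Name, MAX(QTY) AS MaxQTY
        FROM TABL
        WHERE Name = :ccname
        GROUP BY Name
    ) AS t2
      ON t1.Name = t2.Name AND t1.QTY = t2.MaxQTY
    ORDER BY t1.CODE
"""


def codes_with_max_qty(name):
    """Return every CODE whose QTY equals the maximum for the given name."""
    return [code for (code,) in conn.execute(JOIN_QUERY, {"ccname": name})]


before_tie = codes_with_max_qty("Bottle")

# Hypothetical row that ties with CODE 478 on QTY.
conn.execute("INSERT INTO TABL VALUES ('Bottle', 2.0, 41, 999)")
after_tie = codes_with_max_qty("Bottle")

print(before_tie, after_tie)
```

Whether returning all tied rows is desirable depends on the use case; a query limited to one row would pick just one of them.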

PostgreSQL return multiple rows with DISTINCT though only latest date per second column

Let's say I have the following database table (date truncated for example only; the two 'id_' prefix columns join with other tables)...
+-----------+---------+------+--------------------+-------+
| id_table1 | id_tab2 | date | description | price |
+-----------+---------+------+--------------------+-------+
| 1 | 11 | 2014 | man-eating-waffles | 1.46 |
+-----------+---------+------+--------------------+-------+
| 2 | 22 | 2014 | Flying Shoes | 8.99 |
+-----------+---------+------+--------------------+-------+
| 3 | 44 | 2015 | Flying Shoes | 12.99 |
+-----------+---------+------+--------------------+-------+
...and I have a query like the following...
SELECT id, date, description FROM inventory ORDER BY date ASC;
How do I SELECT all the descriptions, but only once each, and with only the latest year for each description? So I need the query to return the first and last rows from the sample data above; the second is not returned because the last row has a later date.
Postgres has something called distinct on. This is usually more efficient than using window functions. So, an alternative method would be:
SELECT distinct on (description) id, date, description
FROM inventory
ORDER BY description, date desc;
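For readers without a Postgres instance at hand, the behaviour of distinct on can be approximated in plain Python: sort by the distinct-on column and then by date descending, and keep the first row of each run. This is only an emulation of the semantics, not how Postgres executes the query, and the `id` key follows the answer's column name rather than the table's `id_table1`.

```python
from itertools import groupby
from operator import itemgetter

# Sample rows from the question (price omitted for brevity).
inventory = [
    {"id": 1, "date": 2014, "description": "man-eating-waffles"},
    {"id": 2, "date": 2014, "description": "Flying Shoes"},
    {"id": 3, "date": 2015, "description": "Flying Shoes"},
]

# Equivalent of: ORDER BY description, date DESC
ordered = sorted(inventory, key=lambda r: (r["description"], -r["date"]))

# Equivalent of: DISTINCT ON (description), keep the first row per description
picked = [
    next(group)
    for _, group in groupby(ordered, key=itemgetter("description"))
]

for row in picked:
    print(row["id"], row["date"], row["description"])
```

The sort order decides which row survives for each description, which is why the ORDER BY in the SQL version must start with the distinct on expression.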
The row_number window function should do the trick:
SELECT id, date, description
FROM (SELECT id, date, description,
ROW_NUMBER() OVER (PARTITION BY description
ORDER BY date DESC) AS rn
FROM inventory) t
WHERE rn = 1
ORDER BY date ASC;