Get Previous Data Row in Oracle - sql

I have an Oracle table whose rows can be ordered by date. I need to fetch the rows that match a specific condition, plus the row immediately preceding each match. For example, if I have:
Date | Dept | Employee
:---- | ---: | :-------
18-Aug | 2 | John
19-Aug | 1 | Meredith
20-Aug | 9 | Steve
21-Aug | 0 | Bella
and I give the condition Dept = '0', it should return the following 2 rows:
Date | Dept | Employee
:---- | ---: | :-------
08/20 | 9 | Steve
08/21 | 0 | Bella

This returns every row with Dept 0 together with its predecessor. Because UNION removes duplicates, no row appears twice.
SELECT "Date", "Dept", "Employee"
FROM tab1
WHERE "Dept" = 0
UNION
SELECT "Date", "Dept", "Employee"
FROM tab1
WHERE "Date" IN (SELECT dat
                 FROM (SELECT "Date", "Dept",
                              LAG("Date") OVER (ORDER BY "Date") dat
                       FROM tab1) t1
                 WHERE "Dept" = 0)
Date | Dept | Employee
:---- | ---: | :-------
08/20 | 9 | Steve
08/21 | 0 | Bella

You can use a subquery to get the date of the required department, then select the first two rows, newest first, whose date is less than or equal to that date. Try the following (assuming the date is unique across rows):
Select D.Date_, D.Dept, D.Employee
From tbl_name D
Where D.Date_ <= (Select Date_ From tbl_name Where Dept = 0)
Order By Date_ DESC
FETCH NEXT 2 ROWS ONLY;
If the date is not unique, you can add an extra column to the ordering as a tie-breaker, e.g. the employee column. In that case, try the following:
With CTE As
(
Select Date_, Dept, Employee,
ROW_NUMBER() Over (Order By Date_ DESC, Employee) rn
From tbl_name
)
Select Date_, Dept, Employee
From CTE
Where rn >= (Select rn From CTE Where Dept = 0)
Order By rn -- without an explicit order, FETCH may return any 2 qualifying rows
FETCH NEXT 2 ROWS ONLY;
The second query works in both cases.
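As a hedged alternative to the queries above, the same result can be obtained in a single pass by looking ahead with LEAD instead of looking back with LAG: keep a row if it matches the condition or if the next row in date order matches. The sketch below runs this against an in-memory SQLite database from Python (SQLite 3.25+ is assumed for window functions); the column names d, dept, employee and the year 2022 are made up for the demo, and Oracle syntax may differ slightly.

```python
import sqlite3


def rows_with_predecessor(conn, dept):
    """Return rows whose dept matches, plus the row just before each in date order."""
    sql = """
        SELECT d, dept, employee
        FROM (
            SELECT d, dept, employee,
                   LEAD(dept) OVER (ORDER BY d) AS next_dept  -- dept of the following row
            FROM tab1
        )
        WHERE dept = ? OR next_dept = ?
        ORDER BY d
    """
    return conn.execute(sql, (dept, dept)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab1 (d TEXT, dept INTEGER, employee TEXT)")
# Sample rows from the question; the year is an assumption.
conn.executemany(
    "INSERT INTO tab1 VALUES (?, ?, ?)",
    [
        ("2022-08-18", 2, "John"),
        ("2022-08-19", 1, "Meredith"),
        ("2022-08-20", 9, "Steve"),
        ("2022-08-21", 0, "Bella"),
    ],
)

result = rows_with_predecessor(conn, 0)
print(result)
```

Because each row is evaluated once, no UNION is needed to remove duplicates when two consecutive rows both match.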

Related

MS SQL Server - How to generate end date for each line item

In the table, I have emp_id, rate_effective_date and hourly_rate columns, however, there is no rate_effective_end_date column.
Please help me with a query that generates rate_effective_end_date for each line item.
The rate_effective_end_date is the next rate_effective_date minus one day for the same employee.
Emp id | rate_effective_date | hourly_rate | rate_effective_end_date
--------+---------------------+-------------+------------------------
1 | 01/01/17 | 50 | 06/29/17
1 | 06/30/17 | 55 | 09/30/17
1 | 10/01/17 | 45 | 12/31/17
1 | 01/01/18 | 60 | {null}
Use the lead() function together with dateadd():
select *,
dateadd(day, -1, lead(rate_effective_date) over(partition by Emp_id order by rate_effective_date)) as rate_effective_end_date
from table t;
Alternatively, you could use a correlated subquery:
select *,
dateadd(day, -1, (select top 1 rate_effective_date from table
where emp_id = t.emp_id and
rate_effective_date > t.rate_effective_date
order by rate_effective_date)
) as rate_effective_end_date
from table t;
This will return all fields from myTable, and where a given employee has another record with a later rate_effective_date will return rate_effective_end_date calculated as 1 day before the following record's rate_effective_date.
If there is no later record, the returned value will be null.
; with cte as
(
select emp_id
, rate_effective_date
, hourly_rate
, row_number() over (partition by emp_id order by rate_effective_date ) r
from myTable
)
select a.emp_id
, a.rate_effective_date
, a.hourly_rate
, dateadd(day, -1, b.rate_effective_date) rate_effective_end_date
from cte a
left outer join cte b
on b.emp_id = a.emp_id
and b.r = a.r + 1
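
The "next date minus one day" rule can be checked quickly outside SQL Server. The following is only a sketch, using Python's sqlite3 module (SQLite 3.25+ assumed for window functions): SQLite's date(x, '-1 day') stands in for dateadd(day, -1, x), and the table name rates is made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rates (emp_id INTEGER, rate_effective_date TEXT, hourly_rate INTEGER)"
)
# Rows from the question, with dates rewritten in ISO format so they sort correctly.
conn.executemany(
    "INSERT INTO rates VALUES (?, ?, ?)",
    [
        (1, "2017-01-01", 50),
        (1, "2017-06-30", 55),
        (1, "2017-10-01", 45),
        (1, "2018-01-01", 60),
    ],
)

sql = """
    SELECT emp_id, rate_effective_date, hourly_rate,
           date(LEAD(rate_effective_date)
                    OVER (PARTITION BY emp_id ORDER BY rate_effective_date),
                '-1 day') AS rate_effective_end_date
    FROM rates
    ORDER BY emp_id, rate_effective_date
"""
rows = conn.execute(sql).fetchall()
end_dates = [row[3] for row in rows]
print(end_dates)
```

The last rate of each employee has no following row, so LEAD yields NULL and the end date comes back as None.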

Limit MAX() result to one row based on highest value in a particular field

Of course my data set is more complex, but this is essentially what I have:
+--------+--------+-------+
| SEQ_NO | FILTER | VALUE |
+--------+--------+-------+
| 1 | 'A' | 5 |
| 2 | 'A' | 10 |
| 3 | 'A' | 15 |
+--------+--------+-------+
Here is my query:
SELECT MAX(SEQ_NO)
, FILTER
, VALUE
FROM TABLE
GROUP BY FILTER
, VALUE
This returns my entire data set. How can I alter my query so that it only returns the record with the highest SEQ_NO ?
SELECT t1.*
FROM Table AS t1
INNER JOIN
(
-- Group by FILTER only: grouping by VALUE as well would return every row again
SELECT MAX(SEQ_NO) MAXSeq
, FILTER
FROM TABLE
GROUP BY FILTER
) t2 ON t1.SEQ_NO = t2.MAXSeq
AND t1.FILTER = t2.FILTER
Or using row_number (partitioned by FILTER only, so that one row per FILTER is kept):
SELECT *
FROM
(
SELECT t0.*,
row_number() over(partition by FILTER
order by SEQ_NO desc) as rn
FROM table t0
) t
WHERE rn = 1
From Oracle 12C:
SELECT SEQ_NO
, FILTER
, VALUE
FROM TABLE
ORDER BY SEQ_NO DESC
FETCH FIRST 1 ROWS ONLY;
You can use ROWNUM in Oracle (note that Oracle does not accept AS before a table alias):
select *
from
( select *
from yourTable
order by SEQ_NO desc ) t
where ROWNUM = 1;
This should work in SQL Server:
SELECT TOP 1 *
FROM TABLE
ORDER BY SEQ_NO DESC
If I understand correctly, you want the top SEQ_NO per filter?
I created this in SQL Server and converted it to Oracle:
SELECT a.SEQ_NO,
a.FILTER,
a.VALUE
FROM (
SELECT SEQ_NO,
FILTER,
VALUE,
MAX(SEQ_NO) OVER (PARTITION BY FILTER) m
FROM TABLE
) a
WHERE SEQ_NO = m
Using MySQL:
SELECT SEQ_NO
, VALUE
, FILTER
FROM TABLE
Order by SEQ_NO DESC LIMIT 1
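
To compare the "one row overall" and "one row per FILTER" readings, here is a small sketch of the ROW_NUMBER approach run through Python's sqlite3 module (SQLite 3.25+ assumed). The table name seq_data, the column names filter_key and val, and the extra 'B' row are made up for the demo; they are not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seq_data (seq_no INTEGER, filter_key TEXT, val INTEGER)")
conn.executemany(
    "INSERT INTO seq_data VALUES (?, ?, ?)",
    [
        (1, "A", 5),
        (2, "A", 10),
        (3, "A", 15),
        (4, "B", 7),  # hypothetical second group, added to show the per-group behaviour
    ],
)

# Number the rows of each filter_key from the highest seq_no down, then keep number 1.
per_group_sql = """
    SELECT seq_no, filter_key, val
    FROM (
        SELECT seq_no, filter_key, val,
               ROW_NUMBER() OVER (PARTITION BY filter_key ORDER BY seq_no DESC) AS rn
        FROM seq_data
    )
    WHERE rn = 1
    ORDER BY filter_key
"""
top_per_group = conn.execute(per_group_sql).fetchall()

# Single highest row overall, ignoring groups.
top_overall = conn.execute(
    "SELECT seq_no, filter_key, val FROM seq_data ORDER BY seq_no DESC LIMIT 1"
).fetchone()

print(top_per_group, top_overall)
```

Partitioning by the value column as well would put every row in its own partition and return the whole table again, which is the behaviour described in the question.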

Procedure to copy data from a table to another table in SQL Server

I have a table A, with 4 columns:
first_name, invoice, value, date.
And a table B (first_name, max_invoice_name, max_invoice_value, last_date)
I want to create a procedure in order to move data from A, to B, but:
first_name should be one time in B,
max_invoice_name is the name of the max invoice value
max_invoice_value is the max value
last_date is the latest date from invoices from the same first_name.
For example:
TABLE A:
Smith | Invoice1 | 100 | 23.06.2016
John | Invoice13 | 23 | 18.07.2016
Smith | Invoice3 | 200 | 01.01.2015
Table B should be:
Smith |Invoice3 | 200 | 23.06.2016
John |Invoice13| 23 | 18.07.2016
Something like this should work:
select *, (select max(date) from #Table1 T1 where T1.first_name = X.first_name)
from (
select
*,
row_number() over (partition by first_name order by invoice_Value desc) as RN
from
#Table1
) X
where RN = 1
row_number() takes care of selecting the row with the biggest value, and the max() subquery gets the latest date. You'll need to list the columns in the correct order instead of using *.
You will need to create two scalar functions, getMaxNameForMaxValue and getLastDateByFirstName, to get the values you want.
INSERT INTO TableB (first_name, max_invoice_name, max_invoice_value, last_date)
SELECT first_name,
       -- scalar functions need a schema prefix in SQL Server (dbo is assumed here)
       dbo.getMaxNameForMaxValue(MAX(value)) AS max_invoice_name,
       MAX(value) AS max_invoice_value,
       dbo.getLastDateByFirstName(first_name) AS last_date
FROM TableA
GROUP BY first_name
You can use something like this:
--INSERT INTO TableB
SELECT first_name,
invoice_name,
invoice_value,
last_date
FROM (
SELECT a.first_name,
a.invoice_name,
a.invoice_value,
COALESCE(p.last_date,a.last_date) as last_date,
ROW_NUMBER() OVER (PARTITION BY a.first_name ORDER BY a.last_date) as rn
FROM TableA a
OUTER APPLY (SELECT TOP 1 * FROM TableA WHERE first_name = a.first_name and last_date > a.last_date) as p
) as res
WHERE rn = 1
As output:
first_name invoice_name invoice_value last_date
John Invoice13 23 2016-07-18
Smith Invoice3 200 2016-06-23
Try this:
Insert into TableB(first_name, max_invoice_name, max_invoice_value, last_date)
select t1.first_name, t1.invoice, t1.value, t2.last_date
from TableA as t1 inner join
(
-- per first_name: the highest invoice value and the latest invoice date
select first_name, max(value) as max_value, max(date) as last_date
from TableA group by first_name
) as t2 on t1.first_name = t2.first_name and t1.value = t2.max_value
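
The window-function approach can be tried end to end with a short sketch. It uses Python's sqlite3 module (SQLite 3.25+ assumed) rather than SQL Server, so it is a plain INSERT ... SELECT and not a stored procedure. The names table_a, table_b and invoice_date are made up for the demo, and the dates from the question are rewritten in ISO format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_a (first_name TEXT, invoice TEXT, value INTEGER, invoice_date TEXT)"
)
conn.execute(
    "CREATE TABLE table_b (first_name TEXT, max_invoice_name TEXT, "
    "max_invoice_value INTEGER, last_date TEXT)"
)
conn.executemany(
    "INSERT INTO table_a VALUES (?, ?, ?, ?)",
    [
        ("Smith", "Invoice1", 100, "2016-06-23"),
        ("John", "Invoice13", 23, "2016-07-18"),
        ("Smith", "Invoice3", 200, "2015-01-01"),
    ],
)

# rn = 1 marks the highest-value invoice of each person;
# MAX(...) OVER gives the latest date across all of that person's invoices.
conn.execute(
    """
    INSERT INTO table_b (first_name, max_invoice_name, max_invoice_value, last_date)
    SELECT first_name, invoice, value, last_date
    FROM (
        SELECT first_name, invoice, value,
               MAX(invoice_date) OVER (PARTITION BY first_name) AS last_date,
               ROW_NUMBER() OVER (PARTITION BY first_name ORDER BY value DESC) AS rn
        FROM table_a
    )
    WHERE rn = 1
    """
)

table_b_rows = conn.execute("SELECT * FROM table_b ORDER BY first_name").fetchall()
print(table_b_rows)
```

Note that the date in each output row does not have to come from the same invoice as the maximum value, which matches the Smith row in the expected Table B.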

Retrieve a report on any duplicate rows of data in the emp table along with the count of the number of times that row of data is duplicated

I have EMP table as follows:
CREATE TABLE EMP
(
[ID] INT NOT NULL PRIMARY KEY,
[MGR_ID] INT,
[DEPT_ID] INT,
[NAME] VARCHAR(30),
[SAL] INT,
[DOJ] DATE
);
I need to retrieve a report on any duplicate rows of data in the emp table, along with the number of times each row is duplicated.
I partially solved this.
This query returns a single instance of each of the duplicated rows:
SELECT [MGR_ID],[DEPT_ID],[NAME],[SAL],[DOJ]
from EMP
group by [MGR_ID],[DEPT_ID],[NAME],[SAL],[DOJ]
having count(*) > 1
the output will be:
MGR_ID DEPT_ID NAME SAL DOJ
NULL 2 Hash 100 2012-01-01
1 2 Robo 100 2012-01-01
2 1 Privy 50 2012-05-01
I still need to group this output by the number of times each of these rows are duplicated in the EMP table.
I tried this:
WITH CTE
AS
(
SELECT * from EMP A
join ( SELECT [MGR_ID],[DEPT_ID],[NAME],[SAL],[DOJ]
from EMP
group by [MGR_ID],[DEPT_ID],[NAME],[SAL],[DOJ]
having count(*) > 1 ) B
on a.[MGR_ID] = b.[MGR_ID]
OR a.[MGR_ID] != b.[MGR_ID]
AND a.[DEPT_ID] = b.[DEPT_ID]
AND a.[NAME] = b.[NAME]
AND a.[SAL] = b.[SAL]
AND a.[DOJ] = b.[DOJ]
)
SELECT [MGR_ID],[DEPT_ID],[NAME],[SAL],[DOJ], DENSE_RANK() OVER
(PARTITION BY [MGR_ID],[DEPT_ID],[NAME],[SAL],[DOJ] ORDER BY DUPICATES) AS [DUPLICATES]
FROM CTE
But I got this error:
Msg 8156, Level 16, State 1, Line 1
The column 'MGR_ID' was specified multiple times for 'CTE'.
Please help.
I found a partial solution, but I still need to return the MGR_ID column in the output, including for the 3 records where it is NULL:
with cte as
(
SELECT A.[DEPT_ID],A.[NAME],A.[SAL],A.[DOJ] from EMP A
join ( SELECT [DEPT_ID],[NAME],[SAL],[DOJ]
from EMP
group by [DEPT_ID],[NAME],[SAL],[DOJ]
having count(*) > 1 ) B
ON a.[DEPT_ID] = b.[DEPT_ID]
AND a.[NAME] = b.[NAME]
AND a.[SAL] = b.[SAL]
AND a.[DOJ] = b.[DOJ]
)
SELECT [DEPT_ID],[NAME],[SAL],[DOJ], DENSE_RANK() OVER
(PARTITION BY [NAME] ORDER BY [NAME] DESC) AS [DUPLICATES], RANK() OVER
(PARTITION BY [NAME] ORDER BY [NAME] DESC) AS [SimpleRank]
FROM CTE
DEPT_ID NAME SAL DOJ DUPLICATES SimpleRank
2 Hash 100 2012-01-01 1 1
2 Hash 100 2012-01-01 1 1
2 Hash 100 2012-01-01 1 1
1 Privy 50 2012-05-01 1 1
1 Privy 50 2012-05-01 1 1
1 Privy 50 2012-05-01 1 1
2 Robo 100 2012-01-01 1 1
2 Robo 100 2012-01-01 1 1
2 Robo 100 2012-01-01 1 1
The final solution appears to be much easier:
Select [MGR_ID],[DEPT_ID],[NAME],[SAL],[DOJ], count(name) From EMP group by [MGR_ID],[DEPT_ID],[NAME],[SAL],[DOJ] having Count(Name) >1
It produces this result set
MGR_ID DEPT_ID NAME SAL DOJ Count_Of_Duplicated_Rows
NULL 2 Hash 100 2012-01-01 3
1 2 Robo 100 2012-01-01 3
2 1 Privy 50 2012-05-01 3
Note: this works only if you group by the columns that are duplicated.
The example below is based on the previous, more complex query. It validates all the fields in the row, whereas the simple query above checks only the particular column that you count and group by.
WITH CTE
AS
(
SELECT A.[MGR_ID], A.[DEPT_ID], A.[NAME], A.[SAL], A.[DOJ]
FROM EMP A
JOIN (SELECT [MGR_ID], [DEPT_ID], [NAME], [SAL], [DOJ]
FROM EMP
GROUP BY [MGR_ID], [DEPT_ID], [NAME], [SAL], [DOJ]
HAVING count(*) > 1) B
ON a.[MGR_ID] = b.[MGR_ID]
AND a.[DEPT_ID] = b.[DEPT_ID]
AND a.[NAME] = b.[NAME]
AND a.[SAL] = b.[SAL]
AND a.[DOJ] = b.[DOJ]
)
SELECT [MGR_ID],[DEPT_ID],[NAME],[SAL],[DOJ],
count(*) As Count_Of_Duplicated_Rows
FROM EMP
GROUP BY [MGR_ID],[DEPT_ID],[NAME],[SAL],[DOJ]
--HAVING Count(*) >1
Your problem is that you do not explicitly name the selected columns inside your CTE. Since both EMP and the subquery have a column called MGR_ID, doing select * on the join returns the column MGR_ID twice. According to MSDN, this is not allowed:
The list of column names is optional only if distinct names for all resulting columns are supplied in the query definition.
Note that you will encounter the same error for each pair of columns that exists on both sides of the join. To resolve this, you can either explicitly name the columns returned by the CTE in a column list with an alias for the repeated columns, like so:
WITH CTE (mgr_id,dept_id,name,sal,doj,mgr_id2,...) -- mgr_id2 is an alias for b.mgr_id
AS
...
You can refer to this SQLFiddle for a demo. Remove the column list and you will see the same error you see now.
Alternatively, you can specify the columns to be selected in the CTE itself. I would recommend this, since you don't actually need any repeated columns in your query:
;with cte as
(
SELECT A.[MGR_ID],A.[DEPT_ID],A.[NAME],A.[SAL],A.[DOJ] from EMP A
join ( SELECT [MGR_ID],[DEPT_ID],[NAME],[SAL],[DOJ]
from EMP
group by [MGR_ID],[DEPT_ID],[NAME],[SAL],[DOJ]
having count(*) > 1 ) B
...
Try this:
WITH CTE
AS
(
SELECT a.* from EMP A
join ( SELECT [MGR_ID],[DEPT_ID],[NAME],[SAL],[DOJ]
from EMP
group by [MGR_ID],[DEPT_ID],[NAME],[SAL],[DOJ]
having count(*) > 1 ) B
on a.[MGR_ID] = b.[MGR_ID]
--OR a.[MGR_ID] != b.[MGR_ID]
AND a.[DEPT_ID] = b.[DEPT_ID]
AND a.[NAME] = b.[NAME]
AND a.[SAL] = b.[SAL]
AND a.[DOJ] = b.[DOJ]
),cte2 as(
SELECT [MGR_ID],[DEPT_ID],[NAME],[SAL],[DOJ], DENSE_RANK() OVER
(PARTITION BY [MGR_ID],[DEPT_ID],[NAME],[SAL],[DOJ] ORDER BY [MGR_ID],[DEPT_ID],[NAME],[SAL],[DOJ]) AS [DUPLICATES]
FROM CTE )
select [MGR_ID],[DEPT_ID],[NAME],[SAL],[DOJ] from cte2 where DUPLICATES=1
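
One detail behind the missing NULL MGR_ID rows: GROUP BY puts NULLs into the same group, but a join condition written with = never matches NULL to NULL. The sketch below illustrates the difference using Python's sqlite3 module; the five sample rows are invented for the demo and do not reproduce the real EMP data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (id INTEGER PRIMARY KEY, mgr_id INTEGER, dept_id INTEGER, "
    "name TEXT, sal INTEGER, doj TEXT)"
)
# Hypothetical rows: Hash (NULL manager) and Robo appear twice, Privy once.
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, None, 2, "Hash", 100, "2012-01-01"),
        (2, None, 2, "Hash", 100, "2012-01-01"),
        (3, 1, 2, "Robo", 100, "2012-01-01"),
        (4, 1, 2, "Robo", 100, "2012-01-01"),
        (5, 2, 1, "Privy", 50, "2012-05-01"),
    ],
)

# GROUP BY treats the two NULL mgr_id values as one group, so Hash is reported.
duplicates = conn.execute(
    """
    SELECT mgr_id, dept_id, name, sal, doj, COUNT(*) AS dup_count
    FROM emp
    GROUP BY mgr_id, dept_id, name, sal, doj
    HAVING COUNT(*) > 1
    ORDER BY name
    """
).fetchall()

# Joining back with '=' on mgr_id drops Hash, because NULL = NULL is not true.
joined_names = conn.execute(
    """
    SELECT DISTINCT a.name
    FROM emp a
    JOIN (SELECT mgr_id, dept_id, name, sal, doj
          FROM emp
          GROUP BY mgr_id, dept_id, name, sal, doj
          HAVING COUNT(*) > 1) b
      ON a.mgr_id = b.mgr_id AND a.name = b.name
    ORDER BY a.name
    """
).fetchall()

print(duplicates, joined_names)
```

This is why the plain GROUP BY ... HAVING query returns the NULL-manager row while the CTE versions that join on MGR_ID do not.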

Deleting Duplicate Records in Oracle based on Maximum Date/Time

I have the following sample data with duplicate information:
ID Date Emp_ID Name Keep
---------------------------------------------------------
1 17/11/2010 13:45:22 101 AB *
2 17/11/2010 13:44:10 101 AB
3 17/11/2010 12:45:22 102 SF *
4 17/11/2010 12:44:10 102 SF
5 17/11/2010 11:45:22 103 RD *
6 17/11/2010 11:44:10 103 RD
Based on the above data set, how can I remove the duplicate Emp IDs and only keep the Emp IDs that have the maximum date/time specified?
So based on the above, I would only see IDs: 1, 3 and 5.
Thanks.
Something like:
DELETE FROM the_table_with_no_name
WHERE date_column != (SELECT MAX(t2.date_column)
FROM the_table_with_no_name t2
-- correlate on emp_id: id is unique per row, so matching on it would delete nothing
WHERE t2.emp_id = the_table_with_no_name.emp_id);
You can generate the ROWIDs of all rows other than the one with the maximum date for each Emp_ID and delete them. I have found this to perform well, as it is a set-based approach that uses analytic functions and ROWIDs.
--get list of all the rows to be deleted.
--date_column stands for the date/time column (DATE is a reserved word in Oracle)
select row_id from (
select rowid row_id,
row_number() over (partition by emp_id order by date_column desc) rn
from <table_name>
) where rn <> 1
And then delete the rows.
delete from <table_name> where rowid in (
select row_id from (
select rowid row_id,
row_number() over (partition by emp_id order by date_column desc) rn
from <table_name>
) where rn <> 1
);
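
To sanity-check the "keep only the newest row per Emp_ID" logic on the sample data, here is a minimal sketch using Python's sqlite3 module (SQLite 3.25+ assumed). It identifies rows by the primary key id instead of Oracle's ROWID, and the names emp_log and logged_at are made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp_log (id INTEGER PRIMARY KEY, logged_at TEXT, emp_id INTEGER, name TEXT)"
)
# Sample rows from the question, with timestamps rewritten in ISO format.
conn.executemany(
    "INSERT INTO emp_log VALUES (?, ?, ?, ?)",
    [
        (1, "2010-11-17 13:45:22", 101, "AB"),
        (2, "2010-11-17 13:44:10", 101, "AB"),
        (3, "2010-11-17 12:45:22", 102, "SF"),
        (4, "2010-11-17 12:44:10", 102, "SF"),
        (5, "2010-11-17 11:45:22", 103, "RD"),
        (6, "2010-11-17 11:44:10", 103, "RD"),
    ],
)

# Rank each employee's rows newest first and delete everything except rank 1.
conn.execute(
    """
    DELETE FROM emp_log
    WHERE id IN (
        SELECT id
        FROM (
            SELECT id,
                   ROW_NUMBER() OVER (PARTITION BY emp_id ORDER BY logged_at DESC) AS rn
            FROM emp_log
        )
        WHERE rn <> 1
    )
    """
)

remaining_ids = [row[0] for row in conn.execute("SELECT id FROM emp_log ORDER BY id")]
print(remaining_ids)
```

If two rows of the same employee share the exact same timestamp, ROW_NUMBER still keeps exactly one of them, whereas the MAX-based DELETE would keep both.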