How to show row repetition in Oracle SQL? [duplicate]

This question already has answers here: Oracle "Partition By" Keyword (6 answers). Closed last month.
I am trying to work out a query for a transaction table with data as shown below:
Dept   Employee  TransactionDate  Event
----------------------------------------
dept1  emp1      2022-05-20       abgd
dept1  emp1      2022-05-20       ggg
dept1  emp1      2022-05-20       hdfh
dept2  emp2      2022-01-26       3fdfds
dept2  emp2      2022-01-26       dsfsd
dept2  emp2      2022-01-26       554fsds
dept2  emp2      2022-01-26       gg32
dept2  emp2      2022-01-26       fd4gfg
I would like to count the number of times each Dept + Employee + TransactionDate combination is repeated and show that count against each event, as shown below:
Dept   Employee  TransactionDate  Event    count
------------------------------------------------
dept1  emp1      2022-05-20       abgd     3
dept1  emp1      2022-05-20       ggg      3
dept1  emp1      2022-05-20       hdfh     3
dept2  emp2      2022-01-26       3fdfds   5
dept2  emp2      2022-01-26       dsfsd    5
dept2  emp2      2022-01-26       554fsds  5
dept2  emp2      2022-01-26       gg32     5
dept2  emp2      2022-01-26       fd4gfg   5
I am looking for a way to get the expected view. Is it possible with a single SQL query?
Any pointers will be appreciated.

Use the COUNT analytic function:
SELECT t.*,
COUNT(*) OVER (PARTITION BY Dept, Employee, TransactionDate) AS cnt
FROM table_name t
Which, for the sample data:
CREATE TABLE table_name (Dept, Employee, TransactionDate, Event) AS
SELECT 'dept1', 'emp1', DATE '2022-05-20', 'abgd' FROM DUAL UNION ALL
SELECT 'dept1', 'emp1', DATE '2022-05-20', 'ggg' FROM DUAL UNION ALL
SELECT 'dept1', 'emp1', DATE '2022-05-20', 'hdfh' FROM DUAL UNION ALL
SELECT 'dept2', 'emp2', DATE '2022-01-26', '3fdfds' FROM DUAL UNION ALL
SELECT 'dept2', 'emp2', DATE '2022-01-26', 'dsfsd' FROM DUAL UNION ALL
SELECT 'dept2', 'emp2', DATE '2022-01-26', '554fsds' FROM DUAL UNION ALL
SELECT 'dept2', 'emp2', DATE '2022-01-26', 'gg32' FROM DUAL UNION ALL
SELECT 'dept2', 'emp2', DATE '2022-01-26', 'fd4gfg' FROM DUAL;
Outputs:
DEPT   EMPLOYEE  TRANSACTIONDATE      EVENT    CNT
--------------------------------------------------
dept1  emp1      2022-05-20 00:00:00  abgd     3
dept1  emp1      2022-05-20 00:00:00  hdfh     3
dept1  emp1      2022-05-20 00:00:00  ggg      3
dept2  emp2      2022-01-26 00:00:00  gg32     5
dept2  emp2      2022-01-26 00:00:00  554fsds  5
dept2  emp2      2022-01-26 00:00:00  dsfsd    5
dept2  emp2      2022-01-26 00:00:00  fd4gfg   5
dept2  emp2      2022-01-26 00:00:00  3fdfds   5
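For anyone who wants to try the analytic COUNT without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module. It assumes a SQLite build with window-function support (3.25 or later); the table and column names follow the answer above, and dates are stored as plain text, which is a simplification.

```python
import sqlite3

# In-memory database; SQLite stands in for Oracle here (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_name (dept TEXT, employee TEXT, transactiondate TEXT, event TEXT)"
)
rows = [
    ("dept1", "emp1", "2022-05-20", "abgd"),
    ("dept1", "emp1", "2022-05-20", "ggg"),
    ("dept1", "emp1", "2022-05-20", "hdfh"),
    ("dept2", "emp2", "2022-01-26", "3fdfds"),
    ("dept2", "emp2", "2022-01-26", "dsfsd"),
    ("dept2", "emp2", "2022-01-26", "554fsds"),
    ("dept2", "emp2", "2022-01-26", "gg32"),
    ("dept2", "emp2", "2022-01-26", "fd4gfg"),
]
conn.executemany("INSERT INTO table_name VALUES (?, ?, ?, ?)", rows)

# COUNT(*) OVER (...) keeps every row and attaches the size of its partition.
query = """
SELECT t.*,
       COUNT(*) OVER (PARTITION BY dept, employee, transactiondate) AS cnt
FROM table_name t
ORDER BY dept, event
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Unlike GROUP BY, the window version does not collapse the rows, which is why each event keeps its own line with the group count next to it.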

Related

SQL Joining transactions on Date Range

In SQL Server 2014, I'm working with two tables, an EMPLOYEE and a SALES table:
EMPID EMPNAME HIRE_DATE
---------------------------
1234 JOHN SMITH 2021-05-01
1235 JANE DOE 2021-08-05
1236 JANE SMITH 2021-07-31
EMPID SALE_DATE PRODUCT
-------------------------------------
1234 2021-05-05 VPN
1234 2021-05-10 VPN Basic
1234 2021-07-15 Cloud Storage Bronze
1234 2021-07-05 Cloud Storage Gold
1235 2021-10-01 Antivirus
I need to write a query that will produce all rows/columns from the EMPLOYEE table, with a column showing their (aggregated) sales, but ONLY sales that were triggered within 30 days of the hire date.
This query works, but will pull in ALL sales completed until present:
SELECT EMP.*, SALES.SALES_30_DAYS
FROM EMP
LEFT JOIN
(SELECT EMPID, COUNT(*) AS SALES_30_DAYS
FROM SALES
GROUP BY EMPID) SALES ON EMP.EMPID = SALES.EMPID
In this other attempt, HIRE_DATE is not recognized in the sub-query.
SELECT EMP.*, SALES.SALES_30_DAYS
FROM EMP
LEFT JOIN
(SELECT EMPID, COUNT(*) SALES_30_DAYS
FROM SALES
WHERE DATEDIFF(DD, HIRE_DATE, SALE_DATE) < 30
GROUP BY EMPID) SALES ON EMP.EMPID = SALES.EMPID
How can I re-write this query, so that the second table will provide the aggregated sales ONLY if the sale took place up to 30 days after the hire date?
Desired outcome:
EMPID EMPNAME HIRE_DATE SALES_30_DAYS
-----------------------------------------
1234 JOHN SMITH 2021-05-01 2
1235 JANE DOE 2021-08-05 1
1236 JANE SMITH 2021-07-31 NULL
WITH EMPLOYEES(EMPID, EMPNAME, HIRE_DATE)AS
(
SELECT 1234, 'JOHN SMITH', '2021-05-01' UNION ALL
SELECT 1235, 'JANE DOE' , '2021-08-05' UNION ALL
SELECT 1236, 'JANE SMITH' ,'2021-07-31'
),
SALES(EMPID, SALE_DATE, PRODUCT) AS
(
SELECT 1234, '2021-05-05' ,'VPN' UNION ALL
SELECT 1234 , '2021-05-10' ,'VPN Basic' UNION ALL
SELECT 1234 , '2021-07-15' ,'Cloud Storage Bronze' UNION ALL
SELECT 1234 , '2021-07-05' ,'Cloud Storage Gold' UNION ALL
SELECT 1235 , '2021-10-01', 'Antivirus'
)
SELECT E.EMPID,E.EMPNAME,E.HIRE_DATE,SALE_QUERY.CNTT
FROM EMPLOYEES E
OUTER APPLY
(
SELECT COUNT(*)CNTT
FROM SALES AS S WHERE E.EMPID=S.EMPID AND
S.SALE_DATE BETWEEN E.HIRE_DATE AND DATEADD(DD,30,E.HIRE_DATE)
)SALE_QUERY
Please check whether the query above works for you.
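The same idea (restrict the sales to the 30-day window per employee, then count) can also be written as a LEFT JOIN with the date condition in the ON clause. Below is a rough sketch in Python with sqlite3, not SQL Server: the `date(..., '+30 days')` call is SQLite's date arithmetic and stands in for DATEADD, and the lowercase table names are my own. With the sample rows exactly as posted, employee 1235's only sale (2021-10-01) is more than 30 days after the hire date, so the count comes out as 0 there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (empid INTEGER, empname TEXT, hire_date TEXT);
CREATE TABLE sales (empid INTEGER, sale_date TEXT, product TEXT);
INSERT INTO emp VALUES
  (1234, 'JOHN SMITH', '2021-05-01'),
  (1235, 'JANE DOE',   '2021-08-05'),
  (1236, 'JANE SMITH', '2021-07-31');
INSERT INTO sales VALUES
  (1234, '2021-05-05', 'VPN'),
  (1234, '2021-05-10', 'VPN Basic'),
  (1234, '2021-07-15', 'Cloud Storage Bronze'),
  (1234, '2021-07-05', 'Cloud Storage Gold'),
  (1235, '2021-10-01', 'Antivirus');
""")

# The date filter lives in the ON clause, so employees without a
# qualifying sale still appear; COUNT(s.empid) ignores the NULL-extended row.
query = """
SELECT e.empid, e.empname, e.hire_date, COUNT(s.empid) AS sales_30_days
FROM emp e
LEFT JOIN sales s
  ON s.empid = e.empid
 AND s.sale_date BETWEEN e.hire_date AND date(e.hire_date, '+30 days')
GROUP BY e.empid, e.empname, e.hire_date
ORDER BY e.empid
"""
sales_30 = conn.execute(query).fetchall()
for row in sales_30:
    print(row)
```

Putting the date condition in a WHERE clause instead would turn the outer join back into an inner join and drop employees with no sales in the window.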

Oracle SQL Anniversary Dates Query

I need some assistance in creating a query which shows the anniversary of employees in the business. I need the report to run and show results within two dates I select. I would like the report to show historical anniversary data too.
I would like the columns NAME, ANNIVERSARY DATE and YEARS WITH BUSINESS.
The columns I have in the dataset are name and date of employment.
Any help is appreciated!
TIA
Use MONTHS_BETWEEN and ADD_MONTHS:
SELECT name,
date_of_employment,
ADD_MONTHS(
date_of_employment,
(EXTRACT(YEAR FROM SYSDATE) - EXTRACT(YEAR FROM date_of_employment))*12
) AS anniversary,
MONTHS_BETWEEN(SYSDATE, date_of_employment)/12
AS years_with_business,
TRUNC(MONTHS_BETWEEN(SYSDATE, date_of_employment)/12)
AS full_years_with_business
FROM table_name;
Which, for the sample data:
CREATE TABLE table_name ( name, date_of_employment ) AS
SELECT 'Alice', ADD_MONTHS(TRUNC(SYSDATE - 1), -120) FROM DUAL UNION ALL
SELECT 'Beryl', ADD_MONTHS(TRUNC(SYSDATE + 0), -120) FROM DUAL UNION ALL
SELECT 'Carol', ADD_MONTHS(TRUNC(SYSDATE + 1), -120) FROM DUAL UNION ALL
SELECT 'Debra', ADD_MONTHS(TRUNC(SYSDATE - 1), +12) FROM DUAL UNION ALL
SELECT 'Emma', ADD_MONTHS(TRUNC(SYSDATE + 0), +12) FROM DUAL UNION ALL
SELECT 'Frances', ADD_MONTHS(TRUNC(SYSDATE + 1), +12) FROM DUAL;
Outputs:
NAME     DATE_OF_EMPLOYMENT   ANNIVERSARY          YEARS_WITH_BUSINESS                        FULL_YEARS_WITH_BUSINESS
------------------------------------------------------------------------------------------------------------------------
Alice    2011-08-18 00:00:00  2021-08-18 00:00:00  10.00435082511947431302270011947431302267  10
Beryl    2011-08-19 00:00:00  2021-08-19 00:00:00  10                                         10
Carol    2011-08-20 00:00:00  2021-08-20 00:00:00  9.99897448103345280764635603345280764633   9
Debra    2022-08-18 00:00:00  2021-08-18 00:00:00  -.9956491748805256869772998805256869773    0
Emma     2022-08-19 00:00:00  2021-08-19 00:00:00  -1                                         -1
Frances  2022-08-20 00:00:00  2021-08-20 00:00:00  -1.00102551896654719235364396654719235364  -1
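The query above gives the anniversary in the current year; the question also asks for anniversaries that fall between two chosen dates, including past ones. As a language-neutral illustration of that filter, here is a small Python sketch. The function names are hypothetical, and mapping a 29 February hire date to 28 February in non-leap years is an assumption of this sketch, not something stated in the thread.

```python
from datetime import date


def anniversary_in_year(hired: date, year: int) -> date:
    """Return the anniversary of `hired` in the given year."""
    try:
        return hired.replace(year=year)
    except ValueError:
        # Hired on 29 February and `year` is not a leap year (assumed rule).
        return hired.replace(year=year, day=28)


def anniversaries_between(hired: date, start: date, end: date):
    """List (anniversary_date, years_with_business) pairs within [start, end]."""
    found = []
    for year in range(start.year, end.year + 1):
        years = year - hired.year
        anniversary = anniversary_in_year(hired, year)
        # Skip the hire date itself and anything before it.
        if years >= 1 and start <= anniversary <= end:
            found.append((anniversary, years))
    return found


print(anniversaries_between(date(2011, 8, 19), date(2020, 1, 1), date(2021, 12, 31)))
```

In SQL the same effect would come from generating one candidate anniversary per year in the range and keeping those between the two bind dates.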

Oracle - Join most recent status at the time of activity

I have two tables - one with employee activity and one with employee_status. The issue is the employee status changes over time, so I need to join the status as it was at the time of the session.
>>> employee_activity
id session_start
emp1 1/1/2019
emp1 2/22/2019
emp1 3/1/2019
emp2 1/4/2019
emp2 2/23/2019
>>> employee_status
id status effective date
emp1 a 1/1/2018
emp1 b 2/1/2019
emp1 c 3/5/2019
emp2 a 6/1/2018
emp2 b 1/1/2019
So I started writing something that makes sure it ignores statuses after the activity, but I'm struggling to figure out how to select only the most recent status. The query needs to join only the status with the max effective date that is on or before the session start:
SELECT * FROM employee_activity a
LEFT join employee_status s on a.id = s.id WHERE s.effective_date <= a.session_start
-- how do I join only the most recent status?
The desired output from the two tables above would be
>>> my_output
id session_start status
emp1 1/1/2019 a
emp1 2/22/2019 b
emp1 3/1/2019 b
emp2 1/4/2019 b
emp2 2/23/2019 b
Thanks!!
First calculate the validity interval of each status, i.e. instead of a single EFFECTIVE_DATE you get a starting and an ending timestamp.
Note that I use a default open end date, and I subtract one second from the end date to get a closed interval that can be queried using BETWEEN.
Then simply join on the key and add the BETWEEN constraint on the time:
with emp as (
select ID, STATUS, EFFECTIVE_DATE status_valid_from,
lead(EFFECTIVE_DATE - INTERVAL '1' SECOND,1,DATE'2500-01-01')
over (partition by id order by EFFECTIVE_DATE) as status_valid_to
from employee_status)
SELECT a.id, a.SESSION_START, s.STATUS, s.STATUS_VALID_FROM
FROM employee_activity a
LEFT join emp s
on a.id = s.id and session_start between s.status_valid_from and s.status_valid_to
order by 1,2;
ID SESSION_START S STATUS_VALID_FROM
---- ------------------- - -------------------
emp1 01.01.2019 00:00:00 a 01.01.2018 00:00:00
emp1 22.02.2019 00:00:00 b 01.02.2019 00:00:00
emp1 01.03.2019 00:00:00 b 01.02.2019 00:00:00
emp2 04.01.2019 00:00:00 b 01.01.2019 00:00:00
emp2 23.02.2019 00:00:00 b 01.01.2019 00:00:00
Sample Data
create table employee_activity as
select 'emp1' id, to_date('1/1/2019','mm/dd/yyyy') session_start from dual union all
select 'emp1' id, to_date('2/22/2019','mm/dd/yyyy') session_start from dual union all
select 'emp1' id, to_date('3/1/2019','mm/dd/yyyy') session_start from dual union all
select 'emp2' id, to_date('1/4/2019','mm/dd/yyyy') session_start from dual union all
select 'emp2' id, to_date('2/23/2019','mm/dd/yyyy') session_start from dual;
create table employee_status as
select 'emp1' id, 'a'status, to_date('1/1/2018','mm/dd/yyyy') effective_date from dual union all
select 'emp1' id, 'b'status, to_date('2/1/2019','mm/dd/yyyy') effective_date from dual union all
select 'emp1' id, 'c'status, to_date('3/5/2019','mm/dd/yyyy') effective_date from dual union all
select 'emp2' id, 'a'status, to_date('6/1/2018','mm/dd/yyyy') effective_date from dual union all
select 'emp2' id, 'b'status, to_date('1/1/2019','mm/dd/yyyy') effective_date from dual;
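The interval approach can be tried out with Python's sqlite3 module as a rough sketch. Two things differ from the Oracle query and are my own choices: dates are ISO text strings, and instead of subtracting one second for BETWEEN the join uses a half-open interval (`>= valid_from AND < valid_to`). The `status_ranges` name and the `'9999-12-31'` open-end default are assumptions as well. It needs SQLite 3.25 or later for LEAD.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee_activity (id TEXT, session_start TEXT);
CREATE TABLE employee_status (id TEXT, status TEXT, effective_date TEXT);
INSERT INTO employee_activity VALUES
  ('emp1', '2019-01-01'), ('emp1', '2019-02-22'), ('emp1', '2019-03-01'),
  ('emp2', '2019-01-04'), ('emp2', '2019-02-23');
INSERT INTO employee_status VALUES
  ('emp1', 'a', '2018-01-01'), ('emp1', 'b', '2019-02-01'),
  ('emp1', 'c', '2019-03-05'),
  ('emp2', 'a', '2018-06-01'), ('emp2', 'b', '2019-01-01');
""")

# LEAD looks at the next status row of the same employee; its start date
# becomes the (exclusive) end of the current status.
query = """
WITH status_ranges AS (
  SELECT id, status,
         effective_date AS valid_from,
         LEAD(effective_date, 1, '9999-12-31')
           OVER (PARTITION BY id ORDER BY effective_date) AS valid_to
  FROM employee_status
)
SELECT a.id, a.session_start, s.status
FROM employee_activity a
LEFT JOIN status_ranges s
  ON s.id = a.id
 AND a.session_start >= s.valid_from
 AND a.session_start <  s.valid_to
ORDER BY a.id, a.session_start
"""
as_of = conn.execute(query).fetchall()
for row in as_of:
    print(row)
```

The half-open form avoids having to pick a "smallest unit" to subtract, which matters if the column ever holds fractional seconds.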
You can do this using a correlated subquery:
select ea.*,
(select max(es.status) keep (dense_rank first order by es.effective_date desc)
from employee_status es
where es.id = ea.id and es.effective_date <= ea.session_start
) as status
from employee_activity ea;
In Oracle 12C+, there is the more intuitive:
select ea.*,
(select es.status
from employee_status es
where es.id = ea.id and es.effective_date <= ea.session_start
order by es.effective_date desc
fetch first 1 row only
) as status
from employee_activity ea;
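A quick way to see how the correlated subquery behaves, sketched with Python's sqlite3 module: SQLite's `LIMIT 1` plays the role of `FETCH FIRST 1 ROW ONLY`, dates are ISO text, and the `emp3` activity row is a hypothetical addition of mine to show what happens when no status exists yet.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee_activity (id TEXT, session_start TEXT);
CREATE TABLE employee_status (id TEXT, status TEXT, effective_date TEXT);
INSERT INTO employee_activity VALUES
  ('emp1', '2019-01-01'), ('emp1', '2019-02-22'), ('emp1', '2019-03-01'),
  ('emp2', '2019-01-04'), ('emp2', '2019-02-23'),
  ('emp3', '2019-01-01');  -- hypothetical employee with no status rows
INSERT INTO employee_status VALUES
  ('emp1', 'a', '2018-01-01'), ('emp1', 'b', '2019-02-01'),
  ('emp1', 'c', '2019-03-05'),
  ('emp2', 'a', '2018-06-01'), ('emp2', 'b', '2019-01-01');
""")

# For each activity row, pick the latest status that was already effective.
query = """
SELECT ea.id, ea.session_start,
       (SELECT es.status
          FROM employee_status es
         WHERE es.id = ea.id
           AND es.effective_date <= ea.session_start
         ORDER BY es.effective_date DESC
         LIMIT 1) AS status
FROM employee_activity ea
ORDER BY ea.id, ea.session_start
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```

Because the subquery sits in the SELECT list, activity rows without any earlier status are kept and simply get NULL (None in Python).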

Remove specific data from a table which are common based on certain fields

I have a table EMPLOYEE as under:
Enroll Date STS EMP_ID EMP_Name DEPT Rank OST BLOCK
12-Jan-17 Q 123 ABC ABC123 12 Y 1000
14-Jan-17 Q 123 ABC DEF123 12 Y 1000
15-Jan-17 R 123 ABC DEF123 12 Y 100
15-Jan-17 R 123 ABC DEF123 12 Y 200
15-Jan-17 R 123 ABC DEF123 12 Y 300
20-Jan-17 R 123 ABC DEF123 10 Y 300
26-Jan-17 R 456 RST DEF456 8 N 200
26-Jan-17 R 456 RST DEF456 8 N 300
2-Feb-17 Q 123 ABC ABC123 12 Y 300
Now I need to remove the duplicate rows for each emp_id (rows are duplicates if EMP_Name, DEPT, OST and Rank are the same). If two rows have the same four values but a different enroll_date, I do not need to delete either row. If two rows have the same enroll date and the same four fields (OST, EMP_Name, DEPT and Rank), I need to keep the row with the highest block (1000, followed by 300, followed by 200, and so on).
So after deleting such data my table should have these rows:
Enroll Date STS EMP_ID EMP_Name DEPT Rank OST BLOCK
12-Jan-17 Q 123 ABC ABC123 12 Y 1000
14-Jan-17 Q 123 ABC DEF123 12 Y 1000
15-Jan-17 R 123 ABC DEF123 12 Y 100
2-Feb-17 Q 123 ABC ABC123 12 Y 300
20-Jan-17 R 123 ABC DEF123 10 Y 300
26-Jan-17 R 456 RST DEF456 8 N 200
26-Jan-17 R 456 RST DEF456 8 N 300
I tried the query below, planning to delete the rows that have rn > 1:
SELECT enroll_date, STS, BLOCK, EMP_ID, EMP_NAME, DEPT,RANK, OST, row_number() over ( partition BY emp_id, enroll_date,emp_name, dept, ost, rank ORDER BY enroll_date ASC, block DESC)rn
FROM employee
But I am getting rn as 1 every time.
Can someone check what the issue is here, or suggest some other way to do this?
I am creating a temporary table which will have all non duplicate values:
create table employee_temp as
with duplicates as (
SELECT enroll_date, STS, BLOCK, EMP_ID, EMP_NAME, DEPT,RANK, OST, row_number() over ( partition BY emp_id, trunc(enroll_date),emp_name, dept, ost, rank ORDER BY enroll_date ASC, block DESC)rn FROM employee )
SELECT enroll_date, STS, BLOCK, EMP_ID, EMP_NAME, DEPT,RANK, OST from duplicates where rn =1;
It looks like your enroll_date values have non-midnight times, so partitioning by those also made those combinations unique (even though they don't look it when you only show the date part).
My initial thought was that your analytic row_number() was partitioned by too many columns, and that you shouldn't be including the date value you want to order by - it doesn't really make sense to partition by and order by the same thing, as it will be unique. Reducing the columns you actually want to check against, perhaps to:
row_number() over (partition BY emp_id, emp_name, dept, ost, rank
ORDER BY enroll_date ASC, block DESC)
would produce different ranks rather than all being 1. But I don't think that's right; that would probably make your secondary block ordering somewhat redundant, as you'll maybe be unlikely to have two rows with exactly the same time for one ID. Unlikely but not impossible, perhaps.
Re-reading your wording again I don't think you want to be ordering by the enroll_date at all, and you do want to be partitioning by the date instead; but, given that it contains non-midnight times that you apparently want to ignore for this exercise, the partitioning would have to be on the truncated date (which strips the time back to midnight, by default):
row_number() over (partition BY trunc(enroll_date), emp_id, emp_name, dept, ost, rank
ORDER BY block DESC)
With your sample data as a CTE, including slightly different times within each day, and one extra row to get everything the same but the date, this shows your original rn and my two calculated values:
with employee (enroll_date, sts, emp_id, emp_name, dept, rank, ost, block) as (
select to_date('12-Jan-17 00:00:00', 'DD-Mon-RR HH24:MI:SS'), 'Q', 123, 'ABC', 'ABC123', 12, 'Y', 1000 from dual
union all select to_date('14-Jan-17 00:00:00', 'DD-Mon-RR HH24:MI:SS'), 'Q', 123, 'ABC', 'DEF123', 12, 'Y', 1000 from dual
union all select to_date('15-Jan-17 00:00:01', 'DD-Mon-RR HH24:MI:SS'), 'R', 123, 'ABC', 'DEF123', 12, 'Y', 100 from dual
union all select to_date('15-Jan-17 00:00:02', 'DD-Mon-RR HH24:MI:SS'), 'R', 123, 'ABC', 'DEF123', 12, 'Y', 200 from dual
union all select to_date('15-Jan-17 00:00:03', 'DD-Mon-RR HH24:MI:SS'), 'R', 123, 'ABC', 'DEF123', 12, 'Y', 300 from dual
union all select to_date('20-Jan-17 00:00:00', 'DD-Mon-RR HH24:MI:SS'), 'R', 123, 'ABC', 'DEF123', 10, 'Y', 300 from dual
union all select to_date('26-Jan-17 00:00:00', 'DD-Mon-RR HH24:MI:SS'), 'R', 456, 'RST', 'DEF456', 8, 'N', 200 from dual
union all select to_date('26-Jan-17 00:00:01', 'DD-Mon-RR HH24:MI:SS'), 'R', 456, 'RST', 'DEF456', 8, 'N', 300 from dual
union all select to_date('2-Feb-17 00:00:00', 'DD-Mon-RR HH24:MI:SS'), 'Q', 123, 'ABC', 'ABC123', 12, 'Y', 300 from dual
union all select to_date('3-Feb-17 00:00:00', 'DD-Mon-RR HH24:MI:SS'), 'Q', 123, 'ABC', 'ABC123', 12, 'Y', 300 from dual
)
SELECT to_char(enroll_date, 'DD-Mon-RR') as date_only,
enroll_date, sts, block, emp_id, emp_name, dept, rank, ost,
row_number() over ( partition BY emp_id, enroll_date, emp_name, dept, ost, rank
ORDER BY enroll_date ASC, block DESC) your_rn,
row_number() over (partition BY emp_id, emp_name, dept, ost, rank
ORDER BY enroll_date ASC, block DESC) my_rn_1,
row_number() over (partition BY trunc(enroll_date), emp_id, emp_name, dept, ost, rank
ORDER BY block DESC) as my_rn_2
FROM employee
ORDER BY enroll_date;
DATE_ONLY ENROLL_DATE S BLOCK EMP_ID EMP DEPT RANK O YOUR_RN MY_RN_1 MY_RN_2
--------- ------------------- - ----- ------ --- ------ ---- - ------- ------- -------
12-Jan-17 2017-01-12 00:00:00 Q 1000 123 ABC ABC123 12 Y 1 1 1
14-Jan-17 2017-01-14 00:00:00 Q 1000 123 ABC DEF123 12 Y 1 1 1
15-Jan-17 2017-01-15 00:00:01 R 100 123 ABC DEF123 12 Y 1 2 3
15-Jan-17 2017-01-15 00:00:02 R 200 123 ABC DEF123 12 Y 1 3 2
15-Jan-17 2017-01-15 00:00:03 R 300 123 ABC DEF123 12 Y 1 4 1
20-Jan-17 2017-01-20 00:00:00 R 300 123 ABC DEF123 10 Y 1 1 1
26-Jan-17 2017-01-26 00:00:00 R 200 456 RST DEF456 8 N 1 1 2
26-Jan-17 2017-01-26 00:00:01 R 300 456 RST DEF456 8 N 1 2 1
02-Feb-17 2017-02-02 00:00:00 Q 300 123 ABC ABC123 12 Y 1 2 1
03-Feb-17 2017-02-03 00:00:00 Q 300 123 ABC ABC123 12 Y 1 3 1
To identify the rows to delete you can use a subquery:
SELECT enroll_date, sts, block, emp_id, emp_name, dept, rank, ost
FROM (
SELECT enroll_date, sts, block, emp_id, emp_name, dept, rank, ost,
row_number() over (partition BY trunc(enroll_date), emp_id, emp_name, dept, ost, rank
ORDER BY block DESC) as my_rn_2
FROM employee
)
WHERE my_rn_2 > 1
ORDER BY enroll_date;
ENROLL_DATE S BLOCK EMP_ID EMP DEPT RANK O
------------------- - ----- ------ --- ------ ---- -
2017-01-15 00:00:01 R 100 123 ABC DEF123 12 Y
2017-01-15 00:00:02 R 200 123 ABC DEF123 12 Y
2017-01-26 00:00:00 R 200 456 RST DEF456 8 N
You'll need to decide what actually makes sense for your data and requirements though.
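To see the "partition by the truncated date, keep the highest block" rule actually delete rows, here is a sketch in Python with sqlite3. Several things are assumptions of the sketch rather than part of the answer: SQLite's `date()` stands in for Oracle's `trunc()`, the `rank` column is renamed `emp_rank`, the DELETE keys on SQLite's implicit `rowid`, and timestamps are ISO text. The rows are the ones from the CTE above, including the extra 3-Feb row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (enroll_date TEXT, sts TEXT, emp_id INTEGER, "
    "emp_name TEXT, dept TEXT, emp_rank INTEGER, ost TEXT, block INTEGER)"
)
rows = [
    ("2017-01-12 00:00:00", "Q", 123, "ABC", "ABC123", 12, "Y", 1000),
    ("2017-01-14 00:00:00", "Q", 123, "ABC", "DEF123", 12, "Y", 1000),
    ("2017-01-15 00:00:01", "R", 123, "ABC", "DEF123", 12, "Y", 100),
    ("2017-01-15 00:00:02", "R", 123, "ABC", "DEF123", 12, "Y", 200),
    ("2017-01-15 00:00:03", "R", 123, "ABC", "DEF123", 12, "Y", 300),
    ("2017-01-20 00:00:00", "R", 123, "ABC", "DEF123", 10, "Y", 300),
    ("2017-01-26 00:00:00", "R", 456, "RST", "DEF456", 8, "N", 200),
    ("2017-01-26 00:00:01", "R", 456, "RST", "DEF456", 8, "N", 300),
    ("2017-02-02 00:00:00", "Q", 123, "ABC", "ABC123", 12, "Y", 300),
    ("2017-02-03 00:00:00", "Q", 123, "ABC", "ABC123", 12, "Y", 300),
]
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)

# Number the rows within each (day, employee, name, dept, ost, rank) group,
# highest block first, then delete everything that is not number 1.
conn.execute("""
DELETE FROM employee
WHERE rowid IN (
  SELECT rid FROM (
    SELECT rowid AS rid,
           ROW_NUMBER() OVER (
             PARTITION BY date(enroll_date), emp_id, emp_name, dept, ost, emp_rank
             ORDER BY block DESC) AS rn
    FROM employee)
  WHERE rn > 1)
""")

remaining = conn.execute(
    "SELECT date(enroll_date), emp_id, block FROM employee ORDER BY enroll_date"
).fetchall()
for row in remaining:
    print(row)
```

The three rows removed are the same ones the subquery above identifies (blocks 100 and 200 on 15-Jan, block 200 on 26-Jan).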

I have an emp_salary table with emp_id, salary and salary_date. I want to write a query to find out which employee was paid highest in every month

Following is the table:
emp_id salary salary_date
Emp1 1000 Feb 01
Emp1 2000 Feb 15
Emp1 3000 Feb 28
Emp1 4000 Mar 01
Emp2 5000 Jan 01
Emp2 6000 Jan 15
Emp2 2000 Mar 01
Emp2 5000 Apr 01
Emp3 1000 Jan 01
Emp4 3000 Dec 31
Emp4 5000 Dec 01
And I want the following result:
Emp1 Feb 3000
Emp2 Jan 6000
Emp4 Dec 5000
Emp2 Apr 5000
Emp1 Mar 4000
SELECT e.Emp_ID,MaxSalary,MonthName
FROM employeeTable e
INNER JOIN
(
SELECT MAX(salary) as MaxSalary,
LEFT(salary_date,3) as MonthName
FROM employeeTable
GROUP BY LEFT(salary_date,3)
)t
ON e.Salary=t.MaxSalary
AND LEFT(e.salary_date,3)=t.MonthName
Here's my take on it using the nice RANK() function:
SELECT emp_id, to_char(salary_date, 'MM, YYYY') AS monthyear, salary FROM (
SELECT emp_id, salary_date, salary,
rank() over (partition BY to_char(salary_date, 'MM-YYYY') ORDER BY salary DESC) rnk
FROM emp_salary) WHERE rnk = 1
Output:
EMP_ID MONTHYEAR SALARY
Emp2 01, 2014 6000
Emp1 02, 2014 3000
Emp1 03, 2014 4000
Emp2 04, 2014 5000
Emp4 12, 2014 5000
Here's the SQL Fiddle to play with the data and query: http://sqlfiddle.com/#!4/e58eaa/32
I have found my answer!!!
select emp_id, substr(sal_date,1,3) monthly, salary from
(select emp_id, sal_date, salary,
max(salary)over (partition by substr(sal_date,1,3)) max_sal from emp_salary order by emp_id)
where salary=max_sal;
Result-set:
EMP_ID MONTHLY SALARY
Emp1 Mar 4000
Emp1 Feb 3000
Emp2 Jan 6000
Emp2 Apr 5000
Emp4 Dec 5000
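For comparison, the RANK-per-month approach can be run locally with Python's sqlite3 module. This is only a sketch: it assumes real dates in the year 2014 (the year shown in the RANK answer's output), uses SQLite's `strftime('%Y-%m', ...)` in place of `to_char`, and the `month_key` alias is my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp_salary (emp_id TEXT, salary INTEGER, salary_date TEXT)")
rows = [
    ("Emp1", 1000, "2014-02-01"), ("Emp1", 2000, "2014-02-15"),
    ("Emp1", 3000, "2014-02-28"), ("Emp1", 4000, "2014-03-01"),
    ("Emp2", 5000, "2014-01-01"), ("Emp2", 6000, "2014-01-15"),
    ("Emp2", 2000, "2014-03-01"), ("Emp2", 5000, "2014-04-01"),
    ("Emp3", 1000, "2014-01-01"),
    ("Emp4", 3000, "2014-12-31"), ("Emp4", 5000, "2014-12-01"),
]
conn.executemany("INSERT INTO emp_salary VALUES (?, ?, ?)", rows)

# Rank payments within each calendar month, highest salary first,
# and keep only the top-ranked row(s) per month.
query = """
SELECT emp_id, month_key, salary FROM (
  SELECT emp_id,
         strftime('%Y-%m', salary_date) AS month_key,
         salary,
         RANK() OVER (PARTITION BY strftime('%Y-%m', salary_date)
                      ORDER BY salary DESC) AS rnk
  FROM emp_salary)
WHERE rnk = 1
ORDER BY month_key
"""
top_paid = conn.execute(query).fetchall()
for row in top_paid:
    print(row)
```

RANK returns every tied row for a month; ROW_NUMBER would pick just one of them arbitrarily, so the choice depends on how ties should be reported.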