SQL: joining transactions on a date range

In SQL Server 2014, I'm working with two tables, an EMPLOYEE and a SALES table:
EMPID EMPNAME HIRE_DATE
---------------------------
1234 JOHN SMITH 2021-05-01
1235 JANE DOE 2021-08-05
1236 JANE SMITH 2021-07-31
EMPID SALE_DATE PRODUCT
-------------------------------------
1234 2021-05-05 VPN
1234 2021-05-10 VPN Basic
1234 2021-07-15 Cloud Storage Bronze
1234 2021-07-05 Cloud Storage Gold
1235 2021-10-01 Antivirus
I need to write a query that will produce all rows/columns from the EMPLOYEE table, with a column showing their (aggregated) sales, but ONLY sales that were triggered within 30 days of the hire date.
This query works, but it pulls in ALL sales completed to date:
SELECT EMP.*, SALES.SALES_30_DAYS
FROM EMP
LEFT JOIN
(SELECT EMPID, COUNT(*) AS SALES_30_DAYS
FROM SALES
GROUP BY EMPID) SALES ON EMP.EMPID = SALES.EMPID
In this other attempt, HIRE_DATE is not recognized in the sub-query.
SELECT EMP.*, SALES_30_DAYS
FROM EMP
LEFT JOIN
(SELECT EMPID, COUNT(*) SALES_30_DAYS
FROM SALES
WHERE DATEDIFF(DD, HIRE_DATE, SALE_DATE) < 30
GROUP BY EMPID) ON EMP.EMPID= SALES.EMPID
How can I re-write this query, so that the second table will provide the aggregated sales ONLY if the sale took place up to 30 days after the hire date?
Desired outcome:
EMPID EMPNAME HIRE_DATE SALES_30_DAYS
-----------------------------------------
1234 JOHN SMITH 2021-05-01 2
1235 JANE DOE 2021-08-05 1
1236 JANE SMITH 2021-07-31 NULL

WITH EMPLOYEES(EMPID, EMPNAME, HIRE_DATE)AS
(
SELECT 1234, 'JOHN SMITH', '2021-05-01' UNION ALL
SELECT 1235, 'JANE DOE' , '2021-08-05' UNION ALL
SELECT 1236, 'JANE SMITH' ,'2021-07-31'
),
SALES(EMPID, SALE_DATE, PRODUCT) AS
(
SELECT 1234, '2021-05-05' ,'VPN' UNION ALL
SELECT 1234 , '2021-05-10' ,'VPN Basic' UNION ALL
SELECT 1234 , '2021-07-15' ,'Cloud Storage Bronze' UNION ALL
SELECT 1234 , '2021-07-05' ,'Cloud Storage Gold' UNION ALL
SELECT 1235 , '2021-10-01', 'Antivirus'
)
SELECT E.EMPID, E.EMPNAME, E.HIRE_DATE, SALE_QUERY.CNTT
FROM EMPLOYEES E
OUTER APPLY
(
SELECT COUNT(*) AS CNTT
FROM SALES AS S
WHERE E.EMPID = S.EMPID
AND S.SALE_DATE BETWEEN E.HIRE_DATE AND DATEADD(DD, 30, E.HIRE_DATE)
) SALE_QUERY
Please check whether the above works for you. Note that COUNT(*) returns 0, not NULL, for employees with no sales in the 30-day window.
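For anyone who wants to try the idea without a SQL Server instance, here is a minimal sketch that runs an equivalent query through Python's built-in sqlite3 module. It is not the answer's exact method: SQLite has no OUTER APPLY or DATEADD, so this assumes a LEFT JOIN with the date range in the ON clause and SQLite's date() function instead. The function name sales_within_30_days is made up for the example, and NULLIF is used only to turn a zero count into NULL as in the desired outcome.

```python
import sqlite3


def sales_within_30_days():
    """Count each employee's sales made within 30 days of the hire date."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE EMPLOYEES (EMPID INTEGER, EMPNAME TEXT, HIRE_DATE TEXT);
        CREATE TABLE SALES (EMPID INTEGER, SALE_DATE TEXT, PRODUCT TEXT);
        INSERT INTO EMPLOYEES VALUES
            (1234, 'JOHN SMITH', '2021-05-01'),
            (1235, 'JANE DOE',   '2021-08-05'),
            (1236, 'JANE SMITH', '2021-07-31');
        INSERT INTO SALES VALUES
            (1234, '2021-05-05', 'VPN'),
            (1234, '2021-05-10', 'VPN Basic'),
            (1234, '2021-07-15', 'Cloud Storage Bronze'),
            (1234, '2021-07-05', 'Cloud Storage Gold'),
            (1235, '2021-10-01', 'Antivirus');
    """)
    rows = conn.execute("""
        SELECT E.EMPID, E.EMPNAME, E.HIRE_DATE,
               -- COUNT(S.EMPID) ignores the NULLs produced by unmatched rows;
               -- NULLIF turns a zero count into NULL
               NULLIF(COUNT(S.EMPID), 0) AS SALES_30_DAYS
        FROM EMPLOYEES E
        LEFT JOIN SALES S
               ON S.EMPID = E.EMPID
              AND S.SALE_DATE BETWEEN E.HIRE_DATE
                                  AND date(E.HIRE_DATE, '+30 days')
        GROUP BY E.EMPID, E.EMPNAME, E.HIRE_DATE
        ORDER BY E.EMPID
    """).fetchall()
    conn.close()
    return rows


for row in sales_within_30_days():
    print(row)
```

With the sample data from the question, employee 1235 comes back with no qualifying sales, because the 2021-10-01 sale is more than 30 days after the 2021-08-05 hire date.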

Related

List the branch that monthly pays the most in salaries

I have this table; the expected output should be B003, since it pays 54,000:
STAFF SALARY BRAN
-----------------
SL21 30000 B005
SG37 12000 B003
SG14 18000 B003
SA9 9000 B007
SG5 24000 B003
SL41 9000 B005
So far I only have this subquery, which isn't working how I expected.
SELECT BRANCHNO
FROM STAFF
WHERE (SALARY) IN (SELECT MAX(SUM(SALARY))
FROM STAFF
GROUP BY BRANCHNO);
This works but I want a subquery that returns the branchno
SELECT MAX(SUM(SALARY))
FROM STAFF
GROUP BY BRANCHNO;
select BRANCHNO, max(sum_sal)
from (SELECT BRANCHNO, SUM(SALARY) sum_sal
FROM STAFF
GROUP BY BRANCHNO) q1
group by BRANCHNO ;
The column used to group the rows can be displayed. So, add BRANCHNO to your select clause.
One option is to use the RANK analytic function, which ranks branches by the sum of their salaries in descending order; you then return the one(s) ranked highest (rnk = 1).
Sample data:
with staff (staff, salary, bran) as
(select 'SL21', 30000, 'B005' from dual union all
select 'SG37', 12000, 'B003' from dual union all
select 'SG14', 18000, 'B003' from dual union all
select 'SA9' , 9000, 'B007' from dual union all
select 'SG5' , 24000, 'B003' from dual union all
select 'SL41', 9000, 'B005' from dual
)
Query:
select bran
from (select bran, rank() over (order by sum(salary) desc) rnk
from staff
group by bran
)
where rnk = 1;
BRAN
----
B003
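If it helps to see the two steps separately, here is a rough sketch of the same ranking idea run through Python's sqlite3 module (this assumes SQLite 3.25 or later for window functions; it is not Oracle). The inner query sums salaries per branch and the outer one ranks those sums; top_branches is a name invented for this example. Because RANK gives tied branches the same rank, more than one branch can come back.

```python
import sqlite3


def top_branches(staff_rows):
    """Return the branch(es) with the highest salary total."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staff (staff TEXT, salary INTEGER, bran TEXT)")
    conn.executemany("INSERT INTO staff VALUES (?, ?, ?)", staff_rows)
    rows = conn.execute("""
        SELECT bran
        FROM (SELECT bran,
                     RANK() OVER (ORDER BY total DESC) AS rnk  -- ties share a rank
              FROM (SELECT bran, SUM(salary) AS total          -- step 1: sum per branch
                    FROM staff
                    GROUP BY bran))
        WHERE rnk = 1
        ORDER BY bran
    """).fetchall()
    conn.close()
    return [r[0] for r in rows]


sample = [
    ("SL21", 30000, "B005"),
    ("SG37", 12000, "B003"),
    ("SG14", 18000, "B003"),
    ("SA9", 9000, "B007"),
    ("SG5", 24000, "B003"),
    ("SL41", 9000, "B005"),
]
print(top_branches(sample))  # ['B003']
```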

Need SQL query to get the expected output below

Tried the below query, but it works only for the first and second records.
SELECT
name,
dept,
sal,
(
coalesce(sal, 0) + coalesce(saltodrop, 0)
) AS running_total
FROM (
SELECT
name,
dept,
sal,
LAG(Sal, 1, 0) OVER(
PARTITION BY [dept]
ORDER BY
[name],
[dept] ASC
) AS [saltodrop]
FROM
dataset
) AS data_set_extract
Input
Name dept sal
John sales 10000
Tom sales 8000
Tim sales 5000
George finance 6000
Dane finance 4000
Mike hr 5000
Meme hr 6000
Ark it 5000
Output
Name dept sal
John sales 10000
Tom sales 18000
Tim sales 23000
George finance 29000
Dane finance 33000
Mike hr 38000
Meme hr 44000
Ark it 49000
I'm using Oracle. I need a running total: the first row keeps its own sal, the second row shows the sum of the first two records, the third row adds the third record to that sum, and so on.
Assuming you have already ordered the results then use the SUM analytic function and order by ROWNUM to keep the current ordering:
SELECT t.*,
SUM(sal) OVER (ORDER BY ROWNUM) AS cumulative_sal
FROM table_name t;
Which, for the sample data:
CREATE TABLE table_name (Name, dept, sal) AS
SELECT 'John', 'sales', 10000 FROM DUAL UNION ALL
SELECT 'Tom', 'sales', 8000 FROM DUAL UNION ALL
SELECT 'Tim', 'sales', 5000 FROM DUAL UNION ALL
SELECT 'George', 'finance', 6000 FROM DUAL UNION ALL
SELECT 'Dane', 'finance', 4000 FROM DUAL UNION ALL
SELECT 'Mike', 'hr', 5000 FROM DUAL UNION ALL
SELECT 'Meme', 'hr', 6000 FROM DUAL UNION ALL
SELECT 'Ark', 'it', 5000 FROM DUAL;
Outputs:
NAME DEPT SAL CUMULATIVE_SAL
------------------------------
John sales 10000 10000
Tom sales 8000 18000
Tim sales 5000 23000
George finance 6000 29000
Dane finance 4000 33000
Mike hr 5000 38000
Meme hr 6000 44000
Ark it 5000 49000
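The answer above relies on the rows already arriving in the order you want. If the table has a column that records that order, ordering the window by it is more explicit. Here is a minimal sketch using Python's sqlite3 module (SQLite 3.25+ assumed, not Oracle); the seq column and the running_totals name are hypothetical and not part of the question's table.

```python
import sqlite3


def running_totals():
    """Cumulative salary ordered by an explicit sequence column."""
    conn = sqlite3.connect(":memory:")
    # seq is a hypothetical column that records the intended row order
    conn.execute(
        "CREATE TABLE table_name (seq INTEGER, name TEXT, dept TEXT, sal INTEGER)"
    )
    conn.executemany(
        "INSERT INTO table_name VALUES (?, ?, ?, ?)",
        [
            (1, "John", "sales", 10000),
            (2, "Tom", "sales", 8000),
            (3, "Tim", "sales", 5000),
            (4, "George", "finance", 6000),
            (5, "Dane", "finance", 4000),
            (6, "Mike", "hr", 5000),
            (7, "Meme", "hr", 6000),
            (8, "Ark", "it", 5000),
        ],
    )
    rows = conn.execute("""
        SELECT name, dept, sal,
               SUM(sal) OVER (ORDER BY seq) AS cumulative_sal
        FROM table_name
        ORDER BY seq
    """).fetchall()
    conn.close()
    return rows


for row in running_totals():
    print(row)
```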

Oracle SQL - Scanning attribute Changes

I have the following employee table
EMPID RECORD_DATE DEPARTMENT
123456 2020-01-01 HR
123456 2020-02-01 HR
123456 2020-03-01 FINANCE
123456 2020-04-01 FINANCE
987654 2020-01-01 HR
987654 2020-02-01 HR
987654 2020-03-01 HR
987654 2020-04-01 LEGAL
Using Oracle SQL, I need to build a query that lists employee movements, specifically employees who have moved from HR to any other (non-HR) department.
Expected result:
EMPID MOVEMENT_DATE DEPT_BEFORE DEPT_AFTER
123456 2020-03-01 HR FINANCE
987654 2020-04-01 HR LEGAL
I know you can use the Lead or Lag function, but it's a little off for me:
SELECT
EMP
,RECORD_DATE
,LAG(DEPARTMENT, 1, 0) OVER (PARTITION BY EMP ORDER BY RECORD_DATE) PREV
FROM EMP
Here are some values to work with:
CREATE TABLE #EMP
(
EMP VARCHAR(30) NOT NULL ,
RECORD_DATE DATE NOT NULL ,
DEPARTMENT VARCHAR(30) NOT NULL
);
INSERT INTO #EMP (EMP, RECORD_DATE, DEPARTMENT)
VALUES
('123456','2020-01-01','HR'),
('123456','2020-02-01','HR'),
('123456','2020-03-01','FINANCE'),
('123456','2020-04-01','FINANCE'),
('987654','2020-01-01','HR'),
('987654','2020-02-01','HR'),
('987654','2020-03-01','HR'),
('987654','2020-04-01','LEGAL')
You could do it using the LAG function:
WITH data AS(
SELECT 123456 EMPID, DATE '2020-01-01' RECORD_DATE, 'HR' DEPARTMENT FROM dual UNION ALL
SELECT 123456 EMPID, DATE '2020-02-01' RECORD_DATE, 'HR' DEPARTMENT FROM dual UNION ALL
SELECT 123456 EMPID, DATE '2020-03-01' RECORD_DATE, 'FINANCE' DEPARTMENT FROM dual UNION ALL
SELECT 123456 EMPID, DATE '2020-04-01' RECORD_DATE, 'FINANCE' DEPARTMENT FROM dual UNION ALL
SELECT 987654 EMPID, DATE '2020-01-01' RECORD_DATE, 'HR' DEPARTMENT FROM dual UNION ALL
SELECT 987654 EMPID, DATE '2020-02-01' RECORD_DATE, 'HR' DEPARTMENT FROM dual UNION ALL
SELECT 987654 EMPID, DATE '2020-03-01' RECORD_DATE, 'HR' DEPARTMENT FROM dual UNION ALL
SELECT 987654 EMPID, DATE '2020-04-01' RECORD_DATE, 'LEGAL' DEPARTMENT FROM dual
)
SELECT * FROM(
SELECT
EMPID,
RECORD_DATE MOVEMENT_DATE,
LAG(DEPARTMENT) OVER (PARTITION BY EMPID ORDER BY RECORD_DATE) DEPARTMENT_BEFORE,
DEPARTMENT DEPARTMENT_AFTER
FROM data
)
WHERE DEPARTMENT_BEFORE <> DEPARTMENT_AFTER;
EMPID MOVEMENT_DATE DEPARTMENT_BEFORE DEPARTMENT_AFTER
---------- --------------- ----------------- -----------------
123456 2020-03-01 HR FINANCE
987654 2020-04-01 HR LEGAL
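The query above reports every department change. Since the question asks only for moves out of HR, the outer filter can be narrowed to that case. A rough sketch of that variation, run through Python's sqlite3 module (SQLite 3.25+ assumed, with dates stored as ISO text rather than Oracle DATE values); moves_out_of_hr is a name made up for the example.

```python
import sqlite3


def moves_out_of_hr(records):
    """List rows where an employee's previous department was HR and the new one is not."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (empid INTEGER, record_date TEXT, department TEXT)")
    conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", records)
    rows = conn.execute("""
        SELECT empid, movement_date, dept_before, dept_after
        FROM (SELECT empid,
                     record_date AS movement_date,
                     LAG(department) OVER (PARTITION BY empid
                                           ORDER BY record_date) AS dept_before,
                     department AS dept_after
              FROM emp)
        WHERE dept_before = 'HR'      -- previous record was in HR
          AND dept_after <> 'HR'      -- current record is somewhere else
        ORDER BY empid, movement_date
    """).fetchall()
    conn.close()
    return rows


sample = [
    (123456, "2020-01-01", "HR"),
    (123456, "2020-02-01", "HR"),
    (123456, "2020-03-01", "FINANCE"),
    (123456, "2020-04-01", "FINANCE"),
    (987654, "2020-01-01", "HR"),
    (987654, "2020-02-01", "HR"),
    (987654, "2020-03-01", "HR"),
    (987654, "2020-04-01", "LEGAL"),
]
for row in moves_out_of_hr(sample):
    print(row)
```

With the question's data both filters give the same two rows; they differ only when someone moves between two non-HR departments.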

How to do a partitioned outer join in BigQuery

I would like to implement the partitioned outer join in BigQuery. To give a concrete example, I'd like to achieve the partitioned outer join as the accepted answer here: https://dba.stackexchange.com/questions/227069/what-is-a-partitioned-outer-join
I understand there are a lot of discussions about this topic, but I can't make it work under BigQuery. I added partition by date after the left table following the same syntax in the answer as follows:
select * from (
select '2019-01-17' as date, 'London' as location, 11 as qty
union all
select '2019-01-15' as date, 'London' as location, 10 as qty
union all
select '2019-01-16' as date, 'Paris' as location, 20 as qty
union all
select '2019-01-17' as date, 'Boston' as location, 31 as qty
union all
select '2019-01-16' as date, 'Boston' as location, 30 as qty
) as sales partition by (date)
right join
(
select 'London' as location
union all
select 'Paris' as location
union all
select 'Boston' as location
)
as loc
using (location)
The target result I'm looking for is:
date qty location
15-JAN-19 NULL Boston
15-JAN-19 10 London
15-JAN-19 NULL Paris
16-JAN-19 30 Boston
16-JAN-19 NULL London
16-JAN-19 20 Paris
17-JAN-19 31 Boston
17-JAN-19 11 London
17-JAN-19 NULL Paris
But I got the following error: Syntax error: Unexpected keyword PARTITION at [11:12]
How can I implement it in BigQuery?
Below is for BigQuery Standard SQL
#standardSQL
SELECT `date`, qty, location
FROM (SELECT DISTINCT `date` FROM sales)
CROSS JOIN loc
LEFT JOIN sales
USING (`date`, location)
You can test and play with the above using the sample data from your question, as in the example below:
#standardSQL
WITH sales AS (
SELECT '2019-01-17' AS `date`, 'London' AS location, 11 AS qty UNION ALL
SELECT '2019-01-15', 'London', 10 UNION ALL
SELECT '2019-01-16', 'Paris', 20 UNION ALL
SELECT '2019-01-17', 'Boston', 31 UNION ALL
SELECT '2019-01-16', 'Boston', 30
), loc AS (
SELECT 'London' AS location UNION ALL
SELECT 'Paris' UNION ALL
SELECT 'Boston'
)
SELECT `date`, qty, location
FROM (SELECT DISTINCT `date` FROM sales)
CROSS JOIN loc
LEFT JOIN sales
USING (`date`, location)
-- ORDER BY `date`, location
with the result below:
Row date qty location
1 2019-01-15 null Boston
2 2019-01-15 10 London
3 2019-01-15 null Paris
4 2019-01-16 30 Boston
5 2019-01-16 null London
6 2019-01-16 20 Paris
7 2019-01-17 31 Boston
8 2019-01-17 11 London
9 2019-01-17 null Paris
If you need the dates in 15-JAN-19 format, use the query below:
#standardSQL
SELECT FORMAT_DATE('%d-%b-%y', CAST(`date` AS DATE)) AS `date`, qty, location
FROM (SELECT DISTINCT `date` FROM sales)
CROSS JOIN loc
LEFT JOIN sales
USING (`date`, location)
so the result will be:
Row date qty location
1 15-Jan-19 null Boston
2 15-Jan-19 10 London
3 15-Jan-19 null Paris
4 16-Jan-19 30 Boston
5 16-Jan-19 null London
6 16-Jan-19 20 Paris
7 17-Jan-19 31 Boston
8 17-Jan-19 11 London
9 17-Jan-19 null Paris

How to retrieve the highest and lowest salary from the following table?

I have employee table
EMP_ID | F_NAME | L_NAME | SALARY | JOINING_DATE | DEPARTMENT
-----------------------------------------------------------------------------------
101 | John | Abraham | 100000 | 01-JAN-14 09.15.00.000000 AM | Banking
102 | Michel | Clarke | 800000 | | Insaurance
102 | Roy | Thomas | 70000 | 01-FEB-13 12.30.00.000000 PM | Banking
103 | Tom | Jose | 600000 | 03-FEB-14 01.30.00.000000 AM | Insaurance
105 | Jerry | Pinto | 650000 | 01-FEB-13 12.00.00.000000 PM | Services
106 | Philip | Mathew | 750000 | 01-JAN-13 02.00.00.000000 AM | Services
107 | TestName1 | 123 | 650000 | 01-JAN-13 12.05.00.000000 PM | Services
108 | TestName2 | Lname% | 600000 | 01-JAN-13 12.00.00.000000 PM | Insaurance
I want to find the highest and lowest salary from the above table in Oracle SQL.
If I do
select max(salary) from (select * from (select salary from employee) where rownum <2);
it returns MAX(SALARY) = 100000 where it should return 800000
If I do
select max(salary)
from (select * from (select salary from employee)
where rownum <3);
it returns MAX(SALARY) = 800000
If I do
select min(salary)
from (select * from(select salary from employee)
where rownum < 2);
it will return MIN(SALARY) = 100000 where it should return 70000.
What is wrong with these queries, and what would the correct query be?
You don't need all these subqueries:
SELECT MAX(salary), MIN(salary)
FROM employee
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE employee ( EMP_ID, F_NAME, L_NAME, SALARY, JOINING_DATE, DEPARTMENT ) AS
SELECT 101, 'John', 'Abraham', 100000, TIMESTAMP '2014-01-01 09:15:00', 'Banking' FROM DUAL
UNION ALL SELECT 102, 'Michel', 'Clarke', 800000, NULL, 'Insurance' FROM DUAL
UNION ALL SELECT 102, 'Roy', 'Thomas', 70000, TIMESTAMP '2013-02-01 12:30:00', 'Banking' FROM DUAL
UNION ALL SELECT 103, 'Tom', 'Jose', 600000, TIMESTAMP '2014-02-03 01:30:00', 'Insurance' FROM DUAL
UNION ALL SELECT 105, 'Jerry', 'Pinto', 650000, TIMESTAMP '2013-02-01 12:00:00', 'Services' FROM DUAL
UNION ALL SELECT 106, 'Philip', 'Mathew', 750000, TIMESTAMP '2013-01-01 02:00:00', 'Services' FROM DUAL
UNION ALL SELECT 107, 'TestName1', '123', 650000, TIMESTAMP '2013-01-01 12:05:00', 'Services' FROM DUAL
UNION ALL SELECT 108, 'TestName2', 'Lname%', 600000, TIMESTAMP '2013-01-01 12:00:00', 'Insurance' FROM DUAL;
Query 1 - To find the highest-n salaries:
SELECT *
FROM (
SELECT salary
FROM employee
ORDER BY salary DESC
)
WHERE rownum <= 3 -- replace with the number of salaries you want to retrieve.
Results:
| SALARY |
|--------|
| 800000 |
| 750000 |
| 650000 |
Query 2 - To find the lowest-n salaries:
SELECT *
FROM (
SELECT salary
FROM employee
ORDER BY salary ASC
)
WHERE rownum <= 3 -- replace with the number of salaries you want to retrieve.
Results:
| SALARY |
|--------|
| 70000 |
| 100000 |
| 600000 |
For each row returned by a query, the ROWNUM pseudocolumn returns a number indicating the order in which Oracle selects the row from a table or set of joined rows. The first row selected has a ROWNUM of 1, the second has 2, and so on.
So in your case :
select max(salary) from (select * from (select salary from employee) where rownum <2);
This query will return
101 John Abraham 100000 01-JAN-14 09.15.00.000000 AM Banking
only this row as output... and hence the max value will be 100000 only.
select max(salary) from (select * from (select salary from employee) where rownum <3);
This query will take the first 2 rows from your table, i.e.,
101 John Abraham 100000 01-JAN-14 09.15.00.000000 AM Banking
102 Michel Clarke 800000 Insaurance
and hence the max salary will be 800000.
Similarly,
select min(salary)from (select * from(select salary from employee)where rownum<2);
will only select the 1st row
101 John Abraham 100000 01-JAN-14 09.15.00.000000 AM Banking
so the min salary will be 100000.
P.S. : You could simply write your queries like this :
select max(salary) from employee where rownum<[n];
where n is the ROWNUM bound that limits the number of rows your query considers
Try it:
select *
from (
select T.*, rownum RRN
from (
select salary
from employee
order by salary desc) T)
where RRN < 3
So you want the 2nd highest and 2nd lowest salary? Check this out:
select max(salary), min(salary) from employee
where salary < (select max(salary) from employee)
and salary > (select min(salary) from employee)
;
1) For lowest salary.
select * from (
select empno,job,ename,sal
from emp order by sal)
where rownum=1;
2) For Highest salary.
select * from (
select empno,job,ename,sal
from emp order by sal desc)
where rownum=1;
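I don't see why the queries need to be this complicated; it is tempting to simply write the following. Be aware, though, that Oracle assigns ROWNUM before ORDER BY is applied, so this sorts the first three rows fetched rather than returning the three highest salaries: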
select salary
from employees
where rownum <=3
order by salary desc;
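A note on why the subquery form is still needed: ROWNUM is assigned as rows are selected, so a ROWNUM filter in the same query block as ORDER BY is applied before the sort. The small Python model below is only a simulation of that evaluation order, not Oracle itself. It assumes the rows are fetched in the order they are listed in the question, which Oracle does not guarantee, and the function names are invented for the example.

```python
# Salaries in the order the rows appear in the question's table
# (assumed fetch order; Oracle does not guarantee it).
salaries = [100000, 800000, 70000, 600000, 650000, 750000, 650000, 600000]


def rownum_then_order(rows, n):
    """Model of: SELECT ... WHERE rownum <= n ORDER BY salary DESC.

    The ROWNUM filter keeps the first n rows fetched, and only those are sorted.
    """
    return sorted(rows[:n], reverse=True)


def order_then_rownum(rows, n):
    """Model of: SELECT * FROM (SELECT ... ORDER BY salary DESC) WHERE rownum <= n.

    The inner query sorts everything first, then ROWNUM keeps the top n.
    """
    return sorted(rows, reverse=True)[:n]


print(rownum_then_order(salaries, 3))  # [800000, 100000, 70000]
print(order_then_rownum(salaries, 3))  # [800000, 750000, 650000]
```

Only the second form matches the results shown for the highest-n query earlier in this thread.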
You can solve your problem with the following queries:
Highest salary:
Select * from (Select salary from Employee ORDER BY salary DESC) where rownum=1;
Lowest salary:
Select * from (Select salary from Employee ORDER BY salary) where rownum=1;
Second highest salary:
Select MAX(Salary) from Employee
Where Salary < (Select MAX(Salary) from employee);
Second Lowest salary :
Select MIN(Salary) from Employee
Where Salary > (Select MIN(Salary) from employee);