How to simplify this oracle sql query?


I have written the following select to get the previous different grade value from the jobs table.
This works well, but is it possible to simplify the code so that it doesn't have 3 levels?
select value_1
from ( select distinct
              value_1,
              date_from,
              date_to,
              emp_id,
              (select o.value_1
               from jobs o
               where o.emp_id = w.emp_id
                 and (
                      (o.date_to >= sysdate and o.date_from <= sysdate) or
                      (o.date_from <= sysdate and o.date_to is null)
                     )
              ) current_grade
       from jobs w
       where w.emp_id = t.emp_id
       order by date_from desc
     )
where value_1 != current_grade
  and date_from <= sysdate
  and rownum = 1
  and t.emp_id = 123
order by date_from desc,
         value_1,
         emp_id
What is it supposed to do? I want to select the previous different grade value from the jobs table. This table stores the positions of each employee; each row has date_from and date_to, and value_1 holds the grade symbol. What matters to me is selecting the previous different grade value, which could have changed 3 positions earlier.

I don't think you can get away from a three-level query in this instance, but it can be simplified. As I noted in my comment, the ORDER BY in the outer query is superfluous, and you would actually get incorrect results if the ORDER BY in the second query was not there. Oracle's rownum does not work like other databases' Top-N queries: rownum is assigned before ORDER BY is applied, so combining rownum = 1 with an ORDER BY at the same query level will not necessarily return the highest row.
This should produce the desired result, and is slightly more compact:
SELECT
    value_1
FROM
    (
        SELECT
            value_1
        FROM
            jobs w
        WHERE
            date_from <= sysdate
            and emp_id = 123
            and value_1 != (SELECT value_1
                            FROM jobs o
                            WHERE o.emp_id = w.emp_id
                              AND (o.date_to >= sysdate and o.date_from <= sysdate
                                   OR o.date_from <= sysdate and o.date_to is null))
        ORDER BY date_from desc
    )
WHERE
    rownum = 1
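To try the "sort first, then keep one row" idea without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up, SQLite's ORDER BY ... LIMIT 1 stands in for the ordered inline view plus rownum = 1, and a fixed reference date replaces sysdate so the result is reproducible. None of this is Oracle syntax; it only illustrates the logic.

```python
import sqlite3

# Hypothetical sample data: employee 123 moved A -> C -> B and then stayed on B.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (emp_id INTEGER, value_1 TEXT, date_from TEXT, date_to TEXT)"
)
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?, ?)",
    [
        (123, "A", "2010-01-01", "2012-12-31"),
        (123, "C", "2013-01-01", "2015-12-31"),
        (123, "B", "2016-01-01", "2018-12-31"),
        (123, "B", "2019-01-01", "2021-12-31"),
        (123, "B", "2022-01-01", None),  # open-ended row = current position
    ],
)

# Rows are sorted before one row is kept. In Oracle the same effect needs the
# ORDER BY inside an inline view and rownum = 1 in the enclosing query.
query = """
SELECT w.value_1
FROM jobs w
WHERE w.emp_id = :emp_id
  AND w.date_from <= :today
  AND w.value_1 != (SELECT o.value_1
                    FROM jobs o
                    WHERE o.emp_id = w.emp_id
                      AND o.date_from <= :today
                      AND (o.date_to >= :today OR o.date_to IS NULL))
ORDER BY w.date_from DESC
LIMIT 1
"""

# ISO-formatted date strings compare correctly as text, so no date type is needed.
params = {"emp_id": 123, "today": "2024-06-01"}
previous_grade = conn.execute(query, params).fetchone()[0]
print(previous_grade)
```

With this data the current grade is B, so the two older B rows are skipped and the most recent row with a different grade is returned.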

You can do it with a single table hit by taking the value_1 of the row with the latest date_to in the past.
select value_1 from jobs where date_to < sysdate and emp_id = 123
If you need the latest job role, order by date_to desc and take the first row.

Related

Grouping results by a set of dates in Redshift with two tables

I am trying to count the number of observations I have in an employee database. The tables look more or less like this:
Date_Table
date_dt
2020-09-07
2020-09-14
2020-09-21

Employee_table
login_id  effective_date  is_active
a         2020-09-07      1
a         2020-09-14      1
b         2020-09-07      1
b         2020-09-14      0
c         2020-09-21      1
Keep in mind that effective_date marks some change (attrition, position change, whatever; those are easily filtered), and the row with the latest effective_date reflects the current status.
In the example above, 2020-09-14 is the day login_id b stopped being active in the table.
I want to reflect something like this:
the_date    amount_of_employees
2020-09-07  2
2020-09-14  1
2020-09-21  2
This query works perfectly fine, and provides me the correct number:
SELECT '2020-09-07',COUNT(DISTINCT login_id) amount_of_employees
FROM (SELECT date_dt FROM Date_Table) AS dd,(SELECT *,
ROW_NUMBER() OVER (PARTITION BY login_id ORDER BY effective_date DESC) AS chk
FROM Employee_table WHERE effective_date <= '2020-09-07' ) AS dp
WHERE
dp.is_active =1
AND
dp.chk=1
GROUP BY 1
ORDER BY 1 ASC;
Great! This one works and gives me the right value:
the_date    amount_of_employees
2020-09-07  2
However, when I try to build my dataset with this query:
SELECT dd.date_dt ,COUNT(DISTINCT login_id) amount_of_employees
FROM (SELECT date_dt FROM Date_Table) AS dd,(SELECT *,
ROW_NUMBER() OVER (PARTITION BY login_id ORDER BY effective_date DESC) AS chk
FROM Employee_table WHERE effective_date <= dd.date_dt ) AS dp
WHERE
dp.is_active =1
AND
dp.chk=1
GROUP BY 1
ORDER BY 1 ASC;
I get this error message:
Invalid operation: subquery in FROM may not refer to other relations of same query level
I tried to investigate something like this:
https://w3coded.com/questions/672056/error-subquery-in-from-cannot-refer-to-other-relations-of-same-query-level
but it didn't work, or it doesn't necessarily apply. Maybe I am not getting it.
Any ideas? I would rather not write a lot of unions, although that is a workaround.
Thanks in advance.

I'm not familiar with Amazon Redshift, but as long as your query syntax is supported, you can use a scalar subquery to get the count; inside it you can refer to the columns of the outer query, like this:
SELECT
    dt.date_dt,
    (
        SELECT COUNT(DISTINCT login_id)
        FROM (
            SELECT
                *,
                ROW_NUMBER() OVER (PARTITION BY login_id ORDER BY effective_date DESC) AS rn
            FROM employee_table et
            WHERE et.effective_date <= dt.date_dt
            ORDER BY effective_date DESC
        ) t
        WHERE rn = 1 AND is_active = 1
    ) amount
FROM date_table dt
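If the correlated reference turns out not to be accepted, a different technique is to join the two tables first and rank the employee rows per (date, login) pair, so that no subquery has to look outside itself. The sketch below is not the query from this answer; it is a join-plus-ROW_NUMBER variant, run here through Python's sqlite3 module (assuming a bundled SQLite of 3.25 or later for window functions) on the sample rows from the question. Whether Redshift plans it efficiently is something you would have to check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE date_table (date_dt TEXT);
CREATE TABLE employee_table (login_id TEXT, effective_date TEXT, is_active INTEGER);
INSERT INTO date_table VALUES ('2020-09-07'), ('2020-09-14'), ('2020-09-21');
INSERT INTO employee_table VALUES
    ('a', '2020-09-07', 1), ('a', '2020-09-14', 1),
    ('b', '2020-09-07', 1), ('b', '2020-09-14', 0),
    ('c', '2020-09-21', 1);
""")

# For every date, pair it with all employee rows effective on or before it,
# then keep only the most recent row per (date, login) and count the active ones.
query = """
SELECT date_dt, COUNT(*) AS amount_of_employees
FROM (
    SELECT d.date_dt, e.login_id, e.is_active,
           ROW_NUMBER() OVER (PARTITION BY d.date_dt, e.login_id
                              ORDER BY e.effective_date DESC) AS rn
    FROM date_table d
    JOIN employee_table e ON e.effective_date <= d.date_dt
) latest
WHERE rn = 1 AND is_active = 1
GROUP BY date_dt
ORDER BY date_dt
"""
counts = conn.execute(query).fetchall()
for row in counts:
    print(row)
```

One caveat of this variant: a date on which nobody is active produces no row at all rather than a 0, so a LEFT JOIN back to the date table would be needed if zeros matter.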

This is a solution for this:
SELECT dt.date_dt, COUNT(DISTINCT login_id) other_account
FROM Date_Table dt
LEFT JOIN employee_table et ON dt.date_dt BETWEEN et.effective_date AND et.effective_date + (some additional interval)
WHERE et.is_active = 1 (And other where clauses)
GROUP BY 1
Thanks for all your support

How can I group rows in SQL based on a condition

I am using Redshift SQL and would like to group users who have overlapping voucher periods into a single row (showing the minimum start date and the maximum end date).
For example, if I have these records,
I would like to achieve this result using Redshift.
The explanation is that since row 1 and row 2 have overlapping dates, I would like to combine them and get the min(Start_date) and max(End_Date).
I do not really know where to start. I tried using row_number to partition them, but it does not seem to work well. This is what I tried:
select
id,
start_date,
end_date,
lag(end_date, 1) over (partition by id order by start_date) as prev_end_date,
row_number() over (partition by id, (case when prev_end_date >= start_date then 1 else 0) order by start_date) as rn
from users
Are there any suggestions out there? Thank you kind sirs.

This is a type of gaps-and-islands problem. Because the dates are arbitrary, let me suggest the following approach:
Use a cumulative max to get the maximum end_date over the rows before the current one.
Use logic to determine when there is no overlap (i.e. a new period starts).
A cumulative sum of the starts provides an identifier for the group.
Then aggregate.
As SQL:
select id, min(start_date), max(end_date)
from (select u.*,
             sum(case when prev_end_date >= start_date then 0 else 1
                 end) over (partition by id
                            order by start_date, voucher_code
                            rows between unbounded preceding and current row
                           ) as grp
      from (select u.*,
                   max(end_date) over (partition by id
                                       order by start_date, voucher_code
                                       rows between unbounded preceding and 1 preceding
                                      ) as prev_end_date
            from users u
           ) u
     ) u
group by id, grp;
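To see the three steps (running max, new-period flag, running sum) on concrete rows, here is a small sketch that runs essentially the same query through Python's sqlite3 module. The sample rows and the voucher_code values are invented, since the records from the question are not available, and SQLite (3.25 or later, for window functions) is only standing in for Redshift.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER, voucher_code TEXT, start_date TEXT, end_date TEXT);
INSERT INTO users VALUES
    (1, 'v1', '2021-01-01', '2021-01-10'),  -- overlaps with v2
    (1, 'v2', '2021-01-05', '2021-01-20'),
    (1, 'v3', '2021-02-01', '2021-02-10'),  -- starts after a gap
    (2, 'v4', '2021-03-01', '2021-03-05');
""")

query = """
SELECT id, MIN(start_date) AS period_start, MAX(end_date) AS period_end
FROM (
    -- Step 2 and 3: flag rows that start a new period, then number the periods.
    SELECT p.*,
           SUM(CASE WHEN prev_end_date >= start_date THEN 0 ELSE 1 END)
               OVER (PARTITION BY id ORDER BY start_date, voucher_code
                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS grp
    FROM (
        -- Step 1: latest end_date among all earlier rows of the same id.
        SELECT u.*,
               MAX(end_date)
                   OVER (PARTITION BY id ORDER BY start_date, voucher_code
                         ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS prev_end_date
        FROM users u
    ) p
) g
GROUP BY id, grp
ORDER BY id, period_start
"""
merged = conn.execute(query).fetchall()
for row in merged:
    print(row)
```

The first row of each id has no earlier rows, so prev_end_date is NULL, the comparison is not true, and the CASE falls through to 1; that is what opens the first group.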

Another approach would be using recursive CTE:
Divide all rows into numbered partitions grouped by id and ordered by start_date and end_date
Iterate over them calculating group_start_date for each row (rows which have to be merged in final result would have the same group_start_date)
Finally you need to group the CTE by id and group_start_date taking max end_date from each group.
Here is corresponding sqlfiddle: http://sqlfiddle.com/#!18/7059b/2
And the SQL, just in case:
WITH cteSequencing AS (
    -- Get Values Order
    SELECT *, start_date AS group_start_date,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY start_date, end_date) AS iSequence
    FROM users),
Recursion AS (
    -- Anchor - the first value in groups
    SELECT *
    FROM cteSequencing
    WHERE iSequence = 1
    UNION ALL
    -- Remaining items
    SELECT b.id, b.start_date, b.end_date,
           CASE WHEN a.end_date > b.start_date THEN a.group_start_date
                ELSE b.start_date
           END
           AS groupStartDate,
           b.iSequence
    FROM Recursion AS a
    INNER JOIN cteSequencing AS b ON a.iSequence + 1 = b.iSequence AND a.id = b.id)
SELECT id, group_start_date as start_date, MAX(end_date) as end_date
FROM Recursion
GROUP BY id, group_start_date
ORDER BY id, group_start_date

Oracle SQL LAG() function results in duplicate rows

I have a very simple query that results in two rows:
SELECT DISTINCT
id,
trunc(start_date) start_date
FROM example.table
WHERE ID = 1
This results in the following rows:
id start_date
1 7/1/2012
1 9/1/2016
I want to add a column that simply shows the previous date for each row. So I'm using the following:
SELECT DISTINCT id,
Trunc(start_date) start_date,
Lag(start_date, 1)
over (
ORDER BY start_date) pdate
FROM example.table
WHERE id = 1
However, when I do this, I get four rows instead of two:
id start_date pdate
1 7/1/2012 NULL
1 7/1/2012 7/1/2012
1 9/1/2016 7/1/2012
1 9/1/2016 9/1/2016
If I change the offset to 2 or 3 the results remain the same. If I change the offset to 0, I get two rows again but of course now the start_date == pdate.
I can't figure out what's going on

Use an explicit GROUP BY instead:
SELECT id, trunc(start_date) as start_date,
LAG(trunc(start_date)) OVER (PARTITION BY id ORDER BY trunc(start_date))
FROM example.table
WHERE ID = 1
GROUP BY id, trunc(start_date)

The reason is the order of evaluation in a SQL statement: LAG runs before DISTINCT, so it sees every underlying row.
You actually want to run LAG after DISTINCT, so the query should be:
WITH t1 AS (
    SELECT DISTINCT id, trunc(start_date) start_date
    FROM example.table
    WHERE ID = 1
)
SELECT t1.*, LAG(start_date, 1) OVER (ORDER BY start_date) pdate
FROM t1
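The effect is easy to reproduce on a toy table. The sketch below uses Python's sqlite3 module (SQLite 3.25 or later) with four invented timestamps that fall on two calendar days; date() plays the role of Oracle's trunc(), and the table is named example_table because the schema-qualified name from the question is not needed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE example_table (id INTEGER, start_date TEXT);
INSERT INTO example_table VALUES
    (1, '2012-07-01 08:00:00'), (1, '2012-07-01 17:00:00'),
    (1, '2016-09-01 09:00:00'), (1, '2016-09-01 18:00:00');
""")

# LAG is computed over the four underlying rows first; DISTINCT then sees
# four different (id, start_day, pdate) combinations and removes nothing.
naive = conn.execute("""
SELECT DISTINCT id, date(start_date) AS start_day,
       LAG(start_date) OVER (ORDER BY start_date) AS pdate
FROM example_table
WHERE id = 1
""").fetchall()

# De-duplicate first, then apply LAG to the two remaining rows.
fixed = conn.execute("""
WITH t1 AS (
    SELECT DISTINCT id, date(start_date) AS start_day
    FROM example_table
    WHERE id = 1
)
SELECT id, start_day, LAG(start_day) OVER (ORDER BY start_day) AS pdate
FROM t1
ORDER BY start_day
""").fetchall()

print(len(naive), "rows without the CTE")
print(fixed)
```

The GROUP BY version from the other answer gets the same two rows, because grouping also happens before window functions are evaluated.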

How to replace null with 0 in conditional selection

I have 100 supervisors in my list, and I would like to count how many employees were under their supervision at the beginning of 01/01/2018.
This is the code I tried. However, supervisors who have no employees simply disappear from the result. I want to keep their names and show the number of employees as 0 if they don't have any.
select
Supervisor,
IFNULL(COUNT(EmpID),0) AS start_headcount
from
`T1`
where
(Last_hire_date < date'2018-01-01'
AND
term_date >= date'2018-01-01' )OR
( Last_hire_date < date'2018-01-01'
AND
term_date is null)
group by
1
order by
1 asc
In the result, only the 92 supervisors who have employees under them appear. The other 8 supervisors, who have no employees, are gone. I cannot figure out how to improve it.
Can anyone help with this?

Below is for BigQuery Standard SQL
#standardSQL
SELECT
Supervisor,
COUNTIF(
(Last_hire_date < DATE '2018-01-01' AND term_date >= DATE '2018-01-01' )
OR
(Last_hire_date < DATE'2018-01-01' AND term_date IS NULL)
) AS start_headcount
FROM
`T1`
GROUP BY
1
ORDER BY
1 ASC
The problem in the original query is that the filtering happens in the WHERE clause, which excludes non-matching rows entirely; as a result, supervisors with no matching rows are not shown.
So, instead, I moved that condition into COUNTIF(), replacing the IFNULL(COUNT()) construct.
If your data is stored such that you need to take DISTINCT into account (the query above does not count distinct employee IDs as the headcount), the query below addresses that case:
#standardSQL
SELECT
Supervisor,
COUNT(DISTINCT
IF(
(Last_hire_date < DATE '2018-01-01' AND term_date >= DATE '2018-01-01' )
OR
(Last_hire_date < DATE'2018-01-01' AND term_date IS NULL),
EmpID,
NULL
)
) AS start_headcount
FROM
`T1`
GROUP BY
1
ORDER BY
1 ASC
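The difference between filtering in WHERE and counting conditionally can be checked on a few rows. The sketch below uses Python's sqlite3 module; SQLite has no COUNTIF, so SUM(CASE WHEN ... THEN 1 ELSE 0 END) is used as the equivalent, and the four rows and supervisor names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T1 (Supervisor TEXT, EmpID INTEGER, Last_hire_date TEXT, term_date TEXT);
INSERT INTO T1 VALUES
    ('Ann', 1, '2017-05-01', NULL),          -- active on 2018-01-01
    ('Ann', 2, '2016-02-01', '2019-03-31'),  -- active on 2018-01-01
    ('Ann', 3, '2015-01-01', '2017-06-30'),  -- left before 2018-01-01
    ('Bob', 4, '2018-03-01', NULL);          -- hired after 2018-01-01
""")

# Same rule as in the question, with the two OR branches folded together.
condition = """
    Last_hire_date < '2018-01-01'
    AND (term_date >= '2018-01-01' OR term_date IS NULL)
"""

# Filtering in WHERE removes every row of Bob before grouping happens.
filtered = conn.execute(f"""
SELECT Supervisor, COUNT(EmpID)
FROM T1
WHERE {condition}
GROUP BY Supervisor
ORDER BY Supervisor
""").fetchall()

# Conditional aggregation keeps all rows and only counts the matching ones.
conditional = conn.execute(f"""
SELECT Supervisor, SUM(CASE WHEN {condition} THEN 1 ELSE 0 END)
FROM T1
GROUP BY Supervisor
ORDER BY Supervisor
""").fetchall()

print(filtered)
print(conditional)
```

Note that this only helps for supervisors who have at least one row in the table; a supervisor who never appears in T1 at all would still need a separate list to join against.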

get date range between dates

I have the following table tbl in the database, and a dynamic joining date such as 1-1-2012. I want to find out whether this date falls between (Fall and Spring), (Spring and Summer), or (Summer and Fall). I want a query in Oracle to which I pass only the joining date and which returns the Semestertime and joiningDate.
Semestertime joiningDate
Fall 10-13-2011
Spring 2-1-2012
Summer 6-11-2012
Fall 10-1-2015

If I understand your question correctly:
SELECT *
FROM your_table
WHERE joiningDate between to_date (your_lower_limit_date_here, 'mm-dd-yyyy')
AND to_date (your_upper_limit_date_here, 'mm-dd-yyyy');

What about something like this:
select 'BEFORE' term,
       t."Semestertime", to_char(t."joiningDate", 'MM-DD-YYYY')
from (
    -- sort inside the inline view and filter on rownum outside it:
    -- within one query level rownum is assigned before ORDER BY
    select tbl.* from tbl where tbl."joiningDate" < to_date('1-1-2012','MM-DD-YYYY')
    --                                                      ^^^^^^^^^^
    --                                                      your reference date
    order by tbl."joiningDate" desc) t
where rownum = 1
union all
select 'AFTER' term,
       t."Semestertime", to_char(t."joiningDate", 'MM-DD-YYYY')
from (
    select tbl.* from tbl where tbl."joiningDate" > to_date('1-1-2012','MM-DD-YYYY')
    --                                                      ^^^^^^^^^^
    --                                                      your reference date
    order by tbl."joiningDate" asc) t
where rownum = 1
This will return the "term" before and after a given date. You will probably have to adapt the query to your specific needs, but it might be a good starting point.
For example, given your business rules, you might consider using <= instead of <. You might also need the result displayed as columns instead of rows. But all of this shouldn't be too hard to change.
As an alternate solution using CTE and sub-queries:
with testdata as (select to_date('1-1-2012','MM-DD-YYYY') refdate from dual)
select v.what, tbl.* from tbl join
(
select 'BEFORE' what, max(t1."joiningDate") d
from tbl t1
where t1."joiningDate" < to_date('1-1-2012','MM-DD-YYYY')
union all
select 'AFTER' what, min(t1."joiningDate") d
from tbl t1
where t1."joiningDate" > to_date('1-1-2012','MM-DD-YYYY')
) v
on tbl."joiningDate" = v.d
See http://sqlfiddle.com/#!4/c7fa5/15 for a live demo comparing those solutions.
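For a quick local check of the max/min variant, here is a sketch using Python's sqlite3 module with the four rows from the question. The column names are lower-cased and the dates are rewritten in ISO format (both are choices made for this sketch, not part of the original table), so plain string comparison orders them correctly, and the reference date is passed as a bound parameter instead of being repeated as a literal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl (semestertime TEXT, joining_date TEXT);
INSERT INTO tbl VALUES
    ('Fall',   '2011-10-13'),
    ('Spring', '2012-02-01'),
    ('Summer', '2012-06-11'),
    ('Fall',   '2015-10-01');
""")

# Find the closest date on each side of the reference date, then join back
# to the table to pick up the semester that belongs to each of them.
query = """
SELECT v.what, t.semestertime, t.joining_date
FROM tbl t
JOIN (
    SELECT 'BEFORE' AS what, MAX(joining_date) AS d
    FROM tbl WHERE joining_date < :ref
    UNION ALL
    SELECT 'AFTER' AS what, MIN(joining_date) AS d
    FROM tbl WHERE joining_date > :ref
) v ON t.joining_date = v.d
ORDER BY t.joining_date
"""
neighbours = conn.execute(query, {"ref": "2012-01-01"}).fetchall()
for row in neighbours:
    print(row)
```

If the reference date lies before the first or after the last row, the corresponding MAX or MIN is NULL, the join finds no match, and that side is simply missing from the result.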
