I want to optimize this query, but using only indexes, hints, clusters, PCTFREE and PCTUSED. Thanks.
WITH
A AS (SELECT SSN from contracts where (end_date is null or end_date>sysdate)),
B AS (SELECT SSN,start_date, NVL(end_date,sysdate) finish,
(NVL(end_date,sysdate)-start_date) length
FROM CONTRACTS NATURAL JOIN A)
SELECT SSN
FROM B
GROUP BY SSN HAVING (Max(finish)-MIN(start_date)) > SUM(length)
You should be able to get rid of the join by using an analytic query:
SELECT SSN
FROM (
SELECT SSN,
start_date,
NVL( end_date, SYSDATE ) finish,
COUNT( CASE WHEN end_date IS NULL OR end_date > SYSDATE THEN 1 END )
OVER ( PARTITION BY SSN ) AS has_invalid_end_date
FROM contracts
)
WHERE has_invalid_end_date > 0
GROUP BY SSN
HAVING MAX( finish ) - MIN( start_date ) > SUM( finish - start_date );
I think you could just rewrite this as:
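with b as (select ssn,
start_date,
end_date,
nvl(end_date, sysdate) finish_date,
nvl(end_date, sysdate) - start_date duration
from contracts)
select ssn
from b
-- note: this filters individual contract rows, not whole SSNs,
-- so closed contracts of an SSN with an open contract are ignored
where end_date is null
or end_date > sysdate
group by ssn
having max(finish_date) - min(start_date) > sum(duration);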
You might also benefit from having an index on (ssn, start_date, end_date).
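To try the analytic rewrite and the suggested index on a small data set, here is a minimal sketch using Python's built-in sqlite3 module instead of Oracle. The sample rows, the index name contracts_ssn_dates and the fixed TODAY value (standing in for SYSDATE) are all assumptions made for illustration, and julianday() replaces Oracle's date subtraction. It needs SQLite 3.25 or later for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contracts (ssn TEXT, start_date TEXT, end_date TEXT);
    -- hypothetical index name; columns follow the suggestion above
    CREATE INDEX contracts_ssn_dates ON contracts (ssn, start_date, end_date);
    INSERT INTO contracts VALUES
        ('111', '2020-01-01', '2020-03-01'),  -- closed contract
        ('111', '2020-06-01', NULL),          -- open contract after a gap
        ('222', '2020-01-01', '2020-03-01'),
        ('222', '2020-03-01', NULL),          -- open contract, no gap
        ('333', '2019-01-01', '2019-02-01'),  -- gap, but no active contract
        ('333', '2019-05-01', '2019-06-01');
""")

TODAY = "2021-01-01"  # assumption: fixed stand-in for SYSDATE

query = """
SELECT ssn
FROM (
    SELECT ssn,
           julianday(start_date) AS start_day,
           julianday(COALESCE(end_date, :today)) AS finish_day,
           COUNT(CASE WHEN end_date IS NULL OR end_date > :today THEN 1 END)
               OVER (PARTITION BY ssn) AS active_contracts
    FROM contracts
)
WHERE active_contracts > 0
GROUP BY ssn
HAVING MAX(finish_day) - MIN(start_day) > SUM(finish_day - start_day)
ORDER BY ssn
"""
# SSNs with an active contract whose contracts do not cover the whole span
gap_ssns = [row[0] for row in conn.execute(query, {"today": TODAY})]
print(gap_ssns)
```

Only SSN 111 qualifies here: 222 has no gap between its contracts, and 333 has a gap but no active contract.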
I have an EMPLOYEE table in SQL Server Database with the below columns and data
I want to merge the 1st , 2nd and 3rd records (ORDERED BY START_DATE) as they are just extensions and produce the following output as
As you can see, I have merged the first 3 records, taking the START_DATE from the 1st row and the END_DATE from the 3rd row.
I need a SQL query to create this output, which will merge consecutive records (time based) for an employee_id if their employee_types are the same.
This is a gaps-and-islands problem, but with date ranges. The most general solution is to assume that there might be gaps between the rows (although your data does not have this).
You can solve this by finding out where the "periods of constancy" begin. In this case, lag() is your friend. Once you know where they begin, a cumulative sum identifies the groups and aggregation solves the problem:
select employee_id, employee_type,
min(start_date), max(end_date)
from (select e.*,
sum(case when start_date = dateadd(day, 1, prev_end_date) then 0 else 1 end) over
(partition by employee_id, employee_type order by start_date) as grp
from (select e.*,
lag(end_date) over (partition by employee_id, employee_type order by start_date) as prev_end_date
from employee e
) e
) e
group by employee_id, employee_type, grp;
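The lag-plus-cumulative-sum idea can be checked on a toy table. This is only a sketch run through Python's sqlite3 module (SQLite 3.25 or later), so date(prev_end_date, '+1 day') stands in for SQL Server's dateadd(); the sample rows are invented, and it assumes that an extension starts exactly one day after the previous row ends.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (employee_id INTEGER, employee_type TEXT,
                           start_date TEXT, end_date TEXT);
    -- hypothetical data: three contiguous 'A' rows, one 'B' row, a later 'A' row
    INSERT INTO employee VALUES
        (1, 'A', '2017-01-01', '2017-03-31'),
        (1, 'A', '2017-04-01', '2017-06-30'),
        (1, 'A', '2017-07-01', '2017-12-31'),
        (1, 'B', '2018-01-01', '2018-06-30'),
        (1, 'A', '2018-07-01', '2018-12-31');
""")

query = """
SELECT employee_id, employee_type, MIN(start_date), MAX(end_date)
FROM (
    SELECT e.*,
           -- 1 marks the first row of a new island; the running sum numbers the islands
           SUM(CASE WHEN start_date = date(prev_end_date, '+1 day') THEN 0 ELSE 1 END)
               OVER (PARTITION BY employee_id, employee_type
                     ORDER BY start_date) AS grp
    FROM (
        SELECT e.*,
               LAG(end_date) OVER (PARTITION BY employee_id, employee_type
                                   ORDER BY start_date) AS prev_end_date
        FROM employee e
    ) e
) e
GROUP BY employee_id, employee_type, grp
ORDER BY MIN(start_date)
"""
islands = conn.execute(query).fetchall()
for row in islands:
    print(row)
```

The first three 'A' rows collapse into one period, while the later 'A' row stays separate because it does not start the day after the previous 'A' row ends.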
This should help, although you should really tag the target database and avoid tagging at random.
For SQL Server & MySQL
SELECT
employee_id
, employee_type
, MIN(start_date) start_date
, MAX(end_date) end_date
FROM
EMPLOYEE
GROUP BY
employee_id
, employee_type
, YEAR(end_date)
ORDER BY start_date
and for Oracle
SELECT
employee_id
, employee_type
, MIN(start_date) start_date
, MAX(end_date) end_date
FROM
EMPLOYEE
GROUP BY
employee_id
, employee_type
, extract(year from end_date)
ORDER BY start_date
The following query helped to solve the problem
SELECT employee_id,
employee_type,
MIN(start_date) ,
MAX(end_date)
FROM (SELECT *,
DENSE_RANK() OVER (PARTITION BY employee_id ORDER BY start_date) AS rnk_all,
DENSE_RANK() OVER (PARTITION BY employee_id, employee_type ORDER BY start_date) AS rnk_type,
DENSE_RANK() OVER (PARTITION BY employee_id ORDER BY start_date) -
DENSE_RANK() OVER (PARTITION BY employee_id, employee_type ORDER BY start_date) AS Grp
FROM employee) T
GROUP BY employee_id,
employee_type,
Grp
ORDER BY 3 asc
I have a query that is ordered by date. I have simplified it a bit, but it is basically:
select * from
(select start_date, to_char(end_date,'YYYY-mm-dd') as end_date from date_table
order by start_date ,end_date )
where start_date is null or end_date is null
It shows perfect order,
but when I add
union all
select start_date, 'single missing day' as end_date from
calendar_dates
where db_date>'2017-12-12' and db_date<'2018-05-13'
Then the whole order is messed up. Why did that happen? UNION or UNION ALL should just append the dataset from the second query to that of the first, right? It should not mess up the order of the first query, right?
I know this query doesn't make any sense, but I have simplified it to
show the syntax.
You can't predict the resulting order by assuming that UNION ALL appends the queries in the order you write them.
The query planner will execute your queries in whatever order it sees fit. That's why you have the ORDER BY clause. Use it!
For example, if you want the rows of the first query to come before those of the second, do:
select * from
(select 1, start_date, to_char(end_date,'YYYY-mm-dd') as end_date from date_table
order by start_date ,end_date )
where start_date is null or end_date is null
union all
select 2, start_date, 'single missing day' as end_date from
calendar_dates
where db_date>'2017-12-12' and db_date<'2018-05-13'
ORDER BY 1
You are mistaken. This query:
select d.*
from (select start_date, to_char(end_date,'YYYY-mm-dd') as end_date
from date_table
order by start_date, end_date
) d
where start_date is null or end_date is null
does not "show perfect order". It might just happen to produce the ordering that you want, but that is a coincidence. The only way to get results in a particular order is to use ORDER BY in the outermost SELECT. Period.
So, if you want results in a particular order, then use order by:
select d.*
from ((select d.start_date, to_char(end_date, 'YYYY-mm-dd') as end_date, 1 as ord
from date_table d
where d.start_date is null or d.end_date is null
) union all
(select cd.start_date, 'single missing day' as end_date, 2 as ord
from calendar_dates cd
where cd.db_date > '2017-12-12' and cd.db_date < '2018-05-13'
)
) d
order by ord, start_date;
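The effect of the extra ord column is easy to verify on toy data. Below is a small sketch using Python's sqlite3 module; the table contents are made up, and because the simplified question does not show the columns of calendar_dates, db_date is used as the start date of the second branch (an assumption).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE date_table (start_date TEXT, end_date TEXT);
    CREATE TABLE calendar_dates (db_date TEXT);
    -- hypothetical rows, deliberately inserted out of order
    INSERT INTO date_table VALUES
        ('2018-03-01', NULL), ('2018-01-15', NULL), ('2018-02-01', '2018-02-10');
    INSERT INTO calendar_dates VALUES
        ('2018-04-02'), ('2017-12-20'), ('2019-01-01');
""")

query = """
SELECT start_date, end_date
FROM (
    SELECT start_date, end_date, 1 AS ord      -- first branch sorts first
    FROM date_table
    WHERE start_date IS NULL OR end_date IS NULL
    UNION ALL
    SELECT db_date, 'single missing day', 2    -- second branch sorts second
    FROM calendar_dates
    WHERE db_date > '2017-12-12' AND db_date < '2018-05-13'
) d
ORDER BY ord, start_date                        -- the only ordering that counts
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The rows from date_table come out first, sorted by start_date, followed by the calendar rows, regardless of how either table happens to be stored.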
UNION or UNION ALL does not preserve the order of the first SELECT. As a workaround, wrap the union in a subquery and apply the ordering in the outer SELECT, as below:
SELECT * FROM
(
select colA, colB
From TableA
-- ORDER BY colA, colB --
UNION ALL
select colC, colD
FROM TableB
-- ORDER BY colC, colD --
) tb
ORDER BY colA, colB
Fields:
Student_ID, Department, Start_Date
ex:
1,A, 2017-01-1
1,B, 2017-07-1
1,C, 2017-12-1
Expected Output:
Student_ID, Department, Start_Date, End_Date
ex:
1,A, 2017-01-1, 2017-07-01
1,B, 2017-07-1,2017-12-01
1,C, 2017-12-1, ...
End_Date is the start Date of the next record for the student ID
Give a row_number based on student_id and order by start_date. And use a join.
Query
with cte as(
select row_number() over(
partition by Student_ID
order by Start_Date asc
) as rn, *
from your_table_name
)
select t1.Student_ID, t1.Department, t1.Start_Date,
t2.Start_Date as end_date
from cte t1
left join cte t2
on t1.rn = t2.rn - 1
and t1.Student_Id = t2.Student_Id;
SELECT student_id, department, start_date,
LEAD(start_date, 1) OVER (PARTITION BY student_id ORDER BY start_date) AS "End_Date"
FROM Your_Table
You can use the LEAD() function, as above.
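As a quick check of the LEAD() approach on the sample rows from the question, here is a sketch using Python's sqlite3 module (SQLite 3.25 or later). The table name student_dept and the extra row for student 2 are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical table name; columns follow the question
    CREATE TABLE student_dept (student_id INTEGER, department TEXT, start_date TEXT);
    INSERT INTO student_dept VALUES
        (1, 'A', '2017-01-01'),
        (1, 'B', '2017-07-01'),
        (1, 'C', '2017-12-01'),
        (2, 'A', '2017-03-01');
""")

query = """
SELECT student_id, department, start_date,
       -- start date of the student's next record, NULL for the last one
       LEAD(start_date) OVER (PARTITION BY student_id
                              ORDER BY start_date) AS end_date
FROM student_dept
ORDER BY student_id, start_date
"""
periods = conn.execute(query).fetchall()
for row in periods:
    print(row)
```

The last record of each student gets a NULL end date, which matches the open-ended "..." in the expected output.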
I have written the following select to get the previous different grade value from the jobs table.
This works well, but is it possible to simplify the code so that it doesn't have 3 levels?
select value_1
from ( select distinct
value_1,
date_from,
date_to,
emp_id,
(select o.value_1
from jobs o
where o.emp_id=w.emp_id
and (
(o.date_to >= sysdate and o.date_from <= sysdate) or
(o.date_from <= sysdate and o.date_to is null)
)
) current_grade
from jobs w
where w.emp_id = t.emp_id
order by date_from desc
)
where value_1 != current_grade
and date_from <= sysdate
and rownum=1
and t.emp_id=123
order by date_from desc,
value_1,
emp_id
What is it supposed to do? I want to select the previous different grade value from the jobs table. This table stores the positions of each employee; each row has date_from and date_to, and value_1 holds the grade symbol. What matters to me is selecting the previous different grade value, which could have changed as far back as 3 positions ago.
I don't think you can get away from a three-level query in this instance, but it can be simplified. As I noted in my comment, the ORDER BY in the outer query is superfluous, and you would actually get incorrect results if the ORDER BY in the second query was not there. Oracle's rownum does not work like other databases' Top-N queries -- rownum is calculated before order by, so using rownum= with an ORDER BY will not necessarily return the highest row.
This should produce the desired result, and is slightly more compact:
SELECT
value_1
FROM
(
SELECT
value_1
FROM
jobs w
WHERE
date_from <= sysdate
and emp_id=123
and value_1 != (SELECT value_1
FROM jobs o
WHERE o.emp_id = w.emp_id
AND (o.date_to >= sysdate and o.date_from <= sysdate
OR o.date_from <= sysdate and o.date_to is null))
ORDER BY date_from desc
)
WHERE
rownum = 1
You can do it with a single table hit by getting the value_1 of the row with the latest date_to in the past.
select value_1 from jobs where date_to < sysdate and emp_id = 123
If you need the latest job role, order by date_to descending and take the first row.
I want to get the employee numbers and start dates of all the employees with start dates equal to the earliest date.
I know this is wrong. But just writing to show what I want.
SELECT start_date, employee_no
FROM [employees]
WHERE (start_date = MIN(start_date))
You were close!
SELECT start_date, employee_no
FROM [employees]
WHERE start_date = (SELECT MIN(start_date) FROM employees)
I'd use a Common Table Expression (CTE) to do this. It's like creating a temp table.
;with EmpInfo as
(
SELECT start_date, employee_no, MIN(start_date) OVER () as MinStartDate
FROM [employees]
)
SELECT start_date, employee_no FROM EmpInfo WHERE start_date = MinStartDate
Here's a Microsoft web page about Using Common Table Expressions
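To compare the scalar-subquery form with the CTE form, here is a sketch run through Python's sqlite3 module rather than SQL Server (SQLite 3.25 or later for the window function). The employee rows are invented, and two of them share the earliest start date on purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_no INTEGER, start_date TEXT);
    -- hypothetical rows; employees 2 and 3 share the earliest date
    INSERT INTO employees VALUES
        (1, '2015-06-01'), (2, '2014-02-10'), (3, '2014-02-10'), (4, '2016-09-30');
""")

# Variant 1: compare against a scalar subquery.
subquery_sql = """
SELECT start_date, employee_no
FROM employees
WHERE start_date = (SELECT MIN(start_date) FROM employees)
ORDER BY employee_no
"""

# Variant 2: compute the minimum once per row with a window function in a CTE.
window_sql = """
WITH emp_info AS (
    SELECT start_date, employee_no,
           MIN(start_date) OVER () AS min_start_date
    FROM employees
)
SELECT start_date, employee_no
FROM emp_info
WHERE start_date = min_start_date
ORDER BY employee_no
"""

via_subquery = conn.execute(subquery_sql).fetchall()
via_window = conn.execute(window_sql).fetchall()
print(via_subquery)
print(via_window)
```

Both variants return every employee tied for the earliest start date, not just one of them.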
This will do it:
SELECT start_date, employee_no
FROM [employees]
WHERE start_date = (select MIN(start_date) from [employees])