Get The Employee who has taken Minimum days of leave - sql

In a recent interview I was asked to write a query that finds,
for each department, the employee with the fewest days of leave in the previous 3 months.
The table structure was
EMPID | TO_DATE | FROM_DATE | LEAVE_TYPE | DEPT_NO
I was able to write the query
SELECT
min(days) FROM (SELECT id ,(sum((TO_DATE-FROM_DATE)+1) ) days ,dept
FROM emp_leave
WHERE to_date between ADD_MONTHS(sysdate,-3) AND sysdate group by id,dept)
group by dept ;
but when I try to select the emp_id as well, I have to add it to the GROUP BY clause.
That is where I got stuck.

I think the query should have been something like
select dept, min(id) keep (dense_rank first order by days)
from ( SELECT id ,
(sum((TO_DATE-FROM_DATE)+1) ) days ,
dept
FROM emp_leave
WHERE to_date between ADD_MONTHS(sysdate,-3)
AND sysdate group by id,dept)
group by dept
;
Of course, SQL offers many different ways to do this, but for ranking problems the FIRST/LAST functions are very useful.

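Try this. Note that LIMIT is MySQL/PostgreSQL syntax (on Oracle 12c or later, use FETCH FIRST 1 ROW ONLY instead), and that the query returns only the single lowest row overall, not one row per department: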
SELECT id
,(sum((TO_DATE - FROM_DATE) + 1)) days
,dept
FROM emp_leave
WHERE to_date BETWEEN ADD_MONTHS(sysdate, - 3)
AND sysdate
GROUP BY id
,dept
ORDER BY days LIMIT 1
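
For comparison, here is a minimal sketch of the "one employee per department" ranking, run against SQLite from Python so that it can be tried without an Oracle instance. The sample rows and the fixed reference date are made up, and RANK() is swapped in for Oracle's KEEP (DENSE_RANK FIRST ...); it needs SQLite 3.25 or later for window functions.

```python
import sqlite3

# Hypothetical sample data: table and column names follow the question,
# but the rows and the reference date below are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp_leave (id INTEGER, from_date TEXT, to_date TEXT, dept INTEGER)"
)
conn.executemany(
    "INSERT INTO emp_leave VALUES (?, ?, ?, ?)",
    [
        (1, "2024-01-10", "2024-01-12", 10),  # 3 days
        (1, "2024-02-01", "2024-02-01", 10),  # 1 day, so 4 in total
        (2, "2024-02-05", "2024-02-06", 10),  # 2 days
        (3, "2024-01-20", "2024-01-24", 20),  # 5 days
        (4, "2024-03-01", "2024-03-01", 20),  # 1 day
        (5, "2023-06-01", "2023-06-30", 20),  # outside the 3-month window
    ],
)

QUERY = """
WITH totals AS (
    -- total leave days per employee within the last 3 months
    SELECT id, dept,
           SUM(julianday(to_date) - julianday(from_date) + 1) AS days
    FROM emp_leave
    WHERE to_date BETWEEN date(:today, '-3 months') AND :today
    GROUP BY id, dept
),
ranked AS (
    -- rank employees inside each department, fewest days first
    SELECT id, dept, days,
           RANK() OVER (PARTITION BY dept ORDER BY days) AS rnk
    FROM totals
)
SELECT dept, id, CAST(days AS INTEGER) AS days
FROM ranked
WHERE rnk = 1
ORDER BY dept, id
"""

rows = conn.execute(QUERY, {"today": "2024-03-15"}).fetchall()
print(rows)  # [(10, 2, 2), (20, 4, 1)]
```

Because RANK() keeps ties, two employees with the same minimum would both be returned. Employees with no leave rows at all never appear, since they are absent from emp_leave.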

Related

How to find the number of occurences within a date range?

Let's say I have hospital visits in the table TestData
I would like to know which patients have had a second hospital visit within 7 days of their first hospital visit.
How would I code this in SQL?
patient_id is stored as TEXT; date_visit is also TEXT, in the format MM/DD/YYYY.
patient_id | date_visit
A123B29133 | 07/12/2011
A123B29133 | 07/14/2011
A123B29133 | 07/20/2011
A123B29134 | 12/05/2016
In the above table, patient A123B29133 fulfills the condition because they were seen on 07/14/2011, which is less than 7 days after 07/12/2011.
You can use a subquery with exists:
with to_d(id, v_date) as (
select patient_id, substr(date_visit, 7, 4)||'-'||substr(date_visit, 1, 2)||'-'||substr(date_visit, 4, 2) from visits
)
select t2.id from (select t1.id, min(t1.v_date) d1 from to_d t1 group by t1.id) t2
where exists (select 1 from to_d t3 where t3.id = t2.id and t3.v_date != t2.d1 and t3.v_date <= date(t2.d1, '+7 days'))
id
A123B29133
Since your date column is not in YYYY-MM-DD which is the default value used by several sqlite date functions, the substr function was used to transform your date in this format. JulianDay was then used to convert your dates to an integer value which would ease the comparison of 7 days. The MIN window function was used to identify the first hospital visit date for that patient. The demo fiddle and samples show the query that was used to transform the data and the results before the final query which filters based on your requirements i.e. < 7 days. With this approach using window functions, you may also retrieve the visit_date and the number of days since the first visit date if desired.
You may read more about sqlite date functions here.
Query #1
SELECT
patient_id,
visit_date,
JulianDay(visit_date) -
MIN(JulianDay(visit_date)) OVER (PARTITION BY patient_id)
as num_of_days_since_first_visit
FROM
(
SELECT
*,
(
substr(date_visit,7) || '-' ||
substr(date_visit,1,2) || '-' ||
substr(date_visit,4,2)
) as visit_date
FROM
visits
) v;
patient_id | visit_date | num_of_days_since_first_visit
A123B29133 | 2011-07-12 | 0
A123B29133 | 2011-07-14 | 2
A123B29133 | 2011-07-20 | 8
A123B29134 | 2016-12-05 | 0
Query #2
The below is your desired query, which uses the previous query as a CTE and applies the filter for visits less than 7 days. num_of_days <> 0 is applied to remove entries where the first date is also the date of the record.
WITH num_of_days_since_first_visit AS (
SELECT
patient_id,
visit_date,
JulianDay(visit_date) - MIN(JulianDay(visit_date)) OVER (PARTITION BY patient_id) num_of_days
FROM
(
SELECT
*,
(
substr(date_visit,7) || '-' ||
substr(date_visit,1,2) || '-' ||
substr(date_visit,4,2)
) as visit_date
FROM
visits
) v
)
SELECT DISTINCT
patient_id
FROM
num_of_days_since_first_visit
WHERE
num_of_days <> 0 AND num_of_days < 7;
patient_id
A123B29133
Let me know if this works for you.
I would like to know which patients have had a second hospital visit within 7 days of their first hospital visit.
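You can use lag(). The following gets all rows where a visit falls within 7 days of the same patient's previous visit (it assumes date_visit is stored as YYYY-MM-DD, so that date() and the ordering work):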
select t.*
from (select t.*,
lag(date_visit) over (partition by patient_id order by date_visit) as prev_date_visit
from t
) t
where prev_date_visit >= date(date_visit, '-7 day');
If you just want the patient_ids, you can use select distinct patient_id.
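
To make the approaches above easy to try, here is a small self-contained sketch that loads the sample rows from the question into an in-memory SQLite database and compares every visit with the patient's first visit. The table name visits follows the answers; treating "within 7 days" as <= 7 is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (patient_id TEXT, date_visit TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [
        ("A123B29133", "07/12/2011"),
        ("A123B29133", "07/14/2011"),
        ("A123B29133", "07/20/2011"),
        ("A123B29134", "12/05/2016"),
    ],
)

QUERY = """
WITH iso AS (
    -- MM/DD/YYYY -> YYYY-MM-DD so that SQLite's date functions can parse it
    SELECT patient_id,
           substr(date_visit, 7, 4) || '-' ||
           substr(date_visit, 1, 2) || '-' ||
           substr(date_visit, 4, 2) AS visit_date
    FROM visits
),
first_visit AS (
    SELECT patient_id, MIN(visit_date) AS first_date
    FROM iso
    GROUP BY patient_id
)
SELECT DISTINCT i.patient_id
FROM iso AS i
JOIN first_visit AS f ON f.patient_id = i.patient_id
WHERE i.visit_date > f.first_date                                -- a later visit
  AND julianday(i.visit_date) - julianday(f.first_date) <= 7     -- assumed cutoff
ORDER BY i.patient_id
"""

patients = [row[0] for row in conn.execute(QUERY)]
print(patients)  # ['A123B29133']
```

A second visit on the same calendar day as the first is not counted by this sketch, because the comparison requires a strictly later date.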

What is a better alternative to a "helper" table in an Oracle database?

Let's say I have an 'employees' table with employee start and end dates, like so:
employees
employee_id start_date end_date
53 '19901117' '99991231'
54 '19910208' '20010512'
55 '19910415' '20120130'
. . .
. . .
. . .
And let's say I want to get the monthly count of employees who were employed at the end of the month. So the resulting data set I'm after would look like:
month count of employees
'20150131' 120
'20150228' 118
'20150331' 122
. .
. .
. .
The best way I currently know how to do this is to create a "helper" table to join onto, such as:
helper_tbl
month
'20150131'
'20150228'
'20150331'
.
.
.
And then do a query like so:
SELECT t0b.month,
count(t0a.employee_id)
FROM employees t0a
JOIN helper_tbl t0b
ON t0b.month BETWEEN t0a.start_date AND t0a.end_date
GROUP BY t0b.month
However, this is a somewhat annoying solution to me, because it means I have to create these little helper tables all the time and they clutter up my schema. I feel like other people must run into the same need for "helper" tables, but I'm guessing people have figured out a better way to go about this that isn't so manual. Or do you all really just keep creating "helper" tables like I do to get around these situations?
I understand this question is a bit open-ended for Stack Overflow, so let me offer a more closed-ended version of the question: "Given just the 'employees' table, what would YOU do to get the resulting data set that I showed above?"
You can use a CTE to generate all the month values, either from a fixed starting point or based on the earliest date in your table:
with months (month) as (
select add_months(first_month, level - 1)
from (
select trunc(min(start_date), 'MM') as first_month from employees
)
connect by level <= ceil(months_between(sysdate, first_month))
)
select * from months;
With data that has an earliest start date of 1990-11-17, as in your example, that generates 333 rows:
MONTH
-------------------
1990-11-01 00:00:00
1990-12-01 00:00:00
1991-01-01 00:00:00
1991-02-01 00:00:00
1991-03-01 00:00:00
...
2018-06-01 00:00:00
2018-07-01 00:00:00
You can then use that in a query that joins to your table, something like:
with months (month) as (
select add_months(first_month, level - 1)
from (
select trunc(min(start_date), 'MM') as first_month from employees
)
connect by level <= ceil(months_between(sysdate, first_month))
)
select m.month, count(e.employee_id) as employees
from months m
left join employees e
on e.start_date < add_months(m.month, 1)
and (e.end_date is null or e.end_date >= add_months(m.month, 1))
group by m.month
order by m.month;
Presumably you want to include people who are still employed, so you need to allow for the end date being null (unless you're using a magic end-date value for people who are still employed...)
With dates stored as strings it's a bit more complicated, but you can generate the month information in a similar way:
with months (month, start_date, end_date) as (
select add_months(first_month, level - 1),
to_char(add_months(first_month, level - 1), 'YYYYMMDD'),
to_char(last_day(add_months(first_month, level - 1)), 'YYYYMMDD')
from (
select trunc(min(to_date(start_date, 'YYYYMMDD')), 'MM') as first_month from employees
)
connect by level <= ceil(months_between(sysdate, first_month))
)
select m.month, m.start_date, m.end_date, count(e.employee_id) as employees
from months m
left join employees e
on e.start_date <= m.end_date
and (e.end_date is null or e.end_date > m.end_date)
group by m.month, m.start_date, m.end_date
order by m.month;
Very lightly tested with a small amount of made-up data and both seem to work.
If you want to get the employees who were employed at the end of the month, you can use the LAST_DAY function in the WHERE clause of your query. You can also use that function in the GROUP BY clause. Your query would look like this:
SELECT LAST_DAY(start_date), COUNT(1)
FROM employees
WHERE start_date = LAST_DAY(start_date)
GROUP BY LAST_DAY(start_date)
or if you just want to count employees employed per month then use below query,
SELECT LAST_DAY(start_date), COUNT(1)
FROM employees
GROUP BY LAST_DAY(start_date)
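
The "generate the months instead of storing them" idea can also be tried outside Oracle. The following is a minimal sketch using SQLite from Python, with a recursive CTE standing in for CONNECT BY LEVEL. The three employees are the sample rows from the question; the month range passed as parameters is arbitrary, and the join reuses the question's own BETWEEN condition on YYYYMMDD strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER, start_date TEXT, end_date TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        (53, "19901117", "99991231"),
        (54, "19910208", "20010512"),
        (55, "19910415", "20120130"),
    ],
)

QUERY = """
WITH RECURSIVE months(first_day) AS (
    -- one row per month, from :first_month to :last_month inclusive
    SELECT :first_month
    UNION ALL
    SELECT date(first_day, '+1 month')
    FROM months
    WHERE first_day < :last_month
),
month_ends AS (
    -- last day of each month, formatted like the stored strings
    SELECT strftime('%Y%m%d', first_day, '+1 month', '-1 day') AS month_end
    FROM months
)
SELECT m.month_end, COUNT(e.employee_id) AS employees
FROM month_ends AS m
LEFT JOIN employees AS e
       ON m.month_end BETWEEN e.start_date AND e.end_date
GROUP BY m.month_end
ORDER BY m.month_end
"""

rows = conn.execute(
    QUERY, {"first_month": "2011-12-01", "last_month": "2012-03-01"}
).fetchall()
print(rows)
# [('20111231', 2), ('20120131', 1), ('20120229', 1), ('20120331', 1)]
```

Counting e.employee_id rather than * keeps months with no matching employees at 0 instead of 1, because the LEFT JOIN still produces one row for such months.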

sql db2 group by first day of each year

I have the following:
id date_start date_end
----- ---------- --------
1a3 2001-12-12 2002-12-12
23b 2005-01-24 2008-11-02
11ad 2012-01-15 2014-13-09
19d 2015-01-23 2016-02-04
And I want to get the count of each person where the date range includes the first day of each year.
for example I can do:
select count(distinct id) from table
where '2001-01-01' between date_start and date_end
but I want to produce the count for all years from 2000-2015. I want to avoid manually doing:
select count(distinct id) from table
where '2001-01-01' between date_start and date_end
select count(distinct id) from table
where '2002-01-01' between date_start and date_end
select count(distinct id) from table
where '2003-01-01' between date_start and date_end
I am just having trouble visualizing the group by clause for this. If I had just the year I could do:
select count(distinct id), year from table
group by year
however I cannot fit the where '2001-01-01' between date_start and date_end into this group clause.
can anyone help?
Thanks!
You can use left join. Here is a method:
select y.yyyy, count(distinct t.id)
from ((select '2001-01-01' as yyyy from sysibm.sysdummy1) union all
(select '2002-01-01' as yyyy from sysibm.sysdummy1) union all
(select '2003-01-01' as yyyy from sysibm.sysdummy1)
) y left join
table t
on y.yyyy between date_start and date_end
group by y.yyyy
order by y.yyyy;
Well, you only have to check and count whether the year of date_start is different from the year of date_end.
select count(*)
from table
where year(date_start) < year(date_end)
I hope you don't mind me answering my own question. Using Michael's logic, I was able to combine this with a dummy table, an idea from Gordon's answer. Here is what I did:
with dummy(yr) as (
select 2000 from SYSIBM.SYSDUMMY1
union all
select yr + 1 from dummy where yr < 2015
)
select d.yr, count(distinct t.id)
from table t, dummy d
where d.yr between year(t.date_start) + 1 and year(t.date_end)
group by d.yr
order by d.yr
As I say, I hope the community does not mind me taking parts of both posters' answers to get the solution I was looking for. I did not want hardcoding, and I think the answer I posted satisfies this while using a simple piece of date arithmetic.
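
As a cross-check of the generated-years approach, here is a small sketch run on SQLite from Python rather than DB2. It builds the years 2000 to 2015 with a recursive CTE and tests each 1 January directly against the stored range. The table name ranges is a placeholder, and the row with the malformed end date (2014-13-09) from the question is left out instead of being guessed at.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "ranges" is a placeholder name; the question only calls it "table".
conn.execute("CREATE TABLE ranges (id TEXT, date_start TEXT, date_end TEXT)")
conn.executemany(
    "INSERT INTO ranges VALUES (?, ?, ?)",
    [
        ("1a3", "2001-12-12", "2002-12-12"),
        ("23b", "2005-01-24", "2008-11-02"),
        ("19d", "2015-01-23", "2016-02-04"),
    ],
)

QUERY = """
WITH RECURSIVE years(yr) AS (
    SELECT 2000
    UNION ALL
    SELECT yr + 1 FROM years WHERE yr < 2015
)
SELECT y.yr, COUNT(DISTINCT r.id) AS ids
FROM years AS y
LEFT JOIN ranges AS r
       -- ISO date strings compare correctly as plain text
       ON printf('%04d-01-01', y.yr) BETWEEN r.date_start AND r.date_end
GROUP BY y.yr
ORDER BY y.yr
"""

counts = dict(conn.execute(QUERY).fetchall())
nonzero = {yr: n for yr, n in counts.items() if n}
print(len(counts), nonzero)  # 16 {2002: 1, 2006: 1, 2007: 1, 2008: 1}
```

The LEFT JOIN keeps years with no matching ids in the result with a count of 0, which the comma-join version above would drop.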

Postgresql nested aggregate functions

I want to find the employees who have taken the maximum number of leaves in the current month.
I started with this query:
select MAX(TotalLeaves) as HighestLeaves
FROM (SELECT emp_id, count(adate) as TotalLeaves
from attendance
group by emp_id) AS HIGHEST;
But I am having trouble displaying the employee id and restricting the result to the current month. Please help me out.
If you just want to show the corresponding emp_id in your current query, you can sort the results and take the top row. You also need to filter the data before grouping so that only the current month is counted:
select
emp_id, TotalLeaves
from (
select emp_id, count(adate) as TotalLeaves
from attendance
where adate >= date_trunc('month', current_date)
group by emp_id
) as highest
order by TotalLeaves desc
limit 1;
Actually, you don't need to use subquery at all here:
select emp_id, count(adate) as TotalLeaves
from attendance
where adate >= date_trunc('month', current_date)
group by emp_id
order by TotalLeaves desc
limit 1;
SELECT emp_id, count(adate) as TotalLeaves
from attendance
where adate >= date_trunc('month', NOW())
group by emp_id
order by 2 desc limit 1
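
Note that LIMIT 1 returns a single employee even when several share the maximum. If ties matter, one option is to compare each total with the maximum. The sketch below shows that idea on SQLite from Python; the attendance rows are invented, and the month start is passed as a parameter instead of being derived from the current date so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attendance (emp_id INTEGER, adate TEXT)")
# Hypothetical rows: employees 1 and 2 tie with two leave days in March 2024.
conn.executemany(
    "INSERT INTO attendance VALUES (?, ?)",
    [
        (1, "2024-03-04"), (1, "2024-03-05"),
        (2, "2024-03-11"), (2, "2024-03-12"),
        (3, "2024-03-20"),
        (3, "2024-02-01"), (3, "2024-02-02"), (3, "2024-02-05"),  # previous month
    ],
)

QUERY = """
WITH totals AS (
    SELECT emp_id, COUNT(adate) AS total_leaves
    FROM attendance
    WHERE adate >= :month_start
      AND adate < date(:month_start, '+1 month')
    GROUP BY emp_id
)
SELECT emp_id, total_leaves
FROM totals
WHERE total_leaves = (SELECT MAX(total_leaves) FROM totals)
ORDER BY emp_id
"""

top = conn.execute(QUERY, {"month_start": "2024-03-01"}).fetchall()
print(top)  # [(1, 2), (2, 2)]
```

The same WITH ... WHERE total_leaves = (SELECT MAX(...)) shape should carry over to PostgreSQL, with date_trunc('month', current_date) as the lower bound.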

Sql query to find date period between multiple rows

I have a table with three columns (City_Code | Start_Date | End_Date).
Suppose I have the following data:
New_York|01/01/1985|01/01/1987
Paris|02/01/1987|01/01/1990
San Francisco|02/01/1990|04/01/1990
Paris|05/01/1990|08/01/1990
New_York|09/01/1990|11/01/1990
New_York|12/01/1990|19/01/1990
New_York|20/01/1990|28/01/1990
I would like to get the date period for which someone lived in their last city of residence, using only SQL. In this example that is New_York (09/01/1990-28/01/1990). I can get this period by manipulating the data in Java, but is it possible to do it with plain SQL?
Thanks in advance
You can grab the first and last date of residence by city using this:
SELECT TOP 1 City_Code, MIN(Start_Date), Max(End_Date)
FROM Table
GROUP BY City_Code
ORDER BY Max(End_Date) desc
but, the problem is that the start date will be the first date of residence in the city in question.
For 10g you don't have the option of SELECT TOP n so you must be a little creative.
WITH last_period
AS
(SELECT city, moved_in, moved_out, NVL(moved_in-LEAD(moved_out, 1) OVER (ORDER BY city), 0) AS lead
FROM periods
WHERE city = (SELECT city FROM periods WHERE moved_out = (SELECT MAX(moved_out) FROM periods)))
SELECT city, MIN(moved_in) AS moved_in, MAX(moved_out) AS moved_out
FROM last_period
WHERE lead >= 0
GROUP BY city;
This works for the example dataset that you have given. It could stand some optimisation for a large dataset but gives you a working example, tested on Oracle 10g.
If it's MySQL, you can easily use
TIME_TO_SEC(TIMEDIFF(end_date, start_date)) AS `diff_in_secs`
Once you have the time difference in seconds, you can take it from there.
On SQL Server, couldn't you use:
SELECT TOP 1 City_Code,
CONVERT(varchar(10), Start_Date, 103) + '-' + CONVERT(varchar(10), End_Date, 103) AS Period
FROM MyTable
ORDER BY End_Date DESC
That would get the date period and city with the latest end date.
This is assuming you are trying to just find the city where the person most recently lived, formatted with a dash.
Given that this is Oracle, you can simply subtract the end date and start date to get the number of days in between.
Select City_Code, (End_Date - Start_Date) Days
From MyTable
Where Start_Date = (
Select Max( T1.Start_Date )
From MyTable T1
)
If you are using SQL Server you can use the DateDiff() function
DATEDIFF ( datepart , startdate , enddate )
http://msdn.microsoft.com/en-us/library/ms189794.aspx
EDIT
I don't know Oracle but I did find this article
SELECT
MAX(t.duration)
FROM (
SELECT
(End_Date - Start_Date) duration
From
Table
) t
I hope this will work.
If you want to calculate only the last period length for the last city of residence, then it's probably something like this:
SELECT TOP 1
City_Code,
End_Date - Start_Date AS Days
FROM atable
ORDER BY Start_Date DESC
But if you mean to include all the periods the person has ever lived in a city that happens to be their last city of residence, then it's a bit more complicated, but not too much:
SELECT TOP 1
City_Code,
SUM(End_Date - Start_Date) AS Days
FROM atable
GROUP BY City_Code
ORDER BY MAX(Start_Date) DESC
But the above solution most probably returns the last city information only after it calculates the data for all cities. Do we need that? Not necessarily, so maybe we should use another approach. Maybe like this:
SELECT
City_Code,
SUM(End_Date - Start_Date) AS Days
FROM atable
WHERE City_Code = (SELECT TOP 1 City_Code FROM atable ORDER BY Start_Date DESC)
GROUP BY City_Code
I'm short on time, but this feels like a case for the window function LAG: compare each row to the previous row, start a new range when the city changes, and keep the existing start date when the city stays the same. That should correctly preserve the range.
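
Here is a rough sketch of that LAG idea, run on SQLite (3.25 or later) from Python. The rows are the sample data from the question with the dd/mm/yyyy dates rewritten as ISO strings; the table name periods and the helper columns is_new_stay and stay_no are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE periods (city TEXT, start_date TEXT, end_date TEXT)")
conn.executemany(
    "INSERT INTO periods VALUES (?, ?, ?)",
    [
        ("New_York", "1985-01-01", "1987-01-01"),
        ("Paris", "1987-01-02", "1990-01-01"),
        ("San Francisco", "1990-01-02", "1990-01-04"),
        ("Paris", "1990-01-05", "1990-01-08"),
        ("New_York", "1990-01-09", "1990-01-11"),
        ("New_York", "1990-01-12", "1990-01-19"),
        ("New_York", "1990-01-20", "1990-01-28"),
    ],
)

QUERY = """
WITH flagged AS (
    -- 1 when the city differs from the previous row (or there is none)
    SELECT city, start_date, end_date,
           CASE WHEN city = LAG(city) OVER (ORDER BY start_date)
                THEN 0 ELSE 1 END AS is_new_stay
    FROM periods
),
stays AS (
    -- running total of the flags numbers each uninterrupted stay
    SELECT city, start_date, end_date,
           SUM(is_new_stay) OVER (ORDER BY start_date) AS stay_no
    FROM flagged
)
SELECT city, MIN(start_date) AS moved_in, MAX(end_date) AS moved_out
FROM stays
GROUP BY stay_no, city
ORDER BY moved_out DESC
LIMIT 1
"""

last_stay = conn.execute(QUERY).fetchone()
print(last_stay)  # ('New_York', '1990-01-09', '1990-01-28')
```

Unlike grouping by city alone, this keeps the earlier New_York stay (1985 to 1987) separate from the final one, which is the distinction the question is after.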