SQL - Query Year by Year - sql

I have a table of employees' info, including their employment start and end date. I want to extract a list of employees who have been with the company for the full year, year by year, for the past ten years.
So for example, if I want to get a list of employees who've been with the company throughout 2010, I'll do a query like this:
SELECT employee_name FROM employees
WHERE employment_start_date < DATE '2010-01-01'
AND employment_end_date > DATE '2010-12-31'
Now, I could repeat this process manually 10 times for each year from 2010 to 2020 (and manually append the relevant year as an additional column), but surely there's an easier way to do this with a single SQL query?
More background info:
I'm actually trying to translate my Cypher query directly into an SQL query (because different companies use different database systems). Using Cypher, I'd do this:
WITH [2010,2011,2012,...,2019,2020] AS years
UNWIND years as y
MATCH (e:employees)
WHERE e.employment_start_date.year < y
AND e.employment_end_date.year > y
RETURN y, e.employee_name
So I'm trying to find an SQL equivalent for this.
Sample table data:
|employee_name|employment_start_date|employment_end_date|
|:---:|:---:|:---:|
|John|2009-06-01|2015-03-02|
|Mary|2010-04-02|2014-03-07|
|Joseph|2011-03-02|2011-07-03|
|Stephen|2003-06-14|2011-03-07|
|Dew|2010-06-02|2012-02-06|
Desired Results:
|Year|employee_name|
|:---:|:---:|
|2010|John|
|2010|Stephen|
|2011|John|
|2011|Mary|
|2011|Dew|

You can use:
WITH years ( year ) AS (
SELECT DATE '2010-01-01' FROM DUAL
UNION ALL
SELECT ADD_MONTHS( year, 12 )
FROM years
WHERE year < DATE '2020-01-01'
)
SELECT y.year, e.employee_name
FROM employees e
INNER JOIN years y
ON ( e.employment_start_date <= y.year
AND e.employment_end_date >= ADD_MONTHS( y.year, 12 ) )
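To check the join logic against the sample data without an Oracle instance, here is a minimal sketch in Python using the standard-library sqlite3 module. It is a stand-in, not the Oracle query itself: SQLite has no DUAL or ADD_MONTHS, so the years come from a standard recursive CTE of integers, and the dates are stored as ISO text (which compares correctly as strings). The function name full_years is made up for illustration.

```python
import sqlite3

# Sample rows from the question.
ROWS = [
    ("John", "2009-06-01", "2015-03-02"),
    ("Mary", "2010-04-02", "2014-03-07"),
    ("Joseph", "2011-03-02", "2011-07-03"),
    ("Stephen", "2003-06-14", "2011-03-07"),
    ("Dew", "2010-06-02", "2012-02-06"),
]

# Same shape as the answer above: generate one row per year, then keep
# employees who started on or before 1 January of that year and ended
# on or after 1 January of the following year.
QUERY = """
WITH RECURSIVE years(y) AS (
    SELECT :first_year
    UNION ALL
    SELECT y + 1 FROM years WHERE y < :last_year
)
SELECT y, e.employee_name
FROM years
JOIN employees e
  ON e.employment_start_date <= printf('%04d-01-01', y)
 AND e.employment_end_date   >= printf('%04d-01-01', y + 1)
ORDER BY y, e.employee_name
"""


def full_years(first_year, last_year):
    """Return (year, employee_name) pairs for employees present all year."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE employees ("
        "employee_name TEXT, employment_start_date TEXT, employment_end_date TEXT)"
    )
    con.executemany("INSERT INTO employees VALUES (?, ?, ?)", ROWS)
    rows = con.execute(
        QUERY, {"first_year": first_year, "last_year": last_year}
    ).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for year, name in full_years(2010, 2020):
        print(year, name)
```

Because the year list is generated rather than typed out, changing the range only means changing the two parameters.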

An alternative to MT0's suggestion:
WITH years (year) AS(
SELECT EXTRACT (YEAR FROM DATE '2010-01-01') + ROWNUM -1 AS "YEAR"
FROM dual
CONNECT BY ROWNUM <=10
)
SELECT y.year, e.employee_name
FROM employees e
INNER JOIN years y
ON (
EXTRACT(YEAR FROM employment_start_date) < y.year
AND EXTRACT(YEAR FROM employment_end_date) > y.year
)

Related

Rolling 12 month filter criteria in SQL

I'm having trouble with a SQL script in which I need a rolling 12-month filter on a day column that is stored as text on the server.
The goal is to count sizes per product at each retail store location over the last 12 months from the current day. Currently my query filters on the year 2019, which only counts the sizes for that year rather than for a rolling 12 months from the current date.
The CALENDARDAY column is a text field in the data set, and its values are stored in yyyymmdd format.
When I try to run the script below in Tableau with the GETDATE and DATEADD functions, it gives me a function error. I am querying a SAP HANA server with the query below.
Any help would be appreciated.
Select
SKU, STYLE_ID, Base_Style_ID, COLOR, SIZEKEY, STORE, Year,
count(SIZEKEY)over(partition by STYLE_ID,COLOR,STORE,Year) as SZ_CNT
from
(
select
a."RAW" As SKU,
a."STYLENUM" As STYLE_ID,
mat."BASENUM" AS Base_Style_ID,
a."COLORNUM" AS COLOR,
a."SIZE" AS SIZEKEY,
a."STORENUM" AS STORE,
substring(a."CALENDARDAY",1,4) As year
from PRTRPT_XRE as a
JOIN ZAT_SKU As mat On a."RAW" = mat."SKU"
where a."ORGANIZATION" = 'M20'
and a."COLORNUM" is not null
and substring(a."CALENDARDAY",1,4) = '2019'
Group BY
a."RAW",
a."STYLENUM",
mat."BASENUM",
a."COLORNUM",
a."SIZE",
a."STORENUM",
substring(a."CALENDARDAY",1,4)
)
I have never worked with that database/server, so I have no way to test this.
But hopefully this will work (it keeps rows dated within exactly 12 months before today's date; the format mask matches the yyyymmdd text):
AND ADD_MONTHS (TO_DATE (a."CALENDARDAY", 'YYYYMMDD'), 12) > CURRENT_DATE
or
AND ADD_MONTHS (a."CALENDARDAY", 12) > CURRENT_DATE
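Since CALENDARDAY is zero-padded yyyymmdd text, another option is to compute the cutoff once and compare strings, because such values sort in the same order as the dates they represent. The sketch below shows that idea in plain Python; it is not HANA code. The helper names rolling_cutoff and in_window are made up for illustration, and clamping to the last valid day of the month (for example 29 February back to 28 February) is an assumption about the desired behaviour.

```python
import calendar
from datetime import date


def rolling_cutoff(today, months=12):
    """Return the yyyymmdd string for `months` months before `today`."""
    # Work in "months since year 0" so the year rolls over correctly.
    total = today.year * 12 + (today.month - 1) - months
    year, month = divmod(total, 12)
    month += 1
    # Clamp the day so that e.g. 29 Feb maps to 28 Feb in a non-leap year.
    day = min(today.day, calendar.monthrange(year, month)[1])
    return date(year, month, day).strftime("%Y%m%d")


def in_window(calendarday, today, months=12):
    """True if a yyyymmdd text value lies after the cutoff and not after today."""
    return rolling_cutoff(today, months) < calendarday <= today.strftime("%Y%m%d")


if __name__ == "__main__":
    today = date(2020, 3, 15)
    print(rolling_cutoff(today))
    print(in_window("20190316", today), in_window("20190315", today))
```

The strict lower bound mirrors the ADD_MONTHS(...) > CURRENT_DATE condition above: a row dated exactly 12 months ago is excluded.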
The condition below, built on one of our CALENDAR tables, also worked the same way as the ADD_MONTHS approach in the response above:
select distinct CALENDARDAY
from
(
select FISCALWEEK, CALENDARDAY, CNST, row_number()over(partition by CNST order by FISCALWEEK desc) as rnum
from
(
select distinct FISCALWEEK, CALENDARDAY, 'A' as CNST
from CALENDARTABLE
where CALENDARDAY < current_date
order by 1,2
)
) where rnum < 366

What is a better alternative to a "helper" table in an Oracle database?

Let's say I have an 'employees' table with employee start and end dates, like so:
employees
employee_id start_date end_date
53 '19901117' '99991231'
54 '19910208' '20010512'
55 '19910415' '20120130'
. . .
. . .
. . .
And let's say I want to get the monthly count of employees who were employed at the end of the month. So the resulting data set I'm after would look like:
month count of employees
'20150131' 120
'20150228' 118
'20150331' 122
. .
. .
. .
The best way I currently know how to do this is to create a "helper" table to join onto, such as:
helper_tbl
month
'20150131'
'20150228'
'20150331'
.
.
.
And then do a query like so:
SELECT t0b.month,
count(t0a.employee_id)
FROM employees t0a
JOIN helper_tbl t0b
ON t0b.month BETWEEN t0a.start_date AND t0a.end_date
GROUP BY t0b.month
However, this is a somewhat annoying solution to me, because it means I have to create these little helper tables all the time and they clutter up my schema. I feel like other people must run into the same need for "helper" tables, but I'm guessing people have figured out a better way to go about this that isn't so manual. Or do you all really just keep creating "helper" tables like I do to get around these situations?
I understand this question is a bit open-ended for Stack Overflow, so let me offer a more closed-ended version of the question, which is, "Given just the 'employees' table, what would YOU do to get the resulting data set that I showed above?"
You can use a CTE to generate all the month values, either from a fixed starting point or based on the earliest date in your table:
with months (month) as (
select add_months(first_month, level - 1)
from (
select trunc(min(start_date), 'MM') as first_month from employees
)
connect by level <= ceil(months_between(sysdate, first_month))
)
select * from months;
With data that has an earliest start date of 1990-11-17, as in your example, that generates 333 rows:
MONTH
-------------------
1990-11-01 00:00:00
1990-12-01 00:00:00
1991-01-01 00:00:00
1991-02-01 00:00:00
1991-03-01 00:00:00
...
2018-06-01 00:00:00
2018-07-01 00:00:00
You can then use that in a query that joins to your table, something like:
with months (month) as (
select add_months(first_month, level - 1)
from (
select trunc(min(start_date), 'MM') as first_month from employees
)
connect by level <= ceil(months_between(sysdate, first_month))
)
select m.month, count(e.employee_id) as employees
from months m
left join employees e
on e.start_date <= add_months(m.month, 1)
and (e.end_date is null or e.end_date >= add_months(m.month, 1))
group by m.month
order by m.month;
Presumably you want to include people who are still employed, so you need to allow for the end date being null (unless you're using a magic end-date value for people who are still employed...)
With dates stored as string it's a bit more complicated but you can generate the month information in a similar way:
with months (month, start_date, end_date) as (
select add_months(first_month, level - 1),
to_char(add_months(first_month, level - 1), 'YYYYMMDD'),
to_char(last_day(add_months(first_month, level - 1)), 'YYYYMMDD')
from (
select trunc(min(to_date(start_date, 'YYYYMMDD')), 'MM') as first_month from employees
)
connect by level <= ceil(months_between(sysdate, first_month))
)
select m.month, m.start_date, m.end_date, count(e.employee_id) as employees
from months m
left join employees e
on e.start_date <= m.end_date
and (e.end_date is null or e.end_date > m.end_date)
group by m.month, m.start_date, m.end_date
order by m.month;
Very lightly tested with a small amount of made-up data and both seem to work.
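The same "generate the months, then outer join" idea can be tried without Oracle. The sketch below uses Python's standard-library sqlite3 module with the three sample employees and yyyymmdd string dates from the question. It is an illustration under assumptions: SQLite's recursive CTE and date functions stand in for CONNECT BY, ADD_MONTHS and LAST_DAY, the function name month_end_headcount is made up, and the '99991231' magic end date is treated as "still employed".

```python
import sqlite3

# Sample rows from the question (dates kept as yyyymmdd strings).
EMPLOYEES = [
    (53, "19901117", "99991231"),
    (54, "19910208", "20010512"),
    (55, "19910415", "20120130"),
]

# months: first day of every month between :start and :stop.
# month_ends: last day of each of those months, as a yyyymmdd string.
QUERY = """
WITH RECURSIVE months(first_day) AS (
    SELECT date(:start, 'start of month')
    UNION ALL
    SELECT date(first_day, '+1 month') FROM months
    WHERE first_day < date(:stop, 'start of month')
),
month_ends(month_end) AS (
    SELECT strftime('%Y%m%d', first_day, '+1 month', '-1 day') FROM months
)
SELECT m.month_end, COUNT(e.employee_id)
FROM month_ends m
LEFT JOIN employees e
  ON m.month_end BETWEEN e.start_date AND e.end_date
GROUP BY m.month_end
ORDER BY m.month_end
"""


def month_end_headcount(start, stop):
    """Return (month_end, headcount) rows for the months from start to stop."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE employees (employee_id INTEGER, start_date TEXT, end_date TEXT)"
    )
    con.executemany("INSERT INTO employees VALUES (?, ?, ?)", EMPLOYEES)
    rows = con.execute(QUERY, {"start": start, "stop": stop}).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for month_end, headcount in month_end_headcount("1990-10-01", "1991-04-30"):
        print(month_end, headcount)
```

Counting e.employee_id rather than every row means a month with no matching employees reports 0 instead of 1, since the outer join still produces one row for that month.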
If you want to get the employees who were employed at the end of the month, you can use the LAST_DAY function in the WHERE clause of your query. You can also use that function in the GROUP BY clause. So your query would look like this:
SELECT LAST_DAY(start_date), COUNT(1)
FROM employees
WHERE start_date = LAST_DAY(start_date)
GROUP BY LAST_DAY(start_date)
or, if you just want to count employees employed per month, use the query below:
SELECT LAST_DAY(start_date), COUNT(1)
FROM employees
GROUP BY LAST_DAY(start_date)

Select dummy data for a certain period of years based on some conditions from a main query

The following query returns all ExceptionDays of an employee:
SELECT
idEmployee, idExceptionDayType,
YEAR(startDate) year,
SUM (days) total
FROM
ExceptionDays ed
GROUP BY idEmployee, idExceptionDayType, YEAR(startDate)
I need to generate dummy ExceptionDays starting with the hiring year of the employee till current year. This subquery shouldn't create dummy data if a year already exists in the ExceptionDays table for the current employee.
So basically, if the employee was hired in 2010 and it has data in ExceptionDays table only for 2014 and 2015 , the query should return 2010,2011,2012,2013 as dummy data and for 2014 and 2015 it should get the valid data.
For generating dummy data I used a temporary calendar of years :
SELECT
1 as idEmployee,1 as idExceptionDayType,YEAR(c.date) year,0 total
FROM
TMP_CALENDAR c WITH (NOLOCK)
WHERE
YEAR(c.date) >= 2012
AND YEAR(c.date) < YEAR(GETDATE())
GROUP BY YEAR(c.date)
This second subquery should somehow be joined to the first query, and the expected query would become:
SELECT
ED.idEmployee,1 as idExceptionDayType,YEAR(c.date) year,0 total
FROM
TMP_CALENDAR c WITH (NOLOCK)
JOIN Employees E ON ED.idEmployee = E.idEmployee
WHERE
YEAR(c.date) >= YEAR(e.hiringDate)
AND YEAR(c.date) < YEAR(GETDATE()) AND YEAR <> YEAR(ed.year)
GROUP BY YEAR(c.date)
As we can see, the ED reference to the first query is needed for the query to be valid, but I'm not sure how to do that.
Can someone help me a lil bit?
Thanks in advance,
The idea behind this type of query is to generate the output rows first -- using a CROSS JOIN to get all the years and employees, and then additional logic to bring in the other values.
The following version just uses ExceptionDays to get all the years, assuming there is at least one such day in each year. You can also use a calendar table for the y subquery:
SELECT e.idEmployee, COALESCE(ed.idExceptionDayType, 1) as idExceptionDayType,
y.year, COALESCE(SUM(ed.days), 0) as total
FROM (SELECT DISTINCT idEmployee FROM ExceptionDays) e CROSS JOIN
(SELECT DISTINCT YEAR(startDate) as year FROM ExceptionDays) y LEFT JOIN
ExceptionDays ed
ON ed.idEmployee = e.idEmployee and
y.year BETWEEN YEAR(ed.startDate) AND year(GETDATE())
GROUP BY e.idEmployee, ed.idExceptionDayType, y.year
You can try something like this:
1. Get all distinct years from the calendar.
2. Cross join with all employees and filter to keep only the years from the employee's hiring year onward.
3. Keep only those employees that do not have an entry in ExceptionDays for the specific year.
Query:
SELECT
E.idEmployee,1 as idExceptionDayType,calendar_year,0 total
FROM
(
SELECT YEAR(c.date) calendar_year
FROM TMP_CALENDAR c
GROUP BY YEAR(c.date)
)c
CROSS JOIN Employees E
LEFT JOIN ExceptionDays ed
ON ED.idEmployee = E.idEmployee
AND YEAR(ed.startDate) = calendar_year
WHERE calendar_year >= YEAR(e.hiringDate)
AND calendar_year < YEAR(GETDATE())
AND ED.idEmployee IS NULL
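To see the "generate the rows first, then outer join" pattern end to end, here is a small sketch using Python's standard-library sqlite3 module. It returns both the real totals and the zero-filled dummy years in one result. The assumptions are loud ones: this is SQLite rather than SQL Server, years are generated with a recursive CTE instead of TMP_CALENDAR, the range runs from the hiring year through the current year inclusive, dummy rows get idExceptionDayType 1 as in the question, and the function name yearly_totals is made up.

```python
import sqlite3

# years: one row per employee per year from hiring year to :this_year.
# totals: the real per-employee, per-type, per-year sums.
QUERY = """
WITH RECURSIVE years(idEmployee, yr) AS (
    SELECT idEmployee, CAST(strftime('%Y', hiringDate) AS INTEGER)
    FROM Employees
    UNION ALL
    SELECT idEmployee, yr + 1 FROM years WHERE yr < :this_year
),
totals AS (
    SELECT idEmployee, idExceptionDayType,
           CAST(strftime('%Y', startDate) AS INTEGER) AS yr,
           SUM(days) AS total
    FROM ExceptionDays
    GROUP BY idEmployee, idExceptionDayType,
             CAST(strftime('%Y', startDate) AS INTEGER)
)
SELECT y.idEmployee,
       COALESCE(t.idExceptionDayType, 1),
       y.yr,
       COALESCE(t.total, 0)
FROM years y
LEFT JOIN totals t
  ON t.idEmployee = y.idEmployee AND t.yr = y.yr
ORDER BY y.idEmployee, y.yr, 2
"""


def yearly_totals(employees, exception_days, this_year):
    """Return (idEmployee, idExceptionDayType, year, total) with gaps zero-filled."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Employees (idEmployee INTEGER, hiringDate TEXT)")
    con.execute(
        "CREATE TABLE ExceptionDays ("
        "idEmployee INTEGER, idExceptionDayType INTEGER, startDate TEXT, days INTEGER)"
    )
    con.executemany("INSERT INTO Employees VALUES (?, ?)", employees)
    con.executemany("INSERT INTO ExceptionDays VALUES (?, ?, ?, ?)", exception_days)
    rows = con.execute(QUERY, {"this_year": this_year}).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    emps = [(1, "2010-05-01")]
    days = [(1, 1, "2014-03-01", 2), (1, 1, "2014-07-01", 3), (1, 1, "2015-01-10", 1)]
    for row in yearly_totals(emps, days, 2015):
        print(row)
```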

SQL query for all the days of a month

I have the following table: RENTAL(book_date, copy_id, member_id, title_id, act_ret_date, exp_ret_date), where book_date is the day the book was booked. I need to write a query that, for every day of the month (so 1-30, 1-29, or 1-31 depending on the month), shows the number of books booked.
I currently know how to show the number of books rented on the days that are in the table:
select count(book_date), to_char(book_date,'DD')
from rental
group by to_char(book_date,'DD');
My questions are:
How do I show the rest of the days (say, if for some reason I have no books rented on the 19th or 20th, or on multiple days) and put the number 0 there?
How do I show only the days of the current month (28, 29, 30, or 31 days, depending on the month and year)? I am lost. This must be done using only an SQL query, no PL/SQL or other stuff.
The following query gives you all days in the current month. In your case, you can replace SYSDATE with your date column and join with this query to get the counts for a given month:
SELECT DT
FROM(
SELECT TRUNC (last_day(SYSDATE) - ROWNUM) dt
FROM DUAL CONNECT BY ROWNUM < 32
)
where DT >= trunc(sysdate,'mm')
The answer is to create a table like this:
create table yearsmonthsdays (year varchar(4), month varchar(2), day varchar(2));
Use any language you wish, e.g. iterate in Java with Calendar.getInstance().getActualMaximum(Calendar.DAY_OF_MONTH) to get the last day of the month, for as many years and months as you like, and fill that table with the year, month, and days from 1 to the last day of each month.
You'd get something like:
insert into yearsmonthsdays values ('1995','02','01');
insert into yearsmonthsdays values ('1995','02','02');
...
insert into yearsmonthsdays values ('1995','02','28'); /* non-leap year */
...
insert into yearsmonthsdays values ('1996','02','01');
insert into yearsmonthsdays values ('1996','02','02');
...
insert into yearsmonthsdays values ('1996','02','28');
insert into yearsmonthsdays values ('1996','02','29'); /* leap year */
...
and so on.
Once you have this table done, your work is almost finished. Make a left outer join from this table to your table, joining on year, month, and day together, and where no rows match, the count will be zero as you wish. Without using programming, this is your best bet.
In Oracle, you can query from dual and use the connect by level syntax to generate a series of rows, in your case dates. From there on, it's just a matter of deciding which dates you want to display (in my example I used all the dates from 2014) and joining on your table:
SELECT all_date, COALESCE (cnt, 0)
FROM (SELECT to_date('01/01/2014', 'dd/mm/yyyy') + rownum - 1 AS all_date
FROM dual
CONNECT BY LEVEL <= 365) d
LEFT JOIN (SELECT TRUNC(book_date) AS book_date, COUNT(book_date) AS cnt
FROM rental
GROUP BY TRUNC(book_date)) r ON d.all_date = r.book_date
There's no need to get ROWNUM involved ... you can just use LEVEL in the CONNECT BY:
WITH d1 AS (
SELECT TRUNC(SYSDATE, 'MONTH') - 1 + LEVEL AS book_date
FROM dual
CONNECT BY TRUNC(SYSDATE, 'MONTH') - 1 + LEVEL <= LAST_DAY(SYSDATE)
)
SELECT TRUNC(d1.book_date), COUNT(r.book_date)
FROM d1 LEFT JOIN rental r
ON TRUNC(d1.book_date) = TRUNC(r.book_date)
GROUP BY TRUNC(d1.book_date);
Simply replace SYSDATE with a date in the month you're targeting for results.
All days of the month based on current date
select trunc(sysdate) - (to_number(to_char(sysdate,'DD')) - 1)+level-1 x from dual connect by level <= TO_CHAR(LAST_DAY(sysdate),'DD')
This worked for me:
SELECT DT
FROM (SELECT TRUNC(LAST_DAY(SYSDATE) - (CASE WHEN ROWNUM=1 THEN 0 ELSE ROWNUM-1 END)) DT
FROM DUAL
CONNECT BY ROWNUM <= 32)
WHERE DT >= TRUNC(SYSDATE, 'MM')
In Oracle SQL, the query must look like this so as not to miss the last day of the month:
SELECT DT
FROM(
SELECT trunc(add_months(sysdate, 1),'MM')- ROWNUM dt
FROM DUAL CONNECT BY ROWNUM < 32
)
where DT >= trunc(sysdate,'mm')
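For a quick way to confirm that a day generator covers the whole month (including the last day) and that empty days come back as 0, here is a sketch using Python's standard-library sqlite3 module. It is not Oracle SQL: a recursive CTE replaces CONNECT BY, SQLite's date modifiers replace TRUNC and LAST_DAY, book_date is assumed to be stored as ISO text, and the function name days_with_counts is made up for illustration.

```python
import sqlite3

# days: every calendar day of the month that contains :any_day.
QUERY = """
WITH RECURSIVE days(d) AS (
    SELECT date(:any_day, 'start of month')
    UNION ALL
    SELECT date(d, '+1 day') FROM days
    WHERE d < date(:any_day, 'start of month', '+1 month', '-1 day')
)
SELECT days.d, COUNT(r.book_date)
FROM days
LEFT JOIN rental r ON date(r.book_date) = days.d
GROUP BY days.d
ORDER BY days.d
"""


def days_with_counts(any_day, bookings):
    """Return (day, bookings_on_that_day) for every day of any_day's month."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE rental (book_date TEXT)")
    con.executemany("INSERT INTO rental VALUES (?)", [(b,) for b in bookings])
    rows = con.execute(QUERY, {"any_day": any_day}).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    sample = [
        "2016-02-03 10:15:00",
        "2016-02-03 18:00:00",
        "2016-02-29 09:00:00",
        "2016-03-01 12:00:00",
    ]
    for day, count in days_with_counts("2016-02-15", sample):
        print(day, count)
```

COUNT(r.book_date) counts only matched rentals, so days with no bookings show 0 rather than 1.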

SQL Query to return data for the first of every month

I want to find the number of staff in a department at the start of each month, for the last 12 months.
I can get the desired output using 12 separate queries and UNION ALL similar to below:
SELECT
o.DEP_ID
,COUNT(o.STAFF_ID) STAFF_COUNT
,TRUNC(SYSDATE,'MON') EFFECTIVE_DATE
FROM
OCCUPANCIES o
WHERE
o.START_DATE <= TRUNC(SYSDATE,'MON')
AND o.END_DATE >= TRUNC(SYSDATE,'MON')
GROUP BY
o.DEP_ID
,TRUNC(SYSDATE,'MON')
UNION ALL
SELECT
o.DEP_ID
,COUNT(o.STAFF_ID) STAFF_COUNT
,ADD_MONTHS(TRUNC(SYSDATE,'MON'),-1) EFFECTIVE_DATE
FROM
OCCUPANCIES o
WHERE
o.START_DATE <= ADD_MONTHS(TRUNC(SYSDATE,'MON'),-1)
AND o.END_DATE >= ADD_MONTHS(TRUNC(SYSDATE,'MON'),-1)
GROUP BY
o.DEP_ID
,ADD_MONTHS(TRUNC(SYSDATE,'MON'),-1)
This gives me output similar to the following:
Unfortunately my real query is very long, and editing it is becoming unwieldy to say the least because I am making the same changes in 12 places each time.
Is there a way of doing this in a single SELECT statement?
EDIT: I have uploaded an example to SQLFiddle
You can generate a list of effective dates and use it in your query:
SELECT
o.DEP_ID
,COUNT(o.STAFF_ID) STAFF_COUNT
,dt.EFFECTIVE_DATE
FROM
OCCUPANCIES o,
(SELECT ADD_MONTHS(TRUNC(SYSDATE,'MON'), 1-LEVEL) EFFECTIVE_DATE
FROM dual
CONNECT BY LEVEL <=12) dt
WHERE
dt.EFFECTIVE_DATE BETWEEN o.START_DATE AND o.END_DATE
GROUP BY
o.DEP_ID
,dt.EFFECTIVE_DATE
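The same pattern can be exercised outside Oracle. The sketch below uses Python's standard-library sqlite3 module, with a recursive CTE standing in for CONNECT BY LEVEL <= 12 and date(..., '-1 month') standing in for ADD_MONTHS. The table contents are invented for illustration, dates are assumed to be ISO text, and the function name staff_by_month is made up.

```python
import sqlite3

# dt: the first day of the current month and of the 11 months before it.
QUERY = """
WITH RECURSIVE dt(effective_date, n) AS (
    SELECT date(:today, 'start of month'), 1
    UNION ALL
    SELECT date(effective_date, '-1 month'), n + 1 FROM dt WHERE n < 12
)
SELECT o.dep_id, COUNT(o.staff_id), dt.effective_date
FROM dt
JOIN occupancies o
  ON dt.effective_date BETWEEN o.start_date AND o.end_date
GROUP BY o.dep_id, dt.effective_date
ORDER BY dt.effective_date, o.dep_id
"""


def staff_by_month(today, occupancies):
    """Return (dep_id, staff_count, effective_date) for the last 12 month starts."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE occupancies ("
        "dep_id INTEGER, staff_id INTEGER, start_date TEXT, end_date TEXT)"
    )
    con.executemany("INSERT INTO occupancies VALUES (?, ?, ?, ?)", occupancies)
    rows = con.execute(QUERY, {"today": today}).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    # Hypothetical data: (dep_id, staff_id, start_date, end_date).
    data = [
        (10, 1, "2014-01-01", "2014-12-31"),
        (10, 2, "2014-06-15", "2015-03-31"),
        (20, 3, "2013-01-01", "2014-07-01"),
    ]
    for row in staff_by_month("2014-12-20", data):
        print(row)
```

As in the query above, this is an inner join, so a department with no staff on a given effective date simply has no row for that date.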