SQL Server Get all Birthday Years - sql

I have a table in SQL Server that is Composed of
ID, B_Day
1, 1977-02-20
2, 2001-03-10
...
I want to add rows to this table for each year of a birthday, up to the current birthday year.
i.e:
ID, B_Day
1,1977-02-20
1,1978-02-20
1,1979-02-20
...
1,2020-02-20
2, 2001-03-10
2, 2002-03-10
...
2, 2019-03-10
I'm struggling to determine what the best strategy for accomplishing this. I thought about recursively self-joining, but that creates far too many layers. Any suggestions?

The following should work
with row_gen
as (select top 200 row_number() over(order by name)-1 as rnk
from master..spt_values
)
select a.id,a.b_day,dateadd(year,rnk,b_day) incr_b_day
from dbo.t a
join row_gen b
on dateadd(year,b.rnk,a.b_day)<=getdate()
https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=0d06c95e1914ca45ca192d0d192bd2e0

You can use recursive approach :
with cte as (
select t.id, t.b_day, convert(date, getdate()) as mx_dt
from table t
union all
select c.id, dateadd(year, 1, c.b_day), c.mx_dt
from cte c
where dateadd(year, 1, c.b_day) < c.mx_dt
)
select c.id, c.b_day
from cte c
order by c.id, c.b_day;
Default recursion is 100, you can add query hint for more recursion option (maxrecursion 0).

If your dataset is not too big, one option is to use a recursive query:
with cte as (
select id, b_day bday0, b_day, 1 lvl from mytable
union all
select
id,
bday0,
dateadd(year, lvl, bday0), lvl + 1
from cte
where dateadd(year, lvl, bday0) <= getdate()
)
select id, b_day from cte order by id, b_day
Demo on DB Fiddle:
id | b_day
-: | :---------
1 | 1977-02-20
1 | 1978-02-20
1 | 1979-02-20
1 | 1980-02-20
1 | 1981-02-20
1 | 1982-02-20
1 | 1983-02-20
1 | 1984-02-20
1 | 1985-02-20
1 | 1986-02-20
1 | 1987-02-20
1 | 1988-02-20
1 | 1989-02-20
1 | 1990-02-20
1 | 1991-02-20
1 | 1992-02-20
1 | 1993-02-20
1 | 1994-02-20
1 | 1995-02-20
1 | 1996-02-20
1 | 1997-02-20
1 | 1998-02-20
1 | 1999-02-20
1 | 2000-02-20
1 | 2001-02-20
1 | 2002-02-20
1 | 2003-02-20
1 | 2004-02-20
1 | 2005-02-20
1 | 2006-02-20
1 | 2007-02-20
1 | 2008-02-20
1 | 2009-02-20
1 | 2010-02-20
1 | 2011-02-20
1 | 2012-02-20
1 | 2013-02-20
1 | 2014-02-20
1 | 2015-02-20
1 | 2016-02-20
1 | 2017-02-20
1 | 2018-02-20
1 | 2019-02-20
1 | 2020-02-20
2 | 2001-03-01
2 | 2002-03-01
2 | 2003-03-01
2 | 2004-03-01
2 | 2005-03-01
2 | 2006-03-01
2 | 2007-03-01
2 | 2008-03-01
2 | 2009-03-01
2 | 2010-03-01
2 | 2011-03-01
2 | 2012-03-01
2 | 2013-03-01
2 | 2014-03-01
2 | 2015-03-01
2 | 2016-03-01
2 | 2017-03-01
2 | 2018-03-01
2 | 2019-03-01
2 | 2020-03-01

Related

GROUP BY with MAX and UNION, or JOIN?

How to obtain from this table date_departure and date_arrival for each travel according visiting_order
select * from step;
id_step | id_travel | id_port | visiting_order | date_arrival | date_departure
---------+-----------+---------+----------------+--------------+----------------
1 | 1 | 1 | 0 | | 2021-01-12
2 | 1 | 2 | 1 | 2021-05-20 | 2021-05-22
3 | 1 | 3 | 2 | 2021-07-27 |
4 | 2 | 4 | 0 | | 2021-02-13
5 | 2 | 5 | 1 | 2021-02-27 |
6 | 3 | 7 | 0 | | 2022-01-12
7 | 3 | 6 | 1 | 2022-05-27 |
like this :
id_travel | date_departure | date_arrival
------------+----------------+--------------
1 | 2021-01-12 | 2021-07-27
2 | 2021-02-13 | 2021-02-27
3 | 2022-01-12 | 2022-05-27
?
My first intention was to take both columns and UNION them
(SELECT id_travel, date_departure FROM step WHERE visiting_order = 0
GROUP BY id_travel, date_departure)
UNION
(SELECT A.id AS id_travel, A.arr_date AS date_arrival FROM
(SELECT id_travel, MAX(visiting_order), date_arrival
FROM step GROUP BY id_travel
) AS A(id, ord, arr_date)
);
and first select is ok
id_travel | date_departure
-----------+----------------
1 | 2021-01-12
2 | 2021-02-13
3 | 2022-01-12
but second one return an error
ERROR: column "step.date_arrival" must appear in the GROUP BY clause or be used in an aggregate function
Seems like this can just be:
SELECT id_travel
, min(date_departure) AS date_departure
, max(date_arrival) AS date_arrival
FROM step
GROUP BY 1
ORDER BY 1;
Certainly works with your sample data.
SELECT DISTINCT step.id_travel, sub_query1.date_departure, sub_query.date_arrival
FROM step
INNER JOIN
(SELECT step.id_travel, MAX(step.date_arrival) as date_arrival
FROM step
GROUP BY step.id_travel
)
AS sub_query ON (sub_query.id_travel = step.id_travel)
INNER JOIN
(SELECT id_travel, date_departure FROM step WHERE visiting_order = 0 GROUP BY id_travel, date_departure)
AS sub_query1 ON (sub_query1.id_travel = step.id_travel)
ORDER BY id_travel;

Count previous row only if > 2 days later

In SQL Server 2012 using Studio: I need results displayed count of distinct clientnumbers (CN) for re-entry, grouped by Type like this:
Type CountOfCN
5 1
10 3
Only a RE-entry counts (ENTRY_NO 1 never counts) and it has to be more than 2 days after the end of the previous entry for that clientnumber. So basically ENTRY_NO 1 doesn't count. ENTRY_NO 2 counts if it's startdate is more than 2 days after the enddate of ENTRY_NO 1, and so on with ENTRY_NO 3, 4, 5.
I got ENTRY_NO by doing a ROW_NUMBER function when I created the table. I have no idea how to go about creating a datediff or dateadd function (?) to look at the previous row's enddate and calculate it with my startdate for each CN?
Here is my table:
CN STARTDATE ENDDATE TYPE ENTRY_NO
1 1/1/2018 1/20/2018 10 1
1 1/21/2018 1/30/2018 5 2
1 2/3/2018 NULL 10 3
2 1/1/2018 1/20/2018 10 1
2 1/27/2018 1/30/2018 10 2
3 1/1/2018 1/20/2018 5 1
3 1/27/2018 1/30/2018 10 2
3 2/10/2018 2/20/2018 5 3
4 1/7/2018 1/30/2018 5 1
5 1/27/2018 1/30/2018 5 1
5 1/31/2018 NULL 5 2
So the rows that should be in the results are ENTRY_NO 2 for CN 1, ENTRY_NO 2 for CN 2, ENTRY_NO 2 & 3 for CN 3.
Only the last Entry may/may not have a NULL enddate
Using the LAG window function you can get the previous enddate.
SELECT *
FROM
(
SELECT * ,
LAG(ENDDATE) OVER (PARTITION BY CN ORDER BY STARTDATE) AS prevEndDate
FROM yourtable
) q
WHERE DATEDIFF(d, prevEndDate, STARTDATE) > 2
AND ENDDATE IS NOT NULL
Inner join the table to itself on the conditions you want to enforce:
Can't be Entry_No 1
The Entry_No on one side is one greater than on the other side
Previous Entry must be more than 2 days earlier
Both sides of the join have the same CN
Use that join to create a CTE or derived table, and then SELECT from it, grouping by Type and getting the COUNT(*)
So this ended up being more involved than I first thought, but here it goes...
You can run this example in SSMS.
Create a table variable matching your definition above:
DECLARE #data TABLE ( CN INT, STARTDATE DATETIME, ENDDATE DATETIME, [TYPE] INT, ENTRY_NO INT );
Insert data given:
INSERT INTO #data ( CN, STARTDATE, ENDDATE, [TYPE], ENTRY_NO ) VALUES
( 1, '1/1/2018', '1/20/2018', 10, 1 )
, ( 1, '1/21/2018', '1/30/2018', 5, 2 )
, ( 1, '2/3/2018', NULL, 10, 3 )
, ( 2, '1/1/2018', '1/20/2018', 10, 1 )
, ( 2, '1/27/2018', '1/30/2018', 10, 2 )
, ( 3, '1/1/2018', '1/20/2018', 5, 1 )
, ( 3, '1/27/2018', '1/30/2018', 10, 2 )
, ( 3, '2/10/2018', '2/20/2018', 5, 3 )
, ( 4, '1/7/2018', '1/30/2018', 5, 1 )
, ( 5, '1/27/2018', '1/30/2018', 5, 1 )
, ( 5, '1/31/2018', NULL, 5, 2 );
Confirm inserted data:
+----+-------------------------+-------------------------+------+----------+
| CN | STARTDATE | ENDDATE | TYPE | ENTRY_NO |
+----+-------------------------+-------------------------+------+----------+
| 1 | 2018-01-01 00:00:00.000 | 2018-01-20 00:00:00.000 | 10 | 1 |
| 1 | 2018-01-21 00:00:00.000 | 2018-01-30 00:00:00.000 | 5 | 2 |
| 1 | 2018-02-03 00:00:00.000 | NULL | 10 | 3 |
| 2 | 2018-01-01 00:00:00.000 | 2018-01-20 00:00:00.000 | 10 | 1 |
| 2 | 2018-01-27 00:00:00.000 | 2018-01-30 00:00:00.000 | 10 | 2 |
| 3 | 2018-01-01 00:00:00.000 | 2018-01-20 00:00:00.000 | 5 | 1 |
| 3 | 2018-01-27 00:00:00.000 | 2018-01-30 00:00:00.000 | 10 | 2 |
| 3 | 2018-02-10 00:00:00.000 | 2018-02-20 00:00:00.000 | 5 | 3 |
| 4 | 2018-01-07 00:00:00.000 | 2018-01-30 00:00:00.000 | 5 | 1 |
| 5 | 2018-01-27 00:00:00.000 | 2018-01-30 00:00:00.000 | 5 | 1 |
| 5 | 2018-01-31 00:00:00.000 | NULL | 5 | 2 |
+----+-------------------------+-------------------------+------+----------+
Run SQL to get type count given your business rules:
ENTRY_NO must be greater than 1
Current CN ENDDATE must be greater than 2 days from previous ENDDATE
T-SQL:
SELECT
[TYPE], COUNT( DISTINCT CN ) AS ClientCount
FROM #data
WHERE
CN IN (
SELECT DISTINCT CN FROM (
SELECT
dat.CN
, dat.ENTRY_NO
, dat.[TYPE]
, DATEDIFF( DD
, LAG( ENDDATE, 1, NULL ) OVER ( PARTITION BY CN ORDER BY CN, ENDDATE ) -- gets enddate for previous CN entry
, ENDDATE
) AS DayDiff
FROM #data dat
) AS Clients
WHERE
Clients.ENTRY_NO >= 2
AND Clients.DayDiff > 2
)
GROUP BY
[TYPE]
ORDER BY
[TYPE];
Returns:
+------+-------------+
| TYPE | ClientCount |
+------+-------------+
| 5 | 2 |
| 10 | 3 |
+------+-------------+
A quick look at the IN subquery shows us that CNs 1, 2, and 3 will be included during the "TYPE" count.
SELECT
dat.CN
, dat.ENTRY_NO
, dat.[TYPE]
, DATEDIFF( DD
, LAG( ENDDATE, 1, NULL ) OVER ( PARTITION BY CN ORDER BY CN, ENDDATE ) -- gets enddate for previous CN entry
, ENDDATE
) AS DayDiff
FROM #data dat
ORDER BY
dat.CN, dat.ENTRY_NO;
+----+----------+------+---------+
| CN | ENTRY_NO | TYPE | DayDiff |
+----+----------+------+---------+
| 1 | 1 | 10 | NULL |
| 1 | 2 | 5 | 10 |
| 1 | 3 | 10 | NULL |
| 2 | 1 | 10 | NULL |
| 2 | 2 | 10 | 10 |
| 3 | 1 | 5 | NULL |
| 3 | 2 | 10 | 10 |
| 3 | 3 | 5 | 21 |
| 4 | 1 | 5 | NULL |
| 5 | 1 | 5 | NULL |
| 5 | 2 | 5 | NULL |
+----+----------+------+---------+

Weekly Average Reports: Redshift

My Sales data for first two weeks of june, Monday Date i.e 1st Jun , 8th Jun are below
date | count
2015-06-01 03:25:53 | 1
2015-06-01 03:28:51 | 1
2015-06-01 03:49:16 | 1
2015-06-01 04:54:14 | 1
2015-06-01 08:46:15 | 1
2015-06-01 13:14:09 | 1
2015-06-01 16:20:13 | 5
2015-06-01 16:22:13 | 1
2015-06-01 16:27:07 | 1
2015-06-01 16:29:57 | 1
2015-06-01 19:16:45 | 1
2015-06-08 10:54:46 | 1
2015-06-08 15:12:10 | 1
2015-06-08 20:35:40 | 1
I need a find weekly avg of sales happened in a given range .
Complex Query:
(some_manipulation_part), ifact as
( select date, sales_count from final_result_set
) select date_part('h',date )) as h ,
date_part('dow',date )) as day_of_week ,
count(sales_count)
from final_result_set
group by h, dow.
Output :
h | day_of_week | count
3 | 1 | 3
4 | 1 | 1
8 | 1 | 1
10 | 1 | 1
13 | 1 | 1
15 | 1 | 1
16 | 1 | 8
19 | 1 | 1
20 | 1 | 1
If I try to apply avg on the above final result, It is not actually fetching correct answer!
(some_manipulation_part), ifact as
( select date, sales_count from final_result_set
) select date_part('h',date )) as h ,
date_part('dow',date )) as day_of_week ,
avg(sales_count)
from final_result_set
group by h, dow.
h | day_of_week | count
3 | 1 | 1
4 | 1 | 1
8 | 1 | 1
10 | 1 | 1
13 | 1 | 1
15 | 1 | 1
16 | 1 | 1
19 | 1 | 1
20 | 1 | 1
So I 've two mondays in the given range, it is not actually dividing by it. I am not even sure what is happening inside redshift.
To get "weekly averages" use date_trunc():
SELECT date_trunc('week', my_date_column) as week
, avg(sales_count) AS avg_sales
FROM final_result_set
GROUP BY 1;
I hope you are not actually using date as name for your date column. It's a reserved word in SQL and a basic type name, don't use it as identifier.
If you group by the day of week (DOW) you get averages per weekday. and sunday is 0. (Use ISODOW to get 7 for Sunday.)

Horizontal Count SQL

I apologize if this is a duplicate question but I could not find my answer.
I am trying to take data that is horizontal, and get a count of how many times a specific number appears.
Example table
+-------+-------+-------+-------+
| Empid | KPI_A | KPI_B | KPI_C |
+-------+-------+-------+-------+
| 232 | 1 | 3 | 3 |
| 112 | 2 | 3 | 2 |
| 143 | 3 | 1 | 1 |
+-------+-------+-------+-------+
I need to see the following:
+-------+--------------+--------------+--------------+
| EmpID | (1's Scored) | (2's Scored) | (3's Scored) |
+-------+--------------+--------------+--------------+
| 232 | 1 | 0 | 2 |
| 112 | 0 | 2 | 1 |
| 143 | 2 | 0 | 1 |
+-------+--------------+--------------+--------------+
I hope that makes sense. Any help would be appreciated.
Since you are counting data across multiple columns, it might be easier to unpivot your KPI columns first, then count the scores.
You could use either the UNPIVOT function or CROSS APPLY to convert your KPI columns into multiple rows. The syntax would be similar to:
select EmpId, KPI, Val
from yourtable
cross apply
(
select 'A', KPI_A union all
select 'B', KPI_B union all
select 'C', KPI_C
) c (KPI, Val)
See SQL Fiddle with Demo. This gets your multiple columns into multiple rows, which is then easier to work with:
| EMPID | KPI | VAL |
|-------|-----|-----|
| 232 | A | 1 |
| 232 | B | 3 |
| 232 | C | 3 |
| 112 | A | 2 |
Now you can easily count the number of 1's, 2's, and 3's that you have using an aggregate function with a CASE expression:
select EmpId,
sum(case when val = 1 then 1 else 0 end) Score_1,
sum(case when val = 2 then 1 else 0 end) Score_2,
sum(case when val = 3 then 1 else 0 end) Score_3
from
(
select EmpId, KPI, Val
from yourtable
cross apply
(
select 'A', KPI_A union all
select 'B', KPI_B union all
select 'C', KPI_C
) c (KPI, Val)
) d
group by EmpId;
See SQL Fiddle with Demo. This gives a final result of:
| EMPID | SCORE_1 | SCORE_2 | SCORE_3 |
|-------|---------|---------|---------|
| 112 | 0 | 2 | 1 |
| 143 | 2 | 0 | 1 |
| 232 | 1 | 0 | 2 |

SQL - Grouping with aggregation

I have a table (TABLE1) that lists all employees with their Dept IDs, the date they started and the date they were terminated (NULL means they are current employees).
I would like to have a resultset (TABLE2) , in which every row represents a day starting since the first employee started( in the sample table below, that date is 20090101 ), till today. (the DATE field). I would like to group the employees by DeptID and calculate the total number of employees for each row of TABLE2.
How do I this query? Thanks for your help, in advance.
TABLE1
DeptID EmployeeID StartDate EndDate
--------------------------------------------
001 123 20100101 20120101
001 124 20090101 NULL
001 234 20110101 20120101
TABLE2
DeptID Date EmployeeCount
-----------------------------------
001 20090101 1
001 20090102 1
... ... 1
001 20100101 2
001 20100102 2
... ... 2
001 20110101 3
001 20110102 3
... ... 3
001 20120101 1
001 20120102 1
001 20120103 1
... ... 1
This will work if you have a date look up table. You will need to specify the department ID. See it in action.
Query
SELECT d.dt, SUM(e.ecount) AS RunningTotal
FROM dates d
INNER JOIN
(SELECT b.dt,
CASE
WHEN c.ecount IS NULL THEN 0
ELSE c.ecount
END AS ecount
FROM dates b
LEFT JOIN
(SELECT a.DeptID, a.dt, SUM([count]) AS ecount
FROM
(SELECT DeptID, EmployeeID, 1 AS [count], StartDate AS dt FROM TABLE1
UNION ALL
SELECT DeptID, EmployeeID,
CASE
WHEN EndDate IS NOT NULL THEN -1
ELSE 0
END AS [count], EndDate AS dt FROM TABLE1) a
WHERE a.dt IS NOT NULL AND DeptID = 1
GROUP BY a.DeptID, a.dt) c ON c.dt = b.dt) e ON e.dt <= d.dt
GROUP BY d.dt
Result
| DT | RUNNINGTOTAL |
-----------------------------
| 2009-01-01 | 1 |
| 2009-02-01 | 1 |
| 2009-03-01 | 1 |
| 2009-04-01 | 1 |
| 2009-05-01 | 1 |
| 2009-06-01 | 1 |
| 2009-07-01 | 1 |
| 2009-08-01 | 1 |
| 2009-09-01 | 1 |
| 2009-10-01 | 1 |
| 2009-11-01 | 1 |
| 2009-12-01 | 1 |
| 2010-01-01 | 2 |
| 2010-02-01 | 2 |
| 2010-03-01 | 2 |
| 2010-04-01 | 2 |
| 2010-05-01 | 2 |
| 2010-06-01 | 2 |
| 2010-07-01 | 2 |
| 2010-08-01 | 2 |
| 2010-09-01 | 2 |
| 2010-10-01 | 2 |
| 2010-11-01 | 2 |
| 2010-12-01 | 2 |
| 2011-01-01 | 3 |
| 2011-02-01 | 3 |
| 2011-03-01 | 3 |
| 2011-04-01 | 3 |
| 2011-05-01 | 3 |
| 2011-06-01 | 3 |
| 2011-07-01 | 3 |
| 2011-08-01 | 3 |
| 2011-09-01 | 3 |
| 2011-10-01 | 3 |
| 2011-11-01 | 3 |
| 2011-12-01 | 3 |
| 2012-01-01 | 1 |
Schema
CREATE TABLE TABLE1 (
DeptID tinyint,
EmployeeID tinyint,
StartDate date,
EndDate date)
INSERT INTO TABLE1 VALUES
(1, 123, '2010-01-01', '2012-01-01'),
(1, 124, '2009-01-01', NULL),
(1, 234, '2011-01-01', '2012-01-01')
CREATE TABLE dates (
dt date)
INSERT INTO dates VALUES
('2009-01-01'), ('2009-02-01'), ('2009-03-01'), ('2009-04-01'), ('2009-05-01'),
('2009-06-01'), ('2009-07-01'), ('2009-08-01'), ('2009-09-01'), ('2009-10-01'),
('2009-11-01'), ('2009-12-01'), ('2010-01-01'), ('2010-02-01'), ('2010-03-01'),
('2010-04-01'), ('2010-05-01'), ('2010-06-01'), ('2010-07-01'), ('2010-08-01'),
('2010-09-01'), ('2010-10-01'), ('2010-11-01'), ('2010-12-01'), ('2011-01-01'),
('2011-02-01'), ('2011-03-01'), ('2011-04-01'), ('2011-05-01'), ('2011-06-01'),
('2011-07-01'), ('2011-08-01'), ('2011-09-01'), ('2011-10-01'), ('2011-11-01'),
('2011-12-01'), ('2012-01-01')
you need somthing along these lines.
SELECT *
, ( SELECT COUNT(EmployeeID) AS EmployeeCount
FROM TABLE1 AS f
WHERE t.[Date] BETWEEN f.BeginDate AND f.EndDate
)
FROM ( SELECT DeptID
, BeginDate AS [Date]
FROM TABLE1
UNION
SELECT DeptID
, EndDate AS [Date]
FROM TABLE1
) AS t
EDIT since OP clarified that he wants all the dates here is the updated solution
I have excluded a Emplyee from Count if his job is ending on that date.But if you want to include change t.[Date] < f.EndDate to t.[Date] <= f.EndDate in the below solution. Plus I assume the NULL value in EndDate mean Employee still works for Department.
DECLARE #StartDate DATE = (SELECT MIN(StartDate) FROM Table1)
,#EndDate DATE = (SELECT MAX(EndDate) FROM Table1)
;WITH CTE AS
(
SELECT DISTINCT DeptID,#StartDate AS [Date] FROM Table1
UNION ALL
SELECT c.DeptID, DATEADD(dd,1,c.[Date]) AS [Date] FROM CTE AS c
WHERE c.[Date]<=#EndDate
)
SELECT * ,
EmployeeCount=( SELECT COUNT(EmployeeID)
FROM TABLE1 AS f
WHERE f.DeptID=t.DeptID AND t.[Date] >= f.StartDate
AND ( t.[Date] < f.EndDate OR f.EndDate IS NULL )
)
FROM CTE AS t
ORDER BY 1
OPTION ( MAXRECURSION 0 )
here is SQL Fiddler demo.I have added another department and added an Employee to it.
http://sqlfiddle.com/#!3/5c4ec/1