CountDistinct() is counting a value twice when grouped in SSRS

-- Sample data.
declare @Table1 as Table ( RegisterId Int Identity, UnitId Int, DateRegistered date );
declare @Table2 as Table ( Id Int Identity, RegisterId Int, Rep1 int, Rep2 int, DateCreated Date );
declare @Table3 as Table ( UnitId int Identity, UnitName varchar(40), SquadName varchar(40) );
insert into @Table1 ( UnitId, DateRegistered )
values
( 1, '20160115' );
insert into @Table2 ( RegisterId, Rep1, Rep2, DateCreated )
values
( 1, 3, 4, '20160122' ), ( 1, 10, 4, '20160129' ), ( 1, 32, 45, '20160210' );
insert into @Table3 ( UnitName, SquadName )
values
( 'Tango', 'West' ), ( 'Lima', 'West' ), ( 'Foxtrot', 'West' );
SELECT t3.UnitName
, t2.RegisterId
, t2.DateCreated
, t2.Rep1 + t2.Rep2 as 'TotalReps'
, DateName(month, t2.DateCreated) as 'Month'
, DateName(year, t2.DateCreated) as 'Year'
FROM @Table1 t1
INNER JOIN @Table2 t2 ON t1.RegisterId = t2.RegisterId
INNER JOIN @Table3 t3 ON t1.UnitId = t3.UnitId
I am building a report in SSRS, and the above is my query. The report parameters are a start date, an end date and UnitId(s).
In the report I have 3 row groups: Month, Year and SquadName. I use TotalReps for the total reps, CountDistinct(Fields!RegisterId.Value) for ConfirmedRegisters and Count(Fields!RegisterId.Value) for CheckIns. The TOTALs are just sums of those expressions, e.g. Sum(CountDistinct(Fields!RegisterId.Value)).
The report currently shows:
        TotalReps  ConfirmedRegisters  CheckIns
WEST
2016
Jan            21                   1         2
Feb            77                   1         1
TOTAL          98                   1         3
Some definitions. A ConfirmedRegister means the RegisterId exists in both Table1 and Table2. A CheckIn is just a count of Table2 Ids. So to be a check-in there must be a row in Table2, and a ConfirmedRegister can only be counted once, regardless of how many check-ins there are and when they happen. So if a Table1 register occurs in Jan 2016 and there are check-ins against that RegisterId in Jan and Feb 2016, as in the test data, the report should show zero in the ConfirmedRegisters column for Feb because the RegisterId was already counted in Jan.
Should be:
        TotalReps  ConfirmedRegisters  CheckIns
WEST
2016
Jan            21                   1         2
Feb            77                   0         1
TOTAL          98                   1         3
Notice that the TOTAL for ConfirmedRegisters is correct, I guess because it is totalling over the whole date range. But the monthly values in the ConfirmedRegisters column are incorrect, because the RegisterId is counted in both Jan and Feb when it should be counted only in Jan, with nothing or 0 for Feb.
I'm not sure whether I need to fix this in the query or in the report.

I fixed this by using a CTE with a windowing function (row_number() over (partition by ...)) to mimic a 'First()' type function, so that I could count only the first occurrence of each RegisterId from Table2.
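A minimal sketch of that idea, not the poster's actual query. It runs through Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption; window functions need SQLite 3.25 or later). The Checkins table and its pre-summed TotalReps values are hypothetical simplifications of the sample data. The ROW_NUMBER() plus CASE pattern is standard SQL and should carry over to T-SQL with little change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical flattened version of the joined sample rows.
    CREATE TABLE Checkins (RegisterId INT, TotalReps INT, DateCreated TEXT);
    INSERT INTO Checkins VALUES
        (1,  7, '2016-01-22'),
        (1, 14, '2016-01-29'),
        (1, 77, '2016-02-10');
""")

query = """
WITH numbered AS (
    -- rn = 1 marks the first check-in of each RegisterId.
    SELECT RegisterId, TotalReps, DateCreated,
           ROW_NUMBER() OVER (PARTITION BY RegisterId
                              ORDER BY DateCreated) AS rn
    FROM Checkins
)
SELECT substr(DateCreated, 1, 7)               AS YearMonth,
       SUM(TotalReps)                          AS TotalReps,
       SUM(CASE WHEN rn = 1 THEN 1 ELSE 0 END) AS ConfirmedRegisters,
       COUNT(*)                                AS CheckIns
FROM numbered
GROUP BY substr(DateCreated, 1, 7)
ORDER BY YearMonth
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With a first-occurrence flag like this in the dataset, the report cell could presumably sum the flag instead of calling CountDistinct, so each RegisterId is counted only in the month of its first check-in.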

Return columns from different tables that are after a given date

I need to write a query that returns product orders that were open during April 2018, both those that are still open and those that are no longer open.
Each row needs to include the name of the customer that placed the order, the id of the order, and the date the order was filled.
Here is the table info:
CREATE TABLE dbo.ProductOrders
(
POID INT NOT NULL IDENTITY(1, 1) PRIMARY KEY ,
ProductId INT NOT NULL
CONSTRAINT FK_ProductOrders_ProductId_ref_Products_ProductId
FOREIGN KEY REFERENCES dbo.Products ( ProductId ) ,
CustomerId INT NOT NULL ,
OrderedQuantity INT ,
Filled BIT NOT NULL
CONSTRAINT DF_ProductOrders_Filled
DEFAULT ( 0 ) ,
DateOrdered DATETIME
CONSTRAINT DF_ProductOrders_DateOrdered
DEFAULT ( GETDATE()) ,
DateFilled DATETIME
CONSTRAINT DF_ProductOrders_DateFilled
DEFAULT ( GETDATE())
);
INSERT dbo.ProductOrders ( ProductId ,
CustomerId ,
OrderedQuantity ,
Filled ,
DateOrdered ,
DateFilled )
VALUES ( 2, 1, 1000, 0, '4/16/18 8:09:13', NULL ) ,
( 2, 1, 500, 1, '3/27/18 17:00:21', '6/24/18 13:29:01' ) ,
( 3, 3, 2000, 1, '12/01/04 13:28:58', '2/19/05 19:41:42' ) ,
( 1, 1, 632, 0, '5/23/18 4:25:52', NULL ) ,
( 4, 4, 901, 0, '3/30/18 21:30:28', NULL );
CREATE TABLE dbo.Customers
(
CustomerId INT NOT NULL IDENTITY(1, 1) PRIMARY KEY ,
CustomerName NVARCHAR(100) ,
Active BIT NOT NULL
CONSTRAINT DF_Customers_Active
DEFAULT ( 1 )
);
INSERT dbo.Customers ( CustomerName ,
Active )
VALUES ( 'Bikes R'' Us', 1 ) ,
( 'Industrial Giant', 1 ) ,
( 'Widget-Works', 0 ) ,
( 'Custom Hangers', 1 );
This is my best attempt. I know this is not the right syntax, but I'm not sure whether I need a join between these two tables to make this work, or how I would go about selecting orders that were placed from April 2018 on and are either open or closed after that date.
select CustomerName, POID, DataFilled,
From ProductOrders, Customers
Where DateOrdered is >= April 2018
I think you want a join and filtering:
select c.customername, po.poid, po.dateordered, po.datefilled
from productorders po
inner join customers c on c.customerid = po.customerid
where
po.dateordered >= '20180401'
and po.dateordered < '20180501'
and po.datefilled < getdate()
This gives you orders that were ordered in April 2018 and are not open anymore as of now. To get orders that are still open, you would change the last condition to po.datefilled is null.
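If "open during April 2018" is instead read as an interval overlap (ordered before 1 May 2018 and not filled before 1 April 2018), both cases can come back from one query with a status column. That reading is an assumption that goes beyond the filter above. The sketch below uses Python's sqlite3 as a stand-in for SQL Server, trims the tables to the columns needed, and stores dates as ISO strings so that text comparison orders them correctly. The Status column is a hypothetical addition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerId INTEGER PRIMARY KEY, CustomerName TEXT);
    CREATE TABLE ProductOrders (POID INTEGER PRIMARY KEY, CustomerId INT,
                                DateOrdered TEXT, DateFilled TEXT);
    INSERT INTO Customers VALUES
        (1, 'Bikes R'' Us'), (2, 'Industrial Giant'),
        (3, 'Widget-Works'), (4, 'Custom Hangers');
    INSERT INTO ProductOrders VALUES
        (1, 1, '2018-04-16 08:09:13', NULL),
        (2, 1, '2018-03-27 17:00:21', '2018-06-24 13:29:01'),
        (3, 3, '2004-12-01 13:28:58', '2005-02-19 19:41:42'),
        (4, 1, '2018-05-23 04:25:52', NULL),
        (5, 4, '2018-03-30 21:30:28', NULL);
""")

query = """
SELECT c.CustomerName, po.POID, po.DateFilled,
       CASE WHEN po.DateFilled IS NULL THEN 'still open'
            ELSE 'filled' END AS Status
FROM ProductOrders po
INNER JOIN Customers c ON c.CustomerId = po.CustomerId
WHERE po.DateOrdered < '2018-05-01'             -- ordered before April ended
  AND (po.DateFilled IS NULL                    -- never filled, or
       OR po.DateFilled >= '2018-04-01')        -- filled on or after 1 April
ORDER BY po.POID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On the sample data this keeps orders 1, 2 and 5, and leaves out the 2004 order and the one placed in May 2018.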

SQL Server How to insert when not exist?

I have two tables, one is called Invoices and another is called Records.
CREATE TABLE Invoices
(
InvoiceNum INT NOT NULL,
Amount DECIMAL,
RecordPK UNIQUEIDENTIFIER NOT NULL
)
CREATE TABLE Records(
RecordPK UNIQUEIDENTIFIER NOT NULL PRIMARY KEY,
StartNum INT NOT NULL,
NextNum INT NOT NULL,
MaxNum INT NOT NULL,
InvPrefix VARCHAR(2) NOT NULL
)
The Records table stores the invoice start number (StartNum), how many invoices we have created (NextNum) and how many invoices we can create (MaxNum).
For example, assume we have the following rows in the two tables.
Invoice Table:
InvoiceNum Amount RecordPk
1 19.00 EDFA0541-5583-4CDD-BDFF-21D6F6504522
2 50.00 EDFA0541-5583-4CDD-BDFF-21D6F6504522
3 3.00 EDFA0541-5583-4CDD-BDFF-21D6F6504522
10 1.00 D64EFF0E-65D5-467E-8C82-BFBB6A24AAC9
11 99.00 D64EFF0E-65D5-467E-8C82-BFBB6A24AAC9
12 13.00 D64EFF0E-65D5-467E-8C82-BFBB6A24AAC9
Records Table:
RecordPk StartNum NextNum MaxNum Prefix
EDFA0541-5583-4CDD-BDFF-21D6F6504522 1 4 10 AA
D64EFF0E-65D5-467E-8C82-BFBB6A24AAC9 10 13 14 AA
My question: when I search the Invoices table for prefix AA, how can I get a result like the one below? The InvoiceNum values should run up to MaxNum, the Amount and RecordPK of rows that do not exist should be left blank, and the Remark column should be filled with 'Blank'.
InvoiceNum Amount RecordPk Remark
1 19.00 EDFA0541-5583-4CDD-BDFF-21D6F6504522
2 50.00 EDFA0541-5583-4CDD-BDFF-21D6F6504522
3 3.00 EDFA0541-5583-4CDD-BDFF-21D6F6504522
4 Blank
5 Blank
6 Blank
7 Blank
8 Blank
9 Blank
10 1.00 D64EFF0E-65D5-467E-8C82-BFBB6A24AAC9
11 99.00 D64EFF0E-65D5-467E-8C82-BFBB6A24AAC9
12 13.00 D64EFF0E-65D5-467E-8C82-BFBB6A24AAC9
13 Blank
14 Blank
You need to generate a table of numbers that covers the range you need (for each row in the Records table, from StartNum to MaxNum). You can do this, for example, by selecting from some existing table with enough rows and using the ROW_NUMBER window function. Then filter this sequence to include only the numbers you need. Left join the Invoices table to show the data for the corresponding invoice, and use the IIF function to check whether an invoice with this number exists.
declare @Invoices table(InvoiceNum INT NOT NULL, Amount DECIMAL, RecordPK UNIQUEIDENTIFIER NOT NULL)
declare @Records table(RecordPK UNIQUEIDENTIFIER NOT NULL PRIMARY KEY, StartNum INT NOT NULL, NextNum INT NOT NULL, MaxNum INT NOT NULL, InvPrefix VARCHAR(2) NOT NULL)
insert into @Invoices(InvoiceNum, Amount, RecordPk) values
(1 , 19.00, 'EDFA0541-5583-4CDD-BDFF-21D6F6504522'),
(2 , 50.00, 'EDFA0541-5583-4CDD-BDFF-21D6F6504522'),
(3 , 3.00 , 'EDFA0541-5583-4CDD-BDFF-21D6F6504522'),
(10, 1.00 , 'D64EFF0E-65D5-467E-8C82-BFBB6A24AAC9'),
(11, 99.00, 'D64EFF0E-65D5-467E-8C82-BFBB6A24AAC9'),
(12, 13.00, 'D64EFF0E-65D5-467E-8C82-BFBB6A24AAC9')
insert into @Records(RecordPk, StartNum, NextNum, MaxNum, InvPrefix) values
('EDFA0541-5583-4CDD-BDFF-21D6F6504522', 1 , 4 , 10, 'AA'),
('D64EFF0E-65D5-467E-8C82-BFBB6A24AAC9', 10, 13, 14, 'AA')
;with numbers as (select ROW_NUMBER() over(order by object_id) as No from sys.objects)
select
n.No as InvoiceNum
, inv.Amount
, inv.RecordPK
, IIF(inv.InvoiceNum is null, 'Blank', null) as Remark
from numbers n
left join @Invoices inv on n.No = inv.InvoiceNum
where exists(select * from @Records r where r.StartNum <= n.No and n.No <= r.MaxNum)
@Andrey Nikolov has it covered, however I've been working on this for the last 15 minutes so I thought I'd post it anyway.
Essentially an intermediary table should be used to count up the values you don't have, then in my version of this answer I've used a union query to generate the "Blank" value. I have not included the unique identifier for brevity but the application is the same.
if OBJECT_ID('tempdb..#invoice') is not null drop table #invoice;
if OBJECT_ID('tempdb..#rowcount') is not null drop table #rowcount;
create table #invoice
(
invoicenum int,
amount decimal
);
insert into #invoice (invoicenum, amount)
values
(1, 19.00),
(2, 50.00),
(3, 3.00),
(10, 1.00),
(11, 99.00),
(12, 13.00);
create table #rowcount
(
rownumber int
);
declare @max int = 1;
-- use the highest invoice number (not the row count) so every gap is covered
select @max=max(invoicenum) from #invoice;
declare @runs int = 1;
while @runs<=@max
begin
insert into #rowcount (rownumber)
values (@runs);
select @runs=@runs+1;
end
select invoicenum, cast(amount as nvarchar(25)) as amount from #invoice
union
select rownumber, 'BLANK' from #rowcount r left join #invoice i on
r.rownumber=i.invoicenum where i.invoicenum is null
order by invoicenum;
drop table #invoice, #rowcount;
You need a LEFT JOIN
SELECT RC.InvoiceNum, I.Amount, I.RecordPK,
CASE WHEN I.InvoiceNum IS NULL THEN 'Blank' END Remark
FROM (VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12), (13), (14)) RC (InvoiceNum)
LEFT JOIN Invoices I
ON RC.InvoiceNum = I.InvoiceNum;
The value 1 is the StartNum and 14 is the maximum MaxNum.
I used VALUES because the numbers are known; you can also use a recursive CTE to generate the missing InvoiceNum values and then LEFT JOIN the CTE with your table.
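A minimal sketch of the recursive CTE variant, run through Python's sqlite3 as a stand-in for SQL Server (an assumption). The keys 'REC-A' and 'REC-B' are hypothetical placeholders for the GUIDs, and the bounds CTE is an illustrative helper. In T-SQL the RECURSIVE keyword is omitted, and recursion is limited to 100 levels by default unless OPTION (MAXRECURSION n) is set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Invoices (InvoiceNum INT, Amount REAL, RecordPK TEXT);
    CREATE TABLE Records (RecordPK TEXT PRIMARY KEY, StartNum INT,
                          NextNum INT, MaxNum INT, InvPrefix TEXT);
    INSERT INTO Invoices VALUES
        (1, 19.0, 'REC-A'), (2, 50.0, 'REC-A'), (3, 3.0, 'REC-A'),
        (10, 1.0, 'REC-B'), (11, 99.0, 'REC-B'), (12, 13.0, 'REC-B');
    INSERT INTO Records VALUES
        ('REC-A', 1, 4, 10, 'AA'),
        ('REC-B', 10, 13, 14, 'AA');
""")

query = """
WITH RECURSIVE
bounds(lo, hi) AS (
    -- Lowest StartNum and highest MaxNum for the requested prefix.
    SELECT MIN(StartNum), MAX(MaxNum) FROM Records WHERE InvPrefix = 'AA'
),
numbers(n) AS (
    -- Count from lo up to hi, one row per invoice number.
    SELECT lo FROM bounds
    UNION ALL
    SELECT n + 1 FROM numbers, bounds WHERE n < hi
)
SELECT numbers.n AS InvoiceNum, i.Amount, i.RecordPK,
       CASE WHEN i.InvoiceNum IS NULL THEN 'Blank' END AS Remark
FROM numbers
LEFT JOIN Invoices i ON i.InvoiceNum = numbers.n
ORDER BY numbers.n
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Taking InvoiceNum from the generated sequence rather than from Invoices is what keeps the number visible on the rows that have no invoice.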
I would do it this way:
IF OBJECT_ID('tempdb..#Invoices') IS NOT NULL DROP TABLE #Invoices
CREATE TABLE #Invoices
(
InvoiceNum INT NOT NULL,
Amount DECIMAL,
RecordPK UNIQUEIDENTIFIER NOT NULL
)
IF OBJECT_ID('tempdb..#Records') IS NOT NULL DROP TABLE #Records
CREATE TABLE #Records(
RecordPK UNIQUEIDENTIFIER NOT NULL PRIMARY KEY,
StartNum INT NOT NULL,
NextNum INT NOT NULL,
MaxNum INT NOT NULL,
InvPrefix VARCHAR(2) NOT NULL
)
INSERT INTO #Invoices
SELECT 1, 19.00, 'EDFA0541-5583-4CDD-BDFF-21D6F6504522'
UNION SELECT 2 , 50.00, 'EDFA0541-5583-4CDD-BDFF-21D6F6504522'
UNION SELECT 3 , 3.00 , 'EDFA0541-5583-4CDD-BDFF-21D6F6504522'
UNION SELECT 10 , 1.00 , 'D64EFF0E-65D5-467E-8C82-BFBB6A24AAC9'
UNION SELECT 11 , 99.00, 'D64EFF0E-65D5-467E-8C82-BFBB6A24AAC9'
UNION SELECT 12 , 13.00, 'D64EFF0E-65D5-467E-8C82-BFBB6A24AAC9'
INSERT INTO #Records
SELECT 'EDFA0541-5583-4CDD-BDFF-21D6F6504522', 1, 4, 10, 'AA'
UNION SELECT 'D64EFF0E-65D5-467E-8C82-BFBB6A24AAC9', 10, 13, 14, 'AA'
DECLARE @MAX_NUM INT = (SELECT MAX(MaxNum) FROM #Records)
DECLARE @TEMP_INV TABLE (InvoiceNum INT)
INSERT INTO @TEMP_INV
SELECT Num
FROM
(
SELECT ROW_NUMBER() OVER(ORDER BY object_id) AS Num FROM sys.objects
) A
WHERE Num <= @MAX_NUM
IF OBJECT_ID('tempdb..#TEMP') IS NOT NULL DROP TABLE #TEMP
SELECT I.InvoiceNum, I.Amount, I.RecordPK
INTO #TEMP
FROM #Invoices I
INNER JOIN #Records R
ON I.RecordPK = R.RecordPK
WHERE R.InvPrefix = 'AA'
SELECT A.InvoiceNum, B.Amount, B.RecordPK, CASE WHEN B.InvoiceNum IS NULL THEN 'BLANK' END AS Remark
FROM @TEMP_INV A
LEFT JOIN #TEMP B
ON A.InvoiceNum = B.InvoiceNum

SQL Server: select newest rows whose sum matches a value

Here is a table...
ID QTY DATE CURRENT_STOCK
----------------------------------
1 1 Jan 30
2 1 Feb 30
3 2 Mar 30
4 6 Apr 30
5 8 May 30
6 21 Jun 30
I need to return the newest rows whose summed qty equal or exceed the current stock level, excluding any additional rows once this total has been reached, so I am expecting to see just these rows...
ID QTY DATE CURRENT_STOCK
----------------------------------
4 6 Apr 30
5 8 May 30
6 21 Jun 30
I am assuming I need a CTE (Common Table Expression) and have looked at this question but cannot see how to translate that to my requirement.
Help!?
Declare @YourTable table (ID int,QTY int,DATE varchar(25), CURRENT_STOCK int)
Insert Into @YourTable values
(1 ,1 ,'Jan' ,30),
(2 ,1 ,'Feb' ,30),
(3 ,2 ,'Mar' ,30),
(4 ,6 ,'Apr' ,30),
(5 ,8 ,'May' ,30),
(6 ,21 ,'Jun' ,30)
Select A.*
From @YourTable A
Where ID>= (
Select LastID=max(ID)
From @YourTable A
Cross Apply (Select RT = sum(Qty) from @YourTable where ID>=A.ID) B
Where B.RT>=CURRENT_STOCK
)
Returns
ID QTY DATE CURRENT_STOCK
4 6 Apr 30
5 8 May 30
6 21 Jun 30
One way to do it with your provided data set
if object_id('tempdb..#Test') is not null drop table #Test
create table #Test (ID int, QTY int, Date_Month nvarchar(5), CURRENT_STOCK int)
insert into #Test (ID, QTY, Date_Month, CURRENT_STOCK)
values
(1, 1, 'Jan', 30),
(2, 1, 'Feb', 30),
(3, 2, 'Mar', 30),
(4, 6, 'Apr', 30),
(5, 8, 'May', 30),
(6, 21, 'Jun', 30)
if object_id('tempdb..#Finish') is not null drop table #Finish
create table #Finish (ID int, QTY int, Date_Month nvarchar(5), CURRENT_STOCK int)
declare @rows int = (select MAX(ID) from #Test)
declare @stock int = (select MAX(CURRENT_STOCK) from #Test)
declare @i int = 1
declare @Sum int = 0
while @rows > @i
BEGIN
select @Sum = @Sum + QTY from #Test where ID = @rows
IF (@Sum >= @stock)
BEGIN
set @i = @rows + 1 -- to exit loop
END
insert into #Finish (ID, QTY, Date_Month, CURRENT_STOCK)
select ID, QTY, Date_Month, CURRENT_STOCK from #Test where ID = @rows
set @rows = @rows - 1
END
select * from #Finish
Setup Test Data
-- Setup test data
CREATE TABLE #Stock
([ID] int, [QTY] int, [DATE] varchar(3), [CURRENT_STOCK] int)
;
INSERT INTO #Stock
([ID], [QTY], [DATE], [CURRENT_STOCK])
VALUES
(1, 1, 'Jan', 30),
(2, 1, 'Feb', 30),
(3, 2, 'Mar', 30),
(4, 6, 'Apr', 30),
(5, 8, 'May', 30),
(6, 21, 'Jun', 30)
;
Solution for SQL Server 2012+
If you have a more recent version of SQL Server which supports full window function syntax, you can do it like this:
-- Calculate a running total of qty by Id descending
;WITH stock AS (
SELECT *
-- This calculates the SUM over a 'window' of rows based on the first
-- row in the result set through the current row, as specified by the
-- ORDER BY clause
,SUM(qty) OVER(ORDER BY Id DESC
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW) AS TotalQty
FROM #Stock
),
-- Identify first row in minimum set that matches or exceeds CURRENT_STOCK
first_in_set AS (
SELECT TOP 1 *
FROM stock
WHERE TotalQty >= CURRENT_STOCK
-- TOP 1 needs an ORDER BY to be deterministic: take the newest qualifying row
ORDER BY Id DESC
)
-- Fetch matching set
SELECT *
FROM #stock
WHERE Id >= (SELECT Id FROM first_in_set)
Solution for SQL Server 2008
For SQL Server 2008, which only has basic support for window functions, you can calculate the running total using CROSS APPLY:
-- Calculate a running total of qty by Id descending
;WITH stock AS (
SELECT *
-- This window function numbers the rows in descending order by Id,
-- so sort_order = 1 is the newest row that reaches CURRENT_STOCK
,ROW_NUMBER() OVER(ORDER BY Id DESC) AS sort_order
FROM #Stock s1
-- CROSS APPLY 'applies' the query (or UDF) to every row in a result set
-- This CROSS APPLY query produces a 'running total'
CROSS APPLY (
SELECT SUM(Qty) AS TotalQty
FROM #Stock s2
WHERE s2.Id >= s1.id
) total_calc
WHERE TotalQty >= s1.CURRENT_STOCK
),
-- Identify first row in minimum set that matches or exceeds CURRENT_STOCK
first_in_set AS (
SELECT TOP 1 Id
FROM stock
WHERE sort_order = 1
)
-- Fetch matching set
SELECT *
FROM #stock
WHERE Id >= (SELECT Id
FROM first_in_set)

Returning a set of the most recent rows from a table

I'm trying to retrieve the latest set of rows from a source table containing a foreign key, a date and other fields present. A sample set of data could be:
create table #tmp (primaryId int, foreignKeyId int, startDate datetime,
otherfield varchar(50))
insert into #tmp values (1, 1, '1 jan 2010', 'test 1')
insert into #tmp values (2, 1, '1 jan 2011', 'test 2')
insert into #tmp values (3, 2, '1 jan 2013', 'test 3')
insert into #tmp values (4, 2, '1 jan 2012', 'test 4')
The form of data that I'm hoping to retrieve is:
foreignKeyId maxStartDate otherfield
------------ ----------------------- -------------------------------------------
1 2011-01-01 00:00:00.000 test 2
2 2013-01-01 00:00:00.000 test 3
That is, just one row per foreignKeyId showing the latest start date and associated other fields - the primaryId is irrelevant.
I've managed to come up with:
select t.foreignKeyId, t.startDate, t.otherField from #tmp t
inner join (
select foreignKeyId, max(startDate) as maxStartDate
from #tmp
group by foreignKeyId
) s
on t.foreignKeyId = s.foreignKeyId and s.maxStartDate = t.startDate
but (a) this uses inner queries, which I suspect may lead to performance issues, and (b) it gives repeated rows if two rows in the original table have the same foreignKeyId and startDate.
Is there a query that will return just the first match for each foreign key and start date?
Depending on your sql server version, try the following:
select *
from (
select *, rnum = ROW_NUMBER() over (
partition by #tmp.foreignKeyId
order by #tmp.startDate desc)
from #tmp
) t
where t.rnum = 1
If you wanted to fix your attempt as opposed to re-engineering it then
select t.foreignKeyId, t.startDate, t.otherField from #tmp t
inner join (
select foreignKeyId, max(startDate) as maxStartDate, max(PrimaryId) as Latest
from #tmp
group by foreignKeyId
) s
on t.primaryId = s.latest
would have done the job, assuming PrimaryID increases over time.
Qualms about inner query would have been laid to rest as well assuming some indexes.

SQL count exposure of life time by age

(Using SQL Server 2008)
I need some help visualizing a solution. Let's say I have the following simple table for members of a pension scheme:
[Date of Birth] [Date Joined] [Date Left]
1970/06/1 2003/01/01 2007/03/01
I need to calculate the number of lives in each age group from 2000 to 2009.
NOTE: "Age" is defined as "age last birthday" (or "ALB") on 1 January of each of those years, e.g. if you are exactly 41.35 or 41.77 etc. years old on 1/1/2009 then you would be ALB 41.
So if the record above were the only entry in the database, then the output would be something like:
[Year] [Age ] [Number of Lives]
2003 32 1
2004 33 1
2005 34 1
2006 35 1
2007 36 1
(For 2000, 2001, 2002, 2008 and 2009 there are no lives on file since the sole member only joined on 1/1/2003 and left on 1/3/2007)
I hope I am making myself clear enough.
Anyone have any suggestions?
Thanks, Karl
[EDIT]
Adding another layer to the problem:
What if I had:
[Date of Birth] [Date Joined] [Date Left] [Gender] [Pension Value]
1970/06/1 2003/01/01 2007/03/01 'M' 100,000
and I want the output to be:
[Year] [Age ] [Gender] sum([Pension Value]) [Number of Lives]
2003 32 M 100,000 1
2004 33 M 100,000 1
2005 34 M 100,000 1
2006 35 M 100,000 1
2007 36 M 100,000 1
Any ideas?
WITH years AS
(
SELECT 1900 AS y
UNION ALL
SELECT y + 1
FROM years
WHERE y < YEAR(GETDATE())
),
agg AS
(
SELECT YEAR(Dob) AS Yob, YEAR(DJoined) AS YJoined, YEAR(DLeft) AS YLeft
FROM mytable
)
SELECT y, y - Yob, COUNT(*)
FROM agg
JOIN years
ON y BETWEEN YJoined AND YLeft
GROUP BY
y, y - Yob
OPTION (MAXRECURSION 0)
People born in the same year always have the same age in your model.
That's why, if they are counted at all, they always fall into one group, and you just need to generate one row per year for the period they stay in the program.
You can try something like this
DECLARE @Table TABLE(
[Date of Birth] DATETIME,
[Date Joined] DATETIME,
[Date Left] DATETIME
)
INSERT INTO @Table ([Date of Birth],[Date Joined],[Date Left]) SELECT '01 Jun 1970', '01 Jan 2003', '01 Mar 2007'
INSERT INTO @Table ([Date of Birth],[Date Joined],[Date Left]) SELECT '01 Jun 1979', '01 Jan 2002', '01 Mar 2008'
DECLARE @StartYear INT,
@EndYear INT
SELECT @StartYear = 2000,
@EndYear = 2009
;WITH sel AS(
SELECT @StartYear YearVal
UNION ALL
SELECT YearVal + 1
FROM sel
WHERE YearVal < @EndYear
)
SELECT YearVal AS [Year],
COUNT(Age) [Number of Lives]
FROM (
SELECT YearVal,
YearVal - DATEPART(yy, [Date of Birth]) - 1 Age
FROM sel LEFT JOIN
@Table ON DATEPART(yy, [Date Joined]) <= sel.YearVal
AND DATEPART(yy, [Date Left]) >= sel.YearVal
) Sub
GROUP BY YearVal
Try the following sample query
SET NOCOUNT ON
Declare @PersonTable as Table
(
PersonId Integer,
DateofBirth DateTime,
DateJoined DateTime,
DateLeft DateTime
)
INSERT INTO @PersonTable Values
(1, '1970/06/10', '2003/01/01', '2007/03/01'),
(1, '1970/07/11', '2003/01/01', '2007/03/01'),
(1, '1970/03/12', '2003/01/01', '2007/03/01'),
(1, '1973/07/13', '2003/01/01', '2007/03/01'),
(1, '1972/06/14', '2003/01/01', '2007/03/01')
Declare @YearTable as Table
(
YearId Integer,
StartOfYear DateTime
)
insert into @YearTable Values
(1, '1/1/2000'),
(1, '1/1/2001'),
(1, '1/1/2002'),
(1, '1/1/2003'),
(1, '1/1/2004'),
(1, '1/1/2005'),
(1, '1/1/2006'),
(1, '1/1/2007'),
(1, '1/1/2008'),
(1, '1/1/2009')
;WITH AgeTable AS
(
select StartOfYear, DATEDIFF (YYYY, DateOfBirth, StartOfYear) Age
from @PersonTable
Cross join @YearTable
)
SELECT StartOfYear, Age, COUNT (1) NumIndividuals
FROM AgeTable
GROUP BY StartOfYear, Age
ORDER BY StartOfYear, Age
First some preparation to have something to test with:
CREATE TABLE People (
ID int PRIMARY KEY
,[Name] varchar(50)
,DateOfBirth datetime
,DateJoined datetime
,DateLeft datetime
)
go
-- some data to test with
INSERT INTO dbo.People
VALUES
(1, 'Bob', '1961-04-02', '1999-01-01', '2007-05-07')
,(2, 'Sadra', '1960-07-11', '1999-01-01', '2008-05-07')
,(3, 'Joe', '1961-09-25', '1999-01-01', '2009-02-11')
go
-- helper table to hold years
CREATE TABLE dimYear (
CalendarYear int PRIMARY KEY
)
go
-- fill-in years for report
DECLARE
@yr int
,@StartYear int
,@EndYear int
SET @StartYear = 2000
SET @EndYear = 2009
SET @yr = @StartYear
WHILE @yr <= @EndYear
BEGIN
INSERT INTO dimYear (CalendarYear) values(@yr)
SET @yr =@yr+1
END
-- show test data and year tables
select * from dbo.People
select * from dbo.dimYear
go
Then a function to return person's age for each year, if the person is still an active member.
-- returns [CalendarYear], [Age] for a member, if still active member in that year
CREATE FUNCTION dbo.MemberAge(@DateOfBirth datetime, @DateLeft datetime)
RETURNS TABLE
AS
RETURN (
SELECT
CalendarYear,
CASE
WHEN DATEDIFF(dd, cast(CalendarYear AS varchar(4)) + '-01-01',@DateLeft) > 0
THEN DATEDIFF(yy, @DateOfBirth, cast(CalendarYear AS varchar(4)) + '-01-01')
ELSE -1
END AS Age
FROM dimYear
);
go
And the final query:
SELECT
a.CalendarYear AS "Year"
,a.Age AS "Age"
,count(*) AS "Number Of Lives"
FROM
dbo.People AS p
CROSS APPLY dbo.MemberAge(p.DateOfBirth, p.DateLeft) AS a
WHERE a.Age > 0
GROUP BY a.CalendarYear, a.Age
Deal with this in pieces (some random thoughts). Create views to test your dev steps if you can:
1. ALB: write a query that, for a given year, gives you each member's ALB.
2. Member in year: another bit of query that tells you whether a member was a member in a given year.
3. Put those two together and you should be able to create a query that says whether a person was a member in a given year and what their ALB was for that year.
4. Hmm, tricky. Following this chain of thought, what you'd then want to do is generate a table that has all the years the person was a member and their ALB in that year (and a unique id).
5. From 4: select year, alb, count(id) group by year, alb.
6. I'm not sure I'm going in the right direction from about 3, though it should work.
You may find a (temporary) table of years helpful; joining things to a table of dates makes all kinds of things possible.
Not really an answer, but certainly some direction...
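The steps above can be sketched as follows, using Python's sqlite3 as a stand-in for SQL Server (an assumption; the date arithmetic would be written differently in T-SQL). The Members table, its column names and the ISO date strings are hypothetical. A member counts for a year if they had joined on or before 1 January and left after it, and ALB is the year difference minus one unless the birthday falls on 1 January.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Members (DateOfBirth TEXT, DateJoined TEXT, DateLeft TEXT);
    INSERT INTO Members VALUES ('1970-06-01', '2003-01-01', '2007-03-01');
""")

query = """
WITH RECURSIVE years(y) AS (
    -- Helper table of report years, 2000 to 2009.
    SELECT 2000
    UNION ALL
    SELECT y + 1 FROM years WHERE y < 2009
),
member_years AS (
    -- One row per member per year in which they were a member on 1 January,
    -- with their age last birthday (ALB) on that date.
    SELECT y,
           y - CAST(substr(DateOfBirth, 1, 4) AS INTEGER)
             - CASE WHEN substr(DateOfBirth, 6) > '01-01' THEN 1 ELSE 0 END
             AS alb
    FROM years
    JOIN Members
      ON DateJoined <= (y || '-01-01')
     AND DateLeft   >  (y || '-01-01')
)
SELECT y, alb, COUNT(*) AS lives
FROM member_years
GROUP BY y, alb
ORDER BY y, alb
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

For the single sample member this gives one life per year from 2003 to 2007 at ages 32 to 36, which matches the output requested in the question.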