Daily record count based on status allocation - SQL

I have a table named Books and a table named Transfer with the following structure:
CREATE TABLE Books
(
BookID int,
Title varchar(150),
PurchaseDate date,
Bookstore varchar(150),
City varchar(150)
);
INSERT INTO Books VALUES (1, 'Cujo', '2022-02-01', 'CentralPark1', 'New York');
INSERT INTO Books VALUES (2, 'The Hotel New Hampshire', '2022-01-08', 'TheStrip1', 'Las Vegas');
INSERT INTO Books VALUES (3, 'Gorky Park', '2022-05-19', 'CentralPark2', 'New York');
CREATE TABLE Transfer
(
BookID int,
BookStatus varchar(50),
TransferDate date
);
INSERT INTO Transfer VALUES (1, 'Rented', '2022-11-01');
INSERT INTO Transfer VALUES (1, 'Returned', '2022-11-05');
INSERT INTO Transfer VALUES (1, 'Rented', '2022-11-06');
INSERT INTO Transfer VALUES (1, 'Returned', '2022-11-09');
INSERT INTO Transfer VALUES (2, 'Rented', '2022-11-03');
INSERT INTO Transfer VALUES (2, 'Returned', '2022-11-09');
INSERT INTO Transfer VALUES (2, 'Rented', '2022-11-15');
INSERT INTO Transfer VALUES (2, 'Returned', '2022-11-23');
INSERT INTO Transfer VALUES (3, 'Rented', '2022-11-14');
INSERT INTO Transfer VALUES (3, 'Returned', '2022-11-21');
INSERT INTO Transfer VALUES (3, 'Rented', '2022-11-25');
INSERT INTO Transfer VALUES (3, 'Returned', '2022-11-29');
I want to write a query for a date interval (here 01.11 to 09.11) that returns the book count for each day, based on BookStatus from Transfer, like so:
+────────────+────────+────────+────────+────────+────────+────────+────────+────────+────────+
| Status | 01.11 | 02.11 | 03.11 | 04.11 | 05.11 | 06.11 | 07.11 | 08.11 | 09.11 |
+────────────+────────+────────+────────+────────+────────+────────+────────+────────+────────+
| Rented | 2 | 1 | 2 | 2 | 0 | 2 | 3 | 3 | 1 |
+────────────+────────+────────+────────+────────+────────+────────+────────+────────+────────+
| Returned | 1 | 2 | 1 | 1 | 3 | 1 | 0 | 0 | 2 |
+────────────+────────+────────+────────+────────+────────+────────+────────+────────+────────+
A book remains 'Rented' until it is returned, and is counted as 'Returned' every day until it is rented out again.
This is what the query result would look like for one book (BookID 1):

I see two possible solutions.
Dynamic solution
Use a (recursive) common table expression to generate a list of all the dates that fall within the requested range.
Use two cross apply statements that each perform a count() aggregation to count the number of book transfers.
-- generate date range
with Dates as
(
select convert(date, '2022-11-01') as TransferDate
union all
select dateadd(day, 1, d.TransferDate)
from Dates d
where d.TransferDate < '2022-11-10'
)
select d.TransferDate,
c1.CountRented,
c2.CountReturned
from Dates d
-- count all rented books up till today, that have not been returned before today
cross apply ( select count(1) as CountRented
from Transfer t1
where t1.BookStatus = 'Rented'
and t1.TransferDate <= d.TransferDate
and not exists ( select 'x'
from Transfer t2
where t2.BookId = t1.BookId
and t2.BookStatus = 'Returned'
and t2.TransferDate > t1.TransferDate
and t2.TransferDate <= d.TransferDate ) ) c1
-- count all returned books for today
cross apply ( select count(1) as CountReturned
from Transfer t1
where t1.BookStatus = 'Returned'
and t1.TransferDate = d.TransferDate ) c2;
Result:
TransferDate CountRented CountReturned
------------ ----------- -------------
2022-11-01 1 0
2022-11-02 1 0
2022-11-03 2 0
2022-11-04 2 0
2022-11-05 1 1
2022-11-06 2 0
2022-11-07 2 0
2022-11-08 2 0
2022-11-09 0 2
2022-11-10 0 0
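For anyone who wants to verify this counting logic without a SQL Server instance at hand, here is a sketch using Python's sqlite3 module (an assumption on my part, the question targets SQL Server): SQLite has no CROSS APPLY, so the two counts are rewritten as correlated scalar subqueries, which is equivalent here.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Transfer (BookID int, BookStatus varchar(50), TransferDate date);
INSERT INTO Transfer VALUES
 (1,'Rented','2022-11-01'),(1,'Returned','2022-11-05'),
 (1,'Rented','2022-11-06'),(1,'Returned','2022-11-09'),
 (2,'Rented','2022-11-03'),(2,'Returned','2022-11-09'),
 (2,'Rented','2022-11-15'),(2,'Returned','2022-11-23'),
 (3,'Rented','2022-11-14'),(3,'Returned','2022-11-21'),
 (3,'Rented','2022-11-25'),(3,'Returned','2022-11-29');
""")
rows = con.execute("""
WITH RECURSIVE Dates(TransferDate) AS (
    SELECT '2022-11-01'
    UNION ALL
    SELECT date(TransferDate, '+1 day') FROM Dates
    WHERE TransferDate < '2022-11-10'
)
SELECT d.TransferDate,
       -- rented up till today, and not returned in between
       (SELECT count(*) FROM Transfer t1
        WHERE t1.BookStatus = 'Rented'
          AND t1.TransferDate <= d.TransferDate
          AND NOT EXISTS (SELECT 1 FROM Transfer t2
                          WHERE t2.BookID = t1.BookID
                            AND t2.BookStatus = 'Returned'
                            AND t2.TransferDate > t1.TransferDate
                            AND t2.TransferDate <= d.TransferDate)) AS CountRented,
       -- returned exactly today
       (SELECT count(*) FROM Transfer t3
        WHERE t3.BookStatus = 'Returned'
          AND t3.TransferDate = d.TransferDate) AS CountReturned
FROM Dates d
ORDER BY d.TransferDate
""").fetchall()
for r in rows:
    print(r)
```

Running it reproduces the ten rows of the table above.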
This result is not the pivoted outcome described in the question. However, pivoting this dynamic solution requires dynamic SQL, which is not trivial!
Static solution
This delivers the exact outcome described in the question (including the date formatting), but requires the date range to be typed out in full once.
The essential building blocks are similar to the dynamic solution above:
A recursive common table expression to generate a date range.
Two cross apply clauses to perform the counting calculations as before.
There is also:
An extra cross join to duplicate the date range for each BookStatus (avoiding NULL values in the result).
Some replace(), str() and datepart() functions to format the dates.
A case expression to merge the two counts to a single column.
The solution is probably not the most performant, but it does deliver the requested result. If you want to validate for BookID = 1, just uncomment the extra WHERE filter clauses.
with Dates as
(
select convert(date, '2022-11-01') as TransferDate
union all
select dateadd(day, 1, d.TransferDate)
from Dates d
where d.TransferDate < '2022-11-10'
),
PivotInput as
(
select replace(str(datepart(day, d.TransferDate), 2), space(1), '0') + '.' + replace(str(datepart(month, d.TransferDate), 2), space(1), '0') as TransferDate,
s.BookStatus as [Status],
case when s.BookStatus = 'Rented' then sc1.CountRented else sc2.CountReturned end as BookStatusCount
from Dates d
cross join (values('Rented'), ('Returned')) s(BookStatus)
cross apply ( select count(1) as CountRented
from Transfer t1
where t1.BookStatus = s.BookStatus
and t1.TransferDate <= d.TransferDate
--and t1.BookID = 1
and not exists ( select 'x'
from Transfer t2
where t2.BookId = t1.BookId
and t2.BookStatus = 'Returned'
and t2.TransferDate > t1.TransferDate
and t2.TransferDate <= d.TransferDate ) ) sc1
cross apply ( select count(1) as CountReturned
from Transfer t3
where t3.TransferDate = d.TransferDate
--and t3.BookID = 1
and t3.BookStatus = 'Returned' ) sc2
)
select piv.*
from PivotInput pivi
pivot (sum(pivi.BookStatusCount) for pivi.TransferDate in (
[01.11],
[02.11],
[03.11],
[04.11],
[05.11],
[06.11],
[07.11],
[08.11],
[09.11],
[10.11])) piv;
Result:
Status 01.11 02.11 03.11 04.11 05.11 06.11 07.11 08.11 09.11 10.11
Rented 1 1 2 2 1 2 2 2 0 0
Returned 0 0 0 0 1 0 0 0 2 0

Related

SQL SERVER 2017 - How do I query to retrieve a group of data only if all of the data inside that group are marked as completed?

I have some observation tables like below. The observation data might be in individual form or grouped form which is determined by the observation category table.
cat table (which holds category data)
id | title | is_groupable
-------------------------------------------
1 | Cat 1 | 1
2 | Cat 2 | 1
3 | Cat 3 | 0
4 | Cat 4 | 0
5 | Cat 5 | 1
obs table (holds observation data; groupable rows are indicated by is_groupable in the cat table, the data is grouped by the index column of obs, and the is_completed field indicates whether some action has been taken on that row)
id | cat_id | index | is_completed | created_at
------------------------------------------------------
1 | 3 | 100 | 0 | 2017-12-01
2 | 4 | 400 | 1 | 2017-12-02
// complete action taken group indicated by 1 in is_completed field
3 | 1 | 200 | 1 | 2017-12-1
4 | 1 | 200 | 1 | 2017-12-1
// not complete action taken group
5 | 2 | 300 | 0 | 2017-12-1
6 | 2 | 300 | 1 | 2017-12-1
7 | 2 | 300 | 0 | 2017-12-1
// complete action taken group
8 | 5 | 400 | 1 | 2017-12-1
9 | 5 | 400 | 1 | 2017-12-1
10 | 5 | 400 | 1 | 2017-12-1
For ease of understanding, I have separated the sets of data as completed or not using the comments above in the obs table.
Now what I want to achieve is to retrieve the data in group format from the obs table. In the above case the groups are
{3,4}
{5,6,7}
{8,9,10}
I want to get the sets {3,4} and {8,9,10} in my result, since every row in those groups is flagged is_completed: 1.
I don't need the {5,6,7} set, because only 6 is flagged as completed; 5 and 7 have had no action taken and hence are not completed.
What I have done till now is
(Let's ignore the individual case, because it is very easy and already done. For the group case, I'm able to retrieve the group data if I ignore the action-taken condition, i.e. I can group them and retrieve the sets irrespective of whether action was taken.)
(SELECT
null AS id,
cat.is_groupable AS is_grouped,
cat.title,
cat.id AS category_id,
o.index,
o.date,
null AS created_at,
null AS is_action_taken,
(
-- individual observation
SELECT
oi.id AS "observation.id",
oi.category_id AS "observation.category_id",
oi.index AS "observation.index",
oi.created_at AS "observation.created_at",
-- action taken flag according to is_completed
CAST(
CASE
WHEN ((oi.is_completed) > 0) THEN 1
ELSE 0
END AS BIT
) AS "observation.is_action_taken",
-- we might do some sort of comparison here
CAST(
(
CASE
--
-- Check if total count == completed count
WHEN (
SELECT COUNT(obs.id)
FROM obs
WHERE obs.category_id = cat.id AND obs.index = o.index
) = (
SELECT COUNT(obs.id)
FROM obs
WHERE obs.category_id = cat.id AND obs.index = o.index
AND obs.is_completed = 1
) then 1
else 0
end
) as bit
) as all_completed_in_group
FROM obs oi
WHERE oi.category_id = cat.id
AND oi.index = o.index
FOR JSON PATH
) AS observations
FROM obs o
INNER JOIN cat ON cat.id = o.category_id
WHERE cat.is_groupable = 1
GROUP BY cat.id, cat.title, o.index, cat.is_groupable, o.created_at
)
Let's not dwell on whether this query executes successfully or not. I just want ideas: is there a better approach than this one, and is this approach correct?
Hopefully this is what you need. To check group completeness I just used AND NOT EXISTS for the group having an is_completed = '0' row. The inner join is used to get the corresponding obs ids. The algorithm is put in a CTE (common table expression); then I use STUFF on the CTE to get the output.
DECLARE @cat TABLE (id int, title varchar(100), is_groupable bit)
INSERT INTO @cat VALUES
(1, 'Cat 1', 1), (2, 'Cat 2', 1), (3, 'Cat 3', 0), (4, 'Cat 4', 0), (5, 'Cat 5', 1)
DECLARE @obs TABLE (id int, cat_id int, [index] int, is_completed bit, created_at date)
INSERT INTO @obs VALUES
(1, 3, 100, 0, '2017-12-01'), (2, 4, 400, 1, '2017-12-02')
-- complete action taken group indicated by 1 in is_completed field
,(3, 1, 200, 1, '2017-12-01'), (4, 1, 200, 1, '2017-12-01')
-- not complete action taken group
,(5, 2, 300, 0, '2017-12-01'), (6, 2, 300, 1, '2017-12-01'), (7, 2, 300, 0, '2017-12-01')
-- complete action taken group
,(8, 5, 400, 1, '2017-12-01'), (9, 5, 400, 1, '2017-12-01'), (10, 5, 400, 1, '2017-12-01')
;
WITH cte AS
(
SELECT C.id [cat_id]
,O2.id [obs_id]
FROM @cat C INNER JOIN @obs O2 ON C.id = O2.cat_id
WHERE C.is_groupable = 1 --is a group
AND NOT EXISTS (SELECT *
FROM @obs O
WHERE O.cat_id = C.id
AND O.is_completed = '0'
--everything in the group is_completed
)
)
)
--Stuff will put everything on one row
SELECT DISTINCT
'{'
+ STUFF((SELECT ',' + CAST(C2.obs_id as varchar)
FROM cte C2
WHERE C2.cat_id = C1.cat_id
FOR XML PATH('')),1,1,'')
+ '}' AS returnvals
FROM cte C1
Produces output:
returnvals
{3,4}
{8,9,10}
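The NOT EXISTS idea can also be checked outside SQL Server. Here is a small sketch with Python's sqlite3 (my assumption); since STUFF ... FOR XML PATH is SQL Server specific, the ids are stitched into the braces in Python instead, which also makes the order deterministic:

```python
import sqlite3
from itertools import groupby

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE cat (id int, title text, is_groupable int);
INSERT INTO cat VALUES
 (1,'Cat 1',1),(2,'Cat 2',1),(3,'Cat 3',0),(4,'Cat 4',0),(5,'Cat 5',1);
CREATE TABLE obs (id int, cat_id int, "index" int, is_completed int, created_at text);
INSERT INTO obs VALUES
 (1,3,100,0,'2017-12-01'),(2,4,400,1,'2017-12-02'),
 (3,1,200,1,'2017-12-01'),(4,1,200,1,'2017-12-01'),
 (5,2,300,0,'2017-12-01'),(6,2,300,1,'2017-12-01'),(7,2,300,0,'2017-12-01'),
 (8,5,400,1,'2017-12-01'),(9,5,400,1,'2017-12-01'),(10,5,400,1,'2017-12-01');
""")
# keep only groupable categories where no row in the group is incomplete
rows = con.execute("""
SELECT c.id, o.id
FROM cat c JOIN obs o ON o.cat_id = c.id
WHERE c.is_groupable = 1
  AND NOT EXISTS (SELECT 1 FROM obs x
                  WHERE x.cat_id = c.id AND x.is_completed = 0)
ORDER BY c.id, o.id
""").fetchall()
groups = ['{' + ','.join(str(oid) for _, oid in grp) + '}'
          for _, grp in groupby(rows, key=lambda r: r[0])]
print(groups)
```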
I would try this approach; the trick is checking the sum of completed rows against the total present for the group.
;WITH aux AS (
SELECT o.cat_id, COUNT(o.id) Tot, SUM(CONVERT(int, o.is_completed)) Compl, MIN(CONVERT(int, c.is_groupable)) is_groupable
FROM obs o INNER JOIN cat c ON o.cat_id = c.id
GROUP BY o.cat_id
)
, res AS (
SELECT o.*, a.is_groupable
FROM obs o INNER JOIN aux a ON o.cat_id = a.cat_id
WHERE (a.Tot = a.Compl AND a.is_groupable = 1) OR a.Tot = 1
)
SELECT CONVERT(nvarchar(10), id) id, CONVERT(nvarchar(10), cat_id) cat_id
INTO #res
FROM res
SELECT * FROM #res
SELECT m.cat_id, LEFT(m.results,Len(m.results)-1) AS DataGroup
FROM
(
SELECT DISTINCT r2.cat_id,
(
SELECT r1.id + ',' AS [text()]
FROM #res r1
WHERE r1.cat_id = r2.cat_id
ORDER BY r1.cat_id
FOR XML PATH ('')
) results
FROM #res r2
) m
ORDER BY DataGroup
DROP TABLE #res
The conversions are needed because of the bit datatype.
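The sum-vs-count trick is easy to see in action with a sketch in Python's sqlite3 (an assumption on my part; there the bit columns become plain integers, so no CONVERT is needed):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE cat (id int, title text, is_groupable int);
INSERT INTO cat VALUES
 (1,'Cat 1',1),(2,'Cat 2',1),(3,'Cat 3',0),(4,'Cat 4',0),(5,'Cat 5',1);
CREATE TABLE obs (id int, cat_id int, "index" int, is_completed int, created_at text);
INSERT INTO obs VALUES
 (1,3,100,0,'2017-12-01'),(2,4,400,1,'2017-12-02'),
 (3,1,200,1,'2017-12-01'),(4,1,200,1,'2017-12-01'),
 (5,2,300,0,'2017-12-01'),(6,2,300,1,'2017-12-01'),(7,2,300,0,'2017-12-01'),
 (8,5,400,1,'2017-12-01'),(9,5,400,1,'2017-12-01'),(10,5,400,1,'2017-12-01');
""")
rows = con.execute("""
WITH aux AS (
    -- total rows vs completed rows per category
    SELECT o.cat_id, COUNT(o.id) AS Tot, SUM(o.is_completed) AS Compl,
           MIN(c.is_groupable) AS is_groupable
    FROM obs o JOIN cat c ON o.cat_id = c.id
    GROUP BY o.cat_id
)
SELECT o.id, o.cat_id
FROM obs o JOIN aux a ON o.cat_id = a.cat_id
WHERE (a.Tot = a.Compl AND a.is_groupable = 1) OR a.Tot = 1
ORDER BY o.id
""").fetchall()
print(rows)
```

As in the original answer, the OR a.Tot = 1 branch also keeps the single (non-group) observations 1 and 2.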

Check conditions on rows having the same ids in SQL Server

I have a table as below
Id | CompanyName
--- | ---
100 | IT
100 | R&D
100 | Financial
100 | Insurance
110 | IT
110 | Financial
110 | Product Based
111 | R&D
111 | IT
The table contains data with the structure above, but with thousands of ids like these.
I want to find all the ids for which all the company names are IT and R&D. If an id has any company name that is neither IT nor R&D, then don't consider that id.
E.g. id 100 cannot be in this list because it has the extra company name Financial, but id 111 will be considered because all its companies are IT and R&D.
Any help?
select id,
sum(case when CompanyName = 'IT' or CompanyName = 'R&D' then 1 else 0 end) as c1
from t
group by id
having count(*) = 2
and sum(case when CompanyName = 'IT' or CompanyName = 'R&D' then 1 else 0 end) = 2
I would do this as:
select id
from t
group by id
having sum(case when CompanyName not in ('IT', 'R&D') then 1 else 0 end) = 0 and
sum(case when CompanyName in ('IT', 'R&D') then 1 else 0 end) = 2; -- use "> 0" if you want either one.
Note: This assumes that you don't have duplicates in the table.
SELECT DISTINCT Id
FROM companies
WHERE
Id NOT IN (
SELECT DISTINCT Id
FROM companies
WHERE
CompanyName != 'IT'
AND CompanyName != 'R&D'
)
;
Subquery: Select all records - only Id, distinctly - with CompanyName different from "IT" and "R&D".
Main query: Select all records - only Id, distinctly - which are not in the prior list.
Results: 111
Tested on sqlfiddle.com with option "MS SQL Server 2014" and schema syntax:
CREATE TABLE companies
([Id] int, [CompanyName] varchar(255))
;
INSERT INTO companies
([Id], [CompanyName])
VALUES
(100, 'IT'),
(100, 'R&D'),
(100, 'Financial'),
(100, 'Insurance'),
(110, 'IT'),
(110, 'Financial'),
(110, 'Product Based'),
(111, 'R&D'),
(111, 'IT')
;
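Both approaches above can be replayed quickly with Python's sqlite3 (my assumption; this part of the SQL is portable as written):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE companies (Id int, CompanyName varchar(255));
INSERT INTO companies VALUES
 (100,'IT'),(100,'R&D'),(100,'Financial'),(100,'Insurance'),
 (110,'IT'),(110,'Financial'),(110,'Product Based'),
 (111,'R&D'),(111,'IT');
""")
# conditional aggregation: nothing besides IT/R&D, and both present
agg = con.execute("""
SELECT Id FROM companies
GROUP BY Id
HAVING SUM(CASE WHEN CompanyName NOT IN ('IT','R&D') THEN 1 ELSE 0 END) = 0
   AND SUM(CASE WHEN CompanyName IN ('IT','R&D') THEN 1 ELSE 0 END) = 2
""").fetchall()
# exclusion subquery: drop any Id that has some other company name
sub = con.execute("""
SELECT DISTINCT Id FROM companies
WHERE Id NOT IN (SELECT Id FROM companies
                 WHERE CompanyName NOT IN ('IT','R&D'))
""").fetchall()
print(agg, sub)
```

Both queries return only id 111 for this sample data.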

SQL Server retrieving multiple columns with rank 1

I will try to describe my issue as clearly as possible.
I have a dataset of 1000 unique clients, say ##temp1.
I have another dataset which holds information related to the 1000 clients from ##temp1 across the past 7 years. Let's call this dataset ##temp2. There are 6 specific columns in this second dataset (##temp2) that I am interested in; let's call them columns A, B, C, D, E, F. For context, columns A, C, E hold a year of some form (2012, 2013, 2014, ...) as data type float, and columns B, D, F hold ratings of some form (1, 2, 3, ... up to 5), also float. Both the year and rating columns have NULL values, which I have converted to 0 for now.
My eventual goal is to create a report which holds the information for the 1000 clients in ##temp1, such that each row should hold the information in the following form,
ClientID | ClientName | ColA_Latest_Year1 | ColB_Corresponding_Rating_Year_1 | ColC_Latest_Year2 | ColD_Corresponding_Rating_Year_2 | ColE_Latest_Year3 | ColF_Corresponding_Rating_Year3.
ColA_Latest_Year1 should hold the latest year for that particular client from dataset ##temp2 and ColB_Corresponding_Rating_Year_1 should hold the rating from Column B corresponding to the year pulled in from Column A. Same goes for the other columns.
The approach which I have taken so far, was,
Create ##temp1 as needed
Create ##temp2 as needed
##temp1 LEFT JOIN ##temp2 on client ids to retrieve the year and rating information for all the clients in ##temp1, and put all that information in ##temp3. There will be multiple rows for every client in ##temp3 because the data covers multiple years.
Ranked the year columns, partitioned by client_ids, and put the result in ##temp4.
What I have now is something like this,
Rnk_A | Rnk_C | Rnk_F | ColA | ColB | ColC | ColD | ColE | ColF | Client_id | Client_name
2 | 1 | 1 | 0 | 0 | 0 | 0 | 2014 | 1 | 111 | 'ABC'
1 | 2 | 1 | 2012 | 1 | 0 | 0 | 0 | 0 | 111 | 'ABC'
My goal is
Rnk_A | Rnk_C | Rnk_F | ColA | ColB | ColC | ColD | ColE | ColF | Client_id | Client_name
1 | 1 | 1 | 2012| 1 | 0 | 0 | 2014| 1 | 111 | 'ABC'
Any help is appreciated.
This answer assumes you don't have any duplicates per client in columns A, C, E. If you do have duplicates you'd need to find a way to differentiate them and make the necessary changes.
The hurdle that you've failed to overcome in your attempt (as described) is that you're trying to join from temp1 to temp2 only once for lookup information that could come from 3 distinct rows of temp2. This cannot work as you hope. You must perform separate joins for each pair [A,B] [C,D] and [E,F]. The following demonstrates a solution using CTEs to derive the lookup data for each pair.
/********* Prepare sample tables and data ***********/
declare @t1 table (
ClientId int,
ClientName varchar(50)
)
declare @t2 table (
ClientId int,
ColA datetime,
ColB float,
ColC datetime,
ColD float,
ColE datetime,
ColF float
)
insert into @t1
select 1, 'Client 1' union all
select 2, 'Client 2' union all
select 3, 'Client 3' union all
select 4, 'Client 4'
insert into @t2
select 1, '20001011', 1, '20010101', 7, '20130101', 14 union all
select 1, '20040101', 4, '20170101', 1, '20120101', 1 union all
select 1, '20051231', 0, '20020101', 15, '20110101', 1 union all
select 2, '20060101', 2, NULL, 15, '20110101', NULL union all
select 2, '20030101', 3, NULL, NULL, '20100101', 17 union all
select 3, NULL, NULL, '20170101', 42, NULL, NULL
--select * from @t1
--select * from @t2
/********* Solution ***********/
;with MaxA as (
select ROW_NUMBER() OVER (PARTITION BY t2.ClientId ORDER BY t2.ColA DESC) rn,
t2.ClientId, t2.ColA, t2.ColB
from @t2 t2
--where t2.ColA is not null and t2.ColB is not null
), MaxC as (
select ROW_NUMBER() OVER (PARTITION BY t2.ClientId ORDER BY t2.ColC DESC) rn,
t2.ClientId, t2.ColC, t2.ColD
from @t2 t2
--where t2.ColC is not null and t2.ColD is not null
), MaxE as (
select ROW_NUMBER() OVER (PARTITION BY t2.ClientId ORDER BY t2.ColE DESC) rn,
t2.ClientId, t2.ColE, t2.ColF
from @t2 t2
--where t2.ColE is not null and t2.ColF is not null
)
select t1.ClientId, t1.ClientName, a.ColA, a.ColB, c.ColC, c.ColD, e.ColE, e.ColF
from @t1 t1
left join MaxA a on
a.ClientId = t1.ClientId
and a.rn = 1
left join MaxC c on
c.ClientId = t1.ClientId
and c.rn = 1
left join MaxE e on
e.ClientId = t1.ClientId
and e.rn = 1
If you run this you may notice some peculiar results for Client 2 in columns C and F. This is because (as per your question) there may be some NULL values. ColC date is "unknown" and ColF rating is "unknown".
My solution preserves NULL values instead of converting them to zeroes. This allows you to handle them explicitly if you so choose. I commented out lines in the above query that could be used to ignore NULL dates and ratings if necessary.
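The three-ranked-lookups pattern ports almost verbatim to other engines with window functions. A sketch in Python's sqlite3 (an assumption; the dates are kept as text and the table variables become ordinary tables):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE t1 (ClientId int, ClientName text);
INSERT INTO t1 VALUES (1,'Client 1'),(2,'Client 2'),(3,'Client 3'),(4,'Client 4');
CREATE TABLE t2 (ClientId int, ColA text, ColB real, ColC text, ColD real, ColE text, ColF real);
INSERT INTO t2 VALUES
 (1,'20001011',1,'20010101',7,'20130101',14),
 (1,'20040101',4,'20170101',1,'20120101',1),
 (1,'20051231',0,'20020101',15,'20110101',1),
 (2,'20060101',2,NULL,15,'20110101',NULL),
 (2,'20030101',3,NULL,NULL,'20100101',17),
 (3,NULL,NULL,'20170101',42,NULL,NULL);
""")
rows = con.execute("""
WITH MaxA AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY ClientId ORDER BY ColA DESC) rn,
           ClientId, ColA, ColB FROM t2
), MaxC AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY ClientId ORDER BY ColC DESC) rn,
           ClientId, ColC, ColD FROM t2
), MaxE AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY ClientId ORDER BY ColE DESC) rn,
           ClientId, ColE, ColF FROM t2
)
SELECT t1.ClientId, t1.ClientName,
       a.ColA, a.ColB, c.ColC, c.ColD, e.ColE, e.ColF
FROM t1
LEFT JOIN MaxA a ON a.ClientId = t1.ClientId AND a.rn = 1
LEFT JOIN MaxC c ON c.ClientId = t1.ClientId AND c.rn = 1
LEFT JOIN MaxE e ON e.ClientId = t1.ClientId AND e.rn = 1
ORDER BY t1.ClientId
""").fetchall()
for r in rows:
    print(r)
```

Client 2's C/D pair is deliberately not checked below: both of its ColC values are NULL, so which ColD row gets rn = 1 is a tie and engine-dependent, which is exactly the "peculiar results" caveat above.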

merging multiple queries that have the same core syntax

Sorry for the vague title!
I'm building a web filter where users will be able to click on options and filter down results, using MS SQL Server 2012.
The problem I have is that I'm running 4 queries on every selection made to build the filter, and 1 for the results.
Ignoring how it could best be coded so that I don't have to reload the filter queries every time, I need help merging the 4 queries that produce the filter options and counts into 1.
The core of the syntax is the same; most of the logic goes into extracting the results based on the filtered selection. However, I still need to produce the filters with their counts.
select datePart(yy,p.startDate) year, count(p.personId) itemCount
from person p
where p.field1 = something
and p.field2 = somethingElse...
Is there a way of running the core query and producing the filter lists and counts (see below) all in one, rather than doing each individually?
Below is an example of 2 of the filters; there are others that do similar things, either producing a list from existing data or producing a list based on before, between and after certain dates.
--Filter 1 to get years and counts
with annualList as
(
select a.year
from table a
where a.year > 2000
)
select al.year, isnull(persons.itemCount, 0) itemCount
from annualList al
left join (
select datePart(yy, p.startDate) year, count(p.personId) itemCount
from person p
where p.field1 = something
and p.field2 = somethingElse...
group by datePart(yy, p.startDate)
) persons
on al.year = persons.year
order by al.year desc;
--filter 2 to get group stats
select anotherList.groupStatus, groupStatusCounts.itemCount
from (
select 'Child' as groupStatus
union all
select 'Adult' as groupStatus
union all
select 'Pensioner' as groupStatus
) anotherList
left join (
SELECT personStatus.groupStatus, count(personStatus.personId) itemCount
FROM ( select p.personId,
case when (p.age between 1 and 17) then 'Child'
when (p.age between 18 and 67) then 'Adult'
when (p.age > 65) then 'Pensioner'
end as groupStatus
FROM person p
--and some other syntax to calculate the age...
where p.field1 = something
and p.field2 = somethingElse exactly as above query...
) personStatus
GROUP BY personStatus.groupStatus
) groupStatusCounts
ON anotherList.groupStatus = groupStatusCounts.groupStatus
As an example, using the dataset below, from regDate and using a group by I will be able to get a list of years from 2010-2014 (filter 1 code above).
Using dateOfBirth I'll need to calculate the groupStatus using a case expression. As you can see from the dateOfBirth data, I don't have any records where I can identify a pensioner, hence the way I wrote filter 2 (see code above): it will give me the filters I need even if I don't have the data to represent them.
I tried to add the code on SQL Fiddle, but unfortunately it was down last night.
INSERT INTO personTable
([PersonID], [dateOfBirth], [regDate])
VALUES
(1, '1979-01-01 00:00:00', '2010-01-01 00:00:00'),
(2, '1979-01-01 00:00:00', '2010-01-01 00:00:00'),
(3, '1979-01-01 00:00:00', '2011-01-01 00:00:00'),
(4, '1979-01-01 00:00:00', '2011-01-01 00:00:00'),
(5, '1979-01-01 00:00:00', '2012-01-01 00:00:00'),
(6, '1979-01-01 00:00:00', '2012-01-01 00:00:00'),
(7, '1984-01-01 00:00:00', '2012-01-01 00:00:00'),
(8, '1992-01-01 00:00:00', '2012-01-01 00:00:00'),
(9, '2000-01-01 00:00:00', '2013-01-01 00:00:00'),
(10, '2010-01-01 00:00:00', '2014-01-01 00:00:00')
my example of SQL Fiddle
The result I want to end up with needs to look like this, based on the results shown in the Fiddle:
filter | Year | groupStatus
-----------+-----------+--------
2010 | 0 | null
2011 | 2 | null
2012 | 3 | null
2013 | 0 | null
2014 | 3 | null
child | null | 2
adult | null | 6
pensioner | null | 0
Thanking you in advance
I may have misunderstood the requirement massively, however I suspect you want to use GROUPING SETS. As a simple example (ignoring your specific filters) imagine the following simple data set:
PersonID | GroupStatus | Year
---------+-------------+--------
1 | Adult | 2013
2 | Adult | 2013
3 | Adult | 2014
4 | Adult | 2014
5 | Adult | 2014
6 | Child | 2012
7 | Child | 2014
8 | Pensioner | 2012
9 | Pensioner | 2013
10 | Pensioner | 2013
From what I gather you are trying to get 2 different summaries out of this data without repeating the query, e.g.
GroupStatus | ItemCount
------------+-----------
Adult | 5
Child | 2
Pensioner | 3
And
Year | ItemCount
------------+-----------
2012 | 2
2013 | 4
2014 | 4
You can get this in a single data set using:
SELECT Year,
GroupStatus,
ItemCount = COUNT(*)
FROM T
GROUP BY GROUPING SETS ((Year), (GroupStatus));
Which will yield a single data set:
YEAR GROUPSTATUS ITEMCOUNT
------------------------------------
NULL Adult 5
NULL Child 2
NULL Pensioner 3
2012 NULL 2
2013 NULL 4
2014 NULL 4
If you wanted to include the full data as well, you can just include a grouping set with all fields e.g. (PersonID, Year, GroupStatus).
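If your engine lacks GROUPING SETS, the same two-summary result is simply a UNION ALL of the two GROUP BY queries, which is effectively what GROUPING SETS expands to. A sketch with Python's sqlite3 (an assumption on my part, since SQLite has no GROUPING SETS):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE t (PersonID int, GroupStatus text, Year int);
INSERT INTO t VALUES
 (1,'Adult',2013),(2,'Adult',2013),(3,'Adult',2014),(4,'Adult',2014),(5,'Adult',2014),
 (6,'Child',2012),(7,'Child',2014),
 (8,'Pensioner',2012),(9,'Pensioner',2013),(10,'Pensioner',2013);
""")
# one grouping per branch; the non-grouped column is NULL, as with GROUPING SETS
rows = con.execute("""
SELECT NULL AS Year, GroupStatus, COUNT(*) AS ItemCount
FROM t GROUP BY GroupStatus
UNION ALL
SELECT Year, NULL, COUNT(*)
FROM t GROUP BY Year
ORDER BY Year, GroupStatus
""").fetchall()
for r in rows:
    print(r)
```

The six output rows match the single data set shown above.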
EDIT
The below query gives the output you have requested:
WITH AllValues AS
( SELECT TOP (DATEPART(YEAR, GETDATE()) - 2009)
Filter = CAST(2009 + ROW_NUMBER() OVER(ORDER BY object_id) AS VARCHAR(15)),
FilterType = 'Year'
FROM sys.all_objects o
UNION ALL
SELECT Filter, 'GroupStatus'
FROM (VALUES ('child'), ('Adult'), ('Pensioner')) T (Filter)
)
SELECT v.Filter,
[Year] = CASE WHEN v.FilterType = 'Year' THEN ISNULL(data.ItemCount, 0) END,
GroupStatus = CASE WHEN v.FilterType = 'GroupStatus' THEN ISNULL(data.ItemCount, 0) END
FROM AllValues AS v
LEFT JOIN
( SELECT [Year] = DATENAME(YEAR, t.RegDate),
Age = a.Name,
ItemCount = COUNT(*)
FROM T
LEFT JOIN
(VALUES
(0, 18, 'Child'),
(18, 65, 'adult'),
(65, 1000000, 'pensioner')
) a (LowerValue, UpperValue, Name)
ON a.LowerValue <= DATEDIFF(HOUR, T.dateofbirth, GETDATE()) / 8766.0
AND a.UpperValue > DATEDIFF(HOUR, T.dateofbirth, GETDATE()) / 8766.0
WHERE T.PersonID > 2
GROUP BY GROUPING SETS ((DATENAME(YEAR, t.RegDate)), (a.Name))
) AS data
ON ISNULL(data.[Year], data.Age) = v.Filter;

CONCAT(column) OVER(PARTITION BY...)? Group-concatenating rows without grouping the result itself

I need a way to concatenate all rows (per group) in a kind of window function, like how you can do COUNT(*) OVER(PARTITION BY...) and the aggregate count of all rows per group repeats across each particular group. I need something similar, but with a string concatenation of all values per group repeated across each group.
Here is some example data and my desired result to better illustrate my problem:
grp | val
------------
1 | a
1 | b
1 | c
1 | d
2 | x
2 | y
2 | z
And here is what I need (the desired result):
grp | val | groupcnct
---------------------------------
1 | a | abcd
1 | b | abcd
1 | c | abcd
1 | d | abcd
2 | x | xyz
2 | y | xyz
2 | z | xyz
Here is the really tricky part of this problem:
My particular situation prevents me from being able to reference the same table twice (I'm actually doing this within a recursive CTE, so I can't do a self-join of the CTE or it will throw an error).
I'm fully aware that one can do something like:
SELECT a.*, b.groupcnct
FROM tbl a
CROSS APPLY (
SELECT STUFF((
SELECT '' + aa.val
FROM tbl aa
WHERE aa.grp = a.grp
FOR XML PATH('')
), 1, 0, '') AS groupcnct
) b
But as you can see, that is referencing tbl two times in the query.
I can only reference tbl once, hence why I'm wondering if windowing the group-concatenation is possible (I'm a bit new to T-SQL since I come from a MySQL background, so I'm not sure whether something like that can be done).
Create Table:
CREATE TABLE tbl
(grp int, val varchar(1));
INSERT INTO tbl
(grp, val)
VALUES
(1, 'a'),
(1, 'b'),
(1, 'c'),
(1, 'd'),
(2, 'x'),
(2, 'y'),
(2, 'z');
In SQL Server 2017 you can use the STRING_AGG function:
SELECT STRING_AGG(T.val, ',') AS val
, T.grp
FROM #tbl AS T
GROUP BY T.grp
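Note that SQL Server's STRING_AGG cannot take an OVER clause, so this still needs a GROUP BY plus a join back if you want the concatenation repeated on every row. Some engines do allow aggregates as window functions; for comparison, here is a sketch with Python's sqlite3 (an assumption), where group_concat over a whole-partition frame produces the desired repeated value directly:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE tbl (grp int, val varchar(1));
INSERT INTO tbl VALUES
 (1,'a'),(1,'b'),(1,'c'),(1,'d'),(2,'x'),(2,'y'),(2,'z');
""")
rows = con.execute("""
SELECT grp, val,
       -- aggregate used as a window function over the whole partition
       group_concat(val, '') OVER (
           PARTITION BY grp ORDER BY val
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS groupcnct
FROM tbl
ORDER BY grp, val
""").fetchall()
for r in rows:
    print(r)
```

This references tbl exactly once, which was the constraint in the question, but the technique depends on the engine supporting windowed aggregates.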
I tried using a pure CTE approach, thinking it would be faster: Which is the best way to form the string value using column from a Table with rows having same ID?
But the benchmark there tells otherwise: it's better to use subquery (or CROSS APPLY) results from FOR XML PATH, as they are faster.
DECLARE @tbl TABLE
(
grp INT
,val VARCHAR(1)
);
BEGIN
INSERT INTO @tbl(grp, val)
VALUES
(1, 'a'),
(1, 'b'),
(1, 'c'),
(1, 'd'),
(2, 'x'),
(2, 'y'),
(2, 'z');
END;
----------- Your Required Query
SELECT ST2.grp,
SUBSTRING(
(
SELECT ','+ST1.val AS [text()]
FROM @tbl ST1
WHERE ST1.grp = ST2.grp
ORDER BY ST1.grp
For XML PATH ('')
), 2, 1000
) groupcnct
FROM @tbl ST2
Is it possible for you to just put your STUFF in the SELECT instead, or do you run into the same issue? (I replaced 'tbl' with 'TEMP.TEMP123'.)
Select
A.*
, [GROUPCNT] = STUFF((
SELECT '' + aa.val
FROM TEMP.TEMP123 AA
WHERE aa.grp = a.grp
FOR XML PATH('')
), 1, 0, '')
from TEMP.TEMP123 A
This worked for me -- wanted to see if this worked for you too.
I know this post is old, but just in case someone is still wondering: you can create a scalar function that concatenates the row values.
IF OBJECT_ID('dbo.fnConcatRowsPerGroup','FN') IS NOT NULL
DROP FUNCTION dbo.fnConcatRowsPerGroup
GO
CREATE FUNCTION dbo.fnConcatRowsPerGroup
(@grp as int) RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @val AS VARCHAR(MAX)
SELECT @val = COALESCE(@val,'')+val
FROM tbl
WHERE grp = @grp
RETURN @val;
END
GO
select *, dbo.fnConcatRowsPerGroup(grp)
from tbl
Here is the result set I got from querying a sample table:
grp | val | (No column name)
---------------------------------
1 | a | abcd
1 | b | abcd
1 | c | abcd
1 | d | abcd
2 | x | xyz
2 | y | xyz
2 | z | xyz