How to use SQL table rows as columns for another table

I have one table of activities, one table of users, and a third table linking users to activities using foreign keys.
What I'm trying to do is create a results table that will have the activities as columns and the users as rows with the cells being the number of activities of that type the user participated in.
For example, the columns would be
User | Activity A | Activity B | Activity C
And a user who had done each activity three times would result in a row of
John Doe | 3 | 3 | 3
Now, I can do this easily if I manually add a count() call for each activity in the database like:
select
u.name,
(select count(*)
from userActivity ua
where ua.userID = u.userID and ua.activityID = 1),
(select count(*)
from userActivity ua
where ua.userID = u.userID and ua.activityID = 2),
(select count(*)
from userActivity ua
where ua.userID = u.userID and ua.activityID = 3)
from
user u
But this doesn't help me if someone enters an Activity D into the system tomorrow. The report wouldn't show it. How can I use the Activity table's rows as columns?

I did a quick query that might help. This uses the Pivot function, which was mentioned before.
You can run the whole thing, or just skip to the bottom!
-- Temp tables
IF OBJECT_ID('tempdb.dbo.#_tmp') IS NOT NULL DROP TABLE #_tmp
IF OBJECT_ID('tempdb.dbo.#_user') IS NOT NULL DROP TABLE #_user
IF OBJECT_ID('tempdb.dbo.#_activity') IS NOT NULL DROP TABLE #_activity
IF OBJECT_ID('tempdb.dbo.#_useractivity') IS NOT NULL DROP TABLE #_useractivity
-- User table
CREATE TABLE #_user (
[USER_ID] INT IDENTITY(1,1) NOT NULL,
[FIRST_NAME] NVARCHAR(50)
)
INSERT INTO #_user ([FIRST_NAME])
VALUES ('John'), ('Peter'), ('Paul')
-- Activity table
CREATE TABLE #_activity (
[ACTIVITY_ID] INT IDENTITY(1,1) NOT NULL,
[ACTIVITY_NAME] NVARCHAR(255)
)
INSERT INTO #_activity ([ACTIVITY_NAME])
VALUES ('Sailing'), ('Bowling'), ('Hiking')
-- Composite table
CREATE TABLE #_useractivity (
[LOG_ID] INT IDENTITY(1,1) NOT NULL,
[USER_ID] INT,
[ACTIVITY_ID] INT
)
INSERT INTO #_useractivity ([USER_ID], [ACTIVITY_ID])
VALUES (1,1),(1,2),(1,3),(1,3),(2,2),(2,3),(3,1), (3,2),(1,2),(2,1)
-- Main data table.
SELECT USR.FIRST_NAME
, A.ACTIVITY_NAME
INTO #_tmp
FROM #_useractivity AS UA
INNER JOIN #_user AS USR ON USR.USER_ID = UA.USER_ID
INNER JOIN #_activity AS A ON A.ACTIVITY_ID = UA.ACTIVITY_ID
SELECT * FROM #_tmp
-- Use pivot function to get desired results.
DECLARE @_cols AS NVARCHAR(MAX)
DECLARE @_sql AS NVARCHAR(MAX)
SET @_cols = STUFF((SELECT ',' + QUOTENAME(T.ACTIVITY_NAME)
FROM #_tmp AS T
GROUP BY T.ACTIVITY_NAME
FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'),1,1,'')
-- Trick is to add 1 "counter" before pivoting.
SET @_sql = '
SELECT Name, ' + @_cols + '
FROM (
SELECT FIRST_NAME AS Name, ACTIVITY_NAME, 1 AS COUNT
FROM #_tmp
) AS SRC
PIVOT (
SUM(COUNT) FOR ACTIVITY_NAME IN (' + @_cols + ')
) p'
EXEC(@_sql)
Main data table:
FIRST_NAME ACTIVITY_NAME
John Sailing
John Bowling
John Hiking
John Hiking
Peter Bowling
Peter Hiking
Paul Sailing
Paul Bowling
John Bowling
Peter Sailing
Output:
Name Bowling Hiking Sailing
John 2 2 1
Paul 1 NULL 1
Peter 1 1 1

You seem to want conditional aggregation:
select u.name,
sum(case when ua.activityID = 1 then 1 else 0 end) as cnt_1,
sum(case when ua.activityID = 2 then 1 else 0 end) as cnt_2,
sum(case when ua.activityID = 3 then 1 else 0 end) as cnt_3
from user u left join
userActivity ua
on ua.userID = u.userID
group by u.name;
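To pick up an Activity D automatically, the list of sum(case ...) columns can be generated from the activity table instead of being typed by hand. Below is a minimal sketch of that idea in Python with an in-memory SQLite database, not SQL Server. The function name activity_counts, the activity.name column, and the sample rows (borrowed from the pivot answer above) are assumptions for illustration only.

```python
import sqlite3


def activity_counts(conn):
    """Return (headers, rows) with one count column per row of the activity table."""
    activities = conn.execute(
        "SELECT activityID, name FROM activity ORDER BY activityID"
    ).fetchall()
    # One SUM(CASE ...) per activity. Ids are forced to int and names are
    # quoted as identifiers, so no raw text is spliced into the statement.
    select_items = ["u.name"] + [
        'SUM(CASE WHEN ua.activityID = {} THEN 1 ELSE 0 END) AS "{}"'.format(
            int(activity_id), name.replace('"', '""'))
        for activity_id, name in activities
    ]
    sql = (
        "SELECT " + ", ".join(select_items) + " "
        "FROM user u LEFT JOIN userActivity ua ON ua.userID = u.userID "
        "GROUP BY u.userID, u.name ORDER BY u.name"
    )
    cur = conn.execute(sql)
    headers = [col[0] for col in cur.description]
    return headers, cur.fetchall()


# Hypothetical sample data, mirroring the temp tables in the pivot answer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (userID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE activity (activityID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE userActivity (userID INTEGER, activityID INTEGER);
    INSERT INTO user VALUES (1, 'John'), (2, 'Peter'), (3, 'Paul');
    INSERT INTO activity VALUES (1, 'Sailing'), (2, 'Bowling'), (3, 'Hiking');
    INSERT INTO userActivity VALUES
        (1,1),(1,2),(1,3),(1,3),(2,2),(2,3),(3,1),(3,2),(1,2),(2,1);
""")
headers, rows = activity_counts(conn)
```

Unlike PIVOT, this variant reports 0 rather than NULL for an activity a user never did, because the case expression falls through to else 0.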

Related

Need a helping hand with a somewhat complicated query

I have a table 'Tasks' with the following structure
[TaskId],[CompanyId], [Year], [Month], [Value]
220,1,2018,1,50553.32
220,2,2018,2,222038.12
and another table, named 'UsersCopmpanies', that records which companies each user has permission for:
[UserId], [CompanyId]
1,1
The thing is, task no. 220 was moved between companies. In January the task belonged to CompanyId = 1, and then in February it belonged to CompanyId = 2.
According to the table 'UsersCopmpanies', the user does not have permission for CompanyId = 2.
What I need is to get both rows from table 'Tasks', but without the Value field where the user lacks permission.
Expected result should be:
[TaskId], [CompanyId], [Year], [Month],[Value]
220,1,2018,1,50553.32
220,2,2018,2,(NULL or something else, for example the string 'lack of permission')
You can use a left join:
select t.TaskId, t.CompanyId, t.Year, t.Month,
(case when uc.CompanyId is not null then Value end) as Value
from tasks t left join
UsersCompanies uc
on uc.CompanyId = t.CompanyId and uc.UserId = 1;
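A quick way to check the left join is to run it against the two sample rows. This is only a sketch, using Python with an in-memory SQLite database rather than SQL Server; the table is spelled UsersCompanies as in the query above, and the helper name tasks_for_user is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tasks (TaskId INT, CompanyId INT, Year INT, Month INT, Value REAL);
    CREATE TABLE UsersCompanies (UserId INT, CompanyId INT);
    INSERT INTO Tasks VALUES (220, 1, 2018, 1, 50553.32), (220, 2, 2018, 2, 222038.12);
    INSERT INTO UsersCompanies VALUES (1, 1);
""")


def tasks_for_user(conn, user_id):
    """Return every task row, with Value blanked where the user lacks permission."""
    return conn.execute(
        """
        SELECT t.TaskId, t.CompanyId, t.Year, t.Month,
               CASE WHEN uc.CompanyId IS NOT NULL THEN t.Value END AS Value
        FROM Tasks t
        LEFT JOIN UsersCompanies uc
               ON uc.CompanyId = t.CompanyId AND uc.UserId = ?
        ORDER BY t.Year, t.Month
        """,
        (user_id,),
    ).fetchall()


rows = tasks_for_user(conn, 1)
for row in rows:
    print(row)
```

The user filter has to sit in the ON clause. Moving it to WHERE would drop the February row entirely instead of returning it with an empty Value.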
I think this query using LEFT JOIN should work as you expected:
CREATE TABLE #MyTasks
(TaskId int,
CompanyId int,
YearCol varchar(50),
MonthCol varchar(50),
SomeValue varchar(50)
);
GO
INSERT INTO #MyTasks
SELECT 220,1,2018,1,50553.32
UNION
SELECT 220,2,2018,2,222038.12
CREATE TABLE #MyUsersCopmpanies
(UserId int PRIMARY KEY,
CompanyId varchar(50)
);
GO
INSERT INTO #MyUsersCopmpanies
SELECT 1,1
DECLARE @MyUserParam INT = 1;
SELECT #MyTasks.TaskId, #MyTasks.CompanyId, #MyTasks.YearCol, #MyTasks.MonthCol,
CASE WHEN #MyUsersCopmpanies.UserId IS NOT NULL THEN #MyTasks.SomeValue ELSE 'lack of permission' END AS 'ValueTaskByPermissions'
FROM #MyTasks
LEFT JOIN #MyUsersCopmpanies ON #MyUsersCopmpanies.CompanyId = #MyTasks.CompanyId AND #MyUsersCopmpanies.UserId = @MyUserParam;
DROP TABLE #MyTasks
DROP TABLE #MyUsersCopmpanies
RESULT :
TaskId CompanyId YearCol MonthCol ValueTaskByPermissions
220 1 2018 1 50553.32
220 2 2018 2 lack of permission
Some code :
SELECT t.taskid,t.companyid,t.year,t.month,
(CASE WHEN u.companyid IS NOT NULL THEN t.value ELSE "lack of permission" end) AS ValueData
FROM `x_task` t LEFT JOIN x_userscopmpanies u ON u.companyid = t.companyid

Validating a summary count column with the actual records

I have a column in the User table, 'total_approved_sales', that contains the count of all sales with status 'approved'.
My total_approved_sales column might be off for some users, so I want to list all users whose total_approved_sales doesn't equal the count from the sales table,
i.e. select count(*) from sales where userId=@userId and status='approved'
Table layout looks like:
USER
- total_approved_sales
sales
- userId
- STATUS
How can I query for those users whose counts are off?
Joining to an aggregated derived table:
select
u.UserId
, u.total_approved_sales
, a.recount
from [user] u
left join (
select s.userid, recount = count(*)
from sales s
where s.status = 'approved'
group by s.userid
) a
on u.userid = a.userid
where u.total_approved_sales <> isnull(a.recount,0)
given the following test setup:
create table [user] (userid int, total_approved_sales int);
insert into [user] values (0,0),(1,1),(2,1)
create table sales (userid int, [status] varchar(32))
insert into sales values (1,'approved'),(1,'pending'),(2,'approved'),(2,'approved')
rextester demo: http://rextester.com/TPQZ17719
returns:
+--------+----------------------+---------+
| UserId | total_approved_sales | recount |
+--------+----------------------+---------+
| 2 | 1 | 2 |
+--------+----------------------+---------+
You can achieve this using APPLY operator:
select *
from [user] u
outer apply (select count(*) from sales where userId=u.id and status='approved') sales(cnt)
where u.total_approved_sales <> sales.cnt;
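For anyone without a SQL Server instance at hand, the same check can be tried with portable SQL. The sketch below runs it in Python against an in-memory SQLite database, using the test rows from the first answer; COALESCE stands in for isnull, and the variable name mismatches is just for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (userid INT, total_approved_sales INT);
    INSERT INTO user VALUES (0, 0), (1, 1), (2, 1);
    CREATE TABLE sales (userid INT, status TEXT);
    INSERT INTO sales VALUES
        (1, 'approved'), (1, 'pending'), (2, 'approved'), (2, 'approved');
""")

# Recount approved sales per user and keep only users whose stored total differs.
# Users with no approved sales get no row from the derived table, hence COALESCE.
mismatches = conn.execute("""
    SELECT u.userid, u.total_approved_sales, COALESCE(a.recount, 0) AS recount
    FROM user u
    LEFT JOIN (
        SELECT userid, COUNT(*) AS recount
        FROM sales
        WHERE status = 'approved'
        GROUP BY userid
    ) a ON a.userid = u.userid
    WHERE u.total_approved_sales <> COALESCE(a.recount, 0)
    ORDER BY u.userid
""").fetchall()
print(mismatches)
```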

How to synthesize attribute for joined tables

I have a view defined like this:
CREATE VIEW [dbo].[PossiblyMatchingContracts] AS
SELECT
C.UniqueID,
CC.UniqueID AS PossiblyMatchingContracts
FROM [dbo].AllContracts AS C
INNER JOIN [dbo].AllContracts AS CC
ON C.SecondaryMatchCodeFB = CC.SecondaryMatchCodeFB
OR C.SecondaryMatchCodeLB = CC.SecondaryMatchCodeLB
OR C.SecondaryMatchCodeBB = CC.SecondaryMatchCodeBB
OR C.SecondaryMatchCodeLB = CC.SecondaryMatchCodeBB
OR C.SecondaryMatchCodeBB = CC.SecondaryMatchCodeLB
WHERE C.UniqueID NOT IN
(
SELECT UniqueID FROM [dbo].DefinitiveMatches
)
AND C.AssociatedUser IS NULL
AND C.UniqueID <> CC.UniqueID
This basically finds contracts where, for example, the first name and the birthday match. This works great. Now I want to add a synthetic attribute to each row with the value from only one source row.
Let me give you an example to make it clearer. Suppose I have the following table:
UniqueID | FirstName | LastName | Birthday
1 | Peter | Smith | 1980-11-04
2 | Peter | Gray | 1980-11-04
3 | Peter | Gray-Smith| 1980-11-04
4 | Frank | May | 1985-06-09
5 | Frank-Paul| May | 1985-06-09
6 | Gina | Ericson | 1950-11-04
The resulting view should look like this:
UniqueID | PossiblyMatchingContracts | SyntheticID
1 | 2 | PeterSmith1980-11-04
1 | 3 | PeterSmith1980-11-04
2 | 1 | PeterSmith1980-11-04
2 | 3 | PeterSmith1980-11-04
3 | 1 | PeterSmith1980-11-04
3 | 2 | PeterSmith1980-11-04
4 | 5 | FrankMay1985-06-09
5 | 4 | FrankMay1985-06-09
6 | NULL | NULL [or] GinaEricson1950-11-04
Notice that the SyntheticID column uses ONLY values from one of the matching source rows. It doesn't matter which one. I am exporting this view to another application and need to be able to identify each "match group" afterwards.
Is it clear what I mean? Any ideas how this could be done in sql?
Maybe it helps to elaborate a bit on the actual use case:
I am importing contracts from different systems. To account for the possibility of typos or people that have married but the last name was only updated in one system, I need to find so called 'possible matches'. Two or more contracts are considered a possible match if they contain the same birthday plus the same first, last or birth name. That implies, that if contract A matches contract B, contract B also matches contract A.
The target system uses multivalue reference attributes to store these relationships. The ultimate goal is to create user objects for these contracts. The first catch is that there shall be only one user object for multiple matching contracts. Thus I'm creating these matches in the view. The second catch is that the creation of user objects happens by workflows, which run in parallel for each contract. To avoid creating multiple user objects for matching contracts, each workflow needs to check whether there is already a matching user object or another workflow that is about to create said user object. Because the workflow engine is extremely slow compared to SQL, the workflows should not repeat the whole matching test. So the idea is to let the workflow check only for the 'syntheticID'.
I have solved it with a multi-step approach:
Create the list of possible 1st level matches for each contract
Create the base groups list, assigning a different group to each contract (as if they were not related to anybody)
Iterate the matches list, updating the group list when more contracts need to be added to a group
Recursively build up the SyntheticID from the final group list
Output results
First of all, let me explain what I have understood, so you can tell if my approach is correct or not.
1) matching propagates in "cascade"
I mean, if "Peter Smith" is grouped up with "Peter Gray", it means that all Smith and all Gray are related (if they have the same birth date) so Luke Smith can be in the same group of John Gray
2) I have not understood what you mean with "Birth Name"
You say contracts match on "first, last or birth name". Sorry, I'm Italian: I thought birth name and first name were the same, and there is no such column in your data. Maybe it is related to the dash between names?
When FirstName is Frank-Paul, does it mean it should match both Frank and Paul?
When LastName is Gray-Smith, does it mean it should match both Gray and Smith?
In the following code I have simply ignored this problem, but it could be handled if needed (I already made an attempt: breaking up the names, unpivoting them and treating them as a double match).
Step Zero: some declaration and prepare base data
declare @cli as table (UniqueID int primary key, FirstName varchar(20), LastName varchar(20), Birthday varchar(20))
declare @comb as table (id1 int, id2 int, done bit)
declare @grp as table (ix int identity primary key, grp int, id int, unique (grp,ix))
declare @str_id as table (grp int primary key, SyntheticID varchar(1000))
declare @id1 as int, @g int
;with
t as (
select *
from (values
(1 , 'Peter' , 'Smith' , '1980-11-04'),
(2 , 'Peter' , 'Gray' , '1980-11-04'),
(3 , 'Peter' , 'Gray-Smith', '1980-11-04'),
(4 , 'Frank' , 'May' , '1985-06-09'),
(5 , 'Frank-Paul', 'May' , '1985-06-09'),
(6 , 'Gina' , 'Ericson' , '1950-11-04')
) x (UniqueID , FirstName , LastName , Birthday)
)
insert into @cli
select * from t
Step One: Create the list of possible 1st level matches for each contract
;with
p as(select UniqueID, Birthday, FirstName, LastName from @cli),
m as (
select p.UniqueID UniqueID1, p.FirstName FirstName1, p.LastName LastName1, p.Birthday Birthday1, pp.UniqueID UniqueID2, pp.FirstName FirstName2, pp.LastName LastName2, pp.Birthday Birthday2
from p
join p pp on (pp.Birthday=p.Birthday) and (pp.FirstName = p.FirstName or pp.LastName = p.LastName)
where p.UniqueID<=pp.UniqueID
)
insert into @comb
select UniqueID1,UniqueID2,0
from m
Step Two: Create the base groups list
insert into @grp
select ROW_NUMBER() over(order by id1), id1 from @comb where id1=id2
Step Three: Iterate the matches list updating the group list
Only loop on contracts that have possible matches, and update only if needed
set @id1 = 0
while not(@id1 is null) begin
set @id1 = (select top 1 id1 from @comb where id1<>id2 and done=0)
if not(@id1 is null) begin
set @g = (select grp from @grp where id=@id1)
update g set grp= @g
from @grp g
inner join @comb c on g.id = c.id2
where c.id2<>@id1 and c.id1=@id1
and grp<>@g
update @comb set done=1 where id1=@id1
end
end
Step Four: Build up the SyntheticID
Recursively add ALL (distinct) first and last names of the group to the SyntheticID.
I used '_' as separator for birth date, first names and last names, and ',' as separator for the list of names to avoid conflicts.
;with
c as(
select c.*, g.grp
from @cli c
join @grp g on g.id = c.UniqueID
),
d as (
select *, row_number() over (partition by g order by t,s) n1, row_number() over (partition by g order by t desc,s desc) n2
from (
select distinct c.grp g, 1 t, FirstName s from c
union
select distinct c.grp, 2, LastName from c
) l
),
r as (
select d.*, cast(CONVERT(VARCHAR(10), t.Birthday, 112) + '_' + s as varchar(1000)) Names, cast(0 as bigint) i1, cast(0 as bigint) i2
from d
join @cli t on t.UniqueID=d.g
where n1=1
union all
select d.*, cast(r.names + IIF(r.t<>d.t,'_',',') + d.s as varchar(1000)), r.n1, r.n2
from d
join r on r.g = d.g and r.n1=d.n1-1
)
insert into @str_id
select g, Names
from r
where n2=1
Step Five: Output results
select c.UniqueID, case when id2=UniqueID then id1 else id2 end PossibleMatchingContract, s.SyntheticID
from @cli c
left join @comb cb on c.UniqueID in(id1,id2) and id1<>id2
left join @grp g on c.UniqueID = g.id
left join @str_id s on s.grp = g.grp
Here are the results:
UniqueID PossibleMatchingContract SyntheticID
1 2 1980-11-04_Peter_Gray,Gray-Smith,Smith
1 3 1980-11-04_Peter_Gray,Gray-Smith,Smith
2 1 1980-11-04_Peter_Gray,Gray-Smith,Smith
2 3 1980-11-04_Peter_Gray,Gray-Smith,Smith
3 1 1980-11-04_Peter_Gray,Gray-Smith,Smith
3 2 1980-11-04_Peter_Gray,Gray-Smith,Smith
4 5 1985-06-09_Frank,Frank-Paul_May
5 4 1985-06-09_Frank,Frank-Paul_May
6 NULL 1950-11-04_Gina_Ericson
I think that in this way the resulting SyntheticID should also be "unique" for each group
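Steps Two and Three above amount to finding connected components: contracts are nodes, possible matches are edges, and every component gets one SyntheticID. If the grouping can be done outside the database, a union-find structure is one compact way to express it. The following Python sketch is an assumption-laden illustration, not the T-SQL above: the name match_groups, the tuple layout, and the choice of the smallest UniqueID as the group's representative are all made up for the example, and the matching rule is the same simplified one (same birthday plus same first or last name).

```python
def match_groups(contracts):
    """Map each UniqueID to a SyntheticID shared by its whole match group.

    contracts: list of (unique_id, first_name, last_name, birthday) tuples.
    """
    parent = {c[0]: c[0] for c in contracts}

    def find(x):
        # Walk up to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller id as root so the representative is stable.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    # Pairwise comparison is quadratic; fine for a sketch, slow for large imports.
    for i, a in enumerate(contracts):
        for b in contracts[i + 1:]:
            if a[3] == b[3] and (a[1] == b[1] or a[2] == b[2]):
                union(a[0], b[0])

    by_id = {c[0]: c for c in contracts}
    result = {}
    for unique_id in parent:
        rep = by_id[find(unique_id)]
        result[unique_id] = rep[1] + rep[2] + rep[3]
    return result


contracts = [
    (1, "Peter", "Smith", "1980-11-04"),
    (2, "Peter", "Gray", "1980-11-04"),
    (3, "Peter", "Gray-Smith", "1980-11-04"),
    (4, "Frank", "May", "1985-06-09"),
    (5, "Frank-Paul", "May", "1985-06-09"),
    (6, "Gina", "Ericson", "1950-11-04"),
]
groups = match_groups(contracts)
```

Because matches propagate through the union step, two contracts end up in the same group even when they only match indirectly through a third one.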
This creates a synthetic value and is easy to change to suit your needs.
DECLARE @T TABLE (
UniqueID INT
,FirstName VARCHAR(200)
,LastName VARCHAR(200)
,Birthday DATE
)
INSERT INTO @T(UniqueID,FirstName,LastName,Birthday) SELECT 1,'Peter','Smith','1980-11-04'
INSERT INTO @T(UniqueID,FirstName,LastName,Birthday) SELECT 2,'Peter','Gray','1980-11-04'
INSERT INTO @T(UniqueID,FirstName,LastName,Birthday) SELECT 3,'Peter','Gray-Smith','1980-11-04'
INSERT INTO @T(UniqueID,FirstName,LastName,Birthday) SELECT 4,'Frank','May','1985-06-09'
INSERT INTO @T(UniqueID,FirstName,LastName,Birthday) SELECT 5,'Frank-Paul','May','1985-06-09'
INSERT INTO @T(UniqueID,FirstName,LastName,Birthday) SELECT 6,'Gina','Ericson','1950-11-04'
DECLARE @PossibleMatches TABLE (UniqueID INT,[PossibleMatch] INT,SynKey VARCHAR(2000)
)
-- Matches on first name, last name and birthday
INSERT INTO @PossibleMatches
SELECT t1.UniqueID [UniqueID],t2.UniqueID [Possible Matches],'Ln=' + t1.LastName + ' Fn=' + t1.FirstName + ' DoB=' + CONVERT(VARCHAR,t1.Birthday,102) [SynKey]
FROM @T t1
INNER JOIN @T t2 ON t1.Birthday=t2.Birthday
AND t1.FirstName=t2.FirstName
AND t1.LastName=t2.LastName
AND t1.UniqueID<>t2.UniqueID
-- Matches on first name and birthday
INSERT INTO @PossibleMatches
SELECT t1.UniqueID [UniqueID],t2.UniqueID [Possible Matches],'Fn=' + t1.FirstName + ' DoB=' + CONVERT(VARCHAR,t1.Birthday,102) [SynKey]
FROM @T t1
INNER JOIN @T t2 ON t1.Birthday=t2.Birthday
AND t1.FirstName=t2.FirstName
AND t1.UniqueID<>t2.UniqueID
-- Matches on last name and birthday
INSERT INTO @PossibleMatches
SELECT t1.UniqueID,t2.UniqueID,'Ln=' + t1.LastName + ' DoB=' + CONVERT(VARCHAR,t1.Birthday,102) [SynKey]
FROM @T t1
INNER JOIN @T t2 ON t1.Birthday=t2.Birthday
AND t1.LastName=t2.LastName
AND t1.UniqueID<>t2.UniqueID
-- Contracts without any match get a row with a NULL match
INSERT INTO @PossibleMatches
SELECT t1.UniqueID,pm.UniqueID,'Ln=' + t1.LastName + ' Fn=' + t1.FirstName + ' DoB=' + CONVERT(VARCHAR,t1.Birthday,102) [SynKey]
FROM @T t1
LEFT JOIN @PossibleMatches pm on pm.UniqueID=t1.UniqueID
WHERE pm.UniqueID IS NULL
SELECT *
FROM @PossibleMatches
ORDER BY UniqueID,[PossibleMatch]
I think this will work for you
SELECT
C.UniqueID,
CC.UniqueID AS PossiblyMatchingContracts,
FIRST_VALUE(CC.FirstName+CC.LastName+CC.Birthday)
OVER (PARTITION BY C.UniqueID ORDER BY CC.UniqueID) as SyntheticID
FROM
[dbo].AllContracts AS C INNER JOIN
[dbo].AllContracts AS CC ON
C.SecondaryMatchCodeFB = CC.SecondaryMatchCodeFB OR
C.SecondaryMatchCodeLB = CC.SecondaryMatchCodeLB OR
C.SecondaryMatchCodeBB = CC.SecondaryMatchCodeBB OR
C.SecondaryMatchCodeLB = CC.SecondaryMatchCodeBB OR
C.SecondaryMatchCodeBB = CC.SecondaryMatchCodeLB
WHERE
C.UniqueID NOT IN(
SELECT UniqueID FROM [dbo].DefinitiveMatches)
AND C.AssociatedUser IS NULL
You can try this:
SELECT
C.UniqueID,
CC.UniqueID AS PossiblyMatchingContracts,
FIRST_VALUE(CC.FirstName+CC.LastName+CC.Birthday)
OVER (PARTITION BY C.UniqueID ORDER BY CC.UniqueID) as SyntheticID
FROM
[dbo].AllContracts AS C
INNER JOIN
[dbo].AllContracts AS CC
ON
C.SecondaryMatchCodeFB = CC.SecondaryMatchCodeFB
OR
C.SecondaryMatchCodeLB = CC.SecondaryMatchCodeLB
OR
C.SecondaryMatchCodeBB = CC.SecondaryMatchCodeBB
OR
C.SecondaryMatchCodeLB = CC.SecondaryMatchCodeBB
OR
C.SecondaryMatchCodeBB = CC.SecondaryMatchCodeLB
WHERE
C.UniqueID NOT IN
(
SELECT UniqueID FROM [dbo].DefinitiveMatches
)
AND
C.AssociatedUser IS NULL
This will generate one extra row (because we left out C.UniqueID <> CC.UniqueID), but it will give you a good solution.
Following an example with some example data extracted from your original post. The idea: Generate all SyntheticID in a CTE, query all records with a "PossibleMatch" and Union it with all records which are not yet included:
DECLARE @t TABLE(
UniqueID int
,FirstName nvarchar(20)
,LastName nvarchar(20)
,Birthday datetime
)
INSERT INTO @t VALUES (1, 'Peter', 'Smith', '1980-11-04');
INSERT INTO @t VALUES (2, 'Peter', 'Gray', '1980-11-04');
INSERT INTO @t VALUES (3, 'Peter', 'Gray-Smith', '1980-11-04');
INSERT INTO @t VALUES (4, 'Frank', 'May', '1985-06-09');
INSERT INTO @t VALUES (5, 'Frank-Paul', 'May', '1985-06-09');
INSERT INTO @t VALUES (6, 'Gina', 'Ericson', '1950-11-04');
WITH ctePrep AS(
SELECT UniqueID, FirstName, LastName, BirthDay,
ROW_NUMBER() OVER (PARTITION BY FirstName, BirthDay ORDER BY FirstName, BirthDay) AS k,
FirstName+LastName+CONVERT(nvarchar(10), Birthday, 126) AS SyntheticID
FROM @t
),
cteKeys AS(
SELECT FirstName, BirthDay, SyntheticID
FROM ctePrep
WHERE k = 1
),
cteFiltered AS(
SELECT
C.UniqueID,
CC.UniqueID AS PossiblyMatchingContracts,
keys.SyntheticID
FROM @t AS C
JOIN @t AS CC ON C.FirstName = CC.FirstName
AND C.Birthday = CC.Birthday
JOIN cteKeys AS keys ON keys.FirstName = c.FirstName
AND keys.Birthday = c.Birthday
WHERE C.UniqueID <> CC.UniqueID
)
SELECT UniqueID, PossiblyMatchingContracts, SyntheticID
FROM cteFiltered
UNION ALL
SELECT UniqueID, NULL, FirstName+LastName+CONVERT(nvarchar(10), Birthday, 126) AS SyntheticID
FROM @t
WHERE UniqueID NOT IN (SELECT UniqueID FROM cteFiltered)
Hope this helps. The result looked OK to me:
UniqueID PossiblyMatchingContracts SyntheticID
---------------------------------------------------------------
2 1 PeterSmith1980-11-04
3 1 PeterSmith1980-11-04
1 2 PeterSmith1980-11-04
3 2 PeterSmith1980-11-04
1 3 PeterSmith1980-11-04
2 3 PeterSmith1980-11-04
4 NULL FrankMay1985-06-09
5 NULL Frank-PaulMay1985-06-09
6 NULL GinaEricson1950-11-04
Tested in SSMS, it works perfect. :)
--create table structure
create table #temp
(
uniqueID int,
firstname varchar(15),
lastname varchar(15),
birthday date
)
--insert data into the table
insert #temp
select 1, 'peter','smith','1980-11-04'
union all
select 2, 'peter','gray','1980-11-04'
union all
select 3, 'peter','gray-smith','1980-11-04'
union all
select 4, 'frank','may','1985-06-09'
union all
select 5, 'frank-paul','may','1985-06-09'
union all
select 6, 'gina','ericson','1950-11-04'
select * from #temp
--solution is as below
select ab.uniqueID
, PossiblyMatchingContracts
, c.firstname+c.lastname+cast(c.birthday as varchar) as synID
from
(
select a.uniqueID
, case
when a.uniqueID < min(b.uniqueID)over(partition by a.uniqueid)
then a.uniqueID
else min(b.uniqueID)over(partition by a.uniqueid)
end as SmallestID
, b.uniqueID as PossiblyMatchingContracts
from #temp a
left join #temp b
on (a.firstname = b.firstname OR a.lastname = b.lastname) AND a.birthday = b.birthday AND a.uniqueid <> b.uniqueID
) as ab
left join #temp c
on ab.SmallestID = c.uniqueID
Say we have following table (a VIEW in your case):
UniqueID PossiblyMatchingContracts SyntheticID
1 2 G1
1 3 G2
2 1 G3
2 3 G4
3 1 G4
3 4 G6
4 5 G7
5 4 G8
6 NULL G9
In your case you can set the initial SyntheticID to a string like PeterSmith1980-11-04, using UniqueID for each line. Here is a recursive CTE query: it divides all lines into unconnected groups and selects MAX(SyntheticID) in the current group as the new SyntheticID for all lines in this group.
WITH CTE AS
(
SELECT CAST(','+CAST(UniqueID AS Varchar(100)) +','+ CAST(PossiblyMatchingContracts as Varchar(100))+',' as Varchar(MAX)) as GroupCont,
SyntheticID
FROM PossiblyMatchingContracts
UNION ALL
SELECT CAST(GroupCont+CAST(UniqueID AS Varchar(100)) +','+ CAST(PossiblyMatchingContracts as Varchar(100))+',' AS Varchar(MAX)) as GroupCont,
pm.SyntheticID
FROM CTE
JOIN PossiblyMatchingContracts as pm
ON
(
CTE.GroupCont LIKE '%,'+CAST(pm.UniqueID AS Varchar(100))+',%'
OR
CTE.GroupCont LIKE '%,'+CAST(pm.PossiblyMatchingContracts AS Varchar(100))+',%'
)
AND NOT
(
CTE.GroupCont LIKE '%,'+CAST(pm.UniqueID AS Varchar(100))+',%'
AND
CTE.GroupCont LIKE '%,'+CAST(pm.PossiblyMatchingContracts AS Varchar(100))+',%'
)
)
SELECT pm.UniqueID,
pm.PossiblyMatchingContracts,
ISNULL(
(SELECT MAX(SyntheticID) FROM CTE WHERE
(
CTE.GroupCont LIKE '%,'+CAST(pm.UniqueID AS Varchar(100))+',%'
OR
CTE.GroupCont LIKE '%,'+CAST(pm.PossiblyMatchingContracts AS Varchar(100))+',%'
))
,pm.SyntheticID) as SyntheticID
FROM PossiblyMatchingContracts pm

Combine results to count totals for individuals

I have a table in my database that keeps track of pages sent to individual users and groups. Users are part of groups. Only individual users can answer pages. Here is the DDL for the table:
--PageStatus 1 = Expired
--PageStatus 2 = Answered
--PageStatus 3 = Canceled
CREATE TABLE [Pagings] (
[Id] int NOT NULL IDENTITY(1,1) ,
[UserProfileId] int NULL ,
[GroupId] int NULL ,
[Message] nvarchar(MAX) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[PageStatus] int NOT NULL ,
[DateCreated] datetime NULL ,
[DateModified] datetime NULL ,
[IsRecurring] bit NOT NULL DEFAULT ((0)) ,
[AnsweredById] int NULL ,
[AnsweredDateTime] datetime NULL ,
CONSTRAINT [PK_ft.Pagings] PRIMARY KEY ([Id])
)
ON [PRIMARY]
TEXTIMAGE_ON [PRIMARY]
GO
Anytime the PageStatus is Expired (1) or Canceled (3) we do not have any data for the AnsweredById or the AnsweredDateTime columns. If it is answered then we set the value of the UserProfileId coming from the application in the AnsweredById column of the person who answered it. If a group is paged anyone who answered the page is assumed to be part of that group and their UserProfileId is set inside the AnsweredById column.
Here is a sample result and the SqlFiddle to accompany the data.
I need to figure out how to get the Total count of pages for a User including the group they belong to, how many pages they answered, and the group totals. Here is an example of what I would expect as the result based on the set above:
UserId GroupId TotalPagesForUser TotalAnsweredForUser TotalPagesForGroup TotalAnsweredForGroup
------ ------- ----------------- -------------------- ------------------ ----------------------
1 2 3 1 1 1
3 1 3 1 2 2
4 1 2 1 2 2
I've tried joining the table to itself on the UserProfileId and AnsweredById and with a Group table that exists in the database, but my results were way off and I end up with a lot of duplicated data.
I would break it into two parts, first assemble the aggregate numbers for Users, then get the Group numbers either in a subquery as part of the main query or two separate queries with the results being assembled at the end. Anyway, my rough first attempt:
Select u.Id as UserId,
g.Id as GroupId,
g.name as GroupName,
count(0) as TotalPagesForUser,
Sum(case when p.AnsweredById IS NOT NULL then 1 else 0 end) as TotalAnsweredForUser,
(SELECT COUNT(0) FROM Pagings WHERE GroupId = g.Id) as TotalPagesForGroup,
(SELECT COUNT(0) FROM Pagings WHERE GroupId = g.Id AND AnsweredById IS NOT NULL) as TotalAnsweredForGroup
from UserProfiles u
INNER JOIN Groups g on g.Id = u.GroupId
INNER JOIN Pagings p on p.UserProfileId = u.Id or p.AnsweredById = u.Id
GROUP BY u.Id, g.Id, g.name
ORDER BY u.Id
Although I'm getting slightly different values than what you were projecting...but I haven't had a chance to look over the source data in detail yet.

SQL query assistance with bridge table

I'm working with an existing database and trying to write a SQL query to get out all the account information including permission levels. This is for a security audit. We want to dump all of this information out in a readable fashion to make it easy to compare. My problem is that there is a bridge/link table for the permissions, so there are multiple records per user. I want to get back results with all the permissions for one user on one line. Here is an example:
Table_User:
UserId UserName
1 John
2 Joe
3 James
Table_UserPermissions:
UserId PermissionId Rights
1 10 1
1 11 2
1 12 3
2 11 2
2 12 3
3 10 2
PermissionId links to a table with the name of the permission and what it does. Rights is a level such as 1 = view, 2 = modify, etc.
What I get back from a basic query for User 1 is:
UserId UserName PermissionId Rights
1 John 10 1
1 John 11 2
1 John 12 3
What I would like is something like this:
UserId UserName Permission1 Rights1 Permission2 Rights2 Permission3 Rights3
1 John 10 1 11 2 12 3
Ideally I would like this for all users.
The closest thing I've found is the Pivot function in SQL Server 2005.
The problem with this from what I can tell is that I need to name each column for each user and I'm not sure how to get the rights level. With real data I have about 130 users and 40 different permissions.
Is there another way with just sql that I can do this?
You could do something like this:
select u.UserId, u.UserName
, max(case when p.PermissionId=10 then p.Rights end) as permission10_rights
, max(case when p.PermissionId=11 then p.Rights end) as permission11_rights
, max(case when p.PermissionId=12 then p.Rights end) as permission12_rights
from Table_User u
join Table_UserPermissions p on p.UserId = u.UserId
group by u.UserId, u.UserName;
You have to explicitly add a similar max(...) column for each permissionid.
If you were using MySQL I would suggest you use group_concat() like below.
select Table_User.UserId, Table_User.UserName,
group_concat(PermissionId) as PermIdList,
group_concat(Rights SEPARATOR ',') as RightsList
from Table_User join Table_UserPermissions on
Table_User.UserId = Table_UserPermissions.UserId
GROUP BY Table_User.UserId, Table_User.UserName
This would return
UserId UserName PermIdList RightsList
1 John 10,11,12 1,2,3
A quick google search for 'mssql group_concat' revealed a couple different stored procedures (I), (II) for MSSQL that can achieve the same behavior.
Short answer:
No.
You can't dynamically add columns in to your query.
Remember, SQL is a set based language. You query sets and join sets together.
What you're digging out is a recursive list, and you're requiring that the list be strung together horizontally rather than vertically.
You can, sorta, fake it, with a set of self joins, but in order to do that, you have to know all possible permissions before you write the query...which is what the other suggestions have proposed.
You can also pull the recordset back into a different language and then iterate through that to generate the proper columns.
Something like:
SELECT Table_User.userID, userName, permissionid, rights
FROM Table_User
LEFT JOIN Table_UserPermissions ON Table_User.userID =Table_UserPermissions.userID
ORDER BY userName
And then display all the permissions for each user using something like (Python):
userID = recordset[0][0]
username = recordset[0][1]
user_permissions = []
for row in recordset:
    if userID != row[0]:
        # A new user starts here, so print the previous user's permissions.
        printUserPermissions(username, user_permissions)
        user_permissions = []
        username = row[1]
        userID = row[0]
    user_permissions.append((row[2], row[3]))
# Print the permissions of the last user.
printUserPermissions(username, user_permissions)
You could create a temporary table_flatuserpermissions of:
UserID
PermissionID1
Rights1
PermissionID2
Rights2
...etc to as many permission/right combinations as you need
Insert records to this table from Table_user with all permission & rights fields null.
Update records on this table from table_userpermissions - first record insert and set PermissionID1 & Rights1, Second record for a user update PermissionsID2 & Rights2, etc.
Then you query this table to generate your report.
Personally, I'd just stick with the UserId, UserName, PermissionID, Rights columns you have now.
Maybe substitute in some text for PermissionID and Rights instead of the numeric values.
Maybe sort the table by PermissionID, User instead of User, PermissionID so the auditor could check the users on each permission type.
If it's acceptable, a strategy I've used, both for designing and/or implementation, is to dump the query unpivoted into either Excel or Access. Both have much friendlier UIs for pivoting data, and a lot more people are comfortable in that environment.
Once you have a design you like, then it's easier to think about how to duplicate it in TSQL.
It seems like the pivot function was designed for situations where you can use an aggregate function on one of the fields. Like if I wanted to know how much revenue each sales person made for company x. I could sum up the price field from a sales table. I would then get the sales person and how much revenue in sales they have. For the permissions though it doesn't make sense to sum/count/etc up the permissionId field or the Rights field.
You may want to look at the following example on creating cross-tab queries in SQL:
http://www.databasejournal.com/features/mssql/article.php/3521101/Cross-Tab-reports-in-SQL-Server-2005.htm
It looks like there are new operations that were included as part of SQL Server 2005 called PIVOT and UNPIVOT
For this type of data transformation you will need to perform both an UNPIVOT and then a PIVOT of the data. If you know the values that you want to transform, then you can hard-code the query using a static pivot, otherwise you can use dynamic sql.
Create tables:
CREATE TABLE Table_User
([UserId] int, [UserName] varchar(5))
;
INSERT INTO Table_User
([UserId], [UserName])
VALUES
(1, 'John'),
(2, 'Joe'),
(3, 'James')
;
CREATE TABLE Table_UserPermissions
([UserId] int, [PermissionId] int, [Rights] int)
;
INSERT INTO Table_UserPermissions
([UserId], [PermissionId], [Rights])
VALUES
(1, 10, 1),
(1, 11, 2),
(1, 12, 3),
(2, 11, 2),
(2, 12, 3),
(3, 10, 2)
;
Static PIVOT:
select *
from
(
select userid,
username,
value,
col + '_'+ cast(rn as varchar(10)) col
from
(
select u.userid,
u.username,
p.permissionid,
p.rights,
row_number() over(partition by u.userid
order by p.permissionid, p.rights) rn
from table_user u
left join Table_UserPermissions p
on u.userid = p.userid
) src
unpivot
(
value
for col in (permissionid, rights)
) unpiv
) src
pivot
(
max(value)
for col in (permissionid_1, rights_1,
permissionid_2, rights_2,
permissionid_3, rights_3)
) piv
order by userid
See SQL Fiddle with Demo
Dynamic PIVOT:
If you have an unknown number of permissionids and rights, then you can use dynamic sql:
DECLARE
@query AS NVARCHAR(MAX),
@colsPivot as NVARCHAR(MAX)
select @colsPivot = STUFF((SELECT ','
+ quotename(c.name +'_'+ cast(t.rn as varchar(10)))
from
(
select row_number() over(partition by u.userid
order by p.permissionid, p.rights) rn
from table_user u
left join Table_UserPermissions p
on u.userid = p.userid
) t
cross apply sys.columns as C
where C.object_id = object_id('Table_UserPermissions') and
C.name not in ('UserId')
group by c.name, t.rn
order by t.rn
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query
= 'select *
from
(
select userid,
username,
value,
col + ''_''+ cast(rn as varchar(10)) col
from
(
select u.userid,
u.username,
p.permissionid,
p.rights,
row_number() over(partition by u.userid
order by p.permissionid, p.rights) rn
from table_user u
left join Table_UserPermissions p
on u.userid = p.userid
) src
unpivot
(
value
for col in (permissionid, rights)
) unpiv
) x1
pivot
(
max(value)
for col in ('+ @colsPivot +')
) p
order by userid'
exec(@query)
See SQL Fiddle with demo
The result for both is:
| USERID | USERNAME | PERMISSIONID_1 | RIGHTS_1 | PERMISSIONID_2 | RIGHTS_2 | PERMISSIONID_3 | RIGHTS_3 |
---------------------------------------------------------------------------------------------------------
| 1 | John | 10 | 1 | 11 | 2 | 12 | 3 |
| 2 | Joe | 11 | 2 | 12 | 3 | (null) | (null) |
| 3 | James | 10 | 2 | (null) | (null) | (null) | (null) |
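If the reshaping can happen in the client rather than in T-SQL, the same wide layout falls out of a group-and-number pass over the joined rows. Here is a rough Python sketch under a few assumptions: the rows come from an inner join of Table_User and Table_UserPermissions, the function name flatten_permissions is hypothetical, and the column names simply imitate the result table above.

```python
from itertools import groupby
from operator import itemgetter


def flatten_permissions(rows):
    """Turn (user_id, user_name, permission_id, rights) rows into one wide row per user."""
    # Sort so each user's rows are adjacent and numbered in permission order,
    # like row_number() over (partition by userid order by permissionid, rights).
    rows = sorted(rows, key=lambda r: (r[0], r[2], r[3]))
    grouped = [
        (key, [(r[2], r[3]) for r in group])
        for key, group in groupby(rows, key=itemgetter(0, 1))
    ]
    width = max((len(perms) for _, perms in grouped), default=0)

    header = ["UserId", "UserName"]
    for n in range(1, width + 1):
        header += ["PermissionId_%d" % n, "Rights_%d" % n]

    table = []
    for (user_id, user_name), perms in grouped:
        line = [user_id, user_name]
        for permission_id, rights in perms:
            line += [permission_id, rights]
        # Pad users with fewer permissions so every row has the same width.
        line += [None] * (2 * (width - len(perms)))
        table.append(line)
    return header, table


rows = [
    (1, "John", 10, 1), (1, "John", 11, 2), (1, "John", 12, 3),
    (2, "Joe", 11, 2), (2, "Joe", 12, 3),
    (3, "James", 10, 2),
]
header, table = flatten_permissions(rows)
```

The number of column pairs is taken from the user with the most permissions, so nothing has to be hard-coded when a new permission appears.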