Getting Paged Distinct Records Using SQL (Not Duplicate) - sql

Following #mdb's answer to apply pagination using SQL SERVER, I find it hard to retrieve distinct records when the main table is joined to other tables for a one-to-many relationship, i.e, A person has many addresses.
Use case, suppose I want to retrieve all persons which has an address in New York given tables #temp_person and #temp_addresses, I would join them on PersonID and OwnerID.
The problem arises when there are multiple addresses for a person, the result set contains duplicate records.
To make it clearer, here's a sample query with data:
Sample Data:
create table #temp_person (
PersonID int not null,
FullName varchar(max) not null
)
create table #temp_addresses(
AddressID int identity not null,
OwnerID int not null,
Address1 varchar(max),
City varchar(max)
)
insert into #temp_person
values
(1, 'Sample One'),
(2, 'Sample Two'),
(3, 'Sample Three')
insert into #temp_addresses (OwnerID, Address1, City)
values
(1, 'Somewhere East Of', 'New York'),
(1, 'Somewhere West Of', 'New York'),
(2, 'blah blah blah', 'Atlantis'),
(2, 'Address2 Of Sample Two', 'New York'),
(2, 'Address3 Of Sample Two', 'Nowhere City'),
(3, 'Address1 Of Sample Three', 'New York'),
(3, 'Address2 Of Sample Three', 'Seattle')
--drop table #temp_addresses, #temp_person
Pagination Query:
SELECT
(
CAST( RowNum as varchar(MAX) )
+ '/'
+ CAST(TotalCount as varchar(MAX))
) as ResultPosition
, PersonID
, FullName
FROM (
SELECT DISTINCT
ROW_NUMBER() OVER(ORDER BY p.FullName ASC) as RowNum
, p.PersonID
, p.FullName
, Count(1) OVER() as TotalCount
FROM #temp_person p
LEFT JOIN #temp_addresses a
ON p.PersonID = a.OwnerID
WHERE City = 'New York'
) as RowConstrainedResult
WHERE RowNum > 0 AND RowNum <= 3
ORDER BY RowNum
Expected Results:
ResultPosition PersonID FullName
1/3 1 Sample One
2/3 2 Sample Two
3/3 3 Sample Three
Actual Results:
ResultPosition PersonID FullName
1/4 1 Sample One
2/4 1 Sample One
3/4 3 Sample Three
As you can see, the inner query is returning multiple records due to the join with #temp_addresses.
Is there a way we could only return unique records by PersonID?
UPDATE:
Actual use case is for an "Advanced Search" functionality where the user can search using different filters, i.e, name, firstname, last name, birthdate, address, etc.. The <WHERE_CLAUSE> and <JOIN_STATEMENTS> in the query are added dynamically so GROUP BY is not applicable here.
Also, please address the "Pagination" scheme for this question. That is, I want to retrieve only N number of results from Start while also retrieving the total count of the results as if they are not paged. i.e, I retrieve only 25 rows out of a total of 500 results.

Just do group by PersonID and no need to use subquery
SELECT
cast(row_number() over (order by (select 1)) as varchar(max)) +'/'+
cast(Count(1) OVER() as varchar(max)) ResultPosition,
p.PersonID,
max(p.FullName) FullName
FROM #temp_person p
LEFT JOIN #temp_addresses a ON p.PersonID = a.OwnerID
WHERE City = 'New York'
group by p.PersonID
EDIT : I would use CTE for the pagination
;with cte as
(
SELECT
row_number() over(order by (select 1)) rn,
cast(row_number() over (order by (select 1)) as varchar(max)) +'/'+
cast(Count(1) OVER() as varchar(max)) ResultPosition,
p.PersonID,
max(p.FullName) FullName
FROM #temp_person p
LEFT JOIN #temp_addresses a ON p.PersonID = a.OwnerID
WHERE City = 'New York'
group by p.PersonID
)
select * from cte
where rn > 0 and rn <= 2
Result:
ResultPosition PersonID FullName
1/3 1 Sample One
2/3 2 Sample Two
3/3 3 Sample Three

You need to have distinct rows before using ROW_NUMBER().
If you will filter by City, there are no need to use LEFT JOIN. Use INNER JOIN instead.
select ResultPosition = cast(row_number() over (order by (r.PersonID)) as varchar(max)) +'/'+ cast(Count(r.PersonID) OVER() as varchar(max)), *
from(
SELECT distinct p.PersonID,
p.FullName
FROM #temp_person p
JOIN #temp_addresses a ON
p.PersonID = a.OwnerID
WHERE City = 'New York') r
EDIT:
Considering pagination
declare #page int =1, #rowsPage int = 25
select distinct position, ResultPosition = cast(position as varchar(10)) + '/' + cast(count(*) OVER() as varchar(10)), *
from(
SELECT position = DENSE_RANK () over (order by p.PersonID),
p.PersonID,
p.FullName
FROM #temp_person p
LEFT JOIN #temp_addresses a ON
p.PersonID = a.OwnerID
WHERE City = 'New York'
) r
where position between #rowsPage*(#page-1)+1 and #rowsPage*#page

Geoman Yabes, Check if this help... Gives results expected in your example and you can have pagination using RowNum:-
SELECT *
FROM
(SELECT ROW_NUMBER() OVER(ORDER BY RowConstrainedResult.PersonId ASC) As RowNum,
Count(1) OVER() As TotalRows,
RowConstrainedResult.PersonId,
RowConstrainedResult.FullName
FROM (
SELECT
RANK() OVER(PARTITION BY p.PersonId ORDER BY a.Address1 ASC) as Ranking
, p.PersonID
, p.FullName
FROM #temp_person p
INNER JOIN #temp_addresses a ON p.PersonID = a.OwnerID WHERE City = 'New York'
) as RowConstrainedResult WHERE Ranking = 1) Filtered
Where RowNum > 0 And RowNum <= 4
Sample Data:
insert into #temp_person
values
(1, 'Sample One'),
(2, 'Sample Two'),
(3, 'Sample Three'),
(4, 'Sample 4'),
(5, 'Sample 5'),
(6, 'Sample 6')
insert into #temp_addresses (OwnerID, Address1, City)
values
(1, 'Somewhere East Of', 'New York'),
(1, 'Somewhere West Of', 'New York'),
(2, 'blah blah blah', 'Atlantis'),
(2, 'Address2 Of Sample Two', 'New York'),
(2, 'Address3 Of Sample Two', 'Nowhere City'),
(3, 'Address1 Of Sample Three', 'New York'),
(3, 'Address2 Of Sample Three', 'Seattle'),
(4, 'Address1 Of Sample 4', 'New York'),
(4, 'Address1 Of Sample 4', 'New York 2'),
(5, 'Address1 Of Sample 5', 'New York'),
(6, 'Address1 Of Sample 6', 'New York')

Related

SQL LEFT JOIN to many categories

Suppose the following easy scenario, where a product row gets connected to one primary category, subcategory, and sub-subcategory.
DECLARE #PRODUCTS TABLE (ID int, DESCRIPTION varchar(50), CAT varchar(30), SUBCAT varchar(30), SUBSUBCAT varchar(30));
INSERT #PRODUCTS (ID, DESCRIPTION, CAT, SUBCAT, SUBSUBCAT) VALUES
(1, 'NIKE MILLENIUM', '1', '10', '100'),
(2, 'NIKE CORTEZ', '1', '12', '104'),
(3, 'ADIDAS PANTS', '2', '27', '238'),
(4, 'PUMA REVOLUTION 5', '3', '35', '374'),
(5, 'SALOMON SHELTER CS', '4', '15', '135'),
(6, 'NIKE EBERNON LOW', '2', '14', '157');
DECLARE #CATS TABLE (ID int, DESCR varchar(100));
INSERT #CATS (ID, DESCR) VALUES
(1, 'MEN'),
(2, 'WOMEN'),
(3, 'UNISEX'),
(4, 'KIDS'),
(5, 'TEENS'),
(6, 'BACK TO SCHOOL');
DECLARE #SUBCATS TABLE (ID int, DESCR varchar(100));
INSERT #SUBCATS (ID, DESCR) VALUES
(10, 'FOOTWEAR'),
(12, 'OUTERWEAR'),
(14, 'SWIMWEAR'),
(15, 'HOODIES'),
(27, 'CLOTHING'),
(35, 'SPORTS');
DECLARE #SUBSUBCATS TABLE (ID int, DESCR varchar(100));
INSERT #SUBSUBCATS (ID, DESCR) VALUES
(100, 'RUNNING'),
(104, 'ZIP TOPS'),
(135, 'FLEECE'),
(157, 'BIKINIS'),
(238, 'PANTS'),
(374, 'JOGGERS');
SELECT prod.ID,
prod.DESCRIPTION,
CONCAT(cat1.DESCR, ' > ', cat2.DESCR, ' > ', cat3.DESCR) AS CATEGORIES
FROM #PRODUCTS AS prod
LEFT JOIN #CATS AS cat1 ON cat1.ID = prod.CAT
LEFT JOIN #SUBCATS AS cat2 ON cat2.ID = prod.SUBCAT
LEFT JOIN #SUBSUBCATS AS cat3 ON cat3.ID = prod.SUBSUBCAT;
Now suppose that the foreign keys on #PRODUCTS table aren't just indices to their respective tables. They are comma-separated indices to more than one categories, subcategories, and sub-subcategories like here.
DECLARE #PRODUCTS TABLE (ID int, DESCRIPTION varchar(50), CAT varchar(30), SUBCAT varchar(30), SUBSUBCAT varchar(30));
INSERT #PRODUCTS (ID, DESCRIPTION, CAT, SUBCAT, SUBSUBCAT) VALUES
(1, 'NIKE MILLENIUM', '1, 2', '10, 12', '100, 135'),
(2, 'NIKE CORTEZ', '1, 5', '12, 15', '104, 374'),
(3, 'ADIDAS PANTS', '2, 6', '27, 35', '238, 374');
DECLARE #CATS TABLE (ID int, DESCR varchar(100));
INSERT #CATS (ID, DESCR) VALUES
(1, 'MEN'),
(2, 'WOMEN'),
(3, 'UNISEX'),
(4, 'KIDS'),
(5, 'TEENS'),
(6, 'BACK TO SCHOOL');
DECLARE #SUBCATS TABLE (ID int, DESCR varchar(100));
INSERT #SUBCATS (ID, DESCR) VALUES
(10, 'FOOTWEAR'),
(12, 'OUTERWEAR'),
(14, 'SWIMWEAR'),
(15, 'HOODIES'),
(27, 'CLOTHING'),
(35, 'SPORTS');
DECLARE #SUBSUBCATS TABLE (ID int, DESCR varchar(100));
INSERT #SUBSUBCATS (ID, DESCR) VALUES
(100, 'RUNNING'),
(104, 'ZIP TOPS'),
(135, 'FLEECE'),
(157, 'BIKINIS'),
(238, 'PANTS'),
(374, 'JOGGERS');
SELECT prod.ID,
prod.DESCRIPTION
--CONCAT(cat1.DESCR, ' > ', cat2.DESCR, ' > ', cat3.DESCR) AS CATEGORIES
FROM #PRODUCTS AS prod
--LEFT JOIN #CATS AS cat1 ON cat1.ID = prod.CAT
--LEFT JOIN #SUBCATS AS cat2 ON cat2.ID = prod.SUBCAT
--LEFT JOIN #SUBSUBCATS AS cat3 ON cat3.ID = prod.SUBSUBCAT;
In this case I want to achieve the following:
Be able to retrieve the respective names of the cats, subcats, sub-subcats, ie. for cats '1, 2' be able to retrieve their names (I tried LEFT JOIN #CATS AS cat1 ON cat1.ID IN prod.CAT but it doesn't work)
Create triplets of the corresponding cats, subcats, sub-subcats, ie. for
cats '1, 2'
subcats '12, 17'
sub-subcats '239, 372'
(after retrieving the appropriate names) create pipe-separated category routes like name of cat 1 > name of subcat 12 > name of sub-subcat 239 | name of cat 2 > name of subcat 17 > name of sub-subcat 372
So, for a row like (1, 'NIKE MILLENIUM', '1, 2', '10, 12', '100, 135'),
I would like to get the following result
ID
DESCRIPTION
CATEGORIES
1
NIKE MILLENIUM
MEN > FOOTWEAR > RUNNING # WOMEN > OUTERWEAR > FLEECE (I had to use # as the delimiter of the two triplets because pipe messed with the table's columns)
In case the user stupidly stores more cat IDs than subcat IDs, or sub-subcat IDs, the query should just match the ones that have a corresponding position match, ie for
cats '1, 2'
subcats '12'
sub-subcats '239, 372'
it should just create one triplet, like name of 1 > name of 12 > name of 239
STRING_SPLIT() does not promise to return the values in a specific order, so it won't work in this case as ordinal position matters.
Use OPENJSON() split the string into separate rows to ensure the values are returned in the same order.
OPENJSON() also returns a key field, so you can join on the row number within each grouping. You'll want an INNER JOIN since your requirement is that all values in that "column" must exist.
Use STUFF() to assemble the various cat>subcat>subsubcat values.
DECLARE #PRODUCTS TABLE (ID int, DESCRIPTION varchar(50), CAT varchar(30), SUBCAT varchar(30), SUBSUBCAT varchar(30));
INSERT #PRODUCTS (ID, DESCRIPTION, CAT, SUBCAT, SUBSUBCAT) VALUES
(1, 'NIKE MILLENIUM', '1, 2', '10, 12', '100, 135'),
(2, 'NIKE CORTEZ', '1, 5', '12, 15', '104, 374'),
(3, 'ADIDAS PANTS', '2, 6, 1', '27, 35, 10', '238, 374, 100'),
(4, 'JOE THE PLUMBER JEANS', '1, 5', '27', '238, 374');
DECLARE #CATS TABLE (ID int, DESCR varchar(100));
INSERT #CATS (ID, DESCR) VALUES
(1, 'MEN'),
(2, 'WOMEN'),
(3, 'UNISEX'),
(4, 'KIDS'),
(5, 'TEENS'),
(6, 'BACK TO SCHOOL');
DECLARE #SUBCATS TABLE (ID int, DESCR varchar(100));
INSERT #SUBCATS (ID, DESCR) VALUES
(10, 'FOOTWEAR'),
(12, 'OUTERWEAR'),
(14, 'SWIMWEAR'),
(15, 'HOODIES'),
(27, 'CLOTHING'),
(35, 'SPORTS');
DECLARE #SUBSUBCATS TABLE (ID int, DESCR varchar(100));
INSERT #SUBSUBCATS (ID, DESCR) VALUES
(100, 'RUNNING'),
(104, 'ZIP TOPS'),
(135, 'FLEECE'),
(157, 'BIKINIS'),
(238, 'PANTS'),
(374, 'JOGGERS');
;
with prod as (
SELECT p.ID,
p.DESCRIPTION
--CONCAT(cat1.DESCR, ' > ', cat2.DESCR, ' > ', cat3.DESCR) AS CATEGORIES
, c.value as CatId
, c.[key] as CatKey
, sc.value as SubCatId
, sc.[key] as SubCatKey
, ssc.value as SubSubCatId
, ssc.[key] as SubSubCatKey
FROM #PRODUCTS p
cross apply OPENJSON(CONCAT('["', REPLACE(cat, ', ', '","'), '"]')) c
cross apply OPENJSON(CONCAT('["', REPLACE(subcat, ', ', '","'), '"]')) sc
cross apply OPENJSON(CONCAT('["', REPLACE(subsubcat, ', ', '","'), '"]')) ssc
where c.[key] = sc.[key]
and c.[key] = ssc.[key]
)
, a as (
select p.ID
, p.DESCRIPTION
, c.DESCR + ' > ' + sc.DESCR + ' > ' + ssc.DESCR as CATEGORIES
, p.CatKey
from prod p
inner join #CATS c on c.ID = p.CatId
inner join #SUBCATS sc on sc.ID = p.SubCatId
inner join #SUBSUBCATS ssc on ssc.ID = p.SubSubCatId
)
select DISTINCT ID
, DESCRIPTION
, replace(STUFF((SELECT distinct ' | ' + a2.CATEGORIES
from a a2
where a.ID = a2.ID
FOR XML PATH(''))
,1,2,''), '>', '>') CATEGORIES
from a
Totally separate answer because of the change to older technology. I think my original answer is still good for folks using current SQL Server versions, so I don't want to remove it.
I don't remember where I got the function. When I found it today it was named split_delimiter. I changed the name, added some comments, and incorporated the ability to have a delimiter that is more than one character long.
CREATE FUNCTION [dbo].[udf_split_string](#delimited_string VARCHAR(8000), #delimiter varchar(10))
RETURNS TABLE AS
RETURN
WITH cte10(num) AS ( -- 10 rows
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
)
, cte100(num) AS ( -- 100 rows
SELECT 1
FROM cte10 t1, cte10 t2
)
, cte10000(num) AS ( -- 10000 rows
SELECT 1
FROM cte100 t1, cte100 t2
)
, cte1(num) AS ( -- 1 row per character
SELECT TOP (ISNULL(DATALENGTH(#delimited_string), 0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM cte10000
)
, cte2(num) AS ( -- locations of strings
SELECT 1
UNION ALL
SELECT t.num + len(replace(#delimiter, ' ', '_'))
FROM cte1 t
WHERE SUBSTRING(#delimited_string, t.num, len(replace(#delimiter, ' ', '_'))) = #delimiter
)
, cte3(num, [len]) AS (
SELECT t.num
, ISNULL(NULLIF(CHARINDEX(#delimiter, #delimited_string, t.num), 0) - t.num, 8000)
FROM cte2 t
)
SELECT [Key] = ROW_NUMBER() OVER (ORDER BY t.num)
, [Value] = SUBSTRING(#delimited_string, t.num, t.[len])
FROM cte3 t;
GO
DECLARE #PRODUCTS TABLE (ID int, DESCRIPTION varchar(50), CAT varchar(30), SUBCAT varchar(30), SUBSUBCAT varchar(30));
INSERT #PRODUCTS (ID, DESCRIPTION, CAT, SUBCAT, SUBSUBCAT) VALUES
(1, 'NIKE MILLENIUM', '1, 2', '10, 12', '100, 135'),
(2, 'NIKE CORTEZ', '1, 5', '12, 15', '104, 374'),
(3, 'ADIDAS PANTS', '2, 6, 1', '27, 35, 10', '238, 374, 100'),
(4, 'JOE THE PLUMBER JEANS', '1, 5', '27', '238, 374');
DECLARE #CATS TABLE (ID int, DESCR varchar(100));
INSERT #CATS (ID, DESCR) VALUES
(1, 'MEN'),
(2, 'WOMEN'),
(3, 'UNISEX'),
(4, 'KIDS'),
(5, 'TEENS'),
(6, 'BACK TO SCHOOL');
DECLARE #SUBCATS TABLE (ID int, DESCR varchar(100));
INSERT #SUBCATS (ID, DESCR) VALUES
(10, 'FOOTWEAR'),
(12, 'OUTERWEAR'),
(14, 'SWIMWEAR'),
(15, 'HOODIES'),
(27, 'CLOTHING'),
(35, 'SPORTS');
DECLARE #SUBSUBCATS TABLE (ID int, DESCR varchar(100));
INSERT #SUBSUBCATS (ID, DESCR) VALUES
(100, 'RUNNING'),
(104, 'ZIP TOPS'),
(135, 'FLEECE'),
(157, 'BIKINIS'),
(238, 'PANTS'),
(374, 'JOGGERS');
;
with prod as (
SELECT p.ID,
p.DESCRIPTION
, c.value as CatId
, c.[key] as CatKey
, sc.value as SubCatId
, sc.[key] as SubCatKey
, ssc.value as SubSubCatId
, ssc.[key] as SubSubCatKey
FROM #PRODUCTS p
cross apply dbo.udf_split_string(cat, ', ') c
cross apply dbo.udf_split_string(subcat, ', ') sc
cross apply dbo.udf_split_string(subsubcat, ', ') ssc
where c.[key] = sc.[key]
and c.[key] = ssc.[key]
)
, a as (
select p.ID
, p.DESCRIPTION
, c.DESCR + ' > ' + sc.DESCR + ' > ' + ssc.DESCR as CATEGORIES
, p.CatKey
from prod p
inner join #CATS c on c.ID = p.CatId
inner join #SUBCATS sc on sc.ID = p.SubCatId
inner join #SUBSUBCATS ssc on ssc.ID = p.SubSubCatId
)
select DISTINCT ID
, DESCRIPTION
, replace(STUFF((SELECT distinct ' | ' + a2.CATEGORIES
from a a2
where a.ID = a2.ID
FOR XML PATH(''))
,1,2,''), '>', '>') CATEGORIES
from a
Well that should do work, i changed your character ">" for "-" just for see the data more simple.
the design of your tables is not perfect but the first try almost never is.
select mainp.ID, mainp.DESCRIPTION, stuff(ppaths.metapaths, len(ppaths.metapaths),1,'') metalinks
from #PRODUCTS mainp
cross apply(
select
(select
c.DESCR + '-' + sc.DESCR + '-' + sbc.DESCR + '|'
from #PRODUCTS p
cross apply (select row_number() over(order by Value) id, Value from split(p.CAT, ','))cat_ids
inner join #cats c on c.ID = cat_ids.Value
cross apply (select row_number() over(order by Value) id, Value from split(p.SUBCAT, ','))subcat_ids
inner join #SUBCATS sc on sc.ID = subcat_ids.Value
and subcat_ids.id = subcat_ids.id
cross apply (select row_number() over(order by Value) id, Value from split(p.SUBSUBCAT, ','))subsubcat_ids
inner join #SUBSUBCATS sbc on sbc.ID = subsubcat_ids.Value
and subsubcat_ids.id = subcat_ids.id
where p.id = mainp.ID
for xml path('')) metapaths
) ppaths
the link for split function
https://desarrolladores.me/2014/03/sql-server-funcion-split-para-dividir-un-string/

Select top 3 records from each territory

I am struggling with a PostgreSQL query to select the top 3 records by each category. I have 3 tables (as given below) and data,
geography:
formulary:
formulary_controller:
I want to select the top 3 formularies order by lives desc from each territory under a region.
The query I came up with so far is,
select
t.name territory, f.name formulary, sum(f.lives) lives
from formulary f join geography t on f.territory_id = t.id
where t.parent_id = '1'
group by t.name, f.name
order by lives desc, territory;
If I apply limit 3 top above query will give me the top 3 from all territories, but I want to select the top 3 formularies from each territory under the keystone region.
Queries to create a table and load dummy data:
create table formulary_controller
(
id SERIAL primary key,
name varchar(300)
);
create table geography
(
id SERIAL primary key,
name varchar(250),
parent_id integer references geography(id)
);
create table formulary
(
id SERIAL primary key,
name varchar(300),
lives integer,
controller_id integer references formulary_controller(id),
territory_id integer references geography(id)
);
insert into formulary_controller (id, name)
values(1, 'cont1'), (2, 'cont2');
insert into geography (id, name, parent_id)
values
(1, 'keystone', null),
(2, 'pittsburgh', 1),
(3, 'Baltimore', 1);
insert into formulary
(name, lives, controller_id, territory_id)
values
('PA FRM 1', 200, 1, 2),
('PA FRM 2', 1400, 1, 2),
('PA FRM 3', 1300, 1, 2),
('PA FRM 4', 100, 1, 2),
('PA FRM 5', 2430, 1, 2),
('BA FRM 1', 100, 2, 3),
('BA FRM 2', 2300, 2, 3),
('BA FRM 3', 1200, 2, 3),
('BA FRM 4', 1650, 2, 3),
('BA FRM 5', 1200, 2, 3);
you can find top 3 territory_id by using below query using row_number()
with cte (select a.*,
row_number() over(partition by territory_id order by lives desc) rn
from formulary a
) select cte.* from cte where rn<=3
You can try the below - DEMO HERE
select * from
(
select
t.name territory, f.name formulary, sum(f.lives) lives,
row_number() over(partition by t.name order by sum(f.lives) desc) as rn
from formulary f join geography t on f.territory_id = t.id
where t.parent_id = '1'
group by t.name,f.name
)A where rn<=3

How to get columns from different unrelated tables having similar column names?

I have 3 tables:
Pay Group:
PayGroupId Name Description Code
1 US Weekly US Weekly USW
2 Can Weekly Canada Weekly CANW
3 US Monthly US Monthly USM
4 Can Monthly Can Monthly CANM
Pay Type:
PayTypeId Name Description Code
1 Hourly Hourly H
2 Salary Salaried S
Pay Code:
PayCodeId Name Description Code
1 Regular Regular REG
2 PTO PTO PTO
3 Sick Sick SICK
I need a report in following format:
PayGroup PayType PayCode
US Weekly Hourly Regular
Can Weekly Salary PTO
US Monthly Sick
Can we do this?
I suspect this gets you the result you are after, but seems like an odd requirement:
WITH PG AS(
SELECT [Name],
ROW_NUMBER() OVER (ORDER BY PayGroupID ASC) AS RN
FROM PayGroup),
PT AS(
SELECT [Name],
ROW_NUMBER() OVER (ORDER BY PayTypeID ASC) AS RN
FROM PayGroup),
PC AS(
SELECT [Name],
ROW_NUMBER() OVER (ORDER BY PayCodeID ASC) AS RN
FROM PayCode)
SELECT PG.[Name] AS PayGroup,
PT.[Name] AS PayType,
PC.[Name] AS PayCode
FROM PG
FULL OUTER JOIN PT ON PG.RN = PT.RN
FULL OUTER JOIN PC ON PG.RN = PC.RN
OR PT.RN = PC.RN;
CREATE TABLE #table1
([PayGroupId] int, [Name] varchar(11), [Description] varchar(13), [Code] varchar(4))
;
INSERT INTO #table1
([PayGroupId], [Name], [Description], [Code])
VALUES
(1, 'US Weekly', 'US Weekly', 'USW'),
(2, 'Can Weekly', 'Canada Weekly', 'CANW'),
(3, 'US Monthly', 'US Monthly', 'USM'),
(4, 'Can Monthly', 'Can Monthly', 'CANM')
;
CREATE TABLE #table2
([PayTypeId] int, [Name] varchar(6), [Description] varchar(8), [Code] varchar(1))
;
INSERT INTO #table2
([PayTypeId], [Name], [Description], [Code])
VALUES
(1, 'Hourly', 'Hourly', 'H'),
(2, 'Salary', 'Salaried', 'S')
;
CREATE TABLE #table3
([PayCodeId] int, [Name] varchar(7), [Description] varchar(7), [Code] varchar(4))
;
INSERT INTO #table3
([PayCodeId], [Name], [Description], [Code])
VALUES
(1, 'Regular', 'Regular', 'REG'),
(2, 'PTO', 'PTO', 'PTO'),
(3, 'Sick', 'Sick', 'SICK')
;
select a.name PayGroup ,isnull(B.Name,'') PayType ,isnull(C.Name,'')PayCode
from #table1 A left join #table2 B on a.[PayGroupId]=b.[PayTypeId]left join
#table3 c on c.[PayCodeId]=a.[PayGroupId]
PayGroup PayType PayCode
US Weekly Hourly Regular
Can Weekly Salary PTO
US Monthly Sick
Can Monthly

Building Matrix via SQL

I am using two query from a schema which counts companies, but I am struggling how to combine them give Columns as Country and Industry as row and give the company counts accordingly.
select g.simpleindustrydescription, count(c.companyid) as companycount from ciqcompany c
join ciqsimpleindustry g on g.simpleIndustryid = c.simpleIndustryid
join ciqbusinessdescription b on b.companyid = c.companyid
group by g.simpleindustrydescription
select g.country, count(c.companyid) as companycount from ciqcompany c
join ciqcountrygeo g on g.countryid = c.countryid
join ciqbusinessdescription b on b.companyid = c.companyid
group by g.country
Expected Output:
Country A Country B Country C
Industry A 5 5 6
Industry B 3 3 4
Industry C 4 8 6
Due to lack of real example data, here a simple pivot example basing on some dummy records:
CREATE TABLE #tCountry
(
ID INT
,Name NVARCHAR(100)
);
INSERT INTO #tCountry VALUES (1, 'Country A'), (2, 'Country B'), (3, 'Country C');
CREATE TABLE #tIndustry
(
ID INT
,Name NVARCHAR(100)
);
INSERT INTO #tIndustry VALUES (1, 'Industry A'), (2, 'Industry B'), (3, 'Industry C');
CREATE TABLE #tMapping
(
ID INT
,CountryID INT
,IndustryID INT
,Name NVARCHAR(100)
);
INSERT INTO #tMapping VALUES (1, 1, 1, 'Country A Industry A - 1'), (2, 1, 1, 'Country A Industry A - 2');
INSERT INTO #tMapping VALUES (3, 1, 2, 'Country A Industry B - 1'), (4, 1, 2, 'Country A Industry b - 2'), (5, 1, 2, 'Country A Industry b - 3');
INSERT INTO #tMapping VALUES (6, 2, 1, 'Country B Industry A - 1');
DECLARE #lCountries NVARCHAR(max) = N'';
DECLARE #stmt NVARCHAR(MAX) = N'';
SELECT #lCountries += N', ' + QUOTENAME(CountryName)
FROM(
SELECT DISTINCT tc.Name CountryName
FROM #tCountry tc
JOIN #tMapping tm ON tm.CountryID = tc.ID
) x;
SELECT #lCountries = STUFF(#lCountries, 1, 2, '');
SELECT #stmt = 'SELECT *
FROM
(
SELECT ti.Name IndustryName, tc.Name CountryName, COUNT(*) MappingCounter
FROM #tMapping tm
JOIN #tCountry tc ON tm.CountryID = tc.ID
JOIN #tIndustry ti ON tm.IndustryID = ti.ID
GROUP BY ti.Name, tc.Name
) t
PIVOT (MAX(t.MappingCounter) FOR CountryName in (' + #lCountries + ')) AS x';
EXEC sp_executesql #stmt
Anyways, if you are dealing with an unknown number of countries, you might want to extend this example a little and use dynamic SQL in order to build the pivot statement. Got an example somewhere, but would have to search for it...
Result of my example:
IndustryName Country A Country B Country C
Industry A 2 1 NULL
Industry B 3 NULL NULL

SQL Server function to get top level parent in hierarchy

I have following table (master_group) structure :
code name under
1 National Sales Manager 1
2 regional sales manager 1
3 area sales manager 2
4 sales manager 3
How do I get the ultimate parent of a particular row like :
code name under ultimateparent
1 National Sales Manager 1 1
2 regional sales manager 1 1
3 area sales manager 2 1
4 sales manager 3 1
With recursive cte going from top to childs:
with cte as(
select *, code as ultimate from t where code = under
union all
select t.*, c.ultimate from t
join cte c on c.code = t.under
where t.code <> t.under
)
select * from cte
For data:
create table t (code int, name varchar(100), under int)
insert into t values
(1, 'National Sales Manager', 1),
(2, 'regional sales manager', 1),
(3, 'area sales manager', 2),
(4, 'sales manager', 3),
(5, 'a', 5),
(6, 'b', 5),
(7, 'c', 5),
(8, 'd', 7),
(9, 'e', 7),
(10, 'f', 9),
(11, 'g', 9)
it generates the output:
code name under ultimate
1 National Sales Manager 1 1
5 a 5 5
6 b 5 5
7 c 5 5
8 d 7 5
9 e 7 5
10 f 9 5
11 g 9 5
2 regional sales manager 1 1
3 area sales manager 2 1
4 sales manager 3 1
Fiddle http://sqlfiddle.com/#!6/17c12e/1
You can use a recursive CTE to walk the tree and then choose the highest level for each code:
with cte as (
select mg.code, mg.name as name, mg.under as under, mg.under as parent, 1 as lev
from master_group mg
union all
select mg.code, mg.name, mg.under, cte.under as parent, cte.lev + 1
from master_group mg join
cte
on mg.under = cte.code
where cte.under is not null and cte.under <> mg.code
)
select code, name, under, parent as ultimateparent
from (select cte.*, max(lev) over (partition by cte.code) as maxlev
from cte
) t
where lev = maxlev;
Here is a SQL Fiddle.
I would put NULL as under (in my example ParentId) when it's the top record. With this assumption here's a solution
;
WITH Result AS
(
SELECT Id, ParentId, Name, Id as [Top] FROM
sample
where ParentId IS NULL
UNION ALL
SELECT s.Id, s.ParentId, s.Name, [Top]
FROM sample s INNER JOIN Result R ON s.ParentId = R.Id
)
http://sqlfiddle.com/#!6/13b9d/14
I suggest you to use a recursive function like this:
CREATE FUNCTION dbo.parentID (#code int)
RETURNS int AS
BEGIN
DECLARE #ResultVar int
SELECT #ResultVar = (SELECT under FROM master_group WHERE code = #code)
IF #ResultVar <> #code
BEGIN
SELECT #ResultVar = dbo.parentID(#ResultVar)
END
RETURN #ResultVar
END
GO
An use it like this:
SELECT *,
dbo.parentId(code) AS ultimateparent
FROM master_group
I'm going to shamelessly steal the data setup from another answer and demonstrate how you'd do this with hierarchyid:
create table t (code int, name varchar(100), under int)
insert into t values
(1, 'National Sales Manager', null),
(2, 'regional sales manager', 1),
(3, 'area sales manager', 2),
(4, 'sales manager', 3),
(5, 'a', null),
(6, 'b', 5),
(7, 'c', 5),
(8, 'd', 7),
(9, 'e', 7),
(10, 'f', 9),
(11, 'g', 9);
with cte as (
select code, name, under as parentCode, code as ultimateParent, cast('/' + cast(code as varchar) + '/' as nvarchar(max)) as h
from t
where under is null
union all
select child.code, child.name, child.under as ParentCode, parent.ultimateParentCode, cast(parent.h + cast(child.code as varchar) + '/' as nvarchar(max))
from t as child
join cte as parent
on child.under = parent.code
), hier as (
select code, name, parentCode, ultimateParentCode, cast(h as hierarchyid) as h
from cte
)
select code, name, parentCode, ultimateParentCode, h.ToString(), h.GetAncestor(h.GetLevel()-1).ToString()
from hier
Keep in mind, the recursive CTE need only be done once (or on data changes). The point that I'm making is that once you have a hierarchyid calculated (which you can store in row, btw), it's easy to answer the question you're posing with method calls on the hierarchyid (and possibly a join if you want to get back the progenitor's info).