Query to Get All IDs When Duplicate Data Has Different Keys - sql

I've got a SQL table similar to this:
+-----------------------------------------------+
| ID | FirstName | LastName | SomeOtherData|
+-----------------------------------------------+
| 200 | Robert | Barone | Foo |
| 228 | Doug | Heffernan | Bar |
| 2091 | Robert | Barone | Foo |
| 3921 | Doug | Heffernan | Bar |
| 291 | Greg | Warner | Barfoo |
+-----------------------------------------------+
Now what I'm having trouble producing is a table that'll list both IDs for a given Person, assuming that FirstName and LastName are used to indicate duplicates. So, basically I'm trying to get:
+---------------------------------------------------------+
| ID | OtherID | FirstName | LastName | SomeOtherData|
+---------------------------------------------------------+
| 200 | 2091 | Robert | Barone | Foo |
| 228 | 3921 | Doug | Heffernan | Bar |
| 291 | | Greg | Warner | Barfoo |
+---------------------------------------------------------+
Would anyone be able to help me out with something like this? Thanks!

You can use a PIVOT which will transform the data from rows into columns:
select [1] Id,
[2] OtherId,
firstname,
lastname
from
(
select id, firstname, lastname,
row_number() over(partition by firstname, lastname
order by id) rn
from yourtable
) src
pivot
(
max(id)
for rn in ([1], [2])
) piv
See SQL Fiddle with Demo
Or you could use an aggregate function with a CASE expression:
select
max(case when rn = 1 then id end) Id,
max(case when rn = 2 then id end) OtherId,
firstname,
lastname
from
(
select id, firstname, lastname,
row_number() over(partition by firstname, lastname
order by id) rn
from yourtable
) src
group by firstname, lastname
The above works well when the number of duplicates per person is known in advance (two, in this case). If a person can have more than two IDs, you can use dynamic SQL instead. The dynamic SQL would look like:
DECLARE @cols AS NVARCHAR(MAX),
@colNames AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX)
select @cols = STUFF((SELECT distinct ',' + QUOTENAME(cast(row_number() over(partition by firstname, lastname order by id) as varchar(50)))
from yourtable
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
select @colNames = STUFF((SELECT distinct ', ' + QUOTENAME(cast(row_number() over(partition by firstname, lastname order by id) as varchar(50))) +' as Id_' + cast(row_number() over(partition by firstname, lastname order by id) as varchar(50))
from yourtable
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = 'SELECT ' + @colNames + ', firstname, lastname from
(
select id, firstname, lastname,
row_number() over(partition by firstname, lastname
order by id) rn
from yourtable
) x
pivot
(
max(id)
for rn in (' + @cols + ')
) p
'
execute(@query)
See SQL Fiddle with Demo
The result of all 3 would be:
| ID | OTHERID | FIRSTNAME | LASTNAME |
-----------------------------------------
| 200 | 2091 | Robert | Barone |
| 228 | 3921 | Doug | Heffernan |
| 291 | (null) | Greg | Warner |
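If you want to try the CASE-based version without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. SQLite has no PIVOT operator, so only the conditional-aggregation form is shown. The table name people and the function name pair_duplicate_ids are assumptions made for illustration, and the window function needs SQLite 3.25 or later.

```python
import sqlite3

# Sample rows copied from the question; the table name "people" is an assumption.
ROWS = [
    (200, "Robert", "Barone", "Foo"),
    (228, "Doug", "Heffernan", "Bar"),
    (2091, "Robert", "Barone", "Foo"),
    (3921, "Doug", "Heffernan", "Bar"),
    (291, "Greg", "Warner", "Barfoo"),
]

# Number the rows of each (firstname, lastname) group by id, then fold
# row 1 and row 2 of every group into two columns with MAX(CASE ...).
QUERY = """
SELECT MAX(CASE WHEN src.rn = 1 THEN src.id END) AS Id,
       MAX(CASE WHEN src.rn = 2 THEN src.id END) AS OtherId,
       src.firstname,
       src.lastname
FROM (
    SELECT id, firstname, lastname,
           ROW_NUMBER() OVER (PARTITION BY firstname, lastname ORDER BY id) AS rn
    FROM people
) AS src
GROUP BY src.firstname, src.lastname
ORDER BY 1
"""


def pair_duplicate_ids(rows=ROWS):
    """Load the rows into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE people (id INTEGER, firstname TEXT, lastname TEXT, someotherdata TEXT)"
    )
    conn.executemany("INSERT INTO people VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in pair_duplicate_ids():
        print(row)
```

A person with no duplicate comes back with None (NULL) in the second ID column, which matches the (null) shown for Greg Warner.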

Related

Dynamic field content as Row Sql

I have the following dataset on a sql database
----------------------------------
| ID | NAME | AGE | STATUS |
-----------------------------------
| 1ASDF | Brenda | 21 | Single |
-----------------------------------
| 2FDSH | Ging | 24 | Married|
-----------------------------------
| 3SDFD | Judie | 18 | Widow |
-----------------------------------
| 4GWWX | Sophie | 21 | Married|
-----------------------------------
| 5JDSI | Mylene | 24 | Single |
-----------------------------------
I want to query that dataset so that I can have this structure in my result:
--------------------------------------
| AGE | SINGLE | MARRIED | WIDOW |
--------------------------------------
| 21 | 1 | 1 | 0 |
--------------------------------------
| 24 | 1 | 1 | 0 |
--------------------------------------
| 18 | 0 | 0 | 1 |
--------------------------------------
And the status column can be dynamic so there will be more columns to come.
Is this possible?
Since you are using SQL Server, you can use the PIVOT table operator like this:
SELECT *
FROM
(
SELECT Age, Name, Status FROM tablename
) AS t
PIVOT
(
COUNT(Name)
FOR Status IN(Single, Married, Widow)
) AS p;
SQL Fiddle Demo
To do it dynamically you have to use dynamic sql like this:
DECLARE @cols AS NVARCHAR(MAX);
DECLARE @query AS NVARCHAR(MAX);
select @cols = STUFF((SELECT distinct ',' +
QUOTENAME(status)
FROM tablename
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
, 1, 1, '');
SELECT @query = '
SELECT *
FROM
(
SELECT Age, Name, Status FROM tablename
) AS t
PIVOT
(
COUNT(Name)
FOR Status IN( ' +@cols + ')
) AS p;';
execute(@query);
Updated SQL Fiddle Demo
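The two-step idea behind the dynamic version (first read the distinct status values, then build one output column per value) can be sketched outside SQL Server as well. The following uses Python's sqlite3 module with COUNT(CASE ...) in place of PIVOT, since SQLite does not support PIVOT. The table name people and the helper names quote_identifier and count_status_by_age are assumptions, and the fifth row uses the spelling "Single" so that it lines up with the expected output in the question.

```python
import sqlite3

# Sample rows from the question; the table name "people" is an assumption.
ROWS = [
    ("1ASDF", "Brenda", 21, "Single"),
    ("2FDSH", "Ging", 24, "Married"),
    ("3SDFD", "Judie", 18, "Widow"),
    ("4GWWX", "Sophie", 21, "Married"),
    ("5JDSI", "Mylene", 24, "Single"),
]


def quote_identifier(name):
    # Double-quote an identifier and escape embedded quotes,
    # playing the role QUOTENAME plays in the T-SQL version.
    return '"' + name.replace('"', '""') + '"'


def count_status_by_age(rows=ROWS):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (id TEXT, name TEXT, age INTEGER, status TEXT)")
    conn.executemany("INSERT INTO people VALUES (?, ?, ?, ?)", rows)

    # Step 1: the distinct status values become the output columns.
    statuses = [r[0] for r in conn.execute(
        "SELECT DISTINCT status FROM people ORDER BY status")]

    # Step 2: one COUNT(CASE ...) per status; the values are bound as parameters.
    columns = ", ".join(
        "COUNT(CASE WHEN status = ? THEN 1 END) AS " + quote_identifier(s)
        for s in statuses
    )
    sql = "SELECT age AS age, " + columns + " FROM people GROUP BY age ORDER BY age"
    cursor = conn.execute(sql, statuses)
    header = [d[0] for d in cursor.description]
    result = cursor.fetchall()
    conn.close()
    return header, result


if __name__ == "__main__":
    header, result = count_status_by_age()
    print(header)
    for row in result:
        print(row)
```

Binding the status values as parameters and quoting the generated column names keeps unusual status text from breaking the generated statement, which is the same concern QUOTENAME addresses above.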

grouping and switching the columns and rows

I don't know if this would officially be called a pivot, but the result that I would like is this:
+------+---------+------+
| Alex | Charley | Liza |
+------+---------+------+
| 213 | 345 | 1 |
| 23 | 111 | 5 |
| 42 | 52 | 2 |
| 323 | | 5 |
| 23 | | 1 |
| 324 | | 5 |
+------+---------+------+
My input data is in this form:
+-----+---------+
| Apt | Name |
+-----+---------+
| 213 | Alex |
| 23 | Alex |
| 42 | Alex |
| 323 | Alex |
| 23 | Alex |
| 324 | Alex |
| 345 | Charley |
| 111 | Charley |
| 52 | Charley |
| 1 | Liza |
| 5 | Liza |
| 2 | Liza |
| 5 | Liza |
| 1 | Liza |
| 5 | Liza |
+-----+---------+
Because I have approximately 100 names, I don't want to have to write a ton of subqueries like this:
select null, null, thirdcolumn from...
select null, secondcolumn from...
select firstcolumn from...
Is there a way to do this with PIVOT or otherwise?
You can do this with dynamic PIVOT and the ROW_NUMBER() function:
DECLARE @cols AS VARCHAR(MAX),
@query AS VARCHAR(MAX)
SELECT @cols = STUFF((SELECT ',' + QUOTENAME(Name)
FROM (SELECT DISTINCT Name
FROM #test
)sub
ORDER BY Name
FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)')
,1,1,'')
PRINT @cols
SET @query = '
WITH cte AS (SELECT DISTINCT *
FROM #test)
,cte2 AS (SELECT *,ROW_NUMBER() OVER(PARTITION BY Name ORDER BY Apt)RowRank
FROM cte)
SELECT *
FROM cte2
PIVOT (max(Apt) for Name in ('+@cols+')) p
'
EXEC (@query)
SQL Fiddle - Distinct List, Specific Order
Edit: If you don't want the list to be distinct, eliminate the first cte above, and if you want to keep arbitrary ordering change the ORDER BY to (SELECT 1):
DECLARE @cols AS VARCHAR(MAX),
@query AS VARCHAR(MAX)
SELECT @cols = STUFF((SELECT ',' + QUOTENAME(Name)
FROM (SELECT DISTINCT Name
FROM #test
)sub
ORDER BY Name
FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)')
,1,1,'')
PRINT @cols
SET @query = '
WITH cte AS (SELECT *,ROW_NUMBER() OVER(PARTITION BY Name ORDER BY (SELECT 1))RowRank
FROM #test)
SELECT *
FROM cte
PIVOT (max(Apt) for Name in ('+@cols+')) p
'
EXEC (@query)
SQL Fiddle - Full List, Arbitrary Order
And finally, if you didn't want the RowRank field in your results, just re-use the #cols variable in your SELECT:
SET @query = '
WITH cte AS (SELECT *,ROW_NUMBER() OVER(PARTITION BY Name ORDER BY (SELECT 1))RowRank
FROM #test)
SELECT '+@cols+'
FROM cte
PIVOT (max(Apt) for Name in ('+@cols+')) p
'
EXEC (@query)
Oh, this is something of a pain, but you can do it with SQL. You are trying to concatenate the columns.
select seqnum,
max(case when name = 'Alex' then apt end) as Alex,
max(case when name = 'Charley' then apt end) as Charley,
max(case when name = 'Liza' then apt end) as Liza
from (select t.*, row_number() over (partition by name order by (select NULL)) as seqnum
from t
) t
group by seqnum
order by seqnum;
As a note: there is no guarantee that the original ordering will be the same within each column. As you know, SQL tables are inherently unordered, so you would need a column to specify the ordering.
To handle multiple names, I'd just get the list using a query such as:
select distinct 'max(case when name = '''+name+''' then apt end) as '+name+','
from t;
And copy the results into the query.
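To illustrate the point about ordering, here is a rough sketch in Python's sqlite3 module that adds an explicit ordering column before numbering the rows. The table name apartments, the auto-incrementing seq column, and the function name apartments_by_name are all assumptions; the original data has no such column, which is exactly why the order within each name is otherwise not guaranteed. SQLite 3.25 or later is assumed for ROW_NUMBER().

```python
import sqlite3

# Rows in the order shown in the question.
ROWS = [
    (213, "Alex"), (23, "Alex"), (42, "Alex"), (323, "Alex"), (23, "Alex"), (324, "Alex"),
    (345, "Charley"), (111, "Charley"), (52, "Charley"),
    (1, "Liza"), (5, "Liza"), (2, "Liza"), (5, "Liza"), (1, "Liza"), (5, "Liza"),
]

# Number the rows within each name by the assumed "seq" column, then
# group by that number so the n-th row of every name shares one output row.
QUERY = """
SELECT MAX(CASE WHEN name = 'Alex'    THEN apt END) AS Alex,
       MAX(CASE WHEN name = 'Charley' THEN apt END) AS Charley,
       MAX(CASE WHEN name = 'Liza'    THEN apt END) AS Liza
FROM (
    SELECT name, apt,
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY seq) AS seqnum
    FROM apartments
) AS t
GROUP BY seqnum
ORDER BY seqnum
"""


def apartments_by_name(rows=ROWS):
    conn = sqlite3.connect(":memory:")
    # "seq" records insertion order; it is a hypothetical column added for this sketch.
    conn.execute(
        "CREATE TABLE apartments (seq INTEGER PRIMARY KEY AUTOINCREMENT, apt INTEGER, name TEXT)"
    )
    conn.executemany("INSERT INTO apartments (apt, name) VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in apartments_by_name():
        print(row)
```

Names with fewer rows than the longest list are padded with None (NULL), as in the desired output for Charley.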

Transpose Column Value to Row in T-SQL

I know this might have been asked before but I really could not find the answer. I have a temporary table named #TEMP which looks like this:
+===============================+=============================+
| NAME | ATTRIBUTE |
+===============================+=============================+
| BadgeType | Permanent |
+-------------------------------+-----------------------------+
| PrimaryLocationInCompany | No |
+-------------------------------+-----------------------------+
| AdminAccessToProductionServer | No |
+-------------------------------+-----------------------------+
| AccessToImportantFIles | No |
+-------------------------------+-----------------------------+
| Waiver_Number | 56987 |
+-------------------------------+-----------------------------+
| Summary | User not much active |
+-------------------------------+-----------------------------+
| TimeStamp | 3/3/2009 |
+-------------------------------+-----------------------------+
| UserID | 86478925 |
+-------------------------------+-----------------------------+
What I want to do is to transpose both the Name and Attribute values to rows. The Attribute values may vary but the Name values are always fixed.
The result should look like this:
+----------+---------------+------------------------------+--------------------------------+-----------------------+--------------------------------------------------------+----------+-----------+
| UserID | BadgeType | PrimaryLocationIntelFacility | adminAccessToProductionServer | AccessToClassifiedData| Info_Sec_Waiver_Number | Summary | TimeStamp |
+----------+---------------+------------------------------+--------------------------------+-----------------------+--------------------------------------------------------+----------+-----------+
| 11313403 | GREEN | No | No | No | This contingent worker is eligible for remote access. | 3/3/2009 | |
+----------+---------------+------------------------------+--------------------------------+-----------------------+--------------------------------------------------------+----------+-----------+
Try this
SELECT UserID,BadgeType,PrimaryLocationInCompany,AdminAccessToProductionServer,AccessToImportantFIles,Waiver_Number,Summary,[TimeStamp]
FROM
(
SELECT * FROM #Temp
) p
PIVOT
(
MIN([ATTRIBUTE]) FOR [NAME] IN(BadgeType,PrimaryLocationInCompany,AdminAccessToProductionServer,AccessToImportantFIles,Waiver_Number,Summary,[TimeStamp],UserID)
) T
) T
For dynamic columns, see the following:
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX)
select @cols = STUFF((SELECT ',' + [NAME]
from #Temp
group by [NAME]
FOR XML PATH(''))
,1,1,'')
set @query = 'SELECT ' + @cols + ' from
(
select * from #Temp
) x
pivot
(
MAX([ATTRIBUTE])
for [NAME] in (' + @cols + ')
) p '
execute(@query)
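For anyone who wants to check the shape of the result without SQL Server, here is a small sketch using Python's sqlite3 module. It uses MAX(CASE ...) instead of PIVOT because SQLite has no PIVOT operator. The table name temp_attributes (standing in for #TEMP) and the function name transpose_attributes are assumptions, and the list of names is hard-coded because the question says the Name values are fixed.

```python
import sqlite3

# Name/attribute pairs copied from the question.
ROWS = [
    ("BadgeType", "Permanent"),
    ("PrimaryLocationInCompany", "No"),
    ("AdminAccessToProductionServer", "No"),
    ("AccessToImportantFIles", "No"),
    ("Waiver_Number", "56987"),
    ("Summary", "User not much active"),
    ("TimeStamp", "3/3/2009"),
    ("UserID", "86478925"),
]

# Output column order; these are fixed, trusted identifiers taken from ROWS.
NAMES = [
    "UserID", "BadgeType", "PrimaryLocationInCompany",
    "AdminAccessToProductionServer", "AccessToImportantFIles",
    "Waiver_Number", "Summary", "TimeStamp",
]


def transpose_attributes(rows=ROWS):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE temp_attributes (name TEXT, attribute TEXT)")
    conn.executemany("INSERT INTO temp_attributes VALUES (?, ?)", rows)

    # One MAX(CASE ...) per name turns each name's attribute into its own column.
    columns = ", ".join(
        'MAX(CASE WHEN name = ? THEN attribute END) AS "{}"'.format(n) for n in NAMES
    )
    cursor = conn.execute("SELECT " + columns + " FROM temp_attributes", NAMES)
    header = [d[0] for d in cursor.description]
    row = cursor.fetchone()
    conn.close()
    return dict(zip(header, row))


if __name__ == "__main__":
    for key, value in transpose_attributes().items():
        print(key, "=", value)
```

Because there is no GROUP BY, the aggregate collapses the whole table into a single row, which is the one-row layout the question asks for.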

SQL Query needs to match similar records

I have a very large table of contacts, and I am building an interface to help my client de-dupe it. Here is an example of the table content:
id | firstname | lastname | email | address1 | address2 | verifiedAt |
1 | James | johnson | james@test.com | | | |
2 | David | bloggs | james@bloggs.com | | | |
3 | John | nobel | james@nobel.com | | | |
4 | Terry | jacket | james@jacket.com | | | 05/05/2013 |
5 | James | johnson | james@johnson.com| | | |
6 | James | privett | james@test.com | | | |
I need to write a query that will return the first contact that has another contact in the same table where either the email addresses match or the firstname + lastname match.
Is this possible in a single query?
Thanks in advance
Try this (SQL Fiddle).
SELECT DISTINCT *
FROM
( SELECT
MIN(id) as [id]
FROM mytable
GROUP BY email
HAVING COUNT(*) > 1
UNION ALL
SELECT
MIN(id) as [id]
FROM mytable
GROUP BY firstName,lastName
HAVING Count(*) > 1 )dups
JOIN myTable t
ON t.Id = dups.id
This works (SQLFiddle DEMO):
SELECT a.* FROM mytable a
JOIN (
SELECT email
FROM mytable
GROUP BY email
HAVING count(*) > 1
) b ON a.email = b.email
UNION
SELECT a.* FROM mytable a
JOIN (
SELECT firstname, lastname
FROM mytable
GROUP BY firstname, lastname
HAVING count(*) > 1
) b ON a.firstname = b.firstname AND a.lastname = b.lastname
To make sure that this query runs fast, be sure to have at least the following indexes:
CREATE INDEX i1 ON mytable(email);
CREATE INDEX i2 ON mytable(firstname, lastname);
One method:
with cte as
(select c.*,
row_number() over (partition by email order by id) rnem,
count(*) over (partition by email) ctem,
row_number() over (partition by firstname, lastname order by id) rnfl,
count(*) over (partition by firstname, lastname) ctfl
from contacts c)
select * from cte
where (ctem > 1 and rnem = 1) or (ctfl > 1 and rnfl = 1)
SQLFiddle here.
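The window-function method can be tried against the sample rows with Python's sqlite3 module, as in the sketch below. The email addresses are written in the usual user@domain form, the address and verifiedAt columns are left out for brevity, and the function name first_duplicates is an assumption. SQLite 3.25 or later is assumed for the window functions.

```python
import sqlite3

# Sample contacts from the question (id, firstname, lastname, email).
ROWS = [
    (1, "James", "johnson", "james@test.com"),
    (2, "David", "bloggs", "james@bloggs.com"),
    (3, "John", "nobel", "james@nobel.com"),
    (4, "Terry", "jacket", "james@jacket.com"),
    (5, "James", "johnson", "james@johnson.com"),
    (6, "James", "privett", "james@test.com"),
]

# For each row, compute its position and the group size within its email
# group and within its (firstname, lastname) group, then keep rows that are
# first in a group containing more than one row.
QUERY = """
WITH cte AS (
    SELECT c.id, c.firstname, c.lastname, c.email,
           ROW_NUMBER() OVER (PARTITION BY c.email ORDER BY c.id) AS rnem,
           COUNT(*)     OVER (PARTITION BY c.email) AS ctem,
           ROW_NUMBER() OVER (PARTITION BY c.firstname, c.lastname ORDER BY c.id) AS rnfl,
           COUNT(*)     OVER (PARTITION BY c.firstname, c.lastname) AS ctfl
    FROM contacts AS c
)
SELECT id, firstname, lastname, email
FROM cte
WHERE (ctem > 1 AND rnem = 1) OR (ctfl > 1 AND rnfl = 1)
ORDER BY id
"""


def first_duplicates(rows=ROWS):
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE contacts (id INTEGER, firstname TEXT, lastname TEXT, email TEXT)"
    )
    conn.executemany("INSERT INTO contacts VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in first_duplicates():
        print(row)
```

With this sample, contact 1 is the first row of both the duplicated email group (ids 1 and 6) and the duplicated name group (ids 1 and 5), so it is the only row returned.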

Write advanced SQL Select

Item table:
| Item | Qnty | ProdSched |
| a | 1 | 1 |
| b | 2 | 1 |
| c | 3 | 1 |
| a | 4 | 2 |
| b | 5 | 2 |
| c | 6 | 2 |
Is there a way I can output it like this using SQL SELECT?
| Item | ProdSched(1)(Qnty) | ProdSched(2)(Qnty) |
| a | 1 | 4 |
| b | 2 | 5 |
| c | 3 | 6 |
You can use PIVOT for this. If you have a known number of values to transform, then you can hard-code the values via a static pivot:
select item, [1] as ProdSched_1, [2] as ProdSched_2
from
(
select item, qty, prodsched
from yourtable
) x
pivot
(
max(qty)
for prodsched in ([1], [2])
) p
see SQL Fiddle with Demo
If the number of columns is unknown, then you can use a dynamic pivot:
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX)
select @cols = STUFF((SELECT distinct ',' + QUOTENAME(prodsched)
from yourtable
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = 'SELECT item,' + @cols + ' from
(
select item, qty, prodsched
from yourtable
) x
pivot
(
max(qty)
for prodsched in (' + @cols + ')
) p '
execute(@query)
see SQL Fiddle with Demo
SELECT Item,
[ProdSched(1)(Qnty)] = MAX(CASE WHEN ProdSched = 1 THEN Qnty END),
[ProdSched(2)(Qnty)] = MAX(CASE WHEN ProdSched = 2 THEN Qnty END)
FROM dbo.tablename
GROUP BY Item
ORDER BY Item;
Let's hit this in two phases. First, although this is not the exact format you wanted, you can get the data you asked for as follows:
Select item, ProdSched, max(qty)
from Item1
group by item,ProdSched
Now, to get the data in the format you desired, one way of accomplishing it is a PIVOT table. You can cook up a pivot table in SQL Server as follows:
Select item, [1] as ProdSched1, [2] as ProdSched2
from ( Select Item, Qty, ProdSched
from item1 ) x
Pivot ( Max(qty) for ProdSched in ([1],[2])) y
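As a rough way to experiment with the unknown-number-of-schedules case outside SQL Server, the sketch below builds one MAX(CASE ...) column per distinct ProdSched value using Python's sqlite3 module. SQLite has no PIVOT, so conditional aggregation stands in for it. The table name items and the function name pivot_quantities are assumptions; the column names follow the Qnty spelling and the ProdSched(n)(Qnty) headings from the question.

```python
import sqlite3

# Sample rows from the question (Item, Qnty, ProdSched).
ROWS = [
    ("a", 1, 1), ("b", 2, 1), ("c", 3, 1),
    ("a", 4, 2), ("b", 5, 2), ("c", 6, 2),
]


def pivot_quantities(rows=ROWS):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (item TEXT, qnty INTEGER, prodsched INTEGER)")
    conn.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)

    # Discover the schedule numbers that actually occur in the data.
    schedules = [r[0] for r in conn.execute(
        "SELECT DISTINCT prodsched FROM items ORDER BY prodsched")]

    # One column per schedule; int() guarantees only digits reach the column name.
    columns = ", ".join(
        'MAX(CASE WHEN prodsched = ? THEN qnty END) AS "ProdSched({})(Qnty)"'.format(int(s))
        for s in schedules
    )
    sql = "SELECT item AS item, " + columns + " FROM items GROUP BY item ORDER BY item"
    cursor = conn.execute(sql, schedules)
    header = [d[0] for d in cursor.description]
    result = cursor.fetchall()
    conn.close()
    return header, result


if __name__ == "__main__":
    header, result = pivot_quantities()
    print(header)
    for row in result:
        print(row)
```

Adding rows for a third schedule would produce a third column without touching the query text, which is the same effect the dynamic PIVOT achieves.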