Pivot dataset with column names - sql

if you have a table (example)
declare #MyTable table (
CustomerName nvarchar(50),
BirthDate datetime,
BirtPlace nvarchar(50),
Phone nvarchar(50),
Email nvarchar(50)
)
insert into #MyTable
(
CustomerName,
BirthDate,
BirtPlace,
Phone,
Email
)
values (
'Customer1',
'12.05.1990',
'Place1',
N'+000125456789',
N'customer#customer.com'
)
Is it possible to get following result set:
CustomerName Customer1
BirtDate 1990-12-05
BirtPlace Place1
Phone +000125456789
Email customer#customer.com
Something like pivot, but i don't have any idea how to get to this result.

As you want to change columns to rows the function you want is unpivot not pivot.
This should do the trick:
SELECT col, val
FROM
(
SELECT
CustomerName,
CAST(BirthDate AS NVARCHAR(50)) BirthDate,
BirtPlace,
Phone,
Email
FROM #MyTable
) AS t
UNPIVOT
(
val FOR col IN (CustomerName, BirthDate, BirtPlace, Phone, Email)
) AS u

Try this
SELECT myColumn, myDetail
FROM
(
SELECT
CustomerName,
CONVERT(NVARCHAR(50),BirthDate,121) AS BirthDate,
BirtPlace,
Phone,
Email
FROM
#MyTable
) AS A
UNPIVOT
(
myDetail FOR myColumn IN (CustomerName, BirthDate, BirtPlace, Phone, Email)
) AS tbUnpivot

This is a unpivot issue, and if you used old version sql server 2000, this unpivot syntax will not work, then you can use UNION:
SELECT 'CustomerName' AS colName, CustomerName AS colVal FROM #MyTable
UNION
SELECT 'BirthDate' AS colName, CAST(BirthDate AS NVARCHAR(50)) AS colVal FROM #MyTable
UNION
SELECT 'BirthPlace' AS colName, BirthPlace AS colVal FROM #MyTable
UNION
SELECT 'Phone' AS colName, Phone AS colVal FROM #MyTable
UNION
SELECT 'Email' AS colName, Email AS colVal FROM #MyTable;

SELECT CustomerName,BirtDate,BirtPlace,Phone,Email FROM #MyTable\G
You have no ID and no Foreign keys so i think thats a way to solve your problem. The \G giveĀ“s you the SQL-query as a list.
values (
'Customer1', '1990-12-05', 'Place1', '+000125456789', 'customer#customer.com'
);
I hope i could help you
Have a nice Day

Related

Identify which columns are different in the two queries

I currently have a query that looks like this:
Select val1, val2, val3, val4 from Table_A where someID = 10
UNION
Select oth1, val2, val3, oth4 from Table_B where someId = 10
I initially run this same query above but with EXCEPT, to identify which ID's are returned with differences, and then I do a UNION query to find which columns specifically are different.
My goal is to compare the values between the two tables (some columns have different names). And that's what I'm doing.
However, the two queries above have about 250 different field names, so it is quite mundane to scroll through to find the differences.
Is there a better and quicker way to identify which column names are different after running the two queries?
EDIT: Here's my current process:
DROP TABLE IF EXISTS #Table_1
DROP TABLE IF EXISTS #Table_2
SELECT 'Dave' AS Name, 'Smih' AS LName, 18 AS Age, 'Alabama' AS State
INTO #Table_1
SELECT 'Dave' AS Name, 'Smith' AS LName, 19 AS Age, 'Alabama' AS State
INTO #Table_2
--FInd differences
SELECT Name, LName,Age,State FROM #Table_1
EXCEPT
SELECT Name, LName,Age,State FROM #Table_2
--How I compare differences
SELECT Name, LName,Age,State FROM #Table_1
UNION
SELECT Name, LName,Age,State FROM #Table_2
Is there any way to streamline this so I can get a column list of differences?
Here is a generic way to handle two tables differences.
We just need to know their primary key column.
It is based on JSON, and will work starting from SQL Server 2016 onwards.
SQL
-- DDL and sample data population, start
DECLARE #TableA TABLE (rowid INT IDENTITY(1,1), FirstName VARCHAR(100), LastName VARCHAR(100), Phone VARCHAR(100));
DECLARE #TableB table (rowid int Identity(1,1), FirstName varchar(100), LastName varchar(100), Phone varchar(100));
INSERT INTO #TableA(FirstName, LastName, Phone) VALUES
('JORGE','LUIS','41514493'),
('JUAN','ROBERRTO','41324133'),
('ALBERTO','JOSE','41514461'),
('JULIO','ESTUARDO','56201550'),
('ALFREDO','JOSE','32356654'),
('LUIS','FERNANDO','98596210');
INSERT INTO #TableB(FirstName, LastName, Phone) VALUES
('JORGE','LUIS','41514493'),
('JUAN','ROBERTO','41324132'),
('ALBERTO','JOSE','41514461'),
('JULIO','ESTUARDO','56201551'),
('ALFRIDO','JOSE','32356653'),
('LUIS','FERNANDOO','98596210');
-- DDL and sample data population, end
SELECT rowid
,[key] AS [column]
,Org_Value = MAX( CASE WHEN Src=1 THEN Value END)
,New_Value = MAX( CASE WHEN Src=2 THEN Value END)
FROM (
SELECT Src=1
,rowid
,B.*
FROM #TableA A
CROSS APPLY ( SELECT [Key]
,Value
FROM OpenJson( (SELECT A.* For JSON Path,Without_Array_Wrapper,INCLUDE_NULL_VALUES))
) AS B
UNION ALL
SELECT Src=2
,rowid
,B.*
FROM #TableB A
CROSS APPLY ( SELECT [Key]
,Value
FROM OpenJson( (SELECT A.* For JSON Path,Without_Array_Wrapper,INCLUDE_NULL_VALUES))
) AS B
) AS A
GROUP BY rowid,[key]
HAVING MAX(CASE WHEN Src=1 THEN Value END)
<> MAX(CASE WHEN Src=2 THEN Value END)
ORDER BY rowid,[key];
Output
rowid
column
Org_Value
New_Value
2
LastName
ROBERRTO
ROBERTO
2
Phone
41324133
41324132
4
Phone
56201550
56201551
5
FirstName
ALFREDO
ALFRIDO
5
Phone
32356654
32356653
6
LastName
FERNANDO
FERNANDOO

Remove duplicates with less null values

I have a table of employees which contains about 25 columns. Right now there are a lot of duplicates and I would like to try and get rid of some of these duplicates.
First, I want to find the duplicates by looking for multiple records that have the same values in first name, last name, employee number, company number and status.
SELECT
firstname,lastname,employeenumber, companynumber, statusflag
FROM
employeemaster
GROUP BY
firstname,lastname,employeenumber,companynumber, statusflag
HAVING
(COUNT(*) > 1)
This gives me duplicates but my goal is to find and keep the best single record and delete the other records. The "best single record" is defined by the record with the least amount of NULL values in all of the other columns. How can I do this?
I am using Microsoft SQL Server 2012 MGMT Studio.
EXAMPLE:
Red: DELETE
Green: KEEP
NOTE: There are a lot more columns in the table than what this table shows.
You can use the sys.columns table to get a list of columns and build a dynamic query. This query will return a 'KeepThese' value for every record you want to keep based on your given criteria.
-- insert test data
create table EmployeeMaster
(
Record int identity(1,1),
FirstName varchar(50),
LastName varchar(50),
EmployeeNumber int,
CompanyNumber int,
StatusFlag int,
UserName varchar(50),
Branch varchar(50)
);
insert into EmployeeMaster
(
FirstName,
LastName,
EmployeeNumber,
CompanyNumber,
StatusFlag,
UserName,
Branch
)
values
('Jake','Jones',1234,1,1,'JJONES','PHX'),
('Jake','Jones',1234,1,1,NULL,'PHX'),
('Jake','Jones',1234,1,1,NULL,NULL),
('Jane','Jones',5678,1,1,'JJONES2',NULL);
-- get records with most non-null values with dynamic sys.column query
declare #sql varchar(max)
select #sql = '
select e.*,
row_number() over(partition by
e.FirstName,
e.LastName,
e.EmployeeNumber,
e.CompanyNumber,
e.StatusFlag
order by n.NonNullCnt desc) as KeepThese
from EmployeeMaster e
cross apply (select count(n.value) as NonNullCnt from (select ' +
replace((
select 'cast(' + c.name + ' as varchar(50)) as value union all select '
from sys.columns c
where c.object_id = t.object_id
for xml path('')
) + '#',' union all select #','') + ')n)n'
from sys.tables t
where t.name = 'EmployeeMaster'
exec(#sql)
Try this.
;WITH cte
AS (SELECT Row_number()
OVER(
partition BY firstname, lastname, employeenumber, companynumber, statusflag
ORDER BY (SELECT NULL)) rn,
firstname,
lastname,
employeenumber,
companynumber,
statusflag,
username,
branch
FROM employeemaster),
cte1
AS (SELECT a.firstname,
a.lastname,
a.employeenumber,
a.companynumber,
a.statusflag,
Row_number()
OVER(
partition BY a.firstname, a.lastname, a.employeenumber, a.companynumber, a.statusflag
ORDER BY (CASE WHEN a.username IS NULL THEN 1 ELSE 0 END +CASE WHEN a.branch IS NULL THEN 1 ELSE 0 END) )rn
-- add the remaining columns in case statement
FROM cte a
JOIN employeemaster b
ON a.firstname = b.firstname
AND a.lastname = b.lastname
AND a.employeenumber = b.employeenumber
AND a.companynumbe = b.companynumber
AND a.statusflag = b.statusflag)
SELECT *
FROM cte1
WHERE rn = 1
I test with MySQL and use NULL String concat to found the best record. Because LENGTH ( NULL || 'data') is 0. Only if all column not NULL some length exists. Maybe this is not perfekt.
create table EmployeeMaster
(
Record int auto_increment,
FirstName varchar(50),
LastName varchar(50),
EmployeeNumber int,
CompanyNumber int,
StatusFlag int,
UserName varchar(50),
Branch varchar(50),
PRIMARY KEY(record)
);
INSERT INTO EmployeeMaster
(
FirstName, LastName, EmployeeNumber, CompanyNumber, StatusFlag, UserName, Branch
) VALUES ('Jake', 'Jones', 1234, 1, 1, 'JJONES', 'PHX'), ('Jake', 'Jones', 1234, 1, 1, NULL, 'PHX'), ('Jake', 'Jones', 1234, 1, 1, NULL, NULL), ('Jane', 'Jones', 5678, 1, 1, 'JJONES2', NULL);
My query idea looks like this
SELECT e.*
FROM employeemaster e
JOIN ( SELECT firstname,
lastname,
employeenumber,
companynumber,
statusflag,
MAX( LENGTH ( username || branch ) ) data_quality
FROM employeemaster
GROUP BY firstname, lastname, employeenumber, companynumber, statusflag
HAVING count(*) > 1
) g
ON LENGTH ( username || branch ) = g.data_quality

Query to Append a Number to a Record it it finds a duplicate

I have an sql table with the following fields: Letter, Number, Result
Title Name Result
Mr Mark
Mr Mark
Mr Luke
Mr John
Mr John
I need to create an update query to have the result as
Title Name Result
Mr Mark MrMark
Mr Mark MrMark2
Mr Luke MrLuke
Mr John MrJohn
Mr John MrJohn2
Note that the second and the fifth record had a number 2 appended since it already found the same record (same Title and Name) previously.
Please help.
If it is MS SQL, try using ROW_NUMBER and PARTITION BY?
DECLARE #temp TABLE (Title NVARCHAR(200), Name NVARCHAR(200), Result NVARCHAR(200));
INSERT #temp
SELECT 'Mr', 'Mark', NULL
UNION ALL
SELECT 'Mr', 'Mark', NULL
UNION ALL
SELECT 'Mr', 'Luke', NULL
UNION ALL
SELECT 'Mr', 'John', NULL
UNION ALL
SELECT 'Mr', 'John', NULL
SELECT * FROM #temp
DECLARE #tempWithOrdering TABLE (RowNum INT, Title NVARCHAR(200), Name NVARCHAR(200), Result NVARCHAR(200));
INSERT #tempWithOrdering
SELECT ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Title ), Title, Name, Result FROM #temp
SELECT * FROM #tempWithOrdering
SELECT
Title,
Name,
Result = (
SELECT TOP(1) Name +
CASE RowNum
WHEN t1.RowNum THEN ''
ELSE CAST(t1.RowNum AS NVARCHAR(12))
END
FROM #tempWithOrdering
WHERE Name = t1.Name
)
FROM #tempWithOrdering t1
Assuming you are using sql server and the duplicate is on the field 'Name' .Try this.
Use Analytic fn ROW_NUMBER() ;
If it is Oracle use || instead of +
WITH TEMP AS
(
SELECT Title , Name,
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY bill_period) AS RK
FROM TABLE1
)
SELECT Title , Name,Title + Name +RK FROM TEMP;

How can I get around differences in column types when using unpivot?

I am having problems using unpivot on columns, that are not the exact same datatype, and I can't figure out how to convert the columns on the fly, because the syntax for UNPIVOT does not seem to support it.
Consider this example:
DECLARE #People TABLE
(PersonId int, Firstname varchar(50), Lastname varchar(50))
-- Load Sample Data
INSERT INTO #People VALUES (1, 'Abe', 'Albertson')
INSERT INTO #People VALUES (2, 'Benny', 'Boomboom')
SELECT PersonId, ColumnName, Value FROM #People
UNPIVOT
(
ColumnName FOR
Value IN (FirstName, LastName)
)
The result would be this:
PersonId ColumnName Value
----------- ----------------- ----------------
1 Abe Firstname
1 Albertson Lastname
2 Benny Firstname
2 Boomboom Lastname
Everything is unicorns and rainbows. Now I change the datatype of Lastname to varchar(25) and everything breaks. The output is:
The type of column "Lastname" conflicts with the type of other columns
specified in the UNPIVOT list.
How can I get around this and convert everything to say a varchar(50) on the fly, without tampering with the actual data types on the table?
SqlFiddle working example (same datatype): http://sqlfiddle.com/#!3/f3719
SqlFiddle broken example (diff datatypes): http://sqlfiddle.com/#!3/5dca13/1
You cannot convert inside the UNPIVOT syntax but you can convert the data inside a subquery similar to the following:
select PersonId, ColumnName, Value
from
(
select personid,
firstname,
cast(lastname as varchar(50)) lastname
from People
) d
unpivot
(
Value FOR
ColumnName in (FirstName, LastName)
) unpiv;
See SQL Fiddle with Demo
Another way to do this would be to use CROSS APPLY, depending on your version of SQL Server you can use CROSS APPLY with VALUES or UNION ALL:
select PersonId, ColumnName, Value
from People
cross apply
(
select 'firstname', firstname union all
select 'lastname', cast(lastname as varchar(50))
) c (ColumnName, value)
See SQL Fiddle with Demo.
select col1, col2, Answer, Question_Col
from (
SELECT
col1,
col2,
col3,
CAST(col4 AS NVARCHAR(MAX)) col4,
-- Note: Here datatype should be same as that of previous columns
CAST(col5 AS NVARCHAR(MAX)) col5
FROM Table_Name )a
unpivot
(
Answer
for Question_Col in (
col3,
col4,
col5
)
) unpiv;

How do you SELECT several columns with one distinct column

Objective: using SqlServer 2005, Select multiple columns, but ensure that 1 specific column is not a duplicate
Issue: The following code does not remove the duplicates. The field that has duplicates is email.
SELECT DISTINCT
email,
name,
phone
FROM
database.dbo.table
WHERE
status = 'active'
GROUP BY
email,
name,
phone
Thank you in advance for any comments, suggestions or recommendations.
It removes email duplicates but you have to decide which name, phone you need. The result is based on name, phone sort order.
WITH cl
as
(
SELECT email, name, phone, ROW_NUMBER() OVER(PARTITION BY email ORDER BY name, phone) rn
FROM
database.dbo.table
WHERE
status = 'active')
select *
from cl
where rn =1
This is a way of doing it
DECLARE #Table AS TABLE
(email NVARCHAR(100), name NVARCHAR(100), phone NVARCHAR(100))
INSERT INTO #Table
( email , name , phone)
VALUES ( N'fred', -- email - nvarchar(100)
N'bob', -- name - nvarchar(100)
N'steve' -- phone- nvarchar(100)
)
INSERT INTO #Table
( email , name , phone)
VALUES ( N'fred', -- email - nvarchar(100)
N'bob2', -- name - nvarchar(100)
N'ste1ve' -- phone- nvarchar(100)
)
INSERT INTO #Table
( email , name , phone)
VALUES ( N'fred1', -- email - nvarchar(100)
N'bob3', -- name - nvarchar(100)
N'steve3' -- phone- nvarchar(100)
)
SELECT email , MAX(name ) c2, MAX(col3) c3 FROM #Table GROUP BY email