Aggregate Function on multiple columns in SQL Server - sql

I have the following data in a #temp table:
Id code Fname CompanyId FieldName Value
----------------------------------------------------------------
465 00133 JENN WILSON 1 ERA 1573
465 00133 JENN WILSON 1 ESHIFTALLOW 3658
465 00133 JENN WILSON 1 NETPAY 51560
I want to perform the following operations:
One row should be the sum of two fields, i.e. ERA + ESHIFTALLOW.
The other row should subtract that sum from the third field, i.e. NETPAY - (ERA + ESHIFTALLOW).
I tried using a CASE expression in SQL Server.
The required output is shown below,
where Field1 = ERA + ESHIFTALLOW and Field2 = NETPAY - (ERA + ESHIFTALLOW):
Id code Fname CompanyId FieldName Value
----------------------------------------------------------------
465 00133 JENN WILSON 1 Field1 5231
465 00133 JENN WILSON 1 Field2 46329
I tried a SQL Server CASE expression, but I am not getting the correct output.
SQL Query : Aggregate option in SQL Server CASE statement

I see at least two ways to get those results: a GROUP BY with conditional aggregation, or a PIVOT.
Both methods are shown in the example below.
CREATE TABLE #Temp (Id INT, code VARCHAR(5), Fname VARCHAR(20), CompanyId INT, FieldName VARCHAR(20), Value INT);
insert into #Temp (Id, code, Fname, CompanyId, FieldName, Value)
values
-- code is quoted: an unquoted 00133 is an integer literal and would be stored as '133'
(465,'00133','JENN WILSON',1,'ERA',1573),
(465,'00133','JENN WILSON',1,'ESHIFTALLOW',3658),
(465,'00133','JENN WILSON',1,'NETPAY',51560);
with Q AS (
SELECT Id, code, Fname, CompanyId,
sum(case when FieldName = 'ERA' then Value end) as ERA,
sum(case when FieldName = 'ESHIFTALLOW' then Value end) as ESHIFTALLOW,
sum(case when FieldName = 'NETPAY' then Value end) as NETPAY
from #Temp
group by Id, code, Fname, CompanyId
)
select Id, code, Fname, CompanyId, 'Field1' as FieldName, (ERA + ESHIFTALLOW) as Value from Q
union all
-- parenthesized so that the sample data yields 46329, the Field2 value the question expects
select Id, code, Fname, CompanyId, 'Field2', (NETPAY - (ERA + ESHIFTALLOW)) from Q
;
with Q AS (
SELECT Id, code, Fname, CompanyId,
(ERA + ESHIFTALLOW) as Field1,
(NETPAY - (ERA + ESHIFTALLOW)) as Field2 -- parenthesized to match the 46329 expected in the question
FROM (SELECT * FROM #Temp) s
PIVOT ( SUM(VALUE) FOR FieldName IN (ERA, ESHIFTALLOW, NETPAY)) p
)
select Id, code, Fname, CompanyId, 'Field1' as FieldName, Field1 as Value from Q
union all
select Id, code, Fname, CompanyId, 'Field2', Field2 from Q
;
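Note that SUM(Value) was used here, but MAX(Value) would work too. Both yield the same results as long as each Id has at most one row per FieldName; if there are several rows per FieldName, SUM adds them up while MAX keeps only the largest.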
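To make the SUM versus MAX difference concrete, here is a minimal sketch. The table name #SumMaxDemo and the second ERA row are made up for illustration and are not part of the question's data.

```sql
-- Hypothetical demo table, mirroring part of the #Temp layout from the question.
CREATE TABLE #SumMaxDemo (Id INT, FieldName VARCHAR(20), Value INT);

INSERT INTO #SumMaxDemo (Id, FieldName, Value)
VALUES (465, 'ERA', 1573),
       (465, 'ERA', 100),           -- made-up duplicate row for the same field
       (465, 'ESHIFTALLOW', 3658);

SELECT Id,
       SUM(CASE WHEN FieldName = 'ERA' THEN Value END) AS EraSum,  -- 1573 + 100 = 1673
       MAX(CASE WHEN FieldName = 'ERA' THEN Value END) AS EraMax   -- 1573
FROM #SumMaxDemo
GROUP BY Id;

DROP TABLE #SumMaxDemo;
```

With exactly one row per field, as in the question, both columns would return the same number.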

Building heavily on LukStorms' answer, you can use a PIVOT and an UNPIVOT to get the results you want:
CREATE TABLE #Temp
(Id INT, Code VARCHAR(5), Fname VARCHAR(20), CompanyId INT, FieldName VARCHAR(20), Value INT);
INSERT INTO #Temp
(Id, Code, Fname, CompanyId, FieldName, Value)
VALUES
-- Code is quoted: an unquoted 00133 is an integer literal and would be stored as '133'
(465,'00133', 'JENN WILSON', 1, 'ERA', 1573),
(465,'00133', 'JENN WILSON', 1, 'ESHIFTALLOW', 3658),
(465,'00133', 'JENN WILSON', 1, 'NETPAY', 51560);
SELECT Id, Code, Fname, CompanyId, FieldName, Value
FROM (
SELECT Id, Code, Fname, CompanyId,
ERA + ESHIFTALLOW AS Field1,
NETPAY - (ERA + ESHIFTALLOW) AS Field2 -- parenthesized to match the 46329 expected in the question
FROM (
SELECT *
FROM #Temp
) AS s
PIVOT (
SUM(Value)
FOR FieldName IN (ERA, ESHIFTALLOW, NETPAY)
) AS p
) AS r
UNPIVOT (
Value
FOR FieldName IN (Field1, Field2)
) AS u
;
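As a possible alternative to UNPIVOT, the two computed values can be turned back into rows with CROSS APPLY (VALUES ...). This is a different technique from the one in the answer above, shown only as a sketch; the table name #PayDemo is made up for illustration.

```sql
-- Hypothetical demo table with the sample rows from the question.
CREATE TABLE #PayDemo
(Id INT, code VARCHAR(5), Fname VARCHAR(20), CompanyId INT, FieldName VARCHAR(20), Value INT);

INSERT INTO #PayDemo (Id, code, Fname, CompanyId, FieldName, Value)
VALUES (465, '00133', 'JENN WILSON', 1, 'ERA', 1573),
       (465, '00133', 'JENN WILSON', 1, 'ESHIFTALLOW', 3658),
       (465, '00133', 'JENN WILSON', 1, 'NETPAY', 51560);

WITH Q AS (
    -- One row per employee, one column per field
    SELECT Id, code, Fname, CompanyId,
           SUM(CASE WHEN FieldName = 'ERA' THEN Value END) AS ERA,
           SUM(CASE WHEN FieldName = 'ESHIFTALLOW' THEN Value END) AS ESHIFTALLOW,
           SUM(CASE WHEN FieldName = 'NETPAY' THEN Value END) AS NETPAY
    FROM #PayDemo
    GROUP BY Id, code, Fname, CompanyId
)
SELECT Q.Id, Q.code, Q.Fname, Q.CompanyId, v.FieldName, v.Value
FROM Q
CROSS APPLY (VALUES
    ('Field1', Q.ERA + Q.ESHIFTALLOW),             -- 1573 + 3658 = 5231
    ('Field2', Q.NETPAY - (Q.ERA + Q.ESHIFTALLOW)) -- 51560 - 5231 = 46329
) AS v (FieldName, Value);

DROP TABLE #PayDemo;
```

The parentheses in Field2 matter: without them the expression is evaluated left to right and gives 53645 instead of the 46329 shown in the question.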

I have no idea whether this solution is anywhere near the most efficient, but it should work:
SELECT
BASE.*,
ERA.Value AS ERA,
ESALLOW.Value AS ESHIFTALLOW,
ERA.Value + ESALLOW.Value AS Field1,
etc...
FROM (
SELECT DISTINCT Id, code, Fname, CompanyId
FROM #TEMP ) BASE
LEFT OUTER JOIN (
SELECT Id, Value
FROM #TEMP
WHERE FieldName = 'ERA' ) ERA
ON BASE.Id = ERA.Id
LEFT OUTER JOIN (
SELECT Id, Value
FROM #TEMP
WHERE FieldName = 'ESHIFTALLOW' ) ESALLOW
ON BASE.Id = ESALLOW.Id
This gives you a simple table that has every type of value in a separate column, instead of in separate rows. This makes calculations possible.

Related

Identify which columns are different in the two queries

I currently have a query that looks like this:
Select val1, val2, val3, val4 from Table_A where someID = 10
UNION
Select oth1, val2, val3, oth4 from Table_B where someId = 10
I initially run the same query with EXCEPT instead of UNION to identify which IDs are returned with differences, and then I run the UNION query to find which columns specifically differ.
My goal is to compare the values between the two tables (some columns have different names).
However, the two queries involve about 250 field names, so scrolling through them to find the differences is tedious.
Is there a better and quicker way to identify which columns differ after running the two queries?
EDIT: Here's my current process:
DROP TABLE IF EXISTS #Table_1
DROP TABLE IF EXISTS #Table_2
SELECT 'Dave' AS Name, 'Smih' AS LName, 18 AS Age, 'Alabama' AS State
INTO #Table_1
SELECT 'Dave' AS Name, 'Smith' AS LName, 19 AS Age, 'Alabama' AS State
INTO #Table_2
-- Find differences
SELECT Name, LName,Age,State FROM #Table_1
EXCEPT
SELECT Name, LName,Age,State FROM #Table_2
--How I compare differences
SELECT Name, LName,Age,State FROM #Table_1
UNION
SELECT Name, LName,Age,State FROM #Table_2
Is there any way to streamline this so I can get a column list of differences?
Here is a generic way to find the differences between two tables.
We just need to know their primary key column.
It is based on JSON, and works in SQL Server 2016 and later.
SQL
-- DDL and sample data population, start
-- "DECLARE #TableA TABLE" is not valid T-SQL; the #-prefixed names are created as temp tables instead
CREATE TABLE #TableA (rowid INT IDENTITY(1,1), FirstName VARCHAR(100), LastName VARCHAR(100), Phone VARCHAR(100));
CREATE TABLE #TableB (rowid INT IDENTITY(1,1), FirstName VARCHAR(100), LastName VARCHAR(100), Phone VARCHAR(100));
INSERT INTO #TableA(FirstName, LastName, Phone) VALUES
('JORGE','LUIS','41514493'),
('JUAN','ROBERRTO','41324133'),
('ALBERTO','JOSE','41514461'),
('JULIO','ESTUARDO','56201550'),
('ALFREDO','JOSE','32356654'),
('LUIS','FERNANDO','98596210');
INSERT INTO #TableB(FirstName, LastName, Phone) VALUES
('JORGE','LUIS','41514493'),
('JUAN','ROBERTO','41324132'),
('ALBERTO','JOSE','41514461'),
('JULIO','ESTUARDO','56201551'),
('ALFRIDO','JOSE','32356653'),
('LUIS','FERNANDOO','98596210');
-- DDL and sample data population, end
SELECT rowid
,[key] AS [column]
,Org_Value = MAX( CASE WHEN Src=1 THEN Value END)
,New_Value = MAX( CASE WHEN Src=2 THEN Value END)
FROM (
SELECT Src=1
,rowid
,B.*
FROM #TableA A
CROSS APPLY ( SELECT [Key]
,Value
FROM OpenJson( (SELECT A.* For JSON Path,Without_Array_Wrapper,INCLUDE_NULL_VALUES))
) AS B
UNION ALL
SELECT Src=2
,rowid
,B.*
FROM #TableB A
CROSS APPLY ( SELECT [Key]
,Value
FROM OpenJson( (SELECT A.* For JSON Path,Without_Array_Wrapper,INCLUDE_NULL_VALUES))
) AS B
) AS A
GROUP BY rowid,[key]
HAVING MAX(CASE WHEN Src=1 THEN Value END)
<> MAX(CASE WHEN Src=2 THEN Value END)
ORDER BY rowid,[key];
Output
rowid column Org_Value New_Value
----------------------------------------------------------------
2 LastName ROBERRTO ROBERTO
2 Phone 41324133 41324132
4 Phone 56201550 56201551
5 FirstName ALFREDO ALFRIDO
5 Phone 32356654 32356653
6 LastName FERNANDO FERNANDOO
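One caveat with the HAVING clause above: a plain <> comparison evaluates to UNKNOWN when either side is NULL, so a change from NULL to a value (or back) is not reported. A minimal sketch of a NULL-safe comparison, using two made-up variables in place of the two aggregated values:

```sql
-- Hypothetical values: the original is NULL, the new one is not.
DECLARE @OrgValue VARCHAR(100) = NULL,
        @NewValue VARCHAR(100) = 'x';

SELECT
    -- NULL <> 'x' is UNKNOWN, so the plain comparison does not flag the pair
    CASE WHEN @OrgValue <> @NewValue THEN 'different' ELSE 'not flagged' END AS PlainComparison,
    -- EXCEPT treats two NULLs as equal and NULL versus a value as different
    CASE WHEN EXISTS (SELECT @OrgValue EXCEPT SELECT @NewValue)
         THEN 'different' ELSE 'same' END AS NullSafeComparison;
```

The same EXISTS (... EXCEPT ...) test could replace the <> predicate in the HAVING clause if NULL changes should be reported too.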

Convert Table to Specific Column Wise

I have a table like this. How can I convert to this format?
CREATE TABLE #A (KeyValue INT, Name VARCHAR(50), Value VARCHAR(512));
INSERT INTO #A
VALUES (0, 'AccountID', '192507'), (0, 'member_id', '999159'),
(0, 'firstname', 'Test1'), (0, 'lastname', 'Last1'),
(1, 'AccountID', '192508'), (1, 'member_id', '999160'),
(1, 'firstname', 'Test2'), (1, 'lastname', 'Last2')
SELECT * FROM #A
The table contains these rows:
KeyValue Name Value
-----------------------------------
0 AccountID 192507
0 member_id 999159
0 firstname Test1
0 lastname Last1
1 AccountID 192508
1 member_id 999160
1 firstname Test2
1 lastname Last2
My expected output is:
AccountID member_id firstname lastname
--------------------------------------------
192507 999159 Test1 Last1
192508 999160 Test2 Last2
I tried this code, but it didn't work:
select *
from
(
select Name,value
from #A
) d
pivot
(
MAX(value)
for Name in (AccountID,member_id,firstname,lastname)
) piv;
Try the logic below:
SELECT
MAX(CASE WHEN Name = 'AccountID' THEN Value ELSE NULL END) AccountID,
MAX(CASE WHEN Name = 'member_id' THEN Value ELSE NULL END) member_id ,
MAX(CASE WHEN Name = 'firstname' THEN Value ELSE NULL END) firstname ,
MAX(CASE WHEN Name = 'lastname' THEN Value ELSE NULL END) lastname
FROM #A
GROUP BY KeyValue
You can get the desired result by using PIVOT. In your query, the derived table just needs to include all the columns (in particular KeyValue), so that PIVOT produces one row per KeyValue, like below.
SELECT AccountID, member_id, firstname, lastname
FROM
(
select * from #A
) d
PIVOT
(
MAX(value)
FOR Name IN (AccountID, member_id, firstname, lastname)
) piv;
In the derived table, you should select all the useful columns, like this:
select AccountID, member_id, firstname, lastname
from
(
select * from #A -- instead of `select Name,value`
) d
pivot
(
MAX(value)
for Name in (AccountID,member_id,firstname,lastname)
) piv;
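Why including KeyValue matters: PIVOT groups by every column of its source that is neither aggregated nor named in the FOR clause. The sketch below contrasts the two cases; the table name #PivotDemo is made up and only two of the four names are used to keep it short.

```sql
-- Hypothetical demo table with a subset of the question's rows.
CREATE TABLE #PivotDemo (KeyValue INT, Name VARCHAR(50), Value VARCHAR(512));

INSERT INTO #PivotDemo (KeyValue, Name, Value)
VALUES (0, 'AccountID', '192507'), (0, 'firstname', 'Test1'),
       (1, 'AccountID', '192508'), (1, 'firstname', 'Test2');

-- Source has only Name and Value: nothing is left to group by,
-- so everything collapses into a single row.
SELECT AccountID, firstname
FROM (SELECT Name, Value FROM #PivotDemo) d
PIVOT (MAX(Value) FOR Name IN (AccountID, firstname)) piv;

-- Source also has KeyValue: PIVOT groups by it and returns one row per KeyValue.
SELECT KeyValue, AccountID, firstname
FROM (SELECT KeyValue, Name, Value FROM #PivotDemo) d
PIVOT (MAX(Value) FOR Name IN (AccountID, firstname)) piv;

DROP TABLE #PivotDemo;
```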

Remove duplicates with less null values

I have a table of employees which contains about 25 columns. Right now there are a lot of duplicates and I would like to try and get rid of some of these duplicates.
First, I want to find the duplicates by looking for multiple records that have the same values in first name, last name, employee number, company number and status.
SELECT
firstname,lastname,employeenumber, companynumber, statusflag
FROM
employeemaster
GROUP BY
firstname,lastname,employeenumber,companynumber, statusflag
HAVING
(COUNT(*) > 1)
This gives me the duplicates, but my goal is to find and keep the best single record and delete the others. The "best single record" is the one with the fewest NULL values in all of the other columns. How can I do this?
I am using SQL Server 2012 with Management Studio.
EXAMPLE (the screenshot is not preserved; rows marked red were to be deleted, rows marked green kept):
NOTE: There are a lot more columns in the table than the example shows.
You can use the sys.columns catalog view to get a list of columns and build a dynamic query. This query returns a KeepThese value of 1 for every record you want to keep based on your given criteria.
-- insert test data
create table EmployeeMaster
(
Record int identity(1,1),
FirstName varchar(50),
LastName varchar(50),
EmployeeNumber int,
CompanyNumber int,
StatusFlag int,
UserName varchar(50),
Branch varchar(50)
);
insert into EmployeeMaster
(
FirstName,
LastName,
EmployeeNumber,
CompanyNumber,
StatusFlag,
UserName,
Branch
)
values
('Jake','Jones',1234,1,1,'JJONES','PHX'),
('Jake','Jones',1234,1,1,NULL,'PHX'),
('Jake','Jones',1234,1,1,NULL,NULL),
('Jane','Jones',5678,1,1,'JJONES2',NULL);
-- get records with most non-null values with dynamic sys.column query
declare @sql varchar(max)
select @sql = '
select e.*,
row_number() over(partition by
e.FirstName,
e.LastName,
e.EmployeeNumber,
e.CompanyNumber,
e.StatusFlag
order by n.NonNullCnt desc) as KeepThese
from EmployeeMaster e
cross apply (select count(n.value) as NonNullCnt from (select ' +
replace((
select 'cast(' + c.name + ' as varchar(50)) as value union all select '
from sys.columns c
where c.object_id = t.object_id
for xml path('')
) + '#',' union all select #','') + ')n)n'
from sys.tables t
where t.name = 'EmployeeMaster'
exec(@sql)
Try this.
;WITH cte
AS (SELECT Row_number()
OVER(
partition BY firstname, lastname, employeenumber, companynumber, statusflag
ORDER BY (SELECT NULL)) rn,
firstname,
lastname,
employeenumber,
companynumber,
statusflag,
username,
branch
FROM employeemaster),
cte1
AS (SELECT a.firstname,
a.lastname,
a.employeenumber,
a.companynumber,
a.statusflag,
Row_number()
OVER(
partition BY a.firstname, a.lastname, a.employeenumber, a.companynumber, a.statusflag
ORDER BY (CASE WHEN a.username IS NULL THEN 1 ELSE 0 END +CASE WHEN a.branch IS NULL THEN 1 ELSE 0 END) )rn
-- add the remaining columns in case statement
FROM cte a
JOIN employeemaster b
ON a.firstname = b.firstname
AND a.lastname = b.lastname
AND a.employeenumber = b.employeenumber
AND a.companynumber = b.companynumber
AND a.statusflag = b.statusflag)
SELECT *
FROM cte1
WHERE rn = 1
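The queries above only identify the rows to keep. As a sketch of the actual cleanup, SQL Server allows deleting through a CTE that numbers the rows. The table #EmployeeDemo below is a made-up copy of the test data, and only UserName and Branch are counted; a real table would need one CASE term per remaining column.

```sql
-- Hypothetical demo table with the same test rows as above.
CREATE TABLE #EmployeeDemo
(
    Record INT IDENTITY(1,1),
    FirstName VARCHAR(50), LastName VARCHAR(50),
    EmployeeNumber INT, CompanyNumber INT, StatusFlag INT,
    UserName VARCHAR(50), Branch VARCHAR(50)
);

INSERT INTO #EmployeeDemo
    (FirstName, LastName, EmployeeNumber, CompanyNumber, StatusFlag, UserName, Branch)
VALUES
    ('Jake', 'Jones', 1234, 1, 1, 'JJONES', 'PHX'),
    ('Jake', 'Jones', 1234, 1, 1, NULL, 'PHX'),
    ('Jake', 'Jones', 1234, 1, 1, NULL, NULL),
    ('Jane', 'Jones', 5678, 1, 1, 'JJONES2', NULL);

WITH ranked AS (
    SELECT ROW_NUMBER() OVER (
               PARTITION BY FirstName, LastName, EmployeeNumber, CompanyNumber, StatusFlag
               -- fewest NULLs first; Record breaks ties deterministically
               ORDER BY CASE WHEN UserName IS NULL THEN 1 ELSE 0 END
                      + CASE WHEN Branch IS NULL THEN 1 ELSE 0 END,
                        Record
           ) AS rn
    FROM #EmployeeDemo
)
DELETE FROM ranked
WHERE rn > 1;   -- removes every row except the best one in each group

SELECT Record, FirstName, UserName, Branch
FROM #EmployeeDemo
ORDER BY Record;   -- Records 1 and 4 remain

DROP TABLE #EmployeeDemo;
```

Running the SELECT part of the CTE on its own first is a cheap way to check which rows would be removed.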
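I tested this with MySQL and used string concatenation with NULL to find the best record, because concatenating NULL with any string yields NULL, so LENGTH(NULL || 'data') is NULL rather than a number. The expression has a length only if none of the columns is NULL. Note that MySQL treats || as concatenation only when the PIPES_AS_CONCAT SQL mode is enabled; CONCAT() behaves the same way with NULL. Maybe this is not perfect.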
create table EmployeeMaster
(
Record int auto_increment,
FirstName varchar(50),
LastName varchar(50),
EmployeeNumber int,
CompanyNumber int,
StatusFlag int,
UserName varchar(50),
Branch varchar(50),
PRIMARY KEY(record)
);
INSERT INTO EmployeeMaster
(
FirstName, LastName, EmployeeNumber, CompanyNumber, StatusFlag, UserName, Branch
) VALUES ('Jake', 'Jones', 1234, 1, 1, 'JJONES', 'PHX'), ('Jake', 'Jones', 1234, 1, 1, NULL, 'PHX'), ('Jake', 'Jones', 1234, 1, 1, NULL, NULL), ('Jane', 'Jones', 5678, 1, 1, 'JJONES2', NULL);
My query idea looks like this
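SELECT e.*
FROM employeemaster e
JOIN ( SELECT firstname,
lastname,
employeenumber,
companynumber,
statusflag,
MAX( LENGTH ( username || branch ) ) data_quality
FROM employeemaster
GROUP BY firstname, lastname, employeenumber, companynumber, statusflag
HAVING count(*) > 1
) g
-- match each row to its own duplicate group, not to any group with the same length
ON e.firstname = g.firstname
AND e.lastname = g.lastname
AND e.employeenumber = g.employeenumber
AND e.companynumber = g.companynumber
AND e.statusflag = g.statusflag
AND LENGTH ( e.username || e.branch ) = g.data_quality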

Query to Append a Number to a Record if it finds a duplicate

I have an SQL table with the following fields: Title, Name, Result
Title Name Result
Mr Mark
Mr Mark
Mr Luke
Mr John
Mr John
I need to create an update query to have the result as
Title Name Result
Mr Mark MrMark
Mr Mark MrMark2
Mr Luke MrLuke
Mr John MrJohn
Mr John MrJohn2
Note that the second and the fifth record had a number 2 appended since it already found the same record (same Title and Name) previously.
Please help.
If it is MS SQL, try using ROW_NUMBER and PARTITION BY?
CREATE TABLE #temp (Title NVARCHAR(200), Name NVARCHAR(200), Result NVARCHAR(200));
INSERT #temp
SELECT 'Mr', 'Mark', NULL
UNION ALL
SELECT 'Mr', 'Mark', NULL
UNION ALL
SELECT 'Mr', 'Luke', NULL
UNION ALL
SELECT 'Mr', 'John', NULL
UNION ALL
SELECT 'Mr', 'John', NULL;
SELECT * FROM #temp;
-- Number the rows within each (Title, Name) group
CREATE TABLE #tempWithOrdering (RowNum INT, Title NVARCHAR(200), Name NVARCHAR(200), Result NVARCHAR(200));
INSERT #tempWithOrdering
SELECT ROW_NUMBER() OVER (PARTITION BY Title, Name ORDER BY Title), Title, Name, Result FROM #temp;
SELECT * FROM #tempWithOrdering;
-- The first row of a group gets no suffix; later rows get their row number
SELECT
Title,
Name,
Result = Title + Name +
CASE RowNum
WHEN 1 THEN ''
ELSE CAST(RowNum AS NVARCHAR(12))
END
FROM #tempWithOrdering;
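Assuming you are using SQL Server and the duplicate is on the field 'Name', try this.
Use the analytic function ROW_NUMBER().
If it is Oracle, use || instead of +.
WITH TEMP AS
(
SELECT Title, Name,
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Title) AS RK
FROM TABLE1
)
-- RK is numeric, so cast it before concatenating; the first row of each group gets no suffix
SELECT Title, Name,
Title + Name + CASE WHEN RK = 1 THEN '' ELSE CAST(RK AS VARCHAR(12)) END AS Result
FROM TEMP;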
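Both answers select the new value, while the question asks for an UPDATE. In SQL Server, one way to do that is to update through a CTE that carries the row number. The sketch below uses a made-up temp table #Names; since the rows have no ordering column, which of two identical rows gets the suffix is arbitrary.

```sql
-- Hypothetical demo table with the rows from the question.
CREATE TABLE #Names (Title NVARCHAR(200), Name NVARCHAR(200), Result NVARCHAR(200));

INSERT INTO #Names (Title, Name)
VALUES ('Mr', 'Mark'), ('Mr', 'Mark'), ('Mr', 'Luke'), ('Mr', 'John'), ('Mr', 'John');

WITH numbered AS (
    SELECT Title, Name, Result,
           ROW_NUMBER() OVER (PARTITION BY Title, Name ORDER BY (SELECT NULL)) AS rn
    FROM #Names
)
UPDATE numbered
SET Result = Title + Name
           + CASE WHEN rn = 1 THEN N'' ELSE CAST(rn AS NVARCHAR(12)) END;

SELECT Title, Name, Result
FROM #Names
ORDER BY Name, Result;   -- MrJohn, MrJohn2, MrLuke, MrMark, MrMark2

DROP TABLE #Names;
```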

I need to split string in select statement and insert to table

I have data in one table that I need to copy to another table. One of the columns is a delimited text string. My plan is to insert the other columns, get the identity value, and then use a subquery to split the string on the delimiter and insert the pieces into a second table.
Here is the data example
ID Name City Items
1 Michael Miami item|item2|item3|item4|item5
2 Jorge Hallandale item|item2|item3|item4|item5
Copy Name and City to one table and get the identity value,
then split Items and copy the pieces to another table together with that identity value.
So output should be
Users table
UserID Name City
1 Michael Miami
2 Jorge Hallandale
...
Items table
ItemID UserID Name
1 1 Item
2 1 Item2
3 1 Item3
4 1 Item4
5 2 Item
6 2 Item2
7 2 Item3
8 2 Item4
Not really sure how to do it with T-SQL. Answers with examples would be appreciated
You can create a custom function to split the string in T-SQL, and then use that split function as part of a JOIN with your base table to generate the rows for your INSERT statement. Have a look at this post. Hope this helps.
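On SQL Server 2016 and later (database compatibility level 130 or higher), the built-in STRING_SPLIT function can stand in for a custom split function. A minimal sketch, assuming a made-up temp table #SplitDemo with shortened sample data:

```sql
-- Hypothetical demo table shaped like the question's source table.
CREATE TABLE #SplitDemo (ID INT, Name VARCHAR(20), City VARCHAR(20), Items VARCHAR(MAX));

INSERT INTO #SplitDemo (ID, Name, City, Items)
VALUES (1, 'Michael', 'Miami', 'item|item2|item3'),
       (2, 'Jorge', 'Hallandale', 'item|item2');

-- CROSS APPLY runs the split once per source row and returns one row per item
SELECT s.ID AS UserID, x.value AS ItemName
FROM #SplitDemo AS s
CROSS APPLY STRING_SPLIT(s.Items, '|') AS x;

DROP TABLE #SplitDemo;
```

The SELECT can feed an INSERT into the items table directly. STRING_SPLIT does not promise to return the pieces in their original order, so this fits best when item order does not matter.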
You can do this using xml and cross apply.
See the following:
CREATE TABLE #t (ID int, Name varchar(20), City varchar(20), Items varchar(max));
INSERT #t
SELECT 1,'Michael','Miami' ,'item|item2|item3|item4|item5' UNION
SELECT 2,'Jorge' ,'Hallandale','item|item2|item3|item4|item5';
CREATE TABLE #u (UserID int identity(1,1), Name varchar(20), City varchar(20));
INSERT #u (Name, City)
SELECT DISTINCT Name, City FROM #t;
CREATE TABLE #i (ItemID int identity(1,1), UserID int, Name varchar(20));
WITH cte_Items (Name, Items) as (
SELECT
Name
,CAST(REPLACE('<r><i>' + Items + '</i></r>','|','</i><i>') as xml) as Items
FROM
#t
)
INSERT #i (UserID, Name)
SELECT
u.UserID
,s.Name as Name
FROM
cte_Items t
CROSS APPLY (SELECT i.value('.','varchar(20)') as Name FROM t.Items.nodes('//r/i') as x(i) ) s
INNER JOIN #u u ON t.Name = u.Name
SELECT * FROM #i
See more here:
http://www.kodyaz.com/articles/t-sql-convert-split-delimeted-string-as-rows-using-xml.aspx
Can you accomplish this with recursion? My T-SQL is rusty but this may help send you in the right direction:
-- Assumes Table1 has the ID and Items columns shown in the question
WITH CteList AS (
-- Anchor: take the first item of each row and keep the rest as Remainder
SELECT ID AS UserID
, 1 AS Item_Num
, CAST(LEFT(Items, CHARINDEX('|', Items + '|') - 1) AS VARCHAR(100)) AS Item
, CAST(STUFF(Items, 1, CHARINDEX('|', Items + '|'), '') AS VARCHAR(MAX)) AS Remainder
FROM Table1
UNION ALL
-- Recursive step: peel the next item off the front of Remainder
SELECT UserID
, Item_Num + 1
, CAST(LEFT(Remainder, CHARINDEX('|', Remainder + '|') - 1) AS VARCHAR(100))
, CAST(STUFF(Remainder, 1, CHARINDEX('|', Remainder + '|'), '') AS VARCHAR(MAX))
FROM CteList
WHERE Remainder <> ''
AND Item_Num < 20 /* Force a MAX depth for recursion */
)
SELECT UserID
, Item_Num
, Item
FROM CteList
ORDER BY UserID, Item_Num;