Get separate row for each column change in cdc - sql

I created a simple Employee table with three fields:
FirstName, LastName, CurrentPayScale
This is what I have so far:
DECLARE @begin_lsn BINARY(10), @end_lsn BINARY(10), @A NVARCHAR(100)
SET @begin_lsn = sys.fn_cdc_get_min_lsn('dbo_Employee')
SET @end_lsn = sys.fn_cdc_get_max_lsn()
SELECT __$operation AS OperationTypeId,
CASE __$operation WHEN 1 THEN 'has deleted' WHEN 2 THEN 'has added' ELSE 'has updated' END AS [Action],
( SELECT CC.column_name + ','
FROM cdc.captured_columns CC
INNER JOIN cdc.change_tables CT ON CC.[object_id] = CT.[object_id]
WHERE capture_instance = 'dbo_Employee'
AND sys.fn_cdc_is_bit_set(CC.column_ordinal, EmployeeCDC.__$update_mask) = 1
FOR XML PATH('')) AS ChangedColumns,
(SELECT FIRSTNAME + ','
FROM [cdc].[dbo_Employee_CT]
WHERE __$start_lsn = EmployeeCDC.__$start_lsn
AND __$operation = 3 FOR XML PATH('')) as 'Old Value',
EmployeeCDC.FirstName AS 'New Value'
FROM [cdc].[dbo_Employee_CT] AS EmployeeCDC
WHERE EmployeeCDC.__$operation <> 3
This query returns one row per change.
In the third row of my result, two columns were changed: FirstName and CurrentPayScale.
I want the result to show one row per changed column. From what I have found this can be done with PIVOT, but I don't know how to use it.

Related

Create a pivot of same columns in to 1 row

I'm using SQL Server. I have a query that returns the data of all the fields. One field can belong to multiple records, and the RecordID differentiates them.
This is my current data set.
My current query:
Select fd.FieldName ,FV.FieldID, Data , R.RecordID from FieldValues FV
Inner Join Records R on R.RecordID = FV.RecordID
Inner Join Forms F On f.FormID = R.FormID
Inner join Fields fd on fd.FieldID = fv.FieldID
Where R.RecordID IN (45,46)
I need one row per RecordID, with each field belonging to that record shown as a column, like this:
Service Name  Location      city      VendorCode  RecordID
Raj           ABC LOCATION  ABC CITY  32          45
BEN           ABC LOCATION  ABC CITY  --          46
The above is my desired output.
I've tried with pivot but have not succeeded.
If you don't want to deal with a dynamic pivot and you know the keys of the rows you want to convert into columns, you can use standard SQL with MAX and CASE WHEN:
select
max(case fd.FieldName when 'SelectService' then Data else null end) as ServiceName,
max(case fd.FieldName when 'EnterYourLocation' then Data else null end) as Location,
max(case fd.FieldName when 'City' then Data else null end) as city,
max(case fd.FieldName when 'VendorCodeOption' then Data else null end) as VendorCode,
R.RecordId
from FieldValues FV
Inner Join Records R on R.RecordID = FV.RecordID
Inner Join Forms F On f.FormID = R.FormID
Inner join Fields fd on fd.FieldID = fv.FieldID
where R.RecordID IN (45,46)
group by R.RecordId
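To try the MAX/CASE idea without a SQL Server instance, here is a minimal sketch that runs the same pattern on SQLite from Python. The single `field_values` table, the `pivot_fields` helper and the sample rows are assumptions made for illustration (the rows are reconstructed from the desired output in the question), and the joins are left out.

```python
import sqlite3

def pivot_fields(rows):
    """Pivot (record_id, field_name, data) rows into one row per record."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE field_values (record_id INTEGER, field_name TEXT, data TEXT)"
    )
    con.executemany("INSERT INTO field_values VALUES (?, ?, ?)", rows)
    # MAX ignores NULLs, so each CASE keeps the one matching value per record.
    result = con.execute("""
        SELECT MAX(CASE field_name WHEN 'SelectService' THEN data END) AS ServiceName,
               MAX(CASE field_name WHEN 'EnterYourLocation' THEN data END) AS Location,
               MAX(CASE field_name WHEN 'City' THEN data END) AS City,
               MAX(CASE field_name WHEN 'VendorCodeOption' THEN data END) AS VendorCode,
               record_id
        FROM field_values
        GROUP BY record_id
        ORDER BY record_id
    """).fetchall()
    con.close()
    return result

# Illustrative sample data: record 46 has no VendorCodeOption row.
sample = [
    (45, "SelectService", "Raj"), (45, "EnterYourLocation", "ABC LOCATION"),
    (45, "City", "ABC CITY"), (45, "VendorCodeOption", "32"),
    (46, "SelectService", "BEN"), (46, "EnterYourLocation", "ABC LOCATION"),
    (46, "City", "ABC CITY"),
]
for row in pivot_fields(sample):
    print(row)
```

A record with no row for a given field simply gets NULL (None in Python) in that column.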
Here is a solution with a dynamic PIVOT. It does not include the joins, so adjust it to your tables:
declare @columns varchar(max) set @columns = ''
select @columns = coalesce(@columns + '[' + cast(col as varchar(MAX)) + '],', '')
FROM ( select FieldName as col from FieldValues group by FieldName ) m
set @columns = left(@columns,LEN(@columns)-1)
DECLARE @SQLString nvarchar(max);
set @SQLString = '
select * from
( select RecordId, FieldName, Data from FieldValues) m
PIVOT
( MAX(Data)
FOR FieldName in (' + @columns + ')
) AS PVT'
EXECUTE sp_executesql @SQLString

SQL Merge Data into Single Row

I have a table set up that tracks changes to a user's account.
It has ID, UserAccountNo, OldVal, NewVal, ChangeColumnName columns.
I have a query set up similar to this:
Select
case when ChangeColumnName = 'Address1' then NewVal else '' end as Address1,
case when ChangeColumnName = 'Address2' then NewVal else '' end as Address2,
case when ChangeColumnName = 'City' then NewVal else '' end as City,
case when ChangeColumnName = 'State' then NewVal else '' end as State,
case when ChangeColumnName = 'Zip' then NewVal else '' end as Zip,
case when ChangeColumnName = 'Phone' then NewVal else '' end as Phone
from table
Where (Conditions)
If someone changes the city, state, and zip, there are 3 entries in the table. When I run this query, I get 3 rows returned. I would like to get them all together in one row, and haven't been able to figure out how.
When I tried using GROUP BY with max(colname) as suggested in other posts, it returned the maximum NewVal value, so I ended up with email addresses in Phone columns.
Is this possible to do in SQL 2008 without reforming the entire table?
Try this
create table #t
(
id int,
userAccountNo int,
oldVal varchar(255),
newVal varchar(255),
changeColName varchar(255)
);
insert #t values (1, 1, '123 main st', '123 s. main st.', 'Address1'),
(2, 1, 'Springville', 'Springfield', 'City'),
(3, 1, 'Springfield', 'N. Springfield', 'City'),
(4, 2, '12345', '12346', 'Zip');
with U as (select distinct userAccountNo from #t),
Address1 as (select userAccountNo, newVal from #t as T1 where changeColName = 'Address1' and id >=ALL
(select id from #t as T2 where T1.userAccountNo = T2.userAccountNo and T1.changeColName = T2.changeColName)),
City as (select userAccountNo, newVal from #t as T1 where changeColName = 'City' and id >=ALL
(select id from #t as T2 where T1.userAccountNo = T2.userAccountNo and T1.changeColName = T2.changeColName)),
Zip as (select userAccountNo, newVal from #t as T1 where changeColName = 'Zip' and id >=ALL
(select id from #t as T2 where T1.userAccountNo = T2.userAccountNo and T1.changeColName = T2.changeColName))
select
U.userAccountNo,
A1.newVal as [Address1],
C.newVal as [City],
Z.newVal as [Zip]
from
U
full outer join Address1 as A1 on U.userAccountNo = A1.userAccountNo
full outer join City as C on U.userAccountNo = C.userAccountNo
full outer join Zip as Z on U.userAccountNo = Z.userAccountNo;
And if it seems to work it can be extended to cover all of your columns.
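For anyone who wants to experiment with the "latest change per column" idea outside SQL Server, here is a rough sketch on SQLite via Python. It is not the query above: it swaps the per-column CTEs and full outer joins for a correlated `MAX(id)` filter followed by conditional aggregation. The `latest_changes` helper name is made up for the example, and it assumes that a higher `id` means a more recent change.

```python
import sqlite3

def latest_changes(rows):
    """Return one row per account holding the most recent newVal per column."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE t (id INTEGER, userAccountNo INTEGER, "
        "oldVal TEXT, newVal TEXT, changeColName TEXT)"
    )
    con.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", rows)
    result = con.execute("""
        WITH latest AS (
            -- keep only the newest change for each (account, column) pair
            SELECT userAccountNo, changeColName, newVal
            FROM t AS t1
            WHERE id = (SELECT MAX(id) FROM t AS t2
                        WHERE t2.userAccountNo = t1.userAccountNo
                          AND t2.changeColName = t1.changeColName)
        )
        SELECT userAccountNo,
               MAX(CASE WHEN changeColName = 'Address1' THEN newVal END) AS Address1,
               MAX(CASE WHEN changeColName = 'City' THEN newVal END) AS City,
               MAX(CASE WHEN changeColName = 'Zip' THEN newVal END) AS Zip
        FROM latest
        GROUP BY userAccountNo
        ORDER BY userAccountNo
    """).fetchall()
    con.close()
    return result

# Same sample rows as the #t insert above.
changes = [
    (1, 1, "123 main st", "123 s. main st.", "Address1"),
    (2, 1, "Springville", "Springfield", "City"),
    (3, 1, "Springfield", "N. Springfield", "City"),
    (4, 2, "12345", "12346", "Zip"),
]
for row in latest_changes(changes):
    print(row)
```

Because the CTE leaves at most one value per account and column, MAX here only collapses rows and never mixes values from different columns, which was the problem described in the question.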
I suggest using the PIVOT command. Try this script and let me know:
IF OBJECT_ID('_temp') IS NOT NULL DROP TABLE _temp
SELECT *
INTO _temp
FROM (
Select 'PostalCode' AS ChangeColumnName, '95100' AS NewValue UNION ALL
Select 'City' AS ChangeColumnName, 'Argenteuil' AS NewValue UNION ALL
Select 'LastName' AS ChangeColumnName, 'DAOUI' AS NewValue UNION ALL
Select 'FirstName' AS ChangeColumnName, 'Youssef' AS NewValue UNION ALL
Select 'Phone Number' AS ChangeColumnName, '00212 6 60 93 36 12' AS NewValue
) AS Temp
DECLARE @v_ListeColonnes VARCHAR(MAX) = ''
,@v_sql VARCHAR(MAX) = ''
SELECT @v_ListeColonnes = @v_ListeColonnes + ',' + QUOTENAME(ChangeColumnName)
FROM _temp
IF LEN(@v_ListeColonnes) > 1
BEGIN
SELECT @v_ListeColonnes = RIGHT(@v_ListeColonnes, LEN(@v_ListeColonnes)-1)
SET @v_sql = 'SELECT '+CHAR(13)
+' ' + @v_ListeColonnes + ' '+CHAR(13)
+'FROM _temp '+CHAR(13)
+'PIVOT (MAX(NewValue) '+CHAR(13)
+' FOR ChangeColumnName in(' + @v_ListeColonnes + ')) as pvt '+CHAR(13)
EXEC(@v_sql)
END
IF OBJECT_ID('_temp') IS NOT NULL DROP TABLE _temp
I hope this will help you.
I assumed you need one row and one column for all changes, it works for any number of columns changed.
SQL FIDDLE TEST
declare @changes as varchar(max)
declare @UserAccountNo int
set @UserAccountNo=1
set @changes=''
select @changes=@changes + ColumnChanged +'-'
from changes where UserAccountNo=@UserAccountNo
select @UserAccountNo 'UserAccountNo', @changes 'Changes'

Stored procedure that returns a table from 2 combined

I am trying to write a stored procedure which returns a result combining 2 table variables which looks something like this.
Name | LastName | course | course | course | course <- Columns
Name | LastName | DVA123 | DVA222 | nothing | nothing <- Row1
Pete Steven 200 <- Row2
Steve Lastname 50 <- Row3
From these 3 tables
Table Staff:
Name | LastName | SSN |
Steve Lastname 234
Pete Steven 132
Table Course Instance:
Course | Year | Period |
DVA123 2013 1
DVA222 2014 2
Table Attended by:
Course | SSN | Year | Period | Hours |
DVA123 234 2013 1 200
DVA222 132 2014 2 50
I am taking @year as a parameter that will decide what year in the course will be displayed in the result.
ALTER proc [dbo].[test4]
@year int
as
begin
-- I then declare the 2 tables which I will then store the values from the tables
DECLARE @Table1 TABLE(
Firstname varchar(30) NOT NULL,
Lastname varchar(30) NOT NULL
);
DECLARE @Table2 TABLE(
Course varchar(30) NULL
);
Declare @variable varchar(max) -- variable for saving the cursor value and then set the course1 to 4
I want at highest 4 results/course instances which I later order by the period of the year
declare myCursor1 CURSOR
for SELECT top 4 period from Course instance
where year = @year
open myCursor1
fetch next from myCursor1 into @variable
--print @variable
while @@fetch_status = 0
Begin
UPDATE @Table2
SET InstanceCourse1 = @variable
where current of myCursor1
fetch next from myCursor1 into @variable
print @variable
End
Close myCursor1
deallocate myCursor1
insert into @table1
select 'Firstname', 'Lastname'
insert into @table1
select Firstname, Lastname from staff order by Lastname
END
select * from @Table1 -- for testing purposes
select * from @Table2 -- for testing purposes
--Then i want to combine these tables into the output at the top
This is how far I've gotten. I don't know how to get the courses into the columns and then get the number of hours for each staff member.
If anyone can help guide me in the right direction I would be very grateful. My idea with the cursor was to get the top (0-4) values from the top 4 course periods during that year and then add them to @Table2.
Ok. This is not pretty. It is a really ugly dynamic sql, but in my testing it seems to be working. I have created an extra subquery to get the courses values as the first row and then Union with the rest of the result. The top four courses are gathered by using ROW_Number() and order by Year and period. I had to make different versions of the courses string I am creating in order to use them for both column names, and in my pivot. Give it a try. Hopefully it will work on your data as well.
DECLARE @Year INT
SET @Year = 2014
DECLARE @Query NVARCHAR(2000)
DECLARE @CoursesColumns NVARCHAR(2000)
SET @CoursesColumns = (SELECT '''' + Course + ''' as c' + CAST(ROW_NUMBER() OVER(ORDER BY Year, Period) AS nvarchar(50)) + ',' AS 'data()'
FROM AttendedBy where [Year] = @Year
for xml path(''))
SET @CoursesColumns = LEFT(@CoursesColumns, LEN(@CoursesColumns) -1)
SET @CoursesColumns =
CASE
WHEN CHARINDEX('c1', @CoursesColumns) = 0 THEN @CoursesColumns + 'NULL as c1, NULL as c2, NULL as c3, NULL as c4'
WHEN CHARINDEX('c2', @CoursesColumns) = 0 THEN @CoursesColumns + ',NULL as c2, NULL as c3, NULL as c4'
WHEN CHARINDEX('c3', @CoursesColumns) = 0 THEN @CoursesColumns + ', NULL as c3, NULL as c4'
WHEN CHARINDEX('c4', @CoursesColumns) = 0 THEN @CoursesColumns + ', NULL as c4'
ELSE @CoursesColumns
END
DECLARE @Courses NVARCHAR(2000)
SET @Courses = (SELECT Course + ' as c' + CAST(ROW_NUMBER() OVER(ORDER BY Year, Period) AS nvarchar(50)) + ',' AS 'data()'
FROM AttendedBy where [Year] = @Year
for xml path(''))
SET @Courses = LEFT(@Courses, LEN(@Courses) -1)
SET @Courses =
CASE
WHEN CHARINDEX('c1', @Courses) = 0 THEN @Courses + 'NULL as c1, NULL as c2, NULL as c3, NULL as c4'
WHEN CHARINDEX('c2', @Courses) = 0 THEN @Courses + ',NULL as c2, NULL as c3, NULL as c4'
WHEN CHARINDEX('c3', @Courses) = 0 THEN @Courses + ', NULL as c3, NULL as c4'
WHEN CHARINDEX('c4', @Courses) = 0 THEN @Courses + ', NULL as c4'
ELSE @Courses
END
DECLARE @CoursePivot NVARCHAR(2000)
SET @CoursePivot = (SELECT Course + ',' AS 'data()'
FROM AttendedBy where [Year] = @Year
for xml path(''))
SET @CoursePivot = LEFT(@CoursePivot, LEN(@CoursePivot) -1)
SET @Query = 'SELECT Name, LastName, c1, c2, c3, c4
FROM (
SELECT ''Name'' as name, ''LastName'' as lastname, ' + @CoursesColumns +
' UNION
SELECT Name, LastName,' + @Courses +
' FROM(
SELECT
s.Name
,s.LastName
,ci.Course
,ci.Year
,ci.Period
,CAST(ab.Hours AS NVARCHAR(100)) AS Hours
FROM Staff s
LEFT JOIN AttendedBy ab
ON
s.SSN = ab.SSN
LEFT JOIN CourseInstance ci
ON
ab.Course = ci.Course
WHERE ci.Year=' + CAST(@Year AS nvarchar(4)) +
' ) q
PIVOT(
MAX(Hours)
FOR
Course
IN (' + @CoursePivot + ')
)q2
)q3'
SELECT @Query
execute(@Query)
Edit: Added some where clauses so only courses from given year is shown. Added Screenshot of my results.
try this
DECLARE @CourseNameString varchar(max),
@query AS NVARCHAR(MAX);
SET @CourseNameString=''
select @CourseNameString = STUFF((SELECT distinct ',' + QUOTENAME(Course)
FROM Attended where [Year]= 2013
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = '
select Name,LastName,'+@CourseNameString+' from Staff as e inner join (
SELECT * FROM
(SELECT [Hours],a.SSN,a.Course as c FROM Attended as a inner JOIN Staff as s
ON a.SSN = s.SSN) p
PIVOT(max([Hours])FOR c IN ('+@CourseNameString+')) pvt)p
ON e.SSN = p.SSN'
execute(@query)
Use subquery like this one :
SELECT Firstname, Lastname, (select instanceCourse1 from table2) as InstanceCourse1 from Table1

Converting updated column values to a table as rows

ID  State  Name     Department     City
1   O      George   Sales          Phoenix
1   N      George   Sales          Denver
2   O      Michael  Order Process  San diego
2   N      Michael  Marketing      San jose
I need to convert the above table's values to the following format (the top row contains the column names):
ID  Column      OldValue       New Value
1   Department  Phoenix        Denver
2   Department  Order Process  Marketing
2   City        San diego      San jose
That is, I need to capture the changed column values for a table from its old and new records and record them in a different table. The problem is that we have many tables like this, and the column names and number of columns are different for each table.
If anyone can come up with a solution, that would be greatly appreciated!
Thank you in advance.
Is this what you want?
ID  Column      OldValue       New Value
1   City        Phoenix        Denver
2   Department  Order Process  Marketing
2   City        San Diego      San jose
Here is the dynamic code:
DECLARE @sqlStm varchar(max);
DECLARE @sqlSelect varchar(max);
DECLARE @tablename varchar(200);
SET @tablename = 'testtable';
-- Assume table has ID column and State column.
SET @sqlSelect = ''
SET @sqlStm = 'WITH old AS
(
SELECT *
FROM '+@tablename+'
WHERE State=''O''
), new AS
(
SELECT *
FROM '+@tablename+'
WHERE State=''N''
)';
DECLARE @aCol varchar(128)
DECLARE curCols CURSOR FOR
SELECT column_name
FROM information_schema.columns
WHERE table_name = @tablename
AND UPPER(column_name) NOT IN ('ID','STATE')
OPEN curCols
FETCH curCols INTO @aCol
WHILE (@@FETCH_STATUS = 0)
BEGIN
SET @sqlStm = @sqlStm +
', changed'+@aCol+' AS
(
SELECT n.ID, '''+@aCol+''' AS [Column], o.['+@aCol+'] AS oldValue, n.['+@aCol+'] AS newValue
FROM new n
JOIN old o ON n.ID = o.ID AND n.['+@aCol+'] != o.['+@aCol+']
)'
IF LEN(@sqlSelect) > 0 SET @sqlSelect = @sqlSelect + ' UNION ALL '
SET @sqlSelect = @sqlSelect + '
SELECT * FROM changed'+@aCol
FETCH curCols INTO @aCol
END
CLOSE curCols
DEALLOCATE curCols
SET @sqlSelect = @sqlSelect + '
ORDER BY id, [Column]'
PRINT @sqlStm+@sqlSelect
EXEC (@sqlStm+@sqlSelect)
Which in my test output the following:
WITH old AS
(
SELECT *
FROM testtable
WHERE State='O'
), new AS
(
SELECT *
FROM testtable
WHERE State='N'
), changedName AS
(
SELECT n.ID, 'Name' AS [Column], o.[Name] AS oldValue, n.[Name] AS newValue
FROM new n
JOIN old o ON n.ID = o.ID AND n.[Name] != o.[Name]
), changedDepartment AS
(
SELECT n.ID, 'Department' AS [Column], o.[Department] AS oldValue, n.[Department] AS newValue
FROM new n
JOIN old o ON n.ID = o.ID AND n.[Department] != o.[Department]
), changedCity AS
(
SELECT n.ID, 'City' AS [Column], o.[City] AS oldValue, n.[City] AS newValue
FROM new n
JOIN old o ON n.ID = o.ID AND n.[City] != o.[City]
)
SELECT * FROM changedName UNION ALL
SELECT * FROM changedDepartment UNION ALL
SELECT * FROM changedCity
ORDER BY id, [Column]
Original answer below:
I would do it like this -- because I think it is clearer than other ways which might be faster:
with old as
(
Select ID, Name,Department,City
From table1
Where State='O'
), new as
(
Select ID, Name,Department,City
From table1
Where State='N'
), oldDepartment as
(
Select n.ID, 'Department' as [Column], o.Department as oldValue, n.Department as newValue
From new n
join old o on n.ID = o.ID and n.Department != o.Department
), oldCity as
(
Select n.ID, 'City' as [Column], o.City as oldValue, n.City as newValue
From new n
join old o on n.ID = o.ID and n.City != o.City
)
select * from oldDepartment
union all
select * from oldCity
Depending on many things (size of tables and indexes etc) it might actually be faster than using pivots or cases or grouping. It really depends on your data. If this is a one-off run I'd just go for the easiest to grok.
The cleanest approach is probably to unpivot the data and then use aggregation. This does require custom coding for each table, which you might be able to generalize by using some form of dynamic SQL.
For your particular example, here is an illustration of what to do:
select id, col,
max(case when OldNew = 'Old' then value end) as OldValue,
max(case when OldNew = 'New' then value end) as NewValue
from ((select ID, OldNew, 'Name' as col, Name as value
from t
) union all
(select ID, OldNew, 'Department' as col, Department as value
from t
) union all
(select ID, OldNew, 'City' as col, City as value
from t
)
) unpvt
group by id, col
having max(value) <> min(value) and max(value) is not null;
This is for illustration purposes. The unpivot can be done more efficiently than using union all, particularly when there are many scans. Here is a more efficient version, although the exact syntax depends on the database:
select id, col,
max(case when OldNew = 'Old' then value end) as OldValue,
max(case when OldNew = 'New' then value end) as NewValue
from (select ID, OldNew, cols.col,
(case when cols.col = 'Name' then Name
when cols.col = 'Department' then Department
when cols.col = 'City' then City
end) as value
from t cross join
(select 'Name' as col union all select 'Department' union all select 'City') cols
) unpvt
group by id, col
having max(value) <> min(value) and max(value) is not null;
This is more efficient because it will typically only scan your table once, rather than once for each column as in the union all version.
In either version, there is an implicit assumption that all the columns have the same character type. This is implicit in the format you are converting to, where all the values are in a single column.
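To see the unpivot-then-aggregate idea run end to end, here is a small sketch on SQLite via Python using the sample rows from the question. It follows the UNION ALL version, but filters on the question's `State` column ('O'/'N') instead of `OldNew`. The `changed_columns` helper name is made up for the example, and all three value columns are assumed to be text.

```python
import sqlite3

def changed_columns(rows):
    """Return (ID, column, old value, new value) for every column that changed."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE t (ID INTEGER, State TEXT, Name TEXT, Department TEXT, City TEXT)"
    )
    con.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", rows)
    result = con.execute("""
        SELECT ID, col,
               MAX(CASE WHEN State = 'O' THEN value END) AS OldValue,
               MAX(CASE WHEN State = 'N' THEN value END) AS NewValue
        FROM (SELECT ID, State, 'Name' AS col, Name AS value FROM t
              UNION ALL
              SELECT ID, State, 'Department', Department FROM t
              UNION ALL
              SELECT ID, State, 'City', City FROM t) AS unpvt
        GROUP BY ID, col
        -- old and new differ exactly when the group's MAX and MIN differ
        HAVING MAX(value) <> MIN(value)
        ORDER BY ID, col
    """).fetchall()
    con.close()
    return result

rows = [
    (1, "O", "George", "Sales", "Phoenix"),
    (1, "N", "George", "Sales", "Denver"),
    (2, "O", "Michael", "Order Process", "San diego"),
    (2, "N", "Michael", "Marketing", "San jose"),
]
for change in changed_columns(rows):
    print(change)
```

Note that the HAVING test relies on each ID having exactly one old and one new row. A change to or from NULL would not be reported, because MAX and MIN skip NULLs.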

How to count in SQL all fields with null values in one record?

Is there any way to count all fields with null values for specific record excluding PrimaryKey column?
Example:
ID Name Age City Zip
1 Alex 32 Miami NULL
2 NULL 24 NULL NULL
As output I need to get 1 and 3. Without explicitly specifying column names.
declare @T table
(
ID int,
Name varchar(10),
Age int,
City varchar(10),
Zip varchar(10)
)
insert into @T values
(1, 'Alex', 32, 'Miami', NULL),
(2, NULL, 24, NULL, NULL)
;with xmlnamespaces('http://www.w3.org/2001/XMLSchema-instance' as ns)
select ID,
(
select *
from @T as T2
where T1.ID = T2.ID
for xml path('row'), elements xsinil, type
).value('count(/row/*[@ns:nil = "true"])', 'int') as NullCount
from @T as T1
Result:
ID NullCount
----------- -----------
1 1
2 3
Update:
Here is a better version. Thanks to Martin Smith.
;with xmlnamespaces('http://www.w3.org/2001/XMLSchema-instance' as ns)
select ID,
(
select T1.*
for xml path('row'), elements xsinil, type
).value('count(/row/*[@ns:nil = "true"])', 'int') as NullCount
from @T as T1
Update:
And with a bit faster XQuery expression.
;with xmlnamespaces('http://www.w3.org/2001/XMLSchema-instance' as ns)
select ID,
(
select T1.*
for xml path('row'), elements xsinil, type
).value('count(//*/@ns:nil)', 'int') as NullCount
from @T as T1
SELECT id,
CASE WHEN Name IS NULL THEN 1 ELSE 0 END +
CASE WHEN City IS NULL THEN 1 ELSE 0 END +
CASE WHEN Zip IS NULL THEN 1 ELSE 0 END
FROM YourTable
If you do not want explicit column names in query, welcome to dynamic querying
DECLARE @sql NVARCHAR(MAX) = ''
SELECT @sql = @sql + N' CASE WHEN '+QUOTENAME(c.name)+N' IS NULL THEN 1 ELSE 0 END +'
FROM sys.tables t
JOIN sys.columns c
ON t.object_id = c.object_id
WHERE
c.is_nullable = 1
AND t.object_id = OBJECT_ID('YourTableName')
SET @sql = N'SELECT id, '+@sql +N'+0 AS Cnt FROM [YourTableName]'
EXEC(@sql)
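The same "build the CASE expression from the catalog" idea can be tried on SQLite from Python, as a rough sketch. Here `PRAGMA table_info` stands in for `sys.columns`, and `quote_ident` plays the role of QUOTENAME. Both helper names are made up for this example, and unlike the script above it counts every non-key column rather than only the nullable ones.

```python
import sqlite3

def quote_ident(name):
    """Quote an identifier for SQLite by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'

def null_counts(con, table, key):
    """Return (key value, number of NULL columns) per row, skipping the key column."""
    # Each table_info row is (cid, name, type, notnull, dflt_value, pk).
    info = con.execute(f"PRAGMA table_info({quote_ident(table)})").fetchall()
    cols = [row[1] for row in info if row[1] != key]
    # One CASE per column, summed; fall back to 0 if there are no other columns.
    expr = " + ".join(
        f"CASE WHEN {quote_ident(c)} IS NULL THEN 1 ELSE 0 END" for c in cols
    ) or "0"
    sql = (f"SELECT {quote_ident(key)}, {expr} AS NullCount "
           f"FROM {quote_ident(table)} ORDER BY {quote_ident(key)}")
    return con.execute(sql).fetchall()

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE T (ID INTEGER, Name TEXT, Age INTEGER, City TEXT, Zip TEXT)")
con.executemany(
    "INSERT INTO T VALUES (?, ?, ?, ?, ?)",
    [(1, "Alex", 32, "Miami", None), (2, None, 24, None, None)],
)
print(null_counts(con, "T", "ID"))
```

With the sample rows from the question this gives 1 for ID 1 and 3 for ID 2, matching the expected output.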
This should solve your problem:
select count(id)
from YourTable
where ( isnull(Name,'') = '' or isnull(City,'') = '' or isnull(Zip,'') = '' )
Not a smart solution, but it should do the work.
DECLARE @tempSQL nvarchar(max)
SET @tempSQL = N'SELECT '
SELECT @tempSQL = @tempSQL + 'sum(case when ' + cols.name + ' is null then 1 else 0 end) "Null Values for ' + cols.name + '",
sum(case when ' + cols.name + ' is null then 0 else 1 end) "Non-Null Values for ' + cols.name + '",' FROM sys.columns cols WHERE cols.object_id = object_id('TABLE1');
SET @tempSQL = SUBSTRING(@tempSQL, 1, LEN(@tempSQL) - 1) + ' FROM TABLE1;'
EXEC sp_executesql @tempSQL