Comparing values between two rows of data and only showing the columns that are different

In a previous version of our application we used a particular field as the primary key, but because that field may represent different identities across various systems, we have made it a non-significant field (i.e. not a primary key or part of a composite primary key). However, since we don't have another system yet, users still use that field as their primary method of identification.
The problem is with auditing. Previously I used a single table for all audits in the database, dumping the data with a new-value/old-value schema using the generic trigger that is floating around. This could still work fine except for one thing: I have moved contact information into a separate table that is tied to the new primary key of the original table. So when changes are made, the unfamiliar and unused primary key shows in the audit log instead of the now insignificant foreignSystemID.
I moved to a one-to-one copy method of auditing, so that any change to any table is now written to a mirror image of that table in a different schema. The problem comes down to showing changes to the users. They are used to seeing a report that shows only the changed values for a particular doctor.
My question: using SQL queries and Crystal Reports, how can I show only the changed column values between rows in my audit tables? I have looked at the PIVOT command, but I don't think that is really going to help me. I have also looked at the code within the generic trigger script that compares the columns, determines whether they are different, and writes them to the table.
I'm really spinning in the sand here, and this is a critical issue for me to solve. Thanks in advance for ANY help.
We are early enough into production that I could change my change-tracking method if need be, but it needs to be soon. Thanks.
EDIT:
My boss and I have worked on this a bit, and this is what we have started with. I would like to get further opinions and options as well. Thanks.
CREATE TABLE #TEMP (
DoctorsID bigint,
TableName varchar(50),
FieldName varchar(50),
CurrentFieldValue varchar(255),
PreviousFieldValue varchar(255),
PreviousValueDate datetime
)
DECLARE @sql varchar(MAX)
SELECT
@sql = COALESCE(@sql,'') +
CAST(
'INSERT INTO #TEMP ' +
'SELECT ' +
'o.DoctorsID, ' +
'''' + TABLE_NAME + ''' ,' +
'''' + COLUMN_NAME + ''',' +
'o.' + COLUMN_NAME + ',' +
'a.' + COLUMN_NAME + ',' +
'a.AuditDate' +
' FROM ' +
'dbo.DoctorLicenses o ' +
'INNER JOIN Audit.DoctorLicenses a ON ' +
'o.DoctorsID = a.DoctorsID ' +
'WHERE ' +
'AuditDate BETWEEN ''10/01/2010'' AND ''10/31/2010'' AND ' +
'o.' + COLUMN_NAME + ' <> a.' + COLUMN_NAME +
';'
AS varchar(MAX))
FROM
INFORMATION_SCHEMA.COLUMNS AS [Fields]
WHERE
TABLE_SCHEMA = 'dbo' AND
TABLE_NAME = 'DoctorLicenses'
PRINT @sql
EXEC(@sql)
SELECT * FROM #TEMP
DROP TABLE #TEMP
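
As a possible alternative to generating one INSERT per column, the same "one row per changed column" shape can be sketched with CROSS APPLY (VALUES ...), which unpivots the columns in a single static query. This is only a sketch under assumptions: LicenseNumber, LicenseState and ExpirationDate are hypothetical column names (the question does not list the real ones), the table value constructor needs SQL Server 2008 or later, and the explicit NULL checks are added because a plain <> never reports a change to or from NULL.

```sql
-- Sketch: one output row per changed column, without dynamic SQL.
-- LicenseNumber, LicenseState and ExpirationDate are hypothetical columns.
SELECT
    o.DoctorsID,
    'DoctorLicenses' AS TableName,
    v.FieldName,
    v.CurrentFieldValue,
    v.PreviousFieldValue,
    a.AuditDate AS PreviousValueDate
FROM dbo.DoctorLicenses AS o
INNER JOIN Audit.DoctorLicenses AS a
    ON o.DoctorsID = a.DoctorsID
-- Unpivot each compared column into (name, current value, previous value).
CROSS APPLY (VALUES
    ('LicenseNumber',  CAST(o.LicenseNumber AS varchar(255)),
                       CAST(a.LicenseNumber AS varchar(255))),
    ('LicenseState',   CAST(o.LicenseState AS varchar(255)),
                       CAST(a.LicenseState AS varchar(255))),
    ('ExpirationDate', CONVERT(varchar(255), o.ExpirationDate, 120),
                       CONVERT(varchar(255), a.ExpirationDate, 120))
) AS v (FieldName, CurrentFieldValue, PreviousFieldValue)
WHERE a.AuditDate >= '20101001'
  AND a.AuditDate <  '20101101'
  -- Keep only columns whose value differs, treating NULL as a value.
  AND (   v.CurrentFieldValue <> v.PreviousFieldValue
       OR (v.CurrentFieldValue IS NULL AND v.PreviousFieldValue IS NOT NULL)
       OR (v.CurrentFieldValue IS NOT NULL AND v.PreviousFieldValue IS NULL));
```

Like the dynamic version above, this compares the current row against every audit row in the date range. The result set has the same columns as #TEMP, so it could feed the report directly.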

It sounds to me like there is a design issue, but I have a hard time envisioning what your design is at the moment. Can you be more specific on what your tables look like at the moment and what data you're trying to generate the report(s) on?
Also, when talking about auditing "the changed values", how do you keep track of what's been changed?

Related

Change * in SELECT to show all columns

I'm using SQL Server Management Studio.
Let's say I have a table with 100 fields, and I want to show 75 of them, how can I show all of the columns so then I can just comment out the ones I don't want? Basically de-nest the *...
Thanks!

You can open the columns "folder" under the table in the object explorer and drag the folder to your query window. It will generate the entire list of columns with commas. Not nicely formatted since it drops all the columns on a single line but it works.
You could also use sys.columns to help. This would let you copy and paste the results into your query window.
select name + ', '
from sys.columns
where object_id = object_id('YourTableName')
There are also lots and lots of third party tools that can do this kind of thing.

SSMS provides a GUI tool for this, but if you don't like the GUI you can use the script below:
declare
@table_name varchar(200) = 'Employees',
@column_sql varchar(max) = 'select ';
select
@column_sql = @column_sql + 'T.[' + COLUMN_NAME + '],'
from INFORMATION_SCHEMA.COLUMNS
where TABLE_NAME=@table_name;
select left(@column_sql,len(@column_sql)-1) + ' from ' + @table_name + ' T';
For the Northwind Employees sample table this gives the following result:
select
T.[EmployeeID],T.[LastName],T.[FirstName],T.[Title],
T.[TitleOfCourtesy],T.[BirthDate],T.[HireDate],
T.[Address],T.[City],T.[Region],T.[PostalCode],
T.[Country],T.[HomePhone],T.[Extension],T.[Photo],
T.[Notes],T.[ReportsTo],T.[PhotoPath]
from Employees T

SQL Update a column in a table using Trigger to a Function

I am trying to auto-generate a document reference number when a new document is added to a SQL table. The reference number is a concatenation of some of the other fields in that table.
Looking online, I can see that one method is to use the dbid column to generate a unique ID, then create a function to concatenate the fields and an insert trigger on the table to populate the column, but I've spent numerous hours on it and can't get it to work.
The table has the following columns:
Table: dbo.codeallocations21322
Columns:
Dbid
Projectcode
Type
Discipline
Hdlreference
So the hdlreference column would be populated with:
[projectcode]-[type]-[discipline]-[dbid]
With the [dbid] padded to 6 characters.
E.g. 21322-rfq-mech-000001
Any help would be much appreciated, or is there a better way?
Many thanks in advance.

You can use a computed column
alter table codeallocations21322
add hdlreference
as ( projectcode + '-' + type + '-' + discipline + '-' + dbid);

Many thanks John. I altered it slightly to convert the int dbID to a varchar, and it works great:
alter table CodeAllocations21322
add hdlreference2
as (ProjectCode + '-' + Type + '-' + Discipline + '-' + right('00000' + Cast (dbID AS varchar(5)), 5));
Thanks again, look forward to talking to you in the future.
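
Note that the expression above pads to 5 characters, while the question asks for 6 (e.g. 000001). A hedged variant that pads to 6 might look like the following. The column name hdlreference6 is hypothetical, and the sketch assumes ProjectCode, Type and Discipline are character columns and dbID is an int.

```sql
-- Sketch: computed column with the ID zero-padded to 6 characters.
-- hdlreference6 is a hypothetical name chosen to avoid clashing with existing columns.
ALTER TABLE CodeAllocations21322
ADD hdlreference6
AS (ProjectCode + '-' + Type + '-' + Discipline + '-'
    + RIGHT('000000' + CAST(dbID AS varchar(10)), 6));
```

RIGHT keeps only the last 6 characters, so IDs above 999999 would be truncated rather than widened.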

How can I exclude GUIDs from a select distinct without listing all other columns in a table?

So let's say 'table' has two columns that act as a GUID - the ID column and msrepl_tran_version. Our original programmer did not know that our replication created this column and included it in a comparison, which has resulted in almost 20,000 records being put into this table, of which only 1,588 are ACTUALLY unique, and it's causing long load times.
I'm looking for a way to exclude the ID and replication columns from a select distinct, without having to then list every single column in the table, since I'm going to have to select from the record set multiple times to fix this (there are other tables affected and the query is going to be ridiculous) I don't want to have to deal with my code being messy if I can help it.
Is there a way to accomplish this without listing all of the other columns?
Select distinct {* except ID, msrepl_tran_version} from table
Other than (where COL_1 is ID and COL_N is the replication GUID)
Select distinct COL_2, ..., COL_N-1, COL_N+1, ... from table

After more searching, I found the answer:
SELECT * INTO #temp FROM table
ALTER TABLE #temp DROP COLUMN id
ALTER TABLE #temp DROP COLUMN msrepl_tran_version
SELECT DISTINCT * FROM #temp
This works for what I need. Thanks for the answers guys!

Absolutely, 100% not possible, there is no subtract columns instruction.

It can't be done in the spirit of the OP's initial question. However, it can be done with dynamic sql:
--Dynamically build list of column names.
DECLARE @ColNames NVARCHAR(MAX) = ''
SELECT @ColNames = @ColNames + '[' + c.COLUMN_NAME + '],'
FROM INFORMATION_SCHEMA.COLUMNS c
WHERE c.TABLE_SCHEMA = 'dbo'
AND c.TABLE_NAME = 'YourTable'
--Exclude these.
AND c.COLUMN_NAME NOT IN ('ID', 'msrepl_tran_version')
--Keep original column order for appearance, convenience.
ORDER BY c.ORDINAL_POSITION
--Remove trailing comma.
SET @ColNames = LEFT(@ColNames, LEN(@ColNames) - 1)
--Verify query
PRINT ('SELECT DISTINCT ' + @ColNames + ' FROM [dbo].[YourTable]')
--Uncomment when ready to proceed.
--EXEC ('SELECT DISTINCT ' + @ColNames + ' FROM [dbo].[YourTable]')
One additional note: since you need to select from the record set multiple times and potentially join to other tables, you can use the above to create a view on the table. This should make your code fairly clean.
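
A minimal sketch of that view idea, assuming the same placeholder table dbo.YourTable; the view name YourTable_Distinct is hypothetical. The CREATE VIEW statement is run through sp_executesql because it has to be the first statement in its batch.

```sql
-- Sketch: build the column list, then create a view that hides the excluded columns.
-- dbo.YourTable is a placeholder; YourTable_Distinct is a hypothetical view name.
DECLARE @ColNames nvarchar(MAX) = N'';

SELECT @ColNames = @ColNames + QUOTENAME(c.COLUMN_NAME) + N','
FROM INFORMATION_SCHEMA.COLUMNS AS c
WHERE c.TABLE_SCHEMA = 'dbo'
  AND c.TABLE_NAME = 'YourTable'
  AND c.COLUMN_NAME NOT IN ('ID', 'msrepl_tran_version');

-- Remove the trailing comma.
SET @ColNames = LEFT(@ColNames, LEN(@ColNames) - 1);

DECLARE @sql nvarchar(MAX) =
    N'CREATE VIEW dbo.YourTable_Distinct AS SELECT DISTINCT '
    + @ColNames + N' FROM dbo.YourTable;';

-- Inspect the generated statement before running it.
PRINT @sql;
EXEC sys.sp_executesql @sql;
```

Afterwards, SELECT * FROM dbo.YourTable_Distinct returns the de-duplicated rows without repeating the column list in every query.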

Is there a better way to apply isnull to all columns than what I'm doing?

A number of times over the last month I've had to replace NULL with 0 in every column returned from a query.
To save a lot of time (some of these queries return a large number of columns) I've been using the following and then pasting the results for the relevant columns into a new query:
select ', isnull(' + COLUMN_NAME + ', 0)' + ' as ' + COLUMN_NAME
from INFORMATION_SCHEMA.COLUMNS
where TABLE_NAME = 'summary_by_scca_sales_category'
and TABLE_SCHEMA = 'property'
Essentially I'm wondering whether there's a better way to do this. Ideally I'd like a method that automatically applies isnull to all columns returned by a query (without using two queries).
For example:
I want to take a query like:
select *
from tablename
And for every column returned by * replace null results with 0 without having to write an isnull() line for each column.
edit:
I will accomplish this with a view (doh, should have thought of that). For interest's and education's sake, is there a way to do something like this with code as well?

You could create a VIEW against the tables in question where the ISNULL logic you want is set up. Then queries against the views would return the data you want.
EDIT:
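As requested, here is some sample code that generates the CREATE VIEW statement automatically. This is pretty gross, but for something that only has to be run once it will work. Beware of type issues (you stated everything should transmute to 0, so I assume all your columns are of a suitable numeric type):
DECLARE @table_name sysname
DECLARE @table_def varchar(max)
-- Replace YourTableName with the name of the table in question
SET @table_name = 'YourTableName'
SET @table_def = 'CREATE VIEW ' + @table_name + '_NoNull AS SELECT '
-- Append one ISNULL(...) expression per column
SELECT @table_def = @table_def +
'ISNULL(' + QUOTENAME(c.name) + ', 0) AS ' + QUOTENAME(c.name) + ', '
FROM sys.tables t
INNER JOIN sys.columns c ON t.object_id = c.object_id
WHERE t.name = @table_name
-- Drop the trailing comma and add the FROM clause
SET @table_def = LEFT(@table_def, LEN(@table_def) - 1) + ' FROM ' + @table_name
-- Review the generated statement, then run it
SELECT @table_def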

SQL Server Compare similar tables with query

Simple concept: we are basically doing some auditing, comparing what came in with what actually happened during processing. I am looking for a better way to write a query that does side-by-side table comparisons when the columns differ slightly in name and potentially in type.
DB Layout:
Table (* is the join condition)
Log (Un-altered data record.)
- LogID
- RecordID*
- Name
- Date
- Address
- Products
- etc.
Audit (post processing record)
- CardID*
- CarName
- DeploymentDate
- ShippingAddress
- Options
- etc.
For example this would work if you look past the annoying complexity to write, and performance issues.
The query just joins the left and right and selects them as strings. Showing each field matched up.
select
cast(log.RecordID as varchar(40)) + '=' + cast(audit.CardID as varchar(40)),
log.Name + '=' + audit.CarName,
cast(log.Date as varchar(40)) + '=' + cast(audit.DeploymentDate as varchar(40)),
log.Address + '=' + audit.ShippingAddress,
log.Products + '=' + audit.Options
--etc
from Audit audit, Log log
where audit.CardID=log.RecordId
Which would output something like:
1=1 Test=TestName 11/09/2009=11/10/2009 null=My Address null=Wheels
This works but is extremely annoying to build. Another thing I thought of was to just alias the columns, union the two tables, and order them so they would be in list form. This would allow me to see the column comparisons. This comes with the obvious overhead of the union all.
ie:
Log 1 Test 11/09/2009 null, null
Audit 1 TestName 11/10/2009 My Address Wheels
Any suggestions on a better way to audit this data?
Let me know what other questions you may have.
Additional notes: we are going to want to reduce the unimportant information, so in some cases we might null the column if the two values are equal (but I know it's too slow):
case when log.[Name]<>audit.[CarName] then (log.[Name] + '!=' + audit.[CarName]) else null end
or if we are doing the second way
nullif(log.[Name], audit.[CarName]) as [Name]
,nullif(audit.[CarName], log.[Name]) as [Name]

I've found the routine given here by Jeff Smith to be helpful for doing table comparisons in the past. This might at least give you a good base to start from. The code given on that link is:
CREATE PROCEDURE CompareTables(@table1 varchar(100),
@table2 Varchar(100), @T1ColumnList varchar(1000),
@T2ColumnList varchar(1000) = '')
AS
-- Table1, Table2 are the tables or views to compare.
-- T1ColumnList is the list of columns to compare, from table1.
-- Just list them comma-separated, like in a GROUP BY clause.
-- If T2ColumnList is not specified, it is assumed to be the same
-- as T1ColumnList. Otherwise, list the columns of Table2 in
-- the same order as the columns in table1 that you wish to compare.
--
-- The result is all records from either table that do NOT match
-- the other table, along with which table the record is from.
declare @SQL varchar(8000);
IF @t2ColumnList = '' SET @T2ColumnList = @T1ColumnList
set @SQL = 'SELECT ''' + @table1 + ''' AS TableName, ' + @t1ColumnList +
' FROM ' + @Table1 + ' UNION ALL SELECT ''' + @table2 + ''' As TableName, ' +
@t2ColumnList + ' FROM ' + @Table2
set @SQL = 'SELECT Max(TableName) as TableName, ' + @t1ColumnList +
' FROM (' + @SQL + ') A GROUP BY ' + @t1ColumnList +
' HAVING COUNT(*) = 1'
exec ( @SQL)

Would something like this work for you:
select
(Case when log.RecordID = audit.CardID THEN 1 else 0 END) as RecordIdEqual,
(Case when log.Name = audit.CarName THEN 1 else 0 END) as NamesEqual,
(Case when log.Date = audit.DeploymentDate THEN 1 else 0 END) as DatesEqual,
(Case when log.Address = audit.ShippingAddress THEN 1 else 0 END) as AddressEqual,
(Case when log.Products = audit.Options THEN 1 else 0 END) as ProductsEqual
--etc
from Audit audit, Log log
where audit.CardID=log.RecordId
This will give you a break down of what's equal based on the column name. Seems like it might be easier than doing all the casting and having to interpret the resulting string...
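
One caveat with plain = comparisons is that a NULL on either side yields 0, even when both values are NULL. If two NULLs should count as equal, a NULL-safe variant can be sketched with EXISTS (... INTERSECT ...), since INTERSECT treats NULLs as matching. The table and column names follow the question's layout; the dbo schema and the aliases l and a are assumptions, and each compared pair is assumed to have compatible types.

```sql
-- Sketch: NULL-safe equality flags via EXISTS (SELECT x INTERSECT SELECT y).
-- dbo schema and the aliases l / a are assumptions for illustration.
SELECT
    l.RecordID,
    CASE WHEN EXISTS (SELECT l.[Name] INTERSECT SELECT a.CarName)
         THEN 1 ELSE 0 END AS NamesEqual,
    CASE WHEN EXISTS (SELECT l.[Date] INTERSECT SELECT a.DeploymentDate)
         THEN 1 ELSE 0 END AS DatesEqual,
    CASE WHEN EXISTS (SELECT l.[Address] INTERSECT SELECT a.ShippingAddress)
         THEN 1 ELSE 0 END AS AddressEqual,
    CASE WHEN EXISTS (SELECT l.Products INTERSECT SELECT a.Options)
         THEN 1 ELSE 0 END AS ProductsEqual
FROM dbo.[Log] AS l
INNER JOIN dbo.[Audit] AS a
    ON a.CardID = l.RecordID;
```

With this form, a pair where both sides are NULL reports 1, and a pair where only one side is NULL reports 0.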
