Identify Row as having changes excluding changes in certain columns - sql

Within our business rules, we need to track when a row is designated as being changed. The table contains multiple columns designated as non-relevant per our business purposes (such as a date entered field, timestamp, reviewed bit field, or received bit field). The table has many columns and I'm trying to find an elegant way to determine if any of the relevant fields have changed and then record an entry in an auditing table (entering the PK value of the row - the PK cannot be edited). I don't even need to know which column actually changed (although it would be nice down the road).
I am able to accomplish it through a stored procedure, but it is an ugly SP using the following syntax for an update (OR statements shortened considerably for this post):
INSERT INTO [TblSourceDataChange] (pkValue)
SELECT d.pkValue
FROM deleted d INNER JOIN inserted i ON d.pkValue=i.pkValue
WHERE ( i.[F440] <> d.[F440]
OR i.[F445] <> d.[F445]
OR i.[F450] <> d.[F450])
I'm trying to find a generic way where I could designate the ignore fields and the stored proc would still work even if I added additional relevant fields to the table. The non-relevant fields do not change very often, whereas the relevant fields tend to be a little more dynamic.

Have a look at Change Data Capture. This is a new feature in SQL Server 2008.
First, you enable CDC on the database:
EXEC sys.sp_cdc_enable_db
Then you can enable it on specific tables, and specify which columns to track:
EXEC sys.sp_cdc_enable_table
    @source_schema = 'dbo',
    @source_name = 'xxx',
    @supports_net_changes = 1,
    @role_name = NULL,
    @captured_column_list = N'xxx1,xxx2,xxx3'
This creates a change table named cdc.dbo_xxx. Any changes made to records in the table are recorded in that table.
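Once CDC is up, you can read the captured rows through the generated table-valued function. A sketch, assuming the dbo.xxx example above (the function and capture-instance names are derived from your schema and table):

```sql
-- Read every change captured so far for dbo.xxx.
-- cdc.fn_cdc_get_all_changes_dbo_xxx is generated by
-- sys.sp_cdc_enable_table from the capture instance name.
DECLARE @from_lsn BINARY(10) = sys.fn_cdc_get_min_lsn('dbo_xxx');
DECLARE @to_lsn   BINARY(10) = sys.fn_cdc_get_max_lsn();

SELECT *
FROM cdc.fn_cdc_get_all_changes_dbo_xxx(@from_lsn, @to_lsn, N'all');
```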

I object! The one word I cannot use to describe the options available is elegant. I have yet to find a satisfying way to accomplish what you want. There are options, but all of them feel a bit unsatisfactory. When/why you choose between them depends on some factors you didn't mention.
How often do you need to "ask" what fields changed? Meaning, do users only occasionally click the "audit history" link, or does your app need this all the time to decide how to behave?
How much does disk space cost you? I'm not being flippant; I've worked places where the storage strategy for our auditing was a million-dollar issue based on what we were being charged for SAN space -- meaning the expense of having SQL Server reconstitute data wasn't a consideration, storage size was. Your situation may be the same, or the inverse.
Change Data Capture
As @TGnat mentioned, you can use CDC. This method is great because you simply enable change tracking, then call the sproc to start tracking. CDC is nice because it's pretty efficient storage- and horsepower-wise. You also kind of set it and forget it---that is, until developers come along and want to change the shape of your tables. For developer sanity you'll want to generate a script that disables/enables tracking for your entities.
I noticed you want to exclude certain columns, rather than include them. You could accomplish this with a FOR XML PATH trick. You could write a query something like the following, then use the @capturedColList variable when calling sys.sp_cdc_enable_table:
SET @capturedColList = Substring(
    (
        SELECT ',' + COLUMN_NAME
        FROM INFORMATION_SCHEMA.COLUMNS
        WHERE TABLE_NAME = '<YOUR_TABLE>' AND
              COLUMN_NAME NOT IN ('excludedA', 'excludedB')
        FOR XML PATH('')
    ), 2, 8000)
Triggers w/Cases
The second option I see is some sort of code generation. It could be an external harness or a SPROC that writes your triggers. Whatever your poison, it will need to be automated and generic. You'll basically have code that writes the DDL for triggers that compare current values to INSERTED or DELETED using tons of unwieldy CASE statements, one per column.
There is a discussion of the style here.
Log Everything, Sort it out later
The last option is to use a trigger to log every row change. Then you write code (SPROCs/UDFs) that can look through your log data and recognize when a change has occurred. Why would you choose this option? Disk space isn't a concern, and while you need to be able to understand what changed, you only rarely ask the system this question.
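A minimal sketch of the log-everything trigger, with hypothetical table and column names (the log table simply mirrors the audited columns plus a timestamp):

```sql
-- Hypothetical names: TblSource is the audited table, TblSourceLog
-- mirrors its relevant columns plus audit metadata.
CREATE TRIGGER trgTblSourceLog
ON TblSource
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- Log the pre-update image of every affected row; sorting out
    -- what actually changed is deferred to later analysis.
    INSERT INTO TblSourceLog (pkValue, F440, F445, F450, LoggedAt)
    SELECT d.pkValue, d.F440, d.F445, d.F450, GETDATE()
    FROM deleted d;
END;
```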
HTH,
-eric

Use a trigger and make sure it can handle multiple row inserts.

I found the answer in the post SQL Server Update, Get only modified fields and adapted the SQL to fit my needs (this sql is in a trigger). The SQL is posted below:
DECLARE @idTable INT

SELECT @idTable = T.id
FROM sysobjects P JOIN sysobjects T ON P.parent_obj = T.id
WHERE P.id = @@PROCID

IF EXISTS
(SELECT * FROM syscolumns WHERE id = @idTable
AND CONVERT(VARBINARY, REVERSE(COLUMNS_UPDATED())) & POWER(CONVERT(BIGINT, 2), colorder - 1) > 0 AND name NOT IN ('timestamp','Reviewed')
)
BEGIN
--Do appropriate stuff here
END
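Wrapped in a complete trigger, the pattern might look like this (the audited table name is a placeholder; the audit table comes from the question):

```sql
CREATE TRIGGER trgTrackRelevantChanges
ON dbo.TblSourceData   -- placeholder name for the audited table
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @idTable INT;

    SELECT @idTable = T.id
    FROM sysobjects P
    JOIN sysobjects T ON P.parent_obj = T.id
    WHERE P.id = @@PROCID;

    -- Fire only if at least one updated column is outside the ignore list
    IF EXISTS
    (
        SELECT *
        FROM syscolumns
        WHERE id = @idTable
          AND CONVERT(VARBINARY, REVERSE(COLUMNS_UPDATED()))
              & POWER(CONVERT(BIGINT, 2), colorder - 1) > 0
          AND name NOT IN ('timestamp', 'Reviewed')
    )
    BEGIN
        INSERT INTO TblSourceDataChange (pkValue)
        SELECT d.pkValue
        FROM deleted d INNER JOIN inserted i ON d.pkValue = i.pkValue;
    END
END;
```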

Related

How to update and select records in the same sql query

As a typical scenario in any prod environment, we have multiple nodes which fetch and process items from the database (Oracle).
We want to make sure that each node fetches a unique set of items from the database and acts on it. To make this possible we are looking at whether it is possible to update the records' status (e.g., from Idle to In-Process) and have the same update query return the records it updated. That way every node will act on its own set of records and not interfere with each other's set.
We want to avoid PL/SQL for maintenance reasons. We tried SELECT ... FOR UPDATE, but in a few cases it led to database locks being held for long periods of time.
Any suggestions on how to achieve this through simple sql or hibernate (since we have hibernate option available as well)?
A couple of thoughts on this. First up, in Oracle you can use the RETURNING clause as part of an update statement to return selected columns (such as the primary key) from the table being updated into a collection. This method does require PL/SQL, since you need to work with collections, although BULK operations will mitigate some of the drawbacks of using PL/SQL.
Another option would be to add a column to your table where you can indicate which node is processing the record(s), similar to your idea of a status column indicating Idle, or Processing statuses. This one would be NULL for not being handled, or a value uniquely identifying the node or process working on the record.
A little extra research led to this post here on Stack about using Oracle's RETURNING INTO statement with Java. It also leads right back to Oracle's own documentation on its DML Returning feature as supported by Java.
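For reference, the RETURNING ... BULK COLLECT shape in PL/SQL looks roughly like this (table and column names are illustrative, and bare ROWNUM does not guarantee any particular ordering):

```sql
DECLARE
  TYPE t_ids IS TABLE OF items.id%TYPE;
  l_ids t_ids;
BEGIN
  -- Claim up to 100 idle rows and collect their ids in one statement.
  UPDATE items
     SET status = 'IN_PROCESS'
   WHERE status = 'IDLE'
     AND ROWNUM <= 100
  RETURNING id BULK COLLECT INTO l_ids;
  COMMIT;
END;
```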
Finally we were able to find the solution to our problem. Our problem statement: claim the top 100 items from the list, ordered by time of creation, i.e. FIFO. Each node picks up the top 100 items from the database and starts processing them. This way each node works on its own set of items, without overlapping each other's path.
We achieved this by creating a TYPE in the Oracle database, and then used Hibernate to claim the items and store the claimed items temporarily in the TYPE. Here is the code:
create type TMP_TYPE as table of VARCHAR2(1000);
// Hibernate code
String query = "BEGIN UPDATE OUR_TABLE SET OUR_TABLE_STATUS = 'IP' "
    + "WHERE OUR_TABLE_STATUS = 'ID' AND ID_OUR_TABLE IN "
    + "(SELECT ID_OUR_TABLE FROM (SELECT ID_OUR_TABLE FROM OUR_TABLE ORDER BY AGEING_SINCE ASC)) "
    + "AND ROWNUM < 101 RETURNING UUID BULK COLLECT INTO ?; END;";
Connection connection = getSession().connection();
CallableStatement cStmt = connection.prepareCall(query);
cStmt.registerOutParameter(1, Types.ARRAY, "TMP_TYPE");
cStmt.execute();
String[] updateBulkCollectArr = (String[]) (cStmt.getArray(1).getArray());
Got the idea from here: Oracle Type and Bulk Collect.
Thanks @Sentinel

Query a SQL Database & Edit the found data set

I know this question has probably been asked before; I just can't manage to get mine going. I set up my SQL database to have two tables, but in this instance I will only be using one, called 'Book'. It has various columns, but the ones I want to work with are called 'WR', 'Customer', 'Contact', 'Model', 'SN', 'Status', 'Tech', 'WDone' and 'IN'.
I want to enter text into an edit box called edtWR and I want the button btnSearch to search the 'WR' column until it has a match (all of the entries will be different). Once it has that, it must write 'Customer', 'Contact', 'Model', 'SN' and 'Status' to labels; let's call them lblCustomer, lblContact, lblModel, lblSN and lblStatus.
Once the person has verified that that is the 'WR' they want, they must enter text into edit boxes and one memo, called edtTech, mmoWDone and edtIN, and click on btnUpdate. That should then update that record.
I have three ADO components: dtbOut (my ADOConnection1), tableOut (my ADOTable) and dataOut (my ADODataSet). dataOut's command text is Select * From Book, if that helps.
I can get the whole process to work perfectly on an Access database, but with almost no experience with SQL I need help. I will add the code for the Access database in case it is needed for reference.
procedure TFOut.btnSearchClick(Sender: TObject);
begin
  dataout.Filter := 'WR = ''' + 'WR ' + edtwr.Text + '''';
  dataout.Filtered := True;
  dataout.First;
  lblcustomer.Caption := 'Customer: ' + dataout.FieldByName('Customer').AsString;
  lblcontact.Caption := 'Contact: ' + dataout.FieldByName('Contact').AsString;
  lblSN.Caption := 'SN: ' + dataout.FieldByName('SN').AsString;
  lblModel.Caption := 'Model: ' + dataout.FieldByName('Model').AsString;
  lblstatus.Caption := 'Status: ' + dataout.FieldByName('Status').AsString;
end;

procedure TFOut.btnUpdateClick(Sender: TObject);
begin
  dataout.Edit;
  dataout.FieldByName('Tech').AsString := edtTech.Text;
  dataout.FieldByName('WDone').AsString := mmoWDone.Lines.GetText;
  dataout.FieldByName('IN').AsString := edtIN.Text;
  dataout.Post;
end;
Do I need any additional components on my form to be able to do this in SQL? What do I need and how do I even start? I've read a lot of things and it seems like I will need an ADOQuery1, but when it comes to the ADOQuery1.SQL part I fall off the wagon. I have also tried it the Access way: I can search, but as soon as I try to update I get an "Insufficient key column information for updating or refreshing" error, which I also have no idea how to address.
If I need to state the question otherwise, please explain how to change to make it more clear and if I need to add anything in the whole explanation or code, please inform me of what.
SO isn't really the place for database tutorials, so I'm not going to attempt one but instead focus on one basic thing that it's crucial to understand and get right in your database design before you even begin to write a Delphi db app. (I'm going to talk about this in terms of Sql Server, not MS Access.)
You mentioned getting an error "Insufficient key column information for updating or refreshing" which you said you had no idea how to address.
A Delphi dataset (of any sort, not just an ADO one) operates by maintaining a logical cursor which points at exactly one row in the dataset. When you open a (non-empty) dataset, this cursor is pointing at the first row, and you can move the cursor around using various TDataSet methods such as Next & Prior, First, Last and MoveBy. Some, but not all, types of TDataSet implement the Locate method, which enables you to go to a row matching criteria you specify; other types do not. Delphi's ADO components do implement Locate (btw, Locate operates on rows you've already retrieved from the server; it's not for finding rows on the server).
One of the key ideas of Sql-oriented TDataSets such as TAdoQuery is that you can leave it to automatically generate Sql statements to do Updates, Deletes and Inserts. This is not only a significant productivity aid, but it avoids coding errors and omissions when you try to do it yourself.
If you observe ADO doing its stuff against an MS Sql Server table using SS's Profiler utility, then with a well-designed database, you'll find that it does this quite nicely and efficiently provided the database design follows one cardinal rule, namely that there must be a way to uniquely identify a particular row in a table. The most common way to do this is to include in each table, usually as the first column, an int(eger) ID column, and to define it as the "Primary key" of the table. Although there are other methods to generate a suitable ID value to go in this column, Sql Server has a specific column type, 'Identity' which takes care of this on the server.
Once a table has such a column, the ADO layer (which is a data-access layer provided by Windows that dataset components such as TAdoQuery sit upon) can automatically generate Sql statements to do Updates and Deletes, e.g.
Delete from Table1 where Table1ID = 999
and
Update Table1 set SomeCharField = 'SomeValue' where Table1ID = 666
and you can leave it to the AdoQuery to pick up the ID value for a newly-inserted row from the server.
One of the helpful aspects of leaving the Sql to be generated automatically is that it ensures that the Sql only affects a single row and so avoids affecting more rows than you intend.
Once you've got this key aspect of your database design correct, you'll find that Delphi's TDataSet descendants such as TAdoQuery and its DB-aware components can deal with most simple database applications without you having to write any Sql statements at all to update, insert or delete rows. Usually, however, you do still need to write Sql statements to retrieve the rows you want from the server by using a 'Where' clause to restrict the rows retrieved to a sub-set of the rows on the server.
Maybe your next step should be to read up on parameterized Sql queries, to reduce your exposure to "Sql Injection":
https://en.wikipedia.org/wiki/SQL_injection
as it's best to get into the habit of writing Sql queries using parameters. Btw, Sql Injection isn't just about Sql being intercepted and modified when it's sent over the internet: there are forms of injection where a malicious user who knows what they're doing can simply type in some extra Sql statements where the app "expects" them simply to specify some column value as a search criterion.

Store, retrieve and update a sequence number (datatype int) in a single row of a table in SQL Server 2008

How to store, retrieve and update a sequence number in a single row of a table with a schema like:
ID (int)
LookUp(varchar)
SeqNum(int) --business logic dictates the SeqNum is constrained to a particular range, say 1300 to 7600
To me this looks like the clicker-counter guy at a ballpark using a clicker to tick one off for each person that goes by. I want each person to have a unique number, I want multiple clicker-counter people to use the same clicker, and I don't want any missed values.
So far my approaches have either resulted in a deadlock condition, leaving the table inaccessible, or in me scratching my head wondering how to structure a stored procedure that calls a stored procedure with a transaction to lock the record, read it, update it, commit the transaction, and unlock the record.
In pseudo code I tried something like
From within a stored procedure:
Call getnum stored procedure
sproc getnum
begin trans
select current seqnum into a variable from Seqtbl where lookupval = 'nosebleed'
update Seqtbl.seqnum++ where lookupval = 'nosebleed'
end trans
I thought of adding a bool column bLock and having the getnum stored procedure check if the value = false, then update the lock (bLock = true), followed by a read, an update, and updating the lock again (bLock = false), without using a transaction. But I am not convinced that ill-conceived timing of multiple processes could not cause interference.
I do see others using identity columns to achieve similar solutions but it seems that these approaches require one table per LookUp (from the sample schema above) value.
Does anyone have suggestions, strategies used to solve similar problems, guidance, or links to send me to school on the important aspects of SQL Server needed to understand a solution to this scenario?
You should get rid of deadlocks if you use just a single statement:
declare #id int
update Seqtbl
set #id = seqnum, seqnum = seqnum + 1
where lookupval = 'nosebleed'
The bigger problem here is that you said there cannot be holes in the sequence. If your actual transaction can be rolled back, then you'll have to include the sequence fetching in the same transaction so it is rolled back as well, and that's probably going to cause you a lot of blocking, depending on how many calls there are.
If you're using SQL Server 2012 or newer, you should also look into the sequence object, but that's not going to solve the issue of missing values either.
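If gaps were tolerable, a sequence constrained to the business range might be declared like this (object names are illustrative):

```sql
CREATE SEQUENCE dbo.SeqNum
    AS INT
    START WITH 1300
    MINVALUE 1300
    MAXVALUE 7600
    NO CYCLE;   -- error out rather than silently reuse values

-- Each caller receives a distinct value:
SELECT NEXT VALUE FOR dbo.SeqNum;
```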
This is a bit long for a comment.
Why are you using a sequence for this? Your analogy to the click-counter "guy" would not suggest a sequence or identity value. Instead, it would suggest inserting each click with an identity column and/or a precise creation date. A query can then be used to assign a sequential value when you need it:
select t.*, row_number() over (order by id)
from table t;
You can then use arithmetic to get the value in the range that you want.

Move SELECT to SQL Server side

I have an SQLCLR trigger. It contains a large and messy SELECT inside, with parts like:
(CASE WHEN EXISTS(SELECT * FROM INSERTED I WHERE I.ID = R.ID)
THEN '1' ELSE '0' END) AS IsUpdated -- Is selected row just added?
as well as JOINs etc. I like to have the result as a single table with all included.
Question 1. Can I move this SELECT to SQL Server side? If yes, how to do this?
Saying "move", I mean to create a stored procedure or something else that can be executed before reading the dataset in a while loop.
The 2 following questions make sense only if answer is "yes".
Why do I want to move the SELECT? First, I don't like mixing SQL with C# code. Second, I suppose that server-side queries run faster, since the server has more chances to cache them.
Question 2. Am I right? Is it some sort of optimizing?
Also, the SELECT contains constant strings, but they are localizable. For instance,
WHERE R.Status = "Enabled"
"Enabled" should be changed for French, German, etc. So I want to write two static methods -- OnCreate and OnDestroy -- and mark them as stored procedures. When registering/unregistering my assembly on the server side, I just call them respectively. In OnCreate I format the SELECT string, replacing {0}, {1}... with the required values from the assembly resources. Then I can localize only the resources, not every script.
Question 3. Is it a good idea? Is there an existing attribute to mark methods to be executed by SQL Server automatically after (un)registration of an assembly?
Regards,
Well, the SQL-CLR trigger will also execute on the server, inside the server process - so that's server-side as well, no benefit there.
But I agree - triggers ought to be written in T-SQL whenever possible - no real big benefit in having triggers in C#... Can you show the whole trigger code? Unless it contains really odd-ball stuff, it should be pretty easy to convert to T-SQL.
I don't see how you could "move" the SELECT to the SQL side and keep the rest of the code in C# - either your trigger is in T-SQL (my preference), or then it is in C#/SQL-CLR - I don't think there's any way to "mix and match".
To start with, you probably do not need to do that type of subquery inside of whatever query you are doing. The INSERTED table only has rows that have been updated (or inserted but we can assume this is an UPDATE Trigger based on the comment in your code). So you can either INNER JOIN and you will only match rows in the Table with the alias of "R" or you can LEFT JOIN and you can tell which rows in R have been updated as the ones showing NULL for all columns were not updated.
Question 1) As marc_s said below, the Trigger executes in the context of the database. But it goes beyond that: ALL database-related code, including SQLCLR, executes in the database. There is no client-side here. This is the issue that most people have with SQLCLR: it runs inside of the SQL Server context. And regarding wanting to call a Stored Proc from the Trigger: it can be done, BUT the INSERTED and DELETED tables only exist within the context of the Trigger itself.
Question 2) It appears that this question should have started with the words "Also, the SELECT". There are two things to consider here. First, when testing for "Status" values (or any Lookup values) since this is not displayed to the user you should be using numeric values. A "status" of "Enabled" should be something like "1" so that the language is not relevant. A side benefit is that not only will storing Status values as numbers take up a lot less space, but they also compare much faster. Second is that any text that is to be displayed to the user that needs to be sensitive to language differences should be in a table so that you can pass in a LanguageId or LocaleId to get the appropriate French, German, etc. strings to display. You can set the LocaleId of the user or system in general in another table.
Question 3) If by "registration" you mean that the Assembly is either CREATED or DROPPED, then you can trap those events via DDL Triggers. You can look here for some basics:
http://msdn.microsoft.com/en-us/library/ms175941(v=SQL.90).aspx
But CREATE ASSEMBLY and DROP ASSEMBLY are events that are trappable.
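A minimal DDL trigger for those two events might look like this (the log table is hypothetical):

```sql
CREATE TRIGGER trgAssemblyDdlAudit
ON DATABASE
FOR CREATE_ASSEMBLY, DROP_ASSEMBLY
AS
BEGIN
    -- EVENTDATA() returns an XML description of the DDL event
    INSERT INTO dbo.AssemblyDdlLog (EventTime, EventXml)
    VALUES (GETDATE(), EVENTDATA());
END;
```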
If you are speaking of when Assemblies are loaded and unloaded from memory, then I do not know of a way to trap that.
Question 1.
http://www.sqlteam.com/article/stored-procedures-returning-data
Question 3.
It looks like there are no appropriate attributes, at least in Microsoft.SqlServer.Server Namespace.

SQL to search and replace in mySQL

I'm in the process of fixing a poorly imported database with issues caused by using the wrong database encoding, or something like that.
Anyways, coming back to my question: in order to fix these issues I'm using a query of this form:
UPDATE table_name SET field_name =
replace(field_name, 'search_text', 'replace_text');
And thus, if the table I'm working on has multiple columns, I have to call this query for each of the columns. And as there is not only one pair of things to run the find-and-replace on, I have to call the query for each of these pairs as well.
So as you can imagine, I end up running tens of queries just to fix one table.
What I was wondering is if there is a way to either combine multiple find-and-replaces in one query (say, look for this set of things and, if found, replace with the corresponding item from this other set), or to make a query of the form I've shown above run for each column of a table, regardless of their names or number.
Thank you in advance for your support,
titel
Let's try and tackle each of these separately:
If the set of replacements is the same for every column in every table that you need to do this on (or there are only a couple of patterns), consider creating a user-defined function that takes a varchar and returns a varchar, and that just calls replace(replace(@input,'search1','replace1'),'search2','replace2') nested as appropriate.
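In MySQL such a helper could be sketched as a stored function (the search/replace pairs are placeholders):

```sql
CREATE FUNCTION fix_text(input TEXT)
RETURNS TEXT DETERMINISTIC
RETURN REPLACE(REPLACE(input, 'search1', 'replace1'),
               'search2', 'replace2');
```

Then each column update becomes `SET field_name = fix_text(field_name)`.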
To update multiple columns at the same time you should be able to do UPDATE table_name SET field_name1 = replace(field_name1,...), field_name2 = replace(field_name2,...) or something similar.
As for running something like that for every column in every table, I'd think it would be easiest to write some code which fetches a list of columns and generates the queries to execute from that.
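One way to do that generation directly in MySQL is to build the statements from information_schema (the schema, table, and search/replace values are placeholders):

```sql
-- Emits one UPDATE statement per column of the target table;
-- run the generated statements afterwards.
SELECT CONCAT('UPDATE table_name SET `', COLUMN_NAME, '` = REPLACE(`',
              COLUMN_NAME, '`, ''search_text'', ''replace_text'');') AS stmt
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'your_db'
  AND TABLE_NAME = 'table_name';
```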
I don't know of a way to automatically run a search-and-replace on each column, however the problem of multiple pairs of search and replace terms in a single UPDATE query is easily solved by nesting calls to replace():
UPDATE table_name SET field_name =
    replace(
        replace(
            replace(
                field_name,
                'foo',
                'bar'
            ),
            'see',
            'what'
        ),
        'I',
        'mean?'
    )
If you have multiple replaces of different text in the same field, I recommend that you create a table with the current values and what you want them replaced with. (Could be a temp table of some kind if this is a one-time deal; if not, make it a permanent table.) Then join to that table and do the update.
Something like:
update t1
set field1 = t2.newvalue
from table1 t1
join mycrossreferencetable t2 on t1.field1 = t2.oldvalue
Sorry, I didn't notice this is MySQL; the code above is what I would use in SQL Server. The MySQL syntax may be different, but the technique would be similar.
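For what it's worth, the MySQL form of that join-update would be roughly the following (table and column names are illustrative, mirroring the example above):

```sql
UPDATE table1 t1
JOIN mycrossreferencetable t2
  ON t1.field1 = t2.oldvalue
SET t1.field1 = t2.newvalue;
```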
I wrote a stored procedure that does this. I use this on a per database level, although it would be easy to abstract it to operate globally across a server.
I would just paste this inline, but it would seem that I'm too dense to figure out how to use the markdown deal, so the code is here:
http://www.anovasolutions.com/content/mysql-search-and-replace-stored-procedure