Change History tracking and reporting - SQL

We're not allowed to use CDC.
We have a requirement to report changes made to a table, in the format:
On [This Date], user [UserName] changed the field [FieldName] from [OldValue] to [NewValue].
My idea is to use an UPDATE/INSERT trigger on the table (call it TableA) and write the row to a new TableA_Tracking table, which has the same columns as well as a foreign key to the source table.
TableA has a 'LastUpdatedByUserId' column as well as a 'LastUpdateDate' column.
Storing the data with the trigger is OK. However, I'm wondering if there is an efficient way to get the data back out so that I can report it to the application.
Is there a pattern I could follow for extracting the data into a table format and returning it to the UI for formatting?
I am thinking of something along the lines of:
WITH Track_CTE (
    Placement_TrackID,
    PlacementId,
    PlacementEventId,
    CarerId,
    FosterCareAllowanceFlag,
    InterstateAllowanceAmount,
    FosterCareAllowanceReason,
    FosterCareAllowanceDate,
    InterstateAllowanceFlag,
    LastUpdateUser,
    LastUpdateDate
)
AS
(
    SELECT
        Placement_TrackID,
        PlacementId,
        PlacementEventId,
        CarerId,
        FosterCareAllowanceFlag,
        InterstateAllowanceAmount,
        FosterCareAllowanceReason,
        FosterCareAllowanceDate,
        InterstateAllowanceFlag,
        LastUpdateUser,
        LastUpdateDate
    FROM [Placement_Track]
)
SELECT *
FROM Track_CTE c1
LEFT JOIN Track_CTE c2
    ON c2.Placement_TrackID = c1.Placement_TrackID - 1
where Placement_Track is a table that is a direct copy of the source table, except for the PK (the first column). The table is written to by a trigger on updates and inserts.
Each result row then holds the updated version and the previous version side by side... and from there, maybe work out the changes? But I may be way off track.
In the above example I'd filter on PlacementId, as that's the PK of the source table, so the selection would be more limited. Also, in this example, the only columns I am tracking are FosterCareAllowanceFlag, InterstateAllowanceAmount, FosterCareAllowanceReason, FosterCareAllowanceDate and InterstateAllowanceFlag.
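One caveat with joining on Placement_TrackID - 1: identity values aren't guaranteed to be contiguous, especially once you filter on PlacementId. A window function avoids that assumption; here is a minimal sketch (assuming SQL Server 2012+ for LAG, and the Placement_Track columns above) that pairs each audit row with the previous values for the same placement:
DECLARE @PlacementId INT = 42;  -- example value: the source row being reported on
SELECT
    t.PlacementId,
    t.LastUpdateUser,
    t.LastUpdateDate,
    t.FosterCareAllowanceFlag,
    LAG(t.FosterCareAllowanceFlag) OVER (PARTITION BY t.PlacementId
                                         ORDER BY t.Placement_TrackID) AS Prev_FosterCareAllowanceFlag,
    t.InterstateAllowanceAmount,
    LAG(t.InterstateAllowanceAmount) OVER (PARTITION BY t.PlacementId
                                           ORDER BY t.Placement_TrackID) AS Prev_InterstateAllowanceAmount
FROM [Placement_Track] t
WHERE t.PlacementId = @PlacementId;
Each column whose value differs from its Prev_ counterpart is a change the UI can render as "changed from X to Y"; the remaining tracked columns follow the same LAG pattern.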

Whenever a table row changes, the old data is available inside a trigger in the pseudo-table deleted, and the new data is available in the pseudo-table inserted.
So if table X (id int, col1 varchar(5), col2 varchar(100)) contains:
id = 1, col1 = 'Code', col2 = 'Descrition', and this data is updated via:
UPDATE X SET col2 = 'Description' WHERE id = 1;
then deleted contains id = 1, col1 = 'Code', col2 = 'Descrition'
and inserted contains id = 1, col1 = 'Code', col2 = 'Description'.
You can find the table's full column list via INFORMATION_SCHEMA.COLUMNS (provided the table name is available in a variable inside the trigger), and with that information it should be no issue to loop through the columns, comparing deleted with inserted to generate the audit data you require.
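As a concrete starting point, here is a minimal trigger sketch along the lines the question describes (TableA and TableA_Tracking are the question's names; the trigger name and exact column mapping are illustrative):
CREATE TRIGGER trg_TableA_Track
ON dbo.TableA
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- Copy the post-change version of every affected row into the tracking
    -- table. Selecting from the inserted pseudo-table handles multi-row
    -- statements, which row-at-a-time logic would miss.
    INSERT INTO dbo.TableA_Tracking
        (PlacementId, FosterCareAllowanceFlag, InterstateAllowanceAmount,
         FosterCareAllowanceReason, FosterCareAllowanceDate,
         InterstateAllowanceFlag, LastUpdateUser, LastUpdateDate)
    SELECT
        i.PlacementId, i.FosterCareAllowanceFlag, i.InterstateAllowanceAmount,
        i.FosterCareAllowanceReason, i.FosterCareAllowanceDate,
        i.InterstateAllowanceFlag, i.LastUpdatedByUserId, i.LastUpdateDate
    FROM inserted i;
END;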

Using Polybase and a stored procedure to update a dbo table from several external tables

I need some help with this.
I have 3 external tables:
create external table ext.titanic
(
    PassengerId INT,
    Pclass INT,
    Pname VARCHAR(100),
    Gender VARCHAR(20),
    Ticket VARCHAR(30),
    Cabin VARCHAR(30)
)
WITH (LOCATION = '/titanic.csv',
      DATA_SOURCE = blob1,
      FILE_FORMAT = TextFileFormat1
);
create external table ext.titanic2
(
    Pclass INT,
    Pname VARCHAR(100)
)
WITH (LOCATION = '/titanic2.csv',
      DATA_SOURCE = blob1,
      FILE_FORMAT = TextFileFormat1
);
create external table ext.titanic3
(
    PassengerId INT,
    Pname VARCHAR(100)
)
WITH (LOCATION = '/titanic3.csv',
      DATA_SOURCE = blob1,
      FILE_FORMAT = TextFileFormat1
);
and I have a dbo table created:
CREATE TABLE dbo.titanic
WITH
(
    DISTRIBUTION = ROUND_ROBIN
)
AS
SELECT
    titanic.PassengerId,
    titanic.Pclass,
    titanic.Pname,
    titanic.Gender,
    titanic.Ticket,
    titanic.Cabin,
    titanic3.PassengerId AS T3_PassengerId,
    titanic3.Pname AS T3_Pname,
    titanic2.Pclass AS T2_Pclass,
    titanic2.Pname AS T2_Pname
FROM ext.titanic
FULL JOIN ext.titanic2 ON titanic2.Pclass = titanic.Pclass
FULL JOIN ext.titanic3 ON titanic3.PassengerId = titanic.PassengerId;
I have to join them and update dbo.titanic with a stored procedure.
Do I need an additional external table to join them in, and then merge that with dbo.titanic? Or is there an easy and simple way to do this?
I also need more help with dbo.titanic and the joins:
there are more unique PassengerIds in titanic3 than in titanic, but I need all the PassengerIds from the two tables to end up in one column... same for Pclass from both tables... that is what is bugging me.
Just for reference: the titanic table has around 100,000 rows (800 unique PassengerIds), and titanic2 and titanic3 have 5,000 unique rows (total) for PassengerId and Pclass.
The final table must look like dbo.titanic but without T3_PassengerId and T2_Pclass, as they must be merged somehow into PassengerId and Pclass.
I lost a lot of time looking for something like this, but didn't find anything close enough.
This is the best I could find:
https://www.sqlservercentral.com/articles/access-external-data-from-azure-synapse-analytics-using-polybase
and I want to thank the guy who wrote it, but to use it I have 3 main issues:
it doesn't cover 3 external tables with different columns that need to be joined;
there is no update, so this can only be used when the tables are first created (as I understand it, UPDATE can't be used on external tables);
there is no stored procedure used for this update.
Can I use something like this?
INSERT INTO table1 (column1, column2, ...) SELECT column1, column2, ... FROM table2 WHERE condition (compare value in table1 <> value in table2)
Thanks in advance.
You don't need to create another external table; the way Polybase works, it loads all the external data into temp tables, from which it can be merged into dbo.titanic.
Use an outer join (LEFT/RIGHT, or FULL if you need the unmatched rows from both sides) when the tables don't share the same IDs but you need all of them.
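For collapsing PassengerId and Pclass into single columns, COALESCE across the joined sides is one way. A minimal sketch, using the question's table and column names (join keys assume titanic2 shares Pclass and titanic3 shares PassengerId):
-- Take the ID from whichever side is non-NULL, so every PassengerId
-- and Pclass from both tables lands in one column.
SELECT
    COALESCE(titanic.PassengerId, titanic3.PassengerId) AS PassengerId,
    COALESCE(titanic.Pclass, titanic2.Pclass) AS Pclass,
    titanic.Pname,
    titanic.Gender,
    titanic.Ticket,
    titanic.Cabin
FROM ext.titanic
FULL JOIN ext.titanic2 ON titanic2.Pclass = titanic.Pclass
FULL JOIN ext.titanic3 ON titanic3.PassengerId = titanic.PassengerId;
The same expressions can feed the CTAS or the staging load, so the final table carries only the merged PassengerId and Pclass columns.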
Use the following shape; from there it will be easy to create the SP:
;WITH [MyCTE] AS (SELECT ...) UPDATE dbo.titanic SET ...;
You can't update through Polybase itself; you will have to create a new file, e.g. titanic4.csv, which has the joined records.
Please update with your progress, so I can help you further.
I got this...
Maybe not the most elegant way, but it works: using a left join, with an additional stg.titanic table (same shape as dbo.titanic) that combines the 3 external tables, then merging the stg and dbo tables:
MERGE dbo.titanic AS [Target]
USING (SELECT
           column1, column2, column3,  -- placeholder column list
           UpdateTime
       FROM stg.titanic) AS [Source]
ON [Target].PassengerId = [Source].PassengerId
   AND [Target].Pclass = [Source].Pclass
   AND [Target].Pname = [Source].Pname  -- specifies the match condition
WHEN MATCHED THEN
    UPDATE SET [Target].UpdateTime = GETDATE()
WHEN NOT MATCHED THEN  -- when one of the 3 conditions is not met, insert a new row
    INSERT (column1, column2, column3,
            UpdateTime)
    VALUES ([Source].column1, [Source].column2, [Source].column3,
            [Source].UpdateTime);
If someone knows a better way, it would be good to share it with us.
Thanks.

Record should only be loaded to the target in a specific scenario

I have two tables, a stage table and a target table. I want my target table to hold valid CustomerScore values. Currently we insert into staging and then load to our target table. We do not want to load invalid values (-8.0000). However, if there is a CustomerNumber with a valid value in our target table, we would like to be able to decommission that number by giving it a CustomerScore of -8.0000. This should be the only time this value makes it into the target table: a record for that CustomerNumber has to already be in the target for this to update the record currently in the target table. My create statements are below.
CREATE TABLE stg.CustomerAppreciation (
    CustomerId INT IDENTITY(1, 1)
    ,CustomerNumber VARCHAR(50)
    ,CustomerScore DECIMAL(5, 4)
);
CREATE TABLE ods.CustomerAppreciation (
    CustomerId INT IDENTITY(1, 1)
    ,CustomerNumber VARCHAR(50)
    ,CustomerScore DECIMAL(5, 4)
);
Currently my target table has two records; each row below maps to the columns in the create statement:
1 123 0.8468
2 143 1.0342
Now say we want to decommission CustomerId = 2 because a record has been inserted into staging as:
3 143 -8.0000
The target table should now be updated for this CustomerNumber, making my target table look like:
1 123 0.8468
2 143 -8.0000
This should be the only time we allow -8.0000 into the table: when the CustomerNumber already exists. If a CustomerNumber does not exist in the target table and for some reason -8.0000 is seen in staging, it should not be allowed in. How would I write an update query that updates a record in my target table only when that scenario exists, and prevents -8.0000 from coming in when it does not?
Assuming the staging table only contains one row per customer number (if not, group it to take the highest CustomerId), you can use a merge to perform this function. Without checking exact syntax, something like this:
MERGE ods.CustomerAppreciation AS Target
USING (SELECT * FROM stg.CustomerAppreciation) AS Source
    ON Target.CustomerNumber = Source.CustomerNumber
WHEN MATCHED
    -- choose your match criteria here
    --AND Source.CustomerId > Target.CustomerId
    -- only update when something actually changed
    AND NOT EXISTS (SELECT Target.* INTERSECT SELECT Source.*)
THEN UPDATE
    SET Target.CustomerScore = Source.CustomerScore;
Because there is no WHEN NOT MATCHED clause, a -8.0000 row for an unknown CustomerNumber is simply ignored and never inserted.
Not sure if I fully understand the specifics, but here is some syntax that should at least get you started...
BEGIN
MERGE ods.CustomerAppreciation AS X
USING (SELECT CustomerNumber, CustomerScore FROM stg.CustomerAppreciation) AS Y (CustomerNumber, CustomerScore)
    ON (X.CustomerNumber = Y.CustomerNumber)
WHEN MATCHED /*AND Y.CustomerScore = -8.0000 -- restrict updates to decommissions*/ THEN
    UPDATE SET CustomerScore = Y.CustomerScore
WHEN NOT MATCHED BY TARGET /*AND Y.CustomerScore <> -8.0000 -- block -8.0000 for new customers*/ THEN
    INSERT (CustomerNumber, CustomerScore)
    VALUES (Y.CustomerNumber, Y.CustomerScore)
OUTPUT $action, inserted.* INTO #MyTempTable;  -- #MyTempTable must already exist
END;
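If MERGE feels heavy here, the same rules fall out of two plain statements; a minimal sketch, assuming one staging row per CustomerNumber:
-- Update existing customers; -8.0000 is allowed here because the
-- customer already exists in the target.
UPDATE tgt
SET tgt.CustomerScore = src.CustomerScore
FROM ods.CustomerAppreciation tgt
JOIN stg.CustomerAppreciation src
    ON src.CustomerNumber = tgt.CustomerNumber;
-- Insert only new customers with valid scores; a -8.0000 row for an
-- unknown customer fails the filter and never gets in.
INSERT INTO ods.CustomerAppreciation (CustomerNumber, CustomerScore)
SELECT src.CustomerNumber, src.CustomerScore
FROM stg.CustomerAppreciation src
WHERE src.CustomerScore <> -8.0000
  AND NOT EXISTS (SELECT 1
                  FROM ods.CustomerAppreciation tgt
                  WHERE tgt.CustomerNumber = src.CustomerNumber);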

Dynamically Updating Columns with new Data

I am handling a SQL table with over 10K+ rows; essentially it controls updating the status of a production station over the day. Currently the SQL server reports a new message at the current timestamp, so a new entry can be generated for the same part hundreds of times a day while only the "Production_Status" and "TimeStamp" columns change. I want to create a new table that selects unique part names and then has two other columns that bring up the LATEST entry for THAT part.
I have currently selected the data and reordered it so the latest timestamp is first on the list. I am now trying to build this dynamic table, but I am new to SQL.
select dateTimeStamp,partNumber,lineStatus
from tblPLCData
where lineStatus like '_ Zone %' or lineStatus = 'Production'
order by dateTimeStamp desc;
The expected result is a NewTable whose row count is based on how many parts are in our total production facility (this column will be static), plus two other columns that check the original table for the latest status and timestamp and update those two columns in the NewTable.
I don't need help with the table creation so much as the logic that surrounds updating rows based on another table.
Much appreciated.
It looks like you could take advantage of a derived-table join that finds the MAX lineStatusdate for each partNumber, then joins back to the base table so that you can get the lineStatus value corresponding to the record with the max date. I just have you inserting into/updating a temp table, but this is the general approach you could take.
-- New table that might already exist in your db; I am creating a temp one here
create table #NewTable (
    partNumber int,
    lineStatus varchar(max),
    lineStatusdate datetime
)
-- To initially set up your table, or to add part numbers later that were not added before
insert into #NewTable
select tpd.partNumber, tpd.lineStatus, tpd.lineStatusdate
from tblPLCData tpd
join (
    select partNumber, MAX(lineStatusdate) lineStatusDateMax
    from tblPLCData
    group by partNumber
) maxStatusDate on tpd.partNumber = maxStatusDate.partNumber
    and tpd.lineStatusdate = maxStatusDate.lineStatusDateMax
left join #NewTable nt on tpd.partNumber = nt.partNumber
where (tpd.lineStatus like '_ Zone %' or tpd.lineStatus = 'Production')
    and nt.partNumber is null
-- To update your table whenever you deem it necessary to refresh it. I try to avoid triggers in my dbs
update nt
set nt.lineStatus = tpd.lineStatus, nt.lineStatusdate = tpd.lineStatusdate
from tblPLCData tpd
join (
    select partNumber, MAX(lineStatusdate) lineStatusDateMax
    from tblPLCData
    group by partNumber
) maxStatusDate on tpd.partNumber = maxStatusDate.partNumber
    and tpd.lineStatusdate = maxStatusDate.lineStatusDateMax
join #NewTable nt on tpd.partNumber = nt.partNumber
where tpd.lineStatus like '_ Zone %' or tpd.lineStatus = 'Production'
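An alternative that avoids the MAX/self-join above is ROW_NUMBER(); a short sketch, assuming the column names from the question's original query (dateTimeStamp, partNumber, lineStatus):
-- Number each part's rows newest-first, then keep only the newest.
;WITH latest AS (
    SELECT partNumber, lineStatus, dateTimeStamp,
           ROW_NUMBER() OVER (PARTITION BY partNumber
                              ORDER BY dateTimeStamp DESC) AS rn
    FROM tblPLCData
    WHERE lineStatus LIKE '_ Zone %' OR lineStatus = 'Production'
)
SELECT partNumber, lineStatus, dateTimeStamp
FROM latest
WHERE rn = 1;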

SQL Server, updating item quantities of new items that are replacing old items

I have a CSV with two columns OldItem and NewItem; each column holds a list of integers. Note - the CSV will hold around 1,000 rows.
OldItem | NewItem
-----------------
1021669 | 1167467
1021680 | 1167468
1021712 | 1167466
1049043 | 1000062
We have old items in the system that are being replaced by the new items, and we would like to capture the current quantity of the first OldItem and assign it to the first NewItem, the quantity of the second OldItem to the second NewItem, etc.
The other fun part of the issue is that the item numbers in the spreadsheet don't match the item numbers associated with the quantities; there's a translation table in the system called Alias.
Here are the tables and columns we're interacting with:
table Alias (essentially a translation table)
column Alias (the numbers in the spreadsheet)
column ItemID (the numbers in table "Items" that hold the quantities)
table Items (this holds all the items, new and old)
column ItemID
column Quantity
The only way I can think of doing this is a foreach on every OldItem, pseudo-code incoming:
foreach OldItem (SELECT Alias.ItemID WHERE Alias.Alias = OldItem)
then somehow (I don't know how to return and use that result in SQL):
SELECT Item.Quantity WHERE Item.ItemID = Alias.ItemID
At this point I have the quantity that I want; now I have to reference back to the CSV, find the NewItem associated with the OldItem, do this all over again with the NewItem, and then update the NewItem quantity to the one I found for the OldItem.
-dizzy-
Please help. I could solve this problem by wrapping SQL in PowerShell to handle the logical bits, but that has severe performance consequences, and I have to do this on MANY databases remotely over very bad network connections!
Given that you have connectivity issues, I suggest the following:
Create a working table in your database
Import your CSV into the working table
Run a script that copies aliases and quantities into the working table. Not required but helps with auditing
Run a script that validates the data
Run a script that copies required data into Items
It's important to note that this assumes that old items are unique and only ever map to one new item. There is a check for that in the 'Pre update check' section.
Create a working table
Open SQL Server Management Studio and run this script in your database (choose it in the dropdown)
-- Create a schema to hold working tables that aren't required by the application
CREATE SCHEMA adm;
GO  -- CREATE SCHEMA must run in its own batch
-- Now create a table in this schema
IF EXISTS (SELECT * FROM sys.objects WHERE name = 'ItemTransfer'
           AND type = 'U'
           AND schema_id = SCHEMA_ID('adm'))
    DROP TABLE adm.ItemTransfer;
CREATE TABLE adm.ItemTransfer (
    OldItem INT NOT NULL,
    NewItem INT NOT NULL,
    OldAlias VARCHAR(50) NULL,
    NewAlias VARCHAR(50) NULL,
    OldQuantity NUMERIC(19,2) NULL
);
Import the CSV data
There are a number of ways to do this. Your constraint is your unreliable network, and how comfortable you are troubleshooting unfamiliar tools. Here is one method that can be rerun without causing duplicates:
Open your CSV in Excel and paste this monstrosity into column 3, row 2 (cell C2):
="INSERT INTO adm.ItemTransfer (OldItem, NewItem) SELECT " & A2 & "," & B2 & " WHERE NOT EXISTS (SELECT * FROM adm.ItemTransfer WHERE OldItem=" & A2 & " AND NewItem=" & B2 & ");"
This generates an insert statement for that row. Drag it down to generate all the insert statements. There will be a bunch of lines that look something like this:
INSERT INTO adm.ItemTransfer (OldItem, NewItem) SELECT 1,2 WHERE NOT EXISTS (SELECT * FROM adm.ItemTransfer WHERE OldItem=1 AND NewItem=2);
Copy/paste this string of inserts into SQL Server Management Studio and run it. It should insert all of the data into your working table.
I also suggest that you save this to a .SQL file. The insert statement only inserts if the record isn't already there, so it can be rerun.
Note: there are many ways to import data into SQL Server. The next easiest way is to right click on the database / Tasks / Import Flat File, but it's more complicated to prevent duplicates / restart the import.
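If the CSV is reachable from the server itself, BULK INSERT is another option; a sketch assuming a hypothetical file path and a header row (the raw file goes into a matching two-column table first, since BULK INSERT expects the file and target column counts to line up):
-- #RawCsv mirrors the CSV's two columns; 'C:\temp\items.csv' is a
-- hypothetical path on the SQL Server machine.
CREATE TABLE #RawCsv (OldItem INT NOT NULL, NewItem INT NOT NULL);
BULK INSERT #RawCsv
FROM 'C:\temp\items.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);
-- Reuse the same duplicate guard as the generated inserts, so this stays rerunnable.
INSERT INTO adm.ItemTransfer (OldItem, NewItem)
SELECT r.OldItem, r.NewItem
FROM #RawCsv r
WHERE NOT EXISTS (SELECT * FROM adm.ItemTransfer t
                  WHERE t.OldItem = r.OldItem AND t.NewItem = r.NewItem);
DROP TABLE #RawCsv;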
Now you can run SELECT * FROM adm.ItemTransfer and you should see all of your records
Map Alias and Qty
This step could actually be done on the fly, but writing the values into the working table lets us audit afterwards.
These two scripts translate the spreadsheet numbers into ItemIDs (in the Alias table, Alias is the spreadsheet number and ItemID points into Items; here the OldAlias/NewAlias columns hold the translated ItemIDs):
UPDATE TGT
SET OldAlias = SRC.ItemID
FROM adm.ItemTransfer TGT
INNER JOIN Alias SRC
    ON TGT.OldItem = SRC.Alias;
UPDATE TGT
SET NewAlias = SRC.ItemID
FROM adm.ItemTransfer TGT
INNER JOIN Alias SRC
    ON TGT.NewItem = SRC.Alias;
This one copies in the old item quantity:
UPDATE TGT
SET OldQuantity = SRC.Quantity
FROM adm.ItemTransfer TGT
INNER JOIN Items SRC
    ON TGT.OldAlias = SRC.ItemID;
After these steps, again run the select statement to inspect.
Pre update check
Before you actually do the update, you should check data consistency.
Count of records in the staging table:
SELECT
    COUNT(*) AS TableCount,
    COUNT(DISTINCT OldAlias) AS UniqueOldAlias,
    COUNT(DISTINCT NewAlias) AS UniqueNewAlias
FROM adm.ItemTransfer;
The numbers should all be the same and should match the CSV record count. If not, you have a problem: you are missing records or you are not mapping one to one.
This select shows you old items with no match in the Alias table:
SELECT * FROM adm.ItemTransfer WHERE OldAlias IS NULL
This select shows you new items with no match in the Alias table:
SELECT * FROM adm.ItemTransfer WHERE NewAlias IS NULL
This select shows you old items whose translated ItemID is missing from the Items table:
SELECT *
FROM adm.ItemTransfer T
WHERE NOT EXISTS (
    SELECT * FROM Items I WHERE I.ItemID = T.OldAlias)
This select shows you new items whose translated ItemID is missing from the Items table:
SELECT *
FROM adm.ItemTransfer T
WHERE NOT EXISTS (
    SELECT * FROM Items I WHERE I.ItemID = T.NewAlias)
Backup the table and do the update
First back up the Items table inside the database like this:
SELECT *
INTO adm.Items_<dateandtime>
FROM Items;
This makes a copy of the Items table before you update it. You can delete the copy later if you like.
The actual update is pretty simple because we worked it all out in the working table beforehand:
UPDATE TGT
SET Quantity = SRC.OldQuantity
FROM Items TGT
INNER JOIN adm.ItemTransfer SRC
    ON SRC.NewAlias = TGT.ItemID;
Summary
All of this can be bundled up into a script and automated if required. As is, you should save all working files to a SQL file, as well as the outputs from the SELECT test statements.

Forming a record of a table from separated values?

*sorry, I couldn't find a better title for the question
I'm implementing an audit feature, currently using the audit tables provided by EntityFramework-Plus.
There are two tables. One tracks the modification type and the affected entity (every record represents an update by the user):
[AuditEntryID]
,[EntitySetName]
,[EntityTypeName]
,[State]
,[StateName]
,[CreatedBy]
,[CreatedDate]
The other table holds the corresponding changes for each modification; the most important columns are OldValue and NewValue for a specific PropertyName:
[AuditEntryPropertyID]
,[AuditEntryID]
,[RelationName]
,[PropertyName]
,[OldValue]
,[NewValue]
So to select all edits done on a specific table and for a specific Id, I run this query:
SELECT * FROM dbo.[AuditEntryProperties]
WHERE AuditEntryID IN (
SELECT AuditEntryID
FROM dbo.[AuditEntryProperties]
WHERE NewValue = '8b5f8272-8663-451d-8bf8-45d7d5db1529' AND PropertyName = 'CountryId')
AND AuditEntryID IN (
SELECT AuditEntryID
FROM dbo.[AuditEntries]
WHERE EntitySetName = 'TbCountries' )
Since the PK is inserted each time, whether it's an edit or a delete, the above query selects the entire history of the given Id.
What I need is to implement a feature where the user can go back to a specific record at a specific time.
I drew this in Paint; it depicts the idea:
The first row is the insert state; in the second row there are edits on col1 and col2; the third row is an edit on col1 and col3, etc.
col1 is a primary key, so it's inserted every time (its value doesn't change!).
Now the most recent record is mod6, and I need to go back to mod3, so I'll take the col1 value from mod3, the col2 value from mod2, col3 from mod3, and col4 from mod1.
My problem is how to form a full record of type TbCountries from the [AuditEntryProperties] table as of a specific AuditEntryID.
You can get the list of properties and values using row_number():
-- @date and @tbl are the point in time and the entity set you want to rebuild
select aep.*
from (select ae.EntitySetName, aep.propertyname, aep.newvalue,
             row_number() over (partition by ae.EntitySetName, aep.propertyname
                                order by ae.createddate desc
                               ) as seqnum
      from dbo.AuditEntries ae join
           dbo.AuditEntryProperties aep
           on ae.AuditEntryID = aep.AuditEntryID
      where ae.createddate <= @date and ae.EntitySetName = @tbl
     ) aep
where seqnum = 1;
This gives the values as one-per-row. You can then pivot or use conditional aggregation if you want them on a single row.
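For the single-row step, conditional aggregation is often the simplest; a hedged sketch assuming the TbCountries property names are known up front (CountryId and Name here are only illustrative):
-- Pivot the latest value of each property into one reconstructed row.
;WITH LatestValues AS (
    SELECT aep.propertyname, aep.newvalue,
           ROW_NUMBER() OVER (PARTITION BY aep.propertyname
                              ORDER BY ae.createddate DESC) AS seqnum
    FROM dbo.AuditEntries ae
    JOIN dbo.AuditEntryProperties aep
        ON ae.AuditEntryID = aep.AuditEntryID
    WHERE ae.createddate <= @date
      AND ae.EntitySetName = @tbl
)
SELECT
    MAX(CASE WHEN propertyname = 'CountryId' THEN newvalue END) AS CountryId,
    MAX(CASE WHEN propertyname = 'Name'      THEN newvalue END) AS Name
FROM LatestValues
WHERE seqnum = 1;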