Looking for an alternative method to this query - sql

There are two tables, a Component table and a Log table. The Component table holds the actual (current) value description and a timestamp of when it was last updated.
The Log table contains a component ID that references the component each entry belongs to:
Component:
Id
Actual
LastUpdated
Log:
Id
ComponentId
Value
Timestamp
The query that used to work, but currently locks the table, looks like this:
update Component set Actual =
(select top 1 Value from Log
where Component.Id = ComponentId
order by Id desc),
LastUpdated =
(select top 1 TimeStamp from Log
where Component.Id = ComponentId
order by Id desc)
Both the Log and Component tables are growing, and this query can't keep up the way it used to. There are around 80 components now and a couple of million records.
Is it possible to keep working this way and just improve the query, or is the entire approach wrong?
PS: the devices that send the data don't have a reliable system time, so having them update the Component table leads to inconsistency. When inserting a log row I take the system time on the SQL Server (default value).
EDIT:
Taking a suggestion from the answers, I'm trying to create a trigger on Log to automatically update Component when a log row is created.
CREATE TRIGGER trg_log_ins
ON Log
AFTER INSERT
AS
BEGIN
update Component
SET Actual = (SELECT i.value FROM inserted as i),
LastUpdated = (SELECT i.Timestamp FROM inserted as i);
END;
But for some reason the query doesn't finish and keeps executing.

I think you're going about this all wrong. A better solution would be a trigger on the Component table, that inserts into the Log table whenever a Component is inserted or updated.
CREATE TRIGGER trg_component_biu
ON Component
AFTER INSERT, UPDATE
AS
BEGIN
INSERT INTO Log(
ComponentId,
Value,
Timestamp
)
SELECT
Id,
Actual,
LastUpdated
FROM inserted;
END;

You can do it by using ROW_NUMBER() like this:
UPDATE t1
SET t1.Actual = t2.value,
t1.LastUpdated = t2.TimeStamp
FROM Component t1
INNER JOIN (SELECT log.*,ROW_NUMBER() OVER (PARTITION BY log.componentID order by log.ID DESC) as rnk
FROM log) t2
ON(t2.componentID = t1.id and t2.rnk = 1)

Based on the TOP 1 in your query I guess you are using SQL Server. In SQL Server you can use OUTER APPLY:
UPDATE c
SET c.Actual = cs.Value,
c.LastUpdated = cs.TimeStamp
FROM Component c
OUTER APPLY (SELECT TOP 1 l.Value,
l.TimeStamp
FROM Log l
WHERE c.Id = l.ComponentId
ORDER BY l.Id DESC) cs
Adding a non-clustered index on the Log table keyed on ComponentId and Id, with Value and TimeStamp as included columns, should improve query performance, because each TOP 1 lookup can then be answered by an index seek.
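As a rough sketch, one possible index shape for the per-component TOP 1 lookup is shown here. It puts ComponentId first so that the latest row of a single component can be found with a seek. The index name IX_Log_ComponentId_Id and the dbo schema are assumptions for illustration; the column names come from the question.

```sql
-- Hypothetical index name; assumes the Log table lives in the dbo schema.
-- Key order (ComponentId, Id DESC) matches "WHERE ComponentId = ? ORDER BY Id DESC".
CREATE NONCLUSTERED INDEX IX_Log_ComponentId_Id
    ON dbo.Log (ComponentId, Id DESC)
    INCLUDE (Value, [Timestamp]);   -- covers the two columns the UPDATE reads
```

Whether the extra index pays off depends on the insert rate into Log, since every insert has to maintain it as well.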
Another way is using ROW_NUMBER and LEFT OUTER JOIN
UPDATE c
SET c.Actual = cs.Value,
c.LastUpdated = cs.TimeStamp
FROM Component C
LEFT OUTER JOIN (SELECT Row_number()OVER(partition BY ComponentId
ORDER BY id DESC) rn,*
FROM Log) cs
ON cs.ComponentId = c.id
AND cs.rn = 1

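All data in your Component table comes from the Log table. Instead of making Component an actual table, you can make it a view. Note that a view like this cannot be indexed, because SQL Server indexed views do not allow window functions, so every read computes the latest row per component from Log.
CREATE VIEW Component
AS
SELECT
Id,
Actual,
LastUpdated
FROM (
SELECT
l.ComponentId AS Id,
l.Value AS Actual,
l.Timestamp AS LastUpdated,
ROW_NUMBER() OVER (PARTITION BY l.ComponentId
ORDER BY l.Timestamp DESC, l.Id DESC) AS rn
FROM Log AS l
) AS latest
WHERE rn = 1;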

If you are going to use a trigger on the Log table, it has to work even if several rows are inserted. Here is one possible variant.
Also, this variant would not capture values for a new ComponentID that doesn't exist in the Component table yet.
If there is a possibility that such values would be inserted into the Log table, I'd use MERGE instead of simple UPDATE.
CREATE TRIGGER trg_log_ins
ON Log
AFTER INSERT
AS
BEGIN
WITH
CTE
AS
(
SELECT
Component.Actual AS OldValue
,Component.LastUpdated AS OldTimestamp
,inserted.Value AS NewValue
,inserted.Timestamp AS NewTimestamp
FROM
Component
INNER JOIN inserted ON inserted.ComponentID = Component.ID
)
UPDATE CTE
SET
OldValue = NewValue,
OldTimestamp = NewTimestamp
;
END
Also, if it is possible to insert into Log several rows with the same ComponentID in the same INSERT statement, you'd better choose explicitly which value to use for update. Likely, the one with the latest Timestamp.
So, the query becomes more complicated:
CREATE TRIGGER trg_log_ins
ON Log
AFTER INSERT
AS
BEGIN
WITH
CTE_InsertedRowNumbers
AS
(
SELECT
inserted.ComponentID
,inserted.Value AS NewValue
,inserted.Timestamp AS NewTimestamp
,ROW_NUMBER() OVER (
PARTITION BY inserted.ComponentID
ORDER BY inserted.Timestamp DESC, inserted.ID DESC) AS rn
FROM inserted
)
,CTE_LatestInsertedComponents
AS
(
SELECT
ComponentID
,NewValue
,NewTimestamp
FROM CTE_InsertedRowNumbers
WHERE rn = 1
)
,CTE
AS
(
SELECT
Component.Actual AS OldValue
,Component.LastUpdated AS OldTimestamp
,CTE_LatestInsertedComponents.NewValue
,CTE_LatestInsertedComponents.NewTimestamp
FROM
Component
INNER JOIN CTE_LatestInsertedComponents
ON CTE_LatestInsertedComponents.ComponentID = Component.ID
)
UPDATE CTE
SET
OldValue = NewValue,
OldTimestamp = NewTimestamp
;
END
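The MERGE variant mentioned in this answer might look roughly like the sketch below. It keeps the "latest row per ComponentID" logic and additionally creates a Component row when none exists yet. The trigger name trg_log_ins_merge and the dbo schema are made up for illustration, and the sketch assumes Component.Id is not an IDENTITY column (otherwise the INSERT branch would need SET IDENTITY_INSERT).

```sql
-- Hypothetical trigger name; assumes Component.Id is a plain (non-identity) key.
CREATE TRIGGER trg_log_ins_merge
ON dbo.Log
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    MERGE dbo.Component AS target
    USING (
        -- Keep only the newest inserted row per component.
        SELECT ComponentId, Value, [Timestamp]
        FROM (
            SELECT i.ComponentId, i.Value, i.[Timestamp],
                   ROW_NUMBER() OVER (PARTITION BY i.ComponentId
                                      ORDER BY i.[Timestamp] DESC, i.Id DESC) AS rn
            FROM inserted AS i
        ) AS ranked
        WHERE rn = 1
    ) AS source
    ON target.Id = source.ComponentId
    WHEN MATCHED THEN
        UPDATE SET Actual = source.Value,
                   LastUpdated = source.[Timestamp]
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (Id, Actual, LastUpdated)
        VALUES (source.ComponentId, source.Value, source.[Timestamp]);
END;
```

Because the source is reduced to one row per component first, MERGE never tries to update the same Component row twice in one statement, which it would otherwise reject with an error.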

Related

Copying the rows within the same table

I need to move what's been appended at the end of my table to its very beginning; however, the same record is being copied into every destination row.
In other words, between the ids 1 and 3567 I only have the record from id 3567 repeated until the end. I believe that my outer or even inner sub-query lacks something?
Thanks for the hint
Query:
UPDATE dbo.TABLE
SET Xwgs = dt.Xwgs, Ywgs = dt.Ywgs
FROM
(
SELECT
Xwgs,
Ywgs
FROM dbo.TABLE
WHERE
Id BETWEEN 3567 AND 7243
) dt
WHERE
Id BETWEEN 1 AND 3566
Is this what you want?
update t
set xwgs = dt.xwgs, ywgs = dt.ywgs
from mytable t
inner join (
select id, xwgs, ywgs
from mytable
where id between 3567 and 7243
) dt
on t.id = dt.id - 3566
The main difference with your query is that it properly correlates the target table and the derived table.
Note that this does not actually move the rows; all it does is copy the values from the upper bucket to the corresponding value in the lower bucket.
You know that you can always sort your table with ORDER BY id DESC, right?
Sometimes you do need to do something strange. I do it like this:
Copy the whole table into a temp table (it may be a #temporary table)
Drop, truncate, or delete the records from the original table
Insert those records again from the temp table
Drop the temp table
But an UPDATE is also a solution.
Tip: You can allow inserting explicit values into an identity (auto-increment) id column with SET IDENTITY_INSERT.
SELECT *
INTO tmp__MyTable -- this will create a new table
FROM MyTable
ORDER BY id
DELETE FROM dbo.MyTable -- will throw an error on foreign keys conflicts
INSERT INTO MyTable (col,col2) -- column list here
SELECT col,col2
FROM tmp__MyTable
ORDER BY id DESC
-- or something like that:
-- ORDER BY CASE WHEN id <= 3566 THEN -id ELSE id END
-- DROP TABLE tmp__MyTable
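The IDENTITY_INSERT tip could be combined with the re-insert step roughly as follows. This is only a sketch: it assumes id is an identity column, that col and col2 stand in for the real data columns as in the snippet above, that tmp__MyTable already holds a full copy, and that MyTable has been emptied. The boundary 3567 is taken from the question.

```sql
-- Allow explicit values in the identity column for this session.
SET IDENTITY_INSERT dbo.MyTable ON;

INSERT INTO dbo.MyTable (id, col, col2)   -- an explicit column list is required here
SELECT
    -- Renumber so that the appended rows (id >= 3567) come first,
    -- each group keeping its original relative order.
    ROW_NUMBER() OVER (ORDER BY CASE WHEN id >= 3567 THEN 0 ELSE 1 END, id),
    col,
    col2
FROM tmp__MyTable;

SET IDENTITY_INSERT dbo.MyTable OFF;
```

Only one table per session can have IDENTITY_INSERT switched on at a time, so it is worth turning it off again right after the insert.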

How to use a special while loop in tsql, do while numeric

I'm loading some quite nasty data through Azure data factory
This is how the data looks after being loaded, consisting of 2 parts:
1. Metadata of a test
2. Actual measurements of the test -> the measurement is numeric
Imagine I have about 10 such 'packages' of 1. Metadata + 2. Measurements
What I would like it to be / what I'm looking for is the following:
The number column with 1,2,.... is what I'm looking for!
Imagine my screenshot could go no further but this goes along until id=10
I guess a while loop is necessary here...
Query before:
SELECT Field1 FROM Input
Query after:
SELECT GeneratedId, Field1 FROM Input
Thanks a lot in advance!
EDIT: added a hint:
Here is a solution; it requires SQL Server 2012 or later.
Start by getting an Id column on your data. If you can do this before the script runs, that would be even better, but if not, try something like this:
CREATE TABLE #InputTable (
Id INT IDENTITY(1, 1),
TestData NVARCHAR(MAX) )
INSERT INTO #InputTable (TestData)
SELECT Field1 FROM Input
Now create a query to get the GeneratedId of each package as well as the Id where they start and end. You can do this by getting all the records LIKE 'title%' since that is the first record of each package, then using ROW_NUMBER, Id, and LEAD for the GeneratedId, StartId, and EndId respectively.
SELECT
GeneratedId = ROW_NUMBER() OVER(ORDER BY (Id)),
StartId = Id,
EndId = LEAD(Id) OVER (ORDER BY (Id))
FROM #InputTable
WHERE TestData LIKE 'title%'
Lastly, join this to the input in order to get all the records, with the correct GeneratedId.
SELECT
package.GeneratedId, i.TestData
FROM (
SELECT
GeneratedId = ROW_NUMBER() OVER(ORDER BY (Id)),
StartId = Id,
EndId = LEAD(Id) OVER (ORDER BY (Id))
FROM #InputTable
WHERE TestData LIKE 'title%' ) package
INNER JOIN #InputTable i
ON i.Id >= package.StartId
AND (package.EndId IS NULL OR i.Id < package.EndId)
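A shorter alternative, sketched under the same assumptions (the #InputTable from above, and every package starting with a row LIKE 'title%'), is a running count of title rows. Each row then gets the number of package headers seen so far, which is the package number it belongs to.

```sql
SELECT
    -- Running total of "title" rows up to and including the current row.
    GeneratedId = SUM(CASE WHEN TestData LIKE 'title%' THEN 1 ELSE 0 END)
                  OVER (ORDER BY Id ROWS UNBOUNDED PRECEDING),
    TestData
FROM #InputTable
ORDER BY Id;
```

Rows that appear before the first title row would get GeneratedId 0 here, whereas the join-based query above leaves them out entirely.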

SQL INSERT missing rows from Table A to Table B

I'm trying to insert rows into table 'Data' if they don't already exist.
For each row in Export$, I need the code to check 'Data' for rows that match both Period (date) and an ID (int) - if the rows don't already exist then they should be created.
I'm pretty sure my 'NOT EXISTS' part is wrong - what's the best way to do this? Thanks for all your help
IF NOT EXISTS (SELECT * FROM Data, Export$ WHERE Data.ID = Export$.ID AND Data.Period = Export$.Period)
INSERT INTO Data (Period, Performance, ID)
SELECT Period, [Return], [ID] FROM Export$
Try something like this; it will need tweaking to fit your tables:
insert into Data (Period, Performance, ID)
select e.Period, e.[Return], e.ID
from Export$ e
left join Data d on d.ID = e.ID
and d.Period = e.Period
where d.ID is null
try this:
INSERT INTO Data (Period, Performance, ID)
SELECT Period, [Return], [ID]
FROM Export$ e
where not exists (
select *
from Data
where ID = e.ID and Period = e.Period)

sql server: How to detect changed rows

I want to create a trigger to detect whether a row has been changed in SQL Server. My current approach is to loop through each field, apply COLUMNS_UPDATED() to detect whether UPDATE has been called, then finally compare the values of this field for the same row (identified by PK) in inserted vs deleted.
I want to eliminate the looping from the procedure. Probably I can dump the content of inserted and deleted into one table, group on all columns, and pick up the rows with count=2. Those rows will count as unchanged.
The end goal is to create an audit trail:
1) Track user and timestamp
2) Track insert, delete and REAL changes
Any suggestion is appreciated.
Instead of looping you can use BINARY_CHECKSUM to compare entire rows between the inserted and deleted tables, and then act accordingly.
Example
Create table SomeTable(id int, value varchar(100))
Create table SomeAudit(id int, Oldvalue varchar(100), NewValue varchar(100))
Create trigger tr_SomTrigger on SomeTable for Update
as
begin
insert into SomeAudit
(Id, OldValue, NewValue)
select i.Id, d.Value, i.Value
from
(
Select Id, Value, Binary_CheckSum(*) Version from Inserted
) i
inner join
(
Select Id, Value, Binary_CheckSum(*) Version from Deleted
) d
on i.Id = d.Id and i.Version <> d.Version
End
Insert into sometable values (1, 'this')
Update SomeTable set Value = 'That'
Select * from SomeAudit
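BINARY_CHECKSUM can return the same value for two different rows, so a real change may occasionally go undetected. A possible alternative for the trigger body, sketched against the same SomeTable and SomeAudit tables, compares the old and new column values directly with EXCEPT. The explicit column list is an assumption that has to be extended to whatever columns the audited table really has.

```sql
-- Possible replacement for the INSERT statement inside the trigger above.
INSERT INTO SomeAudit (Id, OldValue, NewValue)
SELECT i.Id, d.Value, i.Value
FROM inserted AS i
INNER JOIN deleted AS d
    ON i.Id = d.Id
WHERE EXISTS (
    -- EXCEPT yields a row only when the two column lists differ;
    -- unlike <>, it treats two NULLs as equal.
    SELECT i.Id, i.Value
    EXCEPT
    SELECT d.Id, d.Value
);
```

This keeps the set-based shape of the checksum approach, so no looping over columns is needed.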

Can I select the data of a given row and column while executing a sql statement

To clarify the title: in the WHERE clause of a SELECT statement, I need to check the table I am selecting from using another SELECT. In that second SELECT, I have to find all the rows that share the same secondary ID. Here is what I have worked out so far:
Declare #id INT
--inserting values in temp table
SELECT
rn = ROW_NUMBER() OVER (ORDER BY adt_trl_dt_tm),
*
INTO #Temp
FROM dbo.EVNT_HSTRY
ORDER BY adt_trl_dt_tm DESC
--Searching for items that are deleted and have not been restored
SELECT *
FROM dbo.EVNT_HSTRY hstry
WHERE evnt_hstry_cd LIKE '3' and
adt_trl_dt_tm > (SELECT adt_trl_dt_tm FROM dbo.EVNT_HSTRY WHERE evnt_id = evnt_id
DROP TABLE #Temp
To clarify the code, evnt_id is a foreign key. The primary key is evnt_Hstry_id. The evnt_hstry_cd 3 means deleted. What I am trying to do is to see if the field adt_trl_dt_tm (latest date modified) of the row being read is the latest by comparing it with all the adt_trl_dt_tm fields that have the same evnt_id.
The table I am doing the select on is the table where we store the history of the events. It is where we say when the event has been added, modified, deleted and/or restored.
Sadly, I cannot do that in my application, as this statement is being run in an SSIS package.
Overall, I need to compare the adt_trl_dt_tm with the other adt_trl_dt_tm that have the same evnt_id and select the latest.
Can you test this with your data?
SELECT *
FROM dbo.EVNT_HSTRY hstry
WHERE evnt_hstry_cd LIKE '3' and
not exists (select 1 from dbo.EVNT_HSTRY newer WHERE newer.evnt_id = hstry.evnt_id
AND newer.adt_trl_dt_tm > hstry.adt_trl_dt_tm)
SELECT *
FROM dbo.EVNT_HSTRY hstry
WHERE evnt_hstry_cd = '3' and
adt_trl_dt_tm = (
SELECT max(adt_trl_dt_tm) FROM dbo.EVNT_HSTRY WHERE evnt_id = hstry.evnt_id
)
This returns the code 3 row only if it is the most recent entry in the history for its event, and no row if there is a more recent entry with a different code.
Change LIKE to = if the code has to match exactly.
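The same "latest entry per evnt_id" check can also be written with ROW_NUMBER, which reads the table once. This is a sketch using only the table and column names from the question; the alias names are arbitrary.

```sql
SELECT *
FROM (
    SELECT h.*,
           -- 1 = most recent history row for this event
           ROW_NUMBER() OVER (PARTITION BY h.evnt_id
                              ORDER BY h.adt_trl_dt_tm DESC) AS rn
    FROM dbo.EVNT_HSTRY AS h
) AS ranked
WHERE rn = 1
  AND evnt_hstry_cd = '3';   -- keep only events whose latest entry is a delete
```

If two history rows of one event share the same adt_trl_dt_tm, ROW_NUMBER picks one of them arbitrarily; adding evnt_Hstry_id DESC to the ORDER BY would make the choice deterministic.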