Repeating Subquery in DB2 update - sql

I am writing an UPDATE statement for a bitemporal table. It should do the following:
For every ID, update every row and set RELATION_ID to the RELATION_ID of the newest row.
The query contains a repeated subquery: I use it first to get the value for the update, and again to skip rows that already have this RELATION_ID.
Is there a way to reuse the value from the first subquery, or any alternative? (Pure SQL is required, no procedural code.)
UPDATE TBL_CLIENT UPD
SET RELATION_ID = (
SELECT RELATION_ID FROM TBL_CLIENT SUBQ
WHERE UPD.ID = SUBQ.ID AND
UPDATE_TIMESTAMP = (
SELECT MAX(UPDATE_TIMESTAMP) FROM TBL_CLIENT SUBSQ
WHERE SUBSQ.ID = SUBQ.ID
)
)
WHERE ID IN (
SELECT ID
FROM TBL_CLIENT QU
GROUP BY ID
HAVING COUNT(DISTINCT(RELATION_ID)) > 1
) AND
RELATION_ID <> (
SELECT RELATION_ID FROM TBL_CLIENT SUBQ2
WHERE UPD.ID = SUBQ2.ID AND
UPDATE_TIMESTAMP = (
-- update with the STID of the newest entry
SELECT MAX(UPDATE_TIMESTAMP) FROM TBL_CLIENT SUBSQ2
WHERE SUBSQ2.ID = SUBQ2.ID
)
)
Example:
ID / RELATION_ID / UPDATE_TIMESTAMP
1 / 10 / 1.1.2000
1 / 12 / 5.1.2002
1 / 15 / 28.3.2008
After Update:
1 / 15 / 1.1.2000
1 / 15 / 5.1.2002
1 / 15 / 28.3.2008
The last row is the most recent one, so its RELATION_ID was taken (and the row itself was not updated, which is important for another part of the query that isn't included here).
Thanks for any recommendations

You can update a view:
UPDATE
( SELECT t.id, t.update_timestamp, ... --- all the PK columns
t.relation_id,
m.relation_id AS new_relation_id
FROM
TBL_CLIENT AS t
JOIN
( SELECT id, relation_id,
ROW_NUMBER() OVER (PARTITION BY id
ORDER BY update_timestamp DESC)
AS rn
FROM TBL_CLIENT
) AS m
ON m.id = t.id
WHERE m.rn = 1
AND m.relation_id <> t.relation_id
) AS upd
SET
relation_id = new_relation_id ;
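To check the expected behaviour on the sample data, here is a minimal sketch that runs in SQLite through Python's sqlite3 module. It is not DB2 syntax and not the updatable-view form above: it names the "newest row per ID" lookup once in a CTE (called `newest` here, a name chosen for illustration) and refers to it in both SET and WHERE. The table layout, the ISO date strings and the extra row for ID 2 are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_client (id INTEGER, relation_id INTEGER, update_timestamp TEXT);
    INSERT INTO tbl_client VALUES
        (1, 10, '2000-01-01'),
        (1, 12, '2002-01-05'),
        (1, 15, '2008-03-28'),
        (2,  7, '2001-06-01');   -- hypothetical ID with a single row
""")

changes_before = conn.total_changes

# The lookup of the newest row per id is written once (CTE "newest")
# and referenced twice: for the new value and for the "already equal" filter.
conn.execute("""
    WITH newest AS (
        SELECT t.id, t.relation_id
        FROM tbl_client AS t
        WHERE t.update_timestamp = (SELECT MAX(m.update_timestamp)
                                    FROM tbl_client AS m
                                    WHERE m.id = t.id)
    )
    UPDATE tbl_client
    SET relation_id = (SELECT n.relation_id FROM newest AS n
                       WHERE n.id = tbl_client.id)
    WHERE relation_id != (SELECT n.relation_id FROM newest AS n
                          WHERE n.id = tbl_client.id)
""")
conn.commit()

# Number of rows actually modified by the UPDATE.
rows_changed = conn.total_changes - changes_before

rows = conn.execute("""
    SELECT id, relation_id, update_timestamp
    FROM tbl_client
    ORDER BY id, update_timestamp
""").fetchall()
```

Only the two older rows of ID 1 are modified; the newest row and the single row of ID 2 are filtered out by the WHERE clause, so they are not touched.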

This might work for you (I don't know the exact DB2 syntax; I work on Sybase, so this is written in Sybase syntax):
UPDATE TBL_CLIENT AA
SET RELATION_ID = BB.RELATION_ID
FROM
TBL_CLIENT AA,
TBL_CLIENT BB
WHERE
AA.ID=BB.ID
AND BB.UPDATE_TIMESTAMP=(SELECT MAX(UPDATE_TIMESTAMP) FROM TBL_CLIENT CC WHERE CC.ID=BB.ID)

Probably you can use a trigger with an after/before update.

Related

Finding updates in a table using Self-Join

I have a table as shown below
tablename - property
|runId|listingId|listingName
1 123 abc
1 234 def
2 123 abcd
2 567 ghi
2 234 defg
As you can see in the table above, there is a runId and a listingId. For a particular runId, I am trying to fetch which listings are newly added (in this case, for runId 2 it is the 4th row, with listingId 567) and which listings are updated (in this case rows 3 and 5, with listingId 123 and 234 respectively).
I am trying a self join, and it works fairly well for updates, but new additions are giving me trouble.
SELECT p1.* FROM property p1
INNER JOIN property p2
ON p1.listingid = p2.listingid
WHERE p1.runid=456 AND p2.runid!=456
The above query gives me the correct updated records. But I am not able to find the new listings. I tried p1.listingid != p2.listingid and a left outer join, but it still won't work.
I would use the ROW_NUMBER() analytical function for it.
SELECT
T.*
FROM
(
SELECT
T.*,
CASE
WHEN ROW_NUMBER() OVER(
PARTITION BY LISTINGID
ORDER BY
RUNID
) = 1 THEN 'INSERTED'
ELSE 'UPDATED'
END AS OPERATION_
FROM
PROPERTY T
) T
WHERE
RUNID = 2
-- AND OPERATION_ = 'INSERTED'
-- AND OPERATION_ = 'UPDATED'
This reports a row as UPDATED if its listingid was already added in any previous runid.
Cheers!!
You may try this.
with cte as (
select row_number() over (partition by listingId order by runId) as Slno, * from property
)
select * from property where listingId not in (
select listingId from cte as c where slno>1
) --- for new listing added
with cte as (
select row_number() over (partition by listingId order by runId) as Slno, * from property
)
select * from property where listingId in (
select listingId from cte as c where slno>1
) --- for modified listing
For this, I would recommend exists and not exists. For updates:
select p.*
from property p
where exists (select 1
from property p2
where p2.listingid = p.listingid and
p2.runid < p.runid
);
If you want the result for a particular runid, add and runid = ? to the outer query.
And for new listings:
select p.*
from property p
where not exists (select 1
from property p2
where p2.listingid = p.listingid and
p2.runid < p.runid
);
With an index on property(listingid, runid), I would expect this to have somewhat better performance than a solution using window functions.
Here is a db<>fiddle.
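As a quick check of the EXISTS approach on the sample data from the question, here is a minimal sketch using SQLite via Python's sqlite3 module. It folds both queries into one CASE expression; the function name `classify` and the status labels 'NEW' and 'UPDATED' are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE property (runId INTEGER, listingId INTEGER, listingName TEXT);
    INSERT INTO property VALUES
        (1, 123, 'abc'),
        (1, 234, 'def'),
        (2, 123, 'abcd'),
        (2, 567, 'ghi'),
        (2, 234, 'defg');
""")

def classify(run_id):
    """Label every listing of one run as NEW or UPDATED.

    A listing is UPDATED when the same listingId exists in an earlier run,
    otherwise it is NEW.
    """
    return conn.execute("""
        SELECT p.listingId,
               CASE WHEN EXISTS (SELECT 1
                                 FROM property AS p2
                                 WHERE p2.listingId = p.listingId
                                   AND p2.runId < p.runId)
                    THEN 'UPDATED'
                    ELSE 'NEW'
               END AS status
        FROM property AS p
        WHERE p.runId = ?
        ORDER BY p.listingId
    """, (run_id,)).fetchall()

run2 = classify(2)
run1 = classify(1)
```

For runId 2 this marks 123 and 234 as UPDATED and 567 as NEW; for runId 1 everything is NEW because there is no earlier run.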

Aggregate data from multiple rows into single row

In my table each row has some data columns and a Priority column (for example, a timestamp or just an integer). I want to group my data by ID and then, in each group, take the latest non-null value of each column. For example, I have the following table:
id A B C Priority
1 NULL 3 4 1
1 5 6 NULL 2
1 8 NULL NULL 3
2 634 346 359 1
2 34 NULL 734 2
Desired result is :
id A B C
1 8 6 4
2 34 346 734
In this example the table is small and has only 5 columns, but the real table will be much larger. I really want this script to run fast. I tried to do it myself, but my script only works on SQL Server 2012+, so I deleted it as not applicable.
Numbers: the table could have 150k rows, 20 columns and 20-80k unique ids, and the average of SELECT COUNT(id) FROM T GROUP BY ID is 2..5.
Now I have working code (thanks to @ypercubeᵀᴹ), but it runs very slowly on big tables; in my case the script can take one minute or even more (with indices and so on).
How can it be sped up?
SELECT
d.id,
d1.A,
d2.B,
d3.C
FROM
( SELECT id
FROM T
GROUP BY id
) AS d
OUTER APPLY
( SELECT TOP (1) A
FROM T
WHERE id = d.id
AND A IS NOT NULL
ORDER BY priority DESC
) AS d1
OUTER APPLY
( SELECT TOP (1) B
FROM T
WHERE id = d.id
AND B IS NOT NULL
ORDER BY priority DESC
) AS d2
OUTER APPLY
( SELECT TOP (1) C
FROM T
WHERE id = d.id
AND C IS NOT NULL
ORDER BY priority DESC
) AS d3 ;
In my test database with real amount of data I get following execution plan:
This should do the trick, everything raised to the power 0 will return 1 except null:
DECLARE @t table(id int,A int,B int,C int,Priority int)
INSERT @t
VALUES (1,NULL,3 ,4 ,1),
(1,5 ,6 ,NULL,2),(1,8 ,NULL,NULL,3),
(2,634 ,346 ,359 ,1),(2,34 ,NULL,734 ,2)
;WITH CTE as
(
SELECT id,
CASE WHEN row_number() over
(partition by id order by Priority*power(A,0) desc) = 1 THEN A END A,
CASE WHEN row_number() over
(partition by id order by Priority*power(B,0) desc) = 1 THEN B END B,
CASE WHEN row_number() over
(partition by id order by Priority*power(C,0) desc) = 1 THEN C END C
FROM @t
)
SELECT id, max(a) a, max(b) b, max(c) c
FROM CTE
GROUP BY id
Result:
id a b c
1 8 6 4
2 34 346 734
One alternative that might be faster is a multiple join approach. Get the priority for each column and then join back to the original table. For the first part:
select id,
max(case when a is not null then priority end) as pa,
max(case when b is not null then priority end) as pb,
max(case when c is not null then priority end) as pc
from t
group by id;
Then join back to this table:
with pabc as (
select id,
max(case when a is not null then priority end) as pa,
max(case when b is not null then priority end) as pb,
max(case when c is not null then priority end) as pc
from t
group by id
)
select pabc.id, ta.a, tb.b, tc.c
from pabc left join
t ta
on pabc.id = ta.id and pabc.pa = ta.priority left join
t tb
on pabc.id = tb.id and pabc.pb = tb.priority left join
t tc
on pabc.id = tc.id and pabc.pc = tc.priority ;
This can also take advantage of an index on t(id, priority).
The previous code will also work with the following syntax:
with pabc as (
select id,
max(case when a is not null then priority end) as pa,
max(case when b is not null then priority end) as pb,
max(case when c is not null then priority end) as pc
from t
group by id
)
select pabc.Id,ta.a, tb.b, tc.c
from pabc
left join t ta on pabc.id = ta.id and pabc.pa = ta.priority
left join t tb on pabc.id = tb.id and pabc.pb = tb.priority
left join t tc on pabc.id = tc.id and pabc.pc = tc.priority ;
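The join-back query can be tried on the sample data from the question with the following minimal sketch, which uses SQLite through Python's sqlite3 module rather than SQL Server. It assumes that (id, priority) is unique, so each join matches at most one row; the index name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER, a INTEGER, b INTEGER, c INTEGER, priority INTEGER);
    INSERT INTO t VALUES
        (1, NULL, 3,    4,    1),
        (1, 5,    6,    NULL, 2),
        (1, 8,    NULL, NULL, 3),
        (2, 634,  346,  359,  1),
        (2, 34,   NULL, 734,  2);
    -- index suggested in the answer; the name is made up for this sketch
    CREATE INDEX ix_t_id_priority ON t (id, priority);
""")

latest = conn.execute("""
    WITH pabc AS (
        -- per id: the highest priority at which each column is non-null
        SELECT id,
               MAX(CASE WHEN a IS NOT NULL THEN priority END) AS pa,
               MAX(CASE WHEN b IS NOT NULL THEN priority END) AS pb,
               MAX(CASE WHEN c IS NOT NULL THEN priority END) AS pc
        FROM t
        GROUP BY id
    )
    -- join back once per column to fetch the value stored at that priority
    SELECT pabc.id, ta.a, tb.b, tc.c
    FROM pabc
    LEFT JOIN t AS ta ON pabc.id = ta.id AND pabc.pa = ta.priority
    LEFT JOIN t AS tb ON pabc.id = tb.id AND pabc.pb = tb.priority
    LEFT JOIN t AS tc ON pabc.id = tc.id AND pabc.pc = tc.priority
    ORDER BY pabc.id
""").fetchall()
```

With 20 data columns the pattern needs one MAX(CASE ...) and one LEFT JOIN per column, so it is probably worth generating the statement rather than writing it by hand.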
This looks rather strange. You have a log table for all column changes, but no associated table with current data. Now you are looking for a query to collect your current values from the log table, which is a laborious task naturally.
The solution is simple: have an additional table with the current data. You can even link the tables with a trigger (so either every time a record gets inserted in your log table you update the current table, or every time a change is written to the current table you write a log entry).
Then just query your current table:
select id, a, b, c from currenttable order by id;

Column Operations and eliminating the duplicate

I have the table below. I want to eliminate the duplicate course from group 2 because it is already in group 1. Basically, if a course is mapped to group 1, which is mandatory, we should only consider that row and not the same course in any other group. I have to check for repeating courses first and then remove the duplicate course that is not mandatory.
Program Group Course Mandatory
Program1 1 a YES
Program1 1 b YES
Program1 1 c YES
Program1 2 d NO
Program1 2 a NO
Program1 2 e NO
Program1 3 f YES
I am not able to figure out operations within the same column, or my mind is not working today (:-)).
I have tried using a COUNT and creating a flag for duplicate rows, but I cannot do it with 'Group' in the GROUP BY clause.
Output:
Program Group Course Mandatory
Program1 1 a YES
Program1 1 b YES
Program1 1 c YES
Program1 2 d NO
Program1 2 e NO
Program1 3 f YES
EDIT
How can we check for duplicate records and delete them from only one particular group?
You can use a ROW_NUMBER() function to achieve this:
SELECT *
FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY COURSE ORDER BY [Group] ) as RowRank
FROM table
)sub
WHERE RowRank = 1
Demo: SQL Fiddle
Edit: ROW_NUMBER assigns a number to each row. Numbering will start at 1 for each grouping you assign via the PARTITION BY portion, in this case each COURSE would have a number 1 and go up. The order of the numbers is determined by the ORDER BY portion, in this case the lowest [Group] gets the 1.
Editing my answer to reflect clarified requirements.
select *
from #TableName t
where
(Mandatory = 'YES' or
not exists (
select *
from #TableName
where
Program = t.Program and
Course = t.Course and
[Group] != t.[Group] and
Mandatory = 'YES'
))
Based on your comments below, here's another sample to try:
;with group1 as (
select * from #Table where [Group] = 1
),
groups12 as (
select * from group1
union all
select * from #Table t where [Group] = 2 and not exists (select * from group1 where Program = t.Program and Course = t.Course)
),
groups123 as (
select * from groups12
union all
select * from #Table t where [Group] = 3 and not exists (select * from groups12 where Program = t.Program and Course = t.Course)
),
groups1234 as (
select * from groups123
union all
select * from #Table t where [Group] = 4 and not exists (select * from groups123 where Program = t.Program and Course = t.Course)
)
select * from groups1234
This query pulls rows for groups 1-4 in order and only when the Program/Course hasn't already appeared in a lower number group.
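The simpler NOT EXISTS filter from this answer (keep a row if it is mandatory, or if no mandatory row exists for the same program and course in another group) can be checked against the sample data with the sketch below. It runs on SQLite through Python's sqlite3 module; the table name `courses` and the column name `grp` (standing in for the reserved word `Group`) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE courses (program TEXT, grp INTEGER, course TEXT, mandatory TEXT);
    INSERT INTO courses VALUES
        ('Program1', 1, 'a', 'YES'),
        ('Program1', 1, 'b', 'YES'),
        ('Program1', 1, 'c', 'YES'),
        ('Program1', 2, 'd', 'NO'),
        ('Program1', 2, 'a', 'NO'),
        ('Program1', 2, 'e', 'NO'),
        ('Program1', 3, 'f', 'YES');
""")

kept = conn.execute("""
    SELECT t.program, t.grp, t.course, t.mandatory
    FROM courses AS t
    WHERE t.mandatory = 'YES'
       -- a non-mandatory row survives only if the course is not
       -- mandatory in some other group of the same program
       OR NOT EXISTS (SELECT 1
                      FROM courses AS m
                      WHERE m.program = t.program
                        AND m.course = t.course
                        AND m.grp != t.grp
                        AND m.mandatory = 'YES')
    ORDER BY t.grp, t.course
""").fetchall()
```

The only row dropped is course a in group 2, which matches the desired output in the question.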

Consolidate records

I want to consolidate a set of records
(id) / (referencedid)
1 10
1 11
2 11
2 10
3 10
3 11
3 12
The result of query should be
1 10
1 11
3 10
3 11
3 12
So, since id=1 and id=2 have the same set of corresponding referenceids {10,11}, they would be consolidated. But the referenceids of id=3 are not the same, hence it wouldn't be consolidated.
What would be good way to get this done?
Select id, referenceid
From MyTable
Where Id In (
Select Min( Z.Id ) As Id
From (
Select Z1.id, Group_Concat( Z1.referenceid ) As signature
From (
Select id, referenceid
From MyTable
Order By id, referenceid
) As Z1
Group By Z1.id
) As Z
Group By Z.Signature
)
-- generate count of elements for each distinct id
with Counts as (
select
id,
count(1) as ReferenceCount
from
tblReferences R
group by
R.id
)
-- generate every pairing of two different id's, along with
-- their counts, and how many are equivalent between the two
,Pairings as (
select
R1.id as id1
,R2.id as id2
,C1.ReferenceCount as count1
,C2.ReferenceCount as count2
,sum(case when R1.referenceid = R2.referenceid then 1 else 0 end) as samecount
from
tblReferences R1 join Counts C1 on R1.id = C1.id
cross join
tblReferences R2 join Counts C2 on R2.id = C2.id
where
R1.id < R2.id
group by
R1.id, C1.ReferenceCount, R2.id, C2.ReferenceCount
)
-- generate the list of ids that are safe to remove by picking
-- out any id's that have the same number of matches, and same
-- size of list, which means their reference lists are identical.
-- since id2 > id, we can safely remove id2 as a copy of id, and
-- the smallest id of which all id2 > id are copies will be left
,RemovableIds as (
select
distinct id2 as id
from
Pairings P
where
P.count1 = P.count2 and P.count1 = P.samecount
)
-- validate the results by just selecting to see which id's
-- will be removed. can also include id in the query above
-- to see which id was identified as the copy
select id from RemovableIds R
-- comment out `select` above and uncomment `delete` below to
-- remove the records after verifying they are correct!
--delete from tblReferences where id in (select id from RemovableIds)
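The counting idea above can be tried on the sample data with the following minimal sketch on SQLite via Python's sqlite3 module. It is a variant, not the exact query: it joins pairs of ids on equal referenceid instead of cross joining everything, and it selects the surviving rows instead of deleting. It assumes (id, referenceid) pairs are unique; the table name `refs` is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE refs (id INTEGER, referenceid INTEGER);
    INSERT INTO refs VALUES
        (1, 10), (1, 11),
        (2, 11), (2, 10),
        (3, 10), (3, 11), (3, 12);
""")

consolidated = conn.execute("""
    WITH counts AS (
        -- size of the reference list of each id
        SELECT id, COUNT(*) AS n
        FROM refs
        GROUP BY id
    ),
    pairs AS (
        -- for each pair id1 < id2: how many referenceids they share
        SELECT r1.id AS id1, r2.id AS id2, COUNT(*) AS same
        FROM refs AS r1
        JOIN refs AS r2
          ON r1.referenceid = r2.referenceid AND r1.id < r2.id
        GROUP BY r1.id, r2.id
    ),
    removable AS (
        -- equal list sizes and all entries shared => identical lists,
        -- so the larger id is a copy of the smaller one
        SELECT DISTINCT p.id2 AS id
        FROM pairs AS p
        JOIN counts AS c1 ON c1.id = p.id1
        JOIN counts AS c2 ON c2.id = p.id2
        WHERE c1.n = c2.n AND c1.n = p.same
    )
    SELECT id, referenceid
    FROM refs
    WHERE id NOT IN (SELECT id FROM removable)
    ORDER BY id, referenceid
""").fetchall()
```

Here id 2 is identified as a copy of id 1, while id 3 stays because its list has three entries.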

How to find "holes" in a table

I recently inherited a database on which one of the tables has the primary key composed of encoded values (Part1*1000 + Part2).
I normalized that column, but I cannot change the old values.
So now I have
select ID from table order by ID
ID
100001
100002
101001
...
I want to find the "holes" in the table (more precisely, the first "hole" after 100000) for new rows.
I'm using the following select, but is there a better way to do that?
select /* top 1 */ ID+1 as newID from table
where ID > 100000 and
ID + 1 not in (select ID from table)
order by ID
newID
100003
101029
...
The database is Microsoft SQL Server 2000. I'm ok with using SQL extensions.
select ID +1 From Table t1
where not exists (select * from Table t2 where t1.id +1 = t2.id);
Not sure if this version would be faster than the one you mentioned originally.
SELECT (t1.ID+1) FROM table AS t1
LEFT JOIN table as t2
ON t1.ID+1 = t2.ID
WHERE t2.ID IS NULL
This solution should give you the first and last ID values of the "holes" you are seeking. I use this in Firebird 1.5 on a table of 500K records, and although it does take a little while, it gives me what I want.
SELECT l.id + 1 start_id, MIN(fr.id) - 1 stop_id
FROM (table l
LEFT JOIN table r
ON l.id = r.id - 1)
LEFT JOIN table fr
ON l.id < fr.id
WHERE r.id IS NULL AND fr.id IS NOT NULL
GROUP BY l.id, r.id
For example, if your data looks like this:
ID
1001
1002
1005
1006
1007
1009
1011
You would receive this:
start_id stop_id
1003 1004
1008 1008
1010 1010
I wish I could take full credit for this solution, but I found it at Xaprb.
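To reproduce the example output, here is a minimal sketch that runs the same start/stop query on SQLite through Python's sqlite3 module (the original was used on Firebird 1.5). The table name `counter` is an assumption, and an ORDER BY is added so the result order is stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counter (id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO counter VALUES (?)",
    [(i,) for i in (1001, 1002, 1005, 1006, 1007, 1009, 1011)],
)

gaps = conn.execute("""
    SELECT l.id + 1 AS start_id, MIN(fr.id) - 1 AS stop_id
    FROM counter AS l
    LEFT JOIN counter AS r  ON l.id = r.id - 1   -- r is the direct successor of l
    LEFT JOIN counter AS fr ON l.id < fr.id      -- fr is any later id
    WHERE r.id IS NULL          -- l has no direct successor: a gap starts here
      AND fr.id IS NOT NULL     -- but a later id exists, so the gap is closed
    GROUP BY l.id, r.id
    ORDER BY start_id
""").fetchall()
```

The last id (1011) has no later row, so it is filtered out and no open-ended gap is reported.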
From the question How do I find a "gap" in running counter with SQL?:
select
MIN(ID)
from (
select
100001 ID
union all
select
[YourIdColumn]+1
from
[YourTable]
where
--Filter the rest of your key--
) foo
left join
[YourTable]
on [YourIdColumn]=ID
and --Filter the rest of your key--
where
[YourIdColumn] is null
The best way is to build a temp table with all IDs,
then do a left join.
declare @maxId int
select @maxId = max(YOUR_COLUMN_ID) from YOUR_TABLE_HERE
declare @t table (id int)
declare @i int
set @i = 1
while @i <= @maxId
begin
insert into @t values (@i)
set @i = @i +1
end
select t.id
from @t t
left join YOUR_TABLE_HERE x on x.YOUR_COLUMN_ID = t.id
where x.YOUR_COLUMN_ID is null
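The same "all ids, then left join" idea can be sketched without a loop where recursive CTEs are available. The example below uses SQLite through Python's sqlite3 module, so it swaps the WHILE loop and table variable for a recursive CTE and is not SQL Server 2000 syntax. The table name `items` and the sample ids are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO items VALUES (?)", [(1,), (2,), (4,), (7,)])

missing = [row[0] for row in conn.execute("""
    WITH RECURSIVE all_ids(id) AS (
        -- generate 1, 2, 3, ... up to the current maximum id
        SELECT 1
        UNION ALL
        SELECT id + 1
        FROM all_ids
        WHERE id < (SELECT MAX(id) FROM items)
    )
    -- keep every generated id that has no matching row
    SELECT a.id
    FROM all_ids AS a
    LEFT JOIN items AS x ON x.id = a.id
    WHERE x.id IS NULL
    ORDER BY a.id
""")]
```

Like the loop version, this materialises every number up to the maximum, so it lists all missing ids but can be expensive when ids are large and sparse.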
Have thought about this question recently, and looks like this is the most elegant way to do that:
SELECT TOP(@MaxNumber) ROW_NUMBER() OVER (ORDER BY t1.number)
FROM master..spt_values t1 CROSS JOIN master..spt_values t2
EXCEPT
SELECT Id FROM <your_table>
This solution doesn't give all holes in the table, only the next free id after each run of consecutive ids, plus the first available number after the current maximum. It works if you want to fill in gaps in ids, and it still gives a free id number if there is no gap.
select numb + 1 from temp
minus
select numb from temp;
This will give you the complete picture, where 'Bottom' stands for gap start and 'Top' stands for gap end:
select *
from
(
(select <COL>+1 as id, 'Bottom' AS 'Pos' from <TABLENAME> /*where <CONDITION>*/
except
select <COL>, 'Bottom' AS 'Pos' from <TABLENAME> /*where <CONDITION>*/)
union
(select <COL>-1 as id, 'Top' AS 'Pos' from <TABLENAME> /*where <CONDITION>*/
except
select <COL>, 'Top' AS 'Pos' from <TABLENAME> /*where <CONDITION>*/)
) t
order by t.id, t.Pos
Note: First and Last results are WRONG and should not be regarded, but taking them out would make this query a lot more complicated, so this will do for now.
Many of the previous answers are quite good. However, they all fail to return the first value of the sequence and/or fail to consider the lower limit 100000. They all return intermediate holes but not the very first one (100001 if missing).
A full solution to the question is the following one:
select id + 1 as newid from
(select 100000 as id union select id from tbl) t
where (id + 1 not in (select id from tbl)) and
(id >= 100000)
order by id
limit 1;
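The query above can be exercised with the following minimal sketch on SQLite via Python's sqlite3 module. The helper names `make_table` and `first_free_id` are hypothetical, the lower bound is passed as a bound parameter instead of a literal, and id is assumed to be a non-null key (NOT IN behaves differently if NULLs are present).

```python
import sqlite3

def make_table(ids):
    """Create an in-memory table tbl(id) holding the given ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO tbl VALUES (?)", [(i,) for i in ids])
    return conn

def first_free_id(conn, lower_bound=100000):
    """Return the first id above lower_bound that is not in tbl."""
    row = conn.execute("""
        SELECT id + 1 AS newid
        FROM (SELECT ? AS id          -- seed row so lower_bound + 1 is a candidate
              UNION
              SELECT id FROM tbl) AS t
        WHERE id + 1 NOT IN (SELECT id FROM tbl)
          AND id >= ?
        ORDER BY id
        LIMIT 1
    """, (lower_bound, lower_bound)).fetchone()
    return row[0]

# ids from the question: the first hole is right after 100002
hole_after_run = first_free_id(make_table([100001, 100002, 101001]))

# 100001 itself is missing, and the id below the bound is ignored
hole_at_start = first_free_id(make_table([5, 100002, 100003]))
```

The seed row is what makes the second case return 100001, which the earlier answers would skip.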
The number 100000 is used when the first number of the sequence is 100001 (as in the original question); otherwise modify it accordingly.
"limit 1" is used to get just the first available number instead of the full sequence.
For people using Oracle, the following can be used:
select a, b from (
select ID + 1 a, max(ID) over (order by ID rows between current row and 1 following) - 1 b from MY_TABLE
) where a <= b order by a desc;
The following SQL code works well with SQLite, and should also be usable on MySQL, MS SQL and so on (IIF may need to be replaced by a CASE expression where it is not available).
On SQLite this takes only 2 seconds on a table with 1 million rows (and about 100 sparse missing rows).
WITH holes AS (
SELECT
IIF(c2.id IS NULL,c1.id+1,null) as start,
IIF(c3.id IS NULL,c1.id-1,null) AS stop,
ROW_NUMBER () OVER (
ORDER BY c1.id ASC
) AS rowNum
FROM |mytable| AS c1
LEFT JOIN |mytable| AS c2 ON c1.id+1 = c2.id
LEFT JOIN |mytable| AS c3 ON c1.id-1 = c3.id
WHERE c2.id IS NULL OR c3.id IS NULL
)
SELECT h1.start AS start, h2.stop AS stop FROM holes AS h1
LEFT JOIN holes AS h2 ON h1.rowNum+1 = h2.rowNum
WHERE h1.start IS NOT NULL AND h2.stop IS NOT NULL
UNION ALL
SELECT 1 AS start, h1.stop AS stop FROM holes AS h1
WHERE h1.rowNum = 1 AND h1.stop > 0
ORDER BY h1.start ASC