SQL SERVER 2014
I need to update two columns in TargetTable with values from SourceTable
SourceTbl
PersonNr | Block | BlockReason |
---------|----------|---------------|
000001 | 1 | abuse |
000001 | 1 | age |
000001 | 0 | memo |
000002 | 1 | age |
000002 | 0 | |
000003 | 0 | |
000003 | 0 | |
000004 | 1 | behaviour |
000005 | 0 | |
TargetTable
PersonNr | Block | BlockReason |
---------|----------|---------------|
000001 | 0 | |
000001 | 0 | |
000002 | 0 | |
000002 | 0 | |
000004 | 1 | |
000005 | 0 | |
Result needed:
PersonNr | Block | BlockReason |
---------|----------|---------------|
000001 | 1 | abuse |
000001 | 1 | abuse |
000002 | 1 | age |
000002 | 1 | age |
000004 | 1 | behaviour |
000005 | 0 | |
It is not relevant which BlockReason Person 1 gets,
as long as it's one from a row where Block = '1'.
I've tried this pretty straight-forward update :
UPDATE
src
SET
src.Block = '1',
src.BlockReason = targ.BlockReason
FROM
SourceTbl src
INNER JOIN
TargetTable targ
ON
src.PersonNr= targ.PersonNr
WHERE src.Block = '1'
But ended up with faulty result-rows where Block and Reason are updated separately :
PersonNr | Block | BlockReason |
---------|----------|---------------|
000001 | 1 | memo |
Next I've tried :
MERGE INTO TargetTable AS TGT
USING
(
SELECT Block, BlockReason, PersonNr
FROM SourceTbl
GROUP BY Block, BlockReason, PersonNr
) AS SRC
ON
SRC.PersonNr= TGT.PersonNr AND
SRC.Block= '1'
WHEN MATCHED THEN
UPDATE SET TGT.Block= SRC.Block, TGT.BlockReason= SRC.BlockReason;
Got the error
The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row. A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.
Any help? Hugely appreciated! Truly. Totally.
The problem with your query is that the join produces duplicate matches, so MERGE tries to update the same target row more than once. The GROUP BY in the subquery does not help: with no aggregate function it only removes exact duplicates, so rows that differ in BlockReason all survive.
Let's take one id (say 000001) and check what goes wrong with your query.
src.PersonNr | src.Block | src.BlockReason | tgt.PersonNr | tgt.Block | tgt.BlockReason |
-------------|-----------|-----------------|--------------|-----------|-----------------|
000001 | 1 | abuse | 000001 | 0 | |
000001 | 1 | age | 000001 | 0 | |
000001 | 1 | abuse | 000001 | 0 | |
000001 | 1 | age | 000001 | 0 | |
Your query produces the rows above and tries to update each TargetTable row twice: once with abuse and once with age.
You can try the below query:
MERGE INTO TargetTable AS TGT
USING
(
SELECT Block, BlockReason, PersonNr
FROM(
-- filter to blocked rows first, so row 1 per person is always a Block = '1' row
SELECT Block, BlockReason, PersonNr,ROW_NUMBER() OVER (PARTITION BY PersonNr ORDER BY [YourPrimaryKey]) RN
FROM SourceTbl
WHERE Block = '1' ) X
WHERE X.RN=1
) AS SRC
ON
SRC.PersonNr= TGT.PersonNr
WHEN MATCHED THEN
UPDATE SET TGT.Block= SRC.Block, TGT.BlockReason= SRC.BlockReason;
You have duplicates in your data. Add another (or more than one) column to the ON clause of the MERGE that will help identify exactly one record or find a way to remove duplicates before merging.
The UPDATE should be something like this:
UPDATE
targ
SET
Block = '1',
BlockReason = src.BlockReason
FROM
SourceTbl src
INNER JOIN
TargetTable targ
ON
src.PersonNr= targ.PersonNr
WHERE src.Block = '1'
Since we're only using rows from SourceTbl where Block is 1, it should not be possible for a row affected by this update to end up with a reason which had a Block of 0.
There is still the general issue that this is non-deterministic when multiple rows from SourceTbl join to one row in TargetTable, but since you've indicated that determinism isn't required here, it shouldn't cause a problem.
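If a deterministic result is wanted after all, one option is to reduce SourceTbl to a single blocked row per person before joining. The following is only a sketch: the CTE name OneReason and the choice to order by BlockReason (which picks the alphabetically first reason) are assumptions, not something the question specifies.

```sql
-- Sketch: keep exactly one Block = '1' row per PersonNr, then update from it.
-- OneReason and the ORDER BY BlockReason tie-break are illustrative assumptions.
WITH OneReason AS (
    SELECT PersonNr,
           BlockReason,
           ROW_NUMBER() OVER (PARTITION BY PersonNr
                              ORDER BY BlockReason) AS rn
    FROM SourceTbl
    WHERE Block = '1'
)
UPDATE targ
SET    targ.Block = '1',
       targ.BlockReason = r.BlockReason
FROM   TargetTable AS targ
INNER JOIN OneReason AS r
        ON r.PersonNr = targ.PersonNr
       AND r.rn = 1;   -- each target row now matches at most one source row
```

Target rows for persons without any blocked source row (such as 000005) are not touched, because the inner join finds no match for them.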
Related
I am looking to do some unpivoting in SQL, converting key-value pairs into a list of IDs. The columns of the table are dynamic and will change weekly (perhaps even daily).
There may be more KEY columns and more possible values.
DATABASE.PUBLIC.IMPORT_TABLE
+-----------+--------+--------+--------+
| SOURCE_ID | KEY1 | KEY2 | [KEY3] |
+-----------+--------+--------+--------+
| 0001 | TRUE | FALSE | [TBD] |
| 0002 | TRUE | FALSE | [TBD] |
| 0003 | TRUE | TRUE | [TBD] |
| 0004 | TRUE | TRUE | [TBD] |
| 0005 | FALSE | TRUE | [TBD] |
+-----------+--------+--------+--------+
To manage the possible combinations, there is a table which stores the key-value pairs and gives each one an ID. This table also determines which columns I am interested in: if additional columns are added to the import table but are not in this table, they are excluded.
DATABASE.PUBLIC.KEY_VALUE_TABLE
+-----------+--------+--------+
| KEY | VALUE | KV_ID |
+-----------+--------+--------+
| KEY1 | TRUE | AAA |
| KEY1 | FALSE | BBB |
| KEY2 | TRUE | CCC |
| KEY2 | FALSE | DDD |
| [KEY3] | [TRUE] | [EEE] |
+-----------+--------+--------+
Using the following query, I am able to achieve the unpivot I am looking for. However, it does not take into account changes to KEY_VALUE_TABLE.
CREATE OR REPLACE TABLE DATABASE.PUBLIC.FINAL_TABLE AS
SELECT SOURCE_ID
,LISTAGG(DISTINCT b.KV_ID, ',') within group (order by b.KV_ID desc) AS KV_ID
FROM (
SELECT DISTINCT SOURCE_ID, KEY, VALUE from DATABASE.PUBLIC.IMPORT_TABLE
UNPIVOT(VALUE for KEY in (KEY1,KEY2))
) a
LEFT JOIN DATABASE.PUBLIC.KEY_VALUE_TABLE b
ON (a.KEY=b.KEY AND a.VALUE=b.VALUE)
GROUP BY 1
Where the results look like the following
DATABASE.PUBLIC.FINAL_TABLE
+-----------+---------+
| SOURCE_ID | KV_ID   |
+-----------+---------+
| 0001      | AAA,DDD |
| 0002      | AAA,DDD |
| 0003      | AAA,CCC |
| 0004      | AAA,CCC |
| 0005      | BBB,CCC |
+-----------+---------+
Is there a way to change UNPIVOT(VALUE for KEY in (KEY1,KEY2))
to something like UNPIVOT(VALUE for KEY in (SELECT DISTINCT KEY FROM KEY_VALUE_TABLE))? Note that I already tried the latter and it does not work.
SQL has an issue with using SELECT within the UNPIVOT.
I have also tried using UNPIVOT(VALUE for KEY in ($KEY_LIST))
where my variable is SET KEY_LIST = '[KEY1,KEY2]'
=> SET KEY_LIST = (SELECT listagg(KEY, ',') as my_strings FROM DATABASE.PUBLIC.KEY_VALUE_TABLE)
but SQL has a limit on the size of variables.
Any advice would be appreciated.
My next step is to create a stored procedure or a function that takes in external variables, via JavaScript or something similar.
The trigger is updating multiple rows with the same value
Here is the problem: I'm trying to update one or many rows in table B, but it updates all the rows with the same value. Let's say:
TABLE A
+------+-----+-----+
|ROWID |NAME |CODE |
+------+-----+-----+
| 123 |pepe | 1 |
| 456 |tito | 2 |
| 789 |gege | 3 |
+------+-----+-----+
Current result:
+------+-----+-----+---------+
|ROWID |NAME |CODE | A_ROWID |
+------+-----+-----+---------+
| 321 |rolo | 1 | 123 |
| 654 |kit | 2 | 123 |
| 987 |gu | 3 | 123 |
+------+-----+-----+---------+
Expected result:
+------+-----+-----+---------+
|ROWID |NAME |CODE | A_ROWID |
+------+-----+-----+---------+
| 321 |rolo | 1 | 123 |
| 654 |kit | 2 | 456 |
| 987 |gu | 3 | 789 |
+------+-----+-----+---------+
This is the trigger, I will really appreciate any advice
ALTER TRIGGER [dbo].[TRIG_AfterInsert]
ON [dbo].[B]
AFTER INSERT
AS
BEGIN
DECLARE @Code nvarchar(100);
SELECT @Code = CODE FROM INSERTED;
UPDATE B
SET B.A_ROWID = A.ROWID
FROM A
WHERE A.CODE = @Code;
END
A trigger fires once per statement, not once per row: if you insert one row it fires once, and if you insert 1 million rows it still fires only once. INSERTED then holds every affected row, so copying CODE into a scalar variable keeps just one arbitrary value.
Now that you know this core behaviour of triggers, you should refactor these lines:
declare @Code nvarchar(100);
select @Code=CODE from INSERTED;
So the full trigger code will be:
CREATE OR ALTER TRIGGER dbo.TRIG_AfterInsert ON dbo.B AFTER INSERT AS
BEGIN
UPDATE B
SET B.A_ROWID = A.ROWID
FROM B
JOIN INSERTED I ON I.ROWID = B.ROWID
JOIN A ON A.CODE = I.CODE;
END
Note that the inserted table is used directly in the join, which restricts the update to exactly the rows that were just inserted.
The WHERE clause of your statement has been converted into a JOIN.
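To see the once-per-statement behaviour, a single multi-row insert is enough. This is a sketch that assumes the set-based trigger above is in place, that table A holds the three rows from the question, and that ROWID in B is a plain column you can supply explicitly (if it is an identity column, drop it from the column list).

```sql
-- One statement, three rows: the trigger fires once and INSERTED holds all three rows.
-- Assumption: B.ROWID is not an identity column, so it can be inserted explicitly.
INSERT INTO dbo.B (ROWID, NAME, CODE)
VALUES (321, 'rolo', 1),
       (654, 'kit',  2),
       (987, 'gu',   3);

-- Each row should now carry the ROWID of the A row that has the same CODE.
SELECT ROWID, NAME, CODE, A_ROWID
FROM dbo.B
ORDER BY ROWID;
```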
I have three different tables here:
df_umts_relation table:
|---------------------|------------------|---------------------|------------------|------------------|
| cell_name | n_cell_name | technology | source_ops_num | target_ops_num |
|---------------------|------------------|---------------------|------------------|------------------|
| 121 | 221 | UMTS | 1 | |
| 122 | 222 | GSM | 2 | |
| 123 | 223 | UMTS | 3 | |
| 124 | 224 | GSM | 4 | |
| 125 | 225 | GSM | 5 | |
| 126 | 226 | UMTS | 6 | |
| 127 | 227 | UMTS | 7 | |
|---------------------|------------------|---------------------|------------------|------------------|
So now I want to update target_ops_num from the two tables below.
The df_umts_carrier table contains the two columns I want to work with, along with some integer values:
|---------------------|------------------|
| opsnum_umts | cell_name_umts |
|---------------------|------------------|
And I have another table called df_gsm_carrier:
|---------------------|------------------|
| opsnum_gsm | cellname |
|---------------------|------------------|
So all I need is to update [MyNewDatabase].[dbo].[df_umts_relation].[target_ops_num]: when technology is UMTS, update from df_umts_carrier on n_cell_name = cell_name_umts; when technology is GSM, update from df_gsm_carrier on n_cell_name = cellname.
So I tried the query below, which handles only one condition and updates only the UMTS rows:
UPDATE [MyNewDatabase].[dbo].[df_umts_relation]
SET [MyNewDatabase].[dbo].[df_umts_relation].[target_ops_num] = [MyNewDatabase].[dbo].[df_umts_carrier].[opsnum_umts]
FROM [MyNewDatabase].[dbo].[df_umts_relation]
INNER JOIN [MyNewDatabase].[dbo].[df_umts_carrier]
ON [n_cell_name] = [cell_name_umts]
It works fine but doesn't update the rows which contain GSM.
I also tried the following query to handle both cases, but it didn't update the GSM part and took a long time:
UPDATE [MyNewDatabase].[dbo].[df_umts_relation]
SET [MyNewDatabase].[dbo].[df_umts_relation].[target_ops_num] = (CASE WHEN [MyNewDatabase].[dbo].[df_umts_relation].[technology] = 'UMTS'
THEN [MyNewDatabase].[dbo].[df_umts_carrier].[opsnum_umts] ELSE [MyNewDatabase].[dbo].[df_gsm_carrier].[opsnum_gsm] END)
FROM [MyNewDatabase].[dbo].[df_umts_relation]
LEFT JOIN [MyNewDatabase].[dbo].[df_umts_carrier]
ON [n_cell_name] = [cell_name_umts]
LEFT JOIN [MyNewDatabase].[dbo].[df_gsm_carrier]
ON [n_cell_name] = [cell_name]
So does anyone have any idea how to solve this?
Please check if this will help.
-- take the ops number (not the cell name) from the carrier table that matches the technology
update dur
set target_ops_num = case when dur.technology = 'UMTS' then du.opsnum_umts
when dur.technology = 'GSM' then dg.opsnum_gsm
end
from df_umts_relation dur
left join df_umts_carrier du on dur.technology = 'UMTS' and dur.n_cell_name = du.cell_name_umts
left join df_gsm_carrier dg on dur.technology = 'GSM' and dur.n_cell_name = dg.cellname
Sorry for the title, perhaps it's not very clear.
I have some SQL queries in a script that depend on each other.
The script uses a temporary table in which the data is inserted (the #temp_data table).
This is the expected output:
___________________________________
| speed1 | speed2 | distance |
| 1 | NULL | 10 |
| 3 | NULL | 40 |
| 5 | NULL | 90 |
| NULL | 1 | 10 |
| NULL | 3 | 40 |
| NULL | 5 | 90 |
Here is the query structure (I didn't include the actual query since it's too big):
-- First group
queryForSpeed1
queryToUpdateDistanceBasedOnSpeed1
-- Second group
queryForSpeed2
queryToUpdateDistanceBasedOnSpeed2
If I run the first group of queries (queryForSpeed1 and queryToUpdateDistanceBasedOnSpeed1) separately from the second group then I get the expected output: only the speed1 and distance columns contain data:
___________________________________
| speed1 | speed2 | distance |
| 1 | NULL | 10 |
| 3 | NULL | 40 |
| 5 | NULL | 90 |
| NULL | NULL | NULL |
| NULL | NULL | NULL |
| NULL | NULL | NULL |
The same happens when I run the second group:
___________________________________
| speed1 | speed2 | distance |
| NULL | NULL | NULL |
| NULL | NULL | NULL |
| NULL | NULL | NULL |
| NULL | 1 | 10 |
| NULL | 2 | 40 |
| NULL | 3 | 90 |
BUT, when I run both groups: all the distances are NULL:
___________________________________
| speed1 | speed2 | distance |
| 1 | NULL | NULL |
| 3 | NULL | NULL |
| 5 | NULL | NULL |
| NULL | 1 | NULL |
| NULL | 2 | NULL |
| NULL | 3 | NULL |
I believe this is somehow related to transaction management and temporary tables, although I wasn't able to find anything relevant to solve the problem on Google.
From what I've read, SQL Server keeps a transaction log where it stores every update, insert and whatever... when it arrives at the end of the script it actually does all those insertions and updates.
So the update I did for the distance column finds all the speeds as being NULL because the data wasn't yet inserted in the temporary table from the previous updates, but at the end of the query the speeds are inserted in the table so that's why they are visible.
I played a bit with the GO statement to execute my script in batches, but no luck so far...
What am I doing wrong? Can someone point me in the right direction, please?
EDIT
Here is the actual query.
The problem is not related to transactions, but rather to the way you update #temp_speed_profile. The second pass through #temp_speed_profile retrieves all six records. Speed_new is NULL in the first record of a Voyage_Id, so @distance becomes NULL. Because you carry the value of @distance over to the next iteration, it remains NULL.
The problem goes away when you use different temporary tables, because the second pass then works on the second set of data only.
A note on cursors: when defining one, make sure to add LOCAL and FAST_FORWARD. LOCAL limits the cursor's scope, and FAST_FORWARD optimizes fetches.
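As a rough illustration of the cursor options, and of keeping rows from the other pass out of the loop, a declaration could look like the sketch below. The cursor name, the variable types and the running-sum placeholder are assumptions, since the actual query is not shown here; only #temp_speed_profile, Voyage_Id and Speed_new come from the description above.

```sql
-- Sketch only: types, cursor name and the calculation are hypothetical.
DECLARE @voyage_id int, @speed_new float, @distance float;

DECLARE speed_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT Voyage_Id, Speed_new
    FROM #temp_speed_profile
    WHERE Speed_new IS NOT NULL      -- skip rows that would turn @distance into NULL
    ORDER BY Voyage_Id;

OPEN speed_cursor;
FETCH NEXT FROM speed_cursor INTO @voyage_id, @speed_new;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- placeholder for the real distance calculation
    SET @distance = COALESCE(@distance, 0) + @speed_new;
    FETCH NEXT FROM speed_cursor INTO @voyage_id, @speed_new;
END

CLOSE speed_cursor;
DEALLOCATE speed_cursor;
```

Whether filtering on Speed_new or resetting @distance per Voyage_Id is the right fix depends on the real calculation; the point is that each pass should only see the rows it is meant to process.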
It is almost certainly caused by the way you have written your queries.
To confirm, just rewrite your queries using #temp_data1 and #temp_data2, rather than a single table #temp_data.
Given a table like below, is there a single-query way to update the table from this:
| id | type_id | created_at | sequence |
|----|---------|------------|----------|
| 1 | 1 | 2010-04-26 | NULL |
| 2 | 1 | 2010-04-27 | NULL |
| 3 | 2 | 2010-04-28 | NULL |
| 4 | 3 | 2010-04-28 | NULL |
To this (note that created_at is used for ordering, and sequence is "grouped" by type_id):
| id | type_id | created_at | sequence |
|----|---------|------------|----------|
| 1 | 1 | 2010-04-26 | 1 |
| 2 | 1 | 2010-04-27 | 2 |
| 3 | 2 | 2010-04-28 | 1 |
| 4 | 3 | 2010-04-28 | 1 |
I've seen some code before that used an @ variable like the following, which I thought might work:
SET @seq = 0;
UPDATE `log` SET `sequence` = @seq := @seq + 1
ORDER BY `created_at`;
But that obviously doesn't reset the sequence to 1 for each type_id.
If there's no single-query way to do this, what's the most efficient way?
Data in this table may be deleted, so I'm planning to run a stored procedure after the user is done editing to re-sequence the table.
You can use another variable storing the previous type_id (@type_id). The query is ordered by type_id, so whenever there is a change in type_id, sequence has to be reset to 1 again.
Set @seq = 0;
Set @type_id = -1;
Update `log`
Set `sequence` = If(@type_id=(@type_id:=`type_id`), (@seq:=@seq+1), (@seq:=1))
Order By `type_id`, `created_at`;
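I don't know MySQL very well, but you could use a subquery, though it may be very slow. MySQL does not let a subquery read directly from the table being updated (error 1093), so the count goes through a joined derived table:
UPDATE `log`
JOIN (
select l1.id, count(l2.id) + 1 as seq
from `log` as l1
left join `log` as l2 on l2.type_id = l1.type_id and l2.created_at < l1.created_at
group by l1.id
) as counted on counted.id = `log`.id
SET `log`.`sequence` = counted.seq
You'll get duplicate sequences, though, if two rows with the same type_id have the same created_at date.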
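On MySQL 8.0 or later (an assumption, since the question does not state a version), a window function avoids both the user variables and the duplicate-sequence problem. The sketch below numbers rows per type_id and uses id as a tie-break for equal created_at values; the alias names are illustrative.

```sql
-- Sketch for MySQL 8.0+: number rows per type_id, then write the numbers back.
-- The derived table is materialized, so reading `log` while updating it is allowed here.
UPDATE `log` AS l
JOIN (
    SELECT `id`,
           ROW_NUMBER() OVER (PARTITION BY `type_id`
                              ORDER BY `created_at`, `id`) AS seq
    FROM `log`
) AS ranked
  ON ranked.`id` = l.`id`
SET l.`sequence` = ranked.seq;
```

Because the statement recomputes every sequence in one pass, it can be run as-is from the re-sequencing stored procedure after rows have been deleted.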