Updates which include auto increment column - sql

I am using PostgreSQL and I would like to update a table which would include an auto number column id. I know it might something very simple but I have done something like this;
CREATE SEQUENCE SEQ_ID
MINVALUE 1
START WITH 1
INCREMENT BY 1
CACHE 10
update trk2
set (id, track_id, track_point) =
(select nextval('seq_id'), trk1.track_fid, trk1.wkb_geometry
from track_points_1 as trk1
where track_fid = 0)
The thing is that it is giving me the following error (code reformatted):
ERROR: syntax error at or near "select"
LINE 3: set (id, track_id, track_point)=(select nextval('seq_id'), t...
Can anybody help please?

Your query is broken in several places. It could look something like this:
UPDATE trk2
SET (id, track_id, track_point) =
(next_id, x.track_fid, x.wkb_geometry)
FROM (
SELECT id
,nextval('seq_id') AS next_id
,track_fid
,wkb_geometry
FROM track_points_1
WHERE track_fid = 0
) x
WHERE trk2.id = x.id; -- adapt to your case
Or simpler (preferable syntax):
UPDATE trk2
SET (id, track_id, track_point) =
(nextval('seq_id'), x.track_fid, x.wkb_geometry)
FROM track_points_1 x
WHERE x.track_fid = 0
AND trk2.id = x.id; -- adapt to your case
Major points:
An UPDATE without a WHERE clause only makes sense if you really need to change each and every row in the table. Else it is wrong or at least sub-optimal.
When you retrieve values from another table, you get a CROSS JOIN between target and source if you don't add a WHERE clause connecting target with source - meaning that every row of the target table will be updated with every row in the source table. This can take a very long time and lead to arbitrary results. The last UPDATE wins. In short: this is almost always complete nonsense and extremely expensive at that.
In my example I link target and source by the id column. You have to replace that with whatever fits in your case.
You can assign multiple values in one SET clause in an UPDATE, but you can only return a single value from a correlated subselect expression. Therefore, it is strictly not possible to have a subselect in a SET clause with multiple values.
Your initial syntax error comes from a missing pair of parenthesis around your subselect. But adding that only reveals the error mentioned above.
Depending on what you are after, you would include nextval('seq_id') in the subquery or in the SET clause directly. This can lead to very different results, especially when you have rows in the subquery that are not used in the UPDATE.
I placed it in the SET clause because I suspect, that's what you want. The sequence of rows is still arbitrary. If you want more control over which numbers are assigned, you need to define what you want and then take a different route.

Tyler Eaves is correct, this action is not advisable.
However, if you insist, this may help:
Once you have a sequence on a column you don't have to call nextval() in the SET statement. Just leave it out and the column will auto-increment.
UPDATE
trk2
SET
(
track_id = [track_id Value],
track_point = [track_point Value]
)
or, include it and set it to default or null
UPDATE
trk2
SET
(
id = default,
track_id = [track_id Value],
track_point = [track_point Value]
)
Using your example:
UPDATE
trk2
SET
(
track_id,
track_point
) = (
SELECT
trk1.track_fid,
trk1.wkb_geometry
FROM
track_points_1 AS trk1
WHERE
track_fid = 0
)

I think you need a second set of parentheses on the RHS of the SET clause:
UPDATE trk2
SET (id, track_id, track_point) =
((SELECT nextval('seq_id'), trk1.track_fid, trk1.wkb_geometry
FROM track_points_1 as trk1
WHERE track_fid = 0));
The first set of parentheses matches the parentheses on the LHS of the SET clause. The second set wraps up the sub-query for re-use.

Related

Merging deltas with duplicate keys

I'm trying to perform a merge into a target table in our Snowflake instance where the source data contains change data with a field denoting the at source DML operation i.e I=Insert,U=Update,D=Delete.
The problem is dealing with the fact the log (deltas) source might contain multiple updates for the same record. The merge I've constructed bombs out complaining about duplicate keys.
I'm struggling to think of a solution without going the likes of GROUP BY and MAX on the updates. I've done a similar setup with Oracle and the AND clause on the MATCH was enough.
MERGE INTO "DB"."SCHEMA"."TABLE" t
USING (
SELECT * FROM "DB"."SCHEMA"."TABLE_LOG"
ORDER BY RECORD_TIMESTAMP ASC
) s ON t.RECORD_KEY = s.RECORD_KEY
WHEN MATCHED AND s.RECORD_OPERATION = 'D' THEN DELETE
WHEN MATCHED AND s.RECORD_OPERATION = 'U' THEN UPDATE
SET t.ID=COALESCE(s.ID,t.ID),
t.CREATED_AT=COALESCE(s.CREATED_AT,t.CREATED_AT),
t.PRODUCT=COALESCE(s.PRODUCT,t.PRODUCT),
t.SHOP_ID=COALESCE(s.SHOP_ID,t.SHOP_ID),
t.UPDATED_AT=COALESCE(s.UPDATED_AT,t.UPDATED_AT)
WHEN NOT MATCHED AND s.RECORD_OPERATION = 'I' THEN
INSERT (RECORD_KEY, ID, CREATED_AT, PRODUCT,
SHOP_ID, UPDATED_AT)
VALUES (s.RECORD_KEY, s.ID, s.CREATED_AT, s.PRODUCT,
s.SHOP_ID, s.UPDATED_AT);
Is there a way to rewrite the above merge so that it works as is?
The Snowflake docs show the ability for the AND case predicate during the match clause, it sounds like you tried this and it's not working because of the duplicates, right?
https://docs.snowflake.net/manuals/sql-reference/sql/merge.html#matchedclause-for-updates-or-deletes
There is even an example there which is using the AND command:
merge into t1 using t2 on t1.t1key = t2.t2key
when matched and t2.marked = 1 then delete
when matched and t2.isnewstatus = 1 then update set val = t2.newval, status = t2.newstatus
when matched then update set val = t2.newval
when not matched then insert (val, status) values (t2.newval, t2.newstatus);
I think you are going to have to get the "last record" per key and use that as your update, or process these serially which will be pretty slow...
Another thing to look at would be to try to see if you can apply the last_value( ) function to each column, where you order by your timestamp and partition over your key. If you do that in your inline view, that might work.
I hope this helps, I have a feeling it won't help much...Rich
UPDATE:
I found the following: https://docs.snowflake.net/manuals/sql-reference/parameters.html#error-on-nondeterministic-merge
If you run the following command before your merge, I think you'll be OK (testing required of course):
ALTER SESSION SET ERROR_ON_NONDETERMINISTIC_MERGE=false;

sql server if statement not working

Can anyone advise me as to what is wrong with the following SQL server update statement:
IF (SELECT * FROM TBL_SystemParameter WHERE code='SOUND_WRONG_GARMENT') = ''
GO
UPDATE TBL_SystemParameter
SET [Value] = 'Ping.wav'
WHERE ID = (SELECT ID
FROM TBL_SystemParameter
WHERE code = 'SOUND_WRONG_GARMENT')
You don't need an if statement - you can just run the update statement, and if the subquery returns no rows, no rows will be updated. The if won't really save anything - you're performing two queries instead of one.
You either want
UPDATE TBL_SystemParameter
SET [Value] = 'Ping.wav'
WHERE ID In (SELECT ID
FROM TBL_SystemParameter
WHERE code = 'SOUND_WRONG_GARMENT')
if there are multiple ID's with that code OR use
UPDATE TBL_SystemParameter
SET [Value] = 'Ping.wav'
WHERE code = 'SOUND_WRONG_GARMENT'
either way and lose the IF statement as #Mureinik said.
Although Mureinik's answer is the logical solution to this, I will answer why this isn't actually working. Your condition is wrong, and this approach will work instead using IF EXISTS:
IF EXISTS (SELECT * FROM TBL_SystemParameter WHERE code='SOUND_WRONG_GARMENT')
BEGIN
UPDATE TBL_SystemParameter
SET [Value] = 'Ping.wav'
WHERE ID IN (SELECT ID
FROM TBL_SystemParameter
WHERE code = 'SOUND_WRONG_GARMENT')
END
As a side note, you're using an = sign instead of IN, which means you'll be matching to an arbitrary singular ID and only update 1 row based on this. To use a set based operation, use the IN clause.
You could actually 'golf' this by doing away with the derived query altogether, and using a simple WHERE code='SOUND_WRONG_GARMENT' on the table you're updating on.

How to update a PostgreSQL table with a count of duplicate items

I found two bugs in a program that created a lot of duplicate values:
an 'index' was created instead of a 'unique index'
a duplication checks wasn't integrated in one of 4 twisted routines
So I need to go in and clean up my database.
Step one is to decorate the table with a count of all the duplicate values (next I'll look into finding the first value, and then migrating everything over )
The code below works, I just recall doing a similar "update from select count" on the same table years ago, and I did it in half as much code.
Is there a better way to write this?
UPDATE
shared_link
SET
is_duplicate_of_count = subquery.is_duplicate_of_count
FROM
(
SELECT
count(url) AS is_duplicate_of_count
, url
FROM
shared_link
WHERE
shared_link.url = url
GROUP BY
url
) AS subquery
WHERE
shared_link.url = subquery.url
;
You query is fine, generally, except for the pointless (but also harmless) WHERE clause in the subquery:
UPDATE shared_link
SET is_duplicate_of_count = subquery.is_duplicate_of_count
FROM (
SELECT url
, count(url) AS is_duplicate_of_count
FROM shared_link
-- WHERE shared_link.url = url
GROUP BY url
) AS subquery
WHERE shared_link.url = subquery.url;
The commented clause is the same as
WHERE shared_link.url = shared_link.url
and therefore only eliminating NULL values (because NULL = NULL is not TRUE), which is most probably neither intended nor needed in your setup.
Other than that you can only shorten your code further with aliases and shorter names:
UPDATE shared_link s
SET ct = u.ct
FROM (
SELECT url, count(url) AS ct
FROM shared_link
GROUP BY 1
) AS u
WHERE s.url = u.url;
In PostgreSQL 9.1 or later you might be able to do the whole operation (identify dupes, consolidate data, remove dupes) in one SQL statement with aggregate and window functions and data-modifying CTEs - thereby eliminating the need for an additional column to begin with.

Guidance on using the WITH clause in SQL

I understand how to use the WITH clause for recursive queries (!!), but I'm having problems understanding its general use / power.
For example the following query updates one record whose id is determined by using a subquery returning the id of the first record by timestamp:
update global.prospect psp
set status=status||'*'
where psp.psp_id=(
select p2.psp_id
from global.prospect p2
where p2.status='new' or p2.status='reset'
order by p2.request_ts
limit 1 )
returning psp.*;
Would this be a good candidate for using a WITH wrapper instead of the relatively ugly sub-query? If so, why?
If there can be concurrent write access to involved tables, there are race conditions in the following queries. Consider:
Postgres UPDATE … LIMIT 1
Your example can use a CTE (common table expression), but it will give you nothing a subquery couldn't do:
WITH x AS (
SELECT psp_id
FROM global.prospect
WHERE status IN ('new', 'reset')
ORDER BY request_ts
LIMIT 1
)
UPDATE global.prospect psp
SET status = status || '*'
FROM x
WHERE psp.psp_id = x.psp_id
RETURNING psp.*;
The returned row will be the updated version.
If you want to insert the returned row into another table, that's where a WITH clause becomes essential:
WITH x AS (
SELECT psp_id
FROM global.prospect
WHERE status IN ('new', 'reset')
ORDER BY request_ts
LIMIT 1
)
, y AS (
UPDATE global.prospect psp
SET status = status || '*'
FROM x
WHERE psp.psp_id = x.psp_id
RETURNING psp.*
)
INSERT INTO z
SELECT *
FROM y;
Data-modifying queries using CTEs were added with PostgreSQL 9.1.
The manual about WITH queries (CTEs).
WITH lets you define "temporary tables" for use in a SELECT query. For example, I recently wrote a query like this, to calculate changes between two sets:
-- Let o be the set of old things, and n be the set of new things.
WITH o AS (SELECT * FROM things(OLD)),
n AS (SELECT * FROM things(NEW))
-- Select both the set of things whose value changed,
-- and the set of things in the old set but not in the new set.
SELECT o.key, n.value
FROM o
LEFT JOIN n ON o.key = n.key
WHERE o.value IS DISTINCT FROM n.value
UNION ALL
-- Select the set of things in the new set but not in the old set.
SELECT n.key, n.value
FROM o
RIGHT JOIN n ON o.key = n.key
WHERE o.key IS NULL;
By defining the "tables" o and n at the top, I was able to avoid repeating the expressions things(OLD) and things(NEW).
Sure, we could probably eliminate the UNION ALL using a FULL JOIN, but I wasn't able to do that in my particular case.
If I understand your query correctly, it does this:
Find the oldest row in global.prospect whose status is 'new' or 'reset'.
Mark it by adding an asterisk to its status
Return the row (including our tweak to status).
I don't think WITH will simplify anything in your case. It may be slightly more elegant to use a FROM clause, though:
update global.prospect psp
set status = status || '*'
from ( select psp_id
from global.prospect
where status = 'new' or status = 'reset'
order by request_ts
limit 1
) p2
where psp.psp_id = p2.psp_id
returning psp.*;
Untested. Let me know if it works.
It's pretty much exactly what you have already, except:
This can be easily extended to update multiple rows. In your version, which uses a subquery expression, the query would fail if the subquery were changed to yield multiple rows.
I did not alias global.prospect in the subquery, so it's a bit easier to read. Since this uses a FROM clause, you'll get an error if you accidentally reference the table being updated.
In your version, the subquery expression is encountered for every single item. Although PostgreSQL should optimize this and only evaluate the expression once, this optimization will go away if you accidentally reference a column in psp or add a volatile expression.

Updating a table within a select statement

Is there any way to update a table within the select_expr part of a mysql select query. Here is an example of what I am trying to achieve:
SELECT id, name, (UPDATE tbl2 SET currname = tbl.name WHERE tbl2.id = tbl.id) FROM tbl;
This gives me an error in mysql, but I dont see why this shouldn't be possible as long as I am not changing tbl.
Edit:
I will clarify why I cant use an ordinary construct for this.
Here is the more complex example of the problem which I am working on:
SELECT id, (SELECT #var = col1 FROM tbl2), #var := #var+1,
(UPDATE tbl2 SET col1 = #var) FROM tbl WHERE ...
So I am basically in a situation where I am incrementing a variable during the select statement and want to reflect this change as I am selecting the rows as I am using the value of this variable during the execution. The example given here can probably be implemented with other means, but the real example, which I wont post here due to there being too much unnecessary code, needs this functionality.
If your goal is to update tbl2 every time you query tbl1, then the best way to do that is to create a stored procedure to do it and wrap it in a transaction, possibly changing isolation levels if atomicity is needed.
You can't nest updates in selects.
What results do you want? The results of the select, or of the update.
If you want to update based on the results of a query you can do it like this:
update table1 set value1 = x.value1 from (select value1, id from table2 where value1 = something) as x where id = x.id
START TRANSACTION;
-- Let's get the current value
SELECT value FROM counters WHERE id = 1 FOR UPDATE;
-- Increment the counter
UPDATE counters SET value = value + 1 WHERE id = 1;
COMMIT;