How to copy, change, and insert records in Postgres - sql

In a PostgreSQL DB table, I need to copy a block of records from a prior month, change values in some of the columns, and append the updated records to the table. Details include:
The key id is configured with nextval to automatically create unique key values
The target records have '200814' in group_tag
The new records need '200911' in group_tag
Several other fields need to be updated as shown in the SELECT
My script so far:
INSERT INTO hist.group_control(
id,
group_tag,
process_sequence,
state,
cbsa_code,
window_starts_on,
preceding_group,
preceding_origin,
preceding_window_starts_on
)
SELECT id,
'200911',
1,
state,
cbsa_code,
'2020-09-11',
'200814',
preceding_origin,
'2020-08-14'
FROM hist.group_control WHERE group_tag='200814';
This generates an error:
SQL Error [23505]: ERROR: duplicate key value violates unique constraint "group_control_pkey"
Detail: Key (id)=(12250) already exists.
Records with key values up to 13008 exist. I would have expected nextval to determine this and start the id value at 13009. I attempted to simply not include id in the statement thinking the nextval function would operate automatically, but that errored as well. Variations on the following have not worked due to the respective errors:
alter sequence group_control_id_seq restart with 13009;
SQL Error [42501]: ERROR: must be owner of relation group_control_id_seq
SELECT setval('group_control_id_seq', 13009, true);
SQL Error [42501]: ERROR: permission denied for sequence group_control_id_seq
Does anyone know how to code the main statement so that it does not generate the duplicate key or, alternatively, how to tell nextval to start at a value of 13009?

It appears your id column is defined as serial, bigserial, or generated by default. Any of these assigns the id only when the column is not specified in the insert statement. If you specify the id column, Postgres will not generate a key value. Since you selected the id, Postgres attempted to use the value you specified, which already exists. Solution: drop id from the insert statement.
INSERT INTO hist.group_control(
group_tag,
process_sequence,
state,
cbsa_code,
window_starts_on,
preceding_group,
preceding_origin,
preceding_window_starts_on
)
SELECT '200911',
1,
state,
cbsa_code,
'2020-09-11',
'200814',
preceding_origin,
'2020-08-14'
FROM hist.group_control WHERE group_tag='200814';
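The same idea can be tried out in a small runnable sketch. It uses Python's built-in sqlite3 module as a stand-in for Postgres, so the AUTOINCREMENT column only approximates a serial column, and the table is cut down to a few columns. The sample rows and the '200717' value are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the Postgres table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE group_control (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        group_tag TEXT,
        state TEXT,
        preceding_group TEXT
    )
""")

# Hypothetical prior-month rows; ids 1 and 2 are generated automatically.
conn.executemany(
    "INSERT INTO group_control (group_tag, state, preceding_group) VALUES (?, ?, ?)",
    [("200814", "TX", "200717"), ("200814", "CA", "200717")],
)

# Copy the prior month's rows with changed values.
# id is left out of both the column list and the SELECT, so new keys are generated.
conn.execute("""
    INSERT INTO group_control (group_tag, state, preceding_group)
    SELECT '200911', state, '200814'
    FROM group_control
    WHERE group_tag = '200814'
""")

rows = conn.execute(
    "SELECT id, group_tag, state, preceding_group FROM group_control ORDER BY id"
).fetchall()
for row in rows:
    print(row)
```

If omitting id still fails on the real table, the error text is worth reading closely, since it may point at sequence permissions rather than at the statement itself.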

How to upsert when using data from a sub-query (Postgres)

I have two tables:
assignments {receptacleId, assignedCarrier}
rls_permissions {receptacleId, rlsUserId}
An assignment in this context is any receptacle to airline carrier relationship.
Whenever a new assignment comes into the assignments table, I'd like to upsert (insert if new row or update if it's an existing receptacle being assigned to a new airline carrier) my rls_permissions table.
The issue I'm having with upsert, specifically ON CONFLICT ON CONSTRAINT, is that my insert statement contains a sub-query for the data to be inserted, and therefore I'm not sure how to write the DO UPDATE SET part of the statement.
I've tried using 'excluded' to single out the assignedCarrier that I want to update based on the previous conflict; however, I keep receiving "ERROR: column excluded.receptacleId does not exist"
My pkey looks like this:
CREATE UNIQUE INDEX rls_permissions_pkey ON rls_permissions("receptacleId" text_ops);
Dummy data could be:
receptacleID assignedCarrier
aaaaaaaaaa00 AA
Where AA is "American Airlines"
INSERT INTO rls_permissions ("receptacleId","rlsUserId")
SELECT DISTINCT assignments."receptacleId", assignments."assignedCarrier"
FROM assignments
ON CONFLICT ON CONSTRAINT rls_permissions_pkey
DO UPDATE SET "rlsUserId" = (SELECT DISTINCT assignments."assignedCarrier"
FROM assignments
WHERE assignments."receptacleId" = excluded."receptacleId");
The expected result is that if there is no conflict, the data returned from the sub-query is inserted into a new row on the permissions table.
If there is a conflict, I'd like to update ONLY the newly assigned carrier, and not update or insert a new line since that receptacle already exists.
You don't need a subquery in the UPDATE part. You can access the values for the INSERT part through the excluded keyword.
INSERT INTO rls_permissions ("receptacleId","rlsUserId")
SELECT DISTINCT assignments."receptacleId", assignments."assignedCarrier"
FROM assignments
ON CONFLICT ON CONSTRAINT rls_permissions_pkey
DO UPDATE SET "rlsUserId" = excluded."rlsUserId";
excluded."rlsUserId" refers to the value that would have been inserted into the column rlsUserId, and thus it is the value retrieved through assignments."assignedCarrier" from your SELECT statement.
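A minimal runnable sketch of this upsert, using Python's built-in sqlite3 module as a stand-in for Postgres. Some details are assumptions that differ from the question: SQLite names the conflict target by column rather than ON CONSTRAINT, it needs a WHERE clause (here WHERE true) to parse INSERT ... SELECT ... ON CONFLICT unambiguously, and the 'UA' and 'DL' rows are invented sample data.

```python
import sqlite3

# In-memory SQLite database standing in for Postgres (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE assignments (receptacleId TEXT, assignedCarrier TEXT);
    CREATE TABLE rls_permissions (receptacleId TEXT PRIMARY KEY, rlsUserId TEXT);

    -- Hypothetical sample data: one receptacle already has a permission row.
    INSERT INTO assignments VALUES ('aaaaaaaaaa00', 'AA'), ('bbbbbbbbbb00', 'UA');
    INSERT INTO rls_permissions VALUES ('aaaaaaaaaa00', 'DL');
""")

# Insert new receptacles; on a key conflict, overwrite only rlsUserId with
# the value that the SELECT tried to insert (available as excluded.rlsUserId).
conn.execute("""
    INSERT INTO rls_permissions (receptacleId, rlsUserId)
    SELECT DISTINCT receptacleId, assignedCarrier
    FROM assignments
    WHERE true
    ON CONFLICT (receptacleId)
    DO UPDATE SET rlsUserId = excluded.rlsUserId
""")

permissions = dict(
    conn.execute("SELECT receptacleId, rlsUserId FROM rls_permissions")
)
print(permissions)
```

The existing row keeps its key and only its carrier changes, while the unseen receptacle is inserted as a new row.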

Insert statement with no joins results in duplicates where no duplicates existed previously?

I am having an issue with some SQL that is producing results I wouldn't expect. I am storing information from a variety of tables in another table, which is used as part of a search page on a website. All of the page data for each page, along with data from other elements on other pages (like calendars, etc.), is referenced in a table called pageContentCache. This table normally has a primary key created with the following:
alter table pageContentCache add
constraint [IX_pageContentCache] PRIMARY KEY CLUSTERED (
[objectId]
)
An issue has started occurring with one instance of this software that appears to be caused by a duplicate objectId, resulting in the following error:
Msg 1505, Level 16, State 1 Procedure sp_rebuildPageContentCache, Line 50
The CREATE UNIQUE INDEX statement terminated because a duplicate key was found for the object name 'dbo.pageContentCache' and the index name 'IX_pageContentCache'. The duplicate key value is (21912).
So, to debug the issue, I had the procedure first load all of the data it was going to insert into the pageContentCache table into a temporary table, #contentcache, so I could have a look through it.
This is where I'm starting to get a little confused...
Once the data has been inserted into #contentcache (which has two columns, objectId and content), I can run the following SQL statement and it will return nothing:
select objectId, count(objectId) from #contentcache
group by objectId having count(objectId) > 1
This returns no records. If I then run the following SQL:
insert into pageContentCache (objectId, contentData)
select objectId, content
from #contentcache
This inserts all of the data from #contentcache into pageContentCache as you'd expect. However, if I then run the following SQL, it returns duplicates:
select objectId, count(objectId) from pageContentCache
group by objectId having count(objectId) > 1
This then returns duplicates:
objectId (no column name)
21912 2
There are no triggers or anything like that associated with this table and the insert statement is merely copying the data from one table to another, so... where is this duplicate coming from?
Try the following:
insert into pageContentCache (objectId, contentData)
select distinct objectId, content
from #contentcache
Can't see why you would have duplicates since, as you mentioned, there are no joins in your select statement. Anyways, my guess is that the distinct keyword will ensure that the duplicates are eliminated.
This is a SQL Server database error I have seen before. You may want to apply the latest service pack and retry.
I am not so sure that this statement does what you think it does:
select objectId, count(objectId) from #contentcache
group by objectId having count(objectId) > 1
Can you try this instead:
WITH SUBQUERY AS
( select
COUNT(objectId) OVER (PARTITION BY objectId) AS CNT_OBJECT_IDS,
objectId
FROM #contentcache)
SELECT * FROM SUBQUERY WHERE CNT_OBJECT_IDS > 1
See if this gets you any rows back.
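To compare the two duplicate checks side by side, here is a small sketch using Python's built-in sqlite3 module rather than SQL Server. That swap is an assumption, as are the table name without the # prefix and the invented sample rows. On the same data, both queries should report the same keys, so if the GROUP BY version returns nothing, the windowed version is unlikely to differ.

```python
import sqlite3

# In-memory SQLite table standing in for the #contentcache temp table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contentcache (objectId INTEGER, content TEXT)")

# Hypothetical rows: objectId 21912 appears twice with different content.
conn.executemany(
    "INSERT INTO contentcache VALUES (?, ?)",
    [(21911, "a"), (21912, "b"), (21912, "c"), (21913, "d")],
)

# Check 1: the GROUP BY / HAVING query from the question.
grouped = conn.execute("""
    SELECT objectId, COUNT(objectId)
    FROM contentcache
    GROUP BY objectId
    HAVING COUNT(objectId) > 1
""").fetchall()

# Check 2: the windowed count, collapsed to one row per key for comparison.
windowed = conn.execute("""
    WITH counted AS (
        SELECT objectId,
               COUNT(objectId) OVER (PARTITION BY objectId) AS cnt
        FROM contentcache
    )
    SELECT DISTINCT objectId, cnt FROM counted WHERE cnt > 1
""").fetchall()

print(grouped, windowed)
```

In this sample the two rows for 21912 differ in content, so a SELECT DISTINCT over both columns would not collapse them into one row.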
Also, I've never worked with clustered indexes before, and I am wondering if they do some additional things that we are not aware of. Can you try just saying
PRIMARY KEY
instead of
PRIMARY KEY CLUSTERED
in your constraint definition and see if that affects your problem at all?

error: column "user_account_id_seq" does not exist after insert in postgresql

I am trying to retrieve the ID value of an inserted row using node.js' pg module. The query is:
INSERT INTO "MySchema"."USER_ACCOUNT"
("LANG","NAME","EMAIL","EMAIL_CONF","STATUS","STATUS_UPDATE","CREATION","PREFERENCES")
VALUES ($1,$2,$3,$4,$5,$6,$7,$8)
RETURNING USER_ACCOUNT_ID_seq;
But I get the following error message:
Cannot insert user account { [error: column "user_account_id_seq" does not exist]...
I don't provide an id, because I am letting the database set the next sequence value.
How can I retrieve this ID value after the insert? Thanks!
Although the ID is generated by the sequence, you could retrieve it from the linked column, probably called "ID" in this case, e.g. ... RETURNING "ID";.
See also: "How do I get the value of a SERIAL insert?"

Datastage Job terminating due to the following error

I'm running a DataStage job with input from DB2 and output to DB2. The input side has a query containing joins and functions.
I'm getting the following warning message:
TRN_HEALTH_INSURANCE_DETAIL,
2: STATEMENT
INSERT
INTO
HEALTH_INSURANCE_DETAIL
(
RISK_DETAIL_ID,
RISK_COVER_ID,
RD_POLICY_SYSTEM_NO,
RD_POLICY_END_NO_IDX,
RD_POLICY_ID,
RD_LEVEL1_ID,
RD_SUM_INSURED_AMT_LC,
RD_PREMIUM_AMT_LC,
PREMIUM_AMOUNT_FC,
SUM_INSURED_AMT_FC,
RD_REC_TYPE,
RD_EFFECT_FROM_DT,
RD_EFFECT_TO_DT,
RD_END_EFFECT_FROM_DT,
SEX_MAS_CD,
MARITAL_STATUS_CD,
EMP_CATG,
NO_OF_DEPENDENTS,
EMP_AL_NO,
DOB,
EFF_DATE,
EFF_DATE2,
NAME,
RELATIONSHIP_CD_S,
RELATIONSHIP_CD,
DESIGNATION,
BRANCH,
BANK_ACCOUNT,
BANK_BRANCH_NAME,
PRE_EXISTING_AILMENT,
AUTHORITY_LETTER_NO,
AGE,
REGION,
CNIC,
CO_CODE,
EMP_LOCATION,
SUB_LOCATION,
CLH_SYSTEM_NO,
CTH_SYS_ID,
CTH_POL_SYS_ID,
CTH_END_NO_IDX,
CTH_END_SR_NO,
CTH_CATEGORY,
CLD_SYS_ID,
CLDH_SYS_ID,
CLD_COVER_CD,
CLD_END_IDX,
CLD_COVER_DESC,
CLD_CLM_TYPE_LIMIT,
CLD_CLM_REL,
CLD_CLM_AGE_FROM,
CLD_CLM_AGE_TO,
CLD_CLM_RB_LIMIT,
CLD_CATEGORY_LIMIT_FC,
CLD_CATEGORY_PREM_FC
)
VALUES(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?) failed to run.
I can't see such records in my data, and the data quality is good. So what are these question marks? I searched a bit and found a suggestion to set the array size and row count to 1 instead of the default 2000, but I'm still getting the same warning.
There are a lot of errors following this warning. The next error is also interesting:
TRN_HEALTH_INSURANCE_DETAIL,2: SQLExecute reported: SQLSTATE = 23505: Native Error Code = -803: Msg = [IBM][CLI Driver][DB2/NT64] SQL0803N One or more values in the INSERT statement, UPDATE statement, or foreign key update caused by a DELETE statement are not valid because the primary key, unique constraint or unique index identified by "1" constrains table "DB2ADMIN.HEALTH_INSURANCE_DETAIL" from having duplicate values for the index key. SQLSTATE=23505 (CC_DB2DBStatement::executeInsert, file CC_DB2DBStatement.cpp, line 1,095)
I believe the errors are due to the first warning. Kindly help me out.
Regards, Nuh
Add a Copy stage before the DB2 connector, with one link going to DB2 and the other to a data set file, so you can inspect the data being written. The question marks in the VALUES clause are parameter markers of the prepared statement, not data values. The actual problem appears to be a duplicate value for the primary key or a unique index: either the data you want to insert contains duplicates, or the table already holds a record that you are trying to insert again.

MERGE INTO table containing AUTO_INCREMENT columns

I've declared the following table for use by audit triggers:
CREATE TABLE audit_transaction_ids (id IDENTITY PRIMARY KEY, uuid VARCHAR UNIQUE NOT NULL, `time` TIMESTAMP NOT NULL);
The trigger will get invoked multiple times in the same transaction.
The first time the trigger is invoked, I want it to insert a new row with the current TRANSACTION_ID() and time.
The subsequent times the trigger is invoked, I want it to return the existing "id" (I invoke Statement.getGeneratedKeys() to that end) without altering "uuid" or "time".
The current schema seems to have two problems.
When I invoke MERGE INTO audit_transaction_ids (uuid, time) KEY(id) VALUES(TRANSACTION_ID(), NOW()) I get: org.h2.jdbc.JdbcSQLException: Column "ID" contains null values; SQL statement: MERGE INTO audit_transaction_ids (uuid, time) KEY(id) VALUES (TRANSACTION_ID(), NOW()) [90081-155]
I suspect that invoking MERGE on an existing row will alter "time".
How do I fix both these problems?
MERGE is analogous to java.util.Map.put(key, value): it will insert the row if it doesn't exist, and update the row if it does. That being said, you can still merge into a table containing AUTO_INCREMENT columns so long as you use another column as the key.
Given customer[id identity, email varchar(30), count int] you could merge into customer(id, email, count) key(email) values((select max(id) from customer c2 where c2.email='test#acme.com'), 'test#acme.com', 10). Meaning, re-use the id if a record exists, use null otherwise.
See also https://stackoverflow.com/a/18819879/14731 for a portable way to insert-or-update depending on whether a row already exists.
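For the "insert once, then return the existing id without touching time" behaviour from the question, one possible sketch is shown below. It does not use H2's MERGE at all. It swaps in SQLite's INSERT OR IGNORE via Python's built-in sqlite3 module, keyed on the unique uuid column. The helper name get_or_create_id and the sample values are hypothetical.

```python
import sqlite3

# In-memory SQLite table standing in for the H2 audit table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE audit_transaction_ids (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        uuid TEXT UNIQUE NOT NULL,
        time TEXT NOT NULL
    )
""")

def get_or_create_id(conn, uuid, now):
    """Hypothetical helper: insert the row once, then always return (id, time)."""
    # The UNIQUE constraint on uuid makes a repeated insert a no-op,
    # so the stored time of an existing row is never overwritten.
    conn.execute(
        "INSERT OR IGNORE INTO audit_transaction_ids (uuid, time) VALUES (?, ?)",
        (uuid, now),
    )
    return conn.execute(
        "SELECT id, time FROM audit_transaction_ids WHERE uuid = ?", (uuid,)
    ).fetchone()

first = get_or_create_id(conn, "tx-1", "2011-05-27 10:00:00")
second = get_or_create_id(conn, "tx-1", "2011-05-27 10:00:05")
print(first, second)
```

Both calls return the same id and the original time, because the second insert is skipped rather than turned into an update.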
1. MERGE INTO audit_transaction_ids (uuid, time) KEY(id) VALUES(TRANSACTION_ID(), NOW())
If you just want to insert a new row, use:
INSERT INTO audit_transaction_ids (uuid, time) VALUES(TRANSACTION_ID(), NOW())
MERGE without setting the value for the column ID doesn't make sense if ID is used as the key, because that way it could never (even in theory) update an existing row. What you could do is use another key column (in the case above there is no column that could be used). See the documentation for MERGE for details.
2. Invoking MERGE on an existing row will alter "time"
I'm not sure if you are talking about the fact that the value of the column 'time' is altered. This is the expected behavior if you use MERGE ... VALUES(.., NOW()), because the MERGE statement is supposed to update that column.
Or maybe you mean that older versions of H2 returned different values within the same transaction (unlike most other databases, which return the same value within the same transaction). This is true, however with H2 version 1.3.155 (2011-05-27) and later, this incompatibility is fixed. See also the change log: "CURRENT_TIMESTAMP() and so on now return the same value within a transaction." It looks like this is not the problem in your case, because you do seem to use version 1.3.155 (the error message [90081-155] includes the build / version number).
Short Answer:
MERGE INTO AUDIT_TRANSACTION_IDS (uuid, time) KEY (uuid, time)
VALUES (TRANSACTION_ID(), NOW());
A small performance tip: make sure uuid is indexed.
Long Answer:
MERGE is basically an UPDATE which INSERTs when no record is found to be updated.
Wikipedia gives a more concise, standardized syntax of MERGE, but you have to supply your own update and insert.
(Whether this will be supported in H2 or not is not mine to answer)
So how do you update a record using MERGE in H2? You define a key to look up. If it is found, you update the row (with the column names you supply, and you can use DEFAULT here to reset columns to their defaults); otherwise you insert the row.
Now what is Null? Null means unknown, not found, undefined, anything which is not what you're looking for.
That is why Null works as the key to look up: it means the record is not found.
MERGE INTO table1 (id, col1, col2)
KEY(id) VALUES (Null, 1, 2)
Null has a value. It IS a value.
Now let's see your SQL.
MERGE INTO table1 (id, col1, col2)
KEY(id) VALUES (DEFAULT, 1, 2)
What is that implying? To me, it says
I have this [DEFAULT, 1, 2], find me a DEFAULT in column id,
then update col1 to 1, col2 to 2, if found.
otherwise, insert default to id, 1 to col1, 2 to col2.
See what I emphasized there? What does that even mean? What is DEFAULT? How do you compare DEFAULT to id?
DEFAULT is just a keyword.
You can do stuff like,
MERGE INTO table1 (id, col1, timeStampCol)
KEY(id) VALUES (Null, 1, DEFAULT)
but don't put DEFAULT in the key column.