I'm creating a table with 3 million rows of data and 9 columns.
I am using the following syntax to insert my data.
INSERT INTO myTable
( column1,
column2,
column3,
...
column11,
problemColumn
)
Select
<exampleQuery>
One column (which I will refer to as problemColumn) ends up with 1.2 million null values in the table.
When I run exampleQuery on its own (not inserting it into the table), problemColumn returns 0 null values.
problemColumn is correctly defined as an integer when the table is created
problemColumn has 300,000 distinct values. Each value appears in the table at least once, which means that it can't be an issue of a poorly-formatted value
There is no obvious pattern of values being systematically deleted
Edit: Some additional clarifications:
There are no calculations or joins done on problemColumn. I am simply selecting that variable from another table
problemColumn is an integer in the source table, so it is not an issue of a mismatched variable type
Could this be an issue with the size of the table in the database? I cannot comprehend why a query's results would fundamentally change when it is used in an insert statement.
Most likely cause (I've done it myself) is fat fingers - the column you're inserting is in the wrong position in the select list. Hard to verify without seeing the actual code, but it might be as simple as:
insert into myTable
(column1,
column2)
select
column2,
column1
from somewhere
Second possibility - there's a trigger on the destination table, which is changing the data. One of the many reasons I hate triggers.
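If you can query the data dictionary, a quick check for that is something like the following - a hedged sketch, since I'm assuming Teradata's DBC dictionary views here, so verify the view and column names on your system:

-- List any triggers defined on the destination table
-- (DBC.TriggersV is an assumption; adjust to your dictionary views)
SELECT TriggerName, ActionTime, Event
FROM DBC.TriggersV
WHERE TableName = 'myTable';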
I don't know Teradata, but the point of an RDBMS is to be able to handle exactly this scenario, so it's very unlikely it's anything to do with the size. To verify this, please try to limit the query to 1 result, and see what happens.
If that doesn't work, please convert the results of that query into an insert statement using "values"
INSERT INTO myTable
( column1,
column2,
column3,
...
column11,
problemColumn
)
values
(....)
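And for the earlier one-row test, a minimal sketch - the source table name is assumed, as is the nullability of the columns I left out:

-- Hypothetical single-row test: does problemColumn survive a 1-row insert?
INSERT INTO myTable (column1, problemColumn)
SELECT TOP 1 column1, problemColumn
FROM sourceTable;

SELECT column1, problemColumn FROM myTable;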
Related
I am trying to insert rows into an Oracle 19c table to which we recently added a GENERATED ALWAYS AS IDENTITY column (column name is "ID"). The column should auto-increment and not need to be specified explicitly in an INSERT statement. Typical INSERT statements work - i.e. INSERT INTO table_name (field1, field2) VALUES ('f1', 'f2') (merely an example). The ID field increments when a typical INSERT is executed. But the query below, which was working before the addition of the IDENTITY column, is now failing with the error: ORA-00947: not enough values.
The field counts are identical with the exception of not including the new ID IDENTITY field, which I am expecting to auto-increment. Is this statement not allowed with an IDENTITY column?
Is the INSERT INTO statement, using a SELECT from another table, not allowing this and producing the error?
INSERT INTO T.AUDIT
(SELECT r.IDENTIFIER, r.SERIAL, r.NODE, r.NODEALIAS, r.MANAGER, r.AGENT, r.ALERTGROUP,
r.ALERTKEY, r.SEVERITY, r.SUMMARY, r.LASTMODIFIED, r.FIRSTOCCURRENCE, r.LASTOCCURRENCE,
r.POLL, r.TYPE, r.TALLY, r.CLASS, r.LOCATION, r.OWNERUID, r.OWNERGID, r.ACKNOWLEDGED,
r.EVENTID, r.DELETEDAT, r.ORIGINALSEVERITY, r.CATEGORY, r.SITEID, r.SITENAME, r.DURATION,
r.ACTIVECLEARCHANGE, r.NETWORK, r.EXTENDEDATTR, r.SERVERNAME, r.SERVERSERIAL, r.PROBESUBSECONDID
FROM R.STATUS r
JOIN
(SELECT SERVERSERIAL, MAX(LASTOCCURRENCE) as maxlast
FROM T.AUDIT
GROUP BY SERVERSERIAL) gla
ON r.SERVERSERIAL = gla.SERVERSERIAL
WHERE (r.LASTOCCURRENCE > SYSDATE - (1/1440)*5 AND gla.maxlast < r.LASTOCCURRENCE)
)
Thanks for any help.
Yes, it does; your example insert
INSERT INTO table_name (field1,field2) VALUES ('f1', 'f2')
would also work as
INSERT INTO table_name (field1,field2) SELECT 'f1', 'f2' FROM DUAL
db<>fiddle demo
Your problematic real insert statement is not specifying the target column list, so when it used to work it was relying on the columns in the table (and their data types) matching the results of the query. (This is similar to relying on select *, and potentially problematic for some of the same reasons.)
Your query selects 34 values, so your table had 34 columns. You have now added a 35th column to the table, your new ID column. You know that you don't want to insert directly into that column, but Oracle doesn't know that, at least at the point where it's comparing the query with the table's columns. The table has 35 columns, so as you haven't said otherwise as part of the statement, it is expecting 35 values in the select list.
There's no way for Oracle to know which of the 35 columns you're skipping. Arguably it could guess based on the identity column, but that would be more work and inconsistent, and it's not unreasonable for it to insist you do the work to make sure it's right. It's expecting 35 values, it sees 34, so it throws an error saying there are not enough values - which is true.
Your question sort of implies you think Oracle might be doing something special to prevent the insert ... select ... syntax if there is an identity column, but in fact it's the opposite - it isn't doing anything special, and it's reporting the column/value count mismatch as it usually would.
So, you have to list the columns you are populating - you can't automatically skip one. Your statement needs to be:
INSERT INTO T.AUDIT (IDENTIFIER, SERIAL, NODE, ..., PROBESUBSECONDID)
SELECT r.IDENTIFIER, r.SERIAL, r.NODE, ..., r.PROBESUBSECONDID
FROM ...
using the actual column names of course if they differ from the query column names.
If you can't change that insert statement then you could make the ID column invisible; but then you would have to specify it explicitly in queries, as select * won't see it - but then you shouldn't rely on * anyway.
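For illustration, a minimal sketch of that invisible-column route, using the names from the question (invisible columns require Oracle 12c+, which 19c satisfies):

-- Hide the identity column; an INSERT ... SELECT without a column list
-- then only has to supply the 34 visible columns
ALTER TABLE T.AUDIT MODIFY (ID INVISIBLE);
-- The identity value is still generated, but SELECT * no longer returns
-- the column; you must name it explicitly to read it:
SELECT ID, IDENTIFIER FROM T.AUDIT;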
db<>fiddle
At my job we have an update script for an Oracle 11g database that takes around 20 hours, and some of the most demanding queries are updates where we change some values, something like:
UPDATE table1 SET
column1 = DECODE(table1.column1,null,null,'no info','no info','default value'),
column2 = DECODE(table1.column2,null,null,'no info','no info','another default value'),
column3 = 'default value';
And we have many updates like this. The problem is that the tables have around 10 million rows. We also have some updates where columns get a default value but remain nullable (I know that if a column has both NOT NULL and DEFAULT constraints, adding it is almost immediate, because the value is kept in the data dictionary), and updating or adding such columns is costing a lot of time.
My approach is to recreate the table (as Tom Kyte said in https://asktom.oracle.com/pls/asktom/f?p=100:11:0::NO::P11_QUESTION_ID:6407993912330 ). But I have no idea how to carry over some columns from the original table unchanged while changing others to a default value (where, before the update, the column held sensitive info), because we need to keep some info private.
So, my approach is something like this:
CREATE TABLE table1_tmp PARALLEL NOLOGGING
AS SELECT col1, col2, col3, col4 FROM table1;
ALTER TABLE table1_tmp ADD (col5 VARCHAR2(10) DEFAULT 'some info' NOT NULL);
ALTER TABLE table1_tmp ADD (col6 VARCHAR2(10) DEFAULT 'some info' NOT NULL);
ALTER TABLE table1_tmp ADD (col7 VARCHAR2(10));
ALTER TABLE table1_tmp ADD (col8 VARCHAR2(10));
MERGE INTO table1_tmp tt
USING table1 t
ON (t.col1 = tt.col1)
WHEN MATCHED THEN
UPDATE SET
tt.col7 = 'some default value that may be null',
tt.col8 = 'some value that may be null';
I also tried creating the nullable columns as NOT NULL to make the add fast, and it worked; the problem is that setting the columns back to nullable afterwards takes too much time. The code above also ended up consuming a great amount of time (more than one hour in the merge).
I hope someone has an idea on how to improve performance for stuff like this.
Thanks in advance!
Maybe you can try using NVL on the join keys in the MERGE:
MERGE INTO table1_tmp tt
USING table1 t
ON (nvl(t.col1,'-3') = nvl(tt.col1,'-3'))
WHEN MATCHED THEN ....
If you don't want to update rows whose key is NULL, use different sentinel values so that two NULLs never match:
MERGE INTO table1_tmp tt
USING table1 t
ON (nvl(t.col1,'-3') = nvl(tt.col1,'-2'))
WHEN MATCHED THEN .....
In the end, I went with creating a temp table from the data in the original table, doing all the work in the CREATE itself: setting the default values, applying the decodes, and any other transformation; if I wanted to set something to NULL, I did it right there in the SELECT. Something like:
CREATE TABLE table1_tmp AS
SELECT 'default message' AS column1,
       column2,                                      -- unchanged
       DECODE(column3, 'Something', NULL, 'A', 'B') AS column3
FROM table1;
That is how I solved the problem. Copying a 23-million-row table took about 3 to 5 minutes, while updating it used to take hours. Now I just need to set privileges, constraints, indexes, and comments, but that stuff takes just seconds.
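For completeness, the swap-over afterwards could look roughly like this (a sketch; the rename strategy and the grant/index names are illustrative assumptions, not from my actual script):

-- Swap the rebuilt table into place, keeping the old one as a backup
ALTER TABLE table1 RENAME TO table1_old;
ALTER TABLE table1_tmp RENAME TO table1;
-- Then recreate privileges, constraints and indexes, e.g.:
GRANT SELECT ON table1 TO some_role;
CREATE INDEX table1_ix1 ON table1 (column2);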
Thanks for the answer #thehazal; I could not check your approach, but it sounds interesting.
I am working on a project that has to add one column to an existing table.
It is like this:
The old table layout:
OldTbl(
column1 number(1) not null,
column2 number(1) not null
);
SQL to create the new table:
create table NewTbl(
column1 number(1) not null,
column2 number(1) not null,
column3 number(1)
);
When I try to insert the data with the SQL below, it executes successfully on one Oracle server, but on another Oracle server I get the error "ORA-00947: not enough values":
insert into NewTbl select
column1,
column2
from OldTbl;
Is there any Oracle option that could cause this kind of difference between servers?
ORA-00947: not enough values
This is the error you received, which means your table actually has more columns than you specified in the INSERT. Perhaps you didn't add the column on one of the servers.
There is also a different, more readable syntax for INSERT, where you mention the column names as well. When such a SQL is issued, unless a NOT NULL column is missed out, the INSERT still works, with NULL going into the missing columns.
INSERT INTO TABLE1
(COLUMN1,
COLUMN2)
SELECT
COLUMN1,
COLUMN2
FROM
TABLE2
insert into NewTbl select
column1,
column2
from OldTbl;
The above query is wrong because your new table has three columns while your select lists only two. Had the number and the order of the columns been the same, it would have worked.
If the number or the order of the columns differs, then you must explicitly list the column names in the correct order, as shown below.
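Applied to the tables in the question (assuming NewTbl.column3 is nullable, as its definition suggests):

-- Explicit column list: column3 is simply left NULL
insert into NewTbl (column1, column2)
select column1, column2
from OldTbl;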
I would prefer CTAS (create table as select) here; it would be faster than the insert.
CREATE TABLE new_tbl AS
SELECT column1, column2, 1 AS column3 FROM old_tbl;
You could use NOLOGGING and PARALLEL to increase the performance.
CREATE TABLE new_tbl NOLOGGING PARALLEL 4 AS
SELECT column1, column2, 1 AS column3 FROM old_tbl;
This will create the new table with 3 columns: the first two columns will have data from the old table, and the third column will have the value 1 for all rows. You could use any value for the third column as per your choice. I kept it as 1 because you wanted the third column to be of data type NUMBER(1).
I am running a python script that inserts a large amount of data into a Postgres database, using a single query to perform multiple row inserts:
INSERT INTO table (col1,col2) VALUES ('v1','v2'),('v3','v4') ... etc
I was wondering what would happen if it hits a duplicate key for the insert. Will it stop the entire query and throw an exception? Or will it merely ignore the insert of that specific row and move on?
The INSERT will just insert all rows and nothing special will happen, unless you have some kind of constraint disallowing duplicate / overlapping values (PRIMARY KEY, UNIQUE, CHECK or EXCLUDE constraint) - which you did not mention in your question. But that's what you are probably worried about.
Assuming a UNIQUE or PK constraint on (col1,col2), you are dealing with a textbook UPSERT situation. Many related questions and answers to find here.
Generally, if any constraint is violated, an exception is raised which (unless trapped in subtransaction like it's possible in a procedural server-side language like plpgsql) will roll back not only the statement, but the whole transaction.
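As an aside, a minimal sketch of trapping such a violation server-side (the table name tbl is borrowed from the examples below; unique_violation is Postgres's standard exception name):

-- Trap a duplicate-key error inside a PL/pgSQL block instead of
-- letting it roll back the whole transaction
DO $$
BEGIN
   INSERT INTO tbl (col1, col2) VALUES ('v1', 'v2');
EXCEPTION WHEN unique_violation THEN
   RAISE NOTICE 'duplicate row skipped';
END
$$;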
Without concurrent writes
I.e.: No other transactions will try to write to the same table at the same time.
Exclude rows that are already in the table with WHERE NOT EXISTS ... or any other applicable technique:
Select rows which are not present in other table
And don't forget to remove duplicates within the inserted set as well, which would not be excluded by the semi-anti-join WHERE NOT EXISTS ...
One technique to deal with both at once would be EXCEPT:
INSERT INTO tbl (col1, col2)
VALUES
(text 'v1', text 'v2') -- explicit type cast may be needed in 1st row
, ('v3', 'v4')
, ('v3', 'v4') -- beware of dupes in source
EXCEPT SELECT col1, col2 FROM tbl;
EXCEPT without the key word ALL folds duplicate rows in the source. If you know there are no dupes, or you don't want to fold duplicates silently, use EXCEPT ALL (or one of the other techniques). See:
Using EXCEPT clause in PostgreSQL
Generally, if the target table is big, WHERE NOT EXISTS in combination with DISTINCT on the source will probably be faster:
INSERT INTO tbl (col1, col2)
SELECT *
FROM (
SELECT DISTINCT *
FROM (
VALUES
(text 'v1', text 'v2')
, ('v3', 'v4')
, ('v3', 'v4') -- dupes in source
) t(c1, c2)
) t
WHERE NOT EXISTS (
SELECT FROM tbl
WHERE col1 = t.c1 AND col2 = t.c2
);
If there can be many dupes, it pays to fold them in the source first. Else use one subquery less.
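A sketch of that flattened form, assuming the VALUES list is already free of duplicates:

-- Same insert with one subquery less: no DISTINCT fold of the source
INSERT INTO tbl (col1, col2)
SELECT t.c1, t.c2
FROM (
   VALUES (text 'v1', text 'v2'), ('v3', 'v4')
) t(c1, c2)
WHERE NOT EXISTS (
   SELECT FROM tbl WHERE col1 = t.c1 AND col2 = t.c2
);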
Related:
Select rows which are not present in other table
With concurrent writes
Use the Postgres UPSERT implementation INSERT ... ON CONFLICT ... in Postgres 9.5 or later:
INSERT INTO tbl (col1,col2)
SELECT DISTINCT * -- still can't insert the same row more than once
FROM (
VALUES
(text 'v1', text 'v2')
, ('v3','v4')
, ('v3','v4') -- you still need to fold dupes in source!
) t(c1, c2)
ON CONFLICT DO NOTHING; -- ignores rows with *any* conflict!
Further reading:
How to use RETURNING with ON CONFLICT in PostgreSQL?
How do I insert a row which contains a foreign key?
Documentation:
The manual
The commit page
The Postgres Wiki page
Craig's reference answer for UPSERT problems:
How to UPSERT (MERGE, INSERT ... ON DUPLICATE UPDATE) in PostgreSQL?
Will it stop the entire query and throw an exception? Yes.
To avoid that, you can look at the following SO question here, which describes how to keep Postgres from throwing an error for multiple inserts when some of the inserted keys already exist in the DB.
You should basically do this:
INSERT INTO DBtable
(id, field1)
SELECT 1, 'value'
WHERE
NOT EXISTS (
SELECT id FROM DBtable WHERE id = 1
);
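Extending the same pattern to the multi-row insert from the question might look like this (the column types of DBtable are my assumption):

-- Multi-row variant: only rows whose id is not yet present are inserted
INSERT INTO DBtable (id, field1)
SELECT v.id, v.field1
FROM (VALUES (1, 'value'), (2, 'other value')) AS v(id, field1)
WHERE NOT EXISTS (
   SELECT 1 FROM DBtable d WHERE d.id = v.id
);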
Please help me.
select * from hoge where id=xxx;
If a row for that id exists, I want to run update hoge set data=data+1; if it does not exist, I want to run insert into hoge set data=0. Implementing this if/else check in the program is troublesome.
Can this procedure be done in a single SQL statement?
replace into hoge select id, data+1 as data from hoge where id = x;
With this SQL the result was not usable, because data ends up NULL when the row does not exist.
After all, is there a way to express the if logic in SQL itself, rather than in the program?
If there is a simpler method, please teach it.
Thanks in advance.
If I'm understanding the question properly (I don't think the OP is a native English speaker), you can use ON DUPLICATE KEY to do this in MySQL.
INSERT INTO table
(column1, column2, ...)
VALUES
('initial value for column1', 'initial value for column2', ...)
ON DUPLICATE KEY UPDATE
column1 = column1 + 1, column2 = 'new value for column2';
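Applied to the hoge table from the question (assuming id has a PRIMARY KEY or UNIQUE constraint, which ON DUPLICATE KEY requires; xxx is the id being touched):

-- Insert the row with data=0, or bump data by 1 if the id already exists
INSERT INTO hoge (id, data)
VALUES (xxx, 0)
ON DUPLICATE KEY UPDATE data = data + 1;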
You seem to be asking how to create a row if it doesn't already exist or, if it does exist, update one of the fields in it.
The normal approach (for DBMSs that don't provide some form of UPSERT statement) is:
insert into TBL(keycol,countcol) values (key,-1)
update TBL set countcol = countcol + 1 where keycol = key
You need to ignore any errors on the first statement (which will happen if the row already exists). Then the second statement will update it.
It's also unusual to initially insert a zero entry since you're generally adding one 'thing' the first time you do it. In that case, the insert would set the value to 0, not -1.