RPostgreSQL insert rows without all columns into db - rpostgresql

I'm trying to insert rows representing some but not all columns of a Postgres database. In particular, I am seeking to insert all columns except a timestamp column that has a default set to current timestamp.
I tried the following:
dbWriteTable(con, 'raw_results', df, append = TRUE)
However, this returned an error indicating that one of the columns (not a timestamp column) was not in the appropriate format for a timestamp.
I also tried writing out an insert statement:
query_string = "INSERT INTO raw_results (col1, col2) VALUES...."
dbGetQuery(con, query_string)
This returns a Warning message:
Warning message:
In postgresqlQuickSQL(conn, statement, ...) :
Could not create executeINSERT INTO
How can I do a simple insert into a Postgres db via R?
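In other words, what I'm after is a statement along these lines (a sketch with hypothetical columns col1 and col2, leaving the timestamp column to its DEFAULT), which I would then pass to dbGetQuery as above:
-- hypothetical columns; the omitted timestamp column falls back to its DEFAULT current timestamp
INSERT INTO raw_results (col1, col2)
VALUES ('a', 1),
       ('b', 2);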

Related

spark sql Insert string column to struct of array type column

I am trying to insert a STRING type column into an ARRAY of STRUCT type column, but I am facing errors. Could you help point me in the right direction for the INSERT?
In a Databricks notebook, I have a raw table (raw_lms.rawTable) where all the columns are string type. This needs to be inserted into a transform table (tl_lms.transformedTable) where the columns are arrays of struct type.
CREATE TABLE raw_lms.rawTable
( PrimaryOwners STRING
,Owners STRING
)
USING DELTA LOCATION 'xxxx/rawTable'
CREATE TABLE tl_lms.transformedTable
( PrimaryOwners array<struct<Id:STRING>>
,Owners array<struct<Id:STRING>>
)
USING DELTA LOCATION 'xxxx/transformedTable'
The raw table has the below values populated, e.g.:
INSERT INTO TABLE raw_lms.rawTable
VALUES
("[{'Id': '1393fe1b-bba2-4343-dff0-08d9dea59a03'}, {'Id': 'cf2e6549-5d07-458c-9d30-08d9dd5885cf'}]",
"[]"
)
I try to insert to transform table and get the below error:
INSERT INTO tl_lms.transformedTable
SELECT PrimaryOwners,
Owners
FROM raw_lms.rawTable
Error in SQL statement: AnalysisException: cannot resolve
'spark_catalog.raw_lms.rawTable.PrimaryOwners' due to data type
mismatch: cannot cast string to array<struct<Id:string>>;
I do not want to explode the data. I simply need to insert the data row for row from rawTable into transformedTable, whose columns have different data types.
Thanks for your time and help.
As the error message states, you can't insert a string as an array. You need to use the array and named_struct functions.
Change the raw table's columns to the correct types (arrays of structs, not strings) and try this:
INSERT INTO TABLE raw_lms.rawTable
VALUES
(array(named_struct('id', '1393fe1b-bba2-4343-dff0-08d9dea59a03'), named_struct('id', 'cf2e6549-5d07-458c-9d30-08d9dd5885cf')),
null
);
Or if you want to keep columns as string in raw table, then use from_json to parse the strings into correct type before inserting:
INSERT INTO tl_lms.transformedTable
SELECT from_json(PrimaryOwners, 'array<struct<Id:STRING>>'),
from_json(Owners, 'array<struct<Id:STRING>>')
FROM raw_lms.rawTable
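A quick way to sanity-check the parsing before running the INSERT is to run the same from_json expressions as a plain SELECT first (a sketch assuming the table and schema above):
-- preview the parsed arrays without writing anything
SELECT from_json(PrimaryOwners, 'array<struct<Id:STRING>>') AS PrimaryOwners,
       from_json(Owners, 'array<struct<Id:STRING>>') AS Owners
FROM raw_lms.rawTable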

Insert into table containing Repeated Record in BigQuery

I have a table in BigQuery with a very complex schema (up to 2300 columns).
Among these columns I have RECORD type fields, some of them in REPEATED mode.
The insert statement is generated by a processor in the code,
but when I test this insert statement in the BigQuery web UI I see an error.
After investigating the issue, I found that the array is not being inserted in the appropriate way.
INSERT INTO Table_X (RECORD_FIELD) VALUES (
...
STRUCT([STRUCT(X), STRUCT(Y)]) as property_z
...
Is this format correct for inserting REPEATED fields?
INSERT INTO TABLE_NAME (columns) VALUES (STRUCT([ STRUCT(...), STRUCT(...) ]), ...)
Repeated fields are arrays, so you want to insert them as arrays:
INSERT INTO TABLE_NAME (repeated_column)
VALUES (ARRAY[ STRUCT(...), STRUCT(...) ]);
Note that the array is a single column; you can include values for other columns in the INSERT as well.
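For illustration, a minimal sketch assuming property_z is a REPEATED RECORD with hypothetical fields x (STRING) and y (INT64):
-- property_z assumed to be a REPEATED RECORD with fields x and y
INSERT INTO Table_X (property_z)
VALUES (ARRAY[
  STRUCT('a' AS x, 1 AS y),
  STRUCT('b' AS x, 2 AS y)
]);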

Inserting Values Into Table with Identity Column via Databricks

I've created a table in Databricks that is mapped to a table hosted in an Azure SQL DB. I'm trying to do a very simple insert statement on a small table, but an identity column is giving me issues. This table has the aforementioned identity column and three additional columns.
I first tried something similar to below:
%sql
INSERT INTO tableName (col2, col3, col4)
VALUES (1, 'Test Value', '2018-11-16')
That was giving me a syntax error, so I did some searching and learned that Hive SQL doesn't allow you to specify columns for an INSERT statement. So then I tried something like below as a test:
%sql
INSERT INTO tableName
VALUES (100, 1, 'Test Value', '2018-11-16')
That gives me an error message that I can't insert explicit values into an identity column, but that's what I expected to happen.
If I can't specify the columns for my INSERT statement, how do I avoid issues when I have an identity column? I just want to insert values for the non-identity columns, and I want the ID column to continue incrementing like normal. The above example is extremely watered-down. I will need to do much larger insertions based on SELECT statements eventually, so any solution involving toggling on IDENTITY_INSERT probably isn't feasible.
Below is how we can create a table with an identity column -
CREATE TABLE table_name
(column_name1 data_type GENERATED ALWAYS AS IDENTITY,
column_name2......)
Below are two ways we can insert data into a table with an identity column -
First way -
INSERT INTO T2 (CHARCOL2)
SELECT CHARCOL1 FROM T1;
Second way -
INSERT INTO T2 (CHARCOL2,IDENTCOL2) OVERRIDING USER VALUE
SELECT * FROM T1;
Links for reference-
Create table - https://docs.databricks.com/sql/language-manual/sql-ref-syntax-ddl-create-table-using.html
Insert into table - https://www.ibm.com/docs/en/db2-for-zos/11?topic=statement-rules-inserting-data-into-identity-column
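Putting this together for Databricks, a minimal sketch assuming a Delta table on a runtime that supports identity columns, with hypothetical types for the three data columns from the question; the identity column is simply omitted from the INSERT:
-- id is generated automatically, so only the remaining columns are listed
CREATE TABLE tableName (
  id   BIGINT GENERATED ALWAYS AS IDENTITY,
  col2 INT,
  col3 STRING,
  col4 DATE
) USING DELTA;
INSERT INTO tableName (col2, col3, col4)
VALUES (1, 'Test Value', DATE'2018-11-16');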

Oracle set based insert vs set based merge performance

We're using Oracle 11g at the moment without Enterprise (not an option unfortunately).
Let's say I have a table with a constant number of rows (let's say 2000). Let's call it data_source.
I want to insert some columns of this table into another table, data_dest. I'm using all the records from the source table.
In other words, I would like to insert this set
select data_source.col1, data_source.col2, ... data_source.colN
from data_source
Which would be faster in this case:
insert into data_dest
select data_source.col1, data_source.col2, ... data_source.colN
from data_source
OR
merge into data_dest dd
using data_source ds
on (dd.col1 = ds.col1) -- let's assume this is the matching column
when not matched
insert (col1,col2...)
values(ds.col1,ds.col2...)
EDIT 1:
We can assume there are no primary key violations from the insert.
In other words we can assume that insert will successfully insert all of the rows and so will merge.
The insert is very likely faster because it does not require a join on the two tables.
That said, the two queries are not equivalent. Assuming that col1 is defined as the primary key, the insert will throw an error if data_source contains a value in col1 that is already in data_dest. Because the merge compares the data in the two tables and only inserts the rows that don't already exist, it won't ever throw a primary key violation.
An insert that would be equivalent to the merge would be:
INSERT INTO data_dest
SELECT data_source.col1, data_source.col2, ... data_source.colN
FROM data_source
WHERE NOT EXISTS
(SELECT *
FROM data_dest
WHERE data_source.col1 = data_dest.col1)
It's likely that the plan for this insert will be very similar (if not identical) to the plan for the merge and the performance would be indistinguishable.

Getting INSERT errors when I do UPDATE?

At work we have a SQL Server database. I don't know the db that well. I created a new column in a table for some new functionality... and straight away I started seeing errors.
My statement was this:
ALTER TABLE users
ADD locked varchar(50) NULL
GO
The error is:
Insert Error: Column name or number of supplied values does not match table definition
I have read that this error message appears when, during an INSERT operation, either the number of supplied column names or the number of supplied values does not match the table definition.
But I have checked so many times, and I have changed the PHP code to include this column's data, yet I still receive the error.
I have run the SQL query directly on the db and still get the error.
Funnily enough, the query that gets the error is an UPDATE:
UPDATE "users"
SET "users"."date_last_login" = GETDATE()
WHERE id = 1
Have you considered that it could be a trigger causing it?
This is the error message you would get.
If it's an UPDATE causing it, check the triggers that run on UPDATE for that table.
Do it with:
EXEC sp_helptrigger 'Users', 'UPDATE';
This will show the triggers that fire on UPDATE actions.
If there is a trigger, grab the trigger's name and run the below (replacing TriggerNameHere with the real trigger name):
EXEC sp_helptext 'TriggerNameHere';
This will show you the SQL that the trigger runs, which could contain the INSERT the error message is referring to.
Hope this helps
Aside from triggers,
the reason for the error is that you are using the implicit form of the INSERT statement. Let's say your table previously had 3 columns and you have this INSERT statement:
INSERT INTO tableName VALUES ('val1','val2','val3')
which normally executes fine. But then you altered the table to add another column. All of your INSERT queries are still inserting only three values, which no longer matches the total number of columns.
To fix the problem, you have to update all INSERT statements to insert 4 values:
INSERT INTO tableName VALUES ('val1','val2','val3', 'val4')
and it will work fine again.
I'd advise you to use the explicit form of INSERT, where you specify the columns you want to insert values into. E.g.,
INSERT INTO tableName (col1, col2, col3) VALUES ('val1','val2','val3')
This way, even if you alter your tables by adding additional columns, your INSERT statements won't be affected unless the new column is non-nullable and has no default value.
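To tie this back to the question, a minimal sketch with a hypothetical users_demo table showing how the implicit INSERT breaks after the ALTER while the explicit one keeps working:
-- hypothetical three-column table
CREATE TABLE users_demo (
    name            VARCHAR(50),
    email           VARCHAR(100),
    date_last_login DATETIME
);
-- implicit INSERT: relies on the table having exactly these three columns
INSERT INTO users_demo VALUES ('alice', 'alice@example.com', GETDATE());
ALTER TABLE users_demo
ADD locked VARCHAR(50) NULL;
-- the same implicit INSERT now fails with
-- "Column name or number of supplied values does not match table definition"
INSERT INTO users_demo VALUES ('bob', 'bob@example.com', GETDATE());
-- explicit column list keeps working; locked defaults to NULL
INSERT INTO users_demo (name, email, date_last_login)
VALUES ('bob', 'bob@example.com', GETDATE());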