How can I do INSERT IF NOT EXISTS using Apache Impala?

Does anyone know if there is a way to do an INSERT IF NOT EXISTS in Apache Impala?
I know about INSERT OVERWRITE but it does not suit the use cases I am working on.
Thank you.

Impala doesn't support that, at least when using HDFS, since a primary key would be needed. If you are able to use Impala+Kudu, which has primary key support, INSERT IF NOT EXISTS could be implemented by inserting and ignoring the errors.

Related

How to achieve "INSERT IGNORE" using PRESTO

In MySQL you can use the INSERT IGNORE syntax so that inserting a duplicate row is silently skipped instead of throwing an error.
I would like to achieve the same in Presto working on a Hive database, if possible.
I know Hive is not a true relational database, and the documentation for the INSERT statement in Presto is very basic.
I would just like to know if there is a simple workaround, as all I can think of is first doing a select with a cursor to loop through the results and insert.

Before Hive 3 there was no concept of unique constraints, and even in Hive 3 the constraints are not enforced, to the best of my knowledge.
Therefore the Presto Hive connector does not enforce any unique constraints, so your INSERT query will never fail when you insert duplicate rows. They will just be stored as independent copies of the data.
If you want to maintain uniqueness, this needs to be handled externally, on the application level.
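As a rough illustration of handling uniqueness on the application side, here is a minimal Python sketch. It assumes the keys already present in the table have been fetched beforehand (for example with a SELECT on the key column); the function name drop_duplicates, the "id" key and the sample rows are all hypothetical.

```python
def drop_duplicates(rows, key, existing_keys=()):
    """Return the rows whose key is not already stored and not repeated earlier in the batch."""
    seen = set(existing_keys)  # keys assumed to be in the table already
    unique = []
    for row in rows:
        k = row[key]
        if k in seen:
            continue  # duplicate: skip it instead of inserting another copy
        seen.add(k)
        unique.append(row)
    return unique


# Hypothetical data: ids 1 and 2 are already in the table.
existing = {1, 2}
batch = [
    {"id": 2, "name": "b"},
    {"id": 3, "name": "c"},
    {"id": 3, "name": "c-again"},
]
new_rows = drop_duplicates(batch, "id", existing)
print(new_rows)
```

Only the rows in new_rows would then be passed to the INSERT. This does not protect against two writers inserting the same key at the same time.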

How to have Postgres ignore inserts with a duplicate key but keep going

I am inserting records from an in-memory collection into Postgres and want the database to ignore any record that already exists (by virtue of having the same primary key) but keep going with the rest of my inserts.
I'm using Clojure and HugSQL, btw, but I'm guessing the answer might be language agnostic.
As I'm essentially treating the database as a set in this way I may be engaging in an antipattern.

If you're using Postgres 9.5 or newer (which I assume you are, since it was released back in January 2016), there's a very useful ON CONFLICT clause you can use:
INSERT INTO mytable (id, col1, col2)
VALUES (123, 'some_value', 'some_other_value')
ON CONFLICT (id) DO NOTHING

I had to solve this for an earlier version of Postgres. Instead of a single INSERT statement with multiple rows, I used multiple INSERT statements, ran all of them in a script, and made sure that an error would not stop the script (I used Adminer with "stop on error" unchecked). The statements that don't throw an error are executed, so all of the new entries get inserted.
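The "one INSERT per row, keep going on error" idea can be sketched in Python. To keep the example runnable without a server it uses the standard-library sqlite3 module as a stand-in for Postgres, which is an assumption and not what the answer used; the function name insert_skipping_duplicates is hypothetical, and the table and columns are borrowed from the ON CONFLICT example.

```python
import sqlite3


def insert_skipping_duplicates(conn, rows):
    """Run one INSERT per row and skip rows that violate a constraint."""
    inserted = 0
    for row in rows:
        try:
            conn.execute(
                "INSERT INTO mytable (id, col1, col2) VALUES (?, ?, ?)", row
            )
            inserted += 1
        except sqlite3.IntegrityError:
            # Duplicate primary key: ignore this row and carry on with the rest.
            continue
    conn.commit()
    return inserted


# In-memory stand-in database with one pre-existing row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)")
conn.execute("INSERT INTO mytable VALUES (123, 'old', 'old')")

count = insert_skipping_duplicates(
    conn,
    [(123, "some_value", "some_other_value"), (124, "a", "b")],
)
total = conn.execute("SELECT COUNT(*) FROM mytable").fetchone()[0]
print(count, total)
```

With a real Postgres connection, a failed statement inside an open transaction blocks every later statement until a rollback, so each INSERT would need its own transaction (autocommit) or a savepoint.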

Jetbrains Datagrip 2017.1.3, force columns exported when dumping data to sql inserts file

I have a SQL Server database with a lot of tables and data. I need to reproduce it locally in a Docker container.
I have successfully exported the schema and reproduced it. When I dump the data to an SQL file, it does not export automatically generated fields (such as ids or uuids).
Here is the schema for the user table:
create table user (
    id_user bigint identity constraint PK_user primary key,
    uuid uniqueidentifier default newsequentialid() not null,
    id_salarie bigint constraint FK_user_salarie references salarie,
    date_creation datetime,
    login nvarchar(100)
)
When it exports an element from this table, I get this kind of insert:
INSERT INTO user(id_salarie, date_creation, login) VALUES (1, null, "example")
As a consequence, most of my inserts give me foreign key errors, because the ids generated by my new database are not the same as the ones in the old database. I can't change everything manually as there is way too much data.
Instead, I would like to have this kind of insert:
INSERT INTO user(id_user, uuid, id_salarie, date_creation, login) VALUES (1, "manuallyentereduuid", 1, null, "example")
Is there any way to do this with DataGrip directly? Or maybe a specific SQL Server way of generating insert statements like this?
Don't hesitate to ask for more details in comments.

You need the 'Skip generated columns' option when configuring the INSERT extractor.

It seems that DataGrip does not offer that possibility, so I used something else: DBeaver. It is free and based on the Eclipse platform.
The method is simple:
Select all the tables you want to export
Right click -> Export table data
From there you just have to follow the instructions. It outputs one file per table, which is a good thing if you have a large volume of data. I had trouble executing the whole script and had to split it when using DataGrip.
Hope this helps anyone encountering the same problem. If you find the solution directly in DataGrip, I would like to know too.
EDIT: See the answer above

SQL Column after insert read only without trigger

I have to create two columns (creator, creation date) that are read-only after insert. These two fields must not be changed after the insert. I know this requirement can be met with SQL triggers, but I don't think that is a comfortable solution.
Is there a way to solve my "problem" within the CREATE TABLE statement?

Maybe column-level permissions will work for you.
DENY UPDATE ON dbo.MyTable (Creator, CreationDate) TO SampleRole;

No and maybe yes :)
This can't be done with a single table, but it is possible to create another table and create a FK to it. With proper security in place (e.g., inserts into the main table only through a stored procedure signed with a certificate, and modifications to the referenced table denied to everyone except a user created from the same certificate), this will make it rather difficult for someone to modify it. Of course, an admin/owner user can't be stopped.
LMK if you wish me to expand the answer with the code.

SQLite, SQL: Using UPDATE or INSERT accordingly

Basically, I want to insert if a given entry (by its primary key id) doesn't exist, and otherwise update if it does. What might be the best way to do this?

Doesn't SQLite have a REPLACE command? Or ON CONFLICT REPLACE?
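A minimal sketch of the REPLACE suggestion, using Python's standard-library sqlite3 module with an in-memory database. The table name entries, its columns and the helper upsert_entry are hypothetical.

```python
import sqlite3


def upsert_entry(conn, entry_id, name):
    """Insert the row, or replace the existing row that has the same primary key."""
    conn.execute(
        "INSERT OR REPLACE INTO entries (id, name) VALUES (?, ?)",
        (entry_id, name),
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, name TEXT)")

upsert_entry(conn, 1, "first")    # id 1 does not exist yet: inserted
upsert_entry(conn, 1, "updated")  # id 1 exists: the row is replaced
upsert_entry(conn, 2, "second")

rows = conn.execute("SELECT id, name FROM entries ORDER BY id").fetchall()
print(rows)
```

Note that REPLACE deletes the conflicting row and inserts a new one, so any column not listed in the statement falls back to its default instead of keeping its old value.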
