Equivalent of ON CONFLICT DO NOTHING - dbt

I'm having a hard time understanding whether "ON CONFLICT DO NOTHING" behavior can be achieved with dbt.
I have a predefined PostgreSQL table with a unique two-column index - CREATE UNIQUE INDEX my_index ON my_table USING btree (col1, col2);. Now I want to build an incremental model on top of this table.
The problem is that I want to ignore all insert conflicts while building the model.
With plain PostgreSQL it looks something like insert into my_table (....) values (....) on conflict (col1, col2) do nothing;
I've seen the unique_key and incremental_strategy options, but they are not much help.
Is there any way to do this?

As PostgreSQL only supports the default incremental_strategy = 'merge', you do not have to specify this in your setup.
I am not completely sure I understand your question correctly, but first you need to configure your model as incremental in the config settings.
Secondly, as far as I understand your issue, you can handle it in the pre_hook configuration as shown below:
{{
    config(
        materialized = 'incremental',
        unique_key = 'id',
        pre_hook = """
            {% if is_incremental() %}
                <sql-statement>
            {% endif %}
        """
    )
}}
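Alternatively, you can emulate ON CONFLICT DO NOTHING inside the model itself rather than in a hook. Below is a minimal sketch (the source() reference is illustrative; col1 and col2 are the indexed columns from the question): on incremental runs it anti-joins against the existing table, so rows whose key pair already exists are never selected, and dbt's default append behavior inserts only the remainder:

{{
    config(
        materialized = 'incremental'
    )
}}

select s.*
from {{ source('my_source', 'my_table') }} as s
{% if is_incremental() %}
-- mimic ON CONFLICT (col1, col2) DO NOTHING: skip any row whose
-- key pair is already present in the target table
where not exists (
    select 1
    from {{ this }} as t
    where t.col1 = s.col1
      and t.col2 = s.col2
)
{% endif %}

Since no unique_key is configured, dbt simply inserts whatever the query returns on incremental runs; the anti-join is what keeps conflicting rows out.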

Related

How are UNIQUE constraints implemented when an INSERT command is called in Postgresql?

If I, for example, had a table with a single (integer) attribute a with a UNIQUE constraint placed on the attribute, and I tried inserting the value 2 into the table, what would the query planner do in this case? For example, would the UNIQUE constraint be checked by doing a sequential scan through the table, or something else?
I tried understanding what happens by running EXPLAIN ANALYZE in postgres, but unfortunately it did not reveal anything of use. Any help would be much appreciated.
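For reference, PostgreSQL enforces a UNIQUE constraint through an implicitly created unique B-tree index, so the duplicate check on INSERT is an index probe rather than a sequential scan; it happens inside the index machinery, which is why EXPLAIN ANALYZE shows nothing about it. A minimal illustration (the table name is arbitrary):

CREATE TABLE t (a integer UNIQUE);  -- implicitly creates unique index "t_a_key"

INSERT INTO t VALUES (2);  -- succeeds: the index probe finds no existing entry
INSERT INTO t VALUES (2);  -- fails: ERROR: duplicate key value violates
                           -- unique constraint "t_a_key"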

DataStax/Tinkerpop - Ability to remove a property

I am looking for a way to remove a propertyKey from the schema. The documentation here explains how to add properties, but there is no information about removal. Does that mean that it is not possible?
Since DataStax relies on Cassandra, which supports table altering, I guess there is some way to achieve this; otherwise, how does one deal with dynamic schemas where properties can be added or removed?
Edit: For more clarity, I want to remove the property both from the schema and from the data, exactly like the ALTER DROP in SQL:
ALTER TABLE table_name
DROP COLUMN column_name
The DSE Graph reference for dropping data, schema or graphs is: http://docs.datastax.com/en/latest-dse/datastax_enterprise/graph/using/dropSchemaDataStudio.html
As DSE Graph is built on the standards of TinkerPop, you can also leverage the TinkerPop 3 API references here - http://tinkerpop.apache.org/docs/current/reference/#_tinkerpop3.
For this item, I believe you are looking for .drop(): http://tinkerpop.apache.org/docs/current/reference/#drop-step
From the above link, if you want to remove a property, do this: .properties("X").drop()

How to avoid duplicate inserts for an OrientDB database?

In MySQL there's INSERT IGNORE, which keeps duplicate entries out of the database based on the primary key. But is there a way to achieve this functionality in OrientDB, since the primary key concept here is more or less covered by the #rid concept?
I think you can use a unique index on that class, so you can avoid duplicate entries.
Have you tried the UPSERT?
UPDATE Profile SET nick = 'Luca' UPSERT WHERE nick = 'Luca'
Please create an index on the "nick" property.
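Putting both suggestions together, a sketch in OrientDB SQL (assuming the nick property is not defined on the Profile class yet):

-- define the property and back it with a UNIQUE index,
-- so duplicate nicks are rejected outright
CREATE PROPERTY Profile.nick STRING
CREATE INDEX Profile.nick UNIQUE

-- with the index in place, UPSERT updates the matching record
-- if it exists and inserts a new one otherwise
UPDATE Profile SET nick = 'Luca' UPSERT WHERE nick = 'Luca'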

How to avoid the same record inserted twice in MongoDB (using Mongoid) or ActiveRecord (Rails using MySQL)?

For example, suppose we are doing analytics, recording page_type, item_id, date, pageviews, and timeOnPage.
It seems that there are several ways to avoid it. Is there an automatic way?
Create an index on the fields that uniquely identify the record, for example [page_type, item_id, date], and make the index unique, so that when the same record is added, the database will reject it.
Or, make the above the primary key, which is unique, if the DB or framework supports it. In Rails, though, the auto-incrementing id (1, 2, 3, 4) is usually the primary key.
Or, query for the record using [page_type, item_id, date], and update that record if it already exists (or do nothing if pageviews and timeOnPage already have the same values); if the record doesn't exist, insert a new one. But if we need to query the record this way, it looks like we need an index on these 3 fields anyway.
Insert new records all the time, but when querying for the values, use something like
select * from analytics where ... order by created_at desc limit 1
that is, get the newest created record and ignore the rest. But this seems like a solution for one record and is not feasible when summing up values (doing aggregates), such as select sum(pageviews) or select count(*).
Is there also some automatic solution besides using the methods above?
Jian,
Your first option seems viable to me, and it is the simplest way. Mongo supports this feature by default.
On insert it will check for the unique combination; if it exists, it will ignore the insert and write an "E11000 duplicate key error index" message to the server log. Otherwise it will proceed with the normal insertion.
But it seems this will not work in the case of bulk inserts: if any duplicate is present, the entire batch will fail. A quick search turns up an existing MongoDB JIRA ticket reporting this bug; it's still open.
I can't speak for Mongoid/MongoDB, but if you wish to enforce a uniqueness constraint in a relational database, you should create a uniqueness constraint. That's what they're there for! In MySQL, that is equivalent to a unique index; you could specify it as CONSTRAINT ... UNIQUE (col1, col2), but this will just create a unique index anyway.
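As a sketch of that advice in MySQL (the table and column names are taken from the question; the index name is illustrative), the unique index makes duplicate inserts fail, and INSERT IGNORE turns that failure into a silent skip:

-- reject duplicates of the identifying combination
ALTER TABLE analytics
    ADD UNIQUE INDEX uniq_analytics (page_type, item_id, date);

-- a plain INSERT now raises a duplicate-key error for repeats;
-- INSERT IGNORE silently skips the conflicting row instead
INSERT IGNORE INTO analytics (page_type, item_id, date, pageviews, timeOnPage)
VALUES ('article', 42, '2012-01-01', 10, 300);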

using NHibernate on a table without a primary key

I am (hopefully) just about to start on my first NHibernate project and have come across a stumbling block. Would appreciate any advice.
I am working on a project where my new code will run side-by-side with legacy code and I cannot change the data model at all. I have come across a situation where one of the main tables has no primary key. It is something like this:
Say there is an order table and a product table, and a line_item table which lists the products in each order (i.e. it consists of order_id, product_id, and quantity). In this data model it is quite possible to have 2 line items for the same product in the same order. What happens in the existing code is that whenever the user updates a line item, all line items for that order are deleted and re-inserted. Since even a compound key of all the fields in the line_item table would not necessarily be unique, that is the only possible way to update a line item in this data model.
I am prepared to guarantee that I will never attempt to update or delete an individual line item. Can I make my NHibernate code work in the same way as the existing code? If not, does this mean that (a) I cannot use NHibernate at all; (b) I cannot use NHibernate to map the line_item table; or (c) I can still map this table but not its relationships?
Thanks in advance for any advice
I think if you map it as a bag collection on the Order (with inverse="false") it would work.
Collection Mapping
Note: Large NHibernate bags mapped with inverse="false" are inefficient and should be avoided; NHibernate can't create, delete or update rows individually, because there is no key that may be used to identify an individual row.
They warn against it but it sounds like what you want.