SQL CTE Syntax to DELETE / INSERT rows - sql

What's the CTE syntax to delete from a table, then insert to the same table and return the values of the insert?
Operating on 2 hours of sleep and something doesn't look right (besides the fact that this won't execute):
WITH delete_rows AS (
DELETE FROM <some_table> WHERE id = <id_value>
RETURNING *
)
SELECT * FROM delete_rows
UNION
(
INSERT INTO <some_table> ( id, text_field )
VALUES ( <id_value>, '<text_field_value>' )
RETURNING *
)
The expected behavior is to first clear all the records for an ID, then insert records for the same ID (intentionally not an upsert) and return those inserted records (not the deletions).

Your question update made clear that you cannot do this in a single statement.
Packed into CTEs of the same statement, both operations (INSERT and DELETE) would see the same snapshot of the table and execute virtually at the same time. That is, the INSERT would still see all the rows you thought had already been deleted. The manual:
All the statements are executed with the same snapshot (see Chapter 13), so they cannot "see" one another's effects on the target tables.
You can wrap them as two independent statements into the same transaction - which doesn't seem strictly necessary either, but it would allow the whole operation to succeed / fail atomically:
BEGIN;
DELETE FROM <some_table> WHERE id = <id_value>;
INSERT INTO <some_table> (id, text_field)
VALUES ( <id_value>, '<text_field_value>')
RETURNING *;
COMMIT;
Now, the INSERT can see the results of the DELETE.
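If the operation has to be a single call from the client, one option (a minimal sketch, not from the original answer; some_table, id and text_field stand in for the question's placeholders) is to wrap the two statements in a function. The statements still run sequentially inside one transaction, so the INSERT sees the DELETE:
-- hedged sketch; replace some_table / id / text_field with the real names
CREATE OR REPLACE FUNCTION replace_rows(_id bigint, _text text)
RETURNS SETOF some_table
LANGUAGE plpgsql AS
$$
BEGIN
  DELETE FROM some_table WHERE id = _id;   -- first statement
  RETURN QUERY                             -- second statement sees the delete
  INSERT INTO some_table (id, text_field)
  VALUES (_id, _text)
  RETURNING *;
END
$$;
-- usage: SELECT * FROM replace_rows(1, 'new text');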

CREATE TABLE test_table (value TEXT UNIQUE);
INSERT INTO test_table SELECT 'value 1';
INSERT INTO test_table SELECT 'value 2';
WITH delete_row AS (DELETE FROM test_table WHERE value='value 2' RETURNING 0)
INSERT INTO test_table
SELECT DISTINCT 'value 2'
FROM (SELECT 'dummy') dummy
LEFT OUTER JOIN delete_row ON TRUE
RETURNING *;
The query above handles the cases where the DELETE removes zero, one, or several rows.

Elaborating on skif1979's "DelSert" CTE method, here is a "Logged DelSert":
-- setups
DROP TABLE IF EXISTS _zx_t1 ;
CREATE TEMP TABLE
IF NOT EXISTS
_zx_t1
( id bigint
, fld2 bigint
, UNIQUE (id)
);
-- unique records
INSERT INTO _zx_t1 SELECT 1, 99;
INSERT INTO _zx_t1 SELECT 2, 98;
WITH
_cte_del_row AS
( DELETE
FROM _zx_t1
WHERE id = 2
RETURNING id as _b4_id, fld2 as _b4_fld2 -- returns complete deleted row
)
, _cte_delsert AS
( INSERT
INTO _zx_t1
SELECT DISTINCT
_cte_del_row._b4_id
, _cte_del_row._b4_fld2 + 1
from (SELECT null::integer AS _zunk) _zunk -- skif1979's trick here
LEFT OUTER JOIN _cte_del_row -- clever LOJ magic
ON TRUE -- LOJ cartesian product
RETURNING id as _aft_id , fld2 as _aft_fld2 -- return newly "delserted" rows
)
SELECT * -- returns before & after snapshots from CTE's
FROM
_cte_del_row
, _cte_delsert ;
RESULT:
 _b4_id | _b4_fld2 | _aft_id | _aft_fld2
--------+----------+---------+-----------
      2 |       98 |       2 |        99
AFAICT these all occur linearly w/in a unit of work, akin to a journaled or logged update.
Workable for:
Child records
OR schema w/ no FK
OR FK w/ cascading deletes (see the sketch below)
Not workable for:
Parent records w/ FK & no cascading deletes
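For the "FK w/ cascading deletes" case, a minimal sketch of the kind of constraint that keeps the DelSert workable (the table names here are hypothetical, not from the original answer):
CREATE TEMP TABLE _zx_parent
( id   bigint PRIMARY KEY
, fld2 bigint
);
CREATE TEMP TABLE _zx_child
( id        bigint
, parent_id bigint REFERENCES _zx_parent (id) ON DELETE CASCADE
);
-- deleting a parent row inside the DelSert CTE now also removes its
-- children instead of failing on the FK constraint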

A related (and IMO better) answer, akin to the "Logged DelSert", is this logged "SelUp":
-- setups
DROP TABLE IF EXISTS _zx_t1 ;
CREATE TEMP TABLE
IF NOT EXISTS
_zx_t1
( id bigint
, fld2 bigint
, UNIQUE (id)
);
-- unique records
INSERT INTO _zx_t1 SELECT 1, 99;
INSERT INTO _zx_t1 SELECT 2, 98;
WITH
_cte_sel_row AS
( SELECT -- start unit of work with read
id as _b4_id -- fields need to be aliased
,fld2 as _b4_fld2 -- to prevent ambiguous column errors
FROM _zx_t1
WHERE id = 2
FOR UPDATE
)
, _cte_sel_up_ret AS -- we're in the same UOW
( UPDATE _zx_t1 -- actual table
SET fld2 = _b4_fld2 + 1 -- some actual work
FROM _cte_sel_row
WHERE id = _b4_id
AND fld2 < _b4_fld2 + 1 -- gratuitous but illustrates the point
RETURNING id as _aft_id, fld2 as _aft_fld2
)
SELECT
_cte_sel_row._b4_id
,_cte_sel_row._b4_fld2 -- before
,_cte_sel_up_ret._aft_id
,_cte_sel_up_ret._aft_fld2 -- after
FROM _cte_sel_up_ret
INNER JOIN _cte_sel_row
ON TRUE AND _cte_sel_row._b4_id = _cte_sel_up_ret._aft_id
;
RESULT:
 _b4_id | _b4_fld2 | _aft_id | _aft_fld2
--------+----------+---------+-----------
      2 |       98 |       2 |        99
See also:
https://rob.conery.io/2018/08/13/transactional-data-operations-in-postgresql-using-common-table-expressions/

Related

How to bring both old and new value in a single row in Oracle SQL?

Create table query
CREATE TABLE ID_TAB (
ID VARCHAR2(20),
ID_VALUE VARCHAR2(20), FLAG VARCHAR2(20)
);
CREATE TABLE FACT_TABLE (
ID VARCHAR2(20),
VALUE VARCHAR2(20),
NAME VARCHAR2(100)
);
Insert Query
INSERT INTO ID_TAB VALUES('100','ABC','N');
INSERT INTO ID_TAB VALUES('120','ABC','Y');
INSERT INTO FACT_TABLE VALUES('100','MAX','ORANGE');
My objective is to update the fact table's ID column to 120, because that ID has the FLAG value 'Y' in ID_TAB.
My original table has 50 million records.
How can we write the query using MERGE or UPDATE?
You could try the following (I didn't try it though):
UPDATE fact_table
SET fact_table.id = (SELECT yes_tab.id
FROM id_tab no_tab
JOIN id_tab yes_tab
ON no_tab.id_value = yes_tab.id_value
AND yes_tab.flag = 'Y'
WHERE no_tab.id = fact_table.id)
WHERE EXISTS (SELECT *
FROM id_tab
WHERE id_tab.id = fact_table.id
AND id_tab.flag = 'N');
The WHERE EXISTS makes sure you update only the elements whose flag in the corresponding ID_TAB row is set to 'N'.
The subquery for the SET looks up the ID of the corresponding element that has the flag set to 'Y'.
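Since the question also mentions MERGE: a hedged, untested sketch of the same mapping written as a MERGE. Oracle doesn't allow updating a column that is referenced in the ON clause (ORA-38104), so the join is done on ROWID instead:
MERGE INTO fact_table f
USING (
  SELECT ft.ROWID AS rid, yes_tab.id AS new_id
  FROM fact_table ft
  JOIN id_tab no_tab  ON no_tab.id = ft.id AND no_tab.flag = 'N'
  JOIN id_tab yes_tab ON yes_tab.id_value = no_tab.id_value AND yes_tab.flag = 'Y'
) map
ON (f.ROWID = map.rid)
WHEN MATCHED THEN
  UPDATE SET f.id = map.new_id;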
To get a performant solution (i.e. avoid the join completely) you may switch to dynamic SQL.
I extended your example a bit to have more updated keys, as follows.
Note that there is at most one new value per ID_VALUE, but several old values with the 'N' flag are allowed, all of which must be updated.
INSERT INTO ID_TAB VALUES('100','ABC','N'); -- old key
INSERT INTO ID_TAB VALUES('110','ABC','N'); -- old key
INSERT INTO ID_TAB VALUES('120','ABC','Y'); -- new key
INSERT INTO ID_TAB VALUES('200','EFG','N'); -- old key
INSERT INTO ID_TAB VALUES('210','EFG','N'); -- old key
INSERT INTO ID_TAB VALUES('220','EFG','Y'); -- new key
INSERT INTO FACT_TABLE VALUES('100','MAX','ORANGE');
INSERT INTO FACT_TABLE VALUES('110','MIN','ORANGE');
INSERT INTO FACT_TABLE VALUES('200','MAX','APPLE');
INSERT INTO FACT_TABLE VALUES('210','MIN','APPLE');
INSERT INTO FACT_TABLE VALUES('220','NEW','APPLE');
commit;
The UPDATE statement reflects all the update options, e.g. the keys 100 and 110 should be changed to 120.
This leads to the following update:
update FACT_TABLE
set ID = case
when ID in ('100','110') then '120'
when ID in ('200','210') then '220'
end
where ID in ('100','110','200','210');
Note that the WHERE condition consists of all the keys with flag 'N', and the CASE expression is produced per ID_VALUE, mapping all keys with flag 'N' to the (single) key with flag 'Y'.
This makes generating the UPDATE statement an easy little task for a query on the table ID_TAB. See the query below. The lists are built with the LISTAGG function; the main parts are commented in the query.
with old_keys as (
select /* get list of all old values */
listagg(''''||ID||'''',',') within group (order by ID) old_keys
from ID_TAB where flag = 'N'),
case_stmt as (
select ID_VALUE,
listagg(case when flag = 'N' then ''''||ID||'''' end,',') within group (order by ID) old_keys,
max(''''||case when flag = 'Y' then ID end||'''') new_key
from ID_TAB
group by ID_VALUE),
case_stmt2 as (
select
'when ID in ('||OLD_KEYS ||') then ' ||NEW_KEY ||' /* update for '|| ID_VALUE || ' */' case_when, ID_VALUE
from case_stmt),
case_stmt3 as ( /* concatenate CASE WHEN */
select listagg(CASE_WHEN,chr(13)) within group (order by ID_VALUE) case_when from case_stmt2)
select
'update FACT_TABLE
set ID = case
'||
case_when ||
'
end
where ID in ('||
(select old_keys from old_keys)||')' as update_stmt
from case_stmt3
On the sample data, this SQL string is returned:
update FACT_TABLE
set ID = case
when ID in ('100','110') then '120' /* update for ABC */
when ID in ('200','210') then '220' /* update for EFG */
end
where ID in ('100','110','200','210')
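A hedged follow-up, not part of the original answer: once the statement text has been generated, it can be executed dynamically from PL/SQL, for example:
DECLARE
  l_sql VARCHAR2(32767);
BEGIN
  -- in practice l_sql would be filled from the generator query above;
  -- the generated literal stands in for it here
  l_sql := q'[update FACT_TABLE
              set ID = case
                when ID in ('100','110') then '120'
                when ID in ('200','210') then '220'
              end
              where ID in ('100','110','200','210')]';
  EXECUTE IMMEDIATE l_sql;
END;
/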

Optimizing deletes from table using UDT (tsql)

SQL Server.
I have a proc that takes a user-defined table type parameter (READONLY) holding about 7,500 records. Using that UDT, I run about 15 different delete statements:
delete from table1
where id in (select id from @table)
delete from table2
where id in (select id from @table)
delete from table3
where id in (select id from @table)
delete from table4
where id in (select id from @table)
....
This operation, as expected, does take a while (about 7-10 minutes). These columns are indexed. However, I suspect there is a more efficient way to do this. I know deletes are traditionally slower, but I wasn't expecting them to be this slow.
Is there a better way to do this?
You can test/try "exists" instead of "IN". I really don't like IN clauses for anything besides casual lookup-queries. (Some people will argue about IN until they are blue in the face)
Delete deleteAlias
from table1 deleteAlias
where exists ( select null from @table vart where vart.Id = deleteAlias.Id )
You can populate a #temp table instead of a @table variable. Again, over the years, this has come down to trial and error. @variable vs. #temp, most of the time, doesn't make that big of a difference. But in about 4 situations I've had, going to a #temp table made a big impact.
You can also experiment with putting an index on the #temp table (the "joining" column, 'Id' in this example )
IF OBJECT_ID('tempdb..#Holder') IS NOT NULL
begin
drop table #Holder
end
CREATE TABLE #Holder
(ID INT )
/* simulate your insert */
INSERT INTO #HOLDER (ID)
select 1 union all select 2 union all select 3 union all select 4
/* CREATE CLUSTERED INDEX IDX_TempHolder_ID ON #Holder (ID) */
/* optional, create an index on the "join" column of the #temp table */
CREATE INDEX IDX_TempHolder_ID ON #Holder (ID)
Delete deleteAlias
from table1 deleteAlias
where exists ( select null from #Holder holder where holder.Id = deleteAlias.Id )
IF OBJECT_ID('tempdb..#Holder') IS NOT NULL
begin
drop table #Holder
end
IMHO, there is no clear-cut answer; sometimes you gotta experiment a little.
And "how your tempdb is set up" is a huge fork in the road that can affect #temp table performance. But try the suggestions above first.
And one last experiment
Delete deleteAlias
from table1 deleteAlias
where exists ( select 1 from @table vart where vart.Id = deleteAlias.Id )
change the null to "1".... once I saw this affect something. Weird, right?
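Putting those suggestions together, a hedged sketch of the reworked proc (the table type dbo.IdList and parameter name @Ids are hypothetical stand-ins for your UDT):
CREATE PROCEDURE dbo.DeleteByIds
  @Ids dbo.IdList READONLY          -- hypothetical user-defined table type
AS
BEGIN
  SET NOCOUNT ON;
  -- copy the TVP into a #temp table once and index the join column
  SELECT id INTO #Ids FROM @Ids;
  CREATE CLUSTERED INDEX IDX_TempIds_ID ON #Ids (id);
  -- repeat the same pattern for table2 .. table15
  DELETE deleteAlias
  FROM table1 deleteAlias
  WHERE EXISTS ( SELECT null FROM #Ids vart WHERE vart.id = deleteAlias.id );
END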

Inserting into A errors because of a foreign key constraint issue

Can someone help explain this to me and resolve it?
http://sqlfiddle.com/#!6/2adc7/9
The INSERT statement conflicted with the FOREIGN KEY constraint "FK_tblMobileForms_tblForms". The conflict occurred in database "db_6_2adc7", table "dbo.tblForms", column 'fm_id'.: insert into tblMobileForms(fm_name) values ('lol')
My schema has the ID from tblMobileForms be a foreign key to tblForms.fm_id
To do what you are trying to do you cannot set up the FK on tblMobileForms as an identity. See my fiddle below for more information.
http://sqlfiddle.com/#!6/be6f7/2
Alternatively what you could do is to have tblMobileForms have its own separate surrogate key and a different FK column to the tblForms table.
The PK on the tblMobileForms table has the same name as the FK on the same table. Since the PK is an IDENTITY column, you can end up with non-matching values.
In my fiddle, the tblForms table contained IDs in the upper 60s. Running the INSERT in the child table would add a record with id 1, which does not exist in the parent table.
I'd create a new row in the tblMobileForms table, and reference that to the parent table.
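A hedged sketch of that alternative, giving tblMobileForms its own surrogate key plus a plain FK column to tblForms (the mf_id name is an assumption; the other column names come from the question and the trigger below):
CREATE TABLE dbo.tblMobileForms
(
  mf_id        INT IDENTITY(1,1) PRIMARY KEY,   -- own surrogate key (hypothetical name)
  fm_id        INT NOT NULL
    CONSTRAINT FK_tblMobileForms_tblForms REFERENCES dbo.tblForms (fm_id),
  fm_html_file VARBINARY(MAX) NULL,
  fm_name      NVARCHAR(50) NULL
);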
You could use an INSTEAD OF trigger to assign a free tblForms ID to each mobile form as it is inserted:
CREATE TRIGGER dbo.tblMobileForms_Insert
ON dbo.tblMobileForms
INSTEAD OF INSERT
AS
BEGIN
DECLARE @Inserted TABLE (fm_ID INT, fm_html_file VARBINARY(MAX), fm_name NVARCHAR(50));
INSERT @Inserted (fm_ID, fm_html_File, fm_Name)
SELECT fm_ID, fm_html_File, fm_Name
FROM inserted;
IF EXISTS (SELECT 1 FROM @Inserted WHERE fm_ID IS NULL)
BEGIN
WITH NewRows AS
( SELECT fm_ID, fm_html_File, fm_Name, RowNumber = ROW_NUMBER() OVER (ORDER BY fm_name)
FROM @Inserted
WHERE fm_ID IS NULL
), AvailableIDs AS
( SELECT fm_ID, RowNumber = ROW_NUMBER() OVER (ORDER BY fm_ID)
FROM tblForms f
WHERE NOT EXISTS
( SELECT 1
FROM tblMobileForms m
WHERE f.Fm_ID = m.fm_ID
)
AND NOT EXISTS
( SELECT 1
FROM inserted i
WHERE f.fm_ID = i.fm_ID
)
)
UPDATE n
SET fm_ID = a.fm_ID
FROM NewRows n
INNER JOIN AvailableIDs a
ON a.RowNumber = n.RowNumber;
IF EXISTS (SELECT 1 FROM @Inserted WHERE fm_ID IS NULL)
BEGIN
RAISERROR ('Not enough free Form IDs to allocate an ID to the inserted rows', 16, 1);
RETURN;
END
END
INSERT dbo.tblMobileForms (fm_ID, fm_html_File, fm_Name)
SELECT fm_ID, fm_html_file, fm_name
FROM @Inserted
END
When each row is inserted the trigger will check for the next available ID in tblForms and apply it sequentially to the inserted rows where fm_id is not specified. If there are no free IDs in tblForms then the trigger will throw an error, so a 1-to-1 relationship is maintained (the error would be thrown anyway since tblMobileForms.fm_id is also a PK).
N.B. this requires tblMobileForms.fm_ID to be a plain int column, not an identity.
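A hedged usage sketch: with the trigger in place, the original failing insert from the question should now pick up a free fm_ID automatically:
INSERT INTO dbo.tblMobileForms (fm_name)
VALUES (N'lol');
-- fm_ID is now one of the tblForms IDs not yet used by tblMobileForms
SELECT fm_ID, fm_name
FROM dbo.tblMobileForms;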

In a persisted field, how do you return the number of occurrences of a column within a different table's column

The following is required due to records being entered by 3rd parties in a web application.
Certain columns (such as Category) require validation, including the check below. I have a table OtherTable with the allowed values.
I need to identify whether (and how many times) the current table's column value occurs in a specified column of the other table. If there are no occurrences, the result is a flagged error of '1'; if there are occurrences, the result is '0' (no error).
If `Category` can be found in `OtherTable.ColumnA` then return 0 else 1
How can I do this please?
If Category can be found in OtherTable.ColumnA then return 0 else 1
You could use CASE with EXISTS
SELECT CASE WHEN EXISTS(
SELECT NULL
FROM AllowedValues av
WHERE av.ColumnA = Category
) THEN 0 ELSE 1 END AS ErrorCode
, Category
FROM [Table]
Edit: Here's a sql-fiddle: http://sqlfiddle.com/#!3/55a2e/1
Edit: I've only just noticed that you want to use a computed column. As I've read, a computed column can only use scalar expressions and not sub-queries. But you can create a scalar-valued function.
For example:
create table AllowedValues(ColumnA varchar(1));
insert into AllowedValues Values('A');
insert into AllowedValues Values('B');
insert into AllowedValues Values('C');
create table [Table](Category varchar(1));
insert into [Table] Values('A');
insert into [Table] Values('B');
insert into [Table] Values('C');
insert into [Table] Values('D');
insert into [Table] Values('E');
-- create a scalar valued function to return your error-code
CREATE FUNCTION udf_Category_ErrorCode
(
@category VARCHAR(1)
)
RETURNS INT
AS BEGIN
DECLARE @retValue INT
SELECT @retValue =
CASE WHEN EXISTS(
SELECT NULL
FROM AllowedValues av
WHERE av.ColumnA = @category
) THEN 0 ELSE 1 END
RETURN @retValue
END
GO
Now you can add the column as computed column which uses the function to calculate the value:
ALTER TABLE [Table] ADD ErrorCode AS ( dbo.udf_Category_ErrorCode(Category) )
GO
Here's the running SQL: http://sqlfiddle.com/#!3/fc49e/2
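A quick usage sketch against the sample data above; rows whose Category has no match in AllowedValues come back flagged with 1:
SELECT Category, ErrorCode
FROM [Table];
-- 'A', 'B', 'C' -> 0 (allowed); 'D', 'E' -> 1 (flagged)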
Note: as @Damien_The_Unbeliever has commented at the other answer, even if you persist the result with a UDF, the value won't be updated if the rows in OtherTable change. Just keep that in mind: you would need to update the table manually, with the help of the UDF, if desired.
select mt.*, COALESCE(cat_count.ct, 0) as Occurrences from MainTable mt
left outer join (select ColumnA, count(*) as ct from OtherTable group by ColumnA) cat_count
on mt.Category = cat_count.ColumnA
Result:
mt.col1 | mt.col2 | Category | Occurrences
### | ### | XXX | 3
### | ### | YYY | 0
### | ### | ZZZ | 1

Delete multiple duplicate rows in table

I have multiple groups of duplicates in one table (3 records for one value, 2 for another, etc.): multiple rows where more than one copy exists.
Below is what I came up with to delete them, but I have to run the script for however many duplicates there are:
set rowcount 1
delete from Table
where code in (
select code from Table
group by code
having (count(code) > 1)
)
set rowcount 0
This works to a degree: I need to run it for every group of duplicates, and each run deletes only 1 row (which is all I need right now).
If you have a key column on the table, then you can use this to uniquely identify the "distinct" rows in your table.
Just use a sub-query to identify the list of IDs for the unique rows and then delete everything outside of this set. Something along the lines of:
create table #TempTable
(
ID int identity(1,1) not null primary key,
SomeData varchar(100) not null
)
insert into #TempTable(SomeData) values('someData1')
insert into #TempTable(SomeData) values('someData1')
insert into #TempTable(SomeData) values('someData2')
insert into #TempTable(SomeData) values('someData2')
insert into #TempTable(SomeData) values('someData2')
insert into #TempTable(SomeData) values('someData3')
insert into #TempTable(SomeData) values('someData4')
select * from #TempTable
--Records to be deleted
SELECT ID
FROM #TempTable
WHERE ID NOT IN
(
select MAX(ID)
from #TempTable
group by SomeData
)
--Delete them
DELETE
FROM #TempTable
WHERE ID NOT IN
(
select MAX(ID)
from #TempTable
group by SomeData
)
--Final Result Set
select * from #TempTable
drop table #TempTable;
Alternatively you could use a CTE for example:
WITH UniqueRecords AS
(
select MAX(ID) AS ID
from #TempTable
group by SomeData
)
DELETE A
FROM #TempTable A
LEFT outer join UniqueRecords B on
A.ID = B.ID
WHERE B.ID IS NULL
It is frequently more efficient to copy the unique rows into a temporary table, drop the source table, and rename the temporary table back.
I reused the definition and data of #TempTable, called SrcTable here instead, since it is impossible to rename a temporary table into a regular one.
create table SrcTable
(
ID int identity(1,1) not null primary key,
SomeData varchar(100) not null
)
insert into SrcTable(SomeData) values('someData1')
insert into SrcTable(SomeData) values('someData1')
insert into SrcTable(SomeData) values('someData2')
insert into SrcTable(SomeData) values('someData2')
insert into SrcTable(SomeData) values('someData2')
insert into SrcTable(SomeData) values('someData3')
insert into SrcTable(SomeData) values('someData4')
(The table definition and data above are by John Sansom in the previous answer.)
-- cloning "unique" part
SELECT * INTO TempTable
FROM SrcTable --original table
WHERE id IN
(SELECT MAX(id) AS ID
FROM SrcTable
GROUP BY SomeData);
GO
DROP TABLE SrcTable
GO
EXEC sys.sp_rename 'TempTable', 'SrcTable';
You can alternatively use the ROW_NUMBER() function to filter out duplicates:
;WITH [CTE_DUPLICATES] AS
(
SELECT RN = ROW_NUMBER() OVER (PARTITION BY SomeData ORDER BY SomeData)
FROM #TempTable
)
DELETE FROM [CTE_DUPLICATES] WHERE RN > 1
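A hedged variant of the same idea: ordering by ID inside ROW_NUMBER() makes it explicit which duplicate survives (here the lowest ID in each SomeData group is kept):
;WITH CTE_DUPLICATES AS
(
SELECT ID, RN = ROW_NUMBER() OVER (PARTITION BY SomeData ORDER BY ID)
FROM #TempTable
)
DELETE FROM CTE_DUPLICATES WHERE RN > 1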
SET ROWCOUNT 1
DELETE a
FROM Table a
WHERE (SELECT COUNT(*) FROM Table b WHERE b.Code = a.Code ) > 1
WHILE @@ROWCOUNT > 0
DELETE a
FROM Table a
WHERE (SELECT COUNT(*) FROM Table b WHERE b.Code = a.Code ) > 1
SET ROWCOUNT 0
This removes the duplicates one row at a time until only one row per Code remains. You can add more columns to the comparison if duplicates should be matched on additional attributes.
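For example (a sketch only; SomeOtherColumn is a placeholder column), extending the correlated comparison so that only rows matching on both columns count as duplicates:
DELETE a
FROM Table a
WHERE (SELECT COUNT(*)
       FROM Table b
       WHERE b.Code = a.Code
         AND b.SomeOtherColumn = a.SomeOtherColumn ) > 1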