SQL - Split container row conditionally

In the process of migrating containers, we have two tables:
TABLE_MAPPING (old_value, new_value)
TABLE_USING (value, data...)
TABLE_USING references (FK) a container in an irrelevant table.
TABLE_MAPPING is used temporarily for the migration; the goal is to move contents from deprecated containers to new ones.
The problem is that sometimes a container is not simply replaced but split into multiple new containers. For example, TABLE_MAPPING could contain:
OLD_VALUE  NEW_VALUE
1          10
1          11
2          20
And the query would result in an "update" of one row with value '1' to two rows with values '10' and '11'.
Is there a plain SQL way to do that? Or should I use PL/SQL?
EDIT: as requested, here is an example of before/after using the TABLE_MAPPING above
Before:
VALUE  IRRELEVANT_COLUMNS ...
1      ...
2      ...
After:
VALUE  IRRELEVANT_COLUMNS ...
10     ...
11     ...
20     ...

You need two steps. Below I first insert all new rows, then I delete all old rows.
-- insert rows with new values
insert into table_using (value, data ...)
select m.new_value, u.data ...
from table_using u
join table_mapping m on m.old_value = u.value;
-- delete rows with old values
delete from table_using where value in (select old_value from table_mapping);
-- commit the transaction
commit;
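The two steps above can be tried out with a small sketch. It uses Python's built-in sqlite3 module rather than Oracle, and the single `data` column and the sample rows are assumptions made only for illustration (the original elides the real columns).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_mapping (old_value INTEGER, new_value INTEGER);
    CREATE TABLE table_using (value INTEGER, data TEXT);  -- 'data' is a stand-in column
    INSERT INTO table_mapping VALUES (1, 10), (1, 11), (2, 20);
    INSERT INTO table_using VALUES (1, 'a'), (2, 'b'), (3, 'c');
""")

with conn:  # one transaction: both statements commit together or not at all
    # Step 1: insert one copy of each row per mapped new value
    conn.execute("""
        INSERT INTO table_using (value, data)
        SELECT m.new_value, u.data
        FROM table_using u
        JOIN table_mapping m ON m.old_value = u.value
    """)
    # Step 2: delete the rows that still carry an old value
    conn.execute("""
        DELETE FROM table_using
        WHERE value IN (SELECT old_value FROM table_mapping)
    """)

rows = conn.execute(
    "SELECT value, data FROM table_using ORDER BY value"
).fetchall()
print(rows)
```

Note that the delete step assumes no new_value is also listed as an old_value in the mapping; otherwise freshly inserted rows would be removed again.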

Related

Oracle Sequence wastes/reserves values (in INSERT SELECT)

I've been struggling with sequences for a few days. I have a source table called "datos" with the following columns:
CENTRO
CODV
TEXT
INCIDENCY
And a destination table called "anda" with the following columns:
TIPO = 31 (for all rows)
DESCRI = 'Site' (for all rows)
SECU = sequence number generated with Myseq.NEXTVAL
CENTRO
CODV
TEXT
The last three columns must be filled in with data from the "datos" table.
When I execute my query, it all works fine: my table is filled and the sequence generates its values. But the INSERT INTO SELECT has the following conditions:
A row in the source table "datos" must not already exist in the destination table "anda" (so it won't be duplicated), and it must have its INCIDENCY flag set to 'N' or NULL.
Only rows that match these conditions should be inserted.
The query works fine, and I have tried it with many different values. Here comes the problem:
When a row has its INCIDENCY value set to 'Y' (so it must not be copied into the destination table), it is not inserted, but the sequence DOES consume one value, and when I check Myseq.NEXTVAL its value is higher.
How can I prevent the sequence from advancing when a row doesn't match the conditions? I've read that Oracle first reserves all the possible values returned by the SELECT query, but I can't find how to prevent it.
Here's the SQL:
INSERT INTO anda (TIPO, DESCRI, SECU, CENTRO, CODV, TEXT)
SELECT 31 TIPO,
'Site' DESCRI,
Myseq.NEXTVAL,
datos.CENTRO,
datos.CODV,
datos.TEXT
FROM datos
WHERE (CENTRO, CODV) NOT IN
(SELECT CENTRO, CODV
FROM anda)
AND (datos.INCIDENCY = 'N' OR datos.INCIDENCY IS NULL);
Thanks in advance!!
Definition of MySeq
CREATE SEQUENCE "BBDD"."MySeq" MINVALUE 800000000000
MAXVALUE 899999999999 INCREMENT BY 1 START WITH 800000000000 CACHE 20 ORDER NOCYCLE ;
You might be able to trick Oracle into doing this with a CTE:
INSERT INTO anda (TIPO, DESCRI, SECU, CENTRO, CODV, TEXT)
WITH toinsert as (
SELECT d.*
FROM datos d
WHERE (CENTRO, CODV) NOT IN (SELECT CENTRO, CODV FROM anda) AND
(d.INCIDENCY = 'N' OR d.INCIDENCY IS NULL)
)
SELECT 31 as TIPO, 'Site' as DESCRI, Myseq.NEXTVAL,
d.CENTRO, d.CODV, d.TEXT
FROM toinsert d;
I'm not quite sure if that will work. A more guaranteed approach is to use a before insert trigger (or identity column if you are using 12c+). You would increment the value in the trigger.
However, I do agree with Hugh Jones. You should be confident using the sequence to add a unique value to each row and this value will be increasing. Gaps can appear for other reasons, such as deletes. Also, I know that SQL Server can create gaps when doing parallel inserts; I'm not sure if that also happens with Oracle.
I don't believe you have a real problem (the gaps are not really an issue), but you can put a row-level before insert trigger on the anda table and set SECU there from your sequence.
But keep in mind that this only keeps the SECU numbers consecutive within a single statement. You'll get gaps anyway for other reasons.
UPDATE: as Alex Poole has commented, the insert itself does not generate gaps.
See a test below:
> drop sequence tst_fgg_seq;
sequence TST_FGG_SEQ dropped.
> drop table tst_fgg;
table TST_FGG dropped.
> drop table tst_insert_fgg;
table TST_INSERT_FGG dropped.
> create sequence tst_fgg_seq start with 1 nocycle;
sequence TST_FGG_SEQ created.
> create table tst_fgg as select level l from dual connect by level < 11;
table TST_FGG created.
> create table tst_insert_fgg as
select tst_fgg_seq.nextval
from tst_fgg
where l between 3 and 5;
table TST_INSERT_FGG created.
> select * from tst_insert_fgg;
NEXTVAL
----------
1
2
3
> insert into tst_insert_fgg
select tst_fgg_seq.nextval
from tst_fgg
where l between 3 and 5;
3 rows inserted.
> select * from tst_insert_fgg;
NEXTVAL
----------
1
2
3
4
5
6
6 rows selected

SQL Cartesian product joining table to itself and inserting into existing table

I am working in phpMyadmin using SQL.
I want to take the primary key (EntryID) from TableA and create a cartesian product (if I am using the term correctly) in TableB (an empty table, already created) for all entries which share the same value for FieldB in TableA, except pairs where both EntryID values are the same (an entry paired with itself).
So, for example, if the values in TableA were:
TableA.EntryID TableA.FieldB
1 23
2 23
3 23
4 25
5 25
6 25
The result in TableB would be:
Primary key EntryID1 EntryID2 FieldD (Default or manually entered)
1 1 2 Default value
2 1 3 Default value
3 2 1 Default value
4 2 3 Default value
5 3 1 Default value
6 3 2 Default value
7 4 5 Default value
8 4 6 Default value
9 5 4 Default value
10 5 6 Default value
11 6 4 Default value
12 6 5 Default value
I am used to working in Access and this is the first query I have attempted in SQL.
I started trying to work out the query and got this far. I know it's not right yet, as I’m still trying to get used to the syntax and pieced this together from various articles I found online. In particular, I wasn’t sure where the INSERT INTO text went (to create what would be an Append Query in Access).
SELECT EntryID
FROM TableA.EntryID
TableA.EntryID
WHERE TableA.FieldB=TableA.FieldB
TableA.EntryID<>TableA.EntryID
INSERT INTO TableB.EntryID1
TableB.EntryID2
After I've got that query right, I need to do a TRIGGER query (I think), so if an entry changes its value in TableA.FieldB (changing its membership from one grouping to another), the cartesian product will be re-run on THAT entry, unless TableB.FieldD = valueA or valueB (manually entered values).
I have been using the Designer Tab. Does there have to be a relationship link between TableA and TableB? If so, would it be two links from the EntryID Primary Key in TableA, one to each EntryID in TableB? I assume this would not work because they are named EntryID1 and EntryID2, and the name needs to be the same to set up a relationship?
If you can offer any suggestions, I would be very grateful.
Research:
http://www.fluffycat.com/SQL/Cartesian-Joins/
Cartesian Join example two
Q: You said you can have a Cartesian join by joining a table to itself. Show that!
Select *
From Film_Table T1,
Film_Table T2;
No, you don't want a cartesian product where you join tables without any condition. What you are looking for is a simple self join:
insert into TableB(EntryID1, EntryID2)
select x.EntryID, y.EntryID
from TableA x
join TableA y on x.FieldB = y.FieldB and x.EntryID <> y.EntryID;
EDIT: As you want this table to be up-to-date all the time, consider a view instead of a table (only, then you could not have a manually maintained FieldD).
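As a quick check of the self join above, here is a minimal sketch using Python's built-in sqlite3 module instead of MySQL. The table layout (an auto-numbered `ID` column and a `FieldD` default) is an assumption based on the sample data in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (EntryID INTEGER PRIMARY KEY, FieldB INTEGER);
    CREATE TABLE TableB (
        ID INTEGER PRIMARY KEY,               -- assumed auto-numbered key
        EntryID1 INTEGER,
        EntryID2 INTEGER,
        FieldD TEXT DEFAULT 'Default value'   -- assumed default
    );
    INSERT INTO TableA VALUES (1, 23), (2, 23), (3, 23), (4, 25), (5, 25), (6, 25);
""")

with conn:
    # Pair every entry with every *other* entry that shares its FieldB value
    conn.execute("""
        INSERT INTO TableB (EntryID1, EntryID2)
        SELECT x.EntryID, y.EntryID
        FROM TableA x
        JOIN TableA y ON x.FieldB = y.FieldB AND x.EntryID <> y.EntryID
    """)

pairs = conn.execute(
    "SELECT EntryID1, EntryID2 FROM TableB ORDER BY EntryID1, EntryID2"
).fetchall()
defaults = {row[0] for row in conn.execute("SELECT FieldD FROM TableB")}
print(len(pairs), pairs[:3])
```

With the six sample rows this produces the twelve ordered pairs listed in the question, each with the default FieldD.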
This should give you the 'cartesian' you're after (it's not actually a cartesian product, as @ThorstenKettner mentions; it's a self join) and insert it into TableB:
INSERT INTO TableB (EntryId1, EntryId2, fieldD)
SELECT a.EntryID, b.EntryID, 'Default Value'
FROM TableA a, TableA b
WHERE a.FieldB=b.FieldB
AND a.EntryID<>b.EntryID
You don't need any relationship between the two tables for the trigger, although I would suggest you set up a foreign key relationship anyway, so that you never get an entry in TableB.EntryID1 or TableB.EntryID2 that doesn't have a corresponding entry in TableA.EntryID.
For the triggers, you'd do something like this for the insert (you don't need to check TableB in this case, because you know that your new TableA.EntryID doesn't exist there yet):
CREATE TRIGGER ins_tableA AFTER INSERT ON TableA
FOR EACH ROW
BEGIN
INSERT INTO TableB (EntryId1, EntryId2, fieldD)
SELECT a.EntryID, b.EntryID, 'Default Value'
FROM TableA a, TableA b
WHERE a.FieldB=b.FieldB
AND a.EntryID=NEW.EntryID
AND a.EntryID<>b.EntryID;
END;
And for the update, you could delete all the corresponding rows from TableB first, and then re-run the insert. Something like this:
CREATE TRIGGER upd_tableA AFTER UPDATE ON TableA
FOR EACH ROW
BEGIN
DELETE FROM TableB b
WHERE b.EntryId1 = NEW.EntryId
OR b.EntryId2 = NEW.EntryId;
INSERT INTO TableB (EntryId1, EntryId2, fieldD)
SELECT a.EntryID, b.EntryID, 'Default Value'
FROM TableA a, TableA b
WHERE a.FieldB=b.FieldB
AND a.EntryID=NEW.EntryID
AND a.EntryID<>b.EntryID;
END;
None of this is tested, I'm afraid, but hopefully it'll put you on the right track.

TSQL Inserting records and track ID

I would like to insert records in a table below (structure of table with example data). I have to use TSQL to achieve this:
MasterCategoryID MasterCategoryDesc SubCategoryDesc SubCategoryID
1 Housing Elderly 4
1 Housing Adult 5
1 Housing Child 6
2 Car Engine 7
2 Car Engine 7
2 Car Window 8
3 Shop owner 9
So for example if I enter in a new record with MasterCategoryDesc = 'Town' it will insert '4' in MasterCategoryID with the respective SubCategoryDesc + ID.
Can I simplify this question by removing the SubCategoryDesc and SubCategoryID columns? How can I achieve this with just the two columns MasterCategoryID and MasterCategoryDesc?
INSERT into Table1
([MasterCategoryID], [MasterCategoryDesc], [SubCategoryDesc], [SubCategoryID])
select TOP 1
case when 'Town' not in (select [MasterCategoryDesc] from Table1)
then (select max([MasterCategoryID])+1 from Table1)
else (select [MasterCategoryID] from Table1 where [MasterCategoryDesc]='Town')
end as [MasterCategoryID]
,'Town' as [MasterCategoryDesc]
,'owner' as [SubCategoryDesc]
,case when 'owner' not in (select [SubCategoryDesc] from Table1)
then (select max([SubCategoryID])+1 from Table1)
else (select [SubCategoryID] from Table1 where [SubCategoryDesc]='owner')
end as [SubCategoryID]
from Table1
If you want, I can create a stored procedure too, but you said you wanted plain T-SQL.
This will take three steps, preferably in a single Stored Procedure. Make sure it's within a transaction.
a) Check if the MasterCategoryDesc you are trying to insert already exists. If so, take its ID. If not, find the highest MasterCategoryID, increase by one, and save it to a variable.
b) The same with SubCategoryDesc and SubCategoryID.
c) Insert the new record with the two variables you created in steps a and b.
Create a table for the MasterCategory and a table for the SubCategory. Make an ___ID column for each one that is identity (1,1). When loading, insert new rows for nonexistent values and then look up existing values for the INSERT.
Messing around with finding the Max and looking up data in the existing table is, in my opinion, a recipe for failure.
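The lookup-table idea can be sketched as follows. This uses Python's built-in sqlite3 module, where `INTEGER PRIMARY KEY AUTOINCREMENT` stands in for SQL Server's `identity (1,1)`; the helper `get_or_create` and the table names are hypothetical and only illustrate the "look up, insert if missing" step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MasterCategory (
        MasterCategoryID INTEGER PRIMARY KEY AUTOINCREMENT,
        MasterCategoryDesc TEXT NOT NULL UNIQUE
    );
    CREATE TABLE SubCategory (
        SubCategoryID INTEGER PRIMARY KEY AUTOINCREMENT,
        SubCategoryDesc TEXT NOT NULL UNIQUE
    );
""")

def get_or_create(conn, table, id_col, desc_col, desc):
    """Return the ID for desc, inserting a new row only if none exists.

    table/id_col/desc_col are fixed identifiers chosen by the caller,
    never user input; only desc is passed as a bound parameter.
    """
    row = conn.execute(
        f"SELECT {id_col} FROM {table} WHERE {desc_col} = ?", (desc,)
    ).fetchone()
    if row is not None:
        return row[0]
    cur = conn.execute(f"INSERT INTO {table} ({desc_col}) VALUES (?)", (desc,))
    return cur.lastrowid  # the database assigns the ID, no MAX()+1 needed

with conn:
    master_ids = [
        get_or_create(conn, "MasterCategory", "MasterCategoryID",
                      "MasterCategoryDesc", d)
        for d in ("Housing", "Car", "Shop", "Town", "Car")
    ]
print(master_ids)
```

Letting the database hand out the IDs avoids the race between reading MAX(ID) and inserting; in a multi-user system the lookup and insert would still need to run inside one transaction, as the earlier answer points out.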

Delete duplicate id and Value ROW using SQL Server 2008 R2

In SQL Server 2008 R2 I accidentally added rows with a duplicate ID and values to my table. When I try to delete one of the last two records I receive the following error.
The row values updated or deleted either do not make the row unique or they alter multiple rows.
The data is:
7 ABC 6
7 ABC 6
7 ABC 6
8 XYZ 1
8 XYZ 1
8 XYZ 4
7 ABC 6
7 ABC 6
I need to delete last two records:
7 ABC 6
7 ABC 6
I have been trying to delete the last 2 records using the "Edit Top 200 Rows" feature, but I get the error above.
Any help is appreciated. Thanks in advance:)
Since you have given no clue whatsoever that there are other columns in the table, assuming your data is in 3 columns A,B,C, you can delete 2 rows using:
;with t as (
select top(2) *
from tbl
where A = 7 and B = 'ABC' and C = 6
)
DELETE t;
This will arbitrarily match two rows based on the conditions, and delete them.
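SQL Server's `DELETE` through a `TOP(2)` CTE has no direct equivalent in every engine. As a rough sketch of the same idea, the following uses Python's built-in sqlite3 module and SQLite's implicit `rowid` to pick two of the identical rows; the column names A, B, C follow the assumption made in the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (A INTEGER, B TEXT, C INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [(7, "ABC", 6), (7, "ABC", 6), (7, "ABC", 6),
     (8, "XYZ", 1), (8, "XYZ", 1), (8, "XYZ", 4),
     (7, "ABC", 6), (7, "ABC", 6)],
)

with conn:
    # The rows are identical in every column, so rowid is the only way to
    # address exactly two of them; DESC picks the two most recently inserted.
    conn.execute("""
        DELETE FROM tbl
        WHERE rowid IN (
            SELECT rowid FROM tbl
            WHERE A = 7 AND B = 'ABC' AND C = 6
            ORDER BY rowid DESC
            LIMIT 2
        )
    """)

remaining = conn.execute("SELECT A, B, C FROM tbl ORDER BY rowid").fetchall()
print(remaining)
```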
This is an outline of code I use to delete dups in tables that may have many dups.
/* I always put the rollback and commit up here in comments until I am sure I have
done what I wanted. */
BEGIN tran Jim1 -- rollback tran Jim1 -- Commit tran Jim1; DROP table jt1.dbo.What_Jim_Deleted
/* This creates a table to put the deleted rows in just in case I'm really screwed up */
SELECT top 1 *, NULL dupflag
INTO jt1.dbo.What_Jim_Deleted --DROP TABLE jt1.dbo.What_Jim_Deleted
FROM jt1.dbo.tab1;
/* This removes the row without removing the table */
TRUNCATE TABLE jt1.dbo.What_Jim_Deleted;
/* The cte assigns a row number to each unique security for each day; dups will have a
rownumber > 1. The fields in the partition by are from the composite key for the
table (if one exists). These are the queries that I ran to show them as dups:
SELECT compkey1, compkey2, compkey3, compkey4, COUNT(*)
FROM jt1.dbo.tab1
GROUP BY compkey1, compkey2, compkey3, compkey4
HAVING COUNT(*) > 1
ORDER BY 1 DESC
*/
with getthedups as
(SELECT *,
ROW_NUMBER() OVER
(partition by compkey1,compkey2, compkey3, compkey4
ORDER BY Timestamp desc) dupflag /*This can be anything that gives some order to the rows (even if order doesn't matter) */
FROM jt1.dbo.tab1)
/* This delete is deleting from the cte, which cascades to the underlying table.
The WHERE is part of the DELETE (even though it comes after the OUTPUT). The
OUTPUT takes all of the DELETED rows and inserts them into the "oh shit" table,
just in case. */
DELETE
FROM getthedups
OUTPUT DELETED.* INTO jt1.dbo.What_Jim_Deleted
WHERE dupflag > 1
--Check the resulting tables here to ensure that you did what you think you did
/* If all has gone well then commit the tran and drop the "oh shit" table, or let it
hang around for a while. */
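The core of the outline above (number the rows per key, keep row 1, save and delete the rest) can be sketched with Python's built-in sqlite3 module. This is an adaptation, not the T-SQL itself: SQLite cannot delete through a CTE and has no `OUTPUT` clause, so the doomed rows are copied to a backup table first. The table and column names are made up for the example, and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (compkey1 TEXT, compkey2 INTEGER, ts TEXT);
    CREATE TABLE what_was_deleted (compkey1 TEXT, compkey2 INTEGER, ts TEXT);
    INSERT INTO tab1 VALUES
        ('A', 1, '2020-01-01'),
        ('A', 1, '2020-01-02'),
        ('B', 2, '2020-01-01'),
        ('A', 1, '2020-01-03');
""")

# rowids of every row that is not the newest one within its key
DUPS = """
    SELECT rid FROM (
        SELECT rowid AS rid,
               ROW_NUMBER() OVER (
                   PARTITION BY compkey1, compkey2
                   ORDER BY ts DESC
               ) AS dupflag
        FROM tab1
    )
    WHERE dupflag > 1
"""

with conn:  # backup and delete succeed or fail together
    conn.execute(
        f"INSERT INTO what_was_deleted SELECT * FROM tab1 WHERE rowid IN ({DUPS})"
    )
    conn.execute(f"DELETE FROM tab1 WHERE rowid IN ({DUPS})")

kept = conn.execute("SELECT * FROM tab1 ORDER BY compkey1").fetchall()
saved = conn.execute("SELECT COUNT(*) FROM what_was_deleted").fetchone()[0]
print(kept, saved)
```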

Need a SQL statement focus on combination of tables but entries always with unique ID

I need SQL code to solve the table combination problem described below:
Table old data: table old
name version status lastupdate ID
A 0.1 on 6/8/2010 1
B 0.1 on 6/8/2010 2
C 0.1 on 6/8/2010 3
D 0.1 on 6/8/2010 4
E 0.1 on 6/8/2010 5
F 0.1 on 6/8/2010 6
G 0.1 on 6/8/2010 7
Table new data: table new
name version status lastupdate ID
A 0.1 on 6/18/2010
#B entry deleted
C 0.3 on 6/18/2010 #version_updated
C1 0.1 on 6/18/2010 #new_added
D 0.1 on 6/18/2010
E 0.1 off 6/18/2010 #status_updated
F 0.1 on 6/18/2010
G 0.1 on 6/18/2010
H 0.1 on 6/18/2010 #new_added
H1 0.1 on 6/18/2010 #new_added
The differences between the new data and the old data:
B entry deleted
C entry version updated
E entry status updated
C1/H/H1 entry new added
What I want is to always keep the ID-to-name mapping from the old data table, no matter how the data changes later; that is, each name always has a unique ID number bound to it.
If an entry has been updated, update its data. If an entry is newly added, insert it into the table and assign it a new unique ID. If an entry was deleted, delete it and do not reuse its ID later.
However, I only know SQL with simple SELECT or UPDATE statements, so this code is too hard for me to write. I hope someone with expertise can point me in the right direction. Details of particular SQL variants are not needed; standard SQL sample code is enough.
Thanks in advance!
Rgs
KC
========
I listed my draft sql here, but not sure if it works, some one with expertise pls comment, thanks!
1.duplicate old table as tmp for store updates
create table tmp as
select * from old
2.update into tmp where the "name" is same in old and new table
update tmp
where name in (select name from new)
3.insert different "name" (old vs new) into tmp and assign new ID
insert into tmp (name version status lastupdate ID)
set idvar = max(select max(id) from tmp) + 1
select * from
(select new.name new.version new.status new.lastupdate new.ID
from old, new
where old.name <> new.name)
4. delete the deleted entries from tmp table (such as B)
delete from tmp
where
(select ???)
You never mentioned what DBMS you are using, but if you are using SQL Server, a really good option is the SQL MERGE statement. See: http://www.mssqltips.com/tip.asp?tip=1704
The MERGE statement basically works as
separate insert, update, and delete
statements all within the same
statement. You specify a "Source"
record set and a "Target" table, and
the join between the two. You then
specify the type of data modification
that is to occur when the records
between the two data are matched or
are not matched. MERGE is very useful,
especially when it comes to loading
data warehouse tables, which can be
very large and require specific
actions to be taken when rows are or
are not present.
Example:
MERGE Products AS TARGET
USING UpdatedProducts AS SOURCE
ON (TARGET.ProductID = SOURCE.ProductID)
--When records are matched, update
--the records if there is any change
WHEN MATCHED AND TARGET.ProductName <> SOURCE.ProductName
OR TARGET.Rate <> SOURCE.Rate THEN
UPDATE SET TARGET.ProductName = SOURCE.ProductName,
TARGET.Rate = SOURCE.Rate
--When no records are matched, insert
--the incoming records from source
--table to target table
WHEN NOT MATCHED BY TARGET THEN
INSERT (ProductID, ProductName, Rate)
VALUES (SOURCE.ProductID, SOURCE.ProductName, SOURCE.Rate)
--When there is a row that exists in target table and
--same record does not exist in source table
--then delete this record from target table
WHEN NOT MATCHED BY SOURCE THEN
DELETE
--$action specifies a column of type nvarchar(10)
--in the OUTPUT clause that returns one of three
--values for each row: 'INSERT', 'UPDATE', or 'DELETE',
--according to the action that was performed on that row
OUTPUT $action,
DELETED.ProductID AS TargetProductID,
DELETED.ProductName AS TargetProductName,
DELETED.Rate AS TargetRate,
INSERTED.ProductID AS SourceProductID,
INSERTED.ProductName AS SourceProductName,
INSERTED.Rate AS SourceRate;
SELECT @@ROWCOUNT;
GO
Let me start from the end:
In #4 you would delete all rows in tmp; what you wanted to say there is WHERE tmp.name NOT IN (SELECT name FROM new); similarly #3 is not correct syntax, but if it was it would try to insert all rows.
Regarding #2, why not use auto increment on the ID?
Regarding #1, if your tmp table is the same as new the queries #2-#4 make no sense, unless you change (update, insert, delete) new table in some way.
But (!), if you do update the table new and it has an auto increment field on ID and if you are properly updating the table (using ID) from the application then your whole procedure is unnecessary (!).
So, the important thing is that you should not design the system to work like above.
To get the concept of updating data in the database from the application side take a look at examples here (php/mysql).
Also, to get the syntax correct on your queries go through the basic version of SET, INSERT, DELETE and SELECT commands (no way around this).
Note - if you are concerned about performance you can skip this whole answer :-)
If you can redesign, use two tables: one with the data and the other with the name-to-ID linkage. Something like
table_original
name version status lastupdate
A 0.1 on 6/8/2010
B 0.1 on 6/8/2010
C 0.1 on 6/8/2010
D 0.1 on 6/8/2010
E 0.1 on 6/8/2010
F 0.1 on 6/8/2010
G 0.1 on 6/8/2010
and name_id
name ID
A 1
B 2
C 3
D 4
E 5
F 6
G 7
When you get table_new with the new set of data:
1. TRUNCATE table_original
2. INSERT INTO name_id (names from table_new not in name_id)
3. copy table_new to table_original
Note : I think there's a bit of ambiguity about the deletion here
If the entry was deleted, delete the
entry and do not reuse that ID later.
If name A gets deleted, and it turns up again in a later set of updates do you want to a. reuse the original ID tagged to A, or b. generate a new ID?
If it's b. you need a column Deleted? in name_id and a last step
4 . set Deleted? = Y where name not in table_original
and 2. would exclude Deleted? = Y records.
You could also do the same thing without the name_id table, based on the logic that the only thing you need from table_old is the name-to-ID links. Everything else you need is in table_new.
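The two-table design with a separate name_id table can be sketched like this, using Python's built-in sqlite3 module. The schema details are assumptions; in particular, SQLite's `AUTOINCREMENT` is used here because it guarantees that an ID is never handed out twice, which matches the "do not reuse that ID" requirement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE name_id (
        ID INTEGER PRIMARY KEY AUTOINCREMENT,  -- never reuses an ID
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE table_original (name TEXT, version TEXT, status TEXT, lastupdate TEXT);
    CREATE TABLE table_new (name TEXT, version TEXT, status TEXT, lastupdate TEXT);
""")
old_names = [(n,) for n in "ABCDEFG"]
conn.executemany("INSERT INTO name_id (name) VALUES (?)", old_names)
conn.executemany(
    "INSERT INTO table_original VALUES (?, '0.1', 'on', '6/8/2010')", old_names
)
new_rows = [
    ("A", "0.1", "on"), ("C", "0.3", "on"), ("C1", "0.1", "on"),
    ("D", "0.1", "on"), ("E", "0.1", "off"), ("F", "0.1", "on"),
    ("G", "0.1", "on"), ("H", "0.1", "on"), ("H1", "0.1", "on"),
]
conn.executemany("INSERT INTO table_new VALUES (?, ?, ?, '6/18/2010')", new_rows)

with conn:
    # 1. empty the data table (SQLite has no TRUNCATE)
    conn.execute("DELETE FROM table_original")
    # 2. register names that have never been seen before
    conn.execute("""
        INSERT INTO name_id (name)
        SELECT name FROM table_new
        WHERE name NOT IN (SELECT name FROM name_id)
        ORDER BY name
    """)
    # 3. copy the new data over
    conn.execute("INSERT INTO table_original SELECT * FROM table_new")

ids = dict(conn.execute("""
    SELECT t.name, n.ID
    FROM table_original t
    JOIN name_id n ON n.name = t.name
"""))
print(ids)
```

B no longer appears in table_original, but its row stays in name_id, so its ID is not given to anyone else. This corresponds to option a. in the note above: if B shows up again later, it gets its original ID back.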
This works in Informix and gives exactly the display you require. Same or similar should work in MySQL, one would think. The trick here is to get the union of all names into a temp table and left join on that so that the values from the other two can be compared.
SELECT DISTINCT name FROM old
UNION
SELECT DISTINCT name FROM new
INTO TEMP _tmp;
SELECT
CASE WHEN b.name IS NULL THEN ''
ELSE aa.name
END AS name,
CASE WHEN b.version IS NULL THEN ''
WHEN a.version = b.version THEN a.version
ELSE b.version
END AS version,
CASE WHEN a.status = b.status THEN a.status
WHEN b.status IS NULL THEN ''
ELSE b.status
END AS status,
CASE WHEN a.lastupdate = b.lastupdate THEN a.lastupdate
WHEN b.lastupdate IS NULL THEN null
ELSE b.lastupdate
END AS lastupdate,
CASE WHEN a.name IS NULL THEN '#new_added'
WHEN b.name IS NULL THEN '#' || aa.name || ' entry deleted'
WHEN a.version <> b.version THEN '#version_updated'
WHEN a.status <> b.status THEN '#status_updated'
ELSE ''
END AS change
FROM _tmp aa
LEFT JOIN old a
ON a.name = aa.name
LEFT JOIN new b
ON b.name = aa.name;
a drafted approach, I have no idea if it works fine......
CREATE TRIGGER auto_next_id
AFTER INSERT ON table FOR EACH ROW
BEGIN
UPDATE table SET uid = max(uid) + 1 ;
END;
If I understood well what you need based on the comments in the two tables, I think you can simplify a lot your problem if you don't merge or update the old table because what you need is table new with the IDs in table old when they exist and new IDs when they do not exist, right?
New records: table new has the new records already - OK (but they need a new ID)
Deleted Records: they are not in table new - OK
Updated Records: already updated in table new - OK (need to copy ID from table old)
Unmodified records: already in table new - OK (need to copy ID from table old)
So the only thing you need to do is to:
(a) copy the IDs from table old to table new when they exist
(b) create new IDs in table new when they do not exist in table old
(c) copy table new to table old.
(a) UPDATE new SET ID = IFNULL((SELECT ID FROM old WHERE new.name = old.name),0);
(b) UPDATE new SET ID = FUNCTION_TO_GENERATE_ID(new.name) WHERE ID = 0;
(c) Drop table old;
CREATE TABLE old (select * from new);
As I don't know which SQL database you are using, in (b) you can use an sql function to generate the unique id depending on the database. With SQL Server, newid(), With postgresql (not too old versions), now() seems a good choice as its precision looks sufficient (but not in other databases as MySQL for example as I think the precision is limited to seconds)
Edit: Sorry, I hadn't seen you're using sqlite and python. In this case you can use str(uuid.uuid4()) function (uuid module) in python to generate the uuid and fill the ID in new table where ID = 0 in step (b). This way you'll be able to join 2 independent databases if needed without conflicts on the IDs.
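Steps (a) to (c) could look roughly like this with Python's sqlite3 and uuid modules. This is only a sketch under a few assumptions: the tables are named old_data and new_data here, the ID column is stored as text so it can hold both the existing IDs and new UUID strings, and NULL is used instead of 0 as the "no ID yet" marker.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE old_data (name TEXT, version TEXT, status TEXT, lastupdate TEXT, ID TEXT);
    CREATE TABLE new_data (name TEXT, version TEXT, status TEXT, lastupdate TEXT, ID TEXT);
    INSERT INTO old_data VALUES
        ('A', '0.1', 'on', '6/8/2010', '1'),
        ('B', '0.1', 'on', '6/8/2010', '2'),
        ('C', '0.1', 'on', '6/8/2010', '3');
    INSERT INTO new_data (name, version, status, lastupdate) VALUES
        ('A',  '0.1', 'on', '6/18/2010'),
        ('C',  '0.3', 'on', '6/18/2010'),
        ('C1', '0.1', 'on', '6/18/2010');
""")

with conn:
    # (a) copy existing IDs; names unknown to old_data are left with ID = NULL
    conn.execute("""
        UPDATE new_data
        SET ID = (SELECT o.ID FROM old_data o WHERE o.name = new_data.name)
    """)
    # (b) give every remaining row a freshly generated UUID
    missing = conn.execute("SELECT name FROM new_data WHERE ID IS NULL").fetchall()
    for (name,) in missing:
        conn.execute(
            "UPDATE new_data SET ID = ? WHERE name = ?",
            (str(uuid.uuid4()), name),
        )
    # (c) replace the contents of old_data with new_data
    conn.execute("DELETE FROM old_data")
    conn.execute("INSERT INTO old_data SELECT * FROM new_data")

ids = dict(conn.execute("SELECT name, ID FROM old_data"))
print(ids)
```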
Why don't you use a UUID for this? Generate it once for a plug-in, and incorporate/keep it into the plug-in, not into the DB. Now that you mention python, here's how to generate it:
import uuid
UID = str(uuid.uuid4()) # this will yield new UUID string
Sure, it does not strictly guarantee global uniqueness, but the chance of getting the same string twice in your project is extremely low.