How to copy rows into a new one-to-many relationship - SQL

I'm trying to copy a set of data in a one-to-many relationship to create a new set of the same data in a new, but unrelated, one-to-many relationship. Let's call them groups and items. Groups have a 1-* relation with items - one group has many items.
I've tried to create a CTE to do this, however I can't get the items inserted (in y), as the newly inserted groups don't have any items associated with them yet. I think I need to be able to access OLD and NEW the way you would in a trigger, but I can't work out how to do this.
I think I could solve this by introducing a previous-parent id into the templateitem table, or maybe a temp table with the data required to let me join on that, but I was wondering whether it is possible to solve it this way.
SQL Fiddle keeps breaking on me, so I've put the code here as well:
DROP TABLE IF EXISTS meta.templateitem;
DROP TABLE IF EXISTS meta.templategroup;
CREATE TABLE meta.templategroup (
templategroup_id serial PRIMARY KEY,
groupname text,
roworder int
);
CREATE TABLE meta.templateitem (
templateitem_id serial PRIMARY KEY,
itemname text,
templategroup_id INTEGER NOT NULL REFERENCES meta.templategroup(templategroup_id)
);
INSERT INTO meta.templategroup (groupname, roworder) values ('Group1', 1), ('Group2', 2);
INSERT INTO meta.templateitem (itemname, templategroup_id) values ('Item1A',1), ('Item1B',1), ('Item2A',2);
WITH
x AS (
INSERT INTO meta.templategroup (groupname, roworder)
SELECT DISTINCT groupname || '_v1', roworder
FROM meta.templategroup
WHERE templategroup_id IN (1,2)
RETURNING groupname, templategroup_id, roworder
),
y AS (
INSERT INTO meta.templateitem (itemname, templategroup_id)
SELECT itemname, x.templategroup_id
FROM meta.templateitem i
INNER JOIN x ON x.templategroup_id = i.templategroup_id
RETURNING *
)
SELECT * FROM y;

Use an auxiliary column templategroup.old_id:
ALTER TABLE meta.templategroup ADD old_id int;
WITH x AS (
INSERT INTO meta.templategroup (groupname, roworder, old_id)
SELECT DISTINCT groupname || '_v1', roworder, templategroup_id
FROM meta.templategroup
WHERE templategroup_id IN (1,2)
RETURNING templategroup_id, old_id
),
y AS (
INSERT INTO meta.templateitem (itemname, templategroup_id)
SELECT itemname, x.templategroup_id
FROM meta.templateitem i
INNER JOIN x ON x.old_id = i.templategroup_id
RETURNING *
)
SELECT * FROM y;
 templateitem_id | itemname | templategroup_id
-----------------+----------+------------------
               4 | Item1A   |                3
               5 | Item1B   |                3
               6 | Item2A   |                4
(3 rows)
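If the auxiliary column is only needed for this one copy, it can be dropped again once the copy has run:
ALTER TABLE meta.templategroup DROP COLUMN old_id;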
It's impossible to do that in a single plain SQL query without an additional column - you have to store the old ids somewhere. As an alternative you can use plpgsql and an anonymous code block:
Before:
select *
from meta.templategroup
join meta.templateitem using (templategroup_id);
 templategroup_id | groupname | roworder | templateitem_id | itemname
------------------+-----------+----------+-----------------+----------
                1 | Group1    |        1 |               1 | Item1A
                1 | Group1    |        1 |               2 | Item1B
                2 | Group2    |        2 |               3 | Item2A
(3 rows)
Insert:
do $$
declare
grp record;
begin
for grp in
select distinct groupname || '_v1' groupname, roworder, templategroup_id
from meta.templategroup
where templategroup_id in (1,2)
loop
with insert_group as (
insert into meta.templategroup (groupname, roworder)
values (grp.groupname, grp.roworder)
returning templategroup_id
)
insert into meta.templateitem (itemname, templategroup_id)
select itemname || '_v1', g.templategroup_id
from meta.templateitem i
join insert_group g on grp.templategroup_id = i.templategroup_id;
end loop;
end $$;
After:
select *
from meta.templategroup
join meta.templateitem using (templategroup_id);
 templategroup_id | groupname | roworder | templateitem_id | itemname
------------------+-----------+----------+-----------------+-----------
                1 | Group1    |        1 |               1 | Item1A
                1 | Group1    |        1 |               2 | Item1B
                2 | Group2    |        2 |               3 | Item2A
                3 | Group1_v1 |        1 |               4 | Item1A_v1
                3 | Group1_v1 |        1 |               5 | Item1B_v1
                4 | Group2_v1 |        2 |               6 | Item2A_v1
(6 rows)

Related

Get records having the same value in 2 columns but a different value in a 3rd column

I am having trouble writing a query that will return all records where 2 columns have the same value but a different value in a 3rd column. I am looking for the records where the Item_Type and Location_ID are the same, but the Sub_Location_ID is different.
The table looks like this:
+---------+-----------+-------------+-----------------+
| Item_ID | Item_Type | Location_ID | Sub_Location_ID |
+---------+-----------+-------------+-----------------+
| 1       | 00001     | 20          | 78              |
| 2       | 00001     | 110         | 124             |
| 3       | 00001     | 110         | 124             |
| 4       | 00002     | 3           | 18              |
| 5       | 00002     | 3           | 25              |
+---------+-----------+-------------+-----------------+
The result I am trying to get would look like this:
+---------+-----------+-------------+-----------------+
| Item_ID | Item_Type | Location_ID | Sub_Location_ID |
+---------+-----------+-------------+-----------------+
| 4       | 00002     | 3           | 18              |
| 5       | 00002     | 3           | 25              |
+---------+-----------+-------------+-----------------+
I have been trying to use the following query:
SELECT *
FROM Table1
WHERE Item_Type IN (
SELECT Item_Type
FROM Table1
GROUP BY Item_Type
HAVING COUNT (DISTINCT Sub_Location_ID) > 1
)
But it returns all records with the same Item_Type and a different Sub_Location_ID, not all records with the same Item_Type AND Location_ID but a different Sub_Location_ID.
This should do the trick...
-- some test data...
IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL
BEGIN DROP TABLE #TestData; END;
CREATE TABLE #TestData (
Item_ID INT NOT NULL PRIMARY KEY,
Item_Type CHAR(5) NOT NULL,
Location_ID INT NOT NULL,
Sub_Location_ID INT NOT NULL
);
INSERT #TestData (Item_ID, Item_Type, Location_ID, Sub_Location_ID) VALUES
(1, '00001', 20, 78),
(2, '00001', 110, 124),
(3, '00001', 110, 124),
(4, '00002', 3, 18),
(5, '00002', 3, 25);
-- adding a covering index will eliminate the sort operation...
CREATE NONCLUSTERED INDEX ix_indexname ON #TestData (Item_Type, Location_ID, Sub_Location_ID, Item_ID);
-- the actual solution...
WITH
cte_count_group AS (
SELECT
td.Item_ID,
td.Item_Type,
td.Location_ID,
td.Sub_Location_ID,
cnt_grp_2 = COUNT(1) OVER (PARTITION BY td.Item_Type, td.Location_ID),
cnt_grp_3 = COUNT(1) OVER (PARTITION BY td.Item_Type, td.Location_ID, td.Sub_Location_ID)
FROM
#TestData td
)
SELECT
cg.Item_ID,
cg.Item_Type,
cg.Location_ID,
cg.Sub_Location_ID
FROM
cte_count_group cg
WHERE
cg.cnt_grp_2 > 1
AND cg.cnt_grp_3 < cg.cnt_grp_2;
cnt_grp_2 counts the rows that share an (Item_Type, Location_ID) pair, while cnt_grp_3 counts the rows that additionally share the Sub_Location_ID; whenever cnt_grp_3 is smaller, at least one other row in the same group carries a different Sub_Location_ID.
You can use exists:
select t.*
from Table1 t
where exists (select 1
              from Table1 t1
              where t.Item_Type = t1.Item_Type and
                    t.Location_ID = t1.Location_ID and
                    t.Sub_Location_ID <> t1.Sub_Location_ID
             );
SQL Server has no vector IN, so you can emulate it with a little trick, assuming '#' is an illegal character in Item_Type:
SELECT *
FROM Table1
WHERE Item_Type+'#'+Cast(Location_ID as varchar(20)) IN (
SELECT Item_Type+'#'+Cast(Location_ID as varchar(20))
FROM Table1
GROUP BY Item_Type, Location_ID
HAVING COUNT (DISTINCT Sub_Location_ID) > 1
);
The downside is that the expression in the WHERE clause is non-sargable.
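As an aside (not part of the original answers): engines that do support row-value constructors, such as PostgreSQL and MySQL, can express the vector IN directly, with no concatenation trick:
SELECT *
FROM Table1
WHERE (Item_Type, Location_ID) IN (
    SELECT Item_Type, Location_ID
    FROM Table1
    GROUP BY Item_Type, Location_ID
    HAVING COUNT(DISTINCT Sub_Location_ID) > 1
);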
I think you can use exists:
select t1.*
from table1 t1
where exists (select 1
from table1 tt1
where tt1.Item_Type = t1.Item_Type and
tt1.Location_ID = t1.Location_ID and
tt1.Sub_Location_ID <> t1.Sub_Location_ID
);

Generating random data in two tables with 1:M relationship with more cells of unique data in the second table in PostgreSQL

I have two tables: first_table and other_table, with a 1:M relationship between them. I'm generating 3 rows of random data for the pr_key column of first_table.
For each pr_key I then need to generate random numbers in the second_code column of other_table, so that there are 9 rows in total.
The problem with my approach is that the random numbers in the second_code column repeat for each pr_key, but they need to be different.
Additionally, other_table has a constraint that checks that the (pr_key, second_code) pairs are unique.
with oper as (
INSERT INTO first_table (
pr_key
)
SELECT
pr_key
FROM (
SELECT(
SELECT (random()*10)::int+(gen*0) as pr_key
),
gen as id
FROM generate_series(1,3) as gen
) main
RETURNING pr_key)
INSERT INTO other_table(pr_key, second_code)
SELECT pr_key, second_code
FROM oper, (
SELECT
(
SELECT 1+(random()*10)::int+(gen*0) as second_code
),
gen as id
FROM generate_series(1,3) as gen
) AS gener
Try with this syntax. The key difference is that the second_code expression now sits in the outer SELECT list and references id from the joined series (the id*0 term), so PostgreSQL cannot fold the scalar subquery into a constant; random() is re-evaluated for each of the nine output rows:
CREATE TABLE t1 (pr_key int);
CREATE TABLE t2 (pr_key int, second_code int);
with c1 as
(
insert into t1
select pr_key
from (
select (select (random()*10)::int+(gen*0) as pr_key),
gen as id
from generate_series(1,3) as gen
) t
returning pr_key
)
insert into t2 (pr_key, second_code)
select pr_key, (select (random()*10)::int+(id*0))
from c1, (select gen as id from generate_series(1,3) as gen) t2;
select * from t1;
| pr_key |
| -----: |
|      2 |
|      7 |
|      9 |
select * from t2;
| pr_key | second_code |
| -----: | ----------: |
|      2 |           5 |
|      2 |           7 |
|      2 |           1 |
|      7 |           4 |
|      7 |           0 |
|      7 |           3 |
|      9 |           4 |
|      9 |          10 |
|      9 |           2 |
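One caveat: random() can still produce the same second_code twice for a single pr_key, so the unique (pr_key, second_code) constraint mentioned in the question can still be violated. A sketch of one way to guarantee distinct codes per group, against the same t1/t2 tables: draw each group's codes from a shuffled series instead of independent random() calls.
insert into t2 (pr_key, second_code)
select c1.pr_key, s.code
from t1 c1
cross join lateral (
    -- the c1.pr_key*0 term correlates the subquery with the outer row,
    -- forcing a fresh shuffle per group (the same trick as gen*0 above)
    select gen + (c1.pr_key * 0) as code
    from generate_series(1, 10) as gen
    order by random()
    limit 3
) s;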

Update table in Postgresql by grouping rows

I want to update a table by grouping (or combining) some rows together based on a certain criteria. I basically have this table currently (I want to group by 'id_number' and 'date' and sum 'count'):
Table: foo
----------------------------
| id_number | date | count |
----------------------------
| 1         | 2001 | 1     |
| 1         | 2001 | 2     |
| 1         | 2002 | 1     |
| 2         | 2001 | 6     |
| 2         | 2003 | 12    |
| 2         | 2003 | 2     |
----------------------------
And I want to get this:
Table: foo
----------------------------
| id_number | date | count |
----------------------------
| 1         | 2001 | 3     |
| 1         | 2002 | 1     |
| 2         | 2001 | 6     |
| 2         | 2003 | 14    |
----------------------------
I know that I can easily create a new table with the pertinent info. But how can I modify an existing table like this without making a "temp" table? (Note: I have nothing against using a temporary table, I'm just interested in seeing if I can do it this way)
If you want to delete rows, you can add a primary key (to distinguish the rows) and use two statements: an UPDATE for the sum and a DELETE to get rid of the extra rows.
You can do something like this:
create table foo (
id integer primary key,
id_number integer,
date integer,
count integer
);
insert into foo values
(1, 1 , 2001 , 1 ),
(2, 1 , 2001 , 2 ),
(3, 1 , 2002 , 1 ),
(4, 2 , 2001 , 6 ),
(5, 2 , 2003 , 12 ),
(6, 2 , 2003 , 2 );
select * from foo;
update foo
set count = count_sum
from (
select id, id_number, date,
sum(count) over (partition by id_number, date) as count_sum
from foo
) foo_added
where foo.id_number = foo_added.id_number
and foo.date = foo_added.date;
delete from foo
using (
select id, id_number, date,
row_number() over (partition by id_number, date order by id) as inner_order
from foo
) foo_ranked
where foo.id = foo_ranked.id
and foo_ranked.inner_order <> 1;
select * from foo;
You can try it here: http://rextester.com/PIL12447
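As an aside (not from the original answer): on PostgreSQL 9.1+ you could also collapse the two statements into one query with data-modifying CTEs, since the DELETE and the UPDATE touch disjoint sets of rows. A sketch against the same foo table as above:
with summed as (
    select min(id) as keep_id, id_number, date, sum(count) as count_sum
    from foo
    group by id_number, date
),
removed as (
    -- delete every row that is not the keeper of its (id_number, date) group
    delete from foo
    using summed s
    where foo.id_number = s.id_number
      and foo.date = s.date
      and foo.id <> s.keep_id
)
-- fold the group total into the surviving row
update foo
set count = s.count_sum
from summed s
where foo.id = s.keep_id;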
With only one UPDATE (but with a trigger), you can set count to NULL and have a trigger DELETE the row in that case.
create table foo (
id integer primary key,
id_number integer,
date integer,
count integer
);
create function delete_if_count_is_null() returns trigger
language plpgsql as
$BODY$
begin
if new.count is null then
delete from foo
where id = new.id;
end if;
return new;
end;
$BODY$;
create trigger delete_if_count_is_null
after update on foo
for each row
execute procedure delete_if_count_is_null();
insert into foo values
(1, 1 , 2001 , 1 ),
(2, 1 , 2001 , 2 ),
(3, 1 , 2002 , 1 ),
(4, 2 , 2001 , 6 ),
(5, 2 , 2003 , 12 ),
(6, 2 , 2003 , 2 );
select * from foo;
update foo
set count = case when inner_order = 1 then count_sum else null end
from (
select id, id_number, date,
sum(count) over (partition by id_number, date) as count_sum,
row_number() over (partition by id_number, date order by id) as inner_order
from foo
) foo_added
where foo.id_number = foo_added.id_number
and foo.date = foo_added.date
and foo.id = foo_added.id;
select * from foo;
You can try it in: http://rextester.com/MWPRG10961

Fast query to do normalization on SQL data

I have some data that I want to normalize. Specifically I'm normalizing it so I can process the portions getting normalized without having to worry about duplicates. What I'm doing is:
INSERT INTO new_table (a, b, c)
SELECT DISTINCT a,b,c
FROM old_table;
UPDATE old_table
SET abc_id = new_table.id
FROM new_table
WHERE new_table.a = old_table.a
AND new_table.b = old_table.b
AND new_table.c = old_table.c;
First off, it seems as if there should be a better way of doing this - the process that finds the distinct data could in principle also produce the list of members belonging to each group. Second, and more important, the INSERT finishes reasonably quickly but the UPDATE takes forever (I don't have a figure for how long yet, because it's still running). I'm using PostgreSQL. Is there a better way of doing this (perhaps all in one query)?
This is my other answer, extended to three columns:
-- Some test data
CREATE TABLE the_table
( id SERIAL NOT NULL PRIMARY KEY
, name varchar
, a INTEGER
, b varchar
, c varchar
);
INSERT INTO the_table(name, a,b,c) VALUES
( 'Chimpanzee' , 1, 'mammals', 'apes' )
,( 'Urang Utang' , 1, 'mammals', 'apes' )
,( 'Homo Sapiens' , 1, 'mammals', 'apes' )
,( 'Mouse' , 2, 'mammals', 'rodents' )
,( 'Rat' , 2, 'mammals', 'rodents' )
,( 'Cat' , 3, 'mammals', 'felix' )
,( 'Dog' , 3, 'mammals', 'canae' )
;
-- [empty] table to contain the "squeezed out" domain {a,b,c}
CREATE TABLE abc_table
( id SERIAL NOT NULL PRIMARY KEY
, a INTEGER
, b varchar
, c varchar
, UNIQUE (a,b,c)
);
-- The original table needs a "link" to the new table
ALTER TABLE the_table
ADD column abc_id INTEGER -- NOT NULL
REFERENCES abc_table(id)
;
-- FK constraints are helped a lot by a supportive index.
CREATE INDEX abc_table_fk ON the_table (abc_id);
-- Chained query to:
-- * populate the domain table
-- * initialize the FK column in the original table
WITH ins AS (
INSERT INTO abc_table(a,b,c)
SELECT DISTINCT a,b,c
FROM the_table a
RETURNING *
)
UPDATE the_table ani
SET abc_id = ins.id
FROM ins
WHERE ins.a = ani.a
AND ins.b = ani.b
AND ins.c = ani.c
;
-- Now that we have the FK pointing to the new table,
-- we can drop the redundant columns.
ALTER TABLE the_table DROP COLUMN a, DROP COLUMN b, DROP COLUMN c;
SELECT * FROM the_table;
SELECT * FROM abc_table;
-- show it to the world
SELECT a.*
, c.a, c.b, c.c
FROM the_table a
JOIN abc_table c ON c.id = a.abc_id
;
Results:
CREATE TABLE
INSERT 0 7
CREATE TABLE
ALTER TABLE
CREATE INDEX
UPDATE 7
ALTER TABLE
 id |     name     | abc_id
----+--------------+--------
  1 | Chimpanzee   |      4
  2 | Urang Utang  |      4
  3 | Homo Sapiens |      4
  4 | Mouse        |      3
  5 | Rat          |      3
  6 | Cat          |      1
  7 | Dog          |      2
(7 rows)
 id | a |    b    |    c
----+---+---------+---------
  1 | 3 | mammals | felix
  2 | 3 | mammals | canae
  3 | 2 | mammals | rodents
  4 | 1 | mammals | apes
(4 rows)
 id |     name     | abc_id | a |    b    |    c
----+--------------+--------+---+---------+---------
  1 | Chimpanzee   |      4 | 1 | mammals | apes
  2 | Urang Utang  |      4 | 1 | mammals | apes
  3 | Homo Sapiens |      4 | 1 | mammals | apes
  4 | Mouse        |      3 | 2 | mammals | rodents
  5 | Rat          |      3 | 2 | mammals | rodents
  6 | Cat          |      1 | 3 | mammals | felix
  7 | Dog          |      2 | 3 | mammals | canae
(7 rows)
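A possible follow-on, not part of the original answer: with the UNIQUE (a,b,c) constraint in place, later incremental loads can reuse existing domain rows idempotently on PostgreSQL 9.5+. Here staging_table is a hypothetical source of newly arrived rows:
INSERT INTO abc_table (a, b, c)
SELECT DISTINCT a, b, c
FROM staging_table -- hypothetical; substitute the real source of new rows
ON CONFLICT (a, b, c) DO NOTHING;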
Came up with a way to do this on my own:
BEGIN;
CREATE TEMPORARY TABLE new_table_temp (
LIKE new_table INCLUDING DEFAULTS, -- keep id's default so ids come from new_table's sequence
old_ids integer[]
)
ON COMMIT DROP;
INSERT INTO new_table_temp (a, b, c, old_ids)
SELECT a, b, c, array_agg(id) AS old_ids
FROM old_table
GROUP BY a, b, c;
INSERT INTO new_table (id, a, b, c)
SELECT id, a, b, c
FROM new_table_temp;
UPDATE old_table
SET abc_id = new_table_temp.id
FROM new_table_temp
WHERE old_table.id = ANY(new_table_temp.old_ids);
COMMIT;
This at least is what I was looking for. I'll update this answer once I know whether it actually runs quickly. The EXPLAIN output looks like a sensible plan, so I'm hopeful.
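If the = ANY(array) probe turns out to be the slow part, one variant worth trying (an assumption, not something from the original answer) is to unnest the arrays so the planner can use an ordinary join on old_table.id:
UPDATE old_table
SET abc_id = t.id
FROM (
    SELECT id, unnest(old_ids) AS old_id -- one row per original old_table id
    FROM new_table_temp
) t
WHERE old_table.id = t.old_id;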

Getting the Next Available Row

How can I get a list of all the JobPositionNames, picking for each name the row with the lowest JobPositionId that is still unassigned (ContactId = 0), and falling back to the assigned (ContactId = 1) row when no unassigned row exists?
Table:
| JobPositionId | JobPositionName   | JobDescriptionId | JobCategoryId | ContactId |
-------------------------------------------------------------------------------------
| 1             | Audio Cables      | 1                | 1             | 1         |
| 2             | Audio Connections | 2                | 1             | 1         |
| 3             | Audio Connections | 2                | 1             | 0         |
| 4             | Audio Connections | 2                | 1             | 0         |
| 5             | Sound Board       | 3                | 1             | 0         |
| 6             | Tent Pen          | 4                | 3             | 0         |
E.g. the result for this table should be rows 1, 3, 5 and 6.
I can't figure out the solution.
There is still a piece missing, but I can give you some code to look at. Maybe it can help you.
--create table
create table t
(
JobPositionId int identity(1,1) primary key,
JobPositionName nvarchar(100) not null,
JobDescriptionId int,
JobCategoryId int,
ContactId int
)
go
--insert values
BEGIN TRAN
INSERT INTO t VALUES ('AudioCables', 1,1,1)
INSERT INTO t VALUES ('AudioConnections',2,1,1)
INSERT INTO t VALUES ('AudioConnections',2,1,0)
INSERT INTO t VALUES ('AudioConnections',2,1,0)
INSERT INTO t VALUES ('SoundBoard',3,1,0)
INSERT INTO t VALUES ('TentPen',4,3,0)
COMMIT TRAN
GO
SELECT
Min(JobPositionId) AS JobPositionId, JobPositionName, ContactId
INTO
#tempTable
FROM
t
GROUP BY JobPositionName, ContactId
SELECT * FROM #tempTable
WHERE JobPositionId IN (
SELECT JobPositionId
FROM #tempTable
GROUP BY JobPositionName
--... something is still missing here, I can't figure it out, sorry.
)
drop table t
GO
For per-group minimum/maximum queries you can use a null self-join as well as strategies like subselects. The self-join is generally faster in MySQL.
SELECT j0.JobPositionId, j0.JobPositionName, j0.ContactId
FROM Jobs AS j0
LEFT JOIN Jobs AS j1 ON j1.JobPositionName=j0.JobPositionName
AND (
(j1.ContactId<>0)<(j0.ContactId<>0)
OR ((j1.ContactId<>0)=(j0.ContactId<>0) AND j1.JobPositionId<j0.JobPositionId)
)
WHERE j1.JobPositionName IS NULL
This says: for each JobPositionName, find the row for which there exists no other row with a lower ordering value, where the ordering value is the composite [ContactId non-zeroness, JobPositionId].
(Aside: shouldn't JobPositionName and JobCategoryId be normalised out into a table keyed on JobDescriptionId? And shouldn't unassigned ContactIds be NULL?)
SELECT jp.*
FROM (
SELECT JobPositionName, MIN(JobPositionId) AS JobPositionId, COUNT(*) AS cnt
FROM JobPositions
GROUP BY JobPositionName
) jpd
JOIN JobPositions jp
ON jp.JobPositionId =
IF(
cnt = 1,
jpd.JobPositionId,
(
SELECT MIN(JobPositionId)
FROM JobPositions jpi
WHERE jpi.JobPositionName = jpd.JobPositionName
AND jpi.ContactID = 0
)
)
Create an index on (JobPositionName, ContactId, JobPositionId) for this to work fast.
Note that it will not return jobs that have more than one position row when none of those rows has ContactID = 0.
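For what it's worth, here is a sketch (not from the original answers) of the same "prefer unassigned, then lowest id" pick using a window function, which works on SQL Server 2005+, PostgreSQL, and MySQL 8+; the table name JobPositions is assumed:
SELECT JobPositionId, JobPositionName, ContactId
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               PARTITION BY JobPositionName
               -- rank unassigned (ContactId = 0) rows first, then lowest id
               ORDER BY CASE WHEN ContactId = 0 THEN 0 ELSE 1 END,
                        JobPositionId
           ) AS rn
    FROM JobPositions t
) ranked
WHERE rn = 1;
Against the sample data this returns rows 1, 3, 5 and 6.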