I have two tables with the same columns: the first column is the name and the second is a count. I would like to merge these tables so that each name appears with the sum of its counts from the two tables:
Table1:          Table2:          Result Table:
NAME   COUNT     NAME   COUNT     NAME   COUNT
name1  1         name3  3         name1  1
name2  2         name4  4         name2  2
name3  3         name5  5         name3  6
name4  4         name6  6         name4  8
                                  name5  5
                                  name6  6
At the moment I have a pretty ugly structure to do this, and I would like to know whether it is possible to get the result in a more elegant way.
What I have so far (Table1 is test1 and Table2 is test2):
create table test1 ( name varchar(40), count integer);
create table test2 ( name varchar(40), count integer);
create table test3 ( name varchar(40), count integer);
create table test4 ( name varchar(40), count integer);
create table test5 ( name varchar(40), count integer);
insert into test4 (name, count) select * from test1;
insert into test4 (name, count) select * from test2;
insert into test3 (name , count) select t1.name, t1.count + t2.count
from test1 t1 inner join test2 t2 on t1.name = t2.name;
select merge_db(name, count) from test3;
insert into test5 (name, count) (select name, max(count) from test4 group by name);
CREATE FUNCTION merge_db(key varchar(40), data integer) RETURNS VOID AS
$$ -- source: http://stackoverflow.com/questions/1109061/insert-on-duplicate-update-postgresql
BEGIN
LOOP
-- first try to update the key
UPDATE test4 SET count = data WHERE name = key;
IF found THEN
RETURN;
END IF;
-- not there, so try to insert the key
-- if someone else inserts the same key concurrently,
-- we could get a unique-key failure
BEGIN
INSERT INTO test4(name,count) VALUES (key, data);
RETURN;
EXCEPTION WHEN unique_violation THEN
-- do nothing, and loop to try the UPDATE again
END;
END LOOP;
END;
$$
LANGUAGE plpgsql;
=> create table t1 (name text,cnt int);
=> create table t2 (name text,cnt int);
=> insert into t1 values ('name1',1), ('name2',2), ('name3',3), ('name4',4);
=> insert into t2 values ('name3',3), ('name4',4), ('name5',5), ('name6',6);
=>
select name,sum(cnt) from
(select * from t1
union all
select * from t2 ) X
group by name
order by 1;
name | sum
-------+-----
name1 | 1
name2 | 2
name3 | 6
name4 | 8
name5 | 5
name6 | 6
(6 rows)
How about this, in pure SQL:
SELECT
COALESCE(t1.name, t2.name),
COALESCE(t1.count, 0) + COALESCE(t2.count, 0) AS count
FROM t1 FULL OUTER JOIN t2 ON t1.name=t2.name;
Basically we're doing a full outer join on the name field to merge the two tables. The tricky part is that with the full outer join, rows that exist in one table but not the other will appear, but will have NULL in the columns of the other table; so if t1 has "name1" but t2 doesn't, the join will give us NULLs for t2.name and t2.count.
The COALESCE function returns the first non-NULL argument, so we use it to "convert" the NULL counts to 0 and to pick the name from the correct table. Thanks for the tip on this Wayne!
Good luck!
An alternative method is to use a NATURAL FULL OUTER JOIN combined with SUM(count) and GROUP BY name. The following SQL code yields exactly the desired result:
SELECT name, SUM(count) AS count FROM
( SELECT 1 AS tableid, * FROM t1 ) AS table1
NATURAL FULL OUTER JOIN
( SELECT 2 AS tableid, * FROM t2 ) AS table2
GROUP BY name ORDER BY name
The artificial tableid column ensures that the NATURAL FULL OUTER JOIN creates a separate row for each row in t1 and for each row in t2. In other words, the rows "name3, 3" and "name4, 4" appear twice in the intermediate result. In order to merge these duplicate rows and to sum the counts we can group the rows by the name column and sum the count column.
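The "keep every row from both tables, then group by name" idea described above can be checked with a small sketch. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, it swaps the NATURAL FULL OUTER JOIN for a UNION ALL (which produces the same stacked intermediate rows), and it names the count column cnt as in the earlier answer.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; the thread targets PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (name TEXT, cnt INTEGER);
    CREATE TABLE t2 (name TEXT, cnt INTEGER);
    INSERT INTO t1 VALUES ('name1', 1), ('name2', 2), ('name3', 3), ('name4', 4);
    INSERT INTO t2 VALUES ('name3', 3), ('name4', 4), ('name5', 5), ('name6', 6);
""")

# Intermediate result: every row of t1 and every row of t2 kept separately,
# so name3 and name4 each appear twice.
stacked = conn.execute(
    "SELECT name, cnt FROM t1 UNION ALL SELECT name, cnt FROM t2"
).fetchall()

# Final result: one row per name, with the counts of the duplicates added up.
merged = conn.execute("""
    SELECT name, SUM(cnt) AS cnt
    FROM (SELECT name, cnt FROM t1
          UNION ALL
          SELECT name, cnt FROM t2) AS stacked
    GROUP BY name
    ORDER BY name
""").fetchall()

print(len(stacked))
for row in merged:
    print(row)
```

The grouped output should match the result table in the question: names present in only one table keep their count, and names present in both get the sum.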
I have selected the following data that I want to insert into the database.
Letter | Value
A | 1
A | 2
B | 3
B | 4
Since there is a repetition of "A" and "B" in this format, I want to split data into two separate tables: table1 and table2.
table1:
ID | Letter
1 | A
2 | B
ID here is automatically inserted by database (using a sequence).
table2:
table1_id | Value
1 | 1
1 | 2
2 | 3
2 | 4
In this particular example I don't gain anything in storage, but it is the simplest way to illustrate the problem I have encountered.
How can I use SQL or PL/SQL to insert data into table1 and table2?
First populate table1 from the source
insert into table1(Letter)
select distinct Letter
from srcTable;
Then load data from the source decoding letter to id
insert into table2(table1_id, Value)
select t1.id, src.value
from srcTable src
join table1 t1 on t1.Letter = src.Letter;
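The two steps above can be tried end to end with a small sketch. It is an illustration only: it runs on Python's built-in sqlite3 module rather than Oracle, it lets an INTEGER PRIMARY KEY column stand in for the sequence, and the ORDER BY in the first insert is an added assumption so that the generated IDs are predictable.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Oracle.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE srcTable (Letter TEXT, Value INTEGER);
    INSERT INTO srcTable VALUES ('A', 1), ('A', 2), ('B', 3), ('B', 4);

    -- INTEGER PRIMARY KEY auto-assigns IDs here, playing the role of the sequence.
    CREATE TABLE table1 (ID INTEGER PRIMARY KEY, Letter TEXT UNIQUE);
    CREATE TABLE table2 (table1_id INTEGER REFERENCES table1 (ID), Value INTEGER);
""")

# Step 1: one parent row per distinct letter.
conn.execute(
    "INSERT INTO table1 (Letter) SELECT DISTINCT Letter FROM srcTable ORDER BY Letter"
)

# Step 2: child rows, with each letter decoded to its generated ID via a join.
conn.execute("""
    INSERT INTO table2 (table1_id, Value)
    SELECT t1.ID, src.Value
    FROM srcTable src
    JOIN table1 t1 ON t1.Letter = src.Letter
""")
conn.commit()

parents = conn.execute("SELECT ID, Letter FROM table1 ORDER BY ID").fetchall()
children = conn.execute(
    "SELECT table1_id, Value FROM table2 ORDER BY table1_id, Value"
).fetchall()
print(parents)
print(children)
```

Because the second insert looks the ID up by letter, it does not matter which ID the database happened to assign in the first step.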
You can use a multitable insert with a workaround to get a stable nextval from the sequence. Since nextval is evaluated for each row regardless of the when condition, it is not sufficient to use it inside values.
insert all
when rn = 1 then into l(id, val) values(seq, letter)
when rn > 0 then into t(l_id, val) values(seq, val)
with a(letter, val) as (
select 'A', 1 from dual union all
select 'A', 2 from dual union all
select 'B', 3 from dual union all
select 'B', 4 from dual union all
select 'C', 5 from dual
)
, b as (
select
a.*,
l.id as l_id,
row_number() over(partition by a.letter order by a.val asc) as rn
from a
left join l
on a.letter = l.val
)
select
b.*,
max(decode(rn, 1, coalesce(
l_id,
extractvalue(
/*Hide the reference to the sequence due to restrictions
of multitable insert*/
dbms_xmlgen.getxmltype('select l_sq.nextval as seq from dual')
, '/ROWSET/ROW/SEQ/text()'
) + 0
))
) over(partition by b.letter) as seq
from b
select *
from l
ID | VAL
-: | :--
1 | A
2 | B
3 | C
select *
from t
L_ID | VAL
---: | --:
1 | 1
1 | 2
2 | 3
2 | 4
3 | 5
Essentially, you need to produce an ID value in table1 that can then be inserted into table2. For this, you can use an INSERT ... RETURNING ID INTO v_id statement after creating the tables with some constraints, especially uniqueness constraints such as PRIMARY KEY and UNIQUE:
CREATE TABLE table1( ID INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, letter VARCHAR2(1) NOT NULL );
ALTER TABLE table1 ADD CONSTRAINT uk_tab1_letter UNIQUE(letter);
CREATE TABLE table2( ID INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, table1_id INT, value INT );
ALTER TABLE table2 ADD CONSTRAINT fk_tab2_tab1_id FOREIGN KEY(table1_id) REFERENCES table1 (ID);
and add exception handling so that repeated letters are not inserted into the first table. Then use the following code block:
DECLARE
v_id table1.id%TYPE;
v_letter table1.letter%TYPE := 'A';
v_value table2.value%TYPE := 1;
BEGIN
BEGIN
INSERT INTO table1(letter) VALUES(v_letter) RETURNING ID INTO v_id;
EXCEPTION WHEN DUP_VAL_ON_INDEX THEN NULL; -- letter already exists; ignore only the duplicate-key error
END;
INSERT INTO table2(table1_id,value) SELECT id,v_value FROM table1 WHERE letter = v_letter;
COMMIT;
END;
/
and run it again after changing the initial values of v_letter and v_value to 'A' and 2, 'B' and 1, 'B' and 2, and so on.
Alternatively you can convert the code block to a stored procedure or function such as
CREATE OR REPLACE PROCEDURE Pr_Ins_Tabs(
v_letter table1.letter%TYPE,
v_value table2.value%TYPE
) AS
v_id table1.id%TYPE;
BEGIN
BEGIN
INSERT INTO table1(letter) VALUES(v_letter) RETURNING ID INTO v_id;
EXCEPTION WHEN DUP_VAL_ON_INDEX THEN NULL; -- letter already exists; ignore only the duplicate-key error
END;
INSERT INTO table2(table1_id,value) SELECT id,v_value FROM table1 WHERE letter = v_letter;
COMMIT;
END;
/
so that it can be invoked repeatedly, such as
BEGIN
Pr_Ins_Tabs('A',2);
END;
/
PS. If your DB version is prior to 12c, create sequences (seq1 and seq2) and use seq1.nextval and seq2.nextval within the INSERT statements, since the GENERATED ALWAYS AS IDENTITY clause is not available in table creation DDL before 12c.
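The flow of this answer (try to insert the parent row, ignore a duplicate, then insert the child row by looking the parent ID up) can be sketched outside Oracle as well. The following is only an illustration using Python's built-in sqlite3 module; the function name ins_tabs is hypothetical, and sqlite3.IntegrityError plays the role of the duplicate-key exception.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Oracle.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, letter TEXT NOT NULL UNIQUE);
    CREATE TABLE table2 (
        id INTEGER PRIMARY KEY,
        table1_id INTEGER REFERENCES table1 (id),
        value INTEGER
    );
""")


def ins_tabs(letter, value):
    """Insert the parent row if it is missing, then the child row (hypothetical helper)."""
    try:
        conn.execute("INSERT INTO table1 (letter) VALUES (?)", (letter,))
    except sqlite3.IntegrityError:
        # The letter already exists; only this unique-constraint error is ignored.
        pass
    # Look the parent ID up by letter, so it works for new and existing letters.
    conn.execute(
        "INSERT INTO table2 (table1_id, value) SELECT id, ? FROM table1 WHERE letter = ?",
        (value, letter),
    )
    conn.commit()


for letter, value in [("A", 1), ("A", 2), ("B", 3), ("B", 4)]:
    ins_tabs(letter, value)

parents = conn.execute("SELECT id, letter FROM table1 ORDER BY id").fetchall()
children = conn.execute("SELECT table1_id, value FROM table2 ORDER BY id").fetchall()
print(parents)
print(children)
```

Catching only the uniqueness error, rather than every exception, keeps unrelated failures (for example a NOT NULL violation) visible.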
I am trying to compare two columns and find the records for which no record with those two columns reversed exists. In other words, I am looking for instances where 1->3 exists but 3->1 does not. If 1->2 and 2->1 both exist, we will still consider 1 to be part of the results.
Table = Betweens
start_id | end_id
1 | 2
2 | 1
1 | 3
1 would be added because it is the start of a pair (1, 3) whose opposite (3, 1) is not present. It is only added because of the third entry, though, since 1 and 2 have an opposite.
So, eventually it will just return names where the reversal does not exist.
I then want to join another table that holds the name for each number from the previous step.
Table = Names
id | name
1 | Mars
2 | Earth
3 | Jupiter
So results will just be the names of those that don't have an opposite.
You can use a not exists condition:
select t1.start_id, t1.end_id
from the_table t1
where not exists (select *
from the_table t2
where t2.end_id = t1.start_id
and t2.start_id = t1.end_id);
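The query above returns ids only. As a sketch of how it could be combined with the Names table from the question, the following runs the same NOT EXISTS condition and joins on start_id. It is an illustration using Python's built-in sqlite3 module; joining on start_id (rather than end_id or both) is an assumption based on the question's statement that 1 is the expected result.

```python
import sqlite3

# In-memory SQLite database; table and column names are taken from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Betweens (start_id INTEGER, end_id INTEGER);
    INSERT INTO Betweens VALUES (1, 2), (2, 1), (1, 3);

    CREATE TABLE Names (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO Names VALUES (1, 'Mars'), (2, 'Earth'), (3, 'Jupiter');
""")

unmatched_names = conn.execute("""
    SELECT DISTINCT n.name
    FROM Betweens b1
    JOIN Names n ON n.id = b1.start_id        -- assumption: report the start of the pair
    WHERE NOT EXISTS (SELECT 1
                      FROM Betweens b2
                      WHERE b2.start_id = b1.end_id
                        AND b2.end_id = b1.start_id)
    ORDER BY n.name
""").fetchall()

print(unmatched_names)
```

Only the pair (1, 3) lacks a reversed counterpart, so the start of that pair is the single name reported; DISTINCT avoids repeating a name that starts several unmatched pairs.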
I'm not sure about your data volume, but the query below should give the desired result in SQL Server.
create table TableBetweens
(start_id INT,
end_id INT
)
INSERT INTO TableBetweens VALUES(1,2)
INSERT INTO TableBetweens VALUES(2,1)
INSERT INTO TableBetweens VALUES(1,3)
create table TableNames
(id INT,
NAME VARCHAR(50)
)
INSERT INTO TableNames VALUES(1,'Mars')
INSERT INTO TableNames VALUES(2,'Earth')
INSERT INTO TableNames VALUES(3,'Jupiter')
SELECT *
FROM TableNames c
WHERE c.id IN (
SELECT nameid1.nameid
FROM (SELECT a.start_id, a.end_id
FROM TableBetweens a
LEFT JOIN TableBetweens b
ON a.start_id = b.end_id AND a.end_id = b.start_id
WHERE b.end_id IS NULL
AND b.start_id IS NULL) filterData
UNPIVOT
(
nameid
FOR id IN (filterData.start_id,filterData.end_id)
) AS nameid1
)
I think this requirement is rarely encountered, so I couldn't find any similar questions.
I have a table that needs to update the ID. For example ID 123 in table1 is actually supposed to be 456. I have a separate reference table built that stores the mapping (e.g. old 123 maps to new id 456).
I used the below query but apparently it returned error 38104, columns referenced in the ON clause cannot be updated.
MERGE INTO table1
USING ref_table ON (table1.ID = ref_table.ID_Old)
WHEN MATCHED THEN UPDATE SET table1.ID = ref_table.ID_New;
Is there another way to achieve this?
Thanks and much appreciated for your answer!
Use the ROWID pseudocolumn:
Oracle 11g R2 Schema Setup:
CREATE TABLE TABLE1( ID ) AS
SELECT 1 FROM DUAL UNION ALL
SELECT 2 FROM DUAL UNION ALL
SELECT 3 FROM DUAL;
CREATE TABLE REF_TABLE( ID_OLD, ID_NEW ) AS
SELECT 1, 4 FROM DUAL UNION ALL
SELECT 2, 5 FROM DUAL;
MERGE INTO TABLE1 dst
USING ( SELECT t.ROWID AS rid,
r.id_new
FROM TABLE1 t
INNER JOIN REF_TABLE r
ON ( t.id = r.id_old ) ) src
ON ( dst.ROWID = src.RID )
WHEN MATCHED THEN
UPDATE SET id = src.id_new;
Query 1:
SELECT * FROM table1
Results:
| ID |
|----|
| 4 |
| 5 |
| 3 |
You can't update a column used in the ON clause of a MERGE. But if you don't need the other things MERGE allows, such as WHEN NOT MATCHED or deleting, you can just use an UPDATE to achieve this.
You mentioned this is an ID that needs an update. Here's an example using a scalar subquery. As it is an ID, this presumes unique ID_OLD values in REF_TABLE. I wasn't sure whether every row needs an update or only a subset, so the update here only touches rows that have a value in REF_TABLE.
CREATE TABLE TABLE1(
ID NUMBER
);
CREATE TABLE REF_TABLE(
ID_OLD NUMBER,
ID_NEW NUMBER
);
INSERT INTO TABLE1 VALUES (1);
INSERT INTO TABLE1 VALUES (2);
INSERT INTO TABLE1 VALUES (100);
INSERT INTO REF_TABLE VALUES (1,10);
INSERT INTO REF_TABLE VALUES (2,20);
Initial State:
SELECT * FROM TABLE1;
ID
1
2
100
Then make the UPDATE
UPDATE TABLE1
SET TABLE1.ID = (SELECT REF_TABLE.ID_NEW
FROM REF_TABLE
WHERE REF_TABLE.ID_OLD = TABLE1.ID)
WHERE TABLE1.ID IN (SELECT REF_TABLE.ID_OLD
FROM REF_TABLE);
2 rows updated.
And check the change:
SELECT * FROM TABLE1;
ID
10
20
100
I have a unique requirement to select records from one table based on another table only if the second table has at least one record with a certain flag. The query should not return two records for the same ID: example:
Table 1
id name location
4 myname MyLocation
6 hisname HisLocation
7 hername herlocation
The ids in this table are unique.
Table two
id details1 details2 closureflg
4 somdetails somedetails Y
4 somdetails somedetails Y
6 somdetails somedetails N
7 somdetails somedetails N
7 somdetails somedetail N
I need to select from the first table one record only as long as the corresponding id has records in Table 2 whose closure flag is N:
I have tried:
select * from table1 where id in(select id from table2 where closureflg = 'N');
This returns two records for id 7.
My expected output:
id name location
6 hisname HisLocation
7 hername herlocation
Please help.
Please try this one:
SELECT *
FROM table1 t1
WHERE EXISTS(SELECT * FROM table2 t2 WHERE t2.closureflg = 'N' AND t1.ID = t2.ID);
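To see why an EXISTS filter returns each id at most once, here is a small sketch on the question's data. It is only an illustration using Python's built-in sqlite3 module; the column is spelled closureflg as in the question's table, and the plain join is included purely for contrast.

```python
import sqlite3

# In-memory SQLite database loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT, location TEXT);
    INSERT INTO table1 VALUES
        (4, 'myname', 'MyLocation'),
        (6, 'hisname', 'HisLocation'),
        (7, 'hername', 'herlocation');

    CREATE TABLE table2 (id INTEGER, details1 TEXT, details2 TEXT, closureflg TEXT);
    INSERT INTO table2 VALUES
        (4, 'somdetails', 'somedetails', 'Y'),
        (4, 'somdetails', 'somedetails', 'Y'),
        (6, 'somdetails', 'somedetails', 'N'),
        (7, 'somdetails', 'somedetails', 'N'),
        (7, 'somdetails', 'somedetail', 'N');
""")

# EXISTS only asks whether at least one matching row exists,
# so each table1 row is returned at most once.
rows_exists = conn.execute("""
    SELECT t1.id, t1.name, t1.location
    FROM table1 t1
    WHERE EXISTS (SELECT 1 FROM table2 t2
                  WHERE t2.closureflg = 'N' AND t2.id = t1.id)
    ORDER BY t1.id
""").fetchall()

# A plain join, by contrast, yields one output row per matching table2 row.
rows_join = conn.execute("""
    SELECT t1.id, t1.name, t1.location
    FROM table1 t1
    JOIN table2 t2 ON t2.id = t1.id
    WHERE t2.closureflg = 'N'
    ORDER BY t1.id
""").fetchall()

print(rows_exists)
print(len(rows_join))
```

The EXISTS version gives ids 6 and 7 once each, while the join repeats id 7 because it has two rows flagged N in the second table.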
I would try it with a join
select table1.id, table1.name, table1.location from table1
join table2 on table1.id=table2.id
where table2.closureflg = 'N'
group by table1.id, table1.name, table1.location
Setup:
create table main(id integer unsigned);
create table test1(id integer unsigned);
create table test2(id integer unsigned);
insert into main(id) value(1);
insert into test1(id) value(1);
insert into test1(id) value(1);
insert into test2(id) value(1);
insert into test2(id) value(1);
insert into test2(id) value(1);
Using:
select main.id,
count(test1.id),
count(test2.id)
from main
left join test1 on main.id=test1.id
left join test2 on main.id=test2.id
group by main.id;
...returns:
+------+-----------------+-----------------+
| id | count(test1.id) | count(test2.id) |
+------+-----------------+-----------------+
| 1 | 6 | 6 |
+------+-----------------+-----------------+
How to get the desired result of 1 2 3?
EDIT
The solution should be extensible, since I'm going to query multiple count() values for main.id in the future.
Not optimal, but works:
select
main.id,
(select count(*) from test1 where test1.id = main.id) as test1_count,
(select count(*) from test2 where test2.id = main.id) as test2_count
from main
You created tables that contain the following:
Table main
id
----
1
Table test1
id
----
1
1
Table test2
id
----
1
1
1
When you join these tables the way you do, you get the following:
id id id
-----------
1 1 1
1 1 1
1 1 1
1 1 1
1 1 1
1 1 1
So how should SQL answer differently?
You can call:
SELECT id,COUNT(id) FROM main GROUP BY id
for every table, then join them by id.
Not sure if this works in MySQL exactly as written (I'm using Oracle):
select main.id, t1.rowcount, t2.rowcount
from main
left join (select id,count(*) rowcount from test1 group by id) t1
on t1.id = main.id
left join (select id,count(*) rowcount from test2 group by id) t2
on t2.id = main.id;
ID ROWCOUNT ROWCOUNT
1 2 3
You're inadvertently creating a Cartesian product between test1 and test2, so every matching row in test1 is combined with every matching row in test2. The result of both counts, therefore, is the count of matching rows in test1 multiplied by the count of matching rows in test2.
This is a common SQL antipattern. A lot of people have this problem, because they think they have to get both counts in a single query.
Some other folks on this thread have suggested ways of compensating for the Cartesian product through creative use of subqueries, but the solution is simply to run two separate queries:
select main.id, count(test1.id)
from main
left join test1 on main.id=test1.id
group by main.id;
select main.id, count(test2.id)
from main
left join test2 on main.id=test2.id
group by main.id;
You don't have to do every task in a single SQL query! Frequently it's easier to code -- and easier for the RDBMS to execute -- multiple simpler queries.
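The multiplication effect described in this answer, and the two-query fix, can be reproduced with a short sketch. It is an illustration only, using Python's built-in sqlite3 module instead of MySQL; the alias m for the main table is an addition for readability.

```python
import sqlite3

# In-memory SQLite database with the same rows as the question's setup.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE main (id INTEGER);
    CREATE TABLE test1 (id INTEGER);
    CREATE TABLE test2 (id INTEGER);
    INSERT INTO main VALUES (1);
    INSERT INTO test1 VALUES (1), (1);
    INSERT INTO test2 VALUES (1), (1), (1);
""")

# Joining both child tables at once pairs every test1 row with every test2 row
# (2 x 3 = 6 combined rows), so both counts are inflated.
inflated = conn.execute("""
    SELECT m.id, COUNT(test1.id), COUNT(test2.id)
    FROM main AS m
    LEFT JOIN test1 ON m.id = test1.id
    LEFT JOIN test2 ON m.id = test2.id
    GROUP BY m.id
""").fetchall()

# Two separate queries: each one joins a single child table, so nothing multiplies.
count_test1 = conn.execute("""
    SELECT m.id, COUNT(test1.id)
    FROM main AS m
    LEFT JOIN test1 ON m.id = test1.id
    GROUP BY m.id
""").fetchall()

count_test2 = conn.execute("""
    SELECT m.id, COUNT(test2.id)
    FROM main AS m
    LEFT JOIN test2 ON m.id = test2.id
    GROUP BY m.id
""").fetchall()

print(inflated)
print(count_test1, count_test2)
```

The combined query reports 6 for both counts, while the separate queries report 2 and 3, which are the values the question asks for.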
You can get the desired result by using:
SELECT COUNT(*) as main_count,
(SELECT COUNT(*) FROM test1) as test1Count,
(SELECT COUNT(*) from test2) as test2Count FROM main