Insert/join table on multiple conditions - sql

I’ve a table that looks like this:
Table A
Version,id
5060586,22285
5074515,22701
5074515,22285
7242751,22701
7242751,22285
I want to generate a new key called groupId, filled in as in the example below:
Table A
Version,id,groupId
5060586,22285,1
5074515,22701,2
5074515,22285,2
7242751,22701,2
7242751,22285,2
I want the groupId to be the same as long as the ids are the same across versions. For example, versions 5074515 and 7242751 have the same ids, so their groupId is the same. If the ids are not all the same, a new groupId should be assigned, as for version 5060586.
How can I solve this in Oracle SQL?

One approach is to create a unique value representing the set of ids in each version, then assign a groupid to the unique values of that, then join back to the original data.
INSERT ALL
INTO t (version,id) VALUES (5060586,22285)
INTO t (version,id) VALUES (5074515,22701)
INTO t (version,id) VALUES (5074515,22285)
INTO t (version,id) VALUES (7242751,22701)
INTO t (version,id) VALUES (7242751,22285)
SELECT 1 FROM dual;
WITH groups
AS
(
SELECT version
, LISTAGG(id,',') WITHIN GROUP (ORDER BY id) AS group_text
FROM t
GROUP BY version
),
groupids
AS
(
SELECT group_text, ROW_NUMBER() OVER (ORDER BY group_text) AS groupid
FROM groups
GROUP BY group_text
)
SELECT t.*, groupids.groupid
FROM t
INNER JOIN groups ON t.version = groups.version
INNER JOIN groupids ON groups.group_text = groupids.group_text;
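The three steps (one signature per version, one number per distinct signature, join back) can be checked outside the database. The following is a minimal Python sketch of the same idea, not the Oracle query itself; the function name assign_group_ids is made up for illustration, and a sorted tuple stands in for the LISTAGG text.

```python
from collections import defaultdict

# Sample rows from the question: (version, id).
rows = [
    (5060586, 22285),
    (5074515, 22701),
    (5074515, 22285),
    (7242751, 22701),
    (7242751, 22285),
]


def assign_group_ids(rows):
    """Return (version, id, group_id); versions with equal id sets share a group."""
    # Step 1: build one signature per version (the role LISTAGG plays in the SQL).
    ids_by_version = defaultdict(set)
    for version, id_ in rows:
        ids_by_version[version].add(id_)
    signature = {v: tuple(sorted(ids)) for v, ids in ids_by_version.items()}

    # Step 2: number the distinct signatures (the role ROW_NUMBER plays in the SQL).
    distinct_signatures = sorted(set(signature.values()))
    group_of = {sig: n for n, sig in enumerate(distinct_signatures, start=1)}

    # Step 3: join the group id back onto every original row.
    return [(v, i, group_of[signature[v]]) for v, i in rows]


result = assign_group_ids(rows)
for row in result:
    print(row)
```

On the sample data this gives group 1 to version 5060586 and group 2 to the other two versions, matching the expected output in the question.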

You can use:
UPDATE tableA t
SET group_id = ( SELECT COUNT(DISTINCT id)
FROM TableA x
WHERE x.Version <= t.version );
Which, for the sample data:
CREATE TABLE TableA (
Version NUMBER,
id NUMBER,
group_id NUMBER
);
INSERT INTO TableA (Version, id)
SELECT 5060586,22285 FROM DUAL UNION ALL
SELECT 5074515,22701 FROM DUAL UNION ALL
SELECT 5074515,22285 FROM DUAL UNION ALL
SELECT 7242751,22701 FROM DUAL UNION ALL
SELECT 7242751,22285 FROM DUAL;
Then, after the update:
SELECT * FROM tablea;
Outputs:
VERSION ID GROUP_ID
5060586 22285 1
5074515 22701 2
5074515 22285 2
7242751 22701 2
7242751 22285 2


How to ROWCOUNT_BIG() value with union all

I have the following query in SQL Server. How do I get the number of rows returned by the preceding SELECT, in the following format?
Sample Query
select ID, Name FROM Branch
UNION ALL
SELECT ROWCOUNT_BIG(), ''
Sample Output
If you use a CTE you can count the rows and union all together:
with cte as (
select ID, [Name]
from dbo.Branch
)
select ID, [Name]
from cte
union all
select count(*) + 1, ''
from cte;
I think you want to see the total row count of the SELECT statement. You can do it this way:
CREATE TABLE #test (id int)
insert into #test(id)
SELECT 1
SELECT id from #test
union all
SELECT rowcount_big()
Note: Here, the ID will be implicitly converted to BIGINT datatype, based on the datatype precedence.
Presumably, you are running this in some sort of application. So why not use @@ROWCOUNT?
select id, name
from . . .;
select rowcount_big(); -- use @@rowcount instead if you do not need a bigint
I don't see the value in including the count in the same query. However, if the underlying query is an aggregation query, there might be a way to do this using GROUPING SETS.
Here are two ways. It's better to use a CTE to define the row set so further table inserts don't interfere with the count. Since you're using ROWCOUNT_BIG() these queries use COUNT_BIG() (which also returns bigint) to count the inserted rows. In order to make sure the total always appears as the last row an 'order_num' column was added to the SELECT list and ORDER BY clause.
drop table if exists #tTest;
go
create table #tTest(
ID int not null,
[Name] varchar(10) not null);
insert into #tTest values
(115, 'Joe'),
(116, 'Jon'),
(117, 'Ron');
/* better to use a CTE to define the row set */
with t_cte as (
select *
from #tTest)
select 1 as order_num, ID, [Name]
from t_cte
union all
select 2 as order_num, count_big(*), ''
from t_cte
order by order_num, ID;
/* 2 separate queries could give inconsistent result if table is inserted into */
select 1 as order_num, ID, [Name]
from #tTest
union all
select 2 as order_num, count_big(*), ''
from #tTest
order by order_num, ID;
Both return
order_num ID Name
1 115 Joe
1 116 Jon
1 117 Ron
2 3

Inserting unique value from another table

Tables: I have 3 tables: cust, new_cust, and old_cust.
All of them have the same 3 columns: id, username, name.
Any of them may contain the same data as the others.
I would like to build a "whole" table that combines all of them but keeps only the unique rows.
I've Tried
Creating a dummy table
I created a dummy table called "Temp" with:
select *
into Temp
from cust
Insert all tables into the dummy table
Then I inserted the other tables into the Temp table:
insert into temp
select * from new_cust
insert into temp
select * from old_cust
Taking the unique rows using distinct
After they were all merged, I used distinct to keep only the unique id values:
select distinct(id), username, fullname
into Whole
from temp
This did remove some rows.
Result
But after moving the data to the Whole table, I tried to put a primary key on id and got a message that there are duplicate values. Is there any other way?
I am guessing that you want unique ids. And you want these prioritized by the tables in some order. If so, you can do this with union all and row_number():
select id, username, name
from (select c.*,
row_number() over (partition by id order by priority) as seqnum
from ((select id, username, name, 1 as priority
from new_cust
) union all
(select id, username, name, 2 as priority
from cust
) union all
(select id, username, name, 3 as priority
from old_cust
)
) c
) c
where seqnum = 1;
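To see the priority rule in action, here is a small sketch that runs the same union all / row_number() pattern through Python's built-in sqlite3 module (SQLite 3.25 or later is assumed for window functions). The sample rows are invented for illustration; the question gives no data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cust     (id INTEGER, username TEXT, name TEXT);
    CREATE TABLE new_cust (id INTEGER, username TEXT, name TEXT);
    CREATE TABLE old_cust (id INTEGER, username TEXT, name TEXT);
    -- Made-up rows: id 1 appears in all three tables with a different name each time.
    INSERT INTO cust     VALUES (1, 'ann', 'Ann C'), (2, 'bob', 'Bob C');
    INSERT INTO new_cust VALUES (1, 'ann', 'Ann N'), (3, 'cat', 'Cat N');
    INSERT INTO old_cust VALUES (1, 'ann', 'Ann O'), (4, 'dan', 'Dan O');
""")

# Same shape as the query above: tag each source with a priority,
# number the rows per id by priority, and keep the first one.
unique_rows = conn.execute("""
    SELECT id, username, name
    FROM (SELECT c.*,
                 ROW_NUMBER() OVER (PARTITION BY id ORDER BY priority) AS seqnum
          FROM (SELECT id, username, name, 1 AS priority FROM new_cust
                UNION ALL
                SELECT id, username, name, 2 AS priority FROM cust
                UNION ALL
                SELECT id, username, name, 3 AS priority FROM old_cust) AS c
         ) AS ranked
    WHERE seqnum = 1
    ORDER BY id
""").fetchall()

for row in unique_rows:
    print(row)
```

Each id comes back exactly once (id 1 takes its row from new_cust), so a primary key on id would be accepted for this result.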
Try this:
insert into temp
select * from new_cust
UNION
select * from old_cust
UNION removes rows that are exact duplicates across all columns, so you can then create a primary key on the ID column, as long as no id appears with a different username or name.
Try the query below:
WITH cte as (
-- columns are qualified with t1 because all three tables share the same column names
SELECT t1.id, t1.username, t1.name,
ROW_NUMBER() OVER (PARTITION BY t1.id ORDER BY t1.username, t1.name ) AS rn
FROM cust t1
LEFT JOIN new_cust t2 ON t1.Id = t2.Id
LEFT JOIN old_cust t3 ON t2.Id = t3.Id
)
SELECT id, username, name
FROM cte
WHERE rn = 1
Note:
Put the whole query inside a CTE (common table expression) with a new column (rn) that you use to filter the results.
The new column is produced by ROW_NUMBER() ... PARTITION BY id, so it numbers the rows within each id, and rn = 1 keeps one row per id.
"But after I move it to whole table I would like to put primary key on
id but I got the message that there are some duplicate values."
That's because you are inserting the ID values from each of the tables into the Whole table.
Just insert username and name and skip ID. ID should be an IDENTITY column, and it must be unique.
Run this on your current Whole table to see whether any username appears more than once:
select COUNT(ID), username
from whole
GROUP BY username
HAVING COUNT(ID) > 1
To get unique customers, recreate the Whole table and make the ID column an IDENTITY:
IF OBJECT_ID ('dbo.Whole') IS NOT NULL DROP TABLE dbo.Whole;
CREATE TABLE Whole (ID INT NOT NULL IDENTITY(1,1), Name varchar(max), Username varchar(max))
Insert values into Whole table:
INSERT INTO Whole
SELECT Name, Username FROM cust
UNION
SELECT Name, Username FROM new_cust
UNION
SELECT Name, Username FROM old_cust
Make ID col PK.
What does "unique" mean for your rows?
If it is only the username, and you don't care about keeping the old ID values,
this will favor the new_cust data over the old_cust data.
SELECT
ID = ROW_NUMBER() OVER (ORDER BY all_temp.username)
, all_temp.*
INTO dbo.Temp
FROM
(
SELECT nc.username, nc.[name] FROM new_cust AS nc
UNION
SELECT oc.username, oc.[name]
FROM old_cust AS oc
WHERE oc.username NOT IN (SELECT nc1.username FROM new_cust AS nc1) --remove the where part if needed
) AS all_temp
ALTER TABLE dbo.Temp ALTER COLUMN ID INTEGER NOT NULL
ALTER TABLE dbo.Temp ADD PRIMARY KEY (ID)
If by Unique you mean both the username and name then just remove the where part in the union

Make value from every second row appear in new 3rd column

Let's assume my data looks like this:
Every second row represents old (previous value) in a table that holds historical data.
table 1 :
id value
------------
1 a
1 b
2 c
2 d
3 a
3 b
and I want the value of every second row to appear in a new third column, like this:
table 2:
id new_value old_value
------------------------
1 a b
2 c d
3 a b
EDIT:
For clarity, I'll post the skeleton of the query that produces the data I want to transform (so it's clear that I am already using WITH and can't add another one, since Oracle does not yet allow nesting of WITH clauses):
Skeleton code that produces the data in table 1:
with candidates as
(
--select list of candidates
)
SELECT * FROM
(
(
--select new values
MINUS
--select old values
)
UNION
(
--select old values
MINUS
--select new values
)
)
ORDER BY id;
The goal is to finally get only a list of ids that changed with their old and new values.
Thanks in advance.
Use a CTE:
;WITH CTE AS(
SELECT *, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) RN
FROM TableName
)
SELECT ID,
MIN(CASE WHEN RN=1 THEN [value] END) NewValue,
MIN(CASE WHEN RN=2 THEN [value] END) OldValue
FROM CTE
GROUP BY ID
It is quite possible that the overall query can be written in a much simpler way. Just join the intermediate results with old and new values together on id to put them in two different columns, instead of unioning them into the same column.
WITH
candidates
AS
(
--select list of candidates
)
,CTE_NewValues
AS
(
--select new values
select id, value AS new_value
FROM candidates
WHERE ...
-- assumes id is unique, one row per id
)
,CTE_OldValues
AS
(
--select old values
select id, value AS old_value
FROM candidates
WHERE ...
-- assumes id is unique, one row per id
)
SELECT
CTE_NewValues.id
,CTE_NewValues.new_value
,CTE_OldValues.old_value
FROM
CTE_NewValues
INNER JOIN CTE_OldValues ON CTE_NewValues.id = CTE_OldValues.id
WHERE
CTE_NewValues.new_value <> CTE_OldValues.old_value
ORDER BY
CTE_NewValues.id;
If we stick to the skeleton of the query in the question, there are also many ways to do it. Self-join is likely to be less efficient than using analytic functions, like ROW_NUMBER and LEAD.
Sorting just by id is not enough to unambiguously define which value is new or old. You need to have some extra column to resolve it.
You don't "nest" WITH (common-table expressions), you "chain" them. Something like the following. As you do that, make sure to add the sort_order column to be able to distinguish old and new values, if you don't have a similar column already.
WITH
candidates
AS
(
--select list of candidates
)
,CTE_YourQuery
AS
(
SELECT * FROM
(
(
--select new values
select 1 AS sort_order, id, value
MINUS
--select old values
select 1 AS sort_order, id, value
)
UNION ALL
(
--select old values
select 2 AS sort_order, id, value
MINUS
--select new values
select 2 AS sort_order, id, value
)
)
)
,CTE_RowNumber
AS
(
SELECT
id
,value AS new_value
,ROW_NUMBER() OVER (PARTITION BY id ORDER BY sort_order) AS rn
,LEAD(value) OVER (PARTITION BY id ORDER BY sort_order) AS old_value
FROM CTE_YourQuery
)
SELECT
id
,new_value
,old_value
FROM CTE_RowNumber
WHERE rn = 1
ORDER BY id;
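As a quick check of the ROW_NUMBER / LEAD step, the sketch below feeds the rows of table 1 through the same window functions using Python's built-in sqlite3 module (SQLite 3.25 or later assumed). The table name changed_rows is hypothetical and stands in for the output of CTE_YourQuery, with sort_order 1 marking the new value and 2 the old one, as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Stand-in for CTE_YourQuery: sort_order 1 = new value, 2 = old value.
    CREATE TABLE changed_rows (sort_order INTEGER, id INTEGER, value TEXT);
    INSERT INTO changed_rows VALUES
        (1, 1, 'a'), (2, 1, 'b'),
        (1, 2, 'c'), (2, 2, 'd'),
        (1, 3, 'a'), (2, 3, 'b');
""")

pairs = conn.execute("""
    WITH numbered AS (
        SELECT id,
               value AS new_value,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY sort_order) AS rn,
               LEAD(value)  OVER (PARTITION BY id ORDER BY sort_order) AS old_value
        FROM changed_rows
    )
    SELECT id, new_value, old_value
    FROM numbered
    WHERE rn = 1      -- keep the row holding the new value; LEAD supplies the old one
    ORDER BY id
""").fetchall()

for row in pairs:
    print(row)
```

The result has one row per id with the new and old value side by side, which is the shape of table 2 in the question.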
Assuming there is some other column which defines the "order" in which the new and old value appears, you can do this:
select t1.id, t1.value as old_value, t2.value as new_value
from the_table t1
join the_table t2 on t1.id = t2.id and t1.sort_order < t2.sort_order
But you have to have some column that distinguishes the row that is considered "old" from the one that is considered "new".

Creating tables on-the-fly

It is often convenient in PostgreSQL to create "tables" on the fly so as to refer to them, e.g.
with
selected_ids as (
select 1 as id
)
select *
from someTable
where id = (select id from selected_ids)
Is it possible to provide multiple values as id this way? I found this answer that suggests using values for a similar problem, but I have trouble translating it to the example below.
I would like to write subqueries such as
select 1 as id
union
select 2 as id
union
select 7 as id
or
select 1 as id, 'dog' as animal
union
select 7 as id, 'cat' as animal
in more condensed way, without repeating myself.
You can name the columns in the query alias:
with selected_ids(id) as (
values (1), (3), (5)
)
select *
from someTable
where id = any (select id from selected_ids)
You can also use join instead of a subquery, example:
create table some_table (id int, str text);
insert into some_table values
(1, 'alfa'),
(2, 'beta'),
(3, 'gamma');
with selected_ids(id) as (
values (1), (2)
)
select *
from some_table
join selected_ids
using(id);
id | str
----+------
1 | alfa
2 | beta
(2 rows)
You can pass both the id and animal fields in WITH like this:
with selected_ids(id,animal) as (
values (1,'dog'), (2,'cat'), (3,'elephant'),(4,'rat')--,..,.. etc
)
select *
from someTable
where id = any (select id from selected_ids)
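The animal column only becomes useful when the inline table is joined rather than used as a filter. Below is a minimal sketch of that, run through Python's built-in sqlite3 module as a stand-in for PostgreSQL (the WITH ... VALUES form is accepted by both). The some_table contents are borrowed from the earlier answer, and the aliases s and sel are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE some_table (id INTEGER, str TEXT);
    INSERT INTO some_table VALUES (1, 'alfa'), (2, 'beta'), (3, 'gamma');
""")

rows = conn.execute("""
    WITH selected_ids(id, animal) AS (
        VALUES (1, 'dog'), (7, 'cat')
    )
    SELECT s.id, s.str, sel.animal
    FROM some_table AS s
    JOIN selected_ids AS sel ON sel.id = s.id   -- the join exposes the extra column
    ORDER BY s.id
""").fetchall()

for row in rows:
    print(row)
```

Only id 1 exists in both the real table and the inline one, so a single row comes back, carrying 'dog' from the VALUES list.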
You should use UNION and an IN condition like this:
with
selected_ids as (
select 1 as id
union
select 2 as id
union
select 3 as id
....
)
select *
from someTable
where id in (select id from selected_ids)
After reviewing wingedpanther's idea, you can use it if those ids are consecutive, like this:
with
selected_ids as (
SELECT id FROM generate_series(Start,End) AS g(id) --(1,10) for example
)
select *
from someTable
where id in (select id from selected_ids)
If they are not consecutive, another option is to store those ids in a separate table (maybe you already have one; if not, create and fill it).
And then:
select *
from someTable
where id in (select id from OtherTable)

How to add 2 temporary tables together

I am creating temporary tables that have 2 columns, id and score, and I want to add them together.
If both tables contain the same id, I do not want to duplicate the id but instead add the scores together.
Say I have 2 temp tables called t1 and t2,
and t1 had:
id 3 score 4
id 6 score 7
and t2 had:
id 3 score 5
id 5 score 2
I would end up with a new temp table containing:
id 3 score 9
id 5 score 2
id 6 score 7
The reason I want to do this is that I am trying to build a product search. I have a few algorithms I want to use, one using full-text search and another not. I want to use both, so I want to create a temporary table based on algorithm1 and a temp table based on algorithm2, then combine them.
How about:
SELECT id, SUM(score) AS score FROM (
SELECT id, score FROM t1
UNION ALL
SELECT id, score FROM t2
) t3
GROUP BY id
This is untested but you should be able to perform a union on the two tables and then perform a select on the results, grouping the fields and adding the scores
SELECT id,SUM(score) FROM
(
SELECT id,score FROM t1
UNION ALL
SELECT id,score FROM t2
) joined
GROUP BY id
Perform a full outer join on the ID. Select on the ID and the sum of the two "score" columns after coalescing the values to 0.
SELECT id, SUM(score) FROM
(
SELECT id, score FROM #t1
UNION ALL
SELECT id, score FROM #t2
) AS Temp
GROUP BY id
select id, sum(score)
from (
select * from table1
union all
select * from table2
) tables
group by id
You need to take the union of the two tables (UNION ALL, so that identical rows are not collapsed), then you can easily group the results.
SELECT id, sum(score) FROM
(
SELECT id, score FROM t1
UNION ALL
SELECT id, score FROM t2
) as tmp
GROUP BY id;
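The choice between UNION and UNION ALL matters for this sum whenever both tables hold an identical (id, score) row. The sketch below shows the difference using Python's built-in sqlite3 module; the data is the question's, plus one hypothetical extra row (6, 7) in t2 added to trigger the case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, score INTEGER);
    CREATE TABLE t2 (id INTEGER, score INTEGER);
    INSERT INTO t1 VALUES (3, 4), (6, 7);
    -- Hypothetical variation on the question's data: id 6 scores 7 in both tables.
    INSERT INTO t2 VALUES (3, 5), (5, 2), (6, 7);
""")

query = """
    SELECT id, SUM(score) AS score
    FROM (SELECT id, score FROM t1
          {op}
          SELECT id, score FROM t2) AS combined
    GROUP BY id
    ORDER BY id
"""

# UNION ALL keeps every row, so both scores for id 6 are summed.
with_union_all = conn.execute(query.format(op="UNION ALL")).fetchall()
# UNION removes the duplicate (6, 7) row before the sum is taken.
with_union = conn.execute(query.format(op="UNION")).fetchall()

print("UNION ALL:", with_union_all)
print("UNION:    ", with_union)
```

With UNION ALL, id 6 totals 14; with UNION it stays at 7 because one of the two identical rows is dropped before grouping.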