Merge rows that have the same pattern


I have to combine rows that share the same pattern into a single row, and I am not sure how to achieve it.
I tried to merge them with MAX, but since the values are strings I am not sure whether there is a way to combine them.
SELECT ID,MAX(GRP1),MAX(GRP2),MAX(GRP3),MAX(GRP)
FROM GROUP_MAPPING
GROUP BY ID
From table: (image not available)
To table: (image not available)

Using the same principle as this answer:
SELECT DISTINCT
       id,
       grp1,
       grp2,
       grp3,
       grp
FROM   table_name
WHERE  CONNECT_BY_ISLEAF = 1
CONNECT BY NOCYCLE
       PRIOR id = id
AND    grp LIKE '%' || PRIOR grp1 || '%' || PRIOR grp2 || '%' || PRIOR grp3 || '%';
or
SELECT DISTINCT
       id,
       grp1,
       grp2,
       grp3,
       grp
FROM   table_name
WHERE  CONNECT_BY_ISLEAF = 1
CONNECT BY NOCYCLE
       PRIOR id = id
AND    (PRIOR grp1 = grp1 OR PRIOR grp1 IS NULL)
AND    (PRIOR grp2 = grp2 OR PRIOR grp2 IS NULL)
AND    (PRIOR grp3 = grp3 OR PRIOR grp3 IS NULL);
Which, for the sample data:
CREATE TABLE table_name (id, grp1, grp2, grp3, grp) AS
SELECT 'DEPT1', 10, 'AA', 'S1', '10AAAS1' FROM DUAL UNION ALL
SELECT 'DEPT1', NULL, 'AA', 'S1', 'AAS1' FROM DUAL UNION ALL
SELECT 'DEPT1', 10, NULL, NULL, '10' FROM DUAL UNION ALL
SELECT 'DEPT1', 10, 'BB', NULL, '10BB' FROM DUAL UNION ALL
SELECT 'DEPT1', NULL, 'BB', NULL, 'BB' FROM DUAL;
Both output:
ID    | GRP1 | GRP2 | GRP3 | GRP
:---- | ---: | :--- | :--- | :------
DEPT1 |   10 | AA   | S1   | 10AAAS1
DEPT1 |   10 | BB   | null | 10BB
db<>fiddle here

Related

Oracle SQL Uniquely Update Duplicate Records

I have a STUDENT table and need to update the STUDENT_ID values by prefixing each one with the letters SS. For duplicate STUDENT_ID records, the duplicates should instead be prefixed with SS1, SS2, and so on. Below is an example.
Before Update:
NUM | STUDENT_ID
--: | :---------
  1 | 9234
  2 | 9234
  3 | 9234
  4 | 3456
  5 | 3456
  6 | 789
  7 | 956
After Update:
NUM | STUDENT_ID
--: | :---------
  1 | SS9234
  2 | SS19234
  3 | SS29234
  4 | SS3456
  5 | SS13456
  6 | SS789
  7 | SS956
Below is the query for updating the STUDENT_ID for unique records.
update student set student_id = 'SS'||student_id ;
commit;
I need suggestions for updating STUDENT_ID for the duplicate records. There are around 1 million duplicate records in the table, and the total volume is around 40 million rows. I would appreciate any input on improving performance.

You can use a MERGE statement correlated on the ROWID pseudo-column and using the ROW_NUMBER() analytic function:
MERGE INTO table_name dst
USING (
  SELECT ROWID AS rid,
         ROW_NUMBER() OVER (PARTITION BY student_id ORDER BY num) AS rn
  FROM   table_name
) src
ON (src.rid = dst.ROWID)
WHEN MATCHED THEN
  UPDATE
  SET student_id = 'SS' || CASE WHEN rn > 1 THEN rn - 1 END || dst.student_id;
Which, for the sample data:
CREATE TABLE table_name (NUM, STUDENT_ID) AS
SELECT 1, CAST('9234' AS VARCHAR2(20)) FROM DUAL UNION ALL
SELECT 2, '9234' FROM DUAL UNION ALL
SELECT 3, '9234' FROM DUAL UNION ALL
SELECT 4, '3456' FROM DUAL UNION ALL
SELECT 5, '3456' FROM DUAL UNION ALL
SELECT 6, '789' FROM DUAL UNION ALL
SELECT 7, '956' FROM DUAL;
Then after the MERGE the table contains:
NUM | STUDENT_ID
--: | :---------
  1 | SS9234
  2 | SS19234
  3 | SS29234
  4 | SS3456
  5 | SS13456
  6 | SS789
  7 | SS956
fiddle

I'm sure there must be a better way, but this query gets the job done:
update t
set student_id = (
  select new_student_id
  from (
    select x.*,
           -- first row of each group gets no number; duplicates get 1, 2, ...
           'SS' || case when rn > 1 then rn - 1 end || student_id as new_student_id
    from (
      select t.*, row_number() over(partition by student_id order by num) as rn
      from t
    ) x
  ) y
  where t.num = y.num
)
Result:
NUM STUDENT_ID
---- ----------
1 SS9234
2 SS19234
3 SS29234
4 SS3456
5 SS13456
6 SS789
7 SS956
See running example at db<>fiddle.

Maybe you could do it without updating at all. I would probably try to build a new table instead:
CREATE TABLE NEW_TABLE AS
SELECT [do the "update" here] FROM OLD_TABLE;
- add indexes on the new table
- add constraints on the new table
- add anything else you need on the new table (foreign keys, grants...)
and then
DROP TABLE OLD_TABLE;
-- and
RENAME NEW_TABLE TO OLD_TABLE;
A SELECT that does this for your sample data:
WITH
tbl as
(
Select 1 "NUM", 9234 "STUDENT_ID" From Dual Union All
Select 2 "NUM", 9234 "STUDENT_ID" From Dual Union All
Select 3 "NUM", 9234 "STUDENT_ID" From Dual Union All
Select 4 "NUM", 3456 "STUDENT_ID" From Dual Union All
Select 5 "NUM", 3456 "STUDENT_ID" From Dual Union All
Select 6 "NUM", 789 "STUDENT_ID" From Dual Union All
Select 7 "NUM", 956 "STUDENT_ID" From Dual
)
Select
NUM,
CASE WHEN Count(NUM) Over(Partition By STUDENT_ID) = 1 THEN 'SS' || STUDENT_ID
ELSE 'SS' || NullIf(Sum(1) Over(Partition By STUDENT_ID Order By NUM) - 1, 0) || STUDENT_ID
END "STUDENT_ID"
From
tbl
Order By NUM
Result:
NUM | STUDENT_ID
--: | :---------
  1 | SS9234
  2 | SS19234
  3 | SS29234
  4 | SS3456
  5 | SS13456
  6 | SS789
  7 | SS956
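
To make the placeholder in the CREATE TABLE ... AS SELECT approach concrete, here is a minimal sketch. It assumes the source table is named student (as in the question) and uses a hypothetical target name, student_new; it also assumes student_id can be concatenated as text. Treat it as an illustration of the idea rather than a tuned solution for 40 million rows.

```sql
-- Hypothetical target table student_new; student is the table from the question.
CREATE TABLE student_new AS
SELECT num,
       'SS'
       -- NULL (which concatenates as nothing) for the first row of each
       -- student_id, then 1, 2, ... for the duplicates.
       || CASE
            WHEN ROW_NUMBER() OVER (PARTITION BY student_id ORDER BY num) > 1
            THEN ROW_NUMBER() OVER (PARTITION BY student_id ORDER BY num) - 1
          END
       || student_id AS student_id
FROM   student;
```

Indexes, constraints and grants would still have to be recreated on the new table before the drop and rename steps described above.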

Spaces in the result of a LISTAGG

Hello, I have two tables:
table1: IDSPUBPIPE
ID; ID1
1;1
1;2
1;3
2;1
3;3
4;1
5;2
6;1
table2:IDSPUBCIRCUIT
ID;NOM
1; test1
2; test2
3; test3
4; test4
5; test5
6; test6
Expected result:
ID;ID1;nom
1;1,2,4,6;test1,test2,test4,test6
2;1,5;test1,test5
Actual result:
ID;ID1;nom
1;1,2,4,6;t e s t 1 , t e s t 2 , t e s t 4 , t e s t 6
2;1,5;t e s t 1 , t e s t 5
select cast(pipeci.ID as numeric) as ID,
cast(pipeci.ID1 as numeric) as ID1,
cast(RSF_cir.ID as numeric) as ID_circuit,
rsf_cir.NOM,
LISTAGG(RSF_cir.NOM, '; ') WITHIN GROUP (ORDER BY pipeci.ID1, RSF_cir.NOM)
OVER (PARTITION BY pipeci.ID1) as Emp_list,
count(RSF_cir.ID) over(partition by pipeci.ID1) as NB_circuit
FROM IDSPUBPIPE pipeci
LEFT JOIN IDSPUBCIRCUIT RSF_cir ON pipeci.ID=RSF_cir.ID
I don't understand what causes the spaces between the letters, and I can't find a solution.
Thank you in advance for any leads.
[screenshot][1]
[1]: https://i.stack.imgur.com/PipH1.jpg

You can use:
SELECT p.ID1,
LISTAGG( r.ID, ',' ) WITHIN GROUP ( ORDER BY r.NOM ) as ID_circuit,
LISTAGG( r.NOM, ',' ) WITHIN GROUP ( ORDER BY r.NOM ) AS Emp_list
FROM IDSPUBPIPE p
LEFT OUTER JOIN IDSPUBCIRCUIT r
ON ( p.ID = r.ID )
GROUP BY p.ID1
So, for your test data:
CREATE TABLE IDSPUBPIPE ( ID, ID1 ) AS
SELECT 1, 1 FROM DUAL UNION ALL
SELECT 1, 2 FROM DUAL UNION ALL
SELECT 1, 3 FROM DUAL UNION ALL
SELECT 2, 1 FROM DUAL UNION ALL
SELECT 3, 3 FROM DUAL UNION ALL
SELECT 4, 1 FROM DUAL UNION ALL
SELECT 5, 2 FROM DUAL UNION ALL
SELECT 6, 1 FROM DUAL;
CREATE TABLE IDSPUBCIRCUIT ( ID, NOM ) AS
SELECT 1, 'test1' FROM DUAL UNION ALL
SELECT 2, 'test2' FROM DUAL UNION ALL
SELECT 3, 'test3' FROM DUAL UNION ALL
SELECT 4, 'test4' FROM DUAL UNION ALL
SELECT 5, 'test5' FROM DUAL UNION ALL
SELECT 6, 'test6' FROM DUAL;
This outputs:
ID1 | ID_CIRCUIT | EMP_LIST
--: | :--------- | :----------------------
1 | 1,2,4,6 | test1,test2,test4,test6
2 | 1,5 | test1,test5
3 | 1,3 | test1,test3
db<>fiddle here
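
The query above does not say where the spaces come from. One commonly reported cause of this symptom is that the aggregated column is NVARCHAR2 (national character set) rather than VARCHAR2. That is only an assumption here, since the question does not show the column types. If it applies, a sketch of a workaround is to cast NOM before aggregating; the length 100 is an arbitrary illustrative choice.

```sql
-- Assumes IDSPUBCIRCUIT.NOM is NVARCHAR2; VARCHAR2(100) is an arbitrary length.
SELECT p.ID1,
       LISTAGG( CAST( r.NOM AS VARCHAR2(100) ), ',' )
         WITHIN GROUP ( ORDER BY r.NOM ) AS Emp_list
FROM   IDSPUBPIPE p
       LEFT OUTER JOIN IDSPUBCIRCUIT r
       ON ( p.ID = r.ID )
GROUP BY p.ID1;
```

If NOM is already VARCHAR2, the cast changes nothing and the cause lies elsewhere.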

Find rows with consecutive ones

I have two integer columns and need to display the rows with consecutive ones in the NUM column.
Sample data:
CREATE TABLE table_name ( ID, NUM ) AS
SELECT 1, 1 FROM DUAL UNION ALL
SELECT 2, 1 FROM DUAL UNION ALL
SELECT 3, 1 FROM DUAL UNION ALL
SELECT 4, 2 FROM DUAL UNION ALL
SELECT 5, 1 FROM DUAL UNION ALL
SELECT 6, 2 FROM DUAL UNION ALL
SELECT 7, 2 FROM DUAL;
Expected Output:
ID NUM
-- ---
1 1
2 1
3 1

I have tried using self-joins and achieved the result:
WITH TAB (ID, NUM) AS
(
SELECT 1, 1 FROM DUAL UNION ALL
SELECT 2, 1 FROM DUAL UNION ALL
SELECT 3, 1 FROM DUAL UNION ALL
SELECT 4, 2 FROM DUAL UNION ALL
SELECT 5, 1 FROM DUAL UNION ALL
SELECT 6, 2 FROM DUAL UNION ALL
SELECT 7, 2 FROM DUAL
)
SELECT DISTINCT
T.ID,
T.NUM
FROM
TAB T
JOIN (
SELECT
T1.ID ID1,
T2.ID ID2,
T1.NUM,
COUNT(1) OVER(
PARTITION BY T1.NUM
) RN
FROM
TAB T1
JOIN TAB T2 ON ( T1.NUM = T2.NUM
AND T1.ID = T2.ID + 1 )
) T_IN ON ( ( T.ID = T_IN.ID1
OR T.ID = T_IN.ID2 )
AND T.NUM = T_IN.NUM
AND RN >= 2 ) -- THIS CONDITION IS TO RESTRICT CONSECUTIVES LESS THAN 3
ORDER BY
1
output:
db<>fiddle demo

Use analytic functions LAG or LEAD:
Oracle Setup:
CREATE TABLE table_name ( ID, NUM ) AS
SELECT 1, 1 FROM DUAL UNION ALL
SELECT 2, 1 FROM DUAL UNION ALL
SELECT 3, 1 FROM DUAL UNION ALL
SELECT 4, 2 FROM DUAL UNION ALL
SELECT 5, 1 FROM DUAL UNION ALL
SELECT 6, 2 FROM DUAL UNION ALL
SELECT 7, 2 FROM DUAL;
Query:
SELECT id, num
FROM   (
  SELECT id,
         num,
         LAG( num )  OVER ( ORDER BY id ) AS prev_num,
         LEAD( num ) OVER ( ORDER BY id ) AS next_num
  FROM   table_name
)
WHERE  num = 1
AND    ( num = prev_num
      OR num = next_num );
Output:
ID | NUM
-: | --:
1 | 1
2 | 1
3 | 1
db<>fiddle here
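
The LAG/LEAD query keeps every 1 that has a neighbouring 1, that is, runs of two or more. If a minimum run length is required instead (the comment in the question's query mentions three), a possible sketch is the row-number-difference technique for grouping consecutive rows. The aliases grp and run_length are names introduced here for illustration.

```sql
SELECT id, num
FROM   (
  SELECT id,
         num,
         -- number of rows in the same run of identical values
         COUNT(*) OVER ( PARTITION BY num, grp ) AS run_length
  FROM   (
    SELECT id,
           num,
           -- constant within a run of equal NUM values when ordered by ID
           ROW_NUMBER() OVER ( ORDER BY id )
             - ROW_NUMBER() OVER ( PARTITION BY num ORDER BY id ) AS grp
    FROM   table_name
  )
)
WHERE  num = 1
AND    run_length >= 3
ORDER BY id;
```

The difference between the two row numbers stays the same for as long as NUM does not change, so each run gets its own (num, grp) pair and can be counted.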

Interaction of where clause with connect by And Creating query to fetch next level in a hierarchy

Table:
create table temp_hierarchy_define (dept varchar2(25), parent_dept varchar2(25));
create table temp_employee (empid number(1), empname varchar2(50), dept varchar2(25), salary number(10));
Data
Select 'COMPANY' dept , 'COMPANY' parent_dept From Dual Union All
Select 'IT' , 'COMPANY' From Dual Union All
Select 'MARKET' , 'COMPANY' From Dual Union All
Select 'ITSEC' , 'IT' From Dual Union All
Select 'ITDBA' , 'IT' From Dual Union All
Select 'ITDBAORC' , 'ITDBA' From Dual Union All
Select 'ITDBASQL' , 'ITDBA' From Dual
select 1 empid, 'Rohan-ITDBASQL' empname ,'ITDBASQL' dept ,10 salary from dual union all
select 2, 'Raj-ITDBAORC' ,'ITDBAORC' ,20 from dual union all
select 3, 'Roy-ITDBA' ,'ITDBA' ,30 from dual union all
select 4, 'Ray-MARKET' ,'MARKET' ,40 from dual union all
select 5, 'Roopal-IT' ,'IT' ,50 from dual union all
select 6, 'Ramesh-ITSEC' ,'ITSEC' ,60 from dual
Requirement
Summarize salary of all IT dept:
CATEGORY SALARY
5,50
ITSEC,60
ITDBA,60
Summarize salary of all COMPANY dept:
CATEGORY SALARY
IT,170
MARKET,40
Summarize salary of all ITDBA dept:
CATEGORY SALARY
3,30
ITDBASQL,10
ITDBAORC,20
You will notice that we are trying to summarize based on the next level in the hierarchy. If any emp is already part of that level then we need to show the employee.
Trial Query:
Select Category,sum(salary) from (
Select
NVL((Select dept.dept from temp_hierarchy_define dept
Where dept.parent_dept = 'IT'
And dept.dept != 'IT'
Start With dept.dept = emp.dept
Connect by NOCYCLE dept.dept = Prior dept.parent_dept
and prior dept.dept is not null),emp.empid) category,
emp.*
From temp_employee emp
Where emp.DEPT in
(Select dept.dept from temp_hierarchy_define dept
Start With dept.dept = 'IT'
connect by nocycle prior dept.dept = dept.parent_dept) ) Group by Category
Concerns & queries:
Will this query work well in all scenarios, or is there a better way of doing it?
How does the WHERE condition interact with CONNECT BY? For example, in the subquery we filter on parent_dept = 'IT'; however, when CONNECT BY starts, some employees might have parent_dept = 'ITDBASQL', which is also part of IT. I am having a hard time understanding the workflow.
Thank you for your time and assistance.

Is there a better way of doing it?
This is an equivalent query that only requires one table scan for each table. You will need to determine whether your query or this one is more performant for your data/indexes/etc.
SQL Fiddle
Oracle 11g R2 Schema Setup:
create table temp_hierarchy_define (
dept varchar2(25), parent_dept varchar2(25));
create table temp_employee (
empid number(1), empname varchar2(50), dept varchar2(25), salary number(10));
INSERT INTO temp_hierarchy_define( dept, parent_dept )
Select 'COMPANY' , 'COMPANY' From Dual Union All
Select 'IT' , 'COMPANY' From Dual Union All
Select 'MARKET' , 'COMPANY' From Dual Union All
Select 'ITSEC' , 'IT' From Dual Union All
Select 'ITDBA' , 'IT' From Dual Union All
Select 'ITDBAORC' , 'ITDBA' From Dual Union All
Select 'ITDBASQL' , 'ITDBA' From Dual;
INSERT INTO temp_employee( empid, empname, dept, salary )
select 1, 'Rohan-ITDBASQL' ,'ITDBASQL' ,10 from dual union all
select 2, 'Raj-ITDBAORC' ,'ITDBAORC' ,20 from dual union all
select 3, 'Roy-ITDBA' ,'ITDBA' ,30 from dual union all
select 4, 'Ray-MARKET' ,'MARKET' ,40 from dual union all
select 5, 'Roopal-IT' ,'IT' ,50 from dual union all
select 6, 'Ramesh-ITSEC' ,'ITSEC' ,60 from dual;
Query 1:
SELECT dept,
SUM( salary )
FROM (
SELECT CASE
WHEN lvl = 1 AND h.parent_dept = e.dept
THEN CAST( e.empid AS VARCHAR2(25) )
ELSE root_dept
END AS dept,
e.empid,
e.salary
FROM ( SELECT CONNECT_BY_ROOT( dept ) AS root_dept,
h.*,
LEVEL AS lvl,
ROW_NUMBER() OVER ( PARTITION BY parent_dept ORDER BY ROWNUM ) AS rn
FROM temp_hierarchy_define h
WHERE parent_dept != dept
START WITH h.parent_dept = 'IT'
CONNECT BY NOCYCLE PRIOR h.dept = h.parent_dept
) h
LEFT OUTER JOIN
temp_employee e
ON ( h.dept = e.dept
OR ( h.parent_dept = e.dept AND h.lvl = 1 AND h.rn = 1)
)
)
GROUP BY dept
Results:
| DEPT | SUM(SALARY) |
|-------|-------------|
| ITDBA | 60 |
| 5 | 50 |
| ITSEC | 60 |
You can run the queries individually to find out what they are doing:
SELECT CONNECT_BY_ROOT( dept ) AS root_dept,
h.*,
LEVEL AS lvl,
ROW_NUMBER() OVER ( PARTITION BY parent_dept ORDER BY ROWNUM ) AS rn
FROM temp_hierarchy_define h
WHERE parent_dept != dept
START WITH h.parent_dept = 'IT'
CONNECT BY NOCYCLE PRIOR h.dept = h.parent_dept
This just lists all the rows in the hierarchy and uses CONNECT_BY_ROOT to get the department at the root of each branch of the hierarchy. LEVEL and ROW_NUMBER() are used to find the first row at the top of the hierarchy.
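
On the question of how WHERE interacts with CONNECT BY: in a single-table hierarchical query, Oracle builds the hierarchy from START WITH and CONNECT BY first and only then applies the WHERE predicates, row by row. A row rejected by WHERE is dropped from the output, but its descendants are still visited. A minimal sketch against the temp_hierarchy_define data above (the filter values are chosen only for illustration):

```sql
-- WHERE filters after the tree is built: only the IT row itself is removed,
-- so ITSEC, ITDBA, ITDBAORC and ITDBASQL are still returned.
SELECT dept, parent_dept, LEVEL AS lvl
FROM   temp_hierarchy_define
WHERE  dept != 'IT'
START WITH dept = 'IT'
CONNECT BY PRIOR dept = parent_dept;

-- A condition inside CONNECT BY prunes the walk: ITDBA is never reached,
-- so its children ITDBAORC and ITDBASQL are not returned either.
SELECT dept, parent_dept, LEVEL AS lvl
FROM   temp_hierarchy_define
START WITH dept = 'IT'
CONNECT BY PRIOR dept = parent_dept
       AND dept != 'ITDBA';
```

This is why the trial query's subquery can walk upwards from an employee's own department and still filter on parent_dept = 'IT' afterwards: the walk is not cut short by the WHERE clause.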

Oracle/SQL - Need query that will select max value from string in each row

I need a graceful way to select the max value from a field holding a comma delimited list.
Expected Values:
List_1 | Last
------ | ------
A,B,C | C
B,D,C | D
I'm using the following query and I'm not getting what's expected.
select
list_1,
(
select max(values) WITHIN GROUP (order by 1)
from (
select
regexp_substr(list_1,'[^,]+', 1, level) as values
from dual
connect by regexp_substr(list_1, '[^,]+', 1, level) is not null)
) as last
from my_table
Anyone have any ideas to fix my query?

with
test_data ( id, list_1 ) as (
select 101, 'A,B,C' from dual union all
select 102, 'B,D,C' from dual union all
select 105, null from dual union all
select 122, 'A' from dual union all
select 140, 'A,B,B' from dual
)
-- end of simulated table (for testing purposes only, not part of the solution)
select id, list_1, max(token) as max_value
from ( select id, list_1,
regexp_substr(list_1, '([^,]*)(,|$)', 1, level, null, 1) as token
from test_data
connect by level <= 1 + regexp_count(list_1, ',')
and prior id = id
and prior sys_guid() is not null
)
group by id, list_1
order by id
;
ID   LIST_1  MAX_VALUE
---- ------- ---------
101  A,B,C   C
102  B,D,C   D
105
122  A       A
140  A,B,B   B
In Oracle 12.1 or higher, this can be re-written using the LATERAL clause:
select d.id, d.list_1, x.max_value
from test_data d,
lateral ( select max(regexp_substr(list_1, '([^,]*)(,|$)',
1, level, null, 1)) as max_value
from test_data x
where x.id = d.id
connect by level <= 1 + regexp_count(list_1, ',')
) x
order by d.id
;
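
As a variation on the LATERAL query, the row generator can be driven from DUAL so that each list is split on its own, without reading test_data a second time. This is a sketch under the same assumptions as above (Oracle 12.1 or higher); the two-row test_data here is a cut-down copy of the sample used only to keep the example self-contained.

```sql
with
  test_data ( id, list_1 ) as (
    select 101, 'A,B,C' from dual union all
    select 102, 'B,D,C' from dual
  )
select d.id, d.list_1, x.max_value
from   test_data d,
       lateral ( -- one generated row per token of the current row's list
                 select max(regexp_substr(d.list_1, '([^,]*)(,|$)',
                                          1, level, null, 1)) as max_value
                 from   dual
                 connect by level <= 1 + regexp_count(d.list_1, ',')
               ) x
order by d.id
;
```

Because the inline view only references d.list_1, no join condition on id is needed inside it.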

We Keep Coding
