Combine strings in a certain order - google-bigquery

Sample table:
NUMBER  DEAL_NUMBER  NAME1   NAME2
1       T01          TOM     JERRY
2       T02          LEBRON  STEVEN
I would like the output below:
NUMBER  DEAL_NUMBER  NAME1   NAME2   Name_COM
1       T01          TOM     JERRY   LEBRON TOM STEVEN JERRY
2       T02          LEBRON  STEVEN  LEBRON TOM STEVEN JERRY
I can solve this with the STRING_AGG function, but the solution is not convenient enough:
CREATE TEMP FUNCTION EXCHANGE_PLACE(STR STRING)
RETURNS STRING
AS
((
SELECT ARRAY_TO_STRING(array_reverse(ARRAY_LIST),' ') FROM (SELECT SPLIT(STR,' ')ARRAY_LIST)
));
WITH TBL_D_CUSTOMER AS
(
SELECT "1" AS NUMBER,"T01" AS DEAL_NUMBER,"TOM" AS NAME1, "JERRY" AS NAME2 UNION ALL
SELECT "2","T01","LEBRON","STEVEN"
)
SELECT
*,
EXCHANGE_PLACE(STRING_AGG(NAME1,' ')OVER(PARTITION BY DEAL_NUMBER)) || ' ' || EXCHANGE_PLACE(STRING_AGG(NAME2,' ')OVER(PARTITION BY DEAL_NUMBER)) AS NAME_COM
FROM TBL_D_CUSTOMER
Is there a better way to do this?

Consider the approach below:
select *, array_to_string(
array_reverse(array_agg(NAME1) over win) || array_reverse(array_agg(NAME2) over win)
, ' ') as Name_COM
from TBL_D_CUSTOMER
window win as (partition by DEAL_NUMBER)
If applied to the sample data in your question, the output is: (output image not preserved)
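As a rough cross-check of what that query computes, here is a minimal Python sketch (not BigQuery code). It assumes the rows of each DEAL_NUMBER partition are aggregated in input order; ARRAY_AGG without an ORDER BY does not guarantee any order, so a real query may need an explicit ordering. The function name `name_com` and the tuple layout are illustrative only.

```python
from collections import defaultdict

def name_com(rows):
    """rows: list of (number, deal_number, name1, name2) tuples."""
    name1_by_deal = defaultdict(list)
    name2_by_deal = defaultdict(list)
    # Equivalent of array_agg(...) over (partition by DEAL_NUMBER)
    for _number, deal, name1, name2 in rows:
        name1_by_deal[deal].append(name1)
        name2_by_deal[deal].append(name2)
    result = []
    for number, deal, name1, name2 in rows:
        # array_reverse(NAME1 list) || array_reverse(NAME2 list), joined with ' '
        combined = name1_by_deal[deal][::-1] + name2_by_deal[deal][::-1]
        result.append((number, deal, name1, name2, " ".join(combined)))
    return result

# Same data as the TBL_D_CUSTOMER CTE in the question (both rows share T01)
rows = [("1", "T01", "TOM", "JERRY"), ("2", "T01", "LEBRON", "STEVEN")]
for row in name_com(rows):
    print(row)
```

With both rows in the same partition, every row gets the same combined string, which is what the window function does as well.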

Related

Get occurrences of multiple rows in PostgreSQL

I have a table that contains student publications, like this:
id  student
1   john
2   anthony
3   steven
4   lucille
5   anthony
6   steven
7   john
8   lucille
9   john
10  anthony
11  steven
12  lucille
13  john
The idea is to have a query that fetches all ordered occurrences of a given list of student names.
Context: answer the question "how many times does John publish just after Anthony (who publishes just after Steven ...)?" and get the id of each occurrence.
Example: if I look for all occurrences of [john, anthony], I'll get the following (note that the ids must be successive within each occurrence):
id  student
1   john
2   anthony
9   john
10  anthony
Or:
id  -- comment
1   (id of first occurrence of john, anthony)
9   (id of second occurrence of john, anthony)
If I look for [anthony, steven, lucille], I'll get:
id  student
2   anthony
3   steven
4   lucille
10  anthony
11  steven
12  lucille
Or:
id  -- comment
2   (id of first occurrence of anthony, steven, lucille)
10  (id of second occurrence of anthony, steven, lucille)
Any ideas or leads to help me move forward?

That should do the trick, performance-wise.
The main idea is to split the data at occurrences of the first student in our search list, but not at every occurrence.
Since the same student can appear multiple times in the search list, we need to make sure that we are not breaking a pattern in the middle.
We do that by verifying that each occurrence of the first student is far enough from that student's previous occurrence, that is, the distance between the two occurrences is at least the search list length (the number of names in the search list, counting duplicates).
with
prm(students) as (select 'anthony,steven,lucille,anthony')
,prm_ext(search_pattern, first_student, tokens_num) as
(
select regexp_replace(students, '^|(,)','\1\d+;', 'g') as search_pattern
,split_part(students, ',', 1) as first_student
,array_length(string_to_array(students, ','), 1) as tokens_num
from prm
)
,prev_student as
(
select id
,student
,lag(id) over (partition by student order by id) as student_prev_id
from t
)
,seq as
(
select id
,student
,sum(case when student = p.first_student and coalesce(id - student_prev_id >= p.tokens_num, true) then 1 end) over (order by id) as seq_id
,id - max(case when student = p.first_student then id end) over (order by id) as distance_from_first_student
from prev_student cross join prm_ext as p
order by id
)
select split_part(unnest(regexp_matches(string_agg(id || ';' || student, ',' order by id), (select search_pattern from prm_ext), 'g')), ';', 1)::int as id
from seq cross join prm_ext p
where seq_id is not null
and distance_from_first_student < p.tokens_num
group by seq_id
This is the result for an extended data sample:
id
2
16
22

Start with this, and if it explodes we'll do some performance improvements, at the price of making the code a little more complicated.
with
prm(students) as (select 'anthony,steven,lucille')
,prm_ext(students_regex) as (select regexp_replace(students, '^|(,)','\1\d+;', 'g') from prm)
select split_part(unnest(regexp_matches(string_agg(id || ';' || student, ',' order by id), (select students_regex from prm_ext), 'g')), ';', 1)::int as id
from t
id
2
10
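The technique used in that query (concatenate `id;student` pairs in id order, then regex-match the searched sequence) can be sketched in Python as a way to reason about it outside the database. This is only an illustration under stated assumptions: the function name `find_sequences` is made up, the rows are assumed to be sorted by id already, and the trailing lookahead `(?=,|$)` is an addition that is not in the SQL above; it keeps the last name from matching a longer name that merely starts with it.

```python
import re

def find_sequences(rows, names):
    """Return the first id of every consecutive run of `names` in `rows`.

    rows: list of (id, student) tuples, assumed sorted by id.
    """
    # Same shape as string_agg(id || ';' || student, ',' order by id)
    haystack = ",".join(f"{row_id};{student}" for row_id, student in rows)
    # Same shape as the generated pattern: \d+;name1,\d+;name2,...
    pattern = ",".join(r"\d+;" + re.escape(name) for name in names) + r"(?=,|$)"
    return [int(m.group(0).split(";", 1)[0]) for m in re.finditer(pattern, haystack)]

students = ["john", "anthony", "steven", "lucille", "anthony", "steven", "john",
            "lucille", "john", "anthony", "steven", "lucille", "john"]
rows = list(enumerate(students, start=1))
print(find_sequences(rows, ["anthony", "steven", "lucille"]))
print(find_sequences(rows, ["john", "anthony"]))
```

Like the `'g'` flag in the query, `re.finditer` returns non-overlapping matches, so a search list that starts and ends with the same name cannot reuse one row for two matches.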
with
prm(students) as (select 'anthony,steven,lucille')
,prm_ext(students_regex) as (select regexp_replace(students, '^|(,)','\1\d+;', 'g') from prm)
select cols[1]::int as id
,cols[2]::text as student
from (select string_to_array(string_to_table(unnest(regexp_matches(string_agg(id || ';' || student, ',' order by id), (select students_regex from prm_ext), 'g')), ','), ';') as cols
from t
) t
id  student
2   anthony
3   steven
4   lucille
10  anthony
11  steven
12  lucille

Comparing string values within a table

Is there any way to compare two string columns with each other and get the matches?
I have two columns containing names, one with the full name and the other with (mostly) just the surname.
I tried soundex, but it only returns rows where the values in both columns sound almost the same.
SELECT * FROM TABLE
WHERE soundex(FullName) = soundex(Surname)
1 John Doe Doe
2 Peter Parker Parker
3 Brian Griffin Brian Griffin
With soundex, only the 3rd line matches.

A simple option is to use instr, which shows whether surname exists in fullname:
SQL> with test (id, fullname, surname) as
2 (select 1, 'John Doe' , 'Doe' from dual union all
3 select 2, 'Peter Parker' , 'Parker' from dual union all
4 select 3, 'Brian Griffin', 'Brian Griffin' from dual
5 )
6 select *
7 from test
8 where instr(fullname, surname) > 0;
        ID FULLNAME      SURNAME
---------- ------------- -------------
         1 John Doe      Doe
         2 Peter Parker  Parker
         3 Brian Griffin Brian Griffin
Another option is to use one of UTL_MATCH functions, e.g. Jaro-Winkler similarity which shows how well those strings match:
SQL> with test (id, fullname, surname) as
2 (select 1, 'John Doe' , 'Doe' from dual union all
3 select 2, 'Peter Parker' , 'Parker' from dual union all
4 select 3, 'Brian Griffin', 'Brian Griffin' from dual
5 )
6 select id, fullname, surname,
7 utl_match.jaro_winkler_similarity(fullname, surname) jws
8 from test
9 order by id;
        ID FULLNAME      SURNAME              JWS
---------- ------------- ------------- ----------
         1 John Doe      Doe                   48
         2 Peter Parker  Parker                62
         3 Brian Griffin Brian Griffin        100
SQL>
Feel free to explore the other functions that package offers.
Also, note that I didn't pay attention to possible letter case differences (e.g. "DOE" vs. "Doe"). If you need that as well, compare e.g. upper(surname) to upper(fullname).

Please use the INSTR function. Note that the string to search in comes first and the substring to look for comes second:
SELECT * FROM TABLE
WHERE instr(FullName, Surname) > 0;
SELECT * FROM TABLE
WHERE instr(upper(FullName), upper(Surname)) > 0;
SELECT * FROM TABLE
WHERE upper(FullName) > upper(Surname);

As far as I know there is nothing out of the box when matching becomes complicated. For the cases shown, however, the following expression would suffice:
where fullname like '%' || surname
Update
The main problem may be false positives:
The last name 'Park' appears in 'Peter Parker'. The query above solves this by looking at the end of the full name.
Another problem may be upper / lower case, as mentioned in the other answers (not shown in your sample data).
You want the last name 'PARKER' to match 'Peter Parker'.
But when looking at the strings case insensitively, another problem arises:
The last name 'Strong' will suddenly match 'Louis Armstrong'.
A solution for this is to add a blank to make the difference:
where ' ' || upper(fullname) like '% ' || upper(surname)
' LOUIS ARMSTRONG' like '% STRONG' -> false
' LOUIS ARMSTRONG' like '% ARMSTRONG' -> true
' LOUIS ARMSTRONG' like '% LOUIS ARMSTRONG' -> true
Demo: https://dbfiddle.uk/?rdbms=oracle_18&fiddle=0ac5c80061b4aeac1153a8c5976e6e54
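To see why the leading blank matters, the same comparison can be sketched in plain Python. This is a minimal illustration, not Oracle code; the helper name `surname_matches` is made up, and `str.endswith` stands in for `LIKE '% ...'`.

```python
def surname_matches(fullname: str, surname: str) -> bool:
    """Case-insensitive check that `surname` ends `fullname` on a word boundary.

    Mirrors: ' ' || upper(fullname) like '% ' || upper(surname)
    """
    return (" " + fullname.upper()).endswith(" " + surname.upper())

print(surname_matches("Louis Armstrong", "Strong"))           # False
print(surname_matches("Louis Armstrong", "Armstrong"))        # True
print(surname_matches("Louis Armstrong", "Louis Armstrong"))  # True
```

Prefixing the full name with a blank lets a surname that equals the whole full name still match, while the blank in front of the surname rules out partial words such as 'Strong' inside 'Armstrong'.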

Filter invalid ids from the oracle table

My table
NAME
Peter
Lance
Oscar
Steve
Reddy
The input to my query is an array of strings, let's say Peter, Bond, Steve, Smith.
My query should return the invalid values of my input, i.e. Bond and Smith.
I am using Oracle 12.1.0 and odcivarchar2list is not supported.
Any suggestions would be highly appreciated

You can use a CTE:
with list_string as (
select 'Peter' as name from dual union all
select 'Bond' as name from dual union all
select 'Steve' as name from dual union all
select 'Smith' as name from dual
)
select ls.name, 'Invalid Values'
from list_string ls
where not exists (select 1 from table t1 where t1.name = ls.name);

Some more options.
Data you have:
SQL> select * from test;
NAME
-----
Peter
Lance
Oscar
Steve
Reddy
If you don't mind enclosing the names in single quotes, then this might be an option:
SQL> select column_value result
2 from table(sys.odcivarchar2list('Peter', 'Bond', 'Steve', 'Smith'))
3 minus
4 select t.name
5 from test t;
RESULT
-----------------------------------------------------------------------------
Bond
Smith
SQL>
If you'd just want to enter those names "normally", comma-separated, then:
SQL> with
2 sample (val) as
3 (select 'Peter, Bond, Steve, Smith' from dual)
4 select trim(regexp_substr(s.val, '[^,]+', 1, level)) result
5 from sample s
6 connect by level <= regexp_count(s.val, ',') + 1
7 minus
8 select t.name
9 from test t;
RESULT
---------------------------------------------------------------------
Bond
Smith
SQL>
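The split-then-MINUS idea above can be sketched in Python to make the steps explicit. This is an illustration only; the function name `invalid_names` is made up, and unlike MINUS (which makes no promise about row order) it keeps the input order of the names.

```python
def invalid_names(csv_input: str, known_names) -> list:
    """Return the distinct input names that are not in `known_names`."""
    # Like regexp_substr(val, '[^,]+', 1, level) plus trim(): split and strip
    candidates = [part.strip() for part in csv_input.split(",") if part.strip()]
    known = set(known_names)
    seen = set()
    result = []
    for name in candidates:
        # Like MINUS: keep only names missing from the table, without duplicates
        if name not in known and name not in seen:
            seen.add(name)
            result.append(name)
    return result

table_names = ["Peter", "Lance", "Oscar", "Steve", "Reddy"]
print(invalid_names("Peter, Bond, Steve, Smith", table_names))
```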

SQL: replace NULL with SPACE

I want to create a fixed length flat file (separated by ','), but when a field has a null value, the record moves. Please see illustration below (incorrect Jenny and Roland records):
Source Table:
Name    Color  Balance Zip Code
------- ------ ------- --------
Melissa Orange $200.00 40240
Karl    Blue   $150.00 40884
Jenny   -null- -null-  45667
Roland  -null- $110.00 53366
Vincent Green  $285.00 45677
Output I want to get:
Correct_Output
----------------------------
Melissa,Orange,$200.00,40240
Karl   ,Blue  ,$150.00,40884
Jenny  ,      ,       ,45667
Roland ,      ,$110.00,53366
Vincent,Green ,$285.00,45677
Output I actually get:
Wrong_Output
----------------------------
Melissa,Orange,$200.00,40240
Karl   ,Blue  ,$150.00,40884
Jenny  ,,,45667
Roland ,,$110.00,53366
Vincent,Green ,$285.00,45677
I tried searching, but I only find results about converting NULL to an empty string.
Please help. Thank you.

Use the COALESCE(column,' ') function.
For some databases, you can use IFNULL or NVL instead.

You could use NVL to replace the NULL value with required number of spaces to retain the format.
For example,
SQL> SELECT 'Karl' NAME,
2 NVL('Blue', ' ') color,
3 NVL('$150.00',' ') balance,
4 40884 zip_code
5 FROM dual
6 UNION ALL
7 SELECT 'Jenny' name,
8 NVL(NULL, ' ') color,
9 NVL(NULL,' ') balance,
10 45667 zip_code
11 FROM dual
12 /
NAME  COLOR BALANCE     ZIP_CODE
----- ----- --------- ----------
Karl  Blue  $150.00        40884
Jenny                      45667
SQL>
You could also use DECODE. Pass the column itself as the default, otherwise non-NULL values come back as NULL.
For example,
DECODE(column, NULL, ' ', column)
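The underlying idea (replace NULL with blanks and pad every field to its fixed width) can be sketched in Python. This is a minimal illustration, assuming the column widths 7, 6, 7 and 5 read off the desired output in the question; the name `fixed_width_line` is made up. In Oracle the padding step would typically be done with RPAD around the NVL call.

```python
def fixed_width_line(fields, widths):
    """Join fields with ',' after padding each one to its fixed width.

    None (a NULL value) becomes an all-blank field of the same width.
    """
    cells = [("" if value is None else str(value)).ljust(width)
             for value, width in zip(fields, widths)]
    return ",".join(cells)

widths = (7, 6, 7, 5)  # Name, Color, Balance, Zip Code
records = [
    ("Melissa", "Orange", "$200.00", "40240"),
    ("Jenny", None, None, "45667"),
    ("Roland", None, "$110.00", "53366"),
]
for record in records:
    print(fixed_width_line(record, widths))
```

Padding after the NULL replacement means the blank filler always has the right width, so it does not need to be typed by hand for every column.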

SQL multiple SELECT query with xmlagg function- Data not pulled in the required fashion

My data in Oracle is something like this
NAME  | DEP_VALUE | ID_DEP
Amy     1           AA1234
Bob     2           BB4321
Clara   1           CC5678
Clara   2           CC7890
John    1           JJ6543
John    2           JJ7865
John    3           JJ7654
Tom     1           TT0987
Tom     2           TT6541
Tom     3           TT4087
Tom     4           TT3409
I need the data to be pulled in this fashion
NAME  | DEP_VALUE | ID_DEP
Amy     1           AA1234
Bob     2           BB4321
Clara   1;2         CC5678;CC7890
John    1;2;3       JJ6543;JJ7865;JJ7654
Tom     1;2;3;4     TT0987;TT6541;TT4087;TT3409
My query is as follows
SELECT name,
Rtrim(Xmlagg (Xmlelement (e, dep_value
|| ';')).extract ( '//text()' ), ','),
Rtrim(Xmlagg (Xmlelement (e, id_dep
|| ';')).extract ( '//text()' ), ',')
FROM (SELECT emp_name,
dep.dep_value,
dep.id_dep
FROM emp
inner join dep
ON emp.name = dep.name
WHERE id_name IN (SELECT name
FROM altname
WHERE id_emp IN (SELECT id_emp
FROM cnames
WHERE emp_lvl LIKE '%GGG%')))
GROUP BY name,
dep_value
The result that is displayed is
NAME  | DEP_VALUE | ID_DEP
Amy     1;          AA1234;
Bob     2;          BB4321;
Clara   1;          CC5678;
Clara   2;          CC7890;
John    1;          JJ6543;
John    2;          JJ7865;
John    3;          JJ7654;
Tom     1;          TT0987;
Tom     2;          TT6541;
Tom     3;          TT4087;
Tom     4;          TT3409;
How can I pull the data as in the second table? What is the error in my SQL query?

It sounds like you want to GROUP BY name rather than GROUP BY name, dep_value. The RTRIM calls should also strip ';' (the separator you append) instead of ',':
SELECT name,
Rtrim(Xmlagg (Xmlelement (e, dep_value
|| ';')).extract ( '//text()' ), ';'),
Rtrim(Xmlagg (Xmlelement (e, id_dep
|| ';')).extract ( '//text()' ), ';')
FROM (SELECT emp_name,
dep.dep_value,
dep.id_dep
FROM emp
inner join dep
ON emp.name = dep.name
WHERE id_name IN (SELECT name
FROM altname
WHERE id_emp IN (SELECT id_emp
FROM cnames
WHERE emp_lvl LIKE '%GGG%')))
GROUP BY name
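The effect of grouping by name alone can be sketched in Python. This is only an illustration of the aggregation, not Oracle code; the function name `aggregate_by_name` is made up, and rows are joined in input order, whereas XMLAGG without an ORDER BY clause makes no promise about order.

```python
def aggregate_by_name(rows):
    """Collapse (name, dep_value, id_dep) rows into one row per name."""
    grouped = {}
    for name, dep_value, id_dep in rows:
        # One pair of lists per name: this is the GROUP BY name step
        values, ids = grouped.setdefault(name, ([], []))
        values.append(str(dep_value))
        ids.append(id_dep)
    # Joining with ';' leaves no trailing separator, so nothing needs trimming
    return [(name, ";".join(values), ";".join(ids))
            for name, (values, ids) in grouped.items()]

rows = [
    ("Amy", 1, "AA1234"), ("Bob", 2, "BB4321"),
    ("Clara", 1, "CC5678"), ("Clara", 2, "CC7890"),
    ("John", 1, "JJ6543"), ("John", 2, "JJ7865"), ("John", 3, "JJ7654"),
]
for row in aggregate_by_name(rows):
    print(row)
```

If dep_value were part of the grouping key, every (name, dep_value) pair would form its own group, which is exactly the one-row-per-input-row result shown in the question.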

Just to provide further explanation on xmlagg, and add another option with Oracle 11g.
http://www.dba-oracle.com/t_display_multiple_column_values_same_rows.htm
select
deptno,
listagg (ename, ',')
WITHIN GROUP
(ORDER BY ename) enames
FROM
emp
GROUP BY
deptno
/
Output:
    DEPTNO ENAMES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 ADAMS,FORD,JONES,SCOTT,SMITH
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

Try this simpler approach:
select NAME,replace(wm_concat(DEP_VALUE),',',';') as DEP_VALUE, replace(wm_concat(ID_DEP),',',';') as ID_DEP from yourtable
where dep_value<2000 group by NAME
Note: you need to limit how many values are concatenated (the condition on dep_value above is just an assumed example), because the resulting string cannot be too long. Hope this helps.
