How to modify Column Data in BigQuery Table - google-bigquery

I've been searching online, but I can only find how to add, remove, or change a column in a table. Basically, I need to go through an entire column of email addresses in BigQuery and append a second email address to each row.
ID|Name |email
1 |Name1|email1#address.com
2 |Name2|email2#address.com
3 |Name3|email3#address.com
4 |Name4|email4#address.com
5 |Name5|email5#address.com
6 |Name6|email6#address.com
What I am looking for is a script that goes through the column and appends a second email address, separated by a ",", so that it looks like this:
ID|Name |email
1 |Name1|email1#address.com,secondemail#address.com
2 |Name2|email2#address.com,secondemail#address.com
3 |Name3|email3#address.com,secondemail#address.com
4 |Name4|email4#address.com,secondemail#address.com
5 |Name5|email5#address.com,secondemail#address.com
6 |Name6|email6#address.com,secondemail#address.com
All of the existing data should remain intact. Please let me know if this is possible. Note that "secondemail#address.com" is a single fixed address; it doesn't change per user. I just need this format for a business reason.

Below is for BigQuery Standard SQL
#standardSQL
SELECT * REPLACE(email || ',secondemail#address.com' AS email)
FROM `project.dataset.table`
You can test and play with the above using the sample data from your question, as in the example below
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 id, 'Name1' name, 'email1#address.com' email UNION ALL
SELECT 2, 'Name2', 'email2#address.com' UNION ALL
SELECT 3, 'Name3', 'email3#address.com' UNION ALL
SELECT 4, 'Name4', 'email4#address.com' UNION ALL
SELECT 5, 'Name5', 'email5#address.com' UNION ALL
SELECT 6, 'Name6', 'email6#address.com'
)
SELECT * REPLACE(email || ',secondemail#address.com' AS email)
FROM `project.dataset.table`
with output
Row id name email
1 1 Name1 email1#address.com,secondemail#address.com
2 2 Name2 email2#address.com,secondemail#address.com
3 3 Name3 email3#address.com,secondemail#address.com
4 4 Name4 email4#address.com,secondemail#address.com
5 5 Name5 email5#address.com,secondemail#address.com
6 6 Name6 email6#address.com,secondemail#address.com
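The SELECT above returns the modified values but does not change the stored table. If the change has to be persisted, an UPDATE statement is one option. The sketch below runs the idea against an in-memory SQLite database as a stand-in for BigQuery; the table name `contacts` is hypothetical. BigQuery requires a WHERE clause on every UPDATE, which is why the statement carries an always-true condition.

```python
import sqlite3

# SQLite stands in for BigQuery here; "contacts" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [(i, f"Name{i}", f"email{i}#address.com") for i in range(1, 7)],
)

# Append the fixed second address to every row, in place.
# The always-true WHERE is kept because BigQuery rejects UPDATE without WHERE.
conn.execute(
    "UPDATE contacts "
    "SET email = email || ',secondemail#address.com' "
    "WHERE 1 = 1"
)

rows = conn.execute("SELECT id, name, email FROM contacts ORDER BY id").fetchall()
for row in rows:
    print(row)
```

Running the UPDATE twice would append the address twice, so a guard such as a condition on the existing value may be worth adding in practice.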


oracle sql parent child with for multiple columns

I have a users table with two columns for the approval hierarchy. The table structure is as follows:
User_ID Submit_to Approve_to
1 2 3
2 4 5
3 6 2
4 2 3
5 1 0
The data is just an example; 0 means there is no approver.
Both Submit_to and Approve_to are approvers.
I need a query that returns the approvers in sequence, i.e. who the next approver will be for an entry the user has created.
Using CONNECT BY, you can connect the parent/child relationship within a table. The query below will show the approval path for each user.
Query
WITH
d (user_id, submit_to, approve_to)
AS
(SELECT 1, 2, 3 FROM DUAL
UNION ALL
SELECT 2, 4, 5 FROM DUAL
UNION ALL
SELECT 3, 6, 2 FROM DUAL
UNION ALL
SELECT 4, 2, 3 FROM DUAL
UNION ALL
SELECT 5, 1, 0 FROM DUAL)
SELECT d.user_id, LTRIM (SYS_CONNECT_BY_PATH (user_id, '<'), '<') AS approval_path
FROM d
START WITH approve_to = 0
CONNECT BY NOCYCLE PRIOR user_id = approve_to
ORDER BY user_id;
Result
In the APPROVAL_PATH column, the left most number is the USER_ID giving final approval and the right most number is the USER_ID initially submitting whatever needs to be approved.
USER_ID | APPROVAL_PATH
-----------------------
1 | 5<2<3<1
2 | 5<2
3 | 5<2<3
4 | 5<2<3<4
5 | 5
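To make the traversal easier to follow, here is a minimal Python sketch of what the hierarchical query does with the sample rows: it starts from users whose approve_to is 0 and repeatedly descends to users whose approve_to is the current user. Like the query, it follows only approve_to. The function name is hypothetical, and the sketch keeps one path per user, which is enough for this data.

```python
# Rows from the question: user_id -> (submit_to, approve_to)
users = {1: (2, 3), 2: (4, 5), 3: (6, 2), 4: (2, 3), 5: (1, 0)}


def approval_paths(users):
    """Return {user_id: path}, each path running from the final approver to the user."""
    # children[a] lists the users whose approve_to is a (the CONNECT BY condition).
    children = {}
    for uid, (_, approve_to) in users.items():
        children.setdefault(approve_to, []).append(uid)

    paths = {}
    # Roots are the users with approve_to = 0 (the START WITH condition).
    stack = [(uid, [uid]) for uid, (_, a) in users.items() if a == 0]
    while stack:
        uid, path = stack.pop()
        paths[uid] = "<".join(map(str, path))
        for child in children.get(uid, []):
            if child not in path:  # do not revisit a user on the same path (NOCYCLE)
                stack.append((child, path + [child]))
    return paths


paths = approval_paths(users)
for uid in sorted(paths):
    print(uid, "|", paths[uid])
```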

How to process a column that holds a comma-separated or range string values in Oracle

Using Oracle 12c DB, I have the following table data example that I need assistance with using SQL and PL/SQL.
Table data is as follows:
Table Name: my_data
ID ITEM ITEM_LOC
------- ----------- ----------------
1 Item-1 0,1
2 Item-2 0,1,2,3,4,7
3 Item-3 0-48
4 Item-4 0,1,2,3,4,5,6,7,8
5 Item-5 1-33
6 Item-6 0,1
7 Item-7 0,1,5,8
Using the data above within the my_data table, what is the best way to process this ITEM_LOC as I need to use the values in this column as an individual value, i.e:
0,1 means the SQL needs to return either 0 or 1 or
range values, i.e:
0-48 means the SQL needs to return a value between 0 and 48.
The returned values for both scenarios should commence from lowest to highest and can't be re-used once processed.
Based on the above, it would be great to have a function that takes the ID and returns an individual value from ITEM_LOC that hasn't been used, based on my description above. This could be a comma-separated string value or a range string value.
Desired result for ID = 2 could be 7. For this ID = 2, ITEM_LOC = 7 could not be used again.
Desired result for ID = 5 could be 31. For this ID = 5, ITEM_LOC = 31 could not be used again.
For the ITEM_LOC data that could not be used again, against that ID, I am looking at holding another table to hold this or perhaps separate all data into separate rows with a new column called VALUE_USED.
This query shows how to extract list of ITEM_LOC values based on whether they are comma-separated (which means "take exactly those values") or dash-separated (which means "find all values between starting and end point"). I modified your sample data a little bit (didn't feel like displaying ~50 values if 5 of them do the job).
lines #1 - 6 represent sample data.
the first select (lines #7 - 15) splits comma-separated values into rows
the second select (lines #17 - 26) uses a hierarchical query which adds 1 to the starting value, up to item's end value.
SQL> with my_data (id, item, item_loc) as
2 (select 2, 'Item-2', '0,2,4,7' from dual union all
3 select 7, 'Item-7', '0,1,5' from dual union all
4 select 3, 'Item-3', '0-4' from dual union all
5 select 8, 'Item-8', '5-8' from dual
6 )
7 select id,
8 item,
9 regexp_substr(item_loc, '[^,]+', 1, column_value) loc
10 from my_data
11 cross join table(cast(multiset
12 (select level from dual
13 connect by level <= regexp_count(item_loc, ',') + 1
14 ) as sys.odcinumberlist))
15 where instr(item_loc, '-') = 0
16 union all
17 select id,
18 item,
19 to_char(to_number(regexp_substr(item_loc, '^\d+')) + column_value - 1) loc
20 from my_data
21 cross join table(cast(multiset
22 (select level from dual
23 connect by level <= to_number(regexp_substr(item_loc, '\d+$')) -
24 to_number(regexp_substr(item_loc, '^\d+')) + 1
25 ) as sys.odcinumberlist))
26 where instr(item_loc, '-') > 0
27 order by id, item, loc;
ID ITEM LOC
---------- ------ ----------------------------------------
2 Item-2 0
2 Item-2 2
2 Item-2 4
2 Item-2 7
3 Item-3 0
3 Item-3 1
3 Item-3 2
3 Item-3 3
3 Item-3 4
7 Item-7 0
7 Item-7 1
7 Item-7 5
8 Item-8 5
8 Item-8 6
8 Item-8 7
8 Item-8 8
16 rows selected.
SQL>
I don't know what you meant by saying that "item_loc could not be used again". Used where? If you use the above query in, for example, cursor FOR loop, then yes - those values would be used only once as every loop iteration fetches next item_loc value.
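The splitting rules themselves (a comma means "exactly these values", a dash means "every value from start to end") can be sketched outside the database as well. The Python function below is a hypothetical helper, not part of the query above; it assumes non-negative integers and tolerates whitespace.

```python
def expand_item_loc(item_loc):
    """Expand a string such as '0,2,4,7' or '0-4' into a sorted list of integers."""
    values = set()
    for part in item_loc.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A range: include both end points.
            start, end = (int(p) for p in part.split("-"))
            values.update(range(start, end + 1))
        else:
            values.add(int(part))
    return sorted(values)


print(expand_item_loc("0,2,4,7"))
print(expand_item_loc("0-4"))
print(expand_item_loc("5-8"))
```

A range whose start is greater than its end expands to nothing in this sketch.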
As others have said, it's a bad idea to store data in this way. You very likely could have input like this, and you likely could need to display the data like this, but you don't have to store the data the way it is input or displayed.
I'm going to store the data as individual LOC elements based on the input. I assume the data contains only integers separated by commas, or pairs of integers separated by a hyphen. Whitespace is ignored. The comma-separated list does not have to be in any order. In pairs, if the left integer is greater than the right integer I return no LOC element.
create table t as
with input(id, item, item_loc) as (
select 1, 'Item-1', ' 0,1' from dual union all
select 2, 'Item-2', '0,1,2,3,4,7' from dual union all
select 3, 'Item-3', '0-48' from dual union all
select 4, 'Item-4', '0,1,2,3,4,5,6,7,8' from dual union all
select 5, 'Item-5', '1-33' from dual union all
select 6, 'Item-6', '0,1' from dual union all
select 7, 'Item-7', '0,1,5,8,7 - 11' from dual
)
select distinct id, item, loc from input, xmltable(
'let $item := if (contains($X,",")) then ora:tokenize($X,"\,") else $X
for $i in $item
let $j := if (contains($i,"-")) then ora:tokenize($i,"\-") else $i
for $k in xs:int($j[1]) to xs:int($j[count($j)])
return $k'
passing item_loc as X
columns loc number path '.'
);
Now to "use" an element I just delete it from the table:
delete from t where rowid = (
select min(rowid) keep (dense_rank first order by loc)
from t
where id = 7
);
To return the data in the same format it was input, use MATCH_RECOGNIZE:
select id, item, listagg(item_loc, ',') within group(order by first_loc) item_loc
from t
match_recognize(
partition by id, item order by loc
measures a.loc first_loc,
a.loc || case count(*) when 1 then null else '-'||b.loc end item_loc
pattern (a b*)
define b as loc = prev(loc) + 1
)
group by id, item;
ID ITEM ITEM_LOC
1 Item-1 0-1
2 Item-2 0-4,7
3 Item-3 0-48
4 Item-4 0-8
5 Item-5 1-33
6 Item-6 0-1
7 Item-7 1,5,7-11
Note that the output here will not be exactly like the input, because any consecutive integers will be compressed into a pair.
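The compression step can be illustrated with a short Python sketch that groups consecutive integers into runs, which is the effect the MATCH_RECOGNIZE pattern has here. The function names are hypothetical.

```python
def format_run(start, end):
    """Render a run of consecutive integers as 'n' or 'start-end'."""
    return str(start) if start == end else f"{start}-{end}"


def compress_locs(locs):
    """Collapse integers into a comma-separated string, pairing up consecutive runs."""
    parts = []
    run_start = prev = None
    for loc in sorted(set(locs)):
        if prev is not None and loc == prev + 1:
            prev = loc  # still inside the current run
            continue
        if run_start is not None:
            parts.append(format_run(run_start, prev))
        run_start = prev = loc  # start a new run
    if run_start is not None:
        parts.append(format_run(run_start, prev))
    return ",".join(parts)


print(compress_locs([0, 1]))
print(compress_locs([0, 1, 2, 3, 4, 7]))
print(compress_locs([1, 5, 7, 8, 9, 10, 11]))
```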

Oracle - Loop through hierarchised records

Since I struggled to find an accurate title, I think a detailed schema will be much more understandable.
I have this table PROGRAM that I will reduce to 3 fields for simplicity:
ID |NAME |ID_ORIGINAL_PROGRAM
1 |name_1 |
2 |name_2 |1
3 |name_3 |1
4 |name_4 |2
5 |name_5 |3
6 |name_6 |
7 |name_7 |6
I'm trying to find a query that, given any ID as a parameter, gathers all the programs related to that ID.
The parameter does not necessarily have to be the "father" ID of the hierarchy.
For example, if parameter ID is 1, then results will be:
ID
2
3
4
5
If parameter ID is 4, then the results will be:
ID
1
2
3
5
It seems like I'm missing some kind of "loop" logic that I can't clearly identify.
I looked at "CONNECT BY PRIOR" but could not grasp the concept well enough to understand how to apply it.
Edit:
So it seems I found a way through:
SELECT ID
FROM PROGRAM
START WITH ID = 67256
CONNECT BY NOCYCLE ID_ORIGINAL_PROGRAM = PRIOR ID
OR ID = PRIOR ID_ORIGINAL_PROGRAM
order by ID
I'm a bit concerned about the performance, though (it takes 1 second to run).
I suppose you need
with program( id, id_original_program ) as
(
select 1, null from dual union all
select 2, 1 from dual union all
select 3, 1 from dual union all
select 4, 2 from dual union all
select 5, 3 from dual union all
select 6, null from dual union all
select 7, 6 from dual
)
select id, sys_connect_by_path(id, '->') as path
from program
where id_original_program is not null
connect by prior id = id_original_program
start with id = 1 -- 4
order by id;
ID PATH
2 ->1->2
3 ->1->3
4 ->1->2->4
5 ->1->3->5
if value 4 is substituted, then you get
ID PATH
4 ->4
only.
Whether you substitute 1 or 4, you'll get the same result for your query.
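The results the question asks for (every other program in the same family, whichever ID is passed in) correspond to walking the parent/child links in both directions. A minimal Python sketch of that traversal over the sample data follows; the function name is hypothetical, and it assumes every ID_ORIGINAL_PROGRAM value refers to an existing row.

```python
from collections import deque

# id -> id_original_program, taken from the question's sample table
programs = {1: None, 2: 1, 3: 1, 4: 2, 5: 3, 6: None, 7: 6}


def related_programs(programs, start):
    """Return every program linked to start through parent or child links."""
    # Build an undirected adjacency map so links can be followed both ways.
    neighbours = {pid: set() for pid in programs}
    for pid, parent in programs.items():
        if parent is not None:
            neighbours[pid].add(parent)
            neighbours[parent].add(pid)

    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbours[current] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return sorted(seen - {start})


print(related_programs(programs, 1))
print(related_programs(programs, 4))
```

Each program is visited once, so the work grows with the size of the family rather than with the number of possible paths through it.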

How to get all substring occurences between some characters?

What I'm trying to get is the part of a column's text that lies between certain characters ($$ to be exact). The catch is that these delimiters can occur more than twice (but always in pairs, e.g. $$xxx$$ ... $$yyy$$), and I need to get each match separately.
When I try this, it works as long as the pattern occurs only once:
regexp_substr(txt,'\$\$(.*)\$\$',1,1,null,1)
But let's say the column text is: $$xxx$$ ... $$yyy$$
then it gives me: xxx$$ ... $$yyy
but what I need is to get them on separate lines, like:
xxx
yyy
I couldn't get this to work. How can I do it?
You could use a recursive query that matches the first occurrence and then removes that from the string for the next iteration of the recursive query.
Assuming your table and column are called tbl and txt:
with cte(match, txt) as (
select regexp_substr(txt,'\$\$(.*?)\$\$', 1, 1, null, 1),
regexp_replace(txt,'\$\$(.*?)\$\$', '', 1, 1)
from tbl
where regexp_like(txt,'\$\$(.*?)\$\$')
union all
select regexp_substr(txt,'\$\$(.*?)\$\$', 1, 1, null, 1),
regexp_replace(txt,'\$\$(.*?)\$\$', '', 1, 1)
from cte
where regexp_like(txt,'\$\$(.*?)\$\$')
)
select match from cte
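The key change from the question's pattern is the non-greedy quantifier `.*?`, which stops at the first closing `$$` instead of the last one. The difference can be seen in a few lines of Python, used here only to illustrate the regex behaviour, not the Oracle functions.

```python
import re

text = "$$xxx$$ ... $$yyy$$"

# Greedy: .* runs to the last "$$", swallowing everything in between.
greedy = re.search(r"\$\$(.*)\$\$", text).group(1)

# Non-greedy: .*? stops at the first closing "$$", so each pair matches separately.
lazy = re.findall(r"\$\$(.*?)\$\$", text)

print(greedy)
print(lazy)
```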
One could also use CONNECT BY to "loop" through the elements surrounded by the double dollar signs, returning the data inside (the 2nd grouping). This method handles NULL elements (ID 7, element 2) and since the dollar signs are consumed as the regex moves from left to right, characters in between the groups are not falsely matched.
SQL> with tbl(id, txt) as (
select 1, '$$xxx$$' from dual union all
select 2, '$$xxx$$ ... $$yyy$$' from dual union all
select 3, '' from dual union all
select 4, '$$xxx$$abc$$yyy$$' from dual union all
select 5, '$$xxx$$ ... $$yyy$$ ... $$www$$ ... $$zzz$$' from dual union all
select 6, '$$aaa$$$$bbb$$$$ccc$$$$ddd$$' from dual union all
select 7, '$$aaa$$$$$$$$ccc$$$$ddd$$' from dual
)
select id, level, regexp_substr(txt,'(\$\$(.*?)\$\$)',1,level,null,2) element
from tbl
connect by regexp_substr(txt,'(\$\$(.*?)\$\$)',1,level) is not null
and prior txt = txt
and prior sys_guid() is not null
order by id, level;
ID LEVEL ELEMENT
---------- ---------- -------------------------------------------
1 1 xxx
2 1 xxx
2 2 yyy
3 1
4 1 xxx
4 2 yyy
5 1 xxx
5 2 yyy
5 3 www
5 4 zzz
6 1 aaa
6 2 bbb
6 3 ccc
6 4 ddd
7 1 aaa
7 2
7 3 ccc
7 4 ddd
18 rows selected.
SQL>

Oracle SQL - getting values out of a table where all values must be equal

So let's say I have this table with names and scores, let's call it grades, and the score values can only be 1, 2 or 3
names | scores
Bob | 3
Bob | 3
Bob | 3
John | 3
John | 1
Peter | 3
I want the names of the people who got a perfect score (3 in all of their scores). My problem is that each person can have a different number of grades.
The expected output would be something like this:
names
Bob
Peter
How do I do this in Oracle SQL?
try this...
select names
from grades
group by names
having count(*) * 3 = sum(scores)
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE grades ( name, grade ) AS
SELECT 'Bob', 3 FROM DUAL
UNION ALL SELECT 'Bob', 3 FROM DUAL
UNION ALL SELECT 'Bob', 3 FROM DUAL
UNION ALL SELECT 'John', 3 FROM DUAL
UNION ALL SELECT 'John', 1 FROM DUAL
UNION ALL SELECT 'Peter', 3 FROM DUAL
Query 1:
SELECT name
FROM grades
GROUP BY name
HAVING MIN(grade) = 3
Results:
| NAME |
|-------|
| Peter |
| Bob |
This will do it:
select g1.name
from grades g1
where g1.score = 3
group by name
having count(*) = (select count(*)
from grades g2
where g2.name = g1.name);
It retrieves all names where the count of rows with score = 3 equals the total count for that name.
SQLFiddle: http://sqlfiddle.com/#!4/29637/1
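As a quick cross-check, the MIN-based and SUM-based conditions can be run side by side. The sketch below uses an in-memory SQLite database as a stand-in for Oracle, with the `name` and `grade` column names from the SQL Fiddle setup. Both conditions rely on 3 being the highest possible score.

```python
import sqlite3

# SQLite stands in for Oracle; the GROUP BY / HAVING logic is the same.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grades (name TEXT, grade INTEGER)")
conn.executemany(
    "INSERT INTO grades VALUES (?, ?)",
    [("Bob", 3), ("Bob", 3), ("Bob", 3), ("John", 3), ("John", 1), ("Peter", 3)],
)

# A person has only 3s exactly when their lowest grade is 3.
by_min = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM grades GROUP BY name HAVING MIN(grade) = 3 ORDER BY name"
    )
]

# Equivalent here: the sum equals 3 times the number of grades.
by_sum = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM grades GROUP BY name "
        "HAVING COUNT(*) * 3 = SUM(grade) ORDER BY name"
    )
]

print(by_min)
print(by_sum)
```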