Format a number to have commas (1000000 -> 1,000,000) - google-bigquery

In BigQuery, how do we format a number in the result set with commas, for example 1000000 as 1,000,000?

The query below is for BigQuery Standard SQL:
SELECT
input,
FORMAT("%'d", input) as formatted
FROM (
SELECT 123 AS input UNION ALL
SELECT 1234 AS input UNION ALL
SELECT 12345 AS input UNION ALL
SELECT 123456 AS input UNION ALL
SELECT 1234567 AS input UNION ALL
SELECT 12345678 AS input UNION ALL
SELECT 123456789 AS input
)
This works great for integers, but if you need floats too, you can use:
SELECT
input,
CONCAT(FORMAT("%'d", CAST(input AS int64)),
SUBSTR(FORMAT("%.2f", CAST(input AS float64)), -3)) as formatted
FROM (
SELECT 123 AS input UNION ALL
SELECT 1234 AS input UNION ALL
SELECT 12345 AS input UNION ALL
SELECT 123456.1 AS input UNION ALL
SELECT 1234567.12 AS input UNION ALL
SELECT 12345678.123 AS input UNION ALL
SELECT 123456789.1234 AS input
)
Added for Legacy SQL:
By the way, if for whatever reason you are bound to Legacy SQL, below is a quick example for it:
SELECT input, formatted
FROM JS((
SELECT input
FROM
(SELECT 123 AS input ),
(SELECT 1234 AS input ),
(SELECT 12345 AS input ),
(SELECT 123456 AS input ),
(SELECT 1234567 AS input ),
(SELECT 12345678 AS input ),
(SELECT 123456789 AS input)
),
// input
input,
// output
"[
{name: 'input', type:'integer'},
{name: 'formatted', type:'string'}
]",
// function
"function (r, emit) {
emit({
input: r.input,
formatted: r.input.toString().replace(/(\d)(?=(\d{3})+(?!\d))/g, '$1,')
});
}"
)
The example above uses the inline version of Legacy SQL User-Defined Functions, which is usually used for a quick demo or example but is not recommended in production. If you find it useful, you will need to transform it "very slightly"; see https://cloud.google.com/bigquery/user-defined-functions#webui for an example.
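The regular expression used inside the JavaScript function can be tried outside BigQuery. Below is a minimal Python sketch, assuming Python's re module handles this lookahead the same way for plain ASCII digits; the function name add_commas is made up for illustration and is not part of any BigQuery API.

```python
import re


def add_commas(n: int) -> str:
    """Insert thousands separators into a non-negative integer."""
    # Match a digit that is followed by one or more complete groups of
    # three digits and then no further digit, and append a comma to it.
    return re.sub(r"(\d)(?=(\d{3})+(?!\d))", r"\1,", str(n))


if __name__ == "__main__":
    for n in (123, 1234, 1234567, 123456789):
        print(n, add_commas(n))
```

Numbers with fewer than four digits never match the pattern, so they come back unchanged.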

With Standard SQL:
SELECT FORMAT("%'d", 1000123)
1,000,123
Instructions for enabling Standard SQL: https://cloud.google.com/bigquery/sql-reference/enabling-standard-sql

An improvement on Mikhail's answer for the float section:
CAST(input AS int64) rounds, so a number like 12345.5 becomes 12,346.50 in the output.
Instead, I split on "." to get the integer part of the number, then cast that part to int64.
CREATE TEMP FUNCTION format_n(x float64) AS (
CONCAT(
FORMAT("%'d", CAST(SPLIT(CAST(x AS string), '.')[OFFSET(0)] AS int64)),
SUBSTR(FORMAT("%.2f", x), -3)
)
);
SELECT
input,
format_n(input)
FROM (
SELECT
123 AS input
UNION ALL
SELECT
1234 AS input
UNION ALL
SELECT
12345 AS input
UNION ALL
SELECT
123456.8 AS input
UNION ALL
SELECT
1234567.12 AS input
UNION ALL
SELECT
12345678.127 AS input
UNION ALL
SELECT
123456789.1234 AS input )
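
The difference between the two float approaches can be illustrated outside BigQuery. The following is a rough Python sketch, not BigQuery code: format_cast and format_split are hypothetical names, inputs are assumed to be non-negative, and the rounding of CAST(x AS int64) is emulated by rounding halves upward.

```python
import math


def format_cast(x: float) -> str:
    """Mimic the CAST-based query: round to an integer, then append decimals."""
    # Assumption: non-negative input, halves rounded upward.
    whole = math.floor(x + 0.5)
    return f"{whole:,}" + f"{x:.2f}"[-3:]


def format_split(x: float) -> str:
    """Mimic the SPLIT-based function: take the digits before the '.'."""
    whole = int(str(x).split(".")[0])
    return f"{whole:,}" + f"{x:.2f}"[-3:]


if __name__ == "__main__":
    for value in (12345.5, 123456.8, 12345678.127):
        print(value, format_cast(value), format_split(value))
```

In this sketch the split variant still shows a mismatch when the two-decimal rounding carries into the integer part (12345.999 comes out as 12,345.00), so it may be worth checking such inputs against the SQL function as well.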

Related

How to convert quarter in Oracle SQL

I am currently working with Oracle SQL and have the following string values:
'1/2019', '2/2019', '3/2019', '4/2019'.
These values should be transformed as follows:
'1/2019' should be converted to '01.01.2019'
'2/2019' should be converted to '01.04.2019'
'3/2019' should be converted to '01.07.2019'
'4/2019' should be converted to '01.10.2019'
Is there a way in Oracle SQL to implement this transformation or is it necessary to write my own implementation (for example with SQL Case)?
It may take a bit of string manipulation, but this should solve your question:
WITH test_data (date_string) AS
(
select '1/2019' from dual union all
select '2/2019' from dual union all
select '3/2019' from dual union all
select '4/2019' from dual
)
SELECT ADD_MONTHS(TO_DATE('01/01/'||SUBSTR(td.date_string, -4), 'MM/DD/YYYY'), (SUBSTR(td.date_string, 1, 1)-1)*3)
FROM test_data td;
with
test_inputs (str) as (
select '1/2019' from dual union all
select '2/2019' from dual union all
select '3/2019' from dual union all
select '4/2019' from dual
)
select str, add_months(to_date(str, 'mm/yyyy'), 2 * substr(str, 1, 1) - 2) as qtr
from test_inputs
;
STR QTR
------ -----------
1/2019 01-JAN-2019
2/2019 01-APR-2019
3/2019 01-JUL-2019
4/2019 01-OCT-2019
The with clause is for testing only; what you are looking for is the formula in the main select.
I left the result in date data type (displayed here using my current session's settings). If you need to convert back to string in the specific format you requested, you can wrap within to_char(..., 'dd.mm.yyyy').
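
The arithmetic behind both formulas can be checked with a small Python sketch. It is only an illustration of the month calculation, not Oracle code; the function names are hypothetical, and the input is assumed to always have the form 'Q/YYYY' with Q between 1 and 4.

```python
from datetime import date


def quarter_start(text: str) -> date:
    """First answer: start from January and add (Q - 1) * 3 months."""
    quarter, year = text.split("/")
    first_month = 1 + (int(quarter) - 1) * 3
    return date(int(year), first_month, 1)


def quarter_start_alt(text: str) -> date:
    """Second answer: read Q as a month number, then add 2 * Q - 2 months."""
    quarter, year = text.split("/")
    q = int(quarter)
    return date(int(year), q + (2 * q - 2), 1)


if __name__ == "__main__":
    for s in ("1/2019", "2/2019", "3/2019", "4/2019"):
        # Format as dd.mm.yyyy, the layout requested in the question.
        print(s, quarter_start(s).strftime("%d.%m.%Y"))
```

Both functions reduce to month 3 * Q - 2, which is why the two SQL formulas give the same dates.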

Oracle : replace string of options based on data set - is this possible?

I have column in table looking like this:
PATTERN
{([option1]+[option2])*([option3]+[option4])}
{([option1]+[option2])*([option3]+[option4])*([option6]+[option7])}
{[option1]+[option6]}
{([option1]+[option2])*([option8]+[option9])}
{([option1]+[option2])*[option4]}
{[option10]}
Every option has a numeric value.
There is a table - let's call it option_set and records look like
OPTION VALUE
option1 3653265
option2 26452
option3 73552
option3 100
option4 1235
option5 42565
option6 2330
option7 544
option9 2150
I want to replace each option name in the first table with its number if it exists in option_set, and with 0 if it does not.
I have done this in PL/SQL (get the pattern, go through every option, and regexp_replace it if it exists),
but I am wondering whether this could be done in plain SQL.
My goal is to replace the values in all patterns for the current OPTION_SET and return only the records where the resulting equation is greater than 0. Of course, I couldn't evaluate the equation in SQL, so I am thinking of something like
for rec in
(
SELECT...
)
loop
execute immediate '...';
if above_equation > 0 then ..
end loop;
Any ideas would be appreciated
You can write a loop-like query in SQL with a recursive CTE that replaces one more token on each iteration, which lets you replace all the tokens.
The only way I know to execute a dynamic query inside a SQL statement in Oracle is the DBMS_XMLGEN package, which lets you evaluate the expression and filter by the resulting value without PL/SQL. All of this is only viable when the pattern and option tables have low cardinality.
Here's the code:
with a as (
select 1 as id, '{([option1]+[option2])*([option3]+[option4])}' as pattern from dual union all
select 2 as id, '{([option1]+[option2])*([option3]+[option4])*([option6]+[option7])}' as pattern from dual union all
select 3 as id, '{[option1]+[option6]}' as pattern from dual union all
select 4 as id, '{([option1]+[option2])*([option8]+[option9])}' as pattern from dual union all
select 5 as id, '{([option1]+[option2])*[option4]}' as pattern from dual union all
select 6 as id, '{[option10]}' as pattern from dual
)
, opt as (
select 'option1' as opt, 3653265 as val from dual union all
select 'option2' as opt, 26452 as val from dual union all
select 'option3' as opt, 73552 as val from dual union all
select 'option3' as opt, 100 as val from dual union all
select 'option4' as opt, 1235 as val from dual union all
select 'option5' as opt, 42565 as val from dual union all
select 'option6' as opt, 2330 as val from dual union all
select 'option7' as opt, 544 as val from dual union all
select 'option9' as opt, 2150 as val from dual
)
, opt_ordered as (
/*Order options to iterate over*/
select opt.*, row_number() over(order by 1) as rn
from opt
)
, rec (id, pattern, repl_pattern, lvl) as (
select
id,
pattern,
pattern as repl_pattern,
0 as lvl
from a
union all
select
r.id,
r.pattern,
/*Replace each part at new step*/
replace(r.repl_pattern, '[' || o.opt || ']', o.val),
r.lvl + 1
from rec r
join opt_ordered o
on r.lvl + 1 = o.rn
)
, out_prepared as (
select
rec.*,
case
when instr(repl_pattern, '[') = 0
/*When there's no more not parsed expressions, then we can try to evaluate them*/
then dbms_xmlgen.getxmltype(
'select ' || replace(replace(repl_pattern, '{', ''), '}', '')
|| ' as v from dual'
)
/*Otherwise SQL statement will fail*/
end as parsed_expr
from rec
/*Retrieve the last step*/
where lvl = (select max(rn) from opt_ordered)
)
select
id,
pattern,
repl_pattern,
extractvalue(parsed_expr, '/ROWSET/ROW/V') as calculated_value
from out_prepared o
where extractvalue(parsed_expr, '/ROWSET/ROW/V') > 0
ID | PATTERN | REPL_PATTERN | CALCULATED_VALUE
-: | :------------------------------------------------------------------ | :---------------------------------------- | :---------------
1 | {([option1]+[option2])*([option3]+[option4])} | {(3653265+26452)*(73552+1235)} | 275194995279
2 | {([option1]+[option2])*([option3]+[option4])*([option6]+[option7])} | {(3653265+26452)*(73552+1235)*(2330+544)} | 790910416431846
3 | {[option1]+[option6]} | {3653265+2330} | 3655595
5 | {([option1]+[option2])*[option4]} | {(3653265+26452)*1235} | 4544450495
Here is one way to do this. There's a lot to unpack, so hang on tight.
I include the test data in the with clause. Of course, you won't need that; simply remove the two "tables" and use your actual table and column names in the query.
From Oracle 12.1 on, we can define PL/SQL functions directly in the with clause, right at the top; if we do so, the query must be terminated with a slash (/) instead of the usual semicolon (;). If your version is earlier than 12.1, you can define the function separately. The function I use takes an "arithmetic expression" (a string representing a compound arithmetic operation) and returns its value as a number. It uses native dynamic SQL (the "execute immediate" statement), which will cause the query to be relatively slow, as a different cursor is parsed for each row. If speed becomes an issue, this can be changed, to use a bind variable (so that the cursor is parsed only once).
The recursive query in the with clause replaces each placeholder with the corresponding value from the "options" table. I use 0 either if a "placeholder" doesn't have a corresponding option in the table, or if it does but the corresponding value is null. (Note that your sample data shows option3 twice; that makes no sense, and I removed one occurrence from my sample data.)
Instead of replacing one placeholder at a time, I took the opposite approach; assuming the patterns may be long, but the number of "options" is small, this should be more efficient. Namely: at each step, I replace ALL occurrences of '[optionN]' (for a given N) in a single pass. Outside the recursive query, I replace all the placeholders for "non-existent" options with 0.
Note that recursive with clause requires Oracle 11.2. If your version is even earlier than that (although it shouldn't be), there are other ways; you would likely need to do that in PL/SQL also.
So, here it is - a single SELECT query for the whole thing:
with
function expr_eval(pattern varchar2) return number as
x number;
begin
execute immediate 'select ' || pattern || ' from dual' into x;
return x;
end;
p (id, pattern) as (
select 1, '{([option1]+[option2])*([option3]+[option4])}' from dual union all
select 2, '{([option1]+[option2])*([option3]+[option4])*([option6]+[option7])}' from dual union all
select 3, '{[option1]+[option6]}' from dual union all
select 4, '{([option1]+[option2])*([option8]+[option9])}' from dual union all
select 5, '{([option1]+[option2])*[option4]}' from dual union all
select 6, '{[option10]}' from dual union all
select 7, '{[option2]/([option3]+[option8])-(300-[option2])/(0.1 *[option3])}' from dual
)
, o (opt, val) as (
select 'option1', 3653265 from dual union all
select 'option2', 26452 from dual union all
select 'option3', 100 from dual union all
select 'option4', 1235 from dual union all
select 'option5', 42565 from dual union all
select 'option6', 2330 from dual union all
select 'option7', 544 from dual union all
select 'option9', 2150 from dual
)
, n (opt, val, rn, ct) as (
select opt, val, rownum, count(*) over ()
from o
)
, r (id, pattern, rn, ct) as (
select id, substr(pattern, 2, length(pattern) - 2), 1, null
from p
union all
select r.id, replace(r.pattern, '[' || n.opt || ']', nvl(to_char(n.val), 0)),
r.rn + 1, n.ct
from r join n on r.rn = n.rn
)
, ae (id, pattern) as (
select id, regexp_replace(pattern, '\[[^]]*]', '0')
from r
where rn = ct + 1
)
select id, expr_eval(pattern) as result
from ae
order by id
/
Output:
ID RESULT
---- ---------------
1 4912422195
2 14118301388430
3 3655595
4 7911391550
5 4544450495
6 0
7 2879.72
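
The substitute-then-evaluate idea can also be sketched outside the database. The Python below is a hedged illustration, not a translation of either query: fill_and_eval and evaluate are hypothetical helpers, OPTIONS copies the sample values from the second answer, and the expression is walked with the ast module instead of being executed as dynamic code.

```python
import ast
import operator
import re

# Sample option values taken from the second answer's test data.
OPTIONS = {
    "option1": 3653265, "option2": 26452, "option3": 100, "option4": 1235,
    "option5": 42565, "option6": 2330, "option7": 544, "option9": 2150,
}

# Only the four basic arithmetic operators are supported in this sketch.
BINARY_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
}


def evaluate(node):
    """Evaluate a parsed arithmetic expression without using eval()."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
        return BINARY_OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -evaluate(node.operand)
    raise ValueError("unsupported expression")


def fill_and_eval(pattern, options):
    """Replace every [name] placeholder (0 if unknown or null), then evaluate."""
    expr = pattern.strip("{}")
    expr = re.sub(r"\[([^\]]*)\]", lambda m: str(options.get(m.group(1)) or 0), expr)
    return evaluate(ast.parse(expr, mode="eval"))


if __name__ == "__main__":
    print(fill_and_eval("{([option1]+[option2])*([option3]+[option4])}", OPTIONS))
    print(fill_and_eval("{[option10]}", OPTIONS))
```

Filtering for results greater than 0, as the question asks, would then be an ordinary comparison on the returned number.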

Extracting substring in Oracle

Let's say I have three rows with value as
1 121/2808B|:6081
2 OD308B|:6081_1:
3 008312100001200|:6081_1
I want to display the value only up to B and exclude everything after B. So, as you can see in the data above:
from 121/2808B|:6081 I want only 121/2808B
from OD308B|:6081_1: only OD308B
from 008312100001200|:6081_1 only 008312100001200.
Thanks for the Help.
Try this: regexp_substr('<Your_string>','[^B]+')
SELECT
REGEXP_SUBSTR('121/2808B|:6081', '[^B]+')
FROM
DUAL;
REGEXP_S
--------
121/2808
SELECT
REGEXP_SUBSTR('OD308B|:6081_1:', '[^B]+')
FROM
DUAL;
REGEX
-----
OD308
SELECT
REGEXP_SUBSTR('008312100001200.', '[^B]+')
FROM
DUAL;
REGEXP_SUBSTR('0
----------------
008312100001200.
Cheers!!
You could try using SUBSTR() and INSTR()
select SUBSTR('121/2808B|:6081',1,INSTR('121/2808B|:6081','B', 1, 1) -1)
from DUAL
I think you forgot to mention that you wanted to use | as a field separator, but I deduced this from the expected result from the third string. As such the following should give you what you want:
WITH cteData AS (SELECT 1 AS ID, '121/2808B|:6081' AS STRING FROM DUAL UNION ALL
SELECT 2, 'OD308B|:6081_1:' FROM DUAL UNION ALL
SELECT 3, '008312100001200|:6081_1' FROM DUAL)
SELECT ID, STRING, SUBSTR(STRING, 1, CASE
WHEN INSTR(STRING, 'B') = 0 THEN INSTR(STRING, '|')-1
ELSE INSTR(STRING, 'B')-1
END) AS UP_TO_B
FROM cteData;
Assuming Bob Jarvis is correct in the assumption that "|" is also a delimiter (as seems likely) try:
-- define test data
with test as
( select '121/2808B|:6081' stg from dual union all
select 'OD308B|:6081_1:' from dual union all
select '008312100001200|:6081_1' from dual
)
-- execute extract
select regexp_substr(stg , '[^B|]+') val
from test ;
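
To see what the '[^B|]+' pattern returns for the three sample strings, here is a small Python sketch. It assumes Python's re module behaves like Oracle's regexp_substr for this simple character class; the function names are made up for illustration. Note that the expected values listed in the question keep the trailing B, which the second helper reproduces by cutting at "|" only.

```python
import re


def up_to_b_or_pipe(s):
    """First run of characters that are neither 'B' nor '|'."""
    match = re.search(r"[^B|]+", s)
    return match.group(0) if match else None


def up_to_pipe(s):
    """Everything before the first '|', keeping a trailing 'B' if present."""
    return s.split("|", 1)[0]


if __name__ == "__main__":
    for s in ("121/2808B|:6081", "OD308B|:6081_1:", "008312100001200|:6081_1"):
        print(s, up_to_b_or_pipe(s), up_to_pipe(s))
```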

Extract number or string after string in BigQuery

I have several thousand URLs and want to extract some values from the URL parameters.
Here are some examples from the DB:
["www.xxx.com?uci=6666&rci=fefw"]
["www.xxx.com?uci=61
["www.xxx.com?rci=62&uci=5536"]
["www.xxx.com?uci=6666&utm_source=XXX"]
["www.xxx.com?pccst=TEST%20sTESTg"]
["www.xxx.com?pccst=TEST2%20s&uci=1"]
["www.xxx.com?uci=1pccst=TEST42rt24&rci=2"]
How can I extract the value of the parameter uci? It is always a number (I don't know the exact length).
I tried REGEXP_EXTRACT, but I didn't succeed:
REGEXP_EXTRACT(URL, '(uci)\=[0-9]+') AS UCI_extract
I also want to extract the value of the parameter pccst. It can contain any character and I don't know the exact length, but it always ends with " or ? or &.
I tried REGEXP_EXTRACT for this as well, but didn't succeed:
REGEXP_EXTRACT(URL, r'pccst\=(.*)(\"|\&|\?)') AS pccst_extract
I am really not a regex expert,
so it would be great if someone could help me.
Thanks a lot in advance,
Peter
You can adapt this solution:
#standardSQL
# Extract query parameters from a URL as ARRAY in BigQuery; standard-sql; 2018-04-08
# #see http://www.pascallandau.com/bigquery-snippets/extract-url-parameters-array/
WITH examples AS (
SELECT 1 AS id, 'www.xxx.com?uci=6666&rci=fefw' AS query
UNION ALL SELECT 2, 'www.xxx.com?uci=1pccst%20TEST42rt24&rci=2'
UNION ALL SELECT 3, 'www.xxx.com?pccst=TEST2%20s&uci=1'
)
SELECT
id,
query,
REGEXP_EXTRACT_ALL(query,r'(?:\?|&)((?:[^=]+)=(?:[^&]*))') as params,
REGEXP_EXTRACT_ALL(query,r'(?:\?|&)(?:([^=]+)=(?:[^&]*))') as keys,
REGEXP_EXTRACT_ALL(query,r'(?:\?|&)(?:(?:[^=]+)=([^&]*))') as values
FROM examples
Below is an example for BigQuery Standard SQL:
#standardSQL
WITH `project.dataset.table` AS (
SELECT "www.xxx.com?uci=6666&rci=fefw" url UNION ALL
SELECT "www.xxx.com?uci=61" UNION ALL
SELECT "www.xxx.com?rci=62&uci=5536" UNION ALL
SELECT "www.xxx.com?uci=6666&utm_source=XXX" UNION ALL
SELECT "www.xxx.com?pccst=TEST%20sTESTg" UNION ALL
SELECT "www.xxx.com?pccst=TEST2%20s&uci=1" UNION ALL
SELECT "www.xxx.com?uci=1&pccst=TEST42rt24&rci=2"
)
SELECT
url,
REGEXP_EXTRACT(url, r'[?&]uci=(.*?)(?:$|&)') uci,
REGEXP_EXTRACT(url, r'[?&]pccst=(.*?)(?:$|&)') pccst
FROM `project.dataset.table`
The result is:
Row  url                                       uci   pccst
1    www.xxx.com?pccst=TEST%20sTESTg           null  TEST%20sTESTg
2    www.xxx.com?pccst=TEST2%20s&uci=1         1     TEST2%20s
3    www.xxx.com?uci=1&pccst=TEST42rt24&rci=2  1     TEST42rt24
4    www.xxx.com?uci=61                        61    null
5    www.xxx.com?rci=62&uci=5536               5536  null
6    www.xxx.com?uci=6666&rci=fefw             6666  null
7    www.xxx.com?uci=6666&utm_source=XXX       6666  null
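The extraction pattern can be tried outside BigQuery too. The following Python sketch assumes that Python's re module treats this pattern like BigQuery's REGEXP_EXTRACT does; the helper name param is hypothetical.

```python
import re


def param(url, name):
    """Return the value of one query parameter, or None if it is absent."""
    # [?&] anchors the name at the start of a parameter, the lazy group
    # stops at the next '&' or at the end of the string.
    match = re.search(r"[?&]" + re.escape(name) + r"=(.*?)(?:$|&)", url)
    return match.group(1) if match else None


if __name__ == "__main__":
    for url in (
        "www.xxx.com?rci=62&uci=5536",
        "www.xxx.com?pccst=TEST%20sTESTg",
        "www.xxx.com?uci=1&pccst=TEST42rt24&rci=2",
    ):
        print(url, param(url, "uci"), param(url, "pccst"))
```

Anchoring on [?&] keeps a lookup for uci from matching inside some longer parameter name that merely ends in "uci".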
Also, below is an option that parses out all key-value pairs, so that you can then dynamically select the ones you need:
#standardSQL
WITH `project.dataset.table` AS (
SELECT "www.xxx.com?uci=6666&rci=fefw" url UNION ALL
SELECT "www.xxx.com?uci=61" UNION ALL
SELECT "www.xxx.com?rci=62&uci=5536" UNION ALL
SELECT "www.xxx.com?uci=6666&utm_source=XXX" UNION ALL
SELECT "www.xxx.com?pccst=TEST%20sTESTg" UNION ALL
SELECT "www.xxx.com?pccst=TEST2%20s&uci=1" UNION ALL
SELECT "www.xxx.com?uci=1pccst=TEST42rt24&rci=2"
)
SELECT url,
ARRAY(
SELECT AS STRUCT
SPLIT(kv, '=')[SAFE_OFFSET(0)] key,
SPLIT(kv, '=')[SAFE_OFFSET(1)] value
FROM UNNEST(SPLIT(SUBSTR(url, LENGTH(NET.HOST(url)) + 2), '&')) kv
) key_value_pair
FROM `project.dataset.table`
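
The same split-based parsing can be sketched in Python to see what the pairs look like. This is an approximation under stated assumptions: key_values is a hypothetical helper, the query string is taken as everything after the first "?" rather than computed with NET.HOST, and a missing value is returned as None in the spirit of SAFE_OFFSET.

```python
def key_values(url):
    """Split the query string into (key, value) pairs on '&' and '='."""
    query = url.partition("?")[2]
    pairs = []
    for kv in query.split("&"):
        parts = kv.split("=")
        # Like SAFE_OFFSET(1): use None when there is no second element.
        value = parts[1] if len(parts) > 1 else None
        pairs.append((parts[0], value))
    return pairs


if __name__ == "__main__":
    print(key_values("www.xxx.com?uci=6666&rci=fefw"))
    print(key_values("www.xxx.com?uci=1pccst=TEST42rt24&rci=2"))
```

The last sample URL has no "&" between uci and pccst, so in this sketch its first pair comes out as key uci with value 1pccst.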

SQL Query to show string before a dash

I would like to execute a query that shows only the part of the string before the dash in a particular field.
For example:
Original data: AB-123
After query: AB
You can use substr:
WITH DATA AS (SELECT 'AB-123' txt FROM dual)
SELECT substr(txt, 1, instr(txt, '-') - 1)
FROM DATA;
SUBSTR(TXT,1,INSTR(TXT,'-')-1)
------------------------------
AB
or regexp_substr (10g+):
WITH DATA AS (SELECT 'AB-123' txt FROM dual)
SELECT regexp_substr(txt, '^[^-]*')
FROM DATA;
REGEXP_SUBSTR(TXT,'^[^-]*')
---------------------------
AB
You can use regexp_replace.
For example
WITH DATA AS (
SELECT 'AB-123' as text FROM dual
UNION ALL
SELECT 'ABC123' as text FROM dual
)
SELECT
regexp_replace(d.text, '-.*$', '') as result
FROM DATA d;
will lead to
RESULT
------------------------------------------------------
AB
ABC123
I found this simple:
SELECT distinct
regexp_replace(d.pyid, '-.*$', '') as result
FROM schema.table d;
pyID column contains ABC-123, DEF-3454
SQL Result:
ABC
DEF
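
The approaches above differ when the value has no dash at all. Below is a short Python sketch of that difference; the function names are hypothetical, and None stands in for an Oracle NULL, which is what SUBSTR returns when the requested length is less than 1.

```python
import re
from typing import Optional


def before_dash_substr(txt: str) -> Optional[str]:
    """Mimic substr(txt, 1, instr(txt, '-') - 1)."""
    pos = txt.find("-")  # -1 if absent, 0 if the dash comes first
    # Without a dash (or with a leading dash) the length is below 1.
    return txt[:pos] if pos > 0 else None


def before_dash_regex(txt: str) -> str:
    """Mimic regexp_replace(txt, '-.*$', ''): drop the first dash and the rest."""
    return re.sub(r"-.*$", "", txt)


if __name__ == "__main__":
    for value in ("AB-123", "ABC123"):
        print(value, before_dash_substr(value), before_dash_regex(value))
```

So if some rows may lack a dash and should be kept whole, the regexp_replace form is the safer choice in this sketch.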