BigQuery Standard SQL: Setting a value to positive or negative infinity - sql

I note from the docs for BigQuery Standard SQL Mathematical functions that we can test whether a number "is infinity" using the IS_INF() function, something like this:
WITH demo_tbl AS (
  SELECT 1 AS val UNION ALL
  SELECT 100 AS val
)
SELECT
  val,
  IS_INF(val) AS is_infinity
FROM demo_tbl
which outputs:
+---+-----+-------------+
|   | val | is_infinity |
+---+-----+-------------+
| 0 |   1 | False       |
| 1 | 100 | False       |
+---+-----+-------------+
but is it possible to explicitly set a value to be positive or negative infinity using some constant value or key word?
Perhaps something like this:
WITH demo_tbl AS (
  SELECT 1 AS val UNION ALL
  SELECT +inf AS val -- <-- THIS doesn't work
)
SELECT
  val,
  IS_INF(val) AS is_infinity
FROM demo_tbl
in order to give desired output similar to this:
+---+-----+-------------+
|   | val | is_infinity |
+---+-----+-------------+
| 0 |   1 | False       |
| 1 | inf | True        |
+---+-----+-------------+
I searched the documentation as best I could and Googled around, but couldn't turn up an answer.

You can create the value using cast():
select is_inf(cast('Infinity' as float64))
or:
select is_inf(cast('inf' as float64))
Buried in the documentation is:
There is no literal representation of NaN or infinity, but the
following case-insensitive strings can be explicitly cast to float:
"NaN"
"inf" or "+inf"
"-inf"

How to UNNEST an Array Postgresql double nested?

I'm struggling to unnest an array in this format (newbie alert!). Use case: I want to count all v=1234 in a table where custom_fields = {f=[{v=1234}, {v=[]}]}
I tried to use:
select custom_fields[safe_offset(1)]
from database
limit 10
it gives me the column, but still everything is nested.
Then I tried this:
SELECT tickets.id, cf
FROM db.tickets
CROSS JOIN UNNEST(tickets.custom_fields) AS cf
limit 10
same behaviour as the first query.
I also tried [][] indexing:
SELECT
custom_fields[1][1]
FROM db.tickets
limit 10
*Array element access with array[position] is not supported. Use
array[OFFSET(zero_based_offset)] or array[ORDINAL(one_based_ordinal)]
but yeah, that's the query at the beginning of this message.
I'm pretty lost. Anyone have an idea?
Not sure I fully understood your question, but I replicated your example, adding an id column and a json_col column containing the JSON. The following statement extracts each v value into a different row, still related to its id:
with my_tbl as (
  select 1 id, '{"f":[{"v":1234}, {"v":2345}, {"v":7777}]}'::jsonb as json_col UNION ALL
  select 2 id, '{"f":[{"v":6789}, {"v":3333}]}'::jsonb as json_col
)
select * from my_tbl, jsonb_to_recordset(jsonb_extract_path(json_col, 'f')) as x(v int);
The SQL uses JSONB_EXTRACT_PATH to extract the f part, and JSONB_TO_RECORDSET to create a row for each v value. More info on JSON functions is in the documentation.
id | json_col | v
----+------------------------------------------------+------
1 | {"f": [{"v": 1234}, {"v": 2345}, {"v": 7777}]} | 1234
1 | {"f": [{"v": 1234}, {"v": 2345}, {"v": 7777}]} | 2345
1 | {"f": [{"v": 1234}, {"v": 2345}, {"v": 7777}]} | 7777
2 | {"f": [{"v": 6789}, {"v": 3333}]} | 6789
2 | {"f": [{"v": 6789}, {"v": 3333}]} | 3333
(5 rows)
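Since the original goal was to count all v=1234, the same recordset can feed a filtered count (a sketch reusing the my_tbl CTE above):
with my_tbl as (
  select 1 id, '{"f":[{"v":1234}, {"v":2345}, {"v":7777}]}'::jsonb as json_col UNION ALL
  select 2 id, '{"f":[{"v":6789}, {"v":3333}]}'::jsonb as json_col
)
select count(*) as n
from my_tbl, jsonb_to_recordset(jsonb_extract_path(json_col, 'f')) as x(v int)
where v = 1234;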

Convert a percentage(string, with a %) to a decimal in postgresql

I would like to average the Score values (stored as strings) for each person in the following table in PostgreSQL:
No. | Name | Term | Score
  1 | A    | 1    | 95.00%
  2 | A    | 2    | 99.00%
  3 | C    | 1    | 90.00%
  4 | D    | 1    | 100.00%
...
Postgres does not like the % on the score. How can I convert it into a decimal/float from a string containing a %, as shown above?
I tried:
score::decimal
but it complains:
ERROR: invalid input syntax for type numeric: "95.00%"
SQL state: 22P02
cast() also does not seem to work.
How do I convert this?
One method uses replace():
select replace(score, '%', '')::numeric
If you actually want to convert it to a number between 0 and 1 rather than 0 and 100, try a case:
select (case when right(score, 1) = '%'
             then (replace(score, '%', '')::numeric) / 100
             else score::numeric
        end)
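Since the original goal was an average per person, the cleaned-up value can go straight into avg() (a sketch; the table name scores and columns name and score are assumptions based on the question):
select name, avg(replace(score, '%', '')::numeric) as avg_score
from scores
group by name;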

Filter json values regardless of keys in PostgreSQL

I have a table called diary which includes columns listed below:
| id | user_id | custom_foods |
|----|---------|--------------------|
| 1 | 1 | {"56": 2, "42": 0} |
| 2 | 1 | {"19861": 1} |
| 3 | 2 | {} |
| 4 | 3 | {"331": 0} |
I would like to count how many diaries with custom_foods value(s) larger than 0 each user has. I don't care about the keys, since the keys can be any number in string form.
The desired output is:
| user_id | count |
|---------|---------|
| 1 | 2 |
| 2 | 0 |
| 3 | 0 |
I started with:
select *
from diary as d
join json_each_text(d.custom_foods) as e
on d.custom_foods != '{}'
where e.value > 0
I don't even know whether the syntax is correct. Now I am getting the error:
ERROR: function json_each_text(text) does not exist
LINE 3: join json_each_text(d.custom_foods) as e
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
My version is: psql (10.5 (Ubuntu 10.5-1.pgdg14.04+1), server 9.4.19). According to the PostgreSQL 9.4.19 documentation, that function should exist. I am so confused that I don't know how to proceed.
Threads that I referred to:
Postgres and jsonb - search value at any key
Query postgres jsonb by value regardless of keys
Your custom_foods column is defined as text, so you should cast it to json before applying json_each_text. Since json_each_text produces no rows for an empty JSON object, you can get the count of 0 for empty jsons from a separate CTE and do a UNION ALL:
WITH empty AS (
  SELECT DISTINCT user_id, 0 AS count
  FROM diary
  WHERE custom_foods = '{}'
)
SELECT user_id,
       count(CASE WHEN value::int > 0 THEN 1 END)
FROM diary d,
     json_each_text(d.custom_foods::json)
GROUP BY user_id
UNION ALL
SELECT * FROM empty
ORDER BY user_id;
Demo
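For reference, the type error itself can be reproduced and fixed in isolation (a standalone sketch):
select * from json_each_text('{"56": 2, "42": 0}'::text);        -- ERROR: function json_each_text(text) does not exist
select * from json_each_text('{"56": 2, "42": 0}'::text::json);  -- one key/value row per entry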

Postgresql: Dynamic Regex Pattern

I have event data that looks like this:
id | instance_id | value
1 | 1 | a
2 | 1 | ap
3 | 1 | app
4 | 1 | appl
5 | 2 | b
6 | 2 | bo
7 | 1 | apple
8 | 2 | boa
9 | 2 | boat
10 | 2 | boa
11 | 1 | appl
12 | 1 | apply
Basically, each row is a user typing a new letter. They can also delete letters.
I'd like to create a dataset that looks like this, let's call it data
id | instance_id | value
7 | 1 | apple
9 | 2 | boat
12 | 1 | apply
My goal is to extract all the complete words in each instance, accounting for deletion as well - so it's not sufficient to just get the longest word or the most recently typed.
To do so, I was planning to do a regex operation like so:
select * from data d
where not exists (select * from data d2 where d2.value ~ (d.value || '.'))
Effectively I'm trying to build a dynamic regex that matches one character more than is present, and is specific to the row it's matching against.
The code above doesn't seem to work. In Python, I can "compile" a regex pattern before I use it. What is the equivalent in PostgreSQL to dynamically build a pattern?
Try the simple LIKE operator instead of regex patterns:
SELECT * FROM data d1
WHERE NOT EXISTS (
  SELECT * FROM data d2
  WHERE d2.value LIKE d1.value || '_%'
)
Demo: https://dbfiddle.uk/?rdbms=postgres_9.6&fiddle=cd064c92565639576ff456dbe0cd5f39
Create an index on the value column; this should speed up the query a bit.
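If you specifically want the dynamic regex from the question, the same anti-join works once the pattern is anchored to the start of the string (a sketch; it assumes value contains no regex metacharacters, which would otherwise need escaping):
SELECT * FROM data d1
WHERE NOT EXISTS (
  SELECT * FROM data d2
  WHERE d2.value ~ ('^' || d1.value || '.')
);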
To find peaks in sequential data, window functions are a good choice. You just need to compare each value with the previous and next ones using the lag() and lead() functions:
with cte as (
  select *,
         length(value) > coalesce(length(lead(value) over (partition by instance_id order by id)), 0)
     and length(value) > coalesce(length(lag(value) over (partition by instance_id order by id)), length(value))
         as is_peak
  from data
)
select * from cte where is_peak order by id;
Demo

SQL for comparison of strings comprised of number and text

I need to compare two strings that contain a number and possibly text. For example, I have this table:
id | label 1 | label 2 |
 1 | 12/H    | 1       |
 2 | 4/A     | 41/D    |
 3 | 13/A    | 3/F     |
 4 | 8/A     | 8/B     |
...
I need to determine the direction so that if Label 1 < Label 2 then the Direction is W (with), else it is A (against). So I have to build a view that presents the data this way:
id | Direction
 1 | A
 2 | W
 3 | A
 4 | W
...
I'm using postgres 9.2.
WITH x AS (
SELECT id
,split_part(label1, '/', 1)::int AS l1_nr
,split_part(label1, '/', 2) AS l1_txt
,split_part(label2, '/', 1)::int AS l2_nr
,split_part(label2, '/', 2) AS l2_txt
FROM t
)
SELECT id
,CASE WHEN (l1_nr, l1_txt) < (l2_nr, l2_txt)
THEN 'W' ELSE 'A' END AS direction
FROM x;
I split the two parts with split_part() and compare with an ad-hoc row type to determine which label is bigger.
The cases where both labels are equal or where either one is NULL have not been defined.
The CTE is not necessary, it's just to make it easier to read.
-> sqlfiddle
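To see why the ad-hoc row comparison does the right thing, here is a quick standalone check (values taken from rows 1 and 4 of the sample data):
SELECT (12, 'H') < (1, '')  AS row1_is_w,  -- false: 12 > 1, so direction 'A'
       (8, 'A')  < (8, 'B') AS row4_is_w;  -- true: the numbers tie, then 'A' < 'B', so 'W'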
You can try something like:
SELECT id, CASE WHEN regexp_replace(label_1, '[^0-9]', '', 'g')::numeric <
                     regexp_replace(label_2, '[^0-9]', '', 'g')::numeric
                THEN 'W'
                ELSE 'A'
           END
FROM table1
regexp_replace deletes all non-numeric characters from the string, and ::numeric converts the result to a number.
Details here: regexp_replace, pattern matching, CASE WHEN
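One caveat worth noting (my observation, not part of the answer above): because the text part is stripped, ties like row 4 (8/A vs 8/B) compare as 8 < 8 and fall through to 'A', whereas the split_part() approach above yields the expected 'W':
SELECT CASE WHEN regexp_replace('8/A', '[^0-9]', '', 'g')::numeric <
                 regexp_replace('8/B', '[^0-9]', '', 'g')::numeric
            THEN 'W' ELSE 'A'
       END;  -- returns 'A'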