Presto SQL query - sql

Let's assume that I have an array of strings with the following values:
string = {'123','12ab','38','abc','01a8','1123b'}
How should I write a query in Presto SQL to extract only the values that consist entirely of numerical digits, so that my output would be {'123','38'}?
A query like the one below does not return any output:
SELECT string
FROM table1
WHERE string LIKE '[0-9]*'
GROUP BY string

There are at least two options:
Use the try_cast function provided by Presto, which returns NULL when the value cannot be cast:
-- sample data
WITH dataset(string) AS (
values ('123'),
('12ab'),
('38'),
('abc'),
('01a8'),
('1123b')
)
-- query
select *
from dataset
where try_cast(string as integer) is not null;
Or use regular expressions via regexp_like:
-- query
select *
from dataset
where regexp_like(string, '^\d+$');
Output:
string
123
38
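The two filters above give the same result on this sample, but they are not the same test. As a rough illustration outside Presto, the Python sketch below mirrors both; the function names `digits_only` and `castable_to_int` are made up for this example, and Python's `int()` is only a loose stand-in for try_cast (it has no 32-bit limit, for instance). The point it shows is that a cast accepts a signed value such as '-5', while the digits-only pattern rejects it.

```python
import re

# ASCII digits only; fullmatch anchors the pattern at both ends,
# playing the role of ^ and $ in the regexp_like call.
DIGITS_ONLY = re.compile(r"[0-9]+")


def digits_only(values):
    """Return the values that consist solely of ASCII digits."""
    return [v for v in values if DIGITS_ONLY.fullmatch(v)]


def castable_to_int(values):
    """Loose analogue of the try_cast filter: keep whatever int() accepts."""
    kept = []
    for v in values:
        try:
            int(v)
        except ValueError:
            continue  # comparable to try_cast returning NULL
        kept.append(v)
    return kept


sample = ["123", "12ab", "38", "abc", "01a8", "1123b"]
print(digits_only(sample))
print(castable_to_int(sample))
# The two approaches disagree on a signed value.
print(digits_only(["-5"]), castable_to_int(["-5"]))
```

If strings such as '-5' can occur and should be excluded, the regular expression is the stricter choice.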


Selecting substrings from different points in strings depending on another column entry SQL

I have 2 columns that look a little like this:
Column A | Column B                            | Column C
ABC      | {"ABC":1.0,"DEF":24.0,"XYZ":10.50,} | 1.0
DEF      | {"ABC":1.0,"DEF":24.0,"XYZ":10.50,} | 24.0
I need a SELECT statement to create Column C: the number in Column B that corresponds to the letters in Column A. I have got as far as finding the starting point of the number I want to extract, but because the numbers have different lengths I can't use a fixed length. I want to extract the characters from the calculated starting point (below) up to the next comma.
STRPOS(Column B, Column A) + 5 gives me the correct starting position for a SUBSTRING query; from here I am lost. Any help much appreciated.
NB: I am using Google BigQuery, which doesn't recognise CHARINDEX.
You can use a regular expression instead:
WITH sample_table AS (
SELECT 'ABC' ColumnA, '{"ABC":1.0,"DEF":24.0,"XYZ":10.50,}' ColumnB UNION ALL
SELECT 'DEF', '{"ABC":1.0,"DEF":24.0,"XYZ":10.50,}' UNION ALL
SELECT 'XYZ', '{"ABC":1.0,"DEF":24.0,"XYZ":10.50,}'
)
SELECT *,
REGEXP_EXTRACT(ColumnB, FORMAT('"%s":([0-9.]+)', ColumnA)) ColumnC
FROM sample_table;
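To see what the pattern built by FORMAT matches, here is a small Python sketch of the same idea. The helper name `extract_value` is hypothetical, and the call to `re.escape` is an addition that the SQL above does not have; it only matters if a key could contain regex metacharacters.

```python
import re


def extract_value(key, text):
    """Return the number that follows "key": in text, or None if absent."""
    # Same shape as FORMAT('"%s":([0-9.]+)', ColumnA); the key is escaped
    # so that any regex metacharacters in it are matched literally.
    pattern = r'"%s":([0-9.]+)' % re.escape(key)
    match = re.search(pattern, text)
    return match.group(1) if match else None


column_b = '{"ABC":1.0,"DEF":24.0,"XYZ":10.50,}'
for key in ("ABC", "DEF", "XYZ"):
    print(key, extract_value(key, column_b))
```

The capture group stops at the first character that is not a digit or a dot, which is why no explicit search for the next comma is needed.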
[Updated]
Regarding @Bihag Kashikar's suggestion: since ColumnB is invalid JSON, it will not be parsed properly within a JS UDF like the one below. If it were valid JSON, a JS UDF that looks up the key could be an alternative to a regular expression, I think.
CREATE TEMP FUNCTION custom_json_extract(json STRING, key STRING)
RETURNS STRING
LANGUAGE js AS """
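  // Declare obj locally instead of creating an implicit global.
  let obj;
  try {
    obj = JSON.parse(json);
  } catch (e) {
    // Invalid JSON (for example, a trailing comma) yields NULL.
    return null;
  }
  return obj[key];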
""";
SELECT custom_json_extract('{"ABC":1.0,"DEF":24.0,"XYZ":10.50,}', 'ABC') invalid_json,
custom_json_extract('{"ABC":1.0,"DEF":24.0,"XYZ":10.50}', 'ABC') valid_json;
Take a look at this post too; it shows an approach using a JS UDF together with split options:
Error when trying to have a variable pathsname: JSONPath must be a string literal or query parameter
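The valid-versus-invalid JSON point from the update above can be checked outside BigQuery. The sketch below is a Python analogue of the UDF, not the UDF itself: it reuses the name `custom_json_extract` only for readability, and it returns the parsed number rather than a string.

```python
import json


def custom_json_extract(text, key):
    """Return the value stored under key, or None if text is not valid JSON."""
    try:
        obj = json.loads(text)
    except json.JSONDecodeError:
        # A trailing comma before the closing brace is rejected by the parser.
        return None
    return obj.get(key)


invalid_json = '{"ABC":1.0,"DEF":24.0,"XYZ":10.50,}'
valid_json = '{"ABC":1.0,"DEF":24.0,"XYZ":10.50}'
print(custom_json_extract(invalid_json, "ABC"))
print(custom_json_extract(valid_json, "ABC"))
```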

delimit the output of xml extract function in oracle [duplicate]

I have a CLOB column that contains XML data. For example, the XML data is:
<A><B>123</b><C>456</C><B>789</b></A>
I have tried the concat function:
concat(xmltype (a.xml).EXTRACT ('//B/text()').getStringVal (),';'))
or
xmltype (a.xml).EXTRACT (concat('//B/text()',';').getStringVal ()))
But they put ";" only at the end, not after each <B> tag.
I am currently using:
xmltype (a.xml).EXTRACT ('//B/text()').getStringVal ()
I want to concatenate all <B> values with ";", so the expected result is 123;789.
Please suggest how I can concatenate my data.
The concat() SQL function concatenates two values, so it's just appending the semicolon to each extracted value independently. But you're really trying to do string aggregation of the results (which could, presumably, really be more than two extracted values).
You can use XMLQuery instead of extract, and use the XPath string-join() function to do the concatenation:
XMLQuery('string-join(/A/B, ";")' passing xmltype(a.xml) returning content)
Demo with fixed XML end-node tags:
-- CTE for sample data
with a (xml) as (
select '<A><B>123</B><C>456</C><B>789</B></A>' from dual
)
-- actual query
select XMLQuery('string-join(/A/B, ";")' passing xmltype(a.xml) returning content) as result
from a;
RESULT
------------------------------
123;789
You could also extract all of the individual <B> values using XMLTable, and then use SQL-level aggregation:
-- CTE for sample data
with a (xml) as (
select '<A><B>123</B><C>456</C><B>789</B></A>' from dual
)
-- actual query
select listagg(x.b, ';') within group (order by null) as result
from a
cross join XMLTable('/A/B' passing xmltype(a.xml) columns b number path '.') x;
RESULT
------------------------------
123;789
which gives you more flexibility and would allow grouping by other node values more easily, but that doesn't seem to be needed here based on your example value.
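For comparison, the same "collect every <B> value, then join" logic can be sketched in Python with the standard library. This is an illustration of the aggregation idea only, not of Oracle's XML functions, and the helper name `join_b_values` is made up for the example. It also shows why the end tags had to be fixed: a strict XML parser rejects the mismatched </b> from the question.

```python
import xml.etree.ElementTree as ET


def join_b_values(xml_text, separator=";"):
    """Join the text of every <B> child of the root element."""
    root = ET.fromstring(xml_text)
    return separator.join(b.text or "" for b in root.findall("B"))


fixed_xml = "<A><B>123</B><C>456</C><B>789</B></A>"
print(join_b_values(fixed_xml))

# The original sample closes <B> with </b>, which is not well-formed XML.
try:
    join_b_values("<A><B>123</b><C>456</C><B>789</b></A>")
except ET.ParseError as err:
    print("not well-formed:", err)
```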

how to find the count of substring in string using BigQuery?

I want to find how many times "fizz" appears in the string "fizzbuzzfizz" in BigQuery or SQL.
Here the output should be 2.
You can use REGEXP_EXTRACT_ALL and ARRAY_LENGTH. See this SQL:
WITH data AS(
SELECT 'fizzbuzzfizz' as string
)
SELECT
ARRAY_LENGTH(REGEXP_EXTRACT_ALL(string, "fizz")) AS size FROM data;
This returns 2 for the sample string.
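One detail worth knowing about this approach is that regex extraction counts non-overlapping matches. The Python sketch below illustrates the same counting idea; `count_occurrences` is a hypothetical helper, and `re.escape` is added so the search text is treated literally rather than as a pattern.

```python
import re


def count_occurrences(text, needle):
    """Count non-overlapping occurrences of needle in text."""
    return len(re.findall(re.escape(needle), text))


print(count_occurrences("fizzbuzzfizz", "fizz"))
# Matches do not overlap: "aa" is found twice in "aaaa", not three times.
print(count_occurrences("aaaa", "aa"))
```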

Transform set of data into a single column

Let's say I have this set of integers enclosed in parentheses: (1,2,3,4,5).
Data I have:
(1,2,3,4,5)
And I would want them to be in a single column.
Expected Output:
column
--------
1
2
3
4
5
(5 rows)
How can I do this? I've tried using an array and then unnest, but with no luck. I know I'm doing something wrong.
I need this to optimize a query that uses a large IN list; I want to put the values in a temp table and then join it to the main table.
You can convert the string to an array, then do the unnest:
select *
from unnest(translate('(1,2,3,4,5)', '()', '{}')::int[]);
The translate() call converts '(1,2,3,4,5)' to '{1,2,3,4,5}' which is the string representation of an array. That string is then cast to an array using ::int[].
You don't need a temp table, you can directly join to the result of the unnest.
select *
from some_table t
join unnest(translate('(1,2,3,4,5)', '()', '{}')::int[]) as l(id)
on t.id = l.id;
Another option is to simply use that array in a where condition:
select *
from some_table t
where t.id = any (translate('(1,2,3,4,5)', '()', '{}')::int[]);
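The translate-then-cast step can be pictured with a short Python sketch. It is only an analogy for what the database does: `parse_int_list` is a made-up name, and the parsing after the character swap is a simplification of the ::int[] cast.

```python
def parse_int_list(text):
    """Turn a string like '(1,2,3,4,5)' into a list of integers."""
    # Same character swap as translate(text, '()', '{}').
    as_array_literal = text.translate(str.maketrans("()", "{}"))
    # Simplified stand-in for the ::int[] cast: drop the braces, split, convert.
    inner = as_array_literal.strip("{}")
    return [int(item) for item in inner.split(",") if item.strip()]


print("(1,2,3,4,5)".translate(str.maketrans("()", "{}")))
print(parse_int_list("(1,2,3,4,5)"))
```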

determine DB2 text string length

I am trying to write an SQL statement that selects fields where the string is not 12 characters long. I only want to select the strings that are 10 characters long.
What function can do this in DB2?
I figured it would be something like this, but I can't find anything on it.
select * from table where not length(fieldName, 12)
From the similar question "DB2 - find and compare the lentgh of the value in a table field": add RTRIM, since LENGTH will return the length of the column definition. This should be correct:
select * from table where length(RTRIM(fieldName))=10
UPDATE 27.5.2019: maybe on older DB2 versions the LENGTH function returned the length of the column definition. On DB2 10.5 I tried the function, and it returns the data length, not the column definition length:
select fieldname
, length(fieldName) len_only
, length(RTRIM(fieldName)) len_rtrim
from (values (cast('1234567890  ' as varchar(30)) ))
as tab(fieldName)
FIELDNAME                      LEN_ONLY    LEN_RTRIM
------------------------------ ----------- -----------
1234567890                              12          10
You can find the rows where trailing blanks change the length with this predicate:
where length(fieldName)!=length(rtrim(fieldName))
This will grab records with strings (in the fieldName column) that are 10 characters long:
select * from table where length(fieldName)=10
Most often we write the statement below, which trims both ends before measuring:
select * from table where length(ltrim(rtrim(field)))=10;
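The difference between the raw length and the trimmed length discussed above can be illustrated with a few lines of Python. This is an analogy rather than DB2 behaviour; the helper name `has_trimmed_length` and the sample rows are invented for the example.

```python
def has_trimmed_length(value, expected=10):
    """True when value is `expected` characters long after removing trailing blanks."""
    return len(value.rstrip(" ")) == expected


rows = ["1234567890  ", "1234567890", "123456789012"]
for row in rows:
    # Raw length, trimmed length, and whether the row passes the filter.
    print(len(row), len(row.rstrip(" ")), has_trimmed_length(row))
```

The first row is 12 characters long as stored but 10 after trimming, which is the case the RTRIM variant is meant to catch.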