PostgreSQL How to query String array - sql

I am trying to write a query to check if an element is within an array of Strings.
Here is my simple select query along with the output
select languages from person limit 3;
{CSS,HTML,Java,JavaScript,Python}
{JavaScript,Python,TensorFlow}
{C++,Python}
How do I write a query to find all people who have "Java" as a listed language they know?
I tried following the syntax but it isn't working.
select languages from person where languages #> ARRAY['Java']::varchar[];

You need to use a string constant on the left side, and the ANY operator on the array column:
select languages
from person
where 'Java' = any(languages);
This assumes languages is defined as text[] or varchar[], as your sample output indicates.
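As a rough illustration of the semantics (a Python sketch with the sample rows hard-coded, not PostgreSQL code): = ANY compares the constant with each element for equality, so 'Java' does not match an element such as 'JavaScript'. The helper name equals_any is made up for this sketch, and NULL handling is ignored.

```python
# Sample rows copied from the question; each list stands in for one text[] value.
rows = [
    ["CSS", "HTML", "Java", "JavaScript", "Python"],
    ["JavaScript", "Python", "TensorFlow"],
    ["C++", "Python"],
]


def equals_any(value, array):
    """Mimic `value = ANY(array)`: true if some element equals value exactly."""
    return any(value == element for element in array)


# Keep only the rows where "Java" is an exact element of the array.
matching = [row for row in rows if equals_any("Java", row)]
print(matching)
```

Only the first sample row qualifies, because the comparison is on whole elements rather than substrings.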

If languages is stored as comma-separated text rather than as an array, try this:
select languages from person where 'Java' = ANY (string_to_array(languages, ','));

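To search for more than one value, you can use the regular expression match operator '~' with a POSIX regular expression instead of '='. The pattern must be the right operand of '~', so unnest the elements and test each one, for example:
select languages from person where exists (select 1 from unnest(string_to_array(languages, ',')) as lang where lang ~ '^(Java|Php)$');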
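Note that square brackets in a regular expression form a character class rather than a list of alternatives: a pattern such as [Java,Php] matches a single character out of J, a, v, comma, P, h or p, whereas (Java|Php) is an alternation of two whole words. The sketch below shows the difference in Python, whose re module treats brackets and alternation the same way for this case. The anchors ^ and $ are an assumption that exact element matches are wanted.

```python
import re

char_class = re.compile(r"[Java,Php]")     # one character from the set
alternation = re.compile(r"^(Java|Php)$")  # exactly "Java" or exactly "Php"

languages = ["CSS", "HTML", "Java", "JavaScript", "Python"]

# The character class fires on any element containing one of its characters.
class_hits = [lang for lang in languages if char_class.search(lang)]
# The anchored alternation matches only the intended whole words.
alt_hits = [lang for lang in languages if alternation.search(lang)]

print(class_hits)
print(alt_hits)
```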

Related

How to get the nth match from regexp_matches() as plain text

I have this code:
with demo as (
select 'WWW.HELLO.COM' web
union all
select 'hi.co.uk' web)
select regexp_matches(replace(lower(web),'www.',''),'([^\.]*)') from demo
And the table I get is:
regexp_matches
{hello}
{hi}
What I would like to do is:
with demo as (
select 'WWW.HELLO.COM' web
union all
select 'hi.co.uk' web)
select regexp_matches(replace(lower(web),'www.',''),'([^\.]*)')[1] from demo
Or even the big query version:
with demo as (
select 'WWW.HELLO.COM' web
union all
select 'hi.co.uk' web)
select regexp_matches(replace(lower(web),'www.',''),'([^\.]*)')[offset(1)] from demo
But neither works. Is this possible? If it isn't clear, the result I would like is:
match
hello
hi
Use split_part() instead. Simpler, faster. To get the first word, before the first separator .:
WITH demo(web) AS (
VALUES
('WWW.HELLO.COM')
, ('hi.co.uk')
)
SELECT split_part(replace(lower(web), 'www.', ''), '.', 1)
FROM demo;
See:
Split comma separated column data into additional columns
regexp_matches() returns setof text[], i.e. 0-n rows of text arrays. (Because each regular expression can result in a set of multiple matching strings.)
In Postgres 10 or later, there is also the simpler variant regexp_match() that only returns the first match, i.e. text[]. Either way, the surrounding curly braces in your result are the text representation of the array literal.
You can take the first row and unnest the first element of the array, but since you neither want the set nor the array to begin with, use split_part() instead. Simpler, faster, and less versatile. But good enough for the purpose. And it returns exactly what you want to begin with: text.
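For comparison, the same split_part() idea can be sketched in Python: lower-case the value, drop the literal 'www.' prefix text, and keep the part before the first dot. The function name first_label is invented for this sketch.

```python
def first_label(web):
    """Rough analogue of split_part(replace(lower(web), 'www.', ''), '.', 1)."""
    cleaned = web.lower().replace("www.", "")
    # split(".")[0] is the text before the first separator.
    return cleaned.split(".")[0]


for web in ["WWW.HELLO.COM", "hi.co.uk"]:
    print(first_label(web))
```

Like split_part(), this returns a plain string, so there is no array to index into.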
I'm a little confused. Doesn't this do what you want?
with demo as (
select 'WWW.HELLO.COM' web
union all
select 'hi.co.uk' web
)
select (regexp_matches(replace(lower(web), 'www.',''), '([^\.]*)'))[1]
from demo
This is basically your query with extra parentheses so it does not generate a syntax error.
Here is a db<>fiddle illustrating that it returns what you want.

How to run a SELECT query with LIKE on thousands of rows

Newbie here. I've been searching for hours now, but I can't seem to find the correct answer or properly phrase my search.
I have thousands of rows (order IDs) that I want to put in an IN list, but I have to run a LIKE on these values at the same time, since the column contains JSON and there's no dedicated table that holds only the order_id value. I am running the query in BigQuery.
Sample Input:
ORD12345
ORD54376
Table I'm trying to Query: transactions_table
Query:
SELECT order_id, transaction_uuid,client_name
FROM transactions_table
WHERE JSON_VALUE(transactions_table,'$.ordernum') LIKE IN ('%ORD12345%','%ORD54376%')
This just doesn't work, especially if I have thousands of rows.
Also, how do I add the order id that I am querying so that it appears under an order_id column in the query result?
Desired Output:
Option one
WITH transf as (Select order_id, transaction_uuid,client_name , JSON_VALUE(transactions_table,'$.ordernum') as o_num from transactions_table)
Select * from transf where o_num like '%ORD12345%' or o_num like '%ORD54376%'
Option two
Split o_num using "-" as the separator, create a table of orders such as (select 'ORD12345' as num
Union
Select 'ORD54376' as num), and inner join it with transf.o_num.
One method uses OR:
WHERE JSON_VALUE(transactions_table, '$.ordernum') LIKE '%ORD12345%' OR
JSON_VALUE(transactions_table, '$.ordernum') LIKE '%ORD54376%'
An alternative method uses regular expressions:
WHERE REGEXP_CONTAINS(JSON_VALUE(transactions_table, '$.ordernum'), 'ORD12345|ORD54376')
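With thousands of order IDs, the alternation would normally be generated rather than typed by hand. The following Python sketch shows one way to build such a pattern and to recover which ID matched, which is what an order_id output column would need. The ordernums values are hypothetical, and whether a very long pattern is acceptable to BigQuery has not been verified here.

```python
import re

# The two ids from the question; in practice this list would hold thousands.
order_ids = ["ORD12345", "ORD54376"]

# Escape each id so that any regex metacharacter is treated literally,
# then join with | to form one alternation, e.g. ORD12345|ORD54376.
pattern = "|".join(re.escape(order_id) for order_id in order_ids)
matcher = re.compile(pattern)

# Hypothetical ordernum values as they might come out of the JSON column.
ordernums = ["X-ORD12345-1", "ORD99999", "batch ORD54376"]

for value in ordernums:
    found = matcher.search(value)
    # found.group(0) is the id that matched, usable as an order_id column.
    print(value, "->", found.group(0) if found else None)
```

The generated pattern string can then be passed to the query as the second argument of REGEXP_CONTAINS.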
According to the documentation, the LIKE operator works as follows:
Checks if the STRING in the first operand X matches a pattern
specified by the second operand Y. Expressions can contain these
characters:
A percent sign "%" matches any number of characters or
bytes.
An underscore "_" matches a single character or byte.
You can escape "\", "_", or "%" using two backslashes. For example, "\\%". If
you are using raw strings, only a single backslash is required. For
example, r"\%".
Thus, the syntax would be like the following:
SELECT
order_id,
transaction_uuid,
client_name
FROM
transactions_table
WHERE
JSON_VALUE(transactions_table, '$.ordernum') LIKE '%ORD12345%'
OR JSON_VALUE(transactions_table, '$.ordernum') LIKE '%ORD54376%'
Notice that we specify two conditions connected with the OR logical operator.
As a bonus, when querying large datasets it is good practice to select only the columns you need in your output (either in a temp table or a final view) instead of using *. BigQuery is columnar, so reading fewer columns is one of the reasons such queries are faster.
As an alternative for using LIKE, you can use REGEXP_CONTAINS, according to the documentation:
Returns TRUE if value is a partial match for the regular expression, regex.
Using the following syntax:
REGEXP_CONTAINS(value, regex)
However, it also works if you use a plain STRING between single or double quotes instead of a more elaborate regular expression. In addition, you can use the pipe operator (|) to OR the searched components together when you have more than one expression to search for, as follows:
where regexp_contains(email,"gary|test")
I hope it helps.

SQL Substring \g

I would just like to know where do I put the \g in this query?
SELECT project,
SUBSTRING(address FROM 'A-Za-z') AS letters,
SUBSTRING(address FROM '\d') AS numbers
FROM repositories
I tried this but this brings back nothing (it doesn't throw an error though)
SELECT project,
SUBSTRING(CONCAT(address, '#') FROM 'A-Za-z' FOR '#') AS letters,
SUBSTRING(CONCAT(address, '#') FROM '\d' FOR '#') AS numbers
FROM repositories
Here is an example: I would like the string 1DDsg6bXmh3W63FTVN4BLwuQ4HwiUk5hX to return DDsgbXmhWFTVNBLwuQHwiUkhX. So basically return all the letters...and then my second one is to return all the numbers.
The g (“global”) modifier in regular expressions indicates that all matches rather than only the first one should be used.
That doesn't make much sense in the substring function, which returns only a single value, namely the first match. So there is no way to use g with substring.
In those functions where it makes sense in PostgreSQL (regexp_replace and regexp_matches), the g can be specified in the optional last flags parameter.
If you want to find all substrings that match a pattern, use regexp_matches.
For your example, which really has nothing to do with substring at all, I'd use
SELECT translate('1DDsg6bXmh3W63FTVN4BLwuQ4HwiUk5hX', '0123456789', '');
translate
---------------------------
DDsgbXmhWFTVNBLwuQHwiUkhX
(1 row)
This is not standard SQL but PostgreSQL-specific, and it also does the job:
SELECT project,
regexp_replace(address, '[^A-Za-z]', '', 'g') AS letters,
regexp_replace(address, '[^0-9]', '', 'g') AS numbers
FROM repositories;
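The effect of the 'g' flag can be sketched in Python, where re.sub replaces every match by default and count=1 limits it to the first match. This is only an analogy to regexp_replace, using the example string from the question.

```python
import re

address = "1DDsg6bXmh3W63FTVN4BLwuQ4HwiUk5hX"

# re.sub replaces every match by default, like the 'g' flag in regexp_replace.
letters = re.sub(r"[^A-Za-z]", "", address)
numbers = re.sub(r"[^0-9]", "", address)

# count=1 replaces only the first match, like regexp_replace without 'g'.
first_only = re.sub(r"[^A-Za-z]", "", address, count=1)

print(letters)
print(numbers)
print(first_only)
```

Without the global behaviour only the leading 1 is removed, which is why the flag matters for this task.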

Too big number for repeat range when using regexp_like in where clause

I tried to run the following query:
select * from table where regexp_like('^{{', text_field)
And got the following error:
too big number for repeat range
Thinking perhaps regexp_like is confusing { for the repeat count operator, I also tried the following variations:
select * from table where regexp_like('^\{\{', text_field)
select * from table where regexp_like('^[{][{]', text_field)
select * from table where regexp_like('^[[:punct:]]{2}', text_field)
None of which worked. For now, text_field like '{{' suffices, but I may want to include a more flexible version of this that would require regular expressions. What's wrong with my approach here? And what does this error message mean?
You are using the Presto regexp_like function the wrong way round. The string comes first and the pattern second, so the arguments in your query are swapped:
regexp_like(string, pattern)
Evaluates the regular expression pattern and determines if it is
contained within string. This function is similar to the LIKE
operator, except that the pattern only needs to be contained within
string, rather than needing to match all of string. In other words,
this performs a contains operation rather than a match operation. You
can match the entire string by anchoring the pattern using ^ and $:
SELECT regexp_like('1a 2b 14m', '\d+b'); -- true
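The argument order can be sketched with a small Python helper that mirrors the documented signature. The helper regexp_like below is a hypothetical stand-in built on Python's re module, which is not the engine Presto uses, so edge cases may differ; the sample strings are made up.

```python
import re


def regexp_like(string, pattern):
    """Hypothetical helper mirroring Presto's regexp_like(string, pattern):
    true if pattern is found anywhere in string."""
    return re.search(pattern, string) is not None


# String first, pattern second; braces are escaped to keep them literal.
starts_with_braces = regexp_like("{{name}}", r"^\{\{")
braces_later = regexp_like("name {{x}}", r"^\{\{")
doc_example = regexp_like("1a 2b 14m", r"\d+b")

print(starts_with_braces, braces_later, doc_example)
```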

Django queryset to find entries in all capital letters?

In raw SQL, it is possible to look through a database to find rows with a column for which the contents are written in all capital letters; that question was answered here.
Is there a way to accomplish the same thing using the Django ORM and without resorting to .raw() ?
You can use a regular expression match in the Django ORM. Link to documentation: https://docs.djangoproject.com/en/dev/ref/models/querysets/#iregex
Example:
Entry.objects.get(title__regex=r'^(An?|The) +')
SQL equivalents:
SELECT ... WHERE title REGEXP BINARY '^(An?|The) +'; -- MySQL
SELECT ... WHERE REGEXP_LIKE(title, '^(An?|The) +', 'c'); -- Oracle
SELECT ... WHERE title ~ '^(An?|The) +'; -- PostgreSQL
SELECT ... WHERE title REGEXP '^(An?|The) +'; -- SQLite
Using raw strings (e.g., r'foo' instead of 'foo') for passing in the regular expression syntax is recommended.
EDIT:
You can add the regular expression like:
Entry.objects.get(title__regex=r'^[[:upper:]]+$') #not tested
Figured it out. Looks like the best way is to use extra. For example,
MyModel.objects.extra(where=['title = UPPER(title)'])
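The two approaches in this thread do not accept quite the same titles. The plain-Python sketch below (no Django involved, function names invented here) compares the title = UPPER(title) condition with an ASCII-only pattern such as ^[A-Z]+$, assuming the database comparison is case-sensitive.

```python
import re


def is_upper_by_comparison(title):
    """Mirrors the SQL condition title = UPPER(title)."""
    return title == title.upper()


def is_upper_by_regex(title):
    """Mirrors a pattern like ^[A-Z]+$ (ASCII capitals only, no spaces)."""
    return re.fullmatch(r"[A-Z]+", title) is not None


for title in ["HELLO", "HELLO WORLD 2", "Hello", "123"]:
    print(title, is_upper_by_comparison(title), is_upper_by_regex(title))
```

The comparison accepts titles containing spaces or digits as long as no lower-case letter is present, while the pattern rejects them.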