Update from regexp matches in same table without using subquery - sql

I want to fill two columns from the results of a regular expression matching on a column of the same table.
Extracting the matches in an array is easy enough:
select regexp_matches(description, '(?i)^(https?://\S{4,220}\.(?:jpe?g|png))\s(.*)$') matches from room;
(note that only some of the rows match, not all of them)
But in order to do the update I didn't find anything simpler than
1) repeating the regex which would be ridiculous:
update room r set
link=(regexp_matches(description, '(?i)^(https?://\S{4,220}\.(?:jpe?g|png))\s(.*)$'))[1],
description=(regexp_matches(description, '(?i)^(https?://\S{4,220}\.(?:jpe?g|png))\s(.*)$'))[2]
where description ~ '(?i)^(https?://\S{4,220}\.(?:jpe?g|png))\s(.*)$';
2) a query with a subquery and an id join, which looks over complicated and probably not the most efficient:
update room r set link=matches[1], description=matches[2] from (
select id, regexp_matches(description, '(?i)^(https?://\S{4,220}\.(?:jpe?g|png))\s(.*)$') matches from room
) s where matches is not null and r.id=s.id;
What's the proper solution here ? I suspect one of the magical array functions of postgresql would do the trick, or another regexp related function, or maybe something even simpler.

From 9.5, you can use the following syntax:
with p(pattern) as (
select '(?in)^(https?://\S{4,220}\.(?:jpe?g|png))\s(.*)$'
)
update room
set (link, description) = (select m[1], m[2]
from regexp_matches(description, pattern) m)
from p
where description ~ pattern;
This way regexp_matches() executed only once, but this will execute your regex twice. If you want to avoid that you'll need to use a join anyway. Or, you could do:
update room
set (link, description) = (
select coalesce(m[1], l), coalesce(m[2], d)
from (select link l, description d) s,
regexp_matches(d, '(?in)^(https?://\S{4,220}\.(?:jpe?g|png))\s(.*)$') m
);
But this will "touch" every row no matter what. It will just don't modify the values of link and description when there is no match.

Related

How to select a column whose name is a value in another column in POSTGRESQL?

I know this isn't valid SQL, but I'd like to do something like:
SELECT items.{SELECT items.preferred_column}
To elaborate, to achieve what I'm trying to achieve, I could write a long case when statement:
SELECT
CASE WHEN items.preferred_column = "column_a" THEN items.column_a
CASE WHEN items.preferred_column = "column_b" THEN items.column_b
CASE WHEN items.preferred_column = "column_c" THEN items.column_c
... and so on...
But that seems wrong. I would prefer to write a query that looks at the value of items.preferred_column and loads that column.
Is this possible?
My use case involves an Active Record (the ORM for Rails) query, which limits me. I'm not able to use "INTO" for example.
Doing this without creating a SQL function would preferred, though if it's not possible without creating a SQL function that would be good to know.
Thanks in advance for lending your expertise!
You can try transforming the table rows with row_to_json() and then using json_each(), you can join the resultant "key" field on the preferred_column:
WITH CTE AS (
SELECT
row_to_json(Z.*)::jsonb as rcr,
row_number() over(partition by null order by <whatever comparator clause>) as rn,
Z.*
FROM items Z)
SELECT b.value, a.*
FROM CTE a, jsonb_each(rcr) b, CTE c
WHERE c.rn=a.rn AND b.key = ( c.preferred_column )
Note that this essentially operates as a quasi-pivot, so you'll need to maintain an index (the row_number invocation) to self-join the table when extracting the appropriate key-value pairs from jsonb_each's set-return. Casting to jsonb will be helpful in that the binary form will alphabetize the key-value pairs by key order within the object itself.
If you need to get the resultant value as a text string instead of a json primitive, you can do
b.value #>>'{}'
instead of using jsonb_each_text(), which will preserve any json columns.

how to use listagg operator so that the query should fetch comma separated values

SELECT (SELECT STRING_VALUE
FROM EMP_NODE_PROPERTIES
WHERE NODE_ID=AN.ID ) containedWithin
FROM EMP_NODE AN
WHERE AN.STORE_ID = ALS.ID
AND an.TYPE_QNAME_ID=(SELECT ID
FROM EMP_QNAME
where LOCAL_NAME = 'document')
AND
AND AN.UUID='13456677';
from the above query I am getting below error.
ORA-01427: single-row subquery returns more than one row
so how to change the above query so that it should fetch comma separated values
This query won't return error you mentioned because
there are two ANDs and
there's no ALS table (or its alias).
I suggest you post something that is correctly written, then we can discuss other errors.
Basically, it is either select string_value ... or select id ... (or even both of them) that return more than a single value.
The most obvious "solution" is to use select DISTINCT
another one is to include where rownum = 1
or, use aggregate functions, e.g. select max(string_value) ...
while the most appropriate option would be to join all tables involved and decide which row (value) is correct and adjust query (i.e. its WHERE clause) to make sure that desired value will be returned.
You would seem to want something like this:
SELECT LISTAGG(NP.STRING_VALUE, ',') WITHIN GROUP(ORDER BY NP.STRING_VALUE)
as containedWithin
FROM EMP_NODE N
JOIN EMP_QNAME Q
ON N.TYPE_QNAME_ID = Q.ID
LEFT JOIN EMP_NODE_PROPERTIES NP
ON NP.NODE_ID = N.ID
WHERE Q.LOCAL_NAME = 'document'
AND AN.UUID = '13456677';
This is a bit speculative because your original query would not run for the reason explained by Littlefoot.

Using CONTAINS to find items IN a table

I'm trying to write a SP that will allow users to search on multiple name strings, but supports LIKE functionality. For example, the user's input might be a string 'Scorsese, Kaurismaki, Tarkovsky'. I use a split function to turn that string into a table var, with one column, as follows:
part
------
Scorsese
Kaurismaki
Tarkovsky
Then, normally I would return any values from my table matching any of these values in my table var, with an IN statement:
select * from myTable where lastName IN (select * from #myTableVar)
However, this only returns exact matches, and I need to return partial matches. I'm looking for something like this, but that would actually compile:
select * from myTable where CONTAINS(lastName, select * from #myTableVar)
I've found other questions where it's made clear that you can't combine LIKE and IN, and it's recommended to use CONTAINS. My specific question is, is it possible to combine CONTAINS with a table list of values, as above? If so, what would that syntax look like? If not, any other workarounds to achieve my goal?
I'm using SQL Server 2016, if it makes any difference.
You can use EXISTS
SELECT * FROM myTable M
WHERE
EXISTS( SELECT * FROM #myTableVar V WHERE M.lastName like '%'+ V.part +'%' )
Can your parser built the entire statement? Will that get you what you want?
select *
from myTable
where CONTAINS
(lastName,
'"Scorsese" OR "Kaurismaki" OR "Tarkovsky"'
)
This can be done using CHARINDEX function combined with EXISTS:
select *
from myTable mt
where exists(select 1 from #myTableVar
where charindex(mt.lastName, part) > 0
or charindex(part, mt.lastName) > 0)
You might want to omit one of the conditions in the inner query, but I think this is what you want.

Determine the values of a long list which are not existing in a postgres table

Provided there is a long list of values, which happen to be values of attributes of records in a postgres-database.
I would like to create a query which finds out which of these values can not be found in the database.
I have no right to execute DDL-Statements and I would like to avoid procedural code.
Example:
the table might be
CREATE TABLE Test (
ID Integer,
attr varchar(30)
)
The list might be something like (but longer, about 240000 values)
ATTR
TestValue0
TestValue1
TestValue2
TestValue3
Using sed I can create and execute a statement
select count(*) from Test where attr in ('TestValue0',
'TestValue1','TestValue2','TestValue3')
This statement shows me, that not all of these values can be found in Test.
How can I formulate a query which tells me which of these uniq-values can not be found in the postgres-database?
For what you want to do, you can use left join, not in or not exists. But the key is that you need a derived table with the values you care about:
select v.attr
from (values ('TestValue0'), ('TestValue1'), ('TestValue2'), ('TestValue3')
) v attr
where not exists (select 1 from test t where t.attr = v.attr);

How to use a multi-element string for a IN sql query?

Is it possible to use the input from one field of the database for another query in combination with the IN statement. The point is that in the sting, I use for IN, contains several by comma separated values:
SELECT id, name
FROM refPlant
WHERE id IN (SELECT cover
FROM meta_characteristic
WHERE id = 2);
the string of the subquery is: 1735,1736,1737,1738,1739,1740,1741,1742,1743,1744
The query above give me only the first element of the string. But when I put the string directly in the query, I get all the ten elements:
SELECT id, name
FROM refPlant
WHERE id IN (735,1736,1737,1738,1739,1740,1741,1742,1743,1744);
Is it possible to have all ten elements and not only one with query like the first one.
My sql version is 10.1.16-MariaDB
You can use FIND_IN_SET in the join condition.
SELECT r.id, r.name
FROM refPlant r
JOIN (SELECT * FROM meta_characteristic m WHERE id=2) m
ON FIND_IN_SET(r.id,m.cover) > 0
If you use a sub-query as in the first code snippet you will get a filter for each row returned from it. It will not work when it returns as a single string field.
SELECT id, name
FROM refPlant
WHERE FIND_IN_SET(id, (SELECT cover
FROM meta_charateristic
WHERE id = 2));