Searching and returning all occurrences of a keyword in a Postgres text column - sql

About database
The database table that holds the content of a Confluence page is named bodycontent, and the HTML content is stored in a column named body, which is a text field. I'm using a Postgres database. The primary key is named bodycontentid.
Result I need
For each row in the table, I need to find every occurrence of an <image> tag in the body column whose src attribute starts with "http://images.mydomain.com/allImages/".
Example
Let's say that the body with bodycontentid = 12345 contains the following text:
<h1>Chapter 1</h1>
<image src="http://www.google.com/image/111.jpg"/>
<h1>Chapter 2</h1>
<image src="http://images.mydomain.com/allImages/222.jpg"/>
<h1>Chapter 3</h1>
<image src="http://images.mydomain.com/allImages/333.jpg"/>
Running the query should return:
bodycontentid: 12345
body: http://images.mydomain.com/allImages/222.jpg
bodycontentid: 12345
body: http://images.mydomain.com/allImages/333.jpg
What I have tried
I'm able to find all rows that have at least one occurrence of the keyword I'm searching for (see below), but what I need is a list of all the matching keywords for each row.
SELECT *
FROM bodycontent
WHERE body LIKE '%http://images.mydomain.com/allImages/%'

One method is to use regexp_split_to_table() and then some string manipulation:
select bc.bodycontentid,
left(rst.s, position('"' in rst.s) - 1) as domain
from bodycontent bc, lateral
regexp_split_to_table(bc.body, E'src="') rst(s)
where rst.s like 'http://images.mydomain.com/allImages/%';
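The same extraction can be sanity-checked outside the database. The following is a minimal Python sketch, not part of the original answer: it uses a regular expression capture instead of split-and-trim, and the function name extract_image_urls is a hypothetical helper. The sample body is the one from the question.

```python
import re
from typing import List

PREFIX = "http://images.mydomain.com/allImages/"

def extract_image_urls(body: str, prefix: str = PREFIX) -> List[str]:
    """Return every src="..." value in body that starts with prefix."""
    # Capture everything from the prefix up to (not including) the closing quote.
    pattern = re.compile(r'src="(' + re.escape(prefix) + r'[^"]*)"')
    return pattern.findall(body)

body = """<h1>Chapter 1</h1>
<image src="http://www.google.com/image/111.jpg"/>
<h1>Chapter 2</h1>
<image src="http://images.mydomain.com/allImages/222.jpg"/>
<h1>Chapter 3</h1>
<image src="http://images.mydomain.com/allImages/333.jpg"/>
"""

for url in extract_image_urls(body):
    print(12345, url)
```

Each printed line corresponds to one row of the desired result: the bodycontentid followed by one matching URL.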

Related

How to add "x in y" clause in ExpressJS

In Postgres I have created a simple table called tags with these columns:
tag_id
tag
owner_id
In ExpressJS, this query works fine:
return pool.query(`SELECT tag_id, tag FROM tags WHERE owner_id = $1`, [ownerId]);
Now what I want to do is restrict which tags are returned via an array of tag values I'm passing in:
const tagsCsv = convertArrayToCSV(tags); // Example: "'abc','def'"
return pool.query(`SELECT tag_id, tag FROM tags WHERE owner_id = $1 AND tag IN ($2)`, [ownerId, tagsCsv]);
The code doesn't crash but it returns an empty array when I know for a fact that both abc & def are sample tags in my table.
I thus suspect something is wrong with my syntax but am not sure what. Might anyone have any ideas?
Robert
I did more searching and found this: node-postgres: how to execute "WHERE col IN (<dynamic value list>)" query?
Following the examples in there, I stopped converting the string array to a CSV string and instead did this:
const tags: Array<string> = values.tags;
return pool.query(`SELECT tag_id, tag FROM tags WHERE owner_id = $1 AND tag = ANY($2::text[])`, [ownerId, tags]);
This worked perfectly, returning the records I was expecting!
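The likely reason the first attempt returned nothing is that a single placeholder is bound as one value, so the whole CSV string is compared against tag as a single literal. Below is a minimal sketch of that behaviour in Python, using the standard-library sqlite3 module as a stand-in for Postgres (an assumption made purely for illustration; the sample rows are invented). SQLite has no array type, so instead of = ANY($2::text[]) the working variant here generates one placeholder per value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (tag_id INTEGER PRIMARY KEY, tag TEXT, owner_id INTEGER)")
conn.executemany(
    "INSERT INTO tags (tag, owner_id) VALUES (?, ?)",
    [("abc", 1), ("def", 1), ("ghi", 1)],  # hypothetical sample rows
)

owner_id = 1
tags = ["abc", "def"]

# Attempt 1: one placeholder bound to a CSV string.
# The driver sends "'abc','def'" as ONE value, so no tag equals it.
tags_csv = ",".join("'" + t + "'" for t in tags)
rows_csv = conn.execute(
    "SELECT tag FROM tags WHERE owner_id = ? AND tag IN (?)",
    (owner_id, tags_csv),
).fetchall()

# Attempt 2: one placeholder per value, each bound separately.
placeholders = ",".join("?" for _ in tags)
rows_list = conn.execute(
    f"SELECT tag FROM tags WHERE owner_id = ? AND tag IN ({placeholders}) ORDER BY tag",
    (owner_id, *tags),
).fetchall()

print(rows_csv)
print(rows_list)
```

With node-postgres, passing the JavaScript array itself and comparing with = ANY($2::text[]), as in the answer above, avoids building the placeholder list by hand.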

request where there are at least 2 records in other table

I have this doc xml(short version) :
<Artiste>
<artiste a_id="A62" a_p_id="UK" a_date_nais="07/06/1952" a_sexe="M">
<a_prenom>Liam</a_prenom>
<a_nom>Neeson</a_nom>
</artiste>
<artiste a_id="A66" a_p_id="UK" a_date_nais="08/09/1971" a_sexe="M">
<a_prenom>Martin</a_prenom>
<a_nom>Freeman</a_nom>
</artiste>
<Film>
<film f_id="F1" f_p_id="FR" f_r_id="A61">
<f_genre>P</f_genre>
<f_titre>Banlieue 13</f_titre>
<f_date_sortie>10/11/2004</f_date_sortie>
<f_resume>fiction française</f_resume>
<f_role ro_nom="Leïto" ro_a_id="A63"/>
<f_role ro_nom="Lola" ro_a_id="A64"/>
</film>
<film f_id="F2" f_p_id="NZ" f_r_id="A59">
<f_genre>A</f_genre>
<f_titre>Les seigneurs des anneaux</f_titre>
<f_date_sortie>19/12/2005</f_date_sortie>
<f_resume>fiction américaine</f_resume>
<f_role ro_nom="Pêcheur" ro_a_id="A25"/>
<f_role ro_nom="Sirène" ro_a_id="A2"/>
</film>
</Film>
An artist plays in a film (movie). Each artist has an a_id attribute in Artiste, which matches the ro_a_id attribute in Film.
I want to select the first name and last name (a_prenom, a_nom) of every artist who has played in at least 2 movies (film).
This is what I've done :
for $artiste in doc('S:/path/file.xml')//Artiste/artiste
(: retrieve film $artiste is working in :)
let $film := ('S:/path/file.xml')//Film/film[@ro_a_id=$artiste/@a_id]
where count(@ro_a_id)>=2
order by $artiste/@a_id
return $x/a_nom, $x/a_prenom
So I don't know how to join and make the request, and I also don't know how to return 2 fields (I know that $x/a_nom, $x/a_prenom line generates an error)
You're very close, but your query has a couple of problems:
The return clause references an undefined variable $x. This should be changed to $artiste
Once that is fixed, you can return each actor's two name elements by constructing a sequence—by wrapping the items in parentheses: ($artiste/a_nom, $artiste/a_prenom). Alternatively you could return a single item, e.g., a string created by concatenating the two name parts, with concat($artiste/a_nom, " ", $artiste/a_prenom).
Your where clause should reference the $film variable, specifically $film/f_role/@ro_a_id.
Your sample data here doesn't contain any artists who appear in two or more of the films listed. So the where clause, even if fixed, will result in 0 hits.
I've posted a revised query to http://xqueryfiddle.liberty-development.net/nbUY4kp/1 showing these suggested changes. You'll see that I commented out the where clause so that we get some results.
The easiest way to make your query work is as follows:
for $artiste in doc('S:/path/file.xml')//Artiste/artiste
(: retrieve film $artiste is working in :)
let $film := doc('S:/path/file.xml')//Film/film[f_role/@ro_a_id=$artiste/@a_id]
where count($film)>=2
order by $artiste/@a_id
return ($artiste/a_nom, $artiste/a_prenom)
Here are the things I changed:
The expression ('S:/path/file.xml') in line 3 should probably be doc('S:/path/file.xml').
@ro_a_id is an attribute of f_role, not film.
You have to count the film elements; @ro_a_id is not in scope on line 4.
$x is never declared, you probably mean $artiste.
The final problem in the last row is that the FLWOR expression ends after the comma, so $artiste/a_prenom is not part of it. You can solve that by surrounding both parts with parentheses.
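To check the join-and-count logic independently of an XQuery processor, here is a minimal Python sketch using the standard-library xml.etree.ElementTree module. The sample document is hypothetical: it reuses the element and attribute names from the question, but the ids, roles and the wrapping root element are invented so that one artist appears in two films.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: artist A1 has a role in two films, A2 in one.
SAMPLE = """<root>
  <Artiste>
    <artiste a_id="A1"><a_prenom>Liam</a_prenom><a_nom>Neeson</a_nom></artiste>
    <artiste a_id="A2"><a_prenom>Martin</a_prenom><a_nom>Freeman</a_nom></artiste>
  </Artiste>
  <Film>
    <film f_id="F1">
      <f_role ro_nom="role1" ro_a_id="A1"/>
      <f_role ro_nom="role2" ro_a_id="A2"/>
    </film>
    <film f_id="F2">
      <f_role ro_nom="role3" ro_a_id="A1"/>
    </film>
  </Film>
</root>"""

def artists_in_at_least(root, minimum=2):
    """Return (a_nom, a_prenom) for artists with roles in >= minimum films."""
    result = []
    artistes = sorted(root.findall("./Artiste/artiste"), key=lambda a: a.get("a_id"))
    for artiste in artistes:
        a_id = artiste.get("a_id")
        # Same join as film[f_role/@ro_a_id = $artiste/@a_id]
        films = [
            film for film in root.findall("./Film/film")
            if any(role.get("ro_a_id") == a_id for role in film.findall("f_role"))
        ]
        if len(films) >= minimum:  # count($film) >= 2
            result.append((artiste.findtext("a_nom"), artiste.findtext("a_prenom")))
    return result

root = ET.fromstring(SAMPLE)
print(artists_in_at_least(root))
```

Note that the films are counted, not the roles, so an artist with two roles in the same film would still count as one film.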

SQL query WHERE column IN (#list#) not returning any results

I have the following simple query:
SELECT code, description
FROM table
WHERE code in ( #list# )
The list is created from an XML feed with listAppend():
<cfset list= listAppend(list, data.data1[i].xmltext )>
<cfset qualifiedList1 = ListQualify(list, "'")>
With listQualify I wrap every element from the list in quotation marks for the query. The problem is that when I run the query, I don't get any results back.
If I dump the query, it looks like this:
SELECT code, description
FROM table
WHERE code in ('''BG/NN1'',''BG/NL2'',''BG/NN3'',''BG/NN4'',''BG/NN5''')
Any ideas on how I can fix this problem?
Update 1:
I've fixed the problem. It was caused by ListQualify(list, "'"): because ListQualify wraps every element in quotes, the list attribute of cfqueryparam didn't recognize any of the values. Thank you!

Get all entries for a specific json tag only in postgresql

I have a database with a json field that has multiple parts, including one called tags. There are other entries, as shown below, but I want to return only the rows whose value is exactly "{"tags":{"+good":true}}".
"{"tags":{"+good":true}}"
"{"has_temps":false,"tags":{"+good":true}}"
"{"tags":{"+good":true}}"
"{"has_temps":false,"too_long":true,"too_long_as_of":"2016-02-12T12:28:28.238+00:00","tags":{"+good":true}}"
I can get part of the way there with this condition in my where clause: trips.metadata->'tags'->>'+good' = 'true'. But that returns every row where the +good tag is true, including all of the entries above. I want to return only the entries whose value is exactly "{"tags":{"+good":true}}", leaving out the two entries that begin with has_temps.
Any thoughts on how to do this?
With a jsonb column, a plain equality comparison is enough:
with trips(metadata) as (
values
('{"tags":{"+good":true}}'::jsonb),
('{"has_temps":false,"tags":{"+good":true}}'),
('{"tags":{"+good":true}}'),
('{"has_temps":false,"too_long":true,"too_long_as_of":"2016-02-12T12:28:28.238+00:00","tags":{"+good":true}}')
)
select *
from trips
where metadata = '{"tags":{"+good":true}}';
metadata
-------------------------
{"tags":{"+good":true}}
{"tags":{"+good":true}}
(2 rows)
If the column's type is json then you should cast it to jsonb:
...
where metadata::jsonb = '{"tags":{"+good":true}}';
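Equality on jsonb compares the parsed values rather than the raw text, so whitespace and key order do not matter. A rough analogy in Python, offered as a sketch rather than a description of how Postgres implements it: parse each document with the standard json module and compare the resulting dictionaries. The rows are the ones from the question, with extra spaces added to the third one to show that formatting is ignored.

```python
import json

rows = [
    '{"tags":{"+good":true}}',
    '{"has_temps":false,"tags":{"+good":true}}',
    '{"tags": {"+good": true}}',  # same value, different whitespace
    '{"has_temps":false,"too_long":true,'
    '"too_long_as_of":"2016-02-12T12:28:28.238+00:00","tags":{"+good":true}}',
]

target = json.loads('{"tags":{"+good":true}}')

# Keep only the rows whose parsed value equals the target exactly;
# rows with additional keys such as has_temps are not equal.
matches = [row for row in rows if json.loads(row) == target]

for row in matches:
    print(row)
```

Only the first and third rows survive, which mirrors the two-row result shown above.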
If I understand you correctly, you can compare the text value of the "tags" key, like this:
select true
where '{"has_temps":false,"too_long":true,"too_long_as_of":"2016-02-12T12:28:28.238+00:00","tags":{"+good":true}}'::json->>'tags'
= '{"+good":true}'

How to write Select query for selecting particular xml nodes in DB2 which occur multiple times?

I have an XML structure as below:
<root>
<firstChild>
<a>
<a1>someText</a1>
<a2>someNumber</a2>
</a>
<a>
<a1>someText1</a1>
<a2>someNumber1</a2>
</a>
<a>
<a1>someText2</a1>
<a2>someNumber2</a2>
</a>
<a>
<a1>someText3</a1>
<a2>someNumber3</a2>
</a>
</firstChild>
</root>
I want to write a DB2 SQL query that returns every application id whose XML has a1 as someText1 and a2 as someNumber1.
For more information: I have a table, say APPLICATION, which has application_xml as a column. This column holds XML documents like the one shown above, each stored against an application id.
Can someone please suggest an approach?
I tried the query below, but it did not work.
select XMLQUERY('copy $new := $application_xml
for $i in $new/root/firstChild/a[a1 = "someText1"], $new/root/firstChild/a[a2 = "someNumber1"]
return $new') from application
Based on your description, I assume that the table has two columns: the application id (aid) and application_xml. Since you want to return the application id, the base structure of the query is:
select aid from application
Now we need the condition of which rows qualify. You state that in the related XML document the elements a1 and a2 need to have a certain value. The function xmlexists is the one to use in the WHERE clause of SQL:
select aid from application
where xmlexists('$d/root/firstChild/a[a1 = "someText1" and a2 = "someNumber1"]' passing application_xml as "d")
XMLEXISTS is used as a filtering predicate. The "passing" clause tells DB2 to expect "application_xml" under the name "d" inside the XPath/XQuery expression. The XPath expression itself looks for the path /root/firstChild/a, and within one specific "a" element both the condition for "a1" and the condition for "a2" need to be true. If you want a broader condition, there are ways to express that as well.
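The point that both conditions must hold inside the same "a" element is easy to miss, so here is a minimal Python sketch of the same predicate using the standard-library xml.etree.ElementTree module. It is only an illustration of the logic, not DB2 code; the function name has_matching_a and the two test documents are hypothetical.

```python
import xml.etree.ElementTree as ET

def has_matching_a(xml_text: str, a1: str, a2: str) -> bool:
    """True if some /root/firstChild/a has BOTH the given a1 and a2 values."""
    root = ET.fromstring(xml_text)  # the parsed element is <root>
    return any(
        a.findtext("a1") == a1 and a.findtext("a2") == a2
        for a in root.findall("./firstChild/a")
    )

# Both values occur in the same <a> element.
DOC_MATCH = (
    "<root><firstChild>"
    "<a><a1>someText</a1><a2>someNumber</a2></a>"
    "<a><a1>someText1</a1><a2>someNumber1</a2></a>"
    "</firstChild></root>"
)

# The values occur, but in two different <a> elements.
DOC_SPLIT = (
    "<root><firstChild>"
    "<a><a1>someText1</a1><a2>someNumber</a2></a>"
    "<a><a1>someText</a1><a2>someNumber1</a2></a>"
    "</firstChild></root>"
)

print(has_matching_a(DOC_MATCH, "someText1", "someNumber1"))
print(has_matching_a(DOC_SPLIT, "someText1", "someNumber1"))
```

The second document is rejected because no single "a" element satisfies both conditions, which is the behaviour of the predicate a[a1 = "someText1" and a2 = "someNumber1"].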