PostgreSQL query involving integer[] - sql

I have 2 tables:
CREATE TABLE article (
id serial NOT NULL,
title text,
tags integer[] -- array of tag id's from TAG table
)
CREATE TABLE tag (
id serial NOT NULL,
description character varying(250) NOT NULL
)
... and need to select the tags from the TAG table whose ids are held in ARTICLE's 'tags integer[]' column, based on the article's title.
So I tried something like
SELECT *
FROM tag
WHERE tag.id IN ( (select article.tags::int4
from article
where article.title = 'some title' ) );
... which gives me
ERROR: cannot cast type integer[] to integer
LINE 1: ...FROM tag WHERE tag.id IN ( (select article.tags::int4 from ...
I am stuck with PostgreSQL 8.3 in both the dev and production environments.

Use the array overlaps operator &&:
SELECT *
FROM tag
WHERE ARRAY[id] && ANY (SELECT tags FROM article WHERE title = '...');
Using contrib/intarray you can even index this sort of thing quite well.

Take a look at section "8.14.5. Searching in Arrays", but consider the tip at the end of that section:
Tip: Arrays are not sets; searching for specific array elements can be a sign of database misdesign. Consider using a separate table with a row for each item that would be an array element. This will be easier to search, and is likely to scale better for a large number of elements.
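To make that tip concrete, here is a minimal sketch of the separate-table design. It assumes a junction table named article_tag (a hypothetical name, not from the question) and made-up sample rows. It uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for PostgreSQL, so the column types are simplified; the join itself is plain SQL and should carry over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE tag (id INTEGER PRIMARY KEY, description TEXT NOT NULL);
-- Hypothetical junction table: one row per (article, tag) pair,
-- replacing the article.tags integer[] column.
CREATE TABLE article_tag (
    article_id INTEGER NOT NULL REFERENCES article(id),
    tag_id     INTEGER NOT NULL REFERENCES tag(id),
    PRIMARY KEY (article_id, tag_id)
);
INSERT INTO article VALUES (1, 'some title'), (2, 'other title');
INSERT INTO tag VALUES (1, 'sql'), (2, 'arrays'), (3, 'indexing');
INSERT INTO article_tag VALUES (1, 1), (1, 3), (2, 2);
""")


def tags_for_title(title):
    """Return (id, description) for every tag attached to the given article title."""
    return conn.execute(
        """
        SELECT tag.id, tag.description
        FROM tag
        JOIN article_tag ON article_tag.tag_id = tag.id
        JOIN article ON article.id = article_tag.article_id
        WHERE article.title = ?
        ORDER BY tag.id
        """,
        (title,),
    ).fetchall()


print(tags_for_title("some title"))
```

With this layout no array cast or unnest() is needed, and the primary key on article_tag doubles as an index for the lookup.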

You did not mention your Postgres version, so I assume you are using an up-to-date version (8.4, 9.0)
This should work then:
SELECT *
FROM tag
WHERE tag.id IN ( select unnest(tags)
from article
where title = 'some title' );
But you should really consider changing your table design.
Edit
For 8.3 the unnest() function can easily be added, see this wiki page:
http://wiki.postgresql.org/wiki/Array_Unnest

Related

Select records that match several tags

I implemented a standard tagging system on SQLite with two tables.
Table annotation:
CREATE TABLE IF NOT EXISTS annotation (
id INTEGER PRIMARY KEY,
comment TEXT
)
Table label:
CREATE TABLE IF NOT EXISTS label (
id INTEGER PRIMARY KEY,
annot_id INTEGER NOT NULL REFERENCES annotation(id),
tag TEXT NOT NULL
)
I can easily find the annotations that match tags 'tag1' OR 'tag2' :
SELECT * FROM annotation
JOIN label ON label.annot_id = annotation.id
WHERE label.tag IN ('tag1', 'tag2') GROUP BY annotation.id
But how do I select the annotations that match tags 'tag1' AND 'tag2'?
How do I select the annotations that match tags 'tag1' AND 'tag2' but NOT 'tag3'?
Should I use INTERSECT? Is it efficient or is there a better way to express these?
I would definitely go with INTERSECT for question 1 and EXCEPT for question 2. After many years of experience with SQL I find it best to go with whatever the platform offers in cases where it directly addresses what you want to do.
The only exception would be if you had a really good reason not to. One thing to check is portability: INTERSECT and EXCEPT are part of the SQL standard, but not every database supports them (older MySQL versions do not), so confirm support if you ever move off SQLite.
If you want to go old school and use ONLY straight up SQL it is possible using subqueries, one for tag A, one for tag B, and one for tag C. Using an outer join with an "is null" condition is a common idiom to perform the exclusion.
Here is an sqlite example:
create table annotation (id integer, comment varchar);
create table label (id integer, annot_id integer, tag varchar);
insert into annotation values (1,'annot 1'),(2,'annot 2');
insert into label values (1,1,'tag1'),(2,1,'tag2'),(3,1,'tag2');
insert into label values (1,2,'tag1'),(2,2,'tag2'),(3,2,'tag3');
select distinct x.id,x.comment from annotation x
join label a on a.annot_id=x.id and a.tag='tag1'
join label b on b.annot_id=x.id and b.tag='tag2'
left join label c on c.annot_id=x.id and c.tag='tag3'
where
c.id is null;
This is set up so that both annotation 1 and annotation 2 have tag1 and tag2, but annotation 2 also has tag3 and so should be excluded. The output is only annotation 1:
id | comment
1 | annot 1
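For comparison, the INTERSECT and EXCEPT form recommended at the top of this answer might look like the sketch below. It runs against an in-memory SQLite database through Python's sqlite3 module; the sample rows (including a third annotation that only has tag1) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE annotation (id INTEGER PRIMARY KEY, comment TEXT);
CREATE TABLE label (
    id INTEGER PRIMARY KEY,
    annot_id INTEGER NOT NULL REFERENCES annotation(id),
    tag TEXT NOT NULL
);
INSERT INTO annotation VALUES (1, 'annot 1'), (2, 'annot 2'), (3, 'annot 3');
INSERT INTO label (annot_id, tag) VALUES
    (1, 'tag1'), (1, 'tag2'),
    (2, 'tag1'), (2, 'tag2'), (2, 'tag3'),
    (3, 'tag1');
""")

# Question 1: annotations that carry both tag1 AND tag2.
both = [row[0] for row in conn.execute("""
    SELECT annot_id FROM label WHERE tag = 'tag1'
    INTERSECT
    SELECT annot_id FROM label WHERE tag = 'tag2'
    ORDER BY annot_id
""")]

# Question 2: tag1 AND tag2 but NOT tag3.
# The compound query is evaluated as (tag1 INTERSECT tag2) EXCEPT tag3.
both_not_tag3 = [row[0] for row in conn.execute("""
    SELECT annot_id FROM label WHERE tag = 'tag1'
    INTERSECT
    SELECT annot_id FROM label WHERE tag = 'tag2'
    EXCEPT
    SELECT annot_id FROM label WHERE tag = 'tag3'
    ORDER BY annot_id
""")]

print(both, both_not_tag3)
```

The result is a list of annotation ids; joining it back to annotation (or using it in an IN subquery) would fetch the comments.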

SQLite, Many to many relations, How to aggregate?

I have the classic arrangement for a many-to-many relation in a small flashcard-like application built using SQLite. Every card can have multiple tags, and every tag can have multiple cards. Each of these two entities has its own table, with a third table to link the records.
This is the table for Cards:
CREATE TABLE Cards (CardId INTEGER PRIMARY KEY AUTOINCREMENT,
Text TEXT NOT NULL,
Answer INTEGER NOT NULL,
Success INTEGER NOT NULL,
Fail INTEGER NOT NULL);
This is the table for Tags:
CREATE TABLE Tags (TagId INTEGER PRIMARY KEY AUTOINCREMENT,
Name TEXT UNIQUE NOT NULL);
This is the cross reference table:
CREATE TABLE CardsRelatedToTags (CardId INTEGER,
TagId INTEGER,
PRIMARY KEY (CardId, TagId));
I need to get a table of cards with their associated tags in a column separated by commas.
I can already get what I need for a single row knowing its Id with the following query:
SELECT Cards.CardId, Cards.Text,
(SELECT group_concat(Tags.Name, ', ') FROM Tags
JOIN CardsRelatedToTags ON CardsRelatedToTags.TagId = Tags.TagId
WHERE CardsRelatedToTags.CardId = 1) AS TagsList
FROM Cards
WHERE Cards.CardId = 1
This will result in something like this:
CardId | Text | TagsList
1 | Some specially formatted text | Tag1, Tag2, TagN...
How can I get this type of result (TagsList from group_concat) for every row in Cards using a SQL query? Is it advisable to do so from a performance point of view? Or should I do this sort of "presentation" work in application code, using a simpler request to the DB?
Answering your code question:
SELECT
c.CardId,
c.Text,
GROUP_CONCAT(t.Name,', ') AS TagsList
FROM
Cards c
JOIN CardsRelatedToTags crt ON
c.CardId = crt.CardId
JOIN Tags t ON
crt.TagId = t.TagId
WHERE
c.CardId = 1
GROUP BY c.CardId, c.Text
Now, to the matter of performance. Databases are a powerful tool and do not end at simple SELECT statements. You can definitely do what you need inside a DB (even SQLite). Using a SELECT as the source of a single column inside another SELECT (a correlated subquery) is generally best avoided here: the subquery is evaluated once for each row of the outer query.
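Note that the query above still filters on c.CardId = 1. To get the tag list for every card, as the question asks, one option is to drop the WHERE clause and switch to LEFT JOIN so that untagged cards are kept. The sketch below is an assumption about how that could look, run through Python's sqlite3 module; the Cards table is trimmed to two columns and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Cards (CardId INTEGER PRIMARY KEY AUTOINCREMENT,
                    Text TEXT NOT NULL);
CREATE TABLE Tags (TagId INTEGER PRIMARY KEY AUTOINCREMENT,
                   Name TEXT UNIQUE NOT NULL);
CREATE TABLE CardsRelatedToTags (CardId INTEGER,
                                 TagId INTEGER,
                                 PRIMARY KEY (CardId, TagId));
INSERT INTO Cards (Text) VALUES ('first card'), ('second card'), ('untagged card');
INSERT INTO Tags (Name) VALUES ('Tag1'), ('Tag2');
INSERT INTO CardsRelatedToTags VALUES (1, 1), (1, 2), (2, 2);
""")

# One output row per card; LEFT JOIN keeps cards that have no tags at all.
rows = conn.execute("""
    SELECT c.CardId, c.Text, group_concat(t.Name, ', ') AS TagsList
    FROM Cards c
    LEFT JOIN CardsRelatedToTags crt ON crt.CardId = c.CardId
    LEFT JOIN Tags t ON t.TagId = crt.TagId
    GROUP BY c.CardId, c.Text
    ORDER BY c.CardId
""").fetchall()

for card_id, text, tags_list in rows:
    print(card_id, text, tags_list)
```

SQLite does not guarantee the order in which group_concat joins the names, and a card without tags gets NULL (None in Python) in TagsList.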

What is the equivalent PostgreSQL syntax to Oracle's CONNECT BY ... START WITH?

In Oracle, if I have a table defined as …
CREATE TABLE taxonomy
(
key NUMBER(11) NOT NULL CONSTRAINT taxPkey PRIMARY KEY,
value VARCHAR2(255),
taxHier NUMBER(11)
);
ALTER TABLE
taxonomy
ADD CONSTRAINT
taxTaxFkey
FOREIGN KEY
(taxHier)
REFERENCES
taxonomy(key);
With these values …
key value taxHier
0 zero null
1 one 0
2 two 0
3 three 0
4 four 1
5 five 2
6 six 2
This query syntax …
SELECT
value
FROM
taxonomy
CONNECT BY
PRIOR key = taxHier
START WITH
key = 0;
Will yield …
zero
one
four
two
five
six
three
How is this done in PostgreSQL?
Use a RECURSIVE CTE in Postgres:
WITH RECURSIVE cte AS (
SELECT key, value, 1 AS level
FROM taxonomy
WHERE key = 0
UNION ALL
SELECT t.key, t.value, c.level + 1
FROM cte c
JOIN taxonomy t ON t.taxHier = c.key
)
SELECT value
FROM cte
ORDER BY level;
Details and links to documentation in my previous answer:
Does PostgreSQL have a pseudo-column like "LEVEL" in Oracle?
Or you can install the additional module tablefunc which provides the function connectby() doing almost the same. See Stradas' answer for details.
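One caveat: ORDER BY level returns the rows breadth-first (zero, one, two, three, four, ...), while the output shown in the question is depth-first. A common way to get depth-first order is to accumulate a sortable path in the CTE and order by it. The sketch below is a minimal illustration of that idea, not the answer's own method. It runs on SQLite through Python's sqlite3 module as a stand-in for PostgreSQL; the path column and the zero-padding width are assumptions. In PostgreSQL the path would more typically be an array built with ARRAY[key] and path || t.key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE taxonomy (
    key INTEGER PRIMARY KEY,
    value TEXT,
    taxHier INTEGER REFERENCES taxonomy(key)
);
INSERT INTO taxonomy VALUES
    (0, 'zero', NULL), (1, 'one', 0), (2, 'two', 0), (3, 'three', 0),
    (4, 'four', 1), (5, 'five', 2), (6, 'six', 2);
""")

# Each row carries the zero-padded keys of all its ancestors plus its own key.
# Sorting by that text path visits every parent before its children and keeps
# siblings in key order, which gives a depth-first traversal.
values = [row[0] for row in conn.execute("""
    WITH RECURSIVE cte(key, value, path) AS (
        SELECT key, value, printf('%011d', key)
        FROM taxonomy
        WHERE key = 0
        UNION ALL
        SELECT t.key, t.value, c.path || '/' || printf('%011d', t.key)
        FROM cte c
        JOIN taxonomy t ON t.taxHier = c.key
    )
    SELECT value FROM cte ORDER BY path
""")]

print(values)
```

The zero padding matters because the path is compared as text; without it, key 10 would sort before key 2.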
Postgres does have an equivalent to CONNECT BY. You will need to enable the module; it is turned off by default.
It is called tablefunc. It supports some cool crosstab functionality as well as the familiar "connect by" and "start with". I have found it works much more elegantly and logically than the recursive CTE. If you can't get your DBA to turn it on, you should go for the way Erwin is doing it.
It is robust enough to do the "bill of materials" type query as well.
Tablefunc can be turned on by running this command:
CREATE EXTENSION tablefunc;
Here is the list of connection fields freshly lifted from the official documentation.
Parameter: Description
relname: Name of the source relation (table)
keyid_fld: Name of the key field
parent_keyid_fld: Name of the parent-key field
orderby_fld: Name of the field to order siblings by (optional)
start_with: Key value of the row to start at
max_depth: Maximum depth to descend to, or zero for unlimited depth
branch_delim: String to separate keys with in branch output (optional)
You really should take a look at the docs page. It is well written and it will give you the options you are used to. (On the doc page scroll down, it's near the bottom.)
PostgreSQL "Connect by" extension
Below is the description of what putting that structure together should be like. There is a ton of potential so I won't do it justice, but here is a snip of it to give you an idea.
connectby(text relname, text keyid_fld, text parent_keyid_fld
[, text orderby_fld ], text start_with, int max_depth
[, text branch_delim ])
A real query will look like this. connectby_tree is the name of the table. The line starting with "AS" is how you name the columns. It does look a little upside down.
SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'pos', 'row2', 0, '~')
AS t(keyid text, parent_keyid text, level int, branch text, pos int);
As indicated by Stradas, here is the query for the table in the question:
SELECT t.value
FROM connectby('taxonomy', 'key', 'taxHier', '0', 0, '~')
AS c(keyid numeric, parent_keyid numeric, level int, branch text)
inner join taxonomy t on t.key = c.keyid;
For example, suppose we have a PostgreSQL table named product_types with the columns (id, parent_id, name, sort_order).
The first SELECT (the non-recursive term) picks the root row.
Here the row with id = 76 is the top-level parent.
with recursive product_tree as (
select
pt0.id,
pt0.parent_id,
pt0.name,
pt0.sort_order,
0 AS level
from product_types pt0
where pt0.id = 76
UNION ALL
select
pt1.id,
pt1.parent_id,
pt1.name,
pt1.sort_order, (product_tree.level + 1) as level
from product_types pt1
inner join product_tree on (pt1.parent_id = product_tree.id )
)
select
*
from product_tree
order by level, sort_order

Populate Temp Table Postgres

I have the following three tables in the postgres db of my django app:
publication {
id
title
}
tag {
id
title
}
publication_tags{
id
publication_id
tag_id
}
Where tag and publication have a many to many relationship.
I'd like to make a temp table with three columns: 1)publication title, 2)publication id, and 3)tags, where tags is a list (in the form of a string if possible) of all the tags on a given publication.
Thus far I have made the temp table and populated it with the publication id and publication title, but I don't know how to get the tags into it. This is what I have so far:
CREATE TEMP TABLE pubtags (pub_id INTEGER, pub_title VARCHAR(50), pub_tags VARCHAR(50))
INSERT INTO pubtags(pub_id, pub_title) SELECT id, title FROM apricot_app_publication
Can anyone advise me on how I would go about the last step?
Sounds like a job for string_agg:
string_agg(expression, delimiter)
input values concatenated into a string, separated by delimiter
So something like this should do the trick:
insert into pubtags (pub_id, pub_title, pub_tags)
select p.id, p.title, string_agg(t.title, ', ')
from publication p
join publication_tags pt on (p.id = pt.publication_id)
join tag t on (pt.tag_id = t.id)
group by p.id, p.title
You may want to adjust the delimiter, I guessed that a comma would make sense.
I'd recommend using TEXT instead of VARCHAR for your pub_tags so that you don't have to worry about the string aggregation overflowing the pub_tags length. Actually, I'd recommend using TEXT instead of VARCHAR period: PostgreSQL will treat them both the same except for wasting time on length checks with VARCHAR so VARCHAR is pointless unless you have a specific need for a limited length.
Also, if you don't specifically need pub_tags to be a string, you could use an array instead:
CREATE TEMP TABLE pubtags (
pub_id INTEGER,
pub_title TEXT,
pub_tags TEXT[]
)
and array_agg instead of string_agg:
insert into pubtags (pub_id, pub_title, pub_tags)
select p.id, p.title, array_agg(t.title)
-- as above...
Using an array will make it a lot easier to unpack the tags if you need to.

How can I match a comma separated list against a value?

Hello, I have a table with articles, and the articles have a column category. The categories are stored like 1,4,6,8.
How can I check whether there is an article that has, for example, category 5?
I tried something like
select * from article where category in(5);
But that doesn't work. If I use LIKE, then I will have a problem with, for example, 1 and 10. So how can I do this in one MySQL query?
Storing CSV in a column you need to query is a bad idea - you should use a separate table.
IN is not for CSVs - it's for listing values for a single column
Those arguments aside, you can use FIND_IN_SET()
For example:
SELECT * FROM article WHERE FIND_IN_SET('5', category) != 0;
You could do it with select * from article where category='5' or category like '5,%' or category like '%,5' or category like '%,5,%'
But you really don't want to do that.
Instead, what you're after is
create table article (
id INTEGER AUTO_INCREMENT NOT NULL PRIMARY KEY,
headline VARCHAR(50) NOT NULL,
body VARCHAR(50) NOT NULL
);
create table articlecategory (
article_id INTEGER NOT NULL,
category_id INTEGER NOT NULL,
PRIMARY KEY (article_id, category_id)
);
And then
SELECT
article.*
FROM
article,articlecategory
WHERE
articlecategory.article_id=article.id AND
articlecategory.category_id=5;
Try this:
select *
from article
where category = '5' or category like '%,5,%' or category like '5,%' or category like '%,5'
That said, a column with comma-separated values is not the best solution database-wise. Instead you could have a separate table with two columns, one for the category and one for the article id.
Then you could do:
select article.* from article, category_articles where category_articles.article_id = article.id and category_articles.category_id = 5
Take a look at this for more on database normalization, which this is kinda related to
But if that is not an option, you could use a delimiter and store the data like this: C1C10C. Then you could use:
select * from article where category like '%C1C%'
which would not match 10, but will match 1.
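If the stored data cannot be changed at all, the same delimiter idea can be applied at query time by wrapping the existing comma-separated value in commas before matching, which collapses the several LIKE cases from the answers above into one. This is only a sketch under assumptions: it runs on SQLite through Python's sqlite3 module (where || concatenates strings), the sample rows are invented, and articles_in_category is a hypothetical helper. In MySQL the wrapped value would presumably be built with CONCAT(',', category, ',') instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE article (id INTEGER PRIMARY KEY, category TEXT);
INSERT INTO article VALUES
    (1, '1,4,6,8'), (2, '5'), (3, '10,15'), (4, '2,5,9'), (5, '51,5');
""")


def articles_in_category(category_id):
    """Return ids of articles whose comma-separated category list contains category_id."""
    # ',1,4,6,8,' LIKE '%,4,%' matches, while ',10,15,' LIKE '%,1,%' does not.
    pattern = f"%,{int(category_id)},%"
    rows = conn.execute(
        "SELECT id FROM article WHERE ',' || category || ',' LIKE ? ORDER BY id",
        (pattern,),
    )
    return [row[0] for row in rows]


print(articles_in_category(5))
print(articles_in_category(1))
```

This still cannot use an index on category, so the separate-table design remains the better option when it is available.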