How to make this sql with query array_agg? - sql

I want to make a query
select * from projects where user_id = 3;
and depending on its result r, I need to make n queries, where n is the length l of r, e.g.:
| id | project_name | description | user_id |
| 1 | Project A | lorem ipsu | 3 |
| 4 | Project B | lorem ipsu | 3 |
l => 2 then:
select * from images where project_id = 1;
select * from images where project_id = 4;
OK, you can see where this is going if l is too big: too many selects, too many round trips to the database. Is there a better way to achieve an end result like this:
| id | project_name | description | user_id | images |
| 1 | Project A | lorem ipsu | 3 | {imgX,imgY,imgZ} |
| 4 | Project B | lorem ipsu | 3 | {imgA,imgB} |
I heard about the array_agg function in Postgres. Maybe that's the answer? Anyway, these are my table descriptions:
Table "public.projects"
Column | Type | Modifiers
-------------+--------------------------+-------------------------------------------------------
id | integer | not null default nextval('projects_id_seq'::regclass)
name | character varying(255) |
description | character varying(255) |
user_id | integer |
created_at | timestamp with time zone |
updated_at | timestamp with time zone |
Table "public.images"
Column | Type | Modifiers
------------+--------------------------+-----------------------------------------------------
id | integer | not null default nextval('images_id_seq'::regclass)
name | character varying(255) |
url | character varying(255) |
project_id | integer |
created_at | timestamp with time zone |
updated_at | timestamp with time zone |
And thank you in advance :D

array_agg is like any other aggregate function (count, sum), but returns an array instead of a scalar value. What you need can be achieved simply by joining and grouping the 2 tables.
SELECT p.id, p.name, p.description, p.user_id, array_agg(i.name) AS images
FROM projects p
LEFT JOIN images i ON p.id = i.project_id
WHERE p.user_id = 3 -- same filter as the first query in the question
GROUP BY p.id, p.name, p.description, p.user_id
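To try the join-and-group idea without a Postgres server, here is a minimal sketch using Python's built-in sqlite3 module. It is an illustration under assumptions, not the query above verbatim: SQLite has no array_agg, so group_concat stands in for it and the string is split into a list on the client, the tables are cut down to the columns used, and "Project D" (a project with no images) is invented sample data.

```python
import sqlite3

def projects_with_images(conn, user_id):
    """Return {project_id: sorted image names} for one user, using a single query."""
    rows = conn.execute(
        """
        SELECT p.id, group_concat(i.name, ',')
        FROM projects p
        LEFT JOIN images i ON i.project_id = p.id
        WHERE p.user_id = ?
        GROUP BY p.id
        """,
        (user_id,),
    ).fetchall()
    # group_concat returns NULL (None) when a project has no images at all.
    return {pid: sorted(names.split(",")) if names else [] for pid, names in rows}

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT, user_id INTEGER);
    CREATE TABLE images (id INTEGER PRIMARY KEY, name TEXT, project_id INTEGER);
    INSERT INTO projects VALUES
        (1, 'Project A', 3), (4, 'Project B', 3), (5, 'Project C', 7), (6, 'Project D', 3);
    INSERT INTO images (name, project_id) VALUES
        ('imgX', 1), ('imgY', 1), ('imgZ', 1), ('imgA', 4), ('imgB', 4);
""")
print(projects_with_images(conn, 3))
```

The point is the shape of the query: one round trip returns every project of the user together with its images, instead of one extra SELECT per project.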

You want the rows from projects plus, for each project, the names of its images (matched by project_id) collected into an array:
SELECT *
FROM projects
LEFT JOIN LATERAL (SELECT array_agg(name) AS images FROM images WHERE project_id = projects.id) x ON true
WHERE user_id = 3

Probably the easiest solution for you is a sub-select. This comes closest to the individual SELECT statements that you mention earlier:
SELECT * FROM images
WHERE project_id IN (
SELECT id FROM projects
WHERE user_id = 3);

Related

Response Slow when Order BY is added to my SQL Query

I have the following job_requests table schema as shown here
+----------------+--------------+------+-----+---------+----------------+
| Field          | Type         | Null | Key | Default | Extra          |
+----------------+--------------+------+-----+---------+----------------+
| id             | int(11)      | NO   | PRI | NULL    | auto_increment |
| available_to   | integer[]    | NO   |     |         |                |
| available_type | varchar(255) | NO   |     | NULL    |                |
| start_at       | varchar(255) | NO   |     | NULL    |                |
+----------------+--------------+------+-----+---------+----------------+
I have the following query to return a list of records and order them by the type_of_pool value
WITH matching_jobs AS (
SELECT
job_requests_with_distance.*,
CASE WHEN (users.id = ANY (available_to) AND available_type = 0) THEN 'favourite'
ELSE 'normal'
END AS type_of_pool
FROM (
SELECT
job_requests.*,
users.id AS user_id
FROM
job_requests,
users) AS job_requests_with_distance
LEFT JOIN users ON users.id = user_id
WHERE start_at > NOW() at time zone 'Asia/Kuala_Lumpur'
AND user_id = 491
AND (user_id != ALL(coalesce(unavailable_to, array[]::int[])))
)
SELECT
*
FROM
matching_jobs
WHERE (type_of_pool != 'normal')::BOOLEAN
ORDER BY
array_position (ARRAY['favourite','exclusive','normal']::text[], type_of_pool)
LIMIT 30
If I remove the ORDER BY clause, the query takes about 3 ms, but with the ORDER BY it takes about 1.3 seconds to run.
I'm not sure how to optimize this query to make it faster. I have read about using indexes, but I'm not sure how an index would help in this scenario.
Any help is appreciated.

Oracle SQL query comparing multiple rows with same identifier

I'm honestly not sure how to title this - so apologies if it is unclear.
I have two tables I need to compare. One table contains tree names and nodes that belong to that tree. Each Tree_name/Tree_node combo will have its own line. For example:
Table: treenode
| TREE_NAME | TREE_NODE |
|-----------|-----------|
| 1 | A |
| 1 | B |
| 1 | C |
| 1 | D |
| 1 | E |
| 2 | A |
| 2 | B |
| 2 | D |
| 3 | C |
| 3 | D |
| 3 | E |
| 3 | F |
I have another table that contains names of queries and what tree_nodes they use. Example:
Table: queryrecord
| QUERY | TREE_NODE |
|---------|-----------|
| Alpha | A |
| Alpha | B |
| Alpha | D |
| BRAVO | A |
| BRAVO | B |
| BRAVO | D |
| CHARLIE | A |
| CHARLIE | B |
| CHARLIE | F |
I need to write a SQL query where I input the QUERY name, and it returns any ‘TREE_NAME’ that includes all the nodes associated with the query. So if I input ‘ALPHA’, it would return TREE_NAME 1 & 2. If I ask it for CHARLIE, it would return nothing.
I only have read access, and don’t believe I can create temp tables, so I’m not sure if this is possible. Any advice would be amazing. Thank you!
You can use group by and having as follows:
Select t.tree_name
From treenode t
join queryrecord q
on t.tree_node = q.tree_node
WHERE q.query = 'ALPHA'
Group by t.tree_name
Having count(distinct t.tree_node)
= (Select count(distinct q.tree_node) From queryrecord q WHERE q.query = 'ALPHA');
Using an IN condition (a semi-join, which saves time over a join):
with prep (tree_node) as (select tree_node from queryrecord where query = :q)
select tree_name
from treenode
where tree_node in (select tree_node from prep)
group by tree_name
having count(*) = (select count(*) from prep)
;
:q in the prep subquery (in the with clause) is the bind variable to which you will assign the various QUERY values at runtime.
EDIT
I don't generally set up the test case on online engines; but in a comment below this answer, the OP said the query didn't work for him. So, I set up the example on SQLFiddle, here:
http://sqlfiddle.com/#!4/b575e/2
A couple of notes: for some reason, SQLFiddle thinks table names should be at most eight characters, so I had to change the second table name to queryrec (instead of queryrecord). I changed the name in the query, too, of course. And, second, I don't know how I can give bind values on SQLFiddle; I hard-coded the name 'Alpha'. (Note also that in the OP's sample data, this query value is not capitalized, while the other two are; of course, text values in SQL are case sensitive, so one should pay attention when testing.)
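For anyone who wants to reproduce the check locally, the sketch below loads the sample data from the question into an in-memory SQLite database through Python's sqlite3 module and runs the same IN-plus-HAVING pattern. It is an approximation under assumptions: SQLite stands in for Oracle, the bind variable :q becomes a ? placeholder, the function name trees_covering is invented, and DISTINCT is added on both sides so that duplicate rows would not skew the counts.

```python
import sqlite3

def trees_covering(conn, query_name):
    """Return the tree names that contain every node used by the given query."""
    rows = conn.execute(
        """
        WITH prep (tree_node) AS (
            SELECT DISTINCT tree_node FROM queryrecord WHERE query = ?
        )
        SELECT tree_name
        FROM treenode
        WHERE tree_node IN (SELECT tree_node FROM prep)
        GROUP BY tree_name
        HAVING count(DISTINCT tree_node) = (SELECT count(*) FROM prep)
        ORDER BY tree_name
        """,
        (query_name,),
    ).fetchall()
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE treenode (tree_name INTEGER, tree_node TEXT);
    CREATE TABLE queryrecord (query TEXT, tree_node TEXT);
    INSERT INTO treenode VALUES
        (1,'A'),(1,'B'),(1,'C'),(1,'D'),(1,'E'),
        (2,'A'),(2,'B'),(2,'D'),
        (3,'C'),(3,'D'),(3,'E'),(3,'F');
    INSERT INTO queryrecord VALUES
        ('Alpha','A'),('Alpha','B'),('Alpha','D'),
        ('BRAVO','A'),('BRAVO','B'),('BRAVO','D'),
        ('CHARLIE','A'),('CHARLIE','B'),('CHARLIE','F');
""")
print(trees_covering(conn, "Alpha"))
print(trees_covering(conn, "CHARLIE"))
```

Because the comparison is case sensitive, passing 'ALPHA' instead of 'Alpha' matches no rows in queryrecord and the function returns an empty list.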
You can do this with a join and aggregation. The trick is to count the number of nodes in queryrecord before joining:
select qr.query, t.tree_name
from (select qr.*,
count(*) over (partition by query) as num_tree_node
from queryrecord qr
) qr join
treenode t
on t.tree_node = qr.tree_node
where qr.query = 'ALPHA'
group by qr.query, t.tree_name, qr.num_tree_node
having count(*) = qr.num_tree_node;

Postgresql - How to remove last one from array_agg() in one select query?

I have a special need with the table below:
Table "public.skill_name"
Column | Type | Collation | Nullable | Default
----------+---------+-----------+----------+---------
position | integer | | not null |
value | text | | not null |
id | text | | not null |
skill | text | | |
Indexes:
"skill_name.id" UNIQUE, btree (id)
Foreign-key constraints:
"skill_name_skill_fkey" FOREIGN KEY (skill) REFERENCES skill(id) ON DELETE SET NULL
and some sample data like below:
position | value | id | skill
----------+---------------------------------------------------------------------------------------+-----------------------------+-----------------------------
1000 | Python | ck5bxmk67101790acuf05cikujw | ck5bxmk62101789acuf7pj1qmj6
2000 | Python Language | ck5bxmk69101791acufih7mc6u6 | ck5bxmk62101789acuf7pj1qmj6
3000 | Stdlib | ck5bxmk6c101792acuflzcu2avg | ck5bxmk62101789acuf7pj1qmj6
4000 | functools | ck5bxmk6e101793acuf42ih0evn | ck5bxmk62101789acuf7pj1qmj6
5000 | lru_cache | ck5bxmk6g101794acuf690rjgzp | ck5bxmk62101789acuf7pj1qmj6
1000 | Python | ck5bxysvp102005acuf6unt4cb7 | ck2wk5gba044342xbyaulv17i
2000 | Python Language | ck5bxysvs102012acuf5862l0gx | ck2wk5gba044342xbyaulv17i
3000 | Python Syntax | ck5bxysvu102021acufjcmxi1ij | ck2wk5gba044342xbyaulv17i
4000 | Classes | ck5bxysvx102030acufbaz3kml3 | ck2wk5gba044342xbyaulv17i
5000 | metaclasses | ck5bxysvz102037acufa5lmbuhj | ck2wk5gba044342xbyaulv17i
The requirement is to generate a result like the one below (NOTE: the last value in each skill group is excluded from the path column):
skill | path
-----------------------------+---------------------------------------------------------------------------------------
ck5bxmk62101789acuf7pj1qmj6 | Python,Python Language,Stdlib,functools
ck2wk5gba044342xbyaulv17i | Python,Python Language,Python Syntax,Classes
I have the SQL below, but it does not work; it complains: more than one row returned by a subquery used as an expression
SELECT
skill,
ARRAY_REMOVE(
ARRAY_AGG(value),
(
SELECT
skill_name.value
FROM (
SELECT
*,
skill AS skill_id,
LAST_VALUE(position) OVER (
PARTITION BY skill
ORDER BY position
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS last_pos
FROM skill_name
) skill_name
WHERE last_pos=position
GROUP BY skill_name.value
)
) as path
FROM skill_name
GROUP BY skill;
I do not know how to fix that. Could anyone help?
You could use row_number() to locate and eliminate the last value from the resultset before aggregating:
select skill, array_agg(value order by position) path
from (
select t.*, row_number() over(partition by skill order by position desc) rn
from skill_name t
) t
where rn > 1
group by skill
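The row_number() idea can be tried out with Python's built-in sqlite3 module; the following is a rough sketch under assumptions rather than the Postgres query itself. SQLite needs version 3.25 or later for window functions, the skill ids are shortened to the made-up values 'skill_a' and 'skill_b', and the final array_agg step is done on the client by appending the ordered rows to per-skill lists.

```python
import sqlite3
from collections import defaultdict

def paths_without_last(conn):
    """Return {skill: ordered values} with the highest-position value of each skill dropped."""
    rows = conn.execute(
        """
        SELECT skill, value
        FROM (
            SELECT skill, value, position,
                   row_number() OVER (PARTITION BY skill ORDER BY position DESC) AS rn
            FROM skill_name
        )
        WHERE rn > 1            -- rn = 1 is the last entry of each skill
        ORDER BY skill, position
        """
    ).fetchall()
    paths = defaultdict(list)
    for skill, value in rows:
        paths[skill].append(value)
    return dict(paths)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE skill_name (position INTEGER, value TEXT, skill TEXT);
    INSERT INTO skill_name VALUES
        (1000, 'Python', 'skill_a'), (2000, 'Python Language', 'skill_a'),
        (3000, 'Stdlib', 'skill_a'), (4000, 'functools', 'skill_a'),
        (5000, 'lru_cache', 'skill_a'),
        (1000, 'Python', 'skill_b'), (2000, 'Python Language', 'skill_b'),
        (3000, 'Python Syntax', 'skill_b'), (4000, 'Classes', 'skill_b'),
        (5000, 'metaclasses', 'skill_b');
""")
print(paths_without_last(conn))
```

Numbering the rows in descending position order is what makes "the last one" addressable as rn = 1, so a plain WHERE clause can remove it before anything is aggregated.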
Postgres has pretty sophisticated array functions. You don't need a subquery to do this:
select skill,
(array_agg(value order by position))[1:count(*) - 1] as path
from skill_name
group by skill

SQL - get from table and join with same table

I have two tables, ChatRoom and ChatRoomMap. I want to get a list of the chatrooms a user belongs to, along with all the other users in each chatroom.
-- this contains a map of user to chatroom, listing which user is in what room
CREATE TABLE ChatRoomMap
(
user_id bigint NOT NULL,
chatroom_id text NOT NULL,
CONSTRAINT uniq UNIQUE (user_id, chatroom_id)
)
// sample values
==========================
| user_id | chatroom_id |
| 1 | 7 |
| 1 | blue |
| 7 | red |
==========================
And
CREATE TABLE ChatRoom
(
id text NOT NULL,
admin bigint,
name text,
created timestamp without time zone NOT NULL DEFAULT now(),
CONSTRAINT uniqid UNIQUE (id)
)
// sample values
======================================================
| id | admin | name | timestamp |
| blue | 7 | blue room | now() |
| red | 2 | red | now() |
| 7 | 11 | mine | now() |
======================================================
To get a list of rooms a user is in, I can do:
SELECT DISTINCT ON (id) id, user_id, name, admin
FROM ChatRoomMap, ChatRoom WHERE ChatRoomMap.user_id = $1 AND ChatRoomMap.chatroom_id = ChatRoom.id
This will get me a distinct list of chat rooms a user is in.
I would like to get the distinct list of rooms along with all the users in each room (concatenation of all as a separate column), how can this be done?
Example result:
=======================================================
| user_id | chatroom_id | name | admin | other_users |
| 10 | 7 | One | 1 | 1, 2, 3, 8 |
| 10 | 4 | AAA | 10 | 7, 11, 15 |
=======================================================
First up, use proper joins: the explicit join syntax was introduced in the SQL-92 standard, the major vendors had implemented it by the early 2000s, and it is the only standard way to write an outer join.
Try this:
SELECT DISTINCT id, crm2.user_id, name, admin
FROM ChatRoomMap crm1
JOIN ChatRoom ON crm1.chatroom_id = ChatRoom.id
LEFT JOIN ChatRoomMap crm2 ON crm2.chatroom_id = crm1.chatroom_id
AND crm2.user_id != crm1.user_id -- only other users
WHERE crm1.user_id = $1
The LEFT JOIN is needed so that a room with no other users is still listed (with a null for the other user's id).
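The query above returns one row per other user. To get the single concatenated other_users column the question asks for, the same self-join can be grouped per room. Here is a minimal sketch using Python's sqlite3 module, with several assumptions: SQLite's group_concat stands in for Postgres's string_agg, the created column is left out, and the ChatRoomMap rows for users 2 and 7 in the blue room are invented so that there is something to concatenate.

```python
import sqlite3

def rooms_with_other_users(conn, user_id):
    """Return (room id, name, admin, sorted other user ids) for each room the user is in."""
    rows = conn.execute(
        """
        SELECT r.id, r.name, r.admin, group_concat(crm2.user_id, ',')
        FROM ChatRoomMap crm1
        JOIN ChatRoom r ON r.id = crm1.chatroom_id
        LEFT JOIN ChatRoomMap crm2
               ON crm2.chatroom_id = crm1.chatroom_id
              AND crm2.user_id <> crm1.user_id   -- only other users
        WHERE crm1.user_id = ?
        GROUP BY r.id, r.name, r.admin
        ORDER BY r.id
        """,
        (user_id,),
    ).fetchall()
    # group_concat is NULL (None) when nobody else is in the room.
    return [
        (rid, name, admin, sorted(int(u) for u in others.split(",")) if others else [])
        for rid, name, admin, others in rows
    ]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ChatRoomMap (user_id INTEGER NOT NULL, chatroom_id TEXT NOT NULL,
                              UNIQUE (user_id, chatroom_id));
    CREATE TABLE ChatRoom (id TEXT NOT NULL UNIQUE, admin INTEGER, name TEXT);
    INSERT INTO ChatRoomMap VALUES (1, '7'), (1, 'blue'), (7, 'red'), (2, 'blue'), (7, 'blue');
    INSERT INTO ChatRoom VALUES ('blue', 7, 'blue room'), ('red', 2, 'red'), ('7', 11, 'mine');
""")
print(rooms_with_other_users(conn, 1))
```

Grouping by the room columns collapses the per-user rows, and the LEFT JOIN keeps rooms in which the user is alone.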

obtaining unique/distinct values from multiple unassociated columns

I have a table in a postgresql-9.1.x database which is defined as follows:
# \d cms
Table "public.cms"
Column | Type | Modifiers
-------------+-----------------------------+--------------------------------------------------
id | integer | not null default nextval('cms_id_seq'::regclass)
last_update | timestamp without time zone | not null default now()
system | text | not null
owner | text | not null
action | text | not null
notes | text
Here's a sample of the data in the table:
id  | last_update                | system          | owner     | action                              | notes
----+----------------------------+-----------------+-----------+-------------------------------------+-----------------
584 | 2012-05-04 14:20:53.487282 | linux32-test5   | rfell     | replaced MoBo/CPU                   |
34  | 2011-03-21 17:37:44.301984 | linux-gputest13 | apeyrovan | System deployed with production GPU |
636 | 2012-05-23 12:51:39.313209 | mac64-cvs11     | kbhatt    | replaced HD                         |
211 | 2011-09-12 16:58:16.166901 | linux64-test12  | rfell     | HD swap                             | drive too small
What I'd like to do is craft a SQL query that returns only the unique/distinct values from the system and owner columns (and filling in NULLs if the number of values in one column's results is less than the other column's results), while ignoring the association between them. So something like this:
system | owner
-----------------+------------------
linux32-test5 | apeyrovan
linux-gputest13 | kbhatt
linux64-test12 | rfell
mac64-cvs11 |
The only way that I can figure out to get this data is with two separate SQL queries:
SELECT system FROM cms GROUP BY system;
SELECT owner FROM cms GROUP BY owner;
Far be it from me to inquire why you would want to do such a thing. The following query does this by doing a join, on a calculated column using the row_number() function:
select ts.system, town.owner
from (select system, row_number() over (order by system) as seqnum
from (select distinct system
from cms
) ts
) ts full outer join
(select owner, row_number() over (order by owner) as seqnum
from (select distinct owner
from cms
) town
) town
on ts.seqnum = town.seqnum
The full outer join makes sure that the longer of the two lists is returned in full.
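To see what the pairing produces, here is a small client-side sketch using Python's sqlite3 module. It is an equivalent illustration and not the SQL above: the two distinct lists are fetched with separate queries, and itertools.zip_longest plays the role of the full outer join on the row number, padding the shorter list with None. The cms table is reduced to the two relevant columns, and the ordering is SQLite's default byte-wise one, which may differ from a Postgres collation.

```python
import sqlite3
from itertools import zip_longest

def distinct_side_by_side(conn):
    """Pair the i-th distinct system with the i-th distinct owner, padding with None."""
    systems = [r[0] for r in conn.execute("SELECT DISTINCT system FROM cms ORDER BY system")]
    owners = [r[0] for r in conn.execute("SELECT DISTINCT owner FROM cms ORDER BY owner")]
    # zip_longest lines the two lists up by position, like joining on seqnum.
    return list(zip_longest(systems, owners))

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cms (id INTEGER PRIMARY KEY, system TEXT NOT NULL, owner TEXT NOT NULL);
    INSERT INTO cms VALUES
        (584, 'linux32-test5', 'rfell'),
        (34, 'linux-gputest13', 'apeyrovan'),
        (636, 'mac64-cvs11', 'kbhatt'),
        (211, 'linux64-test12', 'rfell');
""")
for system, owner in distinct_side_by_side(conn):
    print(system, owner)
```

With the sample rows there are four distinct systems but only three distinct owners, so the last pair has None in the owner position, matching the NULL padding requested in the question.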