Is it possible to reuse the result from a subquery in a subsequent subquery when creating a view in postgresql?
For example I have the following two tables:
CREATE TABLE application
(
id INT PRIMARY KEY,
name CHARACTER VARYING(255)
);
CREATE TABLE application_user
(
id INT PRIMARY KEY,
application_id INT REFERENCES application (id) ON DELETE CASCADE,
active BOOLEAN
);
-- some sample data
INSERT INTO application (id, name) VALUES
(10, 'application1'),
(20, 'application2'),
(30, 'application3');
INSERT INTO application_user (id, application_id, active) VALUES
(1, 10, true),
(2, 10, false),
(3, 20, false),
(4, 20, false),
(5, 20, false);
The view that I need looks (right now) as follows:
CREATE VIEW application_stats AS
SELECT a.name,
(SELECT COUNT(1) FROM application_user u
WHERE a.id = u.application_id) AS users,
(SELECT COUNT(1) FROM application_user u
WHERE a.id = u.application_id AND u.active = true) AS active_users
FROM application a;
This does give me the correct result:
name users active_users
application1 2 1
application2 3 0
application3 0 0
However, it is also pretty inefficient, since I run almost the same query twice; ideally I would like to reuse the result from the first query. Is there an efficient way to do this?
This would normally be expressed as a join/group by:
SELECT a.name, COUNT(au.application_id) as users,
SUM( (au.active = true)::int) as active_users
FROM application a LEFT JOIN
application_user au
ON a.id = au.application_id
GROUP BY a.name;
I'm rather surprised that application doesn't have a serial primary key. But if you can report application_id instead of name, perhaps the join is not needed at all:
SELECT au.application_id, COUNT(*) as users,
SUM( (au.active = true)::int) as active_users
FROM application_user au
GROUP BY au.application_id;
This returns only applications that have at least one user.
You should join the two tables, group by application_id and use count with a FILTER (WHERE ...) clause to count only the rows you want:
CREATE VIEW application_stats AS
SELECT a.name,
count(u.id) AS users,
count(*) FILTER (WHERE u.active) AS active_users
FROM application a
LEFT JOIN application_user u ON a.id = u.application_id
GROUP BY a.id;
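As a rough check of the join-and-aggregate idea, the sketch below runs it against the sample data from the question. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made only so the example is self-contained), and it uses SUM(CASE ...) instead of FILTER so that it does not depend on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE application (id INT PRIMARY KEY, name VARCHAR(255));
    CREATE TABLE application_user (
        id INT PRIMARY KEY,
        application_id INT REFERENCES application (id) ON DELETE CASCADE,
        active BOOLEAN
    );
    INSERT INTO application (id, name) VALUES
        (10, 'application1'), (20, 'application2'), (30, 'application3');
    INSERT INTO application_user (id, application_id, active) VALUES
        (1, 10, 1), (2, 10, 0), (3, 20, 0), (4, 20, 0), (5, 20, 0);
""")

# COUNT(u.id) ignores the all-NULL row that the LEFT JOIN produces for an
# application without users, so such an application is reported with 0 users.
stats = conn.execute("""
    SELECT a.name,
           COUNT(u.id) AS users,
           SUM(CASE WHEN u.active = 1 THEN 1 ELSE 0 END) AS active_users
    FROM application a
    LEFT JOIN application_user u ON a.id = u.application_id
    GROUP BY a.id, a.name
    ORDER BY a.id
""").fetchall()

for row in stats:
    print(row)
```

Both counts come from a single pass over the joined rows, which is the point of replacing the two correlated subqueries.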
Let's say I have three sample tables for groups of people as shown below.
Table users:
id name available
1 John true
2 Nick true
3 Sam false
Table groups:
id name
1 study
2 games
Table group_users:
group_id user_id role
1 1 teach
1 2 stdnt
1 3 stdnt
2 1 tank
2 2 heal
I need to show a user all groups that he participates in and that are available right now, which means that every user in the group has users.available = true.
I tried something like:
SELECT `groups`.*, `users`.* , `group_users`.*
FROM `groups`
LEFT JOIN `group_users` ON `groups`.`id` = `group_users`.`group_id`
LEFT JOIN `users` ON `users`.`id` = `group_users`.`user_id`
WHERE `users`.`available` = true AND `users`.`id` = 1
But it just shows the groups together with whichever of their users are available. I need ONLY the groups in which all users are available.
If I were to find all available groups as User 1, I should get only group 2 and its users. What is the right way to do this?
Tables DDL:
CREATE TABLE users (
id int PRIMARY KEY,
name varchar(256) NOT NULL,
available bool
);
CREATE TABLE teams (
id int PRIMARY KEY,
name varchar(256) NOT NULL
);
CREATE TABLE team_users (
team_id int NOT NULL,
user_id int NOT NULL,
role varchar(64)
);
INSERT INTO users VALUES
(1, 'John', true ),
(2, 'Nick', true ),
(3, 'Sam' , false);
INSERT INTO teams VALUES
(1, 'study'),
(2, 'games');
INSERT INTO team_users VALUES
(1, 1, 'teach'),
(1, 2, 'stdnt'),
(1, 3, 'stdnt'),
(2, 1, 'tank' ),
(2, 2, 'heal' );
mySQL select version() output:
10.8.3-MariaDB-1:10.8.3+maria~jammy
Check whether this is what you need:
WITH cte AS (
SELECT users.name username,
teams.id teamid,
teams.name teamname,
SUM(NOT users.available) OVER (PARTITION BY teams.id) non_availabe_present,
SUM(users.name = @user_name) OVER (PARTITION BY teams.id) needed_user_present
FROM team_users
JOIN users ON team_users.user_id = users.id
JOIN teams ON team_users.team_id = teams.id
)
SELECT username, teamid, teamname
FROM cte
WHERE needed_user_present
AND NOT non_availabe_present;
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=605cf10d147fd904fb2d4a6cd5968302
PS. I use user name as a criteria, you may edit and use user's identifier, of course.
Join the tables and aggregate with the conditions in the HAVING clause:
SELECT t.id, t.name
FROM teams t
INNER JOIN team_users tu ON t.id = tu.team_id
INNER JOIN users u ON u.id = tu.user_id
GROUP BY t.id
HAVING MIN(u.available) AND SUM(u.id = 1);
The HAVING clause is a simplification of:
HAVING MIN(u.available) = true AND SUM(u.id = 1) > 0
See the demo.
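A minimal sketch of the HAVING approach on the sample data, run through Python's sqlite3 module rather than MariaDB (an assumption for the sake of a self-contained example). The helper name available_teams is made up for illustration, and the comparisons are written out explicitly (= 1, > 0) instead of relying on MySQL's implicit boolean handling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INT PRIMARY KEY, name VARCHAR(256) NOT NULL, available BOOLEAN);
    CREATE TABLE teams (id INT PRIMARY KEY, name VARCHAR(256) NOT NULL);
    CREATE TABLE team_users (team_id INT NOT NULL, user_id INT NOT NULL, role VARCHAR(64));
    INSERT INTO users VALUES (1, 'John', 1), (2, 'Nick', 1), (3, 'Sam', 0);
    INSERT INTO teams VALUES (1, 'study'), (2, 'games');
    INSERT INTO team_users VALUES
        (1, 1, 'teach'), (1, 2, 'stdnt'), (1, 3, 'stdnt'),
        (2, 1, 'tank'), (2, 2, 'heal');
""")

def available_teams(conn, user_id):
    """Teams that contain user_id and whose members are all available."""
    return conn.execute("""
        SELECT t.id, t.name
        FROM teams t
        INNER JOIN team_users tu ON t.id = tu.team_id
        INNER JOIN users u ON u.id = tu.user_id
        GROUP BY t.id, t.name
        -- MIN(available) = 1 holds only if no member has available = 0;
        -- SUM(u.id = ?) > 0 holds only if the given user is a member.
        HAVING MIN(u.available) = 1 AND SUM(u.id = ?) > 0
        ORDER BY t.id
    """, (user_id,)).fetchall()

print(available_teams(conn, 1))
```

For user 1 this leaves only the games team, because the study team contains Sam, who is not available.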
First find the teams that have at least one unavailable user, then return the details of every team that is not among them:
SELECT * FROM team_users a
JOIN teams b ON a.team_id=b.id
JOIN users c ON a.user_id=c.id
WHERE NOT EXISTS
(
SELECT 1 FROM team_users tu
JOIN users u ON tu.user_id=u.id AND u.available = 0
WHERE tu.team_id=a.team_id
)
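The NOT EXISTS idea can be sketched as follows: a team qualifies when the given user is a member and no member with available = 0 exists. This is an illustrative variant, not the answerer's exact query; it adds the membership filter for one user, uses Python's sqlite3 module instead of MariaDB, and the helper name teams_without_unavailable_members is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INT PRIMARY KEY, name VARCHAR(256) NOT NULL, available BOOLEAN);
    CREATE TABLE teams (id INT PRIMARY KEY, name VARCHAR(256) NOT NULL);
    CREATE TABLE team_users (team_id INT NOT NULL, user_id INT NOT NULL, role VARCHAR(64));
    INSERT INTO users VALUES (1, 'John', 1), (2, 'Nick', 1), (3, 'Sam', 0);
    INSERT INTO teams VALUES (1, 'study'), (2, 'games');
    INSERT INTO team_users VALUES
        (1, 1, 'teach'), (1, 2, 'stdnt'), (1, 3, 'stdnt'),
        (2, 1, 'tank'), (2, 2, 'heal');
""")

def teams_without_unavailable_members(conn, user_id):
    return conn.execute("""
        SELECT t.id, t.name
        FROM teams t
        JOIN team_users me ON me.team_id = t.id AND me.user_id = ?
        WHERE NOT EXISTS (
            -- any unavailable member disqualifies the whole team
            SELECT 1
            FROM team_users tu
            JOIN users u ON u.id = tu.user_id
            WHERE tu.team_id = t.id AND u.available = 0
        )
        ORDER BY t.id
    """, (user_id,)).fetchall()

print(teams_without_unavailable_members(conn, 1))
```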
I have these tables which I would like to query:
create table employees
(
id bigint generated by default as identity (maxvalue 2147483647),
username varchar(100) not null,
password varchar(60) not null,
account_id bigint,
role_id bigint,
first_name varchar(150),
last_name varchar(150),
primary key (id)
);
create table accounts
(
id bigint generated by default as identity,
account_name varchar(150) not null,
account_group_id bigint not null,
primary key (id)
);
Test data:
insert into employees (id, username, password, account_id) values
(1, 'test user', 'pass', 3),
(2, 'test user2 ', 'pass', 4);
insert into accounts (id, account_name, account_group_id) values
(1, 'main', 3),
(2, 'second', 4),
(3, 'third', 4);
I need to create a query that searches the employees table by account_name. For example, when I send the search param second, I need to get the row for test user2.
I tried this:
SELECT * FROM common.employees e
WHERE e.??????? iLIKE CONCAT('%', :params, '%')
Do you know how I can join the tables?
You cannot directly parameterize SQL identifiers (columns, tables); you can only parameterize values.
Prepared statements can take parameters: values that are substituted into the statement when it is executed.
https://www.postgresql.org/docs/current/sql-prepare.html
In your code, WHERE e.??????? cannot be easily parameterized; for that you would need plpgsql functions. The search value itself can be a parameter:
prepare test(text,int) as SELECT e.* FROM employees e
join accounts a on e.account_id = a.id
WHERE a.account_name iLIKE CONCAT('%', $1, '%')
and a.account_group_id = $2;
If you already have a prepared statement named test in the active session, run DEALLOCATE test; first.
Suppose the account_group_id is 1; then:
execute test('third', 1);
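To illustrate the distinction between identifiers and values, here is a small sketch in which only the search term is bound as a parameter while the table and column names stay fixed in the SQL text. It uses Python's sqlite3 module as a stand-in for PostgreSQL; SQLite's LIKE is case-insensitive for ASCII, which roughly mirrors ILIKE. The helper find_employees is hypothetical, and the sample rows are adjusted so that each employee's account_id points at an existing account, which the data in the question does not do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (
        id INTEGER PRIMARY KEY,
        account_name TEXT NOT NULL,
        account_group_id INTEGER NOT NULL
    );
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY,
        username TEXT NOT NULL,
        password TEXT NOT NULL,
        account_id INTEGER
    );
    INSERT INTO accounts VALUES (1, 'main', 3), (2, 'second', 4), (3, 'third', 4);
    -- assumption: account_id values changed to 1 and 2 so the join finds matches
    INSERT INTO employees (id, username, password, account_id) VALUES
        (1, 'test user', 'pass', 1), (2, 'test user2', 'pass', 2);
""")

def find_employees(conn, account_name_part):
    """Usernames of employees whose account name contains the search term."""
    rows = conn.execute("""
        SELECT e.username
        FROM employees e
        JOIN accounts a ON a.id = e.account_id
        WHERE a.account_name LIKE '%' || ? || '%'
        ORDER BY e.id
    """, (account_name_part,)).fetchall()
    return [r[0] for r in rows]

print(find_employees(conn, "second"))
```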
Join the two tables like this:
SELECT e.* FROM
employees e, accounts a
WHERE
e.account_id = a.id
and a.account_name = 'second'
To include the account_group_id column in the result, select it from accounts as shown below.
e.* already contains every column of the employees table, including account_id, so you can customize the result set according to your needs:
SELECT e.*,a.account_group_id
FROM employees e
INNER JOIN accounts a ON a.id = e.account_id
WHERE a.account_name = param
Use an inner join to the accounts table on account_id, and add a WHERE clause that keeps only the employees whose account_name equals your param (which I'm guessing will be a varchar):
SELECT e.*, a.account_group_id
FROM employees e
INNER JOIN accounts a ON a.id = e.account_id
WHERE a.account_name = param
or
WHERE a.account_name LIKE '%param%'
but the second may bring back other users, as the param could also occur inside other account names.
Also, I don't believe the data in your example is correct: surely account_id links to the id in the accounts table, so passing second would in fact get you an employee whose account_id is 2.
Suppose I have the following table
DROP TABLE IF EXISTS #toy_example
CREATE TABLE #toy_example
(
Id int,
Pet varchar(10)
);
INSERT INTO #toy_example
VALUES (1, 'dog'),
(1, 'cat'),
(1, 'emu'),
(2, 'cat'),
(2, 'turtle'),
(2, 'lizard'),
(3, 'dog'),
(4, 'elephant'),
(5, 'cat'),
(5, 'emu')
and I want to fetch all Ids that have certain pets (for example either cat or emu, so Ids 1, 2 and 5).
DROP TABLE IF EXISTS #Pets
CREATE TABLE #Pets
(
Animal varchar(10)
);
INSERT INTO #Pets
VALUES ('cat'),
('emu')
SELECT Id
FROM #toy_example
GROUP BY Id
HAVING COUNT(
CASE
WHEN Pet IN (SELECT Animal FROM #Pets)
THEN 1
END
) > 0
The above gives me the error Cannot perform an aggregate function on an expression containing an aggregate or a subquery. I have two questions:
1. Why is this an error? If I instead hard code the subquery in the HAVING clause, i.e. WHEN Pet IN ('cat','emu') then this works. Is there a reason why SQL Server (I've checked with SQL Server 2017 and 2008) does not allow this?
2. What would be a nice way to do this? Note that the above is just a toy example. The real problem has many possible "Pets", which I do not want to hard code. It would be nice if the suggested method could check for multiple other similar conditions too in a single query.
If I followed you correctly, you can just join and aggregate:
select t.id, count(*) nb_of_matches
from #toy_example t
inner join #pets p on p.animal = t.pet
group by t.id
The inner join eliminates records from #toy_example that have no match in #pets. Then, we aggregate by id and count how many records remain in each group.
If you want to retain records that have no match in #pets and display them with a count of 0, then you can left join instead:
select t.id, count(*) nb_of_records, count(p.animal) nb_of_matches
from #toy_example t
left join #pets p on p.animal = t.pet
group by t.id
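The two variants can be compared on the toy data with the sketch below. It runs in Python's sqlite3 module instead of SQL Server, so the temp tables are ordinary tables without the # prefix (an assumption made only to keep the example runnable).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE toy_example (id INT, pet VARCHAR(10));
    CREATE TABLE pets (animal VARCHAR(10));
    INSERT INTO toy_example VALUES
        (1, 'dog'), (1, 'cat'), (1, 'emu'),
        (2, 'cat'), (2, 'turtle'), (2, 'lizard'),
        (3, 'dog'), (4, 'elephant'),
        (5, 'cat'), (5, 'emu');
    INSERT INTO pets VALUES ('cat'), ('emu');
""")

# Inner join: only ids with at least one matching pet survive.
inner_rows = conn.execute("""
    SELECT t.id, COUNT(*) AS nb_of_matches
    FROM toy_example t
    INNER JOIN pets p ON p.animal = t.pet
    GROUP BY t.id
    ORDER BY t.id
""").fetchall()

# Left join: every id is kept; COUNT(p.animal) counts only the matched rows.
left_rows = conn.execute("""
    SELECT t.id, COUNT(*) AS nb_of_records, COUNT(p.animal) AS nb_of_matches
    FROM toy_example t
    LEFT JOIN pets p ON p.animal = t.pet
    GROUP BY t.id
    ORDER BY t.id
""").fetchall()

print(inner_rows)
print(left_rows)
```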
How about this approach? Note that it returns the Ids that have all of the pets in #Pets, not just any one of them.
SELECT e.Id
FROM #toy_example e JOIN
#pets p
ON e.pet = p.animal
GROUP BY e.Id
HAVING COUNT(DISTINCT e.pet) = (SELECT COUNT(*) FROM #pets);
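To make the difference between "has any of the pets" and "has all of the pets" concrete, here is a small sketch on the toy data. It uses Python's sqlite3 module in place of SQL Server, with plain table names instead of # temp tables (both are assumptions for the sake of a runnable example).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE toy_example (id INT, pet VARCHAR(10));
    CREATE TABLE pets (animal VARCHAR(10));
    INSERT INTO toy_example VALUES
        (1, 'dog'), (1, 'cat'), (1, 'emu'),
        (2, 'cat'), (2, 'turtle'), (2, 'lizard'),
        (3, 'dog'), (4, 'elephant'),
        (5, 'cat'), (5, 'emu');
    INSERT INTO pets VALUES ('cat'), ('emu');
""")

# Ids that own at least one of the listed pets.
any_ids = [r[0] for r in conn.execute("""
    SELECT DISTINCT e.id
    FROM toy_example e
    JOIN pets p ON e.pet = p.animal
    ORDER BY e.id
""")]

# Ids that own every listed pet: the number of distinct matched pets
# must equal the number of rows in pets.
all_ids = [r[0] for r in conn.execute("""
    SELECT e.id
    FROM toy_example e
    JOIN pets p ON e.pet = p.animal
    GROUP BY e.id
    HAVING COUNT(DISTINCT e.pet) = (SELECT COUNT(*) FROM pets)
    ORDER BY e.id
""")]

print(any_ids, all_ids)
```

Id 2 owns a cat but no emu, so it appears in the first list only.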
I have the following query, which retrieves 4 adverts from certain categories in a random order.
At the moment, if a user has more than 1 advert, then potentially all of those ads might be retrieved - I need to limit it so that only 1 ad per user is displayed.
Is this possible to achieve in the same query?
SELECT a.advert_id, a.title, a.url, a.user_id,
FLOOR(1 + RAND() * x.m_id) 'rand_ind'
FROM adverts AS a
INNER JOIN advert_categories AS ac
ON a.advert_id = ac.advert_id,
(
SELECT MAX(t.advert_id) - 1 'm_id'
FROM adverts t
) x
WHERE ac.category_id IN
(
SELECT category_id
FROM website_categories
WHERE website_id = '8'
)
AND a.advert_type = 'text'
GROUP BY a.advert_id
ORDER BY rand_ind
LIMIT 4
Note: The solution is the last query at the bottom of this answer.
Test Schema and Data
create table adverts (
advert_id int primary key, title varchar(20), url varchar(20), user_id int, advert_type varchar(10))
;
create table advert_categories (
advert_id int, category_id int, primary key(category_id, advert_id))
;
create table website_categories (
website_id int, category_id int, primary key(website_id, category_id))
;
insert website_categories values
(8,1),(8,3),(8,5),
(1,1),(2,3),(4,5)
;
insert adverts (advert_id, title, user_id) values
(1, 'StackExchange', 1),
(2, 'StackOverflow', 1),
(3, 'SuperUser', 1),
(4, 'ServerFault', 1),
(5, 'Programming', 1),
(6, 'C#', 2),
(7, 'Java', 2),
(8, 'Python', 2),
(9, 'Perl', 2),
(10, 'Google', 3)
;
update adverts set advert_type = 'text'
;
insert advert_categories values
(1,1),(1,3),
(2,3),(2,4),
(3,1),(3,2),(3,3),(3,4),
(4,1),
(5,4),
(6,1),(6,4),
(7,2),
(8,1),
(9,3),
(10,3),(10,5)
;
Data properties
each website can belong to multiple categories
for simplicity, all adverts are of type 'text'
each advert can belong to multiple categories. If a website has multiple categories that are matched multiple times in advert_categories for the same user_id, this causes the advert_id's to show twice when using a straight join between 3 tables in the next query.
This query joins the 3 tables together (notice that ids 1, 3 and 10 each appear twice)
select *
from website_categories wc
inner join advert_categories ac on wc.category_id = ac.category_id
inner join adverts a on a.advert_id = ac.advert_id and a.advert_type = 'text'
where wc.website_id='8'
order by a.advert_id
To make each advert show only once, this is the core query to show all eligible ads, each only once
select *
from adverts a
where a.advert_type = 'text'
and exists (
select *
from website_categories wc
inner join advert_categories ac on wc.category_id = ac.category_id
where wc.website_id='8'
and a.advert_id = ac.advert_id)
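The effect of switching from the straight join to EXISTS can be checked with the sketch below, which loads the test data into Python's sqlite3 module (a stand-in for MySQL, assumed here only so the example is self-contained) and compares the advert ids that each query returns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE adverts (advert_id INT PRIMARY KEY, title VARCHAR(20),
                          url VARCHAR(20), user_id INT, advert_type VARCHAR(10));
    CREATE TABLE advert_categories (advert_id INT, category_id INT,
                                    PRIMARY KEY (category_id, advert_id));
    CREATE TABLE website_categories (website_id INT, category_id INT,
                                     PRIMARY KEY (website_id, category_id));
    INSERT INTO website_categories VALUES (8,1),(8,3),(8,5),(1,1),(2,3),(4,5);
    INSERT INTO adverts (advert_id, title, user_id, advert_type) VALUES
        (1,'StackExchange',1,'text'), (2,'StackOverflow',1,'text'),
        (3,'SuperUser',1,'text'), (4,'ServerFault',1,'text'),
        (5,'Programming',1,'text'), (6,'C#',2,'text'), (7,'Java',2,'text'),
        (8,'Python',2,'text'), (9,'Perl',2,'text'), (10,'Google',3,'text');
    INSERT INTO advert_categories VALUES
        (1,1),(1,3),(2,3),(2,4),(3,1),(3,2),(3,3),(3,4),(4,1),(5,4),
        (6,1),(6,4),(7,2),(8,1),(9,3),(10,3),(10,5);
""")

# Straight three-table join: an advert appears once per matching category.
joined_ids = [r[0] for r in conn.execute("""
    SELECT a.advert_id
    FROM website_categories wc
    INNER JOIN advert_categories ac ON wc.category_id = ac.category_id
    INNER JOIN adverts a ON a.advert_id = ac.advert_id AND a.advert_type = 'text'
    WHERE wc.website_id = 8
    ORDER BY a.advert_id
""")]

# EXISTS only tests whether a matching category exists, so each advert appears once.
eligible_ids = [r[0] for r in conn.execute("""
    SELECT a.advert_id
    FROM adverts a
    WHERE a.advert_type = 'text'
      AND EXISTS (
          SELECT 1
          FROM website_categories wc
          INNER JOIN advert_categories ac ON wc.category_id = ac.category_id
          WHERE wc.website_id = 8 AND a.advert_id = ac.advert_id)
    ORDER BY a.advert_id
""")]

print(joined_ids)
print(eligible_ids)
```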
The next query retrieves all the advert_id's to be shown
select advert_id, user_id
from (
select
advert_id, user_id,
@r := @r + 1 r
from (select @r:=0) r
cross join
(
# core query -- vvv
select a.advert_id, a.user_id
from adverts a
where a.advert_type = 'text'
and exists (
select *
from website_categories wc
inner join advert_categories ac on wc.category_id = ac.category_id
where wc.website_id='8'
and a.advert_id = ac.advert_id)
# core query -- ^^^
order by rand()
) EligibleAdsAndUserIDs
) RowNumbered
group by user_id
order by r
limit 2
There are 3 levels to this query
aliased EligibleAdsAndUserIDs: core query, sorted randomly using order by rand()
aliased RowNumbered: row number added to core query, using MySQL side-effecting @variables
the outermost query forces mysql to collect rows as numbered randomly in the inner queries, and group by user_id causes it to retain only the first row for each user_id. limit 2 causes the query to stop as soon as two distinct user_id's have been encountered.
This is the final query, which takes the advert_id's from the previous query and joins them back to table adverts to retrieve the required columns. It shows ads:
1. only once per user_id
2. featuring users with more ads proportionally (statistically) to the number of eligible ads they have
Note: Point (2) works because the more ads you have, the more likely you will hit the top placings in the row numbering subquery
select a.advert_id, a.title, a.url, a.user_id
from
(
select advert_id
from (
select
advert_id, user_id,
@r := @r + 1 r
from (select @r:=0) r
cross join
(
# core query -- vvv
select a.advert_id, a.user_id
from adverts a
where a.advert_type = 'text'
and exists (
select *
from website_categories wc
inner join advert_categories ac on wc.category_id = ac.category_id
where wc.website_id='8'
and a.advert_id = ac.advert_id)
# core query -- ^^^
order by rand()
) EligibleAdsAndUserIDs
) RowNumbered
group by user_id
order by r
limit 2
) Top2
inner join adverts a on a.advert_id = Top2.advert_id;
I'm thinking through something but don't have MySQL available.. can you try this query to see if it works or crashes...
SELECT
PreQuery.user_id,
(select max( tmp.someRandom ) from PreQuery tmp where tmp.User_ID = PreQuery.User_ID ) MaxRandom
from
( select adverts.user_id,
rand() someRandom
from adverts, advert_categories
where adverts.advert_id = advert_categories.advert_id ) PreQuery
If the "tmp" alias is recognized as a temp buffer of the preliminary query as defined by the OUTER FROM clause, I might have something that will work... I suspect that a column written as a select statement against a derived table from the FROM clause WON'T work, but if it does, I know I'll have something solid for you.
Ok, this one might make the head hurt a bit, but lets get the logical thing going... The inner most "Core Query" is a basis that gets all unique and randomly assigned QUALIFIED Users that have a qualifying ad base on the category chosen, and type = 'text'. Since the order is random, I don't care what the assigned sequence is, and order by that. The limit 4 will return the first 4 entries that qualify. This is regardless of one user having 1 ad vs another having 1000 ads.
Next, join to the advertisements, reversing the table / join qualifications... but by having a WHERE - IN SUB-SELECT, the sub-select will be on each unique USER ID that was qualified by the "CoreQuery" and will ONLY be done 4 times based on ITs inner limit. So even if 100 users with different advertisements, we get 4 users.
Now, the Join to the CoreQuery is the Advert Table based on the same qualifying user. Typically this would join ALL records against the core query given they are for the same user in question... This is correct... HOWEVER, the NEXT WHERE clause is what filters it down to only ONE ad for the given person.
The Sub-Select is making sure its "Advert_ID" matches the one selected in the sub-select. The sub-select is based ONLY on the current "CoreQuery.user_ID" and gets ALL the qualifying category / ads for the user (wrong... we don't want ALL ads)... So, by adding an ORDER BY RAND() will randomize only this one person's ads in the result set... then Limiting THAT by 1 will only give ONE of their qualified ads...
So, the CoreQuery restricts down to 4 users. Then for each qualified user ID, gets only 1 of the qualified ads (by its inner order by RAND() and LIMIT 1 )...
Although I don't have MySQL to try, the queries are COMPLETELY legit and hope it works for you.... man, I love brain teasers like this...
SELECT
ad1.*
from
( SELECT ad.user_id,
count(*) as UserAdCount,
RAND() as ANYRand
from
website_categories wc
inner join advert_categories ac
ON wc.category_id = ac.category_id
inner join adverts ad
ON ac.advert_id = ad.advert_id
AND ad.advert_type = 'text'
where
wc.website_id = 8
GROUP BY
1
order by
3
limit
4 ) CoreQuery,
adverts ad1
WHERE
ad1.advert_type = 'text'
AND CoreQuery.User_ID = ad1.User_ID
AND ad1.advert_id in
( select
ad2.advert_id
FROM
adverts ad2,
advert_categories ac2,
website_categories wc2
WHERE
ad2.user_id = CoreQuery.user_id
AND ad2.advert_id = ac2.advert_id
AND ac2.category_id = wc2.category_id
AND wc2.website_id = 8
ORDER BY
RAND()
LIMIT
1 )
I would suggest doing the random selection in PHP. That is much faster than doing it in MySQL.
"However, when the table is large (over about 10,000 rows) this method of selecting a random row becomes increasingly slow with the size of the table and can create a great load on the server. I tested this on a table I was working that contained 2,394,968 rows. It took 717 seconds (12 minutes!) to return a random row."
http://www.greggdev.com/web/articles.php?id=6
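A minimal sketch of moving the random choice into application code, written in Python rather than PHP. It assumes the eligible (advert_id, user_id) pairs have already been fetched with a plain query without ORDER BY RAND(); the function name pick_ads and the sample list are illustrative only.

```python
import random

def pick_ads(eligible, limit, rng=random):
    """Pick up to `limit` ads in random order, at most one per user.

    eligible: iterable of (advert_id, user_id) pairs
    rng: the random module or a random.Random instance (useful for seeding)
    """
    shuffled = list(eligible)
    rng.shuffle(shuffled)
    picked, seen_users = [], set()
    for advert_id, user_id in shuffled:
        if user_id in seen_users:
            continue  # this user already has an ad in the result
        seen_users.add(user_id)
        picked.append((advert_id, user_id))
        if len(picked) == limit:
            break
    return picked

# Hypothetical rows as they might come back from the database.
eligible = [(1, 1), (2, 1), (3, 1), (4, 1), (6, 2), (8, 2), (9, 2), (10, 3)]
print(pick_ads(eligible, 2))
```

Because the first ad seen for each user is kept after shuffling, users with more eligible ads are more likely to land in the result. Fewer than `limit` rows are returned when there are not enough distinct users.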
set @userid = -1;
select
a.id,
a.title,
case when @userid = a.userid then
0
else
1
end as isfirst,
(@userid := a.userid)
from
adverts a
inner join advertcategories ac on ac.advertid = a.advertid
inner join categories c on c.categoryid = ac.categoryid
where
c.website = 8
having
isfirst = 1
order by
a.userid,
rand()
limit 4
Add COUNT(a.user_id) as owned in the main select directive and add HAVING owned < 2 after Group By
http://dev.mysql.com/doc/refman/5.5/en/select.html
I think this is the way to do it, if the one user has more than one advert then we will not select it.
I have a db schema which looks something like this:
create table user (id int, name varchar(32));
create table group (id int, name varchar(32));
create table group_member (group_id int, user_id int, flag int);
I want to write a query that allows me to do the following:
Given a valid user id (UID), fetch the ids of all users that are in the same group as the specified user id (UID) AND have group_member.flag=3.
Rather than just have the SQL. I want to learn how to think like a Db programmer. As a coder, SQL is my weakest link (since I am far more comfortable with imperative languages than declarative ones) - but I want to change that.
Anyway here are the steps I have identified as necessary to break down the task. I would be grateful if some SQL guru can demonstrate the simple SQL statements - i.e. atomic SQL statements, one for each of the identified subtasks below, and then finally, how I can combine those statements to make the ONE statement that implements the required functionality.
Here goes (assume specified user_id [UID] = 1):
//Subtask #1.
Fetch list of all groups of which I am a member
Select group.id from user inner join group_member where user.id=group_member.user_id and user.id=1
//Subtask #2
Fetch a list of all members who are members of the groups I am a member of (i.e. groups in subtask #1)
Not sure about this ...
select user.id from user, group_member gm1, group_member gm2, ... [Stuck]
//Subtask #3
Get list of users that satisfy criteria group_member.flag=3
Select user.id from user inner join group_member where user.id=group_member.user_id and user.id=1 and group_member.flag=3
Once I have the SQL for subtask 2, I'd then like to see how the complete SQL statement is built from these subtasks (you don't have to use the SQL in the subtasks, it is just a way of explaining the steps involved; also, my SQL may be incorrect or inefficient, and if so, please feel free to correct it and point out what was wrong with it).
Thanks
Query 1 - Select all groups I am a member of.
You don't need a join here unless you also want the groups' names. Just check the group_member table.
SELECT group_id
FROM group_member
WHERE user_id = 1
Result:
1
3
Query 2: Select all users in one of the same groups as me.
You can self-join the group_member table to find all the users that are in the same group as each other and then add a where clause to only find all those that are in the same group as yourself. Add DISTINCT to make sure you don't get people twice.
SELECT DISTINCT T2.user_id
FROM group_member AS T1
JOIN group_member AS T2
ON T1.group_id = T2.group_id
WHERE T1.user_id = 1
AND T2.user_id <> 1 -- Remove myself
Result:
2
3
5
Query 3: Users who have flag 3 in any group.
You just need to check the group_member table. Again, add DISTINCT if you only want to see each user once.
SELECT DISTINCT user_id
FROM group_member
WHERE group_member.flag=3
Result:
2
3
4
Final query: Users in the same group as me who have flag 3.
This is almost the same as query two, just add an extra WHERE condition.
SELECT DISTINCT T2.user_id
FROM group_member AS T1
JOIN group_member AS T2
ON T1.group_id = T2.group_id
WHERE T1.user_id = 1
AND T2.user_id <> 1 -- Remove myself
AND T2.flag = 3
Result:
2
3
Test data:
create table user (id int, name varchar(32));
create table `group` (id int, name varchar(32));
create table group_member (group_id int, user_id int, flag int);
insert into user (id, name) VALUES (1, 'user1'), (2, 'user2'), (3, 'user3'), (4, 'user4'), (5, 'user5');
insert into `group` (id, name) VALUES (1, 'group1'),(2, 'group2'), (3, 'group3');
insert into group_member (group_id, user_id, flag) VALUES (1, 1, 0), (1, 2, 3), (1, 3, 3), (2, 3, 3), (2, 4, 3), (2, 5, 0), (3, 1, 0), (3, 5, 0);
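The intermediate and final queries can be replayed on this test data with the sketch below. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption for a self-contained example) and only creates group_member, since the queries do not touch the other two tables. The helper name user_ids is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE group_member (group_id INT, user_id INT, flag INT);
    INSERT INTO group_member (group_id, user_id, flag) VALUES
        (1, 1, 0), (1, 2, 3), (1, 3, 3),
        (2, 3, 3), (2, 4, 3), (2, 5, 0),
        (3, 1, 0), (3, 5, 0);
""")

def user_ids(sql, params=()):
    return [r[0] for r in conn.execute(sql, params)]

# Query 2: everyone who shares a group with the given user.
same_group = user_ids("""
    SELECT DISTINCT T2.user_id
    FROM group_member AS T1
    JOIN group_member AS T2 ON T1.group_id = T2.group_id
    WHERE T1.user_id = ? AND T2.user_id <> ?
    ORDER BY T2.user_id
""", (1, 1))

# Final query: the same self-join with the extra flag condition.
same_group_flag3 = user_ids("""
    SELECT DISTINCT T2.user_id
    FROM group_member AS T1
    JOIN group_member AS T2 ON T1.group_id = T2.group_id
    WHERE T1.user_id = ? AND T2.user_id <> ? AND T2.flag = 3
    ORDER BY T2.user_id
""", (1, 1))

print(same_group, same_group_flag3)
```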
This should find all people in the same group as a certain user, with the specified flag.
SELECT DISTINCT g2.user_id
FROM group_member AS g INNER JOIN group_member AS g2
ON g.group_id = g2.group_id
WHERE g.user_id = <the userid you want to find>
AND g2.flag = 3
To approach the problem, I did the following:
We need to compare two group_members, as we want to know which ones are in the same group, so we'll need to join group_member with itself. (On group_id, as we want the ones in the same group.)
I want to make sure that one of them is the user I want to compare it to, and then the other one must have the correct flag.
Once this is done, I simply need to pull out the user_id from the one I compare to my original user. Since I figure a user might be in two groups another user is in, it might also be wise to add a DISTINCT to ensure that we only get each user_id out of it once.
The beauty of SQL is that you can join these three selects together with one efficient join. In this case I used the WHERE version of a join because I find it easier to understand. But you might also look at the syntax for LEFT JOIN and INNER JOIN because they give you a lot of expressivity.
SELECT * FROM user, group_member, `group`
WHERE user.id = group_member.user_id
AND group_member.group_id = `group`.id
AND group_member.flag = 3