SQLite: How would you rephrase the query?

I created a basic movie database and for that I'm working with SQLite.
I have a table, which looks like this:
CREATE TABLE movie_collection (
user_id INTEGER NOT NULL,
movie_id INTEGER NOT NULL,
PRIMARY KEY (user_id, movie_id),
FOREIGN KEY (user_id) REFERENCES user (id),
FOREIGN KEY (movie_id) REFERENCES movie (id)
)
As one simple task, I want to show one user (let's say user_id = 1) the whole movie collection, whether or not that user has any movies of their own in it. I also have to avoid duplicate rows when more than one user has the same movie in their collection. If the current user (user_id = 1) is one of them, that user's row takes priority. For example, given these 3 records:
user_id movie_id
-------- ---------
1 17
5 17
8 17
Then the result set must contain the record (1, 17) and not the other two.
For this task I wrote an SQL query like this:
SELECT movie_collect.user_id, movie_collect.movie_id
FROM (
SELECT user_id, movie_id FROM movie_collection WHERE user_id = 1
UNION
SELECT user_id, movie_id FROM movie_collection WHERE user_id != 1 AND movie_id NOT IN (SELECT movie_id FROM movie_collection WHERE user_id = 1)
) AS movie_collect
Although this query delivers pretty much what I need, out of curiosity I wanted to ask whether someone has another idea for solving this problem.
Thank you.

The outer query is superfluous:
SELECT user_id, movie_id
FROM movie_collection
WHERE user_id = 1
UNION
SELECT user_id, movie_id
FROM movie_collection
WHERE user_id != 1
AND movie_id NOT IN (SELECT movie_id
FROM movie_collection
WHERE user_id = 1)
And UNION removes duplicates, so you do not need to check user_id in the second subquery:
SELECT user_id, movie_id
FROM movie_collection
WHERE user_id = 1
UNION
SELECT user_id, movie_id
FROM movie_collection
WHERE movie_id NOT IN (SELECT movie_id
FROM movie_collection
WHERE user_id = 1)
And the only difference between the two subqueries is the WHERE clause, so you can combine them:
SELECT user_id, movie_id
FROM movie_collection
WHERE user_id = 1
OR movie_id NOT IN (SELECT movie_id
FROM movie_collection
WHERE user_id = 1);
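To try the combined query out, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up, and the foreign keys are left out because the user and movie tables are not shown in the question. Note that the combined query only gives user 1 priority: a movie that user 1 does not own still appears once per other owner. The second query is an assumed variant (not part of the answer above) that returns one row per movie, preferring user 1 and otherwise the lowest user_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movie_collection (
    user_id  INTEGER NOT NULL,
    movie_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, movie_id)
);
-- Made-up sample rows: user 1 owns movie 17 but not movie 20.
INSERT INTO movie_collection VALUES (1, 17), (5, 17), (8, 17), (5, 20), (8, 20);
""")

# The combined query from the answer above.
combined = conn.execute("""
    SELECT user_id, movie_id
    FROM movie_collection
    WHERE user_id = 1
       OR movie_id NOT IN (SELECT movie_id
                           FROM movie_collection
                           WHERE user_id = 1)
    ORDER BY movie_id, user_id
""").fetchall()
print(combined)

# Assumed variant: exactly one row per movie.
# (user_id = 1) evaluates to 0 or 1 in SQLite, so MAX(...) = 1 means
# "user 1 owns this movie".
one_per_movie = conn.execute("""
    SELECT CASE WHEN MAX(user_id = 1) = 1 THEN 1 ELSE MIN(user_id) END AS user_id,
           movie_id
    FROM movie_collection
    GROUP BY movie_id
    ORDER BY movie_id
""").fetchall()
print(one_per_movie)
```

With this data, movie 20 shows up twice in the first result (once for user 5, once for user 8) and once in the second.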

Related

PostgreSQL: Select the group with specific members

Given the tables below:
CREATE TABLE users (
id bigserial PRIMARY KEY,
name text NOT NULL
);
CREATE TABLE groups (
id bigserial PRIMARY KEY
);
CREATE TABLE group_members (
group_id bigint REFERENCES groups ON DELETE CASCADE,
user_id bigint REFERENCES users ON DELETE CASCADE,
PRIMARY KEY (group_id, user_id)
);
How do we select a group with a specific set of users?
We want an SQL function that takes an array of user IDs and returns the group ID (from the group_members table) with the exact same set of user IDs.
Also, please add indexes if they will make your solution faster.
First, we need to get the "candidate" groups from the group_members relation, i.e. the groups that contain every user in the input, and then in a second pass ensure that the group size is the same as the size of the user_ids array (here I use a CTE, https://www.postgresql.org/docs/current/static/queries-with.html). The input IDs are assumed to be distinct:
with target(id) as (
select * from unnest(array[2, 3]) -- here is your input
), candidates as (
select group_id
from group_members
where user_id in (select id from target)
group by group_id
having count(*) = (select count(*) from target) -- keep only groups which include the whole input
)
select group_id
from group_members
where group_id in (select group_id from candidates)
group by group_id
having count(*) = (select count(*) from target) -- filter out all "bigger" groups
;
Demonstration with some sample data: http://dbfiddle.uk/?rdbms=postgres_9.6&fiddle=a98c09f20e837dc430ac66e01c7f0dd0
This query will use the indexes you already have, but it is probably worth adding a separate index on group_members (user_id) to avoid intermediate hashing in the first stage of the CTE query.
The SQL function is straightforward (the input array is assumed to contain distinct IDs):
create or replace function find_groups(int8[]) returns int8 as $$
with candidates as (
select group_id
from group_members
where user_id in (select * from unnest($1))
group by group_id
having count(*) = array_length($1, 1) -- the group includes every input user
)
select group_id
from group_members
where group_id in (select group_id from candidates)
group by group_id
having count(*) = array_length($1, 1) -- and has no additional members
;
$$ language sql;
See the same DBfiddle for demonstration.
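The "exact same set of members" test can also be tried locally without a PostgreSQL server. The following is only a sketch of the idea in Python with the built-in sqlite3 module, not the PostgreSQL function itself: find_group is a hypothetical helper, the sample rows are made up, and the users and groups tables are omitted. A group matches when its member count equals the number of requested IDs and every one of its members is among them.

```python
import sqlite3

def find_group(conn, user_ids):
    """Return the id of a group whose members are exactly user_ids, else None."""
    ids = sorted(set(user_ids))  # duplicates in the input would break the counts
    if not ids:
        return None
    placeholders = ",".join("?" for _ in ids)
    row = conn.execute(
        f"""
        SELECT group_id
        FROM group_members
        GROUP BY group_id
        HAVING COUNT(*) = ?                              -- same size as the input
           AND SUM(user_id IN ({placeholders})) = ?      -- every member is requested
        ORDER BY group_id
        """,
        [len(ids), *ids, len(ids)],
    ).fetchone()
    return row[0] if row else None

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE group_members (
    group_id INTEGER NOT NULL,
    user_id  INTEGER NOT NULL,
    PRIMARY KEY (group_id, user_id)
);
-- Made-up sample data.
INSERT INTO group_members VALUES
    (1, 1), (1, 2),
    (2, 2), (2, 3),
    (3, 2), (3, 3), (3, 4),
    (4, 2), (4, 5);
""")

print(find_group(conn, [2, 3]))  # group 2; groups 3 and 4 must not match
print(find_group(conn, [3]))     # no group consists of user 3 alone
```

Group 4 is the interesting case: it has the right size and contains one of the requested users, so a check on group size alone would wrongly accept it.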

How to get last edited post of every user in PostgreSQL?

I have user data in two tables like
1. USERID | USERPOSTID
2. USERPOSTID | USERPOST | LAST_EDIT_TIME
How do I get the last edited post and its time for every user? Assume that every user has 5 posts, and each one is edited at least once.
Will I have to write a loop iterating over every user, find the USERPOST with MAX(LAST_EDIT_TIME) and then collect the values? I tried GROUP BY, but I can't put USERPOSTID or USERPOST in an aggregate function. TIA.
Seems like something like this should work:
create table users(
id serial primary key,
username varchar(50)
);
create table posts(
id serial primary key,
userid integer references users(id),
post_text text,
update_date timestamp default current_timestamp
);
insert into users(username)values('Kalpit');
insert into posts(userid,post_text)values(1,'first test');
insert into posts(userid,post_text)values(1,'second test');
select *
from users u
join posts p on p.userid = u.id
where p.update_date =
( select max( update_date )
from posts
where userid = u.id )
fiddle: http://sqlfiddle.com/#!15/4b240/4/0
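You can use a window function here (USERS and USERPOST stand for the first and second table from the question):
select
USERID
, USERPOSTID
, USERPOST
, LAST_EDIT_TIME
from (
select
u.USERID
, p.USERPOSTID
, p.USERPOST
, p.LAST_EDIT_TIME
, row_number() over (
partition by u.USERID
order by p.LAST_EDIT_TIME desc) as row_num
from
USERS u
join USERPOST p on p.USERPOSTID = u.USERPOSTID
) most_recent
where row_num = 1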
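For a quick local check of the "latest row per user" pattern against the question's two-table layout, here is a small sketch in Python with the built-in sqlite3 module. The table names user_posts and posts, the sample rows, and the ISO-formatted text timestamps are assumptions made for the example. It uses the correlated MAX subquery rather than a window function so that it also runs on older SQLite builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical names for the question's two tables.
CREATE TABLE user_posts (userid INTEGER NOT NULL, userpostid INTEGER PRIMARY KEY);
CREATE TABLE posts (
    userpostid     INTEGER PRIMARY KEY,
    userpost       TEXT,
    last_edit_time TEXT   -- ISO 8601 strings sort chronologically
);
INSERT INTO user_posts VALUES (1, 10), (1, 11), (2, 20), (2, 21);
INSERT INTO posts VALUES
    (10, 'a', '2020-01-01 10:00:00'),
    (11, 'b', '2020-01-02 10:00:00'),
    (20, 'c', '2020-01-05 10:00:00'),
    (21, 'd', '2020-01-03 10:00:00');
""")

# For each user, keep the post whose edit time equals that user's maximum.
latest = conn.execute("""
    SELECT up.userid, p.userpostid, p.userpost, p.last_edit_time
    FROM user_posts up
    JOIN posts p ON p.userpostid = up.userpostid
    WHERE p.last_edit_time = (
        SELECT MAX(p2.last_edit_time)
        FROM user_posts up2
        JOIN posts p2 ON p2.userpostid = up2.userpostid
        WHERE up2.userid = up.userid)
    ORDER BY up.userid
""").fetchall()
print(latest)
```

If two posts of the same user share the maximum edit time, this form returns both, whereas a row_number() approach picks one of them.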

Array_agg alternative - PostgreSQL

I am using PostgreSQL version 8.3.4, which doesn't support the function "array_agg".
My definitions for the tables are:
create table photos (id integer, user_id integer, primary key (id, user_id));
create table tags (photo_id integer, user_id integer, info text, primary key (user_id, photo_id, info));
I came across this query, which gives me what I need:
SELECT photo_id
FROM tags t
GROUP BY 1
HAVING (SELECT count(*) >= 1
FROM (
SELECT photo_id
FROM tags
WHERE info = ANY(array_agg(t.info))
AND photo_id <> t.photo_id
GROUP BY photo_id
HAVING count(*) >= 1
) t1
)
but I can't use it because of my version.
Is there any alternative query to this one that I can use?
select
t2.photo_id, count(*)
from tags t1
join tags t2 on t1.info = t2.info and t1.photo_id <> t2.photo_id
group by t2.photo_id
;
Add HAVING count(*) = k if you want exactly k matches.
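The self-join needs no array functions at all, so it can be checked on any engine. Below is a sketch in Python with the built-in sqlite3 module and made-up tags. It also shows what the count means: count(*) counts matching tag pairs, while the added count(DISTINCT t1.photo_id) column (an addition for illustration, not part of the answer) counts how many other photos share at least one tag.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tags (
    photo_id INTEGER,
    user_id  INTEGER,
    info     TEXT,
    PRIMARY KEY (user_id, photo_id, info)
);
-- Made-up sample tags; photo 3 shares no tag with any other photo.
INSERT INTO tags VALUES
    (1, 1, 'cat'),
    (2, 1, 'cat'), (2, 1, 'dog'),
    (3, 1, 'tree'),
    (4, 1, 'cat'), (4, 1, 'dog');
""")

shared = conn.execute("""
    SELECT t2.photo_id,
           count(*)                    AS matching_pairs,
           count(DISTINCT t1.photo_id) AS other_photos
    FROM tags t1
    JOIN tags t2 ON t1.info = t2.info AND t1.photo_id <> t2.photo_id
    GROUP BY t2.photo_id
    ORDER BY t2.photo_id
""").fetchall()
print(shared)
```

Photo 3 is absent from the result because the inner join finds no partner row for it.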

Trying to remove an inner SQL select statement

I am making a music player where we have stations. I have a table called histories. It has data on the songs a user likes, dislikes or skipped. We store all the times that a person has liked a song or disliked it. We want to get a current snapshot of all the songs the user has either liked (event_type=1) or disliked (event_type=2) in a given station.
The table has the following columns:
id (PK int autoincrement)
station_id (FK int)
song_id (FK int)
event_type (int, either 1, 2, or 3)
Here is my query:
SELECT song_id, event_type, id
FROM histories
WHERE id IN (SELECT MAX(id) AS id
FROM histories
WHERE station_id = 187
AND (event_type=1 OR event_type=2)
GROUP BY station_id, song_id)
ORDER BY id;
Is there a way to make this query run without the inner select? I am pretty sure it would run a lot faster without it.
You can use JOIN instead. Something like this:
SELECT h1.song_id, h1.event_type, h1.id
FROM histories AS h1
INNER JOIN
(
SELECT station_id, song_id, MAX(id) AS MaxId
FROM histories
WHERE station_id = 187
AND event_type IN (1, 2)
GROUP BY station_id, song_id
) AS h2 ON h1.station_id = h2.station_id
AND h1.song_id = h2.song_id
AND h1.id = h2.maxid
ORDER BY h1.id;
@Mahmoud Gamal's answer is correct, but you can probably get rid of some conditions that are not needed.
SELECT h1.song_id, h1.event_type, h1.id
FROM histories AS h1
INNER JOIN
(
SELECT MAX(id) AS MaxId
FROM histories
WHERE station_id = 187
AND event_type IN (1, 2)
GROUP BY song_id
) AS h2 ON h1.id = h2.maxid
ORDER BY h1.id;
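To see that the JOIN form returns the same snapshot as the original IN form, here is a small sketch in Python with the built-in sqlite3 module. The sample events are made up, and SQLite is used only for convenience since the question does not name its database engine. Whether the JOIN is actually faster depends on that engine's optimizer, so it is worth measuring on real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE histories (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    station_id INTEGER NOT NULL,
    song_id    INTEGER NOT NULL,
    event_type INTEGER NOT NULL   -- 1 = like, 2 = dislike, 3 = skip
);
-- Made-up events, inserted in chronological order (ids 1 to 6).
INSERT INTO histories (station_id, song_id, event_type) VALUES
    (187, 5, 1),   -- liked, later overridden
    (187, 5, 2),   -- disliked: current state of song 5
    (187, 6, 3),
    (187, 6, 1),   -- liked: current state of song 6
    (187, 6, 3),   -- a later skip does not change that
    (200, 5, 1);   -- other station, ignored
""")

in_form = conn.execute("""
    SELECT song_id, event_type, id
    FROM histories
    WHERE id IN (SELECT MAX(id)
                 FROM histories
                 WHERE station_id = 187 AND event_type IN (1, 2)
                 GROUP BY station_id, song_id)
    ORDER BY id
""").fetchall()

join_form = conn.execute("""
    SELECT h1.song_id, h1.event_type, h1.id
    FROM histories AS h1
    INNER JOIN (SELECT MAX(id) AS max_id
                FROM histories
                WHERE station_id = 187 AND event_type IN (1, 2)
                GROUP BY song_id) AS h2 ON h1.id = h2.max_id
    ORDER BY h1.id
""").fetchall()

print(in_form)
print(join_form)
```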
Based on your description, this is the answer:
SELECT DISTINCT song_id, event_type, id
FROM histories
WHERE station_id = 187
AND (event_type=1 OR event_type=2)
ORDER BY id
But you must be doing the MAX for some reason - why?

Find Specific Rows

I'm trying to build a rather specific query to find a set of user_ids based on topics they have registered to.
Unfortunately it's not possible to refactor the tables so I have to go with what I've got.
Single table with user_id and registration_id
I need to find all user_ids that have a registration_id of (4 OR 5) AND NOT 1
Each row is a single user_id/registration_id combination.
My SQL skills aren't the best, so I'm really scratching my brain. Any help would be greatly appreciated.
SELECT *
FROM (
SELECT DISTINCT user_id
FROM registrations
) ro
WHERE user_id IN
(
SELECT user_id
FROM registrations ri
WHERE ri.registration_id IN (4, 5)
)
AND user_id NOT IN
(
SELECT user_id
FROM registrations ri
WHERE ri.registration_id = 1
)
Most probably, (user_id, registration_id) is the PRIMARY KEY of your table. If it's not, then create a composite index on (user_id, registration_id) for this to work fast.
Possibly not the best way to do it (my SQL skills aren't the best either), but should do the job:
SELECT user_id
FROM registrations AS t
WHERE registration_id IN (4, 5)
AND NOT EXISTS (SELECT user_id
FROM registrations
WHERE user_id = t.user_id
AND registration_id = 1);
Another way, which also eliminates duplicate user_id values:
SELECT user_id
FROM registrations
WHERE registration_id IN (4, 5)
except
SELECT user_id
FROM registrations
WHERE registration_id =1
Using the table only once (this assumes that no registration_id is smaller than 1):
select user_id from registrations
where registration_id <=5
group by user_id
having MIN(registration_id)>1 and MAX(registration_id)>= 4
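The approaches above can be compared side by side on a small data set. This is a sketch in Python with the built-in sqlite3 module; the table name registrations follows the answers, the sample rows are made up, and DISTINCT is added to the NOT EXISTS form so that a user registered for both 4 and 5 is listed once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE registrations (
    user_id         INTEGER NOT NULL,
    registration_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, registration_id)
);
-- Made-up rows: only users 1 and 3 have (4 OR 5) AND NOT 1.
INSERT INTO registrations VALUES
    (1, 4),
    (2, 1), (2, 4),
    (3, 4), (3, 5),
    (4, 1),
    (5, 2),
    (6, 1), (6, 5);
""")

queries = {
    "not_exists": """
        SELECT DISTINCT user_id
        FROM registrations AS t
        WHERE registration_id IN (4, 5)
          AND NOT EXISTS (SELECT 1
                          FROM registrations
                          WHERE user_id = t.user_id
                            AND registration_id = 1)
    """,
    "except": """
        SELECT user_id FROM registrations WHERE registration_id IN (4, 5)
        EXCEPT
        SELECT user_id FROM registrations WHERE registration_id = 1
    """,
    "group_by": """
        SELECT user_id
        FROM registrations
        WHERE registration_id <= 5
        GROUP BY user_id
        HAVING MIN(registration_id) > 1 AND MAX(registration_id) >= 4
    """,
}

# Run every variant and sort the ids so the results are comparable.
results = {
    name: sorted(row[0] for row in conn.execute(sql))
    for name, sql in queries.items()
}
print(results)
```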