SQL: How to select rows from a table while ignoring the duplicate field values? - sql

How to select rows from a table while ignoring the duplicate field values?
Here is an example:
id user_id message
1 Adam "Adam is here."
2 Peter "Hi there this is Peter."
3 Peter "I am getting sick."
4 Josh "Oh, snap. I'm on a boat!"
5 Tom "This show is great."
6 Laura "Textmate rocks."
What I want to achieve is to select the most recently active users from my database. Let's say I want to select the 5 most recently active users. The problem is that the following query selects Peter twice.
mysql_query("SELECT * FROM messages ORDER BY id DESC LIMIT 5 ");
What I want is to skip the row when the query reaches Peter a second time, and select the next result instead, in our case Adam. I don't want to show my visitors that the recently active users were Laura, Tom, Josh, Peter, and Peter again. That does not make any sense. Instead I want to show them this way: Laura, Tom, Josh, Peter, (skipping the second Peter) and Adam.
Is there an SQL command I can use for this problem?

Yes. "DISTINCT".
SELECT DISTINCT(user_id) FROM messages ORDER BY id DESC LIMIT 5

Maybe you could exclude duplicate users using GROUP BY, ordering each user by their most recent message id.
SELECT user_id, MAX(id) AS last_id FROM messages GROUP BY user_id ORDER BY last_id DESC LIMIT 5;
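For anyone who wants to try this against the sample data, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL (that substitution is my assumption, not part of the question). It groups by user_id and orders each user by their highest message id, which avoids ordering by a column that is neither grouped nor aggregated.

```python
import sqlite3

# In-memory database loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, user_id TEXT, message TEXT)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [
        (1, "Adam", "Adam is here."),
        (2, "Peter", "Hi there this is Peter."),
        (3, "Peter", "I am getting sick."),
        (4, "Josh", "Oh, snap. I'm on a boat!"),
        (5, "Tom", "This show is great."),
        (6, "Laura", "Textmate rocks."),
    ],
)

# One row per user, newest activity first: MAX(id) is each user's latest message.
recent_users = [
    row[0]
    for row in conn.execute(
        """
        SELECT user_id
        FROM messages
        GROUP BY user_id
        ORDER BY MAX(id) DESC
        LIMIT 5
        """
    )
]
print(recent_users)  # ['Laura', 'Tom', 'Josh', 'Peter', 'Adam']
```

Peter shows up once, ranked by his latest message (id 3), and Adam fills the fifth slot.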

Related

Display all individual rows of a GROUP BY, sorted by the cardinality of the groups

Given a table like this:
User A
---------------
Erik 1278
Bob 16287
Alice 9723
Daniel 7
Erik 8
Bob 162
Erik 126
How do I select all rows, grouped/ordered by user, with the users that have the most rows first?
The result would be:
Erik 1278 # Erik is first because 3 rows with him
Erik 8
Erik 126
Bob 16287 # Bob is 2nd because 2 rows
Bob 162
Alice 9723
Daniel 7
Neither
SELECT * FROM t ORDER BY user
nor
SELECT *, COUNT(1) as frequency FROM t GROUP BY user ORDER BY frequency DESC
works; the latter displays only one row for Erik, one row for Bob, and so on.
It seems like I need a GROUP BY, but I still want to see each row of the group. How can I do this?
You can use window functions in the order by:
order by count(*) over (partition by user) desc,
user
The first key counts the number of rows per user. The second keeps all users together (which is important if there are ties). You can add a third key if you want for ordering the rows for each user.
EDIT:
In older versions, you can use a correlated subquery (here t is the table from the question, aliased as u in the outer query):
order by (select count(*) from t u2 where u2.user = u.user) desc,
u.user
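Here is a small sketch of both orderings, run through Python's sqlite3 module (my choice for a runnable demo; window functions need SQLite 3.25 or newer). The column is renamed to user_name in this sketch because user is a reserved word in some databases; that name is an assumption, not from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# user_name stands in for the question's "User" column (hypothetical name).
conn.execute("CREATE TABLE t (user_name TEXT, a INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("Erik", 1278), ("Bob", 16287), ("Alice", 9723), ("Daniel", 7),
        ("Erik", 8), ("Bob", 162), ("Erik", 126),
    ],
)

# Window-function version: sort by the per-user row count, then by user.
window_sql = """
    SELECT user_name, a
    FROM t
    ORDER BY COUNT(*) OVER (PARTITION BY user_name) DESC, user_name
"""

# Correlated-subquery version for engines without window functions.
subquery_sql = """
    SELECT u.user_name, u.a
    FROM t AS u
    ORDER BY (SELECT COUNT(*) FROM t AS u2 WHERE u2.user_name = u.user_name) DESC,
             u.user_name
"""

users_window = [row[0] for row in conn.execute(window_sql)]
users_subquery = [row[0] for row in conn.execute(subquery_sql)]
print(users_window)
```

Both queries keep every row and put Erik's three rows first, then Bob's two, then the single-row users in name order. The order of rows within one user is not fixed unless you add a third sort key.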

Grouping values and changing values which do not allow the rest of the row to group

I'm not sure how to describe this, but I want to group rows where one field has two or more different values, and combine those values (by concatenating them) so that each group yields a single row.
For example:
I have a simple table (all fields are Strings) of people next to their departments. But some people belong to more than one department.
select department_ind, name
from jobs
;
department_ind name
1 Michael
2 Michael
2 Sarah
3 Dave
2 Sally
4 Sally
I want to group by name and concatenate the department_ind values, so the results should look like:
department_ind name
1,2 Michael
2 Sarah
3 Dave
2,4 Sally
Thanks
Use string_agg()
select string_agg(department_ind::text, ',') as departments,
name
from jobs
group by name;
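The query above uses PostgreSQL syntax (the ::text cast). To show the same grouping in something you can run without a server, here is a rough equivalent using Python's sqlite3 module, where group_concat() plays the role of string_agg(). The swap to SQLite is my assumption for the demo only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (department_ind TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?)",
    [
        ("1", "Michael"), ("2", "Michael"), ("2", "Sarah"),
        ("3", "Dave"), ("2", "Sally"), ("4", "Sally"),
    ],
)

# group_concat() is SQLite's counterpart of string_agg(): one row per name,
# with that person's departments joined by commas.
rows = conn.execute(
    """
    SELECT group_concat(department_ind, ',') AS departments, name
    FROM jobs
    GROUP BY name
    """
).fetchall()

# The order inside each concatenated string is not guaranteed, so sort it
# in Python before relying on it.
departments_by_name = {name: sorted(deps.split(",")) for deps, name in rows}
print(departments_by_name)
```

If the order of the concatenated values matters in PostgreSQL, string_agg() accepts an ORDER BY inside the call, e.g. string_agg(department_ind::text, ',' order by department_ind).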

SQL Tinder to prioritize order for users who already like you

profile
---
id name
1 John
2 Jane
3 Jill
...
swipe
---
id profile_1_id profile_2_id liked
1 2 1 true
2 3 1 false
...
If you've used Tinder before, you might recognize that it seems to fetch an initial card deck that consists of:
users who already like you that you can instantly match with, pushed to the top
other users
(out of scope for this question but it also sprinkles in some more attractive users)
If we extend the example to 100+ users, id=1 John was looking at the app, and we fetched with a limit of 20, it would guarantee Jane comes back (since Jane already likes John and John could match right away) + 19 others to fill the rest of John's deck to keep John swiping for more.
What is the SQL for "get people who like John first then fill the rest with random users"? Would this be a WHERE(case if else) or some other statement?
Here is a query that should meet your need.
It works by using conditional sorting with CASE. Users who liked John are given higher priority and appear sorted by id. Other users are given a lower, random priority; this also means that, for a given user, this part of the list will not always be the same (which, I believe, fits your purpose). The number of output records is then controlled by a LIMIT clause.
I tested the query in this db fiddle. You need to replace the question mark (?) in the CASE clause with the id of the user for whom you are generating a card deck (1 for John in your sample data).
SELECT
p.id,
p.name
FROM
profile p
LEFT JOIN swipe s on s.profile_1_id = p.id
ORDER BY
CASE s.profile_2_id
WHEN ? THEN 0
ELSE FLOOR(random() * 10) + 1
END,
p.id
LIMIT 20
You could try something like this but I think you're oversimplifying. Do you want to exclude not liked people from the others?
select * from profile p
left outer join swipe s on (p.id=profile_1_id and s.profile_2_id = 1 and liked = true)
where
p.id<>1
order by coalesce(profile_2_id , random()*-1000000) desc
limit 20
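To make the "likers first, then random others" idea concrete, here is a small sketch on the sample data using Python's sqlite3 module (an assumption for the demo; the answers target a database with random() such as PostgreSQL). The helper name build_deck is hypothetical. It only counts swipes where liked is true and leaves the viewer out of their own deck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE profile (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE swipe (
        id INTEGER PRIMARY KEY,
        profile_1_id INTEGER,  -- the user who swiped
        profile_2_id INTEGER,  -- the user who was swiped on
        liked INTEGER          -- 1 = true, 0 = false
    );
    INSERT INTO profile VALUES (1, 'John'), (2, 'Jane'), (3, 'Jill');
    INSERT INTO swipe VALUES (1, 2, 1, 1), (2, 3, 1, 0);
    """
)


def build_deck(viewer_id, limit=20):
    """Return (id, name) rows: users who liked the viewer first, then the rest."""
    sql = """
        SELECT p.id, p.name
        FROM profile AS p
        LEFT JOIN swipe AS s
               ON s.profile_1_id = p.id
              AND s.profile_2_id = :viewer
              AND s.liked = 1
        WHERE p.id <> :viewer
        -- (s.id IS NULL) is 0 for likers and 1 for everyone else,
        -- so likers sort first; random() shuffles within each bucket.
        ORDER BY (s.id IS NULL), random()
        LIMIT :limit
    """
    return conn.execute(sql, {"viewer": viewer_id, "limit": limit}).fetchall()


deck = build_deck(1)
print(deck)  # [(2, 'Jane'), (3, 'Jill')]
```

Jill swiped on John with liked = false, so she lands in the "other users" bucket rather than at the top.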

Is there a way to select results after a certain id in an order list?

I'm trying to implement a cursor-based paginating list based off of data from a Postgres database.
As an example, say I have a table with the following columns:
id | firstname | lastname
I want to paginate this data, which would be pretty simple if I only ever wanted to sort it by the id, but in my case, I want the option to sort by last name, and there's guaranteed to be multiple people with the same last name.
Say I have a select statement as follows:
SELECT * FROM people
ORDER BY lastname ASC;
In this case, I could make my encoded cursor contain the lastname so I could pick up where I left off, but since there will be multiple users with the same last name, this will be buggy. Is there a way in SQL to get only the results after a certain id in an ordered list, where id is not the column by which the results are sorted?
Example results from the select statement:
1 | John | Doe
4 | John | Price
2 | Joe | White
6 | Jim | White
3 | Sam | White
5 | Sally | Young
If I wanted a page size of 3, I couldn't add WHERE lastname <= :lastname as I'd have duplicate data on the list since it would return ids 2, 6, and 3 during that call. In my case, it'd be helpful if I could add to my query something similar to AFTER id = 6 where it could skip everything until it finds that id in the ordered list.
Yes. If I understand correctly:
select t.*
from t
where (lastname, id) > (select t2.lastname, t2.id
from t t2
where t2.id = ?
)
order by t.lastname, t.id;
I think I would add firstname into the mix, but it is the same idea.
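Here is a sketch of that row-comparison idea as a paging helper, run with Python's sqlite3 module (my assumption for a self-contained demo; row values need SQLite 3.15 or newer, and the question is about Postgres, which supports the same comparison). The helper name fetch_page is hypothetical. Instead of looking the cursor row up by id with a subquery, this version carries both lastname and id in the cursor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT)"
)
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        (1, "John", "Doe"), (4, "John", "Price"), (2, "Joe", "White"),
        (6, "Jim", "White"), (3, "Sam", "White"), (5, "Sally", "Young"),
    ],
)


def fetch_page(cursor=None, page_size=3):
    """Return (rows, next_cursor); cursor is the (lastname, id) of the last row seen."""
    if cursor is None:
        sql = """
            SELECT id, firstname, lastname FROM people
            ORDER BY lastname, id
            LIMIT ?
        """
        params = (page_size,)
    else:
        # The row comparison skips everything up to and including the cursor
        # row; id breaks ties between people who share a last name.
        sql = """
            SELECT id, firstname, lastname FROM people
            WHERE (lastname, id) > (?, ?)
            ORDER BY lastname, id
            LIMIT ?
        """
        params = (cursor[0], cursor[1], page_size)
    rows = conn.execute(sql, params).fetchall()
    next_cursor = (rows[-1][2], rows[-1][0]) if rows else None
    return rows, next_cursor


page1, cursor1 = fetch_page()
page2, cursor2 = fetch_page(cursor1)
print([row[0] for row in page1], [row[0] for row in page2])
```

The ORDER BY has to list the same columns, in the same order, as the row comparison; otherwise pages can overlap or skip rows.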
LIMIT and OFFSET are used for pagination, e.g.:
SELECT id, lastname, firstname FROM people
ORDER BY lastname, firstname, id
OFFSET 0
LIMIT 10
This returns rows 1 through 10. To retrieve the next page, set the offset to 10.
Here is the documentation:
https://www.postgresql.org/docs/9.6/static/queries-limit.html

selecting point in time when change was made

Say I have the following table:
id Name Status Date
1 John Working 11/11/2003
2 John Working 03/03/2004
3 John Quit 04/04/2004
4 John Quit 04/05/2004
5 John Quit 04/06/2004
6 Joey Working 03/05/2009
7 Joey Working 02/06/2009
8 Joey Quit 02/07/2009
9 Joey Quit 02/08/2009
10 Joey Quit 02/09/2009
I want to get the date when the change from Working to Quit occurred, so that I get:
3 John Quit 04/04/2004
8 Joey Quit 02/07/2009
How would I do that?
select ID, Name, Status, Date
from tableName
where ID in (
select min(ID)
from tableName
where Status = 'Quit'
group by Name
)
This will get the first record that has the status of 'Quit' for each person. It is not necessarily the chronologically first record, though. If that is what you are looking for, let me know. Also, can you be sure there won't be any duplicated names? This could be problematic if there are two different Johns or Joeys.
select id, name, date
from tableName WHERE NAME='<name>' AND date =
(select MIN(date)
FROM tableName
WHERE NAME='<name>' AND date > (select MAX(date)
FROM tableName WHERE NAME='<name>' AND status = 'Working')
)
Hopefully I didn't screw things up badly in the nested query there. This only gives you one user; you could wrap it in a function and call it with the username, or make a temp table of distinct names and loop through those names.
A lot of options exist for taking it from just one user to all users.
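One such option, sketched here with Python's sqlite3 module, is the LAG() window function. This is a different technique from the queries above, and it needs SQLite 3.25 or newer. It compares each row's status with the previous row for the same name, which covers all users in one pass. The table name status_log is hypothetical, and I am assuming id order matches chronological order, since the sample dates are ambiguous strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE status_log (id INTEGER PRIMARY KEY, name TEXT, status TEXT, date TEXT)"
)
conn.executemany(
    "INSERT INTO status_log VALUES (?, ?, ?, ?)",
    [
        (1, "John", "Working", "11/11/2003"),
        (2, "John", "Working", "03/03/2004"),
        (3, "John", "Quit", "04/04/2004"),
        (4, "John", "Quit", "04/05/2004"),
        (5, "John", "Quit", "04/06/2004"),
        (6, "Joey", "Working", "03/05/2009"),
        (7, "Joey", "Working", "02/06/2009"),
        (8, "Joey", "Quit", "02/07/2009"),
        (9, "Joey", "Quit", "02/08/2009"),
        (10, "Joey", "Quit", "02/09/2009"),
    ],
)

# LAG(status) gives each row the status of the previous row for the same
# name (ordered by id); a change is a 'Quit' row that follows a 'Working' row.
changes = conn.execute(
    """
    SELECT id, name, status, date
    FROM (
        SELECT id, name, status, date,
               LAG(status) OVER (PARTITION BY name ORDER BY id) AS prev_status
        FROM status_log
    )
    WHERE status = 'Quit' AND prev_status = 'Working'
    ORDER BY id
    """
).fetchall()
print(changes)
```

Unlike the min(ID) approach, this would also report a second change if someone returned to Working and quit again. The caveat about two different people sharing a name still applies.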