Is there a way to select results after a certain id in an ordered list? - sql

I'm trying to implement a cursor-based paginating list based off of data from a Postgres database.
As an example, say I have a table with the following columns:
id | firstname | lastname
I want to paginate this data, which would be pretty simple if I only ever wanted to sort it by the id, but in my case, I want the option to sort by last name, and there's guaranteed to be multiple people with the same last name.
If I have a select statement like follows:
SELECT * FROM people
ORDER BY lastname ASC;
In this case, I could make my encoded cursor contain information about the lastname so I could pick up where I left off, but since there will be multiple users with the same last name, this will be buggy. Is there a way in SQL to get only the results after a certain id in an ordered list when id is not the column by which the results are sorted?
Example results from the select statement:
1 | John | Doe
4 | John | Price
2 | Joe | White
6 | Jim | White
3 | Sam | White
5 | Sally | Young
If I wanted a page size of 3, I couldn't add WHERE lastname >= :lastname, as I'd have duplicate data on the list: that call would return ids 2, 6, and 3 again. In my case, it'd be helpful if I could add something like AFTER id = 6 to my query, so that it skips everything until it finds that id in the ordered list.

Yes. If I understand correctly:
select t.*
from t
where (lastname, id) > (select t2.lastname, t2.id
from t t2
where t2.id = ?
)
order by t.lastname, t.id;
I think I would add firstname into the mix, but it is the same idea.
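A minimal sketch of the same row-comparison idea, run through Python's sqlite3 module so it can be tried without a Postgres server. It assumes the cursor carries the (lastname, id) pair of the last row already shown, rather than looking it up by id as the query above does; the names make_people_db and fetch_page are made up for this illustration, and the rows are the sample data from the question.

```python
import sqlite3


def make_people_db():
    """Build an in-memory table holding the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE people (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT)"
    )
    conn.executemany(
        "INSERT INTO people VALUES (?, ?, ?)",
        [(1, "John", "Doe"), (4, "John", "Price"), (2, "Joe", "White"),
         (6, "Jim", "White"), (3, "Sam", "White"), (5, "Sally", "Young")],
    )
    return conn


def fetch_page(conn, page_size, cursor=None):
    """Return (rows, next_cursor); cursor is a (lastname, id) pair or None."""
    if cursor is None:
        # First page: no lower bound, just the deterministic ordering.
        rows = conn.execute(
            "SELECT id, firstname, lastname FROM people "
            "ORDER BY lastname, id LIMIT ?",
            (page_size,),
        ).fetchall()
    else:
        # Later pages: keep only rows that sort strictly after the cursor row.
        rows = conn.execute(
            "SELECT id, firstname, lastname FROM people "
            "WHERE (lastname, id) > (?, ?) "
            "ORDER BY lastname, id LIMIT ?",
            (cursor[0], cursor[1], page_size),
        ).fetchall()
    # The next cursor is the sort key of the last row returned, if any.
    next_cursor = (rows[-1][2], rows[-1][0]) if rows else None
    return rows, next_cursor


if __name__ == "__main__":
    conn = make_people_db()
    page, cur = fetch_page(conn, 3)
    print(page, cur)
    print(fetch_page(conn, 3, cur)[0])
```

Because id is unique, the pair (lastname, id) never ties, which is why the ORDER BY has to list both columns in the same order as the comparison.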

LIMIT and OFFSET are used for pagination, e.g.:
SELECT id, lastname, firstname FROM people
Order by lastname, firstname, id
Offset 0
Limit 10
This returns rows 1 through 10. To retrieve the next page, set the offset to 10.
Here is the documentation:
https://www.postgresql.org/docs/9.6/static/queries-limit.html
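As a rough illustration of the LIMIT/OFFSET approach, the sketch below computes the offset from a zero-based page number. It runs against SQLite through Python's sqlite3 module rather than Postgres, and the helper names make_people_db and fetch_page_by_offset are invented for this example; the rows are the sample data from the question.

```python
import sqlite3


def make_people_db():
    """Build an in-memory table holding the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE people (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT)"
    )
    conn.executemany(
        "INSERT INTO people VALUES (?, ?, ?)",
        [(1, "John", "Doe"), (4, "John", "Price"), (2, "Joe", "White"),
         (6, "Jim", "White"), (3, "Sam", "White"), (5, "Sally", "Young")],
    )
    return conn


def fetch_page_by_offset(conn, page_number, page_size):
    """Return one page of rows; page_number starts at 0."""
    offset = page_number * page_size  # rows to skip before this page
    return conn.execute(
        "SELECT id, lastname, firstname FROM people "
        "ORDER BY lastname, firstname, id "
        "LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()


if __name__ == "__main__":
    conn = make_people_db()
    print(fetch_page_by_offset(conn, 0, 3))
    print(fetch_page_by_offset(conn, 1, 3))
```

Note that with this approach a row inserted or deleted between two requests shifts every later page by one position, which is one reason cursor-based pagination is often preferred.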

Related

Get total count and first 3 columns

I have the following SQL query:
SELECT TOP 3 accounts.username
,COUNT(accounts.username) AS count
FROM relationships
JOIN accounts ON relationships.account = accounts.id
WHERE relationships.following = 4
AND relationships.account IN (
SELECT relationships.following
FROM relationships
WHERE relationships.account = 8
);
I want to return the total count of accounts.username and the first 3 accounts.username (in no particular order). Unfortunately accounts.username and COUNT(accounts.username) cannot coexist. The query works fine after removing either one of them. I don't want to send the request twice with different select bodies. The count could reach 1000+, so I would prefer to calculate it in SQL rather than in code.
The current query returns the error Column 'accounts.username' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. which has not led me anywhere and this is different to other questions as I do not want to use the 'group by' clause. Is there a way to do this with FOR JSON AUTO?
The desired output could be:
+-------+----------+
| count | username |
+-------+----------+
| 1551 | simon1 |
| 1551 | simon2 |
| 1551 | simon3 |
+-------+----------+
or
+----------------------------------------------------------------+
| JSON_F52E2B61-18A1-11d1-B105-00805F49916B |
+----------------------------------------------------------------+
| [{"count": 1551, "usernames": ["simon1", "simon2", "simon3"]}] |
+----------------------------------------------------------------+
If you want to display the total count of rows that satisfy the filter conditions (and where username is not null) in an additional column in your resultset, then you could use window functions:
SELECT TOP 3
a.username,
COUNT(a.username) OVER() AS cnt
FROM relationships r
JOIN accounts a ON r.account = a.id
WHERE
r.following = 4
AND EXISTS (
SELECT 1 FROM relationships r1 WHERE r1.account = 8 AND r1.following = r.account
)
;
Side notes:
if username is not nullable, use COUNT(*) rather than COUNT(a.username): this is more efficient since it does not require the database to check every value for nullity
table aliases make the query easier to write, read and maintain
I usually prefer EXISTS over IN (but here this is mostly a matter of taste, as both techniques should work fine for your use case)
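The window-function idea can be tried out with a small sketch like the one below. It uses SQLite through Python's sqlite3 module, so TOP 3 becomes LIMIT 3, and it assumes a SQLite build with window-function support (3.25 or later). The sample rows and the function names are invented for illustration, and an ORDER BY is added only to make the output repeatable.

```python
import sqlite3


def make_relationships_db():
    """Build a tiny, made-up data set with the two tables from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, username TEXT NOT NULL)")
    conn.execute("CREATE TABLE relationships (account INTEGER, following INTEGER)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)",
        [(1, "simon1"), (2, "simon2"), (3, "simon3"), (5, "simon5"), (6, "simon6")],
    )
    conn.executemany(
        "INSERT INTO relationships VALUES (?, ?)",
        [(1, 4), (2, 4), (3, 4), (5, 4), (6, 4),   # accounts that follow 4
         (8, 1), (8, 2), (8, 3), (8, 5), (8, 9)],  # accounts that 8 follows
    )
    return conn


def top_usernames_with_total(conn, followed_id, follower_id, limit=3):
    """Return up to `limit` (username, total) rows; total counts all matches."""
    sql = """
        SELECT a.username,
               COUNT(*) OVER () AS cnt   -- evaluated before LIMIT is applied
        FROM relationships r
        JOIN accounts a ON r.account = a.id
        WHERE r.following = ?
          AND EXISTS (
              SELECT 1 FROM relationships r1
              WHERE r1.account = ? AND r1.following = r.account
          )
        ORDER BY a.username
        LIMIT ?
    """
    return conn.execute(sql, (followed_id, follower_id, limit)).fetchall()


if __name__ == "__main__":
    print(top_usernames_with_total(make_relationships_db(), 4, 8))
```

Four accounts match the filter in this data set, so each of the three returned rows carries a count of 4 rather than 3.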

Postgresql union query with priority on one query

So I have a table with 2 columns
class_id | title
CS124 | computer tactics
CS101 | intro to computers
MATH157 | math stuff
CS234 | other CS stuff
FRENCH50 | TATICS of french
ENGR101 | engineering ETHICS
OTHER1 | other CS title
I want to do a sort of smart search for auto complete where a user searches for something.
Let's say they type 'CS' into the box. I want to search using both the class_id and title, with a limit of, let's say, 5 for this example. I first want to search for class_ids like 'CS%' with a limit of 5, ordered by class_id. This will return the 3 CS classes.
Then, if there is any room left in the limit, I want to search using title like '%CS%' and combine the results, with the class_id matches first, and make sure that duplicates are removed from the bottom, like CS234, which would match on both queries.
So the end result for this query would be
CS101 | intro to computers
CS124 | computer tactics
CS234 | other CS stuff
ENGR101 | engineering ETHICS
FRENCH50 | TATICS of french
I am trying to do something like this
(select * from class_infos
where LOWER(class_id) like LOWER('CS%')
order by class_id)
union
(select * from class_infos
where LOWER(title) like LOWER('%CS%')
order by class_id)
limit 30
But it is not putting them in the right order or giving the class_id query priority. Does anyone have any suggestions?
Here is the sqlfiddle
http://sqlfiddle.com/#!15/5368b
Have you tried something like this?
SQL Fiddle Demo
SELECT *
FROM
(
(select 1 as priority, *
from class_infos
where LOWER(class_id) like LOWER('CS%'))
union
(select 2 as priority, *
from class_infos
where
LOWER(title) like LOWER('%CS%')
and not LOWER(class_id) like LOWER('CS%')
)
) as class
ORDER BY priority, class_id
limit 5
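To check the priority-column approach against the sample rows from the question, here is a small sketch using SQLite through Python's sqlite3 module. SQLite does not accept parentheses around the members of a UNION, so they are dropped here; the function names and the parameterised search term are assumptions made for this illustration.

```python
import sqlite3


def make_class_db():
    """Build an in-memory class_infos table with the rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE class_infos (class_id TEXT, title TEXT)")
    conn.executemany(
        "INSERT INTO class_infos VALUES (?, ?)",
        [("CS124", "computer tactics"), ("CS101", "intro to computers"),
         ("MATH157", "math stuff"), ("CS234", "other CS stuff"),
         ("FRENCH50", "TATICS of french"), ("ENGR101", "engineering ETHICS"),
         ("OTHER1", "other CS title")],
    )
    return conn


def autocomplete(conn, term, limit=5):
    """class_id prefix matches first, then title matches, without duplicates."""
    sql = """
        SELECT class_id, title
        FROM (
            SELECT 1 AS priority, class_id, title
            FROM class_infos
            WHERE LOWER(class_id) LIKE LOWER(?)
            UNION
            SELECT 2 AS priority, class_id, title
            FROM class_infos
            WHERE LOWER(title) LIKE LOWER(?)
              AND LOWER(class_id) NOT LIKE LOWER(?)  -- already in priority 1
        ) AS ranked
        ORDER BY priority, class_id
        LIMIT ?
    """
    prefix = term + "%"
    anywhere = "%" + term + "%"
    return conn.execute(sql, (prefix, anywhere, prefix, limit)).fetchall()


if __name__ == "__main__":
    for row in autocomplete(make_class_db(), "CS"):
        print(row)
```

The NOT LIKE filter in the second branch is what removes the duplicates: since the two branches carry different priority values, UNION alone would not treat a row matched by both as a duplicate.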

GROUP BY and aggregate function query

I am looking at making a simple leader board for a time trial. A member may perform many time trials, but I only want for their fastest result to be displayed. My table columns are as follows:
Members { ID (PK), Forename, Surname }
TimeTrials { ID (PK), MemberID, Date, Time, Distance }
An example dataset would be:
Forename | Surname | Date | Time | Distance
Bill Smith 01-01-11 1.14 100
Dave Jones 04-09-11 2.33 100
Bill Smith 02-03-11 1.1 100
My resulting answer from the example above would be:
Forename | Surname | Date | Time | Distance
Bill Smith 02-03-11 1.1 100
Dave Jones 04-09-11 2.33 100
I have this so far, but Access complains that I am not using Date as part of an aggregate function:
SELECT Members.Forename, Members.Surname, Min(TimeTrials.Time) AS MinOfTime, TimeTrials.Date
FROM Members
INNER JOIN TimeTrials ON Members.ID = TimeTrials.MemberID
GROUP BY Members.Forename, Members.Surname, TimeTrials.Distance
HAVING TimeTrials.Distance = 100
ORDER BY MIN(TimeTrials.Time);
If I remove the Date from the SELECT, the query works (without the date). I have tried using FIRST on TimeTrials.Date, but that returns the first date, which is normally incorrect.
Obviously putting the Date as part of the GROUP BY would not return the result set that I am after.
Make this task easier on yourself by starting with a smaller piece of the problem. First get the minimum Time from TimeTrials for each combination of MemberID and Distance.
SELECT
tt.MemberID,
tt.Distance,
Min(tt.Time) AS MinOfTime
FROM TimeTrials AS tt
GROUP BY
tt.MemberID,
tt.Distance;
Assuming that SQL is correct, use it in a subquery which you join back to TimeTrials again.
SELECT tt2.*
FROM
TimeTrials AS tt2
INNER JOIN
(
SELECT
tt.MemberID,
tt.Distance,
Min(tt.Time) AS MinOfTime
FROM TimeTrials AS tt
GROUP BY
tt.MemberID,
tt.Distance
) AS sub
ON
tt2.MemberID = sub.MemberID
AND tt2.Distance = sub.Distance
AND tt2.Time = sub.MinOfTime
WHERE tt2.Distance = 100
ORDER BY tt2.Time;
Finally, you can join that query to Members to get Forename and Surname. Your question shows you already know how to do that, so I'll leave it for you. :-)
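For completeness, one possible shape of that final join is sketched below. It runs on SQLite through Python's sqlite3 module rather than Access, so Access-specific requirements such as extra parentheses around multiple joins are not covered. The tables hold the example rows from the question, and the function names are made up for this illustration.

```python
import sqlite3


def make_timetrial_db():
    """Build in-memory Members and TimeTrials tables from the example data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Members (ID INTEGER PRIMARY KEY, Forename TEXT, Surname TEXT)")
    conn.execute(
        'CREATE TABLE TimeTrials (ID INTEGER PRIMARY KEY, MemberID INTEGER, '
        '"Date" TEXT, "Time" REAL, Distance INTEGER)'
    )
    conn.executemany("INSERT INTO Members VALUES (?, ?, ?)",
                     [(1, "Bill", "Smith"), (2, "Dave", "Jones")])
    conn.executemany(
        "INSERT INTO TimeTrials VALUES (?, ?, ?, ?, ?)",
        [(1, 1, "01-01-11", 1.14, 100), (2, 2, "04-09-11", 2.33, 100),
         (3, 1, "02-03-11", 1.1, 100)],
    )
    return conn


def leaderboard(conn, distance):
    """Fastest trial per member at the given distance, quickest first."""
    sql = """
        SELECT m.Forename, m.Surname, tt2."Date", tt2."Time", tt2.Distance
        FROM TimeTrials AS tt2
        INNER JOIN (
            -- minimum time per member and distance
            SELECT tt.MemberID, tt.Distance, MIN(tt."Time") AS MinOfTime
            FROM TimeTrials AS tt
            GROUP BY tt.MemberID, tt.Distance
        ) AS sub
            ON tt2.MemberID = sub.MemberID
           AND tt2.Distance = sub.Distance
           AND tt2."Time" = sub.MinOfTime
        INNER JOIN Members AS m ON m.ID = tt2.MemberID
        WHERE tt2.Distance = ?
        ORDER BY tt2."Time"
    """
    return conn.execute(sql, (distance,)).fetchall()


if __name__ == "__main__":
    for row in leaderboard(make_timetrial_db(), 100):
        print(row)
```

If a member records the same best time twice, both rows come back, so a tie-breaking rule would be needed in that case.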

Select unique records and display as category headers in rails

I have a rails 3.2 app running on PostgreSQL, and have some data I want to display in my view, which is stored in the database in this structure:
+----+--------+------------------+--------------------+
| id | name | sched_start_date | task |
+----+--------+------------------+--------------------+
| 1 | "Ben" | 2013-03-01 | "Check for debris" |
+----+--------+------------------+--------------------+
| 2 | "Toby" | 2013-03-02 | "Carry out Y1.1" |
+----+--------+------------------+--------------------+
| 3 | "Toby" | 2013-03-03 | "Check oil seals" |
+----+--------+------------------+--------------------+
I would like to display a list of tasks for each name, and for the names to be ordered ASC by the first sched_start_date they have, which should look like ...
Ben
2013-03-01 – Check for debris
Toby
2013-03-02 – Carry out Y1.1
2013-03-03 – Check oil seals
The approach I started taking was to run a query for unique names and order them by sched_start_date ASC, then run a query for each name to get their tasks.
To get a list of unique names, the SQL would look like this.
select *
from (
select distinct on (name) name, sched_start_date
from tasks
) p
order by sched_start_date;
I would like to know if this is the correct approach (querying for unique names then running another query for all their tasks), or if there is a better rails way.
To get the data sorted like you describe, you might want to use min() as window function in the ORDER BY clause:
SELECT name, sched_start_date, task
FROM tasks
ORDER BY min(sched_start_date) OVER (PARTITION BY name), 1, 2, 3
Your original query would need an additional ORDER BY item to get the earliest date per name:
SELECT DISTINCT ON (name) name, sched_start_date, task
FROM tasks
ORDER BY 1, 2, 3;
I also added task (3) as last ORDER BY item to break ties, in case there can be more than one per date.
But the output is still ordered by name, not by date.
Getting your peculiar format with all data stuffed into one column is a bit more complex:
SELECT one_col
FROM (
WITH x AS (
SELECT name, min(sched_start_date) AS min_start
FROM tasks
GROUP BY 1
)
SELECT 2 AS rnk, name
,sched_start_date::text || ' – ' || task AS one_col
,sched_start_date, min_start
FROM tasks
JOIN x USING (name)
UNION ALL
SELECT 1 AS rnk, name, name, NULL::date, min_start
FROM x
ORDER BY min_start, name, rnk, sched_start_date, one_col -- task is not an output column of the UNION; one_col sorts the same way within a date
) y
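An alternative sketch that keeps the SQL simple: sort the rows by each name's earliest date with a single query, then build the headers in application code. This version uses SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later) and computes the window minimum in a subquery instead of directly in ORDER BY. The "Adam" row is a hypothetical addition to show that names are ordered by date, not alphabetically, and the function names are invented for this example.

```python
import sqlite3
from itertools import groupby


def make_tasks_db():
    """Build an in-memory tasks table: the question's rows plus one made-up row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tasks (id INTEGER PRIMARY KEY, name TEXT, "
        "sched_start_date TEXT, task TEXT)"
    )
    conn.executemany(
        "INSERT INTO tasks VALUES (?, ?, ?, ?)",
        [(1, "Ben", "2013-03-01", "Check for debris"),
         (2, "Toby", "2013-03-02", "Carry out Y1.1"),
         (3, "Toby", "2013-03-03", "Check oil seals"),
         (4, "Adam", "2013-03-05", "Inspect belts")],  # hypothetical extra row
    )
    return conn


def task_lines(conn):
    """Return display lines: a name header followed by that person's tasks."""
    sql = """
        SELECT name, sched_start_date, task
        FROM (
            SELECT name, sched_start_date, task,
                   MIN(sched_start_date) OVER (PARTITION BY name) AS first_date
            FROM tasks
        ) AS t
        ORDER BY first_date, name, sched_start_date, task
    """
    rows = conn.execute(sql).fetchall()
    lines = []
    # Rows for one name are contiguous, so groupby yields one group per name.
    for name, group in groupby(rows, key=lambda row: row[0]):
        lines.append(name)
        for _, start_date, task in group:
            lines.append(f"{start_date} - {task}")
    return lines


if __name__ == "__main__":
    print("\n".join(task_lines(make_tasks_db())))
```

In a Rails view the same grouping could be done on the result set with Ruby's group_by, which avoids running one query per name.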
Assuming that you have associations in your model you would be able to run
@employees = Employee.order(:name, :sched_start_date, :task).includes(:tasks)
You could then iterate over them:
@employees.each do |employee|
  employee.name
  employee.tasks.each do |task|
    task.name
  end
end
This isn't gonna exactly match your needs, but should show you where to start.

JavaDB: get ordered records in the subquery

I have the following table "COMPANY_BY_NEWS_REPUTATION" in my JavaDB database (this is some random data just to represent the structure)
COMPANY | NEWS_HASH | REPUTATION | DATE
-------------------------------------------------------------------
Company A | 14676757 | 0.12345 | 2011-05-19 15:43:28.0
Company B | 454564556 | 0.78956 | 2011-05-24 18:44:28.0
Company C | 454564556 | 0.78956 | 2011-05-24 18:44:28.0
Company A | -7874564 | 0.12345 | 2011-05-19 15:43:28.0
One news_hash may relate to several companies while a company can relate to several news_hashes as well. Reputation and date are bound to the news_hash.
What I need to do is calculate the average reputation of the last 5 news items for every company. To do that, I feel that I need to use 'order by' and 'offset' in a subquery, as shown in the code below.
select COMPANY, avg(REPUTATION) from
(select * from COMPANY_BY_NEWS_REPUTATION order by "DATE" desc
offset 0 rows fetch next 5 row only) as TR group by COMPANY;
However, JavaDB allows neither ORDER BY, nor OFFSET in a subquery. Could anyone suggest a working solution for my problem please?
Which version of JavaDB are you using? According to the chapter TableSubquery in the JavaDB documentation, table subqueries do support order by and fetch next, at least in version 10.6.2.1.
Given that subqueries can be ordered and the size of the result set can be limited, the following (untested) query might do what you want:
select COMPANY, (select avg(REPUTATION)
from (select REPUTATION
from COMPANY_BY_NEWS_REPUTATION
where COMPANY = TR.COMPANY
order by "DATE" desc
fetch first 5 rows only))
from (select distinct COMPANY
from COMPANY_BY_NEWS_REPUTATION) as TR
This query retrieves all distinct company names from COMPANY_BY_NEWS_REPUTATION, then retrieves the average of the last five reputation rows for each company. I have no idea whether it will perform sufficiently; that will likely depend on the size of your data set and what indexes you have in place.
If you have a list of unique company names in another table, you can use that instead of the select distinct ... subquery to retrieve the companies for which to calculate averages.
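For comparison, here is a sketch of a different technique for the same "average of the last 5 per company" result: numbering each company's rows with ROW_NUMBER() and keeping the first five. It runs on SQLite (3.25 or later) through Python's sqlite3 module; whether a given JavaDB version accepts ROW_NUMBER() with PARTITION BY is not something this example assumes, so treat it as an illustration of the logic rather than a drop-in JavaDB query. The data and function names are invented.

```python
import sqlite3


def make_reputation_db():
    """Build an in-memory table with made-up rows in the question's structure."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE COMPANY_BY_NEWS_REPUTATION '
        '(COMPANY TEXT, NEWS_HASH INTEGER, REPUTATION REAL, "DATE" TEXT)'
    )
    rows = [("Company A", 100 + day, 0.0 if day == 1 else 1.0,
             f"2011-05-0{day} 12:00:00") for day in range(1, 7)]
    rows += [("Company B", 201, 0.25, "2011-05-01 12:00:00"),
             ("Company B", 202, 0.75, "2011-05-02 12:00:00")]
    conn.executemany("INSERT INTO COMPANY_BY_NEWS_REPUTATION VALUES (?, ?, ?, ?)", rows)
    return conn


def recent_average(conn, last_n=5):
    """Average REPUTATION over each company's last_n most recent rows."""
    sql = """
        SELECT COMPANY, AVG(REPUTATION)
        FROM (
            SELECT COMPANY, REPUTATION,
                   ROW_NUMBER() OVER (
                       PARTITION BY COMPANY ORDER BY "DATE" DESC
                   ) AS rn   -- 1 = newest row for that company
            FROM COMPANY_BY_NEWS_REPUTATION
        ) AS ranked
        WHERE rn <= ?
        GROUP BY COMPANY
        ORDER BY COMPANY
    """
    return conn.execute(sql, (last_n,)).fetchall()


if __name__ == "__main__":
    print(recent_average(make_reputation_db()))
```

Company A has six rows here, so its oldest row (reputation 0.0) is excluded from the average, while Company B has fewer than five rows and all of them are used.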