How to select the 5 most recent records in a table from a SQLite database - sql

I have a SQLite database table schema called Bans like this.
| Name | Ban Reason |
| Noah | Swearing |
| Liam | Toxicity |
| Josh | Cheating |
Such simple data continues for about another 20 rows. Naturally, the latest entries are at the top. In this case, the entry containing Noah would be the most recent addition to the table.
I want to retrieve the Top 5 most recent results. How can I do this using SQLite? I am vaguely aware that LIMIT should be used, but I cannot get it to work properly. Thanks.

SQLITE:
SELECT *
FROM your_table
LIMIT 5;
Here is a DEMO
Depending on what you mean by "latest 5", you can add
ORDER BY your_column DESC
or
ORDER BY your_column ASC
between the FROM clause and the LIMIT keyword. Without an ORDER BY, the order of the returned rows is not guaranteed.
However, your query appears to use slightly different logic and selects everything that is not in the top 5, so I have reproduced that in SQLite like this:
SELECT *
FROM your_table
where id not in
(select id from your_table
LIMIT (SELECT COUNT(*) - 5 FROM your_table));
Here is a DEMO for that EXAMPLE.

If you want the rows with the top 5 rowids you can do it with an ORDER BY clause and LIMIT:
SELECT *
FROM tablename
ORDER BY rowid DESC
LIMIT 5
but the correct way to define latest is by using a column like a created_date.
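A minimal sketch of both orderings, run through Python's built-in sqlite3 module against an in-memory database. The column name BanReason, the created_date column, and the sample rows beyond Noah, Liam and Josh are assumptions made for illustration; the table in the question has no timestamp column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Bans (Name TEXT, BanReason TEXT, created_date TEXT)")

# Hypothetical sample data, inserted oldest first so later rows get higher rowids.
rows = [
    ("Emma", "Spamming", "2020-01-01"),
    ("Olivia", "Griefing", "2020-01-02"),
    ("Ava", "Spamming", "2020-01-03"),
    ("Mia", "Cheating", "2020-01-04"),
    ("Josh", "Cheating", "2020-01-05"),
    ("Liam", "Toxicity", "2020-01-06"),
    ("Noah", "Swearing", "2020-01-07"),
]
conn.executemany("INSERT INTO Bans VALUES (?, ?, ?)", rows)

# Option 1: use the implicit rowid as a proxy for insertion order.
by_rowid = [r[0] for r in conn.execute(
    "SELECT Name FROM Bans ORDER BY rowid DESC LIMIT 5")]

# Option 2: order by an explicit timestamp column (assumed to exist).
by_date = [r[0] for r in conn.execute(
    "SELECT Name FROM Bans ORDER BY created_date DESC LIMIT 5")]

print(by_rowid)
print(by_date)
```

Both queries agree here only because the rows were inserted in date order. The rowid version tracks insertion order as long as rowids are never reused, which is why a dedicated date column is the safer definition of "latest".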

Related

How to get an incremental "RowId" column in SELECT using ROW_NUMBER()

I've been trying to update a query on the DataExplorer that we use on our Gaming SE Site for keeping track of tags without excerpts to include an incremental row number in the results to make reading the returned values easier. There are a number of questions on here that discuss how to do this, such as this one and this one which appear to have worked for those users, but I can't seem to get it to work for my situation.
To be clear, I would like something like this:
RowId | TagName | Count | Easy List Formatting
----------------------------------------------
1 | Tag1 | 6 | 1. [tag:tag1] (6)
2 | Tag2 | 6 | 1. [tag:tag2] (6)
3 | Tag3 | 5 | 1. [tag:tag3] (5)
4 | Tag4 | 5 | 1. [tag:tag4] (5)
What I've come up with so far is this:
SELECT ROW_NUMBER() OVER(PARTITION BY TagInfo.[Count] ORDER BY TagInfo.TagName ASC) AS RowId, *
FROM
(
SELECT
TagName,
[Count],
concat('1. [tag:',concat(TagName,concat('] (', concat([Count],')')))) AS [Easy List Formatting]
FROM Tags
LEFT JOIN Posts pe on pe.Id = Tags.ExcerptPostId
LEFT JOIN TagSynonyms on SourceTagName = Tags.TagName
WHERE coalesce(len(pe.Body),0) = 0 and ApprovalDate is null
) AS TagInfo
ORDER BY TagInfo.[Count] DESC, TagInfo.TagName
This yields something close to what I want, but not quite. The RowId column increments, but once the Count column changes, it resets (presumably because of the PARTITION BY). But, if I remove the PARTITION BY, the RowId column becomes what appear to be random numbers.
Is what I want to do achievable given the way the tables are structured? If so, what should the SQL be?
To access the forked query, you can use this link. The original query (before my changes) can be found here if it helps in any way.
Removing the PARTITION BY is exactly what is needed. The reason that your numbers look random is that the ORDER BY of the outer query is different from the ORDER BY of your ROW_NUMBER(). All you have to do is make those the same, and the output of the sequence project will have the monotonically increasing value you expect.
Specifically:
SELECT ROW_NUMBER() OVER (ORDER BY TagInfo.[Count] DESC, TagInfo.TagName) AS RowId, *
FROM
(
...
) AS TagInfo
ORDER BY TagInfo.[Count] DESC, TagInfo.TagName
Now you aren't partitioning, and the two ORDER BY clauses match, so you'll get your expected output.
For what it's worth, you technically don't really even care about having an ORDER BY in the ROW_NUMBER(), you just want the same order as the final result set. In that case, you can trick the query engine like so by providing a meaningless ORDER BY clause in the ROW_NUMBER():
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS RowId
Boom, done!
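To see the effect of matching the two ORDER BY clauses, here is a small sketch that runs the same pattern in SQLite (3.25 or later, for window function support) through Python's sqlite3 module. SQLite stands in for SQL Server here, and the flat TagInfo table, the TagCount column name, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the TagInfo derived table from the question.
conn.execute("CREATE TABLE TagInfo (TagName TEXT, TagCount INTEGER)")
conn.executemany(
    "INSERT INTO TagInfo VALUES (?, ?)",
    [("Tag3", 5), ("Tag1", 6), ("Tag4", 5), ("Tag2", 6)],
)

# No PARTITION BY, and the window ORDER BY matches the outer ORDER BY,
# so RowId increases steadily down the result set.
query = """
SELECT ROW_NUMBER() OVER (ORDER BY TagCount DESC, TagName) AS RowId,
       TagName, TagCount
FROM TagInfo
ORDER BY TagCount DESC, TagName
"""
numbered = conn.execute(query).fetchall()
for row in numbered:
    print(row)
```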
A small workaround I have used is to add a column to the original table that increments by 1, for example:
ALTER TABLE ur_table ADD id INT IDENTITY(1,1)
GO
after that you do the query with order by column id
select * from ur_table (query) order by id

How can I fetch the last N rows, WITHOUT ordering the table

I have tables with several million rows and need to fetch the last rows of specific IDs,
for example the last row which has device_id = 123 and the last row which has device_id = 1234.
Because the tables are so huge and ordering takes so much time, is it possible to select the last 200 rows without ordering the table, then order just those 200 and fetch the rows I need?
How would I do that?
Thank you in advance for your help!
UPDATE
My PostgreSQL version is 9.2.1
sample data:
time device_id data data ....
"2013-03-23 03:58:00-04" | "001EC60018E36" | 66819.59 | 4.203
"2013-03-23 03:59:00-04" | "001EC60018E37" | 64277.22 | 4.234
"2013-03-23 03:59:00-04" | "001EC60018E23" | 46841.75 | 2.141
"2013-03-23 04:00:00-04" | "001EC60018E21" | 69697.38 | 4.906
"2013-03-23 04:00:00-04" | "001EC600192524"| 69452.69 | 2.844
"2013-03-23 04:01:00-04" | "001EC60018E21" | 69697.47 | 5.156
....
See SQLFiddle of this data
So if device_id = 001EC60018E21
I would want the most recent row with that device_id.
It is guaranteed that the last row with that device_id is the row I want, but it may or may not be the last row of the table.
Personally I'd create a composite index on device_id and descending time:
CREATE INDEX table1_deviceid_time ON table1("device_id","time" DESC);
then I'd use a subquery to find the highest time for each device_id and join the subquery results against the main table on device_id and time to find the relevant data, eg:
SELECT t1."device_id", t1."time", t1."data", t1."data1"
FROM Table1 t1
INNER JOIN (
SELECT t1b."device_id", max(t1b."time") FROM Table1 t1b GROUP BY t1b."device_id"
) last_ids("device_id","time")
ON (t1."device_id" = last_ids."device_id"
AND t1."time" = last_ids."time");
See this SQLFiddle.
It might be helpful to maintain a trigger-based materialized view of the highest timestamp for each device ID. However, this will cause concurrency issues if more than one connection can insert data for a given device ID, due to the connections fighting for update locks. It's also a pain if you don't know when new device IDs will appear, as you have to do an upsert, something that's very inefficient and clumsy. Additionally, the extra write load and autovacuum work created by the summary table may not be worth it; it might be better to just pay the price of the more expensive query.
BTW, time is a terrible name for a column because it's a built-in data type name. Use something more appropriate if you can.
The general way to get the "last" row for each device_id looks like this.
select *
from Table1
inner join (select device_id, max(time) max_time
from Table1
group by device_id) T2
on Table1.device_id = T2.device_id
and Table1.time = T2.max_time;
Getting the "last" 200 device_id numbers without using an ORDER BY isn't really practical, but it's not clear why you might want to do that in the first place. If 200 is an arbitrary number, then you can get better performance by taking a subset of the table that's based on an arbitrary time instead.
select *
from Table1
inner join (select device_id, max(time) max_time
from Table1
where time > '2013-03-23 12:03'
group by device_id) T2
on Table1.device_id = T2.device_id
and Table1.time = T2.max_time;
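The "latest row per device_id" join can be tried out with a small sketch like the following. It uses SQLite through Python's sqlite3 module as a stand-in for PostgreSQL, stores timestamps as ISO-8601 text (which sorts chronologically), and keeps only one data column; those choices and the reduced sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Table1 (device_id TEXT, "time" TEXT, data REAL)')
conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?)", [
    ("001EC60018E36", "2013-03-23 03:58:00", 66819.59),
    ("001EC60018E21", "2013-03-23 04:00:00", 69697.38),
    ("001EC60018E21", "2013-03-23 04:01:00", 69697.47),
])

# Composite index so the per-device maximum can be found without a full sort.
conn.execute(
    'CREATE INDEX table1_deviceid_time ON Table1 (device_id, "time" DESC)')

# Find the highest time per device, then join back to fetch the full row.
latest = conn.execute('''
    SELECT t1.device_id, t1."time", t1.data
    FROM Table1 t1
    INNER JOIN (
        SELECT device_id, MAX("time") AS max_time
        FROM Table1
        GROUP BY device_id
    ) last_ids
      ON t1.device_id = last_ids.device_id
     AND t1."time" = last_ids.max_time
    ORDER BY t1.device_id
''').fetchall()
for row in latest:
    print(row)
```

If two rows for the same device share the maximum time, the join returns both of them, so a tie-breaking column would be needed in that case.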

How to add aggregate value to SELECT?

I'm selecting data from multiple tables and I also need to get maximum "timestamp" on those tables. I will need that to create custom cache control.
tbl_name tbl_surname
id | name id | surname
--------- ------------
0 | John 0 | Doe
1 | Jane 1 | Tully
... ...
I have following query:
SELECT name, surname FROM tbl_name, tbl_surname WHERE tbl_name.id = tbl_surname.id
and I need to add following info to result set:
SELECT MAX(ora_rowscn) FROM (SELECT ora_rowscn FROM tbl_name
UNION ALL
SELECT ora_rowscn FROM tbl_surname);
I was trying to use UNION but I get error - mixing group and not single group data - or something like that, I know why I cannot use the union.
I don't want to split this into 2 calls, because I need the timestamp of the current snapshot I took from DB for my cache management. And between select and the call for MAX the DB could change.
Here is result I want:
John | Doe | 123456
Jane | Tully | 123456
where 123456 is approximate time of last change (insert, update, delete) of tables tbl_name and tbl_surname.
I have read only access to DB, so I cannot create triggers, stored procedures, extra tables etc...
Thanks for any suggestions.
EDIT: The value *ora_rowscn* is assigned per block of rows. So in one table this value can differ per row. I need the maximal value from both (all) tables involved in query.
Try:
SELECT name,
surname,
max(greatest(tbl_name.ora_rowscn, tbl_surname.ora_rowscn)) over () as max_rowscn
FROM tbl_name, tbl_surname
WHERE tbl_name.id = tbl_surname.id
There's no need to aggregate here - just include both ora_rowscn values in your query and take the max:
SELECT
n.name,
n.ora_rowscn as n_ora_rowscn,
s.surname,
s.ora_rowscn as s_ora_rowscn,
greatest(n.ora_rowscn, s.ora_rowscn) as last_ora_rowscn
FROM tbl_name n
join tbl_surname s on n.id = s.id
BTW, I've replaced your old-style joins with ANSI style - better readable, IMHO.
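The two answers above return different things: the first gives one maximum shared by every row, the second gives a per-row maximum. The sketch below illustrates the difference in SQLite (3.25 or later) through Python's sqlite3 module. It is only an analogy, not Oracle: the ordinary scn column is a hypothetical stand-in for the ora_rowscn pseudocolumn, SQLite's two-argument MAX(a, b) stands in for GREATEST(a, b), and the scn values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# scn is a hypothetical ordinary column standing in for Oracle's ora_rowscn.
conn.executescript("""
CREATE TABLE tbl_name (id INTEGER, name TEXT, scn INTEGER);
CREATE TABLE tbl_surname (id INTEGER, surname TEXT, scn INTEGER);
INSERT INTO tbl_name VALUES (0, 'John', 100), (1, 'Jane', 120);
INSERT INTO tbl_surname VALUES (0, 'Doe', 110), (1, 'Tully', 105);
""")

# row_scn: larger of the two values for that joined row only.
# max_scn: largest value across the whole result set, repeated on every row.
result = conn.execute("""
    SELECT n.name, s.surname,
           MAX(n.scn, s.scn) AS row_scn,
           MAX(MAX(n.scn, s.scn)) OVER () AS max_scn
    FROM tbl_name n
    JOIN tbl_surname s ON n.id = s.id
    ORDER BY n.id
""").fetchall()
for row in result:
    print(row)
```

For the cache-control use case in the question, the max_scn column is the one that matches the desired output, since it carries the same value on every row.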

Get last record of a table in Postgres

I'm using Postgres and cannot manage to get the last record of my table:
my_query = client.query("SELECT timestamp,value,card from my_table");
How can I do that, knowing that timestamp is a unique identifier of the record?
If under "last record" you mean the record which has the latest timestamp value, then try this:
my_query = client.query("
SELECT TIMESTAMP,
value,
card
FROM my_table
ORDER BY TIMESTAMP DESC
LIMIT 1
");
you can use
SELECT timestamp, value, card
FROM my_table
ORDER BY timestamp DESC
LIMIT 1
assuming you want also to sort by timestamp?
Easy way: ORDER BY in conjunction with LIMIT
SELECT timestamp, value, card
FROM my_table
ORDER BY timestamp DESC
LIMIT 1;
However, LIMIT is not standard, and, as Wikipedia notes, the SQL standard's core functionality does not explicitly define a default sort order for NULLs. Finally, only one row is returned when several records share the maximum timestamp.
Relational way:
The typical way of doing this is to check that no row has a higher timestamp than any row we retrieve.
SELECT timestamp, value, card
FROM my_table t1
WHERE NOT EXISTS (
SELECT *
FROM my_table t2
WHERE t2.timestamp > t1.timestamp
);
It is my favorite solution, and the one I tend to use. The drawback is that the intent is not immediately clear at a glance.
Instructive way: MAX
To circumvent this, one can use MAX in the subquery instead of the correlation.
SELECT timestamp, value, card
FROM my_table
WHERE timestamp = (
SELECT MAX(timestamp)
FROM my_table
);
But without an index, two passes over the data will be necessary, whereas the previous query can find the solution with only one scan. That said, we should not take performance into consideration when designing queries unless necessary, as we can expect optimizers to improve over time. However, this particular kind of query is quite common.
Show off way: Windowing functions
I don't recommend doing this, but maybe you can make a good impression on your boss or something ;-)
SELECT DISTINCT
first_value(timestamp) OVER w,
first_value(value) OVER w,
first_value(card) OVER w
FROM my_table
WINDOW w AS (ORDER BY timestamp DESC);
Actually this has the virtue of showing that a simple query can be expressed in a wide variety of ways (there are several others I can think of), and that picking one or the other form should be done according to several criteria such as:
portability (Relational/Instructive ways)
efficiency (Relational way)
expressiveness (Easy/Instructive way)
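The practical difference between the first three forms shows up when several rows share the maximum timestamp. The following sketch runs them in SQLite through Python's sqlite3 module as a stand-in for Postgres; the integer timestamps and the tied sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE my_table ("timestamp" INTEGER, value TEXT, card TEXT)')
# Two rows deliberately share the highest timestamp (3).
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", [
    (1, "a", "x"), (3, "b", "y"), (3, "c", "z"), (2, "d", "w")])

# Easy way: exactly one row, even when there is a tie.
limit_rows = conn.execute(
    'SELECT value FROM my_table ORDER BY "timestamp" DESC LIMIT 1').fetchall()

# Relational way: every row for which no later row exists.
not_exists_rows = conn.execute('''
    SELECT value FROM my_table t1
    WHERE NOT EXISTS (
        SELECT 1 FROM my_table t2 WHERE t2."timestamp" > t1."timestamp")
    ORDER BY value''').fetchall()

# Instructive way: every row whose timestamp equals the maximum.
max_rows = conn.execute('''
    SELECT value FROM my_table
    WHERE "timestamp" = (SELECT MAX("timestamp") FROM my_table)
    ORDER BY value''').fetchall()

print(limit_rows, not_exists_rows, max_rows)
```

When the timestamp is unique, as in the question, all three return the same single row.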
If your table has no Id such as integer auto-increment, and no timestamp, you can still get the last row of a table with the following query.
select * from <tablename> offset ((select count(*) from <tablename>)-1)
For example, that could allow you to search through an updated flat file, find/confirm where the previous version ended, and copy the remaining lines to your table.
The last inserted record can be queried using this assuming you have the "id" as the primary key:
SELECT timestamp,value,card FROM my_table WHERE id=(select max(id) from my_table)
Assuming every new row inserted will use the highest integer value for the table's id.
If you will accept a tip, create an id column of type serial in this table. The default of this field will be:
nextval('table_name_field_seq'::regclass).
You can then use a query to fetch the last record. Using your example:
pg_query($connection, "SELECT currval('table_name_field_seq') AS id");
I hope this tip helps you.
To get the last row,
Get Last row in the sorted order: In case the table has a column specifying time/primary key,
Using LIMIT clause
SELECT * FROM USERS ORDER BY CREATED_TIME DESC LIMIT 1;
Using FETCH clause - Reference
SELECT * FROM USERS ORDER BY CREATED_TIME DESC FETCH FIRST ROW ONLY;
Get Last row in the rows insertion order: In case the table has no columns specifying time/any unique identifiers
Using CTID system column, where ctid represents the physical location of the row in a table - Reference
SELECT * FROM USERS WHERE CTID = (SELECT MAX(CTID) FROM USERS);
Consider the following table,
userid |username | createdtime |
1 | A | 1535012279455 |
2 | B | 1535042279423 | //as per created time, this is the last row
3 | C | 1535012279443 |
4 | D | 1535012212311 |
5 | E | 1535012254634 | //as per insertion order, this is the last row
Queries 1 and 2 return,
userid |username | createdtime |
2 | B | 1535042279423 |
while 3 returns,
userid |username | createdtime |
5 | E | 1535012254634 |
Note: when an old row is updated, the old row version is removed and the updated data is inserted as a new row in the table. As a result, query 3 returns the tuple that was modified most recently.
Now updating a row, using
UPDATE USERS SET USERNAME = 'Z' WHERE USERID='3'
the table becomes as,
userid |username | createdtime |
1 | A | 1535012279455 |
2 | B | 1535042279423 |
4 | D | 1535012212311 |
5 | E | 1535012254634 |
3 | Z | 1535012279443 |
Now query 3 returns,
userid |username | createdtime |
3 | Z | 1535012279443 |
Use the following
SELECT timestamp, value, card
FROM my_table
ORDER BY timestamp DESC
LIMIT 1
These are all good answers but if you want an aggregate function to do this to grab the last row in the result set generated by an arbitrary query, there's a standard way to do this (taken from the Postgres wiki, but should work in anything conforming reasonably to the SQL standard as of a decade or more ago):
-- Create a function that always returns the last non-NULL item
CREATE OR REPLACE FUNCTION public.last_agg ( anyelement, anyelement )
RETURNS anyelement LANGUAGE SQL IMMUTABLE STRICT AS $$
SELECT $2;
$$;
-- And then wrap an aggregate around it
CREATE AGGREGATE public.LAST (
sfunc = public.last_agg,
basetype = anyelement,
stype = anyelement
);
It's usually preferable to do select ... limit 1 if you have a reasonable ordering, but this is useful if you need to do this within an aggregate and would prefer to avoid a subquery.
See also this question for a case where this is the natural answer.
The column name plays an important role in the descending order:
select <COLUMN_NAME1, COLUMN_NAME2> from <TABLENAME> ORDER BY <COLUMN_NAME THAT MENTIONS TIME> DESC LIMIT 1;
For example: The below-mentioned table(user_details) consists of the column name 'created_at' that has timestamp for the table.
SELECT userid, username FROM user_details ORDER BY created_at DESC LIMIT 1;
In Oracle SQL,
select * from (select row_number() over (order by rowid desc) rn, emp.* from emp) where rn=1;
select * from table_name LIMIT 1;

I DISTINCTly hate MySQL (help building a query)

This is straightforward, I believe:
I have a table with 30,000 rows. When I SELECT DISTINCT 'location' FROM myTable it returns 21,000 rows, about what I'd expect, but it only returns that one column.
What I want is to move those to a new table, but the whole row for each match.
My best guess is something like SELECT * from (SELECT DISTINCT 'location' FROM myTable) or something like that, but it says I have a vague syntax error.
Is there a good way to grab the rest of each DISTINCT row and move it to a new table all in one go?
SELECT * FROM myTable GROUP BY `location`
or if you want to move to another table
CREATE TABLE foo AS SELECT * FROM myTable GROUP BY `location`
DISTINCT applies to the entire row returned. So you can simply use
SELECT DISTINCT * FROM myTable GROUP BY `location`
Using Distinct on a single column doesn't make a lot of sense. Let's say I have the following simple set
-id- -location-
1 store
2 store
3 home
if there were some sort of query that returned all columns, but just distinct on location, which row would be returned? 1 or 2? Should it just pick one at random? Because of this, DISTINCT works for all columns in the result set returned.
Well, first you need to decide what you really want returned.
The problem is that, presumably, for some of the location values in your table there are different values in the other columns even when the location value is the same:
Location OtherCol StillOtherCol
Place1 1 Fred
Place1 89 Fred
Place1 1 Joe
In that case, which of the three rows do you want to select? When you talk about a DISTINCT Location, you're condensing those three rows of different data into a single row, there's no meaning to moving the original rows from the original table into a new table since those original rows no longer exist in your DISTINCT result set. (If all the other columns are always the same for a given Location, your problem is easier: Just SELECT DISTINCT * FROM YourTable).
If you don't care which values come from the other columns you can use a (bad, IMHO) MySQL extension to SQL and do:
SELECT * FROM YourTable GROUP BY Location
which will give a result set with one row per location and values for the other columns derived from the original data in an undefined fashion.
Multiple rows with identical values in all columns make no sense. OK, the question might be about a way to correct exactly that situation.
Considering this table, with id being the PK:
kram=# select * from foba;
id | no | name
----+----+---------------
2 | 1 | a
3 | 1 | b
4 | 2 | c
5 | 2 | a,b,c,d,e,f,g
you may extract a sample for every single no (:=location) by grouping over that column, and selecting the row with minimum PK (for example):
SELECT * FROM foba WHERE id IN (SELECT min (id) FROM foba GROUP BY no);
id | no | name
----+----+------
2 | 1 | a
4 | 2 | c
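Applied to the original question, the same MIN(id) pattern can copy one whole row per location into a new table in a single statement. This is a minimal sketch using SQLite through Python's sqlite3 module rather than MySQL; the id primary key, the note column, the deduped table name, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (id INTEGER PRIMARY KEY, location TEXT, note TEXT)")
conn.executemany("INSERT INTO myTable VALUES (?, ?, ?)", [
    (1, "store", "first"), (2, "store", "second"), (3, "home", "third")])

# Keep the row with the smallest id for each location and copy it,
# with all of its columns, into a new table.
conn.execute("""
    CREATE TABLE deduped AS
    SELECT * FROM myTable
    WHERE id IN (SELECT MIN(id) FROM myTable GROUP BY location)
""")
kept = conn.execute(
    "SELECT id, location, note FROM deduped ORDER BY id").fetchall()
print(kept)
```

Unlike the bare GROUP BY extension, this makes the choice of surviving row explicit: it is always the one with the lowest id per location.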