SQL Select Bottom Records

I have a query where I wish to retrieve the oldest X records. At present my query is something like the following:
SELECT Id, Title, Comments, CreatedDate
FROM MyTable
WHERE CreatedDate > @OlderThanDate
ORDER BY CreatedDate DESC
I know that normally I would remove the 'DESC' keyword to switch the order of the records; however, in this instance I still want the records ordered with the newest item first.
So I want to know if there is any way to write this query so that I get the oldest X items, sorted so that the newest of them comes first. I should add that my database is on SQL Server 2005.

Why not just use a subquery?
SELECT T1.*
FROM
(SELECT TOP X Id, Title, Comments, CreatedDate
FROM MyTable
WHERE CreatedDate > @OlderThanDate
ORDER BY CreatedDate) T1
ORDER BY CreatedDate DESC

Embed the query. You take the top x when sorted in ascending order (i.e. the oldest) and then re-sort those in descending order ...
select *
from
(
SELECT top X Id, Title, Comments, CreatedDate
FROM MyTable
WHERE CreatedDate > @OlderThanDate
ORDER BY CreatedDate
) a
order by createddate desc
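If X comes from a variable rather than a literal, SQL Server 2005 and later accept an expression in parentheses after TOP. A minimal sketch, assuming hypothetical variables @X and @OlderThanDate with placeholder values (DECLARE and SET are kept separate because inline initialization in DECLARE only arrived in SQL Server 2008):

```sql
-- @X and @OlderThanDate are illustrative; the values are placeholders.
DECLARE @X int;
DECLARE @OlderThanDate datetime;
SET @X = 10;
SET @OlderThanDate = '20080101';

SELECT T1.Id, T1.Title, T1.Comments, T1.CreatedDate
FROM (
    -- Oldest @X rows: sort ascending, then keep the first @X
    SELECT TOP (@X) Id, Title, Comments, CreatedDate
    FROM MyTable
    WHERE CreatedDate > @OlderThanDate
    ORDER BY CreatedDate ASC
) AS T1
-- Re-sort only those rows so the newest of them comes first
ORDER BY T1.CreatedDate DESC;
```

The inner ORDER BY is only there to decide which rows TOP keeps; the order of the final result comes from the outer ORDER BY.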

Related

Get the top N rows by row count in GROUP BY

I'm querying a records table to find which users are my top record creators for certain record types. The basic starting point of my query looks something like this:
SELECT recordtype, createdby, COUNT(*)
FROM recordtable
WHERE recordtype in (...)
GROUP BY recordtype, createdby
ORDER BY recordtype, createdby DESC
But there are many users who have created records - I want to narrow this down further.
I added HAVING COUNT(*) > ..., but some record types only have a few records, while others have hundreds. If I do HAVING COUNT(*) > 10, I won't see that all 9 records of type "XYZ" were created by the same person, but I will have to scroll through every person that's created only 15, 30, 50, etc. of the 3,500 records of type "ABC."
I only want the top 5, 10, or so creators for each record type.
I've found a few questions that address the "select top N in group" part of the question, but I can't figure out how to apply them to what I need. The answers I could find are in cases where the "rank by" column is a value stored in the table, not an aggregate.
(Example: "what are the top cities in each country by population?", with data that looks like this:)
Country City Population
United States New York 123456789
United States Chicago 123456789
France Paris 123456789
I don't know how to apply the methods I've seen used to answer that (row_number(), mostly) to get the top N by COUNT(*).
Here is one way to get the top 10 rows in each group:
select * from(
select *, row_number() over (partition by recordtype order by cnt desc) rn
from (
SELECT recordtype, createdby, COUNT(*) cnt
FROM recordtable
WHERE recordtype in (...)
GROUP BY recordtype, createdby
)t
)t where rn <= 10
If I understand correctly, you want the top N records with the biggest count. You can achieve this with a subquery like the one below. I assume you are using MySQL, PostgreSQL, or Db2; in other database engines the LIMIT and OFFSET syntax may differ (SQL Server, for example, uses SELECT TOP n * FROM ...).
SELECT A.recordtype, A.createdby, A.total FROM (
SELECT recordtype, createdby, COUNT(*) as total
FROM recordtable
WHERE recordtype in (...)
GROUP BY recordtype, createdby
) AS A ORDER BY total DESC, recordtype, createdby
LIMIT 10 OFFSET 0
Limit is the number of records you want in the results page, and offset is the number of records to skip before taking the result page.
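As a rough illustration of that paging arithmetic, assuming a page size of 10 and MySQL/PostgreSQL-style LIMIT ... OFFSET syntax, the third page skips (3 - 1) * 10 = 20 rows. The WHERE recordtype IN (...) filter from the question is left out here so the sketch is complete on its own:

```sql
-- Page 3 with 10 rows per page: OFFSET = (page - 1) * page_size = 20
SELECT recordtype, createdby, COUNT(*) AS total
FROM recordtable
GROUP BY recordtype, createdby
ORDER BY total DESC, recordtype, createdby  -- tie-breakers keep pages stable
LIMIT 10 OFFSET 20;
```

Without the extra tie-breaker columns in ORDER BY, rows with equal counts could move between pages from one execution to the next.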
If you use SQL Server, it may look like this (there is also a way to apply an offset; see "SQL Server OFFSET and LIMIT"):
SELECT TOP 10 A.recordtype, A.createdby, A.total FROM (
SELECT recordtype, createdby, COUNT(*) as total
FROM recordtable
WHERE recordtype in (...)
GROUP BY recordtype, createdby
) AS A ORDER BY total DESC, recordtype, createdby
For a grouped result, take a look at this post: http://www.silota.com/docs/recipes/sql-top-n-group.html
So, to take the first 10 records in each group and not only the first one, I combined this answer with the linked post and the approach of @eshirvana:
SELECT * FROM (
SELECT *, row_number() OVER (PARTITION BY recordtype ORDER BY total DESC) rn
FROM (
SELECT recordtype, createdby, COUNT(*) as total
FROM recordtable
WHERE recordtype in (...)
GROUP BY recordtype, createdby
) t
) t WHERE rn <= 10
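Because window functions are evaluated after GROUP BY, the ranking can also be computed at the same query level as the aggregate, which removes one layer of nesting. A sketch, assuming an engine with window function support (SQL Server 2005+, PostgreSQL, MySQL 8+); the aliases cnt, rn and ranked are illustrative, and the WHERE recordtype IN (...) filter from the question is omitted:

```sql
SELECT recordtype, createdby, cnt
FROM (
    SELECT recordtype,
           createdby,
           COUNT(*) AS cnt,
           -- Rank creators within each record type by their row count
           ROW_NUMBER() OVER (PARTITION BY recordtype
                              ORDER BY COUNT(*) DESC, createdby) AS rn
    FROM recordtable
    GROUP BY recordtype, createdby
) AS ranked
WHERE rn <= 10
ORDER BY recordtype, cnt DESC;
```

Swapping ROW_NUMBER() for RANK() would keep creators that tie at the cut-off, at the cost of possibly returning more than 10 rows for a record type.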

Extract and concatenate the same field from multiple records in big query

I would like to be able to extract one field from multiple records from within a single table. For example, assuming I have a schema as follows
userId, eventTimestamp, theField
What I want to do is concatenate all instances of the field 'theField' into a single string for a given userId, ordered by eventTimestamp. And for an extra wrinkle, let's say I only want to include the fifty oldest records.
My first attempt was to try something like:
SELECT
userId,
eventTimestamp,
LEAD(theField,0) OVER (PARTITION BY userId ORDER BY eventTimestamp) AS step0,
LEAD(theField,1) OVER (PARTITION BY userId ORDER BY eventTimestamp) AS step1,
....,
LEAD(theField,50) OVER (PARTITION BY userId ORDER BY eventTimestamp) AS step50,
And then the next step was to wrap that first step up in another SELECT statement as follows:
SELECT userId, eventTimestamp, CONCAT(STRING(step0), STRING(step1),...,STRING(step50)) as concatenatedString
FROM [whateverDataset.whateverTable],
GROUP BY
userId, eventTimestamp
This approach doesn't work though because if I have more than 50 steps (which I do), then I end up getting multiple rows for each of those outer SELECT statements, basically N-50 rows, where N = the total number of records for a particular userId. A 'solution' to this would be to have a HAVING statement in the inner SELECT statement to limit itself to only reporting the first 50 records, but overall this seems like a rather cumbersome solution. In non-BigQuery variants of SQL the GROUP_CONCAT seems to be a good way to go forward, but it either doesn't work here or I lack the creativity to get it to work. Anyone have any suggestions?
Thanks,
Brad
For BigQuery Legacy SQL:
SELECT
userid, GROUP_CONCAT(theField) AS Fields
FROM (
SELECT
userid, eventTimestamp, theField,
ROW_NUMBER() OVER(PARTITION BY userid ORDER BY eventTimestamp) AS pos
FROM YourTable
ORDER BY eventTimestamp
)
WHERE pos < 51
GROUP BY userid
Please note: the inner ORDER BY does not guarantee the order of theField in GROUP_CONCAT. So far, in all the practical cases I have seen, the order is preserved, but test carefully.
For BigQuery Standard SQL:
Don't forget to uncheck the "Use Legacy SQL" checkbox under Show Options.
SELECT
userid,
(SELECT STRING_AGG(fields) FROM t.fields) AS fields
FROM (
SELECT
userid,
ARRAY(SELECT theField FROM t.fields ORDER BY eventTimestamp) fields
FROM (
SELECT
userid,
ARRAY_AGG(STRUCT(theField, eventTimestamp)) fields
FROM (
SELECT
userid,
eventTimestamp,
theField,
ROW_NUMBER() OVER(PARTITION BY userid ORDER BY eventTimestamp) AS pos
FROM YourTable
)
WHERE pos < 51
GROUP BY userid
) t
) t
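In Standard SQL, STRING_AGG also accepts its own ORDER BY and LIMIT, which may make the ordering explicit without the intermediate arrays. A minimal sketch, assuming theField is a STRING (otherwise it would need a CAST) and reusing the placeholder table name YourTable; the comma delimiter is an arbitrary choice:

```sql
SELECT
  userId,
  -- Concatenate at most 50 values per user, ordered oldest first
  STRING_AGG(theField, ',' ORDER BY eventTimestamp ASC LIMIT 50) AS fields
FROM YourTable
GROUP BY userId
```

It is worth checking the result against a small sample of users to confirm that the LIMIT keeps the oldest fifty values as intended.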

How to select the latest record from a returned set of records?

I have a table in SQL Server 2008, and I want to get the latest record based on its date.
E.g. let's say I have records with some columns and a Date column which contains the creation date of the record.
Let's say the date column contains the following dates: 22-Dec, 23-Dec, 24-Dec, 25-Dec, 26-Dec.
Now, I want to fetch the records dated before 25 Dec, but I want only the latest of them. If I write the query
select * from Table where CreateDate < '25-Dec-2012'
then it will return 3 records, but I want the latest record from them, i.e. the 24 Dec record.
How do I do it?
Add TOP 1 to your query and sort by CreateDate in descending order, so that the latest matching record comes first. Without an ORDER BY the row order is not guaranteed, so the ORDER BY CreateDate DESC is what makes this work:
SELECT TOP 1 *
FROM Table
WHERE CreateDate < '25-Dec-2012'
ORDER BY CreateDate DESC
Use this; it works fine (checked manually):
select top 1 *
from TableName
where Createdate < '25-Dec-2012'
order by Createdate desc
Please try:
select
top 1 *
from Table
where CreateDate < '25-Dec-2012'
order by CreateDate desc
Use TOP:
SELECT TOP 1 *
FROM TABLENAME
WHERE CreateDate < '25-Dec-2012'
ORDER BY CreateDate DESC
The OP said there's only one row per date, so the row with 24-Dec-2012 would be the desired row.
select * from Table where CreateDate = '24-Dec-2012'
You can include TOP like this:
select top 1 * from Table where CreateDate < '25-Dec-2012' order by CreateDate desc;
You have to use ORDER BY and LIMIT (note that LIMIT is MySQL/PostgreSQL syntax; SQL Server uses TOP instead):
select * from Table where CreateDate < '25-Dec-2012' order by CreateDate DESC LIMIT 1

Order by clause is changing my result set

I know why it's happening but I want to find a way around it if possible.
For example I have 4 rows in my database and each has a datetime (which are all different). What I want to do is get the latest 2 rows but use ascending order, so that the oldest is at the top of the result set.
I currently am using
SELECT TOP 2 *
FROM mytable
WHERE someid = @something
ORDER BY added DESC
This gets me the correct rows but in the wrong order. If I change the DESC to ASC it gets the right order, but the older two of the four rows. This all makes sense to me but is there a way around it?
EDIT: Solved with Elliot's answer below. The syntax would not work without setting an alias for the derived table however. Here is the result
SELECT * FROM
(SELECT TOP 2 * FROM mytable WHERE someid = @something ORDER BY added DESC) AS tbl
ORDER BY tbl.added ASC
I'd think one brute-force solution would be:
SELECT *
FROM (SELECT TOP 2 * FROM mytable WHERE someid = @something ORDER BY added DESC)
ORDER BY added
The following also works, and adding a PARTITION BY to the OVER clause would turn it into "top 2 per something":
SELECT *
FROM
(
SELECT *, ROW_NUMBER() OVER (ORDER BY added DESC) as rn
FROM mytable
WHERE someid = @something
) foo
WHERE rn <= 2
ORDER BY added
Note that the derived table requires an alias
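The "top 2 per something" variant mentioned above could be sketched as follows, assuming the grouping column is someid. The WHERE filter on a single id is dropped so that every someid gets its own latest two rows, returned oldest first:

```sql
SELECT *
FROM
(
    SELECT *,
           -- Numbering restarts at 1 for each someid, newest row first
           ROW_NUMBER() OVER (PARTITION BY someid ORDER BY added DESC) AS rn
    FROM mytable
) AS foo
WHERE rn <= 2
ORDER BY someid, added;
```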

How to use DISTINCT and ORDER BY in same SELECT statement?

After executing the following statement:
SELECT Category FROM MonitoringJob ORDER BY CreationDate DESC
I am getting the following values from the database:
test3
test3
bildung
test4
test3
test2
test1
but I want the duplicates removed, like this:
bildung
test4
test3
test2
test1
I tried to use DISTINCT but it doesn't work with ORDER BY in one statement. Please help.
Important:
I tried it with:
SELECT DISTINCT Category FROM MonitoringJob ORDER BY CreationDate DESC
it doesn't work.
Order by CreationDate is very important.
The problem is that the columns used in the ORDER BY aren't specified in the DISTINCT. To do this, you need to use an aggregate function to sort on, and use a GROUP BY to make the DISTINCT work.
Try something like this:
SELECT DISTINCT Category, MAX(CreationDate)
FROM MonitoringJob
GROUP BY Category
ORDER BY MAX(CreationDate) DESC, Category
Extended sort key columns
What you want to do doesn't work because of the logical order of operations in SQL (as I've elaborated in this blog post), which, for your first query, is (simplified):
FROM MonitoringJob
SELECT Category, CreationDate (i.e. add a so-called extended sort key column)
ORDER BY CreationDate DESC
SELECT Category (i.e. remove the extended sort key column again from the result)
So, thanks to the SQL standard extended sort key column feature, it is totally possible to order by something that is not in the SELECT clause, because it is being temporarily added to it behind the scenes.
So, why doesn't this work with DISTINCT?
If we add the DISTINCT operation, it would be added between SELECT and ORDER BY:
FROM MonitoringJob
SELECT Category, CreationDate
DISTINCT
ORDER BY CreationDate DESC
SELECT Category
But now, with the extended sort key column CreationDate, the semantics of the DISTINCT operation has been changed, so the result will no longer be the same. This is not what we want, so both the SQL standard, and all reasonable databases forbid this usage.
Workarounds
It can be emulated with standard syntax as follows
SELECT Category
FROM (
SELECT Category, MAX(CreationDate) AS CreationDate
FROM MonitoringJob
GROUP BY Category
) t
ORDER BY CreationDate DESC
Or, just simply (in this case), as shown also by Prutswonder
SELECT Category, MAX(CreationDate) AS CreationDate
FROM MonitoringJob
GROUP BY Category
ORDER BY CreationDate DESC
I have blogged about SQL DISTINCT and ORDER BY more in detail here.
If the output of MAX(CreationDate) is not wanted - like in the example of the original question - the only answer is the second statement of Prashant Gupta's answer:
SELECT [Category] FROM [MonitoringJob]
GROUP BY [Category] ORDER BY MAX([CreationDate]) DESC
Explanation: you can't use the ORDER BY clause in an inline function, so the statement in the answer of Prutswonder is not useable in this case, you can't put an outer select around it and discard the MAX(CreationDate) part.
Just use this code, If you want values of [Category] and [CreationDate] columns
SELECT [Category], MAX([CreationDate]) FROM [MonitoringJob]
GROUP BY [Category] ORDER BY MAX([CreationDate]) DESC
Or use this code, If you want only values of [Category] column.
SELECT [Category] FROM [MonitoringJob]
GROUP BY [Category] ORDER BY MAX([CreationDate]) DESC
Either way, you'll get each distinct Category exactly once.
2) Order by CreationDate is very important
The original results indicated that "test3" had multiple results...
It's very easy to start using MAX all the time to remove duplicates in Group By's... and forget or ignore what the underlying question is...
The OP presumably realised that using MAX was giving him the last "created" and using MIN would give the first "created"...
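if object_id('tempdb..#tempreport') is not null
begin
drop table #tempreport
end
create table #tempreport (
Category nvarchar(510),
CreationDate datetime ) -- datetime is assumed here; match the type of MonitoringJob.CreationDate
-- one row per Category with its latest CreationDate (use MIN for the earliest)
insert into #tempreport (Category, CreationDate)
select Category, MAX(CreationDate) from MonitoringJob (nolock) group by Category
select Category from #tempreport ORDER BY CreationDate DESC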
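DISTINCT on its own does not guarantee any particular order (some engines happen to return sorted output as a side effect of how they remove duplicates). If you want to sort by Category in descending order, use: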
SELECT DISTINCT Category
FROM MonitoringJob
ORDER BY Category DESC
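If you want to sort records based on the CreationDate field, then this field must be in the select list. Note that DISTINCT then applies to the (Category, CreationDate) pair, so a Category can appear more than once: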
SELECT DISTINCT Category, creationDate
FROM MonitoringJob
ORDER BY CreationDate DESC
You can use CTE:
WITH DistinctMonitoringJob AS (
SELECT DISTINCT Category Distinct_Category FROM MonitoringJob
)
SELECT Distinct_Category
FROM DistinctMonitoringJob
ORDER BY Distinct_Category DESC
By subquery, it should work:
SELECT distinct(Category) from MonitoringJob where Category in(select Category from MonitoringJob order by CreationDate desc);
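We can do this with a subquery, as long as the subquery also returns a CreationDate to sort by.
Here is the query:
SELECT Tbl.Category FROM (
SELECT Category, MAX(CreationDate) AS CreationDate FROM MonitoringJob GROUP BY Category
) AS Tbl
ORDER BY Tbl.CreationDate DESC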
Try next, but it's not useful for huge data...
SELECT DISTINCT Cat FROM (
SELECT Category as Cat FROM MonitoringJob ORDER BY CreationDate DESC
);
It can be done using inner query Like this
$query = "SELECT *
FROM (SELECT Category
FROM currency_rates
ORDER BY id DESC) as rows
GROUP BY currency";
SELECT DISTINCT Category FROM MonitoringJob ORDER BY Category ASC