SQL Server: selecting the most recent rows

I'm sure this is easy but I have googled a lot and searched.
Ok, I have a table WITHOUT dates etc with 100000000000000000 records.
I want to see the latest entries, i.e.
Select top 200 *
from table
BUT I want to see the latest entries. Is there a rowidentifier that I could use in a table?
ie
select top 200 *
from table
order by rowidentifier DESC
Thanks

"Is there a row identifier that I could use in a table, i.e. select top 200 * from table order by rowidentifier desc?"
As already stated in the comments, there is not. The best approach is to have an identity, a timestamp, or some other column that identifies the record. Here is an alternative using EXCEPT to get what you need, but the execution plan isn't the best... Play around with it and adjust as needed.
--testing purposes...
DECLARE @tbl TABLE(FirstName VARCHAR(50))
DECLARE @count INT = 0

WHILE (@count <= 12000)
BEGIN
    INSERT INTO @tbl(FirstName)
    SELECT 'RuPaul ' + CAST(@count AS VARCHAR(5))

    SET @count += 1
END
--adjust how many records you would like; the example returns 200
SELECT *
FROM @tbl
EXCEPT
(SELECT TOP (SELECT COUNT(*) - 200 FROM @tbl) * FROM @tbl)

--faster than above
DECLARE @tblCount AS INT = (SELECT COUNT(*) FROM @tbl) - 200

SELECT *
FROM @tbl
EXCEPT
(SELECT TOP (@tblCount) * FROM @tbl)
On another note, you could create another table variable that has an ID column alongside the other columns, and insert the records you need into it. Then you can perform other operations against it, for example ORDER BY.
What you could do
ALTER TABLE TABLENAME
ADD ID INT IDENTITY
This will add a new ID column to the table and automatically populate it with identity values, so you then have an identifier you can use. Note that the values assigned to pre-existing rows are not guaranteed to reflect their original insert order; only rows inserted afterwards are reliably ordered by ID.
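A minimal sketch of how the latest entries could then be read (TableName is a placeholder for your table):

SELECT TOP (200) *
FROM TableName
ORDER BY ID DESC -- newest rows first, by the new identity column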

Nope, in short, there is none if you don't have a column dedicated as one (i.e. an IDENTITY, a SEQUENCE, or something similar). If you did, then you could get an ordered result back.

Related

How to make efficient pagination with total count

We have a web application which helps organize biological experiments (users describe an experiment and upload experiment data). On the main page, we show the first 10 experiments and then, below, Previous Next 1 2 3 .. 30.
It bugs me how to make the total count and the pagination efficient. Currently:
select count(id) from experiments; -- not very efficient on large datasets
But how does this scale when dealing with large record counts (> 200,000)? I tried importing random experiments into the table, and it still performs quite ok (0.6 s for 300,000 experiments).
The other alternative I thought about is to add an additional statistics table (columns tableName, recordsCount). After each insert into experiments I would increase recordsCount in statistics (this means inserting into one table and updating the other, using a SQL transaction of course). Vice versa for deletes (recordsCount--).
For pagination the most efficient way is to do WHERE id > last_id, since SQL uses the index of course. Is there any better way?
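A minimal sketch of that keyset-style query (assuming a page size of 10 and that @last_id holds the id of the last row already shown):

SELECT TOP (10) *
FROM experiments
WHERE id > @last_id
ORDER BY id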
In case results are to be filtered, e.g. select * from experiment where name like 'name%', the statistics-table option fails. We would then need to get the total count as: select count(id) from experiment where name like 'name%'.
The application was developed using Laravel 3, in case it makes any difference.
I would like to develop pagination that always performs the same: the number of records must not affect the pagination or the total count.
Try a query like the one below:
CREATE PROCEDURE [GetUsers]
(
    @Inactive BIT = NULL,
    @Name NVARCHAR(500),
    @Culture VARCHAR(5) = NULL,
    @SortExpression VARCHAR(50),
    @StartRowIndex INT,
    @MaxRowIndex INT,
    @Count INT OUTPUT
)
AS
BEGIN
    SELECT ROW_NUMBER()
           OVER
           (
               ORDER BY
                   CASE WHEN @SortExpression = 'Name' THEN [User].[Name] END,
                   CASE WHEN @SortExpression = 'Name DESC' THEN [User].[Name] END DESC
           ) AS RowIndex, [User].*
    INTO #tmpTable
    FROM [User] WITH (NOLOCK)
    WHERE (@Inactive IS NULL OR [User].[Inactive] = @Inactive)
      AND (@Culture IS NULL OR [User].[DefaultCulture] = @Culture)
      AND [User].Name LIKE '%' + @Name + '%'

    SELECT *
    FROM #tmpTable
    WHERE #tmpTable.RowIndex > @StartRowIndex
      AND #tmpTable.RowIndex < (@StartRowIndex + @MaxRowIndex + 1)

    SELECT @Count = COUNT(*) FROM #tmpTable

    IF OBJECT_ID('tempdb..#tmpTable') IS NOT NULL DROP TABLE #tmpTable;
END
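As a side note (not part of the answer above): on SQL Server 2005 and later you can also return the filtered total alongside each page row with COUNT(*) OVER (), avoiding the temp table. A rough sketch reusing the same column and parameter names:

SELECT *
FROM (
    SELECT [User].*,
           ROW_NUMBER() OVER (ORDER BY [User].[Name]) AS RowIndex,
           COUNT(*) OVER () AS TotalRows
    FROM [User]
    WHERE (@Inactive IS NULL OR [User].[Inactive] = @Inactive)
      AND [User].Name LIKE '%' + @Name + '%'
) AS paged
WHERE paged.RowIndex > @StartRowIndex
  AND paged.RowIndex <= @StartRowIndex + @MaxRowIndex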

Finding strings with duplicate letters inside

Can somebody help me with this little task? What I need is a stored procedure that can find duplicate letters (in a row) in a string from table "a" and then make a new table "b" with just the ids of the strings that have duplicate letters.
Something like this:
Table A
ID Name
1 Matt
2 Daave
3 Toom
4 Mike
5 Eddie
And from that table I can see that Daave, Toom, and Eddie have duplicate letters in a row, and I would like to make a new table listing their IDs only. Something like:
Table B
ID
2
3
5
Only 2, 3, 5, because those are the IDs of the strings that have duplicate letters in their names.
I hope this is understandable and would be very grateful for any help.
In your answer with the stored procedure, you have two mistakes: one is a missing space between the column name and the LIKE clause, the second is missing single quotes around the search parameter.
First create a user-defined scalar function which returns 1 if a string contains consecutive duplicate letters:
EDITED
CREATE FUNCTION FindDuplicateLetters
(
    @String NVARCHAR(50)
)
RETURNS BIT
AS
BEGIN
    DECLARE @Result BIT = 0
    DECLARE @Counter INT = 1

    WHILE (@Counter <= LEN(@String) - 1)
    BEGIN
        IF (ASCII(SUBSTRING(@String, @Counter, 1)) = ASCII(SUBSTRING(@String, @Counter + 1, 1)))
        BEGIN
            SET @Result = 1
            BREAK
        END

        SET @Counter = @Counter + 1
    END

    RETURN @Result
END
GO
After the function is created, just call it from a simple SELECT query like the following:
SELECT *
FROM
(
    SELECT *,
           dbo.FindDuplicateLetters(ColumnName) AS Duplicates
    FROM TableName
) AS a
WHERE a.Duplicates = 1
With this combination, you will get just the rows that have duplicate letters.
In any version of SQL, you can do this with a brute force approach:
select *
from t
where t.name like '%aa%' or
t.name like '%bb%' or
. . .
t.name like '%zz%'
If you have a case sensitive collation, then use:
where lower(t.name) like '%aa%' or
. . .
Here's one way.
First create a table of numbers
CREATE TABLE dbo.Numbers
(
number INT PRIMARY KEY
);
INSERT INTO dbo.Numbers
SELECT number
FROM master..spt_values
WHERE type = 'P'
AND number > 0;
Then with that in place you can use
SELECT *
FROM TableA
WHERE EXISTS (SELECT *
FROM dbo.Numbers
WHERE number < LEN(Name)
AND SUBSTRING(Name, number, 1) = SUBSTRING(Name, number + 1, 1))
Though this is an old post, it's worth posting a solution that will be faster than a brute force approach or one that uses a scalar UDF (which generally drags down performance). Using NGrams8K this is rather simple.
--sample data
declare @table table (id int identity primary key, [name] varchar(20));
insert @table([name]) values ('Mattaa'),('Daave'),('Toom'),('Mike'),('Eddie');

-- solution #1
select id
from @table
cross apply dbo.NGrams8k([name],1)
where charindex(replicate(token,2), [name]) > 0
group by id;

-- solution #2 (SQL 2012+ solution using LAG)
select id
from
(
    select id, token, prevToken = lag(token,1) over (partition by id order by position)
    from @table
    cross apply dbo.NGrams8k([name],1)
) prep
where token = prevToken
group by id; -- optional: if you want to remove possible duplicates
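For context, dbo.NGrams8k is a community table-valued function (it is not built into SQL Server); the position and token columns used above are the ones it returns. A quick illustrative call, assuming the function has been installed:

-- one row per character of the input: its position and the character itself (token)
select position, token
from dbo.NGrams8k('Daave', 1);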
Another brute force way, using a regular expression (this operator is PostgreSQL syntax; SQL Server has no built-in regex match):
select *
from t
where t.name ~ '(.)\1';

SQL Server 2008 paging methods?

I have to work with a potentially large list of records, and I've been Googling for ways to avoid selecting the whole list; instead, I want to let users select a page (like 1 to 10) and display the records accordingly.
Say, for 1000 records I would have 100 pages of 10 records each, with the most recent 10 records displayed first; if the user clicks on page 5, it would show records 41 to 50.
Is it a good idea to add a row number to each record and then query based on row number? Is there a better way of achieving the paging result without too much overhead?
So far the methods described here look the most promising:
http://developer.berlios.de/docman/display_doc.php?docid=739&group_id=2899
http://www.codeproject.com/KB/aspnet/PagingLarge.aspx
The following T-SQL stored procedure is a very efficient implementation of paging. The SQL optimiser can find the first ID very fast. Combine this with the use of ROWCOUNT, and you have an approach that is both CPU-efficient and read-efficient. For a table with a large number of rows, it certainly beats any approach that I've seen using a temporary table or table variable.
NB: I'm using a sequential identity column in this example, but the code works on any column suitable for page sorting. Also, sequence breaks in the column being used don't affect the result as the code selects a number of rows rather than a column value.
EDIT: If you're sorting on a column with potentially non-unique values (eg LastName), then add a second column to the Order By clause to make the sort values unique again.
CREATE PROCEDURE dbo.PagingTest
(
    @PageNumber int,
    @PageSize int
)
AS
DECLARE @FirstId int, @FirstRow int

SET @FirstRow = ((@PageNumber - 1) * @PageSize) + 1
SET ROWCOUNT @FirstRow

-- Add check here to ensure that @FirstRow is not
-- greater than the number of rows in the table.

SELECT @FirstId = [Id]
FROM dbo.TestTable
ORDER BY [Id]

SET ROWCOUNT @PageSize

SELECT *
FROM dbo.TestTable
WHERE [Id] >= @FirstId
ORDER BY [Id]

SET ROWCOUNT 0
GO
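A usage sketch, assuming dbo.TestTable exists with an [Id] column as described:

-- fetch page 3 with 10 rows per page (rows 21-30 by [Id])
EXEC dbo.PagingTest @PageNumber = 3, @PageSize = 10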
If you use a CTE with two ROW_NUMBER() columns, one sorted ascending and one descending, you get row numbers for paging as well as the total record count (the two row numbers for any row add up to the total plus one).
create procedure get_pages(@page_number int, @page_length int)
as
set nocount on;

with cte as
(
    select
        Row_Number() over (order by sort_column desc) as row_num
        ,Row_Number() over (order by sort_column) as inverse_row_num
        ,id as cte_id
    from my_table
)
select
    row_num + inverse_row_num - 1 as total_rows
    ,*
from cte
    inner join my_table on cte_id = my_table.id
where row_num between (@page_number * @page_length) + 1
                  and (@page_number + 1) * @page_length
order by row_num
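A usage sketch for the procedure above (page numbers are zero-based with these bounds; my_table, sort_column and id are placeholders from the original code):

-- first page of 10 rows (by sort_column descending)
EXEC get_pages @page_number = 0, @page_length = 10
-- second page (rows 11-20)
EXEC get_pages @page_number = 1, @page_length = 10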
Using OFFSET
Others have explained how the ROW_NUMBER() OVER() ranking function can be used to perform paging. It's worth mentioning that SQL Server 2012 finally included support for the SQL standard OFFSET .. FETCH clause:
SELECT first_name, last_name, score
FROM players
ORDER BY score DESC
OFFSET 40 ROWS FETCH NEXT 10 ROWS ONLY
If you're using SQL Server 2012 and backwards-compatibility is not an issue, you should probably prefer this clause as it will be executed more optimally by SQL Server in corner cases.
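A parameterized sketch of the same clause, assuming a 1-based @PageNumber and a @PageSize variable:

SELECT first_name, last_name, score
FROM players
ORDER BY score DESC
OFFSET (@PageNumber - 1) * @PageSize ROWS
FETCH NEXT @PageSize ROWS ONLY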
Using the SEEK Method
There is an entirely different, much faster way to perform paging in SQL. This is often called the "seek method" as described in this blog post here.
SELECT TOP 10 first_name, last_name, score
FROM players
WHERE (score < @previousScore)
OR (score = @previousScore AND player_id < @previousPlayerId)
ORDER BY score DESC, player_id DESC
The @previousScore and @previousPlayerId values are the respective values of the last record from the previous page. This allows you to fetch the "next" page. If the ORDER BY direction is ASC, simply use > instead.
With the above method, you cannot immediately jump to page 4 without having first fetched the previous 40 records. But often, you do not want to jump that far anyway. Instead, you get a much faster query that might be able to fetch data in constant time, depending on your indexing. Plus, your pages remain "stable", no matter if the underlying data changes (e.g. on page 1, while you're on page 4).
This is the best way to implement paging when lazy loading more data in web applications, for instance.
Note, the "seek method" is also called keyset paging.
Try something like this:
declare @page int = 2
declare @size int = 10

declare @lower int = ((@page - 1) * @size) + 1
declare @upper int = @page * @size

select * from (
    select
        ROW_NUMBER() over (order by some_column) as lfd,
        *
    from your_table
) as t
where lfd between @lower and @upper
order by some_column
Here's an updated version of @RoadWarrior's code, using TOP. Performance is identical, and extremely fast. Make sure you have an index on TestTable.ID.
CREATE PROC dbo.PagingTest
    @SkipRows int,
    @GetRows int
AS
DECLARE @FirstId int

SELECT TOP (@SkipRows)
       @FirstId = [Id]
FROM dbo.TestTable
ORDER BY [Id]

SELECT TOP (@GetRows) *
FROM dbo.TestTable
WHERE [Id] >= @FirstId
ORDER BY [Id]
GO
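A usage sketch; note that with this version @SkipRows is effectively the position of the first row returned:

-- returns 10 rows starting at the 41st row by [Id] (i.e. rows 41-50)
EXEC dbo.PagingTest @SkipRows = 41, @GetRows = 10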
Try this
Declare @RowStart int, @RowEnd int;
SET @RowStart = 4;
SET @RowEnd = 7;

With MessageEntities As
(
    Select ROW_NUMBER() Over (Order By [MESSAGE_ID]) As Row, [MESSAGE_ID]
    From [TBL_NAFETHAH_MESSAGES]
)
Select m0.MESSAGE_ID, m0.MESSAGE_SENDER_NAME,
       m0.MESSAGE_SUBJECT, m0.MESSAGE_TEXT
From MessageEntities M
    Inner Join [TBL_NAFETHAH_MESSAGES] m0 on M.MESSAGE_ID = m0.MESSAGE_ID
Where M.Row Between @RowStart AND @RowEnd
Order By M.Row Asc
GO
Why not use the recommended solution (note that this is Entity SQL syntax, for use with the Entity Framework, rather than plain T-SQL):
SELECT VALUE product FROM
AdventureWorksEntities.Products AS product
order by product.ListPrice SKIP @skip LIMIT @limit

Delete old records and keep the 10 latest in SQL Compact

I'm using a SQL Compact database (.sdf) in MS SQL 2008.
In the table 'Job', each id has multiple jobs.
A system regularly adds jobs into the table.
I would like to keep the 10 latest records for each id, ordered by their 'datecompleted',
and delete the rest of the records.
How can I construct my query? I failed using a #temp table and a cursor.
Well, it is fast approaching Christmas, so here is my gift to you: an example script that demonstrates what I believe you are trying to achieve. No, I don't have a big white fluffy beard ;-)
CREATE TABLE TestJobSetTable
(
    ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    JobID INT NOT NULL,
    DateCompleted DATETIME NOT NULL
);

--Create some test data
DECLARE @iX INT;
SET @iX = 0

WHILE (@iX < 15)
BEGIN
    INSERT INTO TestJobSetTable(JobID, DateCompleted) VALUES(1, getDate())
    INSERT INTO TestJobSetTable(JobID, DateCompleted) VALUES(34, getDate())
    SET @iX = @iX + 1;
    WAITFOR DELAY '00:00:00.010' -- brief pause so DateCompleted values differ
END

--Create some more test data, for when there may be job groups with fewer than 10 records.
SET @iX = 0
WHILE (@iX < 6)
BEGIN
    INSERT INTO TestJobSetTable(JobID, DateCompleted) VALUES(23, getDate())
    SET @iX = @iX + 1;
    WAITFOR DELAY '00:00:00.010'
END
--Review the data set
SELECT * FROM TestJobSetTable;
--Apply the deletion to the remainder of the data set.
WITH TenMostRecentCompletedJobs AS
(
SELECT ID, JobID, DateCompleted
FROM TestJobSetTable A
WHERE ID in
(
SELECT TOP 10 ID
FROM TestJobSetTable
WHERE JobID = A.JobID
ORDER BY DateCompleted DESC
)
)
--SELECT * FROM TenMostRecentCompletedJobs ORDER BY JobID,DateCompleted desc;
DELETE FROM TestJobSetTable
WHERE ID NOT IN(SELECT ID FROM TenMostRecentCompletedJobs)
--Now only data of interest remains
SELECT * FROM TestJobSetTable
DROP TABLE TestJobSetTable;
How about something like:
DELETE FROM Job
WHERE id NOT IN (
    SELECT TOP 10 id
    FROM Job
    ORDER BY datecompleted DESC)
This is assuming you're using 3.5 because nested SELECT is only available in this version or higher.
I did not read the question correctly. I suspect something more along the lines of a CTE will solve the problem, using similar logic. You want to build a query that identifies the records you want to keep, as your starting point.
Using CTE on SQL Server Compact 3.5

Split query result by half in TSQL (obtain 2 resultsets/tables)

I have a query that returns a large number of heavy rows.
When I transform these rows into a list of CustomObject I get a big memory peak, and this transformation is made by a custom .NET framework that I can't modify.
I need to retrieve fewer rows at a time so I can do "the transform" in two passes and avoid the memory peak.
How can I split the result of a query in half? I need to do it in the DB layer. I thought of doing a "TOP count(*)/2", but how do I get the other half?
Thank you!
If you have an identity field in the table, select the even ids first, then the odd ones.
select * from Table where Id % 2 = 0
select * from Table where Id % 2 = 1
You should end up with roughly 50% of the rows in each set.
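Another option, not mentioned in the original answers, is to split on position rather than id parity with NTILE, which yields two nearly equal halves even if the ids have gaps (sketched against an assumed Id column):

SELECT *
FROM (
    SELECT *, NTILE(2) OVER (ORDER BY Id) AS Half
    FROM TableName
) AS t
WHERE t.Half = 1 -- use Half = 2 for the second half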
Here is another way to do it, from http://www.tek-tips.com/viewthread.cfm?qid=1280248&page=5. I think it's more efficient:
Declare @Rows Int
Declare @TopRows Int
Declare @BottomRows Int

Select @Rows = Count(*) From TableName

If @Rows % 2 = 1
Begin
    Set @TopRows = @Rows / 2
    Set @BottomRows = @TopRows + 1
End
Else
Begin
    Set @TopRows = @Rows / 2
    Set @BottomRows = @TopRows
End

Set RowCount @TopRows
Select * From TableName Order By DisplayOrder

Set RowCount @BottomRows
Select * From TableName Order By DisplayOrder DESC
--- old answer below ---
Is this a stored procedure call or dynamic SQL? Can you use temp tables?
If so, something like this would work:
select row_number() OVER (order by yourorderfield) as rowNumber, *
INTO #tmp
FROM dbo.yourtable

declare @rowCount int
SELECT @rowCount = count(1) from #tmp

SELECT * from #tmp where rowNumber <= @rowCount / 2
SELECT * from #tmp where rowNumber > @rowCount / 2

DROP TABLE #tmp
SELECT TOP 50 PERCENT WITH TIES ... ORDER BY SomeThing
then
SELECT TOP 50 PERCENT ... ORDER BY SomeThing DESC
However, unless you snapshot the data first, a row in the middle may slip through or be processed twice
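A sketch of the snapshot idea just mentioned (the temp table name is illustrative); with an odd row count the middle row can still land in both halves unless one side is adjusted:

SELECT *
INTO #snapshot
FROM YourTable

SELECT TOP 50 PERCENT WITH TIES * FROM #snapshot ORDER BY SomeThing
SELECT TOP 50 PERCENT * FROM #snapshot ORDER BY SomeThing DESC

DROP TABLE #snapshot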
I don't think you should do this in SQL unless you can accept the possibility of processing the same record twice.
I would do it in an application programming language, not SQL: Java, .NET, C++, etc.