I have a table with a timestamp column and a few other char columns. I have ordered the information in the table by the timestamp but due to the fact that there are more records on the same timestamp I do not know if the order displayed is the order they were inserted in the table.
Sadly there is no index on the table so apart from the timestamp there isn't anything else I could use to order them by.
Example:
Timestamp | Foregin table | Foreign table value
2012-10-09 19:29:50.000 | tableA | "random string here"
2012-10-09 19:29:50.000 | tableA | "different string here"
2012-10-09 19:29:50.000 | tableB | "another random string here"
2012-10-09 19:29:50.000 | tableC | "string here"
2012-10-09 19:29:50.000 | tableD | "another string here"
The query that I run is something like this:
SELECT *
FROM TABLE
WHERE Timestamp BETWEEN 'x' and 'y'
ORDER BY Timestamp
Reviewing my results I assume, but I am not sure, that the query returns the data ordered by the specified column and continues ordering of the results by the next column/columns, in case there are more than 1 column, in an ascending manner (maybe by length of text or by alphabetical order).
Could you please help me clarify this situation as it is very important for me to find out the order.
SQL Server in general has no order guarantee for queries not specifying an ORDER BY. So a SELECT * FROM Tbl can essentially return the rows in any random order. Because of how the data is stored on disk, if you have an empty cache and no other activity going on on the server, you will get the data in on-disk order with a fairly high probability. That probability goes down dramatically if there are other concurrent queries on your server and if the cache is not empty. So while you might observe a specific behavior in your tests, you will not get the same behavior in production.
That said, if you need the rows to come back in a specific order you need to specify an appropriate ORDER BY clause. In the case above you could use ORDER BY [Timestamp], [Foreign table], [Foreign table value] to make sure that the records always come back in the same order. If your query contains more columns and none of them is unique you need to add all those to the ORDER BY too. But keep in mind that sorting gets more expensive the more columns are involved.
SQL Server will order the results using the explicit ORDER BY clause and gives no gaurantees as to the order of results for which all columns in the ORDER BY clause are equal.
If you want it to then order by one of your char columns alphabetically you need to specify that in your ORDER BY clause. It may be the case that your current results set is sorted in this way but it is highly unlikely that this will continue indefinitely.
Tables in databases are inherently unordered. This answer is a clarification of Sebastian's answer (the clarification is too long for a comment).
The return order is not random. It is arbitrary. And, it can change from invocation to invocation. On a table that has deletes as well as inserts, then later records can be interspersed on data pages with earlier records. Once again, there is no concept of ordering within a data table. Ordering is only part of query statements.
A bigger factor than other queries running on the system is multi-threading (and to a less extant multiple partitions).
If you have an empty cache, a query that does not access the table through an index, no deletes in the table, a single partition, and a single threaded system, then the data would normally be returned in insert order. Even in this case, though, there are no guarantees. If you want the data in a particular order, use order by. If you don't want a performance hit, include an identity primary key and use that for the ordering.
Related
I defined seq_id column as NUMBER(10), when I select from this table, the record with seq_id = 10 is after shown 1 - not after 9.
How should I make the rows get sorted in numeric order? I know order by seq_id will make it numeric order. But I have seen other tables their counter & seq_id are default in numeric order.
If you don't specify an order by clause the order in which rows are returned is arbitrary. If you care about the order of rows, you must use an order by.
For most tables in simple select statements without a where clause, rows will usually be returned in the way they are physically ordered on disk. For small tables that never undergo deletes, never have updates that cause row migration, and never have multiple threads doing inserts, that physical order is likely to correspond to the numeric order of the sequence. But that is not something that you should depend on.
I am bit new to SQL, I want to write query with TOP clause and order by clause.
So, for returning all the records I write below query
select PatientName,PlanDate as Date,* from OPLMLA21..Exams order
by PlanDate desc
And I need top few elements from same query, so I modified the query to
select top(5) PatientName,PlanDate as Date,* from OPLMLA21..Exams
order by PlanDate desc
In my understanding it will give the top 5 results from the previous query, but I see ambiguity there. I have attached the screen shot of query results .
May be my understanding is wrong, I read a lot but not able to understand this please help me out.
I stated this in a comment, however, to repeat that:
TOP (5) doesn't give the "top results" of the prior query though, no. It gives the top (first) rows from the dataset defined in the query it in is. If there are multiple rows that have the same "rank", then the row(s) returned for that rank are arbitrary. So, for example, for your query if you have 100 rows all with the same value for PlanDate, what 5 rows you get are completely arbitrary and could be different (including the order they are in) every time you run said query.
What I mean by arbitrary is that, effectively, SQL Server is free to choose whatever rows, of those applicable, are returned. Sometimes this might be the same everytime you run the query, but this by luck more than anything. As your database gets larger, you have more users querying the data, you involve joins, things like locks, indexes, parrallelism, etc all will effect the "order" that SQL Server is processing said data, and will effect an ambigious TOP clause.
Take the example data below:
ID | SomeDate
---|---------
1 |2020-01-01
2 |2020-01-01
3 |2020-01-01
4 |2020-01-01
5 |2020-01-01
6 |2020-01-02
Now, what would you expect if I ran a TOP (2) against that table with an ORDER BY clause of SomeDate DESC. Well, certainly, you'd expect the "last" row (with an ID of 6) to be returned, but what about the next row? The other 5 rows all have the same value for SomeDate. Perhaps, because your under the impression that data in a table is pre-sorted, you might expect the row with a value of 5 for ID. What if I told you that there was a CLUSTERED INDEX on ID ASC; that might well end up meaning that the row with a value of 1 is returned. What if there is also an index on SomeDate DESC?
What if the table was 10,000 of rows in size, and you also have a JOIN to another table, which also has a CLUSTERED INDEX, and some user is performing a query with some specific row locking on in while you run your query? What would you expect then?
Without your ORDER BY being specific enough to ensure that each row has a distinct ordering position, SQL Server will return other rows in an arbitrary order and when mixed with a TOP means the "top" rows will also be arbitrary.
Side note: I note in your image (of what appears to be SSMS), your "dates" are in the format yyyyMMdd. This strongly implies that you are storing a date value as a varchar or int type. This is a design flaw and needs to be fixed. There are 6 date and time data types, and 5 of them are far superior to using a string and numerical data type to storing the data.
I have a table which has ShipCountry, ShipCity and Freight column in SQL database. I tried to retrieve data from that table by using the below query.
Select ShipCountry from CountryDetails Group by ShipCountry
If i run this query i am getting results in Ascending order. Instead of this i need data in database order. How to achieve this through SQL query?
Note: If i run the below query, it will return the data in Database order. I am getting sorted data when i added group by clause in my query.
Select ShipCountry from CountryDetails
The use of group by for ordering is improper .. (group by is for aggregation function as min, max or count)
if you need a specific order use order by instead
Select ShipCountry from CountryDetails Order by ShipCountry
otherwise if want not order use simply
Select ShipCountry from CountryDetails
Remember that the values store in db have not a proper order ..and are selected in the sequence used for retrive the data.
Each time you need an order you must esplicitally use order by
for avoid "redundant values" .. use distinct and not group by eg:
Select distinct ShipCountry from CountryDetails
As already has been stated, what you describe might lead to unexpected results fro your end users.
Let's assume you have a table without any indexes or keys (A so-called heap). A heap pretty much can be compared to a phone book (yeah, I've been around for a while) consisting of hundreds of pages, on which information is randomly ordered. A heap is exactly that; A lot of randomly ordered data. Whenever you query from such a table, the query analyzer will do its very best to figure out what the fastest way to deliver the data is.
Such decisions from the query analyzer are guided by statistics; a collection of metrics about the data and the distribution thereof. SQL Server uses these statistics to figure out the cardinality (the uniqueness of values), and thus pick the fastest way to return data.
When you simply issue a SELECT * FROM myTable on a heap, those statistics will determine the order in which your data is returned. However, this also means that over time, the statistics will change, as more data flows into the table. This has the effect that the sort order of your data today is not necessarily the sort order in which the data is returned tomorrow, or even five minutes from now.
If that is fine with your end users, then a SELECT * FROM myTable is the right solution for you. But, if you absolutely need to have the data returned in a certain order, you should always implement an ORDER BY clause.
if you want to have the same database order in most cases if you have sorted by primary key it will be the same without ordering as you say:
here the id is the primary key, and if you can not use the primary key add an identity column and use it:
id name
1 elly
2 ahmad
3 joseph
4 omar
While trying to implement pagination from server side in postgres, i came across a point that while using limit and offset keywords you have to provide an ORDER BY clause on a unique column probably the primary key.
In my case i am using the UUID generation for Pkeys so I can't rely on a sequential order of increasing keys. ORDER BY pkey DESC - might not result in newer rows on top always.
So i resorted to using Created Date column - timestamp column which should be unique.
But my question comes what if the UI client wants to sort by some other column? in the event that it might not always be a unique column i resort to ORDER BY user_column, created_dt DESC so as to maintain predictable results for postgres pagination.
is this the right approach? i am not sure if i am going the right way. please advise.
I talked about this exact problem on an old blog post (in the context of using an ORM):
One last note about using sorting and paging in conjunction. A query
that implements paging can have odd results if the ORDER BY clause
does not include a field that represents an empirical sequence in the
data; sort order is not guaranteed beyond what is explicitly specified
in the ORDER BY clause in most (maybe all) database engines. An
example: if you have 100 orders that all occurred on the exact same
date, and you ask for the first page of this data sorted by this date,
then ask for the second page of data sorted the same way, it is
entirely possible that you will get some of the data duplicated across
both pages. So depending on the query and the distribution of data
that is “sortable,” it can be a good practice to always include a
unique field (like a primary key) as the final field in a sort clause
if you are implementing paging.
http://psandler.wordpress.com/2009/11/20/dynamic-search-objects-part-5sorting/
The strategy of using a column that uniquely identifies a record as pkey or insertion_date may not be possible in some cases.
I have an application where the user sets up his own grid query then it can simply put any column from multiple tables and perhaps none is a unique identifier.
In a case that can be useful you use rownum. You simply select the rownum and use his sort in over function. It would be something like:
select col1, col2, col3, row_number() over(order by col3) from tableX order by col3
It's important that over(order by *) match with order by *. Thus your paging will have consistent results every time.
i've got an issue due to database conception.
My data are grouped in a table which looks like :
IdGroup | IdValue
So for each group i've got the list of value.
Indeed, we should have had an order column or an id, but i can't.
Do you know anyway which can prove the order of the select value based on the insert order ?
I mean, if I inserted 1003,1001,1002 could i garantuee it to be retrieve in this order ?
IdGroup | IdValue
1 | 1003
1 | 1001
1 | 1002
Of course, using an order by doesn't seems to fit because i don't have any column usable.
Any idea ? Using a system proc or something like this.
Thanks a lot :)
Stop telling me to use an order by and altering the table, it doesn't fit and yes i know it's the good pratice to do... thanks :)
A couple of ideas:
DBCC PAGE (undocumented) can be used to look at the raw data pages of the table. It may be possible to determine insert order by looking at the low level information.
If you cannot alter the table, can you add a table to the database? If so, consider creating a table with an identity column and use a trigger on the original table to insert the records in the new table.
Also, you should include which version(s) of SQL Server are involved. Doing anything this unusual will very often be version specific.
You shouldn't rely on the data being returned in a particular order; use an ORDER BY clause to guarantee the order.
(Despite the fact that data appears to be returned in clustered index order, this might not always be the case).
Whilst some small scale tests will show that it returns it in what appears to be the right order, it just will not hold.
The golden rule remains - unless an order by clause is specified, there are no guarentees provided on the order of the returned data.
edit : If you place a non-clustered index on the idgroup column it is forced to add a hidden field, the uniqueifier since the values are the same - the problem it, you can't access it in an order by clause, but from a forensic perspective, you can determine the order it was inserted in.
As others have said, the only way to guarantee an ordering is with an ORDER BY clause. What isn't highlighted in their answers is that, the only place that this ORDER BY matters is in the SELECT statement. It doesn't* matter if you apply an ORDER BY clause during the INSERT statement; the system is free to return results from a select in whatever order it finds most efficient, unless an ORDER BY is specified at that time.
*There's a particular way to ensure what order IDENTITY values are assigned to a result set during an INSERT, using an ORDER BY, but I can't remember the exact details, and it still doesn't effect the order of SELECT.
Can you add the Created Date column? In this way you can get the records using Order by Clause Created Date. Moreover set it's default value Getdate()