Does adding ROW_NUMBER to query get sorted automatically? - sql

It seems if I add ROW_NUMBER in a simple select query, the results are sorted automatically by the ROW_NUMBER column even without order by added to the select query at the end. I tried this on
without ROW_NUMBER - results are in random order
with ROW_NUMBER over (order by some_col) - results automatically ordered by this ROW_NUMBER column
with ROW_NUMBER over (order by some_col desc) - results again automatically order by this ROW_NUMBER column reflecting the new direction
Why is it behaving like this? Is there an implicit order by when using ROW_NUMBER?
If this is vendor specific, I was testing on MSSQL2014

The ORDER BY Clause in the window function only controls the order of the rows considered for the window function. It does not guarantee the final result set order.
Over clause
Note: The ORDER BY clause in the OVER clause only controls the order that the rows in the partition will be utilized by the window
function. It does not control the order of the final result set.
Without an ORDER BY clause on the query itself, the order of the rows
is not guaranteed. You may notice that your query may be returning in
the order of the last specified OVER clause – this is due to the way
that this is currently implemented in SQL Server. If the SQL Server
team at Microsoft changes the way that it works, it may no longer
order your results in the manner that you are currently observing. If
you need a specific order for the result set, you must provide an
ORDER BY clause against the query itself.

Related

How does the window SUM function work with OVER internally?

I'm trying to understand, how window function works internally.
ID,Amt
A,1
B,2
C,3
D,4
E,5
If I run this, will give sum of all amount in total column against every record.
Select ID, SUM (AMT) OVER () total from table
but when I run this, it will give me cumulative sum
Select ID, SUM (AMT) OVER (order by ID) total from table
Trying to understand what is happening when its OVER() and OVER(order by ID)
What I've understood is when no partition is defined in OVER, it considers everything as single partition. But not able to understand when we add order by Id within over(), how come it starts doing cumulative sum ?
Can anyone share what's happening behind the scenes for this ?
That is an interesting case, based on the documentation here is the explanation and example.
If PARTITION BY is not specified, the function treats all rows of the
query result set as a single partition. Function will be applied on
all rows in the partition if you don't specify ORDER BY clause.
So if you specifiey ORDER BY then
If it is specified, and a ROWS/RANGE is not specified, then default
RANGE UNBOUNDED PRECEDING AND CURRENT ROW is used as default for
window frame by the functions that can accept optional ROWS/RANGE
specification (for example min or max).
So technically these two commands are the same:
SELECT ID, SUM(AMT) OVER (ORDER BY ID) total FROM table
SELECT ID, SUM(AMT) OVER (ORDER BY ID RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) total FROM table
More about you can read in the documentation:https://learn.microsoft.com/en-us/sql/t-sql/queries/select-over-clause-transact-sql?view=sql-server-ver15
This is not related to Oracle itself, but it's part of the SQL Standard and behaves the same way in many databases including Oracle, DB2, PostgreSQL, SQL Server, MySQL, MariaDB, H2, etc.
By definition, when you include the ORDER BY clause the engine will produce "running values" (cumulative aggregation) inside each partition; without the ORDER BY clause it produces the same, single value that aggregates the whole partition.
Now, the partition itself is mainly defined by the PARTITION BY clause. In its absence, the whole result set is considered as a single partition.
Finally, as a more advanced topic, the partition can be further tweaked using a "frame" clause (ROWS and RANGE) and by a "frame exclusion" clause (EXCLUDE).

TOP 1 and ORDER BY not returning correct results

I have read the other topics on this but they don't seem to match my scenario. I have a query that is ordering the results by Entry Date ASC and then by Sort ASC.
The results shown are correctly ordered, however when I change my query to only pull TOP 1 it returns the second result instead. I have no idea why or how this happens.
If your query has the order by in the outermost select, then the results should be returned in that order. Period.
If the order by is anywhere else -- in a subquery or in a window frame specification -- then the results might look like they are ordered, but the ordering is not guaranteed.
My guess is that you don't have the explicit order by that the query needs to do what you intend.
Also, although not the case with your sample data, if the keys have the same value then they can appear in any order -- and in different positions when you run the query multiple times.

how to resolve this - group by changes the Order of items in SQL Server

I'm using SQL server 2014,I'm fetching data from a view.The order of items is getting changed once i use Group by ,how can i get the order back after using this Group by,There is one date column,but its not saving any time,So i can't sort it based on date also..
How can I display the data in the same order as it displayed before using Group by?Anyone have any idea please help?
Thanks
Tables and views are essentially unordered sets. To get rows in a specific order, you should always add an ORDER BY clause on the columns you wish to order on.
I'm assuming you previously selected from the VIEW without an ORDER BY clause. The order in which rows are returned from a SELECT statement without an ORDER BY statement is undefined. The order you are getting them in, can change due to any number of reasons (eg some are listed here).
Your problem stems from the mistake you made on relying on the order from a SELECT from a VIEW without an ORDER BY. You should have had an ORDER BY clause in your SELECT statement to begin with.
How can I display the data in the same order as it displayed before using Group by?
The answer: You can't if your initial statement did not have an ORDER BY clause.
The resolution: Determine the order you want the resultset in and add an ORDER BY clause on those columns, both in your initial query and the version with the GROUP BY clause.
Maybe you can use the row_number() function without any OVER and ORDER BY keywords? This should be done in a sub-select and when you group the data in the outer SELECT, use the AVG() function on the numbered column and ORDER the result by this. The problem is, that when you group rows, the original rows disappear. That's kind if the purpose of GROUP BY. ;) Depending on what you GROUP BY, what you're asking might be logically impossible.
EDIT:
Found this solution Googling: http://blog.sqlauthority.com/2015/05/05/sql-server-generating-row-number-without-ordering-any-columns/
So you can number rows like this to maintain the order of rows from the table before you GROUP BY:
row_number() OVER (ORDER BY (SELECT 1))
The only way you can enforce a specific order is to explicitly use a ORDER BY clause. Otherwise the order of rows is not guaranteed (take a look at this article for more details) and the database engine will return the rows based on "as fast as it can" or "as fast as it can retrieve them from disk" rule. So, order can also vary between executions of the same query in the span of a few seconds.
When doing a DISTINCT, GROUP BY or ORDER BY, SQL Server automatically does a SORT of the data based on an index it uses for that query.
Looking at the execution plan of your query will show you what index (and implicitly columns in that index) is being used to sort the data.

Partition Over Ordering Effect

Does the Oracle clause OVER(PARTITION BY SUM(some_field))
have an implicit ordering effect and will my result data be sorted by SUM(some_field) without an additional
ORDER BY SUM(some_field) clause?
No. An analytic function in the SELECT statement does not imply any particular ordering of the result. Remember that you can have mulitple analytic functions in your query each of which is looking at rows in a different order so it wouldn't make sense for there to be an implied ordering of the result. If you want your results returned in a specific order, use an ORDER BY clause.

SQL - In Select Query if I use "Not IN" then it returns result set in random order

I have used NOT IN clause in Select Statement. When I run that query, each time it returns the same result set but the order is different.
Is this the default behavior of "NOT IN" clause?
The query which I am using is as below:
SELECT *,(ISNULL(AppFirstName,'')+' '+ISNULL(AppMiddleName,'')+' '+ISNULL(AppLastName,'')) as AppName FROM BApp AF WHERE AF.SId=11 AND AF.SCId=5 AND AF.CCId= 1 AND AF.IsActive=1 AND AF.ASId=16 AND AF.AId NOT IN (SELECT AId FROM NumberDetails where AId = AF.AId)
The order of an SQL result is not defined and left for the database to pick unless you use an ORDER clause. If you need to know more, post the query and what DB you are using.
If you don't specify an ORDER BY clause, then no query has a defined order. The database is free to return you the rows in whatever order is easiest for it.
The reason this sometimes seems consistent is that the rows will often be read out either in the order they exist on disk (probably the order they were inserted) or in the order of some index that was used to find the result.
The more complex your query, the more complex the processing the database needs to do, so the less likely the results are to come out in some obvious, repeatable, order.
Moral of the story: always use an ORDER BY clause.
SQL, by default, does not order or sort the records it returns. This behavior isn't specific to 'NOT IN', but is a general premise of the language. However, you can easily order your results by adding an 'ORDER BY table.column_name' to the end of your query.