Does UNION ALL guarantee the order of the result set [duplicate] - sql

Can I be sure that the result set of the following script will always be sorted like this: O-R-D-E-R?
SELECT 'O'
UNION ALL
SELECT 'R'
UNION ALL
SELECT 'D'
UNION ALL
SELECT 'E'
UNION ALL
SELECT 'R'
Can it be proved to sometimes be in a different order?

There is no inherent order; you have to use ORDER BY. For your example, you can easily do this by adding a SortOrder column to each SELECT. This will then keep the records in the order you want:
SELECT 'O', 1 SortOrder
UNION ALL
SELECT 'R', 2
UNION ALL
SELECT 'D', 3
UNION ALL
SELECT 'E', 4
UNION ALL
SELECT 'R', 5
ORDER BY SortOrder
You cannot guarantee the order unless you explicitly provide an ORDER BY clause with the query.
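As a minimal sketch of the sort-key approach, the snippet below runs the same query through Python's standard sqlite3 module. SQLite is only a stand-in here so the example is self-contained (an assumption, since the question is about SQL Server), and the `letter` alias is mine, not part of the original query.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for SQL Server.
conn = sqlite3.connect(":memory:")

query = """
SELECT 'O' AS letter, 1 AS SortOrder
UNION ALL
SELECT 'R', 2
UNION ALL
SELECT 'D', 3
UNION ALL
SELECT 'E', 4
UNION ALL
SELECT 'R', 5
ORDER BY SortOrder
"""

# The ORDER BY on the explicit key is what fixes the row order.
letters = [row[0] for row in conn.execute(query)]
word = "".join(letters)
conn.close()
```

The order of `letters` comes from the ORDER BY on the key column, not from the order in which the SELECT branches were written.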

No, it does not. SQL tables are inherently unordered. You need to use ORDER BY to get things in a desired order.
The issue is not whether it works once when you try it out. The issue is whether you can trust this behavior. And you cannot. SQL Server does not even guarantee the ordering for this:
select *
from (select t.*
from t
order by col1
) t
It says here:
When ORDER BY is used in the definition of a view, inline function,
derived table, or subquery, the clause is used only to determine the
rows returned by the TOP clause. The ORDER BY clause does not
guarantee ordered results when these constructs are queried, unless
ORDER BY is also specified in the query itself.
A fundamental principle of the SQL language is that tables are not ordered. So, although your query might work in many databases, you should use the version suggested by BlueFeet to guarantee the ordering of results.

Try removing all of the ALLs, for example. Or even just one of them. Now consider that the type of optimization that has to happen there (and many other types) will also be possible when the SELECT queries are actual queries against tables, and are optimized separately. Without an ORDER BY, ordering within each query will be arbitrary, and you can't guarantee that the queries themselves will be processed in any order.
Saying UNION ALL with no ORDER BY is like saying "Just throw all the marbles on the floor." Maybe every time you throw all the marbles on the floor, they end up being organized by color. That doesn't mean the next time you throw them on the floor they'll behave the same way. The same is true for ordering in SQL Server: if you don't say ORDER BY, then SQL Server assumes you don't care about order. You may see by coincidence a certain order being returned all the time, but many things can affect the arbitrary order that is selected next time. Data changes, statistics changes, recompile, plan flush, upgrade, service pack, hotfix, trace flag... ad nauseam.
I will put this in large letters to make it clear:
You cannot guarantee an order without ORDER BY
Some further reading:
Bad habits to kick : relying on undocumented behavior
Also, please read this post by Conor Cunningham, a pretty smart guy on the SQL team.

No. You get the records in whatever way SQL Server fetches them for you. You can apply an order on a unioned result set by 1-based index thusly:
SELECT 1, 'O'
UNION ALL
SELECT 2, 'R'
UNION ALL
SELECT 3, 'D'
UNION ALL
SELECT 4, 'E'
UNION ALL
SELECT 5, 'R'
ORDER BY 1

Related

What is the use of ASC keyword in SQL Server as ASC is the default?

CREATE TABLE #cities(city_id INT, city_name VARCHAR(100))
INSERT INTO #cities(city_id,city_name)
SELECT 5,'New york' UNION ALL
SELECT 4,'tokyo' UNION ALL
SELECT 2,'Alaska' UNION ALL
SELECT 3,'London' UNION ALL
SELECT 1,'Banglore' UNION ALL
SELECT 1,'New york' UNION ALL
SELECT 2,'tokyo' UNION ALL
SELECT 3,'Alaska' UNION ALL
SELECT 4,'London' UNION ALL
SELECT 5,'Banglore'
And I write my queries like below:
SELECT *
FROM #cities
ORDER BY 2, 1 DESC
SELECT *
FROM #cities
ORDER BY 2 ASC, 1 DESC
As you can see, both queries give the same result.
So, is there a specific use of ASC keyword in SQL Server?
SQL Server implemented it that way because of the ANSI SQL specification, which clearly describes the behavior as follows:
The ORDER BY clause provides your DBMS with a list of items to sort, and the order in which to sort them: either ascending order (ASC, the default) or descending order (DESC).
and
Here's a simple example:
SELECT column_1 FROM Table_1 ORDER BY column_1;
This SQL statement retrieves all COLUMN_1 values from TABLE_1, returning them in ascending order. This SQL statement does exactly the same:
SELECT column_1 FROM Table_1 ORDER BY column_1 ASC;
Therefore the ASC keyword is optional in the ORDER BY clause in MS SQL Server too, as it implements the ANSI specification correctly.
I would like to believe that it exists to help people easily understand a query written by others.
Also, please note that you cannot change the default order from ASC to DESC in SQL Server.
And as mentioned in the MSDN documentation, please refrain from using the 2, 1 DESC style in your ORDER BY clause.
Avoid specifying integers in the ORDER BY clause as positional representations of the columns in the select list. For example, although a statement such as SELECT ProductID, Name FROM Production.Production ORDER BY 2 is valid, the statement is not as easily understood by others compared with specifying the actual column name. In addition, changes to the select list, such as changing the column order or adding new columns, will require modifying the ORDER BY clause in order to avoid unexpected results.
There is no specific use in mentioning ASC in ORDER BY. Someone new to SQL may understand the query better if you mention it. By default, if you do not specify the sort order (ASC/DESC) of the columns in ORDER BY, SQL Server treats it as ASC. That's why you are getting the same result.
There is no specific use, except for clarity of code: showing the intent of the code. Since code is not just for writing, but also for reading by other people, putting in ASC even when it is not needed can make the intention of that piece of code clearer, if the fact that it's ascending is important there.
Please see this article about Intentional Programming, which says:
the programmer can express intentions explicitly in their code, rather
than implicitly via inadequate language features
The same principle is well said in one of the guiding principles of the Python programming language:
Explicit is better than implicit.
The ORDER BY default order is ASC.
So your query is the same as:
SELECT *
FROM #cities
ORDER BY 2, 1 DESC
SELECT *
FROM #cities
ORDER BY 2 ASC, 1 DESC
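The equivalence can also be checked mechanically. This is a rough sketch using Python's standard sqlite3 module with the data from the question. SQLite standing in for SQL Server is an assumption (ASC is the default in both), a regular table named `cities` replaces the temp table, and I've used column names instead of positions, as the documentation quoted above recommends.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; ASC is the default in both.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (city_id INTEGER, city_name TEXT)")
conn.executemany(
    "INSERT INTO cities (city_id, city_name) VALUES (?, ?)",
    [
        (5, "New york"), (4, "tokyo"), (2, "Alaska"), (3, "London"),
        (1, "Banglore"), (1, "New york"), (2, "tokyo"), (3, "Alaska"),
        (4, "London"), (5, "Banglore"),
    ],
)

# Same sort, once with the implicit default and once with ASC spelled out.
implicit = conn.execute(
    "SELECT city_id, city_name FROM cities ORDER BY city_name, city_id DESC"
).fetchall()
explicit = conn.execute(
    "SELECT city_id, city_name FROM cities ORDER BY city_name ASC, city_id DESC"
).fetchall()
conn.close()

same_result = implicit == explicit
```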

Fetch data sequentially using union operator

I am fetching data using the UNION operator. I want my output to be in the same order as my SELECT queries return it, but instead UNION sorts it in alphabetical order. Can you suggest a way to avoid this default sorting?
Try to do it in a subquery like this:
select * from (select x , y ,z from table1
UNION ALL
select x,y,z from table2)
order by y
Thilo is correct: to be safe, a query should always explicitly order the results. Relying on implicit sorting has caused many problems in the past and will continue to cause more problems in the future.
The suggestion that UNION ALL will avoid the sorting is almost always correct. It should work in 11g and below. But 12c introduced concurrent execution of union all, which does not guarantee the order of results any more.
Even if implicit sorting works right now it's always a good idea to add an ORDER BY.
You should not rely on the order if you do not specify any ORDER BY clause.
UNION guarantees uniqueness of the result. Pre-10g versions used a sort to remove duplicates. Newer Oracle versions might (but need not) use a hash table to remove duplicates, so the result is not necessarily sorted.
UNION ALL does not care about uniqueness.
You can simply type:
select x , y ,z from table1
UNION ALL
select x,y,z from table2
order by y
The ORDER BY applies to the whole result.
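A small sketch of that last point, run through Python's standard sqlite3 module. SQLite standing in for Oracle is an assumption, and the sample rows are hypothetical. The single trailing ORDER BY sorts rows from both branches together, so they end up interleaved.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; tables and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (x INTEGER, y INTEGER, z INTEGER);
CREATE TABLE table2 (x INTEGER, y INTEGER, z INTEGER);
INSERT INTO table1 VALUES (1, 30, 0), (2, 10, 0);
INSERT INTO table2 VALUES (3, 20, 0), (4, 40, 0);
""")

rows = conn.execute("""
SELECT x, y, z FROM table1
UNION ALL
SELECT x, y, z FROM table2
ORDER BY y
""").fetchall()
conn.close()

# y is sorted across both tables; x shows which branch each row came from.
y_values = [r[1] for r in rows]
x_values = [r[0] for r in rows]
```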

does union or union all make a big difference?

I have code that works:
CURSOR c_val IS
SELECT 'Percentage' destype,
1 descode
FROM dual
UNION ALL --16028 change from union to union all for better performance
SELECT 'Unit' destype,
2 descode
FROM dual
ORDER BY descode DESC;
--
I changed the UNION to UNION ALL. From what I have read, it wouldn't make a big difference as long as I have an ORDER BY (which I do). Is this true, or should I just leave it the way it was? It still works both ways.
The choice between UNION and UNION ALL is not whether one of them works or not, or whether one of them is faster performing, but which one is correct for the logic you want to implement.
UNION ALL is simply the results of the two queries combined into a single result set. You should consider the order of the result sets to be random, although in practice you'd probably find that it's the first result set followed by the second.
UNION is the same, but with a DISTINCT applied to the combined sets. Therefore it's possible for it to return fewer rows and take longer and consume more resources. You would probably find that the order of rows is different, but again you should assume that the order is random.
Applying an ORDER BY to either one of them is the only way to guarantee an order on the result set. It would affect performance of the UNION ALL, but may not affect UNION too much (on top of the impact of the DISTINCT) because the optimiser might choose a DISTINCT implementation that will return an ordered result set.
So in your case you want both rows to appear (and in fact UNION would not eliminate either of them), and you want the order to be specified -- so based on that, UNION ALL and the ORDER BY you have provided are the correct approaches.
"UNION" = "UNION ALL" + "DISTINCT"
So basically by going for UNION ALL you save a dedup operation. In this case you may find that the performance gain is negligible but it can be important on large datasets.
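To make the dedup step concrete, here is a rough sketch using Python's standard sqlite3 module. SQLite standing in for Oracle is an assumption, and the literal values and the `v` alias are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; the values are illustrative.
conn = sqlite3.connect(":memory:")

# UNION ALL keeps every row, including the repeated 'a'.
rows_all = conn.execute(
    "SELECT 'a' AS v UNION ALL SELECT 'a' UNION ALL SELECT 'b' ORDER BY v"
).fetchall()

# UNION removes the duplicate.
rows_union = conn.execute(
    "SELECT 'a' AS v UNION SELECT 'a' UNION SELECT 'b' ORDER BY v"
).fetchall()

# DISTINCT applied on top of UNION ALL gives the same rows as UNION.
rows_distinct = conn.execute("""
SELECT DISTINCT v FROM (
    SELECT 'a' AS v UNION ALL SELECT 'a' UNION ALL SELECT 'b'
) ORDER BY v
""").fetchall()
conn.close()
```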
Union All returns all rows, so you might get duplicates. Union is effectively like a distinct. In your case, it doesn't make a difference as you are using the same tables!

orderby in sql query

I need to order sql query by a column (the three different values in this column are C,E,T).
I want the results in the order E, C, T. So, of course, I can't use an ascending or descending ORDER BY on this column.
Any suggestions on how I can do this? I don't know if it matters or not, but I am using a Sybase data server on Tomcat.
You could do it by putting a conditional in your SELECT clause. I'm not a Sybase guy, but it might look something like this:
SELECT col, CASE WHEN col = 'E' THEN 1 WHEN col = 'C' THEN 2 ELSE 3 END AS sort_col
FROM some_table
ORDER BY sort_col
If your AS alias doesn't work, you could sort by the column's 1-based index like this:
ORDER BY 2
The other methods work, but this is an often overlooked trick (in MSSQL, I'm not positive if it works in Sybase or not):
select
foo,
bar
from
bortz
order by
case foo
when 'E' then 1
when 'C' then 2
when 'T' then 3
else 4
end
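Here is a rough, runnable sketch of the CASE-in-ORDER-BY trick using Python's standard sqlite3 module. SQLite is an assumption (I can't vouch for Sybase either), the sample rows are made up, and I added `bar` as a tie-breaker so the output is fully determined.

```python
import sqlite3

# Assumption: SQLite stands in for MSSQL/Sybase; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bortz (foo TEXT, bar INTEGER)")
conn.executemany(
    "INSERT INTO bortz (foo, bar) VALUES (?, ?)",
    [("T", 1), ("C", 2), ("E", 3), ("C", 4), ("E", 5)],
)

# The CASE expression maps each code to its rank in the wanted order E, C, T.
rows = conn.execute("""
SELECT foo, bar
FROM bortz
ORDER BY
    CASE foo
        WHEN 'E' THEN 1
        WHEN 'C' THEN 2
        WHEN 'T' THEN 3
        ELSE 4
    END,
    bar
""").fetchall()
conn.close()
```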
You could use a per-row function to change the columns as other answers have stated but, if you expect this database to scale well, per-row functions are rarely a good idea.
Feel free to ignore this advice if your table is likely to remain small.
The advice here works because of a few general "facts" (I enclose that in quotation marks since it's not always the case but, in my experience, it mostly is):
The vast majority of databases are read far more often than they're written. That means it's usually a good idea to move the cost of calculation to the write phase rather than the read phase.
Most problems with databases tend to be the "my query is slow" type rather than the "there's not enough disk space" type.
Tables always grow bigger than you thought they would :-)
If your situation is matched by those "facts", it makes sense to sacrifice a little disk space in order to speed up your queries. It's also better to incur the cost of calculation only when necessary (insert/update), not when the data hasn't actually changed (select).
To do that, you would create a new column (ect_col_sorted, for example) in the table, which would hold a numeric sort value (or more than one column if you want different sort orders).
Then have an insert/update trigger so that, whenever a row is added to, or changed in, the table, you populate the sort field with the correct value (E = 1, C = 2, T = 3, anything else = 0). Then put an index on that column and your query becomes much simpler (and faster):
select ect_col, other_col_1, other_col_2
from ect_table
order by ect_col_sorted;
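A sketch of what the trigger-maintained column could look like, using Python's standard sqlite3 module. Everything engine-specific here is an assumption: SQLite stands in for Sybase, trigger syntax differs between engines, and the `id` key, the trigger names and the index name are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for Sybase; treat this as a sketch of the idea.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ect_table (
    id INTEGER PRIMARY KEY,      -- hypothetical key column
    ect_col TEXT,
    ect_col_sorted INTEGER
);
CREATE INDEX ect_sorted_idx ON ect_table (ect_col_sorted);

-- Populate the sort value on insert (E = 1, C = 2, T = 3, anything else = 0).
CREATE TRIGGER ect_sort_ins AFTER INSERT ON ect_table
BEGIN
    UPDATE ect_table
    SET ect_col_sorted = CASE NEW.ect_col
        WHEN 'E' THEN 1 WHEN 'C' THEN 2 WHEN 'T' THEN 3 ELSE 0 END
    WHERE id = NEW.id;
END;

-- Recompute it whenever ect_col changes.
CREATE TRIGGER ect_sort_upd AFTER UPDATE OF ect_col ON ect_table
BEGIN
    UPDATE ect_table
    SET ect_col_sorted = CASE NEW.ect_col
        WHEN 'E' THEN 1 WHEN 'C' THEN 2 WHEN 'T' THEN 3 ELSE 0 END
    WHERE id = NEW.id;
END;
""")

conn.executemany(
    "INSERT INTO ect_table (ect_col) VALUES (?)",
    [("T",), ("C",), ("E",), ("C",)],
)


def sorted_values():
    """Read the codes back using only the precomputed sort column."""
    query = "SELECT ect_col FROM ect_table ORDER BY ect_col_sorted, id"
    return [row[0] for row in conn.execute(query)]


before = sorted_values()
conn.execute("UPDATE ect_table SET ect_col = 'E' WHERE id = 1")
after = sorted_values()
```

The read query never evaluates a per-row expression; the cost of computing the rank is paid once per insert or update.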
The idea is to add a subquery with a condition that returns your data row plus a fictive value, which will be 0 for E, 1 for C and 2 for T. Then simply order by this column.
Hope it helps.
psasik's solution will work, as will this one (which to use and which is faster depends on what else is going on in the query):
select *
from some_table
where col = 'E'
UNION ALL
select *
from some_table
where col = 'C'
UNION ALL
select *
from some_table
where col = 'T'
That should work, but you can also do this, which will be "safer" for large datasets that may be paged:
select *, 1 as o
from some_table
where col = 'E'
UNION ALL
select *, 2 as o
from some_table
where col = 'C'
UNION ALL
select *, 3 as o
from some_table
where col = 'T'
ORDER BY o
After I wrote the above, I decided this is the best solution. (Note: I do not know if this will work on a Sybase server, as I don't have access to one right now. If it does not work there, just pull the creation of the keysort memory table out to a variable or temporary table, whichever Sybase supports.)
;WITH keysort (k,o) AS
(
SELECT 'E',0
UNION ALL
SELECT 'C',1
UNION ALL
SELECT 'T',2
)
SELECT *
FROM some_table
LEFT JOIN keysort ON some_table.col = keysort.k
ORDER BY keysort.o
This should be the fastest of all the choices: it uses an in-memory table to exploit SQL's optimized joining.
You can even use the FIELD() function:
ORDER BY FIELD(columnname, 'E', 'C', 'T')
Hope this helps you

How can I order entries in a UNION without ORDER BY?

How can I be sure that my result set will have a first and b second? It would help me to solve a tricky ordering problem.
Here is a simplified example of what I'm doing:
SELECT a FROM A LIMIT 1
UNION
SELECT b FROM B LIMIT 1;
SELECT col
FROM
(
SELECT a col, 0 ordinal FROM A LIMIT 1
UNION ALL
SELECT b, 1 FROM B LIMIT 1
) t
ORDER BY ordinal
I don't think order is guaranteed, at least not across all DBMS.
What I've done in the past to control the ordering in UNIONs is:
(SELECT a, 0 AS Foo FROM A LIMIT 1)
UNION
(SELECT b, 1 AS Foo FROM B LIMIT 1)
ORDER BY Foo
Note that UNION will eliminate duplicate values from your result set.
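For what it's worth, here is a rough sketch of the ordinal-column idea using Python's standard sqlite3 module. SQLite is an assumption (the question does not name a DBMS), and it does not accept LIMIT directly on a UNION branch, so each branch is wrapped in its own subquery. The single-row sample data is made up.

```python
import sqlite3

# Assumption: SQLite as the DBMS; tables hold one made-up row each.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (a TEXT);
CREATE TABLE B (b TEXT);
INSERT INTO A VALUES ('first');
INSERT INTO B VALUES ('second');
""")

# Each branch carries a constant ordinal; the outer ORDER BY sorts on it.
rows = conn.execute("""
SELECT col
FROM (
    SELECT * FROM (SELECT a AS col, 0 AS ordinal FROM A LIMIT 1)
    UNION ALL
    SELECT * FROM (SELECT b, 1 FROM B LIMIT 1)
) AS t
ORDER BY ordinal
""").fetchall()
conn.close()

result = [row[0] for row in rows]
```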
I can't find any proof in documentation, but from 10 years experience I can tell that UNION ALL does preserve order, at least in Oracle.
Do not rely on this, however, if you're building a nuclear plant or something like that.
No, the order of results in a SQL query is controlled only by the ORDER BY clause. It may be that you happen to see ordered results without an ORDER BY clause in some situation, but that is by chance (e.g. a side-effect of the optimiser's current query plan) and not guaranteed.
What is the tricky ordering problem?
I know that for Oracle there is no way to guarantee which will come out first without an ORDER BY. The problem is that if you try it, it may come out in the correct order most of the times you run it. But as soon as you rely on it in production, it will come out wrong.
I would have thought not, since the database would most likely need to do an ORDER BY in order to do the UNION.
UNION ALL might behave differently, but YMMV.
The short answer is yes, you will get A then B.