Creating a functional index with an ORDER BY clause in Oracle - SQL

I need to create an "order by" index on a table
Student (
roll_No,
name,
stream,
percentage,
class_rank,
overall_rank )
I wish to query something like
SELECT *
FROM student
WHERE stream = 'science'
The expected result would be the students arranged in descending order of their rank. A requirement is that I cannot specify an ORDER BY clause in the query itself.
This should be achieved by an index on (stream, order by class_rank desc). Is this achievable in Oracle?

If you do not specify an ORDER BY clause, Oracle does not guarantee the order in which rows are returned. The requirement does not make sense.
You might get lucky and find that Oracle chooses a query plan that happens to return the rows in the order you want. But that would be a matter of luck: Oracle could choose a different query plan tomorrow, or an Oracle version upgrade may cause the results to change. For example, folks who relied for years on the GROUP BY clause ordering the results as a side effect were distressed when a new version of Oracle added a more efficient grouping algorithm that didn't have the side effect of ordering the results.
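A minimal sketch of the supported approach, assuming the student table and column names from the question. The index only gives the optimizer a way to read the rows already sorted; it is the ORDER BY that makes the order part of the query's contract. The index name is illustrative.

```sql
-- Composite index: equality column first, then the sort column descending.
-- With this in place the optimizer may be able to skip a separate sort step.
CREATE INDEX stream_rank_idx ON student (stream, class_rank DESC);

-- The ORDER BY is what guarantees the order, whichever plan is chosen.
SELECT *
FROM student
WHERE stream = 'science'
ORDER BY class_rank DESC;
```

Checking the execution plan on your own data is the only way to tell whether the sort is actually avoided.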

I got it working:
CREATE INDEX stream_rank_idx ON Student (stream, class_rank DESC);
So whenever I fire a select * from student where stream = ? query, the above index will be used and it will return the desired result.
I think it is almost always safe as long as I am not upgrading Oracle, and even after an upgrade there is a very low probability that Oracle will change the way indexes are picked.

Related

Get data like in database order without sorting it

I have a table with ShipCountry, ShipCity and Freight columns in a SQL database. I tried to retrieve data from that table using the query below.
Select ShipCountry from CountryDetails Group by ShipCountry
If I run this query, I get the results in ascending order. Instead, I need the data in database order. How can I achieve this with a SQL query?
Note: If I run the query below, it returns the data in database order. I only get sorted data once I add the GROUP BY clause to my query.
Select ShipCountry from CountryDetails
Using GROUP BY for ordering is improper (GROUP BY is for aggregate functions such as MIN, MAX or COUNT).
If you need a specific order, use ORDER BY instead:
Select ShipCountry from CountryDetails Order by ShipCountry
Otherwise, if you do not want any ordering, simply use:
Select ShipCountry from CountryDetails
Remember that the values stored in the database have no inherent order; they are returned in whatever sequence the engine uses to retrieve the data.
Each time you need an order you must explicitly use ORDER BY.
To avoid "redundant values", use DISTINCT rather than GROUP BY, e.g.:
Select distinct ShipCountry from CountryDetails
As has already been stated, what you describe might lead to unexpected results for your end users.
Let's assume you have a table without any indexes or keys (A so-called heap). A heap pretty much can be compared to a phone book (yeah, I've been around for a while) consisting of hundreds of pages, on which information is randomly ordered. A heap is exactly that; A lot of randomly ordered data. Whenever you query from such a table, the query analyzer will do its very best to figure out what the fastest way to deliver the data is.
Such decisions from the query analyzer are guided by statistics; a collection of metrics about the data and the distribution thereof. SQL Server uses these statistics to figure out the cardinality (the uniqueness of values), and thus pick the fastest way to return data.
When you simply issue a SELECT * FROM myTable on a heap, those statistics will determine the order in which your data is returned. However, this also means that over time, the statistics will change, as more data flows into the table. This has the effect that the sort order of your data today is not necessarily the sort order in which the data is returned tomorrow, or even five minutes from now.
If that is fine with your end users, then a SELECT * FROM myTable is the right solution for you. But, if you absolutely need to have the data returned in a certain order, you should always implement an ORDER BY clause.
If you want to get the same database order, in most cases ordering by the primary key gives the same order you see without any ordering, as you describe.
Here id is the primary key; if you cannot use the primary key, add an identity column and order by it:
id name
1 elly
2 ahmad
3 joseph
4 omar
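A rough T-SQL sketch of that identity-column idea, applied to the CountryDetails table from the question. The column name RowId is hypothetical. One caveat: when an identity column is added to a table that already holds rows, SQL Server numbers the existing rows in an order of its own choosing, so only rows inserted afterwards reliably reflect arrival order.

```sql
-- Hypothetical column name; numbers existing rows in an unspecified order.
ALTER TABLE CountryDetails ADD RowId INT IDENTITY(1,1) NOT NULL;
GO  -- batch separator (SSMS/sqlcmd) so the next statement can see the new column

-- One row per country, ordered by the first time each country appears.
SELECT ShipCountry
FROM CountryDetails
GROUP BY ShipCountry
ORDER BY MIN(RowId);
```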

How to resolve this - GROUP BY changes the order of items in SQL Server

I'm using SQL Server 2014 and fetching data from a view. The order of the items changes once I use GROUP BY. How can I get the original order back after using GROUP BY? There is one date column, but it does not store any time, so I can't sort on the date either.
How can I display the data in the same order as it was displayed before using GROUP BY? If anyone has any idea, please help.
Thanks
Tables and views are essentially unordered sets. To get rows in a specific order, you should always add an ORDER BY clause on the columns you wish to order on.
I'm assuming you previously selected from the VIEW without an ORDER BY clause. The order in which rows are returned from a SELECT statement without an ORDER BY statement is undefined. The order you are getting them in, can change due to any number of reasons (eg some are listed here).
Your problem stems from the mistake of relying on the order of a SELECT from a VIEW without an ORDER BY. You should have had an ORDER BY clause in your SELECT statement to begin with.
How can I display the data in the same order as it displayed before using Group by?
The answer: You can't if your initial statement did not have an ORDER BY clause.
The resolution: Determine the order you want the resultset in and add an ORDER BY clause on those columns, both in your initial query and the version with the GROUP BY clause.
Maybe you can use the row_number() function without any OVER and ORDER BY keywords? This should be done in a sub-select, and when you group the data in the outer SELECT, use the AVG() function on the numbered column and ORDER the result by this. The problem is that when you group rows, the original rows disappear. That's kind of the purpose of GROUP BY. ;) Depending on what you GROUP BY, what you're asking might be logically impossible.
EDIT:
Found this solution Googling: http://blog.sqlauthority.com/2015/05/05/sql-server-generating-row-number-without-ordering-any-columns/
So you can number rows like this to maintain the order of rows from the table before you GROUP BY:
row_number() OVER (ORDER BY (SELECT 1))
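Put together, the idea might look like the sketch below. The view name dbo.MyView and the column Category are hypothetical, and MIN() is used here instead of AVG() so each group sorts by its first-seen row. Note that the numbering itself is still not deterministic: ORDER BY (SELECT 1) only tells SQL Server that any order will do.

```sql
SELECT v.Category, COUNT(*) AS ItemCount
FROM (
    -- Number the rows in whatever order the engine happens to produce them.
    SELECT Category,
           ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS rn
    FROM dbo.MyView
) AS v
GROUP BY v.Category
-- Order the groups by the lowest row number seen in each group.
ORDER BY MIN(v.rn);
```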
The only way you can enforce a specific order is to explicitly use an ORDER BY clause. Otherwise the order of rows is not guaranteed (take a look at this article for more details) and the database engine will return the rows based on an "as fast as it can" or "as fast as it can retrieve them from disk" rule. So the order can also vary between executions of the same query in the span of a few seconds.
When doing a DISTINCT, GROUP BY or ORDER BY, SQL Server often sorts the data, or reads it in the order of an index it uses for that query, which is why the output can look sorted.
Looking at the execution plan of your query will show you what index (and implicitly columns in that index) is being used to sort the data.

SQL - In Select Query if I use "Not IN" then it returns result set in random order

I have used NOT IN clause in Select Statement. When I run that query, each time it returns the same result set but the order is different.
Is this the default behavior of "NOT IN" clause?
The query which I am using is as below:
SELECT *,
       (ISNULL(AppFirstName,'') + ' ' + ISNULL(AppMiddleName,'') + ' ' + ISNULL(AppLastName,'')) AS AppName
FROM BApp AF
WHERE AF.SId = 11
  AND AF.SCId = 5
  AND AF.CCId = 1
  AND AF.IsActive = 1
  AND AF.ASId = 16
  AND AF.AId NOT IN (SELECT AId FROM NumberDetails WHERE AId = AF.AId)
The order of an SQL result is not defined and is left for the database to pick unless you use an ORDER BY clause. If you need to know more, post the query and what DB you are using.
If you don't specify an ORDER BY clause, then no query has a defined order. The database is free to return you the rows in whatever order is easiest for it.
The reason this sometimes seems consistent is that the rows will often be read out either in the order they exist on disk (probably the order they were inserted) or in the order of some index that was used to find the result.
The more complex your query, the more complex the processing the database needs to do, so the less likely the results are to come out in some obvious, repeatable, order.
Moral of the story: always use an ORDER BY clause.
SQL, by default, does not order or sort the records it returns. This behavior isn't specific to 'NOT IN', but is a general premise of the language. However, you can easily order your results by adding an 'ORDER BY table.column_name' to the end of your query.
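Applied to the query from the question, that could look like the sketch below. Sorting by AF.AId is only an assumption about which column gives the order you want; substitute whichever column(s) actually define it.

```sql
SELECT *,
       (ISNULL(AppFirstName,'') + ' ' + ISNULL(AppMiddleName,'') + ' ' + ISNULL(AppLastName,'')) AS AppName
FROM BApp AF
WHERE AF.SId = 11
  AND AF.SCId = 5
  AND AF.CCId = 1
  AND AF.IsActive = 1
  AND AF.ASId = 16
  AND AF.AId NOT IN (SELECT AId FROM NumberDetails WHERE AId = AF.AId)
ORDER BY AF.AId;  -- assumed sort key; makes the row order repeatable if AId is unique
```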

Strange issue with ORDER BY - SQL

A few days ago I came across a strange problem with ORDER BY. While creating a new table I used
Select - Into - From and Order By (column name)
and when I open that table I see the rows are not arranged accordingly.
I re-verified it multiple times to make sure I am doing the right thing.
One more thing I would like to add: as long as I don't use INTO, I see the desired result, but as soon as I create the new table, I see there is no order for that column. Please help me!
Thanks in advance. Before posting the question I researched for 3 days but found no solution.
SELECT
[WorkOrderID], [ProductID], [OrderQty], [StockedQty]
INTO
[AdventureWorks2012].[Production].[WorkOrder_test]
FROM
[AdventureWorks2012].[Production].[WorkOrder]
ORDER BY
[StockedQty]
SQL 101 for beginners: SELECT statements have no defined order unless you define one.
When i open that table
That likely issues a SELECT (TOP 1000 IIRC) without order.
While creating a new table i used Select - Into - From and Order By (column name)
Which sort of is totally irrelevant - you basically waste performance ordering the input data.
You want an order in a select, MAKE ONE by adding an ORDER BY clause to the select. The table's internal order is by clustered index, but a query can return results in any order it wants. Fundamental SQL issue, as I said in the first sentence. Any good book on SQL covers that in one of the first chapters. SQL uses a set approach, and sets have no intrinsic order.
Firstly, T-SQL is a set-based language and sets don't have an order. Moreover, it does not imply serial execution of commands, i.e. the above query is not executed in the sequence in which it is written; the logical processing order for a SELECT statement is:
1. FROM
2. ON
3. JOIN
4. WHERE
5. GROUP BY
6. WITH CUBE or WITH ROLLUP
7. HAVING
8. SELECT
9. DISTINCT
10. ORDER BY
Now, when you execute your query without INTO, the selected data gets ordered based on the condition specified in the ORDER BY clause, but when INTO is used, the format of new_table is determined by evaluating the expressions in the select list. (Remember that the ORDER BY clause has not been evaluated yet.)
The columns in new_table are created in the order specified by the select list, but rows cannot be ordered. It's a limitation of the INTO clause; you can refer here:
Specifying an ORDER BY clause does not guarantee the rows are inserted
in the specified order.
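In practice, the ordering belongs on the query that reads the new table, not on the SELECT ... INTO that fills it. A minimal sketch using the table and column from the question:

```sql
-- Ask for the order at read time; the stored row order is irrelevant.
SELECT [WorkOrderID], [ProductID], [OrderQty], [StockedQty]
FROM [AdventureWorks2012].[Production].[WorkOrder_test]
ORDER BY [StockedQty];
```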

Is there a performance difference in using a GROUP BY with MAX() as the aggregate vs ROW_NUMBER over partition by?

Is there a performance difference between the following 2 queries, and if so, then which one is better?:
select
q.id,
q.name
from(
select id, name, row_number() over (partition by name order by id desc) as row_num
from table
) q
where q.row_num = 1
versus
select
max(id) ,
name
from table
group by name
(The result set should be the same)
This is assuming that no indexes are set.
UPDATE: I tested this, and the group by was faster.
I had a table of about 4.5M rows, and I wrote both a MAX with GROUP BY as well as a ROW_NUMBER solution and tested them both. The MAX requires two clustered scans of the table, one to aggregate, and a second to join to the rest of the columns whereas ROW_NUMBER only needed one. (Obviously one or both of these could be indexed to minimize IO, but the point is that GROUP BY requires two index scans.)
According to the optimizer, in my case the ROW_NUMBER is about 60% more efficient according to the subtree cost. And according to statistics IO, about 20% less CPU time. However, in real elapsed time, the ROW_NUMBER solution takes about 80% more real time. So the GROUP BY wins in my case.
This seems to match the other answers here.
The group by should be faster. The row number version has to assign a row number to all rows in the table. It does this before filtering out the ones it doesn't want.
The second query is, by far, the better construct. In the first, you have to be sure that the columns in the partition clause match the columns that you want. More importantly, "group by" is a well-understood construct in SQL. I would also speculate that the group by might make better use of indexes, but that is speculation.
I'd use the group by name.
Not much in it when the index is name, id DESC (Plan 1)
but if the index is declared as name, id ASC (Plan 2) then in 2008 I see the ROW_NUMBER version is unable to use this index and gets a sort operation whereas the GROUP BY is able to use a backwards index scan to avoid this.
You'd need to check the plans on your version of SQL Server and with your data and indexes to be sure.
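For reference, the two index declarations being compared could be written as in the sketch below. The table name dbo.t and the index names are hypothetical stand-ins for the question's placeholder table. Create one or the other, then compare the execution plans of both queries on your own data.

```sql
-- Plan 1 variant: id stored descending within each name.
CREATE INDEX ix_t_name_id_desc ON dbo.t (name, id DESC);

-- Plan 2 variant: id stored ascending within each name.
CREATE INDEX ix_t_name_id_asc ON dbo.t (name, id ASC);
```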