SQL - In Select Query if I use "Not IN" then it returns result set in random order - sql

I have used NOT IN clause in Select Statement. When I run that query, each time it returns the same result set but the order is different.
Is this the default behavior of "NOT IN" clause?
The query which I am using is as below:
SELECT *,(ISNULL(AppFirstName,'')+' '+ISNULL(AppMiddleName,'')+' '+ISNULL(AppLastName,'')) as AppName FROM BApp AF WHERE AF.SId=11 AND AF.SCId=5 AND AF.CCId= 1 AND AF.IsActive=1 AND AF.ASId=16 AND AF.AId NOT IN (SELECT AId FROM NumberDetails where AId = AF.AId)

The order of an SQL result is not defined and left for the database to pick unless you use an ORDER clause. If you need to know more, post the query and what DB you are using.

If you don't specify an ORDER BY clause, then no query has a defined order. The database is free to return you the rows in whatever order is easiest for it.
The reason this sometimes seems consistent is that the rows will often be read out either in the order they exist on disk (probably the order they were inserted) or in the order of some index that was used to find the result.
The more complex your query, the more complex the processing the database needs to do, so the less likely the results are to come out in some obvious, repeatable, order.
Moral of the story: always use an ORDER BY clause.

SQL, by default, does not order or sort the records it returns. This behavior isn't specific to 'NOT IN', but is a general premise of the language. However, you can easily order your results by adding an 'ORDER BY table.column_name' to the end of your query.

Related

why distinct and order by doesn't work together in sql query?

enter image description here I am learning how to order by is used in SQL query, then I learned that order by and distinctly don't work together but, when I try to do it practically it worked. I am so confused even after asking chatgpt what the relationship is between order by and distinct.
I learned that when executing SQL queries, the ORDER BY clause comes after the SELECT clause. This means that the database will first retrieve the data specified in the SELECT clause, and then sort it based on the criteria specified in the ORDER BY clause. If the column used in the ORDER BY clause is not present in the SELECT clause, the database will automatically include that column in the select and do order by on both the columns and give result column only column given in SELECT.
However, when using both DISTINCT and ORDER BY together, the outcome may not be what is expected. This is because DISTINCT acts on both the columns in the SELECT clause and the column in the ORDER BY clause. This may cause unexpected results, especially in MySQL.
I found that when I tried this in practice, it still produced the desired results, which makes me question if I learned something incorrectly or if there is missing information that I am unaware of.
I am using the MYSQL database.
It seems there are some spaces before your unique value. It will be appropriate to perform TRIM/RTRIM and remove these spaces in the DISTINC clause itself.
It should be something like this:
DISTINCT TRIM(value) AS trim_value
...
ORDER BY trim_value
Also, it is possible that these are not spaces but some other characters which need to be replace, too.

Are postgresql `SELECT DISTINCT` queries deterministic?

Are Postgres SELECT DISTINCT queries deterministic?
Will SELECT DISTINCT somecolumn FROM sometable return the same result (including order) if the table (and entire database) goes unchanged?
In the Select Query Documentation the Description section notes:
If the ORDER BY clause is specified, the returned rows are sorted in the specified order. If ORDER BY is not given, the rows are returned in whatever order the system finds fastest to produce.
In the DISTINCT ON clause section they add:
Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first.
Generally, is this still true when the database goes un-changed?
This answer assumes that the expressions in the select are deterministic. Otherwise, the question seems trivial.
The ordering is not specified, so it could change between runs of the query -- or on a different system. However, the result set should be the same.
Your second quote from the documentation is for distinct on. That is not-deterministic, unless you are using a stable sort.
Note: You might get non-deterministic results if you are using a case-insensitive collation. The built-in collations are case-sensitive; and case insensitivity means that the original expressions are not deterministic.

how to resolve this - group by changes the Order of items in SQL Server

I'm using SQL server 2014,I'm fetching data from a view.The order of items is getting changed once i use Group by ,how can i get the order back after using this Group by,There is one date column,but its not saving any time,So i can't sort it based on date also..
How can I display the data in the same order as it displayed before using Group by?Anyone have any idea please help?
Thanks
Tables and views are essentially unordered sets. To get rows in a specific order, you should always add an ORDER BY clause on the columns you wish to order on.
I'm assuming you previously selected from the VIEW without an ORDER BY clause. The order in which rows are returned from a SELECT statement without an ORDER BY statement is undefined. The order you are getting them in, can change due to any number of reasons (eg some are listed here).
Your problem stems from the mistake you made on relying on the order from a SELECT from a VIEW without an ORDER BY. You should have had an ORDER BY clause in your SELECT statement to begin with.
How can I display the data in the same order as it displayed before using Group by?
The answer: You can't if your initial statement did not have an ORDER BY clause.
The resolution: Determine the order you want the resultset in and add an ORDER BY clause on those columns, both in your initial query and the version with the GROUP BY clause.
Maybe you can use the row_number() function without any OVER and ORDER BY keywords? This should be done in a sub-select and when you group the data in the outer SELECT, use the AVG() function on the numbered column and ORDER the result by this. The problem is, that when you group rows, the original rows disappear. That's kind if the purpose of GROUP BY. ;) Depending on what you GROUP BY, what you're asking might be logically impossible.
EDIT:
Found this solution Googling: http://blog.sqlauthority.com/2015/05/05/sql-server-generating-row-number-without-ordering-any-columns/
So you can number rows like this to maintain the order of rows from the table before you GROUP BY:
row_number() OVER (ORDER BY (SELECT 1))
The only way you can enforce a specific order is to explicitly use a ORDER BY clause. Otherwise the order of rows is not guaranteed (take a look at this article for more details) and the database engine will return the rows based on "as fast as it can" or "as fast as it can retrieve them from disk" rule. So, order can also vary between executions of the same query in the span of a few seconds.
When doing a DISTINCT, GROUP BY or ORDER BY, SQL Server automatically does a SORT of the data based on an index it uses for that query.
Looking at the execution plan of your query will show you what index (and implicitly columns in that index) is being used to sort the data.

SQL - Using MAX in a WHERE clause

Assume value is an int and the following query is valid:
SELECT blah
FROM table
WHERE attribute = value
Though MAX(expression) returns int, the following is not valid:
SELECT blah
FROM table
WHERE attribute = MAX(expression)
OF course the desired effect can be achieved using a subquery, but my question is why was SQL designed this way - is there some reason why this sort of thing is not allowed? Students coming from programming languages where you can always replace a data-type by a function call that returns that type find this issue confusing. Is there an explanation one can give them rather than just saying "that's the way it is"?
It's just because of the order of operations of a query.
FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause
WHERE just filters the rows returned by FROM. An aggregate function like MAX() can't have a result returned because it hasn't even been applied to anything.
That's also the reason, why you can't use aliases defined in the SELECT clause in a WHERE clause, but you can use aliases defined in FROM clause.
A where clause checks every row to see if it matches the conditions specified.
A max computes a single value from a row set. If you put a max, or any other aggregate function into a where clause, how can SQL server figure out what rows the max function can use until the where clause has finished it filter?
This deals with the order that SQL Server processes commands in. It runs the WHERE clause before a GROUP BY or any aggregate. Since a where clause runs first, SQL Server can't tell if a row will be included in an aggregate until it processes the where. That is what the HAVING clause is for. HAVING runs after the GROUP BY and the WHERE and can include MAX since you have already filtered out the rows you don't want to use. See http://www.bennadel.com/blog/70-SQL-Query-Order-of-Operations.htm for a good explanation of the order in which SQL commands run.
Maybe this work
SELECT blah
FROM table
WHERE attribute = (SELECT MAX(expresion) FROM table1)
The WHERE clause is specifically designed to test conditions against raw data (individual rows of the table). However, MAX is an aggregate function over multiple rows of data. Basically, without a sub-select, the WHERE clause knows nothing about any rows in the table except for the current row. So how can you determine the maximum value over a whole bunch of rows when you don't even know what those rows are?
Yes, it's a little bit of a simplification, especially when dealing with joins, but the same principle applies. WHERE is always row-by-row, so that's all it really knows about.
Even if you have a GROUP BY clause, the WHERE clause still only processes one row at a time in the raw data before grouping. It doesn't know the value of a column in any other rows, so it has no way of knowing which row has the maximum value.
Assuming this is MS SQL Server, the following would work.
SELECT TOP 1 blah
FROM table
ORDER BY expression DESC

Using limit in sqlite SQL statement in combination with order by clause

Will the following two SQL statements always produce the same result set?
1. SELECT * FROM MyTable where Status='0' order by StartTime asc limit 10
2. SELECT * FROM (SELECT * FROM MyTable where Status='0' order by StartTime asc) limit 10
Yes, but ordering subqueries is probably a bad habit to get into. You could feasibly add a further ORDER BY outside the subquery in your second example, e.g.
SELECT *
FROM (SELECT *
FROM Test
ORDER BY ID ASC
) AS A
ORDER BY ID DESC
LIMIT 10;
SQLite still performs the ORDER BY on the inner query, before sorting them again in the outer query. A needless waste of resources.
I've done an SQL Fiddle to demonstrate so you can view the execution plans for each.
No. First because the StartTime column may not have UNIQUE constraint. So, even the first query may not always produce the same result - with itself!
Second, even if there are never two rows with same StartTime, the answer is still negative.
The first statement will always order on StartTime and produce the first 10 rows. The second query may produce the same result set but only with a primitive optimizer that doesn't understand that the ORDER BY in the subquery is redundant. And only if the execution plan includes this ordering phase.
The SQLite query optimizer may (at the moment) not be very bright and do just that (no idea really, we'll have to check the source code of SQLite*). So, it may appear that the two queries are producing identical results all the time. Still, it's not a good idea to count on it. You never know what changes will be made in a future version of SQLite.
I think it's not good practice to use LIMIT without ORDER BY, in any DBMS. It may work now, but you never know how long these queries will be used by the application. And you may not be around when SQLite is upgraded or the DBMS is changed.
(*) #Gareth's link provides the execution plan which suggests that current SQLite code is dumb enough to execute the redundant ordering.