Records issue using top keyword in sql query - sql

I have a Linq query which is working fine but i have noticed when i use take keyword with that query it does not return the same top selected records.
When i saw the Sql profiler query they are totally same excepts just top keyword in that what may be the problem. One more thing i have noticed is when i give a no greater then records in database it works fine with take as well.
I am attaching the query and records attachment
and when I apply top 10 in this query it shows this records
What could be the problem im using Sql Server 2008 R2.

Using TOP keyword without ordering does not guarantee repeatability of resultset.
From here
If a SELECT statement that includes TOP also has an ORDER BY clause,
the rows to be returned are selected from the ordered result set. The
whole result set is built in the specified order and the top n rows in
the ordered result set are returned.

Try forcing the query to order the records by using ORDER BY (or orderby in linq).

The default ordering may differ, try explicitly ordering by a column.

Related

This Oracle SQL SELECT shouldn't work. Why does it?

I am debugging a query in Oracle 19c that is trying to sort a SELECT DISTINCT result by a field not in the query. (Note: This is the wrong way to do it. Do not do this.)
This query is trying to return a unique list of customer names sorted with the most recent sale date first. It returns an expected error, "ORA-01791: not a SELECTed expression".
SELECT DISTINCT CUSTOMER_NAME
FROM SALES
ORDER BY LAST_SALE_DATE DESCENDING;
It returns an error because the query is trying to order the result by a field that has not been selected. This makes sense so far.
However, if I simply add FETCH FIRST 6 ROWS ONLY to the query, it does not return an error (although the result is not correct so do not do this). But the question is why Oracle does not return an error message?
SELECT DISTINCT CUSTOMER_NAME
FROM SALES
ORDER BY LAST_SALE_DATE DESCENDING
FETCH FIRST 6 ROWS ONLY;
Why does adding FETCH FIRST 6 ROWS ONLY make this work?
Added: The incorrect query will return duplicates if there are multiple records with the same name and date. A search for something like "sql select distinct order by another column" will show several correct ways to do this.
The explain plan for the query shows what is happening.
The parser rewrites the fetch... clause - it adds analytic row_number to the select list, and uses an outer query with a filter on row numbers. It pushes the unique directive (the distinct directive from your select) into the subquery, which is not a valid transformation; I would consider this an obvious bug in Oracle's implementation of fetch....
The explain plan shows the step where the parser creates an inline view where it applies unique and it adds analytic row_number(). It doesn't show the projection (what columns are included in the view), and - critically - what it applies unique to. A little bit of experimentation suggests the answer though: it applies unique to the combination of customer_name and last_sales_date.
It's possible that this has been reported and perhaps fixed in recent versions - what is your Oracle version?

Unexpected SQL Behaviour with "SELECT TOP" Query

I'm using Microsoft SQL Server 2019 and when I execute:
SELECT TOP 10 *
FROM WideWorldImporters.Sales.Invoices
SELECT TOP 10 CustomerID
FROM WideWorldImporters.Sales.Invoices
It gives results:
Which is incorrect because those aren't the "top 10" customer IDs as displayed by the first query.
Full Screenshot:
Edit: The behaviour I expected above matches what actually happens in SQL Sever 2014. I suspect they changed the underlying implementation in SQL Server 2019, although it still satisfies the documented behaviour.
A TOP without ORDER BY is unpredictable. This is documented by Microsoft. From Microsoft docs:
When you use TOP with the ORDER BY clause, the result set is limited to the first N number of ordered rows. Otherwise, TOP returns the first N number of rows in an undefined order.
...
In a SELECT statement, always use an ORDER BY clause with the TOP clause. Because, it's the only way to predictably indicate which rows are affected by TOP.
See also how does SELECT TOP works when no order by is specified?
You have no ORDER BY so the top 10 results by an indeterminate order are returned. This ordering can change, from one execution to the next.
Tables in SQL represent unordered sets. If you want particular rows using TOP, you need to have an ORDER BY.
This is too long for a comment.
In SQL, table records are unordered. For this reason, a query like yours:
SELECT TOP 10 * FROM WideWorldImporters.Sales.Invoices
... will produce inconsistent results, because it is missing an ORDER BY clause. You need to tell your RDBMS which column should be used to order the records so in can define which TOP records should be returned. For example, something like:
SELECT TOP 10 * FROM WideWorldImporters.Sales.Invoices ORDER BY InvoiceID DESC
This result is absolutely normal, as your query doesn't specify an order. The top 10 returns the first 10 results.
If you don't specify any ordering clause, the results will be returned according to the previous operations of the SQL engine, which might not always be the same, and surely not what you expect them to be.

Counting results in SQLite, given query with functions

As you may (or may not) already know, SQLite does not provide information about total number of results from the query. One has to wrap the query in SELECT count(*) FROM (original query); in order to get row count.
This worked perfectly fine for me, until one of users created custom SQL function (you can define your own functions in SQLite) that does INSERT into another, unrelated table. Then he executes query:
SELECT customFunction() FROM primaryTable WHERE primaryKeyColumnId = 1;
The query returns always 1 row, that is certain. It turns out that customFunction() was called twice (and inserted to that other table 2 rows) and that's because my application called his query as usuall and then called count(*) on that query as a followup.
How to approach this problem? How to execute only the original query and still have a row count from SQLite?
I'm using SQLite (3.13.0) C API.
You either have to remove such function calls from the query, or you cannot get the row count before actually having stepped through all the result rows.

SQL - Using MAX in a WHERE clause

Assume value is an int and the following query is valid:
SELECT blah
FROM table
WHERE attribute = value
Though MAX(expression) returns int, the following is not valid:
SELECT blah
FROM table
WHERE attribute = MAX(expression)
OF course the desired effect can be achieved using a subquery, but my question is why was SQL designed this way - is there some reason why this sort of thing is not allowed? Students coming from programming languages where you can always replace a data-type by a function call that returns that type find this issue confusing. Is there an explanation one can give them rather than just saying "that's the way it is"?
It's just because of the order of operations of a query.
FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause
WHERE just filters the rows returned by FROM. An aggregate function like MAX() can't have a result returned because it hasn't even been applied to anything.
That's also the reason, why you can't use aliases defined in the SELECT clause in a WHERE clause, but you can use aliases defined in FROM clause.
A where clause checks every row to see if it matches the conditions specified.
A max computes a single value from a row set. If you put a max, or any other aggregate function into a where clause, how can SQL server figure out what rows the max function can use until the where clause has finished it filter?
This deals with the order that SQL Server processes commands in. It runs the WHERE clause before a GROUP BY or any aggregate. Since a where clause runs first, SQL Server can't tell if a row will be included in an aggregate until it processes the where. That is what the HAVING clause is for. HAVING runs after the GROUP BY and the WHERE and can include MAX since you have already filtered out the rows you don't want to use. See http://www.bennadel.com/blog/70-SQL-Query-Order-of-Operations.htm for a good explanation of the order in which SQL commands run.
Maybe this work
SELECT blah
FROM table
WHERE attribute = (SELECT MAX(expresion) FROM table1)
The WHERE clause is specifically designed to test conditions against raw data (individual rows of the table). However, MAX is an aggregate function over multiple rows of data. Basically, without a sub-select, the WHERE clause knows nothing about any rows in the table except for the current row. So how can you determine the maximum value over a whole bunch of rows when you don't even know what those rows are?
Yes, it's a little bit of a simplification, especially when dealing with joins, but the same principle applies. WHERE is always row-by-row, so that's all it really knows about.
Even if you have a GROUP BY clause, the WHERE clause still only processes one row at a time in the raw data before grouping. It doesn't know the value of a column in any other rows, so it has no way of knowing which row has the maximum value.
Assuming this is MS SQL Server, the following would work.
SELECT TOP 1 blah
FROM table
ORDER BY expression DESC

Does SQL Server TOP stop processing once it finds enough rows?

When you use the SQL Server TOP clause in a query, does the SQL Server engine stop searching for rows once it has enough to satisfy the TOP X needed to be returned?
Consider the following queries (assume some_text_field is unique and not set for full-text indexing):
SELECT
pk_id
FROM
some_table
WHERE
some_text_field = 'some_value';
and
SELECT TOP 1
pk_id
FROM
some_table
WHERE
some_text_field = 'some_value';
The first query would need to search the entire table and return all of the results it found. The way we have it setup though, that query would ever really return one value. So, would using TOP 1 prevent SQL server from scanning the rest of the table once it has found a match?
Yes, the query stops once it has found enough rows, and doesn't query the rest of the table(s).
Note however that you would probably want to have an index that the database can use for the query. In that case there isn't really any performance difference between getting the first match and getting all one matches.
Yes.
In this case you would get 1 undefined row (as TOP without ORDER BY doesn't guarantee any particular result) then it would stop processing (The TOP iterator in the plan won't request any more rows from child iterators).
If there is a blocking operator (such as SORT) in the plan before the TOP operator or parallel operators before the TOP it may end up doing a lot of work for rows not returned in the final result anyway though.