I am debugging a query in Oracle 19c that is trying to sort a SELECT DISTINCT result by a field not in the query. (Note: This is the wrong way to do it. Do not do this.)
This query is trying to return a unique list of customer names sorted with the most recent sale date first. It returns an expected error, "ORA-01791: not a SELECTed expression".
SELECT DISTINCT CUSTOMER_NAME
FROM SALES
ORDER BY LAST_SALE_DATE DESC;
It returns an error because the query is trying to order the result by a field that has not been selected. This makes sense so far.
However, if I simply add FETCH FIRST 6 ROWS ONLY to the query, it does not return an error (although the result is not correct, so do not do this). The question is: why does Oracle not return an error message here?
SELECT DISTINCT CUSTOMER_NAME
FROM SALES
ORDER BY LAST_SALE_DATE DESC
FETCH FIRST 6 ROWS ONLY;
Why does adding FETCH FIRST 6 ROWS ONLY make this work?
Added: The incorrect query will return duplicates if there are multiple records with the same name and date. A search for something like "sql select distinct order by another column" will show several correct ways to do this.
The explain plan for the query shows what is happening.
The parser rewrites the fetch... clause - it adds analytic row_number to the select list, and uses an outer query with a filter on row numbers. It pushes the unique directive (the distinct directive from your select) into the subquery, which is not a valid transformation; I would consider this an obvious bug in Oracle's implementation of fetch....
The explain plan shows the step where the parser creates an inline view in which it applies unique and adds analytic row_number(). It doesn't show the projection (which columns are included in the view) or, critically, what it applies unique to. A little experimentation suggests the answer though: it applies unique to the combination of customer_name and last_sale_date.
It's possible that this has been reported and perhaps fixed in recent versions - what is your Oracle version?
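For reference, one of the correct approaches the question alludes to is to group by the name and sort by each customer's latest sale date. The sketch below is only an illustration: it runs the idea against an in-memory SQLite database from Python rather than Oracle, and the sample rows are made up. The GROUP BY / ORDER BY MAX(...) form itself is plain SQL and should carry over to Oracle, but I have not verified it on 19c.

```python
import sqlite3

# In-memory stand-in for the SALES table from the question (sample data is invented).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer_name TEXT, last_sale_date TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("Alice", "2023-01-05"),
        ("Bob", "2023-03-01"),
        ("Alice", "2023-04-10"),
        ("Carol", "2023-05-15"),
        ("Bob", "2023-01-20"),
    ],
)

# One row per customer, ordered by that customer's most recent sale.
rows = conn.execute(
    """
    SELECT customer_name
    FROM sales
    GROUP BY customer_name
    ORDER BY MAX(last_sale_date) DESC
    """
).fetchall()

customers = [name for (name,) in rows]
print(customers)
```

Because the sort key is an aggregate over each group, every name appears exactly once, which is what the DISTINCT version could not guarantee.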
For example, I know what SELECT * FROM example_table; means. However, I feel uncomfortable not knowing what each part of the code means.
The second part of a SQL query is the name of the column you want to retrieve for each record you are getting.
You can obviously retrieve multiple columns for each record, and (only if you want to retrieve all the columns) you can replace the list of them with *, which means "all columns".
So, in a SELECT statement, writing * is the same as listing all the columns the table has.
Here you can find probably the best tutorial for SQL learning.
Here is the answer, taking each part of the code separately.
SELECT == It tells the database to select (retrieve) data from a table.
(*) == means all columns {up to here, the code means "retrieve everything".}
FROM == It says where the data should be selected from.
example_table == This is the name of the table the data is selected from.
The overall meaning is:
retrieve all the data from the table whose name is example_table.
thanks.
For a beginner, knowing the following concepts can be really useful:
SELECT refers to the attributes (columns) that you want displayed in your final query result. There are variations such as SELECT DISTINCT, which returns only unique values (if there were duplicate values in the original query result).
FROM names the table you want the data from. There can be one or many tables listed in the FROM clause.
WHERE states the condition that rows must satisfy. You can also sort the result with ORDER BY; add DESC for descending order (there is no need to write ASC, because ascending is the default once you use ORDER BY).
Refer to W3schools for a better understanding.
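To make the clauses described in these answers concrete, here is a small sketch that runs them against an in-memory SQLite database from Python. The columns (id, name, city) and the rows are invented for illustration; only the table name example_table comes from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical columns; the question only names the table.
conn.execute("CREATE TABLE example_table (id INTEGER, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO example_table VALUES (?, ?, ?)",
    [(1, "Ann", "Oslo"), (2, "Ben", "Paris"), (3, "Cy", "Oslo"), (4, "Dee", "Rome")],
)

# SELECT * : every column of every row in the table.
all_rows = conn.execute("SELECT * FROM example_table").fetchall()

# SELECT DISTINCT : duplicates removed from the result.
cities = conn.execute(
    "SELECT DISTINCT city FROM example_table ORDER BY city"
).fetchall()

# WHERE filters rows; ORDER BY ... DESC sorts in descending order.
oslo_names = conn.execute(
    "SELECT name FROM example_table WHERE city = 'Oslo' ORDER BY name DESC"
).fetchall()

print(all_rows)
print(cities)
print(oslo_names)
```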
At my office, one of the tables we use keeps track of our Order Numbers. The problem is that the employees don't enter the number consistently into the database field.
Some of the examples are listed:
'7-26-13 543006-27031', '345009-27031', 'KWYD-863009-27031'.
I need to find a way to return just the 'nnnnnn-nnnnn' substring no matter where in the field it is. Most of the time, this pattern is at the end of the field, but that is not always the case. I've already limited the data records to just those with that pattern using a LIKE expression in my WHERE clause, but I have no idea how best to return just that pattern as a column.
Edit:
We are still using SQL Server 2000
What I'm looking to do is along the lines of:
SELECT SUBSTRING(VendorOrderNo, ??, 12) AS OrderNo
FROM Orders
WHERE VendorOrderNo LIKE '%[0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9][0-9]%'
select 'nnnnnn-nnnnn' as employeeid
from table
where employees like'%nnnnnn-nnnnn'
It would be better if you adjusted your data first. You can give this a try (it's for MySQL):
SELECT * FROM your_table_name WHERE order_number REGEXP '[0-9]+-[0-9]+';
SQLFIDDLE
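If doing the extraction outside the database is acceptable, a client-side sketch is shown below. It uses Python's re module (my assumption; the question does not mention a client language) to pull out the first six-digits, hyphen, five-digits run from each sample value in the question. On the SQL Server side, PATINDEX with the same bracket pattern as the LIKE expression returns the position of the first match, which looks like a natural candidate for the ?? in the SUBSTRING call, though I have not tested that on SQL Server 2000.

```python
import re

# Six digits, a hyphen, five digits: the same shape as the LIKE pattern in the question.
ORDER_NO = re.compile(r"[0-9]{6}-[0-9]{5}")


def extract_order_no(raw):
    """Return the first nnnnnn-nnnnn substring in raw, or None if there is none."""
    match = ORDER_NO.search(raw)
    return match.group(0) if match else None


# Sample values taken from the question.
samples = ["7-26-13 543006-27031", "345009-27031", "KWYD-863009-27031"]
extracted = [extract_order_no(s) for s in samples]
print(extracted)
```

Like the LIKE pattern, this does not check what surrounds the match, so a longer run of digits would still yield a 12-character slice.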
Is the following possible according to standard(!) SQL?
What minimal changes would be necessary to make it conform to the standard (if it does not already)?
It works as expected in MySQL, iff the first row has the maximum value for NumberOfPages.
SELECT *
FROM Book
HAVING NumberOfPages = MAX(NumberOfPages)
The following is written in the standard:
HAVING <search condition>
Let G be the set consisting of every column referenced by a <column reference> contained in the <group by clause>.
Each column reference directly contained in the <search condition> shall be one of the following:
An unambiguous reference to a column that is functionally dependent on G.
An outer reference.
source
Can somebody explain to me why it should be possible according to the standard?
In MySQL, it works perfectly.
Despite the Mimer Validator result, I don't believe yours is valid Standard SQL.
A HAVING clause without a GROUP BY clause is valid and (arguably) useful syntax in Standard SQL. Because it operates on the table expression all-at-once as a set, so to speak, it only really makes sense to use aggregate functions. In your example:
Book HAVING NumberOfPages = MAX(NumberOfPages)
is not valid because when considering the whole table, which row does NumberOfPages refer to? Likewise, it only makes sense to use literal values in the SELECT clause.
Consider this example, which is valid Standard SQL:
SELECT 'T' AS result
FROM Book
HAVING MIN(NumberOfPages) < MAX(NumberOfPages);
Despite the absence of the DISTINCT keyword, the query will never return more than one row. If the HAVING clause is satisfied then the result will be a single row with a single column containing the value 'T' (indicating we have books with differing numbers of pages), otherwise the result will be the empty set i.e. zero rows with a single column.
I think the reason the query does not error in MySQL is proprietary extensions that cause the HAVING clause to (logically) come into existence after the SELECT clause (the Standard behaviour is the other way around), coupled with the implicit GROUP BY clause mentioned in other answers.
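As for what the question's query is presumably after (the book or books with the largest page count), a form that does not rely on HAVING at all is a scalar subquery in the WHERE clause. The sketch below runs it against an in-memory SQLite database from Python; the Title column and the sample rows are invented for illustration, and only Book and NumberOfPages come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Title is a hypothetical column added so the rows can be told apart.
conn.execute("CREATE TABLE Book (Title TEXT, NumberOfPages INTEGER)")
conn.executemany(
    "INSERT INTO Book VALUES (?, ?)",
    [("Short", 120), ("Long", 900), ("Medium", 350), ("AlsoLong", 900)],
)

# Compare each row against the table-wide maximum computed by the subquery.
longest = conn.execute(
    """
    SELECT Title, NumberOfPages
    FROM Book
    WHERE NumberOfPages = (SELECT MAX(NumberOfPages) FROM Book)
    ORDER BY Title
    """
).fetchall()
print(longest)
```

Note that the first row inserted is not the maximum, so the result does not depend on row order, and ties are all returned.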
“When GROUP BY is not used, HAVING behaves like a WHERE clause.”
The difference between where and having: WHERE filters ROWS while HAVING filters groups
SELECT SUM(spending) as totSpending
FROM militaryspending
HAVING SUM(spending) > 200000;
Result
totSpending
1699154.3
For more detail, see:
https://dba.stackexchange.com/questions/57445/use-of-having-without-group-by-in-sql-queries/57453
From the standard (bold added for emphasis):
1) Let HC be the having clause. Let TE be the table expression that immediately contains HC. If TE does not immediately contain a group by clause, then “GROUP BY ()” is implicit. Let T be the descriptor of the table defined by the GBC immediately contained in TE and let R be the result of GBC.
With the implicit group by clause, the outer reference can access the TE columns.
However, the certification to these standards is very much a self-certification these days, and the example you gave would not work across all of the main RDBMS providers.
Yes, we can write a query with HAVING but without GROUP BY, as long as the query uses aggregate functions:
select sum(Salary) from ibs having max(Salary)>1000
I know:
Firebird: FIRST and SKIP;
MySQL: LIMIT;
SQL Server: ROW_NUMBER();
Does anyone know an ANSI SQL way to perform result paging?
See Limit—with offset section on this page: http://troels.arvin.dk/db/rdbms/
BTW, Firebird also supports ROWS clause since version 2.0
No official way, no.*
Generally you'll want to have an abstracted-out function in your database access layer that will cope with it for you; give it a hint that you're on MySQL or PostgreSQL and it can add a 'LIMIT' clause to your query, or rownum over a subquery for Oracle and so on. If it doesn't know it can do any of those, fall back to fetching the lot and returning only a slice of the full list.
*: eta: there is now, in ANSI SQL:2003. But it's not globally supported, it often performs badly, and it's a bit of a pain because you have to move/copy your ORDER into a new place in the statement, which makes it harder to wrap automatically:
SELECT * FROM (
SELECT thiscol, thatcol, ROW_NUMBER() OVER (ORDER BY mtime DESC, id) AS rownumber
FROM sometable
) t
WHERE rownumber BETWEEN 10 AND 20 -- care, 1-based index
ORDER BY rownumber;
There is also the "FETCH FIRST n ROWS ONLY" suffix in SQL:2008 (and DB2, where it originated). But like the TOP prefix in SQL Server, and the similar syntax in Informix, you can't specify a start point, so you still have to fetch and throw away some rows.
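The "abstracted-out function" idea from this answer could look roughly like the sketch below. Everything in it is hypothetical: the function names, the dialect strings, and the grouping of databases are my own simplification (for instance, the OFFSET/FETCH branch assumes SQL Server 2012+ or Oracle 12c+, and older Oracle would need the rownum-over-subquery approach instead). It also assumes the incoming SQL already ends with its ORDER BY clause.

```python
def add_paging(sql, dialect, offset, limit):
    """Append a paging clause for the given (hypothetical) dialect name.

    Returns (sql, handled): handled is False when the dialect is unknown,
    in which case the caller has to slice the full result itself.
    """
    offset, limit = int(offset), int(limit)  # keep arbitrary text out of the SQL
    if dialect in ("mysql", "postgresql", "sqlite"):
        return f"{sql} LIMIT {limit} OFFSET {offset}", True
    if dialect in ("standard", "sqlserver", "oracle"):
        # Assumes a version that supports the OFFSET ... FETCH syntax.
        return f"{sql} OFFSET {offset} ROWS FETCH NEXT {limit} ROWS ONLY", True
    return sql, False


def slice_rows(rows, offset, limit):
    """Fallback: fetch the lot and return only the requested slice."""
    return rows[offset:offset + limit]


base = "SELECT * FROM t ORDER BY id"
print(add_paging(base, "mysql", 20, 10))
print(add_paging(base, "standard", 20, 10))
print(add_paging(base, "unknown-db", 20, 10))
```

The boolean in the return value is what lets the caller decide whether to fall back to slice_rows on the client.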
Nowadays there is a standard way, though not necessarily an ANSI standard (people have given many answers; I think this is the least verbose one):
SELECT * FROM t1
WHERE ID > :lastId
ORDER BY ID
FETCH FIRST 3 ROWS ONLY
It's not supported by all databases though. Below is a list of the databases that support it:
MariaDB: Supported since 5.1 (usually, limit/offset is used)
MySQL: Supported since 3.19.3 (usually, limit/offset is used)
PostgreSQL: Supported since PostgreSQL 8.4 (usually, limit/offset is used)
SQLite: Supported since version 2.1.0
Db2 LUW: Supported since version 7
Oracle: Supported since version 12c (uses subselects with the row_num function)
Microsoft SQL Server: Supported since 2012 (traditionally, top-N is used)
You can use the offset style of course, although you could have performance issues
SELECT * FROM t1
ORDER BY ID
OFFSET 0 ROWS
FETCH FIRST 3 ROWS ONLY
Support for this form is different:
MariaDB: Supported since 5.1
MySQL: Supported since 4.0.6
PostgreSQL: Supported since PostgreSQL 6.5
SQLite: Supported since version 2.1.0
Db2 LUW: Supported since version 11.1
Oracle: Supported since version 12c
Microsoft SQL Server: Supported since 2012
Yes (SQL ANSI 2003), feature E121-10, combined with the F861 feature you have :
ORDER BY column OFFSET n1 ROWS FETCH NEXT n2 ROWS ONLY;
Like:
SELECT Name, Address FROM Employees ORDER BY Salary OFFSET 2 ROWS
FETCH NEXT 2 ROWS ONLY;
Examples:
postgres:
https://dbfiddle.uk/?rdbms=postgres_9.5&fiddle=e25bb5235ccce77c4f950574037ef379
oracle:
https://dbfiddle.uk/?rdbms=oracle_21&fiddle=07d54808407b9dbd2ad209f2d0fe7ed7
sqlserver:
https://dbfiddle.uk/?rdbms=sqlserver_2019l&fiddle=e25bb5235ccce77c4f950574037ef379
db2:
https://dbfiddle.uk/?rdbms=db2_11.1&fiddle=e25bb5235ccce77c4f950574037ef379
YugabyteDB:
https://dbfiddle.uk/?rdbms=yugabytedb_2.8&fiddle=e25bb5235ccce77c4f950574037ef379
Unfortunately, MySQL does not support this syntax; you need something like:
ORDER BY column LIMIT n2 OFFSET n1
But MariaDB does:
https://dbfiddle.uk/?rdbms=mariadb_10.6&fiddle=e25bb5235ccce77c4f950574037ef379
I know I'm very, very late to this question, but it's still one of the top results for this issue.
However, one response missing from this question is that I believe the "correct" ANSI SQL method for paging, at least if you want maximum portability, is not to use LIMIT/OFFSET/FIRST etc. at all, but instead to do something like:
SELECT *
FROM MyTable
WHERE ColumnA > ?
ORDER BY ColumnA ASC
Where ? is a parameter using a library that supports them (such as PDO in PHP).
The idea here is simple: when fetching the first page we pass a parameter that will match every possible row, e.g. if ColumnA is text, we would pass an empty string (''). We then read in as many results as we want, and then release the rest. This may mean some extra rows are fetched behind the scenes, but our priority here is compatibility.
In order to fetch the next page, we take the value of ColumnA from the last row in our results, and pass it in as the parameter, this way we will only fetch values that appear after it. To run the same query in the other direction, just swap > for < and ASC for DESC.
There are some important caveats of this approach:
Since we're using a condition, your DBMS is free to use an index to optimise the request, which can actually be faster than some "proper" pagination methods, as you eliminate rows rather than advancing past them.
This form of paging is more tightly anchored than row number based methods. When using row number offsets, if you offset into the table, but new rows are added that sort earlier than the current page, then it will cause results to be shifted into later pages. For example, if your current page's last row is mango but since fetching it rows are added for apple and carrot, then mango may now appear on the next page as well, as it has been shifted in the sort order. By using a condition of ColumnA > 'mango' this can't happen. This can be very useful in cases where you are sorting by a DATETIME with frequent updates occurring.
This trick can be made to work in both directions, by reversing the sort order as mentioned when going backwards (flip > to < and ASC to DESC) and passing in the value of ColumnA from the first row of each page of results, rather than the last. Note that if values were added to your table, it may mean that your first page may be shorter, but this is a fairly minor issue.
To be sure you're on the last (or first) page, you should fetch N + 1 rows, where N is the number of rows you want per page, this way you can detect whether there are more rows to fetch.
This method works best if you have a single column with only unique values, but it is still possible to use in more complex cases, so long as you can expand your ORDER BY clause (and WHERE condition) to include enough columns that every row is unique.
So it's not without a few catches, but it's by far the most compatible method as every SQL database will support it.
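Here is a minimal sketch of this approach, using Python's sqlite3 module in place of PDO (my substitution; any driver with parameter binding should do). The table contents and the page size are made up. It reads N + 1 rows from the cursor, as suggested above, to detect whether another page exists.

```python
import sqlite3

PAGE_SIZE = 3  # N rows per page (arbitrary for this sketch)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ColumnA TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?)",
    [(v,) for v in ["apple", "banana", "carrot", "grape", "kiwi", "mango", "peach"]],
)


def fetch_page(conn, last_seen):
    """Return (values, has_more) for the page that follows last_seen."""
    cursor = conn.execute(
        "SELECT ColumnA FROM MyTable WHERE ColumnA > ? ORDER BY ColumnA ASC",
        (last_seen,),
    )
    rows = cursor.fetchmany(PAGE_SIZE + 1)  # read N + 1 rows ...
    cursor.close()                          # ... and release the rest
    has_more = len(rows) > PAGE_SIZE
    return [row[0] for row in rows[:PAGE_SIZE]], has_more


page1, more1 = fetch_page(conn, "")          # empty string matches every text value
page2, more2 = fetch_page(conn, page1[-1])   # continue after the last value seen
page3, more3 = fetch_page(conn, page2[-1])
print(page1, more1)
print(page2, more2)
print(page3, more3)
```

The only state carried between pages is the last ColumnA value, which is why rows inserted earlier in the sort order cannot shift the next page.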
Insert your results into a storage table, ordered how you'd like to display them, but with a new IDENTITY column.
Now SELECT from that table just the range of IDs you're interested in.
(Be sure to clean out the table when you're done)
Or do it on the client, as anything to do with presentation should not normally be done on the SQL Server (in my opinion)
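Example using TOP (this is SQL Server syntax rather than ANSI SQL; both queries need the same ORDER BY for the pages to be stable):
offset=41, fetchsize=10
SELECT TOP(10) *
FROM table1
WHERE table1.ID NOT IN (SELECT TOP(40) table1.ID FROM table1 ORDER BY table1.ID)
ORDER BY table1.ID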
For paging we need a RowNo column to filter on (it should be computed over a unique field such as id), together with two variables like #PageNo and #PageRows. The subquery counts the rows with an id less than or equal to the current one, so RowNo starts at 1. So I use this query:
SELECT *
FROM (
SELECT *, (SELECT COUNT(1)
FROM aTable ti
WHERE ti.id <= t.id) As RowNo
FROM aTable t) tr
WHERE
tr.RowNo >= (#PageNo - 1) * #PageRows + 1
AND
tr.RowNo <= #PageNo * #PageRows
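A runnable sketch of this correlated-subquery technique follows, using an in-memory SQLite database from Python. The label column, the sample ids, and the named parameters :page_no and :page_rows (standing in for #PageNo and #PageRows) are my own assumptions. The subquery counts rows with ti.id <= t.id so that RowNo starts at 1, which is what the page filter expects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# label is a hypothetical column; ids are deliberately not consecutive.
conn.execute("CREATE TABLE aTable (id INTEGER PRIMARY KEY, label TEXT)")
conn.executemany(
    "INSERT INTO aTable VALUES (?, ?)",
    [(i * 10, f"row{i}") for i in range(1, 8)],  # ids 10, 20, ..., 70
)


def fetch_page(conn, page_no, page_rows):
    """Return the ids on the given 1-based page, ordered by id."""
    rows = conn.execute(
        """
        SELECT tr.id
        FROM (
            SELECT t.id,
                   (SELECT COUNT(1) FROM aTable ti WHERE ti.id <= t.id) AS RowNo
            FROM aTable t
        ) tr
        WHERE tr.RowNo >= (:page_no - 1) * :page_rows + 1
          AND tr.RowNo <= :page_no * :page_rows
        ORDER BY tr.RowNo
        """,
        {"page_no": page_no, "page_rows": page_rows},
    ).fetchall()
    return [row[0] for row in rows]


print(fetch_page(conn, 1, 3))
print(fetch_page(conn, 2, 3))
print(fetch_page(conn, 3, 3))
```

The count is recomputed for every row, so this is likely to be slow on large tables; it is mainly of interest where window functions are unavailable.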
BTW, Troels, PostgreSQL supports Limit/Offset