SQL GROUP BY 1 2 3 and SQL Order of Execution - sql

This may be a dumb question but I am really confused. So according to the SQL Query Order of Execution, the GROUP BY clause will be executed before the SELECT clause. However it allows to do something like:
SELECT field_1, SUM(field_2) FROM myTable GROUP BY 1
My confusion is that if GROUP BY clause happens before SELECT, in this scenario I provided, how does SQL know what 1 is? It works with ORDER BY clause and it makes sense to me because ORDER BY clause happens after SELECT.
Can someone help me out? Thanks in advance!
https://www.periscopedata.com/blog/sql-query-order-of-operations

My understanding is because it's ordinal notation and for the SELECT statement to pass syntax validation you have to have at least selected a column. So the 1 is stating the first column in the select statement since it knows you have a column selected.
EDIT:
I see people saying you can't use ordinal notation and they are right if you're using SQL Server. You can use it in MySQL though.

select a,b,c from emp group by 1,2,3. First it will group by column a then b and c. It works based on the column after the select statement.

Each GROUP BY expression must contain at least one column that is not an outer reference. You cannot group by 1 if it is not a column in your table.

Related

Query working in MySQL but trowing error in Oracle can someone please explain. and tell me how to rewrite this same query in oracle to avoid error [duplicate]

I have a query
SELECT COUNT(*) AS "CNT",
imei
FROM devices
which executes just fine. I want to further restrict the query with a WHERE statement. The (humanly) logical next step is to modify the query followingly:
SELECT COUNT(*) AS "CNT",
imei
FROM devices
WHERE CNT > 1
However, this results in a error message ORA-00904: "CNT": invalid identifier. For some reason, wrapping the query in another query produces the desired result:
SELECT *
FROM (SELECT COUNT(*) AS "CNT",
imei
FROM devices
GROUP BY imei)
WHERE CNT > 1
Why does Oracle not recognize the alias "CNT" in the second query?
Because the documentation says it won't:
Specify an alias for the column
expression. Oracle Database will use
this alias in the column heading of
the result set. The AS keyword is
optional. The alias effectively
renames the select list item for the
duration of the query. The alias can
be used in the order_by_clause but not
other clauses in the query.
However, when you have an inner select, that is like creating an inline view where the column aliases take effect, so you are able to use that in the outer level.
The simple answer is that the AS clause defines what the column will be called in the result, which is a different scope than the query itself.
In your example, using the HAVING clause would work best:
SELECT COUNT(*) AS "CNT",
imei
FROM devices
GROUP BY imei
HAVING COUNT(*) > 1
To summarize, this little gem explains:
10 Easy Steps to a Complete Understanding of SQL
A common source of confusion is the simple fact that SQL syntax
elements are not ordered in the way they are executed. The lexical
ordering is:
SELECT [ DISTINCT ]
FROM
WHERE
GROUP BY
HAVING
UNION
ORDER BY
For simplicity, not all SQL clauses are listed. This lexical ordering
differs fundamentally from the logical order, i.e. from the order of
execution:
FROM
WHERE
GROUP BY
HAVING
SELECT
DISTINCT
UNION
ORDER BY
As a consequence, anything that you label using "AS" will only be available once the WHERE, HAVING and GROUP BY have already been performed.
I would imagine because the alias is not assigned to the result column until after the WHERE clause has been processed and the data generated. Is Oracle different from other DBMSs in this behaviour?

Oracle query mistake

I need to know where the mistake is in this oracle query?
SELECT(KEY1),COUNT(*) FROM TABLE1 GROUP BY AGE
SELECT KEY1,COUNT(*) FROM TABLE1 GROUP BY KEY1
There are two problems. First one: You cannot close the parenthesis after the first keyword. Second: You have to group by all keys that are in the query that are not all row dependend. In that case "KEY1". If you want to order by age you have to query age as parameter.
SELECT AGE,COUNT(*) FROM TABLE1 GROUP BY AGE
Your table naming is not very good. I assume you should have a look at group by tutorials like https://www.w3schools.com/sql/sql_groupby.asp or the sql tutorial https://www.w3schools.com/sql/
Your query had an issue. You have to modify your query as below
SELECT KEY1,COUNT(*) FROM TABLE1 GROUP BY KEY1.
Observation:
All the columns that are added in the select statement alongside the aggregate functions, should be included the group by columns.
Your first column does have the bracket in it which should be removed.

Difference between "HAVING ... GROUP BY" and "GROUP BY ... HAVING"

I have got the table MYTABLE with 2 columns: A and B
I have got the following pieces of the code:
SELECT MYTABLE.A FROM MYTABLE
HAVING SUM(MYTABLE.B) > 100
GROUP BY MYTABLE.A
and
SELECT MYTABLE.A FROM MYTABLE
GROUP BY MYTABLE.A
HAVING SUM(MYTABLE.B) > 100
Is it the same? Is it possible that these 2 codes will return diffrent sets of results?
Thank you in advance
As documented, there is no difference. People are just used to seeing HAVING after GROUP BY.
http://docs.oracle.com/cd/B28359_01/server.111/b28286/statements_10002.htm#SQLRF20040
Specify GROUP BY and HAVING after the where_clause and hierarchical_query_clause. If you specify both GROUP BY and HAVING, then they can appear in either order.
http://sqlfiddle.com/#!4/66e33/1
I originally wrote:
I am not sure your 1st query is valid. As far as I know, HAVING should always come after GROUP BY.
I was corrected by David Aldridge, the Oracle docs state that the order does not matter. Although I don't recommend using HAVING before GROUP for readability reasons (and to prevent confusion with a WHERE clause), it is technically correct. So that makes the answer to your question 'yes, it's the same'.
You can't have a HAVING before a GROUP BY, the HAVING is like the "WHERE" but for the GROUP BY condition.
The clauses are evaluated in order. You can have a HAVING clause following immediately the FROM clause. In this case, the HAVING clause will apply to the entire rows of the result set. The select list may only contain, in this case, one/more/all of the aggregation functions contained in the HAVING clause.
So, your first query is not valid because of the above. A valid query would be
SELECT SUM(MYTABLE.B) AS s FROM MYTABLE
HAVING SUM(MYTABLE.B) > 100
The above query will return one or no row, depending on whether the condition SUM(MYTABLE.B) > 100 is verified or not.
Still, there is one more reason for which your first query is not valid. The GROUP BY clause may refer only to columns in the data set to which it applies. So going on with my valid query above, you can write the following valid query (though it will be useless and nonsense, as it is applied to either one or no rows):
SELECT SUM(s)
FROM
(
SELECT SUM(MYTABLE.B) s
FROM MYTABLE
HAVING SUM(MYTABLE.B) > 100
) q
GROUP BY s
So, just to answer: no, they're not the same. One of them is not even valid.
both WHERE and HAVING allow for the imposition of conditions in the query. Difference:
We use WHERE for the records returned by select from the table,
We use HAVING for groups returned by group by select query

Clarification on "rownum"

I have a table Table1
Name Date
A 01-jun-2010
B 03-dec-2010
C 12-may-2010
When i query this table with the following query
select * from table1 where rownum=1
i got output as
Name Date
A 01-jun-2010
But in the same way when i use the following queries i do not get any output.
select * from table1 where rownum=2
select * from table1 where rownum=3
Someone please give me guidance why it works like that, and how to use the rownum.
Tom has an answer for many Oracle related questions
In short, rownum is available after the where clause has been applied and before the order by clause is applied.
In the case of RowNum=2, the predicate in the where clause will never evaluate to true as RowNum starts at 1 and only increases if records matching the predicate can be found.
Adding rownums is one of the last things done after the result set has been fetched from the database. This means that the first row will always have rownum 1. Rownum is better used when you want to limit the result set, for instance when doing paging.
See this for more: http://www.orafaq.com/wiki/ROWNUM
(Not an Oracle expert by any means)
From what I understand, rownum numbers the rows in a result set.
So, in your example:
select * from table1 where rownum=2
How many rows are there going to be in the result set? Therefore, what rownum would be assigned to such a row? Can you see now why no result is actually returned?
In general, you should avoid relying on rownum, or any features that imply an order to results. Try to think about working with the entire set of results.
With that being said, I believe the following would work:
select * from (select rownum as rn,table1.* from table1) as t where t.rn = 2
Because in that case, you're numbering the rows within the subquery.

Why can't I GROUP BY 1 when it's OK to ORDER BY 1?

Why are column ordinals legal for ORDER BY but not for GROUP BY? That is, can anyone tell me why this query
SELECT OrgUnitID, COUNT(*) FROM Employee AS e GROUP BY OrgUnitID
cannot be written as
SELECT OrgUnitID, COUNT(*) FROM Employee AS e GROUP BY 1
When it's perfectly legal to write a query like
SELECT OrgUnitID FROM Employee AS e ORDER BY 1
?
I'm really wondering if there's something subtle about the relational calculus, or something, that would prevent the grouping from working right.
The thing is, my example is pretty trivial. It's common that the column that I want to group by is actually a calculation, and having to repeat the exact same calculation in the GROUP BY is (a) annoying and (b) makes errors during maintenance much more likely. Here's a simple example:
SELECT DATEPART(YEAR,LastSeenOn), COUNT(*)
FROM Employee AS e
GROUP BY DATEPART(YEAR,LastSeenOn)
I would think that SQL's rule of normalize to only represent data once in the database ought to extend to code as well. I'd want to only right that calculation expression once (in the SELECT column list), and be able to refer to it by ordinal in the GROUP BY.
Clarification: I'm specifically working on SQL Server 2008, but I wonder about an overall answer nonetheless.
One of the reasons is because ORDER BY is the last thing that runs in a SQL Query, here is the order of operations
FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause
so once you have the columns from the SELECT clause you can use ordinal positioning
EDIT, added this based on the comment
Take this for example
create table test (a int, b int)
insert test values(1,2)
go
The query below will parse without a problem, it won't run
select a as b, b as a
from test
order by 6
here is the error
Msg 108, Level 16, State 1, Line 3
The ORDER BY position number 6 is out of range of the number of items in the select list.
This also parses fine
select a as b, b as a
from test
group by 1
But it blows up with this error
Msg 164, Level 15, State 1, Line 3
Each GROUP BY expression must contain at least one column that is not an outer reference.
There is a lot of elementary inconsistencies in SQL, and use of scalars is one of them. For example, anyone might expect
select * from countries
order by 1
and
select * from countries
order by 1.00001
to be a similar queries (the difference between the two can be made infinitesimally small, after all), which are not.
I'm not sure if the standard specifies if it is valid, but I believe it is implementation-dependent. I just tried your first example with one SQL engine, and it worked fine.
use aliasses :
SELECT DATEPART(YEAR,LastSeenOn) as 'seen_year', COUNT(*) as 'count'
FROM Employee AS e
GROUP BY 'seen_year'
** EDIT **
if GROUP BY alias is not allowed for you, here's a solution / workaround:
SELECT seen_year
, COUNT(*) AS Total
FROM (
SELECT DATEPART(YEAR,LastSeenOn) as seen_year, *
FROM Employee AS e
) AS inline_view
GROUP
BY seen_year
databases that don't support this basically are choosing not to. understand the order of the processing of the various steps, but it is very easy (as many databases have shown) to parse the sql, understand it, and apply the translation for you. Where its really a pain is when a column is a long case statement. having to repeat that in the group by clause is super annoying. yes, you can do the nested query work around as someone demonstrated above, but at this point it is just lack of care about your users to not support group by column numbers.