Generate a variable in a SQL statement - sql

I would like to declare a variable in an Oracle SQL statement so that I can use it in later lines. Here is a simple statement as an example:
SELECT customer.surname, LENGTH(customer.name) long, customer.age
FROM customer
WHERE long > 4;
I didn't find any "clear" info on the web. Is that even possible?

The logical order in which a SELECT statement is processed is not the order in which it is written:
FROM (including joins and subqueries; for each subquery the order of operations starts over, working inside out like order of operations in algebra)
WHERE
GROUP BY
HAVING
SELECT
ORDER BY
There are some exceptions to the above, as not all engines process queries quite this way. It appears you may be able to use an alias in a GROUP BY if you're using MySQL. I'm not familiar enough with it to know whether that changes the processing or whether MySQL is just looking ahead.
In this order you can see the where executes before the 'long' alias is generated, so the DB Engine doesn't know what long is at the time it's being executed. Put another way, long is not in scope at the time the where clause is being evaluated.
This can be solved by simply repeating the calculation in the WHERE clause or by nesting queries, though nesting may be less efficient depending on the optimizer.
In the query below I:
Aliased customer as c to save typing and improve readability.
Rewrote the WHERE clause to use the formula instead of the alias.
Renamed your long alias, because LONG is a reserved word.
SELECT c.surname, LENGTH(c.name) AS Name_Len, c.age
FROM customer c
WHERE LENGTH(c.name) > 4;
In this next example we use the WITH keyword to define a data set called CTE (a common table expression) with the length of the name already calculated. This in effect changes the order in which the WHERE clause is processed.
In this case the FROM is processed inside the CTE, then the SELECT including our calculated value, but no WHERE clause is applied yet. Then a second query runs, selecting from the CTE data set with the WHERE clause. Since the first data set already calculated Name_Len, we can now use it in the WHERE clause.
WITH CTE AS (SELECT c.surname, LENGTH(c.name) AS Name_Len, c.age
FROM customer c)
SELECT *
FROM CTE
WHERE Name_Len > 4;
This could also be done as a subquery, but after you nest a few of those you can see that using WITH may make the query easier to read and maintain.
SELECT CTE.*
FROM (SELECT c.surname, LENGTH(c.name) AS Name_Len, c.age
FROM customer c) CTE
WHERE CTE.Name_Len > 4;
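To sanity-check that the CTE form and the subquery form filter on the computed length in the same way, here is a minimal sketch. It uses Python's built-in sqlite3 module instead of Oracle, and the sample rows are made up. Only the customer table and its surname, name and age columns come from the question.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle schema (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (surname TEXT, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [("Smith", "Al", 30), ("Jones", "Barbara", 41), ("Brown", "Carlos", 25)],
)

# CTE form: name_len is computed first, then filtered in the outer query.
cte_sql = """
WITH cte AS (
    SELECT c.surname, LENGTH(c.name) AS name_len, c.age
    FROM customer c
)
SELECT surname, name_len, age
FROM cte
WHERE name_len > 4
ORDER BY surname
"""

# Inline-view (subquery) form of the same query.
subquery_sql = """
SELECT v.surname, v.name_len, v.age
FROM (SELECT c.surname, LENGTH(c.name) AS name_len, c.age
      FROM customer c) v
WHERE v.name_len > 4
ORDER BY v.surname
"""

cte_rows = conn.execute(cte_sql).fetchall()
subquery_rows = conn.execute(subquery_sql).fetchall()
print(cte_rows)
```

Both queries keep only the rows whose name is longer than four characters, so the two result lists should match.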

The way you asked the question is incorrect though there is a solution to your problem in SQL.
SELECT *
FROM (SELECT customer.surname,
LENGTH (customer.name) col_long,
customer.age
FROM customer)
WHERE col_long > 4;
The sub-query here is called an inline view. For more details, check the Oracle documentation online.
Also, LONG is a reserved word, so either rename the alias or quote it as "long".

Have you searched online? This is literally covered everywhere... Something like this probably;
DECLARE aVariable NUMBER;
BEGIN
SELECT someColumn INTO aVariable FROM aTable;
END;

Related

Using a calculation with an aliased column in ORDER BY

As we all know, the ORDER BY clause is processed after the SELECT clause, so a column alias in the SELECT clause can be used.
However, I find that I can’t use the aliased column in a calculation in the ORDER BY clause.
WITH data AS(
SELECT *
FROM (VALUES
('apple'),
('banana'),
('cherry'),
('date')
) AS x(item)
)
SELECT item AS s
FROM data
-- ORDER BY s; -- OK
-- ORDER BY item + ''; -- OK
ORDER BY s + ''; -- Fails
I know there are alternative ways of doing this particular query, and I know that this is a trivial calculation, but I’m interested in why the column alias doesn’t work when in a calculation.
I have tested in PostgreSQL, MariaDB, SQLite and Oracle, and it works as expected. SQL Server appears to be the odd one out.
The documentation clearly states that:
The column names referenced in the ORDER BY clause must correspond to
either a column or column alias in the select list or to a column
defined in a table specified in the FROM clause without any
ambiguities. If the ORDER BY clause references a column alias from
the select list, the column alias must be used standalone, and not as
a part of some expression in ORDER BY clause:
Technically speaking, your query should work, since the ORDER BY clause is logically evaluated after the SELECT clause and should have access to all expressions declared there. But without access to the SQL specification I cannot say whether this is a limitation of SQL Server or whether the other RDBMSs implement it as a bonus feature.
Anyway, you can use CROSS APPLY as a trick. It is part of the FROM clause, so its expressions should be available in all subsequent clauses:
SELECT item
FROM t
CROSS APPLY (SELECT item + '') AS CA(item_for_sort)
ORDER BY item_for_sort
It is simply due to the way expressions are evaluated. A more illustrative example:
;WITH data AS
(
SELECT * FROM (VALUES('apple'),('banana')) AS sq(item)
)
SELECT item AS s
FROM data
ORDER BY CASE WHEN 1 = 1 THEN s END;
This returns the same Invalid column name error. The CASE expression (and the concatenation of s + '' in the simpler case) is evaluated before the alias in the select list is resolved.
One workaround for your simpler case is to append the empty string in the select list:
SELECT
item + '' AS s
...
ORDER BY s;
There are more complex ways, like using a derived table or CTE:
;WITH data AS
(
SELECT * FROM (VALUES('apple'),('banana')) AS sq(item)
),
step2 AS
(
SELECT item AS s FROM data
)
SELECT s FROM step2 ORDER BY s+'';
This is just the way that SQL Server works, and I think you could say "well SQL Server is bad because of this" but SQL Server could also say "what the heck is this use case?" :-)
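The shape of the derived-table workaround can be tried out with a small sketch. It runs on SQLite through Python's sqlite3 module, which is an assumption made for portability. SQLite accepts the alias inside an ORDER BY expression anyway, so this shows only the structure of the rewrite and does not reproduce the SQL Server error. LENGTH(s) stands in for the s + '' expression so that the effect of the sort is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# step2 exposes the alias s as a real column of a derived data set,
# so the outer ORDER BY can use it inside an expression.
sql = """
WITH data(item) AS (
    VALUES ('apple'), ('banana'), ('cherry'), ('date')
),
step2 AS (
    SELECT item AS s FROM data
)
SELECT s
FROM step2
ORDER BY LENGTH(s), s
"""

sorted_items = [row[0] for row in conn.execute(sql)]
print(sorted_items)
```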

SQL Server - Pagination Without Order By Clause

My situation is that a SQL statement which is not predictable, is given to the program and I need to do pagination on top of it. The final SQL statement would be similar to the following one:
SELECT * FROM (*Given SQL Statement*) b
OFFSET 0 ROWS FETCH NEXT 50 ROWS ONLY;
The problem here is that the *Given SQL Statement* is unpredictable. It may or may not contain order by clause. I am not able to change the query result of this SQL Statement and I need to do pagination on it.
I searched for a solution on the Internet, but all of the answers suggested using an arbitrary column, such as the primary key, in the ORDER BY clause. That would change the original order.
The short answer is that it can't be done, or at least can't be done properly.
The problem is that SQL Server (or any RDBMS) does not and cannot guarantee the order of the records returned from a query without an ORDER BY clause.
This means that you can't use paging on such queries.
Furthermore, if you use an ORDER BY clause on a column whose values are not unique in your result set, the order is still not guaranteed within each group of equal values in that column. A quick example:
;WITH cte (a, b)
AS
(
SELECT 1, 'a'
UNION ALL
SELECT 1, 'b'
UNION ALL
SELECT 2, 'a'
UNION ALL
SELECT 2, 'b'
)
SELECT *
FROM cte
ORDER BY a
Both result sets are valid, and you can't know in advance which one you will get:
a b
-----
1 b
1 a
2 b
2 a
a b
-----
1 a
1 b
2 a
2 b
(and of course, you might get other sorts)
The problem here is that the *Given SQL Statement* is unpredictable. It may or may not contain order by clause.
Your inner query (the unpredictable SQL statement) should not contain an ORDER BY; even if it does, the order is not guaranteed.
To get a guaranteed order, you have to order by some column. For the results to be deterministic, the ordering column or columns should be unique.
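As an illustration of the point about unique ordering columns, here is a small sketch using Python's sqlite3 module. The table t, its rows and the fetch_page helper are hypothetical, and SQLite's LIMIT/OFFSET stands in for SQL Server's OFFSET ... FETCH. Because the pair (a, b) is unique, ordering by both columns leaves no ties, so each page is well defined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b TEXT)")
# Rows are inserted out of order on purpose.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(2, "b"), (1, "b"), (2, "a"), (1, "a")],
)

def fetch_page(page_number, page_size):
    """Return one page, ordered by the unique key (a, b)."""
    offset = (page_number - 1) * page_size
    return conn.execute(
        "SELECT a, b FROM t ORDER BY a, b LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()

page1 = fetch_page(1, 2)
page2 = fetch_page(2, 2)
print(page1, page2)
```

Ordering by a alone would leave the relative order of the two rows with the same a value unspecified, which is exactly the situation described above.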
Please note: what I'm about to suggest is probably horribly inefficient and should really only be used to help you go back to the project leader and tell them that pagination of an unordered query should not be done. Having said that...
From your comments you say you are able to change the SQL statement before it is executed.
You could write the results of the original query to a temporary table, adding a row-number field to be used for subsequent pagination ordering.
Therefore any original ordering is preserved and you can now paginate.
But of course the reason for needing pagination in the first place is to avoid sending large amounts of data to the client application. Although this does prevent that, you will still be copying data to a temp table which, depending on the row size and count, could be very slow.
You also have the problem that the page size is coming from the client as part of the SQL statement. Parsing the statement to pick that out could be tricky.
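The temporary-table idea can be sketched as follows, with heavy caveats. This uses SQLite through Python's sqlite3 module rather than SQL Server. The fruit table, the given_sql string and the fetch_page helper are all hypothetical. SQLite's implicit rowid plays the role of the added row-number field, and the sketch relies on SQLite assigning rowids in insertion order, which is observed behavior rather than something the SQL standard promises.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruit (name TEXT)")
conn.executemany(
    "INSERT INTO fruit VALUES (?)",
    [("apple",), ("banana",), ("cherry",), ("date",), ("elderberry",)],
)

# Stand-in for the unpredictable statement handed to the program.
given_sql = "SELECT name FROM fruit ORDER BY name DESC"

# Materialize the result once; rows receive rowids 1..n as they are inserted.
conn.execute("CREATE TEMP TABLE snapshot AS " + given_sql)

def fetch_page(page_number, page_size):
    """Page through the snapshot using the rowid as the ordering key."""
    offset = (page_number - 1) * page_size
    rows = conn.execute(
        "SELECT * FROM snapshot ORDER BY rowid LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()
    return [row[0] for row in rows]

pages = [fetch_page(n, 2) for n in (1, 2, 3)]
print(pages)
```

The cost noted above still applies: the whole result is copied before the first page can be served.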
As others have noted, paginating without a sorted query is not safe. But since you already know that, I can suggest a query like this (though I don't recommend it as a good approach):
;with cte as (
select *,
row_number() over (order by (select 0)) rn
from (
-- Your query
) t
)
select *
from cte
where rn between (@pageNumber-1)*@pageSize+1 and @pageNumber*@pageSize
I finally found a simple way to do it without any order by on a specific column:
declare @start AS INTEGER = 1, @count AS INTEGER = 5;
select * from (SELECT *,ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS fakeCounter
FROM (select * from mytable) AS t) AS t2 order by fakeCounter OFFSET @start ROWS
FETCH NEXT @count ROWS ONLY
where select * from mytable can be any query

Using a function-generated column in the where clause

I have an SQL query, which calls a stored SQL function, I want to do this:
SELECT dbo.fn_is_current('''{round}''', r.fund_cd, r.rnd) as [current]
FROM BLAH
WHERE current = 1
The select works fine, however, it doesn't know "current". Even though (without the WHERE) the data it generates does have the "current" column, and it's correct.
So, I'm assuming that this is a notation issue.
You cannot use an alias from the select in the where clause (or even again in the same select). Just use a subquery:
SELECT t.*
FROM (SELECT dbo.fn_is_current('''{round}''', r.fund_cd, r.rnd) as [current]
FROM BLAH
) t
WHERE [current] = 1;
As a note: current is a very bad name for a column because it is a reserved word (in many databases at least, including SQL Server). The word is used when defining cursors. Use something else, such as currval.

Why can't I GROUP BY 1 when it's OK to ORDER BY 1?

Why are column ordinals legal for ORDER BY but not for GROUP BY? That is, can anyone tell me why this query
SELECT OrgUnitID, COUNT(*) FROM Employee AS e GROUP BY OrgUnitID
cannot be written as
SELECT OrgUnitID, COUNT(*) FROM Employee AS e GROUP BY 1
When it's perfectly legal to write a query like
SELECT OrgUnitID FROM Employee AS e ORDER BY 1
?
I'm really wondering if there's something subtle about the relational calculus, or something, that would prevent the grouping from working right.
The thing is, my example is pretty trivial. It's common that the column that I want to group by is actually a calculation, and having to repeat the exact same calculation in the GROUP BY is (a) annoying and (b) makes errors during maintenance much more likely. Here's a simple example:
SELECT DATEPART(YEAR,LastSeenOn), COUNT(*)
FROM Employee AS e
GROUP BY DATEPART(YEAR,LastSeenOn)
I would think that SQL's normalization principle of representing data only once in the database ought to extend to code as well. I'd want to write that calculation expression only once (in the SELECT column list), and be able to refer to it by ordinal in the GROUP BY.
Clarification: I'm specifically working on SQL Server 2008, but I wonder about an overall answer nonetheless.
One of the reasons is because ORDER BY is the last thing that runs in a SQL Query, here is the order of operations
FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause
so once you have the columns from the SELECT clause you can use ordinal positioning
EDIT, added this based on the comment
Take this for example
create table test (a int, b int)
insert test values(1,2)
go
The query below will parse without a problem, but it won't run
select a as b, b as a
from test
order by 6
here is the error
Msg 108, Level 16, State 1, Line 3
The ORDER BY position number 6 is out of range of the number of items in the select list.
This also parses fine
select a as b, b as a
from test
group by 1
But it blows up with this error
Msg 164, Level 15, State 1, Line 3
Each GROUP BY expression must contain at least one column that is not an outer reference.
There are a lot of elementary inconsistencies in SQL, and the use of scalars is one of them. For example, anyone might expect
select * from countries
order by 1
and
select * from countries
order by 1.00001
to be similar queries (the difference between the two can be made infinitesimally small, after all), but they are not.
I'm not sure if the standard specifies if it is valid, but I believe it is implementation-dependent. I just tried your first example with one SQL engine, and it worked fine.
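As one data point for the implementation-dependent behavior, the sketch below runs the first example on SQLite via Python's sqlite3 module. The choice of engine and the sample rows are assumptions. Only the Employee table and the OrgUnitID column come from the question. SQLite treats an integer constant in GROUP BY as a select-list position, the same way it does for ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (OrgUnitID INTEGER)")
conn.executemany("INSERT INTO Employee VALUES (?)", [(10,), (10,), (20,)])

# Grouping by ordinal position 1, which refers to OrgUnitID in the select list.
by_ordinal = conn.execute(
    "SELECT OrgUnitID, COUNT(*) FROM Employee AS e GROUP BY 1 ORDER BY 1"
).fetchall()

# The same query with the column spelled out.
by_name = conn.execute(
    "SELECT OrgUnitID, COUNT(*) FROM Employee AS e "
    "GROUP BY OrgUnitID ORDER BY OrgUnitID"
).fetchall()

print(by_ordinal)
```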
Use aliases (on engines that allow grouping by an alias):
SELECT DATEPART(YEAR,LastSeenOn) as seen_year, COUNT(*) as 'count'
FROM Employee AS e
GROUP BY seen_year
** EDIT **
if GROUP BY alias is not allowed for you, here's a solution / workaround:
SELECT seen_year
, COUNT(*) AS Total
FROM (
SELECT DATEPART(YEAR,LastSeenOn) as seen_year, *
FROM Employee AS e
) AS inline_view
GROUP
BY seen_year
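The inline-view workaround can be exercised with a small sketch. It runs on SQLite through Python's sqlite3 module, so strftime('%Y', ...) is swapped in for SQL Server's DATEPART(YEAR, ...), and the sample dates are made up. The calculation is written once in the inner query, and the outer query groups by its alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (OrgUnitID INTEGER, LastSeenOn TEXT)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [(1, "2008-03-01"), (1, "2008-11-15"), (2, "2009-01-20")],
)

# The year expression appears only in the inline view;
# the outer query refers to it by the alias seen_year.
sql = """
SELECT seen_year, COUNT(*) AS total
FROM (
    SELECT strftime('%Y', LastSeenOn) AS seen_year
    FROM Employee
) AS inline_view
GROUP BY seen_year
ORDER BY seen_year
"""

year_counts = conn.execute(sql).fetchall()
print(year_counts)
```

Note that strftime returns the year as text, so the keys come back as strings in this sketch.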
Databases that don't support this are basically choosing not to. I understand the order of processing of the various steps, but it is very easy (as many databases have shown) to parse the SQL, understand it, and apply the translation for you. Where it's really a pain is when a column is a long CASE expression; having to repeat that in the GROUP BY clause is super annoying. Yes, you can use the nested-query workaround that someone demonstrated above, but at this point not supporting GROUP BY column numbers is just a lack of care for your users.

How to refer to a variable created in the course of executing a query in T-SQL, in the WHERE clause?

What the title says, really.
If I SELECT [statement] AS whatever, why can't I refer to the whatever column in the body of the WHERE clause? Is there some kind of a workaround? It's driving me crazy.
As far as I'm aware, you can't directly do this in SQL Server.
If you REALLY have to use your column alias in the WHERE clause, you can do this, but it seems like overkill to use a subquery just for the alias:
SELECT *
FROM
(
SELECT [YourColumn] AS YourAlias, etc...
FROM Whatever
) YourSubquery
WHERE YourAlias > 2
You're almost certainly better off just using the contents of the original column in your WHERE clause.
It has to do with the way a SELECT statement gets translated into an abstract query tree: the 'whatever' only appears in the query result projection part of the tree, which is above the filtering part of the tree, so the WHERE clause cannot understand the 'whatever'. This is not some internal implementation detail, it is a fundamental behavior of relational queries: the projection of the result occurs after the evaluation of the joins and filters.
It is really trivial to work around the 'problem' by making the hierarchy of the query explicit:
select ...
from (
select [something] as whatever
from ...
) as subquery
WHERE whatever = ...;
A common table expression can also serve the same purpose:
with cte as (
select [something] as whatever
from ...)
select ... from cte
WHERE whatever = ...;
It's to do with the order of operations in the SELECT statement. The WHERE clause is evaluated before the SELECT clause, so the alias isn't available there. It is available in the ORDER BY clause, however, because that clause is processed last.
As others have mentioned, a sub-query will get around this problem.