Spark SQL: Specify column name resulting from UDF in WHERE clause

I have written a UDF that returns a value (0 or 1) after processing two columns. I need my SELECT query to return only those records for which this value is 1.
I wrote the query as below:
SELECT number, myUDF(col1, col2) as result
FROM mytable
WHERE result is not null
However, it doesn't recognize the column name 'result'. Is there any special syntax needed so that the WHERE clause recognizes this new output column? Thanks.

CASE statement should solve the problem here:
SELECT number,
CASE when myUDF(col1, col2) = 1 then myUDF(col1, col2) END as result
FROM mytable
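Note that the CASE expression above yields NULL for non-matching rows rather than removing them from the result. A minimal sketch of an alternative that actually filters is to compute the alias in a derived table and filter on it in the outer query. The example runs on Python's built-in sqlite3 as a stand-in for Spark SQL, and the body of my_udf (1 when col1 > col2) and the sample rows are hypothetical, chosen only so the query can run.

```python
import sqlite3

def my_udf(col1, col2):
    # Hypothetical stand-in for the asker's UDF: 1 when col1 > col2, else 0.
    return 1 if col1 > col2 else 0

conn = sqlite3.connect(":memory:")
conn.create_function("myUDF", 2, my_udf)  # register the Python function as a SQL function
conn.execute("CREATE TABLE mytable (number INTEGER, col1 INTEGER, col2 INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, 5, 3), (2, 1, 4), (3, 9, 2)],  # made-up sample data
)

# The alias "result" is defined in the inner query, so the outer WHERE can see it.
query = """
SELECT number, result
FROM (
    SELECT number, myUDF(col1, col2) AS result
    FROM mytable
) t
WHERE result = 1
ORDER BY number
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The same derived-table shape should carry over to Spark SQL, since the inner SELECT is evaluated before the outer WHERE; repeating myUDF(col1, col2) = 1 directly in the WHERE clause is another option.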

Related

How to get the header out of Select query execution in snowflake

How can I get the header (column) names from a SELECT query execution in Snowflake? Currently I only get the values back. Is there a way to get the column names as well? I need to apply GROUP BY and aggregate functions on top of the SELECT query result.
Code tried:
sql10 = """SELECT col1,col2,col3,col4 FROM tablename ORDER BY col4 ;"""
select_snow = cs.execute(sql10).fetchall()
snow_col = [(c[1],c[2]) for c in select_snow]
How can I get the column names and map them to the corresponding column values?
Output
select snow: [('value1','value12','value3','value4'), ('value1','value12','value3','value4'), ('value11','value12','value13','value14'), ('value21','value22','value23','value24')]
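One possible approach, sketched here under assumptions: Python DB-API cursors expose a description attribute after execute, and the first item of each entry is the column name. The example uses the built-in sqlite3 module as a stand-in for the Snowflake connection so that it can run anywhere; the table contents are made up, and with Snowflake the cursor cs from the question would take the place of the sqlite3 cursor.

```python
import sqlite3

# sqlite3 stands in for the Snowflake connection (an assumption made so the
# sketch is runnable); the cursor is used the same way as "cs" in the question.
conn = sqlite3.connect(":memory:")
cs = conn.cursor()
cs.execute("CREATE TABLE tablename (col1 TEXT, col2 TEXT, col3 TEXT, col4 TEXT)")
cs.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?, ?)",
    [
        ("value1", "value2", "value3", "value4"),
        ("value11", "value12", "value13", "value14"),
    ],
)

cs.execute("SELECT col1, col2, col3, col4 FROM tablename ORDER BY col4")

# Each entry of cursor.description describes one result column; item 0 is its name.
column_names = [d[0] for d in cs.description]

# Pair every value with its column name, producing one dict per row.
records = [dict(zip(column_names, row)) for row in cs.fetchall()]
print(column_names)
print(records)
```

Once the rows are dicts keyed by column name, grouping and aggregating in Python (or loading them into a DataFrame) no longer depends on positional indexes such as c[1] and c[2].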

Displaying an alternative result when derived table is empty

I have this SQL code in which I try to display an alternative value when the table is empty, or the single column of the top row when it is not:
select top 1 case when count(*)!=0 then derrivedTable.primarykey
else 0 end endCase
from
(
select top 1 m.primarykey
from mytable m
where 0=1
)derrivedTable
The problem is that when I run this, I get the error message "column 'derrivedTable.primarykey' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause."
But when I put 'derrivedTable.primarykey' in the group by clause, I just get an empty table.
Does anyone have a solution?
Thanks in advance.
You can use aggregation:
select coalesce(max(m.primarykey), 0)
from mytable m;
An aggregation query with no group by always returns exactly one row. If the table is empty (or all rows are filtered out), then the aggregation functions -- except for COUNT() -- return NULL -- which can be transformed to a value using COALESCE().
Such a construct makes me worry. If you are using this to set the primary key on an insert, then you should learn about identity columns or sequences. The database will do the work for you.
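A small sketch of the aggregation behaviour described above, run on Python's built-in sqlite3 rather than SQL Server (an assumption made for runnability; the table and column names follow the question, the inserted values are made up). It shows that the query returns exactly one row whether or not the table has data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (primarykey INTEGER)")

query = "SELECT COALESCE(MAX(m.primarykey), 0) FROM mytable m"

# Empty table: MAX() yields NULL, COALESCE turns it into 0, and one row comes back.
rows_when_empty = conn.execute(query).fetchall()

# Made-up sample keys; the same query now returns the largest one.
conn.executemany("INSERT INTO mytable VALUES (?)", [(3,), (7,)])
rows_when_filled = conn.execute(query).fetchall()

print(rows_when_empty, rows_when_filled)
```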
Can you try the script below?
SELECT
CASE
WHEN COUNT(*) = 1 THEN derrivedTable.primarykey
ELSE 0
END endCase
FROM
(
SELECT TOP 1 m.primarykey
FROM mytable m
WHERE 0 = 1
) derrivedTable
GROUP BY derrivedTable.primarykey;

Is a subquery, which is returning no row, equal to NULL?

In SQL Server, if the SELECT statement in a subquery returns no row, is the result of the subquery then equal to NULL? I did some research, but I am not sure about it.
Example:
IF (SELECT TOP 1 CLMN1 FROM SOMETABLE) IS NOT NULL THEN
....
I am asking to understand the behaviour of the if-statement above.
Looks like the answer is yes:
DECLARE @Test TABLE (Id INT)
INSERT INTO @Test VALUES (1)
SELECT * FROM @Test WHERE Id = 2
SELECT CASE WHEN (SELECT * FROM @Test WHERE Id = 2) IS NULL THEN 1 ELSE 0 END
EDIT: after you updated your question, I think I should add that instead of using IS NULL to check whether there are rows, you should use the following, which the server can optimise better:
IF EXISTS(SELECT * FROM @Test WHERE Id = 2)
BEGIN
-- Whatever
END
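The distinction between the two checks can be sketched as follows, using Python's built-in sqlite3 in place of SQL Server (an assumption for runnability; the table is named Test because table variables are T-SQL specific). A scalar subquery that matches no row evaluates to NULL when used as a value, while the same SELECT run on its own simply returns an empty result set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Test (Id INTEGER)")
conn.execute("INSERT INTO Test VALUES (1)")

# Used as a scalar value, a subquery with no matching row evaluates to NULL.
scalar_is_null = conn.execute(
    "SELECT (SELECT Id FROM Test WHERE Id = 2) IS NULL"
).fetchone()[0]

# EXISTS asks directly whether any row matches, without going through NULL.
row_exists = conn.execute(
    "SELECT EXISTS (SELECT * FROM Test WHERE Id = 2)"
).fetchone()[0]

# Run as a stand-alone query, the same SELECT returns no rows at all.
plain_rows = conn.execute("SELECT Id FROM Test WHERE Id = 2").fetchall()

print(scalar_is_null, row_exists, plain_rows)
```

This is why the IS NULL test cannot tell "no row" apart from "a row whose column is NULL", whereas EXISTS only reports whether a row was found.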
NULL means no value, for example that the "box" for a certain column in a certain row is empty. NO ROW means that there are no rows.
No, NULL is a column value that indicates that the value of that column for a given row has no valid value. There would have to be a row returned by your query for that row to contain NULL column values.
A query that returns no rows just means that no rows matched the predicate you used in the query and therefore no data was returned at all.
Edit: After the question was edited, my answer doesn't address the specific case called out in the question. Juan's answer above does.

Why can't I refer to a column alias in the ORDER BY using CASE?

Sorry if this is a duplicate, but I haven't found one. Why can't I use a column alias defined in the SELECT list in the ORDER BY when I use CASE?
Consider this simple query:
SELECT NewValue=CASE WHEN Value IS NULL THEN '<Null-Value>' ELSE Value END
FROM dbo.TableA
ORDER BY CASE WHEN NewValue='<Null-Value>' THEN 1 ELSE 0 END
The result is an error:
Invalid column name 'NewValue'
Here's a sql-fiddle. (Replace the ORDER BY NewValue with the CASE WHEN... that's commented out)
I know I can use ORDER BY CASE WHEN Value IS NULL THEN 1 ELSE 0 END in this case, but the actual query is more complex and I want to keep it as readable as possible. Do I have to use a sub-query or CTE instead, and if so, why?
Update: as Mikael Eriksson has commented, any expression in combination with an alias is not allowed. So even this (pointless) query fails for the same reason:
SELECT '' As Empty
FROM dbo.TableA
ORDER BY Empty + ''
Result:
Invalid column name 'Empty'.
So an alias is allowed in an ORDER BY, and so is an expression, but not both together. Why? Is it too difficult to implement? Since I'm mainly a programmer, I think of aliases as variables that could simply be used in an expression.
This has to do with how a SQL dbms resolves ambiguous names.
I haven't yet tracked down this behavior in the SQL standards, but it seems to be consistent across platforms. Here's what's happening.
create table test (
col_1 integer,
col_2 integer
);
insert into test (col_1, col_2) values
(1, 3),
(2, 2),
(3, 1);
Alias "col_1" as "col_2", and use the alias in the ORDER BY clause. The dbms resolves "col_2" in the ORDER BY as an alias for "col_1", and sorts by the values in "test"."col_1".
select col_1 as col_2
from test
order by col_2;
col_2
--
1
2
3
Again, alias "col_1" as "col_2", but use an expression in the ORDER BY clause. The dbms resolves "col_2" not as an alias for "col_1", but as the column "test"."col_2". It sorts by the values in "test"."col_2".
select col_1 as col_2
from test
order by (col_2 || '');
col_2
--
3
2
1
So in your case, your query fails because the dbms wants to resolve "NewValue" in the expression as a column name in a base table. But it's not; it's a column alias.
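The two name-resolution cases above can be reproduced in a small sketch. It uses Python's built-in sqlite3, on the assumption that SQLite resolves these names the same way as the platforms discussed here; other engines may differ in detail, and SQL Server raises an error instead when the alias has no matching base column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (col_1 INTEGER, col_2 INTEGER)")
conn.executemany("INSERT INTO test VALUES (?, ?)", [(1, 3), (2, 2), (3, 1)])

# Bare name in ORDER BY: treated as the output alias, so rows sort by test.col_1.
by_alias = [
    r[0] for r in conn.execute("SELECT col_1 AS col_2 FROM test ORDER BY col_2")
]

# Name inside an expression: treated as the base column test.col_2 instead.
by_expression = [
    r[0]
    for r in conn.execute("SELECT col_1 AS col_2 FROM test ORDER BY (col_2 || '')")
]

print(by_alias, by_expression)
```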
PostgreSQL
This behavior is documented in PostgreSQL in the section Sorting Rows. Their stated rationale is to reduce ambiguity.
Note that an output column name has to stand alone, that is, it cannot be used in an expression — for example, this is not correct:
SELECT a + b AS sum, c FROM table1 ORDER BY sum + c; -- wrong
This restriction is made to reduce ambiguity. There is still ambiguity if an ORDER BY item is a simple name that could match either an output column name or a column from the table expression. The output column is used in such cases. This would only cause confusion if you use AS to rename an output column to match some other table column's name.
Documentation error in SQL Server 2008
A slightly different issue with respect to aliases in the ORDER BY clause.
If column names are aliased in the SELECT list, only the alias name can be used in the ORDER BY clause.
Unless I'm insufficiently caffeinated, that's not true at all. This statement sorts by "test"."col_1" in both SQL Server 2008 and SQL Server 2012.
select col_1 as col_2
from test
order by col_1;
It seems this limitation is related to another limitation in which "column aliases can't be referenced in same SELECT list". For example, this query:
SELECT Col1 AS ColAlias1 FROM T ORDER BY ColAlias1
Can be translated to:
SELECT Col1 AS ColAlias1 FROM T ORDER BY 1
Which is a legal query. But this query:
SELECT Col1 AS ColAlias1 FROM T ORDER BY ColAlias1 + ' '
Should be translated to:
SELECT Col1 AS ColAlias1, ColAlias1 + ' ' FROM T ORDER BY 2
Which will raise the error:
Unknown column 'ColAlias1' in 'field list'
And finally it seems these are because of SQL standard behaviours not an impossibility in implementation.
Note: The last query can be executed by MS Access without error but will raise the mentioned error with SQL Server.
You could try something like:
select NewValue from (
SELECT (CASE WHEN Value IS NULL THEN '<Null-Value>' ELSE Value END ) as NewValue,
( CASE WHEN Value IS NULL THEN 1 ELSE 0 END) as ValOrder
FROM dbo.TableA
GROUP BY Value
) t
ORDER BY ValOrder
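A hedged sketch of the derived-table idea: once the alias is defined in an inner query, the outer query sees NewValue as an ordinary column and can use it inside an ORDER BY expression. The example runs on Python's built-in sqlite3 with a made-up TableA (no dbo schema), so it illustrates the query shape rather than SQL Server's exact behaviour; the secondary sort on NewValue is added only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (Value TEXT)")
conn.executemany("INSERT INTO TableA VALUES (?)", [("b",), (None,), ("a",)])

# The inner query defines NewValue; the outer query can then use it in an expression.
query = """
SELECT NewValue
FROM (
    SELECT CASE WHEN Value IS NULL THEN '<Null-Value>' ELSE Value END AS NewValue
    FROM TableA
) t
ORDER BY CASE WHEN NewValue = '<Null-Value>' THEN 1 ELSE 0 END, NewValue
"""
ordered = [r[0] for r in conn.execute(query)]
print(ordered)
```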

How to assign a value to a casted column in Oracle

I am wondering whether it is possible to assign a value to a casted column in SQL depending on the real table values.
For Example:
select *, cast(null as number) as value from table1
where if(table1.id > 10 then value = 1) else value = 0
NOTE: I understand the above example is not completely Oracle, but, it is just a demonstration on what I want to accomplish in Oracle. Also, the above example can be done multiple ways due to its simplicity. My goal here is to verify if it is possible to accomplish the example using casted columns (columns not part of table1) and some sort of if/else.
Thanks,
Y_Y
select table1.*, (case when table1.id > 10 then 1 else 0 end) as value
from table1
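The CASE expression in the answer can be tried out with a small sketch. It runs on Python's built-in sqlite3 rather than Oracle (an assumption for runnability), and the name column and sample rows are hypothetical additions used only to show table1.* alongside the computed column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, name TEXT)")  # "name" is a made-up column
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(5, "a"), (10, "b"), (11, "c"), (42, "d")],
)

# The computed column is derived per row from table1.id; no cast or later update is needed.
query = """
SELECT table1.*, (CASE WHEN table1.id > 10 THEN 1 ELSE 0 END) AS value
FROM table1
ORDER BY table1.id
"""
rows = conn.execute(query).fetchall()
print(rows)
```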