How to select all items where LEN > 0 - google-bigquery

I'm new to Google BigQuery, and I'm finding that the SQL is really quite different from normal SQL. I'm trying to select all items from a table that have data in a field named 'ticket_fields'. I don't want nulls. I tried the following:
WHERE COLUMN <> ''
WHERE LEN(COLUMN) > 0
WHERE NULLIF(LTRIM(RTRIM(COLUMN)), '') IS NOT NULL
None of that worked. How can I select all records where there are non-nulls from one specific field?

In some systems, the empty string and NULL are treated equivalently (Oracle, for example). In BigQuery, these are distinct values, so:
-- Returns TRUE
SELECT CAST(NULL AS STRING) IS NULL
-- Returns NULL
SELECT LENGTH(CAST(NULL AS STRING)) > 0
-- Returns FALSE
SELECT '' IS NULL
-- Returns FALSE
SELECT LENGTH('') > 0
-- Returns TRUE
SELECT '' IS NOT NULL
If you want to exclude rows where the column is NULL, then use IS NOT NULL:
SELECT * FROM dataset.table WHERE column IS NOT NULL
If you want to exclude rows where the column is empty or NULL, you can just check that the length is positive:
SELECT * FROM dataset.table WHERE LENGTH(column) > 0
This works because LENGTH(column) returns NULL if column is NULL and 0 if it is the empty string. The comparison then evaluates to NULL or FALSE, so the WHERE clause excludes the row either way.
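The difference between the two filters can be checked locally. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for BigQuery (an assumption: SQLite also treats '' and NULL as distinct and returns NULL for LENGTH(NULL)). The table name tickets and its rows are made up for illustration.

```python
import sqlite3

# SQLite stands in for BigQuery here; the table and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id INTEGER, ticket_fields TEXT)")
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?)",
    [(1, "priority=high"), (2, ""), (3, None)],
)


def ids(where_clause):
    """Return the ids of the rows that satisfy the given WHERE clause."""
    sql = f"SELECT id FROM tickets WHERE {where_clause} ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


# IS NOT NULL keeps the empty string; LENGTH(...) > 0 drops both '' and NULL.
not_null_ids = ids("ticket_fields IS NOT NULL")
non_empty_ids = ids("LENGTH(ticket_fields) > 0")
print(not_null_ids, non_empty_ids)
```

Row 2 (the empty string) survives the first filter but not the second, which is the distinction the answer describes.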

Related

Coalesce in where clause with n

I am trying to get records based on the given input values. Below is the sample script
DECLARE input1 = '001'
DECLARE input2 = '002'
SELECT * FROM table WHERE COLUMN1 = COALESCE(input, NULL) OR
COLUMN2 = COALESCE(input2, NULL)// return non-null records, Great
DECLARE input1 = NULL
DECLARE input2 = NULL
SELECT * FROM table WHERE COLUMN = COALESCE(input, NULL) // return no records, Problem here
I know that COLUMN = NULL does not match any rows. Is there a better way, so that NULL input values return the records where the column is NULL? Thanks in advance.
A "better" option might be to skip COALESCE (or NVL) and switch to
where column = input
or (column is null and input is null)
Your query:
SELECT *
FROM table
WHERE COLUMN1 = COALESCE(input, NULL)
OR COLUMN2 = COALESCE(input2, NULL)
Will only return results when COLUMN1 = input OR COLUMN2 = input2. It will not return results when both columns are NULL.
You will get exactly the same results if you remove the COALESCE expressions:
SELECT *
FROM table
WHERE COLUMN1 = input
OR COLUMN2 = input2
If you want to match when the input is NULL and the column value is also NULL, then you need to compare the values to NULL with IS rather than =, like this:
SELECT *
FROM table
WHERE COLUMN1 = input -- Compare non-NULL values
OR ( input IS NULL AND COLUMN1 IS NULL ) -- Compare NULL values
OR COLUMN2 = input2 -- Compare non-NULL values
OR ( input2 IS NULL AND COLUMN2 IS NULL ) -- Compare NULL values
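To see the two predicates side by side, here is a small sketch using Python's sqlite3 module in place of Oracle (an assumption made only so the example is runnable; the table t, the column column1, and the helper names are hypothetical). It compares the equality-only filter with the version that adds an explicit IS NULL branch.

```python
import sqlite3

# Hypothetical table: two ordinary values and one NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, column1 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "001"), (2, "002"), (3, None)],
)


def match_equals_only(value):
    """Equality only: a NULL input can never match, with or without COALESCE."""
    sql = "SELECT id FROM t WHERE column1 = COALESCE(:input, NULL) ORDER BY id"
    return [row[0] for row in conn.execute(sql, {"input": value})]


def match_null_safe(value):
    """Equality plus an explicit branch for 'both sides are NULL'."""
    sql = """
        SELECT id FROM t
        WHERE column1 = :input
           OR (:input IS NULL AND column1 IS NULL)
        ORDER BY id
    """
    return [row[0] for row in conn.execute(sql, {"input": value})]


print(match_equals_only("001"), match_equals_only(None))
print(match_null_safe("001"), match_null_safe(None))
```

Only the second function returns the NULL row when the input is None; for non-NULL inputs the two behave the same.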
If you need to return the rows where COLUMN1 is NULL when input is NULL:
SELECT *
FROM table
WHERE
decode(COLUMN1,input, 1,0) = 1
Decode: https://docs.oracle.com/en/database/oracle/oracle-database/12.2/sqlrf/DECODE.html#GUID-39341D91-3442-4730-BD34-D3CF5D4701CE
In a DECODE function, Oracle considers two nulls to be equivalent.
You do not need coalesce in this query from your question:
SELECT *
FROM table
WHERE COLUMN1 = COALESCE(input, NULL) OR
COLUMN2 = COALESCE(input2, NULL)
because coalesce(input, null) is equivalent to plain input: coalesce returns its first non-null argument, and since your second argument is null, it returns:
input value | coalesce(input, null)
------------------------------------
not NULL | input
NULL | NULL

PostgreSQL find blank rows

I want to search for non-null values in 'currentsheet', which works fine, but some fields are actually blank rather than null. How can I filter out blank fields using PostgreSQL? The query below has not worked and still displays blank values under the 'currentsheet' field.
SELECT *
FROM PUBLIC._http_requests
WHERE (_http_requests.currentsheet IS NOT NULL OR _http_requests.currentsheet <> '')
AND _http_requests.session_id IS NOT NULL
AND _http_requests.http_referer IS NOT NULL
You need to use AND rather than OR when checking _http_requests.currentsheet. With OR, a blank string still passes, because '' IS NOT NULL is true on its own, so the <> '' check never gets a chance to exclude it.
As a way simpler example, you can use select statements without a table to help debug this sort of thing (from psql or whatever SQL query tool you like):
select ('' is not null or '' <> '') as empty_result,
(null is not null or null <> '') as null_result;
empty_result | null_result
--------------+-------------
t |
If the string is '', you get true. If the string is null, you get null (this is because comparisons with null are SQL oddities -- select null = null; results in null). Let's see what happens when we replace or with and:
select ('' is not null and '' <> '') as empty_result,
(null is not null and null <> '') as null_result;
empty_result | null_result
--------------+-------------
f | f
Neat! With X is not null and X <> '', we get false when X is either '' or null.
So the way to phrase the select statement to do what you actually want is:
SELECT *
FROM PUBLIC._http_requests
WHERE _http_requests.currentsheet IS NOT NULL
AND _http_requests.currentsheet <> ''
AND _http_requests.session_id IS NOT NULL
AND _http_requests.http_referer IS NOT NULL;
I think you just need AND _http_requests.currentsheet <> ''. The AND is important there so that we exclude both.
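The effect of OR versus AND on an actual table can be reproduced with a short sketch. It uses Python's sqlite3 module rather than PostgreSQL (an assumption for runnability; NULL comparisons follow the same three-valued logic in both), and the table requests with its three rows is made up.

```python
import sqlite3

# Hypothetical table: one real value, one blank string, one NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (id INTEGER, currentsheet TEXT)")
conn.executemany(
    "INSERT INTO requests VALUES (?, ?)",
    [(1, "Sheet1"), (2, ""), (3, None)],
)


def ids(where_clause):
    """Return the ids of the rows that satisfy the given WHERE clause."""
    sql = f"SELECT id FROM requests WHERE {where_clause} ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


# With OR the blank row passes because '' IS NOT NULL is already true.
with_or = ids("currentsheet IS NOT NULL OR currentsheet <> ''")
# With AND both the blank row and the NULL row are rejected.
with_and = ids("currentsheet IS NOT NULL AND currentsheet <> ''")
print(with_or, with_and)
```

The NULL row is dropped in both cases; only the blank row changes, which matches the behaviour reported in the question.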

Convert INT column values to an empty string using ISNULL

I need to convert the NULL values in column ID, of INT data type, to an empty string ['']. I should not modify the source column data type, but I need to convert it in my transformation into the other table. The ID column is "nullable" as it has nulls in it. This is my code.
CREATE TABLE #V(ID INT) ;
INSERT INTO #V
VALUES (1),(2),(3),(4),(5),(NULL),(NULL) ;
SELECT CASE WHEN CAST(ISNULL(ID,'') AS VARCHAR(10)) = '' THEN '' ELSE ID END AS ID_Column
FROM #V;
this returns:
ID_Column
1
2
3
4
5
NULL
NULL
When I modify my CASE statement as follows:
CASE WHEN CAST(ISNULL(ID,'') AS VARCHAR(10)) = '' THEN '' ELSE ID END AS ID_Column
it returns:
ID_Column
1
2
3
4
5
0
0
Is this what you want?
select coalesce(cast(id as varchar(255)), '')
from #v;
You have to turn the entire result column into a single data type. If you want a blank value, then the type has to be some sort of character string.
In your examples, the else id means that the result from the case is an integer, which is why you are getting either 0 or NULL.
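As a rough illustration of the cast-then-coalesce pattern, the sketch below runs the same expression through Python's sqlite3 module. This is an assumption-laden stand-in: SQLite is dynamically typed, so it does not reproduce SQL Server's rule that a CASE expression mixing INT and '' becomes an INT. It only shows that casting first yields a string column in which NULL becomes ''. The table name v mirrors #V from the question.

```python
import sqlite3

# Same seven rows as the question's #V table (a regular table here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE v (id INTEGER)")
conn.executemany(
    "INSERT INTO v VALUES (?)",
    [(1,), (2,), (3,), (4,), (5,), (None,), (None,)],
)

# Cast to text first, then replace NULL with the empty string.
rows = conn.execute(
    "SELECT COALESCE(CAST(id AS TEXT), '') AS id_column FROM v ORDER BY rowid"
).fetchall()
id_column = [row[0] for row in rows]
print(id_column)

# Every value in the result is a Python str, i.e. one consistent type.
all_strings = all(isinstance(value, str) for value in id_column)
```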

Change aggregate functions to output NULL when an element is NULL

Every question I can find about the warning
Warning: Null value is eliminated by an aggregate or other SET operation.
is from someone who wants to treat the NULL values as 0. I want the opposite: how do I modify the following stored procedure to make it return NULL instead of 1?
CREATE PROCEDURE TestProcedure
AS
BEGIN
select cast(null as int) as foo into #tmp
insert into #tmp values (1)
select sum(foo) from #tmp
END
GO
I thought it would be SET ANSI_NULLS ON (I tried before the declaration, within the procedure itself, and before executing the procedure in my test query) but that did not appear to change the behavior of SUM().
The sum() function automatically ignores NULL. To do what you want, you need an explicit check:
select (case when count(foo) = count(*) then sum(foo) end)
from #tmp;
If you want to be explicit, you could add else NULL to the case statement.
The logic behind this is that count(foo) counts the number of non-NULL values in foo. If this is equal to all the rows, then all the values are non-NULL. You could use the more verbose:
select (case when sum(case when foo is null then 1 else 0 end) = 0
then sum(foo)
end)
from #tmp;
And, I want to point out that the title is quite misleading. 1 + NULL = NULL. The issue is with the aggregation functions, not the arithmetic operators.
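The count(foo) = count(*) check can be tried out with a short sketch. It uses Python's sqlite3 module instead of SQL Server (an assumption; both ignore NULL in SUM and COUNT(column)), and a regular table named tmp in place of the temp table #tmp.

```python
import sqlite3

STRICT_SUM = "SELECT CASE WHEN COUNT(foo) = COUNT(*) THEN SUM(foo) END FROM tmp"

# Same data as the question: one NULL and one 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tmp (foo INTEGER)")
conn.executemany("INSERT INTO tmp VALUES (?)", [(None,), (1,)])

# Plain SUM skips the NULL row.
plain_sum = conn.execute("SELECT SUM(foo) FROM tmp").fetchone()[0]

# The guarded version yields NULL (None in Python) because a NULL is present.
strict_with_null = conn.execute(STRICT_SUM).fetchone()[0]

# Once the NULL row is gone, the guarded version behaves like a normal SUM.
conn.execute("DELETE FROM tmp WHERE foo IS NULL")
conn.execute("INSERT INTO tmp VALUES (2)")
strict_without_null = conn.execute(STRICT_SUM).fetchone()[0]

print(plain_sum, strict_with_null, strict_without_null)
```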
Looking for a null value with EXISTS may be the fastest:
SELECT
CASE WHEN EXISTS(SELECT NULL FROM #tmp WHERE foo IS NULL)
THEN NULL
ELSE (SELECT sum(foo) from #tmp)
END
Just say
select case sign(sum(case when foo is null then 1 else 0 end))
when 1 then null
else sum(foo)
end
from some_table
...
group by
...
That's about all you need.

SQL Server variables

Why do these queries return different values? The first returns a result set as expected, but the second (which as far as I can tell is exactly the same) does not. Any thoughts?
1:
declare @c varchar(200)
set @c = 'columnName'
select top 1 *
from myTable
where @c is not null
and len(convert(varchar, @c)) > 0
2:
SELECT top 1 *
FROM myTable
WHERE columnName IS NOT NULL
and len(convert(varchar,columnName)) > 0
It's because they aren't the same query -- your variable text does not get inlined into the query.
In query 1 you are validating that @c is not null (true, you set it) and that its length is greater than 0 (true, it's 10). Since both are true, query 1 becomes:
select top 1 * from myTable
(It will return the first row in myTable based on an appropriate index.)
EDIT: Addressing the comments on the question.
declare @myTable table
(
columnName varchar(50)
)
insert into @myTable values ('8')
declare @c nvarchar(50)
set @c = 'columnName'
select top 1 *
from @myTable
where @c is not null
and len(convert(varchar, @c)) > 0
select top 1 *
from @myTable
where columnName is not null
and len(convert(varchar,columnName)) > 0
Now when I run this both queries return the same result. You'll have to tell me where I'm misrepresenting your actual data / query to get more help (or just expand upon this to find a solution).
In the first query, you are checking the value 'columnName' against the parameters IS NOT NULL and length > 0. In the second query, you are checking the values in the columnName column against those parameters.
It should be noted that query 1 will always return one row (assuming a row exists), where query 2 will only return a row if the contents of columnName are not null and length > 0.
The first query actually evaluates as
select top 1 * from myTable where 'columnName' is not null and len(convert(varchar, 'columnName' )) > 0
Not as what you were hoping for.
The two queries are not the same: in the second query you are evaluating the actual value of the field columnName. The following is the equivalent of your first query.
SELECT top 1 * FROM myTable WHERE 'columnName' IS NOT NULL and len(convert(varchar,'columnName')) > 0
They're not the same - the first is checking the variable while the second is checking the column. "where @c is not null" means where the variable @c isn't null - which it isn't, since it contains the value 'columnName'. "where columnName is not null" means where the field columnName doesn't contain a null. And the same for the evaluation of the length.
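The variable-versus-column distinction can be reproduced outside SQL Server. The sketch below uses Python's sqlite3 module, where a bound parameter plays the role of @c (an assumption made for runnability). The three sample rows, the ALLOWED_COLUMNS whitelist, and the helper count_non_empty are hypothetical, added to show one way to really filter on a column chosen at run time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (columnName TEXT)")
conn.executemany("INSERT INTO myTable VALUES (?)", [(None,), ("",), ("8",)])

# A bound parameter is a value, not a column reference: the string
# 'columnName' is non-NULL and has length 10, so every row passes.
rows_by_value = conn.execute(
    "SELECT COUNT(*) FROM myTable WHERE ? IS NOT NULL AND LENGTH(?) > 0",
    ("columnName", "columnName"),
).fetchone()[0]

# Naming the column in the SQL text tests the stored values instead.
rows_by_column = conn.execute(
    "SELECT COUNT(*) FROM myTable "
    "WHERE columnName IS NOT NULL AND LENGTH(columnName) > 0"
).fetchone()[0]

# Hypothetical whitelist: identifiers cannot be bound, so the column name
# has to be validated and then placed into the SQL text itself.
ALLOWED_COLUMNS = {"columnName"}


def count_non_empty(column):
    """Count rows whose given (whitelisted) column is neither NULL nor ''."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unexpected column name: {column!r}")
    sql = (
        f'SELECT COUNT(*) FROM myTable '
        f'WHERE "{column}" IS NOT NULL AND LENGTH("{column}") > 0'
    )
    return conn.execute(sql).fetchone()[0]


rows_dynamic = count_non_empty("columnName")
print(rows_by_value, rows_by_column, rows_dynamic)
```

In SQL Server itself, the usual counterpart to the last step is dynamic SQL, for example building the statement with QUOTENAME and running it through sp_executesql. Treat that as a pointer rather than a tested recipe.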