How to use a where statement that does not ignore Nulls in Big Query - google-bigquery

I was a little surprised to find out that the WHERE statement in Google Big Query ignores NULLS. Does anyone know of a better way to do this?
I have the following data set:
Name Score
Allan 20
Brian NULL
Clare 30
Say I want to select all records where the Score is not equal to 20. If I use the following code in Big Query
SELECT * FROM [....]
where
Score <> 20
The following is the result:
Name Score
Clare 30
The problem is that the record for Brian which is NULL is also not equal to 20 and therefore should be in my results.
Other than checking specifically for NULLs, is there a better way to do this?
Thanks
Ria

SQL (and thus BigQuery, which is SQL-like) has a trivalent (three-valued) logic. What that boils down to is that statements cannot just be TRUE or FALSE, they can also be NULL. In this case, the statement NULL <> 20 is neither TRUE nor FALSE, it is itself NULL. It might be helpful to think of NULL values as unknown. Since we don't know Brian's score, we don't know whether it is equal to 20. But the query only returns rows for which the where-clause evaluates to TRUE, and therefore the row with Brian is excluded.
If you want to include NULL values, you have to explicitly write
where (Score <> 20 or Score is null)
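To see the three-valued logic in action, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for BigQuery. That substitution is an assumption made so the example is runnable; the table name scores is hypothetical, and the NULL semantics shown are standard SQL behaviour.

```python
import sqlite3

# In-memory SQLite database standing in for BigQuery (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("Allan", 20), ("Brian", None), ("Clare", 30)],
)

# NULL <> 20 evaluates to NULL, so Brian's row is filtered out.
plain = conn.execute(
    "SELECT name FROM scores WHERE score <> 20 ORDER BY name"
).fetchall()

# An explicit IS NULL test keeps the row.
with_nulls = conn.execute(
    "SELECT name FROM scores WHERE score <> 20 OR score IS NULL ORDER BY name"
).fetchall()

print(plain)       # [('Clare',)]
print(with_nulls)  # [('Brian',), ('Clare',)]
```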

select * from [...]
where coalesce(score, 0) <> 20

One more variant:
SELECT * FROM [...]
WHERE ifnull(score <> 20, true)
I kind of like it as a way to express "accept either TRUE or NULL boolean values from this expression; reject FALSE."
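The variants suggested so far can be compared side by side. The sketch below again assumes SQLite (via Python's sqlite3) as a stand-in for BigQuery, uses a hypothetical scores table, and writes 1 instead of true so it also runs on older SQLite versions. It also shows a caveat of the COALESCE approach: the substitute value must not be the value you are comparing against.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("Allan", 20), ("Brian", None), ("Clare", 30)],
)

def names_where(condition):
    """Return the names matching a WHERE condition, sorted by name."""
    sql = f"SELECT name FROM scores WHERE {condition} ORDER BY name"
    return [row[0] for row in conn.execute(sql)]

explicit = names_where("score <> 20 OR score IS NULL")
via_coalesce = names_where("COALESCE(score, 0) <> 20")
via_ifnull = names_where("IFNULL(score <> 20, 1)")

# Caveat: if NULL is replaced by 20 itself, Brian is dropped again.
bad_sentinel = names_where("COALESCE(score, 20) <> 20")

print(explicit, via_coalesce, via_ifnull)  # each is ['Brian', 'Clare']
print(bad_sentinel)                        # ['Clare']
```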

You can achieve this with:
SELECT * FROM [....]
where
Score <> 20 or Score is NULL
Is there a more efficient way?
To avoid this performance-killing extra check, the column should be defined as NOT NULL.

Related

Oracle SQL - Multiple return from case

I may be trying it wrong. I am looking for any approach which is best.
Requirement:
My Query joins 4-5 tables based on few fields.
I have a column called product id. In my table there are 1.5 million rows. Out of those, only 10% of the rows have product ids matching the following patterns:
A300X-%
A500Y-%
300,500, 700 are valid model numbers. X and Y are classifications. My query picks all the systems.
I have a check as follows
CASE
WHEN PID LIKE 'A300X%'
THEN 'A300'
...
END AS MODEL
Similarly
CASE
WHEN PID LIKE 'A300X%'
THEN 'X'
...
END AS GENRE
I am looking for the best option from the below
How do I Combine both case statement and add another[third] case which will have these two cases. i.e
CASE
WHEN desc in ('AAA')
First Case
Second Case
ELSE
don't do anything for other systems
END
Is there any regex way of doing this? Before first - take the string. Look for X, Y and also 300,500,700.
Is there any other way of doing this? Or doing via code is the best way?
Any suggestions?
EDIT:
Sample desc:
AAA,
SoftwARE,
sw-app
My query picks all the desc. But the case should be running for AAA alone.
And Valid models are
A300X-2x-P
A500Y-5x-p
A700X-2x-p
A50CE-2x-P
I have to consider only 300,500,700. And the above two cases.
Expected result:
MODEL GENRE
A300 X
A500 Y
A300 Y
Q: How do I Combine both CASE statement expressions
Each CASE expression will return a single value. If the requirement is to return two separate columns in the resultset, that will require two separate expressions in the SELECT list.
For example:
DESC PID model_number genre
---- ---------- ------------ ------
AAA A300X-2x-P 300 X
AAA A500Y-5x-p 500 Y
AAA A700X-2x-p 700 X
AAA A50CE-2x-P (NULL) (NULL)
FOO A300X-2x-P (NULL) (NULL)
There will need to be an expression to return the model_number column, and a separate expression to return the genre column.
It's not possible for a single expression to return two separate columns.
Q: and add another[third] case which will have these two cases.
A CASE expression returns a value; we can use a CASE expression almost anywhere in a SQL statement where we can use a value, including within another CASE expression.
We can also combine multiple conditions in a WHEN test with AND and OR
As an example of combining conditions and nesting CASE expressions:
CASE
  WHEN ( ( t.PID LIKE '_300%' OR t.PID LIKE '_500%' OR t.PID LIKE '_700%' )
         AND ( t.DESC = 'AAA' )
       )
  THEN CASE
         WHEN ( t.PID LIKE '____X%' )
         THEN 'X'
         WHEN ( t.PID LIKE '____Y%' )
         THEN 'Y'
         ELSE NULL
       END
  ELSE NULL
END AS genre
There are other expressions that will return an equivalent result; the example shown here isn't necessarily the best expression. It just serves as a demonstration of combining conditions and nesting CASE expressions.
Note that to return another column model we would need to include another expression in the SELECT list. Similar conditions will need to be repeated; it's not possible to reference the WHEN conditions in another CASE expression.
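A runnable sketch of the two-expression approach follows. It assumes SQLite (through Python's sqlite3) rather than Oracle, a hypothetical table systems, and a column named descr because DESC is a reserved word. Note also that SQLite's LIKE is case-insensitive for ASCII letters by default, which differs from Oracle. The shared condition is kept in a Python string only to avoid typing it twice; the generated SQL still repeats it in both CASE expressions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE systems (descr TEXT, pid TEXT)")
conn.executemany(
    "INSERT INTO systems VALUES (?, ?)",
    [
        ("AAA", "A300X-2x-P"),
        ("AAA", "A500Y-5x-p"),
        ("AAA", "A700X-2x-p"),
        ("AAA", "A50CE-2x-P"),
        ("FOO", "A300X-2x-P"),
    ],
)

# Condition shared by both expressions; SQL has no way to reuse a WHEN
# test across CASE expressions, so it is pasted into each one.
valid = (
    "(pid LIKE '_300%' OR pid LIKE '_500%' OR pid LIKE '_700%') "
    "AND descr = 'AAA'"
)

sql = f"""
SELECT descr,
       pid,
       CASE WHEN {valid} THEN SUBSTR(pid, 2, 3) END AS model_number,
       CASE WHEN {valid} THEN
            CASE WHEN pid LIKE '____X%' THEN 'X'
                 WHEN pid LIKE '____Y%' THEN 'Y'
            END
       END AS genre
FROM systems
ORDER BY rowid
"""
result = conn.execute(sql).fetchall()
for row in result:
    print(row)
```

A CASE without an ELSE branch yields NULL when no WHEN matches, which is why the rows for A50CE-2x-P and FOO come back with NULL in both computed columns.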
Based on your sample data, logic such as this would work:
(CASE WHEN REGEXP_LIKE(PID, '^A[0-9]{3}[A-Z]-')
      THEN SUBSTR(PID, 1, 4)
      ELSE PID
 END) AS MODEL,
(CASE WHEN REGEXP_LIKE(PID, '^A[0-9]{3}[A-Z]-')
      THEN SUBSTR(PID, 5, 1)
      ELSE PID
 END) AS GENRE
This assumes that the "model number" always starts with "A" and is followed by three digits (as in your example data). If the model number is more complicated, you may need regexp_substr() to extract the values you want.
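If the splitting is done in application code instead, as the question also asks about, a regular expression with two capture groups can return both values at once. This is only a sketch in Python: the function name split_pid is hypothetical, and the pattern is narrowed to the models 300, 500, 700 and the classifications X and Y mentioned in the question.

```python
import re

# "A", one of the valid model numbers, one classification letter, a hyphen.
PID_PATTERN = re.compile(r"^A(300|500|700)([XY])-")

def split_pid(pid):
    """Return (model, genre) for a valid PID, or None if it does not match."""
    match = PID_PATTERN.match(pid)
    if match is None:
        return None
    return "A" + match.group(1), match.group(2)

print(split_pid("A300X-2x-P"))  # ('A300', 'X')
print(split_pid("A500Y-5x-p"))  # ('A500', 'Y')
print(split_pid("A50CE-2x-P"))  # None
```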

ISNULL with aggregate function

What is the best way to go about using these two together? In my case, if a userID is null I want to return zero, and users can have multiple IDs, so we want to get the lowest (the original) one.
ISNULL(MIN(UserId),0)
Or,
MIN(ISNULL(UserId, 0))
Thank you.
Is the answer indicative of all aggregate functions?
Those statements do not necessarily produce the same output:
the first takes the minimum that exists and only if that is null, uses 0.
the second checks each user id and if that is null uses 0 - it then takes the minimum of those (and unless a user ID can be negative, a user with a 5 and a null, would output 0)
A quick script can demonstrate this :
with testData as (
select 1 as SomeKey, 5 as userID
union all
select 1 as SomeKey, null as userID
union all
select 2 as SomeKey, 6 as userID
union all
select 2 as SomeKey, 5 as userID
)
select
somekey
, isnull(min(userid),0) as firstScenario
, min(isnull(userid,0)) as SecondScenario
from testdata
group by somekey
Results:
Somekey firstScenario secondScenario
1 5 0
2 5 5
The first scenario is the most likely one you were after, but the phrasing of the question makes it a bit ambiguous as to what the desired behaviour was.
(http://sqlfiddle.com/#!6/9eecb7db59d16c80417c72d1e1f4fbf1/10170)
It depends on what you want to do. But I am biased towards COALESCE() because it is the ANSI standard function.
Your two options are:
COALESCE(MIN(UserId), 0)
MIN(COALESCE(UserId, 0))
These do not do the same thing. The first returns the minimum user id. If all user ids are NULL, then this expression returns 0.
The second replaces each NULL with 0. Assuming the user ids are positive, then this returns 0 if any user ids are NULL.
Based on my understanding of your logic, you want the second version.
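The difference between the two placements is easiest to see with a group in which every user id is NULL. The sketch below assumes SQLite (via Python's sqlite3) rather than SQL Server, so it uses COALESCE; the table users and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (some_key INTEGER, user_id INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, 5), (1, None), (2, 6), (2, 5), (3, None)],
)

result = conn.execute(
    """
    SELECT some_key,
           COALESCE(MIN(user_id), 0) AS outer_coalesce,  -- 0 only if all ids are NULL
           MIN(COALESCE(user_id, 0)) AS inner_coalesce   -- 0 if any id is NULL
    FROM users
    GROUP BY some_key
    ORDER BY some_key
    """
).fetchall()

print(result)  # [(1, 5, 0), (2, 5, 5), (3, 0, 0)]
```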
I suppose you are using SQL Server, because ISNULL is a T-SQL function.
To use a function that works across DBMSs, you can use COALESCE.
NULL values are ignored by the MIN function.
So if you want to prevent a NULL result, I advise you to use the first solution:
ISNULL(MIN(UserId), 0)

PostgreSQL: order by column, with specific NON-NULL value LAST

When I discovered NULLS LAST, I kinda hoped it could be generalised to 'X LAST' in a CASE statement in the ORDER BY portion of a query.
Not so, it would seem.
I'm trying to sort a table by two columns (easy), but get the output in a specific order (easy), with one specific value of one column to appear last (got it done... ugly).
Let's say that the columns are zone and status (don't blame me for naming a column zone - I didn't name them). status only takes 2 values ('U' and 'S'), whereas zone can take any of about 100 values.
One subset of zone's values is (in pseudo-regexp) IN[0-7]Z, and those are first in the result. That's easy to do with a CASE.
zone can also take the value 'Future', which should appear LAST in the result.
In my typical kludgy-munge way, I have simply imposed a CASE value of 1000 as follows:
group by zone, status
order by (
case when zone='IN1Z' then 1
when zone='IN2Z' then 2
when zone='IN3Z' then 3
.
. -- other IN[X]Z etc
.
when zone = 'Future' then 1000
else 11 -- [number of defined cases +1]
end), zone, status
This works, but it's obviously a kludge, and I wonder if there might be one-liner doing the same.
Is there a cleaner way to achieve the same result?
Postgres allows boolean values in the ORDER BY clause, so here is your generalised 'X LAST':
ORDER BY (my_column = 'X')
The expression evaluates to boolean, resulting values sort this way:
FALSE (0)
TRUE (1)
NULL
Since we deal with non-null values, that's all we need. Here is your one-liner:
...
ORDER BY (zone = 'Future'), zone, status;
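The same idea can be tried out quickly. This sketch assumes SQLite (through Python's sqlite3) instead of Postgres; there the comparison evaluates to the integers 0 and 1 rather than a boolean type, but the ordering effect is the same. The table name readings and the sample zone 'Other' are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (zone TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [
        ("Future", "U"),
        ("IN2Z", "S"),
        ("Other", "U"),
        ("IN1Z", "U"),
        ("Future", "S"),
    ],
)

# Rows where the comparison is false (0) sort before rows where it is true (1),
# so 'Future' moves to the end even though it would sort first alphabetically.
result = conn.execute(
    "SELECT zone, status FROM readings ORDER BY (zone = 'Future'), zone, status"
).fetchall()

print(result)
# [('IN1Z', 'U'), ('IN2Z', 'S'), ('Other', 'U'), ('Future', 'S'), ('Future', 'U')]
```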
Related:
Sorting null values after all others, except special
Select query but show the result from record number 3
SQL two criteria from one group-by
I'm not familiar with PostgreSQL specifically, but I've worked with similar problems in MS SQL Server. As far as I know, the only "nice" way to solve a problem like this is to create a separate table of zone values and assign each one a sort sequence.
For example, let's call the table ZoneSequence:
Zone | Sequence
------ | --------
IN1Z | 1
IN2Z | 2
IN3Z | 3
Future | 1000
And so on. Then you simply join ZoneSequence into your query, and sort by the Sequence column (make sure to add good indexes!).
The good thing about this method is that it's easy to maintain when new zone codes are created, as they likely will be.
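A minimal sketch of the lookup-table approach, assuming SQLite via Python's sqlite3 and hypothetical table names readings and zone_sequence. The LEFT JOIN with a default of 11 for zones missing from the lookup table is an added assumption that mirrors the ELSE branch of the original CASE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (zone TEXT, status TEXT)")
conn.execute("CREATE TABLE zone_sequence (zone TEXT PRIMARY KEY, sequence INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("Future", "U"), ("IN3Z", "S"), ("Other", "U"), ("IN1Z", "U")],
)
conn.executemany(
    "INSERT INTO zone_sequence VALUES (?, ?)",
    [("IN1Z", 1), ("IN2Z", 2), ("IN3Z", 3), ("Future", 1000)],
)

# Zones without an entry in zone_sequence get the default position 11.
result = conn.execute(
    """
    SELECT r.zone, r.status
    FROM readings AS r
    LEFT JOIN zone_sequence AS s ON s.zone = r.zone
    ORDER BY COALESCE(s.sequence, 11), r.zone, r.status
    """
).fetchall()

print(result)
# [('IN1Z', 'U'), ('IN3Z', 'S'), ('Other', 'U'), ('Future', 'U')]
```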

Avoiding the null values to replace 0 values in report

I am using SQL Server 2005 BOXIR2.
My question: in the universe there is a table with an EventCode column that holds different types of codes, such as Enquiry, FollowUp, LostofSales, Contact, etc.
I created a measure using the object-properties formula count(Tablename.EventCode), then saved and exported it. When I use this EventCode measure in a Webi report, it shows values for each particular EventCode, but zero counts are not shown; those cells appear blank (NULL), as in the example below.
I want to get zero values in the cells that are currently blank (NULL).
count(Tablename.EventCode)
Enquiry,FollowUp,LostofSales,Contact
10 20 15
5 12 5
6 4 3
Can you please help me with a formula that shows zero instead of NULL?
I'm not sure exactly what you are asking, but I think you may be looking for ISNULL()
SELECT ISNULL(table_name.column_name, 0)
will return 0 if table_name.column_name is null
If you're getting NULLs when performing an aggregate, it's probably because one of the elements is NULL. All you need to do is coalesce those entries to a known value (such as zero).
SELECT COUNT(COALESCE(Tablename.EventCode, 0)) FROM Tablename
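To check how COUNT and COALESCE interact with NULLs, here is a small sketch. It assumes SQLite via Python's sqlite3 rather than SQL Server, and a hypothetical events table. SQLite has no two-argument ISNULL function, so COALESCE is used for the replacement as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_code TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?)",
    [("Enquiry",), ("FollowUp",), (None,), ("Enquiry",)],
)

# COUNT(column) skips NULLs; after COALESCE every row is counted.
counts = conn.execute(
    "SELECT COUNT(event_code), COUNT(COALESCE(event_code, 0)) FROM events"
).fetchone()

# Replacing a NULL value itself with 0.
replaced = conn.execute("SELECT COALESCE(NULL, 0)").fetchone()[0]

print(counts)    # (3, 4)
print(replaced)  # 0
```

Whether the blanks in the report come from NULL values in the column or from event codes that have no rows at all is not clear from the question, so treat this only as an illustration of the two functions.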

SQLite - sql question 101

I would do something like this:
select * from cars_table where body not equal to null.
select * from cars_table where values not equal to null And id = "3"
I know the syntax for 'not equal' is <>, but I get an empty result.
For the second part, I want to get a result set where it only returns the columns that have a value. So, if the value is null, then don't include that column.
Thanks
You cannot use equality operators for nulls, you must use is null and is not null.
So your first query would be:
select * from cars_table where body is not null;
There's no easy way to do your second operation (excluding columns that are null). I'm assuming you mean don't display the column if all rows have NULL for that column, since it would be even harder to do this on a row-by-row basis. A select statement expects a list of columns to show, and will faithfully show them whether they are null or not.
The only possibility that springs to mind is a series (one per column) of selects with grouping to determine if the column only has nulls, then dynamically construct a new query with only columns that didn't meet that criteria.
But that's incredibly messy and not at all suited to SQL - perhaps if you could tell us the rationale behind the request, there may be another solution.
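For what it's worth, the "check each column, then build a new query" idea might look like the following when driven from Python. This is a sketch under stated assumptions, not a recommendation: it uses SQLite's PRAGMA table_info to list the columns, relies on COUNT(column) counting only non-NULL values instead of a grouped select, and the columns colour and notes are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars_table (id TEXT, body TEXT, colour TEXT, notes TEXT)")
conn.executemany(
    "INSERT INTO cars_table VALUES (?, ?, ?, ?)",
    [("1", "sedan", None, None), ("3", "coupe", "red", None)],
)

# Column names come from the table's own metadata (second field of each row).
columns = [row[1] for row in conn.execute("PRAGMA table_info(cars_table)")]

# Keep a column only if it holds at least one non-NULL value.
kept = [
    col
    for col in columns
    if conn.execute(f'SELECT COUNT("{col}") FROM cars_table').fetchone()[0] > 0
]

# Build the final query from the surviving columns.
query = "SELECT " + ", ".join(f'"{col}"' for col in kept) + " FROM cars_table"
result = conn.execute(query).fetchall()

print(kept)    # ['id', 'body', 'colour']
print(result)  # [('1', 'sedan', None), ('3', 'coupe', 'red')]
```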
Comparing to NULL is not done with <>, but with is null and is not null :
select *
from cars_table
where body is not null
And :
select *
from cars_table
where values is not null
and id = '3'
NULL is not a normal value, and has to be dealt with differently than standard values.
In SQL, NULL is an unknown value. It is neither equal-to nor not-equal-to any other value. All comparison operators (=, <>, <, >, etc.) return NULL (unknown) rather than true if either of the values being compared is NULL, and a WHERE clause only keeps rows for which the condition is true. (I don't know how old you are, so I can't say that you're 24, but I also can't say that you're not 24.)
As already mentioned, you have to use IS NULL or IS NOT NULL instead when testing for NULL values.
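This is easy to verify directly. The sketch below uses Python's sqlite3 module, which maps SQL NULL to None and SQLite's true value to 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Comparisons with NULL yield NULL; IS NULL / IS NOT NULL yield a definite answer.
row = conn.execute(
    "SELECT NULL = NULL, NULL <> 24, 24 = 24, NULL IS NULL, 24 IS NOT NULL"
).fetchone()

print(row)  # (None, None, 1, 1, 1)
```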
Thank you all for your answers.
Well, I have a table with a bunch of columns.
And I am going to search a particular car, and get the values for that car...
car | manual | automatic | sedan | power brakes | jet engine
honda | true | false | false | true | false
mazda | false | true | false | true | true
So, I want to do ->
select * from car_table where car='mazda' and values not equal to false
So then I just iterate over the cursor result and fill up the table with the appropriate columns and values. Where values are columns. I guess I replace values with * for columns
I know I could do it programmatically, but was thinking I could do this just by sql
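The programmatic route is short in practice. As a sketch, assuming SQLite via Python's sqlite3, booleans stored as 0 and 1, and underscores in the column names (power_brakes, jet_engine) instead of spaces:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read values by column name
conn.execute(
    "CREATE TABLE car_table (car TEXT, manual INTEGER, automatic INTEGER, "
    "sedan INTEGER, power_brakes INTEGER, jet_engine INTEGER)"
)
conn.executemany(
    "INSERT INTO car_table VALUES (?, ?, ?, ?, ?, ?)",
    [("honda", 1, 0, 0, 1, 0), ("mazda", 0, 1, 0, 1, 1)],
)

row = conn.execute("SELECT * FROM car_table WHERE car = ?", ("mazda",)).fetchone()

# Keep only the feature columns whose value is true (non-zero, non-NULL).
features = {col: row[col] for col in row.keys() if col != "car" and row[col]}

print(features)  # {'automatic': 1, 'power_brakes': 1, 'jet_engine': 1}
```

The query still returns every column; the filtering happens on the client, because a single SELECT cannot vary its column list per row.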