Oracle null values - sql

Can somebody please explain how the query below works in Oracle, and why it does not return the last row, where store_name is null? Please also explain how NOT IN behaves with null values.
Table Store_Information
store_name      Sales   Date
Los Angeles     $1500   Jan-05-1999
San Diego       $250    Jan-07-1999
San Francisco   $300    Jan-08-1999
Boston          $700    Jan-08-1999
(null)          $600    Jan-10-1999
SELECT *
FROM scott.Store_Information
WHERE store_name IN (null)
STORE_NAME SALES DATE
-------------------- -------------------- -------------------------
0 rows selected

SELECT *
FROM scott.Store_Information
WHERE store_name IS null;
NULL cannot be compared like other (real) values. Therefore you have to use IS NULL or IS NOT NULL.
Here is a series of blog posts regarding this topic: http://momjian.us/main/blogs/pgblog/2012.html#December_26_2012

If the value you are looking for is a null value, the query should be:
SELECT *
FROM scott.Store_Information
WHERE store_name IS NULL;

Oracle NULLS are special:
nothing is equal to null
nothing is NOT equal to null
Since IN is equivalent to = ANY, this applies to IN as well.
So you cannot put NULL in an IN or NOT IN list and expect it to match: IN (NULL) never matches any row, and NOT IN with a NULL in the list returns no rows at all.
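The behaviour described above can be reproduced in a few lines. This is a minimal sketch that uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for Oracle (an assumption: SQLite follows the same standard rule for these predicates, but the code is not run against Oracle). The table mirrors the one in the question, and the helper name count is made up for the example.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE store_information (store_name TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO store_information VALUES (?, ?)",
    [
        ("Los Angeles", 1500),
        ("San Diego", 250),
        ("San Francisco", 300),
        ("Boston", 700),
        (None, 600),  # None is stored as SQL NULL
    ],
)


def count(where):
    """Return how many rows satisfy the given WHERE condition."""
    sql = "SELECT COUNT(*) FROM store_information WHERE " + where
    return conn.execute(sql).fetchone()[0]


# store_name = NULL is UNKNOWN for every row, so nothing matches.
in_null = count("store_name IN (NULL)")
# A NULL in the list makes NOT IN UNKNOWN (or FALSE) for every row.
not_in_with_null = count("store_name NOT IN ('Boston', NULL)")
# IS NULL is the predicate that actually finds the null row.
is_null = count("store_name IS NULL")

print(in_null, not_in_with_null, is_null)
```

Only the IS NULL condition finds the row; the other two conditions match nothing.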

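Null is a special value which doesn't follow the normal conventions of string or numeric comparison. Comparing null with the usual operators never evaluates to TRUE (the result is UNKNOWN, which a WHERE clause treats like FALSE), so you have to use IS NULL or one of the built-in functions that substitute a value for null.
Select * from Store_Information
where nvl(store_name, 'Missing') in ('Missing')
Select * from store_information
where coalesce(store_name, 'Missing') in ('Missing')
Select * from store_information
where store_name is null
Note that in Oracle the substitute cannot be the empty string: '' is itself treated as null, so nvl(store_name, '') in ('') returns no rows.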

If you want your query to select rows where the field is null, you can:
Solution 1:
use WHERE ... IS NULL
Solution 2:
if you absolutely want to use IN, wrap the column in NVL, e.g.:
SELECT *
FROM scott.Store_Information
WHERE NVL(store_name, 'NULL_STORE_NAME') IN ('NULL_STORE_NAME')
But in the second case you are assuming that no store is actually named 'NULL_STORE_NAME', so in my opinion it is usually better to use solution 1.
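The risk with the sentinel approach can be shown with a small sketch. It assumes an in-memory SQLite database via Python's sqlite3 module instead of Oracle, and uses COALESCE in place of NVL because SQLite has no NVL function. The extra row with a store literally named 'NULL_STORE_NAME' is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE store_information (store_name TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO store_information VALUES (?, ?)",
    [
        ("Boston", 700),
        (None, 600),               # the genuinely null store name
        ("NULL_STORE_NAME", 50),   # hypothetical row that collides with the sentinel
    ],
)

# Sentinel approach: COALESCE stands in for Oracle's NVL here (assumption).
sentinel_rows = conn.execute(
    """
    SELECT sales
    FROM store_information
    WHERE COALESCE(store_name, 'NULL_STORE_NAME') IN ('NULL_STORE_NAME')
    ORDER BY sales
    """
).fetchall()

# IS NULL approach: no sentinel, so no collision is possible.
is_null_rows = conn.execute(
    "SELECT sales FROM store_information WHERE store_name IS NULL"
).fetchall()

print(sentinel_rows)  # includes the colliding row as well as the null row
print(is_null_rows)   # only the null row
```

The sentinel query returns both the null row and the row whose name happens to equal the sentinel, which is why IS NULL is the safer choice.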


BigQuery - concatenate array of strings for each row with `null`s

This is a clarification/follow-up on the earlier question where I didn't specify the requirement for null values.
Given this input:
Row id app_date inventor.name inventor.country
1 id_1 01-15-2022 Steve US
Ashley US
2 id_2 03-16-2011 Pete US
<null> US
Mary FR
I need to extract name from inventor struct and concatenate them for each id, like so:
Row id app_date inventors
1 id_1 01-15-2022 Steve, Ashley
2 id_2 03-16-2011 Pete, ^, Mary
Note the custom fill for the null value, which to me suggests that I need ARRAY_TO_STRING specifically, since it supports this.
The closest example I found doesn't work with nulls. How can one do this?
Use the query below:
SELECT * EXCEPT(inventor),
(SELECT STRING_AGG(IFNULL(name, '^'), ', ') FROM t.inventor) inventors
FROM sample t

In proc sql when using SELECT * and GROUP BY, the result is not collapsed

When using the asterisk in combination with sum and group by, the duplicates are not removed as I expect (and as they are in, for example, MySQL):
col1 | country
-----------------
5 | sweden
20 | sweden
30 | denmark
select *, sum(col1) as s from table
group by country
the data returned is:
col1 | country | s
--------------------
5 | sweden | 25
20 | sweden | 25
30 | denmark | 30
instead of what I expected:
col1 | country | s
------------------------
5 | sweden | 25
30 | denmark | 30
If I don't use asterisk (*), the data returned is as I expect it to be.
SELECT country, sum(col1) as s from table
You are correct: SAS does not collapse the rows when the SELECT statement contains variables that are not in the GROUP BY clause.
There will be a note to that effect in the log, saying that summary statistics were remerged with the original data.
If you want just the grouped variables, you'll unfortunately have to list them, but since you have to list them in GROUP BY anyway it's not extra work per se.
Different SQL implementations handle this differently, and this is one way in which SAS differs. It's handy when you do want to merge a summary statistic back with the main data set, though.
If you don't want this behaviour, add the NOREMERGE option to your PROC SQL. Note, however, that it makes the query throw an error instead of remerging; it still doesn't collapse the rows the way you want.
See the documentation for the reference
Don't use SELECT *, ever. It's bad practice, risky, unsustainable... Read about it.
What flavor of SQL?
Your first query shouldn't work. You're basically saying...
select col1
, country
, sum(col1) as s
from table
group by country
...which will return an error:
Column 'table.col1' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
SELECT country, sum(col1) as s from table
...also should not work:
Column 'table.country' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Given your expected output, I suspect what you are looking for is...
select min(col1) as col1
, country
, sum(col1) as s
from table
group by country
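As a quick check of the last query, here is a minimal sketch using Python's sqlite3 module with an in-memory database. The table name sales is hypothetical (the question's table is a reserved word), and SQLite is only a stand-in for whichever SQL flavor is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "sales" is a hypothetical table name chosen for this example.
conn.execute("CREATE TABLE sales (col1 INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(5, "sweden"), (20, "sweden"), (30, "denmark")],
)

# Every selected column is either aggregated or listed in GROUP BY,
# so each country collapses to exactly one row.
collapsed = conn.execute(
    """
    SELECT MIN(col1) AS col1, country, SUM(col1) AS s
    FROM sales
    GROUP BY country
    ORDER BY s
    """
).fetchall()

print(collapsed)  # [(5, 'sweden', 25), (30, 'denmark', 30)]
```

This matches the output the question expected: one row per country, with the smallest col1 kept next to the sum.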

SQL: Using row values as column headers

I'm building a fairly large table for a client, and they want the data represented in a certain way. My table is currently like this:
DATE_RECEIVED | NAME | DOB | ANALYTE | RESULT
'YYYY/MM' |STRING |'MM/DD/YYYY' | String | String
2011/03 |Name, A| 07/31/1056 | AAAA | Positive
2011/03 |Name, A| 07/31/1056 | BBBB | Negative
What I need to do is something like a pivot - each "Analyte" is to be its own column, with the "result" being the value in the column, like this:
DATE_RECEIVED | NAME | DOB | AAAA | BBBB |
2011/03 |Name, A| 07/31/1056 | Positive | Negative |
I've tried a few things with PIVOT, but I think that I'm still too novice to understand how the logic for that function works. Either that, or the version of SQL I'm using doesn't support pivoting. Looking through similar questions on this site didn't really get me any closer to solving the problem because I don't really feel like I understand my problem well enough to know what I need to do to fix it. Anyway, I'm completely stumped. If anyone could give me a place to start, that would be extremely helpful. Thanks!
...Also, I know I'm using Oracle SQL, but I don't know what version. If it helps, I'm writing everything in TOAD for Oracle version 12.6.
If you can't use PIVOT (maybe your version of Oracle doesn't support it), then it is possible to use CASE instead together with an aggregate:
SELECT date_received, name, dob
, MAX(CASE WHEN analyte = 'AAAA' THEN result END) AS aaaa
, MAX(CASE WHEN analyte = 'BBBB' THEN result END) AS bbbb
FROM mytable
GROUP BY date_received, name, dob
UPDATE: Here is how you might accomplish the same thing with PIVOT:
SELECT * FROM (
SELECT date_received, name, dob, analyte, result
FROM mytable
) PIVOT ( MAX(result) FOR (analyte) IN ('AAAA' AS a,'BBBB' AS b) );
Note that with a PIVOT you still need an aggregate function - MAX() will do here, and so will MIN if there is only one value! AS in the PIVOT clause names the resulting columns.
The other thing with PIVOT is that the possible values have to be explicitly named. You can get around that by using XML (read more about how to do that), but then all your results will be XML-ized.
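The CASE-plus-aggregate form of the pivot is portable enough to try outside Oracle. The sketch below runs it with Python's sqlite3 module against an in-memory SQLite database (an assumption, since the question is about Oracle). The second patient, "Name, B", is made up to show what happens when an analyte is missing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable "
    "(date_received TEXT, name TEXT, dob TEXT, analyte TEXT, result TEXT)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [
        ("2011/03", "Name, A", "07/31/1056", "AAAA", "Positive"),
        ("2011/03", "Name, A", "07/31/1056", "BBBB", "Negative"),
        # Hypothetical second patient with only one analyte.
        ("2011/03", "Name, B", "01/01/1960", "AAAA", "Negative"),
    ],
)

# Each CASE yields the result for one analyte and NULL otherwise;
# MAX ignores the NULLs and keeps the single real value per group.
pivoted = conn.execute(
    """
    SELECT date_received, name, dob,
           MAX(CASE WHEN analyte = 'AAAA' THEN result END) AS aaaa,
           MAX(CASE WHEN analyte = 'BBBB' THEN result END) AS bbbb
    FROM mytable
    GROUP BY date_received, name, dob
    ORDER BY name
    """
).fetchall()

for row in pivoted:
    print(row)
```

A patient with no row for a given analyte gets NULL (None in Python) in that column, since the CASE expression never produces a value for MAX to keep.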

Logically merging 4 columns of the same information

I'm querying 3 different databases (4 total fields) for their "username" field given a particular machine name in our environment: SCCM, McAfee EPO, and ActiveDirectory.
The four columns are SCCM_TOP, SCCM_LAST, EPO, AD
Some of the tuples I get look like:
JOE, JOE, ADMINISTRATOR, JOE
or
JOE, SARAH, JOE, JOE
or
NULL, NULL, JOE, JOE
or
NULL, NULL, JOE, SARAH
The last example of which is the most difficult to code against.
I'm writing a CASE statement to help merge the information in an additive way to give one final column of the "best guess". At the moment, I'm weighing the most valid username based on another column, which is "age of the record" from each database.
CASE
WHEN ePO_Age <= CT_AGE AND NOT ePO_UN IS NULL THEN ePO_UN
WHEN NOT (SCCM_AGE) IS NULL AND NOT (SCCM_LAST_UN) IS NULL THEN SCCM_LAST_UN
WHEN NOT (SCCM_AGE) IS NULL AND NOT (SCCM_TOP_UN) IS NULL THEN SCCM_TOP_UN
WHEN NOT (AD_UN) IS NULL THEN AD_UN
ELSE NULL
END AS BestName,
But there has to be a better way to combine these records into one. My next step is to weigh the "average age" and then pick the username from there, discarding "Administrator".
Any thoughts or tricks?
You could benefit a little from the COALESCE function to get the first NON-NULL value and do something like:
COALESCE(CASE WHEN ePO_Age<=CT_AGE THEN ePO_UN END,
CASE WHEN SCCM_AGE IS NOT NULL THEN COALESCE(SCCM_LAST_UN, SCCM_TOP_UN) END,
AD_UN) AS BestName
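To see how the nested COALESCE falls through from one source to the next, here is a small sketch run with Python's sqlite3 module. The table name machines, the id column, and the three sample rows are all made up for illustration; the column names follow the question's CASE statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding one row per machine.
conn.execute(
    "CREATE TABLE machines (id INTEGER, epo_un TEXT, epo_age INTEGER, "
    "ct_age INTEGER, sccm_age INTEGER, sccm_last_un TEXT, "
    "sccm_top_un TEXT, ad_un TEXT)"
)
conn.executemany(
    "INSERT INTO machines VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        # ePO record is fresh enough, so ePO wins.
        (1, "JOE", 1, 5, 3, "SARAH", "JOE", "JOE"),
        # ePO record is too old; SCCM_LAST is null, so SCCM_TOP is used.
        (2, "JOE", 9, 5, 3, None, "SARAH", "JOE"),
        # ePO name is null and there is no SCCM record, so AD is used.
        (3, None, 1, 5, None, None, None, "SARAH"),
    ],
)

best_names = [
    row[0]
    for row in conn.execute(
        """
        SELECT COALESCE(
                   CASE WHEN epo_age <= ct_age THEN epo_un END,
                   CASE WHEN sccm_age IS NOT NULL
                        THEN COALESCE(sccm_last_un, sccm_top_un) END,
                   ad_un) AS best_name
        FROM machines
        ORDER BY id
        """
    )
]

print(best_names)  # ['JOE', 'SARAH', 'SARAH']
```

Because a CASE without ELSE yields NULL when its condition fails, each argument either supplies a name or lets COALESCE move on to the next source.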
If you just want to get the most recent UserName that isn't null, try using UNION to combine the results from each table.
SELECT TOP 1 qry.UserName
FROM(
SELECT UserName, CreateDate
FROM UserNames_1
UNION ALL
SELECT UserName, CreateDate
FROM UserNames_2
UNION ALL
SELECT UserName, CreateDate
FROM UserNames_3
) AS qry
WHERE qry.UserName IS NOT NULL
ORDER BY qry.CreateDate DESC
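The same idea can be tried with Python's sqlite3 module. This is a sketch under assumptions: SQLite has no TOP, so LIMIT 1 is used instead, and the three tables and their rows are invented to match the names in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical source tables, one per system.
for table in ("UserNames_1", "UserNames_2", "UserNames_3"):
    conn.execute(f"CREATE TABLE {table} (UserName TEXT, CreateDate TEXT)")

conn.execute("INSERT INTO UserNames_1 VALUES ('JOE', '2020-01-01')")
conn.execute("INSERT INTO UserNames_2 VALUES (NULL, '2022-01-01')")
conn.execute("INSERT INTO UserNames_3 VALUES ('SARAH', '2021-06-01')")

# The newest row overall has a NULL name, so it is filtered out and
# the newest non-null name is returned instead.
most_recent = conn.execute(
    """
    SELECT qry.UserName
    FROM (
        SELECT UserName, CreateDate FROM UserNames_1
        UNION ALL
        SELECT UserName, CreateDate FROM UserNames_2
        UNION ALL
        SELECT UserName, CreateDate FROM UserNames_3
    ) AS qry
    WHERE qry.UserName IS NOT NULL
    ORDER BY qry.CreateDate DESC
    LIMIT 1
    """
).fetchone()[0]

print(most_recent)  # SARAH
```

The dates are stored as ISO-formatted text so that sorting them as strings gives chronological order.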

ANSI equivalent of IS NULL

I am trying to find the ANSI way to write the T-SQL 'IS NULL'. (corrected, was 'IN NULL')
Some posts on the internet say you can use coalesce to make it work like 'IS NULL'
The reason I like to do this: portable code. And the query must return the rows that are NULL.
So far I created this:
SELECT empid,
firstname,
lastname,
country,
coalesce(region,'unknown') AS regions ,
city
FROM HR.Employees
The result set looks like:
empid firstname lastname country regions city
1 Sara Davis USA WA Seattle
2 Don Funk USA WA Tacoma
3 Judy Lew USA WA Kirkland
4 Yael Peled USA WA Redmond
5 Sven Buck UK unknown London
6 Paul Suurs UK unknown London
7 Russell King UK unknown London
8 Maria Cameron USA WA Seattle
9 Zoya Dolgopyatova UK unknown London
I identified the rows that are NULL, but how do I filter them out of this set?
Both IS NULL and COALESCE are ANSI standard and available in almost all reasonable databases. The construct that you want, I think, is:
where region IS NULL
This is standard syntax.
To have COALESCE work like IS NULL requires a sentinel value that you know is not in the data:
where coalesce(region, '<null>') = '<null>'
(Using <> instead of = gives the equivalent of IS NOT NULL.)
However, you would need different values for dates and numbers.
You seem to be confusing IS NULL (a predicate that checks to see if a value is null) and the T-SQL specific function ISNULL(value, replace) (no space and parameters after it), which is similar, but not identical to COALESCE.
Please see SQL - Difference between COALESCE and ISNULL? for details on how COALESCE and ISNULL differ for T-SQL.
Minor differences like what type is returned and what happens when all the arguments are null aside, ISNULL is a function that returns the first argument if it is not null, or the second argument if it is. COALESCE returns the first non-null argument (it can take more than two).
As a result, each of these might be used to solve your problem in different ways and with slightly different results.
IS NULL is valid ANSI SQL-92; it is called the null predicate.
<null predicate> ::= <row value constructor> IS [ NOT ] NULL
See SQL-92, paragraph 8.6.
So WHERE column_name IS NULL is perfectly valid.
Where ANSI SQL and T-SQL can treat NULL values differently is when you write WHERE column_name = NULL or WHERE column_name <> NULL. See SET ANSI_NULLS (Transact-SQL).
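The difference between the predicates can be checked with a short sketch. It uses Python's sqlite3 module, which always applies the ANSI comparison rules (an assumption for illustration; T-SQL behaves this way when ANSI_NULLS is ON). The employees table is a cut-down, hypothetical version of the one in the question, and the helper name empids is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down, hypothetical version of the HR.Employees table.
conn.execute("CREATE TABLE employees (empid INTEGER, firstname TEXT, region TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Sara", "WA"), (2, "Don", "WA"), (5, "Sven", None), (6, "Paul", None)],
)


def empids(where):
    """Return the empids of the rows matching the given WHERE condition."""
    sql = "SELECT empid FROM employees WHERE " + where + " ORDER BY empid"
    return [row[0] for row in conn.execute(sql)]


# Comparisons with NULL are UNKNOWN under ANSI rules, so no row matches.
equals_null = empids("region = NULL")
not_equals_null = empids("region <> NULL")
# The null predicate finds the rows with a missing region.
is_null = empids("region IS NULL")
# COALESCE with a sentinel that is known not to occur in the data.
via_coalesce = empids("COALESCE(region, '<null>') = '<null>'")

print(equals_null, not_equals_null, is_null, via_coalesce)
```

Both comparison forms return nothing, while IS NULL and the COALESCE sentinel return the same two employees.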