Exclude zero values but include NULL values - sql

I want my query to exclude only zero values while keeping NULLs.
I have tried a few options that run without errors. The query has a few JOINs in it.
However, they all seem to generate different results: the output has different counts of certain values.
E.g. let's say one option gives COLUMN1 10 rows with value 'A', 5 rows of 'B' and 3 rows of 'C'. The other option gives me 7 rows of 'A', 9 rows of 'B' and 2 rows of 'C'. Which one is most suitable (or is none of them?) and why:
where..
and a.exitreason<>'0' or a.exitreason is null
and (a.exitreason<>'0' or a.exitreason is null)
and ( isnull(a.exitreason,'') <>'0' OR a.exitreason is null)
Or include it in my JOIN part of the query (table LocalOffice)?
Thanks!
SELECT DISTINCT s.PeriodDate,s.Number,SiteID,
s.LocalID,s.Appointment,s.Agreement,s.AgreementCode,a.ExitReason
FROM Office s
INNER JOIN Employer e ON s.PeriodDate=e.PeriodDate AND s.EmployerID=e.EmployerID
LEFT JOIN LocalOffice a ON a.LocalOfficeID=a.LocalOfficeID
WHERE.....

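The details you provided for your particular case aren't clear, but the WHERE conditions you've tried give different result sets, as they should: they are logically requesting different things. Because AND binds more tightly than OR, the first condition (without parentheses) is evaluated as (... AND a.exitreason<>'0') OR a.exitreason is null, so every row with a NULL exitreason is returned regardless of the other conditions. The third condition is logically equivalent to the second; the ISNULL wrapper is redundant there.
Based on your initial description that you want all rows where the value is either non-zero or NULL, your second condition will get you that.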
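To see the precedence difference concretely, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server. The table office, the column active (standing in for "the other conditions" in your WHERE clause) and the sample rows are all made up for illustration; IFNULL is SQLite's counterpart of ISNULL.

```python
import sqlite3

# Hypothetical data: 'active' stands in for the other WHERE conditions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE office (id INTEGER, active INTEGER, exitreason TEXT);
    INSERT INTO office VALUES
        (1, 1, '0'),
        (2, 1, NULL),
        (3, 1, 'A'),
        (4, 0, NULL),
        (5, 0, 'B');
""")

def ids(where):
    """Return the ids selected by the given WHERE clause."""
    sql = f"SELECT id FROM office WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(sql)]

# Option 1: parsed as (active = 1 AND exitreason <> '0') OR exitreason IS NULL
unparenthesized = ids("active = 1 AND exitreason <> '0' OR exitreason IS NULL")

# Option 2: the NULL test stays inside the AND
parenthesized = ids("active = 1 AND (exitreason <> '0' OR exitreason IS NULL)")

# Option 3: same rows as option 2, the wrapper adds nothing
wrapped = ids("active = 1 AND (IFNULL(exitreason, '') <> '0' OR exitreason IS NULL)")

print(unparenthesized)  # [2, 3, 4]
print(parenthesized)    # [2, 3]
print(wrapped)          # [2, 3]
```

Row 4 fails the active = 1 filter but still shows up in option 1, because the trailing OR escapes the rest of the WHERE clause. That kind of leak would explain the different row counts.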

Related

SQL Server ISNULL multiple columns

I have the following query which works great but how do I add multiple columns in its select statement? Following is the query:
SELECT ISNULL(
(SELECT DISTINCT a.DatasourceID
FROM [Table1] a
WHERE a.DatasourceID = 5 AND a.AgencyID = 4 AND a.AccountingMonth = 201907), NULL) TEST
So currently I only get one column (TEST) but would like to add other columns such as DataSourceID, AgencyID and AccountingMonth.
If you want to output a row for each set of requested values, whether or not Table1 contains a matching row,
you can put the requested values in a pseudo table in the FROM clause and left outer join it with your Table1.
SELECT ISNULL(Table1.DatasourceId, 999999),
Table1.AgencyId,
Table1.AccountingMonth,
COUNT(*) as count
FROM ( VALUES (5, 4, 201907 ),
(6, 4, 201907 ))
AS requested(DatasourceId, AgencyId, AccountingMonth)
LEFT OUTER JOIN Table1 ON requested.agencyid=Table1.AgencyId
AND requested.datasourceid = Table1.DatasourceId
AND requested.AccountingMonth = Table1.AccountingMonth
GROUP BY Table1.DatasourceId, Table1.AgencyId, Table1.AccountingMonth
Note that:
I have put an ISNULL on the first column, as you did, to output a particular value (999999) when no value is found.
I did not put the ISNULL(...,NULL) like your query in the other columns since IMHO it is not necessary: if there is no value, a null will be output anyway.
I added a COUNT(*) column to illustrate an aggregate, you could use another (SUM, MIN, MAX) or none if you do not need it.
The set of requested values is provided as a constant table values (see https://learn.microsoft.com/en-us/sql/t-sql/queries/table-value-constructor-transact-sql?view=sql-server-2017)
I have added multiple rows for requested conditions : you can request for multiple datasources, agencies or months in one query with one line for each in the output.
If you want only one row, put only one row in "requested" pseudo table values.
There must be a GROUP BY even if you do not use an aggregate (COUNT, SUM or other): it gives the same behavior as your DISTINCT clause, restricting the output to a single line per set of requested values.
It seems to me that you want to see whether data exists. I am guessing that your AgencyID is a foreign key to an Agency table, DataSourceID likewise to a DataSource table, and that you have an AccountingMonth table which holds all accounting periods:
SELECT ds.ID as DataSourceID , ag.ID as AgencyID , am.ID as AccountingMonth ,
COUNT(a.DataSourceID) as Count
FROM [Table1] a
RIGHT JOIN [Datasource] ds ON ds.ID = a.DataSourceID
RIGHT JOIN [Agency] ag ON ag.ID = a.AgencyID
RIGHT JOIN [AccountingMonth] am on am.ID = a.AccountingMonth
GROUP BY ds.ID, ag.ID, am.ID
This way you can see the count of records per grouping. Note the RIGHT JOIN: you must use a RIGHT JOIN if you want to include all records from the "right" table.
In your query you have DISTINCT a.DatasourceID and WHERE a.DatasourceID = 5, so it returns 5 if the table contains rows that match your WHERE criteria, and NULL if there is no data. If you remove WHERE a.DatasourceID = 5, your query can fail with an error because the subquery may return more than one value.
The way you are doing it only allows for one column and one record, which you name TEST. It does not look like you really need to test for NULL: you are replacing NULL with NULL, so that does nothing to help you. Remove all the NULL testing and return a full record set. DISTINCT is also what limits your result to 1 record. When working with a single table you don't need an alias, and bracketed identifiers are not required when a name contains no spaces or keywords. If you need to know whether the record set is empty, test for it in the calling program.
SELECT DatasourceID, AgencyID,AccountingMonth
FROM Table1
WHERE DatasourceID = 5 AND AgencyID = 4 AND AccountingMonth = 201907

How to make a new column in SELECT clause and fill it with a string/list dynamically with concat of all condition statements satisfied?

So there are two tables in a database. I have to find out whichever rows have discrepancy based on certain conditions (in couple of cases that's just equality checking between fields). I report ID of those rows.
The problem is to also include, in another column, the reasons why that ID is reported. Because an ID can fail multiple conditions (like a mismatch on two fields), I want to include all of those reasons in that column.
Basic idea is to append all the mismatches in another column.
I've looked at several SO questions but they don't exactly match my use case. So now I'm thinking it's not possible with SQL.
I searched Google for "enter dynamic column values based on conditions sql", and hit : SQL Conditional column data return in a select statement : This adds a static column
I also learned it's possible to add another column in SELECT with dynamic content like this:
SELECT id, CASE
WHEN columnname = 'DEF' THEN 'I' ELSE 'YOU' END AS newColumnName
FROM tableName
But I have not been able to find dynamic column value assignment and update SQL. That's the problem.
Expected results:
I just want to be able to concat all the cases "strings" which a record is applicable for.
Do this with the two tables.
So because I have two tables to work with I have to put these conditions in the WHERE sub-clause, and not in the SELECT one.
So, if for ID = 345, column "FOO_MAN" does not match between two tables, and column "BAR_TOO" also does not match between two tables, then?
Then I want my select clause to capture information like this:
ID | REASON
345 | FOO_MAN BAR_TOO
It's probably easier to build this type of query dynamically (e.g. using a stored procedure) based on the conditions you want to test, but here is a small example which shows how it can be done:
SELECT t1.id,
CONCAT_WS(' ',
CASE WHEN t1.foo != t2.foo THEN 'foo' END,
CASE WHEN t1.bar != t2.bar THEN 'bar' END
) AS reason
FROM t1
JOIN t2 ON t2.id = t1.id
WHERE t1.foo != t2.foo OR t1.bar != t2.bar
Output (for my demo on dbfiddle)
id reason
2 foo
4 bar
5 foo bar
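If your database has no CONCAT_WS, the same idea can be sketched with plain string concatenation and TRIM. The version below runs on Python's built-in sqlite3 module; the tables t1 and t2 mirror the names above, but the sample rows are invented for illustration and are not the data from my dbfiddle demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, foo TEXT, bar TEXT);
    CREATE TABLE t2 (id INTEGER PRIMARY KEY, foo TEXT, bar TEXT);
    -- Hypothetical rows: id 1 matches, id 2 differs on foo, id 3 on both.
    INSERT INTO t1 VALUES (1, 'a', 'x'), (2, 'b', 'y'), (3, 'c', 'z');
    INSERT INTO t2 VALUES (1, 'a', 'x'), (2, 'B', 'y'), (3, 'C', 'Z');
""")

# Each CASE contributes its label plus a trailing space, or an empty string.
# TRIM removes the trailing space left by the last label that fired.
rows = conn.execute("""
    SELECT t1.id,
           TRIM(
               CASE WHEN t1.foo != t2.foo THEN 'foo ' ELSE '' END ||
               CASE WHEN t1.bar != t2.bar THEN 'bar ' ELSE '' END
           ) AS reason
    FROM t1
    JOIN t2 ON t2.id = t1.id
    WHERE t1.foo != t2.foo OR t1.bar != t2.bar
    ORDER BY t1.id
""").fetchall()

print(rows)  # [(2, 'foo'), (3, 'foo bar')]
```

Note that != never reports a mismatch when either side is NULL, so nullable columns would need an explicit IS NULL check in each CASE.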

Select values based on DISTINCT combination of rest of columns Oracle DB

I want to select row IDs associated with distinct column combinations in the remainder of a table. For instance, if the distinct rows are
I want to get the row IDs associated with each row. I can't query for distinct IDs since they are the row's primary key (and hence are all distinct).
So far I have:
SELECT e.ID
FROM E_UPLOAD_TEST e
INNER JOIN (
SELECT DISTINCT WHAT, MATERIALS, ERROR_FIELD, UNITS, SEASONALITY, DATA_TYPE, DETAILS, METHODS, DATA_FORMAT
FROM E_UPLOAD_TEST) c
ON e.WHAT = c.WHAT AND e.MATERIALS = c.MATERIALS AND e.ERROR_FIELD = c.ERROR_FIELD AND e.DATA_TYPE = c.DATA_TYPE AND e.METHODS = c.METHODS AND e.DATA_FORMAT = c.DATA_FORMAT;
which runs but doesn't return anything. Am I missing a GROUP BY and/or MIN() statement?
@serg is correct. Every single row in your example has at least one column value that is null. That means that no row will match your join condition. That is why your query results in no rows found.
Modifying your condition might get you what you want so long as your data isn't changing frequently. If it is changing frequently, then you probably want a single query for the entire job; otherwise you'll have to set your transaction so that it is immune to data changes.
An example of such a condition change is this:
( (e.WHAT is null and c.WHAT is null) or (e.WHAT = c.WHAT) )
But such a change makes sense only if two rows having a null value in the same column means the same thing for both rows and it has to mean the same thing as time marches on. What "WHAT is null" means today might not be the same thing tomorrow. And that is probably why C. J. Date hates nulls so much.
Instead of comparing with =, use the DECODE function, which treats two null values as equal.
e.WHAT = c.WHAT -> DECODE(e.WHAT, c.WHAT, 1) = 1
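A small sketch of why the plain equality join comes back empty and the null-aware condition does not. It uses Python's built-in sqlite3 module instead of Oracle, so DECODE is not available and the explicit "both null or equal" form is used. The table is cut down to two of the columns (WHAT, UNITS) and the three rows are invented; the only assumption carried over is that every row has at least one null.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE e_upload_test (id INTEGER PRIMARY KEY, what TEXT, units TEXT);
    -- Hypothetical rows: each one has a NULL in some column.
    INSERT INTO e_upload_test VALUES
        (1, 'rain', NULL),
        (2, 'rain', NULL),
        (3, NULL,   'mm');
""")

def matching_ids(join_condition):
    """Join the table to its own DISTINCT combinations with the given condition."""
    sql = f"""
        SELECT e.id
        FROM e_upload_test e
        JOIN (SELECT DISTINCT what, units FROM e_upload_test) c
          ON {join_condition}
        ORDER BY e.id
    """
    return [row[0] for row in conn.execute(sql)]

# Plain equality: NULL = NULL is never true, so no row survives the join.
plain = matching_ids("e.what = c.what AND e.units = c.units")

# Null-aware comparison on every column.
null_safe = matching_ids("""
    ((e.what IS NULL AND c.what IS NULL) OR e.what = c.what)
    AND ((e.units IS NULL AND c.units IS NULL) OR e.units = c.units)
""")

print(plain)      # []
print(null_safe)  # [1, 2, 3]
```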

SQL Server where column in where clause is null

Let's say that we have a table named Data with Id and Weather columns. Other columns in that table are not important to this problem. The Weather column can be null.
I want to display all rows where Weather fits a condition, but if there is a null value in weather then display null value.
My SQL so far:
SELECT *
FROM Data d
WHERE (d.Weather LIKE '%'+COALESCE(NULLIF('',''),'sunny')+'%' OR d.Weather IS NULL)
My results are wrong, because that statement also shows rows where Weather is null when nothing matches the condition (let's say the user mistyped the search term).
I found a similar topic, but I did not find an appropriate answer there:
SQL WHERE clause not returning rows when field has NULL value
Please help me out.
Your query is correct for the general task of treating NULLs as a match. If you wish to suppress NULLs when there are no other results, you can add an AND EXISTS ... condition to your query, like this:
SELECT *
FROM Data d
WHERE d.Weather LIKE '%'+COALESCE(NULLIF('',''),'sunny')+'%'
OR (d.Weather IS NULL AND EXISTS (SELECT * FROM Data dd WHERE dd.Weather LIKE '%'+COALESCE(NULLIF('',''),'sunny')+'%'))
The additional condition ensures that NULLs are treated as matches only if other matching records exist.
You can also use a common table expression to avoid duplicating the query, like this:
WITH cte (id, weather) AS
(
SELECT *
FROM Data d
WHERE d.Weather LIKE '%'+COALESCE(NULLIF('',''),'sunny')+'%'
)
SELECT * FROM cte
UNION ALL
SELECT * FROM Data WHERE weather is NULL AND EXISTS (SELECT * FROM cte)
You wrote that the statement also shows rows where Weather is null if the condition is not met ("let's say the user mistyped sunny").
This suggests that the constant 'sunny' is coming from end-user's input. If that is the case, you need to parameterize your query to avoid SQL injection attacks.
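As a rough illustration of the parameterized form, here is the EXISTS variant with the search term bound as a parameter. This is a sketch on Python's built-in sqlite3 module, not SQL Server, so string concatenation uses || instead of +; the sample rows and the search helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Data (Id INTEGER PRIMARY KEY, Weather TEXT);
    -- Hypothetical sample rows.
    INSERT INTO Data VALUES
        (1, 'sunny'), (2, 'partly sunny'), (3, 'rain'), (4, NULL);
""")

# :term is bound by the driver, never spliced into the SQL text.
QUERY = """
    SELECT Id, Weather
    FROM Data d
    WHERE d.Weather LIKE '%' || :term || '%'
       OR (d.Weather IS NULL
           AND EXISTS (SELECT 1 FROM Data dd
                       WHERE dd.Weather LIKE '%' || :term || '%'))
    ORDER BY Id
"""

def search(term):
    """Rows matching the term, plus NULL rows only if something matched."""
    return conn.execute(QUERY, {"term": term}).fetchall()

print(search("sunny"))  # [(1, 'sunny'), (2, 'partly sunny'), (4, None)]
print(search("snuny"))  # []
```

With a mistyped term nothing matches, so the NULL row is suppressed as well. A term containing quotes is treated as data rather than as SQL.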

Oracle SQL - JOIN performance in comparing null values

Good morning,
In a query I was writing yesterday between two decent-sized result sets (<50k results each), part of my JOIN was a clause to check if the data matched or was null (simplified version below):
SELECT a JOIN b ON a.class = b.class OR (a.class is null AND b.class is null)
However, I noticed a serious performance issue centered around the use of the OR statement. I worked around the issue using the following:
SELECT a JOIN b ON NVL(a.class, 'N/A') = NVL(b.class, 'N/A')
The first query has an unacceptably long run time, while the second is a couple of orders of magnitude faster (>45 minutes vs. <1). I would expect the OR to run slower due to more comparisons, but the cases in which a.class = b.class = null are comparatively few in this particular dataset.
What would cause such a dramatic increase in performance time? Does Oracle SQL not short-circuit boolean comparisons like many other languages? Is there a way to salvage the first query over the second (for use in general SQL not just Oracle)?
You're returning a cross product with any record with a null class. Is this OK for your results?
I created two sample queries in 11gR2:
WITH a as
(select NULL as class, 5 as columna from dual
UNION
select NULL as class, 7 as columna from dual
UNION
select NULL as class, 9 as columna from dual
UNION
select 'X' as class, 3 as columna from dual
UNION
select 'Y' as class, 2 as columna from dual),
b as
(select NULL as class, 2 as columnb from dual
UNION
select NULL as class, 15 as columnb from dual
UNION
select NULL as class, 5 as columnb from dual
UNION
select 'X' as class, 7 as columnb from dual
UNION
select 'Y' as class, 9 as columnb from dual)
SELECT * from a JOIN b ON (a.class = b.class
OR (a.class is null AND b.class is null))
When I run EXPLAIN PLAN on this query, it indicates the tables (inline views in my case) are joined via NESTED LOOPS. NESTED LOOPS joins operate by scanning the first row of one table, then scanning each row of the other table for matches, then scanning the second row of the first table, looks for matches on the second table, etc. Because you are not directly comparing either table in the OR portion of your JOIN, the optimizer must use NESTED LOOPS.
Behind the scenes it may look something like:
Get Table A, row 1. If class is null, include this row from Table A on the result set.
While still on Table A Row 1, Search table B for all rows where class is null.
Perform a cross product on Table A Row 1 and all rows found in Table B
Include these rows in the result set
Get Table A, row 2. If class is null, include this row from Table A on the result set.
.... etc
When I change the SELECT statement to SELECT * FROM a JOIN b ON NVL(a.class, 'N/A') = NVL(b.class, 'N/A'), EXPLAIN indicates that a HASH JOIN is used. A hash join essentially generates a hash of each join key of the smaller table, and then scans the large table, finding the hash in the smaller table for each row that matches. In this case, since it's a simple Equijoin, the optimizer can hash each row of the driving table without problems.
Behind the scenes it may look something like:
Go through table A, converting NULL class values to 'N/A'
Hash each row of table A as you go.
Hash Table A is now in temp space or memory.
Scan table B, converting NULL class values to 'N/A', then computing hash of value. Lookup hash in hash table, if it exists, include the joined row from Table A and B in the result set.
Continue scanning B.
If you run an EXPLAIN PLAN on your queries, you probably will find similar results.
Even though the end result is the same, the first query joins the tables with an "OR" rather than a plain equality, so the optimizer can't use a better join method. NESTED LOOPS can be very slow if the driving table is large or if you are forcing a full table scan against a large secondary table.
You can use the ANSI COALESCE function to emulate the NVL oracle function in other database systems. The real issue here is that you're attempting to join on a NULL value, where you really should have a "NO CLASS" or some other method of identifying a "null" class in the sense of null = nothing instead of null = unknown.
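To check that the COALESCE form returns the same rows as the OR form, here is a quick sketch on Python's built-in sqlite3 module (not Oracle), loaded with the same sample values as the two inline views above. It only compares results; it says nothing about which plan Oracle will pick.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (class TEXT, columna INTEGER);
    CREATE TABLE b (class TEXT, columnb INTEGER);
    INSERT INTO a VALUES (NULL, 5), (NULL, 7), (NULL, 9), ('X', 3), ('Y', 2);
    INSERT INTO b VALUES (NULL, 2), (NULL, 15), (NULL, 5), ('X', 7), ('Y', 9);
""")

# Original join condition with the OR.
or_join = conn.execute("""
    SELECT a.columna, b.columnb
    FROM a JOIN b
      ON a.class = b.class OR (a.class IS NULL AND b.class IS NULL)
    ORDER BY 1, 2
""").fetchall()

# Portable rewrite: COALESCE in place of NVL, giving a plain equijoin.
coalesce_join = conn.execute("""
    SELECT a.columna, b.columnb
    FROM a JOIN b
      ON COALESCE(a.class, 'N/A') = COALESCE(b.class, 'N/A')
    ORDER BY 1, 2
""").fetchall()

print(len(or_join))              # 11 (3 x 3 null pairs, plus X and Y)
print(or_join == coalesce_join)  # True
```

The two are only equivalent as long as 'N/A' never occurs as a real class value; if it can, pick a sentinel that cannot appear in the data.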
Addendum to answer your question in the comments:
For the null query the SQL engine will do the following:
Read Row 1 from Table A, class is null, convert to 'N/A'.
Table B has 3 Rows which have class is null, convert each null to 'N/A'.
Since the first row matches to all 3 rows, 3 rows are added to our result set, one for A1B1, A1B2, A1B3.
Read Row 2 From Table A, class is null, convert to 'N/A'.
Table B has 3 Rows which have class is null, convert each null to 'N/A'.
Since the second row matches to all 3 rows, 3 rows are added to our result set, one for A2B1, A2B2, A2B3.
Read Row 3 From Table A, class is null, convert to 'N/A'.
Table B has 3 Rows which have class is null, convert each null to 'N/A'.
Since the third row matches to all 3 rows, 3 rows are added to our result set, one for A3B1, A3B2, A3B3.
Rows 4 and 5 aren't null so they won't be processed in this portion of the join.
For the 'N/A' query, the SQL engine will do the following:
Read Row 1 from Table A, class is null, convert to 'N/A', hash this value.
Read Row 2 from Table A, class is null, convert to 'N/A', hash this value.
Read Row 3 from Table A, class is null, convert to 'N/A', hash this value.
Read Row 4 from Table A, class not null, hash this value.
Read Row 5 from Table A, class not null, hash this value.
Hash table C is now in memory.
Read Row 1 from Table B, class is null, convert to 'N/A', hash the value.
Compare hashed value to hash table in memory, for each match add a row to the result set. 3 rows are found, A1, A2, and A3. Results are added A1B1, A2B1, A3B1.
Read Row 2 from Table B, class is null, convert to 'N/A', hash the value.
Compare hashed value to hash table in memory, for each match add a row to the result set. 3 rows are found, A1, A2, and A3. Results are added A1B2, A2B2, A3B2.
Read Row 3 from Table B, class is null, convert to 'N/A', hash the value.
Compare hashed value to hash table in memory, for each match add a row to the result set. 3 rows are found, A1, A2, and A3. Results are added A1B3, A2B3, A3B3.
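The two walkthroughs can be mimicked in a few lines of Python. This is only a toy model of the two join strategies over the sample data, not how Oracle implements them: the function names are made up, a Python dict plays the role of the in-memory hash table, and None stands for NULL.

```python
from collections import defaultdict

# (class, value) pairs matching the sample inline views a and b.
a = [(None, 5), (None, 7), (None, 9), ("X", 3), ("Y", 2)]
b = [(None, 2), (None, 15), (None, 5), ("X", 7), ("Y", 9)]

def nested_loop_join(left, right):
    """For every left row, scan every right row and test the OR condition."""
    out = []
    for l_class, l_val in left:
        for r_class, r_val in right:
            both_null = l_class is None and r_class is None
            equal = l_class is not None and l_class == r_class
            if equal or both_null:
                out.append((l_val, r_val))
    return out

def hash_join(left, right, sentinel="N/A"):
    """Build a hash table on the left keys once, then probe it per right row."""
    buckets = defaultdict(list)
    for l_class, l_val in left:
        key = sentinel if l_class is None else l_class  # the NVL step
        buckets[key].append(l_val)
    out = []
    for r_class, r_val in right:
        key = sentinel if r_class is None else r_class
        for l_val in buckets.get(key, []):
            out.append((l_val, r_val))
    return out

nl_rows = sorted(nested_loop_join(a, b))
hj_rows = sorted(hash_join(a, b))
print(len(nl_rows), nl_rows == hj_rows)  # 11 True
```

The nested loop evaluates the condition len(a) * len(b) times, while the hash join makes one pass over each input plus one lookup per row of b. On two 50k-row sets that is roughly 2.5 billion condition checks versus about 100k hash operations, before counting the output rows themselves.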
In the first case, because each null is different from every other null, the database cannot use that optimization (for every row from a it checks each row from table b).
In the second case the database first changes all nulls to 'N/A' and then only compares a.class and b.class, which it can optimize.
Comparing nulls in Oracle is very time-consuming. Null is an undefined value: one null is different from another null.
Compare the results of two almost identical queries:
select 1 from dual where null is null
select 1 from dual where null = null
Only the first query, with the special is null clause, returns a row; null = null never evaluates to true. This is related to why Oracle does not store entirely null keys in an ordinary B-tree index.
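The same comparison can be tried outside Oracle. Below is a minimal sketch with Python's built-in sqlite3 module, which needs no dual table; the variable names are just for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# IS NULL is the dedicated null test and evaluates to true.
is_null_rows = conn.execute("SELECT 1 WHERE NULL IS NULL").fetchall()

# NULL = NULL evaluates to unknown, so the WHERE clause filters the row out.
equals_rows = conn.execute("SELECT 1 WHERE NULL = NULL").fetchall()

# Selecting the comparison itself shows it yields NULL, not true or false.
comparison = conn.execute("SELECT NULL = NULL").fetchone()[0]

print(is_null_rows)  # [(1,)]
print(equals_rows)   # []
print(comparison)    # None
```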
Try this one:
SELECT a.* from Table1 a JOIN JTable1 b ON a.class = b.class
union all
SELECT a.* from Table1 a JOIN JTable1 b ON a.class is null and b.class is null
This should be orders of magnitude faster.
The explanation is simple:
The first one has to use nested loops for the join, which typically happens when the join condition contains an OR.
The second one can use a hash join, which is faster than nested loops here.
Why don't you make it a little bit easier?
Like this:
SELECT *
FROM a,b
WHERE
a.class(+)=b.class(+)
I think it's more readable.