Access 2010 Query Confusion - sql

Just need some quick clarification
I have 2 Queries in my Access Database that should return Inverse results:
SELECT Equipment.title
FROM Equipment
WHERE (((Equipment.[EquipmentID]) Not In (
select EquipmentID
from DownPeriod
where UpDate is null
)));
The 2nd just excludes the Not before the In.
My confusion comes from the fact that the query posted above returns no results if the EquipmentID field has at least one null value in the DownPeriod table.
It works fine if the fields are filled, and the inverse query always works. This makes me think there's an issue with the null value.
Now this field should never be null but I wanted to know if I could still get this to work in the unlikely event a null did occur.
Thank you in advance!

Try joins:
SELECT Equipment.title FROM Equipment INNER JOIN DownPeriod
ON Equipment.EquipmentID = DownPeriod.EquipmentID
WHERE DownPeriod.UpDate is null
and
SELECT Equipment.title FROM Equipment INNER JOIN DownPeriod
ON Equipment.EquipmentID = DownPeriod.EquipmentID
WHERE DownPeriod.UpDate is not null
See if a change in syntax fixes your issue.
Not only should this work, but I believe it is faster than using the IN() / NOT IN() methods (I might be wrong on that, but it is nicer to read). It also lets you quickly change the "is not null" criterion, just like switching IN to NOT IN.
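One thing worth checking before swapping the queries: the second join returns equipment that has at least one closed down period, which is not necessarily the same set as the NOT IN query (equipment with no open down period). Here is a minimal sketch of the difference, using SQLite through Python as a stand-in for Access. The sample rows are made up, and UpDate is bracketed because SQLite treats it as a keyword.

```python
import sqlite3

# SQLite stands in for Access here; all rows below are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Equipment (EquipmentID INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE DownPeriod (ID INTEGER PRIMARY KEY, EquipmentID INTEGER, [UpDate] TEXT);
    INSERT INTO Equipment VALUES (1, 'one'), (2, 'two'), (3, 'three');
    -- Equipment 1: one closed and one still-open down period.
    -- Equipment 2: one closed down period. Equipment 3: never down.
    INSERT INTO DownPeriod VALUES (1, 1, '2013-01-05'), (2, 1, NULL), (3, 2, '2013-02-01');
""")

def titles(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

# The original NOT IN query: equipment with no open down period.
not_in = titles("""
    SELECT title FROM Equipment
    WHERE EquipmentID NOT IN (SELECT EquipmentID FROM DownPeriod WHERE [UpDate] IS NULL)
    ORDER BY EquipmentID
""")

# The suggested INNER JOIN: equipment with at least one closed down period.
inner_join_closed = titles("""
    SELECT e.title FROM Equipment e
    INNER JOIN DownPeriod d ON e.EquipmentID = d.EquipmentID
    WHERE d.[UpDate] IS NOT NULL
    ORDER BY e.EquipmentID
""")

# A LEFT JOIN anti-join that matches the NOT IN semantics.
anti_join = titles("""
    SELECT e.title FROM Equipment e
    LEFT JOIN DownPeriod d ON e.EquipmentID = d.EquipmentID AND d.[UpDate] IS NULL
    WHERE d.ID IS NULL
    ORDER BY e.EquipmentID
""")

print(not_in, inner_join_closed, anti_join)
```

With this data the INNER JOIN version lists 'one' (which is currently down) and misses 'three' (which was never down), while the LEFT JOIN version agrees with NOT IN. Access may need extra parentheses around a multi-condition ON clause.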

I agree with StuckAtWork's approach. However, if you still want to understand why your original approach didn't produce the results you want, I think I can help you.
There may be an issue with empty strings which could complicate the situation. But regardless of whether or not empty strings are involved you have something more fundamental to consider.
Here is my version of the Equipment table.
EquipmentID title
1 one
2 two
3 three
And here is my version of the DownPeriod table.
ID EquipmentID text_field
1 1 one
2 2 two
3 Null
4 3 three
I didn't include your UpDate field in my DownPeriod table. It's irrelevant to your problem.
I pasted your SQL into a new Access query, discarded the WHERE clause from the subquery, and got exactly the same result as this query --- no rows returned:
SELECT e.title
FROM Equipment AS e
WHERE
e.EquipmentID Not In (
SELECT EquipmentID
FROM DownPeriod
);
So consider this situation from the db engine's perspective. Using my version of the DownPeriod table, it has a set of values (1, 2, Null, and 3) from the subquery. You're asking it to show you the rows from Equipment where EquipmentID is NOT IN that list of values. The db engine will only give you the rows for which that condition is True.
Null is the problem. For each EquipmentID, when it considers whether that value is not present in the subquery set, it doesn't know. That Null is an unknown value ... and the unknown value might be the same as the current EquipmentID it's considering ... or might be something else. But since the db engine doesn't know the real value, it can't evaluate the condition as True, so will not include that row in the result set. The same thing happens for every row in Equipment table ... therefore your query's result set is empty (no rows).
You could get your desired results by excluding Null values from the subquery result set with a WHERE clause like the one below. But I think StuckAtWork's suggestion is a better way to go.
SELECT e.title
FROM Equipment AS e
WHERE
e.EquipmentID Not In (
SELECT EquipmentID
FROM DownPeriod
WHERE EquipmentID Is Not Null
);
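The effect of a single Null in the subquery can be reproduced in a few lines. This is a sketch in SQLite through Python rather than Access, with made-up rows (DownPeriod holds EquipmentID 1 and a Null), but the three-valued logic behaves the same way. NOT EXISTS is included as a further alternative that is not affected by Nulls.

```python
import sqlite3

# SQLite stands in for Access; the sample rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Equipment (EquipmentID INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE DownPeriod (ID INTEGER PRIMARY KEY, EquipmentID INTEGER);
    INSERT INTO Equipment VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO DownPeriod VALUES (1, 1), (2, NULL);
""")

def titles(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

# The Null in the subquery makes NOT IN evaluate to unknown for every non-matching row.
not_in_raw = titles("""
    SELECT title FROM Equipment
    WHERE EquipmentID NOT IN (SELECT EquipmentID FROM DownPeriod)
    ORDER BY EquipmentID
""")

# Excluding Nulls from the subquery restores the expected result.
not_in_filtered = titles("""
    SELECT title FROM Equipment
    WHERE EquipmentID NOT IN (
        SELECT EquipmentID FROM DownPeriod WHERE EquipmentID IS NOT NULL)
    ORDER BY EquipmentID
""")

# NOT EXISTS only asks whether a matching row exists, so Nulls never match.
not_exists = titles("""
    SELECT e.title FROM Equipment e
    WHERE NOT EXISTS (SELECT 1 FROM DownPeriod d WHERE d.EquipmentID = e.EquipmentID)
    ORDER BY e.EquipmentID
""")

print(not_in_raw, not_in_filtered, not_exists)
```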

Related

SQL Server 2016 AdventureWorks NULL checks

I am working on 2 different assignments where I have to do null checks, but I'm not sure whether I have written the syntax correctly. My instructor has not really discussed this but will be marking for it.
Below are the 2 questions and what I have written. Any help is appreciated.
Assignment 1 question: Create a list of the sales order numbers for orders not ordered online and not with a credit card. Note: 0 is false and 1 is true for bit fields. Below is the syntax I used. Am I doing a null check here?
SELECT SalesOrderNumber
FROM Sales.SalesOrder_json
WHERE OnlineOrderFlag = 0 AND CreditCardID IS NULL
Assignment 2 question: list the vendors that have no products. Below is the syntax I used, am I doing a null check here?
SELECT
pv.Name AS Vendors,
COUNT(PP.ProductID) AS 'Products'
FROM
Purchasing.Vendor AS PV
LEFT JOIN
Purchasing.ProductVendor AS PPV ON PV.BusinessEntityID = PPV.BusinessEntityID
LEFT JOIN
Production.Product AS PP ON PP.ProductID = PPV.ProductID
GROUP BY
PV.Name
HAVING
COUNT(PP.ProductID) = 0;
Welcome to Stack Overflow!
In the future, please post a summary or create table statements that represents the schema of the tables used in your queries so that we have enough information to provide more than speculative responses. Even though this is the Adventure Works DB, you should start your SO journey with good habits!
Please try not to post assignment questions directly online, as most academic plagiarism checkers will easily flag you: other students may see your post and the support you get from the community, which could result in all of you handing in the same answer.
Have you run your queries? Do you think the results are correct?
If the results from your queries are correct, then the only issue is "have you done any null checks"? One could say that if your results have returned the correct results then you must have satisfied the criteria, otherwise the question wasn't formulated very well.
Null checks can be summarised into 3 patterns:
1. Directly comparing against null using IS NULL or IS NOT NULL in your query.
2. Using JOIN syntax to deal with data that may have nulls:
   - INNER JOIN limits the results to records that match in both tables. Use this if you need to omit records that have a null in the foreign key field.
   - Non-INNER joins, like LEFT JOIN, return results from the left table even if there are no matching records in the joined (right) table.
   This is a good discussion on all supported joins: LEFT JOIN vs. LEFT OUTER JOIN in SQL Server
3. Using aggregate functions. Aggregates generally omit null values: COUNT returns 0 if all values are NULL, whereas other aggregates such as SUM, MIN, MAX and AVG return NULL if all values are NULL.
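The aggregate behaviour is easy to confirm on a column that contains only NULLs. This is a small sketch in SQLite through Python (not SQL Server), using a throwaway table named t that is purely illustrative.

```python
import sqlite3

# SQLite stands in for SQL Server; table t is a throwaway example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (v INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(None,), (None,)])

# COUNT(v) skips NULLs, COUNT(*) counts rows, the other aggregates yield NULL.
all_null_row = conn.execute(
    "SELECT COUNT(v), COUNT(*), SUM(v), MIN(v), MAX(v), AVG(v) FROM t"
).fetchone()

print(all_null_row)
```

COUNT(v) comes back as 0 while COUNT(*) still reports 2 rows, and SUM, MIN, MAX and AVG all come back as NULL (None in Python).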
Question 1
Clearly you have implemented a NULL check because you have evaluated criteria directly on the nullable column.
It looks like your answer to Question 1 is pretty good.
Question 2
While your query looks like it would return the vendors with no products, it is also returning a count of zero.
You do not need to output a column so that you can use it in a filter criteria, so remove COUNT(PP.ProductID) AS 'Products' unless you have been otherwise instructed to use it.
Is this a NULL check? That is up to interpretation; I think in this case the answer is yes. By using LEFT JOIN (or other OUTER joins) you have created a result set in which the field PP.ProductID is NULL if there are no products.
Using Count in the filter criteria over that null column and recognising that a Count with a zero result means that the ProductID column was in fact null means you have evaluated a null check.
There are other ways to query for the same results, such as using NOT EXISTS. NOT EXISTS would NOT be a direct null check, because NULLABILITY was not evaluated directly.
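For comparison, here is a sketch of the two approaches side by side, run in SQLite through Python. The Vendor and ProductVendor tables are simplified, hypothetical stand-ins for the AdventureWorks tables (no schema prefixes, only the columns needed), and the vendor names are made up.

```python
import sqlite3

# Simplified stand-ins for Purchasing.Vendor and Purchasing.ProductVendor.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Vendor (BusinessEntityID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE ProductVendor (ProductID INTEGER, BusinessEntityID INTEGER);
    INSERT INTO Vendor VALUES (1, 'Acme'), (2, 'Borealis');
    INSERT INTO ProductVendor VALUES (10, 1);
""")

def names(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

# LEFT JOIN plus COUNT = 0: relies on the NULL ProductID produced by the outer join.
via_left_join = names("""
    SELECT v.Name
    FROM Vendor AS v
    LEFT JOIN ProductVendor AS pv ON pv.BusinessEntityID = v.BusinessEntityID
    GROUP BY v.Name
    HAVING COUNT(pv.ProductID) = 0
""")

# NOT EXISTS: same vendors, but no column is ever tested for NULL.
via_not_exists = names("""
    SELECT v.Name
    FROM Vendor AS v
    WHERE NOT EXISTS (
        SELECT 1 FROM ProductVendor AS pv
        WHERE pv.BusinessEntityID = v.BusinessEntityID)
""")

print(via_left_join, via_not_exists)
```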

Why would a subquery field not filter tablix but main query field does?

I'm not very good at SQL or SSRS so please excuse any incorrect terminology. I work at a wood shop and I'm editing a parts report which has an existing query that returns separate fields that contain duplicate data. One of the fields is a direct select from some joins, the other is a sub-query that is aliased. I want to use the sub-query field only to be consistent.
I try to set the tablix filter to [MAT_DESC] <> (leave blank) but the tablix does not filter. [MATNAME] <> (leave blank) works. not(isnothing([MAT_DESC])) = True also works.
WITH ORDERLIST AS (SELECT ... FROM ... WHERE...)
SELECT
IDBGPL.MATNAME, --THIS ONE WILL FILTER
(SELECT MAT.TEXT FROM MAT WHERE MAT.NAME=IDBGPL.MATID) AS MAT_DESC, --THIS ONE WON'T FILTER
(SELECT MAT.ORDERID FROM MAT WHERE MAT.NAME=IDBGPL.MATID) AS MAT_DESC2, --THIS ONE IS ALSO USED AND COMES FROM THE SAME TABLE
FROM ORDERLIST
INNER JOIN...
INNER JOIN...
INNER JOIN...
When I try to filter a table with the sub-query field it doesn't work. When I use the directly selected field it does. Why does SSRS treat the sub-query field differently?
EDIT: For some clarification. The data is coming from a CAD/CAM program. The IDBGPL table has every part in every order in the system. The MAT table is a section of the program that describes each material. There are some parent/child parts where the parent does not have a material. I'm wanting to filter out those parent parts.
This could potentially return NULL:
(SELECT MAT.TEXT FROM MAT WHERE MAT.NAME=IDBGPL.MATID) AS MAT_DESC
You cannot meaningfully compare a NULL value with anything; you can only check whether it is NULL or not NULL.
So one solution is to never let it be NULL:
ISNULL((SELECT MAT.TEXT FROM MAT WHERE MAT.NAME=IDBGPL.MATID),'') AS MAT_DESC
Another solution is to check for NULL on the outside (in SSRS), that is, check for NULL or blank. You just need to understand that they are not the same value.
Also, you should consider doing a LEFT JOIN to MAT rather than using sub-queries.
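A rough sketch of the LEFT JOIN variant, run in SQLite through Python rather than SQL Server. The PARTID column and the sample rows are hypothetical, COALESCE is used in place of the T-SQL ISNULL, and TEXT is bracketed only to avoid any clash with the type name.

```python
import sqlite3

# SQLite stands in for SQL Server; PARTID and all rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MAT (NAME TEXT PRIMARY KEY, [TEXT] TEXT);
    CREATE TABLE IDBGPL (PARTID INTEGER PRIMARY KEY, MATNAME TEXT, MATID TEXT);
    INSERT INTO MAT VALUES ('M1', 'Oak 18mm');
    -- Part 2 is a parent part with no material, so nothing in MAT matches it.
    INSERT INTO IDBGPL VALUES (1, 'OAK-18', 'M1'), (2, '', NULL);
""")

# LEFT JOIN keeps the parent part; COALESCE turns the missing description into ''.
parts = conn.execute("""
    SELECT p.PARTID, COALESCE(m.[TEXT], '') AS MAT_DESC
    FROM IDBGPL AS p
    LEFT JOIN MAT AS m ON m.NAME = p.MATID
    ORDER BY p.PARTID
""").fetchall()

print(parts)
```

Because MAT_DESC is now an empty string instead of NULL for the parent part, a "not blank" filter can treat it the same way as the MATNAME field.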

Access SQL Syntax

I have inherited an Access database with some queries. It is an account database with information on the different accounts, specifically the IBAN numbers for each account.
One of the queries is this, where we compare IBAN-numbers from the database with IBAN-numbers from the imported Id-table:
SELECT CAMTaccounts.IBAN, CAMTaccounts.Comment
FROM CAMTaccounts LEFT JOIN Id ON CAMTaccounts.[IBAN] = Id.[IBAN]
WHERE (((Id.IBAN) Is Null));
I thought I understood the SQL-language to some degree coming from SQL Server, but this statement, I cannot understand.
To me, this is equivalent to writing:
Select CAMTaccounts.*
From CAMTaccounts Left Outer Join Id On CAMTaccounts.IBAN = Id.IBAN
Where Id.IBAN Is Null
and this join, to me, does not make any sense.
But clearly, I am not understanding this correctly.
I was hoping, that some of you could explain my flawed logic to me.
Thanks.
This query will return all rows from CAMTaccounts which have no matching Id rows. This is known as an anti-join. The excessive parentheses serve no purpose; they were most probably generated by a tool.
Suppose that CAMTaccounts.IBAN has the values
A
B
And Id.IBAN has the values
B
C
Then this query, without the WHERE clause, outputs something like this:
`CAMTaccounts.IBAN` | `Id.IBAN`
---------------------------------
A | NULL
B | B
You can easily see what will result from this if you add the Where Id.IBAN Is Null clause.
So this query gives you the rows from the CAMTaccounts table whose IBAN value does not appear in the Id.IBAN column.
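The A/B and B/C example above can be run as-is. This sketch uses SQLite through Python instead of Access and leaves out the Comment column.

```python
import sqlite3

# SQLite stands in for Access; the values are the A/B and B/C example from the answer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CAMTaccounts (IBAN TEXT);
    CREATE TABLE Id (IBAN TEXT);
    INSERT INTO CAMTaccounts VALUES ('A'), ('B');
    INSERT INTO Id VALUES ('B'), ('C');
""")

# Without the WHERE clause: unmatched left rows get NULL on the right side.
joined = conn.execute("""
    SELECT c.IBAN, i.IBAN
    FROM CAMTaccounts AS c LEFT JOIN Id AS i ON c.IBAN = i.IBAN
    ORDER BY c.IBAN
""").fetchall()

# With the WHERE clause: only the unmatched rows survive (the anti-join).
unmatched = conn.execute("""
    SELECT c.IBAN
    FROM CAMTaccounts AS c LEFT JOIN Id AS i ON c.IBAN = i.IBAN
    WHERE i.IBAN IS NULL
""").fetchall()

print(joined, unmatched)
```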

Mystery query fail: Why did this create a massive output?

I was attempting to do some basic Venn Diagram subtraction to compare a temp table to some live data, and see how they were different.
This query blew up to well north of 15 million returned rows, and I noticed it was duplicating (by 10,000x or more) a known unique field - indicating something went very wrong with my query (I mean by this that rows were being duplicated and I could verify this by this Globally Unique Identifier field). I was expecting to get at most 200 rows returned:
select a.*
from TableOfLiveData a
inner join #TempDataToBeSubtracted b
on a.GUID <> b.guidTemp --I suspect the issue is here
where {here was a limiting condition that should have reduced my live
data to a "pre-join" count(*) of 20,000 at most...}
After I hit Execute the query ran much longer than expected and I could see that millions of rows were being returned before I had to cancel out.
Let me know what the obvious thing is!?!?
edit: FYI: If the where clause were not included, I would expect a VAST amount of rows returned...
The problem is that joining on <> gives you something close to a "Cartesian product" (n x m rows): every live row is paired with every temp row whose GUID differs, including live rows that do have a match elsewhere in the temp table. Logically, the where clause is applied to the joined rows, so there can be a colossal number of rows to process and the query will be very, very slow.
A better approach is to do an outer join on the key columns, but discard all successful joins by filtering for missed joins:
select a.*
from TableOfLiveData a
left join #TempDataToBeSubtracted b on b.guidTemp = a.GUID
where a.field1 = 3
and a.field2 = 1515
and b.guidTemp is null -- only returns rows that *don't* match
This works because when an outer join is missed, you still get the row from the main table and all columns in the joined table are null.
Creating an index on (field1, field2) will improve performance.
Thank you @Lamak and @MartinSmith for your comments that solved this problem.
By using a 'not equals' in my "on" clause, I ensured that I would be selecting every row in LiveTable that didn't have a GUID in my #TempTable, not just once as I intended, but for each entry in my #TempTable, multiplying my results by about 20,000 in this case (the cardinality of the #TempTable).
To fix this, I did a simple subquery on my #TempTable using the "Not In" operator, as recommended in the comments. This query finished in under a minute and returned under 100 rows, which was much more in line with my expectation:
select a.*
from TableOfLiveData a
where a.GUID not in (select b.guidTemp from #TempDataToBeSubtracted b)
and {subsequent constraint statement not relevant to question}
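The multiplication effect can be reproduced on a tiny scale. In this sketch (SQLite through Python, not SQL Server) the table names Live and TempData and the GUID values are made up. With 4 live rows and 3 temp rows, the <> join returns 10 rows, while NOT IN and the LEFT JOIN anti-join both return the 2 rows that are really missing from the temp table.

```python
import sqlite3

# SQLite stands in for SQL Server; Live/TempData and their contents are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Live (GUID TEXT PRIMARY KEY);
    CREATE TABLE TempData (guidTemp TEXT PRIMARY KEY);
    INSERT INTO Live VALUES ('a'), ('b'), ('c'), ('d');
    INSERT INTO TempData VALUES ('a'), ('b'), ('x');
""")

# Joining on <> pairs each live row with every temp row that has a different GUID.
neq_join_count = conn.execute("""
    SELECT COUNT(*) FROM Live a INNER JOIN TempData b ON a.GUID <> b.guidTemp
""").fetchone()[0]

# NOT IN returns each live row at most once.
not_in = [row[0] for row in conn.execute("""
    SELECT a.GUID FROM Live a
    WHERE a.GUID NOT IN (SELECT b.guidTemp FROM TempData b)
    ORDER BY a.GUID
""")]

# The LEFT JOIN anti-join gives the same rows.
anti_join = [row[0] for row in conn.execute("""
    SELECT a.GUID FROM Live a
    LEFT JOIN TempData b ON b.guidTemp = a.GUID
    WHERE b.guidTemp IS NULL
    ORDER BY a.GUID
""")]

print(neq_join_count, not_in, anti_join)
```

Note that 'a' and 'b' still appear in the <> join output (twice each), even though they exist in the temp table.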

SQL INNER JOIN update NOT duplicating rows

I have an update query that has an inner join. I expect this query to return two rows because of the join, but it seems that the query is taking only the first row and using that to update the data while ignoring the rest.
Here is my update command
UPDATE [mamd]
SET [Brand_EL] = IIF(CHARINDEX('ELECT', UPPER([mml].[Brand_Desc])) > 0, 'YES', [Brand_EL])
FROM [mamd] [m]
INNER JOIN [ior] [ir] ON [ir].[CLIENT_CUSTOMER_ID] = [m].[CustomerId] COLLATE Latin1_General_CI_AS
INNER JOIN [maslist] [mml] ON [mml].[Model] = [ir].[MODEL] COLLATE Latin1_General_CI_AS
If I do a select like this
SELECT [ir].[CLIENT_CUSTOMER_ID], IIF(CHARINDEX('ELECT', UPPER([mml].[Brand_Desc])) > 0, 'YES', [Brand_EL])
FROM [mamd] [m]
INNER JOIN [ior] [ir] ON [ir].[CLIENT_CUSTOMER_ID] = [m].[CustomerId] COLLATE Latin1_General_CI_AS
INNER JOIN [maslist] [mml] ON [mml].[Model] = [ir].[MODEL] COLLATE Latin1_General_CI_AS
I get the following data returned
CLIENT_CUSTOMER_ID | Brand_EL
-------------------+----------
980872 | NO
980872 | YES
The reason I think it's only taking one record is because:
- The value NEVER changes to "YES"
- When I run the update command it says only 1 row updated, even though it should have gone through two
One thing that might be contributing to the problem is that [mamd] does NOT contain multiple records for that same user; It's a unique field. Since it's a unique field and therefore only has one row, does that mean that it will run that join only once? If that's the case, is there a better way I can do this without nested selects to generate the results?
UPDATE
Hey Everyone,
Just as an update, I took Gordon's advice and used aggregation. In my example, I only cared whether the value was "YES" because I only need to know if the customer bought a specific product. So what I ended up doing was grouping by the Customer ID and using the MAX function. If the customer bought a product, "YES" would bubble up to the top. If he didn't, it would stay as NO or NULL, and in that event it wouldn't matter.
The behavior is correct and documented, although not in a very clear way:
Use caution when specifying the FROM clause to provide the criteria
for the update operation. The results of an UPDATE statement are
undefined if the statement includes a FROM clause that is not
specified in such a way that only one value is available for each
column occurrence that is updated, that is if the UPDATE statement is
not deterministic. For example, in the UPDATE statement in the
following script, both rows in Table1 meet the qualifications of the
FROM clause in the UPDATE statement; but it is undefined which row
from Table1 is used to update the row in Table2.
What this is trying to say is that a row is only updated once by the update. Which value gets used is indeterminate. So you need to decide how you want to handle the multiple matches.
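One way to make the update deterministic is to collapse the multiple matches into a single value per customer before updating, along the lines of the MAX approach mentioned in the question's update. This is a sketch in SQLite through Python, not the asker's actual query: instr() replaces the T-SQL CHARINDEX, a correlated subquery replaces UPDATE ... FROM, and the second customer, the models and the brand descriptions are all made up.

```python
import sqlite3

# SQLite stands in for SQL Server; the rows below are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mamd (CustomerId INTEGER PRIMARY KEY, Brand_EL TEXT);
    CREATE TABLE ior (CLIENT_CUSTOMER_ID INTEGER, MODEL TEXT);
    CREATE TABLE maslist (Model TEXT PRIMARY KEY, Brand_Desc TEXT);
    INSERT INTO mamd VALUES (111, 'NO'), (980872, 'NO');
    INSERT INTO maslist VALUES ('M1', 'Generic'), ('M2', 'Electra');
    -- Customer 980872 matches two models, only one of which contains 'ELECT'.
    INSERT INTO ior VALUES (980872, 'M1'), (980872, 'M2'), (111, 'M1');
""")

# MAX over 'YES'/'NO' yields 'YES' if any matching row qualifies ('YES' > 'NO'),
# so each mamd row sees exactly one value regardless of how many rows join to it.
conn.execute("""
    UPDATE mamd
    SET Brand_EL = 'YES'
    WHERE (
        SELECT MAX(CASE WHEN instr(upper(mml.Brand_Desc), 'ELECT') > 0
                        THEN 'YES' ELSE 'NO' END)
        FROM ior AS ir
        INNER JOIN maslist AS mml ON mml.Model = ir.MODEL
        WHERE ir.CLIENT_CUSTOMER_ID = mamd.CustomerId
    ) = 'YES'
""")

updated = conn.execute(
    "SELECT CustomerId, Brand_EL FROM mamd ORDER BY CustomerId"
).fetchall()

print(updated)
```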