I have a simple LINQ query:
(From a In db.payments
 Group Join b In db.interestcharges On a.pid Equals b.pid
 Into totalinterest = Group, TotalInterestReceived = Sum(b.interest)
 Select a, TotalInterestReceived).ToList()
b.interest is of type decimal in the database. The query throws:
"The cast to value type 'System.Decimal' failed because the materialized value is null. Either the result type's generic parameter or the query must use a nullable type."
This is because the InterestCharges table may not have any interest records for a given pid.
I have tried Sum(If(b.interest = Nothing, 0, b.interest)), but LINQ translates this to If(b.interest = 0, 0, b.interest), so it never checks for null. Nothing else seems to work in place of Nothing; I have tried vbNull and IsDBNull() with no success. The query works fine when the sum is not null. DefaultIfEmpty(0) may work, but I'm not sure how to use it in this scenario. Any pointers?
The Group Join clause indicates that you're attempting a LEFT OUTER JOIN, which is why you're running into the NULL issue. If you use Join (an INNER JOIN) instead, you won't stumble on this, but it also means you will only see payments that actually have interest records:
(From a In db.Payments
 Join b In db.InterestCharges On a.pid Equals b.pid
 Group By a Into TotalInterestReceived = Sum(b.Interest)
 Select a, TotalInterestReceived).ToList()
It is possible to use a sub-select to get the values you're hoping for. The idea is that you get all payments, and then add in the sum of the interest charges (or 0 if there are none):
(From a In db.Payments
 Select a,
        TotalInterestReceived = (From b In db.InterestCharges
                                 Where b.pid = a.pid
                                 Select b.Interest).DefaultIfEmpty(0).Sum()
).ToList()
EDIT:
Another option would be to bypass EF entirely. Build a view in the database and query that view directly rather than attempting to access the underlying data via LINQ.
Most other suggestions I would have involve iterating through the initial list of Payments and populating the values as needed. That is fine for a small number of Payments, but it is an O(n) approach (an extra query per payment) that won't scale.
Background
I've got this PostgreSQL join that works pretty well for me:
select m.id,
m.zodiac_sign,
m.favorite_color,
m.state,
c.combined_id
from people."People" m
LEFT JOIN people.person_to_person_composite_crosstable c on m.id = c.id
As you can see, I'm joining two tables to bring in a combined_id, which I need for later analysis elsewhere.
The Goal
I'd like to write a query that returns one row per combined_id, picking the row that has the lowest value of m.id (along with the other variables too). This ought to result in a new table with unique/distinct values of combined_id.
The Problem
The issue is that the current query returns ~300 records, but I need it to return ~100. Why? Each combined_id has, on average, 3 different m.id's. I don't actually care about the m.id's; I care about getting a unique combined_id. Because of this, I decided that a good "selection criterion" would be to select, among rows with the same combined_id, the row with the lowest value of m.id.
What I've tried
I've consulted several posts on this and I feel like I'm fairly close. See for instance this one or this one. This other one does exactly what I need (with MAX instead of MIN) but he's asking for it in Unix Bash 😞
Here's an example of something I've tried:
select m.id,
m.zodiac_sign,
m.favorite_color,
m.state,
c.combined_id
from people."People" m
LEFT JOIN people.person_to_person_composite_crosstable c on m.id = c.id
WHERE m.id IN (select min(m.id))
This returns the error ERROR: aggregate functions are not allowed in WHERE.
Any ideas?
Postgres's DISTINCT ON is probably the best approach here:
SELECT DISTINCT ON (c.combined_id)
m.id,
m.zodiac_sign,
m.favorite_color,
m.state,
c.combined_id
FROM people."People" m
LEFT JOIN people.person_to_person_composite_crosstable c
ON m.id = c.id
ORDER BY
c.combined_id,
m.id;
As for performance, the following index on the crosstable might speed up the query:
CREATE INDEX idx ON people.person_to_person_composite_crosstable (id, combined_id);
If used, this index should make the join faster. Note that it also covers the combined_id column, which the SELECT needs.
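If you ever need the same result on an engine without DISTINCT ON, a window-function version should return the same rows (a sketch using the same tables and columns as above):

SELECT id, zodiac_sign, favorite_color, state, combined_id
FROM (
    SELECT m.id,
           m.zodiac_sign,
           m.favorite_color,
           m.state,
           c.combined_id,
           -- rn = 1 marks the row with the lowest m.id within each combined_id
           ROW_NUMBER() OVER (PARTITION BY c.combined_id ORDER BY m.id) AS rn
    FROM people."People" m
    LEFT JOIN people.person_to_person_composite_crosstable c ON m.id = c.id
) ranked
WHERE rn = 1;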
I am working on two different assignments where I have to do null checks, but I'm not sure if I have written the syntax correctly. My instructor has not really discussed this but will be marking for it.
Below are the 2 questions and what I have written. Any help is appreciated.
Assignment 1 question: Create a list of the sales order numbers for orders not ordered online and not with a credit card. Note: 0 is false and 1 is true for bit fields. Below is the syntax I used; am I doing a null check here?
SELECT SalesOrderNumber
FROM Sales.SalesOrder_json
WHERE OnlineOrderFlag = 0 AND CreditCardID IS NULL
Assignment 2 question: List the vendors that have no products. Below is the syntax I used; am I doing a null check here?
SELECT
pv.Name AS Vendors,
COUNT(PP.ProductID) AS 'Products'
FROM
Purchasing.Vendor AS PV
LEFT JOIN
Purchasing.ProductVendor AS PPV ON PV.BusinessEntityID = PPV.BusinessEntityID
LEFT JOIN
Production.Product AS PP ON PP.ProductID = PPV.ProductID
GROUP BY
PV.Name
HAVING
COUNT(PP.ProductID) = 0;
Welcome to Stack Overflow!
In the future, please post a summary or CREATE TABLE statements representing the schema of the tables used in your queries, so that we have enough information to provide more than speculative responses. Even though this is the AdventureWorks DB, you should start your SO journey with good habits!
Also, please try not to post assignment questions verbatim online, as most academic plagiarism checkers will easily flag you, mainly because other students may see your post and the support you get from the community, which could result in all of you handing in the same result.
Have you run your queries? Do you think the results are correct?
If the results from your queries are correct, then the only remaining issue is "have you done any null checks?" One could argue that if your queries return the correct results then you must have satisfied the criteria; otherwise the question wasn't formulated very well.
Null checks can be summarised into 3 patterns:
You directly compare against null using IS NULL or IS NOT NULL in your query
Use of JOIN syntax to deal with data that may have nulls.
INNER JOIN will limit the results to only records that match in both tables. Use this if you need to omit records that have a null in the foreign key field.
Non-INNER joins, like LEFT JOIN, will return rows from the left table even if there are no matching records in the joined (right) table.
This is a good discussion on all supported joins: LEFT JOIN vs. LEFT OUTER JOIN in SQL Server
Use of aggregation functions: aggregates generally ignore null values. COUNT returns 0 if all values are NULL, whereas other aggregates such as SUM, MIN, MAX, and AVG return NULL if all values are NULL (see the sketch below).
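A quick way to see that behaviour for yourself (a throwaway sketch, not part of either assignment):

-- All values NULL: COUNT returns 0, while SUM and MIN return NULL.
SELECT COUNT(val) AS cnt, SUM(val) AS total, MIN(val) AS lowest
FROM (VALUES (CAST(NULL AS int)), (NULL)) AS t(val);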
Question 1
Clearly you have implemented a NULL check because you have evaluated criteria directly on the nullable column.
It looks like your answer to Question 1 is pretty good.
Question 2
While your query looks like it would return the vendors with no products, it also outputs a Products column whose count will always be zero.
You do not need to output a column in order to use it in the filter criteria, so remove COUNT(PP.ProductID) AS 'Products' unless you have been instructed otherwise.
Is this a NULL check? That's up to interpretation, but I think in this case the answer is yes. By using a LEFT JOIN (an OUTER join) you have created a result set in which the field PP.ProductID has a value of NULL if there are no products.
By using COUNT over that nullable column in the filter criteria, and recognising that a count of zero means the ProductID column was in fact null, you have effectively evaluated a null check.
There are other ways to query for the same results, such as NOT EXISTS, though NOT EXISTS would not be a direct null check because nullability is not evaluated directly.
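For comparison, a NOT EXISTS version of the same result might look like this (just a sketch against the same tables, not necessarily what your instructor expects):

SELECT PV.Name AS Vendors
FROM Purchasing.Vendor AS PV
WHERE NOT EXISTS (SELECT 1
                  FROM Purchasing.ProductVendor AS PPV
                  WHERE PPV.BusinessEntityID = PV.BusinessEntityID);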
I'm not very good at SQL or SSRS, so please excuse any incorrect terminology. I work at a wood shop and I'm editing a parts report whose existing query returns separate fields that contain duplicate data. One of the fields is selected directly from some joins; the other is an aliased sub-query. I want to use only the sub-query field, to be consistent.
I try to set the tablix filter to [MAT_DESC] <> (leave blank) but the tablix does not filter. [MATNAME] <> (leave blank) works. not(isnothing([MAT_DESC])) = True also works.
WITH ORDERLIST AS (SELECT ... FROM ... WHERE...)
SELECT
IDBGPL.MATNAME, --THIS ONE WILL FILTER
(SELECT MAT.TEXT FROM MAT WHERE MAT.NAME=IDBGPL.MATID) AS MAT_DESC, --THIS ONE WON'T FILTER
(SELECT MAT.ORDERID FROM MAT WHERE MAT.NAME=IDBGPL.MATID) AS MAT_DESC2 --THIS ONE IS ALSO USED AND COMES FROM THE SAME TABLE
FROM ORDERLIST
INNER JOIN...
INNER JOIN...
INNER JOIN...
When I try to filter a table with the sub-query field it doesn't work. When I use the directly selected field it does. Why does SSRS treat the sub-query field differently?
EDIT: For some clarification. The data is coming from a CAD/CAM program. The IDBGPL table has every part in every order in the system. The MAT table is a section of the program that describes each material. There are some parent/child parts where the parent does not have a material. I'm wanting to filter out those parent parts.
This could potentially return NULL:
(SELECT MAT.TEXT FROM MAT WHERE MAT.NAME=IDBGPL.MATID) AS MAT_DESC
You cannot evaluate a NULL value other than checking for it being NULL or not NULL.
So one solution is to never let it be null:
ISNULL((SELECT MAT.TEXT FROM MAT WHERE MAT.NAME=IDBGPL.MATID), '') AS MAT_DESC
Another solution is to check for it being NULL on the outside (in SSRS), i.e. check for NULL or blank; you just need to understand that they are not the same value.
Also, you should consider doing a LEFT JOIN to MAT rather than using sub-queries, as sketched below.
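A rough sketch of that LEFT JOIN rewrite, keeping your aliases and inferring the join key from the sub-queries above (the CTE and the existing INNER JOINs are unchanged):

WITH ORDERLIST AS (SELECT ... FROM ... WHERE...)
SELECT
    IDBGPL.MATNAME,
    ISNULL(MAT.TEXT, '') AS MAT_DESC,  -- blank instead of NULL when there is no material
    MAT.ORDERID AS MAT_DESC2
FROM ORDERLIST
INNER JOIN ...  -- existing joins unchanged
LEFT JOIN MAT ON MAT.NAME = IDBGPL.MATID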
I'm working in a query window in SSMS.
Using 3 tables:
WORK_ORDER wo
An order to fabricate a part
OPERATION op
An operation in the fabrication of the part (laser, grinding, plating, etc.)
PART pt
A unique record defining the part
My objective is to report on the status of an operation (say #3): the total number of parts ordered and the number of completed parts, but additionally including the number of parts that have completed the previous operation (#2) in the sequence and are ready for this process. My solution was to use the LAG function, which works perfectly when the nested select statement below is run independently, but I get on average a 4X duplication in my results, and my COMP_QTY_PREV_OP column is not displayed. I am aware that's because it's not in the parent select statement, but I wanted to correct the join first. I'm guessing the two problems are related.
Footnote: The WHERE contains a filter that you can ignore. The parent select statement works perfectly without the joined subquery.
Here's my sql:
SELECT op.RESOURCE_ID, pt.USER_5 AS PRODUCT, wo.PART_ID, wo.TYPE, wo.BASE_ID,
wo.LOT_ID, wo.SPLIT_ID, wo.SUB_ID, op.SEQUENCE_NO, pt.DESCRIPTION,
wo.DESIRED_QTY, op.FULFILLED_QTY AS QTY_COMP, op.SERVICE_ID, op.DISPATCHED_QTY, wo.STATUS
FROM dbo.WORK_ORDER wo INNER JOIN
dbo.OPERATION op ON wo.TYPE = op.WORKORDER_TYPE
AND wo.BASE_ID = op.WORKORDER_BASE_ID
AND wo.LOT_ID = op.WORKORDER_LOT_ID
AND wo.SPLIT_ID = op.WORKORDER_SPLIT_ID
AND wo.SUB_ID = op.WORKORDER_SUB_ID INNER JOIN
dbo.PART pt ON wo.PART_ID = pt.ID
LEFT OUTER JOIN
--The nested select statement works by itself in a query window,
--but the JOIN throws an error.
(SELECT
pr.WORKORDER_TYPE, pr.WORKORDER_BASE_ID, pr.WORKORDER_LOT_ID,
pr.WORKORDER_SPLIT_ID, pr.WORKORDER_SUB_ID, pr.SEQUENCE_NO,
LAG (COMPLETED_QTY, 1) OVER (ORDER BY pr.WORKORDER_TYPE, pr.WORKORDER_BASE_ID,
pr.WORKORDER_LOT_ID, pr.WORKORDER_SPLIT_ID, pr.WORKORDER_SUB_ID, pr.SEQUENCE_NO) AS COMP_QTY_PREV_OP
FROM dbo.OPERATION AS pr) AS prev
--End of nested select
ON
op.WORKORDER_TYPE = prev.WORKORDER_TYPE AND
op.WORKORDER_BASE_ID = prev.WORKORDER_BASE_ID AND
op.WORKORDER_LOT_ID = prev.WORKORDER_LOT_ID AND
op.WORKORDER_SPLIT_ID = prev.WORKORDER_SPLIT_ID AND
op.WORKORDER_SUB_ID = prev.WORKORDER_SUB_ID
WHERE (NOT (op.SERVICE_ID IS NULL)) AND (wo.STATUS = N'R')
You haven't given enough information for a definitive answer, so instead I will give you an approach to debugging this.
You are getting unexpected rows as a result of a JOIN. This means that your JOIN condition is not matching the two sides of the JOIN on a one-to-one basis. There are multiple rows in the table being JOINed that meet the JOIN conditions.
To find these rows, temporarily change your SELECT list to SELECT *. Do this both in the outer SELECT, and in the derived table. Look through the columns being returned by the JOINed table, and find the values that you didn't expect to be returned.
Since the JOIN that causes the issue is the last one, they will be all the way to the right of the result of a SELECT *.
Then add more conditions to the JOIN to eliminate the unwanted rows from the results.
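Another quick way to spot the offending rows (a diagnostic sketch using the derived table's join keys from your query): any key combination that appears more than once in OPERATION will fan out the matching outer rows.

SELECT WORKORDER_TYPE, WORKORDER_BASE_ID, WORKORDER_LOT_ID,
       WORKORDER_SPLIT_ID, WORKORDER_SUB_ID, COUNT(*) AS rows_per_key
FROM dbo.OPERATION
GROUP BY WORKORDER_TYPE, WORKORDER_BASE_ID, WORKORDER_LOT_ID,
         WORKORDER_SPLIT_ID, WORKORDER_SUB_ID
HAVING COUNT(*) > 1;  -- these keys are the source of the duplication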
I simplified the whole query by first creating a temp table filled by the previously nested SELECT, and then joining to it from the parent SELECT.
Works perfectly now. Thanks for looking.
PS: I apologize for the confusion about an error message. I noticed after I posted that I had an old comment in the code regarding an error. The error had been resolved before posting, but I neglected to remove the comment.
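Roughly, the shape of that fix looks like this (a sketch, not the exact query; note it also joins on SEQUENCE_NO, which is one way to keep each operation matched to a single row of the temp table):

-- Materialise the lagged quantities once
SELECT WORKORDER_TYPE, WORKORDER_BASE_ID, WORKORDER_LOT_ID,
       WORKORDER_SPLIT_ID, WORKORDER_SUB_ID, SEQUENCE_NO,
       LAG(COMPLETED_QTY, 1) OVER (ORDER BY WORKORDER_TYPE, WORKORDER_BASE_ID,
           WORKORDER_LOT_ID, WORKORDER_SPLIT_ID, WORKORDER_SUB_ID, SEQUENCE_NO)
           AS COMP_QTY_PREV_OP
INTO #prev
FROM dbo.OPERATION;

-- Then join the parent SELECT to #prev on all five work-order keys plus SEQUENCE_NO
-- so each operation row picks up exactly one COMP_QTY_PREV_OP value.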
I have two tables in an MS Access 2010 database: TBLIndividuals and TblIndividualsUpdates. They have a lot of the same data, but the primary key may not be the same for a given person's record in both tables. So I'm doing a join between the two tables on names and birthdates to see which records correspond. I'm using a left join so that I also get rows for the people who are in TblIndividualsUpdates but not in TBLIndividuals. That way I know which records need to be added to TBLIndividuals to get it up to date.
SELECT TblIndividuals.PersonID AS OldID,
TblIndividualsUpdates.PersonID AS UpdateID
FROM TblIndividualsUpdates LEFT JOIN TblIndividuals
ON ( (TblIndividuals.FirstName = TblIndividualsUpdates.FirstName)
and (TblIndividuals.LastName = TblIndividualsUpdates.LastName)
AND (TblIndividuals.DateBorn = TblIndividualsUpdates.DateBorn
or (TblIndividuals.DateBorn is null
and (TblIndividuals.MidName is null and TblIndividualsUpdates.MidName is null
or TblIndividuals.MidName = TblIndividualsUpdates.MidName))));
TblIndividualsUpdates has 4149 rows, but the query returns only 4103 rows. There are about 50 new records in TblIndividualsUpdates, but only 4 rows in the query result where OldID is null.
If I export the data from Access to PostgreSQL and run the same query there, I get all 4149 rows.
Is this a bug in Access? Is there a difference between Access's left join semantics and PostgreSQL's? Is my database corrupted (Compact and Repair doesn't help)?
ON (
TblIndividuals.FirstName = TblIndividualsUpdates.FirstName
and
TblIndividuals.LastName = TblIndividualsUpdates.LastName
AND (
TblIndividuals.DateBorn = TblIndividualsUpdates.DateBorn
or
(
TblIndividuals.DateBorn is null
and
(
TblIndividuals.MidName is null
and TblIndividualsUpdates.MidName is null
or TblIndividuals.MidName = TblIndividualsUpdates.MidName
)
)
)
);
What I would do is systematically remove all the join conditions except the first two until you find the records drop off. Then you will know where your problem is.
This should never happen. Unless rows are being inserted/deleted in the meantime,
the query:
SELECT *
FROM a LEFT JOIN b
ON whatever ;
should never return less rows than:
SELECT *
FROM a ;
If it happens, it's a bug. Are you sure the queries are exactly like this (and you haven't omitted some detail, like a WHERE clause)? Are you sure the LEFT JOIN query returns 4103 rows while the plain SELECT from TblIndividualsUpdates returns 4149? You could make another check by changing the * above to COUNT(*).
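For example, the two counts to compare would be (reusing the join condition from the question):

SELECT COUNT(*) AS UpdatesTotal
FROM TblIndividualsUpdates;

SELECT COUNT(*) AS JoinedTotal
FROM TblIndividualsUpdates LEFT JOIN TblIndividuals
ON ( (TblIndividuals.FirstName = TblIndividualsUpdates.FirstName)
AND (TblIndividuals.LastName = TblIndividualsUpdates.LastName)
AND (TblIndividuals.DateBorn = TblIndividualsUpdates.DateBorn
     OR (TblIndividuals.DateBorn IS NULL
         AND (TblIndividuals.MidName IS NULL AND TblIndividualsUpdates.MidName IS NULL
              OR TblIndividuals.MidName = TblIndividualsUpdates.MidName))));

JoinedTotal should never be smaller than UpdatesTotal.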
Drop any indexes from both tables which include those JOIN fields (FirstName, LastName, and DateBorn). Then see whether you get the expected
4,149 rows with this simplified query.
SELECT
i.PersonID AS OldID,
u.PersonID AS UpdateID
FROM
TblIndividualsUpdates AS u
LEFT JOIN TblIndividuals AS i
ON
(
(i.FirstName = u.FirstName)
AND (i.LastName = u.LastName)
AND (i.DateBorn = u.DateBorn)
);
For whatever it is worth, since this seems to be a deceptive bug and any additional information could help in resolving it: I have had the same problem.
The query is too big to post here and I don't have the time to reduce it now to something suitable, but I can report what I found. In the below, all joins are left joins.
I was gradually refining and changing my query. It had a derived table in it (D), and the whole thing was then made into a derived table (T) and joined to a last table (L). At one point in its development, no field in T that originated in D participated in the join to L. That is when the problem occurred: the total number of rows mysteriously became less than in the main table, which should be impossible. As soon as I again let a field from D participate (via T) in the join to L, the count returned to normal.
It was as if the join condition to D was moved to a WHERE clause when no field in it was participating (via T) in the join to L. But I don't really know what the explanation is.
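A generic illustration of that effect (hypothetical tables a and d, not the original query): a filter on the right-hand table behaves very differently in the ON clause versus the WHERE clause of a LEFT JOIN.

-- Keeps every row of a; unmatched rows just get NULLs from d.
SELECT a.id, d.val
FROM a LEFT JOIN d ON a.id = d.a_id AND d.val = 'x';

-- Effectively an inner join: rows of a with no match are filtered out,
-- because d.val is NULL for them and NULL = 'x' is never true.
SELECT a.id, d.val
FROM a LEFT JOIN d ON a.id = d.a_id
WHERE d.val = 'x';

If the engine (or a query rewrite) effectively turns the first form into the second, the row count drops below that of the left table, which matches the behaviour described above.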