SQL Server Coalesce condition - sql

I have this piece of code which I am not sure how would it work:
UPDATE Data
SET Processed = 1
FROM Data
JOIN Meters
ON Meters.ServiceAccount = serv_acct
where COALESCE(Processed, 0) = 0
My question is about the last line! Would that line ever be true in this case?
Since I am setting Processed to 1 then how would that work:
where COALESCE(Processed, 0) = 0?
Can anybody explain the logic of using Coalesce in this way?
This code is not written by me.
Thank you

Your query is:
UPDATE Data
SET Processed = 1
FROM Data JOIN
Meters
ON Meters.ServiceAccount = serv_acct
where COALESCE(Processed, 0) = 0;
An update query determines the population of rows it is acting on before any of the changes are made. So, the final line is taking rows where Processed is either NULL or 0. The update is then setting Processed to 1 for those rows. In other words, the where clause is acting as a filter on the rows to modify. The specific statement is to keep the rows where the value of Processed is NULL or 0.

The COALESCE function is described here:
http://technet.microsoft.com/en-us/library/ms190349.aspx
I think the reason behind using this predicate where COALESCE(Processed, 0) = 0 was to filter all rows which have column Processed IS NULL or equal to 0.
Instead, I would use use predicates:
UPDATE Data
SET Processed = 1
FROM Data JOIN
Meters
ON Meters.ServiceAccount = serv_acct
where Processed IS NULL OR Processed = 0;
because they are SARGable. This means Index Seek.
Applying an expression on Processed column will force SQL Server to choose an [Clustered] Index Scan.

Related

Executing subquery to insert data on another table

I am getting this error while executing below query:
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
This is the query I am using:
UPDATE BatchRecords SET Fees = (
SELECT
(Fees/NoOfPayments) as AvgTotal
FROM BatchFx
WHERE BatchRecords.BatchId = BatchFx.BatchId
AND BatchFx.CreatedOn < '2019-05-04'
);
Your error is happening because your WHERE clause is returning more than 1 record. So the update statement is confused which row does it need to update (because you most probably have more than 1 record per ID in the BatchFX table). I replicated your issue in this link and got the same output.
So to solve this issue you need to use an aggregator to group all the rows and output one record from the subquery.
UPDATE BatchRecords
SET Fees = (
SELECT
AVG(Fees / NoOfPayments) as AvgTotal
FROM BatchFx
WHERE BatchRecords.BatchId = BatchFx.BatchId
AND BatchFx.CreatedOn < '2019-05-04'
);
Hope this helps :)
The alias AvgTotal seems to imply that you want to take an average of something, so why not try doing that:
UPDATE BatchRecords br
SET Fees = (SELECT AVG(Fees/NoOfPayments) AS AvgTotal
FROM BatchFx bf
WHERE br.BatchId = bf.BatchId AND bf.CreatedOn < '2019-05-04');
Note that the error message you are seeing implies that the subquery in some cases would return more than one record. Selecting an aggregate function would be one way to get around this problem.

accumulating a value in SQL

I am trying to accumulate a value when a certain condition exists such as
If statusCode = 0
then add 1 to a value.
I am trying to show the number of successful records as defined by the statusCode.
There must be a better way to do this.
Thanks
Select count(1) from yourTable where statusCode=0

CASE Clause on select clause throwing 'SQLCODE=-811, SQLSTATE=21000' Error

This query is very well working in Oracle. But it is not working in DB2. It is throwing
DB2 SQL Error: SQLCODE=-811, SQLSTATE=21000, SQLERRMC=null, DRIVER=3.61.65
error when the sub query under THEN clause is returning 2 rows.
However, my question is why would it execute in the first place as my WHEN clause turns to be false always.
SELECT
CASE
WHEN (SELECT COUNT(1)
FROM STOP ST,
FACILITY FAC
WHERE ST.FACILITY_ID = FAC.FACILITY_ID
AND FAC.IS_DOCK_SCHED_FAC=1
AND ST.SHIPMENT_ID = 2779) = 1
THEN
(SELECT ST.FACILITY_ALIAS_ID
FROM STOP ST,
FACILITY FAC
WHERE ST.FACILITY_ID = FAC.FACILITY_ID
AND FAC.IS_DOCK_SCHED_FAC=1
AND ST.SHIPMENT_ID = 2779
)
ELSE NULL
END STAPPFAC
FROM SHIPMENT SHIPMENT
WHERE SHIPMENT.SHIPMENT_ID IN (2779);
The SQL standard does not require short cut evaluation (ie evaluation order of the parts of the CASE statement). Oracle chooses to specify shortcut evaluation, however DB2 seems to not do that.
Rewriting your query a little for DB2 (8.1+ only for FETCH in subqueries) should allow it to run (unsure if you need the added ORDER BY and don't have DB2 to test on at the moment)
SELECT
CASE
WHEN (SELECT COUNT(1)
FROM STOP ST,
FACILITY FAC
WHERE ST.FACILITY_ID = FAC.FACILITY_ID
AND FAC.IS_DOCK_SCHED_FAC=1
AND ST.SHIPMENT_ID = 2779) = 1
THEN
(SELECT ST.FACILITY_ALIAS_ID
FROM STOP ST,
FACILITY FAC
WHERE ST.FACILITY_ID = FAC.FACILITY_ID
AND FAC.IS_DOCK_SCHED_FAC=1
AND ST.SHIPMENT_ID = 2779
ORDER BY ST.SHIPMENT_ID
FETCH FIRST 1 ROWS ONLY
)
ELSE NULL
END STAPPFAC
FROM SHIPMENT SHIPMENT
WHERE SHIPMENT.SHIPMENT_ID IN (2779);
Hmm... you're running the same query twice. I get the feeling you're not thinking in sets (how SQL operates), but in a more procedural form (ie, how most common programming languages work). You probably want to rewrite this to take advantage of how RDBMSs are supposed to work:
SELECT Current_Stop.facility_alias_id
FROM SYSIBM/SYSDUMMY1
LEFT JOIN (SELECT MAX(Stop.facility_alias_id) AS facility_alias_id
FROM Stop
JOIN Facility
ON Facility.facility_id = Stop.facility_id
AND Facility.is_dock_sched_fac = 1
WHERE Stop.shipment_id = 2779
HAVING COUNT(*) = 1) Current_Stop
ON 1 = 1
(no sample data, so not tested. There's a couple of other ways to write this based on other needs)
This should work on all RDBMSs.
So what's going on here, why does this work? (And why did I remove the reference to Shipment?)
First, let's look at your query again:
CASE WHEN (SELECT COUNT(1)
FROM STOP ST, FACILITY FAC
WHERE ST.FACILITY_ID = FAC.FACILITY_ID
AND FAC.IS_DOCK_SCHED_FAC = 1
AND ST.SHIPMENT_ID = 2779) = 1
THEN (SELECT ST.FACILITY_ALIAS_ID
FROM STOP ST, FACILITY FAC
WHERE ST.FACILITY_ID = FAC.FACILITY_ID
AND FAC.IS_DOCK_SCHED_FAC = 1
AND ST.SHIPMENT_ID = 2779)
ELSE NULL END
(First off, stop using the implicit-join syntax - that is, comma-separated FROM clauses - always explicitly qualify your joins. For one thing, it's way too easy to miss a condition you should be joining on)
...from this it's obvious that your statement is the 'same' in both queries, and shows what you're attempting - if the dataset has one row, return it, otherwise the result should be null.
Enter the HAVING clause:
HAVING COUNT(*) = 1
This is essentially a WHERE clause for aggregates (functions like MAX(...), or here, COUNT(...)). This is useful when you want to make sure some aspect of the entire set matches a given criteria. Here, we want to make sure there's just one row, so using COUNT(*) = 1 as the condition is appropriate; if there's more (or less! could be 0 rows!) the set will be discarded/ignored.
Of course, using HAVING means we're using an aggregate, the usual rules apply: all columns must either be in a GROUP BY (which is actually an option in this case), or an aggregate function. Because we only want/expect one row, we can cheat a little, and just specify a simple MAX(...) to satisfy the parser.
At this point, the new subquery returns one row (containing one column) if there was only one row in the initial data, and no rows otherwise (this part is important). However, we actually need to return a row regardless.
FROM SYSIBM/SYSDUMMY1
This is a handy dummy table on all DB2 installations. It has one row, with a single column containing '1' (character '1', not numeric 1). We're actually interested in the fact that it has only one row...
LEFT JOIN (SELECT ... )
ON 1 = 1
A LEFT JOIN takes every row in the preceding set (all joined rows from the preceding tables), and multiplies it by every row in the next table reference, multiplying by 1 in the case that the set on the right (the new reference, our subquery) has no rows. (This is different from how a regular (INNER) JOIN works, which multiplies by 0 in the case that there is no row) Of course, we only maybe have 1 row, so there's only going to be a maximum of one result row. We're required to have an ON ... clause, but there's no data to actually correlate between the references, so a simple always-true condition is used.
To get our data, we just need to get the relevant column:
SELECT Current_Stop.facility_alias_id
... if there's the one row of data, it's returned. In the case that there is some other count of rows, the HAVING clause throws out the set, and the LEFT JOIN causes the column to be filled in with a null (no data) value.
So why did I remove the reference to Shipment? First off, you weren't using any data from the table - the only column in the result set was from the subquery. I also have good reason to believe that there would only be one row returned in this case - you're specifying a single shipment_id value (which implies you know it exists). If we don't need anything from the table (including the number of rows in that table), it's usually best to remove it from the statement: doing so can simplify the work the db needs to do.

SQL Server COALESCE bit compared to an int

I have a situation where I am passing a bit datatype into a COALESCE. Then setting that = 0 to check if it equals 0. The problem is, that is not working!
(Please note that I am not authorized to change the datatype of any columns)
This is what I have:
SELECT Meters.ID, Consumption, Charge
FROM Data (nolock)
JOIN Meters
ON Data.MeterID = Meters.ID
WHERE COALESCE(Processed, 0) = 0
The idea behind Processed is that, if the data were processed than it should 1 so I do not want to process them again.
//Processed is a column in the Data table that is type bit. My Joins are 100% correct cause I ran them without the Where and execute with no problems, the problem is when I add that where! Even though that Processed column has 1, 0, NULL values.... It does not return anything! Can anybody suggest a solution? Thank you.
where processed is null or processed = 0
is the logical equivalent expresion, I don't figue why COALESCE is not working

SQL Server 2008 UPDATE Statement WHERE clause precedence

I wrote the following query:
UPDATE king_in
SET IN_PNSN_ALL_TP_CNTRCT_CD = IN_PNSN_ALL_TP_CNTRCT_CD + '3'
WHERE COALESCE(IN_PNSN_ALL_TP_CNTRCT_TX, '') <> ''
AND CHARINDEX('3', IN_PNSN_ALL_TP_CNTRCT_CD) = 0
It checks to see if a field has a value in it and if it does it puts a 3 in a corresponding field if there isn't a 3 already in it. When I ran it, I got a string or binary data will be truncated error. The field is a VARCHAR(3) and there are rows in the table that already have 3 characters in them but the rows that I was actually doing the updating on via the WHERE filter had a MAX LEN of 2 so I was completely baffled as to why SQL Server was throwing me the truncation error. So I changed my UPDATE statement to:
UPDATE king_in
SET IN_PNSN_ALL_TP_CNTRCT_CD = k.IN_PNSN_ALL_TP_CNTRCT_CD + '3'
FROM king_in k
INNER JOIN
(
SELECT ki.row_key,
in_sqnc_nb
FROM king_in ki
INNER JOIN King_Ma km
ON ki.Row_Key = km.Row_Key
INNER JOIN King_Recs kr
ON km.doc_loc_nb = kr.ACK_ID
WHERE CHARINDEX('3', IN_PNSN_ALL_TP_CNTRCT_CD) = 0
AND COALESCE(IN_PNSN_ALL_TP_CNTRCT_TX, '') <> ''
) a
ON k.Row_Key = a.Row_Key
AND k.in_sqnc_nb = a.insr_sqnc_nb
and it works fine without error.
So it appears based on this that when doing an UPDATE statement without a FROM clause that SQL Server internally goes through and runs the SET statement before it filters the records based on the WHERE clause. Thats why I was getting the truncation error, because even though the records I wanted to update were less than 3 characters, there were rows in the table that had 3 characters in that field and when it couldn't add a '3' to the end of one of those rows, it threw the error.
So after all of that, I've got a handful of questions.
1) Why? Is there a specific DBMS reason that SQL Server wouldn't filter the result set before applying the SET statement?
2) Is this just a known thing about SQL that I never learned along the way?
3) Is there a setting in SQL Server to change this behavior?
Thanks in advance.
1 - Likely because your criteria are not SARGable - that is, they can't use an index. If the query optimizer determines it's faster to do a table scan, it'll go ahead and run on all the rows. This is especially likely when you filter on a function applied to the field like you do here.
2 - Yes. The optimizer will do what it thinks it best. You can get around this somewhat by using parentheses to force an evaluation order of your WHERE clause but in your example I don't think it would help since it forces a table scan regardless.
3 - No, you need to alter your data or your logic to allow indexes to be used. If you really really need to filter on existence of a certain character in a field, it probably should be it's own column and/or you should normalize that particular bit of data better.
A workaround for your particular instance would be to add a WHERE LEN(IN_PNSN_ALL_TP_CNTRCT_CD) < 3 as well.