How to force AVG to include zeros - sql

I am averaging values from a column in a sub query that includes zeros. The average seems to be ignoring zero values and giving me an inflated value.
I pulled the sub query alone and I see zeros in column F (UWDays), the one I am trying to average. I tried the same query but replaced avg(mm.UWdays) with avg(NULLIF(mm.days,0)) and again I got the same values as the original pull.
SELECT mm.month, mm.FlagA, mm.FlagB, mm.FlagC, avg(mm.UWdays) AS UWDays
FROM (
select date_trunc('Month', dates_table.month) as month,
customer_table.customer_id,
Case WHEN (table1.attribute1 LIKE '%Yes%' AND table2.attribute2 LIKE '%Yes%' and table3.attribute3 NOT LIKE '%Yes%') THEN 1 ELSE 0 END AS FlagA,
Case WHEN (table1.attribute1 LIKE '%Yes%' AND table2.attribute2 LIKE '%Yes%' AND table3.attribute3 LIKE '%Yes%') THEN 1 ELSE 0 END AS FlagB,
Case WHEN (table1.attribute1 LIKE '%Yes%' AND table2.attribute2 LIKE '%No%' AND table3.attribute3 LIKE '%Yes%') THEN 1 ELSE 0 END AS FlagC,
CASE WHEN (min(table1.date) is null AND max(table1.date) is null) THEN 0 ELSE count(table1.date) end AS UWDays
FROM customer_table cross join dates_table
left outer join table1 ON customer_table.customer_id= table1.customer_id
left outer join table2 on customer_table.customer_id= table2.customer_id
left outer join table3 on customer_table.customer_id= table3.customer_id
group by 1,2,3,4,5
order by 2,1) mm
GROUP BY 1,2,3,4

AVG() does not exclude zeroes. However, it does ignore NULL values, so perhaps that is what you mean -- particularly because your query has LEFT JOINs which would tend to generate NULL values.
You can treat NULL values as 0 using COALESCE():
avg(coalesce(mm.UWdays, 0))
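As a quick check of the point above, here is a minimal sketch using Python's built-in sqlite3 module. The table mm and its rows are made up for illustration; only the column name UWdays comes from the question. It suggests that zeros are counted by AVG() while NULLs are skipped, and that COALESCE() pulls the NULL rows back into the average.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the subquery result: two zeros, one NULL, one real value.
conn.execute("CREATE TABLE mm (UWdays INTEGER)")
conn.executemany("INSERT INTO mm VALUES (?)", [(0,), (0,), (None,), (6,)])

plain_avg, coalesced_avg = conn.execute(
    "SELECT AVG(UWdays), AVG(COALESCE(UWdays, 0)) FROM mm"
).fetchone()

print(plain_avg)      # 2.0 -> (0 + 0 + 6) / 3, the NULL row is skipped
print(coalesced_avg)  # 1.5 -> (0 + 0 + 0 + 6) / 4, the NULL row counts as 0
```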

Related

Why does this not return 0

I have a query like:
select nvl(nvl(sum(a.quantity),0)-nvl(cc.quantityCor,0),0)
from RCV_TRANSACTIONS a
LEFT JOIN (select c.shipment_line_id,c.oe_order_line_id,nvl(sum(c.quantity),0) quantityCor
from RCV_TRANSACTIONS c
where c.TRANSACTION_TYPE='CORRECT'
group by c.shipment_line_id,c.oe_order_line_id) cc on (a.shipment_line_id=cc.shipment_line_id and a.shipment_line_id=7085740)
where a.transaction_type='DELIVER'
and a.shipment_line_id=7085740
group by nvl(cc.quantityCor,0);
The query runs OK, but returns no value. I want it to return 0 if there is no quantity found. Where have I gone wrong?
An aggregation query with a GROUP BY returns no rows if all rows are filtered out.
An aggregation query with no GROUP BY always returns one row, even if all rows are filtered out.
So, just remove the GROUP BY. And change the SELECT to:
select coalesce(sum(a.quantity), 0) - coalesce(max(cc.quantityCor), 0)
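The difference between the two cases can be reproduced with a small sketch in Python's sqlite3 module. The table t and its columns are hypothetical; the behavior shown is standard SQL aggregation, not something specific to Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the filter below deliberately matches no rows.
conn.execute("CREATE TABLE t (grp INTEGER, quantity INTEGER)")
conn.execute("INSERT INTO t VALUES (1, 5)")

grouped = conn.execute(
    "SELECT COALESCE(SUM(quantity), 0) FROM t WHERE grp = 99 GROUP BY grp"
).fetchall()
ungrouped = conn.execute(
    "SELECT COALESCE(SUM(quantity), 0) FROM t WHERE grp = 99"
).fetchall()

print(grouped)    # [] -> no groups exist, so no rows come back
print(ungrouped)  # [(0,)] -> one row, and COALESCE turns the NULL sum into 0
```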
I may be wrong, but it seems you merely want to subtract the CORRECT quantity from the DELIVER quantity for shipment 7085740. You don't need a complicated query for that. In particular, your GROUP BY clauses make no sense if that is what you are after.
One way to write this query would be:
select
sum(case when transaction_type = 'DELIVER' then quantity else 0 end) -
sum(case when transaction_type = 'CORRECT' then quantity else 0 end) as diff
from rcv_transactions
where shipment_line_id = 7085740;
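A sketch of this conditional-aggregation approach, run through Python's sqlite3 module with made-up rows and shipment ids. One caveat worth checking: when no row matches the WHERE clause, SUM() returns NULL, so the difference is NULL rather than 0. Wrapping each SUM in COALESCE is an addition of mine, not part of the answer above, to get the 0 the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rcv_transactions "
    "(shipment_line_id INTEGER, transaction_type TEXT, quantity INTEGER)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO rcv_transactions VALUES (?, ?, ?)",
    [(1, "DELIVER", 10), (1, "DELIVER", 5), (1, "CORRECT", 3), (2, "DELIVER", 7)],
)

raw_query = """
    SELECT SUM(CASE WHEN transaction_type = 'DELIVER' THEN quantity ELSE 0 END)
         - SUM(CASE WHEN transaction_type = 'CORRECT' THEN quantity ELSE 0 END)
    FROM rcv_transactions
    WHERE shipment_line_id = ?
"""
safe_query = """
    SELECT COALESCE(SUM(CASE WHEN transaction_type = 'DELIVER' THEN quantity ELSE 0 END), 0)
         - COALESCE(SUM(CASE WHEN transaction_type = 'CORRECT' THEN quantity ELSE 0 END), 0)
    FROM rcv_transactions
    WHERE shipment_line_id = ?
"""

diff_found = conn.execute(raw_query, (1,)).fetchone()[0]
raw_missing = conn.execute(raw_query, (999,)).fetchone()[0]
safe_missing = conn.execute(safe_query, (999,)).fetchone()[0]

print(diff_found)    # 12 -> (10 + 5) - 3
print(raw_missing)   # None -> SUM over zero rows is NULL
print(safe_missing)  # 0
```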
I had a query like this and was trying to return 'X' when the item is not valid.
SELECT case when segment1 is not null then segment1 else 'X' end
--INTO v_orgValidItem
FROM mtl_system_items_b
WHERE segment1='1676001000'--'Jul-00'--l_item
and organization_id=168;
..but it was returning NULL.
Changed to use aggregation with no group by and now it returns 'X' when the item is not valid.
SELECT case when max(segment1) is not null then max(segment1) else 'X' end valid
--INTO v_orgValidItem
FROM mtl_system_items_b
WHERE segment1='1676001000'--'Jul-00'--l_item
and organization_id=168;--l_ship_to_organization_id_pb;
Here is another example, proving the order of operations really matters.
When there is no match for this quote number, this query returns NULL:
SELECT MAX(NVL(QUOTE_VENDOR_QUOTE_NUMBER,0))
FROM PO_HEADERS_ALL
WHERE QUOTE_VENDOR_QUOTE_NUMBER='foo.bar';
..reversing the order of MAX and NVL makes all the difference. This query returns 0, the value NVL substitutes for NULL:
SELECT NVL(MAX(QUOTE_VENDOR_QUOTE_NUMBER),0)
FROM PO_HEADERS_ALL
WHERE QUOTE_VENDOR_QUOTE_NUMBER='foo.bar';
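The same ordering effect can be sketched with Python's sqlite3 module. SQLite has no NVL, so COALESCE stands in for it here, and the table po_headers with its single row is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with one row that will not match the filter.
conn.execute("CREATE TABLE po_headers (quote_number TEXT)")
conn.execute("INSERT INTO po_headers VALUES ('abc')")

# COALESCE inside MAX: it never runs, because no row reaches it.
inner_result = conn.execute(
    "SELECT MAX(COALESCE(quote_number, '0')) FROM po_headers "
    "WHERE quote_number = 'foo.bar'"
).fetchone()[0]

# COALESCE outside MAX: it sees the NULL that MAX yields for an empty set.
outer_result = conn.execute(
    "SELECT COALESCE(MAX(quote_number), '0') FROM po_headers "
    "WHERE quote_number = 'foo.bar'"
).fetchone()[0]

print(inner_result)  # None
print(outer_result)  # 0
```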

SQL query syntax in CASE WHEN ELSE END to count

Writing a query to find the number of ED visits that were discharged from non-ED units.
The dep.ADT_UNIT_TYPE_C column stores 1 if the unit was an ED unit.
Assume NULL values are non-ED units for the purpose of this query.
Which of the following produces this number?
I think it is A, because that looks like the correct syntax to me.
COUNT(CASE WHEN THEN ELSE END standard format)
A has that.
B doesn't have the THEN, so is it incorrect syntax?
Please help me understand the nuances between these choices.
A.)
COUNT( CASE WHEN dep.ADT_UNIT_TYPE_C is NULL OR dep.ADT_UNIT_TYPE_C <> 1 THEN NULL
ELSE 1
END )
B.)
COUNT( CASE WHEN dep.ADT_UNIT_TYPE_C is NULL or dep.ADT_UNIT_TYPE_C <> 1
ELSE NULL
END)
C.)
CASE WHEN dep.ADT_UNIT_TYPE_C Is NULL or dep.ADT_UNIT_TYPE_C <> 1 THEN COUNT (NULL)
ELSE COUNT (1)
END
D.)
CASE WHEN dep.ADT_UNIT_TYPE_C is NULL or dep.ADT_UNIT_TYPE_C <> 1 THEN COUNT(1)
ELSE COUNT(NULL)
END
You can count the returned records with COUNT(*) and put the condition in the WHERE clause.
If you are using Oracle, you can use NVL.
The sample below is for Oracle; in MySQL use IFNULL and in SQL Server use ISNULL instead (COALESCE works in all three).
SELECT COUNT(*) FROM dep WHERE NVL(ADT_UNIT_TYPE_C, 0) != 1
It looks like, however, you are joining this to another table, probably a visit table, so you want to count visits. The visit table probably stores a department id or some other way to join it to departments.
Something like this:
SELECT COUNT(*) FROM visit v, departments d WHERE v.dep_id = d.dep_id AND NVL(d.ADT_UNIT_TYPE_C, 0) !=1
If you want the entire list like shown above, you want to use a GROUP BY. This will show you the count of visits for each department type.
SELECT d.ADT_UNIT_TYPE_C, COUNT(*) FROM visit v, departments d WHERE v.dep_id = d.dep_id GROUP BY d.ADT_UNIT_TYPE_C
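For comparison with the COUNT(CASE ...) form in the question, here is a small sketch in Python's sqlite3 module with a made-up dep table. COUNT(expr) counts only rows where expr is not NULL, so which branch of the CASE yields NULL decides what gets counted. With the branches as written in option A, the ED rows are counted; swapping them counts the non-ED rows and agrees with the WHERE-clause approach.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: two ED units (1), two other unit types, and one NULL.
conn.execute("CREATE TABLE dep (ADT_UNIT_TYPE_C INTEGER)")
conn.executemany("INSERT INTO dep VALUES (?)", [(1,), (1,), (2,), (None,), (3,)])

# Branches as in option A: non-ED rows become NULL, so only ED rows are counted.
ed_count = conn.execute(
    "SELECT COUNT(CASE WHEN ADT_UNIT_TYPE_C IS NULL OR ADT_UNIT_TYPE_C <> 1 "
    "THEN NULL ELSE 1 END) FROM dep"
).fetchone()[0]

# Branches swapped: ED rows become NULL, so only non-ED rows are counted.
non_ed_case = conn.execute(
    "SELECT COUNT(CASE WHEN ADT_UNIT_TYPE_C IS NULL OR ADT_UNIT_TYPE_C <> 1 "
    "THEN 1 ELSE NULL END) FROM dep"
).fetchone()[0]

# Same number via COUNT(*) with the condition in the WHERE clause.
non_ed_where = conn.execute(
    "SELECT COUNT(*) FROM dep WHERE COALESCE(ADT_UNIT_TYPE_C, 0) <> 1"
).fetchone()[0]

print(ed_count, non_ed_case, non_ed_where)  # 2 3 3
```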

Using SQL SUM with Case statement containing inner SELECT

I have two tables: an Orders table, which contains a list of a user's orders, and an OrderShippingCosts table, which contains a price for shipping each item based on the OrderTypeID in the Orders table.
I am running a query like below to calculate the total shipping costs:
SELECT
SUM(CASE
WHEN OR.OrderTypeID = 1
THEN (SELECT CostOfShippingSmallParcel
FROM OrderShippingCosts)
ELSE (SELECT CostOfShippingBigParcel
FROM OrderShippingCosts)
END) AS TotalShippingCost
FROM
Orders AS OR
But I'm getting the following error:
Cannot perform an aggregate function on an expression containing an aggregate or a subquery
Does anyone know what is wrong with my query?
The SUM function takes an expression as input, which evaluates to a single data value, not a dataset. The definition of an expression from MSDN:
Is a combination of symbols and operators that the SQL Server Database Engine evaluates to obtain a single data value.
You are trying to pass a dataset (the result of a subquery) to SUM, not a single data value. This is a simplification of what you are trying to run:
SELECT SUM(SELECT Number FROM SomeTable)
It is not valid. The valid query would be:
SELECT SUM(Value) FROM SomeTable
In your particular case it looks like you are missing a JOIN. Your original logic would sum over the entire OrderShippingCosts table for each row of the Orders table. I think it should be something like this:
SELECT
SUM
(
CASE
WHEN ord.OrderTypeID = 1 THEN ship.CostOfShippingSmallParcel
ELSE ship.CostOfShippingBigParcel
END
) TotalShippingCost
FROM Orders AS ord
JOIN OrderShippingCosts ship ON /* your search condition, e.g.: ord.OrderID = ship.OrderID */
By the way, it is not a good idea to use reserved keywords as aliases, names and so on. Your query uses OR as the alias for the Orders table, but OR is reserved for the logical OR operator. If you really need to use a reserved keyword, wrap it in square brackets, [ and ].
The error message is clear; you can avoid it with a join:
SELECT
SUM(CASE WHEN [OR].OrderTypeID = 1
THEN CostOfShippingSmallParcel
ELSE CostOfShippingBigParcel END) AS TotalShippingCost
FROM Orders [OR]
CROSS JOIN OrderShippingCosts
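A runnable sketch of the join-based approach, using Python's sqlite3 module. It assumes OrderShippingCosts holds a single row of prices, which the question does not confirm; the sample orders and prices are invented, and the alias ord replaces the reserved word OR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (OrderID INTEGER, OrderTypeID INTEGER)")
conn.execute(
    "CREATE TABLE OrderShippingCosts "
    "(CostOfShippingSmallParcel REAL, CostOfShippingBigParcel REAL)"
)
# Hypothetical data: two small-parcel orders, one big-parcel order.
conn.executemany("INSERT INTO Orders VALUES (?, ?)", [(1, 1), (2, 2), (3, 1)])
# Assumption: exactly one row of prices, so the CROSS JOIN does not multiply rows.
conn.execute("INSERT INTO OrderShippingCosts VALUES (2.5, 10.0)")

total_shipping_cost = conn.execute(
    """
    SELECT SUM(CASE WHEN ord.OrderTypeID = 1
                    THEN ship.CostOfShippingSmallParcel
                    ELSE ship.CostOfShippingBigParcel END)
    FROM Orders AS ord
    CROSS JOIN OrderShippingCosts AS ship
    """
).fetchone()[0]

print(total_shipping_cost)  # 15.0 -> 2.5 + 10.0 + 2.5
```

If OrderShippingCosts had more than one row, the cross join would multiply the total, which is why a real join condition is the safer choice.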
You can try it like this...
SELECT
CASE WHEN [OR].OrderTypeID = 1
THEN (SELECT SUM(CostOfShippingSmallParcel) FROM OrderShippingCosts)
ELSE (SELECT SUM(CostOfShippingBigParcel) FROM OrderShippingCosts) END AS TotalShippingCost
FROM Orders AS [OR]
Let me know
select sum(t.TotalShippingCost)
FROM (
SELECT
(CASE WHEN [OR].OrderTypeID = 1
THEN (SELECT CostOfShippingSmallParcel FROM OrderShippingCosts)
ELSE (SELECT CostOfShippingBigParcel FROM OrderShippingCosts) END) AS TotalShippingCost
FROM Orders AS [OR]
) AS t
Try this
SELECT
ISNULL
(
SUM
(
CASE
WHEN O.OrderTypeID = 1 THEN C.CostOfShippingSmallParcel
ELSE C.CostOfShippingBigParcel END
), 0
) AS TotalShippingCost
FROM
Orders AS O LEFT JOIN
OrderShippingCosts C ON O.Id = C.OrderId -- Your relation id

How to delete 0 values but not blanks from a column in SQL Server 2008 programming?

I am trying to find the average of a column. I want to remove the 0 values but keep the blanks using SQL Server 2008 programming. Please help
Use AVG()
If you wish to ignore the 0 values but include the "blanks" (if you mean NULL) in the denominator, you can make use of the following characteristic of the function:
AVG () computes the average of a set of values by dividing the sum of those values by the count of nonnull values.
So that
SELECT AVG(
CASE WHEN [column] = 0 THEN NULL -- Skip 0 when calculating the average
WHEN [column] IS NULL THEN 0 -- Include blank as 0 value
ELSE [column] END) AS Average
FROM [table]
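The effect of that CASE expression can be checked with a short sketch in Python's sqlite3 module. The table readings and column val are hypothetical stand-ins for [table] and [column].

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: two zeros, one NULL ("blank"), and two real values.
conn.execute("CREATE TABLE readings (val INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?)", [(0,), (0,), (None,), (4,), (8,)]
)

default_avg, adjusted_avg = conn.execute(
    """
    SELECT AVG(val),
           AVG(CASE WHEN val = 0 THEN NULL   -- zeros drop out of the average
                    WHEN val IS NULL THEN 0  -- blanks are counted as 0
                    ELSE val END)
    FROM readings
    """
).fetchone()

print(default_avg)   # 3.0 -> (0 + 0 + 4 + 8) / 4, the NULL is skipped
print(adjusted_avg)  # 4.0 -> (0 + 4 + 8) / 3, zeros skipped, NULL counted as 0
```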
Something like this I guess:
select sum(distinct t1.val)/( count(distinct t2.id)+ count(distinct t3.id))
from mytable t1
join mytable t2
on 1=1
join mytable t3
on 1=1
where not t2.val = 0 and (t3.val is null)

SQL query and joins

Please see my query below:
select I.OID_CUSTOMER_DIM, I.segment as PISTACHIO_SEGMENT,
MAX(CASE WHEN S.SUBSCRIPTION_TYPE = '5' THEN 'Y' ELSE 'N' END ) PB_SUBS,
max(case when S.SUBSCRIPTION_TYPE ='12' then 'Y' else 'N' end) DAILY_TASTE,
MAX(CASE WHEN S.SUBSCRIPTION_TYPE ='8' THEN 'Y' ELSE 'N' END) COOKING_FOR_TWO
FROM WITH_MAIL_ID i JOIN CUSTOMER_SUBSCRIPTION_FCT S
ON I.IDENTITY_ID = S.IDENTITY_ID
WHERE S.SITE_CODE = 'PB' and S.SUBSCRIPTION_END_DATE is null
group by I.oid_customer_dim, I.segment
This returns 654105 rows, which is fewer than the 706795 rows in with_mail_id, one of the joined tables.
Now, for QC purposes, my manager is wondering why the final table does not have all the rows. I tried removing all the filters, but the row counts still do not match. What am I doing wrong?
I am not very good at SQL yet and this is really confusing me.
You're doing an inner join on the two tables, so only rows from WITH_MAIL_ID that can join against CUSTOMER_SUBSCRIPTION_FCT will be returned. Additionally, you have a GROUP BY clause.
First the join. If you want to return all rows regardless of the join condition, you can use a left join, but in this case all the S. columns will be NULL, and you'll have to deal with that.
If you run this, you might find that the count accounts for the difference:
select count(*) from WITH_MAIL_ID i
left join CUSTOMER_SUBSCRIPTION_FCT S
on I.IDENTITY_ID = S.IDENTITY_ID
where s.IDENTITY_ID is NULL
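A small sketch of that check in Python's sqlite3 module, with invented ids and table contents (only the table and column names come from the question). It counts the rows that an inner join would drop and compares that with the number of groups the inner join produces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE WITH_MAIL_ID (IDENTITY_ID INTEGER)")
conn.execute(
    "CREATE TABLE CUSTOMER_SUBSCRIPTION_FCT "
    "(IDENTITY_ID INTEGER, SUBSCRIPTION_TYPE TEXT)"
)
# Hypothetical data: ids 2 and 4 have no subscription rows at all.
conn.executemany("INSERT INTO WITH_MAIL_ID VALUES (?)", [(1,), (2,), (3,), (4,)])
conn.executemany(
    "INSERT INTO CUSTOMER_SUBSCRIPTION_FCT VALUES (?, ?)",
    [(1, "5"), (1, "12"), (3, "8")],
)

total_rows = conn.execute("SELECT COUNT(*) FROM WITH_MAIL_ID").fetchone()[0]

# Rows of WITH_MAIL_ID with no match: the ones an inner join silently drops.
unmatched = conn.execute(
    """
    SELECT COUNT(*) FROM WITH_MAIL_ID i
    LEFT JOIN CUSTOMER_SUBSCRIPTION_FCT s ON i.IDENTITY_ID = s.IDENTITY_ID
    WHERE s.IDENTITY_ID IS NULL
    """
).fetchone()[0]

# Number of groups produced by the inner join plus GROUP BY.
inner_groups = conn.execute(
    """
    SELECT COUNT(*) FROM (
        SELECT i.IDENTITY_ID FROM WITH_MAIL_ID i
        JOIN CUSTOMER_SUBSCRIPTION_FCT s ON i.IDENTITY_ID = s.IDENTITY_ID
        GROUP BY i.IDENTITY_ID
    )
    """
).fetchone()[0]

print(total_rows, unmatched, inner_groups)  # 4 2 2
```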
The most likely thing however is that it's just the grouping. If you are grouping on two columns and selecting the max of various other columns based on that grouping, you would expect that the number of rows returned is less than the original table, otherwise why bother grouping?
If I have data like this:
groupkey1 value
1 2
1 10
2 1
2 1
If I then group by groupkey1 and select MAX(value), I would get 2 rows, [1,10] and [2,1], not 4 rows.
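The grouping example can be run as-is with Python's sqlite3 module. The table name sample is hypothetical; the four rows are the ones listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (groupkey1 INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?)", [(1, 2), (1, 10), (2, 1), (2, 1)]
)

grouped_rows = conn.execute(
    "SELECT groupkey1, MAX(value) FROM sample "
    "GROUP BY groupkey1 ORDER BY groupkey1"
).fetchall()

print(grouped_rows)  # [(1, 10), (2, 1)] -> four input rows collapse to two
```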