Difference between COUNT and SUM within an Aggregate CASE statement - sql

Full disclosure, I am learning and I have searched all over the internet and I just can't figure out my question.
I am working on an online class and was given the following example:
select
DATENAME(MONTH, DATEADD(MONTH, MONTH(OrderDate), -1)) AS 'Month',
SUM(CASE WHEN YEAR(OrderDate) = 2005 THEN 1 ELSE 0 END) AS Orders,
SUM(CASE YEAR(OrderDate) WHEN 2005 THEN Totaldue ELSE 0 END) AS 'Total Value'
from
sales.salesorderheader
group by Month(orderdate)
order by Month(orderdate) ASC
That returns the following results:
I understood that (I thought) so I began messing around with the code to further understand Case statements. Looking at the code I thought that the Orders field was essentially finding all the orders in a month, assigning a 1 to each one, and then adding them all up. Because each one was assigned a 1 I figured that I could change the SUM to COUNT and I would get the same results.
However, this code:
select
DATENAME(MONTH, DATEADD(MONTH, MONTH(OrderDate), -1)) AS 'Month',
COUNT(CASE WHEN YEAR(OrderDate) = 2005 THEN 1 ELSE 0 END) AS Orders,
SUM(CASE YEAR(OrderDate) WHEN 2005 THEN Totaldue ELSE 0 END) AS 'Total Value'
from
sales.salesorderheader
group by Month(orderdate)
order by Month(orderdate) ASC
Returns these results:
To try and break this down I created a query that would just look for the orders in January 2005 and count them.
SELECT COUNT(*)
FROM Sales.SalesOrderHeader
WHERE OrderDate >= '20050101' AND OrderDate < '20050201'
This returned 0. The same as the SUM query. I get that COUNT counts rows and SUM sums numbers in a column, but I just don't understand the results I'm getting. Could someone please explain why the count query is returning 2483 for January and not 0?

For COUNT, 1 and 0 are the same: both are non-NULL values, so both get counted. What you really need is NULL:
COUNT(ALL expression) evaluates expression for each row in a group and returns the number of nonnull values.
select
DATENAME(MONTH, DATEADD(MONTH, MONTH(OrderDate), -1)) AS 'Month',
COUNT(CASE WHEN YEAR(OrderDate) = 2005 THEN 1 ELSE NULL END) AS Orders,
SUM(CASE YEAR(OrderDate) WHEN 2005 THEN Totaldue ELSE 0 END) AS 'Total Value'
from sales.salesorderheader
group by Month(orderdate)
order by Month(orderdate) ASC;
Or even shorter (the default ELSE is NULL, so we can omit that part):
COUNT(CASE WHEN YEAR(OrderDate) = 2005 THEN 1 END) AS Orders,
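Example:
year   SUM(... THEN 1 ELSE 0)   COUNT(... THEN 1 ELSE 0)   COUNT(... THEN 1 ELSE NULL)
2005   1                        1                          1
2006   0                        0                          NULL
2007   0                        0                          NULL
2005   1                        1                          1
======================================================================================
       2                        4                          2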
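The table above can be reproduced with a small script. This is only a sketch: it uses Python's built-in sqlite3 module instead of SQL Server, and the table orders with its order_year column is made up for the illustration. The three aggregates behave the same way in both engines.

```python
import sqlite3

# Hypothetical one-column table holding the four years from the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_year INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?)",
    [(2005,), (2006,), (2007,), (2005,)],
)

row = conn.execute(
    """
    SELECT
        SUM(CASE WHEN order_year = 2005 THEN 1 ELSE 0 END),   -- adds the 1s
        COUNT(CASE WHEN order_year = 2005 THEN 1 ELSE 0 END), -- counts 1s and 0s alike
        COUNT(CASE WHEN order_year = 2005 THEN 1 END)         -- non-matching rows become NULL
    FROM orders
    """
).fetchone()

print(row)  # (2, 4, 2)
```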

When you use COUNT(*) you count ALL the rows in the group. When you pass a column instead, for example COUNT(OrderDate), only the rows where that column is not NULL are counted. Try it.

COUNT example:
Assume your column is named order and holds these four entries:
2 ---------- 5 ---------- 4 ---------- NULL
Now if you run
count(order)
it returns 3: the number of entries in the column, not counting the NULL.
SUM example:
2 ---------- 5 ---------- 4
Now if you run
sum(order)
it returns 2 + 5 + 4 = 11: it adds up all the entries.
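Both examples can be checked in one go. The sketch below uses Python's sqlite3 module and a made-up table t. Because order is a reserved word, the column name is double-quoted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "order" is a reserved word, so the hypothetical column name is quoted.
conn.execute('CREATE TABLE t ("order" INTEGER)')
conn.executemany("INSERT INTO t VALUES (?)", [(2,), (5,), (4,), (None,)])

counts = conn.execute(
    'SELECT COUNT(*), COUNT("order"), SUM("order") FROM t'
).fetchone()

# COUNT(*) sees all four rows, COUNT("order") skips the NULL, SUM adds 2 + 5 + 4.
print(counts)  # (4, 3, 11)
```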

Related

SQL case operation

I'm fairly new to SQL and have been trying to solve a problem involving a table with information about orders. I'm trying to use CASE to get a monthly report on orders: one column for the year, another for the month, and then columns for days 1-20, 21-22, 23-24 and above 25. Each of those columns should hold the number of orders placed on those days.
I tried the following query :
SELECT
DATEPART(YEAR,date) AS year,DATEPART(MONTH,date) AS month,
COUNT(CASE WHEN DATEPART(DAY,date) BETWEEN 1 AND 20 THEN order ELSE 0 END) AS D1_D20,
COUNT(CASE WHEN DATEPART(DAY,date) BETWEEN 21 AND 22 THEN order ELSE 0 END) AS D21_D22,
COUNT(CASE WHEN DATEPART(DAY,date) BETWEEN 23 AND 24 THEN order ELSE 0 END) AS D23_D24,
COUNT(CASE WHEN DATEPART(DAY,date) > 25 THEN order ELSE 0 END) AS D25_END
FROM ORDERS
GROUP BY DATEPART(YEAR,date),DATEPART(MONTH,date)
Obviously the problem with that query is that I just get the total number of orders in every column. I know I should count only the matching orders, but I don't know the syntax. Help would be greatly appreciated!
Use SUM():
SELECT
DATEPART(YEAR, date) AS year, DATEPART(MONTH, date) AS month,
SUM(CASE WHEN DATEPART(DAY,date) BETWEEN 1 AND 20 THEN 1 ELSE 0 END) AS D1_D20,
SUM(CASE WHEN DATEPART(DAY,date) BETWEEN 21 AND 22 THEN 1 ELSE 0 END) AS D21_D22,
SUM(CASE WHEN DATEPART(DAY,date) BETWEEN 23 AND 24 THEN 1 ELSE 0 END) AS D23_D24,
SUM(CASE WHEN DATEPART(DAY,date) > 25 THEN 1 ELSE 0 END) AS D25_END
FROM ORDERS
GROUP BY DATEPART(YEAR, date), DATEPART(MONTH, date);
I would recommend using the functions DAY(), YEAR(), and MONTH() because they are simpler to type.
By the way, you can use COUNT() if you remove the ELSE clause. Your particular problem is that COUNT(0) = COUNT(1) because COUNT() counts non-NULL values. I prefer SUM() because it is more intuitive in this respect.
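Here is a minimal sketch of the same bucketing, run through Python's sqlite3 module so it can be tried without SQL Server. SQLite has no DATEPART, so strftime stands in for it. The table orders, the column order_date and the sample dates are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?)",
    [("2020-01-05",), ("2020-01-21",), ("2020-01-23",),
     ("2020-01-25",), ("2020-01-30",), ("2020-02-10",)],
)

rows = conn.execute(
    """
    WITH d AS (
        SELECT CAST(strftime('%Y', order_date) AS INTEGER) AS y,
               CAST(strftime('%m', order_date) AS INTEGER) AS m,
               CAST(strftime('%d', order_date) AS INTEGER) AS day
        FROM orders
    )
    SELECT y, m,
           SUM(CASE WHEN day BETWEEN 1 AND 20 THEN 1 ELSE 0 END) AS D1_D20,
           SUM(CASE WHEN day BETWEEN 21 AND 22 THEN 1 ELSE 0 END) AS D21_D22,
           SUM(CASE WHEN day BETWEEN 23 AND 24 THEN 1 ELSE 0 END) AS D23_D24,
           SUM(CASE WHEN day > 25 THEN 1 ELSE 0 END) AS D25_END,
           COUNT(*) AS all_orders
    FROM d
    GROUP BY y, m
    ORDER BY y, m
    """
).fetchall()

print(rows)  # [(2020, 1, 1, 1, 1, 1, 5), (2020, 2, 1, 0, 0, 0, 1)]
```

January has five orders but its four buckets add up to four: the order on day 25 matches none of the conditions, because the last bucket is written as > 25. If day 25 is meant to belong to D25_END, >= 25 would include it.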

How to get year numbers in columns

I need to display the number of transactions per year for each employee (ID, 2011, 2012, 2013, 2014), with the number of transactions for each year shown under that year's column, but right now I get one row per employee and year instead.
How can I change it?
I think I should use WHERE, but I don't know how to do it.
My current query is:
SELECT oh.SalesPersonID AS perID,
YEAR(oh.OrderDate) AS [Year], COUNT(*)
FROM Sales.SalesOrderHeader oh
JOIN Person.Person per ON oh.SalesPersonID = per.BusinessEntityID
GROUP BY SalesPersonID, YEAR(OrderDate)
ORDER BY perID, [Year];
Just use conditional aggregation:
SELECT oh.SalesPersonID AS perID,
SUM(CASE WHEN YEAR(oh.OrderDate) = 2011 THEN 1 ELSE 0 END) as cnt_2011,
SUM(CASE WHEN YEAR(oh.OrderDate) = 2012 THEN 1 ELSE 0 END) as cnt_2012,
SUM(CASE WHEN YEAR(oh.OrderDate) = 2013 THEN 1 ELSE 0 END) as cnt_2013,
SUM(CASE WHEN YEAR(oh.OrderDate) = 2014 THEN 1 ELSE 0 END) as cnt_2014
FROM Sales.SalesOrderHeader oh JOIN
Person.Person per
ON oh.SalesPersonID = per.BusinessEntityID
GROUP BY SalesPersonID
ORDER BY perID;
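When the list of years grows, the repeated SUM(CASE ...) columns can be generated rather than typed. The sketch below does that in Python and runs the result against sqlite3. The table orders and its columns sales_person_id and order_year are simplified stand-ins, not the AdventureWorks schema, and the sample rows are invented.

```python
import sqlite3

years = [2011, 2012, 2013, 2014]

# The years are integers from a fixed list, so formatting them into the SQL text is safe here.
columns = ",\n       ".join(
    f"SUM(CASE WHEN order_year = {y} THEN 1 ELSE 0 END) AS cnt_{y}" for y in years
)
sql = (
    f"SELECT sales_person_id,\n       {columns}\n"
    "FROM orders\n"
    "GROUP BY sales_person_id\n"
    "ORDER BY sales_person_id"
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (sales_person_id INTEGER, order_year INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(274, 2011), (274, 2011), (274, 2013),
     (275, 2012), (275, 2014), (275, 2014)],
)

rows = conn.execute(sql).fetchall()
print(rows)  # [(274, 2, 0, 1, 0), (275, 0, 1, 0, 2)]
```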

SQL Efficiency on Date Range or Separate Tables

I'm calculating historical amounts from a table by year range (e.g. 2015-2016, 2014-2015, etc.). I would like to know whether it's more efficient to do it in one batch or to repeat the query multiple times, each filtered by the required date range.
Thanks in advance!
OPTION 1:
select
id,
sum(case when year(getdate()) - year(txndate) between 5 and 6 then amt else 0 end) as amt_6_5,
...
sum(case when year(getdate()) - year(txndate) between 0 and 1 then amt else 0 end) as amt_1_0
from
mytable
group by
id
OPTION 2:
select
id, sum(amt) as amt_6_5
from
mytable
where
year(getdate()) - year(txndate) between 5 and 6
group by
id
...
select
id, sum(amt) as amt_1_0
from
mytable
where
year(getdate()) - year(txndate) between 0 and 1
group by
id
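1. Unless you have resource issues, I would go with the CASE version.
Although it has no impact on the results, filtering on the requested period in the WHERE clause might have a significant performance advantage.
2. Your period definitions overlap. With ranges such as BETWEEN 5 AND 6 followed by BETWEEN 4 AND 5, a difference of 5 years falls into both columns, so the same amount is summed twice.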
select id
,sum(case when year(getdate()) - year(txndate) = 6 then amt else 0 end) as amt_6
-- ...
,sum(case when year(getdate()) - year(txndate) = 0 then amt else 0 end) as amt_0
from mytable
where txndate >= dateadd(year, datediff(year,0, getDate())-6, 0)
group by id
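The overlap is easy to see on three rows. The sketch below uses Python's sqlite3 module, with a made-up table txn that stores the transaction year directly, and a fixed "current year" passed as the parameter :y in place of year(getdate()) so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn (id INTEGER, txn_year INTEGER, amt INTEGER)")
# Relative to 2021 these rows are 6, 5 and 4 years old.
conn.executemany(
    "INSERT INTO txn VALUES (?, ?, ?)",
    [(1, 2015, 100), (1, 2016, 10), (1, 2017, 1)],
)

row = conn.execute(
    """
    SELECT id,
           SUM(CASE WHEN :y - txn_year BETWEEN 5 AND 6 THEN amt ELSE 0 END) AS amt_6_5,
           SUM(CASE WHEN :y - txn_year BETWEEN 4 AND 5 THEN amt ELSE 0 END) AS amt_5_4,
           SUM(CASE WHEN :y - txn_year = 5 THEN amt ELSE 0 END) AS amt_5,
           SUM(amt) AS total
    FROM txn
    GROUP BY id
    """,
    {"y": 2021},
).fetchone()

print(row)  # (1, 110, 11, 10, 111)
```

The two BETWEEN columns add up to 121 although the table only holds 111: the 10 from the 5-year-old row is in both. With equality tests each row lands in exactly one column.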
This may help you:
WITH CTE
AS
(
SELECT id,
(CASE WHEN year(getdate()) - year(txndate) BETWEEN 5 AND 6 THEN 'year_5-6'
WHEN year(getdate()) - year(txndate) BETWEEN 4 AND 5 THEN 'year_4-5'
...
END) AS my_year,
amt
FROM mytable
)
SELECT id,my_year,sum(amt)
FROM CTE
GROUP BY id,my_year
Here, inside the CTE, each record is assigned a year tag (based on your conditions). The outer query then sums amt for each id and year tag.

SQL Columns Subquery with different criteria

Let's say I have two tables, [Products] and [Quantity].
I need to join them in order to build a table with the total quantity per product.
I know how to do this with a vanilla join. My problem is that I need 3 columns: [PreviewTotal], with the total quantity from the beginning up to yesterday, regardless of the sign of the quantity; [TodaytotalPos], with today's positive quantities only; and [TodaytotalNeg], with today's negative quantities only.
result like:
[Products]  [PreviewTotal]  [TodaytotalPos]  [TodaytotalNeg]
AAPL        20,000          500              -700
MCD         15,000          NULL             -300
BAC         -30,000         2,000            NULL
Sample of structure:
Products:
[id]  [name]
1     AAPL
2     MCD
3     BAC
Quantity:
[date]  [Id_Product]  [Quantity]
12/16   1             500
12/16   2             -300
12/17   1             1,000
12/18   3             5,500
12/18   1             -2,000
Based on the rules you specify, I think you are just looking for conditional aggregation. Your query would look something like this:
select p.name,
sum(case when [date] < cast(getdate() as date) then quantity end) as PreviewTotal,
sum(case when [date] = cast(getdate() as date) and quantity > 0
then quantity end) as TodayTotalPos,
sum(case when [date] = cast(getdate() as date) and
quantity < 0 then quantity end) as TodayTotNeg
from products p join
quantity q
on q.id_product = p.id
group by p.name
order by p.name;
However, your desired results don't match the input data, based on these rules.
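To see what those rules produce on the sample rows, here is a sketch using Python's sqlite3 module. Two things are assumptions made only for the illustration: the year 2015 attached to the sample dates, and "today" being fixed at 12/18 through the parameter :today instead of getdate().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE quantity (date TEXT, id_product INTEGER, quantity INTEGER)")
conn.executemany("INSERT INTO products VALUES (?, ?)",
                 [(1, "AAPL"), (2, "MCD"), (3, "BAC")])
conn.executemany(
    "INSERT INTO quantity VALUES (?, ?, ?)",
    [("2015-12-16", 1, 500), ("2015-12-16", 2, -300), ("2015-12-17", 1, 1000),
     ("2015-12-18", 3, 5500), ("2015-12-18", 1, -2000)],
)

rows = conn.execute(
    """
    SELECT p.name,
           SUM(CASE WHEN q.date < :today THEN q.quantity END) AS PreviewTotal,
           SUM(CASE WHEN q.date = :today AND q.quantity > 0 THEN q.quantity END) AS TodayTotalPos,
           SUM(CASE WHEN q.date = :today AND q.quantity < 0 THEN q.quantity END) AS TodayTotalNeg
    FROM products p
    JOIN quantity q ON q.id_product = p.id
    GROUP BY p.name
    ORDER BY p.name
    """,
    {"today": "2015-12-18"},
).fetchall()

print(rows)
# [('AAPL', 1500, None, -2000), ('BAC', None, 5500, None), ('MCD', -300, None, None)]
```

Leaving out ELSE is what yields NULL (None in Python) when no row matches a condition, as in the desired output. With ELSE 0 those cells would be 0 instead.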
In SQL Server you can use CASE to select a field value depending on whether a condition is true or false. I assume your [Products] table contains at least ProductId and ProductName, and your [Quantity] table contains at least ProductId, Qty, and SoldDate.
-- get the time of midnight, the start of today:
DECLARE @Today datetime;
SET @Today = DATEADD(DAY, DATEDIFF(DAY, '19000101', GETDATE()), '19000101');
-- get the totals
SELECT Products.ProductName,
-- everything sold before today, regardless of sign
SUM(
CASE
WHEN Quantity.SoldDate < @Today THEN Quantity.Qty
ELSE 0
END) AS PreviewTotal,
-- today's positive quantities
SUM(
CASE
WHEN Quantity.Qty > 0 AND Quantity.SoldDate >= @Today THEN Quantity.Qty
ELSE 0
END) AS TodaytotalPos,
-- today's negative quantities
SUM(
CASE
WHEN Quantity.Qty < 0 AND Quantity.SoldDate >= @Today THEN Quantity.Qty
ELSE 0
END) AS TodaytotalNeg
FROM Products JOIN Quantity ON Products.ProductId = Quantity.ProductId
GROUP BY Products.ProductName;

Can I combine my two SQLite SELECT statements into one?

I have a SQLite table called posts. An example is shown below. I would like to calculate the monthly income and expenses.
accId       date        text                      amount      balance
----------  ----------  ------------------------  ----------  ----------
1           2008-03-25  Ex1                       -64.9       3747.56
1           2008-03-25  Shop2                     -91.85      3655.71
1           2008-03-26  Benny's                   -100.0      3555.71
For the income I have this query:
SELECT SUBSTR(date, 1, 7) "month", total(amount) "income" FROM posts
WHERE amount > 0 GROUP BY month ORDER BY date;
It works fine:
month       income
----------  ----------
2007-05     4877.0
2007-06     8750.5
2007-07     8471.0
2007-08     5503.0
Now I need the expenses, and I could of course just repeat the first statement with the condition amount < 0, but I am wondering if there is an elegant way to get both income and expenses in one query?
Try something like this
select substr(date, 1, 7) "Month",
total(case when amount > 0 then amount else 0 end) "Income",
total(case when amount < 0 then amount else 0 end) "Expenses"
from posts
group by month
Look into the UNION operator. It lets you combine the results of two queries, generally in the form:
<SELECT_STATEMENT_1> UNION <SELECT_STATEMENT_2>
SQLite supports CASE expressions, so you could do something like this:
SELECT SUBSTR(date, 1, 7) "month"
, total(CASE WHEN Amount > 0 THEN Amount ELSE 0 END) "income"
, -1 * total(CASE WHEN Amount < 0 THEN Amount ELSE 0 END) "expenses"
FROM posts
GROUP BY month
ORDER BY date;
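Since this question is about SQLite itself, the combined query can be run directly from Python's sqlite3 module. In the sketch below the first three rows come from the sample table in the question; the last two (an income row and an April expense) are invented so that both columns have something to show. It uses substr(date, 1, 7) because SQLite string positions start at 1, and it keeps only the date and amount columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?)",
    [("2008-03-25", -64.9), ("2008-03-25", -91.85), ("2008-03-26", -100.0),
     ("2008-03-27", 2500.0),   # hypothetical income row
     ("2008-04-01", -50.0)],   # hypothetical expense in another month
)

rows = conn.execute(
    """
    SELECT substr(date, 1, 7) AS month,
           total(CASE WHEN amount > 0 THEN amount ELSE 0 END) AS income,
           total(CASE WHEN amount < 0 THEN amount ELSE 0 END) AS expenses
    FROM posts
    GROUP BY month
    ORDER BY month
    """
).fetchall()

# total() always returns a floating-point value, so round for display.
for month, income, expenses in rows:
    print(month, round(income, 2), round(expenses, 2))
```

Multiplying the expenses column by -1, as in the last answer, only changes the sign that is reported.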