SQL query overlapping time undercounting - sql

I am writing a query that counts how many people were active within an 18-month window as of a certain day of each year.
The problem I'm having is that between the years there is an overlap in time period which causes the later years to be undercounted because they are classified as the previous year.
For example, '2017-03-06' could be considered activity for 2018 AND 2017.
Here is my query:
select case when deposit_dt between '2017-02-07' and date then '2018'
when deposit_dt between '2016-02-07' and '2017-08-07' then '2017'
when deposit_dt between '2015-02-07' and '2016-08-07' then '2016'
when deposit_dt between '2014-02-07' and '2015-08-07' then '2015'
when deposit_dt between '2013-02-07' and '2014-08-07' then '2014'
end as yr, count(unique(op_id))
from activity_table
where deposit_dt between '2013-02-07' and date
group by deposit_dt
Any advice on how to get around this issue (other than running a new query for each year)?

My feeling is that you really want something along the lines of the following:
select
count(case when deposit_dt between '2017-02-07' and date then 1 end) as "2018",
count(case when deposit_dt between '2016-02-07' and '2017-08-07' then 1 end) as "2017",
count(case when deposit_dt between '2015-02-07' and '2016-08-07' then 1 end) as "2016",
count(case when deposit_dt between '2014-02-07' and '2015-08-07' then 1 end) as "2015",
count(case when deposit_dt between '2013-02-07' and '2014-08-07' then 1 end) as "2014"
from activity_table
where deposit_dt between '2013-02-07' and date;
Note that it doesn't make sense to group by deposit_dt, since this is the column which is being used to aggregate.
This assumes that you don't have logic beyond this to take the potentially overlapping dates into account. If you can provide logic for how to resolve a date which matches more than one range, then the above query can be updated.

I think this might be what you want.
select count(distinct [2018Users]) as [2018], count(distinct [2017Users]) as [2017],count(distinct [2016Users]) as [2016]
from(
select case when deposit_dt between '2017-02-07' and date then op_id end as [2018Users],
case when deposit_dt between '2016-02-07' and '2017-08-07' then op_id end as [2017Users],
case when deposit_dt between '2015-02-07' and '2016-08-07' then op_id end as [2016Users]
from activity_table
) c
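To check the idea behind the two answers above, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up, and the open-ended `date` upper bound is replaced by a fixed '2018-08-07' so the result is reproducible; both are assumptions, not part of the original question. Because each year gets its own CASE expression, a single deposit can count toward more than one year, and COUNT(DISTINCT ...) keeps each op_id from being counted twice within a year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activity_table (op_id INTEGER, deposit_dt TEXT)")
conn.executemany(
    "INSERT INTO activity_table VALUES (?, ?)",
    [
        (1, "2017-03-06"),  # inside both the 2017 and the 2018 window
        (1, "2017-04-01"),  # same person again; must not be counted twice
        (2, "2016-05-10"),  # inside both the 2016 and the 2017 window
        (3, "2018-01-15"),  # inside the 2018 window only
    ],
)

# One CASE per year: a row is tested against every window independently,
# so overlapping windows no longer steal rows from each other.
# ISO-formatted date strings compare correctly as text in SQLite.
overlap_counts = conn.execute(
    """
    SELECT
        COUNT(DISTINCT CASE WHEN deposit_dt BETWEEN '2017-02-07' AND '2018-08-07'
                            THEN op_id END) AS users_2018,
        COUNT(DISTINCT CASE WHEN deposit_dt BETWEEN '2016-02-07' AND '2017-08-07'
                            THEN op_id END) AS users_2017,
        COUNT(DISTINCT CASE WHEN deposit_dt BETWEEN '2015-02-07' AND '2016-08-07'
                            THEN op_id END) AS users_2016
    FROM activity_table
    """
).fetchone()

print(overlap_counts)
```

With this sample data, op_id 1 is counted in both 2017 and 2018, which is the behavior the single CASE ... END AS yr expression in the question cannot produce.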

Related

Count sequence selected columns

I have the query below. I want a running total, so that the value of 'feb' is the sum of jan and feb, the value of 'mar' is the sum of jan, feb, and mar, and so on. Is there any way to get a result like that?
select A.location as location
, count(Case When SUBSTRING(A.base_date,5,2)='01' Then A.customer_no else null end) as "jan"
, count(Case When SUBSTRING(A.base_date,5,2)='02' Then A.customer_no else null end) as "feb"
....
, count(Case When SUBSTRING(A.base_date,5,2)='12' Then A.customer_no else null end) as "dec"
from table_income A group by A.location;
SQL is a much more effective language when you think in rows rather than columns (normalisation).
For example, having one row per month is much simpler...
SELECT
location,
SUBSTRING(base_date,5,2) AS base_month,
SUM(COUNT(customer_no))
OVER (
PARTITION BY location
ORDER BY SUBSTRING(base_date,5,2)
)
AS count_cust
FROM
table_income
GROUP BY
location,
SUBSTRING(base_date,5,2)
Side notes:
If your base_date is a string, it shouldn't be; use data types relevant to the data.
If your base_date is a date or timestamp, you should really use date/timestamp functions, such as EXTRACT(month FROM base_date).
You probably should also account for different years...
SELECT
location,
DATE_TRUNC('month', base_date) AS base_month,
SUM(COUNT(customer_no))
OVER (
PARTITION BY location, DATE_TRUNC('year', DATE_TRUNC('month', base_date))
ORDER BY DATE_TRUNC('month', base_date)
)
AS count_cust
FROM
table_income
GROUP BY
location,
DATE_TRUNC('month', base_date)
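A rough, runnable sketch of the row-per-month running total, using Python's sqlite3 module (window functions need SQLite 3.25 or newer). The sample rows and the YYYYMMDD string format of base_date are assumptions inferred from the SUBSTRING(base_date,5,2) call in the question. The monthly counts are computed in a subquery first and then accumulated with a window SUM, which is equivalent to the nested SUM(COUNT(...)) OVER (...) form above but a little easier to read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_income (location TEXT, base_date TEXT, customer_no INTEGER)"
)
conn.executemany(
    "INSERT INTO table_income VALUES (?, ?, ?)",
    [
        ("north", "20200105", 1),
        ("north", "20200120", 2),
        ("north", "20200211", 3),
        ("north", "20200302", 4),
        ("south", "20200214", 5),
    ],
)

running_totals = conn.execute(
    """
    SELECT location,
           base_month,
           -- accumulate the monthly counts within each location
           SUM(cnt) OVER (PARTITION BY location ORDER BY base_month) AS count_cust
    FROM (
        -- step 1: one row per location and month
        SELECT location,
               substr(base_date, 5, 2) AS base_month,
               COUNT(customer_no)      AS cnt
        FROM table_income
        GROUP BY location, substr(base_date, 5, 2)
    ) AS monthly
    ORDER BY location, base_month
    """
).fetchall()

for row in running_totals:
    print(row)
```

Months with no rows for a location simply do not appear in this output; filling them in would need a join against a calendar or month table.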
Try this:
SELECT A.location as location
, count(Case When SUBSTRING(A.base_date,5,2) in ('01') Then A.customer_no else null end) as "jan"
, count(Case When SUBSTRING(A.base_date,5,2) in ('01','02') Then A.customer_no else null end) as "feb"
....
, count(Case When SUBSTRING(A.base_date,5,2) in ('01','02',...'12') Then A.customer_no else null end) as "dec"
from table_income A group by A.location;

Filtering multiple aggregated columns

I have a database of sales, and I had to group sales by day of the week so that each day has its own column for the selected months. I need to filter out the aggregated sum for any day on which there were no sales (the result would still be NULL). I added s.AMOUNT_SOLD > 0 to the WHERE clause, but I'm not sure that's the correct solution to the problem. I've been trying to think of a way other than repeating all of the sums in the WHERE / HAVING clause, but no luck so far. I would really appreciate some help.
SELECT
t.CALENDAR_MONTH_NAME AS SALES_MONTH,
UPPER(LEFT(p.PROD_NAME,CHARINDEX('&',p.PROD_NAME)-1))+' '+ SUBSTRING(p.PROD_NAME,CHARINDEX('&',p.PROD_NAME),LEN(p.PROD_NAME))+' ('+CAST(p.PROD_ID AS VARCHAR)+')' AS PRODUCT_NAME,
SUM(CASE WHEN (t.DAY_NUMBER_IN_WEEK=1) THEN s.AMOUNT_SOLD END) AS Monday,
SUM(CASE WHEN (t.DAY_NUMBER_IN_WEEK=2) THEN s.AMOUNT_SOLD END) AS Tuesday,
SUM(CASE WHEN (t.DAY_NUMBER_IN_WEEK=3) THEN s.AMOUNT_SOLD END) AS Wednesday,
SUM(CASE WHEN (t.DAY_NUMBER_IN_WEEK=4) THEN s.AMOUNT_SOLD END) AS Thursday,
SUM(CASE WHEN (t.DAY_NUMBER_IN_WEEK=5) THEN s.AMOUNT_SOLD END) AS Friday,
SUM(CASE WHEN (t.DAY_NUMBER_IN_WEEK=6) THEN s.AMOUNT_SOLD END) AS Saturday,
SUM(CASE WHEN (t.DAY_NUMBER_IN_WEEK=7) THEN s.AMOUNT_SOLD END) AS Sunday
FROM sh.CUSTOMERS c
JOIN sh.SALES s ON c.CUST_ID=s.CUST_ID
JOIN sh.TIMES t ON s.TIME_ID=t.TIME_ID
JOIN sh.PRODUCTS p ON s.PROD_ID=p.PROD_ID
WHERE s.PROD_ID = 5
AND (t.CALENDAR_QUARTER_NUMBER=2 AND t.CALENDAR_YEAR=2000) AND s.AMOUNT_SOLD>0
GROUP BY p.PROD_NAME,t.CALENDAR_MONTH_NAME,t.CALENDAR_MONTH_NUMBER,p.PROD_ID
ORDER BY t.CALENDAR_MONTH_NUMBER
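Could you use the following? Note that IFNULL is MySQL syntax; on SQL Server, which the query above appears to target, the equivalent is ISNULL or COALESCE.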
SUM(CASE WHEN (t.DAY_NUMBER_IN_WEEK=1) THEN IFNULL(s.AMOUNT_SOLD, 0) END) AS Monday
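It may help to see where the NULL actually comes from. The sketch below uses Python's sqlite3 module with a made-up, simplified sales table (the table layout and values are assumptions for illustration only). When no row matches a day, the CASE yields NULL for every row and SUM returns NULL, so a null check on the amount inside THEN never gets a chance to run. Wrapping the whole SUM in COALESCE (or adding ELSE 0 to the CASE) is what turns the empty day into 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (day_number_in_week INTEGER, amount_sold REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 10.0), (1, 5.5), (3, 7.0)],  # note: no rows at all for day 2
)

monday, tuesday_raw, tuesday_zero = conn.execute(
    """
    SELECT
        -- a day with sales: plain conditional sum
        SUM(CASE WHEN day_number_in_week = 1 THEN amount_sold END),
        -- a day without sales: every CASE result is NULL, so SUM is NULL
        SUM(CASE WHEN day_number_in_week = 2 THEN amount_sold END),
        -- the null handling has to wrap the aggregate itself
        COALESCE(SUM(CASE WHEN day_number_in_week = 2 THEN amount_sold END), 0)
    FROM sales
    """
).fetchone()

print(monday, tuesday_raw, tuesday_zero)
```

Whether 0 or NULL is the better result for a day without sales depends on how the report is consumed; the sketch only shows how to get each.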

Query to show all months and show values where there are data for the corresponding months

I have a query and it shows the months where there is corresponding data. However, I would like to show all of the months in the year and have the months where there are no data shown as zero.
Here is my SQL statement:
SELECT DATENAME(MONTH, hb_Disputes.OPENED) AS MonthValue,
COUNT(CASE WHEN REV_CLS = 2 THEN 1 END) AS SmallCommercialIndust,
COUNT(CASE WHEN REV_CLS <> 2 THEN 1 END) AS Residential
FROM hb_Disputes
WHERE (hb_Disputes.ASSGNTO = 'E099255') AND (YEAR(hb_Disputes.OPENED) = YEAR(GETDATE()))
GROUP BY hb_Disputes.OPENED
And this is my output:
I also have a table name MonthName that shows all of the months in a year and I know I may need to use this to accomplish what I'm trying to achieve but I'm not sure how to get there:
If you have data in the table for all months, but the where clause is filtering it out, then the simplest method is to extend the conditional aggregation:
SELECT DATENAME(MONTH, d.OPENED) AS MonthValue,
SUM(CASE WHEN d.ASSGNTO = 'E099255' AND d.REV_CLS = 2 THEN 1 ELSE 0 END) AS SmallCommercialIndust,
SUM(CASE WHEN d.ASSGNTO = 'E099255' AND d.REV_CLS <> 2 THEN 1 ELSE 0 END) AS Residential
FROM hb_Disputes d
WHERE YEAR(d.OPENED) = YEAR(GETDATE())
GROUP BY DATENAME(MONTH, d.OPENED)
ORDER BY MIN(d.OPENED);
Note: This does not fix the issue in all cases. It should just be a simple way to modify your query -- and will often work.
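For the cases the note above warns about (months with no rows at all in hb_Disputes), one common alternative is to start from the month table and LEFT JOIN the disputes onto it. Here is a minimal sketch with Python's sqlite3 module. The month_name table layout (month_no, name), the sample rows, and the fixed year 2024 in place of YEAR(GETDATE()) are all assumptions, since the real MonthName table is not shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout for the month lookup table (trimmed to three months).
conn.execute("CREATE TABLE month_name (month_no TEXT, name TEXT)")
conn.execute("CREATE TABLE hb_disputes (opened TEXT, assgnto TEXT, rev_cls INTEGER)")
conn.executemany(
    "INSERT INTO month_name VALUES (?, ?)",
    [("01", "January"), ("02", "February"), ("03", "March")],
)
conn.executemany(
    "INSERT INTO hb_disputes VALUES (?, ?, ?)",
    [
        ("2024-01-15", "E099255", 2),
        ("2024-01-20", "E099255", 1),
        ("2024-03-02", "E099255", 2),
        ("2024-02-10", "OTHER", 2),  # different assignee, must not be counted
    ],
)

monthly_counts = conn.execute(
    """
    SELECT m.name,
           COUNT(CASE WHEN d.rev_cls = 2  THEN 1 END) AS small_commercial_indust,
           COUNT(CASE WHEN d.rev_cls <> 2 THEN 1 END) AS residential
    FROM month_name AS m
    LEFT JOIN hb_disputes AS d
           ON strftime('%m', d.opened) = m.month_no
          -- filters go in the ON clause so unmatched months are kept
          AND d.assgnto = 'E099255'
          AND strftime('%Y', d.opened) = '2024'
    GROUP BY m.month_no, m.name
    ORDER BY m.month_no
    """
).fetchall()

for row in monthly_counts:
    print(row)
```

The key design choice is keeping the assignee and year filters in the ON clause rather than in WHERE; moving them to WHERE would discard the NULL-extended rows and drop the empty months again.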

SQL summing the same column with different date conditions

I'm using one table and am trying to sum total spend for two separate years, pulling from the same column.
Essentially I have a list of customers, and I'm trying to sum their 2018 spend, and their 2019 spend. I've tried a few different things, but I can't seem to get the "case when" function to work because once the 2018 spend condition is met and that value is populated, it won't sum for 2019, and vice versa—so I've got a total for 2018 OR 2019, but no customers are showing spend for both.
This is my query:
select *
from
(select
buyer_first_name, buyer_last_name, buyer_address_1, buyer_address_2,
buyer_address_city, buyer_address_state, buyer_address_zip, buyer_email, buyer_phone_1,
sum(case when sale_amount > 0 and year(sale_date) = 2018 then sale_amount end) as Spend18,
sum(case when sale_amount > 0 and year(sale_date) = 2019 then sale_amount end) as Spend19
from
database.table
where
sale_date between date '2018-01-01' and date '2019-10-30'
group by
buyer_first_name, buyer_last_name, buyer_address_1, buyer_address_2,
buyer_address_city, buyer_address_state, buyer_address_zip, buyer_email, buyer_phone_1)
Any idea what I'm doing wrong? Thank you!
I really doubt that your problem is with SQL syntax or a bug. Your query looks like it should be doing what you want.
The issue is that those fields are not the same for any customer in the two years. Try something like this:
select buyer_last_name,
sum(case when sale_amount > 0 and year(sale_date) = 2018 then sale_amount end) as Spend18,
sum(case when sale_amount > 0 and year(sale_date) = 2019 then sale_amount end) as Spend19
from database.table
where sale_date between date '2018-01-01' and date '2019-10-30'
group by buyer_last_name;
I speculate that last names are unlikely to change. If this works, you can start adding columns back to see where the problem columns are.
This is why databases are normalized. Repeated data tends to get out of synch.
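A small sketch of that diagnosis, using Python's sqlite3 module. The table, the single customer, and the changed phone number are invented for illustration; the real data may differ in some other column. Grouping by a column whose value changed between the two years splits the customer into two rows, each with only one year populated, while grouping by the stable column alone gives one row with both totals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (buyer_last_name TEXT, buyer_phone_1 TEXT, "
    "sale_date TEXT, sale_amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Smith", "555-0100", "2018-06-01", 100.0),
        ("Smith", "555-0199", "2019-03-15", 40.0),  # phone number changed in 2019
    ],
)

# Same conditional sums as in the question; only the grouping columns vary.
spend_sql = """
    SELECT {cols},
           SUM(CASE WHEN sale_amount > 0 AND strftime('%Y', sale_date) = '2018'
                    THEN sale_amount END) AS spend18,
           SUM(CASE WHEN sale_amount > 0 AND strftime('%Y', sale_date) = '2019'
                    THEN sale_amount END) AS spend19
    FROM sales
    GROUP BY {cols}
    ORDER BY {cols}
"""

by_name_and_phone = conn.execute(
    spend_sql.format(cols="buyer_last_name, buyer_phone_1")
).fetchall()
by_name = conn.execute(spend_sql.format(cols="buyer_last_name")).fetchall()

print(by_name_and_phone)  # two rows, each with one year NULL
print(by_name)            # one row with both years filled
```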
It's odd that you have this problem, and it needs further investigation. Frankly, I would expect the same result you do. However, you can work around it with something like this (assuming you are using MySQL):
sum(sale_amount * if(sale_amount > 0 and year(sale_date) = 2018, 1, 0)) as Spend18,
sum(sale_amount * if(sale_amount > 0 and year(sale_date) = 2019, 1, 0)) as Spend19
If all your sales are >= 0, you can drop the sale_amount > 0 condition.
If this does not work, test select without grouping and see what's going on:
select sale_amount * if(sale_amount > 0 and year(sale_date) = 2018, 1, 0) as spend18,
sale_amount * if(sale_amount > 0 and year(sale_date) = 2019, 1, 0) as spend19
from your_table;

calculate YTD & Prev Year YTD

I want to calculate YTD (1st Jan 2016 to last date of a month) & Prev year YTD (1st Jan 2015 to last date of a month) for each Client.
Below is the SQL query that I have attempted, but I get two rows for each client instead of one because of the way I'm using 'CASE WHEN'.
My question is: how can I get the result in just one row per client, instead of one row for YTD and another row for YTD-1?
SELECT [ClientName]
, (CASE WHEN YEAR([Purchase_Date]) = YEAR(GETDATE())-1 THEN (count(Activity)) end) AS 'YTD-1'
, (CASE WHEN YEAR([Purchase_Date]) = YEAR(GETDATE()) THEN (count(Activity)) end) AS 'YTD'
FROM Purchases
WHERE MONTH([Purchase_Date]) <= MONTH(GETDATE())
GROUP BY [ClientName], YEAR([Purchase_Date])
ORDER BY 1
Kindly Help!
Thanks,
Ramesh
Remove YEAR([Purchase_Date]) from the GROUP BY part.
Also, instead of:
(CASE WHEN YEAR([Purchase_Date]) = YEAR(GETDATE())-1 THEN (count(Activity)) end) AS 'YTD-1'
use:
count(CASE WHEN YEAR([Purchase_Date]) = YEAR(GETDATE())-1 THEN Activity else NULL end) AS 'YTD-1'
And the same for 'YTD' column.
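Putting both changes together, here is a minimal sketch using Python's sqlite3 module. The sample purchases are invented, and the "current" date is pinned to March 2016 (years '2015' and '2016', month cutoff 3) instead of calling GETDATE(), so the output is reproducible; those are assumptions for the example only. With the year removed from GROUP BY and the CASE moved inside COUNT, each client collapses to a single row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (client_name TEXT, purchase_date TEXT, activity TEXT)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [
        ("Acme", "2015-02-10", "order"),
        ("Acme", "2016-01-05", "order"),
        ("Acme", "2016-03-20", "order"),
        ("Bolt", "2015-03-01", "order"),
        ("Bolt", "2015-11-30", "order"),  # after the month cutoff, so excluded
    ],
)

ytd_rows = conn.execute(
    """
    SELECT client_name,
           -- the CASE goes inside COUNT, so both years share one group
           COUNT(CASE WHEN strftime('%Y', purchase_date) = '2015'
                      THEN activity END) AS ytd_prev,
           COUNT(CASE WHEN strftime('%Y', purchase_date) = '2016'
                      THEN activity END) AS ytd
    FROM purchases
    WHERE CAST(strftime('%m', purchase_date) AS INTEGER) <= 3
    GROUP BY client_name  -- no year here, so one row per client
    ORDER BY client_name
    """
).fetchall()

for row in ytd_rows:
    print(row)
```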