Optimise SQL Query with SUM and Case - sql

I have the following query, which takes more than a minute to return data:
SELECT extract(HOUR
FROM date) AS HOUR,
SUM(CASE
WHEN country_name = 'France' THEN atdelay
ELSE 0
END) AS France,
SUM(CASE
WHEN country_name = 'USA' THEN atdelay
ELSE 0
END) AS USA,
SUM(CASE
WHEN country_name = 'China' THEN atdelay
ELSE 0
END) AS China,
SUM(CASE
WHEN country_name = 'Brezil' THEN atdelay
ELSE 0
END) AS Brazil,
SUM(CASE
WHEN country_name = 'Argentine' THEN atdelay
ELSE 0
END) AS Argentine,
SUM(CASE
WHEN country_name = 'Equator' THEN atdelay
ELSE 0
END) AS Equator,
SUM(CASE
WHEN country_name = 'Maroc' THEN atdelay
ELSE 0
END) AS Maroc,
SUM(CASE
WHEN country_name = 'Egypt' THEN atdelay
ELSE 0
END) AS Egypt
FROM
(SELECT *
FROM Contry
WHERE (TO_CHAR(entrydate, 'YYYY-MM-DD')::DATE) >= '2021-01-01'
AND (TO_CHAR(entrydate, 'YYYY-MM-DD')::DATE) <= '2021-01-31'
AND code IS NOT NULL) AS A
GROUP BY HOUR
ORDER BY HOUR ASC;
My table is structured like so:
+---------------------+---------------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+---------------+------+-----+-------------------+-----------------------------+
| id | bigint(20) | NO | PRI | NULL | auto_increment |
| country_name | varchar(30) | YES | MUL | NULL | |
| date | timestamp | NO | MUL | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| entrydate | timestamp | NO | | NULL | |
| keyword_count | int(11) | YES | | NULL | |
| all_impressions | int(11) | YES | | NULL | |
| all_clicks | int(11) | YES | | NULL | |
| all_ctr | float | YES | | NULL | |
| all_positions | float | YES | | NULL | |
+---------------------+---------------+------+-----+-------------------+-----------------------------+
The current table size is closing in on 50 million rows.
How can I make this faster?
I'm hoping there is another query or table optimisation I can do - alternatively I could pre-aggregate the data but I'd rather avoid that.

(Your table definition doesn't look like you are really using Postgres, but as you tagged your question with Postgres I'll answer it nevertheless)
One obvious attempt would be to create an index on entrydate, then change your WHERE clause so it can make use of that. When it comes to timestamp columns and a range condition it's usually better to use the "next day" as the upper limit together with < instead of <=
WHERE entrydate >= date '2021-01-01'
AND entrydate < date '2021-02-01'
AND code IS NOT NULL
If the condition AND code IS NOT NULL removes many rows in addition to the date range, you can create a partial index:
create index on country (entrydate)
where code IS NOT NULL;
However, if most of the rows already satisfy code IS NOT NULL, the additional filter in the index won't help very much.
Not performance related, but the conditional aggregation can be written in a bit more compact way using the filter clause:
sum(atdelay) filter (where country_name = 'France') as france
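To illustrate why the half-open range (>= first day, < next day) matters for timestamp columns, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for Postgres, with ISO-formatted text timestamps, and a made-up three-row table; the table contents are assumptions for illustration only. The boundary behaviour is the same idea as in Postgres, where a bare date compared with a timestamp means midnight of that day.

```python
import sqlite3

# In-memory stand-in table; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (entrydate TEXT, code TEXT)")
conn.executemany(
    "INSERT INTO country VALUES (?, ?)",
    [
        ("2021-01-01 00:00:00", "a"),
        ("2021-01-31 15:30:00", "b"),  # last day of January, after midnight
        ("2021-02-01 00:00:00", "c"),  # first instant of February
    ],
)

# Closed range: the upper bound is effectively midnight of Jan 31,
# so the 15:30 row on Jan 31 is silently dropped.
closed = conn.execute(
    "SELECT COUNT(*) FROM country "
    "WHERE entrydate >= '2021-01-01' AND entrydate <= '2021-01-31'"
).fetchone()[0]

# Half-open range: everything before Feb 1 is included, Feb 1 itself is not.
half_open = conn.execute(
    "SELECT COUNT(*) FROM country "
    "WHERE entrydate >= '2021-01-01' AND entrydate < '2021-02-01'"
).fetchone()[0]

print(closed, half_open)
```

Because both versions compare the bare column against constants, an index on entrydate can be used for either; only the half-open version returns all of January.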

Related

One SQL query with multiple conditions

I am running an Oracle database and have two tables below.
#account
+--------+------------+----------+
| acc_id | date       | acc_type |
+--------+------------+----------+
| 1      | 11-07-2018 | customer |
| 2      | 01-11-2018 | customer |
| 3      | 02-09-2018 | employee |
| 4      | 01-09-2018 | customer |
+--------+------------+----------+
#credit_request
+-----------+------------+-------------+--------+---------------+
| credit_id | date       | credit_type | acc_id | credit_amount |
+-----------+------------+-------------+--------+---------------+
| 1112      | 01-08-2018 | failed      | 1      | 2200          |
| 1214      | 02-12-2018 | success     | 2      | 1500          |
| 1312      | 03-11-2018 | success     | 4      | 8750          |
| 1468      | 01-12-2018 | failed      | 2      | 3500          |
+-----------+------------+-------------+--------+---------------+
I want the following for each customer:
the last successful credit_request
sum of credit_amount of all failed credit_requests
Here is one method:
select a.acc_id, acr.num_fails, acr.failed_amount,
acr.num_successes / nullif(acr.num_fails, 0) as ratio, -- seems weird. Why not just the failure rate?
last_cr.credit_id, last_cr.date, last_cr.credit_amount
from account a left join
(select acc_id,
sum(case when credit_type = 'failed' then 1 else 0 end) as num_fails,
sum(case when credit_type = 'failed' then credit_amount else 0 end) as failed_amount,
sum(case when credit_type = 'success' then 1 else 0 end) as num_successes,
max(case when credit_type = 'success' then date end) as max_success_date
from credit_request
group by acc_id
) acr
on acr.acc_id = a.acc_id left join
credit_request last_cr
on last_cr.acc_id = acr.acc_id and last_cr.date = acr.max_success_date;
The following query should do the trick.
SELECT
acc_id,
MAX(CASE WHEN credit_type = 'success' AND rn = 1 THEN credit_id END) as last_successfull_credit_id,
MAX(CASE WHEN credit_type = 'success' AND rn = 1 THEN cdate END) as last_successfull_credit_date,
MAX(CASE WHEN credit_type = 'success' AND rn = 1 THEN credit_amount END) as last_successfull_credit_amount,
SUM(CASE WHEN credit_type = 'failed' THEN credit_amount ELSE 0 END) total_amount_of_failed_credit,
SUM(CASE WHEN credit_type = 'failed' THEN 1 ELSE 0 END) / COUNT(*) ratio_success_request
FROM (
SELECT
a.acc_id,
a.cdate adate,
a.acc_type,
c.credit_id,
c.cdate,
c.credit_type,
c.credit_amount,
ROW_NUMBER() OVER(PARTITION BY c.acc_id, c.credit_type ORDER BY c.cdate DESC) rn
FROM
account a
LEFT JOIN credit_request c ON c.acc_id = a.acc_id
) x
GROUP BY acc_id
ORDER BY acc_id
The subquery assigns a sequence number to each record, within groups of accounts and credit types, using ROW_NUMBER(). The outer query uses conditional aggregation to compute the values you asked for.
This Db Fiddle demo with your test data returns :
ACC_ID | LAST_SUCCESSFULL_CREDIT_ID | LAST_SUCCESSFULL_CREDIT_DATE | LAST_SUCCESSFULL_CREDIT_AMOUNT | TOTAL_AMOUNT_OF_FAILED_CREDIT | RATIO_SUCCESS_REQUEST
-----: | -------------------------: | :--------------------------- | -----------------------------: | ----------------------------: | --------------------:
1 | null | null | null | 2200 | 1
2 | 1214 | 02-DEC-18 | 1500 | 3500 | .5
3 | null | null | null | 0 | 0
4 | 1312 | 03-NOV-18 | 8750 | 0 | 0
This might be what you are looking for... Since you did not show expected results, this might not be 100% accurate, feel free to adapt this.
I guess the below query is easy to understand and implement. Also, to avoid more and more terms in the CASE statements you can just make use of WITH clause and use it in the CASE statements to reduce the query size.
SELECT a.acc_id,
-- DATE is a reserved word in Oracle; quote or rename the column as needed
MAX(CASE WHEN c.credit_type = 'success' THEN c.date END) AS last_success_date,
SUM(CASE WHEN c.credit_type = 'failed' THEN c.credit_amount ELSE 0 END) AS failed_amount,
COUNT(CASE WHEN c.credit_type = 'success' THEN 1 END)
/ NULLIF(COUNT(CASE WHEN c.credit_type = 'failed' THEN 1 END), 0) AS success_to_failed_ratio
FROM account a LEFT JOIN
credit_request c ON
a.acc_id = c.acc_id
WHERE a.acc_type = 'customer'
GROUP BY a.acc_id

How to table date according to date

Given table like:
+---------+------+--------+-----------+--------------+
| Empcode | name | desig | joinmonth | releivemonth |
+---------+------+--------+-----------+--------------+
| 1. | A1. | D1. | Jan-18. | null |
| 2. | A2. | D2. | Jan-18. | May-18 |
| 3. | A3. | D3. | Jan-18. | null |
+---------+------+--------+-----------+--------------+
I want to show table like:
+---------------+--------+--------+--------+--------+--------+
| Remarks | jan-18 | feb-18 | mar-18 | apr-18 | may-18 |
+---------------+--------+--------+--------+--------+--------+
| Joinmonth | 3 | 0 | 0 | 0 | 0 |
| Releivedmonth | 0 | 0 | 0 | 0 | 1 |
+---------------+--------+--------+--------+--------+--------+
You need to unpivot and then re-pivot:
select remarks,
sum(case when mon = 'jan-18' then 1 else 0 end) as jan_18,
sum(case when mon = 'feb-18' then 1 else 0 end) as feb_18,
sum(case when mon = 'mar-18' then 1 else 0 end) as mar_18,
sum(case when mon = 'apr-18' then 1 else 0 end) as apr_18,
sum(case when mon = 'may-18' then 1 else 0 end) as may_18
from t cross apply
(values ('Joinmonth', t.joinmonth), ('Releivedmonth', t.releivemonth)
) v(remarks, mon)
group by remarks
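As a rough, runnable illustration of the unpivot-then-re-pivot idea, the sketch below uses Python's built-in sqlite3 module. SQLite has no CROSS APPLY, so the unpivot step is swapped for a UNION ALL, which is a different technique with the same effect here. The table name t and the cleaned month values (lower case, no trailing period) are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (empcode INTEGER, joinmonth TEXT, releivemonth TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "jan-18", None), (2, "jan-18", "may-18"), (3, "jan-18", None)],
)

query = """
SELECT remarks,
       SUM(CASE WHEN mon = 'jan-18' THEN 1 ELSE 0 END) AS jan_18,
       SUM(CASE WHEN mon = 'feb-18' THEN 1 ELSE 0 END) AS feb_18,
       SUM(CASE WHEN mon = 'mar-18' THEN 1 ELSE 0 END) AS mar_18,
       SUM(CASE WHEN mon = 'apr-18' THEN 1 ELSE 0 END) AS apr_18,
       SUM(CASE WHEN mon = 'may-18' THEN 1 ELSE 0 END) AS may_18
FROM (
    -- unpivot: one row per (employee, kind of month)
    SELECT 'Joinmonth' AS remarks, joinmonth AS mon FROM t
    UNION ALL
    SELECT 'Releivedmonth', releivemonth FROM t
) AS v
GROUP BY remarks
ORDER BY remarks
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Rows with a NULL month still appear in the unpivoted set, but the CASE comparison is never true for them, so they contribute 0 to every column.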
This is an extended comment rather than an answer; please accept that I needed formatting controls before down-voting this.
You appear to have added a query into a comment, although the syntax wasn't fully correct. You have often used standard parentheses () instead of brackets [] and there was a closing parenthesis missing to terminate the IN(). I believe your query should look like this:
SELECT
empname AS remarks
, [1-1-18]
, [1-2-18]
, [1-3-18]
, [1-4-18]
, [1-5-18]
FROM (
SELECT
empname
, joinmonth
, releivedmonth
FROM emply
) AS s
PIVOT (
COUNT(releivedmonth)
FOR joinmonth IN ([1-1-18], [1-2-18], [1-3-18], [1-4-18], [1-5-18])
) piv
You should not attempt to add queries to comments, instead just edit the question.
In this query you refer to values that look like 1-1-18 but in the sample of data there is nothing that looks like that at all. What data type is the column [joinmonth] and [releivedmonth]?
If the data in those columns is text, you have a substantial problem. For example, Jan-18., Jan 18 and Jan-18 are all different values, so they would not line up in the same column as you need them to. Variations like this in the data will make the pivot impossible.
CREATE TABLE emply(
Empcode NUMERIC(9,0)
,empname VARCHAR(6)
,desig VARCHAR(8)
,joinmonth varchar(30)
,releivemonth varchar(30)
);
INSERT INTO emply(Empcode,empname,desig,joinmonth,releivemonth) VALUES (1.,'A1.','D1.','Jan-18.',NULL);
INSERT INTO emply(Empcode,empname,desig,joinmonth,releivemonth) VALUES (2.,'A2.','D2.','Jan-18.','May 18');
INSERT INTO emply(Empcode,empname,desig,joinmonth,releivemonth) VALUES (3.,'A3.','D3.','Jan-18.',NULL);
SELECT
empname AS remarks
, [Jan-18.]
, [Feb-18.]
, [Mar-18.]
, [Apr-18.]
, [May-18.]
FROM (
SELECT
empname
, joinmonth
, releivemonth
FROM emply
) AS s
PIVOT (
COUNT(releivemonth)
FOR joinmonth IN ([Jan-18.], [Feb-18.], [Mar-18.], [Apr-18.], [May-18.])
) piv
The output from this however is:
+----+---------+---------+---------+---------+---------+---------+
| | remarks | Jan-18. | Feb-18. | Mar-18. | Apr-18. | May-18. |
+----+---------+---------+---------+---------+---------+---------+
| 1 | A1. | 0 | 0 | 0 | 0 | 0 |
| 2 | A2. | 1 | 0 | 0 | 0 | 0 |
| 3 | A3. | 0 | 0 | 0 | 0 | 0 |
+----+---------+---------+---------+---------+---------+---------+
Only one row has a non-null releivemonth, so COUNT(releivemonth) is non-zero in just one cell.

Aggregation for multiple SQL SELECT statements

I've got a table TABLE1 like this:
|--------------|--------------|--------------|
| POS | TYPE | VOLUME |
|--------------|--------------|--------------|
| 1 | A | 34 |
| 2 | A | 2 |
| 1 | A | 12 |
| 3 | B | 200 |
| 4 | C | 1 |
|--------------|--------------|--------------|
I want to get something like this (TABLE2):
|--------------|--------------|--------------|--------------|--------------|
| POS | Amount_A | Amount_B | Amount_C | Sum_Volume |
|--------------|--------------|--------------|--------------|--------------|
| 1 | 2 | 0 | 0 | 46 |
| 2 | 1 | 0 | 0 | 2 |
| 3 | 0 | 1 | 0 | 200 |
| 4 | 0 | 0 | 1 | 1 |
|--------------|--------------|--------------|--------------|--------------|
My Code so far is:
SELECT
(SELECT COUNT(TYPE)
FROM TABLE1
WHERE TYPE = 'A') AS [Amount_A]
,(SELECT COUNT(TYPE)
FROM TABLE1
WHERE TYPE = 'B') AS [Amount_B]
,(SELECT COUNT(TYPE)
FROM TABLE1
WHERE TYPE = 'C') AS [Amount_C]
,(SELECT SUM(VOLUME)
FROM TABLE1) AS [Sum_Volume]
INTO [TABLE2]
Now two Questions:
How can I include the distinction concerning POS?
Is there any better way to count each TYPE?
I am using MSSQLServer.
What you're looking for is to use GROUP BY, along with your Aggregate functions. So, this results in:
USE Sandbox;
GO
CREATE TABLE Table1 (Pos tinyint, [Type] char(1), Volume smallint);
INSERT INTO Table1
VALUES (1,'A',34 ),
(2,'A',2 ),
(1,'A',12 ),
(3,'B',200),
(4,'C',1 );
GO
SELECT Pos,
COUNT(CASE WHEN [Type] = 'A' THEN [Type] END) AS Amount_A,
COUNT(CASE WHEN [Type] = 'B' THEN [Type] END) AS Amount_B,
COUNT(CASE WHEN [Type] = 'C' THEN [Type] END) AS Amount_C,
SUM(Volume) As Sum_Volume
FROM Table1 T1
GROUP BY Pos;
DROP TABLE Table1;
GO
If you have a variable, undefined number of values for [Type], then you're most likely going to need to use dynamic SQL.
Your first column should be POS, and you'll GROUP BY POS.
This will give you one row for each POS value, and aggregate (COUNT and SUM) accordingly.
You can also use CASE statements instead of subselects. For instance, instead of:
(SELECT COUNT(TYPE)
FROM TABLE1
WHERE TYPE = 'A') AS [Amount_A]
use:
COUNT(CASE WHEN TYPE = 'A' then 1 else NULL END) AS [Amount_A]
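For a quick check of the GROUP BY plus COUNT(CASE ...) approach against the sample data from the question, here is a small sketch using Python's built-in sqlite3 module in place of SQL Server. The lower-case table and column names are only a convenience for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (pos INTEGER, type TEXT, volume INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(1, "A", 34), (2, "A", 2), (1, "A", 12), (3, "B", 200), (4, "C", 1)],
)

query = """
SELECT pos,
       COUNT(CASE WHEN type = 'A' THEN 1 END) AS amount_a,  -- NULLs are not counted
       COUNT(CASE WHEN type = 'B' THEN 1 END) AS amount_b,
       COUNT(CASE WHEN type = 'C' THEN 1 END) AS amount_c,
       SUM(volume) AS sum_volume
FROM table1
GROUP BY pos
ORDER BY pos
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The CASE expression yields NULL when the type does not match, and COUNT skips NULLs, which is why no ELSE branch is needed.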

How to combine two results from same table

I'm looking for a way of pulling totals from a transactions table. Sales and return transactions are differentiated by a column, but the value is always stored as a positive.
I've managed to pull the different transaction type totals, grouped by product, as separate rows:
SELECT `type`,
`product`,
sum(`final_price`) AS `value`,
count(`final_price`) AS `count`
GROUP BY `product`, `type`
The result is:
Type | Product | Value | Count
S | 1 | 1000 | 2
S | 4 | 750 | 3
S | 2 | 300 | 2
S | 3 | 10 | 1
R | 1 | 500 | 1
Ideally, I'd like to have the totals displayed on a single row, in additional columns, for ordering purposes. The ideal result would be:
Type | Product | s_value | s_count | r_value | r_count
S | 1 | 1000 | 2 | 500 | 1
S | 4 | 750 | 3 | 0 | 0
S | 2 | 300 | 2 | 0 | 0
S | 3 | 10 | 1 | 0 | 0
I've tried UNION ALL and left joins with no joy so far.
You can use case expressions to differentiate by the type of transaction:
SELECT `product`,
SUM(CASE `type` WHEN 'S' THEN `final_price` END) AS `s_value`,
COUNT(CASE `type` WHEN 'S' THEN `final_price` END) AS `s_count`,
SUM(CASE `type` WHEN 'R' THEN `final_price` END) AS `r_value`,
COUNT(CASE `type` WHEN 'R' THEN `final_price` END) AS `r_count`
GROUP BY `product`
EDIT:
Given the backticks around the column names, I'm assuming this is a MySQL question even though it's not explicitly tagged as such.
If this is the case, you can simplify the count statements by utilizing MySQL's automatic conversion from Boolean to int which takes true as a 1 and false as a 0:
SELECT `product`,
SUM(CASE `type` WHEN 'S' THEN `final_price` END) AS `s_value`,
SUM(`type` = 'S') AS `s_count`,
SUM(CASE `type` WHEN 'R' THEN `final_price` END) AS `r_value`,
SUM(`type` = 'R') AS `r_count`
GROUP BY `product`
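Here is a minimal sketch of the one-row-per-product result, using Python's built-in sqlite3 module as a stand-in for MySQL (SQLite also evaluates a comparison to 1 or 0, so SUM(type = 'S') works the same way). The table name tx and the individual rows are made up for illustration; they are chosen only so that the totals for products 1 and 3 match the question. COALESCE is added so that products without returns show 0 rather than NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tx (type TEXT, product INTEGER, final_price INTEGER)")
conn.executemany(
    "INSERT INTO tx VALUES (?, ?, ?)",
    [("S", 1, 600), ("S", 1, 400), ("R", 1, 500), ("S", 3, 10)],
)

query = """
SELECT product,
       COALESCE(SUM(CASE type WHEN 'S' THEN final_price END), 0) AS s_value,
       SUM(type = 'S') AS s_count,   -- comparison yields 1 or 0
       COALESCE(SUM(CASE type WHEN 'R' THEN final_price END), 0) AS r_value,
       SUM(type = 'R') AS r_count
FROM tx
GROUP BY product                     -- one row per product, not per (product, type)
ORDER BY product
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Grouping by product alone is what puts sales and returns on the same row; adding type to the GROUP BY would split them again.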

Counting by dates in a loop in SQL

I've been working with about 20k records. I don't need all the information, just aggregate totals as snapshots at certain times in each record's history. Luckily, each event has a column that records its date; the date is null when that event never happened to the record. A couple of the stages, however, can only be calculated from other fields. For instance, a stage of "In Progress" is determined by a create date on or before the run date and a submit date that is either null or later than the run date. In pseudocode:
if createDate <= @runDate && (submitDate=null || submitDate > @runDate)
In_Progress_count = In_Progress_count + 1
Any of the other fields are simply counted if the date in the field is less than or equal to the run date, for example:
if approvedDate <= @runDate
Approved_count = Approved_count+1
For example I have data that looks something like this:
+-------------+--------------+--------------+--------------+--------------+----------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
| Application | Applicant | Program | Create Date | Accept |Active Duplicate| Cond. Accept | Defer | Deposited | Divert | Duplicate | Early Quit | Incomplete | Ineligible | Pending | Review | Purge | Reject | Withdraw |
+-------------+--------------+--------------+--------------+--------------+----------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
| 1 | Peg Bundy | Comp-Sci | 2013-08-01 | <null> | <null> | <null> | <null> | <null> | <null> | <null> | <null> | <null> | <null> | <null> | <null> | <null> | <null> | <null> |
| 2 | Marcy Darcy | Comp-Sci | 2013-08-25 | 2013-09-05 | <null> | <null> | <null> | 2013-09-30 | <null> | <null> | <null> | 2013-08-30 | <null> | <null> | <null> | <null> | 2013-10-01 | <null> |
| 3 | Al Bundy | Language | 2013-09-01 | 2013-09-05 | <null> | <null> | <null> | 2013-09-27 | <null> | <null> | <null> | 2013-09-05 | <null> | <null> | <null> | <null> | <null> | 2013-09-27 |
+-------------+--------------+--------------+--------------+--------------+----------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
I'm trying to get a result for a query that looks like this if run with '2013-09-26' as the @runDate:
+---------------+--------------+--------------+----------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
| Program Name | totalApps | countAccept |ActivDuplicates | countCondAccept | countDefer | countDeposited | countDivert | countDuplicate | countEarlyQuit | countIncomplete | countIneligible | countPending | countReview | countPurge | countReject | countWithdraw |
+---------------+--------------+--------------+----------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
| Comp-Sci | 2 | 1 | 0 | 0 | <null> | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| Language | 1 | 1 | 0 | 0 | <null> | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
+---------------+--------------+--------------+----------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
What I've tried so far is to count by date on each of the columns, but I'm getting the wrong totals because I only know how to look at one column to assess the date. Basically, it counts everything that's not null, even dates past the run date I'm interested in:
SELECT Programs_Name,
Reported_Application_Stage,
count(Reported_Application_Stage) AS AppStageTotal,
count(SubmitDate) AS AppSubmitted,
count(Application_Accept_Date) AS AcceptDate,
count(Deposit_Paid_Date) AS Deposited,
count(Defer_Date) AS Deferred,
count(Deny_Date) AS Denied,
count(Divert_Date) AS Divert,
count(Early_Quit) AS EarlyQuit,
count(Ineligible_Date) AS Ineligible,
count(Purge_Date) AS Purged
FROM ExtractApplications
WHERE (Report_Date1='2013-09-27')
GROUP BY ExtractSnapshots.Report_Date1, ExtractSnapshots.id, .Programs, .Reported_Application_Stage, _Program, _Start_Term_Year, _Start_Term, _Decision_Display_Value;
I can easily get the values for any single stage by date using this, and they're correct:
SELECT Programs_Name,
count(Defer_Date) AS Deferred
FROM ExtractApplication
WHERE Defer_Date <='2013-09-26'
GROUP BY Programs_Name;
The problem being that I have about 100 dates that I have to use, and about 15 stages that I'm looking for, and I can't really sit and run 1500 queries one at a time for the next week or so without getting fired :P
So what I'm trying to do is find the right query to count each field. I honestly don't know how to use the count() function with the kind of conditions I need: I tried count(someField<'2013-09-27') and it didn't work. I also don't know how to compute the "In Progress" count, which relies on createDate combined with a null or later date in the submitDate field.
To top all of that off, I need to put it into a loop that runs this for the first, eighth, fifteenth, and twenty-second of each month over the last few years, and running a loop in SQL is something I don't know how to do. If it were Java I would just nest for loops that run off array sizes, like:
for (i=0; i<year.length;i++) {
for (j=1; j<13; j++) {
for (k=0; k<setDays.length; k++) {
runDate=year[i]+'-'+j+'-'+setDays[k];
}
}
}
(I only include that because that's how I think of this happening contextually as I'm a PHP/Java programmer mainly and not a database admin)
I could really use some help here as I'm at a loss of what to do and I've spent a ton of time working on this already.
Assuming this is SQL Server, and not Access...
This should get you going in the right direction. This is effectively what @DaveJohnson suggested, with a twist in that it only counts each column if the date is on or before @RunDate (and not null).
DECLARE @RunDate DATE
SET @RunDate = '2013-12-01'
DECLARE @DATA TABLE (AppID INT,Applicant VARCHAR(100),Program VARCHAR(100),CreateDate DATE,Accept DATE,ActiveDuplicate DATE,CondAccept DATE,Defer DATE,Deposited DATE,Divert DATE,Duplicate DATE,EarlyQuit DATE,Incomplete DATE,Ineligible DATE,Pending DATE,Review DATE,Purge DATE,Reject DATE,Withdraw DATE)
INSERT INTO @DATA
SELECT 1,'Peg Bundy','Comp-Sci','2013-08-01',NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL
UNION ALL
SELECT 2,'Marcy Darcy','Comp-Sci','2013-08-25','2013-09-05',NULL,NULL,NULL,'2013-09-30',NULL,NULL,NULL,'2013-08-30',NULL,NULL,NULL,NULL,'2013-10-01',NULL
UNION ALL
SELECT 3,'Al Bundy','Language','2013-09-01','2013-09-05',NULL,NULL,NULL,'2013-09-27',NULL,NULL,NULL,'2013-09-05',NULL,NULL,NULL,NULL,NULL,'2013-09-27'
SELECT Program
, SUM(CASE WHEN CreateDate IS NULL OR CreateDate>@RunDate THEN 0 ELSE 1 END) AS CreateDate
, SUM(CASE WHEN Accept IS NULL OR Accept>@RunDate THEN 0 ELSE 1 END) AS Accept
, SUM(CASE WHEN ActiveDuplicate IS NULL OR ActiveDuplicate>@RunDate THEN 0 ELSE 1 END) AS ActiveDuplicate
, SUM(CASE WHEN CondAccept IS NULL OR CondAccept>@RunDate THEN 0 ELSE 1 END) AS CondAccept
, SUM(CASE WHEN Defer IS NULL OR Defer>@RunDate THEN 0 ELSE 1 END) AS Defer
, SUM(CASE WHEN Deposited IS NULL OR Deposited>@RunDate THEN 0 ELSE 1 END) AS Deposited
, SUM(CASE WHEN Divert IS NULL OR Divert>@RunDate THEN 0 ELSE 1 END) AS Divert
, SUM(CASE WHEN Duplicate IS NULL OR Duplicate>@RunDate THEN 0 ELSE 1 END) AS Duplicate
, SUM(CASE WHEN EarlyQuit IS NULL OR EarlyQuit>@RunDate THEN 0 ELSE 1 END) AS EarlyQuit
, SUM(CASE WHEN Incomplete IS NULL OR Incomplete>@RunDate THEN 0 ELSE 1 END) AS Incomplete
, SUM(CASE WHEN Ineligible IS NULL OR Ineligible>@RunDate THEN 0 ELSE 1 END) AS Ineligible
, SUM(CASE WHEN Pending IS NULL OR Pending>@RunDate THEN 0 ELSE 1 END) AS Pending
, SUM(CASE WHEN Review IS NULL OR Review>@RunDate THEN 0 ELSE 1 END) AS Review
, SUM(CASE WHEN Purge IS NULL OR Purge>@RunDate THEN 0 ELSE 1 END) AS Purge
, SUM(CASE WHEN Reject IS NULL OR Reject>@RunDate THEN 0 ELSE 1 END) AS Reject
, SUM(CASE WHEN Withdraw IS NULL OR Withdraw>@RunDate THEN 0 ELSE 1 END) AS Withdraw
FROM @DATA
GROUP BY Program
Try using a conditional CASE WHEN construct within your aggregation. Also, avoid looping in SQL for your dates as SQL Server is not optimized for this. You can build a date range and then join to that for an efficient set-based solution.
This is a SQL Server (2005+) only answer.
ex:
WITH [cte] AS
(
SELECT
[date]
FROM ( -- build date range
SELECT TOP (DATEDIFF(DAY,0,GETDATE())) -- avoid overflow
DATEADD(DAY,-1 * ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),CAST(GETDATE() AS DATE)) [Date]
FROM sys.all_objects O1
CROSS JOIN sys.all_objects O2 -- if you need LOTS of days
) A
WHERE [date] BETWEEN '01 Jan 2010' AND GETDATE() -- set these accordingly
AND DAY([date]) IN (1,8,15,22)
)
SELECT
B.[date] [RunDate],
[Programs_Name],
SUM(CASE WHEN [SubmitDate] <= B.[date] THEN 1 ELSE 0 END) [AppSubmitted],
SUM(CASE WHEN [Application_Accept_Date] <= B.[date] THEN 1 ELSE 0 END) [AcceptDate],
...
FROM ExtractApplications A
CROSS JOIN [cte] B
GROUP BY B.[date], [Programs_Name]
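The set-based idea (cross join the applications with a list of run dates, then count conditionally per date) can be sketched with Python's built-in sqlite3 module standing in for SQL Server. The table name applications, the shortened column list, and the two hard-coded run dates are assumptions for illustration; the three rows come from the sample data in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE applications "
    "(app_id INTEGER, program TEXT, create_date TEXT, accept TEXT, deposited TEXT)"
)
conn.executemany(
    "INSERT INTO applications VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Comp-Sci", "2013-08-01", None, None),
        (2, "Comp-Sci", "2013-08-25", "2013-09-05", "2013-09-30"),
        (3, "Language", "2013-09-01", "2013-09-05", "2013-09-27"),
    ],
)

query = """
WITH run_dates(run_date) AS (VALUES ('2013-09-01'), ('2013-10-01'))
SELECT r.run_date,
       a.program,
       -- a NULL date never satisfies <=, so it falls through to ELSE 0
       SUM(CASE WHEN a.create_date <= r.run_date THEN 1 ELSE 0 END) AS total_apps,
       SUM(CASE WHEN a.accept      <= r.run_date THEN 1 ELSE 0 END) AS count_accept,
       SUM(CASE WHEN a.deposited   <= r.run_date THEN 1 ELSE 0 END) AS count_deposited
FROM applications a
CROSS JOIN run_dates r
GROUP BY r.run_date, a.program
ORDER BY r.run_date, a.program
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each run date produces its own set of rows, so a single query replaces the loop over dates; grouping by the run date as well as the program is what keeps the snapshots separate.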
OK, sorry that I derped on the "sql-server" tags, guys. I appreciate the help, but I figured it out.
Instead of SUM(CASE WHEN field=x THEN 1 ELSE 0 END), I found that the equivalent in Access is basically SUM(IIF(field=x, 1, 0)), thanks to LittleBobbyTables (fantastic username) in the thread "getting sum using sql with multiple conditions".
So what I was looking for in the combined field is SUM(IIF(createDate<=@myDate AND (submitDate>@myDate OR submitDate IS NULL),1,0)), and the rest of the columns work via SUM(IIF(column<=@myDate, 1, 0)).
Thanks again guys!