How to Count Distinct on Case When? - sql

I have been building up a query today and I have got stuck. I have two unique IDs that identify whether an order is Internal or Web. I have been able to split this out so that it counts how many times each appears, but unfortunately it does not give me the intended result. From research, I have tried creating a Count Distinct Case When statement to get the results I need.
Please see below where I have broken down what it is doing and how I expect it to be.
Original data looks like:
Company Name   Order Date   Order Items   Orders   Value   REF
--------------------------------------------------------------
CompanyA       03/01/2019   Item1         Order1   170     INT1
CompanyA       03/01/2019   Item2         Order1   0       INT1
CompanyA       03/01/2019   Item3         Order2   160     WEB2
CompanyA       03/01/2019   Item4         Order2   0       WEB2
How I expect it to be:
Company Name   Order Date   Order Items   Orders   Value   WEB   INT
--------------------------------------------------------------------
CompanyA       03/01/2019   4             2        330     1     1
What currently comes out
Company Name   Order Date   Order Items   Orders   Value   WEB   INT
--------------------------------------------------------------------
CompanyA       03/01/2019   4             2        330     2     2
As you can see from my current result, it is counting every line even though the reference is the same. It is not a hard and fast rule that the count is always doubled, which is why I think I need a Count Distinct Case When. Below is the query I am currently using. It pulls from a Progress V10 ODBC source that I connect to through Excel. Unfortunately I do not have SSMS, and Microsoft Query is just useless.
My Current SQL:
SELECT
Company_0.CoaCompanyName
, SopOrder_0.SooOrderDate
, Count(DISTINCT SopOrder_0.SooOrderNumber) AS 'Orders'
, SUM(CASE WHEN SopOrder_0.SooOrderNumber IS NOT NULL THEN 1 ELSE 0 END) AS 'Order Items'
, SUM(SopOrderItem_0.SoiValue) AS 'Order Value'
, SUM(CASE WHEN SopOrder_0.SooParentOrderReference LIKE 'INT%' THEN 1 ELSE 0 END) AS 'INT'
, SUM(CASE WHEN SopOrder_0.SooParentOrderReference LIKE 'WEB%' THEN 1 ELSE 0 END) AS 'WEB'
FROM
SBS.PUB.Company Company_0
, SBS.PUB.SopOrder SopOrder_0
, SBS.PUB.SopOrderItem SopOrderItem_0
WHERE
SopOrder_0.SopOrderID = SopOrderItem_0.SopOrderID
AND Company_0.CompanyID = SopOrder_0.CompanyID
AND SopOrder_0.SooOrderDate > '2019-01-01'
GROUP BY
Company_0.CoaCompanyName
, SopOrder_0.SooOrderDate
I have tried using the following line but it errors on me when importing:
, Count(DISTINCT CASE WHEN SopOrder_0.SooParentOrderReference LIKE 'INT%' THEN SopOrder_0.SooParentOrderReference ELSE 0 END) AS 'INT'
Just so you know, the error I get when importing at the moment is: syntax error at or about "CASE WHEN sopOrder_0.SooParentOrderRefer" (10713)

Try removing the ELSE:
COUNT(DISTINCT CASE WHEN SopOrder_0.SooParentOrderReference LIKE 'INT%' THEN SopOrder_0.SooParentOrderReference END) AS num_int
You don't specify the error, but the problem is probably that the THEN is returning a string and the ELSE a number -- so there is an attempt to convert the string values to a number.
Also, learn to use proper, explicit, standard JOIN syntax. Simple rule: Never use commas in the FROM clause.
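To see why dropping the ELSE matters, here is a minimal sketch that replays the question's four sample rows. It assumes an in-memory SQLite database (via Python's standard sqlite3 module) as a stand-in for the Progress source, and the table and column names (order_lines, ref, and so on) are simplified, hypothetical ones rather than the real schema.

```python
import sqlite3


def count_refs():
    """Return (line count for INT, distinct INT refs, distinct WEB refs)."""
    # In-memory SQLite is only a stand-in; the real data lives in Progress.
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE order_lines "
        "(company TEXT, item TEXT, order_no TEXT, value INTEGER, ref TEXT)"
    )
    con.executemany(
        "INSERT INTO order_lines VALUES (?, ?, ?, ?, ?)",
        [
            ("CompanyA", "Item1", "Order1", 170, "INT1"),
            ("CompanyA", "Item2", "Order1", 0, "INT1"),
            ("CompanyA", "Item3", "Order2", 160, "WEB2"),
            ("CompanyA", "Item4", "Order2", 0, "WEB2"),
        ],
    )
    row = con.execute(
        """
        SELECT
            -- Counts every matching line, so a two-line order counts twice.
            SUM(CASE WHEN ref LIKE 'INT%' THEN 1 ELSE 0 END) AS int_lines,
            -- No ELSE: non-matching rows yield NULL, which COUNT ignores.
            COUNT(DISTINCT CASE WHEN ref LIKE 'INT%' THEN ref END) AS num_int,
            COUNT(DISTINCT CASE WHEN ref LIKE 'WEB%' THEN ref END) AS num_web
        FROM order_lines
        GROUP BY company
        """
    ).fetchone()
    con.close()
    return row


print(count_refs())
```

The SUM version reports 2 for the single INT order, while the COUNT(DISTINCT ...) version reports 1 for each reference type. Whether the Progress ODBC driver accepts this syntax is a separate question that this sketch cannot answer.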

count distinct on the SooOrderNumber or the SooParentOrderReference, whichever makes more sense for you.
If you are COUNTing, you need to make NULL the thing that you are not counting. I prefer to include an ELSE in the CASE because it is more consistent and complete.
, Count(DISTINCT CASE WHEN SopOrder_0.SooParentOrderReference LIKE 'INT%' THEN SopOrder_0.SooParentOrderReference ELSE null END) AS 'INT'
Gordon Linoff is correct regarding the source of your error, i.e. datatype mismatch between the case then value else value end. null removes (should remove) this ambiguity - I'd need to double check.
Editing my earlier answer...
Even though it looks, as you say, like count distinct is not supported in Pervasive PSQL, CTEs are supported. So you can do something like...
This is what you are trying to do but it is not supported...
with
dups as
(
select 1 as id, 'A' as col1 union all select 1, 'A' union all select 1, 'B' union all select 2, 'B'
)
select id
,count(distinct col1) as col_count
from dups
group by id;
Stick another CTE in the query to de-duplicate the data first. Then count as normal. That should work...
with
dups as
(
select 1 as id, 'A' as col1 union all select 1, 'A' union all select 1, 'B' union all select 2, 'B'
)
,de_dup as
(
select id
,col1
from dups
group by id
,col1
)
select id
,count(col1) as col_count
from de_dup
group by id;
These 2 versions should give the same result set.
There is always a way!!
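The claim that the two versions return the same result set can be checked on an engine that supports both forms. The sketch below assumes in-memory SQLite (through Python's sqlite3 module) purely as a test bed; it says nothing about which form the original database accepts.

```python
import sqlite3

# The same four sample rows used in both queries above.
DUPS = (
    "select 1 as id, 'A' as col1 union all select 1, 'A' "
    "union all select 1, 'B' union all select 2, 'B'"
)

# Version 1: count distinct values directly.
DIRECT = f"""
with dups as ({DUPS})
select id, count(distinct col1) as col_count
from dups
group by id
order by id
"""

# Version 2: de-duplicate in a second CTE, then use a plain count.
DEDUP = f"""
with dups as ({DUPS}),
de_dup as (select id, col1 from dups group by id, col1)
select id, count(col1) as col_count
from de_dup
group by id
order by id
"""


def run(sql):
    """Execute a query against a throwaway in-memory database."""
    con = sqlite3.connect(":memory:")
    try:
        return con.execute(sql).fetchall()
    finally:
        con.close()


print(run(DIRECT))
print(run(DEDUP))
```

Both queries return the same rows: id 1 has two distinct values (A and B) and id 2 has one.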

I cannot explain the error you are getting. You are mistakenly using single quotes for alias names, but I don't actually think this is causing the error.
Anyway, I suggest you aggregate your order items per order first and only join then:
SELECT
c.coacompanyname
, so.sooorderdate
, COUNT(*) AS orders
, SUM(soi.itemcount) AS order_items
, SUM(soi.ordervalue) AS order_value
, COUNT(CASE WHEN so.sooparentorderreference LIKE 'INT%' THEN 1 END) AS int
, COUNT(CASE WHEN so.sooparentorderreference LIKE 'WEB%' THEN 1 END) AS web
FROM sbs.pub.company c
JOIN sbs.pub.soporder so ON so.companyid = c.companyid
JOIN
(
SELECT soporderid, COUNT(*) AS itemcount, SUM(soivalue) AS ordervalue
FROM sbs.pub.soporderitem
GROUP BY soporderid
) soi ON soi.soporderid = so.soporderid
GROUP BY c.coacompanyname, so.sooorderdate
ORDER BY c.coacompanyname, so.sooorderdate;
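A small sketch of the aggregate-then-join idea, run against the question's sample data. It assumes in-memory SQLite via Python's sqlite3 module, and the three tables and their columns are simplified, hypothetical versions of Company, SopOrder and SopOrderItem.

```python
import sqlite3


def summarise_orders():
    """Aggregate items per order first, then join and count orders."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE company (company_id INTEGER, name TEXT);
        CREATE TABLE soporder (order_id INTEGER, company_id INTEGER,
                               order_date TEXT, ref TEXT);
        CREATE TABLE soporderitem (order_id INTEGER, value INTEGER);
        INSERT INTO company VALUES (1, 'CompanyA');
        INSERT INTO soporder VALUES (1, 1, '2019-01-03', 'INT1'),
                                    (2, 1, '2019-01-03', 'WEB2');
        INSERT INTO soporderitem VALUES (1, 170), (1, 0), (2, 160), (2, 0);
        """
    )
    rows = con.execute(
        """
        SELECT c.name,
               so.order_date,
               COUNT(*) AS orders,
               SUM(soi.item_count) AS order_items,
               SUM(soi.order_value) AS order_value,
               COUNT(CASE WHEN so.ref LIKE 'INT%' THEN 1 END) AS num_int,
               COUNT(CASE WHEN so.ref LIKE 'WEB%' THEN 1 END) AS num_web
        FROM company c
        JOIN soporder so ON so.company_id = c.company_id
        JOIN (
            -- One row per order, so the outer query sees each order once.
            SELECT order_id, COUNT(*) AS item_count, SUM(value) AS order_value
            FROM soporderitem
            GROUP BY order_id
        ) soi ON soi.order_id = so.order_id
        GROUP BY c.name, so.order_date
        """
    ).fetchall()
    con.close()
    return rows


print(summarise_orders())
```

Because the item table is collapsed to one row per order before the join, a plain COUNT is enough and no DISTINCT is needed.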

Related

pivot table returns more than 1 row for the same ID

I have a sql code which I am using to do pivot. Code is as follows:
SELECT DISTINCT PersonID
,MAX(pivotColumn1)
,MAX(pivotColumn2) --originally these were in 2 separate rows
FROM (SELECT srcID, PersonID, detailCode, detailValue FROM src) AS SrcTbl
PIVOT(MAX(detailValue) FOR detailCode IN ([pivotColumn1],[pivotColumn2])) pvt
GROUP BY PersonID
In the source data, the ID has 2 separate rows because each row has its own ID, which separates the values. I have now pivoted it and it's still giving me 2 separate rows for the ID, even though I grouped it and used aggregation on the pivot columns. Any idea what's wrong with the code?
So I have all my possible detailCode listed in the IN clause. So I have null returned when the value is none but I want it all summarised in 1 row. See image below.
If those are all the options of detailCode , you can use conditional aggregation with CASE EXPRESSION instead of Pivot:
SELECT t.personID,
MAX(CASE WHEN t.detailCode = 'cas' then t.detailValue END) as cas,
MAX(CASE WHEN t.detailCode = 'buy' then t.detailValue END) as buy,
MAX(CASE WHEN t.detailCode = 'sel' then t.detailValue END) as sel,
MAX(CASE WHEN t.detailCode = 'pla' then t.detailValue END) as pla
FROM YourTable t
GROUP BY t.personID
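As a rough illustration of the conditional aggregation above, the sketch below runs the same pattern with two different GROUP BY lists. It assumes in-memory SQLite via Python's sqlite3 module, and the two sample rows and their values are invented for the example.

```python
import sqlite3


def pivot_rows(group_by):
    """Pivot detailCode into columns, grouping by the given column list."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE src "
        "(srcID INTEGER, personID INTEGER, detailCode TEXT, detailValue TEXT)"
    )
    # Hypothetical data: one person, two source rows with different srcIDs.
    con.executemany(
        "INSERT INTO src VALUES (?, ?, ?, ?)",
        [(1, 10, "cas", "100"), (2, 10, "buy", "200")],
    )
    # group_by is a fixed constant chosen by the caller, never user input.
    sql = f"""
        SELECT personID,
               MAX(CASE WHEN detailCode = 'cas' THEN detailValue END) AS cas,
               MAX(CASE WHEN detailCode = 'buy' THEN detailValue END) AS buy
        FROM src
        GROUP BY {group_by}
        ORDER BY {group_by}
    """
    rows = con.execute(sql).fetchall()
    con.close()
    return rows


print(pivot_rows("personID"))         # one row per person
print(pivot_rows("srcID, personID"))  # one row per source row
```

Grouping by personID alone collapses the person into a single row. Adding srcID to the grouping brings back the two half-empty rows, which is effectively what happens with PIVOT in SQL Server when srcID is left in the source query, since PIVOT groups by every source column that is not pivoted or aggregated.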

Constructing A Query In BigQuery With CASE Statements

So I'm trying to construct a query in BigQuery that I'm struggling with for a final part.
As of now I have:
SELECT
UNIQUE(Name) as SubscriptionName,
ID,
Interval,
COUNT(mantaSubscriptionIdmetadata) AS SubsPurchased,
SUM(RevenueGenerated) as RevenueGenerated
FROM (
SELECT
mantaSubscriptionIdmetadata,
planIdmetadata,
INTEGER(Amount) as RevenueGenerated
FROM
[sample_internal_data.charge0209]
WHERE
revenueSourcemetadata = 'new'
AND
Status = 'Paid'
GROUP BY
mantaSubscriptionIdmetadata,
planIdmetadata,
RevenueGenerated
)a
JOIN (
SELECT
id,
Name,
Interval
FROM
[sample_internal_data.subplans]
WHERE
id in ('150017','150030','150033','150019')
GROUP BY
id,
Name,
Interval )b
ON
a.planIdmetadata = b.id
GROUP BY
ID,
Interval,
Name
ORDER BY
Interval ASC
The resulting query looks like this
Which is exactly what I'm looking for up to that point.
Now here is what I'm stuck on. There is another column I need to add called SalesRepName. The resulting field will either be null or not null. If it's null, it means it was sold online. If it's not null, it means it was sold via telephone. What I want to do is create two additional columns that say how many were sold via telesales and how many online. The sum of the two columns will always equal the SubsPurchased total.
Can anyone help?
You can include case statements within aggregate functions. Here you could choose sum(case when SalesRepName is null then 1 else 0 end) as online and sum(case when SalesRepName is not null then 1 else 0 end) as telesales.
count(case when SalesRepName is null then 1 end) as online would give the same result. Using sum in these situations is simply my personal preference.
Note that omitting the else clause is equivalent to setting else null, and null isn't counted by count. This can be very useful in combination with exact_count_distinct, which has no equivalent in terms of sum.
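The equivalence between the SUM and COUNT forms, and the point about NULL not being counted, can be sketched as follows. This assumes in-memory SQLite through Python's sqlite3 module rather than BigQuery, a hypothetical one-column charges table, and plain COUNT(DISTINCT ...) in place of exact_count_distinct.

```python
import sqlite3


def sales_split():
    """Return (online via SUM, online via COUNT, telesales, distinct reps)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE charges (SalesRepName TEXT)")
    # Hypothetical rows: NULL means sold online, a name means telesales.
    con.executemany(
        "INSERT INTO charges VALUES (?)",
        [(None,), ("amy",), (None,), ("bob",), ("amy",)],
    )
    row = con.execute(
        """
        SELECT
            SUM(CASE WHEN SalesRepName IS NULL THEN 1 ELSE 0 END) AS online_sum,
            -- No ELSE means ELSE NULL, and COUNT skips NULLs.
            COUNT(CASE WHEN SalesRepName IS NULL THEN 1 END) AS online_count,
            SUM(CASE WHEN SalesRepName IS NOT NULL THEN 1 ELSE 0 END) AS telesales,
            -- A distinct count has no SUM equivalent.
            COUNT(DISTINCT CASE WHEN SalesRepName IS NOT NULL
                                THEN SalesRepName END) AS distinct_reps
        FROM charges
        """
    ).fetchone()
    con.close()
    return row


print(sales_split())
```

With these five rows, both online counts are 2, telesales is 3, and the two counts add up to the total number of rows, as the question requires.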
Try the query below. It assumes your SalesRepName field is in the [sample_internal_data.charge0209] table, and it uses a "tiny version" of SUM(CASE WHEN ...) that works when the value being summed is 0 or 1:
SUM(SalesRepName IS NULL) AS onlinesales,
SUM(NOT SalesRepName IS NULL) AS telesales
SELECT
UNIQUE(Name) AS SubscriptionName,
ID,
Interval,
COUNT(mantaSubscriptionIdmetadata) AS SubsPurchased,
SUM(RevenueGenerated) AS RevenueGenerated,
SUM(SalesRepName IS NULL) AS onlinesales,
SUM(NOT SalesRepName IS NULL) AS telesales
FROM (
SELECT SalesRepName, mantaSubscriptionIdmetadata, planIdmetadata, INTEGER(Amount) AS RevenueGenerated
FROM [sample_internal_data.charge0209]
WHERE revenueSourcemetadata = 'new'
AND Status = 'Paid'
GROUP BY SalesRepName, mantaSubscriptionIdmetadata, planIdmetadata, RevenueGenerated
)a
JOIN (
SELECT id, Name, Interval
FROM [sample_internal_data.subplans]
WHERE id IN ('150017','150030','150033','150019')
GROUP BY id, Name, Interval
)b
ON a.planIdmetadata = b.id
GROUP BY ID, Interval, Name
ORDER BY Interval ASC

Not a GROUP BY Expression & aggregate functions

I was wondering why, for the query below, I have to wrap the CASE expressions in the MAX() aggregate function rather than using the CASE expressions directly:
select
bank_id,
tran_branch_code,
acct_sol_id,
acct_sol_name,
transaction_date,
gl_date,
transaction_id,
account_number,
max(case
when cast(substr(GLSH_Code,0,1) as int) >= 1
and cast(substr(GLSH_Code,0,1) as int) <= 5
and trans_type = 'D'
then (trans_amount)
--else 0
end ) Ind_Part_Tran_Dr_RBU,
max(case
when cast(substr(GLSH_Code,0,1) as int) >= 1
and cast(substr(GLSH_Code,0,1) as int) <= 5
and trans_type = 'C'
then (trans_amount)
--else 0
end) Ind_Part_Tran_Cr_RBU,
max(case
when cast(substr(GLSH_Code,0,1) as int) = 0
or (cast(substr(GLSH_Code,0,1) as int) >= 6
and cast(substr(GLSH_Code,0,1) as int) <= 9)
and trans_type = 'D'
then (trans_amount)
--else 0
end)Ind_Part_Tran_Dr_FCDU,
max(case
when cast(substr(GLSH_Code,0,1) as int) = 0
or (cast(substr(GLSH_Code,0,1) as int) >= 6
and cast(substr(GLSH_Code,0,1) as int) <= 9)
and trans_type = 'C'
then (trans_amount)
--else 0
end) Ind_Part_Tran_Cr_FCDU,
ccy_alias,
ccy_name,
acct_currency,
tran_currency
from
(
SELECT
DTD.BANK_ID,
DTD.SOL_ID Acct_Sol_ID, --Account Sol ID
dtd.br_code Tran_branch_code, -- branch code of the transacting branch
sol.sol_desc Acct_sol_name, -- name/description of SOL
DTD.TRAN_DATE Transaction_Date, --TransactionDate
DTD.GL_DATE GL_Date, --GL Date
TRIM(DTD.TRAN_ID) Transaction_ID, --Transaction ID
DTD.GL_SUB_HEAD_CODE GLSH_Code, --GLSH Code
dtd.tran_amt trans_amount,
GAM.ACCT_CRNCY_CODE Acct_Currency, --Account Currency
DTD.TRAN_CRNCY_CODE Tran_Currency, --Transaction Currency
cnc.crncy_alias_num ccy_alias,
cnc.crncy_name ccy_name,
GAM.FORACID Account_Number, --Account Number
DTD.TRAN_PARTICULAR Transaction_Particulars, --Transaction Particulars
DTD.CRNCY_CODE DTD_CCY,
--GSH.CRNCY_CODE GSH_CCY,
DTD.PART_TRAN_TYPE Transaction_Code,
--'Closing_Balance',
DTD.PSTD_USER_ID PostedBy,
CASE WHEN DTD.REVERSAL_DATE IS NOT NULL
THEN 'Y' ELSE 'N' END Reversal,
TRIM(DTD.TRAN_ID) REV_ORIG_TRAN_ID,
--OTT.REF_NUM OAP_REF_NUM,
'OAP_SETTLEMENT',
'RATE_CODE',
EAB.EOD_DATE
FROM TBAADM.DTD
LEFT OUTER JOIN TBAADM.GAM ON DTD.ACID = GAM.ACID AND DTD.BANK_ID = GAM.BANK_ID
LEFT OUTER JOIN TBAADM.EAB ON DTD.ACID = EAB.ACID AND DTD.BANK_ID = EAB.BANK_ID AND EAB.EOD_DATE = '24-MAR-2014'
left outer join tbaadm.sol on dtd.sol_id = sol.sol_id and dtd.bank_id = sol.bank_id
left outer join tbaadm.cnc on dtd.tran_crncy_code = cnc.crncy_code
WHERE DTD.BANK_ID = 'CBC01'
AND GAM.ACCT_OWNERSHIP = 'O'
AND GAM.DEL_FLG != 'Y'
--AND DTD.TRAN_DATE = '14-APR-2014'
AND DTD.TRAN_DATE between '01-APR-2014' and '21-APR-2014'
--and foracid in ('50010112441109','50010161635051')
--and DTD.SOL_ID = '5001'
and GAM.ACCT_CRNCY_CODE = 'USD'
)
group by
bank_id,
tran_branch_code,
acct_sol_id,
acct_sol_name,
transaction_date,
gl_date,
transaction_id,
account_number,
ccy_alias,
ccy_name,
Acct_Currency,
Tran_Currency
If I remove the MAX(), I get "Not a GROUP BY Expression", and Toad points me to the first occurrence of GLSH_Code. According to other websites, the cure for this really is adding the MAX() function. I would just like to understand why I should use that particular function and what exactly it does in the query.
EDIT: inserted the rest of the code.
I know what MAX() does: it returns the largest value of an expression. But in this case, I can't figure out exactly what that largest value is that the function is attempting to return.
The GROUP BY statement declares that all columns returned in the SELECT should be aggregated, but that you want to separate the results by those listed in the GROUP BY.
This means we have to use aggregate functions like MIN, MAX, AVG, SUM, etc. on any column that is NOT listed in the GROUP BY.
It's about telling the SQL engine what the expected results should be when there is more than one option.
In a simple example, we have a table with three columns:
PrimaryId SubId RowValue
1 1 1
2 1 2
3 2 4
4 2 8
And an SQL like the following (which is invalid):
SELECT SubId, RowValue
FROM SampleTable
GROUP BY SubId
We know we want the distinct SubId's (because of the GROUP BY), but we don't know what RowValue should be when we aggregate the results.
SubId RowValue
1 ?
2 ?
We have to be explicit in our query, and indicate what RowValue should be as the results can vary.
If we choose MIN(RowValue) we see:
SubId RowValue
1 1
2 4
If we choose MAX(RowValue) we see:
SubId RowValue
1 2
2 8
If we choose SUM(RowValue) we see:
SubId RowValue
1 3
2 12
Without being explicit there's a high likelihood that the results will be wrong, so our SQL engine of choice protects us from ourselves by enforcing the need for aggregate functions.
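The three result tables above can be reproduced in a single query. This is a minimal sketch assuming in-memory SQLite via Python's sqlite3 module. Note that SQLite itself is lenient about bare columns in a GROUP BY query, so it is used here only to show the aggregate values, not the error that Oracle raises.

```python
import sqlite3


def aggregate_sample():
    """Return (SubId, MIN, MAX, SUM of RowValue) for each SubId."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE SampleTable "
        "(PrimaryId INTEGER, SubId INTEGER, RowValue INTEGER)"
    )
    con.executemany(
        "INSERT INTO SampleTable VALUES (?, ?, ?)",
        [(1, 1, 1), (2, 1, 2), (3, 2, 4), (4, 2, 8)],
    )
    rows = con.execute(
        """
        SELECT SubId,
               MIN(RowValue) AS min_value,
               MAX(RowValue) AS max_value,
               SUM(RowValue) AS sum_value
        FROM SampleTable
        GROUP BY SubId
        ORDER BY SubId
        """
    ).fetchall()
    con.close()
    return rows


print(aggregate_sample())
```

Each aggregate answers the "?" in the earlier table differently: SubId 1 yields 1, 2 and 3, and SubId 2 yields 4, 8 and 12.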
You have a GROUP BY clause at the end on all the columns except Ind_Part_Tran_Dr_RBU, Ind_Part_Tran_Cr_RBU, Ind_Part_Tran_Dr_FCDU, Ind_Part_Tran_Cr_FCDU. In this case Oracle wants you to tell it what to do with these columns, i.e. which function it should use to aggregate them for every group it finds.

SQL Server Pivoting based off of a column to be delimited

I'm using SQL Server Management Studio and I have a table of survey results for project managers and I'm aggregating by question/score at the project manager RepID level:
SELECT Lower(A.RepID) as 'HHRepID'
, YEAR(A.ProjectEndDate) AS 'Year'
, MONTH(A.ProjectEndDate) AS 'Month'
, DATENAME(mm,A.ProjectEndDate) AS 'MonthName'
, SUM(CASE WHEN A.SatisfactionWithCommunication >= 4 THEN 1 ELSE 0 END) as 'AgreeStronglyAgreeCommunicationCount'
, COUNT(A.SatisfactionWithCommunication) as 'CommunicationCount'
, SUM(CASE WHEN A.InteractionConnectionWithClient >= 4 THEN 1 ELSE 0 END) as 'AgreeStronglyAgreeInteractionCount'
, COUNT(A.InteractionConnectionWithClient) as 'InteractionCount'
, SUM(CASE WHEN A.OverallSatisfactionWithEngagement >= 4 THEN 1 ELSE 0 END) as 'AgreeStronglyAgreeOverallSatisfactionCount'
, COUNT(A.OverallSatisfactionWithEngagement) as 'OverallSatisfactionCount'
, COUNT(A.ResponseID) as 'SurveysReturned'
, 'SalesOps' as 'Grouping'
FROM
SurveyData.dbo.SalesSurvey as A with(nolock)
WHERE
A.ResponseID IS NOT NULL AND A.IsExcludedFromReporting IS NULL
GROUP BY
YEAR(A.ProjectEndDate), MONTH(A.ProjectEndDate), DATENAME(mm,A.ProjectEndDate), A.RepID
ORDER BY
A.RepID
The output would look something like this:
Everything is great. Here's the problem. For each response for a project manager, there could be multiple Project Assistants. The project assistants for each project are aggregated (separated by ;) in one column:
What I need to do is pivot/delimit these results so that each ProjectAssistantID becomes 1 row with the same grouped data as if it were a project manager. For example, say the row from the first screenshot (HHRepID = jdoe) had 2 project assistants (call them Michael Matthews and Sarah Boyd): mmathews; sboyd. Via pivot/delimit, the output of the 2nd query would look like this:
In the actual table, it's just 1 record. But because there are multiple names in the ProjectAssistantID column, I need to pivot/delimit those out and essentially get the same results for each instance, just with Project Assistants rather than Project Managers.
I've been able to find some stuff on pivoting but this is pivoting based on delimiting values which adds an extra layer of complexity. It's entirely possible that there could be only 1 project assistant per project or as many as 6.
You can find several string split functions when you google for that, I will just assume this one which returns a table with one column named Item.
Then you can use cross apply as follows:
SELECT assist.Item as 'HHRepID'
, YEAR(A.ProjectEndDate) AS 'Year'
, MONTH(A.ProjectEndDate) AS 'Month'
, DATENAME(mm,A.ProjectEndDate) AS 'MonthName'
, SUM(CASE WHEN A.SatisfactionWithCommunication >= 4 THEN 1 ELSE 0 END) as 'AgreeStronglyAgreeCommunicationCount'
, COUNT(A.SatisfactionWithCommunication) as 'CommunicationCount'
, SUM(CASE WHEN A.InteractionConnectionWithClient >= 4 THEN 1 ELSE 0 END) as 'AgreeStronglyAgreeInteractionCount'
, COUNT(A.InteractionConnectionWithClient) as 'InteractionCount'
, SUM(CASE WHEN A.OverallSatisfactionWithEngagement >= 4 THEN 1 ELSE 0 END) as 'AgreeStronglyAgreeOverallSatisfactionCount'
, COUNT(A.OverallSatisfactionWithEngagement) as 'OverallSatisfactionCount'
, COUNT(A.ResponseID) as 'SurveysReturned'
, 'SalesOps' as 'Grouping'
FROM
SurveyData.dbo.SalesSurvey as A with(nolock)
CROSS APPLY Split(Lower(A.ProjectAssistantID), ';') AS assist
WHERE
A.ResponseID IS NOT NULL AND A.IsExcludedFromReporting IS NULL
GROUP BY
YEAR(A.ProjectEndDate), MONTH(A.ProjectEndDate), DATENAME(mm,A.ProjectEndDate), assist.Item
ORDER BY
assist.Item
For a strictly SQL solution, you could use a recursive CTE to split the names and then join the result back to your query.
Try the query below:
DECLARE @col varchar(20)='a;b;c;g;y;u;i;o;p;';
WITH CTE as
(
SELECT SUBSTRING(@col,CHARINDEX(';',@col)+1, LEN(@col)-CHARINDEX(';', @col)) col
,SUBSTRING(@col,0, CHARINDEX(';',@col)) Split_Names
, 1 i
union ALL
SELECT SUBSTRING(COL,CHARINDEX(';',COL)+1, LEN(COL)-CHARINDEX(';', COL)) col
,SUBSTRING(COL,0, CHARINDEX(';',COL)) Split_Names
, i+1
from CTE
WHERE CHARINDEX(';', col)>1
)
SELECT * FROM CTE
You will need to use your column in place of @col and change the code to refer to your table in the first part of the union.
Hope this helps...
So you would do something like this...
WITH ProjectAssistants as
(
SELECT SUBSTRING(PROJECTASSISTANTID,CHARINDEX(';',PROJECTASSISTANTID)+1, LEN(PROJECTASSISTANTID)-CHARINDEX(';', PROJECTASSISTANTID)) col
,SUBSTRING(PROJECTASSISTANTID,0, CHARINDEX(';',PROJECTASSISTANTID)) Assistnames_Names
, ProjectManagerID
FROM YOUR_PROJECT_ASSISTANT_TABLE_NAME
union ALL
SELECT SUBSTRING(COL,CHARINDEX(';',COL)+1, LEN(COL)-CHARINDEX(';', COL)) col
,SUBSTRING(COL,0, CHARINDEX(';',COL)) Assistnames_Names
, ProjectManagerID
from ProjectAssistants
WHERE CHARINDEX(';', col)>1
)
SELECT Lower(A.RepID) as 'HHRepID'
, YEAR(A.ProjectEndDate) AS 'Year'
, MONTH(A.ProjectEndDate) AS 'Month'
, DATENAME(mm,A.ProjectEndDate) AS 'MonthName'
, SUM(CASE WHEN A.SatisfactionWithCommunication >= 4 THEN 1 ELSE 0 END) as 'AgreeStronglyAgreeCommunicationCount'
, COUNT(A.SatisfactionWithCommunication) as 'CommunicationCount'
, SUM(CASE WHEN A.InteractionConnectionWithClient >= 4 THEN 1 ELSE 0 END) as 'AgreeStronglyAgreeInteractionCount'
, COUNT(A.InteractionConnectionWithClient) as 'InteractionCount'
, SUM(CASE WHEN A.OverallSatisfactionWithEngagement >= 4 THEN 1 ELSE 0 END) as 'AgreeStronglyAgreeOverallSatisfactionCount'
, COUNT(A.OverallSatisfactionWithEngagement) as 'OverallSatisfactionCount'
, COUNT(A.ResponseID) as 'SurveysReturned'
, 'SalesOps' as 'Grouping'
FROM
SurveyData.dbo.SalesSurvey as A with(nolock)
LEFT JOIN ProjectAssistants as B
ON A.ProjectManagerID=B.ProjectManagerID
WHERE
A.ResponseID IS NOT NULL AND A.IsExcludedFromReporting IS NULL
GROUP BY
YEAR(A.ProjectEndDate), MONTH(A.ProjectEndDate), DATENAME(mm,A.ProjectEndDate), A.RepID
ORDER BY
A.RepID
Assumptions:
Your SurveyData table has a ProjectManagerID column.
The ProjectAssistants CTE is LEFT JOINed on the assumption that you still want to show data when there aren't any assistants.
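The recursive-CTE splitting idea can be tried outside SQL Server as well. The sketch below is an illustration only: it assumes in-memory SQLite via Python's sqlite3 module, where substr and instr play the roles of SUBSTRING and CHARINDEX, and the CTE name parts and the helper split_ids are hypothetical. Unlike the T-SQL above, it appends a trailing delimiter so that a last name without a closing ';' is not lost.

```python
import sqlite3

SPLIT_SQL = """
WITH RECURSIVE parts(n, item, rest) AS (
    -- Anchor: nothing split yet; append ';' so the last name is kept.
    SELECT 0, '', ? || ';'
    UNION ALL
    -- Recursive step: peel off the text before the first ';'.
    SELECT n + 1,
           substr(rest, 1, instr(rest, ';') - 1),
           substr(rest, instr(rest, ';') + 1)
    FROM parts
    WHERE rest <> ''
)
SELECT trim(item)
FROM parts
WHERE n > 0 AND trim(item) <> ''
ORDER BY n
"""


def split_ids(delimited):
    """Split a ';'-separated string into a list of trimmed, non-empty IDs."""
    con = sqlite3.connect(":memory:")
    try:
        return [row[0] for row in con.execute(SPLIT_SQL, (delimited,))]
    finally:
        con.close()


print(split_ids("mmathews; sboyd"))
```

Each recursion level yields one name, so a value holding six assistants produces six rows that can then be joined back to the survey data and grouped exactly like the project manager query.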

Select Distinct Attribute and Print out Count of another even when the count is 0

I don't quite know how I should describe the problem for title, but here's my question.
I have a table named hello with two columns named time and state.
Time | State
Here's an example of the data I have
1 DC
1 VA
1 VA
2 DC
2 MD
3 MD
3 MD
3 VA
3 DC
I would like to get every possible time value together with the count of "VA" rows for it (0 if "VA" doesn't appear at that time).
The output would look like this
Time Number
1 2
2 0
3 1
I tried to do
SELECT DISTINCT time,
COUNT(state) as Number
FROM hello
WHERE state = 'VA'
GROUP BY time
but it doesn't seem to work.
This is a conditional aggregation:
select time, sum(case when state = 'VA' then 1 else 0 end) as NumVA
from hello
group by time
I want to add that you should never use distinct when you have a group by. The two are redundant. Distinct as a keyword is not even needed in the SQL language; semantically, it is just shorthand for grouping by all the columns.
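To make the difference concrete, this sketch runs both the question's WHERE-based attempt and the conditional aggregation on the sample data. It assumes in-memory SQLite via Python's sqlite3 module; the table and column names follow the question.

```python
import sqlite3


def va_counts():
    """Return (rows from the WHERE attempt, rows from conditional aggregation)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE hello (time INTEGER, state TEXT)")
    con.executemany(
        "INSERT INTO hello VALUES (?, ?)",
        [
            (1, "DC"), (1, "VA"), (1, "VA"),
            (2, "DC"), (2, "MD"),
            (3, "MD"), (3, "MD"), (3, "VA"), (3, "DC"),
        ],
    )
    # Filtering first removes every row for time 2, so that group vanishes.
    filtered = con.execute(
        "SELECT time, COUNT(state) FROM hello "
        "WHERE state = 'VA' GROUP BY time ORDER BY time"
    ).fetchall()
    # Keeping all rows and counting conditionally preserves the zero.
    conditional = con.execute(
        "SELECT time, SUM(CASE WHEN state = 'VA' THEN 1 ELSE 0 END) "
        "FROM hello GROUP BY time ORDER BY time"
    ).fetchall()
    con.close()
    return filtered, conditional


print(va_counts())
```

The WHERE version returns only times 1 and 3, while the conditional aggregation returns all three times, with 0 for time 2.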
SELECT Time,
SUM(CASE WHEN State = 'VA' THEN 1 ELSE 0 END) AS Number
FROM tableName
GROUP BY Time
One rule of thumb is to get your counts first and put them into a temp table for use later.
See below:
Create table temp(Num int, [state] varchar(2))
Insert into temp(Num,[state])
Select 1,'DC'
UNION ALL
Select 1,'VA'
UNION ALL
Select 1,'VA'
UNION ALL
Select 2,'DC'
UNION ALL
Select 2,'MD'
UNION ALL
Select 3,'MD'
UNION All
Select 3,'MD'
UNION ALL
Select 3,'VA'
UNION ALL
Select 3,'DC'
Select t.Num [Time],t.[State]
, CASE WHEN t.[state] = 'VA' THEN Count(t.[State]) ELSE 0 END [Number]
INTO #temp2
From temp t
Group by t.Num, t.[state]
--drop table #temp2
Select
t2.[time]
,SUM(t2.[Number])
From #temp2 t2
group by t2.[time]