Count number of occurrences for each unique value [duplicate] - sql

This question already has answers here:
Count the occurrences of DISTINCT values
(4 answers)
Closed 5 years ago.
Basically I have a table similar to this:
time  | activities | length
13:00 | 3          | 1
13:15 | 2          | 2
13:00 | 3          | 2
13:30 | 1          | 1
13:45 | 2          | 3
13:15 | 5          | 1
13:45 | 1          | 3
13:15 | 3          | 1
13:45 | 3          | 2
13:45 | 1          | 1
13:15 | 3          | 3
A couple of notes:
Activities can be between 1 and 5
Length can be between 1 and 3
The query should return:
time  | count
13:00 | 2
13:15 | 2
13:30 | 0
13:45 | 1
Basically for each unique time I want a count of the number of rows where the activities value is 3.
So then I can say:
At 13:00 there were X amount of activity 3s.
At 13:45 there were Y amount of activity 3s.
Then I want the same count for activity 1s, 2s, 4s and 5s, so I can plot the distribution for each unique time.

Yes, you can use GROUP BY:
SELECT time,
       activities,
       COUNT(*) AS count
FROM MyTable -- replace with your table name; "table" itself is a reserved word
GROUP BY time, activities;

select time, coalesce(count(case when activities = 3 then 1 end), 0) as count
from MyTable
group by time
Output:
| TIME | COUNT |
-----------------
| 13:00 | 2 |
| 13:15 | 2 |
| 13:30 | 0 |
| 13:45 | 1 |
If you want to count all the activities in one query, you can do:
select time,
    coalesce(count(case when activities = 1 then 1 end), 0) as count1,
    coalesce(count(case when activities = 2 then 1 end), 0) as count2,
    coalesce(count(case when activities = 3 then 1 end), 0) as count3,
    coalesce(count(case when activities = 4 then 1 end), 0) as count4,
    coalesce(count(case when activities = 5 then 1 end), 0) as count5
from MyTable
group by time
The advantage of this over grouping by activities is that it returns a count of 0 even if there are no activities of that type for that time segment.
Of course, this will not return rows for time segments with no activities of any type. If you need those, use a left join from a table that lists all the possible time segments.
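The left join mentioned above could look roughly like the sketch below. It runs the query through Python's built-in sqlite3 module so it can be tried without a database server. The TimeSlots table and its extra 14:00 row are hypothetical and only there to show a slot with no activity rows at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyTable (time TEXT, activities INTEGER, length INTEGER);
    INSERT INTO MyTable VALUES
        ('13:00', 3, 1), ('13:15', 2, 2), ('13:00', 3, 2), ('13:30', 1, 1),
        ('13:45', 2, 3), ('13:15', 5, 1), ('13:45', 1, 3), ('13:15', 3, 1),
        ('13:45', 3, 2), ('13:45', 1, 1), ('13:15', 3, 3);

    -- Hypothetical lookup table: one row per possible time segment.
    CREATE TABLE TimeSlots (time TEXT PRIMARY KEY);
    INSERT INTO TimeSlots VALUES ('13:00'), ('13:15'), ('13:30'), ('13:45'), ('14:00');
""")

# Start from the full list of slots so that empty slots survive the join.
# For an unmatched slot m.activities is NULL, the CASE yields NULL,
# and COUNT ignores NULLs, so the slot is reported with 0.
rows = conn.execute("""
    SELECT s.time,
           COUNT(CASE WHEN m.activities = 3 THEN 1 END) AS count3
    FROM TimeSlots s
    LEFT JOIN MyTable m ON m.time = s.time
    GROUP BY s.time
    ORDER BY s.time
""").fetchall()

for row in rows:
    print(row)
```

With the sample data, 14:00 shows up with a count of 0 even though MyTable has no rows for it.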

If I understand your question correctly, would this work? (You will have to substitute your actual column and table names.)
SELECT time_col, COUNT(time_col) As Count
FROM time_table
GROUP BY time_col
WHERE activity_col = 3

You should change the query to:
SELECT time_col, COUNT(time_col) As Count
FROM time_table
WHERE activity_col = 3
GROUP BY time_col
This version works correctly, because the WHERE clause has to come before GROUP BY.
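One difference worth noting between this WHERE version and the conditional COUNT shown earlier: WHERE removes rows before grouping, so a time with no activity-3 rows (13:30 in the question) disappears from the result instead of appearing with a count of 0. A small sketch using Python's sqlite3 module, with the question's data loaded into a table named time_table to match the placeholder names in the answer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE time_table (time_col TEXT, activity_col INTEGER, length_col INTEGER);
    INSERT INTO time_table VALUES
        ('13:00', 3, 1), ('13:15', 2, 2), ('13:00', 3, 2), ('13:30', 1, 1),
        ('13:45', 2, 3), ('13:15', 5, 1), ('13:45', 1, 3), ('13:15', 3, 1),
        ('13:45', 3, 2), ('13:45', 1, 1), ('13:15', 3, 3);
""")

# Filter first, then group: times without any activity-3 row are dropped.
where_rows = conn.execute("""
    SELECT time_col, COUNT(time_col) AS cnt
    FROM time_table
    WHERE activity_col = 3
    GROUP BY time_col
    ORDER BY time_col
""").fetchall()

# Group everything, count conditionally: every time is kept, possibly with 0.
case_rows = conn.execute("""
    SELECT time_col, COUNT(CASE WHEN activity_col = 3 THEN 1 END) AS cnt
    FROM time_table
    GROUP BY time_col
    ORDER BY time_col
""").fetchall()

print(where_rows)
print(case_rows)
```

If the zero rows matter for the plot, the conditional COUNT is the safer of the two.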

Related

Count rows within each group when condition is satisfied Sql Server

I have a table that looks like this:
ID | Date       | IsFull
1  | 2020-01-05 | 0
1  | 2020-02-05 | 0
1  | 2020-02-25 | 1
1  | 2020-03-01 | 1
1  | 2020-03-20 | 1
I want to display how many months for ID = 1
have sum(isfull)/count(*) > .6 in a given month (More than 60% of the times in that month isfull = 1)
So the final output should be:
ID | HowManyMonths
1  | 1 (only month 3: 2 out of 2 cases)
If the condition changes to sum(isfull)/count(*) > .4, then the final output should be:
ID | HowManyMonths
1  | 2 (month 2 and month 3)
Thanks!!
You can do this with two levels of aggregation:
select id, count(*) howManyMonths
from (
    select id
    from mytable
    group by id, year(date), month(date)
    having avg(1.0 * isFull) > 0.6
) t
group by id
The subquery aggregates by id, year and month, and uses a having clause to keep only the groups that meet the success rate (avg() comes in handy for this). The outer query counts how many months passed the target rate for each id.
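To check the two-level aggregation against the sample rows, here is a sketch that runs it in SQLite through Python's sqlite3 module. SQLite has no year() or month() functions, so this version groups by strftime('%Y-%m', date) instead, which is an adaptation and not the SQL Server syntax above. The helper name months_above is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER, date TEXT, isFull INTEGER);
    INSERT INTO mytable VALUES
        (1, '2020-01-05', 0), (1, '2020-02-05', 0), (1, '2020-02-25', 1),
        (1, '2020-03-01', 1), (1, '2020-03-20', 1);
""")

def months_above(threshold):
    """Count, per id, the months whose share of isFull = 1 exceeds threshold."""
    return conn.execute("""
        SELECT id, COUNT(*) AS howManyMonths
        FROM (
            -- Inner level: one row per (id, month) that passes the rate.
            SELECT id
            FROM mytable
            GROUP BY id, strftime('%Y-%m', date)
            HAVING AVG(1.0 * isFull) > ?
        ) AS t
        GROUP BY id
    """, (threshold,)).fetchall()

print(months_above(0.6))
print(months_above(0.4))
```

The monthly rates are 0 for January, 0.5 for February and 1.0 for March, so the 0.6 threshold keeps one month and the 0.4 threshold keeps two.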

Pull a DATEDIFF between Rows with Distinct value and WHERE Clause

I'm trying to generate a DATEDIFF to calculate SLA adherence, but the table info I need to perform the DATEDIFF is replicated in rows (instead of a start time column, end time column).
As such, I need to set a WHERE statement for the same column, using 2 different StatusIds to DATEDIFF those two timestamps.
For example,
PackageId | StatusId | rowDateModified
--------------------------------------
1 1 2019-06-01 00:41
1 2 2019-06-01 01:30
Ideally, I need to be able to calculate how long it took between package creation (StatusId = 1) and package completion (StatusId = 2) for an infinite number of rows, where the PackageId is Distinct.
You can use conditional aggregation:
select packageid,
    datediff(day,
        max(case when statusid = 1 then rowdatemodified end),
        max(case when statusid = 2 then rowdatemodified end)
    )
from t
group by packageid
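Note that datediff(day, ...) counts whole day boundaries, so for the sample rows (00:41 and 01:30 on the same date) it returns 0; for an SLA measured in minutes or hours, a finer date part such as minute is probably what you want. The sketch below shows the same conditional aggregation in SQLite via Python's sqlite3 module. SQLite has no datediff, so the difference is computed from strftime('%s', ...) epoch seconds, which is an adaptation of the answer and not its original syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (PackageId INTEGER, StatusId INTEGER, rowDateModified TEXT);
    INSERT INTO t VALUES
        (1, 1, '2019-06-01 00:41'),
        (1, 2, '2019-06-01 01:30');
""")

# Pivot the two status rows into one row per package, then subtract.
# strftime('%s', ...) gives seconds since the Unix epoch as text;
# the subtraction converts both sides to integers.
rows = conn.execute("""
    SELECT PackageId,
           (strftime('%s', MAX(CASE WHEN StatusId = 2 THEN rowDateModified END))
          - strftime('%s', MAX(CASE WHEN StatusId = 1 THEN rowDateModified END))) / 60
               AS minutes_to_complete
    FROM t
    GROUP BY PackageId
""").fetchall()

print(rows)
```

A package that has no StatusId = 2 row yet gets NULL for the difference, which makes unfinished packages easy to spot.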

Complex Query Involving Search for Contiguous Dates (by Month)

I have a table that contains a list of accounts by month along with a field that indicates activity. I want to search through to find when an account has "died", based on the following criteria:
the account had consistent activity for a contiguous period of months
the account had a spike of activity on a final month (spike = 200% or more of average of all previous contiguous months of activity)
the month immediately following the spike of activity and the next 12 months all had 0 activity
So the table might look something like this:
ID | Date | Activity
1 | 1/1/2010 | 2
2 | 1/1/2010 | 3.2
1 | 2/3/2010 | 3
2 | 2/3/2010 | 2.7
1 | 3/2/2010 | 8
2 | 3/2/2010 | 9
1 | 4/6/2010 | 0
2 | 4/6/2010 | 0
1 | 5/2/2010 | 0
2 | 5/2/2010 | 2
So in this case both accounts 1 and 2 have activity in months Jan - Mar. Both accounts exhibit a spike of activity in March. Both accounts have 0 activity in April. Account 2 has activity again in May, but account 1 does not. Therefore, my query should return Account 1, but not Account 2. I would want to see this as my query result:
ID | Last Date
1 | 3/2/2010
I realize this is a complicated question and I'm not expecting anyone to write the whole query for me. The current best approach I can think of is to create a series of sub-queries and join them, but I don't even know what the subqueries would look like. For example: how do I look for a contiguous series of rows for a single ID where activity is all 0 (or all non-zero?).
My fall-back if the SQL is simply too involved is to use a brute-force search using Java where I would first find all unique IDs, and then for each unique ID iterate across the months to determine if and when the ID "died".
Once again: any help to move in the right direction is very much appreciated.
Processing in Java, or partially processing in SQL, and finishing the processing in Java is a good approach.
I'm not going to tackle how to define a spike.
I suggest you start with condition 3. It's easy to find the last non-zero value. That is then the row to test for a spike and for consistent data before the spike.
SELECT out.*
FROM monthly_activity out
LEFT OUTER JOIN monthly_activity comp
ON out.ID = comp.ID AND out.Date < comp.Date AND comp.Activity <> 0
WHERE comp.Date IS NULL
Not bad, but you don't want the result if this is because the record is the last for the month, so instead,
SELECT out.*
FROM monthly_activity out
INNER JOIN monthly_activity comp
    ON out.ID = comp.ID AND out.Date < comp.Date AND comp.Activity = 0
GROUP BY out.ID
Probably not the world's most efficient code, but I think this does what you're after:
declare @t table (AccountId int, ActivityDate date, Activity float)
insert @t
select 1, '2010-01-01', 2
union select 2, '2010-01-01', 3.2
union select 1, '2010-02-03', 3
union select 2, '2010-02-03', 2.7
union select 1, '2010-03-02', 8
union select 2, '2010-03-02', 9
union select 1, '2010-04-06', 0
union select 2, '2010-04-06', 0
union select 1, '2010-05-02', 0
union select 2, '2010-05-02', 2
select AccountId, ActivityDate LastActivityDate --, Activity
from @t a
where
    --Part 2: select only rows where the activity is a peak
    Activity >= isnull
    (
        (
            select 2 * avg(c.Activity)
            from @t c
            where c.AccountId = a.AccountId --compare against the same account
            and c.ActivityDate >= isnull
            (
                (
                    select max(d.ActivityDate)
                    from @t d
                    where d.AccountId = c.AccountId
                    and d.ActivityDate < c.ActivityDate
                    and d.Activity = 0
                )
                ,
                (
                    select min(e.ActivityDate)
                    from @t e
                    where e.AccountId = c.AccountId
                )
            )
            and c.ActivityDate < a.ActivityDate
        )
        , Activity + 1 --Part 1 (i.e. if there is no activity before this row, don't include the result)
    )
    --Part 3
    and not exists --select only dates with no activity in the following 12 months on the same account (assumption: a missing record counts as no activity; the current date is ignored)
    (
        select 1
        from @t b
        where a.AccountId = b.AccountId
        and b.Activity > 0
        and b.ActivityDate between dateadd(DAY, 1, a.ActivityDate) and dateadd(YEAR, 1, a.ActivityDate)
    )
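For comparison, the brute-force fallback described in the question (walk each account's months in order) could be sketched as follows. This is in Python, not Java, and only illustrates the three criteria under stated assumptions: a missing month counts as no activity, a spike means at least spike_factor times the average of the unbroken run of non-zero months just before it, and the quiet window is the quiet_months calendar months after the spike month. The function and parameter names are made up for the example.

```python
from datetime import date

def add_months(d, months):
    """Return the first day of the month that is `months` after d's month."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)

def find_dead_accounts(rows, quiet_months=12, spike_factor=2.0):
    """rows: iterable of (account_id, activity_date, activity) tuples."""
    by_account = {}
    for account_id, day, activity in rows:
        by_account.setdefault(account_id, []).append((day, activity))

    dead = []
    for account_id, history in sorted(by_account.items()):
        history.sort()
        for i, (day, activity) in enumerate(history):
            if activity <= 0:
                continue
            # Criterion 1: collect the contiguous non-zero run just before row i.
            run = []
            j = i - 1
            while j >= 0 and history[j][1] > 0:
                run.append(history[j][1])
                j -= 1
            if not run:
                continue
            # Criterion 2: the candidate must be a spike relative to that run.
            if activity < spike_factor * sum(run) / len(run):
                continue
            # Criterion 3: no activity in the quiet window after the spike month.
            window_end = add_months(day, quiet_months + 1)  # exclusive bound
            if any(a > 0 and day < d < window_end for d, a in history[i + 1:]):
                continue
            dead.append((account_id, day))
            break
    return dead

sample = [
    (1, date(2010, 1, 1), 2), (2, date(2010, 1, 1), 3.2),
    (1, date(2010, 2, 3), 3), (2, date(2010, 2, 3), 2.7),
    (1, date(2010, 3, 2), 8), (2, date(2010, 3, 2), 9),
    (1, date(2010, 4, 6), 0), (2, date(2010, 4, 6), 0),
    (1, date(2010, 5, 2), 0), (2, date(2010, 5, 2), 2),
]
print(find_dead_accounts(sample))
```

On the question's sample this reports account 1 with 2010-03-02 and skips account 2, because account 2 has activity again in May.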

SQL Query with condition - Count new and returning values for each ID

So here is the problem that I've been trying to solve without success for the last couple of days: I have a table which keeps track of the participation of people (identified by their unique ParticipantId) in some events (identified by their unique EventId).
I would like to write a SQL query (I'm using MS Access 2010) which returns, for each event, the number of returning participants (i.e. those who have already participated in another event with a lower EventId), the number of new participants (first time they appear, if sorting by EventId) and the total number of participants for that event.
As an example:
ParticipantId | EventId
1             | 1
1             | 3
1             | 4
2             | 3
2             | 4
3             | 5
Would give:
EventId | New | Returning | Total
1       | 1   | 0         | 1
3       | 1   | 1         | 2
4       | 0   | 2         | 2
5       | 1   | 0         | 1
Is this even possible to begin with? Any idea on how I could do it?
Many Thanks!
You could use a subquery to determine the first event for a user. Then, Access' iif allows you to count only first events for the New column:
select e.eventid
,      sum(iif(e.eventid = p.FirstEvent,1,0)) as [New]
,      sum(iif(e.eventid <> p.FirstEvent,1,0)) as Returning
,      count(p.participantid) as Total
from (
    select participantid
    ,      min(eventid) as FirstEvent
    from Table1
    group by
        participantid
) as p
left join
    Table1 e
on p.participantid = e.participantid
group by
    e.eventid
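To try the first-event idea outside Access, here is a sketch that runs an equivalent query in SQLite through Python's sqlite3 module. It swaps Access' iif for a portable CASE expression and renames the output columns to new_count, returning_count and total, since New and Returning can clash with keywords in some databases. Those names are choices made for this example, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ParticipantId INTEGER, EventId INTEGER);
    INSERT INTO Table1 VALUES (1, 1), (1, 3), (1, 4), (2, 3), (2, 4), (3, 5);
""")

# p holds each participant's first event; joining it back to the
# participation rows lets every row be classified as new or returning.
rows = conn.execute("""
    SELECT e.EventId,
           SUM(CASE WHEN e.EventId = p.FirstEvent THEN 1 ELSE 0 END) AS new_count,
           SUM(CASE WHEN e.EventId <> p.FirstEvent THEN 1 ELSE 0 END) AS returning_count,
           COUNT(*) AS total
    FROM Table1 e
    JOIN (
        SELECT ParticipantId, MIN(EventId) AS FirstEvent
        FROM Table1
        GROUP BY ParticipantId
    ) p ON p.ParticipantId = e.ParticipantId
    GROUP BY e.EventId
    ORDER BY e.EventId
""").fetchall()

for row in rows:
    print(row)
```

The result matches the table the question asks for: event 3 has one new and one returning participant, event 4 has two returning ones.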

Need help designing a proper sql query

I have this table:
DebitDate | DebitTypeID | DebitPrice | DebitQuantity
----------------------------------------------------
40577 1 50 3
40577 1 100 1
40577 2 75 2
40578 1 50 2
40578 2 150 2
I would like to get with a single query (if that's possible), these details:
date, debit_id, total_sum_of_same_debit, how_many_debits_per_day
So from the example above I would get:
40577, 1, (50*3)+(100*1), 2 (because 40577 has 1 and 2 so total of 2 debits per this day)
40577, 2, (75*2), 2 (because 40577 has 1 and 2 so total of 2 debits per this day)
40578, 1, (50*2), 2 (because 40578 has 1 and 2 so total of 2 debits per this day)
40578, 2, (150*2), 2 (because 40578 has 1 and 2 so total of 2 debits per this day)
So i have this sql query:
SELECT DebitDate, DebitTypeID, SUM(DebitPrice*DebitQuantity) AS TotalSum
FROM DebitsList
GROUP BY DebitDate, DebitTypeID, DebitPrice, DebitQuantity
Now I'm stuck: I'm not sure where to put the count for the last piece of information I need.
You would need a correlated subquery to get this new column. You also need to drop DebitPrice and DebitQuantity from the GROUP BY clause for it to work.
SELECT DebitDate,
       DebitTypeID,
       SUM(DebitPrice*DebitQuantity) AS TotalSum,
       ( select Count(distinct E.DebitTypeID)
         from DebitsList E
         where E.DebitDate=D.DebitDate) as CountDebits
FROM DebitsList D
GROUP BY DebitDate, DebitTypeID
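As a quick check of the correlated subquery against the question's rows, here is a sketch that runs the same query in SQLite via Python's sqlite3 module. The ORDER BY is added only to make the printed order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DebitsList (
        DebitDate INTEGER, DebitTypeID INTEGER,
        DebitPrice INTEGER, DebitQuantity INTEGER
    );
    INSERT INTO DebitsList VALUES
        (40577, 1, 50, 3), (40577, 1, 100, 1), (40577, 2, 75, 2),
        (40578, 1, 50, 2), (40578, 2, 150, 2);
""")

# The subquery is evaluated per output row and counts the distinct
# debit types that occur on that row's date.
rows = conn.execute("""
    SELECT D.DebitDate,
           D.DebitTypeID,
           SUM(D.DebitPrice * D.DebitQuantity) AS TotalSum,
           (SELECT COUNT(DISTINCT E.DebitTypeID)
            FROM DebitsList E
            WHERE E.DebitDate = D.DebitDate) AS CountDebits
    FROM DebitsList D
    GROUP BY D.DebitDate, D.DebitTypeID
    ORDER BY D.DebitDate, D.DebitTypeID
""").fetchall()

for row in rows:
    print(row)
```

The totals come out as (50*3)+(100*1) = 250, 75*2 = 150, 50*2 = 100 and 150*2 = 300, each with a per-day type count of 2, as the question expects.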
I think this can help you.
SELECT DebitDate, SUM(DebitPrice*DebitQuantity) AS TotalSum, Count(DebitDate) as DebitDateCount
FROM DebitsList
WHERE DebitTypeID = 1
GROUP BY DebitDate