Query to sum & count - sql

My data is something like this
Count years
1 2020-08-11
1 2020-07-11
1 2019-09-01
1 2019-08-16
1 2019-05-04
1 2018-06-11
I'm writing a query that, for a given date (e.g. 04 May 2019), counts all the dates earlier than that date,
i.e. earlier than '2019-05-04'.
Here the count comes out as 1, and I then add 1 to the count.
I've written the query like this:
with sum_count as(
select count(*) as 'Counts', years from [practice].[dbo].[People]
where years<='2019-05-04'
group by years)
select sum(Counts) + 1 as Sum
from sum_count
Could you please help me do the same for all the dates? For example, for 2020-08-11
the count should come out as 5 and the sum as 6.

You can achieve this with a single SELECT statement, without the need for a CTE:
DECLARE @td datetime = '20190504'
SELECT COUNT([years])+1 FROM [practice].[dbo].[People] WHERE [years] <= @td
If this is something you will be repeating a lot, you can wrap it in a stored procedure:
CREATE PROC proc_name (@td datetime)
AS SELECT COUNT([years])+1 FROM [practice].[dbo].[People] WHERE [years] <= @td
and call it like this:
EXEC proc_name '20200801'

If I understand correctly, you simply want a window function. The following enumerates each row within each year:
select p.*,
row_number() over (partition by year(years) order by years) as seqnum
from [practice].[dbo].[People] p;
No stored procedure or auxiliary function is necessary.

Writing a procedure is one way, but if you're hell-bent on using a single query you can use this:
select count(1)+1 as sum,a.year1
from
(
select distinct years as year1
from [practice].[dbo].[People]
) a
inner join
[practice].[dbo].[People] b on b.years<=a.year1
group by a.year1
Cheers!!!
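For the "all dates at once" part of the question, here is a minimal sketch that runs the idea in Python against an in-memory SQLite database instead of SQL Server. The table name People and the column years follow the question; the rest is illustrative. It uses a strict < comparison, because the numbers quoted in the question (count 1 for 2019-05-04, count 5 for 2020-08-11) correspond to strictly earlier dates. Switching to <= reproduces the behaviour of the queries in the answers.

```python
import sqlite3

# In-memory SQLite stand-in for [practice].[dbo].[People] (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (years TEXT)")
conn.executemany(
    "INSERT INTO People (years) VALUES (?)",
    [("2020-08-11",), ("2020-07-11",), ("2019-09-01",),
     ("2019-08-16",), ("2019-05-04",), ("2018-06-11",)],
)

# For every distinct date, count the strictly earlier rows, then add 1.
# The LEFT JOIN keeps the earliest date, which has no earlier rows (count 0).
rows = conn.execute(
    """
    SELECT a.years,
           COUNT(b.years)     AS cnt,
           COUNT(b.years) + 1 AS total
    FROM (SELECT DISTINCT years FROM People) AS a
    LEFT JOIN People AS b ON b.years < a.years
    GROUP BY a.years
    ORDER BY a.years
    """
).fetchall()

for years, cnt, total in rows:
    print(years, cnt, total)
```

ISO-formatted date strings sort chronologically, which is why a plain text comparison is enough in this sketch.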

Related

Past 7 days running amounts average as progress per each date

The requirement is simple, but I am having trouble implementing the SQL logic. Suppose I have records like:
Phoneno Company Date Amount
83838 xyz 20210901 100
87337 abc 20210902 500
47473 cde 20210903 600
The expected output is a running average of the amount over the past 7 days for each date (the current date and the 6 days before it):
Date amount avg
20210901 100 100
20210902 500 300
20210903 600 400
I tried
Select date, amount, select
avg(lg) from (
Select case when lag(amount)
Over (order by NULL) IS NULL
THEN AMOUNT
ELSE
lag(amount)
Over (order by NULL) END AS LG)
From table
WHERE DATE>=t.date-7) as avg
From table t;
But I am getting wrong average values. Could anyone please help?
Note: I've tried it without lag too, and that also gives wrong averages.
You could use a self join to group the dates
select distinct
a.dt,
b.dt as preceding_dt, --just for QA purpose
a.amt,
b.amt as preceding_amt,--just for QA purpose
avg(b.amt) over (partition by a.dt) as avg_amt
from t a
join t b on a.dt-b.dt between 0 and 6
group by a.dt, b.dt, a.amt, b.amt; --to dedupe the data after the join
If you want to make your correlated subquery approach work, you don't really need the lag.
select dt,
amt,
(select avg(b.amt) from t b where a.dt-b.dt between 0 and 6) as avg_lg
from t a;
If you don't have multiple rows per date, this gets even simpler
select dt,
amt,
avg(amt) over (order by dt rows between 6 preceding and current row) as avg_lg
from t;
Also, the condition DATE>=t.date-7 that you used is bounded on one side only: it has no upper limit, so it also qualifies every date later than t.date, which should not be included.
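A minimal sketch of the correlated-subquery approach with a window bounded on both sides, run in Python against an in-memory SQLite database. The names t, dt and amt follow the answer. Storing dates as ISO text and using julianday() for the date arithmetic are SQLite-specific assumptions, and the 2021-09-10 row is a hypothetical addition to show that older rows drop out of the window.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (dt TEXT, amt INTEGER)")
conn.executemany(
    "INSERT INTO t (dt, amt) VALUES (?, ?)",
    [("2021-09-01", 100), ("2021-09-02", 500), ("2021-09-03", 600),
     ("2021-09-10", 900)],  # hypothetical extra row, not in the question
)

# For each row, average the amounts whose date lies between 0 and 6 days
# before that row's date (both bounds inclusive, so 7 calendar days).
rows = conn.execute(
    """
    SELECT a.dt, a.amt,
           (SELECT AVG(b.amt)
            FROM t AS b
            WHERE julianday(a.dt) - julianday(b.dt) BETWEEN 0 AND 6) AS avg_amt
    FROM t AS a
    ORDER BY a.dt
    """
).fetchall()

for dt, amt, avg_amt in rows:
    print(dt, amt, avg_amt)
```

The first three averages match the expected output in the question (100, 300, 400). The last row averages only itself, because 2021-09-03 is 7 days earlier and therefore outside the window.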
You can use analytical function with the windowing clause to get your results:
SELECT DISTINCT BillingDate,
AVG(amount) OVER (ORDER BY BillingDate
RANGE BETWEEN TO_DSINTERVAL('7 00:00:00') PRECEDING
AND TO_DSINTERVAL('0 00:00:00') FOLLOWING) AS RUNNING_AVG
FROM accounts
ORDER BY BillingDate;

how to find number of active users for say 1 day,2 days, 3 days.....postgreSQL

A distribution of # days active within a week: I am trying to find how many members are active for 1 day, 2days, 3days,…7days during a specific week 3/1-3/7.
Is there any way to use aggregate function on top of partition by?
If not what can be used to achieve this?
select distinct memberID,count(date) over(partition by memberID) as no_of_days_active
from visitor
where date between '2019-01-01 00:00:00' and '2019-01-07 00:00:00'
order by no_of_days_active
result should look something like this
#Days Active Count
1 20
2 32
3 678
4 34
5 3
6 678
7 2345
I think you want two levels of aggregation to count the number of days during the week:
select num_days_active, count(*) as num_members
from (select memberID, count(distinct date::date) as num_days_active
from visitor
where date >= '2019-01-01'::date and
date < '2019-01-08'::date
group by memberID
) v
group by num_days_active
order by num_days_active;
Note that I changed the date comparisons. If you have a time component, then between does not work. And, because you included time in the constant, I added an explicit conversion to date for the count(distinct). That might not be necessary, if date is really a date with no time component.
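The two levels of aggregation can be tried out with a small sketch in Python against an in-memory SQLite database rather than PostgreSQL. The visitor table and memberID column follow the question. The timestamp column is called visit_ts here (an assumed name, to avoid confusion with SQLite's date() function), and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visitor (memberID INTEGER, visit_ts TEXT)")
conn.executemany(
    "INSERT INTO visitor VALUES (?, ?)",
    [
        (1, "2019-01-01 09:00:00"), (1, "2019-01-01 17:30:00"),  # same day twice
        (1, "2019-01-02 08:15:00"),
        (2, "2019-01-03 12:00:00"),
        (3, "2019-01-01 10:00:00"), (3, "2019-01-04 10:00:00"),
        (3, "2019-01-07 23:00:00"),  # late on the last day, still inside the week
        (4, "2019-01-05 11:00:00"),
        (4, "2019-01-08 00:30:00"),  # outside the half-open range
    ],
)

# Inner query: distinct active days per member within the week.
# Outer query: how many members share each number of active days.
rows = conn.execute(
    """
    SELECT num_days_active, COUNT(*) AS num_members
    FROM (SELECT memberID, COUNT(DISTINCT date(visit_ts)) AS num_days_active
          FROM visitor
          WHERE visit_ts >= '2019-01-01' AND visit_ts < '2019-01-08'
          GROUP BY memberID) AS v
    GROUP BY num_days_active
    ORDER BY num_days_active
    """
).fetchall()

print(rows)
```

The half-open range keeps the visit at 23:00 on 2019-01-07, which a between ending at '2019-01-07 00:00:00' would drop.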
Piggybacking off of @Gordon's answer, I personally like using a with statement for the subqueries:
with dat as (
select distinct
memberID,
count(date) over(partition by memberID) as no_of_days_active
from visitor
where 1=1
and date between '2019-01-01'::date and '2019-01-07'::date
order by no_of_days_active
)
select
no_of_days_active,
count(no_of_days_active) no_of_days_active_cnt
from dat
group by no_of_days_active
order by no_of_days_active

Counting an already counted column in SQL (db2)

I'm pretty new to SQL and have this problem:
I have a filled table with a date column and other not interesting columns.
date | name | name2
2015-03-20 | peter | pan
2015-03-20 | john | wick
2015-03-18 | harry | potter
What I'm doing right now is counting everything for each date:
select date, count(*)
from testtable
where date >= current date - 10 days
group by date
What I want to do now is count the resulting lines and return them only if there are fewer than 10 of them.
What I have tried so far is wrapping the whole query in a temp table (a with clause) and then counting everything, which gives me the number of resulting lines:
with temp_count (date, counter) as
(
select date, count(*)
from testtable
where date >= current date - 10 days
group by date
)
select count(*)
from temp_count
What is still missing is the check whether that number is smaller than 10.
I searched this forum and came across some "having" constructs, but they forced me to use a "group by", which I can't.
I was thinking about something like this:
with temp_count (date, counter) as
(
select date, count(*)
from testtable
where date >= current date - 10 days
group by date
)
select *
from temp_count
having count(*) < 10
Maybe I'm too tired to think of an easy solution, but I haven't been able to solve this so far.
Edit: A picture for clarification, since my English is horrible:
http://imgur.com/1O6zwoh
I want to see the two-column results ONLY IF there are fewer than 10 rows overall.
I think you just need to move your having clause to the inner query so that it is paired with the GROUP BY:
with temp_count (date, counter) as
(
select date, count(*)
from testtable
where date >= current date - 10 days
group by date
having count(*) < 10
)
select *
from temp_count
If what you want instead is to return the records only when the total number of records (after grouping) reaches a threshold, then you could do this:
with temp_count (date, counter) as
(
select date, count(*)
from testtable
where date >= current date - 10 days
group by date
)
select date, counter
from (
select date, counter, count(*) over () as total_rows
from temp_count
) x
where total_rows >= 10
This will return 0 rows if there are fewer than 10 in total, and will deliver ALL the results if there are 10 or more. For the case in the question, flip the comparison to total_rows < 10.
In your temp_count table, you can filter results with the WHERE clause:
with temp_count (date, counter) as
(
select date, count(*)
from testtable
where date >= current date - 10 days
group by date
)
select *
from temp_count
where counter < 10
Something like:
with t(dt, rn, cnt) as (
select dt, row_number() over (order by dt) as rn
, count(1) as cnt
from testtable
where dt >= current date - 10 days
group by dt
)
select dt, cnt
from t where 10 >= (select max(rn) from t);
will do what you want (I think)
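The "all rows or nothing" behaviour the question asks for can also be sketched with a scalar subquery over the same common table expression. The example below runs in Python against an in-memory SQLite database rather than DB2. The function name grouped_if_few is hypothetical, the date column is called dt, and the "last 10 days" filter is left out so that the result is deterministic.

```python
import sqlite3


def grouped_if_few(dates, limit=10):
    """Return (date, count) rows only when fewer than `limit` groups exist."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE testtable (dt TEXT)")
    conn.executemany("INSERT INTO testtable (dt) VALUES (?)", [(d,) for d in dates])
    rows = conn.execute(
        """
        WITH temp_count (dt, counter) AS (
            SELECT dt, COUNT(*) FROM testtable GROUP BY dt
        )
        SELECT dt, counter
        FROM temp_count
        -- The scalar subquery counts the grouped rows once; every row passes or none does.
        WHERE (SELECT COUNT(*) FROM temp_count) < ?
        ORDER BY dt
        """,
        (limit,),
    ).fetchall()
    conn.close()
    return rows


few = grouped_if_few(["2015-03-20", "2015-03-20", "2015-03-18"])
many = grouped_if_few([f"2015-03-{day:02d}" for day in range(1, 13)])
print(few)   # two grouped rows, since 2 < 10
print(many)  # empty, since there are 12 grouped rows
```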

Select repeat occurrences within time period <x days

Suppose I have a large table (100,000+ entries) of service records, or perhaps admission records. How would I find all the instances of re-occurrence within a set number of days?
The table setup could be something like this likely with more columns.
Record ID Customer ID Start Date Time Finish Date Time
1 123456 24/04/2010 16:49 25/04/2010 13:37
3 654321 02/05/2010 12:45 03/05/2010 18:48
4 764352 24/03/2010 21:36 29/03/2010 14:24
9 123456 28/04/2010 13:49 31/04/2010 09:45
10 836472 19/03/2010 19:05 20/03/2010 14:48
11 123456 05/05/2010 11:26 06/05/2010 16:23
What I am trying to do is work out a way to select the records where there is a re-occurrence of the field [Customer ID] within a certain time period (< X days), where the time period is the Start Date Time of the 2nd occurrence minus the Finish Date Time of the first occurrence.
This is what I would like it to look like once it was run for say x=7
Record ID Customer ID Start Date Time Finish Date Time Re-occurence
9 123456 28/04/2010 13:49 31/04/2010 09:45 1
11 123456 05/05/2010 11:26 06/05/2010 16:23 2
I can solve this problem with a smaller set of records in Excel but have struggled to come up with a SQL solution in MS Access. I do have some SQL queries that I have tried but I am not sure I am on the right track.
Any advice would be appreciated.
I think this is a clear expression of what you want. It's not extremely high performance, but I'm not sure you can avoid either a correlated sub-query or a Cartesian JOIN of the table to itself to solve this problem. It is standard SQL and should work in most engines, although the details of the date math may differ:
SELECT * FROM YourTable YT1 WHERE EXISTS
(SELECT * FROM YourTable YT2 WHERE
YT2.CustomerID = YT1.CustomerID AND YT2.RecordID <> YT1.RecordID
AND YT1.StartTime >= YT2.FinishTime AND YT1.StartTime <= YT2.FinishTime + 7)
To accomplish this you need a self join, since you are comparing the table to itself. Assuming similar names, it would look something like this:
select r1.customer_id, min(r1.start_time), max(r1.finish_time), count(1) as reoccurences
from records r1,
records r2
where r1.record_id > r2.record_id -- this ensures you don't double count the records
and r1.customer_id = r2.customer_id
and r1.start_time - r2.finish_time <= 7
group by r1.customer_id
You wouldn't be able to easily get both the record_id and the number of occurrences, but you could go back and find it by correlating the start time to the record number with that customer_id and start_time.
This will do it:
declare @t table(Record_ID int, Customer_ID int, StartDateTime datetime, FinishDateTime datetime)
insert @t values(1 ,123456,'2010-04-24 16:49','2010-04-25 13:37')
insert @t values(3 ,654321,'2010-05-02 12:45','2010-05-03 18:48')
insert @t values(4 ,764352,'2010-03-24 21:36','2010-03-29 14:24')
insert @t values(9 ,123456,'2010-04-28 13:49','2010-04-30 09:45')
insert @t values(10,836472,'2010-03-19 19:05','2010-03-20 14:48')
insert @t values(11,123456,'2010-05-05 11:26','2010-05-06 16:23')
declare @days int
set @days = 7
;with a as (
select record_id, customer_id, startdatetime, finishdatetime,
rn = row_number() over (partition by customer_id order by startdatetime asc)
from @t),
b as (
select record_id, customer_id, startdatetime, finishdatetime, rn, 0 recurrence
from a
where rn = 1
union all
select a.record_id, a.customer_id, a.startdatetime, a.finishdatetime,
a.rn, case when a.startdatetime - @days < b.finishdatetime then recurrence + 1 else 0 end
from b join a
on b.rn = a.rn - 1 and b.customer_id = a.customer_id
)
select record_id, customer_id, startdatetime, recurrence from b
where recurrence > 0
Result:
https://data.stackexchange.com/stackoverflow/q/112808/
I just realized this needs to be done in Access. I am sorry, this was written for SQL Server 2005, and I don't know how to rewrite it for Access.
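As an engine-neutral illustration of the correlated EXISTS idea, here is a minimal sketch in Python against an in-memory SQLite database (not Access or SQL Server). The table and column names records, record_id, customer_id, start_time and finish_time are assumptions, julianday() is SQLite's way of doing the date arithmetic, and record 9 uses 2010-04-30 as its finish date because 31/04/2010 is not a valid date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (record_id INTEGER, customer_id INTEGER,"
    " start_time TEXT, finish_time TEXT)"
)
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?, ?)",
    [
        (1, 123456, "2010-04-24 16:49", "2010-04-25 13:37"),
        (3, 654321, "2010-05-02 12:45", "2010-05-03 18:48"),
        (4, 764352, "2010-03-24 21:36", "2010-03-29 14:24"),
        (9, 123456, "2010-04-28 13:49", "2010-04-30 09:45"),
        (10, 836472, "2010-03-19 19:05", "2010-03-20 14:48"),
        (11, 123456, "2010-05-05 11:26", "2010-05-06 16:23"),
    ],
)

max_days = 7  # the X from the question

# Keep a record when the same customer has another record that finished
# at most max_days (exclusive) before this record started.
rows = conn.execute(
    """
    SELECT r1.record_id, r1.customer_id, r1.start_time, r1.finish_time
    FROM records AS r1
    WHERE EXISTS (
        SELECT 1
        FROM records AS r2
        WHERE r2.customer_id = r1.customer_id
          AND r2.record_id <> r1.record_id
          AND julianday(r1.start_time) - julianday(r2.finish_time) >= 0
          AND julianday(r1.start_time) - julianday(r2.finish_time) < ?
    )
    ORDER BY r1.record_id
    """,
    (max_days,),
).fetchall()

for row in rows:
    print(row)
```

On the sample data this keeps records 9 and 11, the two rows shown in the expected output of the question. It does not compute the running re-occurrence number.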

T-SQL - SELECT by nearest date and GROUPED BY ID

From the data below I need to select the record nearest to a specified date for each Linked ID using SQL Server 2005:
ID Date Linked ID
...........................
1 2010-09-02 25
2 2010-09-01 25
3 2010-09-08 39
4 2010-09-09 39
5 2010-09-10 39
6 2010-09-10 34
7 2010-09-29 34
8 2010-10-01 37
9 2010-10-02 36
10 2010-10-03 36
So selecting them using 01/10/2010 should return:
1 2010-09-02 25
5 2010-09-10 39
7 2010-09-29 34
8 2010-10-01 37
9 2010-10-02 36
I know this must be possible, but can't seem to get my head round it (must be too near the end of the day :P) If anyone can help or give me a gentle shove in the right direction it would be greatly appreciated!
EDIT: Also I have come across this sql to get the closest date:
abs(DATEDIFF(minute, Date_Column, '2010/10/01'))
but couldn't figure out how to incorporate into the query properly...
Thanks
you can try this.
DECLARE @Date DATE = '10/01/2010';
WITH cte AS
(
SELECT ID, LinkedID, ABS(DATEDIFF(DD, @Date, DATE)) diff,
ROW_NUMBER() OVER (PARTITION BY LinkedID ORDER BY ABS(DATEDIFF(DD, @Date, DATE))) AS SEQUENCE
FROM MyTable
)
SELECT *
FROM cte
WHERE SEQUENCE = 1
ORDER BY ID
;
You didn't indicate how you want to handle the case where multiple rows in a LinkedID group are equally close to the target date. This solution will include only one row, and in that case you can't guarantee which of the tied rows is included.
You can change ROW_NUMBER() with RANK() in the query if you want to include all rows that represent the closest value.
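The ROW_NUMBER() approach can be checked against the sample data with a small sketch in Python and an in-memory SQLite database (window functions need SQLite 3.25 or later) rather than SQL Server. The date column is named dt here, julianday() stands in for DATEDIFF, and the extra ID in the ORDER BY is an added tie-breaker so the result is deterministic. All of these are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INTEGER, dt TEXT, LinkedID INTEGER)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        (1, "2010-09-02", 25), (2, "2010-09-01", 25),
        (3, "2010-09-08", 39), (4, "2010-09-09", 39), (5, "2010-09-10", 39),
        (6, "2010-09-10", 34), (7, "2010-09-29", 34),
        (8, "2010-10-01", 37),
        (9, "2010-10-02", 36), (10, "2010-10-03", 36),
    ],
)

# Number the rows of each LinkedID by distance to the target date,
# then keep only the closest one (seq = 1).
rows = conn.execute(
    """
    WITH cte AS (
        SELECT ID, dt, LinkedID,
               ROW_NUMBER() OVER (
                   PARTITION BY LinkedID
                   ORDER BY ABS(julianday(dt) - julianday(:target)), ID
               ) AS seq
        FROM MyTable
    )
    SELECT ID, dt, LinkedID
    FROM cte
    WHERE seq = 1
    ORDER BY ID
    """,
    {"target": "2010-10-01"},
).fetchall()

for row in rows:
    print(row)
```

The rows returned are IDs 1, 5, 7, 8 and 9, matching the expected result listed in the question.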
You want to look at the absolute value of the DATEDIFF function (http://msdn.microsoft.com/en-us/library/ms189794.aspx) by days.
The query can look something like this (not tested)
with absDates as
(
select *, abs(DATEDIFF(day, Date_Column, '2010/10/01')) as days
from table
), mdays as
(
select min(days) as mdays, linkedid
from absDates
group by linkedid
)
select *
from absDates
inner join mdays on absDates.linkedid = mdays.linkedid and absDates.days = mdays.mdays
You can also try to do it with a subquery in the select statement:
select [LinkedId],
(select top 1 [Date] from [Table] where [LinkedId]=x.[LinkedId] order by abs(DATEDIFF(DAY,[Date],@date)))
from [Table] X
group by [LinkedId]