Count the number of records for each 1st of the month in SQL - sql

I have a dataset and would like to query it for a count of the records dated the 1st of each month.
Data
name date1
hello july 1 2018
hello july 1 2018
hello july 10 2018
sure august 1 2019
sure august 1 2019
why august 20 2019
ok september 1 2019
ok september 1 2019
ok september 1 2019
sure september 5 2019
Desired
ID MONTH Day YEAR
2 July 1 2018
2 August 1 2019
3 September 1 2019
Only the records dated the 1st of each month are counted.
What I have tried:
USE [Data]
SELECT COUNT(*) AS ID , MONTH(date1) AS MONTH, YEAR(date1) AS YEAR
FROM dbo.data1
GROUP BY MONTH(date1), YEAR(date1)
ORDER BY YEAR ASC
This returns counts per year and month only; it does not restrict the rows to the 1st or output the day.
Any suggestions are appreciated.

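Assuming date1 is a date column, or a string that SQL Server can implicitly convert to a date, filter the rows with WHERE DAY(date1) = 1 and add the day to the SELECT and GROUP BY.
Example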
SELECT COUNT(*) AS ID,
DATENAME(MONTH,date1) AS MONTH,
DATEPART(DAY,date1) as DAY,
YEAR(date1) AS YEAR
FROM dbo.data1
WHERE DAY(date1)=1
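GROUP BY YEAR(date1),MONTH(date1),DATENAME(MONTH,date1),DATEPART(DAY,date1)
-- Sort by month number within each year so the months come out in calendar order
ORDER BY YEAR(date1),MONTH(date1)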
Results
ID MONTH DAY YEAR
2 July 1 2018
2 August 1 2019
3 September 1 2019
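The same idea can be checked outside SQL Server. The following is a minimal sketch using Python's built-in sqlite3 module, not the T-SQL from the answer: it assumes the dates are stored as ISO strings (YYYY-MM-DD), uses SQLite's strftime in place of DAY/MONTH/YEAR, and returns the month as a number rather than a name.

```python
import sqlite3

# Sample rows from the question; the ISO date format is an assumption.
rows = [
    ("hello", "2018-07-01"), ("hello", "2018-07-01"), ("hello", "2018-07-10"),
    ("sure", "2019-08-01"), ("sure", "2019-08-01"), ("why", "2019-08-20"),
    ("ok", "2019-09-01"), ("ok", "2019-09-01"), ("ok", "2019-09-01"),
    ("sure", "2019-09-05"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data1 (name TEXT, date1 TEXT)")
conn.executemany("INSERT INTO data1 VALUES (?, ?)", rows)

# Keep only rows dated the 1st, then count them per (year, month).
query = """
    SELECT COUNT(*) AS id,
           CAST(strftime('%m', date1) AS INTEGER) AS month,
           1 AS day,
           CAST(strftime('%Y', date1) AS INTEGER) AS year
    FROM data1
    WHERE strftime('%d', date1) = '01'
    GROUP BY strftime('%Y', date1), strftime('%m', date1)
    ORDER BY year, month
"""
first_of_month_counts = conn.execute(query).fetchall()
print(first_of_month_counts)
```

With the sample data this prints [(2, 7, 1, 2018), (2, 8, 1, 2019), (3, 9, 1, 2019)], which matches the desired output apart from the month being numeric.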

Related

How to use SAS/SQL to create a table with certain conditions from a dataset

I have a dataset with ID and event_year, where an event means something happened that year. A person can have more than one record in this table, each with a different event year; for example, ID 1 can have three entries with event_year 2017, 2018 and 2019. An example dataset:
ID event_year
1 2017
1 2018
1 2019
2 2018
2 2017
From this I need a table of all IDs whose event_year is between 2017 and 2021, in order to build a frequency table counting the people with an event in each study year x (2017, 2018, 2019, 2020, 2021).
Year frequency
2017 2
2018 2
2019 1
2020 1
2021 0
There is one more condition: if a person did not have an event in study year x but did have one in year x-1, they are still included in the frequency for year x. For example, ID 1 above should be counted once in each of 2017, 2018, 2019 and 2020: they had no event_year in 2020 but did in 2019, so they are included in 2020. I apologise if this is confusing and would be happy to clarify.
If I understood your question, this should work:
data have;
input ID event_year;
datalines;
1 2017
1 2018
1 2019
2 2018
2 2017
3 2017
3 2020
;
run;
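For the next step (your additional requirement of being included in the year after the last event) the data must be sorted by ID and, within each ID, by event_year, so that last.ID marks each person's latest year.
proc sort data=have;
by ID event_year;
run;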
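We then add one extra row per ID, with event_year set to the year in that ID's last row plus 1.
data have;
set have;
by ID;
output; /* keep the original row */
if last.ID then do;
event_year=event_year+1; /* extra row: the year after the ID's last row */
output;
end;
run;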
Now we just check how many different IDs every year had. If you want to check only for certain years, just add a where clause (for example, where event_year in (2017, 2018, 2019, 2020, 2021) ).
proc sql;
create table want as
select event_year, count(distinct ID) as frequency
from have
group by event_year
;
quit;
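As a cross-check of the counting rule, here is a minimal Python sketch rather than SAS. It applies the question's condition literally to every study year: a person counts in year x if they have an event in x or in x-1. This is an assumption about the intended rule, and it differs slightly from the data step above, which only adds a row after each ID's last row. The variable names are illustrative.

```python
from collections import defaultdict

# Rows from the answer's sample dataset: (ID, event_year).
events = [(1, 2017), (1, 2018), (1, 2019), (2, 2018), (2, 2017), (3, 2017), (3, 2020)]
study_years = range(2017, 2022)  # 2017 through 2021 inclusive

# Collect the set of event years for each person.
years_by_id = defaultdict(set)
for person_id, year in events:
    years_by_id[person_id].add(year)

frequency = {}
for x in study_years:
    # A person counts in year x if they had an event in x or in x - 1.
    frequency[x] = sum(
        1 for years in years_by_id.values() if x in years or (x - 1) in years
    )
print(frequency)
```

For this sample the result is {2017: 3, 2018: 3, 2019: 2, 2020: 2, 2021: 1}. ID 3 is counted in 2018 here because of its 2017 event, which the last-row approach would not do.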

SQL query to Find highest value in table and sum the corresponding value

For each year, I would like to find the highest value in the Month column and sum the value column over the rows for that month.
value Year Month
4 2019 10
1 2019 11
5 2019 11
1 2019 11
1 2019 12
8 2019 12
1 2019 12
1 2020 1
10 2020 1
3 2021 1
2 2021 2
11 2021 2
1 2021 2
3 2021 2
2 2021 3
From the table above, I would like to extract the highest month for each year.
In 2019 the highest month is 12; there are 3 such rows, and the sum of their value column is 10.
The output should be:
value Year Month
10 2019 12
11 2020 1
2 2021 3
Supposing that the table is called "example_table", you can use the following query:
select sum(example_table.value), example_table.year, example_table.month
from example_table
join (
select year, max(month) "month"
from example_table
group by year
) sub on example_table.year = sub.year and example_table.month = sub.month
group by example_table.year, example_table.month
order by example_table.year
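To see the join against the per-year maximum month at work, here is a minimal sketch that runs an equivalent query on the question's sample rows with Python's built-in sqlite3 module. SQLite is an assumption made only to keep the example self-contained; the question does not say which database is used.

```python
import sqlite3

# Sample rows from the question: (value, year, month).
rows = [
    (4, 2019, 10), (1, 2019, 11), (5, 2019, 11), (1, 2019, 11),
    (1, 2019, 12), (8, 2019, 12), (1, 2019, 12),
    (1, 2020, 1), (10, 2020, 1),
    (3, 2021, 1), (2, 2021, 2), (11, 2021, 2), (1, 2021, 2), (3, 2021, 2),
    (2, 2021, 3),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example_table (value INTEGER, year INTEGER, month INTEGER)")
conn.executemany("INSERT INTO example_table VALUES (?, ?, ?)", rows)

# The subquery finds the highest month per year; the join keeps only those rows.
query = """
    SELECT SUM(t.value) AS value, t.year, t.month
    FROM example_table AS t
    JOIN (
        SELECT year, MAX(month) AS month
        FROM example_table
        GROUP BY year
    ) AS sub ON t.year = sub.year AND t.month = sub.month
    GROUP BY t.year, t.month
    ORDER BY t.year
"""
result = conn.execute(query).fetchall()
print(result)
```

This prints [(10, 2019, 12), (11, 2020, 1), (2, 2021, 3)], the output requested in the question.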

big query SQL - repeatedly/recursively change a row's column in the select statement based on the values in previous row

I have a table like the one below:
customer | date | end date
1 | jan 1 2021 | jan 30 2021
1 | jan 2 2021 | jan 31 2021
1 | jan 3 2021 | feb 1 2021
1 | jan 27 2021 | feb 26 2021
1 | feb 3 2021 | mar 5 2021
2 | jan 2 2021 | jan 31 2021
2 | jan 10 2021 | feb 9 2021
2 | feb 10 2021 | mar 12 2021
Now I want to set the 'end date' of a row based on the previous row's 'end date' and the current row's 'date'.
If the date in the current row < the end date of the previous row, the end date of the current row should become the end date of the previous row.
I want to do this repeatedly for all rows, grouped by customer.
I want the output below. I need it from a SELECT statement, not by updating or inserting into a table.
Note: because the second row's end date is updated with the value from the first row (jan 30 2021), the third row's date (jan 3 2021) is evaluated against the updated value in the second row (jan 30 2021), not against the second row's value before the update (jan 31 2021).
customer | date | end date
1 | jan 1 2021 | jan 30 2021
1 | jan 2 2021 | jan 30 2021 [updated because current date < previous end date]
1 | jan 3 2021 | jan 30 2021 [updated because current date < previous end date]
1 | jan 27 2021 | jan 30 2021 [updated because current date < previous end date]
1 | feb 3 2021 | mar 5 2021
2 | jan 2 2021 | jan 31 2021
2 | jan 10 2021 | jan 31 2021 [updated because current date < previous end date]
2 | feb 10 2021 | mar 12 2021
I would go about it this way. I query the data source twice so that the operation can be done in a SELECT, without updating or inserting into the table.
Input table:
1|2021-01-01|2021-01-30
1|2021-01-02|2021-01-31
1|2021-01-03|2021-02-01
1|2021-01-27|2021-02-26
1|2021-02-03|2021-03-05
2|2021-01-02|2021-01-31
2|2021-01-10|2021-02-09
2|2021-02-10|2021-03-12
code:
with num_raw_data as (
SELECT row_number() over(partition by customer order by date_init) as num, customer, date_init, date_end
FROM `project-id.data-set.table`
), analyzed_data as(
select r.num,
r.customer,
r.date_init,
r.date_end,
case when date_init<(select date_end from num_raw_data where num=r.num-1 and customer=r.customer and EXTRACT(month FROM r.date_init)=EXTRACT(month FROM date_init)) then 1 else 0 end validation
from num_raw_data r
)
select customer,
date_init,
case when validation !=0 then (select MIN(date_end) from analyzed_data where validation=0 and customer=ad.customer and date_init<ad.date_end) else date_end end as date_end
from analyzed_data ad
order by customer,num
output:
1|2021-01-01|2021-01-30
1|2021-01-02|2021-01-30
1|2021-01-03|2021-01-30
1|2021-01-27|2021-01-30
1|2021-02-03|2021-03-05
2|2021-01-02|2021-01-31
2|2021-01-10|2021-01-31
2|2021-02-10|2021-03-12
The validation column from analyzed_data marks the rows whose end date needs to change. I'm not sure it's fast (probably not), but it works for the scenario in your question.
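The rule from the question (compare each row's date with the already adjusted end date of the previous row) is easiest to see as a single pass over the rows of each customer. The following is a minimal Python sketch of that rule, not BigQuery SQL and not the query above; the function name propagate_end_dates is hypothetical, and the rows are assumed to be ordered by date within each customer.

```python
from datetime import date
from itertools import groupby

# Input rows from the answer: (customer, date_init, date_end).
rows = [
    (1, date(2021, 1, 1), date(2021, 1, 30)),
    (1, date(2021, 1, 2), date(2021, 1, 31)),
    (1, date(2021, 1, 3), date(2021, 2, 1)),
    (1, date(2021, 1, 27), date(2021, 2, 26)),
    (1, date(2021, 2, 3), date(2021, 3, 5)),
    (2, date(2021, 1, 2), date(2021, 1, 31)),
    (2, date(2021, 1, 10), date(2021, 2, 9)),
    (2, date(2021, 2, 10), date(2021, 3, 12)),
]

def propagate_end_dates(rows):
    """Carry the previous adjusted end date forward while rows start before it."""
    result = []
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    for customer, group in groupby(ordered, key=lambda r: r[0]):
        prev_end = None  # adjusted end date of the previous row for this customer
        for _, start, end in group:
            if prev_end is not None and start < prev_end:
                end = prev_end  # row starts inside the previous window
            result.append((customer, start, end))
            prev_end = end
    return result

adjusted = propagate_end_dates(rows)
for customer, start, end in adjusted:
    print(customer, start, end)
```

For the sample input this reproduces the output listed above: customer 1 keeps 2021-01-30 for its first four rows, and customer 2 keeps 2021-01-31 for its first two.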

Determine the first occurrence of a particular customer visiting the store in a particular month

I need a breakdown of counts per month (and year) of customers [aliased as Patient_ID] who made their first visit to a store. The date and time of each store visit is stored in the [MDT Review Date] column of the table.
Customers can come to the store multiple times throughout the year and increase the total count, but what I require is ONLY the first time a customer visited.
E.g. Tom Bombadil visited the store once in January 2019, so the count increased to 1, then again 4 times in March, so the count should be 1 for March, 0 for February and 1 for January; then again 4 times in October, and again 2 times in December.
I require that Tom Bombadil be counted once and only once for a particular month, at his first occurrence in that month.
The output should be like:
rn1 YEAR Month_Number Month Total_Count
1 2010 6 June 2
1 2010 7 July 1
1 2010 8 August 5
1 2010 10 October 5
1 2010 11 November 3
1 2011 1 January 4
1 2011 2 February 6
1 2011 4 April 7
1 2011 5 May 4
1 2011 6 June 10
1 2011 7 July 10
1 2011 8 August 14
1 2011 9 September 4
1 2011 10 October 8
1 2011 11 November 11
1 2011 12 December 11
1 2012 1 January 8
1 2012 2 February 21
Please refer to my query below. It uses the window function COUNT to count the store visits per month, and ROW_NUMBER to assign a unique number to each visit. What am I doing wrong?
select
*
from
(select distinct
row_number() over (partition by p.Patient_ID, p.PAT_Forename1, p.PAT_Surname
order by PAT_Forename1, p.Patient_ID, PAT_Surname) AS rn1,
datepart(year, [DATE_COLUMN]) as YEAR,
datepart(month, [DATE_COLUMN]) as Month_Number,
datename(month,[DATE_COLUMN]) as Month,
count(p.Patient_ID) over (partition by datepart(year,[DATE_COLUMN]),
datename(month, [DATE_COLUMN])) as Total_Count
from
Tablename m
inner join
TableName p on m.PK_ID = p.PK_ID
) as temp
where
rn1 = 1
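One reading of the requirement is that each customer should contribute at most 1 to the count of any month in which they visited, however many visits they made that month. Under that assumption, the following minimal Python sketch counts distinct customers per (year, month). The visit data and names are hypothetical; in SQL the same idea would be a COUNT(DISTINCT Patient_ID) grouped by year and month.

```python
from collections import defaultdict
from datetime import date

# Hypothetical visit log: (patient_id, visit_date).
visits = [
    ("tom", date(2019, 1, 15)),
    ("tom", date(2019, 3, 2)), ("tom", date(2019, 3, 9)),
    ("tom", date(2019, 3, 20)), ("tom", date(2019, 3, 28)),
    ("ann", date(2019, 3, 5)),
]

# Collect the distinct patients seen in each (year, month).
patients_by_month = defaultdict(set)
for patient_id, visit_date in visits:
    patients_by_month[(visit_date.year, visit_date.month)].add(patient_id)

# Repeat visits within a month do not change the size of the set.
monthly_counts = {key: len(ids) for key, ids in sorted(patients_by_month.items())}
print(monthly_counts)
```

Here the four March visits by the same patient count once, so the result is {(2019, 1): 1, (2019, 3): 2}. Months with no visits, such as February, simply do not appear.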

distribute a value starting from the first months

Consider a query such as the following.
Select MONTH, sum(RECEIVABLES), sum(COLLECTED) from TABLE1 group by MONTH
result
MONTH RECEIVABLES COLLECTED
JANUARY 2 0
FEBRUARY 1 0
MARCH 3 0
Now, a collection of 4 is made in APRIL.
Question: how can this value of 4 be distributed across the COLLECTED column, starting from the first month?
The result should be as follows:
MONTH RECEIVABLES COLLECTED
JANUARY 2 2
FEBRUARY 1 1
MARCH 3 1
APRIL 0 0
A solution with SQL or a stored procedure would be fine.
Thanks.
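The distribution described here is a running allocation: each month receives as much of the collected amount as it is owed, and whatever is left moves on to the next month. The following is a minimal Python sketch of that logic, not SQL or a stored procedure; the function name distribute is hypothetical, and the months are assumed to be supplied in calendar order.

```python
def distribute(receivables, amount):
    """Allocate `amount` to the months in order, capped at each month's receivable."""
    allocation = []
    remaining = amount
    for month, due in receivables:
        paid = min(due, remaining)  # never pay more than is owed or than is left
        allocation.append((month, due, paid))
        remaining -= paid
    return allocation, remaining

# Receivables per month from the question, plus APRIL with nothing owed.
receivables = [("JANUARY", 2), ("FEBRUARY", 1), ("MARCH", 3), ("APRIL", 0)]

allocation, leftover = distribute(receivables, 4)
for month, due, paid in allocation:
    print(month, due, paid)
```

With a collected amount of 4 this gives 2 for JANUARY, 1 for FEBRUARY, 1 for MARCH and 0 for APRIL, with nothing left over, matching the expected table.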