SQL Query to only pull one record for first day of month, for every month - sql

I have a table that has one record per day. E.g. (this is just the date col of the table)
2018-07-08 03:00:00
2018-07-07 03:00:00
2018-07-06 03:00:00
2018-07-05 03:00:00
2018-07-04 03:00:00
2018-07-03 03:00:00
2018-07-02 03:00:00
2018-07-01 03:00:00
2018-06-30 03:00:00
2018-06-29 03:00:00
This data goes back a few years
I want to pull just the first day of month record, for all months in the table.
What is the SQL to do that?
(On SQL Server 2014)

I would use the day() function:
select t.*
from t
where day(t.MyDate) = 1;
Neither this nor datepart() is ANSI/ISO-standard, but other databases also support day(). The standard function is extract(day from t.MyDate).
If you want the first record in the table for each month -- but for some months, that might not be day 1 -- then you can use row_number(). One method is:
select top (1) with ties t.*
from t
order by row_number() over (partition by year(mydate), month(mydate) order by day(mydate) asc);

If all your times are zeroed, all you need is to select every row where DATEPART returns the first day of the month.
select * from dbo.MyTable mt where DATEPART(day, mt.MyDate) = 1
It will work if you have one row per day. Of course, you will need to use DISTINCT or an aggregation if you have more than one row per day.

You can use row_number() function :
select *
from (select *, row_number() over (partition by datepart(year, [date]), datepart(month, [date]) order by datepart(day, [date])) seq
from [table]
) t
where seq = 1;
Note that the partition clause needs the year as well as the month; otherwise the same month from different years would fall into one partition.
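For anyone who wants to try the row_number() idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. The table name t, the column my_date and the helper first_row_per_month are made up for illustration, and SQLite's strftime('%Y-%m', ...) stands in for the year/month partition, so treat it as an approximation of the T-SQL above rather than a drop-in replacement.

```python
import sqlite3

def first_row_per_month(dates):
    """Return the earliest timestamp found in each (year, month)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table and column names, chosen for this sketch only.
    conn.execute("CREATE TABLE t (my_date TEXT)")
    conn.executemany("INSERT INTO t VALUES (?)", [(d,) for d in dates])
    rows = conn.execute(
        """
        SELECT my_date
        FROM (
            SELECT my_date,
                   ROW_NUMBER() OVER (
                       PARTITION BY strftime('%Y-%m', my_date)  -- year and month together
                       ORDER BY my_date
                   ) AS seq
            FROM t
        )
        WHERE seq = 1
        ORDER BY my_date
        """
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]

sample = [
    "2018-07-02 03:00:00",
    "2018-07-01 03:00:00",
    "2018-06-30 03:00:00",
    "2018-06-29 03:00:00",
]
first_rows = first_row_per_month(sample)
```

With this sample, June has no row for day 1, so the sketch falls back to the earliest June row that exists, which is the behaviour the row_number() answers are after.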

Though this has been answered, you can also use DATEFROMPARTS in MS SQL.
create table #temp (dates date)
insert into #temp values ('2018-01-02'),('2018-01-05'), ('2018-01-09'), ('2018-01-10')
select * from #temp
dates
2018-01-02
2018-01-05
2018-01-09
2018-01-10
You can use this to get beginning of the month
select DATEFROMPARTS(year(dates), month(dates), 01) Beginningofmonth from #temp
group by DATEFROMPARTS(year(dates), month(dates), 01)
Output:
Beginningofmonth
2018-01-01

Related

(SQL BigQuery) Using Lag but data contains missing months

I have the following table with monthly data. But we do not have the third month.
DATE FREQUENCY
2021-01-01 6000
2021-02-01 4533
2021-04-01 7742
2021-05-01 1547
2021-06-01 9857
I want to get the frequency of the previous month into the following table.
DATE FREQUENCY PREVIOUS_MONTH_FREQ
2021-01-01 6000 NULL
2021-02-01 4533 6000
2021-04-01 7742 NULL
2021-05-01 1547 7742
2021-06-01 9857 1547
I want the 2021-04-01 record to have NULL for the PREVIOUS_MONTH_FREQ since there is no data for the previous month.
I got so far as...
SELECT DATE,
FREQUENCY,
LAG(FREQUENCY) OVER(ORDER BY DATE) AS PREVIOUS_MONTH_FREQ
FROM Table1
Use a CASE expression to check if the previous row contains data of the previous month:
SELECT DATE,
FREQUENCY,
CASE WHEN DATE_SUB(DATE, INTERVAL 1 MONTH) = LAG(DATE) OVER(ORDER BY DATE)
THEN LAG(FREQUENCY) OVER(ORDER BY DATE)
END AS PREVIOUS_MONTH_FREQ
FROM Table1
In BigQuery, you can use a RANGE window specification. The only trick is that you need a number rather than a date:
select t.*,
max(frequency) over (order by date_diff(date, date '2000-01-01', month)
range between 1 preceding and 1 preceding
) as prev_frequence
from t;
The '2000-01-01' is an arbitrary date. This turns the date column into the number of months since that date. The actual date is not important.
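The CASE-plus-LAG idea can be checked locally with a small sketch in Python's sqlite3 module. This is not BigQuery: the column is renamed dt, SQLite's date(dt, '-1 month') is used as a stand-in for DATE_SUB(DATE, INTERVAL 1 MONTH), and the helper name previous_month_frequency is invented for the example.

```python
import sqlite3

def previous_month_frequency(rows):
    """Attach the previous calendar month's frequency, or None when that month is missing."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema mirroring the question's DATE / FREQUENCY columns.
    conn.execute("CREATE TABLE table1 (dt TEXT, frequency INTEGER)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT dt,
               frequency,
               -- Only accept the lagged value if the lagged row is exactly one month back.
               CASE WHEN date(dt, '-1 month') = LAG(dt) OVER (ORDER BY dt)
                    THEN LAG(frequency) OVER (ORDER BY dt)
               END AS previous_month_freq
        FROM table1
        ORDER BY dt
        """
    ).fetchall()
    conn.close()
    return result

monthly = [
    ("2021-01-01", 6000),
    ("2021-02-01", 4533),
    ("2021-04-01", 7742),
    ("2021-05-01", 1547),
    ("2021-06-01", 9857),
]
with_previous = previous_month_frequency(monthly)
```

The 2021-04-01 row gets None because the row before it is February, not March, which matches the output the question asks for.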

how to count a column by month if the date column has time stamp?

I have two columns in a table:
id date
1 1/1/18 12:55:00 AM
2 1/2/18 01:34:00 AM
3 1/3/18 02:45:00 AM
How do I count the number of IDs per month if the time is appended into the date column?
The output would be:
Count month
3 1
In ANSI SQL, you would use:
select extract(month from date) as month, count(*)
from t
group by extract(month from date);
I think more databases support a month() function rather than extract(), though.
You have to extract the month and count using GROUP BY:
select DATE_PART('month', date) as month,count(id) from yourtable
group by DATE_PART('Month', date)

Counting rows between dates using row number?

I am trying to find the number of rows that 2 dates fall between. Basically I have an auth dated 1/1/2018 - 4/1/2018 and I need the count of pay periods those dates fall within.
Here is the data I am looking at:
create table #dates
(
pp_start_date date,
pp_end_date date
)
insert into #dates (pp_start_date,pp_end_date)
values ('2017-12-28', '2018-01-10'),
('2018-01-11', '2018-01-24'),
('2018-01-25', '2018-02-07'),
('2018-02-08', '2018-02-21'),
('2018-02-22', '2018-03-07'),
('2018-03-08', '2018-03-21'),
('2018-03-22', '2018-04-04'),
('2018-04-05', '2018-04-18');
When I run this query,
SELECT
ad.pp_start_date, ad.pp_end_date, orderby
FROM
(SELECT
ROW_NUMBER() OVER (ORDER BY pp_start_date) AS orderby, *
FROM
#dates) ad
WHERE
'2018-01-01' <= ad.pp_end_date
I somehow want to only get 7 rows. Is this even possible? Thanks in advance for any help!
EDIT - Ok so using a count(*) worked to get the number of rows but now I am trying to get the number of rows for 2 dynamic dates from another temp table but I don't see a way to relate the data.
Using the #dates temp table referenced above gives me the date data. Now using this data:
create table #stuff
([month] date,
[name] varchar(20),
units int,
fips_code int,
auth_datefrom date,
auth_dateto date)
insert into #stuff (month,name,units,fips_code,auth_datefrom,auth_dateto)
values ('2018-01-01','SMITH','50','760', '2018-01-01', '2018-04-01');
insert into #stuff (month,name,units,fips_code,auth_datefrom,auth_dateto)
values ('2018-01-01','JONES','46','193', '2018-01-01', '2018-04-01');
insert into #stuff (month,name,units,fips_code,auth_datefrom,auth_dateto)
values ('2018-01-01','DAVID','84','109', '2018-02-01', '2018-04-01');
I want to somehow create a statement that counts the rows from the #dates table that fall within the auth dates in the #stuff table. I just can't figure out how to relate or join them.
pp_start_date <= auth_dateto and pp_end_date >= auth_datefrom
Here is my output for #dates
pp_start_date pp_end_date
2017-12-28 2018-01-10
2018-01-11 2018-01-24
2018-01-25 2018-02-07
2018-02-08 2018-02-21
2018-02-22 2018-03-07
2018-03-08 2018-03-21
2018-03-22 2018-04-04
2018-04-05 2018-04-18
Here is my output for #stuff
month name units fips_code auth_datefrom auth_dateto
2018-01-01 SMITH 50 760 2018-01-01 2018-04-01
2018-01-01 JONES 46 193 2018-01-01 2018-04-01
2018-01-01 DAVID 84 109 2018-02-01 2018-04-01
I am trying to use the auth_datefrom and auth_dateto from #stuff to find out how many rows that is from #dates.
Try this one:
SELECT ad.pp_start_date, ad.pp_end_date, orderby
from (select
row_number()over ( order by pp_start_date) as orderby, * from
#dates) ad
where ad.pp_end_date <= '2018-01-01'
or ad.pp_start_date >= '2018-01-01'
Are you looking for this?
select d.*
from #dates d
where d.pp_start_date <= '2018-04-01' and
d.pp_end_date >= '2018-01-01';
This returns all pay periods that overlap the time period you specify.
I'm not sure what the row_number() does. If you want the count, then:
select count(*)
from #dates d
where d.pp_start_date <= '2018-04-01' and
d.pp_end_date >= '2018-01-01';
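For the EDIT part of the question (counting pay periods per row of #stuff), one possible approach is to join the two tables on the overlap condition the asker wrote down and group by name. The sketch below runs it through Python's sqlite3 module; the table names dates and stuff (without the # prefix), the reduced column list and the helper pay_periods_per_auth are assumptions made for the example, not something from the original answers.

```python
import sqlite3

def pay_periods_per_auth(periods, auths):
    """Count the pay periods that overlap each authorization's date range."""
    conn = sqlite3.connect(":memory:")
    # Simplified stand-ins for the #dates and #stuff temp tables.
    conn.execute("CREATE TABLE dates (pp_start_date TEXT, pp_end_date TEXT)")
    conn.execute("CREATE TABLE stuff (name TEXT, auth_datefrom TEXT, auth_dateto TEXT)")
    conn.executemany("INSERT INTO dates VALUES (?, ?)", periods)
    conn.executemany("INSERT INTO stuff VALUES (?, ?, ?)", auths)
    rows = conn.execute(
        """
        SELECT s.name, COUNT(d.pp_start_date) AS pay_periods
        FROM stuff AS s
        LEFT JOIN dates AS d
               ON d.pp_start_date <= s.auth_dateto     -- period starts before the auth ends
              AND d.pp_end_date   >= s.auth_datefrom   -- and ends after the auth starts
        GROUP BY s.name
        ORDER BY s.name
        """
    ).fetchall()
    conn.close()
    return rows

periods = [
    ("2017-12-28", "2018-01-10"), ("2018-01-11", "2018-01-24"),
    ("2018-01-25", "2018-02-07"), ("2018-02-08", "2018-02-21"),
    ("2018-02-22", "2018-03-07"), ("2018-03-08", "2018-03-21"),
    ("2018-03-22", "2018-04-04"), ("2018-04-05", "2018-04-18"),
]
auths = [
    ("SMITH", "2018-01-01", "2018-04-01"),
    ("JONES", "2018-01-01", "2018-04-01"),
    ("DAVID", "2018-02-01", "2018-04-01"),
]
counts = pay_periods_per_auth(periods, auths)
```

The LEFT JOIN keeps an authorization in the result with a count of 0 if no pay period overlaps it. Grouping by name assumes names are unique; in the real table a key column would be a safer choice.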

Netezza Grouping by Week Start (Sunday) AND Month Start

I have a little bit of an unusual question. I'm using Python to write some data to a text file that I then use Tableau to read from and build visualizations. I'm grouping the query results by week in order to reduce the size of the output file. I think the SQL is pretty standard for that type of operation.
SELECT [Date] - EXTRACT(DOW FROM [Date]) + 1
[this gives me the Sunday of the week for any date]
However, I occasionally want to group by months rather than weeks, which is impossible with the current output. What I want is a modification to the query which will group by week EXCEPT when a week overlaps two months. If the week overlaps two months, it will split the results into the first part of the week which is in the first month, and then the second part of the week which is in the second month. That way, we could use the output to show weekly result OR monthly/quarterly/yearly results simply by grouping the dates within Tableau.
Has anyone tackled a problem like this before?
As an illustration, consider the following values.
2016-08-21 1
2016-08-22 1
2016-08-23 1
2016-08-24 1
2016-08-25 1
2016-08-26 1
2016-08-27 1
2016-08-28 1
2016-08-29 1
2016-08-30 1
2016-08-31 1
2016-09-01 1
2016-09-02 1
2016-09-03 1
2016-09-04 1
... ...
I would like the code to output the following values:
2016-08-21 7
2016-08-28 4
2016-09-01 3
2016-09-04 1...
Would really appreciate any help!
Based on googled Netezza syntax, this could work:
select
min([Date]) as MinDate, count(*) as TotalDays
from YourTable
group by
extract(year from [Date]),
extract(month from [Date]),
(case
when extract(dow from [Date]) = 1 -- dow 1 is sunday
then extract(week from [Date]) + 1 -- week starts on monday
else extract(week from [Date])
end);
Or as suggested in the comments, group on the sunday:
select
min([Date]) as MinDate, count(*) as TotalDays
from YourTable
group by
([Date] - (extract(dow from [Date]) - 1));
Here's the final code that I used.
CASE
WHEN EXTRACT(MONTH FROM [Date]) <> EXTRACT(MONTH FROM [Date] - EXTRACT(DOW FROM [Date]) + 1)
THEN DATE_TRUNC('month', [Date])
ELSE [Date] - EXTRACT(DOW FROM [Date]) + 1 END
Then I grouped on that field.
The way it works is that it checks if the month of the date is equal to the month of the week start. If it isn't, it returns the first day of the month. If it is, it returns the week start. This code returns the values in the example from the original post.
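Since the question mentions Python is already in the pipeline, the same bucketing rule can be sketched outside the database to sanity-check it. The function names bucket_start and group_counts are made up here, and the week is assumed to start on Sunday as in the question.

```python
from collections import Counter
from datetime import date, timedelta

def bucket_start(d):
    """Return the Sunday that starts d's week, or the 1st of d's month if that Sunday is in another month."""
    # date.weekday() is Monday=0 .. Sunday=6, so this is the number of days since Sunday.
    week_start = d - timedelta(days=(d.weekday() + 1) % 7)
    if week_start.month != d.month:
        return d.replace(day=1)
    return week_start

def group_counts(days):
    """Count how many dates fall into each bucket, ordered by bucket start."""
    return sorted(Counter(bucket_start(d) for d in days).items())

# The 15 consecutive days from the question: 2016-08-21 through 2016-09-04.
days = [date(2016, 8, 21) + timedelta(days=i) for i in range(15)]
buckets = group_counts(days)
```

On the example data this gives the same four buckets as in the original post (7, 4, 3 and 1 days).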

SQL query : Sum a column (not the date column) for day, week to date, month to date, etc

I am wondering if there is a way within one SQL statement if you can break out a sum of a total column by date, week to date, month to date, etc.
Sample data
DataDate TimeInterval TotalCalls
9/1/2014 12:00 154
9/1/2014 15:15 25
9/2/2014 07:30 125
9/3/2014 11:45 8
9/8/2014 10:15 15
9/9/2014 19:30 6
9/9/2014 12:15 100
In this case, I would want the select statement to return the following data for a given date of 9/9/2014 and week starting on Monday, 9/8/2014
Time Interval SumofCalls
Today 106
WTD 121
MTD 433
I have thought a CASE WHEN would work, but I can only find examples of how to use a CASE WHEN to sum the count of the date column being queried against the criteria. Any suggestions? Thanks!
Here's a way to do it in SQL Server using UNION and getting each range.
DECLARE @today DATE = CAST(GETDATE() AS DATE)
SELECT 'Today' AS 'TimeInterval',
SUM(TotalCalls) AS 'SumOfCalls'
FROM yourTable
WHERE DataDate = @today
UNION ALL
SELECT 'WTD',
SUM(TotalCalls)
FROM yourTable
WHERE DataDate BETWEEN dateadd(week, datediff(week, 0, @today), 0) AND @today
UNION ALL
SELECT 'MTD',
SUM(TotalCalls)
FROM yourTable
WHERE DataDate BETWEEN DATEADD(dd,-(DAY(@today)-1),@today) AND @today
If you could have the time periods go across the top as columns instead of rows then you could use CASE.
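A rough sketch of that CASE variant, run through Python's sqlite3 module so it can be tried without SQL Server. The table name calls, the snake_case column names and the helper call_totals are invented for the example, dates are stored as ISO strings, and the week is assumed to start on Monday as stated in the question.

```python
import sqlite3
from datetime import date, timedelta

def call_totals(rows, today):
    """Return (today, week-to-date, month-to-date) call totals as one row of columns."""
    week_start = today - timedelta(days=today.weekday())  # Monday of today's week
    month_start = today.replace(day=1)
    conn = sqlite3.connect(":memory:")
    # Hypothetical table standing in for yourTable; TimeInterval is left out for brevity.
    conn.execute("CREATE TABLE calls (data_date TEXT, total_calls INTEGER)")
    conn.executemany("INSERT INTO calls VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT SUM(CASE WHEN data_date = :today
                        THEN total_calls ELSE 0 END) AS today_calls,
               SUM(CASE WHEN data_date BETWEEN :week_start AND :today
                        THEN total_calls ELSE 0 END) AS wtd_calls,
               SUM(CASE WHEN data_date BETWEEN :month_start AND :today
                        THEN total_calls ELSE 0 END) AS mtd_calls
        FROM calls
        """,
        {
            "today": today.isoformat(),
            "week_start": week_start.isoformat(),
            "month_start": month_start.isoformat(),
        },
    ).fetchone()
    conn.close()
    return result

sample_calls = [
    ("2014-09-01", 154), ("2014-09-01", 25), ("2014-09-02", 125),
    ("2014-09-03", 8), ("2014-09-08", 15), ("2014-09-09", 6), ("2014-09-09", 100),
]
totals = call_totals(sample_calls, date(2014, 9, 9))
```

Each CASE expression contributes the row's calls only when the date falls in that range, so a single pass over the table produces all three sums side by side.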