SQL Finding the first day of the month of a data set - sql

I've found lots of code for getting SQL to display the first of a month, but I need to display the first day of each month as it appears in my data set, not just [month] 1st [year].
EX 1: January 1st is a holiday, so it will never appear in the data set; the first day of January in the data is January 2nd.
Another example: if the first date of the month in my data set is the 7th, I want to see the 7th, not the 1st.
This is my data set:
DATE
----------
2016-02-01
2016-02-05
2016-02-08
2016-02-19
2016-02-20
2016-02-22
2016-05-02
2016-05-05
2016-05-07
2016-05-09
2016-05-11
2016-05-23
2016-06-01
2016-06-10
2016-06-20
2016-06-29
2016-07-01
2016-07-07
2016-07-14
2016-07-21
2016-07-28
2016-07-31
2016-08-04
2016-08-10
2016-08-18
2017-02-23
2017-02-30
I need the query to display:
DATE
----------
2016-02-01
2016-05-02
2016-06-01
2016-07-01
2016-08-04
2017-02-23
I keep getting stuck. I thought this might work, but it does not return the min date for each month:
select min(load_date) from multi_dt
group by month(load_date)

Try this:
select min(load_date) as min_load_date
from multi_dt
group by dateadd(month, datediff(month, 0, load_date ) , 0)
month() returns only the month number, so rows from the same month in different years land in the same group. The dateadd/datediff expression above instead returns the first of the month as a datetime, so grouping by it takes both the year and the month into account.
rextester demo: http://rextester.com/UJRN68337
returns:
+---------------+
| min_load_date |
+---------------+
| 2016-02-01    |
| 2016-05-02    |
| 2016-06-01    |
| 2016-07-01    |
| 2016-08-04    |
| 2017-02-23    |
+---------------+
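To see why grouping by the month number alone is not enough, here is a minimal sketch that runs both groupings side by side. It assumes SQLite (through Python's sqlite3 module) rather than SQL Server, so strftime() stands in for month() and for the dateadd/datediff expression. The table holds only a small subset of the question's dates.

```python
import sqlite3

# Hypothetical in-memory table holding a subset of the question's multi_dt data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE multi_dt (load_date TEXT)")
conn.executemany(
    "INSERT INTO multi_dt VALUES (?)",
    [("2016-02-01",), ("2016-02-05",), ("2016-05-02",),
     ("2016-05-05",), ("2017-02-23",)],
)


def first_dates(group_expr):
    """Return the earliest load_date per group, ordered by date."""
    sql = (
        "SELECT MIN(load_date) AS d FROM multi_dt "
        f"GROUP BY {group_expr} ORDER BY d"
    )
    return [row[0] for row in conn.execute(sql)]


# Month only: February 2016 and February 2017 fall into the same group.
by_month = first_dates("strftime('%m', load_date)")

# Year and month together: one group per calendar month.
by_year_month = first_dates("strftime('%Y-%m', load_date)")

print(by_month)
print(by_year_month)
```

With the month-only grouping, 2017-02-23 disappears because it shares a group with 2016-02-01; grouping by year and month keeps it.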

Your initial query was close; you just also needed to group by the year.
group by
month(load_date),year(load_date)

I would use row_number():
select t.date
from (select t.*,
row_number() over (partition by year(date), month(date) order by date) as seqnum
from t
) t
where seqnum = 1;
If you don't need any additional columns, an aggregation would be equivalent:
select min(t.date)
from t
group by year(t.date), month(t.date);

Related

SQL select query using Joins with aggregate counts

I have a table with the following fields:
tickets: id, createddate, resolutiondate
A sample set of data has:
jira=# select * from tickets;
id | createddate | resolutiondate
---------+-------------+----------------
ticket1 | 2020-09-21 | 2020-10-01
ticket2 | 2020-09-22 | 2020-09-23
ticket3 | 2020-10-01 |
ticket4 | 2020-10-01 | 2020-10-04
ticket5 | 2020-10-01 |
ticket6 | 2020-10-01 | 2020-10-07
(6 rows)
jira=#
I would like to create a query which reports:
Week: Issues Created: Issues Resolved
I can do the two separate queries:
# select date_trunc('week', createddate) week, count(id) created
from tickets
group by week
order by week desc
;
week | created
------------------------+---------
2020-09-28 00:00:00+00 | 4
2020-09-21 00:00:00+00 | 2
(2 rows)
# select date_trunc('week', resolutiondate) week, count(id) resolved
from tickets
where resolutiondate is not NULL
group by week
order by week desc
;
week | resolved
------------------------+----------
2020-10-05 00:00:00+00 | 1
2020-09-28 00:00:00+00 | 2
2020-09-21 00:00:00+00 | 1
(3 rows)
However, I cannot figure out how (with a join, union, sub-query, ...?) to combine these into a single query with the appropriate aggregations.
I'm doing this in Postgres. Any pointers would be appreciated.
Performing a union before aggregating the values may work here, e.g.:
select week,
count(id_created) as created,
count(id_resolved) as resolved
from (
select date_trunc('week', resolutiondate) week, NULL as id_created, id as id_resolved from tickets where resolutiondate is not NULL UNION ALL
select date_trunc('week', createddate) week, id as id_created, NULL as id_resolved from tickets
) t
group by week
order by week desc
Let me know if this works for you.
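As a rough check of the union-then-aggregate idea, the sketch below loads the question's six tickets into an in-memory SQLite database. SQLite is an assumption here (the question uses Postgres), and since it has no date_trunc, the week start is computed with date(x, 'weekday 0', '-6 days'). Unresolved tickets are filtered out of the resolved branch so they do not produce a NULL week.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tickets (id TEXT, createddate TEXT, resolutiondate TEXT)"
)
conn.executemany("INSERT INTO tickets VALUES (?, ?, ?)", [
    ("ticket1", "2020-09-21", "2020-10-01"),
    ("ticket2", "2020-09-22", "2020-09-23"),
    ("ticket3", "2020-10-01", None),
    ("ticket4", "2020-10-01", "2020-10-04"),
    ("ticket5", "2020-10-01", None),
    ("ticket6", "2020-10-01", "2020-10-07"),
])

# date(x, 'weekday 0', '-6 days') moves to the Sunday on or after x and then
# back six days, which yields the Monday that starts x's week.
WEEKLY_SQL = """
SELECT week, COUNT(id_created) AS created, COUNT(id_resolved) AS resolved
FROM (
    SELECT date(createddate, 'weekday 0', '-6 days') AS week,
           id AS id_created, NULL AS id_resolved
    FROM tickets
    UNION ALL
    SELECT date(resolutiondate, 'weekday 0', '-6 days') AS week,
           NULL AS id_created, id AS id_resolved
    FROM tickets
    WHERE resolutiondate IS NOT NULL
) AS t
GROUP BY week
ORDER BY week DESC
"""

weekly = conn.execute(WEEKLY_SQL).fetchall()
for row in weekly:
    print(row)
```

COUNT ignores NULLs, so each branch of the union contributes only to its own column.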

Time and attendance

I have a table with below data
EMPID | DEVICE | EVENTTIME
-----------------------------------------
112 | READ_IN | 2018-11-02 07:00:00.000
112 | READ_IN | 2018-11-02 08:00:00.000
112 | READ_OUT | 2018-11-02 12:00:00.000
112 | READ_IN | 2018-11-02 13:00:00.000
112 | READ_OUT | 2018-11-02 16:00:00.000
I need a select query to achieve below data:
ID_Emp |Date |TimeIn |TimeOut|Hours
112 |02/11/2018 |8:00 |16:00 |7:00
In my table, the employee came in at 7:00 but did not start working; an hour later he came back and started work. He took his lunch break from 12:00 to 13:00 and left work at 16:00, so his total working hours are 7.
First, you need to eliminate the events between 12:00 and 13:00; I wrote a simple WHERE clause for this. After that,
I used PIVOT to transpose rows to columns, taking the max EVENTTIME per device.
Finally, I wrote the outermost SELECT to convert the columns to your intended format.
Here is the fiddle link: http://sqlfiddle.com/#!4/f1189/10
Here is the code:
SELECT
EMPID,
TO_CHAR(READ_IN, 'HH24:MI') READ_IN,
TO_CHAR(READ_OUT, 'HH24:MI') READ_OUT,
EXTRACT(HOUR FROM READ_OUT - READ_IN) HOUR
FROM (
select * from (
select * from Table1
WHERE
extract(hour from eventtime) not between 12 and 13
)
PIVOT (
MAX(EVENTTIME)
for DEVICE in ( 'READ_IN' READ_IN, 'READ_OUT' READ_OUT )
)
)
Please note that this example only works in Oracle.
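For databases without PIVOT, the same filter-then-transpose steps can be sketched with conditional aggregation. The example below assumes SQLite via Python's sqlite3 module, with strftime() standing in for Oracle's EXTRACT and TO_CHAR; the column aliases time_in, time_out and hours are illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (EMPID INTEGER, DEVICE TEXT, EVENTTIME TEXT)")
conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?)", [
    (112, "READ_IN", "2018-11-02 07:00:00.000"),
    (112, "READ_IN", "2018-11-02 08:00:00.000"),
    (112, "READ_OUT", "2018-11-02 12:00:00.000"),
    (112, "READ_IN", "2018-11-02 13:00:00.000"),
    (112, "READ_OUT", "2018-11-02 16:00:00.000"),
])

ATTENDANCE_SQL = """
SELECT EMPID,
       strftime('%H:%M', read_in)  AS time_in,
       strftime('%H:%M', read_out) AS time_out,
       (CAST(strftime('%s', read_out) AS INTEGER)
        - CAST(strftime('%s', read_in) AS INTEGER)) / 3600 AS hours
FROM (
    -- MAX(CASE ...) per device plays the role of PIVOT (MAX(EVENTTIME) ...).
    SELECT EMPID,
           MAX(CASE WHEN DEVICE = 'READ_IN'  THEN EVENTTIME END) AS read_in,
           MAX(CASE WHEN DEVICE = 'READ_OUT' THEN EVENTTIME END) AS read_out
    FROM Table1
    -- Drop the lunch-break events, i.e. anything logged in hour 12 or 13.
    WHERE CAST(strftime('%H', EVENTTIME) AS INTEGER) NOT BETWEEN 12 AND 13
    GROUP BY EMPID
)
"""

attendance = conn.execute(ATTENDANCE_SQL).fetchall()
print(attendance)
```

Like the Oracle query, this reports the span from the last remaining read-in to the last read-out, which is 8 hours for the sample rows. It does not subtract the lunch hour, so reaching the 7 hours the question asks for would need an extra step.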

How to identify MIN value for records within a rolling date range in SQL

I am trying to calculate a MIN date by Patient_ID for each record in my dataset that dynamically references the last 30 days from the date (Discharge_Dt) on that row. My initial thought was to use a window function, but I opted for a subquery, which is close, but not quite what I need.
Please note, my sample query is also missing logic that limits the MIN Discharge_Dt to the last 30 days, in other words, I do not want a MIN Discharge_Dt that is older than 30 days for any given row.
Sample Query:
SELECT Patient_ID,
Discharge_Dt,
/* Calculating the MIN Discharge_Dt by Patient_ID for the last 30
days based upon the Discharge_Dt for that row */
(SELECT MIN(Discharge_Dt)
FROM admissions_ds AS b
WHERE a.Patient_ID = b.Patient_ID AND
a.Discharge_Dt >= DATEADD('D', -30, GETDATE())) AS MIN_Dt
FROM admissions_ds AS a
Desired Output Table:
Patient_ID | Discharge_Dt | MIN_Dt
10 | 2017-08-15 | 2017-08-15
10 | 2017-08-31 | 2017-08-15
10 | 2017-09-21 | 2017-08-31
15 | 2017-07-01 | 2017-07-01
15 | 2017-07-18 | 2017-07-01
20 | 2017-05-05 | 2017-05-05
25 | 2017-09-24 | 2017-09-24
Here you go.
Just a simple self-join is required.
drop TABLE if EXISTS admissions_ds;
create table admissions_ds (Patient_ID int,Discharge_Dt date);
insert into admissions_ds
values
(10,'2017-08-15'),
(10,'2017-08-31'),
(10,'2017-09-21'),
(15,'2017-07-01'),
(15,'2017-07-18'),
(20,'2017-05-05'),
(25,'2017-09-24');
select t1.Patient_ID,t1.Discharge_Dt,min(t2.Discharge_Dt) as min_dt
from admissions_ds as t1
join admissions_ds as t2 on t1.Patient_ID=t2.Patient_ID and t2.Discharge_Dt > t1.Discharge_Dt - interval '30 days'
group by 1,2
order by 1,2
;
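The self-join can be tried out locally with the sketch below, which assumes SQLite via Python's sqlite3 module, so date(x, '-30 days') replaces the interval arithmetic. It also adds an explicit upper bound on t2.Discharge_Dt; that bound is an addition for readability and does not change the minimum, because each row always matches itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE admissions_ds (Patient_ID INTEGER, Discharge_Dt TEXT)")
conn.executemany("INSERT INTO admissions_ds VALUES (?, ?)", [
    (10, "2017-08-15"), (10, "2017-08-31"), (10, "2017-09-21"),
    (15, "2017-07-01"), (15, "2017-07-18"),
    (20, "2017-05-05"),
    (25, "2017-09-24"),
])

ROLLING_MIN_SQL = """
SELECT t1.Patient_ID, t1.Discharge_Dt, MIN(t2.Discharge_Dt) AS min_dt
FROM admissions_ds AS t1
JOIN admissions_ds AS t2
  ON t1.Patient_ID = t2.Patient_ID
 -- Only discharges within the 30 days up to and including this row's date.
 AND t2.Discharge_Dt > date(t1.Discharge_Dt, '-30 days')
 AND t2.Discharge_Dt <= t1.Discharge_Dt
GROUP BY t1.Patient_ID, t1.Discharge_Dt
ORDER BY t1.Patient_ID, t1.Discharge_Dt
"""

rolling_min = conn.execute(ROLLING_MIN_SQL).fetchall()
for row in rolling_min:
    print(row)
```

For the sample rows this reproduces the desired output table from the question.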

How do I group by month when I have data in a time range, accurate up to the second?

I'd like to ask if there's a way to group my data by months in this case:
I have table of orders, with order Ids in a column and the dates the orders were created in another.
For example,
orderId | creationDate
58111 | 2017-01-01 00:00:00
58111 | 2017-01-12 00:00:00
58232 | 2017-01-31 00:00:00
62882 | 2017-02-21 00:00:00
90299 | 2017-03-20 00:00:00
I need to find the number of unique orderIds, grouped by month. Normally this would be simple, but with my creationDates accurate to the second, I have no idea how to segment them into months. Ideally, this is what I'd obtain:
creationMonth | count_orderId
January | 2
February | 1
March | 1
Try this:
select count( distinct orderId ), year( creationDate ), month( creationDate )
from my_table group by year( creationDate ), month( creationDate )
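Here is a small sketch of that grouping on the question's sample rows. It assumes SQLite via Python's sqlite3 module, so strftime() replaces year() and month(); the month number is turned into a name in Python with calendar.month_name to match the desired output.

```python
import calendar
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (orderId INTEGER, creationDate TEXT)")
conn.executemany("INSERT INTO my_table VALUES (?, ?)", [
    (58111, "2017-01-01 00:00:00"),
    (58111, "2017-01-12 00:00:00"),
    (58232, "2017-01-31 00:00:00"),
    (62882, "2017-02-21 00:00:00"),
    (90299, "2017-03-20 00:00:00"),
])

MONTHLY_SQL = """
SELECT strftime('%Y', creationDate) AS yr,
       strftime('%m', creationDate) AS mon,
       COUNT(DISTINCT orderId)      AS count_orderId
FROM my_table
GROUP BY yr, mon
ORDER BY yr, mon
"""

# The time-of-day part is simply ignored: only year and month form the group.
monthly = [
    (calendar.month_name[int(mon)], count)
    for _yr, mon, count in conn.execute(MONTHLY_SQL)
]
print(monthly)
```

Order 58111 appears twice in January but is counted once because of COUNT(DISTINCT ...).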

How to sort PostgreSQL data by weeks over month period

In simplest terms, I want to pull aggregate data from a table over a 4 week period but group by each week. It is safe to assume we can "force" a specific date or time (although it would be nice to allow any date entered and have the query run based on the date entered).
For example, the resulting data from a query would look like this:
start_date | end_date | count_of_sales
---------------------------------------------------------------
2014-03-03 04:00:00 | 2014-03-10 03:59:59 | 375
2014-03-10 04:00:00 | 2014-03-17 03:59:59 | 375
2014-03-17 04:00:00 | 2014-03-24 03:59:59 | 375
2014-03-24 04:00:00 | 2014-03-31 04:00:00 | 200
This would stem from unaggregated data that simply had a date (and of course other data but that is irrelevant):
saleDate | repID | productID
---------------------------------------------------------------
2014-03-04 12:36:33 | 1235 | 443
2014-03-09 07:08:12 | 1235 | 493
2014-03-09 10:12:44 | 3948 | 472
2014-03-21 23:33:01 | 2957 | 479
In my head the query would look SOMETHING like this (although this is not accurate):
SELECT start_date, end_date, COUNT(*) FROM table WHERE date < '2014-03-31 04:00:00' GROUP BY date
I understand that the query above does not know how far back to look. Ideally the customer enters the final date and perhaps how many weeks of prior data they want to pull, which is why I left out a date BETWEEN clause (they may not know the exact 'start' date).
Sorry if this is confusing but hopefully the sample SQL (albeit wrong) and desired results will give a clearer picture
If I understood your question correctly, the following code should help you.
For clarification: the code below is for SQL Server.
With CTE as
(
Select 1 as pID,'2014-03-03 04:00:00' as startDate,'2014-03-10 03:59:59' as endDate
Union All
Select 2,'2014-03-10 04:00:00','2014-03-17 03:59:59'
Union All
Select 3,'2014-03-17 04:00:00','2014-03-24 03:59:59'
Union All
Select 4,'2014-03-24 04:00:00','2014-03-31 04:00:00'
)
select a.pID,a.startDate,a.endDate,count(*) from CTE as a
inner join MyTable on myDateCol between a.startDate and a.endDate
group by a.pID,a.startDate,a.endDate
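The week ranges above are hard-coded, while the question asks for a final date plus a number of weeks. One possible way to generate them is sketched below, assuming SQLite via Python's sqlite3 module. The helper week_buckets and the table names sales and buckets are hypothetical, and the ranges are half-open (start inclusive, end exclusive) so that no sale is counted in two weeks.

```python
import sqlite3
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"


def week_buckets(end, weeks):
    """Return (start, end) strings for `weeks` consecutive 7-day windows,
    the last of which finishes at `end`."""
    end_dt = datetime.strptime(end, FMT)
    buckets = []
    for i in range(weeks, 0, -1):
        start = end_dt - timedelta(weeks=i)
        stop = start + timedelta(weeks=1)
        buckets.append((start.strftime(FMT), stop.strftime(FMT)))
    return buckets


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (saleDate TEXT, repID INTEGER, productID INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("2014-03-04 12:36:33", 1235, 443),
    ("2014-03-09 07:08:12", 1235, 493),
    ("2014-03-09 10:12:44", 3948, 472),
    ("2014-03-21 23:33:01", 2957, 479),
])
conn.execute("CREATE TABLE buckets (start_date TEXT, end_date TEXT)")
conn.executemany(
    "INSERT INTO buckets VALUES (?, ?)",
    week_buckets("2014-03-31 04:00:00", 4),
)

# LEFT JOIN keeps weeks without sales; COUNT(s.saleDate) reports them as 0.
counts = conn.execute("""
SELECT b.start_date, b.end_date, COUNT(s.saleDate) AS count_of_sales
FROM buckets AS b
LEFT JOIN sales AS s
  ON s.saleDate >= b.start_date AND s.saleDate < b.end_date
GROUP BY b.start_date, b.end_date
ORDER BY b.start_date
""").fetchall()
for row in counts:
    print(row)
```

The inner join in the answer above would drop the weeks with no sales; the left join used here keeps them with a count of zero.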