Create a Series of Dates between two Dates in a table - SQL

I have a table like this:
I want to list one row per day between each row's Start Date and End Date, with Total Payment divided by the number of days (I assume I would need a window function partitioned by name here). But my main concern is how to create the series of dates for each name based on its Start Date and End Date.
Using the table above I would like the output to look like this:

Consider a range join with count window function to spread out total by days:
SELECT t."Name",
       t."Total Payment" / COUNT(dates) OVER (PARTITION BY t."Name") AS Payment,
       t."Start Date",
       t."End Date",
       dates AS "Date of"
FROM generate_series(
         timestamp without time zone '2022-01-01',
         timestamp without time zone '2022-12-31',
         '1 day'
     ) AS dates
INNER JOIN my_table t
    ON dates BETWEEN t."Start Date" AND t."End Date"
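To make the logic of the range join easier to follow outside the database, here is a minimal Python sketch of the same idea: one row per day from the start date to the end date inclusive, with the total divided evenly. The function name `spread_by_day` and the sample values are hypothetical, and it assumes the end date is not before the start date.

```python
from datetime import date, timedelta
from decimal import Decimal


def spread_by_day(name, start, end, total):
    """Yield one (name, day, daily_payment) row per day from start to end inclusive."""
    n_days = (end - start).days + 1  # both endpoints count as days
    daily = Decimal(total) / n_days  # even split, no rounding applied
    for offset in range(n_days):
        yield name, start + timedelta(days=offset), daily


# Hypothetical sample row: 100 spread over 1-4 January 2022.
rows = list(spread_by_day("Ann", date(2022, 1, 1), date(2022, 1, 4), 100))
for row in rows:
    print(row)
```

The day count here is computed from the two dates directly, which gives the same result as counting the joined calendar rows per name.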

You can get what you're after in a single query by using generate_series to produce each day and subtracting the two dates to get the day count. (Since you seem to want both dates included in the day count, 1 needs to be added.)
select name,
       (total_payment / ((end_date - start_date) + 1))::numeric(6,2),
       start_date,
       end_date,
       d::date date_of
from test t
cross join generate_series(t.start_date, t.end_date, interval '1 day') gs(d)
order by name desc, date_of;
See demo. I leave for you what to do when the total_payment is not a multiple of the number of days. The demo just ignores it.
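One possible way to handle a total that does not divide evenly, sketched in Python purely as an illustration: round each daily share down to whole cents and add the leftover cents to the last day, so the shares still sum to the total. The function name `split_with_remainder` and the choice of putting the remainder on the last day are assumptions, not something the answer prescribes; it also assumes a non-negative total and at least one day.

```python
from decimal import Decimal, ROUND_DOWN


def split_with_remainder(total, n_days):
    """Split total into n_days shares of whole cents that sum exactly to total."""
    total = Decimal(total)
    cent = Decimal("0.01")
    base = (total / n_days).quantize(cent, rounding=ROUND_DOWN)
    shares = [base] * n_days
    shares[-1] += total - base * n_days  # leftover cents go to the last day
    return shares


shares = split_with_remainder("100.00", 3)
print(shares)
```

Other policies (spreading the leftover cents one per day, or putting them on the first day) are equally valid; the point is only that the rounding decision has to be made explicitly.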

Related

getting day wise query result for a certain time period in postgresql

I have a table in a PostgreSQL database called orders, where all order-related information is stored. If an order gets rejected, its row is moved out of the orders table and stored in the rejected_orders table. As a result, the count function does not return the correct number of orders.
Now, suppose I want the number of order requests on a certain day. I have to subtract the id of the first order of the day from the id of the last order of the day. Below is the query for the total number of requests on March 1st, 2022. Sadly, the previous employee forgot to save the timezone correctly in the database: data is saved in the DB in UTC+00, but fetched data needs to be in GMT+06.
select
(select id from orders
where created_at<'2022-03-02 00:00:00+06'
order by created_at desc limit 1
)
-
(select id from orders
where created_at>='2022-03-01 00:00:00+06'
order by created_at limit 1
) as march_1st;
march_1st
-----------
185
Now, if I want the total requests per day for a certain time period (let's say the month of March, 2021), how can I do that in one SQL query without having to write one query per day?
To wrap up:
total_request_per_day = id of last order of the day - id of first order of the day.
How do I write a query based on that logic that gives me total_request_per_day for every day in a certain month?
Like this:
|Date | total requests|
|01-03-2022 | 187 |
|02-03-2022 | 202 |
|03-03-2022 | 227 |
................
................
With respect, using id numbers to determine numbers of rows in a time period is incorrect. DELETEing rows leaves gaps in id number sequences; they are not designed for this purpose.
This is a job for date_trunc(), COUNT(*), and GROUP BY.
The date_trunc('day', created_at) function turns an arbitrary timestamp into midnight on its day. For example, it turns `2022-03-02 16:41:00` into `2022-03-02 00:00:00`. Using that we can write the query this way.
SELECT COUNT(*) order_count,
date_trunc('day', created_at) day
FROM orders
WHERE created_at >= date_trunc('day', NOW()) - INTERVAL '7 day'
AND created_at < date_trunc('day', NOW())
GROUP BY date_trunc('day', created_at)
This query gives the number of orders on each day in the last 7 days.
Every minute you spend learning how to use SQL date arithmetic like this will pay off in hours saved in your work.
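As a rough illustration of the truncate-and-count idea, the following Python sketch counts timestamps per calendar day. Because the question says the stored values are UTC while the days should be read at GMT+06, the sketch shifts each timestamp to a fixed UTC+06 offset before truncating; that shift is an addition for illustration and is not part of the query above. The function name and sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

LOCAL_TZ = timezone(timedelta(hours=6))  # fixed UTC+06 offset, as in the question


def orders_per_local_day(created_at_utc):
    """Count naive UTC timestamps per calendar day as seen at UTC+06."""
    counts = Counter()
    for ts in created_at_utc:
        local = ts.replace(tzinfo=timezone.utc).astimezone(LOCAL_TZ)
        counts[local.date()] += 1  # .date() plays the role of date_trunc('day', ...)
    return counts


# Hypothetical sample data, stored as UTC.
sample = [
    datetime(2022, 2, 28, 19, 30),  # 01:30 on 1 March at UTC+06
    datetime(2022, 3, 1, 10, 0),    # 16:00 on 1 March at UTC+06
    datetime(2022, 3, 1, 18, 0),    # 00:00 on 2 March at UTC+06
]
counts = orders_per_local_day(sample)
print(sorted(counts.items()))
```

Note how the first timestamp falls on 28 February in UTC but on 1 March at UTC+06, which is why the day boundary has to be chosen before grouping.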
Try this :
SELECT d.ref_date :: date AS "date"
, count(orders.id) AS "total requests" -- count(*) would report 1 for days without orders
FROM generate_series('20220301' :: timestamp, '20220331' :: timestamp, '1 day') AS d(ref_date)
LEFT JOIN orders
ON date_trunc('day', d.ref_date) = date_trunc('day', created_at)
GROUP BY d.ref_date
generate_series() generates the list of reference days for which you want to count the number of orders.
Then you join with the orders table by comparing the reference date with the created_at date on year/month/day only. LEFT JOIN allows you to select reference days with no existing order.
Finally, you count the number of orders per day by grouping by reference day.
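The three steps above (build a calendar, match orders to days, count per day) can be sketched in Python as follows. This is only an illustration of the logic, with hypothetical names and sample data; the key point is that days without any order still appear, with a count of 0.

```python
from collections import Counter
from datetime import date, timedelta


def daily_counts(order_days, first_day, last_day):
    """Return (day, count) for every day in the range, including days with no orders."""
    counts = Counter(order_days)
    n_days = (last_day - first_day).days + 1
    calendar = (first_day + timedelta(days=i) for i in range(n_days))
    return [(day, counts.get(day, 0)) for day in calendar]


# Hypothetical order days: two orders on 1 March, one on 3 March.
order_days = [date(2022, 3, 1), date(2022, 3, 1), date(2022, 3, 3)]
result = daily_counts(order_days, date(2022, 3, 1), date(2022, 3, 4))
for day, n in result:
    print(day, n)
```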

Repeat a record for every day between two dates BigQuery?

I am attempting to produce a table of historical unfulfilled units. Currently, the database captures fulfillment date and order date for a record.
CREATE TABLE `input_table`
(order_name STRING,
line_item_id STRING,
order_date DATE,
fulfillment_date DATE)
Sample Record:
order_name: ABC
line_item_id: 123456
order_date: 2017-04-19
fulfillment_date: 2017-04-25
I want to produce a table that shows the fulfillment status by day, starting with the order date and ending with the date prior to the fulfillment date of each line item, e.g. in the above sample record the output_table would be:
Ultimately, this would allow me to query the count of unfulfilled line items each day:
SELECT
date,
count(line_item_id) AS unfulfilled_line_items
FROM
`output_table`
GROUP BY 1
Indicating the fulfillment status is not strictly necessary, considering it would only include dates in which the status was unfulfilled.
While I could do something like this:
with days as (SELECT
*
FROM
UNNEST(GENERATE_DATE_ARRAY('2017-01-01', CURRENT_DATE(), INTERVAL 1 day)) AS day)
SELECT
*
FROM
`input_table`
JOIN days
ON 1=1
AND order_date <= day
AND fulfillment_date > day
...the operation is fairly expensive.
Is there a better way of going about this?
I want to produce a table that shows the fulfillment status by day, starting with the order date and ending with the date prior to the fulfillment date of each line item
Consider below
select date, order_name, line_item_id, 'unfulfilled' fulfillment_status
from `project.dataset.table`,
unnest(generate_date_array(order_date, fulfillment_date - 1)) date
If applied to the sample entry in your question, the output is one row per day from 2017-04-19 through 2017-04-24, each marked unfulfilled.
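For readers who want to check the boundary handling, here is a small Python sketch of the same expansion: each line item is repeated for every day from its order date up to, but not including, its fulfillment date, and the days are then counted. The function name and the second line item are hypothetical; the first record is the sample from the question.

```python
from collections import Counter
from datetime import date, timedelta


def unfulfilled_days(order_date, fulfillment_date):
    """Days from order_date up to, but not including, fulfillment_date."""
    n_days = (fulfillment_date - order_date).days
    return [order_date + timedelta(days=i) for i in range(n_days)]


line_items = [
    ("123456", date(2017, 4, 19), date(2017, 4, 25)),  # sample record from the question
    ("999999", date(2017, 4, 23), date(2017, 4, 26)),  # hypothetical second line item
]

# Count how many line items are still open on each day.
open_per_day = Counter(
    day
    for _, ordered, fulfilled in line_items
    for day in unfulfilled_days(ordered, fulfilled)
)
for day in sorted(open_per_day):
    print(day, open_per_day[day])
```

Because each record only generates its own days, no calendar covering the whole history has to be joined against every row, which is the saving the answer is after.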

SQL to display data of current month & next month from table

I wanted some guidance on producing an SQL query that collects the table information of the current date and also next month without having to type in every day for the current month being October or the next month being November.
Basically I've got a table called WORK, in this table there are SHIFTID, DATEOFSHIFT, and MEMBERSHIPID. I basically need to list the SHIFTID's of shifts where MEMBERSHIPID = null and where DATEOFSHIFT is in November (next month)
Then I need to produce a query for the shift roster showing SHIFTID, DATEOFSHIFT, and MEMBERSHIPID of each shift in this current month.
This is the structure of my database table if needed.
I would recommend:
select w.*
from work w
where w.membershipid is null and
w.dateofshift >= trunc(sysdate, 'Month') + interval '1' month and
w.dateofshift < trunc(sysdate, 'Month') + interval '2' month;
You can also phrase the where as:
where w.membershipid is null and
trunc(w.dateofshift, 'Month') = trunc(sysdate, 'Month') + interval '1' month
but this makes it hard for Oracle to use an index if an appropriate one is available.
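The half-open range used in the first form (on or after the first day of next month, before the first day of the month after) can be sketched in Python like this. The helper `month_start`, the fixed "current date" and the sample shift dates are hypothetical and only illustrate how the two bounds are derived.

```python
from datetime import date


def month_start(day, months_ahead=0):
    """First day of the month that lies months_ahead months after day's month."""
    index = day.year * 12 + (day.month - 1) + months_ahead
    return date(index // 12, index % 12 + 1, 1)


today = date(2022, 10, 17)     # hypothetical "current date"
lower = month_start(today, 1)  # first day of next month, inclusive
upper = month_start(today, 2)  # first day of the month after, exclusive

shifts = [date(2022, 10, 31), date(2022, 11, 1), date(2022, 11, 30), date(2022, 12, 1)]
next_month_shifts = [d for d in shifts if lower <= d < upper]
print(lower, upper, next_month_shifts)
```

Comparing the raw column against two precomputed bounds is what lets the database use an index on the date column, whereas wrapping the column in a function hides it from the index.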
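Well, from what you've provided, I infer that you want a query to display the information on all those fields for the current month. In SQL Server syntax (MONTH and GETDATE are not Oracle functions), that is achievable with the query below; note that it compares the month number only, so rows from the same month of other years also match: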
Select SHIFTID, DATEOFSHIFT, MEMBERSHIPID
From WORK
Where Month(DATEOFSHIFT)=MONTH(GETDATE());

SQL: Getting the min date of a series of dates partitioning by if previous date is more than 1 day ago

I have a data import which happens every week and when it starts, lasts a couple of days. As a result, in the date column, I have multiple dates for each data import. I would like to get the min date of each import. Is this possible in SQL? Specifically, in Google BigQuery. Example:
date desired_output
4/25/17 4/25/17
4/26/17 4/25/17
4/27/17 4/25/17
5/2/17 5/2/17
5/3/17 5/2/17
5/10/17 5/10/17
5/16/17 5/16/17
5/17/17 5/16/17
5/23/17 5/23/17
5/24/17 5/23/17
5/30/17 5/30/17
5/31/17 5/30/17
6/5/17 6/5/17
6/6/17 6/6/17
You can identify groups of dates that are in order sequentially -- this is a gaps and islands problem. Perhaps this will do what you want:
select date,
min(date) over (partition by date_add(date, interval - seqnum_d day)) as desired_output
from (select t.*,
dense_rank() over (order by date) as seqnum_d
from t
) t
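The date arithmetic identifies runs of consecutive dates: subtracting the sequence number from each date gives the same value for every date in a run, so that value can serve as the partition key.
Note: This assumes that separate imports are divided by a gap of at least one day.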
Also, I used dense_rank() so it can handle multiple entries on a single date.
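To see why subtracting the rank yields a constant per run, here is a minimal Python sketch of the same gaps-and-islands technique. The function name `island_starts` is hypothetical, and the sample dates are the first few rows from the question.

```python
from datetime import date, timedelta


def island_starts(dates):
    """Map each date to the first date of its run of consecutive days."""
    # Equivalent of dense_rank() over (order by date): duplicates share a rank.
    ranks = {d: i for i, d in enumerate(sorted(set(dates)), start=1)}
    first_in_group = {}
    for d in sorted(ranks):
        key = d - timedelta(days=ranks[d])  # constant within a consecutive run
        first_in_group.setdefault(key, d)   # first date seen is the run's minimum
    return {d: first_in_group[d - timedelta(days=ranks[d])] for d in ranks}


import_dates = [
    date(2017, 4, 25), date(2017, 4, 26), date(2017, 4, 27),
    date(2017, 5, 2), date(2017, 5, 3),
    date(2017, 5, 10),
]
starts = island_starts(import_dates)
for d in sorted(starts):
    print(d, starts[d])
```

Within a run, the date and the rank both advance by one per row, so their difference stays fixed; after a gap the date jumps by more than the rank does, which produces a new key.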

Use of week of year & subsequent days in BigQuery

I need to show distinct users per week. I have a date-visit column and a user id; it is a big table with 1 billion rows.
I can change the date column from the CSVs into year, month, and day columns, but how do I deduce the week from that in the query?
I could calculate the week in the CSV, but that is a big processing step.
I also need to show how many distinct users visit day after day; I am looking for a workaround, as there is no date type.
Any ideas?
To get the week of year number:
SELECT STRFTIME_UTC_USEC(TIMESTAMP('2015-5-19'), '%W')
20
If you have your date as a timestamp (i.e., microseconds since the epoch), you can use the UTC_USEC_TO_DAY/UTC_USEC_TO_WEEK functions. Alternatively, if you have an iso-formatted date string (e.g. "2012/03/13 19:00:06 -0700"), you can call PARSE_UTC_USEC to turn the string into a timestamp and then use that to get the week or day.
To see an example, try:
SELECT LEFT((format_utc_usec(day)),10) as day, cnt
FROM (
SELECT day, count(*) as cnt
FROM (
SELECT UTC_USEC_TO_DAY(PARSE_UTC_USEC(created_at)) as day
FROM [publicdata:samples.github_timeline])
GROUP BY day
ORDER BY cnt DESC)
To show week, just change UTC_USEC_TO_DAY(...) to UTC_USEC_TO_WEEK(..., 0) (the 0 at the end is to indicate the week starts on Sunday). See the documentation for the above functions at https://developers.google.com/bigquery/docs/query-reference for more information.
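As a cross-check of the week-number logic, the following Python sketch groups visits by year and week and counts distinct users. It relies on Python's own `strftime("%W")` (week of year with Monday as the first day), which gives 20 for 2015-05-19, matching the value shown earlier; the function name and the sample visits are hypothetical.

```python
from collections import defaultdict
from datetime import date


def distinct_users_per_week(visits):
    """Group (visit_date, user_id) pairs by (year, week) and count unique users."""
    users = defaultdict(set)
    for visit_date, user_id in visits:
        week = int(visit_date.strftime("%W"))  # Monday-based week of year
        users[(visit_date.year, week)].add(user_id)
    return {key: len(ids) for key, ids in users.items()}


# Hypothetical visits: u1 appears twice in the same week and is counted once.
visits = [
    (date(2015, 5, 18), "u1"),
    (date(2015, 5, 19), "u1"),
    (date(2015, 5, 19), "u2"),
    (date(2015, 5, 25), "u1"),
]
weekly = distinct_users_per_week(visits)
print(weekly)
```

Keying on the pair of year and week keeps week 20 of one year separate from week 20 of another, which matters once the data spans more than twelve months.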