Generate sales fact data from a date range - sql

I have a "placements" table with the following fields:
id,
name (string),
start_date (date),
end_date (date),
hourly_rate (float)
My goal is to run reports to see billing forecasts based on this placement data, where 8 hours are billed each workday (M-F). The billing forecasts need to be runnable weekly or monthly.
For example, when a new placement is made for Jan 1st 2021 to Jan 31st 2021 with an hourly_rate of $50, I should be able to run a report for an arbitrary time period (daily, weekly, monthly) and see the additional billings for that period.
To do this, I wanted to use the placement data above to generate another set of data in a fact table that would look like this:
id,
placement_id,
date (date) -- or could be a reference to a date dimension table
amount (float) -- placement.hourly_rate x 8 hours
Is there a way to use Postgres to generate these new rows, or is there a better way to accomplish my goal?

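You can use generate_series() and a lateral join, filtering out weekends since only workdays are billed:
select p.id as placement_id, d.date::date as date, p.hourly_rate * 8 as amount
from placements p
cross join lateral generate_series(p.start_date, p.end_date, '1 day') d(date)
where extract(isodow from d.date) < 6 -- ISO day of week: 1 = Monday ... 5 = Friday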
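To sanity-check the generated rows outside the database, the same expansion can be sketched in Python. This is only an illustration under assumptions: the function name `expand_placement` and the tuple layout are hypothetical, and weekends are skipped because the question bills only Monday through Friday.

```python
from datetime import date, timedelta

HOURS_PER_WORKDAY = 8  # from the question: 8 billed hours per workday


def expand_placement(placement_id, start_date, end_date, hourly_rate):
    """Return one (placement_id, day, amount) row per weekday in the range.

    Hypothetical helper mirroring the fact table described in the question.
    """
    rows = []
    day = start_date
    while day <= end_date:  # end_date is inclusive, like generate_series
        if day.weekday() < 5:  # Monday=0 ... Friday=4
            rows.append((placement_id, day, hourly_rate * HOURS_PER_WORKDAY))
        day += timedelta(days=1)
    return rows


# The example placement from the question: Jan 1st to Jan 31st 2021 at $50/hour.
rows = expand_placement(1, date(2021, 1, 1), date(2021, 1, 31), 50.0)
january_total = sum(amount for _, _, amount in rows)
```

Summing `amount` over any slice of these rows (a week, a month) gives the forecast for that period, which is the same aggregation a report would run against the fact table.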

Related

Calculate the monthly average including the date where data is missing

I want to calculate the monthly average of some data using SQL query where the data resides in redshift DB.
The data is present in the following format in the table.
s_date | sales
------------+-------
2020-08-04 | 10
2020-08-05 | 20
---- | --
---- | --
The data may not be present for every date in a month. If the data is not present for a day, that day should be counted as 0.
The following query uses AVG() with GROUP BY month, but it averages only over the dates that have data.
select trunc(date_trunc('MONTH', s_date)::timestamp) as month, avg(sales) from sales group by month;
However it does not consider the data for missing dates as 0. What should be the right query to calculate the monthly average as expected?
One more expectation is that, for the current month, the average should be calculated based on the data till today. So it should not consider entire month (like 30 or 31 days).
Regards,
Paul
Using a calendar table might be the easiest way to go here:
WITH dates AS (
SELECT date_trunc('day', t)::date AS dt
FROM generate_series('2020-01-01'::timestamp, '2020-12-31'::timestamp, '1 day'::interval) t
),
cte AS (
SELECT t.dt, COALESCE(SUM(s.sales), 0) AS sales
FROM dates t
LEFT JOIN sales s ON t.dt = s.s_date
GROUP BY t.dt
)
SELECT
LEFT(dt::text, 7) AS ym,
AVG(sales) AS avg_sales
FROM cte
GROUP BY
LEFT(dt::text, 7);
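The logic here is to first generate an intermediate table in the second CTE which has one record for each date in the calendar range, along with the total sales for that date (0 when there are none). Then, we aggregate by year/month, and report the average sales. To average the current month only up to today, end the generated series at the current date instead of a fixed end date.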
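The zero-filling logic, including the question's requirement that the current month be averaged only up to today, can be sketched in Python. This is an illustration rather than a Redshift query; the function `monthly_average` and the dict input are assumptions made for the example.

```python
from calendar import monthrange
from datetime import date


def monthly_average(sales_by_date, year, month, today):
    """Average daily sales for a month, counting missing days as 0.

    For the month containing `today`, only days up to and including
    `today` are counted. Returns None for months entirely in the future.
    """
    if (year, month) == (today.year, today.month):
        days_counted = today.day
    elif date(year, month, 1) > today:
        return None
    else:
        days_counted = monthrange(year, month)[1]  # days in that month
    total = sum(
        amount
        for day, amount in sales_by_date.items()
        if day.year == year and day.month == month and day.day <= days_counted
    )
    # Dividing by the number of calendar days is what treats gaps as zeros.
    return total / days_counted


sales = {date(2020, 8, 4): 10, date(2020, 8, 5): 20}
full_august = monthly_average(sales, 2020, 8, today=date(2020, 9, 15))
partial_august = monthly_average(sales, 2020, 8, today=date(2020, 8, 10))
```

With the two sample rows from the question, the full month divides 30 by 31 days, while the run dated August 10 divides 30 by 10 days.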

sql to find the weekdays in the month from the current day

Please help me with this. I am using SQL Server 2008.
I need to find the number of sales made on the current day,
then find the weekday of the current date and, based on that, the average of the sales on that same weekday over the last month.
For example:
select count(sales) from salestable where orderdate= getdate()
gives the count of the sales made on the current date.
Then I need to find the average of the sales made on the same weekday. For example, if today is Sunday, find the average of the sales made on all Sundays in the last month.
I recommend that you borrow the data warehousing technique of creating a Calendar table that you pre-populate with 1 row for every date within the range you might need. You can add to it basically any column that is useful - in this case DayOfWeek and MonthID. Then you can eliminate date math entirely and use joins - sort of like this (not complete but points you in the right direction):
select count(salestable.sales) as salescount, a.salesavg
from salestable
join calendar on salestable.orderdate = calendar.calendardate
join (
    select monthid, dayofweek, avg(salestable.sales) as salesavg
    from salestable
    join calendar on salestable.orderdate = calendar.calendardate
    group by monthid, dayofweek) as a
on calendar.monthid = a.monthid and calendar.dayofweek = a.dayofweek
where calendar.calendardate = cast(getdate() as date) -- getdate() alone includes the time of day
group by a.salesavg
You create and populate the calendar table once and reuse it every time you need to do date operations. Once you get used to this technique, you will NEVER go back to date math.
Common Table Expressions are very useful for this kind of query. You can then use the DATEPART function to get the day of the week.
This solution is also untested and intended to just point you in the right direction.
It uses a scalar subquery to get the average sales.
select
    order_date,
    count(sales) total_sales,
    (select avg(sales)
     from sales_table
     where order_date between dateadd(day,-30,@your_date) and @your_date
       and datepart(WEEKDAY,order_date) = datepart(WEEKDAY,@your_date)
    ) avg_sales_mth
from sales_table
where order_date = @your_date
group by order_date
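The same-weekday filter over a 30-day window can be checked against a small Python sketch. It rests on assumptions that are not in the original answer: the function name `same_weekday_average` is hypothetical, and the input holds one sales total per date instead of one row per sale.

```python
from datetime import date, timedelta


def same_weekday_average(sales_by_date, your_date, window_days=30):
    """Average the per-day values that fall on your_date's weekday.

    The window runs from your_date - window_days through your_date,
    inclusive on both ends, like the BETWEEN in the query. Days with
    no entry are skipped, not counted as 0.
    """
    start = your_date - timedelta(days=window_days)
    matching = [
        amount
        for day, amount in sales_by_date.items()
        if start <= day <= your_date and day.weekday() == your_date.weekday()
    ]
    return sum(matching) / len(matching) if matching else None


sales = {
    date(2020, 12, 27): 500,  # a Sunday, but outside the 30-day window
    date(2021, 1, 3): 10,
    date(2021, 1, 10): 20,
    date(2021, 1, 24): 30,
    date(2021, 1, 25): 999,  # a Monday, ignored
    date(2021, 1, 31): 40,
}
sunday_average = same_weekday_average(sales, date(2021, 1, 31))
```

Note that a fixed 30-day window is not the same as "last calendar month"; a calendar table with a month column, as suggested in the other answer, handles that distinction more directly.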

Calculating sales for a particular weekday in a given period in teradata

I have a table with transactions, dates and their respective sales values. I need to calculate the sum of sales of all the distinct transactions on all Saturdays between dates x and y. Teradata doesn't have a DATENAME or DATEPART function. How can I do this?
I'm not a Teradata expert, and I don't even know it, but I've found something. It's only a suggestion rather than a solution, of course, because you've said nothing about your database schema.
Source: Dates and Times in Teradata (thru V2R4.1):
Computing the day of the week for a given date is not easy in SQL. If you need a weekday, I recommend that you look it up in the view sys_calendar.calendar (or join to it), thus:
select day_of_week
from sys_calendar.calendar
where calendar_date = date '2003-05-01';
day_of_week
-----------
5 [i.e. Thursday]
I did an inner join as follows (in sys_calendar.calendar, day_of_week runs from 1 = Sunday to 7 = Saturday):
inner join sys_calendar.calendar as cal on cal.calendar_date=my_table.date where cal.day_of_week=7

SQL query to search by day/month/year/day&month/day&year etc

I have a PostgreSQL database with events. Each event has a datetime or an interval. Common data are stored in the events table and dates are stored in either events_dates (datetime field) or events_intervals (starts_date, ends_date both are date fields).
Sample datetime events
I was born on 1930-06-09
I got my driver's license on 1950-07-12
Christmas is on 1900-12-24 (1900 is reserved for yearly recurring events)
Sample interval events
I'll be on vacation from 2011-06-09 till 2011-07-23
Now I have a user that will want to look up these events. They will be able to fill out a form with from and to fields and in those fields they can enter full date, day, month, year, day and month, day and year, month and year in one or both fields.
Sample queries
From May 3 to 2012 December 21 will look for events between May 3 and December 21 whose max year is 2012
From day 3 to day 15 will look for events between the 3rd and 15th day of every month and year
From day 3 will look for events on the 3rd day of every month and year (same if from is empty and to is not)
From May 3 to June will look for events between May 3 and last day of June of every year
etc.
Any tips on how to write a maintainable query (it doesn't necessarily have to be fast)?
Some things that we thought of:
write all possible from, to and day/month/year combinations - not maintainable
compare dates as strings e.g. input: ____-06-__ where _ is a wildcard - I wouldn't have to generate all possible combinations but this doesn't work for intervals
You can write maintainable queries that additionally are fast by using the pg/temporal extension:
https://github.com/jeff-davis/PostgreSQL-Temporal
create index on events using gist(period(start_date, end_date));
select *
from events
where period(start_date, end_date) @> :date;
select *
from events
where period(start_date, end_date) && period(:start, :end);
You can even use it to disallow overlaps as a table constraint:
alter table events
add constraint overlap_excl
exclude using gist(period(start_date, end_date) WITH &&);
Regarding "write all possible from, to and day/month/year combinations - not maintainable":
It's actually more maintainable than you might think, e.g.:
select *
from events
join generate_series(:start_date, :end_date, :interval) as datetime
on start_date <= datetime and datetime < end_date;
But it's much better to use the above-mentioned period type.
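The containment and overlap tests that these queries rely on can be sketched in plain Python. The sketch assumes half-open ranges, where the start is included and the end is excluded, matching the `start_date <= datetime and datetime < end_date` condition above; the function names are hypothetical and are not part of the extension.

```python
from datetime import date


def contains(start, end, point):
    """True if point lies in the half-open range [start, end)."""
    return start <= point < end


def overlaps(start_a, end_a, start_b, end_b):
    """True if [start_a, end_a) and [start_b, end_b) share at least one day."""
    return start_a < end_b and start_b < end_a


# The vacation interval from the question.
vacation = (date(2011, 6, 9), date(2011, 7, 23))

on_first_day = contains(*vacation, date(2011, 6, 9))
on_end_date = contains(*vacation, date(2011, 7, 23))
overlaps_july = overlaps(*vacation, date(2011, 7, 1), date(2011, 8, 1))
touches_only = overlaps(*vacation, date(2011, 7, 23), date(2011, 8, 1))
```

Under this assumption a range that starts exactly where another ends does not overlap it. If the end date should count as part of the event, the comparisons would need `<=` instead.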

SQL queries with date types

I want to run some queries on a sales table and a purchases table, such as showing the total costs or the total sales of a particular item.
Both of these tables also have a date field, which stores dates in a sortable format (just the default, I suppose?).
I am wondering how I would use this date field in my queries to filter on date ranges such as:
The last year, from any given date of the year
The last 30 days, from any given day
Set months, such as January, February etc.
Are these types of queries possible using just a DATE field, or would it be easier to store months and years as separate text fields?
Given a DATE field MY_DATE, you can perform those three operations using various date functions:
1. Select last year's records (the previous calendar year)
SELECT * FROM MY_TABLE
WHERE YEAR(my_date) = YEAR(CURDATE()) - 1
2. Last 30 Days
SELECT * FROM MY_TABLE
WHERE DATE_SUB(CURDATE(), INTERVAL 30 DAY) < MY_DATE
3. Show the month name
SELECT MONTHNAME(MY_DATE), MY_TABLE.* FROM MY_TABLE
I have always found it advantageous to store dates as Unix timestamps. They're extremely easy to sort by and query by range, and MySQL has built-in features that help (like UNIX_TIMESTAMP() and FROM_UNIXTIME()).
You can store them in INT(11) columns. When you program with them you quickly learn that a day is 86400 seconds, and you can build longer ranges by multiplying that by a number of days (e.g. 86400 * 30 is a rough approximation of a month). Programming languages usually have excellent facilities for converting to and from Unix timestamps built into their standard libraries.
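As a rough illustration of the arithmetic described here, the following Python sketch does a "last 30 days" check on Unix timestamps. The helper names `to_unix` and `within_last_days` are hypothetical, and all dates are taken as midnight UTC to keep the example independent of the local time zone.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 86400  # Unix time ignores leap seconds


def to_unix(year, month, day):
    """Unix timestamp for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


def within_last_days(ts, now_ts, days):
    """True if ts lies in the days * 86400 seconds ending at now_ts."""
    return now_ts - days * SECONDS_PER_DAY < ts <= now_ts


now_ts = to_unix(2021, 3, 1)
recent = within_last_days(to_unix(2021, 1, 31), now_ts, 30)
stale = within_last_days(to_unix(2021, 1, 29), now_ts, 30)

# "A month is 30 days" is only an approximation: Jan 31 plus 30 days is Mar 2.
approx_month_later = datetime.fromtimestamp(
    to_unix(2021, 1, 31) + 30 * SECONDS_PER_DAY, tz=timezone.utc
).date()
```

The last value shows where the approximation drifts: adding 30 days to January 31 skips past the end of February, so calendar-aligned reports (a named month, the previous year) are usually easier to express with a DATE column and date functions.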