Determine date gaps - sql

I have a SQL Server table that has a begin date and end date column that denote the beginning and ending range of a pricing schedule.
As the years go by, many versions of this same schedule will be created, with different beginning and ending dates.
What I would like to do is ensure that the user doesn't add, or, in some cases edit, a beginning or ending date in such a way that days would be excluded in the overall time frame.
So if the data looked like this:
Start | End
-----------+--------------
01/01/2015 | 06/30/2015
07/01/2015 | 09/30/2016
10/01/2016 | 12/31/2020
So, let's assume I attempted to revise the last row's Start to 10/15/2016. That would create a gap of days between 10/01/2016 and 10/14/2016, but I have no idea how to write a script to detect this for me. Ultimately, I would like a list of all missing dates, but even a count of days missing would be great.
Is this possible or am I approaching the issue incorrectly? Any ideas?
Using SQL Server 2012, if it matters.

I am guessing you don't want overlaps either. So, just use lag() and check that it is the date before:
select t.*
from (select t.*,
lag(end_date) over (order by start_date) as prev_end_date
from t
) t
where start_date <> dateadd(day, 1, prev_end_date)
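The query above flags the offending rows; the question also asks for the list (or count) of missing dates. As a rough sketch of that follow-up step, here is the same "previous end date plus one day" comparison written in Python rather than T-SQL so it can be run without a database. The function name missing_dates and the in-memory schedule list are made up for illustration, and the sketch assumes the ranges do not overlap.

```python
from datetime import date, timedelta

def missing_dates(ranges):
    """Return every date that falls between consecutive (start, end) ranges."""
    ordered = sorted(ranges)  # sort by start date, like ORDER BY start_date
    gaps = []
    for (_, prev_end), (start, _) in zip(ordered, ordered[1:]):
        day = prev_end + timedelta(days=1)  # first date the next range should cover
        while day < start:
            gaps.append(day)
            day += timedelta(days=1)
    return gaps

# Data from the question, with the last Start revised to 10/15/2016
schedule = [
    (date(2015, 1, 1), date(2015, 6, 30)),
    (date(2015, 7, 1), date(2016, 9, 30)),
    (date(2016, 10, 15), date(2020, 12, 31)),
]
gaps = missing_dates(schedule)
print(len(gaps), gaps[0], gaps[-1])
```

The length of the returned list gives the count of missing days; the list itself gives the individual dates.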

Related

Oracle SQL: Count Weekdays of a Calendar Week

I want to write a query that shows whether a certain calendar week has all 7 days.
It would be okay if it just returns the numbers 1-7.
The table that I have contains articles from 3 months of 2020, and even though the first week only contains Wednesday to Sunday, it still counts as a calendar week.
With that select I would write a PL/SQL script that checks the result and does something if the week is complete.
This is an example of the Table:
Date Articel_Id
14.10.2020 78
15.10.2020 80
16.10.2020 96
17.10.2020 100
18.10.2020 99
Can I use to_char() to check whether a calendar week has all 7 days?
If yes, how?
The challenge is actually defining the weeks. If you want to define them using the ISO standard, then aggregate:
select to_char(date, 'IYYY-IW') as yyyyww,
count(distinct trunc(date)) as num_days
from t
group by to_char(date, 'IYYY-IW')
order by yyyyww;
This counts the number of days per week. I'm not sure if you want to filter, have a flag, or what the result set should look like. For filtering, use a having clause, such as having count(distinct trunc(date)) = 7.
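To make the grouping concrete, here is a small sketch of the same idea in Python (not Oracle SQL): bucket each date by its ISO year and ISO week, count the distinct days, and keep only the weeks with 7. The helper name days_per_iso_week is invented for this example; the sample dates are the five rows shown in the question.

```python
from collections import defaultdict
from datetime import date

def days_per_iso_week(dates):
    """Map (ISO year, ISO week) to the number of distinct calendar days seen."""
    seen = defaultdict(set)
    for d in dates:
        iso_year, iso_week, _ = d.isocalendar()
        seen[(iso_year, iso_week)].add(d)
    return {week: len(days) for week, days in seen.items()}

# The five dates from the question (14.10.2020 to 18.10.2020)
sample = [date(2020, 10, day) for day in (14, 15, 16, 17, 18)]
counts = days_per_iso_week(sample)

# Equivalent of HAVING count(distinct ...) = 7
complete_weeks = [week for week, n in counts.items() if n == 7]
print(counts, complete_weeks)
```

With only Wednesday to Sunday present, the week is counted but does not pass the "all 7 days" filter.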

Count distinct customers, active within a year, for every week of the year

I am working with an existing E-commerce database. Actually, this process is usually done in Excel, but we want to try it directly with a query in PostgreSQL (version 10.6).
We define as an active customer a person who has bought at least once within 1 year. This means, if I analyze week 22 in 2020, an active customer will be the one that has bought at least once since week 22, 2019.
I want the output for each week of the year (2020). Basically what I need is ...
select
email,
orderdate,
id
from
orders_table
where
paid = true;
|---------------------|-------------------|-----------------|
| email | orderdate | id |
|---------------------|-------------------|-----------------|
| email1#email.com |2020-06-02 05:04:32| Order-2736 |
|---------------------|-------------------|-----------------|
I can't create new tables. And I would like to see the output like this:
Year| Week | Active customers
2020| 25 | 6978
2020| 24 | 3948
Depending on whether there is a year and week column, you can use OVER (PARTITION BY ...) with extract:
SELECT
extract(year from orderdate) AS year,
extract(week from orderdate) AS week,
sum(1) OVER (PARTITION BY extract(year from orderdate),
extract(week from orderdate)) AS customer_count_in_week
FROM orders_table
WHERE paid = true;
Which should bucket all orders by year and week, thus showing the total count per week in a year where paid is true.
references:
https://www.postgresql.org/docs/9.1/tutorial-window.html
https://www.postgresql.org/docs/8.1/functions-datetime.html
if I analyze week 22 in 2020, an active customer will be the one that has bought at least once since week 22, 2019.
Problems on your side
This method has some corner case ambiguities / issues:
Do you include or exclude "week 22 in 2020"? (I exclude it below to stay closer to "a year".)
A year can have 52 or 53 full weeks. Depending on the current date, the calculation is based on 52 or 53 weeks, causing a possible bias of almost 2 %!
If you start the time range on "the same date last year", then the margin of error is only 1 / 365 or ~ 0.3 %, due to leap years.
A fixed "period of 365 days" (or 366) would eliminate the bias altogether.
Problems on the SQL side
Unfortunately, window functions do not currently allow the DISTINCT key word (for good reasons). So something of the form:
SELECT count(DISTINCT email) OVER (ORDER BY year, week
GROUPS BETWEEN 52 PRECEDING AND 1 PRECEDING)
FROM ...
.. triggers:
ERROR: DISTINCT is not implemented for window functions
The GROUPS keyword has only been added in Postgres 11 and would otherwise be just what we need.
What's more, your odd frame definition wouldn't even work exactly, since the number of weeks to consider is not always 52, as discussed above.
So we have to roll our own.
Solution
The following simply generates all weeks of interest, and computes the distinct count of customers for each. Simple, except that date math is never entirely simple. But, depending on details of your setup, there may be faster solutions. (I had several other ideas.)
The time range for which to report may change. Here is an auxiliary function to generate weeks of a given year:
CREATE OR REPLACE FUNCTION f_weeks_of_year(_year int)
RETURNS TABLE(year int, week int, week_start timestamp)
LANGUAGE sql STABLE STRICT PARALLEL SAFE -- STABLE, not IMMUTABLE: the result depends on localtimestamp
ROWS 52 COST 10 AS
$func$
SELECT _year, d.week::int, d.week_start
FROM generate_series(date_trunc('week', make_date(_year, 01, 04)::timestamp) -- first day of first week
, LEAST(date_trunc('week', localtimestamp), make_date(_year, 12, 28)::timestamp) -- latest possible start of week
, interval '1 week') WITH ORDINALITY d(week_start, week)
$func$;
Call:
SELECT * FROM f_weeks_of_year(2020);
It returns 1 row per week, but stops at the current week for the current year. (Empty set for future years.)
The calculation is based on these facts:
The first ISO week of the year always contains January 04.
The last ISO week cannot start after December 28.
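These two facts can be checked with a short sketch. The Python below (not PL/pgSQL) generates the Monday of every ISO week of a year from January 04 and December 28 alone. The name weeks_of_year is illustrative, and unlike the SQL function this version does not stop at the current week.

```python
from datetime import date, timedelta

def weeks_of_year(year):
    """Yield (week number, Monday of that week) for every ISO week of the year."""
    jan4, dec28 = date(year, 1, 4), date(year, 12, 28)
    # Step back to Monday, similar to date_trunc('week', ...)
    first_monday = jan4 - timedelta(days=jan4.weekday())
    last_monday = dec28 - timedelta(days=dec28.weekday())
    week, monday = 1, first_monday
    while monday <= last_monday:
        yield week, monday
        week, monday = week + 1, monday + timedelta(weeks=1)

weeks_2020 = list(weeks_of_year(2020))
print(len(weeks_2020), weeks_2020[0], weeks_2020[-1])
```

Each generated week number can be compared against date.isocalendar() to confirm that the running counter matches the ISO week number.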
Actual week numbers are computed on the fly using WITH ORDINALITY. See:
PostgreSQL unnest() with element number
Aside, I stick to timestamp and avoid timestamptz for this purpose. See:
Generating time series between two dates in PostgreSQL
The function also returns the timestamp of the start of the week (week_start), which we don't need for the problem at hand. But I left it in to make the function more useful in general.
Makes the main query simpler:
WITH weekly_customer AS (
SELECT DISTINCT
EXTRACT(YEAR FROM orderdate)::int AS year
, EXTRACT(WEEK FROM orderdate)::int AS week
, email
FROM orders_table
WHERE paid
AND orderdate >= date_trunc('week', timestamp '2019-01-04') -- max range for 2020!
ORDER BY 1, 2, 3 -- optional, might improve performance
)
SELECT d.year, d.week
, (SELECT count(DISTINCT email)
FROM weekly_customer w
WHERE (w.year, w.week) >= (d.year - 1, d.week) -- row values, see below
AND (w.year, w.week) < (d.year , d.week) -- exclude current week
) AS active_customers
FROM f_weeks_of_year(2020) d; -- (year int, week int, week_start timestamp)
The CTE weekly_customer folds to unique customers per calendar week once, as duplicate entries are just noise for our calculation. It's used many times in the main query. The cut-off condition is based on Jan 04 once more. Adjust to your actual reporting period.
The actual count is done with a lowly correlated subquery. Could be a LEFT JOIN LATERAL ... ON true instead. See:
What is the difference between LATERAL and a subquery in PostgreSQL?
Using row value comparison to make the range definition simple. See:
SQL syntax term for 'WHERE (col1, col2) < (val1, val2)'
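For readers unfamiliar with row value comparison, Python tuples compare the same way (element by element, left to right), so the range filter of the correlated subquery can be sketched as follows. The function name active_customers and the sample rows are hypothetical and only stand in for the weekly_customer CTE.

```python
def active_customers(weekly_customer, year, week):
    """Count distinct emails in the year before (year, week), excluding that week."""
    lower, upper = (year - 1, week), (year, week)
    # Tuples compare element by element, like SQL row values
    return len({email for y, w, email in weekly_customer if lower <= (y, w) < upper})

# Hypothetical (year, week, email) rows
rows = [
    (2019, 21, "a@example.com"),  # before the window for week 22 of 2020
    (2019, 22, "b@example.com"),
    (2020, 10, "b@example.com"),  # same customer again, counted once
    (2020, 21, "c@example.com"),
    (2020, 22, "d@example.com"),  # current week, excluded
]
result = active_customers(rows, 2020, 22)
print(result)
```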

sql- move each date to 1st day of the month

I need to take every date to the first day of the month. For example if I have:
20140103 I need to have 20140101
I thought a good idea could be loaddate - difference between loaddate and 1st date and I wrote:
loaddate- DATEdiff(day, day(loaddate),loaddate)
But the result is wrong. How can I solve this???
Thanks
For SQL Server you can do:
SELECT CONVERT(VARCHAR(25),DATEADD(dd,-(DAY(loaddate)-1),loaddate),101)
For example, with the current date:
SELECT CONVERT(VARCHAR(25),DATEADD(dd,-(DAY(GETDATE())-1),GetDATE()),101)
you will get back the first day of the current month, e.g. 09/01/2014 when run in September 2014.
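The arithmetic behind this is "subtract (day of month - 1) days". A minimal sketch of the same calculation in Python (not T-SQL), with first_of_month as an illustrative name:

```python
from datetime import date, timedelta

def first_of_month(d):
    """Move a date back to the first day of its month."""
    # Same idea as DATEADD(dd, -(DAY(d) - 1), d)
    return d - timedelta(days=d.day - 1)

loaddate = date(2014, 1, 3)  # the 20140103 example from the question
result = first_of_month(loaddate)
print(result)
```

In Python specifically, loaddate.replace(day=1) gives the same result more directly.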

Earliest and Lastdate for each year in sql

I have a table with 3 columns, and multiple records per year. Some of my records look like this:
ID stardate enddate
1 1/1/2010 5/3/2010
2 2/4/2010 NULL -**EDIT**
3 1/2/2011 5/6/2011
4 3/4/2011 NULL -**EDIT**
I want to get a result for the earliest date in that year and the last date in that year. So output could be like
**EDITED:** 1/1/2010 12/31/2010 - For Year 2010
**EDITED:** 1/2/2011 12/31/2011 - For Year 2011
How can I get that in a query? If you need more info, please ask. Thanks
EDIT: If, for a given year, one of the enddate values is NULL, then I have to treat the last day of that year (i.e. 12/31/YYYY) as the enddate. I need to do that for each year.
Assuming you use DATE (or related) columns in a MySQL table, something like this should serve your request:
SELECT MIN(startdate),
MAX(enddate),
YEAR(startdate)
FROM my_table
GROUP BY YEAR(startdate);
This groups all entries by year (of the startdate) and shows you the minimum and maximum entries for each year as you want. See also the documentation for the DATE function in MySQL.
There are similar date functions and possibilities if you are using another database system. Usually you can easily find them by googling the database system and something like "date functions".
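The query above does not cover the NULL rule from the question's EDIT. As a hedged sketch of that rule, written in Python rather than MySQL, the following groups by the year of the start date and substitutes December 31 of that year when the end date is missing. The name year_bounds is made up, and the sample dates are read as month/day/year, which is an assumption.

```python
from datetime import date

def year_bounds(rows):
    """Return {year: (earliest start, latest end)}; a missing end counts as Dec 31."""
    bounds = {}
    for start, end in rows:
        year = start.year
        end = end or date(year, 12, 31)  # NULL enddate -> last day of that year
        lo, hi = bounds.get(year, (start, end))
        bounds[year] = (min(lo, start), max(hi, end))
    return bounds

# Rows from the question, read as month/day/year (assumption)
rows = [
    (date(2010, 1, 1), date(2010, 5, 3)),
    (date(2010, 2, 4), None),
    (date(2011, 1, 2), date(2011, 5, 6)),
    (date(2011, 3, 4), None),
]
bounds = year_bounds(rows)
print(bounds)
```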
select MIN(stardate),max(enddate)
from [Tablename]
where YEAR(enddate)=2013

Add date without exceeding a month

I hope someone could help me on this.
I want to add a month to a database date, but I want to prevent it from jumping over a month for days at the end of the month.
For instance I may have:
Jan 31 2009
And I want to get
Feb 28 2009
and not
March 2 2009
Next date would be
March 28 2009
Jun 28 2009
etc.
Is there a function that already perform this kind of operation in oracle?
EDIT
Yeap. I want to copy each month all the records with some status to the next (so the user doesn't have to enter 2,000 rows again each month)
I can fetch all the records and update the date manually ( well in an imperative way ) but I would rather let the SQL do the job.
Something like:
insert into the_table
select f1,f2,f3, f_date + 30 /* sort of ... :S */ from the_Table where date > ?
But the problem comes with the last day.
Any idea before I have to code something like this?
for each record in
createObject( record )
object.date + date blabala
if( date > 29 and if februrary and the moon and the stars etc etc 9
end
update.... et
EDIT:2
Add months did the trick.
now I just have this:
insert into my_table
select f1, add_months( f2, 1 ) from my_table where status = etc etc
Thanks for the help.
Oracle has a built-in function ADD_MONTHS that does exactly that:
SQL> select add_months(date '2008-01-31',1) from dual;
ADD_MONTHS(
-----------
29-FEB-2008
SQL> select add_months(date '2008-02-29',1) from dual;
ADD_MONTHS(
-----------
31-MAR-2008
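For readers without an Oracle instance at hand, the behaviour shown in these two calls can be approximated in Python (not Oracle SQL). The sketch follows the documented ADD_MONTHS rule: if the input is the last day of its month, or the target month is shorter than the input's day number, the result is the last day of the target month. It handles dates only (no time component) and is an approximation, not Oracle's implementation.

```python
import calendar
from datetime import date

def add_months(d, months):
    """Approximate Oracle's ADD_MONTHS for plain dates."""
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    was_last_day = d.day == calendar.monthrange(d.year, d.month)[1]
    # Last day stays last day; otherwise clamp to the length of the target month
    day = last_day if was_last_day else min(d.day, last_day)
    return date(year, month, day)

print(add_months(date(2008, 1, 31), 1))
print(add_months(date(2008, 2, 29), 1))
```

Note that under this rule Feb 28 2009 plus one month gives the last day of March, not March 28 as in the question's sample sequence.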
I think you're looking for LAST_DAY:
http://download.oracle.com/docs/cd/B28359_01/olap.111/b28126/dml_functions_2006.htm
I just did:
select add_months(TO_DATE('30-DEC-08'), 2) from dual
and got
28-FEB-2009
No need to use LAST_DAY. If you went that route, you could create a function that:
1. Takes a date.
2. Changes the day to the first of the month.
3. Adds a month.
4. Changes the day to the LAST_DAY for that month.
I think you'll have to write it on your own. My advice is to first evaluate the "last day of the month" with this method:
Add one month (not 30 days, one month!)
Find the first day of the month (should be easy)
Subtract one day
Then compare it to your "plus x days" value, and choose the lowest one (I understood the logic behind the jump from 31/Jan to 28/Feb, but I don't get it for the jump from 28-Feb to 28-Mar)
It sounds like you want the current month plus one (with appropriate rollover in December)
and the minimum of the last day of that month and the current day.