query to find the maximum gap between dates - sql

I have a table with a customer name column and a date column, and I want to write a query that gives me the maximum gap in days between consecutive dates for each user:
name date
ali 2022-01-01
ali 2022-01-04
ali 2022-01-05
ser 2022-03-01
The answer should be 3 for ali and null for ser.
Here is what I tried:
select name ,min(date) over (partition by name order by date) start_date , max(date) over (partition by name order by date) end_date from table

One approach is to use a window function (LAG or LEAD) to find the prior or next date, then compute the difference between the current and prior dates with the datediff function. Something like this:
SELECT name,
MAX(datediff(date, PreviousDate)) AS Gap
FROM (SELECT name,
date,
LAG(date) OVER(PARTITION BY name ORDER BY date) as PreviousDate
FROM table
) t
GROUP BY name
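A minimal sketch of the LAG approach, run against SQLite from Python so it can be checked end to end. The table name `visits` is an assumption, and SQLite has no datediff, so the day difference is computed with `julianday()` instead. Window functions need SQLite 3.25 or later.

```python
import sqlite3

# Sample rows from the question; the table name "visits" is an assumption.
ROWS = [
    ("ali", "2022-01-01"),
    ("ali", "2022-01-04"),
    ("ali", "2022-01-05"),
    ("ser", "2022-03-01"),
]

QUERY = """
SELECT name,
       MAX(julianday(date) - julianday(previous_date)) AS gap
FROM (SELECT name,
             date,
             LAG(date) OVER (PARTITION BY name ORDER BY date) AS previous_date
      FROM visits) AS t
GROUP BY name
ORDER BY name
"""


def max_gap_per_name(rows):
    """Return {name: largest gap in days between consecutive dates, or None}."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE visits (name TEXT, date TEXT)")
        con.executemany("INSERT INTO visits VALUES (?, ?)", rows)
        # julianday() differences are floats, so round them back to whole days.
        return {name: None if gap is None else round(gap)
                for name, gap in con.execute(QUERY)}
    finally:
        con.close()


if __name__ == "__main__":
    print(max_gap_per_name(ROWS))
```

A name with a single row has no previous date, so its gap stays NULL (None in Python), which matches the result the question asks for.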

My approach is to match every record with the closest later date, find the maximum gap, and then left join with the original table to get the gap for each user.
Here's the MySQL version:
select
cu.name, max(cg.gap) maxgap
from
customers cu left join
(
select
c.name, datediff(min(cn.date), c.date) gap
from
customers c left join customers cn on c.name = cn.name
where
cn.date > c.date
group by
c.name, c.date
) cg
on cu.name = cg.name
group by
cu.name

Related

SQL: Difference between consecutive rows

I have a table with 3 columns: order id, member id, order date.
I need to pull the distribution of orders broken down by the number of days between two consecutive orders for each member id.
What I have is this:
SELECT
a1.member_id,
count(distinct a1.order_id) as num_orders,
a1.order_date,
DATEDIFF(DAY, a1.order_date, a2.order_date) as days_since_last_order
from orders as a1
inner join orders as a2
on a2.member_id = a1.member_id+1;
It's not helping me completely as the output I need is:
You can use lag() to get the date of the previous order by the same customer:
select o.*,
datediff(
order_date,
lag(order_date) over(partition by member_id order by order_date, order_id)
) days_diff
from orders o
When there are two rows for the same date, the smallest order_id is considered first. Also note that I fixed your datediff() syntax: in Hive, the function just takes two dates, and no unit.
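As a rough illustration of how the lag() result could feed the distribution the question asks for, here is a sketch run against SQLite from Python. The sample orders are invented for the example, and `julianday()` stands in for datediff, which SQLite does not have.

```python
import sqlite3

# Hypothetical sample data: (order_id, member_id, order_date).
ORDERS = [
    (1, 1, "2020-01-01"),
    (2, 1, "2020-01-03"),
    (3, 1, "2020-01-10"),
    (4, 2, "2020-02-01"),
    (5, 2, "2020-02-03"),
]

QUERY = """
WITH diffs AS (
    SELECT member_id,
           CAST(julianday(order_date) - julianday(
                    LAG(order_date) OVER (PARTITION BY member_id
                                          ORDER BY order_date, order_id)
                ) AS INTEGER) AS days_diff
    FROM orders
)
SELECT days_diff, COUNT(*) AS num_orders
FROM diffs
WHERE days_diff IS NOT NULL      -- a member's first order has no predecessor
GROUP BY days_diff
ORDER BY days_diff
"""


def gap_distribution(orders):
    """Return [(days_between_consecutive_orders, number_of_orders), ...]."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute(
            "CREATE TABLE orders (order_id INTEGER, member_id INTEGER, order_date TEXT)"
        )
        con.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
        return con.execute(QUERY).fetchall()
    finally:
        con.close()


if __name__ == "__main__":
    for days, count in gap_distribution(ORDERS):
        print(days, count)
```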
I just don't get the logic you want to compute num_orders.
Maybe something like this:
SELECT
a1.member_id,
a1.order_id,
a1.order_date,
DATEDIFF(DAY, a2.order_date, a1.order_date) as days_since_last_order
from orders as a1
inner join orders as a2
on a2.member_id = a1.member_id
and a2.order_date < a1.order_date
where not exists (
select 1
from orders as intermediate_order
where intermediate_order.member_id = a1.member_id
and intermediate_order.order_date < a1.order_date
and intermediate_order.order_date > a2.order_date) ;

How to retrieve only top rows per date

I am trying to extract the first row per date from a balance table, but I can't figure out how to write the SQL for it.
I tried max, sum, and group by, but none of them helped.
Date Account Balance
4/6/2019 A 90
4/5/2019 B 80
4/4/2019 C 70
4/3/2019 C 60
4/2/2019 D 80
4/1/2019 D 100
So how can I make a query which will show the following results?
Account Balance in April
Account Balance
A 90
B 80
C 70
D 80
Use the analytic function first_value, if your DBMS supports it. DISTINCT collapses the result to one row per account:
select distinct Account,
FIRST_VALUE(balance) OVER (partition by Account ORDER BY date desc) AS balance
from table_name
First group by account to get the max date for each account and then join to the table:
select t.account, t.balance
from tablename t inner join (
select account, max(date) maxdate
from tablename
group by account
) g on g.account = t.account and g.maxdate = t.date
You could use a correlated subquery for the max date:
select t.account, t.balance
from my_table t
where t.date = (
select max(t2.date)
from my_table t2
where t2.account = t.account
)
A canonical method would be filtering in the where clause:
select b.*
from balances b
where b.date = (select max(b2.date)
from balances b2
where b2.account = b.account and
b2.date >= '2019-04-01' and
b2.date < '2019-05-01'
);
Specific databases may have other approaches to this problem. The above generally has very good performance, particularly with an index on balances(account, date).
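For comparison, here is a sketch of one of those other approaches, using ROW_NUMBER() to keep the latest row per account. It runs against SQLite from Python and assumes the dates from the question are stored as ISO strings (yyyy-mm-dd) so that they sort correctly; the table name `balances` follows the answer above.

```python
import sqlite3

# Rows from the question, with dates rewritten as ISO strings (an assumption).
BALANCES = [
    ("2019-04-06", "A", 90),
    ("2019-04-05", "B", 80),
    ("2019-04-04", "C", 70),
    ("2019-04-03", "C", 60),
    ("2019-04-02", "D", 80),
    ("2019-04-01", "D", 100),
]

QUERY = """
SELECT account, balance
FROM (SELECT account,
             balance,
             ROW_NUMBER() OVER (PARTITION BY account ORDER BY date DESC) AS rn
      FROM balances
      WHERE date >= '2019-04-01' AND date < '2019-05-01') AS ranked
WHERE rn = 1                     -- rn = 1 is the most recent row per account
ORDER BY account
"""


def latest_balance_per_account(rows):
    """Return [(account, balance)] using the latest April 2019 row per account."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE balances (date TEXT, account TEXT, balance INTEGER)")
        con.executemany("INSERT INTO balances VALUES (?, ?, ?)", rows)
        return con.execute(QUERY).fetchall()
    finally:
        con.close()


if __name__ == "__main__":
    for account, balance in latest_balance_per_account(BALANCES):
        print(account, balance)
```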

SQL creating a pivot function

I have a SQL code that looks like this:
select cast(avg(age) as decimal(16,2)) as 'avg' From
(select distinct acct.Account, cast(Avg(year(getdate())- year(client_birth_date)) as decimal(16,2)) as 'Age'
from WF_PM_ACCT_DB DET
inner join WF_PM_ACCT_DET_DB ACCT
ON det.Account = acct.Account
where (acct_closing_date is null or acct_closing_date > '2017-01-01')
and Acct_Open_Date < '2017-01-01'
group by acct.Account
) x
What this gives me is a simple one-cell answer: the average age of accounts for the year given by Acct_Open_Date < '2017-01-01'. I am an amateur, so I change the date every time and run the query again and again to get the remaining years. Is there an easy way to have all the years as column headings and just one row with the average account age in each year?
Please note that a null account closing date means the account never got closed, and I have to change it to less than the analysis year in order to get a true picture of the average account age that existed at that time.
Any help is appreciated. Thanks.
You can run this for multiple dates by including them in a single derived table:
with dates as (
select cast('2017-01-01' as date) as yyyy union all
select cast('2016-01-01' as date)
)
select yyyy, cast(avg(age) as decimal(16,2)) as avg_age
From (select dates.yyyy, acct.Account,
cast(Avg(year(getdate())- year(client_birth_date)) as decimal(16,2)) as Age
from dates cross join
WF_PM_ACCT_DB DET inner join WF_PM_ACCT_DET_DB
ACCT
on det.Account = acct.Account
where (acct_closing_date is null or acct_closing_date > dates.yyyy) and
Acct_Open_Date < dates.yyyy
group by acct.Account, dates.yyyy
) x
group by yyyy
order by yyyy;
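The cross-join idea can be tried on a small scale with SQLite from Python. This is only a sketch under simplifying assumptions: the two tables are collapsed into one hypothetical `accounts` table, the sample rows are invented, and a fixed `current_year` parameter stands in for year(getdate()) so that the output is reproducible.

```python
import sqlite3

# Hypothetical rows: (account, client_birth_date, acct_open_date, acct_closing_date).
ACCOUNTS = [
    ("acc1", "1980-05-01", "2015-03-01", None),
    ("acc2", "1990-07-15", "2016-06-01", None),
    ("acc3", "1970-01-20", "2014-01-01", "2016-08-01"),
]

QUERY = """
WITH dates(cutoff) AS (VALUES ('2016-01-01'), ('2017-01-01'))
SELECT d.cutoff,
       ROUND(AVG(:current_year
                 - CAST(strftime('%Y', a.client_birth_date) AS INTEGER)), 2) AS avg_age
FROM dates AS d
CROSS JOIN accounts AS a
WHERE (a.acct_closing_date IS NULL OR a.acct_closing_date > d.cutoff)
  AND a.acct_open_date < d.cutoff
GROUP BY d.cutoff
ORDER BY d.cutoff
"""


def average_age_by_cutoff(accounts, current_year):
    """Return [(cutoff_date, average_age)] for accounts open at each cutoff."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute(
            "CREATE TABLE accounts (account TEXT, client_birth_date TEXT, "
            "acct_open_date TEXT, acct_closing_date TEXT)"
        )
        con.executemany("INSERT INTO accounts VALUES (?, ?, ?, ?)", accounts)
        return con.execute(QUERY, {"current_year": current_year}).fetchall()
    finally:
        con.close()


if __name__ == "__main__":
    for cutoff, avg_age in average_age_by_cutoff(ACCOUNTS, 2020):
        print(cutoff, avg_age)
```

Each cutoff date produces one row, so adding a year to the analysis only means adding a value to the `dates` list.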

Summarize Table Based on Two Date Fields

I have a table that, in its simplified form, has two date fields and an amount field. One of the date fields holds the order date, and the other contains the shipped date. I've been asked to report on both the amounts ordered and shipped grouped by date.
I used a self join that seemed to be working fine, except I found that it doesn't work on dates where no new orders were taken, but orders were shipped. I'd appreciate any help figuring out how best to solve the problem. (See below)
Order_Date Shipped_Date Amount
6/1/2015 6/2/2015 10
6/1/2015 6/3/2015 15
6/2/2015 6/3/2015 17
The T-SQL statement I'm using is as follows:
select a.ddate, a.soldamt, b.shippedamt
from
(select order_date as ddate, sum(amount) as soldamt from TABLE group by order_date) a
left join
(select shipped_date as ddate, sum(amount) as shippedamt from TABLE group by shipped_date) b
on a.ddate = b.ddate
This results in:
ddate soldamt shippedamt
6/1/2015 15 0
6/2/2015 17 10
The amount shipped on 6/3/2015 doesn't appear, obviously because there are no new orders on that date.
It's important to note this is being done in a Visual FoxPro table using T-SQL syntax, so some of the features found in more popular databases do not exist (for example, PIVOT)
The simplest change would be to use a FULL OUTER JOIN instead of LEFT. A full join combines both right and left joins including unmatched records in both directions.
SELECT COALESCE(a.ddate, b.ddate) AS ddate, a.soldamt, b.shippedamt
FROM
(select order_date as ddate, sum(amount) as soldamt from TABLE group by order_date) a
FULL OUTER JOIN
(select shipped_date as ddate, sum(amount) as shippedamt from TABLE group by shipped_date) b
ON a.ddate = b.ddate
Another method (besides full outer join) is to use union all and an additional aggregation:
select ddate, sum(soldamt) as soldamt, sum(shippedamt) as shippedamt
from ((select order_date as ddate, sum(amount) as soldamt, 0 as shippedamt
from TABLE
group by order_date
) union all
(select shipped_date as ddate, 0, sum(amount) as shippedamt
from TABLE
group by shipped_date
)
) os
group by ddate;
This also results in fewer NULL values.
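A small sketch of the union all approach, run against SQLite from Python with the three rows from the question. The table name `shipments` is an assumption, and the dates are rewritten as ISO strings so that they sort chronologically.

```python
import sqlite3

# Rows from the question: (order_date, shipped_date, amount).
SHIPMENTS = [
    ("2015-06-01", "2015-06-02", 10),
    ("2015-06-01", "2015-06-03", 15),
    ("2015-06-02", "2015-06-03", 17),
]

QUERY = """
SELECT ddate, SUM(soldamt) AS soldamt, SUM(shippedamt) AS shippedamt
FROM (SELECT order_date AS ddate, SUM(amount) AS soldamt, 0 AS shippedamt
      FROM shipments
      GROUP BY order_date
      UNION ALL
      SELECT shipped_date AS ddate, 0, SUM(amount)
      FROM shipments
      GROUP BY shipped_date) AS os
GROUP BY ddate
ORDER BY ddate
"""


def sold_and_shipped_by_date(rows):
    """Return [(date, amount_sold, amount_shipped)] with zeros instead of NULLs."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute(
            "CREATE TABLE shipments (order_date TEXT, shipped_date TEXT, amount INTEGER)"
        )
        con.executemany("INSERT INTO shipments VALUES (?, ?, ?)", rows)
        return con.execute(QUERY).fetchall()
    finally:
        con.close()


if __name__ == "__main__":
    for ddate, sold, shipped in sold_and_shipped_by_date(SHIPMENTS):
        print(ddate, sold, shipped)
```

The date with shipments but no new orders (2015-06-03) appears in the result because it comes from the second branch of the union, independently of the first.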

SQL Count Of Open Orders Each Day Between Two Dates

I've tried searching but it's likely I'm using the wrong keywords as I can't find an answer.
I'm trying to find the number of orders that are open between two dates and by employee. I have one table that shows a list of employees, another that shows a list of orders that contains an open and close date and also a dates table if that helps.
The employee and order tables joined will return something like:
employee order ref opened closed
a 123 01/01/2012 04/01/2012
b 124 02/01/2012 03/01/2012
a 125 02/01/2012 03/01/2012
And I need to transform this data into:
Date employee Count
01/01/2012 a 1
02/01/2012 a 2
02/01/2012 b 1
03/01/2012 a 2
03/01/2012 b 1
04/01/2012 a 1
I'm pulling the data from SQL server.
Any ideas?
Thanks
Nick
Join Dates to the result of the join between Employees and Orders, then group by dates and employees to obtain the counts, something like this:
SELECT
d.Date,
o.Employee,
COUNT(*) AS count
FROM Employees e
INNER JOIN Orders o ON e.ID = o.Employee
INNER JOIN Dates d ON d.Date BETWEEN o.Opened AND o.Closed
GROUP BY
d.Date,
o.Employee
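A sketch of the dates-table join, run against SQLite from Python with the rows from the question. It assumes the question's dates are in dd/mm/yyyy format and stores them as ISO strings; the Employees table is left out for brevity, and the `dates` table is filled from Python.

```python
import sqlite3
from datetime import date, timedelta

# Rows from the question: (employee, order_ref, opened, closed).
ORDERS = [
    ("a", 123, "2012-01-01", "2012-01-04"),
    ("b", 124, "2012-01-02", "2012-01-03"),
    ("a", 125, "2012-01-02", "2012-01-03"),
]

QUERY = """
SELECT d.date, o.employee, COUNT(*) AS open_orders
FROM orders AS o
INNER JOIN dates AS d ON d.date BETWEEN o.opened AND o.closed
GROUP BY d.date, o.employee
ORDER BY d.date, o.employee
"""


def open_orders_per_day(orders, first_day, last_day):
    """Count open orders per calendar day and employee between two dates."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute(
            "CREATE TABLE orders (employee TEXT, order_ref INTEGER, opened TEXT, closed TEXT)"
        )
        con.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", orders)
        # Build the calendar table with one row per day in the range.
        con.execute("CREATE TABLE dates (date TEXT)")
        days = (last_day - first_day).days + 1
        con.executemany(
            "INSERT INTO dates VALUES (?)",
            [((first_day + timedelta(days=i)).isoformat(),) for i in range(days)],
        )
        return con.execute(QUERY).fetchall()
    finally:
        con.close()


if __name__ == "__main__":
    for row in open_orders_per_day(ORDERS, date(2012, 1, 1), date(2012, 1, 4)):
        print(*row)
```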
My favorite way to do this counts the number of cumulative opens and the number of cumulative closes over time.
with cumopens as
(select employee, opened as thedate,
row_number() over (partition by employee order by opened) as cumopens,
0 as cumcloses
from eo
),
cumcloses as
(select employee, closed as thedate, 0 as cumopens,
row_number() over (partition by employee order by closed ) as cumcloses
from eo
)
select employee, c.thedate, max(cumopens), max(cumcloses),
max(cumopens) - max(cumcloses) as stillopened
from ((select *
from cumopens
) union all
(select *
from cumcloses
)
) c
group by employee, thedate
The only problem with this approach is that only dates where there is employee activity get reported. This works in your case.
The more general solution requires a sequence of numbers to generate dates. For this, I often create one from some existing table with enough rows:
with nums as
(select row_number() over (order by (select null)) as seqnum
from employees
)
select employee, dateadd(day, seqnum - 1, opened) as thedate, count(*)
from eo join
nums
on seqnum <= datediff(day, opened, closed) + 1
group by employee, dateadd(day, seqnum - 1, opened)
order by 1, 2
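The number-sequence idea can also be sketched without a helper table. The version below swaps in a recursive CTE to generate the offsets, which is a different technique from borrowing row numbers from an existing table. It runs against SQLite from Python, uses the rows from the question with dates assumed to be dd/mm/yyyy and stored as ISO strings, and relies on SQLite's `date()` and `julianday()` in place of dateadd and datediff.

```python
import sqlite3

# Rows from the question: (employee, order_ref, opened, closed).
ORDERS = [
    ("a", 123, "2012-01-01", "2012-01-04"),
    ("b", 124, "2012-01-02", "2012-01-03"),
    ("a", 125, "2012-01-02", "2012-01-03"),
]

QUERY = """
WITH RECURSIVE nums(n) AS (
    SELECT 0
    UNION ALL
    SELECT n + 1
    FROM nums
    -- stop once n covers the longest open period in the table
    WHERE n < (SELECT MAX(julianday(closed) - julianday(opened)) FROM orders)
)
SELECT o.employee,
       date(o.opened, '+' || n || ' days') AS thedate,
       COUNT(*) AS open_orders
FROM orders AS o
JOIN nums ON n <= julianday(o.closed) - julianday(o.opened)
GROUP BY o.employee, thedate
ORDER BY 1, 2
"""


def open_orders_by_offset(orders):
    """Expand each order into one row per open day, then count per employee and day."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute(
            "CREATE TABLE orders (employee TEXT, order_ref INTEGER, opened TEXT, closed TEXT)"
        )
        con.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", orders)
        return con.execute(QUERY).fetchall()
    finally:
        con.close()


if __name__ == "__main__":
    for row in open_orders_by_offset(ORDERS):
        print(*row)
```

Offsets start at 0 so that the opening date itself is counted, and the join keeps every offset up to and including the closing date.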
SELECT opened,employee,count(*)
FROM employee LEFT JOIN orders
WHERE opened > firstDate and opened < secondDate
GROUP BY opened,employee
Or you can replace the first condition with:
WHERE opened BETWEEN firstDate and secondDate
Calling the result column count was a bit odd because it seems to be in fact a row number.
You can do that by using ROW_NUMBER.
The other interesting part is that you also want open date and close date as separate rows. Using a simple UNION will solve that.
WITH cte
AS (SELECT Row_number() OVER ( PARTITION BY employee
ORDER BY order_ref) count,
employee,
opened,
closed
FROM orders)
SELECT employee, opened date, count
FROM cte
UNION ALL
SELECT employee, closed date, count
FROM cte
ORDER BY Date,
employee