Create rows for iteration in SQL

I have to create a report of how long each ticket has been open on the first of every month, and another that shows how long it took to close a ticket. What is the best way to do this with SQL without creating an interval for each month? I am using SQL Server 2008 R2.
My current data:
| Ticket | Start Date | End Date |
|--------|------------|------------|
| ABC | 5/8/2018 | 9/28/2018 |
| XYZ | 6/22/2018 | 10/15/2018 |
Expected result:
| Ticket | Start Date | End Date | Report Date | Ticket Age | Ticket Interval |
|--------|------------|------------|-------------|------------|-----------------|
| ABC | 5/8/2018 | 9/28/2018 | 6/1/2018 | 24 | |
| ABC | 5/8/2018 | 9/28/2018 | 7/1/2018 | 54 | |
| ABC | 5/8/2018 | 9/28/2018 | 8/1/2018 | 85 | |
| ABC | 5/8/2018 | 9/28/2018 | 9/1/2018 | 116 | |
| ABC | 5/8/2018 | 9/28/2018 | 10/1/2018 | | 143 |
| XYZ | 6/22/2018 | 10/15/2018 | 7/1/2018 | 9 | |
| XYZ | 6/22/2018 | 10/15/2018 | 8/1/2018 | 40 | |
| XYZ | 6/22/2018 | 10/15/2018 | 9/1/2018 | 71 | |
| XYZ | 6/22/2018 | 10/15/2018 | 10/1/2018 | 101 | |
| XYZ | 6/22/2018 | 10/15/2018 | 11/1/2018 | | 115 |

You can use recursive CTEs:
with cte as (
select ticket, sdate, edate, dateadd(month, 1, dateadd(day, 1 - day(sdate), sdate)) as reportdate
from t
union all
select ticket, sdate, edate, dateadd(month, 1, reportdate)
from cte
where reportdate <= edate
)
select cte.*, datediff(day, sdate, reportdate) as ticketage,
(case when datediff(month, edate, reportdate) = 1 then datediff(day, sdate, edate) end) as interval
from cte
order by ticket, reportdate;
I included the ticket age on the last month for the ticket. You can use a similar case expression if you really don't want it.
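To make the row generation easier to follow, here is a minimal Python sketch of the same logic as the recursive CTE, not SQL Server code. The helper names `first_of_next_month` and `report_rows` are made up for illustration. It steps through first-of-month report dates, stops at the first one after the end date, and fills the interval only on that last row.

```python
from datetime import date

def first_of_next_month(d: date) -> date:
    """Return the first day of the month following d."""
    return date(d.year + (d.month == 12), d.month % 12 + 1, 1)

def report_rows(ticket: str, start: date, end: date):
    """Yield (ticket, report_date, ticket_age, interval) per report date."""
    report = first_of_next_month(start)
    while True:
        age = (report - start).days
        is_last = report > end  # first report date after the ticket closed
        interval = (end - start).days if is_last else None
        yield ticket, report, age, interval
        if is_last:
            break
        report = first_of_next_month(report)

rows = list(report_rows("ABC", date(2018, 5, 8), date(2018, 9, 28)))
for row in rows:
    print(row)
```

As in the query, the last row still carries a ticket age; drop it there if the report should show only the interval.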

Related

How to Do Data-Grouping in BigQuery?

I have a list of data that needs to be grouped. I've successfully done this in R, but now I have to do it in BigQuery. The data is shown in the following table:
| category | sub_category | date | day | timestamp | type | cpc | gmv |
|---------- |-------------- |----------- |----- |------------- |------ |------ |--------- |
| ABC | ABC-1 | 2/17/2020 | Mon | 11:37:36 PM | BI | 1.94 | 252,293 |
| ABC | ABC-1 | 2/17/2020 | Mon | 11:37:39 PM | RT | 1.94 | 252,293 |
| ABC | ABC-1 | 2/17/2020 | Mon | 11:38:29 PM | RT | 1.58 | 205,041 |
| ABC | ABC-1 | 2/18/2020 | Tue | 12:05:14 AM | BI | 1.6 | 208,397 |
| ABC | ABC-1 | 2/18/2020 | Tue | 12:05:18 AM | RT | 1.6 | 208,397 |
| ABC | ABC-1 | 2/18/2020 | Tue | 12:05:52 AM | RT | 1.6 | 208,397 |
| ABC | ABC-1 | 2/18/2020 | Tue | 12:06:33 AM | BI | 1.55 | 201,354 |
| XYZ | XYZ-1 | 2/17/2020 | Mon | 11:55:47 PM | PP | 1 | 129,282 |
| XYZ | XYZ-1 | 2/17/2020 | Mon | 11:56:23 PM | PP | 0.98 | 126,928 |
| XYZ | XYZ-1 | 2/17/2020 | Mon | 11:57:19 PM | PP | 0.98 | 126,928 |
| XYZ | XYZ-1 | 2/17/2020 | Mon | 11:57:34 PM | PP | 0.98 | 126,928 |
| XYZ | XYZ-1 | 2/17/2020 | Mon | 11:58:46 PM | PP | 0.89 | 116,168 |
| XYZ | XYZ-1 | 2/17/2020 | Mon | 11:59:27 PM | PP | 0.89 | 116,168 |
| XYZ | XYZ-1 | 2/17/2020 | Mon | 11:59:51 PM | RT | 0.89 | 116,168 |
| XYZ | XYZ-1 | 2/17/2020 | Mon | 12:00:57 AM | BI | 0.89 | 116,168 |
| XYZ | XYZ-1 | 2/17/2020 | Mon | 12:01:11 AM | PP | 0.89 | 116,168 |
| XYZ | XYZ-1 | 2/17/2020 | Mon | 12:03:01 AM | PP | 0.89 | 116,168 |
| XYZ | XYZ-1 | 2/17/2020 | Mon | 12:12:42 AM | RT | 1.19 | 154,886 |
I want to group the rows: a row whose timestamp is at most 8 minutes before the next row's should be grouped with it into a single row, as in the output example below:
| category | sub_category | date | day | time | start_timestamp | end_timestamp | type | cpc | gmv |
|---------- |-------------- |----------------------- |--------- |---------- |--------------------- |--------------------- |---------- |------ |--------- |
| ABC | ABC-1 | 2/17/2020 | Mon | 23:37:36 | (02/17/20 23:37:36) | (02/17/20 23:38:29) | BI\|RT | 1.82 | 236,542 |
| ABC | ABC-1 | 2/18/2020 | Tue | 0:05:14 | (02/18/20 00:05:14) | (02/18/20 00:06:33) | BI\|RT | 1.59 | 206,636 |
| XYZ | XYZ-1 | 02/17/2020\|02/18/2020 | Mon\|Tue | 0:06:21 | (02/17/20 23:55:47) | (02/18/20 00:12:42) | PP\|RT\|BI | 0.95 | 123,815 |
The newly generated fields are defined as follows:
| fields | definition |
|----------------- |-------------------------------------------------------- |
| day | Day of the row (combination if there's different days) |
| time | Start of timestamp |
| start_timestamp | Start timestamp of the first row in group |
| end_timestamp | Start timestamp of the last row in group |
| type | Type of Row (combination if there's different types) |
| cpc | Average CPC of the Group |
| gmv | Average GMV of the Group |
Could anyone help me to make the query as per above requirements?
Thank you
This is a gaps-and-islands problem. Here is a solution that uses lag() and a cumulative sum() to define groups of adjacent records separated by gaps of at most 8 minutes; the rest is aggregation.
select
category,
sub_category,
-- BigQuery only allows ORDER BY on the aggregated expression itself when DISTINCT is used
string_agg(distinct day, '|') day,
min(dt) start_dt,
max(dt) end_dt,
string_agg(distinct type, '|') type,
avg(cpc) cpc,
avg(gmv) gmv
from (
select
t.*,
sum(case when dt <= datetime_add(lag_dt, interval 8 minute) then 0 else 1 end)
over(partition by category, sub_category order by dt) grp
from (
select
t.*,
lag(dt) over(partition by category, sub_category order by dt) lag_dt
from (
select t.*, datetime(date, timestamp) dt
from mytable t
) t
) t
) t
group by category, sub_category, grp
Note that you should not store the date and time parts of your timestamps in separate columns: this makes the logic more complicated when you need to combine them (I added another level of nesting to avoid repeated conversions, which would have obfuscated the code).
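The grouping step can also be sketched outside SQL. The following is a minimal Python illustration of the gaps-and-islands idea only, not BigQuery code: a new group starts whenever the gap to the previous timestamp exceeds 8 minutes. The function name `group_by_gap` is hypothetical, and the sample values are the ABC-1 rows from the question with date and time already combined.

```python
from datetime import datetime, timedelta

def group_by_gap(timestamps, max_gap=timedelta(minutes=8)):
    """Split datetimes into groups whose consecutive gaps are <= max_gap."""
    groups = []
    previous = None
    for ts in sorted(timestamps):
        # Start a new "island" on the first row or after a gap that is too large.
        if previous is None or ts - previous > max_gap:
            groups.append([])
        groups[-1].append(ts)
        previous = ts
    return groups

stamps = [
    datetime(2020, 2, 17, 23, 37, 36),
    datetime(2020, 2, 17, 23, 37, 39),
    datetime(2020, 2, 17, 23, 38, 29),
    datetime(2020, 2, 18, 0, 5, 14),
    datetime(2020, 2, 18, 0, 5, 18),
    datetime(2020, 2, 18, 0, 5, 52),
    datetime(2020, 2, 18, 0, 6, 33),
]
groups = group_by_gap(stamps)
for g in groups:
    print(g[0], g[-1], len(g))  # start, end and size of each group
```

The `previous is None` check plays the role of the NULL `lag_dt` on the first row of each partition in the query.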

How to Create a Flag Based on Date Values in Hive

I have a sample table as follows:
| name | startdate | enddate | flg |
|-------|-----------|------------|-----|
| John | 6/1/2018 | 7/1/2018 | |
| John | 10/1/2018 | 11/1/2018 | |
| John | 12/1/2018 | 12/20/2018 | |
| Ron | 3/1/2017 | 9/1/2017 | |
| Ron | 5/1/2018 | 10/1/2018 | |
| Jacob | 6/10/2018 | 6/12/2018 | |
What I want in the output: if a person has any 'startdate' within 60 days (or 2 months) of one of that person's 'enddate' values, set flg to 1 for every row of that person; otherwise set flg to 0.
For example: John has a record with a startdate of December 1st, which is within 60 days of one of his enddate values (November 1st 2018). So, the flg for this person is set to 1.
So, the output should look like this:
| Name | startdate | enddate | flg |
|-------|-----------|------------|-----|
| John | 6/1/2018 | 7/1/2018 | 1 |
| John | 10/1/2018 | 11/1/2018 | 1 |
| John | 12/1/2018 | 12/20/2018 | 1 |
| Ron | 3/1/2017 | 9/1/2017 | 0 |
| Ron | 5/1/2018 | 10/1/2018 | 0 |
| Jacob | 6/10/2018 | 6/12/2018 | 0 |
Any idea please?
Date Functions: Use datediff and case
select Name,startdate,enddate,
case when datediff(enddate,startdate) < 60 then 1 else 0 end flag
from table
If you are comparing the previous row's enddate, use lag()
select Name,startdate,enddate,
case when datediff(startdate,prev_enddate) < 60 then 1 else 0 end flag
from
(
select Name,startdate,enddate,
lag(enddate) over(partition by Name order by startdate,enddate) as prev_enddate
from table
) t
Use lag to get the enddate of the previous row (per name). After this the flag can be set per name using max window function with a case expression that checks to see if the 60 day diff is satisfied at least once per name.
select name
,startdate
,enddate
,max(case when datediff(startdate,prev_end_dt) < 60 then 1 else 0 end) over(partition by name) as flag
from (select t.*
,lag(enddate) over(partition by name order by startdate) as prev_end_dt
from table t
) t
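As a cross-check of the lag-plus-max approach, here is a small Python sketch of the same rule using the sample rows from the question. It is only an illustration, not Hive code, and the function name `flag_by_name` is made up. Each row is compared with the previous row's enddate for the same name, and the name is flagged if any gap is under 60 days.

```python
from datetime import date
from itertools import groupby
from operator import itemgetter

rows = [  # (name, startdate, enddate)
    ("John", date(2018, 6, 1), date(2018, 7, 1)),
    ("John", date(2018, 10, 1), date(2018, 11, 1)),
    ("John", date(2018, 12, 1), date(2018, 12, 20)),
    ("Ron", date(2017, 3, 1), date(2017, 9, 1)),
    ("Ron", date(2018, 5, 1), date(2018, 10, 1)),
    ("Jacob", date(2018, 6, 10), date(2018, 6, 12)),
]

def flag_by_name(rows, window_days=60):
    """Return {name: 1 or 0}; 1 if any start is < window_days after the previous end."""
    flags = {}
    ordered = sorted(rows, key=itemgetter(0, 1))  # by name, then startdate
    for name, group in groupby(ordered, key=itemgetter(0)):
        periods = list(group)
        # zip pairs each row with its predecessor, like lag() per name.
        flags[name] = int(any(
            (cur[1] - prev[2]).days < window_days
            for prev, cur in zip(periods, periods[1:])
        ))
    return flags

flags = flag_by_name(rows)
print(flags)
```

A person with a single row has no previous enddate to compare against, so the flag stays 0, matching the NULL case in the query.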

SQL Add summary row for all days with same id

We have the following results table from this query but how can we add a summary row to sum all of the days for the same ad id, as seen in the desired results table? Thanks.
Query:
SELECT
right(ad_id,6) AS ad_id,
CAST(date_start AS DATE) AS "Day",
objective,
SUM(impressions) AS Impressions,
sum(clicks) AS Clicks
FROM ads
WHERE date_start >= '2018-05-01' AND date_start < '2018-06-01'
GROUP BY ad_id, CAST(date_start AS DATE), objective
Order by ad_id, CAST(date_start AS DATE) desc
Results table:
+--------+----------+-------------+-------------+--------+
| ad_id | day | objective | impressions | clicks |
+--------+----------+-------------+-------------+--------+
| 36911 | 5/2/2018 | CONVERSIONS | 16689 | 160 |
| 36911 | 5/1/2018 | CONVERSIONS | 4223 | 59 |
| 37111 | 5/2/2018 | CONVERSIONS | 1964 | 9 |
| 37111 | 5/1/2018 | CONVERSIONS | 1409 | 19 |
| 279311 | 5/3/2018 | LINK_CLICKS | 309 | 10 |
| 279311 | 5/2/2018 | LINK_CLICKS | 2816 | 19 |
| 279311 | 5/1/2018 | LINK_CLICKS | 5876 | 66 |
| 279511 | 5/3/2018 | LINK_CLICKS | 3551 | 86 |
| 279511 | 5/2/2018 | LINK_CLICKS | 3334 | 76 |
| 279511 | 5/1/2018 | LINK_CLICKS | 17798 | 508 |
+--------+----------+-------------+-------------+--------+
Desired results table with summary row:
+--------+----------+-------------+-------------+--------+
| ad_id | day | objective | impressions | clicks |
+--------+----------+-------------+-------------+--------+
| 36911 | All | CONVERSIONS | 20912 | 219 |
| 36911 | 5/2/2018 | CONVERSIONS | 16689 | 160 |
| 36911 | 5/1/2018 | CONVERSIONS | 4223 | 59 |
| 37111 | All | CONVERSIONS | 3373 | 28 |
| 37111 | 5/2/2018 | CONVERSIONS | 1964 | 9 |
| 37111 | 5/1/2018 | CONVERSIONS | 1409 | 19 |
| 279311 | All | LINK_CLICKS | 9001 | 95 |
| 279311 | 5/3/2018 | LINK_CLICKS | 309 | 10 |
| 279311 | 5/2/2018 | LINK_CLICKS | 2816 | 19 |
| 279311 | 5/1/2018 | LINK_CLICKS | 5876 | 66 |
| 279511 | All | LINK_CLICKS | 24683 | 670 |
| 279511 | 5/3/2018 | LINK_CLICKS | 3551 | 86 |
| 279511 | 5/2/2018 | LINK_CLICKS | 3334 | 76 |
| 279511 | 5/1/2018 | LINK_CLICKS | 17798 | 508 |
+--------+----------+-------------+-------------+--------+
Use grouping sets:
SELECT right(ad_id, 6) AS ad_id,
-- the summary rows have a NULL day, which COALESCE labels 'All'
COALESCE(CAST(CAST(date_start AS DATE) AS varchar), 'All') AS "Day",
objective,
SUM(impressions) AS Impressions,
sum(clicks) AS Clicks
FROM ads
WHERE date_start >= '2018-05-01' AND date_start < '2018-06-01'
GROUP BY GROUPING SETS ( (ad_id, objective), (ad_id, CAST(date_start AS DATE), objective) )
Order by ad_id, CAST(date_start AS DATE) desc;
In earlier versions of Postgres, use a CTE and union all:
with t as (
SELECT right(ad_id, 6) AS ad_id,
CAST(date_start AS DATE) AS "Day",
objective,
SUM(impressions) AS Impressions,
sum(clicks) AS Clicks
FROM ads
WHERE date_start >= '2018-05-01' AND date_start < '2018-06-01'
GROUP BY ad_id, CAST(date_start AS DATE), objective
)
select *
from t
union all
select ad_id, NULL, 'All', sum(impressions), sum(clicks)
from t
group by ad_id
order by 1, 2 desc;
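The effect of adding a summary row per ad can be sketched in plain Python. This is only an illustration of the union-all idea using two ads from the question, with the day written as an ISO string; the function name `with_summary_rows` is hypothetical. It assumes each ad has a single objective, as in the sample data.

```python
from collections import defaultdict

daily = [  # (ad_id, day, objective, impressions, clicks)
    ("36911", "2018-05-02", "CONVERSIONS", 16689, 160),
    ("36911", "2018-05-01", "CONVERSIONS", 4223, 59),
    ("37111", "2018-05-02", "CONVERSIONS", 1964, 9),
    ("37111", "2018-05-01", "CONVERSIONS", 1409, 19),
]

def with_summary_rows(daily):
    """Return the daily rows plus one 'All' row per (ad_id, objective)."""
    totals = defaultdict(lambda: [0, 0])
    for ad_id, _day, objective, impressions, clicks in daily:
        totals[(ad_id, objective)][0] += impressions
        totals[(ad_id, objective)][1] += clicks
    summary = [
        (ad_id, "All", objective, impressions, clicks)
        for (ad_id, objective), (impressions, clicks) in totals.items()
    ]
    # First put 'All' ahead of the days (newest day first), then group by ad_id.
    # Python's sort is stable, so the second sort keeps that inner order.
    rows = sorted(daily + summary, key=lambda r: (r[1] == "All", r[1]), reverse=True)
    return sorted(rows, key=lambda r: r[0])

result = with_summary_rows(daily)
for row in result:
    print(row)
```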

How to check dates condition from one table to another in SQL

How can we check and compare dates from one table against another?
Table : inc
+--------+---------+-----------+-----------+-------------+
| inc_id | cust_id | item_id | serv_time | inc_date |
+--------+---------+-----------+-----------+-------------+
| 1 | john | HP | 40 | 17-Apr-2015 |
| 2 | John | HP | 60 | 10-Jan-2016 |
| 3 | Nick | Cisco | 120 | 11-Jan-2016 |
| 4 | samanta | EMC | 180 | 12-Jan-2016 |
| 5 | Kerlee | Oracle | 40 | 13-Jan-2016 |
| 6 | Amir | Microsoft | 300 | 14-Jan-2016 |
| 7 | John | HP | 120 | 15-Jan-2016 |
| 8 | samanta | EMC | 20 | 16-Jan-2016 |
| 9 | Kerlee | Oracle | 10 | 2-Feb-2017 |
+--------+---------+-----------+-----------+-------------+
Table: Contract:
+-----------+---------+----------+------------+
| item_id | con_id | Start | End |
+-----------+---------+----------+------------+
| Dell | DE2015 | 1/1/2015 | 12/31/2015 |
| HP | HP2015 | 1/1/2015 | 12/31/2015 |
| Cisco | CIS2016 | 1/1/2016 | 12/31/2016 |
| EMC | EMC2016 | 1/1/2016 | 12/31/2016 |
| HP | HP2016 | 1/1/2016 | 12/31/2016 |
| Oracle | OR2016 | 1/1/2016 | 12/31/2016 |
| Microsoft | MS2016 | 1/1/2016 | 12/31/2016 |
| Microsoft | MS2017 | 1/1/2017 | 12/31/2017 |
+-----------+---------+----------+------------+
Result:
+-------+---------+---------+--------------+
| Calls | Cust_id | Con_id | Tot_Ser_Time |
+-------+---------+---------+--------------+
| 2 | John | HP2016 | 180 |
| 2 | samanta | EMC2016 | 200 |
| 1 | Nick | CIS2016 | 120 |
| 1 | Amir | MS2016 | 300 |
| 1 | Oracle | OR2016 | 40 |
+-------+---------+---------+--------------+
My query:
select count(inc_id) as Calls, inc.cust_id, contract.con_id,
sum(inc.serv_time) as tot_serv_time
from inc inner join contract on inc.item_id = contract.item_id
where inc.inc_date between '2016-01-01' and '2016-12-31'
group by inc.cust_id, contract.con_id
The result should come from the inc table, filtered to incidents between 1-Jan-2016 and 31-Dec-2016, with a count of inc_id based on the items and their contract start and end dates.
If I understand your problem correctly, this query will return the desired result:
select
count(*) as Calls,
inc.cust_id,
contract.con_id,
sum(inc.serv_time) as tot_serv_time
from
inc inner join contract
on inc.item_id = contract.item_id
and inc.inc_date between contract.start and contract.end
where
inc.inc_date between '2016-01-01' and '2016-12-31'
group by
inc.cust_id,
contract.con_id
The question is a little vague, so you might need to make some adjustments to this query.
select
Calls = count(*)
, Cust = i.Cust_id
, Contract = c.con_id
, Serv_Time = sum(Serv_Time)
from inc as i
inner join contract as c
on i.item_id = c.item_id
and i.inc_date >= c.[start]
and i.inc_date <= c.[end]
where c.[start]>='20160101'
group by i.Cust_id, c.con_id
order by i.Cust_Id, c.con_id
returns:
+-------+---------+----------+-----------+
| Calls | Cust | Contract | Serv_Time |
+-------+---------+----------+-----------+
| 1 | Amir | MS2016 | 300 |
| 2 | John | HP2016 | 180 |
| 1 | Kerlee | OR2016 | 40 |
| 1 | Nick | CIS2016 | 120 |
| 2 | samanta | EMC2016 | 200 |
+-------+---------+----------+-----------+
test setup: http://rextester.com/WSYDL43321
create table inc(
inc_id int
, cust_id varchar(16)
, item_id varchar(16)
, serv_time int
, inc_date date
);
insert into inc values
(1,'john','HP', 40 ,'17-Apr-2015')
,(2,'John','HP', 60 ,'10-Jan-2016')
,(3,'Nick','Cisco', 120 ,'11-Jan-2016')
,(4,'samanta','EMC', 180 ,'12-Jan-2016')
,(5,'Kerlee','Oracle', 40 ,'13-Jan-2016')
,(6,'Amir','Microsoft', 300 ,'14-Jan-2016')
,(7,'John','HP', 120 ,'15-Jan-2016')
,(8,'samanta','EMC', 20 ,'16-Jan-2016')
,(9,'Kerlee','Oracle', 10 ,'02-Feb-2017');
create table contract (
item_id varchar(16)
, con_id varchar(16)
, [Start] date
, [End] date
);
insert into contract values
('Dell','DE2015','20150101','20151231')
,('HP','HP2015','20150101','20151231')
,('Cisco','CIS2016','20160101','20161231')
,('EMC','EMC2016','20160101','20161231')
,('HP','HP2016','20160101','20161231')
,('Oracle','OR2016','20160101','20161231')
,('Microsoft','MS2016','20160101','20161231')
,('Microsoft','MS2017','20170101','20171231');
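Both answers rely on the same idea: join an incident to a contract only when the item matches and the incident date falls inside the contract's start and end dates. A minimal Python sketch of that range join, using a subset of the test data above, might look like this. The function name `calls_per_contract` is made up for illustration.

```python
from collections import defaultdict
from datetime import date

incidents = [  # (cust_id, item_id, serv_time, inc_date)
    ("John", "HP", 60, date(2016, 1, 10)),
    ("Nick", "Cisco", 120, date(2016, 1, 11)),
    ("John", "HP", 120, date(2016, 1, 15)),
    ("Kerlee", "Oracle", 10, date(2017, 2, 2)),
]
contracts = [  # (item_id, con_id, start, end)
    ("HP", "HP2015", date(2015, 1, 1), date(2015, 12, 31)),
    ("HP", "HP2016", date(2016, 1, 1), date(2016, 12, 31)),
    ("Cisco", "CIS2016", date(2016, 1, 1), date(2016, 12, 31)),
    ("Oracle", "OR2016", date(2016, 1, 1), date(2016, 12, 31)),
]

def calls_per_contract(incidents, contracts):
    """Return {(cust_id, con_id): (calls, total_serv_time)}."""
    totals = defaultdict(lambda: [0, 0])
    for cust_id, item_id, serv_time, inc_date in incidents:
        for con_item, con_id, start, end in contracts:
            # Same item AND incident date within the contract period.
            if con_item == item_id and start <= inc_date <= end:
                totals[(cust_id, con_id)][0] += 1
                totals[(cust_id, con_id)][1] += serv_time
    return {key: tuple(value) for key, value in totals.items()}

result = calls_per_contract(incidents, contracts)
print(result)
```

Without the date condition, each HP incident would also match HP2015 and be counted twice, which is the problem with joining on item_id alone.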

SQL Server aggregate over range of dates

I am using SQL Server 2014. I need to aggregate totals (sum total) over a range of dates that are partitioned or grouped by customer and location. The key is to get all the adjustment amounts and sum them up as they apply to a billing transaction date.
So all adjustments after the last bill date, but less than the next bill date need to get summed up and presented nicely along with the bill amount.
See example:
+------------------+------------+------------+------------------+--------------------+
| TRANSACTION_TYPE | CUSTOMERID | LOCATIONID | TRANSACTION DATE | TRANSACTION AMOUNT |
+------------------+------------+------------+------------------+--------------------+
| bill | 215 | 102 | 7/7/2016 | $100.00 |
| bill | 215 | 102 | 6/6/2016 | $121.00 |
| adj | 215 | 102 | 6/1/2016 | $22.00 |
| adj | 215 | 102 | 5/8/2016 | $0.35 |
| adj | 215 | 102 | 5/7/2016 | $5.00 |
| bill | 215 | 102 | 5/6/2016 | $115.00 |
| bill | 215 | 102 | 4/7/2016 | $200.00 |
| adj | 215 | 102 | 4/2/2016 | $4.35 |
| adj | 215 | 102 | 4/1/2016 | $(0.50) |
| adj | 215 | 102 | 3/28/2016 | $33.00 |
| bill | 215 | 102 | 3/28/2016 | $75.00 |
| adj | 215 | 102 | 3/5/2016 | $0.33 |
| bill | 215 | 102 | 3/3/2016 | $99.00 |
+------------------+------------+------------+------------------+--------------------+
What I would like to see is the following:
+------------------+------------+------------+------------------+-------------+-------------------+
| TRANSACTION_TYPE | CUSTOMERID | LOCATIONID | TRANSACTION DATE | BILL AMOUNT | ADJUSTMENT AMOUNT |
+------------------+------------+------------+------------------+-------------+-------------------+
| bill | 215 | 102 | 7/7/2016 | $100.00 | $- |
| bill | 215 | 102 | 6/6/2016 | $121.00 | $27.35 |
| bill | 215 | 102 | 5/6/2016 | $115.00 | $- |
| bill | 215 | 102 | 4/7/2016 | $200.00 | $36.85 |
| bill | 215 | 102 | 3/28/2016 | $75.00 | $0.33 |
| bill | 215 | 102 | 3/3/2016 | $99.00 | $- |
+------------------+------------+------------+------------------+-------------+-------------------+
You need to:
first conceive the table as two (virtual) sub-tables, on the TransactionType;
then use the LEAD function to get the date range of adjustments to be applied; and
finally perform a left join.
Untested SQL below:
with
BillData as (
select
TransactionType,
CustomerID,
LocationID,
TransactionDate,
TransactionAmount,
lead(TransactionDate, 1) over (partition by CustomerID
order by TransactionDate) as NextDate
from #data bill
where TransactionType = 'bill'
),
AdjData as (
select
CustomerID,
TransactionDate,
sum(TransactionAmount) as AdjAmount
from #data adj
where TransactionType = 'adj'
group by CustomerID, TransactionDate
)
select
bill.TransactionType,
bill.CustomerID,
bill.LocationID,
bill.TransactionDate,
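-- each group holds a single bill; sum() would multiply it by the number of joined adjustment rows
max(bill.TransactionAmount) as BillAmount,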
sum(AdjAmount) as AdjAmount
from BillData bill
left join AdjData adj
on adj.CustomerID = bill.CustomerID
and bill.TransactionDate <= adj.TransactionDate
and adj.TransactionDate < bill.NextDate
group by
bill.TransactionType,
bill.CustomerID,
bill.LocationID,
bill.TransactionDate
;
This is what I ended up doing:
select
bill.TransactionType,
bill.CustomerID,
bill.LocationID,
bill.TransactionDate,
TransactionAmount as BillAmount,
sum(AdjAmount) as AdjAmount
from
(
select
TransactionType,
CustomerID,
LocationID,
TransactionDate,
TransactionAmount,
lag(TransactionDate, 1) over (partition by CustomerID, LocationID
order by TransactionDate) as PreviousDate --NextDate
from test1
where TransactionType = 'bill'
) as bill
left join
(
select
CustomerID,
LocationID,
TransactionDate,
TransactionAmount as AdjAmount
from test1
where TransactionType = 'adj'
) as adj
ON
adj.CustomerID = bill.CustomerID
and adj.LocationID = bill.LocationID
and adj.TransactionDate >= bill.PreviousDate
and adj.TransactionDate < bill.TransactionDate
group by
bill.TransactionType,
bill.CustomerID,
bill.LocationID,
bill.TransactionDate,
bill.TransactionAmount
order by 4 desc
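The window used in that join (adjustments dated on or after the previous bill and before the current bill) can be checked with a short Python sketch. This is an illustration for a single customer and location only, using the earliest rows of the example data; the function name `adjustments_per_bill` is hypothetical.

```python
from datetime import date
from decimal import Decimal

transactions = [  # (type, transaction_date, amount)
    ("bill", date(2016, 3, 3), Decimal("99.00")),
    ("adj", date(2016, 3, 5), Decimal("0.33")),
    ("bill", date(2016, 3, 28), Decimal("75.00")),
    ("adj", date(2016, 3, 28), Decimal("33.00")),
    ("adj", date(2016, 4, 1), Decimal("-0.50")),
    ("adj", date(2016, 4, 2), Decimal("4.35")),
    ("bill", date(2016, 4, 7), Decimal("200.00")),
]

def adjustments_per_bill(transactions):
    """Return [(bill_date, bill_amount, adj_total)] ordered by bill date."""
    bills = sorted((t for t in transactions if t[0] == "bill"), key=lambda t: t[1])
    adjs = [t for t in transactions if t[0] == "adj"]
    result = []
    previous_date = None  # plays the role of lag(TransactionDate)
    for _type, bill_date, bill_amount in bills:
        if previous_date is None:
            adj_total = Decimal("0")  # first bill: no earlier window to sum
        else:
            adj_total = sum(
                (amount for _t, d, amount in adjs if previous_date <= d < bill_date),
                Decimal("0"),
            )
        result.append((bill_date, bill_amount, adj_total))
        previous_date = bill_date
    return result

result = adjustments_per_bill(transactions)
for row in result:
    print(row)
```

The totals for the 3/28 and 4/7 bills agree with the desired output in the question (0.33 and 36.85).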