T-SQL Programming: Common Table Expression - sql

I need help with the following scenario. I am using T-SQL.
The following are my table details. Say the table name is #tempk.
Customer Current_Month Contract Amount
201 2015-09-01 3 100
My requirement is to add 12 months to the current month, which gives 2016-09-01 (assume Current_Month is always the first day of a month).
I need the data in the following format:
Customer Renewal_Month Contract_months End_Month Amount
201 2015-09-01 3 2016-09-01 100
201 2015-12-01 3 2016-09-01 100
201 2016-03-01 3 2016-09-01 100
201 2016-06-01 3 2016-09-01 100
The Contract column can hold any value.
Each subsequent record's Renewal_Month is the previous record's Renewal_Month plus the Contract value in months.
I am using the following query. I have a date dimension table called Dim_Date that has date, quarter, year, month, etc.
WITH GetProrateCTE (Customer_ID,Renewal_Month,Contract_Months,End_Month,MRR) as
(SELECT Customer_ID,Renewal_Month,Contract_Months,DATEADD(month, 12,Renewal_Month) End_Month,MRR
from #tempk),
GetRenewalMonths (Customer_ID,Renewal_Month,Contract_Months,End_Month,MRR) as
(
SELECT A.Customer_ID,B.Month Renewal_Month,A.Contract_Months,A.End_Month,A.MRR
FROM GetProrateCTE A
INNER JOIN (SELECT Month from DW..Dim_Date B GROUP BY MONTH) B
ON B.Month between A.Renewal_Month and A.End_Month
)
SELECT G.Customer_ID,G.Renewal_Month,G.Contract_Months,G.End_Month,G.MRR
FROM GetRenewalMonths G
Could you please help me achieve this result? Any help would be greatly appreciated.
I would like to do this with common table expressions. Or would it be better to use a cursor?

You can try it this way:
WITH CTE AS
(
    -- Anchor row: first renewal month and the end month (12 months later)
    SELECT Customer, DATEADD(MM, DATEDIFF(MM, 0, Current_Month), 0) AS Renewal_Month,
           Contract, DATEADD(YEAR, 1, Current_Month) AS End_Month, Amount, 1 AS Level
    FROM #tempk
    UNION ALL
    -- Recursive step: move the previous renewal month forward by Contract months
    SELECT t.Customer, DATEADD(MONTH, t.Contract, c.Renewal_Month),
           t.Contract, DATEADD(YEAR, 1, t.Current_Month) AS End_Month, t.Amount, c.Level + 1
    FROM #tempk t
    JOIN CTE c ON t.Customer = c.Customer
    WHERE c.Level < (12 / t.Contract)
)
SELECT Customer, Renewal_Month, Contract AS Contract_months, End_Month, Amount
FROM CTE
Just add your date dimension table logic to this.
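For anyone who wants to experiment with the recursive idea without a SQL Server instance, here is a minimal sketch that runs the same kind of recursive CTE in an in-memory SQLite database from Python. It is a variant, not the query above: SQLite's date() modifiers stand in for DATEADD, a plain table named tempk stands in for #tempk, and the recursion stops by comparing the next renewal month with End_Month rather than by counting levels. All of those substitutions are assumptions made for the sake of a runnable example.

```python
import sqlite3

# In-memory SQLite stand-in for #tempk; column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tempk (Customer INTEGER, Current_Month TEXT, "
    "Contract INTEGER, Amount INTEGER)"
)
conn.execute("INSERT INTO tempk VALUES (201, '2015-09-01', 3, 100)")

renewal_sql = """
WITH RECURSIVE renewals (Customer, Renewal_Month, Contract, End_Month, Amount) AS (
    -- Anchor: the first renewal month and the end month 12 months later
    SELECT Customer, Current_Month, Contract,
           date(Current_Month, '+12 months'), Amount
    FROM tempk
    UNION ALL
    -- Recursive step: advance by Contract months while still before End_Month
    SELECT Customer, date(Renewal_Month, '+' || Contract || ' months'),
           Contract, End_Month, Amount
    FROM renewals
    WHERE date(Renewal_Month, '+' || Contract || ' months') < End_Month
)
SELECT Customer, Renewal_Month, Contract, End_Month, Amount
FROM renewals
ORDER BY Customer, Renewal_Month
"""

renewal_rows = conn.execute(renewal_sql).fetchall()
for row in renewal_rows:
    print(row)
```

Stopping on the date comparison means a Contract value that does not divide 12 evenly (5, say) still produces every renewal month that falls before End_Month, whereas a level count based on integer division would stop one step earlier.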

Related

how to aggregate one record multiple times based on condition

I have a bunch of records in the table below.
product_id produced_date expired_date
123 2010-02-01 2012-05-31
234 2013-03-01 2014-08-04
345 2012-05-01 2018-02-25
... ... ...
I want the output to show, for each month, how many unexpired products we have. (Say a product expires on August 04: we still count it in the August stock.)
Month n_products
2010-02-01 10
2010-03-01 12
...
2022-07-01 25
2022-08-01 15
How should I do this in Presto or Hive? Thank you!
You can use the SQL below.
Here we use CASE WHEN to check whether a product has expired (produced_date >= expired_date); if it has, we add 1 to the count of expired products, and then we group that data by expiry month.
select
TRUNC(expired_date, 'MM') expired_month,
SUM( case when produced_date >= expired_date then 1 else 0 end) n_products
from mytable
group by 1
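In Presto, we can use the unnest and sequence functions to build a derived table of months; joining our table to it should give the desired result (this assumes produced_date and expired_date are DATE columns).
select m.month, count(t.product_id) as n_products
from (
  -- one row per month between the earliest produced_date and the latest expired_date
  select x as month
  from (select min(date_trunc('month', produced_date)) as first_month,
               max(date_trunc('month', expired_date)) as last_month
        from mytable) r
  cross join unnest(sequence(r.first_month, r.last_month, interval '1' month)) as u(x)
) m
left join mytable t on m.month >= t.produced_date and m.month <= t.expired_date
group by 1
order by 1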
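The "month spine plus range join" idea can be tried out locally. The sketch below is only an illustration: it uses an in-memory SQLite database from Python, a recursive CTE in place of Presto's sequence/unnest, and three made-up product rows (not the asker's data) so that the counts are easy to check by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (product_id INTEGER, produced_date TEXT, expired_date TEXT)"
)
# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (123, "2010-02-01", "2010-04-15"),
        (234, "2010-03-01", "2010-03-20"),
        (345, "2010-04-01", "2010-05-02"),
    ],
)

stock_sql = """
WITH RECURSIVE months (month) AS (
    -- Spine: first day of every month from the earliest production
    -- month to the latest expiry month
    SELECT MIN(date(produced_date, 'start of month')) FROM mytable
    UNION ALL
    SELECT date(month, '+1 months') FROM months
    WHERE month < (SELECT MAX(date(expired_date, 'start of month')) FROM mytable)
)
SELECT m.month, COUNT(t.product_id) AS n_products
FROM months m
LEFT JOIN mytable t
  ON m.month >= t.produced_date AND m.month <= t.expired_date
GROUP BY m.month
ORDER BY m.month
"""

stock_rows = conn.execute(stock_sql).fetchall()
for row in stock_rows:
    print(row)
```

Because the spine holds the first day of each month, the condition m.month <= expired_date keeps a product in the month in which it expires, which is what the question asks for.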

SQL For each value from a table, execute a query on another table depending on that value

I have two simple tables:
Table #1 apples_consumption
report_date | apples_consumed
2022-01-01 | 5
2022-02-01 | 7
2022-03-01 | 2
Table #2 hotel_visitors
visitor_id | check_in_date | check_out_date
1 | 2021-12-01 | 2022-02-01
2 | 2022-01-01 | NULL
3 | 2022-02-01 | NULL
4 | 2022-03-01 | NULL
My purpose is to get a table which shows the ratio between number of visitors in the hotel to number of apples consumed by that time.
For the example above the desired query output should look like this:
report_date | visitors_count | apples_consumed
2022-01-01 | 2 (visitors #1, #2) | 5
2022-02-01 | 3 (visitors #1, #2, #3) | 7
2022-03-01 | 3 (visitors #2, #3, #4) | 2
If I were to solve this in procedural code, I would go over each report_date from the apples_consumption table and count how many visitors have a check_in_date on or before that report_date and a check_out_date that is either NULL or on or after that report_date.
I came up with this query:
select
ac.report_date,
ac.apples_consumed,
(
select count(*)
from hotel_visitors hv
where
hv.check_in_date <= ac.report_date and
(hv.check_out_date is null or hv.check_out_date >= ac.report_date)
) as visitors_count
from
apples_consumption ac
order by
ac.report_date
The query above works, but it is very inefficient: execution time is relatively long for larger datasets, and the way it is written means the inner count(*) query runs once for every row of the outer apples_consumption table.
I am looking for a more efficient way to achieve this result and your help will be highly appreciated!
It is very rarely a good idea to put a subselect in your select list.
Join your tables and then use an aggregate count:
select a.report_date, count(v.visitor_id) as visitors_count, a.apples_consumed
from apples_consumption a
left join hotel_visitors v
on a.report_date
between v.check_in_date
and coalesce(v.check_out_date, '9999-12-31')
group by a.report_date, a.apples_consumed
order by a.report_date;
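To check the join-and-count approach against the sample data from the question, here is a small sketch that loads both tables into an in-memory SQLite database from Python and runs essentially the same query. SQLite and ISO-formatted text dates are assumptions made so the example is self-contained; on another engine the date types and literal syntax may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE apples_consumption (report_date TEXT, apples_consumed INTEGER);
INSERT INTO apples_consumption VALUES
    ('2022-01-01', 5), ('2022-02-01', 7), ('2022-03-01', 2);

CREATE TABLE hotel_visitors (visitor_id INTEGER, check_in_date TEXT, check_out_date TEXT);
INSERT INTO hotel_visitors VALUES
    (1, '2021-12-01', '2022-02-01'),
    (2, '2022-01-01', NULL),
    (3, '2022-02-01', NULL),
    (4, '2022-03-01', NULL);
""")

visitors_sql = """
SELECT a.report_date, COUNT(v.visitor_id) AS visitors_count, a.apples_consumed
FROM apples_consumption a
LEFT JOIN hotel_visitors v
  -- a visitor counts if the report date falls inside the stay;
  -- an open-ended stay (NULL check-out) is treated as lasting "forever"
  ON a.report_date BETWEEN v.check_in_date
                       AND COALESCE(v.check_out_date, '9999-12-31')
GROUP BY a.report_date, a.apples_consumed
ORDER BY a.report_date
"""

visitor_rows = conn.execute(visitors_sql).fetchall()
for row in visitor_rows:
    print(row)
```

COUNT(v.visitor_id) rather than COUNT(*) matters with the LEFT JOIN: a report date with no visitors still yields one row with NULL visitor columns, and counting the column reports 0 for it instead of 1.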

SQL update statement to sum column in one table, then add the total to a different column/table

Evening all, hoping for some pointers with an SQL Server query if possible.
I have two tables in a database, example as follows:
PostedTran
PostedTranID AccountID PeriodID Value TransactionDate
1 100 120 100 2019-01-01
2 100 120 200 2020-01-01
3 100 130 300 2021-01-01
4 101 120 400 2020-01-01
5 101 130 500 2021-01-01
PeriodValue
PeriodValueID AccountID PeriodID ActualValue
10 100 120 500
11 101 120 600
I have a mismatch in the two tables, and I'm failing miserably in my attempts. From the PostedTran table, I'm trying to select all transaction lines dated before 2021-01-01, then sum the Value for each AccountID from the results. I then need to add that value to the existing ActualValue in the PeriodValue table.
So, in the above example, the ActualValue on PeriodValueID 10 will update to 800, and 11 to 1000. The PeriodID in this example is constant and will always be 120.
Thanks in advance for any help.
Since the RDBMS is not specified, pseudo-SQL would look like:
with DataSum as
(
select AccountID, PeriodID, sum(Value) as TotalValue
from PostedTran
where TransactionDate < '20210101'
group by AccountID, PeriodID
)
update pv set ActualValue = pv.ActualValue + ds.TotalValue
from PeriodValue pv inner join DataSum ds
on pv.AccountID = ds.AccountID and pv.PeriodID = ds.PeriodID
The following should do what you ask. I haven't included PeriodId in the correlation as you did not specify it in your description, however you can just include it if it's required.
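update pv set pv.ActualValue = pv.ActualValue + t.Value
from PeriodValue pv
cross apply (
-- isnull keeps ActualValue unchanged (instead of NULL) when an account has no matching transactions
select isnull(Sum(pt.Value), 0) Value
from PostedTran pt
where pt.AccountId = pv.AccountId and pt.TransactionDate < '20210101'
) t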
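To verify the expected totals (800 and 1000) on the sample data, here is a rough sketch using an in-memory SQLite database from Python. SQLite is an assumption made for portability: it has no CROSS APPLY, so a correlated subquery in the SET clause plays the same role, and dates are stored as ISO text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PostedTran (PostedTranID INTEGER, AccountID INTEGER, PeriodID INTEGER,
                         Value INTEGER, TransactionDate TEXT);
INSERT INTO PostedTran VALUES
    (1, 100, 120, 100, '2019-01-01'),
    (2, 100, 120, 200, '2020-01-01'),
    (3, 100, 130, 300, '2021-01-01'),
    (4, 101, 120, 400, '2020-01-01'),
    (5, 101, 130, 500, '2021-01-01');

CREATE TABLE PeriodValue (PeriodValueID INTEGER, AccountID INTEGER, PeriodID INTEGER,
                          ActualValue INTEGER);
INSERT INTO PeriodValue VALUES (10, 100, 120, 500), (11, 101, 120, 600);
""")

# Add the per-account sum of pre-2021 transactions to the stored ActualValue.
# COALESCE keeps ActualValue unchanged when an account has no matching rows.
conn.execute("""
UPDATE PeriodValue
SET ActualValue = ActualValue + COALESCE((
    SELECT SUM(pt.Value)
    FROM PostedTran pt
    WHERE pt.AccountID = PeriodValue.AccountID
      AND pt.TransactionDate < '2021-01-01'
), 0)
""")

updated_values = conn.execute(
    "SELECT PeriodValueID, ActualValue FROM PeriodValue ORDER BY PeriodValueID"
).fetchall()
print(updated_values)
```

Note that running the update twice adds the sum twice, so on real data it is worth wrapping it in a transaction and checking the result before committing.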

Selecting the most recent date

I have data structured like this:
ID | Enrolment_Date | Appointment1_Date | Appointment2_Date | .... | Appointment150_Date |
112 01/01/2015 01/02/2015 01/03/2018 01/08/2018
113 01/06/2018 01/07/2018 NULL NULL
114 01/04/2018 01/05/2018 01/06/2018 NULL
I need a new variable which counts the number of months between the enrolment_date and the most recent appointment. The challenge is that all individuals have a different number of appointments.
Update: I agree with the comments that this is poor table design and it needs to be reformatted. Could proposed solutions please include suggested code on how to transform the table?
Since the OP is currently stuck with this bad design, I will point out a temporary solution. As others have suggested, you really must change the structure here. For now, this will suffice:
SELECT '['+ NAME + '],' FROM sys.columns WHERE OBJECT_ID = OBJECT_ID ('TableA') -- find all columns, last one probably max appointment date
SELECT ID,
Enrolment_Date,
CASE WHEN Appointment150_Date IS NOT NULL THEN DATEDIFF (MONTH, Enrolment_Date, Appointment150_Date)
WHEN Appointment149_Date IS NOT NULL THEN DATEDIFF (MONTH, Enrolment_Date, Appointment149_Date)
WHEN Appointment148_Date IS NOT NULL THEN DATEDIFF (MONTH, Enrolment_Date, Appointment148_Date)
WHEN Appointment147_Date IS NOT NULL THEN DATEDIFF (MONTH, Enrolment_Date, Appointment147_Date)
WHEN Appointment146_Date IS NOT NULL THEN DATEDIFF (MONTH, Enrolment_Date, Appointment146_Date)
WHEN Appointment145_Date IS NOT NULL THEN DATEDIFF (MONTH, Enrolment_Date, Appointment145_Date)
WHEN Appointment144_Date IS NOT NULL THEN DATEDIFF (MONTH, Enrolment_Date, Appointment144_Date) -- and so on
END AS NumberOfMonths
FROM TableA
This is a very ugly temporary solution and should be considered as such.
You will need to restructure your data, the given structure is poor database design. Create two separate tables - one called users and one called appointments. The users table contains the user id, enrollment date and any other specific user information. Each row in the appointments table contains the user's unique id and a specific appointment date. Structuring your tables like this will make it easier to write a query to get days/months since last appointment.
For example:
Users Table:
ID, Enrollment_Date
1, 2018-01-01
2, 2018-03-02
3, 2018-05-02
Appointments Table:
ID, Appointment_Date
1, 2018-01-02
1, 2018-02-02
1, 2018-02-10
2, 2018-05-01
You would then be able to write a query that joins the two tables and calculates the difference between the enrollment date and the max (most recent) value of the appointment date.
It is better if you can create two tables.
Enrolment Table (dbo.Enrolments)
ID | EnrolmentDate
1 | 2018-08-30
2 | 2018-08-31
Appointments Table (dbo.Appointments)
ID | EnrolmentID | AppointmentDate
1 | 1 | 2018-09-02
2 | 1 | 2018-09-03
3 | 2 | 2018-09-01
4 | 2 | 2018-09-03
Then you can try something like this.
If you want the count of months from Enrolment Date to the final appointment date then use below query.
SELECT E.ID, E.EnrolmentDate, A.NoOfMonths
FROM dbo.Enrolments E
OUTER APPLY
(
SELECT DATEDIFF(mm, E.EnrolmentDate, MAX(A.AppointmentDate)) AS NoOfMonths
FROM dbo.Appointments A
WHERE A.EnrolmentId = E.ID
) A
And if you want the count of months from the Enrolment Date to the earliest appointment date, use the query below.
SELECT E.ID, E.EnrolmentDate, A.NoOfMonths
FROM dbo.Enrolments E
OUTER APPLY
(
SELECT DATEDIFF(mm, E.EnrolmentDate, MIN(A.AppointmentDate)) AS NoOfMonths
FROM dbo.Appointments A
WHERE A.EnrolmentId = E.ID
) A
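The two-table layout can be tried without SQL Server. The following is a minimal sketch in Python with an in-memory SQLite database, using the sample rows from this answer. SQLite has no DATEDIFF or OUTER APPLY, so the month difference is computed from the year and month parts (which mirrors how DATEDIFF(month, ...) counts month boundaries) and a LEFT JOIN to a grouped subquery replaces the APPLY; both substitutions are assumptions of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Enrolments (ID INTEGER PRIMARY KEY, EnrolmentDate TEXT);
INSERT INTO Enrolments VALUES (1, '2018-08-30'), (2, '2018-08-31');

CREATE TABLE Appointments (ID INTEGER PRIMARY KEY, EnrolmentID INTEGER, AppointmentDate TEXT);
INSERT INTO Appointments VALUES
    (1, 1, '2018-09-02'), (2, 1, '2018-09-03'),
    (3, 2, '2018-09-01'), (4, 2, '2018-09-03');
""")

months_sql = """
WITH last_appt AS (
    -- most recent appointment per enrolment
    SELECT EnrolmentID, MAX(AppointmentDate) AS LastDate
    FROM Appointments
    GROUP BY EnrolmentID
)
SELECT e.ID, e.EnrolmentDate,
       -- month boundaries crossed between enrolment and last appointment
       (CAST(strftime('%Y', l.LastDate) AS INTEGER)
          - CAST(strftime('%Y', e.EnrolmentDate) AS INTEGER)) * 12
       + CAST(strftime('%m', l.LastDate) AS INTEGER)
       - CAST(strftime('%m', e.EnrolmentDate) AS INTEGER) AS NoOfMonths
FROM Enrolments e
LEFT JOIN last_appt l ON l.EnrolmentID = e.ID
ORDER BY e.ID
"""

month_rows = conn.execute(months_sql).fetchall()
for row in month_rows:
    print(row)
```

The LEFT JOIN keeps enrolments that have no appointments yet; for those rows NoOfMonths comes back as NULL, the same as with OUTER APPLY.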
You have a lousy data structure, as others have noted. You really want a table with one row per appointment. After all, what happens after the 150th appointment?
select t.id, t.Enrolment_Date,
datediff(month, t.Enrolment_Date, m.max_Appointment_Date) as months_diff
from t cross apply
(select max(Appointment_Date) as max_Appointment_Date
from (values (Appointment1_Date),
(Appointment2_Date),
. . .
(Appointment150_Date)
) v(Appointment_Date)
) m;

Select info from table where row has max date

My table looks something like this:
group date cash checks
1 1/1/2013 0 0
2 1/1/2013 0 800
1 1/3/2013 0 700
3 1/1/2013 0 600
1 1/2/2013 0 400
3 1/5/2013 0 200
-- Do not need cash just demonstrating that table has more information in it
I want to get each unique group where date is max and checks is greater than 0. So the return would look something like:
group date checks
2 1/1/2013 800
1 1/3/2013 700
3 1/5/2013 200
attempted code:
SELECT group,MAX(date),checks
FROM table
WHERE checks>0
GROUP BY group
ORDER BY group DESC
The problem with that, though, is that it gives me all the dates and checks rather than just the max date row.
using ms sql server 2005
SELECT [group], MAX([date]) as max_date
FROM [table]
WHERE checks > 0
GROUP BY [group]
That works to get the max date. Join it back to your data to get the other columns (the brackets are needed because group, date and table are reserved words):
Select t.[group], a.max_date, t.checks
from [table] t
inner join
(SELECT [group], MAX([date]) as max_date
FROM [table]
WHERE checks > 0
GROUP BY [group]) a
on a.[group] = t.[group] and a.max_date = t.[date]
Inner join functions as the filter to get the max record only.
FYI, your column names are horrid, don't use reserved words for columns (group, date, table).
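Here is a small sketch of the "aggregate, then join back" pattern that can be run as-is. It uses Python with an in-memory SQLite database and the sample rows from the question. The names payments, grp and dt are hypothetical replacements for table, group and date (to stay clear of reserved words), and the dates are stored as ISO text on the assumption that the original values are month/day/year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (grp INTEGER, dt TEXT, cash INTEGER, checks INTEGER)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?, ?)",
    [
        (1, "2013-01-01", 0, 0),
        (2, "2013-01-01", 0, 800),
        (1, "2013-01-03", 0, 700),
        (3, "2013-01-01", 0, 600),
        (1, "2013-01-02", 0, 400),
        (3, "2013-01-05", 0, 200),
    ],
)

join_back_sql = """
SELECT t.grp, a.max_dt, t.checks
FROM payments t
INNER JOIN (
    -- latest date per group, considering only rows with checks > 0
    SELECT grp, MAX(dt) AS max_dt
    FROM payments
    WHERE checks > 0
    GROUP BY grp
) a
  ON a.grp = t.grp AND a.max_dt = t.dt
WHERE t.checks > 0
ORDER BY t.grp
"""

latest_rows = conn.execute(join_back_sql).fetchall()
for row in latest_rows:
    print(row)
```

The extra WHERE t.checks > 0 on the outer query is a precaution: without it, a second row on the same max date with checks = 0 would also survive the join.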
You can use a window MAX() like this:
SELECT
*,
max_date = MAX(date) OVER (PARTITION BY group)
FROM table
to get max dates per group alongside other data:
group date cash checks max_date
----- -------- ---- ------ --------
1 1/1/2013 0 0 1/3/2013
2 1/1/2013 0 800 1/1/2013
1 1/3/2013 0 700 1/3/2013
3 1/1/2013 0 600 1/5/2013
1 1/2/2013 0 400 1/3/2013
3 1/5/2013 0 200 1/5/2013
Using the above output as a derived table, you can then get only rows where date matches max_date:
SELECT
group,
date,
checks
FROM (
SELECT
*,
max_date = MAX(date) OVER (PARTITION BY group)
FROM table
) AS s
WHERE date = max_date
;
to get the desired result.
Basically, this is similar to @Twelfth's suggestion but avoids a join and may thus be more efficient.
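The window-function variant can be checked the same way. This sketch again uses Python with an in-memory SQLite database (window functions need SQLite 3.25 or later, which is an assumption about the local build), hypothetical names payments, grp and dt in place of the reserved words, and ISO text dates assuming month/day/year in the original. Unlike the query above, it also applies the checks > 0 condition from the question inside the derived table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (grp INTEGER, dt TEXT, cash INTEGER, checks INTEGER)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?, ?)",
    [
        (1, "2013-01-01", 0, 0),
        (2, "2013-01-01", 0, 800),
        (1, "2013-01-03", 0, 700),
        (3, "2013-01-01", 0, 600),
        (1, "2013-01-02", 0, 400),
        (3, "2013-01-05", 0, 200),
    ],
)

window_sql = """
SELECT grp, dt, checks
FROM (
    -- attach each group's latest date to every qualifying row
    SELECT grp, dt, checks,
           MAX(dt) OVER (PARTITION BY grp) AS max_dt
    FROM payments
    WHERE checks > 0
) AS s
WHERE dt = max_dt
ORDER BY grp
"""

window_rows = conn.execute(window_sql).fetchall()
for row in window_rows:
    print(row)
```

The table is read once here, whereas the join-back version reads it twice; whether that is actually faster depends on the engine and the available indexes.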
Using IN can have a performance impact. Joining two subqueries does not have the same impact and can be done like this:
SELECT *
FROM (SELECT msisdn
,callid
,Change_color
,play_file_name
,date_played
FROM insert_log
WHERE play_file_name NOT IN('Prompt1','Conclusion_Prompt_1','silent')
ORDER BY callid ASC) t1
JOIN (SELECT MAX(date_played) AS date_played
FROM insert_log GROUP BY callid) t2
ON t1.date_played = t2.date_played
SELECT distinct
group,
max_date = MAX(date) OVER (PARTITION BY group), checks
FROM table
Should work.