CASE statement in the WHERE clause, with further conditioning after THEN - sql

I have a table of invoices and one of contracts that invoices are linked to. However, not every invoice is linked to a contract. I want to return all invoices linked to contracts which are within a timeframe (Starting before 2021-03-01), and also the invoices with no contracts linked at all. I believe this is something like: contract IS null OR CASE WHEN contract IS NOT NULL THEN (condition on timestamp), but I don't know how to write it. Note that the actual conditioning will be done on multiple columns, so I am looking for the general form of multiple sub-conditions under the conditions in the WHERE clause.
Example:
Invoice Table
InvoiceID
ContractID
1
NULL
2
1
3
NULL
4
2
5
3
6
4
Contract Table
ContractID
Contract Start Timestamp
1
2021-01-01 00:00:00
2
2021-02-01 00:00:00
3
2021-03-02 00:00:00
4
2021-05-01 00:00:00
Desired Result
InvoiceID
ContractID
1
NULL
2
1
3
NULL
4
2

maybe a simple left join like this can help you in filtering
select
I.*
from Invoice I
left outer join Contract C -- left join gets even the NULL contractid invoices
on C.ContractID= I.ContractID
where
C.[Contract Start Timestamp] IS NULL
OR C.[Contract Start Timestamp]< '2021-03-01'

Related

SQL For each value from a table, execute a query on another table depending on that value

I have two simple tables:
Table #1 apples_consumption
report_date
apples_consumed
2022-01-01
5
2022-02-01
7
2022-03-01
2
Table #2 hotel_visitors
visitor_id
check_in_date
check_out_date
1
2021-12-01
2022-02-01
2
2022-01-01
NULL
3
2022-02-01
NULL
4
2022-03-01
NULL
My purpose is to get a table which shows the ratio between number of visitors in the hotel to number of apples consumed by that time.
For the example above the desired query output should look like this:
report_date
visitors_count
apples_consumed
2022-01-01
2 -->(visitors #1, #2)
5
2022-02-01
3 -->(visitors #1, #2, #3)
7
2022-03-01
3 -->(visitors #2, #3, #4)
2
If I were to write a solution to this task using code I would go over each report_date from the apples_consumption table and count how many visitors have a lower/equal check_in_date than that report_date and also have a check_out_date = NULL or check_out_date greater/equal than that report_date
I came up with this query:
select
ac.report_date,
ac.apples_consumed,
(
select count(*)
from hotel_visitors hv
where
hv.check_in_date <= ac.report_date and
(hv.check_out_date is null or hv.check_out_date >= ac.report_date
) as visitors_count
from
apples_consumptions ac
order by
ac.report_date
The query above works but it is very inefficient (I can see its relatively long execution time for larger datasets and by the way its written [it runs the inner count(*) query for as many rows as the outer apples_consuptions table has)
I am looking for a more efficient way to achieve this result and your help will be highly appreciated!
It is very rarely a good idea to put a subselect in your select list.
Join your tables and then use an aggregate count:
select a.report_date, count(v.visitor_id) as visitors_count, a.apples_consumed
from apples_consumption a
left join hotel_visitors v
on a.report_date
between v.check_in_date
and coalesce(v.check_out_date, '9999-12-31')
group by a.report_date, a.apples_consumed
order by a.report_date;
db<>fiddle here

SQL Troubleshooting Help on Table Structure

I'm attempting to calculate average number of days between a customer's 1st and 3rd purchase, but struggling to get the data ordered in a way that will allow me to calculate.
I currently have the below data table. (Note: Order sequence number refers to the number order for that customer.)
Order Date
Customer Number
Order Sequence Number
2020-09-20
1
1
2021-01-20
1
2
2021-01-21
1
3
2020-10-01
2
1
2020-08-06
3
1
2020-09-06
3
2
2020-09-09
3
3
I've been trying to get the data to look like the following table. [To then be able to calculate datediff on the last two columns.]
Customer Number
Order Count
First Order Date
Third Order Date
1
3
2020-09-20
2021-01-21
2
1
2020-10-01
Null
3
3
2020-08-06
2020-09-09
I've completely messed up the code, but here's what I've been trying.
CREATE TABLE X2 as
SELECT
customer_number,
max(order_sequence_number) as order_count,
CASE
WHEN order_sequence_number = 1 then order_date
ELSE null
END as first_order_date,
CASE
WHEN order_sequence_number = 3 then order_date
ELSE null
END as third_order_date
FROM X1
GROUP BY customer_number;
Can someone please tell me what I'm missing? Thanks in advance!
You are on the right track but you need aggregation functions:
SELECT customer_number,
max(order_sequence_number) as order_count,
MAX(CASE WHEN order_sequence_number = 1 THEN order_date END) as first_order_date,
MAX(CASE WHEN order_sequence_number = 3 THEN order_date END) as third_order_date
FROM X1
GROUP BY customer_number;
To get the difference in days, you would just subtract the two expressions using whatever date arithmetic is supported in your database.

Counting subscriber numbers given events on SQL

I have a dataset on mysql in the following format, showing the history of events given some client IDs:
Base Data
Text of the dataset (subscriber_table):
user_id type created_at
A past_due 2021-03-27 10:15:56
A reactivate 2021-02-06 10:21:35
A past_due 2021-01-27 10:30:41
A new 2020-10-28 18:53:07
A cancel 2020-07-22 9:48:54
A reactivate 2020-07-22 9:48:53
A cancel 2020-07-15 2:53:05
A new 2020-06-20 20:24:18
B reactivate 2020-06-14 10:57:50
B past_due 2020-06-14 10:33:21
B new 2020-06-11 10:21:24
date_table:
full_date
2020-05-01
2020-06-01
2020-07-01
2020-08-01
2020-09-01
2020-10-01
2020-11-01
2020-12-01
2021-01-01
2021-02-01
2021-03-01
I have been struggling to come up with a query to count subscriber counts given a range of months, which are not necessary included in the event table either because the client is still subscribed or they cancelled and later resubscribed. The output I am looking for is this:
Output
date subscriber_count
2020-05-01 0
2020-06-01 2
2020-07-01 2
2020-08-01 1
2020-09-01 1
2020-10-01 2
2020-11-01 2
2020-12-01 2
2021-01-01 2
2021-02-01 2
2021-03-01 2
Reactivation and Past Due events do not change the subscription status of the client, however only the Cancel and New event do. If the client cancels in a month, they should still be counted as active for that month.
My initial approach was to get the latest entry given a month per subscriber ID and then join them to the premade date table, but when I have months missing I am unsure on how to fill them with the correct status. Maybe a lag function?
with last_record_per_month as (
select
date_trunc('month', created_at)::date order by created_at) as month_year ,
user_id ,
type,
created_at as created_at
from
subscriber_table
where
user_id in ('A', 'B')
order by
created_at desc
), final as (
select
month_year,
created_at,
type
from
last_record_per_month lrpm
right join (
select
date_trunc('month', full_date)::date as month_year
from
date_table
where
full_date between '2020-05-01' and '2021-03-31'
group by
1
order by
1
) dd
on lrpm.created_at = dd.month_year
and num = 1
order by
month_year
)
select
*
from
final
I do have a premade base table with every single date in many years to use as a joining table
Any help with this is GREATLY appreciated
Thanks!
The approach here is to have the subscriber rows with new connections as base and map them to the cancelled rows using a self join. Then have the date tables as base and aggregate them based on the number of users to get the result.
SELECT full_date, COUNT(DISTINCT user_id) FROM date_tbl
LEFT JOIN(
SELECT new.user_id,new.type,new.created_at created_at_new,
IFNULL(cancel.created_at,CURRENT_DATE) created_at_cancel
FROM subscriber new
LEFT JOIN subscriber cancel
ON new.user_id=cancel.user_id
AND new.type='new' AND cancel.type='cancel'
AND new.created_at<= cancel.created_at
WHERE new.type IN('new'))s
ON DATE_FORMAT(s.created_at_new, '%Y-%m')<=DATE_FORMAT(full_date, '%Y-%m')
AND DATE_FORMAT(s.created_at_cancel, '%Y-%m')>=DATE_FORMAT(full_date, '%Y-%m')
GROUP BY 1
Let me breakdown some sections
First up we need to have the subscriber table self joined based on user_id and then left table with rows as 'new' and the right one with 'cancel' new.type='new' AND cancel.type='cancel'
The new ones should always precede the canceled rows so adding this new.created_at<= cancel.created_at
Since we only care about the rows with new in the base table we filter out the rows in the WHERE clause new.type IN('new'). The result of the subquery would look something like this
We can then join this subquery with a Left join the date table such that the year and month of the created_at_new column is always less than equal to the full_date DATE_FORMAT(s.created_at_new, '%Y-%m')<=DATE_FORMAT(full_date, '%Y-%m') but greater than that of the canceled date.
Lastly we aggregate based on the full_date and consider the unique count of users
fiddle

T-SQL Programming . Common Table expression

I would need a help in the following scneario. I am using T-SQL
Following is my table details. Say the table name is #tempk
Customer Current_Month Contract Amount
201 2015-09-01 3 100
My requirement is to add 12 months from the current month.that is 2016-09-01. Assuming
I am getting the start date of the month. I need the data in the following format
Customer Renewal_Month Contract_months End_Month Amount
201 2015-09-01 3 2016-09-01 100
201 2015-12-01 3 2016-09-01 100
201 2015-03-01 3 2016-09-01 100
201 2015-06-01 3 2016-09-01 100
The contract column can have any values
The consquent records are incremental of contract columns from the previous records.
I am using the following query. I have a date dimension table called Dim_Date that has date,quareter,year,month etc..
WITH GetProrateCTE (Customer_ID,Renewal_Month,Contract_Months,End_Month,MRR) as
(SELECT Customer_ID,Renewal_Month,Contract_Months,DATEADD(month, 12,Renewal_Month) End_Month,MRR
from #tempk),
GetRenewalMonths (Customer_ID,Renewal_Month,Contract_Months,End_Month,MRR) as
(
SELECT A.Customer_ID,B.Month Renewal_Month,A.Contract_Months,A.End_Month,A.MRR
FROM GetProrateCTE A
INNER JOIN (SELECT Month from DW..Dim_Date B GROUP BY MONTH) B
ON B.Month between A.Renewal_Month and A.End_Month
)
SELECT G.Customer_ID,G.Renewal_Month,G.Contract_Months,G.End_Month,G.MRR
FROM GetRenewalMonths G
Could you please help me to achieve the result. Any help would be greatly appreciated.
I want to do this in Common table Expressions. or would it be better if I go cursor.
You can try in this way -
WITH CTE AS
(SELECT Customer,DATEADD(MM,DATEDIFF(MM,0,Current_Month), 0) AS Renewal_Month,Contract,DATEADD(YEAR,1,Current_Month) AS End_Month,Amount,1 AS Level FROM #tempk
UNION ALL
SELECT t.Customer,DATEADD(MONTH,t.Contract,c.Renewal_Month),t.Contract,DATEADD(YEAR,1,t.Current_Month) AS End_Month,t.Amount,Level + 1
FROM #tempk t join CTE c on t.customer = c.customer
WHERE Level < (12/t.Contract))
SELECT Customer,Renewal_Month,Contract AS Contract_months,End_Month,Amount
FROM CTE
Just append your logic of the date dimension table to this.

SQL: Select all from column A and add a value from column B if present

I'm having quite an easy problem with SQL, I just can't word it properly (therefore I didn't find anything in google and my title probably is misleading)
The problem is: I have a big table containing transaction informations in the form (ID, EmployeeID, Date, Value) (and some more, but only those matter currently) and a list of all EmployeeIDs. What I want is a result table showing all employee IDs with their aggregated value of transactions in a given timespan.
The problem is: How do I get those employees into the result table that don't have an entry for the given time period?
e.g.
ID EMPLID DATE VALUE
1 1 2013-01-01 1000
2 2 2013-02-02 2000
3 1 2013-01-03 3000
4 2 2013-04-01 2000
5 2 2013-03-01 2000
6 1 2013-02-01 4000
EMPLID NAME
1 bob
2 alice
And now I want the aggregated value of all transactions after 2013-03-01 like this
EMPLID VALUE
1 0 <- how to get this based on the employee table?
2 4000
The SQL Server in use is Firebird and I connect to it through JDBC (if that matters)
SELECT a.EmpID, a.Name,
COALESCE(SUM(b.Value), 0) TotalValue
FROM Employee a
LEFT JOIN Transactions b
ON a.EmpID = b.EmpID AND
b.Date >= '2013-03-01'
GROUP BY a.EmpID, a.Name
To further gain more knowledge about joins, kindly visit the link below:
Visual Representation of SQL Joins