Join table with filtering by date - sql

I have 2 tables. 1st - wallets, 2nd - wallet histories. Now I can make a selection by the sum of all completed deposits/withdrawal for each wallet.
Sql-code looks something like this:
SELECT w.*, wh.deposit, wh2.withdrawal, wh3.pending
FROM wallets w
LEFT JOIN (
SELECT wallet_id, SUM(amount) deposit
FROM wallet_histories
WHERE type = 'Deposit'
AND status = 'Completed'
AND timestamp BETWEEN '2019-08-02 00:00:00+00' AND '2019-08-03 00:00:00+00'
GROUP BY wallet_id
) wh ON w.id = wh.wallet_id
LEFT JOIN (
SELECT wallet_id, SUM(amount) withdrawal
FROM wallet_histories
WHERE type = 'Withdrawal'
AND status = 'Completed'
AND timestamp BETWEEN '2019-08-02 00:00:00+00' AND '2019-08-03 00:00:00+00'
GROUP BY wallet_id
) wh2 ON w.id = wh2.wallet_id
LEFT JOIN (
SELECT wallet_id, SUM(amount) pending
FROM wallet_histories
WHERE status = 'Pending'
GROUP BY wallet_id
) wh3 ON w.id = wh3.wallet_id
Next, I need to make this selection but with filtering by field enabled_at from the table wallets.
I supposed to make one more condition in WHERE:
AND timestamp >= w.enabled_at
but got error
LINE 10: AND timestamp >= w.enabled_at
^
HINT: There is an entry for table "w", but it cannot be referenced from this part of the query.
How i can avoid this error (maybe fully rebuild query)?
Thank you in advance.

You need to add the condition in each subquery. The results of the aggregations do not have enabled_at, so that column is not available in the outer query -- either in the on or where clauses.
That said, you can drastically simplify your query using conditional aggregation:
SELECT wallet_id,
SUM(amount) FILTER (WHERE type = 'Deposit') as deposit
SUM(amount) FILTER (WHERE type = 'Withdrawal') as withdrawal,
SUM(amount) FILTER (WHERE status = 'Pending') as pending
FROM wallet_histories
WHERE (type IN ('Deposit', 'Withdrawal') AND
status = 'Completed' AND
timestamp >= '2019-08-02' AND
timestamp < '2019-08-03'
) OR
status = 'Pending'
GROUP BY wallet_id;
This might make it easier to add additional conditions.

Related

Calculate rolling year totals in sql

I am gathering something that is essentially am "enrollment date" for users. The "enrollment date" is not stored in the database (for a reason too long to explain here), so I have to deduce it from the data. I then want to reuse this CTE in numerous places throughout another query to gather values such as "total orders 1 year before enrollment" and "total orders 1 year after enrollment".
I haven't gotten this code to run, as it's much more complex in my actual data set (this code is paraphrased from the actual code) and I have a feeling it's not the best way to do this. As you can see, my date conditionals are mostly just placeholders, but I think it should be obvious what I am trying to do.
That said, I think this would mostly work. My question is, is there a better way to do this? Additionally, could I combine the rolling year before and rolling year after into one table somehow? (maybe window functions)? This is part of a much bigger query, so the more consolidation I could do, the better it would seem.
For what it's worth, the subquery to derive the "enrollment date" is also more complex than shown here.
With enroll as (Select
user_id,
MIN(date) as e_date
FROM `orders` o
WHERE (subscribed = True)
group by user_id
)
Select*
from users
left join (select
user_id,
SUM(total_paid)
from orders where date > (select enroll.e_date where user_id = user_id) AND date < (select enroll.e_date where user_id = user_id + 365 days)
and order_type = 'special'
group by user_id
) as rolling_year_after on rolling_year_after.user_id = users.user_id
left join (select
user_id,
SUM(total_paid)
from orders where date < (select enroll.e_date where user_id = user_id) and date > (select enroll.e_date where user_id = user_id - 365 days)
and order_type = 'special'
group by user_id
) as rolling_year_before on rolling_year_before.user_id = users.user_id
Maybe something like this, not sure if its more performant, but looks a bit cleaner:
With enroll as (Select
user_id,
MIN(date) as e_date
FROM `orders` o
WHERE (subscribed = True)
group by user_id
)
, rolling_year as (
select
user_id,
SUM(CASE WHEN date between enroll.edate and enroll.edate + 365 days then (total_paid) else 0 end) as rolling_year_after,
SUM(CASE WHEN date between enroll.edate - 365 days and enroll.edate then (total_paid) else 0 end) as rolling_year_before
from orders
left join enroll
on order.user_id = enroll.user_id
where order_type = 'special'
group by user_id
)
Select *
from users
left join rolling_year
on users.user_id = rolling_year.user_id

SUM with left outer join gets inflated result

The following query gives me the MRR (monthly recurring revenue) for my customer:
with dims as (
select distinct subscription_id, country_name, product_name from revenue
where site_id = '18XLsHIVSJg' and subscription_id is not null
)
select to_date('2022-07-01') as occurred_date,
count(distinct srm.subscription_id) as subscriptions,
count(distinct srm.receiver_contact) as subscribers,
sum(srm.baseline_mrr) as mrr_srm
from subscription_revenue_mart srm
join dims d on d.subscription_id = srm.subscription_id
where srm.site_id = '18XLsHIVSJg'
-- MRR as of the day before ie June 30th
and to_date(srm.creation_date) < '2022-07-01'
-- Counting the subscriptions active after July 1st
and ((srm.subscription_status = 'SUBL.A') or
-- Counting the subscriptions canceled/deactivated after July 1st
(srm.subscription_status = 'SUBL.C' and (srm.deactivation_date >= '2022-07-01') or (srm.canceled_date >= '2022-07-01')) ) group by 1;
I get a total of $5922.15 but I need to add data from another table to capture upgrades/downgrades a customer makes on a product subscription. Using the same approach as above, I can query my "change" table thusly:
select subscription_id, sum(mrr_change_amount) mrr_change_amount,max(subscription_event_date) subscription_event_date from subscription_revenue_mart_change srmc
where site_id = '18XLsHIVSJg'
and to_date(srmc.creation_date) < '2022-07-01'
and ((srmc.subscription_status = 'SUBL.A')
or (srmc.subscription_status = 'SUBL.C' and (srmc.deactivation_date >= '2022-07-01') or (srmc.canceled_date >= '2022-07-01')))
group by 1;
I get a total of $3635.47
When I combine both queries into one, I get an inflated result:
with dims as (
select distinct subscription_id, country_name, product_name from revenue
where site_id = '18XLsHIVSJg' and subscription_id is not null
),
change as (
select subscription_id, sum(mrr_change_amount) mrr_change_amount,
-- there can be multiple changes per subscription
max(subscription_event_date) subscription_event_date from subscription_revenue_mart_change srmc
where site_id = '18XLsHIVSJg'
and to_date(srmc.creation_date) < '2022-07-01'
and ((srmc.subscription_status = 'SUBL.A')
or (srmc.subscription_status = 'SUBL.C' and (srmc.deactivation_date >= '2022-07-01') or (srmc.canceled_date >= '2022-07-01')))
group by 1
)
select to_date('2022-07-01') as occurred_date,
count(distinct srm.subscription_id) as subscriptions,
count(distinct srm.receiver_contact) as subscribers,
-- See comment RE: LEFT OUTER join
sum(coalesce(c.mrr_change_amount,srm.baseline_mrr)) as mrr
from subscription_revenue_mart srm
join dims d
on d.subscription_id = srm.subscription_id
-- LEFT OUTER join required for customers that never made a change
left outer join change c
on srm.subscription_id = c.subscription_id
where srm.site_id = '18XLsHIVSJg'
and to_date(srm.creation_date) < '2022-07-01'
and ((srm.subscription_status = 'SUBL.A')
or (srm.subscription_status = 'SUBL.C' and (srm.deactivation_date >= '2022-07-01') or (srm.canceled_date >= '2022-07-01'))) group by 1;
It should be $9557.62 ie (5922.15 + $3635.47) but the query outputs $16116.91, which is wrong.
I think the explode-implode syndrome may cause this.
I had designed my "change" CTE to prevent this by aggregating all the relevant fields but it's not working.
Can someone provide pointers on the best way to work around this issue?
It would help if you gave us sample data too, but I see a problem here:
sum(coalesce(c.mrr_change_amount,srm.baseline_mrr)) as mrr
Why COALESCE? That will give you one of the 2 numbers, but I guess what you want is:
sum(ifnull(c.mrr_change_amount, 0) + srm.baseline_mrr) as mrr
That's the best I can offer with what you've given us.

How to select data without using group?

My base data based on dealer code only but in one condition we need to select other field as well to matching the condition in other temp table how can i retrieve data only based on dealercode ith matching the condition on chassis no.
Below is the sample data:
This is how we have selected the data for the requirement:
---------------lastyrRenewalpolicy------------------
IF OBJECT_ID('TEMPDB..#LASTYRETEN') IS NOT NULL DROP TABLE #LASTYRETEN
select DEALERMASTERCODE , count(*) RENEWALEXPRPOLICY,SUM(NETOD_YEAR_PREM_PART_A) AS 'ACHIEVED-ODPREMIUM_RENEWAL' into #LASTYRETEN
from [dbo].[T_RE_POLICY_TRANSACTION]
where cast (InsPolicyCreatedDate as date) between #FirstDayC and #LastDayC
AND PolicyStatus= 'Renewal' AND (ltrim(rtrim(ISCANCELLEDSTATUS)) = 0 ) group by DEALERMASTERCODE
-----------------lastrollower------------------------
IF OBJECT_ID('TEMPDB..#LASTYROLWR') IS NOT NULL DROP TABLE #LASTYROLWR
select DEALERMASTERCODE , count(*) ROLLOWEEXPRPOLICY ,SUM(NETOD_YEAR_PREM_PART_A) AS 'ACHIEVED-ODPREMIUM_ROLLOVER'
into #LASTYROLWR from [dbo].[T_RE_POLICY_TRANSACTION] where cast (InsPolicyCreatedDate as date) between #FirstDayC and #LastDayC
AND PolicyStatus= 'ROLLOVER' AND (ltrim(rtrim(ISCANCELLEDSTATUS)) = 0 ) group by DEALERMASTERCODE
And continue with above flow Below is the other select statement which creating issue at the end due to grouping
:
-------------OTHERYRBASE(EXPIRYRENEWAL)--------------
IF OBJECT_ID('TEMPDB..#OTHERYRBASEEXPIRY') IS NOT NULL DROP TABLE #OTHERYRBASEEXPIRY
select DEALERMASTERCODE ,ChassisNo , count(*) RENEWALPOLICYEXPIRY
into #OTHERYRBASEEXPIRY
from [dbo].[T_RE_POLICY_TRANSACTION] where cast (PolicyExpiryDate as date) between '2020-08-01' and '2020-08-31'
and BASIC_PREM_TOTAL <> 0 AND PolicyStatus in ('Renewal','rollover') and BusinessType='jcb'
AND (ltrim(rtrim(ISCANCELLEDSTATUS)) = 0 ) group by DEALERMASTERCODE,ChassisNo
-------------OTHERYRBASE(EXPIRYRENEWAL)--------------
IF OBJECT_ID('TEMPDB..#OTHERYRCON') IS NOT NULL DROP TABLE #OTHERYRCON
select OTE.DEALERMASTERCODE ,OTE.ChassisNo , count(*) OTHERYRCON into #OTHERYRCON
from [dbo].[T_RE_POLICY_TRANSACTION] OTE INNER JOIN #OTHERYRBASEEXPIRY EXP
ON OTE.ChassisNo=EXP.ChassisNo
where cast(CREATED_DATE as date) between '2020-06-01' and '2020-12-31' and BusinessType='jcb'
and OTE.BASIC_PREM_TOTAL <> 0 AND OTE.PolicyStatus = 'Renewal'
AND (ltrim(rtrim(ISCANCELLEDSTATUS)) = 0 ) group by OTE.DEALERMASTERCODE,OTE.ChassisNo
Thanks a lot in advance for helping and giving a solution very quickly ///
After taking a look at this code it seems possible there was an omitted JOIN condition in the last SELECT statement. In the code provided the JOIN condition is only on ChassisNo. The GROUP BY in the prior queries which populates the temporary table also included the DEALERMASTERCODE column. I'm thinking DEALERMASTERCODE should be added to the JOIN condition. Something like this
select OTE.DEALERMASTERCODE ,OTE.ChassisNo , count(*) OTHERYRCON
into #OTHERYRCON
from [dbo].[T_RE_POLICY_TRANSACTION] OTE
INNER JOIN #OTHERYRBASEEXPIRY EXP ON OTE.DEALERMASTERCODE=EXP.DEALERMASTERCODE
and OTE.ChassisNo=EXP.ChassisNo
where cast(CREATED_DATE as date) between '2020-06-01' and '2020-12-31'
and BusinessType='jcb'
and OTE.BASIC_PREM_TOTAL <> 0
AND OTE.PolicyStatus = 'Renewal'
AND (ltrim(rtrim(ISCANCELLEDSTATUS)) = 0 )
group by OTE.DEALERMASTERCODE,OTE.ChassisNo;

How to solve a nested aggregate function in SQL?

I'm trying to use a nested aggregate function. I know that SQL does not support it, but I really need to do something like the below query. Basically, I want to count the number of users for each day. But I want to only count the users that haven't completed an order within a 15 days window (relative to a specific day) and that have completed any order within a 30 days window (relative to a specific day). I already know that it is not possible to solve this problem using a regular subquery (it does not allow to change subquery values for each date). The "id" and the "state" attributes are related to the orders. Also, I'm using Fivetran with Snowflake.
SELECT
db.created_at::date as Date,
count(case when
(count(case when (db.state = 'finished')
and (db.created_at::date between dateadd(day,-15,Date) and dateadd(day,-1,Date)) then db.id end)
= 0) and
(count(case when (db.state = 'finished')
and (db.created_at::date between dateadd(day,-30,Date) and dateadd(day,-16,Date)) then db.id end)
> 0) then db.user end)
FROM
data_base as db
WHERE
db.created_at::date between '2020-01-01' and dateadd(day,-1,current_date)
GROUP BY Date
In other words, I want to transform the below query in a way that the "current_date" changes for each date.
WITH completed_15_days_before AS (
select
db.user as User,
count(case when db.state = 'finished' then db.id end) as Completed
from
data_base as db
where
db.created_at::date between dateadd(day,-15,current_date) and dateadd(day,-1,current_date)
group by User
),
completed_16_days_before AS (
select
db.user as User,
count(case when db.state = 'finished' then db.id end) as Completed
from
data_base as db
where
db.created_at::date between dateadd(day,-30,current_date) and dateadd(day,-16,current_date)
group by User
)
SELECT
date(db.created_at) as Date,
count(distinct case when comp_15.completadas = 0 and comp_16.completadas > 0 then comp_15.user end) as "Total Users Churn",
count(distinct case when comp_15.completadas > 0 then comp_15.user end) as "Total Users Active",
week(Date) as Week
FROM
data_base as db
left join completadas_15_days_before as comp_15 on comp_15.user = db.user
left join completadas_16_days_before as comp_16 on comp_16.user = db.user
WHERE
db.created_at::date between '2020-01-01' and dateadd(day,-1,current_date)
GROUP BY Date
Does anyone have a clue on how to solve this puzzle? Thank you very much!
The following should give you roughly what you want - difficult to test without sample data but should be a good enough starting point for you to then amend it to give you exactly what you want.
I've commented to the code to hopefully explain what each section is doing.
-- set parameter for the first date you want to generate the resultset for
set start_date = TO_DATE('2020-01-01','YYYY-MM-DD');
-- calculate the number of days between the start_date and the current date
set num_days = (Select datediff(day, $start_date , current_date()+1));
--generate a list of all the dates from the start date to the current date
-- i.e. every date that needs to appear in the resultset
WITH date_list as (
select
dateadd(
day,
'-' || row_number() over (order by null),
dateadd(day, '+1', current_date())
) as date_item
from table (generator(rowcount => ($num_days)))
)
--Create a list of all the orders that are in scope
-- i.e. 30 days before the start_date up to the current date
-- amend WHERE clause to in/exclude records as appropriate
,order_list as (
SELECT created_at, rt_id
from data_base
where created_at between dateadd(day,-30,$start_date) and current_date()
and state = 'finished'
)
SELECT dl.date_item
,COUNT (DISTINCT ol30.RT_ID) AS USER_COUNT
,COUNT (ol30.RT_ID) as ORDER_COUNT
FROM date_list dl
-- get all orders between -30 and -16 days of each date in date_list
left outer join order_list ol30 on ol30.created_at between dateadd(day,-30,dl.date_item) and dateadd(day,-16,dl.date_item)
-- exclude records that have the same RT_ID as in the ol30 dataset but have a date between 0 amd -15 of the date in date_list
WHERE NOT EXISTS (SELECT ol15.RT_ID
FROM order_list ol15
WHERE ol30.RT_ID = ol15.RT_ID
AND ol15.created_at between dateadd(day,-15,dl.date_item) and dl.date_item)
GROUP BY dl.date_item
ORDER BY dl.date_item;

What's the proper SQL query to find a 'status change' before given date?

I have a table of logged 'status changes'. I need to find the latest status change for a user, and if it was a) a certain 'type' of status change (s.new_status_id), and b) greater than 7 days old (s.change_date), then include it in the results. My current query sometimes returns the second-to-latest status change for a given user, which I don't want -- I only want to evaluate the last one.
How can I modify this query so that it will only include a record if it is the most recent status change for that user?
Query
SELECT DISTINCT ON (s.applicant_id) s.applicant_id, a.full_name, a.email_address, u.first_name, s.new_status_id, s.change_date, a.applied_class
FROM automated_responses_statuschangelogs s
INNER JOIN application_app a on (a.id = s.applicant_id)
INNER JOIN accounts_siuser u on (s.person_who_modified_id = u.id)
WHERE now() - s.change_date > interval '7' day
AND s.new_status_id IN
(SELECT current_status
FROM application_status
WHERE status_phase_id = 'In The Flow'
)
ORDER BY s.applicant_id, s.change_date DESC, s.new_status_id, s.person_who_modified_id;
You can use row_number() to filter one entry per applicant:
select *
from (
select row_number() over (partition by applicant_id
order by change_date desc) rn
, *
from automated_responses_statuschangelogs
) as lc
join application_app a
on a.id = lc.applicant_id
join accounts_siuser u
on lc.person_who_modified_id = u.id
join application_status stat
on lc.new_status_id = stat.current_status
where lc.rn = 1
and stat.status_phase_id = 'In The Flow'
and lc.change_date < now() - interval '7' day