Combining queries into one final table with business date calculation - sql

I am new to SQL and I have two measures that were built in Tableau. The measures are as follows:
{ FIXED [Client ID]: MAX(IF [Appt Status] = 'Complete' AND [Appointment Type Roll-Up] = '1' THEN [Appt Date]END)}
{ FIXED [Client ID]: MIN(IF[Appointment Status Roll-UP]='Active' and [Appt Date] > [Last Direct Trmt Appt] then [Appt Date]END)}
I converted these measures into SQL as follows:
select
a.client_id,
Max(a.appointment_date) as max_date
From #Last_Direct_treatment A
where a.[appointment_status] = 'Complete'
and a.[Appointment_type_roll_up] = '1'
Group by client_id
select
a.client_id,
min(a.appointment_date) as min_date
From #Last_Direct_treatment A
where a.[Appointment_Status_Roll_up] = 'active'
and a.appointment_date > a.last_direct_only_date
Group by client_id
In SQL, I already created the first query looking for my main fields:
SELECT
s.[appointment_id]
,c.[full_name]
,s.[client_id]
,s.[case_number]
,s.[appointment_date]
,s.[appointment_type]
,s.[appointment_status]
,ca.[last_direct_only_date]
,t.[authorization_status]
,t.[service_type]
,case
when s.appointment_status = '*_Cancelled' then 'Cancelled'
when s.appointment_status = '*_Complete' then 'Complete'
when s.appointment_status = 'Complete' then 'Complete'
else 'Active'
end as Appointment_Status_Roll_up
,case when
s.[appointment_type] like '%IND%' or s.[appointment_type] like '%indirect%' then '0' else '1'
end as Appointment_type_roll_up
Into #Last_Direct_treatment
FROM [appointment] s
INNER JOIN [client] c
ON s.[client_id] = c.[client_id]
INNER JOIN [client_case] ca
ON c.[client_id] = ca.[client_id]
INNER JOIN [authorization] t
ON ca.[case_number] = t.[case_number]
ORDER BY appointment_date DESC
The following queries successfully find the min and max dates I need.
My desired result is one final table that keeps the fields from the first query, adds these two dates as the last two columns, and adds one more column holding the number of business days between them.
select
a.client_id,
Max(a.appointment_date) as max_date
From #Last_Direct_treatment A
where a.[appointment_status] = 'Complete'
and a.[Appointment_type_roll_up] = '1'
Group by client_id
select
a.client_id,
min(a.appointment_date) as min_date
From #Last_Direct_treatment A
where a.[Appointment_Status_Roll_up] = 'active'
and a.appointment_date > a.last_direct_only_date
Group by client_id

The original measures resemble aggregated case expressions rather than a set of where clause predicates: the condition sits inside the aggregate, e.g.
MAX(CASE WHEN [Appt Status] = 'Complete'
AND [Appointment Type Roll-Up] = '1' THEN [Appt Date] END)
MIN(CASE WHEN [Appointment Status Roll-UP] = 'Active'
AND [Appt Date] > [Last Direct Trmt Appt] THEN [Appt Date] END)
and in a combined query these might be as follows (the aggregate wraps the case expression, and the query groups by client_id):
SELECT
a.client_id
, MAX(CASE
WHEN a.[appointment_status] = 'Complete'
AND a.[Appointment_type_roll_up] = '1'
THEN a.appointment_date
END) AS max_date
, MIN(CASE
WHEN a.[Appointment_Status_Roll_up] = 'active'
AND a.appointment_date > a.last_direct_only_date
THEN a.appointment_date
END) AS min_date
FROM #Last_Direct_treatment A
GROUP BY a.client_id
In the unlikely event that you need a single column instead, note that a case expression can test several sets of conditions in turn, e.g. at row level:
SELECT
a.client_id
, CASE
WHEN a.[appointment_status] = 'Complete'
AND a.[Appointment_type_roll_up] = '1'
THEN a.appointment_date
WHEN a.[Appointment_Status_Roll_up] = 'active'
AND a.appointment_date > a.last_direct_only_date
THEN a.appointment_date
-- ELSE ...
END AS matched_appointment_date
FROM #Last_Direct_treatment A
These condition sets are evaluated top to bottom. The first one that applies determines the result, and the condition sets after it are not evaluated. If no condition set applies, the ELSE value is returned, or NULL when there is no ELSE.
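As a rough check of the combined query, here is a minimal sketch that runs it through Python's built-in sqlite3 module. The table contents are made-up sample rows, the temp table is replaced by an ordinary table called last_direct_treatment, and 'Active' is matched case-sensitively because SQLite, unlike a default SQL Server collation, compares text that way. The business_days_between helper is a hypothetical addition for the business-day column the question asks about; it counts Monday to Friday only and assumes no holiday calendar.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE last_direct_treatment (
        client_id INTEGER,
        appointment_date TEXT,
        appointment_status TEXT,
        appointment_status_roll_up TEXT,
        appointment_type_roll_up TEXT,
        last_direct_only_date TEXT
    )
""")
# Made-up sample rows (assumption): client 1 has direct visits, client 2 only an indirect one.
conn.executemany(
    "INSERT INTO last_direct_treatment VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "2023-01-05", "Complete", "Complete", "1", "2023-01-10"),
        (1, "2023-01-10", "Complete", "Complete", "1", "2023-01-10"),
        (1, "2023-01-20", "Scheduled", "Active", "1", "2023-01-10"),
        (1, "2023-02-01", "Scheduled", "Active", "1", "2023-01-10"),
        (2, "2023-03-01", "Complete", "Complete", "0", None),
    ],
)

# Conditional aggregation: the CASE sits inside MAX/MIN, one output row per client.
per_client = conn.execute("""
    SELECT client_id,
           MAX(CASE WHEN appointment_status = 'Complete'
                     AND appointment_type_roll_up = '1'
                    THEN appointment_date END) AS max_date,
           MIN(CASE WHEN appointment_status_roll_up = 'Active'
                     AND appointment_date > last_direct_only_date
                    THEN appointment_date END) AS min_date
    FROM last_direct_treatment
    GROUP BY client_id
    ORDER BY client_id
""").fetchall()


def business_days_between(start_iso, end_iso):
    """Count Mon-Fri days after start, up to and including end (no holidays)."""
    start, end = date.fromisoformat(start_iso), date.fromisoformat(end_iso)
    days = (start + timedelta(days=n) for n in range(1, (end - start).days + 1))
    return sum(1 for d in days if d.weekday() < 5)


# Append the business-day gap; leave it empty when either date is missing.
final_rows = [
    (cid, mx, mn, business_days_between(mx, mn) if mx and mn else None)
    for cid, mx, mn in per_client
]
print(final_rows)
```

The two dates could then be joined back to the detail rows on client_id to get the final table with the original fields.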

Related

SQL question: how to compare a column against itself with time dependence

I have a table with emails and dates. I want to compare 2022's emails against 2021's emails. I have the code below, which is pretty straightforward. But now I want to add 2022's emails to the "bucket" as time passes. So Jan-2022 emails will be compared to 2021's emails, Feb-2022 emails will be compared to 2021's emails + Jan-2022, and so on.
Any helpful advice would be greatly appreciated.
SELECT T1.date
,IFF(T1.email = T2.email,'TRUE','FALSE') "Logic"
,SUM(CASE WHEN "Logic" = 'FALSE' THEN 1 ELSE 0 END) "New email"
,SUM(CASE WHEN "Logic" = 'TRUE' THEN 1 ELSE 0 END) "Repeated email"
FROM T1
LEFT JOIN
(SELECT DISTINCT email
,date
FROM T1
WHERE
AND "date" >= '2021-01-01'
AND "date" <= '2021-12-31') T2
ON T1.email = T2.email
WHERE T1.date >= '2022-01-01'
AND T1.date <= '2022-12-31'
GROUP BY 1,2);
If I understand you, you want to count the number of emails per day; those that were new and those that you had seen before on a prior day.
If this is right, I would take a completely different approach and use the count window/analytic function for each email:
select
email, date,
count (*) over (partition by email order by date) as email_count
from t1
This will tell you for each email how many times total (chronologically) it has occurred, which means a count of 1 means it's the first time and anything else means it's a repeat.
From there, you can just do a normal grouping:
with foo as (
select
email, date,
count (*) over (partition by email order by date) as email_count
from t1
)
select
date,
sum (case when email_count > 1 then 1 else 0 end) as repeat_count,
sum (case when email_count = 1 then 1 else 0 end) as new_count
from foo
group by date
order by date
You can apply a date filter in the final query, as long as the CTE itself stays unfiltered so that the running count sees every earlier occurrence of each email.
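To see the running-count idea on concrete data, here is a small sketch using Python's sqlite3 module (window functions need SQLite 3.25 or newer). The rows are invented for illustration, and each email appears at most once per date; if an email could appear twice on its first day, both rows would share the same count and be treated as repeats.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (email TEXT, date TEXT)")
# Invented sample data (assumption): two addresses seen in 2021, four rows in 2022.
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?)",
    [
        ("a@x.com", "2021-06-01"),
        ("b@x.com", "2021-07-15"),
        ("a@x.com", "2022-01-10"),  # seen before, so a repeat
        ("c@x.com", "2022-01-10"),  # first sighting, so new
        ("c@x.com", "2022-02-03"),  # repeat of the January sighting
        ("d@x.com", "2022-02-03"),  # new
    ],
)

# The CTE counts over ALL rows; the date filter is applied only afterwards.
counts = conn.execute("""
    WITH foo AS (
        SELECT email, date,
               COUNT(*) OVER (PARTITION BY email ORDER BY date) AS email_count
        FROM t1
    )
    SELECT date,
           SUM(CASE WHEN email_count > 1 THEN 1 ELSE 0 END) AS repeat_count,
           SUM(CASE WHEN email_count = 1 THEN 1 ELSE 0 END) AS new_count
    FROM foo
    WHERE date >= '2022-01-01'
    GROUP BY date
    ORDER BY date
""").fetchall()
print(counts)
```

Moving the WHERE clause inside the CTE would hide the 2021 rows from the window function, and a@x.com would then look new in January 2022.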
It's not clear what you're working with, as the SQL statement is incomplete / invalid (mismatched parentheses), and the data granularity is also unclear: is it daily or monthly data?
But I think the problem you are having is that you're only determining "new" emails based on whether you saw the email address last year, where you need to determine it based on whether you've seen the address at all, e.g. yesterday or any day prior.
To do this you'll want to (1) avoid filtering the T2 "list of emails we've seen" subquery by date, and (2) use a conditional date clause when joining T2 back to T1, like this:
SELECT T1.date
,IFF(T1.email = T2.email,'TRUE','FALSE') "Logic"
,SUM(CASE WHEN "Logic" = 'FALSE' THEN 1 ELSE 0 END) "New email"
,SUM(CASE WHEN "Logic" = 'TRUE' THEN 1 ELSE 0 END) "Repeated email"
FROM T1
LEFT JOIN
(SELECT DISTINCT email
,date
FROM T1
-- removed: the 2021-only date filter that used to be here
) T2
ON T1.email = T2.email
-- added: only match sightings of the same email on an earlier date
-- (kept in the ON clause so unmatched rows survive the LEFT JOIN)
AND T1.date > T2.date
WHERE T1.date >= '2022-01-01'
AND T1.date <= '2022-12-31'
GROUP BY 1,2;
The key here is that the join condition doesn't need to be an equality join; it can be any expression or function that evaluates to a boolean TRUE/FALSE. So for every email and date, you can compare it to every matching email on a previous (historical) date.
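The non-equality join condition can be tried out with a short sketch in Python's sqlite3 module. The data is invented, and the query is a simplified variant rather than the exact statement above: it counts how many earlier dates each 2022 row matches, which also shows that the join returns one row per earlier sighting, so the matches need collapsing (here with GROUP BY) before rows are classified as new or repeated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (email TEXT, date TEXT)")
# Invented sample data (assumption).
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?)",
    [
        ("a@x.com", "2021-06-01"),
        ("a@x.com", "2022-01-10"),
        ("a@x.com", "2022-02-03"),
        ("c@x.com", "2022-01-10"),
    ],
)

# Join each 2022 row to every EARLIER sighting of the same email.
# The date comparison lives in ON, so rows with no earlier sighting are kept.
sightings = conn.execute("""
    SELECT t1.date, t1.email, COUNT(t2.date) AS earlier_sightings
    FROM t1
    LEFT JOIN (SELECT DISTINCT email, date FROM t1) t2
           ON t1.email = t2.email
          AND t1.date > t2.date
    WHERE t1.date >= '2022-01-01'
    GROUP BY t1.date, t1.email
    ORDER BY t1.date, t1.email
""").fetchall()

# An email is "new" on a date when it has no earlier sightings at all.
labels = [(d, e, "new" if n == 0 else "repeated") for d, e, n in sightings]
print(labels)
```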

Why can't I correctly group data in SQL?

I am trying to pull data and format with these headers in this order.
I'm pulling the data from Snowflake using SQL, and in the table I'm pulling the data from, the PD_amt and CN_amt are listed as separate transactions. Currently, when I pull the data, it still shows up as two separate lines with null values rather than being correctly grouped into a single row under a single user id; you can see it here in the output (I highlighted a few rows to show the issue).
select atr.user_id
, date_trunc('day', atr.trans_date) trans_day
, atr.year_month
, case when atr.trans_type = 'PD' then atr.trans_id end as PD_transaction_id
, case when atr.trans_type = 'PD' then sum(atr.amount) end as PD_amt
, case when atr.trans_type = 'CN' then atr.trans_id end as CN_transaction_id
, case when atr.trans_type = 'CN' then sum(atr.amount) end as CN_amt
from wisen_data.sm_account_trans atr
where ((trans_type = 'CN' and trans_sub_type = 'HSNMLP')
or trans_type in ('PP','PD'))
and atr.status not in ('VOIDED','FAILED')
and atr.trans_date >= date_trunc('month', current_date-3*365)
group by atr.user_id, trans_day, atr.year_month, atr.trans_type, atr.trans_id
order by trans_day
I'm not super proficient with SQL so I'm hoping to get some quick help to get this to work. Thank you!
Use conditional aggregation. The case expression is the argument to the sum():
select atr.user_id,
date_trunc('day', atr.trans_date) as trans_day,
atr.year_month,
sum(case when atr.trans_type = 'PD' then atr.amount end) as PD_amt,
sum(case when atr.trans_type = 'CN' then atr.amount end) as CN_amt
from wisen_data.sm_account_trans atr
where ((trans_type = 'CN' and trans_sub_type = 'HSNMLP')
or trans_type in ('PP','PD')
) and
atr.status not in ('VOIDED','FAILED') and
atr.trans_date >= date_trunc('month', current_date-3*365)
group by atr.user_id, trans_day, atr.year_month
order by trans_day;
I removed PD_transaction_id and CN_transaction_id because I'm not sure what these are supposed to be.
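A quick way to see why this collapses the two lines into one is to run the same pattern on a few rows. The sketch below uses Python's sqlite3 module with invented data and a simplified table, with SQLite's date() standing in for date_trunc('day', ...). It also shows one possible way to bring the transaction ids back, by wrapping them in MAX(CASE ...); that only makes sense under the assumption that a user has at most one PD and one CN transaction per day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sm_account_trans (
        trans_id TEXT, user_id INTEGER, trans_date TEXT,
        trans_type TEXT, amount REAL
    )
""")
# Invented sample data (assumption): user 1 has a PD and a CN on the same day.
conn.executemany(
    "INSERT INTO sm_account_trans VALUES (?, ?, ?, ?, ?)",
    [
        ("T1", 1, "2023-05-01 09:00", "PD", 100.0),
        ("T2", 1, "2023-05-01 17:30", "CN", -20.0),
        ("T3", 2, "2023-05-01 10:00", "PD", 50.0),
    ],
)

# trans_type and trans_id are NOT in the GROUP BY, so both types land on one row.
rows = conn.execute("""
    SELECT user_id,
           date(trans_date) AS trans_day,
           MAX(CASE WHEN trans_type = 'PD' THEN trans_id END) AS pd_transaction_id,
           SUM(CASE WHEN trans_type = 'PD' THEN amount END)   AS pd_amt,
           MAX(CASE WHEN trans_type = 'CN' THEN trans_id END) AS cn_transaction_id,
           SUM(CASE WHEN trans_type = 'CN' THEN amount END)   AS cn_amt
    FROM sm_account_trans
    GROUP BY user_id, date(trans_date)
    ORDER BY user_id
""").fetchall()
print(rows)
```

With several transactions of one type on the same day, MAX would keep only one id, so a different design (for example a list aggregate) would be needed.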

Add A column that gives a NO or Yes Based on Date [Borrowed Date]

Basically, I want a column that appears beside [Borrowed Date].
This column will be called Status and will have two values.
If today's date is 3 days past [Borrowed Date], Status should show Overdue; otherwise it should show Not Overdue.
Here is my code.
SELECT dbo.borrowing.[book_id],
dbo.bookregistration.[book_title],
dbo.bookregistration.[book_category],
dbo.bookregistration.[book_type],
dbo.bookregistration.edition,
dbo.borrowing.[borrowed_date],
dbo.borrowing.[adm_no]
FROM dbo.bookregistration
INNER JOIN dbo.borrowing ON dbo.bookregistration.[book_id] = dbo.borrowing.[book_id]
Here's your complete query. First, it is shortened with table aliases; second, a CASE expression adds the Status column.
SELECT t2.[Book_ID]
, t1.[Book_Title]
, t1.[Book_Category]
, t1.[Book_Type]
, t1.Edition
, case when t2.[Borrowed_Date] < getdate() - 3 then 'OverDue' else 'Not Overdue' end as [Status]
, t2.[Adm_NO]
FROM dbo.BookRegistration t1
INNER JOIN dbo.Borrowing t2 ON t1.[Book_ID] = t2.[Book_ID]
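You can use DATE_ADD and a CASE expression to meet your requirement, as below. Note that DATE_ADD(..., INTERVAL 3 DAY) and NOW() are MySQL syntax; on SQL Server the equivalents are DATEADD(day, 3, ...) and GETDATE().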
SELECT
CASE
WHEN DATE_ADD([Borrowed Date], INTERVAL 3 DAY) > NOW() THEN 'Not Overdue'
ELSE 'Overdue'
END Status
FROM dbo.BookRegistration
INNER JOIN dbo.Borrowing
ON dbo.BookRegistration.[Book_ID] = dbo.Borrowing.[Book_ID]
SELECT
dbo.Borrowing.[Book_ID],
dbo.BookRegistration.[Book_Title],
dbo.BookRegistration.[Book_Category],
dbo.BookRegistration.[Book_Type],
dbo.BookRegistration.Edition,
dbo.Borrowing.[Borrowed_Date],
dbo.Borrowing.[Adm_NO],
Case
when
DATEDIFF(SYSDATE(), Borrowed_Date) >= 3
then
'OVERDUE'
ELSE
'NOT OVERDUE'
END
AS STATUS
FROM
dbo.BookRegistration
INNER JOIN
dbo.Borrowing
ON dbo.BookRegistration.[Book_ID] = dbo.Borrowing.[Book_ID]
Here we go:
SELECT dbo.Borrowing.[Book_ID], dbo.BookRegistration.[Book_Title]
, dbo.BookRegistration.[Book_Category], dbo.BookRegistration.[Book_Type]
, dbo.BookRegistration.Edition, dbo.Borrowing.[Borrowed_Date] AS [Borrowed_Date]
, case when DATEADD(DD,3,[Borrowed_Date]) > GETDATE() THEN 'Not Overdue' ELSE 'Overdue' END AS Status
, dbo.Borrowing.[Adm_NO]
FROM dbo.BookRegistration
INNER JOIN dbo.Borrowing ON dbo.BookRegistration.[Book_ID] = dbo.Borrowing.[Book_ID]
Try this:
SELECT
B.[Book_ID]
, BR.[Book_Title]
, BR.[Book_Category]
, BR.[Book_Type]
, BR.Edition
, CASE WHEN DATEADD(day, 3, B.[Borrowed_Date]) > GETDATE() THEN 'Not Overdue' ELSE 'Overdue' END AS Status
, B.[Adm_NO]
FROM dbo.BookRegistration BR
INNER JOIN dbo.Borrowing B ON BR.[Book_ID] = B.[Book_ID]
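The answers above do not all agree on the boundary case, which a small experiment makes visible. The sketch below uses Python's sqlite3 module with invented rows and a fixed "today" parameter (instead of GETDATE()) so the result is reproducible. It compares the "borrowed date plus 3 days" test with the "borrowed date earlier than today minus 3 days" test on date-only values; on SQL Server the time portion returned by GETDATE() would shift the boundary slightly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE borrowing (book_id INTEGER, borrowed_date TEXT)")
# Invented sample data (assumption): borrowed 10, 3 and 2 days before "today".
conn.executemany(
    "INSERT INTO borrowing VALUES (?, ?)",
    [(1, "2024-03-01"), (2, "2024-03-08"), (3, "2024-03-09")],
)

today = "2024-03-11"  # fixed stand-in for the current date

statuses = conn.execute("""
    SELECT book_id,
           -- overdue once 3 full days have passed (inclusive boundary)
           CASE WHEN date(borrowed_date, '+3 days') > :today
                THEN 'Not Overdue' ELSE 'Overdue' END AS status_inclusive,
           -- overdue only after MORE than 3 days (strict boundary)
           CASE WHEN borrowed_date < date(:today, '-3 days')
                THEN 'Overdue' ELSE 'Not Overdue' END AS status_strict
    FROM borrowing
    ORDER BY book_id
""", {"today": today}).fetchall()
print(statuses)
```

Book 2, borrowed exactly three days earlier, is the only row where the two expressions disagree, so it is worth deciding which rule the library actually wants.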

compare two different date ranges sales in two columns

I want to compare sales for two different date ranges in two columns. I am using the query below, but it gives the wrong sales figures. Please correct my query.
select s1.Itm_cd,s1.Itm_Name,Sum(S1.amount),Sum(s2.amount)
from salestrans s1,salestrans s2
where s1.Itm_cd = S2.Itm_cd
and S1.Tran_dt between '20181101' and'20181130'
and S2.Tran_dt between '20171101' and '20171130'
group by s1.Itm_cd,s1.Itm_Name
Order by s1.Itm_cd
I suspect that you want conditional aggregation here:
WITH cte AS (
SELECT
s1.Itm_cd,
s1.Itm_Name,
SUM(CASE WHEN s1.Tran_dt BETWEEN '20181101' AND '20181130'
THEN s1.amount ELSE 0 END) AS sum_2018,
SUM(CASE WHEN s1.Tran_dt BETWEEN '20171101' AND '20171130'
THEN s1.amount ELSE 0 END) AS sum_2017
FROM salestrans s1
GROUP BY
s1.Itm_cd,
s1.Itm_Name
)
SELECT
Itm_cd,
Itm_Name,
sum_2018,
sum_2017,
CASE WHEN COALESCE(sum_2017, 0) <> 0
THEN FORMAT(100.0 * (sum_2018 - sum_2017) / sum_2017, 'N', 'en-us')
ELSE 'NA' END AS growth_pct
FROM cte
ORDER BY
Itm_cd;
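One likely reason the original self-join gives wrong totals is row multiplication: every 2018 row for an item is paired with every 2017 row for the same item before the sums are taken. The sketch below, using Python's sqlite3 module and invented rows with dates stored as YYYYMMDD text, runs both the self-join and the conditional aggregation so the difference can be compared directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE salestrans (Itm_cd TEXT, Itm_Name TEXT, Tran_dt TEXT, amount INTEGER)"
)
# Invented sample data (assumption): 2 sales in Nov 2018 (total 30), 3 in Nov 2017 (total 20).
conn.executemany(
    "INSERT INTO salestrans VALUES (?, ?, ?, ?)",
    [
        ("A", "Apple", "20181105", 10),
        ("A", "Apple", "20181120", 20),
        ("A", "Apple", "20171110", 5),
        ("A", "Apple", "20171112", 7),
        ("A", "Apple", "20171125", 8),
    ],
)

# Self-join from the question: 2 x 3 = 6 joined rows, so each sum is inflated.
self_join = conn.execute("""
    SELECT s1.Itm_cd, s1.Itm_Name, SUM(s1.amount), SUM(s2.amount)
    FROM salestrans s1, salestrans s2
    WHERE s1.Itm_cd = s2.Itm_cd
      AND s1.Tran_dt BETWEEN '20181101' AND '20181130'
      AND s2.Tran_dt BETWEEN '20171101' AND '20171130'
    GROUP BY s1.Itm_cd, s1.Itm_Name
""").fetchall()

# Conditional aggregation: a single pass over the table, each row counted once.
conditional = conn.execute("""
    SELECT Itm_cd, Itm_Name,
           SUM(CASE WHEN Tran_dt BETWEEN '20181101' AND '20181130'
                    THEN amount ELSE 0 END) AS sum_2018,
           SUM(CASE WHEN Tran_dt BETWEEN '20171101' AND '20171130'
                    THEN amount ELSE 0 END) AS sum_2017
    FROM salestrans
    GROUP BY Itm_cd, Itm_Name
""").fetchall()
print(self_join, conditional)
```

In this data the self-join reports 90 and 40 where the true totals are 30 and 20: the 2018 total is multiplied by the three 2017 rows and the 2017 total by the two 2018 rows.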
Please try the following
select s1.Itm_cd,s1.Itm_Name,Sum(S1.amount),Sum(s2.amount)
from salestrans s1,salestrans s2
where s1.Itm_cd = S2.Itm_cd
and Convert(Varchar(10),S1.Tran_dt,112) between '20181101' and'20181130'
and Convert(Varchar(10),S2.Tran_dt,112) between '20171101' and '20171130'
group by s1.Itm_cd,s1.Itm_Name
Order by s1.Itm_cd
The logic here is that the right-hand side of the comparison supplies only a date, with no separators and no time portion. The column on the left-hand side should be converted to the same format before comparing.
if(Convert(Varchar(10), getdate(),112) = '20181224')
print 'Matched'
else
print 'Not Matched'
if(getdate() = '20181224')
print 'Matched'
else
print 'Not Matched'
Here the output is Matched for the first check and Not Matched for the second, because only in the first case are both sides in the same format.

SQL Query: Cannot perform aggregate functions on sub queries

I have the following SQL query
SELECT
[Date],
DATENAME(dw,[Date]) AS Day,
SUM(CASE WHEN ChargeCode IN (SELECT ChargeCode FROM tblChargeCodes WHERE Chargeable = 1) THEN Units ELSE 0 END) ChargeableTotal,
SUM(CASE WHEN ChargeCode IN (SELECT ChargeCode FROM tblChargeCodes WHERE Chargeable = 0) THEN Units ELSE 0 END) NotChargeableTotal,
SUM(Units) AS TotalUnits
FROM
tblTimesheetEntries
WHERE
UserID = 'PJW'
AND Date >= '2013-01-01'
GROUP BY
[Date]
ORDER BY
[Date] DESC;
But I get the error message:
Cannot perform an aggregate function on an expression containing an aggregate or a subquery.
I assume this is because I am using subqueries inside the CASE expressions that are being summed.
How can I revise my query to get two sums of [Units], one for Chargeable = true and one for Chargeable = false, even though the Chargeable field is in a different table from all the other information? The two tables are linked by ChargeCode, which appears in both tblTimesheetEntries and tblChargeCodes.
Have you tried joining the tables on ChargeCode, like this:
SELECT e.[Date],
DATENAME(dw,e.[Date]) AS Day,
SUM(CASE WHEN c.Chargeable = 1 THEN e.Units ELSE 0 END) ChargeableTotal,
SUM(CASE WHEN c.Chargeable = 0 THEN e.Units ELSE 0 END) NotChargeableTotal,
SUM(e.Units) AS TotalUnits
FROM tblTimesheetEntries e
LEFT JOIN tblChargeCodes c
on e.ChargeCode = c.ChargeCode
WHERE e.UserID = 'PJW'
AND e.Date >= '2013-01-01'
GROUP BY e.[Date]
ORDER BY e.[Date] DESC;
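One detail of the LEFT JOIN version is worth checking on sample data: a timesheet entry whose ChargeCode has no row in tblChargeCodes still counts towards TotalUnits but lands in neither bucket. The sketch below uses Python's sqlite3 module with invented rows; the MISC code is deliberately missing from the lookup table, and the UserID filter is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblTimesheetEntries (Date TEXT, ChargeCode TEXT, Units INTEGER)")
conn.execute("CREATE TABLE tblChargeCodes (ChargeCode TEXT, Chargeable INTEGER)")
# Invented sample data (assumption); MISC has no matching charge code row.
conn.executemany(
    "INSERT INTO tblTimesheetEntries VALUES (?, ?, ?)",
    [("2013-01-02", "DEV", 5), ("2013-01-02", "ADMIN", 2), ("2013-01-02", "MISC", 1)],
)
conn.executemany(
    "INSERT INTO tblChargeCodes VALUES (?, ?)",
    [("DEV", 1), ("ADMIN", 0)],
)

# The join brings Chargeable onto each entry row, so no subquery is needed inside SUM.
totals = conn.execute("""
    SELECT e.Date,
           SUM(CASE WHEN c.Chargeable = 1 THEN e.Units ELSE 0 END) AS ChargeableTotal,
           SUM(CASE WHEN c.Chargeable = 0 THEN e.Units ELSE 0 END) AS NotChargeableTotal,
           SUM(e.Units) AS TotalUnits
    FROM tblTimesheetEntries e
    LEFT JOIN tblChargeCodes c ON e.ChargeCode = c.ChargeCode
    WHERE e.Date >= '2013-01-01'
    GROUP BY e.Date
    ORDER BY e.Date DESC
""").fetchall()
print(totals)
```

If the two buckets should always add up to the total, an INNER JOIN or an explicit rule for unknown codes would be needed.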