How to nest multiple case when expressions and add a condition - sql

I am trying to split the customers (contact_key column) who shopped in 2021 (A.TXN_MTH) into 'new' and 'returning', where 'returning' means they had not shopped in the 12 months before that purchase (months are stored as YYYYMM in the X.FISCAL_MTH_IDNT column).
I am using CASE WHEN A.TXN_MTH = MIN(X.FISCAL_MTH_IDNT) THEN 'NEW', which is correct. The next CASE WHEN should match when the latest month before A.TXN_MTH is 12 or more months earlier. I have put the 12-month condition in the WHERE clause. Should I be nesting three CASE WHENs instead of using WHERE?
SELECT
T.CONTACT_KEY
, A.TXN_MTH
, CASE WHEN A.TXN_MTH = MIN(X.FISCAL_MTH_IDNT) THEN 'NEW'
WHEN (MAX(CASE WHEN X.FISCAL_MTH_IDNT < A.TXN_MTH THEN X.FISCAL_MTH_IDNT ELSE NULL END)) THEN 'RETURNING'
END AS CUST_TYPE
FROM B_TRANSACTION T
INNER JOIN B_TIME X
ON T.TRANSACTION_DT_KEY = X.DATE_KEY
INNER JOIN A
ON A.CONTACT_KEY = T.CONTACT_KEY AND A.BU_KEY = T.BU_KEY
WHERE (MAX(CASE WHEN X.FISCAL_MTH_IDNT < A.TXN_MTH THEN X.FISCAL_MTH_IDNT ELSE NULL END)) < A.TXN_MTH - (date_format(add_months(concat_ws('-',substr(yearmonth,1,4),substr(yearmonth,5,2),'01'),-12),'yyyyMM')
GROUP BY
T.CONTACT_KEY
, TXN_MTH;

You have not described your tables, but assuming fiscal_mth_idnt is a DATE column, you can use the LAG analytic function to find the previous row's value:
SELECT contact_key,
txn_mth,
CASE
WHEN prev_fiscal_mth_idnt IS NULL
THEN 'NEW'
WHEN ADD_MONTHS(prev_fiscal_mth_idnt, 12) < fiscal_mth_idnt
THEN 'RETURNING'
ELSE 'CURRENT'
END AS cust_type
FROM (
SELECT T.CONTACT_KEY,
A.TXN_MTH,
yearmonth,
X.FISCAL_MTH_IDNT,
LAG(X.FISCAL_MTH_IDNT) OVER (
PARTITION BY T.CONTACT_KEY
ORDER BY X.FISCAL_MTH_IDNT
) AS prev_fiscal_mth_idnt
FROM B_TRANSACTION T
INNER JOIN B_TIME X
ON T.TRANSACTION_DT_KEY = X.DATE_KEY
INNER JOIN A
ON A.CONTACT_KEY = T.CONTACT_KEY AND A.BU_KEY = T.BU_KEY
)
WHERE yearmonth LIKE '2021%';
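
The question describes the month column as YYYYMM rather than a DATE, so here is a minimal sketch of the same LAG idea under that assumption. The purchases table, its columns, and the sample rows are hypothetical, and SQLite (3.25 or later, for window functions) stands in for the real database. Each YYYYMM value is converted to a running month index so that gaps can be compared with plain subtraction; the threshold follows the question's "12 or more months" wording.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table: one row per customer per month with a purchase (YYYYMM).
    CREATE TABLE purchases (contact_key INTEGER, fiscal_mth INTEGER);
    INSERT INTO purchases VALUES
        (1, 202103),                 -- first purchase ever
        (2, 201911), (2, 202102),    -- 15-month gap
        (3, 202012), (3, 202101);    -- 1-month gap
""")

rows = conn.execute("""
    SELECT contact_key, fiscal_mth,
           CASE
               WHEN prev_idx IS NULL          THEN 'NEW'
               WHEN mth_idx - prev_idx >= 12  THEN 'RETURNING'
               ELSE 'CURRENT'
           END AS cust_type
    FROM (
        SELECT contact_key, fiscal_mth,
               -- year * 12 + month gives a running month index
               (fiscal_mth / 100) * 12 + fiscal_mth % 100 AS mth_idx,
               LAG((fiscal_mth / 100) * 12 + fiscal_mth % 100) OVER (
                   PARTITION BY contact_key ORDER BY fiscal_mth
               ) AS prev_idx
        FROM purchases
    )
    -- filter to 2021 only after LAG has seen the earlier history
    WHERE fiscal_mth BETWEEN 202101 AND 202112
    ORDER BY contact_key
""").fetchall()

for row in rows:
    print(row)
```

The 2021 filter sits in the outer query on purpose: applying it inside the subquery would hide the earlier purchases that LAG needs to see.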

Related

SQL query : transform rows to columns

Here's an example of my table.
I need to do a query that shows those IDs who have 0 as a fee on one of two months (11 or 12) or both.
So from the example, I need to show ID 1,3,4 but not 2, like on the screenshot below.
I tried the query below:
SELECT
t1.id, t1.month, t1.fee, t2.id, t2.month, t2.fee
FROM
table t1, table t2
WHERE t1.id = t2.id
AND t1.month = '11'
AND t2.month = '12'
AND (t1.fee = 0 OR t2.fee = 0);
But with this query, I only see ID 1,3 but not ID 4. I guess it's because of t1.id = t2.id but no idea how to do otherwise.
You can use conditional aggregation. In Postgres, this can make use of the filter syntax:
SELECT t.id,
11 as month,
MAX(t.fee) FILTER (WHERE t.month = 11) as fee_11,
12 as month,
MAX(t.fee) FILTER (WHERE t.month = 12) as fee_12
FROM t
GROUP BY t.id
HAVING MAX(t.fee) FILTER (WHERE t.month = 11) = 0 OR
MAX(t.fee) FILTER (WHERE t.month = 12) = 0;
Note: The two month columns are redundant.
You need conditional aggregation:
select id,
max(case when month=11 then fee end) as fee11,
max(case when month=12 then fee end) as fee12
from (
select * from table t1
where t1.id in ( select id from table where fee=0)
) a group by id
An ANSI SQL-compliant query:
SELECT id,
MAX(CASE WHEN MONTH = 11 THEN MONTH ELSE NULL END) AS month11,
MAX(CASE WHEN MONTH = 11 THEN fee ELSE NULL END) AS fee11,
MAX(CASE WHEN MONTH = 12 THEN MONTH ELSE NULL END) AS month12,
MAX(CASE WHEN MONTH = 12 THEN fee ELSE NULL END ) AS fee12
FROM t
GROUP BY id
HAVING ( MAX(CASE WHEN MONTH = 11 THEN fee ELSE NULL END) = 0 OR MAX(CASE WHEN MONTH = 12 THEN fee ELSE NULL END ) = 0 )
ORDER BY id
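
As a rough illustration of why conditional aggregation finds ID 4 where the self-join did not, here is a small sketch run against SQLite. The sample rows are made up (the original table was only shown as an image), and it assumes ID 4 simply has no row for month 11, which is one plausible reason the self-join dropped it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data; the real table was only available as a screenshot.
    CREATE TABLE t (id INTEGER, month INTEGER, fee INTEGER);
    INSERT INTO t VALUES
        (1, 11, 0), (1, 12, 5),
        (2, 11, 7), (2, 12, 9),
        (3, 11, 0), (3, 12, 0),
        (4, 12, 0);               -- assumed: no row for month 11
""")

rows = conn.execute("""
    SELECT id,
           MAX(CASE WHEN month = 11 THEN fee END) AS fee11,
           MAX(CASE WHEN month = 12 THEN fee END) AS fee12
    FROM t
    GROUP BY id
    HAVING MAX(CASE WHEN month = 11 THEN fee END) = 0
        OR MAX(CASE WHEN month = 12 THEN fee END) = 0
    ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```

A missing month shows up as NULL in its column instead of removing the ID, because the grouping is per id and does not require both months to exist.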

How to improve slow sql query with aggregate functions

I want to show the top ten customers with their sales and margin, where the customer was registered during this accounting year. The query takes about 65 seconds to run, which is not acceptable :-(
As you may see, I am not good at SQL and would be very happy for help improving the query.
SELECT Top 10
AcTr.R3, Actor.Nm,
SUM(CASE WHEN AcTr.AcNo<='3999' THEN AcAm*-1 ELSE 0 END) AS Sales ,
SUM(AcAm*-1) AS TB
FROM AcTr, Actor
WHERE (Actor.CustNo = AcTr.R3) AND
(Actor.CustNo <> '0') AND
(Actor.CreDt >= '20180901') AND
(Actor.CreDt <= '20190430') AND
AcTr.AcYr = '2018' AND
AcTr.AcPr <= '8' AND
AcTr.AcNo>='3000' AND
AcTr.AcNo <= '4999'
GROUP BY AcTr.R3, Actor.Nm
ORDER BY Sales DESC
Welcome to the community. You have a good start, but in future it is more helpful if you provide (as commented) the CREATE TABLE declarations so users know the actual data types. Not always required, but it helps.
As for your query layout, it is more common to show the JOIN syntax instead of WHERE showing relations between tables, but that comes in time and practice.
Indexes help and should be based on a combination of both WHERE/JOIN criteria AND Grouping fields. Also, if fields are numeric, then do not 'quote' them, just leave as numbers. For example, your AcYr, AcPr, AcNo. I would think that an account number really would be a string value vs number for accounting purposes.
I would suggest the following indexes on your tables
Table   Index
AcTr    ( AcYr, AcPr, AcNo, R3 )
Actor   ( CustNo, CreDt )
For the AcTr table, I put the filtering criteria first and R3 last to help optimize the GROUP BY. The Actor table is indexed by customer number, then CreDt (create date?). Is that really a string, or is it a date field? If it is a date, the criteria would be something like '2018-09-01' and '2019-04-30'.
select TOP 10
Actor.Nm,
PreSum.Sales,
PreSum.TB
from
( select
R3,
SUM(CASE WHEN AcTr.AcNo <= '3999'
THEN AcAm * -1 ELSE 0 END) AS Sales,
SUM( AcAm * -1) AS TB
from
Actr
where
AcTr.AcYr = 2018
AND AcTr.AcPr <= 8
AND AcTr.AcNo >= '3000'
AND AcTr.AcNo <= '4999'
GROUP BY
AcTr.R3 ) PreSum
JOIN Actor
on PreSum.R3 = Actor.CustNo
AND Actor.CustNo <> 0
AND Actor.CreDt >= '20180901'
AND Actor.CreDt <= '20190430'
order by
Sales DESC
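
To make the pre-aggregate-then-join shape concrete, here is a small sketch that can be run locally. It uses SQLite as a stand-in, so LIMIT replaces TOP, and the column types and sample rows are assumptions, since the real table definitions were not posted. It only demonstrates the query shape and the suggested indexes, not the actual speed-up on SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed column types; the real CREATE TABLE statements are unknown.
    CREATE TABLE Actor (CustNo INTEGER, Nm TEXT, CreDt TEXT);
    CREATE TABLE AcTr (R3 INTEGER, AcYr INTEGER, AcPr INTEGER,
                       AcNo TEXT, AcAm REAL);
    -- Indexes as suggested above: filter columns first, grouping column last.
    CREATE INDEX ix_actr  ON AcTr (AcYr, AcPr, AcNo, R3);
    CREATE INDEX ix_actor ON Actor (CustNo, CreDt);

    INSERT INTO Actor VALUES (1, 'Alpha', '20181001'),   -- created this year
                             (2, 'Beta',  '20170101');   -- created earlier
    INSERT INTO AcTr VALUES
        (1, 2018, 3, '3000', -100.0),
        (1, 2018, 4, '4100',   40.0),
        (2, 2018, 3, '3000', -500.0);
""")

rows = conn.execute("""
    SELECT Actor.Nm, PreSum.Sales, PreSum.TB
    FROM (
        -- Aggregate the transactions once per customer before joining.
        SELECT R3,
               SUM(CASE WHEN AcNo <= '3999' THEN AcAm * -1 ELSE 0 END) AS Sales,
               SUM(AcAm * -1) AS TB
        FROM AcTr
        WHERE AcYr = 2018 AND AcPr <= 8
          AND AcNo >= '3000' AND AcNo <= '4999'
        GROUP BY R3
    ) PreSum
    JOIN Actor
      ON PreSum.R3 = Actor.CustNo
     AND Actor.CreDt >= '20180901'
     AND Actor.CreDt <= '20190430'
    ORDER BY PreSum.Sales DESC
    LIMIT 10
""").fetchall()

for row in rows:
    print(row)
```

Only 'Alpha' comes back in this toy data set, because 'Beta' was created outside the date window even though it has the larger sales figure.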
Per your latest comment, here is a year-by-year comparison that drops the top-10 restriction for the given time period:
select
Actor.Nm,
PreSum.Sales2018,
PreSum.Sales2019,
PreSum.TB2018,
PreSum.TB2019
from
( select
AcTr.R3,
SUM(CASE WHEN AcTr.AcYr = 2018
AND AcTr.AcNo <= '3999'
THEN AcAm * -1 ELSE 0 END) AS Sales2018,
SUM(CASE WHEN AcTr.AcYr = 2019 AND AcTr.AcNo <= '3999'
THEN AcAm * -1 ELSE 0 END) AS Sales2019,
SUM( CASE WHEN AcTr.AcYr = 2018
THEN AcAm * -1 else 0 end ) AS TB2018,
SUM( CASE WHEN AcTr.AcYr = 2019
THEN AcAm * -1 else 0 end ) AS TB2019
from
Actr
where
AcTr.AcYr IN ( 2018, 2019 )
AND AcTr.AcPr <= 8
AND AcTr.AcNo >= '3000'
AND AcTr.AcNo <= '4999'
GROUP BY
AcTr.R3 ) PreSum
JOIN Actor
on PreSum.R3 = Actor.CustNo
AND Actor.CustNo <> 0
AND Actor.CreDt >= '20180901'
AND Actor.CreDt <= '20190430'
order by
PreSum.Sales2018 DESC

SQL group by count across two tables

I have got two tables called baseline and revisits
baseline
formid  NoOfIssues
1       3
2       4
3       5
revisits
id  formid  NoOfIssues  date        fid
1   2       4           5/06/2016   1
2   3       3           15/06/2016  1
3   1       4           20/07/2016  1
4   1       3           25/07/2016  1
5   2       5           28/07/2016  1
6   1       5           01/06/2016  1
7   3       8           21/02/2016  1
8   3       2           21/02/2016  2
These tables are joined by 'formid'.
I need to compare the number of issues in baseline against revisits (first revisit only) and get counts of reduced, increased, and equal.
From the tables above I expect the following: comparing NoOfIssues in the first revisit against the baseline for the same formid across all three baseline entries, no reductions were found, but 1 equal and 2 increased were.
Addition: if two revisits have the same date and the same formid, take the one with the lower fid. In the last two rows of the revisits table both formid and date are equal, so the row with the lower fid (1) should be used.
status     Count
reduced    0
equal      1
increased  2
I'm not familiar with intersystems-cache, but you can see if the following is valid SQL with that DB:
SELECT
CASE
WHEN BL.NoOfIssues = FR.NoOfIssues THEN 'equal'
WHEN BL.NoOfIssues > FR.NoOfIssues THEN 'reduced'
WHEN BL.NoOfIssues < FR.NoOfIssues THEN 'increased'
END AS status,
COUNT(*) AS Count
FROM
Baseline BL
INNER JOIN Revisits FR ON FR.formid = BL.formid
LEFT OUTER JOIN Revisits R ON
R.formid = BL.formid AND
(
R.date < FR.date OR
(R.date = FR.date AND R.fid < FR.fid)
)
WHERE
R.formid IS NULL
GROUP BY
CASE
WHEN BL.NoOfIssues = FR.NoOfIssues THEN 'equal'
WHEN BL.NoOfIssues > FR.NoOfIssues THEN 'reduced'
WHEN BL.NoOfIssues < FR.NoOfIssues THEN 'increased'
END
Some quick notes on your database though - You should probably decide on a standard of plural or singular table names and stick with it. Also, try to avoid common reserved words for object names, like date. Finally, if a revisit is basically the same as a visit, just on a later date then you should consider keeping them all in the same table.
I would do this in one row rather than three:
select sum(case when numissues = rcnt then 1 else 0 end) as equal,
sum(case when numissues > rcnt then 1 else 0 end) as reduced,
sum(case when numissues < rcnt then 1 else 0 end) as increased
from (select b.form_id, b.numissues, count(r.form_id) as rcnt
from baseline b left join
revisits r
on b.form_id = r.form_id
group by b.form_id, b.numissues
) br;
I would use a window function to get the number of issues at the minimum date, then compare that to the baseline number of issues:
select
case when baseline.NoOfIssues = rev.first_rev_issues then 'equal'
when baseline.NoOfIssues > rev.first_rev_issues then 'reduced'
when baseline.NoOfIssues < rev.first_rev_issues then 'increased'
end as status,
count(*) as count
from baseline
inner join(
select
formid,
case when date = min(date) over(partition by formid) then NoOfIssues else null end as first_rev_issues
from revisits
) rev
on baseline.formid = rev.formid
-- keep only rows from the earliest date; ties on that date are all counted
where rev.first_rev_issues is not null
group by
case when baseline.NoOfIssues = rev.first_rev_issues then 'equal'
when baseline.NoOfIssues > rev.first_rev_issues then 'reduced'
when baseline.NoOfIssues < rev.first_rev_issues then 'increased'
end
Or like this:
WITH CTE AS
(
SELECT
BL.FORMID,
BL.NOOFISSUES AS BLI,
RV.NOOFISSUES AS RVI,
RV.DATE,
ROW_NUMBER() OVER(PARTITION BY BL.FORMID ORDER BY RV.DATE, RV.FID) AS RN
FROM Baseline BL
INNER JOIN Revisits RV ON RV.FORMID = BL.FORMID
)
SELECT COALESCE(SUM(CASE WHEN RVI > BLI THEN 1 END), 0) AS INCREASED,
COALESCE(SUM(CASE WHEN RVI < BLI THEN 1 END), 0) AS DECREASED,
COALESCE(SUM(CASE WHEN RVI = BLI THEN 1 END), 0) AS EQUAL
FROM CTE
WHERE RN=1;
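
As a quick check of the ROW_NUMBER approach against the sample data from the question, here is a sketch using SQLite in place of the original database. The dates are rewritten in ISO format so that text ordering matches date ordering, the date column is renamed visit_date, and the result column names are made up for illustration. Ties on the date are broken by the lower fid, as the question asks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE baseline (formid INTEGER, NoOfIssues INTEGER);
    CREATE TABLE revisits (id INTEGER, formid INTEGER, NoOfIssues INTEGER,
                           visit_date TEXT, fid INTEGER);
    INSERT INTO baseline VALUES (1, 3), (2, 4), (3, 5);
    INSERT INTO revisits VALUES
        (1, 2, 4, '2016-06-05', 1), (2, 3, 3, '2016-06-15', 1),
        (3, 1, 4, '2016-07-20', 1), (4, 1, 3, '2016-07-25', 1),
        (5, 2, 5, '2016-07-28', 1), (6, 1, 5, '2016-06-01', 1),
        (7, 3, 8, '2016-02-21', 1), (8, 3, 2, '2016-02-21', 2);
""")

row = conn.execute("""
    WITH first_revisit AS (
        SELECT b.NoOfIssues AS bli,
               r.NoOfIssues AS rvi,
               -- rn = 1 marks the earliest revisit; lower fid wins a date tie
               ROW_NUMBER() OVER (PARTITION BY b.formid
                                  ORDER BY r.visit_date, r.fid) AS rn
        FROM baseline b
        JOIN revisits r ON r.formid = b.formid
    )
    SELECT COALESCE(SUM(CASE WHEN rvi < bli THEN 1 END), 0) AS n_reduced,
           COALESCE(SUM(CASE WHEN rvi = bli THEN 1 END), 0) AS n_equal,
           COALESCE(SUM(CASE WHEN rvi > bli THEN 1 END), 0) AS n_increased
    FROM first_revisit
    WHERE rn = 1
""").fetchone()

print(row)
```

With this data the counts come out as 0 reduced, 1 equal and 2 increased, which matches the result the question expects.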

sqlite select: grouping in one line, 12 entries

I have a table with fields:
Client_ID, Date, Value
where there is an entry for each of the 12 months of a year (i.e., 12 entries for each client).
I would like to create a table with just one row per Client_ID that contains all the values from each months. Something like:
Client_ID, Date_January, Value_January, Date_February, Value_February, ........, Date_December, Value_December
Can anyone help me with the query?
This is what I'm trying to do (not working...):
select
Client_Id,
case when ((strftime('%m', Date) = '01')) then
Date as Date_January,
Value as Value_January,
else null end
case when ((strftime('%m', Date) = '01')) then
Date as Date_February,
Value as Value_February,
else null end
....
from Test_Table
where
strftime('%Y', Date) = '2013'
;
First, you need to untangle your CASE constructs, because each CASE expression produces a single value. Use:
case
when ((strftime('%m', Date) = '01')) then Date
else null
end as Date_January,
case when ((strftime('%m', Date) = '01')) then Value
else null
end as Value_January,
Then, if you want one row per client, use GROUP BY Client_Id.
The third issue is how to aggregate all the Date_January columns into one row. If you really know that there is exactly one row per month per client, you can use MAX(), since it ignores NULLs and therefore returns the single non-NULL value:
select
Client_Id,
MAX(case
when ((strftime('%m', Date) = '01')) then Date
else null
end) as Date_January,
MAX(case when ((strftime('%m', Date) = '01')) then Value
else null
end) as Value_January,
MAX(case
when ((strftime('%m', Date) = '02')) then Date
else null
end) as Date_February,
MAX(case when ((strftime('%m', Date) = '02')) then Value
else null
end) as Value_February,
....
from Test_Table
where
strftime('%Y', Date) = '2013'
group by Client_Id;
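
Writing out twenty-four nearly identical MAX(CASE ...) columns by hand is error-prone, so one option is to generate them. The sketch below does that from Python with the standard sqlite3 module. The sample rows are hypothetical, and building the column list in a loop is a convenience added here, not part of the answer above.

```python
import sqlite3

MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Test_Table (Client_Id INTEGER, Date TEXT, Value REAL);
    -- Hypothetical rows; only two months are filled in to keep it short.
    INSERT INTO Test_Table VALUES
        (1, '2013-01-15', 10.0), (1, '2013-02-15', 20.0),
        (2, '2013-01-20', 30.0), (2, '2013-02-20', 40.0),
        (2, '2012-01-20', 99.0);   -- other year, filtered out
""")

# Build one Date_<month> / Value_<month> pair of columns per month.
columns = ",\n".join(
    f"MAX(CASE WHEN strftime('%m', Date) = '{i:02d}' THEN Date END) AS Date_{name},\n"
    f"MAX(CASE WHEN strftime('%m', Date) = '{i:02d}' THEN Value END) AS Value_{name}"
    for i, name in enumerate(MONTHS, start=1)
)

query = f"""
    SELECT Client_Id,
    {columns}
    FROM Test_Table
    WHERE strftime('%Y', Date) = '2013'
    GROUP BY Client_Id
    ORDER BY Client_Id
"""
rows = conn.execute(query).fetchall()

for row in rows:
    print(row[:5])   # Client_Id plus the January and February pairs
```

Months with no data for a client come back as NULL (None in Python) in both of that month's columns.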

SQL Query: Cannot perform aggregate functions on sub queries

I have the following SQL query
SELECT
[Date],
DATENAME(dw,[Date]) AS Day,
SUM(CASE WHEN ChargeCode IN (SELECT ChargeCode FROM tblChargeCodes WHERE Chargeable = 1) THEN Units ELSE 0 END) ChargeableTotal,
SUM(CASE WHEN ChargeCode IN (SELECT ChargeCode FROM tblChargeCodes WHERE Chargeable = 0) THEN Units ELSE 0 END) NotChargeableTotal,
SUM(Units) AS TotalUnits
FROM
tblTimesheetEntries
WHERE
UserID = 'PJW'
AND Date >= '2013-01-01'
GROUP BY
[Date]
ORDER BY
[Date] DESC;
But I get the error message:
Cannot perform an aggregate function on an expression containing an aggregate or a subquery.
This is because I am using subqueries inside the CASE expressions that are being summed.
How can I revise my query to get two sums of [Units], one for Chargeable = true and one for Chargeable = false, even though the Chargeable field is in a different table from all the other information? The two tables are linked by ChargeCode, which appears in both tblTimesheetEntries and tblChargeCodes.
Have you tried joining the tables on ChargeCode?
SELECT e.[Date],
DATENAME(dw,e.[Date]) AS Day,
SUM(CASE WHEN c.Chargeable = 1 THEN e.Units ELSE 0 END) ChargeableTotal,
SUM(CASE WHEN c.Chargeable = 0 THEN e.Units ELSE 0 END) NotChargeableTotal,
SUM(e.Units) AS TotalUnits
FROM tblTimesheetEntries e
LEFT JOIN tblChargeCodes c
on e.ChargeCode = c.ChargeCode
WHERE e.UserID = 'PJW'
AND e.Date >= '2013-01-01'
GROUP BY e.[Date]
ORDER BY e.[Date] DESC;
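
To see the join-based version produce the two sums, here is a small sketch using SQLite as a stand-in for SQL Server (so DATENAME is left out). The charge codes and timesheet rows are invented for illustration, and it assumes ChargeCode is unique in tblChargeCodes; if it were not, the join would duplicate timesheet rows and inflate the totals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblChargeCodes (ChargeCode TEXT PRIMARY KEY, Chargeable INTEGER);
    CREATE TABLE tblTimesheetEntries (UserID TEXT, Date TEXT,
                                      ChargeCode TEXT, Units REAL);
    -- Hypothetical sample data.
    INSERT INTO tblChargeCodes VALUES ('DEV', 1), ('ADMIN', 0);
    INSERT INTO tblTimesheetEntries VALUES
        ('PJW', '2013-01-02', 'DEV',   6.0),
        ('PJW', '2013-01-02', 'ADMIN', 2.0),
        ('PJW', '2013-01-03', 'DEV',   8.0),
        ('XYZ', '2013-01-02', 'DEV',   4.0);   -- other user, filtered out
""")

rows = conn.execute("""
    SELECT e.Date,
           SUM(CASE WHEN c.Chargeable = 1 THEN e.Units ELSE 0 END) AS ChargeableTotal,
           SUM(CASE WHEN c.Chargeable = 0 THEN e.Units ELSE 0 END) AS NotChargeableTotal,
           SUM(e.Units) AS TotalUnits
    FROM tblTimesheetEntries e
    LEFT JOIN tblChargeCodes c ON e.ChargeCode = c.ChargeCode
    WHERE e.UserID = 'PJW'
      AND e.Date >= '2013-01-01'
    GROUP BY e.Date
    ORDER BY e.Date DESC
""").fetchall()

for row in rows:
    print(row)
```

Because the Chargeable flag arrives through the join, each SUM only contains a plain CASE expression and no subquery, which is what the SQL Server error was complaining about.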