Trying to count number of complaint, cause, corrections - sql

I am trying to create a query with the following fields, that counts the number of Complaints, causes, and corrections show up in the data.
My current error: Select non-aggregate values must be part of the associated group.
I am very new to SQL queries, and am not sure what else I'm missing. All I've done is merged to queries, and seem to be missing something.
select L.Case_ID,
L.Case_Line_ID,
A.Dealer_ID,
M.DealerCode,
H.DealerName,
substr(L.Estimate_Created_At,1,7) as CaseMonth,
count(distinct L.Complaint) as Complaint,
count(distinct C.Cause) as Cause,
count(distinct C.Correction) as Correction
from Decisiv_Tables_Prod.Stg_Decisiv_LineItems L
join Decisiv_Tables_Prod.Stg_Decisiv_Cases A on L.Case_ID = A.Case_ID
join Decisiv_Tables_Prod.Rpt_DecisivDealerMap M on A.Dealer_ID = M.DecisivDealerID
and cast(substr(L.Estimate_Created_At,1,10) as date format 'YYYY-MM-DD') between M.EffectiveStartDate and coalesce(M.EffectiveEndDate, cast('2099-12-31' as date format 'YYYY-MM-DD'))
join Decisiv_Tables_Prod.Rpt_DealerDirectoryHierarchy H on M.DealerCode = H.DealerCode
join Decisiv_Tables_Prod.Stg_Decisiv_LineItems_Clobs C on C.Case_ID = L.Case_ID
and C.Case_Line_ID = L.Case_Line_ID
group by 1,2,3,4,5
Looking to get a table with the following data example:
Dealer ID, Dealer Code, Dealer Name, Case Month, Count of Case_ID, Count of Case_Line_ID, Count of Complaint, Count of Cause, Count of Correction

You have six unaggregated columns:
select L.Case_ID,
L.Case_Line_ID,
A.Dealer_ID,
M.DealerCode,
H.DealerName,
substr(L.Estimate_Created_At,1,7) as CaseMonth,
These should all be in the group by:
group by L.Case_ID, L.Case_Line_ID, A.Dealer_ID,
M.DealerCode, H.DealerName,
substr(L.Estimate_Created_At,1,7) as CaseMonth

As Gordon wrote, the GROUP BY list doesn't match your Select, you need to either remove both Case_ID & Case_Line_ID or aggregate them:
SELECT
A.Dealer_ID,
M.DealerCode,
H.DealerName,
Substr(L.Estimate_Created_At,1,7) AS CaseMonth,
Count(L.Case_ID), -- distinct ?
Count(L.Case_Line_ID), -- distinct ?
Count(DISTINCT L.Complaint) AS Complaint,
Count(DISTINCT C.Cause) AS Cause,
Count(DISTINCT C.Correction) AS Correction
FROM Decisiv_Tables_Prod.Stg_Decisiv_LineItems AS L
JOIN Decisiv_Tables_Prod.Stg_Decisiv_Cases AS A
ON L.Case_ID = A.Case_ID
JOIN Decisiv_Tables_Prod.Rpt_DecisivDealerMap AS M
ON A.Dealer_ID = M.DecisivDealerID
AND Cast(Substr(L.Estimate_Created_At,1,10) AS DATE FORMAT 'YYYY-MM-DD') BETWEEN M.EffectiveStartDate AND Coalesce(M.EffectiveEndDate, DATE '2099-12-31')
JOIN Decisiv_Tables_Prod.Rpt_DealerDirectoryHierarchy AS H
ON M.DealerCode = H.DealerCode
JOIN Decisiv_Tables_Prod.Stg_Decisiv_LineItems_Clobs AS C
ON C.Case_ID = L.Case_ID
AND C.Case_Line_ID = L.Case_Line_ID
GROUP BY
A.Dealer_ID,
M.DealerCode,
H.DealerName,
CaseMonth
I simplified cast('2099-12-31' as date format 'YYYY-MM-DD') to DATE '2099-12-31' and used the column names/aliases in Group By (recommended over 1,2,3,4 in production code).
As Distinct is quite expensive check if you actually need to add it to those counts.

Related

SQL add a column with COUNT(*) to my query

I need to add a column with the content of this query :
SELECT COUNT(*) FROM account_subscriptiongroups WHERE account_subscriptiongroups.active = true AND account_subscriptiongroups.user_id = account_user.id
to this query :
SELECT
account_user.id as user_id, account_user.email, account_user.first_name, account_user.last_name, account_user.phone,
account_subscriptiongroup.id as sub_group_id,
account_adminaction.description,
account_adminaction.id as admin_action_id,
account_adminaction.created_on as subscription_ended_on
FROM
account_adminaction
LEFT JOIN
account_user ON account_user.id = account_adminaction.user_id
LEFT JOIN
account_subscriptiongroup ON account_adminaction.sub_group_id = account_subscriptiongroup.id
WHERE
account_adminaction.created_on >= '2021-04-07' AND account_adminaction.created_on <= '2021-04-13' AND
((account_adminaction.description LIKE 'Arrêt de l''abonnement%') OR (account_adminaction.description LIKE 'L''utilisateur a arrêté%'))
ORDER BY
subscription_ended_on
I tried adding a LEFT JOIN like that:
LEFT JOIN
account_subscriptiongroup all_sg ON account_user.id = account_subscriptiongroup.user_id
with this line in my WHERE statement :
AND all_sg.active = true
and this line in my SELECT :
COUNT(all_sg.id)
but I get an error :
ERROR: column "account_user.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 2: account_user.id as user_id, account_user.email, account_us...
^
I don't understand how I could perform this action properly
To count something, you need to specify a group where that count applies.
So every column that you select (and is not used in an aggregate function, like COUNT or SUM), you need to mention in the GROUP BY clause.
Or to put it the other way around: the non-aggregate columns must apply to all rows that are contained in that particular COUNT.
So between the WHERE and ORDER BY clauses, add a GROUP BY clause:
GROUP BY account_user.id, account_user.email, account_user.first_name, account_user.last_name, account_user.phone,
account_subscriptiongroup.id,
account_adminaction.description,
account_adminaction.id,
account_adminaction.created_on
If, on the other hand, you want a count from a different table, you can add a sub-select:
SELECT
account_user.id as user_id, account_user.email, account_user.first_name, account_user.last_name, account_user.phone,
account_subscriptiongroup.id as sub_group_id,
account_adminaction.description,
account_adminaction.id as admin_action_id,
account_adminaction.created_on as subscription_ended_on,
(SELECT COUNT(*)
FROM account_subscriptiongroups
WHERE account_subscriptiongroups.active = true
AND account_subscriptiongroups.user_id = account_user.id) AS groupcount
FROM
account_adminaction
LEFT JOIN
account_user ON account_user.id = account_adminaction.user_id
You can left join to to a derived table that does the grouping and counting:
SELECT au.id as user_id, au.email, au.first_name, au.last_name, au.phone,
asg.id as sub_group_id,
ad.description,
ad.id as admin_action_id,
ad.created_on as subscription_ended_on,
asgc.num_groups
FROM account_adminaction ad
LEFT JOIN account_user au ON au.id = ad.user_id
LEFT JOIN account_subscriptiongroups asg on ON ad.sub_group_id = asg.id
LEFT JOIN (
SELECT user_id, count(*) as num_groups
FROM account_subscriptiongroups ag
WHERE ag.active
GROUP by user_id
) asgc on asgc.user_id = au.id
WHERE ad.created_on >= '2021-04-07'
AND ad.created_on <= '2021-04-13'
AND ((ad.description LIKE 'Arrêt de l''abonnement%') OR (ad.description LIKE 'L''utilisateur a arrêté%'))
ORDER BY subscription_ended_on
It's not entirely clear to me, what you are trying to count, but another option (most probably slower) could be to use a window function combined with a filter clause:
count(*) filter (where asg.active) over (partition by asg.user_id) as num_groups
EDIT: my answer is the same as submitted by a_horse_with_no_name
Two answers, a literal one just solving the problem you posed, and then another one questioning whether what you asked for is really what you want.
Simple answer: modify your desired query to add user_id to the Select and remove user_id from the Where clause. Now you have a table that can be left-joined to the rest of your larger query.
...
Left Join (Select user_id, count(*) as total_count
From account_subscriptiongroup
Where account_subscriptiongroups.active = true
Group By user_id) Z
On Z.user_id=account_user.id
I question whether this count is what you really want here. This counts every account_subscriptiongroup entry for all time but only the active ones. Your larger query brings back inactive as well as active records, so if your goal is to create a denominator for a percentage, you are mixing 'apples and oranges' here.
If you decide you want a total by user of the records in your query instead, then you can add one more element to your larger query without adding any more tables. Use a windowing function like this:
Select ..., Sum(Case When account_subscriptiongroup.active Then 1 else 0 End) Over (Group By account_user.id) as total count
This just counts the records within the date range and having the desired actions.

How to use case expression and min() with group by

The following query when I that execute
SELECT St.FirstName,St.LastName,
CASE
WHEN St.ISPackage='N' THEN Min(St.VisitingDate)
WHEN St.ISPackage='Y' THEN St.VisitingDate
END AS VisitingDate
FROM SalesTransaction As St
(...inner join and where clause)
GROUP BY St.FirstName,St.LastName, St.VisitingDate
If I use St.VisitingDate in Group By the result is duplicated.
If not used St.VisitingDate in Group By show error
invalid in the select list because it is not contained in either an aggregate function
How can I solve this problem?
I suspect that you want to group by name and get the visit date when ISPackage is "Y", and fallback on the earliest visit date if there is no date where ISPackage is "Y".
If so, you can do:
SELECT St.FirstName,St.LastName,
COALESCE(MIN(CASE WHEN St.ISPackage = 'Y' THEN St.VisitingDate END), MIN(St.VisitingDate)) VisitingDate
FROM SalesTransaction As St
-- (...inner join and where clause)
GROUP BY St.FirstName,St.LastName
you can use below SQL-
I calculated min date using a subquery.
SELECT St.FirstName,St.LastName,
CASE
WHEN St.ISPackage='Y' THEN St.VisitingDate
ELSE min_dt.min_VisitingDate
END AS VisitingDate
FROM
SalesTransaction As St
(...inner join and where clause)
left join (select St.FirstName,St.LastName, min(St.VisitingDate) min_VisitingDate from SalesTransaction group by St.FirstName,St.LastName) min_dt
ON St.FirstName =min_dt.FirstName ,St.LastName =min_dt.LastName

How to use 1 SQL query related to date and time in order to compare value difference

SELECT c.treatment_category, a.treatment_id, MAX(a.counts - b.counts) AS ReviewDifference
FROM
(SELECT treatment_id, COUNT(treatment_id) AS counts
FROM review
WHERE DATE(review.created) BETWEEN DATE(TIMESTAMP'2016-01-01 00:00:00.0') AND DATE(TIMESTAMP'2016-12-31 23:59:59.999')
GROUP BY treatment_id) a
LEFT JOIN
(SELECT treatment_id, COUNT(treatment_id)
FROM review
WHERE DATE(review.created) BETWEEN DATE(TIMESTAMP'2015-01-01 00:00:00.0') AND DATE(TIMESTAMP'2015-12-31 23:59:59.999')
GROUP BY treatment_id) b
ON a = b
LEFT JOIN
(SELECT t.treatment_category AS category, r.treatment_id AS number
FROM treatment t
LEFT JOIN review r
ON t.treatment_id = r.treatment_id
GROUP BY category, number) c
ON b.treatment_id = c.number
GROUP BY a.treatment_id, c.treatment_category
ORDER BY ReviewDifference DESC
LIMIT 1;
I need some hints or simpler query on how to do this question since it is related to date and time. Thank you.
What treatment category has seen the biggest increase in reviews from 2015 to 2016?
Please see below for the tables.
I have provided my code snippet and I would like to find a simpler and cleaner way on writing the code.
SELECT t.treatment_id, t.treatment_name,
COUNT( CASE WHEN YEAR(created) = 2016 THEN r.review_id END)
- COUNT( CASE WHEN YEAR(created) = 2015 THEN r.review_id END) as review_count
FROM treatments t
JOIN reviews r
ON t.treatment_id = r.treatment_id
GROUP BY t.treatment_id, t.treatment_name,
ORDER BY review_count DESC

By store (number), list the maximum number of store visits made by a customer

The three tables that I'm linking are item_scan_fact, member_dimension and store_dimension. So far this is what I have:
SELECT
store_dimension.store_number,
member_dimension.member_number
COUNT (item_scan_fact.visit_number) AS NumVisits
FROM
member_dimension,
item_scan_fact
INNER JOIN store_dimension
ON item_scan_fact.member_key = member_dimension.member_key
AND item_scan_fact.store_key = store_dimension.store_key
GROUP BY
store_dimension.store_number,
member_dimension.member_number, NumVisits;
On the surface it appears solvable with a couple Common Table Expressions
Does this help point you in the right direction?
WITH s1 -- JJAUSSI: Find the visit_number_count by member_key and store_key
AS
(SELECT isf.member_key
,isf.store_key
-- JJAUSSI: DISTINCT resolves a potential 1:N (one to many) relationship here
,COUNT( DISTINCT isf.visit_number) AS visit_number_count
FROM item_scan_fact isf
GROUP BY isf.member_key
,isf.store_key),
s2 -- JJAUSSI: Find the visit_number_count_max by member_key
AS
(SELECT s1.member_key
,MAX(s1.visit_number_count) AS visit_number_count_max
FROM s1
GROUP BY s1.member_key)
-- JJAUSSI: Use this version to see the list of store_key values
-- that have the visit_number_count_max value. This has the potential
-- to be a 1:N relationship.
SELECT s1.member_key
,md.member_number
,s1.store_key
,sd.store_number
,s1.visit_number_count
FROM s2 INNER JOIN s1
ON s2.member_key = s1.member_key
AND s2.visit_number_count_max = s1.visit_number_count
INNER JOIN store_dimension sd
ON sd.store_key = s1.store_key
INNER JOIN member_dimension md
ON md.member_key = s1.member_key;
If this is what you were going for...congratulations! On to the next query!
If you ultimately are after a single store_key response for each member_key (basically you want to determine the member_key's "primary" store_key) then an additional step is probably needed (depending on your data).
Here are some ideas:
Evaluate the member_key based on some other summable facet of
item_scan_fact (like total price paid?)
If you consider all store_key values of equal merit that have the same visit_number_count_max value for a given member_key, just choose a store_key with MAX or MIN
You would seem to want:
SELECT member_number, MAX(NumVisits)
FROM (SELECT sd.store_number, md.member_number
COUNT(*) AS NumVisits
FROM member_dimension md JOIN
item_scan_fact isf
ON md.member_key = isf.member_key JOIN
store_dimension sd
ON isf.store_key = sd. store_key
GROUP BY sd.store_number, md.member_number
) sm
GROUP BY member_number;
If you want to return both the max and the matching customer number you can apply a Teradata SQL extension, qualify:
SELECT sd.store_number, md.member_number
COUNT(*) AS NumVisits
FROM member_dimension md JOIN
item_scan_fact isf
ON md.member_key = isf.member_key JOIN
store_dimension sd
ON isf.store_key = sd. store_key
GROUP BY sd.store_number, md.member_number
QUALIFY
rank() -- might return multiple rows with the same max, ROW_NUMBER a single row
over (partition by sd.store_number
order by NumVisits desc) = 1

Left Join on subquery

Can someone help me with this query? In the subquery P, a fiscal year exists where the same year does not exist in BudgetActivityDetailCurrentBiennium. I need this query to show a null value for Amount in that year. Currently that year does not show up at all.
SELECT
P.FiscalYear
,P.BudgetNbr
,SUM(sec.BudgetActivityDetailCurrentBiennium.TranAmount) AS Amount
FROM
(SELECT
sec.BudgetIndexCurrentBiennium.BudgetNbr
,AVG(CAST(sec.BudgetIndexCurrentBiennium.BienniumYear AS INT)+1) AS FiscalYear
FROM
sec.BudgetIndexCurrentBiennium
GROUP BY
sec.BudgetIndexCurrentBiennium.BudgetNbr
UNION ALL
SELECT
sec.BudgetIndexCurrentBiennium.BudgetNbr
,AVG(CAST(sec.BudgetIndexCurrentBiennium.BienniumYear AS INT)+2) AS FiscalYear
FROM
sec.BudgetIndexCurrentBiennium
GROUP BY
sec.BudgetIndexCurrentBiennium.BudgetNbr) AS P
LEFT JOIN sec.BudgetActivityDetailCurrentBiennium
ON
P.BudgetNbr = sec.BudgetActivityDetailCurrentBiennium.BudgetNbr
AND P.FiscalYear = sec.BudgetActivityDetailCurrentBiennium.FiscalYear
WHERE sec.BudgetActivityDetailCurrentBiennium.BudgetNbr = '076036'
GROUP BY
P.FiscalYear
,P.BudgetNbr
I think the problem is your WHERE clause,
WHERE sec.BudgetActivityDetailCurrentBiennium.BudgetNbr = '076036'
which applies to the SELECT as a whole, not just (as I guess you intend) to the left join.
Change the WHERE to an AND. That should do the trick.