Aggregate by another table after annotate

I have the following annotate call:
qs.annotate(
    goods_with_sales=Count('goods', filter=Q(goods__history__sales__gt=0)),
)
In the same queryset I also want goods_with_sales_percent = goods_with_sales / sum_sales * 100.
I need the percentage of goods_with_sales relative to the sum of all goods__history__sales. I tried it with Window, but had no luck.
This is the raw SQL query:
SELECT
"wb_brand"."id",
"wb_brand"."name",
COUNT("wb_good"."id") FILTER (
WHERE
"wb_stockshistory"."sales" > 0
) AS "goods_with_sales"
FROM
"wb_brand"
LEFT OUTER JOIN "wb_good" ON (
"wb_brand"."id" = "wb_good"."brand_id"
)
LEFT OUTER JOIN "wb_stockshistory" ON (
"wb_good"."id" = "wb_stockshistory"."good_id"
)
GROUP BY
"wb_brand"."id"
I also tried this with a CTE, but had no luck either.
How can I solve it with the Django ORM (preferred) or with SQL?
P.S. There must also be a CASE/WHEN condition to guard against division by zero when sum_sales == 0.
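
The formula and the division-by-zero guard described above can be sketched in plain SQL. This is only a minimal illustration, run through Python's sqlite3 module rather than the Django ORM. The table names mirror the raw query, but the sample rows are invented, sum_sales is assumed to be the per-brand sum of wb_stockshistory.sales, and the FILTER clause is swapped for an equivalent COUNT(CASE ...) so the sketch also runs on older SQLite builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wb_brand (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE wb_good (id INTEGER PRIMARY KEY, brand_id INTEGER);
CREATE TABLE wb_stockshistory (id INTEGER PRIMARY KEY, good_id INTEGER, sales INTEGER);
-- Invented sample data: brand C has no goods, brand B has no sales.
INSERT INTO wb_brand VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO wb_good VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO wb_stockshistory VALUES (1, 1, 5), (2, 1, 0), (3, 2, 15), (4, 3, 0);
""")

query = """
SELECT b.id,
       b.name,
       -- Same meaning as COUNT(g.id) FILTER (WHERE h.sales > 0).
       COUNT(CASE WHEN h.sales > 0 THEN g.id END) AS goods_with_sales,
       COALESCE(SUM(h.sales), 0) AS sum_sales,
       -- CASE/WHEN guard: return 0 instead of dividing by zero.
       CASE WHEN COALESCE(SUM(h.sales), 0) = 0 THEN 0.0
            ELSE 100.0 * COUNT(CASE WHEN h.sales > 0 THEN g.id END) / SUM(h.sales)
       END AS goods_with_sales_percent
FROM wb_brand b
LEFT OUTER JOIN wb_good g ON b.id = g.brand_id
LEFT OUTER JOIN wb_stockshistory h ON g.id = h.good_id
GROUP BY b.id, b.name
ORDER BY b.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Multiplying by 100.0 before dividing keeps the result a floating-point number; with two integer operands many databases would otherwise truncate the quotient.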

Related

Getting issue under SQL View error (The MAX function requires 1 argument(s).)

I am getting the error "The MAX function requires 1 argument(s)" when running the view below. Can anybody help me with it?
CREATE VIEW SQL_RP("ID_NUMERIC", "SAMPLING_POINT", "SAMPLED_DATE", "COMPOSITE_INTERVAL", "TEST_NUMBER", "ANALYSIS", "TEST_COUNT", "COMPONENT_NAME", "RESULT_VALUE", "RESULT_TEXT", "UNITS", "RESULT_TYPE", "RESULT_STATUS", "ENTERED_BY", "UPDATED_DATE")
AS
SELECT sample.id_numeric,
sample.sampling_point,
sample.sampled_date,
sample.composite_interval,
test.test_number,
test.analysis,
test.test_count,
result.name AS component_name,
result.Value AS result_value,
result.text AS result_text,
result.units,
result.result_type,
result.status AS result_status,
result.entered_by,
MAX (ISNULL(test.date_authorised, (GETDATE - 365*100)), ISNULL(result.entered_on, (GETDATE - 365*100)), ISNULL(result.date_authorised, (GETDATE - 365*100)), ISNULL(esav.audit_date, (GETDATE - 365*100)) ) AS updated_date
FROM sample
INNER JOIN test
ON sample.id_numeric = test.sample
INNER JOIN result
ON test.test_number = result.test_number
LEFT OUTER JOIN eims_sample_audit_view esav
ON esav.id_numeric = sample.id_numeric
WHERE result.status <> 'U';

MAX() is an aggregate function, not a scalar function. What you really want is GREATEST(), but SQL Server versions before SQL Server 2022 do not have such a function (although most other databases do).
Emulating it with CASE is painful because of the NULLs, so the simplest way is probably OUTER APPLY in the FROM clause:
SELECT . . .,
       v.updated_date
FROM sample JOIN
     test
     ON sample.id_numeric = test.sample JOIN
     result
     ON test.test_number = result.test_number LEFT JOIN
     eims_sample_audit_view esav
     ON esav.id_numeric = sample.id_numeric OUTER APPLY
     (SELECT MAX(dte) AS updated_date
      FROM (VALUES (test.date_authorised),
                   (result.entered_on),
                   (result.date_authorised),
                   (esav.audit_date),
                   (GETDATE() - 365*100)
           ) v(dte)
     ) v;
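
The idea behind the VALUES subquery is to turn the candidate date columns into rows, so that the aggregate MAX() can be applied and NULLs are skipped. It can be tried outside SQL Server as well. The sketch below is an assumption-laden stand-in, not the answer's query: it uses Python's sqlite3, a hypothetical single table result_dates, a fixed fallback date instead of GETDATE() - 365*100, and UNION ALL plus GROUP BY in place of OUTER APPLY, which SQLite does not offer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table; dates are stored as ISO-8601 text so they sort correctly.
CREATE TABLE result_dates (
    id INTEGER PRIMARY KEY,
    date_authorised TEXT,
    entered_on TEXT,
    audit_date TEXT
);
INSERT INTO result_dates VALUES
    (1, '2020-01-05', NULL, '2021-03-01'),
    (2, NULL, NULL, NULL);
""")

query = """
SELECT id, MAX(dte) AS updated_date
FROM (
    -- Unpivot: one row per (id, candidate date).
    SELECT id, date_authorised AS dte FROM result_dates
    UNION ALL
    SELECT id, entered_on FROM result_dates
    UNION ALL
    SELECT id, audit_date FROM result_dates
    UNION ALL
    -- Fallback value so a row with only NULLs still gets a date.
    SELECT id, '1900-01-01' FROM result_dates
) AS unpivoted
GROUP BY id
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because the aggregate MAX() ignores NULLs, no per-column ISNULL wrapper is needed; the single fallback row plays that role.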

SQL-Query (with subquery, group and order by) optimization

Could you help me optimize the following statement? It performs badly when dealing with a huge amount of data (in my case 3 million messages and 25 million message work items).
Does anybody have any suggestions? Thank you in advance.
select distinct msg.id, msgWorkItem_1.description
from message msg
left outer join message_work_item msgWorkItem_1 on msg.id=msgWorkItem_1.message_id
and ( msgWorkItem_1.id in (
select max(msgWorkItem_2.id)
from message_work_item msgWorkItem_2
inner join message_work_item_type msgWorkItem_Type on msgWorkItem_2.message_work_item_type_id=msgWorkItem_Type.id
where
msgWorkItem_2.creation_type= 'mobile'
and msgWorkItem_2.description is not null
and msgWorkItem_Type.code <> 'sent-to-app-manually'
-- Is it possible to avoid this correlation to the outer query ? )
and msgWorkItem_2.message_id = msg.id)
)
where msg.deactivation_time > ?
order by msgWorkItem_1.description asc

STEP 1 : Lay out the query so I have any hope of reading it
SELECT
  DISTINCT
  msg.id,
  msgWorkItem_1.description
FROM
  message msg
LEFT OUTER JOIN
  message_work_item AS msgWorkItem_1
    ON  msgWorkItem_1.message_id = msg.id
    AND msgWorkItem_1.id =
        (
          SELECT
            MAX(msgWorkItem_2.id)
          FROM
            message_work_item AS msgWorkItem_2
          INNER JOIN
            message_work_item_type AS msgWorkItem_Type
              ON msgWorkItem_2.message_work_item_type_id = msgWorkItem_Type.id
          WHERE
                msgWorkItem_2.creation_type = 'mobile'
            AND msgWorkItem_2.description IS NOT NULL
            AND msgWorkItem_Type.code <> 'sent-to-app-manually'
            -- Is it possible to avoid this correlation to the outer query ?
            AND msgWorkItem_2.message_id = msg.id
        )
WHERE
  msg.deactivation_time > ?
ORDER BY
  msgWorkItem_1.description ASC
STEP 2 : rewrite using analytic functions instead of MAX()
SELECT
  DISTINCT
  message.id,
  message_work_item_sorted.description
FROM
  message
LEFT OUTER JOIN
  (
    SELECT
      message_work_item.message_id,
      message_work_item.description,
      ROW_NUMBER() OVER (PARTITION BY message_work_item.message_id
                             ORDER BY message_work_item.id DESC
                        )
                          AS row_ordinal
    FROM
      message_work_item
    INNER JOIN
      message_work_item_type
        ON message_work_item.message_work_item_type_id = message_work_item_type.id
    WHERE
          message_work_item.creation_type = 'mobile'
      AND message_work_item.description IS NOT NULL
      AND message_work_item_type.code <> 'sent-to-app-manually'
  )
    message_work_item_sorted
      ON  message_work_item_sorted.message_id = message.id
      AND message_work_item_sorted.row_ordinal = 1
WHERE
  message.deactivation_time > ?
ORDER BY
  message_work_item_sorted.description ASC
With more information we could probably help further, but as you gave no definition of the tables, constraints, or business logic, this is just a rewrite of what you've already implemented.
For example, I strongly doubt you need the DISTINCT (provided that the id columns in your tables are unique).
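
To check that the two formulations agree, both can be run against a small data set. The sketch below is a simplified assumption rather than the real schema: it uses Python's sqlite3 (window functions need SQLite 3.25 or newer), invented rows, and leaves out the message_work_item_type join and the deactivation_time filter to stay short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE message (id INTEGER PRIMARY KEY);
CREATE TABLE message_work_item (
    id INTEGER PRIMARY KEY,
    message_id INTEGER,
    description TEXT,
    creation_type TEXT
);
-- Invented sample data: message 3 has no work items at all.
INSERT INTO message VALUES (1), (2), (3);
INSERT INTO message_work_item VALUES
    (1, 1, 'first', 'mobile'),
    (2, 1, 'second', 'mobile'),
    (3, 1, NULL, 'mobile'),
    (4, 2, 'web only', 'web');
""")

# Original shape: correlated MAX() subquery in the join condition.
correlated = """
SELECT m.id, w.description
FROM message m
LEFT OUTER JOIN message_work_item w
  ON w.message_id = m.id
 AND w.id = (SELECT MAX(w2.id)
             FROM message_work_item w2
             WHERE w2.creation_type = 'mobile'
               AND w2.description IS NOT NULL
               AND w2.message_id = m.id)
ORDER BY m.id
"""

# Rewrite: rank the filtered work items per message, keep rank 1.
windowed = """
SELECT m.id, s.description
FROM message m
LEFT OUTER JOIN (
    SELECT message_id, description,
           ROW_NUMBER() OVER (PARTITION BY message_id
                              ORDER BY id DESC) AS row_ordinal
    FROM message_work_item
    WHERE creation_type = 'mobile'
      AND description IS NOT NULL
) s ON s.message_id = m.id AND s.row_ordinal = 1
ORDER BY m.id
"""

rows_correlated = conn.execute(correlated).fetchall()
rows_windowed = conn.execute(windowed).fetchall()
print(rows_correlated == rows_windowed, rows_windowed)
```

Neither query here needs DISTINCT, because at most one work item per message survives the join condition.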

Multi level GROUP BY clause not allowed

I have the following CrossTab query
TRANSFORM Max(VWDRSSTA.DATUM_ZEIT) AS MaxOfDATUM_ZEIT
SELECT VWDRSSTA.ANTRAGSNUMMER
,IIF(VWDRSSTA.SYSTEM = 'VS', (
SELECT (Max(VWDRSSTA.DUNKEL)) AS d
FROM VWDRSSTA
), NULL) AS Dunkel
,Max(VWDRSSTA.VERS_NR_INT) AS Versicherungsnummer
FROM VWDRSSTA
INNER JOIN V_NAMES ON (VWDRSSTA.SYSTEM = V_NAMES.SYSTEM_CODE)
AND (VWDRSSTA.EREIGNIS = V_NAMES.EREIGNIS)
GROUP BY VWDRSSTA.ANTRAGSNUMMER
ORDER BY VWDRSSTA.ANTRAGSNUMMER
PIVOT V_NAMES.MAPPED_NAME;
which gives me the error "Multi-level GROUP BY clause is not allowed in a subquery". Where am I going wrong with the code?

Try adding VWDRSSTA.SYSTEM to the GROUP BY clause.
It should do the trick.
Edit to detail my answer:
TRANSFORM Max(VWDRSSTA.DATUM_ZEIT) AS MaxOfDATUM_ZEIT
SELECT VWDRSSTA.ANTRAGSNUMMER
,IIF(VWDRSSTA.SYSTEM = 'VS', (
SELECT (Max(VWDRSSTA.DUNKEL)) AS d
FROM VWDRSSTA
), NULL) AS Dunkel
,Max(VWDRSSTA.VERS_NR_INT) AS Versicherungsnummer
FROM VWDRSSTA
INNER JOIN V_NAMES ON (VWDRSSTA.SYSTEM = V_NAMES.SYSTEM_CODE)
AND (VWDRSSTA.EREIGNIS = V_NAMES.EREIGNIS)
GROUP BY VWDRSSTA.ANTRAGSNUMMER, VWDRSSTA.SYSTEM
ORDER BY VWDRSSTA.ANTRAGSNUMMER
PIVOT V_NAMES.MAPPED_NAME;

Your crosstab query contains a secondary aggregate subquery, and this is not allowed in an Access crosstab query.
Change the SELECT Max(VWDRSSTA.DUNKEL) ... part to DMax("DUNKEL", "VWDRSSTA"), which will get rid of that error message.

Access SQL: subquery into IIF?

I have the following query that works fine:
SELECT NomComplet, IIF(Count(FS3.Index) = 0, '0 (RAS)', Count(FS3.Index))
FROM ControleAcces INNER JOIN (
Employes LEFT JOIN (
SELECT FS1.Index, FS1.OTP, FS1.OTP, FS1.Axe, FS1.FaitSaillant, FS1.Utilisateur, FS2.DateInsertion
FROM FaitsSaillants AS FS1 INNER JOIN (
SELECT Axe, Index, Max(FaitsSaillants.DateInsertion) AS DateInsertion
FROM FaitsSaillants
WHERE DateValue(DateInsertion) > #2010-01-01#
AND DateValue(DateInsertion) < #2011-12-31#
GROUP BY Axe, Index
) AS FS2
ON (FS1.DateInsertion = FS2.DateInsertion
AND FS1.Index = FS2.Index)
WHERE FS1.Axe = 'Project' AND FS2.Axe = 'Project'
) AS FS3
ON Employes.CIP = FS3.Utilisateur
)
ON ControleAcces.Valeur = Employes.CIP
GROUP BY NomComplet
ORDER BY NomComplet
Don't bother to fully understand it; all I want is to edit the IIF condition on the first line. Currently the condition doesn't do much: it checks how many FS3.Index values the query returns and concatenates (RAS) if the count is 0. What I would actually like is for it to check whether there is any row in FaitsSaillants where Axe = 'RAS'. If the Count() of this is > 0, then the condition is met.
Can I put a subquery inside the IIF, something like SELECT COUNT(Index) FROM FaitsSaillants WHERE Axe = 'RAS' AND Utilisateur = FS1.Utilisateur? If the result is 0, then I add the RAS to my second field's results. If not, it stays Count(FS3.Index).
I tried it, and while the syntax is correct, the problem is that it can't check the Utilisateur = FS1.Utilisateur condition because FS1 is in the main query. However, I must check this, because it is the only way to be sure that I'm looking at the right thing: it must be the same Utilisateur whether I'm in the main query or the subquery.
EDIT:
Here is a shorter version of what I tried from the answers/comments below.
SELECT NomComplet, IIf(FS2.AxeCount > 0, "0 (RAS)", count(FS3.index))
FROM ControleAcces INNER JOIN (Employes LEFT JOIN (SELECT FS2.AxeCount, FS1.Index, FS1.OTP, FS1.OTP, FS1.Axe, FS1.FaitSaillant, FS1.Utilisateur, FS2.DateInsertion
FROM FaitsSaillants AS FS1 INNER JOIN (
SELECT Axe, Index, Max(FaitsSaillants.DateInsertion) AS DateInsertion, SUM(IIf(Axe = 'RAS', 1, 0)) As AxeCount
FROM FaitsSaillants
GROUP BY Axe, Index
) AS FS2
ON (FS1.DateInsertion = FS2.DateInsertion
AND FS1.Index = FS2.Index)
) AS FS3 ON Employes.CIP = FS3.Utilisateur) ON ControleAcces.Valeur = Employes.CIP
GROUP BY NomComplet;
I still get an error saying that FS2.AxeCount isn't part of an aggregate function (in the IIf).
I've also tried this:
SELECT NomComplet, IIf((select count(*) from FaitsSaillants where axe='RAS' and Utilisateur=ControleAcces.Valeur) > 0, "0 (RAS)", count(FS3.index))
FROM ControleAcces INNER JOIN (Employes LEFT JOIN (SELECT FS2.AxeCount, FS1.Index, FS1.OTP, FS1.OTP, FS1.Axe, FS1.FaitSaillant, FS1.Utilisateur, FS2.DateInsertion
FROM FaitsSaillants AS FS1 INNER JOIN (
SELECT Axe, Index, Max(FaitsSaillants.DateInsertion) AS DateInsertion, SUM(IIf(Axe = 'RAS', 1, 0)) As AxeCount
FROM FaitsSaillants
GROUP BY Axe, Index
) AS FS2
ON (FS1.DateInsertion = FS2.DateInsertion
AND FS1.Index = FS2.Index)
) AS FS3 ON Employes.CIP = FS3.Utilisateur) ON ControleAcces.Valeur = Employes.CIP
GROUP BY NomComplet, ControleAccess.Valeur;

FS3.Index is NULL if there is no corresponding record, because of the LEFT JOIN. Wouldn't a test
IIf(IsNull(FS3.Index), ..., ...)
... be sufficient? I'm not sure, though, since other conditions and joins are involved as well.
UPDATE (recapitulation of comments)
We can get the desired count (AxeCount) from the innermost nested SELECT (FS2):
SELECT
Axe, Index, Max(FaitsSaillants.DateInsertion) AS DateInsertion,
SUM(IIf(Axe = 'RAS', 1, 0)) As AxeCount
FROM FaitsSaillants
...
This intermediate result must be passed to the outermost SELECT by including it in the select list of the intermediate SELECT (FS3):
SELECT FS2.AxeCount, FS1.Index, ...
The outermost SELECT has a GROUP BY clause. In this case, every field in the select list must either be listed in the GROUP BY clause or be wrapped in an aggregate function. GROUP BY groups rows by the fields listed in the clause. This usually reduces the number of rows, because rows that share the same values in the grouping fields are condensed into one row. The values of the remaining select-list fields (those not among the grouping fields) must therefore be combined, and this is what an aggregate function does. The aggregate functions are:
Avg (average)
Count
First, Last
Min, Max (minimum, maximum)
StDev, StDevP (standard deviation)
Sum
Var, VarP (variance)
See SQL Aggregate Functions (Access)
Now we can add this to the outermost select list:
IIf(SUM(FS2.AxeCount) > 0, ..., ...)
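
The pattern recapped above (compute a conditional count in the inner query, pass it outward, and wrap it in SUM() in the grouped outer query) can be sketched as follows. This is an illustration under assumptions, not the Access query itself: it runs on Python's sqlite3, uses hypothetical simplified tables employes and faits_saillants with invented rows, and writes CASE where Access would use IIf. Note also that SQLite tolerates bare non-aggregated columns in a grouped query, so it would not reproduce the Access error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified versions of the tables in the question.
CREATE TABLE employes (cip TEXT PRIMARY KEY, nom_complet TEXT);
CREATE TABLE faits_saillants (idx INTEGER PRIMARY KEY, axe TEXT, utilisateur TEXT);
INSERT INTO employes VALUES ('A1', 'Alice'), ('B2', 'Bob'), ('C3', 'Carol');
INSERT INTO faits_saillants VALUES
    (1, 'Project', 'A1'),
    (2, 'Project', 'A1'),
    (3, 'RAS', 'B2');
""")

query = """
SELECT e.nom_complet,
       -- axe_count is not a grouping column, so it is wrapped in SUM().
       CASE WHEN SUM(fs.axe_count) > 0 THEN '0 (RAS)'
            ELSE CAST(COALESCE(SUM(fs.n), 0) AS TEXT)
       END AS result
FROM employes e
LEFT OUTER JOIN (
    -- Inner query: per user and axis, count rows and count 'RAS' rows.
    SELECT utilisateur,
           axe,
           COUNT(*) AS n,
           SUM(CASE WHEN axe = 'RAS' THEN 1 ELSE 0 END) AS axe_count
    FROM faits_saillants
    GROUP BY utilisateur, axe
) fs ON fs.utilisateur = e.cip
GROUP BY e.nom_complet
ORDER BY e.nom_complet
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

For an employee with no matching rows the SUM() is NULL, the comparison with 0 is not true, and the ELSE branch reports a count of 0.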

I'm a little unclear on exactly what you want to check in the FaitsSaillants table; you said:
I would like it to check if there is any row in FaitsSaillants where
Axe = 'RAS'. If the Count() of this is > 0, then the condition is met.
But you also stated:
something like SELECT COUNT(Index) FROM FaitsSaillants WHERE Axe =
'RAS' AND Employes.CIP = FS3.Utilisateur
My guess is that you meant SELECT COUNT(Index) FROM FaitsSaillants WHERE Axe = 'RAS', because in the second SQL statement you're joining to two tables that aren't being referenced in the FROM clause of your subquery.
What about using DCount in your IIF statement?
IIF(DCount("Index", "FaitsSaillants", "Axe='RAS'"), '0 (RAS)', Count(FS3.Index))
This should work if you meant SELECT COUNT(Index) FROM FaitsSaillants WHERE Axe = 'RAS', but it would need to be modified if you had something else in mind.
Just a caveat though: I would try to use the "domain" functions (DCount, DLookup, etc...) sparingly, because they are slow.
By the way, I believe you can also use a subquery in your IIF statement (is it giving you an error? It seems to work for me):
IIF((SELECT COUNT(*) FROM FaitsSaillants WHERE Axe = 'RAS'), '0 (RAS)', Count(FS3.Index))
Just make sure you put the subquery in parentheses.

SQL - Derived tables issue

I have the following SQL query:
SELECT VehicleRegistrations.ID, VehicleRegistrations.VehicleReg,
VehicleRegistrations.Phone, VehicleType.VehicleTypeDescription,
dt.ID AS 'CostID', dt.IVehHire, dt.FixedCostPerYear, dt.VehicleParts,
dt.MaintenancePerMile, dt.DateEffective
FROM VehicleRegistrations
INNER JOIN VehicleType ON VehicleRegistrations.VehicleType = VehicleType.ID
LEFT OUTER JOIN (SELECT TOP (1) ID, VehicleRegID, DateEffective, IVehHire,
FixedCostPerYear, VehicleParts, MaintenancePerMile
FROM VehicleFixedCosts
WHERE (DateEffective <= GETDATE())
ORDER BY DateEffective DESC) AS dt
ON dt.VehicleRegID = VehicleRegistrations.ID
What I basically want to do is always select the top 1 record from the 'VehicleFixedCosts' table, where the VehicleRegID matches the one in the main query. What is happening here is that it's selecting the top row before the join, so if the vehicle registration of the top row doesn't match the one we're joining to it returns nothing.
Any ideas? I really don't want to have to use subselects for each of the columns I need to return.

Try this:
SELECT vr.ID, vr.VehicleReg,
vr.Phone, VehicleType.VehicleTypeDescription,
dt.ID AS 'CostID', dt.IVehHire, dt.FixedCostPerYear, dt.VehicleParts,
dt.MaintenancePerMile, dt.DateEffective
FROM VehicleRegistrations vr
INNER JOIN VehicleType ON vr.VehicleType = VehicleType.ID
LEFT OUTER JOIN (
    -- Columns are qualified with vfc because VehicleRegID and DateEffective
    -- also exist in t and would otherwise be ambiguous.
    SELECT vfc.ID, vfc.VehicleRegID, vfc.DateEffective, vfc.IVehHire,
           vfc.FixedCostPerYear, vfc.VehicleParts, vfc.MaintenancePerMile
    FROM VehicleFixedCosts vfc
    JOIN (
        SELECT VehicleRegID, MAX(DateEffective) AS DateEffective
        FROM VehicleFixedCosts
        WHERE DateEffective <= GETDATE()
        GROUP BY VehicleRegID
    ) t ON vfc.VehicleRegID = t.VehicleRegID AND vfc.DateEffective = t.DateEffective
) AS dt
ON dt.VehicleRegID = vr.ID
The subquery behind dt might need some additional grouping (for instance, if one vehicle can have several rows with the same DateEffective), but without the schema (and maybe sample data) it's hard to say which columns should be involved.
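
The join-to-the-latest-date pattern can be tried on a toy data set. The sketch below is a minimal illustration under assumptions: it uses Python's sqlite3 instead of SQL Server, hypothetical tables vehicle and fixed_cost with invented rows, and a fixed cutoff date passed as a parameter in place of GETDATE() so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified versions of VehicleRegistrations / VehicleFixedCosts.
CREATE TABLE vehicle (id INTEGER PRIMARY KEY, reg TEXT);
CREATE TABLE fixed_cost (
    id INTEGER PRIMARY KEY,
    vehicle_id INTEGER,
    date_effective TEXT,
    cost INTEGER
);
INSERT INTO vehicle VALUES (1, 'AB12'), (2, 'CD34'), (3, 'EF56');
INSERT INTO fixed_cost VALUES
    (1, 1, '2020-01-01', 100),
    (2, 1, '2021-01-01', 120),
    (3, 2, '2019-06-01', 80),
    (4, 1, '2999-01-01', 999);  -- not yet effective, must be ignored
""")

query = """
SELECT v.id, v.reg, dt.id AS cost_id, dt.cost, dt.date_effective
FROM vehicle v
LEFT OUTER JOIN (
    SELECT c.id, c.vehicle_id, c.date_effective, c.cost
    FROM fixed_cost c
    JOIN (
        -- Latest effective date per vehicle, up to the cutoff.
        SELECT vehicle_id, MAX(date_effective) AS date_effective
        FROM fixed_cost
        WHERE date_effective <= ?
        GROUP BY vehicle_id
    ) t ON c.vehicle_id = t.vehicle_id
       AND c.date_effective = t.date_effective
) dt ON dt.vehicle_id = v.id
ORDER BY v.id
"""
cutoff = "2024-01-01"  # assumed stand-in for the current date
rows = conn.execute(query, (cutoff,)).fetchall()
for row in rows:
    print(row)
```

Unlike the TOP (1) derived table, the maximum is computed per vehicle before the join, so each vehicle gets its own latest cost row, and a vehicle without any cost row still appears with NULLs.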
