How to use aggregated result in order by clause with math operation in Postgres

How to use aggregated result in order by clause with math operation in Postgres - sql

I have a query as below
select "products".*,
AVG(score_values.score) as average_scores,
(select count(*) from "comments" where "products"."id" = "comments"."product_id") as comments_count
from "products"
inner join "score_values" on "products"."id" = "score_values"."product_id" and "score_values"."active" = 1
group by "products"."id"
order by average_scores desc
limit 5
When I add math operator to order clause, I get error which is column not exists.
order by average_scores * 0.9 + comments_count * 5 / 1000 desc
[42703] ERROR: column "average_scores" does not exist
How can I solve this problem ?

You have two options:
Repeat the expressions in the ORDER BY clause:
ORDER BY AVG(score_values.score) * 0.9
+ (select count(*) from "comments"
where "products"."id" = "comments"."product_id") * 5 / 1000
Use a subquery like GMB's answer suggests.
The second option is the better one.
Note that this behavior is documented:
A sort_expression can also be the column label or number of an output column, as in:
SELECT a + b AS sum, c FROM table1 ORDER BY sum;
SELECT a, max(b) FROM table1 GROUP BY a ORDER BY 1;
both of which sort by the first output column. Note that an output column name has to stand alone, that is, it cannot be used in an expression — for example, this is not correct:
SELECT a + b AS sum, c FROM table1 ORDER BY sum + c; -- wrong
This restriction is made to reduce ambiguity.

You can use aggregate expressions in the order by clause. The aggregates will be calculated once.
select
products.*,
avg(score_values.score) as average_scores,
count(comments.*) as comments_count
from products
inner join comments on products.id = comments.product_id
inner join score_values on products.id = score_values.product_id and score_values.active = 1
group by products.id
order by avg(score_values.score) * 0.9 + count(comments.*) * 5 / 1000 desc
limit 5

You could work around this by wrapping your query as a subquery, and the order in the outer query, like:
select *
from (
select "products".*,
AVG(score_values.score) as average_scores,
(select count(*) from "comments" where "products"."id" = "comments"."product_id") as comments_count
from "products"
inner join "score_values" on "products"."id" = "score_values"."product_id" and "score_values"."active" = 1
group by "products"."id"
limit 5
) x
order by average_scores * 0.9 + comments_count * 5 / 1000 desc

Related

Syntax error to combine left join and select

I'm getting a syntax error at Left Join. So in trying to combine the two, i used the left join and the brackets. I'm not sure where the problem is:
SELECT DISTINCT a.order_id
FROM fact.outbound AS a
ORDER BY Rand()
LIMIT 5
LEFT JOIN (
SELECT
outbound.marketplace_name,
outbound.product_type,
outbound.mpid,
outbound.order_id,
outbound.sku,
pbdh.mpid,
pbdh.product_type,
pbdh.validated_exp_reach,
pbdh.ultimate_sales_rank_de,
pbdh.ultimate_sales_rank_fr,
(
pbdh.very_good_stock_count + good_stock_count + new_Stock_count
) as total_stock
FROM
fact.outbound AS outbound
LEFT JOIN reporting_layer.pricing_bi_data_historisation AS pbdh ON outbound.mpid = pbdh.mpid
AND trunc(outbound.ordered_date) = trunc(pbdh.importdate)
WHERE
outbound.ordered_date > '2022-01-01'
AND pbdh.importdate > '2022-01-01'
LIMIT
5
) AS b ON a.orderid = b.order_id
Error:
You have an error in your SQL syntax; it seems the error is around: 'LEFT JOIN ( SELECT outbound.marketplace_name, outbound.product_t' at line 9
What could be the reason?

Place the first limit logic into a separate subquery, and then join the two subqueries:
SELECT DISTINCT a.order_id
FROM
(
SELECT order_id
FROM fact.outbound
ORDER BY Rand()
LIMIT 5
) a
LEFT JOIN
(
SELECT
outbound.marketplace_name,
outbound.product_type,
outbound.mpid,
outbound.order_id,
outbound.sku,
pbdh.mpid,
pbdh.product_type,
pbdh.validated_exp_reach,
pbdh.ultimate_sales_rank_de,
pbdh.ultimate_sales_rank_fr,
(pbdh.very_good_stock_count +
good_stock_count + new_Stock_count) AS total_stock
FROM fact.outbound AS outbound
LEFT JOIN reporting_layer.pricing_bi_data_historisation AS pbdh
ON outbound.mpid = pbdh.mpid AND
TRUNC(outbound.ordered_date) = TRUNC(pbdh.importdate)
WHERE outbound.ordered_date > '2022-01-01' AND
pbdh.importdate > '2022-01-01'
-- there should be an ORDER BY clause here...
LIMIT 5
) AS b
ON a.orderid = b.order_id;
Note that the select clause of the b subquery can be reduced to just the order_id, as no values from this subquery are actually selected in the end.

You can skip the LEFT JOIN, since no b columns are selected. (And SELECT DISTINCT makes sure any duplicates are eliminated.)
SELECT DISTINCT order_id
FROM fact.outbound
ORDER BY Rand()
LIMIT 5

How to get all the TOP rows that are equal

I am trying to write a query to answer the question "What Pop Tart flavor is sold in the most stores?"
Here is my schema:
I have this attempt:
SELECT COUNT(*) AS CountOfStores, PTF.PopTartFlavor AS PopTartFlavor
FROM tKrogerStore_PopTartFlavor KSPTF
INNER JOIN tPopTartFlavor PTF
ON KSPTF.PopTartFlavorID = PTF.PopTartFlavorID
GROUP BY PTF.PopTartFlavor
ORDER BY CountOfStores DESC
but the query gives me this:
and I just want the first 2 rows because they both have a count of 2, but if I say TOP 1 I just get the first row with a count of 2 and I don't get both rows that have a count of 2. How do I get both rows that have the same count (2)?

you could try uisng having the count(*) = to the max result
SELECT COUNT(*) AS CountOfStores, PTF.PopTartFlavor AS PopTartFlavor
FROM tKrogerStore_PopTartFlavor KSPTF
INNER JOIN tPopTartFlavor PTF
ON KSPTF.PopTartFlavorID = PTF.PopTartFlavorID
GROUP BY PTF.PopTartFlavor
HAVING COUNT(*) = (
select max(CountOfStores)
from (
SELECT COUNT(*) AS CountOfStores, PTF.PopTartFlavor AS PopTartFlavor
FROM tKrogerStore_PopTartFlavor KSPTF
INNER JOIN tPopTartFlavor PTF
ON KSPTF.PopTartFlavorID = PTF.PopTartFlavorID
GROUP BY PTF.PopTartFlavor
) t
)

I would capture the binned count and the overall max count in the same subquery query (the latter using an aggregated windowed function), then in the outer query filter for when they're equal.
select CountOfStores, PopTartFlavor
from (
select CountOfStores = count(*),
maxCountOfStores = max(count(*)) over(),
PTF.PopTartFlavor
from #tKrogerStore_PopTartFlavor KSPTF
join #tPopTartFlavor PTF ON KSPTF.PopTartFlavorID = PTF.PopTartFlavorID
group by PTF.PopTartFlavor
) c
where countOfStores = maxCountOfStores

How to use multiple count and where condition sql server 2008?

I have this two query
1.
select CL_Clients.cl_id,CL_Clients].cl_name,COUNT(*) AS number_of_orders
from CL_Clients,CLOI_ClientOrderItems
where CL_Clients.cl_id=CLOI_ClientOrderItems.cl_id
group by CL_Clients.cl_name,CL_Clients.cl_id
2.
select CL_Clients.cl_id,count(cloi_current_status) as dis
from CLOI_ClientOrderItems,CL_Clients
where cloi_current_status]='12'
and CL_Clients.cl_id=CLOI_ClientOrderItems.cl_id
group by CL_Clients.cl_name,CL_Clients.cl_id,CLOI_ClientOrderItems.cloi_current_status
i have this column i need to put count function and where condition
[cloi_current_status]
166
30
30
30
150
150
150
150
150
150
150

Quite simple, you just encapsulate the queries and give their result sets an alias and then do a JOIN between their aliases on the column that is common. (In the query below I assume you'll be joining by client id)
SELECT *
FROM (
SELECT CL_Clients.cl_id,
CL_Clients].cl_name,
COUNT(*) AS number_of_orders
FROM CL_Clients,
CLOI_ClientOrderItems
WHERE CL_Clients.cl_id = CLOI_ClientOrderItems.cl_id
GROUP BY CL_Clients.cl_name,
CL_Clients.cl_id
) A
INNER JOIN (
SELECT CL_Clients.cl_id,
count(cloi_current_status) AS dis
FROM CLOI_ClientOrderItems,
CL_Clients
WHERE cloi_current_status] = '12'
AND CL_Clients.cl_id = CLOI_ClientOrderItems.cl_id
GROUP BY CL_Clients.cl_name,
CL_Clients.cl_id,
CLOI_ClientOrderItems.cloi_current_status
) B
ON A.cl_id = B.cl_id
WHERE ...
GROUP BY ...
This will be treated as a separate result set, so you can also filter results with a WHERE or just a GROUP BY, just like in a normal SELECT.
UPDATE:
To answer the question in your comments, when you join two tables that have a column with the same value and use
SELECT * FROM A INNER JOIN B the * will show all columns returned by the join, meaning all columns from A and all columns from B, this is why you have duplicate columns.
If you want to filter the columns returned you can specifiy which columns you want returned. So, in your case, the top SELECT * can be replaced with
SELECT A.cl_id, A.cl_name, A.number_of_orders, B.dis so, your query becomes:
SELECT A.cl_id, A.cl_name, A.number_of_orders, B.dis
FROM (
SELECT CL_Clients.cl_id,
CL_Clients].cl_name,
COUNT(*) AS number_of_orders
FROM CL_Clients,
CLOI_ClientOrderItems
WHERE CL_Clients.cl_id = CLOI_ClientOrderItems.cl_id
GROUP BY CL_Clients.cl_name,
CL_Clients.cl_id
) A
INNER JOIN (
SELECT CL_Clients.cl_id,
count(cloi_current_status) AS dis
FROM CLOI_ClientOrderItems,
CL_Clients
WHERE cloi_current_status] = '12'
AND CL_Clients.cl_id = CLOI_ClientOrderItems.cl_id
GROUP BY CL_Clients.cl_name,
CL_Clients.cl_id,
CLOI_ClientOrderItems.cloi_current_status
) B
ON A.cl_id = B.cl_id
UPDATE #2:
For your last question, you need to GROUP BY at the end of the big query and use a HAVING condtion, like this:
GROUP BY A.cl_id, A.cl_name, A.number_of_orders, B.dis
HAVING COUNT(cloi_current_status) > 100

All depends on what data you are trying to get, but you can go about it like this.
SELECT Column_x, Column_y, etc..
FROM ClL_Clients a
JOIN (select CL_Clients.cl_id,CL_Clients].cl_name,COUNT(*) AS number_of_orders
from CL_Clients,CLOI_ClientOrderItems
where CL_Clients.cl_id=CLOI_ClientOrderItems.cl_id
group by CL_Clients.cl_name,CL_Clients.cl_id) b
on a.cl_id = b.cl_id
JOIN (select CL_Clients.cl_id,count(cloi_current_status) as dis
from CLOI_ClientOrderItems,CL_Clients
where cloi_current_status]='12'
and CL_Clients.cl_id=CLOI_ClientOrderItems.cl_id
group by CL_Clients.cl_name,CL_Clients.cl_id,CLOI_ClientOrderItems.cloi_current_status) c
on a.cl_id = c.cl_id
Group by BLAH BLAH
Hope this gets you in the right direction.

Limit join to one row

I have the following query:
SELECT sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount, 'rma' as
"creditType", "Clients"."company" as "client", "Clients".id as "ClientId", "Rmas".*
FROM "Rmas" JOIN "EsnsRmas" on("EsnsRmas"."RmaId" = "Rmas"."id")
JOIN "Esns" on ("Esns".id = "EsnsRmas"."EsnId")
JOIN "EsnsSalesOrderItems" on("EsnsSalesOrderItems"."EsnId" = "Esns"."id" )
JOIN "SalesOrderItems" on("SalesOrderItems"."id" = "EsnsSalesOrderItems"."SalesOrderItemId")
JOIN "Clients" on("Clients"."id" = "Rmas"."ClientId" )
WHERE "Rmas"."credited"=false AND "Rmas"."verifyStatus" IS NOT null
GROUP BY "Clients".id, "Rmas".id;
The problem is that the table "EsnsSalesOrderItems" can have the same EsnId in different entries. I want to restrict the query to only pull the last entry in "EsnsSalesOrderItems" that has the same "EsnId".
By "last" entry I mean the following:
The one that appears last in the table "EsnsSalesOrderItems". So for example if "EsnsSalesOrderItems" has two entries with "EsnId" = 6 and "createdAt" = '2012-06-19' and '2012-07-19' respectively it should only give me the entry from '2012-07-19'.

SELECT (count(*) * sum(s."price")) AS amount
, 'rma' AS "creditType"
, c."company" AS "client"
, c.id AS "ClientId"
, r.*
FROM "Rmas" r
JOIN "EsnsRmas" er ON er."RmaId" = r."id"
JOIN "Esns" e ON e.id = er."EsnId"
JOIN (
SELECT DISTINCT ON ("EsnId") *
FROM "EsnsSalesOrderItems"
ORDER BY "EsnId", "createdAt" DESC
) es ON es."EsnId" = e."id"
JOIN "SalesOrderItems" s ON s."id" = es."SalesOrderItemId"
JOIN "Clients" c ON c."id" = r."ClientId"
WHERE r."credited" = FALSE
AND r."verifyStatus" IS NOT NULL
GROUP BY c.id, r.id;
Your query in the question has an illegal aggregate over another aggregate:
sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount
Simplified and converted to legal syntax:
(count(*) * sum(s."price")) AS amount
But do you really want to multiply with the count per group?
I retrieve the the single row per group in "EsnsSalesOrderItems" with DISTINCT ON. Detailed explanation:
Select first row in each GROUP BY group?
I also added table aliases and formatting to make the query easier to parse for human eyes. If you could avoid camel case you could get rid of all the double quotes clouding the view.

Something like:
join (
select "EsnId",
row_number() over (partition by "EsnId" order by "createdAt" desc) as rn
from "EsnsSalesOrderItems"
) t ON t."EsnId" = "Esns"."id" and rn = 1
this will select the latest "EsnId" from "EsnsSalesOrderItems" based on the column creation_date. As you didn't post the structure of your tables, I had to "invent" a column name. You can use any column that allows you to define an order on the rows that suits you.
But remember the concept of the "last row" is only valid if you specifiy an order or the rows. A table as such is not ordered, nor is the result of a query unless you specify an order by

Necromancing because the answers are outdated.
Take advantage of the LATERAL keyword introduced in PG 9.3
left | right | inner JOIN LATERAL
I'll explain with an example:
Assuming you have a table "Contacts".
Now contacts have organisational units.
They can have one OU at a point in time, but N OUs at N points in time.
Now, if you have to query contacts and OU in a time period (not a reporting date, but a date range), you could N-fold increase the record count if you just did a left join.
So, to display the OU, you need to just join the first OU for each contact (where what shall be first is an arbitrary criterion - when taking the last value, for example, that is just another way of saying the first value when sorted by descending date order).
In SQL-server, you would use cross-apply (or rather OUTER APPLY since we need a left join), which will invoke a table-valued function on each row it has to join.
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
-- CROSS APPLY -- = INNER JOIN
OUTER APPLY -- = LEFT JOIN
(
SELECT TOP 1
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(#in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo)
AND
(#in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
) AS FirstOE
In PostgreSQL, starting from version 9.3, you can do that, too - just use the LATERAL keyword to achieve the same:
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
LEFT JOIN LATERAL
(
SELECT
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(__in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo)
AND
(__in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
LIMIT 1
) AS FirstOE

Try using a subquery in your ON clause. An abstract example:
SELECT
*
FROM table1
JOIN table2 ON table2.id = (
SELECT id FROM table2 WHERE table2.table1_id = table1.id LIMIT 1
)
WHERE
...

MS-Access -> SELECT AS + ORDER BY = error

I'm trying to make a query to retrieve the region which got the most sales for sweet products. 'grupo_produto' is the product type, and 'regiao' is the region. So I got this query:
SELECT TOP 1 r.nm_regiao, (SELECT COUNT(*)
FROM Dw_Empresa
WHERE grupo_produto='1' AND
cod_regiao = d.cod_regiao) as total
FROM Dw_Empresa d
INNER JOIN tb_regiao r ON r.cod_regiao = d.cod_regiao ORDER BY total DESC
Then when i run the query, MS-Access asks for the "total" parameter. Why it doesn't consider the newly created 'column' I made in the select clause?
Thanks in advance!

Old Question I know, but it may help someone knowing than while you cant order by aliases, you can order by column index. For example, this will work without error :
SELECT
firstColumn,
IIF(secondColumn = '', thirdColumn, secondColumn) As yourAlias
FROM
yourTable
ORDER BY
2 ASC
The results would then be ordered by the values found in the second column wich is the Alias "yourAlias".

Aliases are only usable in the query output. You can't use them in other parts of the query. Unfortunately, you'll have to copy and paste the entire subquery to make it work.

You can do it like this
select * from(
select a + b as c, * from table)
order by c
Access has some differences compared to Sql Server.

Why it doesn't consider the newly
created 'column' I made in the select
clause?
Because Access (ACE/Jet) is not compliant with the SQL-92 Standard.
Consider this example, which is valid SQL-92:
SELECT a AS x, c - b AS y
FROM MyTable
ORDER
BY x, y;
In fact, x and y the only valid elements in the ORDER BY clause because all others are out of scope (ordinal numbers of columns in the SELECT clause are valid though their use id deprecated).
However, Access chokes on the above syntax. The equivalent Access syntax is this:
SELECT a AS x, c - b AS y
FROM MyTable
ORDER
BY a, c - b;
However, I understand from #Remou's comments that a subquery in the ORDER BY clause is invalid in Access.

Try using a subquery and order the results in an outer query.
SELECT TOP 1 * FROM
(
SELECT
r.nm_regiao,
(SELECT COUNT(*)
FROM Dw_Empresa
WHERE grupo_produto='1' AND cod_regiao = d.cod_regiao) as total
FROM Dw_Empresa d
INNER JOIN tb_regiao r ON r.cod_regiao = d.cod_regiao
) T1
ORDER BY total DESC
(Not tested.)

How about:
SELECT TOP 1 r.nm_regiao
FROM (SELECT Dw_Empresa.cod_regiao,
Count(Dw_Empresa.cod_regiao) AS CountOfcod_regiao
FROM Dw_Empresa
WHERE Dw_Empresa.[grupo_produto]='1'
GROUP BY Dw_Empresa.cod_regiao
ORDER BY Count(Dw_Empresa.cod_regiao) DESC) d
INNER JOIN tb_regiao AS r
ON d.cod_regiao = r.cod_regiao

I suggest using an intermediate query.
SELECT r.nm_regiao, d.grupo_produto, COUNT(*) AS total
FROM Dw_Empresa d INNER JOIN tb_regiao r ON r.cod_regiao = d.cod_regiao
GROUP BY r.nm_regiao, d.grupo_produto;
If you call that GroupTotalsByRegion, you can then do:
SELECT TOP 1 nm_regiao, total FROM GroupTotalsByRegion
WHERE grupo_produto = '1' ORDER BY total DESC
You may think it's extra work to create the intermediate query (and, in a sense, it is), but you will also find that many of your other queries will be based off of GroupTotalsByRegion. You want to avoid repeating that logic in many other queries. By keeping it in one view, you provide a simplified route to answering many other questions.

How about use:
WITH xx AS
(
SELECT TOP 1 r.nm_regiao, (SELECT COUNT(*)
FROM Dw_Empresa
WHERE grupo_produto='1' AND
cod_regiao = d.cod_regiao) as total
FROM Dw_Empresa d
INNER JOIN tb_regiao r ON r.cod_regiao = d.cod_regiao
) SELECT * FROM xx ORDER BY total

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

How to use aggregated result in order by clause with math operation in Postgres - sql

Related

Syntax error to combine left join and select

How to get all the TOP rows that are equal

How to use multiple count and where condition sql server 2008?

Limit join to one row

MS-Access -> SELECT AS + ORDER BY = error

Categories

Resources