"must appear in the GROUP BY clause" error in SQL

I have a SQL statement and I am trying to add an ORDER BY clause. When I add it, I get this error:
ERROR: column "items.id" must appear in the GROUP BY clause or be used in an aggregate function
My query is:
WITH "has_children_cte"
AS (SELECT DISTINCT "parent_id" AS "item_id",
1 AS "has_children"
FROM "items")
SELECT "item_category_id",
Count(*) AS "count"
FROM "items"
INNER JOIN "items" AS "root_item"
ON ( "root_item"."id" = "items"."root_id" )
LEFT JOIN "item_types"
ON ( "items"."item_type_id" = "item_types"."id" )
LEFT JOIN "item_categories"
ON ( "item_categories"."id" = "item_types"."item_category_id" )
INNER JOIN "order_items"
ON ( "items"."order_item_id" = "order_items"."id" )
INNER JOIN "orders"
ON ( "order_items"."order_id" = "orders"."id" )
LEFT JOIN "has_children_cte"
ON ( "items"."id" = "has_children_cte"."item_id" )
WHERE ( ( "items"."parent_id" IS NULL )
AND ( "items"."state" != 'discarded' ) )
GROUP BY "item_category_id"
ORDER BY "items"."id";
I added ORDER BY "items"."id"; and then got this error. When I add items.id to the GROUP BY instead, I get wrong results.
Unfortunately, I have not been able to resolve this error.

The ORDER BY (logically) takes place after the aggregation. And after the aggregation, "items"."id" is not available in each row.
So just use an aggregation function:
ORDER BY MIN("items"."id")
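A minimal sketch of the same idea, run against an in-memory SQLite database so it is self-contained. The two-column items table and its rows are made up for illustration. SQLite is more permissive than PostgreSQL about bare columns in grouped queries, so it would not raise the original error, but ordering by an aggregate works the same way in both.

```python
import sqlite3

# Hypothetical, cut-down version of the "items" table from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, item_category_id INTEGER);
    INSERT INTO items (id, item_category_id) VALUES
        (1, 20), (2, 10), (3, 20), (4, 10), (5, 30);
""")

# After GROUP BY there is one row per category, so the sort key must be
# an aggregate over the group: here, the smallest id in each category.
rows = conn.execute("""
    SELECT item_category_id, COUNT(*) AS count
    FROM items
    GROUP BY item_category_id
    ORDER BY MIN(id)
""").fetchall()

print(rows)
```

Each category appears once, ordered by the lowest item id it contains.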

Related

ORDER in CTE lost after GROUP BY

I have the following SQL
WITH tally AS (
SELECT results.answer,
results.poll_id,
count(1) AS votes
FROM (
SELECT pr.poll_id,
unnest(pr.response) AS answer
FROM poll_responses pr
LEFT JOIN polls p ON pr.poll_id = p.id
LEFT JOIN poll_collections pc ON pc.id = p.poll_collection_id
WHERE pc.id = ${pollCollectionId}
) AS results
GROUP BY results.answer, results.poll_id
),
all_choices AS (SELECT unnest(pls.choices) AS choice,
pls.id AS poll_id
FROM poll_collections pcol
INNER JOIN polls pls
ON pcol.id = pls.poll_collection_id
WHERE pcol.id = ${pollCollectionId}),
unvoted_tally AS (SELECT ac.choice AS answer,
ac.poll_id,
0 AS total
FROM all_choices ac
LEFT JOIN tally t ON t.answer = ac.choice
WHERE t.answer IS NULL),
final_tally AS (SELECT *
FROM tally
UNION
ALL
SELECT *
FROM unvoted_tally),
sorted_tally AS (
SELECT ft.*
FROM final_tally ft
ORDER BY array_position(array(SELECT choice FROM all_choices), ft.answer)
)
SELECT json_agg(poll_results.polls) AS polls
FROM (
SELECT json_array_elements(json_agg(results)) -> 'poll' AS polls
FROM (
SELECT json_build_object(
'id', st.poll_id,
'question', pls.question,
'choice-type', pls.choice_type,
'results',
json_agg(json_build_object('choice', st.answer, 'votes', st.votes)),
'chosen', pr.response
) AS poll
FROM sorted_tally st
LEFT JOIN polls pls
ON
pls.id = st.poll_id
LEFT JOIN poll_responses pr
ON
pr.poll_id = st.poll_id AND
pr.email = ${email}
GROUP BY st.poll_id, pls.choice_type, pr.response, pls.question
) AS results)
AS poll_results;
I have a poll_responses table which stores the user responses to a poll. I want to order the responses in exactly the same order they are stored in the polls table - as an array e.g., {Yes, No, Maybe}.
I applied the ORDER BY array_position(array(SELECT choice FROM all_choices), ft.answer) in the sorted_tally CTE.
However, in the final SELECT, after applying GROUP BY, the order is lost.
Is there a way to preserve the order of the choices?
Also, are there any optimizations applicable?
Much appreciated!
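Aggregate functions such as json_agg accept an ORDER BY clause inside the call (json_build_object itself does not, since it is not an aggregate). First, have the last CTE select the needed ordering expression as a new column, then order by that column inside json_agg in the outermost query: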
CTE
...
sorted_tally AS (
SELECT ft.votes
, ft.poll_id
, ft.answer
, array_position(array(SELECT choice FROM all_choices),
ft.answer) AS choice_order
FROM final_tally ft
)
Outermost Query
...
json_build_object(
'id', st.poll_id,
'question', pls.question,
'choice-type', pls.choice_type,
'results', json_agg(json_build_object('choice', st.answer,
'votes', st.votes)
ORDER BY st.choice_order),
'chosen', pr.response
) AS poll
ORDER BY in a CTE doesn't really matter. It may happen to work, but the database is free to reorder the rows unless you specify ORDER BY in the outermost query (or, for aggregated values, inside the aggregate call).

Postgres, how to limit number of rows returned from joined tables

I have the following query that returns the data I want. However, for the joined tables I want to limit the number of rows returned, preferably with a separate limit for each joined table.
I tried using LIMIT within the select itself, but that doesn't seem to be supported.
Is this possible? I am using Postgres 11.
select array_to_json(array_agg(t)) from (
select
tbl_327.field_43,tbl_327.field_1,tbl_327.field_2,
jsonb_agg(distinct jsonb_build_object('id',tbl_332.id,'data',tbl_332.fullname)) as field_7,
jsonb_agg(distinct jsonb_build_object('id',tbl_312.id,'data',tbl_312.fullname)) as field_33
from schema_1.tbl_327 tbl_327
left join schema_1.tbl_327_to_tbl_332_field_7 field_7 on field_7.tbl_327_id=tbl_327.id
left join schema_1.tbl_332_customid tbl_332 on tbl_332.id = field_7.tbl_332_id
left join schema_1.tbl_327_to_tbl_312_field_33 field_33 on field_33.tbl_327_id=tbl_327.id
left join schema_1.tbl_312_customid tbl_312 on tbl_312.id = field_33.tbl_312_id
group by tbl_327.field_43,tbl_327.field_1,tbl_327.field_2
) t
UPDATED
Here is my new query. I simplified it, but the issue is that it no longer returns correct data. For field_4, it is returning rows/data that are not associated with the record. Do I have something wrong?
select array_to_json(array_agg(t)) from (
select
tbl_342.field_1,tbl_342.field_2,tbl_342.id,
jsonb_agg(distinct jsonb_build_object('id',tbl_312.id,'data',tbl_312.fullname)) as field_4
from schema_1.tbl_342 tbl_342
left join lateral (
select distinct field_4.*
from schema_1.tbl_342_to_tbl_312_field_4 field_4
where field_4.tbl_342_id=tbl_342.id
limit 50) field_4 on true
left join lateral (
select distinct tbl_312.*
from schema_1.tbl_312_customid tbl_312
where tbl_312_id = field_4.tbl_312_id
limit 5
) tbl_312 on true
group by tbl_342.field_1,tbl_342.field_2,tbl_342.id
) t
One approach is to turn each left join into a lateral join; you can then set the limit within each subquery:
select array_to_json(array_agg(t)) from (
select
tbl_327.field_43,tbl_327.field_1,tbl_327.field_2,
jsonb_agg(distinct jsonb_build_object('id',tbl_332.id,'data',tbl_332.fullname)) as field_7,
jsonb_agg(distinct jsonb_build_object('id',tbl_312.id,'data',tbl_312.fullname)) as field_33
from schema_1.tbl_327 tbl_327
left join lateral (
select field_7.*
from schema_1.tbl_327_to_tbl_332_field_7 field_7
where field_7.tbl_327_id=tbl_327.id
order by ...
limit 5
) field_7 on true
left join lateral (
select tbl_332.*
from schema_1.tbl_332_customid tbl_332
where tbl_332.id = field_7.tbl_332_id
order by ??
limit 5
) tbl_332 on true
left join lateral ...
group by tbl_327.field_43,tbl_327.field_1,tbl_327.field_2
) t
Note that you need an ORDER BY to go along with LIMIT in order to get stable results. You can replace the placeholders (... and ??) in the query with the relevant column or set of columns.
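To see the "limit rows per parent, with a deterministic order" idea run end to end, here is a small sketch against in-memory SQLite. SQLite has no LATERAL, so this swaps in a different technique: ROW_NUMBER() partitioned by the parent key, filtered in the join condition. The parent and child tables, their columns, and the limit of 2 are all assumptions made up for the example.

```python
import sqlite3

# Hypothetical parent/child tables standing in for tbl_327 and a joined table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER, fullname TEXT);
    INSERT INTO parent VALUES (1, 'a'), (2, 'b');
    INSERT INTO child VALUES
        (1, 1, 'c1'), (2, 1, 'c2'), (3, 1, 'c3'),
        (4, 2, 'c4');
""")

LIMIT_PER_PARENT = 2  # assumed cap on child rows per parent

# Number the children of each parent in a fixed order (by id here), then
# keep only the first LIMIT_PER_PARENT of them when joining.
rows = conn.execute("""
    SELECT p.id, c.id
    FROM parent AS p
    LEFT JOIN (
        SELECT child.*,
               ROW_NUMBER() OVER (PARTITION BY parent_id ORDER BY id) AS rn
        FROM child
    ) AS c
      ON c.parent_id = p.id AND c.rn <= ?
    ORDER BY p.id, c.id
""", (LIMIT_PER_PARENT,)).fetchall()

print(rows)
```

The ORDER BY inside the window plays the same role as the ORDER BY next to LIMIT in the lateral subquery: without it, which rows survive the cut is arbitrary.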

Create Subquery Select for COUNT with JOIN inside the subquery

I have this COUNT subquery but I can't get the syntax to work.
SELECT products.client_id,
clients.name AS client_name,
cars.vin,
cars.make,
cars.model
(SELECT COUNT(*) FROM manheim_auction_listings AS listings_sub
JOIN products ON
manheim_auction_listings.product_id = products.id
JOIN product_purchases ON
products.current_product_purchase_id = product_purchases.id
WHERE listings_sub.car_id = manheim_auction_listings.car_id AND
listings_sub.id <> manheim_auction_listings.id and
manheim_auction_listings.product_purchase_id =
product_purchases.id) as previous_auction_count
FROM manheim_auction_listings
JOIN cars ON
cars.id = manheim_auction_listings.car_id .....
The (SELECT COUNT(*) ...) subquery will not pass a syntax check with the JOINs. I need to get the right count.
You are missing GROUP BY in your inner query.
WHERE listings_sub.car_id = manheim_auction_listings.car_id AND
listings_sub.id <> manheim_auction_listings.id and
manheim_auction_listings.product_purchase_id =
product_purchases.id
GROUP BY <list_of_column(s)>
Also, there is another syntax error:
You are missing a comma just after cars.model; columns in the select list must be separated by commas.
cars.model
(SELECT COUNT(*) FROM manheim_auction_listings AS listings_sub

How to use outer field name or alias in a nested subquery

I would like to use the selected field Referencia in the subquery.
I have tried the field name, the alias, and the table name, but none of them works.
How can I achieve this?
Thanks
SELECT * FROM
(
SELECT
articulos.Codigo AS Referencia,
articulos.Nombre AS Descripcion,
barras.Codigo AS [Codigo de Barras],
ROW_NUMBER() OVER (PARTITION BY articulos.Codigo ORDER BY
articulos.Codigo ASC) as cantidad,
articulos.Familia,
articulos.Marca,
categorias.Codigo as Categoria,
articulos.ImpuestoEspecial AS Ecotasa,
articulos.Fase,
articulos.Iva,
--
-- Tarifa1
( SELECT [Codigo],[EuroPrecio]
FROM [GES16100].[dbo].[Tarifas]
WHERE [Codigo] = 1 AND [Articulo] = <------- Here, Referencia
)AS T1,
articulos.Proveedor,
articulos.GUID_Registro
FROM [GES16100].[dbo].[Articulos] as articulos
FULL JOIN [GES16100].[dbo].[Barras] as barras
ON articulos.Codigo = barras.Articulo
FULL JOIN [GES16100].[dbo].[Categorias_Asignaciones] catasignaciones
ON catasignaciones.GUID_RegistroFichero =articulos.GUID_Registro
FULL JOIN [GES16100].[dbo].[CategoriasFicheros] categorias
ON categorias.GUID_Registro = catasignaciones.GUID_Categoria
)AS supersub
WHERE supersub.cantidad = 1
First, use table aliases and qualified column names whenever you have more than one table in a query.
Second, your subquery will not work because it returns two columns where only one is expected.
For your example, I am guessing:
( SELECT t.EuroPrecio
FROM [GES16100].[dbo].[Tarifas] t
WHERE t.Codigo = 1 AND t.Articulo = articulos.Codigo
) AS T1,
You cannot use the column alias Referencia because it is defined in the same SELECT. Just use the column it refers to, articulos.Codigo.
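The pattern can be tried out on a toy dataset. The sketch below uses in-memory SQLite with heavily simplified, hypothetical versions of the Articulos and Tarifas tables (only the columns needed here, with invented rows). The scalar subquery returns a single column and is correlated through the qualified outer column a.Codigo rather than through the alias Referencia.

```python
import sqlite3

# Simplified stand-ins for the tables in the question; the data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Articulos (Codigo TEXT PRIMARY KEY, Nombre TEXT);
    CREATE TABLE Tarifas (Codigo INTEGER, Articulo TEXT, EuroPrecio REAL);
    INSERT INTO Articulos VALUES ('A1', 'Widget'), ('A2', 'Gadget');
    INSERT INTO Tarifas VALUES (1, 'A1', 9.5), (2, 'A1', 8.0), (1, 'A2', 4.25);
""")

# The subquery yields exactly one column (EuroPrecio) and refers to the
# outer row via a.Codigo, the column behind the alias Referencia.
rows = conn.execute("""
    SELECT a.Codigo AS Referencia,
           a.Nombre AS Descripcion,
           (SELECT t.EuroPrecio
            FROM Tarifas AS t
            WHERE t.Codigo = 1
              AND t.Articulo = a.Codigo) AS T1
    FROM Articulos AS a
    ORDER BY a.Codigo
""").fetchall()

print(rows)
```

If the subquery could match more than one row per article, it would need an extra condition or a LIMIT/TOP with an ORDER BY to stay a scalar.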

Limit join to one row

I have the following query:
SELECT sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount, 'rma' as
"creditType", "Clients"."company" as "client", "Clients".id as "ClientId", "Rmas".*
FROM "Rmas" JOIN "EsnsRmas" on("EsnsRmas"."RmaId" = "Rmas"."id")
JOIN "Esns" on ("Esns".id = "EsnsRmas"."EsnId")
JOIN "EsnsSalesOrderItems" on("EsnsSalesOrderItems"."EsnId" = "Esns"."id" )
JOIN "SalesOrderItems" on("SalesOrderItems"."id" = "EsnsSalesOrderItems"."SalesOrderItemId")
JOIN "Clients" on("Clients"."id" = "Rmas"."ClientId" )
WHERE "Rmas"."credited"=false AND "Rmas"."verifyStatus" IS NOT null
GROUP BY "Clients".id, "Rmas".id;
The problem is that the table "EsnsSalesOrderItems" can have the same EsnId in different entries. I want to restrict the query to only pull the last entry in "EsnsSalesOrderItems" that has the same "EsnId".
By "last" entry I mean the following:
The one that appears last in the table "EsnsSalesOrderItems". So for example if "EsnsSalesOrderItems" has two entries with "EsnId" = 6 and "createdAt" = '2012-06-19' and '2012-07-19' respectively it should only give me the entry from '2012-07-19'.
SELECT (count(*) * sum(s."price")) AS amount
, 'rma' AS "creditType"
, c."company" AS "client"
, c.id AS "ClientId"
, r.*
FROM "Rmas" r
JOIN "EsnsRmas" er ON er."RmaId" = r."id"
JOIN "Esns" e ON e.id = er."EsnId"
JOIN (
SELECT DISTINCT ON ("EsnId") *
FROM "EsnsSalesOrderItems"
ORDER BY "EsnId", "createdAt" DESC
) es ON es."EsnId" = e."id"
JOIN "SalesOrderItems" s ON s."id" = es."SalesOrderItemId"
JOIN "Clients" c ON c."id" = r."ClientId"
WHERE r."credited" = FALSE
AND r."verifyStatus" IS NOT NULL
GROUP BY c.id, r.id;
Your query in the question has an illegal aggregate over another aggregate:
sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount
Simplified and converted to legal syntax:
(count(*) * sum(s."price")) AS amount
But do you really want to multiply with the count per group?
I retrieve the single row per group in "EsnsSalesOrderItems" with DISTINCT ON. Detailed explanation:
Select first row in each GROUP BY group?
I also added table aliases and formatting to make the query easier to parse for human eyes. If you could avoid camel case you could get rid of all the double quotes clouding the view.
Something like:
join (
select "EsnId", "SalesOrderItemId",
row_number() over (partition by "EsnId" order by "createdAt" desc) as rn
from "EsnsSalesOrderItems"
) t ON t."EsnId" = "Esns"."id" and rn = 1
This selects the latest row per "EsnId" from "EsnsSalesOrderItems" based on the "createdAt" column. You can use any column that lets you define an order on the rows that suits you.
But remember that the concept of the "last row" is only valid if you specify an order for the rows. A table as such is not ordered, nor is the result of a query unless you specify an ORDER BY.
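As a rough check of the row_number() approach, the sketch below loads the two "EsnId" = 6 rows from the question (plus one invented row for a second "EsnId") into in-memory SQLite and keeps only the newest row per "EsnId". The SalesOrderItemId values are made up, and dates are stored as ISO strings so that text ordering matches date ordering. Window functions need SQLite 3.25 or later.

```python
import sqlite3

# Cut-down "EsnsSalesOrderItems" table; SalesOrderItemId values are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EsnsSalesOrderItems (
        id INTEGER PRIMARY KEY,
        EsnId INTEGER,
        SalesOrderItemId INTEGER,
        createdAt TEXT
    );
    INSERT INTO EsnsSalesOrderItems (EsnId, SalesOrderItemId, createdAt) VALUES
        (6, 100, '2012-06-19'),
        (6, 101, '2012-07-19'),
        (7, 102, '2012-05-01');
""")

# Rank rows within each EsnId from newest to oldest, then keep rank 1.
latest = conn.execute("""
    SELECT EsnId, SalesOrderItemId, createdAt
    FROM (
        SELECT EsnId, SalesOrderItemId, createdAt,
               ROW_NUMBER() OVER (PARTITION BY EsnId ORDER BY createdAt DESC) AS rn
        FROM EsnsSalesOrderItems
    ) AS ranked
    WHERE rn = 1
    ORDER BY EsnId
""").fetchall()

print(latest)
```

For "EsnId" 6 only the '2012-07-19' entry survives, which is the behaviour the question asks for.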
Necromancing because the answers are outdated.
Take advantage of the LATERAL keyword introduced in PG 9.3
left | right | inner JOIN LATERAL
I'll explain with an example:
Assuming you have a table "Contacts".
Now contacts have organisational units.
They can have one OU at a point in time, but N OUs at N points in time.
Now, if you have to query contacts and OU in a time period (not a reporting date, but a date range), you could N-fold increase the record count if you just did a left join.
So, to display the OU, you need to just join the first OU for each contact (where what shall be first is an arbitrary criterion - when taking the last value, for example, that is just another way of saying the first value when sorted by descending date order).
In SQL Server, you would use CROSS APPLY (or rather OUTER APPLY, since we need a left join), which will invoke a table-valued function on each row it has to join.
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
-- CROSS APPLY -- = INNER JOIN
OUTER APPLY -- = LEFT JOIN
(
SELECT TOP 1
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(@in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo)
AND
(@in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
) AS FirstOE
In PostgreSQL, starting from version 9.3, you can do that, too - just use the LATERAL keyword to achieve the same:
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
LEFT JOIN LATERAL
(
SELECT
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(__in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo)
AND
(__in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
LIMIT 1
) AS FirstOE
Try using a subquery in your ON clause. An abstract example:
SELECT
*
FROM table1
JOIN table2 ON table2.id = (
SELECT id FROM table2 WHERE table2.table1_id = table1.id LIMIT 1
)
WHERE
...
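A runnable version of this abstract example, using in-memory SQLite. The table1/table2 columns and rows are hypothetical, and the sketch adds an ORDER BY inside the subquery (an assumption beyond the snippet above) so that the row picked by LIMIT 1 is well defined: here, the most recent table2 row per table1 row.

```python
import sqlite3

# Hypothetical tables matching the abstract table1/table2 example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, table1_id INTEGER, created_at TEXT);
    INSERT INTO table1 VALUES (1, 'first'), (2, 'second');
    INSERT INTO table2 VALUES
        (10, 1, '2020-01-01'), (11, 1, '2020-02-01'),
        (12, 2, '2020-03-01');
""")

# The correlated subquery in the ON clause returns the id of exactly one
# table2 row per table1 row, so the join cannot multiply rows.
rows = conn.execute("""
    SELECT table1.id, table2.id
    FROM table1
    JOIN table2 ON table2.id = (
        SELECT t2.id
        FROM table2 AS t2
        WHERE t2.table1_id = table1.id
        ORDER BY t2.created_at DESC
        LIMIT 1
    )
    ORDER BY table1.id
""").fetchall()

print(rows)
```

Without the ORDER BY, LIMIT 1 would still restrict the join to one row, but which row is chosen would be up to the database.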