How to rewrite long query?

I have the following 2 tables:
items:
id int primary key
bla text
events:
id_items int
num int
when timestamp without time zone
ble text
composite primary key: id_items, num
I want to select, for each item, its most recent event (the one with the newest 'when').
I wrote a query, but I don't know whether it could be written more efficiently.
Also, on PostgreSQL there seems to be an issue with comparing timestamp values; these two compare as equal:
2010-05-08T10:00:00.123 == 2010-05-08T10:00:00.321
so I also select with 'MAX(num)'.
Any thoughts on how to make it better? Thanks.
SELECT i.*, ea.* FROM items AS i JOIN
( SELECT t.s AS t_s, t.c AS t_c, max(e.num) AS o FROM events AS e JOIN
( SELECT DISTINCT id_items AS s, MAX("when") AS c FROM events GROUP BY s ORDER BY c ) AS t
ON t.s = e.id_items AND e."when" = t.c GROUP BY t.s, t.c ) AS tt
ON tt.t_s = i.id JOIN events AS ea ON ea.id_items = tt.t_s AND ea."when" = tt.t_c AND ea.num = tt.o;
EDIT: I had bad data, sorry, my mistake. Thanks anyway for finding a better SQL query.

SELECT (i).*, (e).*
FROM (
SELECT i,
(
SELECT e
FROM events e
WHERE e.id_items = i.id
ORDER BY
e."when" DESC
LIMIT 1
) e
FROM items i
) q

If you're using PostgreSQL 8.4 or later, you can use a window function:
select * from (
select item.*, event.*,
row_number() over(partition by item.id order by event."when" desc) as row_number
from items item
join events event on event.id_items = item.id
) x where row_number = 1
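
As a self-contained sketch of the row_number() approach, here is a runnable version that uses Python's built-in sqlite3 module instead of PostgreSQL (window functions need SQLite 3.25 or newer). The sample rows are made up, and the extra `num DESC` tie-breaker is an assumption added for the case where two events of one item share the same "when".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER PRIMARY KEY, bla TEXT);
CREATE TABLE events (
    id_items INTEGER, num INTEGER, "when" TEXT, ble TEXT,
    PRIMARY KEY (id_items, num)
);
INSERT INTO items VALUES (1, 'a'), (2, 'b');
-- Hypothetical sample data; item 2 has two events with identical timestamps.
INSERT INTO events VALUES
    (1, 1, '2010-05-08 10:00:00.123', 'old'),
    (1, 2, '2010-05-08 10:00:00.321', 'new'),
    (2, 1, '2010-05-08 10:00:00.000', 'tie, lower num'),
    (2, 2, '2010-05-08 10:00:00.000', 'tie, higher num');
""")

# Number each item's events from newest to oldest, then keep number 1.
latest = conn.execute("""
SELECT id, num, ble FROM (
    SELECT i.id, e.num, e.ble,
           row_number() OVER (
               PARTITION BY i.id
               ORDER BY e."when" DESC, e.num DESC  -- num breaks timestamp ties
           ) AS rn
    FROM items i
    JOIN events e ON e.id_items = i.id
) AS x
WHERE rn = 1
ORDER BY id
""").fetchall()

print(latest)
```

Without the second sort key, which of the two tied rows wins for item 2 would be left to the database.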

For this kind of join, I prefer the DISTINCT ON syntax.
It's a PostgreSQL extension (not SQL standard syntax), but it comes in very handy:
SELECT DISTINCT ON (it.id)
it.*, ev.*
FROM items it, events ev
WHERE ev.id_items = it.id
ORDER by it.id, ev.when DESC;
You can't beat that, in terms of simplicity and readability.
That query assumes that every item has at least one event. If not, and if you want all
items, you'll need an outer join:
SELECT DISTINCT ON (it.id)
it.*, ev.*
FROM items it LEFT JOIN events ev
ON ev.id_items = it.id
ORDER BY it.id, ev.when DESC;
BTW: There is no "timestamp issue" in PostgreSQL; perhaps you should change the title.

Related

ORDER in CTE lost after GROUP BY

I have the following SQL
WITH tally AS (
SELECT results.answer,
results.poll_id,
count(1) AS votes
FROM (
SELECT pr.poll_id,
unnest(pr.response) AS answer
FROM poll_responses pr
LEFT JOIN polls p ON pr.poll_id = p.id
LEFT JOIN poll_collections pc ON pc.id = p.poll_collection_id
WHERE pc.id = ${pollCollectionId}
) AS results
GROUP BY results.answer, results.poll_id
),
all_choices AS (SELECT unnest(pls.choices) AS choice,
pls.id AS poll_id
FROM poll_collections pcol
INNER JOIN polls pls
ON pcol.id = pls.poll_collection_id
WHERE pcol.id = ${pollCollectionId}),
unvoted_tally AS (SELECT ac.choice AS answer,
ac.poll_id,
0 AS total
FROM all_choices ac
LEFT JOIN tally t ON t.answer = ac.choice
WHERE t.answer IS NULL),
final_tally AS (SELECT *
FROM tally
UNION
ALL
SELECT *
FROM unvoted_tally),
sorted_tally AS (
SELECT ft.*
FROM final_tally ft
ORDER BY array_position(array(SELECT choice FROM all_choices), ft.answer)
)
SELECT json_agg(poll_results.polls) AS polls
FROM (
SELECT json_array_elements(json_agg(results)) -> 'poll' AS polls
FROM (
SELECT json_build_object(
'id', st.poll_id,
'question', pls.question,
'choice-type', pls.choice_type,
'results',
json_agg(json_build_object('choice', st.answer, 'votes', st.votes)),
'chosen', pr.response
) AS poll
FROM sorted_tally st
LEFT JOIN polls pls
ON
pls.id = st.poll_id
LEFT JOIN poll_responses pr
ON
pr.poll_id = st.poll_id AND
pr.email = ${email}
GROUP BY st.poll_id, pls.choice_type, pr.response, pls.question
) AS results)
AS poll_results;
I have a poll_responses table which stores the user responses to a poll. I want to order the responses in exactly the same order as the choices are stored in the polls table, which holds them as an array, e.g., {Yes, No, Maybe}.
I applied the ORDER BY array_position(array(SELECT choice FROM all_choices), ft.answer) in the sorted_tally CTE.
However, in the final SELECT, after applying GROUP BY, the order is lost.
Is there a way to preserve the order of the choices?
Also, are there any optimizations applicable?
Much appreciated!

Aggregate functions such as json_agg accept an ORDER BY clause inside the call. First, have the last CTE select the needed ordering expression as a new column, then order by that column inside json_agg in the outermost query:
CTE
...
sorted_tally AS (
SELECT ft.votes
, ft.poll_id
, ft.answer
, array_position(array(SELECT choice FROM all_choices),
ft.answer) AS choice_order
FROM final_tally ft
)
Outermost Query
...
json_build_object(
'id', st.poll_id,
'question', pls.question,
'choice-type', pls.choice_type,
'results', json_agg(json_build_object('choice', st.answer,
'votes', st.votes)
ORDER BY st.choice_order),
'chosen', pr.response
) AS poll

ORDER BY in a CTE doesn't really matter. It may work, but the database is free to re-order the rows unless you specify ORDER BY in the outermost query to order all the results.
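
A minimal sketch of that point, run against SQLite through Python's sqlite3 module rather than PostgreSQL. The two tables and the `position` column are hypothetical stand-ins for the poll schema: the CTE only carries the sort key along as `choice_order`, and the ordering itself is applied in the outermost query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified schema: one row per choice with its position.
CREATE TABLE choices (poll_id INTEGER, position INTEGER, choice TEXT);
CREATE TABLE votes (poll_id INTEGER, answer TEXT);
INSERT INTO choices VALUES (1, 1, 'Yes'), (1, 2, 'No'), (1, 3, 'Maybe');
INSERT INTO votes VALUES (1, 'No'), (1, 'Maybe'), (1, 'No');
""")

rows = conn.execute("""
WITH tally AS (
    -- No ORDER BY here: the CTE just exposes the sort key as a column.
    SELECT c.poll_id, c.choice, c.position AS choice_order,
           count(v.answer) AS votes
    FROM choices c
    LEFT JOIN votes v ON v.poll_id = c.poll_id AND v.answer = c.choice
    GROUP BY c.poll_id, c.choice, c.position
)
SELECT choice, votes
FROM tally
ORDER BY poll_id, choice_order  -- the ordering that is actually guaranteed
""").fetchall()

print(rows)
```

The LEFT JOIN also yields a zero count for choices nobody voted for, so in this simplified shape no separate "unvoted" branch is needed.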

How to assign unique key to values using array_agg() function in bigquery

I am trying to assign a key to each distinct value in BigQuery with row_number(), but it fails with a resources-exceeded error. Can I achieve the same thing with the array_agg() function?
code :
select a.values
, a.type
, max_key + row_number() over(order BY a.values) key
, a.event_date
from gaid_raw a
LEFT JOIN existing_key_table e
on e.type = a.type
and e.values = a.values
left join (
select type, coalesce(max(key),0) max_key from existing_key_table group by 1
) e1
on e1.type = a.type
where e.key is null

I'm not sure if this will fix your problem, but I think this is the logic you want:
select gr.values, gr.type,
coalesce(max_key, 0) + row_number() over (partition by gr.type order by gr.values) as key,
gr.event_date
from gaid_raw gr left join
(select type, max(key) as max_key
from existing_key_table
group by 1
) e
on e.type = gr.type
where not exists (select 1
from existing_key_table e
where e.type = gr.type and e.values = gr.values
);
For unrecognized types, you need the coalesce() in the outer select, not the subquery.
You also seem to want to assign sequential numbers based on the type.
If you still get resource errors, there is a way to fix this, but a bit more information is needed about the data. However, I have in the past used random values for such keys -- assuming the ordering is not needed. There is such a small chance of collision that it has worked on fairly large data.
Now, I would use GENERATE_UUID() for a unique id.
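
To illustrate the numbering logic of the query above, here is a small sketch that runs it on SQLite via Python's sqlite3 module instead of BigQuery, so it says nothing about the resource error itself. The column names `val` and `key_id` and the sample rows are assumptions chosen to avoid keyword clashes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical data: type 'a' already has keys 1 and 2, type 'b' has none.
CREATE TABLE existing_key_table (type TEXT, val TEXT, key_id INTEGER);
CREATE TABLE gaid_raw (type TEXT, val TEXT);
INSERT INTO existing_key_table VALUES ('a', 'x', 1), ('a', 'y', 2);
INSERT INTO gaid_raw VALUES ('a', 'y'), ('a', 'z'), ('a', 'w'), ('b', 'q');
""")

new_keys = conn.execute("""
SELECT gr.type, gr.val,
       -- coalesce in the outer select covers types with no existing keys
       coalesce(e.max_key, 0)
         + row_number() OVER (PARTITION BY gr.type ORDER BY gr.val) AS key_id
FROM gaid_raw gr
LEFT JOIN (
    SELECT type, max(key_id) AS max_key
    FROM existing_key_table
    GROUP BY type
) e ON e.type = gr.type
WHERE NOT EXISTS (          -- skip values that already have a key
    SELECT 1 FROM existing_key_table k
    WHERE k.type = gr.type AND k.val = gr.val
)
ORDER BY gr.type, gr.val
""").fetchall()

print(new_keys)
```

New values of type 'a' continue after the existing maximum, while the previously unseen type 'b' starts at 1.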

SQL Most Recent Register FROM Second Table by Id

I have 2 tables (Opportunity and Stage). I need to get each opportunity with the most recent stage by StageTypeId.
Opportunity: Id, etc
Stage: Id, CreatedOn, OpportunityId, StageTypeId.
Let's suppose I have "opportunity1" and "opportunity2" each one with many Stages added.
By passing the StageTypeId I need to get the opportunity which has this StageTypeId as most recent.
I'm trying the following query, but it's replicating the same Stage for all the Opportunities.
It seems that it's ignoring this line: "AND {Stage}.[OpportunityId] = ID"
SELECT {Opportunity}.[Id] ID,
{Opportunity}.[Name],
{Opportunity}.[PotentialAmount],
{Contact}.[FirstName],
{Contact}.[LastName],
(SELECT * FROM
(
SELECT {Stage}.[StageTypeId]
FROM {Stage}
WHERE {Stage}.[StageTypeId] = #StageTypeId
AND {Stage}.[OpportunityId] = ID
ORDER BY {Stage}.[CreatedOn] DESC
)
WHERE ROWNUM = 1) AS StageTypeId
FROM {Opportunity}
LEFT JOIN {Contact}
ON {Opportunity}.[ContactId] = {Contact}.[Id]
Thank you

Most DBMSs support the FETCH FIRST clause, so you can do:
select o.*
from Opportunity o
where o.StageTypeId = (select s.StageTypeId
from Stage s
where s.OpportunityId = o.id
order by s.CreatedOn desc
fetch first 1 rows only
);

You can try the approach below, which should work on most DBMSs:
select TT.*, o.* from
(
select s1.OpportunityId, t.StageTypeId from Stage s1 inner join
(select StageTypeId, max(CreatedOn) as createdate from Stage s
group by StageTypeId
) t
on s1.StageTypeId = t.StageTypeId and s1.CreatedOn = t.createdate
) as TT inner join Opportunity o on TT.OpportunityId = o.id

PostgreSQL - how to query "result IN ALL OF"?

I am new to PostgreSQL and I have a problem with the following query:
WITH relevant_einsatz AS (
SELECT einsatz.fahrzeug,einsatz.mannschaft
FROM einsatz
INNER JOIN bergefahrzeug ON einsatz.fahrzeug = bergefahrzeug.id
),
relevant_mannschaften AS (
SELECT DISTINCT relevant_einsatz.mannschaft
FROM relevant_einsatz
WHERE relevant_einsatz.fahrzeug IN (SELECT id FROM bergefahrzeug)
)
SELECT mannschaft.id,mannschaft.rufname,person.id,person.nachname
FROM mannschaft,person,relevant_mannschaften WHERE mannschaft.leiter = person.id AND relevant_mannschaften.mannschaft=mannschaft.id;
This query basically works, but in "relevant_mannschaften" I am currently selecting each mannschaft which has been on a relevant_einsatz with at least 1 bergefahrzeug.
Instead, I want to select into "relevant_mannschaften" each mannschaft which has been on a relevant_einsatz WITH EACH bergefahrzeug.
Does anybody know how to formulate this change?

The information you provide is rather rudimentary. But tuning into my mentalist skills, going out on a limb, I would guess this untangled version of the query does the job much faster:
SELECT m.id, m.rufname, p.id, p.nachname
FROM person p
JOIN mannschaft m ON m.leiter = p.id
JOIN (
SELECT e.mannschaft
FROM einsatz e
JOIN bergefahrzeug b ON b.id = e.fahrzeug -- may be redundant
GROUP BY e.mannschaft
HAVING count(DISTINCT e.fahrzeug)
= (SELECT count(*) FROM bergefahrzeug)
) e ON e.mannschaft = m.id
Explanation:
In the subquery e I count how many DISTINCT mountain vehicles (bergefahrzeug) have been used by a team (mannschaft) in all their deployments (einsatz): count(DISTINCT e.fahrzeug)
If that number matches the count in table bergefahrzeug: (SELECT count(*) FROM bergefahrzeug) - the team qualifies according to your description.
The rest of the query just fetches details from matching rows in mannschaft and person.
You don't need this line at all, if there are no other vehicles in play than those in bergefahrzeug:
JOIN bergefahrzeug b ON b.id = e.fahrzeug
Basically, this is a special application of relational division. A lot more on the topic under this related question:
How to filter SQL results in a has-many-through relation
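
The counting form of relational division described above can be tried out with this small sketch, which runs on SQLite through Python's sqlite3 module rather than PostgreSQL. The tables are cut down to the two columns the subquery needs, and the sample rows (including vehicle 99, which is not a bergefahrzeug) are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bergefahrzeug (id INTEGER PRIMARY KEY);
CREATE TABLE einsatz (mannschaft INTEGER, fahrzeug INTEGER);
INSERT INTO bergefahrzeug VALUES (1), (2);
INSERT INTO einsatz VALUES
    (10, 1), (10, 2), (10, 2),  -- team 10 used both mountain vehicles
    (20, 1), (20, 1),           -- team 20 used only vehicle 1
    (30, 2), (30, 99);          -- team 30 used vehicle 2 and another vehicle
""")

# A team qualifies when the number of distinct mountain vehicles it used
# equals the total number of mountain vehicles.
teams = [row[0] for row in conn.execute("""
SELECT e.mannschaft
FROM einsatz e
JOIN bergefahrzeug b ON b.id = e.fahrzeug
GROUP BY e.mannschaft
HAVING count(DISTINCT e.fahrzeug) = (SELECT count(*) FROM bergefahrzeug)
ORDER BY e.mannschaft
""")]

print(teams)
```

In this data the join is not redundant: it filters out vehicle 99, so team 30 is counted with one mountain vehicle only and does not qualify.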

I don't know how to explain it, but here is an example of how I solved this problem, just in case somebody has the same question one day.
WITH dfz AS (
SELECT DISTINCT fahrzeug,mannschaft FROM einsatz WHERE einsatz.fahrzeug IN (SELECT id FROM bergefahrzeug)
), abc AS (
SELECT DISTINCT mannschaft FROM dfz
), einsatzmannschaften AS (
SELECT abc.mannschaft FROM abc WHERE (SELECT sum(dfz.fahrzeug) FROM dfz WHERE dfz.mannschaft = abc.mannschaft) = (SELECT sum(bergefahrzeug.id) FROM bergefahrzeug)
)
SELECT mannschaft.id,mannschaft.rufname,person.id,person.nachname
FROM mannschaft,person,einsatzmannschaften WHERE mannschaft.leiter = person.id AND einsatzmannschaften.mannschaft=mannschaft.id;

Limit join to one row

I have the following query:
SELECT sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount, 'rma' as
"creditType", "Clients"."company" as "client", "Clients".id as "ClientId", "Rmas".*
FROM "Rmas" JOIN "EsnsRmas" on("EsnsRmas"."RmaId" = "Rmas"."id")
JOIN "Esns" on ("Esns".id = "EsnsRmas"."EsnId")
JOIN "EsnsSalesOrderItems" on("EsnsSalesOrderItems"."EsnId" = "Esns"."id" )
JOIN "SalesOrderItems" on("SalesOrderItems"."id" = "EsnsSalesOrderItems"."SalesOrderItemId")
JOIN "Clients" on("Clients"."id" = "Rmas"."ClientId" )
WHERE "Rmas"."credited"=false AND "Rmas"."verifyStatus" IS NOT null
GROUP BY "Clients".id, "Rmas".id;
The problem is that the table "EsnsSalesOrderItems" can have the same EsnId in different entries. I want to restrict the query to only pull the last entry in "EsnsSalesOrderItems" that has the same "EsnId".
By "last" entry I mean the following:
The one that appears last in the table "EsnsSalesOrderItems". So for example if "EsnsSalesOrderItems" has two entries with "EsnId" = 6 and "createdAt" = '2012-06-19' and '2012-07-19' respectively it should only give me the entry from '2012-07-19'.

SELECT (count(*) * sum(s."price")) AS amount
, 'rma' AS "creditType"
, c."company" AS "client"
, c.id AS "ClientId"
, r.*
FROM "Rmas" r
JOIN "EsnsRmas" er ON er."RmaId" = r."id"
JOIN "Esns" e ON e.id = er."EsnId"
JOIN (
SELECT DISTINCT ON ("EsnId") *
FROM "EsnsSalesOrderItems"
ORDER BY "EsnId", "createdAt" DESC
) es ON es."EsnId" = e."id"
JOIN "SalesOrderItems" s ON s."id" = es."SalesOrderItemId"
JOIN "Clients" c ON c."id" = r."ClientId"
WHERE r."credited" = FALSE
AND r."verifyStatus" IS NOT NULL
GROUP BY c.id, r.id;
Your query in the question has an illegal aggregate over another aggregate:
sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount
Simplified and converted to legal syntax:
(count(*) * sum(s."price")) AS amount
But do you really want to multiply with the count per group?
I retrieve the single row per group in "EsnsSalesOrderItems" with DISTINCT ON. Detailed explanation:
Select first row in each GROUP BY group?
I also added table aliases and formatting to make the query easier to parse for human eyes. If you could avoid camel case you could get rid of all the double quotes clouding the view.

Something like:
join (
select "EsnId",
row_number() over (partition by "EsnId" order by "createdAt" desc) as rn
from "EsnsSalesOrderItems"
) t ON t."EsnId" = "Esns"."id" and rn = 1
This will select the latest row per "EsnId" from "EsnsSalesOrderItems" based on the column "createdAt". As you didn't post the structure of your tables, I had to assume the column name. You can use any column that allows you to define an order on the rows that suits you.
But remember the concept of the "last row" is only valid if you specify an order for the rows. A table as such is not ordered, nor is the result of a query unless you specify an ORDER BY.

Necromancing because the answers are outdated.
Take advantage of the LATERAL keyword introduced in PG 9.3
left | right | inner JOIN LATERAL
I'll explain with an example:
Assuming you have a table "Contacts".
Now contacts have organisational units.
They can have one OU at a point in time, but N OUs at N points in time.
Now, if you have to query contacts and OU in a time period (not a reporting date, but a date range), you could N-fold increase the record count if you just did a left join.
So, to display the OU, you need to just join the first OU for each contact (where what shall be first is an arbitrary criterion - when taking the last value, for example, that is just another way of saying the first value when sorted by descending date order).
In SQL-server, you would use cross-apply (or rather OUTER APPLY since we need a left join), which will invoke a table-valued function on each row it has to join.
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
-- CROSS APPLY -- = INNER JOIN
OUTER APPLY -- = LEFT JOIN
(
SELECT TOP 1
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(#in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo)
AND
(#in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
) AS FirstOE
In PostgreSQL, starting from version 9.3, you can do that, too - just use the LATERAL keyword to achieve the same:
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
LEFT JOIN LATERAL
(
SELECT
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(__in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo)
AND
(__in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
LIMIT 1
) AS FirstOE

Try using a subquery in your ON clause. An abstract example:
SELECT
*
FROM table1
JOIN table2 ON table2.id = (
SELECT id FROM table2 WHERE table2.table1_id = table1.id LIMIT 1
)
WHERE
...
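
A concrete, runnable variant of this pattern, using SQLite through Python's sqlite3 module instead of PostgreSQL. The snake_case table and column names and the sample rows are assumptions loosely modelled on the question. An ORDER BY is added inside the subquery so that "one row" reliably means the most recent one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified tables.
CREATE TABLE esns (id INTEGER PRIMARY KEY);
CREATE TABLE esns_sales_order_items (
    id INTEGER PRIMARY KEY, esn_id INTEGER, created_at TEXT
);
INSERT INTO esns VALUES (6), (7);
INSERT INTO esns_sales_order_items VALUES
    (1, 6, '2012-06-19'), (2, 6, '2012-07-19'), (3, 7, '2012-01-01');
""")

rows = conn.execute("""
SELECT e.id, s.id, s.created_at
FROM esns e
JOIN esns_sales_order_items s ON s.id = (
    -- correlated subquery: the newest entry for this esn
    SELECT s2.id
    FROM esns_sales_order_items s2
    WHERE s2.esn_id = e.id
    ORDER BY s2.created_at DESC, s2.id DESC
    LIMIT 1
)
ORDER BY e.id
""").fetchall()

print(rows)
```

Esn 6 has two entries, and only the one from 2012-07-19 survives the join.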
