How to aggregate on a left join in a Postgres CTE? - sql

In this CTE, each row in mytable can have 0 or many rows joined to it in jointable. I'm trying to returning an array_agg of the jointable's value column in this query, but I get an error saying I can't have an aggregate in a RETURNING.
WITH updated as(
UPDATE mytable SET status = 'A'
FROM
(
SELECT id FROM mytable
WHERE status = 'B'
ORDER BY mycolumn
LIMIT 100
FOR UPDATE
) sub
LEFT JOIN jointable j USING (id)
WHERE mytable.id = sub.id
GROUP BY (mytable.id)
RETURNING mytable.id, array_agg(j.value)
)
select *
from updated
ORDER BY mycolumn

You cannot have a GROUP BY clause in an UPDATE statement. Also, the UPDATE won't necessarily visit all matching rows in the joined table, so it wouldn't be able to return then anyway.
You will have to join jointable again in the outer query to get the desired result.

Related

Joined query producing more results compared to solo query

I am performing the following query which has an inner join against another table.
select count(myTable.name)
from sch2.sample_detail as myTable
inner join sch1.otherTable as otherTable on myTable.name = otherTable.name
where otherTable.is_valid = 1
and myTable.name IS NOT NULL;
This produces a count of 4912304.
The following is a query just on a single table (my table).
SELECT COUNT(myTable.name)
from sch2.sample_detail as myTable
where myTable.name IS NOT NULL;
This produces a count of 2864654.
But how is this possible? Both queries have the clause where myTable.name IS NOT NULL.
Shouldn't the second query produce same results or if not even more cos the second query doesn't have the otherTable.is_valid = 1 clause?
Why does the inner join produces a higher count of result?
Please advice if there is something I should amend in the 1st query, thanks.
Inner, left or cross join can duplicate rows. sch1.otherTable.name is not unique and this causing rows duplication because for each row in left table all corresponding rows from right table are being selected, this is normal join behavior.
To get duplicate names list use this query and decide how to remove duplicated rows: filter or distinct or filter by row_number, etc.
select count(*) cnt,
name
from sch1.otherTable
having count(*)>1
order by cnt desc;
If you need EXISTS (and do not need to select columns from otherTable), use left semi join.
Also subquery with distinct can be used to pre-aggregate name before join and filter:
select count(myTable.name)
from sch2.sample_detail as myTable
LEFT SEMI JOIN (select distinct name from sch1.otherTable otherTable where otherTable.is_valid = 1 ) as otherTable on myTable.name = otherTable.name
where myTable.name IS NOT NULL;

postgres update value based on count of associated join table

I have a simple scenario TableA, TableB and JoinTable that joins TableA and TableB. I want to store in TableA for each row in TableA the count of records in JoinTable that have TableAId. I can select it correctly as follows:
SELECT "Id", (SELECT COUNT(*) FROM "JoinTable" WHERE "JoinTable"."TableAId" = "TableA"."Id")
AS TOT FROM "TableA" LIMIT 100
However I'm having a hard time writing an update query. I want to update TableA.JoinCount with this result.
You can use a correlated subquery:
update tablea a
set tot = (
select count(*)
from jointable j
where t.tableaid = a.id
)
This updates all rows of tablea with the count of matches from jointable; if there are not matches, tot is set to 0.
I would not necessarily, however, recommend storing such derived information. While it can easily be intialized with the above statement, maintaining it is tedious. You will soon find yourself creating triggers for every DML operation on the join table (update, delete, insert). Instead, you could put the information in a view:
create view viewa as
select id,
(select count(*) from jointable j where j.tableaid = a.id) as tot
from tablea a
Side note: in general, don't use quoted identifiers in Postgres. This this link for more.
You can use the group by query as the source for an UPDATE statement:
update "TableA" a
set "JoinCount" = t.cnt
from (
select "TableAId" as id, count(*) as cnt
from "JoinTable"
group by "TableAId"
) t
WHERE t.id = a."Id"

entry cannot be referenced in this part of the query (subquery) Error

I'm getting the following error on my query:
here is an entry for table "table1", but it cannot be referenced from this part of the query.
This is my query:
SELECT id
FROM property_import_image_results table1
LEFT JOIN (
SELECT created_at
FROM property_import_image_results
WHERE external_url = table1.external_url
ORDER BY created_at DESC NULLS LAST
LIMIT 1
) as table2 ON (pimr.created_at = table2.created_at)
WHERE table2.created_at is NULL
You need a lateral join to be able to reference the outer table in the sub-select for the join.
You are also referencing an alias pimr in the join condition, which isn't available anywhere in the query. So you need to change that to table1 in the join condition.
You should also given the table in the inner query an alias to avoid confusion:
SELECT id
FROM property_import_image_results table1
LEFT JOIN LATERAL (
SELECT p2.created_at
FROM property_import_image_results p2
WHERE p2.external_url = table1.external_url
ORDER BY p2.created_at DESC NULLS LAST
LIMIT 1
) as table2 ON (table1.created_at = table2.created_at)
WHERE table2.created_at is NULL
Edit
This kind of query can also be solved using window functions:
select id
from (
select id,
max(created_at) over (partition by external_url) as max_created
FROM property_import_image_results
) t
where created_at <> max_created;
This might be faster than aggregating and joining as you do. But it's hard to tell. The lateral joins are quite efficient as well. It has the advantage that you can add any column you like to the result because no grouping is required.

Firebird group clause

I can't to understand firebird group logic
Query:
SELECT t.id FROM T1 t
INNER JOIN T2 j ON j.id = t.jid
WHERE t.id = 1
GROUP BY t.id
works perfectly
But when I try to get other fields:
SELECT * FROM T1 t
INNER JOIN T2 j ON j.id = t.jid
WHERE t.id = 1
GROUP BY t.id
I get error: Invalid expression in the select list (not contained in either an aggregate function or the GROUP BY clause)
When you use GROUP BY in your query, the field or fields specified are used as 'keys', and data rows are grouped based on unique combinations of those 2 fields. In the result set, every such unique combination has one and only one row.
In your case, the only identifier in the group is t.id. Now consider that you have 2 records in the table, both with t.id = 1, but having different values for another column, say, t.name. If you try to select both id and name columns, it directly contradicts the constraint that one group can have only one row. That is why you cannot select any field apart from the group key.
For aggregate functions it is different. That is because, when you sum or count values or get the maximum, you are basically performing that operation only based on the id field, effectively ignoring the data in the other columns. So, there is no issue because there can only be one answer to, say, count of all names with a particular id.
In conclusion, if you want to show a column in the results, you need to group by it. This will however, make the grouping more granular, which may not be desirable. In that case, you can do something like this:
select * from T1 t
where t.id in
(SELECT t.id FROM T1 t
INNER JOIN T2 j ON j.id = t.jid
WHERE t.id = 1
GROUP BY t.id)
When you using GROUP BY clause in SELECT you should use only aggreagted functions or columns that listed in GROUP BY clause. More about GROUP BY clause:http://www.firebirdsql.org/manual/nullguide-aggrfunc.html
As example:
SELECT Max(t.jid), t.id FROM T1 t
INNER JOIN T2 j ON j.id = t.jid
WHERE t.id = 1
GROUP BY t.id
SELECT * FROM T1 t
INNER JOIN T2 j ON j.id = t.jid
WHERE t.id = 1
GROUP BY t.id
This will not execute,cause you have used t.id in group by, So all your columns in select clause should be using aggregate function , else those should be included in group by clause.
Select * means you are selecting all columns, so all columns except t.id are neither in group by nor in aggregate function.
Try this link, How to use GROUP BY in firebird

Outer Join with Where returning Nulls

Hi I have 2 tables. I want to list
all records in table1 which are present in
table2
all records in table2 which are not present in table1 with a where condition
Null rows will be returned by table1 in second condition but I am unable to get the query working correctly. It is only returning null rows
SELECT
A.CLMSRNO,A.CLMPLANO,A.GENCURRCODE,A.CLMNETLOSSAMT,
A.CLMLOSSAMT,A.CLMCLAIMPRCLLOSSSHARE
FROM
PAKRE.CLMCLMENTRY A
RIGHT OUTER JOIN (
SELECT
B.CLMSRNO,B.UWADVICETYPE,B.UWADVICENO,B.UWADVPREMCURRCODE,
B.GENSUBBUSICLASS,B.UWADVICENET,B.UWADVICEKIND,B.UWADVYEAR,
B.UWADVQTR,B.ISMANUAL,B.UWCLMNOREFNO
FROM
PAKRE.UWADVICE B
WHERE
B.ISMANUAL=1
) r
ON a.CLMSRNO=r.CLMSRNO
ORDER BY
A.CLMSRNO DESC;
Which OS are you using ?
Table aliases are case sensistive on some platforms, which is why your join condition ON a.CLMSRNO=r.CLMSRNO fails.
Try with A.CLMSRNO=r.CLMSRNO and see if that works
I'm not understanding your first attempt, but here's basically what you need, I think:
SELECT *
FROM TABLE1
INNER JOIN TABLE2
ON joincondition
UNION ALL
SELECT *
FROM TABLE2
LEFT JOIN TABLE1
ON joincondition
AND TABLE1.wherecondition
WHERE TABLE1.somejoincolumn IS NULL
I think you may want to remove the subquery and put its columns into the main query e.g.
SELECT A.CLMSRNO, A.CLMPLANO, A.GENCURRCODE, A.CLMNETLOSSAMT,
A.CLMLOSSAMT, A.CLMCLAIMPRCLLOSSSHARE,
B.CLMSRNO, B.UWADVICETYPE, B.UWADVICENO, B.UWADVPREMCURRCODE,
B.GENSUBBUSICLASS, B.UWADVICENET, B.UWADVICEKIND, B.UWADVYEAR,
B.UWADVQTR, B.ISMANUAL, B.UWCLMNOREFNO
FROM PAKRE.CLMCLMENTRY A
RIGHT OUTER JOIN PAKRE.UWADVICE B
ON A.CLMSRNO = B.CLMSRNO
WHERE B.ISMANUAL = 1
ORDER
BY A.CLMSRNO DESC;