"plan should not reference subplan's variable" error - sql

I need to update 2 columns in a table with values from another table
UPDATE transakcje t SET s_dzien = s_dzien0, s_cena = s_cena0
FROM
(SELECT c.price AS s_cena0, c.dzien AS s_dzien0 FROM ciagle c
WHERE c.dzien = t.k_dzien ORDER BY s_cena0 DESC LIMIT 1) AS zza;
But I got an error:
plan should not reference subplan's variable.
DB structure is as simple as possible: transakcje has k_dzien, k_cena, s_dzien, s_cena and ciagle has fields price, dzien.
I'm running PostgreSQL 9.3.
Edit
I want to update all records from transakcje.
For each row I must find one row from ciagle with same dzien and maximum price and save this price and dzien into transakcje.
In ciagle there are many rows with the same dzien (column is not distinct).

Problem
The form you had:
UPDATE tbl t
SET ...
FROM (SELECT ... WHERE col = t.col LIMIT 1) sub
... is illegal to begin with. As the error message tells you, a subquery cannot reference the table in the UPDATE clause. Items in the FROM list generally cannot reference other items on the same level (except with LATERAL in Postgres 9.3 or later). And the table in the UPDATE clause can never be referenced by subqueries in the FROM clause (and that hasn't changed in Postgres 9.3).
Even if that were possible, the result would be nonsense for two reasons:
The subquery with LIMIT 1 produces exactly one row (total), while you obviously want a specific value per dzien:
one row from ciagle with same dzien
Once you amend that and compute one price per dzien, you would end up with something like a cross join unless you add a WHERE condition to unambiguously join the result from the subquery to the table to be updated. Quoting the manual on UPDATE:
In other words, a target row shouldn't join to more than one row from
the other table(s). If it does, then only one of the join rows will be
used to update the target row, but which one will be used is not readily predictable.
Solution
All of this taken into account your query could look like this:
UPDATE transakcje t
SET s_dzien = c.dzien
, s_cena = c.price
FROM (
SELECT DISTINCT ON (dzien)
dzien, price
FROM ciagle
ORDER BY dzien, price DESC
) c
WHERE t.k_dzien = c.dzien
AND (t.s_dzien IS DISTINCT FROM c.dzien OR
t.s_cena IS DISTINCT FROM c.price)
Get the highest price for every dzien in ciagle in a subquery with DISTINCT ON. Details:
Select first row in each GROUP BY group?
As @wildplasser commented, if all you need is the highest price, you could also use the aggregate function max() instead of DISTINCT ON:
...
FROM (
SELECT dzien, max(price) AS price
FROM ciagle
GROUP BY dzien
) c
...
transakcje ends up with the same value in s_dzien and k_dzien where related rows are present in ciagle.
The added WHERE clause prevents empty updates, which you probably don't want: only cost and no effect (except for exotic special cases with triggers et al.) - a common oversight.
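A minimal sketch of the same idea, run through Python's sqlite3 module as a stand-in for PostgreSQL. The table and column names follow the question; the sample values are invented for illustration. It previews which rows the UPDATE would touch: the aggregate subquery yields one row per dzien, and the null-safe comparison (SQLite's IS NOT, used here in place of IS DISTINCT FROM) filters out rows that already hold the right values.

```python
import sqlite3

# SQLite stands in for PostgreSQL; sample data is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ciagle (dzien INTEGER, price REAL);
    CREATE TABLE transakcje (k_dzien INTEGER, s_dzien INTEGER, s_cena REAL);
    INSERT INTO ciagle VALUES (1, 10.0), (1, 12.5), (2, 7.0), (2, 6.0);
    INSERT INTO transakcje VALUES (1, NULL, NULL), (2, 2, 7.0), (3, NULL, NULL);
""")

# The subquery returns exactly one row per dzien (the highest price).
# "IS NOT" is SQLite's null-safe inequality, standing in for IS DISTINCT FROM.
rows_to_update = conn.execute("""
    SELECT t.k_dzien, c.dzien, c.price
    FROM transakcje t
    JOIN (SELECT dzien, max(price) AS price
          FROM ciagle
          GROUP BY dzien) c ON t.k_dzien = c.dzien
    WHERE t.s_dzien IS NOT c.dzien
       OR t.s_cena IS NOT c.price
    ORDER BY t.k_dzien
""").fetchall()

print(rows_to_update)
```

In this sample only the row with k_dzien = 1 qualifies: k_dzien = 2 already holds the target values, and k_dzien = 3 has no match in ciagle.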

Related

Get 5 most recent records for each Number

I have a table in Access with a Part Number and a PriceYr and Price associated with each Part Number. There are over 10,000 records; PartNumber values repeat, each with different PriceYr and Price values. I need a query that finds the 5 most recent prices, with their dates, for each PartNumber.
I tried using MAX(PriceYr); however, it only returns the single most recent record for each PartNumber.
I also tried the following query but it doesn't seem to work.
SELECT Catalogs.PartNumber,Catalogs.PriceYr, Catalogs.Price FROM Catalogs
WHERE Catalogs.PriceYr in
(SELECT TOP 5 Catalogs.PriceYr
FROM Catalogs as Temp
WHERE Temp.PartNumber = Catalogs.PartNumber
ORDER By Catalogs.PriceYr DESC)
Any help will be greatly appreciated. Thank you.
Consider a correlated count subquery to filter by a rank variable. Right now, you pull top 5 overall on matching PartNumber not per PartNumber.
SELECT main.*
FROM
(SELECT c.PartNumber, c.PriceYr, c.Price,
(SELECT Count(*)
FROM Catalogs AS Temp
WHERE Temp.PartNumber = c.PartNumber
AND Temp.PriceYr >= c.PriceYr) As rank
FROM Catalogs c
) As main
WHERE main.rank <= 5
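The correlated count can be tried on a small data set. This is only a sketch: it uses Python's sqlite3 module instead of Access, invented sample rows, a limit of 2 instead of 5 to keep the data small, and the alias rnk (a name chosen here) for the computed rank.

```python
import sqlite3

# SQLite stands in for Access; sample data is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Catalogs (PartNumber TEXT, PriceYr INTEGER, Price REAL);
    INSERT INTO Catalogs VALUES
        ('A', 2010, 1.0), ('A', 2011, 1.1), ('A', 2012, 1.2),
        ('B', 2011, 5.0), ('B', 2012, 5.5);
""")

TOP_N = 2  # the question asks for 5; 2 keeps the sample small

# For each row, count the rows of the same part with an equal or later year.
# That count is the row's rank within its PartNumber.
top_rows = conn.execute("""
    SELECT main.PartNumber, main.PriceYr, main.Price
    FROM (SELECT c.PartNumber, c.PriceYr, c.Price,
                 (SELECT count(*)
                  FROM Catalogs AS Temp
                  WHERE Temp.PartNumber = c.PartNumber
                    AND Temp.PriceYr >= c.PriceYr) AS rnk
          FROM Catalogs c) AS main
    WHERE main.rnk <= ?
    ORDER BY main.PartNumber, main.PriceYr DESC
""", (TOP_N,)).fetchall()

print(top_rows)
```

Rows that tie on PriceYr within a part share the same count, so a part with ties at the cutoff can return fewer than N rows.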
MAX() is an aggregating function, meaning that it groups all the data and takes the maximal value in the specified column. You need to use a GROUP BY statement to prevent the query from grouping the whole dataset in a single row.
On the other hand, your query seems to needlessly use a subquery. The following query should work quite fine :
SELECT TOP 5 c.PartNumber, c.PriceYr, c.Price
FROM Catalogs c
WHERE c.PartNumber = @partNumber -- if you want the query to
-- work on a specific part number
ORDER BY c.PriceYr DESC
(please post a table creation query to make sure this example works)

Oracle subquery in select

I have a table that keeps costs of products. I'd like to get the average cost AND last buying invoice for each product.
My solution was creating a sub-select to get last buying invoice but unfortunately I'm getting
ORA-00904: "B"."CODPROD": invalid identifier
My query is
SELECT (b.cod_aux) product,
-- here goes code to get average cost,
(SELECT round(valorultent, 2)
FROM (SELECT valorultent
FROM pchistest
WHERE codprod = b.codprod
ORDER BY dtultent DESC)
WHERE ROWNUM = 1)
FROM pchistest a, pcembalagem b
WHERE a.codprod = b.codprod
GROUP BY a.codprod, b.cod_aux
ORDER BY b.cod_aux
In short, what I'm doing in the sub-select is sorting in descending order and taking the first row for the given product b.codprod.
Your problem is that you can't use your aliased columns deeper than one sub-query. According to the comments, this was changed in 12C, but I haven't had a chance to try it as the data warehouse that I use is still on 11g.
I would use something like this:
SELECT b.cod_aux AS product
,ROUND (r.valorultent, 2) AS valorultent
FROM pchistest a
JOIN pcembalagem b ON (a.codprod = b.codprod)
JOIN (SELECT valorultent
,codprod
,ROW_NUMBER() OVER (PARTITION BY codprod
ORDER BY dtultent DESC)
AS row_no
FROM pchistest) r ON (r.row_no = 1 AND r.codprod = b.codprod)
GROUP BY a.codprod, b.cod_aux, r.valorultent
ORDER BY b.cod_aux
I avoid sub-queries in SELECT statements. Most of the time, the optimizer wants to run a SELECT for each item in the cursor, OR it does some crazy nested loops. If you do it as a sub-query in the JOIN, Oracle will normally process the rows that you are joining; normally, it is more efficient. Finally, complete your per item functions (in this case, the ROUND) in the final product. This will prevent Oracle from doing it on ALL rows, not just the ones you use. It should do it correctly, but it can get confused on complex queries.
The ROW_NUMBER() OVER (PARTITION BY ..) is where the magic happens. This adds a row number to each group of CODPRODs. This allows you to pluck the top row from each CODPROD, so this allows you to get the newest/oldest/greatest/least/etc from your sub-query. It is also great for filtering duplicates.
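As a rough illustration of picking the top row per group with ROW_NUMBER(), here is a sketch run through Python's sqlite3 module rather than Oracle (it assumes an SQLite build with window functions, version 3.25 or later). The column names follow the question; the sample rows are invented.

```python
import sqlite3

# SQLite (3.25+) stands in for Oracle; sample data is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pchistest (codprod INTEGER, valorultent REAL, dtultent TEXT);
    INSERT INTO pchistest VALUES
        (1, 10.126, '2014-01-05'), (1, 11.5, '2014-03-01'),
        (2, 3.0, '2014-02-10'), (2, 2.754, '2014-02-20');
""")

# Number the rows of each codprod from newest to oldest, keep only number 1,
# and apply ROUND in the outer query so it runs on the selected rows only.
latest = conn.execute("""
    SELECT r.codprod, round(r.valorultent, 2) AS valorultent
    FROM (SELECT codprod, valorultent,
                 row_number() OVER (PARTITION BY codprod
                                    ORDER BY dtultent DESC) AS row_no
          FROM pchistest) r
    WHERE r.row_no = 1
    ORDER BY r.codprod
""").fetchall()

print(latest)
```

Changing the ORDER BY inside OVER (for example to ascending) selects the oldest row per product instead.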

Postgres Error: More than one row returned by a subquery used as an expression

I have two separate databases. I am trying to update a column in one database to the values of a column from the other database:
UPDATE customer
SET customer_id=
(SELECT t1 FROM dblink('port=5432, dbname=SERVER1 user=postgres password=309245',
'SELECT store_key FROM store') AS (t1 integer));
This is the error I am receiving:
ERROR: more than one row returned by a subquery used as an expression
Any ideas?
Technically, to remove the error, add LIMIT 1 to the subquery to return at most 1 row. The statement would still be nonsense.
... 'SELECT store_key FROM store LIMIT 1' ...
Practically, you want to match rows somehow instead of picking an arbitrary row from the remote table store to update every row of your local table customer.
I assume a text column match_name in both tables (UNIQUE in store) for the sake of this example:
... 'SELECT store_key FROM store
WHERE match_name = ' || quote_literal(customer.match_name) ...
But that's an extremely expensive way of doing things.
Ideally, you completely rewrite the statement.
UPDATE customer c
SET customer_id = s.store_key
FROM dblink('port=5432, dbname=SERVER1 user=postgres password=309245'
, 'SELECT match_name, store_key FROM store')
AS s(match_name text, store_key integer)
WHERE c.match_name = s.match_name
AND c.customer_id IS DISTINCT FROM s.store_key;
This remedies a number of problems in your original statement.
Obviously, the basic error is fixed.
It's typically better to join in additional relations in the FROM clause of an UPDATE statement than to run correlated subqueries for every individual row.
When using dblink, the above becomes a thousand times more important. You do not want to call dblink() for every single row, that's extremely expensive. Call it once to retrieve all rows you need.
With correlated subqueries, if no row is found in the subquery, the column gets updated to NULL, which is almost always not what you want. In my updated query, the row only gets updated if a matching row is found. Else, the row is not touched.
Normally, you wouldn't want to update rows, when nothing actually changes. That's expensively doing nothing (but still produces dead rows). The last expression in the WHERE clause prevents such empty updates:
AND c.customer_id IS DISTINCT FROM s.store_key
Related:
How do I (or can I) SELECT DISTINCT on multiple columns?
The fundamental problem can often be simply solved by changing an = to IN, in cases where you've got a one-to-many relationship. For example, if you wanted to update or delete a bunch of accounts for a given customer:
WITH accounts_to_delete AS
(
SELECT account_id
FROM accounts a
INNER JOIN customers c
ON a.customer_id = c.id
WHERE c.customer_name='Some Customer'
)
-- this fails if "Some Customer" has multiple accounts, but works if there's 1:
DELETE FROM accounts
WHERE accounts.guid =
(
SELECT account_id
FROM accounts_to_delete
);
-- this succeeds with any number of accounts:
DELETE FROM accounts
WHERE accounts.guid IN
(
SELECT account_id
FROM accounts_to_delete
);
This means your nested SELECT returns more than one row.
You need to add a proper WHERE clause to it.
This error means that the SELECT store_key FROM store query has returned two or more rows in the SERVER1 database. If you would like to update all customers, use a join instead of a scalar = operator. You need a condition to "connect" customers to store items in order to do that.
If you wish to update all customer_ids to the same store_key, you need to supply a WHERE clause to the remotely executed SELECT so that the query returns a single row.
Use LIMIT 1 so that the subquery returns only one row.
Example
customerId = (select id from enumeration where enumeration.name = 'Ready To Invoice' limit 1)
The subquery returns more than one row, and that has to be handled. You can resolve this in the query by:
1. limiting the subquery so that it returns a single row
2. using "select max(column)", which returns a single row

Oracle Select Max Date on Multiple records

I've got the following SELECT statement, and based on what I've seen here: SQL Select Max Date with Multiple records I've got my example set up the same way. I'm on Oracle 11g. Instead of returning one record for each asset_tag, it's returning multiples. Not as many records as in the source table, but more than (I think) it should be. If I run the inner SELECT statement, it also returns the correct set of records (1 per asset_tag), which really has me stumped.
SELECT
outside.asset_tag,
outside.description,
outside.asset_type,
outside.asset_group,
outside.status_code,
outside.license_no,
outside.rentable_yn,
outside.manufacture_code,
outside.model,
outside.manufacture_vin,
outside.vehicle_yr,
outside.meter_id,
outside.mtr_uom,
outside.mtr_reading,
outside.last_read_date
FROM mp_vehicle_asset_profile outside
RIGHT OUTER JOIN
(
SELECT asset_tag, max(last_read_date) as last_read_date
FROM mp_vehicle_asset_profile
group by asset_tag
) inside
ON outside.last_read_date=inside.last_read_date
Any suggestions?
Try with analytical functions:
SELECT outside.asset_tag,
outside.description,
outside.asset_type,
outside.asset_group,
outside.status_code,
outside.license_no,
outside.rentable_yn,
outside.manufacture_code,
outside.model,
outside.manufacture_vin,
outside.vehicle_yr,
outside.meter_id,
outside.mtr_uom,
outside.mtr_reading,
outside.last_read_date
FROM ( SELECT t.*, ROW_NUMBER() OVER(PARTITION BY asset_tag ORDER BY last_read_date DESC) Corr
FROM mp_vehicle_asset_profile t) outside
WHERE Corr = 1
I think you need to add...
AND outside.asset_tag=inside.asset_tag
...to the criteria in your ON list.
Also, a RIGHT OUTER JOIN is not needed. An INNER JOIN will give the same results (and may be more efficient), since there cannot be combinations of asset_tag and last_read_date in the subquery that do not exist in mp_vehicle_asset_profile.
Even then, the query may return more than one row per asset tag if there are "ties" -- that is, multiple rows with the same last_read_date. In contrast, @Lamak's analytic-based answer will arbitrarily pick exactly one row in this situation.
Your comment suggests that you want to break ties by picking the row with highest mtr_reading for the last_read_date.
You could modify @Lamak's analytic-based answer to do this by changing the ORDER BY in the OVER clause to:
ORDER BY last_read_date DESC, mtr_reading DESC
If there are still ties (that is, multiple rows with the same asset_tag, last_read_date, and mtr_reading), the query will again arbitrarily pick exactly one row.
You could modify my aggregate-based answer to break ties using highest mtr_reading as follows:
SELECT
outside.asset_tag,
outside.description,
outside.asset_type,
outside.asset_group,
outside.status_code,
outside.license_no,
outside.rentable_yn,
outside.manufacture_code,
outside.model,
outside.manufacture_vin,
outside.vehicle_yr,
outside.meter_id,
outside.mtr_uom,
outside.mtr_reading,
outside.last_read_date
FROM
mp_vehicle_asset_profile outside
INNER JOIN
(
SELECT
asset_tag,
MAX(last_read_date) AS last_read_date,
MAX(mtr_reading) KEEP (DENSE_RANK FIRST ORDER BY last_read_date DESC) AS mtr_reading
FROM
mp_vehicle_asset_profile
GROUP BY
asset_tag
) inside
ON
outside.asset_tag = inside.asset_tag
AND
outside.last_read_date = inside.last_read_date
AND
outside.mtr_reading = inside.mtr_reading
If there are still ties (that is, multiple rows with the same asset_tag, last_read_date, and mtr_reading), the query may again return more than one row.
One other way that the analytic- and aggregate-based answers differ is in their treatment of nulls. If any of asset_tag, last_read_date, or mtr_reading is null, the analytic-based answer will return related rows, but the aggregate-based one will not (because the equality conditions in the join do not evaluate to TRUE when a null is involved).
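The null behaviour of the aggregate-based join can be seen in a small sketch. It uses Python's sqlite3 module as a stand-in for Oracle, a cut-down version of the table with invented rows, and the short aliases o and i (chosen here) for the outer and inner relations.

```python
import sqlite3

# SQLite stands in for Oracle; the table is cut down and the data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mp_vehicle_asset_profile (
        asset_tag TEXT, last_read_date TEXT, mtr_reading INTEGER);
    INSERT INTO mp_vehicle_asset_profile VALUES
        ('V1', '2013-01-01', 100), ('V1', '2013-02-01', 150),
        (NULL, '2013-03-01', 20);
""")

# GROUP BY puts the NULL asset_tag into a group of its own ...
group_count = conn.execute("""
    SELECT count(*)
    FROM (SELECT asset_tag, max(last_read_date) AS last_read_date
          FROM mp_vehicle_asset_profile
          GROUP BY asset_tag)
""").fetchone()[0]

# ... but the equality join cannot match it, because NULL = NULL is not TRUE.
aggregate_rows = conn.execute("""
    SELECT o.asset_tag, o.last_read_date
    FROM mp_vehicle_asset_profile o
    JOIN (SELECT asset_tag, max(last_read_date) AS last_read_date
          FROM mp_vehicle_asset_profile
          GROUP BY asset_tag) i
      ON o.asset_tag = i.asset_tag
     AND o.last_read_date = i.last_read_date
""").fetchall()

print(group_count, aggregate_rows)
```

The subquery produces two groups, yet only the 'V1' row survives the join; the row with the null asset_tag is silently dropped.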

sql get max based on field

I need to get the ID of the row with the maximum Amount. The query below gives me an error:
select ID from Prog
where Amount = MAX(Amount)
An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference.
The end result is that I need just the ID, because I have to pass it to something else that expects it.
You need to order by Amount and select 1 record instead...
SELECT ID
FROM Prog
ORDER BY Amount DESC
LIMIT 1;
This takes all the rows in Prog, orders them in descending order by Amount (in other words, the first sorted row has the highest Amount), then limits the query to select only one row (the one with the highest Amount).
Also, subqueries are bad for performance. This code runs on a table with 200k records in half the time of the subquery versions.
Just pass a subquery with the max value to the where clause :
select ID from Prog
where Amount = (SELECT MAX(Amount) from Prog)
If you're using SQL Server that should do it :
SELECT TOP 1 ID
FROM Prog
ORDER BY Amount DESC
This should be something like:
select P.ID from Prog P
where P.Amount = (select max(Amount) from Prog)
EDIT:
If you really want only 1 row, you should do:
select max(P.ID) from Prog P
where P.Amount = (select max(Amount) from Prog);
However, if you have multiple rows that would match amount and you only want 1 row, you should have some kind of logic behind how you pick your one row. Not just relying on this max trick, or limit 1 type logic.
Also, I don't write limit 1, because this is not ANSI SQL -- it works in MySQL but the OP doesn't say what he wants. Every db is different -- see here: Is there an ANSI SQL alternative to the MYSQL LIMIT keyword? Don't get used to one db's extensions unless you only want to work in 1 db for the rest of your life.
select min(ID) from Prog
where Amount in
(
select max(amount)
from prog
)
The min statement ensures that you get only one result.
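To see how the approaches above behave when two rows tie on the maximum Amount, here is a small sketch using Python's sqlite3 module. The table name and columns follow the question; the rows are invented, and the extra ID tie-breaker in the ORDER BY is an addition made here, not part of the answers.

```python
import sqlite3

# Sample data is invented: IDs 2 and 3 tie on the maximum Amount.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Prog (ID INTEGER, Amount REAL);
    INSERT INTO Prog VALUES (1, 50), (2, 90), (3, 90);
""")

# The plain subquery comparison returns every row that holds the maximum.
all_max_ids = [row[0] for row in conn.execute("""
    SELECT ID FROM Prog
    WHERE Amount = (SELECT max(Amount) FROM Prog)
    ORDER BY ID
""")]

# Wrapping ID in min() collapses the ties to a single, predictable value.
single_id = conn.execute("""
    SELECT min(ID) FROM Prog
    WHERE Amount = (SELECT max(Amount) FROM Prog)
""").fetchone()[0]

# ORDER BY ... LIMIT 1 is only deterministic with an explicit tie-breaker.
limited_id = conn.execute("""
    SELECT ID FROM Prog
    ORDER BY Amount DESC, ID
    LIMIT 1
""").fetchone()[0]

print(all_max_ids, single_id, limited_id)
```

Which tied row is the right one depends on the application, so the tie-breaking rule is worth stating explicitly whichever form you pick.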