SQL Server: UPDATE statement where MAX query - sql

I'm doing a data migration in SQL Server 2008 R2. I'm a SQL-Server noob, but I know Ingres and MySql pretty well.
I need to set "default values" for two new fields to "the current values" from another table. Here's my first naive attempt (how I'd do it in Ingres).
update rk_risk
set n_target_probability_ID = a.n_probability_ID
, n_target_consequence_ID = a.n_consequence_ID
from rk_assess a
WHERE a.n_assess_id = (
SELECT MAX(n_assess_id)
FROM rk_assess a2
WHERE a2.n_risk_id = a.n_risk_id
);
The above query executes without error in SQL Server, but it sets ALL the n_target_probability_IDs and n_target_consequence_IDs to the same value: that of the OUTRIGHT last assessment (as opposed to "the last assessment OF THIS RISK").
The rk_assess table contains a complete history of assessment records for rk_risks, and my mission is to "default" the new target probability and consequence columns of the risk table to the values from "the current" (i.e. the last) assessment record. The rk_assess.n_assess_id column is an auto-incremented identifier (immutable once set), so the max id should always be the last-entered record.
I've had a bit of a search, both on Google and SO, and tried a few different versions of the query, but I'm still stuck. Here are a couple of other epic fails, with references.
update rk_risk
set n_target_probability_ID = (select a.n_probability_ID from rk_assess a where a.n_assess_id = (select max(n_assess_id) from rk_assess a2 where a2.n_risk_id = a.n_risk_id) as ca)
, n_target_consequence_ID = (select a.n_consequence_ID from rk_assess a where a.n_assess_id = (select max(n_assess_id) from rk_assess a2 where a2.n_risk_id = a.n_risk_id) as ca)
;
http://stackoverflow.com/questions/6256844/sql-server-update-from-select
update r
set r.n_target_probability_ID = ca.n_probability_ID
, r.n_target_consequence_ID = ca.n_consequence_ID
from rk_risk r
join rk_assess a
on a.n_risk_id = r.n_risk_id
select r.n_risk_id
, r.n_target_probability_ID, r.n_target_consequence_ID
, ca.n_probability_ID, ca.n_consequence_ID
from rk_risk r
join rk_assess a
on a.n_risk_id = r.n_risk_id
http://stackoverflow.com/questions/4024489/sql-server-max-statement-returns-multiple-results
UPDATE rk_risk
SET n_target_probability_ID = ca.n_probability_ID
, n_target_consequence_ID = ca.n_consequence_ID
FROM ( rk_assess a
INNER JOIN (
SELECT MAX(a2.n_assess_id)
FROM rk_assess a2
WHERE a2.n_risk_id = a.n_risk_id
) ca -- current assessment
Any pointers would be greatly appreciated. Thank you all in advance, for even reading this far.
Cheers. Keith.

How about this:
update rk_risk
set n_target_probability_ID = a.n_probability_ID
, n_target_consequence_ID = a.n_consequence_ID
from rk_assess a
JOIN (
SELECT n_risk_id, MAX(n_assess_id) max_n_assess_id
FROM rk_assess
GROUP BY n_risk_id
) b
ON a.n_risk_id = b.n_risk_id AND a.n_assess_id = b.max_n_assess_id
WHERE a.n_risk_id = rk_risk.n_risk_id

If you're using SQL Server 2005 or later you can, in addition to Jerad's answer, use the row_number function:
With b as
(
SELECT n_risk_id,
n_assess_id,
n_probability_ID,
n_consequence_ID,
row_number() over (partition by n_risk_id order by n_assess_id desc) row
FROM rk_assess
)
update rk_risk
set n_target_probability_ID = b.n_probability_ID
, n_target_consequence_ID = b.n_consequence_ID
from b
WHERE b.n_risk_id = rk_risk.n_risk_id
and b.row = 1
Or CROSS APPLY (a plain CROSS JOIN cannot refer to the outer table r):
update r
set n_target_probability_ID = b.n_probability_ID
, n_target_consequence_ID = b.n_consequence_ID
from rk_risk r
CROSS APPLY
(
-- latest assessment for this risk only
SELECT TOP 1
a.n_risk_id,
a.n_assess_id,
a.n_probability_ID,
a.n_consequence_ID
FROM rk_assess a
WHERE a.n_risk_id = r.n_risk_id
order by a.n_assess_id desc
) b

I tried this, looks like it is working:
update rk_risk
set n_target_probability_ID = a.n_probability_ID,
n_target_consequence_ID = a.n_consequence_ID
from rk_assess a, rk_risk r
WHERE a.n_risk_id = r.n_risk_id
and a.n_assess_id in (select MAX(n_assess_id) from rk_assess group by n_risk_id)

I discovered this from another question on SO just today. The UPDATE-FROM construction is not standard SQL, and MySQL's non-standard version is different from Postgres's non-standard version. From the problem here, it looks like SQL Server follows Postgres.
The problem, as Jerad points out in his edit, is that there is no link between the table being updated and the tables in the subquery. MySQL seems to create some implicit join here (on column names? in the other SO example, it was by treating two copies of the same table as the same, not separate).
I don't know if SQL Server allows windowing in the subquery, but if it does, I think you want
UPDATE rk_risk
set n_target_probability_ID = a.n_probability_ID
, n_target_consequence_ID = a.n_consequence_ID
from
( SELECT * FROM
( SELECT n_risk_id, n_probability_ID, n_consequence_ID,
row_number() OVER (PARTITION BY n_risk_id ORDER BY n_assess_ID DESC) AS rn
FROM rk_assess) AS ranked -- SQL Server requires an alias on every derived table
WHERE rn = 1) AS a
WHERE a.n_risk_id=rk_risk.n_risk_id;
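To make the "no link" point concrete, here is a minimal sketch in Python, using the standard-library sqlite3 module as a stand-in for SQL Server. That is an assumption made purely so the example is runnable: SQLite's dialect differs, so the sketch uses a portable correlated subquery rather than UPDATE ... FROM. The sample rows are invented; only the table and column names come from the question. Because each subquery refers back to rk_risk.n_risk_id, every risk picks up its own latest assessment instead of the overall latest one.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rk_risk (
        n_risk_id INTEGER PRIMARY KEY,
        n_target_probability_ID INTEGER,
        n_target_consequence_ID INTEGER
    );
    CREATE TABLE rk_assess (
        n_assess_id INTEGER PRIMARY KEY,
        n_risk_id INTEGER,
        n_probability_ID INTEGER,
        n_consequence_ID INTEGER
    );
    -- Invented sample data: two risks, two assessments each.
    INSERT INTO rk_risk (n_risk_id) VALUES (1), (2);
    INSERT INTO rk_assess VALUES
        (1, 1, 10, 100),
        (2, 2, 20, 200),
        (3, 1, 11, 101),
        (4, 2, 21, 201);
""")

# The innermost WHERE clause is the link to the row being updated:
# without "= rk_risk.n_risk_id" the subquery would not depend on the risk.
conn.execute("""
    UPDATE rk_risk
    SET n_target_probability_ID = (
            SELECT a.n_probability_ID
            FROM rk_assess a
            WHERE a.n_assess_id = (SELECT MAX(a2.n_assess_id)
                                   FROM rk_assess a2
                                   WHERE a2.n_risk_id = rk_risk.n_risk_id)),
        n_target_consequence_ID = (
            SELECT a.n_consequence_ID
            FROM rk_assess a
            WHERE a.n_assess_id = (SELECT MAX(a2.n_assess_id)
                                   FROM rk_assess a2
                                   WHERE a2.n_risk_id = rk_risk.n_risk_id))
""")

rows = conn.execute("""
    SELECT n_risk_id, n_target_probability_ID, n_target_consequence_ID
    FROM rk_risk
    ORDER BY n_risk_id
""").fetchall()
print(rows)  # [(1, 11, 101), (2, 21, 201)]
```

Risk 1 ends up with the values from assessment 3 and risk 2 with those from assessment 4, which is the "last assessment OF THIS RISK" behaviour the question asks for.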

Related

UPDATE statement with JOIN in SQL Server Not Working as Expected

I'm attempting to update the LAST_INSPECTION_FW field for all records in the VEHICLES_FW table with the last JOB_DATE_FW for records with the REASON_CODE_FW = 35. However, what's happening is that once the below code is executed, it's not taking into consideration the WHERE clause. This causes all of the records to update when it should just be updating those with the REASON_CODE_FW = 35.
Is there a way to restructure this code to get it working correctly? Please help, thanks!
UPDATE VEHICLES_FW
SET VEHICLES_FW.LAST_INSPECTION_FW = JOB_HEADERS_FW.FIELD2MAX
FROM VEHICLES_FW
INNER JOIN (SELECT VEHICLE_ID_FW, MAX(JOB_DATE_FW) AS FIELD2MAX
FROM JOB_HEADERS_FW
GROUP BY VEHICLE_ID_FW) AS JOB_HEADERS_FW
ON VEHICLES_FW.VEHICLE_ID_FW = JOB_HEADERS_FW.VEHICLE_ID_FW
INNER JOIN JOB_DETAILS_FW
ON JOB_NUMBER_FW = JOB_NUMBER_FW
WHERE REASON_CODE_FW = '35'
Common Table Expressions are your friend here. SQL Server's strange UPDATE ... FROM syntax is not. For example:
with JOB_HEADERS_FW_BY_VEHICLE_ID as
(
SELECT VEHICLE_ID_FW, MAX(JOB_DATE_FW) AS FIELD2MAX
FROM JOB_HEADERS_FW
GROUP BY VEHICLE_ID_FW
), q as
(
Select VEHICLES_FW.LAST_INSPECTION_FW, JOB_HEADERS_FW_BY_VEHICLE_ID.FIELD2MAX NEW_LAST_INSPECTION_FW
FROM VEHICLES_FW
INNER JOIN JOB_HEADERS_FW_BY_VEHICLE_ID
ON VEHICLES_FW.VEHICLE_ID_FW = JOB_HEADERS_FW_BY_VEHICLE_ID.VEHICLE_ID_FW
INNER JOIN JOB_DETAILS_FW
ON JOB_NUMBER_FW = JOB_NUMBER_FW
WHERE REASON_CODE_FW = '35'
)
UPDATE q set LAST_INSPECTION_FW = NEW_LAST_INSPECTION_FW
I suspect this does what you want:
update v
set last_inspection_fw = (
select max(j.job_date_fw)
from job_headers_fw j
inner join job_details_fw jd on jd.job_number_fw = j.job_number_fw
where j.vehicle_id_fw = v.vehicle_id_fw and jd.reason_code_fw = 35
)
from vehicles_fw v

SQL Server Execution Plan Review Request

Having trouble understanding why my query is taking so long, looking for advice to optimise please.
update Laserbeak_Main.dbo.ACCOUNT_MPN set
DateUpgrade = ord.ConnectedDate
FROM [ORDER] ord
WHERE ord.AccountNumber = Laserbeak_Main.dbo.ACCOUNT_MPN.AccountNumber
AND ord.ordertypeID = '2'
AND ord.ConnectedDate IS NOT NULL
AND DateUpgrade <> ord.ConnectedDate
Execution plan as requested on brentozar.com
UPDATE: Following suggestions, the new query looks like this and seems to work much more quickly. However, if you run the query it sets the rows as expected, but if you run it again it updates the exact same number of rows. Converting it to a SELECT confirms that the same rows are being updated each time. The <> clause should stop this but it doesn't. I believed it had something to do with collation, but I have been unable to confirm whether it's possible to have different collations at table level in the same database.
;WITH cteOrderInfo AS (
SELECT DISTINCT ord.AccountNumber, ord.ConnectedDate
FROM [ORDER] ord
WHERE ord.ordertypeID = '2'
AND ord.ConnectedDate IS NOT NULL
)
UPDATE Laserbeak_Main.dbo.ACCOUNT_MPN
SET Laserbeak_Main.dbo.ACCOUNT_MPN.DateUpgrade = cteOrderInfo.ConnectedDate
FROM cteOrderInfo
INNER JOIN Laserbeak_Main.dbo.ACCOUNT_MPN acc
ON cteOrderInfo.AccountNumber = acc.AccountNumber
WHERE cteOrderInfo.ConnectedDate <> acc.DateUpgrade
The SELECT to confirm:
;WITH cteOrderInfo AS (
SELECT DISTINCT ord.AccountNumber, ord.ConnectedDate
FROM [ORDER] ord
WHERE ord.ordertypeID = '2'
AND ord.ConnectedDate IS NOT NULL
)
SELECT cteOrderInfo.ConnectedDate, acc.DateUpgrade
FROM cteOrderInfo
INNER JOIN Laserbeak_Main.dbo.ACCOUNT_MPN acc
ON cteOrderInfo.AccountNumber = acc.AccountNumber
WHERE cteOrderInfo.ConnectedDate <> acc.DateUpgrade
As Serge suggested, we did not have unique rows.
The solution we arrived at:
;WITH cteSourceStuff AS (
SELECT AccountNumber, MpnUpgrade, MAX(DateConnected) maxConnDate
FROM ORDER_DETAIL, [ORDER]
WHERE ORDER_DETAIL.OrderID = [ORDER].OrderID
AND LEN(MpnUpgrade) > 10
AND OrderTypeID = 2
GROUP BY AccountNumber, MpnUpgrade
)
UPDATE Laserbeak_Main.dbo.ACCOUNT_MPN set
DateUpgrade = cteSourceStuff.maxConnDate
FROM cteSourceStuff
WHERE cteSourceStuff.MpnUpgrade = ACCOUNT_MPN.Mpn
AND cteSourceStuff.AccountNumber = ACCOUNT_MPN.AccountNumber
AND DateUpgrade <> cteSourceStuff.maxConnDate
This works because the duplicates are removed first, so we only update the rows that we are actually targeting. The reason we had issues before was that SQL Server was updating from the first matching row it found; then, when we re-ran the update or ran the SELECT, it returned rows that matched on the key but had not previously been updated.
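The effect described above can be reproduced in a small sketch. It is written in Python with the standard-library sqlite3 module standing in for SQL Server (an assumption, so the example runs anywhere), and it uses a correlated subquery instead of UPDATE ... FROM. The table names account_mpn and orders and the sample dates are hypothetical; only the column names follow the question. Once the source is collapsed to one row per account with MAX, the <> filter matches on the first run and nothing on the second.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account_mpn (AccountNumber INTEGER PRIMARY KEY, DateUpgrade TEXT);
    CREATE TABLE orders (AccountNumber INTEGER, ConnectedDate TEXT);
    -- One account with two candidate source rows (the "duplicates").
    INSERT INTO account_mpn VALUES (1, '2000-01-01');
    INSERT INTO orders VALUES (1, '2014-01-10'), (1, '2014-02-20');
""")

# MAX() collapses the source to exactly one value per account, so the
# value written and the value compared in the WHERE clause always agree.
update_sql = """
    UPDATE account_mpn
    SET DateUpgrade = (SELECT MAX(o.ConnectedDate)
                       FROM orders o
                       WHERE o.AccountNumber = account_mpn.AccountNumber)
    WHERE DateUpgrade <> (SELECT MAX(o.ConnectedDate)
                          FROM orders o
                          WHERE o.AccountNumber = account_mpn.AccountNumber)
"""

first_run = conn.execute(update_sql).rowcount   # rows changed on first run
second_run = conn.execute(update_sql).rowcount  # rows changed on re-run
print(first_run, second_run)  # 1 0
```

With the un-aggregated source, one of the two order rows would always differ from whatever date had been written, which is why the earlier version kept reporting the same rows on every run.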

Only return value that matches the ID on table 1

I have tried all possible joins and sub-queries but I can't get the query to return only the one value from table 2 that exactly matches the vendor ID. If I don't include the address in the query, I get one hit for the vendor ID. How can I make it so that when I add the address, I still get only the one vendor row I got before adding it?
The vendor from table one must have VEN_CLASS IS NOT NULL.
This was my last attempt using subquery:
SELECT DISTINCT APVENMAST.VENDOR_GROUP,
APVENMAST.VENDOR,
APVENMAST.VENDOR_VNAME,
APVENMAST.VENDOR_CONTCT,
APVENMAST.TAX_ID,
Subquery.ADDR1
FROM (TEST.dbo.APVENMAST APVENMAST
INNER JOIN
(SELECT APVENADDR.ADDR1,
APVENADDR.VENDOR_GROUP,
APVENADDR.VENDOR,
APVENMAST.VEN_CLASS
FROM TEST.dbo.APVENADDR APVENADDR
INNER JOIN TEST.dbo.APVENMAST APVENMAST
ON (APVENADDR.VENDOR_GROUP = APVENMAST.VENDOR_GROUP)
AND (APVENADDR.VENDOR = APVENMAST.VENDOR)
WHERE (APVENMAST.VEN_CLASS IS NOT NULL)) Subquery
ON (APVENMAST.VENDOR_GROUP = Subquery.VENDOR_GROUP)
AND (APVENMAST.VENDOR = Subquery.VENDOR))
INNER JOIN TEST.dbo.APVENLOC APVENLOC
ON (APVENMAST.VENDOR_GROUP = APVENLOC.VENDOR_GROUP)
AND (APVENMAST.VENDOR = APVENLOC.VENDOR)
WHERE (APVENMAST.VEN_CLASS IS NOT NULL)
Try this:
SELECT APVENMAST.VENDOR_GROUP
, APVENMAST.VENDOR
, APVENMAST.VENDOR_VNAME
, APVENMAST.VENDOR_CONTCT
, APVENMAST.TAX_ID
, APVENADDR.ADDR1
FROM TEST.dbo.APVENMAST APVENMAST
INNER JOIN (
select VENDOR_GROUP, VENDOR, ADDR1
, row_number() over (partition by VENDOR_GROUP, VENDOR order by ADDR1) r
from TEST.dbo.APVENADDR
) APVENADDR
ON APVENADDR.VENDOR_GROUP = APVENMAST.VENDOR_GROUP
AND APVENADDR.VENDOR = APVENMAST.VENDOR
AND APVENADDR.r = 1
--do you need this table; you're not using it...
--INNER JOIN TEST.dbo.APVENLOC APVENLOC
--ON APVENMAST.VENDOR_GROUP = APVENLOC.VENDOR_GROUP
--AND APVENMAST.VENDOR = APVENLOC.VENDOR
WHERE APVENMAST.VEN_CLASS IS NOT NULL
--if the above inner join was to filter results, you can do this instead:
and exists (
select top 1 1
from TEST.dbo.APVENLOC APVENLOC
where APVENMAST.VENDOR_GROUP = APVENLOC.VENDOR_GROUP
AND APVENMAST.VENDOR = APVENLOC.VENDOR
)
I found another column in the APVENLOC table that I can filter on to get the unique vendor. Turns out if the vendor address is for the main office, the vendor location is set blank.
Easier than I thought it would be!
SELECT DISTINCT APVENMAST.VENDOR_GROUP,
APVENMAST.VENDOR,
APVENMAST.VENDOR_VNAME,
APVENADDR.ADDR1,
APVENMAST.VENDOR_SNAME,
APVENADDR.LOCATION_CODE,
APVENMAST.VEN_CLASS
FROM TEST.dbo.APVENMAST APVENMAST
INNER JOIN TEST.dbo.APVENADDR APVENADDR
ON (APVENMAST.VENDOR_GROUP = APVENADDR.VENDOR_GROUP)
AND (APVENMAST.VENDOR = APVENADDR.VENDOR)
WHERE (APVENADDR.LOCATION_CODE = ' ')
Shaji

Why does a CTE in SQL Server execute the INNER JOIN when no conditions are met?

I have a table mse in which all rows have StatusId = 1. But in a query like this, the INNER JOINed view is executed regardless of the value of the StatusId column. How can I prevent that?
WITH cte201401291517 AS
(
SELECT
'QuantityOutPerShift' = SUM([vsqo].[QuantityOutPerShift])
, [mse].[ShiftGroup]
, [mse].[Station]
, [vsqo].[Shift]
FROM
[dbo].[mse] AS mse
INNER JOIN
[dbo].[vmsqo] AS vsqo ON [mse].[Station] = [vsqo].[FromStation]
AND ( [mse].[ShiftGroup] = [vsqo].[ShiftGroup]
OR [mse].[ShiftGroup] = 'ALL') -- order is important!
WHERE
[mse].[StatusId] = 3
GROUP BY
[mse].[ShiftGroup]
, [mse].[Station]
, [vsqo].[Shift])
UPDATE
[dbo].[mse]
SET
[dbo].[mse].[QuantityOutPerShift] = [cte].[QuantityOutPerShift]
, [dbo].[mse].[ShiftCurrent] = [cte].[Shift]
--OUTPUT INSERTED.*
FROM
cte201401291517 AS cte
WHERE
[dbo].[mse].[Station] = [cte].[Station]
AND ( [dbo].[mse].[ShiftGroup] = [cte].[ShiftGroup]
OR [dbo].[mse].[ShiftGroup] = 'ALL' ) -- order is important!
AND [dbo].[mse].[StatusId] = 3;
I can't do this without a CTE because I'm updating the table with a SUM, which cannot be used directly in an UPDATE statement.
I'm using SQL Server 2005

sql server update from select

Following the answer from this post, I have something like this:
update MyTable
set column1 = otherTable.SomeColumn,
column2 = otherTable.SomeOtherColumn
from MyTable
inner join
(select *some complex query here*) as otherTable
on MyTable.key_field = otherTable.key_field;
However, I keep getting this error:
The column prefix 'otherTable' does
not match with a table name or alias
name used in the query.
I'm not sure what's wrong. Can't I do such an update from a select query like this?
Any help would be greatly appreciated.
(I'm using *blush* sql server 2000.)
EDIT:
here's the actual query
update pdx_projects set pr_rpc_slr_amount_year_to_date = summary.SumSLR, pr_rpc_hours_year_to_date = summary.SumHours
from pdx_projects pr join (
select pr.pr_pk pr_pk, sum(tc.stc_slr_amount) SumSLR, sum(tc.stc_worked_hours) SumHours from pdx_time_and_cost_from_rpc tc
join pdx_rpc_projects sp on tc.stc_rpc_project_id = sp.sol_rpc_number
join pdx_rpc_links sl on sl.sol_fk = sp.sol_pk
join pdx_projects pr on pr_pk = sl.pr_fk
where tc.stc_time_card_year = year(getdate())
group by pr_pk
) as summary
on pr.pr_pk = summary.pr_pk
and the actual error message is
Server: Msg 107, Level 16, State 2, Line 1
The column prefix 'summary' does not match with a table name or alias name used in the query.
I submit to you this altered query:
update x
set x.pr_rpc_slr_amount_year_to_date = summary.sumSLR,
x.pr_rpc_hours_year_to_date = summary.sumHours
from pdx_projects x
join (
select pr.pr_pk as pr_pk,
sum(tc.stc_slr_amount) as SumSLR,
sum(tc.stc_worked_hours) as SumHours
from pdx_time_and_cost_from_rpc tc
join pdx_rpc_projects sp on tc.stc_rpc_project_id = sp.sol_rpc_number
join pdx_rpc_links sl on sp.sol_pk = sl.sol_fk
join pdx_projects pr on sl.pr_fk = pr.pr_pk
where tc.stc_time_card_year = year(getdate())
group by pr.pr_pk
) as summary
on x.pr_pk = summary.pr_pk
Notably different here: I don't re-use the alias pr inside and outside of the complex query. I re-ordered the joins the way I like them (previously referenced table first), and explicitly qualified pr_pk in 2 places. I also changed the update syntax to use update <alias>.
Maybe not the answer you're looking for, but instead of generating hugely complex queries, I usually default to inserting the result of *some complex query here* into a table variable. Then you can do a simple update to MyTable with a join to the table variable. It may not be quite as efficient, but it's much easier to maintain.
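A rough sketch of that two-step idea follows, written in Python with the standard-library sqlite3 module standing in for SQL Server (an assumption made so the example is runnable). SQLite has no table variables, so a temporary table plays that role here, and the update uses a correlated lookup rather than a join. The detail table and its rows are hypothetical; MyTable, otherTable, key_field, column1 and SomeColumn follow the question.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyTable (key_field INTEGER PRIMARY KEY, column1 INTEGER);
    CREATE TABLE detail (key_field INTEGER, amount INTEGER);  -- hypothetical source
    INSERT INTO MyTable VALUES (1, NULL), (2, NULL);
    INSERT INTO detail VALUES (1, 5), (1, 7), (2, 3);

    -- Step 1: run the "complex query" once and stage its result.
    CREATE TEMP TABLE otherTable AS
        SELECT key_field, SUM(amount) AS SomeColumn
        FROM detail
        GROUP BY key_field;

    -- Step 2: a simple update that only looks up the staged rows.
    UPDATE MyTable
    SET column1 = (SELECT o.SomeColumn
                   FROM otherTable o
                   WHERE o.key_field = MyTable.key_field)
    WHERE key_field IN (SELECT key_field FROM otherTable);
""")

result = conn.execute(
    "SELECT key_field, column1 FROM MyTable ORDER BY key_field"
).fetchall()
print(result)  # [(1, 12), (2, 3)]
```

Staging the result also makes it easy to inspect the intermediate rows with a plain SELECT before running the update.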
I couldn't replicate your error using SQL 2008 in 80 compatibility level. While this option doesn't guarantee that I'll get the same results as you, nothing appears to be out of place.
create table pdx_projects
(
pr_rpc_slr_amount_year_to_date varchar(max)
, pr_rpc_hours_year_to_date varchar(max)
, pr_pk varchar(max)
)
create table pdx_time_and_cost_from_rpc
(
stc_slr_amount decimal
, stc_worked_hours decimal
, stc_rpc_project_id varchar(max)
, stc_time_card_year varchar(max)
)
create table pdx_rpc_projects
(
sol_rpc_number varchar(max)
, sol_pk varchar(max)
)
create table pdx_rpc_links
(
sol_fk varchar(max)
, pr_fk varchar(max)
)
update pdx_projects
set
pr_rpc_slr_amount_year_to_date = summary.SumSLR
, pr_rpc_hours_year_to_date = summary.SumHours
from
pdx_projects pr
join (
select pr.pr_pk pr_pk
, sum(tc.stc_slr_amount) SumSLR
, sum(tc.stc_worked_hours) SumHours
from pdx_time_and_cost_from_rpc tc
join pdx_rpc_projects sp on tc.stc_rpc_project_id = sp.sol_rpc_number
join pdx_rpc_links sl on sl.sol_fk = sp.sol_pk
join pdx_projects pr on pr_pk = sl.pr_fk
where tc.stc_time_card_year = year(getdate())
group by pr_pk
) as summary
on pr.pr_pk = summary.pr_pk