Increasing the speed of an SQL UPDATE in PostgreSQL

I'm running the two queries below on my database, and I'm trying to figure out how to make them faster.
The first query takes 208796.8 ms. The second one takes 611654.9 ms. I'm not sure there is a way to make them faster. I need these updates to run in the same transaction, so I'm also not sure whether updating in batches of n records would be faster. Any ideas are welcome!
UPDATE ticket_memberships AS my_table
SET ticket_id = foreign_table.id
FROM tickets AS foreign_table
WHERE my_table.agency_id = 2
AND foreign_table.agency_id = 2
AND my_table.ticket_id IS NOT NULL
AND my_table.ticket_id = foreign_table.old_id;
UPDATE ticket_memberships AS my_table
SET person_contact_id = foreign_table.id
FROM person_contacts AS foreign_table
WHERE my_table.agency_id = 2
AND foreign_table.agency_id = 2
AND my_table.person_contact_id IS NOT NULL
AND my_table.person_contact_id = foreign_table.old_id;

Related

How to insert one column from a table into another based on a join/where clause

I have two tables, temp_am and amphibian. The relationship between the two tables comes from the lake_id and the survey_date column in both tables. Both tables have 24,109 entries.
temp_am

id   lake_id   survey_date
1    10,001    7/25/2001
5    10,005    7/27/2001
6    10,006    7/29/2001
etc...

amphibian

id   lake_id   survey_date   amphibian_survey_id
1    10,002    7/25/2001
2    10,005    7/27/2001
etc...
I want to copy temp_am.id into amphibian.amphibian_survey_id wherever both lake_id and survey_date match.
I tried the SQL query below, but it never finished. I canceled it after 600 seconds, since I figured an update on a 29,000-observation table should not take that long. Please let me know if you see any issues in my statement.
update amphibian
set amphibian_survey_id = tm.id
from amphibian a
inner join temp_am tm
on a.lake_id = tm.lake_id
and a.survey_date = tm.survey_date
This query worked in Microsoft Access but not in DBeaver:
UPDATE amphibian
inner JOIN amphibian_survey_meta_data md ON
(amphibian.survey_date = md.survey_date) AND (amphibian.lake_id = md.lake_id) SET amphibian.amphibian_survey_id = [md.id];
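Postgres does not require repeating the target table in FROM for an update join. If you do repeat it, as in "from amphibian a", Postgres treats it as a second, independent reference to the table; with no condition tying it back to the row being updated, the statement becomes an unintended self-join, which is why it ran so long. In this case even the join is not necessary: set <column> = ( select ... ) is sufficient.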
update amphibian a
set amphibian_survey_id =
( select tm.id
from temp_am tm
where (tm.lake_id, tm.survey_date) =
(a.lake_id, a.survey_date)
) ;
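A minimal sketch of the correlated-subquery update above, run against an in-memory SQLite database from Python so it can be tried without a Postgres server. SQLite is only a stand-in here, the sample rows (including the pre-existing value 99) are made up, and the helper name fill_survey_ids is hypothetical. It also shows one thing worth knowing about this form: rows with no match in temp_am get NULL unless the update is guarded with WHERE EXISTS.

```python
import sqlite3


def fill_survey_ids(guard_unmatched):
    """Copy temp_am.id into amphibian.amphibian_survey_id with a correlated subquery."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE temp_am (id INTEGER PRIMARY KEY, lake_id INTEGER, survey_date TEXT);
        CREATE TABLE amphibian (id INTEGER PRIMARY KEY, lake_id INTEGER, survey_date TEXT,
                                amphibian_survey_id INTEGER);
        INSERT INTO temp_am VALUES (1, 10001, '2001-07-25'), (5, 10005, '2001-07-27');
        -- 99 is an invented pre-existing value on a row that has no match in temp_am
        INSERT INTO amphibian VALUES (1, 10002, '2001-07-25', 99),
                                     (2, 10005, '2001-07-27', NULL);
    """)
    sql = """
        UPDATE amphibian
        SET amphibian_survey_id = (SELECT tm.id
                                   FROM temp_am tm
                                   WHERE tm.lake_id = amphibian.lake_id
                                     AND tm.survey_date = amphibian.survey_date)
    """
    if guard_unmatched:
        # Restrict the update to rows that actually have a partner in temp_am.
        sql += """
        WHERE EXISTS (SELECT 1
                      FROM temp_am tm
                      WHERE tm.lake_id = amphibian.lake_id
                        AND tm.survey_date = amphibian.survey_date)
        """
    con.execute(sql)
    rows = con.execute(
        "SELECT id, amphibian_survey_id FROM amphibian ORDER BY id"
    ).fetchall()
    con.close()
    return rows


print(fill_survey_ids(guard_unmatched=False))  # unmatched row is overwritten with NULL
print(fill_survey_ids(guard_unmatched=True))   # unmatched row keeps its old value
```

If more than one temp_am row can match the same lake_id and survey_date, the scalar subquery is no longer well defined (Postgres raises an error in that case), so this form assumes the pair is unique in temp_am.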

Sub-query works but would a join or other alternative be better?

I am trying to select rows from one table where the id referenced in those rows matches the unique id from another table that relates to it like so:
SELECT *
FROM booklet_tickets
WHERE bookletId = (SELECT id
FROM booklets
WHERE bookletNum = 2000
AND seasonId = 9
AND bookletTypeId = 3)
The bookletNum, seasonId, and bookletTypeId values are filled in from a user form and inserted into the query.
This works and returns what I want, but it seems messy. Is a join better in this type of scenario?
If there is any possibility that your subquery returns multiple values, you should use IN instead:
SELECT *
FROM booklet_tickets
WHERE bookletId in (SELECT id
FROM booklets
WHERE bookletNum = 2000
AND seasonId = 9
AND bookletTypeId = 3)
But I would prefer EXISTS over IN:
SELECT *
FROM booklet_tickets bt
WHERE EXISTS (SELECT 1
FROM booklets b
WHERE bookletNum = 2000
AND seasonId = 9
AND bookletTypeId = 3
AND b.id = bt.bookletId)
It is not possible to give a "Yes it's better" or "no it's not" answer for this type of scenario.
My personal rule of thumb: if the number of rows in a table is less than 1 million, I do not bother optimising "SELECT WHERE IN" types of queries, as the SQL Server query optimizer is smart enough to pick an appropriate plan for the query.
In reality, however, you often need more values from the joined table in the final result set, so a JOIN with a filtering WHERE clause might make more sense, such as:
SELECT BT.*, B.SeasonId
FROM booklet_tickets BT
INNER JOIN booklets B ON BT.bookletId = B.id
WHERE B.bookletNum = 2000
AND B.seasonId = 9
AND B.bookletTypeId = 3
To me it comes down to a question of style more than anything else: write your code so that it will be easy for you to understand months later. Pick a style and then stick to it :)
The question, however, is as old as time itself :)
SQL JOIN vs IN performance?
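To make the comparison concrete, here is a small sketch that runs the IN, EXISTS, and JOIN forms from the answers above against the same made-up data and collects what each returns. It uses Python's sqlite3 module as a stand-in engine (the answers refer to SQL Server), the sample rows and the helper name compare_forms are hypothetical, and the three form values are passed as bound parameters rather than pasted into the SQL text.

```python
import sqlite3


def compare_forms(booklet_num=2000, season_id=9, booklet_type_id=3):
    """Run the IN, EXISTS and JOIN variants and return the ticket ids each one finds."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE booklets (id INTEGER PRIMARY KEY, bookletNum INTEGER,
                               seasonId INTEGER, bookletTypeId INTEGER);
        CREATE TABLE booklet_tickets (id INTEGER PRIMARY KEY, bookletId INTEGER);
        INSERT INTO booklets VALUES (1, 2000, 9, 3), (2, 2000, 8, 3);
        INSERT INTO booklet_tickets VALUES (10, 1), (11, 1), (12, 2);
    """)
    # Shared filter; the ? placeholders are filled from the (assumed) user form values.
    filt = "b.bookletNum = ? AND b.seasonId = ? AND b.bookletTypeId = ?"
    queries = {
        "in": f"""SELECT bt.id FROM booklet_tickets bt
                  WHERE bt.bookletId IN (SELECT b.id FROM booklets b WHERE {filt})
                  ORDER BY bt.id""",
        "exists": f"""SELECT bt.id FROM booklet_tickets bt
                      WHERE EXISTS (SELECT 1 FROM booklets b
                                    WHERE {filt} AND b.id = bt.bookletId)
                      ORDER BY bt.id""",
        "join": f"""SELECT bt.id FROM booklet_tickets bt
                    INNER JOIN booklets b ON bt.bookletId = b.id
                    WHERE {filt}
                    ORDER BY bt.id""",
    }
    params = (booklet_num, season_id, booklet_type_id)
    results = {name: [row[0] for row in con.execute(sql, params)]
               for name, sql in queries.items()}
    con.close()
    return results


print(compare_forms())
```

The three forms agree here because booklets.id is unique. If the joined column could repeat, the JOIN form would return a ticket once per matching booklet, while IN and EXISTS would still return it once.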

Too long execution time for update query

I'm trying to update a table with a query that executes in ~5 seconds on PostgreSQL and Oracle but takes far too long on Firebird 2.5.
UPDATE GoodsCatUnit SET isDisplay=1
WHERE Id In (SELECT Min(gcu.Id) FROM GoodsCatUnit gcu GROUP BY gcu.GoodsCat_Id);
GoodsCatUnit has ~34k rows, and updating the first 200 takes 15 seconds.
Try writing this using a correlated subquery and defining an index.
The query is:
UPDATE GoodsCatUnit gcu
SET isDisplay = 1
WHERE gcu.id = (SELECT MIN(gcu2.id)
FROM GoodsCatUnit gcu2
WHERE gcu2.GoodsCat_Id = gcu.GoodsCat_Id
) AND
gcu.isDisplay <> 1;
The index is on GoodsCatUnit(GoodsCat_Id, id).
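A minimal sketch of the suggestion above: the composite index plus the correlated-subquery update, checked on a handful of invented rows. It runs on SQLite through Python's sqlite3 module purely so it is easy to try; it says nothing about Firebird timings. The index name idx_gcu_cat_id and the helper name flag_first_per_category are hypothetical.

```python
import sqlite3


def flag_first_per_category():
    """Set isDisplay = 1 on the row with the lowest Id in each GoodsCat_Id group."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE GoodsCatUnit (Id INTEGER PRIMARY KEY,
                                   GoodsCat_Id INTEGER,
                                   isDisplay INTEGER NOT NULL DEFAULT 0);
        -- Composite index so MIN(Id) per GoodsCat_Id can be answered from the index.
        CREATE INDEX idx_gcu_cat_id ON GoodsCatUnit (GoodsCat_Id, Id);
        INSERT INTO GoodsCatUnit (Id, GoodsCat_Id)
        VALUES (1, 100), (2, 100), (3, 200), (4, 200), (5, 300);
    """)
    cur = con.execute("""
        UPDATE GoodsCatUnit
        SET isDisplay = 1
        WHERE Id = (SELECT MIN(g2.Id)
                    FROM GoodsCatUnit g2
                    WHERE g2.GoodsCat_Id = GoodsCatUnit.GoodsCat_Id)
          AND isDisplay <> 1
    """)
    updated = cur.rowcount  # number of rows the UPDATE actually changed
    flagged = [row[0] for row in con.execute(
        "SELECT Id FROM GoodsCatUnit WHERE isDisplay = 1 ORDER BY Id")]
    con.close()
    return updated, flagged


print(flag_first_per_category())
```

One caveat with the extra isDisplay <> 1 filter: if isDisplay can be NULL, the comparison is neither true nor false and those rows are skipped. The sketch sidesteps that by declaring the column NOT NULL, which is an assumption about the real table.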

Update data from the same table for 235 rows

I need help updating rows so that they equal another set of rows in the same table. For example:
M005E globalPickSequence = 6627,
globalAllocationSequence = 7080,
globalPutAwaySequence = 4268
These numbers need to equal the corresponding numbers for M005D: 7607, 8068, 5256.
M006E, same thing: it needs to equal the M007D globals.
And so forth...
I have to do this update for a total of 235 rows, but the image shows only some of the rows in the database.
Is this possible to do with a query that updates them all at the same time, without updating row by row?
I have been using the query below. It works, but I have to run it one pair at a time, changing the ids each time, which would take far too long for all 235 rows. I need a query that can do the whole update at once.
UPDATE lc1
SET lc1.globalPickSequence = lc2.globalPickSequence,
lc1.globalAllocationSequence = lc2.globalAllocationSequence,
lc1.globalPutAwaySequence = lc2.globalPutAwaySequence
from mytable lc1
JOIN mytable lc2 ON lc1.id = 27234 AND lc2.id = 16358
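One possible approach, offered only as a sketch: put the 235 (target id, source id) pairs into a small mapping table and drive a single update from it instead of editing the ids by hand. The mapping table id_map, its columns, the helper copy_sequences, and every id other than 27234 and 16358 are hypothetical, and treating 27234 as the row to overwrite and 16358 as the row to copy from is an assumption based on the query above. The code runs on SQLite via Python's sqlite3 module as a stand-in and assumes no id appears as both a target and a source.

```python
import sqlite3


def copy_sequences(pairs):
    """Copy the three sequence columns from each source row to its target row."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE mytable (id INTEGER PRIMARY KEY,
                              globalPickSequence INTEGER,
                              globalAllocationSequence INTEGER,
                              globalPutAwaySequence INTEGER);
        INSERT INTO mytable VALUES (27234, 6627, 7080, 4268),
                                   (16358, 7607, 8068, 5256),
                                   (30001, 1, 2, 3),
                                   (30002, 4, 5, 6),
                                   (30003, 7, 8, 9);
        -- Hypothetical mapping table: one row per (row to overwrite, row to copy from).
        CREATE TABLE id_map (target_id INTEGER PRIMARY KEY, source_id INTEGER NOT NULL);
    """)
    con.executemany("INSERT INTO id_map VALUES (?, ?)", pairs)
    con.execute("""
        UPDATE mytable SET
          globalPickSequence =
            (SELECT s.globalPickSequence FROM id_map m
             JOIN mytable s ON s.id = m.source_id
             WHERE m.target_id = mytable.id),
          globalAllocationSequence =
            (SELECT s.globalAllocationSequence FROM id_map m
             JOIN mytable s ON s.id = m.source_id
             WHERE m.target_id = mytable.id),
          globalPutAwaySequence =
            (SELECT s.globalPutAwaySequence FROM id_map m
             JOIN mytable s ON s.id = m.source_id
             WHERE m.target_id = mytable.id)
        WHERE id IN (SELECT target_id FROM id_map)
    """)
    rows = {row[0]: row[1:] for row in con.execute("SELECT * FROM mytable")}
    con.close()
    return rows


print(copy_sequences([(27234, 16358), (30001, 30002)]))
```

In the UPDATE ... FROM ... JOIN dialect used in the question, the same idea would presumably be a join of lc1 to the mapping table on the target id and of the mapping table to lc2 on the source id, replacing the hard-coded pair of ids.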

SQL Update table - 2 tables based on date - Table 2 Subset of table 1

OK, I have a rather unusual situation, and I can't believe there isn't a better way of doing this than my solution.
Requirements:
Table 2 (EpmTask_UserView_RM) is a subset of table 1 (MSP_EpmTask_UserView). All the fields match, but table 1 has many more rows than table 2.
Table 2 needs to be updated from table 1 based on the date a task changed (we can't do a drop and replace). There are three cases:
Task updates, where something about the task has changed (we know from the task's date stamp)
Task deletes, where a task has been deleted
Task adds, where a new task exists
I have three different queries that do this, and I suspect there is a better way.
/**** DELETE tasks from ZZZ_TEST_OF_UPDATE_MSP_EpmTask_UserView_RM table if no longer present in Production ***/
USE [ProjectWebApp]
GO
DELETE FROM [dbo].[ZZZ_TEST_OF_UPDATE_MSP_EpmTask_UserView_RM]
WHERE [dbo].[ZZZ_TEST_OF_UPDATE_MSP_EpmTask_UserView_RM].TaskUID IN
(SELECT
/*Subquery to select all records in ZZZ_TEST_OF_UPDATE_MSP_EpmTask_UserView_RM NOT found in MSP_EpmTask_UserView_RM */
[ProjectWebApp].[dbo].[ZZZ_TEST_OF_UPDATE_MSP_EpmTask_UserView_RM].[TaskUID]
FROM [ProjectWebApp].[dbo].[ZZZ_TEST_OF_UPDATE_MSP_EpmTask_UserView_RM]
LEFT JOIN [MSPSPRO].[ProjectWebApp].[dbo].[MSP_EpmTask_UserView] as Prod
on Prod.TaskUID = [ProjectWebApp].[dbo].[ZZZ_TEST_OF_UPDATE_MSP_EpmTask_UserView_RM].TASKuid
where Prod.TaskUID is NULL)
Query 2, the update:
UPDATE [dbo].[ZZZ_TEST_OF_UPDATE_MSP_EpmTask_UserView_RM]
SET
[ProjectUID] = Source.[ProjectUID]
,[TaskUID] = Source.[TaskUID]
,[TaskName] = Source.[TaskName]
,[TaskIndex] = Source.[TaskIndex]
,[TaskOutlineLevel] = Source.[TaskOutlineLevel]
,[TaskOutlineNumber] = Source.[TaskOutlineNumber]
,[TaskStartDate] = Source.[TaskStartDate]
,[TaskFinishDate] = Source.[TaskFinishDate]
,[TaskActualStartDate] = Source.[TaskActualStartDate]
,[TaskActualFinishDate] = Source.[TaskActualFinishDate]
,[TaskPercentCompleted] = Source.[TaskPercentCompleted]
,[Health] = Source.[Health]
,[Milestone Significance Level] = Source.[Milestone Significance Level]
,[TaskModifiedDate] = Source.[TaskModifiedDate]
,[TaskBaseline1StartDate] = Source.[TaskBaseline1StartDate]
,[TaskBaseline1FinishDate] = Source.[TaskBaseline1FinishDate]
,[TaskBaseline1Duration] = Source.[TaskBaseline1Duration]
,[QueryTimestamp] = GetDate()
FROM [MSPSPRO].[ProjectWebApp].[dbo].[MSP_EpmTask_UserView] AS Source
WHERE Source.TaskUID = [dbo].[ZZZ_TEST_OF_UPDATE_MSP_EpmTask_UserView_RM].TaskUID
AND GetDate() - Source.TaskModifiedDate <= .01 -- Update any task changed in last 14 minutes (14 minutes = 1% of a full day, ie '.01')
GO
Query 3, the add:
SELECT
[MSP_EpmProject_UserView].[ProjectUID]
,[TaskUID]
,[TaskName]
,[TaskIndex]
,[TaskOutlineLevel]
,[TaskOutlineNumber]
,[TaskStartDate]
,[TaskFinishDate]
,[TaskActualStartDate]
,[TaskActualFinishDate]
,[TaskPercentCompleted]
,[Health]
,[Milestone Significance Level]
,[TaskModifiedDate]
,[TaskBaseline1StartDate]
,[TaskBaseline1FinishDate]
,[TaskBaseline1Duration]
,GetDate() as QueryTimestamp
INTO [ProjectWebApp].[dbo].[MSP_EpmTask_UserView_RM]
FROM [MSPSPRO].[ProjectWebApp].[dbo].[MSP_EpmTask_UserView]
Inner Join [MSPSPRO].[ProjectWebApp].[dbo].[MSP_EpmProject_UserView]
on [MSP_EpmProject_UserView].projectUID = [MSP_EpmTask_UserView].ProjectUID
WHERE [SMO Programs] = 'SMO Day 1 Release Management'
AND [Milestone Significance Level] is not null
/*AND [TaskModifiedDate] > (getdate() - 1)*/
Thoughts?
This looks like an ideal situation for a MERGE statement. If you haven't used them much or at all, I'd strongly suggest this site as a primer.
A MERGE can carry out INSERT, UPDATE, and DELETE in one shot, in the right conditions. The basic idea is that you compare rows in two tables, your source and destination, and from that comparison (and potentially other conditions) you then take the appropriate action.
MERGE can perform very well because it carries out these actions in bulk, but do test it: some people have found it slower than the separate statements in certain situations. Indexing correctly (Microsoft suggests indexing the join columns in both tables) can help immensely. Writing the MERGE statement correctly matters both for getting the right result and for getting good performance, so definitely do your reading if you haven't used one before. The above link is a good starter, but there are plenty of other articles around.
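To illustrate the three actions a MERGE folds into one statement, the sketch below performs them as separate statements inside a single transaction, each labelled with the MERGE clause it corresponds to. This is not a MERGE itself: it runs on SQLite through Python's sqlite3 module, and SQLite has no MERGE statement. The table names source_tasks and target_tasks, the reduced column list, the sample rows, and the "source row is newer" update condition are all assumptions made for the example; the question's own update uses a 14-minute window instead.

```python
import sqlite3


def sync_tasks():
    """Bring target_tasks in line with source_tasks: delete, update, then insert."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE source_tasks (TaskUID INTEGER PRIMARY KEY, TaskName TEXT,
                                   TaskModifiedDate TEXT);
        CREATE TABLE target_tasks (TaskUID INTEGER PRIMARY KEY, TaskName TEXT,
                                   TaskModifiedDate TEXT);
        INSERT INTO source_tasks VALUES (1, 'Design v2', '2024-01-02'),
                                        (2, 'Build',     '2024-01-01'),
                                        (4, 'Ship',      '2024-01-03');
        INSERT INTO target_tasks VALUES (1, 'Design',   '2024-01-01'),
                                        (2, 'Build',    '2024-01-01'),
                                        (3, 'Obsolete', '2024-01-01');
    """)
    with con:  # one transaction: commit on success, roll back on error
        # Corresponds to: WHEN NOT MATCHED BY SOURCE THEN DELETE
        con.execute("""
            DELETE FROM target_tasks
            WHERE NOT EXISTS (SELECT 1 FROM source_tasks s
                              WHERE s.TaskUID = target_tasks.TaskUID)
        """)
        # Corresponds to: WHEN MATCHED AND <source is newer> THEN UPDATE
        con.execute("""
            UPDATE target_tasks SET
              TaskName = (SELECT s.TaskName FROM source_tasks s
                          WHERE s.TaskUID = target_tasks.TaskUID),
              TaskModifiedDate = (SELECT s.TaskModifiedDate FROM source_tasks s
                                  WHERE s.TaskUID = target_tasks.TaskUID)
            WHERE EXISTS (SELECT 1 FROM source_tasks s
                          WHERE s.TaskUID = target_tasks.TaskUID
                            AND s.TaskModifiedDate > target_tasks.TaskModifiedDate)
        """)
        # Corresponds to: WHEN NOT MATCHED BY TARGET THEN INSERT
        con.execute("""
            INSERT INTO target_tasks (TaskUID, TaskName, TaskModifiedDate)
            SELECT s.TaskUID, s.TaskName, s.TaskModifiedDate
            FROM source_tasks s
            WHERE NOT EXISTS (SELECT 1 FROM target_tasks t
                              WHERE t.TaskUID = s.TaskUID)
        """)
    rows = con.execute("SELECT * FROM target_tasks ORDER BY TaskUID").fetchall()
    con.close()
    return rows


print(sync_tasks())
```

In a real MERGE the join between source and target is evaluated once and each row is routed to one of these branches, which is where the bulk-processing benefit described above comes from.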