Update values in each row based on foreign_key value - sql

Downloads table:
id (primary key)
user_id
item_id
created_at
updated_at
The user_id and item_id in this case are both incorrect; however, they're properly stored in the users and items tables, respectively (as import_id in each table). Here's what I'm trying to script:
downloads.each do |download|
  # Look up the real records via the imported (old) IDs
  user = User.find_by_import_id(download.user_id)
  item = Item.find_by_import_id(download.item_id)
  if user && item
    download.update_attributes(:user_id => user.id, :item_id => item.id)
  end
end
So: look up the user and item based on their respective import_ids, then update those values in the download record.
This takes forever with a ton of rows. Any way to do this in SQL?

If I understand you correctly, you simply need to add two subqueries to your SELECT statement to look up the correct IDs. For example:
SELECT id,
(SELECT correct_id FROM User WHERE import_id=user_id) AS UserID,
(SELECT correct_id FROM Item WHERE import_id=item_id) AS ItemID,
created_at,
updated_at
FROM Downloads
This will translate your incorrect user_ids to whatever ID you want to come from the User table and it will do the same for your item_ids. The information coming from SQL will now be correct.
If, however, you want to update the tables with the correct information, you could write this like so:
UPDATE Downloads
SET user_id = User.user_id,
item_id = Item.item_id
FROM Downloads
INNER JOIN User ON Downloads.user_id = User.import_id
INNER JOIN Item ON Downloads.item_id = Item.import_id
WHERE ...
Make sure to put something in the WHERE clause so you don't update every record in the Downloads table (unless that is the plan). I rewrote the above statement to be a bit more optimized since the original version had two SELECT statements per row, which is a bit intense.
Edit:
Since this is PostgreSQL, you can't have the table name in both the UPDATE and the FROM section. Instead, the tables in the FROM section are joined to the table being updated. Here is a quote about this from the PostgreSQL website:
When a FROM clause is present, what essentially happens is that the target table is joined to the tables mentioned in the fromlist, and each output row of the join represents an update operation for the target table. When using FROM you should ensure that the join produces at most one output row for each row to be modified. In other words, a target row shouldn't join to more than one row from the other table(s). If it does, then only one of the join rows will be used to update the target row, but which one will be used is not readily predictable.
http://www.postgresql.org/docs/8.1/static/sql-update.html
With this in mind, here is an example that I think should work (can't test it, sorry):
UPDATE Downloads
SET user_id = User.user_id,
item_id = Item.item_id
FROM User, Item
WHERE Downloads.user_id = User.import_id AND
Downloads.item_id = Item.import_id
That is the basic idea. Don't forget you will still need to add extra criteria to the WHERE section to limit the rows that are updated.
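To make the set-based idea concrete, here is a minimal runnable sketch. It uses SQLite through Python's sqlite3 module as a stand-in for PostgreSQL, and it writes the update with correlated subqueries rather than UPDATE ... FROM so that it runs on older SQLite versions too. The table layouts and sample values are assumptions made up for illustration (users.id and items.id are taken to be the correct keys, matching the Ruby snippet in the question).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, import_id INTEGER UNIQUE);
    CREATE TABLE items (id INTEGER PRIMARY KEY, import_id INTEGER UNIQUE);
    CREATE TABLE downloads (id INTEGER PRIMARY KEY, user_id INTEGER, item_id INTEGER);
    -- Hypothetical sample data: downloads currently hold the imported IDs.
    INSERT INTO users VALUES (1, 501), (2, 502);
    INSERT INTO items VALUES (10, 901), (11, 902);
    INSERT INTO downloads VALUES (1, 501, 902), (2, 502, 901), (3, 999, 901);
""")

# One statement rewrites every row that has BOTH a matching user and item,
# mirroring the `if user && item` check in the Ruby loop.
conn.execute("""
    UPDATE downloads
    SET user_id = (SELECT u.id FROM users u WHERE u.import_id = downloads.user_id),
        item_id = (SELECT i.id FROM items i WHERE i.import_id = downloads.item_id)
    WHERE EXISTS (SELECT 1 FROM users u WHERE u.import_id = downloads.user_id)
      AND EXISTS (SELECT 1 FROM items i WHERE i.import_id = downloads.item_id)
""")
conn.commit()

rows = conn.execute("SELECT id, user_id, item_id FROM downloads ORDER BY id").fetchall()
print(rows)
```

Download 3 points at an import_id that has no user, so it is left untouched instead of being set to NULL. Because import_id is unique in each lookup table, each download matches at most one row, which is the condition the PostgreSQL documentation quoted above asks for.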

I'm totally guessing from your question, but you have some kind of lookup table that will match an import user_id with the real user_id, and similarly for items. I.e. the assumption is that your line of code:
User.find_by_import_id(download.user_id)
hits the database to do the lookup. The import_users / import_items tables are just the names I've given to the lookup tables to do this.
UPDATE downloads
SET downloads.user_id = import_users.user_id
, downloads.item_id = import_items.item_id
FROM downloads
INNER JOIN import_users ON downloads.user_id = import_users.import_user_id
INNER JOIN import_items ON downloads.item_id = import_items.import_item_id
Either way (lookup is in DB, or it's derived from code), would it not just be easier to insert the information correctly in the first place? This would mean you can't have any FKs on your table, since sometimes they point to one table and other times they point to another. Seems a bit odd.

Related

SQL Query across two tables only show most recently updated result per tag address

I have two tables: violator_state and violator_tags
violator_state:
m_state_id
is_violating
m_translatedid
m_tag
m_violator_tag
This table holds the "tags" which has an unchanging row count of 10 in this case. The purpose is to list out each tag present, connect the full tag address (m_violator_tag) with its shorthand name (m_tag) and state whether it is in "violation". I need to use this table as reference because of the link between m_violator_tag and m_tag.
violator_tags
m_violator_id
m_eval_time_from
m_eval_time_to
m_tag
m_tag_peers
m_tag_position
This table is constantly having new rows added to it, holding the information of which tags are in violation with a specific tag. So it would show T6 in violation with T1, T2, T9, etc.
I am looking to create a query which joins the two tables to show only the most recently updated (largest m_eval_time_from) for each tag.
I am using the following query to join the two tables but I expect m_translatedid and m_tag to match but they do not. Unsure why.
SELECT violator_state.m_violator_tag, violator_state.is_violating, violator_state.m_translatedid, violator_tags.m_tag, violator_tags.m_eval_time_to, violator_tags.m_tag_peers,
violator_tags.m_tag_position, violator_tags.m_eval_time_from
FROM violator_tags CROSS JOIN
violator_state
[Images omitted: Violation_state table; violation_tags table; results of my (incorrect) query]
Any suggestions on what I should try?
Your CROSS JOIN will give you a cartesian product where EVERY row in the first table is paired with ALL the rows in the second table e.g. if you have 10 rows in each, you will get 10 x 10 = 100 rows in the result! I believe you need to join the tables on the m_tag column and select the violator_tags row with the latest date. The query below should do this for you (though you haven't provided your question in a manner that makes it easy for me to double-check my code - see the link provided by a_horse_with_no_name for more on this or use a website like db-fiddle to set up your example).
SELECT vs.m_violator_tag,
vs.is_violating,
vs.m_translatedid,
vt.m_tag,
vt.m_eval_time_to,
vt.m_tag_peers,
vt.m_tag_position,
vt.m_eval_time_from
FROM violator_tags vt
JOIN violator_state vs
ON vt.m_tag = vs.m_tag
AND vt.m_eval_time_from = (SELECT MAX(vt2.m_eval_time_from)
FROM violator_tags vt2
WHERE vt2.m_tag = vt.m_tag)
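As a quick way to check the "latest row per tag" logic, here is a small sketch using SQLite through Python's sqlite3 module. The column subset, the tag addresses and the timestamps are made-up assumptions, not the asker's data. The inner table gets its own alias (vt2) so that MAX() is computed over all rows sharing the outer row's m_tag.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE violator_state (m_tag TEXT PRIMARY KEY, m_violator_tag TEXT, is_violating INTEGER);
    CREATE TABLE violator_tags (m_violator_id INTEGER PRIMARY KEY, m_tag TEXT,
                                m_tag_peers TEXT, m_eval_time_from TEXT);
    -- Hypothetical sample rows: T1 has two evaluations, T6 has one.
    INSERT INTO violator_state VALUES ('T1', 'site/area/T1', 1), ('T6', 'site/area/T6', 1);
    INSERT INTO violator_tags VALUES
        (1, 'T1', 'T2',       '2020-01-01 10:00:00'),
        (2, 'T1', 'T2,T9',    '2020-01-01 11:00:00'),
        (3, 'T6', 'T1,T2,T9', '2020-01-01 09:30:00');
""")

# Join on m_tag, then keep only the row whose m_eval_time_from is the
# maximum for that tag (correlated subquery over a second alias).
latest = conn.execute("""
    SELECT vs.m_violator_tag, vt.m_tag, vt.m_tag_peers, vt.m_eval_time_from
    FROM violator_tags vt
    JOIN violator_state vs ON vt.m_tag = vs.m_tag
    WHERE vt.m_eval_time_from = (SELECT MAX(vt2.m_eval_time_from)
                                 FROM violator_tags vt2
                                 WHERE vt2.m_tag = vt.m_tag)
    ORDER BY vt.m_tag
""").fetchall()

for row in latest:
    print(row)
```

With this data the result has one row per tag, and for T1 it is the 11:00 evaluation. If two rows for the same tag share the same maximum timestamp, both would be returned, so a tie-breaker column may be needed in practice.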

UPDATE statement won't update anything

I've been creating a recipe database for a class project using SSMS. I have one table with all the recipe names that I'd inserted into a lookup table (RecipeDetails), but I'd left out their associated IDs, so I wanted to write a bit of code to update my table with those values. I feel like this code should work fine (it seemed to work in the past, but I screwed up something unrelated and had to restore my most recent database backup), but now it just isn't. It says it's affected all the rows I expect it to affect, but those rows still list the RecipeID as NULL.
I'm pulling my data from RecipesTable, which includes the name of each recipe and an ID for each. In RecipeDetails I have a column for RecipeName, RecipeID, Ingredient, IngredientID, and an ID for each row in the table. As of now, I have all my recipes in the table, but not their associated IDs. I would like to move the IDs over from one table to the other.
UPDATE rd
SET rd.RecipeID = rt.RecipeID
FROM RecipeDetail AS rd
FULL JOIN RecipesTable AS rt ON rd.RecipeID = rt.RecipeID
WHERE rt.RecipeName = rd.RecipeName;
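You should use an inner join rather than an outer join, and join on RecipeName rather than RecipeID: the RecipeID values in RecipeDetail are still NULL, so a comparison on that column can never match.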
UPDATE rd
SET rd.RecipeID = rt.RecipeID
FROM RecipeDetail rd JOIN
RecipesTable rt
ON rd.RecipeName = rt.RecipeName;
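The effect of the join column can be checked with a small sketch. It uses SQLite through Python's sqlite3 module instead of SQL Server, so the update is written as a correlated subquery rather than T-SQL's UPDATE ... FROM. The DetailID column and the sample recipes are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE RecipesTable (RecipeID INTEGER PRIMARY KEY, RecipeName TEXT UNIQUE);
    CREATE TABLE RecipeDetail (DetailID INTEGER PRIMARY KEY, RecipeName TEXT,
                               RecipeID INTEGER, Ingredient TEXT);
    -- Hypothetical sample data: names are filled in, IDs are still NULL.
    INSERT INTO RecipesTable VALUES (1, 'Pancakes'), (2, 'Chili');
    INSERT INTO RecipeDetail VALUES
        (1, 'Pancakes', NULL, 'Flour'),
        (2, 'Pancakes', NULL, 'Milk'),
        (3, 'Chili',    NULL, 'Beans');
""")

# Joining on RecipeID pairs nothing up, because NULL never equals anything.
matches_on_id = conn.execute("""
    SELECT COUNT(*) FROM RecipeDetail rd
    JOIN RecipesTable rt ON rd.RecipeID = rt.RecipeID
""").fetchone()[0]

# Matching on RecipeName copies the IDs across.
conn.execute("""
    UPDATE RecipeDetail
    SET RecipeID = (SELECT rt.RecipeID FROM RecipesTable rt
                    WHERE rt.RecipeName = RecipeDetail.RecipeName)
    WHERE RecipeID IS NULL
""")
conn.commit()

filled = conn.execute("SELECT DetailID, RecipeID FROM RecipeDetail ORDER BY DetailID").fetchall()
print(matches_on_id, filled)
```

This relies on RecipeName being unique in RecipesTable; with duplicate names the subquery would be ambiguous.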

Updating column values in a table based on join with another table?

I have two tables called resource and resource_owners.
The resource_owners table contains two columns called resource_id and owner_id.
resource_id | owner_id |
-------------+-----------
The resource table contains two relevant columns: parentresource_id and id.
parentresource_id | id |
-------------------+------
resource_owners.resource_id, resource.id and resource.parentresource_id are all join columns between the two tables. Now what I want to do is the following:
For every row in the resource table, take the value in id, match it with a corresponding resource_owners.resource_id, retrieve the corresponding resource_owners.owner_id value (call it $owner_value), then set resource_owners.owner_id to $owner_value where resource_owners.resource_id equals resource.parentresource_id.
In conversational terms, this is what I want to do: For each resource, I want to re-assign the parent-resource's owner_id to be the resource's owner_id.
I've tried to wrap my head around this problem and it looks like I'll need two different table joins (resource.id with resource_owners.resource_id and resource.parentresource_id with resource_owners.resource_id).
Can someone point me in the right direction? Is what I want even possible with a single query? I'm okay with a PostgreSQL script as well if that works better for my use case.
I'm not sure what database you are using, but you should be able to accomplish this using the logic below, if I understood your question correctly:
UPDATE RESOURCE_OWNERS
SET OWNER_ID = UP.OWNER_ID
FROM (SELECT rc.ID, TMP.OWNER_ID
      FROM (SELECT RSC.ID, ROWRS.OWNER_ID, ROWRS.RESOURCE_ID
            FROM RESOURCE RSC
            JOIN RESOURCE_OWNERS ROWRS ON RSC.ID = ROWRS.RESOURCE_ID) TMP
      JOIN RESOURCE rc ON rc.PARENTRESOURCE_ID = TMP.RESOURCE_ID) UP
WHERE RESOURCE_OWNERS.RESOURCE_ID = UP.ID;

Avoiding Duplicates when appending records in Access

I am aware this has been asked multiple times, but for one reason or another the solutions are not working for me.
Database Layout:
I have Table1 (Scanner_Location), which gets its data pulled from another table/subform on a form (Scanner IBOB). It holds columns: FP#, Count, Location, Model_ID, PK-SL_ID
Table2 (Scanner Detail) Holds Two of the three data columns: (FP#, Location PK-SN)
Table3 (Scanner_Model) Holds the last data column, displayed in a subform. (PK-Model_ID)
The user will input FP#, and location in one section of the form, then navigate to the subform, and select multiple Models, and enter the count (Textbox). Once Selected, they click an 'update' button that executes my queries. (Of which I have an update, AND an Append Query)
The problem is, just using an update query doesn't add the records. And using an Append query creates duplicates of the existing data.
Here's how the flow carries out:
User selects Model 1 and Model 2 with a count of 4 and an FP# of 100. Clicks update.
The queries update, and the information enters correctly.
User Selects the same models again (Model_Select), with the same FP# and count, the Table1 has the same information entered again, with a different primary key.
The goal:
The append query creates duplicates of existing data. I only want my update and/or append queries to:
Update the existing data - Looking for anything with the same FP#
Add any records that do not exist already (Looking at Model_ID and FP#)
INSERT INTO Scanner_Location ( Model_ID, FootPrints_Num, Location_ID, Scanner_Loc_Cnt )
SELECT Scanner_Model.Model_ID, [Forms]![Scanner_IBOB]![fpNum_txt] AS [FP#],
[Forms]![Scanner_IBOB]![Location_Cbo_main] AS Location,
[Forms]![Scanner_IBOB]![Scanner_Loc_CntTxt] AS [Count]
FROM Scanner_Detail
RIGHT JOIN Scanner_Model ON Scanner_Detail.Model_ID = Scanner_Model.Model_ID
WHERE (((Scanner_Model.SM_Acc_Select)=True)
AND ((NOT Exists (SELECT * FROM Scanner_location
WHERE (((Forms!Scanner_IBOB!fpNum_txt)=Forms!Scanner_IBOB!fpNum_txt)
And ((Scanner_Model.SM_Acc_Select)=True)); ))=False));
No query named 'Update_SLoc_Acc53' - there are 'Update_SLoc_Acc3' and 'Update_SLoc_Acc54'. I modified 'Update_SLoc_Acc54' because it is the one called by the code.
The query was not pulling the Location_ID from the combobox. I found the Bound Column was set to 1 and should be 0 to reference the Location_ID column because column index begins with 0. Can hide this column from user by setting width to 0.
This query seems to work:
INSERT INTO Scanner_Location ( Model_ID, FootPrints_Num, Location_ID, Scanner_Loc_Cnt )
SELECT Scanner_Model.Model_ID, [Forms]![Scanner_IBOB]![fpNum_txt] AS FPNum,
[Forms]![Scanner_IBOB]![Location_Cbo_main] AS Location,
[Forms]![Scanner_IBOB]![Scanner_Loc_CntTxt] AS CountMod
FROM Scanner_Model
WHERE (((Scanner_Model.SM_Acc_Select)<>False)
AND (([Model_ID] & [Forms]![Scanner_IBOB]![fpNum_txt] &
[Forms]![Scanner_IBOB]![Location_Cbo_main])
NOT IN (SELECT Model_ID & Footprints_Num & Location_ID FROM Scanner_Location)));
Note I did not use # in field name. Advise not to use punctuation/special characters in names with only exception of underscore. Also used CountMod instead of Count as field name.
Why the requirement to select two models? What if one is added and the other isn't?
I have concerns about the db structure.
Don't think App_Location and App_Detail should both be linking to other tables. Why is Location_ID the primary key in App_Location as well as primary key in Location_Data? This is a 1-to-1 relationship.
Is Serial_Number the serial number for scanner? Why is it a primary key in Telnet? This also results in a 1-to-1 relationship in which case might as well combine them.
If an app is associated with a scanner and scanner is associated with a location then don't need location associated with app. Same goes for scanner and telnet.
Scanner_Location table is not linked to anything. If purpose of this table is to track a count of models/footprints/locations -- as already advised this is usually not a good idea. Ideally, count data should be calculated by aggregate query of raw data records when the information is needed.
Maybe use NOT IN, something like:
[some identifier field] NOT IN (SELECT [some identifier field] FROM
Review EXISTS vs IN
Consider following adjusted append query that checks existence of matched Model_ID and FP_Num in Scanner_Location. If matches do not exist, then query imports selected records as they would be new records and not duplicates. Also, table aliases are used for readability and subquery correlation.
INSERT INTO Scanner_Location ( Model_ID, FootPrints_Num, Location_ID, Scanner_Loc_Cnt )
SELECT m.Model_ID, [Forms]![Scanner_IBOB]![fpNum_txt] AS [FP#],
[Forms]![Scanner_IBOB]![Location_Cbo_main] AS Location,
[Forms]![Scanner_IBOB]![Scanner_Loc_CntTxt] AS [Count]
FROM Scanner_Detail d
RIGHT JOIN Scanner_Model m ON d.Model_ID = m.Model_ID
WHERE ((m.SM_Acc_Select = True)
AND (NOT EXISTS (SELECT 1 FROM Scanner_Location loc
WHERE ((loc.FootPrints_Num = Forms!Scanner_IBOB!fpNum_txt)
AND (loc.Model_ID = m.Model_ID)) ) ));
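The NOT EXISTS guard can be tried outside Access with a small sketch. It uses SQLite through Python's sqlite3 module, with named parameters standing in for the form controls; the helper name append_selected, the SL_ID column and the sample values are assumptions for illustration, and the join to Scanner_Detail is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Scanner_Model (Model_ID INTEGER PRIMARY KEY, SM_Acc_Select INTEGER);
    CREATE TABLE Scanner_Location (
        SL_ID INTEGER PRIMARY KEY AUTOINCREMENT,
        Model_ID INTEGER, FootPrints_Num INTEGER,
        Location_ID INTEGER, Scanner_Loc_Cnt INTEGER);
    -- Hypothetical data: models 1 and 2 are ticked in the subform, 3 is not.
    INSERT INTO Scanner_Model VALUES (1, 1), (2, 1), (3, 0);
""")

def append_selected(fp_num, location_id, count):
    """Append the selected models, skipping Model_ID/FP# pairs already stored."""
    conn.execute("""
        INSERT INTO Scanner_Location (Model_ID, FootPrints_Num, Location_ID, Scanner_Loc_Cnt)
        SELECT m.Model_ID, :fp, :loc, :cnt
        FROM Scanner_Model m
        WHERE m.SM_Acc_Select = 1
          AND NOT EXISTS (SELECT 1 FROM Scanner_Location sl
                          WHERE sl.FootPrints_Num = :fp
                            AND sl.Model_ID = m.Model_ID)
    """, {"fp": fp_num, "loc": location_id, "cnt": count})
    conn.commit()

append_selected(100, 7, 4)   # first click on the update button: two rows added
append_selected(100, 7, 4)   # second click with the same input: nothing added

row_count = conn.execute("SELECT COUNT(*) FROM Scanner_Location").fetchone()[0]
print(row_count)
```

The key point is that the subquery is correlated on both the footprint number and the model, so a repeated submission finds its own earlier rows and inserts nothing.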

SQL Delete based on max value

I have a table that has a composite key of 3 columns
st_id, sj_id, order
and want to delete a row based on a specific st_id and sj_id and by taking the max(order)
Could you please help?
As far as I know, you'll need to do this in two steps (this is from memory, so may not compile first time):
DELETE
FROM table
WHERE st_id = my_st_id
AND sj_id = my_sj_id
AND order IN (
SELECT MAX(order)
FROM table
WHERE st_id = my_st_id
AND sj_id = my_sj_id)
What this does is perform the inner (SELECT) query first, returning the maximum order. Those results then get passed to the outer query which does the delete.
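Here is a small runnable sketch of the same delete, using SQLite through Python's sqlite3 module. The table name st_sj, the helper name delete_highest_order and the sample rows are assumptions for illustration; order is quoted because it is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE st_sj (st_id INTEGER, sj_id INTEGER, "order" INTEGER,
                        PRIMARY KEY (st_id, sj_id, "order"));
    -- Hypothetical sample rows.
    INSERT INTO st_sj VALUES (1, 1, 1), (1, 1, 2), (1, 1, 3), (1, 2, 5), (2, 1, 9);
""")

def delete_highest_order(st_id, sj_id):
    """Delete the row with the largest "order" for one (st_id, sj_id) pair."""
    conn.execute("""
        DELETE FROM st_sj
        WHERE st_id = :st
          AND sj_id = :sj
          AND "order" = (SELECT MAX("order") FROM st_sj
                         WHERE st_id = :st AND sj_id = :sj)
    """, {"st": st_id, "sj": sj_id})
    conn.commit()

delete_highest_order(1, 1)

remaining = conn.execute(
    'SELECT st_id, sj_id, "order" FROM st_sj ORDER BY st_id, sj_id, "order"'
).fetchall()
print(remaining)
```

Only the (1, 1, 3) row is removed; rows for other st_id/sj_id pairs are untouched even when their order value is larger. Some databases (MySQL, for example) reject a subquery on the table being deleted from, in which case the maximum has to be fetched first or wrapped in a derived table.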