OK, so I am currently developing in Oracle 11g Express Edition for a college assignment. I have run into an issue with how to auto-increment IDs and then use them in the following parent table. So I have an address table and then a city table, for instance. Here is the address SQL code:
create table address(
addressid int primary key,
cityid int,
countyid int,
streetnameid int,
postcodeid int,
doornumid int,
natid int,
foreign key (doornumid) references doornum,
foreign key (postcodeid) references postcode,
foreign key (streetnameid) references streetname,
foreign key (countyid) references county,
foreign key (cityid) references city,
foreign key (natid) references nat);
So, as you can see, I am referencing the city table via a foreign key, and this is the city table's SQL code below:
create table city(
cityid int primary key,
city varchar(45));
The city table uses a sequence to auto-increment the ID when rows are inserted into it:
create SEQUENCE seq_cityID
MINVALUE 1
START WITH 1
INCREMENT BY 1
CACHE 11;
So, simply, I am generating a cityid for every insert like this:
INSERT INTO city (cityid, city)
values(seq_cityID.nextval, 'Oxford');
INSERT INTO city (cityid, city)
values(seq_cityID.nextval, 'Oxford');
But my issue is that when I am referencing this in its parent table, e.g. the address table, how do I reference the ID for that row of data and make sure that the correct ID is pulled in, without having to type it in manually?
INSERT INTO address (addressid, cityid, countyid, streetnameid, postcodeid, doornumid, natid)
values(seq_addressID.nextval, seq_cityID.nextval, seq_countyID.nextval, seq_streetnameID.nextval, seq_postcodeID.nextval, seq_doornumID.nextval, seq_natID.nextval);
INSERT INTO address (addressid, cityid, countyid, streetnameid, postcodeid, doornumid, natid)
values(seq_addressID.nextval, seq_cityID.nextval, seq_countyID.nextval, seq_streetnameID.nextval, seq_postcodeID.nextval, seq_doornumID.nextval, seq_natID.nextval);
This is the simple insert into address, which currently just references nextval, but I do not believe it will pull in the ID of the row of data I want. How do I effectively pull that ID through from the child table and get it into the parent correctly and automatically?
You have a misconception about tables and their relations.
Your data model has one table for all city names, one for all street names, one for all door numbers, etc. But why would you have a table that contains door numbers? What does it tell you? Door numbers are not an entity by themselves; they belong to a street. And streets belong to cities. It would make no sense to look for all the people who live at some number 12 or on some Main Street.
One possible data model:
country (country_id, country_name, country_code)
pk country_id
county (county_id, county_name, country_id)
pk county_id
fk country_id -> country
city (city_id, city_name, postcode, county_id)
pk city_id
fk county_id -> county
street (street_id, street_name, city_id)
pk street_id
fk city_id -> city
address (address_id, street_id, door_number)
pk address_id
fk street_id -> street
With this data model we can check for consistency. If we want to enter the address 111 Millstreet, Oxford, Italy, the database will tell us there is no Oxford in Italy. We can also easily find addresses in the same street (and not like "there are five million addresses in Main Street", but "800 addresses in Main Street, Ohio").
If you want to insert a new address, you look up the country, then the county, etc. until you get to the street ID.
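For illustration, that model could be created along these lines (a sketch only; the column names follow the outline above, the data types and lengths are assumptions, and the IDs would be filled from sequences just as in the question):
create table country (
  country_id   number primary key,
  country_name varchar2(60),
  country_code varchar2(3)
);
create table county (
  county_id   number primary key,
  county_name varchar2(60),
  country_id  number references country
);
create table city (
  city_id   number primary key,
  city_name varchar2(60),
  postcode  varchar2(10),
  county_id number references county
);
create table street (
  street_id   number primary key,
  street_name varchar2(60),
  city_id     number references city
);
create table address (
  address_id  number primary key,
  street_id   number references street,
  door_number varchar2(10)
);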
With your original tables:
declare
  v_countyid integer;
  v_cityid integer;
  v_streetnameid integer;
begin
  insert into city (cityid, city) values (seq_cityID.nextval, 'Saint Petersburg') returning cityid into v_cityid;
  insert into streetname (streetnameid, street_name) values (seq_streetnameID.nextval, 'Park Drive') returning streetnameid into v_streetnameid;
  ...
  insert into address (addressid, cityid, streetnameid, ...) values (seq_addressID.nextval, v_cityid, v_streetnameid, ...);
  commit;
end;
But then, there would be duplicate cities, street names, etc. in the tables. So we'd rather look each value up first and only insert it when it is missing (note that a select ... into that finds no row raises no_data_found, so each lookup is wrapped in its own block):
declare
  v_countyid integer;
  v_cityid integer;
  v_streetnameid integer;
begin
  -- look up the city; create it only if it does not exist yet
  begin
    select cityid into v_cityid from city where city = 'Saint Petersburg';
  exception
    when no_data_found then
      insert into city (cityid, city) values (seq_cityID.nextval, 'Saint Petersburg') returning cityid into v_cityid;
  end;
  -- same for the street name
  begin
    select streetnameid into v_streetnameid from streetname where street_name = 'Park Drive';
  exception
    when no_data_found then
      insert into streetname (streetnameid, street_name) values (seq_streetnameID.nextval, 'Park Drive') returning streetnameid into v_streetnameid;
  end;
  ...
  insert into address (addressid, cityid, streetnameid, ...) values (seq_addressID.nextval, v_cityid, v_streetnameid, ...);
  commit;
end;
But as mentioned in my other answer, this data model doesn't make much sense anyway.
Related
I am trying to create a star schema and am currently working on the dimension tables. I want to copy several columns from one table to another, but at the same time I want the resulting rows to be unique on one of the columns.
These are the tables I am using:
DWH_PRICE_PAID_RECORDS
CREATE TABLE "DWH_PRICE_PAID_RECORDS" ("TRANSACTION_ID" VARCHAR(50) NOT NULL, "PRICE" INTEGER, "DATE_OF_TRANSFER" DATE NOT NULL, "PROPERTY_TYPE" CHAR(1), "OLD_NEW" CHAR(1), "DURATION" CHAR(1), "TOWN_CITY" VARCHAR(50), "DISTRICT" VARCHAR(50), "COUNTY" VARCHAR(50), "PPDCATEGORY_TYPE" CHAR(1), "RECORD_TYPE" CHAR(1));
ALTER TABLE "DWH_PRICE_PAID_RECORDS" ADD CONSTRAINT "PK3" PRIMARY KEY ("TRANSACTION_ID");
and DIM_REGION
CREATE TABLE "DIM_REGION" ("REGION_ID" INTEGER generated always as identity (start with 1 increment by 1), "TRANSACTION_ID" VARCHAR(50), "TOWN" VARCHAR(50), "COUNTY" VARCHAR(50), "DISTRICT" VARCHAR(50), "LATITUDE" VARCHAR(50), "LONGITUDE" VARCHAR(50), "COUNTRY_STRING" VARCHAR(50));
ALTER TABLE "DIM_REGION" ADD CONSTRAINT "PK8" PRIMARY KEY ("REGION_ID");
My first attempt was to use "select distinct", but that only removes duplicates across ALL columns combined. I want a region dimension where "town" is the identifier used to match DIM_REGION with the fact table in the data mart I will create later (called DM_PRICE_PAID_RECORDS).
The DWH_PRICE_PAID_RECORDS table has around 10k records but only 938 unique towns. I want those 938 towns in DIM_REGION, each with its own ID, along with other columns such as county, district, etc.
This is what works, but then of course everything other than town is NULL:
INSERT INTO DIM_REGION (TOWN) SELECT (town_city) from DWH_PRICE_PAID_RECORDS GROUP BY town_city;
So I thought I only had to add the additional columns:
INSERT INTO DIM_REGION (TOWN, County, District) SELECT town_city, county, district from DWH_PRICE_PAID_RECORDS GROUP BY town_city;
but when I do that I get this error message (the original message is in German and I had to translate it, sorry):
ERROR 42Y36: Column reference "DWH_PRICE_PAID_RECORDS.COUNTY" is invalid, or is part of an invalid statement. When using SELECT with GROUP BY, the selected columns and expressions must be valid grouping or aggregate expressions.
Can you help me or do you have another idea how else I could get the result I seek?
Thank you very much!
If the other 2 columns don't matter, you can do this:
INSERT INTO DIM_REGION (TOWN, County, District)
SELECT town_city, MAX(county), MAX(district)
FROM DWH_PRICE_PAID_RECORDS
GROUP BY town_city
This will get you only 1 row for each town.
You are so close!
INSERT INTO DIM_REGION (TOWN, County, District)
SELECT town_city, county, district
FROM DWH_PRICE_PAID_RECORDS
GROUP BY town_city, county, district;
That should do the job. When using a group by, everything in the SELECT list that isn't an aggregate has to appear in the GROUP BY clause.
As an aside, does TRANSACTION_ID really belong in the dimension table?
I'm new to databases, so I'll start by showing what I would do if I were using a simple table in a CSV file. Presently, I'm building a Shiny (R) app to keep track of people taking part in studies. The idea is to make sure no one is doing more than one study at the same time, and that enough time has passed between studies.
A single table would look something like this:
study_title contact_person tasks first_name last_name
MX9345-3 John Doe OGTT Michael Smith
MX9345-3 John Doe PVT Michael Smith
MX9345-3 John Doe OGTT Julia Barnes
MX9345-3 John Doe PVT Julia Barnes
...
So each study has a single contact person, but multiple tasks. It is possible other studies will use the same tasks.
Each task should have a description
Each person can be connected to multiple studies (the final database has timestamps to make sure this does not happen at the same time), and consequently repeat the same tasks.
The SQLite code could look something like this:
CREATE TABLE studies (study_title TEXT, contact_person TEXT);
CREATE TABLE tasks (task_name TEXT, description TEXT);
CREATE TABLE participants (first_name TEXT, last_name TEXT);
Now I'm stuck. If I add a primary key and foreign keys (say, in studies, an ID for each study and foreign keys for each task and person), the primary keys would have to repeat, which is not possible. A study is defined by the tasks it contains (akin to an album and its music tracks).
How should I approach this situation in SQLite? And importantly, how are the INSERTs done in these situations? I've seen lots of SELECT examples, but few INSERTs that match all IDs in each table, for example when adding a new person to a running study.
What you do is use tables to map/reference/relate/associate.
The first step would be to utilise aliases of the rowid, so instead of :-
CREATE TABLE studies (study_title TEXT, contact_person TEXT);
CREATE TABLE tasks (task_name TEXT, description TEXT);
CREATE TABLE participants (first_name TEXT, last_name TEXT);
you would use :-
CREATE TABLE studies (id INTEGER PRIMARY KEY,study_title TEXT, contact_person TEXT);
CREATE TABLE tasks (id INTEGER PRIMARY KEY, task_name TEXT, description TEXT);
CREATE TABLE participants (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
With SQLite, INTEGER PRIMARY KEY (or INTEGER PRIMARY KEY AUTOINCREMENT) makes the column (id in the above, although it can have any valid column name) an alias of the rowid (a maximum of 1 per table), which uniquely identifies the rows.
For why not to use AUTOINCREMENT, plus more, see SQLite Autoincrement.
Insert some data for demonstration :-
INSERT INTO studies (study_title, contact_person)
VALUES ('Maths','Mr Smith'),('English','Mrs Taylor'),('Geography','Mary White'),('Physics','Mr Smith');
INSERT INTO tasks (task_name,description)
VALUES ('Task1','Do task 1'),('Task2','Do task 2'),('Task3','Do task 3'),('Task4','Do task 4'),('Mark','Mark the studies');
INSERT INTO participants (first_name,last_name)
VALUES ('Tom','Jones'),('Susan','Smythe'),('Sarah','Toms'),('Alan','Francis'),('Julian','MacDonald'),('Fred','Bloggs'),('Rory','Belcher');
The first mapping/reference table :-
CREATE TABLE IF NOT EXISTS study_task_relationship (study_reference INTEGER, task_reference INTEGER, PRIMARY KEY (study_reference,task_reference));
Map/relate Studies with Tasks (many-to-many possible).
Do some mapping (INSERT some data) :-
INSERT INTO study_task_relationship
VALUES
(1,2), -- Maths Has Task2
(1,5), -- Maths has Mark Questions
(2,1), -- English has Task1
(2,4), -- English has Task4
(2,5), -- English has Mark questions
(3,3), -- Geography has Task3
(3,1), -- Geography has Task1
(3,2), -- Geography has Task2
(3,5), -- Geography has Mark Questions
(4,4) -- Physics has Task4
;
- See comments on each line
List the Studies along with the tasks
SELECT study_title, task_name -- (just want the Study title and task name)
FROM study_task_relationship -- use the mapping table as the main table
JOIN studies ON study_reference = studies.id -- get the related studies
JOIN tasks ON task_reference = tasks.id -- get the related tasks
ORDER BY study_title -- Order by Study title
results in one row per study/task pair, ordered by study title.
List each study with all its tasks
SELECT study_title, group_concat(task_name,'...') AS tasklist
FROM study_task_relationship
JOIN studies ON study_reference = studies.id
JOIN tasks ON task_reference = tasks.id
GROUP BY studies.id
ORDER by study_title;
results in one row per study, with all of that study's tasks concatenated into the tasklist column.
Add study-participants associative table and populate it :-
CREATE TABLE IF NOT EXISTS study_participants_relationship (study_reference INTEGER, participant_reference INTEGER, PRIMARY KEY (study_reference,participant_reference));
INSERT INTO study_participants_relationship
VALUES
(1,1), -- Maths has Tom Jones
(1,5), -- Maths has Julian MacDonald
(1,6), -- Maths has Fred Bloggs
(2,4), -- English has Alan Francis
(2,7), -- English has Rory Belcher
(3,3), -- Geography has Sarah Toms
(3,2) -- Susan Smythe
;
You can now, as an example, get a list of participants and their tasks along with the study title :-
SELECT study_title, task_name, participants.first_name ||' '||participants.last_name AS fullname
FROM study_task_relationship
JOIN tasks ON study_task_relationship.task_reference = tasks.id
JOIN studies On study_task_relationship.study_reference = studies.id
JOIN study_participants_relationship ON study_task_relationship.study_reference = study_participants_relationship.study_reference
JOIN participants ON study_participants_relationship.participant_reference = participants.id
ORDER BY fullname, study_title
which results in one row for each combination of participant, study and task, ordered by participant name and then study title.
FOREIGN KEYS
As you can see there is no actual need for defining FOREIGN KEYS. They are really just an aid to stop you inadvertently doing something like :-
INSERT INTO study_participants_relationship VALUES(30,25);
There is no such study and no such participant.
To utilise FOREIGN KEYs you have to ensure that they are enabled; the simplest way is just to issue the command to turn them on (as if it were a normal SQL statement):
PRAGMA foreign_keys=1
A FOREIGN KEY is a constraint: it stops you inserting, updating or deleting a row that would violate the constraint/rule.
Basically, the rule is that the column on which the FOREIGN KEY is defined (the child) must have a value that exists in the referenced table/column (the parent).
So, assuming that FOREIGN KEYs are turned on, then coding :-
CREATE TABLE IF NOT EXISTS study_participants_relationship
(
    study_reference INTEGER REFERENCES studies(id), -- column foreign key constraint
    participant_reference INTEGER,
    FOREIGN KEY (participant_reference) REFERENCES participants(id), -- table foreign key constraint
    PRIMARY KEY (study_reference,participant_reference)
);
Would result in INSERT INTO study_participants_relationship VALUES(30,25); failing e.g.
FOREIGN KEY constraint failed: INSERT INTO study_participants_relationship VALUES(30,25);
It fails as there is no row in studies with an id whose value is 30 (the first, column-level, foreign key constraint). If the value 30 did exist then the second constraint would kick in, as there is no row in participants with an id of 25.
There is no difference between a column Foreign key constraint and a table Foreign key constraint other than where and how they are coded.
However, the above wouldn't stop you deleting all rows from the study_participants_relationship table, but it would stop you deleting a row from the studies or participants table while it is referenced by the study_participants_relationship table.
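For example, assuming the table was created with the foreign keys above and enforcement is turned on :-
DELETE FROM studies WHERE id = 1;
-- fails with "FOREIGN KEY constraint failed", because study 1 is still referenced by study_participants_relationship
DELETE FROM study_participants_relationship WHERE study_reference = 1;
-- succeeds, as nothing references the relationship rows themselves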
"deal with an id that needs to be matched to multiple ids in SQLite?"
For many-to-many couplings, make extra coupling tables, like the study_task and participant_task tables below. This is many-to-many since a task can be on many studies and a study can have many tasks.
"make sure no one is doing more than one study at the same time"
That could be handled by letting each participant only have a column for current study (no place for more than one then).
PRAGMA foreign_keys = ON;
CREATE TABLE study (id INTEGER PRIMARY KEY, study_title TEXT, contact_person TEXT);
CREATE TABLE task (id INTEGER PRIMARY KEY, task_name TEXT, description TEXT);
CREATE TABLE participant (
id INTEGER PRIMARY KEY,
first_name TEXT,
last_name TEXT,
id_current_study INTEGER references study(id),
started_current_study DATE
);
CREATE TABLE study_task (
id_study INTEGER NOT NULL references study(id),
id_task INTEGER NOT NULL references task(id),
primary key (id_study,id_task)
);
CREATE TABLE participant_task (
id_participant INTEGER NOT NULL references participant(id),
id_task INTEGER NOT NULL references task(id),
status TEXT check (status in ('STARTED', 'DELIVERED', 'PASSED', 'FAILED')),
primary key (id_participant,id_task)
);
insert into study values (1, 'MX9345-3', 'John Doe');
insert into study values (2, 'MX9300-2', 'Jane Doe');
insert into participant values (1001, 'Michael', 'Smith', 1,'2018-04-21');
insert into participant values (1002, 'Julia', 'Barnes', 1, '2018-04-10');
insert into task values (51, 'OGTT', 'Make a ...');
insert into task values (52, 'PVT', 'Inspect the ...');
insert into study_task values (1,51);
insert into study_task values (1,52);
insert into study_task values (2,51);
--insert into study_task values (2,66); --would fail since 66 doesn't exist (controlled and enforced by the foreign key)
The PRAGMA on the first line is needed to make SQLite (version 3.6.19 and later, from 2009, I think) enforce foreign keys; without it, SQLite just accepts the foreign key syntax but does no enforcement.
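To check that enforcement is actually on for the current connection, the pragma can also be queried without a value (a small sketch):
PRAGMA foreign_keys = ON;
PRAGMA foreign_keys;   -- returns 1 when foreign key enforcement is on for this connection, 0 otherwise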
I am using Postgres 9.6, and have an issue where I am using jsonb_populate_recordset.
I created a UNIQUE constraint on the table, but I am able to bypass it when performing an INSERT with null values.
Is there a way to force the unique constraint to keep only one record, even if it has null values, and not allow duplicates afterwards?
Here is a quick example:
CREATE TABLE person(
person_id SERIAL PRIMARY KEY,
person_name TEXT,
CONSTRAINT unq_person UNIQUE(person_name)
);
INSERT INTO person (person_name) VALUES ('Frank');
CREATE TABLE locations(
location_id SERIAL PRIMARY KEY,
city TEXT,
state TEXT,
address TEXT,
address_country TEXT,
postal_code TEXT,
person_id INTEGER REFERENCES person(person_id)
ON DELETE CASCADE,
CONSTRAINT unq_location UNIQUE(city, state, address, address_country, postal_code, person_id)
);
In this example, city and address are null (but theoretically, they could all be null, or any combination of record properties).
Every time I run the following query, a new record gets inserted. I don't want more than one of these records.
INSERT INTO locations (city, state, address, address_country, postal_code, person_id)
SELECT city, state, address, address_country, postal_code, person_id
FROM jsonb_populate_recordset(NULL::locations, '[{"city": null, "address": null, "address_country": "USA", "state": "NY", "person_id": 1, "postal_code": "10001"}]'::jsonb)
How can I only allow 1 record, and not multiple when inserting a JSONB object into Postgres?
To your existing query:
insert into locations
(fields)
select values
from etc
add this filter:
where not exists
(select 1
from locations l2
where locations.person_id = l2.person_id
)
It has nothing to do with null values.
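Applied to the insert from the question, the whole statement might look like this (a sketch; the src alias is added here so the filter can reference the incoming row's person_id):
INSERT INTO locations (city, state, address, address_country, postal_code, person_id)
SELECT src.city, src.state, src.address, src.address_country, src.postal_code, src.person_id
FROM jsonb_populate_recordset(NULL::locations, '[{"city": null, "address": null, "address_country": "USA", "state": "NY", "person_id": 1, "postal_code": "10001"}]'::jsonb) AS src
WHERE NOT EXISTS
      (SELECT 1
       FROM locations l2
       WHERE l2.person_id = src.person_id);
Running it a second time then inserts nothing, because a location for person_id 1 already exists.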
I have a table which contains all information about a player. What I'd like to do is take this information and split the data into different tables. For example:
Table PlayerData has all the information. Table Address holds information about the player's residence and table Info holds information such as Name, date of birth and the Address ID (pointing to the Address table).
I can use an INSERT INTO...SELECT to copy data across. However, my issue comes in doing this sequentially, such that the correct ID output from the Address insert goes into the Info table; otherwise there would be a mix-up over which address belongs to which player. How can I get the identity created for an Address insert and use it in the subsequent Info insert?
Speed is not a priority, as this is only done once to initialise the database; the integrity is crucial.
Thanks
If you are learning how to do this, then learn the right way: the OUTPUT clause.
This allows you to put the results into a temporary table (usually a table variable). An example is:
DECLARE @ids TABLE (AddressId int);
INSERT INTO ADDRESS( . . .)
OUTPUT inserted.AddressId INTO @ids
VALUES ( . . . );
INSERT INTO info( . . ., AddressId)
SELECT . . . , i.AddressId
FROM @ids i;
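As a self-contained sketch of the same pattern (the column names here are hypothetical, and it assumes AddressId is an IDENTITY column on Address):
DECLARE @ids TABLE (AddressId int);
INSERT INTO Address (Street, City)            -- hypothetical columns
OUTPUT inserted.AddressId INTO @ids           -- capture the generated identity value
VALUES ('Mike Street', 'Hunt Town');
INSERT INTO Info (PlayerName, DateOfBirth, AddressId)
SELECT 'Mike Hunt', '1980-01-01', i.AddressId -- reuse the captured AddressId
FROM @ids i;
When copying many rows at once with INSERT INTO ... SELECT, you would also OUTPUT some identifying column of the inserted Address row (it has to be a column of the target table) into the table variable, so each generated AddressId can be matched back to the right player.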
You should really be using foreign keys to reference a main table. Now I've kept this simple but if you can follow this then you will be able to edit it for your needs.
Ok, let's make some sample data. This is an equivalent of your current table;
CREATE TABLE PlayerData (PlayerID int, PlayerName varchar (20), DateOfBirth date, Address1 varchar(20), Address2 varchar(20))
INSERT INTO PlayerData (PlayerID, PlayerName, DateOfBirth, Address1, Address2)
VALUES
(1,'Mike Hunt','1980-01-01','Mike Street','Hunt Town')
,(2,'Harry Dong','1970-02-02','Harry Street','Dong Town')
,(3,'Hugh Gass','1960-03-03','Hugh Street','Gass Town')
,(4,'Neil Down','1950-04-04','Neil Street','Down Town')
,(5,'Seymore Butts','1940-05-05','Seymore Street','Butts Town')
I'm going to create one table that holds a unique list of my player ID numbers; this is where I would put any further information that doesn't fit into the other tables. For this example I've just got the one field;
CREATE TABLE PlayerNum (PlayerID int PRIMARY KEY CLUSTERED)
I'm now going to make my new AddressData table. Notice it's got its own identity field but also has a PlayerID that will reference the PlayerNum table;
CREATE TABLE AddressData (AddressID int identity(10,1) PRIMARY KEY CLUSTERED, PlayerID int, Address1 varchar(20), Address2 varchar(20), FOREIGN KEY (PlayerID) REFERENCES PlayerNum(PlayerID))
I'm going to do the same for the table that will contain my player's personal info;
CREATE TABLE PlayerPersonalInfo (InfoID int identity(50,1) PRIMARY KEY CLUSTERED, PlayerID int, PlayerName varchar(20), DateOfBirth date, FOREIGN KEY (PlayerID) REFERENCES PlayerNum(PlayerID))
So I've now got my 3 new tables, which are empty, and one table with data to insert into them.
Let's first populate our PlayerNum table. This needs to be done first because of the foreign key constraints on the other tables;
INSERT INTO PlayerNum (PlayerID)
SELECT PlayerID
FROM PlayerData
Now I've done that, let's insert our data into AddressData. Notice I'm not inserting data into the AddressID field as it's an identity field. It will start from 10 and increment by 1 as per the table definition;
INSERT INTO AddressData (PlayerID, Address1, Address2)
SELECT PlayerID, Address1, Address2
FROM PlayerData
I'm going to do the same with my PlayerPersonalInfo data. The identity for this table will start from 50 and increment by 1;
INSERT INTO PlayerPersonalInfo (PlayerID, PlayerName, DateOfBirth)
SELECT PlayerID, PlayerName, DateOfBirth
FROM PlayerData
You can now get rid of the PlayerData table if you're confident you don't need it.
DROP TABLE PlayerData
You'll now have 3 tables;
PlayerNum
PlayerID
1
2
3
4
5
AddressData
AddressID PlayerID Address1 Address2
10 1 Mike Street Hunt Town
11 2 Harry Street Dong Town
12 3 Hugh Street Gass Town
13 4 Neil Street Down Town
14 5 Seymore Street Butts Town
PlayerPersonalInfo
InfoID PlayerID PlayerName DateOfBirth
50 1 Mike Hunt 1980-01-01
51 2 Harry Dong 1970-02-02
52 3 Hugh Gass 1960-03-03
53 4 Neil Down 1950-04-04
54 5 Seymore Butts 1940-05-05
Notice that the PlayerID in the final two tables can now be linked to PlayerNum in order to retrieve your data.
As we're using foreign keys, you cannot have a player with information in AddressData or PlayerPersonalInfo without a corresponding entry in PlayerNum.
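For example, to put a player's name, date of birth and address back together, you can join the three tables on PlayerID (a quick sketch using the tables above);
SELECT p.PlayerID, i.PlayerName, i.DateOfBirth, a.Address1, a.Address2
FROM PlayerNum p
JOIN PlayerPersonalInfo i ON i.PlayerID = p.PlayerID
JOIN AddressData a ON a.PlayerID = p.PlayerID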
I have two tables in PostgreSQL. One is StateTable, having columns Statename and Statecode, and the other is DistTable, having columns DistName, Statename and Statecode. In DistTable, the columns DistName and Statename are populated. I want to update the Statecode column in DistTable. Kindly help.
create table Statetable (statename varchar, statecode varchar);
create table Disttable (Distname varchar, Statename varchar, statecode varchar);
insert into Statetable
values
('new york','NY'),
('Nebraska','NB'),
('Alaska','AL');
insert into Disttable values
('King','New York', null),
('salt lake','Nebraska', null),
('Hanlulu','AL', null);
Correct me if I am wrong, but I think your model should be:
create table Statetable (statename varchar, statecode varchar);
create table Disttable (distname varchar, statecode varchar);
insert into Statetable
values
('new york','NY'),
('Nebraska','NB'),
('Alaska','AL');
insert into Disttable values
('King','NY'),
('salt lake','NB'),
('Hanlulu','AL');
Furthermore, I think it would be wise to have some identifier on Disttable so that later you can add other address tables.
What you asked for:
(which is the wrong solution)
update disttable
set statecode = st.statecode
from statetable st
where st.statename = disttable.statename;
BUT: the above statement will only update a single row from your sample data because
'new york' is not the same as 'New York'
There is no statename 'AL' in the statetable.
so the join between the two tables will only find (and update) the row with Nebraska.
SQLFiddle demo: http://sqlfiddle.com/#!15/977fe/1
The correct solution:
The correct solution is to normalize your tables. Give the state table a proper primary key (statecode is probably a good choice) and get rid of the useless "table" suffix on your table names.
Create the states table with a primary key:
create table states
(
statecode varchar(2) not null primary key,
statename varchar
);
insert into states (statename, statecode)
values
('new york','NY'),
('Nebraska','NB'),
('Alaska','AL');
The distributions table only references the states table:
create table distributions
(
-- you are missing a primary key here as well.
Distname varchar,
statecode varchar not null references states
);
insert into distributions values
('King','NY'),
('salt lake','NB'),
('Hanlulu','AL');
If you need to display the distname together with the statename, use a join:
select d.distname,
       s.statename,
       s.statecode
from distributions d
  join states s on s.statecode = d.statecode;
If you don't want to type that all the time, create a view with the above statement.
This solution also avoids the problem that the UPDATE with a join doesn't find the corresponding rows because of incorrectly spelled state names or wrong values in the state name column.
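For instance, the view suggested above might look like this (the view name district_states is just an example):
create view district_states as
select d.distname,
       s.statename,
       s.statecode
from distributions d
  join states s on s.statecode = d.statecode;
After that, a plain select * from district_states returns the combined data wherever you need it.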