I have two tables in PostgreSQL. One is Statetable, with columns Statename and Statecode; the other is Disttable, with columns DistName, Statename and Statecode. In Disttable, the columns DistName and Statename are populated. I want to update the Statecode column in Disttable from Statetable. Kindly help.
create table Statetable (statename varchar, statecode varchar);
create table Disttable (Distname varchar, Statename varchar, statecode varchar);
insert into Statetable
values
('new york','NY'),
('Nebraska','NB'),
('Alaska','AL');
insert into Disttable values
('King','New York', null),
('salt lake','Nebraska', null),
('Hanlulu','AL', null);
Correct me if I am wrong, but I think your model should be:
create table Statetable (statename varchar, statecode varchar);
create table Disttable (distname varchar, statecode varchar);
insert into Statetable
values
('new york','NY'),
('Nebraska','NB'),
('Alaska','AL');
insert into Disttable values
('King','NY'),
('salt lake','NB'),
('Hanlulu','AL');
Furthermore, I think it would be wise to have some identifier on disttable so that you can add other address tables later.
What you asked for:
(which is the wrong solution)
update disttable
set statecode = st.statecode
from statetable st
where st.statename = disttable.statename;
BUT: the above statement will only update a single row from your sample data because
'new york' is not the same as 'New York'
There is no statename 'AL' in the statetable.
so the join between the two tables will only find (and update) the row with Nebraska.
SQLFiddle demo: http://sqlfiddle.com/#!15/977fe/1
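The mismatch described above can be reproduced locally. This is a minimal sketch, assuming SQLite through Python's `sqlite3` module as a stand-in for PostgreSQL, and a correlated subquery in place of `UPDATE ... FROM`; both compare the state names as exact, case-sensitive strings, so the outcome should be the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table statetable (statename varchar, statecode varchar);
    create table disttable (distname varchar, statename varchar, statecode varchar);
    insert into statetable values
        ('new york', 'NY'), ('Nebraska', 'NB'), ('Alaska', 'AL');
    insert into disttable values
        ('King', 'New York', null),
        ('salt lake', 'Nebraska', null),
        ('Hanlulu', 'AL', null);
""")

# Correlated-subquery form of the update: only rows whose statename
# matches exactly (including letter case) are touched.
cur = conn.execute("""
    update disttable
       set statecode = (select st.statecode
                          from statetable st
                         where st.statename = disttable.statename)
     where exists (select 1
                     from statetable st
                    where st.statename = disttable.statename)
""")
updated_rows = cur.rowcount

# Districts that were left without a state code.
unmatched = [row[0] for row in conn.execute(
    "select distname from disttable where statecode is null order by distname")]
print(updated_rows, unmatched)
```

Only the Nebraska row gets a code; the other two districts stay NULL because their state names do not appear verbatim in statetable.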
The correct solution:
The correct solution is to normalize your tables. Give the state table a proper primary key (statecode is probably a good choice) and get rid of the useless table suffix in your table names.
Create the states table with a primary key:
create table states
(
statecode varchar(2) not null primary key,
statename varchar
);
insert into states (statename, statecode)
values
('new york','NY'),
('Nebraska','NB'),
('Alaska','AL');
The distribution table only references the state table:
create table distributions
(
-- you are missing a primary key here as well.
Distname varchar,
statecode varchar not null references states
);
insert into distributions values
('King','NY'),
('salt lake','NB'),
('Hanlulu','AL');
If you need to display the distname together with the statename, use a join:
select d.distname,
s.statename,
s.statecode
from distributions d
join states s on s.statecode = d.statecode;
If you don't want to type that all the time, create a view with the above statement.
This solution also avoids the problem that the UPDATE with a join doesn't find the corresponding rows because of incorrectly spelled states or wrong values for the state name.
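A sketch of the normalized model with such a view, assuming SQLite via Python's `sqlite3` module rather than PostgreSQL. The view name `distribution_states` and the `'Nowhere'`/`'ZZ'` row are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite enforces FKs only when enabled
conn.executescript("""
    create table states (
        statecode varchar(2) not null primary key,
        statename varchar
    );
    create table distributions (
        distname  varchar,
        statecode varchar not null references states
    );
    insert into states (statename, statecode) values
        ('new york', 'NY'), ('Nebraska', 'NB'), ('Alaska', 'AL');
    insert into distributions values
        ('King', 'NY'), ('salt lake', 'NB'), ('Hanlulu', 'AL');

    -- Hypothetical view name wrapping the join
    create view distribution_states as
    select d.distname, s.statename, s.statecode
      from distributions d
      join states s on s.statecode = d.statecode;
""")

view_rows = conn.execute(
    "select distname, statename, statecode from distribution_states order by distname"
).fetchall()

# A district pointing at an unknown state code is rejected by the foreign key.
try:
    conn.execute("insert into distributions values ('Nowhere', 'ZZ')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Because the state name lives only in states, a misspelled name can no longer end up in the district rows.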
So I've been going through SQL migrations to insert data in a SEQUENTIAL manner specifically from parent to child.
I've inserted data in the parent table. Now I've to store the primary key value of that
specific row (WHERE condition is defined in query for reference " where description = '1234'") in a variable.
And while inserting data to the child table I've to use that primary key value stored in a variable in place of a foreign key column("country_code_id") of the child table.
I'm using Postgresql
CREATE TABLE Countries
(
id SERIAL,
description VARCHAR(100),
CONSTRAINT coutry_pkey PRIMARY KEY (id)
);
CREATE TABLE Cities
(
country_code_id int ,
city_id int,
description VARCHAR(100),
CONSTRAINT cities_pkey PRIMARY KEY (city_id),
CONSTRAINT fk_cities_countries FOREIGN KEY (country_code_id) REFERENCES Countries (id)
);
INSERT INTO COUNTRIES (description) VALUES('asdf');
#countrid = SELECT id FROM COUNTRIES WHERE description = 'asdf';
INSERT INTO cities VALUES (countrid, 1 , 'abc');
SQL does not have variables. The normal way to do this is to use INSERT ... RETURNING:
INSERT INTO countries (description) VALUES ('1234')
RETURNING id;
This will return the automatically generated primary key. You store that in a variable on the client side and run a second statement:
INSERT INTO cities (country_code_id, city_id, description)
VALUES (4711, 1, 'abc');
where 4711 is the value returned from the first statement. To avoid hard-coding the value, pass it as a parameter to a prepared statement, which can also improve performance when the statement is executed repeatedly.
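The two-step, client-side pattern might look like the sketch below. It assumes SQLite via Python's `sqlite3` module, so `cursor.lastrowid` stands in for PostgreSQL's `RETURNING id`, and `INTEGER PRIMARY KEY AUTOINCREMENT` stands in for `SERIAL`; with a PostgreSQL driver you would instead fetch the row returned by the first statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (
        id INTEGER PRIMARY KEY AUTOINCREMENT,  -- stand-in for SERIAL
        description VARCHAR(100)
    );
    CREATE TABLE cities (
        country_code_id INT REFERENCES countries (id),
        city_id INT PRIMARY KEY,
        description VARCHAR(100)
    );
""")

# Step 1: insert the parent row and keep the generated key on the client.
cur = conn.execute("INSERT INTO countries (description) VALUES (?)", ("1234",))
country_id = cur.lastrowid

# Step 2: pass the key as a bound parameter instead of hard-coding it.
conn.execute(
    "INSERT INTO cities (country_code_id, city_id, description) VALUES (?, ?, ?)",
    (country_id, 1, "abc"),
)
conn.commit()

city_row = conn.execute(
    "SELECT country_code_id, city_id, description FROM cities"
).fetchone()
```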
An alternative, more complicated, solution is to run both statements in a single statement using a common table expression:
WITH country_ids AS (
INSERT INTO countries (description) VALUES ('1234')
RETURNING id
)
INSERT INTO cities (country_code_id, city_id, description)
SELECT id, 1, 'abc'
FROM country_ids;
I'm running the following SQLite workaround to add a primary key to a table that did not have one. I am getting a datatype mismatch on
INSERT INTO cities
SELECT id, name FROM old_cities;
However, the fields have exactly the same type. Is it possible that this happens because I am running the queries from DB Browser for SQLite?
CREATE table cities (
id INTEGER NOT NULL,
name TEXT NOT NULL
);
INSERT INTO cities (id, name)
VALUES ('pan', 'doul');
END TRANSACTION;
PRAGMA foreign_keys=off;
BEGIN TRANSACTION;
ALTER TABLE cities RENAME TO old_cities;
--CREATE TABLE cities (
-- id INTEGER NOT NULL PRIMARY KEY,
-- name TEXT NOT NULL
--);
CREATE TABLE cities (
id INTEGER NOT NULL,
name TEXT NOT NULL,
PRIMARY KEY (id)
);
SELECT * FROM old_cities;
INSERT INTO cities
SELECT id, name FROM old_cities;
DROP TABLE old_cities;
COMMIT;
You have defined the column id of the table cities to be INTEGER, but with this:
INSERT INTO cities (id, name) VALUES ('pan', 'doul');
you insert the string 'pan' as id.
SQLite does not do any type checking in this case and allows it.
Did you mean to insert 2 rows each having the names 'pan' and 'doul'?
If so, you should do something like:
INSERT INTO cities (id, name) VALUES (1, 'pan'), (2, 'doul');
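Later you rename the table cities to old_cities and recreate cities, but with a different definition: id is now INTEGER and the PRIMARY KEY.
In an ordinary (non-STRICT) table, this is the one definition that forces type checking in SQLite, because an INTEGER PRIMARY KEY column becomes an alias for the rowid and can only hold integers.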
So, when you try to insert the rows from old_cities to cities you get an error because 'pan' is not allowed in the column id as it is defined now.
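The behaviour can be reproduced outside DB Browser. This is a small sketch using Python's `sqlite3` module, with the table definitions from the question; it first checks what was actually stored in the plain INTEGER column and then attempts the copy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE old_cities (id INTEGER NOT NULL, name TEXT NOT NULL);
    -- Accepted: a plain INTEGER column keeps the text 'pan' as text.
    INSERT INTO old_cities (id, name) VALUES ('pan', 'doul');

    CREATE TABLE cities (
        id INTEGER NOT NULL,
        name TEXT NOT NULL,
        PRIMARY KEY (id)
    );
""")

# What storage class did SQLite use for the bogus id?
stored_type = conn.execute("SELECT typeof(id) FROM old_cities").fetchone()[0]

# Copying into the INTEGER PRIMARY KEY column is where the error surfaces.
try:
    conn.execute("INSERT INTO cities SELECT id, name FROM old_cities")
    error_message = None
except sqlite3.Error as exc:
    error_message = str(exc)

print(stored_type, error_message)
```

So the tool is not at fault: the bad value was already in old_cities, and the new table simply refuses it.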
In Postgres, is there a way to atomically insert a row into a table, where one column references another table, and we look up to see if the desired row exists in the referenced table and inserts it as well if it is not?
For example, say we have a US states table and a cities table which references the states table:
CREATE TABLE states (
state_id serial primary key,
name text
);
CREATE TABLE cities (
city_id serial,
name text,
state_id int references states(state_id)
);
When I want to add the city of Austin, Texas, I want to be able to see whether Texas exists in the states table, and if so use its state_id in the new row I'm inserting in the cities table. If Texas doesn't exist in the states table, I want to create it and then use its id in the cities table.
I tried this query, but I got an error saying
ERROR: WITH clause containing a data-modifying statement must be at the top level
LINE 2: WITH inserted AS (
^
WITH state_id AS (
WITH inserted AS (
INSERT INTO states(name)
VALUES ('Texas')
ON CONFLICT DO NOTHING
RETURNING state_id),
already_there AS (
SELECT state_id FROM states
WHERE name='Texas')
SELECT * FROM inserted
UNION
SELECT * FROM already_there)
INSERT INTO cities(name, state_id)
VALUES
('Austin', (SELECT state_id FROM state_id));
Am I overlooking a simple solution?
Here is one option:
with inserted as (
insert into states(name) values ('Texas')
on conflict do nothing
returning state_id
)
insert into cities(name, state_id)
values (
'Dallas',
coalesce(
(select state_id from inserted),
(select state_id from states where name = 'Texas')
)
);
The idea is to attempt to insert in a CTE, and then, in the main insert, check if a value was inserted, else select it.
For this to work properly, you need a unique constraint on states(name):
create table states (
state_id serial primary key,
name text unique
);
Demo on DB Fiddle
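For comparison, here is the same "insert the state if missing, then use its id" idea as a sketch on SQLite via Python's `sqlite3` module. SQLite has no data-modifying CTEs, so this version, an assumption of mine rather than the query above, runs two statements inside one transaction; the helper name `add_city` is hypothetical. It relies on the same unique constraint on states(name).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE states (
        state_id INTEGER PRIMARY KEY,
        name TEXT UNIQUE
    );
    CREATE TABLE cities (
        city_id INTEGER PRIMARY KEY,
        name TEXT,
        state_id INT REFERENCES states (state_id)
    );
""")

def add_city(conn, city, state):
    """Insert a city, creating its state first if it is missing."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO states (name) VALUES (?) ON CONFLICT (name) DO NOTHING",
            (state,),
        )
        # Whether the state was just inserted or already existed, look it up by name.
        conn.execute(
            "INSERT INTO cities (name, state_id) "
            "SELECT ?, state_id FROM states WHERE name = ?",
            (city, state),
        )

add_city(conn, "Austin", "Texas")
add_city(conn, "Dallas", "Texas")

state_count = conn.execute("SELECT count(*) FROM states").fetchone()[0]
city_states = conn.execute(
    "SELECT c.name, s.name FROM cities c JOIN states s USING (state_id) ORDER BY c.name"
).fetchall()
```

Calling the helper twice for the same state leaves a single Texas row, with both cities pointing at it.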
You can force the insert statement to return a value:
WITH inserted AS (
INSERT INTO states (name)
VALUES ('Texas')
ON CONFLICT (name) DO UPDATE SET name = EXCLUDED.NAME
RETURNING state_id
)
. . .
The DO UPDATE SET forces the INSERT to return something.
I notice that you don't have a unique constraint, so you also need that:
ALTER TABLE states ADD CONSTRAINT unq_state_name
UNIQUE (name);
Otherwise the ON CONFLICT doesn't have anything to work with.
I am trying to create a star schema and am currently working on the dimension tables. I want to copy several columns from one table to another but at the same time I want to make the result values unique by 1 of the columns.
These are the tables I am using:
DWH_PRICE_PAID_RECORDS
CREATE TABLE "DWH_PRICE_PAID_RECORDS" ("TRANSACTION_ID" VARCHAR(50) NOT NULL, "PRICE" INTEGER, "DATE_OF_TRANSFER" DATE NOT NULL, "PROPERTY_TYPE" CHAR(1), "OLD_NEW" CHAR(1), "DURATION" CHAR(1), "TOWN_CITY" VARCHAR(50), "DISTRICT" VARCHAR(50), "COUNTY" VARCHAR(50), "PPDCATEGORY_TYPE" CHAR(1), "RECORD_TYPE" CHAR(1));
ALTER TABLE "DWH_PRICE_PAID_RECORDS" ADD CONSTRAINT "PK3" PRIMARY KEY ("TRANSACTION_ID");
and DIM_REGION
CREATE TABLE "DIM_REGION" ("REGION_ID" INTEGER generated always as identity (start with 1 increment by 1), "TRANSACTION_ID" VARCHAR(50), "TOWN" VARCHAR(50), "COUNTY" VARCHAR(50), "DISTRICT" VARCHAR(50), "LATITUDE" VARCHAR(50), "LONGITUDE" VARCHAR(50), "COUNTRY_STRING" VARCHAR(50));
ALTER TABLE "DIM_REGION" ADD CONSTRAINT "PK8" PRIMARY KEY ("REGION_ID");
My first attempt was to use "select distinct" but that only removes all duplicates of ALL columns combined. I want to have a region dimension and the "town" should be the identifier to match DIM_REGION with the fact table on the data mart that I will create later (called DM_PRICE_PAID_RECORDS).
The DWH_PRICE_PAID_RECORDS table has around 10k records but only 938 unique towns. I want to have those 938 towns in the dim_region as ID along with other columns such as county, district etc.
This is what works but then of course everything else is NULL but town:
INSERT INTO DIM_REGION (TOWN) SELECT (town_city) from DWH_PRICE_PAID_RECORDS GROUP BY town_city;
So I thought I only have to add additional columns
INSERT INTO DIM_REGION (TOWN, County, District) SELECT town_city, county, district from DWH_PRICE_PAID_RECORDS GROUP BY town_city;
but when I do that I get this error message (the error message is german and I had to translate, sorry):
ERROR 42Y36 Column reference: "DWH_PRICE_PAID_RECORDS.COUNTY" is invalid or part of an invalid statement. When using SELECT and GROUP BY, the selected columns and expressions must be valid grouping or aggregate expressions.
Can you help me or do you have another idea how else I could get the result I seek?
Thank you very much!
If the other 2 columns don't matter, you can do this:
INSERT INTO DIM_REGION (TOWN, County, District)
SELECT town_city, MAX(county), MAX(district)
FROM DWH_PRICE_PAID_RECORDS
GROUP BY town_city
This will get you only 1 row for each town.
You are so close!
INSERT INTO DIM_REGION (TOWN, County, District) SELECT town_city, county, district from DWH_PRICE_PAID_RECORDS GROUP BY town_city, county, district;
That should do the job. When using a group by, everything in the SELECT list that isn't an aggregate has to appear in the GROUP BY clause.
As an aside, does TRANSACTION_ID really belong in the dimension table?
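The two answers above do not return the same rows, which matters if a town occurs with more than one county or district. This is a small sketch of the difference, assuming SQLite via Python's `sqlite3` module; the table `records` and its rows are invented sample data, not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE records (town_city TEXT, county TEXT, district TEXT);
    -- Hypothetical sample rows: 'Leeds' appears with two different districts.
    INSERT INTO records VALUES
        ('Leeds', 'West Yorkshire', 'Leeds'),
        ('Leeds', 'West Yorkshire', 'Wakefield'),
        ('York',  'North Yorkshire', 'York'),
        ('York',  'North Yorkshire', 'York');
""")

# Aggregating the other columns: exactly one row per town.
one_per_town = conn.execute("""
    SELECT town_city, MAX(county), MAX(district)
      FROM records
     GROUP BY town_city
     ORDER BY town_city
""").fetchall()

# Grouping by every column: one row per distinct combination.
one_per_combination = conn.execute("""
    SELECT town_city, county, district
      FROM records
     GROUP BY town_city, county, district
     ORDER BY town_city, district
""").fetchall()

print(len(one_per_town), len(one_per_combination))
```

If town must be unique in DIM_REGION, the aggregate form is the one that guarantees it; grouping by all three columns only removes exact duplicates.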
How to add multiple values in a single column of table in SQL? My table looks like this:
Create table emp
(
id number(5),
name varchar(25),
phone varchar(25)
);
Now I want to add values and multiple phones in the phone column. How to do that? I tried using
insert into emp values (id, name, phone)
values (1, lee, (23455, 67543));
but this is not working
Use two insert statements instead
insert into emp (id, name, phone) values (1,'lee','23455');
insert into emp (id, name, phone) values (1,'lee','67543');
or, if you want to store both values in a single row:
insert into emp (id, name, phone) values (1,'lee','23455,67543');
Here the table is not normalised. You need to store the phone numbers either in a separate table or in two different columns of the same table.
Try changing your table design like this.
EMP table
CREATE TABLE emp
(
emp_id INT IDENTITY(1, 1) PRIMARY KEY,
name VARCHAR(25)
);
PhoneNumber Table
CREATE TABLE PhoneNumber
(
phoneno_id INT IDENTITY(1, 1),
emp_id INT,
Phone_Number int,
Cell_Number Int,
FOREIGN KEY (emp_id) REFERENCES emp(emp_id)
)
Note : Auto increment syntax may differ based on the database you are using.
The properly normalized way to do this in a relational database is to use a separate table for your phones (this is SQL Server syntax; it may differ slightly depending on the database system you're using):
Create table emp
(
id INT PRIMARY KEY,
name varchar(25)
)
create table phone
(
phoneId INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
empid INT NOT NULL,
phone varchar(25) NOT NULL,
CONSTRAINT FK_Phone_Emp
FOREIGN KEY(empid) REFERENCES dbo.emp(id)
);
and then you insert the employee data into emp :
insert into emp(id, name)
values (1, 'lee');
and the phones into phone:
insert into phone(empid, phone) values(1, '23455');
insert into phone(empid, phone) values(1, '67543');
With this setup, you have proper normalization for the database, and you can store as many phones as you like, for each employee.
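To read the phones back together with the employee, join the two tables. This is a minimal sketch assuming SQLite via Python's `sqlite3` module, where `INTEGER PRIMARY KEY` stands in for SQL Server's `IDENTITY(1,1)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE emp (id INTEGER PRIMARY KEY, name VARCHAR(25));
    CREATE TABLE phone (
        phoneId INTEGER PRIMARY KEY,       -- auto-assigned, stand-in for IDENTITY
        empid   INT NOT NULL REFERENCES emp (id),
        phone   VARCHAR(25) NOT NULL
    );
    INSERT INTO emp (id, name) VALUES (1, 'lee');
    INSERT INTO phone (empid, phone) VALUES (1, '23455');
    INSERT INTO phone (empid, phone) VALUES (1, '67543');
""")

# One result row per phone number, each carrying the employee's name.
phones = conn.execute("""
    SELECT e.name, p.phone
      FROM emp e
      JOIN phone p ON p.empid = e.id
     ORDER BY p.phoneId
""").fetchall()
```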