PostgreSQL insert if does not exist - sql

I have the following query
INSERT INTO address (house_number, street, city_id)
values(11, 'test st', (select id from city where LOWER(city) = LOWER('somecity')))
Is there any way to insert "somecity" into the city table if it does not already exist there, and then return the ID of the inserted row?
I did find this answer that says upsert can be used to achieve this
https://stackoverflow.com/a/31742830/492015
but I can't find an example that inserts if select does not return the row.

Instead of nesting the INSERTs, you could use a CTE
to perform the INSERTs one after the other but as a single statement:
WITH tmp AS (
INSERT INTO test_city (city) VALUES ('somecity')
ON CONFLICT (lower(city)) DO UPDATE SET city = excluded.city
RETURNING id, city
)
INSERT INTO test_address (house_number, street, city_id)
SELECT house_number, street, id
FROM (VALUES (11, 'test st', 'somecity')) val (house_number, street, city)
LEFT JOIN tmp USING (city)
RETURNING *
Using this setup:
DROP TABLE IF EXISTS test_address;
DROP TABLE IF EXISTS test_city;
CREATE TABLE test_address (
house_number int
, street text
, city_id int
);
CREATE TABLE test_city (
id int GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
, city text
);
CREATE UNIQUE INDEX test_city_uniq_idx ON test_city USING btree (lower(city));
INSERT INTO test_city (city) VALUES ('Somecity');
and with the INSERT above, the query
SELECT * FROM test_address;
yields
| house_number | street | city_id |
|--------------+---------+---------|
| 11 | test st | 1 |
and
SELECT * FROM test_city;
yields
| id | city |
|----+----------|
| 1 | somecity |
Note that the CTE replaces
(select id from city where LOWER(city) = LOWER('somecity'))
with an INSERT .. ON CONFLICT .. DO UPDATE statement:
INSERT INTO test_city (city) VALUES ('somecity')
ON CONFLICT (lower(city)) DO UPDATE SET city = excluded.city
RETURNING id, city
I used DO UPDATE instead of DO NOTHING so that RETURNING id, city will always return something. If you use DO NOTHING, then nothing is returned when there is a conflict.
Note however that a consequence of using city = excluded.city is that the original 'Somecity'
gets replaced by 'somecity'. I'm not sure you'll find that behavior acceptable, but unfortunately I haven't figured out how to do nothing when there is a conflict and yet return id and city at the same time.
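One possible way around this is to skip the conflicting insert and then look the row up with a separate SELECT. The sketch below is only an illustration: it uses SQLite through Python's sqlite3 module rather than PostgreSQL, the helper name get_or_create_city is made up, and the table is a simplified copy of the test_city setup above. The existing spelling 'Somecity' is left untouched.

```python
import sqlite3


def make_city_db():
    """Create an in-memory copy of the test_city setup (SQLite flavour, an assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test_city (id INTEGER PRIMARY KEY, city TEXT)")
    # Case-insensitive uniqueness, like the unique index on lower(city) above.
    conn.execute("CREATE UNIQUE INDEX test_city_uniq_idx ON test_city (lower(city))")
    conn.execute("INSERT INTO test_city (city) VALUES ('Somecity')")
    conn.commit()
    return conn


def get_or_create_city(conn, city):
    """Hypothetical helper: return the id for `city`, inserting it only if missing."""
    with conn:  # commit both statements together, roll back on error
        # OR IGNORE skips the insert when the unique index reports a conflict.
        conn.execute("INSERT OR IGNORE INTO test_city (city) VALUES (?)", (city,))
        row = conn.execute(
            "SELECT id FROM test_city WHERE lower(city) = lower(?)", (city,)
        ).fetchone()
    return row[0]


conn = make_city_db()
existing_id = get_or_create_city(conn, "somecity")  # conflicts with 'Somecity'
new_id = get_or_create_city(conn, "Othercity")      # really inserted
print(existing_id, new_id)
print(conn.execute("SELECT id, city FROM test_city ORDER BY id").fetchall())
```

In PostgreSQL the same idea is commonly written as a CTE that runs INSERT ... ON CONFLICT DO NOTHING RETURNING id and is combined via UNION ALL with a SELECT of the existing row; that variant is not shown or tested here.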
Another issue you may have with the above solution is that I used a unique index on lower(city):
CREATE UNIQUE INDEX test_city_uniq_idx ON test_city USING btree (lower(city));
This allows you to use the identical condition in the INSERT statement:
INSERT ... ON CONFLICT (lower(city))
as a substitute for the condition LOWER(city) = LOWER('somecity') which appeared in your SELECT statement. It produces the desired effect, but the trade-off is that now you have a unique index
on (lower(city)).
Regarding the followup question
of how to insert into more than 2 tables:
You can chain together more than one CTE, and the subsequent CTEs can even reference the prior CTEs. For example,
CREATE UNIQUE INDEX city_uniq_idx ON city USING btree (lower(city));
CREATE UNIQUE INDEX state_uniq_idx ON state USING btree (lower(state_code));
WITH tmpcity AS
(
INSERT INTO
city (city)
VALUES
(
'Miami'
)
ON CONFLICT (lower(city)) DO
UPDATE
SET
city = excluded.city RETURNING id, city
)
, tmpstate as
(
INSERT INTO
state (state_code)
VALUES
(
'FL'
)
ON CONFLICT (lower(state_code)) DO
UPDATE
SET
state_code = excluded.state_code RETURNING id, state_code
)
INSERT INTO
address (house_number, street, city_id, state_id)
SELECT
house_number,
street,
tmpcity.id,
tmpstate.id
FROM
(
VALUES
(
12,
'fake st.',
'Miami',
'FL'
)
)
val (house_number, street, city, state_code)
LEFT JOIN
tmpcity USING (city)
LEFT JOIN
tmpstate USING (state_code)
ON CONFLICT (street) DO NOTHING

Related

show the affected rows after update/insert/delete in DB2

Hello, I used to work with Postgres, where I could see the rows affected by a data-modifying statement by using the RETURNING keyword.
Example to show all the columns of the affected row(s):
UPDATE tblName
SET colName='something'
WHERE colName='something'
RETURNING *;
Can anyone tell me how to do the same thing in DB2?
Example for DB2:
CREATE TABLE MYTABLE (
ID INTEGER GENERATED ALWAYS AS IDENTITY,
NAME CHAR(30),
AGE SMALLINT
);
SELECT ID, NAME, AGE
FROM FINAL TABLE
(
INSERT INTO MYTABLE (NAME, AGE)
VALUES('Jon Smith', 35)
)
Result:
ID NAME AGE
1 Jon Smith 35

Adding a LEFT JOIN on an INSERT INTO ... RETURNING

My query inserts a row and returns the newly inserted row:
INSERT INTO
event_comments(date_posted, e_id, created_by, parent_id, body, num_likes, thread_id)
VALUES(1575770277, 1, '9e028aaa-d265-4e27-9528-30858ed8c13d', 9, 'December 7th', 0, 'zRfs2I')
RETURNING comment_id, date_posted, e_id, created_by, parent_id, body, num_likes, thread_id
I want to join created_by with user_id from my users table.
SELECT * from users WHERE user_id = created_by
Is it possible to join that new returning row with another table row?
Consider using a WITH structure to pass the data from the insert to a query that can then be joined.
Example:
-- Setup some initial tables
create table colors (
id SERIAL primary key,
color VARCHAR UNIQUE
);
create table animals (
id SERIAL primary key,
a_id INTEGER references colors(id),
animal VARCHAR UNIQUE
);
-- provide some initial data in colors
insert into colors (color) values ('red'), ('green'), ('blue');
-- Store returned data in inserted_animal for use in next query
with inserted_animal as (
-- Insert a new record into animals
insert into animals (a_id, animal) values (3, 'fish') returning *
) select * from inserted_animal
left join colors on inserted_animal.a_id = colors.id;
-- Output
-- id | a_id | animal | id | color
-- 1 | 3 | fish | 3 | blue
Explanation:
A WITH query (a common table expression) makes the rows returned by an initial query, including rows returned by a RETURNING clause, available as a temporary named result set. The statement that follows can read from that result set and continue working with it, for example by joining it to another table.
You were right, I misunderstood.
This should do it (a PL/pgSQL fragment; DECLARE and RETURNING ... INTO only work inside a function or DO block):
DECLARE mycreated_by event_comments.created_by%TYPE;
INSERT INTO
event_comments(date_posted, e_id, created_by, parent_id, body, num_likes, thread_id)
VALUES(1575770277, 1, '9e028aaa-d265-4e27-9528-30858ed8c13d', 9, 'December 7th', 0, 'zRfs2I')
RETURNING created_by INTO mycreated_by;
SELECT * FROM users WHERE user_id = mycreated_by;

How to detect duplicate records with sub table records

Let's say I'm creating an address book in which the main table contains the basic contact information and a phone number sub table -
Contact
===============
Id [PK]
Name
PhoneNumber
===============
Id [PK]
Contact_Id [FK]
Number
So, a Contact record may have zero or more related records in the PhoneNumber table. There is no constraint on uniqueness of any column other than the primary keys. In fact, this must be true because:
Two contacts having different names may share a phone number, and
Two contacts may have the same name but different phone numbers.
I want to import a large dataset which may contain duplicate records into my database and then filter out the duplicates using SQL. The rules for identifying duplicate records are simple ... they must share the same name and the same number of phone records having the same content.
Of course, this works quite effectively for selecting duplicates from the Contact table but doesn't help me to detect actual duplicates given my rules:
SELECT * FROM Contact
WHERE EXISTS
(SELECT 'x' FROM Contact t2
WHERE t2.Name = Contact.Name AND
t2.Id > Contact.Id);
It seems as if what I want is a logical extension to what I already have, but I must be overlooking it. Any help?
Thanks!
In my question, I created a greatly simplified schema that reflects the real-world problem I'm solving. Przemyslaw's answer is indeed a correct one and did what I was asking both with the sample schema and, when extended, with the real one.
But, after doing some experiments with the real schema and a larger (~10k records) dataset, I found that performance was an issue. I don't claim to be an index guru, but I wasn't able to find a better combination of indices than what was already in the schema.
So, I came up with an alternate solution which fills the same requirements but executes in a small fraction (< 10%) of the time, at least using SQLite3 - my production engine. In hopes that it may assist someone else, I'll offer it as an alternative answer to my question.
DROP TABLE IF EXISTS Contact;
DROP TABLE IF EXISTS PhoneNumber;
CREATE TABLE Contact (
Id INTEGER PRIMARY KEY,
Name TEXT
);
CREATE TABLE PhoneNumber (
Id INTEGER PRIMARY KEY,
Contact_Id INTEGER REFERENCES Contact (Id) ON UPDATE CASCADE ON DELETE CASCADE,
Number TEXT
);
BEGIN TRANSACTION;
INSERT INTO Contact (Id, Name) VALUES
(1, 'John Smith'),
(2, 'John Smith'),
(3, 'John Smith'),
(4, 'Jane Smith'),
(5, 'Bob Smith'),
(6, 'Bob Smith');
INSERT INTO PhoneNumber (Id, Contact_Id, Number) VALUES
(1, 1, '555-1212'),
(2, 1, '222-1515'),
(3, 2, '222-1515'),
(4, 2, '555-1212'),
(5, 3, '111-2525'),
(6, 4, '111-2525');
COMMIT;
SELECT *
FROM Contact c1
WHERE EXISTS (
SELECT 1
FROM Contact c2
WHERE c2.Id > c1.Id
AND c2.Name = c1.Name
AND (SELECT COUNT(*) FROM PhoneNumber WHERE Contact_Id = c2.Id) = (SELECT COUNT(*) FROM PhoneNumber WHERE Contact_Id = c1.Id)
AND (
SELECT COUNT(*)
FROM PhoneNumber p1
WHERE p1.Contact_Id = c2.Id
AND EXISTS (
SELECT 1
FROM PhoneNumber p2
WHERE p2.Contact_Id = c1.Id
AND p2.Number = p1.Number
)
) = (SELECT COUNT(*) FROM PhoneNumber WHERE Contact_Id = c1.Id)
)
;
The results are as expected:
Id Name
====== =============
1 John Smith
5 Bob Smith
Other engines are bound to have differing performance which may be quite acceptable. This solution seems to work quite well with SQLite for this schema.
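Another way to express the same rule, shown here as a rough sketch and not as a benchmarked alternative, is to compute a signature per contact (the name plus the sorted list of phone numbers) and group contacts by it. The function name find_duplicate_contacts is hypothetical, the grouping is done in Python instead of SQL, and the data is the sample data from above loaded through Python's sqlite3 module.

```python
import sqlite3
from collections import defaultdict


def find_duplicate_contacts(conn):
    """Group contact ids that share a name and the exact same list of numbers."""
    numbers = defaultdict(list)
    for contact_id, number in conn.execute(
        "SELECT Contact_Id, Number FROM PhoneNumber"
    ):
        numbers[contact_id].append(number)

    groups = defaultdict(list)
    for contact_id, name in conn.execute("SELECT Id, Name FROM Contact ORDER BY Id"):
        # Signature: the name plus the sorted numbers (empty tuple if none).
        signature = (name, tuple(sorted(numbers[contact_id])))
        groups[signature].append(contact_id)

    # Only signatures shared by two or more contacts are duplicates.
    return [ids for ids in groups.values() if len(ids) > 1]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Contact (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE PhoneNumber (
        Id INTEGER PRIMARY KEY,
        Contact_Id INTEGER REFERENCES Contact (Id),
        Number TEXT
    );
    INSERT INTO Contact (Id, Name) VALUES
        (1, 'John Smith'), (2, 'John Smith'), (3, 'John Smith'),
        (4, 'Jane Smith'), (5, 'Bob Smith'), (6, 'Bob Smith');
    INSERT INTO PhoneNumber (Id, Contact_Id, Number) VALUES
        (1, 1, '555-1212'), (2, 1, '222-1515'), (3, 2, '222-1515'),
        (4, 2, '555-1212'), (5, 3, '111-2525'), (6, 4, '111-2525');
""")
duplicate_groups = find_duplicate_contacts(conn)
print(duplicate_groups)
```

Each inner list holds the ids of contacts that count as the same person; the first id of each group corresponds to the rows returned by the SQL query above (1 and 5).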
The author stated the requirement of "two people being the same person" as:
Having the same name and
Having the same number of phone numbers and all of which are the same.
So the problem is a bit more complex than it seems (or maybe I just overthought it).
Below are sample data and a sample query (an ugly one, I know, but the general idea is there). I tested the query against this data on Oracle 11g R2 and it seems to work correctly:
CREATE TABLE contact (
id NUMBER PRIMARY KEY,
name VARCHAR2(40))
;
CREATE TABLE phone_number (
id NUMBER PRIMARY KEY,
contact_id REFERENCES contact (id),
phone VARCHAR2(10)
);
INSERT INTO contact (id, name) VALUES (1, 'John');
INSERT INTO contact (id, name) VALUES (2, 'John');
INSERT INTO contact (id, name) VALUES (3, 'Peter');
INSERT INTO contact (id, name) VALUES (4, 'Peter');
INSERT INTO contact (id, name) VALUES (5, 'Mike');
INSERT INTO contact (id, name) VALUES (6, 'Mike');
INSERT INTO contact (id, name) VALUES (7, 'Mike');
INSERT INTO phone_number (id, contact_id, phone) VALUES (1, 1, '123'); -- John having number 123
INSERT INTO phone_number (id, contact_id, phone) VALUES (2, 1, '456'); -- John having number 456
INSERT INTO phone_number (id, contact_id, phone) VALUES (3, 2, '123'); -- John the second having number 123
INSERT INTO phone_number (id, contact_id, phone) VALUES (4, 2, '456'); -- John the second having number 456
INSERT INTO phone_number (id, contact_id, phone) VALUES (5, 3, '123'); -- Peter having number 123
INSERT INTO phone_number (id, contact_id, phone) VALUES (6, 3, '456'); -- Peter having number 456
INSERT INTO phone_number (id, contact_id, phone) VALUES (7, 3, '789'); -- Peter having number 789
INSERT INTO phone_number (id, contact_id, phone) VALUES (8, 4, '456'); -- Peter the second having number 456
INSERT INTO phone_number (id, contact_id, phone) VALUES (9, 5, '123'); -- Mike having number 123
INSERT INTO phone_number (id, contact_id, phone) VALUES (10, 5, '456'); -- Mike having number 456
INSERT INTO phone_number (id, contact_id, phone) VALUES (11, 6, '123'); -- Mike the second having number 123
INSERT INTO phone_number (id, contact_id, phone) VALUES (12, 6, '789'); -- Mike the second having number 789
-- Mike the third having no number
COMMIT;
-- does not meet the requirements described in the question - will return Peter when it should not
SELECT DISTINCT c.name
FROM contact c JOIN phone_number pn ON (pn.contact_id = c.id)
GROUP BY c.name, pn.phone
HAVING COUNT(c.id) > 1
;
-- returns correct results for provided test data
-- take all people that have a namesake in contact table and
-- take all this person's phone numbers that this person's namesake also has
-- finally (outer query) check that the number of both persons' phone numbers is the same and
-- the number of the same phone numbers is equal to the number of (either) person's phone numbers
SELECT c1_id, name
FROM (
SELECT c1.id AS c1_id, c1.name, c2.id AS c2_id, COUNT(1) AS cnt
FROM contact c1
JOIN contact c2 ON (c2.id != c1.id AND c2.name = c1.name)
JOIN phone_number pn ON (pn.contact_id = c1.id)
WHERE
EXISTS (SELECT 1
FROM phone_number
WHERE contact_id = c2.id
AND phone = pn.phone)
GROUP BY c1.id, c1.name, c2.id
)
WHERE cnt = (SELECT COUNT(1) FROM phone_number WHERE contact_id = c1_id)
AND (SELECT COUNT(1) FROM phone_number WHERE contact_id = c1_id) = (SELECT COUNT(1) FROM phone_number WHERE contact_id = c2_id)
;
-- cleanup
DROP TABLE phone_number;
DROP TABLE contact;
Check at SQL Fiddle: http://www.sqlfiddle.com/#!4/36cdf/1
Edited
Answer to author's comment: Of course I didn't take that into account... here's a revised solution:
-- new test data
INSERT INTO contact (id, name) VALUES (8, 'Jane');
INSERT INTO contact (id, name) VALUES (9, 'Jane');
SELECT c1_id, name
FROM (
SELECT c1.id AS c1_id, c1.name, c2.id AS c2_id, COUNT(1) AS cnt
FROM contact c1
JOIN contact c2 ON (c2.id != c1.id AND c2.name = c1.name)
LEFT JOIN phone_number pn ON (pn.contact_id = c1.id)
WHERE pn.contact_id IS NULL
OR EXISTS (SELECT 1
FROM phone_number
WHERE contact_id = c2.id
AND phone = pn.phone)
GROUP BY c1.id, c1.name, c2.id
)
WHERE (SELECT COUNT(1) FROM phone_number WHERE contact_id = c1_id) IN (0, cnt)
AND (SELECT COUNT(1) FROM phone_number WHERE contact_id = c1_id) = (SELECT COUNT(1) FROM phone_number WHERE contact_id = c2_id)
;
We now allow contacts with no phone numbers (hence the LEFT JOIN), and the outer query compares the count of the person's phone numbers: it must be either 0 or equal to the count returned by the inner query.
The keyword "having" is your friend. The generic use is:
select field1, field2, count(*) records
from wherever
where whatever
group by field1, field2
having records > 1
Whether or not you can use the alias in the having clause depends on the database engine. You should be able to apply this basic principle to your situation.
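As a small illustration of that principle, here is a sketch using Python's sqlite3 module with made-up contact names (the table and data are assumptions, not the asker's schema). It repeats the aggregate in the HAVING clause instead of the alias, which sidesteps the question of alias support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contact (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO contact (id, name) VALUES
        (1, 'John'), (2, 'John'), (3, 'Peter'),
        (4, 'Mike'), (5, 'Mike'), (6, 'Mike');
""")

# Repeating COUNT(*) in HAVING avoids relying on alias support.
repeated_names = conn.execute("""
    SELECT name, COUNT(*) AS records
    FROM contact
    GROUP BY name
    HAVING COUNT(*) > 1
    ORDER BY name
""").fetchall()
print(repeated_names)
```

This only finds names that occur more than once; matching the phone numbers as well still needs one of the approaches above.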

Insert query issue- with foreign key

I am new to SQL and need some help. :)
I am building a Java application and am stuck on a scenario that involves an insert with a foreign key. Suppose I have two tables, Employee_Type and Employee:
Table Employee_Type
| idType | position |
| ------ | -------- |
| 1      | Manager  |
Table Employee
empId
EmpName
emp_type
FK (emp_type) reference Employee_type(idType)
Employee_Type currently contains a single row: 1, Manager.
I am inserting into the Employee table manually:
INSERT INTO
employee (empId, name, emp_type)
VALUES
(
10, 'prashant', 1
)
In the insert above I supply emp_type, which is a foreign key, by hand. My question: is there any way to fill in the FK value automatically using a SELECT, as in the example below?
INSERT INTO
employee(empId, name, emp_type)
VALUES
(
10, 'prashant',
(
SELECT
idType
FROM
Employee_type,
employee
WHERE
employee.emp_type = employee_type.idtype
)
)
You don't specify your RDBMS and the syntax may therefore differ, but you should be able to restructure the statement to use literal values in an INSERT INTO ... SELECT format:
INSERT INTO employee (empId,name,emp_type)
SELECT
/* Build a SELECT statement which includes the static values as literals */
10 AS empId,
'prashant' AS name,
/* and the idType column */
idType
FROM Employee_type,employee
WHERE employee.emp_type=employee_type.idtype
Note that without anything else in the WHERE clause, the above will insert one row into employee for every row matched by the SELECT statement.
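To get exactly one inserted row, the SELECT can be narrowed to a single Employee_type row. The sketch below assumes the intent is to look the type up by its position name and that position values are unique; neither is stated in the question. It runs against SQLite through Python's sqlite3 module, so the syntax may need adjusting for another RDBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee_Type (idType INTEGER PRIMARY KEY, position TEXT);
    CREATE TABLE Employee (
        empId INTEGER PRIMARY KEY,
        name TEXT,
        emp_type INTEGER REFERENCES Employee_Type (idType)
    );
    INSERT INTO Employee_Type (idType, position) VALUES (1, 'Manager');
""")

# Look the foreign key up by its position name (assumed to be unique),
# so the SELECT yields exactly one row and therefore one inserted employee.
conn.execute(
    """
    INSERT INTO Employee (empId, name, emp_type)
    SELECT ?, ?, idType
    FROM Employee_Type
    WHERE position = ?
    """,
    (10, "prashant", "Manager"),
)
conn.commit()
inserted = conn.execute("SELECT empId, name, emp_type FROM Employee").fetchall()
print(inserted)
```

If no Employee_Type row matches the position, the SELECT returns nothing and no employee is inserted, so the caller may want to check the affected row count.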

Add or delete repeated row

I have an output like this:
id name date school school1
1 john 11/11/2001 nyu ucla
1 john 11/11/2001 ucla nyu
2 paul 11/11/2011 uft mit
2 paul 11/11/2011 mit uft
I would like to achieve this:
id name date school school1
1 john 11/11/2001 nyu ucla
2 paul 11/11/2011 mit uft
I am using direct join as in:
select distinct
a.id, a.name,
b.date,
c.school,
a1.id, a1.name,
b1.date,
c1.school
from table a, table b, table c,table a1, table b1, table c1
where
a.id=b.id
and...
Any ideas?
We will need more information such as what your tables contain and what you are after.
One thing I noticed is that you have a school column and then school1. Normalization rules (repeating groups like this violate first normal form) say you should not duplicate a field and append numbers to it to hold more values, even if you think there will only ever be one or two additional items. Instead, create a second table that associates each user with one or more schools.
I agree with everyone else that both your source table and your desired output are poor design. While you probably can't do anything about your source table, I recommend the following code and output:
Select id, name, date, school from MyTable
union
Select id, name, date, school1 from MyTable;
(repeat as necessary)
This will give you results in the format:
id name date school
1 john 11/11/2001 nyu
1 john 11/11/2001 ucla
2 paul 11/11/2011 mit
2 paul 11/11/2011 uft
(Note: in my version of SQL, union queries automatically select distinct records so the distinct flag isn't needed)
With this format, you could easily count the number of schools per student, number of students per school, etc.
If processing time and/or storage space is a factor here, you could then split this into 2 tables, 1 with the id,name & date, the other with the id & school (basically what JonH just said). But if you're just working up some simple statistics, this should suffice.
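If the two-column layout from the question is still wanted, one possible approach is to keep only one row from each mirrored pair. The sketch below assumes the output rows always come in swapped pairs, as in the question's sample, and loads that sample into a hypothetical MyTable using Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyTable (id INTEGER, name TEXT, date TEXT, school TEXT, school1 TEXT);
    INSERT INTO MyTable VALUES
        (1, 'john', '11/11/2001', 'nyu', 'ucla'),
        (1, 'john', '11/11/2001', 'ucla', 'nyu'),
        (2, 'paul', '11/11/2011', 'uft', 'mit'),
        (2, 'paul', '11/11/2011', 'mit', 'uft');
""")

# Each pair appears twice with the schools swapped; keeping only the row
# whose first school sorts before the second leaves one row per pair.
rows = conn.execute("""
    SELECT id, name, date, school, school1
    FROM MyTable
    WHERE school < school1
    ORDER BY id
""").fetchall()
for row in rows:
    print(row)
```

Rows where both school columns hold the same value would be dropped by this filter, so it only fits data shaped like the sample.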
This problem was just too irresistible, so I took a guess at the data structures we are dealing with. The technology wasn't specified in the question, so this is in Transact-SQL.
create table student
(
id int not null primary key identity,
name nvarchar(100) not null default '',
graduation_date date not null default getdate(),
)
go
create table school
(
id int not null primary key identity,
name nvarchar(100) not null default ''
)
go
create table student_school_asc
(
student_id int not null foreign key references student (id),
school_id int not null foreign key references school (id),
primary key (student_id, school_id)
)
go
insert into student (name, graduation_date) values ('john', '2001-11-11')
insert into student (name, graduation_date) values ('paul', '2011-11-11')
insert into school (name) values ('nyu')
insert into school (name) values ('ucla')
insert into school (name) values ('uft')
insert into school (name) values ('mit')
insert into student_school_asc (student_id, school_id) values (1,1)
insert into student_school_asc (student_id, school_id) values (1,2)
insert into student_school_asc (student_id, school_id) values (2,3)
insert into student_school_asc (student_id, school_id) values (2,4)
select
s.id,
s.name,
s.graduation_date as [date],
(select max(name) from
(select name,
RANK() over (order by name) as rank_num
from school sc
inner join student_school_asc ssa on ssa.school_id = sc.id
where ssa.student_id = s.id) s1 where s1.rank_num = 1) as school,
(select max(name) from
(select name,
RANK() over (order by name) as rank_num
from school sc
inner join student_school_asc ssa on ssa.school_id = sc.id
where ssa.student_id = s.id) s2 where s2.rank_num = 2) as school1
from
student s
Result:
id name date school school1
--- ----- ---------- ------- --------
1 john 2001-11-11 nyu ucla
2 paul 2011-11-11 mit uft