Insert into a table from two other tables - Oracle - SQL

I have 3 tables. The first two tables hold data, and I want to insert data from those two into the third.
TABLE A :
CREATE TABLE z_ostan ( id NUMBER PRIMARY KEY,
name VARCHAR2(30) NOT NULL CHECK (upper(name)=name)
);
TABLE B:
CREATE TABLE z_shahr ( id NUMBER PRIMARY KEY,
name VARCHAR2(30) NOT NULL CHECK (upper(name)=name),
ref_ostan NUMBER,
CONSTRAINT fk_ref_ostan FOREIGN KEY (ref_ostan) REFERENCES z_ostan(id)
);
TABLE C:
CREATE TABLE z_shar2 ( shahr_name VARCHAR2(30),
ostan_name VARCHAR2(30),
payetakht number);
I want to insert data from TABLE A and B into C under these conditions:
shahr_name in TABLE C comes from z_shahr.name in TABLE B
ostan_name in TABLE C comes from z_ostan.name in TABLE A
and payetakht has two cases:
NULL by default, otherwise
if ostan_name is 'somthing' then 1
I can't work out how to write an INSERT into TABLE C that satisfies these conditions.

Looks like a join:
INSERT INTO z_shar2 (shahr_name, ostan_name, payetakht)
SELECT b.name,
a.name,
CASE WHEN a.name = 'somthing' THEN 1 ELSE NULL END payetakht
FROM z_shahr b JOIN z_ostan a ON a.id = b.ref_ostan
As for the payetakht column's value: I initially thought you actually meant "when a.name is not null", but that can't be, because the name column is declared NOT NULL. So it is probably really the literal (misspelled) 'somthing'.
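To try the INSERT ... SELECT above without an Oracle instance, here is a minimal sketch that runs the same statement through Python's built-in sqlite3 module. The sample rows are made up, and the column types are SQLite stand-ins for NUMBER and VARCHAR2. One thing it makes visible: the CHECK (upper(name)=name) constraint only admits upper-case names, so the sketch compares against 'SOMTHING' rather than the lower-case literal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE z_ostan (id INTEGER PRIMARY KEY,
                          name TEXT NOT NULL CHECK (upper(name) = name));
    CREATE TABLE z_shahr (id INTEGER PRIMARY KEY,
                          name TEXT NOT NULL CHECK (upper(name) = name),
                          ref_ostan INTEGER REFERENCES z_ostan(id));
    CREATE TABLE z_shar2 (shahr_name TEXT, ostan_name TEXT, payetakht INTEGER);

    -- made-up sample rows, upper case because of the CHECK constraint
    INSERT INTO z_ostan VALUES (1, 'SOMTHING'), (2, 'OTHER');
    INSERT INTO z_shahr VALUES (10, 'CITY_A', 1), (11, 'CITY_B', 2);
""")

# Same shape as the answer: join B to A, derive payetakht with CASE.
# A CASE without ELSE yields NULL, which is the requested default.
conn.execute("""
    INSERT INTO z_shar2 (shahr_name, ostan_name, payetakht)
    SELECT b.name,
           a.name,
           CASE WHEN a.name = 'SOMTHING' THEN 1 END
    FROM z_shahr b
    JOIN z_ostan a ON a.id = b.ref_ostan
""")

rows = conn.execute(
    "SELECT shahr_name, ostan_name, payetakht FROM z_shar2 ORDER BY shahr_name"
).fetchall()
print(rows)
```

Cities whose ostan does not match get NULL (None in Python) in payetakht.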

Related

Counting Occurrences from One Table and Inserting into Another but Getting an Error

I am trying to count the amount of times each school shows up in a set of records and record that value in a new table with its corresponding school name and ID.
The tables being used are similar to the following:
Table 1 -> school_probs
school_code (pk, bigint) | school (text) | probability
1 | school1 | Irrelevant info
2 | school2 | ii
3 | school3 | ii
Table 2 -> simulated_records
record_id (pk, bigint) | school (text) | grade
1 | school1 | ii
2 | school2 | ii
3 | school1 | ii
4 | school3 | ii
I'm expecting to get an output like
school_code (fk, bigint) | school (text) | schoolCount (integer)
1 | school1 | 2
2 | school2 | 1
3 | school3 | 1
and I was able to achieve this with the following code:
SELECT COUNT (simulated_records.school) AS schoolCount, school_probs.school_code, school_probs.school
FROM simulated_records, school_probs WHERE school_probs.school = simulated_records.school
GROUP BY simulated_records.school, school_probs.school_code, school_probs.school;
However, I need the result to be saved in a table. But when I try
CREATE TABLE studentCount (
studentNum integer, school_code bigint, school text,
CONSTRAINT fk_sC FOREIGN KEY (school_code) REFERENCES school_probs (school_code)
)
SELECT COUNT (simulated_records.school) AS schoolCount, school_probs.school_code, school_probs.school
FROM simulated_records, school_probs WHERE school_probs.school = simulated_records.school
GROUP BY simulated_records.school, school_probs.school_code, school_probs.school;
I get "ERROR: syntax error at or near "SELECT" LINE 5: SELECT COUNT (simulated_records.school) AS schoolCount, . . . SQL state: 42601 "
Line 5 reads:
SELECT COUNT (simulated_records.school) AS schoolCount, school_probs.school_code, school_probs.school
Can anyone point me in the right direction? I plan on creating a function out of this.
The code to create the tables:
DROP TABLE IF EXISTS school_probs;
CREATE TABLE school_probs
(
school_code bigint NOT NULL PRIMARY KEY,
school text NOT NULL,
probs numeric[] NOT NULL
);
INSERT INTO school_probs VALUES
(1,'school1','{0.05,0.08,0.18,0.3,0.11,0.28}'),
(2,'school2','{0.06,0.1,0.295,0.36,0.12,0.065}'),
(3,'school3','{0.05,0.11,0.35,0.32,0.12,0.05}');
DROP TABLE IF EXISTS simulated_records;
CREATE TABLE simulated_records
(
record_id bigint NOT NULL PRIMARY KEY,
school text NOT NULL,
grade text NOT NULL
);
INSERT INTO simulated_records VALUES
(1,'school1','-'),
(2,'school2','-'),
(3,'school1','-'),
(4, 'school3', '-');
Look up the JOIN syntax and don't use , in the FROM clause. Table aliases could also help.
And the syntax to create a table from a query is CREATE TABLE <table name> AS SELECT .... There are no column or constraint definitions. You can use explicit casts in the query to determine column types. Constraint definitions have to be added later with ALTER TABLE.
CREATE TABLE studentcount
AS
SELECT count(sr.school)::integer studentnum,
sp.school_code::bigint,
sp.school::text
FROM simulated_records sr
INNER JOIN school_probs sp
ON sp.school = sr.school
GROUP BY sp.school,
sp.school_code;
ALTER TABLE studentcount
ADD CONSTRAINT fk_sc
FOREIGN KEY (school_code)
REFERENCES school_probs
(school_code);
Alternatively you can first issue a "normal" CREATE TABLE with column and constraint definitions and then insert the rows from the query.
CREATE TABLE studentcount
(studentnum integer,
school_code bigint,
school text,
CONSTRAINT fk_sc
FOREIGN KEY (school_code)
REFERENCES school_probs
(school_code));
INSERT INTO studentcount
(studentnum,
school_code,
school)
SELECT count(sr.school),
sp.school_code,
sp.school
FROM simulated_records sr
INNER JOIN school_probs sp
ON sp.school = sr.school
GROUP BY sp.school,
sp.school_code;
But be aware that you're creating data redundancy either way. That can lead to inconsistencies and should be avoided. If you need the counts not just once but again later, reflecting the then-current data, consider a view.
CREATE VIEW studentcount
AS
SELECT count(sr.school)::integer studentnum,
sp.school_code::bigint,
sp.school::text
FROM simulated_records sr
INNER JOIN school_probs sp
ON sp.school = sr.school
GROUP BY sp.school,
sp.school_code;
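The difference between the stored table and the view can be seen in a small sketch. It uses Python's sqlite3 module instead of PostgreSQL, so the `::integer` casts and the `probs`/`grade` columns are left out, and the object names `studentcount_snapshot` and `studentcount_view` are made up for this illustration. The sample rows are the ones from the question, plus one extra record inserted afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE school_probs (school_code INTEGER PRIMARY KEY, school TEXT NOT NULL);
    CREATE TABLE simulated_records (record_id INTEGER PRIMARY KEY, school TEXT NOT NULL);
    INSERT INTO school_probs VALUES (1, 'school1'), (2, 'school2'), (3, 'school3');
    INSERT INTO simulated_records VALUES
        (1, 'school1'), (2, 'school2'), (3, 'school1'), (4, 'school3');

    -- Stored copy: computed once, at creation time.
    CREATE TABLE studentcount_snapshot AS
    SELECT count(sr.school) AS studentnum, sp.school_code, sp.school
    FROM simulated_records sr
    JOIN school_probs sp ON sp.school = sr.school
    GROUP BY sp.school, sp.school_code;

    -- View: recomputed on every read.
    CREATE VIEW studentcount_view AS
    SELECT count(sr.school) AS studentnum, sp.school_code, sp.school
    FROM simulated_records sr
    JOIN school_probs sp ON sp.school = sr.school
    GROUP BY sp.school, sp.school_code;

    -- A new record arrives after both objects exist.
    INSERT INTO simulated_records VALUES (5, 'school2');
""")

def counts(name):
    """Return (school, studentnum) pairs from the given table or view."""
    sql = f"SELECT school, studentnum FROM {name} ORDER BY school_code"
    return conn.execute(sql).fetchall()

snapshot = counts("studentcount_snapshot")
live = counts("studentcount_view")
print(snapshot)
print(live)
```

The snapshot keeps the old count for school2, while the view reflects the new record.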

Join two tables so foreign key column data (integer) changes to text data from parent table (concerned column in parent table is not primary key)

I am a beginner with SQLite, and I am still a little unfamiliar with joining tables (and the limitations therein). I would like to know how to join 2 tables so that the column data in Table B (several columns) changes to reflect data from Table A.
CREATE TABLE "A" (
"person_id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
"person_name" TEXT NOT NULL,
"email" TEXT NOT NULL
);
CREATE TABLE "B" (
"company_id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
"company_name" TEXT NOT NULL,
"contact_one" INTEGER NOT NULL,
"contact_two" INTEGER,
"contact_three" INTEGER,
FOREIGN KEY("contact_one") REFERENCES "A"("person_id") ON DELETE SET NULL,
FOREIGN KEY("contact_two") REFERENCES "A"("person_id") ON DELETE SET NULL,
FOREIGN KEY("contact_three") REFERENCES "A"("person_id") ON DELETE SET NULL
);
How do I write a query so that the resulting table shows the columns "company_name", "contact_one", "contact_two" and "contact_three", BUT with the contact columns showing the contact's name rather than the integer (as it appears in Table B)?
See below an image representing Tables A and B and the desired output of SQLite query:
When I do a left join, the query succeeds only for a single contact column.
SELECT B.company_name, A.person_name
FROM B
LEFT JOIN A ON B.contact_one = A.person_id
I tried to add an "OR" to the query (I know "AND" will not work), but I get an "ambiguous column name: A.person_name" error when I try to run the query:
SELECT B.company_name, A.person_name
FROM B
LEFT JOIN A ON B.contact_one = A.person_id
OR LEFT JOIN A ON B.contact_two = A.person_id
OR LEFT JOIN A ON B.contact_three = A.person_id
How do I write the query so that I can get contact_two and contact_three also in the resulting table, with all three contacts' names displayed?
Any guidance will be greatly appreciated!
You need to join the table A 3 times like this:
SELECT B.company_name,
a1.person_name name1,
a2.person_name name2,
a3.person_name name3
FROM B
LEFT JOIN A a1 ON B.contact_one = a1.person_id
LEFT JOIN A a2 ON B.contact_two = a2.person_id
LEFT JOIN A a3 ON B.contact_three = a3.person_id
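A quick way to check this is to run it against a few rows. The sketch below uses Python's sqlite3 module with made-up people and companies; the email column and the foreign key clauses are omitted for brevity, and the second company has only one contact to show what the LEFT JOINs do with NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (person_id INTEGER PRIMARY KEY, person_name TEXT NOT NULL);
    CREATE TABLE B (company_id INTEGER PRIMARY KEY,
                    company_name TEXT NOT NULL,
                    contact_one INTEGER,
                    contact_two INTEGER,
                    contact_three INTEGER);
    -- made-up sample rows
    INSERT INTO A VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO B VALUES (1, 'Acme', 1, 2, 3), (2, 'Bolt', 2, NULL, NULL);
""")

# Each alias (a1, a2, a3) is an independent lookup into the same table A.
rows = conn.execute("""
    SELECT B.company_name,
           a1.person_name AS name1,
           a2.person_name AS name2,
           a3.person_name AS name3
    FROM B
    LEFT JOIN A a1 ON B.contact_one = a1.person_id
    LEFT JOIN A a2 ON B.contact_two = a2.person_id
    LEFT JOIN A a3 ON B.contact_three = a3.person_id
    ORDER BY B.company_id
""").fetchall()
print(rows)
```

Because these are LEFT JOINs, a company with a missing contact still appears, with NULL (None) in that name column.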

Multi-table, multi-row SQL select

How would I list all of the info about a freelancer given the schema below? Including niche, language, market, etc. The issue I am having is that every freelancer can have multiple entries for each table. So, how would I do this? Is it even possible using SQL or would I need to use my primary language (golang) for this?
CREATE TABLE freelancer (
freelancer_id SERIAL PRIMARY KEY,
ip inet NOT NULL,
username VARCHAR(20) NOT NULL,
password VARCHAR(100) NOT NULL,
email citext NOT NULL UNIQUE,
email_verified int NOT NULL,
fname VARCHAR(20) NOT NULL,
lname VARCHAR(20) NOT NULL,
phone_number VARCHAR(30) NOT NULL,
address VARCHAR(50) NOT NULL,
city VARCHAR(30) NOT NULL,
state VARCHAR(30) NOT NULL,
zip int NOT NULL,
country VARCHAR(30) NOT NULL
);
CREATE TABLE market (
market_id SERIAL PRIMARY KEY,
market_name VARCHAR(30) NOT NULL
);
CREATE TABLE niche (
niche_id SERIAL PRIMARY KEY,
niche_name VARCHAR(30) NOT NULL
);
CREATE TABLE medium (
medium_id SERIAL PRIMARY KEY,
medium_name VARCHAR(30) NOT NULL
);
CREATE TABLE format (
format_id SERIAL PRIMARY KEY,
format_name VARCHAR(30) NOT NULL
);
CREATE TABLE lang (
lang_id SERIAL PRIMARY KEY,
lang_name VARCHAR(30) NOT NULL
);
CREATE TABLE freelancer_by_niche (
id SERIAL PRIMARY KEY,
niche_id int NOT NULL REFERENCES niche (niche_id),
freelancer_id int NOT NULL REFERENCES freelancer (freelancer_id)
);
CREATE TABLE freelancer_by_medium (
id SERIAL PRIMARY KEY,
medium_id int NOT NULL REFERENCES medium (medium_id),
freelancer_id int NOT NULL REFERENCES freelancer (freelancer_id)
);
CREATE TABLE freelancer_by_market (
id SERIAL PRIMARY KEY,
market_id int NOT NULL REFERENCES market (market_id),
freelancer_id int NOT NULL REFERENCES freelancer (freelancer_id)
);
CREATE TABLE freelancer_by_format (
id SERIAL PRIMARY KEY,
format_id int NOT NULL REFERENCES format (format_id),
freelancer_id int NOT NULL REFERENCES freelancer (freelancer_id)
);
CREATE TABLE freelancer_by_lang (
id SERIAL PRIMARY KEY,
lang_id int NOT NULL REFERENCES lang (lang_id),
freelancer_id int NOT NULL REFERENCES freelancer (freelancer_id)
);
SELECT *
FROM freelancer
INNER JOIN freelancer_by_niche USING (freelancer_id)
INNER JOIN niche USING (niche_id)
INNER JOIN freelancer_by_medium USING (freelancer_id)
INNER JOIN medium USING (medium_id)
INNER JOIN freelancer_by_market USING (freelancer_id)
INNER JOIN market USING (market_id)
INNER JOIN freelancer_by_format USING (freelancer_id)
INNER JOIN format USING (format_id)
INNER JOIN freelancer_by_lang USING (freelancer_id)
INNER JOIN lang USING (lang_id);
And if you want to lose the unnecessary attributes from join tables like freelancer_by_format, then you can do this
SELECT a.ip, a.username, a.password, a.email, a.email_verified,
a.fname, a.lname, a.phone_number, a.address, a.city,
a.state, a.zip, a.country,
b.niche_name, c.medium_name, d.market_name, e.format_name, f.lang_name
FROM freelancer a
INNER JOIN freelancer_by_niche USING (freelancer_id)
INNER JOIN niche b USING (niche_id)
INNER JOIN freelancer_by_medium USING (freelancer_id)
INNER JOIN medium c USING (medium_id)
INNER JOIN freelancer_by_market USING (freelancer_id)
INNER JOIN market d USING (market_id)
INNER JOIN freelancer_by_format USING (freelancer_id)
INNER JOIN format e USING (format_id)
INNER JOIN freelancer_by_lang USING (freelancer_id)
INNER JOIN lang f USING (lang_id);
And if you want to change the column names, for example change "market_name" to just "market", then you go with
SELECT a.ip, ... ,
d.market_name "market", e.format_name AS "format", ...
FROM ...
Remarks
In your join tables (for example freelancer_by_niche) there is no UNIQUE constraint on freelancer_id, which means that the same freelancer can be in multiple niches (that's OK and probably intended).
But you also don't have a UNIQUE constraint on the pair (freelancer_id, niche_id), which means that a freelancer could be in the SAME niche multiple times ("Joe is in electronics. Three times").
You could prevent that by making (freelancer_id, niche_id) UNIQUE in freelancer_by_niche.
This way you would also not need the surrogate (artificial) PRIMARY KEY freelancer_by_niche (id).
So what could go wrong then?
For example imagine the same information about a freelancer in the same niche three times (the same data parts of the row three times):
freelancer_by_niche
id | freelancer_id | niche_id
1 | 1 | 1 -- <-- same data (1, 1), different serial id
2 | 1 | 1 -- <-- same data (1, 1), different serial id
3 | 1 | 1 -- <-- same data (1, 1), different serial id
Then the result of the above query would return each possible row three (!) times with the same (!) content, because freelancer_by_niche can be combined three times with all the other JOINs.
You can eliminate the duplicates by writing the query above with DISTINCT: SELECT DISTINCT a.freelancer_id, ... FROM ...
What if you get many duplicate rows, for example 10 data duplicates in each of the 5 JOIN tables (freelancer_by_niche, freelancer_by_medium etc)? You would get 10 * 10 * 10 * 10 * 10 = 10 ^ 5 = 100000 duplicates, which all have the exact same information.
If you then ask your DBMS to eliminate duplicates with SELECT DISTINCT ..., it has to sort 100000 duplicate rows for each distinct row, because duplicates are detected by sorting (or hashing, but never mind). If you have 1000 distinct rows for freelancers on markets, niches, languages etc., you are asking your DBMS to sort 1000 * 100000 = 100000000 rows to get back down to the 1000 unique rows.
That is 100 million unnecessary rows.
Please make UNIQUE (freelancer_id, niche_id) for freelancer_by_niche and the other JOIN tables.
(By data duplicates I mean that the data (niche_id, freelancer_id) is the same, and only the id, an auto-incremented serial, differs.)
You can easily reproduce the problem by doing the following:
-- this duplicates all data of your JOIN tables once. Do it many times.
INSERT INTO freelancer_by_niche (niche_id, freelancer_id)
SELECT niche_id, freelancer_id FROM freelancer_by_niche;
INSERT INTO freelancer_by_medium (medium_id, freelancer_id)
SELECT medium_id, freelancer_id FROM freelancer_by_medium;
INSERT INTO freelancer_by_market (market_id, freelancer_id)
SELECT market_id, freelancer_id FROM freelancer_by_market;
INSERT INTO freelancer_by_format (format_id, freelancer_id)
SELECT format_id, freelancer_id FROM freelancer_by_format;
INSERT INTO freelancer_by_lang (lang_id, freelancer_id)
SELECT lang_id, freelancer_id FROM freelancer_by_lang;
Display the duplicates using
SELECT * FROM freelancer_by_lang;
Now try the SELECT * FROM freelancer INNER JOIN ... thing.
If it still runs fast, then do all the INSERT INTO freelancer_by_niche ... again and again, until it takes forever to calculate the results.
(or you get duplicates, which you can remove with DISTINCT).
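The multiplication effect is easy to see on a toy scale. The following sketch is a simplified stand-in, not the full schema: it uses Python's sqlite3 module, only two of the five join tables, and one made-up freelancer whose membership row is stored three times in each join table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE freelancer (freelancer_id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE freelancer_by_niche (id INTEGER PRIMARY KEY,
                                      niche_id INTEGER, freelancer_id INTEGER);
    CREATE TABLE freelancer_by_lang (id INTEGER PRIMARY KEY,
                                     lang_id INTEGER, freelancer_id INTEGER);

    INSERT INTO freelancer VALUES (1, 'joe');
    -- the same (niche_id, freelancer_id) data three times, different ids
    INSERT INTO freelancer_by_niche (niche_id, freelancer_id)
        VALUES (1, 1), (1, 1), (1, 1);
    -- the same (lang_id, freelancer_id) data three times, different ids
    INSERT INTO freelancer_by_lang (lang_id, freelancer_id)
        VALUES (1, 1), (1, 1), (1, 1);
""")

JOINS = """
    FROM freelancer f
    JOIN freelancer_by_niche n ON n.freelancer_id = f.freelancer_id
    JOIN freelancer_by_lang l ON l.freelancer_id = f.freelancer_id
"""

# Every duplicate in one join table pairs with every duplicate in the other.
total_rows = conn.execute("SELECT count(*) " + JOINS).fetchone()[0]

# DISTINCT collapses them again, but only after they have been produced.
distinct_rows = conn.execute(
    "SELECT count(*) FROM (SELECT DISTINCT f.username, n.niche_id, l.lang_id "
    + JOINS + ")"
).fetchone()[0]

print(total_rows, distinct_rows)
```

With three duplicates in each of two join tables the join produces 3 * 3 rows for a single distinct fact; with five join tables the exponent grows accordingly.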
Create UNIQUE data JOIN tables
You can prevent duplicates in your join tables.
Remove the id SERIAL PRIMARY KEY and replace it with a multi-attribute PRIMARY KEY (a, b):
CREATE TABLE freelancer_by_niche (
niche_id int NOT NULL REFERENCES niche (niche_id),
freelancer_id int NOT NULL REFERENCES freelancer (freelancer_id),
PRIMARY KEY (freelancer_id, niche_id)
);
(Apply this for all your join tables).
The PRIMARY KEY (freelancer_id, niche_id) will create a UNIQUE index.
This way you cannot insert duplicate data. Try the INSERTs above: they will be rejected, because the information is already there once. Adding it another time would not add more information AND would make your query runtime much slower.
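As a small illustration of the rejection, here is a sketch using Python's sqlite3 module rather than Postgres. The REFERENCES clauses are left out so the snippet stands alone; the composite primary key is the part being demonstrated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE freelancer_by_niche (
        niche_id      INTEGER NOT NULL,
        freelancer_id INTEGER NOT NULL,
        PRIMARY KEY (freelancer_id, niche_id)
    )
""")

# First insert of the pair (niche 1, freelancer 1) succeeds.
conn.execute("INSERT INTO freelancer_by_niche VALUES (1, 1)")

# Second insert of the same pair violates the composite primary key.
try:
    conn.execute("INSERT INTO freelancer_by_niche VALUES (1, 1)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute(
    "SELECT count(*) FROM freelancer_by_niche"
).fetchone()[0]
print(rejected, row_count)
```

The table still holds exactly one row for that pair after the failed second insert.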
NON-unique index on the other part of the JOIN tables
With PRIMARY KEY (freelancer_id, niche_id), Postgres creates a unique index on these two attributes (columns).
Accessing or JOINing by freelancer_id is fast, because it's first in the index. Accessing or JOINing into freelancer_by_niche.niche_id will be slow (Full Table Scan on freelancer_by_niche).
Therefore you should create an INDEX on the second part niche_id in this table freelancer_by_niche, too.
CREATE INDEX ON freelancer_by_niche (niche_id) ;
Then joins into this table on niche_id will also be faster, because they are accelerated by an index. The index makes queries faster (usually).
Summary
You have a very good normalized database schema! It's very good. But small improvements can be made (see above).

Get rows that no foreign keys point to

I have two tables
CREATE TABLE public.city_url
(
id bigint NOT NULL DEFAULT nextval('city_url_id_seq'::regclass),
url text,
city text,
state text,
country text,
common_name text,
CONSTRAINT city_url_pkey PRIMARY KEY (id)
)
and
CREATE TABLE public.email_account
(
id bigint NOT NULL DEFAULT nextval('email_accounts_id_seq'::regclass),
email text,
password text,
total_replied integer DEFAULT 0,
last_accessed timestamp with time zone,
enabled boolean NOT NULL DEFAULT true,
deleted boolean NOT NULL DEFAULT false,
city_url_id bigint,
CONSTRAINT email_accounts_pkey PRIMARY KEY (id),
CONSTRAINT email_account_city_url_id_fkey FOREIGN KEY (city_url_id)
REFERENCES public.city_url (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
I want to come up with a query that fetches rows in the city_url only if there is no row in the email_account pointing to it with the city_url_id column.
NOT EXISTS comes to mind:
select c.*
from city_url c
where not exists (select 1
from email_account ea
where ea.city_url_id = c.id
);
There's also this option:
SELECT city_url.*
FROM city_url
LEFT JOIN email_account ON email_account.city_url_id = city_url.id
WHERE email_account.id IS NULL
A NOT EXISTS is absolutely the answer to the "... if there is no row ...".
Nonetheless, it would be preferable to accomplish this by selecting the set difference.
Which is in principle:
SELECT a.*
FROM table1 a
LEFT JOIN table2 b
ON a.[columnX] = b.[columnY]
WHERE b.[columnY] IS NULL
Using the tablenames here, this would be:
SELECT c.*
FROM city_url c
LEFT JOIN email_account e
ON c.id = e.city_url_id
WHERE e.city_url_id IS NULL
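I believe NOT IN could be used here as well, although this might be less performant on large datasets. Because city_url_id is nullable, the subquery has to filter out NULLs; otherwise a single NULL makes NOT IN return no rows at all:
SELECT *
FROM city_url
WHERE id NOT IN (
SELECT city_url_id FROM email_account
WHERE city_url_id IS NOT NULL
)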
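The approaches in this thread can be compared side by side. The sketch below runs them through Python's sqlite3 module on trimmed-down versions of the two tables with made-up rows; one email account deliberately has a NULL city_url_id, which is where NOT IN behaves differently from NOT EXISTS and the LEFT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE city_url (id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE email_account (id INTEGER PRIMARY KEY, email TEXT,
                                city_url_id INTEGER REFERENCES city_url(id));
    -- made-up sample rows; account 2 points at no city
    INSERT INTO city_url VALUES (1, 'Oslo'), (2, 'Lima'), (3, 'Pune');
    INSERT INTO email_account VALUES
        (1, 'a@example.com', 1), (2, 'b@example.com', NULL);
""")

def cities(sql):
    """Run a query that selects one city column and return it as a list."""
    return [row[0] for row in conn.execute(sql)]

not_exists = cities("""
    SELECT c.city FROM city_url c
    WHERE NOT EXISTS (SELECT 1 FROM email_account ea
                      WHERE ea.city_url_id = c.id)
    ORDER BY c.id
""")

left_join = cities("""
    SELECT c.city FROM city_url c
    LEFT JOIN email_account ea ON ea.city_url_id = c.id
    WHERE ea.id IS NULL
    ORDER BY c.id
""")

# The NULL in the subquery result makes every NOT IN comparison unknown.
not_in_naive = cities("""
    SELECT city FROM city_url
    WHERE id NOT IN (SELECT city_url_id FROM email_account)
    ORDER BY id
""")

not_in_filtered = cities("""
    SELECT city FROM city_url
    WHERE id NOT IN (SELECT city_url_id FROM email_account
                     WHERE city_url_id IS NOT NULL)
    ORDER BY id
""")

print(not_exists, left_join, not_in_naive, not_in_filtered)
```

NOT EXISTS and the LEFT JOIN agree regardless of NULLs; NOT IN only agrees once NULLs are excluded from the subquery.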

How to differentiate between “no child rows exist” and “no parent row exists” in one SELECT query?

Say I have a table C that references rows from tables A and B:
id, a_id, b_id, ...
and a simple query:
SELECT * FROM C WHERE a_id=X AND b_id=Y
I would like to differentiate between the following cases:
No row exists in A where id = X
No row exists in B where id = Y
Both such rows in A and B exist, but no rows in C exist where a_id = X and b_id = Y
The above query will return empty result in all those cases.
In case of one parent table I could do a LEFT JOIN like:
SELECT * FROM A LEFT JOIN C ON a.id = c.a_id WHERE a.id = X
and then check if the result is empty (no row in A exists), has one row with NULL c.id (row in A exists, but no rows in C exist) or 1+ rows with non-NULL c.id (row in A exists and at least one row in C exists). A bit messy but it works, but I was wondering if there is a better way of doing this, especially if there is more than one parent table?
For example:
C is "things owned by people", A is "people", B is "types of things". When someone asks "give me a list of games owned by Bill", and there are no such records in C, I would like to return an empty list only if both "Bill" and "games" exist in their corresponding tables, but an error code if either of them doesn't.
So if there are no records matching "Bill" and "games" in table C, I would like to say "I don't know who Bill is" instead of "Bill has no games" if I don't have a record about Bill in table A.
create table a(a_id integer not null primary key);
create table b(b_id integer not null primary key);
create table c(a_id integer not null references a(a_id)
, b_id integer not null references b(b_id)
, primary key (a_id,b_id)
);
insert into a(a_id) values(0),(2),(4),(6);
insert into b(b_id) values(0),(3),(6);
insert into c(a_id,b_id) values(6,6);
PREPARE omg(integer,integer) AS
SELECT EXISTS(SELECT * FROM a where a.a_id = $1) AS a_exists
, EXISTS(SELECT * FROM b where b.b_id = $2) AS b_exists
, EXISTS(SELECT * FROM c where c.a_id = $1 and c.b_id = $2) AS c_exists
;
EXECUTE omg(1,1);
EXECUTE omg(2,1);
EXECUTE omg(1,3);
EXECUTE omg(6,6);
-- with optional payload:
PREPARE omg2(integer,integer) AS
SELECT val.a_id AS va_id
, val.b_id AS vb_id
, EXISTS(SELECT * FROM a WHERE a.a_id = $1) AS a_exists
, EXISTS(SELECT * FROM b WHERE b.b_id = $2) AS b_exists
, EXISTS(SELECT * FROM c WHERE c.a_id = val.a_id AND c.b_id = val.b_id ) AS c_exists
, a.*
, b.*
, c.*
FROM (values ($1,$2)) val(a_id,b_id)
LEFT JOIN a ON a.a_id = val.a_id
LEFT JOIN b ON b.b_id = val.b_id
LEFT JOIN c ON c.a_id = val.a_id AND c.b_id = val.b_id
;
EXECUTE omg2(1,1);
EXECUTE omg2(2,1);
EXECUTE omg2(1,3);
EXECUTE omg2(6,6);
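For readers without a PostgreSQL session at hand, the first prepared statement can be approximated with Python's sqlite3 module. This is a sketch under that substitution: named parameters replace $1 and $2, and the helper names `existence_flags` and `explain` are invented here to show how an application might turn the three flags into the distinct outcomes the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (a_id INTEGER NOT NULL PRIMARY KEY);
    CREATE TABLE b (b_id INTEGER NOT NULL PRIMARY KEY);
    CREATE TABLE c (a_id INTEGER NOT NULL REFERENCES a(a_id),
                    b_id INTEGER NOT NULL REFERENCES b(b_id),
                    PRIMARY KEY (a_id, b_id));
    INSERT INTO a (a_id) VALUES (0), (2), (4), (6);
    INSERT INTO b (b_id) VALUES (0), (3), (6);
    INSERT INTO c (a_id, b_id) VALUES (6, 6);
""")

def existence_flags(a_id, b_id):
    """Return (a_exists, b_exists, c_exists) as booleans."""
    row = conn.execute("""
        SELECT EXISTS(SELECT 1 FROM a WHERE a.a_id = :a),
               EXISTS(SELECT 1 FROM b WHERE b.b_id = :b),
               EXISTS(SELECT 1 FROM c WHERE c.a_id = :a AND c.b_id = :b)
    """, {"a": a_id, "b": b_id}).fetchone()
    return tuple(bool(flag) for flag in row)

def explain(a_id, b_id):
    """Map the flags to one of the cases from the question."""
    a_ok, b_ok, c_ok = existence_flags(a_id, b_id)
    if not a_ok:
        return "no such row in a"
    if not b_ok:
        return "no such row in b"
    return "rows found in c" if c_ok else "both parents exist, no rows in c"

for pair in [(1, 1), (2, 1), (1, 3), (6, 6), (2, 3)]:
    print(pair, existence_flags(*pair), explain(*pair))
```

The pair (2, 3) is an extra call beyond the EXECUTE lines above; it covers the case where both parents exist but c has no matching row.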
I think I managed to get a satisfactory solution using the following two features:
Subselect bound to a column, which allows me to check if a row exists and (importantly) get a NULL value otherwise (e.g. SELECT (SELECT id FROM a WHERE id = 1) AS a_id)
Common Table Expressions
Initial data:
CREATE TABLE people
(
id integer not null primary key,
name text not null
);
CREATE TABLE thing_types
(
id integer not null primary key,
name text not null
);
CREATE TABLE things
(
id integer not null primary key,
person_id integer not null references people(id),
thing_type_id integer not null references thing_types(id),
name text not null
);
INSERT INTO people VALUES (1, 'Bill');
INSERT INTO thing_types VALUES (1, 'game');
INSERT INTO things VALUES (1, 1, 1, 'Duke Nukem');
INSERT INTO things VALUES (2, 1, 1, 'Warcraft 2');
And the query:
WITH v AS (
SELECT (SELECT id FROM people WHERE id=<person_id_param>) AS person_id,
(SELECT id FROM thing_types WHERE id=<thing_type_param>) AS thing_type_id
)
SELECT v.person_id, v.thing_type_id, things.name
FROM
v LEFT JOIN things
ON v.person_id = things.person_id AND v.thing_type_id = things.thing_type_id
This query will always return at least one row, and I just need to check which, if any, of the three columns of the first row are NULLs.
In case if both parent table ids are valid and there are some records, none of them will be NULL:
person_id thing_type_id name
-------------------------------------
1 1 Duke Nukem
1 1 Warcraft 2
If either person_id or thing_type_id are invalid, I get one row where name is NULL and either person_id or thing_type_id is NULL:
person_id thing_type_id name
-------------------------------------
NULL 1 NULL
If both person_id and thing_type_id are valid but there are no records in things, I get one row where both person_id and thing_type_id are not NULL, but the name is NULL:
person_id thing_type_id name
-------------------------------------
1 1 NULL
Since I have a NOT NULL constraint on things.name, I know that this case can only mean that there are no matching records in things. If NULLs were allowed in things.name, I could include things.id instead and check that for NULLness.
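Here is a hedged sketch of how the calling code might interpret the first row, using Python's sqlite3 module instead of PostgreSQL. The function name `lookup`, the status strings, and the extra person 'Ann' (who owns nothing) are assumptions added for illustration; the query itself is the one above with named parameters in place of the placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER NOT NULL PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE thing_types (id INTEGER NOT NULL PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE things (
        id INTEGER NOT NULL PRIMARY KEY,
        person_id INTEGER NOT NULL REFERENCES people(id),
        thing_type_id INTEGER NOT NULL REFERENCES thing_types(id),
        name TEXT NOT NULL
    );
    INSERT INTO people VALUES (1, 'Bill');
    INSERT INTO thing_types VALUES (1, 'game');
    INSERT INTO things VALUES (1, 1, 1, 'Duke Nukem');
    INSERT INTO things VALUES (2, 1, 1, 'Warcraft 2');
    -- hypothetical extra person with no things, for the "valid but empty" case
    INSERT INTO people VALUES (2, 'Ann');
""")

QUERY = """
    WITH v AS (
        SELECT (SELECT id FROM people WHERE id = :person) AS person_id,
               (SELECT id FROM thing_types WHERE id = :thing_type) AS thing_type_id
    )
    SELECT v.person_id, v.thing_type_id, things.name
    FROM v
    LEFT JOIN things
      ON v.person_id = things.person_id
     AND v.thing_type_id = things.thing_type_id
    ORDER BY things.id
"""

def lookup(person, thing_type):
    """Return (status, names); status tells which parent, if any, is missing."""
    rows = conn.execute(
        QUERY, {"person": person, "thing_type": thing_type}
    ).fetchall()
    person_id, thing_type_id, name = rows[0]  # the query always yields a row
    if person_id is None:
        return "unknown person", []
    if thing_type_id is None:
        return "unknown thing type", []
    if name is None:  # both parents valid, nothing owned
        return "ok", []
    return "ok", [row[2] for row in rows]

print(lookup(1, 1))
print(lookup(2, 1))
print(lookup(9, 1))
print(lookup(1, 9))
```

When both ids are invalid this sketch reports the person first; an application could just as well report both.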
You have three cases. The third one is a bit more complex, but it can be handled with a cross join between a and b. All three cases combined with UNION ALL could look like this:
select a_id, b_id , 'case 1' from c
where not exists (select 1 from a where a.a_id=c.a_id)
union all
select a_id, b_id ,'case 2' from c
where not exists (select 1 from b where b.b_id=c.b_id)
union all
select a_id, b_id, 'case 3' from a cross join b
where exists (select 1 from c where c.a_id=a.a_id)
and exists (select 1 from c where c.b_id=b.b_id)
and not exists (select 1 from c where c.b_id=b.b_id and c.a_id=a.a_id)