Ambiguity with column reference [duplicate] - sql

I am trying to run the following simple code:
Create Table weather (
city varchar(80),
temp_lo int,
temp_hi int,
prcp real,
date date
);
Insert Into weather Values ('A', -5, 40, 25, '2018-01-10');
Insert Into weather Values ('B', 5, 45, 15, '2018-02-10');
Create Table cities (
city varchar(80),
location point
);
Insert Into cities Values ('A', '(12,10)');
Insert Into cities Values ('B', '(6,4)');
Insert Into cities Values ('C', '(18,13)');
Select * From cities, weather Where city = 'A'
But what I get is
ERROR: column reference "city" is ambiguous.
What is wrong with my code?

If I were you I'd model things slightly differently.
To normalise things a little, we'll start with the cities table and make a few changes:
create table cities (
city_id integer primary key,
city_name varchar(100),
location point
);
Note that I've used an integer as the ID and primary key of the table, and stored the name of the city separately. This gives you a nice, easy-to-maintain lookup table. By using an integer as the primary key, we'll also use less space in the weather table when we're storing data.
Create Table weather (
city_id integer,
temp_lo int,
temp_hi int,
prcp real,
record_date date
);
Note that I'm storing the id of the city rather than the name. Also, I've renamed date as it's not a good idea to name columns after SQL reserved words.
Ensure that we use IDs in the test data:
Insert Into weather Values (1, -5, 40, 25, '2018-01-10');
Insert Into weather Values (2, 5, 45, 15, '2018-02-10');
Insert Into cities Values (1,'A', '(12,10)');
Insert Into cities Values (2,'B', '(6,4)');
Insert Into cities Values (3,'C', '(18,13)');
Your old query:
Select * From cities, weather Where city = 'A'
The name was ambiguous because both tables have a city column, and the database engine doesn't know which city you mean (it doesn't automatically know if it needs to use cities.city or weather.city). The query also performs a cartesian product, as you have not joined the tables together.
Using the changes I have made above, you'd require something like:
Select *
From cities, weather
Where cities.city_id = weather.city_id
and city_name = 'A';
or, using newer join syntax:
Select *
From cities
join weather on cities.city_id = weather.city_id
Where city_name = 'A';
The two queries are functionally equivalent. These days most people prefer the second query, as it can prevent mistakes (e.g. forgetting to actually join the tables in the where clause).
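The ambiguity and the cartesian product described above can be reproduced in a few lines. This is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for PostgreSQL, so the error text differs slightly, and the location column is left out because it plays no part in the problem. The original table layout (a city column in both tables) is kept on purpose.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; only the error wording differs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE weather (city TEXT, temp_lo INT, temp_hi INT, prcp REAL, date TEXT);
    INSERT INTO weather VALUES ('A', -5, 40, 25, '2018-01-10');
    INSERT INTO weather VALUES ('B', 5, 45, 15, '2018-02-10');
    CREATE TABLE cities (city TEXT);
    INSERT INTO cities VALUES ('A'), ('B'), ('C');
""")

# Unqualified column: the engine cannot tell which table's city is meant.
try:
    conn.execute("SELECT * FROM cities, weather WHERE city = 'A'")
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

# Qualified but not joined: the cities row 'A' is paired with every weather row.
unjoined = conn.execute(
    "SELECT * FROM cities, weather WHERE cities.city = 'A'"
).fetchall()

# Qualified and joined: only the matching weather row remains.
joined = conn.execute(
    "SELECT * FROM cities JOIN weather ON cities.city = weather.city "
    "WHERE cities.city = 'A'"
).fetchall()

print(error_message)
print(len(unjoined), len(joined))
```

Qualifying the column removes the error, but only the join condition removes the extra rows.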

Both tables, cities and weather, have a column called city. In your WHERE clause you filter on city = 'A', so which table's city is it referring to?
You can tell the engine which one you want to filter by prefixing the column with its table name:
Select * From cities, weather Where cities.city = 'A'
You can also refer to tables by alias:
Select *
From cities AS C, weather AS W
Where C.city = 'A'
But most important, make sure that you join tables together, unless you want all records from both tables to be matched without criterion (cartesian product). You can join them with explicit INNER JOIN:
Select
*
From
cities AS C
INNER JOIN weather AS W ON C.city = W.city
Where
C.city = 'A'
In the example you mention, this query is used:
SELECT *
FROM weather, cities
WHERE city = name;
But there, the cities table has a name column (instead of city, which is the one you used). That WHERE clause links the weather and cities tables together: city is a weather column and name is a cities column, so there is no ambiguity because the two columns have different names.


PostgreSQL Insert into table with subquery selecting from multiple other tables

I am learning SQL (postgres) and am trying to insert a record into a table that references records from two other tables, as foreign keys.
Below is the syntax I am using for creating the tables and records:
-- Create a person table + insert single row
CREATE TABLE person (
pname VARCHAR(255) NOT NULL,
PRIMARY KEY (pname)
);
INSERT INTO person VALUES ('personOne');
-- Create a city table + insert single row
CREATE TABLE city (
cname VARCHAR(255) NOT NULL,
PRIMARY KEY (cname)
);
INSERT INTO city VALUES ('cityOne');
-- Create a employee table w/ForeignKey reference
CREATE TABLE employee (
ename VARCHAR(255) REFERENCES person(pname) NOT NULL,
ecity VARCHAR(255) REFERENCES city(cname) NOT NULL,
PRIMARY KEY(ename, ecity)
);
-- create employee entry referencing existing records
INSERT INTO employee VALUES(
SELECT pname FROM person
WHERE pname='personOne' AND <-- ISSUE
SELECT cname FROM city
WHERE cname='cityOne
);
Notice in the last block of code, where I'm doing an INSERT into the employee table, I don't know how to string together multiple SELECT sub-queries to get both the existing records from the person and city table such that I can create a new employee entry with attributes as such:
ename='personOne'
ecity='cityOne'
The textbook I have for class doesn't dive into sub-queries like this and I can't find any examples similar enough to mine such that I can understand how to adapt them for this use case.
Insight will be much appreciated.
There doesn't appear to be any obvious relationship between city and person, which will make your life hard.
The general pattern for turning a select that has two base tables giving info, into an insert is:
INSERT INTO table(column,list,here)
SELECT column,list,here
FROM
a
JOIN b ON a.x = b.y
In your case there isn't really anything to join on, because your one-column tables have no column in common. If you add, for example, a cityname column to person (because it seems more likely that one city has many persons), then you can do:
INSERT INTO employee(ename, ecity)
SELECT p.pname, c.cname
FROM
person p
JOIN city c ON p.cityname = c.cname
But even then, the tables are already related to each other and don't need the third table, so it's perhaps something of an academic exercise only, not something you'd do in the real world.
If you just want to mix every person with every city you can do:
INSERT INTO employee(ename, ecity)
SELECT pname, cname
FROM
person p
CROSS JOIN city c
But be warned, two people and two cities will cause 4 rows to be inserted, and so on (20 people and 40 cities, 800 rows. Fairly useless imho)
However, I trust that the general pattern shown first will suffice for your learning; write a SELECT that shows the data you want to insert, then simply write INSERT INTO table(columns) above it. The number of columns inserted to must match the number of columns selected. Don’t forget that you can select fixed values if no column from the query has the info (INSERT INTO X(p,c,age) SELECT personname, cityname, 23 FROM ...)
The following will work for you:
INSERT INTO employee
SELECT pname, cname FROM person, city
WHERE pname='personOne' AND cname='cityOne';
This is a cross join producing a cartesian product of the two tables (since there is nothing to link the two). It reads slightly oddly, given that you could just as easily have inserted the values directly. But I assume this is because it is a learning exercise.
Please note that there is a typo in your create employee. You are missing a comma before the primary key.
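The "write the SELECT first, then put INSERT INTO above it" pattern can be tried out as follows. This is a sketch, not the one right way to do it: it runs against an in-memory SQLite database through Python's sqlite3 module rather than PostgreSQL, and the rows personTwo and cityTwo are made up for illustration. The second statement shows the form the question was reaching for, with each scalar subquery wrapped in its own parentheses inside VALUES.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; personTwo/cityTwo are invented rows.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE person (pname TEXT NOT NULL PRIMARY KEY);
    CREATE TABLE city (cname TEXT NOT NULL PRIMARY KEY);
    CREATE TABLE employee (
        ename TEXT NOT NULL REFERENCES person(pname),
        ecity TEXT NOT NULL REFERENCES city(cname),
        PRIMARY KEY (ename, ecity)
    );
    INSERT INTO person VALUES ('personOne'), ('personTwo');
    INSERT INTO city VALUES ('cityOne'), ('cityTwo');
""")

# Step 1: write the SELECT and check that it returns exactly the rows to insert.
select_sql = """
    SELECT pname, cname
    FROM person CROSS JOIN city
    WHERE pname = 'personOne' AND cname = 'cityOne'
"""
preview = conn.execute(select_sql).fetchall()

# Step 2: put INSERT INTO table(columns) above the very same SELECT.
conn.execute("INSERT INTO employee (ename, ecity)" + select_sql)

# Alternative: one parenthesised scalar subquery per column inside VALUES.
conn.execute("""
    INSERT INTO employee (ename, ecity) VALUES (
        (SELECT pname FROM person WHERE pname = 'personTwo'),
        (SELECT cname FROM city WHERE cname = 'cityTwo')
    )
""")

employees = sorted(conn.execute("SELECT ename, ecity FROM employee").fetchall())
print(preview)
print(employees)
```

Naming the target columns explicitly, as in employee (ename, ecity), keeps the statement valid even if the table later gains columns.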

How to handle a range within a data field

I have a set of data with ranges of numbers that are saved to the field itself. So for example, in the age column there are entries like "60-64", "65+" and in the income field "30,000-40,000". Is there a way to query these fields and treat them as number ranges? So a query for 52500 would match the "50,000-60,000" income range?
Preprocessing the input is my current top idea, where I just map the user input to all possible values for these fields before I query the database. But I was wondering if there is a better way.
Assume that I cannot modify the database or create a new database at all.
There is no easy way with SQLite that I know of, and you would certainly be better off restructuring each of your range columns into two columns, range_start and range_end.
If your ranges are fixed ranges, you can get the minimum / maximum from a separate table:
create table age_ranges (
name varchar(16) unique not null
, range_start integer unique not null
, range_end integer unique not null
);
insert into age_ranges (name, range_start,range_end) values ('60-64',60,64);
insert into age_ranges (name, range_start,range_end) values ('65+',65,999);
create table participant (
name varchar(16) unique not null
, age integer not null
, income integer not null
);
insert into participant (name, age, income) values ('Joe Blow', 65, 900);
insert into participant (name, age, income) values ('Jane Doe', 61, 1900);
create table question (
question varchar(64) not null
, relevant_age varchar(32) not null
);
insert into question (question,relevant_age) values('What is your favourite non-beige color?', '65+');
insert into question (question,relevant_age) values('What is your favourite car?', '60-64');
select
p.name,
q.question,
q.relevant_age
from participant p
join age_ranges r on (r.range_start <= p.age and p.age <= r.range_end)
join question q on q.relevant_age = r.name
Alternatively, you can try to parse the range start and range end out of the text with string functions such as substr() and instr() (SQLite has no LEFT() function), but the performance will likely be bad.
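For the string-parsing route, a rough sketch using SQLite's replace(), instr(), substr() and CAST might look like this. The respondent table and its rows are invented for illustration, and the queries assume that every range is written either as "low-high" or as "low+"; any other format would need extra handling. Python's sqlite3 module is used only to run the SQL.

```python
import sqlite3

# Assumption: hypothetical table and rows; ranges are always 'low-high' or 'low+'.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE respondent (name TEXT, age TEXT, income TEXT);
    INSERT INTO respondent VALUES ('r1', '60-64', '30,000-40,000');
    INSERT INTO respondent VALUES ('r2', '65+',   '50,000-60,000');
""")

# Income: strip the thousands separators, then split on the hyphen.
income_sql = """
    SELECT name
    FROM (SELECT name, replace(income, ',', '') AS r FROM respondent)
    WHERE :value BETWEEN CAST(substr(r, 1, instr(r, '-') - 1) AS INTEGER)
                     AND CAST(substr(r, instr(r, '-') + 1) AS INTEGER)
"""

# Age: an open-ended 'NN+' range only has a lower bound.
age_sql = """
    SELECT name
    FROM respondent
    WHERE CASE
            WHEN age LIKE '%+'
              THEN :value >= CAST(replace(age, '+', '') AS INTEGER)
            ELSE :value BETWEEN CAST(substr(age, 1, instr(age, '-') - 1) AS INTEGER)
                            AND CAST(substr(age, instr(age, '-') + 1) AS INTEGER)
          END
"""


def match(sql, value):
    """Return the names whose stored range contains value."""
    return [row[0] for row in conn.execute(sql, {"value": value})]


print(match(income_sql, 52500))
print(match(age_sql, 61), match(age_sql, 70))
```

Because the bounds are computed per row, no index can help, so every such query scans the whole table. With a fixed set of ranges, the lookup table above or preprocessing the input avoids that.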

How do NOT EXISTS and correlated subqueries work internally

I would like to understand how NOT EXISTS works in a correlated subquery.
This query returns the patient who takes all the medications, but I don't understand why.
Could someone please explain what's happening in each step of execution of this query and which records are being considered and dismissed in each step.
create table medication
(
idmedic INT PRIMARY KEY,
name VARCHAR(20),
dosage NUMERIC(8,2)
);
create table patient
(
idpac INT PRIMARY KEY,
name VARCHAR(20)
);
create table prescription
(
idpac INT,
idmedic INT,
date DATE,
time TIME,
FOREIGN KEY (idpac) REFERENCES patient(idpac),
FOREIGN KEY (idmedic) REFERENCES medication(idmedic)
);
insert into patient (idpac, name)
values (1, 'joe'), (2, 'tod'), (3, 'ric');
insert into medication (idmedic, name, dosage)
values (1, 'tilenol', 0.01), (2, 'omega3', 0.02);
insert into prescription (idpac, idmedic, date, time)
values (1, 1, '2018-01-01', '20:00'), (1, 2, '2018-01-01', '20:00'),
(2, 2, '2018-01-01', '20:00');
select
pa.name
from
patient pa
where
not exists (select 1 from medication me
where not exists (select 1
from prescription pr
where pr.idpac = pa.idpac
and pr.idmedic = me.idmedic))
Your query is trying to find:
all the patients who TAKE ALL medications.
I have rewritten your script, to find
all the patients who have NOT TAKEN ANY medications.
-- all the patients who have not taken any medications
-- This returns 1 row, patient ric
select
pa.name
from
patient pa
where
not exists (select 1 from medication me
where /**** not ****/ exists (select 1
from prescription pr
where pr.idpac = pa.idpac
and pr.idmedic = me.idmedic))
I think that this query will clarify the usage of EXISTS operator to you.
If not, try to think of sub-queries as JOINs and EXISTS/NOT EXISTS as WHERE conditions.
EXISTS operator is explained as "Specifies a subquery to test for the existence of rows".
You could also check the examples in the EXISTS documentation on learn.microsoft.com.
If you see a doubly nested "not exists" in a query that's usually an indication that relational division is being performed (google that, you'll find plenty of stuff).
Translating the query clause by clause into informal language yields something like:
get patients
for which there does not exist
a medication
for which there does not exist
a prescription for that patient to take that medication
which translates to
patients for which there is no medication they don't take.
which translates to
patients that take all medications.
Relational division is the relational algebra operator that corresponds to universal quantification in predicate logic.
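To see which rows are kept and dismissed, it can help to run the inner NOT EXISTS on its own first. The sketch below (Python's sqlite3 with an in-memory database, standing in for whatever engine the question uses, and with the date, time and dosage columns dropped for brevity) lists every patient/medication pair that has no prescription, then runs the original doubly nested query. A patient survives the outer NOT EXISTS exactly when that patient has no row in the first list.

```python
import sqlite3

# Assumption: SQLite stand-in; date, time and dosage columns omitted for brevity.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE medication (idmedic INT PRIMARY KEY, name TEXT);
    CREATE TABLE patient (idpac INT PRIMARY KEY, name TEXT);
    CREATE TABLE prescription (idpac INT, idmedic INT);
    INSERT INTO patient VALUES (1, 'joe'), (2, 'tod'), (3, 'ric');
    INSERT INTO medication VALUES (1, 'tilenol'), (2, 'omega3');
    INSERT INTO prescription VALUES (1, 1), (1, 2), (2, 2);
""")

# Inner step: for each patient, the medications with no matching prescription.
missing_sql = """
    SELECT pa.name, me.name
    FROM patient pa CROSS JOIN medication me
    WHERE NOT EXISTS (SELECT 1 FROM prescription pr
                      WHERE pr.idpac = pa.idpac
                        AND pr.idmedic = me.idmedic)
    ORDER BY pa.idpac, me.idmedic
"""
missing = conn.execute(missing_sql).fetchall()

# Full query: keep only patients for whom no "missing" medication exists.
division_sql = """
    SELECT pa.name
    FROM patient pa
    WHERE NOT EXISTS (SELECT 1 FROM medication me
                      WHERE NOT EXISTS (SELECT 1 FROM prescription pr
                                        WHERE pr.idpac = pa.idpac
                                          AND pr.idmedic = me.idmedic))
"""
takes_all = [row[0] for row in conn.execute(division_sql)]

print(missing)
print(takes_all)
```

Here tod is dismissed because tilenol is missing, ric because both medications are missing, and joe is kept because nothing is missing for him.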

Parent/Child Tables Query Pattern

Suppose I have the following parent/child table relationship in my database:
TABLE offer_master( offer_id int primary key, ..., scope varchar )
TABLE offer_detail( offer_detail_id int primary key, offer_id int foreign key, customer_id int, ... )
where offer_master.scope can take on the value
INDIVIDUAL: when the offer is made to particular customers. In this case,
whenever a row is inserted into offer_master, a corresponding row is
added to offer_detail for each customer to which the offer has been extended.
e.g.
INSERT INTO offer_master VALUES ( 1, ..., 'INDIVIDUAL' );
INSERT INTO offer_detail( offer_detail_id, offer_id, customer_id, ... )
VALUES ( 1, 1, 100, ... )
INSERT INTO offer_detail( offer_detail_id, offer_id, customer_id, ... )
VALUES ( 2, 1, 101, ... )
GLOBAL: when the offer is made to all customers. In this case,
new offers can be added to the parent table as follows:
INSERT INTO offer_master VALUES ( 2, ..., 'GLOBAL' );
INSERT INTO offer_master VALUES ( 3, ..., 'GLOBAL' );
but a child row is added to offer_detail only
when a customer indicates some interest in the offer. So
it may be the case that, at some later point we will have
INSERT INTO offer_detail( offer_detail_id, offer_id, customer_id, ... )
VALUES ( 4, 3, 100, ... )
Given this situation, suppose we would like to query the database
to obtain all offers which have been extended to customer 100;
this includes 3 types of offers:
offers which have been extended specifically to customer 100.
global offers which customer 100 showed no interest in.
global offers which customer 100 did show interest in.
I see two approaches:
Using a Subquery:
SELECT *
FROM offer_master
WHERE offer_id in (
SELECT offer_id
FROM offer_detail
WHERE customer_id = 100 )
OR scope = 'GLOBAL'
Using a UNION
SELECT om.*
FROM offer_master om INNER JOIN
offer_detail od
ON om.offer_id = od.offer_id
WHERE od.customer_id = 100
UNION
SELECT *
FROM offer_master
WHERE scope = 'GLOBAL'
Note: a UNION ALL cannot be used since a global offer
which a customer has shown interest in would be duplicated.
My question is:
Does this query pattern have a name?
Which of the two query methods is preferable?
Should the database design be improved in some way?
I'm not aware of a pattern name.
To me, the second query is clearer but I think either is OK.
offer_detail seems to be a dual purpose table which is a bit of a red flag to me. You might have separate tables for the customers in an individual offer, and the customers who have expressed interest.
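A quick way to convince yourself that the two queries agree, and that UNION ALL really would duplicate a row, is to run them side by side. This sketch uses Python's sqlite3 with an in-memory database. The elided columns are left out, and offer 4 (an INDIVIDUAL offer made only to customer 101) is an invented row added so that there is something to exclude.

```python
import sqlite3

# Assumption: SQLite stand-in; elided columns dropped; offer 4 is an invented row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE offer_master (offer_id INTEGER PRIMARY KEY, scope TEXT);
    CREATE TABLE offer_detail (
        offer_detail_id INTEGER PRIMARY KEY,
        offer_id INTEGER REFERENCES offer_master(offer_id),
        customer_id INTEGER
    );
    INSERT INTO offer_master VALUES
        (1, 'INDIVIDUAL'), (2, 'GLOBAL'), (3, 'GLOBAL'), (4, 'INDIVIDUAL');
    INSERT INTO offer_detail VALUES
        (1, 1, 100), (2, 1, 101), (4, 3, 100), (5, 4, 101);
""")

subquery_sql = """
    SELECT offer_id
    FROM offer_master
    WHERE offer_id IN (SELECT offer_id FROM offer_detail WHERE customer_id = ?)
       OR scope = 'GLOBAL'
"""

# {op} is filled with UNION or UNION ALL to compare the two set operators.
union_template = """
    SELECT om.offer_id
    FROM offer_master om
    INNER JOIN offer_detail od ON om.offer_id = od.offer_id
    WHERE od.customer_id = ?
    {op}
    SELECT offer_id FROM offer_master WHERE scope = 'GLOBAL'
"""


def offer_ids(sql, customer_id):
    """Return the sorted offer ids a query yields for one customer."""
    return sorted(row[0] for row in conn.execute(sql, (customer_id,)))


via_subquery = offer_ids(subquery_sql, 100)
via_union = offer_ids(union_template.format(op="UNION"), 100)
via_union_all = offer_ids(union_template.format(op="UNION ALL"), 100)

print(via_subquery, via_union, via_union_all)
```

With this data, global offer 3, which customer 100 showed interest in, is the row that appears twice under UNION ALL.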

Update specific rows in Oracle using the count from another table

I am trying to update some records in one table (street) with a count from another table (house). I am trying to update the house_count in the street table with the correct number of houses on that street. I only want to update the records that are incorrect. I have been able to do this in MSSQL using the code below:
CREATE TABLE street
(
name varchar(255),
house_count int
);
Create table house
(
id varchar(255),
street_name varchar(255)
);
insert into street values ('oak',1)
insert into street values ('maple',2)
insert into street values ('birch',4)
insert into street values ('walnut',1)
insert into house values (1,'oak')
insert into house values (2,'oak')
insert into house values (1,'maple')
insert into house values (2,'maple')
insert into house values (1,'birch')
insert into house values (2,'birch')
insert into house values (3,'birch')
insert into house values (1,'walnut')
update s set s.house_count= hc.ActualCount
from street s
inner join
(select s.name, count(s.name) as 'ActualCount', s.house_count
from street s
inner join house h on s.name=h.street_name
group by s.name, s.house_count
having count(s.name) <> s.house_count) hc ON s.name=hc.name
where s.name=hc.name
I need to do something similar in Oracle but have run into issues. From what I have found, the join is not possible in Oracle, and I am having a hard time getting something that will work. Any help in getting something like this to work in Oracle is greatly appreciated.
Thanks
You can do this with a correlated subquery:
update street
set house_count = (select count(*)
from house h
where street.name = h.street_name
);
This is a little different from your approach, because it will update all streets, even when the count does not change. There is no performance advantage using a subquery in trying to prevent the update.
EDIT:
This should solve the problem with the apartment streets versus house streets:
update street
set house_count = (select count(*)
from house h
where street.name = h.street_name
)
where exists (select 1 from house h where street.name = h.street_name);
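If you do want to touch only the rows whose count is wrong, as the question asks, one option is to repeat the correlated subquery in the WHERE clause. The sketch below checks that idea against the sample data using Python's sqlite3 and an in-memory database. It has not been run on Oracle, so treat the statement as an assumption to verify there, although it uses only a plain correlated subquery.

```python
import sqlite3

# Assumption: SQLite stand-in for Oracle; sample rows copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE street (name TEXT, house_count INT);
    CREATE TABLE house (id TEXT, street_name TEXT);
    INSERT INTO street VALUES ('oak', 1), ('maple', 2), ('birch', 4), ('walnut', 1);
    INSERT INTO house VALUES
        (1, 'oak'), (2, 'oak'),
        (1, 'maple'), (2, 'maple'),
        (1, 'birch'), (2, 'birch'), (3, 'birch'),
        (1, 'walnut');
""")

# Update only the streets whose stored count differs from the actual count.
cursor = conn.execute("""
    UPDATE street
    SET house_count = (SELECT COUNT(*)
                       FROM house h
                       WHERE h.street_name = street.name)
    WHERE house_count <> (SELECT COUNT(*)
                          FROM house h
                          WHERE h.street_name = street.name)
""")
updated_rows = cursor.rowcount

# name -> house_count after the update
counts = dict(conn.execute("SELECT name, house_count FROM street"))

print(updated_rows)
print(counts)
```

With the sample data only oak and birch are modified. Note that a NULL house_count would not satisfy the <> comparison and would therefore be skipped.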