Update specific rows in Oracle using the count from another table - sql

I am trying to update some records in one table (street) with a count from another table (house): I want to set house_count in the street table to the correct number of houses on each street. I only want to update the records that are incorrect. I have been able to do this in MSSQL using the code below:
CREATE TABLE street
(
name varchar(255),
house_count int
);
Create table house
(
id varchar(255),
street_name varchar(255)
);
insert into street values ('oak',1)
insert into street values ('maple',2)
insert into street values ('birch',4)
insert into street values ('walnut',1)
insert into house values (1,'oak')
insert into house values (2,'oak')
insert into house values (1,'maple')
insert into house values (2,'maple')
insert into house values (1,'birch')
insert into house values (2,'birch')
insert into house values (3,'birch')
insert into house values (1,'walnut')
update s set s.house_count= hc.ActualCount
from street s
inner join
(select s.name, count(s.name) as 'ActualCount', s.house_count
from street s
inner join house h on s.name=h.street_name
group by s.name, s.house_count
having count(s.name) <> s.house_count) hc ON s.name=hc.name
where s.name=hc.name
I need to do something similar in Oracle but have run into issues. From what I have found, this kind of join inside an UPDATE is not possible in Oracle, and I am having a hard time getting something that will work. Any help in getting something like this to work in Oracle is greatly appreciated.
Thanks

You can do this with a correlated subquery:
update street
set house_count = (select count(*)
from house h
where street.name = h.street_name
);
This is a little different from your approach because it updates all streets, even when the count does not change; adding a subquery just to prevent those updates generally buys you no performance advantage.
EDIT:
This should solve the problem with the apartment streets versus house streets, by only updating streets that actually have matching rows in house:
update street
set house_count = (select count(*)
from house h
where street.name = h.street_name
)
where exists (select 1 from house h where street.name = h.street_name);
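If you also want to skip rows whose count is already correct, as in your MSSQL version, one option (a sketch, not from the original answer) is Oracle's MERGE, which allows a WHERE clause on the matched update:
merge into street s
using (select street_name, count(*) as actual_count
from house
group by street_name) hc
on (s.name = hc.street_name)
when matched then
update set house_count = hc.actual_count
where s.house_count <> hc.actual_count;
Only streets that have at least one house and a stale house_count are touched.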

Related

PostgreSQL Insert into table with subquery selecting from multiple other tables

I am learning SQL (postgres) and am trying to insert a record into a table that references records from two other tables, as foreign keys.
Below is the syntax I am using for creating the tables and records:
-- Create a person table + insert single row
CREATE TABLE person (
pname VARCHAR(255) NOT NULL,
PRIMARY KEY (pname)
);
INSERT INTO person VALUES ('personOne');
-- Create a city table + insert single row
CREATE TABLE city (
cname VARCHAR(255) NOT NULL,
PRIMARY KEY (cname)
);
INSERT INTO city VALUES ('cityOne');
-- Create a employee table w/ForeignKey reference
CREATE TABLE employee (
ename VARCHAR(255) REFERENCES person(pname) NOT NULL,
ecity VARCHAR(255) REFERENCES city(cname) NOT NULL,
PRIMARY KEY(ename, ecity)
);
-- create employee entry referencing existing records
INSERT INTO employee VALUES(
SELECT pname FROM person
WHERE pname='personOne' AND <-- ISSUE
SELECT cname FROM city
WHERE cname='cityOne'
);
Notice in the last block of code, where I'm doing an INSERT into the employee table, that I don't know how to string together multiple SELECT sub-queries to get the existing records from both the person and city tables so that I can create a new employee entry with attributes such as:
ename='personOne'
ecity='cityOne'
The textbook I have for class doesn't dive into sub-queries like this and I can't find any examples similar enough to mine such that I can understand how to adapt them for this use case.
Insight will be much appreciated.
There doesn't appear to be any obvious relationship between city and person, which will make your life hard.
The general pattern for turning a select over two base tables into an insert is:
INSERT INTO table(column,list,here)
SELECT column,list,here
FROM
a
JOIN b ON a.x = b.y
In your case there isn't really anything to join on, because your one-column tables have no column in common. Add, for example, a cityname column to person (it seems more likely that one city has many persons); then you can do:
INSERT INTO employee(personname,cityname)
SELECT p.pname, c.cname
FROM
person p
JOIN city c ON p.cityname = c.cname
But even then, the two tables are related directly and don't need the third table, so it's perhaps something of an academic exercise only, not something you'd do in the real world.
If you just want to mix every person with every city you can do:
INSERT INTO employee(personname,cityname)
SELECT pname, cname
FROM
person p
CROSS JOIN city c
But be warned: two people and two cities will cause 4 rows to be inserted, and so on (20 people and 40 cities gives 800 rows; fairly useless imho).
However, I trust that the general pattern shown first will suffice for your learning: write a SELECT that shows the data you want to insert, then simply write INSERT INTO table(columns) above it. The number of columns inserted into must match the number of columns selected. Don't forget that you can select fixed values if no column from the query has the info (INSERT INTO X(p,c,age) SELECT personname, cityname, 23 FROM ...)
The following will work for you:
INSERT INTO employee
SELECT pname, cname FROM person, city
WHERE pname='personOne' AND cname='cityOne';
This is a cross join producing a cartesian product of the two tables (since there is nothing to link the two). It reads slightly oddly, given that you could just as easily have inserted the values directly. But I assume this is because it is a learning exercise.
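For comparison, the direct insert alluded to would simply be:
INSERT INTO employee (ename, ecity) VALUES ('personOne', 'cityOne');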
Please note that there is a typo in your create employee. You are missing a comma before the primary key.

How to handle a range within a data field

I have a set of data with ranges of numbers that are saved to the field itself. So for example, in the age column there are entries like "60-64" and "65+", and in the income field "30,000-40,000". Is there a way to query these fields and treat them as number ranges? So a query for 52500 would match the "50,000-60,000" income range?
Preprocessing the input is my current top idea, where I just map the user input to all possible values for these fields before I query the database. But I was wondering if there is a better way.
Assume that I cannot modify the database or create a new database at all.
There is no easy way with SQLite that I know of, and you certainly would be better off restructuring all your range columns into two columns each, range_start and range_end.
If your ranges are fixed ranges, you can get the minimum / maximum from a separate table:
create table age_ranges (
name varchar(16) unique not null
, range_start integer unique not null
, range_end integer unique not null
);
insert into age_ranges (name, range_start,range_end) values ('60-64',60,64);
insert into age_ranges (name, range_start,range_end) values ('65+',65,999);
create table participant (
name varchar(16) unique not null
, age integer not null
, income integer not null
);
insert into participant (name, age, income) values ('Joe Blow', 65, 900);
insert into participant (name, age, income) values ('Jane Doe', 61, 1900);
create table question (
question varchar(64) not null
, relevant_age varchar(32) not null
);
insert into question (question,relevant_age) values('What is your favourite non-beige color?', '65+');
insert into question (question,relevant_age) values('What is your favourite car?', '60-64');
select
p.name,
q.question,
q.relevant_age
from participant p
join age_ranges r on (r.range_start <= p.age and p.age <= r.range_end)
join question q on q.relevant_age = r.name
Alternatively, you can also try to parse the range start and range end out using string functions such as substr() and instr(), but the performance will likely be bad.
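For illustration, a rough sketch of that parsing approach, assuming a raw table survey_data with a text column income_range holding values like '30,000-40,000' (the table and column names here are made up):
select *
from survey_data
where cast(replace(substr(income_range, 1, instr(income_range, '-') - 1), ',', '') as integer) <= 52500
and cast(replace(substr(income_range, instr(income_range, '-') + 1), ',', '') as integer) >= 52500;
Open-ended values such as "65+" would need extra handling, which is part of why the lookup-table approach above is easier to keep correct.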

Ambiguity with column reference [duplicate]

I am trying to run some simple code as follows:
Create Table weather (
city varchar(80),
temp_lo int,
temp_hi int,
prcp real,
date date
);
Insert Into weather Values ('A', -5, 40, 25, '2018-01-10');
Insert Into weather Values ('B', 5, 45, 15, '2018-02-10');
Create Table cities (
city varchar(80),
location point
);
Insert Into cities Values ('A', '(12,10)');
Insert Into cities Values ('B', '(6,4)');
Insert Into cities Values ('C', '(18,13)');
Select * From cities, weather Where city = 'A'
But what I get is
ERROR: column reference "city" is ambiguous.
What is wrong with my code?
If I were you I'd model things slightly differently.
To normalise things a little, we'll start with the cities table and make a few changes:
create table cities (
city_id integer primary key,
city_name varchar(100),
location point
);
Note that I've used an integer as the ID and primary key of the table, and stored the name of the city separately. This gives you a nice, easy-to-maintain lookup table. By using an integer as the primary key, we'll also use less space in the weather table when storing data.
Create Table weather (
city_id integer,
temp_lo int,
temp_hi int,
prcp real,
record_date date
);
Note that I'm storing the id of the city rather than the name. Also, I've renamed date as it's not a good idea to name columns after SQL reserved words.
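As an optional extra (not part of the original example), you could also enforce the relationship with a foreign key so the database rejects weather rows that reference unknown cities; the constraint name below is just an example:
alter table weather add constraint weather_city_fk
foreign key (city_id) references cities (city_id);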
Ensure that we use IDs in the test data:
Insert Into weather Values (1, -5, 40, 25, '2018-01-10');
Insert Into weather Values (2, 5, 45, 15, '2018-02-10');
Insert Into cities Values (1,'A', '(12,10)');
Insert Into cities Values (2,'B', '(6,4)');
Insert Into cities Values (3,'C', '(18,13)');
Your old query:
Select * From cities, weather Where city = 'A'
The name was ambiguous because both tables have a city column, and the database engine doesn't know which city you mean (it doesn't automatically know if it needs to use cities.city or weather.city). The query also performs a cartesian product, as you have not joined the tables together.
Using the changes I have made above, you'd require something like:
Select *
From cities, weather
Where cities.city_id = weather.city_id
and city_name = 'A';
or, using newer join syntax:
Select *
From cities
join weather on cities.city_id = weather.city_id
Where city_name = 'A';
The two queries are functionally equivalent - these days most people prefer the 2nd query, as it can prevent mistakes (eg: forgetting to actually join in the where clause).
Both tables, cities and weather, have a column called city. In your WHERE clause you filter on city = 'A', but which table's city is it referring to?
You can tell the engine which one you want to filter on by preceding the column with its table name:
Select * From cities, weather Where cities.city = 'A'
You can also refer to tables with aliases:
Select *
From cities AS C, weather AS W
Where C.city = 'A'
But most importantly, make sure that you join the tables together, unless you want every record of one table to be matched with every record of the other without any criterion (a cartesian product). You can join them with an explicit INNER JOIN:
Select
*
From
cities AS C
INNER JOIN weather AS W ON C.city = W.city
Where
C.city = 'A'
In the example you mention, this query is used:
SELECT *
FROM weather, cities
WHERE city = name;
But there, the cities table has a name column (instead of the city column you used). So that WHERE clause links the weather and cities tables together: city is a weather column and name is a cities column, and there is no ambiguity because the two columns are named differently.

Conditionally insert multiple data rows into multiple Postgres tables

I have multiple rows of data. And this is just some made up data in order to make an easy example.
The data has name, age and location.
I want to insert the data into two tables, persons and locations, where locations has a FK to persons.
Nothing should be inserted if there already is a person with that name, or if the age is below 18.
I need to use COPY (in my real-world example I'm using Npgsql for .NET, and I think that's the fastest way of inserting a lot of data).
So I'll do the following in a transaction (not 100% sure on the syntax, not on my computer right now):
-- Create a temp table
DROP TABLE IF EXISTS tmp_x;
CREATE TEMPORARY TABLE tmp_x
(
name TEXT,
age INTEGER,
location TEXT
);
-- Read the data into the temp table
COPY tmp_x (name, age, location) FROM STDIN (FORMAT BINARY);
-- Conditionally insert the data into persons
WITH insertedPersons AS (
INSERT INTO persons (name, age)
SELECT tmp.name, tmp.age
FROM tmp_x tmp
LEFT JOIN persons p ON p.name = tmp.name
WHERE
p.name IS NULL
AND tmp.age >= 18
RETURNING id, name
)
-- Use the generated ids to insert the relational data into locations
INSERT INTO locations (personid, location)
SELECT ip.id, tmp.location
FROM tmp_x tmp
INNER JOIN insertedPersons ip ON ip.name = tmp.name;
DROP TABLE tmp_x;
Is there a better/easier/more efficient way to do this?
Is there a better way to "link" the inserts than INNER JOIN insertedPersons ip ON ip.name = tmp.name? What if name wasn't unique? Can I update tmp_x with the new person ids and use that?

Is it possible to report on 2 tables without using a subquery?

You have one table against which you wish to count the number of items in two different tables. In this example I used buildings, men and women
DROP TABLE IF EXISTS building;
DROP TABLE IF EXISTS men;
DROP TABLE IF EXISTS women;
CREATE TABLE building(name VARCHAR(255));
CREATE TABLE men(building VARCHAR(255), name VARCHAR(255));
CREATE TABLE women(building VARCHAR(255), name VARCHAR(255));
INSERT INTO building VALUES('building1');
INSERT INTO building VALUES('building2');
INSERT INTO building VALUES('building3');
INSERT INTO men VALUES('building1', 'andy');
INSERT INTO men VALUES('building1', 'barry');
INSERT INTO men VALUES('building2', 'calvin');
INSERT INTO men VALUES(null, 'dwain');
INSERT INTO women VALUES('building1', 'alice');
INSERT INTO women VALUES('building1', 'betty');
INSERT INTO women VALUES(null, 'casandra');
select
r1.building_name,
r1.men,
GROUP_CONCAT(women.name) as women,
COUNT(women.name) + r1.men_count as count
from
(select
building.name as building_name,
GROUP_CONCAT(men.name) as men,
COUNT(men.name) as men_count
from
building
left join
men on building.name=men.building
GROUP BY building.name) as r1
left join
women on r1.building_name=women.building
GROUP BY r1.building_name;
Might there be another way? The problem with the above approach is that the columns of the two tables in the subquery are hidden and need to be redeclared in the outer query. Doing it in two separate set operations creates an asymmetry where there is none. We could equally have joined to the women first and then the men.
In SQL Server, I would just join two subqueries with two left joins - if symmetry is what you are looking for:
SELECT *
FROM building
LEFT JOIN (SELECT building, etc. FROM men GROUP BY etc.) AS men_summary
ON building.name = men_summary.building_name
LEFT JOIN (SELECT building, etc. FROM women GROUP BY etc.) AS women_summary
ON building.name = women_summary.building_name
I tend to use common table expressions declared first instead of subqueries - it's far more readable (CTEs aren't available in older MySQL versions, but then GROUP_CONCAT isn't ANSI either).
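Filled in with the schema from the question, the CTE version might look something like this (a sketch; GROUP_CONCAT is kept from the question, so this is MySQL 8+ syntax - on SQL Server you would use STRING_AGG instead):
with men_summary as (
select building, GROUP_CONCAT(name) as men, COUNT(name) as men_count
from men
GROUP BY building
),
women_summary as (
select building, GROUP_CONCAT(name) as women, COUNT(name) as women_count
from women
GROUP BY building
)
select
b.name as building_name,
m.men,
w.women,
COALESCE(m.men_count, 0) + COALESCE(w.women_count, 0) as count
from building b
left join men_summary m on b.name = m.building
left join women_summary w on b.name = w.building;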
Use Union to combine the data from the men/women tables
select building, [name] as menname, null as womenname from men
union
select building, null as menname, [name] as womenname from women
You now have a single 'table', admittedly in a subquery, against which you can join, count or whatever.
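For example, one way to use that union as a derived table might be (a sketch; the aliases are mine, and UNION ALL is used so that two same-named residents of one building are not collapsed into a single row):
select
b.name as building_name,
GROUP_CONCAT(u.menname) as men,
GROUP_CONCAT(u.womenname) as women,
COUNT(u.building) as count
from building b
left join
(select building, name as menname, null as womenname from men
union all
select building, null as menname, name as womenname from women) as u
on b.name = u.building
GROUP BY b.name;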
BTW I can see why Cas[s]andra is out in the cold as no-one believes her, but what about dwain, is he similarly cursed by the gods?