SQL Server 2005 Full Text Search over multiple tables and columns

SQL Server 2005 Full Text Search over multiple tables and columns - sql

I'm looking for a good solution to use the containstable feature of the SQL Serve r2005 effectivly. Currently I have, e.g. an Employee and an Address table.
-Employee
Id
Name
-Address
Id
Street
City
EmployeeId
Now the user can enter search terms in only one textbox and I want this terms to be split and search with an "AND" operator. FREETEXTTABLE seems to work with "OR" automatically.
Now lets say the user entered "John Hamburg". This means he wants to find John in Hamburg.
So this is "John AND Hamburg".
So the following will contain no results since CONTAINSTABLE checks every column for "John AND Hamburg".
So my question is: What is the best way to perform a fulltext search with AND operators across multiple columns/tables?
SELECT *
FROM Employee emp
INNER JOIN
CONTAINSTABLE(Employee, *, '(JOHN AND Hamburg)', 1000) AS keyTblSp
ON sp.ServiceProviderId = keyTblSp.[KEY]
LEFT OUTER JOIN [Address] addr ON addr.EmployeeId = emp.EmployeeId
UNION ALL
SELECT *
FROM Employee emp
LEFT OUTER JOIN [Address] addr ON addr.EmployeeId = emp.EmployeeId
INNER JOIN
CONTAINSTABLE([Address], *, '(JOHN AND Hamburg)', 1000) AS keyTblAddr
ON addr.AddressId = keyTblAddr.[KEY]
...

This is more of a syntax problem. How do you divine the user's intent with just one input box?
Are they looking for "John Hamburg" the person?
Are they looking for "John Hamburg Street"?
Are they looking for "John" who lives on "Hamburg Street" in Springfield?
Are they looking for "John" who lives in the city of "Hamburg"?
Without knowing the user's intent, the best you can hope for is to OR the terms, and take the highest ranking hits.
Otherwise, you need to program in a ton of logic, depending on the number of words passed in:
2 words:
Search Employee data for term 1, Search Employee data for term 2, Search Address data for term 1, Search address data for term 2. Merge results by term, order by most hits.
3 words:
Search Employee data for term 1, Search Employee data for term 2, Search employee data for term 3, Search Address data for term 1, Search address data for term 2, Search address data for term 3. Merge results by term, order by most hits.
etc...
I guess I would redesign the GUI to separate the input into Name and Address, at a minimum. If that is not possible, enforce a syntax rule to the effect "First words will be considered a name until a comma appears, any words after that will be considered addresses"
EDIT:
Your best bet is still OR the terms, and take the highest ranking hits. Here's an example of that, and an example why this is not ideal without some pre-processing of the input to divine the user's intent:
insert into Employee (id, [name]) values (1, 'John Hamburg')
insert into Employee (id, [name]) values (2, 'John Smith')
insert into Employee (id, [name]) values (3, 'Bob Hamburg')
insert into Employee (id, [name]) values (4, 'Bob Smith')
insert into Employee (id, [name]) values (5, 'John Doe')
insert into Address (id, street, city, employeeid) values (1, 'Main St.', 'Springville', 1)
insert into Address (id, street, city, employeeid) values (2, 'Hamburg St.', 'Springville', 2)
insert into Address (id, street, city, employeeid) values (3, 'St. John Ave.', 'Springville', 3)
insert into Address (id, street, city, employeeid) values (4, '5th Ave.', 'Hamburg', 4)
insert into Address (id, street, city, employeeid) values (5, 'Oak Lane', 'Hamburg', 5)
Now since we don't know what keywords will apply to what table, we have to assume they could apply to either table, so we have to OR the terms against each table, UNION the results, Aggregate them, and compute the highest rank.
SELECT Id, [Name], Street, City, SUM([Rank])
FROM
(
SELECT emp.Id, [Name], Street, City, [Rank]
FROM Employee emp
JOIN [Address] addr ON emp.Id = addr.EmployeeId
JOIN CONTAINSTABLE(Employee, *, 'JOHN OR Hamburg') AS keyTblEmp ON emp.Id = keyTblEmp.[KEY]
UNION ALL
SELECT emp.Id, [Name], Street, City, [Rank]
FROM Employee emp
JOIN [Address] addr ON emp.Id = addr.EmployeeId
JOIN CONTAINSTABLE([Address], *, 'JOHN OR Hamburg') AS keyTblAdd ON addr.Id = keyTblAdd.[KEY]
) as tmp
GROUP BY Id, [Name], Street, City
ORDER BY SUM([Rank]) DESC
This is less than ideal, here's what you get for the example (in your case, you would have wanted John Doe from Hamburg to show up first):
Id Name Street City Rank
2 John Smith Hamburg St. Springville 112
3 Bob Hamburg St. John Ave. Springville 112
5 John Doe Oak Lane Hamburg 96
1 John Hamburg Main St. Springville 48
4 Bob Smith 5th Ave. Hamburg 48
But that is the best you can do without parsing the input before submitting it to SQL to make a "best guess" at what the user wants.

I had the same problem. Here is my solution, which worked for my case:
I created a view that returns the columns that I want. I added another extra column which aggregates all the columns I want to search among. So, in this case the view would be like
SELECT emp.*, addr.*, ISNULL(emp.Name,'') + ' ' + ISNULL(addr.City, '') AS SearchResult
FROM Employee emp
LEFT OUTER JOIN [Address] addr ON addr.EmployeeId = emp.EmployeeId
After this I created a full-text index on SearchResult column. Then, I search on this column
SELECT *
FROM vEmpAddr ea
INNER JOIN CONTAINSTABLE(vEmpAddr, *, 'John AND Hamburg') a ON ea.ID = a.[Key]

Related

How to self join only a subset of rows in PostgreSQL?

Given the following table:
CREATE TABLE people (
name TEXT PRIMARY KEY,
age INT NOT NULL
);
INSERT INTO people VALUES
('Lisa', 30),
('Marta', 27),
('John', 32),
('Sam', 41),
('Alex', 12),
('Aristides',43),
('Cindi', 1)
;
I am using a self join to compare each value of a specific column with all the other values of the same column. My query looks something like this:
SELECT DISTINCT A.name as child
FROM people A, people B
WHERE A.age + 16 < B.age;
This query aims to spot potential sons/daughters based on age difference. More specifically, my goal is to identify the set of people that may have stayed in the same house as one of their parents (ordered by name), assuming that there must be an age difference of at least 16 years between a child and their parents.
Now I would like to combine this kind of logic with the information that is in another table.
The other table looks something like that:
CREATE TABLE houses (
house_name TEXT NOT NULL,
house_member TEXT NOT NULL REFERENCES people(name)
);
INSERT INTO houses VALUES
('house Smith', 'Lisa'),
('house Smith', 'Marta'),
('house Smith', 'John'),
('house Doe', 'Lisa'),
('house Doe', 'Marta'),
('house Doe', 'Alex'),
('house Doe', 'Sam'),
('house McKenny', 'Aristides'),
('house McKenny', 'John'),
('house McKenny', 'Cindi')
;
The two tables can be joined ON houses.house_member = people.name.
More specifically I would like to spot the children only within the same house. It does not make sense to compare the age of each person with the age of all the others, but instead it would be more efficient to compare the age of each person with all the other people in the same house.
My idea is to perform the self join from above but only within a PARTITION BY household_name. However, I don't think this is a good idea since I do not have an aggregate function. Same applies for GROUP BY statements as well. What could I do here?
The expected output should be the following, ordered by house_member:
house_member
Alex
Cindi
For simplicity I have created a fiddle.

At first join two tables to build one table that has all three bits of info: house_name, house_member, age.
And then join it with itself just as you did originally and add one extra filter to look only at the same households.
WITH
CTE_All
AS
(
SELECT
houses.house_name
,houses.house_member
,people.age
FROM
houses
INNER JOIN people ON people.name = houses.house_member
)
SELECT DISTINCT
Children.house_name
,Children.house_member AS child_name
FROM
CTE_All AS Children
INNER JOIN CTE_All AS Parents
ON Children.age + 16 < Parents.age
-- this is our age difference
AND Children.house_name = Parents.house_name
-- within the same house
;
All this is one single query. You don't have to use CTE, you can inline it as a subquery, but it is more readable with CTE.
Result
house_name | child_name
:------------ | :---------
house Doe | Alex
house McKenny | Cindi

SQL help selecting customer information from two tables

I have an assignment where I had to create two tables named Customer and Address. These tables are located within a database called HandsOnOne.
Customer has columns titled:
CustomerID, CustomerName, CustomerAddressID
Address has columns titled:
AddressID, Street, City, State, ZipCode
There is a foreign key relationship in which AddressID in the Address Table is the primary key and CustomerAddressID in the Customer Table is the foreign key.
I used the following code to insert values into each table:
USE HandsOnOne;
INSERT INTO Address (AddressID, Street, City, State, ZipCode)
VALUES (1, '2400 Broadway Drive', 'Missoula', 'MT', '59802'),
(2, '320 21st Street', 'Billings', 'MT', '59101'),
(3, '439 Skyline Blvd', 'Denver', 'CO', '80002'),
(4, '56 Park Avenue', 'New York', 'NY', '10001');
USE HandsOnOne;
INSERT INTO Customer (CustomerID, CustomerName, CustomerAddressID)
VALUES (1, 'Western Supply Company', 1),
(2, 'Nick Harper', 3),
(3, 'Alice Harper', 3),
(4, 'Abacus Consulting', 4);
From there, I have to sort based on certain specifications. The first specification was to list all customers with city and state sorted ascending by ZipCode then ascending by CustomerName.
Here is the code I used for this part:
USE HandsOnOne;
SELECT CustomerName, City, State
FROM Customer, Address
ORDER BY ZipCode ASC, CustomerName ASC;
When I execute this code, my return is 16 items instead of 4. Somehow, each customer is being assigned each address, giving me 4 items at each address.
The next question asks me to list the Street, City, State and ZipCode of all address without a customer associated with them. This query should return the address of 320 21st St Billings, MT 59101 because its AddressID value is 2 and there is no CustomerAddressID value of 2 in the Address table. However, I do not receive any results when I execute this query.
I have verified that there is a foreign key relationship. What am I doing wrong?

You aren't restricting your join. Also, the implicit join is no longer supported in most DBMS (especially recent versions of SQL server), so an explicit Inner Join is best.
Try this:
select CustomerName, City, State
from Customer
inner join Address
on AddressID = CustomerAddressID -- I assume this is the foreign key
order by ZipCode asc, CustomerName asc
Ah, question 2 (missed that due to formatting)
select A1.*
from Address A1
left join Customer C2
on A1.AddressID = C2.CustomerAddressID
where C2.CustomerAddressID is null

For your 2nd question: use NOT EXISTS to get all addresses without a customer. See below:
SELECT *
FROM Address A
WHERE NOT EXISTS (SELECT 1
FROM Customer C
WHERE C.CustomerAddressID = A.AddressID)

ms-access 2010: count duplicate names per household address

I am currently working with a spreadsheet in MS Access 2010 which contains about 130k rows of information about people who voted in a local election recently. Each row has their residential information (street name, number, postcode etc.) and personal information (title, surname, forename, middle name, DOB etc.). Each row represents an individual person rather than a household (therefore in many cases the same residential address appears more than once as more than one person resides in a particular household).
What I want to achieve is basically to create a new field in this dataset called 'count'. I want this field to give me a count of how many different surnames reside at a single address.
Is there an SQL script that will allow me to do this in Access 2010?
+------------------+----------+-------+---------+----------+-------------+
| PROPERTYADDRESS1 | POSTCODE | TITLE | SURNAME | FORENAME | MIDDLE_NAME |
+------------------+----------+-------+---------+----------+-------------+
FAKEADDRESS1 EEE 5GG MR BLOGGS JOE N
FAKEADDRESS2 EEE 5BB MRS BLOGGS SUZANNE P
FAKEADDRESS3 EEE 5RG MS SMITH PAULINE S
FAKEADDRESS4 EEE 4BV DR JONES ANNE D
FAKEADDRESS5 EEE 3AS MR TAYLOR STUART A
The following syntax has got me close so far:
SELECT COUNT(electoral.SURNAME)
FROM electoral
GROUP BY electoral.UPRN
However, instead of returning me all 130k odd rows, it only returns me around 67k rows. Is there anything I can do to the syntax to achieve the same result, but just returning every single row?
Any help is greatly appreciated!
Thanks

You could use something like this:
select *,
count(surname) over (partition by householdName)
from myTable
If you have only one column which contains the name,
ex: Rob Adams
then you can do this to have all the surnames in a different column so it will be easier in the select:
SELECT LEFT('HELLO WORLD',CHARINDEX(' ','HELLO WORLD')-1)
in our example:
select right (surmane, charindex (' ',surname)-1) as surname
example on how to use charindex, left and right here:
http://social.technet.microsoft.com/wiki/contents/articles/17948.t-sql-right-left-substring-and-charindex-functions.aspx
if there are any questions, leave a comment.
EDIT: I edited the query, had a syntax error, please try it again. This works on sql server.
here is an example:
create table #temp (id int, PropertyAddress varchar(50), surname varchar(50), forname varchar(50))
insert into #temp values
(1, 'hiddenBase', 'Adamns' , 'Kara' ),
(2, 'hiddenBase', 'Adamns' , 'Anne' ),
(3, 'hiddenBase', 'Adamns' , 'John' ),
(4, 'QueensResidence', 'Queen' , 'Oliver' ),
(5, 'QueensResidence', 'Queen' , 'Moira' ),
(6, 'superSecretBase', 'Diggle' , 'John' ),
(7, 'NandaParbat', 'Merlin' , 'Malcom' )
select * from #temp
select *,
count (surname) over (partition by PropertyAddress) as CountMembers
from #temp
gives:
1 hiddenBase Adamns Kara 3
2 hiddenBase Adamns Anne 3
3 hiddenBase Adamns John 3
7 NandaParbat Merlin Malcom 1
4 QueensResidence Queen Oliver 2
5 QueensResidence Queen Moira 2
6 superSecretBase Diggle John 1
Your query should look like this:
select *,
count (SURNAME) over (partition by PropertyAddress) as CountFamilyMembers
from electoral
EDIT
If over partition by isn't supported, then I guess you can get to your desired result by using group by
select *,
count (SURNAME) over (partition by PropertyAddress) as CountFamilyMembers
from electoral
group by -- put here the fields in the select (one by one), however you can't write group by *

GROUP BY creates an aggregate query, so it's by design that you get fewer records (one per UPRN).
To get the count for each row in the original table, you can join the table with the aggregate query:
SELECT electoral.*, elCount.NumberOfPeople
FROM electoral
INNER JOIN
(
SELECT UPRN, COUNT(*) AS NumberOfPeople
FROM electoral
GROUP BY UPRN
) AS elCount
ON electoral.UPRN = elCount.UPRN

Given the update I want to post another answer. Try it like this:
create table #temp2 ( PropertyAddress1 varchar(50), POSTCODE varchar(20), TITLE varchar (20),
surname varchar(50), FORENAME varchar(50), MIDDLE_NAME varchar (50) )
insert into #temp2 values
('FAKEADDRESS1', 'EEE 5GG', 'MR', 'BLOGGS', 'JOE', 'N'),
('FAKEADDRESS1', 'EEE 5BB', 'MRS', 'BLOGGS', 'SUZANNE', 'P'),
('FAKEADDRESS2', 'EEE 5RG', 'MS', 'SMITH', 'PAULINE', 'S'),
('FAKEADDRESS3', 'EEE 4BV', 'DR', 'JONES', 'ANNE', 'D'),
('FAKEADDRESS4', 'EEE 3AS', 'MR', 'TAYLOR', 'STUART', 'A')
select PropertyAddress1, surname,count (#temp2.surname) as CountADD
into #countTemp
from #temp2
group by PropertyAddress1, surname
select * from #temp2 t2
left join #countTemp ct
on t2.PropertyAddress1 = ct.PropertyAddress1 and t2.surname = ct.surname
This yields:
PropertyAddress1 POSTCODE TITLE surname FORENAME MIDDLE_NAME PropertyAddress1 surname CountADD
FAKEADDRESS1 EEE 5GG MR BLOGGS JOE N FAKEADDRESS1 BLOGGS 2
FAKEADDRESS1 EEE 5BB MRS BLOGGS SUZANNE P FAKEADDRESS1 BLOGGS 2
FAKEADDRESS2 EEE 5RG MS SMITH PAULINE S FAKEADDRESS2 SMITH 1
FAKEADDRESS3 EEE 4BV DR JONES ANNE D FAKEADDRESS3 JONES 1
FAKEADDRESS4 EEE 3AS MR TAYLOR STUART A FAKEADDRESS4 TAYLOR 1

How to detect duplicate records with sub table records

Let's say I'm creating an address book in which the main table contains the basic contact information and a phone number sub table -
Contact
===============
Id [PK]
Name
PhoneNumber
===============
Id [PK]
Contact_Id [FK]
Number
So, a Contact record may have zero or more related records in the PhoneNumber table. There is no constraint on uniqueness of any column other than the primary keys. In fact, this must be true because:
Two contacts having different names may share a phone number, and
Two contacts may have the same name but different phone numbers.
I want to import a large dataset which may contain duplicate records into my database and then filter out the duplicates using SQL. The rules for identifying duplicate records are simple ... they must share the same name and the same number of phone records having the same content.
Of course, this works quite effectively for selecting duplicates from the Contact table but doesn't help me to detect actual duplicates given my rules:
SELECT * FROM Contact
WHERE EXISTS
(SELECT 'x' FROM Contact t2
WHERE t2.Name = Contact.Name AND
t2.Id > Contact.Id);
It seems as if what I want is a logical extension to what I already have, but I must be overlooking it. Any help?
Thanks!

In my question, I created a greatly simplified schema that reflects the real-world problem I'm solving. Przemyslaw's answer is indeed a correct one and did what I was asking both with the sample schema and, when extended, with the real one.
But, after doing some experiments with the real schema and a larger (~10k records) dataset, I found that performance was an issue. I don't claim to be an index guru, but I wasn't able to find a better combination of indices than what was already in the schema.
So, I came up with an alternate solution which fills the same requirements but executes in a small fraction (< 10%) of the time, at least using SQLite3 - my production engine. In hopes that it may assist someone else, I'll offer it as an alternative answer to my question.
DROP TABLE IF EXISTS Contact;
DROP TABLE IF EXISTS PhoneNumber;
CREATE TABLE Contact (
Id INTEGER PRIMARY KEY,
Name TEXT
);
CREATE TABLE PhoneNumber (
Id INTEGER PRIMARY KEY,
Contact_Id INTEGER REFERENCES Contact (Id) ON UPDATE CASCADE ON DELETE CASCADE,
Number TEXT
);
INSERT INTO Contact (Id, Name) VALUES
(1, 'John Smith'),
(2, 'John Smith'),
(3, 'John Smith'),
(4, 'Jane Smith'),
(5, 'Bob Smith'),
(6, 'Bob Smith');
INSERT INTO PhoneNumber (Id, Contact_Id, Number) VALUES
(1, 1, '555-1212'),
(2, 1, '222-1515'),
(3, 2, '222-1515'),
(4, 2, '555-1212'),
(5, 3, '111-2525'),
(6, 4, '111-2525');
COMMIT;
SELECT *
FROM Contact c1
WHERE EXISTS (
SELECT 1
FROM Contact c2
WHERE c2.Id > c1.Id
AND c2.Name = c1.Name
AND (SELECT COUNT(*) FROM PhoneNumber WHERE Contact_Id = c2.Id) = (SELECT COUNT(*) FROM PhoneNumber WHERE Contact_Id = c1.Id)
AND (
SELECT COUNT(*)
FROM PhoneNumber p1
WHERE p1.Contact_Id = c2.Id
AND EXISTS (
SELECT 1
FROM PhoneNumber p2
WHERE p2.Contact_Id = c1.Id
AND p2.Number = p1.Number
)
) = (SELECT COUNT(*) FROM PhoneNumber WHERE Contact_Id = c1.Id)
)
;
The results are as expected:
Id Name
====== =============
1 John Smith
5 Bob Smith
Other engines are bound to have differing performance which may be quite acceptable. This solution seems to work quite well with SQLite for this schema.

The author stated the requirement of "two people being the same person" as:
Having the same name and
Having the same number of phone numbers and all of which are the same.
So the problem is a bit more complex than it seems (or maybe I just overthought it).
Sample data and (an ugly one, I know, but the general idea is there) a sample query which I tested on below test data which seems to be working correctly (I'm using Oracle 11g R2):
CREATE TABLE contact (
id NUMBER PRIMARY KEY,
name VARCHAR2(40))
;
CREATE TABLE phone_number (
id NUMBER PRIMARY KEY,
contact_id REFERENCES contact (id),
phone VARCHAR2(10)
);
INSERT INTO contact (id, name) VALUES (1, 'John');
INSERT INTO contact (id, name) VALUES (2, 'John');
INSERT INTO contact (id, name) VALUES (3, 'Peter');
INSERT INTO contact (id, name) VALUES (4, 'Peter');
INSERT INTO contact (id, name) VALUES (5, 'Mike');
INSERT INTO contact (id, name) VALUES (6, 'Mike');
INSERT INTO contact (id, name) VALUES (7, 'Mike');
INSERT INTO phone_number (id, contact_id, phone) VALUES (1, 1, '123'); -- John having number 123
INSERT INTO phone_number (id, contact_id, phone) VALUES (2, 1, '456'); -- John having number 456
INSERT INTO phone_number (id, contact_id, phone) VALUES (3, 2, '123'); -- John the second having number 123
INSERT INTO phone_number (id, contact_id, phone) VALUES (4, 2, '456'); -- John the second having number 456
INSERT INTO phone_number (id, contact_id, phone) VALUES (5, 3, '123'); -- Peter having number 123
INSERT INTO phone_number (id, contact_id, phone) VALUES (6, 3, '456'); -- Peter having number 123
INSERT INTO phone_number (id, contact_id, phone) VALUES (7, 3, '789'); -- Peter having number 123
INSERT INTO phone_number (id, contact_id, phone) VALUES (8, 4, '456'); -- Peter the second having number 456
INSERT INTO phone_number (id, contact_id, phone) VALUES (9, 5, '123'); -- Mike having number 456
INSERT INTO phone_number (id, contact_id, phone) VALUES (10, 5, '456'); -- Mike having number 456
INSERT INTO phone_number (id, contact_id, phone) VALUES (11, 6, '123'); -- Mike the second having number 456
INSERT INTO phone_number (id, contact_id, phone) VALUES (12, 6, '789'); -- Mike the second having number 456
-- Mike the third having no number
COMMIT;
-- does not meet the requirements described in the question - will return Peter when it should not
SELECT DISTINCT c.name
FROM contact c JOIN phone_number pn ON (pn.contact_id = c.id)
GROUP BY name, phone_number
HAVING COUNT(c.id) > 1
;
-- returns correct results for provided test data
-- take all people that have a namesake in contact table and
-- take all this person's phone numbers that this person's namesake also has
-- finally (outer query) check that the number of both persons' phone numbers is the same and
-- the number of the same phone numbers is equal to the number of (either) person's phone numbers
SELECT c1_id, name
FROM (
SELECT c1.id AS c1_id, c1.name, c2.id AS c2_id, COUNT(1) AS cnt
FROM contact c1
JOIN contact c2 ON (c2.id != c1.id AND c2.name = c1.name)
JOIN phone_number pn ON (pn.contact_id = c1.id)
WHERE
EXISTS (SELECT 1
FROM phone_number
WHERE contact_id = c2.id
AND phone = pn.phone)
GROUP BY c1.id, c1.name, c2.id
)
WHERE cnt = (SELECT COUNT(1) FROM phone_number WHERE contact_id = c1_id)
AND (SELECT COUNT(1) FROM phone_number WHERE contact_id = c1_id) = (SELECT COUNT(1) FROM phone_number WHERE contact_id = c2_id)
;
-- cleanup
DROP TABLE phone_number;
DROP TABLE contact;
Check at SQL Fiddle: http://www.sqlfiddle.com/#!4/36cdf/1
Edited
Answer to author's comment: Of course I didn't take that into account... here's a revised solution:
-- new test data
INSERT INTO contact (id, name) VALUES (8, 'Jane');
INSERT INTO contact (id, name) VALUES (9, 'Jane');
SELECT c1_id, name
FROM (
SELECT c1.id AS c1_id, c1.name, c2.id AS c2_id, COUNT(1) AS cnt
FROM contact c1
JOIN contact c2 ON (c2.id != c1.id AND c2.name = c1.name)
LEFT JOIN phone_number pn ON (pn.contact_id = c1.id)
WHERE pn.contact_id IS NULL
OR EXISTS (SELECT 1
FROM phone_number
WHERE contact_id = c2.id
AND phone = pn.phone)
GROUP BY c1.id, c1.name, c2.id
)
WHERE (SELECT COUNT(1) FROM phone_number WHERE contact_id = c1_id) IN (0, cnt)
AND (SELECT COUNT(1) FROM phone_number WHERE contact_id = c1_id) = (SELECT COUNT(1) FROM phone_number WHERE contact_id = c2_id)
;
We allow a situation when there are no phone numbers (LEFT JOIN) and in outer query we now compare the number of person's phone numbers - it must either be equal to 0, or the number returned from the inner query.

The keyword "having" is your friend. The generic use is:
select field1, field2, count(*) records
from whereever
where whatever
group by field1, field2
having records > 1
Whether or not you can use the alias in the having clause depends on the database engine. You should be able to apply this basic principle to your situation.

Inserting values into tables Oracle SQL

I'm trying to insert values into an 'Employee' table in Oracle SQL. I have a question regarding inputting values determined by a foreign key:
My employees have 3 attributes that are determined by foreign keys: State, Position, & Manager. I am using an INSERT INTO statement to insert the values and manually typing in the data. Do I need to physically look up each reference to input the data or is there a command that I can use? E.g.
INSERT INTO Employee
(emp_id, emp_name, emp_address, emp_state, emp_position, emp_manager)
VALUES
(001, "John Doe", "1 River Walk, Green Street", 3, 5, 1000)
This should populate the employee table with (John Doe, 1 River Walk, Green Street, New York, Sales Executive, Barry Green). New York is state_id=3 in the State table; Sales executive is position_id=5 in the positions table; and Barry Green is manager_id=1000 in the manager table.
Is there a way in which I can input the text values of the referenced tables, so that Oracle will recognise the text and match it with the relevant ID? I hope this question makes sense will be happy to clarify anything.
Thanks!

You can expend the following function in order to pull out more parameters from the DB before the insert:
--
-- insert_employee (Function)
--
CREATE OR REPLACE FUNCTION insert_employee(p_emp_id in number, p_emp_name in varchar2, p_emp_address in varchar2, p_emp_state in varchar2, p_emp_position in varchar2, p_emp_manager in varchar2)
RETURN VARCHAR2 AS
p_state_id varchar2(30) := '';
BEGIN
select state_id
into p_state_id
from states where lower(emp_state) = state_name;
INSERT INTO Employee (emp_id, emp_name, emp_address, emp_state, emp_position, emp_manager) VALUES
(p_emp_id, p_emp_name, p_emp_address, p_state_id, p_emp_position, p_emp_manager);
return 'SUCCESS';
EXCEPTION
WHEN others THEN
RETURN 'FAIL';
END;
/

INSERT
INTO Employee
(emp_id, emp_name, emp_address, emp_state, emp_position, emp_manager)
SELECT '001', 'John Doe', '1 River Walk, Green Street', state_id, position_id, manager_id
FROM dual
JOIN state s
ON s.state_name = 'New York'
JOIN positions p
ON p.position_name = 'Sales Executive'
JOIN manager m
ON m.manager_name = 'Barry Green'
Note that but a single spelling mistake (or an extra space) will result in a non-match and nothing will be inserted.

You can insert into a table from a SELECT.
INSERT INTO
Employee (emp_id, emp_name, emp_address, emp_state, emp_position, emp_manager)
SELECT
001,
'John Doe',
'1 River Walk, Green Street',
(SELECT id FROM state WHERE name = 'New York'),
(SELECT id FROM positions WHERE name = 'Sales Executive'),
(SELECT id FROM manager WHERE name = 'Barry Green')
FROM
dual
Or, similarly...
INSERT INTO
Employee (emp_id, emp_name, emp_address, emp_state, emp_position, emp_manager)
SELECT
001,
'John Doe',
'1 River Walk, Green Street',
state.id,
positions.id,
manager.id
FROM
state
CROSS JOIN
positions
CROSS JOIN
manager
WHERE
state.name = 'New York'
AND positions.name = 'Sales Executive'
AND manager.name = 'Barry Green'
Though this one does assume that all the look-ups exist. If, for example, there is no position name 'Sales Executive', nothing would get inserted with this version.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

SQL Server 2005 Full Text Search over multiple tables and columns - sql

Related

How to self join only a subset of rows in PostgreSQL?

SQL help selecting customer information from two tables

ms-access 2010: count duplicate names per household address

How to detect duplicate records with sub table records

Inserting values into tables Oracle SQL

Categories

Resources