Simple MySQL problem - sql

I'm working on a MySQL database that contains persons. My problem is as follows (simplified to make my point):
I have three tables:
Persons(id int, birthdate date)
PersonsLastNames(id int, lastname varchar(30))
PersonsFirstNames(id int, firstname varchar(30))
The id is the common key. There are separate tables for last names and first names because a single person can have many first names and many last names.
I want to make a query that returns all persons with, let's say, one last name. If I go with
select birthdate, lastname, firstname from Persons, PersonsLastNames,
PersonsFirstNames where Persons.id = PersonsLastNames.id and
Persons.id = PersonsFirstNames.id and lastName = 'Anderson'
I end up with a table like
1/1/1970 Anderson Steven //Person 1
1/1/1970 Anderson David //Still Person 1
2/2/1980 Smith Adam //Person 2
3/3/1990 Taylor Ed //Person 3
When presenting this, I would like to have
1/1/1970 Anderson Steven David
2/2/1980 Smith Adam [possibly null?]
3/3/1990 Taylor Ed [possibly null?]
How do I join the tables to introduce new columns in the result set if needed to hold several first names or last names for one person?

Does your application really need to handle unlimited first/last names per person? I don't know your specific needs, but that seems like it may be a little extreme. Regardless...
Since you can't really have a dynamic number of columns returned, you could do something like this:
SELECT birthdate, lastname, GROUP_CONCAT(firstname SEPARATOR '|') AS firstnames
FROM Persons, PersonsLastNames, PersonsFirstNames
WHERE Persons.id = PersonsLastNames.id
AND Persons.id = PersonsFirstNames.id
GROUP BY Persons.id
This would return one row per person that has a last name (and at least one first name), with the (unlimited) first names separated by a pipe (|) symbol, using the GROUP_CONCAT function.
birthdate lastname firstnames
--- --- ---
1970-01-01 00:00:00 Anderson Steven|David
1980-02-02 00:00:00 Smith Adam
1990-03-03 00:00:00 Taylor Ed
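To try the idea without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite rather than MySQL, so the separator is passed as a second argument, group_concat(firstname, '|'), instead of MySQL's SEPARATOR keyword. The sample rows are taken from the question; the aliases p, l and f and the variable names are only illustrative.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Persons (id INTEGER, birthdate TEXT);
    CREATE TABLE PersonsLastNames (id INTEGER, lastname TEXT);
    CREATE TABLE PersonsFirstNames (id INTEGER, firstname TEXT);
    INSERT INTO Persons VALUES (1, '1970-01-01'), (2, '1980-02-02'), (3, '1990-03-03');
    INSERT INTO PersonsLastNames VALUES (1, 'Anderson'), (2, 'Smith'), (3, 'Taylor');
    INSERT INTO PersonsFirstNames VALUES (1, 'Steven'), (1, 'David'), (2, 'Adam'), (3, 'Ed');
""")

# SQLite takes the separator as a second argument: group_concat(expr, '|').
rows = conn.execute("""
    SELECT p.birthdate, l.lastname, group_concat(f.firstname, '|') AS firstnames
    FROM Persons p
    JOIN PersonsLastNames l ON l.id = p.id
    JOIN PersonsFirstNames f ON f.id = p.id
    GROUP BY p.id, p.birthdate, l.lastname
    ORDER BY p.id
""").fetchall()

# The order of names inside each concatenated string is not guaranteed,
# so split and sort before comparing or displaying.
grouped_names = {lastname: sorted(firstnames.split("|")) for _, lastname, firstnames in rows}

for birthdate, lastname, firstnames in rows:
    print(birthdate, lastname, firstnames)
```

Because the grouping here includes lastname, a person with two last names would come back as two rows, each carrying the full list of first names.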

SQL does not support a dynamic number of columns in the query select-list. You have to define exactly as many columns as you want (notwithstanding the * wildcard).
I recommend that you fetch the multiple names as rows, not columns. Then write some application code to loop over the result set and do whatever you want to do for presenting them.
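As a rough sketch of that application-side loop (in Python, with hard-coded rows standing in for whatever your database driver returns), something like the following would collapse the joined rows into one record per person. The function name collect_names and the dictionary layout are made up for illustration, and the rows are assumed to be (id, birthdate, lastname, firstname) tuples.

```python
def collect_names(rows):
    """Group (id, birthdate, lastname, firstname) rows into one record per person."""
    people = {}
    for person_id, birthdate, lastname, firstname in rows:
        person = people.setdefault(
            person_id, {"birthdate": birthdate, "lastnames": [], "firstnames": []}
        )
        # The join repeats each last name once per first name (and vice versa),
        # so only append values not yet seen for this person.
        if lastname not in person["lastnames"]:
            person["lastnames"].append(lastname)
        if firstname not in person["firstnames"]:
            person["firstnames"].append(firstname)
    return people


# Rows as the plain three-table join from the question would return them.
rows = [
    (1, "1970-01-01", "Anderson", "Steven"),
    (1, "1970-01-01", "Anderson", "David"),
    (2, "1980-02-02", "Smith", "Adam"),
    (3, "1990-03-03", "Taylor", "Ed"),
]

people = collect_names(rows)
for person in people.values():
    print(person["birthdate"], " ".join(person["lastnames"]), " ".join(person["firstnames"]))
```

This keeps the SQL simple and leaves the presentation (how many names to show, in what order) entirely to the application.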

The short answer is, you can't. You'll always have to pick a fixed number of columns. You can, however, greatly improve the syntax of your query by using the ON keyword. For example:
SELECT
birthdate,
firstName,
lastName
FROM
Persons
INNER JOIN PersonsLastNames
ON Persons.id = PersonsLastNames.id
INNER JOIN PersonsFirstNames
ON Persons.id = PersonsFirstNames.id
WHERE
lastName = 'Anderson'
GROUP BY
lastName, firstName
HAVING
count(lastName) = 1
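Of course, my query includes a few extra provisions at the end (the GROUP BY and HAVING clauses), meant to grab only persons with a single last name specified. As written, they keep only lastName/firstName combinations that occur exactly once, so you can always remove them.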
Now, what you CAN do is choose a maximum number of these you'd like to retrieve and do something like this:
SELECT
birthdate,
lastName,
PersonsFirstNames.firstName,
IFNULL(p.firstName,''),
IFNULL(q.firstName,'')
FROM
Persons
INNER JOIN PersonsLastNames
ON Persons.id = PersonsLastNames.id
INNER JOIN PersonsFirstNames
ON Persons.id = PersonsFirstNames.id
LEFT JOIN PersonsFirstNames p
ON Persons.id = p.id
AND p.firstName <> PersonsFirstNames.firstName
LEFT JOIN PersonsFirstNames q
ON Persons.id = q.id
AND q.firstName <> PersonsFirstNames.firstName
AND q.firstName <> p.firstName
GROUP BY
lastName
But I really don't recommend that. The best bet is to retrieve multiple rows, and then iterate over them in whatever application you're using/developing.
Make sure you read up on your JOIN types (Left-vs-Inner), if you're not already familiar, before you start. Hope this helps.
EDIT: You also might want to consider, in that case, a slightly more complex GROUP BY clause, e.g.
GROUP BY
Persons.id, lastName

I think the closest thing you could do is to Group By Person.Id and then do string concatenation. Perhaps this post will help:
How to use GROUP BY to concatenate strings in MySQL?

Related

Find potential duplicate names in database

I have two tables in a SQL Server Database:
Table: People
Columns: ID, FirstName, LastName
Table: StandardNames
Columns: Nickname, StandardName
Sample Nicknames would be Rick, Rich, Richie when StandardName is Richard.
I would like to find duplicate contacts in my People table but replace any of the nicknames with the standard name, i.e., sometimes I have Rich Smith and other times it is Richard Smith in the People table. Is this possible? I realize it might be multiple joins to the same table but can't figure out how to start.
Firstly, you need to determine how many duplicates you have in your People table...
SELECT p.FirstName, COUNT(*)
FROM People AS p
INNER JOIN StandardNames AS sn
ON CHARINDEX(sn.Nickname, p.FirstName) > 0 OR
CHARINDEX(sn.Nickname, p.LastName) > 0
GROUP BY p.FirstName
HAVING COUNT(*) > 1
That's just to get an idea of what data you're trying to find in relation to the Nicknames that may possibly exist inside (as a wildcard word search) the Firstname and Lastname columns.
If you are happy with the items found then expand on the query to update the values.
Let's say you wanted to change the Firstname to be the Standardname...
UPDATE p2
SET p2.FirstName = a.StandardName
FROM
(SELECT p.ID, sn.StandardName
FROM People AS p
INNER JOIN StandardNames AS sn
ON CHARINDEX(sn.Nickname, p.FirstName) > 0 OR
CHARINDEX(sn.Nickname, p.LastName) > 0) AS a
INNER JOIN People AS p2 ON p2.ID = a.ID
So this will obviously find all the People IDs that have a match based on the query above, and it will update the People table by replacing the FirstName with the StandardName.
However, there are issues with this due to the limitation of your question.
The StandardNames table should have its own ID field. All tables should have an ID column as their primary key. That's just my view.
This is only going to work for data it matches using the CHARINDEX() function. What you really need is something that matches based on a "sound" or similarity to the nicknames. Check out the SOUNDEX() function and apply your logic from there.
And this is assuming your IDs above are unique!
Good luck
You could standardize the names by joining, and count the number of occurrences. Extracting the ID is a bit fiddly, but also quite possible. I'd suggest the following - use a case expression to find the contact with the standard name, and if you don't have one, just take the id of the first duplicate:
SELECT COALESCE(MIN(CASE FirstName WHEN StandardName THEN id END), MIN(id)),
StandardName,
LastName,
COUNT(*)
FROM People p
LEFT JOIN StandardNames s ON FirstName = Nickname
GROUP BY StandardName, LastName
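Here is a hedged, runnable variation of the join-and-count idea, using Python's sqlite3 module in place of SQL Server. The sample rows are invented. It also adds two things the query above does not have: COALESCE(s.StandardName, p.FirstName), so that people whose first name is already the standard name (and therefore has no Nickname row) are grouped with their nicknamed duplicates, and a HAVING clause that keeps only the groups with more than one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE People (ID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE StandardNames (Nickname TEXT, StandardName TEXT);
    -- Invented sample data: Rich Smith and Richard Smith are the same person.
    INSERT INTO People VALUES (1, 'Rich', 'Smith'), (2, 'Richard', 'Smith'),
                              (3, 'Richie', 'Jones'), (4, 'Anna', 'Lee');
    INSERT INTO StandardNames VALUES ('Rick', 'Richard'), ('Rich', 'Richard'),
                                     ('Richie', 'Richard');
""")

# Fall back to the stored first name when no nickname row matches,
# then count how many people share the normalized name.
duplicates = conn.execute("""
    SELECT COALESCE(s.StandardName, p.FirstName) AS NormalizedFirstName,
           p.LastName,
           COUNT(*) AS Occurrences,
           MIN(p.ID) AS FirstID
    FROM People p
    LEFT JOIN StandardNames s ON p.FirstName = s.Nickname
    GROUP BY COALESCE(s.StandardName, p.FirstName), p.LastName
    HAVING COUNT(*) > 1
""").fetchall()

print(duplicates)
```

This assumes each nickname maps to a single standard name; if one nickname had several StandardNames rows, the join would multiply the People rows and inflate the counts.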

Many Fields In the Group By Clause

I am learning SQL now, and I have a question. I recently came across a query that had a large number of column names in the group by clause. I've used group by clauses before, and I've only ever seen one column name included in it.
SELECT TransportType.Description, TransportType.CargoCapacity, TransportType.Range, Transport.SerialNumber, Transport.PurchaseDate, Transport.RetiredDate,
MAX(Repair.BeginWorkDate) AS LatestRepairDate
FROM Transport INNER JOIN
TransportType ON Transport.TransportTypeID = TransportType.TransportTypeID LEFT OUTER JOIN
Repair ON Transport.TransportNumber = Repair.TransportNumber
GROUP BY TransportType.Description, TransportType.CargoCapacity, TransportType.Range, Transport.SerialNumber, Transport.PurchaseDate,
Transport.RetiredDate
HAVING (Transport.RetiredDate IS NULL)
ORDER BY TransportType.Description, Transport.SerialNumber
Why are there so many columns in the group by clause?
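Except in MySQL & SQLite (which are lenient about the GROUP BY with sometimes indeterminate results; in MySQL this applies when the ONLY_FULL_GROUP_BY SQL mode is disabled, which was the default before version 5.7.5), most RDBMS require every column in the SELECT list that is not inside an aggregate function (MAX(), MIN(), SUM(), COUNT(), etc.) to also appear in the GROUP BY.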
The behavior of MySQL & SQLite when columns from SELECT aren't listed in GROUP BY is not well defined. If for example, you execute a query like:
SELECT firstname, lastname, COUNT(*) FROM names GROUP BY lastname
MySQL would give you a result without complaint.
However, if your table included two different values of firstname having the same lastname, your resultant COUNT(*) would count both of them while only returning the firstname of one of them. What's more, which firstname MySQL chooses to return isn't defined so you can't really rely on it returning the first of the pair, for example.
From a table like:
firstname, lastname
--------------------
Jane Smith
John Smith
Peter Jones
The not-fully-correct result might be:
firstname, lastname, COUNT(*)
-----------------------------
Jane Smith 2 <----wrong!
Peter Jones 1
Outside MySQL & SQLite, non-aggregated columns referenced in the SELECT list that do not also appear in the GROUP BY will generally cause the query to be rejected with an error.
Commonly here on Stack Overflow, we encounter users with questions about the GROUP BY, having just begun working with an RDBMS that is stricter about its usage. If you learn aggregates in MySQL first, chances are you'll need to relearn to do them properly when moving to a different RDBMS.
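For what it's worth, here is a small sketch of the two well-defined ways to write the query on the three-row example table above, run through Python's sqlite3 module. Both forms should also be accepted by the stricter databases: either group by every non-aggregated column, or group by lastname alone and state explicitly (here with MIN) which firstname you want reported.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE names (firstname TEXT, lastname TEXT);
    INSERT INTO names VALUES ('Jane', 'Smith'), ('John', 'Smith'), ('Peter', 'Jones');
""")

# Option 1: group by every non-aggregated column in the SELECT list.
per_full_name = conn.execute("""
    SELECT firstname, lastname, COUNT(*)
    FROM names
    GROUP BY firstname, lastname
    ORDER BY lastname, firstname
""").fetchall()

# Option 2: group by lastname only and say explicitly which firstname to report.
per_last_name = conn.execute("""
    SELECT MIN(firstname), lastname, COUNT(*)
    FROM names
    GROUP BY lastname
    ORDER BY lastname
""").fetchall()

print(per_full_name)
print(per_last_name)
```

In the second form the count of 2 for Smith is still paired with a single first name, but now it is a deterministic one that the query asked for.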

One to many relation and subqueries

I have a table persons(id, first, last) and another table properties(person_id, property_name, property_value).
Each person may have an arbitrary number of properties, and the names of these properties can vary based on the preferences of users. For example, for these records, we want the following output:
properties
==========
person_id property_name property_value
----------------------------------------
1 Gender Male
1 Education Under
person
======
id first last
----------------
1 John Smith
result
======
id First Last Gender Education
-----------------------------------
1 John Smith Male Under
Is there an easy way to have this done without using several steps in querying the db? I mean using subqueries, join, group by, or any other means necessary to do the job?
P.S. I am using SQLite 2 and 3.
Thank you,
Mahmoud
You can do
SELECT
pers.id,
pers.first,
pers.last,
gender.property_value AS gender,
edu.property_value AS education
FROM person pers
LEFT JOIN properties gender
ON (gender.person_id = pers.id AND gender.property_name = 'Gender')
LEFT JOIN properties edu
ON (edu.person_id = pers.id AND edu.property_name = 'Education')
But there is no way to do this for an arbitrary number of columns. You need to know which columns you want and join accordingly.
What you can do is build a query dynamically beforehand in another language, and then execute that. If you do so, please use parameters. Please?
If you want, you can create a view showing all properties for person_ids:
CREATE VIEW all_properties (
person_id,
gender,
education
)
AS SELECT
pers.person_id,
gender.property_value,
edu.property_value
FROM (SELECT DISTINCT person_id FROM properties) pers
LEFT JOIN properties gender
ON (gender.person_id = pers.person_id AND gender.property_name = 'Gender')
LEFT JOIN properties edu
ON (edu.person_id = pers.person_id AND edu.property_name = 'Education')
But that's not going to make much difference.
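To illustrate the "build the query dynamically, with parameters" suggestion above, here is a minimal sketch in Python with the sqlite3 module. The helper name build_pivot_query and the generated aliases (prop0, prop1, ...) are hypothetical. The point is that only generated aliases are spliced into the SQL text, while the property names themselves travel as bound ? parameters.

```python
import sqlite3


def build_pivot_query(property_names):
    """Return (sql, params) with one LEFT JOIN per requested property name."""
    select_parts = ["pers.id", "pers.first", "pers.last"]
    join_parts = []
    params = []
    for i, name in enumerate(property_names):
        alias = f"prop{i}"  # generated alias, never taken from user input
        select_parts.append(f"{alias}.property_value")
        join_parts.append(
            f"LEFT JOIN properties {alias} "
            f"ON {alias}.person_id = pers.id AND {alias}.property_name = ?"
        )
        params.append(name)  # the property name is passed as a bound parameter
    sql = f"SELECT {', '.join(select_parts)} FROM person pers {' '.join(join_parts)}"
    return sql, params


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, first TEXT, last TEXT);
    CREATE TABLE properties (person_id INTEGER, property_name TEXT, property_value TEXT);
    INSERT INTO person VALUES (1, 'John', 'Smith');
    INSERT INTO properties VALUES (1, 'Gender', 'Male'), (1, 'Education', 'Under');
""")

# First step: find out which property names exist; second step: pivot on them.
wanted = [row[0] for row in conn.execute(
    "SELECT DISTINCT property_name FROM properties ORDER BY property_name"
)]
sql, params = build_pivot_query(wanted)
result = conn.execute(sql, params).fetchall()
print(wanted, result)
```

It still takes two round trips (one to list the property names, one to run the generated query), but the column list no longer has to be hard-coded.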

Inner join (or intersect) over three tables

I have a database with three tables named: NameAddressPhone, NameAddressAge, and AgeSex.
Table NameAddressPhone has columns name, address, and phone.
Table NameAddressAge has columns name, address, and age.
Table AgeSex has columns age and sex.
I'm trying to write a (SQLite) query to find the names, addresses, and ages such that the names and addresses appear in both NameAddressPhone and NameAddressAge, and such that the ages appear in both NameAddressAge and AgeSex. I'm able to get halfway there (i.e., with two tables) using inner join, but I only dabble in SQL and would appreciate some help from an expert in getting this right. I have seen solutions that appear to be similar, but don't quite follow their logic.
Thanks in advance.
Chris
I think you just want to join these together on their obvious keys:
select *
from NameAddressPhone nap join
NameAddressAge naa
on nap.name = naa.name and
nap.address = naa.address join
(select distinct age
from AgeSex asx
) asx
on asx.age = naa.age
This is selecting the distinct ages in the AgeSex to prevent the proliferation of rows. Presumably, one age could appear multiple times in that table, which would result in duplicate rows on output.
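A quick way to see the effect of the DISTINCT subquery is to run both variants against a few made-up rows. The sketch below uses Python's sqlite3 module; the sample people and phone numbers are invented, and AgeSex deliberately contains the age 21 twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE NameAddressPhone (name TEXT, address TEXT, phone TEXT);
    CREATE TABLE NameAddressAge (name TEXT, address TEXT, age INTEGER);
    CREATE TABLE AgeSex (age INTEGER, sex TEXT);
    INSERT INTO NameAddressPhone VALUES ('Ann', '1 Main St', '555-0100'),
                                        ('Bob', '2 Oak Ave', '555-0101');
    INSERT INTO NameAddressAge VALUES ('Ann', '1 Main St', 21),
                                      ('Bob', '2 Oak Ave', 40),
                                      ('Cy', '3 Elm Rd', 21);
    INSERT INTO AgeSex VALUES (21, 'M'), (21, 'F');
""")

# Joining against the distinct ages yields one row per matching person.
with_distinct = conn.execute("""
    SELECT naa.name, naa.address, naa.age
    FROM NameAddressPhone nap
    JOIN NameAddressAge naa ON nap.name = naa.name AND nap.address = naa.address
    JOIN (SELECT DISTINCT age FROM AgeSex) asx ON asx.age = naa.age
""").fetchall()

# Joining AgeSex directly repeats a person once per AgeSex row with that age.
plain_join = conn.execute("""
    SELECT naa.name, naa.address, naa.age
    FROM NameAddressPhone nap
    JOIN NameAddressAge naa ON nap.name = naa.name AND nap.address = naa.address
    JOIN AgeSex asx ON asx.age = naa.age
""").fetchall()

print(with_distinct)
print(plain_join)
```

Bob drops out because his age is not in AgeSex, and Cy drops out because he is not in NameAddressPhone; only Ann satisfies both conditions.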
I am assuming your tables have the following layout
NameAddressPhone
================
Name
Address
Phone
NameAddressAge
==============
Name
Address
Age
AgeSex
======
Age
Sex
If I am understanding everything correctly, the solution might look kind of like this:
SELECT P.Name, P.Address, P.Phone, A.Age, S.Sex
FROM NameAddressPhone P
INNER JOIN NameAddressAge A ON P.Name = A.Name AND P.Address = A.Address
INNER JOIN AgeSex S ON A.Age = S.Age
Mind you, joining AgeSex could produce duplicate rows if there are multiple rows with the same age in AgeSex. There wouldn't be a way to distinguish 21 and Male from 21 and Female, for example.
I hope I can help and this is what you are looking for.

Join one row to multiple rows in another table

I have a table of entities (let's call them people) and properties (one person can have an arbitrary number of properties). Ex:
People
Name Age
--------
Jane 27
Joe 36
Jim 16
Properties
Name Property
-----------------
Jane Smart
Jane Funny
Jane Good-looking
Joe Smart
Joe Workaholic
Jim Funny
Jim Young
I would like to write an efficient select that would select people based on age and return all or some of their properties.
Ex: People older than 26
Name Properties
Jane Smart, Funny, Good-looking
Joe Smart, Workaholic
It's also acceptable to return one of the properties and total property count.
The query should be efficient: there are millions of rows in people table, hundreds of thousands of rows in properties table (so most people have no properties). There are hundreds of rows selected at a time.
Is there any way to do it?
Use:
SELECT x.name,
GROUP_CONCAT(y.property SEPARATOR ', ')
FROM PEOPLE x
LEFT JOIN PROPERTIES y ON y.name = x.name
WHERE x.age > 26
GROUP BY x.name
You want the MySQL function GROUP_CONCAT (documentation) in order to return a comma separated list of the PROPERTIES.property value.
I used a LEFT JOIN rather than a JOIN in order to include PEOPLE records that don't have a value in the PROPERTIES table - if you only want a list of people with values in the PROPERTIES table, use:
SELECT x.name,
GROUP_CONCAT(y.property SEPARATOR ', ')
FROM PEOPLE x
JOIN PROPERTIES y ON y.name = x.name
WHERE x.age > 26
GROUP BY x.name
I realize this is an example, but using a name is a poor choice for referential integrity when you consider how many "John Smith"s there are. Assigning a user_id, being a unique value per user, would be a better choice.
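If you want to compare the LEFT JOIN and JOIN variants side by side, here is a small sketch using Python's sqlite3 module. It assumes SQLite, where the separator is the second argument of group_concat rather than MySQL's SEPARATOR keyword. The person 'Ann' with no properties is invented to show the difference, and COUNT(y.property) is added to cover the "total property count" mentioned in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE People (name TEXT, age INTEGER);
    CREATE TABLE Properties (name TEXT, property TEXT);
    -- 'Ann' is an invented person with no properties.
    INSERT INTO People VALUES ('Jane', 27), ('Joe', 36), ('Jim', 16), ('Ann', 40);
    INSERT INTO Properties VALUES
        ('Jane', 'Smart'), ('Jane', 'Funny'), ('Jane', 'Good-looking'),
        ('Joe', 'Smart'), ('Joe', 'Workaholic'),
        ('Jim', 'Funny'), ('Jim', 'Young');
""")

# Only the join keyword differs between the two runs; it comes from a
# constant in this script, never from user input.
QUERY = """
    SELECT x.name, group_concat(y.property, ', '), COUNT(y.property)
    FROM People x
    {join} Properties y ON y.name = x.name
    WHERE x.age > 26
    GROUP BY x.name
    ORDER BY x.name
"""

left_rows = conn.execute(QUERY.format(join="LEFT JOIN")).fetchall()
inner_rows = conn.execute(QUERY.format(join="JOIN")).fetchall()

for name, properties, total in left_rows:
    print(name, properties, total)
```

With the LEFT JOIN, Ann is returned with a NULL property list and a count of 0; with the plain JOIN she is left out. The order of items inside each concatenated list is not guaranteed.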
You can use INNER JOIN to link the two tables together. More info on JOINs.
SELECT *
FROM People P
INNER JOIN Properties Pr
ON Pr.Name = P.Name
WHERE P.Name = 'Joe' -- or a specific age, etc
However, it's often a lot faster to add a unique primary key to tables like these, and to create an index to increase speed.
Say the table People has a field id
And the table Properties has a field peopleId to link them together
Then the query would look something like this:
SELECT *
FROM People P
INNER JOIN Properties Pr
ON Pr.peopleId = P.id
WHERE P.Name = 'Joe'
SELECT x.name,
(SELECT GROUP_CONCAT(y.property SEPARATOR ', ')
FROM Properties y
WHERE y.name = x.name) AS Properties
FROM People x
try this