Join one row to multiple rows in another table - sql

I have a table of entities (let's call them people) and a table of properties (one person can have an arbitrary number of properties). Ex:
People
Name Age
--------
Jane 27
Joe 36
Jim 16
Properties
Name Property
-----------------
Jane Smart
Jane Funny
Jane Good-looking
Joe Smart
Joe Workaholic
Jim Funny
Jim Young
I would like to write an efficient select that would select people based on age and return all or some of their properties.
Ex: People older than 26
Name Properties
Jane Smart, Funny, Good-looking
Joe Smart, Workaholic
It's also acceptable to return one of the properties and total property count.
The query should be efficient: there are millions of rows in the people table and hundreds of thousands of rows in the properties table (so most people have no properties). Hundreds of rows are selected at a time.
Is there any way to do it?

Use:
SELECT x.name,
GROUP_CONCAT(y.property SEPARATOR ', ')
FROM PEOPLE x
LEFT JOIN PROPERTIES y ON y.name = x.name
WHERE x.age > 26
GROUP BY x.name
You want the MySQL function GROUP_CONCAT (documentation) in order to return a comma separated list of the PROPERTIES.property value.
I used a LEFT JOIN rather than a JOIN in order to include PEOPLE records that don't have a value in the PROPERTIES table - if you only want a list of people with values in the PROPERTIES table, use:
SELECT x.name,
GROUP_CONCAT(y.property SEPARATOR ', ')
FROM PEOPLE x
JOIN PROPERTIES y ON y.name = x.name
WHERE x.age > 26
GROUP BY x.name
I realize this is an example, but using a name is a poor choice for referential integrity when you consider how many "John Smith"s there are. Assigning a user_id, a unique value per user, would be a better choice.
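The question also allows returning just one property plus the total property count. A minimal sketch of that variant follows, run through Python's built-in sqlite3 module rather than MySQL so that it is self-contained. The extra person 'Ann' (who has no properties) and the index name are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (name TEXT PRIMARY KEY, age INTEGER);
    CREATE TABLE properties (name TEXT, property TEXT);
    -- Index on the join column so each person's properties are found quickly.
    CREATE INDEX idx_properties_name ON properties (name);
    -- 'Ann' is a hypothetical person with no properties.
    INSERT INTO people VALUES ('Jane', 27), ('Joe', 36), ('Jim', 16), ('Ann', 40);
    INSERT INTO properties VALUES
        ('Jane', 'Smart'), ('Jane', 'Funny'), ('Jane', 'Good-looking'),
        ('Joe', 'Smart'), ('Joe', 'Workaholic'),
        ('Jim', 'Funny'), ('Jim', 'Young');
""")

# MIN() picks one property (the alphabetically first); COUNT(pr.property)
# ignores the NULLs produced by the LEFT JOIN, so people without
# properties get a count of 0 instead of being dropped.
rows = conn.execute("""
    SELECT p.name,
           MIN(pr.property)   AS one_property,
           COUNT(pr.property) AS property_count
    FROM people p
    LEFT JOIN properties pr ON pr.name = p.name
    WHERE p.age > ?
    GROUP BY p.name
    ORDER BY p.name
""", (26,)).fetchall()

for row in rows:
    print(row)
```

With the sample data this yields one row per person older than 26, including the person who has no properties at all.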

You can use INNER JOIN to link the two tables together. More info on JOINs.
SELECT *
FROM People P
INNER JOIN Properties Pr
ON Pr.Name = P.Name
WHERE P.Name = 'Joe' -- or a specific age, etc
However, it's often a lot faster to add a unique primary key to tables like these, and to create an index on the column used for the join.
Say the table People has a field id
And the table Properties has a field peopleId to link them together
Then the query would look something like this:
SELECT *
FROM People P
INNER JOIN Properties Pr
ON Pr.peopleId = P.id
WHERE P.Name = 'Joe'
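As a rough sketch of the key-and-index idea, the snippet below builds both tables in an in-memory SQLite database through Python's sqlite3 module. The column types, the index name, and the second person named 'Joe' are assumptions chosen to show why joining on an id is safer than joining on a name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE People (id INTEGER PRIMARY KEY, Name TEXT, Age INTEGER);
    CREATE TABLE Properties (
        peopleId INTEGER REFERENCES People(id),
        Property TEXT
    );
    -- Index the foreign key so the join does not scan all of Properties.
    CREATE INDEX idx_properties_peopleId ON Properties (peopleId);
    -- Two different people share the name 'Joe' (hypothetical data).
    INSERT INTO People VALUES (1, 'Jane', 27), (2, 'Joe', 36), (3, 'Joe', 16);
    INSERT INTO Properties VALUES
        (1, 'Smart'), (2, 'Smart'), (2, 'Workaholic'), (3, 'Young');
""")

# Filtering on the id returns the properties of exactly one person,
# even though another person has the same name.
rows = conn.execute("""
    SELECT P.id, P.Name, Pr.Property
    FROM People P
    INNER JOIN Properties Pr ON Pr.peopleId = P.id
    WHERE P.id = ?
    ORDER BY Pr.Property
""", (2,)).fetchall()

for row in rows:
    print(row)
```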

Try this:
SELECT x.name,
(SELECT GROUP_CONCAT(y.property SEPARATOR ', ')
FROM PROPERTIES y
WHERE y.name = x.name) AS Properties
FROM PEOPLE x

Related

Find potential duplicate names in database

I have two tables in a SQL Server Database:
Table: People
Columns: ID, FirstName, LastName
Table: StandardNames
Columns: Nickname, StandardName
Sample Nicknames would be Rick, Rich, Richie when StandardName is Richard.
I would like to find duplicate contacts in my People table but replace any of the nicknames with the standard name. IE: sometimes I have Rich Smith other times it is Richard Smith in the People table. Is this possible? I realize it might be multiple joins to the same table but can't figure out how to start.
Firstly, you need to determine how many duplicates you have in your People table...
SELECT p.FirstName, COUNT(*)
FROM People AS p
INNER JOIN StandardNames AS sn
ON CHARINDEX(sn.Nickname, p.FirstName) > 0 OR
CHARINDEX(sn.Nickname, p.LastName) > 0
GROUP BY p.FirstName
HAVING COUNT(*) > 1
That's just to get an idea of what data you're trying to find in relation to the Nicknames that may possibly exist inside (as a wildcard word search) the Firstname and Lastname columns.
If you are happy with the items found then expand on the query to update the values.
Let's say you wanted to change the Firstname to be the Standardname...
UPDATE p2
SET p2.FirstName = a.StandardName
FROM
(SELECT p.ID, sn.StandardName
FROM People AS p
INNER JOIN StandardNames AS sn
ON CHARINDEX(sn.Nickname, p.FirstName) > 0 OR
CHARINDEX(sn.Nickname, p.LastName) > 0) AS a
INNER JOIN People AS p2 ON p2.ID = a.ID
So this will find all the People IDs that have a match based on the query above, and it will update the People table by replacing the FirstName with the StandardName.
However, there are issues with this, given the limited detail in your question:
The StandardNames table should have its own ID field. In my view, every table should have an ID column as its primary key.
This only works for data matched by the CHARINDEX() function, which is a substring search. What you really need is a match based on "sound" or similarity to the nicknames. Check out the SOUNDEX() function and apply your logic from there.
It also assumes the IDs above are unique!
Good luck
You could standardize the names by joining, and count the number of occurrences. Extracting the ID is a bit fiddly, but also quite possible. I'd suggest the following - use a case expression to find the contact with the standard name, and if you don't have one, just take the id of the first duplicate:
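SELECT COALESCE(MIN(CASE WHEN p.FirstName = COALESCE(s.StandardName, p.FirstName) THEN p.ID END), MIN(p.ID)) AS ID,
COALESCE(s.StandardName, p.FirstName) AS StandardName,
p.LastName,
COUNT(*) AS Occurrences
FROM People p
LEFT JOIN StandardNames s ON p.FirstName = s.Nickname
GROUP BY COALESCE(s.StandardName, p.FirstName), p.LastName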
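To make the standardize-then-count idea concrete, here is a small sketch run against SQLite through Python's sqlite3 module (the question targets SQL Server, so treat this as an illustration of the technique only). The sample rows and the column aliases StdFirstName, Occurrences, and FirstID are assumptions. First names that are not listed as nicknames fall back to themselves, so "Richard Smith" groups together with "Rich Smith".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE People (ID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE StandardNames (Nickname TEXT PRIMARY KEY, StandardName TEXT);
    -- Hypothetical sample data.
    INSERT INTO People VALUES
        (1, 'Rich', 'Smith'), (2, 'Richard', 'Smith'),
        (3, 'Richie', 'Jones'), (4, 'Anna', 'Lee');
    INSERT INTO StandardNames VALUES
        ('Rick', 'Richard'), ('Rich', 'Richard'), ('Richie', 'Richard');
""")

# Map each first name to its standard form (or keep it unchanged when it
# is not a known nickname), then keep only name pairs that occur twice
# or more.
duplicates = conn.execute("""
    SELECT COALESCE(s.StandardName, p.FirstName) AS StdFirstName,
           p.LastName,
           COUNT(*)  AS Occurrences,
           MIN(p.ID) AS FirstID
    FROM People p
    LEFT JOIN StandardNames s ON s.Nickname = p.FirstName
    GROUP BY COALESCE(s.StandardName, p.FirstName), p.LastName
    HAVING COUNT(*) > 1
""").fetchall()

for row in duplicates:
    print(row)
```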

SQL Table design (select * from table where ( field = search1, and field = search2)

PatientDX
Name Disease
Aa HIV
Aba DM
Bb HT
Bb DM
Aa HT
I want to get the names of patients who have both HIV and DM, or all diseases, or some similar combination. I want to run the analysis from disease checkboxes in the UI. How can I do it? Is my table design bad? Could you suggest a better way to achieve that? There can be as many as 100 disease names, so I want to make it easy to find patients who have a particular 3, 4, or 5 diseases, and so on. Thank you.
Your schema is good.
Here is an example of how to query for patients having all 3 specific diseases:
select Name
from PatientDX
where Disease in ('HIV', 'DM', 'HT')
group by Name
having count(distinct Disease) = 3
A few important things to note here:
We use distinct in the having clause to make sure that if only 'HIV' was passed in 3 times (I'm assuming you will be using parameterized queries) we wouldn't get a result back.
The value you are comparing count to (in this case, 3) must match the number of values in the in clause.
The list of values in the IN clause must be unique before you count them.
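The three notes above can be sketched as a small helper that takes the checked diseases from the UI. This is only an illustration using Python's sqlite3 module; the function name patients_with_all is hypothetical. It de-duplicates the input first, generates one placeholder per value, and passes the number of distinct values as the count to compare against.

```python
import sqlite3

def patients_with_all(conn, diseases):
    """Return names of patients who have every disease in `diseases`."""
    wanted = sorted(set(diseases))          # the IN list must be unique
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT Name
        FROM PatientDX
        WHERE Disease IN ({placeholders})
        GROUP BY Name
        HAVING COUNT(DISTINCT Disease) = ?
        ORDER BY Name
    """
    # Only placeholders are interpolated into the SQL text; the values
    # themselves travel as bound parameters.
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PatientDX (Name TEXT, Disease TEXT);
    INSERT INTO PatientDX VALUES
        ('Aa', 'HIV'), ('Aba', 'DM'), ('Bb', 'HT'), ('Bb', 'DM'), ('Aa', 'HT');
""")

print(patients_with_all(conn, ["HIV", "HT"]))
print(patients_with_all(conn, ["DM"]))
```

Because the count is derived from the de-duplicated list, checking the same disease twice in the UI cannot cause a mismatch.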
Try grouping by patient name and then using a HAVING clause in your query. So to find patients who have more than one disease, you could query like:
SELECT NAME
FROM PatientDX
GROUP BY NAME
HAVING COUNT(DISTINCT Disease) > 1
If you wish to find the patients who have all the diseases, you could do something like:
SELECT NAME
FROM PatientDX
GROUP BY NAME
HAVING COUNT(DISTINCT Disease) = (SELECT COUNT(DISTINCT Disease)
FROM PatientDX)-- although you could maintain disease in another table.

SQL show multiple values from same columns

There are 2 tables and I must do an inner join.
The first table is called People
Name, Surname, id, and value
The second table is called Work
id (foreign key to the first table), category, state, roles, date.
The column "roles" can have multiple values (employee, director, workers, etc.).
Using an inner join, I must show the history of roles for each person in one row (Name Surname roles1, roles2, roles3)
Example: Jack Brown employee director workers
How can I show multiple values from one column in a single row?
If you just need to see the roles but don't really require them to be in separate columns you can use listagg()
select p.id,
p.name,
p.surname,
listagg(w.roles, ',') within group (order by start_date) as all_roles
from people p
join work w on p.id = w.id
group by p.id, p.name, p.surname
This would output something like this:
ID | NAME | SURNAME | ALL_ROLES
---+--------+---------+-------------------------
1 | Jack | Brown | employee,worker,director
You can't actually have each role in a separate column, because in SQL the number of columns in a result is fixed. So you can't have a result that has three columns for the roles of "Jack Brown" and two columns for the roles of "Arthur Dent".
You could write a PL/SQL function that selects all related records from the Work table for a given id from the People table and iterates over them with a cursor to build a string of all the roles. Alternatively, you could generate XML with the DBMS_XMLGEN.GETXML function if you are using Oracle 10g or higher.

One to many relation and subqueries

I have a table person(id, first, last) and another table properties(person_id, property_name, property_value).
Each person may have many (undefined) properties, and the names of these properties can vary based on user preferences. For example, for these records, we want the following output:
properties
==========
person_id property_name property_value
----------------------------------------
1 Gender Male
1 Education Under
person
======
id first last
----------------
1 John Smith
result
======
id First Last Gender Education
-----------------------------------
1 John Smith Male Under
Is there an easy way to have this done without using several steps in querying the db? I mean using subqueries, join, group by, or any other means necessary to do the job?
PS. I am using SQLite 2 and 3.
Thank you,
Mahmoud
You can do
SELECT
pers.id,
pers.first,
pers.last,
gender.property_value AS gender,
edu.property_value AS education
FROM person pers
LEFT JOIN properties gender
ON (gender.person_id = pers.id AND gender.property_name = 'Gender')
LEFT JOIN properties edu
ON (edu.person_id = pers.id AND edu.property_name = 'Education')
But there is no way to do this for an arbitrary number of columns. You need to know which columns you want and join accordingly.
What you can do, is build a query dynamically beforehand in another language, and then execute that. If you do so, please use parameters. Please?
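A minimal sketch of that approach, assuming Python's sqlite3 module as the "other language": the helper pivot_properties (a hypothetical name) adds one LEFT JOIN per requested property. Table aliases are generated by the code, and the property names are passed as bound parameters, so no user input is ever pasted into the SQL text.

```python
import sqlite3

def pivot_properties(conn, property_names):
    """Build and run one LEFT JOIN per requested property name."""
    select_parts = ['pers.id', 'pers."first"', 'pers."last"']
    join_parts = []
    params = []
    for i, name in enumerate(property_names):
        alias = f"p{i}"                      # generated alias, never user input
        select_parts.append(f"{alias}.property_value")
        join_parts.append(
            f"LEFT JOIN properties {alias} "
            f"ON {alias}.person_id = pers.id AND {alias}.property_name = ?"
        )
        params.append(name)                  # bound as a parameter
    sql = (
        "SELECT " + ", ".join(select_parts)
        + " FROM person pers "
        + " ".join(join_parts)
        + " ORDER BY pers.id"
    )
    header = ["id", "first", "last", *property_names]
    return header, conn.execute(sql, params).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, "first" TEXT, "last" TEXT);
    CREATE TABLE properties (person_id INTEGER, property_name TEXT, property_value TEXT);
    INSERT INTO person VALUES (1, 'John', 'Smith');
    INSERT INTO properties VALUES (1, 'Gender', 'Male'), (1, 'Education', 'Under');
""")

header, rows = pivot_properties(conn, ["Gender", "Education"])
print(header)
print(rows)
```

A property that a person does not have simply comes back as NULL (None in Python), because every property is joined with a LEFT JOIN.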
If you want, you can create a view showing all properties for person_ids:
CREATE VIEW all_properties (
person_id,
gender,
education
)
AS SELECT
pers.person_id,
gender.property_value,
edu.property_value
FROM (SELECT DISTINCT person_id FROM properties) pers
LEFT JOIN properties gender
ON (gender.person_id = pers.person_id AND gender.property_name = 'Gender')
LEFT JOIN properties edu
ON (edu.person_id = pers.person_id AND edu.property_name = 'Education')
But that's not going to make much difference.

Simple MySQL problem

I'm working on a MySQL database that contains persons. My problem is that, (I will simplify to make my point):
I have three tables:
Persons(id int, birthdate date)
PersonsLastNames(id int, lastname varchar(30))
PersonsFirstNames(id int, firstname varchar(30))
The id is the common key. There are separate tables for last names and first names because a single person can have many first names and many last names.
I want to make a query that returns all persons with, let's say, one last name. If I go with
select birthdate, lastname, firstname from Persons, PersonsLastNames,
PersonsFirstNames where Persons.id = PersonsLastNames.id and
Persons.id = PersonsFirstNames.id and lastName = 'Anderson'
I end up with a table like
1/1/1970 Anderson Steven //Person 1
1/1/1970 Anderson David //Still Person 1
2/2/1980 Smith Adam //Person 2
3/3/1990 Taylor Ed //Person 3
When presenting this, I would like to have
1/1/1970 Anderson Steven David
2/2/1980 Smith Adam [possibly null?]
3/3/1990 Taylor Ed [possibly null?]
How do I join the tables to introduce new columns in the result set if needed to hold several first names or last names for one person?
Does your application really need to handle unlimited first/last names per person? I don't know your specific needs, but that seems like it may be a little extreme. Regardless...
Since you can't really have a dynamic number of columns returned, you could do something like this:
SELECT birthdate, lastname, GROUP_CONCAT(firstname SEPARATOR '|') AS firstnames
FROM Persons, PersonsLastNames, PersonsFirstNames
WHERE Persons.id = PersonsLastNames.id
AND Persons.id = PersonsFirstNames.id
GROUP BY Persons.id
This would return one row per person that has a last name, with the (unlimited) first names joined by the GROUP_CONCAT function and separated by a pipe (|) symbol.
birthdate            lastname  firstnames
-------------------  --------  ------------
1970-01-01 00:00:00  Anderson  Steven|David
1980-02-02 00:00:00  Smith     Adam
1990-03-03 00:00:00  Taylor    Ed
SQL does not support a dynamic number of columns in the query select-list. You have to define exactly as many columns as you want (notwithstanding the * wildcard).
I recommend that you fetch the multiple names as rows, not columns. Then write some application code to loop over the result set and do whatever you want to do for presenting them.
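As an illustration of the rows-then-loop approach, the sketch below fetches one row per (person, last name, first name) combination and folds them together in application code. It uses Python's sqlite3 module instead of MySQL so that it runs standalone. The sample rows mirror the question, and ordering by SQLite's implicit rowid is an assumption made only to keep the first names in insertion order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Persons (id INTEGER PRIMARY KEY, birthdate TEXT);
    CREATE TABLE PersonsLastNames (id INTEGER, lastname TEXT);
    CREATE TABLE PersonsFirstNames (id INTEGER, firstname TEXT);
    INSERT INTO Persons VALUES (1, '1970-01-01'), (2, '1980-02-02'), (3, '1990-03-03');
    INSERT INTO PersonsLastNames VALUES (1, 'Anderson'), (2, 'Smith'), (3, 'Taylor');
    INSERT INTO PersonsFirstNames VALUES (1, 'Steven'), (1, 'David'), (2, 'Adam'), (3, 'Ed');
""")

rows = conn.execute("""
    SELECT p.id, p.birthdate, l.lastname, f.firstname
    FROM Persons p
    INNER JOIN PersonsLastNames l ON l.id = p.id
    INNER JOIN PersonsFirstNames f ON f.id = p.id
    ORDER BY p.id, f.rowid
""")

# Collect every name under its person id; the join repeats names when a
# person has several first and last names, so skip ones already seen.
people = {}
for pid, birthdate, lastname, firstname in rows:
    person = people.setdefault(
        pid, {"birthdate": birthdate, "lastnames": [], "firstnames": []}
    )
    if lastname not in person["lastnames"]:
        person["lastnames"].append(lastname)
    if firstname not in person["firstnames"]:
        person["firstnames"].append(firstname)

lines = [
    f"{p['birthdate']} {' '.join(p['lastnames'])} {' '.join(p['firstnames'])}"
    for p in people.values()
]
for line in lines:
    print(line)
```

The presentation layer is then free to render any number of names per person without the query needing a variable number of columns.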
The short answer is, you can't. You'll always have to pick a fixed number of columns. You can, however, greatly improve the syntax of your query by using the ON keyword. For example:
SELECT
birthdate,
firstName,
lastName
FROM
Persons
INNER JOIN PersonsLastNames
ON Persons.id = PersonsLastNames.id
INNER JOIN PersonsFirstNames
ON Persons.id = PersonsFirstNames.id
WHERE
lastName = 'Anderson'
GROUP BY
lastName, firstName
HAVING
count(lastName) = 1
Of course, my query includes a few extra provisions at the end so that only persons with only one last name specified would be grabbed, but you can always remove those.
Now, what you CAN do is choose a maximum number of these you'd like to retrieve and do something like this:
SELECT
birthdate,
lastName,
PersonsFirstNames.firstName,
IFNULL(p.firstName,''),
IFNULL(q.firstName,'')
FROM
Persons
INNER JOIN PersonsLastNames
ON Persons.id = PersonsLastNames.id
INNER JOIN PersonsFirstNames
ON Persons.id = PersonsFirstNames.id
LEFT JOIN PersonsFirstNames p
ON Persons.id = p.id
AND p.firstName <> PersonsFirstNames.firstName
LEFT JOIN PersonsFirstNames q
ON Persons.id = q.id
AND q.firstName <> PersonsFirstNames.firstName
AND q.firstName <> p.firstName
GROUP BY
lastName
But I really don't recommend that. The best bet is to retrieve multiple rows, and then iterate over them in whatever application you're using/developing.
Make sure you read up on your JOIN types (Left-vs-Inner), if you're not already familiar, before you start. Hope this helps.
EDIT: You also might want to consider, in that case, a slightly more complex GROUP BY clause, e.g.
GROUP BY
Persons.id, lastName
I think the closest thing you could do is to Group By Person.Id and then do string concatenation. Perhaps this post will help:
How to use GROUP BY to concatenate strings in MySQL?