One to many relation and subqueries - sql

I have a table person(id, first, last) and another table properties(person_id, property_name, property_value).
Each person may have any number of properties, and the property names can vary with users' preferences. For example, given these records, we want the following output:
properties
==========
person_id  property_name  property_value
----------------------------------------
1          Gender         Male
1          Education      Under
person
======
id  first  last
----------------
1   John   Smith
result
======
id  First  Last   Gender  Education
-----------------------------------
1   John   Smith  Male    Under
Is there an easy way to have this done without using several steps in querying the db? I mean using subqueries, join, group by, or any other means necessary to do the job?
PS. I am using SQLite 2 and 3.
Thank you,
Mahmoud

You can do
SELECT
pers.id,
pers.first,
pers.last,
gender.property_value AS gender,
edu.property_value AS education
FROM person pers
LEFT JOIN properties gender
ON (gender.person_id = pers.id AND gender.property_name = 'Gender')
LEFT JOIN properties edu
ON (edu.person_id = pers.id AND edu.property_name = 'Education')
But there is no way to do this for an arbitrary number of columns. You need to know which columns you want and join accordingly.
What you can do is build the query dynamically beforehand in another language and then execute it. If you do so, please use parameters. Please?
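As a rough illustration of building the query dynamically, here is a minimal sketch in Python using the standard sqlite3 module. The helper name build_pivot_query and the generated aliases p0, p1, ... are hypothetical and not part of the original answer. The property names are bound as ? parameters and never concatenated into the SQL text.

```python
import sqlite3

def build_pivot_query(property_names):
    """Build one LEFT JOIN per property; names are bound as parameters."""
    select_cols = ["pers.id", "pers.first", "pers.last"]
    joins = []
    for i, _ in enumerate(property_names):
        alias = f"p{i}"  # alias is generated here, never taken from user input
        select_cols.append(f"{alias}.property_value")
        joins.append(
            f"LEFT JOIN properties {alias} "
            f"ON {alias}.person_id = pers.id AND {alias}.property_name = ?"
        )
    sql = f"SELECT {', '.join(select_cols)} FROM person pers {' '.join(joins)}"
    return sql, list(property_names)

# Sample data from the question, in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, first TEXT, last TEXT);
    CREATE TABLE properties (person_id INTEGER, property_name TEXT, property_value TEXT);
    INSERT INTO person VALUES (1, 'John', 'Smith');
    INSERT INTO properties VALUES (1, 'Gender', 'Male');
    INSERT INTO properties VALUES (1, 'Education', 'Under');
""")

# Step 1: discover which property names exist.
names = [r[0] for r in conn.execute(
    "SELECT DISTINCT property_name FROM properties ORDER BY property_name")]

# Step 2: build the pivot query for those names and run it.
sql, params = build_pivot_query(names)
rows = conn.execute(sql, params).fetchall()
```

This still takes two round trips (one to list the property names, one to run the generated query), but the second query returns one row per person with one column per property.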
If you want, you can create a view showing all properties for person_ids:
CREATE VIEW all_properties (
person_id,
gender,
education
)
AS SELECT
pers.person_id,
gender.property_value,
edu.property_value
FROM (SELECT DISTINCT person_id FROM properties) pers
LEFT JOIN properties gender
ON (gender.person_id = pers.person_id AND gender.property_name = 'Gender')
LEFT JOIN properties edu
ON (edu.person_id = pers.person_id AND edu.property_name = 'Education')
But that's not going to make much difference.

Related

Find potential duplicate names in database

I have two tables in a SQL Server Database:
Table: People
Columns: ID, FirstName, LastName
Table: StandardNames
Columns: Nickname, StandardName
Sample Nicknames would be Rick, Rich, Richie when StandardName is Richard.
I would like to find duplicate contacts in my People table but replace any of the nicknames with the standard name. IE: sometimes I have Rich Smith other times it is Richard Smith in the People table. Is this possible? I realize it might be multiple joins to the same table but can't figure out how to start.
Firstly, you need to determine how many duplicates you have in your People table...
SELECT p.FirstName, COUNT(*)
FROM People AS p
INNER JOIN StandardNames AS sn
ON CHARINDEX(sn.Nickname, p.FirstName) > 0 OR
CHARINDEX(sn.Nickname, p.LastName) > 0
GROUP BY p.FirstName
HAVING COUNT(*) > 1
That's just to get an idea of what data you're trying to find in relation to the Nicknames that may possibly exist inside (as a wildcard word search) the Firstname and Lastname columns.
If you are happy with the items found then expand on the query to update the values.
Let's say you wanted to change the Firstname to be the Standardname...
UPDATE p2
SET p2.FirstName = a.StandardName
FROM
(SELECT p.ID, sn.StandardName
FROM People AS p
INNER JOIN StandardNames AS sn
ON CHARINDEX(sn.Nickname, p.FirstName) > 0 OR
CHARINDEX(sn.Nickname, p.LastName) > 0) AS a
INNER JOIN People AS p2 ON p2.ID = a.ID
So this will obviously find all the People IDs that have a match based on the query above, and it will update the People table by replacing the FirstName with the StandardName.
However, there are some issues with this, given the limited detail in your question.
The StandardNames table should have its own ID field. Every table should have an ID column as its primary key. That's just my view.
This is only going to work for data that CHARINDEX() matches, which is a plain substring search. What you really need is something that matches on "sound" or similarity to the nicknames. Check out the SOUNDEX() function and apply your logic from there.
And this assumes that your IDs above are unique!
Good luck
You could standardize the names by joining, and count the number of occurrences. Extracting the ID is a bit fiddly, but also quite possible. I'd suggest the following - use a case expression to find the contact with the standard name, and if you don't have one, just take the id of the first duplicate:
SELECT COALESCE(MIN(CASE p.FirstName WHEN COALESCE(s.StandardName, p.FirstName) THEN p.ID END), MIN(p.ID)),
COALESCE(s.StandardName, p.FirstName) AS StandardName,
p.LastName,
COUNT(*)
FROM People p
LEFT JOIN StandardNames s ON p.FirstName = s.Nickname
GROUP BY COALESCE(s.StandardName, p.FirstName), p.LastName
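To make the join-and-count idea concrete, here is a small sketch that runs it against SQLite from Python rather than SQL Server. The sample rows are invented for illustration. The assumption is that a first name with no entry in StandardNames is already a standard name, which is why COALESCE falls back to FirstName.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE People (ID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE StandardNames (Nickname TEXT, StandardName TEXT);
""")
# Invented sample data: two spellings of the same contact plus an unrelated one.
conn.executemany("INSERT INTO People VALUES (?, ?, ?)",
                 [(1, 'Rich', 'Smith'), (2, 'Richard', 'Smith'), (3, 'Mary', 'Jones')])
conn.executemany("INSERT INTO StandardNames VALUES (?, ?)",
                 [('Rick', 'Richard'), ('Rich', 'Richard'), ('Richie', 'Richard')])

# Map each first name to its standard form, then count per (first, last) pair.
duplicates = conn.execute("""
    SELECT COALESCE(s.StandardName, p.FirstName) AS NormalizedFirst,
           p.LastName,
           COUNT(*) AS Occurrences
    FROM People p
    LEFT JOIN StandardNames s ON p.FirstName = s.Nickname
    GROUP BY COALESCE(s.StandardName, p.FirstName), p.LastName
    HAVING COUNT(*) > 1
""").fetchall()
```

Here "Rich Smith" and "Richard Smith" collapse into one group, while "Mary Jones" is left out because her group has a single row.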

Returning duplicated values only once from a join query

I'm trying to extract info from a table in my database based on a person's job. In one table I have all the clients' info; in another table, linked by ID_no, I have their job title and the branches they're associated with. The problem I'm having is that when I join both tables I get some duplicates, because a person can be associated with more than one branch.
I would like to know how to return the duplicated values only once, because all I care about for the moment is the person's ID number and what their job title is.
SELECT *
FROM dbo.employeeinfo AS ll
LEFT OUTER JOIN employeeJob AS lly
ON ll.id_no = lly.id_no
WHERE lly.job_category = 'cle'
I know SELECT DISTINCT will not work in this situation, since the duplicated rows contain different branches.
Any help would be appreciated. Thanks
I'm using SQL Server 2008, by the way.
Edit, to illustrate the result (ll and lly mark which table each column comes from):
rec_ID  employeeID (ll)  Name (ll)  JobTitle (lly)  Branch (lly)
----------------------------------------------------------------
1       JX100            John       cle             london
2       JX100            John       cle             manchester
3       JX690            Matt       89899           london
4       JX760            Steve      12345           london
I would like the second record not to be displayed, because I'm not interested in the branch. I just need to know the employee ID and his job title, but because of how the tables are structured, JX100 is returned twice: he's recorded as working in two different branches.
You must use SELECT DISTINCT and specify you ONLY want person id number and job title.
I don't know your exact field names, but I think something like this could work.
SELECT DISTINCT ll.id_no AS person_id_number,
lly.job AS person_job
FROM dbo.employeeinfo AS ll LEFT OUTER JOIN
employeeJob AS lly ON ll.id_no = lly.id_no
WHERE lly.job_category = 'cle'
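A quick way to check that DISTINCT behaves this way once the branch column is left out of the select list is the sketch below, run against SQLite from Python. The column names name and branch are guesses at the real schema, and the rows are adapted from the question's example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employeeinfo (id_no TEXT, name TEXT);
    CREATE TABLE employeeJob (id_no TEXT, job_category TEXT, branch TEXT);
""")
conn.executemany("INSERT INTO employeeinfo VALUES (?, ?)",
                 [('JX100', 'John'), ('JX690', 'Matt')])
conn.executemany("INSERT INTO employeeJob VALUES (?, ?, ?)",
                 [('JX100', 'cle', 'london'),
                  ('JX100', 'cle', 'manchester'),
                  ('JX690', '89899', 'london')])

join_sql = """
    FROM employeeinfo AS ll
    JOIN employeeJob AS lly ON ll.id_no = lly.id_no
    WHERE lly.job_category = 'cle'
"""
# Selecting every column keeps both branch rows for JX100.
all_rows = conn.execute("SELECT * " + join_sql).fetchall()
# Selecting only the id and job lets DISTINCT collapse them into one row.
distinct_rows = conn.execute(
    "SELECT DISTINCT ll.id_no, lly.job_category " + join_sql).fetchall()
```

DISTINCT applies to the whole select list, so it only removes the duplicates once the differing branch column is no longer selected.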

Join one row to multiple rows in another table

I have a table of entities (let's call them people) and a table of properties (one person can have an arbitrary number of properties). Ex:
People
Name Age
--------
Jane 27
Joe 36
Jim 16
Properties
Name Property
-----------------
Jane Smart
Jane Funny
Jane Good-looking
Joe Smart
Joe Workaholic
Jim Funny
Jim Young
I would like to write an efficient select that would select people based on age and return all or some of their properties.
Ex: People older than 26
Name Properties
Jane Smart, Funny, Good-looking
Joe Smart, Workaholic
It's also acceptable to return one of the properties and total property count.
The query should be efficient: there are millions of rows in people table, hundreds of thousands of rows in properties table (so most people have no properties). There are hundreds of rows selected at a time.
Is there any way to do it?
Use:
SELECT x.name,
GROUP_CONCAT(y.property SEPARATOR ', ')
FROM PEOPLE x
LEFT JOIN PROPERTIES y ON y.name = x.name
WHERE x.age > 26
GROUP BY x.name
You want the MySQL function GROUP_CONCAT (documentation) in order to return a comma separated list of the PROPERTIES.property value.
I used a LEFT JOIN rather than a JOIN in order to include PEOPLE records that don't have a value in the PROPERTIES table - if you only want a list of people with values in the PROPERTIES table, use:
SELECT x.name,
GROUP_CONCAT(y.property SEPARATOR ', ')
FROM PEOPLE x
JOIN PROPERTIES y ON y.name = x.name
WHERE x.age > 26
GROUP BY x.name
I realize this is an example, but using a name is a poor choice for referential integrity when you consider how many "John Smith"s there are. Assigning a user_id that is unique per user would be a better choice.
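For readers who are not on MySQL: SQLite also has a group_concat aggregate, but it takes the separator as a second argument instead of the SEPARATOR keyword. The sketch below runs the LEFT JOIN query above against the question's sample data using Python's sqlite3 module. SQLite does not guarantee the order of the concatenated values, so the sketch sorts them in Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (name TEXT, age INTEGER);
    CREATE TABLE properties (name TEXT, property TEXT);
""")
conn.executemany("INSERT INTO people VALUES (?, ?)",
                 [('Jane', 27), ('Joe', 36), ('Jim', 16)])
conn.executemany("INSERT INTO properties VALUES (?, ?)",
                 [('Jane', 'Smart'), ('Jane', 'Funny'), ('Jane', 'Good-looking'),
                  ('Joe', 'Smart'), ('Joe', 'Workaholic'),
                  ('Jim', 'Funny'), ('Jim', 'Young')])

rows = conn.execute("""
    SELECT x.name, group_concat(y.property, ', ') AS props
    FROM people x
    LEFT JOIN properties y ON y.name = x.name
    WHERE x.age > ?
    GROUP BY x.name
    ORDER BY x.name
""", (26,)).fetchall()

# Split each list back into a sorted Python list for a stable comparison.
by_name = {name: sorted(props.split(', ')) for name, props in rows}
```

With the LEFT JOIN, a person without any properties would come back with props set to NULL (None in Python), so real code would need to handle that case before calling split.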
You can use INNER JOIN to link the two tables together. More info on JOINs.
SELECT *
FROM People P
INNER JOIN Properties Pr
ON Pr.Name = P.Name
WHERE P.Name = 'Joe' -- or a specific age, etc
However, it's often a lot faster to add a unique primary key to tables like these, and to create an index to increase speed.
Say the table People has a field id
And the table Properties has a field peopleId to link them together
Then the query would then look something like this:
SELECT *
FROM People P
INNER JOIN Properties Pr
ON Pr.peopleId = P.id
WHERE P.Name = 'Joe'
SELECT x.name, (SELECT GROUP_CONCAT(y.property SEPARATOR ', ')
FROM PROPERTIES y
WHERE y.name = x.name) AS Properties FROM PEOPLE x
Try this.

Simple MySQL problem

I'm working on a MySQL database that contains persons. My problem is that, (I will simplify to make my point):
I have three tables:
Persons(id int, birthdate date)
PersonsLastNames(id int, lastname varchar(30))
PersonsFirstNames(id int, firstname varchar(30))
The id is the common key. There are separate tables for last names and first names because a single person can have many first names and many last names.
I want to make a query that returns all persons with, let's say, one last name. If I go with
select birthdate, lastname, firstname from Persons, PersonsLastNames,
PersonsFirstNames where Persons.id = PersonsLastNames.id and
Persons.id = PersonsFirstNames.id and lastName = 'Anderson'
I end up with a table like
1/1/1970 Anderson Steven //Person 1
1/1/1970 Anderson David //Still Person 1
2/2/1980 Smith Adam //Person 2
3/3/1990 Taylor Ed //Person 3
When presenting this, I would like to have
1/1/1970 Anderson Steven David
2/2/1980 Smith Adam [possibly null?]
3/3/1990 Taylor Ed [possibly null?]
How do I join the tables to introduce new columns in the result set if needed to hold several first names or last names for one person?
Does your application really need to handle unlimited first/last names per person? I don't know your specific needs, but that seems like it may be a little extreme. Regardless...
Since you can't really have a dynamic number of columns returned, you could do something like this:
SELECT birthdate, lastname, GROUP_CONCAT(firstname SEPARATOR '|') AS firstnames
FROM Persons, PersonsLastNames, PersonsFirstNames
WHERE Persons.id = PersonsLastNames.id
AND Persons.id = PersonsFirstNames.id
GROUP BY Persons.id
This would return one row per person that has a last name, with the (unlimited) first names separated by a pipe (|) symbol, using the GROUP_CONCAT function.
birthdate            lastname  firstnames
---                  ---       ---
1970-01-01 00:00:00  Anderson  Steven|David
1980-02-02 00:00:00  Smith     Adam
1990-03-03 00:00:00  Taylor    Ed
SQL does not support a dynamic number of columns in the query select-list. You have to define exactly as many columns as you want (notwithstanding the * wildcard).
I recommend that you fetch the multiple names as rows, not columns. Then write some application code to loop over the result set and do whatever you want to do for presenting them.
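A minimal sketch of that approach, assuming a Python client and using SQLite in place of MySQL so that it runs as-is: the query returns one row per (last name, first name) combination, and the loop folds those rows into one record per person. The dictionary layout is only an illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Persons (id INTEGER, birthdate TEXT);
    CREATE TABLE PersonsLastNames (id INTEGER, lastname TEXT);
    CREATE TABLE PersonsFirstNames (id INTEGER, firstname TEXT);
""")
conn.executemany("INSERT INTO Persons VALUES (?, ?)",
                 [(1, '1970-01-01'), (2, '1980-02-02')])
conn.executemany("INSERT INTO PersonsLastNames VALUES (?, ?)",
                 [(1, 'Anderson'), (2, 'Smith')])
conn.executemany("INSERT INTO PersonsFirstNames VALUES (?, ?)",
                 [(1, 'Steven'), (1, 'David'), (2, 'Adam')])

# Fetch the names as rows, ordered so each person's rows are adjacent.
rows = conn.execute("""
    SELECT p.id, p.birthdate, l.lastname, f.firstname
    FROM Persons p
    JOIN PersonsLastNames l ON l.id = p.id
    JOIN PersonsFirstNames f ON f.id = p.id
    ORDER BY p.id, l.lastname, f.firstname
""").fetchall()

# Fold the rows into one record per person in application code.
people = {}
for pid, birthdate, lastname, firstname in rows:
    person = people.setdefault(
        pid, {"birthdate": birthdate, "lastnames": [], "firstnames": []})
    # The join repeats each name once per combination, so skip repeats.
    if lastname not in person["lastnames"]:
        person["lastnames"].append(lastname)
    if firstname not in person["firstnames"]:
        person["firstnames"].append(firstname)
```

The presentation layer can then print as many first-name and last-name columns as each person needs.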
The short answer is, you can't. You'll always have to pick a fixed number of columns. You can, however, greatly improve the syntax of your query by using the ON keyword. For example:
SELECT
birthdate,
firstName,
lastName
FROM
Persons
INNER JOIN PersonsLastNames
ON Persons.id = PersonsLastNames.id
INNER JOIN PersonsFirstNames
ON Persons.id = PersonsFirstNames.id
WHERE
lastName = 'Anderson'
GROUP BY
lastName, firstName
HAVING
count(lastName) = 1
Of course, my query includes a few extra provisions at the end so that only persons with only one last name specified would be grabbed, but you can always remove those.
Now, what you CAN do is choose a maximum number of these you'd like to retrieve and do something like this:
SELECT
birthdate,
lastName,
PersonsFirstNames.firstName,
IFNULL(p.firstName,''),
IFNULL(q.firstName,'')
FROM
Persons
INNER JOIN PersonsLastNames
ON Persons.id = PersonsLastNames.id
INNER JOIN PersonsFirstNames
ON Persons.id = PersonsFirstNames.id
LEFT JOIN PersonsFirstNames p
ON Persons.id = p.id
AND p.firstName <> PersonsFirstNames.firstName
LEFT JOIN PersonsFirstNames q
ON Persons.id = q.id
AND q.firstName <> PersonsFirstNames.firstName
AND q.firstName <> p.firstName
GROUP BY
lastName
But I really don't recommend that. The best bet is to retrieve multiple rows, and then iterate over them in whatever application you're using/developing.
Make sure you read up on your JOIN types (Left-vs-Inner), if you're not already familiar, before you start. Hope this helps.
EDIT: You also might want to consider, in that case, a slightly more complex GROUP BY clause, e.g.
GROUP BY
Persons.id, lastName
I think the closest thing you could do is to Group By Person.Id and then do string concatenation. Perhaps this post will help:
How to use GROUP BY to concatenate strings in MySQL?

Should I split this table into two?

I am trying to wrap my head around database normalization. This is my first time trying to create a working database, so please forgive my ignorance. I am trying to create an automated grad check system for a class project. The following table keeps track of all options for a major for a set number of catalog years. The table is as follows:
PID Title Dept Courses Must_have
Some options give the user a choice of a set number of classes out of the total listed (hence the Must_have attribute). A completed row would look like this:
PID Title Dept Courses Must_have
--------------------------------------------
1 bis acct 201|202 NULL
Title is the name of the option that can come with the major. If bis (business information systems) offered a choice of classes, that row would have a number in Must_have saying how many of the listed courses are required.
My question is should I split this table into two different tables? I know the way I currently have it seems somewhat... well wrong. Any help would be greatly appreciated.
I would break dept into a separate table and associate it with a numeric ID. Then break your "courses" field into a "join table". Something like this:
majors
Id Title DepartmentID
major_courses
Id MajorId CourseId MustHave
departments
Id Title
So that, you may have a major like:
1 bis 1
a major_course like:
1 1 201 0
2 1 202 0
3 1 203 1 -- must have 203
then departments like:
1 bis
So now, to get a list of courses for the first major you can do this:
SELECT major_courses.CourseId, major_courses.MustHave, departments.Title
FROM majors
RIGHT JOIN major_courses ON major_courses.MajorId = majors.Id
INNER JOIN departments ON departments.Id = majors.DepartmentID
WHERE majors.Id = 1
I would split it into three tables. The first would be majors and would contain PID, Title, Dept, the second would be courses, containing the course ID, course name and any other information, and the last would be a mapping between majors and courses (perhaps named courses_majors). The courses_majors table would contain the ID of the major, the ID of a course and a flag to show whether or not it is required by that major.
(This is assuming that one course could be used in multiple majors)
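The three-table layout described above could look roughly like the sketch below, written against SQLite from Python so it can be run as-is. The column names (major_id, course_id, required) and the course names are assumptions made for illustration; only the table names majors, courses and courses_majors come from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE majors (pid INTEGER PRIMARY KEY, title TEXT, dept TEXT);
    CREATE TABLE courses (id INTEGER PRIMARY KEY, name TEXT);
    -- Mapping table: one row per (major, course) pair.
    CREATE TABLE courses_majors (
        major_id  INTEGER REFERENCES majors(pid),
        course_id INTEGER REFERENCES courses(id),
        required  INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (major_id, course_id)
    );
""")
conn.execute("INSERT INTO majors VALUES (1, 'bis', 'acct')")
conn.executemany("INSERT INTO courses VALUES (?, ?)",
                 [(201, 'Course 201'), (202, 'Course 202')])
conn.executemany("INSERT INTO courses_majors VALUES (?, ?, ?)",
                 [(1, 201, 0), (1, 202, 0)])

# List the courses attached to major 1, with the required flag.
major_courses = conn.execute("""
    SELECT c.id, c.name, cm.required
    FROM courses_majors cm
    JOIN courses c ON c.id = cm.course_id
    WHERE cm.major_id = ?
    ORDER BY c.id
""", (1,)).fetchall()
```

The pipe-separated "201|202" value from the original table becomes two rows in courses_majors, so a course can be added to or removed from a major without rewriting a string.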