Primary keys for Join operation? - sql

I read from a classmate post:
“Joins are usually done using primary keys in a good database design.”
Is really using primary keys as predicate necessary for good design. I can't see how.
Thank you for your help!

Use of primary keys for a good database design could be a debate. classically according to RDBMS guideline it is recommended to create primary keys for good database design. but now a days there is a trend not to put much constraints on DB side to improve performance rather do the validations on business layer (not sure if it is true for primary keys as well).
Now coming to your question,
Primary keys are not mandatory for join operations, however it is mandatory to use columns which uniquely identifies the records of master table otherwise it can generate spurious records.
department
| dept| sub_dept | dsc |
| CS | CS | Computer sc.|
| CS | IT | Info Tech. |
student
| Name | age | sex | dept | sub_dept|
| abcd | 025 | M | CS | CS |
| wxyz | 023 | M | CS | IT |
Now if you join the tables on sub_dept you will get correct results.
select s.name, s.age, s.sex, d.dsc from student s, department d where
s.sub_dept = d.sub_dept
| Name | age | sex | dsc |
| abcd | 025 | M | Computer Sc. |
| wxyz | 023 | M | Computer Sc. |
if you join the tables on dept column you will get spurious tuples (2 extra rows)
select s.name, s.age, s.sex, d.dsc from student s, department d where s.dept = d.dept
| Name | age | sex | dsc |
| abcd | 025 | M | Computer Sc. |
| wxyz | 023 | M | Computer Sc. |
| abcd | 025 | M | Info Tech. |
| wxyz | 023 | M | Computer Sc. |

Related

SQL - specific requirement to compare tables

I'm trying to merge 2 queries into 1 (cuts the number of daily queries in half): I have 2 tables, I want to do a query against 1 table, then the same query against the other table that has the same list just less entries.
Basically its a list of (let's call it for obfuscation) people and hobby. One table is ALL people & hobby, the other shorter list is people & hobby that I've met. Table 2 would all be found in table 1. Table 1 includes entries (people I have yet to meet) not found in table 2
The tables are synced up from elsewhere, what I'm looking to do is print a list of ALL people in the first column then print the hobby ONLY of people that are on both lists. That way I can see the lists merged, and track the rate at which the gap between both lists is closing. I have tried a number of SQL combinations but they either filter out the first table and match only items that are true for both (i.e. just giving me table 2) or just adding table 2 to table 1.
Example of what I'm trying to do below:
+---------+----------+--+----------+---------+--+---------+----------+
| table1 | | | table2 | | | query | |
+---------+----------+--+----------+---------+--+---------+----------+
| name | hobby | | activity | person | | name | hobby |
| bob | fishing | | fishing | bob | | bob | fishing |
| bill | vidgames | | hiking | sarah | | bill | |
| sarah | hiking | | planking | sabrina | | sarah | hiking |
| mike | cooking | | | | | mike | |
| sabrina | planking | | | | | sabrina | planking |
+---------+----------+--+----------+---------+--+---------+----------+
Normally I'd just take the few days to learn SQL a bit better however I'm stretched pretty thin at work as it is!
I should mention the table 2 is flipped and the headings are all unique (don't think this matters)!
I think you just want a left join:
select t1.name, t2.activity as hobby
from table1 t1 left join
table2 t2
on t1.name = t2.person;

How to select table with a concatenated column?

I have the following data:
select * from art_skills_table;
+----+------+---------------------------+
| ID | Name | skills |
+----+------+---------------------------|
| 1 | Anna | ["painting","photography"]|
| 2 | Bob | ["drawing","sculpting"] |
| 3 | Cat | ["pastel"] |
+----+------+---------------------------+
select * from computer_table;
+------+------+-------------------------+
| ID | Name | skills |
+------+------+-------------------------+
| 1 | Anna | ["word","typing"] |
| 2 | Cat | ["code","editing"] |
| 3 | Bob | ["excel","code"] |
+------+------+-------------------------+
I would like to write an SQL statement which results in the following table.
+------+------+-----------------------------------------------+
| ID | Name | skills |
+------+------+-----------------------------------------------+
| 1 | Anna | ["painting","photography","word","typing"] |
| 2 | Bob | ["drawing","sculpting","excel","code"] |
| 3 | Cat | ["pastel","code","editing"] |
+------+------+-----------------------------------------------+
I've tried something like SELECT * from art_skills_table LEFT JOIN computer_table ON name. However it doesn't give what I need. I've read about array_cat but I'm having a bit of trouble implementing it.
if the skills column from both tables are arrays, then you should be able to get away with this:
SELECT a.ID, a.name, array_cat(a.skills, c.skills)
FROM art_skills_table a LEFT JOIN computer_table c
ON c.id = a.id
That said, While you used LEFT join in your sample, I think either an INNER or FULL (OUTER) join might serve you better.
First, i wondered why the data are stored in such a model.
Was of the opinion that NoSQL databases lack ability for joins and ...
... a semantic triple would be in the form of subject–predicate–object.
... a Key-value (KV) stores use associative arrays.
... a relational database would be normalized.
A few information about the use case would have helped.
Nevertheless, you can select the data with CONCAT and REPLACE for the desired form.
SELECT art_skills_table.ID, computer_table.name,
CONCAT(
REPLACE(art_skills_table.skills, '}',','),
REPLACE(computer_table.skills, '{','')
)
FROM art_skills_table JOIN computer_table ON art_skills_table.ID = computer_table.ID
The query returns the following result:
+----+------+--------------------------------------------+
| ID | Name | Skills |
+----+------+--------------------------------------------+
| 1 | Anna | {"painting","photography","word","typing"} |
| 2 | Cat | {"drawing","sculpting","code","editing"} |
| 3 | Bob | {"pastel","excel","code"} |
+----+------+--------------------------------------------+
I've used the ID for the JOIN, even though Bob has different values.
The JOIN should probably be done over the name.
JOIN computer_table ON art_skills_table.Name = computer_table.Name
BTW, you need to tell us what SQL engine you're running on.

Query M:N contains

I am trying to filter a set of tables that includes an M:N junction table in Android Room (SQLite).
An image can have many subjects. I'd like to allow filtering by a subject, so that I get a row with complete image information (including all subjects). So if an image had (National Park, Yosemite) filtering for either would result in one row with both keywords. Unless I messed something up, a typical join will result in multiple rows such that matching Yosemite would get the right image, but you'd be lacking National Park. I came up with this:
SELECT *,
(SELECT GROUP_CONCAT(name)
FROM meta_subject_junction
JOIN subject
ON subject.id = meta_subject_junction.subjectId
WHERE meta_subject_junction.metaId = meta.id) AS keywords,
(SELECT documentUri
FROM image_parent
WHERE meta.parentId = image_parent.id ) AS parentUri
FROM meta
Now this gets me the complete rows, but I think at this point I'd need to:
WHERE keywords LIKE(%YOSEMITE%)
and I think the LIKE is less than ideal, not to mention an imprecise match. Is there a better way to accomplish this? Thanks, this is bending my novice SQL brain.
Further details
meta
+----+----------+--+
| id | name | |
+----+----------+--+
| 1 | yosemite | |
| 2 | bryce | |
| 3 | flowers | |
+----+----------+--+
subject
+----+---------------+--+
| id | name | |
+----+---------------+--+
| 1 | National Park | |
| 2 | Yosemite | |
| 3 | Tulip | |
+----+---------------+--+
junction
+--------+-----------+
| metaId | subjectId |
+--------+-----------+
| 1 | 1 |
| 1 | 2 |
| 2 | 1 |
| 3 | 3 |
+--------+-----------+
Although I may have done something wrong, as far as I can tell Android Room doesn't like:
+----+-----------+---------------+
| id | name | subject |
+----+-----------+---------------+
| 1 | yosemite | National Park |
| 1 | yosemite | Yosemite |
+----+-----------+---------------+
so I'm trying to reduce the rows:
+----+-----------+-------------------------+
| id | name | subject |
+----+-----------+-------------------------+
| 1 | yosemite | National Park, Yosemite |
+----+-----------+-------------------------+
which the above query does. However, I also want to query for a subject. So that National Park filter will yield:
+----+-----------+-------------------------+
| id | name | subject |
+----+-----------+-------------------------+
| 1 | yosemite | National Park, Yosemite |
| 2 | bryce | National Park |
+----+-----------+-------------------------+
I'd like to be more precise/efficient than LIKE with the already 'concat' subject. Most of my attempts end up with no results in Room (multi-row) or reducing the subject to only the filter keyword.
Update
Here's a test I've been using to compare the actual SQL results from a query to what Android Room ends up with:
http://sqlfiddle.com/#!7/0ac11/10/0
That join query is interpreted as four objects in Android Room, so I'm trying to reduce the rows, but retain the full subject results while filtering for any image containing the subject keyword.
If you want multiple keywords, then where and group by and having can be used:
select image_id
from image_subject
where subject_id in ('a', 'b', 'c') -- whatever
group by image-id
having count(distinct subject_id) = 3; -- same count as in `where`
This gets the result I need, though I'd love to hear a better option if this is particularly inefficient.
SELECT meta.*,
(SELECT GROUP_CONCAT(name)
FROM junction
JOIN subject
ON subject.id = junction.subjectId
WHERE junction.metaId = meta.id) AS keywords,
junction.subjectId
FROM meta
LEFT JOIN junction ON junction.metaId = meta.id
WHERE subjectId IN (1,2)
GROUP BY meta.id
+----+----------+------------------------+-----------+
| id | name | keywords | subjectId |
+----+----------+------------------------+-----------+
| 1 | yosemite | National Park,Yosemite | 2 |
| 2 | bryce | National Park | 1 |
+----+----------+------------------------+-----------+
http://sqlfiddle.com/#!7/86a76/13

Using a table to lookup multiple IDs on one row

I have two tables I am using at work to help me gain experience in writing SQL queries. One table contains a list of Applications and has three columns -
Application_Name, Application_Contact_ID and Business_Contact_ID. I then have a separate table called Contacts with two columns - Contact_ID and Contact_Name. I am trying to write a query that will list the Application_Name and Contact_Name for both the Applications_Contact_ID and Business_Contact_ID columns instead of the ID number itself.
I understand I need to JOIN the two tables but I haven't quite figured out how to formulate the correct statement. Help Please!
APPLICATIONS TABLE:
+------------------+------------------------+---------------------+
| Application_Name | Application_Contact_ID | Business_Contact_ID |
+------------------+------------------------+---------------------+
| Adobe | 23 | 23 |
| Word | 52 | 14 |
| NotePad++ | 44 | 989 |
+------------------+------------------------+---------------------+
CONTACTS TABLE:
+------------+--------------+
| Contact_ID | Contact_Name |
+------------+--------------+
| 23 | Tim |
| 52 | John |
| 14 | Jen |
| 44 | Carl |
| 989 | Sam |
+------------+--------------+
What I am trying to get is:
+------------------+--------------------------+-----------------------+
| Application_Name | Application_Contact_Name | Business_Contact_Name |
+------------------+--------------------------+-----------------------+
| Adobe | Tim | Tim |
| Word | John | Jen |
| NotePad++ | Carl | Sam |
+------------------+--------------------------+-----------------------+
I've tried the below but it is only returning the name for one of the columns:
SELECT Application_Name, Application_Contact_ID, Business_Contact_ID, Contact_Name
FROM Applications
JOIN Contact ON Contact_ID = Application_Contact_ID
This is a pretty critical and 101 part of SQL. Consider reading this other answer on a different question, which explains the joins in more depth. The trick to your query, is that you have to join the CONTACTS table twice, which is a bit hard to visualize, because you have to go there for both the application_contact_id and business_contact_id.
There are many flavors of joins (INNER, LEFT, RIGHT, etc.), which you'll want to familiarize yourself with for the future reference. Consider reading this article at the very least: https://www.techonthenet.com/sql_server/joins.php.
SELECT t1.application_name Application_Name,
t2.contact_name Application_Contact_name,
t3.contact_name Business_Contact_name
FROM applications t1
INNER JOIN contacts ON t2 t1.Application_Contact_ID = t2.contact_id -- join contacts for appName
INNER JOIN contacts ON t3 t1.business_Contact_ID = t3.contact_id; -- join contacts for busName

SQL: Joining two tables with email adresses in SQL Server [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
Closed 9 years ago.
Improve this question
I have spent hours researching how to write the proper SQL for the following task, and finally I feel that I have to get some help as this is the most complex SQL query I have ever had to write :(
I am putting together an email list with all the email adresses that we have on our customers.
I have two tables: one customer table that contains customer level information, and one contact person table that contains person level information. Most of the data is overlapping, so the same email adress will occure in both tables. But the email adress field can be empty in both tables, and I do not want to return any empty rows.
Users that buy in our physical store are often only registered in the customer level table, but users that buys online are always registered both in the customer level table and the person level table.
I want to create a full list where I get all email adresses, where all email adresses are unique, no email adresses are duplicates and no email adresses are null.
Also I want to join in columns from the customer table when the data is retrieved from the person table (the zip code in my example below).
Customers
| CustomerID | Firstname | Lastname | Email | Zipcode |
| 22 | Jeff | Carson | jeffcar#mail.com | 81712 |
| 29 | John | Doe | null | 51211 |
| 37 | Gina | Andersen | null | 21147 |
| 42 | Brad | Cole | brad#company.org | 39261 |
Contact persons
| PersonID | CustomerID | Firstname | Lastname | Email |
| 8712 | 22 | Jeff | Carson | null || 8916 | 29 | Jane | Doe | jane#doe.net || 8922 | 29 | Danny | Doe | null |
| 9181 | 37 | Gina | Andersen | gina#gmail.com |
| 9515 | 37 | Ben | Andersen | ben88#gmail.com |
I want to join the tables to generate the following:
Final table
| PersonID | CustomerID | Firstname | Lastname | Email | Zipcode |
| 8712 | 22 | Jeff | Carson | jeffcar#mail.com | 81712 |
| 8916 | 29 | Jane | Doe | jane#doe.net | 51211 |
| 9181 | 37 | Gina | Andersen | gina#gmail.com | 21147 |
| 9515 | 37 | Ben | Andersen | ben88#gmail.com | 21147 |
| null | 42 | Brad | Cole | brad#company.org | 39261 |
I guessed this would be a fairly common task to do, but I haven't found anyone with a similar question, so I put my trust in the expertise out there.
This SQL will get you exactly the results table you were looking for. I've made a live demo you can play with here at SQLFiddle.
SELECT
ContactPerson.PersonID,
Customer.CustomerID,
COALESCE(ContactPerson.FirstName, Customer.FirstName) AS FirstName,
COALESCE(ContactPerson.LastName, Customer.LastName) AS LastName,
COALESCE(ContactPerson.Email, Customer.Email) AS Email,
Customer.ZipCode
FROM Customer
LEFT JOIN ContactPerson
ON ContactPerson.CustomerID = Customer.CustomerID
WHERE COALESCE(ContactPerson.Email, Customer.Email) IS NOT NULL
Results (identical to your desired results):
| PersonID | CustomerID | FirstName | LastName | Email | ZipCode |
| 8712 | 22 | Jeff | Carson | jeffcar#mail.com | 81712 |
| 8916 | 29 | Jane | Doe | jane#doe.net | 51211 |
| 9181 | 37 | Gina | Andersen | gina#gmail.com | 21147 |
| 9515 | 37 | Ben | Andersen | ben88#gmail.com | 21147 |
| NULL | 42 | Brad | Cole | brad#company.org | 39261 |
A quick explanation of some key points to aid understanding:
The query uses a LEFT JOIN to join the two tables together. JOINs are pretty common once you get into SQL problems like this. I won't go into an in-depth explanation here: now that you know what they are called you should have no trouble Googling for loads of info on them!
NB: COALESCE basically means 'the first one of these options which isn't null' (docs). So this query will grab their name and email address from ContactPerson IF POSSIBLE, otherwise from Customer. If NEITHER of these tables hold an email address, then the WHERE clause makes sure that record isn't included at all, as required.
This will work:
SELECT b.PersonID
,a.CustomerID
,a.FirstName
,a.LastName
,COALESCE(a.Email,b.Email) AS Email
,a.ZipCode
FROM Customers a
LEFT JOIN Contact b
ON a.CustomerID = b.CustomerID
WHERE COALESCE(a.Email, b.Email) IS NOT NULL
Demo: SQL Fiddle
select con.personid,
con.customerid,
con.firstname,
con.lastname,
coalesce(con.email, cus.email) email,
cus.zipcode
from contact_persons con
right join
customers cus
on con.customerid = cus.customerid