mysql select in two tables - sql

I have two tables and one reference table for the query. Any suggestion or help would be greatly appreciated.
table1
user_id username firstname lastname address
1 john867 John Smith caloocan
2 bill96 Bill Jones manila
table2
user_name_id username firstname lastname address designation
1 jakelucas Jake Lucas caloocan employee
2 jadejones Jade Jones Quezon student
3 bong098 Bong Johnson pasig employee
reference table
ref_id username friend_username
1 tirso bill96
2 tirso jadejones
2 tirso bong098
The output should look like this:
user_id user_name_id username firstname lastname address designation
2 bill96 Bill Jones manila
2 jadejones Jade Jones Quezon student
3 bong098 Bong Johnson pasig employee

Since some decent union queries have already been posted, I'll talk about your db design a little bit.
I would definitely take what IronGoofy said into serious consideration before you take too much time looking into joining these tables together. It seems that you have a lot of duplicate data to manage with your tables, and that could get out of hand rather quickly should this scale up.
I think you should probably try and separate your data out so that the important information can be linked on the user_id.
So, for instance, you could have a few tables here...
User Information Table:
---------
User_id
Username
First Name
Last Name
Address
Designation_id
Friend Link Table:
---------
Friend_link_id
User_id
Friend_user_id
Designation Table:
---------
Designation_id
Designation_name
So, rather than linking on your user names all over the place, you would simply join on the various IDs. That is a bit cleaner and avoids the duplicate data issue that you had before, IMO. Hope this helps...
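A minimal sketch of how the three-table layout above could be queried, run through Python's sqlite3 so it can be tried without a MySQL server. The table names (user_info, friend_link, designation), the snake_case column names and the row for tirso are assumptions made for illustration; only the friends' data comes from the question.

```python
import sqlite3

# Hypothetical normalized schema; names are illustrative, not from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE designation (
    designation_id   INTEGER PRIMARY KEY,
    designation_name TEXT NOT NULL
);
CREATE TABLE user_info (
    user_id        INTEGER PRIMARY KEY,
    username       TEXT NOT NULL UNIQUE,
    first_name     TEXT,
    last_name      TEXT,
    address        TEXT,
    designation_id INTEGER REFERENCES designation(designation_id)
);
CREATE TABLE friend_link (
    friend_link_id INTEGER PRIMARY KEY,
    user_id        INTEGER NOT NULL REFERENCES user_info(user_id),
    friend_user_id INTEGER NOT NULL REFERENCES user_info(user_id)
);
INSERT INTO designation VALUES (1, 'employee'), (2, 'student');
INSERT INTO user_info VALUES
    (1, 'tirso',     NULL,   NULL,      NULL,     NULL),
    (2, 'bill96',    'Bill', 'Jones',   'manila', NULL),
    (3, 'jadejones', 'Jade', 'Jones',   'Quezon', 2),
    (4, 'bong098',   'Bong', 'Johnson', 'pasig',  1);
INSERT INTO friend_link VALUES (1, 1, 2), (2, 1, 3), (3, 1, 4);
""")

# All friends of one user: every join is on an integer id, never on a username.
friends = conn.execute("""
    SELECT f.user_id, f.username, f.first_name, f.last_name, f.address,
           d.designation_name
    FROM friend_link fl
    JOIN user_info u ON u.user_id = fl.user_id
    JOIN user_info f ON f.user_id = fl.friend_user_id
    LEFT JOIN designation d ON d.designation_id = f.designation_id
    WHERE u.username = ?
    ORDER BY f.user_id
""", ("tirso",)).fetchall()

for row in friends:
    print(row)
```

With a single user table there is no need for a UNION at all, and users without a designation simply come back with a NULL designation name.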

Can you try something like this
SELECT table1.user_id,
NULL AS user_name_id,
table1.username,
table1.firstname,
table1.lastname,
table1.address,
NULL AS designation
FROM reference_table INNER JOIN
table1 ON reference_table.friend_username = table1.username
UNION
SELECT NULL AS user_id,
table2.user_name_id,
table2.username,
table2.firstname,
table2.lastname,
table2.address,
table2.designation
FROM reference_table INNER JOIN
table2 ON reference_table.friend_username = table2.username

It's not quite clear what you are trying to achieve, but here's a guess:
SELECT t1.user_id, NULL as user_name_id, t1.username, ...
FROM ref_tab r join table1 t1 on r.friend_username = t1.username
WHERE r.ref_id = 1
UNION
SELECT NULL as user_id, t2.user_name_id, t2.username, ...
FROM ref_tab r join table2 t2 on r.friend_username = t2.username
WHERE r.ref_id = 2
But I'd have a hard look at the DB design and think about some improvements ...
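To check a UNION of this shape against the sample data from the question, here is a small sketch using Python's sqlite3 in place of MySQL. Filtering on the reference table's username column instead of ref_id, the table name reference_table, and ordering the combined result by username are assumptions on my part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (user_id INTEGER, username TEXT, firstname TEXT,
                     lastname TEXT, address TEXT);
CREATE TABLE table2 (user_name_id INTEGER, username TEXT, firstname TEXT,
                     lastname TEXT, address TEXT, designation TEXT);
CREATE TABLE reference_table (ref_id INTEGER, username TEXT,
                              friend_username TEXT);
INSERT INTO table1 VALUES
    (1, 'john867', 'John', 'Smith', 'caloocan'),
    (2, 'bill96',  'Bill', 'Jones', 'manila');
INSERT INTO table2 VALUES
    (1, 'jakelucas', 'Jake', 'Lucas',   'caloocan', 'employee'),
    (2, 'jadejones', 'Jade', 'Jones',   'Quezon',   'student'),
    (3, 'bong098',   'Bong', 'Johnson', 'pasig',    'employee');
-- ref_id values copied as posted in the question
INSERT INTO reference_table VALUES
    (1, 'tirso', 'bill96'),
    (2, 'tirso', 'jadejones'),
    (2, 'tirso', 'bong098');
""")

# Each branch pads the columns the other table lacks with NULL so that
# both SELECT lists have the same shape.
rows = conn.execute("""
    SELECT t1.user_id, NULL AS user_name_id, t1.username, t1.firstname,
           t1.lastname, t1.address, NULL AS designation
    FROM reference_table r
    JOIN table1 t1 ON r.friend_username = t1.username
    WHERE r.username = :u
    UNION ALL
    SELECT NULL, t2.user_name_id, t2.username, t2.firstname,
           t2.lastname, t2.address, t2.designation
    FROM reference_table r
    JOIN table2 t2 ON r.friend_username = t2.username
    WHERE r.username = :u
    ORDER BY 3
""", {"u": "tirso"}).fetchall()

for row in rows:
    print(row)
```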

Related

SQL: Compare 2 tables and state if data was found

I'm a novice when it comes to SQL, so forgive me if this is a dumb question.
I have 2 tables, one with a list of users, and one that holds email history data.
Users Table:
userID fName lName ...
1 John Smith
2 Jane Doe
3 Kevin Cooper
Email History Table:
emailID userID subject sendDate ...
1 6 welcome 2020-10-17
2 3 hello 2020-10-20
3 7 welcome 2020-10-23
I want a select statement that compares every user in table 1 against the emails in table 2 based on a search condition (in this case subject = "hello" and sendDate = "2020-10-20") and returns something like this:
Returned Query:
userID fName lName ... emailSent?
1 John Smith ... No
2 Jane Doe ... No
3 Kevin Cooper ... Yes
One option uses exists and a correlated subquery:
select u.*,
exists (
select 1
from emailhistory eh
where eh.userid = u.userid and eh.subject = 'hello' and eh.senddate = '2020-10-20'
) emailSent
from users u
This gives you 0/1 values in column emailSent, where 1 indicates that a match exists. Compared to the left join approach, the upside is that it does not "multiply" the user rows if more than one match is found in the history table.
For performance, consider an index on emailhistory(userid, subject, senddate).
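The row-multiplication point can be seen with a small experiment. This is only a sketch using Python's sqlite3: the index name and the fourth emailhistory row (a second matching email for user 3) are assumptions added to force a duplicate match, and the EXISTS result is wrapped in CASE to get the Yes/No text the question asked for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (userid INTEGER PRIMARY KEY, fname TEXT, lname TEXT);
CREATE TABLE emailhistory (emailid INTEGER PRIMARY KEY, userid INTEGER,
                           subject TEXT, senddate TEXT);
INSERT INTO users VALUES (1, 'John', 'Smith'), (2, 'Jane', 'Doe'),
                         (3, 'Kevin', 'Cooper');
INSERT INTO emailhistory VALUES
    (1, 6, 'welcome', '2020-10-17'),
    (2, 3, 'hello',   '2020-10-20'),
    (3, 7, 'welcome', '2020-10-23'),
    (4, 3, 'hello',   '2020-10-20');  -- extra row, not in the question
CREATE INDEX idx_emailhistory_lookup
    ON emailhistory (userid, subject, senddate);
""")
params = ("hello", "2020-10-20")

# EXISTS: exactly one row per user, however many emails match.
exists_rows = conn.execute("""
    SELECT u.userid, u.fname, u.lname,
           CASE WHEN EXISTS (
               SELECT 1 FROM emailhistory eh
               WHERE eh.userid = u.userid
                 AND eh.subject = ? AND eh.senddate = ?
           ) THEN 'Yes' ELSE 'No' END AS emailSent
    FROM users u
    ORDER BY u.userid
""", params).fetchall()

# LEFT JOIN: user 3 comes back twice because two history rows match.
join_rows = conn.execute("""
    SELECT u.userid,
           CASE WHEN eh.emailid IS NULL THEN 'No' ELSE 'Yes' END AS emailSent
    FROM users u
    LEFT JOIN emailhistory eh
           ON eh.userid = u.userid AND eh.subject = ? AND eh.senddate = ?
    ORDER BY u.userid
""", params).fetchall()

print(len(exists_rows), len(join_rows))
```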
You can left join the email table on, putting your date and subject criteria in the ON clause of the join:
SELECT
u.userid,
u.fname,
u.lname,
case when eh.emailid is null then 'No' else 'Yes' end as emailsent
FROM
users u
LEFT JOIN
emailhistory eh
ON
u.userid = eh.userid AND
eh.subject = 'hello' AND
eh.senddate = '2020-10-20'
This conceptually filters the email table down to just that subject and day, then joins those records onto the users table. You get every row from users and only the rows from emailhistory that match on userid and also have that subject/date. You can then examine whether the emailid (the key of the email table) is null or not; the only way it can be null is if no email was sent to that user on that date with that subject.
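Where the subject/date filter sits matters for a left join. The sketch below (Python's sqlite3, sample data from the question) compares the filter in the ON clause with the same filter moved to a WHERE clause; the variable names are mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (userid INTEGER PRIMARY KEY, fname TEXT, lname TEXT);
CREATE TABLE emailhistory (emailid INTEGER PRIMARY KEY, userid INTEGER,
                           subject TEXT, senddate TEXT);
INSERT INTO users VALUES (1, 'John', 'Smith'), (2, 'Jane', 'Doe'),
                         (3, 'Kevin', 'Cooper');
INSERT INTO emailhistory VALUES
    (1, 6, 'welcome', '2020-10-17'),
    (2, 3, 'hello',   '2020-10-20'),
    (3, 7, 'welcome', '2020-10-23');
""")
params = ("hello", "2020-10-20")

# Filter in ON: unmatched users are kept, with NULLs on the email side.
on_rows = conn.execute("""
    SELECT u.userid,
           CASE WHEN eh.emailid IS NULL THEN 'No' ELSE 'Yes' END
    FROM users u
    LEFT JOIN emailhistory eh
           ON eh.userid = u.userid AND eh.subject = ? AND eh.senddate = ?
    ORDER BY u.userid
""", params).fetchall()

# Filter in WHERE: rows whose email side is NULL fail the test and vanish,
# so the left join behaves like an inner join.
where_rows = conn.execute("""
    SELECT u.userid,
           CASE WHEN eh.emailid IS NULL THEN 'No' ELSE 'Yes' END
    FROM users u
    LEFT JOIN emailhistory eh ON eh.userid = u.userid
    WHERE eh.subject = ? AND eh.senddate = ?
    ORDER BY u.userid
""", params).fetchall()

print(on_rows)
print(where_rows)
```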

Rows as Columns without Join

Please have a look at this. The result is, indeed, a join of two sets. I want the output as follows, i.e. with no Cartesian product.
ID_1 TYPE_1 NAME_1 ID_2 TYPE_2 NAME_2
===============================================================
TP001 1 Adam Smith TV001 2 Leon Crowell
TP002 1 Ben Conrad TV002 2 Chris Hobbs
TP003 1 Lively Jonathan
I used a join, the one solution I know for selecting rows as columns, but I need the results in the format above; a join is not mandatory.
You need an artificial column as id. Use rownum for that on both types of teachers.
Because you do not know if there are more Teachers of type 1 or of Type 2, you must do a full outer join to combine both sets.
SELECT *
FROM (SELECT ROWNUM AS cnt, teacherid
, teachertype, teachername
FROM teachers
WHERE teachertype = 1) qry1
FULL OUTER JOIN (SELECT ROWNUM AS cnt, teacherid
, teachertype, teachername
FROM teachers
WHERE teachertype = 2) qry2
ON qry1.cnt = qry2.cnt
In general, databases think in rows, not in columns. In your example you are lucky - you only have two types of teachers. For every new type of teacher you would have to alter your statement and append a full outer join only to present the output of your query in a special way - one set per column.
But with a simple select you retrieve the same information, and it will work regardless of how many teacher types you have.
SQL is somewhat limited in presenting data; I would leave that to the client retrieving the data, or use PL/SQL for a more generic approach.
There should be some key constraint on which you join the table or tables. If there is no join condition, the result is always a Cartesian product, i.e. the number of rows of the first table times the number of rows of the second table.
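As a rough illustration of leaving the layout to the client, the sketch below runs the simple select and pairs the rows up in Python. SQLite stands in for Oracle here, and the teachers table definition is an assumption reconstructed from the column names and output shown in the question.

```python
import sqlite3
from itertools import zip_longest

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teachers (teacherid TEXT PRIMARY KEY, teachertype INTEGER,
                       teachername TEXT);
INSERT INTO teachers VALUES
    ('TP001', 1, 'Adam Smith'),
    ('TP002', 1, 'Ben Conrad'),
    ('TP003', 1, 'Lively Jonathan'),
    ('TV001', 2, 'Leon Crowell'),
    ('TV002', 2, 'Chris Hobbs');
""")

# One plain query, no join; works for any number of teacher types.
rows = conn.execute("""
    SELECT teacherid, teachertype, teachername
    FROM teachers
    ORDER BY teachertype, teacherid
""").fetchall()

# Group the rows by teacher type, keeping the query's order.
by_type = {}
for row in rows:
    by_type.setdefault(row[1], []).append(row)

# Lay the groups side by side; shorter groups are padded with NULLs.
blank = (None, None, None)
paired = [sum(group, ())
          for group in zip_longest(*by_type.values(), fillvalue=blank)]

for line in paired:
    print(line)
```

Adding a third teacher type would only add three more columns to each output line; neither the query nor the pairing code needs to change.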
SELECT TONE.TEACHERID ID_1, TONE.TEACHERTYPE TYPE_1, TONE.TEACHERNAME NAME_1
,TTWO.TEACHERID ID_2, TTWO.TEACHERTYPE TYPE_2, TTWO.TEACHERNAME NAME_2
FROM
(SELECT TEACHERID, TEACHERTYPE, TEACHERNAME FROM TEACHERS WHERE TEACHERTYPE = 1)
TONE
FULL OUTER JOIN
(SELECT TEACHERID, TEACHERTYPE, TEACHERNAME FROM TEACHERS WHERE TEACHERTYPE = 2)
TTWO
ON TONE.TEACHERID = REPLACE(TTWO.TEACHERID,'TV','TP');
ID_1 TYPE_1 NAME_1 ID_2 TYPE_2 NAME_2
===== ====== ====== ====== ====== ======
TP001 1 Adam Smith TV001 2 Leon Crowell
TP002 1 Ben Conrad TV002 2 Chris Hobbs
TP003 1 Lively Jonathan (null) (null) (null)
http://www.sqlfiddle.com/#!4/c58f3/28

write a query to identify discrepancy

I have a table with Student IDs and Student Names. There have been issues with assigning unique Student IDs to students, and hence I want to find the duplicates.
Here is the sample Table:
Student ID Student Name
1 Jack
1 John
1 Bill
2 Amanda
2 Molly
3 Ron
4 Matt
5 James
6 Kathy
6 Will
Here I want a third column "Duplicate_Count" to display the count of duplicate records.
For example, "Duplicate_Count" would display "3" for Student ID = 1, and so on. How can I do this?
Thanks in advance
Select StudentId, Count(*) DupCount
From Table
Group By StudentId
Having Count(*) > 1
Order By Count(*) desc
Select
aa.StudentId, aa.StudentName, bb.DupCount
from
Table as aa
join
(
Select StudentId, Count(*) as DupCount from Table group by StudentId
) as bb
on aa.StudentId = bb.StudentId
The virtual table gives the count for each StudentId, this is joined back to the original table to add the count to each student record.
If you want to add a column to the table to hold dupcount, this query can be used in an update statement to update that column in the table
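Here is a small runnable sketch of the join-back idea using the sample data from the question, via Python's sqlite3. The table name students and the snake_case column names are placeholders I picked, since Table is not usable as a real table name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (student_id INTEGER, student_name TEXT);
INSERT INTO students VALUES
    (1, 'Jack'), (1, 'John'), (1, 'Bill'),
    (2, 'Amanda'), (2, 'Molly'),
    (3, 'Ron'), (4, 'Matt'), (5, 'James'),
    (6, 'Kathy'), (6, 'Will');
""")

# Per-id counts from a derived table, joined back onto every student row.
with_counts = conn.execute("""
    SELECT s.student_id, s.student_name, c.dup_count
    FROM students s
    JOIN (SELECT student_id, COUNT(*) AS dup_count
          FROM students
          GROUP BY student_id) c
      ON c.student_id = s.student_id
    ORDER BY s.student_id, s.student_name
""").fetchall()

# Only the ids that really are shared by more than one student.
duplicated_ids = conn.execute("""
    SELECT student_id, COUNT(*) AS dup_count
    FROM students
    GROUP BY student_id
    HAVING COUNT(*) > 1
    ORDER BY dup_count DESC, student_id
""").fetchall()

print(with_counts[:3])
print(duplicated_ids)
```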
This should work:
update mytable
set duplicate_count = (select count(*) from mytable t where t.id = mytable.id)
UPDATE:
As mentioned by @HansUp, adding a new column with the duplicate count probably doesn't make sense, but that really depends on what the OP originally thought of using it for. I'm leaving the answer in case it is of help for someone else.

Sql COALESCE entire rows?

I just learned about COALESCE and I'm wondering if it's possible to COALESCE an entire row of data between two tables? If not, what's the best approach to the following ramblings?
For instance, I have these two tables and assuming that all columns match:
tbl_Employees
Id Name Email Etc
-----------------------------------
1 Sue ... ...
2 Rick ... ...
tbl_Customers
Id Name Email Etc
-----------------------------------
1 Bob ... ...
2 Dan ... ...
3 Mary ... ...
And a table with id's:
tbl_PeopleInCompany
Id CompanyId
-----------------
1 1
2 1
3 1
And I want to query the data in a way that gets rows from the first table with matching id's, but gets from second table if no id is found.
So the resulting query would look like:
Id Name Email Etc
-----------------------------------
1 Sue ... ...
2 Rick ... ...
3 Mary ... ...
Where Sue and Rick was taken from the first table, and Mary from the second.
SELECT Id, Name, Email, Etc FROM tbl_Employees
WHERE Id IN (SELECT Id FROM tbl_PeopleInCompany)
UNION ALL
SELECT Id, Name, Email, Etc FROM tbl_Customers
WHERE Id IN (SELECT Id FROM tbl_PeopleInCompany) AND
Id NOT IN (SELECT Id FROM tbl_Employees)
Depending on the number of rows, there are several different ways to write these queries (with JOIN and EXISTS), but try this first.
This query first selects all the people from tbl_Employees that have an Id value in your target list (the table tbl_PeopleInCompany). It then adds to the "bottom" of this bunch of rows the results of the second query. The second query gets all tbl_Customers rows with Ids in your target list, excluding any with Ids that appear in tbl_Employees.
The total list contains the people you want: all Ids from tbl_PeopleInCompany, with preference given to Employees and missing records pulled from Customers.
You can also do this:
1) Left join tbl_PeopleInCompany to both tbl_Employees and tbl_Customers on Id. This will give you one row per Id in the list and leave the tbl_Employees columns null if there is no matching employee row.
2) Use CASE WHEN to select either the tbl_Employees column or the tbl_Customers column, based on whether tbl_Employees.Id IS NULL, like this:
CASE WHEN tbl_Employees.Id IS NULL THEN tbl_Customers.Name ELSE tbl_Employees.Name END AS Name
(My syntax might not be perfect there, but the technique is sound).
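A sketch of the join-plus-CASE technique on the question's sample data, run through Python's sqlite3. It assumes the Id list in tbl_PeopleInCompany drives the join, so that Ids missing from tbl_Employees still show up, and that the employee row wins when both exist. The Email and Etc columns are left out because the question elides their values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_Employees (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE tbl_Customers (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE tbl_PeopleInCompany (Id INTEGER, CompanyId INTEGER);
INSERT INTO tbl_Employees VALUES (1, 'Sue'), (2, 'Rick');
INSERT INTO tbl_Customers VALUES (1, 'Bob'), (2, 'Dan'), (3, 'Mary');
INSERT INTO tbl_PeopleInCompany VALUES (1, 1), (2, 1), (3, 1);
""")

# One row per listed Id; take the whole row from the employee when one
# exists, otherwise fall back to the customer.
rows = conn.execute("""
    SELECT p.Id,
           CASE WHEN e.Id IS NULL THEN c.Name ELSE e.Name END AS Name
    FROM tbl_PeopleInCompany p
    LEFT JOIN tbl_Employees e ON e.Id = p.Id
    LEFT JOIN tbl_Customers c ON c.Id = p.Id
    WHERE p.CompanyId = ?
    ORDER BY p.Id
""", (1,)).fetchall()

print(rows)
```

Testing e.Id rather than each column keeps the choice row-wise: an employee with a NULL Email would keep that NULL instead of picking up the customer's value, which is what a per-column COALESCE(e.Email, c.Email) would do.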
This should be pretty performant. It uses a CTE to basically build a small table of Customers that have no matching Employee records, and then it simply UNIONs that result with the Employee records
;WITH FilteredCustomers (Id, Name, Email, Etc)
AS
(
SELECT C.Id, C.Name, C.Email, C.Etc
FROM tbl_Customers C
INNER JOIN tbl_PeopleInCompany PIC
ON C.Id = PIC.Id
LEFT JOIN tbl_Employees E
ON C.Id = E.Id
WHERE E.Id IS NULL
)
SELECT E.Id, E.Name, E.Email, E.Etc
FROM tbl_Employees E
INNER JOIN tbl_PeopleInCompany PIC
ON E.Id = PIC.Id
UNION
SELECT Id, Name, Email, Etc
FROM FilteredCustomers
Using the IN Operator can be rather taxing on large queries as it might have to evaluate the subquery for each record being processed.
I don't think the COALESCE function can be used for what you're thinking. COALESCE is similar to ISNULL, except it allows you to pass in multiple columns, and will return the first non-null value:
SELECT Name, Class, Color, ProductNumber,
COALESCE(Class, Color, ProductNumber) AS FirstNotNull
FROM Production.Product
This article should explain its application:
http://msdn.microsoft.com/en-us/library/ms190349.aspx
It sounds like Larry Lustig's answer is more along the lines of what you need though.

SQL View with Data from two tables

I can't seem to crack this - I have two tables (Persons and Companies), and I'm trying to create a view that:
shows all persons
also returns companies by themselves once, regardless of how many persons are related to it
orders by name across both tables
To clarify, some sample data:
(Table: Companies)
Id Name
1 Banana
2 ABC Inc.
3 Microsoft
4 Bigwig
(Table: Persons)
Id Name RelatedCompanyId
1 Joe Smith 3
2 Justin
3 Paul Rudd 4
4 Anjolie
5 Dustin 4
The output I'm looking for is something like this:
Name PersonName CompanyName RelatedCompanyId
ABC Inc. NULL ABC Inc. NULL
Anjolie Anjolie NULL NULL
Banana NULL Banana NULL
Bigwig NULL Bigwig NULL
Dustin Dustin Bigwig 4
Joe Smith Joe Smith Microsoft 3
Justin Justin NULL NULL
Microsoft NULL Microsoft NULL
Paul Rudd Paul Rudd Bigwig 4
As you can see, the new "Name" column is ordered across both tables (the company names appear correctly in between the person names), and each company appears exactly once, regardless of how many people are related to it.
Can this even be done in SQL?! P.S. I'm trying to create a view so I can use this later for easy data retrieval, fulltext indexing and make the programming side simpler by just querying the view.
Here's one way:
select * from (
select Name, null as PersonName, Name as CompanyName, null as RelatedCompanyID
from Companies
union
select Persons.Name as Name, Persons.Name as PersonName, Companies.Name as CompanyName, RelatedCompanyID
from Persons
left join Companies on Persons.RelatedCompanyID = Companies.ID
) as AggregatedData
order by AggregatedData.Name
Or slightly more readably with a common table expression, although there's no other real benefit in this case:
with AggregatedData as (
select Name, null as PersonName, Name as CompanyName, null as RelatedCompanyID
from Companies
union
select Persons.Name as Name, Persons.Name as PersonName, Companies.Name as CompanyName, RelatedCompanyID
from Persons
left join Companies on Persons.RelatedCompanyID = Companies.ID
)
select * from AggregatedData
order by AggregatedData.Name
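Since the goal was a view, here is a sketch of wrapping the same union in one, tried out with Python's sqlite3. The view name PersonsAndCompanies is made up, the ordering is applied when querying the view rather than inside it (an assumption, since some databases restrict ORDER BY in view definitions), and UNION ALL is swapped in for UNION so that two people with the same name and company would not be collapsed into one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Companies (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Persons (Id INTEGER PRIMARY KEY, Name TEXT,
                      RelatedCompanyId INTEGER);
INSERT INTO Companies VALUES
    (1, 'Banana'), (2, 'ABC Inc.'), (3, 'Microsoft'), (4, 'Bigwig');
INSERT INTO Persons VALUES
    (1, 'Joe Smith', 3), (2, 'Justin', NULL), (3, 'Paul Rudd', 4),
    (4, 'Anjolie', NULL), (5, 'Dustin', 4);

-- Hypothetical view name; the first branch lists each company once,
-- the second lists every person with their company, if any.
CREATE VIEW PersonsAndCompanies AS
    SELECT Name, NULL AS PersonName, Name AS CompanyName,
           NULL AS RelatedCompanyId
    FROM Companies
    UNION ALL
    SELECT p.Name, p.Name, c.Name, p.RelatedCompanyId
    FROM Persons p
    LEFT JOIN Companies c ON p.RelatedCompanyId = c.Id;
""")

# Sort at query time; the view itself stays unordered.
view_rows = conn.execute("""
    SELECT Name, PersonName, CompanyName, RelatedCompanyId
    FROM PersonsAndCompanies
    ORDER BY Name
""").fetchall()

for row in view_rows:
    print(row)
```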