SQL Combine data from two tables - sql

How do I combine two tables in SQL?
Suppose we have a table called books:
book_id  author_id  name
_______  _________  ____
1        2          XYZ
2        1          ABC
And we have a table called authors:
author_id  firstname  surname
_________  _________  _______
1          Alex       Woodman
2          Steve      Bush
I want to combine books and authors in a single SELECT query:
book_id  author_id  name  author_name
_______  _________  ____  ____________
1        2          XYZ   Steve Bush
2        1          ABC   Alex Woodman

You can use a JOIN to combine the two tables and the CONCAT function to build the author's full name from firstname and surname:
SELECT
    b.book_id,
    a.author_id,
    b.name,
    CONCAT(a.firstname, ' ', a.surname) AS author_name
FROM
    books b
JOIN
    authors a ON b.author_id = a.author_id;
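To try the join without a database server, here is a minimal sketch using Python's built-in sqlite3 module with the sample rows from the question. It assumes SQLite, which concatenates strings with || rather than CONCAT; the join itself is the same.

```python
import sqlite3

# In-memory database loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (author_id INTEGER PRIMARY KEY, firstname TEXT, surname TEXT);
    CREATE TABLE books (book_id INTEGER PRIMARY KEY, author_id INTEGER, name TEXT);
    INSERT INTO authors VALUES (1, 'Alex', 'Woodman'), (2, 'Steve', 'Bush');
    INSERT INTO books VALUES (1, 2, 'XYZ'), (2, 1, 'ABC');
""")

# SQLite spells concatenation as ||; MySQL or SQL Server would use CONCAT(...).
rows = conn.execute("""
    SELECT b.book_id, a.author_id, b.name,
           a.firstname || ' ' || a.surname AS author_name
    FROM books b
    JOIN authors a ON b.author_id = a.author_id
    ORDER BY b.book_id
""").fetchall()

for row in rows:
    print(row)
```

Each book row picks up exactly one author row because author_id is unique in authors.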

Related

Oracle: Move column to new table, eliminate non-unique rows remaining in original table while maintaining foreign relationship between two tables

I have a table like the following:
Table 1: Person_Favorite_Food
id  name   address           favorite_food
------------------------------------------
1   Dave   123 Cherry Ln     Pizza
2   Dave   123 Cherry Ln     Cheeseburger
3   Dave   456 Peachtree St  Ice cream
4   Cindy  789 Grove Rd      Pizza
id - primary key
unique key constraint on the following columns: name, address and favorite_food
Since each person can have more than one favorite food, I'd like to split Table 1 into two tables like the following:
Table 2: Person
id  name   address
--------------------------
1   Dave   123 Cherry Ln
3   Dave   456 Peachtree St
4   Cindy  789 Grove Rd
Table 3: Person_Favorite_Food
person_id  favorite_food
------------------------
1          Pizza
1          Cheeseburger
3          Ice cream
4          Pizza
How would I go about doing this in Oracle?
Note: In the original table Rows 1 and 2 represent favorite food for the same person so the favorite food entries in the Person_Favorite_Food table will need to have the same identifier for both of those entries although the identifiers are different in the initial table.
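You can use the following, which takes the smallest id in each (name, address) group as that person's identifier:
create table new1 as
select min(id) as id, name, address
from t
group by name, address;
create table new2 as
select min(id) over (partition by name, address) as person_id, favorite_food
from t;
I would recommend creating two new tables rather than trying to morph the existing table into one of the new ones.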
Use simple aggregation to create table person:
create table person as
select min(id) id, name, address
from person_favorite_food
group by name, address;
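Use the same query in a MERGE to point every row at its person's id (if id is still the primary key of person_favorite_food, drop that constraint first, because the update makes id non-unique):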
merge into person_favorite_food a
using (select min(id) id, name, address from person_favorite_food group by name, address) b
on (a.name = b.name and a.address = b.address)
when matched then update set a.id = b.id;
Drop unwanted columns:
alter table person_favorite_food drop (name, address);
Done. Demo in dbfiddle.
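For anyone who wants to check the split without an Oracle instance, here is a rough sketch in Python's sqlite3. It is not the MERGE approach above: SQLite has no MERGE, so this version builds both tables fresh. The table name favorite_food for the child table is my own choice, not from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person_favorite_food (id INTEGER, name TEXT, address TEXT, favorite_food TEXT);
    INSERT INTO person_favorite_food VALUES
        (1, 'Dave',  '123 Cherry Ln',    'Pizza'),
        (2, 'Dave',  '123 Cherry Ln',    'Cheeseburger'),
        (3, 'Dave',  '456 Peachtree St', 'Ice cream'),
        (4, 'Cindy', '789 Grove Rd',     'Pizza');

    -- One row per distinct (name, address); the smallest id becomes the person id.
    CREATE TABLE person AS
        SELECT MIN(id) AS id, name, address
        FROM person_favorite_food
        GROUP BY name, address;

    -- Child table (hypothetical name): look up each row's person id by (name, address).
    CREATE TABLE favorite_food AS
        SELECT p.id AS person_id, f.favorite_food
        FROM person_favorite_food f
        JOIN person p ON p.name = f.name AND p.address = f.address;
""")

people = conn.execute("SELECT id, name, address FROM person ORDER BY id").fetchall()
foods = conn.execute(
    "SELECT person_id, favorite_food FROM favorite_food ORDER BY person_id, favorite_food"
).fetchall()
print(people)
print(foods)
```

Both of Dave's Cherry Ln rows end up under person_id 1, which is the requirement stated in the question's note.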

Join multiple tables to return only one result for each record from main table

Currently I have three tables I am joining. I have data that was migrated from one system (old) to another system (new). I need to compare this data to find both matches and mismatches. The first table is the list of accounts being moved. The two systems use different ID types, so this table maps the old ID to the new ID for each account that was moved. This is my base population.
ID1 ID2
ABC 123
ABC 123
ABC 123
DEF 456
DEF 456
DEF 456
I then have table 2 which is all the data from the old system.
ID Fname Lname
ABC John Smith
ABC Tom Smith
ABC Kate Smith
DEF Jason Thomas
DEF Ruby Thomas
DEF Alex Johnson
Then table 3 is all the data found in the new system.
ID Fname Lname
123 John Smith
123 Tom Smith
123 Kate Smith
456 Jason Thomas
456 Ruby Thomas
Right now when I join these tables on the ID I get a lot more rows than I need.
When I do my join I receive this:
ID Fname_old Lname_old ID2 Fname_new Lname_new
ABC John Smith 123 John Smith
ABC John Smith 123 Tom Smith
ABC John Smith 123 Kate Smith
I am trying to join them so that each old row is returned once, next to its matching new row. If no match is found, I should still get the IDs from the ID table and the data from table 2 (old data), as this is the data that was sent to the new system.
ID1 ID2 Fname_old Lname_old Fname_new Lname_new
ABC 123 John Smith John Smith
ABC 123 Tom Smith Tom Smith
ABC 123 Kate Smith Kate Smith
DEF 456 Jason Thomas Jason Thomas
DEF 456 Ruby Thomas Ruby Thomas
DEF 456 Alex Johnson
The code I am using is:
Select a.ID1, a.ID2, b.fname as fname_old, b.lname as lname_old,
c.fname as fname_new, c.lname as lname_new
from table1 a
left join table2 b
on a.ID1 = b.ID
left join table3 c
on a.ID2 = c.ID
If it's just the duplicate rows in your first table, you could remove them with DISTINCT in a derived table like below:
Select a.ID1, a.ID2, b.fname as fname_old, b.lname as lname_old,
c.fname as fname_new, c.lname as lname_new
from (SELECT DISTINCT ID1, ID2 FROM table1) a
left join table2 b
on a.ID1 = b.ID
left join table3 c
on a.ID2 = c.ID
You are joining them on the ID columns.
ID columns are usually unique, but here each ID appears several times, so every row for an ID in one table pairs with every row for that ID in the other.
Since you need to compare data, I suggest you look up MATCH and how it works, as that seems closer to what you are looking for here.
You can get a match using row_number():
Select a.ID1, a.ID2, b.fname as fname_old, b.lname as lname_old,
c.fname as fname_new, c.lname as lname_new
from (select t1.*,
row_number() over (partition by id1, id2 order by id1) as seqnum
from table1 t1
) a left join
(select t2.*,
row_number() over (partition by id order by id) as seqnum
from table2 t2
) b
on a.ID1 = b.ID and a.seqnum = b.seqnum left join
(select t3.*,
row_number() over (partition by id order by id) as seqnum
from table3 t3
) c
on a.ID2 = c.ID and a.seqnum = c.seqnum;
Note: This does not preserve the "ordering" of the original values, so any rows can be matched with any other. Why? SQL tables represent unordered sets.
If there is an ordering in the tables, you can use that in the order by clauses to get a match consistent with the ordering.
If you can match rows on first and last name, this query will work:
select distinct a.ID1, a.ID2, b.fname as fname_old, b.lname as lname_old,
c.fname as fname_new, c.lname as lname_new
from table2 b
left join table1 a on a.ID1 = b.ID
left join table3 c on a.ID2 = c.ID and b.Fname = c.Fname and b.Lname = c.Lname
My Result :
ID1 ID2 fname_old lname_old fname_new lname_new
ABC 123 John Smith John Smith
ABC 123 Kate Smith Kate Smith
ABC 123 Tom Smith Tom Smith
DEF 456 Alex Johnson NULL NULL
DEF 456 Jason Thomas Jason Thomas
DEF 456 Ruby Thomas Ruby Thomas
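The name-matching join can be reproduced with the question's sample rows. This is only a sketch in Python's sqlite3, assuming both ID columns are stored as text and that first name plus last name identifies a person within an account. It combines the deduplicated ID list from the earlier answer with the name condition from this one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id1 TEXT, id2 TEXT);
    CREATE TABLE table2 (id TEXT, fname TEXT, lname TEXT);
    CREATE TABLE table3 (id TEXT, fname TEXT, lname TEXT);
    INSERT INTO table1 VALUES ('ABC','123'),('ABC','123'),('ABC','123'),
                              ('DEF','456'),('DEF','456'),('DEF','456');
    INSERT INTO table2 VALUES ('ABC','John','Smith'),('ABC','Tom','Smith'),('ABC','Kate','Smith'),
                              ('DEF','Jason','Thomas'),('DEF','Ruby','Thomas'),('DEF','Alex','Johnson');
    INSERT INTO table3 VALUES ('123','John','Smith'),('123','Tom','Smith'),('123','Kate','Smith'),
                              ('456','Jason','Thomas'),('456','Ruby','Thomas');
""")

# Deduplicate the ID map first, then match old and new rows on ID and name.
rows = conn.execute("""
    SELECT a.id1, a.id2, b.fname, b.lname, c.fname, c.lname
    FROM (SELECT DISTINCT id1, id2 FROM table1) a
    LEFT JOIN table2 b ON b.id = a.id1
    LEFT JOIN table3 c
           ON c.id = a.id2 AND c.fname = b.fname AND c.lname = b.lname
    ORDER BY a.id1, b.fname
""").fetchall()

# Old rows that never arrived in the new system have NULL (None) new-side columns.
missing = [r for r in rows if r[4] is None]
print(missing)
```

Alex Johnson is the only old row without a counterpart, so it is the only entry in missing.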
You say that this is data transferred between two systems, so you expect all of it to match. You could hence reduce the query to return only the rows that don't match, if any.
Here is a SQL standard compliant query. You tagged your request with hive. I don't know about hive, so you may have to adjust the query.
select
t2.id as id1,
t3.id as id2,
t2.fname as fname_old,
t2.lname as lname_old,
t3.fname as fname_new,
t3.lname as lname_new
from table2 t2
full outer join table3 t3
on t3.fname = t2.fname
and t3.lname = t2.lname
and exists (select null from table1 t1 where t1.id1 = t2.id and t1.id2 = t3.id)
where t2.id is null or t3.id is null;
This is a full anti join. It returns all rows that have no exact match in the other table. It doesn't, however, guess which deviating rows might be pairs. You will get a result like this:
ID1 | ID2 | Fname_old | Lname_old | Fname_new | Lname_new
----+-----+-----------+-----------+-----------+----------
DEF | | Alex | Johnson | |
GHI | | Jone | Miller | |
GHI | | Maxx | Miller | |
GHI | | Fritz | Miller | |
| 789 | | | Joan | Miller
| 789 | | | Max | Miller
| 799 | | | Fritz | Miller
As you see, you would have to examine this result manually. But ideally the query shouldn't return any row at all, which would just prove that everything went as expected and nobody (system or person) messed with the data :-)
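Some engines lack FULL OUTER JOIN. In that case the same anti join can be approximated with two NOT EXISTS queries combined with UNION ALL. The sketch below uses Python's sqlite3 and a cut-down copy of the DEF account; the 'Alexa' row is made up so that the new-system side has a mismatch too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id1 TEXT, id2 TEXT);
    CREATE TABLE table2 (id TEXT, fname TEXT, lname TEXT);
    CREATE TABLE table3 (id TEXT, fname TEXT, lname TEXT);
    INSERT INTO table1 VALUES ('DEF','456');
    INSERT INTO table2 VALUES ('DEF','Jason','Thomas'),('DEF','Alex','Johnson');
    -- 'Alexa' is a hypothetical row, not from the question.
    INSERT INTO table3 VALUES ('456','Jason','Thomas'),('456','Alexa','Johnson');
""")

# First branch: old rows with no identical row under the mapped new ID.
# Second branch: new rows with no identical row under the mapped old ID.
mismatches = conn.execute("""
    SELECT t2.id AS id1, NULL AS id2,
           t2.fname AS fname_old, t2.lname AS lname_old,
           NULL AS fname_new, NULL AS lname_new
    FROM table2 t2
    WHERE NOT EXISTS (
        SELECT 1 FROM table1 t1
        JOIN table3 t3 ON t3.id = t1.id2
        WHERE t1.id1 = t2.id AND t3.fname = t2.fname AND t3.lname = t2.lname)
    UNION ALL
    SELECT NULL, t3.id, NULL, NULL, t3.fname, t3.lname
    FROM table3 t3
    WHERE NOT EXISTS (
        SELECT 1 FROM table1 t1
        JOIN table2 t2 ON t2.id = t1.id1
        WHERE t1.id2 = t3.id AND t2.fname = t3.fname AND t2.lname = t3.lname)
""").fetchall()

for row in mismatches:
    print(row)
```

An empty result would mean every row has an exact counterpart in the other system.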

create a table based on one row of one to another table

I created a separate table for each author because each author table has different column names.
Table Author
Author_ID FirstName LastName
1 Rock Smith
2 Edward Thomas
Table Author Books
Author_ID BookName
1 Book1
1 Book2
1 Book3
1 Book4
1 Book5
I want result like this
Result Table
FirstName LastName BookName
Rock Smith Book1
Rock Smith Book2
Rock Smith Book3
Rock Smith Book4
Rock Smith Book4
Table New_Authors
Author_ID Author_Table
1 Rock_Smith
2 Edward_Thomas
Table Rock_Smith
FirstName LastName BookName
Rock Smith Book1
Rock Smith Book2
Rock Smith Book3
Rock Smith Book4
Rock Smith Book4
Is it possible to get all the Rock_Smith table info on querying New_Authors table?
To return the data from those 2 tables, you can run this query:
SELECT a.FirstName, A.LastName, b.BookName
FROM Author A inner join AuthorBooks B on A.Author_ID = B.Author_ID
If you wanted to create a third table from that query, SELECT ... INTO creates the table for you in SQL Server, so NewTable must not already exist (for an existing table, use INSERT INTO ... SELECT instead):
SELECT a.FirstName, A.LastName, b.BookName
INTO [NewTable]
FROM Author A inner join AuthorBooks B on A.Author_ID = B.Author_ID
This is a simple one-to-many relationship, which a LEFT JOIN handles:
SELECT a.LastName, a.FirstName, b.BookName
FROM Author a LEFT JOIN AuthorBooks b ON b.Author_ID = a.Author_ID
ORDER BY LastName, FirstName, BookName
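To see why one shared books table is enough, here is a small sketch in Python's sqlite3 using the question's rows. The per-author filter at the end is my own addition: it assumes you look an author up by Author_ID instead of keeping a table per author.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Author (Author_ID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE AuthorBooks (Author_ID INTEGER, BookName TEXT);
    INSERT INTO Author VALUES (1, 'Rock', 'Smith'), (2, 'Edward', 'Thomas');
    INSERT INTO AuthorBooks VALUES (1,'Book1'),(1,'Book2'),(1,'Book3'),(1,'Book4'),(1,'Book5');
""")

# LEFT JOIN keeps authors that have no books yet (BookName comes back as None).
rows = conn.execute("""
    SELECT a.FirstName, a.LastName, b.BookName
    FROM Author a
    LEFT JOIN AuthorBooks b ON b.Author_ID = a.Author_ID
    ORDER BY a.LastName, a.FirstName, b.BookName
""").fetchall()

# One author's books, selected with a parameter rather than a dedicated table.
rock_books = [r[0] for r in conn.execute(
    "SELECT BookName FROM AuthorBooks WHERE Author_ID = ? ORDER BY BookName", (1,)
)]
print(rows)
print(rock_books)
```

Edward Thomas appears once with no book, which an INNER JOIN would have dropped.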

SQL Server stored procedure to insert/update data from one source to multiple destination tables

This is my source table
Book | Employee | StartDate | EndDate
-------------------------------------
ABC PQR 02/02/2014 06/06/2014
QWE MNO 03/03/2014 07/07/2014
This is the DB schema, where this data should fit in...
Book table
BookID | BookName
-----------------
1 ABC
2 QWE
Employee table
EmployeeID | EmployeeName
-------------------------
1 PQR
2 MNO
BookEmployee table
BookID | EmployeeID | StartDate | EndDate
------------------------------------------
1 1 02/02/2014 06/06/2014
2 2 03/03/2014 07/07/2014
Note: if Book and Employee already exist in the Book and Employee tables, then we should not insert them, instead use their ID in the BookEmployee table
I would just do three queries.
INSERT INTO [Book] (BookName)
SELECT DISTINCT Book
FROM [Source]
WHERE Book NOT IN (SELECT BookName FROM Book)
INSERT INTO [Employee] (EmployeeName)
SELECT DISTINCT Employee
FROM [Source]
WHERE Employee NOT IN (SELECT EmployeeName FROM Employee)
INSERT INTO [BookEmployee] (BookID, EmployeeID, StartDate, EndDate)
SELECT Book.BookID, Employee.EmployeeID, Source.StartDate, Source.EndDate
FROM [Source]
INNER JOIN Book ON Book.BookName = Source.Book
INNER JOIN Employee ON Employee.EmployeeName = Source.Employee
You could run those in a transaction if you're doing it a lot. You could also add in some MERGE behavior, but I wouldn't bother since you don't have anything more than one column to insert. I also didn't do anything as far as merging behavior is concerned for the last query, but I'm sure this will get you enough of a start to make it work.
But yeah, you won't be able to do it all "at once," per se, and even if you could, this way is significantly more readable than that would be.
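Here is a sketch of the three inserts wrapped in one transaction, written for Python's sqlite3 rather than as a SQL Server stored procedure. The load function is a hypothetical helper. It uses NOT EXISTS instead of NOT IN and also guards the last insert, so running it twice adds nothing new.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Source (Book TEXT, Employee TEXT, StartDate TEXT, EndDate TEXT);
    CREATE TABLE Book (BookID INTEGER PRIMARY KEY, BookName TEXT);
    CREATE TABLE Employee (EmployeeID INTEGER PRIMARY KEY, EmployeeName TEXT);
    CREATE TABLE BookEmployee (BookID INTEGER, EmployeeID INTEGER, StartDate TEXT, EndDate TEXT);
    INSERT INTO Source VALUES ('ABC','PQR','02/02/2014','06/06/2014'),
                              ('QWE','MNO','03/03/2014','07/07/2014');
""")

def load(conn):
    """Copy Source into the three target tables, skipping rows that already exist."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute("""
            INSERT INTO Book (BookName)
            SELECT DISTINCT s.Book FROM Source s
            WHERE NOT EXISTS (SELECT 1 FROM Book b WHERE b.BookName = s.Book)""")
        conn.execute("""
            INSERT INTO Employee (EmployeeName)
            SELECT DISTINCT s.Employee FROM Source s
            WHERE NOT EXISTS (SELECT 1 FROM Employee e WHERE e.EmployeeName = s.Employee)""")
        conn.execute("""
            INSERT INTO BookEmployee (BookID, EmployeeID, StartDate, EndDate)
            SELECT b.BookID, e.EmployeeID, s.StartDate, s.EndDate
            FROM Source s
            JOIN Book b ON b.BookName = s.Book
            JOIN Employee e ON e.EmployeeName = s.Employee
            WHERE NOT EXISTS (SELECT 1 FROM BookEmployee be
                              WHERE be.BookID = b.BookID AND be.EmployeeID = e.EmployeeID)""")

load(conn)
load(conn)  # second run finds everything already present

pairs = conn.execute("""
    SELECT b.BookName, e.EmployeeName
    FROM BookEmployee be
    JOIN Book b ON b.BookID = be.BookID
    JOIN Employee e ON e.EmployeeID = be.EmployeeID
""").fetchall()
print(pairs)
```

The order of the three statements matters: the link table can only resolve IDs after the two lookup tables are filled.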

Joining multiple tables with a single query

Student
student_id FirstName LastName
---------------------------------------------------
1 Joe Bloggs
2 Alan Day
3 David Johnson
Student_Course
course_id student_id courseName
---------------------------------------------------
1 1 Computer Science
2 1 David Beckham Studies
3 1 Geography
1 3 Computer Science
3 3 Geography
Student_clubs
club_id student_id club_name club_count
---------------------------------------------------
1 1 Footbal 10
2 1 Rugby 10
3 1 Syncronized Swimming 10
4 3 Tennis 15
In the above example, the student with id = 1 takes 3 courses and is part of 3 clubs.
If I want to find out which courses a student takes or which clubs the student is part of, I can do it, but I need to run two queries. Is it possible to run a single query against the tables listed above so that the results come out like this:
Output
student_id FirstName Student_associated_courses Student_associated_clubs
---------------------------------------------------------------------------
1 Joe 1,2,3 Football, Rugby, Syncronized swimming
3 David 1,3 Tennis
Is it possible to get the above output with just one query? I am using JDBC to get the data, so I am trying to avoid multiple round trips to get the necessary data.
Use GROUP_CONCAT with DISTINCT in MySQL:
SELECT a.student_ID, a.firstname,
GROUP_CONCAT(DISTINCT b.course_ID),
GROUP_CONCAT(DISTINCT c.club_name)
FROM student a
INNER JOIN student_Course b
ON a.student_id = b.student_ID
INNER JOIN student_clubs c
ON a.student_ID = c.student_ID
GROUP BY a.student_ID, a.firstname
See SQLFiddle Demo
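SQLite also has GROUP_CONCAT, so the idea can be tried locally. This is a sketch with Python's sqlite3 and rows adapted from the question (club names spelled as in the expected output). The order inside each list is not guaranteed, so the code compares sets rather than strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (student_id INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE Student_Course (course_id INTEGER, student_id INTEGER, courseName TEXT);
    CREATE TABLE Student_clubs (club_id INTEGER, student_id INTEGER, club_name TEXT);
    INSERT INTO Student VALUES (1,'Joe','Bloggs'),(2,'Alan','Day'),(3,'David','Johnson');
    INSERT INTO Student_Course VALUES (1,1,'Computer Science'),(2,1,'David Beckham Studies'),
                                      (3,1,'Geography'),(1,3,'Computer Science'),(3,3,'Geography');
    INSERT INTO Student_clubs VALUES (1,1,'Football'),(2,1,'Rugby'),
                                     (3,1,'Synchronized Swimming'),(4,3,'Tennis');
""")

# Joining both child tables multiplies rows (courses x clubs); DISTINCT collapses the repeats.
rows = conn.execute("""
    SELECT s.student_id, s.FirstName,
           GROUP_CONCAT(DISTINCT sc.course_id) AS courses,
           GROUP_CONCAT(DISTINCT cl.club_name) AS clubs
    FROM Student s
    JOIN Student_Course sc ON sc.student_id = s.student_id
    JOIN Student_clubs cl ON cl.student_id = s.student_id
    GROUP BY s.student_id, s.FirstName
    ORDER BY s.student_id
""").fetchall()

by_student = {sid: (set(courses.split(',')), set(clubs.split(',')))
              for sid, _name, courses, clubs in rows}
print(by_student)
```

Student 2 is absent because the inner joins require at least one course and one club.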
Try it like this:
SELECT *
FROM Student s JOIN
(SELECT sc."student_id", listagg(sc."course_id", ',')within group(ORDER BY sc."course_id")
FROM Student_Course sc
GROUP BY sc."student_id") s_course ON s."student_id"=s_course."student_id"
JOIN (SELECT sl."student_id", listagg(sl."club_name", ',')within GROUP(ORDER BY sl."club_name")
FROM Student_clubs sl
GROUP BY sl."student_id") s_club ON s."student_id"=s_club."student_id"
The "catch" is that LISTAGG does not accept the DISTINCT keyword in Oracle releases before 19c, so each list is aggregated in its own subquery before joining.
Here is a fiddle
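The subquery approach carries over to other engines. Below is a sketch in Python's sqlite3, with GROUP_CONCAT standing in for LISTAGG (an assumption of this example, not Oracle syntax). The naive value shows what happens when both child tables are joined before aggregating without DISTINCT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (student_id INTEGER PRIMARY KEY, FirstName TEXT);
    CREATE TABLE Student_Course (course_id INTEGER, student_id INTEGER);
    CREATE TABLE Student_clubs (club_id INTEGER, student_id INTEGER, club_name TEXT);
    INSERT INTO Student VALUES (1,'Joe'),(2,'Alan'),(3,'David');
    INSERT INTO Student_Course VALUES (1,1),(2,1),(3,1),(1,3),(3,3);
    INSERT INTO Student_clubs VALUES (1,1,'Football'),(2,1,'Rugby'),
                                     (3,1,'Synchronized Swimming'),(4,3,'Tennis');
""")

# Aggregate each child table on its own, then join the one-row-per-student results.
rows = conn.execute("""
    SELECT s.student_id, s.FirstName, c.courses, k.clubs
    FROM Student s
    JOIN (SELECT student_id, GROUP_CONCAT(course_id) AS courses
          FROM Student_Course GROUP BY student_id) c ON c.student_id = s.student_id
    JOIN (SELECT student_id, GROUP_CONCAT(club_name) AS clubs
          FROM Student_clubs GROUP BY student_id) k ON k.student_id = s.student_id
    ORDER BY s.student_id
""").fetchall()

# For contrast: joining first repeats every course once per club for student 1.
naive = conn.execute("""
    SELECT GROUP_CONCAT(sc.course_id)
    FROM Student_Course sc
    JOIN Student_clubs cl ON cl.student_id = sc.student_id
    WHERE sc.student_id = 1
""").fetchone()[0]

print(rows)
print(naive)
```

Pre-aggregating keeps each list free of repeats without needing DISTINCT at all.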