How to merge the results of a UNION ALL in Oracle - sql

Consider I have two queries with the result sets below. The first result set is for students, so all the columns pertaining to instructors are null. The second result set is for instructors, so the columns related to students are null. Both share a few common columns.
Student:
uid  f_name  m_name  l_name  class  school  Section  Dept.    Branch     Title
1    abc     c       dey     2      NYU     1        null     null       null
2    cde     d       rey     3      CU      2        null     null       null
3    xyz     r       mey     4      LSU     3        null     null       null
Teacher:
uid  f_name  m_name  l_name  class  school  Section  Dept.    Branch     Title
4    wss     c       tey     null   null    null     Science  Biology    Asso.Prof
2    cde     d       rey     null   null    null     Arts     Music      Asso.Prof
5    rrr     r       jey     null   null    null     Science  Chemistry  Prof
If you look at the result sets above, UID 2 appears in both, which means a professor can also be a student at the same time. I want to join/merge these two queries into a common result set, say 'Users', containing both the teachers and the students.
The 'Users' result set should be unique with respect to UID. If I use UNION ALL, there will be duplicates on UID 2. I need a query that merges the columns into a single row. The result set should be:
1    abc     c       dey     2      NYU     1        null     null       null
2    cde     d       rey     3      CU      2        Arts     Music      Asso.Prof
3    xyz     r       mey     4      LSU     3        null     null       null
4    wss     c       tey     null   null    null     Science  Biology    Asso.Prof
5    rrr     r       jey     null   null    null     Science  Chemistry  Prof
Note row 2 above: it has both the student and the professor details in one row.
How can I achieve this in Oracle? I appreciate your help.

As has been mentioned in the comments, this is a poor design (or perhaps a poor solution using a good design). If the base tables are "person" (containing only information that is of the same kind for students and instructors, such as UID, name, date of birth, email etc.), "student" (with UID as foreign key, and showing only characteristics specific to students) and "teacher" (same for teachers or instructors), then the design is fine, and your desired final output can be obtained directly from these base tables, not from the results of other queries you have written. jva shows something along these lines in his/her answer.
If you really have to put up with it, the way to use your existing queries is to combine them with UNION ALL, then group by uid and select max(...) for every other column, for example max(f_name) and max(school). This works because max() ignores nulls, so it returns the one non-null value where a uid has a student row and a teacher row. Use column aliases in the SELECT clause, for example select ..., max(school) as school, ...
Or, slightly more efficiently (but with some risk), group by uid, f_name, m_name, l_name and select max(...) only for the remaining columns. The risk is that in one place the m_name is N and in the other it is null; that is, your data is internally inconsistent, and the person would then come back as two rows. (If it all comes from one base table, "person", then this risk shouldn't exist.)
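The UNION ALL plus GROUP BY approach described above can be sketched as follows. This is only an illustration: it runs the SQL through Python's built-in sqlite3 module so that it is runnable, and the table names student_result and teacher_result are hypothetical stand-ins for the two existing queries. The SELECT itself uses only standard aggregates, so it should carry over to Oracle with little change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# student_result / teacher_result are hypothetical stand-ins for the two queries.
conn.executescript("""
CREATE TABLE student_result (uid INTEGER, f_name TEXT, m_name TEXT, l_name TEXT,
                             class INTEGER, school TEXT, section INTEGER);
CREATE TABLE teacher_result (uid INTEGER, f_name TEXT, m_name TEXT, l_name TEXT,
                             dept TEXT, branch TEXT, title TEXT);
INSERT INTO student_result VALUES (1,'abc','c','dey',2,'NYU',1),
                                  (2,'cde','d','rey',3,'CU',2),
                                  (3,'xyz','r','mey',4,'LSU',3);
INSERT INTO teacher_result VALUES (4,'wss','c','tey','Science','Biology','Asso.Prof'),
                                  (2,'cde','d','rey','Arts','Music','Asso.Prof'),
                                  (5,'rrr','r','jey','Science','Chemistry','Prof');
""")

# Stack both result sets, then collapse to one row per uid.
# MAX() skips NULLs, so each column keeps its single non-null value.
merged_users = conn.execute("""
SELECT uid,
       MAX(f_name) AS f_name, MAX(m_name) AS m_name, MAX(l_name) AS l_name,
       MAX(class) AS class, MAX(school) AS school, MAX(section) AS section,
       MAX(dept) AS dept, MAX(branch) AS branch, MAX(title) AS title
FROM (
    SELECT uid, f_name, m_name, l_name, class, school, section,
           NULL AS dept, NULL AS branch, NULL AS title
    FROM student_result
    UNION ALL
    SELECT uid, f_name, m_name, l_name, NULL, NULL, NULL, dept, branch, title
    FROM teacher_result
) u
GROUP BY uid
ORDER BY uid
""").fetchall()

for row in merged_users:
    print(row)
```

With the sample data from the question, uid 2 comes back as a single row carrying both the student and the teacher columns.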

This would be the general approach:
SELECT persons.id
      ,nvl(<some_student_field>, <same_teacher_field>) as some_common_field
      ,...
FROM persons
LEFT OUTER JOIN students on (persons.id = students.person_id)
LEFT OUTER JOIN teachers on (persons.id = teachers.person_id)
WHERE <mandatory_field_for_students> IS NOT NULL
   OR <mandatory_field_for_teachers> IS NOT NULL
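A concrete version of this outline might look like the sketch below. The persons, students and teachers tables and their columns are assumptions made up for the illustration (the question does not show the base tables), and the SQL is run through Python's sqlite3 module only to make it runnable. Here person_id plays the role of the mandatory field in each child table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical base tables; the real schema is not shown in the question.
conn.executescript("""
CREATE TABLE persons  (id INTEGER PRIMARY KEY, f_name TEXT, l_name TEXT);
CREATE TABLE students (person_id INTEGER REFERENCES persons(id), school TEXT);
CREATE TABLE teachers (person_id INTEGER REFERENCES persons(id), dept TEXT);
INSERT INTO persons  VALUES (1,'abc','dey'), (2,'cde','rey'),
                            (4,'wss','tey'), (9,'nobody','here');
INSERT INTO students VALUES (1,'NYU'), (2,'CU');
INSERT INTO teachers VALUES (2,'Arts'), (4,'Science');
""")

# One row per person; student and teacher columns are filled where they exist.
# The WHERE clause drops persons who are neither students nor teachers (id 9).
users = conn.execute("""
SELECT p.id, p.f_name, p.l_name, s.school, t.dept
FROM persons p
LEFT OUTER JOIN students s ON s.person_id = p.id
LEFT OUTER JOIN teachers t ON t.person_id = p.id
WHERE s.person_id IS NOT NULL
   OR t.person_id IS NOT NULL
ORDER BY p.id
""").fetchall()

for row in users:
    print(row)
```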

Related

How to join with a table on a non unique column without getting redundant results?

Say I have two tables named Customers and Banks.
Customers has Id(PK), Name and BankCode
Banks has Id(PK), Name and Code
BankCode column in Customers table is a loose reference to Code column in Banks table. Code is not a foreign key so there are multiple records with the same Code.
Sample data is like this:
Persons:
Id  Name  BankCode
1   Jack  2
2   Jane  2
3   John  5
Banks:
Id  Name                 Code
1   National             2
2   National Subsidiary  2
3   GNB                  3
4   Global Banking       5
I need to get a list of persons with their bank name attached. I tried simply joining the tables like this:
SELECT P.Id, P.Name, P.BankCode, B.Name
FROM Persons P
JOIN Banks B
ON P.BankCode = B.Code
But this query results in redundant records for persons whose bank code isn't unique. In this case Jack and Jane will each have two similar records with different bank names.
Since the banks with the same code are a family and have similar names, how can I change the query so that it returns only one record for each person (using one bank name and ignoring the others)?
If you don't want to fix the design of your tables,
then the only way I can think of is to show all banks with that code for each user,
like this
select p.id,
       p.Name,
       ( select string_agg(b.Name, ', ')
         from Banks b
         where b.Code = p.BankCode
       ) as Banks
from Persons p
This will look like this
id  Name  Banks
1   Jack  National, National Subsidiary
2   Jane  National, National Subsidiary
3   John  Global Banking
See a working DBFiddle here
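If a single bank name per person is wanted rather than a concatenated list, one option is to pick one bank per code by an arbitrary but deterministic rule. The sketch below assumes "the bank with the lowest Id for that code" as the rule, which is a choice made for the illustration and not something stated in the question. The SQL is run through Python's sqlite3 module so it is runnable, and it uses only a correlated MIN() subquery, so it should work on most engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Persons (Id INTEGER PRIMARY KEY, Name TEXT, BankCode INTEGER);
CREATE TABLE Banks   (Id INTEGER PRIMARY KEY, Name TEXT, Code INTEGER);
INSERT INTO Persons VALUES (1,'Jack',2), (2,'Jane',2), (3,'John',5);
INSERT INTO Banks   VALUES (1,'National',2), (2,'National Subsidiary',2),
                           (3,'GNB',3), (4,'Global Banking',5);
""")

# Keep only the bank row with the lowest Id within each Code (assumed rule),
# so the join can match at most one bank per person.
one_bank_per_person = conn.execute("""
SELECT p.Id, p.Name, p.BankCode, b.Name AS BankName
FROM Persons p
JOIN Banks b ON b.Code = p.BankCode
WHERE b.Id = (SELECT MIN(b2.Id) FROM Banks b2 WHERE b2.Code = b.Code)
ORDER BY p.Id
""").fetchall()

for row in one_bank_per_person:
    print(row)
```

Jack and Jane each get one row with 'National', and John gets 'Global Banking'.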

How to display data in SQL from multiple tables, but only if one column data matches another column?

I'm still learning SQL, so this may just be my ignorance or inability to express in a search what I'm looking for. I've spent roughly an hour searching for some variation of the title (both here and general searches on Google). I apologize, I apparently also don't know how to format here. I'll try to clean it up now that I've posted.
I have a database of customer data that I did not design. In the GUI there are multiple tabs, and it seems like each tab earned its own table. The tables are linked together with a field called RecordID. One of the tables holds the Customer Data tab. It is organized so that a single customer record from table A can have multiple rows in table B. I only want data where column B in table B is "CompanyA" and column A in table B = 1. Sample data is below.
Expected output:
CardNumber LastName FirstName CustomerID DataItem
------------------------------------------------------
32154 Clapton Eric 181212 CompanyA
Table A:
RecordID CardNumber LastName FirstName CustomerID
---------------------------------------------------------------
1 12345 Smith John 190201
2 12346 Jones Sandy 190202
3 23456 Petty Tom 190203
4 32154 Clapton Eric 181212
5 14728 Tyler Steven 180225
Table B:
RecordID DataID DataItem
--------------------------------
1 0 CompanyA
1 1 Yes
1 2 No
1 3 Revoked
1 4 NULL
1 5 CompanyB
2 0 CompanyB
2 1 Yes
2 2 No
2 3 NULL
2 4 24-54A
2 5 CompanyC
3 0 CompanyA
3 1 No
3 2 No
3 3 NULL
3 4 68-69B
3 5 NULL
4 0 CompanyA
4 1 Yes
4 2 Yes
5 0 CompanyB
5 1 No
5 2 No
5 5 CompanyA
The concept you're looking for is a JOIN. In this case specifically you need an INNER JOIN. A join connects two tables based on criteria you specify (such as matching values in fields) and merges the result into one table in the output.
Here's an example to suit your scenario:
SELECT
A.CardNumber,
A.LastName,
A.FirstName,
A.CustomerID,
B.DataItem
FROM
TableA A
INNER JOIN TableB B -- join tableB onto tableA
ON A.RecordID = B.RecordID -- in the ON clause you specify the criteria by which the rows are matched
WHERE
B.columnA = 'CompanyA'
AND B.columnB = 1
Here's the relevant SQL Server Documentation
Also I'd advise you to potentially take a comprehensive introductory SQL tutorial, and/or find a book. A good one will introduce all of the basic, key concepts such as this to you in a logical way, then you're not grasping in the dark trying to google things for which you don't know the correct terminology.
select a.CardNumber, a.LastName, a.FirstName, a.CustomerID, b.dataitem
from tableA A inner join TableB b
on a.recordid = b.recordid
where b.columnA= 'CompanyA' and b.columnB = 1
Here is your solution:
select a.CardNumber, a.LastName, a.FirstName, a.CustomerID, b.DataItem
from tableA a
inner join tableB b
on (a.RecordID = b.RecordID)
where
b.DataItem = 'CompanyA'
and b.RecordID = 1;
Let me know if the result is not as expected.
Your question is quite hard to understand, but let me give you an example that resembles what I think you are asking.
SELECT a.*, b.DataItem FROM A a INNER JOIN B b
ON a.RecordID = b.RecordID AND
b.DataItem = 'CompanyA'
At the database engine level, if you are using Microsoft technology, the most efficient structure is to use an indexed foreign key constraint on Table B, and a Primary Surrogate Key (PSK) column on Table A. The Primary Surrogate Key in your case is on the Parent table, Table A, and is called RecordID. The foreign key column with the FKC is on Table B, on the column named RecordID. Once you verify that there is a FKC (foreign key constraint on Table B, which pins both columns named RecordID between both tables on matched values), then address the GUI. At the GUI, between the tabs, you generally indicate you have a parent table with a unique set of Record IDs (one column named Record ID with absolutely unique values in each row and no empty rows on that column). There will also be child tables on each Tab in your GUI, and those are bound to the parent table in a "1 to Many (1:M)" fashion, where 1 parent has many children. Your commentary or question indicates that you also want to filter, where Record ID on the child in one of the related tabs equates to the integer value 1 on the Record ID. So, there needs to be a query somewhere:
SELECT [columns]
FROM [Table B] AS B
INNER JOIN [Table A] AS A
ON A.RecordID = B.RecordID
AND B.RecordID = 1;
Does that help?

How would I combine 2 distinct tables into 1 table in SSIS?

Say I have 2 distinct tables in SSIS from 2 different servers.
Table 1         Table 2
Animal  Age     Owner  Location
Dog     10      Bill   IL
Dog     7       Kelly  CA
Cat     4       Tom    TX
I want to have one single result table that is
Result Table
Animal Age Owner Location
Dog 10 NULL NULL
Dog 7 NULL NULL
Cat 4 NULL NULL
NULL NULL Bill IL
NULL NULL Kelly CA
NULL NULL Tom TX
A UNION should fit:
select animal, age, null as owner, null as location from animal
union
select null as animal, null as age, owner, location from owner
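As a rough check of the idea, the sketch below pads each side with NULLs and stacks the two sets. It runs the SQL through Python's sqlite3 module purely so it can be executed; the animal and owner table names follow the query above. It uses UNION ALL rather than UNION as a deliberate variation, because plain UNION removes duplicate rows, so two identical animals would collapse into one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE animal (animal TEXT, age INTEGER);
CREATE TABLE owner  (owner TEXT, location TEXT);
INSERT INTO animal VALUES ('Dog',10), ('Dog',7), ('Cat',4);
INSERT INTO owner  VALUES ('Bill','IL'), ('Kelly','CA'), ('Tom','TX');
""")

# Each branch supplies NULL for the columns that belong to the other table,
# so both branches have the same four columns in the same order.
combined = conn.execute("""
SELECT animal, age, NULL AS owner, NULL AS location FROM animal
UNION ALL
SELECT NULL, NULL, owner, location FROM owner
""").fetchall()

for row in combined:
    print(row)
```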
If you want to join two datasets without any join logic (no key in common) you need to:
1. Create a fake key column in each dataset (with a Derived Column, for example; the value of the fake column should be different in each dataset)
2. Sort on these fake key columns
3. Use a full outer join merge based on these fake keys
You may think it's a weird way to do it; that is because it's a weird result you are trying to obtain. Could you explain your initial need?
You should use the Merge Join component from the SSIS toolbox. If you want a FULL OUTER JOIN, you choose that in the editor of the component.
But first of all you need to go to the Input and Output Properties tab and, in the OLE DB Source Output, set the IsSorted property to True. (You need to make sure that the input data is truly sorted, though.)

SQL to list data from 2 tables joined by a foreign key

Sorry for the very simple question. I have tried researching, but the examples are either too specific to a particular person's issue, or the sites only explain foreign key constraints for creating, altering or dropping a table.
Anyway, I have 2 tables. The first contains 2 columns, the unique primary key and the post codes:
PCID postCode
1 CB1 4PY
2 CB2 9GH
3 CB23 4DG
and the second is people, with 4 columns: first the PK, second the FK from PostCodes, then forename and surname.
PId PCID firstName lastName
1 1 Fred Bloggs
2 2 Arthur Brown
3 1 Mary Bloggs
4 4 Nigel Wilson
I just want to be able to list postcodes and the names of people who live there.
Try this:
SELECT p.postCode, n.firstName, n.lastName FROM Names n JOIN PostCode p USING (PCID)
Names and PostCode here are the table names; change them to yours.
Try this
SELECT t2.FirstName,t2.LastName , t1.PostCode
FROM postcodetablename t1
JOIN namestablename t2 on t1.PcId=t2.PcId
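Run against the sample data, the join behaves as in the sketch below. The table names PostCodes and People are assumptions (the question does not name the tables), and the SQL is executed through Python's sqlite3 module only so that it is runnable. One thing worth noticing in the sample data: Nigel Wilson has PCID 4, which has no matching postcode, so an inner join drops him, and postcode CB23 4DG has no residents, so it only appears if a LEFT JOIN from the postcode table is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# PostCodes and People are assumed table names for the two tables in the question.
conn.executescript("""
CREATE TABLE PostCodes (PCID INTEGER PRIMARY KEY, postCode TEXT);
CREATE TABLE People (PId INTEGER PRIMARY KEY, PCID INTEGER,
                     firstName TEXT, lastName TEXT);
INSERT INTO PostCodes VALUES (1,'CB1 4PY'), (2,'CB2 9GH'), (3,'CB23 4DG');
INSERT INTO People VALUES (1,1,'Fred','Bloggs'), (2,2,'Arthur','Brown'),
                          (3,1,'Mary','Bloggs'), (4,4,'Nigel','Wilson');
""")

# Inner join: only postcode/person pairs that match on PCID.
inner_rows = conn.execute("""
SELECT pc.postCode, p.firstName, p.lastName
FROM PostCodes pc
JOIN People p ON p.PCID = pc.PCID
ORDER BY pc.postCode, p.PId
""").fetchall()

# Left join: every postcode, with NULL names where nobody lives there.
left_rows = conn.execute("""
SELECT pc.postCode, p.firstName, p.lastName
FROM PostCodes pc
LEFT JOIN People p ON p.PCID = pc.PCID
ORDER BY pc.postCode, p.PId
""").fetchall()

print(inner_rows)
print(left_rows)
```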

UPDATE query that fixes orphaned records

I have an Access database that has two tables that are related by PK/FK. Unfortunately, the database tables have allowed for duplicate/redundant records and has made the database a bit screwy. I am trying to figure out a SQL statement that will fix the problem.
To better explain the problem and goal, I have created example tables to use as reference:
[image: http://img38.imageshack.us/img38/9243/514201074110am.png]
You'll notice there are two tables, a Student table and a TestScore table where StudentID is the PK/FK.
The Student table contains duplicate records for students John, Sally, Tommy, and Suzy. In other words the John's with StudentID's 1 and 5 are the same person, Sally 2 and 6 are the same person, and so on.
The TestScore table relates test scores with a student.
Ignoring how/why the Student table allowed duplicates, etc - The goal I'm trying to accomplish is to update the TestScore table so that it replaces the StudentID's that have been disabled with the corresponding enabled StudentID. So, all StudentID's = 1 (John) will be updated to 5; all StudentID's = 2 (Sally) will be updated to 6, and so on. Here's the resultant TestScore table that I'm shooting for (Notice there is no longer any reference to the disabled StudentID's 1-4):
[image: http://img163.imageshack.us/img163/1954/514201091121am.png]
Can you think of a query (compatible with MS Access's JET Engine) that can accomplish this goal? Or, maybe, you can offer some tips/perspectives that will point me in the right direction.
Thanks.
The only way to do this is through a series of queries and temporary tables.
First, I would create the following Make Table query that you would use to create a mapping of the bad StudentID to correct StudentID.
Select S1.StudentId As NewStudentId, S2.StudentId As OldStudentId
Into zzStudentMap
From Student As S1
Inner Join Student As S2
On S2.Name = S1.Name
Where S1.Disabled = False
And S2.StudentId <> S1.StudentId
And S2.Disabled = True
Next, you would use that temporary table to update the TestScore table with the correct StudentID.
Update TestScore
Inner Join zzStudentMap
On zzStudentMap.OldStudentId = TestScore.StudentId
Set StudentId = zzStudentMap.NewStudentId
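The two steps can be tried out with the sketch below. Several things in it are assumptions: the original screenshots are not available, so the sample rows and the TestScoreId and Score columns are made up, and the SQL runs on SQLite through Python's sqlite3 module rather than on Access. SQLite has no UPDATE ... INNER JOIN syntax, so the second step is written as a correlated subquery, which is a swapped-in equivalent of the Access statement above and not the same syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows are invented for illustration; Disabled uses 1/0 for True/False.
conn.executescript("""
CREATE TABLE Student (StudentId INTEGER PRIMARY KEY, Name TEXT, Disabled INTEGER);
CREATE TABLE TestScore (TestScoreId INTEGER PRIMARY KEY, StudentId INTEGER,
                        Score INTEGER);
INSERT INTO Student VALUES (1,'John',1), (2,'Sally',1), (5,'John',0), (6,'Sally',0);
INSERT INTO TestScore VALUES (1,1,90), (2,2,85), (3,5,70), (4,6,95);

-- Step 1: map each disabled StudentId to the enabled one with the same name.
CREATE TABLE zzStudentMap AS
SELECT S1.StudentId AS NewStudentId, S2.StudentId AS OldStudentId
FROM Student AS S1
INNER JOIN Student AS S2 ON S2.Name = S1.Name
WHERE S1.Disabled = 0
  AND S2.Disabled = 1
  AND S2.StudentId <> S1.StudentId;

-- Step 2: repoint test scores that still reference a disabled StudentId.
UPDATE TestScore
SET StudentId = (SELECT m.NewStudentId
                 FROM zzStudentMap m
                 WHERE m.OldStudentId = TestScore.StudentId)
WHERE StudentId IN (SELECT OldStudentId FROM zzStudentMap);
""")

remapped = conn.execute(
    "SELECT TestScoreId, StudentId FROM TestScore ORDER BY TestScoreId"
).fetchall()
print(remapped)
```

Note that matching on Name alone assumes each name identifies exactly one real student; two different students who share a name would be merged by this mapping.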
The most common technique to identify duplicates in a table is to group by the fields that represent duplicate records:
ID FIRST_NAME LAST_NAME
1 Brian Smith
3 George Smith
25 Brian Smith
In this case we want to remove one of the Brian Smith Records, or in your case, update the ID field so they both have the value of 25 or 1 (completely arbitrary which one to use).
SELECT min(id) AS id, first_name, last_name
FROM example
GROUP BY first_name, last_name
Using min on ID will return:
ID FIRST_NAME LAST_NAME
1 Brian Smith
3 George Smith
If you use max you would get
ID FIRST_NAME LAST_NAME
25 Brian Smith
3 George Smith
I usually use this technique to delete the duplicates, not update them:
DELETE FROM example
WHERE ID NOT IN (SELECT MAX (ID)
FROM example
GROUP BY first_name, last_name)
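The grouping and the delete can be checked against the three sample rows with the sketch below. It runs on SQLite through Python's sqlite3 module only to keep it runnable; the example table and its columns are taken from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE example (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
INSERT INTO example VALUES (1,'Brian','Smith'), (3,'George','Smith'),
                           (25,'Brian','Smith');
""")

# One row per (first_name, last_name) group, keeping the lowest id.
keepers = conn.execute("""
SELECT MIN(id) AS id, first_name, last_name
FROM example
GROUP BY first_name, last_name
ORDER BY 1
""").fetchall()

# Delete every row that is not the highest id within its name group.
conn.execute("""
DELETE FROM example
WHERE id NOT IN (SELECT MAX(id)
                 FROM example
                 GROUP BY first_name, last_name)
""")
remaining = conn.execute(
    "SELECT id, first_name, last_name FROM example ORDER BY id"
).fetchall()

print(keepers)
print(remaining)
```

If other tables reference the deleted ids, as TestScore does in the question, those references need to be repointed before the delete, otherwise the rows are orphaned.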