Connecting Multiple Tables in SQL while Limiting Exponential Results - sql

I have searched Google and Stack Overflow for an answer to my query, but I feel my lack of SQL vocabulary is standing in the way of finding one, as I believe this would be a common question. Any helpful pointers toward what I need to be reading are always welcome.
On to the question: I'm attempting to join three tables in Oracle 8i, for example a Company table, an Invoice table and a Jobs table. There is no link between the Invoice table and the Jobs table. I am hoping that in one query I can link all three tables, returning all invoices and all jobs for a company without returning all jobs for each invoice (see my example results below).
I want to see:
Company 1 Invoice 1 Job 1
Company 1 Invoice 2 Job 2
Company 1 Invoice 3 Job 3
I don't want to see:
Company 1 Invoice 1 Job 1
Company 1 Invoice 1 Job 2
Company 1 Invoice 1 Job 3
Company 1 Invoice 2 Job 1
Company 1 Invoice 2 Job 2
Company 1 Invoice 2 Job 3
Company 1 Invoice 3 Job 1
Company 1 Invoice 3 Job 2
Company 1 Invoice 3 Job 3
As always thank you for any help you can offer.
EDIT:
Essentially, the Invoice and Job tables both have a Company table key field; it's just that the Job and Invoice tables have no direct link to each other. If there are 2 invoices and 3 jobs (or vice versa), I'd ideally like it to show:
Company 1 Invoice 1 Job 1
Company 1 Invoice 2 Job 2
Company 1 Job 3
Although looking at the problem like this makes me feel that this is further away from an easier answer than I hoped.

Your requirement suggests that there is a problem with your schema. My first advice in this case would be to modify your schema: add a job_id to invoice or an invoice_id to job (or an N-N relationship table invoice_job).
If you're not willing to update your schema, you can work a query that will make the join. The following query will basically join job and invoice one-to-one:
SELECT c.company_id, ij.job_id, ij.invoice_id
  FROM company c
  LEFT JOIN (SELECT job_id, invoice_id,
                    NVL(j.company_id, i.company_id) company_id
               FROM (SELECT j.job_id, j.company_id,
                            row_number() OVER (PARTITION BY company_id
                                               ORDER BY job_id) job_no
                       FROM job j) j
               FULL OUTER JOIN
                    (SELECT i.invoice_id, i.company_id,
                            row_number() OVER (PARTITION BY company_id
                                               ORDER BY invoice_id) invoice_no
                       FROM invoice i) i
                 ON j.company_id = i.company_id
                AND j.job_no = i.invoice_no) ij
    ON c.company_id = ij.company_id
The join condition here is artificial. If you remove an invoice, the jobs and invoices relationship may change.
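The positional pairing described above can be tried out with a small sketch. This is only an illustration, not the Oracle query itself: it uses Python's built-in sqlite3 module, assumes an SQLite build with window functions (3.25 or later), and the CTE names numbered_jobs, numbered_invoices and slots are made up for the example. Because older SQLite versions lack FULL OUTER JOIN, the sketch emulates it by collecting every (company_id, row number) slot with UNION and left joining both sides to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (company_id INTEGER PRIMARY KEY);
    CREATE TABLE invoice (invoice_id INTEGER PRIMARY KEY, company_id INTEGER);
    CREATE TABLE job     (job_id     INTEGER PRIMARY KEY, company_id INTEGER);
    INSERT INTO company VALUES (1);
    INSERT INTO invoice VALUES (1, 1), (2, 1);         -- 2 invoices
    INSERT INTO job     VALUES (1, 1), (2, 1), (3, 1); -- 3 jobs
""")

PAIRING_SQL = """
WITH numbered_jobs AS (
    SELECT job_id, company_id,
           ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY job_id) AS rn
    FROM job
),
numbered_invoices AS (
    SELECT invoice_id, company_id,
           ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY invoice_id) AS rn
    FROM invoice
),
slots AS (
    -- every row position that exists on either side (stands in for FULL OUTER JOIN)
    SELECT company_id, rn FROM numbered_jobs
    UNION
    SELECT company_id, rn FROM numbered_invoices
)
SELECT c.company_id, i.invoice_id, j.job_id
FROM company c
LEFT JOIN slots s             ON s.company_id = c.company_id
LEFT JOIN numbered_invoices i ON i.company_id = s.company_id AND i.rn = s.rn
LEFT JOIN numbered_jobs j     ON j.company_id = s.company_id AND j.rn = s.rn
ORDER BY c.company_id, s.rn
"""

paired_rows = conn.execute(PAIRING_SQL).fetchall()
for row in paired_rows:
    print(row)
```

With 2 invoices and 3 jobs, the third row carries a job but no invoice, which matches the layout asked for in the question's edit. The pairing is purely by position, so it carries no business meaning.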
If the two tables are really unrelated, you may want instead to display the results differently, for example:
SELECT cj.company_id, cj.jobs,
       listagg(i.invoice_id, ',')
         WITHIN GROUP (ORDER BY i.invoice_id) invoices
  FROM (SELECT c.company_id,
               listagg(j.job_id, ',') WITHIN GROUP (ORDER BY job_id) jobs
          FROM company c LEFT JOIN job j ON c.company_id = j.company_id
         GROUP BY c.company_id) cj
  LEFT JOIN invoice i ON cj.company_id = i.company_id
 GROUP BY cj.company_id, cj.jobs;

COMPANY_ID JOBS   INVOICES
---------- ------ --------
         1 1,2,3  1,2

Related

How to query optimize many to many association with SQL?

Scenario:
Invoice has many purchase_orders and purchase_orders has many invoices. Intermediate table is npayment_links which has foreign key invoice_id, purchase_order_id.
Tech Stack
Rails 5.x,
Postgresql
Here is my sample data:
invoices
id   name         status
100  sample.pdf   archived
101  sample1.pdf  archived
102  sample2.pdf  archived
103  sample2.pdf  active
104  sample2.pdf  active
purchase_orders
id  title
1   first po
2   second po
3   third po
4   fourth po
npayment_links
id  purchase_order_id  invoice_id
1   1                  100
2   1                  101
3   1                  102
4   2                  100
5   2                  103
6   3                  104
7   4                  100
I am expecting a query that returns all purchase_orders whose invoices are all archived.
Looking at npayment_links:
purchase_orders with id=1 is associated with 3 invoices (100, 101, 102), all of which are archived.
purchase_orders with id=2 is associated with 2 invoices (100, 103), one archived and one active.
purchase_orders with id=3 is associated with 1 invoice (104), which is active.
purchase_orders with id=4 is associated with 1 invoice (100), which is archived.
I'm searching for a SQL query that returns the list of POs whose invoices are all archived.
Expected purchase_orders
id  title
1   first po
4   fourth po
I have solved this the Rails ActiveRecord way, but I'm looking for a SQL query that achieves the same thing:
Invoice.find(100).purchase_orders.each do |po|
  if po.invoices.all? { |inv| inv.archived? }
    # po.update(status: :done)  # I will do some operation here.
  end
end
With thousands of records, where each PO in turn has many invoices, I suspect this loop will issue a lot of queries, so I am searching for a more optimized solution.
Any feedback would be appreciated.
You need to use JOINs to connect your tables. You can then select all purchase orders that have archived invoices and, with EXCEPT, subtract a second select that returns the POs that have active invoices. EXCEPT returns the rows that appear in the first select but not in the second.
SELECT
po.*
FROM
purchase_orders po
JOIN npayment_links pl on po.id = pl.purchase_order_id
JOIN invoices i on pl.invoice_id = i.id
WHERE i.status LIKE 'archived'
EXCEPT
SELECT
po.*
FROM
purchase_orders po
JOIN npayment_links pl on po.id = pl.purchase_order_id
JOIN invoices i on pl.invoice_id = i.id
WHERE i.status LIKE 'active'
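To check the EXCEPT approach against the sample data from the question, here is a small sketch using Python's built-in sqlite3 module (an assumption for illustration only; the question targets PostgreSQL). It also runs a GROUP BY ... HAVING variant, which is not part of the answer above, and which treats any status other than 'archived' as disqualifying rather than only 'active'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, name TEXT, status TEXT);
    CREATE TABLE purchase_orders (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE npayment_links (
        id INTEGER PRIMARY KEY, purchase_order_id INTEGER, invoice_id INTEGER);
    INSERT INTO invoices VALUES
        (100, 'sample.pdf', 'archived'), (101, 'sample1.pdf', 'archived'),
        (102, 'sample2.pdf', 'archived'), (103, 'sample2.pdf', 'active'),
        (104, 'sample2.pdf', 'active');
    INSERT INTO purchase_orders VALUES
        (1, 'first po'), (2, 'second po'), (3, 'third po'), (4, 'fourth po');
    INSERT INTO npayment_links VALUES
        (1, 1, 100), (2, 1, 101), (3, 1, 102), (4, 2, 100),
        (5, 2, 103), (6, 3, 104), (7, 4, 100);
""")

# POs with at least one archived invoice, minus POs with at least one active one.
EXCEPT_SQL = """
SELECT po.id, po.title
FROM purchase_orders po
JOIN npayment_links pl ON po.id = pl.purchase_order_id
JOIN invoices i ON pl.invoice_id = i.id
WHERE i.status = 'archived'
EXCEPT
SELECT po.id, po.title
FROM purchase_orders po
JOIN npayment_links pl ON po.id = pl.purchase_order_id
JOIN invoices i ON pl.invoice_id = i.id
WHERE i.status = 'active'
ORDER BY 1
"""

# Alternative: keep the groups that contain zero non-archived invoices.
HAVING_SQL = """
SELECT po.id, po.title
FROM purchase_orders po
JOIN npayment_links pl ON po.id = pl.purchase_order_id
JOIN invoices i ON pl.invoice_id = i.id
GROUP BY po.id, po.title
HAVING SUM(CASE WHEN i.status <> 'archived' THEN 1 ELSE 0 END) = 0
ORDER BY po.id
"""

except_rows = conn.execute(EXCEPT_SQL).fetchall()
having_rows = conn.execute(HAVING_SQL).fetchall()
print(except_rows)
print(having_rows)
```

Both queries agree on this data. They would diverge if a third status existed, since the EXCEPT form only subtracts purchase orders that have an 'active' invoice.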
Maybe not the most optimized solution but the following should work:
PurchaseOrder.where.not(id:
PurchaseOrder
.select(:id)
.joins(:invoices)
.where.not(invoices: {status: 'archived'})
)
The thought process is to find all the purchase orders whose id is not in the list of ids of purchase orders that have an invoice with a status other than archived. This results in the following SQL:
SELECT
purchase_orders.*
FROM
purchase_orders
WHERE
purchase_orders.id NOT IN (
SELECT
purchase_orders.id
FROM
purchase_orders
INNER JOIN invoice_links ON invoice_links.purchase_order_id = purchase_orders.id
INNER JOIN invoices ON invoices.id = invoice_links.invoice_id
WHERE
invoices.status <> 'archived'
)
This will also return Purchase Orders that do not have any invoices. If having an invoice is also a requirement you can simply add joins(:invoices) e.g.
PurchaseOrder
.joins(:invoices)
.where.not(id:
PurchaseOrder
.select(:id)
.joins(:invoices)
.where.not(invoices: {status: 'archived'})
)
Note: Your question states the join table is invoice_links and then references npayment_links, so I am unsure which is the actual join table. For my example I will assume the join table is invoice_links, as that makes more logical sense; however, provided the associations are set up correctly in the ORM, this assumption has no impact on the functionality of the proposed solution.

SQL Selecting & Counting In the same query

Thanks in advance for any help on this. I am a bit of a newbie to MS SQL and I want to do something that I think is achievable, but I don't have the know-how.
I have a simple table called "suppliers" where I can do (SELECT id, name FROM suppliers ORDER BY id ASC)
id  name
1   ACME
2   First Stop Business Supplies
3   All in One Supply Warehouse
4   Farm First Supplies
I have another table called "products"
id  name    supplier_id
1   Item 1  2
2   Item 2  1
3   Item 3  1
4   Item 4  3
5   Item 5  2
I want to list all the suppliers and get the total number of products for each supplier on the same row, if that makes sense. I am just not sure how to pass the suppliers.id through the query to get the count.
I am hoping to get to this:
id  name                          total_products
1   ACME                          2
2   First Stop Business Supplies  2
3   All in One Supply Warehouse   1
4   Farm First Supplies           0
I really appreciate any help on this.
Three concepts to grasp here: LEFT JOIN, GROUP BY, and COUNT().
select s.id, s.name, Count(p.id) as total_products
from suppliers s
left join products p on s.id=p.supplier_id --the left join keeps suppliers with no matches
group by s.id, s.name
A left join keeps every row from the first table even if there is no match in the second; the second table's columns are NULL for those rows.
GROUP BY collapses the rows that share the same values in the listed columns into one row per group.
Count(p.id) counts the rows in each group where p.id is not NULL, so a supplier with no products gets 0. Count(*) would count the NULL-filled row produced by the left join and report 1 instead.
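One detail worth checking when counting over a LEFT JOIN is what gets counted for a supplier that has no products. The sketch below is an illustration using Python's built-in sqlite3 module rather than MS SQL, loaded with the question's sample data. It compares COUNT(*) with COUNT(p.id) on the same join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, supplier_id INTEGER);
    INSERT INTO suppliers VALUES
        (1, 'ACME'), (2, 'First Stop Business Supplies'),
        (3, 'All in One Supply Warehouse'), (4, 'Farm First Supplies');
    INSERT INTO products VALUES
        (1, 'Item 1', 2), (2, 'Item 2', 1), (3, 'Item 3', 1),
        (4, 'Item 4', 3), (5, 'Item 5', 2);
""")

COUNT_SQL = """
SELECT s.id, s.name,
       COUNT(*)    AS count_star,   -- counts joined rows, including NULL-filled ones
       COUNT(p.id) AS count_col     -- counts only rows where a product matched
FROM suppliers s
LEFT JOIN products p ON s.id = p.supplier_id
GROUP BY s.id, s.name
ORDER BY s.id
"""

count_rows = conn.execute(COUNT_SQL).fetchall()
for supplier_id, name, count_star, count_col in count_rows:
    print(supplier_id, name, count_star, count_col)
```

For supplier 4 (Farm First Supplies), COUNT(*) reports 1 while COUNT(p.id) reports 0, and only the latter matches the expected output in the question.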
Try this :-
SELECT id, name, C.total_products
FROM Suppliers S
OUTER APPLY (
SELECT Count(id) AS total_products
FROM Products P
WHERE P.supplier_id = S.id
) C

SQL to fetch Data from 2 tables only using Join where we need empty records from second table as well

We have a situation in which one part of our stored procedure needs to be filled with a join query that has multiple filters in it. We need a solution that uses only a join (it is easy to implement with a subquery, but our situation demands a join, since the procedure follows it with a WHERE clause).
We have two tables, Customer and Order. We need to exclude a row of the Customer table if its Customer_id is present in the Order table with order_code = 10 and Customer.Grade = 3. Not every Customer_id is present in the Order table, but customers without orders are still needed in the final result.
Customer Table OrderTable
Customer_id Grade Customer_id order_code
1 3 1 10
2 3 1 40
3 2 2 50
4 3 3 30
*Multiple Customer_id can be present in the OrderTable
Expected result :
Customer_id Grade
2 3
3 2
4 3
I think this may be what you need, not sure I understand the question properly.
select c.customer_id, c.grade
from customer c left join customer_order o on (c.customer_id = o.customer_id and o.order_code <> 10)
where c.grade = 3
This returns all customers with a Grade of 3, together with any of their orders whose order_code is not 10. Because it is a left join, customers that have no such orders are shown as well; use an inner join if you only want customers with matching orders.
You can express the logic like this:
select c.*
from customers c
where not (grade = 3 and
exists (select 1
from orders o
where o.customer_id = c.customer_id and
o.order_code = 10
)
);
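Since the question asks for a join-only form, one possible sketch is an anti-join: left join on exactly the condition that should disqualify a customer, then keep the rows where no match was found. The example below uses Python's built-in sqlite3 module for illustration; the table names customer and customer_order are assumptions (ORDER is a reserved word), and the NOT EXISTS query is included only to confirm that both forms agree on the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, grade INTEGER);
    CREATE TABLE customer_order (customer_id INTEGER, order_code INTEGER);
    INSERT INTO customer VALUES (1, 3), (2, 3), (3, 2), (4, 3);
    INSERT INTO customer_order VALUES (1, 10), (1, 40), (2, 50), (3, 30);
""")

# Anti-join: the ON clause holds the full exclusion rule, so a match means
# "this customer must be dropped"; IS NULL keeps everyone without a match.
ANTI_JOIN_SQL = """
SELECT c.customer_id, c.grade
FROM customer c
LEFT JOIN customer_order o
       ON o.customer_id = c.customer_id
      AND o.order_code = 10
      AND c.grade = 3
WHERE o.customer_id IS NULL
ORDER BY c.customer_id
"""

# Subquery form, for comparison only.
NOT_EXISTS_SQL = """
SELECT c.customer_id, c.grade
FROM customer c
WHERE NOT (c.grade = 3 AND EXISTS (
    SELECT 1 FROM customer_order o
    WHERE o.customer_id = c.customer_id AND o.order_code = 10))
ORDER BY c.customer_id
"""

anti_join_rows = conn.execute(ANTI_JOIN_SQL).fetchall()
not_exists_rows = conn.execute(NOT_EXISTS_SQL).fetchall()
print(anti_join_rows)
```

Customer 1 is dropped because it has grade 3 and an order with code 10; customer 4 is kept even though it has no orders at all.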

SQL: Unable to find a join or union to produce the following table

A Pupil table with { ID, LastName}
a Subject Table with {ID, SubjectName}
and a Report Table with {ID, PupilID, SubjectID, Grade}
There is a one-to-many relationship between Pupil and Report Tables, and Subject and Report Tables.
I want to generate a table like this for say subjectID = 1
Pupil.ID  Pupil.LastName  SubjectID  Grade
1         Smith           1          B
2         Jones           1          NULL
3         Weston          1          NULL
4         Knightly        1          A
The problem is that the Report table would contain just 2 entries for subject 1:
PupilID  SubjectID  Grade
1        1          B
4        1          A
Left joins don't seem to work since there are only 2 entries in the report table for subject 1
SAMPLE DATA
{Pupil Table}
ID  LastName
1   Smith
2   Jones
3   Weston
4   Knightly
{Subject Table}
ID  SubjectName
1   Maths
2   Physics
3   Chemistry
{Report Table}
ID  PupilID  SubjectID  Grade
1   1        1          B
2   4        1          A
When I do a search on SubjectID = 1 I want the table:
Pupil.ID  Pupil.LastName  SubjectID  Grade
1         Smith           1          B
2         Jones           1          NULL
3         Weston          1          NULL
4         Knightly        1          A
Access doesn't do subqueries very easily, so everything gets crammed into the FROM clause with a series of wrapped parentheses. Based on your sample data and my fighting Access to stop being unnecessarily difficult, I came up with this:
SELECT ps.Pupil_ID, ps.LastName, ps.Subject_ID, r.Grade
FROM (SELECT * FROM (SELECT ID AS Pupil_ID, LastName FROM Pupil) p,
(SELECT DISTINCT ID AS Subject_ID FROM Subject)) ps
LEFT JOIN REPORT r ON r.PupilID = ps.Pupil_ID AND r.SubjectID = ps.Subject_ID
ORDER BY Pupil_ID, Subject_ID;
The subquery "ps" is a cartesian join of the Pupil and Subject table views that I specified. At this point, your query would look like this:
(LastName column not shown for clarity)
Pupil_ID  Subject_ID
1         1
1         2
1         3
2         1
2         2
2         3
3         1
3         2
3         3
4         1
4         2
4         3
Now, using that Cartesian join subquery (pupil-subject -> ps), I use a LEFT JOIN to attach the Report table to each unique combination of pupil ID and subject ID. Therefore, if a pupil did not take a particular subject, there will be a NULL grade in the final result.
I tested this in Access using your sample data and it works on my machine.
Also as a note, it is poor practice to have a field called just ID in each table; prefer a descriptive name (e.g. in the Pupil table, ID becomes PupilID). This makes the fields much easier to use, and it self-documents.
Cross join pupil and subject tables and left join result to report table
What you need is a cross join:
SELECT Pupil.ID, Pupil.LastName, Subject.ID AS SubjectID, Report.Grade
FROM Pupil CROSS JOIN Subject
LEFT JOIN Report ON Report.SubjectID = Subject.ID
                AND Report.PupilID = Pupil.ID
WHERE Subject.ID = 1
To combine every pupil with every (or with a particular) subject, use a cross join; then use a left join to get the corresponding grades:
select *
from pupil p cross join (select * from subject where id = 1) s
left join report on subjectId = s.id and pupilId = p.id
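The cross join followed by a left join can be checked against the question's sample data with a short sketch. It uses Python's built-in sqlite3 module purely for illustration (the question itself targets Access, whose join syntax differs), and the aliases p, s and r are arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Pupil (ID INTEGER PRIMARY KEY, LastName TEXT);
    CREATE TABLE Subject (ID INTEGER PRIMARY KEY, SubjectName TEXT);
    CREATE TABLE Report (
        ID INTEGER PRIMARY KEY, PupilID INTEGER, SubjectID INTEGER, Grade TEXT);
    INSERT INTO Pupil VALUES (1, 'Smith'), (2, 'Jones'), (3, 'Weston'), (4, 'Knightly');
    INSERT INTO Subject VALUES (1, 'Maths'), (2, 'Physics'), (3, 'Chemistry');
    INSERT INTO Report VALUES (1, 1, 1, 'B'), (2, 4, 1, 'A');
""")

# Every pupil is paired with the chosen subject first; the report row is then
# attached only where both the pupil and the subject match.
REPORT_SQL = """
SELECT p.ID, p.LastName, s.ID AS SubjectID, r.Grade
FROM Pupil p
CROSS JOIN Subject s
LEFT JOIN Report r ON r.PupilID = p.ID AND r.SubjectID = s.ID
WHERE s.ID = ?
ORDER BY p.ID
"""

report_rows = conn.execute(REPORT_SQL, (1,)).fetchall()
for row in report_rows:
    print(row)
```

The SubjectID column is taken from Subject rather than Report, so it stays filled in for pupils who have no grade. Pupils without a report row come back with None (NULL) as the grade.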

NHibernate + join to derived table

In a table that stores multiple rows per employee, I want to pull one row per employee that represents the most recent entry for each employee. Here's where I am with hand-written SQL:
SELECT [all the selected columns here]
FROM Nominations t
inner join
  (select max(NominationId) mostRecentNominationId,
          EmployeeId
   from Nominations
   group by EmployeeId) n
  on n.mostRecentNominationId = t.NominationId
From source data like this:
nomination_id employee_id
-------------------------------
1 5
2 5
4 10
7 10
That'll give me something like this:
nomination_id employee_id
-------------------------------
2 5
7 10
I haven't been able to figure out how to accomplish that type of query via NHibernate ICriteria. Any thoughts?
Here is what you need to do:
DetachedCriteria dCriteria = DetachedCriteria.For<Nomination>("nomination")
    .SetProjection(Projections.Max("nomination.Id"))
    .Add(Restrictions.EqProperty("nomination.EmployeeId", "employee.Id"));
var nominations = Session.CreateCriteria<Nomination>("nom")
    .CreateCriteria("Employee", "employee")
    .Add(Subqueries.PropertyEq("nom.Id", dCriteria))
    .List<Nomination>();
This is not identical to the SQL query provided in the question, but it returns the same result.
The SQL query that is generated by the above criteria query is:
SELECT *
FROM Nomination nom
inner join Employee employee on nom.EmployeeId=employee.EmployeeId
WHERE nom.NominationId =
(SELECT max(nomination.NominationId) as maxID
FROM Nomination nomination
WHERE nomination.EmployeeId = employee.EmployeeId)
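To see that the derived-table join from the question and the correlated subquery shape shown above select the same rows, here is a small sketch on the question's sample data. It runs plain SQL through Python's built-in sqlite3 module and does not involve NHibernate; the single-table layout and the alias inner_nom are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Nominations (NominationId INTEGER PRIMARY KEY, EmployeeId INTEGER);
    INSERT INTO Nominations VALUES (1, 5), (2, 5), (4, 10), (7, 10);
""")

# Shape used in the question: join back to a derived table of per-employee maxima.
DERIVED_TABLE_SQL = """
SELECT t.NominationId, t.EmployeeId
FROM Nominations t
INNER JOIN (SELECT MAX(NominationId) AS mostRecentNominationId, EmployeeId
            FROM Nominations
            GROUP BY EmployeeId) n
        ON n.mostRecentNominationId = t.NominationId
ORDER BY t.NominationId
"""

# Shape produced by the criteria query: a correlated MAX subquery in the WHERE clause.
CORRELATED_SQL = """
SELECT nom.NominationId, nom.EmployeeId
FROM Nominations nom
WHERE nom.NominationId = (SELECT MAX(inner_nom.NominationId)
                          FROM Nominations inner_nom
                          WHERE inner_nom.EmployeeId = nom.EmployeeId)
ORDER BY nom.NominationId
"""

derived_rows = conn.execute(DERIVED_TABLE_SQL).fetchall()
correlated_rows = conn.execute(CORRELATED_SQL).fetchall()
print(derived_rows)
print(correlated_rows)
```

Both return one row per employee, the one with the highest NominationId.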