Oracle SQL GROUP BY clause containing joins - sql

I'm having issues writing this query correctly. Below is the goal, my current query, and attached are the scripts to build and populate the database. Thanks for any assistance!
For each DVD in the catalog, display it’s title, length, release_date, and how many times it has been checked out by all customers across all libraries. Include those that have not been checked out yet (display as 0). Sort results by title.
SELECT C.TITLE, D.LENGTH, C.RELEASE_DATE, COUNT(T.TRANSACTION_ID)
FROM catalog_item C
INNER JOIN dvd D ON D.CATALOG_ITEM_ID = C.CATALOG_ITEM_ID
INNER JOIN physical_item P ON P.CATALOG_ITEM_ID = C.CATALOG_ITEM_ID
LEFT OUTER JOIN transaction T ON T.PHYSICAL_ITEM_ID = P.PHYSICAL_ITEM_ID
GROUP BY C.TITLE;
Run first: https://drive.google.com/open?id=1PYAZV4KIfZtxP4eQn35zsczySsxDM7ls
Run second: https://drive.google.com/open?id=1pAzWmJqvD3o3n6YJqVUM6TtxDafKGd3f
EDIT
I've gotten the query working but haven't figured out how to get DVDs with zero checkouts to show up. Below is my updated query.
SELECT C.TITLE, D.LENGTH, C.RELEASE_DATE, COUNT(T.TRANSACTION_ID)
FROM catalog_item C
INNER JOIN dvd D ON D.CATALOG_ITEM_ID = C.CATALOG_ITEM_ID
INNER JOIN physical_item P ON P.CATALOG_ITEM_ID = C.CATALOG_ITEM_ID
LEFT OUTER JOIN transaction T ON T.PHYSICAL_ITEM_ID = P.PHYSICAL_ITEM_ID
GROUP BY C.TITLE, D.LENGTH, C.RELEASE_DATE;
SOLVED
Figured out the issue with GMB's help. Final query below!
SELECT C.TITLE, D.LENGTH, C.RELEASE_DATE, NVL(COUNT(T.TRANSACTION_ID), 0) AS NUMBER_OF_CHECKOUTS
FROM catalog_item C
INNER JOIN dvd D ON D.CATALOG_ITEM_ID = C.CATALOG_ITEM_ID
LEFT JOIN physical_item P ON P.CATALOG_ITEM_ID = C.CATALOG_ITEM_ID
LEFT JOIN transaction T ON T.PHYSICAL_ITEM_ID = P.PHYSICAL_ITEM_ID
GROUP BY C.TITLE, D.LENGTH, C.RELEASE_DATE
ORDER BY C.TITLE;

Your second query sure looks better than the first one, as it has the correct GROUP BY clause.
It is hard to provide a 100% sure response without seeing the full tables structures, however if you are still missing records with 0 checkouts in the output, it means that one of your INNER JOINs is not matching. In other words, you have DVDs in your catalog that are either not present in the dvd table, or not present in the physical_item table. As both tables look like referential table, this could indicate a discrepency in your data. I would recommend to change all INNER JOINs to LEFT JOINs to work around this issue.
Also please note that if no checkout happened for a DVD, expression COUNT(T.TRANSACTION_ID) will yield NULL : hence you want to wrap it in a NVL function to handle this case.
New query :
SELECT C.TITLE, D.LENGTH, C.RELEASE_DATE, NVL(COUNT(T.TRANSACTION_ID), 0)
FROM catalog_item C
LEFT JOIN dvd D ON D.CATALOG_ITEM_ID = C.CATALOG_ITEM_ID
LEFT JOIN physical_item P ON P.CATALOG_ITEM_ID = C.CATALOG_ITEM_ID
LEFT JOIN transaction T ON T.PHYSICAL_ITEM_ID = P.PHYSICAL_ITEM_ID
GROUP BY C.TITLE, D.LENGTH, C.RELEASE_DATE;

Related

How to inner join with multiple tables for 3NF Normalization in SQL

I am trying to create normalization 3 nf (normal form) in this database. However, when I execute the query, the table is empty. As you will see, this is my table and diagram.
Here is the relationship that we want to 3NF
Torder->(Customer-Food-Carrier-Waiter)-(Ternary Relationship)-(Customer_id,Food_id,Carrier_id,Waiter_id)
Here is my query
SELECT
Customers.ID AS CustomerID,
Food.ID AS FoodID,
Carrier.ID AS CarrierID,
Waiter.ID AS WaiterID,
tOrder.ID AS TORDERID
FROM
tOrder
INNER JOIN
Customers ON Customers.ID = tOrder.Customer_id
INNER JOIN
Food ON Food.ID = tOrder.Food_id
INNER JOIN
Carrier ON Carrier.ID = tOrder.Carrier_id
INNER JOIN
Waiter ON Waiter.ID = tOrder.Waiter_id
ORDER BY tOrder.ID;
Clearly, you have an issue where some of the tables are empty or the ids do not match. You can use LEFT JOIN to keep all orders and see what is happening:
FROM tOrder o LEFT JOIN
Customers c
ON c.ID = o.Customer_id LEFT JOIN
Food f
ON f.ID = o.Food_id LEFT JOIN
Carrier ca
ON ca.ID = o.Carrier_id LEFT JOIN
Waiter w
ON w.ID = o.Waiter_id
If there are no matches, then the orders are still in the results, with NULL for the values in the table with no matches.
Note that this also introduces table aliases, so the query is easier to write and to read.

retrieve people on same flight

I want to retrieve all people on same flight.
Db Model looks like:
Airplane(airplane_id),airplane_name,modelno)
Passenger (id,firstname,lastname,...)
Booking(booking_id,passenger_id,destination_id,flight_date,....)
Destinations(id,airplane_id,route,distance)
What do i need to correct in this query
SELECT * FROM Passenger as P
left join Booking as B
on P.id = B.passenger_id
left join Destinations as D
on D.id = B.destination_id
left join Airplane as A
on A.airplane_id = D.airplane_id;
How will i use where statement??
Your main issue is that you are joining on all the wrong Ids. See below for what they should be based on your column names. (If this is incorrect I would highly recommend changing your column names so they are less confusing). I have also included a sample where clause, you can put any condition you like in there though.
SELECT P.*
FROM
Passenger AS P
JOIN Booking AS B ON P.Id = B.Passenger_Id
JOIN Destinations AS D ON D.Id = B.Destination_Id
JOIN Airplane AS A ON A.Airplane_Id = D.Airplane_Id
WHERE
D.flight_date = '3/16/2021'
you can execute this query, then add a where clause to filter the results
SELECT *
FROM Airplane a
inner join Booking b on b.airplane_id = a.airplane_id
inner join Passenger p on p.id = b.passenger_id
inner join Destination d on d.id = b.destination_id

WHERE clause in an SQL query

I THINK what is happening with this query is if there are no records in the GenericAttribute table associated with the Product, then that product is not displayed. See line below in WHERE clause: "AND GenericAttribute.KeyGroup = 'Product'"
Is there a way to reword so that that part of the WHERE is ignored if no associated record in the GenericAttribute table?
Also, looking at my ORDER BY clause, will a record from the product table still show up if it has no associated record in the Pvl_AdDates table?
Thanks!
SELECT DISTINCT Product_Category_Mapping.CategoryId, Product.Id, Product.Name, Product.ShortDescription, Pvl_AdDates.Caption, Pvl_AdDates.EventDateTime, convert(varchar(25), Pvl_AdDates.EventDateTime, 120) AS TheDate, Pvl_AdDates.DisplayOrder, Pvl_Urls.URL, [Address].FirstName, [Address].LastName, [Address].Email, [Address].Company, [Address].City, [Address].Address1, [Address].Address2, [Address].ZipPostalCode, [Address].PhoneNumber
FROM [Address]
RIGHT JOIN (GenericAttribute
RIGHT JOIN (Pvl_Urls RIGHT JOIN (Pvl_AdDates
RIGHT JOIN (Product_Category_Mapping
LEFT JOIN Product
ON Product_Category_Mapping.ProductId = Product.Id)
ON Pvl_AdDates.ProductId = Product.Id)
ON Pvl_Urls.ProductId = Product.Id)
ON GenericAttribute.EntityId = Product.Id)
ON Address.Id = convert(int, GenericAttribute.Value)
WHERE
Product_Category_Mapping.CategoryId=12
AND GenericAttribute.KeyGroup = 'Product'
AND Product.Published=1
AND Product.Deleted=0
AND Product.AvailableStartDateTimeUtc <= getdate()
AND Product.AvailableEndDateTimeUtc >= getdate()
ORDER BY
Pvl_AdDates.EventDateTime DESC,
Product.Id,
Pvl_AdDates.DisplayOrder
I strongly encourage you to not mix left join and right join. I have written many SQL queries and cannot think of an occasion when that was necessary.
In fact, just stick to left join.
If you want all products (or at least all products not filtered out by the where clause), then start with the products table and go from there:
FROM Products p LEFT JOIN
Product_Category_Mapping pcm
ON pcm.ProductId = p.Id LEFT JOIN
Pvl_AdDates ad
ON ad.ProductId = p.id LEFT JOIN
Pvl_Urls u
ON u.ProductId = p.id LEFT JOIN
GenericAttribute ga
ON ga.EntityId = p.id LEFT JOIN
Address a
ON a.Id = convert(int, ga.Value)
Note that I added table aliases. These make queries easier to write and to read.
I would add a caution. It looks like you are combining data along different dimensions. You are likely to get a Cartesian product of the dimension attributes for each dimension. Perhaps that is what you want or the WHERE clause takes care of the additional rows.
Yes put constraints (restrictions) on tables on the outer side of outer joins in the on conditions of the outer join, not in the where clause. Conditions in where clauses are not evaluated and applied until after the outer joins are evaluated, so where there is not record in the outer table, the predicate will be false and entire row will be eliminated, undoing the outer-ness. Conditions in the join are evaluated during the join, before the rows from the inner side are added back in, so the result set will still include them.
Second, formatting formatting, formatting! Stick to one direction of join (left is easier) and use Aliases for tables names!
SELECT DISTINCT m.CategoryId, p.Id,
p.Name, p.ShortDescription, d.Caption, d.EventDateTime,
convert(varchar(25), d.EventDateTime, 120) TheDate,
d.DisplayOrder, u.URL, a.FirstName, a.LastName,
a.Email, a.Company, a.City, a.Address1, a.Address2,
a.ZipPostalCode, a.PhoneNumber
FROM Product_Category_Mapping m
left join Product p on p.Id = m.ProductId
and p.Published=1
and p.Deleted=0
and p.AvailableStartDateTimeUtc <= getdate()
and p.AvailableEndDateTimeUtc >= getdate()
left join Pvl_AdDates d ON d.ProductId = p.Id
left join Pvl_Urls u ON u.ProductId = p.Id
left join GenericAttribute g ON g.EntityId = p.Id
and g.KeyGroup = 'Product'
left join [Address] a ON a.Id = convert(int, g.Value)
WHERE m.CategoryId=12
ORDER BY d.EventDateTime DESC, p.Id, d.DisplayOrder

Outer Join or NOT IN

Need a little help solving problem. Apologies if this is vague which is why I need help. Show The title ID, title, publisher id, publisher name, and terms for all books that have never been sold. This means that there are no records in the Sales Table. Use OUTER JOIN or NOT IN against SALES table. In this case, the terms are referring to the items on the titles table involving money. ( Price, advance, royalty ).
I've come up with the following code and there are no records returned. I've looked at the tables and no records should be returned. Not sure my code is correct based on using the OuterJoin or NOT IN. Thanks!
SELECT titles.title_id,
titles.title,
publishers.pub_id,
sales.payterms,
titles.price,
titles.advance,
titles.royalty
FROM publishers
INNER JOIN titles
ON publishers.pub_id = titles.pub_id
INNER JOIN sales
ON titles.title_id = sales.title_id
WHERE (sales.qty = 0)
I believe all you need it to change the INNER JOIN to a LEFT JOIN for your sales table. Then check for NULL on any of the columns from the sales table.
SELECT titles.title_id, titles.title, publishers.pub_id, sales.payterms,
titles.price, titles.advance, titles.royalty
FROM publishers INNER JOIN
titles ON publishers.pub_id = titles.pub_id LEFT JOIN
sales ON titles.title_id = sales.title_id
WHERE sales.qty IS NULL
Well, you are using INNER JOIN so it will only return records that match the join condition which is not what you want. You should use LEFT OUTER JOIN (or LEFT JOIN) and check that the sales side is null (meaning there's no record matching).
SELECT titles.title_id, titles.title, publishers.pub_id, sales.payterms,
titles.price, titles.advance, titles.royalty
FROM publishers INNER JOIN
titles ON publishers.pub_id = titles.pub_id LEFT OUTER JOIN
sales ON titles.title_id = sales.title_id
WHERE sales.title_id IS NULL
IS NULL Means that a data value does not exist in the database. And just change your INNER JOIN TO LEFT JOIN.
use this..
SELECT titles.title_id,
titles.title,
publishers.pub_id,
sales.payterms,
titles.price,
titles.advance,
titles.royalty
FROM publishers
INNER JOIN titles
ON publishers.pub_id = titles.pub_id
LEFT JOIN sales
ON titles.title_id = sales.title_id
WHERE (sales.qty IS NULL)

Issues with joining many tables getting to many values

This is my script:
select c.rendering_id as prov_number, c.begin_date_of_service as date_of_service,
c.practice_id as group_number, v.enc_nbr as invoice, p.person_nbr as patient,
v.enc_nbr as invoice_number, c.charge_id as transaction_number,
t.med_rec_nbr as primary_mrn, p.last_name, p.first_name,
z.payer_id as orig_fsc_number, z.payer_id as curr_fsc_number,
c.location_id as location_number, c.closing_date as posting_date,
c.quantity as service_units, c.amt as charge_amount,
c.cpt4_code_id as procedure_code, r.description as procedure_name,
x.tran_code_id as pay_code_number, ISNULL([modifier_1],'') as modifier_code_1,
ISNULL([modifier_2],'') as modifier_code_2, ISNULL([modifier_3],'') as modifier_code_3,
ISNULL ([icd9cm_code_id],'') as dx_code_1, ISNULL ([icd9cm_code_id_2],'') as dx_code_2,
ISNULL ([icd9cm_code_id_3],'') as dx_code_3, ISNULL ([icd9cm_code_id_4],'') as dx_code_4
from charges c, person p, patient t, patient_encounter v, encounter_payer z, cpt4_code_mstr r, transactions x
where c.person_id = p.person_id
and c.person_id = t.person_id
and c.person_id = v.person_id
and c.person_id = z.person_id
and c.cpt4_code_id = r.cpt4_code_id
and c.person_id = x.person_id
and c.practice_id = '0001'
and c.closing_date >= GetDate() - 7
I should be getting about 14k rows but with this I am getting a couple hundred thousand. I feel like there should be an inner join here to correct it but I have read through a bunch of posts and can seem to get it working. Its by far the biggest pull I have ever done in SQL.
Any help would be greatly help.
Without knowing more about the data structures and foreign key relationships, this answer is just educated speculation. Before answering, though, you need to learn proper JOIN syntax. Your query should look like:
from charges c join
person p
on . . . .
That said, you problem is probably that you are joining along multiple dimensions at the same time. Although not explicitly clear, I am guessing that a person could have multiple patient encounters, say A, B, and C. A person might also have multiple charges, say 10, 11, and 12.
Your query will produce nine rows in this case, one for each combination.
In other words, you need to identify:
Verify the join keys between tables. Is a table called transactions really joined to encounters and costs using the person_id?
Find out where you are getting cross products, and split into two subqueries that are then appropriately joined together.
I would suggest that you start with the first two tables, and see whether you get the expected row count for:
select *
from charges c join
person p
on c.person_id = p.person_id
where c.practice_id = '0001' and
c.closing_date >= GetDate() - 7
Then build up the query one table at a time to get the results you want.
One last note, when using table aliases, I find it much clearer to use aliases that evoke the table. "C" for charges is very good. Consider something like "pe" for patient_encounters, and so on.
It should be like this or you can use left join
select c.rendering_id as prov_number, c.begin_date_of_service as date_of_service,
c.practice_id as group_number, v.enc_nbr as invoice, p.person_nbr as patient,
v.enc_nbr as invoice_number, c.charge_id as transaction_number,
t.med_rec_nbr as primary_mrn, p.last_name, p.first_name,
z.payer_id as orig_fsc_number, z.payer_id as curr_fsc_number,
c.location_id as location_number, c.closing_date as posting_date,
c.quantity as service_units, c.amt as charge_amount,
c.cpt4_code_id as procedure_code, r.description as procedure_name,
x.tran_code_id as pay_code_number, ISNULL([modifier_1],'') as modifier_code_1,
ISNULL([modifier_2],'') as modifier_code_2, ISNULL([modifier_3],'') as modifier_code_3,
ISNULL ([icd9cm_code_id],'') as dx_code_1, ISNULL ([icd9cm_code_id_2],'') as dx_code_2,
ISNULL ([icd9cm_code_id_3],'') as dx_code_3, ISNULL ([icd9cm_code_id_4],'') as dx_code_4
from charges c
inner join person p on c.person_id = p.person_id
inner join patient t on c.person_id = t.person_id
inner join patient_encounter v on c.person_id = v.person_id
inner join encounter_payer z on c.person_id = z.person_id
inner join cpt4_code_mstr r on c.cpt4_code_id = r.cpt4_code_id
inner join transactions x on c.person_id = x.person_id
where c.practice_id = '0001'
and c.closing_date >= GetDate() - 7
Now you comment one inner join at a time and execute below query and see which of these joins is causing one to many relationship...when the count gives you say around 14 K that means the commented table is causing 1 to many relationship.
Otherwise best way is to find the relationship based on unique key,primary key and FK on these tables.
select
count(c.person_id)
from charges c
inner join person p on c.person_id = p.person_id
inner join patient t on c.person_id = t.person_id
inner join patient_encounter v on c.person_id = v.person_id
inner join encounter_payer z on c.person_id = z.person_id
inner join cpt4_code_mstr r on c.cpt4_code_id = r.cpt4_code_id
inner join transactions x on c.person_id = x.person_id
where c.practice_id = '0001'
and c.closing_date >= GetDate() - 7
You can try
select count(*) from <tablename> group by person_id having count(*) > 1
and repeat above query for all tables this will give you an idea on what kind of relationship between charges table and other tables. Offcourse use cpt4_code_id for cpt4_code_mstr table but by name it looks like that this table is master table so it will have a signle vale for each cpt4-code_id value in charges table.
I hope it will help