List and Count items with a JOIN with SQL - sql

I'm trying to create a basic report from these 2 tables:
Table Products
|--------|----------------|----------|
| PRO_Id | PRO_CategoryId | PRO_Name |
|--------|----------------|----------|
| 1 | 98 | Banana |
| 2 | 98 | Apple |
|--------|----------------|----------|
Table Categories
|--------|----------|
| CAT_Id | CAT_Name |
|--------|----------|
| 98 | Fruits |
| 99 | Other |
|--------|----------|
What I need is this output:
|------------|
| Categories |
|------------|
| Fruits (2) |
|------------|
I would like a report listing all the categories from Categories, but only those that have at least one linked product in Products (which is the case for Fruits but not for Other).
This is what I have so far:
SELECT CAT_Name, COUNT(PRO_Name IN sum)
FROM Categories
JOIN Products
ON Products.PRO_CategoryId = Categories.CAT_Id as sum
ORDER BY CAT_Name ASC
Can anyone help me with this, please?
Thanks.

You are pretty close. You need to get rid of the garbage in the query and use a group by:
SELECT c.cat_name, COUNT(*)
FROM Categories c JOIN
Products p
ON p.PRO_CategoryId = c.CAT_Id
GROUP BY c.CAT_Name ;
Notes:
SELECT * is not appropriate for an aggregation query. Explicitly listing what you want to select is.
This puts the count in a separate column which seems to be your intention, despite the sample results.
COUNT(pro_name in sum) doesn't make sense.
as sum doesn't make sense.
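If the single-column output from the question ("Fruits (2)") is what you are after, the count can be concatenated onto the name. Here is a minimal sketch, assuming Python's standard sqlite3 module and an in-memory database purely so the example is runnable. The || concatenation operator works in SQLite; other engines may need CONCAT() or + instead.

```python
import sqlite3

# In-memory SQLite database, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Categories (CAT_Id INTEGER PRIMARY KEY, CAT_Name TEXT);
    CREATE TABLE Products (PRO_Id INTEGER PRIMARY KEY, PRO_CategoryId INTEGER, PRO_Name TEXT);
    INSERT INTO Categories VALUES (98, 'Fruits'), (99, 'Other');
    INSERT INTO Products VALUES (1, 98, 'Banana'), (2, 98, 'Apple');
""")

# The inner join drops categories without products ("Other"),
# and GROUP BY yields one row per remaining category.
query = """
    SELECT c.CAT_Name || ' (' || COUNT(*) || ')' AS Categories
    FROM Categories c
    JOIN Products p ON p.PRO_CategoryId = c.CAT_Id
    GROUP BY c.CAT_Id, c.CAT_Name
    ORDER BY c.CAT_Name
"""
report = [row[0] for row in conn.execute(query)]
print(report)
```

Grouping by CAT_Id as well as CAT_Name keeps two categories that happen to share a name from being merged.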

Related

SQL Join to the latest record in MS ACCESS

I want to join tables in MS Access in such a way that it fetches only the latest record from one of the tables. I've looked at the other solutions available on the site, but discovered that they only work for other versions of SQL. Here is a simplified version of my data:
PatientInfo Table:
+-----+------+
| ID | Name |
+-----+------+
| 1 | John |
| 2 | Tom |
| 3 | Anna |
+-----+------+
Appointments Table
+----+-----------+
| ID | Date |
+----+-----------+
| 1 | 5/5/2001 |
| 1 | 10/5/2012 |
| 1 | 4/20/2018 |
| 2 | 4/5/1999 |
| 2 | 8/8/2010 |
| 2 | 4/9/1982 |
| 3 | 7/3/1997 |
| 3 | 6/4/2015 |
| 3 | 3/4/2017 |
+----+-----------+
And here is a simplified version of the results that I need after the join:
+----+------+------------+
| ID | Name | Date |
+----+------+------------+
| 1 | John | 4/20/2018 |
| 2 | Tom | 8/8/2010 |
| 3 | Anna | 3/4/2017 |
+----+------+------------+
Thanks in advance for reading and for your help.
You can use aggregation and JOIN:
select pi.id, pi.name, max(a.date)
from appointments as a inner join
patientinfo as pi
on a.id = pi.id
group by pi.id, pi.name;
Something like this:
select P.ID, P.name, max(A.Date) as Dt
from PatientInfo P inner join Appointments A
on P.ID=A.ID
group by P.ID, P.name
Both Bing and Gordon's answers work if your summary only needs one field (the Max(Date)), but it gets trickier if you also want to report other fields from the joined table, since you would need to either aggregate them or group by them as well.
E.g., if you want your summary to also include the assessment they were given at their last appointment, GROUP BY is not the way to go.
A more versatile structure may be something like
SELECT Patient.ID, Patient.Name, Appointment.Date, Appointment.Assessment
FROM Patient INNER JOIN Appointment ON Patient.ID=Appointment.ID
WHERE Appointment.Date = (SELECT Max(Appointment.Date) FROM Appointment WHERE Appointment.ID = Patient.ID)
;
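The correlated-subquery structure above can be tried out with a small sketch. It assumes Python's standard sqlite3 module rather than Access, uses the table names from the question, stores dates as ISO strings so that MAX() orders them correctly, and adds a hypothetical Assessment column with placeholder values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PatientInfo (ID INTEGER PRIMARY KEY, Name TEXT);
    -- Assessment is a hypothetical extra column; dates are ISO text so they sort chronologically.
    CREATE TABLE Appointments (ID INTEGER, Date TEXT, Assessment TEXT);
    INSERT INTO PatientInfo VALUES (1, 'John'), (2, 'Tom'), (3, 'Anna');
    INSERT INTO Appointments VALUES
        (1, '2001-05-05', 'a'), (1, '2012-10-05', 'b'), (1, '2018-04-20', 'c'),
        (2, '1999-04-05', 'd'), (2, '2010-08-08', 'e'), (2, '1982-04-09', 'f'),
        (3, '1997-07-03', 'g'), (3, '2015-06-04', 'h'), (3, '2017-03-04', 'i');
""")

# Keep only the appointment whose date equals the patient's latest date,
# which lets other columns of that same row (Assessment) come along.
query = """
    SELECT p.ID, p.Name, a.Date, a.Assessment
    FROM PatientInfo p
    JOIN Appointments a ON a.ID = p.ID
    WHERE a.Date = (SELECT MAX(a2.Date) FROM Appointments a2 WHERE a2.ID = p.ID)
    ORDER BY p.ID
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```

If a patient has two appointments with the same latest date, this form returns both rows.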
As an aside, you may want to think about whether you should use a field named 'ID' to refer to the ID of another table (in this case, the Appointment.ID field refers to the Patient.ID). Your db may be more readable if you leave the 'ID' field as an identifier specific to that table and refer to that field in other tables as OtherTableID or similar, i.e. PatientID in this case. Or go all the way and include the name of the actual table in its own ID field.
Edited after comment:
Not quite sure why it would crash. I just ran an equivalent query on 2 tables I have which are about 10,000 records each and it was pretty much instantaneous. Are your ID fields (i) unique numbers and (ii) indexed?
Another structure which should do the same thing (adapted for your field names and assuming that there is an ID field in Appointments which is unique) would be something like:
SELECT PatientInfo.UID, PatientInfo.Name, Appointments.StartDateTime, Appointments.Assessment
FROM PatientInfo INNER JOIN Appointments ON PatientInfo.UID = Appointments.PatientFID
WHERE Appointments.ID = (SELECT TOP 1 ID FROM Appointments WHERE Appointments.PatientFID = PatientInfo.UID ORDER BY StartDateTime DESC)
;
But that is starting to look a bit contrived. On my data they both produce the same result (as they should!) and are both almost instantaneous.
Always difficult to troubleshoot Access when it crashes - I guess you see no error codes or similar? Is this against a native .accdb database or another server?

Make a 1 to 1 multi-field SQL join where only some of the values match

I am trying to build a table that will be used as a conversion chart. I aim to make a simple join with this conversion table on multiple fields (8 in my case), and get a result. I will try to simplify the examples as much as I can because the original chart is a 40x10 matrix.
Let's say that I have these two (I know they don't make much sense and have bad design but they are just examples):
supply_conversion_chart
---
supply (integer)
customer_id (integer)
product_id (integer)
size (varchar)
purchase_type (varchar)
purchases
---
customer_id (integer)
product_id (integer)
size (varchar)
purchase_type (varchar)
and conversion chart would look something like this:
| supply | customer_id | product_id | size | purchase_type |
|--------|--------------|------------|----------|---------------|
| 100 | 1 | anything | anything | online |
| 101 | 1 | anything | anything | offline |
| 102 | other than 1 | anything | anything | online |
| 103 | 1 | 5 | XXL | online |
The main goal is to get an exact supply value with a simple join, something like:
SELECT supply
FROM purchases p
JOIN supply_conversion_chart scc ON
p.customer_id = scc.customer_id AND
p.product_id = scc.product_id AND
p.size = scc.size AND
p.purchase_type = scc.purchase_type;
Let's say that these are the records on purchases table:
| customer_id | product_id | size | purchase_type |
|-------------|------------|------|---------------|
| 1 | 3 | M | online |
| 1 | 5 | S | offline |
| 12345 | 4 | XL | online |
| 1 | 5 | XXL | online |
| 4353 | null | M | online |
I would expect the first record's supply value to be 100, the second's 101, the third's 102, the fourth's 103, and the fifth's 102. However, as far as I know, SQL won't be able to do a proper join on any of these records except the fourth one, which fully matches supply 103 in the supply_conversion_chart table. I don't know whether it is even possible to join on multiple fields when some of those fields do not fully match.
My approach is probably faulty and there are better ways to get the results I am trying to achieve but I don't even know where to start. What should I do?
Note that the original chart is much bigger than the provided example, and that I will be doing a join on 8 different fields.
One approach is a lateral join:
select p.*, scc.*
from purchases p left join lateral
(select scc.*
from supply_conversion_chart scc
where (scc.customer_id = p.customer_id or scc.customer_id is null) and
(scc.product_id = p.product_id or scc.product_id is null) and
(scc.size = p.size or scc.size is null) and
(scc.purchase_type = p.purchase_type or scc.purchase_type is null)
-- prefer the row with the most exact matches; coalesce() makes a NULL (wildcard) comparison count as 0
order by ( coalesce((scc.customer_id = p.customer_id)::int, 0) +
coalesce((scc.product_id = p.product_id)::int, 0) +
coalesce((scc.size = p.size)::int, 0) +
coalesce((scc.purchase_type = p.purchase_type)::int, 0)
) desc
limit 1
) scc on true;
Note: This represents "everything" as NULL. It doesn't have special logic for "customer other than 1". However, it does show you how to implement basically what you are trying to do.
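To see the "NULL means anything, most specific row wins" idea run end to end, here is a minimal sketch assuming Python's standard sqlite3 module. SQLite has no LATERAL, so this swaps in a correlated scalar subquery with ORDER BY ... LIMIT 1, which is a different construct with the same effect for a single returned column. The "other than 1" row is stored as a plain wildcard, which is an assumption of this sketch and not real "not equal" logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supply_conversion_chart (
        supply INTEGER, customer_id INTEGER, product_id INTEGER,
        size TEXT, purchase_type TEXT);
    CREATE TABLE purchases (
        customer_id INTEGER, product_id INTEGER, size TEXT, purchase_type TEXT);
""")
# None (NULL) stands for "anything". Row 102 is a plain wildcard here
# because "other than 1" has no direct NULL encoding (assumption).
conn.executemany("INSERT INTO supply_conversion_chart VALUES (?, ?, ?, ?, ?)", [
    (100, 1, None, None, "online"),
    (101, 1, None, None, "offline"),
    (102, None, None, None, "online"),
    (103, 1, 5, "XXL", "online"),
])
conn.executemany("INSERT INTO purchases VALUES (?, ?, ?, ?)", [
    (1, 3, "M", "online"),
    (1, 5, "S", "offline"),
    (12345, 4, "XL", "online"),
    (1, 5, "XXL", "online"),
    (4353, None, "M", "online"),
])

# For each purchase, pick the matching chart row with the most non-wildcard columns.
query = """
    SELECT p.customer_id,
           (SELECT scc.supply
            FROM supply_conversion_chart scc
            WHERE (scc.customer_id IS NULL OR scc.customer_id = p.customer_id)
              AND (scc.product_id IS NULL OR scc.product_id = p.product_id)
              AND (scc.size IS NULL OR scc.size = p.size)
              AND (scc.purchase_type IS NULL OR scc.purchase_type = p.purchase_type)
            ORDER BY (scc.customer_id IS NOT NULL) + (scc.product_id IS NOT NULL)
                   + (scc.size IS NOT NULL) + (scc.purchase_type IS NOT NULL) DESC
            LIMIT 1) AS supply
    FROM purchases p
    ORDER BY p.rowid
"""
matches = conn.execute(query).fetchall()
for row in matches:
    print(row)
```

Because every row that survives the WHERE clause already matches on its non-NULL columns, counting the non-NULL columns is enough to rank specificity.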

How do I structure my SQL query to prevent the return of duplicate rows with related data?

I need some help with an SQL query. I have a database table that has related data in other tables. When I query the table, it returns a duplicate row for every row of related data, i.e.:
|-------------| |-------------| |-------------|
| Cars | | Options | | Value |
|-------------| ------> |-------------| ------> |-------------|
| CarId | | OptionsId | | ValueId |
| CarMake | | OptionName | | CostValue |
| CarModel | | Confirmed | | CarId |
|-------------| | CarId | | OptionsId |
|-------------| |-------------|
|
|
---------------> |-------------|
| Warranty |
|-------------|
| WarrantyId |
| WarrantyType|
| CarId |
|-------------|
The query I have was designed in the query builder of SSMS (which is why it does not use aliases and uses multi-part names; this will be changed). It is as follows:
SELECT dbo.Cars.CarId,
dbo.Cars.Make,
dbo.Cars.Model,
dbo.Options.OptionName,
dbo.Warranty.WarrantyType,
dbo.Value.CostValue
FROM dbo.Cars
LEFT JOIN dbo.Options ON dbo.Cars.CarId = dbo.Options.CarId
LEFT JOIN Value ON Options.OptionsId = Value.OptionsId
LEFT JOIN dbo.Warranty on dbo.Cars.CarId = dbo.Warranty.CarId
Executing this query as it stands returns my data, however, for cars with multiple options I receive duplicate rows i.e.
Id | Make | Model | Option Name | Warranty Type | Value
27 | Ford | Fiesta | Heated Seats | Static | 500
27 | Ford | Fiesta | Front Fog Lights | Static | 400
I've been looking around for possible answers to this question and found that the proposed solution is to use the keyword DISTINCT or to create a subquery. I added DISTINCT to my query but the same data was returned, probably because the two option rows are distinct in their own right, though I'm only guessing.
I'm happy to use a subquery but not sure how to apply that to my above query code. All I want to do here is return one single row for each car with the highest option value i.e.
27 | Ford | Fiesta | Heated Seats | Static | 500
Can anyone help me write this query? I think I've included everything in this question but if I can offer more, please let me know.
Instead of joining the table Value which gives you multiple rows,
you must join this query:
SELECT
dbo.Value.CarId,
dbo.Value.OptionsId,
MAX(dbo.Value.CostValue) AS CostValue
FROM dbo.Value
GROUP BY dbo.Value.CarId, dbo.Value.OptionsId
which gives you, from the table Value, the maximum CostValue for each car and option.
So try this:
SELECT dbo.Cars.CarId,
dbo.Cars.Make,
dbo.Cars.Model,
dbo.Options.OptionName,
v.CostValue,
dbo.Warranty.WarrantyType
FROM dbo.Cars
LEFT JOIN dbo.Options ON dbo.Cars.CarId = dbo.Options.CarId
INNER JOIN (
SELECT
dbo.Value.CarId,
dbo.Value.OptionsId,
MAX(dbo.Value.CostValue) AS CostValue
FROM dbo.Value
GROUP BY dbo.Value.CarId, dbo.Value.OptionsId
) AS v ON Options.OptionsId = v.OptionsId
LEFT JOIN dbo.Warranty on dbo.Cars.CarId = dbo.Warranty.CarId
You can also try the following, using a window function:
with cte as(
SELECT dbo.Cars.CarId,
dbo.Cars.Make,
dbo.Cars.Model,
dbo.Options.OptionName,
Value.CostValue,
row_number() over(partition by dbo.Cars.CarId,
dbo.Cars.Make,
dbo.Cars.Model order by Value.CostValue desc) rn
FROM dbo.Cars
LEFT JOIN dbo.Options ON dbo.Cars.CarId = dbo.Options.CarId
LEFT JOIN Value ON Options.OptionsId = Value.OptionsId
LEFT JOIN dbo.Warranty on dbo.Cars.CarId = dbo.Warranty.CarId
) select * from cte where rn=1
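The window-function approach can be checked against the sample row from the question with a small sketch. It assumes Python's standard sqlite3 module with an SQLite build of 3.25 or later (needed for window functions), uses the column names from the diagram (CarMake, CarModel), and adds a second car with made-up data for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Cars (CarId INTEGER PRIMARY KEY, CarMake TEXT, CarModel TEXT);
    CREATE TABLE Options (OptionsId INTEGER PRIMARY KEY, OptionName TEXT, Confirmed INTEGER, CarId INTEGER);
    CREATE TABLE "Value" (ValueId INTEGER PRIMARY KEY, CostValue INTEGER, CarId INTEGER, OptionsId INTEGER);
    CREATE TABLE Warranty (WarrantyId INTEGER PRIMARY KEY, WarrantyType TEXT, CarId INTEGER);
    -- Car 28 and its option are invented sample data.
    INSERT INTO Cars VALUES (27, 'Ford', 'Fiesta'), (28, 'Ford', 'Focus');
    INSERT INTO Options VALUES (1, 'Heated Seats', 1, 27), (2, 'Front Fog Lights', 1, 27), (3, 'Sunroof', 1, 28);
    INSERT INTO "Value" VALUES (1, 500, 27, 1), (2, 400, 27, 2), (3, 300, 28, 3);
    INSERT INTO Warranty VALUES (1, 'Static', 27), (2, 'Static', 28);
""")

# Number each car's rows from most to least expensive option, then keep row 1.
query = """
    WITH ranked AS (
        SELECT c.CarId, c.CarMake, c.CarModel, o.OptionName, w.WarrantyType, v.CostValue,
               ROW_NUMBER() OVER (PARTITION BY c.CarId ORDER BY v.CostValue DESC) AS rn
        FROM Cars c
        LEFT JOIN Options o ON o.CarId = c.CarId
        LEFT JOIN "Value" v ON v.OptionsId = o.OptionsId
        LEFT JOIN Warranty w ON w.CarId = c.CarId
    )
    SELECT CarId, CarMake, CarModel, OptionName, WarrantyType, CostValue
    FROM ranked
    WHERE rn = 1
    ORDER BY CarId
"""
top_options = conn.execute(query).fetchall()
for row in top_options:
    print(row)
```

Partitioning by CarId alone is enough here, since make and model are determined by the car.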

SQL: Bug in Joining two tables

I have an item table from which I want to get the sum of item quantity.
Query:
Select item_id, Sum(qty) from item_tbl group by item_id
Result:
==================
| ID | Quantity |
===================
| 1 | 10 |
| 2 | 20 |
| 3 | 5 |
| 4 | 20 |
The second table is the invoice table, from which I get the quantity of each item sold. I am joining these two tables as follows:
Query:
Select item_tbl.item_id, Sum(item_tbl.qty) as [item_qty],
-isnull(Sum(invoice.qty),0) as [invoice_qty]
from item_tbl
left join invoice on item_tbl.item_id = invoice.item_id group by item_tbl.item_id
Result:
=================================
| ID | item_qty | invoice_qty |
=================================
| 1 | 10 | -5 |
| 2 | 20 | -20 |
| 3 | 10 | -25 | <------ item_qty raised from 5 to 10 ??
| 4 | 20 | -20 |
I don't know if I am joining these tables the right way. I want to get everything from the item table and whatever is available from the invoice table in order to maintain the inventory, so I used a left join. Help please.
Modification
When I added group by item_id, qty I got this:
=================================
| ID | item_qty | invoice_qty |
=================================
| 1 | 10 | -5 |
| 2 | 20 | -20 |
| 3 | 5 | -5 |
| 3 | 5 | -20 |
| 4 | 20 | -20 |
As it's a view, the ID is repeated. What should I do to avoid this?
Clearing things up, my answer from the comments explained:
With a left join (A left join B), a row is produced for every B record that matches an A record. In addition, a row is produced for any A record that has no matching B record, with null values filling in the fields from B.
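This row multiplication is what doubles item_qty for item 3 in the question. A minimal sketch, assuming Python's standard sqlite3 module and invoice rows inferred from the question's results (two invoices of 5 and 20 for item 3):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_tbl (item_id INTEGER, qty INTEGER);
    CREATE TABLE invoice (item_id INTEGER, qty INTEGER);
    INSERT INTO item_tbl VALUES (1, 10), (2, 20), (3, 5), (4, 20);
    -- Invoice rows are an assumption reconstructed from the question's output.
    INSERT INTO invoice VALUES (1, 5), (2, 20), (3, 5), (3, 20), (4, 20);
""")

# Item 3 has two matching invoices, so its single item row appears twice.
joined = conn.execute("""
    SELECT i.item_id, i.qty, inv.qty
    FROM item_tbl i
    LEFT JOIN invoice inv ON inv.item_id = i.item_id
    WHERE i.item_id = 3
    ORDER BY inv.qty
""").fetchall()
print(joined)

# Summing over the joined rows therefore counts the item quantity twice.
totals = conn.execute("""
    SELECT SUM(i.qty), SUM(inv.qty)
    FROM item_tbl i
    LEFT JOIN invoice inv ON inv.item_id = i.item_id
    WHERE i.item_id = 3
""").fetchone()
print(totals)
```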
I would advise reading up on Using Joins in SQL when approaching such problems.
Below are 2 possible solutions, using different assumptions.
Solution A
Without any assumptions regarding primary key:
We have to sum up the item quantity column to determine the total quantity, so two sums need to be performed. I would advise using a subquery for readability and simplicity.
select item_tbl.item_id, Sum(item_tbl.qty) as [item_qty], -isnull(Sum(invoice_grouped.qty),0) as [invoice_qty]
from item_tbl left join
(select invoice.item_id as item_id, Sum(invoice.qty) as qty from invoice group by item_id) invoice_grouped
on (invoice_grouped.item_id = item_tbl.item_id)
group by item_tbl.item_id
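The pre-aggregation idea can be sketched as follows, assuming Python's standard sqlite3 module, COALESCE in place of the SQL Server isnull, and sample data inferred from the question plus a hypothetical item 5 that has no invoices:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_tbl (item_id INTEGER, qty INTEGER);
    CREATE TABLE invoice (item_id INTEGER, qty INTEGER);
    -- Item 5 is a hypothetical item without invoices.
    INSERT INTO item_tbl VALUES (1, 10), (2, 20), (3, 5), (4, 20), (5, 7);
    INSERT INTO invoice VALUES (1, 5), (2, 20), (3, 5), (3, 20), (4, 20);
""")

# Collapse invoice to one row per item first, so the join cannot
# repeat an item row once per invoice.
query = """
    SELECT i.item_id,
           SUM(i.qty) AS item_qty,
           -COALESCE(SUM(g.qty), 0) AS invoice_qty
    FROM item_tbl i
    LEFT JOIN (SELECT item_id, SUM(qty) AS qty
               FROM invoice
               GROUP BY item_id) g ON g.item_id = i.item_id
    GROUP BY i.item_id
    ORDER BY i.item_id
"""
inventory = conn.execute(query).fetchall()
for row in inventory:
    print(row)
```

In this sample item_tbl holds one row per item_id. If it held several, the pre-aggregated invoice total would be repeated for each of them, so the item side would need the same pre-aggregation.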
Solution B
Assuming item_id is primary key for item_tbl:
Now we can rely on the fact that there is only one quantity for each item_id, so we can do without the subquery by selecting any (max) of the item quantities in the join result. This should result in a quicker execution plan.
select item_tbl.item_id, Max(item_tbl.qty) as [item_qty], -isnull(Sum(invoice.qty),0) as [invoice_qty]
from item_tbl left join invoice on (invoice.item_id = item_tbl.item_id)
group by item_tbl.item_id
If your database design follows the common rules, item_tbl.item_id must be unique.
So just change your query:
Select item_tbl.item_id, item_tbl.qty as [item_qty],
-isnull(Sum(invoice.qty),0) as [invoice_qty]
from item_tbl
left join invoice on item_tbl.item_id = invoice.item_id group by item_tbl.item_id, item_tbl.qty

MySQL Advanced SELECT help

Alright, well, I recently got into normalizing my database for this little side project that I have been working on for a while now, but I've just hit a brick wall. I'll try to give an understandable example of what I have and what I need to accomplish, and hopefully it won't be too painful. OK.
I have 3 tables. The first one, which we will call Shows, is structured something like this:
+----+--------------------------+
| id | title |
+----+--------------------------+
| 1 | Example #1 |
| 2 | Example #2 |
| 3 | Example #3 |
+----+--------------------------+
Plain and simple.
My next table is called Categories, and looks like this:
+----+--------------------------+
| id | category |
+----+--------------------------+
| 1 | Comedy |
| 2 | Drama |
| 3 | Action |
+----+--------------------------+
And a final table called Show_categories:
+---------+---------+
| show_id | cat_id |
+---------+---------+
| 1 | 1 |
| 1 | 3 |
| 2 | 2 |
| 2 | 3 |
| 3 | 1 |
| 3 | 2 |
+---------+---------+
As you may have noticed, the problem is that in my database a single show can have multiple categories. Everything is structured fine, except for the fact that I can't find a way to search for shows with multiple categories.
If I were to search for action and comedy type shows I should be given Example #1, but that is not possible (at least with my queries), because the cat_id values inside Show_categories are in different rows.
Example of a working single category search (Selecting all comedy shows):
SELECT s.id,s.title
FROM Shows s JOIN Show_categories sc ON sc.anid=s.id
WHERE sc.cat_id=1 GROUP BY s.id
And a query that is impossible (because cat_id can't equal 2 different things):
SELECT s.id,s.title
FROM Shows s JOIN Show_categories sc ON sc.anid=s.id
WHERE sc.cat_id=1 AND sc.cat_id=2 GROUP BY s.id
So, to sum things up: how do I handle a query where I am looking for a show based on multiple matching categories?
Use:
SELECT s.id,
s.title
FROM SHOWS s
JOIN SHOW_CATEGORIES sc ON sc.anid = s.id
WHERE sc.cat_id IN (1, 2)
GROUP BY s.id, s.title
HAVING COUNT(DISTINCT sc.cat_id) = 2
The COUNT(DISTINCT sc.cat_id) comparison needs to equal the number of cat_id values listed in the IN clause. If the SHOW_CATEGORIES show_id and cat_id columns together form the primary key, or there is a unique constraint on the pair, you can use COUNT(sc.cat_id) instead.
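To keep the HAVING count in step with the IN list, both can be built from the same list of ids. A minimal sketch, assuming Python's standard sqlite3 module, the show_id column name from the table in the question (the queries above use sc.anid), and a hypothetical helper named shows_with_all:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Shows (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE Show_categories (show_id INTEGER, cat_id INTEGER);
    INSERT INTO Shows VALUES (1, 'Example #1'), (2, 'Example #2'), (3, 'Example #3');
    INSERT INTO Show_categories VALUES (1, 1), (1, 3), (2, 2), (2, 3), (3, 1), (3, 2);
""")

def shows_with_all(conn, cat_ids):
    """Return shows that belong to every category in cat_ids (hypothetical helper)."""
    wanted = sorted(set(cat_ids))
    placeholders = ", ".join("?" for _ in wanted)
    query = f"""
        SELECT s.id, s.title
        FROM Shows s
        JOIN Show_categories sc ON sc.show_id = s.id
        WHERE sc.cat_id IN ({placeholders})
        GROUP BY s.id, s.title
        HAVING COUNT(DISTINCT sc.cat_id) = ?
        ORDER BY s.id
    """
    # The last parameter is the number of distinct categories requested.
    return conn.execute(query, [*wanted, len(wanted)]).fetchall()

comedy_and_action = shows_with_all(conn, [1, 3])
comedy_and_drama = shows_with_all(conn, [1, 2])
print(comedy_and_action, comedy_and_drama)
```

Only the placeholders are interpolated into the SQL text. The ids themselves are passed as bound parameters.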
You need an OR statement.
SELECT s.id,s.title
FROM Shows s JOIN Show_categories sc ON sc.anid=s.id
WHERE sc.cat_id=1 OR sc.cat_id=2 GROUP BY s.id
That is, you want all shows with either cat_id 1 OR cat_id 2. So this query will return shows 1, 2 and 3.