SQL Joins issue - sql

I have 3 database tables.
The first contains Ingredients, the second contains Dishes, and the third connects Ingredients and Dishes.
Adding data to those tables was easy, but I ran into a problem when trying to select specific content.
Returning all ingredients for a specific dish works:
SELECT *
FROM Ingredient As I
JOIN DishIngredients as DI
ON I.ID = DI.IngredientID
WHERE DI.DishID = 1;
But if I try to query for the dish Name and Description, no matter what kind of join I use, I always get as many rows as there are ingredients. If I have 4 ingredients in my dish, the select returns Name and Description 4 times. How can I modify my select to return those values just once?
Here is the result of my query (same as hawk's) if I try to select Name and Description. I am using MS SQL.
ID Name                 Description                                                       DishID IngredientID
-- -------------------- ----------------------------------------------------------------- ------ ------------
1  Spaghetti Carbonara  This delcitious pasta is made with fresh Panceta and Single Cream 1      1
1  Spaghetti Carbonara  This delcitious pasta is made with fresh Panceta and Single Cream 1      2
Kuzgun's query worked fine for me. However, from your suggestions I see that I don't really need a join between DishIngredient and Dish.
When I need Name and Description I can simply use
SELECT * FROM Dish WHERE ID=1;
When I need the list of Ingredients I can use my query above.

If you need to display both dish details and ingredient details, you need to join all 3 tables:
SELECT *
FROM Ingredient As I
JOIN DishIngredients as DI
ON I.ID = DI.IngredientID
JOIN Dish AS D
ON D.ID=DI.DishID
WHERE DI.DishID = 1;

If you don't care about the ingredients, you don't have to use the table DishIngredient. Just use the table Dish: select * from dish d where d.id=1.
If you want to know what the ingredients are, querying only the IDs from the table Ingredient is not useful. Because of the design of your database, a little redundancy is a must:
select * from dish d join dishingredient di on d.id=di.dishid join ingredient i on
i.id=di.ingredientid where d.id=1
Of course, you will get as many rows as there are ingredients, each repeating the dish's name and description.
If you want the full information with the least redundancy, you can do it in two steps:
select * from dish d where d.id=1;
select * from ingredient i join DishIngredient di on i.id=di.ingredientid where di.dishid=1
In Java, you can write a class to represent a dish and a list to represent the ingredients it uses.
public class Dish {
BigDecimal id;
String name;
String description;
List<Ingredient> ingredient;
}
class Ingredient{
BigDecimal id;
String name;
.....
}
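For what it's worth, the two-step approach can be tried end to end. Below is a minimal sketch in Python using the standard sqlite3 module with an in-memory database instead of MS SQL. The table and column names follow the question; the ingredient names, the shortened description, and the load_dish helper are invented for the example.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Ingredient:
    id: int
    name: str


@dataclass
class Dish:
    id: int
    name: str
    description: str
    ingredients: list = field(default_factory=list)


def load_dish(conn, dish_id):
    """Hypothetical helper: one query for the dish, one for its ingredients."""
    row = conn.execute(
        "SELECT ID, Name, Description FROM Dish WHERE ID = ?", (dish_id,)
    ).fetchone()
    if row is None:
        return None
    dish = Dish(*row)
    rows = conn.execute(
        """
        SELECT I.ID, I.Name
        FROM Ingredient AS I
        JOIN DishIngredient AS DI ON I.ID = DI.IngredientID
        WHERE DI.DishID = ?
        ORDER BY I.ID
        """,
        (dish_id,),
    )
    dish.ingredients = [Ingredient(*r) for r in rows]
    return dish


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Dish (ID INTEGER PRIMARY KEY, Name TEXT, Description TEXT);
    CREATE TABLE Ingredient (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE DishIngredient (DishID INTEGER, IngredientID INTEGER);
    INSERT INTO Dish VALUES (1, 'Spaghetti Carbonara', 'Pasta with pancetta and cream');
    -- Ingredient names are invented sample data.
    INSERT INTO Ingredient VALUES (1, 'Pancetta'), (2, 'Single cream');
    INSERT INTO DishIngredient VALUES (1, 1), (1, 2);
    """
)

dish = load_dish(conn, 1)
print(dish.name, [i.name for i in dish.ingredients])
```

The dish row is read exactly once, and the ingredient list hangs off it, so nothing is repeated in either result set.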

Related

PostgreSQL join duplicates rows

I am using PostgreSQL and I am new to it. I am attempting to join a table to two other tables but the results are being duplicated. I have the following tables.
MEAL
id   name                 ingredients    flavors
---  -------------------  -------------  -------
abc  Creamy Chicken Soup  {def,ghi,jkl}  {mno}
INGREDIENT
id   name
---  -------
def  chicken
ghi  corn
jkl  pepper
FLAVOR
id   name
---  -----
mno  spicy
And here is my query
SELECT
meal.id,
meal.name,
JSON_AGG(i) as ing,
JSON_AGG(f) as flav
FROM meal LEFT JOIN
(SELECT
ingredient.id,
ingredient.name
FROM ingredient) i
ON (i.id = ANY(meal.ingredients)) LEFT JOIN
(SELECT
flavor.id,
flavor.name
FROM flavor) f
ON (f.id = ANY(meal.flavors))
GROUP BY
meal.id,
meal.name
And the results are:
id:   abc
name: Creamy Chicken Soup
ing:  [{id: "def",name:"chicken"},{id:"ghi",name:"corn"},{id:"jkl",name:"pepper"}]
flav: [{id:"mno",name:"spicy"},{id:"mno",name:"spicy"},{id:"mno",name:"spicy"}]
As you can see, the flavors are duplicated as many times as there are ingredients. How can I write this query without the duplicates? Unfortunately I do not have any control over the table structure, as it is pulled in from a third party. I can manipulate the data in code, but I would prefer to query it and get back the correct data set.
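One way to avoid the multiplication is to aggregate each list in its own correlated subquery instead of joining both lists and aggregating afterwards. In PostgreSQL that would presumably take the shape (SELECT JSON_AGG(i) FROM ingredient i WHERE i.id = ANY(meal.ingredients)), once per output column. The sketch below only illustrates the idea with Python's sqlite3 module. SQLite has no array type, so the arrays are stored as comma-separated text, membership is tested with instr, and group_concat stands in for JSON_AGG. Those substitutions are assumptions of the example, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Arrays are emulated as comma-separated text (SQLite has no array type).
    CREATE TABLE meal (id TEXT, name TEXT, ingredients TEXT, flavors TEXT);
    CREATE TABLE ingredient (id TEXT, name TEXT);
    CREATE TABLE flavor (id TEXT, name TEXT);
    INSERT INTO meal VALUES ('abc', 'Creamy Chicken Soup', 'def,ghi,jkl', 'mno');
    INSERT INTO ingredient VALUES ('def', 'chicken'), ('ghi', 'corn'), ('jkl', 'pepper');
    INSERT INTO flavor VALUES ('mno', 'spicy');
    """
)

# Joining both lists first: every flavor row repeats once per ingredient.
joined = conn.execute(
    """
    SELECT m.id, group_concat(i.name), group_concat(f.name)
    FROM meal m
    LEFT JOIN ingredient i ON instr(',' || m.ingredients || ',', ',' || i.id || ',') > 0
    LEFT JOIN flavor f ON instr(',' || m.flavors || ',', ',' || f.id || ',') > 0
    GROUP BY m.id
    """
).fetchone()

# Aggregating each list in its own subquery: no cross-multiplication.
separate = conn.execute(
    """
    SELECT m.id,
           (SELECT group_concat(i.name) FROM ingredient i
             WHERE instr(',' || m.ingredients || ',', ',' || i.id || ',') > 0),
           (SELECT group_concat(f.name) FROM flavor f
             WHERE instr(',' || m.flavors || ',', ',' || f.id || ',') > 0)
    FROM meal m
    """
).fetchone()

print(joined[2].split(","))    # flavor repeated once per ingredient
print(separate[2].split(","))  # flavor listed once
```

Because each subquery only ever sees one child table, the number of ingredients can no longer influence how often a flavor appears.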

Select multiple results from sub query into single row (as array datatype)

I'm trying to solve a small problem with a SQL query in an Oracle database. Let's assume I have these tables:
One table that holds information about cars:
tblCars
ID  Model  Color
--------------------
1   Volvo  Red
2   BMW    Blue
3   BMW    Green
And another one containing information about drivers:
tblDrivers
ID  fID_tblCars  Name
---------------------------
1   1            George
2   1            Mike
3   2            Jason
4   2            Paul
5   2            William
6   3            Steve
Now, let's pretend that to find out the popularity of the cars, I want to create reports that contain the data about the cars and the people that are driving them (which seems a very reasonable thing one would accomplish with a database).
This "ReportObject" would have a string for the model, a string for the color and an array (or a list) of strings for the drivers.
Currently, I do this with two queries. In the first I select the cars
SELECT ID, Model, Color FROM tblCars
and create a report object for each result.
Then, I would take each result and get the drivers for each specific car
SELECT Name FROM tblDrivers WHERE fID_tblCars = ResultObject.ID
Basically, step one gives me a resulting data set that looks like this:
Result
------------------------------------------
ColumnID      ColumnModel  ColumnColor
Type Integer  Type String  Type String
and now, if I have more cars in the future, I will have to make a lot of additional queries, one for each row in the resulting table.
When I try this:
SELECT Model, Color, (SELECT Name FROM tblDrivers WHERE tblDrivers.fID_tblCars = tblCars.ID) as Name FROM tblCars
I get some error message telling me that one result in the row contains multiple elements (which is what I want!).
I want the result to look like this:
Result
--------------------------------------------------------
ColumnID      ColumnModel  ColumnColor  ColumnName
Type Integer  Type String  Type String  Type Array
So when I build my report object, I could do something like this:
foreach (var Row in Results)
{
    ReportObject.Model = Row.Model;
    ReportObject.Color = Row.Color;
    foreach (string Driver in Row.Name)
    {
        ReportObject.Drivers.Add(Driver);
    }
}
Am I completely missing my basics here or do I have to split this up in multiple queries?
Thanks!
This works in Oracle. In the SQL Fiddle example I couldn't get the IDENTITY or the PRIMARY KEYS to work when creating the table (I have never used Oracle SQL before).
SELECT c.id,
c.model,
c.color,
LISTAGG(d.name, ',') WITHIN GROUP (ORDER BY d.name) AS "Drivers"
FROM tblCars c
JOIN tblDrivers d
ON c.id = d.fID_TblCars
GROUP BY c.id,
c.model,
c.color
ORDER BY c.Id
SQL Fiddle Example
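To get from the aggregated string back to the list the report object wants, the client only has to split it. Here is a rough sketch of that round trip in Python with sqlite3, where group_concat plays the role of Oracle's LISTAGG (an assumption made so the example runs without an Oracle server). The data is the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tblCars (ID INTEGER PRIMARY KEY, Model TEXT, Color TEXT);
    CREATE TABLE tblDrivers (ID INTEGER PRIMARY KEY, fID_tblCars INTEGER, Name TEXT);
    INSERT INTO tblCars VALUES (1, 'Volvo', 'Red'), (2, 'BMW', 'Blue'), (3, 'BMW', 'Green');
    INSERT INTO tblDrivers VALUES
        (1, 1, 'George'), (2, 1, 'Mike'), (3, 2, 'Jason'),
        (4, 2, 'Paul'), (5, 2, 'William'), (6, 3, 'Steve');
    """
)

rows = conn.execute(
    """
    SELECT c.ID, c.Model, c.Color, group_concat(d.Name, ',') AS Drivers
    FROM tblCars c
    JOIN tblDrivers d ON c.ID = d.fID_tblCars
    GROUP BY c.ID, c.Model, c.Color
    ORDER BY c.ID
    """
).fetchall()

# One row per car; split the aggregated string back into a list on the client.
# group_concat does not guarantee an order, so sort here.
reports = [
    {"model": model, "color": color, "drivers": sorted(names.split(","))}
    for _id, model, color, names in rows
]
for report in reports:
    print(report)
```

Note that splitting on a comma breaks as soon as a driver name contains one, so a separator that cannot occur in the data would be a safer choice.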

SSRS query and WHERE with multiple

I am new to SQL and SSRS and can already do many things, but I think I must be missing some basics, so I keep banging my head against the wall.
A report that is almost working needs to include more results, based on conditions.
My working query so far is like this:
SELECT projects.project_number, project_phases.project_phase_id, project_phases.project_phase_number, project_phases.project_phase_header, project_phase_expensegroups.projectphase_expense_total, invoicerows.invoicerow_total
FROM projects INNER JOIN
project_phases ON projects.project_id = project_phases.project_id
LEFT OUTER JOIN
project_phase_expensegroups ON project_phases.project_phase_id = project_phase_expensegroups.project_phase_id
LEFT OUTER JOIN
invoicerows ON project_phases.project_phase_id = invoicerows.project_phase_id
WHERE ( projects.project_number = #iProjectNumber )
AND
( project_phase_expensegroups.projectphase_expense_total >0 )
The parameter is for a selection list that is used to choose a project for the report.
How can I also include records where
( project_phase_expensegroups.projectphase_expense_total ) has the value 0 but there might be invoices for that project phase?
I already tried adding another condition like this:
WHERE ( projects.project_number = #iProjectNumber )
AND
( project_phase_expensegroups.projectphase_expense_total > 0 )
OR
( invoicerows.invoicerow_total > 0 )
but while it gives some results, including the one where projectphase_expense_total is 0, the report is a total mess.
So my question is: what am I doing wrong here?
There is a core problem with your query in that you are left joining to two tables, implying that rows may not exist, but then putting conditions on those tables, which will eliminate NULLs. That means your query is internally inconsistent as is.
The next problem is that you're joining two tables to project_phases that both may have multiple rows. Since these data are not related to each other (as proven by the fact that you have no join condition between project_phase_expensegroups and invoicerows), your query is not going to work correctly. For example, given a list of people, a list of those people's favorite foods, and a list of their favorite colors like so:
People
Person
------
Joe
Mary
FavoriteFoods
Person  Food
------  ---------
Joe     Broccoli
Joe     Bananas
Mary    Chocolate
Mary    Cake
FavoriteColors
Person  Color
------  ----------
Joe     Red
Joe     Blue
Mary    Periwinkle
Mary    Fuchsia
When you join these with links between Person <-> Food and Person <-> Color, you'll get a result like this:
Person  Food       Color
------  ---------  ----------
Joe     Broccoli   Red
Joe     Bananas    Red
Joe     Broccoli   Blue
Joe     Bananas    Blue
Mary    Chocolate  Periwinkle
Mary    Chocolate  Fuchsia
Mary    Cake       Periwinkle
Mary    Cake       Fuchsia
This is essentially a cross-join, also known as a Cartesian product, between the Foods and the Colors, because they have a many-to-one relationship with each person, but no relationship with each other.
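If you want to watch the multiplication happen, the people example is easy to reproduce. This is just a quick sketch using Python's sqlite3 module with an in-memory database; the tables and rows are the ones from the example, while the aliases p, f, and c are mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE People (Person TEXT);
    CREATE TABLE FavoriteFoods (Person TEXT, Food TEXT);
    CREATE TABLE FavoriteColors (Person TEXT, Color TEXT);
    INSERT INTO People VALUES ('Joe'), ('Mary');
    INSERT INTO FavoriteFoods VALUES
        ('Joe', 'Broccoli'), ('Joe', 'Bananas'), ('Mary', 'Chocolate'), ('Mary', 'Cake');
    INSERT INTO FavoriteColors VALUES
        ('Joe', 'Red'), ('Joe', 'Blue'), ('Mary', 'Periwinkle'), ('Mary', 'Fuchsia');
    """
)

rows = conn.execute(
    """
    SELECT p.Person, f.Food, c.Color
    FROM People p
    JOIN FavoriteFoods f ON f.Person = p.Person
    JOIN FavoriteColors c ON c.Person = p.Person
    """
).fetchall()

# 2 foods x 2 colors per person, so 4 rows each instead of 2.
per_person = {}
for person, _food, _color in rows:
    per_person[person] = per_person.get(person, 0) + 1
print(len(rows), per_person)
```

With three foods and five colors a person would get fifteen rows, which is why sums and counts over such a join come out inflated.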
There are a few ways to deal with this in the report.
Create ExpenseGroup and InvoiceRow subreports, that are called from the main report by a combination of project_id and project_phase_id parameters.
Summarize one or the other set of data into a single value. For example, you could sum the invoice rows. Or, you could concatenate the expense groups into a single string separated by commas.
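The summarizing option can be sketched as follows: aggregate each child table down to one row per project phase before joining, so there is nothing left to multiply. This is only an illustration in Python with sqlite3. The table and column names come from the question, but the sample rows and the aliases expense_total and invoice_total are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE project_phases (project_phase_id INTEGER PRIMARY KEY);
    CREATE TABLE project_phase_expensegroups (project_phase_id INTEGER, projectphase_expense_total REAL);
    CREATE TABLE invoicerows (project_phase_id INTEGER, invoicerow_total REAL);
    -- Invented sample data.
    INSERT INTO project_phases VALUES (1), (2);
    INSERT INTO project_phase_expensegroups VALUES (1, 100), (1, 50), (2, 0);
    INSERT INTO invoicerows VALUES (1, 10), (1, 20), (1, 30), (2, 40);
    """
)

naive = conn.execute(
    """
    SELECT pp.project_phase_id,
           SUM(eg.projectphase_expense_total),
           SUM(ir.invoicerow_total)
    FROM project_phases pp
    LEFT JOIN project_phase_expensegroups eg ON eg.project_phase_id = pp.project_phase_id
    LEFT JOIN invoicerows ir ON ir.project_phase_id = pp.project_phase_id
    GROUP BY pp.project_phase_id
    ORDER BY pp.project_phase_id
    """
).fetchall()

summarized = conn.execute(
    """
    SELECT pp.project_phase_id, eg.expense_total, ir.invoice_total
    FROM project_phases pp
    LEFT JOIN (SELECT project_phase_id, SUM(projectphase_expense_total) AS expense_total
               FROM project_phase_expensegroups GROUP BY project_phase_id) eg
           ON eg.project_phase_id = pp.project_phase_id
    LEFT JOIN (SELECT project_phase_id, SUM(invoicerow_total) AS invoice_total
               FROM invoicerows GROUP BY project_phase_id) ir
           ON ir.project_phase_id = pp.project_phase_id
    ORDER BY pp.project_phase_id
    """
).fetchall()

print(naive)       # phase 1 totals are inflated by the row multiplication
print(summarized)  # one row per phase with the true totals
```

Phase 2, with an expense total of 0 and one invoice row, still shows up in the summarized result, which is the case the question was asking about.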
Some notes:
Please, please format your query before posting it in a question. It is almost impossible to read when not formatted. It seems pretty clear that you're using a GUI to create the query, but do us the favor of not having to format it ourselves just to help you.
While formatting, please use aliases. Don't use full table names. It just makes the query that much harder to understand.
You need an extra pair of parentheses in your WHERE clause in order to get the logic right.
WHERE ( projects.project_number = #iProjectNumber )
AND (
(project_phase_expensegroups.projectphase_expense_total > 0)
OR
(invoicerows.invoicerow_total > 0)
)
Also, you're using a column in your WHERE clause from a table that is left joined without checking for NULLs. That basically makes it a (slow) inner join. If you want to include rows that don't match from that table you also need to check for NULL. Any comparison other than IS NULL evaluates to UNKNOWN for NULL values, so the WHERE clause filters those rows out. See this page for more information about SQL's three-valued predicate logic: http://www.firstsql.com/idefend3.htm
To keep your LEFT JOINs working as you intended you would need to do this:
WHERE ( projects.project_number = #iProjectNumber )
AND (
project_phase_expensegroups.projectphase_expense_total > 0
OR project_phase_expensegroups.project_phase_id IS NULL
OR invoicerows.invoicerow_total > 0
OR invoicerows.project_phase_id IS NULL
)
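A small experiment shows the effect of the NULL check. The sketch below uses Python's sqlite3 module and only one of the two left-joined tables to keep it short. The sample rows and the phases helper are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE project_phases (project_phase_id INTEGER PRIMARY KEY);
    CREATE TABLE project_phase_expensegroups (project_phase_id INTEGER, projectphase_expense_total REAL);
    -- Invented sample data: phase 3 has no expense group at all.
    INSERT INTO project_phases VALUES (1), (2), (3);
    INSERT INTO project_phase_expensegroups VALUES (1, 100), (2, 0);
    """
)

base = """
    SELECT pp.project_phase_id
    FROM project_phases pp
    LEFT JOIN project_phase_expensegroups eg ON eg.project_phase_id = pp.project_phase_id
    WHERE {condition}
    ORDER BY pp.project_phase_id
"""


def phases(condition):
    """Hypothetical helper: run the base query with the given WHERE condition."""
    return [r[0] for r in conn.execute(base.format(condition=condition))]


without_null_check = phases("eg.projectphase_expense_total > 0")
with_null_check = phases(
    "eg.projectphase_expense_total > 0 OR eg.project_phase_id IS NULL"
)
print(without_null_check, with_null_check)
```

Without the IS NULL branch, phase 3 disappears exactly as it would with an inner join. Phase 2 is excluded in both cases because its expense total really is 0, not missing.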
I found the solution and it was kind of easy after all. I changed only the second LEFT OUTER JOIN to INNER JOIN and dropped the condition that limited the query to results over zero. I also used SELECT DISTINCT.
Now my report is working perfectly.

mysql where IN on large dataset or Looping?

I have the following scenario:
Table 1:
articles
id  article_text   category  author_id
1   "hello world"  4         1
2   "hi"           5         2
3   "wasup"        4         3
Table 2
authors
id  name    friends_with
1   "Joe"   "Bob"
2   "Sue"   "Joe"
3   "Fred"  "Bob"
I want to know the total number of authors that are friends with "Bob" for a given category.
So for example, for category 4 how many authors are there that are friends with "Bob".
The authors table is quite large, in some cases I have a million authors that are friends with "Bob"
So I have tried:
Get the list of authors that are friends with Bob, then loop through them, get the count for each of them in the given category, and sum those counts in my code.
The issue with this approach is that it can generate a million queries. Even though they are very fast, it seems there should be a better way.
I was thinking of getting the list of authors that are friends with Bob and then building an IN clause from that list, but I fear that would exceed the amount of memory allowed for the query.
Seems like this is a common problem. Any ideas?
thanks
SELECT COUNT(DISTINCT auth.id)
FROM authors auth
INNER JOIN articles art ON auth.id = art.author_id
WHERE friends_with = 'bob' AND art.category = 4
COUNT(DISTINCT auth.id) is required because the join returns one row per matching article, so an author with several articles in the category would otherwise be counted more than once.
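To see the difference the DISTINCT makes, here is a quick sketch in Python with sqlite3 using the sample data from the question plus one invented extra article, so that one author has two articles in category 4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE articles (id INTEGER PRIMARY KEY, article_text TEXT, category INTEGER, author_id INTEGER);
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT, friends_with TEXT);
    INSERT INTO articles VALUES (1, 'hello world', 4, 1), (2, 'hi', 5, 2), (3, 'wasup', 4, 3);
    -- Extra, invented row: a second category-4 article by author 1.
    INSERT INTO articles VALUES (4, 'hello again', 4, 1);
    INSERT INTO authors VALUES (1, 'Joe', 'Bob'), (2, 'Sue', 'Joe'), (3, 'Fred', 'Bob');
    """
)

query = """
    SELECT COUNT(*), COUNT(DISTINCT auth.id)
    FROM authors auth
    INNER JOIN articles art ON auth.id = art.author_id
    WHERE auth.friends_with = ? AND art.category = ?
"""
# SQLite compares text case-sensitively by default, so pass 'Bob' exactly.
joined_rows, distinct_authors = conn.execute(query, ("Bob", 4)).fetchone()
print(joined_rows, distinct_authors)
```

The join produces three rows (two for Joe, one for Fred), but only two distinct authors, which is the number the question asks for.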
But if you have any control over the database, I would use a link table for friends_with. Your current design either has to store a comma-separated list of names, which would be disastrous for performance and require a completely different query, or it limits each author to one friend.
Friends
id friend_id
then the query would look like this
SELECT COUNT(DISTINCT auth.id)
FROM authors auth
INNER JOIN articles art ON auth.id = art.author_id
INNER JOIN friends f ON auth.id = f.id
INNER JOIN authors fauth ON fauth.id = f.friend_id
WHERE fauth.name = 'bob' AND art.category = 4
It's more complex but allows for many friends. Just remember that this construct calls for 2 rows in friends for each pair, one from Joe to Bob and one from Bob to Joe.
You could build it differently but that would make the query even more complex.
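For illustration, the link-table variant can be tried out like this. It is a sketch in Python with sqlite3, assuming that Bob exists as a row in authors (id 4 is invented) and that every friendship is stored in both directions as described.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE articles (id INTEGER PRIMARY KEY, article_text TEXT, category INTEGER, author_id INTEGER);
    CREATE TABLE friends (id INTEGER, friend_id INTEGER, PRIMARY KEY (id, friend_id));
    -- Bob (id 4) is an invented author row so friendships can point at him.
    INSERT INTO authors VALUES (1, 'Joe'), (2, 'Sue'), (3, 'Fred'), (4, 'Bob');
    INSERT INTO articles VALUES (1, 'hello world', 4, 1), (2, 'hi', 5, 2), (3, 'wasup', 4, 3);
    -- Two rows per friendship, one in each direction.
    INSERT INTO friends VALUES (1, 4), (4, 1), (3, 4), (4, 3), (2, 1), (1, 2);
    """
)

count = conn.execute(
    """
    SELECT COUNT(DISTINCT auth.id)
    FROM authors auth
    INNER JOIN articles art ON auth.id = art.author_id
    INNER JOIN friends f ON auth.id = f.id
    INNER JOIN authors fauth ON fauth.id = f.friend_id
    WHERE fauth.name = ? AND art.category = ?
    """,
    ("Bob", 4),
).fetchone()[0]
print(count)
```

Joe now has two friends (Bob and Sue) without any change to the query, which is the point of the link table.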
Maybe something like
select fr.name,
fr.id,
au.name,
ar.article_text,
ar.category,
ar.author_id
from authors fr, authors au, articles ar
where fr.id = ar.author_id
and au.friends_with = fr.name
and ar.category = 4 ;
Just the count...
select count(distinct fr.name)
from authors fr, authors au, articles ar
where fr.id = ar.author_id
and au.friends_with = fr.name
and ar.category = 4 ;
A version without using joins (hopefully will work!)
SELECT count(distinct id) from authors where friends_with = 'Bob' and id in(select author_id from articles where category = 4)
I found statements with 'IN' in them easier to understand when I started out with SQL.

MySQL Migration Script Help

I am working on a site that lists a directory of various restaurants, and currently in the process of switching to a newer CMS. The problem I have is that both CMSes represent the restaurant data differently.
Old CMS
It is a cross-reference database, so an entry might look like this:
ID / FieldID / ItemID / data
3 / 1 / 6 / 123 Foo Street
4 / 2 / 6 / Bar
One reference table maps FieldID 1 to Street and FieldID 2 to City.
Another reference table maps ItemID 6 to Delicious Restaurant.
New CMS
In the new CMS, judging from a sample listing I set up, the data is stored in direct rows with no cross-referencing. So the data for the same restaurant will be:
ID / Name / Street / City
3 / Delicious Restaurant / 123 Foo Street / Bar
There are about 2,000 restaurant listings so it's not a HUGE amount in terms of SQL row data size, but of course enough to not even consider re-entering all the restaurant listings by hand.
I have a few ideas, but they would be extremely dirty and take a while, and I'm not a MySQL expert, so I am here for some ideas on how I should tackle it.
Many thanks to those who can help.
You can join against the data table multiple times to get something like this:
insert into newTable
select oldNames.ItemID,
oldNames.Name,
oldStreets.data,
oldCities.data
from oldNames
inner join oldData as oldStreets on oldNames.ItemID = oldStreets.ItemID
inner join oldData as oldCities on oldNames.ItemID = oldCities.ItemID
inner join oldFields as streetsFields
on oldStreets.FieldID = streetsFields.FieldID
and streetsFields.Name = 'Street'
inner join oldFields as citiesFields
on oldCities.FieldID = citiesFields.FieldID
and citiesFields.Name = 'City'
You didn't provide names for all of the tables, so I made some names up. If you have more fields that you need to extract, it should be trivial to extend this sort of query.
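Here is a small dry run of that kind of migration against the sample rows from the question. It is a sketch in Python with sqlite3 rather than MySQL; the table names are the made-up ones from the query, and the column names of the new table are an assumption based on the sample listing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Table names follow the made-up names used in the query above.
    CREATE TABLE oldNames (ItemID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE oldFields (FieldID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE oldData (ID INTEGER PRIMARY KEY, FieldID INTEGER, ItemID INTEGER, data TEXT);
    CREATE TABLE newTable (ID INTEGER PRIMARY KEY, Name TEXT, Street TEXT, City TEXT);
    INSERT INTO oldNames VALUES (6, 'Delicious Restaurant');
    INSERT INTO oldFields VALUES (1, 'Street'), (2, 'City');
    INSERT INTO oldData VALUES (3, 1, 6, '123 Foo Street'), (4, 2, 6, 'Bar');
    """
)

conn.execute(
    """
    INSERT INTO newTable (ID, Name, Street, City)
    SELECT n.ItemID, n.Name, s.data, c.data
    FROM oldNames n
    JOIN oldData s ON s.ItemID = n.ItemID
    JOIN oldFields sf ON sf.FieldID = s.FieldID AND sf.Name = 'Street'
    JOIN oldData c ON c.ItemID = n.ItemID
    JOIN oldFields cf ON cf.FieldID = c.FieldID AND cf.Name = 'City'
    """
)

migrated = conn.execute("SELECT ID, Name, Street, City FROM newTable").fetchall()
print(migrated)
```

One thing to watch with inner joins: a restaurant that is missing its street or city row in the old data will be skipped entirely, so LEFT JOINs may be the safer choice if the old data has gaps.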