Why do I have to use DISTINCT for this to work? - sql

Here's my problem: I have an SQL query that makes four calls to a lookup table to return their values for a list of combinations in another table. I finally got this working, but when I run the query without DISTINCT I get a ton of data back, so I'm guessing that I'm either missing something or not doing this correctly. It would be really great if this would not only work, but also return the list alphabetically by the first colour name.
I'm putting my SQL here. I hope I've explained this well enough:
SELECT DISTINCT
colour1.ColourID AS colour1_ColourID,
colour1.ColourName AS colour1_ColourName,
colour1.ColourHex AS colour1_ColourHex,
colour1.ManufacturerColourID AS colour1_ManufacturerColourID,
colour2.ColourID AS colour2_ColourID,
colour2.ColourName AS colour2_ColourName,
colour2.ColourHex AS colour2_ColourHex,
colour2.QEColourID2 AS colour2_QEColourID2,
colour3.ColourID AS colour3_ColourID,
colour3.ColourName AS colour3_ColourName,
colour3.ColourHex AS colour3_ColourHex,
colour3.QEColourID3 AS colour3_QEColourID3,
colour4.ColourID AS colour4_ColourID,
colour4.ColourName AS colour4_ColourName,
colour4.ColourHex AS colour4_ColourHex,
colour4.QEColourID4 AS colour4_QEColourID4,
Combinations.ID,
Combinations.ManufacturerColourID AS Combinations_ManufacturerColourID,
Combinations.QEColourID2 AS Combinations_QEColourID2,
Combinations.QEColourID3 AS Combinations_QEColourID3,
Combinations.QEColourID4 AS Combinations_QEColourID4,
Combinations.ColourSupplierID,
ColourSuppliers.ColourSupplier
FROM
ColourSuppliers INNER JOIN
(
colour4 INNER JOIN
(
colour3 INNER JOIN
(
colour2 INNER JOIN
(
colour1 INNER JOIN Combinations ON
colour1.ColourID=Combinations.ManufacturerColourID
) ON colour2.ColourID=Combinations.QEColourID2
) ON colour3.ColourID=Combinations.QEColourID3
) ON colour4.ColourID=Combinations.QEColourID4
) ON ColourSuppliers.ColourSupplierID=Combinations.ColourSupplierID
WHERE Combinations.ColourSupplierID = ?
Thanks
Steph

It looks as though you've probably got multiple records for each set of four colour combinations in the Combinations table - posting the structure of the table might help us to work it out.
Adding the clause order by colour1.ColourName to the end of the query should sort it alphabetically by the first colour name.

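My guess (and it is a guess because your SQL query is very wide!) is that you're getting a cartesian product: one of the joins matches more than one row per combination, so each combination comes back several times.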
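To make the two guesses above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout is an assumption (a single hypothetical Colours lookup table joined twice, with one colour accidentally stored twice), not the asker's real schema. It shows how a duplicated lookup row multiplies the output, how DISTINCT hides that, and how ORDER BY sorts by the first colour name.

```python
import sqlite3

# Hypothetical miniature schema: the Colours lookup table has no unique key
# and accidentally holds the row for 'Blue' twice.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Colours (ColourID INTEGER, ColourName TEXT);
    CREATE TABLE Combinations (ID INTEGER PRIMARY KEY,
                               Colour1ID INTEGER, Colour2ID INTEGER);
    INSERT INTO Colours VALUES (1, 'Red'), (2, 'Blue'), (2, 'Blue'), (3, 'Amber');
    INSERT INTO Combinations VALUES (10, 1, 2), (11, 3, 2);
""")

query = """
    SELECT {distinct} cb.ID, c1.ColourName, c2.ColourName
    FROM Combinations AS cb
    INNER JOIN Colours AS c1 ON c1.ColourID = cb.Colour1ID
    INNER JOIN Colours AS c2 ON c2.ColourID = cb.Colour2ID
    ORDER BY c1.ColourName
"""
# Without DISTINCT every combination is repeated once per matching 'Blue' row.
plain_rows = conn.execute(query.format(distinct="")).fetchall()
# With DISTINCT the identical rows collapse to one per combination.
distinct_rows = conn.execute(query.format(distinct="DISTINCT")).fetchall()

print(len(plain_rows), len(distinct_rows))
print(distinct_rows)
```

If this is what is happening, DISTINCT only hides the symptom; removing the duplicate lookup rows would fix the cause.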

Related

Use group by with sum in query

The 3 tables that you see in the image are related: the course table, the coaching table and the sales table.
I want to build a report from these tables showing how much each coach has sold for each course period.
The query I created is below, but it has a problem and I do not know where it is.
Please help me fix it.
Thank you
SELECT
dbo.tblCustomersOrders.id, dbo.tblCustomersOrders.pid, dbo.tblPost.postTitle,
dbo.tblArticleAuthor.authorName, SUM(dbo.tblCustomersOrders.prodPrice) AS TotalBuys
FROM
dbo.tblPost
INNER JOIN
dbo.tblArticleAuthor ON dbo.tblPost.id = dbo.tblArticleAuthor.articleID
INNER JOIN
dbo.tblCustomersOrders ON dbo.tblPost.id = dbo.tblCustomersOrders.pid
GROUP BY dbo.tblCustomersOrders.pid
SUM() is an aggregate function, so every column in the SELECT list that is not
inside an aggregate must also appear in the GROUP BY clause.
Example:
SELECT
dbo.tblCustomersOrders.id, dbo.tblCustomersOrders.pid, dbo.tblPost.postTitle,
dbo.tblArticleAuthor.authorName, SUM(dbo.tblCustomersOrders.prodPrice) AS TotalBuys
FROM dbo.tblPost
INNER JOIN
dbo.tblArticleAuthor ON dbo.tblPost.id = dbo.tblArticleAuthor.articleID
INNER JOIN
dbo.tblCustomersOrders ON dbo.tblPost.id = dbo.tblCustomersOrders.pid
GROUP BY dbo.tblCustomersOrders.id, dbo.tblCustomersOrders.pid,
dbo.tblPost.postTitle, dbo.tblArticleAuthor.authorName
But this query does not give you the report you need.
If you just need "how much each coach has sold by each course", you can try the query below.
SELECT
dbo.tblArticleAuthor.authorName, dbo.tblPost.postTitle,
SUM(dbo.tblCustomersOrders.prodPrice) AS TotalBuys
FROM dbo.tblPost
INNER JOIN
dbo.tblArticleAuthor ON dbo.tblPost.id = dbo.tblArticleAuthor.articleID
INNER JOIN
dbo.tblCustomersOrders ON dbo.tblPost.id = dbo.tblCustomersOrders.pid
GROUP BY dbo.tblArticleAuthor.authorName, dbo.tblPost.postTitle
If that is not what you need, please post more details about the desired result.
Here you can find more information about SQL Server aggregate functions:
https://learn.microsoft.com/en-us/sql/t-sql/functions/aggregate-functions-transact-sql?view=sql-server-ver15
And here is a quick example of SQL aliases, which make queries like this shorter
and easier to read:
https://www.w3schools.com/sql/trysql.asp?filename=trysql_select_alias_table
Per your description of the task, the problem is that you only GROUP BY dbo.tblCustomersOrders.pid, which I guess is the period's id, but you also need to GROUP BY the coach, which I guess is dbo.tblArticleAuthor.authorName. Also, the SELECT list may only contain columns that are either aggregated or listed in the GROUP BY.
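As a quick check of the "coach by course" grouping discussed above, here is a small sketch run against an in-memory SQLite database from Python. The table and column names come from the question, but the rows (authors 'Ana' and 'Ben', the titles and the prices) are made up for illustration, and the aliases p, a and o are my own shorthand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblPost (id INTEGER PRIMARY KEY, postTitle TEXT);
    CREATE TABLE tblArticleAuthor (articleID INTEGER, authorName TEXT);
    CREATE TABLE tblCustomersOrders (id INTEGER PRIMARY KEY,
                                     pid INTEGER, prodPrice INTEGER);
    -- Hypothetical sample data: two courses, two coaches, three sales.
    INSERT INTO tblPost VALUES (1, 'SQL Basics'), (2, 'Advanced SQL');
    INSERT INTO tblArticleAuthor VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO tblCustomersOrders VALUES (1, 1, 100), (2, 1, 100), (3, 2, 250);
""")

# Every non-aggregated column in SELECT also appears in GROUP BY.
totals = conn.execute("""
    SELECT a.authorName, p.postTitle, SUM(o.prodPrice) AS TotalBuys
    FROM tblPost AS p
    INNER JOIN tblArticleAuthor AS a ON p.id = a.articleID
    INNER JOIN tblCustomersOrders AS o ON p.id = o.pid
    GROUP BY a.authorName, p.postTitle
    ORDER BY a.authorName, p.postTitle
""").fetchall()

for row in totals:
    print(row)
```

Selecting the order id as well would put it into the GROUP BY and give one row per sale again, which is why it is left out here.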

Finding differences in queries using two criteria

I'm trying to find a way to compare two queries using a combined set of criteria. In this case we have prefixes (a two-letter code like DA) and a pack number like 1234567. In each query I've created a field that combines these two things so it appears as 1234567DA; this is done for each of the queries, from the separate tables they are pulled from. The idea is that if a value is in one table and not the other it would show up as "False". I tried to use an Unmatched query but that doesn't seem to work. What I have currently is as follows:
SELECT
[1LagoTest].Prefix,
[1BigPicPackPref].BigPicPP,
IIf([BigPicPP]=[LagoPP],"True","False") AS Compare,
[1LagoTest].RETAIL,
[1LagoTest].MEDIA
FROM 1LagoTest
LEFT JOIN 1BigPicPackPref
ON [1LagoTest].[Prefix] = [1BigPicPackPref].[BigPicPP]
WHERE (((IIf([BigPicPP]=[LagoPP],"True","False")) Like "False")
AND (([1LagoTest].MEDIA) Not Like "*2019 FL*"))
ORDER BY [1LagoTest].RETAIL;
Right now it will show what's missing from LagoPP but doesn't give me anything for packs missing from BigPicPP. Any help in the right direction would be greatly appreciated.
Thanks!!
This gets a little tricky in Access without FULL OUTER JOIN, but the general idea is to replicate a FULL OUTER JOIN using UNION ALL, then filter from that.
Something like this:
SELECT I.Prefix,
I.BigPicPP,
I.Compare,
I.Retail,
I.Media
FROM (SELECT L.Prefix,
B.BigPicPP,
IIf([BigPicPP]=[LagoPP],"True","False") as Compare,
L.Retail,
L.Media
FROM 1LagoTest L
JOIN 1BigPicPackPref B ON L.Prefix = B.BigPicPP
WHERE L.Media NOT LIKE "*2019 FL*"
UNION ALL
SELECT L.Prefix,
B.BigPicPP,
"False", --Missing records from 1BigPicPackPref
L.Retail,
L.Media
FROM 1LagoTest L
LEFT JOIN 1BigPicPackPref B ON L.Prefix = B.BigPicPP
AND L.Media NOT LIKE "*2019 FL*"
WHERE B.Prefix IS NULL
UNION ALL
SELECT B.Prefix,
B.BigPicPP,
"False", --Missing records from 1LagoTest
L.Retail,
L.Media
FROM 1LagoTest L
RIGHT JOIN 1BigPicPackPref B ON L.Prefix = B.BigPicPP
AND L.Media NOT LIKE "*2019 FL*"
WHERE L.Prefix IS NULL
) AS I
You only need IIf in the first part of the union, because in the other two parts one side will always be NULL, so we know the comparison will always fail and be False.
You shouldn't need this part of your current WHERE clause at all: (((IIf([BigPicPP]=[LagoPP],"True","False")) Like "False"). But if you only want to see False records, just add WHERE I.Compare = "False" to the bottom of the outer select.
The reason the "Unmatched" query (assuming you built it through the Wizard) does not work is that you want the rows from each of two tables / queries that have no match in the other. That is not how "Unmatched" works: it only goes in one direction and gives you the rows of a single table / query that have no match in another single table / query.
This can most likely be done any number of ways, but this would probably get you where you want to be (or close to it):
SELECT
a.Prefix,
b.BigPicPP,
IIf([BigPicPP]=[LagoPP],"True","False") AS Compare,
a.RETAIL,
a.MEDIA
FROM [1LagoTest] a
LEFT JOIN [1BigPicPackPref] b ON a.Prefix = b.BigPicPP
WHERE a.MEDIA Not Like "*2019 FL*"
AND b.BigPicPP IS NULL
ORDER BY a.RETAIL
UNION
SELECT
a.Prefix,
b.BigPicPP,
IIf([BigPicPP]=[LagoPP],"True","False") AS Compare,
a.RETAIL,
a.MEDIA
FROM [1LagoTest] a
RIGHT JOIN [1BigPicPackPref] b ON a.Prefix = b.BigPicPP
WHERE a.Prefix IS NULL
ORDER BY a.RETAIL
NOTE: Depending on the data structure, the ORDER BY may cause some issues.
So the way I got this to finally work was to build two separate queries. One looking at what was missing from Lago and One that was looking at what was missing from BigPic. It was the only way I could get it to give me both sets of missing data. If I can find a better way to do it through one query I will report back as I'm still gonna play around with it.
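For reference, the "two separate queries" approach can be stitched into one result with UNION ALL. The sketch below uses Python's sqlite3 module rather than Access, and the table names Lago and BigPic, the column PackPrefix and the sample values are hypothetical stand-ins for the two source queries. Each half is a LEFT JOIN that keeps only the rows with no partner on the other side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for the two queries; PackPrefix is the
    -- combined pack number + prefix field, e.g. 1234567DA.
    CREATE TABLE Lago (PackPrefix TEXT);
    CREATE TABLE BigPic (PackPrefix TEXT);
    INSERT INTO Lago VALUES ('1234567DA'), ('1111111DB');
    INSERT INTO BigPic VALUES ('1234567DA'), ('2222222DC');
""")

missing = conn.execute("""
    -- Rows in Lago with no match in BigPic
    SELECT l.PackPrefix AS PackPrefix, 'BigPic' AS MissingFrom
    FROM Lago AS l
    LEFT JOIN BigPic AS b ON l.PackPrefix = b.PackPrefix
    WHERE b.PackPrefix IS NULL
    UNION ALL
    -- Rows in BigPic with no match in Lago
    SELECT b.PackPrefix, 'Lago'
    FROM BigPic AS b
    LEFT JOIN Lago AS l ON b.PackPrefix = l.PackPrefix
    WHERE l.PackPrefix IS NULL
    ORDER BY PackPrefix
""").fetchall()

print(missing)
```

Swapping the table order in the second half avoids RIGHT JOIN, which older SQLite versions do not support; in Access the same pattern should work with bracketed table names, though I have not tested it there.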

how to join multiple tables without showing repeated data?

I ran into a problem recently, and I'm sure it's because of how I join the tables.
This is my code:
select LP_Pending_Info.Service_Order,
LP_Pending_Info.Pending_Days,
LP_Pending_Info.Service_Type,
LP_Pending_Info.ASC_Code,
LP_Pending_Info.Model,
LP_Pending_Info.IN_OUT_WTY,
LP_Part_Codes.PartCode,
LP_PS_Codes.PS,
LP_Confirmation_Codes.SO_NO,
LP_Pending_Info.Engineer_Code
from LP_Pending_Info
join LP_Part_Codes
on LP_Pending_Info.Service_order = LP_Part_Codes.Service_order
join LP_PS_Codes
on LP_Pending_Info.Service_Order = LP_PS_Codes.Service_Order
join LP_Confirmation_Codes
on LP_Pending_Info.Service_Order = LP_Confirmation_Codes.Service_Order
order by LP_Pending_Info.Service_order, LP_Part_Codes.PartCode;
For every service order I have at most 5 part codes.
If the service order has only one part code, the result is shown correctly, but when it has more than one part code the problem begins.
For example, service order "4182134076" has only 2 part codes, 'GH81-13601A' and 'GH96-09938A', so it should show the data 2 times, but it repeats it 8 times. What seems to be the problem?
If your records were exactly the same, the DISTINCT keyword would have solved it.
However, rows 2 and 3 have the same Service_Order and PartCode but a different SO_NO. That is why DISTINCT won't work here: the rows are not identical.
I'd say there is a problem in one of your join conditions. The differing data is in the SO_NO column, so check the raw data in the LP_Confirmation_Codes table for that Service_Order:
select * from LP_Confirmation_Codes where Service_Order = 4182134076
I assume you are missing an AND condition on a value from LP_Part_Codes or LP_PS_Codes (but I can't be sure without seeing those tables and data myself).
Since you say the result is correct when the service order has only one part code and breaks when it has more than one, you are probably missing an AND condition involving the LP_Part_Codes table.
Based on your output, this is the data that causes the repeated rows.
Service Order 4182134076 has:
2 PartCode values: GH81-13601A and GH96-09938A
2 PS values: U and P
2 SO_NO values: 1.00024e+09 and 1.00022e+09
Each join multiplies the rows, so 2 × 2 × 2 = 2^3 = 8 rows are returned. I believe you need to check the conditions on which you join your tables.
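The multiplication is easy to reproduce. Below is a small sketch with Python's sqlite3 module that loads one service order with two rows in each child table and runs the same three joins. The table names, part codes and PS values are from the question; the SO_NO values 'SO-1' and 'SO-2' are placeholders, since the real ones are only shown rounded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE LP_Pending_Info (Service_Order INTEGER);
    CREATE TABLE LP_Part_Codes (Service_Order INTEGER, PartCode TEXT);
    CREATE TABLE LP_PS_Codes (Service_Order INTEGER, PS TEXT);
    CREATE TABLE LP_Confirmation_Codes (Service_Order INTEGER, SO_NO TEXT);
    INSERT INTO LP_Pending_Info VALUES (4182134076);
    INSERT INTO LP_Part_Codes VALUES
        (4182134076, 'GH81-13601A'), (4182134076, 'GH96-09938A');
    INSERT INTO LP_PS_Codes VALUES (4182134076, 'U'), (4182134076, 'P');
    -- 'SO-1' and 'SO-2' are placeholder values.
    INSERT INTO LP_Confirmation_Codes VALUES
        (4182134076, 'SO-1'), (4182134076, 'SO-2');
""")

rows = conn.execute("""
    SELECT i.Service_Order, pc.PartCode, ps.PS, cc.SO_NO
    FROM LP_Pending_Info AS i
    JOIN LP_Part_Codes AS pc ON i.Service_Order = pc.Service_Order
    JOIN LP_PS_Codes AS ps ON i.Service_Order = ps.Service_Order
    JOIN LP_Confirmation_Codes AS cc ON i.Service_Order = cc.Service_Order
""").fetchall()

print(len(rows))       # every part code paired with every PS and every SO_NO
print(len(set(rows)))  # all rows differ, so DISTINCT would not remove any
```

Joining only on Service_Order pairs every part code with every PS and every SO_NO. If those child rows actually belong to specific part codes, the join needs an extra column that ties them together.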
Use DISTINCT
select distinct LP_Pending_Info.Service_Order,LP_Pending_Info.Pending_Days,
LP_Pending_Info.Service_Type,LP_Pending_Info.ASC_Code,LP_Pending_Info.Model,
LP_Pending_Info.IN_OUT_WTY, LP_Part_Codes.PartCode,LP_PS_Codes.PS,
LP_Confirmation_Codes.SO_NO,LP_Pending_Info.Engineer_Code
from LP_Pending_Info
join LP_Part_Codes on LP_Pending_Info.Service_order = LP_Part_Codes.Service_order
join LP_PS_Codes on LP_Part_Codes.Service_Order = LP_PS_Codes.Service_Order
join LP_Confirmation_Codes on LP_PS_Codes.Service_Order = LP_Confirmation_Codes.Service_Order
order by LP_Pending_Info.Service_order, LP_Part_Codes.PartCode;
DISTINCT removes duplicates based on the columns in your SELECT, so if two rows are identical, that row is returned only once.

"Your query does not include the specified expression..."

I have tried endless things to get this to work and it seems to break over and over again and not work. I'm trying to GROUP BY product after I have calculated the field quantity returned/quantity ordered, but I get the error
your query does not include the specified expression 'quantity_returned/quantity_ordered' as part of an aggregate function.
I do not want to GROUP BY quantity_returned, quantity_ordered, and product, I only want to GROUP BY product.
Here's what my SQL looks like currently...
SELECT
quantity_returned/quantity_ordered AS percentage_returned,
quantity_returned,
quantity_ordered,
returns_fact.product
FROM
Customer_dimension
INNER JOIN
(
Product_dimension
INNER JOIN
(
Day_dimension
INNER JOIN
returns_fact
ON Day_dimension.day_key = returns_fact.day_key
)
ON Product_dimension.product_key = returns_fact.product_key
)
ON Customer_dimension.customer_key = returns_fact.customer_key
GROUP BY returns_fact.product;
When you use GROUP BY, every column in your SELECT that isn't inside an aggregate function must also be in the GROUP BY.
I have no idea how your tables are set up, so this is a blind dart. If you post the fields in each of the 4 tables, someone will be better able to help.
SELECT returns_fact.product, count(quantity_returned), count(quantity_ordered), count(quantity_returned)/count(quantity_ordered) AS percentage_returned
FROM returns_fact GROUP BY returns_fact.product
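If the goal is the share of ordered units that were returned per product, one option is to aggregate both quantities with SUM and divide the totals. This is an assumption about the intent: the answer above uses count, which counts rows rather than units. A minimal sketch with Python's sqlite3 module and made-up rows for a single returns_fact table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE returns_fact (product TEXT,
                               quantity_returned INTEGER,
                               quantity_ordered INTEGER);
    -- Hypothetical sample rows.
    INSERT INTO returns_fact VALUES ('A', 1, 10), ('A', 3, 10), ('B', 0, 5);
""")

# Both quantities sit inside aggregates, so only product is in GROUP BY.
# Multiplying by 1.0 avoids integer division in engines that truncate.
per_product = conn.execute("""
    SELECT product,
           SUM(quantity_returned) * 1.0 / SUM(quantity_ordered)
               AS percentage_returned
    FROM returns_fact
    GROUP BY product
    ORDER BY product
""").fetchall()

print(per_product)
```

Because the division happens after aggregation, the query groups by product only, which is what the question asked for.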

Why is my join giving me way too many results?

I have:
SELECT dv.VariableID ,
ds.DataSourceID ,
p.DataVariableDataSourceParamId ,
p.ParamCode ,
p.ParamDisplayName ,
p.DVDSParamControlType ,
p.DependentOnDVDSParamId ,
pv.ParamValue
FROM dbo.DataVariable dv
INNER JOIN dbo.DataVariableDataSource ds ON dv.DataSourceId = ds.DataSourceID
INNER JOIN dbo.DataVariableDataSourceParam p ON ds.DataSourceID = p.DataSourceId
INNER JOIN dbo.DataVariableDataSourceParamValue pv ON p.DataVariableDataSourceParamId = pv.DataVariableDataSourceParamId
WHERE dv.VariableID = @vid
ORDER BY dv.VariableID
When I just have the first two joins, I get what I want: 6 results. When I add the third, I get 660. I just want the ParamValue for the 6 records from the first 2 joins and I can't seem to figure out why this is breaking. I'm on my 12th hour of coding and I'm sure this is insanely obvious, but I could use a hand. Thanks in advance.
This is because you have numerous rows in your pv table that match on DataVariableDataSourceParamId.
You can verify this by adding a SELECT DISTINCT. You may need to clean that table up, or keep the DISTINCT.
However, the DISTINCT will only help if pv.ParamValue is the same for all of the matching rows. Otherwise you are rightfully getting more rows: the join finds every match for DataVariableDataSourceParamId and displays each one.
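A quick way to confirm this is to count the rows per key in the value table before joining. The sketch below uses Python's sqlite3 module with hypothetical, shortened table names (Param and ParamValues) and made-up values. It shows the per-key counts, the inflated join, and that DISTINCT only removes the rows whose values are identical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, shortened stand-ins for the param and param-value tables.
    CREATE TABLE Param (ParamId INTEGER PRIMARY KEY);
    CREATE TABLE ParamValues (ParamId INTEGER, ParamValue TEXT);
    INSERT INTO Param VALUES (1), (2);
    INSERT INTO ParamValues VALUES (1, 'x'), (1, 'x'), (1, 'y'), (2, 'z');
""")

# How many value rows does each parameter match?
matches_per_param = conn.execute("""
    SELECT ParamId, COUNT(*) FROM ParamValues
    GROUP BY ParamId ORDER BY ParamId
""").fetchall()

# The join returns one row per matching value row.
joined = conn.execute("""
    SELECT p.ParamId, pv.ParamValue
    FROM Param AS p
    INNER JOIN ParamValues AS pv ON p.ParamId = pv.ParamId
""").fetchall()

# DISTINCT collapses the repeated (1, 'x') but keeps (1, 'y').
deduped = conn.execute("""
    SELECT DISTINCT p.ParamId, pv.ParamValue
    FROM Param AS p
    INNER JOIN ParamValues AS pv ON p.ParamId = pv.ParamId
    ORDER BY p.ParamId, pv.ParamValue
""").fetchall()

print(matches_per_param, len(joined), deduped)
```

If the counts per key are much larger than expected, the join is probably missing a condition, or the value table really does hold several values per parameter.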