Query SQL Microsoft Access bug? - sql

I've been working on an Access database with SQL. I was trying to perform the following query:
SELECT Produtos.produto,
[aux].[total]/[Produtos].[existencias] AS [peso consumos nas existencias]
FROM (SELECT Produtos.produto, SUM(Consumos.quantidade) AS total
FROM Consumos, Produtos, Fornecedores
WHERE Consumos.codproduto=Produtos.produto
AND Produtos.codfornecedor=9
GROUP BY Produtos.produto
ORDER BY Produtos.produto) AS aux
INNER JOIN Produtos
ON aux.produto = Produtos.produto
WHERE (((aux.produto)=[Produtos].[produto]));
A closer look at the results showed me that the column [peso consumos nas existencias] was multiplied by 10. After trying to fix this, I noticed that I was not using the table Fornecedores although I was calling it after FROM keyword, so I removed it:
SELECT Produtos.produto,
[aux].[total]/[Produtos].[existencias] AS [peso consumos nas existencias]
FROM (SELECT Produtos.produto, SUM(Consumos.quantidade) AS total
FROM Consumos, Produtos
WHERE Consumos.codproduto=Produtos.produto
AND Produtos.codfornecedor=9
GROUP BY Produtos.produto
ORDER BY Produtos.produto) AS aux
INNER JOIN Produtos
ON aux.produto = Produtos.produto
WHERE (((aux.produto)=[Produtos].[produto]));
After running, the results were right. Was this suppose to happen? if so, why?
Thanks!

Your Fornecedores table probably has 10 records.
FROM Consumos, Produtos, Fornecedores
WHERE Consumos.codproduto=Produtos.produto
was doing a cartesian product of the Consumos-Produtos join with those 10 records, so the SUM() used each number 10 times.
Note 1:
It is considered better style to use the explicit INNER JOIN syntax:
FROM Consumos INNER JOIN Produtos
ON Consumos.codproduto=Produtos.produto
WHERE Produtos.codfornecedor=9
instead of FROM Consumos, Produtos
Note 2:
If you think you have found a bug in the Access (or any database) query engine, chances are almost 100% that the bug is in your query. ;-)

Related

Use group by with sum in query

These 3 tables that you see in the image are related
Course table and coaching table and sales table
I want to make a report from this table on how much each coach has sold by each course period.
The query I created is as follows, but unfortunately it has a problem and I do not know where the problem is.
Please help me fix the problem
Thank you
SELECT
dbo.tblCustomersOrders.id, dbo.tblCustomersOrders.pid, dbo.tblPost.postTitle,
dbo.tblArticleAuthor.authorName, SUM(dbo.tblCustomersOrders.prodPrice) AS TotalBuys
FROM
dbo.tblPost
INNER JOIN
dbo.tblArticleAuthor ON dbo.tblPost.id = dbo.tblArticleAuthor.articleID
INNER JOIN
dbo.tblCustomersOrders ON dbo.tblPost.id = dbo.tblCustomersOrders.pid
GROUP BY dbo.tblCustomersOrders.pid
For this use, SUM() is an Aggregate Function, so you need to refer all the
fields that you want to get in your result set.
Example:
SELECT
dbo.tblCustomersOrders.id, dbo.tblCustomersOrders.pid, dbo.tblPost.postTitle,
dbo.tblArticleAuthor.authorName, SUM(dbo.tblCustomersOrders.prodPrice) AS TotalBuys
FROM dbo.tblPost
INNER JOIN
dbo.tblArticleAuthor ON dbo.tblPost.id = dbo.tblArticleAuthor.articleID
INNER JOIN
dbo.tblCustomersOrders ON dbo.tblPost.id = dbo.tblCustomersOrders.pid
GROUP BY dbo.tblCustomersOrders.id, dbo.tblCustomersOrders.pid,
dbo.tblPost.postTitle, dbo.tblArticleAuthor.authorName
But this query does not solve the need for your report.
If you just need to get "how much each coach has sold by each course" , you can try the query bellow.
SELECT
dbo.tblArticleAuthor.authorName, dbo.tblPost.postTitle,
SUM(dbo.tblCustomersOrders.prodPrice) AS TotalBuys
FROM dbo.tblPost
INNER JOIN
dbo.tblArticleAuthor ON dbo.tblPost.id = dbo.tblArticleAuthor.articleID
INNER JOIN
dbo.tblCustomersOrders ON dbo.tblPost.id = dbo.tblCustomersOrders.pid
GROUP BY dbo.tblArticleAuthor.authorName, dbo.tblPost.postTitle
If you need, send more details regarding the desired result.
Here you can find more information about SQL SERVER Aggregate Functions:
https://learn.microsoft.com/en-us/sql/t-sql/functions/aggregate-functions-transact-sql?view=sql-server-ver15
And here a quick example regarding SQL Aliases to build queries with a simple
and effective way:
https://www.w3schools.com/sql/trysql.asp?filename=trysql_select_alias_table
Per your description of the task, the problem is that you only GROUPed BY dbo.tblCustomersOrders.pid, which is the period's id I guess, but you also need to GROUP BY the coach, which is dbo.tblArticleAuthor.authorName, I guess again. Plus in the SELECT field list you can not use more columns only that are aggregated + GROUPed.

Order of Joins SQL Server

I ran across a curious issue earlier today while writing a multiple JOIN query in MS SQL Server and I am hoping somebody may be able to help me to understand better. Typically I like to join tables in the order I pull from them in the SELECT statement (just seems easier and cleaner to me). The below statement works only because it is not in order of the SELECT statement and I don't know why. If I re-wrote it to join VW_ORDER_HEADER --> VW_ORDER_ITEM --> VW_PO_ITEM in that order it returns no results. I only discovered this works after designing it in Access and looking at the resulting SQL. I am not new to SQL Server but at first and second glance I can't see why the order of the join would make a difference... Thanks in advance
SELECT O.SALES_ORDER_NUMBER,
O.BILL_TO,
O.SHIP_TO,
O.SOLD_TO,
O.ORDER_DATE,
O.REQUESTED_DELIVERY_DATE,
O.VALID_TO_DATE,
O.OPEN_QUANTITY,
OI.MATERIAL,
PI.PO_NUMBER
FROM VW_PO_ITEM PI
JOIN VW_ORDER_ITEM OI ON PI.MATERIAL = OI.MATERIAL
JOIN VW_ORDER_HEADER O ON OI.SALES_ORDER_NUMBER = O.SALES_ORDER_NUMBER
WHERE OI.MATERIAL = 'BA7948'
AND O.OPEN_QUANTITY > 0
AND O.VALID_TO_DATE >= GETDATE()
ORDER BY O.SALES_ORDER_NUMBER,
O.ORDER_DATE ASC;

"Error: Unexpected, Please Try Again" with GoogleBigQuery JOIN

I know this is an older issue with Google BigQuery, but it seems the problem had been fixed # mid 2013. I wanted to know if there has been any recent workarounds/fixes to this issue in the recent months. Here is my query from the google sample data.
SELECT publicdata:samples.natality.mother_age, publicdata:samples.gsod.station_number
FROM [publicdata:samples.natality]
INNER JOIN [publicdata:samples.gsod]
ON publicdata:samples.gsod.year = publicdata:samples.natality.year
LIMIT 100
Query Failed
Error: Unexpected. Please try again.
Job ID: deft-grammar-553:job_eUkW4EhgNvlJPuWPoP1bLL7Ra_w
Thanks for the report! That error message should be improved.
In the meantime: The same query using table aliases works well (though I had to change the JOIN to JOIN EACH to deal with the size of both tables).
Instead of:
SELECT publicdata:samples.natality.mother_age, publicdata:samples.gsod.station_number
FROM [publicdata:samples.natality]
INNER JOIN [publicdata:samples.gsod]
ON publicdata:samples.gsod.year = publicdata:samples.natality.year
LIMIT 100
Do:
SELECT a.mother_age, b.station_number
FROM [publicdata:samples.natality] a
INNER JOIN EACH [publicdata:samples.gsod] b
ON a.year = b.year
LIMIT 100

Timeout running SQL query

I'm trying to using the aggregation features of the django ORM to run a query on a MSSQL 2008R2 database, but I keep getting a timeout error. The query (generated by django) which fails is below. I've tried running it directs the SQL management studio and it works, but takes 3.5 min
It does look it's aggregating over a bunch of fields which it doesn't need to, but I wouldn't have though that should really cause it to take that long. The database isn't that big either, auth_user has 9 records, ticket_ticket has 1210, and ticket_watchers has 1876. Is there something I'm missing?
SELECT
[auth_user].[id],
[auth_user].[password],
[auth_user].[last_login],
[auth_user].[is_superuser],
[auth_user].[username],
[auth_user].[first_name],
[auth_user].[last_name],
[auth_user].[email],
[auth_user].[is_staff],
[auth_user].[is_active],
[auth_user].[date_joined],
COUNT([tickets_ticket].[id]) AS [tickets_captured__count],
COUNT(T3.[id]) AS [assigned_tickets__count],
COUNT([tickets_ticket_watchers].[ticket_id]) AS [tickets_watched__count]
FROM
[auth_user]
LEFT OUTER JOIN [tickets_ticket] ON ([auth_user].[id] = [tickets_ticket].[capturer_id])
LEFT OUTER JOIN [tickets_ticket] T3 ON ([auth_user].[id] = T3.[responsible_id])
LEFT OUTER JOIN [tickets_ticket_watchers] ON ([auth_user].[id] = [tickets_ticket_watchers].[user_id])
GROUP BY
[auth_user].[id],
[auth_user].[password],
[auth_user].[last_login],
[auth_user].[is_superuser],
[auth_user].[username],
[auth_user].[first_name],
[auth_user].[last_name],
[auth_user].[email],
[auth_user].[is_staff],
[auth_user].[is_active],
[auth_user].[date_joined]
HAVING
(COUNT([tickets_ticket].[id]) > 0 OR COUNT(T3.[id]) > 0 )
EDIT:
Here are the relevant indexes (excluding those not used in the query):
auth_user.id (PK)
auth_user.username (Unique)
tickets_ticket.id (PK)
tickets_ticket.capturer_id
tickets_ticket.responsible_id
tickets_ticket_watchers.id (PK)
tickets_ticket_watchers.user_id
tickets_ticket_watchers.ticket_id
EDIT 2:
After a bit of experimentation, I've found that the following query is the smallest that results in the slow execution:
SELECT
COUNT([tickets_ticket].[id]) AS [tickets_captured__count],
COUNT(T3.[id]) AS [assigned_tickets__count],
COUNT([tickets_ticket_watchers].[ticket_id]) AS [tickets_watched__count]
FROM
[auth_user]
LEFT OUTER JOIN [tickets_ticket] ON ([auth_user].[id] = [tickets_ticket].[capturer_id])
LEFT OUTER JOIN [tickets_ticket] T3 ON ([auth_user].[id] = T3.[responsible_id])
LEFT OUTER JOIN [tickets_ticket_watchers] ON ([auth_user].[id] = [tickets_ticket_watchers].[user_id])
GROUP BY
[auth_user].[id]
The weird thing is that if I comment out any two lines in the above, it runs in less that 1s, but it doesn't seem to matter which lines I remove (although obviously I can't remove a join without also removing the relevant SELECT line).
EDIT 3:
The python code which generated this is:
User.objects.annotate(
Count('tickets_captured'),
Count('assigned_tickets'),
Count('tickets_watched')
)
A look at the execution plan shows that SQL Server is first doing a cross-join on all the table, resulting in about 280 million rows, and 6Gb of data. I assume that this is where the problem lies, but why is it happening?
SQL Server is doing exactly what it was asked to do. Unfortunately, Django is not generating the right query for what you want. It looks like you need to count distinct, instead of just count: Django annotate() multiple times causes wrong answers
As for why the query works that way: The query says to join the four tables together. So say an author has 2 captured tickets, 3 assigned tickets, and 4 watched tickets, the join will return 2*3*4 tickets, one for each combination of tickets. The distinct part will remove all the duplicates.
what about this?
SELECT auth_user.*,
C1.tickets_captured__count
C2.assigned_tickets__count
C3.tickets_watched__count
FROM
auth_user
LEFT JOIN
( SELECT capturer_id, COUNT(*) AS tickets_captured__count
FROM tickets_ticket GROUP BY capturer_id ) AS C1 ON auth_user.id = C1.capturer_id
LEFT JOIN
( SELECT responsible_id, COUNT(*) AS assigned_tickets__count
FROM tickets_ticket GROUP BY responsible_id ) AS C2 ON auth_user.id = C2.responsible_id
LEFT JOIN
( SELECT user_id, COUNT(*) AS tickets_watched__count
FROM tickets_ticket_watchers GROUP BY user_id ) AS C3 ON auth_user.id = C3.user_id
WHERE C1.tickets_captured__count > 0 OR C2.assigned_tickets__count > 0
--WHERE C1.tickets_captured__count is not null OR C2.assigned_tickets__count is not null -- also works (I think with beter performance)

SUM(a*b) not working

I have a PHP page running in postgres. I have 3 tables - workorders, wo_parts and part2vendor. I am trying to multiply 2 table column row datas together, ie wo_parts has a field called qty and part2vendor has a field called cost. These 2 are joined by wo_parts.pn and part2vendor.pn. I have created a query like this:
$scoreCostQuery = "SELECT SUM(part2vendor.cost*wo_parts.qty) as total_score
FROM part2vendor
INNER JOIN wo_parts
ON (wo_parts.pn=part2vendor.pn)
WHERE workorder=$workorder";
But if I add the costs of the parts multiplied by the qauntities supplied, it adds to a different number than what the script is doing. Help....I am new to this but if someone can show me in SQL I can modify it for postgres. Thanks
Without seeing example data, there's no way for us to know why you're query totals are coming out differently that when you do the math by hand. It could be a bad join, so you are getting more/less records than you expected. It's also possible that your calculations are off. Pick an example with the smallest number of associated records & compare.
My suggestion is to add a GROUP BY to the query:
SELECT SUM(p.cost * wp.qty) as total_score
FROM part2vendor p
JOIN wo_parts wp ON wp.pn = p.pn
WHERE workorder = $workorder
GROUP BY workorder
FYI: MySQL was designed to allow flexibility in the GROUP BY, while no other db I've used does - it's a source of numerous questions on SO "why does this work in MySQL when it doesn't work on db x...".
To Check that your Quantities are correct:
SELECT wp.qty,
p.cost
FROM WO_PARTS wp
JOIN PART2VENDOR p ON p.pn = wp.pn
WHERE p.workorder = $workorder
Check that the numbers are correct for a given order.
You could try a sub-query instead.
(Note, I don't have a Postgres installation to test this on so consider this more like pseudo code than a working example... It does work in MySQL tho)
SELECT
SUM(p.`score`) AS 'total_score'
FROM part2vendor AS p2v
INNER JOIN (
SELECT pn, cost * qty AS `score`
FROM wo_parts
) AS p
ON p.pn = p2v.pn
WHERE p2n.workorder=$workorder"
In the question, you say the cost column is in part2vendor, but in the query you reference wo_parts.cost. If the wo_parts table has its own cost column, that's the source of the problem.