"Error: Unexpected, Please Try Again" with GoogleBigQuery JOIN - sql

I know this is an older issue with Google BigQuery, but it seems the problem had been fixed around mid-2013. I wanted to know whether there have been any workarounds or fixes for this issue in recent months. Here is my query against the Google sample data.
SELECT publicdata:samples.natality.mother_age, publicdata:samples.gsod.station_number
FROM [publicdata:samples.natality]
INNER JOIN [publicdata:samples.gsod]
ON publicdata:samples.gsod.year = publicdata:samples.natality.year
LIMIT 100
Query Failed
Error: Unexpected. Please try again.
Job ID: deft-grammar-553:job_eUkW4EhgNvlJPuWPoP1bLL7Ra_w

Thanks for the report! That error message should be improved.
In the meantime: The same query using table aliases works well (though I had to change the JOIN to JOIN EACH to deal with the size of both tables).
Instead of:
SELECT publicdata:samples.natality.mother_age, publicdata:samples.gsod.station_number
FROM [publicdata:samples.natality]
INNER JOIN [publicdata:samples.gsod]
ON publicdata:samples.gsod.year = publicdata:samples.natality.year
LIMIT 100
Do:
SELECT a.mother_age, b.station_number
FROM [publicdata:samples.natality] a
INNER JOIN EACH [publicdata:samples.gsod] b
ON a.year = b.year
LIMIT 100

Related

Join with Multiple Tables

I am getting a syntax error with the following query and can't seem to figure it out. I hope you guys can help me!
I have these tables (they are populated):
I am trying to retrieve the first and last name of all the passengers scheduled in a certain flight number so what I have is this:
SELECT PassFName, PassLName
FROM Passenger
INNER JOIN PassID ON Passenger.PassID = Reservation.PassID
INNER JOIN FlightNum ON FlightNum.Reservation = FlightNum.ScheduledFlight
WHERE ScheduledFlight.FlightNum = [Enter Flight Number];
However, I am getting error:
I'm not sure why, and I have also noticed that in the last line it is misspelling FlightNum.ScheduledFlight. Any idea what I am doing wrong?
Thank you!
Gordon's point is valid, but he's got his parentheses misplaced and missed the other big issues. This query is more than a little whacked, with table names and field names flip-flopped. Here's what I would guess would work...
SELECT
PassFName
, PassLName
FROM (
Passenger
INNER JOIN Reservation
ON Passenger.PassID = Reservation.PassID
)
INNER JOIN ScheduledFlight
ON Reservation.FlightNum = ScheduledFlight.FlightNum
WHERE
ScheduledFlight.FlightNum = [Enter Flight Number];
MS Access has a strange syntax for joins. It requires parentheses around each JOIN pair. So:
SELECT PassFName, PassLName
FROM (Passenger INNER JOIN
Reservation
) ON Passenger.PassID = Reservation.PassID INNER JOIN
FlightNum
ON FlightNum.Reservation = FlightNum.ScheduledFlight
WHERE ScheduledFlight.FlightNum = [Enter Flight Number];
Although other databases support this syntax, it is only required in MS Access.

MS Access 2016 Error: "Multi-level group by not allowed"

I'm stuck at another point in my little Access 2016 database. My code looks like the following; I know it probably isn't the cleanest solution, but I'm fairly new to this and have tried to educate myself and have already got some help here.
I'm trying to play around with reports now, and I am using this test query, which returns all entries of two tables joined together.
As far as I could find out, the one subquery I included, which returns the previous day's inventory for each record, is most likely the cause of my error. I found a possible solution that adds SELECT * FROM at the beginning of my code, but I get a syntax error when I do that, and I'm not sure how to solve this problem.
Here's my code:
SELECT Stations.StationName, Product.ProductName, GasInventoryTransactions.TransactionDate, (SELECT TOP 1 Dupe.ActualInventory FROM GasInventory AS Dupe WHERE Dupe.StationID = Stations.StationID AND Dupe.ProductID = Product.ProductID AND Dupe.InventoryDate < GasInventory.InventoryDate ORDER BY Dupe.InventoryDate DESC) AS PreviousDayInventory, GasInventory.ActualInventory, GasInventoryTransactions.GasSales, GasInventoryTransactions.GasDelivery, [PreviousDayInventory]+[GasDelivery]-[GasSales] AS BookBalance, GasInventory.ActualInventory, [ActualInventory]-[BookBalance] AS OverShort
FROM (Stations INNER JOIN (Product INNER JOIN GasInventory ON Product.[ProductID] = GasInventory.[ProductID]) ON Stations.[StationID] = GasInventory.[StationID]) INNER JOIN GasInventoryTransactions ON GasInventory.[InventoryDate] = GasInventoryTransactions.[TransactionDate];
thanks for your help!

Query SQL Microsoft Access bug?

I've been working on an Access database with SQL. I was trying to perform the following query:
SELECT Produtos.produto,
[aux].[total]/[Produtos].[existencias] AS [peso consumos nas existencias]
FROM (SELECT Produtos.produto, SUM(Consumos.quantidade) AS total
FROM Consumos, Produtos, Fornecedores
WHERE Consumos.codproduto=Produtos.produto
AND Produtos.codfornecedor=9
GROUP BY Produtos.produto
ORDER BY Produtos.produto) AS aux
INNER JOIN Produtos
ON aux.produto = Produtos.produto
WHERE (((aux.produto)=[Produtos].[produto]));
A closer look at the results showed me that the column [peso consumos nas existencias] was multiplied by 10. After trying to fix this, I noticed that I was not using the table Fornecedores although I was calling it after FROM keyword, so I removed it:
SELECT Produtos.produto,
[aux].[total]/[Produtos].[existencias] AS [peso consumos nas existencias]
FROM (SELECT Produtos.produto, SUM(Consumos.quantidade) AS total
FROM Consumos, Produtos
WHERE Consumos.codproduto=Produtos.produto
AND Produtos.codfornecedor=9
GROUP BY Produtos.produto
ORDER BY Produtos.produto) AS aux
INNER JOIN Produtos
ON aux.produto = Produtos.produto
WHERE (((aux.produto)=[Produtos].[produto]));
After running it, the results were right. Was this supposed to happen? If so, why?
Thanks!
Your Fornecedores table probably has 10 records.
FROM Consumos, Produtos, Fornecedores
WHERE Consumos.codproduto=Produtos.produto
was doing a Cartesian product of the Consumos-Produtos join with those 10 records, so the SUM() used each number 10 times.
Note 1:
It is considered better style to use the explicit INNER JOIN syntax:
FROM Consumos INNER JOIN Produtos
ON Consumos.codproduto=Produtos.produto
WHERE Produtos.codfornecedor=9
instead of FROM Consumos, Produtos
Note 2:
If you think you have found a bug in the Access (or any database) query engine, chances are almost 100% that the bug is in your query. ;-)
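The multiplication effect described above can be reproduced outside Access. The following is a minimal sketch using Python's built-in sqlite3 module rather than Access itself; the row values (one product, two consumption rows of 5 and 7, ten suppliers) are made up for illustration and are not taken from the question.

```python
import sqlite3

# In-memory SQLite database standing in for the Access file (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Produtos (produto INTEGER, codfornecedor INTEGER);
    CREATE TABLE Consumos (codproduto INTEGER, quantidade INTEGER);
    CREATE TABLE Fornecedores (codfornecedor INTEGER);
    INSERT INTO Produtos VALUES (1, 9);
    INSERT INTO Consumos VALUES (1, 5), (1, 7);
""")
# Ten supplier rows, matching the "probably has 10 records" guess.
conn.executemany("INSERT INTO Fornecedores VALUES (?)",
                 [(i,) for i in range(1, 11)])


def total(from_clause):
    """Sum quantidade for supplier 9 using the given FROM clause."""
    sql = f"""
        SELECT SUM(Consumos.quantidade)
        FROM {from_clause}
        WHERE Consumos.codproduto = Produtos.produto
          AND Produtos.codfornecedor = 9
    """
    return conn.execute(sql).fetchone()[0]


# Fornecedores has no join condition, so every matching row is repeated
# once per supplier row before SUM() runs.
with_extra = total("Consumos, Produtos, Fornecedores")
without_extra = total("Consumos, Produtos")
print(with_extra, without_extra)
```

With this sample data the unconstrained table repeats each joined row ten times, so the first total is ten times the second.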

Inner Join between 2 queries resulting in "Invalid Operation"

In an attempt to create a listing of Orders (each with multiple items) that satisfy some criteria, I have attempted to create a typical LEFT JOIN statement.
The attempt looks like this
SELECT
Q1.Order_Number,
OD.Item_Num
FROM
(
SELECT
OS.Order_Number
FROM
[4-Open_Order_Summary] AS OS
WHERE
Date() >= OS.Ship_Date AND
OS.Back_Ordered > 0
)
AS Q1
LEFT JOIN [1-Open_Order_Data] AS OD
ON Q1.Order_Number = OD.Order_Number
Running this query gives me an unexplained "Invalid operation" error. Researching this error with regards to Access SQL has led me to this question on StackOverflow pertaining to multiple JOIN statements of different types, and this question on the SuperUser branch pertaining to FULL OUTER JOIN statements. However I was unable to find questions related to a single LEFT JOIN statement.
In my attempts to resolve this I have done the following:
Changing
ON Q1.Order_Number = OD.Order_Number to
ON Q1.Order_Number LIKE OD.Order_Number
crashes Access
Running
SELECT
Q1.Order_Number
FROM
(
SELECT
OS.Order_Number
FROM
[4-Open_Order_Summary] AS OS
WHERE
Date() >= OS.Ship_Date AND
OS.Back_Ordered > 0
)
AS Q1
returns the intended order numbers.
Why not try something like the following if you're trying to get Order Numbers from one table, and related Order Details from another?
SELECT
Q1.Order_Number,
OD.Item_Num
FROM
[4-Open_Order_Summary] Q1
LEFT JOIN
[1-Open_Order_Data] OD
ON
OD.Order_Number = Q1.Order_Number
WHERE
DATE() >= Q1.Ship_Date
AND Q1.Back_Ordered > 0

Timeout running SQL query

I'm trying to use the aggregation features of the Django ORM to run a query on an MSSQL 2008 R2 database, but I keep getting a timeout error. The query (generated by Django) which fails is below. I've tried running it directly in SQL Server Management Studio and it works, but it takes 3.5 minutes.
It does look like it's aggregating over a bunch of fields which it doesn't need to, but I wouldn't have thought that should really cause it to take that long. The database isn't that big either: auth_user has 9 records, tickets_ticket has 1210, and tickets_ticket_watchers has 1876. Is there something I'm missing?
SELECT
[auth_user].[id],
[auth_user].[password],
[auth_user].[last_login],
[auth_user].[is_superuser],
[auth_user].[username],
[auth_user].[first_name],
[auth_user].[last_name],
[auth_user].[email],
[auth_user].[is_staff],
[auth_user].[is_active],
[auth_user].[date_joined],
COUNT([tickets_ticket].[id]) AS [tickets_captured__count],
COUNT(T3.[id]) AS [assigned_tickets__count],
COUNT([tickets_ticket_watchers].[ticket_id]) AS [tickets_watched__count]
FROM
[auth_user]
LEFT OUTER JOIN [tickets_ticket] ON ([auth_user].[id] = [tickets_ticket].[capturer_id])
LEFT OUTER JOIN [tickets_ticket] T3 ON ([auth_user].[id] = T3.[responsible_id])
LEFT OUTER JOIN [tickets_ticket_watchers] ON ([auth_user].[id] = [tickets_ticket_watchers].[user_id])
GROUP BY
[auth_user].[id],
[auth_user].[password],
[auth_user].[last_login],
[auth_user].[is_superuser],
[auth_user].[username],
[auth_user].[first_name],
[auth_user].[last_name],
[auth_user].[email],
[auth_user].[is_staff],
[auth_user].[is_active],
[auth_user].[date_joined]
HAVING
(COUNT([tickets_ticket].[id]) > 0 OR COUNT(T3.[id]) > 0 )
EDIT:
Here are the relevant indexes (excluding those not used in the query):
auth_user.id (PK)
auth_user.username (Unique)
tickets_ticket.id (PK)
tickets_ticket.capturer_id
tickets_ticket.responsible_id
tickets_ticket_watchers.id (PK)
tickets_ticket_watchers.user_id
tickets_ticket_watchers.ticket_id
EDIT 2:
After a bit of experimentation, I've found that the following query is the smallest that results in the slow execution:
SELECT
COUNT([tickets_ticket].[id]) AS [tickets_captured__count],
COUNT(T3.[id]) AS [assigned_tickets__count],
COUNT([tickets_ticket_watchers].[ticket_id]) AS [tickets_watched__count]
FROM
[auth_user]
LEFT OUTER JOIN [tickets_ticket] ON ([auth_user].[id] = [tickets_ticket].[capturer_id])
LEFT OUTER JOIN [tickets_ticket] T3 ON ([auth_user].[id] = T3.[responsible_id])
LEFT OUTER JOIN [tickets_ticket_watchers] ON ([auth_user].[id] = [tickets_ticket_watchers].[user_id])
GROUP BY
[auth_user].[id]
The weird thing is that if I comment out any two lines in the above, it runs in less than 1 s, and it doesn't seem to matter which lines I remove (although obviously I can't remove a join without also removing the relevant SELECT line).
EDIT 3:
The python code which generated this is:
User.objects.annotate(
Count('tickets_captured'),
Count('assigned_tickets'),
Count('tickets_watched')
)
A look at the execution plan shows that SQL Server is first doing a cross join on all the tables, resulting in about 280 million rows and 6 GB of data. I assume that this is where the problem lies, but why is it happening?
SQL Server is doing exactly what it was asked to do. Unfortunately, Django is not generating the right query for what you want. It looks like you need to count distinct values instead of using a plain count; see the question "Django annotate() multiple times causes wrong answers".
As for why the query works that way: the query says to join the four tables together. Say a user has 2 captured tickets, 3 assigned tickets, and 4 watched tickets: the join will return 2*3*4 = 24 rows, one for each combination of tickets, so each plain COUNT reports 24. The distinct part will remove all the duplicates.
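The 2*3*4 fan-out can be checked with a small experiment. This is a sketch only, run against an in-memory SQLite database through Python's sqlite3 module rather than SQL Server, with a cut-down schema and invented rows in which user 1 has exactly 2 captured, 3 assigned, and 4 watched tickets. The aliases u, t1, t3, and w are chosen here for brevity. On the Django side, Count accepts a distinct=True argument, which should produce the COUNT(DISTINCT ...) form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE auth_user (id INTEGER PRIMARY KEY);
    CREATE TABLE tickets_ticket (
        id INTEGER PRIMARY KEY, capturer_id INTEGER, responsible_id INTEGER);
    CREATE TABLE tickets_ticket_watchers (
        id INTEGER PRIMARY KEY, ticket_id INTEGER, user_id INTEGER);
    INSERT INTO auth_user VALUES (1), (2);
    -- user 1 captured tickets 1-2 and is responsible for tickets 3-5
    INSERT INTO tickets_ticket VALUES
        (1, 1, 2), (2, 1, 2), (3, 2, 1), (4, 2, 1), (5, 2, 1);
    -- user 1 watches tickets 1-4
    INSERT INTO tickets_ticket_watchers VALUES
        (1, 1, 1), (2, 2, 1), (3, 3, 1), (4, 4, 1);
""")

# Same join shape as the generated query; {d} is either "" or "DISTINCT".
QUERY = """
    SELECT COUNT({d} t1.id), COUNT({d} t3.id), COUNT({d} w.ticket_id)
    FROM auth_user u
    LEFT OUTER JOIN tickets_ticket t1 ON u.id = t1.capturer_id
    LEFT OUTER JOIN tickets_ticket t3 ON u.id = t3.responsible_id
    LEFT OUTER JOIN tickets_ticket_watchers w ON u.id = w.user_id
    WHERE u.id = 1
    GROUP BY u.id
"""

plain = conn.execute(QUERY.format(d="")).fetchone()
distinct = conn.execute(QUERY.format(d="DISTINCT")).fetchone()
print(plain, distinct)
```

Every plain count sees all 24 combined rows for user 1, while the DISTINCT variant recovers the per-relation counts. Note that DISTINCT fixes the answer but not necessarily the cost: the engine may still have to build the full combination of rows first.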
What about this?
SELECT auth_user.*,
C1.tickets_captured__count,
C2.assigned_tickets__count,
C3.tickets_watched__count
FROM
auth_user
LEFT JOIN
( SELECT capturer_id, COUNT(*) AS tickets_captured__count
FROM tickets_ticket GROUP BY capturer_id ) AS C1 ON auth_user.id = C1.capturer_id
LEFT JOIN
( SELECT responsible_id, COUNT(*) AS assigned_tickets__count
FROM tickets_ticket GROUP BY responsible_id ) AS C2 ON auth_user.id = C2.responsible_id
LEFT JOIN
( SELECT user_id, COUNT(*) AS tickets_watched__count
FROM tickets_ticket_watchers GROUP BY user_id ) AS C3 ON auth_user.id = C3.user_id
WHERE C1.tickets_captured__count > 0 OR C2.assigned_tickets__count > 0
--WHERE C1.tickets_captured__count IS NOT NULL OR C2.assigned_tickets__count IS NOT NULL -- also works (I think with better performance)
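To see that aggregating each table before joining keeps the counts independent, here is a minimal sketch of the same query shape run on SQLite through Python's sqlite3 module. SQLite stands in for SQL Server here, the schema is reduced to the columns the query needs, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE auth_user (id INTEGER PRIMARY KEY);
    CREATE TABLE tickets_ticket (
        id INTEGER PRIMARY KEY, capturer_id INTEGER, responsible_id INTEGER);
    CREATE TABLE tickets_ticket_watchers (
        id INTEGER PRIMARY KEY, ticket_id INTEGER, user_id INTEGER);
    INSERT INTO auth_user VALUES (1), (2);
    INSERT INTO tickets_ticket VALUES
        (1, 1, 2), (2, 1, 2), (3, 2, 1), (4, 2, 1), (5, 2, 1);
    INSERT INTO tickets_ticket_watchers VALUES
        (1, 1, 1), (2, 2, 1), (3, 3, 1), (4, 4, 1);
""")

# Each derived table yields at most one row per user, so the outer joins
# cannot multiply rows against each other.
rows = conn.execute("""
    SELECT auth_user.id,
           C1.tickets_captured__count,
           C2.assigned_tickets__count,
           C3.tickets_watched__count
    FROM auth_user
    LEFT JOIN (SELECT capturer_id, COUNT(*) AS tickets_captured__count
               FROM tickets_ticket GROUP BY capturer_id) AS C1
           ON auth_user.id = C1.capturer_id
    LEFT JOIN (SELECT responsible_id, COUNT(*) AS assigned_tickets__count
               FROM tickets_ticket GROUP BY responsible_id) AS C2
           ON auth_user.id = C2.responsible_id
    LEFT JOIN (SELECT user_id, COUNT(*) AS tickets_watched__count
               FROM tickets_ticket_watchers GROUP BY user_id) AS C3
           ON auth_user.id = C3.user_id
    WHERE C1.tickets_captured__count > 0 OR C2.assigned_tickets__count > 0
    ORDER BY auth_user.id
""").fetchall()
print(rows)
```

A user with no rows in one of the derived tables gets NULL (None in Python) for that count rather than 0, so wrapping the columns in COALESCE(..., 0) may be worth considering if zeros are expected downstream.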