I have a select statement:
select
M.FA_Unique_Listing_Identifier_Ref_ID
,P.ATTOM_ID
,P.ParcelNumberRaw Parcel
,M.Assessors_Parcel_Identification_Number
,M.Listing_Tracking_ID
,P.CensusTract GEOID
,M.Current_Original_Listing_Date
,M.Add_Change_Delete_Indicator
,M.FA_Calculated_Days_on_Market
,M.Status
,M.Status_Sub_Type
,M.Update_Timestamp
,M.LoadDate
,'Parcel Match' as Match_Type
from `mother-stg-254212.DATATREE_MLS.MLS_STAGE` M
JOIN `mother-216719.PROPERTY.PROPERTY_DETAIL` P on
M.Property_Address = P.PropertyAddressFull
and M.Property_Zip = P.PRopertyAddressZIP
but this query times out.
THe two matching fields Address and Zip are Strings.
It runs 6 hours and times out.
How can I make this work faster?
Thanks
This looks like an unbalanced joins scenario.
For this you could either try to optimize your join patterns. If the issue persists after this, I suggest opening an issue tracker so the BigQuery engineering team can verify this behavior.
Related
I have two tables and with column paperNo and some data regarding that paper. I am trying to search all data based on paper no. from both the tables. I have successfully written the query and it is retrieving the data successfully. but I have noticed that. If my paperNo contains zero(0) then the query is not searching for that data. And for the non zero contains paperNo it is retrieving the same record twice.
I don't understand what is going wrong. tried every thing.
Here is my Query .-
SELECT PaperDate.paperNo,
PaperDate.RAW_PAPER,
PaperDate.EDGE_SEALED,
PaperDate.HYDRO_120,
PaperDate.HYDRO_350,
PaperDate.CATALYST_1ST,
PaperDate.CATALYST_2ND,
PaperDate.SIC_350,
tblThicknessPaperDate.rawThickness,
tblThicknessPaperDate.catThickness,
tblThicknessPaperDate.sicThickness,
tblThicknessPaperDate.rejectedThickness
FROM tblThicknessPaperDate
FULL OUTER JOIN PaperDate ON PaperDate.paperNo =tblThicknessPaperDate.paperNo
WHERE (tblThicknessPaperDate.paperNo = #paperNo)
I would try:
FROM tblThicknessPaperDate
RIGHT JOIN PaperDate ON PaperDate.paperNo =tblThicknessPaperDate.paperNo
WHERE (PaperDate.paperNo = #paperNo)
The two changes are: swapping to a right join so even if a record isn't in tblThicknessPaperDate we will still see the record in PaperDate. The other change is to use PapterDate.paperNo in the where clause. Since tblThicknessPaperDate.paperNo could be null we don't want to use that in the where if we can avoid it.
SELECT PaperDate.paperNo,
PaperDate.RAW_PAPER,
PaperDate.EDGE_SEALED,
PaperDate.HYDRO_120,
PaperDate.HYDRO_350,
PaperDate.CATALYST_1ST,
PaperDate.CATALYST_2ND,
PaperDate.SIC_350,
tblThicknessPaperDate.rawThickness,
tblThicknessPaperDate.catThickness,
tblThicknessPaperDate.sicThickness,
tblThicknessPaperDate.rejectedThickness
FROM tblThicknessPaperDate
FULL OUTER JOIN PaperDate ON PaperDate.paperNo =tblThicknessPaperDate.paperNo
WHERE (tblThicknessPaperDate.paperNo = #papNo | PaperDate.paperNo = #paperNo)
They return the same count of rows, but I'm not sure if one is an accident waiting to happen, or one is simply "the preferred method":
SELECT duckbill.id, duckbill.pack_size, duckbill.description, duckbill.platypus_id, duckbill.department, duckbill.subdepartment, duckbill.unit_cost, duckbill.unit_list, duckbill.open_qty, duckbill.UPC_code, duckbill.UPC_pack_size, duckbill.crv_id, duckbill_platypuss.platypus_item
FROM duckbill
INNER JOIN duckbill_platypuss ON duckbill.platypus_id = duckbill_platypuss.platypus_id
SELECT duckbill.id, duckbill.pack_size, duckbill.description, duckbill.platypus_id, duckbill.department, duckbill.subdepartment, duckbill.unit_cost, duckbill.unit_list, duckbill.open_qty, duckbill.UPC_code, duckbill.UPC_pack_size, duckbill.crv_id, duckbill_platypuss.platypus_item
FROM duckbill, duckbill_platypuss
WHERE (duckbill.platypus_id = duckbill_platypuss.platypus_id)
So the reason there's a difference is due to the order of execution. You may not notice the difference on some table combinations, but on very large tables, you generally want to filter with the JOIN first.
This post gives some good information toward that end it looks like: Order Of Execution of the SQL query
I have been given a database, the structure and data values are all unchangable and have been provided with a question.
Produce a query to list the holiday code, holiday description, holiday duration and site description for all holidays which visit site code 101. Your answer must not assume that site code 101 will always have the same site description.
I am confused on how to tackle this question. I have tried Multiple joins, different dot notation and googled the question to hell and back. Any help?
Table 1 - Holiday_Details
Holiday_Code - Country_Visited - Holiday_Duration - Holiday_Desc - Rating_Code - Cost
Table 2 - Site_Of_Holiday
Site_Description - Site_Code
Table 3 - Site_Visited
Holiday_Code - Site_Code
Comments have asked for previous attempts. This was my first.
SELECT holiday_code,
holiday_desc,
holiday_duration site_of_holiday.Site_Name
FROM holiday_details
JOIN site_visited
ON holiday_code = site_visited.holiday_code
JOIN site_of_holiday
ON site_visited.site_code = site_of_holiday.site_code
WHERE site_of_holiday.site_code = 101;
For future reference, you'll get a better response if you post a lot more detail about your failed attempts. By that, I mean code. Using SO to solve your homework assignments is frowned upon but, like a commenter said, once you've wracked your brain we're willing to help.
You seem like you may have actually tried real hard, so I'll throw you a bone...
The trick to navigating multiple tables is to find the "pairs" of matching columns. In this case you want to find a path between the tables Site_Of_Holiday (which has Site_Description) and Holiday_Details (which has everything else).
The columns that match between each pair of tables are:
Holiday_Code is found in both Site_Visited and Holiday_Details
Site_Code is found in both Site_Of_Holiday and Site_Visited
This allows you to build a path between the tables that contain all of the columns we want in the output. You would do this, in this case, using INNER JOINs across those matching column pairs.
Once you've joined the tables, think of the result like a giant table whose columns include all columns from all three tables (prefixed with whatever you 'name' the table during the joins). Now you just filter on the Site_Code with the usual WHERE clause.
Here's the full example - let me know if it works for you:
SELECT hd.Holiday_Code, hd.Holiday_Desc, hd.Holiday_Duration, soh.Site_Description
FROM Holiday_Details hd
INNER JOIN Site_Visited sv ON hd.Holiday_Code = sv.Holiday_Code
INNER JOIN Site_Of_Holiday soh ON sv.Site_Code = soh.Site_Code
WHERE sv.Site_Code = 101
Good luck!
P.S. In case any Americans get a similar assignment, here's the translation ;-)
SELECT vd.Vacation_Code, vd.Vacation_Desc, vd.Vacation_Duration, sov.Site_Description
FROM Vacation_Details vd
INNER JOIN Site_Visited sv ON vd.Vacation_Code = sv.Vacation_Code
INNER JOIN Site_Of_Vacation sov ON sv.Site_Code = sov.Site_Code
WHERE sv.Site_Code = 101
I am new to this portal. I have a very simple problem to be solved. It is related to the ANSI SQL. I am writing a reports using BIRT and I am fetching the data from several tables. I understand how the SQL joins work but maybe not fully. I researched google for hours and I could not find relevant answer.
My problem is that one of the relationships in the code produce a duplicate result (the same row is copied - duplicated). I was so determined to solve it I used every type of join available. Some of this SQL was produced already. I shall post my code below. I know that one of the solutions to my problem is use of the 'DISTINCT' keyword. I have used it and it does not solve my problem.
Can anyone propose any solution to that?
Sample code:
SELECT DISTINCT
partmaster.partdesc,
partmaster.uom,
traders.name AS tradername,
worksorders.id AS worksorderno,
worksorders.partid,
worksorders.quantity,
worksorders.duedate,
worksorders.traderid,
worksorders.orderid,
routingoperations.partid,
routingoperations.methodid,
routingoperations.operationnumber,
routingoperations.workcentreid,
routingoperations.settime,
routingoperations.runtime,
routingoperations.perquantity,
routingoperations.description,
routingoperations.alternativeoperation,
routingoperations.alternativeoperationpreference,
machines.macdesc,
machines.msection,
allpartmaster.partnum,
allpartmaster.nbq,
allpartmaster.partdesc,
routingoperationtools.toolid,
tools.tooldesc,
CAST (emediadetails.data as VARCHAR(MAX)) AS cplandata
FROM worksorders
INNER JOIN partmaster ON worksorders.partid = partmaster.partnum
INNER JOIN traders traders ON worksorders.traderid = traders.id
INNER JOIN routingoperations routingoperations ON worksorders.partid = routingoperations.partid
AND worksorders.routingmethod = routingoperations.methodid
INNER JOIN allpartmaster allpartmaster ON routingoperations.partid = allpartmaster.partnum
LEFT OUTER JOIN machines machines ON routingoperations.workcentreid = machines.macid
LEFT OUTER JOIN routingoperationtools routingoperationtools ON routingoperationtools.partid = routingoperations.partid
AND routingoperationtools.routingmethod = routingoperations.methodid
AND routingoperationtools.operationnumber = routingoperations.operationnumber
LEFT OUTER JOIN tools tools ON tools.toolid = routingoperationtools.toolid
LEFT OUTER JOIN emediadetails ON emediadetails.keyvalue1 = worksorders.id
AND emediadetails.keyvalue2 = routingoperations.operationnumber
AND emediadetails.emediaid = 'worksorderoperation'
I do not have too much of the test data but I know that one row is copied twice as the result of the query below even tho I used DISTINCT keyword. I know that my problem is rather specific and not general but the solution that someone will propose may help others with the similar problem.
I can't solve your problem for you without some test data, but I have some helpful hints.
In principle, you should be really careful with DISTINCT - its a great way of hiding bugs in your query. Only use DISTINCT if you are confident that the underlying data contains legitimate duplicates. If your joins are wrong, and you're getting a cartesian product, you can remove the duplicates from the results with DISTINCT - but that doesn't stop the cartesian product being generated. You'll get very poor performance, and possibly incorrect data.
Secondly, I am pretty sure that DISTINCT works properly - you are almost certainly not getting duplicates, but it may be hard to spot the difference between two rows. Leading or trailing spaces in text columns, for instance could be to blame.
Finally, to work through this problem, I'd recommend building the query up join by join, and seeing where you get the duplicate - that's the join that's to blame.
So, start with:
SELECT
traders.name AS tradername,
worksorders.id AS worksorderno,
worksorders.partid,
worksorders.quantity,
worksorders.duedate,
worksorders.traderid,
worksorders.orderid
FROM worksorders
INNER JOIN traders traders ON
worksorders.traderid = traders.id
and build up to the next join.
Are you sure the results are exact duplicates? Makes sure there isn't one column that actually has a different value.
I've got an Access MDB I use for reporting that has linked table views from SQL Server 2005. I built a query that retrieves information off of a PO table and categorizes the line item depending on information from another table. I'm relatively certain the query was fine until approximately a month ago when we shifted from compatibility mode 80 to 90 on the Server as required by our primary application (which creates the data). I can't say this with 100% certainty, but that is the only major change made in the past 90 days. We noticed that suddenly data was not showing up in the query making the reports look odd.
This is a copy of the failing query:
SELECT dbo_porel.jobnum, dbo_joboper.opcode, dbo_porel.jobseqtype,
dbo_opmaster.shortchar01,
dbo_porel.ponum, dbo_porel.poline, dbo_podetail.unitcost
FROM ((dbo_porel
LEFT JOIN dbo_joboper ON (dbo_porel.assemblyseq = dbo_joboper.assemblyseq)
AND (dbo_porel.jobseq = dbo_joboper.oprseq)
AND (dbo_porel.jobnum = dbo_joboper.jobnum))
LEFT JOIN dbo_opmaster ON dbo_joboper.opcode = dbo_opmaster.opcode)
LEFT JOIN dbo_podetail ON (dbo_porel.poline = dbo_podetail.poline)
AND (dbo_porel.ponum = dbo_podetail.ponum)
WHERE (dbo_porel.jobnum="367000003")
It returns the following:
jobnum opcode jobseqtype shortchar01 ponum poline unitcost
367000003 S 6624 2 15
The query normally should have displayed a value for opcode and shortchar01. If I remove the linked table dbo_podetail, it properly displays data for these fields (although I obviously don't have unitcost anymore). At first I thought it might be a data issue, but I found if I nested the query and then linked the table, it worked fine.
For example the following code works perfectly:
SELECT qryTest.*, dbo_podetail.unitcost
FROM (
SELECT dbo_porel.jobnum, dbo_joboper.opcode, dbo_porel.jobseqtype,
dbo_opmaster.shortchar01, dbo_porel.ponum, dbo_porel.poline
FROM (dbo_porel
LEFT JOIN dbo_joboper ON (dbo_porel.jobnum=dbo_joboper.jobnum)
AND (dbo_porel.jobseq=dbo_joboper.oprseq)
AND (dbo_porel.assemblyseq=dbo_joboper.assemblyseq))
LEFT JOIN dbo_opmaster ON dbo_joboper.opcode=dbo_opmaster.opcode
WHERE (dbo_porel.jobnum="367000003")
) As qryTest
LEFT JOIN dbo_podetail ON (qryTest.poline = dbo_podetail.poline)
AND (qryTest.ponum = dbo_podetail.ponum)
I'm at a loss for why it works in the latter case and not in the first case. Worse yet, it seems to work intermittently for some records and not for others (it's consistent about the ones it does and does not work for).
Do any of you experts have any ideas?
You definitely need to use subqueries for multiple left/right joins in Access.
I think it's a limitation of the Jet optimizer that gets confused if you're just chaining left/right joins.
You can see that this is a recurrent problem that surfaces often.
I'm always confused by Access' use of brackets in joins. Try stripping out the extra brackets.
FROM
dbo_porel
LEFT JOIN
dbo_joboper ON (dbo_porel.assemblyseq = dbo_joboper.assemblyseq)
AND (dbo_porel.jobseq = dbo_joboper.oprseq)
AND (dbo_porel.jobnum = dbo_joboper.jobnum)
LEFT JOIN
dbo_opmaster ON (dbo_joboper.opcode = dbo_opmaster.opcode)
LEFT JOIN
dbo_podetail ON (dbo_porel.poline = dbo_podetail.poline)
AND (dbo_porel.ponum = dbo_podetail.ponum)
OK the above doesn't work - Sorry I give up