I am trying to merge multiple tables. This is how I am doing it:
CREATE TABLE big3 AS SELECT *
FROM trainSearchStream a
LEFT OUTER JOIN SearchInfo b ON b.SearchID=a.SearchID LIMIT 3
LEFT OUTER JOIN AdsInfo c ON c.AdID=a.AdID LIMIT 3;
However, I get this error:
Error: near "LEFT": syntax error
As jarlh already mentioned, there can be only one LIMIT per select statement.
So this is forbidden:
select *
from a limit 5
join b limit 6 on a.x = b.x;
But this would be allowed:
select *
from (select * from a limit 5) alim
join (select * from b limit 6) blim on alim.x = blim.x;
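A minimal sketch of the difference, using Python's built-in sqlite3 module with made-up tables a and b of ten rows each. The data, and the ORDER BY inside the subqueries (added so the result is deterministic), are assumptions and not part of the original query. It shows that a LIMIT placed between joins is rejected, while a LIMIT inside each subquery works:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (x INTEGER);
""")
# Hypothetical data: both tables hold the values 1..10.
conn.executemany("INSERT INTO a VALUES (?)", [(i,) for i in range(1, 11)])
conn.executemany("INSERT INTO b VALUES (?)", [(i,) for i in range(1, 11)])

# A LIMIT followed by another JOIN is a syntax error.
try:
    conn.execute("SELECT * FROM a LIMIT 5 JOIN b ON a.x = b.x")
    limit_between_joins_ok = True
except sqlite3.OperationalError:
    limit_between_joins_ok = False

# A LIMIT inside each derived table is fine.
rows = conn.execute("""
    SELECT alim.x, blim.x
    FROM (SELECT * FROM a ORDER BY x LIMIT 5) alim
    JOIN (SELECT * FROM b ORDER BY x LIMIT 6) blim ON alim.x = blim.x
    ORDER BY alim.x
""").fetchall()

print(limit_between_joins_ok, rows)
```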
Since you only want to test your query, though, I'd suggest taking a sample from trainSearchStream. The modulo operator % is great for taking samples:
CREATE TABLE big3 AS SELECT *
FROM (select * from trainSearchStream where searchid % 12345 = 6789) a
LEFT OUTER JOIN SearchInfo b ON b.SearchID = a.SearchID
LEFT OUTER JOIN AdsInfo c ON c.AdID = a.AdID;
Choose whatever numbers you like for the modulo operation. The statement above keeps roughly one in every 12345 rows of trainSearchStream (provided the IDs are evenly spread).
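To see what the modulo filter does to the row count, here is a small sketch with Python's sqlite3 module. The table contents (1000 rows with consecutive IDs) and the smaller numbers 10 and 3 are assumptions chosen so the effect is easy to check by hand:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trainSearchStream (SearchID INTEGER, AdID INTEGER)")
# Hypothetical data: SearchID runs from 1 to 1000.
conn.executemany(
    "INSERT INTO trainSearchStream VALUES (?, ?)",
    [(i, i % 7) for i in range(1, 1001)],
)

total = conn.execute("SELECT COUNT(*) FROM trainSearchStream").fetchone()[0]

# Keep only the rows whose ID leaves remainder 3 when divided by 10,
# which is about one row in ten when the IDs are evenly spread.
sample = conn.execute("""
    SELECT SearchID
    FROM trainSearchStream
    WHERE SearchID % 10 = 3
    ORDER BY SearchID
""").fetchall()

print(total, len(sample))
```

The sample is deterministic, so rerunning the test query always hits the same rows.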
Related
I have this query:
SELECT *
FROM Licitaciones L
LEFT JOIN CPVLicitaciones C ON L.IDLicitacion = C.IDLicitador
WHERE (DateInserted>='2019-12-11' or DateUpdated>='2019-12-11')
The query takes about two minutes, even though there are not that many records.
I would like to optimize this query, but I don't know how.
SELECT *
FROM Licitaciones L
LEFT JOIN
CPVLicitaciones C ON
L.IDLicitacion = C.IDLicitador
WHERE DateInserted>='2019-12-11'
UNION
SELECT *
FROM Licitaciones L
LEFT JOIN
CPVLicitaciones C ON
L.IDLicitacion = C.IDLicitador
WHERE DateUpdated>='2019-12-11'
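The idea behind the UNION form is that each branch filters on a single date column, which may let the database use a separate index per column (whether such indexes exist is an assumption). A rough sketch with Python's sqlite3 module and made-up rows checks that both forms return the same rows. Note that UNION removes duplicate rows, so the two forms only agree when the joined rows are distinct, as they are here:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Licitaciones (IDLicitacion INTEGER, DateInserted TEXT, DateUpdated TEXT);
    CREATE TABLE CPVLicitaciones (IDLicitador INTEGER, CPV TEXT);
    -- Hypothetical rows for illustration only.
    INSERT INTO Licitaciones VALUES
        (1, '2019-12-01', '2019-12-12'),
        (2, '2019-12-11', '2019-12-11'),
        (3, '2019-11-01', '2019-11-02'),
        (4, '2019-12-15', NULL);
    INSERT INTO CPVLicitaciones VALUES (1, 'A'), (1, 'B'), (2, 'C');
""")

join_sql = """
    SELECT *
    FROM Licitaciones L
    LEFT JOIN CPVLicitaciones C ON L.IDLicitacion = C.IDLicitador
"""

# Original form: one query with OR.
or_rows = conn.execute(
    join_sql + " WHERE (DateInserted >= '2019-12-11' OR DateUpdated >= '2019-12-11')"
).fetchall()

# Rewritten form: one branch per date column, combined with UNION.
union_rows = conn.execute(
    join_sql + " WHERE DateInserted >= '2019-12-11'"
    + " UNION "
    + join_sql + " WHERE DateUpdated >= '2019-12-11'"
).fetchall()

print(len(or_rows), len(union_rows), set(or_rows) == set(union_rows))
```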
I want to achieve this effect in Hive:
select a.* from entry_data_fxj_cl a left join exit_data b
on trim(a.ecardid) = trim(b.ecardid) and abs(a.entrytime-b.entrytime)>60000
where trim(b.ecardid) IS NULL
b.entrytime should match the closest time to a.entrytime.
How do I express the inequality?
How do I express "closest"?
Thanks for your answer.
I would be inclined to write this as:
select edf.*
from entry_data_fxj_cl edf
where not exists (select 1
from exit_data ed
where trim(ed.ecardid) = trim(edf.ecardid) and
ed.entrytime > edf.entrytime - 60000 and
ed.entrytime < edf.entrytime + 60000
);
Does this work for you in Hive?
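As a quick way to check the logic of the NOT EXISTS version, here is a sketch run against SQLite through Python's sqlite3 module rather than Hive. The rows are made up, and whether a given Hive version accepts the same correlated subquery is not verified here:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entry_data_fxj_cl (ecardid TEXT, entrytime INTEGER);
    CREATE TABLE exit_data (ecardid TEXT, entrytime INTEGER);
    -- Hypothetical rows: 'A ' has a trailing space to exercise trim().
    INSERT INTO entry_data_fxj_cl VALUES ('A ', 100000), ('B', 200000), ('C', 300000);
    -- A has an exit 30000 away, B has one 100000 away, C has none.
    INSERT INTO exit_data VALUES ('A', 130000), ('B', 300000);
""")

# Keep entries that have no exit for the same card within 60000 time units.
rows = conn.execute("""
    SELECT edf.*
    FROM entry_data_fxj_cl edf
    WHERE NOT EXISTS (SELECT 1
                      FROM exit_data ed
                      WHERE trim(ed.ecardid) = trim(edf.ecardid)
                        AND ed.entrytime > edf.entrytime - 60000
                        AND ed.entrytime < edf.entrytime + 60000)
    ORDER BY edf.ecardid
""").fetchall()

print(rows)
```

Entry A is dropped because it has an exit inside the window. B and C are kept.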
The solution is to move the non-equality join condition to the WHERE clause and add an OR ... IS NULL check so that the left join still keeps unmatched rows.
Please see comments in the SQL code:
select *
from
(--move non-equality condition to the where + OR is null to allow left join
select a.*, b.ecardid as b_ecardid
from entry_data_fxj_cl a left join exit_data b
on trim(a.ecardid) = trim(b.ecardid)
where abs(a.entrytime-b.entrytime)>60000 or b.ecardid is NULL --allow left join
)s
where b_ecardid IS NULL --filter only rows for which b.ecardid is not found
When I execute this query, it returns 10 rows.
select * from OA_SERVICE_REQUESTS WHERE
OA_SERVICE_REQUESTS.CUSREG_ID=4
But when I join with another table to get more information, I use two inner joins because there are two foreign keys to the ELVM_SMUNT_CUS table, and it gives me 120 rows:
select * from OA_SERVICE_REQUESTS
inner join ELVM_SMUNT_CUS T1 on OA_SERVICE_REQUESTS.DIVCOD = T1.DIVCOD
inner join ELVM_SMUNT_CUS T2 on OA_SERVICE_REQUESTS.UNTNUM = T2.UNTNUM
WHERE OA_SERVICE_REQUESTS.CUSREG_ID=4
Try to combine them together:
select * from OA_SERVICE_REQUESTS R
inner join ELVM_SMUNT_CUS T1 on ( R.DIVCOD = T1.DIVCOD
and R.UNTNUM = T1.UNTNUM )
where R.CUSREG_ID=4;
This keeps your query from producing cross-product results.
Probably you have 12 matching records for R.DIVCOD = T1.DIVCOD and 10 matching records for R.UNTNUM = T1.UNTNUM where R.CUSREG_ID = 4. Combining both conditions with AND in a single join can give you the 10 rows you expect, whereas splitting the conditions across two joins may yield 120 rows (12 times 10).
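The multiplication effect can be reproduced on a tiny scale. The following sketch uses Python's sqlite3 module with made-up rows (the REQ_ID column and all values are assumptions): one request row matches three lookup rows on DIVCOD and two on UNTNUM, but only one row on both columns together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE OA_SERVICE_REQUESTS (REQ_ID INTEGER, CUSREG_ID INTEGER, DIVCOD TEXT, UNTNUM INTEGER);
    CREATE TABLE ELVM_SMUNT_CUS (DIVCOD TEXT, UNTNUM INTEGER);
    -- Hypothetical rows for illustration only.
    INSERT INTO OA_SERVICE_REQUESTS VALUES (1, 4, 'D1', 10);
    INSERT INTO ELVM_SMUNT_CUS VALUES ('D1', 10), ('D1', 11), ('D1', 12), ('D2', 10);
""")

# Two separate joins: every DIVCOD match is paired with every UNTNUM match.
two_joins = conn.execute("""
    SELECT COUNT(*)
    FROM OA_SERVICE_REQUESTS R
    INNER JOIN ELVM_SMUNT_CUS T1 ON R.DIVCOD = T1.DIVCOD
    INNER JOIN ELVM_SMUNT_CUS T2 ON R.UNTNUM = T2.UNTNUM
    WHERE R.CUSREG_ID = 4
""").fetchone()[0]

# One join on both columns: only the row matching both conditions survives.
combined = conn.execute("""
    SELECT COUNT(*)
    FROM OA_SERVICE_REQUESTS R
    INNER JOIN ELVM_SMUNT_CUS T1
        ON R.DIVCOD = T1.DIVCOD AND R.UNTNUM = T1.UNTNUM
    WHERE R.CUSREG_ID = 4
""").fetchone()[0]

print(two_joins, combined)
```

Here the two-join form returns 3 times 2 = 6 rows for a single request, while the combined form returns 1.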
I have the following SQL query, which works if I remove the last line, the ORDER BY.
Can anyone please tell me why it stops working when I add the ORDER BY?
declare @UQ as decimal(20,6);
declare @MUQ as decimal(20,6);
select @UQ=ItemUnit_UnitQuantity, @MUQ=ItemUnit_MainUnitQuantity from tItemUnit
where ItemUnit_Id = 23996
select top 1 (InvBuyPriceValue/InvL_Quantity)*iu.ItemUnit_UnitQuantity/iu.ItemUnit_MainUnitQuantity*@MUQ/@UQ as InvBuyPrice, Discount
from tInvL l
left outer join tInvH h on h.InvH_Id = l.InvH_Id
left outer join tItem i on i.Item_Id = l.Item_Id
left outer join tItemUnit iu on iu.ItemUnit_Id = l.ItemUnit_Id
left outer join tUnit u on u.Unit_Id = iu.Unit_Id
left outer join tClientObj clo on clo.ClientObj_Id = h.ClientObj_Id
left outer join tDocType dt on dt.DocType_Id = h.DocType_Id
where h.CompanyObj_Id = (select CompanyObj_Id from tEnabledCompany where CompanyClientObj_Id=(select ClientObj_Id from tClientObj where ClientObj_Code = '504'))
and dt.DocType_InOut = 1
and l.Item_Id = 19558
and h.ClientObj_Id = 386
order by InvH_DocDate desc, InvH_DocTime desc
I get the error saying:
Divide by zero error encountered.
I don't understand why I get this error on order by and not for example in select statement...
There are divisions in
select top 1 (InvBuyPriceValue/InvL_Quantity)*iu.ItemUnit_UnitQuantity/iu.ItemUnit_MainUnitQuantity*@MUQ/@UQ
so probably InvL_Quantity or iu.ItemUnit_MainUnitQuantity are zero.
Why don't you see the SQL Server error without the ORDER BY? You are only requesting the TOP 1 row, so SQL Server does not need to go over all rows and calculate the result. For performance reasons it just picks one row, calculates the result for it and returns it. With the ORDER BY, it has to look at all qualifying rows to find the top one, and it may evaluate the expression for every one of them.
That you get no divide-by-zero with TOP 1 alone is just by chance: the row it happened to pick had non-zero divisors. You'll definitely see the same error if you drop both the TOP 1 and the ORDER BY.
SELECT TOP 1
(InvBuyPriceValue / InvL_Quantity) *
iu.ItemUnit_UnitQuantity / iu.ItemUnit_MainUnitQuantity *
@MUQ / @UQ AS InvBuyPrice,
That is the only place where you are performing divisions. Either InvL_Quantity or ItemUnit_MainUnitQuantity columns might contain the value zero.
Also check ItemUnit_UnitQuantity, which is the value being assigned to @UQ, which is also a divisor.
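One way to confirm this is to search for the rows with a zero divisor, and to guard the division with NULLIF so it yields NULL instead of failing. The sketch below uses Python's sqlite3 module with simplified, made-up versions of tInvL and tItemUnit (the InvL_Id column, the placement of InvBuyPriceValue in tInvL and all values are assumptions). SQLite itself returns NULL for a division by zero instead of raising an error as SQL Server does, so the sketch only illustrates finding and guarding the offending rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tInvL (InvL_Id INTEGER, InvBuyPriceValue REAL, InvL_Quantity INTEGER, ItemUnit_Id INTEGER);
    CREATE TABLE tItemUnit (ItemUnit_Id INTEGER, ItemUnit_UnitQuantity INTEGER, ItemUnit_MainUnitQuantity INTEGER);
    -- Hypothetical rows: line 2 has quantity 0, unit 2 has main unit quantity 0.
    INSERT INTO tInvL VALUES (1, 100.0, 4, 1), (2, 50.0, 0, 1), (3, 30.0, 2, 2);
    INSERT INTO tItemUnit VALUES (1, 1, 1), (2, 6, 0);
""")

# Find the invoice lines where one of the divisors is zero.
bad_rows = conn.execute("""
    SELECT l.InvL_Id
    FROM tInvL l
    LEFT JOIN tItemUnit iu ON iu.ItemUnit_Id = l.ItemUnit_Id
    WHERE l.InvL_Quantity = 0 OR iu.ItemUnit_MainUnitQuantity = 0
    ORDER BY l.InvL_Id
""").fetchall()

# NULLIF(x, 0) turns a zero divisor into NULL, so the division yields NULL.
guarded = conn.execute("""
    SELECT InvL_Id, InvBuyPriceValue / NULLIF(InvL_Quantity, 0)
    FROM tInvL
    ORDER BY InvL_Id
""").fetchall()

print(bad_rows, guarded)
```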
I was trying to join 3 tables - CurrentProducts, SalesInvoice and SalesInvoiceDetail. SalesInvoiceDetail contains FK/foreign key to the other two tables and some other columns. The first query is ok but the second is not. My question comes at the end of the code.
Right
select *
from CurrentProducts inner join
(dbo.SalesInvoiceDetail inner join dbo.SalesInvoice
on dbo.SalesInvoiceDetail.InvoiceID = dbo.SalesInvoice.InvoiceID
)
on dbo.SalesInvoiceDetail.ProductID = dbo.CurrentProducts.ProductID
Wrong
select *
from CurrentProducts inner join
(select * from
dbo.SalesInvoiceDetail inner join dbo.SalesInvoice
on dbo.SalesInvoiceDetail.InvoiceID = dbo.SalesInvoice.InvoiceID
)
on dbo.SalesInvoiceDetail.ProductID = dbo.CurrentProducts.ProductID
error - Incorrect syntax near the keyword 'on'.
Why is the second query wrong? Isn't it conceptually the same as the first one? That is, the inner join produces a result set, we select * from that result set, and then we join it to CurrentProducts.
The first query is a "plain" join expressed with nested (parenthesized) join syntax. It can be rewritten as:
select
*
from
CurrentProducts
inner join dbo.SalesInvoiceDetail
on dbo.SalesInvoiceDetail.ProductID = dbo.CurrentProducts.ProductID
inner join dbo.SalesInvoice
on dbo.SalesInvoiceDetail.InvoiceID = dbo.SalesInvoice.InvoiceID
The second query is a join where the second table is a subquery. When you join on a subquery, you must assign an alias to it and use that alias to refer to the columns returned by the subquery:
select
*
from
CurrentProducts
inner join (select *
from dbo.SalesInvoiceDetail
inner join dbo.SalesInvoice
on SalesInvoiceDetail.InvoiceID = SalesInvoice.InvoiceID
) as foo on foo.ProductID = dbo.CurrentProducts.ProductID
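You need to alias the inner query. Also, in the first one the parentheses are not needed. Note that SQL Server requires unique column names in a derived table, so since both tables contain an InvoiceID column, the select * inside the subquery has to be replaced with an explicit column list.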
select *
from CurrentProducts inner join
(select * from
dbo.SalesInvoiceDetail inner join dbo.SalesInvoice
on dbo.SalesInvoiceDetail.InvoiceID = dbo.SalesInvoice.InvoiceID
) A
on A.ProductID = dbo.CurrentProducts.ProductID
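A minimal sketch of the aliased derived table with an explicit column list, run against SQLite through Python's sqlite3 module. The columns Name, CustomerName and Qty and all rows are hypothetical. SQLite is more lenient than SQL Server about aliases and duplicate column names, so this only illustrates the shape of the query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CurrentProducts (ProductID INTEGER, Name TEXT);
    CREATE TABLE SalesInvoice (InvoiceID INTEGER, CustomerName TEXT);
    CREATE TABLE SalesInvoiceDetail (InvoiceID INTEGER, ProductID INTEGER, Qty INTEGER);
    -- Hypothetical rows for illustration only.
    INSERT INTO CurrentProducts VALUES (1, 'Pen'), (2, 'Ink');
    INSERT INTO SalesInvoice VALUES (10, 'Ann');
    INSERT INTO SalesInvoiceDetail VALUES (10, 1, 3), (10, 2, 1);
""")

# The derived table gets the alias A, and its columns are listed explicitly
# so that InvoiceID appears only once in its result set.
rows = conn.execute("""
    SELECT p.ProductID, p.Name, A.InvoiceID, A.Qty
    FROM CurrentProducts p
    INNER JOIN (
        SELECT d.InvoiceID, d.ProductID, d.Qty, i.CustomerName
        FROM SalesInvoiceDetail d
        INNER JOIN SalesInvoice i ON d.InvoiceID = i.InvoiceID
    ) A ON A.ProductID = p.ProductID
    ORDER BY A.InvoiceID, p.ProductID
""").fetchall()

print(rows)
```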