SQL to Hive query conversion

I have the SQL query below and want to convert it to a Hive query. Is this possible? I am using Cloudera/Hadoop. Thanks.
DECLARE
@sd_y FLOAT,
@sd_x1 FLOAT,
@sd_x2 FLOAT
SELECT
@sd_y = SUM(Y_w*Y),
@sd_x1 = SUM(Y_w*X1),
@sd_x2 = SUM(Y_w*X2)
FROM dbo.c
UPDATE dbo.c
SET
Y = SQRT(Y_w)*(Y-@sd_y),
X1 = SQRT(Y_w)*(X1-@sd_x1),
X2 = SQRT(Y_w)*(X2-@sd_x2)
SELECT
@sd_y,
@sd_x1,
@sd_x2

Related

Scalar-valued function returning NULL

I have the below function, and for the life of me, I cannot get it to return a value, I get NULL every time.
I am calling it via select [dbo].[getFiatProfit](600.26,'GBP', 1000.99,'BTC') as op
What am I missing?
/****** Object: UserDefinedFunction [dbo].[getFiatProfit] Script Date: 06/07/2022 11:42:26 ******/
ALTER FUNCTION [dbo].[getFiatProfit] (
@fiatInvested float,
@fiatInvestedCurrency nvarchar,
@quantity float,
@currency nvarchar
)
RETURNS float
AS
BEGIN
declare @tmp float
declare @result float
declare @usdtgbp float
IF (@fiatInvestedCurrency = 'USD')
BEGIN
select @tmp = [dbo].[usdtPairs].[Value] from [dbo].[usdtPairs] where usdtPairs.ID = @currency;
select @usdtgbp = [dbo].[usdtPairs].[Value] from [dbo].[usdtPairs] where usdtPairs.ID = 'GBP';
set @result = (((@quantity * @tmp) - @fiatInvested) / @usdtgbp);
-- set @result = @quantity * @tmp;
END
ELSE
BEGIN
select @tmp = [dbo].[usdtPairs].[Value] from [dbo].[usdtPairs] where usdtPairs.ID = @currency;
set @result = ((@quantity * @tmp) - @fiatInvested);
-- set @result = @quantity * @tmp;
END
return (@result)
END

Your issue is most likely that your parameters are declared without a length. In a parameter or variable declaration, nvarchar with no length defaults to a length of 1, so 'BTC' arrives as 'B', the lookup finds no row, and the function returns NULL. A much better data type would be char(3), which is fixed length, given that all of these currencies have three-letter codes.
You should also convert this function into an inline table-valued function, which is likely to perform far better.
CREATE OR ALTER FUNCTION dbo.getFiatProfit (
@fiatInvested float,
@fiatInvestedCurrency char(3),
@quantity float,
@currency char(3)
)
RETURNS TABLE
AS RETURN
SELECT
result = ((@quantity * u.Value) - @fiatInvested)
-- divide by the GBP rate only when the investment was in USD, as in the original function
/ (CASE WHEN @fiatInvestedCurrency = 'USD'
THEN
(SELECT u2.Value FROM dbo.usdtPairs u2 WHERE u2.ID = 'GBP')
ELSE 1
END)
FROM dbo.usdtPairs u
WHERE u.ID = @currency;
You use it like this:
SELECT t.*, fp.*
FROM YourTable t
CROSS APPLY dbo.getFiatProfit(t.fiatInvested, t.fiatInvestedCurrency, t.Qty, 'GBP') fp;
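The truncation described in this answer can be imitated outside SQL Server. This is only a minimal Python sketch: the `rates` dictionary is a hypothetical stand-in for dbo.usdtPairs with made-up values, and slicing the string imitates passing it into a parameter of a given declared length.

```python
# Hypothetical stand-in for dbo.usdtPairs; the rates are invented for illustration.
rates = {"GBP": 1.2, "BTC": 20000.0}

def lookup(currency, declared_length):
    """Mimic passing a string into a parameter of the given declared length."""
    received = currency[:declared_length]  # the extra characters are silently dropped
    return rates.get(received)             # None plays the role of NULL

without_length = lookup("BTC", 1)  # nvarchar with no length behaves like nvarchar(1)
with_length = lookup("BTC", 3)     # char(3) keeps the whole code
```

With a declared length of 1 the lookup key is just 'B', no row matches, and the result is empty, which is the NULL seen in the question.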

Change scientific notation into integer in IMPALA

How can I change custno: 5.0256926E7 into a normal integer / number in Impala SQL?
This is what I've tried so far:
SELECT * FROM z9_strategy.dstool_model_data_m
WHERE snapshot_date_key = 20170630
AND custtype_ind = 1
AND retailer_retail = 1
AND CAST((custno AS FLOAT) AS int);
I also tried SELECT CAST(CAST(custno AS FLOAT) AS int)

Use CAST:
CAST(custno AS int);
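The E7 notation is only how a floating-point value is displayed, so converting it to an integer type recovers the plain number. A small Python sketch of the same idea, with Python standing in for Impala and `custno` taken from the question:

```python
custno = 5.0256926E7        # the value as shown in the question

as_int = int(custno)        # integer conversion drops any fractional part
as_text = f"{custno:.0f}"   # fixed-point formatting, no exponent
```

Here `as_int` is 50256926, the same digits without the exponent.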

random floating point error with subquery order

I am running a SQL Server 2008 database. The following query is used in a web app, but to debug the error I am running it directly in Management Studio.
When I run this query, I get the following error: An invalid floating point operation occurred.
select p.Id as Id, p.CatId as CatId, p.MetaName as MetaName ,p.Active as Active,p.HasChildren as HasChildren ,p.Mlevel as Mlevel ,p.ParentId as ParentId ,p.Type as Type, p.VOrder as VOrder, p.UrlOrder as UrlOrder, Count('*') as VCount
from MetaDataValues as m
left join MetaData as p on m.MetaDataId = p.Id
left join Adverts as a on m.AdvertId = a.Id
where a.Status = 1
and a.ExpDate > current_timestamp and
m.AdvertId in
(select m2.AdvertId from MetaDataValues as m2 left join MetaData as p2 on m2.MetaDataId = p2.Id where p2.MetaName = 'meta1'
and m.AdvertId in (select m3.AdvertId from MetaDataValues as m3 left join MetaData as p3 on m3.MetaDataId = p3.Id where p3.MetaName = 'meta2'
and m.AdvertId in (select m4.AdvertId from MetaDataValues as m4 left join MetaData as p4 on m4.MetaDataId = p4.Id where p4.MetaName = 'meta3'
and m.AdvertId in (select ad9.Id from Adverts as ad9 where dbo.GetDist(ad9.X,ad9.Y,ad9.Z,52.9131514,-2.9313405) < 969))))
group by p.Id, p.CatId, p.MetaName,p.Active,p.HasChildren,p.Mlevel,p.ParentId,p.Type, p.VOrder, p.UrlOrder
To explain: it is the GetDist function that is causing the problem. If I move it up to the top level of the subqueries, the query runs fine. This isn't ideal, as the code that builds this query is written in a certain way and I don't want to alter it. Here is the query that works; it is exactly the same, just in a different order:
select p.Id as Id, p.CatId as CatId, p.MetaName as MetaName ,p.Active as Active,p.HasChildren as HasChildren ,p.Mlevel as Mlevel ,p.ParentId as ParentId ,p.Type as Type, p.VOrder as VOrder, p.UrlOrder as UrlOrder, Count('*') as VCount
from MetaDataValues as m
left join MetaData as p on m.MetaDataId = p.Id
left join Adverts as a on m.AdvertId = a.Id
where a.Status = 1
and a.ExpDate > current_timestamp and
m.AdvertId in
(select m2.AdvertId from MetaDataValues as m2 left join MetaData as p2 on m2.MetaDataId = p2.Id where p2.MetaName = 'meta1'
and m.AdvertId in (select ad9.Id from Adverts as ad9 where dbo.GetDist(ad9.X,ad9.Y,ad9.Z,52.9131514,-2.9313405) < 969)
and m.AdvertId in (select m3.AdvertId from MetaDataValues as m3 left join MetaData as p3 on m3.MetaDataId = p3.Id where p3.MetaName = 'meta2'
and m.AdvertId in (select m4.AdvertId from MetaDataValues as m4 left join MetaData as p4 on m4.MetaDataId = p4.Id where p4.MetaName = 'meta3' )))
group by p.Id, p.CatId, p.MetaName,p.Active,p.HasChildren,p.Mlevel,p.ParentId,p.Type, p.VOrder, p.UrlOrder
GetDist code
USE [MVC]
GO
/****** Object: UserDefinedFunction [dbo].[GetDist] Script Date: 02/20/2013 17:05:00 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER FUNCTION [dbo].[GetDist]
(
@xaxis float,
@yaxis float,
@zaxis float,
@CenterLat float,
@CenterLon float
)
RETURNS float
AS
BEGIN
declare @CntXAxis float
declare @CntYAxis float
declare @CntZAxis float
declare @EarthRadius float
set @EarthRadius = 3961
set @CntXAxis = cos(radians(@CenterLat)) * cos(radians(@CenterLon))
set @CntYAxis = cos(radians(@CenterLat)) * sin(radians(@CenterLon))
set @CntZAxis = sin(radians(@CenterLat))
return (@EarthRadius * acos( @XAxis*@CntXAxis + @YAxis*@CntYAxis + @ZAxis*@CntZAxis))
END

It looks like GetDist may be calculating the distance between a pair of latitude/longitude values. I have a lot of experience with this. Most GetDist functions like this use the Arc Cosine function "ACos". The parameter for this function is limited to the range -1 to 1. If you try to pass a value outside this range, you will get a domain error in SQL Server. If your GetDist function uses a CLR function, the error would be within your .net code and would have a slightly different message.
When dealing with floats, you have to be aware of rounding issues. For example, if your calculation returns a value of 1.00000000000001 and you pass that into the ACos function, you will get an error.
There's a lot of speculation here, and I could be completely off base, but please consider this and spend a couple minutes doing some research.
Based on your GetDist function posted above, I would suggest a relatively minor change:
ALTER FUNCTION [dbo].[GetDist]
(
@xaxis float,
@yaxis float,
@zaxis float,
@CenterLat float,
@CenterLon float
)
RETURNS float
AS
BEGIN
declare @CntXAxis float
declare @CntYAxis float
declare @CntZAxis float
declare @EarthRadius float
declare @Temp float
set @EarthRadius = 3961
set @CntXAxis = cos(radians(@CenterLat)) * cos(radians(@CenterLon))
set @CntYAxis = cos(radians(@CenterLat)) * sin(radians(@CenterLon))
set @CntZAxis = sin(radians(@CenterLat))
Set @Temp = @XAxis*@CntXAxis + @YAxis*@CntYAxis + @ZAxis*@CntZAxis
If @Temp > 1
Set @Temp = 1
Else If @Temp < -1
Set @Temp = -1
return (@EarthRadius * acos(@Temp))
END
Even if this doesn't solve your original problem, it will at least protect you from weird float/precision problems.
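The same domain problem and the same clamping fix can be reproduced in a few lines of Python. This is only an illustrative sketch: `clamped_acos` is a hypothetical helper name, and `overshoot` is a constructed value standing in for a dot product that rounding has pushed just above 1.

```python
import math

def clamped_acos(value):
    """Return acos(value) after clamping value into the valid range [-1, 1]."""
    return math.acos(max(-1.0, min(1.0, value)))

# The smallest float above 1.0, standing in for a dot product with rounding error.
overshoot = 1.0 + 2.0 ** -52

try:
    math.acos(overshoot)   # outside [-1, 1], so this raises a domain error
    raised = False
except ValueError:
    raised = True

safe = clamped_acos(overshoot)  # clamped to 1.0 first, so the angle is 0.0
```

The unclamped call fails even though the input is off by only one unit in the last place, while the clamped version returns an angle of zero, which is what the original value was meant to represent.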

Is the problem an invalid operation or something more like "Error converting data type varchar to numeric"? This is a fairly common problem, when numeric data is being stored as a string. It works when the string looks right, but then fails at other times.
Are all the arguments to getDist() of the correct type? Is the return value a number?
I would guess that the filtering at the higher level filters out the bad values that are causing the problem.

Haversine formula using SQL server to find closest venue - vb.net

I am grabbing a postcode from a form. I can then convert this postcode to lng,lat coordinates as I have these stored in a table.
SELECT lng, lat from postcodeLngLat WHERE postcode = 'CV1'
I have another table which stores the lng,lat of a selection of venues.
SELECT v.lat, v.lng, v.name, p.lat, p.lng, p.postcode, 'HAVERSINE' AS distance FROM venuepostcodes v, postcodeLngLat p WHERE p.outcode = 'CB6' ORDER BY distance
What I am trying to do is create a datagrid which shows the distance of each venue from the postcode (CV1 in this case). I know that the Haversine formula should do what I am trying to achieve but I'm lost as to where I should start incorporating it into my query. I think the formula needs to go where I've put 'HAVERSINE' in the query above.
Any ideas?
EDIT
SELECT o.outcode AS lead_postcode, v.venue_name, 6371.0E * ( 2.0E *asin(case when 1.0E < (sqrt(square(sin(((RADIANS(CAST(o.lat AS FLOAT)))-(RADIANS(CAST(v.lat AS FLOAT))))/2.0E)) + (cos(RADIANS(CAST(v.lat AS FLOAT))) * cos(RADIANS(CAST(o.lat AS FLOAT))) * square(sin(((RADIANS(CAST(o.lng AS FLOAT)))-(RADIANS(CAST(v.lng AS FLOAT))))/2.0E))))) then 1.0E else (sqrt(square(sin(((RADIANS(CAST(o.lat AS FLOAT)))-(RADIANS(CAST(v.lat AS FLOAT))))/2.0E)) + (cos(RADIANS(CAST(v.lat AS FLOAT))) * cos(RADIANS(CAST(o.lat AS FLOAT))) * square(sin(((RADIANS(CAST(o.lng AS FLOAT)))-(RADIANS(CAST(v.lng AS FLOAT))))/2.0E))))) end )) AS distance FROM venuepostcodes v, outcodepostcodes o WHERE o.outcode = 'CB6' ORDER BY distance

I think you'd do best putting it in a UDF and using that in your query (a scalar UDF has to be called with its schema name):
SELECT v.lat, v.lng, v.name, p.lat, p.lng, p.postcode, dbo.udf_Haversine(v.lat, v.lng, p.lat, p.lng) AS distance FROM venuepostcodes v, postcodeLngLat p WHERE p.outcode = 'CB6' ORDER BY distance
create function dbo.udf_Haversine(@lat1 float, @long1 float, @lat2 float, @long2 float) returns float begin
declare @dlon float, @dlat float, @rlat1 float, @rlat2 float, @rlong1 float, @rlong2 float, @a float, @c float, @R float, @d float, @DtoR float
select @DtoR = 0.017453293 -- degrees to radians (pi / 180)
select @R = 3937 --3976 -- Earth radius in miles; the mean radius is about 3959 mi (6371 km)
select
@rlat1 = @lat1 * @DtoR,
@rlong1 = @long1 * @DtoR,
@rlat2 = @lat2 * @DtoR,
@rlong2 = @long2 * @DtoR
select
@dlon = @rlong1 - @rlong2,
@dlat = @rlat1 - @rlat2
select @a = power(sin(@dlat/2), 2) + cos(@rlat1) * cos(@rlat2) * power(sin(@dlon/2), 2)
select @c = 2 * atn2(sqrt(@a), sqrt(1-@a))
select @d = @R * @c
return @d
end
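For checking the numbers outside the database, the same Haversine steps can be sketched in Python. This is a minimal sketch, not the answer's own code: the function name `haversine_km` and the choice of kilometres (mean Earth radius 6371 km) are assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius; use about 3959 for miles

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    rlat1, rlon1, rlat2, rlon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    dlat = rlat2 - rlat1
    dlon = rlon2 - rlon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(rlat1) * math.cos(rlat2) * math.sin(dlon / 2) ** 2)
    a = min(1.0, max(0.0, a))  # guard against rounding pushing a outside [0, 1]
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return EARTH_RADIUS_KM * c

# Two points on the equator, half the globe apart: half the circumference.
half_way_round = haversine_km(0.0, 0.0, 0.0, 180.0)
```

Using atan2, as the UDF does with atn2, together with the clamp on `a` avoids the domain errors that an unguarded arc sine or arc cosine can raise.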

Alternatively, you could use the SQL Server 2008 geography data type. If you currently store the longitude/latitude as varchar() in the database, you will have to store them as geography values instead and then use a method such as STDistance() to get the distance.

Is it possible to join two tables on one column = concatenation 2 columns?

Table A has column X, which is an int made up of the concatenation of columns Y and Z (which are both floats) in table B. I want to join tables A and B in a manner similar to this:
select *
from tableA a inner join tableB b
on a.X = b.cast(concat(cast(b.Y as varchar), cast(b.Z as varchar)) as integer
Except that, obviously, my example is not correctly done.

You can do this:
select *
from tableA a
inner join tableB b
on a.X = cast(cast(b.Y as varchar) + cast(b.Z as varchar) as int)
If either of your floats has a fractional part, though, the conversion to int will fail.
E.g., this works:
declare @f1 as float
declare @f2 as float
set @f1 = 1
set @f2 = 7
select cast(cast(@f1 as varchar) + cast(@f2 as varchar) as int)
Output: 17
But this does not:
declare @f1 as float
declare @f2 as float
set @f1 = 1.3
set @f2 = 7
select cast(cast(@f1 as varchar) + cast(@f2 as varchar) as int)
Output: Conversion failed when converting the varchar value '1.37' to data type int.

Sounds like a job for a computed column, then it would be index-able.
http://www.mssqltips.com/tip.asp?tip=1682

Can you create another column in b named x which contains the value you want?
Then the join to A is easy.
