divide operation in hive tables - sql

I'm a beginner in hive and had some basic questions trying to manipulate a table.
I have a hive table as
bought sold fruit
5 0 apple
0 0 mango
3 2 orange
I want the output as
agg fruit
0 apple
0 mango
1.5 orange
I'm writing an aggregate query as
SELECT sold/bought as agg, fruit from table GROUP BY fruit
I'm having two issues:
I want real (non-integer) division: for example, 3/2 gives me 1, but it should give 1.5.
If bought or sold is 0, the result should be 0 instead of NULL or NaN.
Any suggestions on how to achieve this? Thank you.

Hmmm. I thought that Hive divided integers as floats. If not, a simple solution is to multiply by 1.0:
SELECT sold * 1.0 / bought as agg, fruit
FROM table
GROUP BY fruit

Use a CASE expression where you apply your conditions:
select
  case
    when bought = 0 or sold = 0 then 0
    else 1.0 * bought / sold
  end agg,
  fruit
from tablename
Since you are not doing any aggregation there is no need for GROUP BY.
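Here is a small runnable sketch of the CASE approach, in case it helps to see the two fixes side by side. It uses Python's built-in sqlite3 module purely as a stand-in, since SQLite also truncates integer / integer; Hive may behave differently, so treat this as an illustration and not as Hive output. The table name fruit_sales is made up for the example.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption: not Hive).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruit_sales (bought INTEGER, sold INTEGER, fruit TEXT)")
conn.executemany(
    "INSERT INTO fruit_sales VALUES (?, ?, ?)",
    [(5, 0, "apple"), (0, 0, "mango"), (3, 2, "orange")],
)

# SQLite truncates integer / integer, which reproduces the 3/2 = 1 symptom.
truncated = conn.execute("SELECT 3 / 2").fetchone()[0]

ratio_rows = conn.execute(
    """
    SELECT CASE
             WHEN bought = 0 OR sold = 0 THEN 0   -- zero guard instead of NULL/NaN
             ELSE 1.0 * bought / sold             -- 1.0 forces float division
           END AS agg,
           fruit
    FROM fruit_sales
    ORDER BY fruit
    """
).fetchall()

print(truncated)
print(ratio_rows)
```

Note that the expected 1.5 for orange corresponds to bought / sold (3 / 2), which is why the CASE expression divides in that order.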

I have seen erratic behaviour in Hive or Impala. To be on the safe side, I would use:
SELECT cast(sold as decimal(5,1)) / cast(bought as decimal(5,1)) as agg, fruit FROM table GROUP BY fruit
If you want to insert the data into a decimal(5,1) column, cast the result as decimal(5,1) as well. I know this looks redundant, but I want to be safe and avoid version-specific automatic-casting issues: I have seen the same calculation produce a null result in one version and work just fine in another.

Related

counting strings when using concat in sql

I have a problem with counting strings in my concat syntax.
I concat 4 columns and the result is as below.
(complete,complete,incomplete,incomplete).
Now I want to score them based on the number of 'complete'. For example, if there are 4 complete it is 100%, if there are 3 complete it is 60%, and so on.
Is there any way to do this in SQL?
I suspect that you want aggregation like this:
select avg(case when val = 'complete' then 1.0 else 0 end) as complete_ratio
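A possible way to apply that AVG(CASE ...) idea to four separate columns is to unpivot them first, so each status becomes its own row. The sketch below does this with UNION ALL in SQLite via Python's sqlite3; the table tasks and the columns s1..s4 are hypothetical names, since the question does not show the schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per item, four status columns.
conn.execute("CREATE TABLE tasks (id INTEGER, s1 TEXT, s2 TEXT, s3 TEXT, s4 TEXT)")
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?, ?, ?)",
    [
        (1, "complete", "complete", "incomplete", "incomplete"),
        (2, "complete", "complete", "complete", "complete"),
    ],
)

complete_ratios = conn.execute(
    """
    SELECT id,
           AVG(CASE WHEN val = 'complete' THEN 1.0 ELSE 0 END) AS complete_ratio
    FROM (
        -- Unpivot: turn the four status columns into four rows per id.
        SELECT id, s1 AS val FROM tasks
        UNION ALL SELECT id, s2 FROM tasks
        UNION ALL SELECT id, s3 FROM tasks
        UNION ALL SELECT id, s4 FROM tasks
    ) AS unpivoted
    GROUP BY id
    ORDER BY id
    """
).fetchall()

print(complete_ratios)
```

This yields the plain fraction of 'complete' values (3 of 4 would be 0.75), so a custom scale such as the 60% mentioned in the question would need an extra mapping step on top.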

How to compute percentages in BigQuery SQL for SUM?

I need to compute the Percentage of a specific column in a table, here's an example
Food Quantity
Lemon 2
Mango 8
SweetJuice 10
Water 20
I want to obtain a table like this:
Food Quantity_pc
Lemon 5
Mango 20
SweetJuice 25
Water 50
Note that this result is not grouped and I have just added this example to simplify the problem.
Here's what I am trying to do:
Select Food, (Quantity/Sum(Quantity))*100 as Quantity_pc
From `FoodTable`;
But it's throwing me this error:
bigquery error: 400 SELECT list expression references column Decision
which is neither grouped nor aggregated
You want window functions:
select food, quantity * 100 / sum(quantity) over () as quantity_percent
from `foodtable`;
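To see the window function at work on the sample data, here is a rough sketch using Python's sqlite3 as a stand-in for BigQuery (it assumes the bundled SQLite is 3.25 or newer, which is when window functions were added). The query uses 100.0 so that SQLite does not truncate the division; that tweak is specific to this stand-in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foodtable (food TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO foodtable VALUES (?, ?)",
    [("Lemon", 2), ("Mango", 8), ("SweetJuice", 10), ("Water", 20)],
)

percent_rows = conn.execute(
    """
    SELECT food,
           -- SUM(...) OVER () is the grand total, repeated on every row,
           -- so no GROUP BY is needed and each row is kept.
           quantity * 100.0 / SUM(quantity) OVER () AS quantity_percent
    FROM foodtable
    ORDER BY food
    """
).fetchall()

for food, pct in percent_rows:
    print(food, pct)
```

The empty OVER () is what avoids the "neither grouped nor aggregated" error: the sum is computed over the whole result set without collapsing the rows.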

How to show blank instead of column value for all duplicated columns of a SQL query?

There is a similar question which answers this for a known number of columns and only a single selection column. But the problem here is that
I have no knowledge of the columns (count, type) of a specified SQL query, and I also want to blank all columns, not just a single column.
For example, let's say I have the following query.
Select * from View1
Result :
Column(1)  Column(2)  Column(..)   Column(N)
1          A          Sales        1500
2          C          Sales        2500
3          C          Sales        2500
4          A          Development  2500
Expected result :
Column(1)  Column(2)  Column(..)   Column(N)
1          A          Sales        1500
2          C                       2500
3
4          A          Development
Pseudo SQL Query :
EXEC proc_blank_query_result 'Select * from View1'
If you're in SQL Server 2012 or newer, you can do this with lag, something like this:
select
  nullif(column1, lag(column1) over (order by yourorderbyclause)) as column1,
  nullif(column2, lag(column2) over (order by yourorderbyclause)) as column2,
  ...
from
  View1
To make it dynamic, you have to parse a lot of metadata from the query. Using sp_describe_first_result_set might be a good idea, or you can SELECT INTO a temp table and parse its columns.
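As a rough illustration of the dynamic idea (read the column names first, then generate one NULLIF/LAG pair per column), here is a sketch in Python with sqlite3 standing in for SQL Server. The helper blank_repeats, the table view1, and its column names are all made up for the example, and it assumes SQLite 3.25+ for LAG; on SQL Server the column list would come from something like sp_describe_first_result_set instead of cursor.description.

```python
import sqlite3


def blank_repeats(conn, query, order_col):
    """Return the query's rows with values that repeat the previous row set to NULL."""
    # Run the query with LIMIT 0 only to discover its column names.
    cols = [d[0] for d in conn.execute(f"SELECT * FROM ({query}) LIMIT 0").description]
    # One NULLIF(col, LAG(col)) expression per discovered column.
    select_list = ", ".join(
        f'NULLIF("{c}", LAG("{c}") OVER (ORDER BY "{order_col}")) AS "{c}"'
        for c in cols
    )
    return conn.execute(
        f'SELECT {select_list} FROM ({query}) ORDER BY "{order_col}"'
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE view1 (id INTEGER, code TEXT, dept TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO view1 VALUES (?, ?, ?, ?)",
    [
        (1, "A", "Sales", 1500),
        (2, "C", "Sales", 2500),
        (3, "C", "Sales", 2500),
        (4, "A", "Development", 2500),
    ],
)

blanked = blank_repeats(conn, "SELECT * FROM view1", "id")
for row in blanked:
    print(row)
```

The repeated values come back as NULL (None in Python), so turning them into blanks is left to the display layer. The query text is spliced into the SQL string, so this is only reasonable for trusted queries.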

force query to produce records for each value in lookup table

I have a lookup table that is 1 to N with a data table. The look up table has the
For example, the lookup table contains (Dog, Cat, Bird, Exotic).
The data table has the following fields.
house, animal_type, quantity
If I have data such as
house animal_type quantity
1 dog 1
1 cat 1
2 dog 2
3 exotic 1
How do I get a query that will produce the following? (The order of the column headings is immaterial).
house dog cat bird exotic
1 1 1 0 0
2 2 0 0 0
3 0 0 0 1
I know about
IIF([animals].[quantity] is null,0,[animals].[quantity])
But it is not producing a zero record for each animal type even if it is not at that house.
You can get what you need with a crosstab query. See the Access help topic: TRANSFORM Statement (Microsoft Access SQL). And check out the Access Crosstab Query Wizard to get started.
It seems you want a column for bird even when no house has one. So add IN ('dog', 'cat', 'bird', 'exotic') to the PIVOT clause:
TRANSFORM Sum(data_table.[quantity]) AS SumOfquantity
SELECT data_table.[house]
FROM data_table
GROUP BY data_table.[house]
PIVOT data_table.[animal_type] IN ('dog', 'cat', 'bird', 'exotic');
That query returns Null instead of zero where a house does not have a record for an animal_type. Include Nz() if you prefer zero instead of Null:
TRANSFORM Nz(Sum(data_table.[quantity]), 0) AS SumOfquantity
With that query, Access 2010 gives me this result set from your sample data:
You can use the TRANSFORM statement in MS Access to solve your task.
TRANSFORM Nz(SUM(quantity), 0)
SELECT house FROM Test
GROUP BY house
PIVOT animal_type
Output of this query
You can find more information about it here (https://msdn.microsoft.com/en-us/library/bb208956(v=office.12).aspx)
The following link explains, step by step, how to get a crosstab query to show dynamic headings even if there is no associated data for that heading.
How to Tame the Crosstab Missing Column Beast
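For anyone who wants to try the same crosstab outside Access, a comparable result can be sketched with conditional aggregation, which is a different technique from TRANSFORM/PIVOT but gives one column per lookup value, including those with no data. The example below uses Python's sqlite3; the lookup table name animal_lookup is an assumption (the question does not name it), and the animal types are stored in lowercase to match the data table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animal_lookup (animal_type TEXT)")  # hypothetical name
conn.execute("CREATE TABLE data_table (house INTEGER, animal_type TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO animal_lookup VALUES (?)",
    [("dog",), ("cat",), ("bird",), ("exotic",)],
)
conn.executemany(
    "INSERT INTO data_table VALUES (?, ?, ?)",
    [(1, "dog", 1), (1, "cat", 1), (2, "dog", 2), (3, "exotic", 1)],
)

# Column headings come from the lookup table, so 'bird' appears even with no rows.
animal_types = [
    r[0] for r in conn.execute("SELECT animal_type FROM animal_lookup ORDER BY rowid")
]
columns = ", ".join(
    f"SUM(CASE WHEN animal_type = '{a}' THEN quantity ELSE 0 END) AS \"{a}\""
    for a in animal_types
)
crosstab = conn.execute(
    f"SELECT house, {columns} FROM data_table GROUP BY house ORDER BY house"
).fetchall()

print(["house"] + animal_types)
for row in crosstab:
    print(row)
```

The ELSE 0 branch plays the role of Nz() here: a house with no record for an animal type gets 0 instead of Null. The values are interpolated into the SQL text, which is acceptable only because they come from a trusted lookup table.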

How to find percentage in SQL from list of zeros and ones?

I have a result set like this:
id achieved
1 0
2 1
3 1
4 0
5 0
The percentage should be 2/5, i.e. 40%. How can I write a SQL query to achieve something like this? I would prefer not to use any nested select, as the actual query is already doing quite a bit. Thanks!
select avg(achieved) from ...
Note that you will have to use a GROUP BY clause to include categories:
select gender, avg(achieved) from ... group by gender
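A quick way to check both queries against the sample data is the sketch below, which runs them in SQLite through Python's sqlite3. The table name results and the gender values are invented for the example, since the question only shows id and achieved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the gender values are made up for illustration.
conn.execute("CREATE TABLE results (id INTEGER, gender TEXT, achieved INTEGER)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [(1, "f", 0), (2, "f", 1), (3, "m", 1), (4, "m", 0), (5, "m", 0)],
)

# The mean of a 0/1 column is the fraction of rows that are 1.
overall = conn.execute("SELECT AVG(achieved) FROM results").fetchone()[0]

by_gender = conn.execute(
    "SELECT gender, AVG(achieved) FROM results GROUP BY gender ORDER BY gender"
).fetchall()

print(overall * 100)
print(by_gender)
```

The result is a fraction between 0 and 1, so multiply by 100 if a percentage is wanted.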