I'm a beginner in Hive and have some basic questions about manipulating a table.
I have a Hive table like this:
bought sold fruit
5 0 apple
0 0 mango
3 2 orange
I want the output as
agg fruit
0 apple
0 mango
1.5 orange
I'm writing an aggregate query as
SELECT sold/bought as agg, fruit from table GROUP BY fruit
I'm having two issues:
The division should not truncate, e.g. 3/2 is giving me 1 but it should give 1.5.
If bought or sold is 0, the result should be 0 instead of NULL or NaN.
Any suggestions on how to achieve this? Thank you.
Hmmm. I thought that Hive divided integers as floats. If not, a simple solution is to multiply by 1.0:
SELECT sold * 1.0 / bought as agg, fruit
FROM table
Use a CASE expression where you apply your conditions:
select
case
when bought = 0 or sold = 0 then 0
else 1.0 * bought / sold
end agg,
fruit
from tablename
Since you are not doing any aggregation there is no need for GROUP BY.
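As a rough check of the CASE approach above, here is a minimal sketch that runs the same query against the sample data. It uses Python's built-in sqlite3 module as a stand-in for Hive (an assumption; Hive's own division and casting rules may differ), and the table name fruit_sales is made up for the example.

```python
import sqlite3

# SQLite stands in for Hive here; "fruit_sales" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruit_sales (bought INTEGER, sold INTEGER, fruit TEXT)")
conn.executemany(
    "INSERT INTO fruit_sales VALUES (?, ?, ?)",
    [(5, 0, "apple"), (0, 0, "mango"), (3, 2, "orange")],
)

# Return 0 when either operand is 0; otherwise force floating-point
# division by multiplying by 1.0 first.
query = """
SELECT CASE
         WHEN bought = 0 OR sold = 0 THEN 0
         ELSE 1.0 * bought / sold
       END AS agg,
       fruit
FROM fruit_sales
ORDER BY fruit
"""
rows = conn.execute(query).fetchall()
for agg, fruit in rows:
    print(agg, fruit)
```

No GROUP BY is involved, since every output row comes from exactly one input row.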
I have seen erratic behaviour in Hive and Impala. To be on the safe side, I would use:
SELECT cast(sold as decimal(5,1)) / cast(bought as decimal(5,1)) as agg, fruit FROM table
If you want to insert the data into a decimal(5,1) column, cast the result as decimal(5,1) too. I know this is stupid, but I want to be safe and avoid unnecessary version-specific automatic-casting issues. I have seen a calculation produce a NULL result in one version and work just fine in another.
Related
I have a problem with counting strings in the result of a concat.
I concat 4 columns and the result is as below.
(complete,complete,incomplete,incomplete).
Now I want to score them based on the number of 'complete' values. For example, if there are 4 complete it is 100%, if there are 3 complete it is 60%, and so on.
Is there any way to do this in SQL?
I suspect that you want aggregation like this:
select avg(case when val = 'complete' then 1.0 else 0 end) as complete_ratio
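To illustrate the AVG(CASE ...) idea, here is a small sketch using Python's sqlite3 module. It assumes the four statuses are stored one per row in a column called val, as the answer implies; the table name checks and the item_id column are invented for the example.

```python
import sqlite3

# Hypothetical layout: one status per row instead of a concatenated string.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE checks (item_id INTEGER, val TEXT)")
conn.executemany(
    "INSERT INTO checks VALUES (?, ?)",
    [(1, "complete"), (1, "complete"), (1, "incomplete"), (1, "incomplete")],
)

# Each 'complete' contributes 1.0 and everything else 0, so the
# average is the fraction of 'complete' values.
ratio = conn.execute(
    "SELECT AVG(CASE WHEN val = 'complete' THEN 1.0 ELSE 0 END) AS complete_ratio "
    "FROM checks"
).fetchone()[0]
print(ratio)
```

This gives a plain proportion; a custom scoring scale would need its own mapping from the count to a score.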
I need to compute each row's percentage of a column's total. Here's an example:
Food Quantity
Lemon 2
Mango 8
SweetJuice 10
Water 20
I want to obtain a table like this:
Food Quantity_pc
Lemon 5
Mango 20
SweetJuice 25
Water 50
Note that this result is not grouped and I have just added this example to simplify the problem.
Here's what I'm trying to do:
Select Food, (Quantity/Sum(Quantity))*100 as Quantity_pc
From `FoodTable`;
But it's throwing me this error:
bigquery error: 400 SELECT list expression references column Decision
which is neither grouped nor aggregated
You want window functions:
select food, quantity * 100 / sum(quantity) over () as quantity_percent
from `foodtable`;
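The window-function query can be tried out with the sample data. This is a sketch using Python's sqlite3 module rather than BigQuery (an assumption, and it needs SQLite 3.25 or newer for window functions); the table name food_table is made up. It multiplies by 100.0 so the division is done in floating point even where integer division would otherwise apply.

```python
import sqlite3

# SQLite stands in for BigQuery; "food_table" is a hypothetical name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE food_table (food TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO food_table VALUES (?, ?)",
    [("Lemon", 2), ("Mango", 8), ("SweetJuice", 10), ("Water", 20)],
)

# SUM(quantity) OVER () is the grand total, repeated on every row,
# so no GROUP BY is required.
rows = conn.execute(
    """
    SELECT food,
           quantity * 100.0 / SUM(quantity) OVER () AS quantity_pc
    FROM food_table
    ORDER BY food
    """
).fetchall()
for food, pc in rows:
    print(food, pc)
```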
There is a similar question that answers this for a known number of columns and only a single selected column. But the problem here is that
I have no knowledge of the columns (count, type) of a given SQL query, and I want to blank repeated values in all columns, not just a single column.
For example lets say I have following query.
Select * from View1
Result :
Column(1) Column(2) Column(..) Column(N)
1 A Sales 1500
2 C Sales 2500
3 C Sales 2500
4 A Development 2500
Expected result :
Column(1)  Column(2)  Column(..)   Column(N)
1          A          Sales        1500
2          C                       2500
3
4          A          Development
Pseudo SQL Query :
EXEC proc_blank_query_result 'Select * from View1'
If you're on SQL Server 2012 or newer, you can do this with lag, something like this:
select
nullif(column1, lag(column1) over (order by yourorderbyclause)) as column1,
nullif(column2, lag(column2) over (order by yourorderbyclause)) as column2,
...
from
View1
To make it dynamic, you would have to parse a lot of metadata from the query. Using sp_describe_first_result_set might be a good idea, or you could select into a temp table and parse its columns.
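As a sketch of the dynamic variant, the following uses Python's sqlite3 module instead of SQL Server (an assumption; sp_describe_first_result_set is replaced here by the cursor's description metadata). The helper name blank_repeats, the table view1 and its column names are all invented for the example. Repeated values come back as NULL rather than as empty strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for View1 with made-up column names.
conn.execute("CREATE TABLE view1 (id INTEGER, grade TEXT, dept TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO view1 VALUES (?, ?, ?, ?)",
    [(1, "A", "Sales", 1500), (2, "C", "Sales", 2500),
     (3, "C", "Sales", 2500), (4, "A", "Development", 2500)],
)

def blank_repeats(conn, source_query, order_col):
    """Return rows of source_query with values equal to the previous row set to NULL."""
    # Discover the column names without fetching any rows.
    cursor = conn.execute(f"SELECT * FROM ({source_query}) LIMIT 0")
    columns = [d[0] for d in cursor.description]
    # Build one NULLIF(col, LAG(col)) expression per column.
    select_list = ", ".join(
        f'NULLIF(src."{c}", LAG(src."{c}") OVER (ORDER BY src."{order_col}")) AS "{c}"'
        for c in columns
    )
    sql = (f'SELECT {select_list} FROM ({source_query}) AS src '
           f'ORDER BY src."{order_col}"')
    return conn.execute(sql).fetchall()

rows = blank_repeats(conn, "SELECT * FROM view1", "id")
for row in rows:
    print(row)
```

The query text is built by string interpolation purely for illustration, so it should only ever be given trusted input. A value that was already NULL cannot be told apart from a blanked repeat.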
I have a lookup table that is 1 to N with a data table.
For example, the lookup table contains (Dog, Cat, Bird, Exotic).
The data table has the following fields.
house, animal_type, quantity
If I have data such as
house animal_type quantity
1 dog 1
1 cat 1
2 dog 2
3 exotic 1
How do I get a query that will produce the following? (The order of the column headings is immaterial).
house dog cat bird exotic
1 1 1 0 0
2 2 0 0 0
3 0 0 0 1
I know about
IIF([animals].[quantity] is null,0,[animals].[quantity])
But it is not producing a zero record for each animal type even if it is not at that house.
You can get what you need with a crosstab query. See the Access help topic: TRANSFORM Statement (Microsoft Access SQL). And check out the Access Crosstab Query Wizard to get started.
It seems you want a column for bird even when no house has one. So add IN ('dog', 'cat', 'bird', 'exotic') to the PIVOT clause:
TRANSFORM Sum(data_table.[quantity]) AS SumOfquantity
SELECT data_table.[house]
FROM data_table
GROUP BY data_table.[house]
PIVOT data_table.[animal_type] IN ('dog', 'cat', 'bird', 'exotic');
That query returns Null instead of zero where a house does not have a record for an animal_type. Include Nz() if you prefer zero instead of Null:
TRANSFORM Nz(Sum(data_table.[quantity]), 0) AS SumOfquantity
With that query, Access 2010 gives me this result set from your sample data:
You can use the TRANSFORM statement in MS Access to solve your task.
TRANSFORM Nz(SUM(quantity), 0)
SELECT house FROM Test
GROUP BY house
PIVOT animal_type
Output of this query
You can find more information about it here (https://msdn.microsoft.com/en-us/library/bb208956(v=office.12).aspx)
The following link explains step by step how to get a crosstab query to show dynamic headings even if there is no associated data for that heading.
How to Tame the Crosstab Missing Column Beast
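TRANSFORM ... PIVOT is specific to Access. For comparison, here is a sketch of the same crosstab written with conditional aggregation, a different and more portable technique, run through Python's sqlite3 module. The table and column names follow the first answer; the fixed list of four animal types is an assumption taken from the lookup table in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data_table (house INTEGER, animal_type TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO data_table VALUES (?, ?, ?)",
    [(1, "dog", 1), (1, "cat", 1), (2, "dog", 2), (3, "exotic", 1)],
)

# One SUM(CASE ...) per animal type; COALESCE turns the NULL produced
# for a missing type into 0, much like Nz() does in Access.
rows = conn.execute(
    """
    SELECT house,
           COALESCE(SUM(CASE WHEN animal_type = 'dog'    THEN quantity END), 0) AS dog,
           COALESCE(SUM(CASE WHEN animal_type = 'cat'    THEN quantity END), 0) AS cat,
           COALESCE(SUM(CASE WHEN animal_type = 'bird'   THEN quantity END), 0) AS bird,
           COALESCE(SUM(CASE WHEN animal_type = 'exotic' THEN quantity END), 0) AS exotic
    FROM data_table
    GROUP BY house
    ORDER BY house
    """
).fetchall()
for row in rows:
    print(row)
```

Because the columns are listed explicitly, bird appears even though no house has one.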
I have a result set like this:
id achieved
1 0
2 1
3 1
4 0
5 0
The percentage should be 2/5, i.e. 40%. How can I write a SQL query to achieve something like this? I would prefer not to use a nested select, as the actual query is already doing quite a bit. Thanks!
select avg(achieved) from ...
Note that you will have to use a GROUP BY clause to include categories:
select gender, avg(achieved) from ... group by gender
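A quick sketch of both queries using Python's sqlite3 module. The table name results and the gender values are invented for the example. SQLite's AVG returns a floating-point value; on engines where AVG of an integer column stays an integer, averaging achieved * 1.0 instead may be needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "results" and the gender values are hypothetical sample data.
conn.execute("CREATE TABLE results (id INTEGER, achieved INTEGER, gender TEXT)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [(1, 0, "f"), (2, 1, "f"), (3, 1, "m"), (4, 0, "m"), (5, 0, "m")],
)

# The mean of a 0/1 column is the fraction of rows that are 1.
overall = conn.execute("SELECT AVG(achieved) FROM results").fetchone()[0]
print(overall)

# The same fraction per category.
by_gender = conn.execute(
    "SELECT gender, AVG(achieved) FROM results GROUP BY gender ORDER BY gender"
).fetchall()
for gender, fraction in by_gender:
    print(gender, fraction)
```

Multiply the fraction by 100 to present it as a percentage.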