Let's say I have a table called "means" that looks like this:
year mean
1990 1.5
1991 1.0
1992 1.3
1993 1.0
And I have a second table called "values" that looks like this:
year tag value
1990 A 0.25
1991 B 1.10
1992 C 2.32
1993 A 0.70
I want to create another column where if the value for a given year is greater than the mean for a given year, the value of that column should be "Greater". If it's less than the mean for a given year, it should be "Less" and if it's equal to the mean, it should be "Equal".
Essentially, I want to create a series of Case When statements that are indexed to the year given in the table.
How would I go about doing that?
That's a join with conditional logic:
select
v.*,
case
when v.value > m.mean then 'Greater'
when v.value < m.mean then 'Less'
else 'Equal'
end comp
from vals v
inner join means m on m.year = v.year
Note: this is called a case expression, not a case statement (the former is a conditional expression, while the latter is a control flow structure).
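For anyone who wants to try this out, here is a minimal sketch that runs the same join and case expression against an in-memory SQLite database from Python. The choice of SQLite, the column types, and the function name compare_to_yearly_mean are assumptions made for illustration; the table name vals follows the query above.

```python
import sqlite3


def compare_to_yearly_mean():
    """Label each value as Greater/Less/Equal relative to its year's mean."""
    conn = sqlite3.connect(":memory:")
    # Sample data copied from the question; column types are assumed.
    conn.executescript("""
        CREATE TABLE means (year INTEGER, mean REAL);
        CREATE TABLE vals (year INTEGER, tag TEXT, value REAL);
        INSERT INTO means VALUES (1990, 1.5), (1991, 1.0), (1992, 1.3), (1993, 1.0);
        INSERT INTO vals VALUES (1990, 'A', 0.25), (1991, 'B', 1.10),
                                (1992, 'C', 2.32), (1993, 'A', 0.70);
    """)
    rows = conn.execute("""
        SELECT v.year, v.tag, v.value,
               CASE
                   WHEN v.value > m.mean THEN 'Greater'
                   WHEN v.value < m.mean THEN 'Less'
                   ELSE 'Equal'
               END AS comp
        FROM vals v
        INNER JOIN means m ON m.year = v.year
        ORDER BY v.year
    """).fetchall()
    conn.close()
    return rows


result = compare_to_yearly_mean()
for row in result:
    print(row)
```

One thing to keep in mind: the ELSE branch also catches rows where value or mean is NULL, so such rows would be labelled 'Equal' unless you add an explicit check.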
Data:
class type number
-----------------
2021 1 5
2021 1 4.6
2021 0 4.8
2021 null null
2022 1 4.2
2022 1 5
2022 0 4.2
2022 null null
Desired result:
class rate score
----------------
2021 0.5 4.8
2022 0.5 4.6
rate = (number of rows with type = 1) / (number of all rows), grouped by class.
score = AVG(number) where type = 1, grouped by class.
I want to do something like this:
SELECT
a.class, SUM(type) / COUNT(*) AS rate, b.score
FROM
data as a
LEFT JOIN
(SELECT
class, AVG(number) AS score
FROM
data
WHERE
type = 1
GROUP BY
class) AS b ON a.class = b.class
GROUP BY
class
Is there any method to do this without JOIN?
First, a few issues should be pointed out:
Do not use SQL keywords like type or number as column names or table names.
Do not divide without ruling out a possible division by zero.
Anyway, in case your description is correct, you can do the following:
SELECT class,
ROUND(AVG(CAST(COALESCE(type,0) AS FLOAT)),2) AS rate,
ROUND(AVG(CASE WHEN type = 1 THEN COALESCE(number,0) END),2) AS score
FROM data
GROUP BY class;
You can see here it's working correctly: db<>fiddle
Some explanations:
AVG will build the average without doing risky divisions.
COALESCE replaces NULL values with zero to make sure the average will be correct. AVG skips NULLs, so without it the rows with a NULL type would be left out of the denominator.
ROUND makes sure the average is shown as, for example, 0.33 rather than 0.33333...
If this is not sufficient for you, please be more precise about what exactly you want to do.
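If you can't open the fiddle, here is a rough way to reproduce the result locally. It is only a sketch: it assumes SQLite (run through Python's sqlite3 module) instead of whatever database you actually use, so FLOAT is written as REAL, and the column types are guesses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows copied from the question; NULL rows are kept on purpose.
conn.executescript("""
    CREATE TABLE data (class INTEGER, type INTEGER, number REAL);
    INSERT INTO data VALUES
        (2021, 1, 5), (2021, 1, 4.6), (2021, 0, 4.8), (2021, NULL, NULL),
        (2022, 1, 4.2), (2022, 1, 5), (2022, 0, 4.2), (2022, NULL, NULL);
""")

# One pass over the table: no join and no explicit division.
rate_and_score = conn.execute("""
    SELECT class,
           ROUND(AVG(CAST(COALESCE(type, 0) AS REAL)), 2) AS rate,
           ROUND(AVG(CASE WHEN type = 1 THEN COALESCE(number, 0) END), 2) AS score
    FROM data
    GROUP BY class
    ORDER BY class
""").fetchall()
conn.close()

for row in rate_and_score:
    print(row)
```

The CASE without an ELSE yields NULL for rows where type is not 1, and AVG ignores those, which is what restricts the score to type = 1 rows.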
I have a problem with counting strings in my concat syntax.
I concatenate 4 columns and the result looks like this:
(complete,complete,incomplete,incomplete).
Now I want to score them based on the number of 'complete' values. For example, if there are 4 complete it is 100%, if there are 3 complete it is 60%, and so on.
Is there any way to do this in SQL?
I suspect that you want aggregation like this:
select avg(case when val = 'complete' then 1.0 else 0 end) as complete_ratio
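Since the question doesn't show the table, the following is only a guess at how the expression above could be applied. It assumes a hypothetical table items with four status columns s1 to s4 (none of these names come from the question), stacks the four columns into a single val column with UNION ALL, and then averages per row id. SQLite via Python is used purely so the sketch can be run as-is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: one row per item, four status columns.
conn.executescript("""
    CREATE TABLE items (id INTEGER, s1 TEXT, s2 TEXT, s3 TEXT, s4 TEXT);
    INSERT INTO items VALUES
        (1, 'complete', 'complete', 'incomplete', 'incomplete'),
        (2, 'complete', 'complete', 'complete', 'complete');
""")

# Stack the four columns into one (id, val) list, then average a 1/0 flag.
complete_ratios = conn.execute("""
    SELECT id,
           AVG(CASE WHEN val = 'complete' THEN 1.0 ELSE 0 END) AS complete_ratio
    FROM (
        SELECT id, s1 AS val FROM items
        UNION ALL SELECT id, s2 FROM items
        UNION ALL SELECT id, s3 FROM items
        UNION ALL SELECT id, s4 FROM items
    ) AS statuses
    GROUP BY id
    ORDER BY id
""").fetchall()
conn.close()

print(complete_ratios)
```

This gives a plain ratio (2 of 4 is 0.5). If the scoring is not linear, as the 60% in the question suggests, you would map the ratio or the count to a score with another CASE expression.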
I'm a beginner in hive and had some basic questions trying to manipulate a table.
I have a hive table as
bought sold fruit
5 0 apple
0 0 mango
3 2 orange
I want the output as
agg fruit
0 apple
0 mango
1.5 orange
I'm writing an aggregate query as
SELECT sold/bought as agg, fruit from table GROUP BY fruit
I'm having two issues
I want the division to be exact, e.g. 3/2 is giving me 1 but it should give 1.5.
If bought or sold is 0, it should give the value 0 instead of Null or NaN.
Any suggestions on how to achieve this? Thank you.
Hmmm. I thought that Hive divided integers as floats. If not, a simple solution is to multiply by 1.0:
SELECT sold * 1.0 / bought as agg, fruit
FROM table
GROUP BY fruit
Use a CASE expression where you apply your conditions:
select
case
when bought = 0 or sold = 0 then 0
else 1.0 * bought / sold
end agg,
fruit
from tablename
Since you are not doing any aggregation there is no need for GROUP BY.
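Here is a small sketch of that CASE expression run against the sample data. It uses SQLite through Python as a stand-in, which is an assumption: integer-division rules differ between engines, so the first two queries only show what SQLite does, not what Hive does. The table name fruit_trades is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows copied from the question; the table name is hypothetical.
conn.executescript("""
    CREATE TABLE fruit_trades (bought INTEGER, sold INTEGER, fruit TEXT);
    INSERT INTO fruit_trades VALUES
        (5, 0, 'apple'), (0, 0, 'mango'), (3, 2, 'orange');
""")

# In SQLite, integer / integer truncates; multiplying by 1.0 forces a real result.
int_division = conn.execute("SELECT 3 / 2").fetchone()[0]
real_division = conn.execute("SELECT 1.0 * 3 / 2").fetchone()[0]

# Guard against zero first, then divide as a real number.
agg_by_fruit = conn.execute("""
    SELECT CASE
               WHEN bought = 0 OR sold = 0 THEN 0
               ELSE 1.0 * bought / sold
           END AS agg,
           fruit
    FROM fruit_trades
    ORDER BY fruit
""").fetchall()
conn.close()

print(int_division, real_division)
print(agg_by_fruit)
```

Checking for zero before dividing means the division branch is only reached when both operands are non-zero, so no NULL or error can come out of it for these rows.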
I have seen erratic behaviour in Hive and Impala. To be on the safe side, I would use:
SELECT cast(sold as decimal(5,1)) / cast(bought as decimal(5,1)) as agg, fruit FROM table GROUP BY fruit
If you want to insert the data into a decimal(5,1) column, cast the result as decimal(5,1). I know this looks like overkill, but I want to be safe and avoid version-specific automatic-casting issues. I have seen a calculation produce a null result in one version and work just fine in another.
Forgive me, but I can't get this working.
I can find lots of complex pivots using numeric values, but nothing basic based on strings to build upon.
Lets suppose this is my source query from a temp table. I can't change this:
select * from #tmpTable
This provides 12 rows:
Row | Name | Code
---------------------------------
1 | July 2019 | 19/20-01
2 | August 2019 | 19/20-02
3 | September 2019 | 19/20-03
.. .. ..
12 | June 2020 | 19/20-12
I want to pivot this and return the data like this:
Data Type | [0] | [1] | [3] | [12]
---------------------------------------------------------------------------
Name | July 2019 | August 2019 | September 2019 | June 2020
Code | 19/20-01 | 19/20-02 | 19/20-03 | 19/20-12
Thanks in advance..
Strings and numbers aren't much different in pivot terms; it's just that you can't use numeric aggregators like SUM or AVG on strings. MAX will be fine, and in this case you'll only have one value per cell so nothing will be lost.
You need to pull your data out into a taller key/value representation before pivoting it back so it lies the other way round from how it does now.
Unpivot the data:
WITH upiv AS(
SELECT 'Name' as t, row as r, name as v FROM #tmpTable
UNION ALL
SELECT 'Code' as t, row, code FROM #tmpTable
)
Now the data can be regrouped and conditionally aggregated on the r column:
SELECT
t,
MAX(CASE WHEN r = 1 THEN v END) as r1,
MAX(CASE WHEN r = 2 THEN v END) as r2,
...
MAX(CASE WHEN r = 12 THEN v END) as r12
FROM
upiv
GROUP BY
t
You'll need to put the two SQL blocks I present here together so they form a single SQL statement. If you want to know more about how this works, I suggest you run the SQL statement inside the WITH block and take a look at the result, then remove the GROUP BY and the MAX calls from the full statement and look at that result too. You'll see the WITH block query makes the data taller, essentially a key/value pair that tracks what type the data is (name or code). When you run the full SQL without the GROUP BY/MAX, you'll see the tall data spreads out sideways to give a lot of nulls and a diagonal set of cell data (if ordered by r). The GROUP BY collapses all these nulls because MAX will pick any non-null value over null, and there is only one non-null value per group and column.
You could also do this as an UNPIVOT followed by a PIVOT. I've always preferred the form shown here because not every database supports the UNPIVOT/PIVOT keywords. Arguably, UNPIVOT/PIVOT could perform better because there may be specific optimizations the developers can make (e.g. UNPIVOT could scan the table once, whereas this multiple-UNION approach may require multiple scans, and ways round that could be more memory intensive), but in this case it's only 12 rows. I suspect you're using SQL Server, but if you're using a database that doesn't understand WITH, you can place the bracketed statement of the WITH (including the brackets) between the FROM and the upiv to make it a subquery of the pattern SELECT ... FROM (SELECT ... UNION ALL SELECT ...) upiv GROUP BY ...; there is no difference.
I'll leave renaming the output columns as an exercise for you, but I would urge you to consider not putting spaces or square brackets in the column names as you show in your question.
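To see the two halves working as one statement, here is a cut-down sketch using only the first three rows. It assumes SQLite via Python rather than SQL Server, so the temp table becomes an ordinary table called tmp_table and the row column is renamed row_no; both names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# First three rows from the question; table and column names are stand-ins.
conn.executescript("""
    CREATE TABLE tmp_table (row_no INTEGER, name TEXT, code TEXT);
    INSERT INTO tmp_table VALUES
        (1, 'July 2019', '19/20-01'),
        (2, 'August 2019', '19/20-02'),
        (3, 'September 2019', '19/20-03');
""")

pivoted = conn.execute("""
    WITH upiv AS (
        -- Make the data taller: one (type, row, value) triple per cell.
        SELECT 'Name' AS t, row_no AS r, name AS v FROM tmp_table
        UNION ALL
        SELECT 'Code', row_no, code FROM tmp_table
    )
    -- Spread it sideways again: one output column per source row.
    SELECT t,
           MAX(CASE WHEN r = 1 THEN v END) AS r1,
           MAX(CASE WHEN r = 2 THEN v END) AS r2,
           MAX(CASE WHEN r = 3 THEN v END) AS r3
    FROM upiv
    GROUP BY t
    ORDER BY t DESC
""").fetchall()
conn.close()

for row in pivoted:
    print(row)
```

The ORDER BY t DESC is only there to put the Name row before the Code row, matching the layout asked for in the question.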
I'd like to know whether it's possible to group by an aggregate function.
Scenario:
I have a table which has biomass (kg), number of individuals, and a description for every day, so I can calculate the total average weight and total number of individuals between two dates as:
select
description,
sum(biomass)/sum(number_individuals) as av.weight,
sum(number_individuals) as individuals
from
Table
group by description
Which works okay, now, the thing is that I want to group those individuals separating them by weight ranges, in order to get something like:
description range(kg) number av.weigh(g)
Foo 2-3 2400 2584.48
I have tried something like
SELECT
description,
case when sum(biomass)/sum(number_individuals) >= 2000.0
and sum(biomass)*1000/sum(number_individuals) < 3000 then '2-3'
else 'nothing'
end as desc_range
FROM Table
Group by
description,
sum(biomass)/sum(number_individuals)
But it doesn't seem to work, and neither does grouping by the alias desc_range.
I am using Informix 9.40 TC3
Any help will be appreciated.
Best regards
If you want to aggregate on an aggregation, you usually need a subquery. However, you mention individuals, so perhaps this is what you want:
select description,
(case when biomass between 2 and 3 then '2-3'
else 'nothing'
end) as biomass,
sum(biomass)/sum(number_individuals) as av_weight, sum(number_individuals) as individuals
from Table
group by description,
(case when biomass between 2 and 3 then '2-3'
else 'nothing'
end);
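For the other reading of the question (classifying the aggregated average weight itself), the subquery route mentioned at the start could look roughly like the sketch below. Everything in it is an assumption made for illustration: the table name samples, the sample rows, and the use of SQLite via Python. I have not checked whether Informix 9.40 accepts a derived table in FROM or an alias in GROUP BY, so treat it as a pattern rather than something to paste in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up rows: biomass in kg, one row per day and description.
conn.executescript("""
    CREATE TABLE samples (description TEXT, biomass REAL, number_individuals INTEGER);
    INSERT INTO samples VALUES
        ('Foo', 2.5, 1), ('Foo', 5.0, 2),   -- 7.5 kg / 3 individuals = 2500 g
        ('Bar', 1.0, 2), ('Bar', 2.0, 2);   -- 3.0 kg / 4 individuals = 750 g
""")

ranges = conn.execute("""
    SELECT CASE
               WHEN av_weight_g >= 2000 AND av_weight_g < 3000 THEN '2-3'
               ELSE 'nothing'
           END AS desc_range,
           COUNT(*) AS descriptions,
           SUM(individuals) AS individuals
    FROM (
        -- Inner query: aggregate once per description.
        SELECT description,
               SUM(biomass) * 1000.0 / SUM(number_individuals) AS av_weight_g,
               SUM(number_individuals) AS individuals
        FROM samples
        GROUP BY description
    ) AS per_description
    -- Outer query: the aggregate is now a plain column, so it can be grouped on.
    GROUP BY desc_range
    ORDER BY desc_range
""").fetchall()
conn.close()

print(ranges)
```

The point of the inner query is that the outer GROUP BY never sees an aggregate function, only the already-computed av_weight_g column, which is what the original attempt was missing.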