How to find the min of multiple values in HIVE?

Hive has min(col) to find the minimum value of a column. But how about finding the minimum of multiple values (NOT one column), for example
select min(2,1,3,4);
returns
FAILED: UDFArgumentTypeException Exactly one argument is expected
Any tips?

Found the solution!
Instead of min(col), we should use least(a, b, c, d)

Instead of MIN, use the LEAST function to find the minimum of the given values or columns within a row.
select least(2,1,3,4);

Use the following to find the minimum value across multiple columns.
This returns the minimum value for each row:
select least(col1,col2) as least_value from table_name;
This returns the minimum value across all rows:
select min(least(col1,col2)) as least_value from table_name;
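
The difference between the row-wise and the all-rows query can be tried out locally. The following is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for Hive: SQLite spells the row-wise function as a multi-argument min() rather than least(), and the sample values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (col1 INTEGER, col2 INTEGER)")
# Hypothetical sample rows, not taken from the question.
conn.executemany("INSERT INTO table_name VALUES (?, ?)", [(5, 3), (2, 8), (7, 4)])

# Row-wise minimum: SQLite's two-argument min() plays the role of Hive's least().
per_row = [row[0] for row in conn.execute(
    "SELECT min(col1, col2) FROM table_name ORDER BY rowid")]

# Minimum over all rows: the aggregate min() wrapped around the row-wise minimum.
overall = conn.execute(
    "SELECT min(min(col1, col2)) FROM table_name").fetchone()[0]

print(per_row)  # [3, 2, 4]
print(overall)  # 2
```

The first query yields one value per row, the second collapses those values into a single result.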

Related

Difference between count(*) and count(true) in sql?

What is the difference between select count(*) and select count(true)?
Is there any difference between count(*) and count(true), and which one should I use?
Can you give an example situation in which each one is the better choice?

The result of both is the same, but count(*) is slightly faster than count(true). That is because in the first case, the aggregate function has no arguments (that's what the * means in SQL), whereas in the second case the argument true is checked for NULLness, since count skips rows where the argument is NULL.

Both return the same result: the total number of rows in the table.
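
The NULL-skipping behaviour that the answers rely on can be checked with a small sketch. It assumes SQLite via Python's sqlite3 module as a stand-in engine, uses the constant 1 in place of true (any non-NULL constant behaves the same way for counting), and uses a made-up table t.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (val INTEGER)")
# Three rows, one of which holds NULL in val.
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (None,), (3,)])

count_star, count_const, count_col = conn.execute(
    "SELECT count(*), count(1), count(val) FROM t").fetchone()

# count(*) and count(<non-NULL constant>) count every row;
# count(val) skips the row where val is NULL.
print(count_star, count_const, count_col)  # 3 3 2
```

Only counting a nullable column changes the result; a constant argument can never be NULL, so it counts every row just as count(*) does.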

Grouping data and keeping only distinct values in SQL

Is it possible to group the following data in PostgreSQL:
(TL;DR: note the identical target entries for the two print_names qz.M2 and qz.M1)
print_name | target
qz.R       | q3zA
qz.S       | NULL
qz.M1      | q2zA
qz.M1      | q1zA
qz.M2      | q2zA
qz.M2      | q1zA
in such a way that the distinct values of target are still in the result while the doubling of qz.M* is avoided.
The result desired would therefore be:
print_name | target
qz.R       | q3zA
qz.S       | NULL
qz.M1      | q2zA
qz.M2      | q1zA
I tried:
SELECT min(target) FROM Table GROUP BY print_name;
However, this of course only yields one of two entries in target.
Thank you for your help!

I don't think this is achievable without special-casing specific print_name values if you want a consistent answer.
SELECT t.print_name,
CASE
WHEN t.print_name = 'qz.M1' THEN max(t.target)
WHEN t.print_name = 'qz.M2' THEN min(t.target)
ELSE max(t.target) END AS target
FROM Table t
GROUP BY t.print_name

Your desired results would seem to indicate just a simple aggregate:
select print_name, Max(target) target
from t
group by print_name
Note that your sample data does not include any reliable column to sort by, so max() is based on string ordering.
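
To see what the two suggested approaches return on the sample rows from the question, here is a minimal sketch. It assumes SQLite via Python's sqlite3 module as a stand-in for PostgreSQL and a table named t; the variable names are chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (print_name TEXT, target TEXT)")
# Sample rows copied from the question.
conn.executemany("INSERT INTO t VALUES (?, ?)", [
    ("qz.R", "q3zA"), ("qz.S", None),
    ("qz.M1", "q2zA"), ("qz.M1", "q1zA"),
    ("qz.M2", "q2zA"), ("qz.M2", "q1zA"),
])

# Plain aggregate: max() picks the largest string in each group.
max_result = dict(conn.execute(
    "SELECT print_name, max(target) FROM t GROUP BY print_name"))

# Special-cased aggregate: min() for qz.M2, max() for everything else.
case_result = dict(conn.execute(
    "SELECT print_name, "
    "CASE WHEN print_name = 'qz.M2' THEN min(target) ELSE max(target) END "
    "FROM t GROUP BY print_name"))

print(max_result["qz.M1"], max_result["qz.M2"])    # q2zA q2zA
print(case_result["qz.M1"], case_result["qz.M2"])  # q2zA q1zA
```

On this data the plain max() returns q2zA for both qz.M1 and qz.M2, so only the special-cased variant reproduces the desired q1zA for qz.M2.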

How do I retrieve next value for multiple sequences in different rows in db2?

I am using this query
SELECT NEXT VALUE FOR SEQUENCENAME1 FROM DUMMY_TABLE UNION
SELECT NEXT VALUE FOR SEQUENCENAME2 FROM DUMMY_TABLE
This does not return the answer in multiple rows.
Please suggest a solution, because I have to retrieve the next value for thousands of sequences.

To answer your core question: You can check SYSCAT.SEQUENCES for metadata about sequences.
The attributes / columns START and NEXTCACHEFIRSTVALUE are the ones of interest for you. Note that sequence values typically are cached for performance, so you need to keep that in mind.

Try using UNION ALL:
SELECT NEXT VALUE FOR SEQUENCENAME1 FROM DUMMY_TABLE UNION ALL
SELECT NEXT VALUE FOR SEQUENCENAME2 FROM DUMMY_TABLE;
If the expressions return the same value, then UNION will return only one row -- it removes duplicates.
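
The duplicate removal is easy to reproduce. This is a minimal sketch, assuming SQLite via Python's sqlite3 module instead of DB2, with the constant 1 standing in for two sequences whose next values happen to be equal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# UNION removes duplicate rows, so two equal values collapse into one row.
union_rows = conn.execute("SELECT 1 AS v UNION SELECT 1").fetchall()

# UNION ALL keeps every row, duplicates included.
union_all_rows = conn.execute("SELECT 1 AS v UNION ALL SELECT 1").fetchall()

print(union_rows)      # [(1,)]
print(union_all_rows)  # [(1,), (1,)]
```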

You could use this query
VALUES (NEXT VALUE FOR SEQUENCENAME1)
, (NEXT VALUE FOR SEQUENCENAME2)
although you won't know which row is which, so maybe this
VALUES ('SEQ1', NEXT VALUE FOR SEQUENCENAME1)
, ('SEQ2', NEXT VALUE FOR SEQUENCENAME2)

Same return with and without the SUM operator PostgreSQL

I'm using PostgreSQL 10 and trying to run this query. I started with a CTE which I am referencing as 'query.'
SELECT
ROW_NUMBER()OVER() AS my_new_id,
query.geom AS geom,
query.pop AS pop,
query.name,
query.distance AS dist,
query.amenity_size,
((amenity_size)/(distance)^2) AS attract_score,
SUM((amenity_size)/(distance)^2) AS tot_attract_score,
((amenity_size)/(distance)^2) / SUM((amenity_size)/(distance)^2) as marketshare
INTO table_mktshare
FROM query
WHERE
distance > 0
GROUP BY
query.name,
query.amenity_size,
query.geom,
query.pop,
query.distance
The query runs, but the problem lies in the 'marketshare' column. It returns the same answer with or without the SUM operator, and the result is always one, which suggests that attract_score and tot_attract_score are the same. Why does the SUM expression evaluate to the same value as the expression above it?

This is occurring specifically because each combination of columns in the group by clause uniquely identifies one row in the table. I don't know if this is intentional, but more normally, one would expect something like this:
SELECT ROW_NUMBER() OVER() AS my_new_id,
query.geom AS geom, query.pop AS pop, query.name,
SUM((amenity_size)/(distance)^2) AS tot_attract_score
INTO table_mktshare
FROM query
WHERE distance > 0
GROUP BY query.name, query.geom, query.pop;
This is not your intention, but it does give a flavor of what's expected.
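
The effect of the GROUP BY granularity can be illustrated with a small sketch. It assumes SQLite via Python's sqlite3 module as a stand-in for PostgreSQL, a cut-down query table with made-up rows, and distance * distance in place of (distance)^2 because SQLite has no ^ power operator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE query (name TEXT, amenity_size REAL, distance REAL)")
# Hypothetical rows: two amenities named 'a', one named 'b'.
conn.executemany("INSERT INTO query VALUES (?, ?, ?)",
                 [("a", 8.0, 2.0), ("a", 27.0, 3.0), ("b", 4.0, 1.0)])

# Grouping by every column: each group holds exactly one row,
# so SUM(...) equals the per-row score and their ratio is always 1.
fine = conn.execute(
    "SELECT name, amenity_size / (distance * distance), "
    "SUM(amenity_size / (distance * distance)) "
    "FROM query WHERE distance > 0 "
    "GROUP BY name, amenity_size, distance "
    "ORDER BY name, distance").fetchall()

# Grouping by name only: SUM(...) now adds up several rows per group.
coarse = conn.execute(
    "SELECT name, SUM(amenity_size / (distance * distance)) "
    "FROM query WHERE distance > 0 "
    "GROUP BY name ORDER BY name").fetchall()

print(fine)    # [('a', 2.0, 2.0), ('a', 3.0, 3.0), ('b', 4.0, 4.0)]
print(coarse)  # [('a', 5.0), ('b', 4.0)]
```

In the first result the score and the sum are identical in every row, which is the behaviour described in the question.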

In Oracle, find number which is larger than 80% of a set of a numbers

Assume I have a table with a column of integers in Oracle. There are a good amount of rows; somewhere in the millions. I want to write a query that gives me back an integer that is larger than 80% of all of the numbers in table. What is the best way to approach this?
If it matters, this is Oracle 10g r1.

Sounds like you want to use the PERCENTILE_DISC function if you want an actual value from the set, or PERCENTILE_CONT if you want an interpolated value for a particular percentile, say 80%:
SELECT PERCENTILE_DISC(0.8)
WITHIN GROUP(ORDER BY integer_col ASC)
FROM some_table
EDIT
If you use PERCENTILE_DISC, it will return an actual value from the dataset, so if you wanted a larger value, you'd want to increment that by 1 (for an integer column).
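
The discrete percentile and the reason for the increment can be sketched in a few lines. This is an illustrative Python model of the usual PERCENTILE_DISC definition (the smallest value whose cumulative distribution reaches the requested fraction), not Oracle's implementation; the function name and data are assumptions.

```python
import math

def percentile_disc(values, p):
    """Return the smallest value whose cumulative distribution is >= p (0 < p <= 1)."""
    ordered = sorted(values)
    # The value at 1-based position ceil(p * n) is the first to reach fraction p.
    index = math.ceil(p * len(ordered)) - 1
    return ordered[max(index, 0)]

data = list(range(1, 11))          # hypothetical column values 1..10
p80 = percentile_disc(data, 0.8)

# Share of values strictly smaller than the percentile and than percentile + 1.
below_p80 = sum(v < p80 for v in data) / len(data)
below_next = sum(v < p80 + 1 for v in data) / len(data)

print(p80)         # 8
print(below_p80)   # 0.7
print(below_next)  # 0.8
```

On this data the percentile value itself is strictly larger than only 70% of the values, while the value plus one is larger than 80% of them.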

I think you could use the NTILE function to divide the input into 5 buckets, then select the MIN(Column) from the top bucket.
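
The bucket idea can be modelled as follows. This is a rough Python sketch of NTILE-style bucketing (bucket sizes differ by at most one, with the larger buckets first), not Oracle code; the function name and data are assumptions.

```python
def ntile(sorted_values, buckets):
    """Split already sorted values into `buckets` consecutive groups, NTILE-style."""
    base, extra = divmod(len(sorted_values), buckets)
    groups, start = [], 0
    for b in range(buckets):
        # The first `extra` buckets receive one additional row.
        size = base + (1 if b < extra else 0)
        groups.append(sorted_values[start:start + size])
        start += size
    return groups

data = sorted(range(1, 11))        # hypothetical column values 1..10
top_bucket = ntile(data, 5)[-1]
threshold = min(top_bucket)

print(top_bucket)  # [9, 10]
print(threshold)   # 9
```

With five buckets, the top bucket holds the highest fifth of the rows, so its minimum is larger than the values in the four lower buckets.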
