HiveQL: Divide all values of a column by max of the column

How do you divide all values of a column by the max of the column?
For example, suppose I have:
id value
1 10
2 20
3 30
I want:
id value
1 0.33333
2 0.66666
3 0.99999
I have tried:
SELECT col_a/MAX(col_a)
FROM db.table
and
SELECT col_a/(SELECT MAX(col_a) FROM db.table)
FROM db.table
and both attempts failed with lengthy error messages.
While I was able to get the code of the first answer below to work when I copy-pasted it, I couldn't replicate its results with my own table. I tried:
WITH temp AS (SELECT * FROM mydb.tablenamehere)
SELECT colA/MAX(colA) OVER()
FROM temp;
and also:
USE mydb;
WITH temp AS (SELECT * FROM tablenamehere)
SELECT colA/MAX(colA) OVER()
FROM temp;
but I get the following error for both:
FAILED: SemanticException Line 1:28 Table not found 'tablenamehere' in definition of CTE temp [
SELECT * FROM tablenamehere
] used as temp at Line 1:28

Use analytic max():
with your_table as ( --use your table instead of this
select stack(3,
1, 10 ,
2, 20 ,
3, 30 ) as (id, value)
)
select id, value/max(value) over() as value
from your_table
order by id --remove order if not necessary
;
Returns:
OK
1 0.3333333333333333
2 0.6666666666666666
3 1.0
Time taken: 80.474 seconds, Fetched: 3 row(s)
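For readers without a Hive cluster at hand, here is a minimal sketch of the same idea using Python's bundled sqlite3 module as a stand-in for Hive. The table name demo is hypothetical, and window functions need SQLite 3.25 or later. It shows the analytic MAX() OVER () form from the answer and, as an alternative I am assuming carries over to Hive, a cross join against a one-row subquery that holds the maximum. One dialect difference to keep in mind: SQLite divides integers as integers, so the sketch multiplies by 1.0 first, whereas the Hive output above already comes back as a double.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER, value INTEGER)")  # hypothetical table
conn.executemany("INSERT INTO demo VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)])

# Window form: MAX(value) OVER () is computed over all rows and attached to each row.
# The * 1.0 forces floating-point division in SQLite.
windowed = conn.execute(
    "SELECT id, value * 1.0 / MAX(value) OVER () FROM demo ORDER BY id"
).fetchall()

# Join form: compute the maximum once in a one-row subquery and cross join it.
joined = conn.execute(
    "SELECT d.id, d.value * 1.0 / m.max_value "
    "FROM demo d CROSS JOIN (SELECT MAX(value) AS max_value FROM demo) m "
    "ORDER BY d.id"
).fetchall()

for row in windowed:
    print(row)
```

Both queries return one ratio per input row, with the row holding the maximum mapped to 1.0.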

Related

Accessing the 2nd element in a varray column

Let's say I have a table with a varray column, defined as follows:
create or replace TYPE VARRAY_NUMBER_LIST AS VARRAY(15) OF NUMBER;
Now, I'm trying to select the first element of each varray column of my table. It works fine:
select (select * from table(myvarraycolumn) where rownum = 1) from mytable cc
It is returning an output like:
2
1
4
4
2
2
My issue occurs when I try to get the second element of each varray column with this SQL:
select (select * from table(myvarraycolumn) where rownum = 2) from mytable cc
In this case, every output row is null. Please let me know if I'm missing something or have misunderstood how this works.
You need to select rows 1 and 2 and then work out a way to filter out the unwanted preceding rows - one way is to use aggregation with a CASE statement to only match the second row:
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE mytable ( myvarraycolumn ) AS
SELECT SYS.ODCINUMBERLIST( 1, 2, 3 ) FROM DUAL UNION ALL
SELECT SYS.ODCINUMBERLIST( 4, 5, 6 ) FROM DUAL;
Query 1:
SELECT (
SELECT MAX( CASE ROWNUM WHEN 2 THEN COLUMN_VALUE END )
FROM TABLE( t.myvarraycolumn )
WHERE ROWNUM <= 2
) AS second_element
FROM mytable t
Results:
| SECOND_ELEMENT |
|----------------|
| 2 |
| 5 |
The original query with WHERE ROWNUM = 2 returns null for every row because of how ROWNUM is assigned. For the first row in the correlated inner query, ROWNUM is 1, so the filter WHERE ROWNUM = 2 reduces to WHERE 1 = 2; the filter is not matched and the row is discarded. The next row is then also tested against a ROWNUM of 1 (the previous row is not in the output, so it never consumed a row number), fails the test in the same way, and is discarded. This repeats ad nauseam, so all rows fail the WHERE filter and are discarded.
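The ROWNUM behaviour described above can be modelled in a few lines of plain Python. This is only an illustrative sketch, not Oracle code: the helper name select_with_rownum is made up, and the list [4, 5, 6] stands in for one varray value from the example schema.

```python
def select_with_rownum(rows, predicate):
    """Toy model of ROWNUM: each candidate row is offered the next row
    number, and that number is consumed only if the row passes the filter."""
    kept = []
    rownum = 1  # number offered to the next candidate row
    for value in rows:
        if predicate(rownum):
            kept.append((rownum, value))
            rownum += 1  # only surviving rows advance the counter
    return kept


varray = [4, 5, 6]

# WHERE ROWNUM = 2: every candidate is offered 1, so nothing survives.
equals_two = select_with_rownum(varray, lambda rn: rn == 2)

# WHERE ROWNUM <= 2: the first two rows survive with numbers 1 and 2.
first_two = select_with_rownum(varray, lambda rn: rn <= 2)

# MAX(CASE ROWNUM WHEN 2 THEN COLUMN_VALUE END): keep only row number 2.
second_element = max((v for rn, v in first_two if rn == 2), default=None)

print(equals_two)      # []
print(first_two)       # [(1, 4), (2, 5)]
print(second_element)  # 5
```

The counter never reaches 2 under the equality filter, which is why the answer's query selects ROWNUM <= 2 first and picks out the second row afterwards.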

SQL select dummy data

I am trying to generate dummy data from a SELECT statement. I can easily create one column with one dummy value, or two columns with one dummy value each, but how do I create one column with two dummy values (two rows)?
(No column name)
dummy1
dummy2
Select statements that produce one dummy value per column:
Select 'dummy'
Select 'dummy1','dummy2'
Just another option with one or multiple columns
Single Column
Select *
From (values ('Dummy1')
,('Dummy2')
) A(Dummies)
Returns
Dummies
Dummy1
Dummy2
Multiple Columns
Select *
From (values ('Dummy1',1)
,('Dummy2',2)
) A(Dummies,Value)
Returns
Dummies Value
Dummy1 1
Dummy2 2
Another option is to combine two SELECT statements with UNION (use UNION ALL instead if duplicate rows must be kept, since UNION removes them):
SELECT 'dummy1' AS [Dummies]
UNION
SELECT 'dummy2'
This will produce a single column.
Dummies
-------
dummy1
dummy2
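Both approaches above, the VALUES row constructor and UNION, can be tried without a database server. The sketch below uses Python's sqlite3 module as a stand-in for SQL Server, so two things differ from the answers: SQLite does not accept a column list on a derived-table alias such as A(Dummies), so the names are given through a CTE instead, and the square-bracket alias is dropped. It also shows the practical difference between UNION and UNION ALL when the dummy rows repeat.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# VALUES row constructor; the column name comes from the CTE's column list.
values_rows = conn.execute(
    "WITH A(Dummies) AS (VALUES ('Dummy1'), ('Dummy2')) "
    "SELECT Dummies FROM A ORDER BY Dummies"
).fetchall()

# UNION ALL keeps every row, including repeats.
union_all_rows = conn.execute(
    "SELECT 'dummy1' AS Dummies UNION ALL SELECT 'dummy1'"
).fetchall()

# Plain UNION removes duplicate rows from the combined result.
union_rows = conn.execute(
    "SELECT 'dummy1' AS Dummies UNION SELECT 'dummy1'"
).fetchall()

print(values_rows)     # [('Dummy1',), ('Dummy2',)]
print(union_all_rows)  # [('dummy1',), ('dummy1',)]
print(union_rows)      # [('dummy1',)]
```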
Select 'dummy1,dummy2' as dummy
Not sure why you'd want to though...

Pivot in SQL: count not working as expected

In my Oracle Responsys database I have a table whose records contain, among others, two columns:
status
location_id
I want to count the number of records grouped by status and location_id, and display it as a pivot table.
This seems to be the exact example that appears here
But when I use the following query:
select * from
(select status,location_id from $a$ )
pivot (count(status)
for location_id in (0,1,2,3,4)
) order by status
The values that appear in the pivot table are just the column names.
Output:
status 0 1 2 3 4
-1 0 1 2 3 4
1 0 1 2 3 4
2 0 1 2 3 4
3 0 1 2 3 4
4 0 1 2 3 4
5 0 1 2 3 4
I also tried the following:
select * from
(select status,location_id , count(*) as nbreports
from $a$ group by status,location_id )
pivot (sum(nbreports)
for location_id in (0,1,2,3,4)
) order by status
but it gives me the same result.
select status,location_id , count(*) as nbreports
from $a$
group by status,location_id
will of course give me the values I want, but as rows in a single column rather than as a pivot table.
How can I get the pivot table to have in each cell the number of records with the status and location in row and column?
Example data:
CUSTOMER,STATUS,LOCATION_ID
1,-1,1
2,1,1
3,2,1
4,3,0
5,4,2
6,5,3
7,3,4
The table data types :
CUSTOMER Text Field (to 25 chars)
STATUS Text Field (to 25 chars)
LOCATION_ID Number Field
Please check whether I have understood your requirement correctly. The example below pivots on status; you can do the same the other way round for the location column.
create table test(
status varchar2(2),
location number
);
insert into test values('A',1);
insert into test values('A',2);
insert into test values('A',1);
insert into test values('B',1);
insert into test values('B',2);
select * from test;
select status,location,count(*)
from test
group by status,location;
select * from (
select status,location
from test
) pivot(count(*) for (status) in ('A' as STATUS_A,'B' as STATUS_B))
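If PIVOT itself keeps misbehaving, the same status-by-location grid can be built with conditional aggregation, which is a different technique from the PIVOT clause used above. The sketch below runs it on the question's example data with Python's sqlite3 module standing in for Oracle (SQLite has no PIVOT). The table name reports and the loc_0 to loc_4 column names are made up for illustration, and whether Responsys accepts this form is an assumption I have not verified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# STATUS is a text field and LOCATION_ID a number field, as in the question.
conn.execute("CREATE TABLE reports (customer TEXT, status TEXT, location_id INTEGER)")
conn.executemany(
    "INSERT INTO reports VALUES (?, ?, ?)",
    [("1", "-1", 1), ("2", "1", 1), ("3", "2", 1), ("4", "3", 0),
     ("5", "4", 2), ("6", "5", 3), ("7", "3", 4)],
)

# One SUM(CASE ...) per location: each record adds 1 to the matching column only.
locations = [0, 1, 2, 3, 4]
columns = ", ".join(
    f"SUM(CASE WHEN location_id = {loc} THEN 1 ELSE 0 END) AS loc_{loc}"
    for loc in locations
)
pivot = conn.execute(
    f"SELECT status, {columns} FROM reports GROUP BY status ORDER BY status"
).fetchall()

for row in pivot:
    print(row)
```

Each output row is a status followed by five counts, one per location, so status 3 shows a 1 under locations 0 and 4 and 0 elsewhere.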

create a table of duplicated rows of another table using the select statement

I have a table with one column containing different integers.
For each integer in the table, I would like to repeat it as many times as it has digits.
For example:
12345 (5 digits):
1. 12345
2. 12345
3. 12345
4. 12345
5. 12345
I thought of doing it using with recursive t (...) as (), but I didn't manage, since I don't really understand how it works and what is happening "behind the scenes".
I don't want to use insert because I want it to be scalable and automatic for as many integers as needed in a table.
Any thoughts and an explanation would be great.
The easiest way is to join to a table with numbers from 1 to n in it.
SELECT n, x
FROM yourtable
JOIN
(
SELECT day_of_calendar AS n
FROM sys_calendar.CALENDAR
WHERE n BETWEEN 1 AND 12 -- maximum number of digits
) AS dt
ON n <= CHAR_LENGTH(TRIM(ABS(x)))
In my example I abused Teradata's built-in calendar, but that's not a good choice: the optimizer doesn't know how many rows will be returned, and since the plan must be a Product Join it might decide to do something stupid. So it is better to use a numbers table...
Create a numbers table that will contain the integers from 1 to the maximum number of digits that the numbers in your table will have (I went with 6):
create table numbers(num int)
insert numbers
select 1 union select 2 union select 3 union select 4 union select 5 union select 6
You already have your table (but here's what I was using to test):
create table your_table(num int)
insert your_table
select 12345 union select 678
Here's the query to get your results:
select ROW_NUMBER() over(partition by b.num order by b.num) row_num, b.num, LEN(cast(b.num as char)) num_digits
into #temp
from your_table b
cross join numbers n
select t.num
from #temp t
where t.row_num <= t.num_digits
I found a nice way to perform this action. Here goes:
with recursive t (num,num_as_char,char_n)
as
(
select num
,cast (num as varchar (100)) as num_as_char
,substr (num_as_char,1,1)
from numbers
union all
select num
,substr (t.num_as_char,2) as num_as_char2
,substr (num_as_char2,1,1)
from t
where char_length (num_as_char2) > 0
)
select *
from t
order by num,char_length (num_as_char) desc
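Since the question also asks what a recursive query does behind the scenes, here is a minimal sketch that solves the original task (one copy of each integer per digit) with a counter-based recursive CTE. It is a different query from the Teradata one above, which peels off one digit per step, and it runs on Python's sqlite3 module as a stand-in, so function names such as LENGTH follow SQLite rather than Teradata. The table your_table and the sample values 12345 and 678 are borrowed from the earlier answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (num INTEGER)")
conn.executemany("INSERT INTO your_table VALUES (?)", [(12345,), (678,)])

rows = conn.execute("""
    WITH RECURSIVE t (num, n, digits) AS (
        -- Seed step: runs once, one row per integer, with the counter n = 1.
        SELECT num, 1, LENGTH(CAST(ABS(num) AS TEXT))
        FROM your_table
        UNION ALL
        -- Recursive step: runs on the rows produced by the previous step and
        -- emits a copy with n + 1; it produces nothing once n reaches digits,
        -- which is what ends the recursion.
        SELECT num, n + 1, digits
        FROM t
        WHERE n < digits
    )
    SELECT n, num FROM t ORDER BY num, n
""").fetchall()

for n, num in rows:
    print(f"{n}. {num}")
```

The seed contributes the first copy of each number and every pass of the recursive step contributes one more, so 678 ends up with three rows and 12345 with five.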

ORA-01422: exact fetch returns more than requested number of rows in RETURNING INTO

I have the following sql (oracle) that removes all rows from a table except the 100 newest.
DELETE FROM my_table tab_outer
WHERE tab_outer.rowid IN (
-- Fetch rowids of rows to delete
SELECT rid FROM (
SELECT rownum r, rid FROM (
SELECT tab.rowid rid
FROM my_table tab
ORDER BY tab.created_date DESC
)
)
-- Delete everything but the 100 newest rows
WHERE r > 100
)
-- Return newest date that was removed
RETURNING max(tab_outer.created_date) INTO :latestDate
This code sometimes gives an ORA-01422: exact fetch returns more than requested number of rows, claiming that more than one row was returned into latestDate. How is this possible? The aggregate function (max) in the RETURNING INTO clause should ensure that only one row is returned, no? Could it have anything to do with the explicit use of rowid (I don't see how)?
I thought it was not possible to use aggregates in the RETURNING clause, as I had never tried it and it isn't mentioned in the documentation, but it actually works (11gR2)!
See below in PL/SQL:
SQL> CREATE TABLE my_table (created_date DATE);
Table created
SQL> INSERT INTO my_table
2 (SELECT SYSDATE + ROWNUM FROM dual CONNECT BY LEVEL <= 500);
500 rows inserted
SQL> DECLARE
2 latestDate DATE;
3 BEGIN
4 DELETE FROM my_table tab_outer
5 WHERE tab_outer.rowid IN (
6 -- Fetch rowids of rows to delete
7 SELECT rid FROM (
8 SELECT rownum r, rid FROM (
9 SELECT tab.rowid rid
10 FROM my_table tab
11 ORDER BY tab.created_date DESC
12 )
13 )
14 -- Delete everything but the 100 newest rows
15 WHERE r > 100
16 )
17 -- Return newest date that was removed
18 RETURNING max(tab_outer.created_date) INTO latestDate;
19 dbms_output.put_line(latestDate);
20 END;
21 /
06/08/14
And even in SQL*Plus (10.2.0.1.0 client, 11.2.0.3.0 database):
SQL> VARIABLE latestDate VARCHAR2(20);
SQL> DELETE FROM my_table tab_outer
2 WHERE tab_outer.rowid IN (
3 -- Fetch rowids of rows to delete
4 SELECT rid FROM (
5 SELECT rownum r, rid FROM (
6 SELECT tab.rowid rid
7 FROM my_table tab
8 ORDER BY tab.created_date DESC
9 )
10 )
11 -- Delete everything but the 100 newest rows
12 WHERE r > 100
13 )
14 -- Return newest date that was removed
15 RETURNING max(tab_outer.created_date) INTO :latestDate;
400 rows deleted.
SQL> select :latestDate from dual;
:LATESTDATE
--------------------------------------------------------------------------------
06/08/14
Can you post a complete example and your database/client version?