Insert an array of zeros into hive - sql

I have a hive table with a column of type array, how can I insert a record with say, an array of 1000 0s? I could just use a script to create an array of 1000 0s but looking for something more compact and less error prone

Use lpad for generating string of 0 with length of 1000. then use split function to split result by empty string except beginning and end of the string:
hive> select split(lpad(0,1000,0),'(?!^)(?!$)');
The result is an array of 1000 0s:
["0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0"]

A quick hack is to select 1000 rows from a table and then use collect_list.
select collect_list(res)
from (select 0 as res
from tblWithGT1000Rows
limit 1000
) t

Related

count number of times a regex pattern occurs in hive

I have a string variable stored in hive as follows
stringvar
AA1,BB3,CD4
AA12,XJ5
I would like to count (and filter on) how many times the regex pattern \w\w\d occurs. In the example, in the first row there are obviously three such examples. How can I do that without resorting to lateral views and explosions of stringvar (too expensive)?
Thanks!
You can split string by pattern and calculate size of result array - 1.
Demo:
select size(split('AA1,BB3,CD4','\\w\\w\\d'))-1 --returns 3
select size(split('AA12,XJ5','\\w\\w\\d'))-1 --returns 2
select size(split('AAxx,XJx','\\w\\w\\d'))-1 --returns 0
select size(split('','\\w\\w\\d'))-1 --returns 0
If column is null-able than special care should be taken. For example like this (depends on what you need to be returned in case of NULL):
select case when col is null then 0
else size(split(col,'\\w\\w\\d'))-1
end
Or simply convert NULL to empty string using NVL function:
select size(split(NVL(col,''),'\\w\\w\\d'))-1
The solution above is the most flexible one, you can count the number of occurrences and use it for complex filtering/join/etc.
In case you just need to filter records with fixed number of pattern occurrences or at least fixed number and do not need to know exact count then simple RLIKE without splitting is the cheapest method.
For example check for at least 2 repeats:
select 'AA1,BB3,CD4' rlike('\\w\\w\\d+,\\w\\w\\d+') --returns true, can be used in WHERE

Take % out of an SQL column

I'm trying to convert 60% -> 60 of all the columns in a table. I have tried this, but it does not work because % is an SQL operator.
UPDATE host_info set host_response_rate = replace(host_response_rate,'%', '');
But I get all the values to be NULL...
I'm using postgresql
you can use this function, this take out the last n characters of a string. if you use 1, it will take out the last digit.
SELECT RIGHT(host_response_rate, 1)
FROM ...

Transform set of data into a single column

Lets say I have this set of integers enclosed in the parenthesis (1,2,3,4,5).
Data I have:
(1,2,3,4,5)
And I would want them to be in a single column.
Expected Output:
column
--------
1
2
3
4
5
(5 rows)
How can I do this? I've tried using array then unnest but with no luck. I know I'm doing something wrong.
I need this to optimize a query that is using a large IN statement, I want to put it in a temp table then join it on the main table.
You can convert the string to an array, then do the unnest:
select *
from unnest(translate('(1,2,3,4,5)', '()', '{}')::int[]);
The translate() call converts '(1,2,3,4,5)' to '{1,2,3,4,5}' which is the string representation of an array. That string is then cast to an array using ::int[].
You don't need a temp table, you can directly join to the result of the unnest.
select *
from some_table t
join unnest(translate('(1,2,3,4,5)', '()', '{}')::int[]) as l(id)
on t.id = l.id;
Another option is to simply use that array in a where condition:
select *
from some_table t
where t.id = any (translate('(1,2,3,4,5)', '()', '{}')::int[]);

How to limit the maximum display length of a column in PostgreSQL

I am using PostgreSQL, and I have a column in a table which contains very long text. I want to select this column in a query, but limit its display length.
Something like:
select longcolumn (only 10 chars) from mytable;
How would I do this?
What you can do is use PostgreSQL's substring() method. Either one of the two commands below will work:
SELECT substring(longcolumn for 10) FROM mytable;
SELECT substring(longcolumn from 1 for 10) FROM mytable;
Slightly shorter:
SELECT left(longcolumn, 10) from mytable;

determine DB2 text string length

I am trying to find out how to write an SQL statement that will grab fields where the string is not 12 characters long. I only want to grab the string if they are 10 characters.
What function can do this in DB2?
I figured it would be something like this, but I can't find anything on it.
select * from table where not length(fieldName, 12)
From similar question DB2 - find and compare the lentgh of the value in a table field - add RTRIM since LENGTH will return length of column definition. This should be correct:
select * from table where length(RTRIM(fieldName))=10
UPDATE 27.5.2019: maybe on older db2 versions the LENGTH function returned the length of column definition. On db2 10.5 I have tried the function and it returns data length, not column definition length:
select fieldname
, length(fieldName) len_only
, length(RTRIM(fieldName)) len_rtrim
from (values (cast('1234567890 ' as varchar(30)) ))
as tab(fieldName)
FIELDNAME LEN_ONLY LEN_RTRIM
------------------------------ ----------- -----------
1234567890 12 10
One can test this by using this term:
where length(fieldName)!=length(rtrim(fieldName))
This will grab records with strings (in the fieldName column) that are 10 characters long:
select * from table where length(fieldName)=10
Mostly we write below statement
select * from table where length(ltrim(rtrim(field)))=10;