How to get maximum size used by a string in Hive? - sql

I want to know the maximum length a particular string column is taking.
I tried the approach mentioned here:
how to get the max size used by a field in table
but that did not work in Hive.

That example uses len; in Hive, use length instead:
select max(length(mycolumn)) from mytable;
This works fine in HiveQL.

You can also check the column length in a WHERE clause:
select max(length(column_name)), min(length(column_name)) from table_name where length(column_name) < 15
This returns the maximum and minimum lengths of the column, and the WHERE clause restricts the check to values shorter than 15 characters.

Related

How to get value string with regexp in bigquery

Hi, I have a string in a BigQuery column like this:
cancellation_amount: 602000
after_cancellation_transaction_amount: 144500
refund_time: '2022-07-31T06:05:55.215203Z'
cancellation_amount: 144500
after_cancellation_transaction_amount: 0
refund_time: '2022-08-01T01:22:45.94919Z'
I am already using this logic to get cancellation_amount:
regexp_extract(file,r'.*cancellation_amount:\s*([^\n\r]*)')
but the output is only the amount 602000. I need 602000 and 144500 as output, in different columns.
I would appreciate any help.
If the lines in your input (which will eventually become columns) are fixed, you can use multiple regexp_extract calls to get all the values.
SELECT
  regexp_extract(file, r'cancellation_amount:\s*([^\n\r]*)') AS cancellation_amount,
  regexp_extract(file, r'after_cancellation_transaction_amount:\s*([^\n\r]*)') AS after_cancellation_transaction_amount
FROM table_name
One issue I found with your regular expression is that .*cancellation_amount won't match after_cancellation_transaction_amount.
There is also a function called regexp_extract_all, which returns all the matches as an array that you can later explode into columns, but if you have a finite set of values, separating them into different columns would be easier.
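To illustrate the "all matches as an array" idea outside BigQuery, here is a minimal Python sketch. It uses Python's re.findall as a stand-in for regexp_extract_all (the two regex engines are not identical), and the variable name file_text is my own assumption. The sample text is the one from the question.

```python
import re

# Sample text copied from the question; `file_text` is a hypothetical name.
file_text = """cancellation_amount: 602000
after_cancellation_transaction_amount: 144500
refund_time: '2022-07-31T06:05:55.215203Z'
cancellation_amount: 144500
after_cancellation_transaction_amount: 0
refund_time: '2022-08-01T01:22:45.94919Z'
"""

# (?m) makes ^ match at the start of every line, so only lines that begin
# with the exact key are captured.
cancellation_amounts = re.findall(
    r"(?m)^cancellation_amount:\s*(\d+)", file_text
)
after_amounts = re.findall(
    r"(?m)^after_cancellation_transaction_amount:\s*(\d+)", file_text
)

print(cancellation_amounts)
print(after_amounts)
```

Each list holds one entry per occurrence of the key, which is the shape you would then spread into separate columns.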

Need to get a value till precision of 15 digit after decimal in DB2 Query

One of the column values in my DB2 table appears as 0.3901369869709015, and we need to compare it against my expected value of 0.390136986970901. I tried to get the value from DB2 using the DECIMAL/DEC function, but the value gets rounded and appears as 0.390136986970902. Could you please help me correct the query below, which I am using to extract data from my DB2 table?
SELECT DECIMAL(UV_FIELDSCOREMAP,15,15) AS UV_FIELDSCOREMAP From cvsinst.uv_occ WHERE CASEID = '20170720'
Use TRUNCATE(UV_FIELDSCOREMAP, 15) instead.
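The difference between rounding and truncating at 15 decimal places can be checked with a small Python sketch. This only mirrors the arithmetic using Python's decimal module; it is not DB2 code, and it assumes the stored value is exactly the 16-digit number quoted in the question.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

# The value from the question, kept as an exact decimal (an assumption:
# the real column may hold a binary floating-point approximation).
value = Decimal("0.3901369869709015")
step = Decimal("1e-15")  # 15 digits after the decimal point

rounded = value.quantize(step, rounding=ROUND_HALF_UP)   # what rounding gives
truncated = value.quantize(step, rounding=ROUND_DOWN)    # what truncation gives

print(rounded)
print(truncated)
```

The 16th digit is 5, so rounding bumps the 15th digit from 1 to 2, while truncation simply drops it.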

group_concat max length in Postgres?

Is there a way to set the group_concat max length in Postgres, like in the SQL example below?
SET group_concat_max_len=15000000;
Many thanks.
No, there is no such configuration parameter.
The maximum length of the result of string_agg() (or any other string concatenation) is limited by the maximum length of a text value in Postgres, which is 1GB.

How to find MAX() value of character column?

We have a legacy table where one of the columns, part of a composite key, was manually filled with values:
code
------
'001'
'002'
'099'
etc.
Now we have a feature request in which we must know MAX(code) in order to give the user the next possible value; in the example above, the next value is '100'.
We experimented with this, but we still can't find any reasonable explanation of how the DB2 engine calculates that
MAX('001', '099', '576') is '576'
MAX('099', '99', 'www') is '99' and so on.
Any help or suggestion would be much appreciated!
You already have the answer for getting the maximum numeric value, so this answers the other part, regarding 'www', '099' and '99'.
The AS/400 uses EBCDIC to store values. This differs from ASCII in several ways; the most important for your purposes is that alphabetic characters come before digits, which is the opposite of ASCII.
So for your MAX(), your 3 strings are sorted and the highest EBCDIC value is used:
'www'
'099'
'99 '
As you can see, your '99' string is really '99 ', so it is higher than the one with the leading zero.
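The ordering above can be reproduced with a short Python sketch. It assumes the system uses EBCDIC code page 37 (Python's cp037 codec); the actual CCSID on a given AS/400 may differ, and ebcdic_key is a helper name of my own.

```python
# The three values from the question, with '99' padded to the column width.
values = ["www", "099", "99 "]

def ebcdic_key(text):
    # Compare by EBCDIC byte values (code page 37 is an assumption).
    return text.encode("cp037")

ebcdic_order = sorted(values, key=ebcdic_key)  # letters sort before digits
ascii_order = sorted(values)                   # digits sort before letters

print(ebcdic_order, max(values, key=ebcdic_key))
print(ascii_order, max(values))
```

In EBCDIC the maximum is '99 ', while the same comparison in ASCII would pick 'www'.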
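Cast it to int before applying MAX().
For the numeric maximum, filter out the non-numeric values and cast to a numeric type for aggregation:
SELECT MAX(INT(FLD1))
FROM MYTABLE -- placeholder; the original answer omitted the table name
WHERE FLD1 <> ' '
-- Map every digit to a blank; only all-digit values compare equal to a blank.
-- (The original translated digits to themselves, which filters nothing.)
AND TRANSLATE(FLD1, ' ', '0123456789') = ' '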
SQL Reference: TRANSLATE
And the reasonable explanation:
SQL Reference: MAX
MAX works correctly for your column's type, which is character. If you want the maximum of integer values, convert the values to integers before calling MAX. But I see you are mixing in the string 'www'; how do you imagine that should work?
Filter for integer-only values, cast them to int, and call MAX. This is not a well-designed solution, but looking at your problem I think it is enough.
Sharing the solution for PostgreSQL that worked for me.
Suppose temporary_id is of a character type in the database. The query below converts the char type to int in its result.
SELECT MAX(CAST (temporary_id AS Integer)) FROM temporary
WHERE temporary_id IS NOT NULL
I applied the MAX() aggregate function because of my requirement. You can remove it, and the cast will work the same way on each row.

How do I sort a VARCHAR column in PostgreSQL that contains words and numbers?

I need to order a select query by a varchar column, using numerical and text order. The query will be run from a Java program, using JDBC over PostgreSQL.
If I use ORDER BY in the select statement I obtain:
1
11
2
abc
However, I need to obtain:
1
2
11
abc
The problem is that the column can also contain text.
This question is similar (but targeted for SQL Server):
How do I sort a VARCHAR column in SQL server that contains words and numbers?
However, the solution proposed did not work with PostgreSQL.
Thanks in advance, regards,
I had the same problem and the following code solves it:
SELECT ...
FROM table
order by
CASE WHEN column < 'A'
THEN lpad(column, size, '0')
ELSE column
END;
The size variable is the length of the varchar column, e.g. 255 for varying(255).
You can use a regular expression to do this kind of thing:
select THECOL from ...
order by
case
when substring(THECOL from '^\d+$') is null then 9999
else cast(THECOL as integer)
end,
THECOL
First, you use a regular expression to detect whether the content of the column is a number or not. In this case I use '^\d+$', but you can modify it to suit the situation.
If the regexp doesn't match, return a big number so that this row falls to the bottom of the order.
If the regexp matches, convert the string to a number and sort on that.
After this, sort normally by the column itself.
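The same steps can be sketched in Python to check the resulting order. This is only an illustration of the idea, not PostgreSQL code: natural_key is a hypothetical helper, and instead of the fixed 9999 it uses a flag that puts text after every number.

```python
import re

def natural_key(value):
    # All-digit strings sort first, by numeric value; everything else
    # sorts afterwards, by plain string comparison.
    if re.fullmatch(r"\d+", value):
        return (0, int(value), value)
    return (1, 0, value)

rows = ["1", "11", "2", "abc"]  # the values from the question
ordered = sorted(rows, key=natural_key)

print(ordered)
```

Using a flag rather than a large constant avoids the case where a number greater than 9999 would end up sorted after the text rows.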
I'm not aware of any database having a "natural sort" like the one available in PHP. All I've found is various functions:
Natural order sort in Postgres
Comment in the PostgreSQL ORDER BY documentation