I have the following problem in Hive: I have a table stored as a Textfile in which every field is of STRING type. I want to convert this table into an ORC table, but some of the STRING fields must be cast to DECIMAL with a scale of 3 (three digits after the decimal point). The problem is that the decimal separator is not present in the source string, so I am looking for a way to tell Hive to put it 3 positions before the end of the string :-).
My HiveQL commands look like this:
CREATE TABLE my_orc_table (entry1 STRING, entry2 DECIMAL(10,3)) STORED AS ORC;
INSERT INTO TABLE my_orc_table SELECT * FROM my_text_table;
So the problem is that if I have 00050000 in entry2 of my text table, I want to obtain 50.0 in my ORC table. At the moment I get 50000 (I suppose Hive puts the decimal separator at the end of my string, which is logical, but not what I am looking for).
I tried to google a bit but I did not really find the solution.
Thank you :-) !
What about:
select cast(entry2 AS DECIMAL)/1000.0
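The arithmetic behind this suggestion can be checked outside Hive. The sketch below is not Hive code: it uses Python's decimal module as a stand-in, and the helper name implied_decimal is hypothetical. It only illustrates that shifting the decimal point three places to the left has the same effect as dividing by 1000.

```python
from decimal import Decimal

def implied_decimal(raw: str, scale: int = 3) -> Decimal:
    """Read a digit string as a number with `scale` implied decimal places."""
    # Decimal("00050000") parses to 50000; scaleb(-3) moves the decimal
    # point three places to the left, equivalent to dividing by 1000.
    return Decimal(raw).scaleb(-scale)

# The value from the question: numerically equal to 50.
print(implied_decimal("00050000"))
```

Keeping the value as a decimal rather than a float avoids binary rounding surprises, which is presumably why the target column is DECIMAL(10,3).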
Related
I have a big problem right now and I really need your help, because I can't find the right answer.
I am currently writing a script that triggers a migration process from a table with raw data (data we received from an Excel file) to a new normalized schema.
My problem is that there is a column PRICE (varchar2 datatype) with a bunch of traps. For example: 540S, 25oo , I200 , S000 .
And I need to convert it to the correct NUMBER(9,2) format so I can get: 5405, 2500, 1200, 5000 as NUMBER for the previous examples and INSERT INTO my_new_table.
Is there any way I can check every character of these strings against certain conditions and replace it?
Or is there a better way?
Thank you :)!
One of the wonderful things about Oracle, which some other DBs lack, is the TRANSLATE function:
SELECT TRANSLATE(price, 'SsIilOoxyz', '5511100') FROM t
This will convert:
S, s to 5
I, i and l to 1
O, o to 0
Remove any x, y or z from the number
The second and third arguments to TRANSLATE define which characters are mapped. If the second argument (the "from" string) is longer than the third (the "to" string), every "from" character beyond the length of the "to" string is deleted from the result. Mapping is direct, based on position:
'SsIilOoxyz',
'5511100'
Look at the columns of the characters; the character above is mapped to the character below:
S->5,
s->5,
I->1,
i->1,
l->1,
O->0,
o->0,
x->removed,
y->removed,
z->removed
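The positional mapping described above can be modeled in a few lines. This is a sketch in Python, not Oracle code; the function name oracle_style_translate is hypothetical, and it assumes the rule that the first occurrence wins when a character is repeated in the "from" string.

```python
def oracle_style_translate(text: str, from_chars: str, to_chars: str) -> str:
    """Model of positional TRANSLATE: map by index, delete unmatched chars."""
    mapping = {}
    for i, ch in enumerate(from_chars):
        if ord(ch) in mapping:
            continue  # a repeated "from" character keeps its first mapping
        # Characters with no counterpart in to_chars map to None (deleted).
        mapping[ord(ch)] = to_chars[i] if i < len(to_chars) else None
    return text.translate(mapping)

# The sample values from the question.
for raw in ["540S", "25oo", "I200", "S000"]:
    print(raw, "->", oracle_style_translate(raw, "SsIilOoxyz", "5511100"))
```

Characters that appear in neither string, such as the ordinary digits, pass through unchanged, so the "from" string only needs to list the characters to fix or drop.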
You can use translate() along with to_number(). Your rules are not exactly clear, but something like this:
select to_number(translate(price, '0123456789IoS', '012345678910'))
from t;
This replaces I with 1, o with 0, and removes S.
I have data like this
1234500010
1234500020
1234500021
12345600010
12345600011
123456700010
123456700020
123456710010
The pattern is
1-data (variable length, a 3-7 digit number) + 2-data (any 3 digit number) + 3-data (any 2 digit number)
I want to create SQL to get 1-data only.
For example I want to get data 12345
I want the result only
1234500010
1234500020
1234500021
If I use "like",
select *
FROM data
where ID like '12345%'
I will get all the data starting with 12345, 123456 and 1234567.
If I use equals, I will only get one specific row.
Can I combine like and equals to get the result I want?
select * FROM data where data = '12345 + any 2-data(3 digit) + any 3-data(2 digit)'
Can anyone help?
Addition: Sorry that I didn't mention the data type and caused some miscommunication. The data type is CHAR. @Gordon's answer and the others are not wrong: they work for NUMBER and VARCHAR, but not for the CHAR type. Oracle's CHAR data type is fixed length, so if I input fewer characters than the declared length, the remainder is filled with spaces.
Thank you very much. I hope someone can help with this.
Since your datatype is CHAR, Gordon's answer does not work for you. CHAR pads strings shorter than the declared length with trailing spaces. You can use TRIM to fix this, as shown. Preferably, though, store numbers in the NUMBER type rather than CHAR or VARCHAR2, which will create other problems sooner or later.
select *
from data
where trim(ID) like '12345_____';
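The effect of the padding can be reproduced without an Oracle instance. The sketch below runs the two LIKE queries in SQLite through Python's sqlite3 module. Two assumptions are made for the simulation: SQLite does not pad CHAR columns, so the values are padded by hand, and the declared length of 15 is invented for illustration.

```python
import sqlite3

IDS = [
    "1234500010", "1234500020", "1234500021",
    "12345600010", "12345600011",
    "123456700010", "123456700020", "123456710010",
]

def select_ids(where_clause: str) -> list:
    """Run SELECT ... WHERE <where_clause> against space-padded IDs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (ID TEXT)")
    # Pad each value to 15 characters to imitate a CHAR(15) column;
    # the length 15 is an assumption, not taken from the question.
    conn.executemany("INSERT INTO data VALUES (?)", [(i.ljust(15),) for i in IDS])
    rows = conn.execute(
        f"SELECT trim(ID) FROM data WHERE {where_clause} ORDER BY ID"
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]

# Without TRIM the padded value is 15 characters long, so a
# 10-character pattern cannot match it.
print(select_ids("ID LIKE '12345_____'"))
# With TRIM the padding is gone and only the 10-character IDs match.
print(select_ids("trim(ID) LIKE '12345_____'"))
```

Each underscore matches exactly one character, which is why trailing spaces count against the pattern length.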
I think you want:
select *
from data
where ID like '12345_____'; -- exactly 5 underscores
Here is a rextester demonstrating the answer.
You really can't combine equality and LIKE. But you can use a regular expression to do this kind of searching, with the REGEXP_LIKE function:
SELECT *
FROM DATA
WHERE REGEXP_LIKE(ID, '^12345[0-9]{3}[0-9]{2}$');
But if I understand correctly, for your 1-data you really want a 3 to 7 digit number:
SELECT *
FROM DATA
WHERE REGEXP_LIKE(ID, '^[0-9]{3,7}[0-9]{3}[0-9]{2}$');
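REGEXP_LIKE succeeds when the pattern matches anywhere in the value, so whether the pattern ends with `$` decides if longer IDs slip through. The sketch below uses Python's re module as a stand-in for Oracle's regular expressions (an assumption; the two engines differ in details, though not for this simple pattern) and compares the 12345 pattern with and without the end anchor.

```python
import re

IDS = [
    "1234500010", "1234500020", "1234500021",
    "12345600010", "12345600011",
    "123456700010", "123456700020", "123456710010",
]

# Same pattern twice; only the trailing "$" differs.
UNANCHORED = re.compile(r"^12345[0-9]{3}[0-9]{2}")
ANCHORED = re.compile(r"^12345[0-9]{3}[0-9]{2}$")

def matching(pattern, ids=IDS):
    """Return the IDs for which the pattern is found."""
    return [i for i in ids if pattern.search(i)]

# Without "$", an ID such as 12345600010 still matches: 12345, then 600,
# then 01, with the final 0 simply left over.
print(matching(UNANCHORED))
# With "$", only IDs that are exactly 10 digits long remain.
print(matching(ANCHORED))
```

Note that an end anchor makes trailing spaces significant, so a space-padded CHAR value would need trimming first.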
Oracle regular expression docs here
SQLFiddle here
Best of luck.
I think this gives you the solution you want:
create table data(ID number(15));
insert into data values(1234500010);
insert into data values(1234500020);
insert into data values(1234500021);
insert into data values(12345600010);
insert into data values(12345600011);
insert into data values(123456700010);
insert into data values(123456700020);
insert into data values(123456710010);
select * from data where ID like '12345_____';
-- After 12345 come exactly 5 underscores: 3 for any 2-data digits and 2 for any 3-data digits
Output:
ID
1234500010
1234500020
1234500021
3 rows returned in 0.00 seconds
Do you know how to format the output of a number in Hive with thousands separators? For example:
data:146452664
output:146,452,664
I use this in Teradata, but don't know how to achieve in Hive.
cast(cast(cast(number as integer) as format'ZZZ,ZZZ,ZZZ,ZZ9') as char(11))
Use the format_number() function.
select format_number(146452664,0)
The first argument is the number and the second, D, is the number of decimal places to round to. If D is 0, the result has no decimal point or fractional part.
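For comparison, the same grouping can be sketched with Python's format mini-language. This is only an illustration of the expected output shape, not Hive code; the helper name format_number_like is hypothetical, and rounding of edge cases may differ from Hive's.

```python
def format_number_like(value: float, places: int) -> str:
    """Group digits with commas and keep `places` digits after the point."""
    # "," requests thousands separators; ".{places}f" fixes the scale.
    return f"{value:,.{places}f}"

print(format_number_like(146452664, 0))  # the example from the question
print(format_number_like(1234.5678, 2))
```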
Is there a way to format a number with a comma as the thousands separator?
According to IBM documentation, this is the syntax:
DECIMAL(:newsalary, 9, 2, ',')
newsalary is the string (the field to convert)
9 is the precision
2 is the scale
, is the delimiter.
I tried:
SELECT DECIMAL ( T1.FIELD1 , 15 , 2 , "," ) AS TOTAL FROM TABLE T1
When trying it, I am getting the following error:
Message: [SQL0171] Argument 4 of function DECIMAL not valid.
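DECIMAL converts from a string type to a numeric type. The fourth argument names the character used as the decimal point in the input string; it does not add a thousands separator to the result.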
Numeric types don't have separators; only character representations of numbers have separators.
Which tool are you using: STRSQL, Run SQL Scripts, or something else? Once you convert the string to a number, the tool should add the language-appropriate separators when it displays the numeric data. For example, in STRSQL:
select decimal('12345.67', 12,2) as mynum
from sysibm.sysdummy1
Returns:
MYNUM
12,345.67
Using SQL to format strings is usually a bad idea. That should be left to whatever is consuming the data.
But if you really, really, really want to do it, you should create a user-defined function (UDF) that does it for you. Here's an article, Make SQL Edit the Way You Want It To, that includes source for an EDITDEC function written in ILE RPG along with the SQL function definition you need to use it in an SQL statement.
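The distinction this answer draws between a numeric value and its character representation can be shown in a short sketch. It uses Python's decimal module as a stand-in for a consuming application (an assumption; nothing here is DB2 code): the number itself carries no separators, and they appear only when the value is rendered as text.

```python
from decimal import Decimal

# The numeric value: parsed from a plain string, no separators stored.
amount = Decimal("12345.67")

# The character representation: separators are added only at display time.
display = f"{amount:,.2f}"

print(repr(amount))
print(display)
```

This mirrors the advice above: convert to a numeric type in SQL and leave the separators to whatever displays the data.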
I am looking for a way to take data from one table and manipulate it and bring it to another table using an SQL query.
I have a Column called NumberStuff that has data like this in it:
INC000000315482
I need to cut off the INC portion of the number and convert it into an integer and store it into a Column in another table so that it ends up looking like this:
315482
Any help would be much appreciated!
Another approach is to use the REPLACE function, either in T-SQL or as a Derived Column expression in SSIS.
TSQL
SELECT REPLACE(T.MyColumn, 'INC', '') AS ReplacedINC
SSIS
REPLACE([MyColumn], "INC", "")
This removes the character based data. It then becomes an optional exercise in converting to a numeric type before storing it to the target table or letting the implicit conversion happen.
Simplest version of what you need.
select cast(right(column,6) as int) from table
Are you doing this in an SSIS statement? Is the number always the last 6 characters?
This is a little less dependent on your formatting: it handles any length by trimming the first 3 characters, and the cast drops the leading zeros.
select cast(SUBSTRING('INC000000315482',4,LEN('INC000000315482') - 3) as int)
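The two prefix-stripping approaches from these answers can be tried side by side. The sketch below runs them in SQLite through Python's sqlite3 module, which is an assumption made for runnability: SQLite spells the functions REPLACE and SUBSTR and uses INTEGER, whereas the answers target T-SQL and SSIS. The helper name strip_prefix_to_int is hypothetical.

```python
import sqlite3

def strip_prefix_to_int(value: str) -> tuple:
    """Return (via REPLACE, via SUBSTR) integer results for one value."""
    conn = sqlite3.connect(":memory:")
    row = conn.execute(
        # REPLACE removes the literal text 'INC'; SUBSTR(x, 4) keeps
        # everything from the 4th character on. The cast to INTEGER
        # then discards the leading zeros.
        "SELECT CAST(REPLACE(?, 'INC', '') AS INTEGER), "
        "CAST(SUBSTR(?, 4) AS INTEGER)",
        (value, value),
    ).fetchone()
    conn.close()
    return row

print(strip_prefix_to_int("INC000000315482"))
```

Unlike taking the last 6 characters, both variants keep working when the numeric part grows to 7 or more significant digits.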