I am loading a file that has an amount column containing values like 123,56€.
When I load it into a Hive table, the euro symbol is replaced by a square box.
The second issue is that the comma is the decimal separator.
I want a regex that converts this value to 123.56, i.e. replaces the comma with a dot and removes the euro symbol.
Try this:
regexp_extract(regexp_replace('123,56€',',','.' ),'([0-9.]+)', 1)
This will give 123.56
hive> select translate('123,56€',',€','.');
OK
123.56
And if you have unknown currency symbols:
hive> select translate('123,56€',translate('123,56€','1234567890',''),'.');
OK
123.56
hive> select regexp_replace('123,56€','(\\d+),(\\d+).','$1.$2');
OK
123.56
And you probably want it as a number:
hive> select cast(regexp_replace('123,56€','(\\d+),(\\d+).','$1.$2') as decimal(12,2));
OK
123.56
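For checking the same normalization outside Hive, here is a minimal Python sketch of the replace-then-extract approach from the first answer. The function name parse_amount is hypothetical, and the sketch assumes the value has a single decimal comma and no thousands separators.

```python
import re
from decimal import Decimal

def parse_amount(raw: str) -> Decimal:
    """Convert a string like '123,56€' to Decimal('123.56')."""
    # Swap the decimal comma for a dot, mirroring the regexp_replace call.
    dotted = raw.replace(",", ".")
    # Keep only the first run of digits and dots, mirroring regexp_extract.
    match = re.search(r"[0-9.]+", dotted)
    if match is None:
        raise ValueError(f"no numeric part in {raw!r}")
    return Decimal(match.group(0))

print(parse_amount("123,56€"))  # prints 123.56
```

Because only digits and dots are kept, it does not matter whether the currency symbol arrives as € or as a replacement character.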
I created a table with numeric values like 9,35 or 10,5 in it. The data type is float. A shortened version of the table looks like this:
Currency | Euro | 2018 |
USD | 1 | 9,35 |
Now I want to update my table and replace every comma (,) with a dot (.).
I tried it with this code:
update dbo.[Table]
set [2018] = replace([2018], ',','.')
It says that 24 rows are affected, but when I refresh my table nothing has changed.
If I use this code:
select replace ([2018],',','.') from dbo.[Table]
Then it works fine, but it doesn't update my table...
Numeric columns do not contain a separator; a separator is only added when the data is displayed. Your client was probably set up with a culture that uses a comma instead of a period as the decimal separator when it displays data. The comma is not stored with the value.
But, all you need to do is specify the format when you display the data, meaning in a report, form, app, whatever. That's where you specify how to format the values.
I would not format the data in the actual SQL query (e.g. converting the data to a string and specifying the format), since it makes it harder to do aggregations and other numeric operations on the client, and takes up more space in memory (which may not be a problem until you get to a massive scale).
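To illustrate the point, here is a small Python sketch in which the stored number has no separator and the separator is chosen only at display time. The helper format_amount is a hypothetical name used for illustration and is not part of SQL Server.

```python
value = 9.35  # the stored number carries no separator of its own

def format_amount(number: float, decimal_sep: str = ".") -> str:
    # Formatting is a display concern: pick the separator when rendering.
    return f"{number:.2f}".replace(".", decimal_sep)

print(format_amount(value))       # prints 9.35
print(format_amount(value, ","))  # prints 9,35
```

The same value renders either way, which is why replacing ',' with '.' inside a float column has no visible effect.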
I have a column named ANNUALINCOME with the VARCHAR2 data type. In this column, I have values like $25,000, $67,000, etc. I want to get rid of both the $ sign and the comma. With a single REPLACE call I can only get rid of the dollar sign. How do I get rid of the comma in the same query?
I tried doing it with the following query but I'm unable to get rid of comma.
SELECT REPLACE(ANNUALINCOME,'$','') AS column_variable FROM table_name;
It got rid of the $ sign and showed me the output as 50,000. But now I want to remove comma as well to make it 50000.
Nest the REPLACE calls:
SELECT REPLACE(REPLACE(ANNUALINCOME,'$',''), ',', '') AS column_variable
FROM table_name;
You can use TO_NUMBER:
SELECT TO_NUMBER(value, '$999G999G999D99') AS amount
FROM table_name;
Note: this has advantages over REPLACE in that it only requires a single function call and, more importantly, it converts the value from a string to a number so that if you sort the data then it will be sorted numerically and not alpha-numerically.
Note 2: You should store ANNUALINCOME as a NUMBER data type, not as a formatted string, and apply the appropriate formatting with TO_CHAR when you want to display the value, rather than doing it the other way round.
Which, for the sample data:
CREATE TABLE table_name (value) AS
SELECT '$25,000' FROM DUAL UNION ALL
SELECT '$67,000' FROM DUAL
Outputs:
AMOUNT
25000
67000
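The sorting point in the first note can be seen in a short Python sketch. The helper parse_income is a hypothetical name, and the value '$9,000' is an extra sample added here for illustration; it is not in the original data.

```python
def parse_income(raw: str) -> int:
    # Strip the dollar sign and the thousands separators, then convert.
    return int(raw.replace("$", "").replace(",", ""))

# '$9,000' is a made-up value added to show the difference in ordering.
values = ["$25,000", "$67,000", "$9,000"]

print(sorted(values))                    # string sort puts '$9,000' last
print(sorted(values, key=parse_income))  # numeric sort puts '$9,000' first
```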
For example, 18.45 should become 00000000001845000.
Assume the data type is NUMBER(x,5), so the last five digits are the decimal places.
Another option is to use the V format model element; from the documentation:
Element | Example | Description
V | 999V99 | Returns a value multiplied by 10^n (and if necessary, round it up), where n is the number of 9's after the V.
So you can do:
select to_char(18.45, '000000000000V00000') from dual;
TO_CHAR(18.45,'000000000000V00000')
-----------------------------------
00000000001845000
or without the leading space (which is a placeholder for a minus sign in case there are negative values):
select to_char(18.45, 'FM000000000000V00000') from dual;
TO_CHAR(18.45,'FM000000000000V00000')
-------------------------------------
00000000001845000
You can also multiply the given number by 100000:
SELECT TO_CHAR(18.45 * 100000, '00000000000000000') FROM DUAL;
This should do it:
SELECT REPLACE(TO_CHAR(18.45, 'FM000000000000D00000', 'NLS_NUMERIC_CHARACTERS=''.,'''), '.', '') FROM DUAL;
The NLS_NUMERIC_CHARACTERS setting makes sure the decimal separator is a . regardless of how the session is configured. This way it is safe to remove it from the resulting string with the REPLACE function.
The FM is used to suppress the leading space character.
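The arithmetic behind these answers (shift the decimal point five places, then zero-pad to 17 digits) can be sketched in Python as follows. The name to_fixed_width and its defaults are assumptions taken from the example in the question, not an Oracle API.

```python
from decimal import Decimal

def to_fixed_width(value: str, scale: int = 5, width: int = 17) -> str:
    # Shift the decimal point `scale` places to the right (18.45 -> 1845000).
    shifted = Decimal(value).scaleb(scale)
    # Zero-pad the resulting integer to the full field width.
    return f"{int(shifted):0{width}d}"

print(to_fixed_width("18.45"))  # prints 00000000001845000
```

Note that int() truncates any digits beyond the fifth decimal place instead of rounding them, which differs from the V format element.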
The query below outputs 1642575.0, but I only want 1642575 (just the number, without the decimal point and the zero following it). The number of delimited values in the field varies. The only constant is that there is always exactly one number with a decimal point. I was trying to write a regexp function to extract the number between " and .
How would I revise my regexp_extract function to get the desired output? Thank you!
select regexp_extract('{"1244644": "1642575.0", "1338410": "1650435"}','([1-9][0-9]*[.][0-9]+)&*');
You can cast the result to bigint.
select cast(regexp_extract('{"1244644": "1642575.9", "1338410": "1650435"}','([1-9][0-9]*[.][0-9]+)&*') as bigint) col;
output - 1642575
You can use round if you want to round it off.
select round(regexp_extract('{"1244644": "1642575.9", "1338410": "1650435"}','([1-9][0-9]*[.][0-9]+)&*')) col;
output - 1642576
Use this regexp: '"(\\d+)\\.' - means double-quote, capturing group with one or more digits, dot.
select regexp_extract('{"1244644": "1642575.9", "1338410": "1650435"}','"(\\d+)\\.',1)
Result:
1642575
To skip any number of leading zeroes, use this regexp: '"0*(\\d+)\\.'
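The same pattern can be tried outside Hive with Python's re module, which uses a single backslash inside a raw string. The function name integer_part is hypothetical, and returning an empty string when nothing matches is a choice made for this sketch.

```python
import re

payload = '{"1244644": "1642575.9", "1338410": "1650435"}'

def integer_part(text: str) -> str:
    # A double quote, optional leading zeroes, captured digits, a literal dot.
    match = re.search(r'"0*(\d+)\.', text)
    return match.group(1) if match else ""

print(integer_part(payload))  # prints 1642575
```

The quoted keys and the integer-only value are skipped because none of them is followed by a dot.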
(sorry for my poor english)
If you try this select operation over a sqlite database:
SELECT column AS 'alias 1' FROM table;
You get the expected column name:
alias 1
--------
result 1
result 2
But if your alias contains a dot ".", you get the wrong column name:
SELECT column AS 'alias.1' FROM table;
1
--------
result 1
result 2
(everything up to and including the dot is omitted from the column name)
Wow...
It's weird...
Can anyone help me?
Thank you very much.
UPDATE:
Maybe it's just a bug in SQLiteStudio (the software where I'm testing my queries) and in Qt: neither of them expects dots in alias names, but SQLite allows them.
Enclose your alias in double quotes.
SELECT 'test' AS "testing.this"
Output:
| testing.this |
test
Updated:
Double quotes are used to enclose identifiers in SQL, not single quotes. Single quotes are only for strings. In this case you are trying to ensure that "testing.this" is used as is and not confused as testing.this (testing table this column).
http://www.sqlite.org/faq.html#q24
Use backticks:
SELECT column AS `alias.1` FROM table;
Or double quotes (ANSI standard), as in the other answer:
SELECT column AS "alias.1" FROM table;
Both verified in SQLite Manager for Firefox.
Definitely working properly:
C:\Windows>sqlite3.exe
SQLite version 3.7.8 2011-09-19 14:49:19
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .mode column
sqlite> .headers on
sqlite> SELECT 'hello' AS 'alias.1';
alias.1
----------
hello
sqlite>
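To check what SQLite itself reports, independent of any GUI tool, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. It only inspects the column name returned for a double-quoted alias containing a dot.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.execute('SELECT \'hello\' AS "alias.1"')
# cursor.description holds one 7-tuple per column; item 0 is the column name.
column_name = cur.description[0][0]
print(column_name)  # prints alias.1
conn.close()
```

If the full name shows up here but not in a client tool, the truncation is happening in that tool or its driver layer.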
If you're using SQLite 3, the following query works fine with the various quoting styles used for the column aliases.
See the result below the query:
select '1' as 'my.Col1', '2' as "my.Col2", '3' as [my.Col3], '4' as [my Col4] , '5' as 'my Col5'
I've found a "fix"...
SELECT column AS '.alias.1' FROM
table;
alias.1
--------
result 1
result 2
Just another dot at the beginning...
Of course I don't like this solution...
Any other ideas?
Please try the query below; it works on Hive:
select 1 as `xxx.namewith.dot`
Here xxx. stands for any word of your choice followed by a dot,
and namewith.dot stands for the alias name that contains a dot.