Pentaho: How to replace commas with dots and vice versa (format)

I want to swap the separators in a number field's format: commas should become dots and dots should become commas.
Thanks in advance!
Value: 9,795.00
Format I used: ###,###,###.00
What I want: 9.795,00

In the end I left the format as it is (###,###,###.00), because the separators that get displayed depend on the locale. I will set a hardcoded locale value in Java, and the code that calls the report will pass a French or German locale.
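As a rough illustration of the separator swap the question asks for, here is a minimal Python sketch. It is not Pentaho's mechanism (which is locale-driven, as described above); it only shows the string-level effect. The function name swap_separators is hypothetical.

```python
def swap_separators(formatted: str) -> str:
    """Swap ',' and '.' in an already formatted number string."""
    # translate() maps every character in a single pass, so the two
    # replacements cannot overwrite each other.
    return formatted.translate(str.maketrans({",": ".", ".": ","}))


us_style = f"{9795.00:,.2f}"            # grouped with ',' and decimal '.'
swapped = swap_separators(us_style)     # grouped with '.' and decimal ','
print(us_style, "->", swapped)
```

A chained replace(",", ".").replace(".", ",") would turn every separator into a comma, which is why the sketch uses a single-pass mapping.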


REGEX_EXTRACT for specific pattern inside brackets

I'm trying to use REGEX_EXTRACT in SQL to extract certain string patterns inside brackets.
I tried this formula: REGEX_EXTRACT(column, r'\[(.*?)\]'), but a single cell can contain several bracketed strings, and this formula only extracts the contents of the first bracket.
So what I'm trying to figure out is how to extract only a specific pattern within the brackets. The pattern I'm looking for looks like this: [xx-XX]
where x can be any letter of the alphabet.
Any tips or directions would be greatly appreciated.
This should work if you always have 2 lowercase letters, followed by '-', followed by 2 uppercase letters:
\[([a-z]{2}-[A-Z]{2})\]
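To see how the stricter pattern behaves on a cell with several bracketed strings, here is a small sketch using Python's re module as a stand-in for the SQL function. The helper name extract_codes and the sample text are made up for illustration.

```python
import re

# Same pattern as in the answer: two lowercase letters, '-', two uppercase
# letters, all inside literal square brackets.
BRACKETED_CODE = re.compile(r"\[([a-z]{2}-[A-Z]{2})\]")


def extract_codes(cell: str) -> list[str]:
    """Return every bracketed xx-XX code in the cell, in order."""
    return BRACKETED_CODE.findall(cell)


sample = "[draft] Welcome page [en-US] and [de-DE] [v2]"
print(extract_codes(sample))
```

Because the pattern no longer matches arbitrary bracket contents, brackets such as [draft] or [v2] are skipped rather than returned as the first match.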

How can I add a string character based on a position in OpenRefine?

I have a column in OpenRefine, and I would like to insert a character into the string in each row at a fixed position.
For example:
I have an 8-character numeric string, 85285296, and would like to insert "-" after the fourth character: "8528-5296".
Can anyone help me find the right function in OpenRefine?
Thanks
Tzipy
The simplest approach is to just use the expression language's built-in string indexing and concatenation:
value[0,4]+'-'+value[4,8]
or more generally, if you don't know that your value is exactly 8 characters long:
value[0,4]+'-'+value[4,999]
Possible solution (not sure if it's the most straightforward):
value.replace(/(\d{4})(.+)/, "$1-$2")
This means: $1 stands for the content of the first parenthesized group in the regular expression and $2 for the content of the second one, so each value in the column is replaced with $1-$2.
Some other options:
value.splitByLengths(4,4).join("-")
value.match(/(\d{4})(\d{4})/).join("-")
value.substring(0,4)+"-"+value.substring(4,8)
I think 'splitByLengths' is the neatest, but I might use 'match' instead because it fails with an error if your starting string isn't 8 digits. That means you don't accidentally process data that doesn't conform to your assumption about what is in the column. With any of the others, you could use a facet/filter to check this first.
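The "fail on unexpected input" idea from the last answer can be sketched outside OpenRefine as well. The following Python version is only an analogue of the GREL 'match' approach, not GREL itself; the function name hyphenate is hypothetical.

```python
import re


def hyphenate(value: str) -> str:
    """Insert '-' after the fourth digit of an exactly 8-digit string."""
    match = re.fullmatch(r"(\d{4})(\d{4})", value)
    if match is None:
        # Refuse to touch values that break the 8-digit assumption.
        raise ValueError(f"expected exactly 8 digits, got {value!r}")
    return "-".join(match.groups())


print(hyphenate("85285296"))
```

Plain slicing (value[:4] + "-" + value[4:]) would be shorter, but it would also silently rewrite values that are too short, too long, or not numeric.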

SQL cannot search

In my SQL table Image, when I run the search query
SELECT * FROM Image WHERE platename LIKE 'WDD 666'
it returns no results (searching on other columns works without problems).
All the column data was inserted by C# code. (If I enter the data manually, the search works.)
Now I suspect that the text WDD 666 isn't made up of English-alphabet characters. Is this possible?
In C#, the plate number is a string generated by a Tesseract wrapper.
What should I do to search for the plate number?
Thanks in advance, and sorry for my bad English.
Since your case matches, I'm going to rule out case sensitivity.
There may be leading or trailing blank spaces. Try this:
SELECT * FROM Image WHERE platename LIKE '%WDD 666%'
Try running this command:
SELECT '*'+plateName+'*', len(plateName)
FROM image
I suspect platename has some non-printable characters in the field.
It appears to be a CR/LF at the end of the data. You can use
UPDATE image SET plateName = replace(plateName,char(13)+char(10),'')
WHERE plateName like '%'+char(13)+char(10)+'%'
If you get a positive row count, you'll know there was CR/LF data and it was removed. If you run the select afterwards, your lengths should be 7 and 8 based on your sample data
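Since the values come from OCR output, another option is to clean them before they are inserted. The sketch below shows the idea in Python rather than the asker's C#; the function name clean_plate and the sample value are assumptions based on the CR/LF diagnosis above.

```python
def clean_plate(raw: str) -> str:
    """Drop non-printable characters (CR, LF, tabs, ...) and trim spaces."""
    printable = "".join(ch for ch in raw if ch.isprintable())
    return printable.strip()


ocr_output = "WDD 666\r\n"   # assumed shape of the OCR result
cleaned = clean_plate(ocr_output)

# repr() makes the invisible characters visible, much like wrapping the
# column in '*' does in the SELECT above.
print(repr(ocr_output), len(ocr_output))
print(repr(cleaned), len(cleaned))
```

With the trailing CR/LF removed before the insert, an exact comparison against 'WDD 666' has a chance to match without wildcards.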

Convert text with HTML character encoding to database characterset

Our application receives data from various sources. Some of these contain HTML character entities instead of regular characters. So instead of the string "â" we receive the string "&acirc;".
How can we convert "&acirc;" to a character in the database character set using SQL/PLSQL?
Our database is 10GR2.
I believe UNESCAPE_REFERENCE and ESCAPE_REFERENCE are what you're looking for:
UTL_I18N.UNESCAPE_REFERENCE('hello &lt; &#xe5;')
This returns 'hello <'||chr(229).
http://docs.oracle.com/cd/B28359_01/appdev.111/b28419/u_i18n.htm#i998992
You can use the CHR() function to convert an ascii character number to a character representation.
SELECT chr(226)
FROM dual;
CHR(226)
--------
â
For more information see: http://www.techonthenet.com/oracle/functions/chr.php
Hope it helps...
One solution:
replace(your_test, '&acirc;', chr(226))
but you'd have to nest many replace functions, one for each entity you need to replace. This might be very slow if you have to replace many.
You can write your own function, searching for the ampersand and replacing when found.
Have you searched the Oracle Supplied Packages manual? I know they have a function that does the opposite for a few entities.
To convert a column in Oracle that contains HTML to plain text, you could use:
trim(regexp_replace(UTL_I18N.unescape_reference(column_name), '<[^>]+>'))
It will replace HTML character references as stated above, but it will also remove HTML tags and remove leading and trailing spaces.
I hope it will help someone.
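If the conversion can happen before the data reaches the database, the same two steps (unescape the references, then strip tags and trim) can be sketched in Python with the standard html module. This is an application-side alternative, not the PL/SQL the question asks for; the function name html_to_text is hypothetical.

```python
import html
import re


def html_to_text(fragment: str) -> str:
    """Unescape character references, then drop tags and trim whitespace."""
    # Same order as the Oracle expression: unescape first, strip tags second.
    unescaped = html.unescape(fragment)
    without_tags = re.sub(r"<[^>]+>", "", unescaped)
    return without_tags.strip()


print(html.unescape("&acirc;"))
print(html.unescape("hello &lt; &#xe5;"))
print(html_to_text("  <p>caf&eacute;</p> "))
```

Note that unescaping first means an escaped tag such as &lt;b&gt; becomes a real tag and is then stripped as well; swap the two steps if escaped markup should survive as text.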

Delimiting User Input

What is the best character to use to delimit user input?
For example if a user has an infinite number of textboxes to type things into, but each textbox's value will be concatenated into a single database field, what is the safest character to delimit each input?
I think it should be a character not on your typical keyboard. Is there a character out there just for this?
You could use one of the ASCII control characters. There's one called "Record Separator" which has a hex value of 0x1E that might fit your needs.
Edit: Incidentally, if you want to do a proper job, you should probably ensure that \x1E is escaped in user input. One way to do this would be to use another ASCII control character: \x1B which is the "escape" control code. Thus, "\x1E" in input becomes "\x1B\x1E" and "\x1B" becomes "\x1B\x1B".
Keep in mind, of course, that because these are non-printing control codes, they can't be displayed. If you want a printable representation, you might want to go with a normal character like the comma and just escape it from input.
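The escaping scheme described in this answer could look roughly like the following Python sketch. The function names join_fields and split_fields are hypothetical; the control characters are the ones named above.

```python
RS = "\x1e"   # ASCII Record Separator, used as the delimiter
ESC = "\x1b"  # ASCII Escape, used to escape RS and itself


def join_fields(fields: list[str]) -> str:
    """Escape each field, then join the fields with RS."""
    escaped = []
    for field in fields:
        # Escape ESC first, so the ESC added in front of RS is not doubled.
        escaped.append(field.replace(ESC, ESC + ESC).replace(RS, ESC + RS))
    return RS.join(escaped)


def split_fields(stored: str) -> list[str]:
    """Reverse join_fields: split on unescaped RS and undo the escaping."""
    fields, current = [], []
    chars = iter(stored)
    for ch in chars:
        if ch == ESC:
            # The character after ESC is taken literally.
            current.append(next(chars, ""))
        elif ch == RS:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


inputs = ["first box", "has \x1e inside", "has \x1b inside", ""]
stored = join_fields(inputs)
print(split_fields(stored) == inputs)
```

The same two functions work with a printable delimiter: set RS to "," and ESC to "\\" to get the comma-with-escaping variant mentioned above.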
I guess one approach is to use a comma, and then to escape commas within the user input. It's probably not safe to assume any character (or even a sequence of characters) can't appear in user input -- if you can enter it in your code, then there's a way the user can enter it into a text box!
Normally commas or semi-colons are used for splitting data. What about | which the average user never uses?
How about a combination of keys? e.g.
|::|
so
this|::|and|::|that. Plus Those:Here and there.|::|Even this|that works
Any markup language will do for this. They're a little verbose but at least they'll be future proofing your field.
use ♥
ftw