Need to use Replace in String in Pentaho with regex

I want to use the "Replace in String" step in Pentaho version 8 with a regular expression. The search string will contain a regex, and the "Replace with" field of the step should contain "$10$3", meaning: the value of group $1, then a literal 0 appended, then the value of group $3.
Example:
Input: Ron-234-GR
Output: Ron-2340-GR
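A minimal sketch of how a "$10$3" replacement behaves, assuming the step follows java.util.regex semantics (Pentaho Data Integration is written in Java). The pattern and the class and method names are hypothetical, since the question does not show the regex that was used. In Java, "$10" is read as group 1 followed by a literal 0 whenever the pattern has fewer than ten groups.

```java
import java.util.regex.Pattern;

class ReplaceInStringSketch {
    // Hypothetical pattern (not from the question):
    //   group 1 = name, dash and number ("Ron-234")
    //   group 2 = the number alone ("234"), nested inside group 1
    //   group 3 = dash and suffix ("-GR")
    static final Pattern PARTS = Pattern.compile("^([A-Za-z]+-(\\d+))(-[A-Za-z]+)$");

    // With only three groups, "$10" cannot mean group 10,
    // so Java treats it as group 1 followed by the character '0'.
    static String insertZero(String input) {
        return PARTS.matcher(input).replaceAll("$10$3");
    }

    public static void main(String[] args) {
        System.out.println(insertZero("Ron-234-GR")); // prints Ron-2340-GR
    }
}
```

If the pattern ever grows to ten or more groups, "$10" would start referring to group 10, so keeping the group count small matters for this trick.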

Related

How to add asterisk to a list of filenames and then make it a line using Notepad++

I have a list of file names (about 4000).
For example:
A-67569
H-67985
J-87657
K-85897
...
I need to put an asterisk before and after each file name, and then join them all into a single line.
Example:
*A-67569* *H-67985* *J-87657* *K-85897* and so on...
Note that there is a space between filenames.
Forgot to mention, I'm trying to do this with Notepad++
How can I do it?
Please advise.
Thanks

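C# example that turns the list into a string with the edits applied
List<string> list = new List<string> { "A-67569", "H-67985", "J-87657", "K-85897" };
string outString = "";
foreach (string item in list)
{
    // Wrap each name in asterisks and separate the names with a space.
    outString += "*" + item + "* ";
}
// Content of outString: "*A-67569* *H-67985* *J-87657* *K-85897* "
// (note the trailing space; outString.TrimEnd() removes it)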

1. Open the Replace dialog of Notepad++ (Search > Replace...).
2. Select Extended (\n, \r, \t, \0, \x...) at the bottom of the Replace window.
3. In the Find what field enter \r\n, and in the Replace with field enter * * (asterisk, space, asterisk).
4. Click Replace All.
Note that you have to add the single asterisk before the first name and after the last name manually.
If this does not work, in step 3 try only \n or only \r instead of \r\n.

You can also use Regular expression as the Search Mode.
Find what:
(\S+)(\R|$)
Replace with:
*$1* 
Note the space after the closing asterisk; it separates the names.
For the file
A-67569
H-67985
J-87657
K-85897
Output (followed by one trailing space):
*A-67569* *H-67985* *J-87657* *K-85897*
Explanation of the regex:
(\S+) means: find one or more characters that are not whitespace.
(\R|$) means: find any line break, or the end of the file.
(\S+)(\R|$) means: find any group of non-whitespace characters that is followed by a line break or by the end of the file.
Explanation of Replace with:
When you use the $ symbol, you refer to one of the matched groups; $1 is the first group, in this case (\S+).
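For readers who would rather script this than click through the dialog, here is a rough Java sketch of the same idea. It reuses the (\S+)(\R|$) pattern from the answer, but writes a closing asterisk into the replacement so the result matches the format asked for in the question. The class and method names are made up for illustration, and Java's regex engine is only a stand-in for the one in Notepad++.

```java
import java.util.regex.Pattern;

class AsteriskJoinSketch {
    // Group 1 = one file name; group 2 = the line break after it, or end of input.
    static final Pattern NAME_AND_BREAK = Pattern.compile("(\\S+)(\\R|$)");

    // Replaces every "name + line break" with "*name* " and drops the final space.
    static String wrapAndJoin(String text) {
        return NAME_AND_BREAK.matcher(text).replaceAll("*$1* ").trim();
    }

    public static void main(String[] args) {
        String names = "A-67569\r\nH-67985\r\nJ-87657\r\nK-85897";
        System.out.println(wrapAndJoin(names));
        // prints *A-67569* *H-67985* *J-87657* *K-85897*
    }
}
```

Because \R matches \r\n, \n and \r alike, the sketch does not need to know which line ending the file uses.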

Pentaho data Integration - cut a string after comma (not in a fixed position)

In Pentaho Data Integration I need to replace part of a string that comes from an Excel Input step.
I need to delete everything from the first comma onward (the comma can be at any position), like this:
"A,b,c,f" -> "A"
"aaaaa,bbbb,cccc" -> "aaaaa"
I've tried String cut, but it only lets me cut at a fixed number of characters (not at a specific word or character, such as a comma).
In MySQL this is SUBSTRING_INDEX(technician, ',', 1), but what would it be in Pentaho?
Thanks

You can use the Replace in string step: set it to search and replace using regex, set the search regex to ([^,]*),.* and set the Replace with field to $1.
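The regex approach can be tried outside Pentaho with a few lines of Java. This is only a sketch: it assumes the step behaves like java.util.regex, it writes the quantifiers out explicitly as ([^,]*),.* and the class and method names are hypothetical.

```java
class CutAfterCommaSketch {
    // ([^,]*) captures everything before the first comma;
    // ,.* then consumes the comma and the rest of the value.
    static String beforeFirstComma(String value) {
        return value.replaceAll("([^,]*),.*", "$1");
    }

    public static void main(String[] args) {
        System.out.println(beforeFirstComma("A,b,c,f"));         // prints A
        System.out.println(beforeFirstComma("aaaaa,bbbb,cccc")); // prints aaaaa
        System.out.println(beforeFirstComma("nocomma"));         // prints nocomma
    }
}
```

A value without any comma does not match the pattern and is returned unchanged, which mirrors what SUBSTRING_INDEX(technician, ',', 1) does in MySQL.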

How to check if input string contains a letter in TCL/TK scripting?

I have a script which reads a list of integers from arguments and stores into a list then reverses its order.
I am looking for a way to check whether an input argument contains a letter, so that I can halt the program, print an error message and exit the script.
How can I check if a certain string has a letter? This letter can be uppercase or lowercase.

Try
regexp {[[:alpha:]]} $string
This returns 1 if the string contains a letter and 0 otherwise.
Documentation: regexp, Syntax of Tcl regular expressions

regex expression to find out variable type

I'm using Hive (which uses SQL-like syntax) to import files, and I'm trying to determine the type of the values in my input file with a regular expression. The type may be any of the following:
Text
Long
Double
Date
So far I've got:
For Long only: ^(^\\d*$)
For Double only: (\\d{0,2}\\.\\d{1,2})
For Date only: \\d{2}\/\\d{2}\/\\d{4}
but the problem is Text.
I thought that if none of the above patterns match, the value must be Text, so I tried this:
For Text : ([^(^\\d*$)][^(\\d{2}\/\\d{2}\/\\d{4}])
but this matches only part of the text (e.g. if the value is "updated", the expression above returns only "upd"). I can't understand why it returns only part of the string.

Got it. A simple pattern like (^[a-zA-Z]+) does the job.
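One reason the negated attempt misbehaves is that square brackets always form a character class, so [^(^\\d*$)] matches a single character that is not in the listed set rather than "anything that is not a number". A less fragile route is the one the question hints at: test the specific patterns first and fall back to Text. The sketch below shows that in plain Java rather than HiveQL. The patterns are adapted from the question, requiring a full match and at least one digit for Long are my assumptions, and all names are hypothetical.

```java
import java.util.regex.Pattern;

class ColumnTypeSketch {
    // Patterns adapted from the question; matches() anchors them to the whole value.
    static final Pattern LONG = Pattern.compile("\\d+");
    static final Pattern DOUBLE = Pattern.compile("\\d{0,2}\\.\\d{1,2}");
    static final Pattern DATE = Pattern.compile("\\d{2}/\\d{2}/\\d{4}");

    // Returns the first type whose pattern matches the whole value, else "Text".
    static String classify(String value) {
        if (LONG.matcher(value).matches()) return "Long";
        if (DOUBLE.matcher(value).matches()) return "Double";
        if (DATE.matcher(value).matches()) return "Date";
        return "Text";
    }

    public static void main(String[] args) {
        for (String v : new String[] {"12345", "12.5", "01/02/2020", "updated"}) {
            System.out.println(v + " -> " + classify(v));
        }
    }
}
```

With this ordering no separate Text regex is needed, so values that mix letters and digits also end up as Text.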

Using groups in OpenRefine regex

I'm wondering whether it is possible to use "groups" in a regex in OpenRefine's GREL syntax. I'd like to replace every dot that is preceded and followed by a character with the preceding character, the dot, a space, and then the following character.
Something like:
s.replace(/(.{1})\..({1})/,/(1).\s(2)/)

It should, but your last argument needs to be a string, not a regular expression. Internally Refine uses Java's Matcher#replaceAll method which accepts a string argument.

I think I found out how to deal with this. You need to put $X in your replacement string to refer to the Xth capture group.
It should look like this:
s.replace(/.*?(#capture group 1).*?(#capture group 2).*?/, " some text $1 some text $2 some text")
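Since the answer above says Refine relies on Java's Matcher#replaceAll, the group syntax can be checked in plain Java first. The sketch below is an illustration only: it assumes "character" means a word character (\w), and the class and method names are invented.

```java
class DotSpacingSketch {
    // (\w) before and after the dot are groups 1 and 2;
    // the replacement puts them back with ". " in between.
    static String spaceAfterInnerDots(String s) {
        return s.replaceAll("(\\w)\\.(\\w)", "$1. $2");
    }

    public static void main(String[] args) {
        System.out.println(spaceAfterInnerDots("First sentence.Second sentence."));
        // prints First sentence. Second sentence.
    }
}
```

Because each match consumes the character after the dot, back-to-back cases such as a.b.c are only partly rewritten; a lookaround pattern like (?<=\w)\.(?=\w) replaced with ". " avoids that, at the cost of not using groups.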
