How can I cut a part out of this string...
"abb.c.d+de.ee+f.xxx+qaa.+.,,s,"
... where I know the position like this:
The result is always between a "." (left side of the result) and a "+" (right side).
I know the number of "." from the left side and the number of "+" from the right side that delimit the resulting string.
The problem is the right side, because I need to count "+" from the end.
Say...
from the left side: the beginning is at the 4th "."
( this is easy ), result is =
"xxx+qaa.+.,,s,"
from the right side: the end is at the second "+" from the end!
"xxx[here]+qaa.+.,,s,"
result is =
"xxx"
I tried to do this myself with .Substring and .IndexOf, but with no success...
Any ideas? Thanks
You could use the StrReverse function to reverse the character sequence and then count + from the left (using the same method as counting the .).
To find the start of the substring, loop through the string from the left. Count the number of .s you have seen and stop when you've hit the number you want. Store the index in some variable like start.
Similarly to find the end of the substring, loop from the right and count +s.
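The two loops described above can be sketched as follows. This is a minimal Python sketch rather than VB.NET, and the function name cut_between and its parameter names are made up for illustration; the same logic maps onto IndexOf and LastIndexOf with a start position.

```python
def cut_between(text, dots_from_left, pluses_from_right):
    """Return the text after the Nth '.' from the left and before the
    Mth '+' from the right, or None if either delimiter is missing."""
    # Walk forward, jumping from one '.' to the next.
    start = -1
    for _ in range(dots_from_left):
        start = text.find(".", start + 1)
        if start == -1:
            return None

    # Walk backward, jumping from one '+' to the previous one.
    end = len(text)
    for _ in range(pluses_from_right):
        end = text.rfind("+", 0, end)
        if end == -1:
            return None

    # The delimiters may cross over each other; treat that as "no result".
    if end < start:
        return None
    return text[start + 1:end]


result = cut_between("abb.c.d+de.ee+f.xxx+qaa.+.,,s,", 4, 2)
print(result)  # xxx
```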
You can solve this problem using Regex:
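Imports System.Text.RegularExpressions
' Skip four "."-terminated chunks, capture the value, then require two "+" chunks up to the end
Dim r As New Regex("^(.*\.){4}(?<value>.*)(\+.*){2}$")
Dim m As Match = r.Match("abb.c.d+de.ee+f.xxx+qaa.+.,,s,")
Dim result As String = m.Result("${value}") ' "xxx"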
Explanation
^ Indicates the beginning of the string
(.*\.){4} This means any character (.) repeated any number of times (*) followed by a period (\.). The period has to be escaped with the backslash because otherwise the period would be the any-character wildcard. The .*\. is enclosed in (){4} to say that pattern must repeat four times.
(?<value>.*) This specifies the placeholder for the text we are after. value is the name we are assigning to it. The .* specifies that the value is any number of any characters.
(\+.*){2} This means a plus character (has to be escaped) followed by any number of any characters, repeated twice.
$ Indicates the end of the string
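One caveat with this pattern: because .* can itself match periods and plus signs, the greedy first group may skip past the 4th period when the input has more periods before the last two plus signs. A stricter variant that uses [^.]* and [^+]* pins the delimiters down. The sketch below shows both in Python (which writes named groups as (?P<value>...) instead of (?<value>...)); the second test string is an invented example, and the stricter pattern is a suggested alternative, not part of the original answer.

```python
import re

# Pattern from the answer, with Python's named-group syntax.
LOOSE = re.compile(r"^(.*\.){4}(?P<value>.*)(\+.*){2}$")

# Stricter variant (an assumption of this sketch): each skipped chunk may
# not contain its own delimiter, so exactly the 4th '.' and the 2nd '+'
# from the end are used.
STRICT = re.compile(r"^(?:[^.]*\.){4}(?P<value>.*?)(?:\+[^+]*){2}$")


def extract(pattern, text):
    m = pattern.match(text)
    return m.group("value") if m else None


original = "abb.c.d+de.ee+f.xxx+qaa.+.,,s,"
tricky = "a.b.c.d.e.x+y+z"  # hypothetical input with a 5th '.' before the pluses

print(extract(LOOSE, original), extract(STRICT, original))  # xxx xxx
print(extract(LOOSE, tricky), extract(STRICT, tricky))      # x e.x
```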
Related
I need to retrieve the bolded section (SEALS_LME_TRADES_MBL) of the string below. This value is in a column within my Postgres database table.
SEALS_LME_TRADES_MBL_20220919_00212.csv
I tried the functions substring, reverse, and strpos, but they all have limitations. It seems like regex is the best option; however, I was not able to make it work.
Essentially I need the substring from the beginning up to the second-to-last '_'. I do not want the date, the sequence number, or the file extension at the end.
The closest regex I managed to get is: ^(([^]*){4})
https://regex101.com/
This looks a little wonky, but how about this?
select substring ('SEALS_LME_TRADES_MBL_20220919_00212.csv', '^(.+)_[^_]+_[^_]+')
Translation
^ from the beginning
(.+) any characters (capture and return this value), followed by
_ an underscore, followed by
[^_]+ one or more non-underscores, followed by
_ an underscore, followed by
[^_]+ one or more non-underscores
Regex greediness will cause any incidental underscores to be captured in the initial string.
Technically speaking the last portion (one or more non-underscores) can probably be omitted.
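To check the pattern outside the database, the same expression can be tried in Python. This is only a sketch of the matching logic, not Postgres code; it also shows a non-regex alternative (splitting from the right), which is an addition of this sketch rather than something from the answer.

```python
import re

name = "SEALS_LME_TRADES_MBL_20220919_00212.csv"

# Same pattern as in the substring() call: greedy prefix, then two
# underscore-separated trailing parts.
m = re.match(r"^(.+)_[^_]+_[^_]+", name)
prefix_regex = m.group(1) if m else None

# Alternative without regex: split off the last two '_'-separated parts.
prefix_split = name.rsplit("_", 2)[0]

print(prefix_regex)  # SEALS_LME_TRADES_MBL
print(prefix_split)  # SEALS_LME_TRADES_MBL
```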
I want to take the value of
T.GS.+0.220kg
but I don't know how to remove the string.
I just want to take numbers from the weight.
like 0.220
Can someone help me ?
You can use regular expressions to extract a decimal value from almost any string. First, import the namespace:
Imports System.Text.RegularExpressions
Then using this will return just the decimal value:
Regex.Match("T.GS.+0.220kg", "\d+\.\d+").Value
This particular expression looks for one or more digits, followed by a literal point (the dot is escaped as \. so that it does not act as the any-character wildcard), followed by one or more further digits, so the earlier points (between T and G, for example) aren't included.
This returns exactly 0.220. You can then replace the literal with any string variable and assign the result as needed.
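The difference between an escaped and an unescaped dot is easy to miss, because both happen to work on this particular input. A small Python sketch (the helper name extract_decimal and the second sample string are invented for illustration) shows where they diverge:

```python
import re


def extract_decimal(text):
    """Return the first 'digits.digits' run in text, or None."""
    m = re.search(r"\d+\.\d+", text)  # \. matches only a literal point
    return m.group(0) if m else None


print(extract_decimal("T.GS.+0.220kg"))  # 0.220

# With an unescaped dot, any character between the digit runs is accepted.
loose = re.search(r"\d+.\d+", "lot 12x34")
print(loose.group(0))                # 12x34
print(extract_decimal("lot 12x34"))  # None
```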
If you haven't worked with regular expressions before and want something that looks a little nicer, you could use the String.Split method.
Dim input As String = "T.GS.+0.220kg"
input = input.Split("+"c)(1) ' grabs "0.220kg"
input = input.Substring(0, input.Length - 2) ' then strips the last 2 chars ("kg")
In English:
Split the string into separate pieces at the '+' symbol and take the piece to the right of the first '+'.
Then remove the last 2 chars from the end.
Let's say I have a database with about 50k entries in a column called content.
This column contains strings that cause problems for my further work.
The thing is, I need to clean this up for all the rows in that table.
Any ideas?
Here is an example:
'user wrote:
-----------------------------------------------------
> Some text
> that vary too much and I dont need it actually
> here is end of the text
The text I actually need.'
I would like to remove all of the unnecessary parts so that the only thing left is, in this case:
'The text I actually need.'
This should delete all lines that start with a >:
regexp_replace(textcol, E'^>.*\n', '', 'gn');
The g flag is needed to delete all such lines, and the n flag (newline-sensitive matching) makes the ^ match the position right after each line break and keeps . from matching a line break.
I use an “extended” string literal (the leading E) so that I can write a newline as \n.
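To see what the pattern does and does not remove, here is a rough Python equivalent. It is only an illustration of the matching behaviour: Python's re.MULTILINE plays the role of the ^ part of the n flag, and . already stops at line breaks by default. The sample text is shortened from the question.

```python
import re

content = (
    "user wrote:\n"
    "-----\n"
    "> Some text\n"
    "> here is end of the text\n"
    "The text I actually need."
)

# Delete every line that starts with '>' (including its line break).
cleaned = re.sub(r"^>.*\n", "", content, flags=re.MULTILINE)
print(cleaned)
```

Note that only the quoted lines are deleted. The "user wrote:" line and the dashed separator do not start with >, so they stay and would need a pattern of their own.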
I would like to write a regular expression that starts with the string "wp" and ends with the string "php" to locate a file in a directory. How do I do it?
Example file: wp-comments-post.php
This should do it for you: ^wp.*php$
Matches
wp-comments-post.php
wp.something.php
wp.php
Doesn't match
something-wp.php
wp.php.txt
^wp.*\.php$ Should do the trick.
The .* means "any character, repeated 0 or more times". The next . is escaped because it's a special character, and you want a literal period (".php"). Don't forget that if you're typing this in as a literal string in something like C#, Java, etc., you need to escape the backslash because it's a special character in many literal strings.
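As a quick check of the escaped version, the sketch below filters a list of names in Python. The list is made up from the examples in this thread plus one extra name (wpXphp) that shows why the period is escaped; for a real directory, the names could come from something like os.listdir.

```python
import re

pattern = re.compile(r"^wp.*\.php$")

names = [
    "wp-comments-post.php",
    "wp.something.php",
    "wp.php",
    "something-wp.php",  # does not start with wp
    "wp.php.txt",        # does not end with .php
    "wpXphp",            # no literal ".php" at the end
]

matches = [n for n in names if pattern.match(n)]
print(matches)  # ['wp-comments-post.php', 'wp.something.php', 'wp.php']
```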
Example:
ajshdjashdjashdlasdlhdlSTARTasdasdsdaasdENDaknsdklansdlknaldknaaklsdn
1) START\w*END
returns: STARTasdasdsdaasdEND - matches the word characters between START and END
2) START\d*END
returns: START12121212END (for an input containing that text) - matches only digits between START and END
3) START\d*_\d*END
returns: START1212_1212END (for an input containing that text) - matches digits separated by a single _ between START and END
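The first two patterns can be tried against the example string with a short Python sketch. The helper name find is invented here; it returns None when a pattern does not match, which is what happens with \d* on this input because the text between START and END is letters.

```python
import re

text = "ajshdjashdjashdlasdlhdlSTARTasdasdsdaasdENDaknsdklansdlknaldknaaklsdn"


def find(pattern, s):
    """Return the first match of pattern in s, or None."""
    m = re.search(pattern, s)
    return m.group(0) if m else None


print(find(r"START\w*END", text))  # STARTasdasdsdaasdEND
print(find(r"START\d*END", text))  # None
print(find(r"START\d*END", "xxSTART12121212ENDyy"))  # START12121212END
```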
I am making a program with Visual Basic 2010 that pulls logs of corrupted files and gives the user the location of the corrupted file(s) so they can be fixed. These logs are huge and vary depending on the amount of corruption.
My code already pulls only the lines of text that are flagged as errors, but within these lines there are directories that point to the corrupted file. I need to know if there is any way to read these directories and put them into a RichTextBox. Here is an example of a line from a log file:
oa = #0x238282b270->OBJECT_ATTRIBUTES {s:48; rd:NULL; on:[100]"\??\C:\Windows\WinSxS\amd64_3ware.inf.resources_31bf3856ad364e35_10.0.10130.0_en-us_ca9e7cc7a071e60f"; a:(OBJ_CASE_INSENSITIVE)}, iosb = #0x238282b250, as = (null), fa = 0,
And here is the part that I need to pull from it:
C:\Windows\WinSxS\amd64_3ware.inf.resources_31bf3856ad364e35_10.0.10130.0_en-us_ca9e7cc7a071e60f from this string
I'm pretty new to all of this, so bear with me please.
RegEx provides great flexibility for this sort of thing, but you need to establish a known pattern that defines where the path begins and ends. For instance, if it always is prefixed by on:[100]"\??\ and always ends with ";, then you could extract it with this RegEx pattern:
on:\[100\]"\\\?\?\\(.*?)";
Here's what the pattern means:
on:\[100\]"\\\?\?\\ - Matches must begin with on:[100]"\??\ exactly
The extra backslashes are necessary to escape all of the special characters which would otherwise have special meaning. In this case, [, ], \, and ? all have special meaning to RegEx, so they each need to be preceded by a backslash to escape them.
(.*?) - Matches can contain any number of any characters between the preceding on:[100]"\??\ and the following ";. The value of this portion of the input is captured as an unnamed group (i.e. group 1).
( - Begins a capturing group
. - Matches any character
* - Any number of times
? - Matches in a non-greedy fashion (i.e. only captures up through the first instance of whatever follows it in the pattern)
) - Ends the capturing group
"; - Matches must end with these two characters exactly
So, for instance:
Dim input As String = "oa = #0x238282b270->OBJECT_ATTRIBUTES {s:48; rd:NULL; on:[100]""\??\C:\Windows\WinSxS\amd64_3ware.inf.resources_31bf3856ad364e35_10.0.10130.0_en-us_ca9e7cc7a071e60f""; a:(OBJ_CASE_INSENSITIVE)}, iosb = #0x238282b250, as = (null), fa = 0,"
Dim m As Match = Regex.Match(input, "on:\[100\]""\\\?\?\\(.*?)"";")
If m.Success Then
Dim path As String = m.Groups(1).Value
End If
Or, if the input can contain multiple matches, you can loop through them like this:
For Each m As Match In Regex.Matches(input, "on:\[100\]""\\\?\?\\(.*?)"";")
Dim path As String = m.Groups(1).Value
Next
That's just an example. Depending upon your needs, you could adjust the RegEx pattern as necessary. RegEx is very flexible, so as long as there's some logical way to recognize where the path is in the string, it should be possible to find it with a RegEx pattern. On a side note, since the pattern is, itself, just a string, it can be stored in a configuration setting outside of the code too, which is an added benefit.
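For comparison, the same pattern can be exercised in Python. This is a sketch under the same assumption as the answer (the path is always prefixed by on:[100]"\??\ and followed by ";); the names PATH_PATTERN and extract_paths are invented for illustration. Raw strings are used so that the backslashes in both the pattern and the log line are taken literally.

```python
import re

# Literal on:[100]"\??\ , then a lazy capture of the path, then ";
PATH_PATTERN = re.compile(r'on:\[100\]"\\\?\?\\(.*?)";')


def extract_paths(log_line):
    """Return every captured path in the line (empty list if none)."""
    return PATH_PATTERN.findall(log_line)


line = (
    r'oa = #0x238282b270->OBJECT_ATTRIBUTES {s:48; rd:NULL; '
    r'on:[100]"\??\C:\Windows\WinSxS\amd64_3ware.inf.resources_31bf3856ad364e35_10.0.10130.0_en-us_ca9e7cc7a071e60f"; '
    r'a:(OBJ_CASE_INSENSITIVE)}, iosb = #0x238282b250, as = (null), fa = 0,'
)

paths = extract_paths(line)
for p in paths:
    print(p)
```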