determine position of matching word within a regular expression - sql

Situation:
I have a static dictionary and a dynamically determined regular expression.
The regular expression is somewhat limited in its application here, in that I would never use symbols that stand for a variable number of characters.
I need to create a regular expression that finds all words in the dictionary that match this type of pattern: (any letter in set: q,w,e,r,t,y,u,i or blank) & (any letter in set: q,w,e,r,t,y,u,i or blank) & (a or blank) & (s or blank) & (any letter in set: q,w or blank) & (d or blank) & (any letter in set: q,w,e or blank)
[q,e,r,t,y,u,i,null][q,e,r,t,y,u,i,null][a,null][s,null][q,w,null][d,null][q,w,e,null]
For example the word "rasw" would be valid (assuming it's in my dictionary).
Problem:
I also need one more piece of information: I need to know that this word started in the 2nd position, as opposed to the also valid word "qra", which starts in the first position, or the valid word "sqde", which starts in the 4th position.
Additional Info:
I plan on doing this in MS SQL SERVER using a regular expression .dll
http://www.codeproject.com/KB/string/SqlRegEx.aspx
Also note that, given the above example, I would not want "qqqq" to be a valid word even if it was in the dictionary. The word would not be allowed to skip over a space; however, it is allowed to not start on the first space, if this makes sense and is possible to do...
Thanks!

I would very strongly suggest using CLR here and doing the regex work in .NET. Once you have found the word, all you need to do is call .Split(' ') on the string and then iterate through the array to find out which position the word is in.
For example: var myString = "The fat fox is lazy";
(assume you matched the word "is") - so you know that "is" is in there, and to figure out its location you can:
var counter = 0;
foreach (var s in myString.Split(' '))
{
    if (s == "is") { return counter; } // zero-based index of the matched word
    counter += 1;
}
Hope you get the idea of what I am suggesting here.
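For the slot pattern in the question itself, the starting position can also be found by sliding the word along the slots. Below is a minimal sketch in Python, used only for illustration (the same loop would port to a CLR function). It assumes the slot sets from the question's prose description, which include w in the first two slots, and it assumes a word is valid only when its letters fill consecutive slots. SLOTS and start_positions are hypothetical names, not part of any library.

```python
# Allowed letters for each of the seven slots (assumption: taken from the
# prose description in the question, where the first two slots include 'w').
SLOTS = ["qwertyui", "qwertyui", "a", "s", "qw", "d", "qwe"]


def start_positions(word, slots=SLOTS):
    """Return every 1-based slot position at which `word` fits contiguously."""
    if not word:
        return []
    positions = []
    # Try each possible starting slot that leaves room for the whole word.
    for start in range(len(slots) - len(word) + 1):
        # Every letter must be allowed in its slot; no slot may be skipped.
        if all(ch in slots[start + i] for i, ch in enumerate(word)):
            positions.append(start + 1)
    return positions
```

With these sets, start_positions("rasw") gives [2], "qra" gives [1], "sqde" gives [4], and "qqqq" gives an empty list because the third q would have to skip over the a slot.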


How to copy one string's n number of characters to another string in Kotlin?

Let's take a string var str = "Hello Kotlin". I want to copy the first 5 characters of str to another variable, strHello. I was wondering whether there is a function for doing this, or whether I have to write a loop and copy the characters one by one.
As Tim commented, there's a substring() method which does exactly this, so you can simply do:
val strHello = str.substring(0, 5)
(The first parameter is the 0-based index of the first character to take; and the second is the index of the character to stop before.)
There are many, many methods available on most of the common types. If you're using an IDE such as IDEA or Eclipse, you should see a list of them pop up after you type str followed by a dot. (That's one of many good reasons for using an IDE.) Or check the official documentation.
Please use the string.take(n) utility.
More details at
https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.text/take.html
I was using substring in my project, but it gave an exception when the length of the string was smaller than the second index of substring.
val name1 = "This is a very very long name"
// To copy to another string
val name2 = name1.take(5)
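println(name1.substring(0..5))  // the range is inclusive, so this prints 6 characters: "This i"
println(name1.substring(0..50)) // throws an exception: the range runs past the end of the string
println(name1.take(5))          // prints "This "
println(name1.take(50))         // no exception: returns the whole string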

What does this SQL query replacing JSON text mean?

I'm trying to understand part of a SQL query, but I don't know what it's used for; can anyone help me?
I know it wants to replace something, but what is " ":"(.+)" ", and why can a string like "store" be used in substring()?
replace((
CASE
WHEN(char_length(substring(xxx_json::text FROM 'Name":"(.+)" , "store')) > 0)
THEN substring(xxx_json::text FROM 'Name":"(.+)" , "store')
ELSE substring(xxx_json::text FROM 'Name":"(.+)" , "employees')
END),'\u0016','''')
This appears to be a variant of substring that does regular-expression matching. The first argument, xxx_json::text, is the string to be searched. The second argument is the regular expression to match.
Note that the second argument consists of the entire SQL string literal 'Name":"(.+)" , "store' (in the first two cases). Everything in that string, except for the (.+), should literally match a portion of the string to be searched. The (.+) is regex syntax. A dot matches any character; a + means one or more occurrences; the parentheses define this as a capture group. In this context, the text that matches the capture group is what will be returned by substring.
So for instance if the contents of the string to be searched was a simple JSON expression like this: { "Name":"John Smith" , "store":"London" }, the regular expression would match and the substring would return 'John Smith'.
In short, this is a slightly hacky way of parsing JSON in SQL to extract the value of the Name element (or some element whose key ends with Name).
See section 9.7.3 in https://www.postgresql.org/docs/9.4/static/functions-matching.html for detailed documentation on this form of substring.
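To see the capture-group behaviour described above in action, here is a small sketch that uses Python's re module as a stand-in for PostgreSQL's substring(... FROM pattern). extract_name is a hypothetical helper written for this illustration. The reading of the outer replace() as swapping the six-character text \u0016 for an apostrophe is an assumption about how the data was escaped.

```python
import re


def extract_name(json_text):
    """Mimic the CASE expression: try the "store" pattern first,
    then fall back to the "employees" pattern."""
    for following_key in ("store", "employees"):
        # Everything except (.+) must match literally; group 1 is the value.
        match = re.search(r'Name":"(.+)" , "' + following_key, json_text)
        if match and len(match.group(1)) > 0:
            # Assumption: the literal text \u0016 stands in for an apostrophe.
            return match.group(1).replace("\\u0016", "'")
    return None  # neither pattern matched (NULL in SQL)


print(extract_name('{ "Name":"John Smith" , "store":"London" }'))
```

The print call shows John Smith, matching the example in the answer. Because the pattern depends on exact spacing and key order, it fails as soon as the JSON is formatted differently, which is why the approach is described as hacky.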

Regex matching sequence of characters

I have a test string such as: The Sun and the Moon together, forever
I want to be able to type a few characters or words and be able to match this string if the characters appear in the correct sequence together, even if there are missing words. For example, the following search word(s) should all match against this string:
The Moon
Sun tog
Tsmoon
The get ever
What regex pattern should I be using for this? I should add that the supplied test strings are going to be dynamic within an app, and so I'd like to be able to use a pattern based on the search string.
Your example Tsmoon shows that you want to match partial words (T), ignore case (s, m) and allow anything between each entered character. So as a first attempt you can:
Set the ignore case option
Between each character of the input, insert the regular expression that matches zero or more of anything. You can choose whether to match the shortest or longest run.
Try that, reading the documentation for NSRegularExpression if you're stuck, and see how it goes. If you get stuck ask a new question showing your code and the RE constructed and explain what happens/doesn't work as expected.
HTH
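The two steps above can be sketched as follows. This is written in Python only to show how the pattern is built; the answer itself points to NSRegularExpression, and the resulting pattern string should carry over. The sketch assumes whitespace in the search text can be dropped and that the shortest run (.*?) is wanted between characters. subsequence_pattern and matches are hypothetical names.

```python
import re


def subsequence_pattern(query):
    """Build a case-insensitive pattern that allows anything between
    consecutive characters of `query` (whitespace in the query is ignored)."""
    chars = [re.escape(ch) for ch in query if not ch.isspace()]
    # ".*?" matches the shortest possible run of any characters.
    return re.compile(".*?".join(chars), re.IGNORECASE | re.DOTALL)


def matches(query, text):
    """Return True if the characters of `query` appear in order in `text`."""
    return subsequence_pattern(query).search(text) is not None


target = "The Sun and the Moon together, forever"
for query in ["The Moon", "Sun tog", "Tsmoon", "The get ever"]:
    print(query, matches(query, target))
```

All four searches from the question match the target string, while a query whose characters are out of order, such as "Moon Sun", does not.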

In MS Word how do I convert a Set Field containing "$1540.38-" to a negative number I can sum

I have a Word document that has fields set by our main publishing software. This software cannot utilize VBA code.
The fields I have to sum are things like
{SET ArrearsBalance "$1,540.38-"},
{SET RepossessionCosts "$200.00"},
and {SET StorageCost "$200.00"}
If I have them in a table and then use {=SUM(A1,A2,A3)} it will give a total of $400
If I manually remove the trailing - I can get the total as if they were all positive.
If I manually remove the - from the back of the number and put it at the front it will sum correctly.
Is there a way to trim the - symbol or move it to the front of the $ symbol?
Comments consolidated/augmented into an Answer
First, be warned: the way that Word interprets currency strings is dependent on the regional settings for the computer displaying the document. e.g. on a machine with typical US settings, "$1,540.38" will be recognised as a number, but "£1,540.38" will not.
There are other potential problems in this area, which lead me to suggest that you do something like this for each amount that you want to sum:
{ SET XArrearsBalance "{ IF "{ ArrearsBalance }" = "*-" "-" }{ ={ ArrearsBalance }-0 }" }
(All the {} need to be the special field code brace pairs that you can insert on Windows Word using ctrl-F9.)
If you just have an amount with the wrong format in cell A1, you cannot easily strip the trailing "-" in the same way because the "=" field can only work with numbers - you can't get it to manipulate texts. So you probably need to put that whole formula in cell A1.
Once you have done that, your formula needs to be
{ =SUM(XArrearsBalance,XRepossessionCosts,XStorageCost) }
If you do have literal values in your cells, another way you can reference the values is to create a different paragraph style for each cell you want to reference. Let's say you create a style called "SArrearsBalance" and apply it to cell A1. Then you should be able to use
{ STYLEREF SArrearsBalance }
to get the value of the cell, and you can (probably) use
{ IF "{ STYLEREF SArrearsBalance }" = "*-" "-" }{ ={ STYLEREF SArrearsBalance }-0 }
Some say that you can bookmark the whole cell, and reference it using { REF theBookmarkName } but I usually find that this leads to large, incorrect numbers being generated by calculations.

User input text translation

I'm working on a translator that will take English language text (as user input into a UITextView) and (with a button press) replace specific words with alternatives. I have both the English words in scope plus their alternatives in separate Arrays (englishArray and alternativeArray), indexed correspondingly.
My challenge is finding an algorithm that will allow me to identify a word in the input text (a UITextView) ignoring characters like <",.()>, lookup the word in englishArray (case insensitive), locate the corresponding word in alternativeArray and then use that word in place of the original - writing it back to the UITextView.
Any help greatly appreciated.
NB. I have created a Category extending the NSArray functionality with an indexOfCaseInsensitiveString method that ignores case when doing an indexOfObject type lookup, if that helps.
Tony.
I think that using an NSScanner would be best to parse the string into separate words which you could then pass to your indexOfCaseInsensitiveString method. scanCharactersFromSet:intoString: using a set of all the characters you want to ignore, including whitespace and newline characters should get you to the start of a word, and then you could use scanUpToCharactersFromSet:intoString: using the same set to scan to the end of the word. Using scanLocation at the beginning and end of each scan should allow you to get the range of that word, so if you find a match in your array, you will know where in your string to make the replacement.
Thanks for your suggestion. It's working with one exception.
I want to capture all punctuation so I can recreate the original input but with the substituted words. Even though I have a 'space' in my Character Set, the scanner is not putting the spaces into the 'intoString'. Other characters I specify in the Character Set such as '(' and ';' are represented in the 'intoString'.
Net is that when I recreate the input, it's perfect except that I get individual words running into each other.
UPDATE: I fixed that issue by including:
[theScanner setCharactersToBeSkipped:nil];
Thanks again.
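For comparison, the overall lookup-and-replace flow from this thread can be sketched in a few lines of Python. This is not the NSScanner code, only an illustration of the same idea under some assumptions: a word is taken to be a run of ASCII letters, everything else (spaces and punctuation) is left untouched, and the two parallel lists play the role of englishArray and alternativeArray. translate is a hypothetical name.

```python
import re


def translate(text, english_words, alternative_words):
    """Replace each known word with its alternative, leaving spaces and
    punctuation exactly where they were."""
    # Case-insensitive lookup built from the two index-matched lists.
    lookup = {e.lower(): a for e, a in zip(english_words, alternative_words)}

    def swap(match):
        word = match.group(0)
        # Unknown words are written back unchanged.
        return lookup.get(word.lower(), word)

    # Assumption: a word is a run of ASCII letters; contractions such as
    # "don't" would be treated as two separate words.
    return re.sub(r"[A-Za-z]+", swap, text)


print(translate('Hello, (World); "hello" again.',
                ["hello", "world"], ["howdy", "planet"]))
```

Because only the matched letter runs are rewritten, the spaces that went missing in the scanner version never leave the string in the first place.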