Is there a built-in function to extract all characters in a string up until the first occurrence of a space? - vb.net

Is there a built-in function to extract all characters in a string up until the first occurrence of a space?
Say the string is:
Methicillin-resistant staphylococcus aureus
I want to be able to get the substring:
Methicillin-resistant

You can do it in two functions:
newstring = mystring.Substring(0, mystring.IndexOf(" "))
Although that will fail if there's no space in mystring.
So you could pull out mystring.IndexOf(" ") into a variable and check whether it's -1 (no space found) before you try to use it in Substring.
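For comparison, here is a minimal Python sketch of the same guard. The function name first_word is made up for illustration; str.find returns -1 when no space is present, much like IndexOf.

```python
def first_word(text):
    # str.find returns -1 when the separator is absent, like IndexOf in .NET
    index = text.find(" ")
    if index == -1:
        # No space at all: the whole string is the first word
        return text
    return text[:index]


print(first_word("Methicillin-resistant staphylococcus aureus"))
```

Checking the index before slicing is what keeps the call from failing on strings that contain no space.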

The first solution you can use is a simple IndexOf
string GetFirstWord(string source)
{
int index = source.IndexOf(" ");
if (index == -1) return source;
else return source.Substring(0, index);
}
The second solution can be used if you want to keep all the words in a string array.
string[] GetWords(string source)
{
return source.Split(' ');
}
If you only want the first word, you can use it like this:
string word = GetWords("Methicillin-resistant staphylococcus aureus")[0];

And a VB.NET solution. No, it can't be done with one built-in method; you need two:
Left(myString, InStr(myString, " ") - 1)
And, as with the other solutions, you need to check that InStr doesn't return 0 if myString may not contain a space.

Related

Split by delimiter which is contained in a record

I have a column which I am splitting in Snowflake.
The format is as follows:
I have been using split_to_table(A, ',') inside of my query, but as you can probably tell, this incorrectly also splits the Scooter > Sprinting, Jogging and Walking record.
Perhaps the delimiter could be made to work only if there is no space on either side of it? I cannot see a different condition that could work.
I have been researching online but haven't found a suitable workaround yet. Has anyone encountered a similar problem in the past?
Thanks
This is a custom rule for the split to table, so we can use a UDTF to apply a custom rule:
create or replace function split_to_table2(STR string, DELIM string, ROW_MUST_CONTAIN string)
returns table (VALUE string)
language javascript
strict immutable
as
$$
{
initialize: function (argumentInfo, context) {
},
processRow: function (row, rowWriter, context) {
var buffer = "";
var i;
const s = row.STR.split(row.DELIM);
for(i=0; i<s.length-1; i++) {
buffer += s[i];
if(s[i+1].includes(row.ROW_MUST_CONTAIN)) {
rowWriter.writeRow({VALUE: buffer});
buffer = "";
} else {
buffer += row.DELIM
}
}
// emit the last piece together with anything still held in the buffer
rowWriter.writeRow({VALUE: buffer + s[i]});
},
}
$$;
select VALUE from
table(split_to_table2('Car > Bike,Bike > Scooter,Scooter > Sprinting, Jogging and Walking,Walking > Flying', ',', '>'))
;
Output:
VALUE
Car > Bike
Bike > Scooter
Scooter > Sprinting, Jogging and Walking
Walking > Flying
This UDTF takes one more parameter than the built-in table function split_to_table. The third parameter, ROW_MUST_CONTAIN, is the string every row must contain. The function splits the string on DELIM, but when the next piece does not contain the ROW_MUST_CONTAIN string, it joins that piece back onto the current row instead of starting a new one. In this case we just specify , for the delimiter and > for ROW_MUST_CONTAIN.
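The merging rule can be tried outside Snowflake. The following is a rough Python sketch of the same idea, not the UDTF itself; the function name split_keep_marker is hypothetical, and the sketch also emits any text still buffered when the input ends.

```python
def split_keep_marker(text, delim, must_contain):
    pieces = text.split(delim)
    rows = []
    buffer = ""
    for i, piece in enumerate(pieces):
        buffer += piece
        is_last = i == len(pieces) - 1
        # Close the row when the next piece starts a new record
        # (it contains the marker) or when there is no next piece.
        if is_last or must_contain in pieces[i + 1]:
            rows.append(buffer)
            buffer = ""
        else:
            # The next piece belongs to this row: restore the delimiter
            buffer += delim
    return rows


data = "Car > Bike,Bike > Scooter,Scooter > Sprinting, Jogging and Walking,Walking > Flying"
for row in split_keep_marker(data, ",", ">"):
    print(row)
```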
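We can get a little clever with regexp_replace by replacing the real delimiters with something else before the table split. I am using double pipes '||', but you can change that to any string that cannot occur in the data. The pattern ',([^>,]+>)' matches only a comma that is followed by text containing no further comma before the next '>', i.e. a comma that starts a new record. The replacement '\|\|\\1' uses a back-reference: \\1 re-inserts the captured group after the new delimiter (\|\|).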
set str='car>bike,bike>car,truck, and jeep,horse>cat,truck>car,truck, and jeep';
select $str, *
from table(split_to_table(regexp_replace($str,',([^>,]+>)','\|\|\\1'),'||'))
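To see what the pattern does without a Snowflake session, the same substitution can be sketched in Python with the re module. The function name split_records is made up, and '||' is assumed not to occur in the data.

```python
import re


def split_records(text):
    # Rewrite only the commas that start a new "x>y" record:
    # a comma, then text with no further comma or '>', then '>'.
    marked = re.sub(r",([^>,]+>)", r"||\1", text)
    # Split on the temporary delimiter instead of the comma
    return marked.split("||")


sample = "car>bike,bike>car,truck, and jeep,horse>cat,truck>car,truck, and jeep"
for record in split_records(sample):
    print(record)
```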
Yes, you are right. The only pattern I can see is the whitespace after the comma.
It's a small workaround, but we can make use of this pattern. In the code below I first replace every comma that is followed by whitespace with a placeholder. Then I apply split_to_table and convert the placeholder back.
It's not super pretty, and it would give wrong results if your string already contains "my_replacement" or if the data follows some other pattern, but it's working for me:
select replace(t.value, 'my_replacement', ', ')
from table(
split_to_table(replace('Car > Bike,Bike > Scooter,Scooter > Sprinting, Jogging and Walking,Walking > Flying', ', ', 'my_replacement'),',')) t
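Where a regex engine with lookahead is available, the placeholder round trip can be avoided. As an illustration only (this is Python's re module, not Snowflake SQL, and the function name is hypothetical), splitting on commas that are not followed by a space looks like this:

```python
import re


def split_on_tight_commas(text):
    # (?! ) is a negative lookahead: the comma is a delimiter
    # only when the next character is not a space.
    return re.split(r",(?! )", text)


data = "Car > Bike,Bike > Scooter,Scooter > Sprinting, Jogging and Walking,Walking > Flying"
for part in split_on_tight_commas(data):
    print(part)
```

Like the placeholder version, this relies on the assumption that commas inside a record are always followed by a space.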

How to replace string characters that are not in a reference list in kotlin

I have a reference string on which the allowed characters are listed. Then I also have input strings, from which not allowed characters should be replaced with a fixed character, in this example "0".
I can use filter, but it removes the characters altogether instead of replacing them. Please note that this is not about being alphanumeric: some non-alphanumeric characters are allowed and some alphanumeric characters are not, because referenceStr is arbitrary.
var referenceStr = "abcdefg"
var inputStr = "abcqwyzt"
inputStr = inputStr.filter{it in referenceStr}
This yields:
"abc"
But I need:
"abc00000"
I also considered replace, but it seems better suited to the case where you have a complete list of the characters that are NOT allowed. My case is the other way around.
Given:
val referenceStr = "abcd][efg"
val replacementChar = '0'
val inputStr = "abcqwyzt[]"
You can do this with a regex [^<referenceStr>], where <referenceStr> should be replaced with referenceStr:
val result = inputStr.replace("[^${Regex.escape(referenceStr)}]".toRegex(), replacementChar.toString())
println(result)
Note that Regex.escape is used to make sure that the characters in referenceStr are all interpreted literally.
Alternatively, use map:
val result = inputStr.map {
if (it !in referenceStr) replacementChar else it
}.joinToString(separator = "")
In the lambda decide whether the current char "it" should be transformed to replacementChar, or itself. map creates a List<Char>, so you need to use joinToString to make the result a String again.
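The map-and-join approach carries over to other languages almost unchanged. A small Python sketch of the same idea follows; the function name replace_disallowed is made up for illustration.

```python
def replace_disallowed(text, allowed, replacement="0"):
    # A set gives constant-time membership checks for each character
    allowed_chars = set(allowed)
    # Keep allowed characters, swap everything else for the replacement
    return "".join(ch if ch in allowed_chars else replacement for ch in text)


print(replace_disallowed("abcqwyzt[]", "abcd][efg"))
```

Because membership is tested character by character, no escaping of special characters such as ] or [ is needed here.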

Insert WhiteSpaces and remove last characters from string

I need a prepared string for my Visual Studio 2010 macro. The string should be the document name (document.Name) but without the file extension (for example .cs) and after each upper case should be a white space.
Example:
document.Name = TestFileName.cs
How can I get this:
"Test File Name"
For trivial cases (no consecutive upper-case letters):
file = IO.Path.GetFileNameWithoutExtension(file)
file = System.Text.RegularExpressions.Regex.Replace(file, "([a-z0-9])([A-Z])", "$1 $2")
Here is a basic framework
String PreString = "getAllItemsByID";
System.Text.StringBuilder SB = new System.Text.StringBuilder();
foreach (Char C in PreString)
{
if (Char.IsUpper(C))
SB.Append(' ');
SB.Append(C);
}
Response.Write(SB.ToString());
You may need to add a few checks:
- When the very first char is uppercase, do not add a space.
- When a word like ID is encountered, remove the last space.
[OR try this]
This will find each occurrence of a lower-case character followed by an upper-case character, and insert a space between them:
s = s.replace(/([a-z])([A-Z])/g, '$1 $2')
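Both steps, dropping the extension and spacing out the capitals, can be combined in a few lines. This Python sketch uses the same lower-case-then-upper-case pattern as the answers above; the function name spaced_name is hypothetical.

```python
import os
import re


def spaced_name(filename):
    # Drop the extension, e.g. ".cs"
    stem, _extension = os.path.splitext(filename)
    # Insert a space wherever a lower-case letter or digit
    # is directly followed by an upper-case letter.
    return re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", stem)


print(spaced_name("TestFileName.cs"))
```

Because the pattern needs a lower-case letter or digit before the capital, a leading capital gets no space and a run such as ID stays together.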

How to filter out some vulnerability causing characters in query string?

I need to filter out characters like /?-^%{}[];$=*`#|&#'\"<>()+,\. If any of these appear in the query string, I need to replace them with an empty string. Please help me out. I am using this in ASP pages.
Best idea would be to use a function something along the lines of:
Public Function MakeSQLSafe(ByVal sql As String) As String
'first i'd avoid putting quote chars in as they might be valid? just double them up.
Dim strIllegalChars As String = "/?-^%{}[];$=*`#|&#\<>()+,\"
'replace single quotes with double so they don't cause escape character
If sql.Contains("'") Then
sql = sql.Replace("'", "''")
End If
'need to double up double quotes from what I remember to get them through
If sql.Contains("""") Then
sql = sql.Replace("""", """""")
End If
'remove illegal chars
For Each c As Char In strIllegalChars
If sql.Contains(c.ToString) Then
sql = sql.Replace(c.ToString, "")
End If
Next
Return sql
End Function
This hasn't been tested and it could probably be made more efficient, but it should get you going. Wherever you execute your sql in your app, just wrap the sql in this function to clean the string before execution:
ExecuteSQL(MakeSQLSafe(strSQL))
Hope that helps
As with any string sanitisation, you're much better off working with a whitelist that dictates which characters are allowed, rather than a blacklist of characters that aren't.
This question about filtering HTML tags resulted in an accepted answer suggesting the use of a regular expression to match against a whitelist: How do I filter all HTML tags except a certain whitelist? - I suggest you do something very similar.
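As a rough illustration of the whitelist idea (written in Python rather than ASP, with an allowed set of letters, digits and spaces chosen purely as an assumption), everything outside the list is removed instead of enumerating the bad characters:

```python
import re

# Assumed whitelist: ASCII letters, digits and the space character.
# Adjust the character class to whatever the application really allows.
_NOT_ALLOWED = re.compile(r"[^A-Za-z0-9 ]")


def keep_allowed(value):
    # Remove every character that is not on the whitelist
    return _NOT_ALLOWED.sub("", value)


print(keep_allowed("O'Brien; DROP TABLE--"))
```

With a whitelist, a character nobody thought of is rejected by default, whereas a blacklist lets it through.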
I'm using URL Routing and I found this works well, pass each part of your URL to this function. It's more than you need as it converts characters like "&" to "and", but you can modify it to suit:
public static string CleanUrl(this string urlpart) {
// convert accented characters to regular ones
string cleaned = urlpart.Trim().anglicized();
// do some pretty conversions
cleaned = Regex.Replace(cleaned, " ", "-");
cleaned = Regex.Replace(cleaned, "#", "no.");
cleaned = Regex.Replace(cleaned, "&", "and");
cleaned = Regex.Replace(cleaned, "%", "percent");
cleaned = Regex.Replace(cleaned, "@", "at");
// strip all illegal characters like punctuation
cleaned = Regex.Replace(cleaned, "[^A-Za-z0-9- ]", "");
// convert spaces to dashes
cleaned = Regex.Replace(cleaned, " +", "-");
// If we're left with nothing after everything is stripped and cleaned
if (cleaned.Length == 0)
cleaned = "no-description";
// return lowercased string
return cleaned.ToLower();
}
// Convert accented characters to standardized ones
private static string anglicized(this string urlpart) {
string beforeConversion = "àÀâÂäÄáÁéÉèÈêÊëËìÌîÎïÏòÒôÔöÖùÙûÛüÜçÇ’ñ";
string afterConversion = "aAaAaAaAeEeEeEeEiIiIiIoOoOoOuUuUuUcC'n";
string cleaned = urlpart;
for (int i = 0; i < beforeConversion.Length; i++) {
cleaned = cleaned.Replace(beforeConversion[i], afterConversion[i]);
}
return cleaned;
// Spanish : ÁÉÍÑÓÚÜ¡¿áéíñóúü"
}
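The same clean-up steps can be sketched in Python. Note one swapped-in technique: instead of a hand-written accent table, this sketch uses Unicode NFKD normalisation from the standard unicodedata module to strip accents. The function name clean_url_part is made up.

```python
import re
import unicodedata


def clean_url_part(part):
    # Decompose accented letters, then drop the non-ASCII combining marks
    text = unicodedata.normalize("NFKD", part.strip())
    text = text.encode("ascii", "ignore").decode("ascii")
    # Spell out a few symbols before punctuation is stripped
    for symbol, word in (("&", "and"), ("#", "no."), ("%", "percent"), ("@", "at")):
        text = text.replace(symbol, word)
    # Keep only letters, digits, spaces and dashes
    text = re.sub(r"[^A-Za-z0-9 -]", "", text)
    # Collapse runs of spaces into a single dash
    text = re.sub(r" +", "-", text)
    return text.lower() if text else "no-description"


print(clean_url_part("Café & Crème #1"))
```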

Length Cannot be zero vb.net

Hi, is there a way to check the length before I get this error message:
Length cannot be less than zero. Parameter name: length
I get the error on this line:
new_username = new_username.Substring(0, new_username.IndexOf(" Joined "))
I am removing the "Joined" part from the string I get. How can I skip this step if "Joined" isn't in the data?
Thanks
I would test to see what IndexOf returned before using it in this context:
Dim index As Integer = new_username.IndexOf(" Joined ")
If index >= 0 Then
new_username = new_username.Substring(0, index)
End If
Try this:
new_username = new_username.Replace(" Joined ", "")
Be warned that this will remove all occurrences of the "Joined" substring rather than just the first.
It looks like new_username.IndexOf(" Joined ") is returning -1, meaning the string " Joined " was not found by IndexOf. I would break this out into two statements:
The error you are seeing is that you are effectively making this call:
new_username = new_username.Substring(0, -1)
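For comparison, some languages offer a split that simply returns the whole string when the marker is missing, so no index check is needed. A small Python sketch using str.partition follows; the function name strip_joined is hypothetical.

```python
def strip_joined(username, marker=" Joined "):
    # partition returns (before, marker, after); when the marker is
    # missing, "before" is the whole string and nothing is cut off.
    before, _found, _after = username.partition(marker)
    return before


print(strip_joined("bob Joined 2010"))
print(strip_joined("bob"))
```

Unlike the Replace approach, this cuts at the first occurrence of the marker and discards everything after it, which matches what the Substring call was trying to do.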