Cleaning text strings in vb - sql

I am trying to clean up a string from a text field that will be part of sql query.
I have created a function:
Private Function cleanStringToProperCase(dirtyString As String) As String
Dim cleanedString As String = dirtyString
'removes all but non alphanumeric characters except #, - and .'
cleanedString = Regex.Replace(cleanedString, "[^\w\.#-]", "")
'trims unecessary spaces off left and right'
cleanedString = Trim(cleanedString)
'replaces double spaces with single spaces'
cleanedString = Regex.Replace(cleanedString, " ", " ")
'converts text to upper case for first letter in each word'
cleanedString = StrConv(cleanedString, VbStrConv.ProperCase)
'return the nicely cleaned string'
Return cleanedString
End Function
But when I try to clean any text with two words, it strips ALL white space out. "daz's bike" becomes "Dazsbike". I am assuming I need to modify the following line:
cleanedString = Regex.Replace(cleanedString, "[^\w\.#-]", "")
so that it also lets single white space characters remain. Advice on how to do so is greatly received as I cannot find it on any online tutorials or on the MSDN site (http://msdn.microsoft.com/en-us/library/844skk0h(v=vs.110).aspx)

Use "[^\w\.,#\-\' ]" instead of your pattern string.
Also, I would use
Regex.Replace(cleanedString, " +", " ")
instead of
Regex.Replace(cleanedString, " ", " ")

Or, if you are not a big fan of regex...
Private Function cleanStringToProperCase(dirtyString As String) As String
'specify valid characters
Dim validChars As String = " #-.0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
'removes all but validChars'
Dim cleanedString As String = dirtyString.Where(Function(c) validChars.Contains(c)).ToArray
Dim myTI As Globalization.TextInfo = New Globalization.CultureInfo(Globalization.CultureInfo.CurrentCulture.Name).TextInfo
'return the nicely cleaned string'
Return myTI.ToTitleCase(cleanedString.Trim)
End Function

Related

How to replace a character within a string

I'm trying to convert WText into its ASCII code and put it into a TextBox; Numencrypt. But I don't want to convert the spaces into ASCII code.
How do I replace the spaces with null?
Current code:
Dim withSpace As String = Numencrypt.Text
For h = 1 To lenText
wASC = wASC & CStr(Asc(Mid$(WText, h, 1)))
Next h
Numencrypt.Text = wASC
Numencrypt2.Text = Numencrypt2.Replace(Numencrypt.Text, " ", "")
By the way, the TextBox Numencrypt2 is the WText without a space inside it.
Without knowing whether or not you want the null character or empty string I did the following in a console app so I don't have your variables. I also used a string builder to make the string concatenation more performant.
Dim withSpaces = "This has some spaces in it!"
withSpaces = withSpaces.Replace(" "c, ControlChars.NullChar)
Dim wASC As New StringBuilder
For h = 1 To withSpaces.Length
wASC.Append($"{AscW(Mid(withSpaces, h, 1))} ") ' Added a space so you can see the boundaries ascii code boundaries.
Next
Dim theResult = wASC.ToString()
Console.WriteLine(theResult)
You will find that if you use ControlChars.NewLine as I have, the place you had spaces will be represented by a zero. That position is completely ignored if you use Replace(" ", "")

VB.NET - Delete excess white spaces between words in a sentence

I'm a programing student, so I've started with vb.net as my first language and I need some help.
I need to know how I delete excess white spaces between words in a sentence, only using these string functions: Trim, instr, char, mid, val and len.
I made a part of the code but it doesn't work, Thanks.
enter image description here
Knocked up a quick routine for you.
Public Function RemoveMyExcessSpaces(str As String) As String
Dim r As String = ""
If str IsNot Nothing AndAlso Len(str) > 0 Then
Dim spacefound As Boolean = False
For i As Integer = 1 To Len(str)
If Mid(str, i, 1) = " " Then
If Not spacefound Then
spacefound = True
End If
Else
If spacefound Then
spacefound = False
r += " "
End If
r += Mid(str, i, 1)
End If
Next
End If
Return r
End Function
I think it meets your criteria.
Hope that helps.
Unless using those VB6 methods is a requirement, here's a one-line solution:
TextBox2.Text = String.Join(" ", TextBox1.Text.Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries))
Online test: http://ideone.com/gBbi55
String.Split() splits a string on a specific character or substring (in this case a space) and creates an array of the string parts in-between. I.e: "Hello There" -> {"Hello", "There"}
StringSplitOptions.RemoveEmptyEntries removes any empty strings from the resulting split array. Double spaces will create empty strings when split, thus you'll get rid of them using this option.
String.Join() will create a string from an array and separate each array entry with the specified string (in this case a single space).
There is a very simple answer to this question, there is a string method that allows you to remove those "White Spaces" within a string.
Dim text_with_white_spaces as string = "Hey There!"
Dim text_without_white_spaces as string = text_with_white_spaces.Replace(" ", "")
'text_without_white_spaces should be equal to "HeyThere!"
Hope it helped!

Left split, getting blank return.

Issue, where the character I am removing does not exist I get a blank string
Aim: To look for three characters in order and only get the characters to the left of the character I am looking for. However if the character does not exist then to do nothing.
Code:
Dim vleftString As String = File.Name
vleftString = Left(vleftString, InStr(vleftString, "-"))
vleftString = Left(vleftString, InStr(vleftString, "_"))
vleftString = Left(vleftString, InStr(vleftString, " "))
As a 'fix' I have done
Dim vleftString As String = File.Name
vleftString = Replace(vleftString, "-", " ")
vleftString = Replace(vleftString, "_", " ")
vleftString = Left(vleftString, InStr(vleftString, " "))
vleftString = Trim(vleftString)
Based on Left of a character in a string in vb.net
If File.Name is say 1_2.pdf it passes "-" and then works on line removing anything before "" (though not "" though I want it to)
When it hits the line for looking for anything left of space it then makes vleftString blank.
Since i'm not familiar (and avoid) the old VB functions here a .NET approach. I assume you want to remove the parts behind the separators "-", "_" and " ", then you can use this loop:
Dim fileName = "1_2.pdf".Trim() ' Trim used to show you the method, here nonsense
Dim name = Path.GetFileNameWithoutExtension(fileName).Trim()
For Each separator In {"-", "_", " "}
Dim index = name.IndexOf(separator)
If index >= 0 Then
name = name.Substring(0, index)
End If
Next
fileName = String.Format("{0}{1}", name, Path.GetExtension(fileName))
Result: "1.pdf"

Split( "TEXT" , "+" OR "-" ,2)

I need to separate a string with Visual Basic.
The downside is that i have more then one separator.
One is "+" and the other one is "-".
I need the code to check for the string if "+" is the one in the string then use "+"
if "-" is in the string then use "-" as separator.
Can I do this?
For example: Split( "TEXT" , "+" OR "-" ,2)
easiest way is to replace out the second character and then split by only one:
Dim txt As String, updTxt As String
Dim splitTxt() As String
updTxt = Replace(txt, "-", "+")
splitTxt = Split(updTxt, "+")
or more complex. The below returns a collection of the parts after being split. Allows you to cusomize the return data a bit more if you require:
Dim txt As String, outerTxt As Variant, innerTxt As Variant
Dim splitOuterTxt() As String
Dim allData As New Collection
txt = "test + test - testing + ewfwefwef - fwefwefwf"
splitOuterTxt = Split(txt, "+")
For Each outerTxt In splitOuterTxt
For Each innerTxt In Split(outerTxt, "-")
allData.Add innerTxt
Next innerTxt
Next outerTxt
What you describe seems pretty straightforward. Just check if the text contains a + to decide which separator you should use.
Try this code:
dim separator as String
separator = Iif(InStr(txt, "+") <> 0, "+", "-")
splitResult = Split(txt, separator, 2)
The code assumes that the text you want to split is in the txt variable.
Please note that I don't have VBA here and wasn't able to actually run the code. If you get any error, let me know.

Get only the line of text that contains the given word VB2010.net

I have a text file on my website and I download the whole string via webclient.downloadstring.
The text file contains this :
cookies,dishes,candy,(new line)
back,forward,refresh,(new line)
mail,media,mute,
This is just an example it's not the actual string , but it will do for help purposes.
What I want is I want to download the whole string , find the line that contains the word that was entered by the user in a textbox, get that line into a string, then I want to use the string.split with as delimiter the "," and output each word that is in the string into an richtextbox.
Now here is the code that I have used (some fields are removed for privacy reasons).
If TextBox1.TextLength > 0 Then
words = web.DownloadString("webadress here")
If words.Contains(TextBox1.Text) Then
'retrieval code here
Dim length As Integer = TextBox1.TextLength
Dim word As String
word = words.Substring(length + 1) // the plus 1 is for the ","
Dim cred() As String
cred = word.Split(",")
RichTextBox1.Text = "Your word: " + cred(0) + vbCr + "Your other word: " + cred(1)
Else
MsgBox("Sorry, but we could not find the word you have entered", MsgBoxStyle.Critical)
End If
Else
MsgBox("Please fill in an word", MsgBoxStyle.Critical)
End If
Now it works and no errors , but it only works for line 1 and not on line 2 or 3
what am I doing wrong ?
It's because the string words also contains the new line characters that you seem to be omitting in your code. You should first split words with the delimiter \n (or \r\n, depending on the platform), like this:
Dim lines() As String = words.Split("\n")
After that, you have an array of strings, each element representing a single line. Loop it through like this:
For Each line As String In lines
If line.Contains(TextBox1.Text) Then
'retrieval code here
End If
Next
Smi's answer is correct, but since you're using VB you need to split on vbNewLine. \n and \r are for use in C#. I get tripped up by that a lot.
Another way to do this is to use regular expressions. A regular expression match can both find the word you want and return the line that contains it in a single step.
Barely tested sample below. I couldn't quite figure out if your code was doing what you said it should be doing so I improvised based on your description.
Imports System.Text.RegularExpressions
Public Class Form1
Private Sub ButtonFind_Click(sender As System.Object, e As System.EventArgs) Handles ButtonFind.Click
Dim downloadedString As String
downloadedString = "cookies,dishes,candy," _
& vbNewLine & "back,forward,refresh," _
& vbNewLine & "mail,media,mute,"
'Use the regular expression anchor characters (^$) to match a line that contains the given text.
Dim wordToFind As String = TextBox1.Text & "," 'Include the comma that comes after each word to avoid partial matches.
Dim pattern As String = "^.*" & wordToFind & ".*$"
Dim rx As Regex = New Regex(pattern, RegexOptions.Multiline + RegexOptions.IgnoreCase)
Dim M As Match = rx.Match(downloadedString)
'M will either be Match.Empty (no matching word was found),
'or it will be the matching line.
If M IsNot Match.Empty Then
Dim words() As String = M.Value.Split(","c)
RichTextBox1.Clear()
For Each word As String In words
If Not String.IsNullOrEmpty(word) Then
RichTextBox1.AppendText(word & vbNewLine)
End If
Next
Else
RichTextBox1.Text = "No match found."
End If
End Sub
End Class