How to detect a number and split in MS Access? - sql

I cannot control these circumstances at the moment so please bear with me.
I pull email addresses from a field called EMAIL_O, and sometimes they are completely valid (somename#domain.com) and other times they have a 12-character phone number appended at the front (123-456-7890somename#domain.com).
How can I, in MS Access, detect which type of field I am seeing and remove the phone number appropriately when pulling in this data? I cannot just take the mid() from the 13th character because if the email is valid, I'd be removing good characters.
So somehow I need to detect the presence of a number and then apply the mid(), or just take the full field if no number is present.

Use pattern matching to check whether EMAIL_O starts with a phone number.
EMAIL_O = "123-456-7890somename#domain.com"
? EMAIL_O Like "###-###-####*"
True
EMAIL_O = "somename#domain.com"
? EMAIL_O Like "###-###-####*"
False
So you can use that strategy in an IIf expression. Apply Mid when EMAIL_O matches the pattern. Otherwise, just return EMAIL_O unaltered.
? IIf(EMAIL_O Like "###-###-####*", Mid(EMAIL_O, 13), EMAIL_O)
somename#domain.com
Those were examples copied from the Immediate window. If you want to use the same approach in a query ...
SELECT IIf(EMAIL_O Like "###-###-####*", Mid(EMAIL_O, 13), EMAIL_O) AS email
FROM YourTable;
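For comparison, the same test-then-slice logic can be sketched outside Access. This is only an illustrative Python version, not part of the original answer: `strip_phone_prefix` is a hypothetical name, and the regular expression `\d{3}-\d{3}-\d{4}` is assumed to be an adequate stand-in for the `###-###-####*` Like pattern.

```python
import re

# Hypothetical helper mirroring
# IIf(EMAIL_O Like "###-###-####*", Mid(EMAIL_O, 13), EMAIL_O).
PHONE_PREFIX = re.compile(r"\d{3}-\d{3}-\d{4}")

def strip_phone_prefix(value: str) -> str:
    # re.match only looks at the start of the string, like the Like pattern.
    if PHONE_PREFIX.match(value):
        return value[12:]  # drop the 12-character phone number
    return value  # no phone number in front: return the field unaltered
```

Values that do not start with the phone pattern pass through untouched, which is the property the question asks for.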

Using the Left and IsNumeric functions, you can check whether the first character is numeric and, if it is, drop the 12-character phone number. Something like the following (untested code):
Public Function CheckEmail(strEmail As String) As String
' A leading digit is taken to mean a 12-character phone number is prepended.
If IsNumeric(Left(strEmail, 1)) Then
CheckEmail = Mid(strEmail, 13)
Else
CheckEmail = strEmail
End If
End Function

You can use regular expressions to get this done. If the number in front of the email can vary in length, you cannot simply start from the 13th character as you suggested. Pattern matching and replacing with RegEx avoids the find/cut/join string operations entirely.
Try this:
Public Function FN_GET_EMAIL(iEmail As String) As String
Const mPattern As String = "^[0-9_-]+[0-9_-]"
With CreateObject("vbscript.RegExp")
.Pattern = mPattern
.Global = True
FN_GET_EMAIL = Nz(.Replace(iEmail, ""), "")
End With
End Function
sample results:
?FN_GET_EMAIL("123123123-123123123123-1231231231231EmailWith1123123Number#domain.com")
EmailWith1123123Number#domain.com
?fn_get_email("123-456-7890somename#domain.com")
somename#domain.com
In addition, you can add proper email validation or change the pattern to suit your needs without complex string operations.
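As a rough cross-check of the idea, here is a Python sketch of the same prefix removal. It is an illustration under assumptions, not the answer's code: `get_email` is a hypothetical name, and the pattern is simplified to `^[0-9_-]+` (one or more leading digits, underscores, or hyphens).

```python
import re

# A leading run of digits, underscores, and hyphens, anchored at the start.
LEADING_NUMBER = re.compile(r"^[0-9_-]+")

def get_email(value: str) -> str:
    # Remove the leading run, if any; digits later in the address are kept.
    return LEADING_NUMBER.sub("", value, count=1)
```

One caveat of this design: an address that legitimately begins with a digit would lose that digit too, so the fixed `###-###-####` pattern is safer when the phone format is known.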

You could check for a numeric header:
If IsNumeric(Replace(Left(Nz([Email]), 12), "-", "")) Then
' chop 12 chars
Else
' use as is
End If

Related

VB.Net Date.ParseExact Culture issue

I am using vb.net and I create a list of months (3 char representation) by using the following function.
Public Function getMonths() As Array
Dim months As String = ""
For i = 1 To 12
months += StrConv(MonthName(i, True), VbStrConv.ProperCase) + ","
Next
months = months.Substring(0, months.Length - 1)
getMonths = months.Split(",")
End Function
This works beautifully, as the site I am building can change language etc on the fly.
However when I try to then change the month back to the numeric value to process using this function
Public Function monthToNumber(ByVal monthin As String, ByVal culture As System.Globalization.CultureInfo) As Integer
monthToNumber = DateTime.ParseExact("01/" + monthin + "/1999", "dd/MMM/yyyy", culture).Month
End Function
the call to DateTime.ParseExact throws an exception saying the input is not a valid date string.
The month is produced by the culture-aware code above, so I can't understand the failure. This only happens with the pt-PT culture. The process works fine for Spain, the UK, France, and Italy.
If you use DateTime.ParseExact, you have to use the right DateSeparator.
For pt-PT, it's not /, but -.
/ can work as default DateSeparator, but only if you use CultureInfo.InvariantCulture. But if you do, you can't parse the culture specific date abbreviation.
That's why your code fails.
To generate the month abbreviations, simply use DateTimeFormatInfo.AbbreviatedMonthNames or DateTimeFormatInfo.AbbreviatedMonthGenitiveNames; no need to write a method yourself.
Also, you should look into String.Join (another thing you don't have to reinvent).
For parsing the string back, you could use something like
monthToNumber = DateTime.ParseExact(monthin, "MMM", culture).Month
No need for day/year if you simply use MMM for your format string.
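Since the abbreviations come from a fixed list, another way to avoid separator problems is a plain list lookup instead of date parsing. The Python sketch below illustrates that alternative technique (it is not what `ParseExact` does); `month_to_number` is a hypothetical name, and the list of abbreviations is assumed to be supplied by the caller for the culture in use.

```python
import calendar

def month_to_number(abbreviation: str, abbreviations: list[str]) -> int:
    # Case-insensitive lookup; raises ValueError if the abbreviation is unknown.
    lowered = [name.lower() for name in abbreviations]
    return lowered.index(abbreviation.lower()) + 1

# calendar.month_abbr has an empty entry at index 0 and follows the
# current locale, so slice it before using it as the lookup list.
current_locale_names = list(calendar.month_abbr)[1:]
```

Because no day, year, or separator is involved, the lookup cannot fail on a culture-specific date separator.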

access 2007 change the first letter in a field of a table

I have a table called documents. One of its fields, location, holds the file path for each document. I need to change the paths from D:\........ to H:\.....
How can I do this with an SQL UPDATE, given that the file paths vary in length and there are lots of records?
You can use string functions to do this. Something like the following:
UPDATE documents SET location = 'H' & Mid(location, 2)
WHERE Left(location, 1) = 'D'
Here, Left() returns the first character of the path, so only paths that start with D are updated.
Mid() returns a substring starting at the given position; with no length argument it returns everything from the second character (the colon) to the end of the path.
See MS Access: Functions for more information on the same.
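The string manipulation itself is small enough to check by hand. The Python sketch below shows the same "swap the first character, keep the rest" idea; `change_drive` is a hypothetical name, and unlike the SQL above it also checks for the colon, which is an added assumption about what counts as a drive path.

```python
def change_drive(path: str, old: str = "D", new: str = "H") -> str:
    # Only touch paths that start with the old drive letter and a colon.
    if path[:2].upper() == f"{old}:".upper():
        return new + path[1:]  # new letter, then everything from the colon on
    return path  # other drives and empty strings are left alone
```

Keeping everything from the second character onward is what makes the approach independent of the path length.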

VBA in query error evaluating false part of IIF

I have the following column in a query.
iif(Len([Field1])=0,0,Asc(Mid([Field1] & "",Len([Field1]))))
The idea is that it should return the ASCII value of the last character in a string field.
The problem is that if Field1 is blank the statement errors with the following message: "Invalid procedure call or argument (Error 5)". If the field is blank it should return 0.
Assuming blank means either Null or an empty string, you can concatenate an empty string to [Field1], and if the combined length is 0, return 0.
The Right() function is more direct than Mid() to get the last character.
SELECT IIf(Len([Field1] & "")=0,0,Asc(Right([Field1],1)))
FROM YourTable;
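The guard matters because the length test has to treat a missing value and an empty string the same way. A minimal Python sketch of that logic follows, with `None` standing in for Null; `last_char_code` is a hypothetical name, and `ord` returns a Unicode code point, which agrees with `Asc` only for plain ASCII characters.

```python
from typing import Optional

def last_char_code(value: Optional[str]) -> int:
    # None stands in for a Null field; both None and "" yield 0.
    if not value:
        return 0
    return ord(value[-1])  # code of the last character
```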
By all means, go ahead with HansUp's answer. :)
If your query with IIf keeps growing, here is a head start with a user-defined function that uses a regular expression. In your case the most important aspect is validating the input, more than getting the ASCII value. Since your concern is mainly the last character, what if Field1 contains an alphanumeric value and the last character happens to be a digit or a special character instead of a letter? To be safe, the following function validates the input and responds only to alphabetic strings. Another advantage is that you can reuse this function elsewhere in your database.
Perhaps this really is complicating your process :D
Option Compare Database
Option Explicit
Public Function GetASCII(strField1 As Variant) As Integer
'-- Variant (not String) so that a Null field reaches the IsNull check below
Dim regex As Object
Dim strTemp As String
Set regex = CreateObject("vbscript.regexp")
'--assume strField1 = "hola", you may enter different types
'-- of values to test e.g. "hola1, hola$ , hola;
With regex
.IgnoreCase = True
.Global = True
.Pattern = "^[a-zA-Z]+$"
'--check for null
Select Case IsNull(strField1) '-- validates any other datatype than String
Case True
GetASCII = 0
Case Else
Select Case IsError(strField1)
Case True
GetASCII = 0
Case Else
Select Case IsEmpty(strField1)
Case True
GetASCII = 0
'--check if entire string is String
'--only (no special characters, digits)
'--you may change this to only check the
'----last character if your Field1 is alphanumeric
Case Else
Select Case .Test(strField1)
Case True
strTemp = Mid(strField1, Len(strField1), 1)
GetASCII = Asc(strTemp)
Case Else
GetASCII = 0
End Select
End Select
End Select
End Select
End With
Set regex = Nothing
End Function
Note: Thought this would be helpful in the long run :) An Access Query Primer Article.

Is there a more efficient way to handle these replace calls

I'm querying across two dbs separated by a legacy application. When the app encounters characters like, 'ü', '’', 'ó' they are replaced by a '?'.
So to match messages, I've been using a bunch of 'replace' calls like so:
(replace(replace(replace(replace(replace(replace(lower(substring([Content],1,153)) , '’', '?'),'ü','?'),'ó','?'), 'é','?'),'á','?'), 'ñ','?'))
Over a couple thousand records, this is (as you would expect) very slow. There is probably a better way to do this. Thanks for telling me what it is.
One thing you can do is implement a RegEx Replace function as a SQL assembly and call it as a user-defined function on your column instead of the Replace() calls. It could be faster. You probably also want to apply the same RegEx Replace to the query values you pass in.
TSQL Regular Expression
You could create a persisted computed column on the same table where the [Content] column is.
Alternatively, you can probably speed up the replace by creating a user defined function in C# using a StringBuilder. And you can even combine both of these solutions.
[SqlFunction(IsDeterministic = true, IsPrecise = true)]
public static SqlString LegacyReplace(SqlString value)
{
if(value.IsNull) return value;
string s = value.Value;
int l = Math.Min(s.Length, 153);
var sb = new StringBuilder(s, 0, l, l);
sb.Replace('’', '?');
sb.Replace('ü', '?');
// etc...
return new SqlString(sb.ToString());
}
Why not first do the same replace (chars to "?") on the string you are searching for in the app side using regular expressions? E.g. your SQL server query that was passed a raw string to search for and used these nested replace() calls will instead be passed a search string already containing "?"s by your app code.
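On the application side, the whole chain of replacements can be done in a single pass with a translation table. The Python sketch below is one hedged way to do it: `legacy_normalize` is a hypothetical name, and the character set is just the six characters listed in the question, so it would need extending to whatever the legacy application actually mangles.

```python
# Characters the question lists as being turned into "?" by the legacy app.
LEGACY_TABLE = str.maketrans({ch: "?" for ch in "’üóéáñ"})

def legacy_normalize(text: str, limit: int = 153) -> str:
    # Mirrors lower(substring([Content], 1, 153)) followed by the
    # nested replace calls, but walks the string only once.
    return text[:limit].lower().translate(LEGACY_TABLE)
```

The normalized search string can then be passed to the query as a parameter, so the database compares against it without doing any replacing per row for the search value.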
Could you convert the strings to varbinary before comparing? Something like the below:
declare
@Test varbinary (100)
,@Test2 varbinary (100)
select
@Test = convert(varbinary(100),'abcu')
,@Test2 = convert(varbinary(100),'abcü')
select
case
when @Test <> @Test2 then 'NO MATCH'
else 'MATCH'
end

How to Replace Multiple Characters in Access SQL?

I'm a novice at SQL, so hopefully someone can spell this out for me. I tried following the "Replace Multiple Strings in SQL Query" posting, but I got stuck.
I'm trying to do the same thing as the originator of the above posting but with a different table and different fields. Let's say that the following field "ShiptoPlant" in table "BTST" has three records (my table actually has thousands of records)...
Table Name: BTST
---------------
| ShiptoPlant |
| ----------- |
| Plant #1 |
| Plant - 2 |
| Plant/3 |
---------------
Here's what I'm trying to type in the SQL screen:
SELECT CASE WHEN ShipToPlant IN ("#", "-", "/") Then ""
ELSE ShipToPlant END FROM BTST;
I keep getting the message (Error 3075)...
"Syntax error (missing operator) in query expression
'CASE WHEN ShiptoPlant IN (";","/"," ") Then "" ELSE ShipToPlant END'."
I want to do this operation for every character on the keyboard, with exception of "*" since it is a wildcard.
Any help you could provide would be greatly appreciated!
EDIT: Background Information added from the comments
I have collected line-item invoice-level data from each our 14 suppliers for the 2008 calendar year. I am trying to normalize the plant names that are given to us by our suppliers.
Each supplier can call a plant by a different name e.g.
Signode Service on our master list could be called by suppliers
Signode Service
Signode - Service.
SignodeSvc
SignodeService
I'm trying to strip non-alphanumeric chars so that I can try to identify the plant using our master listing by creating a series of links that look at the first 10 char, if no match, 8 char, 6, 4...
My basic hang-up is that I don't know how to strip the non-alphanumeric characters from the table. I'll be doing this operation on several columns, but I planned on creating separate queries to edit the other columns.
Perhaps I need to do a mass update query that strips all the non-alphanumerics. I'm still unclear on how to write it. Here's what I started out with to take out all the spaces. It worked great, but failed when I tried to nest the replace
UPDATE BTST SET ShipToPlant = replace(ShipToPlant," ","");
EDIT 2: Further Information taken from Comments
Every month, up to 100 new permutations of our plant names appear in our line item invoice data- this could represent thousands of invoice records. I'm trying to construct a quick and dirty way to assign a master_id of the definitive name to each plant name permutation. The best way I can see to do so is to look at the plant, address, city and state fields, but the problem with this is that these fields have various permutations as well, for example,
128 Brookview Drive
128 Brookview Lane
By taking out non-alphanumerics and doing
LEFT(PlantName,#chars) & _
LEFT(Address,#chars) & _
LEFT(City,#chars) & _
LEFT(State,#chars)
and by changing the number of characters until a match is found between the invoice data and the Master Plant Listing (both tables contain the Plant, Address, City and State fields), you can eventually find a match. Of course, when you start dwindling down the number of characters you are LEFTing, the accuracy becomes compromised. I've done this in excel and had decent yield. Can anyone recommend a better solution?
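The shrinking-prefix procedure described above can be sketched as follows. This is a Python illustration under assumptions, not an Access solution: the field names `plant`, `address`, `city`, and `state`, the function names, and the rule that a match must be unambiguous are all choices made for the example.

```python
import re

def clean(text: str) -> str:
    # Keep letters and digits only, lower-cased so comparison ignores case.
    return re.sub(r"[^a-z0-9]", "", text.lower())

def make_key(record: dict, n: int) -> str:
    # Equivalent of LEFT(PlantName, n) & LEFT(Address, n) & LEFT(City, n) & LEFT(State, n).
    fields = ("plant", "address", "city", "state")
    return "".join(clean(record[f])[:n] for f in fields)

def find_master(invoice: dict, masters: list[dict], lengths=(10, 8, 6, 4)):
    # Try the longest prefix first; accept only a single, unambiguous hit.
    for n in lengths:
        key = make_key(invoice, n)
        hits = [m for m in masters if make_key(m, n) == key]
        if len(hits) == 1:
            return hits[0], n
    return None, None
```

Returning the prefix length along with the match makes it possible to flag low-confidence matches (short prefixes) for manual review.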
You may wish to consider a User Defined Function (UDF)
SELECT ShiptoPlant, CleanString([ShiptoPlant]) AS Clean
FROM BTST
Function CleanString(strText)
Dim objRegEx As Object
Set objRegEx = CreateObject("VBScript.RegExp")
objRegEx.IgnoreCase = True
objRegEx.Global = True
objRegEx.Pattern = "[^a-z0-9]"
CleanString = objRegEx.Replace(strText, "")
End Function
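To see what the `[^a-z0-9]` pattern does to the sample rows, here is an equivalent Python sketch. `clean_string` is a hypothetical name; the `re.IGNORECASE` flag plays the role of `IgnoreCase = True`, so upper-case letters are kept.

```python
import re

# Anything that is not a letter or a digit.
NON_ALPHANUMERIC = re.compile(r"[^a-z0-9]", re.IGNORECASE)

def clean_string(text: str) -> str:
    # Remove every non-alphanumeric character, including spaces.
    return NON_ALPHANUMERIC.sub("", text)

for sample in ("Plant #1", "Plant - 2", "Plant/3"):
    print(sample, "->", clean_string(sample))
```

A single negated character class covers every punctuation character at once, which avoids nesting one Replace call per character.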
You could use the built in Replace function within Access
SELECT
Replace(Replace(Replace(ShipToPlant, "#", ""), "-", ""), "/", "") AS ShipToPlant
FROM
BTST
As others have said, within Access you can write your own functions in VBA and use them in your queries.
EDIT:
Here's a way to handle the nested Replace limit by wrapping the Replace function within our own function. It feels dirty but it works. Put this in a module within Access:
Public Function SuperReplace(ByRef field As String, ByVal ReplaceString As String) As String
' Size this as big as you need... it is zero-based by default'
Dim ReplaceArray(3) As String
'Fill each element with the character you need to replace'
ReplaceArray(0) = "#"
ReplaceArray(1) = "-"
ReplaceArray(2) = "/"
ReplaceArray(3) = " "
Dim i As Integer
For i = LBound(ReplaceArray) To UBound(ReplaceArray)
field = Replace(field, ReplaceArray(i), ReplaceString)
Next i
SuperReplace = field
End Function
Then test it with this query
SELECT
SuperReplace(ShipToPlant,"") AS ShipToPlant
FROM
BTST
You might want to take this and expand it so that you can pass in an array of strings instead of hard-coding them into the function.
EDIT 2:
In response to the additional information in the comments on the question, here's a suggestion for how you might handle the situation differently. The advantage of this approach is that once you have mapped a plant name permutation, you won't need to perform a string replace on data in future years; you only add new plant names and permutations to the map.
Start with creating another table, let's call it plant_map
CREATE TABLE plant_map (id AUTOINCREMENT PRIMARY KEY, name TEXT, master_id LONG)
Into plant_map, add every permutation of each plant name, and in the master_id field put the id of the row whose name you want to use for that whole group of permutations. From your comments, I'll use Signode Service:
INSERT INTO plant_map(name, master_id) VALUES ("Signode Service", 1);
INSERT INTO plant_map(name, master_id) VALUES ("Signode Svc", 1);
INSERT INTO plant_map(name, master_id) VALUES ("Signode - Service", 1);
INSERT INTO plant_map(name, master_id) VALUES ("Signode svc", 1);
INSERT INTO plant_map(name, master_id) VALUES ("SignodeService", 1);
Now when you query BTST table, you can get data for Signode Service using
SELECT
field1,
field2
FROM
BTST source
INNER JOIN
(
plant_map map1
INNER JOIN
plant_map map2
ON map1.master_id = map2.id
)
ON source.ShipToPlant = map1.name
WHERE
map2.name = "Signode Service"
Data within table BTST can remain unchanged.
Essentially, this is joining on the plant name in BTST to the name in plant_map then, using master_id, self joining on id within plant_map so that you need only pass in one "common" name. I would advise putting an index on each of the columns name and master_id in plant_map as both fields will be used in joins.
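The mapping-table join can be tried end to end with an in-memory database. The sketch below uses Python's `sqlite3` module, so the SQL is in SQLite's dialect rather than Access's, and the `amount` column and the sample BTST rows are made up purely for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE plant_map (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, master_id INTEGER);
    CREATE TABLE BTST (ShipToPlant TEXT, amount REAL);  -- amount is a made-up column
    -- The first row gets id 1, which the other permutations point to.
    INSERT INTO plant_map (name, master_id) VALUES ('Signode Service', 1);
    INSERT INTO plant_map (name, master_id) VALUES ('Signode Svc', 1);
    INSERT INTO plant_map (name, master_id) VALUES ('Signode - Service', 1);
    INSERT INTO plant_map (name, master_id) VALUES ('SignodeService', 1);
    INSERT INTO BTST VALUES ('Signode Svc', 10.0);
    INSERT INTO BTST VALUES ('SignodeService', 5.0);
    INSERT INTO BTST VALUES ('Some Other Plant', 7.0);
""")

# Join each invoice row to its permutation, then to the master name.
rows = conn.execute("""
    SELECT source.ShipToPlant, source.amount
    FROM BTST AS source
    INNER JOIN plant_map AS map1 ON source.ShipToPlant = map1.name
    INNER JOIN plant_map AS map2 ON map1.master_id = map2.id
    WHERE map2.name = 'Signode Service'
    ORDER BY source.ShipToPlant
""").fetchall()
print(rows)
```

Rows whose plant name has no entry in plant_map simply drop out of the inner join, which is a convenient way to list the permutations that still need mapping (by switching to a left join and filtering on missing matches).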
Don't think Access supports the CASE statement. Consider using iif:
iif ( condition, value_if_true, value_if_false )
For this case you can use the REPLACE function:
SELECT
REPLACE(REPLACE(REPLACE(yourfield, '#', ''), '-', ''), '/', '')
as FieldName
FROM
....
Create a public function in a Code module.
Public Function StripChars(ByVal pStringtoStrip As Variant, ByVal pCharsToKeep As String) As String
Dim sChar As String
Dim sTemp As String
Dim iCtr As Integer
sTemp = ""
For iCtr = 1 To Len(pStringtoStrip)
sChar = Mid(pStringtoStrip, iCtr, 1)
If InStr(pCharsToKeep, sChar) > 0 Then
sTemp = sTemp & sChar
End If
Next
StripChars = sTemp
End Function
Then in your query
SELECT
StripChars(ShipToPlant, "abcdefghijklmnopqrstuvwxyz0123456789") AS ShipToPlantDisplay
FROM
BTST
Notes - this will be slow for lots of records - if you want this to be permanent then create an update query using the same function.
EDIT: to do an Update:
UPDATE BTST
SET ShipToPlant = StripChars(ShipToPlant, "abcdefghijklmnopqrstuvwxyz0123456789")
OK, your question has changed, so the solution will too. Here are two ways to do it. The quick and dirty way will only partially solve your issue because it won't be able to account for the more odd permutations like missing spaces or misspelled words. The quick and dirty way:
Create a new table - let's call it tChar.
Put a text field in it for the char(s) you want to replace - we'll call it char for this example.
Put all the chars or char combinations that you want removed in this table.
Create and run the query below. Note that it will only remove one item at a time, but you can also put different versions of the same replacement in it, like ' -' or '-'.
For this example I created a table called tPlant with a field called ShipToPlant.
SELECT tPlant.ShipToPlant, Replace([ShipToPlant],
(SELECT top 1 char
FROM tChar
WHERE instr(ShipToPlant,char)<>0 ORDER BY len(char) Desc),""
) AS New
FROM tPlant;
The better (but much more complex) way. This explanation is going to be general because it would be next to impossible to put the whole thing in here. If you want to contact me directly use my user name at gmail.:
Create a table of Qualifiers - mistakes that people enter, like svc instead of service. Here you would enter every weird permutation you get.
Create a table with QualifierID and Plant ID. Here you would say which qualifier goes to which plant.
Create a query that joins the two and your table with mistaken plant names in it. Use InStr to say what is in the fields.
Create a second query that aggregates the first. Count the InStr field and use it as a score. The entry with the highest score is the plant.
You will have to hand-enter the ones it can't find, but pretty soon that will be next to none as you have more and more entries in the table.
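The scoring idea in these steps can be sketched in a few lines. This Python version is only an illustration of the counting logic: `best_plant` is a hypothetical name, the qualifier-to-plant mapping is passed in as a dictionary rather than read from tables, and ties are resolved arbitrarily in favour of the first plant encountered.

```python
def best_plant(raw_name: str, qualifiers: dict[str, str]):
    # qualifiers maps a text fragment (e.g. "svc") to a plant id.
    scores: dict[str, int] = {}
    lowered = raw_name.lower()
    for fragment, plant_id in qualifiers.items():
        if fragment.lower() in lowered:  # same role as InStr(...) > 0
            scores[plant_id] = scores.get(plant_id, 0) + 1
    if not scores:
        return None  # no qualifier matched; this one needs manual entry
    return max(scores, key=scores.get)  # plant with the highest score
```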
ughh
You have a couple different choices. In Access there is no CASE in sql, you need to use IIF. It's not quite as elegant as the solutions in the more robust db engines and needs to be nested for this instance, but it will get the job done for you.
SELECT
iif(instr(ShipToPlant,"#")<>0,"",
iif(instr(ShipToPlant,"-")<>0,"",
iif(instr(ShipToPlant,"/")<>0,"",ShipToPlant ))) AS FieldName
FROM BTST;
You could also do it using the sql to limit your data.
SELECT YourID, nz(aBTST.ShipToPlant,"") AS ShipToPlant
FROM BTST LEFT JOIN (
SELECT YourID, ShipToPlant
FROM BTST
WHERE ShipToPlant NOT IN("#", "-", "/")
) as aBTST ON BTST.YourID=aBTST.YourID
If you know VB you can also create your own functions and put them in the queries...but that is another post. :)
HTH
SELECT
IIF
(
Instr(1,ShipToPlant , "#") > 0
OR Instr(1,ShipToPlant , "/") > 0
OR Instr(1,ShipToPlant , "-") > 0, ""
, ShipToPlant
)
FROM BTST
All - I wound up nesting the REPLACE() function in two separate queries. Since there's upwards of 35 non-alphanumeric characters that I needed to replace and Access limits the complexity of the query to somewhere around 20 nested functions, I merely split it into two processes. Somewhat clunky, but it worked. Should have followed the KISS principle in this case. Thanks for your help!
I know this is a really old question, but I stumbled over it whilst looking for a solution to this problem, but ended up using a different approach.
The table that I wish to update is called 'Customers'. There are 20-odd accented characters in the 'CustName' field for which I wish to remove the diacritics - so (for example) ã > a.
For these characters I created a new table 'recodes' with 2 fields, 'char' and 'recode'. 'char' contains the character I wish to remove, and 'recode' holds the replacement.
Then for the replace I did a cross join (both tables listed with no join condition) inside the update statement
UPDATE Customers, Recodes SET Customers.CustName = Replace([CustName],[char],[recode]);
This has the same effect as nesting all of the replace statements, and is a lot easier to manage.
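The effect of the table-driven update is that each row of the recodes table is applied to each name in turn. A minimal Python sketch of that loop follows; `apply_recodes` is a hypothetical name and the pairs are passed in as a list instead of being read from a table.

```python
def apply_recodes(text: str, recodes: list[tuple[str, str]]) -> str:
    # Each (char, recode) pair plays the role of one row in the recodes table.
    for char, recode in recodes:
        text = text.replace(char, recode)
    return text
```

Adding a new character to handle then means adding a row of data, not editing a nested expression.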
This expression takes the first 3 characters and, unless they are numeric, replaces them with blanks.
Example: BO-1234
Output: 1234
BO: IIf(IsNumeric(Left([sMessageDetails],3)),[sMessageDetails],Replace([sMessageDetails],Left([sMessageDetails],3),""))