T-SQL: checking for email format - sql

I have a scenario where I need data integrity in the physical database. For example, I have a variable @email_address VARCHAR(200) and I want to check whether the value of @email_address is in email format. Does anyone have an idea how to check the format in T-SQL?
Many thanks!

I tested the following query with many different wrong and valid email addresses. It should do the job.
-- Sample input; replace with the value to be checked.
DECLARE @email_address VARCHAR(200)
SET @email_address = 'user@example.com'

IF (
CHARINDEX(' ',LTRIM(RTRIM(@email_address))) = 0
AND LEFT(LTRIM(@email_address),1) <> '@'
AND RIGHT(RTRIM(@email_address),1) <> '.'
AND CHARINDEX('.',@email_address ,CHARINDEX('@',@email_address)) - CHARINDEX('@',@email_address ) > 1
AND LEN(LTRIM(RTRIM(@email_address ))) - LEN(REPLACE(LTRIM(RTRIM(@email_address)),'@','')) = 1
AND CHARINDEX('.',REVERSE(LTRIM(RTRIM(@email_address)))) >= 3
AND (CHARINDEX('.@',@email_address ) = 0 AND CHARINDEX('..',@email_address ) = 0)
)
print 'valid email address'
ELSE
print 'not valid'
It checks these conditions:
No embedded spaces
'@' can't be the first character of an email address
'.' can't be the last character of an email address
There must be a '.' somewhere after '@'
Only one '@' sign is allowed
The domain name should end with an extension of at least 2 characters
Can't have patterns like '.@' and '..'
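To try the same rules against several inputs at once, the predicate can be moved into a CASE expression over a small inline sample set. This is only a sketch: the sample addresses and the s alias are made up for illustration, and the VALUES table constructor assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample inputs; each invalid row is meant to trip one rule.
SELECT s.email_address,
       CASE WHEN CHARINDEX(' ', LTRIM(RTRIM(s.email_address))) = 0
             AND LEFT(LTRIM(s.email_address), 1) <> '@'
             AND RIGHT(RTRIM(s.email_address), 1) <> '.'
             AND CHARINDEX('.', s.email_address, CHARINDEX('@', s.email_address))
                 - CHARINDEX('@', s.email_address) > 1
             AND LEN(LTRIM(RTRIM(s.email_address)))
                 - LEN(REPLACE(LTRIM(RTRIM(s.email_address)), '@', '')) = 1
             AND CHARINDEX('.', REVERSE(LTRIM(RTRIM(s.email_address)))) >= 3
             AND CHARINDEX('.@', s.email_address) = 0
             AND CHARINDEX('..', s.email_address) = 0
            THEN 'valid email address'
            ELSE 'not valid'
       END AS verdict
FROM (VALUES
        ('user@example.com'),      -- passes every rule
        ('user name@example.com'), -- embedded space
        ('user@@example.com'),     -- more than one '@'
        ('user@example.c'),        -- extension shorter than 2 characters
        ('user.@example.com')      -- contains '.@'
     ) AS s(email_address);
```

Only the first row should come back as 'valid email address'; the others each fail the rule named in their comment.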

AFAIK there is no good way to do this.
The email format standard is so complex that parsers have been known to run to thousands of lines of code. Even if you settled for a simpler form that rejects some obscure but valid addresses, you would have to do it without regular expressions, which are not natively supported by T-SQL (again, I'm not 100% on that). That leaves you with a simple fallback of something like:
LIKE '%_@_%_.__%'
..or similar.
My feeling is generally that you shouldn't be doing this at the last possible moment (as you insert into a DB). You should be doing it at the first opportunity and/or at a common gateway (the controller which actually makes the SQL insert request), where incidentally you would have the advantage of regex, and possibly even a library which does the "real" validation for you.
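If the check does have to live in the database, as the question asks, the LIKE fallback can be attached to the column as a CHECK constraint. The following is a minimal sketch, not a real validator: the table dbo.Contact, its columns, and the constraint name are hypothetical, and the extra "no spaces" test is an added assumption.

```sql
-- Hypothetical table; only the CHECK constraint matters here.
CREATE TABLE dbo.Contact (
    contact_id    INT IDENTITY(1,1) PRIMARY KEY,
    email_address VARCHAR(200) NOT NULL,
    -- Coarse shape test: something, '@', something, '.', at least two characters.
    CONSTRAINT CK_Contact_email_shape
        CHECK (email_address LIKE '%_@_%_.__%'
               AND email_address NOT LIKE '% %')
);

INSERT INTO dbo.Contact (email_address) VALUES ('user@example.com'); -- accepted
INSERT INTO dbo.Contact (email_address) VALUES ('not-an-email');     -- rejected by the constraint
```

This only guards against obviously malformed values. It will still accept many strings that are not deliverable addresses.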

If you use SQL 2005 or 2008 you might want to look at writing CLR stored procedures and using the .NET regex engine. If you're using SQL 2000 or earlier you can use the VBScript scripting engine's regular expressions. You could also use an extended stored procedure.

There is no easy way to do it in T-SQL, I am afraid. To validate all the varieties of email address allowed by RFC 2822 you will need to use a regular expression.
If you want to simplify it, you will need to define your scope.

Related

How to use Regex to lowercase catalogue values without any logic codes

For a loan domain we pass some catalogue values, e.g. whether a customer is a primary or secondary customer. I need to check the values irrespective of case (uppercase, lowercase, camel case). The software I am using accepts only regex codes, not Java or JS code (it uses a different scripting language). I am trying to convert with regexp only, but I am still getting an error.
If catalogue_value ~"(/A-Z/)" then
Catalogue_value ~"/l"
Endif
I am still learning regex and trying to figure out the correct expressions to use.
Please tell me the correct format for using regex to change a value to lowercase / uppercase.
If I understood your problem, you want to search without worrying about case: for example, the data is Paul, and you want to find this record when searching for PAUL, paul, PaUl, etc.?
One common technique is to convert both sides to upper or lower case, without regex. For example, in JavaScript:
"Paul".toLowerCase() === "paUL".toLowerCase()
In SQL:
select case when LOWER('Paul') = LOWER('paUL') then 1 else 0 end

REVERSE function in Netezza not working, how to extract file name from path without it?

My company has production and testing databases in SQL Server and a data warehouse in IBM Netezza. I wrote a query in SQL Server and now need to convert it for use in the data warehouse; however, I am running into a problem.
A crucial part of the query is extracting a file name from a path, and in SQL Server I use this:
RIGHT( BitmapID, CHARINDEX( '\', REVERSE( BitmapID ) + '\' ) - 1 )
This turns "G:\grps\every\Permanent Marketing Signage\SPC\BRD\BLAD\BCAG_BLAD_001.png" to "BCAG_BLAD_001.png" and it works perfectly. I tried to convert this to Netezza syntax like so:
SUBSTRING(bit_map_ID, LENGTH(bit_map_ID) - ( STRPOS( REVERSE( bit_map_ID ), '\' ) + 2 ) )
However, when I run this, I get an error:
ERROR [42S02] ERROR: Function 'REVERSE(VARCHAR)' does not exist
Unable to identify a function that satisfies the given argument types
You may need to add explicit typecasts
When I replace REVERSE( bit_map_ID ) with a reversed string example like "gnp.100_DALB_GACB\DALB\DRB\CPS\egangiS gnitekraM tnenamreP\yreve\sprg:G" this also works perfectly, so it's the REVERSE function that's the problem. Even though Aginity Workbench highlights the REVERSE function as if it exists, it doesn't seem to work at all - or if there is a way to make it work, I can't figure it out. I've already tried using CAST as suggested by the error message but it makes no difference.
Is there a way to reverse a string in Netezza? Or failing that, is there any other way of accomplishing what I want to do without reversing the string?
I was able to figure out how to do this in Netezza without using a REVERSE function like so:
SUBSTRING( bit_map_ID, INSTR( bit_map_ID, '\', -1 ) + 1 )
The key is to use the INSTR function and specify the third argument as -1 so that it will look for the first instance starting from the end of the string instead of the beginning of the string. No reversing needed.
While this works for my needs, I would definitely be open for alternative answers for the question I posed!
To my knowledge, the REVERSE function does not exist on Netezza, and that is indeed what the error message above says, so I can confirm that the solution you provided is the way to go.
Alternative solutions would be to use a regular expression function or a string split.
To my knowledge SQL Server has none of those 3 solutions available. The real issue for you is probably that the SQL standard does not include a list of functions needed to be compliant, so each database has its own take on which functions to include and what their interface is (negative arguments to INSTR are not universally accepted).

Using SQL - how do I match an exact number of characters?

My task is to validate existing data in an MSSQL database. I've got some SQL experience, but not enough, apparently. We have a zip code field that must be either 5 or 9 digits (US zip). What we are finding in the zip field are embedded spaces and other oddities that will be prevented in the future. I've searched enough to find the references for LIKE that leave me with this "novice approach":
ZIP NOT LIKE '[0-9][0-9][0-9][0-9][0-9]'
AND ZIP NOT LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
Is this really what I must code? Is there nothing similar to...?
ZIP NOT LIKE '[\d]{5}' AND ZIP NOT LIKE '[\d]{9}'
I would loathe validating longer fields this way! I suppose, ultimately, both code sequences will be equally efficient (or should be).
Thanks for your help
Unfortunately, LIKE is not regex-compatible so nothing of the sort \d. Although, combining a length function with a numeric function may provide an acceptable result:
WHERE ISNUMERIC(ZIP) <> 1 OR LEN(ZIP) NOT IN(5,9)
I would not recommend it, however, because ISNUMERIC will return 1 for a +, a -, or a valid currency symbol. The minus sign especially may be prevalent in the data set, so I'd still favor your "novice" approach.
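For longer fields, the repeated character class does not have to be typed out by hand. As a sketch, assuming T-SQL's built-in REPLICATE function and a few made-up sample values, the same "novice" test can be written with the digit count as a parameter:

```sql
-- REPLICATE builds '[0-9]' repeated n times, so the pattern length is a parameter.
-- The sample values are hypothetical; VALUES as a derived table needs SQL Server 2008+.
SELECT z.ZIP
FROM (VALUES ('12345'), ('123456789'), ('1234 5'), ('12-45')) AS z(ZIP)
WHERE z.ZIP NOT LIKE REPLICATE('[0-9]', 5)
  AND z.ZIP NOT LIKE REPLICATE('[0-9]', 9);
```

Here the query returns the two malformed values, '1234 5' and '12-45'.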
Another approach is to use:
ZIP LIKE '%[^0-9]%' OR LEN(ZIP) NOT IN(5,9)
which will find any row where ZIP contains a character that is not 0-9, or where the length is not 5 or 9.
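A quick way to see what that predicate flags is to run it over a handful of inline values. The samples below are invented for illustration, and the VALUES derived table assumes SQL Server 2008 or later:

```sql
-- Flag rows that contain any non-digit, or whose length is not 5 or 9.
SELECT z.ZIP
FROM (VALUES ('12345'), ('123456789'), ('12 45'), ('12-45'), ('123456')) AS z(ZIP)
WHERE z.ZIP LIKE '%[^0-9]%'
   OR LEN(z.ZIP) NOT IN (5, 9);
```

This returns '12 45' and '12-45' (non-digit characters) and '123456' (wrong length), while the two well-formed values are left out.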
There are a few ways you could achieve that.
You can replace each [0-9] with _ (one per expected character; note that _ matches any character, not only digits), like
ZIP NOT LIKE '_____'
Or use LEN(), so it's like
LEN(ZIP) NOT IN(5,9)
You are looking for LEN() (called LENGTH() in some other databases):
select * from table WHERE LEN(ZIP)=5;
select * from table WHERE LEN(ZIP)=9;
To test for non-numeric values you can use ISNUMERIC():
WHERE ISNUMERIC(ZIP) <> 1

Is there a way to use the LIKE operator from an entity framework query?

OK, I want to use the LIKE keyword from an Entity Framework query for a rather unorthodox reason - I want to match strings more precisely than when using the equals operator.
Because the equals operator automatically pads the string to be matched with spaces such that col = 'foo ' will actually return a row where col equals 'foo' OR 'foo ', I want to force trailing whitespaces to be taken into account, and the LIKE operator actually does that.
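The difference described above can be seen directly in SQL. This is a small sketch that assumes SQL Server's comparison semantics and uses literal strings in place of a real column:

```sql
SELECT CASE WHEN 'foo' = 'foo '    THEN 1 ELSE 0 END AS equals_match, -- 1: '=' pads the shorter string with spaces
       CASE WHEN 'foo' LIKE 'foo ' THEN 1 ELSE 0 END AS like_match;   -- 0: the trailing space in the pattern is significant
```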
I know that you can coerce Entity Framework into using the LIKE operator using .StartsWith, .EndsWith, and .Contains in a query. However, as might be expected, this causes EF to prefix, suffix, and surround the queried text with wildcard % characters. Is there a way I can actually get Entity Framework to directly use the LIKE operator in SQL to match a string in a query of mine, without adding wildcard characters? Ideally it would look like this:
string usernameToMatch = "admin ";
if (context.Users.Where(usr => usr.Username.Like(usernameToMatch)).Any()) {
// An account with username 'admin ' ACTUALLY exists
}
else {
// An account with username 'admin' may exist, but 'admin ' doesn't
}
I can't find a way to do this directly; right now, the best I can think of is this hack:
context.Users.Where(usr =>
usr.Username.StartsWith(usernameToMatch) &&
usr.Username.EndsWith(usernameToMatch) &&
usr.Username == usernameToMatch
)
Is there a better way? By the way I don't want to use PATINDEX because it looks like a SQL Server-specific thing, not portable between databases.
There isn't a way to get EF to use LIKE in its query. However, you could write a stored procedure that finds users using LIKE with an input parameter, and use EF to call that stored procedure.
Your particular situation however seems to be more of a data integrity issue though. You shouldn't be allowing users to register usernames that start or end with a space (username.Trim()) for pretty much this reason. Once you do that then this particular issue goes away entirely.
Also, allowing 'rough' matches on authentication details is beyond insecure. Don't do it.
Well there doesn't seem to be a way to get EF to use the LIKE operator without padding it at the beginning or end with wildcard characters, as I mentioned in my question, so I ended up using this combination which, while a bit ugly, has the same effect as a LIKE without any wildcards:
context.Users.Where(usr =>
usr.Username.StartsWith(usernameToMatch) &&
usr.Username.EndsWith(usernameToMatch) &&
usr.Username == usernameToMatch
)
So, if the value is LIKE '[usernameToMatch]%' and it's LIKE '%[usernameToMatch]' and it = '[usernameToMatch]' then it matches exactly.

In T-SQL under MS SQL Server 2008, what does '@' mean in front of a parameter *value* that's a string literal?

I have come across the following example code:
EXECUTE msdb.dbo.sysmail_add_profileaccount_sp
@profile_name = @'SQL mail profile',
@account_name = @'account name',
@sequence_number = 1 ;
What does '@' mean in front of the string literal, as in the example above:
@account_name=@'account name'
I understand that my question may stand true for any executable module's parameters in T-SQL, or maybe for any string literal in T-SQL in general, so the above is just a concrete example of what I'm looking at.
I do not think that this is valid T-SQL. This may be an artifact of replacing variables with values somewhere in a script and not trimming the leading @.
I get a syntax error with that, so I don't think it means anything except that it's not valid syntax. Did you pull that from valid T-SQL that is being called using just T-SQL, or perhaps this is parameterized stuff coming from some other language or program?
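For comparison, here is a sketch of the same call with the stray prefix removed from the literals, so that each named parameter is simply assigned a plain string. It assumes that is what the original script intended, and that a profile and an account with these names already exist in Database Mail:

```sql
-- Named parameters take ordinary string literals; no prefix character is needed.
EXECUTE msdb.dbo.sysmail_add_profileaccount_sp
    @profile_name    = 'SQL mail profile',
    @account_name    = 'account name',
    @sequence_number = 1;
```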