MS-SQL List of email addresses LIKE statement/regex - sql

I have a column in my table called TO which is a comma separated list of email addresses. (1-n)
I am not concerned with a row if it contains ONLY addresses at Whatever@mycompany.com, and I want to flag such a row as 0. However, if a row contains a NON-mycompany address (even if mycompany addresses are also present), I'd like to flag it as 1. Is this possible using one LIKE statement?
I've tried:
AND
[To] like '%@%[^m][^y][^c][^o][^m][^p][^a][^n][^y]%.%'
The ideal output will be:
alice@mycompany.com, bob@mycompany.com, malory@yourcompany.com 1
alice@mycompany.com, bob@mycompany.com 0
malory@yourcompany.com 1
Would it be better to write some kind of parsing function to split the addresses out into a table if this isn't possible? I don't have an exhaustive list of the other domains in the data.

It's ugly, but it works. The CASE expression compares the number of occurrences of the @ symbol with the number of occurrences of @mycompany.com (the XXX... string just keeps the length of the string unchanged):
select
*
, flag = case when len(field) - len(replace(replace(field,'@mycompany.com','XXXXXXXXXXXXXX'),'@','')) > 0 then 1 else 0 end
from (
select 'alice@mycompany.com, bob@mycompany.com, malory@yourcompany.com' as field union all
select 'alice@mycompany.com, bob@mycompany.com' union all
select 'malory@yourcompany.com'
) x
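The masking idea can be checked outside the database with a short Python sketch. It is only a model of the logic, not T-SQL, and it assumes addresses use the usual `@` separator; the function name `has_external_address` is made up for illustration.

```python
def has_external_address(to_field: str, company_suffix: str = "@mycompany.com") -> int:
    """Return 1 if any address in the list is outside the company domain, else 0."""
    # Mask every company address first (the SQL version pads with X's to keep
    # the length; here we simply drop the suffix).
    masked = to_field.replace(company_suffix, "")
    # Any "@" that survives the masking belongs to a non-company address.
    return 1 if "@" in masked else 0


rows = [
    "alice@mycompany.com, bob@mycompany.com, malory@yourcompany.com",
    "alice@mycompany.com, bob@mycompany.com",
    "malory@yourcompany.com",
]
flags = [has_external_address(row) for row in rows]
```

Like the SQL version, this is a plain substring test, so an address such as someone@mycompany.com.example.org would be counted as a company address.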

I would suggest a simple counting approach. Count the number of times that "@mycompany.com" appears and count the number of addresses, which is the number of commas plus one. If these differ, then you have an issue:
select emails,
(case when len(emails) - len(replace(emails, ',', '')) + 1 =
len(emails) - len(replace(emails, '@mycompany.com', 'mycompany.com'))
then 0
else 1
end) as HasNonCompanyEmail
from t
To simplify the arithmetic, I replace "@mycompany.com" with "mycompany.com". This removes exactly one character per company address.
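A list of n addresses contains n - 1 commas, so the address count is the comma count plus one. The Python sketch below models that arithmetic; it assumes `@`-separated addresses, a comma-delimited list with no trailing comma, and a hypothetical function name.

```python
def has_non_company_email(emails: str, company_suffix: str = "@mycompany.com") -> int:
    """Return 0 when every address is a company address, otherwise 1."""
    # n addresses are separated by n - 1 commas.
    address_count = emails.count(",") + 1
    # Each company address contributes exactly one occurrence of the suffix.
    company_count = emails.count(company_suffix)
    return 0 if address_count == company_count else 1
```

For the three sample rows in the question this gives 1, 0 and 1 respectively.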

Related

Check specific integer is even

I want to know whether the 4th digit in the ID is even or odd.
If the 4th digit is even (that is, one of 0, 2, 4, 6, 8), I want to put the ID into a new column named 'Even'.
If the 4th digit is odd, the column should be named 'Odd'.
select ID as 'Female'
from Users2
where ID LIKE '%[02468]'
This shows if any of the numbers are even. I want to specify the 4th number
Try this:
select *, OddOrEven = iif(substring(ID,4,1) in ('0','2','4','6','8') , 'Even', 'Odd') from Users2
This will tell you whether the 4th character is Odd or Even.
This is of course assuming that the 4th character of ID column will be numeric.
To make it permanently part of the table, you can add a computed column as shown below.
alter table Users2
add OddOrEven as iif(substring(ID,4,1) in ('0','2','4','6','8'), 'Even', 'Odd')
1. Substring the character you are interested in.
2. Convert it to an int.
3. Check whether modulus 2 returns 0 (i.e. even).
select id
, case when convert(int,substring(id, 4, 1)) % 2 = 0 then 'Even' else 'Odd' end
from Users2;
Example:
select id
, case when convert(int,substring(id, 4, 1)) % 2 = 0 then 'Even' else 'Odd' end
from (values ('4545-4400'), ('4546-4400')) X (id);
Returns:
id          (no column name)
4545-4400   Odd
4546-4400   Even
That's assuming there is always a 4th character. If not, you would need to check for it.
You were close, but only need to check a single character against a set of characters:
where Substring( Id, 4, 1 ) like '[02468]'
Note that there is no wildcard (%) in the pattern.
It can be used in an expression like:
case when Substring( Id, 4, 1 ) like '[02468]' then 'Even' else 'Odd' end as Oddity
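To sanity-check the expected labels outside the database, the same test can be modelled in a few lines of Python. This is a sketch rather than T-SQL; the function name is hypothetical, and it returns None when there is no 4th character or it is not a digit, which is one possible way to handle the cases the answers warn about.

```python
def fourth_digit_parity(id_value: str):
    """Return 'Even' or 'Odd' for the 4th character, or None if it is not a digit."""
    # Python indexes from 0, so the 4th character is at index 3
    # (SUBSTRING(ID, 4, 1) in T-SQL, which indexes from 1).
    if len(id_value) < 4 or id_value[3] not in "0123456789":
        return None
    return "Even" if int(id_value[3]) % 2 == 0 else "Odd"
```

With the sample values from the answer above, '4545-4400' gives 'Odd' and '4546-4400' gives 'Even'.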

How to ignore specific string value when using pattern and patindex function in SQL Server Query?

I have this query here.
WITH Cte_Reverse
AS (
SELECT CASE PATINDEX('%[^0-9.- ]%', REVERSE(EmailName))
WHEN 0
THEN REVERSE(EmailName)
ELSE left(REVERSE(EmailName), PATINDEX('%[^0-9.- ]%', REVERSE(EmailName)) - 1)
END AS Platform_Campaign_ID,
EmailName
FROM [Arrakis].[xtemp].[Stage_SendJobs_Marketing]
)
SELECT REVERSE(Platform_Campaign_ID) AS Platform_Campaign_ID, EmailName
FROM Cte_Reverse
WHERE REVERSE(Platform_Campaign_ID) <> '2020'
AND REVERSE(Platform_Campaign_ID) <> ''
AND LEN(REVERSE(Platform_Campaign_ID)) = 4;
It is working for the most part; below is a screenshot of the result set.
The query above extracts the four digits at the right-hand end of the column value. But I am unable to figure out how to make it also cope with values whose rightmost part is -v2, -v1, etc., that is, a -v followed by a version number, which should be ignored.
If you want four digits, then one method is:
select substring(emailname, patindex('%[0-9][0-9][0-9][0-9]%', emailname), 4)
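The PATINDEX pattern locates the first run of four consecutive digits, which a regular expression can model in Python. This is only a sketch of the matching behaviour; the function name and the sample values are made up, since the real EmailName data is not shown.

```python
import re


def first_four_digits(email_name: str):
    """Return the first run of four consecutive digits, or None if there is none."""
    match = re.search(r"[0-9]{4}", email_name)
    return match.group(0) if match else None


# Hypothetical values, for illustration only.
print(first_four_digits("Spring-Sale-1234-v2"))  # the -v2 suffix has no four-digit run
print(first_four_digits("Newsletter-v1"))
```

Note that the first four-digit run wins, so a value containing a year before the campaign ID (for example "Promo-2020-5678-v2") would yield the year. In T-SQL, PATINDEX returns 0 when nothing matches, so that case needs its own handling as well.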

Selecting Strings With Alphabetized Characters - In SQL Server 2008 R2

This is a recreational pursuit, and is not homework. If you value academic challenges, please read on.
A radio quiz show had a segment requesting listeners to call in with words that have their characters in alphabetical order, e.g. "aim", "abbot", "celt", "deft", etc. I got these few examples by a quick Notepad++ (NPP) inspection of a Scrabble dictionary word list.
I'm looking for an elegant way in T-SQL to determine if a word qualifies for the list, i.e. all its letters are in alpha order, case insensitive.
It seemed to me that there should be some kind of T-SQL algorithm possible that will do a SELECT on a table of English words and return the complete list of all words in the Scrabble dictionary that meet the spec. I've spent considerable time looking at regex strings, but haven't hit on anything that comes even remotely close. I've thought about the obvious looping scenario, but abandoned it for now as "inelegant". I'm looking for your ideas that will obtain the qualifying word list,
preferably using
- a REGEX expression
- a tally-table-based approach
- a scalar UDF that returns 1 if the input word meets the requirement, else 0.
- Other, only limited by your creativity.
But preferably NOT using
- a looping structure
- a recursive solution
- a CLR solution
Assumptions/observations:
1. A "word" is defined here as two or more characters. My dictionary shows 55 2-character words, of which only 28 qualify.
2. No word will have more than two consecutive characters that are identical. (If you find one, please point it out.)
3. At 21 characters, "electroencephalograms" is the longest word in my Scrabble dictionary
(though why that word is in the Scrabble dictionary escapes me--the board is only a 15-by-15 grid.)
Consider 21 as the upper limit on word length.
4. All words LIKE 'Z%' can be dismissed because all you can create is {'Z','ZZ', ... , 'ZZZ...Z'}.
5. As the initial character of the dictionary's words proceeds through the alphabet, fewer words will qualify.
6. As the word lengths get longer, fewer words will qualify.
7. I suspect that there will be less than 0.2% of my dictionary's 60,387 words that will qualify.
For example, I've tried NPP regex searches like "^a[a-z][b-z][b-z][c-z][c-z][d-z][d-z][e-z]" for 9-letter words starting with "a", but the character-by-character alphabetic enforcement is not handled properly. This search will return "abilities" which fails the test with the "i" that follows the "l".
There are several free Scrabble word lists available on the web, but Phil Factor gives a really interesting treatment of T-SQL/Scrabble considerations at https://www.simple-talk.com/sql/t-sql-programming/the-sql-of-scrabble-and-rapping/ which is where I got my word list.
Care to give it a shot?
Split the word into individual characters using a numbers table. Use the numbers as one set of indices. Use ROW_NUMBER to create another set. Compare the two sets of indices to see whether they match for every character. If they do, the letters in the word are in alphabetical order.
DECLARE @Word varchar(100) = 'abbot';
WITH indexed AS (
SELECT
Index1 = n.Number,
Index2 = ROW_NUMBER() OVER (ORDER BY x.Letter, n.Number),
x.Letter
FROM
dbo.Numbers AS n
CROSS APPLY
(SELECT SUBSTRING(@Word, n.Number, 1)) AS x (Letter)
WHERE
n.Number BETWEEN 1 AND LEN(@Word)
)
SELECT
Conclusion = CASE COUNT(NULLIF(Index1, Index2))
WHEN 0 THEN 'Alphabetical'
ELSE 'Not alphabetical'
END
FROM
indexed
;
The NULLIF(Index1, Index2) expression does the comparison: it returns NULL if the arguments are equal, otherwise it returns the value of Index1. If all indices match, all the results will be NULL and COUNT will return 0, which means the letters in the word were in alphabetical order.
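The index comparison can be mirrored in Python as a quick cross-check of the idea. This is a sketch, not a T-SQL solution: sorting positions by (letter, position) plays the role of ROW_NUMBER() OVER (ORDER BY Letter, Number), and the function name is hypothetical.

```python
def is_alphabetical(word: str) -> bool:
    """True when the letters of the word are already in non-decreasing order."""
    letters = word.lower()  # case-insensitive, as the question asks
    original_order = list(range(len(letters)))
    # Rank positions by letter, breaking ties by position, like the
    # ORDER BY x.Letter, n.Number clause in the query above.
    sorted_order = sorted(original_order, key=lambda i: (letters[i], i))
    # The word is alphabetical exactly when sorting leaves every position in place.
    return original_order == sorted_order
```

Breaking ties by position matters for words with repeated letters such as "abbot": without it, the two b's could swap ranks and be reported as a mismatch.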
I did something similar to Andriy. I created a numbers table with the values 1-21. I use it to create one set of data with the individual letters ordered by index and a second set ordered alphabetically. I joined the sets to each other on the letter and the row number, and then count the NULLs. Anything over 0 means the word is not in order.
DECLARE @word VARCHAR(21)
SET @word = 'abbot'
SELECT Count(1)
FROM (SELECT Substring(@word, number, 1) AS Letter,
Row_number() OVER ( ORDER BY number) AS letterNum
FROM numbers
WHERE number <= CONVERT(INT, Len(@word))) a
LEFT OUTER JOIN (SELECT Substring(@word, number, 1) AS letter,
Row_number() OVER ( ORDER BY Substring(@word, number, 1)) AS letterNum
FROM numbers
WHERE number <= CONVERT(INT, Len(@word))) b
ON a.letternum = b.letternum
AND a.letter = b.letter
WHERE b.letter IS NULL
Interesting idea...
Here's my take on it. This returns a list of words that are in order, but you could easily return 1 instead.
DECLARE @WORDS TABLE (VAL VARCHAR(MAX))
INSERT INTO @WORDS (VAL)
VALUES ('AIM'), ('ABBOT'), ('CELT'), ('DAVID')
;WITH CHARS
AS
(
SELECT VAL AS SOURCEWORD, UPPER(VAL) AS EVALWORD, ASCII(LEFT(UPPER(VAL),1)) AS ASCIICODE, RIGHT(UPPER(VAL),LEN(VAL)-1) AS REMAINS, 1 AS ROWID, 1 AS INORDER, LEN(VAL) AS WORDLENGTH
FROM @WORDS
UNION ALL
SELECT SOURCEWORD, REMAINS, ASCII(LEFT(REMAINS,1)), RIGHT(REMAINS,LEN(REMAINS)-1), ROWID+1, INORDER+CASE WHEN ASCII(LEFT(REMAINS,1)) >= ASCIICODE THEN 1 ELSE 0 END AS INORDER, WORDLENGTH
FROM CHARS
WHERE LEN(REMAINS)>=1
),
ONLYINORDER
AS
(
SELECT *
FROM CHARS
WHERE ROWID=WORDLENGTH AND INORDER=WORDLENGTH
)
SELECT SOURCEWORD
FROM ONLYINORDER
Here it is as a UDF:
CREATE FUNCTION dbo.AlphabetSoup (@Word VARCHAR(MAX))
RETURNS BIT
AS
BEGIN
SET @WORD = UPPER(@WORD)
DECLARE @RESULT INT
;WITH CHARS
AS
(
SELECT @WORD AS SOURCEWORD,
@WORD AS EVALWORD,
ASCII(LEFT(@WORD,1)) AS ASCIICODE,
RIGHT(@WORD,LEN(@WORD)-1) AS REMAINS,
1 AS ROWID,
1 AS INORDER,
LEN(@WORD) AS WORDLENGTH
UNION ALL
SELECT SOURCEWORD,
REMAINS,
ASCII(LEFT(REMAINS,1)),
RIGHT(REMAINS,LEN(REMAINS)-1),
ROWID+1,
INORDER+CASE WHEN ASCII(LEFT(REMAINS,1)) >= ASCIICODE THEN 1 ELSE 0 END AS INORDER,
WORDLENGTH
FROM CHARS
WHERE LEN(REMAINS)>=1
),
ONLYINORDER
AS
(
SELECT 1 AS RESULT
FROM CHARS
WHERE ROWID=WORDLENGTH AND INORDER=WORDLENGTH
UNION
SELECT 0
FROM CHARS
WHERE NOT (ROWID=WORDLENGTH AND INORDER=WORDLENGTH)
)
-- ONLYINORDER can hold both 0 and 1; MAX returns 1 only when the final row qualified
SELECT @RESULT = MAX(RESULT) FROM ONLYINORDER
RETURN @RESULT
END
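When testing a UDF like this, it can help to have an independent reference to compare against. The Python sketch below is such a cross-check under the same 1/0 contract; it is not a T-SQL answer to the question, and the function name is borrowed from the UDF purely for illustration.

```python
def alphabet_soup(word: str) -> int:
    """Return 1 if each letter is >= the letter before it (case-insensitive), else 0."""
    letters = word.upper()
    # Pair every letter with its successor and require non-decreasing order.
    return int(all(a <= b for a, b in zip(letters, letters[1:])))
```

Running both over the same word list and comparing the results is a cheap way to catch mistakes in the recursive CTE.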

How to quickly compare many strings?

In SQL Server, I have a string column that contains numbers. Each entry I need is only one number so no parsing is needed. I need some way to find all rows that contain numbers from 400 to 450. Instead of doing:
...where my stringcolumn like '%400%' or stringcolumn like '%401%' or stringcolumn like '%402%' or ...
Is there a better way that can save on some typing?
There are also other values in these rows, such as '5335154', 'test4559@me.com', and '555-555-5555'. Filtering those out will need to be taken into account.
...where stringcolumn like '4[0-4][0-9]' OR stringcolumn = '450'
You don't need the wildcard if you want to restrict to 3 digits.
Use a LIKE pattern with character ranges to accomplish this.
...where stringcolumn like '4[0-4][0-9]' OR stringcolumn like '450'
One way:
WHERE Column like '%4[0-4][0-9]%'
OR Column LIKE '%450%'
Keep in mind that this will pick anything with the number in it, so 4500 will be returned as well.
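The difference between the pattern without wildcards and the one wrapped in % can be illustrated with Python regular expressions. This is a sketch of the matching behaviour only (LIKE is not a regex engine), and the function names are hypothetical.

```python
import re

RANGE_400_450 = r"4[0-4][0-9]|450"


def is_400_to_450(value: str) -> bool:
    """Whole-value match, like LIKE '4[0-4][0-9]' OR = '450' (no wildcards)."""
    return re.fullmatch(RANGE_400_450, value) is not None


def contains_400_to_450(value: str) -> bool:
    """Substring match, like LIKE '%4[0-4][0-9]%' OR LIKE '%450%'."""
    return re.search(RANGE_400_450, value) is not None
```

For example, '4500' and '5400' pass the substring version but not the whole-value version, while the other values mentioned in the question ('5335154', 'test4559@me.com', '555-555-5555') fail both.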
I would do the following:
select t.*
from (select t.*,
(case when charindex('4', col) > 0
then substring(col, charindex('4', col), 3)
end) as col4xx
from t
) t
where (case when isnumeric(col4xx) = 1
then (case when cast(col4xx as int) between 400 and 450 then 'true'
end)
end) = 'true'
I'm not a fan of having case statements in WHERE clauses. However, to ensure conversion to a number, this is needed (or the conversion could become a column in another subquery). Note that the following is not equivalent:
where col4xx between '400' and '450'
Since the string '44A' would match.
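The pitfall of comparing strings instead of numbers can be reproduced in Python, whose string comparison is also character by character. This is a sketch with hypothetical function names; SQL Server compares according to the column's collation, but for plain digits and letters like these the outcome is the same.

```python
def in_range_as_string(value: str) -> bool:
    """Lexicographic comparison, like: col4xx between '400' and '450'."""
    return "400" <= value <= "450"


def in_range_as_number(value: str) -> bool:
    """Convert first, then compare numerically; non-numeric values never match."""
    if not (value.isascii() and value.isdigit()):
        return False
    return 400 <= int(value) <= 450
```

'44A' and '4000' both fall inside the string range but are rejected by the numeric check, while '425' passes both.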

sql with <> and substring function

The query has to return records where the company is not equal to 'CABS' and where the part of the company name before the first space is not 'CABS' either (e.g. CABS NUTS). The company name can be CABS, COBS, CABST, CABS NUTS, or CAB.
SELECT *
FROM records
WHERE UPPER(SUBSTR(company, 0, (INSTR(company,' ')-1))) <> 'CABS'
OR COMPANY <> 'CABS'
But the above query is returning CABS NUTS along with COBS and CAB.
I tried using "LIKE CABS" and it looks fine, but if the company name is "CAB" it will not return "CABS" and CABS NUTS because of the LIKE. So LIKE is completely ruled out.
Can anyone please suggest a solution?
So you want all records where the first 4 characters of the Company field are not "CABS". Okay.
WHERE left(company, 4) != 'CABS'
SELECT
*
FROM
Records
WHERE
LEFT(Company, 4) <> 'CABS'
AND Company <> 'CABS'
Note: basic T-SQL string comparison is case-insensitive under the default collation.
Can't quite work out which ones you want returned, but have you considered LIKE 'CABS %'?
select * from records where company NOT IN (SELECT company
FROM records
WHERE UPPER(SUBSTR(company, 0, (INSTR(company,' ')-1))) = 'CABS'
OR COMPANY = 'CABS')
I think this will fetch the desired records from the records table
RECORDS:
COMPANY
=====================
CAB
CABST
COBS
First, I think you should use AND instead of OR in your compound condition.
Second, you could simplify the condition this way:
WHERE UPPER(SUBSTR(company, 0, (INSTR(company || ' ',' ') - 1))) <> 'CABS'
That is, the company <> 'CABS' part is not needed in this case.
The problem you are getting comes about because the result of the SUBSTR is null if there is no space. And thanks to three-valued logic, the result of some_var <> NULL is NULL, rather than TRUE as you might expect.
An example of this is shown by the query below:
with mytab as (
select 1 as myval from dual union all
select 2 as myval from dual union all
select null as myval from dual
)
select *
from mytab
where myval = 1
union all
select *
from mytab
where myval <> 1
This example will only return two rows rather than three rows that you might expect.
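The same behaviour can be reproduced without an Oracle instance. The sketch below runs an equivalent query against an in-memory SQLite database from Python; SQLite is used here only as a stand-in that follows the same three-valued logic for NULL comparisons, and the table is created explicitly because SQLite has no dual.

```python
import sqlite3


def rows_matching_either_condition():
    """Run the '= 1' and '<> 1' filters over 1, 2 and NULL and return the combined rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table mytab (myval integer)")
    conn.executemany("insert into mytab values (?)", [(1,), (2,), (None,)])
    rows = conn.execute(
        "select myval from mytab where myval = 1 "
        "union all "
        "select myval from mytab where myval <> 1"
    ).fetchall()
    conn.close()
    return rows


rows = rows_matching_either_condition()
print(len(rows))  # the NULL row satisfies neither condition
```

The NULL row is missing from both halves because both NULL = 1 and NULL <> 1 evaluate to NULL, which a WHERE clause treats as not true.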
There are several ways to rewrite the condition to make it ignore the null result from the substr function. These are listed below. However, as mentioned by one of the other respondents, the two conditions need to be joined using the AND operator rather than OR.
Firstly, you could explicitly check that the column has a space in it using the set of conditions below:
(INSTR(company,' ') = 0 or
UPPER(SUBSTR(company, 0, (INSTR(company,' ')-1))) <> 'CABS') and
COMPANY <> 'CABS'
Another option would be to use the LNNVL function. This is a function that I only recently found out about. It returns TRUE when the condition provided as its input evaluates to FALSE or NULL.
lnnvl(UPPER(SUBSTR(company, 0, (INSTR(company,' ')-1))) = 'CABS') and
COMPANY <> 'CABS'
And another option (which would probably be my preferred option) is to use the REGEXP_LIKE function. This is simple, to the point and easy to read.
WHERE not regexp_like(company, '^CABS( |$)')
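The pattern can be tried against the company names from the question using Python's re module. This is a sketch that only illustrates the pattern, not Oracle's REGEXP_LIKE itself, and the helper name is hypothetical.

```python
import re

# "CABS" at the start, followed by either a space or the end of the string.
CABS_PREFIX = re.compile(r"^CABS( |$)")


def keep_company(company: str) -> bool:
    """True when the name is neither 'CABS' nor 'CABS' followed by a space."""
    return CABS_PREFIX.search(company) is None


companies = ["CABS", "COBS", "CABST", "CABS NUTS", "CAB"]
kept = [name for name in companies if keep_company(name)]
```

Here kept ends up as CAB, CABST and COBS (in input order: COBS, CABST, CAB), which matches the RECORDS listing shown earlier.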