Get total number of user where username have defferrent case - sql

I have SQL table where username have different cases for example "ACCOUNTS\Ninja.Developer" or "ACCOUNTS\ninja.developer"
I want to find the how many records where username where first in first and last name capitalize ? how can use Regex to find the total ?
x table
User
"ACCOUNTS\James.McAvoy"
"ACCOUNTS\michael.fassbender"
"ACCOUNTS\nicholas.hoult"
"ACCOUNTS\Oscar.Isaac"

Do you want something like this?
select count(*)
from t
where name rlike 'ACCOUNTS\[A-Z][a-z0-9]*[.][A-Z][a-z0-9]*'
Of course, different databases implement regular expressions differently, so the actual comparator may not be rlike.
In SQL Server, you can do:
select count(*)
from t
where name like 'ACCOUNTS\[A-Z][^.][.][A-Z]%';
You might need to be sure that you have a case-sensitive collation.

In most cases in MS SQL string collation is case insensitive so we need some trick. Here is an example:
declare #accts table(acct varchar(100))
--sample data
insert #accts values
('ACCOUNTS\James.McAvoy'),
('ACCOUNTS\michael.fassbender'),
('ACCOUNTS\nicholas.hoult'),
('ACCOUNTS\Oscar.Isaac')
;with accts as (
select
--cleanup and split values
left(replace(acct,'ACCOUNTS\',''),charindex('.',replace(acct,'ACCOUNTS\',''),0)-1) frst,
right(replace(acct,'ACCOUNTS\',''),charindex('.',replace(acct,'ACCOUNTS\',''),0)) last
from #accts
)
,groups as (--add comparison columns
select frst, last,
case when CAST(frst as varbinary(max)) = CAST(lower(frst) as varbinary(max)) then 'lower' else 'Upper' end frstCase, --circumvert case insensitive
case when CAST(last as varbinary(max)) = CAST(lower(last) as varbinary(max)) then 'lower' else 'Upper' end lastCase
from accts
)
--and gather fruit
select frstCase, lastCase, count(frst) cnt
from groups
group by frstCase,lastCase

Your question is a little vague but;
You might be looking for the DISTINCT command.
REF
I don't think you need regex.
Maybe do something like:
Get distinct names from Table X as Table A
Use inputs table A as where clause on Table X
count
union
I hope this helps,
Rhys

Given your example set you can use a combination of techniques. First if the user name always begins with "ACCOUNTS\" then you can use substr to select the characters that start after the "\" character.
For the first name:
Then you can use a regex function to see if it matches against [A-Z] or [a-z] assuming your username must start with an alpha character.
For the last name:
Use the instr function on the substr and search for the character '.' and again apply the regex function to match against [A-Z] or [a-z] to see if the last name starts with an upper or a lower character.
To total:
Select all matches where both first and last match against upper and do a count. Repeat for the lower matches and you'll have both totals.

Related

SQL: Finding dynamic length characters in a data string

I am not sure how to do this, but I have a string of data. I need to isolate a number out of the string that can vary in length. The original string also varies in length. Let me give you an example. Here is a set of the original data string:
:000000000:370765:P:000001359:::3SA70000SUPPL:3SA70000SUPPL:
:000000000:715186816:P:000001996:::H1009671:H1009671:
For these two examples, I need 3SA70000SUPPL from the first and H1009671 from the second. How would I do this using SQL? I have heard that case statements might work, but I don't see how. Please help.
This works in Oracle 11g:
with tbl as (
select ':000000000:370765:P:000001359:::3SA70000SUPPL:3SA70000SUPPL:' str from dual
union
select ':000000000:715186816:P:000001996:::H1009671:H1009671:' str from dual
)
select REGEXP_SUBSTR(str, '([^:]*)(:|$)', 1, 8, NULL, 1) data
from tbl;
Which can be described as "look at the 8th occurrence of zero or more non-colon characters that are followed by a colon or the end of the line, and return the 1st subgroup (which is the data less the colon or end of the line).
From this post: REGEX to select nth value from a list, allowing for nulls
Sorry, just saw you are using DB2. I don't know if there is an equivalent regular expression function, but maybe it will still help.
For the fun of it: SQL Fiddle
first substring gets the string at ::: and second substring retrieves the string starting from ::: to :
declare #x varchar(1024)=':000000000:715186816:P:000001996:::H1009671:H1009671:'
declare #temp varchar(1024)= SUBSTRING(#x,patindex('%:::%', #x)+3, len(#x))
SELECT SUBSTRING( #temp, 0,CHARINDEX(':', #temp, 0))

Select rows that has mixed charcters in a single value e.g. 'Joh?n' in name column

In an oracle table:
1- a value in a VARCHAR column contains characters that are not letters.
Consider a scenarion where a name in 'last_name' column contains regular characters (A - Z, a - z) as well as characters that are not english letters (e.g. '.', '-', ' ','_', '>' or similar).
The challenge is to select the rows that has names in 'last_name' as '.John' or 'John.' or '-John' or 'Joh-n'
2- Is it possible to have non-date values in a Date defined column? If yes, how can such records be selected in an oracle query?
Thanks!
I believe this will do the trick:
SELECT * FROM mytable WHERE REGEXP_LIKE(last_name, '[^A-Za-z]');
As for your 2nd question, I am unsure. I would be glad if someone else could add on to what I have to answer your 2nd question. I have found this website thought that might be of help. http://infolab.stanford.edu/~ullman/fcdb/oracle/or-time.html
It explains the DATE format.
If I properly understand your goal, you need to select rows with last_name column containing the name 'John', but it may also have additional characters before, after, or even inside the name. In that case, this should be helpful:
select * from tab where regexp_replace(last_name, '[^A-Za-z]+', '') = 'John'

replace two characters in one cell

I am using this query to replace one character in a cell
select replace(id,',','')id from table
But I want to replace two characters in a cell.
If the cell is having this data (1,3.1), and I want it to look like this (131).
How can I replace two different characters in one cell?
Use TRANSLATE instead of REPLACE(). It replaces each occurrence of a character in the first pattern with its matched character in the second. To remove characters, simply leave cut short the replacement string:
select translate(id, '1,.', '1') id from table
Note that the second string cannot be null. Hence the need to include 1 (or some other character) in both strings.
Find out more.
Obviously the more characters you need to convert/remove the more attractive TRANSLATE() becomes. The main use for REPLACE is changing patterns (such as words) rather than individual characters.
Can use
select replace(translate(id,',.',' '),' ','') from table;
or
select regexp_replace('1,3.1','[,.]','') from dual;
or
select replace(replace(id,',',''),'.','') from table;
Call the replace again.
select replace(replace(id,',',''), '.','') id from table
Do this:
select REPLACE(REPLACE(id,',',''),'.','')
Or use a regular expression:
select regexp_replace(id, '[.,]', '') id from table
Find out more

Oracle: a query, which counts occurrences of all non alphanumeric characters in a string

What would be the best way to count occurrences of all non alphanumeric characters that appear in a string in an Oracle database column.
When attempting to find a solution I realised I had a query that was unrelated to the problem, but I noticed I could modify it in the hope to solve this problem. I came up with this:
SELECT COUNT (*), SUBSTR(TITLE, REGEXP_INSTR(UPPER(TITLE), '[^A-Z,^0-9]'), 1)
FROM TABLE_NAME
WHERE REGEXP_LIKE(UPPER(TITLE), '[^A-Z,^0-9]')
GROUP BY SUBSTR(TITLE, REGEXP_INSTR(UPPER(TITLE), '[^A-Z,^0-9]'), 1)
ORDER BY COUNT(*) DESC;
This works to find the FIRST non alphanumeric character, but I would like to count the occurrences throughout the entire string, not just the first occurrence. E. g. currently my query analysing "a (string)" would find one open parenthesis, but I need it to find one open parenthesis and one closed parenthesis.
There is an obscure Oracle TRANSLATE function that will let you do that instead of regexp:
select a.*,
length(translate(lower(title),'.0123456789abcdefghijklmnopqrstuvwxyz','.'))
from table_name a
Try this:
SELECT a.*, LENGTH(REGEXP_REPLACE(TITLE, '[^a-zA-Z0-9]'), '')
FROM TABLE_NAME a
The best option, as you discovered is to use a PL/SQL procedure. I don't think there's any way to create a regex expression that will return multiple counts like you're expecting (at least, not in Oracle).
One way to get around this is to use a recursive query to examine each character individually, which could be used to return a row for each character found. The following example will work for a single row:
with d as (
select '(1(2)3)' as str_value
from dual)
select char_value, count(*)
from (select substr(str_value,level,1) as char_value
from d
connect by level <= length(str_value))
where regexp_instr(upper(char_value), '[^A-Z,^0-9]'), 1) <> 0
group by char_value;

Ignoring spaces between words - an SQL question

I want to grab from a db a record with name='John Doe'.
I'd like my query to also grab 'John(4 spaces between)Doe','John(2 spaces betwewen)Doe' etc. (at least one space however).
I'd also like that the case won't matter, so I can also get 'John Doe' by typing
'john doe' etc.
Try this:
SELECT * FROM table WHERE lower(NAME) like 'john%doe%'
use like with wildcards (e.g. %) to get around the spaces and the lower (orlcase) to be case insensitive.
EDIT:
As the commenters pointed out, there are two shortcomings within this solution.
First: you will select "johnny doe" or worse "john Eldoe", or worse, "john Macadoelt" with this query, so you'll need extra filtering on the application side.
Second: using a function can lead to table scans instead of index scans. This may be avoided, if your dbms supports function based indexes. See this Oracle example
If your database has Replace function
Select * From Table
Where Upper(Replace(name, ' ', '')) = 'JOHNDOE'
The rest of these will return rows where the middle part between John and Doe is anything, not just spaces...
if your database has left function and Reverse, Try either
Select * From Table
Where left(Upper(name), 4) = 'JOHN'
And Left(Reverse(Upper(name), 3)) = 'EOD'
else use substring and reverse
Select * From Table
Where substring(Upper(name), 4) = 'JOHN'
And substring(Reverse(Upper(name), 3)) = 'EOD'
or Like operator
Select * From Table
Where Upper(name) Like 'JOHN%DOE'
In MYSQL:
select user_name from users where lower(trim(user_name)) regexp 'john[[:blank:]]+doe' = 1;
Explanation:
In the above case the user_name matches the regular expression john(one or more spaces)doe. The comparison is case insensitive and trim takes care of spaces before and after the string.
In case the search string contains special characters, they need to be escaped. More in mysql documentation: http://dev.mysql.com/doc/refman/5.5/en/regexp.html
In sql you can use wildcards, that is a character that stands in place of other characters. for example:
select * from table where name like 'john%doe'
will select all names that start with john and end with doe no matter how many characters in between.
This article explains more and gives some options.
The wildcard will match on zero characters, so if you want at least one space then you should do
select * from table where lower(name) like 'john %doe'
In Oracle you can do:
SELECT * FROM table
WHERE UPPER(REGEXP_REPLACE(NAME,'( ){2,}', ' ') = 'JOHN DOE';
No false positives like 'john mordoe' etc.