Query WHERE Only Alphabetic Characters - sql

I am trying to filter out data in my Excel sheet of customers for my company.
The three fields I need to filter by are FIRST_NAME, LAST_NAME, and COMPANY_NAME.
The rules are as follows:
FIRST_NAME AND LAST_NAME must NOT be NULL
FIRST_NAME AND LAST_NAME must be only alphabetic
The above rules are irrelevant IF COMPANY_NAME is NOT NULL
To reiterate: a customer must have both a FIRST_NAME and a LAST_NAME (neither may be missing), but a customer with a COMPANY_NAME is allowed to be missing FIRST_NAME and/or LAST_NAME.
Here's some example data and if they should stay in the data or not:
FIRST_NAME | LAST_NAME | COMPANY_NAME | Good customer?
-----------|-----------|--------------|--------------------------------
Alex | Goodman | AG Inc. | Yes - All are filled out
John | Awesome | | Yes - First and last are fine
Cindy | | Cindy Corp. | Yes - Company is filled out
| | Blank Spa | Yes - Company is filled out
| | | No - Nothing is filled out
Gordon | Mang#2 | | No - Last contains non-alphabet
Jesse#5 | Levvitt | JL Inc. | Yes - Company is filled out
Holly | | | No - No last or company names
Here is the query (With some fields in the SELECT clause removed):
SELECT VR_CUSTOMERS.CUSTOMER_ID, VR_CUSTOMERS.FIRST_NAME, VR_CUSTOMERS.LAST_NAME, VR_CUSTOMERS.COMPANY_NAME, ...
FROM DEV.VR_CUSTOMERS VR_CUSTOMERS
WHERE (
LENGTH(NAME)>4 AND
(UPPER(NAME) NOT LIKE UPPER('%delete%')) AND
(COMPANY_NAME IS NOT NULL OR (COMPANY_NAME IS NULL AND FIRST_NAME IS NOT NULL AND LAST_NAME IS NOT NULL AND FIRST_NAME LIKE '%^[A-z]+$%' AND LAST_NAME LIKE '%^[A-z]+$%'))
)
I've also tried the pattern '%[^a-z]%', and I've tried RLIKE and REGEXP instead of LIKE; those did not seem to work either.
With the above query, the results only show records with a COMPANY_NAME.

Fixed the issue using REGEXP_LIKE and the regex ^[A-Za-z]+$. (The range [A-z] is best avoided: in ASCII order it also covers the punctuation characters between Z and a, namely [, \, ], ^, _ and `.)
Here is the WHERE clause after this fix:
WHERE (
LENGTH(NAME)>4 AND
(UPPER(NAME) NOT LIKE UPPER('%delete%')) AND
(COMPANY_NAME IS NOT NULL OR (COMPANY_NAME IS NULL AND REGEXP_LIKE(FIRST_NAME, '^[A-Za-z]+$') AND REGEXP_LIKE(LAST_NAME, '^[A-Za-z]+$')))
)

It appears you're using MySQL, given your mention of RLIKE and REGEXP. In that case, try this WHERE clause, which uses the regular expression character class 'alpha'. The pattern is anchored with ^ and $ so that the whole value must be alphabetic; an unanchored '[[:alpha:]]+' would also match a value such as Mang#2, because it only needs to find one letter somewhere in the string:
WHERE
COMPANY_NAME is not null -- COMPANY_NAME being present is the higher priority pass condition
or ( -- but if COMPANY_NAME is not present, then the following conditions must be satisfied
FIRST_NAME is not null
and FIRST_NAME REGEXP '^[[:alpha:]]+$'
and LAST_NAME is not null
and LAST_NAME REGEXP '^[[:alpha:]]+$'
)
Bear in mind that the not null check is redundant: REGEXP returns NULL for a NULL operand, and the row is then filtered out. So the WHERE clause simplifies to:
WHERE
COMPANY_NAME is not null -- COMPANY_NAME being present is the higher priority pass condition
or ( -- but if COMPANY_NAME is not present, then the following conditions must be satisfied
FIRST_NAME REGEXP '^[[:alpha:]]+$'
and LAST_NAME REGEXP '^[[:alpha:]]+$'
)
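The rule from the question can also be checked outside the database. The following is a minimal Python sketch, not the poster's code: the function name is_good_customer is made up for illustration, and it assumes that an empty string counts as missing. It applies the rule to the sample rows and shows why the anchoring and the [A-Za-z] range matter.

```python
import re

# Whole-string match on ASCII letters only (equivalent to ^[A-Za-z]+$).
ALPHA_ONLY = re.compile(r"[A-Za-z]+")


def is_good_customer(first_name, last_name, company_name):
    """Apply the question's rule; None or '' is treated as missing (assumption)."""
    if company_name:
        return True  # a company name overrides the name checks
    return all(
        name is not None and ALPHA_ONLY.fullmatch(name) is not None
        for name in (first_name, last_name)
    )


# Sample rows from the question, with blanks represented as None.
rows = [
    ("Alex", "Goodman", "AG Inc."),
    ("John", "Awesome", None),
    ("Cindy", None, "Cindy Corp."),
    (None, None, "Blank Spa"),
    (None, None, None),
    ("Gordon", "Mang#2", None),
    ("Jesse#5", "Levvitt", "JL Inc."),
    ("Holly", None, None),
]
results = [is_good_customer(*row) for row in rows]

# An unanchored search finds the letters in "Mang#2" and wrongly accepts it.
unanchored_accepts_bad_name = re.search(r"[A-Za-z]+", "Mang#2") is not None

# The range A-z includes "_" (and five other punctuation characters).
wide_range_accepts_underscore = re.fullmatch(r"[A-z]+", "a_b") is not None

print(results)
```

The results list lines up with the "Good customer?" column of the sample table, in the same row order.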

Related

find duplicated record by first and last name

I have a table called beneficials. Some facts about it:
A beneficial belongs to one organization
An organization has many beneficials
Beneficials have first and last names and no other identification form.
Some sample data from the table
| id | firstname | lastname | organization_id |
|----|-----------|----------|-----------------|
| 1 | jan | kowalski | 1 |
| 2 | jan | kovalski | 3 |
| 3 | john | doe | 1 |
| 4 | jan | kowalski | 2 |
I want to find out whether a beneficial from one organization is also present in other organizations, matching on first and last name, and if so, get the id(s) of those organizations.
In the sample data above, given organization id 1, the query should return 2, because jan kowalski is also a beneficial of organization 2. It should not return 3: although jan kovalski matches on first name, the last name differs.
I came up with the following query:
with org_beneficials as (
select firstname, lastname from beneficials where organization_id = ? and deleted_at is null
)
select organization_id from beneficials
where firstname in (select firstname from org_beneficials)
and lastname in (select lastname from org_beneficials)
and deleted_at is null
and organization_id <> ?;
It kind of works, but it returns false positives when beneficials from different organizations share only a first name or only a last name. I need to match both first and last names and I can't figure out how.
I thought about joining the table to itself, but I'm not sure that would work since an organization has many beneficials. Adding a column like fullname is not something I want to do here.
You can group by first and last names, then filter for duplicates:
SELECT firstname, lastname
FROM beneficials
GROUP BY firstname, lastname
HAVING COUNT(*) > 1;
After your edits, it seems you want to select the records of people in other organizations who also appear in a given organization (here, organization 1):
SELECT *
FROM beneficials a
WHERE a.organization_id != 1
AND EXISTS (
SELECT 1
FROM beneficials b
WHERE a.firstname = b.firstname
AND a.lastname = b.lastname
AND b.organization_id = 1
);
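To see the EXISTS approach return organization ids rather than whole rows, here is a small sketch that runs against an in-memory SQLite database through Python's sqlite3 module. It is an illustration under assumptions: the helper name shared_organizations is made up, the deleted_at filter is carried over from the question's query, and SQLite stands in for whatever database the poster uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE beneficials ("
    " id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT,"
    " organization_id INTEGER, deleted_at TEXT)"
)
# Sample data from the question; deleted_at is left NULL for every row.
conn.executemany(
    "INSERT INTO beneficials (id, firstname, lastname, organization_id)"
    " VALUES (?, ?, ?, ?)",
    [
        (1, "jan", "kowalski", 1),
        (2, "jan", "kovalski", 3),
        (3, "john", "doe", 1),
        (4, "jan", "kowalski", 2),
    ],
)


def shared_organizations(conn, organization_id):
    """Ids of other organizations that share a (firstname, lastname) pair."""
    sql = """
        SELECT DISTINCT a.organization_id
        FROM beneficials a
        WHERE a.organization_id <> ?
          AND a.deleted_at IS NULL
          AND EXISTS (
              SELECT 1
              FROM beneficials b
              WHERE b.firstname = a.firstname
                AND b.lastname = a.lastname
                AND b.organization_id = ?
                AND b.deleted_at IS NULL
          )
        ORDER BY a.organization_id
    """
    return [row[0] for row in conn.execute(sql, (organization_id, organization_id))]


print(shared_organizations(conn, 1))
```

Because the subquery compares first and last name on the same row of b, a match on only one of the two names is not enough, which avoids the false positives of the two independent IN lists.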

VSQL: Concatenate two values in same column from same table

I have a table that looks like the following:
email | first_name
----------------------+------------
------@diffem.com | Matthew
------@email.net | Susan
------@email.net | Thomas
------@email.com | Donald
------@email.com | Paula
I.e. I have records where there is only one value (name) per key (email), but in other instances I have two values per key.
I want the output to look like this:
email | first_name
----------------------+-----------------
------@diffem.com | Matthew
------@email.net | Susan and Thomas
------@email.com | Donald and Paula
I have tried the following, but it is not working due to grouping by an aggregate function:
CREATE TABLE user.table1 AS
(
select distinct email
, case when email_count = 1 then first_name
when email_count = 2 then (MIN(first_name))||' and '||MAX(first_name))
else null end as first_name_grouped
FROM (
SELECT email
, first_name
, count(email) over (partition by email) as email_count
FROM table
)
x
)
;
I've also tried partitioning by email, putting the two names into different columns and then concatenating that, but am ending up with blanks in my output table (see below)
email | name1 | name 2
----------------------+--------+-------
------@email.net | Susan | null
------@email.net | null | Donald
Is there a way to do this in SQL, without creating two separate name columns? Thanks in advance.
What you are trying to accomplish can be done in MySQL with GROUP_CONCAT. Its default separator is a comma, so pass SEPARATOR to get the ' and ' wording:
SELECT email, GROUP_CONCAT(first_name SEPARATOR ' and ') AS first_name
FROM table
GROUP BY email
There is a similar function in MS SQL Server called STRING_AGG(); you can see more here: https://database.guide/mysql-group_concat-vs-t-sql-string_agg/
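If no string-aggregation function is available, the MIN/MAX idea from the question can still work once the window function is replaced by a plain GROUP BY. The sketch below is an assumption-laden illustration, not a Vertica-tested answer: it runs on SQLite via Python's sqlite3 module, the table name table1 is shortened from the question, and the email addresses are placeholders because the originals are masked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (email TEXT, first_name TEXT)")
# Placeholder addresses (hypothetical); names are those from the question.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [
        ("a@diffem.com", "Matthew"),
        ("b@email.net", "Susan"),
        ("b@email.net", "Thomas"),
        ("c@email.com", "Donald"),
        ("c@email.com", "Paula"),
    ],
)

# Aggregate per email: one name is returned as-is, two names are joined.
# Emails with more than two names yield NULL, as in the question's CASE.
sql = """
    SELECT email,
           CASE WHEN COUNT(*) = 1 THEN MIN(first_name)
                WHEN COUNT(*) = 2 THEN MIN(first_name) || ' and ' || MAX(first_name)
           END AS first_name_grouped
    FROM table1
    GROUP BY email
    ORDER BY email
"""
grouped = conn.execute(sql).fetchall()
for email, names in grouped:
    print(email, "|", names)
```

Grouping by email means COUNT, MIN and MAX are all evaluated per key, so there is no need to group by an aggregate or to build two separate name columns.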

How to find only Japanese language name and English name data in a table of sql server

I have a SQL Server table loaded with names in two languages, Japanese and English.
How can I identify the language of each value in that table?
How can I pull only the Japanese data? What I tried (below) didn't work.
How can I pull only the English data? I tried firstname like '%[^!-~ ]%'.
After that, I need to separate the Japanese and English names into two different columns.
Sample:
For English I tried: firstname like '%[^!-~ ]%'
For Japanese I tried: firstname like '[^A-Z]%'
select
case when firstname like '[^A-Z]%' then firstname
end as Japanese_firstname
from All_Users
Sample table
id | firstname
1 | steven
2 | 佳恵
3 | Yoshie
4 | Fruit south
5 | 果南
I need a query that produces the following. Is this possible?
id | firstname_english | firstname_Japanese
1 | steven | null
2 | null | 佳恵
3 | Yoshie | null
4 | Fruit south | null
5 | null | 果南
One idea would be to CONVERT the values to a varchar (assuming that you aren't using a Japanese collation). If, afterwards, they contain '?', then you know that the value contains a Unicode character outside of the collation's code page, and you can therefore assume it has kanji characters in it. The REPLACE removes any literal '?' first, so that it isn't mistaken for a converted character:
SELECT ID,
CASE WHEN CONVERT(varchar(100),REPLACE(YT.FirstName,'?','')) NOT LIKE '%?%' THEN YT.FirstName END AS RomanjiName,
CASE WHEN CONVERT(varchar(100),REPLACE(YT.FirstName,'?','')) LIKE '%?%' THEN YT.FirstName END AS KanjiName
FROM (VALUES(1,N'steven'),
(2,N'佳恵'),
(3,N'Yoshie'),
(4,N'Fruit south'),
(5,N'果南'))YT(ID,FirstName);
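The same "convert and look for ?" idea can be sketched in Python to make the mechanism visible. This is only an analogy, under the assumption that the SQL Server collation maps to Windows code page 1252; the function name split_by_script is made up for illustration.

```python
def split_by_script(name, codepage="cp1252"):
    """Return (latin_name, other_name); exactly one of the two is None.

    Characters that the code page cannot represent are replaced by '?',
    which mirrors what converting nvarchar to varchar does.
    """
    # Drop literal question marks first so they are not misread as replacements.
    stripped = name.replace("?", "")
    converted = stripped.encode(codepage, errors="replace").decode(codepage)
    if "?" in converted:
        return (None, name)
    return (name, None)


# Sample rows from the question.
sample = [(1, "steven"), (2, "佳恵"), (3, "Yoshie"), (4, "Fruit south"), (5, "果南")]
table = [(row_id, *split_by_script(name)) for row_id, name in sample]
for row in table:
    print(row)
```

Note the limits of the approach: any character outside the code page is treated as Japanese, and accented Latin letters that the code page does contain are treated as English.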

values not returned as it should in a SQL query

I am running a simple PostgreSQL query on the following table named storelocation2:
store_name | brand | city | store_id
----------------------------------------------
MS Products | SAMSUNG|Gurugram|5611
Ajay Electric| SAMSUNG|Gurugram|5611
Vijay Sales | SAMSUNG|Gurugram|5611
Upon running this command:
postgres=> \d storelocation2;
the following table details are returned:
Column | Type | Modifiers
-------------+-----------------------+-----------
store_name | character varying(20) |
brand | character varying(20) |
city | character varying(20) |
store_id | numeric |
Query
Now when I run the select statement:
select city
from storelocation2 where
brand='SAMSUNG';
the following is returned:
city
------
(0 rows)
which is wrong.
The problem is that what you see as SAMSUNG is not what you get. This is normally due to unexpected characters at the beginning or end of the string.
The most common would be spaces, which are handled with:
where trim(brand)= 'SAMSUNG'
Next are other hidden characters, which you can find with:
where brand like '%SAMSUNG%'
Then there are characters "in-between":
where brand ~ '.*S.*A.*M.*S.*U.*N.*G.*'
Depending on which of these end up matching, you can investigate the unusual characters in the column.
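The first two checks can be tried on made-up data to see how they narrow the problem down. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for PostgreSQL, and the stray characters (a trailing space and a leading tab) are hypothetical, since the question does not reveal what the column really contains.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE storelocation2 (store_name TEXT, brand TEXT, city TEXT, store_id INTEGER)"
)
# Hypothetical contamination: trailing space, leading tab, and one clean value.
conn.executemany(
    "INSERT INTO storelocation2 VALUES (?, ?, ?, ?)",
    [
        ("MS Products", "SAMSUNG ", "Gurugram", 5611),
        ("Ajay Electric", "\tSAMSUNG", "Gurugram", 5611),
        ("Vijay Sales", "SAMSUNG", "Gurugram", 5611),
    ],
)


def count_where(predicate):
    """Count rows matching a fixed, trusted predicate (demo only)."""
    return conn.execute(
        f"SELECT COUNT(*) FROM storelocation2 WHERE {predicate}"
    ).fetchone()[0]


exact = count_where("brand = 'SAMSUNG'")          # strict equality
trimmed = count_where("trim(brand) = 'SAMSUNG'")  # ignores surrounding spaces
contains = count_where("brand LIKE '%SAMSUNG%'")  # ignores anything around it

# repr() makes the otherwise invisible characters visible.
suspects = [
    repr(brand)
    for (brand,) in conn.execute(
        "SELECT brand FROM storelocation2 WHERE brand <> 'SAMSUNG' ORDER BY store_name"
    )
]
print(exact, trimmed, contains, suspects)
```

Each looser predicate matches more rows, and the difference between two counts points at the kind of stray character involved.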
Try the following, which uses the trim function (as already suggested in the comments):
select city
from storelocation2 where
trim(brand)='SAMSUNG';

PL/SQL to find Special Characters in multiple columns and tables

I am trying to come up with a script that we can use to locate any special characters that may exist in a column of data except for period, dash or underscore, and using variables.
My Data - Employees table:
---------------------------------------------------------
ID | LASTFIRST | LAST_NAME | FIRST_NAME | MIDDLE_NAME
---------------------------------------------------------
57 | Miller, Bob | Miller | &^$#*)er | NULL
58 | Smith, Tom | Smith | Tom | B
59 | Perry, Pat | Perry | P. | Andrew
My Script:
VAR spchars VARCHAR
spchars := '!#$%&()*+/:;<=>?@[\\\]^`{}|~'
select *
from (select dcid, LastFirst, Last_Name, First_Name, middle_name,
CASE WHEN REGEXP_LIKE(First_Name, '[ || spchars || ]*$' )
THEN '0' ELSE '1' END AS FNSPC
from employees)
where FNSPC = '0';
/
And all rows are returned.
Any idea what I am doing wrong here? I want to only select Bob Miller's row.
REGEXP, Schmegexp! ;-)
select * from employees
where translate (first_name, 'x!#$%&()*+/:;<=>?@[\]^`{}|~', 'x') != first_name;
That translates all the special characters to nothing, i.e. removes them from the string - hence changing the string value.
The 'x' is just a trick because translate doesn't work as you'd like if the 3rd parameter is null.
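As for why the original script returned every row: the pattern '[ || spchars || ]*$' is a single string literal, so the variable is never concatenated in, and the trailing * lets the pattern match zero characters at the end of any value. The translate comparison avoids both problems. The following Python sketch shows the same "remove and compare" idea; the function name has_special is made up, and the character list assumes that '@' belongs where the posted list repeats '#'.

```python
# Characters treated as special; period, dash and underscore are deliberately absent.
# Assumption: '@' is part of the intended list.
SPECIALS = "!#$%&()*+/:;<=>?@[\\]^`{}|~"

# Translation table that deletes every special character.
_DELETE_SPECIALS = str.maketrans("", "", SPECIALS)


def has_special(value):
    """True if removing the special characters changes the value."""
    if value is None:
        return False  # NULL never compares as different in the SQL version
    return value.translate(_DELETE_SPECIALS) != value


# (ID, FIRST_NAME) pairs from the question's sample table.
employees = [(57, "&^$#*)er"), (58, "Tom"), (59, "P.")]
flagged = [emp_id for emp_id, first_name in employees if has_special(first_name)]
print(flagged)
```

Only the row whose first name loses characters is flagged, which corresponds to Bob Miller's row in the sample data.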