Regular Expression for zipcode pattern? - google-bigquery

How do I write a regular expression for a zipcode pattern? I need to make sure that all zipcodes (see the sample addresses) are 5 digits, but my query is not working.
with table1 as(
select "123 6th St. Melbourne, FL 32904" as address union all
select "71 Pilgrim Avenue, Chevy Chase, MD 20815" union all
select "70 Bowman St. South Windsor, CT 06074" union all
select "4 Goldfield Rd. Honolulu, HI 966815" union all
select "44 Shirley Ave. West Chicago, IL 60185" union all
select "514 S. Magnolia St. Orlando, FL 32806 "
)
select address,regexp_contains("address",r"\s\d{5}$")check from table1

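At a minimum, remove the quotes around address in regexp_contains. With the quotes, "address" is a string literal, so the pattern is tested against the word address rather than the column:
select address, regexp_contains(address, r"\s\d{5}$") check from table1
Also, you might want to revisit the use of $ at the end of the regex: the last sample address ends with a space, so an end-anchored pattern will not match it.
Consider r"\b\d{5}\b" as an option.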
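To see how the two patterns differ on the sample addresses, here is a minimal sketch in Python. It uses Python's re module rather than BigQuery's RE2 engine, so treating the results as equivalent is an assumption, although \s, \d, \b and $ should behave the same way for these inputs. The names ANCHORED and BOUNDED are made up for the illustration.

```python
import re

# Sample addresses from the question; the last one ends with a space.
addresses = [
    "123 6th St. Melbourne, FL 32904",
    "71 Pilgrim Avenue, Chevy Chase, MD 20815",
    "70 Bowman St. South Windsor, CT 06074",
    "4 Goldfield Rd. Honolulu, HI 966815",
    "44 Shirley Ave. West Chicago, IL 60185",
    "514 S. Magnolia St. Orlando, FL 32806 ",
]

ANCHORED = re.compile(r"\s\d{5}$")   # whitespace, five digits, end of string
BOUNDED = re.compile(r"\b\d{5}\b")   # exactly five digits between word boundaries

anchored = [bool(ANCHORED.search(a)) for a in addresses]
bounded = [bool(BOUNDED.search(a)) for a in addresses]

for addr, a, b in zip(addresses, anchored, bounded):
    print(f"{addr!r}: anchored={a}, bounded={b}")
```

Both patterns reject the six-digit 966815. Only the bounded pattern accepts the address with the trailing space. Keep in mind that \b\d{5}\b matches any five-digit run, so a five-digit house number would also count as a match.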

Related

Split string into 2 columns (Bigquery)

I need to split ADDRESS LINE into two columns - address number and street.
I have tried the following:
Select
REGEXP_EXTRACT_ALL(address_number, r"([0-9]+)"),
REGEXP_EXTRACT(address_street, r"([a-zA-Z]+)")
From table;
and
Select
substr(addressline1, 1, 4) as address_number,
substr(addressline1, 6, 30) as address_street,
From table;
However, none of them seem to be ideal because address line does not have strict structure.
It can be:
Addressline1
9666 Northridge Ct.
P.O. Box 8070
369 Peabody Road
83 Mountain View Blvd
3279 W 46th St
I think the best approach is to cut the line into two parts by splitting after the first space, but I have not found the right way to do it.
You can try the following code:
with data as
(select '9666 Northridge Ct.' as add1 Union all
select 'P.O. Box 8070' as add1 Union all
select '369 Peabody Road' as add1 Union all
select '83 Mountain View Blvd' as add1 Union all
select '3279 W 46th St' as add1 )
SELECT
add1,
REGEXP_EXTRACT_ALL(add1, r'\d+') AS numbers,
REGEXP_REPLACE(add1, r'\d+', '') AS non_numbers
FROM data
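Note that \d+ picks up every run of digits, so for '3279 W 46th St' it also pulls the 46 out of 46th. As a rough sketch of a stricter split, the Python below peels off a number only when it starts the line and leaves everything else as the street. The function name split_address and the pattern are assumptions made for illustration, not part of the answer above; the same pattern could probably be adapted to REGEXP_EXTRACT.

```python
import re

# A leading run of digits, whitespace, then the rest of the line.
LEADING_NUMBER = re.compile(r"^(\d+)\s+(.*)$")

def split_address(line):
    """Return (number, street); number is None when the line has no leading number."""
    m = LEADING_NUMBER.match(line)
    if m:
        return m.group(1), m.group(2)
    return None, line

lines = [
    "9666 Northridge Ct.",
    "P.O. Box 8070",
    "369 Peabody Road",
    "83 Mountain View Blvd",
    "3279 W 46th St",
]

for line in lines:
    print(split_address(line))
```

Lines such as 'P.O. Box 8070' come back whole with no number, which seems safer than splitting them at an arbitrary position.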

Oracle, show only rows where no numeric values appear

I have column X with values like:
619 19th St S, Oslo, AL 3522310, Spain
4538 S Harvard Ave, Roma, OK 74135, Germany
Golaa, CA , USA
Piri, SO, Italy
I would like to keep only the rows where the column contains no digits, so the outcome of the query should be:
Golaa, CA , USA
Piri, SO, Italy
I would use a regular expression:
SELECT *
FROM yourTable
WHERE NOT REGEXP_LIKE(x, '[0-9]');
You can also do this without regular expressions:
WHERE x = TRANSLATE(x, 'a0123456789', 'a')
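The TRANSLATE version works because every digit is deleted from x, so the result equals x only when there were no digits to begin with. The leading 'a' is a dummy character mapped to itself; it is needed because Oracle treats an empty third argument as NULL. A small Python sketch of both checks, with helper names invented for the illustration:

```python
import re

# Translation table that deletes every ASCII digit.
DELETE_DIGITS = str.maketrans("", "", "0123456789")

def no_digit_regex(s):
    # Counterpart of: NOT REGEXP_LIKE(x, '[0-9]')
    return re.search(r"[0-9]", s) is None

def no_digit_translate(s):
    # Counterpart of: x = TRANSLATE(x, 'a0123456789', 'a')
    return s == s.translate(DELETE_DIGITS)

rows = [
    "619 19th St S, Oslo, AL 3522310, Spain",
    "4538 S Harvard Ave, Roma, OK 74135, Germany",
    "Golaa, CA , USA",
    "Piri, SO, Italy",
]

by_regex = [r for r in rows if no_digit_regex(r)]
by_translate = [r for r in rows if no_digit_translate(r)]
print(by_regex)
print(by_translate)
```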
You can use Regular Expressions for pattern matching in Oracle.
SELECT
*
FROM
yourTable
WHERE
NOT REGEXP_LIKE(x, '[0-9]+')
This will exclude any rows that have one or more numeric digits in column x.
with s as (
select '619 19th St S, Oslo, AL 3522310, Spain' str from dual union all
select '4538 S Harvard Ave, Roma, OK 74135, Germany' from dual union all
select 'Golaa, CA , USA' from dual union all
select 'Piri, SO, Italy' from dual)
select *
from s
where str = translate(str, 'z1234567890', 'z');
STR
------------------------------
Golaa, CA , USA
Piri, SO, Italy

Insert space in between string Regexp Oracle

I have some addresses that are not in the proper format. I want to add a space where one is missing. Examples are shown below:
Input      Expected output
-----      ----------------
AVEX       AVE X or AVENUE X
AVE X      AVE X or AVENUE X
AVENUEX    AVENUE X or AVE X
AVENUE X   AVENUE X or AVE X
AVEOFCITY  AVE OF CITY or AVENUE OF CITY
I created the expression below, but it does not give the correct result in all cases. In particular, AVENUE breaks into AVE NUE:
SELECT REGEXP_REPLACE('AVENUEN','^(AVE(NUE)*?)(\w)','\1 \3') rep FROM dual;
This will get you a little closer. Just tweaked your regex a little to allow for an optional 'NUE' and handle 0 or more spaces after.
with tbl(id, str) as (
select 1, 'AVEX' from dual union all
select 2, 'AVE X' from dual union all
select 3, 'AVENUEX' from dual union all
select 4, 'AVENUE X' from dual union all
select 5, 'AVEOFCITY' from dual
)
SELECT
id,
REGEXP_REPLACE(str,'^(AVE(NUE)?) *?(\w)','\1 \3') rep
FROM tbl;
You may need another pass to handle the 'OFCITY' as who knows what could come after the AVENUE that you have to allow for.
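For a quick check of what the tweak changes, here is a sketch that runs both patterns through Python's re module. Oracle's regex engine is a different implementation, so identical behavior is an assumption; for these simple patterns the lazy and greedy quantifiers should act the same way.

```python
import re

ORIGINAL = r"^(AVE(NUE)*?)(\w)"     # lazy (NUE)*? tries zero repetitions first
TWEAKED = r"^(AVE(NUE)?) *?(\w)"    # greedy optional NUE, then optional spaces

def fix(pattern, s):
    # Keep group 1 (AVE or AVENUE), add one space, keep the next word character.
    return re.sub(pattern, r"\1 \3", s)

samples = ["AVEX", "AVE X", "AVENUEX", "AVENUE X", "AVEOFCITY"]
results = {s: fix(TWEAKED, s) for s in samples}

print(fix(ORIGINAL, "AVENUEN"))     # shows the AVE NUE split from the question
for s, r in results.items():
    print(f"{s!r} -> {r!r}")
```

The lazy quantifier is what splits AVENUE: zero repetitions of NUE already lets the overall match succeed, so the engine never tries to consume NUE. AVEOFCITY still comes out as AVE OFCITY, which is why a second pass is needed.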

SQLServer select common element

Hopefully this is a fairly simple bit of SQL. I have a table with two columns, street and city. Given a list of 3 street names, how do I select the city that is common to all of the streets?
For example.
Street   City
------   ----
1st St   NYC
2nd St   NYC
3rd St   NYC
1st St   SF
1st St   LA
etc St   XX
If I have "1st St", "2nd St" and "3rd St", which query returns "NYC"?
You can use group by and having:
select t.city
from table t
where t.street in ('1st st', '2nd st', '3rd st')
group by t.city
having count(distinct t.street) = 3;
This is an example of a set-within-sets query, where you are looking for sets of things (streets) for another thing (cities). Group by and having is a very flexible way of addressing this type of problem.
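The query can be tried without a SQL Server instance. The sketch below runs the same GROUP BY / HAVING logic against an in-memory SQLite database through Python's sqlite3 module. The table name streets is made up, and SQLite compares strings case-sensitively by default, so the literals match the stored case exactly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE streets (street TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO streets VALUES (?, ?)",
    [
        ("1st St", "NYC"),
        ("2nd St", "NYC"),
        ("3rd St", "NYC"),
        ("1st St", "SF"),
        ("1st St", "LA"),
    ],
)

# Keep only cities in which all three requested streets occur.
common = conn.execute(
    """
    SELECT city
    FROM streets
    WHERE street IN ('1st St', '2nd St', '3rd St')
    GROUP BY city
    HAVING COUNT(DISTINCT street) = 3
    """
).fetchall()

print(common)
conn.close()
```

COUNT(DISTINCT street) rather than COUNT(*) keeps a city from qualifying just because one street is listed three times.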

Return Distinct Values Where One Column Is The Same But One Column Different

I am trying to return results in T-SQL that show only the addresses associated with more than one name. The tricky part is that the table already contains many duplicate rows, so the HAVING COUNT variations I've tried do not work: every group has a count greater than one. As a result, I have not been able to easily pick out distinct names that share an address. The result set below is what I would like to produce. I did manage to produce it, but only as a sad last-ditch effort in Access, using a query with three subqueries:
Address       Name
101 1st Ave   Brian Wood
101 1st Ave   Amy Wood
101 1st Ave   Adam Wood
555 5th St    Sarah Parker
555 5th St    Parker Corp.
The sample data looks like this:
Address       Name
101 1st Ave   Brian Wood
101 1st Ave   Brian Wood
101 1st Ave   Brian Wood
101 1st Ave   Amy Wood
101 1st Ave   Adam Wood
555 5th St    Sarah Parker
555 5th St    Sarah Parker
555 5th St    Sarah Parker
555 5th St    Parker Corp.
I've been trying to get this for hours... I know there is a much simpler way to do this, but as it's been a 16-hour day and it's 2:00 am, I just can't get my head around it.
Here is an example of my best TSQL results... it does the trick but it bumps it into two different columns:
SELECT DISTINCT t1.Name, t2.Name, t1.Address
FROM tblLeads t1
JOIN tblLeads t2 ON t1.Address = t2.Address
WHERE t1.Name <> t2.Name
ORDER BY t1.Address
You can GROUP BY Address with COUNT(DISTINCT Name) > 1 to get the addresses that have more than one unique name, and then run a SELECT DISTINCT filtered on those addresses, like this:
SELECT DISTINCT Address,Name
From Table1
WHERE Address IN (
SELECT Address
FROM Table1
GROUP BY Address
HAVING COUNT(distinct Name) > 1
)
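As a sanity check, the query above can be run against the sample data in an in-memory SQLite database via Python's sqlite3 module. This is only a sketch: the table name leads and the extra single-name address '9 Elm Rd' are hypothetical, added to show that an address with one name is dropped even when its row is duplicated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE leads (Address TEXT, Name TEXT)")
conn.executemany(
    "INSERT INTO leads VALUES (?, ?)",
    [
        ("101 1st Ave", "Brian Wood"),
        ("101 1st Ave", "Brian Wood"),
        ("101 1st Ave", "Brian Wood"),
        ("101 1st Ave", "Amy Wood"),
        ("101 1st Ave", "Adam Wood"),
        ("555 5th St", "Sarah Parker"),
        ("555 5th St", "Sarah Parker"),
        ("555 5th St", "Sarah Parker"),
        ("555 5th St", "Parker Corp."),
        ("9 Elm Rd", "Solo Person"),   # hypothetical: duplicated, but one name
        ("9 Elm Rd", "Solo Person"),
    ],
)

# Inner query: addresses with more than one distinct name.
# Outer query: the de-duplicated (Address, Name) pairs for those addresses.
shared = conn.execute(
    """
    SELECT DISTINCT Address, Name
    FROM leads
    WHERE Address IN (
        SELECT Address
        FROM leads
        GROUP BY Address
        HAVING COUNT(DISTINCT Name) > 1
    )
    ORDER BY Address, Name
    """
).fetchall()

for row in shared:
    print(row)
conn.close()
```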
You could use multiple CTEs to simplify this task. First clean up the data by removing the duplicates with DISTINCT. Then use COUNT(*) OVER (PARTITION BY Address) to get the number of rows per Address:
WITH CleanedData AS
(
SELECT DISTINCT Address, Name
FROM dbo.tblLeads
),
CTE AS
(
SELECT Address, Name,
cnt = Count(*) OVER (Partition By Address)
FROM CleanedData
)
SELECT Address, Name
FROM CTE
WHERE cnt > 1
By the way, this also works if Address contains NULL values, unlike an Address IN (subquery) filter, which never matches a NULL address.
Use EXISTS to check that another row has the same address but a different name:
SELECT DISTINCT t1.Name, t1.Address
FROM tblLeads t1
WHERE EXISTS (select 1 from tblLeads t2
where t1.Address = t2.Address
and t1.Name <> t2.Name)
ORDER BY t1.Address
An alternative to Tim's solution, without a CTE:
select address, name
from (select t.*, count(*) over(partition by address) as cnt
from (select distinct address, name from tblLeads) t
) x -- SQL Server requires an alias on a derived table
where cnt > 1