I have a phone number and zip code field in a table. I am trying to get this information into a common format, and I want to get rid of all the extra junk like dashes, parentheses, spaces, and letters.
I was wondering if there is a way to do this with the REPLACE function. I tried passing it a pattern the way one would with REGEXP_LIKE() and had no luck. This is what I have:
select (REPLACE(numbers.PHONE,'[a-zA-Z._-%() ]',''))
from table numbers;
If there isn't a way to do this that's fine, I just wanted to avoid having to make a whole bunch of replace statements for everything I want to replace.
It depends on how much junk you have in your zip codes and phone numbers. For example, you could remove all non-digit characters in those fields with a replace like this one:
SELECT REGEXP_REPLACE('234N2&.-#3NDJ23842','[^[:digit:]]+') FROM DUAL
And afterwards you could format the resulting digits with a replace like this:
SELECT REGEXP_REPLACE('2342323842','([[:digit:]]{3})([[:digit:]]{3})([[:digit:]]{4})','\1 \2 \3') FROM DUAL
I know the examples are not valid zip codes or phone numbers, but I think they might help you.
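The two steps above can also be nested in a single Oracle query. This is only a sketch: the sample value and the alias phone_clean are made up for illustration, and the second pattern is anchored on the assumption that a clean phone number has exactly ten digits.

```sql
-- Hypothetical sample row; in practice, select from your own table and column.
WITH sample AS (
  SELECT '(234) 232-3842 ext.' AS phone FROM DUAL
)
SELECT REGEXP_REPLACE(
         -- Inner call: strip every run of non-digit characters.
         REGEXP_REPLACE(phone, '[^[:digit:]]+', ''),
         -- Outer call: regroup exactly ten digits as 3-3-4.
         '^([[:digit:]]{3})([[:digit:]]{3})([[:digit:]]{4})$',
         '\1 \2 \3') AS phone_clean
FROM sample;
-- phone_clean: 234 232 3842
```

Because the outer pattern is anchored with ^ and $, a value that does not reduce to exactly ten digits is left as bare digits rather than being partially reformatted.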
I am trying to store some text with a hyphen, aka dash (-), in an Oracle 12c VARCHAR2 field.
But when I do a SELECT on the table value, the hyphen/dash character shows up as a funny-looking symbol. I have tried escaping the dash (-) before inserting it, but that still produced the funny-looking symbol.
How do I store hyphens/dashes properly in Oracle?
Thank you
Posting this as an answer because it would be too long for a comment.
First you have to establish whether the problem occurs when inserting the dash or when fetching it. To verify, run this on the column:
select * from table where column like '%-%';
If you get output, that means it is stored properly, so the problem is with displaying it.
If you don't get output, that means you are not inserting it properly. In that case, show your insert statement. You just have to treat the dash like any other string character.
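As a further check, Oracle's DUMP function shows the bytes that are actually stored. The sketch below uses the hypothetical names my_table and my_column. One possible cause, offered only as an assumption, is that the source text contains a typographic dash pasted from a word processor rather than a plain hyphen; in that case the LIKE '%-%' test returns nothing even though a dash-like character is stored.

```sql
-- my_table and my_column are placeholder names.
-- Format 1016 = hexadecimal bytes (16) plus the character set name (+1000).
SELECT my_column,
       DUMP(my_column, 1016) AS stored_bytes
FROM my_table
FETCH FIRST 10 ROWS ONLY;
-- A plain ASCII hyphen appears as byte 2d.
-- An en dash (U+2013) stored in AL32UTF8 appears as e2,80,93.
```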
I'm looking to find specific characters in a column that holds addresses, so it contains numbers, letters, and spaces. Using "like" does not seem to work. I tried using "instr," but I can't get it right. Is it because it has spaces?
So for example:
1234 Arlington Hwy
I want to pull up any address records that have "Hwy" in them. Help please!
SELECT * FROM mytable
WHERE column1 LIKE '*Hwy*'
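The * acts as a wildcard, allowing anything to come before and after "Hwy". Note that * is the LIKE wildcard in Microsoft Access; most other databases (SQL Server, Oracle, MySQL) use % for this instead.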
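For a database that uses the standard % wildcard, the same query might look like the sketch below. It reuses mytable and column1 from the answer above; whether the match is case-sensitive depends on the database and its collation, which the question does not specify.

```sql
-- % matches any sequence of characters, including spaces and none at all.
SELECT *
FROM mytable
WHERE column1 LIKE '%Hwy%';
```

Spaces in the address are not a problem for LIKE; they are matched by % like any other character.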
I have a column that I need to clean the data up on.
First I'd like to do a select to get a record of the bad data, then I'd like to run a replace on the invalid characters.
I'm looking to select anything that contains non-alphanumeric characters, but ignoring the backslash "\" as the second character and also ignoring underscores and dashes in the rest of the string. Here are a couple of examples of the data I'm expecting to get back from this query.
#\AAA
A\Adam's
A\Amanda.Smith
B\Bear's-ltd
C\Couple & More
After this I'd like to run a replace on any of these invalid characters and replace them with underscores so the result would look like this:
_\AAA
A\Adam_s
A\Amanda_Smith
B\Bear_s-ltd
C\Couple_More
I do not think there is native support for that. You can create a CLR assembly to add regex support, for example: https://www.simple-talk.com/sql/t-sql-programming/clr-assembly-regex-functions-for-sql-server-by-example/
I am attempting to remove extraneous characters from data in a primary key column. The data in this column serves as a control number, and the extra characters are preventing a Web application from effectively interacting with the data.
As an example, one row may look like this:
ocm03204415 820302
I want to remove everything after the space...so the characters '820302'. I could manually do it, but, there are around 2,000 records that have these extra values in the column. It would be great if I could remove them programmatically. I can't do a simple Replace because the characters have no pattern...I couldn't define a rule to discover them...the only thing uniform is the space...although, now that I look at the data set, they do all start with 8.
Is there a way I could remove these characters programmatically? I am familiar with PL/SQL in the Oracle environment, and was wondering if Transact-SQL would offer some possibilities in the MS SQL Server environment?
Thanks so much.
You may want to look into the CHARINDEX function to find the space. Then you can use SUBSTRING to grab everything up to the space in a single UPDATE statement.
Try this:
UPDATE YourTable
SET YourColumn = LEFT(YourColumn,CHARINDEX(' ',YourColumn)-1)
WHERE CHARINDEX(' ',YourColumn) > 1
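Before running the update, it may be worth previewing what it would change. The sketch below uses the same placeholder names, YourTable and YourColumn, and the same condition as the UPDATE above; the aliases before_value and after_value are made up for illustration.

```sql
-- Preview only: nothing is modified.
SELECT YourColumn AS before_value,
       LEFT(YourColumn, CHARINDEX(' ', YourColumn) - 1) AS after_value
FROM YourTable
WHERE CHARINDEX(' ', YourColumn) > 1;
```

Since this is a primary key column, the update will fail if a truncated value collides with an existing key, so checking after_value for duplicates first is a reasonable precaution.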
I'm working on a search module that searches in text columns that contain HTML code. The queries are constructed like: WHERE htmlcolumn LIKE '% searchterm %';
By default, the module searches with a space at both ends of each search term. When a wildcard is given at the beginning and/or end of a search term, the corresponding space is removed (*searchterm -> LIKE '%searchterm %'). I've also added the possibility to exclude results containing certain words (-searchterm -> NOT LIKE '% searchterm %'). So far so good.
The problem is that words preceded by an HTML tag are not found (<br/>searchterm is not found when searching with LIKE '% searchterm ...'), nor are words that follow a comma, end with a period, and so on.
What I would like to do is search for words that are not preceded or followed by the characters A-Z and a-z. Any other character is OK.
Any ideas how i should achieve this? Thanks!
Look into MySQL's full-text search; it might be able to use non-letter characters as delimiters. It will also be much faster than a %term% search, since that requires a full table scan.
You could use a regular expression: http://dev.mysql.com/doc/refman/5.0/en/regexp.html
Generally speaking, it is better to use full text search facilities, but if you really want a small SQL, here it is:
SELECT * FROM `t` WHERE `htmlcolumn` REGEXP '[[:<:]]term[[:>:]]'
It returns all records that contain the word 'term', whether it is surrounded by spaces, punctuation, special characters, etc.
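The [[:<:]] and [[:>:]] word-boundary markers belong to the regex library used by older MySQL versions, such as the 5.0 release linked above. MySQL 8.0.4 and later use a different regex engine that does not accept them; there, the \b word-boundary marker does the same job. A sketch of the equivalent query, assuming default SQL mode (so the backslash must be doubled inside the string literal):

```sql
-- MySQL 8.0.4+ equivalent: \b marks a word boundary.
SELECT * FROM `t` WHERE `htmlcolumn` REGEXP '\\bterm\\b';
```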
I don't think SQL's "LIKE" operator alone is the right tool for the job you are trying to do. Consider using Lucene, or something like it. I was able to integrate Lucene.NET into my application in a couple days. You'll spend more time than that trying to salvage your current approach.
If you have no choice but to make your current approach work, then consider storing the text in two columns in your database. The first column holds the original text, punctuation and all. The second column holds a preprocessed copy of the text: just words, no punctuation, normalized so that it is easier to search with your "LIKE" approach.