unique wildcard update SQL syntax - sql

I'm incredibly new to SQL (as in I've been painfully teaching myself "database administration" for a couple weeks at my job...) and I'm in a bit of a pickle. I have a table that has a column full of zip codes, but the data that was imported to this database was incorrect and we have a large, large number of zip codes with four digits versus five (the leading 0 was omitted).
I managed to get a script working that replaces a known four digit zip code with a proper five digit zip code, but the problem there is we have a few hundred thousand entries and I can't realistically thumb through them all to find every entry with four digits and add the zero.
I've copied the database and have ruined it a few times trying to make some syntax work, but essentially I am looking for a quick fix to simply add a 0 to any FOUR digit zip code while leaving any FIVE digit zip code alone.
Is there a way to have a SQL script use a wildcard for whatever the four digits are and add a zero at the start? I messed around for about 45 minutes at this place (http://www.w3schools.com/sql/sql_wildcards.asp) trying to get something to work, to no avail.
Much, much appreciation to anyone able to assist.
Pertinent information:
Table: tbl_Address
Column: ReceiverPostalCode
Incorrect Postal Codes: 8481 (should be 08481), 8638 (should be 08638), etc
Correct Postal Codes: 20872, 27501, 90039, etc
Best,
Steve

Assuming no zips are longer than 5 characters, you can do this:
update tbl_Address
set ReceiverPostalCode = right(concat('00', ReceiverPostalCode ), 5)
where len(ReceiverPostalCode) between 3 and 4
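To try the padding logic on a throwaway copy before running it on the real table, here is a minimal sketch in Python against an in-memory SQLite database. SQLite is an assumption made only so the example is self-contained: it has no right() or len(), so substr(..., -5) and length() stand in for them. The sample rows are the postal codes from the question.

```python
import sqlite3

# In-memory stand-in for the real table (assumption: codes are stored as text).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_Address (ReceiverPostalCode TEXT)")
conn.executemany(
    "INSERT INTO tbl_Address VALUES (?)",
    [("8481",), ("8638",), ("20872",), ("27501",), ("90039",)],
)

# SQLite equivalent of: right(concat('00', ReceiverPostalCode), 5)
# Only rows with 3 or 4 characters are touched; 5-character codes are left alone.
conn.execute(
    """
    UPDATE tbl_Address
    SET ReceiverPostalCode = substr('00' || ReceiverPostalCode, -5)
    WHERE length(ReceiverPostalCode) BETWEEN 3 AND 4
    """
)

fixed_codes = [
    row[0]
    for row in conn.execute(
        "SELECT ReceiverPostalCode FROM tbl_Address ORDER BY rowid"
    )
]
print(fixed_codes)
```

Running a SELECT with the same WHERE clause first shows which rows the update would change.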

Related

Format Phone Number SQL Server

I am working on a stored procedure that in part, pulls phone numbers from a database.
In some cases, the number may be in an incorrect format. The correct format is:
+714XXXXXXX, however there are cases where the number appears as, e.g: 142877261
or 7147267261. There are even some cases where the number appears as say, ++1749186372
How can I force the number to have +714 prepended to the start while keeping the rest of the number intact?
Any help would be much appreciated.
You can take the rightmost 7 digits and unconditionally prepend +714:
'+714' + right(phonenumber, 7)
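The same idea, sketched in Python so it can be checked against the sample numbers from the question. The function name force_prefix and its parameters are hypothetical, and stripping non-digit characters first is an added assumption that the SQL expression above does not make.

```python
def force_prefix(phonenumber: str, prefix: str = "+714", keep: int = 7) -> str:
    """Keep the rightmost `keep` digits and unconditionally prepend `prefix`."""
    # Assumption: drop anything that is not a digit (stray '+' signs, spaces).
    digits = "".join(ch for ch in phonenumber if ch.isdigit())
    # If fewer than `keep` digits exist, all of them are kept.
    return prefix + digits[-keep:]


for raw in ["142877261", "7147267261", "++1749186372"]:
    print(raw, "->", force_prefix(raw))
```

Note that this overwrites whatever preceded the last 7 digits, so it is only safe if every number really belongs under +714.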

How do I search for a five digit number in a string column?

What I'm trying to do is determine (using Teradata SQL) whether a person's zip code has accidentally been put on an address line. I've looked on various forums and I can't find any similar questions.
Ultimately, I would want to write something like:
Where address_line_1 like '%[0-9][0-9][0-9][0-9][0-9]%'
Any ideas?
Target database is Teradata 13.x
If you want to inspect the entire column to see if it contains only a ZIP code, you might try something like this:
where address_line_1 between '00000' and '99999'
But if you are thinking of searching the entire string for any occurrence of five consecutive digits, that would not be a good test anyway. For example, the following would be a perfectly valid mailing address:
28305 Southwest Main Street
Doing validity checks after data has been loaded is difficult; such a task should really be performed during the load process.
Find all the entries that match this regex: [^0-9][0-9][0-9][0-9][0-9][0-9][^0-9]
This will find numbers in the text that are exactly 5 digits long, assuming that's the definition of a zip code.
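As a rough illustration of the "exactly five digits" test, here is a Python sketch. It swaps the [^0-9] guards for lookarounds, which is a variation on the regex above rather than the same pattern: a guard like [^0-9] needs a character to match, so it misses a run at the very start or end of the string. The function name is hypothetical, and whether Teradata 13.x can evaluate such a pattern natively is not something this sketch assumes.

```python
import re

# Five digits that are not preceded or followed by another digit.
FIVE_DIGITS = re.compile(r"(?<![0-9])[0-9]{5}(?![0-9])")


def find_five_digit_runs(address_line: str) -> list[str]:
    """Return every run of exactly five consecutive digits in the line."""
    return FIVE_DIGITS.findall(address_line)


print(find_five_digit_runs("28305 Southwest Main Street"))
print(find_five_digit_runs("Suite 4, Trenton NJ 08638"))
print(find_five_digit_runs("PO Box 1234"))
```

The first line shows the caveat raised earlier in the thread: a five-digit house number is flagged just like a zip code, so matches still need a manual look.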
where address_line_1 between '00000' and '99999' would not work if there are four-digit numbers in your address_line_1 because it will pick them up
Where address_line_1 like '%[0-9][0-9][0-9][0-9][0-9]%' would be a better solution.

Struggling with a MySQL database of phone numbers

My application needs to store a list of international phone numbers in a MySQL database. The application then needs to query the database and search for a specific number. Sounds simple, but it's actually a huge problem.
Because users can search for that number in different formats, we'd have to do a full scan of the database each time.
For example, we might have the number 17162225555 stored in the database (along with another 5 million entries). Now a user comes along and attempts to search using 7162225555. Another user might try to search with 2225555, and so on. In other words, the database has to issue the SQL query using a "like %number%", which would result in a full scan.
How should we design this application? Is there some way to tweak MySQL to handle this better? Or should we not use SQL at all?
PS. We have millions of entries, and 10s of these search request per second.
This is very weird. I've struggled with this issue myself many times over the last 15 years, and generally come up with structures that separate area codes, country codes and numbers into separate fields. But while reading your question another solution just popped into my head. It does require a separate field, though, so it may not be appropriate for you.
You could have a separate field called reverse_phone_number and have it populated automatically by the DB engine. When people search, simply reverse the search string and query the indexed reverse field with just a % at the end of the LIKE string, thereby allowing the use of the index.
Depending on your DB engine, you may be able to create an index based on a user-defined function that does the reverse for you, obviating the need for an additional field.
In some countries, e.g. the UK, you may have an issue with leading zeros. A UK phone number is represented as (area code)(phone number), e.g. 01634 511098. When this is internationalised, the leading zero of the area code is removed and the international dial code (+ or 00) and the country code (44) are added. This results in an international phone number of +441634511098. Any user searching for 01634511098 would not find the phone number if it was entered in internationalised format. You can overcome this issue by removing leading zeros from the search string.
EDIT
Based on suggestions from Ollie Jones you should store the number as entered by the user and then strip leading zeros, punctuation and white space from the number before reversing and storing in the reversed field. Then simply use the same algorithm to strip the search string before reversing, find the record and then display the originally entered number back to the user.
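A minimal sketch of the strip-then-reverse approach described above, using Python with an in-memory SQLite database. SQLite, the table name phones, and the helper names are assumptions for illustration only. The point is the shape of the query: the search becomes a prefix match (pattern ending in %) on the reversed column. Whether a given engine actually uses the index for that LIKE should be confirmed with its query plan.

```python
import sqlite3


def normalize(number: str) -> str:
    """Strip punctuation and white space, then leading zeros."""
    digits = "".join(ch for ch in number if ch.isdigit())
    return digits.lstrip("0")


def reversed_key(number: str) -> str:
    """Normalized number, reversed, so a suffix search becomes a prefix search."""
    return normalize(number)[::-1]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE phones (entered TEXT, reverse_phone_number TEXT)")
conn.execute("CREATE INDEX idx_reverse ON phones (reverse_phone_number)")
for entered in ["+1 716 222 5555", "+44 1634 511098"]:
    # Store the number as entered, plus the stripped and reversed copy.
    conn.execute(
        "INSERT INTO phones VALUES (?, ?)", (entered, reversed_key(entered))
    )


def search(term: str) -> list[str]:
    """Find stored numbers that end with the (normalized) search term."""
    pattern = reversed_key(term) + "%"  # wildcard only at the end
    rows = conn.execute(
        "SELECT entered FROM phones WHERE reverse_phone_number LIKE ? "
        "ORDER BY entered",
        (pattern,),
    )
    return [row[0] for row in rows]


print(search("7162225555"))
print(search("2225555"))
print(search("01634 511098"))
```

An empty search term would match every row, so real code should reject very short inputs before querying.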

Mysql Datatype for US Zip (Postal Codes)

I am writing a web application that is US-specific, so the formats that other countries use for postal codes are not important. I have a list of US zip codes that I am trying to load into a database table that includes the
5 digit us zip code
latitude
longitude
usps classification code
state code
city
The zip code is the primary key, as it is what I will be querying against. I started using a MEDIUMINT(5), but that drops the leading zeros from the zip codes that have them.
I considered using a CHAR(5) but am concerned about the performance hit of indexing against a char column.
So my question is: what is the best MySQL datatype for storing zip codes?
Note: I have seen this come up in several other questions related to zip codes. I am only interested in US 5-digit zip codes, so there is no need to take other countries' postal code formats into consideration.
char(5) is the correct way to go. String indexing is quite fast, particularly when it is such a small data set.
You are correct in that you should never use an integer for a zip code, since it isn't truly numeric data.
Edit to add:
Check out this for good reasons why you don't use numbers for non-numerically important data:
Is it a good idea to use an integer column for storing US ZIP codes in a database?
Go with your MEDIUMINT(5) ZEROFILL; it should add the leading zeros for you. No need to impact the index and performance over a formatting issue.
If he makes it Char(6), then he can handle Canadian postal codes as well.
When you consider that there is a maximum of 100,000 5-digit zip codes, and how little space it would take up even if you made the entire table memory-resident, there's no reason not to.
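The trade-off argued over in this thread can be seen in a few lines of plain Python. This is only an illustration of the integer-versus-string behaviour, not MySQL itself: converting a zip code to an integer discards the leading zero, and zero-padding on output (roughly what ZEROFILL does for display) is what brings it back.

```python
zip_text = "08481"

# Stored as a number: the leading zero is not part of the value.
as_integer = int(zip_text)
print(as_integer)

# Stored as a 5-character string: the code survives unchanged.
as_string = zip_text
print(as_string)

# Re-padding the integer to width 5 on output restores the display form.
zero_filled = f"{as_integer:05d}"
print(zero_filled)
```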

Excel to SQL direct import error

While working for a considerable time on some sales data, I came across an error that really started to bug me and ate up my working hours. After so much effort, I was fed up and close to giving up on the un-importable records.
The scenario:
Bulk sales data arrives in txt/csv format and needs to be imported into a SQL database, then matched against the Address History information held in a combination of tables by comparing strings directly, field to field.
If the codes match, I need to run a script to update a few tables with data. If they do not match, I need to insert a whole bunch of data into different tables to create the ID that is required for the final sales import.
Most of the records matched, except for a few that were giving trouble. I just needed to import those into the history tables. Then the problems started: even after I updated them, I couldn't match them.
After many frustrating hours, I asked my girlfriend to check whether there was any error in the string I was working with.
The string is "Bramhall Stockport" to be matched to "Bramhall   Stockport". For SQL script, these two strings are not matching.
I bet if you copy and paste these into your table they would match, because this is now plain text.
Then Ana figured out the error (she is not a computer geek; she has a Masters in Architecture), simply by copying and pasting into Microsoft Word 2007.
Screenshot: http://www.contentbcc.com/Anushka/sql_xls.png
Do you see the difference? First is in the txt/csv file and second on the SQL table.
In the first one, you have three regular spaces (ASCII 0x20). In the second one, you have a regular space followed by a non-breaking space (Unicode 0xA0). In Excel you can do a search and replace with ALT+0160 as the search and a space character as the replacement to fix it.
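If the cleanup has to happen in a script rather than in Excel, something along these lines may help. It is a sketch in Python with a hypothetical helper name: it replaces the non-breaking space (U+00A0) with a regular space and collapses runs of spaces, so that both sides of the comparison are normalized the same way before matching.

```python
import re

NBSP = "\u00a0"  # non-breaking space, the character ALT+0160 types


def normalize_spaces(text: str) -> str:
    """Turn non-breaking spaces into regular ones and collapse space runs."""
    return re.sub(r" +", " ", text.replace(NBSP, " ")).strip()


from_file = "Bramhall   Stockport"          # three regular spaces
from_table = "Bramhall " + NBSP + "Stockport"  # space + non-breaking space

print(from_file == from_table)
print(normalize_spaces(from_file) == normalize_spaces(from_table))
```

Collapsing the runs is a design choice that goes beyond the Excel fix above; drop the re.sub call if the number of spaces is significant in your data.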