PDA

View Full Version : How do I separate city, state, zip into separate columns?


happyinLA
08-11-2010, 06:19 PM
I have a table where city, state, and zip are all together in one column and would like to have them separated into three columns for ease of sorting. Is there a simple way to do this?

oracleguy
08-11-2010, 06:40 PM
Is there any delimiter between the data like a comma? Show us an example piece of data.

happyinLA
08-11-2010, 06:43 PM
Is there any delimiter between the data like a comma? Show us an example piece of data.

No, there are no delimiters. Just spaces. Example:

BEVERLY HILLS CA 90210

Old Pedant
08-11-2010, 07:22 PM
And you want to do this all in MySQL? No help from server side scripting (i.e., PHP or ASP or...)???

Can we COUNT on there ALWAYS being ONLY a 5 digit zip code and ONLY a two character state abbreviation?

And I assume this data is in some existing table? Do you want to insert the separated pieces into another table or just SELECT them into a set of results?

happyinLA
08-11-2010, 07:47 PM
And you want to do this all in MySQL? No help from server side scripting (i.e., PHP or ASP or...)???

Can we COUNT on there ALWAYS being ONLY a 5 digit zip code and ONLY a two character state abbreviation?

And I assume this data is in some existing table? Do you want to insert the separated pieces into another table or just SELECT them into a set of results?

Yes, the data is in an existing table, and I'd like to end up with separate columns for name|address|city|state|zip. Can I insert the new city, state, zip columns into the existing table? If so, then I could just select the columns I want and output the result into a new table.

And no, there's not always a 5 digit zip code. Actually, most of the time it's a nine digit one that looks like: 902100000 (no hyphen). The two character state abbreviation is pretty consistent though.

Old Pedant
08-11-2010, 09:00 PM
"most of the time" and "pretty consistent" make it tough.

SQL isn't the best language around for doing this kind of stuff, even with the many extensions that MySQL adds.

What about if the code could handle 95% or more of the cases and for the problem cases you'd have to fix them by hand??

I'll assume that the current form is in a field named "csz" and that the table already contains the new fields named "city", "state", and "zip".


-- clean up data a little, first
-- get rid of spaces on ends of csz
UPDATE table SET csz = TRIM(csz);
-- and hopefully zap all multiple spaces:
UPDATE table SET csz = REPLACE( csz, ' ', ' ' );
UPDATE table SET csz = REPLACE( csz, ' ', ' ' );
UPDATE table SET csz = REPLACE( csz, ' ', ' ' );
UPDATE table SET csz = REPLACE( csz, ' ', ' ' );

-- handle 5 and 9 digit zip codes:
UPDATE table SET zip = RIGHT( csz, 5 ) WHERE csz RLIKE ' [0-9]{5}$';
UPDATE table SET zip = RIGHT( csz, 9 ) WHERE csz RLIKE ' [0-9]{9}$';

-- find the 2 character state:
UPDATE table SET state = LEFT( RIGHT( csz, 8 ), 2 ) WHERE csz RLIKE '[A-Za-z]{2} [0-9]{5}$';
UPDATE table SET state = LEFT( RIGHT( csz, 12 ), 2 ) WHERE csz RLIKE '[A-Za-z]{2} [0-9]{9}$';

-- and finally the city:
UPDATE table SET city = LEFT( csz, LENGTH(csz)-9 ) WHERE WHERE csz RLIKE '[A-Za-z]{2} [0-9]{5}$';
UPDATE table SET city = LEFT( csz, LENGTH(csz)-13 ) WHERE WHERE csz RLIKE '[A-Za-z]{2} [0-9]{9}$';

Utterly untested, but feels right.

oracleguy
08-11-2010, 10:36 PM
Well if you can use a scripting language, you could use regular expressions to get that last 5%.

happyinLA
08-11-2010, 11:41 PM
"most of the time" and "pretty consistent" make it tough.

SQL isn't the best language around for doing this kind of stuff, even with the many extensions that MySQL adds.

What about if the code could handle 95% or more of the cases and for the problem cases you'd have to fix them by hand??

I'll assume that the current form is in a field named "csz" and that the table already contains the new fields named "city", "state", and "zip".


-- clean up data a little, first
-- get rid of spaces on ends of csz
UPDATE table SET csz = TRIM(csz);
-- and hopefully zap all multiple spaces:
UPDATE table SET csz = REPLACE( csz, ' ', ' ' );
UPDATE table SET csz = REPLACE( csz, ' ', ' ' );
UPDATE table SET csz = REPLACE( csz, ' ', ' ' );
UPDATE table SET csz = REPLACE( csz, ' ', ' ' );

-- handle 5 and 9 digit zip codes:
UPDATE table SET zip = RIGHT( csz, 5 ) WHERE csz RLIKE ' [0-9]{5}$';
UPDATE table SET zip = RIGHT( csz, 9 ) WHERE csz RLIKE ' [0-9]{9}$';

-- find the 2 character state:
UPDATE table SET state = LEFT( RIGHT( csz, 8 ), 2 ) WHERE csz RLIKE '[A-Za-z]{2} [0-9]{5}$';
UPDATE table SET state = LEFT( RIGHT( csz, 12 ), 2 ) WHERE csz RLIKE '[A-Za-z]{2} [0-9]{9}$';

-- and finally the city:
UPDATE table SET city = LEFT( csz, LENGTH(csz)-9 ) WHERE WHERE csz RLIKE '[A-Za-z]{2} [0-9]{5}$';
UPDATE table SET city = LEFT( csz, LENGTH(csz)-13 ) WHERE WHERE csz RLIKE '[A-Za-z]{2} [0-9]{9}$';

Utterly untested, but feels right.

I added the city, state, and zip columns and substituted my table name for "table" and my column name for "csz," but when I ran the query, I got an error message:

Msg 195, Level 15, State 10, Line 1
'TRIM' is not a recognized built-in function name.
Msg 4145, Level 15, State 1, Line 6
An expression of non-boolean type specified in a context where a condition is expected, near 'RLIKE'.
Msg 4145, Level 15, State 1, Line 7
An expression of non-boolean type specified in a context where a condition is expected, near 'RLIKE'.
Msg 4145, Level 15, State 1, Line 8
An expression of non-boolean type specified in a context where a condition is expected, near 'RLIKE'.
Msg 4145, Level 15, State 1, Line 9
An expression of non-boolean type specified in a context where a condition is expected, near 'RLIKE'.
Msg 195, Level 15, State 10, Line 10
'LENGTH' is not a recognized built-in function name.
Msg 195, Level 15, State 10, Line 11
'LENGTH' is not a recognized built-in function name.

Wonder if I'm doing something wrong....

Oh, and I took out the comments as well.

Old Pedant
08-12-2010, 01:12 AM
*SIGH* This is the MYSQL forum.

You are apparently using SQL Server.

No, SQL Server does not have TRIM() or LENGTH() or RLIKE.

By comparison to MySQL's dialect, SQL Server is brain-dead.

There's nothing really equivalent to RLIKE in SQL Server, so I don't think there's any easy way to do this there.

If you are using SQL Server 2005 or newer, you could write a stored procedure in VB.NET or C# that would do the job. For that matter, you could probably write an ordinary stored procedure that would do it. Just would be long and complex compared to what MySQL is capable of.

If you gave this to me as a job to do, I'd probably just hack something together in VBScript or JScript and then execute it as an ASP page or even via WindowsScriptHost. Much easier than trying to get it to work with SQL Server alone. Plus I could probably easily handle the non-conforming cases.

happyinLA
08-12-2010, 01:40 AM
*SIGH* This is the MYSQL forum.

You are apparently using SQL Server.

No, SQL Server does not have TRIM() or LENGTH() or RLIKE.

By comparison to MySQL's dialect, SQL Server is brain-dead.

There's nothing really equivalent to RLIKE in SQL Server, so I don't think there's any easy way to do this there.

If you are using SQL Server 2005 or newer, you could write a stored procedure in VB.NET or C# that would do the job. For that matter, you could probably write an ordinary stored procedure that would do it. Just would be long and complex compared to what MySQL is capable of.

If you gave this to me as a job to do, I'd probably just hack something together in VBScript or JScript and then execute it as an ASP page or even via WindowsScriptHost. Much easier than trying to get it to work with SQL Server alone. Plus I could probably easily handle the non-conforming cases.

Oh. :o Well, I'm back to the drawing board then. :) I think I'll look into MySQL. Thanks for pointing me in that direction.

Old Pedant
08-12-2010, 01:54 AM
This REALLY would NOT be hard to do using, say, VBScript.

You could just create a ".vbs" file and then just execute it from the DOS window command line.

Or you could use JScript (MS's version of JavaScript) the same way.

You have any experience with scripting?

Old Pedant
08-12-2010, 02:17 AM
Set spacezap = New RegExp
spacezap.Pattern = "\s+"
spacezap.Global = True

Set conn = CreateObject("ADODB.Connection")
connStr = "... you will need a connection string here ..."
conn.Open connStr

Set RS = CreateObject("ADODB.Recordset")
RS.Open "yourtablename", conn, 3, 3
Do Until RS.EOF
' remove duplicate spaces and spaces on ends:
csz = spacezap.Replace( Trim(RS("csz")), " " )
temp = Split( csz, " " )
maxt = UBound(temp)
zip = temp( maxt )
If Len(zip) >= 5 AND IsNumeric(zip) Then
state = temp( maxt - 1 )
temp(maxt) = ""
temp(maxt-1) = ""
city = Trim( Join( temp, " " ) )
RS("city") = city
RS("state") = state
RS("zip") = zip
RS.Update
End If
RS.MoveNext
Loop
RS.close
conn.Close

There...I think that would do it. Figure out the connection string you need, replace table name and field names with your own. Name it "fixTable.vbs" and then just open up a DOS window, CD to the correct directory, and type in "fixTable.vbs". Presto.

**UTTERLY UNTESTED**
Use at your own risk.