Hello and welcome to our community! Is this your first visit?
Register
Enjoy an ad free experience by logging in. Not a member yet? Register.
Results 1 to 9 of 9
  1. #1
    New to the CF scene
    Join Date
    Nov 2011
    Posts
    5
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Complete Reorganization of Column/Row data

    I inherited a really disorganized database and I'd like to clean it up but am not sure what the best way would be. I am using SQL Maestro for MySQL. The data in each column/row is preceded with a prefix (N01, N02, N02, etc), but the data within the rows is not in order. I've created columns for each prefix, and am looking for the most efficient way to reorganize the data into the correct columns now without having to do 64x64 update statements.

    Here is what I have to work with:

    Original Table: Universal
    UPC col1, col2, col3, N01, N02, N03, N04...N64
    1234567890 N01, N02, N03
    2345678901 N33, N04, N05
    3456789012 N01, N03, N04

    I'd like to make it look like:

    UPC col1, col2, col3, N01, N02, N03, N04, N05...N33...N64
    1234567890 N01, N02, N03
    2345678901 N04, N05....N33...N64
    3456789012 N01, N03, N04

  • #2
    Super Moderator guelphdad's Avatar
    Join Date
    Mar 2006
    Location
    St. Catharines, Ontario Canada
    Posts
    2,633
    Thanks
    4
    Thanked 148 Times in 139 Posts
    Um, no you wouldn't.
    Look up database normalization so you understand the concept before going forward.

    you shouldn't have to worry about putting the data in the table in order, you should be doing that in a query.

    any time you find out you have a sequence of columns like you do N01... N02... N64 then your data is most likely not normalized.

  • #3
    New to the CF scene
    Join Date
    Nov 2011
    Posts
    5
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I am somewhat familiar with the concepts of normalization, but I don't think that is what I need here. Even if it IS what I wanted, it would not be possible until I get all this data into organized columns.

    N01, N02, etc are just the prefixes needed with our end application. Each field will have unique data after the prefix.

    Ideally, after getting the data into the correct columns I could eliminate the prefixes from the data itself and have them concatenated onto the data during the export process only.

    BTW I'm working with 1 million plus rows here, so re-entering data is not an option. I mean....I'd rather do the 64X64 queries. LOL

  • #4
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    25,111
    Thanks
    75
    Thanked 4,333 Times in 4,299 Posts
    Sorry, but I'm with GuelphDad on this.

    But I can't make sense out of your first post.

    Your two examples are essentially identical. They both have all those columns labelled N01 through N64.

    And neither one seems to have any data in the Col1, Col2, Col3 fields.

    Almost surely, the best organization would be something like this:
    Code:
    UPC        | tag | data  
    1234567890 | N01 | whatever data goes with N01
    1234567890 | N02 | whatever data goes with N02
    ...
    1234567890 | N37 | whatever data goes with N37
    ...
    1234567890 | N64 | whatever data goes with N64
    But unless we see the real table design and data, it's hard to know for sure.
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #5
    New to the CF scene
    Join Date
    Nov 2011
    Posts
    5
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Sorry for the improper example. Hope this gives a better view of the deplorable state of this data:

    UPC | A | B | C
    1234567890 | N01 Green Beans | N03 12 oz. | N02 French Style
    4564567897 | N01 Blue shirt | N02 XXL | N32 Hanes
    5645613232 | N01 Frozen Dinner | N03 16 oz. | N21 Micro 6 minutes
    7878454566 | N34 Famous Movie | N57 PG-13 | N55 synopsis

    but would like it to be as such:

    UPC | N01 | N02 | N03 | N32 | N34
    1234567890 | Green Beans | French Style |12 oz. | |
    4564567897 | Blue shirt | XXL | | Hanes |
    5645613232 | Frozen Dinner | |16 oz. | |
    7878454566 | | | | | Famous Movie

  • #6
    Super Moderator guelphdad's Avatar
    Join Date
    Mar 2006
    Location
    St. Catharines, Ontario Canada
    Posts
    2,633
    Thanks
    4
    Thanked 148 Times in 139 Posts
    And what we are saying is you don't want that, it will be much too difficult to effectively query your table beyond a trivial SELECT * from the table.

  • #7
    New to the CF scene
    Join Date
    Nov 2011
    Posts
    5
    Thanks
    0
    Thanked 0 Times in 0 Posts
    On the contrary, I KNOW this is what I want. N01 is the shortcode for Product, N02 for product description, N03 size, N33 Artist, etc. They really should have been the column headers from the get go, but the person before me underestimated how large the database would get, and the need to actually organize it properly. Now I'm left with a big problem.

    Right now it is impossible to find stuff. Some Instructions (N21) are in column F and some in column P, some have product size (N03) in column B others have product description (N02) in column B. Basically, previously they filled in the data across the rows without regard to column organization at all.

    Perhaps there is no easy way to clean it up.

  • #8
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    25,111
    Thanks
    75
    Thanked 4,333 Times in 4,299 Posts
    Sorry, but we *STILL* have to disagree with you.

    There *ARE* reasons that people invented NORMALIZATION. People much more expert than GuelphDad or I. And both GuelphDat and I have surely been "bitten" way too many times with bad DB designs that we had to clean up.

    If you organize the data the way you are suggesting, many many queries will be more difficult and/or slower to perform.

    Anyway, whether you do it your way or listen to us, what you want is not particularly hard. Just time consuming.

    You could either do this by writing a dirt simple PHP/ASP/JSP/whatever program or you could do it using a MySQL stored procedure.

    You *would* need to do it by copying the existing data into a new table (or, if you listen to us, pair of tables). But the code is pretty trivial.

    I don't code in PHP, though I probably could figure out how to do this one, but I'm not going to try. I'll write it in ASP code and you can translate.

    The code assumes that you have already created the new table. It assumes that UPC is in the first column (column 0) and that all the other columns contain your "Nxx" stuff.

    Code:
    <%
    Set conn = Server.CreateObject("ADODB.Connection")
    conn.Open "...connection string to your DB..."
    
    Set RS = conn.Execute("SELECT * FROM oldtable")
    Do Until RS.EOF
         SQL = "INSERT INTO newtable (UPC) VALUES('" & RS("UPC") & "')"
         conn.Execute SQL
         For fldnum = 1 To RS.Fields.Count
            data = RS(fldnum)
            ns = Left(data,3) ' will be N01, etc... but let's make sure:
            If Left(ns,1) = "N" AND IsNumeric(Mid(ns,2)) Then
                SQL = "UPDATE newtable (" & ns & ") VALUES('" & Trim(Mid(data,4)) & "')"
                conn.Execute SQL
            End IF
        Next
        RS.MoveNext
    Loop
    %>
    That's the simplest coding. With a little more work, we could make it much more efficient. But as I doubt you would be using ASP, I won't bother writing it.

    It would probably be most efficient to translate this into a MySQL Stored Procedure.

    *************

    If you follow our advice, you would instead do something like this:
    Code:
    <%
    Set conn = Server.CreateObject("ADODB.Connection")
    conn.Open "...connection string to your DB..."
    
    Set RS = conn.Execute("SELECT * FROM oldtable")
    Do Until RS.EOF
         SQL = "INSERT INTO newtable (UPC) VALUES('" & RS("UPC") & "')"
         conn.Execute SQL
         For fldnum = 1 To RS.Fields.Count
            data = RS(fldnum)
            ns = Left(data,3) ' will be N01, etc... but let's make sure:
            If Left(ns,1) = "N" AND IsNumeric(Mid(ns,2)) Then
                SQL = "INSERT INTO nsTable (UPC,NUM,VALUE) " _
                      & " VALUES('" & RS("UPC") & "'," & Mid(ns,2) & ",'" & Trim(Mid(data,4)) & "')"
                conn.Execute SQL
            End IF
        Next
        RS.MoveNext
    Loop
    %>
    That assumes your nsTable looks like this:
    Code:
    CREATE nsTable (
        UPC varchar(50) REFERENCES newtable(UPC),
        NUM int,
        VALUE varchar(500),
        PRIMARY KEY (upc,num)
        );
    Using a composite primary key is a good idea, but if you do a lot of searching by NUM then you might be better off with a separate key on that column (non-unique, of course).
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #9
    New to the CF scene
    Join Date
    Nov 2011
    Posts
    5
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Thanks for the advice. I'll have to address this with my boss.


  •  

    Posting Permissions

    • You may not post new threads
    • You may not post replies
    • You may not post attachments
    • You may not edit your posts
    •