  1. #1
    New Coder
    Join Date
    Mar 2009
    Posts
    68
    Thanks
    0
    Thanked 0 Times in 0 Posts

    How can I parse this text file and save the results in a database?

    I want to take this text file:
    http://phpir.com/user/files/text/lexicon.txt

    and in each row I want a value for the word, and the identifier (the 2 or 3 letters beside that word).

    How can I separate the two, though, if there's no delimiter? Is there a way to treat the newline as the delimiter instead?

  • #2
    Master Coder
    Join Date
    Jun 2003
    Location
    Cottage Grove, Minnesota
    Posts
    9,500
    Thanks
    8
    Thanked 1,089 Times in 1,080 Posts
    Is this a one-time operation, or will you be doing this over and over again in the future?

    And is the junk stuff at the beginning of the file needed, or do you only want the lines that are letters?

  • #3
    New Coder
    Join Date
    Mar 2009
    Posts
    68
    Thanks
    0
    Thanked 0 Times in 0 Posts
    one time, and junk not needed.

    I also noticed some words have two identifiers ex:
    word nn nnp

    I guess I could store a second in another col?

  • #4
    Master Coder
    Join Date
    Jun 2003
    Location
    Cottage Grove, Minnesota
    Posts
    9,500
    Thanks
    8
    Thanked 1,089 Times in 1,080 Posts
    1) I would use Notepad++ to edit the text file and clean it up.
    Example,
    change all / characters to spaces
    change all , to spaces
    change all | to spaces
    change all \ to spaces
    etc.

    2) Get it cleaned up so that you only have the special characters you allow, like & _ . -

    3) So now you'll have a text file where each line has only the characters you want.
    Some lines will have multiple spaces in between some words, but that's OK.

    4) Now you'll use PHP to clean up the extra spaces.

    PHP Code:
    <?php
    // Put your file into an array ... each line is one array element.
    $data=file("lexicon.txt");

    // Connect to your MySQL database here.

    // Loop through the array of lines, clean up extra spaces, and insert into table ...
    foreach($data as $line){
        // Collapse runs of whitespace into one space and trim the trailing newline
        // (file() keeps the line ending on each element).
        $line = trim(preg_replace('/\s+/', ' ', $line));
        // Skip lines that are empty after cleanup.
        if ($line === '') continue;

        // Explode the line into separate elements
        $parts = explode(" ", $line);

        // Now you have some parts to write into your database table
        // Example: Average NNP JJ NN
        // $parts[0] = Average
        // $parts[1] = NNP
        // $parts[2] = JJ
        // $parts[3] = NN

        // insert $parts[0] for sure, and then the other columns if they exist.
    }

    ?>
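
The insert step left as a comment in the loop could look roughly like the sketch below. It is only an illustration: the `lexicon` table, the `word`/`tag1`/`tag2`/`tag3` column names, and the limit of three identifiers per word are all assumptions, and it uses an in-memory SQLite database through PDO so it can run on its own. For MySQL you would swap in a MySQL DSN and your own credentials.

```php
<?php
// Sketch only: table name, column names, and the three-tag limit are assumptions.
// An in-memory SQLite database keeps the example self-contained; for MySQL the
// DSN would look like "mysql:host=localhost;dbname=yourdb" plus user and password.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE lexicon (word TEXT NOT NULL, tag1 TEXT, tag2 TEXT, tag3 TEXT)');

// One prepared statement, reused for every row.
$stmt = $pdo->prepare('INSERT INTO lexicon (word, tag1, tag2, tag3) VALUES (?, ?, ?, ?)');

// Stand-in for the $parts arrays produced inside the foreach loop.
$rows = [
    ['Average', 'NNP', 'JJ', 'NN'],
    ['word', 'nn', 'nnp'],
];

foreach ($rows as $parts) {
    // Keep at most four values, then pad any missing identifiers with NULL.
    $values = array_pad(array_slice($parts, 0, 4), 4, null);
    $stmt->execute($values);
}

$count = (int) $pdo->query('SELECT COUNT(*) FROM lexicon')->fetchColumn();
$thirdTag = $pdo->query("SELECT tag3 FROM lexicon WHERE word = 'word'")->fetchColumn();
```

Padding with NULL means a word with one identifier and a word with three both fit the same statement, which is one way to handle the "second identifier in another column" case.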

    That's sort of what I would do. In my opinion, Notepad++ is the best text editor for editing text files, PHP, HTML, and CSS.
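
If you'd rather not do the search-and-replace by hand, the character cleanup from step 1 could also be done in PHP. This is a minimal sketch, not a tested solution for the real file: `parse_lexicon_line` is a made-up helper name, and the rule that a usable line starts with a letter is an assumption based on the "junk not needed" reply.

```php
<?php
// Hypothetical helper: turns one raw line into [word, identifier, ...],
// or returns null for lines that should be skipped.
function parse_lexicon_line(string $line): ?array
{
    // Step 1: turn the separator characters ( / , | \ ) into spaces.
    $line = str_replace(['/', ',', '|', '\\'], ' ', $line);
    // Step 4: collapse runs of whitespace and trim the trailing newline.
    $line = trim(preg_replace('/\s+/', ' ', $line));
    if ($line === '') {
        return null;
    }
    $parts = explode(' ', $line);
    // Assumption: a usable line starts with a letter; anything else is junk.
    if (!preg_match('/^[A-Za-z]/', $parts[0])) {
        return null;
    }
    return $parts;
}

// Made-up sample lines in the same shape file("lexicon.txt") would return.
$sample = ["Average NNP  JJ NN\n", "word nn|nnp\n", "!!junk ,\n", "\n"];
// array_filter() drops the null results for skipped lines.
$parsed = array_values(array_filter(array_map('parse_lexicon_line', $sample)));
```

Each element of `$parsed` has the same shape as `$parts` in the loop above, so it can go straight to the insert step.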

