  1. #1
    New Coder
    Join Date
    May 2009
    Posts
    29
    Thanks
    11
    Thanked 0 Times in 0 Posts

    Converting table in aspx file to a csv file

    I raised a thread in the SQL forum about this and was told that it has to be done via PHP; there is no simple way to do it.

    I have done a fair bit of work getting the tags out and the commas in, and I am almost there. I have one final problem, however: I am getting an extra line between each entry.

    PHP Code:
    <?php
    // Read the whole aspx file into a string
    $fd = fopen("ClubReportMembership.aspx", "r");
    $content = fread($fd, filesize("ClubReportMembership.aspx"));
    fclose($fd);
    // Strip the wrapper, header, and row tags
    $search = array('<div>', '<table class="noborder" cellspacing="2" cellpadding="6" border="0" id="ctl00_cphMain_gvMembership">', '<tr class="head">', '<th scope="col">MemberNumberMTBA</th><th scope="col">MemberNumberIMBA</th><th scope="col">First Name</th><th scope="col">Surname</th><th scope="col">Date Of Birth</th><th scope="col">Gender</th><th scope="col">Address Line 1</th><th scope="col">Address Line 2</th><th scope="col">Suburb</th><th scope="col">State</th><th scope="col">Postcode</th><th scope="col">Citizenship</th><th scope="col">Telephone</th><th scope="col">Mobile</th><th scope="col">Email</th><th scope="col">Membership Type</th><th scope="col">Membership Start Date</th><th scope="col">Membership Stop Date</th><th scope="col">Processing Date</th>', '</tr><tr class="data">', '</tr>', '</table>', '</div>', '&nbsp;');
    $replace = "";
    $modified = str_replace($search, $replace, $content);
    // Turn each cell boundary into a comma
    $search = array('</td><td>');
    $replace = ",";
    $modified = str_replace($search, $replace, $modified);
    $modified = strip_tags($modified);
    $modified = trim($modified);
    // Write the result out as CSV
    $f = fopen("test3.csv", "w");
    fwrite($f, $modified);
    fclose($f);
    ?>
    The aspx file looks like this:
    Code:
    <div>
    	<table class="noborder" cellspacing="2" cellpadding="6" border="0" id="ctl00_cphMain_gvMembership">
    		<tr class="head">
    			<th scope="col">MemberNumberMTBA</th><th scope="col">MemberNumberIMBA</th><th scope="col">First Name</th><th scope="col">Surname</th><th scope="col">Date Of Birth</th><th scope="col">Gender</th><th scope="col">Address Line 1</th><th scope="col">Address Line 2</th><th scope="col">Suburb</th><th scope="col">State</th><th scope="col">Postcode</th><th scope="col">Citizenship</th><th scope="col">Telephone</th><th scope="col">Mobile</th><th scope="col">Email</th><th scope="col">Membership Type</th><th scope="col">Membership Start Date</th><th scope="col">Membership Stop Date</th><th scope="col">Processing Date</th>
    		</tr><tr class="data">
    			<td>44181</td><td>&nbsp;</td><td>editedout</td><td>editedout</td><td>2editedout</td><td>Male</td><td>editedout</td><td>&nbsp;</td><td>Bittern</td><td>VIC</td><td>3918</td><td>Australia</td><td>editedout</td><td>&nbsp;</td><td>editedout</td><td>Junior Membership</td><td>15/12/2010 12:00:00 AM</td><td>15/12/2011 12:00:00 AM</td><td>15/12/2010 11:04:43 PM</td>
    		</tr><tr class="altData">
    ....Continues
    I'm so close to getting this section done! Any help would be immensely appreciated.
    Last edited by ryantakers; 01-15-2011 at 09:06 PM.

  • #2
    Gütkodierer
    Join Date
    Apr 2009
    Posts
    2,127
    Thanks
    1
    Thanked 426 Times in 424 Posts
    I can see that you have put some work into this and that it's something you just have to get done any way you can, so I'll provide you with a solution a bit later on.

    Allow me to ramble for a bit though: HTML has an inherent structure that allows it to be quite easily parsed. It happens all the time. Your browser does that. And there are quite a few PHP classes that do just that – parse HTML, so you can easily extract any data buried in whatever mess of tags you have to work with. Possibly someone is linking you to one of those classes right now while I'm typing this (probably not though, since this is not the type of question people tend to jump on).
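
    As a rough illustration of the parsing route, here is a minimal sketch that uses PHP's bundled DOM extension (DOMDocument and DOMXPath) rather than any particular third-party class. The function name tableToCsv is made up for this example, and the sketch assumes no cell contains a comma; if one might, writing each row with fputcsv() would take care of the quoting.

```php
<?php
// Hypothetical helper (not from the thread): convert the data rows of an
// HTML table into CSV text by parsing the markup instead of replacing tags.
function tableToCsv(string $html): string
{
    // Collect parser warnings internally instead of printing them;
    // the input is only a fragment, not a full HTML document.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $csv = '';
    // Only rows that contain <td> cells are selected, so the <th> head row is skipped.
    foreach ($xpath->query('//tr[td]') as $row) {
        $fields = array();
        foreach ($xpath->query('td', $row) as $cell) {
            // &nbsp; is decoded to U+00A0 (bytes C2 A0 in UTF-8); drop it, then trim.
            $fields[] = trim(str_replace("\xC2\xA0", '', $cell->textContent));
        }
        // Assumption: no field contains a comma, so a plain implode is enough.
        $csv .= implode(',', $fields) . "\r\n";
    }
    return $csv;
}
```

    Because the cell text is read per cell, spaces inside a field such as an address survive without any special handling.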

    That said, if you're feeling masochistic, you can do it by just replacing stuff. Here's how I would do it:

    PHP Code:
    // This is just your original input, with a few table rows added to show that it's doing what it's supposed to do
    $content = <<<EOD
        <div>
            <table class="noborder" cellspacing="2" cellpadding="6" border="0" id="ctl00_cphMain_gvMembership">
                <tr class="head">
                    <th scope="col">MemberNumberMTBA</th><th scope="col">MemberNumberIMBA</th><th scope="col">First Name</th><th scope="col">Surname</th><th scope="col">Date Of Birth</th><th scope="col">Gender</th><th scope="col">Address Line 1</th><th scope="col">Address Line 2</th><th scope="col">Suburb</th><th scope="col">State</th><th scope="col">Postcode</th><th scope="col">Citizenship</th><th scope="col">Telephone</th><th scope="col">Mobile</th><th scope="col">Email</th><th scope="col">Membership Type</th><th scope="col">Membership Start Date</th><th scope="col">Membership Stop Date</th><th scope="col">Processing Date</th>
                </tr><tr class="data">
                    <td>44181</td><td>&nbsp;</td><td>editedout</td><td>editedout</td><td>2editedout</td><td>Male</td><td>editedout</td><td>&nbsp;</td><td>Bittern</td><td>VIC</td><td>3918</td><td>Australia</td><td>editedout</td><td>&nbsp;</td><td>editedout</td><td>Junior Membership</td><td>15/12/2010 12:00:00 AM</td><td>15/12/2011 12:00:00 AM</td><td>15/12/2010 11:04:43 PM</td>
                </tr><tr class="data">
                    <td>44181</td><td>&nbsp;</td><td>editedout</td><td>editedout</td><td>2editedout</td><td>Male</td><td>editedout</td><td>&nbsp;</td><td>Bittern</td><td>VIC</td><td>3918</td><td>Australia</td><td>editedout</td><td>&nbsp;</td><td>editedout</td><td>Junior Membership</td><td>15/12/2010 12:00:00 AM</td><td>15/12/2011 12:00:00 AM</td><td>15/12/2010 11:04:43 PM</td>
                </tr><tr class="data">
                    <td>44181</td><td>&nbsp;</td><td>editedout</td><td>editedout</td><td>2editedout</td><td>Male</td><td>editedout</td><td>&nbsp;</td><td>Bittern</td><td>VIC</td><td>3918</td><td>Australia</td><td>editedout</td><td>&nbsp;</td><td>editedout</td><td>Junior Membership</td><td>15/12/2010 12:00:00 AM</td><td>15/12/2011 12:00:00 AM</td><td>15/12/2010 11:04:43 PM</td>
                </tr>
            </table>
        </div>
    EOD;

    // Remove the whole table head
    $content = preg_replace('#<tr class="head">.*?</tr>#s', '', $content);
    // Remove the HTML-encoded whitespace
    $content = preg_replace('#&nbsp;#', '', $content);
    // Remove the whitespace
    $content = preg_replace('#\s#', '', $content);
    // Replace the end of each table row with a line feed (replace the last </td> as well, so there won't be any superfluous commas after the last cell in a row)
    $content = preg_replace('#</td>\s*</tr>#', "\r\n", $content);
    // Replace the end of each cell with a comma
    $content = preg_replace('#</td>#', ',', $content);
    // Remove any tags that are still in there
    $content = strip_tags($content);
    // Print out CSV
    echo $content;
    // Rejoice

  • Users who have thanked venegal for this post:

    ryantakers (01-15-2011)

  • #3
    New Coder
    Thanks very much Venegal.

    It works brilliantly, however it is stripping out too much! Allow me to explain the problem:

    Some of the fields have a space in them, for instance '24 myhouse road', and that space needs to remain. I notice that if I comment out the line:
    PHP Code:
    $content = preg_replace('#\s#', '', $content);
    then the spaces are back, but so is the extra line between entries.

    Any tips?

    Thanks again.

  • #4
    Gütkodierer
    You're completely right, of course, I'm sorry. That whitespace within data fields has to stay in there. I altered the code a bit so it does:

    PHP Code:
    // Remove the whole table head
    $content = preg_replace('#<tr class="head">.*?</tr>#s', '', $content);
    // Remove the HTML-encoded whitespace
    $content = preg_replace('#&nbsp;#', '', $content);
    // Conserve whitespace within data cells (a closure is used here because create_function() was removed in PHP 8.0)
    $content = preg_replace_callback('#<td>.*?</td>#', function ($data) { return preg_replace('#\s#', '@@CONSERVED_WHITESPACE@@', $data[0]); }, $content);
    // Remove unconserved whitespace
    $content = preg_replace('#\s+#', '', $content);
    // Rebuild conserved whitespace
    $content = preg_replace('#@@CONSERVED_WHITESPACE@@#', ' ', $content);
    // Replace the end of each table row with a line feed (replace the last </td> as well, so there won't be any superfluous commas after the last cell in a row)
    $content = preg_replace('#</td>\s*</tr>#', "\r\n", $content);
    // Replace the end of each cell with a comma
    $content = preg_replace('#</td>#', ',', $content);
    // Remove any tags that are still in there
    $content = strip_tags($content);
    // Print out CSV
    echo $content;
    // Rejoice
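
    A possible shorter variant of the same idea, offered only as a sketch: if the whitespace that has to go only ever sits between two tags (as it does in the sample input), it can be removed with a single pattern and no placeholder is needed. The function name rowsToCsv is hypothetical, and the final trim() also drops the line break after the last row.

```php
<?php
// Hypothetical helper: the same replace-based steps, but whitespace is only
// removed where it sits between two tags, so spaces inside cells are kept.
function rowsToCsv(string $content): string
{
    // Remove the whole table head
    $content = preg_replace('#<tr class="head">.*?</tr>#s', '', $content);
    // Remove the HTML-encoded whitespace
    $content = str_replace('&nbsp;', '', $content);
    // Remove whitespace between a closing ">" and the next "<" only
    $content = preg_replace('#>\s+<#', '><', $content);
    // End of a row becomes a line break (this also consumes the last </td>)
    $content = str_replace('</td></tr>', "\r\n", $content);
    // End of every other cell becomes a comma
    $content = str_replace('</td>', ',', $content);
    // Remove the remaining tags and any leading or trailing whitespace
    return trim(strip_tags($content));
}
```

    Called on input shaped like the sample table, this keeps '24 myhouse road' intact while the blank lines between entries disappear.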

  • #5
    New Coder
    Fantastic! Thanks for all your help!

