I raised a thread in the SQL forum about this and was told that it has to be done in PHP; there is no simple way to do it.
I have done a fair bit of work getting the tags out and the commas in, and I am now almost there. I have one final problem, however: I am getting an extra line between each entry.
I can see that you have put some work into this, and that it's something you just have to get done any way you can, so I'll provide you with a solution a bit later on.
Allow me to ramble for a bit though: HTML has an inherent structure that allows it to be quite easily parsed. It happens all the time. Your browser does that. And there are quite a few PHP classes that do just that – parse HTML, so you can easily extract any data buried in whatever mess of tags you have to work with. Possibly someone is linking you to one of those classes right now while I'm typing this (probably not though, since this is not the type of question people tend to jump on).
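To make the parsing route a bit more concrete, here is a minimal sketch using PHP's bundled DOM extension (DOMDocument). The helper name tableRowsToArrays and the tiny sample table are made up for illustration, and the sketch assumes a single table with no nested tables inside it:

```php
<?php
// Hypothetical helper (not from the original thread): collect the text of
// every <td> cell, row by row, using PHP's bundled DOM extension.
function tableRowsToArrays(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $rows = [];
    foreach ($doc->getElementsByTagName('tr') as $tr) {
        $cells = [];
        foreach ($tr->getElementsByTagName('td') as $td) {
            // &nbsp; is decoded to U+00A0; treat it as a plain space, then trim.
            $cells[] = trim(str_replace("\u{00A0}", ' ', $td->textContent));
        }
        // The header row contains only <th> cells, so it ends up empty and is skipped.
        if ($cells !== []) {
            $rows[] = $cells;
        }
    }
    return $rows;
}

// Assumed sample input, much shorter than the real membership table.
$html = '<table><tr class="head"><th>Suburb</th><th>State</th></tr>'
      . '<tr class="data"><td>Bittern</td><td>&nbsp;</td></tr></table>';
$rows = tableRowsToArrays($html);
```

Each entry in $rows is then an array of cell strings, which can be joined with commas or handed to a CSV writer.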
That said, if you're feeling masochistic, you can do it by just replacing stuff. Here's how I would do it:
PHP Code:
// This is just your original input, with a few table rows added to show that it's doing what it's supposed to do
$content = <<<EOD
<div>
<table class="noborder" cellspacing="2" cellpadding="6" border="0" id="ctl00_cphMain_gvMembership">
<tr class="head">
<th scope="col">MemberNumberMTBA</th><th scope="col">MemberNumberIMBA</th><th scope="col">First Name</th><th scope="col">Surname</th><th scope="col">Date Of Birth</th><th scope="col">Gender</th><th scope="col">Address Line 1</th><th scope="col">Address Line 2</th><th scope="col">Suburb</th><th scope="col">State</th><th scope="col">Postcode</th><th scope="col">Citizenship</th><th scope="col">Telephone</th><th scope="col">Mobile</th><th scope="col">Email</th><th scope="col">Membership Type</th><th scope="col">Membership Start Date</th><th scope="col">Membership Stop Date</th><th scope="col">Processing Date</th>
</tr><tr class="data">
<td>44181</td><td> </td><td>editedout</td><td>editedout</td><td>2editedout</td><td>Male</td><td>editedout</td><td> </td><td>Bittern</td><td>VIC</td><td>3918</td><td>Australia</td><td>editedout</td><td> </td><td>editedout</td><td>Junior Membership</td><td>15/12/2010 12:00:00 AM</td><td>15/12/2011 12:00:00 AM</td><td>15/12/2010 11:04:43 PM</td>
</tr><tr class="data">
<td>44181</td><td> </td><td>editedout</td><td>editedout</td><td>2editedout</td><td>Male</td><td>editedout</td><td> </td><td>Bittern</td><td>VIC</td><td>3918</td><td>Australia</td><td>editedout</td><td> </td><td>editedout</td><td>Junior Membership</td><td>15/12/2010 12:00:00 AM</td><td>15/12/2011 12:00:00 AM</td><td>15/12/2010 11:04:43 PM</td>
</tr><tr class="data">
<td>44181</td><td> </td><td>editedout</td><td>editedout</td><td>2editedout</td><td>Male</td><td>editedout</td><td> </td><td>Bittern</td><td>VIC</td><td>3918</td><td>Australia</td><td>editedout</td><td> </td><td>editedout</td><td>Junior Membership</td><td>15/12/2010 12:00:00 AM</td><td>15/12/2011 12:00:00 AM</td><td>15/12/2010 11:04:43 PM</td>
</tr>
</table>
</div>
EOD;
// Remove the whole table head
$content = preg_replace('#<tr class="head">.*?</tr>#s', '', $content);
// Remove the HTML-encoded whitespace
$content = preg_replace('#&nbsp;#', '', $content);
// Remove the whitespace
$content = preg_replace('#\s#', '', $content);
// Replace the end of each table row with a line feed (replace the last </td> as well, so there won't be any superfluous commas after the last cell in a row)
$content = preg_replace('#</td>\s*</tr>#', "\r\n", $content);
// Replace the end of each cell with a comma
$content = preg_replace('#</td>#', ',', $content);
// Remove any tags that are still in there
$content = strip_tags($content);
// Print out CSV
echo $content;
// Rejoice
You're completely right, of course, I'm sorry. That whitespace within data fields has to stay in there. I altered the code a bit so it does:
PHP Code:
// Remove the whole table head
$content = preg_replace('#<tr class="head">.*?</tr>#s', '', $content);
// Remove the HTML-encoded whitespace
$content = preg_replace('#&nbsp;#', '', $content);
// Conserve whitespace within data cells (anonymous function, since create_function() was removed in PHP 8)
$content = preg_replace_callback('#<td>.*?</td>#', function ($data) {
    return preg_replace('#\s#', '@@CONSERVED_WHITESPACE@@', $data[0]);
}, $content);
// Remove unconserved whitespace
$content = preg_replace('#\s+#', '', $content);
// Rebuild conserved whitespace
$content = preg_replace('#@@CONSERVED_WHITESPACE@@#', ' ', $content);
// Replace the end of each table row with a line feed (replace the last </td> as well, so there won't be any superfluous commas after the last cell in a row)
$content = preg_replace('#</td>\s*</tr>#', "\r\n", $content);
// Replace the end of each cell with a comma
$content = preg_replace('#</td>#', ',', $content);
// Remove any tags that are still in there
$content = strip_tags($content);
// Print out CSV
echo $content;
// Rejoice
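One caveat with turning every </td> into a comma: a cell that itself contains a comma (an address line, say) will split into two columns. If that can happen in your data, PHP's built-in fputcsv() takes care of the quoting. A minimal sketch, assuming the rows are already available as arrays of strings; the helper name rowsToCsv and the sample address are made up for illustration:

```php
<?php
// Hypothetical helper (not from the original thread): serialise rows,
// given as arrays of strings, into one CSV string.
function rowsToCsv(array $rows): string
{
    // Write to an in-memory stream so the result can be returned as a string.
    $stream = fopen('php://memory', 'r+');
    foreach ($rows as $row) {
        // Delimiter, enclosure and escape are passed explicitly;
        // the empty escape string needs PHP 7.4 or later.
        fputcsv($stream, $row, ',', '"', '');
    }
    rewind($stream);
    $csv = stream_get_contents($stream);
    fclose($stream);
    return $csv;
}

// Made-up sample row with a comma inside the address field.
$csv = rowsToCsv([['44181', 'Unit 2, 5 Example St', 'Bittern']]);
echo $csv;
```

The address comes out wrapped in double quotes, so a spreadsheet or CSV importer reads it as a single field.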