Copy HTML source and/or HTML to XML


BubikolRamios
07-25-2007, 09:13 PM
I would like to easily read this data:
http://sunearth.gsfc.nasa.gov/eclipse/phase/phases2001.html

So: does anyone have Java code that would
1. copy the page source to a file
2. convert the HTML to XML? Looking at the source of this page, though, that wouldn't solve my problem on its own; some parsing would have to be done anyway.

Thanks for help.

BubikolRamios
07-26-2007, 01:51 AM
For all those interested, I googled it and found:

http://schmidt.devlib.org/java/file-download.html#source
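
For the first part, here is a minimal sketch that uses only the standard library (Java 7 or later), not the code from the page linked above. The class name PageSaver and the method name save are made up for illustration, and it assumes the page should be stored byte for byte with no character-set conversion.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class, named for illustration only.
public class PageSaver {

    // Copies whatever the URL serves into the target file, unchanged,
    // overwriting the file if it already exists. Returns the byte count.
    static long save(URL source, Path target) throws IOException {
        try (InputStream in = source.openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java PageSaver <url> <output-file>");
            return;
        }
        URL page = URI.create(args[0]).toURL();
        long bytes = save(page, Paths.get(args[1]));
        System.out.println("Saved " + bytes + " bytes to " + args[1]);
    }
}
```

It could be run as: java PageSaver http://sunearth.gsfc.nasa.gov/eclipse/phase/phases2001.html phases2001.html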

Any thoughts or explanations about the second part of the question?

ess
07-26-2007, 02:37 PM
How about using an HTML parser?

Check the following open source package:
http://htmlparser.sourceforge.net/
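
To give a rough idea of what an HTML parser does for you, here is a small sketch. It does not use the package linked above (its API is not shown here); it uses the lenient HTML parser that ships with the JDK in javax.swing.text.html instead. The class name TextExtractor and the method extractText are made up for illustration. It only collects the text between tags, so splitting the resulting strings into dates and phases would still be up to you.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class, named for illustration only.
public class TextExtractor {

    // Feeds the HTML to the JDK's built-in parser and returns every
    // non-blank run of text it reports, trimmed, in document order.
    static List<String> extractText(Reader html) throws IOException {
        final List<String> runs = new ArrayList<>();
        HTMLEditorKit.ParserCallback collector = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                String text = new String(data).trim();
                if (!text.isEmpty()) {
                    runs.add(text);
                }
            }
        };
        // true = ignore any charset declared inside the document itself
        new ParserDelegator().parse(html, collector, true);
        return runs;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample input, not the real NASA page.
        String sample = "<html><body><p>New Moon</p><p>Full Moon</p></body></html>";
        for (String run : extractText(new StringReader(sample))) {
            System.out.println(run);
        }
    }
}
```

To run it on a saved page, pass a FileReader for that file instead of the StringReader.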

a2z
01-26-2008, 09:27 PM
The htmLawed PHP script can be used to convert HTML body content to XML. http://www.bioinformatics.org/phplabware/internal_utilities/htmLawed/index.php

zett
02-02-2009, 04:16 PM
If you are interested in converting:

html --> pdf
doc --> pdf, html, txt, rtf
xls --> pdf, html, csv
ppt --> pdf, swf

then this can help: http://www.dancrintea.ro/doc-to-pdf/