Load HTML-Document into DOM

03-08-2006, 01:11 PM

I'm trying to load an HTML document to read the <p> tags (for PDF export).

The link to index.html is embedded in an XML element. I don't know how to follow the link to get the HTML file, but as a first step I am trying to read the content of the HTML file.
With the following code, I get the content as a stream:

function Textstream() {
    var fso, f, ts, s;
    var ForReading = 1;
    var TristateUseDefault = -2;

    // Internet Explorer only: ActiveX access to the local file system
    fso = new ActiveXObject("Scripting.FileSystemObject");
    f = fso.GetFile("c:\\index.html");

    ts = f.OpenAsTextStream(ForReading, TristateUseDefault);
    s = ts.ReadAll();
    ts.Close();
    return s; // whole file content as one string
}
But I would like to load index.html into a DHTML DOM and read the tags with:


Thanks!

03-08-2006, 01:39 PM
The collection of all the <p> tags is referenced as document.getElementsByTagName('p').

To work with it you should loop over it:

var oPar = document.getElementsByTagName('p');
for (var i = 0; i < oPar.length; i++) {
    // ... do something with oPar[i] ...
}
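
As a rough sketch of what "do something" could look like for the PDF export, the loop can collect the text of each paragraph into an array. The function name collectParagraphText is made up for illustration, and the textContent/innerText fallback is an assumption about which property the browser offers (older Internet Explorer versions only had innerText):

```javascript
// Hypothetical helper (not from the thread): gathers the text of every <p>
// element in the given document object.
function collectParagraphText(doc) {
    var paragraphs = doc.getElementsByTagName("p");
    var texts = [];
    for (var i = 0; i < paragraphs.length; i++) {
        var node = paragraphs[i];
        // Assumption: prefer textContent, fall back to innerText where
        // textContent is not available.
        var text = (node.textContent !== undefined) ? node.textContent : node.innerText;
        texts.push(text);
    }
    return texts;
}
```

Called as collectParagraphText(document) it works on the current page; any other document object with the same interface can be passed in as well.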

03-08-2006, 01:58 PM
Thanks for your reply!
But I need to read the <p> tags from several HTML documents. So I wanted to load index.html into the DHTML DOM.

03-08-2006, 04:18 PM
You could temporarily load each file into an iframe element of your page.
That way it is a disconnected DOM that you are simply using to "scrape" the contents.

<iframe id="HTMLParser"></iframe>

function getDOM(src, onload) {
    var parser = document.getElementById("HTMLParser");
    // hand the iframe's document to the callback once the page has loaded
    parser.onload = function() { onload(this.contentWindow.document); };
    parser.contentWindow.location.href = src;
}

getDOM("somepage.html", function(doc) {
    alert("found: " + doc.getElementsByTagName("*").length);
});

this is the basic concept, g'luck
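
Since the question was about several HTML documents, here is one possible way to chain the loads. It is only a sketch: scrapeAll is a made-up name, and it assumes a loader with the same shape as getDOM(src, callback) above. The files are loaded one after another because a single iframe can only hold one document at a time and each call replaces its onload handler.

```javascript
// Hypothetical helper (not from the thread): loads each URL in turn through
// a getDOM-style loader and records how many <p> elements each document has.
function scrapeAll(urls, loadDocument, onDone) {
    var results = [];

    function next(index) {
        if (index >= urls.length) {
            onDone(results); // every document has been processed
            return;
        }
        loadDocument(urls[index], function (doc) {
            results.push({
                url: urls[index],
                paragraphCount: doc.getElementsByTagName("p").length
            });
            next(index + 1); // start the next load only after this one finished
        });
    }

    next(0);
}
```

It could be used like scrapeAll(["index.html", "somepage.html"], getDOM, function (results) { ... }), where the file names are just placeholders.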