Find External Links within a Web Page



dvpra_gnt
08-19-2005, 02:28 PM
Hi All,

I am developing a toolbar, and one requirement is to find the external links within a web page. I am testing against the http://www.w3.org site. My problem is that links such as http://jigsaw.w3.org/css-validator/ and http://validator.w3.org/ are also reported as external, even though they are internal (they are subdomains of w3.org). Can anybody help me find only the truly external links within a web page? Here is the code I have written so far:

function DisplayExternalLinks()
{
    // Host of the current page, e.g. "www.w3.org"
    var netDomains = document.location.host;
    if (netDomains.indexOf(" ") != -1)
    {
        netDomains = getDomains(netDomains);
    }

    var netDomainsArray = getDomainsArray(netDomains);
    var im = '<img src="http://www.rampweb.com/toolbar/images/external_link.gif" alt="External Link">';
    var n = 1; // stays 1 when no external link is found

    for (var k = 0; k < document.links.length; k++)
    {
        var docLink = document.links[k];
        // Skip links without a host and links that already have a target
        if (docLink.hostname.length < 1 || docLink.target.length > 0)
        {
            continue;
        }

        // Exact host comparison: this is why jigsaw.w3.org is flagged on www.w3.org
        var hostName = docLink.hostname.toLowerCase();
        var isInternal = false;
        for (var m = 0; m < netDomainsArray.length; m++)
        {
            if (netDomainsArray[m] == hostName)
            {
                isInternal = true;
                break;
            }
        }

        if (!isInternal)
        {
            // Prefix the link with the external-link icon
            var h1 = docLink.outerHTML;
            docLink.outerHTML = '<span style="color:#91060A;font:x-small arial;"><b>' + im + ' </b></span> ' + h1;
            n = n + 1;
        }
    }

    if (n == 1)
    {
        alert('External Links are not found in this web page !');
    }
}

DisplayExternalLinks();

function getDomains(netDomains)
{
    // Remove any spaces from the host string
    var splitArray = netDomains.split(" ");
    return splitArray.join("");
}

function getDomainsArray(netDomains)
{
    // Lower-case the host string and split a comma-separated list into an array
    netDomains = netDomains.toLowerCase();
    var myArray = netDomains.split(",");
    return myArray;
}


Thanks in Advance
Prasad

A1ien51
08-19-2005, 02:35 PM
This forum is for completed scripts, NOT questions.

Kor
08-19-2005, 03:07 PM
What about this:


<script type="text/javascript">
// Collect one link per distinct host found in the page.
function findL(){
    var myLinks = [];
    var allT = document.getElementsByTagName('*');
    for(var i=0;i<allT.length;i++){
        if(allT[i].getAttribute('href')){
            myLinks[myLinks.length]=allT[i].href;
            // Drop the link just added if an earlier link has the same host
            for(var j=0;j<myLinks.length-1;j++){
                if(check(allT[i].href,myLinks[j])){myLinks.pop();break}
            }
        }
    }
    alert(myLinks)
}
// True when both URLs have the same host part.
function check(a,b){
    a = a.split('//');
    b = b.split('//');
    // hrefs such as mailto: or javascript: contain no '//' and never match
    if(a.length<2||b.length<2){return false}
    return a[1].split('/')[0]==b[1].split('/')[0];
}
onload=findL
</script>

Bill Posters
08-19-2005, 03:08 PM
How about checking the href value for the presence of http:// string and the absence of the w3.org string?

e.g.


var anchors = document.links;
for (var i = 0; i < anchors.length; i++) {
    if ( (anchors[i].href.indexOf('http://') != -1) && (anchors[i].href.indexOf('w3.org') == -1) ) {
        anchors[i].target = "_blank";
    }
}

…sorta thing.


[edit]

The above method gives inconsistent results when tested offline and online.
Oddly - and annoyingly - the browser reports every href value as an absolute URL, so http:// is present even when that string isn't explicitly written in the link's href attribute.
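
A rough sketch of what seems to be going on, assuming an environment that provides the standard URL constructor (current browsers, Node): the href property hands back the attribute value resolved against the address of the page. resolveHref is a made-up helper name for illustration, and the /Consortium/ path is just an example value.

```javascript
// Hypothetical helper: resolve an href attribute value against the
// address of the page it appears on, as the href property does.
function resolveHref(attributeValue, pageUrl) {
  return new URL(attributeValue, pageUrl).href;
}

// A relative attribute value picks up the page's scheme and host.
console.log(resolveHref('/Consortium/', 'http://www.w3.org/'));

// An absolute attribute value keeps its own scheme and host.
console.log(resolveHref('http://validator.w3.org/', 'http://www.w3.org/'));
```

So a test for 'http://' in the href property cannot tell relative links from absolute ones; reading the raw value with getAttribute('href') would be one way around that.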

Here's an alternative check which appears to give better results…


if ( (anchors[i].href.indexOf(window.location.hostname) == -1) && (anchors[i].href.indexOf('w3.org') == -1) )

It might prove useful in some way.
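
One possible refinement, offered as a sketch rather than a tested toolbar fix: compare the link's hostname against the site's base domain with a suffix match instead of searching the whole href for 'w3.org', since a substring search also matches 'w3.org' appearing in a path or query string. isExternalHost is a hypothetical helper, and the base domain is assumed to be supplied by hand (as 'w3.org' is above), because deriving it from the hostname is not reliable in general.

```javascript
// Hypothetical helper: a host is internal when it equals the base domain
// or ends with "." + base domain (i.e. it is a subdomain of it).
function isExternalHost(linkHost, baseDomain) {
  var host = linkHost.toLowerCase();
  var base = baseDomain.toLowerCase();
  if (host === base) {
    return false;
  }
  var suffix = '.' + base;
  // Position where the suffix must start for host to end with it
  var pos = host.length - suffix.length;
  return !(pos >= 0 && host.indexOf(suffix, pos) === pos);
}

// Example hosts; 'www.example.com' and 'notw3.org' are made-up values.
var hosts = ['www.w3.org', 'jigsaw.w3.org', 'validator.w3.org', 'www.example.com', 'notw3.org'];
for (var i = 0; i < hosts.length; i++) {
  console.log(hosts[i] + ' -> ' + (isExternalHost(hosts[i], 'w3.org') ? 'external' : 'internal'));
}
```

In a page this would be called with anchors[i].hostname, skipping links whose hostname is empty.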


