...

View Full Version : sitemap script to improve



nashua
02-14-2006, 12:28 PM
I have a PHP script that parses the contents of all files in a given directory and builds a site map based on <title> tags and recursive directory order.

I want to improve the script so that the results are sorted alphabetically by <title> tag rather than by directory structure.

Logically, the titles should be added to an array and then sorted, but I cannot figure out how to do it (I'm not on good terms with PHP yet) :(

----------------- code------------------
<html>
<head>
<title>Site Map</title>
</head>
<body>
<?php
// starting directory. Dot means current directory
$basedir = "folder_name";

// function to count depth of directory
function getdepth($fn){
return (($p = strpos($fn, "/")) === false) ? 0 : (1 + getdepth(substr($fn, $p+1)));
}

// function to print a line of html for the indented hyperlink
function printlink($fn){
$indent = getdepth($fn); // get indent value
echo "<li class=\"$indent\"><a href=\"$fn\">"; //print url
$handle = fopen($fn, "r"); //open web page file
$filestr = fread($handle, 1024); //read top part of html
fclose($handle); //close web page file
if (preg_match("/<title>(.+?)<\/title>/is",$filestr,$title)) { //get page title (non-greedy capture)
echo $title[1]; //print the captured title text only
} else {
echo "No title";
}
echo "</a></li>\n"; //finish html
}

// main function that scans the directory tree for web pages
function listdir($basedir){
if ($handle = @opendir($basedir)) {
while (false !== ($fn = readdir($handle))){
if ($fn != '.' && $fn != '..' && $fn != 'common' && $fn != 'lib'){ // ignore these
$dir = $basedir."/".$fn;
if (is_dir($dir)){
listdir($dir); // recursive call to this function
} else { //only consider .html etc. files
if (preg_match("/[^.\/].+\.(htm|html|php)$/",$dir,$fname)) {
printlink($fname[0]); //generate the html code
}
}
}
}
closedir($handle);
}
}
// function call
listdir($basedir); //this line starts the ball rolling

?>
</body>
</html>

--------------code----------------------------

dumpfi
02-14-2006, 08:31 PM
Something like this?

<?php
function sortFiles($fileA, $fileB)
{
return strnatcasecmp($fileA[2], $fileB[2]);
}
function showSitemap($path)
{
$sitemap = dirToArray(dir($path), 'usort', array('sortFiles'));
/*
// include $path in Sitemap
$sitemap = array(
array(basename($path), dirname($path), basename($path), $sitemap)
);*/
printSitemap($sitemap); // second argument is the indent level, which defaults to 0
}
function printSitemap($sitemap, $indent = 0)
{
echo '<ol>';
foreach($sitemap as $fileArray)
{
$isDir = (count($fileArray) == 4);
echo '<li class="indent',$indent,'"><a href="',$fileArray[1],'/',$fileArray[0],'" class="',($isDir) ? 'dir' : 'file','">',$fileArray[2],'</a>';
if($isDir)
{
printSitemap($fileArray[3], $indent + 1);
}
echo '</li>';
}
echo '</ol>';
}
function dirToArray($dir, $sortFunction = NULL, $sortArgs = array())
{
$dirArray = array();
$extensions = array('.htm', '.html', '.php');
while(($fileName =$dir->read()) !== FALSE)
{
if($fileName == '.' || $fileName == '..' || $fileName == 'common' || $fileName == 'lib')
{
continue;
}
$path = $dir->path.'/'.$fileName;
if(is_dir($path))
{
$dirArray[] = array($fileName, $dir->path, $fileName, dirToArray(dir($path), $sortFunction, $sortArgs));
}
elseif(hasExtension($fileName, $extensions))
{
$dirArray[] = array($fileName, $dir->path, getTitle($path));
}
}
if(is_callable($sortFunction))
{
$sortArgs = array_merge(array(&$dirArray), $sortArgs);
call_user_func_array($sortFunction, $sortArgs);
}
return $dirArray;
}
function hasExtension($fileName, $extensions)
{
$fileExtension = strrchr($fileName, '.');
foreach($extensions as $extension)
{
if($fileExtension == $extension)
{
return TRUE;
}
}
return FALSE;
}
function getTitle($filePath)
{
$content = file_get_contents($filePath);
if(preg_match('#<title>(.+?)</title>#msi', $content, $matches))
{
return $matches[1];
}
return 'Untitled';
}
?>
<html>
<head>
<style type="text/css">
ol
{
margin:0em;
padding-left:1.5em;
background-color:#eee;
}
li
{
list-style-type:none;
}
.dir
{
color:#4a4;
}
.file
{
color:#44c;
}
</style>
</head>
<body>
<?php
showSitemap('folder name');
?>
</body>
</html>

nashua
02-15-2006, 10:06 AM
Should I simply remove the closing tags </body> and </html> in the originally posted code and add your code?
--------------
OK. It worked fine, but I expected it to build one long list of files in alphabetical order, without any reference to directories. Here is an example:

Now it returns something like this:

france
    Bordeaux
    Lyon
    Paris
usa
    New York
    Washington

I would like it to be as follows:

Bordeaux
Lyon
New York
Paris
Washington

... and no reference to 'france' or 'us'.

In fact, I would be happy to keep the directory/file structure, but the directories on the list:

a) appear in lowercase,
b) are shown under their raw names, whereas I would prefer 'United States of America' instead of 'us'

Zegg90
02-15-2006, 12:01 PM
To make all words start with a capital letter, you could try this:


ucwords(strtolower($text));

dumpfi
02-15-2006, 04:54 PM
To make just one big list with all the directories lowercased, use this:
<?php
function sortFiles($fileA, $fileB)
{
return strnatcasecmp($fileA[2], $fileB[2]);
}
function showSitemap($path)
{
$sitemap = dirToArray(dir($path), 'usort', array('sortFiles'));
/*
// include $path in Sitemap
$sitemap = array(
array(basename($path), dirname($path), basename($path), $sitemap)
);*/
printSitemap($sitemap, $path);
}
function printSitemap($sitemap, $startPath)
{
echo '<ol>';
foreach($sitemap as $fileArray)
{
$fullPath = $fileArray[1].'/'.$fileArray[0];
$isDir = is_dir($fullPath);
echo '<li class="indent',getIndent($fullPath, $startPath),'"><a href="',$fullPath,'" class="',($isDir) ? 'dir' : 'file','">',($isDir) ? strtolower($fileArray[2]) : $fileArray[2],'</a></li>';
}
echo '</ol>';
}
function getIndent($fullPath, $startPath)
{
$offset = strlen($startPath);
return substr_count($fullPath, '/', $offset) + substr_count($fullPath, '\\', $offset);
}
function dirToArray($dir, $sortFunction = NULL, $sortArgs = array())
{
$dirArray = array();
$extensions = array('.htm', '.html', '.php');
while(($fileName =$dir->read()) !== FALSE)
{
if($fileName == '.' || $fileName == '..' || $fileName == 'common' || $fileName == 'lib')
{
continue;
}
$path = $dir->path.'/'.$fileName;
if(is_dir($path))
{
$dirArray = array_merge($dirArray, array(array($fileName, $dir->path, $fileName)), dirToArray(dir($path), $sortFunction, $sortArgs));
}
elseif(hasExtension($fileName, $extensions))
{
$dirArray[] = array($fileName, $dir->path, getTitle($path));
}
}
if(is_callable($sortFunction))
{
$sortArgs = array_merge(array(&$dirArray), $sortArgs);
call_user_func_array($sortFunction, $sortArgs);
}
return $dirArray;
}
function hasExtension($fileName, $extensions)
{
$fileExtension = strrchr($fileName, '.');
foreach($extensions as $extension)
{
if($fileExtension == $extension)
{
return TRUE;
}
}
return FALSE;
}
function getTitle($filePath)
{
$content = file_get_contents($filePath);
if(preg_match('#<title>(.+?)</title>#msi', $content, $matches))
{
return $matches[1];
}
return 'Untitled';
}
?>
<html>
<head>
<style type="text/css">
ol
{
margin:0em;
padding-left:1.5em;
background-color:#eee;
}
li
{
list-style-type:none;
}
.dir
{
color:#4a4;
}
.file
{
color:#44c;
}
</style>
</head>
<body>
<?php
showSitemap('folder name');
?>
</body>
</html>
To convert "us" to "United States of America" (and expand other abbreviations in the file and directory names), edit the echo line in printSitemap() so that it passes each name through a conversion function, which maps every abbreviation to its full name, and prints the result.

A simple way to do this would be:

// replace the echo line in printSitemap() with this
echo '<li class="indent',getIndent($fullPath, $startPath),'"><a href="',$fullPath,'" class="',($isDir) ? 'dir' : 'file','">',($isDir) ? expandAbbreviation(strtolower($fileArray[2])) : $fileArray[2],'</a></li>';

// this is the converter function, a lower-cased directory name is passed to it
function expandAbbreviation($dirName)
{
switch($dirName)
{
case 'us': return 'United States of America';
case 'fr': return 'France';
case 'ger': return 'Germany';
// etc.

// directory names that have no mapping are simply returned unchanged
default: return $dirName;
}
}

nashua
02-17-2006, 10:29 AM
Wow, it works! It would have taken ages for me to figure out how to arrange it :confused:

As I understand it, if I want each word to start with a capital letter, I should change the line:

? expandAbbreviation(strtolower($fileArray[2])) : $fileArray[2],'</a></li>';

to

? expandAbbreviation (ucwords(strtolower($fileArray[2]))) : $fileArray[2],'</a></li>';

Is this correct? :confused:
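
One caveat with that ordering: expandAbbreviation() compares its argument against lower-case keys such as 'us', so a name that has already been through ucwords() ('Us') would no longer match any case. A minimal sketch of one way around it, assuming the expandAbbreviation() function posted above; formatDirName() is a hypothetical helper name, not part of the posted script:

```php
<?php
// Converter from the post above: expects a lower-cased directory name.
function expandAbbreviation($dirName)
{
    switch ($dirName) {
        case 'us':  return 'United States of America';
        case 'fr':  return 'France';
        case 'ger': return 'Germany';
        default:    return $dirName; // no mapping: returned unchanged
    }
}

// Hypothetical helper: look the name up first, and capitalise it only
// when no mapping was found.
function formatDirName($rawName)
{
    $lower = strtolower($rawName);
    $expanded = expandAbbreviation($lower);
    return ($expanded === $lower) ? ucwords($lower) : $expanded;
}
```

In the echo line, the directory branch would then read formatDirName($fileArray[2]). Mapped names are left as written, because running ucwords() over them would also capitalise the "of" in 'United States of America'.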

I have three more questions:

1. I use some PHP code within <title> tags, for example <?php echo date("Y"); ?>, so that the title looks like "John Doe: 1945-2006". The script does not return 2006; it outputs the literal <?php echo date("Y"); ?> as part of the HTML. How can I make it show the evaluated title instead?

2. Just in case I need it: I figured out how to remove the references to all files, leaving only folders on the list, but how can I remove the sub-folders if required?

3. I substitute 'folder name' for the folder's real name in:

<?php showSitemap('folder name'); ?>

Is it possible to use wildcards: a*, b*, etc. ?
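
On question 3: dir() and opendir() take a literal path, so a pattern such as 'a*' would not be expanded by the posted script. One possible approach, offered as a sketch rather than a tested drop-in, is to expand the pattern with PHP's glob() first and then call showSitemap() once per match. matchDirs() is a hypothetical helper name:

```php
<?php
// Hypothetical helper: return the directories whose path matches a
// shell-style pattern such as 'a*' or 'site/b*'.
function matchDirs($pattern)
{
    // GLOB_ONLYDIR limits the matches to directories.
    $dirs = glob($pattern, GLOB_ONLYDIR);
    if ($dirs === false) {
        return array(); // glob() returns false on error
    }
    sort($dirs); // plain alphabetical order
    return $dirs;
}
```

Usage would be along the lines of: foreach (matchDirs('a*') as $dir) { showSitemap($dir); }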

Again, I'm really grateful for your help!!!! :thumbsup:
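
On question 1: getTitle() reads each page with file_get_contents(), which returns the raw source, so any PHP inside <title> is never executed. A hedged sketch of one alternative is to include the page while output buffering is on and read the title from the generated output. getRenderedTitle() is a hypothetical name, and this assumes every listed page is your own trusted code that is safe to execute (no exit calls, no redirects, no function names that clash with the sitemap script):

```php
<?php
// Hypothetical replacement for getTitle(): runs the page through PHP
// and reads the <title> from the generated output.
function getRenderedTitle($filePath)
{
    ob_start();                 // capture output instead of sending it
    include $filePath;          // executes any PHP in the page (trusted files only)
    $content = ob_get_clean();  // fetch the captured output and stop buffering
    if (preg_match('#<title>(.+?)</title>#msi', $content, $matches)) {
        return trim($matches[1]);
    }
    return 'Untitled';
}
```

The trade-off is that the whole page runs, not just the title, so this is slower than reading the file and unsuitable for pages with side effects.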

nashua
02-23-2006, 09:55 AM
no ideas? maybe just a little hint? :rolleyes:

nashua
03-23-2006, 10:14 AM
How can I stop the above script from listing directories, so that it keeps only the files?
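
In the flat-list version of dirToArray() above, the directory entries come from the array(array($fileName, $dir->path, $fileName)) element that is merged in the is_dir() branch, so leaving that element out should keep only the files. A self-contained sketch of the same idea follows. The function names pageTitle(), collectPages(), comparePagesByTitle() and sortPagesByTitle() are hypothetical, and the exclusions and extensions are copied from the original script:

```php
<?php
// Extract the <title> of a page, in the same way as getTitle() above.
function pageTitle($filePath)
{
    $content = file_get_contents($filePath);
    if ($content !== false && preg_match('#<title>(.+?)</title>#msi', $content, $m)) {
        return trim($m[1]);
    }
    return 'Untitled';
}

// Walk $path recursively and return a flat list of array(path, title)
// pairs for pages only; directories are descended into but never listed.
function collectPages($path)
{
    $pages = array();
    $names = scandir($path);
    if ($names === false) {
        return $pages; // unreadable directory: nothing to add
    }
    foreach ($names as $name) {
        if (in_array($name, array('.', '..', 'common', 'lib'), true)) {
            continue; // same exclusions as the original script
        }
        $full = $path . '/' . $name;
        if (is_dir($full)) {
            $pages = array_merge($pages, collectPages($full));
        } elseif (preg_match('/\.(htm|html|php)$/i', $name)) {
            $pages[] = array($full, pageTitle($full));
        }
    }
    return $pages;
}

// Comparator: case-insensitive natural order on the title.
function comparePagesByTitle($a, $b)
{
    return strnatcasecmp($a[1], $b[1]);
}

// Return the flat list sorted by title.
function sortPagesByTitle($pages)
{
    usort($pages, 'comparePagesByTitle');
    return $pages;
}
```

The result of sortPagesByTitle(collectPages('folder name')) can then be looped over to echo one <li><a href="...">title</a></li> per page, with $page[0] as the link and $page[1] as the text.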


