Originally Posted by DrDOS
OK, get me clear on this: are you parsing the entire files, or just extracting certain information from them? If it's the latter, you have other means available. Also, how fast is the download itself? How long does it take to download the eighty pages without doing any parsing?
I noticed that your server is set up nearly identically to my host's and to my localhost machine, including safe mode OFF. So you have available something called sed, the stream editor, which can perform many Perl-like tasks, and do them very fast.
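For what it's worth, here's the kind of one-liner sed approach that suggestion points at. The file name and tag layout are made up, and sed works line by line, so this only holds if each tag sits on its own line: it prints the `<town>` value from every property block whose `<country>` is France.

```shell
# Made-up feed layout: one tag per line. For each block that contains
# <country>France</country>, print the contents of its <town> tag.
sed -n '/<country>France<\/country>/,/<\/property>/ s/.*<town>\(.*\)<\/town>.*/\1/p' feed.xml
```

For anything where the tags wrap across lines, a real XML parser is the safer tool.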
There is a sequence of nearly 3,000 web pages, each with 20-odd properties. I have to parse the country of each property to see if it's in France, and if it is, parse about a third of the entry for the info I want.
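Roughly what I'm doing per page looks like this. The URL pattern, the element names (`property`, `country`, `id`, `town`) and the total page count are placeholders, not the real feed:

```php
<?php
// Sketch only: element names and the URL below are placeholders for
// whatever the real feed uses.
function extract_french_properties(string $xmlText): array
{
    $xml = @simplexml_load_string($xmlText);
    if ($xml === false) {
        return []; // unparseable page: skip it
    }
    $found = [];
    foreach ($xml->property as $prop) {
        if ((string)$prop->country !== 'France') {
            continue; // only interested in French properties
        }
        // pull out only the fields we actually need (about a third of them)
        $found[] = [
            'id'   => (string)$prop->id,
            'town' => (string)$prop->town,
        ];
    }
    return $found;
}

// Per-page driver (placeholder URL). file_get_contents() plus the parser
// above behaves like simplexml_load_file() on the URL directly:
// for ($page = 1; $page <= 2971; $page++) {
//     $rows = extract_french_properties(
//         file_get_contents("http://example.com/feed.php?page=$page"));
// }
```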
The problem appears to be a timeout of some kind that occurs at about 80 seconds, or roughly 75 pages. The lost time appears to be an accumulation of the time the remote pages take to load/respond to the simplexml_load_file() line.
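If the ~80-second cutoff is PHP's own max_execution_time (shared hosts often set it somewhere between 30 and 90 seconds), it can be lifted from inside the script, since safe mode is off here. This will not help if the timeout is coming from the web server or a gateway in front of it:

```php
<?php
// Lift PHP's execution-time limit for this run only. set_time_limit()
// is ignored under safe mode, but the earlier post says safe mode is OFF.
set_time_limit(0);

// Separately, cap how long each remote page is allowed to stall the
// stream-based fetch that simplexml_load_file() performs.
ini_set('default_socket_timeout', 30);
```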
It seems to my inexpert eye that my options are either to find a way to get the whole XML file loaded in one go and parse it then (hopefully their techies can supply that), or to find a way to parse one page, or a small batch of pages, at a time.
I need to run this operation once a night, automated, so it doesn't have to be fast, although heaven knows I expected it to be going into this.
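For the nightly automation, a crontab entry along these lines would do; the time, PHP binary path, and script path are all placeholders to adjust for the actual host:

```shell
# Run the import at 03:15 every night; append output to a log.
# Paths are examples only.
15 3 * * * /usr/bin/php /home/user/scripts/import_properties.php >> /home/user/logs/import.log 2>&1
```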