The reason you normally can't do this is security. Pure and simple.
The browser's same-origin policy won't let you use xmlhttp (or equivalent) to read content from another site, unless that site explicitly opts in to being read.
It has nothing whatsoever to do with the KIND of content in the pages. You simply can't get *TO* the pages. At all.
(Unless you use the Mozilla hack mentioned by Kevin.)
On the other hand, doing this server-side is easy. PHP/JSP/ASP can all do it. The server-side equivalents of xmlhttp in those various systems don't have the "not in my domain" restrictions of a browser.
So unless you are fanatical about doing it in JS--and willing to hack it the Mozilla way--why not use a tiny bit of server-side code to help yourself?
One very very simple thing to do would be to create a server-side "proxy server" for foreign site content. That is, you'd use AJAX (or IFRAME!) to hit a page on your own site with a URL something like:
The PHP (or ASP or JSP) proxy page simply loads the given URL using server-side code and returns the full source of the page to your AJAX code or IFRAME. Presto. You can now see the full HTML and do whatever you want with it.
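To make the proxy idea concrete, here is a rough sketch of one. It is in Python rather than PHP/ASP/JSP only because it fits in one self-contained file; the same few steps apply in any of them. Everything named here is my own assumption for illustration: the `/proxy` path, the `url` query parameter, port 8000, and the helper names `target_from_path`, `ProxyHandler` and `run`. Treat it as a starting point, not a finished proxy.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs
from urllib.request import urlopen


def target_from_path(path):
    """Return the foreign URL passed as ?url=..., or None if missing or unsafe.

    The parameter name "url" is an assumption made for this sketch.
    """
    values = parse_qs(urlparse(path).query).get("url")
    if not values:
        return None
    target = values[0]
    parsed = urlparse(target)
    # Only plain web URLs; refuses file://, ftp:// and anything without a host.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return None
    return target


class ProxyHandler(BaseHTTPRequestHandler):
    """Fetches the requested foreign page server-side and relays its source."""

    def do_GET(self):
        target = target_from_path(self.path)
        if target is None:
            self.send_error(400, "Missing or unsupported url parameter")
            return
        try:
            # Server-side fetch: no browser, so no same-origin restriction.
            with urlopen(target, timeout=10) as upstream:
                body = upstream.read()
                content_type = upstream.headers.get("Content-Type", "text/html")
        except (OSError, ValueError):
            self.send_error(502, "Could not fetch the foreign page")
            return
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(host="127.0.0.1", port=8000):
    """Start the proxy; host and port are arbitrary choices for this sketch."""
    HTTPServer((host, port), ProxyHandler).serve_forever()
```

Call `run()` to start it, then point your AJAX request or IFRAME at the proxy on your own site with the foreign address URL-encoded in the `url` parameter. A real deployment would also want a whitelist of the sites it is allowed to fetch, since an open proxy like this will happily fetch anything it is asked to.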
I am studiously ignoring the legal aspects of this. I am *assuming* that of course you have contacted these various sites and gotten permission to use their copyrighted material. And I hope Santa Claus was good to you last month.