  1. #1
    New Coder
    Join Date
    Apr 2006
    Posts
    29
    Thanks
    1
    Thanked 0 Times in 0 Posts

    SRC URL extraction method - HTML or TXT to TXT...

    Hi,

    I'm looking for an app, or online form, to extract image URLs from HTML code saved in TXT files. From the SRC attribute of <IMG> tags, to be more exact.

    I have several code snippets like this:

    Code:
    <img src="http://dummy.site.com/here/images/09/10065/file01.jpg" width="64" height="100" alt="image title" />
    image name
    <img src="http://dummy.site.com/here/images/09/10065/file02.jpg" width="64" height="100" alt="image title" />
    image name
    <img src="http://dummy.site.com/here/images/09/10065/file03.jpg" width="64" height="100" alt="image title" />
    image name
    <img src="http://dummy.site.com/here/images/09/10065/file04.jpg" width="64" height="100" alt="image title" />
    image name
    <img src="http://dummy.site.com/here/images/09/10065/file05.jpg" width="64" height="100" alt="image title" />
    image name
    <img src="http://dummy.site.com/here/images/09/10065/file06.jpg" width="64" height="100" alt="image title" />
    image name
    And I need an automated way to extract just the URLs and save them in a TXT file like this:

    Code:
    http://dummy.site.com/here/images/09/10065/file01.jpg
    http://dummy.site.com/here/images/09/10065/file02.jpg
    http://dummy.site.com/here/images/09/10065/file03.jpg
    http://dummy.site.com/here/images/09/10065/file04.jpg
    http://dummy.site.com/here/images/09/10065/file05.jpg
    http://dummy.site.com/here/images/09/10065/file06.jpg
    One URL per line.
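
For readers who do have Python available, the extraction described above can be sketched with the standard library alone. This is a minimal sketch, not a vetted tool: the names `ImgSrcCollector`, `extract_img_urls`, and `convert_file` are made up for illustration, and it assumes the TXT files are UTF-8 (undecodable bytes are replaced rather than rejected).

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <IMG SRC=...>
        # is handled as well. Self-closing tags (<img ... />) also end up
        # here through the default handle_startendtag.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.urls.append(value)


def extract_img_urls(html_text):
    """Return the src URLs of all <img> tags, in document order."""
    parser = ImgSrcCollector()
    parser.feed(html_text)
    parser.close()
    return parser.urls


def convert_file(in_path, out_path):
    """Read HTML from in_path and write one URL per line to out_path."""
    with open(in_path, encoding="utf-8", errors="replace") as src:
        urls = extract_img_urls(src.read())
    with open(out_path, "w", encoding="utf-8") as dst:
        dst.write("\n".join(urls) + ("\n" if urls else ""))
```

Calling `convert_file("snippet.txt", "urls.txt")` (both file names are placeholders) would process one TXT file at a time, which matches the workflow described in the post.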

    The code snippets are not too big, just a bit over 100 entries for the bigger ones. I don't care if I have to do it one TXT at a time. Beats doing the whole thing by hand.

    This is the sort of thing that makes me mad for not being a programmer! Any one of you guys could probably come up with a number of ways to pull this off in just a couple of minutes.

    And I'm quite sure the tools to pull it off are already out there, but trying a search for it... well, let's just say there's way too much out there, and installing small random apps is really not safe.

    I may be completely wrong, but I think I was once able to feed code like this to flashget, and it would go through the whole thing and list the URLs it found in a confirmation box, letting me select just a few and copy them to the clipboard in the exact one-URL-per-line format I need here. But somehow my flashget installation got screwed up, and now I can't figure out which version I was using. I've already tested 4 different ones, and none of them seems to be able to do that.

    I need those URLs in that format so I can then batch replace URL segments and, finally, feed the updated URLs to flashget. But the first step is extracting the initial URL from that code.
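
The batch replacement step mentioned here could look roughly like the sketch below once the URLs are in a list. The helper name `replace_segment` and the `/images/` to `/fullsize/` substitution are invented for illustration; the post does not say which segment needs to change.

```python
def replace_segment(urls, old, new):
    """Return a new list with every occurrence of `old` swapped for `new`."""
    return [url.replace(old, new) for url in urls]


urls = [
    "http://dummy.site.com/here/images/09/10065/file01.jpg",
    "http://dummy.site.com/here/images/09/10065/file02.jpg",
]

# "/images/" -> "/fullsize/" is a made-up substitution, not taken from the post.
updated = replace_segment(urls, "/images/", "/fullsize/")
print("\n".join(updated))
```

Plain string replacement is used on purpose: it changes every occurrence of the segment, so the text being replaced should be specific enough to appear only where intended.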

    So, any ideas?


    Thanks.


    PS: I hope I'm not screwing up by posting this here, but I really couldn't find a better match... And it IS HTML related, I guess.

  • #2
    New to the CF scene
    Join Date
    Nov 2009
    Posts
    9
    Thanks
    1
    Thanked 0 Times in 0 Posts
    why not open it up in a text editor and use the search and replace utility? Just put in search for <img src=" and replace it with an empty space, then search for the ending string and do the same thing?

    Jack

  • #3
    Senior Coder Rowsdower!'s Avatar
    Join Date
    Oct 2008
    Location
    Some say it's everything.
    Posts
    2,027
    Thanks
    5
    Thanked 397 Times in 390 Posts
    You could do this with javascript or PHP if you have a web host that supports PHP.

    If you're looking for customized code to be built for you then this thread is probably most appropriately placed in the paid work forum.

    If you want to learn to do it yourself and be guided then by all means make an effort of your own and we will help you sort out the issues you run into. The logic involved with this would be pretty simple.

  • #4
    New Coder
    Join Date
    Apr 2006
    Posts
    29
    Thanks
    1
    Thanked 0 Times in 0 Posts

    Hey guys.

    Quote Originally Posted by Jack Corzine View Post
    why not open it up in a text editor and use the search and replace utility? Just put in search for <img src=" and replace it with an empty space, then search for the ending string and do the same thing
    That would be my default approach, but the ending string is always different because of the ALT attribute and its text, which change from one tag to the next!
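
The varying tail is the case a regular expression handles where a literal search and replace cannot: the pattern can capture only what sits between the quotes after `src=` and skip everything up to the closing `>`, whatever it contains. A rough sketch, assuming double-quoted `src` values as in the sample code; `IMG_SRC` and `urls_from_img_tags` are illustrative names.

```python
import re

# <img, then anything up to a whitespace-preceded src="...", capture the URL,
# then ignore whatever else the tag contains (alt, width, height, ...) up to ">".
IMG_SRC = re.compile(r'<img\b[^>]*?\ssrc\s*=\s*"([^"]*)"[^>]*>', re.IGNORECASE)


def urls_from_img_tags(text):
    """Return the double-quoted src value of every <img> tag in text."""
    return IMG_SRC.findall(text)


sample = (
    '<img src="http://dummy.site.com/here/images/09/10065/file01.jpg" '
    'width="64" height="100" alt="first title" />\n'
    'image name\n'
    '<img src="http://dummy.site.com/here/images/09/10065/file02.jpg" '
    'width="64" height="100" alt="a completely different title" />\n'
)
print("\n".join(urls_from_img_tags(sample)))
```

The "image name" lines are dropped because only the captured groups are returned. Regular expressions are a shortcut here rather than a full HTML parser, so unusual markup (single-quoted or unquoted attributes, for instance) would need a different pattern.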

    Quote Originally Posted by Rowsdower! View Post
    You could do this with javascript or PHP if you have a web host that supports PHP.
    My webhost supports PHP, and I'll take any solution; it's just that I'm sure there are freely available ready-made solutions out there, from website 'suckers' to download managers, or even HTML tag strippers that can be customized. Surely there's something that just goes through arbitrary text and collects the HTML links. I just need to be sure about one, so I don't end up installing 10 or more before I get the right one.

    Quote Originally Posted by Rowsdower! View Post
    If you're looking for customized code to be built for you then this thread is probably most appropriately placed in the paid work forum.
    I have a friend who can probably do it for free in just a few hours, it's just that the tools I need already exist. And he's a busy guy.

    Quote Originally Posted by Rowsdower! View Post
    If you want to learn to do it yourself and be guided then by all means make an effort of your own and we will help you sort out the issues you run into. The logic involved with this would be pretty simple.
    My coding skills are zero. I do HTML and CSS, that's about it. Calling that coding is almost like elevating paper airplane throwing to space exploration...
    Last edited by whopub; 11-30-2009 at 03:17 PM. Reason: spelling

  • #5
    New Coder
    Join Date
    Apr 2006
    Posts
    29
    Thanks
    1
    Thanked 0 Times in 0 Posts
    Even an app that goes through text and just extracts all words starting with http will do (the " can easily be removed later).
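
That looser fallback, grabbing every word that starts with http, can be sketched as follows. The pattern stops at whitespace, quotes, and angle brackets, so the trailing `"` is never picked up in the first place; `HTTP_WORD` and `find_http_words` are illustrative names.

```python
import re

# A "word" here is http:// or https:// followed by anything that is not
# whitespace, a double quote, or an angle bracket.
HTTP_WORD = re.compile(r'https?://[^\s"<>]+')


def find_http_words(text):
    """Return every http(s) URL-like token found in text, in order."""
    return HTTP_WORD.findall(text)


text = 'see "http://dummy.site.com/here/images/09/10065/file01.jpg" and https://dummy.site.com/x here'
for url in find_http_words(text):
    print(url)
```

Unlike an `<img>`-specific approach, this collects links from any tag or from plain text, so the result may need filtering if the file also contains non-image URLs.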

    But still, there must be apps out there to suck URLs out of text files. Anyone?

