Thread: Scraping

  1. #1
    New to the CF scene
    Join Date
    Sep 2012
    Posts
    2
    Thanks
    1
    Thanked 0 Times in 0 Posts

    Scraping

    Hello,

    Kinda new at JS, but here goes. I want to be able to pull an element from one page and display it on another page.

    So here is exactly what I need.

    Page1.html has element "test" and I would like to be able to pull this element and display it on Page2.html.

    Thanks to anyone who can solve this for me.

    *Edit: Page1.html will not be open; I want Page2.html to be able to navigate to it.

  • #2
    Regular Coder
    Join Date
    Apr 2012
    Location
    St. Louis, MO
    Posts
    985
    Thanks
    7
    Thanked 101 Times in 101 Posts
    You might be able to do it with AJaX. Put the responseText into a string, then do a regex search for specific code and place that into another string, then make that the innerHTML of an element on Page2.html.
    ^_^

    If anyone knows of a website that can offer ColdFusion help that isn't controlled by neurotic, pedantic jerks* (stackoverflow.com), please PM me with a link.
    *
    The neurotic, pedantic jerks are not the owners; just the people who are in control of the "popularity contest".

  • #3
    New to the CF scene
    Join Date
    Sep 2012
    Posts
    2
    Thanks
    1
    Thanked 0 Times in 0 Posts
    Thank you. I know it can be done; I just don't know how.

  • #4
    Regular Coder
    Join Date
    Apr 2012
    Location
    St. Louis, MO
    Posts
    985
    Thanks
    7
    Thanked 101 Times in 101 Posts
    AJaX via jQuery documentation can be found here.

    AJaX without jQuery can be found here.

    I'm not going to write the code for you; this forum is for helping troubleshoot erroneous code. I prefer the non-jQuery route, but that's your choice. If you go non-jQ, write up your xhr (XMLHttpRequest), request Page1.html, and put the response in a string. Then use a JS RegEx to find the start and end positions of the code you want, and use substring(start, end) to copy that part into another string. Finally, give the element that should display the data an id and set document.getElementById("name").innerHTML = that string.
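
A minimal sketch of the string-based steps described above, assuming the wanted element has the id "test" and does not contain another tag of the same name. `extractById` is a hypothetical helper name, not something from this thread, and it only covers that simple case.

```javascript
// Hypothetical helper: pull the inner HTML of the first element with the
// given id out of a raw HTML string, using only RegExp and substring().
// Assumption: the element does not nest another tag of the same name.
function extractById(html, id) {
    // Escape regex metacharacters in the id before building the pattern.
    var safeId = id.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    // Match the opening tag (e.g. <div class="x" id="test">) and capture its tag name.
    var open = new RegExp(
        "<([a-z][a-z0-9]*)\\b[^>]*\\sid=([\"'])" + safeId + "\\2[^>]*>", "i");
    var m = open.exec(html);
    if (!m) { return null; }
    var start = m.index + m[0].length;            // first character after the opening tag
    // Look for the first matching closing tag after the start position.
    var close = new RegExp("</" + m[1] + "\\s*>", "gi");
    close.lastIndex = start;
    var c = close.exec(html);
    if (!c) { return null; }
    return html.substring(start, c.index);        // text between the two tags
}
```

In the browser, the string passed in would be the finished request's responseText, and the result (if not null) would be assigned to the innerHTML of the target element on Page2.html.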
    ^_^


  • Users who have thanked WolfShade for this post:

    gatoraid (09-25-2012)

  • #5
    Senior Coder rnd me
    Join Date
    Jun 2007
    Location
    Urbana
    Posts
    4,294
    Thanks
    10
    Thanked 583 Times in 564 Posts
    regexp is hard to write and harder to maintain; i always preferred using DOM methods.
    regexp can break on a simple redesign or widget addition.
    finding the start point is easy, but the end point is harder, and subject to change with content.
    with regexp, you almost always end up having to "over bite" and work your way back from the end.
    for example, you can find a div with a certain ID pretty easily, but if it nests 5 other divs, where the heck do you stop?
    trying to match end tags in RegExp is next to impossible for any non-guru.
    you would need to call exec() several times on the same regexp (instead of a one-shot like match or split) to step through result matches, and yikes, what a pita...
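
To illustrate the end-tag problem, here is a rough sketch of the exec() stepping mentioned above. A lazy one-shot pattern such as /<div id="outer">([\s\S]*?)<\/div>/ stops at the first </div> it meets, so nested divs get cut off. The loop below instead counts opening and closing div tags until the depth returns to zero. `extractNestedDiv` is a hypothetical name, and the sketch assumes a plain alphanumeric id, balanced markup, and no stray "<div" text inside comments or scripts.

```javascript
// Hypothetical helper: return the inner HTML of <div id="..."> even when it
// nests other divs, by stepping through every div tag with exec().
// Assumptions: id is plain alphanumeric text and the markup is balanced.
function extractNestedDiv(html, id) {
    var open = new RegExp("<div\\b[^>]*\\sid=([\"'])" + id + "\\1[^>]*>", "i");
    var first = open.exec(html);
    if (!first) { return null; }
    var start = first.index + first[0].length;   // content begins after the opening tag

    // Global regex matching both <div ...> and </div>; group 1 is "/" for a closing tag.
    var tags = /<(\/?)div\b[^>]*>/gi;
    tags.lastIndex = start;                      // resume scanning after the opening tag
    var depth = 1;
    var m;
    while ((m = tags.exec(html)) !== null) {
        depth += m[1] ? -1 : 1;                  // closing tag lowers depth, opening raises it
        if (depth === 0) {
            return html.substring(start, m.index);
        }
    }
    return null;                                 // ran out of tags: markup was unbalanced
}
```

Even this only handles one tag name and well-formed input, which is roughly the maintenance burden being described.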

    dom can be much more targeted/filtered/precise, and you can use easy-to-read css selectors instead of RegExp gobbledygook.
    plus you don't have to worry about finding the end tag, it's immune to whitespace/char encoding changes, and it resists breakage from authors inserting duplicate-shaped html.


    demo in firebug right here, then adapt as needed:

    Code:
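    // Fetch a URL asynchronously and pass the response text to callback.
    function aGet(turl, callback) {
        var XHRt = new XMLHttpRequest();
        XHRt.onreadystatechange = function () {
            // readyState 4 = request finished; status 200 = OK
            if (XHRt.readyState === 4 && XHRt.status === 200) {
                callback(XHRt.responseText);
            }
        };
        XHRt.open("GET", turl, true);
        XHRt.send();
        return XHRt;
    }

    aGet("/", function (str) {
        // Parse the fetched markup inside a detached element so it can be queried.
        var t = document.createElement("head");
        t.innerHTML = str;

        // adjust here:
        var path = "#collapseobj_forumbit_1 .alt1Active div [href$='2']";
        var match = t.querySelector(path);
        // querySelector returns null when nothing matches, so check before reading.
        alert(match ? match.textContent : "NO MATCH");
    });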
    edit:
    of course, i can hear them now, "what about the poor IE7 users?".
    you can still code for them: you can still use the dom, just not CSS selectors.
    it's still a LOT easier to jump to the nearest ID and .getElementsByTagName() your way down to the content you need than it is splitting hairs with regexp.
    i say forget about IE7; click jackers and zombie bots will probably finish those boxes off in the next 6 months, and 2/3 of IE7 boxes are in china, which is likely not your site's target demo anyway...
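
A sketch of that selector-free fallback, assuming the fetched markup has already been placed in a detached container via innerHTML as in the demo above. `findById` and `textOfNthTag` are hypothetical helper names. A detached element has no getElementById of its own, so the first helper scans getElementsByTagName("*") for the id instead.

```javascript
// Hypothetical helper: find a descendant by id inside a detached container.
function findById(root, id) {
    var all = root.getElementsByTagName("*");
    for (var i = 0; i < all.length; i++) {
        if (all[i].id === id) { return all[i]; }
    }
    return null;
}

// Hypothetical helper: text of the n-th <tagName> (0-based) under the element
// with the given id, or null when the id or the tag is not found.
function textOfNthTag(root, id, tagName, n) {
    var anchor = findById(root, id);
    if (!anchor) { return null; }
    var hits = anchor.getElementsByTagName(tagName);
    if (n >= hits.length) { return null; }
    var el = hits[n];
    // IE7 and IE8 lack textContent; innerText is the fallback there.
    return el.textContent || el.innerText || "";
}
```

Inside the aGet callback this could be called as textOfNthTag(t, "test", "p", 0), where "test" is the id from the original question and "p" is only an example tag name.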
    Last edited by rnd me; 09-25-2012 at 08:22 PM.
    my site (updated 13/9/26)
    BROWSER STATS [% share] (2014/5/28) IE7:0.1, IE8:5.3, IE11:8.4, IE9:3.2, IE10:3.2, FF:18.2, CH:46, SF:7.9, NON-MOUSE:32%

  • #6
    Regular Coder
    Join Date
    Apr 2012
    Location
    St. Louis, MO
    Posts
    985
    Thanks
    7
    Thanked 101 Times in 101 Posts
    Quote (originally posted by rnd me):
    for example, you can find a div with a certain ID pretty easily, but if it nests 5 other divs, where the heck do you stop?
    OP isn't screen scraping another site, it's his/her page. Not sure why the OP wants it that way, but.. (shrug)
    ^_^


  • #7
    Senior Coder rnd me
    Join Date
    Jun 2007
    Location
    Urbana
    Posts
    4,294
    Thanks
    10
    Thanked 583 Times in 564 Posts
    Quote (originally posted by WolfShade):
    OP isn't screen scraping another site, it's his/her page. Not sure why the OP wants it that way, but.. (shrug)
    true, but OP may not control every page on the site, and i like to over-explain sometimes to make up for my one-phrase answers...

