  • #1
    New to the CF scene
    Join Date
    Mar 2008
    Posts
    7
    Thanks
    0
    Thanked 0 Times in 0 Posts

    scraping, autologin using cURL and cookies

    Hi,

    I'm trying to write a script that scrapes a site and logs into it automatically. I've managed to scrape and log into the site using cURL, but I'm facing a couple of problems:

    - All the hyperlinks in the scraped source break. For example, if I scrape www.sourcesite.com from www.mysite.com, all the relative links from www.sourcesite.com start pointing to www.mysite.com, so www.sourcesite.com/page1.html becomes www.mysite.com/page1.html. How do I fix this?

    - Also, a direct login to www.sourcesite.com sets a cookie on the user's machine. To handle that during the autologin, my script currently has the following lines, which dump the cookie into cookie.txt:
    curl_setopt($login, CURLOPT_COOKIEJAR, "cookie.txt");
    curl_setopt($login, CURLOPT_COOKIEFILE, "cookie.txt");

    This works fine, but it is not what I want. I want the cookie to be set on the user's machine under www.sourcesite.com's name. How do I implement that?

    Any help would be greatly appreciated.

    Thanks!
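
    On the first problem (relative links resolving against www.mysite.com), one common approach is to inject a <base href> tag into the scraped HTML so the browser resolves relative links against the source site. The helper below is only a sketch under assumptions: the name add_base_href is hypothetical, the base URL is the example host from this post, and the regex handling is deliberately simple rather than a full HTML parser.

```php
<?php
// Hypothetical helper: add <base href="..."> to scraped HTML so that
// relative links resolve against the source site instead of the wrapper site.
function add_base_href(string $html, string $baseUrl): string
{
    // Leave the document alone if it already declares a base URL.
    if (preg_match('/<base\b/i', $html)) {
        return $html;
    }

    $tag = '<base href="' . htmlspecialchars($baseUrl, ENT_QUOTES, 'UTF-8') . '">';

    // Insert the tag right after the first opening <head ...> tag, if any.
    $result = preg_replace_callback(
        '/<head\b[^>]*>/i',
        function (array $m) use ($tag): string {
            return $m[0] . $tag;
        },
        $html,
        1,
        $count
    );

    // Assumption: when no <head> exists, prepending the tag is good enough.
    return $count > 0 ? $result : $tag . $html;
}
```

    Usage would look like echo add_base_href($scrapedHtml, 'http://www.sourcesite.com/'); where $scrapedHtml is whatever curl_exec() returned. This only affects relative URLs; absolute links are left untouched.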

  • #2
    Regular Coder Stooshie's Avatar
    Join Date
    Mar 2008
    Location
    Dundee, Scotland
    Posts
    378
    Thanks
    9
    Thanked 39 Times in 39 Posts
    I wouldn't advise doing this, as users will become suspicious of your reasons.

    It means you could manage, if so inclined, to obtain their username and password.

    Is there a reason you are not linking directly to the site's login page?
    Regards, Stooshie

  • #3
    Super Moderator Inigoesdr's Avatar
    Join Date
    Mar 2007
    Location
    Florida, USA
    Posts
    3,647
    Thanks
    2
    Thanked 406 Times in 398 Posts
    It's not possible for your server to set a cookie for a site other than your own; browsers block this for security reasons.
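
    To illustrate the restriction: a browser discards a cookie whose Domain attribute does not match the host that sent the response. The function below is a simplified sketch of that domain-match check, loosely following RFC 6265 section 5.1.3. The name cookie_domain_matches is hypothetical, and real browsers apply further rules (such as public suffix checks) that are left out here.

```php
<?php
// Simplified sketch of the cookie domain-match rule (RFC 6265, section 5.1.3).
// Returns true if a response from $requestHost may set a cookie for $cookieDomain.
function cookie_domain_matches(string $requestHost, string $cookieDomain): bool
{
    $host = strtolower($requestHost);
    // A leading dot in the Domain attribute is ignored.
    $domain = strtolower(ltrim($cookieDomain, '.'));

    if ($domain === '') {
        return false;
    }
    if ($host === $domain) {
        return true;
    }
    // IP addresses only match when they are identical.
    if (filter_var($host, FILTER_VALIDATE_IP) !== false) {
        return false;
    }
    // Otherwise the host must end with "." followed by the cookie domain.
    $suffix = '.' . $domain;
    return strlen($host) > strlen($suffix)
        && substr($host, -strlen($suffix)) === $suffix;
}
```

    With the hosts from this thread, cookie_domain_matches('www.mysite.com', 'www.sourcesite.com') is false, which is why a cookie sent from www.mysite.com cannot end up stored under www.sourcesite.com.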

  • #4
    New to the CF scene
    Join Date
    Mar 2008
    Posts
    7
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Yeah, there is a reason. I have to create this website for an internal network. The login form has fields that are irrelevant to the users and constant for a given set of users. All they need to be concerned with is the username and password. So I need to create a sort of wrapper page that logs into the other page while concealing the unnecessary info.

  • #5
    Super Moderator Inigoesdr's Avatar
    Join Date
    Mar 2007
    Location
    Florida, USA
    Posts
    3,647
    Thanks
    2
    Thanked 406 Times in 398 Posts
    The limit is enforced by the browser; it's not just some ethical decision. The only way you might be able to do it is to write out some JavaScript that builds and submits a form in a frame/iframe, but that would expose their password in the page source.
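
    A rough sketch of that form-in-an-iframe idea, with the form written out by PHP and submitted by one line of JavaScript. Everything named here is an assumption for illustration: the function render_autologin_form, the frame name login_frame, and the field names and login URL in the usage note. As said above, every value, including the password, is visible in the page source.

```php
<?php
// Hypothetical helper: render an iframe plus a hidden form that posts into it,
// followed by a script that submits the form as soon as the page loads it.
function render_autologin_form(string $actionUrl, array $fields, string $target = 'login_frame'): string
{
    $inputs = '';
    foreach ($fields as $name => $value) {
        // Escape names and values so they cannot break out of the attributes.
        $inputs .= '<input type="hidden" name="'
            . htmlspecialchars((string) $name, ENT_QUOTES, 'UTF-8')
            . '" value="'
            . htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8')
            . '">' . "\n";
    }

    $safeTarget = htmlspecialchars($target, ENT_QUOTES, 'UTF-8');
    $safeAction = htmlspecialchars($actionUrl, ENT_QUOTES, 'UTF-8');

    return '<iframe name="' . $safeTarget . '"></iframe>' . "\n"
        . '<form id="autologin" method="post" action="' . $safeAction
        . '" target="' . $safeTarget . '">' . "\n"
        . $inputs
        . '</form>' . "\n"
        . '<script>document.getElementById("autologin").submit();</script>' . "\n";
}
```

    For example, echo render_autologin_form('http://www.sourcesite.com/login', ['username' => $user, 'password' => $pass, 'constant_field' => 'fixed value']); would post the constant fields along with the credentials. Because the browser itself submits the form to www.sourcesite.com, any cookie in the response is set by that site directly.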
    Last edited by Inigoesdr; 03-24-2008 at 11:58 PM.

