


complete
11-15-2012, 04:57 AM
How does Yandex do its trick?

I want to programmatically get the contents of a Yandex.com search result.

The problem is that the search page URL does not change when you do a search on yandex.com and advance to see more pages. It must be done somehow with JavaScript. Any ideas?

devnull69
11-15-2012, 09:07 AM
This is most probably done using a technique called Ajax (Asynchronous JavaScript and XML).

In that case, the values entered into fields (or any other information) are still sent to the server, but without refreshing the page. So the URL does not (have to) change and most of the elements of the page don't need to be re-rendered.

General structure of an Ajax GET request:


var xmlhttp = new XMLHttpRequest();
var parameter = "myparameter=myvalue";
xmlhttp.open('GET', 'newinformation.php?' + parameter, true);
xmlhttp.onreadystatechange = function() {
    // will be called on any change of the readyState of the request
    // the readyState will be a number between 0 and 4 where 4 means "finished"
    // the status is the HTTP status returned from the server
    if (xmlhttp.readyState == 4) {
        if (xmlhttp.status == 200) {
            // request is finished and HTTP OK
            // responseText or responseXML will contain the server response
            alert(xmlhttp.responseText);
        }
    }
};
xmlhttp.send();

On the server side (for example PHP) you can read the parameter(s) as usual from $_GET[] (or from $_POST[] if you send a POST request instead). Any output from PHP (using print or echo or die) will end up in xmlhttp.responseText, and in xmlhttp.responseXML when the server sends it as XML.
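
The example above builds the parameter string by hand, which only works as long as names and values contain no spaces, '&' or '='. A minimal sketch of a helper that encodes them first; buildQuery is a hypothetical name used for illustration, not part of any library:

```javascript
// Hypothetical helper: turn {name: value} pairs into an encoded query string.
function buildQuery(params) {
    var pairs = [];
    for (var name in params) {
        // skip anything inherited from the prototype chain
        if (Object.prototype.hasOwnProperty.call(params, name)) {
            pairs.push(encodeURIComponent(name) + '=' + encodeURIComponent(params[name]));
        }
    }
    return pairs.join('&');
}

// Possible use with the request above:
// xmlhttp.open('GET', 'newinformation.php?' + buildQuery({ myparameter: 'my value' }), true);
```

PHP decodes the values again before they show up in $_GET[], so the server side does not need to change.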

rnd me
11-17-2012, 12:08 AM
The URL was changing for me.
It looked pretty quick, maybe they are using pushState() to set the history.

It's JSONP, not Ajax, that they use, so you should be able to rip it off yourself.

example:

<script src='http://www.yandex.com/yandsearch?callback=console.log&yu=7901815871353110738&text=obama&lr=102943&ajax=%7B%22b-serp%22%3A%7B%7D%2C%22b-search__input%22%3A%7B%7D%2C%22b-filters%22%3A%7B%7D%2C%22b-serp2-list%22%3A%7B%7D%2C%22b-more%22%3A%7B%7D%2C%22b-feet%22%3A%7B%7D%7D&_=1353110804690'></script>
adjust the red to customize
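
With JSONP the browser loads the URL as a script, and the server wraps its data in a call to the function named in the callback parameter. The ajax parameter in the URL above is just URL-encoded JSON such as {"b-serp":{}}. A rough sketch of building such a URL and loading it; buildJsonpUrl and loadJsonp are hypothetical names, the parameter names are copied from the URL above, and whether the server still answers this way is an assumption:

```javascript
// Hypothetical helper: build a JSONP URL of the shape shown above.
// callback, text and ajax are the parameter names taken from that URL.
function buildJsonpUrl(base, callbackName, query, blocks) {
    var parts = [
        'callback=' + encodeURIComponent(callbackName),
        'text=' + encodeURIComponent(query),
        // the ajax parameter is a JSON object, URL-encoded
        'ajax=' + encodeURIComponent(JSON.stringify(blocks))
    ];
    return base + '?' + parts.join('&');
}

// Browser only: inject a <script> tag. When it loads, the response
// calls the global function whose name was passed as callbackName.
function loadJsonp(url) {
    var script = document.createElement('script');
    script.src = url;
    document.head.appendChild(script);
}

// Possible use in a page:
// window.handleResults = function (data) { console.log(data); };
// loadJsonp(buildJsonpUrl('http://www.yandex.com/yandsearch', 'handleResults', 'obama', { 'b-serp': {} }));
```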

complete
11-22-2012, 11:57 AM
rnd me wrote: "the url was changing for me."

Not from the first search results page to the second.

I am using IE. Here are the steps.

1. Open IE and go to www.yandex.com
2. Type some text to search for.
3. The web site changes, and yes, the URL changes one time to show the results page.
4. But this is the only change. If you scroll down to the bottom of the page and click to advance to another page of results, the URL never changes but the results content expands.
5. Now click to show more results a few times, until you have about one hundred search results. If you then view the HTML source in IE, save that source as an HTML page and load it back into the browser, you will only see the first page of search results, and the button to see more content is not displayed.

So this makes it difficult to use these search results programmatically unless you know this trick.




I just learned something.

Putting '&p=' at the end of the URL actually works.

http://www.yandex.com/yandsearch?text=javascript
http://www.yandex.com/yandsearch?text=javascript&p=2
http://www.yandex.com/yandsearch?text=javascript&p=3
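
Going by the three URLs above, a result page can be requested directly by appending p to the search URL. A minimal sketch that builds such URLs; buildSearchUrl is a hypothetical name, and which result page a given p value maps to is not verified here:

```javascript
// Hypothetical helper: build a search URL of the shape listed above.
// Leave p out to get the plain search URL without a page parameter.
function buildSearchUrl(query, p) {
    var url = 'http://www.yandex.com/yandsearch?text=' + encodeURIComponent(query);
    if (p !== undefined) {
        url += '&p=' + encodeURIComponent(p);
    }
    return url;
}

// Collect the three URLs from the post.
var urls = [buildSearchUrl('javascript')];
for (var p = 2; p <= 3; p++) {
    urls.push(buildSearchUrl('javascript', p));
}
```

Each of these URLs can then be fetched like any other page, so the "more results" button is not needed.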

Now I would like to figure out how another search engine does this. I am talking about www.duckduckgo.com, which is also a very popular search engine. It too does not show a different URL in the browser's address bar as I advance through more content.

And the display for
http://duckduckgo.com/?q=javascript&p=100
is no different from
http://duckduckgo.com/?q=javascript

When I tried
http://duckduckgo.com/?q=javascript&page=1
it threw a 403 Forbidden error.


