In What Way Is Python Suitable For Programming A Web Search Engine?

08-02-2011, 11:09 AM
My new project is to build a web search engine with a web spider, and I'm thinking of using three languages: Python, Java and C++. I'm somewhat confused about which language is best suited to each component: the web crawler, content indexer, ranking algorithm and search mechanism.

I fully agree that some programming languages deliver optimal performance for certain tasks and lag behind in other areas, so I want to make the right choices. A friend of mine suggested that I use C++ for features that demand maximum speed and Python for glue code that is not very time-critical. But I'm not yet sure exactly which features will require that kind of speed, so I'd appreciate some guidance.

Now my questions are:

Where should Python come in? Which features should it be used for?
Which language (C++ or Java) is most suitable for developing a web crawler and why?
Which language is best suited for developing a search ranking algorithm - C++ or Java?
Which features of the search engine should C++ be used for?
Which features should Java be used for?
Do these three languages make a good combination when developing a search application?
Which database management system would suit this type of application? Will MySQL be reliable, or is there a higher-level database system that would be more suitable?

Please, enlighten me on the above-mentioned points, so that I'll be more equipped to get down to work. Any positive response and suggestion will be highly appreciated.

10-03-2011, 02:44 AM
I would say that you'd use C++ (or Java, though I prefer C) for the actual searching portion, maybe as a command-line interface: ./search [options] query_terms, or something like it that returns output you can reuse. I assume you're going to make the application usable from the web, so you could use something like PHP to call it.

Keep in mind that passing user input straight to a shell call can be dangerous because you could get injected, e.g. shell_exec("./search query; rm -rf /"), so that's where you'd want to be really careful.
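
For what it's worth, here is a rough Python sketch of one way to avoid that kind of injection: hand the program an argument list instead of a shell string, so no shell ever parses the query. The ./search binary is the hypothetical tool from above, the function names build_search_command and run_search are made up for illustration, and the "--" separator assumes the tool follows the common "end of options" convention.

```python
import subprocess


def build_search_command(query, binary="./search"):
    """Return an argv list for the hypothetical ./search tool."""
    # Each whitespace-separated term becomes its own argument. No shell
    # parses the string, so ';' and '|' are just ordinary characters.
    # Assumption: the tool treats "--" as "end of options", which keeps a
    # term such as "-rf" from being read as a flag.
    return [binary, "--"] + query.split()


def run_search(query, binary="./search"):
    """Run the search tool and return whatever it prints to stdout."""
    # Passing a list and leaving shell=False (the default) makes
    # subprocess execute the binary directly rather than through /bin/sh.
    result = subprocess.run(
        build_search_command(query, binary),
        capture_output=True,
        text=True,
        timeout=10,
    )
    return result.stdout
```

With this, the malicious input from the example is passed along as plain search terms ("query;", "rm", "-rf", "/") rather than being executed as a second command. It doesn't replace validating the query itself, but it removes the shell from the picture.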

I love Python, but I feel like you don't necessarily need it in this case. Of course, there are many possible ways to approach this kind of thing.


10-03-2011, 06:12 AM

Thanks for your input.

I'm not planning to make direct calls, and I prefer Python to PHP.