Searching through a text document ("search engine")

I just wanted to know: if I have a text document that contains HTML code,
how would I search through the code and retrieve all the data related to the search, i.e. by creating a search engine?

The problem I face is that if the tag names in the HTML text document change, the search won't work. Therefore, using the DOM to retrieve the text would fail.

Is there anything other than the DOM that I can use to parse the text document, so that the search engine would still work even if the code changed?

The aim is to search through an HTML "text document" and retrieve the data that the user wants.

Thanks in advance.

Not sure I understand the question...
Do you want to search HTML-coded files or plain text files?
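
If the files are HTML, one approach that does not depend on particular tag names is to strip the markup and search only the text. Here is a rough sketch in Python using the standard-library html.parser module. The names TextExtractor and search_html are made up for illustration, and the simple case-insensitive substring match is just an assumption about what "related to the search" means:

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text content without caring which tag names are used."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not script or stylesheet source.
        text = data.strip()
        if text and not self._skip:
            self.chunks.append(text)


def search_html(html_text, query):
    """Return every text chunk that contains the query (case-insensitive)."""
    parser = TextExtractor()
    parser.feed(html_text)
    parser.close()
    needle = query.lower()
    return [chunk for chunk in parser.chunks if needle in chunk.lower()]
```

Calling search_html(page_source, "apple") would return the pieces of text that mention "apple", whether they sit in a div, a p, or a tag that gets renamed later. For anything beyond simple matching you would probably want a proper index, but this shows the tag-independent part.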