  1. #1
    Senior Coder Spudhead
    Join Date
    Jun 2002
    Location
    London, UK
    Posts
    1,856
    Thanks
    8
    Thanked 110 Times in 109 Posts

    RegExp on similar search strings - finds most, fails one.. why?

    I'm searching the source HTML of pages for particular values. I get the value by first finding the TD containing the value's label, then pattern-matching the code immediately after it:

    Code:
    dim sRegExp_TDInteger, sRegExp_TDWords, sRegExp_TDDecimal
    	dim sRegExp_UniqueVisitors, sRegExp_VisitorSessions, sRegExp_AvgVisitorsPerHour, sRegExp_MostPopSearchTerms
    	sRegExp_TDInteger			= ">\d+<"
    	sRegExp_TDWords				= ">\w+<"
    	sRegExp_TDDecimal			= ">\d+\.\d+<"
    	sRegExp_UniqueVisitors		= "<td class=""summ1"">Unique Visitors</td>"
    	sRegExp_VisitorSessions		= "<td class=""summ1"">Visitor Sessions</td>"
    	sRegExp_AvgVisitorsPerHour	= "<td class=""summ1"">Average Visitors Per Hour</td>"
    	sRegExp_MostPopSearchTerms	= "<td class=""summ1"">Most Popular Search Term(s)</td>"
    Code:
    	function getTableCellValue(sTitleCellToMatch, sValueCellFormatToMatch, sSourceHTML)
    		'assume we're dealing with two-column table
    		'find the cell containing the title/label text that we want
    		'get a 100-char string starting where the title cell starts
    		'find, in that string, a value between HTML tags that matches the required format
    		Dim rv : rv = ""
    		Set RE = New RegExp
    			RE.Pattern		= sTitleCellToMatch
    			RE.IgnoreCase	= True
    			RE.Global		= True
    		Set oLabelMatches = RE.Execute(sSourceHTML)
    		If oLabelMatches.Count > 0 Then
    			set oLabelMatch = oLabelMatches(0)
    			response.write(oLabelMatch.Value & "<br/>")
    			'FirstIndex is zero-based but Mid is one-based; CLng avoids an overflow past position 32,767
    			intLabelStartPos = CLng(oLabelMatch.FirstIndex) + 1
    			sChunk = mid(sSourceHTML,intLabelStartPos,100)
    			RE.Pattern = sValueCellFormatToMatch
    			Set oValueMatches = RE.Execute(sChunk)
    			If oValueMatches.Count > 0 then
    				set oValueMatch = oValueMatches(0)
    				rv = replace(replace(oValueMatch.Value, "<", ""), ">", "")
    			End If
    		End If
    		set RE				= nothing
    		set oValueMatch		= nothing
    		set oValueMatches	= nothing
    		set oLabelMatch		= nothing
    		set oLabelMatches	= nothing
    		getTableCellValue	= rv
    	end function
    Code:
    strUniqueVisitors		= getTableCellValue(sRegExp_UniqueVisitors, sRegExp_TDInteger, sSummarySource)
    	strVisitorSessions		= getTableCellValue(sRegExp_VisitorSessions, sRegExp_TDInteger, sSummarySource)
    	strAvgVisitorsPerHour	= getTableCellValue(sRegExp_AvgVisitorsPerHour, sRegExp_TDDecimal, sSummarySource)
    	strMostPopSearchTerms	= getTableCellValue(sRegExp_MostPopSearchTerms, sRegExp_TDWords, sSummarySource)

    I know my function's a little bit... Mickey Mouse... but it works on all of the label-finding patterns above except one: the pattern that tries to find the string "<td class=""summ1"">Most Popular Search Term(s)</td>". It can't find it, and I don't know why. The string is definitely in the source code; I copied and pasted it directly (with the double-quoting added, obviously).

    Any ideas?

  • #2
    Regular Coder
    Join Date
    Sep 2004
    Posts
    152
    Thanks
    0
    Thanked 0 Times in 0 Posts
    My knowledge of REs is very "noobish", but at first glance I would say that you need to escape the ()s.
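
    To spell that out with a minimal sketch: in a RegExp pattern, unescaped parentheses form a group instead of matching literal bracket characters, so Term(s) as a pattern matches the text "Terms" and never the literal "Term(s)" in the page. The snippet below reuses the pattern from the first post. sSample is a made-up one-line stand-in for the real page source, and the script assumes Windows Script Host (WScript.Echo); under ASP you would swap in Response.Write.

```vbscript
Option Explicit

Dim RE, sSample

' Hypothetical sample line standing in for the real page source
sSample = "<td class=""summ1"">Most Popular Search Term(s)</td><td>widgets</td>"

Set RE = New RegExp
RE.IgnoreCase = True

' Unescaped: (s) is a capturing group, so the pattern looks for "Terms".
' Test returns False against the sample.
RE.Pattern = "<td class=""summ1"">Most Popular Search Term(s)</td>"
WScript.Echo "Unescaped pattern matches: " & RE.Test(sSample)

' Escaped: \( and \) match literal bracket characters.
' Test returns True against the sample.
RE.Pattern = "<td class=""summ1"">Most Popular Search Term\(s\)</td>"
WScript.Echo "Escaped pattern matches: " & RE.Test(sSample)

Set RE = Nothing
```

    The same applies to any other character that has a special meaning in a pattern, such as . ? * + [ ] { } | ^ $ and the backslash itself.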

  • #3
    Senior Coder Spudhead
    Join Date
    Jun 2002
    Location
    London, UK
    Posts
    1,856
    Thanks
    8
    Thanked 110 Times in 109 Posts
    Quote Originally Posted by neocool00
    My knowledge of REs is very "noobish", but at first glance I would say that you need to escape the ()s.

    And you're right. Not as noobish as mine, then.
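
    Since the label cells are literal text and not real patterns, one way to avoid hand-escaping each one might be a small helper that backslash-escapes every metacharacter before the label goes into RE.Pattern. This is only a sketch: EscapeRegExp is a hypothetical name (VBScript has no built-in equivalent that I know of), and the example again assumes Windows Script Host for the output line.

```vbscript
Option Explicit

' Hypothetical helper: put a backslash in front of every RegExp
' metacharacter in sLiteral so the result matches the text literally.
Function EscapeRegExp(sLiteral)
    Dim oRE
    Set oRE = New RegExp
    oRE.Global = True
    ' Capture any one metacharacter: \ ^ $ . | ? * + ( ) [ ] { }
    oRE.Pattern = "([\\\^\$\.\|\?\*\+\(\)\[\]\{\}])"
    ' $1 is the captured character; the leading backslash is literal
    EscapeRegExp = oRE.Replace(sLiteral, "\$1")
    Set oRE = Nothing
End Function

Dim sLabelPattern
sLabelPattern = "<td class=""summ1"">" & _
                EscapeRegExp("Most Popular Search Term(s)") & "</td>"

' Prints: <td class="summ1">Most Popular Search Term\(s\)</td>
WScript.Echo sLabelPattern
```

    Only the label text is passed through the helper here. The value patterns such as ">\d+<" are meant to be real patterns, so they should stay unescaped.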

  • #4
    Regular Coder
    Join Date
    Sep 2004
    Posts
    152
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Well, I've been trying to educate myself on REs because they are quite powerful. I mainly use them in client-side JavaScript for field validation.

