  1. #1
    Regular Coder
    Join Date
    Nov 2002
    Posts
    567
    Thanks
    2
    Thanked 4 Times in 4 Posts

    Understanding preg_match

    I am looking at a file written by others. This function is at the top and I would like to see if I am understanding it right. preg_match is the most confusing part of this.

    PHP Code:
    foreach ($HTTP_GET_VARS as $key => $value) {
      if (preg_match("/^\<script/", $value)) {
        $HTTP_GET_VARS[$key] = NULL;
        $$key = NULL;
      }
    }

    In previewing this the preg_match does not show the same as it is written. it is written like this "preg_match("/^\<script/", $value)".

    I understand foreach, but since this file is called by another and is not passed anything then $HTTP_GET_VARS should be empty. I am assuming that $key and $value are variables but there is no declaration for the variables.

    For the preg_match string I guess the first character after the " is an escape character? So what is the ^ for? Then the next character is an escape for the open < for an end script tag?

    Finally if there is a value in the $HTTP_GET_VARS array then set it NULL. But why two $$?

    Can someone shed some light on this?
    Scott Stewart
    Always happy to learn from pros.

  • #2
    Senior Coder
    Join Date
    Jun 2002
    Location
    frankfurt, german banana republic
    Posts
    1,848
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Understanding preg_match

    Originally posted by ScottInTexas
    In previewing this the preg_match does not show the same as it is written. it is written like this "preg_match("/^\<script/", $value)".
    Yeah, this board tends to eat backslashes in highlighted PHP-Code. Can be quite a PITA if you don't think about it while posting.

    I understand foreach, but since this file is called by another and is not passed anything then $HTTP_GET_VARS should be empty.
    Assuming the file is included/required by another one, it would still be able to access the GET parameters contained in $HTTP_GET_VARS.

    I am assuming that $key and $value are variables but there is no declaration for the variables.
    The foreach loop declares these variables anew on each iteration. Are you still sure you understood foreach completely?

    For the preg_match string I guess the first character after the " is an escape character? So what is the ^ for? Then the next character is an escape for the open < for an end script tag?
    preg_match uses a so-called regular expression to match a pattern against a string. Regular expressions are a small language of their own that lets you define very precisely what a match should look like. In your example, the pattern translates into plain English as "a match is found if the string starts with <script". The ^ anchors the match to the start of the string. The backslash is an escape character, though it isn't strictly needed here because < has no special meaning in a regular expression. The slashes are called delimiters; they mark the start and the end of the regular expression. A delimiter can be any character that is not alphanumeric, a backslash, or whitespace.

    If you want to know about the RegExp syntax, check out
    http://www.php.net/manual/en/pcre.pattern.syntax.php
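
    To make the "starts with <script" reading concrete, here is a minimal sketch. The helper name starts_with_script_tag and the sample strings are made up for illustration and are not part of the original file.

```php
<?php
// Hypothetical helper, not part of the original script.
function starts_with_script_tag($value) {
    // ^ anchors the match to the start of the string.
    // preg_match() returns 1 when the pattern matches and 0 when it does not.
    return preg_match("/^<script/", $value) === 1;
}

var_dump(starts_with_script_tag("<script>alert(1)</script>")); // bool(true): starts with <script
var_dump(starts_with_script_tag("hello <script>"));            // bool(false): <script is not at the start
var_dump(starts_with_script_tag("<SCRIPT>alert(1)</SCRIPT>")); // bool(false): the pattern is case-sensitive

// Another delimiter (#) works the same way, and the i modifier ignores case.
var_dump(preg_match("#^<script#i", "<SCRIPT>alert(1)</SCRIPT>") === 1); // bool(true)
```

    Note that the sketch drops the backslash before <, since it makes no difference to what the pattern matches.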

    Finally if there is a value in the $HTTP_GET_VARS array then set it NULL. But why two $$?
    Not quite, only if the value matches the RegExp. Two $s indicate that a variable variable is used, i.e. if $key were "test", then $$key would map to $test. The content of $key is the name of the variable that $$key refers to. Just substitute the inner $key with its content.
    It's a slightly confusing way to program, but can be useful for some elegant coding solutions.
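
    A small sketch of the variable variable idea, using made-up names ($test and its starting value are assumptions for the example, not taken from the file in question):

```php
<?php
// Hypothetical names and values, for illustration only.
$test = "original";
$key  = "test";

// $$key is read as ${"test"}, which is the variable $test.
$$key = NULL;

var_dump($test); // NULL: $test was changed through $$key
var_dump($key);  // string(4) "test": $key itself is untouched
```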

  • #3
    Regular Coder
    Join Date
    Nov 2002
    Posts
    567
    Thanks
    2
    Thanked 4 Times in 4 Posts
    Thanks for your answer. It does help, but I would like to get a little more clarification.

    My understanding of foreach comes from VB where I use for each, for example, to iterate through the controls on a form or to iterate through forms in a project etc. However, I usually define the variable first, then I use the variable in foreach.

    Code:
    Dim ctrl As Control
    For Each ctrl In ThisForm
      ' Do something
    Next
    I was also aware of the fact that the commands within the if would only be run if the statement tested true but how can it test true if there is never anything in the array?

    [Huh?] (custom tag)
    So in the php example this says; for each variable in the array $HTTP_GET_VARS, identifiable through the variable $key that is equal to or greater than $value -- if $value has the string "<script" in it then set the array value to null and set the variable to Null.
    [/HUH?]

    I suppose this makes sense. I just can't see the purpose. This is the first line of the index.php in the example I am looking at. Index.php is the first document opened in the entire web site (if you don't count .htaccess). So what's the point? In case you just said to yourself "what's in .htaccess?" the answer is;

    DirectoryIndex index.php

    Now, when next I hear from you or someone else I will probably be able to use another custom tag [DUH?] which is reserved for when I finally get it.
    Maybe we can get the board to create these where there is some color or something to reflect the meaning.
    Scott Stewart
    Always happy to learn from pros.

  • #4
    Regular Coder
    Join Date
    Jun 2002
    Location
    UK
    Posts
    577
    Thanks
    0
    Thanked 0 Times in 0 Posts
    So in the php example this says; for each variable in the array $HTTP_GET_VARS, identifiable through the variable $key that is equal to or greater than $value -- if $value has the string "<script" in it then set the array value to null and set the variable to Null.

    stop that way of thinking now - way tooooo confusing.

    foreach ($HTTP_GET_VARS as $key=>$value){

    so, for every GET variable we set the variable $key equal to the variable name and the variable $value equal to the value - so if we had
    .php?varname=varval
    we would have
    $key = 'varname';
    $value = 'varval';
    for every var=value pair in the GET array.

    If any of the values starts with <script, then
    $HTTP_GET_VARS[$key] = NULL;
    sets the corresponding GET array value to null, and
    $$key = NULL;
    sets the plain (non-array) variable of the same name to null ($varname = NULL; in our example)
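
    The whole loop can be tried out without a web server. This is only a sketch: the array $get_vars and its contents are hypothetical stand-ins for $HTTP_GET_VARS, which current PHP versions no longer fill in (they use $_GET instead).

```php
<?php
// Stand-in for the GET array of a request such as
// .php?varname=varval&evil=<script>alert(1)</script>  (hypothetical data).
$get_vars = array(
    "varname" => "varval",
    "evil"    => "<script>alert(1)</script>",
);

foreach ($get_vars as $key => $value) {
    // Only values that START with <script are touched.
    if (preg_match("/^<script/", $value)) {
        $get_vars[$key] = NULL; // blank the entry in the array
        $$key = NULL;           // blank the plain variable of the same name ($evil here)
    }
}

var_dump($get_vars); // "varname" keeps "varval", "evil" is now NULL
```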
    Ökii - formerly pootergeist
    teckis - take your time and it'll save you time.

  • #5
    Senior Coder
    Join Date
    Jun 2002
    Location
    frankfurt, german banana republic
    Posts
    1,848
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Originally posted by ScottInTexas
    However, I usually define the variable first, then I use the variable in foreach.
    You don't need to do that with $HTTP_GET_VARS. It's a predefined variable, comparable to an environment variable, and automagically filled by PHP with all GET parameters that happen to reach the script.

    I was also aware of the fact that the commands within the if would only be run if the statement tested true but how can it test true if there is never anything in the array?
    If no GET parameters are attached to the URL, the array $HTTP_GET_VARS contains nothing and hence the foreach loop does not run. It never starts, but that's not problematic for the purpose of this script.

    [Huh?] (custom tag)
    So in the php example this says; for each variable in the array $HTTP_GET_VARS, identifiable through the variable $key that is equal to or greater than $value -- if $value has the string "<script" in it then set the array value to null and set the variable to Null.
    [/HUH?]

    I suppose this makes sense. I just can't see the purpose. This is the first line of the index.php in the example I am looking at.
    Nice idea with the custom tags...

    From what I see, the purpose of this code snippet is to sanitize incoming variables. Because the value of a GET parameter can contain malicious code, it's good to check it before you use it later on that page. If, for instance, this page printed out all GET variables, someone could attack it and inject JavaScript simply by putting test=<script>while(true) alert('hacked')</script> in the URL.
    It depends on what follows in that page, so I can't say for sure. But it's almost certainly a security feature, and good practice dictates being paranoid about user-submitted values...
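
    As a rough illustration of why paranoia pays off, the sketch below shows a value that slips past a check anchored with ^, next to output escaping with htmlspecialchars(). Escaping on output is a different technique from the one in the original file, and the helper name escape_for_output and the sample values are assumptions made up for this example.

```php
<?php
// Hypothetical helper: escape a value right before printing it into HTML.
function escape_for_output($value) {
    // htmlspecialchars() turns <, >, & and quotes into HTML entities,
    // so the browser displays the text instead of running it.
    return htmlspecialchars($value, ENT_QUOTES);
}

// Hypothetical GET values; the second has a leading space.
$values = array(
    "<script>alert('hacked')</script>",
    " <script>alert('hacked')</script>",
);

foreach ($values as $value) {
    // The anchored check only fires when <script is the very first thing in the value.
    $caught = preg_match("/^<script/", $value) === 1;
    echo ($caught ? "caught" : "missed"), ": ", escape_for_output($value), "\n";
}
```

    The first value is caught by the check and the second is missed, but both are printed as harmless text once escaped.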

    hth

  • #6
    Regular Coder
    Join Date
    Nov 2002
    Posts
    567
    Thanks
    2
    Thanked 4 Times in 4 Posts
    Thanks to both of you for the answers.

    [DUH]
    I GOT IT!
    [/DUH]

    The last (regarding security) makes a heck of a lot of sense. And yes, I am paranoid ever since my machine was attacked and I went through days of living hell to fix. I had let my guard down and a virus was allowed in with my email. Something called GWSPACES or some such thing. The first thing it did was attack my virus scan.

    Anyway, thanks again.
    Scott Stewart
    Always happy to learn from pros.

