Wednesday, August 13, 2014

PHP Escape Functions

For those of you who don't know, Magic Quotes is deprecated because a one-size-fits-all approach to escaping/quoting is wrongheaded and downright dangerous.

There are probably plenty of better ways to do this in a PHP framework, but if you're hand-coding a small project you may benefit from these utility functions.

The basic approach is to keep all data in its raw format and escape it on demand when it is embedded in other data. So if you want to put a value into an sql statement, you escape it for sql first and ignore the fact that you may embed it in html later. If you want to put a value (say that you read from your database) into the innerHTML of a div, you html escape it first. If you want to put a value into an attribute of an html tag you html attribute escape it first.
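
To make the escape-on-demand idea concrete, here is a minimal javascript sketch. The names escapeHtml, escapeAttr and rawName are made up for this illustration; they only mirror the idea behind the hFix and tFix helpers and are not the helpers themselves.

```javascript
// Hypothetical helpers for illustration only (not the post's hFix/tFix).
function escapeHtml(raw) {
  return String(raw)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

function escapeAttr(raw) {
  // Attribute values also need the double quote escaped.
  return escapeHtml(raw).replace(/"/g, "&quot;");
}

// The raw value is stored and passed around unescaped.
var rawName = 'Tom & "Jerry" <b>';

// Escape only at the point of embedding, once per target context.
var asText = "<div>" + escapeHtml(rawName) + "</div>";
var asAttr = '<input type="text" value="' + escapeAttr(rawName) + '"/>';
var asUrl  = "page.php?name=" + encodeURIComponent(rawName);
```

The same raw value ends up with three different encodings, and none of them is ever written back over rawName.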

I've included a few non-escape-related php functions that you might also find useful. I've also included some javascript escapes, but you're probably better off using a framework or always writing to object members rather than writing raw innerHTML.

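As far as I know the existing php urlencode and javascript encodeURIComponent are both safe and compatible (urlencode writes a space as + where encodeURIComponent writes %20, but php decodes both the same way in a query string). Here are examples of their use: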

<?
  $encoded = 'http://site.com/page.php?x='.urlencode($x).'&y='.urlencode($y);
?>
<script>
  var encoded = 'http://site.com/page.php?x='+encodeURIComponent(x)+'&y='+encodeURIComponent(y);
</script>
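
As a rough check of why the encoding matters, the sketch below round-trips two values that would corrupt the query string if embedded raw. The sample values and the hand-rolled decode loop are assumptions for illustration; on the server php does the decoding for you when it fills $_GET.

```javascript
// Values containing the query string's own separators.
var x = "a&b=c";
var y = "50% off?";

var query = "x=" + encodeURIComponent(x) + "&y=" + encodeURIComponent(y);

// Decode by splitting on the separators first, then decoding each piece.
var decoded = {};
query.split("&").forEach(function (pair) {
  var parts = pair.split("=");
  decoded[decodeURIComponent(parts[0])] = decodeURIComponent(parts[1]);
});
```

Because the & and = inside the values were encoded, splitting on the separators finds exactly two pairs and both values come back unchanged.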

php.ini

; set short tags to on, that's not necessary, just my style
; it lets you use <? instead of <?php
; don't be confused by the first hit during find, the actual setting is lower in the file
short_open_tag = On

; tell error reporting to ignore E_NOTICE and E_DEPRECATED so that your logs don't fill up
error_reporting = E_ALL & ~E_NOTICE & ~E_DEPRECATED

; append this to the [mbstring] block
; the existing mbstring should all be just commented out examples
; these changes make utf8 the default
mbstring.language = Neutral
mbstring.internal_encoding = UTF-8
mbstring.encoding_translation = Off
mbstring.http_input = auto
mbstring.http_output = UTF-8
mbstring.detect_order = auto
mbstring.substitute_character = none
default_charset = UTF-8

; decrease max post size so you don't waste time on bogus payloads
; this also limits the size of attack packets and reduces the risk of overflow
post_max_size = 1M

index.php

<?
  session_start();
  set_time_limit(30);
  header('Content-type: text/html; charset=UTF-8');
  date_default_timezone_set('America/Toronto');

  include 'inc_code_functions.php';

  $db = new mysqli('localhost','username','password','dbname');
  if ($db->connect_errno)
  {
    echo 'The Database Is Down.';
    exit;
  }
  
  $db->set_charset('utf8');
  if ($db->character_set_name() != 'utf8')
  {
    echo 'The Database Is Misconfigured.';
    exit;
  }

  if (mb_internal_encoding() != "UTF-8")
  {
    echo 'PHP Is Misconfigured.';
    exit;
  }
?>
<!DOCTYPE html>
<html>
<head>
 <meta http-equiv="content-type" content="text/html; charset=utf-8" />
 <title>Test Page</title>
</head>
<body>
  hello world
</body>
</html>

inc_code_functions.php

<?
  ///////////////////////////////////////////////////////////////////////
  // inline database fixers
  
  /**
   * Make safe to include as a string in an sql statement. Note that single quotes are provided.
   * If conversion fails (because db is not accessible), echo failure and exit.
   *
   * $sql = "INSERT INTO table (myInt,myString) VALUES (".iFix($myInt).",".sFix($myString).")";
   * $sql = "UPDATE table SET myString=".sFix($myString)." WHERE myInt=".iFix($myInt);
   */
  function sFix($val)
  {
    global $db;
    $val = trim($val);
    $sval = $db->escape_string($val);
    if (ByteLength($sval) < ByteLength($val))
    {
      echo 'Failed To Escape String.';
      exit;
    }
    return "'".$sval."'";
  }

  /**
   * Make safe to include as an integer in an sql statement.
   * If conversion fails (because input wasn't an integer), echo failure and exit.
   *
   * $sql = "INSERT INTO table (myInt,myString) VALUES (".iFix($myInt).",".sFix($myString).")";
   * $sql = "UPDATE table SET myString=".sFix($myString)." WHERE myInt=".iFix($myInt);
   */
  function iFix($val)
  {
    if (!is_numeric($val))
    {
      echo 'Bad Integer.';
      exit;
    }  
    $ival = intval($val);
    if ($ival != $val)
    {
      echo 'Bad Integer.';
      exit;
    }
    return $ival;
  }

  /**
   * Make safe to include as a string in a double-sided sql LIKE statement.
   * Note that single quotes and %% are provided.
   * If conversion fails (because db is not accessible), echo failure and exit.
   * It is your responsibility to trim if you don't want outer spaces.
   *
   * $sql = "SELECT * FROM table WHERE col LIKE ".lFix($searchTerm);
   */
  function lFix($val)
  {
    global $db;
    $sval = $db->escape_string($val);
    if (ByteLength($sval) < ByteLength($val))
    {
      echo 'Failed To Escape String.';
      exit;
    }
    $find_ary    = array(   "_",   "%" );
    $replace_ary = array( "\\_", "\\%" );
    $sval = str_replace($find_ary, $replace_ary, $sval);    
    
    return "'%".$sval."%'";
  }

  ///////////////////////////////////////////////////////////////////////
  // string replacement

  // note that because ascii bytes are 00-7F and every byte of a multi-byte utf8
  // sequence is 80-FF, it's impossible for an ascii byte to be a sub-byte of a
  // utf8 sequence, therefore it's okay to use single-byte find/replace when
  // modifying only ascii characters

  /**
   * Make safe to include in a javascript string constant.
   * I do some extra escaping in case it ends up in some html without further escaping.
   * var mystr = "hello <?=jFix($val)?> world";
   */
  function jFix($val)
  {
    // backslash must come first so the backslashes added by later replacements aren't doubled
    $find_ary    = array(   "\\",   "'", "&"    , ">"   , "<"   , "\"",     "\r\n",  "\n",  "\r" );
    $replace_ary = array( "\\\\", "\\'", "&amp;", "&gt;", "&lt;", "&quot;",  "\\n", "\\n", "\\n" );

    $val = trim($val);
    return str_replace($find_ary, $replace_ary, $val);
  }

  /**
   * Make safe to include in a double quoted html string constant.
   * <input type="text" value="<?=tFix($val)?>"/>
   *
   * If a default is provided and input is blank then use the default.
   */
  function tFix($val, $default="")
  {
    $find_ary    = array( "&"    , ">"   , "<"   , "\""    , "\r", "\n" );
    $replace_ary = array( "&amp;", "&gt;", "&lt;", "&quot;",  " ",  " " );

    $val = trim($val);
    if ($val == "") { $val = trim($default); }
    return str_replace($find_ary, $replace_ary, $val);
  }

  /**
   * Make safe to include as raw html.
   * <div><?=hFix($val)?></div>
   *
   * If a default is provided and input is blank then use the default.
   */
  function hFix($val, $default="")
  {
    $find_ary    = array( "&"    , ">"   , "<"    );
    $replace_ary = array( "&amp;", "&gt;", "&lt;" );

    $val = trim($val);
    if ($val == "") { $val = trim($default); }
    return str_replace($find_ary, $replace_ary, $val);
  }

  /**
   * Convert CRLF to <br/>
   */
  function toBR($val)
  {
    $find_ary    = array( "\r\n" , "\n"   , "\r"    );
    $replace_ary = array( "<br/>", "<br/>", "<br/>" );

    $val = trim($val);
    return str_replace($find_ary, $replace_ary, $val);
  }

  ///////////////////////////////////////////////////////////////////////
  // string tests

  /**
   * Return the number of bytes in a string.
   */
  function ByteLength($val)
  {
    return strlen($val);
  }

  /**
   * Return true if $haystack starts with $needle
   */
  function StartsWith($haystack,$needle)
  {
    if ( mb_strpos( $haystack, $needle ) === 0 )
    { return true; }
    else
    { return false; }
  }
  
  /**
   * Case insensitive test if $needle is in $haystack
   */
  function InStrI($haystack,$needle)
  {
    if ( mb_strpos(mb_strtolower($haystack),mb_strtolower($needle)) === false )
    { return false; }
    else { return true; }
  }

  ///////////////////////////////////////////////////////////////////////
  // validation

  /**
   * Database char length is not byte length. So if your field is utf8
   * and your size is 255 you can hold 255 large utf8 chars, i.e. 255*3 bytes.
   * This func silently truncates your val down to max length, replacing the 
   * last legal chars with '...' or the postfix you specify.
   */
  function Truncate($str,$len,$postfix='...')
  {
    if ( mb_strlen($str) > $len )
    { return mb_substr($str,0,$len - mb_strlen($postfix)).$postfix; }
    else { return $str; }
  }
  
  /**
   * If $str has more chars than $len then echo error (using $name) and exit.
   */
  function MaxLen($str,$len,$name)
  {
    $actual = mb_strlen($str);
    if ( $actual > $len )
    {
      echo "$name is $actual characters long, but cannot be more than $len characters.";
      exit;
    }
    return $str;
  }
  
  /**
   * If $str is empty echo error using $name and exit.
   */
  function Required($str,$name)
  {
    if ( strlen($str) < 1 )
    {
      echo "$name is required.";
      exit;
    }
  }

  ///////////////////////////////////////////////////////////////////////
  // time

  /**
   * Return a unix time stamp based on the input stamp but with hours
   * etc modified. This is in time local to the server (EST).
   */
  function Stamp($stamp,$hour=NULL,$minute=NULL,$second=NULL,$day=NULL,$month=NULL,$year=NULL)
  {
    if (is_null($hour  )) { $hour   = date('H',$stamp); }
    if (is_null($minute)) { $minute = date('i',$stamp); }
    if (is_null($second)) { $second = date('s',$stamp); }
    if (is_null($day   )) { $day    = date('d',$stamp); }
    if (is_null($month )) { $month  = date('m',$stamp); }
    if (is_null($year  )) { $year   = date('Y',$stamp); }
    
    return mktime($hour, $minute, $second, $month, $day, $year);
  }

  /**
   * Add the specified number of days to the provided timestamp.
   */
  function Stamp_Plus_Days($stamp,$days)
  {
    $hour   = date('H',$stamp);
    $minute = date('i',$stamp);
    $second = date('s',$stamp);
    $day    = date('d',$stamp);
    $month  = date('m',$stamp);
    $year   = date('Y',$stamp);

    return mktime($hour, $minute, $second, $month, $day+$days, $year);
  }

  ///////////////////////////////////////////////////////////////////////
  // misc

  /**
   * Convert a value to an integer. If conversion fails, return zero.
   * If optional $positive is true then negative numbers return zero.
   */
  function SafeInteger($val,$positive=false)
  {
    $val = trim($val);
    if (is_numeric($val))
    {
      $val = intval($val);
      if ($positive && $val < 0) { return 0; }
      else { return $val; }
    }    
    return 0;
  }
  
  /**
   * Generate 16 random bytes, returned as 32 ascii-hex characters. "3e940c727a59e9d64082b1765027c0c2"
   */
  function Nonce()
  {
    $bytes = openssl_random_pseudo_bytes(16);
    return bin2hex($bytes);    
  }

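  /**
   * Return true if $val is one or more ascii digits and nothing else.
   */
  function IsDigits($val)
  {
    // the D modifier stops $ from matching before a trailing newline
    if (preg_match("/^([0-9]+)$/D",$val))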
    { return true; }
    else
    { return false; }
  }
?>

util.js

//////////////////////////////////////////////////////////////////////////////////

var util = {};

/**
 * Make input safe to display in html.
 * http://lawrence.ecorp.net/inet/samples/regexp-intro.php
 */
util.hFix = function(txt)
{
  var safe = ''+txt;
  safe = safe.replace(/&/g,"&amp;");
  safe = safe.replace(/>/g,"&gt;");
  safe = safe.replace(/</g,"&lt;");
  return safe;
};

/**
 * Make input safe to display in a double quoted html attribute.
 * http://lawrence.ecorp.net/inet/samples/regexp-intro.php
 */
util.tFix = function(txt)
{
  var safe = ''+txt;
  safe = safe.replace(/&/g,"&amp;");
  safe = safe.replace(/>/g,"&gt;");
  safe = safe.replace(/</g,"&lt;");
  safe = safe.replace(/"/g,"&quot;");
  safe = safe.replace(/\r/g," ");
  safe = safe.replace(/\n/g," ");
  return safe;
};

/**
 * Make input safe to display in a javascript string constant.
 * http://lawrence.ecorp.net/inet/samples/regexp-intro.php
 */
util.jFix = function(txt)
{
  var safe = ''+txt;
  // backslash must come first so the backslashes added below aren't doubled
  safe = safe.replace(/\\/g,"\\\\");
  safe = safe.replace(/&/g,"&amp;");
  safe = safe.replace(/>/g,"&gt;");
  safe = safe.replace(/</g,"&lt;");
  safe = safe.replace(/"/g,"&quot;");
  safe = safe.replace(/'/g,"\\'");
  safe = safe.replace(/\r/g,"\\r");
  safe = safe.replace(/\n/g,"\\n");
  return safe;
};

/**
 * Remove leading and trailing whitespace.
 * http://lawrence.ecorp.net/inet/samples/regexp-intro.php
 *
 * /   separator
 * ^   find start of string
 * \s  followed by a single white space character
 * +   repeated one or more times
 * |   or
 * \s  a single white space character
 * +   repeated one or more times
 * $   followed by the end of string
 * /   separator
 * g   don't stop at the first match, do them all
 * ''  replace all matches with the empty string
 */
util.trim = function(str)
{
  return str.replace(/^\s+|\s+$/g, '');
};
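
One detail that is easy to get wrong in a string-constant escape like jFix is the order of the replacements. The sketch below is a simplified variant, not the jFix above: escapeJsString is a made-up name, it skips the html entities, and it does nothing about a literal </script> inside an inline script block. It only shows that escaping backslashes first lets the value survive a round trip through a javascript string literal.

```javascript
// Hypothetical helper for illustration only.
// Backslashes go first so that the backslashes added by the later
// replacements are not themselves doubled.
function escapeJsString(raw) {
  return String(raw)
    .replace(/\\/g, "\\\\")
    .replace(/"/g, '\\"')
    .replace(/'/g, "\\'")
    .replace(/\r\n|\r|\n/g, "\\n");
}

var raw = 'C:\\temp\\ "quoted" it\'s';
var literal = '"' + escapeJsString(raw) + '"';

// Evaluating the generated literal gives back the raw value.
var roundTrip = new Function("return " + literal + ";")();
```

If the backslash replacement came last instead, the backslash added in front of each quote would be doubled and the quote would end the literal early.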