Some Best Practices


These are general best practices when operating Web application environments in production Internet-based environments:

  1. Build, or implement, the target infrastructure such that Reverse Proxy servers are used and direct sockets cannot be established with the real web server hosts from the public Internet. This would also include the implementation of some tightly configured WAF. If Web services are involved, they have to be protected in unique ways also.

  2. Isolate the web servers — it’s always possible that something is either mistakenly or purposely left open on a web server. Architectures must be designed assuming that the bad guys will have full access to web servers via breaches. With that assumption you must isolate web servers in ultra-tight fashion so that the compromise of other hosts via web servers is mitigated.

  3. Tighten up, and use customized, error handling. Because the default error handling for most web frameworks includes wanton exposure of sensitive data, you must force tight error handling so as not to needlessly expose data. For example, imagine how much easier an attacker would have it if a full SQL query were shown due to an error.

  4. Input Validation, Validation of Input, noitadilaV tupnI, get the point? This cannot be stressed enough.

Input Validation

Web application developers as a group have shown that they often simply do not think about the unorthodox data inputs that can cause security problems. Security-minded developers do, and worse, attackers master these techniques. Input validation, or sanitization, is the one technique that can mitigate many of these issues in Web applications. It can be done in code or at the edge, with devices or software that support this type of functionality. Appendix D shows the type of data that needs to be sanitized so that its malicious effect is neutralized.

The general best practices are as follows:

  • Assume all input to be malicious

  • Accept only known good input

  • Implement a combination of both client- and server-side input validation

  • Implement a centralized and consistent framework for input validation across the entire application space

In some of the following sections other aspects of best practices are coupled with input validation suggestions. This is done to keep general areas together; for example the XPath section also covers the parameterization of query data.

One arguable approach that has made itself popular over time is not to filter out for known bad data but to filter out everything except known good data. The reason for this is that languages of the Internet make it difficult to identify all possible elements of bad data, especially considering that characters can be represented in so many different ways. Filtering out everything except known good data is possible if you look at some of the regexes from OWASP’s Validation project. They focus on establishing the good data based on some rules; everything else should be treated as dangerous.

Take a look at an example so that you can see the thought process behind the build out of a proper regex. E-mail addresses are the ideal contender because there is most likely no Web application at this point that does not handle this type of data. Generally, they can only contain characters from these sets as per RFC-2822:

  • abcdefghijklmnopqrstuvwxyz


  • 0123456789

  • @.-_

So if you revisit the e-mail regex exposed to you in Chapter 9:


you see that a ruleset very similar to the one set forth in RFC-2822 is enforced. In this example unsupported characters are not filtered out; rather, the legal characters are enforced and everything else is simply not accepted.

But reality being what it is, you need to be aware of the fact that there are always those that deviate from these rules, typically for business reasons. The regex you see enforces the rules set forth in RFC-2822 tightly. But, for example, say your target needs to support another character like the plus sign (+) in the local-part (left side of the @ symbol); you can extend that regex to meet their needs as such:


An e-mail example was not chosen at random here. It’s important to understand that data like e-mail addresses are challenging when it comes to programmatic validation. This is because every entity seems to have their own ideas about what constitutes a “valid” e-mail address. Moreover, they all also seem to have exceptions to their own rules, which make this work even more challenging. So understand the RFC base and then be ready to contend with the realities of modern-day organizations.

Once you are in tune with your client you can help them implement beneficial filtering rules. There is really no benefit in allowing characters that could never be valid to them in particular. So work with them to be able to understand their data and reject invalid data early in the transmission flow.
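As a minimal sketch of the "accept only known good" idea applied to e-mail addresses, the following Java class enforces the character sets listed earlier and adds the plus sign to the local-part as a business exception. The pattern, the class name EmailWhitelist, and the 254-character length cap are assumptions made for illustration; this is not the Chapter 9 regex.

```java
import java.util.regex.Pattern;

final class EmailWhitelist {
    // Known-good characters only: lowercase letters, digits, dot, hyphen,
    // underscore, and (as the business exception) plus in the local-part.
    // Hypothetical pattern for illustration, not the Chapter 9 regex.
    private static final Pattern EMAIL = Pattern.compile(
        "^[a-z0-9._+-]+@[a-z0-9.-]+\\.[a-z]{2,}$");

    static boolean isValid(String input) {
        // The length cap is an assumption; tune it to the target's data.
        if (input == null || input.length() > 254) {
            return false;
        }
        // matches() requires the whole string to fit the pattern.
        return EMAIL.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("jack+news@example.com"));
        System.out.println(isValid("jack'--@example.com"));
    }
}
```

Nothing is stripped or repaired here: input either fits the known-good rules or is rejected outright, which is the distinction the text draws.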


As you should already see, regexes are a way of life when doing input validation or sanitization. OWASP’s Validation project gives you some solid regexes that can be implemented in the target code base, or even in some sophisticated edge entities, in order to sanitize input properly. Here is a small sampling (you already saw a couple of these in Chapter 9) of the very useful data available from this project:

US zip code with optional dash-four

US phone number with or without dashes


4 to 8 character password requiring numbers, lowercase letters, and uppercase

9 digit social security number with dashes


Dangerous meta-characters, which you have seen throughout the book, can be the cause of many headaches. They include the following:

  • | or %7c - pipe

  • < or %3c

  • > or %3e

  • ` or %60

  • & or %26

And they can be filtered out with a regex like this:
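A minimal Java sketch of such a check, assuming only the five characters listed above and their URL-encoded forms, could look like the following. The class and method names are hypothetical, and the pattern is an illustration, not a listing from the OWASP project.

```java
import java.util.regex.Pattern;

final class MetaCharDetector {
    // Literal pipe, angle brackets, backtick, and ampersand, plus their
    // URL-encoded forms. CASE_INSENSITIVE covers %7c as well as %7C.
    private static final Pattern DANGEROUS = Pattern.compile(
        "[|<>`&]|%(?:7c|3c|3e|60|26)", Pattern.CASE_INSENSITIVE);

    static boolean containsMetaChars(String input) {
        // find() reports a hit anywhere in the value.
        return input != null && DANGEROUS.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsMetaChars("report.txt | mail attacker"));
        System.out.println(containsMetaChars("plain text 123"));
    }
}
```

Because the ampersand is a legitimate separator in query strings, a check like this is best applied to individual parameter values after they have been split out, not to the raw query string.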


Path Traversal

The realm of path traversal vulnerabilities can be mitigated in two main ways. One technique is to implement a unique, and internal, numeric index to documents. This can be augmented via the use of some further custom work in the form of static prefixes and suffixes. In reference to proper input validation, the target code base should strip and not process any directory path characters such as the following:

  • / or %2f — forward slash

  • \ or %5c — backward slash

  • .. consecutive dot characters (periods)

which attackers could use to force access to resources outside of the web server directory tree. For more examples please reference the “Path Traversal” section of Chapter 4 and Appendix D. An example of a path traversal detection regex is as such:
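Both mitigations can be sketched together in Java. The example below is a hedged illustration: the document index, the /var/docs/ prefix, the .pdf suffix, and all names are hypothetical, and the detection pattern covers only the characters listed above and their common URL-encoded forms.

```java
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

final class DocumentLookup {
    // Hypothetical internal index: clients only ever send the numeric key.
    private static final Map<Integer, String> INDEX = Map.of(
        1, "annual-report",
        2, "price-list");

    // Slash, backslash, dot-dot, and common URL-encoded forms.
    private static final Pattern TRAVERSAL = Pattern.compile(
        "[/\\\\]|\\.\\.|%2f|%5c|%2e", Pattern.CASE_INSENSITIVE);

    static boolean looksLikeTraversal(String input) {
        return input != null && TRAVERSAL.matcher(input).find();
    }

    static Optional<String> resolve(String idParam) {
        // Accept nothing but a short run of digits.
        if (idParam == null || !idParam.matches("\\d{1,9}")) {
            return Optional.empty();
        }
        String name = INDEX.get(Integer.parseInt(idParam));
        // Static prefix and suffix are added server-side, never by the client.
        return Optional.ofNullable(name).map(n -> "/var/docs/" + n + ".pdf");
    }

    public static void main(String[] args) {
        System.out.println(looksLikeTraversal("../../etc/passwd"));
        System.out.println(resolve("1").orElse("not found"));
    }
}
```

With the index in place the client never supplies a path at all, so the detection regex becomes a second layer for logging and alerting and is no longer the only line of defense.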


HTTP Response Splitting

To introduce protection from HTTP Response Splitting attacks all input should be parsed scanning for

  • CR

  • LF

  • \r\n

  • %0d

  • %0a

or any other morphed variations or encodings of these characters. The bottom line is that explicit carriage return and line feed elements should not be allowed in any form within HTTP headers. An example of a regex (there are different ways to do this) that detects HTTP Response Splitting attacks is as such:
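One possible way to express this check in Java is sketched below. It assumes only raw CR and LF characters and their single URL-encoded forms; double encodings and other morphed variations would need decoding or additional patterns first. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class HeaderValueCheck {
    // Raw carriage return or line feed, or their URL-encoded forms.
    private static final Pattern CRLF = Pattern.compile(
        "[\\r\\n]|%0d|%0a", Pattern.CASE_INSENSITIVE);

    static boolean hasCrLf(String value) {
        return value != null && CRLF.matcher(value).find();
    }

    // Refuses any value that would let the caller start a new header line.
    static String safeHeaderValue(String value) {
        if (hasCrLf(value)) {
            throw new IllegalArgumentException("CR/LF not allowed in header value");
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(hasCrLf("en-US"));
        System.out.println(hasCrLf("en%0d%0aSet-Cookie: sid=1"));
    }
}
```

Calling safeHeaderValue() on every piece of user-influenced data just before it is written into a response header keeps the check in one place.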



XSS

XSS attacks can come in many forms, as you have seen throughout the book, but they generally include the following characters:

  • < or %3c

  • > or %3e

A general simple regex for detection looks like this:


LDAP Injection

LDAP Injection attacks rely on a very finite set of characters due to the legal query characters used by the LDAP protocol. The following characters must be filtered out carefully:

  • | or %7c — pipe

  • ( or %28

  • & or %26

A general simple regex for detection looks like this:


SQL Injection

Here you trek forward with the concept of “not removing known bad data” but “removing everything but the known good data.” The distinction is not to be taken lightly. Writing regexes for SQL Injection detection is a bit more challenging than some of the other areas. The reasons for this are complex, but take for instance the fact that even regular characters can be troublesome in the SQL Injection world. You have seen the ideal example of this throughout this book and it appears many times in Appendix D. Take a look at this poisoned SQL:

SELECT field
  FROM table
 WHERE ix = 200 OR 1=1;

Clearly the condition will always be true, but how do you regex out the 1s? It is not that straightforward and so you will have to deeply analyze your target’s SQL in order to advise them effectively. The following regexes are generally great in stand-alone fashion or as starting points to put together something useful for your client. To detect standard SQL meta-characters that would be used in an attack string:


To detect any SQL Injection that uses the UNION keyword:


For MS SQL-Server attacks:
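The three detection ideas just described (SQL meta-characters, the UNION keyword, and MS SQL Server procedure calls) might be sketched in Java as follows. The patterns are assumptions modeled on widely circulated intrusion-detection signatures, and all names are hypothetical; they are starting points only and will need tuning against the target's real traffic.

```java
import java.util.regex.Pattern;

final class SqlInjectionSignatures {
    // Single quote, comment markers (-- and #), and URL-encoded quote or hash.
    private static final Pattern META = Pattern.compile(
        "'|--|#|%27|%23", Pattern.CASE_INSENSITIVE);

    // A quote (raw or encoded) followed by the UNION keyword.
    private static final Pattern UNION = Pattern.compile(
        "('|%27)\\s*union\\b", Pattern.CASE_INSENSITIVE);

    // EXEC followed by an sp_ or xp_ style procedure name.
    private static final Pattern MSSQL_EXEC = Pattern.compile(
        "exec(\\s|\\+)+(s|x)p\\w+", Pattern.CASE_INSENSITIVE);

    static boolean hasMetaChars(String s) {
        return s != null && META.matcher(s).find();
    }

    static boolean hasUnion(String s) {
        return s != null && UNION.matcher(s).find();
    }

    static boolean hasMsSqlExec(String s) {
        return s != null && MSSQL_EXEC.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasUnion("1' UNION SELECT password FROM users--"));
        System.out.println(hasMsSqlExec("'; exec xp_cmdshell 'dir'"));
        // A legitimate name also trips the meta-character check.
        System.out.println(hasMetaChars("O'Brien"));
    }
}
```

The last call in main shows why detection alone is not enough: a perfectly valid surname contains a single quote, so signatures like these are better suited to alerting than to being the only control.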


Mitigation of SQL Injection vulnerability could potentially take one of two paths. One is to push the concept of using stored procedures, and the other is to use prepared statements when dynamically constructing SQL statements. Whichever way is opted for, data validation or sanitization is a must. Beyond that, take heed to the following suggestions.

Escape Input Characters

A name data field represents a major challenge. The reason for this is that one of the most dangerous characters to SQL statements is also a legitimate character in some names. The single quote is simply a valid character for name data fields. A very simple, tempting, yet ineffective technique is to concatenate two single quotes together, effectively escaping the second one. For example, to protect a query the developers could use said tactic and force a query to look something like this:

SELECT field
  FROM table
 WHERE name = 'Jack O''Lantern';

Though this query will work, this approach provides little in the way of true protection. Some DBs, for example, fully support alternate escaping mechanisms such as using \' to escape a single quote. Hence, following the previous example, this mechanism can be abused with something nasty like \'; DROP TABLE table; -- . The quote doubling turns the injected \' into \'', the DB reads \' as an escaped quote, and the remaining quote closes the string early. The underlying query would then look something like this:

SELECT field
  FROM table
 WHERE name = '\''; DROP TABLE table; --';

Getting quotes escaped correctly is notoriously difficult. Based on your target language you should hunt down escaping functions/methods that do this for you. This way you use a tried and true method and you know that the escaping will be done properly and safely. For example, if your target uses PHP and MySQL there is a function called mysql_real_escape_string(), which prepends escaping backslashes to the following characters: \x00, \n, \r, \, ', ", and \x1a.

If your target is a Perl app, there are (among other things) two DBI methods called quote($value) and prepare($value). “quote” correctly quotes and escapes SQL statements in a way that is safe for many DB engines. “prepare” ensures that valid SQL is being sent to the DB server. Here is a simple code snippet as a basic example:

$strsql = "select * from users where blah'";
print "Raw SQL: " . $strsql . "\n";
print "Quoted SQL: " . $dbh->quote($strsql) . "\n\n";
print $cursor = $dbh->prepare($strsql);

If you pay attention you will see that there is a tick (single quote) injected to the end of the SQL query string (simply for exemplary purposes). This is how many attacks start; a single quote is injected and the response is analyzed. If you run this Perl snippet you will see that in the output this single quote is escaped by the quote($value) function:

Raw SQL: select * from users where blah'
Quoted SQL: 'select * from users where blah\''

Use Bound Parameters

The use of bound parameters is a more elegant approach than the escaping of characters. This model is supported by modern-day DB programming interfaces. The basis for it is that a SQL statement will be created with placeholders (like ?) and then compiled into a form understood internally by the interface. Then execution takes place with parameters against this compiled and safe method. Take a look at this small snippet in Perl, which uses prepare($value) and exemplifies this via a slightly different approach:

my $sql = qq{ SELECT name FROM table WHERE name = ?; };
my $sth = $dbh->prepare( $sql );
$sth->execute( $strVar );

To further drive the concept home take a look at an example in Java:

PreparedStatement ps = connection.prepareStatement(
   "SELECT name FROM table WHERE name = ?");
ps.setString(1, strVar);
ResultSet rs = ps.executeQuery();

In these two snippets strVar is some string variable that could have come from different sources. It represents the point of entry for a possible attack string. The data in this variable could be anything from quotes to backslashes to SQL comments. It is of no relevance because the interface treats it strictly as string data bound to the placeholder; it is never parsed as part of the SQL statement.

Limit Permissions and Segregate Accounts

This one may seem obvious but it is mentioned here due to the countless times the exact opposite practice is encountered out there. The Web application must use a database connection and account with the most limited rights possible. Moreover, wherever possible account segregation should be used. In this case, for instance, one specific account is used to run select queries and that is the only permission the respective account has in the target DB. The net effect here is that even a successful SQL Injection attack is going to face limited success due to the segregation of permissions.

Use Stored Procedures

Stored procedures represent a different level of protection against SQL Injection attacks. They are not infallible but require a much higher skill set for an attack to be successful. The key point to pushing SQL back to stored procedures is that client-supplied data is not able to modify the underlying syntax of SQL statements. This point can be taken to the extreme level of protection where the Web app is so isolated that it never touches SQL at all. The bottom line is that you can offer this as an option to your clients. The goal would be that all SQL statements used by the target reside in stored procedures and are processed on the database server. In-line SQL is then done away with. The target application must then be modified so that it executes the stored procedures using safe interfaces, such as CallableStatement in JDBC or the Command object in ADO.

By encapsulating the rules for certain DB actions into stored procedures, they can be treated as isolated objects. Based on this they can be tested and documented on a stand-alone basis and business logic can be pushed off into the background for some added protection. Be advised that pushing SQL back to stored procedures for simple queries may seem counterproductive, but over time and complexity the reasons why this is a good idea will become self-evident.

It is possible to write stored procedures that construct queries dynamically based off input. This provides no protection against SQL Injection attacks.

XPath Injection

XPath protection is extremely similar to the SQL Injection measures of protection. Performing regex-based detection of XPath Injection attacks is quite difficult. A good approach to get around this difficulty is to use parameterized queries. Instead of dynamically forming XPath query expressions, the parameterized queries are statically precompiled. This way user input is treated as a parameter, not as an expression, and the risk of attack is mitigated. Take a look at an example based on the example from Chapter 4. First, here is a traditional injectable login XPath query:

String(//users[username/text()=' " + username.Text + " ' and password/text()=' "+
password.Text +" '])

A proper ingestion of data would be treated as such:

String(//users[username/text()='user@example.com' and
password/text()='P@ssw0rd'])

And maliciously injected input x' or 1=1 or 'x'='y could force the query to become

String(//users[username/text()='x' or 1=1 or 'x'='y' and password/text()=''])

To mitigate this risk, treat the variable elements of data as such:

String(//users[username/text()= $username and password/text()= $password])

Now the input is not used to form the underlying query; instead, the query looks for data values in the variables themselves (they could come out of the XML document as well). This also nullifies the attack meta-characters based on quotation marks. This technique prevents XPath Injection attacks.
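In Java, one way to get this behavior is the standard javax.xml.xpath API, where an XPathVariableResolver supplies the values for $username and $password at evaluation time. The sketch below is a minimal illustration under stated assumptions: the XML document is a hypothetical user store inlined as a string, and the class name XPathLogin is made up for the example.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class XPathLogin {
    // Hypothetical user store, inlined so the sketch is self-contained.
    private static final String USERS_XML =
        "<accounts><users>"
      + "<username>jack</username><password>s3cret</password>"
      + "</users></accounts>";

    static boolean login(String username, String password) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Values are bound as variables; they never become expression text.
        xpath.setXPathVariableResolver((QName name) -> {
            switch (name.getLocalPart()) {
                case "username": return username;
                case "password": return password;
                default: return null;
            }
        });
        try {
            Boolean found = (Boolean) xpath.evaluate(
                "boolean(//users[username/text() = $username"
              + " and password/text() = $password])",
                new InputSource(new StringReader(USERS_XML)),
                XPathConstants.BOOLEAN);
            return found;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(login("jack", "s3cret"));
        System.out.println(login("x' or 1=1 or 'x'='y", ""));
    }
}
```

The injected string from the earlier example is simply compared, character for character, against the stored username, so the quotes and the "or 1=1" fragment have no effect on the query logic.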


Stinger from Aspect Security is an excellent example of one avenue of protection within Java code bases. It is a regex-based validation engine to be used in J2EE environments. An article by Jeff Williams provides an excellent example of its use. The basics of its usefulness are in the creation of the validation regex in an XML file; for example, take a look at this:


The rule displayed in this snippet will enforce a tight and secure cookie usage model. The code utilizing a rule like this will only accept a JSESSIONID cookie. The data within that cookie must then consist of 32 characters within the range A–F or 0–9. The ruleset treats any extra cookies, or the lack of the JSESSIONID cookie, as a fatal condition. This XML-based ruleset would be part of a larger collection you would construct based on your target; the file would be named something like target_X.svdl. The svdl extension stands for Security Validation Description Language. These ruleset files exist in a directory named “rules” hanging off the main directory of the webapp. Then in your Java code you call validate(). This call triggers Stinger to hunt down the appropriate rulesets and apply them to the data flowing through. A FatalValidationException is thrown if a fatal rule is violated.


PHPFilters is a project out of OWASP. It provides easy-to-implement PHP functions that sanitize certain types of input. The current set of functions is as follows:

  • sanitize_paranoid_string($string) — Returns string stripped of all non-alphanumeric characters

  • sanitize_system_string($string) — Returns string stripped of special characters

  • sanitize_sql_string($string) — Returns string with escaped quotes

  • sanitize_html_string($string) — Returns string with html replaced with special characters

  • sanitize_int($integer) — Returns the integer with no extraneous characters

  • sanitize_float($float) — Returns the float with no extraneous characters

  • sanitize($input, $flags) — Performs sanitization function as per the value specified in the “flags” parameter. The options for “flags” are PARANOID, SQL, SYSTEM, HTML, INT, FLOAT, LDAP, UTF8

Here is some text (from the HTML test page) as an example of running a data type attack on an integer and using the sanitize function:

Nirvana Test Suite

Server Software: Apache/1.3.33 (Unix) on Linux
PHP Version: 4.4.2
Register Globals: 0
Magic Quotes GPC: 1
Nirvana Test Flag: INT

Test String was: -98#$76\\00543
Sanitized: -98

As you can see, the attack string was sanitized and a clean integer was returned.

Web Application Security Project (WASP) is another PHP alternative. It gives you a similar set of libraries/functions that you can utilize in your code to sanitize input. You see it a bit more in Chapter 11.


Because the .NET Framework is not an open one there are other options in this area. There are commercial products that claim to seamlessly provide .NET data validation. They are not covered here; this section sticks with the built-in objects and some manual work. Your target may not allow for third-party software so you should always be competent in performing your work without third-party involvement. The concentration will be in the following:

  • ASP.NET request validation

  • Input constraining

  • Control of output

By default, ASP.NET versions 1.1 and 2.0 come with active request validation functionality built in. It detects any HTML elements and reserved meta-characters in data sent in to the server. This provides protection from the insertion of scripts into your targets. The protection is provided by automatically checking all input data against a static list of potentially malicious values. If a match occurs, an exception of type HttpRequestValidationException is thrown. This can be disabled if need be.

Many security experts involved with .NET technology agree that the built-in validation is not to be relied upon exclusively. It should be treated as one layer in addition to custom input validation.

To constrain input, the following are best practices:

  • Validate input length, bounds, format, and type. Filter based on known good input.

  • Use strong data typing.

  • Use server-side input validation. Only use client-side validation to augment server-side validation and reduce round trips to the server and back.

ASP.NET brings five built-in objects (controls) for validation:

  • RequiredFieldValidator — This forces entry of some value in order to continue operating.

  • CompareValidator — This is used with a comparison operator to compare values.

  • RangeValidator — Checks whether the value entered is within some established bounds.

  • RegularExpressionValidator — Uses regex to validate user input.

  • CustomValidator — Allows you to create customized validation functionality.

You should investigate those based on your target or just for practice. To give you an example, take a quick look at the regex validator, which gives you some of this flexibility. This would apply to an HTML form. At a high level, to use it you must set the ControlToValidate, ValidationExpression, and ErrorMessage properties to appropriate values as seen here:

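<form id="MyForm" method="POST" runat="server">
  <asp:TextBox id="txtUserName" runat="server"></asp:TextBox>
  <asp:RegularExpressionValidator id="nameRegex" runat="server"
        ControlToValidate="txtUserName"
        ValidationExpression="^[a-zA-Z'.\s]{1,40}$"
        ErrorMessage="Invalid name">
  </asp:RegularExpressionValidator>
</form>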

The regex in the preceding snippet example establishes bounds for a text input field to alphabetic characters, white-space characters, the single apostrophe, and the period. In addition, the field length is constrained to 40 characters.

The RegularExpressionValidator control automatically adds a caret (^) and dollar sign ($) as delimiters to the beginning and end of expressions, respectively. That is of course if you have not added them yourself. You should get in the habit of adding them to all of your regex strings. Enclosing your regex within these delimiters ensures that the expression consists of the accepted content and nothing else.
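The effect of the anchors is easy to demonstrate outside of ASP.NET. The Java sketch below assumes a pattern of the kind described above (letters, white space, the apostrophe, and the period, at most 40 characters) and compares an unanchored search with an anchored one. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class NameFieldCheck {
    // Same character class and length bound, without and with anchors.
    private static final Pattern UNANCHORED =
        Pattern.compile("[a-zA-Z'.\\s]{1,40}");
    private static final Pattern ANCHORED =
        Pattern.compile("^[a-zA-Z'.\\s]{1,40}$");

    // Succeeds if ANY part of the input fits the pattern.
    static boolean looseCheck(String input) {
        return UNANCHORED.matcher(input).find();
    }

    // Succeeds only if the input as a whole fits the pattern.
    static boolean strictCheck(String input) {
        return ANCHORED.matcher(input).find();
    }

    public static void main(String[] args) {
        String attack = "Jack<script>alert(1)</script>";
        System.out.println(looseCheck(attack));
        System.out.println(strictCheck(attack));
        System.out.println(strictCheck("Jack O'Lantern"));
    }
}
```

Without the anchors the leading "Jack" is enough to satisfy the search, so the script payload rides along; with them, the first disallowed character causes the whole value to be rejected.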

To control output basically means that you don’t carelessly give away critical data under unexpected conditions. In essence, you ensure that detailed errors are not returned to any remote clients. You can use the <customErrors> element in the web.config file to configure custom error messages that should be returned to the client in the face of an unexpected condition. Setting the mode attribute to RemoteOnly isolates this behavior to remote clients, so that detailed errors are still visible when browsing from the server itself. You can also configure this setting to point to some custom error page, as shown here:

<customErrors mode="RemoteOnly" defaultRedirect="CustomErrorPage.html" />

Session Management

When addressing Web apps that seek, or are in need of, strong session management you should be advised that it is generally not easily achieved. Moreover, there is an entirely subjective slice to it based on your given target. Chapter 4 gave you the generic session management best practices and covered areas like randomness and strength. Oddly enough, this is one area where the more custom and complex you get the better off the security of the target is. There will always be the counter argument of support and administration, but right now focus on providing the best security for your target. A great starting point is RFC-2109.

Here is a simple example of some custom security-driven session management work performed by software engineers David Shu and Cosmin Banciu. It is a simple but powerful example of some solid use of modern-day techniques for session management, and it represents the creativity you should use when advising your clients in an area like state management.

When addressing a J2EE-based application that used cookies for state management in an insecure fashion, they decided to utilize strong encryption for better protection of the client’s data. The target app handled access control via cookie data that constituted the relevant username and the level of authorization to be enforced server-side. The information was stored in a cookie using two values, mail and az (authorization access level). In the data flow of the target app, the cookie is set once upon successful user authentication. The enhancement consists of the strong encryption of the data in the cookie based on a randomly chosen key. The pool of keys is obviously finite, but each new cookie gets its data encrypted with a randomly chosen key. Using this random selection (from the finite set) ensures that the same cookie values will look different to the naked eye. The encryption algorithm chosen for the application was the AES Algorithm (Rijndael). This snippet shows you the key choosing and encryption process:

public String encrypt(String clearString) throws Exception {
   Random generator = new Random();
   // produces a number between 1 and the total number of keys
   int random = generator.nextInt(keys.size()) + 1;
   Iterator keysIt = keys.iterator();

   int counter = 0;
   byte[] rawkey = null;
   // step through the pool; the last key read is the one used
   while(counter < random){
      rawkey = (byte[]) keysIt.next();
      counter++;
   }
   AesCrypt cryptor = new AesCrypt(rawkey);
   return cryptor.encryptAsHex(clearString);
}

The code that actually creates the cookie data in the HTTP response is seen in this snippet:

public void setCookie(HttpServletResponse response) {
   Cookie c = new Cookie("Z3sti5H3AO", ss.getEncryptedValues());
   c.setSecure(true);   // only sent over SSL or TLS
   response.addCookie(c);
}

One thing to note here is the setSecure attribute being set to true. This basically enforces the cookie being sent only over an encrypted stream (SSL or TLS). This provides another layer of protection to the overall solution. The getEncryptedValues() method is seen here:

public String getEncryptedValues() {
   String retVal = null;
   try {
      retVal = aesUtils.encrypt(getDecryptedString());
   } catch (Exception e) {
      System.out.println("Encrypting cookie failed: " + e.getMessage());
   }
   return retVal;
}

Clearly, the getEncryptedValues() method calls the encrypt(String) method shown earlier. This is what the actual cookie data looks like before remediation:

Z3sti5H3AO=az=StandardUser&mail=user@example.com

After remediation the same cookie data looks like this:


The name of the cookie in this example is not encrypted or encoded in any fashion. A set of random alphanumeric characters was chosen and statically coded into the application. This is meant to confuse attackers into thinking that both the name and the value of the cookie are encrypted when only the value is. The delimiter used to separate the az and mail values is “&” but it can be anything you wish; it all depends on the code at hand. To implement the setCookie method call, the authentication JSP page does a POST to a Struts-based action (actions/ This action forwards that request to a Java class that has some code that calls setCookie upon successful authentication.

The decryption side of the house looks like this:

public String getCookieValue(HttpServletRequest request,String key) {
   Cookie[] cArray = request.getCookies();
   if(cArray == null){
      return null;
   }
   if(cArray.length > 0){
      for(int i=0;i<cArray.length;++i){
         Cookie c = cArray[i];
         // only the application's own cookie is decrypted
         if("Z3sti5H3AO".equals(c.getName())){
            SessionState tmpSS = new SessionState(c.getValue(),crypt);
            return tmpSS.getValue(key);
         }
      }
      return null;
   } else {
      return null;
   }
}

The SessionState constructor actually handles the call for the decryption via an init method shown here:

private void init(String encryptedString, AESKeys utils) {
   this.aesUtils = utils;

   if (encryptedString != null && encryptedString.trim().length() > 0) {
      String decrypted = null;
      StringTokenizer tokens = null;

      decrypted = aesUtils.decrypt(encryptedString);
      tokens = new StringTokenizer(decrypted, "&");

      // split each name=value pair and store it
      while (tokens.hasMoreTokens()) {
         String nameValue = tokens.nextToken();
         int index = nameValue.indexOf("=");

         if (index >= 0) {
            String key = nameValue.substring(0, index);
            String value = nameValue.substring(index+1);
            addValue(key, value);
         }
      }
   }
}

And the actual decrypt method being called from init is shown here:

public String decrypt(String encryptedString) {
   String retVal = null;

   boolean success = false;
   Iterator keysIt = keys.iterator();

   // try each key in the pool until one decrypts without an error
   while(keysIt.hasNext() && success == false) {
      byte[] key = (byte[]) keysIt.next();
      try {
         AesCrypt cryptor = new AesCrypt(key);
         retVal = cryptor.decrypt(Hex2Byte(encryptedString));
         success = true;
      } catch(Exception exp) {
         success = false;
      }
   }
   return retVal;
}
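The key-pool idea in these listings can be condensed into one self-contained sketch. The AesCrypt class used by the original engineers is not shown, so this version swaps in the JDK's own AES in GCM mode, which authenticates the data and therefore fails cleanly when the wrong key is tried. The class name CookieCrypto, the withRandomKeys helper, the 128-bit key size, and the Base64 output encoding (instead of hex) are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class CookieCrypto {
    private static final SecureRandom RNG = new SecureRandom();
    private static final int IV_LEN = 12;     // bytes, the usual GCM nonce size
    private static final int TAG_BITS = 128;  // authentication tag length

    private final List<byte[]> keys;

    CookieCrypto(List<byte[]> keys) {
        this.keys = new ArrayList<>(keys);
    }

    // Hypothetical helper: builds a finite pool of random 128-bit keys.
    static CookieCrypto withRandomKeys(int count) {
        List<byte[]> pool = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            byte[] k = new byte[16];
            RNG.nextBytes(k);
            pool.add(k);
        }
        return new CookieCrypto(pool);
    }

    String encrypt(String clear) {
        byte[] key = keys.get(RNG.nextInt(keys.size())); // random key from the pool
        byte[] iv = new byte[IV_LEN];
        RNG.nextBytes(iv);                               // fresh nonce per cookie
        try {
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                   new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = c.doFinal(clear.getBytes(StandardCharsets.UTF_8));
            byte[] out = new byte[iv.length + ct.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(ct, 0, out, iv.length, ct.length);
            return Base64.getUrlEncoder().withoutPadding().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Encrypting cookie failed", e);
        }
    }

    // Returns the clear text, or null if no key in the pool authenticates it.
    String decrypt(String cookieValue) {
        if (cookieValue == null) {
            return null;
        }
        byte[] in;
        try {
            in = Base64.getUrlDecoder().decode(cookieValue);
        } catch (IllegalArgumentException e) {
            return null;                                 // not valid Base64
        }
        if (in.length < IV_LEN + TAG_BITS / 8) {
            return null;                                 // too short to be ours
        }
        for (byte[] key : keys) {
            try {
                Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
                c.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                       new GCMParameterSpec(TAG_BITS, in, 0, IV_LEN));
                byte[] pt = c.doFinal(in, IV_LEN, in.length - IV_LEN);
                return new String(pt, StandardCharsets.UTF_8);
            } catch (GeneralSecurityException e) {
                // wrong key or tampered data: fall through to the next key
            }
        }
        return null;
    }
}
```

A caller would create one CookieCrypto per application, pass the "az=...&mail=..." string to encrypt() when setting the cookie, and treat a null from decrypt() as an invalid session. Because each call uses a fresh nonce as well as a randomly chosen key, the same clear text yields a different cookie value every time.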

To ultimately tie this into the target app, something akin to the following was dropped into its code base. It exists on every JSP page within the target Web application:

<%@ page import="coreservlets.AppSecurity" %>
<%
   AppSecurity as = new AppSecurity();
   if (!as.isValidUser(request)) {
%>
<%@ taglib uri="/tags/struts-logic" prefix="logic" %>
<logic:redirect forward="LoginError" />
<% } %>

The AppSecurity class is the class that interfaces with all necessary elements of this solution. The class holds the setCookie and getCookieValue methods that you have already seen. The isValidUser method simply calls upon the getCookieValue method in such fashion:

if(getCookieValue(req,"mail") != null && getCookieValue(req,"az") != null) {
   return true;
} else {
   return false;
}

At that point the target app started operating with a different level of protection. This was merely an example of a real-world solution. The lesson for you to learn is that creativity coupled with an understanding of the technology at hand can lead to greater levels of Web application and data security.
