Validate and Escape Data to Prevent SQL Injection
So we’re back on the topic of SQL Injection, and how to prevent them with PHP. This subject, while often discussed, is not taken as seriously as it should be by many a web-developer, whether it be because of laziness, or merely due to lack of knowledge. Assuming the latter is the most common cause, I felt that I should elaborate a little more on my previous article on the topic.
If you don’t know much about SQL Injections, you may be interested in reading my earlier article covering the basics of what SQL Injections are, how they are carried out, and how to prevent them. You can find that article here:
Ok. So SQL Injections. How can we further prevent them? Well, let’s go over what we know.
- The majority of all web applications interact in some fashion with a database.
- The data that goes into a database, generally originates from an outside source, usually an end-user inputting data in a form
- End-users are NEVER to be trusted
So given those three basic facts, we can deduce that the best place to begin preventing SQL Injections is at the end-user. So what can we do? We can’t prevent an end-user from attempting to enter in whatever data they want into an input field (actually we can with javascript, but that’s a topic for another day) – but we can make sure we don’t let bad data go anywhere near our database.
In the previous tutorial, we went over using mysql_escape_real_string() to sanitize our $_GET and $_POST variables. This is still the most crucial part of preventing an attack. However, there are things we can do before our data gets there. What am I talking about? Form validation. If we know what fields are being filled out, we generally know what kind of data we are expecting, and we can verify that it is what it should be.
Take the following form for example:
myform.html
1 2 3 4 5 6 7 |
<form action="validate.php" method="post" name="my_form"> First Name: <input type="text" name="fname" /> Email Address: <input type="text" name="email" /> Number of siblings? (numbers only) <input type="text" name="siblings" /> <input type="submit" value="validate" /> </form> |
We have three simple text input fields, but each one has a specific format that we can check against. We can assume the following three things about the data we will be receiving.
- The first name will most likely not contain any numbers in it
- The Email address will contain any variety of characters, both letters and numbers, but will have only 1 “@” symbol.
- The sibling count will be a number!
Given those asumptions, we can validate the data upon form submission, to make sure we are working with properly entered data. Below I’ve thrown together a few functions, in all of their simplistic glory. Given the scope of this tutorial, I want to make sure that everyone understand what is happening and where.
validation_funcs.php
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 |
<?php // makes sure argument is alpha-string. function check_string($name) { $name = preg_replace(‘/[\s]+/is’, ‘ ‘, $name); $name = trim($name); return preg_match(‘/^[a-z\s]+$/i’, $name); } // validate email format. function check_email($email) { $regexp='/^[a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}$/'; return preg_match($regexp, trim($email)); } // make sure argument is number, and not // greater than defined limit. function check_siblings($count) { if(!is_numeric($count) ) return false; // lets limit the sibling count. if( $count --> 20 ) return false; return true; } // escapes an entire array function mysql_real_escape_array($t) { return array_map("mysql_real_escape_string",$t); } ?> |
Each of the first three functions takes an argument, validates against what it was made to validate, and returns a true or false. In the first two functions we are making use of regular expressions. For further reading on regular expressions, and more specifically on the two functions we are using (preg_replace and preg_match) see these links (via PHP.NET:
The last function is a handy little function that escapes an entire array, saving you the trouble if having to escape each argument one by one. See example below for usage.
So now what? Well? We validate our data!
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
<?php // include the file with our functions in it. include "./validation_funcs.php"; if( check_name( $_POST['fname'] ) == false ) die( "NAME IS INVALID!" ); if( check_email( $_POST['email'] ) == false ) die( "EMAIL IS INVALID!" ); if( check_siblings( $_POST['siblings'] ) == false ) die( "INVALID SIBLING INPUT" ); // Escape the entire $_POST array in one go! Make sure that // you don't escape the variables again, or you risk double-escaping // them - which essentially cancels each other out leaving you at // risk! $_POST = mysql_real_escape_array( $_POST ); echo "Everything is good! Proceed with data insertion."; ?> |
Now this is by no means a full-fledged way to validate your data. Actually, it’s pretty crude, but I wanted to keep it simple for introduction’s sake. To sum it all up, if you validate your data, and then escape it, you can be that much more sure that unexpected queries wont go anywhere near your valuable data. Escaping is crucial!! But taking the extra step and validating it first helps make your web application that much more secure.
For more in-depth articles on form validation, see the following links:
Questions? Comments?
There are 3 comments.
super
Nice article, you´re never completly protected from hacking so extravalidation is always usefull
Good blog
By submitting a comment you grant digifuzz.net a perpetual license to reproduce your words and name/web site in attribution. Inappropriate and irrelevant comments will be removed at an admin’s discretion. Your email is used for verification purposes only, it will never be shared.