HTML File Uploads

Filed: Thu, Dec 28 2006 under Programming|| Tags: upload php files

One of the most powerful HTML form fields is <input type='file'>, which allows your visitors to specify a file on their computer that they would like to upload to your server. This article will show how to transmit a file from the browser's computer to the server with straight-up HTML and PHP.

Back when I was running an ISP in the mid-90s, one of the very first web applications I ever made allowed files to be uploaded with the then-revolutionary, Netscape-only <input type='file'> HTML form tag. Mind you, this was before PHP, back when Netscape owned the browser world and, believe it or not, IE didn't support type='file'! (They didn't implement it globally without a patch until 4.0, which caused me no end of grief!)

Well, here it is over ten years later, and other than the fact that IE now supports <input type='file'> and PHP can recognize and handle this type of data, not much has changed. Cascading Style Sheets have come and gone, leaving the file fields virtually untouchable (much to the chagrin of web designers everywhere), and even AJAX is powerless to alter the behavior of this dinosaur without resorting to special ActiveX objects (IE only) and/or signed code.

There's a good reason for this really. Your browser is a sandbox the rest of the world plays in, but to protect you as best it can, the browser sets up very high and impenetrable walls between what's going on in the browser and the rest of your computer system. In a way, <input type='file'> goes against everything your browser is built to do because it allows the web page to have access (however limited) to your computer's local file system. This means that, in theory, an unscrupulous website could give you exact directions for finding the Quicken file which contains all your bank records and then get you to transfer that file right there in the web page.

That doesn't happen very often, thank goodness, because it requires the user to locate the file, deliberately place the filename in the field, and then deliberately submit the file to the server. That is a lot of work. So file uploads are mostly relegated to what they've always been intended for -- a legitimate way for the user to transfer data to the server. This has made web 2.0 sites possible and contributed greatly to their successes, and, of course, file uploads are an incredible boon to intra-office networks; but because the tag runs counter to everything your browser's security model stands for, the limitations on it are numerous.

As powerful as the tag is, you as a programmer and web designer will always be bumping into its restrictions: it won't integrate with AJAX, and you won't be able to use CSS to easily style it to match the rest of your site.

And with that history lesson concluded, it's time to learn how to use and implement this valuable tool. For this tutorial you'll need access to a browser -- you do have a browser, don't you? You will also need access to PHP on your web server.

We'll get started by defining a basic form which has a few more attributes than an ordinary form.

<form ENCTYPE="multipart/form-data" METHOD="post" ACTION='sendfile.php'>
   <input id='uploadName' name='fileName' type='file'>
</form>

The key to making <input type='file'> work is the ENCTYPE attribute in the <form> definition. By specifying "multipart/form-data" we tell the browser to encode the form so that the file's contents travel along with the rest of the form data, and the server knows to expect them. The METHOD must be POST; you cannot transfer files with a GET method. And of course with ACTION you specify the program on your server that you'd like to handle the form.
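If you'd like sendfile.php to confirm the form really arrived the way those attributes promise, a minimal sketch might look like the following. The function name isMultipartPost is made up for this example, and it assumes your server exposes the request's content type as CONTENT_TYPE in $_SERVER, which is common but not guaranteed.

```php
<?php
// Hypothetical helper (not part of PHP): true when the request was sent with
// METHOD="post" and ENCTYPE="multipart/form-data", as the form above asks.
// Assumption: the server provides REQUEST_METHOD and CONTENT_TYPE entries.
function isMultipartPost(array $server): bool
{
    $method = isset($server['REQUEST_METHOD']) ? $server['REQUEST_METHOD'] : '';
    $type   = isset($server['CONTENT_TYPE']) ? $server['CONTENT_TYPE'] : '';

    // The browser appends "; boundary=..." so only the start is compared.
    return strtoupper($method) === 'POST'
        && stripos($type, 'multipart/form-data') === 0;
}
```

In sendfile.php you would call it as isMultipartPost($_SERVER) and bail out early when it returns false.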

The input field responds to many of the same styling attributes as a normal input field. You can specify a horizontal size with 'size=', but you can't specify a CSS width. You can use CSS to specify a font and background color, but you can't give the field a default starting value.

The contents of the field are READ ONLY by Javascript. You will not be allowed to change the contents of the field in any way via Javascript. And no matter what you do, there will always be a browse button that will be unstylable and unchangeable beside the input field. This field belongs to the user, and the user alone. You can look but you can't touch.

Now given these limitations and the amount of time involved it's no surprise people have been able to work around at least some of the limitations. For instance, this article on quirksmode shows how to style the browse button with the graphic of your choosing. And this blog article shows how to hide a file input field after the user has filled it in and insert a new blank field to provide an elegant way to send multiple files without needing multiple (visible) input fields.

Unfortunately, your Javascript can only see the file name the user selected (and in later browsers you can see ONLY the file name; the pathname itself will be hidden as well). This means you can do basic filename checking, such as ensuring the name ends in .jpg if you're expecting a picture. But you have no access to the file data itself, so you can't verify that the actual file is a jpg. You can't even see how large the file is; that won't be revealed until your server-side program is called.

If you don't like the restrictions on the browser-side, just wait until you hit the limitations on the server side.

PHP allows you to limit the size of the file upload with a hidden field in your form. The value is in bytes, and the field must come before the file input field.

   <input type="hidden" name="MAX_FILE_SIZE" value="10000" />

Now comes the good part. First, since the file size limit is on the user's side, a malicious user can simply modify the page, remove the limit and flood your server. You can do additional checks on the server side with the $_FILES['fileName']['size'] variable; however, this only happens AFTER the server has received all the data.
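Here is a rough sketch of what that server-side check could look like. The helper name uploadSizeIsAcceptable is invented for this example; it takes one entry of the $_FILES array plus a byte limit you choose on the server, so nothing in it depends on the hidden form field.

```php
<?php
// Hypothetical helper (not part of PHP): true only when the upload finished
// cleanly and its size is within a limit enforced on the server.
function uploadSizeIsAcceptable(array $file, int $maxBytes): bool
{
    // A missing entry means no file field arrived at all.
    if (!isset($file['error'], $file['size'])) {
        return false;
    }
    // PHP itself reports oversized or broken uploads through 'error'.
    if ($file['error'] !== UPLOAD_ERR_OK) {
        return false;
    }
    // Reject empty files as well as files over our own limit.
    return $file['size'] > 0 && $file['size'] <= $maxBytes;
}
```

With the form in this article you would call it as uploadSizeIsAcceptable($_FILES['fileName'], 10000). Keep in mind it can only run after the whole file has already landed on your server.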

What this all means is that if you only want to accept small 10k images, there's really not much you can do in PHP if the user selects a 10 gigabyte file and hits the submit button. The 10 gigabyte file will start filling up your tmp folder until either the file has been received or something happens to the connection.

The most modern versions of PHP work more closely with the web server and allow for neat things like file-upload progress meters, which means that handling oversized files is a bit more elegant. Another option is to use one of the many Java upload applets (not Javascript). The downside, of course, is that not all web hosts use the most modern version of PHP (4.4 seems to be the most common, while you need a version greater than 5 for file meters). Java applets require the user to have Java installed on their computer, and you may still run into security problems.

So for the most part, we poor developers are stuck with a nearly impossible to style input field on the browser side, and a nearly impossible to secure environment on the server side.

Now that we know the disadvantages, it's time to pull up our boots and start dealing with all the cool things we can do once we start accepting files. The bulk of our work now is on the server; once we set up the form on the web page there's not much else left to do really. And just to recap, here's that form again in its entirety.

<form ENCTYPE="multipart/form-data" METHOD="post" ACTION='sendfile.php'>
   <input type="hidden" name="MAX_FILE_SIZE" value="10000">
   <input id='uploadName' name='fileName' type='file'>
   <input type='submit'>
</form>

The server side really isn't that complicated. PHP does make things very easy on you. Your file will be described in the $_FILES array. The first dimension is the name of the file input field as defined in your form ("fileName" in our example). The second dimension can be either 'name', 'type', 'size', 'tmp_name', or 'error'.

Name is the filename of the file as supplied by the browser. This value is useful but should not be trusted -- ever. You should not use it as the filename for the file on your server, ever, because there's no limit to the mischief a malicious hacker can get into if given that sort of access to your server's filesystem. Type will return information like text/plain or image/png, typical mime-type information about the upload -- the caveat is that it also comes from the browser, and different browsers will give slightly different answers. For instance, Firefox and Opera report a jpg file as "image/jpeg" and IE reports it as "image/pjpeg" (and people wonder why MS has such a bad rep). Size is the size in bytes of the file (which is clogging your temp directory). 'tmp_name' is the temporary path where the server stored the uploaded file (the one clogging your tmp directory). And error is a numeric code describing how the transfer went; 0 (UPLOAD_ERR_OK) means the upload succeeded.
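Since 'error' holds a number rather than a message, it can help to translate it into something readable. The sketch below is one way to do that; describeUploadError is a name made up for this example and the wording of the messages is mine, but the UPLOAD_ERR_* constants are PHP's own.

```php
<?php
// Hypothetical helper (not part of PHP): turns the numeric code found in
// $_FILES['fileName']['error'] into a short explanation.
function describeUploadError(int $code): string
{
    switch ($code) {
        case UPLOAD_ERR_OK:
            return 'The file uploaded successfully.';
        case UPLOAD_ERR_INI_SIZE:
            return 'The file is larger than upload_max_filesize in php.ini.';
        case UPLOAD_ERR_FORM_SIZE:
            return 'The file is larger than the MAX_FILE_SIZE form field.';
        case UPLOAD_ERR_PARTIAL:
            return 'The file was only partially uploaded.';
        case UPLOAD_ERR_NO_FILE:
            return 'No file was uploaded.';
        case UPLOAD_ERR_NO_TMP_DIR:
            return 'The server has no temporary folder.';
        case UPLOAD_ERR_CANT_WRITE:
            return 'The server could not write the file to disk.';
        case UPLOAD_ERR_EXTENSION:
            return 'A PHP extension stopped the upload.';
        default:
            return 'Unknown upload error.';
    }
}
```

A handler for the form in this article could then echo describeUploadError($_FILES['fileName']['error']) whenever the code is anything other than UPLOAD_ERR_OK.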

By the time any of your code starts running, the file has already been received by the server and stored in the tmp directory under the path given by $_FILES['fileName']['tmp_name']. The good news is that your server survived the file upload, and after a quick check of ['size'] you can determine whether the file is worth keeping. If you don't move it anywhere, PHP removes the temporary file itself when the request ends.

After you've determined the filesize is acceptable and the file is of the expected type, you can move it to its final, permanent destination with the move_uploaded_file function. Here's a small PHP file to handle the file upload.

<?php
$maxSize=300000;                            // Only save files smaller than 300,000 bytes
$uploadSize = $_FILES['fileName']['size'];  // The size of our uploaded file
$uploadType = htmlspecialchars($_FILES['fileName']['type']);  // The type of the file (browser-supplied, so escape it).
$uploadName = getcwd().'/uploadedFile.dat'; // Never trust the upload, make your own name

if ($uploadSize<$maxSize) {              // Make sure the file size isn't too big.
   if (move_uploaded_file($_FILES['fileName']['tmp_name'], $uploadName)) {   // save file.
      echo "File saved as $uploadName<BR>";
      echo "It was $uploadSize bytes of type $uploadType";
   } else {                              // move_uploaded_file returns false on failure.
      echo "File not saved.  The upload failed or could not be moved.";
   }
} else {
   echo "File not saved.  It's too big!  Max filesize is $maxSize";
}
?>

This snippet defines $maxSize as 300,000 bytes; if the uploaded file is that large or larger, we'll show an error message and won't save it. We extract the filesize and the file type information from the $_FILES array and store it. Remember that 'fileName' is simply the name of our input field in the browser form. If we had used <input type='file' name='file1'> we would be using $_FILES['file1']['size'].

$uploadName is the most important line. It gets the current working directory (getcwd) and appends '/uploadedFile.dat'. Basically it's creating a file named uploadedFile.dat in the current directory. Even though a filename was uploaded along with the file and we have full access to that name, you should not use it when determining the name of the file on YOUR server. Too many things can go wrong. If you must use the user-supplied file name, you should STILL generate your own unique filename and then use a database or lookup file to associate the user's filename with your generated file name.
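As a rough sketch of that last idea, the two helpers below generate a random name for the stored file and record which browser-supplied name it corresponds to. Both function names are invented for this example, the lookup is just a plain array standing in for your database or lookup file, and random_bytes assumes PHP 7 or newer.

```php
<?php
// Hypothetical helper: builds a path in $uploadDir with a random name.
// Nothing supplied by the browser ends up in the path.
// Assumption: PHP 7+ (random_bytes is not available in older versions).
function makeStoredName(string $uploadDir): string
{
    // 16 random bytes become 32 hexadecimal characters.
    return rtrim($uploadDir, '/') . '/' . bin2hex(random_bytes(16)) . '.dat';
}

// Hypothetical helper: records the user's filename against our generated one.
// The array is a stand-in for a database table or lookup file.
function rememberOriginalName(array $lookup, string $storedName, string $browserName): array
{
    // basename() drops any directory parts the browser may have sent.
    $lookup[basename($storedName)] = basename($browserName);
    return $lookup;
}
```

In the handler you would pass the result of makeStoredName(getcwd()) to move_uploaded_file, then save the lookup entry so you can show the user their own filename later.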

After our variables are initialized we make sure the file isn't too big and if it isn't we use the move_uploaded_file command to transfer the file from our tmp directory to our new permanent location (and filename), then give some feedback. If the filesize was too big we'll just let the user know we didn't save the file.
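The handler only checks the size, and the 'type' value comes from the browser, so it can't be trusted either. If you're expecting a picture, one option is to let PHP look at the file's own bytes with getimagesize, which is part of standard PHP. This is a minimal sketch; isImageOfType is a name made up for this example, and it only reads the image header, so treat it as a sanity check rather than a guarantee the file is harmless.

```php
<?php
// Hypothetical helper (not part of PHP): true only if the file's own header
// identifies it as the expected image type (one of PHP's IMAGETYPE_* constants).
function isImageOfType(string $path, int $expectedType): bool
{
    // getimagesize() returns false when the file is not a recognized image.
    // The @ hides the notice PHP may raise for unreadable or truncated files.
    $info = @getimagesize($path);

    // Index 2 of the result holds the detected IMAGETYPE_* constant.
    return $info !== false && $info[2] === $expectedType;
}
```

Before moving the file you could call isImageOfType($_FILES['fileName']['tmp_name'], IMAGETYPE_JPEG) and refuse the upload when it returns false.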

As you can see, there's a LOT to worry about when you undertake the task of allowing file uploads in your forms. Between the limitations in the client and the limitations in the server, it's very hard to provide a solid user experience. However, even with all the caveats and headaches, browser-based file uploads are still a very powerful tool in your development arsenal.