Module 0311: Uploading files and linking files to a database table

Tak Auyeung, Ph.D.

November 30, 2017

1 About this module

2 Uploading files in PHP

Uploading a file requires an HTTP POST request. HTML forms have a special input element to handle file uploading. The following PHP script generates such a form and also handles the uploaded file:

 
<?php 
  print "<html><head></head><body>"; 
  // names of the file input and the submit button in the form below
  $strUploadedfile = 'uploadedfile';
  $strSubmit = 'submit';
  // the form posts back to this same script
  $myName = basename(__FILE__);

  if (isset($_POST[$strSubmit]))
  {
    print "form is submitted<br>";
    if (isset($_FILES[$strUploadedfile]))
    {
      if ($_FILES[$strUploadedfile]['error'] == UPLOAD_ERR_OK)
      {
        print "file is received<br>";
        print "tmp name is ".$_FILES[$strUploadedfile]['tmp_name']."<br>";
        // move the file from its temporary location to the script's
        // working folder, keeping only the base name of the original
        if (move_uploaded_file(
              $_FILES[$strUploadedfile]['tmp_name'],
              basename($_FILES[$strUploadedfile]['name'])))
        {
          print "file is moved successfully<br>";
        }
      }
    }
  }

  print "<form enctype='multipart/form-data' action='$myName' method='POST'>
        <input type='hidden' name='MAX_FILE_SIZE' value='300000' />
        Select a file to upload: <input type='file' name='$strUploadedfile' />
        <input type='submit' name='$strSubmit' value='Upload now!' />
        </form>";
  print "</body></html>"; 
 
?>

In this code, note how the enctype attribute is explicitly set to multipart/form-data. This is necessary because the content of a file is submitted as a part of a multipart HTTP request. The method attribute is also set to POST because the default GET method does not support file uploads.

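PHP makes the processing of file uploads a bit easier. The superglobal variable $_FILES is specifically provided to make it easy to access uploaded files. With this API (application program interface), there are a few steps to perform, which the script above follows in order: check that the form was submitted, check that $_FILES has an entry for the file input, check that the error code of that entry is UPLOAD_ERR_OK, and move the file out of its temporary location using move_uploaded_file.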
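When the error code is not UPLOAD_ERR_OK, it can help to tell the user why. The following is a minimal sketch of a helper that turns the error code into a message. The function name uploadErrorMessage and the message wording are assumptions made for illustration; the UPLOAD_ERR_* constants are the ones PHP defines.

```php
<?php
// Hypothetical helper (not part of the script above): translate the
// value of $_FILES[...]['error'] into a readable message.
function uploadErrorMessage(int $code): string
{
    switch ($code) {
        case UPLOAD_ERR_OK:
            return 'no error';
        case UPLOAD_ERR_INI_SIZE:
            return 'file exceeds upload_max_filesize in php.ini';
        case UPLOAD_ERR_FORM_SIZE:
            return 'file exceeds MAX_FILE_SIZE from the form';
        case UPLOAD_ERR_PARTIAL:
            return 'file was only partially uploaded';
        case UPLOAD_ERR_NO_FILE:
            return 'no file was uploaded';
        default:
            // other codes (for example, server-side storage problems)
            return 'upload failed with error code ' . $code;
    }
}
```

A script could print uploadErrorMessage($_FILES[$strUploadedfile]['error']) in an else branch of the UPLOAD_ERR_OK check.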

Several issues must be addressed when a file is moved using the move_uploaded_file subroutine. The first issue has to do with how file (and folder) permissions are set up. This is an issue that a web developer must work out with the administrator of a server.
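A script can at least detect a permission problem before attempting the move. The sketch below assumes a hypothetical helper named canStoreUploads; it only reports whether the destination folder exists and is writable by the account that runs PHP, and it does not fix the permissions.

```php
<?php
// Hypothetical helper: report whether $dir can receive moved uploads.
// is_writable() answers for the account that the web server runs PHP as.
function canStoreUploads(string $dir): bool
{
    return is_dir($dir) && is_writable($dir);
}
```

If this returns false for the intended folder, the permissions need to be adjusted on the server before move_uploaded_file can succeed.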

Once a file is uploaded, its actual type can be checked. Linux systems have a command called file that reports the actual nature of a file. This may be a better way to identify the type of a file than relying on the MIME type reported by the browser. The PHP function system can be used to execute a Linux command and collect the result.
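The following is a minimal sketch of calling the file command from PHP. It uses exec rather than system, because system also echoes the output of the command to the page, while exec only collects it. The function names buildFileCommand and detectFileType are assumptions made for illustration, and the sketch assumes a Linux server where the file command is installed.

```php
<?php
// Build the shell command. escapeshellarg() quotes the path so that
// spaces or shell metacharacters in it cannot alter the command.
function buildFileCommand(string $path): string
{
    return 'file --brief --mime-type ' . escapeshellarg($path);
}

// Return the type reported by the file command, or null on failure.
function detectFileType(string $path): ?string
{
    if (!is_file($path)) {
        return null;
    }
    $output = [];
    $status = 0;
    exec(buildFileCommand($path), $output, $status);
    if ($status !== 0 || count($output) === 0) {
        return null;
    }
    return trim($output[0]);
}
```

A script could call detectFileType on the temporary file and refuse to move it unless the result is an expected type such as image/jpeg.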

3 Linking image files to a database

A relational database is not designed to store large files. Although most databases support the BLOB (binary large object) datatype, it is inefficient to store large binary objects in a database.

As such, each image can be kept as an individual file, with a database field linking the entry to that file.

This potentially has a problem due to name collision. A folder cannot contain two files of the same name, and popular names, such as cat.jpg, are likely to be used by more than one upload.

This is why it may not be beneficial to store files using the original filenames as uploaded. Instead, it is more systematic to use an autoincrementing field of a table to create new file names, which are then guaranteed to be unique in a folder.
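As a minimal sketch of this naming scheme, the function below builds a file name from the value of the autoincrementing field. The name storedFileName is an assumption made for illustration, and so is the decision to keep the original extension; the id is assumed to come from the database after the entry is inserted (for example, from PDO::lastInsertId).

```php
<?php
// Hypothetical helper: derive a unique stored name from the entry's id.
// Only the extension of the uploaded name is reused, and only when it
// consists of letters and digits.
function storedFileName(int $id, string $originalName): string
{
    $ext = strtolower(pathinfo($originalName, PATHINFO_EXTENSION));
    if ($ext === '' || !preg_match('/^[a-z0-9]+$/', $ext)) {
        return (string) $id;
    }
    return $id . '.' . $ext;
}
```

Two uploads that are both named cat.jpg then end up as, say, 42.jpg and 43.jpg, because their entries have different ids.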

Because a file is now associated with an entry in a database table, additional attributes can be stored. For example, accessibility fields can be stored so that when the img element is generated, the proper alt attribute can be included.
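For example, an img element could be generated along the following lines. This is a sketch only: the function name imgElement is hypothetical, and the two parameters are assumed to hold the path of the stored file and the alt text read from the database entry.

```php
<?php
// Hypothetical helper: build an img element from values stored with an
// entry. htmlspecialchars() keeps quotes or angle brackets in the stored
// text from breaking out of the attribute.
function imgElement(string $src, string $alt): string
{
    return '<img src="' . htmlspecialchars($src, ENT_QUOTES, 'UTF-8')
         . '" alt="' . htmlspecialchars($alt, ENT_QUOTES, 'UTF-8') . '">';
}
```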

This approach (of storing files in a folder but linked to database entries) also requires special care when files or entries are removed. Both the database entry and the actual file should be removed at the same time. Furthermore, the web script should include some kind of validation to check and ensure consistency. This includes periodically checking that files linked from the database do exist, and making sure that every file has a corresponding database entry.
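The two checks can be sketched as a comparison of two lists of names. The function name findInconsistencies and the keys of the returned array are assumptions made for illustration. The first list is assumed to come from a query on the table, and the second from listing the folder (for example, with scandir, after dropping the . and .. entries).

```php
<?php
// Hypothetical helper: compare the file names linked from the database
// with the file names actually present in the folder.
function findInconsistencies(array $linkedNames, array $folderNames): array
{
    return [
        // linked from the database, but not found in the folder
        'missingFiles' => array_values(array_diff($linkedNames, $folderNames)),
        // present in the folder, but not linked from any entry
        'orphanFiles'  => array_values(array_diff($folderNames, $linkedNames)),
    ];
}
```

What to do with the results (remove orphan files, flag entries with missing files) is a policy decision for the application.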

In a large web application where the number of files to be maintained can be very large, files are stored in many different folders for efficiency. This is because looking up a file in a particular folder takes time. When the number of files increases, a hierarchy of folders proves to be more efficient than a single folder with all the files.
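One possible way to spread files over a hierarchy is to derive the folder names from the numeric id of the entry. The sketch below is only one such scheme, not a prescribed layout: the function name shardedPath, the two-level depth, and the use of the last four digits of the id are all assumptions, and the id is assumed to be non-negative.

```php
<?php
// Hypothetical helper: map an id to a relative path two folders deep.
// The last two digits pick the first folder and the next two digits pick
// the second, so consecutive ids are spread over up to 100 x 100 folders.
function shardedPath(int $id, string $ext): string
{
    $level1 = $id % 100;
    $level2 = intdiv($id, 100) % 100;
    return sprintf('%02d/%02d/%d.%s', $level1, $level2, $id, $ext);
}
```

With this scheme, the file for id 123456 is stored as 56/34/123456.jpg. The folders must exist before the move; mkdir with its recursive argument set to true can create them.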