Developing useful and professional HTML forms requires a lot more work than simple examples might suggest. There is a whole range of concerns to handle, such as loading initial data, client-side validation, server-side validation, optimistic concurrency and performing the final action. This post explores the various steps.
The initial GET request
The initial GET request covers a simple subset of what we need to consider. It is made up of two parts:
- Load any initial data and convert it into a form suitable for rendering.
- Render the form as HTML and transmit that to the client.
Gathering initial data is simpler for forms that are initially blank. Forms that provide editing will need to get the original data, convert it to values that are suitable for HTML inputs and then render the form.
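As a minimal sketch of that conversion step, the snippet below turns a typed record into the text values an HTML form can carry. The `Article` type, its fields and the `toFormValues` helper are hypothetical names used only for illustration.

```typescript
// Hypothetical record type, used only for illustration.
interface Article {
  title: string;
  wordCount: number;
  published: boolean;
  publishOn: Date;
}

// HTML inputs only carry text, so every value becomes a string.
type FormValues = Record<string, string>;

function toFormValues(article: Article): FormValues {
  return {
    title: article.title,
    wordCount: String(article.wordCount),
    // A ticked checkbox with no value attribute submits "on"; empty means unticked here.
    published: article.published ? "on" : "",
    // <input type="date"> expects a yyyy-mm-dd value.
    publishOn: article.publishOn.toISOString().slice(0, 10),
  };
}

// A blank form needs no loading step: every field simply starts empty.
const blankValues: FormValues = { title: "", wordCount: "", published: "", publishOn: "" };

// An edit form starts from the stored record instead.
const editValues = toFormValues({
  title: "Forms",
  wordCount: 1200,
  published: true,
  publishOn: new Date(Date.UTC(2020, 0, 15)),
});
```

Both `blankValues` and `editValues` have the same shape, so the same rendering code can serve new and edit forms.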
The POST request
When users fill in the form they post their results back to the server. Regardless of the client-side code, we need to do a number of things.
- Parse and validate the entered values.
- If valid, perform the intended action.
- If invalid, render the form with error information.
If the intended action is an edit to an existing logical record, we might need some of the original data from the database, and we might need to convert the POST data into a form suitable for our application or database.
If the data is invalid then error information needs to be rendered back with the fields. You need to know which fields are invalid and how to help the user to correct what they’ve entered, and this needs handling in a way that can be rendered back into the form.
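The POST flow described above can be sketched roughly as follows. This is an illustration under assumed names (`parseAndValidate`, `handlePost`, and the `title` and `wordCount` fields are all hypothetical), not a prescription for any particular framework.

```typescript
type FormValues = Record<string, string | undefined>;
type FieldErrors = Record<string, string>;

interface ArticleData {
  title: string;
  wordCount: number;
}

// Either the parsed, typed data or a map of field name to error message.
type ParseResult =
  | { ok: true; data: ArticleData }
  | { ok: false; errors: FieldErrors };

function parseAndValidate(values: FormValues): ParseResult {
  const errors: FieldErrors = {};
  const title = (values.title ?? "").trim();
  if (title === "") errors.title = "Enter a title.";
  const rawCount = (values.wordCount ?? "").trim();
  if (!/^\d+$/.test(rawCount)) errors.wordCount = "Enter a whole number.";
  if (Object.keys(errors).length > 0) return { ok: false, errors };
  // Conversion to application types happens only once the data is known to be valid.
  return { ok: true, data: { title, wordCount: Number(rawCount) } };
}

type PostOutcome =
  | { kind: "done" }
  | { kind: "rerender"; values: FormValues; errors: FieldErrors };

function handlePost(values: FormValues, save: (data: ArticleData) => void): PostOutcome {
  const result = parseAndValidate(values);
  if (!result.ok) {
    // Hand back what the user typed plus per-field errors so the form can be redrawn.
    return { kind: "rerender", values, errors: result.errors };
  }
  save(result.data);
  return { kind: "done" };
}
```

Keeping the errors keyed by field name means the renderer can place each message next to the input it belongs to.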
HTML forms support key-value pairs of text, and that’s it! Any structure is purely coincidental and is not understood by the browser. There is also support for files, where the value is binary and comes with a content type, but that’s beyond the scope of this post.
Often, when working with structured data such as objects and arrays, field names can be generated according to a pattern. Node.js modules such as flat help with bi-directional conversion of structures. More complex conversions will involve numbers, booleans and dates.
The conversion only needs to be done on valid data. Converting invalid data is usually a mistake.
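To make the field-name pattern concrete, here is a small hand-rolled sketch of flattening and unflattening with dotted names. It is a simplified illustration, not the implementation of the flat module, and the `Tree`, `flatten` and `unflatten` names are assumptions made for this example.

```typescript
// A structure whose leaves are all text, as posted by a form.
type Tree = string | Tree[] | { [key: string]: Tree };

// { author: { name: "Ann" } } becomes { "author.name": "Ann" }.
function flatten(tree: Tree, prefix = "", out: Record<string, string> = {}): Record<string, string> {
  if (typeof tree === "string") {
    out[prefix] = tree;
    return out;
  }
  for (const key of Object.keys(tree)) {
    const child = Array.isArray(tree) ? tree[Number(key)] : tree[key];
    flatten(child, prefix === "" ? key : `${prefix}.${key}`, out);
  }
  return out;
}

// The reverse: rebuild objects and arrays from dotted field names.
// A production version should also reject names such as "__proto__".
function unflatten(fields: Record<string, string>): Tree {
  const root: { [key: string]: Tree } = {};
  for (const name of Object.keys(fields)) {
    const parts = name.split(".");
    let node: any = root;
    parts.forEach((part, i) => {
      if (i === parts.length - 1) {
        node[part] = fields[name];
        return;
      }
      if (node[part] === undefined) {
        // A numeric next segment means this level is an array.
        const next = parts[i + 1] ?? "";
        node[part] = /^\d+$/.test(next) ? [] : {};
      }
      node = node[part];
    });
  }
  return root;
}

const fieldValues = flatten({ author: { name: "Ann" }, tags: ["a", "b"] });
const rebuilt = unflatten(fieldValues);
```

The keys of `fieldValues` can be used directly as input `name` attributes, and `unflatten` restores the structure from the posted pairs. Every leaf is still text at this point, so numbers, booleans and dates need a further conversion step.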
Rendering and validation
Rendering is needed for both the GET and POST requests, so it makes sense to use the same plumbing for both. The only real difference is that the POST request involves rendering error information. This is where forms get complicated, as you might add classes to fields to show their status and include a header with more complex information.
You might also have some optional client-side helpers that make the user experience better and provide client-side validation and assistance.
The validation rules are often rendered to the client as HTML5 attributes or ECMAScript, but they must also be enforced on the server to avoid security issues. Inconsistencies between the client and server rules create opportunities for users to exploit the system and for valid data to be rejected.
Keeping an abstract data structure that represents the fields, their validation rules, rendering hints, current value and any error status makes it easier to develop field rendering templates and to provide consistent rendering and validation for all conditions and field types. It also makes debugging much easier, as you can inspect this structure more easily than the complex HTML output.
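One possible shape for such a structure is sketched below. The `Field` interface and the `validate`, `renderField` and `escapeHtml` helpers are hypothetical, and the rules are deliberately minimal; the point is that the same definition drives both the server-side check and the HTML5 attributes sent to the client.

```typescript
interface Field {
  name: string;
  label: string;
  type: "text" | "number";
  required: boolean;
  value: string;
  error: string | null;
}

// Server-side enforcement: returns new fields carrying the posted value and any error.
function validate(fields: Field[], posted: Record<string, string | undefined>): Field[] {
  return fields.map((field) => {
    const value = (posted[field.name] ?? "").trim();
    let error: string | null = null;
    if (field.required && value === "") {
      error = `${field.label} is required.`;
    } else if (field.type === "number" && value !== "" && !/^-?\d+(\.\d+)?$/.test(value)) {
      error = `${field.label} must be a number.`;
    }
    return { ...field, value, error };
  });
}

function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// The same structure renders the input, its HTML5 rules and its error state.
function renderField(field: Field): string {
  const name = escapeHtml(field.name);
  const cssClass = field.error ? "field invalid" : "field";
  const required = field.required ? " required" : "";
  const message = field.error ? `<p class="error">${escapeHtml(field.error)}</p>` : "";
  return (
    `<div class="${cssClass}">` +
    `<label for="${name}">${escapeHtml(field.label)}</label>` +
    `<input id="${name}" name="${name}" type="${field.type}" value="${escapeHtml(field.value)}"${required}>` +
    message +
    `</div>`
  );
}

const definitions: Field[] = [
  { name: "title", label: "Title", type: "text", required: true, value: "", error: null },
  { name: "pages", label: "Pages", type: "number", required: false, value: "", error: null },
];
const checked = validate(definitions, { title: "", pages: "abc" });
const html = checked.map(renderField).join("\n");
```

Inspecting `checked` while debugging is far easier than reading `html`, which is the benefit described above.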
Optimistic concurrency is extremely important when editing data on the web. It allows people to modify data out-of-process without the server keeping locks.
There are two common models, although sometimes something more complex is done:
- Fail if modified.
- Last one wins.
The first approach requires sending information with the form that lets you determine whether the data on the server has been changed by another user or process since the form was presented. This might be a hash or revision identifier of a logical record, such as that used by CouchDB, or it might be a complete copy of the original data so that each value can be checked individually.
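A fail-if-modified check with a revision identifier might look roughly like this. The in-memory `store` stands in for a real database, and `updateIfUnmodified` is a hypothetical helper; the posted revision is assumed to have travelled with the form, for example in a hidden input.

```typescript
interface Stored {
  revision: number;
  data: Record<string, string>;
}

// In-memory stand-in for a database table, for illustration only.
const store = new Map<string, Stored>();

type UpdateResult =
  | { ok: true; revision: number }
  | { ok: false; reason: "conflict" | "missing" };

// postedRevision is the revision that was rendered into the form on the GET request.
function updateIfUnmodified(id: string, postedRevision: number, data: Record<string, string>): UpdateResult {
  const current = store.get(id);
  if (current === undefined) return { ok: false, reason: "missing" };
  if (current.revision !== postedRevision) {
    // Someone else saved since this form was presented.
    return { ok: false, reason: "conflict" };
  }
  const revision = current.revision + 1;
  store.set(id, { revision, data });
  return { ok: true, revision };
}

store.set("article-1", { revision: 1, data: { title: "Draft" } });
// Two users loaded the form at revision 1; the first save wins, the second is told.
const firstSave = updateIfUnmodified("article-1", 1, { title: "First edit" });
const secondSave = updateIfUnmodified("article-1", 1, { title: "Second edit" });
```

In a real database the comparison and the write need to happen atomically, which is something this single-threaded sketch does not have to worry about.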
The second approach is simpler and often the most appropriate, depending on the granularity and nature of the data. With either approach, one user loses out. The difference with last one wins is that the first user loses and doesn’t get to know about it. When considering how and why data is updated, there are many scenarios where forcing the second user to lose and try again gives the same result as ignoring what the first user did.
Some database technologies include optimistic concurrency systems. With these, it is often useful to fold the initialisation step into the POST request to fetch what we need to overwrite existing data without error, especially when implementing a last-one-wins approach.
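As a rough illustration of that idea, the sketch below implements last one wins on top of a store that insists on a matching revision for every write. The `records` map, `put` and `overwrite` are hypothetical stand-ins rather than any real database API.

```typescript
interface StoredRecord {
  revision: number;
  data: Record<string, string>;
}

// In-memory stand-in for a database with built-in revision checks.
const records = new Map<string, StoredRecord>();

function currentRevision(id: string): number {
  const existing = records.get(id);
  return existing === undefined ? 0 : existing.revision;
}

// The store rejects any write whose revision is stale.
function put(id: string, expectedRevision: number, data: Record<string, string>): boolean {
  if (currentRevision(id) !== expectedRevision) return false;
  records.set(id, { revision: expectedRevision + 1, data });
  return true;
}

// Last one wins: the initialisation read happens inside the POST,
// so the write always carries whatever revision is current right now.
function overwrite(id: string, data: Record<string, string>): number {
  const revision = currentRevision(id);
  if (!put(id, revision, data)) {
    // With a real database another write could land between the read and the write.
    throw new Error("Record changed between read and write; retry.");
  }
  return revision + 1;
}

const firstRevision = overwrite("article-1", { title: "First edit" });
const secondRevision = overwrite("article-1", { title: "Second edit" });
```

Neither caller needs to send a revision with the form; the second write simply replaces the first.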
This post covers some of the issues that can arise when developing HTML forms. It is not comprehensive, and it assumes the usual level of support in your HTTP server for general HTTP request handling.