Should web forms allow invalid input?

Many online applications instruct users how to format their form input, check whether the input is formatted correctly, and issue error messages if it’s entered “wrong”, forcing the user to make corrections or (worse) start over.

Examples: credit cards, “don’t enter spaces”; phone numbers, “no punctuation!” or “must enter as (123) 456-7890”; SSN / SIN / NIN, “no spaces!” or “must include spaces!”, etc.

Why go to the effort of warning, parsing, detecting, scolding, and blocking users when, for many of the most common input elements, the software can simply accept the data as-is and scrub it into acceptable input with near-trivial transformations? Especially since there is a night-and-day difference between allowing users to enter standard, accepted, published formats — and telling the user that they are wrong and must re-do their work.
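To make “near-trivial transformations” concrete, here is a minimal sketch of the kind of scrubbing I have in mind. The function names, the set of separators that get dropped, and the 12 to 19 digit range for card numbers are my own assumptions for illustration, not any particular site’s rules.

```python
import re

def scrub_separators(raw: str) -> str:
    """Drop whitespace, dashes, dots and parentheses; keep everything else."""
    return re.sub(r"[\s\-.()]", "", raw)

def normalize_card_number(raw: str) -> str:
    """Accept a card number with or without spaces/dashes; return digits only."""
    digits = scrub_separators(raw)
    # Assumed length range; a real form would also run a checksum.
    if not re.fullmatch(r"[0-9]{12,19}", digits):
        raise ValueError("card number must contain 12 to 19 digits")
    return digits

def normalize_ssn(raw: str) -> str:
    """Accept a US SSN with or without separators; return it as NNN-NN-NNNN."""
    digits = scrub_separators(raw)
    if not re.fullmatch(r"[0-9]{9}", digits):
        raise ValueError("SSN must contain exactly 9 digits")
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"
```

With this, “4111 1111 1111 1111” and “4111-1111-1111-1111” are stored identically, and the user is only bothered when the input cannot be turned into digits at all.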

Why does this input technique seem ubiquitous? Explicit requirements? A common design pattern? Modern software layers? Specific web technologies? Cost? Risk aversion? Inexperienced developers? Offshoring? Or do many developers simply see no difference and consider these equivalent solutions?

Given that it seems like such a poor user experience, why does it continue to be so common?

9

Because user interfaces are often built by programmers. Programmers tend to think of user interfaces like interfaces to any other subsystem: Input should be provided in an unambiguous specified format, and invalid input should be rejected.

Supporting multiple input formats or flexible formats (e.g. optional dashes and spaces) does not provide any value in systems integration; it only increases complexity and the risk of undetected errors.

A UX expert will argue that human-computer interfaces are different from systems integration, and that a more flexible input format provides value in a user interface. But a typical developer without UX training will not think like this by default.

Note that supporting only a single unambiguous format is the simplest and safest option. Flexible formats require some design decisions (e.g. how many dashes, spaces and parentheses do we allow in a phone number? Are they only allowed at particular positions or everywhere?). So there is a certain (small) cost compared to only allowing a single input format.

1

Technical reasons

The easiest way to put a form online is to design it in plain HTML. The browser is left with the responsibility of collecting the data and posting it to the server when the user submits the form. It’s fast to develop, cheap, and doesn’t need much expertise. All this comes at the expense of user experience, because the input is only checked by the server after submission.

Interactive input control, validation and warnings require more work, either on the client side (e.g. JS or other scripting languages) or on the server (especially for more complex validations that depend on previously entered data), in the latter case with active communication between client and server (e.g. Ajax) and more bandwidth. The user experience is much better, but it requires more expertise, more time and more money.

Ambiguity, complexity and arbitration

Some data is ambiguous. Dates are typically entered as DD/MM/YYYY in some parts of the world, MM/DD/YYYY in other parts (and YYYY-MM-DD in ISO’s world). And old users (me?) still try YY instead of YYYY. So 10/12/16 would be highly ambiguous and too risky to interpret freely.
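To illustrate how bad 10/12/16 really is, here is a small sketch that tries the three orderings mentioned above and collects every reading that yields a valid date. The format table and the function name are mine, chosen just for this example; two-digit years are expanded by Python’s own `%y` rule.

```python
from datetime import date, datetime

# Candidate orderings for a slash-separated date with a two-digit year.
CANDIDATE_FORMATS = {
    "day/month/year": "%d/%m/%y",
    "month/day/year": "%m/%d/%y",
    "year/month/day": "%y/%m/%d",
}

def possible_dates(raw: str) -> dict:
    """Return every reading of `raw` that is a valid calendar date."""
    readings = {}
    for label, fmt in CANDIDATE_FORMATS.items():
        try:
            readings[label] = datetime.strptime(raw, fmt).date()
        except ValueError:
            # This ordering does not produce a real date; skip it.
            pass
    return readings

for label, value in possible_dates("10/12/16").items():
    print(label, value.isoformat())
```

All three orderings give a valid but different date for 10/12/16, so the software has no basis for picking one silently. An input like 31/12/99 only fits one ordering and could be accepted as typed.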

One could imagine adding a disambiguation dialogue. But the third time users have to disambiguate something that appears clear to them, they’ll start to get upset. So either you keep it complex, or you let your web app learn the user’s preferences. One could allow an even easier dialogue: “next Monday” is not ambiguous, but it would require more intelligence to analyse the answer.
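As a rough idea of what analysing “next Monday” involves, here is a sketch that handles only phrases of the form “next <weekday>”. It assumes that “next Monday” means the first Monday strictly after today; some speakers mean the Monday of the following week, so even this is a design decision. The function name is hypothetical.

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

def resolve_next_weekday(phrase: str, today: date) -> date:
    """Resolve 'next <weekday>' to the first such day strictly after `today`."""
    words = phrase.lower().split()
    if len(words) != 2 or words[0] != "next" or words[1] not in WEEKDAYS:
        raise ValueError(f"cannot interpret {phrase!r}")
    target = WEEKDAYS.index(words[1])  # Monday is 0, as in date.weekday()
    # Number of days to add, always between 1 and 7 (never 'today' itself).
    days_ahead = (target - today.weekday() - 1) % 7 + 1
    return today + timedelta(days=days_ahead)
```

Anything outside this tiny grammar (“Monday week”, “in a fortnight”) is rejected, which shows how quickly the “more intelligence” requirement grows.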

And here come the chatbots that will make webforms belong to the past…

Poor requirement analysis and lack of anticipation

Not a week passes without someone in my company asking for more controls on some fields. “It will avoid many stupid errors”. Yes! But it might also reject many valid inputs that were not considered when the control was requested.

Typically, international phone numbers, foreign addresses and foreign postal codes are frequently forgotten by local companies that are not used to handling such cases every day (I live abroad and encounter such cases at least once a quarter!).

But please, don’t blame the programmers!

It would be easy to blame the wrong folk: “It’s IT’s fault”, or “It’s lazy or incompetent programmers”.

I can’t agree with that. In most cases a customer or a manager comes and says: “I want this new web form for yesterday, and it should cost less than what you are going to quote me”. And the programmer will then look for the fastest, easiest and cheapest solution… (go back to “Technical reasons” above)

I really think that businesses only start to acknowledge the importance of UX when they fear losing customers because competitors do UX better. Only then is there enough investment in this kind of issue.

3

Any of your suggestions may be the reason for the described phenomenon, and often it will not have been a conscious decision to make a web form behave like that, but… in general, in any type of communication, it is a bad idea to go with “I think I know what you mean”.

The more explicit you force the user to be, the more likely you are to get what the user intended to provide.

If you allow no dashes or no spaces, it is easier to miss a character and not notice it. If you ask for an email address once, or allow the second one to be copied and pasted, it is easier to type it wrong and not notice it. Yes, it is pushing work to the user, but for a good reason: the developer cannot possibly know what the right data should be, and guessing is not what we like to do. So the best you can do is to make it clear upfront what the system expects from users and make them “spell it out”, rather than let them mumble it and have you fix it later, hoping you will get it right.

Why does this input technique seem ubiquitous? Explicit requirements?
A common design pattern? Modern software layers? Specific web
technologies? Cost? Risk aversion? Inexperienced developers?
Offshoring? Or do many developers simply see no difference and
consider these equivalent solutions?

There is no good technical reason. The reason why you often see this is that many programmers are lazy programmers who don’t think or care about end user usability.

If the data can indeed be made acceptable by simple transformations (e.g. removing or adding a dash or spaces in telephone and social security numbers), it absolutely should be. The fact that many sites don’t do this is because, frankly, they were written by bad programmers.

Of course, that’s not a 100% true assessment. Sometimes the best solution takes more time to build than the “good enough” solution, and business constraints don’t allow for the best solution.

It’s the same reason why you see messages all the time of the form “you have 1 item(s) …”. The computer is smart enough to know whether to pluralize or not, but a programmer is just too lazy or just doesn’t care.
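For what it’s worth, the “item(s)” fix is about this much work in English. The function name is made up for the example, and a real application that supports several languages would need proper plural rules rather than a single `if`.

```python
def item_count_message(count: int) -> str:
    """Build the message with the noun matching the count (English only)."""
    noun = "item" if count == 1 else "items"
    return f"you have {count} {noun}"

for n in (0, 1, 2):
    print(item_count_message(n))
```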

1

Given that it seems like such a poor user experience, why does it continue to be so common?

What about:

Laziness…

… and amateurish work in general. And the fact that many programmers don’t care about user experience.

Take phone numbers. In my country, national phone numbers are composed of ten digits, the first digit being necessarily zero. While a number can be entered as:

  • 0123456789
  • 01 23 45 67 89
  • +33 1 23 45 67 89
  • +33 (0)1 23 45 67 89

most websites accept only the first form, independently of the fact that:

  • It’s difficult to enter and error prone (half of the time when I type my phone number using the first form, I make a mistake… the other half I simply intentionally enter a number which doesn’t exist).

  • It makes it impossible for a foreigner to use the form (register on a site, for instance).

Instead, what a professional developer would do is to allow any phone number¹ to be entered, and then try to parse it. Actually, it’s not even that hard: Twilio does a great job of parsing numbers for you.
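Even without an external service, the four forms listed above can be brought to a single representation with a few lines. This is only a sketch under my own assumptions: the function name is hypothetical, French national numbers are rewritten with a +33 prefix, any other number starting with + is kept as typed once separators are removed, and the 50-character cap is an arbitrary technical limit.

```python
import re

def normalize_phone(raw: str) -> str:
    """Normalize a French or international phone number to +<digits>."""
    if len(raw) > 50:  # arbitrary technical limit
        raise ValueError("input too long to be a phone number")
    compact = re.sub(r"[\s.\-]", "", raw)          # drop spaces, dots, dashes
    compact = re.sub(r"^\+33\(0\)", "+33", compact)  # "+33 (0)1 ..." form
    compact = re.sub(r"[()]", "", compact)         # drop remaining brackets
    if re.fullmatch(r"0[0-9]{9}", compact):        # national form: 0 + 9 digits
        return "+33" + compact[1:]
    if compact.startswith("+33"):
        if re.fullmatch(r"\+33[0-9]{9}", compact):
            return compact
        raise ValueError("French numbers have 9 digits after +33")
    if re.fullmatch(r"\+[0-9]{6,15}", compact):    # assumed rule for other countries
        return compact
    raise ValueError(f"cannot interpret {raw!r} as a phone number")

for form in ("0123456789", "01 23 45 67 89",
             "+33 1 23 45 67 89", "+33 (0)1 23 45 67 89"):
    print(normalize_phone(form))
```

All four spellings end up as the same stored value, and a foreigner typing a number with their own country code is no longer locked out.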

The same applies to postal addresses: Google API does a great job of parsing an address represented as a string.

However, it makes sense to be strict when parsing some types of fields. For instance, using a strict format for a date can be useful if you deal with multiple cultures: “04/07/2016” means July 4th, 2016 in France, but April 7th, 2016 in the USA, so letting the user enter the date in any format and trying to figure out what the date is may lead to unwanted results.

Talking about spaces…

In fields which are just a bunch of numbers or characters, such as phone numbers, spaces matter. Try a little experiment: find any Windows or Microsoft Office disk with a serial number on it. Copy this number in a text editor and remove spaces (you’ll end up with 25 random characters). Try to copy it by hand from the screen to a piece of paper (without moving a cursor on the screen). Now try again while adding spaces or newlines every four characters. Which one was easier?

Unfortunately, the value of spaces is underestimated by many programmers. Not only do many fields not accept spaces (phone numbers, credit card numbers), but many displayed numbers are missing spaces, which matters a lot when those numbers have to be spelled out during a phone call or copied by hand (and if they don’t, why display them at all?).

For instance, my last order from Amazon displays a shipment number as 6Z00148794199. Great that I can simply copy-paste it to the courier website for tracking; less great is the fact that when I received the shipment, the agent had to copy the code by hand from my mobile phone on her computer in order to access the information. How difficult would it be for the company to display the number as 6Z 001 487 941 99?

The same order is identified by Amazon as 402-9261109-4961946. At least they made the effort to add dashes. Unfortunately, those dashes don’t help much if someone has to call Amazon’s support and spell the ID. Would it be so hard to write it as 402 926 110 949 619 46 instead?

Recently, I took a trip in Scotland using ScotRail. I ordered the tickets through their website, and received an e-mail containing the number CFJRC9T9 to use to retrieve the tickets at the railway station. Now how easy is it to type eight meaningless characters on a self-service ticket machine, with people walking all around? Wouldn’t it be much simpler if it was CF JR C9 T9 instead?
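To give an idea of the effort involved, here is a sketch of both directions: grouping a code for display and accepting it back with or without the spaces. The function names and the idea of passing explicit group sizes are my own choices for the example.

```python
from typing import Sequence

def group_for_display(code: str, sizes: Sequence[int]) -> str:
    """Split `code` into chunks of the given sizes, separated by spaces."""
    chunks = []
    position = 0
    for size in sizes:
        chunks.append(code[position:position + size])
        position += size
    if position < len(code):
        # Keep any characters left over after the last requested group.
        chunks.append(code[position:])
    return " ".join(chunk for chunk in chunks if chunk)

def ungroup(text: str) -> str:
    """Accept the code back with or without spaces."""
    return "".join(text.split())

print(group_for_display("CFJRC9T9", [2, 2, 2, 2]))
print(group_for_display("6Z00148794199", [2, 3, 3, 3, 2]))
```

These two calls produce the spaced forms suggested above for the ScotRail and shipment numbers, and `ungroup` turns either spelling back into the stored value.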


¹ One can be quite liberal in accepting truly anything sane. Limits may be purely technical; for instance, the length of the field can be limited to 50 characters, since the probability that a longer value is a wrong number is close to 100% (if not exactly 100%).

1

Sometimes it is not clear what the user’s intent is, or what date, number, etc. they are trying to enter.

Sometimes frameworks such as Ruby on Rails are used by programmers who are not skilled in the user experience techniques that help users with data formats on web forms.

Also, some programmers do not know the HTML5 input types now available on the web, which aid in entering the correct format and are used by client-side devices.

Sometimes internationalization and localization can make validation harder to perform client-side.

Web text fields and dropdowns are generic in themselves, often just using basic input and select elements, so correctly validating the actual semantic use of the elements, e.g. ‘age’ or ‘color’, requires extra work by the programmers.

Good practices can help improve the situation, such as

  • dropdowns for a limited and fixed set of allowable entries (avoiding text fields and their inherent validation issues)
  • errors indicated by text, not just by red: use bold and an ‘x’
  • ajax for each field, showing whether it is valid ‘as you go’ instead of after form submission
  • placeholder text describing the data and any format for text input fields

1

Because…

  • it’s faster
  • it’s easier
  • doing it properly requires knowing a lot more than alert('Fail!');

And nobody cares enough to provide more time to solve problems properly, insist on stronger work from the developers, or hire developers that are actually capable of doing the work.

Why does Chase bank only allow alphanumeric passwords, even though that’s just a core library string-escape method problem in most cases and makes all their accounts hugely vulnerable to rainbow tables in the event of a breach? There are no mysteries here. It’s just people sucking.

Also, I really need to change my bank.

It’s not laziness. As you point out, it’s the same amount of coding to parse the input either way. But I wonder if any of you have had this conversation…

QA. “ok and then I typed ‘hello world’ in the telephone field and it just saved it to the db”

Dev. “Well, there are no requirements for checking the user’s telephone number. What do you want it to do?”

BA. “hmm, it should throw a nice error saying ‘only numbers please’”

QA. “what about american style phone numbers with hyphens and brackets etc?”

BA. “ok, ‘only numbers, brackets, spaces and/or hyphens’”

QA. “what about international prefixes? you need a plus”

BA. “ok, ‘only numbers, brackets, spaces and/or hyphens and pluses’”

UX. “why not let the user type whatever they like?”

QA. “we would have to parse any phone number format in the world!”

BA. “dev, what’s the estimate for you to write something which parses any phone number in the world?”

Dev. “do you have a list of the formats?”

BA. “no.”

Dev. “infinity?”

BA. “what about the message?”

Dev. “1 hour”

BA. “Ok just do the message”

The fact of the matter is that the first solution, just saving whatever the user typed in, was fine.

But the process of testing and requiring specifications pushed the team first into requiring that numbers be validated, and then into writing an easy-to-achieve specification.
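For reference, the “1 hour” version that comes out of that conversation looks roughly like this. It is a sketch of the BA’s final rule only; the function name and the exact wording of the message are my paraphrase.

```python
import re
from typing import Optional

# The BA's final rule: digits, brackets, spaces, hyphens and pluses.
ALLOWED_CHARACTERS = re.compile(r"[0-9()\s+\-]+")

def phone_field_error(raw: str) -> Optional[str]:
    """Return an error message, or None when the field passes the rule."""
    if ALLOWED_CHARACTERS.fullmatch(raw):
        return None
    return "Only numbers, brackets, spaces, hyphens and pluses, please."

print(phone_field_error("hello world"))
print(phone_field_error("+1 (555) 010-9999"))
print(phone_field_error("(((("))
```

Note that it is a character check, not a parser: it stops ‘hello world’, but a field containing only brackets passes just as well as a real number.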

2
