---
title: Towards simple language on the web
date: 2017-06-26
tags: [a11y]
description: Two years ago, I was tasked with creating a website with a "simple German" translation. Little did I know back then that this was the beginning of a fantastic journey.
---

Two years ago, I was tasked with creating a website. One detail
of this project was that it would have a "simple German" translation in
addition to a German one. Little did I know back then that this was the
beginning of a fantastic journey.

## The Standards

First things first. There is a standard that requires us to provide simple
language alternatives:

> **Reading Level**: When text requires reading ability more advanced than the
> lower secondary education level after removal of proper names and titles,
> supplemental content, or a version that does not require reading ability more
> advanced than the lower secondary education level, is available.
> — [WCAG 2.0][1]

On the web, an alternate version of a page in a different language is usually
marked up as a link somewhere in the document's head:

    <link rel="alternate" hreflang="de-simple" href="…" />

This allows browsers and search engines to automatically discover content in a
language you understand. Unfortunately, the language tag "de-simple" did not
exist at the time.
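
To make the discovery step concrete, here is a minimal sketch of how a tool might collect these links from a page. It uses only Python's standard `html.parser`. The names `AlternateLinkParser` and `find_alternates` and the URL `/einfach/` are made up for illustration. This is not how any particular browser or search engine does it.

```python
from html.parser import HTMLParser


class AlternateLinkParser(HTMLParser):
    """Collect <link rel="alternate" hreflang="..."> entries (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.alternates = {}  # maps lowercased hreflang -> href

    def handle_starttag(self, tag, attrs):
        # Self-closing tags like <link ... /> also end up here by default.
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        hreflang = attributes.get("hreflang")
        href = attributes.get("href")
        if "alternate" in rel_values and hreflang and href:
            self.alternates[hreflang.lower()] = href


def find_alternates(html):
    """Return a dict of all language alternates declared in the document."""
    parser = AlternateLinkParser()
    parser.feed(html)
    parser.close()
    return parser.alternates


page = (
    '<html><head>'
    '<link rel="alternate" hreflang="de-simple" href="/einfach/" />'
    '<link rel="stylesheet" href="/style.css">'
    '</head></html>'
)
print(find_alternates(page))
```

For the example page, only the first link is reported, because the stylesheet link has neither `rel="alternate"` nor an `hreflang` attribute.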

Most people know the language tags defined in [ISO 639][2], but the web
actually uses language tags as defined in [BCP 47][3]. These are based on
ISO 639 tags, but they can be further refined with script, region, variant, or
private-use subtags. Common examples of this are "en-GB" (English as used in
Great Britain) or "sr-Latn" (Serbian written using the Latin script). The [full
list of subtags][4] is maintained by IANA.
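
The structure of such a tag can be illustrated with a small sketch. It is deliberately simplified: it only knows about language, script, region, variant, and private-use subtags, and it ignores extended language subtags, other extensions, and grandfathered tags. The function name `parse_language_tag` is my own, and the code checks only the syntax, not whether a subtag is actually registered with IANA.

```python
def parse_language_tag(tag):
    """Split a BCP 47 language tag into its parts (simplified sketch)."""
    subtags = tag.split("-")
    parts = {
        "language": subtags[0].lower(),
        "script": None,
        "region": None,
        "variants": [],
        "private_use": [],
    }
    rest = subtags[1:]
    for index, subtag in enumerate(rest):
        if subtag.lower() == "x":
            # Everything after the singleton "x" is private use.
            parts["private_use"] = [s.lower() for s in rest[index + 1:]]
            break
        no_region_or_variant_yet = not parts["region"] and not parts["variants"]
        if index == 0 and len(subtag) == 4 and subtag.isalpha():
            # Four letters directly after the language: a script, e.g. "Latn".
            parts["script"] = subtag.title()
        elif no_region_or_variant_yet and (
            (len(subtag) == 2 and subtag.isalpha())
            or (len(subtag) == 3 and subtag.isdigit())
        ):
            # Two letters or three digits: a region, e.g. "GB".
            parts["region"] = subtag.upper()
        elif subtag.isalnum() and (
            5 <= len(subtag) <= 8 or (len(subtag) == 4 and subtag[0].isdigit())
        ):
            # Five to eight characters (or four starting with a digit): a variant.
            parts["variants"].append(subtag.lower())
        else:
            raise ValueError(f"unsupported subtag: {subtag!r}")
    return parts


for example in ["en-GB", "sr-Latn", "de-simple", "de-x-simple"]:
    print(example, parse_language_tag(example))
```

Note how "simple" ends up in `variants` for `de-simple` but in `private_use` for `de-x-simple`. Software that does not know the private agreement behind the `x` part has no reason to treat it as meaningful.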

Of course, the website that had started this whole topic for me could not wait,
so I implemented it using the private-use tag `de-x-simple`. But this meant
that browsers would just ignore the `x-simple` part. I wanted to find a
solution that had actual benefits for users.

I started reading the archives of the [IETF-languages mailing list][10].
There had already been some controversial discussion on the topic in 2006, but
it had since died down. I sent [my initial mail][5] (along with [another one][11]
to the Web Accessibility Initiative) in September 2015; [Michael Everson][6]
helped out with some more tangible proposals ([basiceng][7] and [wpsimple][8])
in October; and sometime in December, the ["simple"
variant subtag][9] was finally accepted.

## Summary of the Discussion

You may ask yourself: "Why did it take three months of discussion to add four
lines to that registry?" There are many reasons for that.

First of all, let me say that some people on that mailing list should seriously
consider fixing their mail setups. It is very hard to follow the discussion if
some key contributors just omit the `In-Reply-To` headers.

That aside, BCP 47 language tags are – similar to many other web standards –
very important. They govern a global network that we all use. It is imperative
to get them right.  And getting languages right is inherently difficult. What
constitutes a language is a political question as much as it is a linguistic
one. Mix that with engineering and you have very few people who can make
informed decisions.

In the case of simple language, it boils down to this: Are there fixed rules
for this variant? The answer is yes, there are more than enough of them:
[Basic English][16], the [US Plain Writing Act][17], [Leichte Sprache][18], and
many more. Unfortunately, these are all distinct systems for specific
languages. The only thing they all have in common is their intent to somehow
be simple.

This makes sense if you factor in the many different target groups: children,
second language learners, people with cognitive disabilities, non-experts
reading a scientific text, … – they all have slightly different needs.
Many people therefore argued that a general `simple` subtag would not mean
anything and we should have distinct subtags for the individual well-defined
systems instead. But even if different target groups have different needs,
most forms of simplification help all of them:

> […] the problem with divergent E2R [easy to read] user groups is usually solved
> indirectly by just developing websites that are simple enough and include
> reduced amount of the most relevant content. This manner of approach
> guarantees an accessible site or a section of a site for almost all
> E2R-users. — [Sami
> Älli][19]

So in the end, a generic `simple` subtag was accepted. However, it was clearly
stated that additional, more specific variants could be added on top of that,
resulting in language tags like `de-simple-leicht`. This would automatically
fall back to `de-simple` if the more specific variant was not available.

> If I'm a user I want "simple" English.  Users could care less about a
> distinction between Voice of America English, Wikipedia Simplified English
> and Basic English.  I just want an English I might be able to understand a
> bit better than normal English.  I can't specify en-US-VoA in
> http-accept-language, because it'll match "en-US" not "en-US-wpsimple".  So
> those tags are useless to the user.  (However if we wanted to consider
> en-US-simple-VoA and en-US-simple-odgenbe and en-US-simple-wp that might
> work). — [Shawn Steele][12]
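
The fallback behaviour discussed here can be sketched as follows. This is a simplified version of the "Lookup" matching scheme from RFC 4647: the requested tag is shortened from the end, one subtag at a time, until something matches. The function name `lookup` and the example tags are my own, and wildcard ranges are not handled.

```python
def lookup(language_range, available_tags, default=None):
    """Find the best available tag by progressively truncating the request."""
    available = {tag.lower(): tag for tag in available_tags}
    subtags = language_range.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in available:
            return available[candidate]
        # Drop the last subtag and try again with the shorter tag.
        subtags.pop()
        # A single-character subtag (such as the private-use "x") cannot end
        # a tag, so it is removed together with the subtag that followed it.
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return default


print(lookup("de-simple-leicht", ["de", "de-simple"]))
print(lookup("de-x-simple", ["de", "en"]))
print(lookup("en-US-VoA", ["en-US", "en-US-wpsimple"]))
```

The first call falls back to `de-simple`, which is the point of layering specific variants on top of the generic one. The second call skips straight to `de`, and the third ends up at `en-US` rather than at `en-US-wpsimple`, which is exactly the problem described in the quote above.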

## Is a New Language Variant the Answer?

The question remains whether the accessibility issue can be solved on a
language level alone.  Some people argued that complex websites cannot be
simplified just by using a different language:

> Language can be made simple or complex. That's not the main problem in many
> cases though. The bigger problem is that complex ideas will remain complex,
> even when described in simple language. — [Paul Bohman][20]

Other people argued the exact opposite, and it seemed to be a matter of
personal belief:

> Explaining complex ideas is difficult. Explaining them with simple language
> is more difficult. But complex ideas can be explained in simple language.
> Thousands of very good teachers do that every day. — [Chaals McCathie
> Nevile][21]

And it is not only the concepts presented in a text that can be an issue.
Navigation, layout, typography, interactions and much more can considerably add
to complexity as well:

> […] understandability of text seems to be related to its presentation for many
> people. For example, it is not just the complexity of the text itself but how
> well it is organized (for example using headers, lists, and structures), and
> how it is presented (font, spacing, width, etc.). — [Shadi Abou-Zahra][22]

## Adoption

More than a year later I came back to see who had adopted this new language
tag. Unsurprisingly, nobody seemed to have noticed. We had failed to get the
relevant experts involved in the discussion, so now they either did not know
about the new language tag, or – even worse – they did not care.

I sent [yet another mail][13] to the Web Accessibility Initiative asking about
including my pattern as a recommended technique for WCAG 2.0. The answer was
simple: If browsers do not support it, we cannot recommend it. A classic
chicken-and-egg dilemma.

Someone proposed that I should write a browser extension that automatically
switches to the simple version of a website if one is available. I created
a [simple extension][14] that should
work in Chrome, Firefox, and some other browsers. Unfortunately, I am very bad
at advertising my projects, so nothing came of it.
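
The core decision such a tool has to make is small. The following sketch is not the code of the extension linked above, just an illustration in Python with a made-up function name and made-up URLs: given the language of the current page and the alternates it declares, pick the one that carries the `simple` variant.

```python
def find_simple_alternate(page_language, alternates):
    """Return the URL of a "simple" alternate in the page's language, or None.

    `alternates` maps hreflang values to URLs, for example as collected
    from the page's <link rel="alternate"> elements.
    """
    primary = page_language.lower().split("-")[0]
    for hreflang, href in alternates.items():
        subtags = hreflang.lower().split("-")
        # Only subtags before a private-use "x" count as registered variants.
        if "x" in subtags:
            subtags = subtags[: subtags.index("x")]
        if subtags[0] == primary and "simple" in subtags[1:]:
            return href
    return None


alternates = {"en": "/en/", "de-simple": "/einfach/"}
print(find_simple_alternate("de-DE", alternates))
print(find_simple_alternate("fr", alternates))
```

A German page gets redirected to `/einfach/`, while a French page has no simple alternate here and stays where it is.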

So what are the next steps for me? I will try to contact big projects that
publish simple content already. An obvious choice is [Wikipedia][23] (which had
[ferocious debates][15] about deleting its simple version). But there are
probably more publishers that might benefit from this.

## Conclusion

What have we learned? Naming languages is a fascinating and difficult area.
Simple language is a controversial topic that has not yet found consensus in the
W3C-WAI, IETF-languages, or Wikipedia communities. And it is not actually that
hard to get involved in web standards, even for a non-expert like me.

As a final word, let me say that you will probably never need the `simple`
variant subtag. In the vast majority of cases you should have a single,
accessible version instead of creating custom alternatives. If you want to know
how to keep your texts simple, the [US Federal Plain Language Guidelines][24]
are a good place to start.

[1]: https://www.w3.org/TR/WCAG20/#meaning
[2]: https://www.iso.org/iso-639-language-codes.html
[3]: https://tools.ietf.org/html/bcp47
[4]: https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry
[5]: https://mailarchive.ietf.org/arch/msg/ietf-languages/hX9pMnjrDzlSa8igy7y1Jqe_qjU
[6]: https://en.wikipedia.org/wiki/Michael_Everson
[7]: https://mailarchive.ietf.org/arch/msg/ietf-languages/szrOQLvRScrQ2Ai78PSV3QCnJGU
[8]: https://mailarchive.ietf.org/arch/msg/ietf-languages/dE-UxnEbNsQnAoy_wk2lkEfz9LU
[9]: https://mailarchive.ietf.org/arch/msg/ietf-languages/yVTGZS-6IfBnnk8hmz1n36jVpgM
[10]: https://www.ietf.org/mailman/listinfo/ietf-languages
[11]: https://lists.w3.org/Archives/Public/w3c-wai-ig/2015JulSep/0105.html
[12]: https://mailarchive.ietf.org/arch/msg/ietf-languages/zll03aA6zz_glQ8ns0rWoEmjQ68
[13]: https://lists.w3.org/Archives/Public/w3c-wai-ig/2017AprJun/0029.html
[14]: https://github.com/xi/simple-alternative
[15]: https://meta.wikimedia.org/wiki/Proposals_for_closing_projects/Closure_of_Simple_English_Wikipedia_(2)
[16]: http://ogden.basic-english.org/
[17]: http://www.plainlanguage.gov
[18]: http://leichte-sprache.org/
[19]: https://www.w3.org/WAI/RD/2012/easy-to-read/paper5/
[20]: https://lists.w3.org/Archives/Public/w3c-wai-ig/2015JulSep/0112.html
[21]: https://lists.w3.org/Archives/Public/w3c-wai-ig/2015JulSep/0113.html
[22]: https://lists.w3.org/Archives/Public/w3c-wai-ig/2015JulSep/0119.html
[23]: https://phabricator.wikimedia.org/T110190
[24]: http://www.plainlanguage.gov/howto/guidelines/FederalPLGuidelines/FederalPLGuidelines.pdf
