Apple Business guidance for capturing locale encodings
Locale
Locale specifies the language used to express a particular attribute of a location. Locale has three components:
- Primary language tag
- Script subtag
- Region subtag
Locale Syntax
{2-character-primary-language-tag}-{4-character-script-subtag}-{2-character-region-subtag}
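The syntax above can be sketched as a small parser. This is a minimal illustration, not Apple Business's own validation logic: `LOCALE_PATTERN`, `Locale`, and `parse_locale` are hypothetical names, and the pattern assumes script subtags use the four-letter ISO 15924 form described in RFC 5646 (for example `Latn`). It also tolerates three-letter primary language tags, because this guidance mentions `fil` as a possible locale.

```python
import re
from typing import NamedTuple, Optional

# Assumption: language is 2-3 lowercase letters, script is a four-letter
# title-case code (ISO 15924), region is two uppercase letters.
LOCALE_PATTERN = re.compile(
    r"(?P<language>[a-z]{2,3})"
    r"(?:-(?P<script>[A-Z][a-z]{3}))?"
    r"(?:-(?P<region>[A-Z]{2}))?"
)


class Locale(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]


def parse_locale(value: str) -> Locale:
    """Split a locale string into its language, script, and region parts."""
    match = LOCALE_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"not a valid locale: {value!r}")
    return Locale(match["language"], match["script"], match["region"])
```

Under these assumptions, `parse_locale("en-GB")` yields a language of `en`, no script, and a region of `GB`, while a string such as `en_US` raises `ValueError`.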
Locale - Simple
{
  "displayNames": [
    {
      "name": "中国银行",
      "locale": "zh",
      "primary": true
    },
    {
      "name": "Bank of China",
      "locale": "en",
      "primary": true
    }
  ]
}
Locale - Complex
This location has:
- Two primary display names in different languages
- Transliteration of the Arabic text (primary language tag + script subtag)
An English translation "Happy Yemen Restaurant" must not be present because the location, by that name, does not exist.
{
  "displayNames": [
    {
      "name": "Yemen Cafe",
      "locale": "en",
      "primary": true
    },
    {
      "name": "مطعم اليمن السعيد",
      "locale": "ar",
      "primary": true
    },
    {
      "name": "Mat‘am al-Yaman al-Sa‘id",
      "locale": "ar-Latn",
      "primary": false
    }
  ]
}

When locale is required:
- Primary language tag MUST be present
- Script subtag MAY be present
- Region subtag MAY be present
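The requirement above can be checked against a `displayNames` payload before submission. The following is a rough sketch only: `find_locale_problems` is a hypothetical helper, and it checks nothing beyond the presence and basic shape of the primary language tag, leaving the optional script and region subtags unexamined.

```python
from typing import Any, Dict, List


def find_locale_problems(payload: Dict[str, Any]) -> List[str]:
    """Report display names whose locale lacks a primary language tag."""
    problems = []
    for index, entry in enumerate(payload.get("displayNames", [])):
        locale = entry.get("locale") or ""
        # The primary language tag is the first hyphen-separated subtag.
        language = locale.split("-")[0]
        looks_like_language = (
            language.isalpha() and language.islower() and len(language) in (2, 3)
        )
        if not looks_like_language:
            problems.append(
                f"displayNames[{index}]: missing or malformed primary language tag"
            )
    return problems
```

An entry with `"locale": "ar"` passes, while an entry with no `locale` key, or one that starts with a region or script subtag, is reported.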
Primary Language Tag
Primary language tag is "...used to help identify languages, whether spoken, written, signed, or otherwise signaled, for the purpose of communication." RFC 5646
Region Subtag
Region subtags "...are used to indicate linguistic variations associated with or appropriate to a specific country, territory, or region. Typically, a region subtag is used to indicate variations such as regional dialects or usage, or region-specific spelling conventions. It can also be used to indicate that content is expressed in a way that is appropriate for use throughout a region, for instance, Spanish content tailored to be useful throughout Latin America." RFC 5646
Although optional, region subtags facilitate selection of content that's most relevant to readers/listeners of a particular language associated with a "...country, territory, or region." For example, English (as spoken in USA) versus English (as spoken in Great Britain); Spanish (as spoken in Mexico) versus Spanish (as spoken in Spain), and so on.
Region subtag SHOULD be included when all of the following are true:
- A linguistic variation of a language has a verifiable association with the country/region identified by the region subtag.
- The localized text exhibits characteristics attributable to the linguistic variation.
- The linguistic variation is not exclusively a difference in the audible accent.
Region subtag MUST be included when the following is true:
- Speakers of mutually unintelligible varieties of the same language benefit from an encoding which identifies the regional distinction.
Whether a language is native to an existing country, a region within a country, or is dispersed, region subtag SHOULD NOT be included when any of the following is true:
- No verifiable linguistic variant exists. The language may exhibit a dialect continuum but at no "distance" does the language cease to be mutually intelligible
- The language is a pidgin. For example
- The language is a blend of words ("portmanteau") from different countries and their respective languages
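The three rule sets above can be read as a small decision procedure. The sketch below is one possible encoding, not an official algorithm: the `RegionFacts` fields and the `region_guidance` function are hypothetical, and the order in which the rules are evaluated (MUST first, then SHOULD NOT, then SHOULD) is an assumption, since the guidance does not state a precedence.

```python
from dataclasses import dataclass
from enum import Enum


class Guidance(Enum):
    MUST = "MUST include region subtag"
    SHOULD = "SHOULD include region subtag"
    SHOULD_NOT = "SHOULD NOT include region subtag"
    UNSPECIFIED = "no explicit guidance"


@dataclass
class RegionFacts:
    # Facts an editor would establish by research; all names are illustrative.
    verifiable_regional_variant: bool
    text_shows_variant_traits: bool
    differs_beyond_accent: bool
    mutually_unintelligible_varieties: bool = False
    is_pidgin: bool = False
    is_cross_language_blend: bool = False


def region_guidance(facts: RegionFacts) -> Guidance:
    """Apply the MUST / SHOULD NOT / SHOULD rules in an assumed order."""
    if facts.mutually_unintelligible_varieties:
        return Guidance.MUST
    if (
        not facts.verifiable_regional_variant
        or facts.is_pidgin
        or facts.is_cross_language_blend
    ):
        return Guidance.SHOULD_NOT
    if facts.text_shows_variant_traits and facts.differs_beyond_accent:
        return Guidance.SHOULD
    return Guidance.UNSPECIFIED
```

Text that follows British spelling conventions, for instance, would set the first three fields to true and land on `Guidance.SHOULD`.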
Primary Language Tag & Display Names
Ordinarily, identifying the language of descriptive content is straightforward. However, determining the language of a name may present special challenges and require:
- Identification of the character-set
- Within a character-set, filtering to likely languages
- Identification of the language that is native or officially recognized in a location's country
- Identification of the location's country
- Consideration of the name's societal reach (local vs global)
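The first step in the list above, identifying the character set, can be approximated with Python's standard `unicodedata` module. This is a heuristic sketch under a stated assumption: the first word of a character's Unicode name (such as `LATIN`, `ARABIC`, or `CJK`) is treated as a stand-in for its script. `dominant_script` is a hypothetical helper, and it narrows the candidate languages without selecting one.

```python
import unicodedata
from collections import Counter


def dominant_script(name: str) -> str:
    """Return the most common script-like prefix among the letters in a name."""
    counts = Counter()
    for ch in name:
        if not ch.isalpha():
            continue  # skip digits, spaces, and punctuation
        # Assumption: the first word of the Unicode name approximates the script.
        counts[unicodedata.name(ch, "UNKNOWN").split()[0]] += 1
    return counts.most_common(1)[0][0] if counts else "UNKNOWN"
```

A Latin result leaves many candidate languages (`en`, `tr`, `fr`, and so on), which is why the remaining considerations in the list, such as the location's country and the name's reach, are still needed.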
Example Names and Expected Locale
Expected locale encodings which take into consideration the previously mentioned factors.
| Name | Country | Expected | Considerations |
|---|---|---|---|
| 7-Eleven | Dubai | en | Global reach and character-set |
| Şok | Türkiye | tr | Local reach, character-set, language of content |
| Migros Jet | Türkiye | tr | Local reach |
| Pret A Manger | France | fr | Character-set, language of content. Global recognition makes en also acceptable |
| Pret A Manger | USA | en | Global reach and character-set. fr acceptable as a non-primary name |
Region Subtag and Display Names
Example Names and Expected Locale
Expected locale encodings when region subtag is present.
| Name | Location Country | Locale | Comment |
|---|---|---|---|
| Pret A Manger | France | fr or fr-FR | Note |
| Pret A Manger | GB | en or en-GB | -As above- |
| Pret A Manger | USA | en or en-US | -As above- |
Common Misunderstandings
A region subtag component of a locale must not be used to identify where in the world a display name is physically located. It is strictly used to identify where a linguistic variation of the primary language originates from. Primary language tag and region subtag, together, identify the linguistic variation that's used to express a particular attribute of a location.
Example: McDonald's/Macca's, Australia
"Macca's" is established within the respective cultural lexicons of Australia and New Zealand.

Is Macca's a "linguistic variation" and eligible to be encoded as en-AU?
Let's look at the broader context. The divergence and separation of Australian English (en-AU) from British English (en-GB) does not mean that new words entering the Australian vocabulary are strictly Australian. "Boomerang" and "Aboriginal" are culturally and historically associated with Australia. These words are part of the Australian vocabulary, but they cannot be claimed to belong only to the Australian language.
Macca's has a cultural and business association with Australia. Like "boomerang" and "Aboriginal," the word "Macca's" doesn't solely belong to the Australian language.
The correct locale for Macca's in Australia is en, with no region subtag.
To further drive this point home using a non-English speaking country: if Macca's opened in the Philippines, that would not establish a linguistic variation of the English language (en-PH). Furthermore, we would not petition the ISO to add Australian as a new primary language tag to their registry just because we wanted to encode the display name for Macca's in the Philippines as au-AU or au-PH. In this theoretical scenario, a new word would enter the Filipino language. The locale may be fil. A non-primary "copy" of the display name may be encoded as en.
Example: Yemen Cafe, Brooklyn, NYC
- A Yemeni owner's location has physical signage in Arabic and English
- Location's address is in Brooklyn, New York
Is the display name a "linguistic variation" of Arabic and eligible to be encoded as ar-US?
No.
ar-US is not the correct locale. A US variation of Arabic is not formally recognized. ar is sufficient but the more precise encoding of ar-YE is acceptable.
{
  "displayNames": [
    {
      "name": "Yemen Cafe",
      "locale": "en",
      "primary": true
    },
    {
      "name": "مطعم اليمن السعيد",
      "locale": "ar",
      "primary": true
    },
    {
      "name": "Mat‘am al-Yaman al-Sa‘id",
      "locale": "ar-Latn",
      "primary": false
    }
  ]
}

ISO 639-1 Encoding Limitations
ISO 639-1 codes have known limitations. Mutually unintelligible varieties of the same language may exist in the same country or region. In these scenarios, the combination of 639-1 primary language tag and region subtag is not deterministic.
For example, the encoding for Omani Arabic is ar-OM which identifies Omani Hadari Arabic. However, with an absence of 639-1 codes to identify certain regional languages, it would be practical to use ar-OM to identify Gulf Arabic or Bahrani Arabic, to name just a few, which are varieties of Arabic also spoken in Oman.
Although ISO 639-3 codes offer a precise way to encode languages, Apple Business does not fully support them at this time.
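The non-determinism described for ar-OM can be made concrete with a small lookup. The table below contains only the varieties named in the Omani example and is purely illustrative: `VARIETIES_BY_LOCALE` and `is_deterministic` are hypothetical names, not part of any Apple Business API.

```python
from typing import Dict, List

# Illustrative data taken from the ar-OM example; not an exhaustive registry.
VARIETIES_BY_LOCALE: Dict[str, List[str]] = {
    "ar-OM": ["Omani Hadari Arabic", "Gulf Arabic", "Bahrani Arabic"],
}


def candidate_varieties(locale: str) -> List[str]:
    """Return the language varieties a locale encoding may stand for."""
    return VARIETIES_BY_LOCALE.get(locale, [])


def is_deterministic(locale: str) -> bool:
    """A locale is deterministic only if it maps to exactly one known variety."""
    return len(candidate_varieties(locale)) == 1
```

Here `is_deterministic("ar-OM")` is false, because one encoding may stand for three different varieties.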
Related
| Name | Description |
|---|---|
| Countries and languages | Information about supported languages, scripts and regions. Includes unsupported regions. |