The Locale Framework


C++ Report, September 1997
Klaus Kreft & Angelika Langer


 
 

Computer users all over the world prefer to interact with their systems using their own language and cultural conventions. Cultural differences affect, for instance, the display of monetary values and of date and time. Just think of the way numeric values are formatted in different cultures: 1,000,000.00 in the US is 1.000.000,00 in Germany and 10,00,000.00 in Nepal. If you aim for high international acceptance of your products, you must build into your software the potential to adapt to varying requirements that stem from cultural differences.

Building into software the potential for worldwide use is called internationalization. The Standard C++ Library provides an extensible framework that supports internationalization of C++ programs. Its main elements are locales and facets. In this column we focus on their architecture; subsequent columns will show you how to use and extend them.

Locales and Facets. The abstraction that holds all the information about a certain cultural area and its conventions is called locale. In the Standard C++ Library a locale is a class that represents a container of facets. A facet is an abstraction that contains the information about a certain localization aspect. A localization aspect is a set of related services and information needed for internationalization. An example of a facet is the standard facet numpunct. It holds information about the formatting of a numeric value. This facet, for instance, has a member function decimal_point(), which returns the character that is used as radix separator in a given cultural area. For a US-English environment this is ‘.’, but for a German environment this is ‘,’.

Facets do not only hold information, they also provide functionality. The num_put facet, another standard facet in the library, formats a numeric value to a sequence of characters; this transformation is done according to the current localization settings, part of which are determined by the numpunct facet mentioned above.

One interesting aspect of the way the locale maintains its facets is its capability to handle facets polymorphically. Let’s consider an example. Assume we derive our own facet from numpunct and make it return ‘|’ as radix separator, instead of ‘.’ or ‘,’. Let’s also assume that we create a locale object that contains an instance of our newly derived numpunct class. Then we pass this locale object to a function that retrieves the current numpunct facet from the locale object in order to access its decimal_point() member function. An instance of the derived numpunct class will be obtained, its decimal_point() function will be called, and eventually ‘|’, instead of ‘.’ or ‘,’, will be returned. In this sense facets are polymorphic in a locale; a request for retrieval of a facet type, numpunct in our example, yields different results depending on the actual content of the locale.
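The scenario can be tried out with the standard library itself. The following is a minimal sketch, not code from the column: the class name bar_punct and the helper functions are made up for illustration, and the sketch relies on the protected virtual function do_decimal_point(), which is the customization point that the public decimal_point() forwards to.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet for illustration: a numpunct that uses '|' as radix separator.
class bar_punct : public std::numpunct<char> {
protected:
    // The public decimal_point() forwards to this virtual function.
    virtual char do_decimal_point() const { return '|'; }
};

// Knows only the type numpunct<char>; whatever facet the locale holds answers.
char radix_separator(const std::locale& loc) {
    return std::use_facet<std::numpunct<char> >(loc).decimal_point();
}

// Formats a value through a stream that uses loc; number formatting
// consults the numpunct facet of that locale.
std::string format(const std::locale& loc, double value) {
    std::ostringstream out;
    out.imbue(loc);
    out << value;
    return out.str();
}

// A copy of the classic "C" locale in which numpunct<char> is replaced.
std::locale make_bar_locale() {
    return std::locale(std::locale::classic(), new bar_punct);
}
```

Both helper functions are written against numpunct<char> only; passing in the locale returned by make_bar_locale() changes their result without any change to their code.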

The implementation of this polymorphic facet selection does not only rely on overriding virtual functions, which is the most obvious way to implement polymorphism in C++. It is also based on a special framework, called the locale framework in our articles. In this article we are going to describe the architecture of this framework. The C++ standard defines a number of standard facet classes that support the most common internationalization tasks. We have already mentioned numpunct and num_put. In a follow-up article we will give a comprehensive overview of all available standard facets, their functionality and usage. We will then close our introduction to the standard C++ locale with an article that describes how new user-defined facet types can be implemented and used.

Facet Interfaces and Ids

We are going to start the description of the locale framework in the Standard C++ Library with a closer look at facets. Two classes nested into class locale play a central role in the definition of a facet: locale::facet and locale::id. Let’s see how these classes are defined in the C++ standard:

class locale::facet
{
protected:
    explicit facet(size_t refs = 0);
    virtual ~facet();
private:
    facet(const facet&);          // not defined
    void operator=(const facet&); // not defined
};

class locale::id
{
public:
    id();
private:
    void operator=(const id&); // not defined
    id(const id&);             // not defined
};
The C++ standard defines a facet that can be contained in a locale object in terms of these two classes: "A class is a facet if it is publicly derived from another facet, or if it is a class derived from locale::facet and containing a declaration (in the public section) as follows: static ::std::locale::id id;" See [1] for reference. Let us see what role the id member plays.
 
 

Facet Identifications

The declaration static locale::id id; inserts a data member id into a facet class. It provides an identification of a facet interface. What does this mean?

First, let’s agree on some technical terms. We call a class that is derived from locale::facet and defines an id member a facet base class. All classes derived from a facet base class refer to the same static data member id. We call this static data member the facet (interface) identification. At the same time, all of the derived classes implement at least the same public interface as the base class; this is the semantics of public derivation in C++. We call this interface a facet interface. The essence is that all facet classes that implement the same interface as the base class share the same identification.

Can other classes, that do not implement the base class interface, have the same identification, too? The answer is: No! An implementation of a standard locale has to provide a unique value for each locale::id object.

How about identification of facets that are templates? A facet can be defined as a class template. An example is the standard facet numpunct<class charT>. It has the character type as a template parameter because it contains information that is expressed by means of characters, like the radix separator character, and the character type shall not be restricted to type char.

The facet identification is a static data member id declared in the class template, i.e. there is a separate id object for each template instantiation. We have mentioned earlier that the locale::id class guarantees assignment of a unique value for each instance of an id. This ensures that each template instantiation has a unique identification.

As we’ve seen above, all facets with the same identification implement the same facet interface. Also, facets with the same facet interface represent the same localization aspect, i.e. the localization services and information provided by their base class. Hence there is a one-to-one match of facet interfaces, facet identifications, and localization aspects.
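Both properties of the identification can be observed by comparing the addresses of the static id members. This is a small sketch under our own assumptions: the class my_punct and the two function names are hypothetical, and comparing addresses only shows which id objects are distinct; the unique value stored inside a locale::id is not accessible.

```cpp
#include <locale>

// Hypothetical facet derived from a facet base class; it declares no id of its own.
class my_punct : public std::numpunct<char> {};

// A derived facet inherits the static member, so both names denote one object.
bool derived_shares_id() {
    return &my_punct::id == &std::numpunct<char>::id;
}

// Each instantiation of the class template has its own static id object.
bool instantiations_have_distinct_ids() {
    return &std::numpunct<char>::id != &std::numpunct<wchar_t>::id;
}
```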

Maintenance of Facets in a Locale

A locale is basically a container of facets. The interaction between a locale and its facets is invoked by the user when:

  • the user wants to retrieve a certain facet, that is expected to be contained in the locale object; or
  • the user wants to store a certain facet in the locale object.
Let’s examine how the locale uses the identification of a facet interface described above to deal with each of the situations.

Retrieval of Facets from a Locale

We start with the retrieval of a facet from a locale. Say, we have a function foo() that receives a locale object containing different facets, and each facet describes a certain localization aspect. We want to access the decimal_point() member function of the numpunct facet. For the purpose of facet retrieval from a locale the standard provides the following global function template:

template <class Facet> const Facet& use_facet(const locale& loc);

Using this function template, we can implement foo() in the following way (see also the sidebar on Explicit Template Argument Specification):

void foo(const locale& loc)
{
    const char radix_separator
        = use_facet< numpunct<char> >(loc).decimal_point();
}
This is the way a user retrieves a facet from a locale. Now let’s see how use_facet can be implemented.
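For readers who want to compile the call, here is a self-contained sketch. The two function names are our own; the second function shows that the wchar_t instantiation of numpunct is requested in exactly the same way, although it is a different facet with its own identification.

```cpp
#include <locale>

// Radix separator for narrow characters, taken from the numpunct<char> facet of loc.
char radix_separator(const std::locale& loc) {
    return std::use_facet<std::numpunct<char> >(loc).decimal_point();
}

// The same request for the wide-character instantiation of the facet template.
wchar_t wide_radix_separator(const std::locale& loc) {
    return std::use_facet<std::numpunct<wchar_t> >(loc).decimal_point();
}
```

Called with std::locale::classic(), both functions yield the period of the "C" locale.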

Implementing use_facet. Let’s examine an example implementation. For exposition, we assume that use_facet is a friend of class locale, so that it has access to a private member function of class locale that implements retrieval of a facet from the facet repository contained in the locale. This function might have the following signature: const locale::facet* get_facet(const locale::id&).

You can think of the facet repository as a map with locale::id as the key and const locale::facet* as the value. An implementation is conceivable that uses an instantiation map<size_t, const locale::facet*> of the map class template from the standard, where locale::id allows a conversion to size_t. However, keep in mind that this is only an example; the C++ Standard does not define any implementation issues. A real implementation probably uses a faster data structure for the facet repository.

Here is a tentative implementation of use_facet:

template <class Facet>
const Facet& use_facet(const locale& loc)
{
    const locale::facet *pb;
    const Facet *pd;
    // use the Facet identification
    if ((pb = loc.get_facet(Facet::id)) == 0)
        throw bad_cast(); // missing locale facet
    // use the Facet type
    if ((pd = dynamic_cast<const Facet*>(pb)) == 0)
        throw bad_cast(); // facet found, but of an incompatible type
    return (*pd);
}
The example code shows that use_facet first tries to retrieve the facet from the locale’s facet repository via the interface identification Facet::id. A locale can contain no more than one facet with the facet interface identification Facet::id. If such a facet can be found, it uses a dynamic cast to check if the found facet can be cast down to const Facet*.
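To make the map-based repository and the two checks tangible, here is a toy model that compiles on its own. Everything in namespace toy, as well as the facets greeting and farewell, is invented for this sketch and deliberately simplified: the toy locale does not own its facets and does no reference counting. It is not how any real library implements std::locale.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <typeinfo>

namespace toy {

// Stand-in for locale::id: every id object receives a unique number.
class id {
public:
    id() : value_(next()++) {}
    std::size_t value() const { return value_; }
private:
    static std::size_t& next() { static std::size_t n = 0; return n; }
    std::size_t value_;
};

// Stand-in for locale::facet: a polymorphic base class.
class facet {
public:
    virtual ~facet() {}
};

// Stand-in for locale: a map from identification to facet pointer.
class locale {
public:
    locale() {}
    // Copy of other in which f is stored under the identification Facet::id.
    template <class Facet>
    locale(const locale& other, const Facet* f) : repo_(other.repo_) {
        repo_[Facet::id.value()] = f;
    }
    // Returns 0 if no facet with this identification is stored.
    const facet* get_facet(const id& i) const {
        std::map<std::size_t, const facet*>::const_iterator it = repo_.find(i.value());
        return it == repo_.end() ? 0 : it->second;
    }
private:
    std::map<std::size_t, const facet*> repo_;
};

template <class Facet>
const Facet& use_facet(const locale& loc) {
    const facet* pb = loc.get_facet(Facet::id);        // step 1: use the identification
    if (pb == 0) throw std::bad_cast();
    const Facet* pd = dynamic_cast<const Facet*>(pb);  // step 2: use the type
    if (pd == 0) throw std::bad_cast();
    return *pd;
}

} // namespace toy

// Two hypothetical facet base classes, each with its own identification.
struct greeting : toy::facet {
    static toy::id id;
    virtual std::string text() const { return "hello"; }
};
toy::id greeting::id;

struct farewell : toy::facet {
    static toy::id id;
};
toy::id farewell::id;
```

A toy locale built from an empty one and a pointer to a greeting object satisfies toy::use_facet<greeting>, while toy::use_facet<farewell> on the same locale throws std::bad_cast because the first lookup step finds nothing.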

use_facet and Facet Hierarchies. Let’s have a closer look at what will happen if we invoke use_facet on different classes from a class hierarchy. Let’s assume we have the following situation:

class base_facet : public locale::facet
{
    // constructors and destructors
public:
    // const, because use_facet returns a reference to a const facet
    virtual string bar() const { return "this is the base class"; }
    static ::std::locale::id id;
};
class derived_facet : public base_facet
{
    // constructors and destructors
public:
    virtual string bar() const { return "this is the derived class"; }
    virtual string bar_2() const { return "hello world"; }
};
Neither base_facet nor derived_facet contains any localization services or information. This is done deliberately, in order to keep the example simple and concise.

Now let’s examine the different possible cases, and let’s discuss them in terms of our example implementation of use_facet above.

1. Exact type match. We are going to start with a situation where the locale object contains a facet of the same type as the type requested in the use_facet template specification. Say, a locale object loc contains a facet of type base_facet and we call:

cout << use_facet<base_facet>(loc).bar();

use_facet will find a facet with the facet interface identification base_facet::id in loc, and the dynamic cast to const base_facet* will succeed, because the facet is of exactly this type. The text sent to standard output will be: this is the base class. Okay, that was no surprise.

2. Base requested, derived available. Let’s see what happens when the locale object contains a facet instance of the derived class, and the type requested in the use_facet template specification is the base class. In terms of our example classes: loc contains a facet of type derived_facet and we call:

cout << use_facet<base_facet>(loc).bar();

As derived_facet is derived from base_facet, the identifications derived_facet::id and base_facet::id refer to the same static member. This means that use_facet retrieves the instance of derived_facet from loc when it uses base_facet::id as search key. The dynamic cast to const base_facet* will succeed, because the object is of type derived_facet, which is derived from base_facet. The text sent to standard output will be: this is the derived class. This is a really interesting effect.

Two-Phase Polymorphism. What we have here is a kind of two-phase polymorphic dispatch.

  • First, whatever type of derived class is contained in the locale, it is extracted by specifying the base class type in the use_facet template specification.
  • Second, the invocation of a virtual function is dispatched by C++ means to the implementation of the extracted class.
After the first enthusiasm you might think: "Nice technique - but where is the benefit?" Well, let’s consider an example. It’s a situation where you pass a const locale& to a function that needs certain member functions of a certain facet interface. It retrieves the facet from the locale by specifying the facet class type that corresponds to this interface in the use_facet template specification:

void function(const locale& loc)
{ use_facet< facet_type >(loc).facet_member_function(); }
Now, take into account that facet instances are constant objects. They are not mutated inside of a locale (we will discuss the reasons later). They can neither change, nor can they be replaced in a locale. When you want to provide a different localization environment to the function above, you have to pass in a different locale object. Now, let’s imagine that in our example above the new locale object is one that has the facet of type facet_type replaced. The new facet can be an instance of the same type as before, or an instance of a type derived from facet_type . In either case, the function can retrieve the facet the same way as before. It need not care about the actual type of the facet object contained in the current locale (as long as use_facet does not fail). Nor does it have to care if the facet instance or its type have changed. Instead - thanks to the two-phase-polymorphism - the function will always process the current localization information contained in the locale.
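The two-phase dispatch can be reproduced with std::locale. The sketch below completes base_facet and derived_facet just enough to compile; the constructors, the const qualification of bar() (needed because use_facet returns a reference to a const facet), the out-of-class definition of the id member, and the function describe() are our additions for illustration.

```cpp
#include <cstddef>
#include <locale>
#include <string>

class base_facet : public std::locale::facet {
public:
    explicit base_facet(std::size_t refs = 0) : std::locale::facet(refs) {}
    virtual std::string bar() const { return "this is the base class"; }
    static std::locale::id id;
};
std::locale::id base_facet::id;  // the static member needs a definition

class derived_facet : public base_facet {
public:
    derived_facet() : base_facet(0) {}
    virtual std::string bar() const { return "this is the derived class"; }
};

// Written against the base type only.
// Phase 1: use_facet finds whatever facet is stored under base_facet::id.
// Phase 2: the virtual call selects the implementation of that object.
std::string describe(const std::locale& loc) {
    return std::use_facet<base_facet>(loc).bar();
}
```

Passing describe() a locale that was built with new base_facet yields the base class text; passing one built with new derived_facet yields the derived class text, although describe() itself never mentions derived_facet.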

3. Derived requested, base available. Let’s get back to base_facet and derived_facet. Finally, we are going to examine the situation where the locale object contains a facet instance of the base class and the type requested in the use_facet template specification is the derived class, i.e. loc contains a facet of type base_facet and we call:

cout << use_facet<derived_facet>(loc).bar_2();

By the same reasoning as in the previous case, use_facet retrieves the instance of base_facet from loc when it uses derived_facet::id as search key. The dynamic cast to const derived_facet* will fail, because the object is of type base_facet, and use_facet throws a bad_cast exception. This is okay, because a base class pointer is not compatible with a derived class pointer; even in our simple example the base class (base_facet) does not support the full interface of the derived class (derived_facet).

4. Wrong id. A call to use_facet also fails with a bad_cast exception when the facet repository in the locale object contains no facet with the requested locale::id. To avoid the exception, the has_facet function can be used:

template <class Facet> bool has_facet(const locale&) throw();

It checks if a locale can satisfy a call to use_facet. The preceding example can be changed to:

if (has_facet<derived_facet>(loc))
    cout << use_facet<derived_facet>(loc).bar_2();

so that no exception will be thrown.
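The difference between guarded and unguarded access in the "wrong id" case can be sketched as follows. The facet greeting_facet and both helper functions are hypothetical names introduced for this example; the sketch assumes nothing beyond std::has_facet, std::use_facet and std::bad_cast.

```cpp
#include <cstddef>
#include <locale>
#include <string>
#include <typeinfo>

// Hypothetical facet base class; no standard locale contains it.
class greeting_facet : public std::locale::facet {
public:
    explicit greeting_facet(std::size_t refs = 0) : std::locale::facet(refs) {}
    std::string text() const { return "hello"; }
    static std::locale::id id;
};
std::locale::id greeting_facet::id;

// Guarded access: falls back to a default instead of letting use_facet throw.
std::string greeting_or(const std::locale& loc, const std::string& fallback) {
    if (std::has_facet<greeting_facet>(loc))
        return std::use_facet<greeting_facet>(loc).text();
    return fallback;
}

// Unguarded access: reports whether use_facet threw bad_cast.
bool use_facet_throws(const std::locale& loc) {
    try {
        (void)std::use_facet<greeting_facet>(loc);
    } catch (const std::bad_cast&) {
        return true;
    }
    return false;
}
```

For the classic locale, greeting_or() returns the fallback and use_facet_throws() returns true; for a locale that was constructed with a greeting_facet, the facet's text is returned and nothing is thrown.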
 
 

Storing Facets in a Locale

We’ve seen above that the functions use_facet and has_facet provide the functionality to retrieve facets from a locale object. But how do the facets get into a locale? It all happens when a locale object is created. A locale fills its facet repository depending on the arguments passed to its constructor. Here is an example of a locale constructor:

template <class Facet> locale(const locale& other, Facet* f);

It creates a locale which is a copy of other. If other contains a facet with the identification Facet::id, then this facet is replaced in the copy. The new facet is the one that the pointer f points to. If other does not contain a facet with the identification Facet::id, then the facet that the pointer f points to is added and extends the copy.

One interesting aspect of this behavior is that it allows instances of new user-defined facets to be added in a simple way. We will discuss all locale constructors in detail in our next article.
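Both outcomes of this constructor, replacement and extension, can be seen in a short composition chain. The facets yes_no_punct and unit_facet and the helper truename() are hypothetical names chosen for this sketch; it uses the numpunct customization points do_truename() and do_falsename(), which stand behind the public truename() and falsename().

```cpp
#include <cstddef>
#include <locale>
#include <string>

// Replaces an existing facet: numpunct<char> is part of every locale.
class yes_no_punct : public std::numpunct<char> {
protected:
    virtual std::string do_truename() const { return "yes"; }
    virtual std::string do_falsename() const { return "no"; }
};

// Extends a locale: this identification is unknown to the standard locales.
class unit_facet : public std::locale::facet {
public:
    explicit unit_facet(std::size_t refs = 0) : std::locale::facet(refs) {}
    static std::locale::id id;
};
std::locale::id unit_facet::id;

// Reads the textual name of 'true' from whatever numpunct<char> loc holds.
std::string truename(const std::locale& loc) {
    return std::use_facet<std::numpunct<char> >(loc).truename();
}
```

A locale step1 built from the classic locale and new yes_no_punct reports "yes", while the classic locale still reports "true". A locale step2 built from step1 and new unit_facet keeps the replaced numpunct and additionally contains unit_facet; step1 itself is not extended.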
 
 

Memory Management of Facets in a Locale

The locale does not only provide means for retrieving and storing facets, it is also capable of taking over the memory management of its facets. In fact, all facets in the Standard C++ Library are designed for maintenance by a locale; they have a protected virtual destructor. The consequence is that standard facets can only be created on the heap, and, more importantly, they can only be deleted by friends or derived classes. Class locale is a friend of class locale::facet, and thus has permission to delete facets via the virtual destructor locale::facet::~facet().

All standard facets offer the same control mechanism for their deletion: When you create a standard facet you can determine whether the facet shall later be deleted by the locale or not. This is implemented by means of an argument to the constructor of the facet base class locale::facet; it takes an argument refs, which controls deletion of a facet:

  • If refs == 0, then the locale deletes the facet. In this case the facet should only be used in conjunction with a locale. Its lifetime is tied to the lifetime of the locale it belongs to; the facet becomes invalid when the locale is destroyed.
  • If refs == 1, then the creator of the facet is fully responsible for the facet’s lifetime and deletion.
Note that a facet with refs == 1 must be of a type derived from a standard facet, because the standard facets have no public destructor, but a derived facet can provide one. In fact, non-standard facets can differ substantially. For instance, they need not provide the parameter described above for the standard facet constructors at all. However, the mechanism used by the locale for managing the lifetime of its facets remains the same. It is contained in class locale::facet already, and thus is inherited by all facet types.
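The effect of the refs argument can be observed by counting destructor calls. This is a sketch under our own assumptions: the class counting_punct, the counter destroyed and the two check functions are hypothetical, and the derived facet provides a public destructor so that it can also live on the stack.

```cpp
#include <cstddef>
#include <locale>

// Counts how many counting_punct objects have been destroyed so far.
static int destroyed = 0;

// Derived from a standard facet, with a public destructor of its own.
class counting_punct : public std::numpunct<char> {
public:
    explicit counting_punct(std::size_t refs = 0) : std::numpunct<char>(refs) {}
    ~counting_punct() { ++destroyed; }
};

// refs == 0: the facet dies together with the last locale that holds it.
bool locale_deletes_facet_with_refs_0() {
    int before = destroyed;
    {
        std::locale loc(std::locale::classic(), new counting_punct(0));
    }   // loc is destroyed here
    return destroyed == before + 1;
}

// refs == 1: the creator owns the facet; here it is simply a local object.
bool creator_keeps_facet_with_refs_1() {
    counting_punct mine(1);
    int before = destroyed;
    {
        std::locale loc(std::locale::classic(), &mine);
    }   // loc is destroyed here, the facet is not
    return destroyed == before && mine.decimal_point() == '.';
}
```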

In the example shown earlier we had omitted the constructors and destructors of the classes base_facet and derived_facet. Here is the completed example. The base class follows the pattern demonstrated by the standard facets in the library; i.e. it provides the latitude to control deletion of the facet by setting the constructor argument refs to 1, if necessary. The derived class is more restrictive and cannot be deleted independently of the locale, because it always sets refs to 0.

class base_facet : public locale::facet
{
public:
    base_facet(size_t refs=0) : locale::facet(refs) {}
    virtual string bar() const { return "this is the base class"; }
    static ::std::locale::id id;
protected:
    ~base_facet() {}
};
locale::id base_facet::id; // definition of the static member

class derived_facet : public base_facet
{
public:
    derived_facet() : base_facet(0) {}
    virtual string bar() const { return "this is the derived class"; }
    virtual string bar_2() const { return "hello world"; }
protected:
    ~derived_facet() {}
};
So far our focus was maintenance of facets by just one locale. However, facets are shared among locales. As we have seen earlier, the facet repository contained in a locale is a mapping between facet identifications and facet pointers. Once a locale is copied, only the facet pointers are duplicated. It would be wasteful and inefficient to create copies of all contained facets in a locale each time a locale is copied.

For this reason there is a more global management scheme: a locale expects its facets to have a reference counter, which the locale increments and decrements. When a locale deletes the last pointer to a facet it also deletes the facet itself. The reference counter is likely to be part of the private guts of class locale::facet. Again, details of an implementation are not specified by the standard.

Let’s examine an example. Say, we have a function that creates a new locale object by combining a given locale with a certain facet.

void function(const locale& loc)
{
    locale temp_locale(loc, new facet_type(0));
    // do something fancy with temp_locale
} // here temp_locale goes out of scope
The diagram below shows an arbitrary locale loc provided as an argument to the function.
 
 

After creation of the second locale object temp_locale both locale objects share almost all of their facets except the one of type facet_type that is replaced in the newly constructed locale object.

When the locale object temp_locale goes out of scope its destructor decrements the reference counters of the locale’s facets. The reference counter of the new facet_type object will be 0 by then and consequently the facet will be deleted. After destruction of the locale object temp_locale the situation will be as before.
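That the two locale objects really share their untouched facets can be checked by comparing the addresses that use_facet hands out. The following sketch uses a hypothetical extra_facet in place of facet_type, and the function names are ours. The standard does not describe the sharing in these terms; the sketch relies on the fact that facets cannot be copied, so a copied locale can only refer to the same facet objects.

```cpp
#include <cstddef>
#include <locale>

// Hypothetical user-defined facet that is added to the temporary locale.
class extra_facet : public std::locale::facet {
public:
    explicit extra_facet(std::size_t refs = 0) : std::locale::facet(refs) {}
    static std::locale::id id;
};
std::locale::id extra_facet::id;

// True if both locales hand out the very same numpunct<char> object.
bool share_numpunct(const std::locale& a, const std::locale& b) {
    return &std::use_facet<std::numpunct<char> >(a)
        == &std::use_facet<std::numpunct<char> >(b);
}

// Mirrors the function in the text: the temporary locale gets one new facet
// and shares the rest with loc, which itself stays unchanged.
bool temp_locale_shares_untouched_facets(const std::locale& loc) {
    std::locale temp_locale(loc, new extra_facet(0));
    return share_numpunct(loc, temp_locale)
        && std::has_facet<extra_facet>(temp_locale)
        && !std::has_facet<extra_facet>(loc);
}   // temp_locale goes out of scope; it held the only reference to extra_facet
```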

Facets and locales are designed to work closely together. The locale manages references and the lifetime of its facets. Another area of close collaboration between locales and facets is the immutability of both.
 
 

Immutability of Facets in a Locale

Locales and their contained facets are immutable objects. This is because they represent information and rules that describe a certain cultural area. Such localization aspects are naturally fixed. Consequently a locale does not change throughout its lifetime. Apart from this intuitive understanding of a locale, its immutability also has practical reasons.

Facets are shared among locales. If a facet could be changed through the interface of one locale, all other locales sharing this facet would be affected by the change. Hence notification would be needed, or some kind of synchronization between the locales concerned. In any case, it would complicate every program using locales and facets. The design path taken for the locale framework is to make locales and facets immutable. Once created, neither a locale nor its facets can change anymore.

The immutability is reflected in various places. Locales cannot be modified; they can only be built by composition. Also, if you look at the standard facets, you will notice that all their member functions are declared constant. This is sensible because the function use_facet returns a const Facet&. Consequently, you can only invoke constant operations on a facet retrieved from a locale.
 
 

Summary

We have seen that a locale represents a repository of facets, and that a facet encapsulates certain localization aspects. Facets can be stored in a locale and retrieved from a locale. The retrieval works polymorphically, using the facet identification and the facet type for dispatch. Locales and facets are designed to closely work together; locales manage references and the lifetime of their facets. Another area of close collaboration is the immutability of both locales and facets.
 
 

Acknowledgment. We would like to thank Nathan Myers, who proposed the locale framework to the standards committee, for answering countless questions.

References.

[1]
Working Paper for Draft Proposed International Standard for Information Systems
Programming Language C++
Doc No: X3J16/96-0219R1, WG21/N1037
Date: December 2, 1996
Copies of the Committee draft are available for downloading at:
<http://www.setech.com/x3.html> and
<http://www.maths.warwick.ac.uk/c++/pub/>
 
 
 
 
 
Sidebar: Explicit Template Argument Specification
Template arguments of a function instantiated from a function template can either be explicitly specified in a call or be deduced from the function arguments. The explicit template argument specification is a language feature that is relatively new to C++. We have used it a couple of times in this article. Here is a brief explanation of explicit template argument specification for function templates.

Traditionally, function template arguments are deduced from the function arguments. If you have a function template

template <class T> void foo (T t) { /* ... */ }

then you usually do not care about instantiation of the function template. You simply use this function template as in the following example:

int i = 5;
foo(i);
float x = 1.5;
foo(x);

The compiler does the work for you; it examines the arguments to these function calls, determines the argument types, and deduces that in the cases above the function template needs to be instantiated for type int and for type float.

Now, let’s take a look at the use_facet function template from the Standard C++ Library. It is declared as:

template <class Facet> const Facet& use_facet(const locale& loc);

Unlike in the example above, the template parameter Facet does not appear as the type of a function parameter. The only function parameter to use_facet is the locale. Now, consider a call to this function template:

locale loc;
const numpunct<char>& fac = use_facet(loc); // will not compile !!!

The function argument loc does not allow the template argument to be deduced, because its type has nothing to do with the template argument Facet. The return type of the function template is not considered for template argument deduction. Hence in the call to use_facet above the template argument Facet cannot be deduced. It has to be explicitly specified.

Explicit template argument specification is done like this:

locale loc;
const numpunct<char>& fac = use_facet<numpunct<char> >(loc); 
Note that the syntax for explicit template argument specification of a function template is similar to template argument specification of class templates. If you have a class template

template <class T> class list;

you naturally specify the template arguments whenever you need an instantiation of the class template:

list<int> counters;
list<float> sizes;

With a function template

template <class T> void foo();

you do exactly the same:

foo<int>();
foo<float>();

if you have to. If the template argument appears in the function argument list, it is more convenient to let the compiler deduce the template argument for you.
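Explicit specification and deduction can also be mixed in one call: arguments that are specified explicitly are matched to the leading template parameters, and the remaining ones are deduced. The function template convert and the helper to_int below are hypothetical names used only to sketch this.

```cpp
// To appears only as the return type and cannot be deduced;
// From is the type of the function parameter and can be.
template <class To, class From>
To convert(From value) {
    return static_cast<To>(value);
}

// To is specified explicitly as int, From is deduced as double.
int to_int(double d) {
    return convert<int>(d);
}
```

Putting the parameter that cannot be deduced first is what makes the short form convert<int>(d) possible; with the opposite order both arguments would have to be spelled out.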

Disclaimer: Not all compilers available on the market these days are capable of understanding this new language feature. So, don’t be surprised if your compiler starts complaining about constructs like foo<int>();


 
 
 

 

  © Copyright 1995-2003 by Angelika Langer.  All Rights Reserved.    URL: < http://www.AngelikaLanger.com/Articles/C++Report/LocaleFramework/LocaleFramework.html  last update: 22 Oct 2003