The Locale Framework
C++ Report, September 1997
Computer users all over the world prefer to interact with their systems using their own language and cultural conventions. Cultural differences affect, for instance, the display of monetary values and of date and time. Just think of the way numeric values are formatted in different cultures: 1,000,000.00 in the US is 1.000.000,00 in Germany and 10,00,000.00 in Nepal. If you aim for high international acceptance of your products, you must build into your software the ability to adapt to varying requirements that stem from cultural differences. Building into software the potential for worldwide use is called internationalization. The Standard C++ Library provides an extensible framework that supports internationalization of C++ programs. Its main elements are locales and facets. In this column we focus on their architecture; subsequent columns will show you how to use and extend them.
Locales and Facets. The abstraction that holds all the information about a certain cultural area and its conventions is called a locale. In the Standard C++ Library a locale is a class that represents a container of facets. A facet is an abstraction that contains the information about a certain localization aspect. A localization aspect is a set of related services and information needed for internationalization. An example of a facet is the standard facet numpunct. It holds information about the formatting of a numeric value. This facet, for instance, has a member function decimal_point(), which returns the character that is used as radix separator in a given cultural area. For a US-English environment this is ‘.’, but for a German environment it is ‘,’. Facets do not only hold information, they also provide functionality. The num_put facet, another standard facet in the library, formats a numeric value to a sequence of characters; this transformation is done according to the current localization settings, part of which are determined by the numpunct facet mentioned above.
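The two standard facets just mentioned can be tried out with a few lines of code. The following is a minimal sketch, not part of the original column: the helper names radix_separator and format_number are hypothetical, and the facets are obtained with the standard function template use_facet, which is discussed in the section on retrieval of facets.

```cpp
#include <cassert>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: ask the numpunct facet of a locale for its radix separator.
char radix_separator(const std::locale& loc)
{
    return std::use_facet<std::numpunct<char>>(loc).decimal_point();
}

// Hypothetical helper: format a value by calling the num_put facet directly.
// The stream only serves as character sink and as carrier of the format flags.
std::string format_number(double value, const std::locale& loc)
{
    assert(std::has_facet<std::num_put<char>>(loc));  // standard facets are always present
    std::ostringstream os;
    os.imbue(loc);
    const std::num_put<char>& formatter = std::use_facet<std::num_put<char>>(loc);
    formatter.put(std::ostreambuf_iterator<char>(os), os, os.fill(), value);
    return os.str();
}
```

Called with std::locale::classic(), radix_separator returns '.', and format_number(1234.5, std::locale::classic()) yields "1234.5". Which other named locales are available depends on the platform, so the sketch deliberately avoids them.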
One interesting aspect of the way the locale maintains its facets is its capability to handle facets polymorphically. Let’s consider an example. Assume we derive our own facet from numpunct and make it return ‘|’ as radix separator, instead of ‘.’ or ‘,’. Let’s also assume that we create a locale object that contains an instance of our newly derived numpunct class. Then we pass this locale object to a function that retrieves the current numpunct facet from the locale object in order to access its decimal_point() member function. An instance of the derived numpunct class will be obtained, its decimal_point() function will be called, and eventually ‘|’, instead of ‘.’ or ‘,’, will be returned. In this sense facets are polymorphic in a locale; a request for retrieval of a facet type, numpunct in our example, yields different results depending on the actual content of the locale. The implementation of this polymorphic facet selection does not only rely on overriding virtual functions, which is the most obvious way to implement polymorphism in C++. It is also based on a special framework, called the locale framework in our articles. In this article we are going to describe the architecture of this framework.
The C++ standard defines a number of standard facet classes that support the most common internationalization tasks. We have already mentioned numpunct and num_put. In a follow-up article we will give a comprehensive overview of all available standard facets, their functionality and usage. We will then close our introduction to the standard C++ locale with an article that describes how new user-defined facet types can be implemented and used.
Facet Interfaces and Ids
We are going to start the description of the locale framework in the Standard C++ Library with a closer look at facets. Two classes nested into class locale play a central role in the definition of a facet: locale::facet and locale::id.
Let’s see how these classes are defined in the C++ standard:
    class locale::facet
    {
    protected:
        explicit facet(size_t refs = 0);
        virtual ~facet();
    private:
        facet(const facet&);            // not defined
        void operator=(const facet&);   // not defined
    };
    class locale::id
    {
    public:
        id();
    private:
        void operator=(const id&);      // not defined
        id(const id&);                  // not defined
    };
Facet Identifications
The declaration static locale::id id; inserts a data member id into a facet class. It provides an identification of a facet interface. What does this mean? First, let’s agree on some technical terms. We call a class that is derived from locale::facet and defines an id member a facet base class. All classes derived from a facet base class refer to the same static data member id. We call this static data member the facet (interface) identification. At the same time, all of the derived classes implement at least the same public interface as the base class; this is the semantics of public derivation in C++. We call this interface a facet interface. The essence is that all facet classes that implement the same interface as the base class share the same identification. Can other classes that do not implement the base class interface have the same identification, too? The answer is: No! An implementation of a standard locale has to provide a unique value for each locale::id object.
How about identification of facets that are templates? A facet can be defined as a class template. An example is the standard facet numpunct<charT>. It has the character type as a template parameter because it contains information that is expressed by means of characters, like the radix separator character, and the character type shall not be restricted to type char. The facet identification is a static data member id declared in the class template, i.e. there is a separate id object for each template instantiation. We have mentioned earlier that the locale::id class guarantees assignment of a unique value for each instance of an id.
This ensures that each template instantiation has a unique identification. As we’ve seen above, all facets with the same identification implement the same facet interface. Also, facets with the same facet interface represent the same localization aspect, i.e. the localization services and information provided by their base class. Hence there is a one-to-one match of facet interfaces, facet identifications, and localization aspects.
Maintenance of Facets in a Locale
A locale is basically a container of facets. The interaction between a locale and its facets is invoked by the user when:
Retrieval of Facets from a Locale
We start with the retrieval of a facet from a locale. Say, we have a function foo() that receives a locale object containing different facets, and each facet describes a certain localization aspect. We want to access the decimal_point() member function of the numpunct facet. For the purpose of facet retrieval from a locale the standard provides the following global function template:
    template <class Facet> const Facet& use_facet(const locale&);
With it, foo() can obtain the radix separator like this:
    void foo(const locale& loc)
    {
        const char radix_separator = use_facet< numpunct<char> >(loc).decimal_point();
    }
Implementing use_facet. Let’s examine an example implementation. For exposition, we assume that use_facet is a friend of class locale, so that it has access to a private member function of class locale that implements retrieval of a facet from the facet repository contained in the locale. This function might have the following signature: const locale::facet* get_facet(const locale::id&). You can think of the facet repository as a map with locale::id as the key and const locale::facet* as the value. An implementation is conceivable that uses an instantiation map<size_t, const locale::facet*> of the map class template from the standard, where locale::id allows a conversion to size_t. However, keep in mind that this is only an example; the C++ Standard does not prescribe any implementation details. A real implementation probably uses a faster data structure for the facet repository. Here is a tentative implementation of use_facet:
    template <class Facet>
    const Facet& use_facet(const locale& loc)
    {
        const locale::facet* pb;
        const Facet* pd;
        // use the Facet identification
        if ((pb = loc.get_facet(Facet::id)) == 0)
            throw bad_cast();   // missing locale facet
        // use the Facet type
        if ((pd = dynamic_cast<const Facet*>(pb)) == 0)
            throw bad_cast();   // facet found, but of the wrong type
        return *pd;
    }
use_facet and Facet Hierarchies. Let’s have a closer look at what will happen if we invoke use_facet on different classes from a class hierarchy.
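Before looking at hierarchies, the lookup just described can be tried out in isolation. The following is a minimal sketch under stated assumptions, not the way any real library implements class locale: everything in namespace toy (facet_id, facet, repository, radix, pipe_radix, other_aspect) is hypothetical and only mimics the map-based repository of the example implementation, using the address of the id object as the key.

```cpp
#include <cassert>
#include <map>
#include <typeinfo>

namespace toy {

struct facet_id {};                              // stand-in for locale::id; identity is its address
struct facet { virtual ~facet() = default; };    // stand-in for locale::facet

// Stand-in for the facet repository inside a locale. It does not own its facets.
class repository {
public:
    template <class Facet>
    void install(const Facet* f)
    {
        assert(f != nullptr);
        facets_[&Facet::id] = f;                 // keyed by the facet identification
    }
    const facet* get_facet(const facet_id& key) const
    {
        auto it = facets_.find(&key);
        return it == facets_.end() ? nullptr : it->second;
    }
private:
    std::map<const facet_id*, const facet*> facets_;
};

template <class Facet>
const Facet& use_facet(const repository& repo)
{
    const facet* pb = repo.get_facet(Facet::id);         // step 1: look up by identification
    if (pb == nullptr) throw std::bad_cast();
    const Facet* pd = dynamic_cast<const Facet*>(pb);    // step 2: check the requested type
    if (pd == nullptr) throw std::bad_cast();
    return *pd;
}

// A facet base class, a class derived from it, and an unrelated facet base class.
struct radix : facet {
    static facet_id id;
    virtual char decimal_point() const { return '.'; }
};
facet_id radix::id;

struct pipe_radix : radix {                      // shares radix::id
    char decimal_point() const override { return '|'; }
};

struct other_aspect : facet { static facet_id id; };
facet_id other_aspect::id;

} // namespace toy
```

Installing a pipe_radix object and asking for toy::use_facet<toy::radix> finds the object through radix::id and then calls the overridden decimal_point(). Asking for other_aspect fails in the first step; asking for pipe_radix when only a plain radix is installed fails in the second.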
Let’s assume we have the following situation:
    class base_facet : public locale::facet
    {
        // constructors and destructors
    public:
        virtual string bar() const { return "this is the base class"; }
        static ::std::locale::id id;
    };
    class derived_facet : public base_facet
    {
        // constructors and destructors
    public:
        virtual string bar() const { return "this is the derived class"; }
        virtual string bar_2() const { return "hello world"; }
    };
Now let’s examine the different possible cases, and let’s discuss them in terms of our example implementation of use_facet above.
1. Exact type match. We are going to start with a situation where the locale object contains a facet of the same type as the type requested in the use_facet template specification. Say, a locale object loc contains a facet of type base_facet and we call use_facet<base_facet>(loc).bar(). In terms of the example implementation, both the lookup by identification and the dynamic_cast succeed, and bar() of base_facet is invoked.
2. Base requested, derived available. Let’s see what happens when the locale object contains a facet instance of the derived class, and the type requested in the use_facet template specification is the base class. In terms of our example classes: loc contains a facet of type derived_facet and we call use_facet<base_facet>(loc).bar(). The derived facet is found under the identification it shares with base_facet, the dynamic_cast to the base class succeeds, and the virtual call ends up in bar() of derived_facet.
Two-Phase Polymorphism. What we have here is a kind of two-phase polymorphic dispatch.
    {
        use_facet< facet_type >(loc).facet_member_function();
    }
3. Derived requested, base available. Let’s get back to base_facet and derived_facet. Finally, we are going to examine the situation where the locale object contains a facet instance of the base class and the type requested in the use_facet template specification is the derived class, i.e. loc contains a facet of type base_facet and we call:
    cout << use_facet<derived_facet>(loc).bar();
In terms of the example implementation, the lookup by identification succeeds, but the dynamic_cast to derived_facet fails and a bad_cast exception is thrown.
4. Wrong id. A call to use_facet also fails with a bad_cast exception when the facet repository in the locale object contains no facet with the requested locale::id. To avoid the exception the has_facet function can be used:
    template <class Facet> bool has_facet(const locale&) throw();
Storing Facets in a Locale
We’ve seen above that the functions use_facet and has_facet provide the functionality to retrieve facets from a locale object. But how do the facets get into a locale? It all happens when a locale object is created. A locale fills its facet repository depending on the arguments passed to its constructor. Here is an example of a locale constructor:
One interesting aspect of this behavior is that it allows
instances of new user-defined facets to be added in a simple way. We will discuss
all locale constructors in detail in our next article.
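Storing and retrieving can be exercised together with the base_facet and derived_facet classes from above. The following is a minimal sketch, not taken from the original column: it assumes the standard constructor template locale(const locale&, Facet*), which installs the facet under Facet::id, and the names unrelated_facet, exact_match, base_requested, and missing_facet_throws are hypothetical. The member functions are declared const so that they can be called through the const reference that use_facet returns.

```cpp
#include <cassert>
#include <cstddef>
#include <locale>
#include <string>
#include <typeinfo>

class base_facet : public std::locale::facet {
public:
    explicit base_facet(std::size_t refs = 0) : std::locale::facet(refs) {}
    virtual std::string bar() const { return "this is the base class"; }
    static std::locale::id id;
protected:
    ~base_facet() override {}
};
std::locale::id base_facet::id;

class derived_facet : public base_facet {
public:
    derived_facet() : base_facet(0) {}
    std::string bar() const override { return "this is the derived class"; }
    virtual std::string bar_2() const { return "hello world"; }
protected:
    ~derived_facet() override {}
};

// Hypothetical facet base class with its own id; it is never stored in a locale here.
class unrelated_facet : public std::locale::facet {
public:
    static std::locale::id id;
};
std::locale::id unrelated_facet::id;

// Case 1: exact type match.
std::string exact_match()
{
    std::locale loc(std::locale::classic(), new base_facet);
    assert(std::has_facet<base_facet>(loc));
    return std::use_facet<base_facet>(loc).bar();
}

// Case 2: base requested, derived available.
std::string base_requested()
{
    std::locale loc(std::locale::classic(), new derived_facet);
    assert(std::has_facet<base_facet>(loc));
    return std::use_facet<base_facet>(loc).bar();
}

// Case 4: no facet with the requested id is stored in the locale.
bool missing_facet_throws()
{
    std::locale loc(std::locale::classic(), new base_facet);
    if (std::has_facet<unrelated_facet>(loc)) return false;
    try {
        std::use_facet<unrelated_facet>(loc);
    } catch (const std::bad_cast&) {
        return true;
    }
    return false;
}
```

Case 3 is left out of the sketch on purpose: its outcome depends on the library performing the type check of the example implementation, which the sketch does not assume for every library. The facets are created with refs == 0, so the locale objects delete them.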
Memory Management of Facets in a Locale
The locale does not only provide means for retrieving and storing facets, it is also capable of taking over the memory management of its facets. In fact, all facets in the Standard C++ Library are designed for maintenance by a locale; they have a protected virtual destructor. The consequence is that standard facets can only be created on the heap, and, more importantly, they can only be deleted by friends or derived classes. Class locale is a friend of class locale::facet, and thus has permission to delete facets via the virtual destructor locale::facet::~facet(). All standard facets offer the same control mechanism for their deletion: when you create a standard facet you can determine whether the facet shall later be deleted by the locale or not. This is implemented by means of an argument to the constructor of the facet base class locale::facet; it takes an argument refs, which controls deletion of a facet:
refs == 0: the locale deletes the facet when the last locale containing it is destroyed.
refs == 1: the locale never deletes the facet; its lifetime has to be managed elsewhere.
In the example shown earlier we had omitted the constructors and destructors of the classes base_facet and derived_facet. Here is the completed example. The base class follows the pattern demonstrated by the standard facets in the library; i.e. it provides the latitude to control deletion of the facet by setting the constructor argument refs to 1, if necessary. The derived class is more restrictive and cannot be deleted independently of the locale, because it always sets refs to 0.
    class base_facet : public locale::facet
    {
    public:
        base_facet(size_t refs = 0) : locale::facet(refs) {}
        virtual string bar() const { return "this is the base class"; }
        static ::std::locale::id id;
    protected:
        ~base_facet() {}
    };
    class derived_facet : public base_facet
    {
    public:
        derived_facet() : base_facet(0) {}
        virtual string bar() const { return "this is the derived class"; }
        virtual string bar_2() const { return "hello world"; }
    protected:
        ~derived_facet() {}
    };
Because a facet can be shared among several locales, there is a more global management scheme: a locale expects its facets to have a reference counter, which the locale increments and decrements. When a locale releases the last reference to a facet it also deletes the facet itself. The reference counter is likely to be part of the private guts of class locale::facet. Again, details of an implementation are not specified by the standard. Let’s examine an example. Say, we have a function that creates a new locale object by combining a given locale with a certain facet.
    {
        locale temp_locale(loc, new facet_type(0));
        // do something fancy with temp_locale
    }   // here temp_locale goes out of scope
After creation of the second locale object temp_locale both locale objects share almost all of their facets except the one of type facet_type that is replaced in the newly constructed locale object.
When the locale object temp_locale goes out of scope its destructor decrements the reference counters of the locale’s facets. The reference counter of the new facet_type object will be 0 by then and consequently the facet will be deleted. After destruction of the locale object temp_locale the situation will be as before.
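The effect of the refs argument can be made visible by counting destructor calls. The following is a minimal sketch, not part of the original column: the class tracked_facet and the two helper functions are hypothetical, and the destructor is made public, deviating from the pattern of the standard facets, only so that the refs == 1 case can delete the facet by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <locale>

// Hypothetical facet that counts how often its destructor has run.
class tracked_facet : public std::locale::facet {
public:
    static std::locale::id id;
    static int destroyed;
    explicit tracked_facet(std::size_t refs = 0) : std::locale::facet(refs) {}
    ~tracked_facet() override { ++destroyed; }   // public only for the refs == 1 case below
};
std::locale::id tracked_facet::id;
int tracked_facet::destroyed = 0;

// refs == 0: the locale takes over deletion of the facet.
int destroyed_when_locale_owns()
{
    tracked_facet::destroyed = 0;
    {
        std::locale temp_locale(std::locale::classic(), new tracked_facet(0));
        assert(std::has_facet<tracked_facet>(temp_locale));
    }   // temp_locale held the only reference to the facet
    return tracked_facet::destroyed;
}

// refs == 1: the locale never deletes the facet; the caller has to do it.
int destroyed_when_caller_owns()
{
    tracked_facet::destroyed = 0;
    tracked_facet* f = new tracked_facet(1);
    {
        std::locale temp_locale(std::locale::classic(), f);
        assert(std::has_facet<tracked_facet>(temp_locale));
    }
    const int count_after_locale = tracked_facet::destroyed;
    delete f;   // the facet outlived the locale and is released here
    return count_after_locale;
}
```

The first function reports one destructor call once temp_locale has gone out of scope; the second reports none at that point, because the facet is still alive until the explicit delete.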
Facets and locales are designed to work closely together.
The locale manages references and the lifetime of its facets. Another area
of close collaboration between locales and facets is the immutability of
both.
Immutability of Facets in a Locale
Locales and their contained facets are immutable objects. This is because they represent information and rules that describe a certain cultural area. Such localization aspects are naturally fixed. Consequently a locale does not change throughout its lifetime. Apart from this intuitive understanding of a locale, its immutability also has practical reasons. Facets are shared among locales. If a facet could be changed through the interface of one locale, all other locales sharing this facet would be affected by the change. Hence notification would be needed, or some kind of synchronization between the locales concerned. In any case, it would complicate every program using locales and facets. The design path taken for the locale framework is to make locales and facets immutable. Once created, neither a locale nor its facets can change anymore.
The immutability is reflected in various places. Locales
cannot be modified, but only be built by composition. Also, if you look
at the standard facets, you will notice that all their member functions
are declared const. This is sensible because the function
use_facet returns a const Facet&.
Consequently, you can only invoke constant operations on a facet retrieved
from a locale.
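Building a locale by composition, and the fact that the original locale stays untouched, can be illustrated with the ‘|’ radix separator mentioned at the beginning of this article. The following is a minimal sketch, not part of the original column: the class pipe_punct and the helpers radix_of and format are hypothetical names.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet derived from numpunct that uses '|' as radix separator.
class pipe_punct : public std::numpunct<char> {
protected:
    char do_decimal_point() const override { return '|'; }
};

// Only const member functions can be called on the retrieved facet.
char radix_of(const std::locale& loc)
{
    assert(std::has_facet<std::numpunct<char>>(loc));
    const std::numpunct<char>& punct = std::use_facet<std::numpunct<char>>(loc);
    return punct.decimal_point();
}

// Format a value through a stream that is imbued with the given locale.
std::string format(double value, const std::locale& loc)
{
    std::ostringstream os;
    os.imbue(loc);
    os << value;
    return os.str();
}
```

With std::locale base = std::locale::classic(); and std::locale composed(base, new pipe_punct);, radix_of(composed) returns '|' and format(3.5, composed) yields "3|5", while base still reports '.' and formats the value as "3.5". The composed locale is a new object; nothing in base was modified.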
Summary
We have seen that a locale represents a repository of
facets, and that a facet encapsulates certain localization aspects. Facets
can be stored in a locale and retrieved from a locale. The retrieval works
polymorphically, using the facet identification and the facet type for
dispatch. Locales and facets are designed to closely work together; locales
manage references and the lifetime of their facets. Another area of close
collaboration is the immutability of both locales and facets.
Acknowledgment. We would like to thank Nathan Myers, who proposed the locale framework to the standards committee, for answering countless questions.
© Copyright 1995-2003 by Angelika Langer. All Rights Reserved. URL: < http://www.AngelikaLanger.com/Articles/C++Report/LocaleFramework/LocaleFramework.html> last update: 22 Oct 2003 |