UnicodeString Class Reference

UnicodeString is a string class that stores Unicode characters directly and provides functionality similar to that of the Java String and StringBuffer classes.

#include <unistr.h>

Inheritance diagram for UnicodeString:

UnicodeString inherits from Replaceable, which inherits from UObject, which inherits from UMemory.

Public Types

enum  EInvariant { kInvariant }
 Constant to be used in the UnicodeString(char *, int32_t, EInvariant) constructor which constructs a Unicode string from an invariant-character char * string.

Public Member Functions

UBool operator== (const UnicodeString &text) const
 Equality operator.
UBool operator!= (const UnicodeString &text) const
 Inequality operator.
UBool operator> (const UnicodeString &text) const
 Greater than operator.
UBool operator< (const UnicodeString &text) const
 Less than operator.
UBool operator>= (const UnicodeString &text) const
 Greater than or equal operator.
UBool operator<= (const UnicodeString &text) const
 Less than or equal operator.
int8_t compare (const UnicodeString &text) const
 Compare the characters bitwise in this UnicodeString to the characters in text.
int8_t compare (int32_t start, int32_t length, const UnicodeString &text) const
 Compare the characters bitwise in the range [start, start + length) with the characters in text.
int8_t compare (int32_t start, int32_t length, const UnicodeString &srcText, int32_t srcStart, int32_t srcLength) const
 Compare the characters bitwise in the range [start, start + length) with the characters in srcText in the range [srcStart, srcStart + srcLength).
int8_t compare (const UChar *srcChars, int32_t srcLength) const
 Compare the characters bitwise in this UnicodeString with the first srcLength characters in srcChars.
int8_t compare (int32_t start, int32_t length, const UChar *srcChars) const
 Compare the characters bitwise in the range [start, start + length) with the first length characters in srcChars.
int8_t compare (int32_t start, int32_t length, const UChar *srcChars, int32_t srcStart, int32_t srcLength) const
 Compare the characters bitwise in the range [start, start + length) with the characters in srcChars in the range [srcStart, srcStart + srcLength).
int8_t compareBetween (int32_t start, int32_t limit, const UnicodeString &srcText, int32_t srcStart, int32_t srcLimit) const
 Compare the characters bitwise in the range [start, limit) with the characters in srcText in the range [srcStart, srcLimit).
int8_t compareCodePointOrder (const UnicodeString &text) const
 Compare two Unicode strings in code point order.
int8_t compareCodePointOrder (int32_t start, int32_t length, const UnicodeString &srcText) const
 Compare two Unicode strings in code point order.
int8_t compareCodePointOrder (int32_t start, int32_t length, const UnicodeString &srcText, int32_t srcStart, int32_t srcLength) const
 Compare two Unicode strings in code point order.
int8_t compareCodePointOrder (const UChar *srcChars, int32_t srcLength) const
 Compare two Unicode strings in code point order.
int8_t compareCodePointOrder (int32_t start, int32_t length, const UChar *srcChars) const
 Compare two Unicode strings in code point order.
int8_t compareCodePointOrder (int32_t start, int32_t length, const UChar *srcChars, int32_t srcStart, int32_t srcLength) const
 Compare two Unicode strings in code point order.
int8_t compareCodePointOrderBetween (int32_t start, int32_t limit, const UnicodeString &srcText, int32_t srcStart, int32_t srcLimit) const
 Compare two Unicode strings in code point order.
int8_t caseCompare (const UnicodeString &text, uint32_t options) const
 Compare two strings case-insensitively using full case folding.
int8_t caseCompare (int32_t start, int32_t length, const UnicodeString &srcText, uint32_t options) const
 Compare two strings case-insensitively using full case folding.
int8_t caseCompare (int32_t start, int32_t length, const UnicodeString &srcText, int32_t srcStart, int32_t srcLength, uint32_t options) const
 Compare two strings case-insensitively using full case folding.
int8_t caseCompare (const UChar *srcChars, int32_t srcLength, uint32_t options) const
 Compare two strings case-insensitively using full case folding.
int8_t caseCompare (int32_t start, int32_t length, const UChar *srcChars, uint32_t options) const
 Compare two strings case-insensitively using full case folding.
int8_t caseCompare (int32_t start, int32_t length, const UChar *srcChars, int32_t srcStart, int32_t srcLength, uint32_t options) const
 Compare two strings case-insensitively using full case folding.
int8_t caseCompareBetween (int32_t start, int32_t limit, const UnicodeString &srcText, int32_t srcStart, int32_t srcLimit, uint32_t options) const
 Compare two strings case-insensitively using full case folding.
UBool startsWith (const UnicodeString &text) const
 Determine if this starts with the characters in text.
UBool startsWith (const UnicodeString &srcText, int32_t srcStart, int32_t srcLength) const
 Determine if this starts with the characters in srcText in the range [srcStart, srcStart + srcLength).
UBool startsWith (const UChar *srcChars, int32_t srcLength) const
 Determine if this starts with the characters in srcChars.
UBool startsWith (const UChar *srcChars, int32_t srcStart, int32_t srcLength) const
 Determine if this starts with the characters in srcChars in the range [srcStart, srcStart + srcLength).
UBool endsWith (const UnicodeString &text) const
 Determine if this ends with the characters in text.
UBool endsWith (const UnicodeString &srcText, int32_t srcStart, int32_t srcLength) const
 Determine if this ends with the characters in srcText in the range [srcStart, srcStart + srcLength).
UBool endsWith (const UChar *srcChars, int32_t srcLength) const
 Determine if this ends with the characters in srcChars.
UBool endsWith (const UChar *srcChars, int32_t srcStart, int32_t srcLength) const
 Determine if this ends with the characters in srcChars in the range [srcStart, srcStart + srcLength).
int32_t indexOf (const UnicodeString &text) const
 Locate in this the first occurrence of the characters in text, using bitwise comparison.
int32_t indexOf (const UnicodeString &text, int32_t start) const
 Locate in this the first occurrence of the characters in text starting at offset start, using bitwise comparison.
int32_t indexOf (const UnicodeString &text, int32_t start, int32_t length) const
 Locate in this the first occurrence in the range [start, start + length) of the characters in text, using bitwise comparison.
int32_t indexOf (const UnicodeString &srcText, int32_t srcStart, int32_t srcLength, int32_t start, int32_t length) const
 Locate in this the first occurrence in the range [start, start + length) of the characters in srcText in the range [srcStart, srcStart + srcLength), using bitwise comparison.
int32_t indexOf (const UChar *srcChars, int32_t srcLength, int32_t start) const
 Locate in this the first occurrence of the characters in srcChars starting at offset start, using bitwise comparison.
int32_t indexOf (const UChar *srcChars, int32_t srcLength, int32_t start, int32_t length) const
 Locate in this the first occurrence in the range [start, start + length) of the characters in srcChars, using bitwise comparison.
int32_t indexOf (const UChar *srcChars, int32_t srcStart, int32_t srcLength, int32_t start, int32_t length) const
 Locate in this the first occurrence in the range [start, start + length) of the characters in srcChars in the range [srcStart, srcStart + srcLength), using bitwise comparison.
int32_t indexOf (UChar c) const
 Locate in this the first occurrence of the BMP code point c, using bitwise comparison.
int32_t indexOf (UChar32 c) const
 Locate in this the first occurrence of the code point c, using bitwise comparison.
int32_t indexOf (UChar c, int32_t start) const
 Locate in this the first occurrence of the BMP code point c, starting at offset start, using bitwise comparison.
int32_t indexOf (UChar32 c, int32_t start) const
 Locate in this the first occurrence of the code point c starting at offset start, using bitwise comparison.
int32_t indexOf (UChar c, int32_t start, int32_t length) const
 Locate in this the first occurrence of the BMP code point c in the range [start, start + length), using bitwise comparison.
int32_t indexOf (UChar32 c, int32_t start, int32_t length) const
 Locate in this the first occurrence of the code point c in the range [start, start + length), using bitwise comparison.
int32_t lastIndexOf (const UnicodeString &text) const
 Locate in this the last occurrence of the characters in text, using bitwise comparison.
int32_t lastIndexOf (const UnicodeString &text, int32_t start) const
 Locate in this the last occurrence of the characters in text starting at offset start, using bitwise comparison.
int32_t lastIndexOf (const UnicodeString &text, int32_t start, int32_t length) const
 Locate in this the last occurrence in the range [start, start + length) of the characters in text, using bitwise comparison.
int32_t lastIndexOf (const UnicodeString &srcText, int32_t srcStart, int32_t srcLength, int32_t start, int32_t length) const
 Locate in this the last occurrence in the range [start, start + length) of the characters in srcText in the range [srcStart, srcStart + srcLength), using bitwise comparison.
int32_t lastIndexOf (const UChar *srcChars, int32_t srcLength, int32_t start) const
 Locate in this the last occurrence of the characters in srcChars starting at offset start, using bitwise comparison.
int32_t lastIndexOf (const UChar *srcChars, int32_t srcLength, int32_t start, int32_t length) const
 Locate in this the last occurrence in the range [start, start + length) of the characters in srcChars, using bitwise comparison.
int32_t lastIndexOf (const UChar *srcChars, int32_t srcStart, int32_t srcLength, int32_t start, int32_t length) const
 Locate in this the last occurrence in the range [start, start + length) of the characters in srcChars in the range [srcStart, srcStart + srcLength), using bitwise comparison.
int32_t lastIndexOf (UChar c) const
 Locate in this the last occurrence of the BMP code point c, using bitwise comparison.
int32_t lastIndexOf (UChar32 c) const
 Locate in this the last occurrence of the code point c, using bitwise comparison.
int32_t lastIndexOf (UChar c, int32_t start) const
 Locate in this the last occurrence of the BMP code point c starting at offset start, using bitwise comparison.
int32_t lastIndexOf (UChar32 c, int32_t start) const
 Locate in this the last occurrence of the code point c starting at offset start, using bitwise comparison.
int32_t lastIndexOf (UChar c, int32_t start, int32_t length) const
 Locate in this the last occurrence of the BMP code point c in the range [start, start + length), using bitwise comparison.
int32_t lastIndexOf (UChar32 c, int32_t start, int32_t length) const
 Locate in this the last occurrence of the code point c in the range [start, start + length), using bitwise comparison.
UChar charAt (int32_t offset) const
 Return the code unit at offset offset.
UChar operator[] (int32_t offset) const
 Return the code unit at offset offset.
UChar32 char32At (int32_t offset) const
 Return the code point that contains the code unit at offset offset.
int32_t getChar32Start (int32_t offset) const
 Adjust a random-access offset so that it points to the beginning of a Unicode character.
int32_t getChar32Limit (int32_t offset) const
 Adjust a random-access offset so that it points behind a Unicode character.
int32_t moveIndex32 (int32_t index, int32_t delta) const
 Move the code unit index along the string by delta code points.
void extract (int32_t start, int32_t length, UChar *dst, int32_t dstStart=0) const
 Copy the characters in the range [start, start + length) into the array dst, beginning at dstStart.
int32_t extract (UChar *dest, int32_t destCapacity, UErrorCode &errorCode) const
 Copy the contents of the string into dest.
void extract (int32_t start, int32_t length, UnicodeString &target) const
 Copy the characters in the range [start, start + length) into the UnicodeString target.
void extractBetween (int32_t start, int32_t limit, UChar *dst, int32_t dstStart=0) const
 Copy the characters in the range [start, limit) into the array dst, beginning at dstStart.
virtual void extractBetween (int32_t start, int32_t limit, UnicodeString &target) const
 Copy the characters in the range [start, limit) into the UnicodeString target.
int32_t extract (int32_t start, int32_t startLength, char *target, int32_t targetCapacity, enum EInvariant inv) const
 Copy the characters in the range [start, start + startLength) into an array of characters.
int32_t extract (int32_t start, int32_t startLength, char *target, const char *codepage=0) const
 Copy the characters in the range [start, start + startLength) into an array of characters in a specified codepage.
int32_t extract (int32_t start, int32_t startLength, char *target, uint32_t targetLength, const char *codepage=0) const
 Copy the characters in the range [start, start + startLength) into an array of characters in a specified codepage.
int32_t extract (char *dest, int32_t destCapacity, UConverter *cnv, UErrorCode &errorCode) const
 Convert the UnicodeString into a codepage string using an existing UConverter.
int32_t length (void) const
 Return the length of the UnicodeString object.
int32_t countChar32 (int32_t start=0, int32_t length=INT32_MAX) const
 Count Unicode code points in the length UChar code units of the string.
UBool hasMoreChar32Than (int32_t start, int32_t length, int32_t number) const
 Check if the length UChar code units of the string contain more Unicode code points than a certain number.
UBool isEmpty (void) const
 Determine if this string is empty.
int32_t getCapacity (void) const
 Return the capacity of the internal buffer of the UnicodeString object.
int32_t hashCode (void) const
 Generate a hash code for this object.
UBool isBogus (void) const
 Determine if this object contains a valid string.
UnicodeString & operator= (const UnicodeString &srcText)
 Assignment operator.
UnicodeString & fastCopyFrom (const UnicodeString &src)
 Almost the same as the assignment operator.
UnicodeString & operator= (UChar ch)
 Assignment operator.
UnicodeString & operator= (UChar32 ch)
 Assignment operator.
UnicodeString & setTo (const UnicodeString &srcText, int32_t srcStart)
 Set the text in the UnicodeString object to the characters in srcText in the range [srcStart, srcText.length()).
UnicodeString & setTo (const UnicodeString &srcText, int32_t srcStart, int32_t srcLength)
 Set the text in the UnicodeString object to the characters in srcText in the range [srcStart, srcStart + srcLength).
UnicodeString & setTo (const UnicodeString &srcText)
 Set the text in the UnicodeString object to the characters in srcText.
UnicodeString & setTo (const UChar *srcChars, int32_t srcLength)
 Set the characters in the UnicodeString object to the characters in srcChars.
UnicodeString & setTo (UChar srcChar)
 Set the characters in the UnicodeString object to the code unit srcChar.
UnicodeString & setTo (UChar32 srcChar)
 Set the characters in the UnicodeString object to the code point srcChar.
UnicodeString & setTo (UBool isTerminated, const UChar *text, int32_t textLength)
 Aliasing setTo() function, analogous to the readonly-aliasing UChar* constructor.
UnicodeString & setTo (UChar *buffer, int32_t buffLength, int32_t buffCapacity)
 Aliasing setTo() function, analogous to the writable-aliasing UChar* constructor.
void setToBogus ()
 Make this UnicodeString object invalid.
UnicodeString & setCharAt (int32_t offset, UChar ch)
 Set the character at the specified offset to the specified character.
UnicodeString & operator+= (UChar ch)
 Append operator.
UnicodeString & operator+= (UChar32 ch)
 Append operator.
UnicodeString & operator+= (const UnicodeString &srcText)
 Append operator.
UnicodeString & append (const UnicodeString &srcText, int32_t srcStart, int32_t srcLength)
 Append the characters in srcText in the range [srcStart, srcStart + srcLength) to the end of the UnicodeString object.
UnicodeString & append (const UnicodeString &srcText)
 Append the characters in srcText to the end of the UnicodeString object.
UnicodeString & append (const UChar *srcChars, int32_t srcStart, int32_t srcLength)
 Append the characters in srcChars in the range [srcStart, srcStart + srcLength) to the end of the UnicodeString object.
UnicodeString & append (const UChar *srcChars, int32_t srcLength)
 Append the characters in srcChars to the end of the UnicodeString object.
UnicodeString & append (UChar srcChar)
 Append the code unit srcChar to the UnicodeString object.
UnicodeString & append (UChar32 srcChar)
 Append the code point srcChar to the UnicodeString object.
UnicodeString & insert (int32_t start, const UnicodeString &srcText, int32_t srcStart, int32_t srcLength)
 Insert the characters in srcText in the range [srcStart, srcStart + srcLength) into the UnicodeString object at offset start.
UnicodeString & insert (int32_t start, const UnicodeString &srcText)
 Insert the characters in srcText into the UnicodeString object at offset start.
UnicodeString & insert (int32_t start, const UChar *srcChars, int32_t srcStart, int32_t srcLength)
 Insert the characters in srcChars in the range [srcStart, srcStart + srcLength) into the UnicodeString object at offset start.
UnicodeString & insert (int32_t start, const UChar *srcChars, int32_t srcLength)
 Insert the characters in srcChars into the UnicodeString object at offset start.
UnicodeString & insert (int32_t start, UChar srcChar)
 Insert the code unit srcChar into the UnicodeString object at offset start.
UnicodeString & insert (int32_t start, UChar32 srcChar)
 Insert the code point srcChar into the UnicodeString object at offset start.
UnicodeString & replace (int32_t start, int32_t length, const UnicodeString &srcText, int32_t srcStart, int32_t srcLength)
 Replace the characters in the range [start, start + length) with the characters in srcText in the range [srcStart, srcStart + srcLength).
UnicodeString & replace (int32_t start, int32_t length, const UnicodeString &srcText)
 Replace the characters in the range [start, start + length) with the characters in srcText.
UnicodeString & replace (int32_t start, int32_t length, const UChar *srcChars, int32_t srcStart, int32_t srcLength)
 Replace the characters in the range [start, start + length) with the characters in srcChars in the range [srcStart, srcStart + srcLength).
UnicodeString & replace (int32_t start, int32_t length, const UChar *srcChars, int32_t srcLength)
 Replace the characters in the range [start, start + length) with the characters in srcChars.
UnicodeString & replace (int32_t start, int32_t length, UChar srcChar)
 Replace the characters in the range [start, start + length) with the code unit srcChar.
UnicodeString & replace (int32_t start, int32_t length, UChar32 srcChar)
 Replace the characters in the range [start, start + length) with the code point srcChar.
UnicodeString & replaceBetween (int32_t start, int32_t limit, const UnicodeString &srcText)
 Replace the characters in the range [start, limit) with the characters in srcText.
UnicodeString & replaceBetween (int32_t start, int32_t limit, const UnicodeString &srcText, int32_t srcStart, int32_t srcLimit)
 Replace the characters in the range [start, limit) with the characters in srcText in the range [srcStart, srcLimit).
virtual void handleReplaceBetween (int32_t start, int32_t limit, const UnicodeString &text)
 Replace a substring of this object with the given text.
virtual UBool hasMetaData () const
 Replaceable API.
virtual void copy (int32_t start, int32_t limit, int32_t dest)
 Copy a substring of this object, retaining attribute (out-of-band) information.
UnicodeString & findAndReplace (const UnicodeString &oldText, const UnicodeString &newText)
 Replace all occurrences of characters in oldText with the characters in newText.
UnicodeString & findAndReplace (int32_t start, int32_t length, const UnicodeString &oldText, const UnicodeString &newText)
 Replace all occurrences of characters in oldText with characters in newText in the range [start, start + length).
UnicodeString & findAndReplace (int32_t start, int32_t length, const UnicodeString &oldText, int32_t oldStart, int32_t oldLength, const UnicodeString &newText, int32_t newStart, int32_t newLength)
 Replace all occurrences of characters in oldText in the range [oldStart, oldStart + oldLength) with the characters in newText in the range [newStart, newStart + newLength) in the range [start, start + length).
UnicodeString & remove (void)
 Remove all characters from the UnicodeString object.
UnicodeString & remove (int32_t start, int32_t length=(int32_t) INT32_MAX)
 Remove the characters in the range [start, start + length) from the UnicodeString object.
UnicodeString & removeBetween (int32_t start, int32_t limit=(int32_t) INT32_MAX)
 Remove the characters in the range [start, limit) from the UnicodeString object.
UBool padLeading (int32_t targetLength, UChar padChar=0x0020)
 Pad the start of this UnicodeString with the character padChar.
UBool padTrailing (int32_t targetLength, UChar padChar=0x0020)
 Pad the end of this UnicodeString with the character padChar.
UBool truncate (int32_t targetLength)
 Truncate this UnicodeString to the targetLength.
UnicodeString & trim (void)
 Trims leading and trailing whitespace from this UnicodeString.
UnicodeString & reverse (void)
 Reverse this UnicodeString in place.
UnicodeString & reverse (int32_t start, int32_t length)
 Reverse the range [start, start + length) in this UnicodeString.
UnicodeString & toUpper (void)
 Convert the characters in this to UPPER CASE following the conventions of the default locale.
UnicodeString & toUpper (const Locale &locale)
 Convert the characters in this to UPPER CASE following the conventions of a specific locale.
UnicodeString & toLower (void)
 Convert the characters in this to lower case following the conventions of the default locale.
UnicodeString & toLower (const Locale &locale)
 Convert the characters in this to lower case following the conventions of a specific locale.
UnicodeString & toTitle (BreakIterator *titleIter)
 Titlecase this string, convenience function using the default locale.
UnicodeString & toTitle (BreakIterator *titleIter, const Locale &locale)
 Titlecase this string.
UnicodeString & toTitle (BreakIterator *titleIter, const Locale &locale, uint32_t options)
 Titlecase this string, with options.
UnicodeString & foldCase (uint32_t options=0)
 Case-fold the characters in this string.
UChar * getBuffer (int32_t minCapacity)
 Get a read/write pointer to the internal buffer.
void releaseBuffer (int32_t newLength=-1)
 Release a read/write buffer on a UnicodeString object with an "open" getBuffer(minCapacity).
const UChar * getBuffer () const
 Get a read-only pointer to the internal buffer.
const UChar * getTerminatedBuffer ()
 Get a read-only pointer to the internal buffer, making sure that it is NUL-terminated.
 UnicodeString ()
 Construct an empty UnicodeString.
 UnicodeString (int32_t capacity, UChar32 c, int32_t count)
 Construct a UnicodeString with capacity to hold capacity UChars.
 UnicodeString (UChar ch)
 Single UChar (code unit) constructor.
 UnicodeString (UChar32 ch)
 Single UChar32 (code point) constructor.
 UnicodeString (const UChar *text)
 UChar* constructor.
 UnicodeString (const UChar *text, int32_t textLength)
 UChar* constructor.
 UnicodeString (UBool isTerminated, const UChar *text, int32_t textLength)
 Readonly-aliasing UChar* constructor.
 UnicodeString (UChar *buffer, int32_t buffLength, int32_t buffCapacity)
 Writable-aliasing UChar* constructor.
 UnicodeString (const char *codepageData, const char *codepage=0)
 char* constructor.
 UnicodeString (const char *codepageData, int32_t dataLength, const char *codepage=0)
 char* constructor.
 UnicodeString (const char *src, int32_t srcLength, UConverter *cnv, UErrorCode &errorCode)
 char * / UConverter constructor.
 UnicodeString (const char *src, int32_t length, enum EInvariant inv)
 Constructs a Unicode string from an invariant-character char * string.
 UnicodeString (const UnicodeString &that)
 Copy constructor.
 UnicodeString (const UnicodeString &src, int32_t srcStart)
 'Substring' constructor from tail of source string.
 UnicodeString (const UnicodeString &src, int32_t srcStart, int32_t srcLength)
 'Substring' constructor from subrange of source string.
virtual Replaceable * clone () const
 Clone this object, an instance of a subclass of Replaceable.
virtual ~UnicodeString ()
 Destructor.
UnicodeString unescape () const
 Unescape a string of characters and return a string containing the result.
UChar32 unescapeAt (int32_t &offset) const
 Unescape a single escape sequence and return the represented character.
virtual UClassID getDynamicClassID () const
 ICU "poor man's RTTI", returns a UClassID for the actual class.

Static Public Member Functions

static UClassID getStaticClassID ()
 ICU "poor man's RTTI", returns a UClassID for this class.

Protected Member Functions

virtual int32_t getLength () const
 Implement Replaceable::getLength() (see jitterbug 1027).
virtual UChar getCharAt (int32_t offset) const
 The change in Replaceable to use virtual getCharAt() allows UnicodeString::charAt() to be inline again (see jitterbug 709).
virtual UChar32 getChar32At (int32_t offset) const
 The change in Replaceable to use virtual getChar32At() allows UnicodeString::char32At() to be inline again (see jitterbug 709).

Friends

class StringThreadTest
union StackBufferOrFields

Data Structures

union  StackBufferOrFields

Detailed Description

UnicodeString is a string class that stores Unicode characters directly and provides functionality similar to that of the Java String and StringBuffer classes.

It is a concrete implementation of the abstract class Replaceable (for transliteration).

The UnicodeString class is not suitable for subclassing.

For an overview of Unicode strings in C and C++ see the User Guide Strings chapter.

In ICU, a Unicode string consists of 16-bit Unicode code units. A Unicode character may be stored with either one code unit (the most common case) or with a matched pair of special code units ("surrogates"). The data type for code units is UChar. For single-character handling, a Unicode character code point is a value in the range 0..0x10ffff. ICU uses the UChar32 type for code points.

Indexes and offsets into and lengths of strings always count code units, not code points. This is the same as with multi-byte char* strings in traditional string handling. Operations on partial strings typically do not test for code point boundaries. If necessary, the user needs to take care of such boundaries by testing for the code unit values or by using functions like UnicodeString::getChar32Start() and UnicodeString::getChar32Limit() (or, in C, the equivalent macros U16_SET_CP_START() and U16_SET_CP_LIMIT(), see utf.h).
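The distinction between code units and code points can be illustrated with a small standalone sketch. It does not use ICU: std::u16string stands in for a UTF-16 buffer, and the helper names countCodePoints and codePointStart are hypothetical, chosen only to mirror the idea behind countChar32() and getChar32Start().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Assumption: std::u16string models a buffer of 16-bit code units; this is not ICU code.
inline bool isLead(char16_t u)  { return (u & 0xFC00) == 0xD800; }
inline bool isTrail(char16_t u) { return (u & 0xFC00) == 0xDC00; }

// Count code points: a matched lead/trail surrogate pair counts once,
// every other code unit (including an unpaired surrogate) counts once.
std::int32_t countCodePoints(const std::u16string &s) {
    std::int32_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (isLead(s[i]) && i + 1 < s.size() && isTrail(s[i + 1])) {
            ++i;  // skip the trail unit of a matched pair
        }
        ++n;
    }
    return n;
}

// Move an offset back to the first code unit of the code point that contains it.
std::int32_t codePointStart(const std::u16string &s, std::int32_t offset) {
    assert(offset >= 0 && static_cast<std::size_t>(offset) < s.size());
    const std::size_t i = static_cast<std::size_t>(offset);
    if (i > 0 && isTrail(s[i]) && isLead(s[i - 1])) {
        return offset - 1;
    }
    return offset;
}
```

For a string holding 'a', one supplementary character, and 'b', the length in code units is 4 while the code point count is 3, and an offset that lands on the trail surrogate is moved back by one.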

UnicodeString methods are more lenient with regard to input parameter values than other ICU APIs. In particular:

In string comparisons, two UnicodeString objects that are both "bogus" compare equal (to be transitive and prevent endless loops in sorting), and a "bogus" string compares less than any non-"bogus" one.
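The ordering rule for "bogus" strings can be sketched as follows. This is an illustration under stated assumptions, not ICU's implementation: the struct MaybeBogus and the function compareWithBogus are hypothetical, with a plain flag standing in for the bogus state.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a string that may be "bogus" (invalid).
struct MaybeBogus {
    bool bogus;
    std::u16string text;
};

// Returns -1, 0, or +1. Two bogus values compare equal, and a bogus value
// sorts before any non-bogus one; otherwise code units are compared in order.
std::int8_t compareWithBogus(const MaybeBogus &a, const MaybeBogus &b) {
    if (a.bogus && b.bogus) return 0;
    if (a.bogus) return -1;
    if (b.bogus) return 1;
    const int c = a.text.compare(b.text);
    return static_cast<std::int8_t>((c > 0) - (c < 0));
}
```

Treating two bogus values as equal keeps the relation a consistent ordering, which is what a sort routine needs in order to terminate.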

Const UnicodeString methods are thread-safe. Multiple threads can use const methods on the same UnicodeString object simultaneously, but non-const methods must not be called concurrently (in multiple threads) with any other (const or non-const) methods.

Similarly, const UnicodeString & parameters are thread-safe. One object may be passed in as such a parameter concurrently in multiple threads. This includes the const UnicodeString & parameters for copy construction, assignment, and cloning.

UnicodeString uses several storage methods. String contents can be stored inside the UnicodeString object itself, in an allocated and shared buffer, or in an outside buffer that is "aliased". Most of this is done transparently, but careful aliasing in particular provides significant performance improvements. Also, the internal buffer is accessible via special functions. For details see the User Guide Strings chapter.

See also:
utf.h

CharacterIterator

Stable:
ICU 2.0

Definition at line 183 of file unistr.h.


Member Enumeration Documentation

enum UnicodeString::EInvariant

Constant to be used in the UnicodeString(char *, int32_t, EInvariant) constructor which constructs a Unicode string from an invariant-character char * string.

Use the macro US_INV instead of the full qualification for this value.

See also:
US_INV
Stable:
ICU 3.2
Enumerator:
kInvariant 
See also:
EInvariant
Stable:
ICU 3.2

Definition at line 195 of file unistr.h.


Constructor & Destructor Documentation

UnicodeString::UnicodeString (  ) 

Construct an empty UnicodeString.

Stable:
ICU 2.0

UnicodeString::UnicodeString ( int32_t  capacity,
UChar32  c,
int32_t  count 
)

Construct a UnicodeString with capacity to hold capacity UChars.

Parameters:
capacity the number of UChars this UnicodeString should hold before a resize is necessary; if count is greater than 0 and count code points c take up more space than capacity, then capacity is adjusted accordingly.
c is used to initially fill the string
count specifies how many code points c are to be written in the string
Stable:
ICU 2.0
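The interaction between capacity and count described for this constructor can be sketched without ICU. The helper name filled is hypothetical and std::u16string is used as a stand-in; the point is only that a supplementary code point needs two code units, so the space required may exceed the requested capacity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper, not ICU code: build a UTF-16 string holding `count`
// copies of code point `c`, reserving at least `capacity` code units.
std::u16string filled(std::int32_t capacity, char32_t c, std::int32_t count) {
    assert(c <= 0x10FFFF);
    const std::int32_t unitsPerCodePoint = (c > 0xFFFF) ? 2 : 1;
    const std::int32_t needed = (count > 0) ? count * unitsPerCodePoint : 0;
    // If the fill needs more room than requested, the larger value wins.
    const std::int32_t reserveUnits = std::max(std::max(capacity, needed), 0);

    std::u16string s;
    s.reserve(static_cast<std::size_t>(reserveUnits));
    for (std::int32_t i = 0; i < count; ++i) {
        if (c <= 0xFFFF) {
            s.push_back(static_cast<char16_t>(c));
        } else {
            s.push_back(static_cast<char16_t>(0xD7C0 + (c >> 10)));    // lead surrogate
            s.push_back(static_cast<char16_t>(0xDC00 | (c & 0x3FF)));  // trail surrogate
        }
    }
    return s;
}
```

Calling filled(2, 0x1F600, 3) yields six code units even though only two were requested, because each of the three code points occupies a surrogate pair.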

UnicodeString::UnicodeString ( UChar  ch  ) 

Single UChar (code unit) constructor.

Parameters:
ch the character to place in the UnicodeString
Stable:
ICU 2.0

UnicodeString::UnicodeString ( UChar32  ch  ) 

Single UChar32 (code point) constructor.

Parameters:
ch the character to place in the UnicodeString
Stable:
ICU 2.0

UnicodeString::UnicodeString ( const UChar *  text  ) 

UChar* constructor.

Parameters:
text The characters to place in the UnicodeString. text must be NUL (U+0000) terminated.
Stable:
ICU 2.0

UnicodeString::UnicodeString ( const UChar *  text,
int32_t  textLength 
)

UChar* constructor.

Parameters:
text The characters to place in the UnicodeString.
textLength The number of Unicode characters in text to copy.
Stable:
ICU 2.0

UnicodeString::UnicodeString ( UBool  isTerminated,
const UChar *  text,
int32_t  textLength 
)

Readonly-aliasing UChar* constructor.

The text will be used for the UnicodeString object, but it will not be released when the UnicodeString is destroyed. This has copy-on-write semantics: When the string is modified, then the buffer is first copied into newly allocated memory. The aliased buffer is never modified. In an assignment to another UnicodeString, the text will be aliased again, so that both strings then alias the same readonly-text.

Parameters:
isTerminated specifies if text is NUL-terminated. This must be true if textLength==-1.
text The characters to alias for the UnicodeString.
textLength The number of Unicode characters in text to alias. If -1, then this constructor will determine the length by calling u_strlen().
Stable:
ICU 2.0
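The copy-on-write behaviour of a read-only alias can be sketched with a small hypothetical class. ReadOnlyAlias is not part of ICU and is far simpler than UnicodeString; it only shows that reads go to the aliased text, that copying the object aliases the same text again, and that the first write copies the text into owned storage and leaves the original untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical class, not ICU code: copy-on-write over an aliased read-only buffer.
class ReadOnlyAlias {
public:
    ReadOnlyAlias(const char16_t *text, std::size_t length)
        : alias_(text), length_(length) {}

    // Reads come from the aliased buffer until the first write.
    char16_t at(std::size_t i) const {
        assert(i < length_);
        return alias_ ? alias_[i] : owned_[i];
    }

    // The first write copies the aliased text into owned storage;
    // the aliased buffer itself is never modified.
    void set(std::size_t i, char16_t c) {
        assert(i < length_);
        if (alias_) {
            owned_.assign(alias_, length_);
            alias_ = nullptr;
        }
        owned_[i] = c;
    }

    bool isAliasing() const { return alias_ != nullptr; }

private:
    const char16_t *alias_;   // non-null while the object still aliases outside text
    std::size_t length_;
    std::u16string owned_;    // used only after the first write
};
```

The implicitly generated copy constructor copies the alias pointer, so a copy of an unmodified object aliases the same read-only text, which corresponds to the assignment behaviour described above.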

UnicodeString::UnicodeString ( UChar *  buffer,
int32_t  buffLength,
int32_t  buffCapacity 
)

Writable-aliasing UChar* constructor.

The text will be used for the UnicodeString object, but it will not be released when the UnicodeString is destroyed. This has write-through semantics: For as long as the capacity of the buffer is sufficient, write operations will directly affect the buffer. When more capacity is necessary, then a new buffer will be allocated and the contents copied as with regularly constructed strings. In an assignment to another UnicodeString, the buffer will be copied. The extract(UChar *dst) function detects whether the dst pointer is the same as the string buffer itself and will in this case not copy the contents.

Parameters:
buffer The characters to alias for the UnicodeString.
buffLength The number of Unicode characters in buffer to alias.
buffCapacity The size of buffer in UChars.
Stable:
ICU 2.0
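Write-through semantics can be sketched in the same spirit. WritableAlias is a hypothetical class, not ICU code, and its growth policy is an arbitrary assumption; it shows writes landing directly in the caller's buffer while capacity suffices, and a switch to separately allocated storage once the buffer is full.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class, not ICU code: write-through over an aliased writable buffer.
class WritableAlias {
public:
    WritableAlias(char16_t *buffer, std::size_t length, std::size_t capacity)
        : data_(buffer), length_(length), capacity_(capacity), aliasing_(true) {
        assert(length <= capacity);
    }
    // Copying is disabled to keep the sketch simple and safe.
    WritableAlias(const WritableAlias &) = delete;
    WritableAlias &operator=(const WritableAlias &) = delete;

    void append(char16_t c) {
        if (length_ == capacity_) {
            // Out of room: copy the contents into owned storage and stop aliasing.
            std::vector<char16_t> bigger(data_, data_ + length_);
            bigger.resize(capacity_ * 2 + 1);  // growth policy is an assumption
            owned_.swap(bigger);
            data_ = owned_.data();
            capacity_ = owned_.size();
            aliasing_ = false;
        }
        data_[length_++] = c;  // writes go straight to the current buffer
    }

    char16_t at(std::size_t i) const {
        assert(i < length_);
        return data_[i];
    }
    std::size_t length() const { return length_; }
    bool isAliasing() const { return aliasing_; }

private:
    char16_t *data_;
    std::size_t length_;
    std::size_t capacity_;
    bool aliasing_;
    std::vector<char16_t> owned_;
};
```

With a four-unit buffer that already holds two units, the next two appends are visible in the caller's array; the fifth unit forces a reallocation, after which the caller's array no longer changes.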

UnicodeString::UnicodeString ( const char *  codepageData,
const char *  codepage = 0 
)

char* constructor.

Parameters:
cod