UCharacter


class UCharacter : UCharacterEnums.ECharacterCategory, UCharacterEnums.ECharacterDirection

kotlin.Any
↳	android.icu.lang.UCharacter

[icu enhancement] ICU's replacement for java.lang.Character. Methods, fields, and other functionality specific to ICU are labeled '[icu]'.

The UCharacter class provides extensions to the java.lang.Character class. These extensions provide support for more Unicode properties. Each ICU release supports the latest version of Unicode available at that time.

For some time before Java 5 added support for supplementary Unicode code points, the ICU UCharacter class and many other ICU classes already supported them. Some UCharacter methods and constants were widened slightly differently from how the Character class methods and constants were widened later. In particular, Character.MAX_VALUE is still a char with the value U+FFFF, while UCharacter.MAX_VALUE is an int with the value U+10FFFF.
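The difference in width can be seen with a minimal sketch that uses only java.lang.Character, so it runs without ICU. It assumes, as stated above and in the constants table, that UCharacter.MAX_VALUE has the same value as Character.MAX_CODE_POINT. The class name and the `countCodePoints` helper are hypothetical names chosen for illustration.

```java
public class CodePointWidthDemo {
    /** Counts code points (not UTF-16 chars) in a string. Hypothetical helper. */
    static int countCodePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // Character.MAX_VALUE is a char, so it tops out at U+FFFF.
        int maxChar = Character.MAX_VALUE;
        // Character.MAX_CODE_POINT is an int with the value U+10FFFF,
        // the value this page gives for UCharacter.MAX_VALUE.
        int maxCodePoint = Character.MAX_CODE_POINT;
        System.out.printf("U+%04X U+%04X%n", maxChar, maxCodePoint);

        // U+1F600 is a supplementary code point: two UTF-16 chars, one code point.
        String s = "a" + new String(Character.toChars(0x1F600));
        System.out.println(s.length() + " chars, " + countCodePoints(s) + " code points");
    }
}
```

This is why the API represents code points as ints: a single char cannot hold any value above U+FFFF.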

Code points are represented in this API using ints. While it would be more convenient in Java to have a separate primitive datatype for them, ints suffice in the meantime.

Aside from the additions for UTF-16 support, and the updated Unicode properties, the main differences between UCharacter and Character are:

- UCharacter is not designed to be a char wrapper and does not have APIs that involve management of that single char. These include:
  - char charValue(),
  - int compareTo(java.lang.Character, java.lang.Character), etc.
- UCharacter does not include Character APIs that are deprecated, nor does it include the Java-specific character information, such as boolean isJavaIdentifierPart(char ch).
- Character maps characters 'A' - 'Z' and 'a' - 'z' to the numeric values '10' - '35'. UCharacter also does this in digit and getNumericValue, to adhere to the Java semantics of these methods. New methods unicodeDigit, and getUnicodeNumericValue do not treat the above code points as having numeric values. This is a semantic change from ICU4J 1.3.1.
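The Java semantics mentioned in the last point can be checked with java.lang.Character alone. The sketch below shows only the Java side of the comparison; it does not call UCharacter, and the class and helper names are hypothetical.

```java
public class LetterDigitDemo {
    /** Numeric value under Java semantics: letters A-Z and a-z map to 10-35. */
    static int javaNumericValue(char c) {
        return Character.getNumericValue(c);
    }

    public static void main(String[] args) {
        // Latin letters are treated as digits in radixes above 10.
        System.out.println(Character.digit('A', 16));   // 10
        System.out.println(Character.digit('z', 36));   // 35
        // 'G' is not a valid digit in radix 16, so -1 is returned.
        System.out.println(Character.digit('G', 16));   // -1
        // getNumericValue applies the same letter mapping without a radix.
        System.out.println(javaNumericValue('a'));      // 10
        System.out.println(javaNumericValue('Z'));      // 35
    }
}
```

According to the text above, getUnicodeNumericValue would not report a numeric value for these letters; that behavior is not exercised here.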

In addition to Java compatibility functions, which calculate derived properties, this API provides low-level access to the Unicode Character Database.

Unicode assigns each code point (not just assigned character) values for many properties. Most of them are simple boolean flags, or constants from a small enumerated list. For some properties, values are strings or other relatively more complex types.

For more information see "About the Unicode Character Database" (http://www.unicode.org/ucd/) and the ICU User Guide chapter on Properties (https://unicode-org.github.io/icu/userguide/strings/properties).

There are also functions that provide easy migration from C/POSIX functions like isblank(). Their use is generally discouraged because the C/POSIX standards do not define their semantics beyond the ASCII range, which means that different implementations exhibit very different behavior. Instead, Unicode properties should be used directly.

There are also only a few, broad C/POSIX character classes, and they tend to be used for conflicting purposes. For example, the "isalpha()" class is sometimes used to determine word boundaries, while a more sophisticated approach would at least distinguish initial letters from continuation characters (the latter including combining marks). (In ICU, BreakIterator is the most sophisticated API for word boundaries.) Another example: There is no "istitle()" class for titlecase characters.

ICU 3.4 and later provides API access for all twelve C/POSIX character classes. ICU implements them according to the Standard Recommendations in Annex C: Compatibility Properties of UTS #18 Unicode Regular Expressions (http://www.unicode.org/reports/tr18/#Compatibility_Properties).

API access for C/POSIX character classes is as follows:

  - alpha:     isUAlphabetic(c) or hasBinaryProperty(c, UProperty.ALPHABETIC)
  - lower:     isULowercase(c) or hasBinaryProperty(c, UProperty.LOWERCASE)
  - upper:     isUUppercase(c) or hasBinaryProperty(c, UProperty.UPPERCASE)
  - punct:     ((1<<getType(c)) & ((1<<DASH_PUNCTUATION)|(1<<START_PUNCTUATION)|
                (1<<END_PUNCTUATION)|(1<<CONNECTOR_PUNCTUATION)|(1<<OTHER_PUNCTUATION)|
                (1<<INITIAL_PUNCTUATION)|(1<<FINAL_PUNCTUATION)))!=0
  - digit:     isDigit(c) or getType(c)==DECIMAL_DIGIT_NUMBER
  - xdigit:    hasBinaryProperty(c, UProperty.POSIX_XDIGIT)
  - alnum:     hasBinaryProperty(c, UProperty.POSIX_ALNUM)
  - space:     isUWhiteSpace(c) or hasBinaryProperty(c, UProperty.WHITE_SPACE)
  - blank:     hasBinaryProperty(c, UProperty.POSIX_BLANK)
  - cntrl:     getType(c)==CONTROL
  - graph:     hasBinaryProperty(c, UProperty.POSIX_GRAPH)
  - print:     hasBinaryProperty(c, UProperty.POSIX_PRINT)
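
The punct entry above packs seven general categories into one bit mask. A minimal sketch of the same technique follows, written against java.lang.Character rather than UCharacter so that it runs without ICU. Java names the Pi and Pf categories INITIAL_QUOTE_PUNCTUATION and FINAL_QUOTE_PUNCTUATION; the class name and `isPunct` are hypothetical, and results may differ from ICU where the two libraries track different Unicode versions.

```java
public class PunctMaskDemo {
    // One bit per general category that counts as punctuation.
    // All category values are below 31, so each shift fits in an int.
    private static final int PUNCT_MASK =
            (1 << Character.DASH_PUNCTUATION)
          | (1 << Character.START_PUNCTUATION)
          | (1 << Character.END_PUNCTUATION)
          | (1 << Character.CONNECTOR_PUNCTUATION)
          | (1 << Character.OTHER_PUNCTUATION)
          | (1 << Character.INITIAL_QUOTE_PUNCTUATION)
          | (1 << Character.FINAL_QUOTE_PUNCTUATION);

    /** True if the code point's general category is one of the P* categories. */
    static boolean isPunct(int cp) {
        return ((1 << Character.getType(cp)) & PUNCT_MASK) != 0;
    }

    public static void main(String[] args) {
        System.out.println(isPunct('!'));     // true  (Po)
        System.out.println(isPunct('('));     // true  (Ps)
        System.out.println(isPunct(0x201C));  // true  (Pi, left double quotation mark)
        System.out.println(isPunct('+'));     // false (Sm, a math symbol)
        System.out.println(isPunct('A'));     // false (Lu)
    }
}
```

Testing membership with a single shift and AND avoids a chain of seven equality comparisons per character.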

The C/POSIX character classes are also available in UnicodeSet patterns, using patterns like [:graph:] or \p{graph}.

[icu] Note: There are several ICU (and Java) whitespace functions. Comparison:

isUWhiteSpace=UCHAR_WHITE_SPACE: Unicode White_Space property; most of general categories "Z" (separators) + most whitespace ISO controls (including no-break spaces, but excluding IS1..IS4)
isWhitespace: Java isWhitespace; Z + whitespace ISO controls but excluding no-break spaces
isSpaceChar: just Z (including no-break spaces)
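
The last two rows of this comparison can be reproduced with java.lang.Character, which defines the isWhitespace and isSpaceChar semantics being described. The sketch below is illustrative only and does not call isUWhiteSpace, which requires ICU; the class name and `classify` helper are hypothetical.

```java
public class WhitespaceDemo {
    /** Formats both Java whitespace tests for one code point. */
    static String classify(int cp) {
        return String.format("U+%04X isWhitespace=%b isSpaceChar=%b",
                cp, Character.isWhitespace(cp), Character.isSpaceChar(cp));
    }

    public static void main(String[] args) {
        // SPACE is a Z-category separator and whitespace under both tests.
        System.out.println(classify(0x0020)); // U+0020 isWhitespace=true isSpaceChar=true
        // NO-BREAK SPACE is in Z, but isWhitespace excludes no-break spaces.
        System.out.println(classify(0x00A0)); // U+00A0 isWhitespace=false isSpaceChar=true
        // TAB is an ISO control, not a Z-category separator.
        System.out.println(classify(0x0009)); // U+0009 isWhitespace=true isSpaceChar=false
        // U+001C (IS4) counts as Java whitespace; per the list above, isUWhiteSpace excludes it.
        System.out.println(classify(0x001C)); // U+001C isWhitespace=true isSpaceChar=false
    }
}
```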

This class is not subclassable.

Summary

Nested classes
abstract	`BidiPairedBracketType` Bidi Paired Bracket Type constants.
abstract	`DecompositionType` Decomposition Type constants.
abstract	`EastAsianWidth` East Asian Width constants.
abstract	`GraphemeClusterBreak` Grapheme Cluster Break constants.
abstract	`HangulSyllableType` Hangul Syllable Type constants.
	`IdentifierType` Identifier Type constants.
abstract	`IndicPositionalCategory` Indic Positional Category constants.
abstract	`IndicSyllabicCategory` Indic Syllabic Category constants.
abstract	`JoiningGroup` Joining Group constants.
abstract	`JoiningType` Joining Type constants.
abstract	`LineBreak` Line Break constants.
abstract	`NumericType` Numeric Type constants.
abstract	`SentenceBreak` Sentence Break constants.
	`UnicodeBlock` [icu enhancement] ICU's replacement for `java.lang.Character.UnicodeBlock`.
abstract	`VerticalOrientation` Vertical Orientation constants.
abstract	`WordBreak` Word Break constants.

Constants
static Int	`FOLD_CASE_DEFAULT` [icu] Option value for case folding: use default mappings defined in CaseFolding.
static Int	`FOLD_CASE_EXCLUDE_SPECIAL_I` [icu] Option value for case folding: Use the modified set of mappings provided in CaseFolding.
static Int	`MAX_CODE_POINT` Constant U+10FFFF, same as `Character.MAX_CODE_POINT`.
static Char	`MAX_HIGH_SURROGATE` Constant U+DBFF, same as `Character.MAX_HIGH_SURROGATE`.
static Char	`MAX_LOW_SURROGATE` Constant U+DFFF, same as `Character.MAX_LOW_SURROGATE`.
static Int	`MAX_RADIX` Compatibility constant for Java Character's MAX_RADIX.
static Char	`MAX_SURROGATE` Constant U+DFFF, same as `Character.MAX_SURROGATE`.
static Int	`MAX_VALUE` The highest Unicode code point value (scalar value), constant U+10FFFF (uses 21 bits).
static Int	`MIN_CODE_POINT` Constant U+0000, same as `Character.MIN_CODE_POINT`.
static Char	`MIN_HIGH_SURROGATE` Constant U+D800, same as `Character.MIN_HIGH_SURROGATE`.
static Char	`MIN_LOW_SURROGATE` Constant U+DC00, same as `Character.MIN_LOW_SURROGATE`.
static Int	`MIN_RADIX` Compatibility constant for Java Character's MIN_RADIX.
static Int	`MIN_SUPPLEMENTARY_CODE_POINT` Constant U+10000, same as `Character.MIN_SUPPLEMENTARY_CODE_POINT`.
static Char	`MIN_SURROGATE` Constant U+D800, same as `Character.MIN_SURROGATE`.
static Int	`MIN_VALUE` The lowest Unicode code point value, constant 0.
static Double	`NO_NUMERIC_VALUE` Special value that is returned by getUnicodeNumericValue(int) when no numeric value is defined for a code point.
static Int	`REPLACEMENT_CHAR` Unicode value used when translating into Unicode encoding form and there is no existing character.
static Int	`SUPPLEMENTARY_MIN_VALUE` The minimum value for Supplementary code points, constant U+10000.
static Int	`TITLECASE_NO_BREAK_ADJUSTMENT` Do not adjust the titlecasing indexes from BreakIterator::next() indexes; titlecase exactly the characters at breaks from the iterator.
static Int	`TITLECASE_NO_LOWERCASE` Do not lowercase non-initial parts of words when titlecasing.

Inherited constants

From class ECharacterDirection

`Int`	`ARABIC_NUMBER` Directional type AN
`Int`	`BLOCK_SEPARATOR` Directional type B
`Int`	`BOUNDARY_NEUTRAL` Directional type BN
`Int`	`COMMON_NUMBER_SEPARATOR` Directional type CS
`Byte`	`DIRECTIONALITY_ARABIC_NUMBER` Equivalent to `java.lang.Character#DIRECTIONALITY_ARABIC_NUMBER`. Synonym of `ARABIC_NUMBER`.
`Byte`	`DIRECTIONALITY_BOUNDARY_NEUTRAL` Equivalent to `java.lang.Character#DIRECTIONALITY_BOUNDARY_NEUTRAL`. Synonym of `BOUNDARY_NEUTRAL`.
`Byte`	`DIRECTIONALITY_COMMON_NUMBER_SEPARATOR` Equivalent to `java.lang.Character#DIRECTIONALITY_COMMON_NUMBER_SEPARATOR`. Synonym of `COMMON_NUMBER_SEPARATOR`.
`Byte`	`DIRECTIONALITY_EUROPEAN_NUMBER` Equivalent to `java.lang.Character#DIRECTIONALITY_EUROPEAN_NUMBER`. Synonym of `EUROPEAN_NUMBER`.
`Byte`	`DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR` Equivalent to `java.lang.Character#DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR`. Synonym of `EUROPEAN_NUMBER_SEPARATOR`.
`Byte`	`DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR` Equivalent to `java.lang.Character#DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR`. Synonym of `EUROPEAN_NUMBER_TERMINATOR`.
`Byte`	`DIRECTIONALITY_LEFT_TO_RIGHT` Equivalent to `java.lang.Character#DIRECTIONALITY_LEFT_TO_RIGHT`. Synonym of `LEFT_TO_RIGHT`.
`Byte`	`DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING` Equivalent to `java.lang.Character#DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING`. Synonym of `LEFT_TO_RIGHT_EMBEDDING`.
`Byte`	`DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE` Equivalent to `java.lang.Character#DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE`. Synonym of `LEFT_TO_RIGHT_OVERRIDE`.
`Byte`	`DIRECTIONALITY_NONSPACING_MARK` Equivalent to `java.lang.Character#DIRECTIONALITY_NONSPACING_MARK`. Synonym of `DIR_NON_SPACING_MARK`.
`Byte`	`DIRECTIONALITY_OTHER_NEUTRALS` Equivalent to `java.lang.Character#DIRECTIONALITY_OTHER_NEUTRALS`. Synonym of `OTHER_NEUTRAL`.
`Byte`	`DIRECTIONALITY_PARAGRAPH_SEPARATOR` Equivalent to `java.lang.Character#DIRECTIONALITY_PARAGRAPH_SEPARATOR`. Synonym of `BLOCK_SEPARATOR`.
`Byte`	`DIRECTIONALITY_POP_DIRECTIONAL_FORMAT` Equivalent to `java.lang.Character#DIRECTIONALITY_POP_DIRECTIONAL_FORMAT`. Synonym of `POP_DIRECTIONAL_FORMAT`.
`Byte`	`DIRECTIONALITY_RIGHT_TO_LEFT` Equivalent to `java.lang.Character#DIRECTIONALITY_RIGHT_TO_LEFT`. Synonym of `RIGHT_TO_LEFT`.
`Byte`	`DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC` Equivalent to `java.lang.Character#DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC`. Synonym of `RIGHT_TO_LEFT_ARABIC`.
`Byte`	`DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING` Equivalent to `java.lang.Character#DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING`. Synonym of `RIGHT_TO_LEFT_EMBEDDING`.
`Byte`	`DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE` Equivalent to `java.lang.Character#DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE`. Synonym of `RIGHT_TO_LEFT_OVERRIDE`.
`Byte`	`DIRECTIONALITY_SEGMENT_SEPARATOR` Equivalent to `java.lang.Character#DIRECTIONALITY_SEGMENT_SEPARATOR`. Synonym of `SEGMENT_SEPARATOR`.
`Byte`	`DIRECTIONALITY_UNDEFINED` Undefined bidirectional character type. Undefined `char` values have undefined directionality in the Unicode specification.
`Byte`	`DIRECTIONALITY_WHITESPACE` Equivalent to `java.lang.Character#DIRECTIONALITY_WHITESPACE`. Synonym of `WHITE_SPACE_NEUTRAL`.
`Int`	`DIR_NON_SPACING_MARK` Directional type NSM
`Int`	`EUROPEAN_NUMBER` Directional type EN
`Int`	`EUROPEAN_NUMBER_SEPARATOR` Directional type ES
`Int`	`EUROPEAN_NUMBER_TERMINATOR` Directional type ET
`Byte`	`FIRST_STRONG_ISOLATE` Directional type FSI
`Int`	`LEFT_TO_RIGHT` Directional type L
`Int`	`LEFT_TO_RIGHT_EMBEDDING` Directional type LRE
`Byte`	`LEFT_TO_RIGHT_ISOLATE` Directional type LRI
`Int`	`LEFT_TO_RIGHT_OVERRIDE` Directional type LRO
`Int`	`OTHER_NEUTRAL` Directional type ON
`Int`	`POP_DIRECTIONAL_FORMAT` Directional type PDF
`Byte`	`POP_DIRECTIONAL_ISOLATE` Directional type PDI
`Int`	`RIGHT_TO_LEFT` Directional type R
`Int`	`RIGHT_TO_LEFT_ARABIC` Directional type AL
`Int`	`RIGHT_TO_LEFT_EMBEDDING` Directional type RLE
`Byte`	`RIGHT_TO_LEFT_ISOLATE` Directional type RLI
`Int`	`RIGHT_TO_LEFT_OVERRIDE` Directional type RLO
`Int`	`SEGMENT_SEPARATOR` Directional type S
`Int`	`WHITE_SPACE_NEUTRAL` Directional type WS

From class ECharacterCategory

`Byte`	`COMBINING_SPACING_MARK` Character type Mc
`Byte`	`CONNECTOR_PUNCTUATION` Character type Pc
`Byte`	`CONTROL` Character type Cc
`Byte`	`CURRENCY_SYMBOL` Character type Sc
`Byte`	`DASH_PUNCTUATION` Character type Pd
`Byte`	`DECIMAL_DIGIT_NUMBER` Character type Nd
`Byte`	`ENCLOSING_MARK` Character type Me
`Byte`	`END_PUNCTUATION` Character type Pe
`Byte`	`FINAL_PUNCTUATION` Character type Pf
`Byte`	`FINAL_QUOTE_PUNCTUATION` Character type Pf This name is compatible with java.lang.Character's name for this type.
`Byte`	`FORMAT` Character type Cf
`Byte`	`GENERAL_OTHER_TYPES` Character type Cn Not Assigned (no characters in [UnicodeData.txt] have this property)
`Byte`	`INITIAL_PUNCTUATION` Character type Pi
`Byte`	`INITIAL_QUOTE_PUNCTUATION` Character type Pi This name is compatible with java.lang.Character's name for this type.
`Byte`	`LETTER_NUMBER` Character type Nl
`Byte`	`LINE_SEPARATOR` Character type Zl
`Byte`	`LOWERCASE_LETTER` Character type Ll
`Byte`	`MATH_SYMBOL` Character type Sm
`Byte`	`MODIFIER_LETTER` Character type Lm
`Byte`	`MODIFIER_SYMBOL` Character type Sk
`Byte`	`NON_SPACING_MARK` Character type Mn
`Byte`	`OTHER_LETTER` Character type Lo
`Byte`	`OTHER_NUMBER` Character type No
`Byte`	`OTHER_PUNCTUATION` Character type Po
`Byte`	`OTHER_SYMBOL` Character type So
`Byte`	`PARAGRAPH_SEPARATOR` Character type Zp
`Byte`	`PRIVATE_USE` Character type Co
`Byte`	`SPACE_SEPARATOR` Character type Zs
`Byte`	`START_PUNCTUATION` Character type Ps
`Byte`	`SURROGATE` Character type Cs
`Byte`	`TITLECASE_LETTER` Character type Lt
`Byte`	`UNASSIGNED` Unassigned character type
`Byte`	`UPPERCASE_LETTER` Character type Lu

Public methods
static Int	`charCount(cp: Int)` Same as `Character.charCount`.
static Int	`codePointAt(text: CharArray!, index: Int)` Same as `Character.codePointAt(char[],int)`.
static Int	`codePointAt(text: CharArray!, index: Int, limit: Int)` Same as `Character.codePointAt(char[],int,int)`.
static Int	`codePointAt(seq: CharSequence!, index: Int)` Same as `Character.codePointAt(CharSequence,int)`.
static Int	`codePointBefore(text: CharArray!, index: Int)` Same as `Character.codePointBefore(char[],int)`.
static Int	`codePointBefore(text: CharArray!, index: Int, limit: Int)` Same as `Character.codePointBefore(char[],int,int)`.
static Int	`codePointBefore(seq: CharSequence!, index: Int)` Same as `Character.codePointBefore(CharSequence,int)`.
static Int	`codePointCount(text: CharArray!, start: Int, limit: Int)` Equivalent to the `Character.codePointCount(char[],int,int)` method, for convenience.
static Int	`codePointCount(text: CharSequence!, start: Int, limit: Int)` Equivalent to the `Character.codePointCount(CharSequence,int,int)` method, for convenience.
static Int	`digit(ch: Int)` Returns the numeric value of a decimal digit code point.
static Int	`digit(ch: Int, radix: Int)` Returns the numeric value of a decimal digit code point.
static Int	`foldCase(ch: Int, defaultmapping: Boolean)` [icu] The given character is mapped to its case folding equivalent according to UnicodeData.
static Int	`foldCase(ch: Int, options: Int)` [icu] The given character is mapped to its case folding equivalent according to UnicodeData.
static String!	`foldCase(str: String!, defaultmapping: Boolean)` [icu] The given string is mapped to its case folding equivalent according to UnicodeData.
static String!	`foldCase(str: String!, options: Int)` [icu] The given string is mapped to its case folding equivalent according to UnicodeData.
static Char	`forDigit(digit: Int, radix: Int)` Provide the java.
static VersionInfo!	`getAge(ch: Int)` [icu] Returns the "age" of the code point.
static Int	`getBidiPairedBracket(c: Int)` [icu] Maps the specified character to its paired bracket character.
static Int	`getCharFromExtendedName(name: String!)` [icu]
static Int	`getCharFromName(name: String!)` [icu]
static Int	`getCharFromNameAlias(name: String!)` [icu]
static Int	`getCodePoint(char16: Char)` [icu] Returns the code point corresponding to the BMP code point.
static Int	`getCodePoint(lead: Char, trail: Char)` [icu] Returns a code point corresponding to the two surrogate code units.
static Int	`getCodePoint(lead: Int, trail: Int)` [icu] Returns a code point corresponding to the two surrogate code units.
static Int	`getCombiningClass(ch: Int)` [icu] Returns the combining class of the argument codepoint
static Int	`getDirection(ch: Int)` [icu] Returns the Bidirection property of a code point.
static Byte	`getDirectionality(cp: Int)` Equivalent to the `Character.getDirectionality(char)` method, for convenience.
static String!	`getExtendedName(ch: Int)` [icu] Returns a name for a valid codepoint.
static ValueIterator!	`getExtendedNameIterator()` [icu]
static Int	`getHanNumericValue(ch: Int)` [icu] Returns the numeric value of a Han character.
static Int	`getIdentifierTypes(c: Int, types: EnumSet<UCharacter.IdentifierType!>!)` Writes code point c's Identifier_Type as a set of IdentifierType values and returns the number of types.
static Int	`getIntPropertyMaxValue(type: Int)` [icu] Returns the maximum value for an integer/binary Unicode property.
static Int	`getIntPropertyMinValue(type: Int)` [icu] Returns the minimum value for an integer/binary Unicode property type.
static Int	`getIntPropertyValue(ch: Int, type: Int)` [icu] Returns the property value for a Unicode property type of a code point.
static Int	`getMirror(ch: Int)` [icu] Maps the specified code point to a "mirror-image" code point.
static String!	`getName(ch: Int)` [icu] Returns the most current Unicode name of the argument code point, or null if the character is unassigned or outside the range `UCharacter.MIN_VALUE` and `UCharacter.MAX_VALUE` or does not have a name.
static String!	`getName(s: String!, separator: String!)` [icu] Returns the names for each of the characters in a string
static String!	`getNameAlias(ch: Int)` [icu] Returns the corrected name from NameAliases.
static ValueIterator!	`getNameIterator()` [icu]
static Int	`getNumericValue(ch: Int)` Returns the numeric value of the code point as a nonnegative integer.
static Int	`getPropertyEnum(propertyAlias: CharSequence!)` [icu] Return the UProperty selector for a given property name, as specified in the Unicode database file PropertyAliases.
static String!	`getPropertyName(property: Int, nameChoice: Int)` [icu] Return the Unicode name for a given property, as given in the Unicode database file PropertyAliases.
static Int	`getPropertyValueEnum(property: Int, valueAlias: CharSequence!)` [icu] Return the property value integer for a given value name, as specified in the Unicode database file PropertyValueAliases.
static String!	`getPropertyValueName(property: Int, value: Int, nameChoice: Int)` [icu] Return the Unicode name for a given property value, as given in the Unicode database file PropertyValueAliases.
static Int	`getType(ch: Int)` Returns a value indicating a code point's Unicode category.
static RangeValueIterator!	`getTypeIterator()` [icu]
static Double	`getUnicodeNumericValue(ch: Int)` [icu] Returns the numeric value for a Unicode code point as defined in the Unicode Character Database.
static VersionInfo!	`getUnicodeVersion()` [icu] Returns the version of Unicode data used.
static Boolean	`hasBinaryProperty(ch: Int, property: Int)` [icu] Check a binary Unicode property for a code point.
static Boolean	`hasBinaryProperty(s: CharSequence!, property: Int)` [icu] Returns true if the property is true for the string.
static Boolean	`hasIdentifierType(c: Int, type: UCharacter.IdentifierType!)` Does the set of Identifier_Type values of code point c contain the given type?
static Boolean	`isBMP(ch: Int)` [icu] Determines if the code point is in the BMP plane.
static Boolean	`isBaseForm(ch: Int)` [icu] Determines whether the specified code point is of base form.
static Boolean	`isDefined(ch: Int)` Determines if a code point has a defined meaning in the up-to-date Unicode standard.
static Boolean	`isDigit(ch: Int)` Determines if a code point is a Java digit.
static Boolean	`isHighSurrogate(ch: Char)` Same as `Character.isHighSurrogate`.
static Boolean	`isHighSurrogate(codePoint: Int)` Same as `Character.isHighSurrogate`, except that the ICU version accepts `int` for code points.
static Boolean	`isISOControl(ch: Int)` Determines if the specified code point is an ISO control character.
static Boolean	`isIdentifierIgnorable(ch: Int)` Determines if the specified code point should be regarded as an ignorable character in a Java identifier.
static Boolean	`isJavaIdentifierPart(cp: Int)` Compatibility override of Java method, delegates to java.
static Boolean	`isJavaIdentifierStart(cp: Int)` Compatibility override of Java method, delegates to java.
static Boolean	`isLegal(ch: Int)` [icu] A code point is illegal if and only if it is out of bounds (less than 0 or greater than UCharacter.MAX_VALUE), a surrogate value (0xD800 to 0xDFFF), or a not-a-character of the form 0x xxFFFF or 0x xxFFFE. Note: legal does not mean that it is assigned in this version of Unicode.
static Boolean	`isLegal(str: String!)` [icu] A string is legal iff all its code points are legal.
static Boolean	`isLetter(ch: Int)` Determines if the specified code point is a letter.
static Boolean	`isLetterOrDigit(ch: Int)` Determines if the specified code point is a letter or digit.
static Boolean	`isLowSurrogate(ch: Char)` Same as `Character.isLowSurrogate`.
static Boolean	`isLowSurrogate(codePoint: Int)` Same as `Character.isLowSurrogate`, except that the ICU version accepts `int` for code points.
static Boolean	`isLowerCase(ch: Int)` Determines if the specified code point is a lowercase character.
static Boolean	`isMirrored(ch: Int)` Determines whether the code point has the "mirrored" property.
static Boolean	`isPrintable(ch: Int)` [icu] Determines whether the specified code point is a printable character according to the Unicode standard.
static Boolean	`isSpaceChar(ch: Int)` Determines if the specified code point is a Unicode specified space character, i.
static Boolean	`isSupplementary(ch: Int)` [icu] Determines if the code point is a supplementary character.
static Boolean	`isSupplementaryCodePoint(cp: Int)` Same as `Character.isSupplementaryCodePoint`.
static Boolean	`isSurrogatePair(high: Char, low: Char)` Same as `Character.isSurrogatePair`.
static Boolean	`isSurrogatePair(high: Int, low: Int)` Same as `Character.isSurrogatePair`, except that the ICU version accepts `int` for code points.
static Boolean	`isTitleCase(ch: Int)` Determines if the specified code point is a titlecase character.
static Boolean	`isUAlphabetic(ch: Int)` [icu]
static Boolean	`isULowercase(ch: Int)` [icu]
static Boolean	`isUUppercase(ch: Int)` [icu]
static Boolean	`isUWhiteSpace(ch: Int)` [icu]
static Boolean	`isUnicodeIdentifierPart(ch: Int)` Determines if the specified character is permissible as a non-initial character of an identifier according to UAX #31 Unicode Identifier and Pattern Syntax.
static Boolean	`isUnicodeIdentifierStart(ch: Int)` Determines if the specified character is permissible as the first character in an identifier according to UAX #31 Unicode Identifier and Pattern Syntax.
static Boolean	`isUpperCase(ch: Int)` Determines if the specified code point is an uppercase character.
static Boolean	`isValidCodePoint(cp: Int)` Is cp a Unicode code point U+0000.
static Boolean	`isWhitespace(ch: Int)` Determines if the specified code point is a white space character.
static Int	`offsetByCodePoints(text: CharArray!, start: Int, count: Int, index: Int, codePointOffset: Int)` Equivalent to the `Character.offsetByCodePoints(char[],int,int,int,int)` method, for convenience.
static Int	`offsetByCodePoints(text: CharSequence!, index: Int, codePointOffset: Int)` Equivalent to the `Character.offsetByCodePoints(CharSequence,int,int)` method, for convenience.
static CharArray!	`toChars(cp: Int)` Same as `Character.toChars(int)`.
static Int	`toChars(cp: Int, dst: CharArray!, dstIndex: Int)` Same as `Character.toChars(int,char[],int)`.
static Int	`toCodePoint(high: Char, low: Char)` Same as `Character.toCodePoint`.
static Int	`toCodePoint(high: Int, low: Int)` Same as `Character.toCodePoint`, except that the ICU version accepts `int` for code points.
static String!	`toLowerCase(locale: ULocale!, str: String!)` Returns the lowercase version of the argument string.
static Int	`toLowerCase(ch: Int)` The given code point is mapped to its lowercase equivalent; if the code point has no lowercase equivalent, the code point itself is returned.
static String!	`toLowerCase(str: String!)` Returns the lowercase version of the argument string.
static String!	`toLowerCase(locale: Locale!, str: String!)` Returns the lowercase version of the argument string.
static String!	`toString(ch: Int)` Converts argument code point and returns a String object representing the code point's value in UTF-16 format.
static String!	`toTitleCase(locale: ULocale!, str: String!, titleIter: BreakIterator!)` Returns the titlecase version of the argument string.
static String!	`toTitleCase(locale: ULocale!, str: String!, titleIter: BreakIterator!, options: Int)` Returns the titlecase version of the argument string.
static Int	`toTitleCase(ch: Int)` Converts the code point argument to titlecase.
static String!	`toTitleCase(str: String!, breakiter: BreakIterator!)` Returns the titlecase version of the argument string.
static String!	`toTitleCase(locale: Locale!, str: String!, breakiter: BreakIterator!)` Returns the titlecase version of the argument string.
static String!	`toTitleCase(locale: Locale!, str: String!, titleIter: BreakIterator!, options: Int)` [icu]
static String!	`toUpperCase(locale: ULocale!, str: String!)` Returns the uppercase version of the argument string.
static Int	`toUpperCase(ch: Int)` Converts the character argument to uppercase.
static String!	`toUpperCase(str: String!)` Returns the uppercase version of the argument string.
static String!	`toUpperCase(locale: Locale!, str: String!)` Returns the uppercase version of the argument string.
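
The conditions given for `isLegal` in the table above can be written out as a short sketch. This is a reimplementation of the documented rule only, not ICU's code, and the real method may apply checks that the summary does not mention. The class name is hypothetical.

```java
public class LegalCodePointSketch {
    /** Applies the three documented conditions for an illegal code point. */
    static boolean isLegal(int cp) {
        // Out of bounds: below 0 or above U+10FFFF.
        if (cp < 0 || cp > 0x10FFFF) {
            return false;
        }
        // Surrogate values, 0xD800 to 0xDFFF.
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            return false;
        }
        // Not-a-character: the low 16 bits are FFFE or FFFF in any plane.
        int low16 = cp & 0xFFFF;
        return low16 != 0xFFFE && low16 != 0xFFFF;
    }

    /** A string is legal if and only if all of its code points are legal. */
    static boolean isLegal(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);   // an unpaired surrogate is returned as-is
            if (!isLegal(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isLegal(0x41));       // true
        System.out.println(isLegal(0xD800));     // false, surrogate
        System.out.println(isLegal(0x1FFFF));    // false, not-a-character
        System.out.println(isLegal("a\uD800"));  // false, unpaired surrogate
    }
}
```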

Parameters
`text`	CharArray!: the characters to check
`index`	Int: the index of the first or only char forming the code point

Parameters
`seq`	CharSequence!: the characters to check
`index`	Int: the index of the first or only char forming the code point

Parameters
`text`	CharArray!: the characters to check
`start`	Int: the start of the range
`limit`	Int: the limit of the range

Parameters
`ch`	Int: the character to be converted
`defaultmapping`	Boolean: Indicates whether the default mappings defined in CaseFolding.txt are to be used, otherwise the mappings for dotted I and dotless i marked with 'T' in CaseFolding.txt are included.

Parameters
`ch`	Int: the character to be converted
`options`	Int: A bit set for special processing. Currently the recognised options are FOLD_CASE_EXCLUDE_SPECIAL_I and FOLD_CASE_DEFAULT

Parameters
`str`	String!: the String to be converted
`defaultmapping`	Boolean: Indicates whether the default mappings defined in CaseFolding.txt are to be used, otherwise the mappings for dotted I and dotless i marked with 'T' in CaseFolding.txt are included.

Parameters
`lead`	Int: the lead unit (In ICU 2.1-69 the type of both parameters was `char`.)
`trail`	Int: the trail unit

Parameters
`c`	Int: code point
`types`	EnumSet<UCharacter.IdentifierType!>!: output set

Parameters
`ch`	Int: code point to test.
`type`	Int: UProperty selector constant, identifies which binary property to check. Must be UProperty.BINARY_START <= type < UProperty.BINARY_LIMIT or UProperty.INT_START <= type < UProperty.INT_LIMIT or UProperty.MASK_START <= type < UProperty.MASK_LIMIT.

Parameters
`s`	String!: string to format
`separator`	String!: string to go between names

Parameters
`property`	Int: UProperty selector.
`nameChoice`	Int: UProperty.NameChoice selector for which name to get. All properties have a long name. Most have a short name, but some do not. Unicode allows for additional names; if present these will be returned by UProperty.NameChoice.LONG + i, where i=1, 2,...

Parameters
`property`	Int: UProperty selector constant. UProperty.INT_START <= property < UProperty.INT_LIMIT or UProperty.BINARY_START <= property < UProperty.BINARY_LIMIT or UProperty.MASK_START <= property < UProperty.MASK_LIMIT. Only these properties can be enumerated.
`valueAlias`	CharSequence!: the value name to be matched. The name is compared using "loose matching" as described in PropertyValueAliases.txt.

Parameters
`property`	Int: UProperty selector constant. UProperty.INT_START <= property < UProperty.INT_LIMIT or UProperty.BINARY_START <= property < UProperty.BINARY_LIMIT or UProperty.MASK_START <= property < UProperty.MASK_LIMIT. If out of range, null is returned.
`value`	Int: selector for a value for the given property. In general, valid values range from 0 up to some maximum. There are a few exceptions: (1.) UProperty.BLOCK values begin at the non-zero value BASIC_LATIN.getID(). (2.) UProperty.CANONICAL_COMBINING_CLASS values are not contiguous and range from 0..240. (3.) UProperty.GENERAL_CATEGORY_MASK values are mask values produced by left-shifting 1 by UCharacter.getType(). This allows grouped categories such as [:L:] to be represented. Mask values are non-contiguous.
`nameChoice`	Int: UProperty.NameChoice selector for which name to get. All values have a long name. Most have a short name, but some do not. Unicode allows for additional names; if present these will be returned by UProperty.NameChoice.LONG + i, where i=1, 2,...

Parameters
`s`	CharSequence!: String to test.
`property`	Int: UProperty selector constant, identifies which binary property to check. Must be BINARY_START <= property < BINARY_LIMIT.

Parameters
`c`	Int: code point
`type`	UCharacter.IdentifierType!: Identifier_Type to check

Parameters
`high`	Char: the high (lead) char
`low`	Char: the low (trail) char

Parameters
`high`	Int: the high (lead) unit (In ICU 3.0-69 the type of both parameters was `char`.)
`low`	Int: the low (trail) unit

Parameters
`cp`	Int: the code point to convert
`dst`	CharArray!: the destination array into which to put the char(s) representing the code point
`dstIndex`	Int: the index at which to put the first (or only) char

Parameters
`high`	Char: the high (lead) surrogate
`low`	Char: the low (trail) surrogate

Parameters
`high`	Int: the high (lead) surrogate (In ICU 3.0-69 the type of both parameters was `char`.)
`low`	Int: the low (trail) surrogate
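
The surrogate parameters above describe the two halves of a supplementary code point. The following sketch round-trips one code point through its UTF-16 form using the java.lang.Character methods that the summary table describes these methods as the "same as". It does not call UCharacter itself, and the class and helper names are hypothetical.

```java
public class SurrogateRoundTrip {
    /** Combines a high (lead) and low (trail) surrogate into one code point. */
    static int combine(char high, char low) {
        if (!Character.isSurrogatePair(high, low)) {
            throw new IllegalArgumentException("not a surrogate pair");
        }
        return Character.toCodePoint(high, low);
    }

    /** Writes cp into dst at dstIndex and returns the number of chars written (1 or 2). */
    static int write(int cp, char[] dst, int dstIndex) {
        return Character.toChars(cp, dst, dstIndex);
    }

    public static void main(String[] args) {
        char[] buf = new char[2];
        int written = write(0x1F600, buf, 0);
        // U+1F600 encodes as the pair D83D DE00.
        System.out.printf("%d chars: %04X %04X%n", written, (int) buf[0], (int) buf[1]);
        System.out.printf("U+%X%n", combine(buf[0], buf[1]));  // U+1F600
    }
}
```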

Parameters
`locale`	ULocale!: the locale in which the string is to be converted
`str`	String!: the source string to convert

Parameters
`locale`	Locale!: the locale in which the string is to be converted
`str`	String!: the source string to convert

Parameters
`str`	String!: the source string to convert
`breakiter`	BreakIterator!: break iterator that determines the positions at which characters should be titlecased.