# com.upokecenter.text.Normalizer

@Deprecated public final class Normalizer extends Object

Implements the Unicode normalization algorithm and provides methods to test whether Unicode strings are normalized and to convert them to a Unicode normalization form.

NOTICE: While this class's source code is in the public domain, the class uses a class, called NormalizationData, that includes data derived from the Unicode Character Database. See the documentation for the NormalizerInput class for the permission notice for the Unicode Character Database.

Constructors

Methods

static boolean IsNormalized(String str, Normalization form)
Deprecated. Returns whether the given string is in the specified Unicode normalization form.
static String Normalize(String str, Normalization form)
Deprecated. Converts a string to the specified Unicode normalization form.
int Read(int[] chars, int index, int length)
Deprecated. Reads a sequence of Unicode code points from a data source.
int ReadChar()
Deprecated. Reads a Unicode character from a data source.

Method Details

Normalize

public static String Normalize(String str, Normalization form)

Converts a string to the specified Unicode normalization form.

Parameters:

str - An arbitrary string.
form - The Unicode normalization form to convert to.

Returns:

The parameter str converted to the specified normalization form.

Throws:

NullPointerException - The parameter str is null.
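
As a rough illustration of what converting to a normalization form means, the sketch below uses the JDK's own `java.text.Normalizer` as a stand-in; it does not call this class. The class name `NormalizeSketch` and the helper `toForm` are hypothetical, and the explicit null check only mirrors the documented NullPointerException.

```java
import java.text.Normalizer;

// Hypothetical stand-in: relies on the JDK's java.text.Normalizer,
// not on com.upokecenter.text.Normalizer.
final class NormalizeSketch {
  private NormalizeSketch() {}

  // Converts str to the given normalization form.
  // Throws NullPointerException when str is null, as the section documents.
  static String toForm(String str, Normalizer.Form form) {
    if (str == null) {
      throw new NullPointerException("str");
    }
    return Normalizer.normalize(str, form);
  }

  public static void main(String[] args) {
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT (two code points)
    String decomposed = "e\u0301";
    String composed = toForm(decomposed, Normalizer.Form.NFC);
    System.out.println(decomposed.length() + " -> " + composed.length());
  }
}
```

NFC composes the base letter and the combining accent into a single code point, while NFD splits a precomposed letter back into the two-code-point sequence.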

IsNormalized

public static boolean IsNormalized(String str, Normalization form)

Returns whether the given string is in the specified Unicode normalization form.

Parameters:

str - The string to check.
form - The Unicode normalization form to check against.

Returns:

true if str is in the specified normalization form; otherwise, false. Returns false if the string contains an unpaired surrogate code point.
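
The check described above, including the rule that an unpaired surrogate makes the result false, can be sketched as follows. This is a minimal approximation built on the JDK's `java.text.Normalizer`, not this class's implementation; `IsNormalizedSketch`, `hasUnpairedSurrogate`, and `isNormalized` are hypothetical names. The section does not say how a null string is handled, so the sketch simply lets a NullPointerException propagate.

```java
import java.text.Normalizer;

// Hypothetical stand-in: relies on the JDK's java.text.Normalizer,
// not on com.upokecenter.text.Normalizer.
final class IsNormalizedSketch {
  private IsNormalizedSketch() {}

  // Returns true if s contains a high surrogate that is not followed by
  // a low surrogate, or a low surrogate that is not preceded by a high one.
  static boolean hasUnpairedSurrogate(String s) {
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (Character.isHighSurrogate(c)) {
        if (i + 1 < s.length() && Character.isLowSurrogate(s.charAt(i + 1))) {
          i++; // skip the low half of a well-formed pair
        } else {
          return true;
        }
      } else if (Character.isLowSurrogate(c)) {
        return true;
      }
    }
    return false;
  }

  // Mirrors the documented behavior: false for any unpaired surrogate,
  // otherwise the result of the normalization check itself.
  static boolean isNormalized(String str, Normalizer.Form form) {
    return !hasUnpairedSurrogate(str) && Normalizer.isNormalized(str, form);
  }
}
```

Checking before converting can avoid allocating a new string when the input is already in the desired form.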

ReadChar

public int ReadChar()

Reads a Unicode character from a data source.

Returns:

Either a Unicode code point (from 0 to 0xd7ff or from 0xe000 to 0x10ffff), or the value -1 indicating the end of the source.
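
The return-value contract above (one code point per call, then -1 at the end) can be illustrated with a small reader over a string. This is only a sketch under stated assumptions: `CodePointReaderSketch` and `readChar` are hypothetical, normalization is done up front with the JDK's `java.text.Normalizer` rather than by this class, and replacing an unpaired surrogate with U+FFFD is an assumption made to keep results inside the documented range.

```java
import java.text.Normalizer;

// Hypothetical stand-in: not com.upokecenter.text.Normalizer.
final class CodePointReaderSketch {
  private final String text;
  private int pos;

  // Normalizes the whole input once; later reads walk the result.
  CodePointReaderSketch(String str, Normalizer.Form form) {
    this.text = Normalizer.normalize(str, form);
  }

  // Returns the next code point, or -1 once the source is exhausted.
  int readChar() {
    if (pos >= text.length()) {
      return -1;
    }
    int cp = text.codePointAt(pos);
    pos += Character.charCount(cp);
    // Assumption: map an unpaired surrogate to U+FFFD so the result is
    // always a scalar value (0 to 0xd7ff or 0xe000 to 0x10ffff).
    if (cp >= 0xd800 && cp <= 0xdfff) {
      return 0xfffd;
    }
    return cp;
  }
}
```

A caller would typically loop with `while ((cp = reader.readChar()) >= 0)` and stop at the first -1.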

Read

public int Read(int[] chars, int index, int length)

Reads a sequence of Unicode code points from a data source.

Parameters:

chars - Output buffer.
index - Index in the output buffer to start writing to.
length - Maximum number of code points to write.

Returns:

The number of Unicode code points read, or 0 if the end of the source is reached.

Throws:

IllegalArgumentException - Either index or length is less than 0 or greater than chars's length, or chars's length minus index is less than length.
NullPointerException - The parameter chars is null.
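
The buffer contract described above (argument checks, a count of code points written, and 0 at the end of the source) might look like the following. It is a hedged sketch, not this class's code: `BulkReadSketch` and its `read` method are hypothetical, and the input is normalized to NFC up front with the JDK's `java.text.Normalizer` purely for illustration.

```java
import java.text.Normalizer;

// Hypothetical stand-in: not com.upokecenter.text.Normalizer.
final class BulkReadSketch {
  private final int[] source; // normalized code points
  private int pos;

  BulkReadSketch(String text) {
    // Assumption: NFC is chosen here only for the sake of the example.
    this.source = Normalizer.normalize(text, Normalizer.Form.NFC)
        .codePoints().toArray();
  }

  // Writes up to length code points into chars, starting at index.
  // Returns the number written, or 0 once the source is exhausted.
  int read(int[] chars, int index, int length) {
    if (chars == null) {
      throw new NullPointerException("chars");
    }
    if (index < 0 || index > chars.length) {
      throw new IllegalArgumentException("index out of range: " + index);
    }
    if (length < 0 || length > chars.length) {
      throw new IllegalArgumentException("length out of range: " + length);
    }
    if (chars.length - index < length) {
      throw new IllegalArgumentException("buffer too small for index + length");
    }
    int count = Math.min(length, source.length - pos);
    System.arraycopy(source, pos, chars, index, count);
    pos += count;
    return count;
  }
}
```

Because the end of the source is signaled by 0 rather than -1, a caller that passes a length of 0 cannot tell the two cases apart, so read loops normally use a positive length.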
