com.upokecenter.text.Normalizer
@Deprecated public final class Normalizer extends Object
Implements the Unicode normalization algorithm and contains methods and functionality to test and convert Unicode strings for Unicode normalization.
NOTICE: While this class's source code is in the public domain, the class uses a class, called NormalizationData, that includes data derived from the Unicode Character Database. See the documentation for the NormalizerInput class for the permission notice for the Unicode Character Database.
Constructors
Methods
static boolean IsNormalized(String str, Normalization form)
Deprecated. Returns whether this string is normalized.
static String Normalize(String str, Normalization form)
Deprecated. Converts a string to the given Unicode normalization form.
int Read(int[] chars, int index, int length)
Deprecated. Reads a sequence of Unicode code points from a data source.
int ReadChar()
Deprecated. Reads a Unicode character from a data source.
Method Details
Normalize
public static String Normalize(String str, Normalization form)
Converts a string to the given Unicode normalization form.
Parameters:
str - An arbitrary string.
form - The Unicode normalization form to convert to.
Returns:
The parameter str converted to the given normalization form.
Throws:
NullPointerException - The parameter str is null.
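The following is a minimal sketch of the conversion this method describes. It is an assumption-laden stand-in: it uses the JDK's java.text.Normalizer rather than this deprecated class, and the class name NormalizeSketch and helper toForm are hypothetical names introduced only for illustration. It shows a decomposed sequence being composed under NFC and a null argument being rejected, as documented above.

```java
import java.text.Normalizer;

// Stand-in sketch: relies on the JDK's java.text.Normalizer,
// not on com.upokecenter.text.Normalizer.
final class NormalizeSketch {
    // Hypothetical helper that mirrors the documented contract:
    // a null string raises NullPointerException.
    static String toForm(String str, Normalizer.Form form) {
        if (str == null) {
            throw new NullPointerException("str");
        }
        return Normalizer.normalize(str, form);
    }

    public static void main(String[] args) {
        // 'e' followed by U+0301 COMBINING ACUTE ACCENT (two code points).
        String decomposed = "e\u0301";
        // NFC composes the pair into the single code point U+00E9.
        String composed = toForm(decomposed, Normalizer.Form.NFC);
        System.out.println(composed.equals("\u00e9"));
    }
}
```

Converting in the other direction with Normalizer.Form.NFD would split U+00E9 back into the two-code-point sequence.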
IsNormalized
public static boolean IsNormalized(String str, Normalization form)
Returns whether the given string is normalized in the given Unicode normalization form.
Parameters:
str - The string to check.
form - The Unicode normalization form to check against.
Returns:
true if the string is normalized in the given form; otherwise, false. Returns false if the string contains an unpaired surrogate code point.
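As a rough illustration of the check described above, the sketch below first rejects strings with unpaired surrogates and then defers to a normalization test. It assumes the JDK's java.text.Normalizer as a stand-in for this class; IsNormalizedSketch, hasUnpairedSurrogate, and isNormalized are hypothetical names, and the actual implementation may work differently.

```java
import java.text.Normalizer;

// Stand-in sketch: relies on the JDK's java.text.Normalizer,
// not on com.upokecenter.text.Normalizer.
final class IsNormalizedSketch {
    // Returns true if the string holds a surrogate code unit without its partner.
    static boolean hasUnpairedSurrogate(String str) {
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (Character.isHighSurrogate(c)) {
                if (i + 1 < str.length() && Character.isLowSurrogate(str.charAt(i + 1))) {
                    i++; // skip the low half of a well-formed pair
                } else {
                    return true;
                }
            } else if (Character.isLowSurrogate(c)) {
                return true;
            }
        }
        return false;
    }

    // Mirrors the documented result: false for unpaired surrogates,
    // otherwise whether the string is already in the requested form.
    static boolean isNormalized(String str, Normalizer.Form form) {
        if (hasUnpairedSurrogate(str)) {
            return false;
        }
        return Normalizer.isNormalized(str, form);
    }

    public static void main(String[] args) {
        System.out.println(isNormalized("\u00e9", Normalizer.Form.NFC));
        System.out.println(isNormalized("e\u0301", Normalizer.Form.NFC));
    }
}
```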
ReadChar
public int ReadChar()
Reads a Unicode character from a data source.
Returns:
Either a Unicode code point (from 0 to 0xd7ff or from 0xe000 to 0x10ffff), or the value -1 indicating the end of the source.
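The return contract above suggests a read loop that stops at -1. The sketch below illustrates that loop against a hypothetical in-memory source; ReadCharSketch, StringSource, and readAll are names invented for this example and are not part of the documented class. Replacing unpaired surrogates with U+FFFD is an assumption of this sketch, made so that every returned value stays inside the documented range.

```java
import java.util.ArrayList;
import java.util.List;

final class ReadCharSketch {
    // Hypothetical code point source over a String.
    static final class StringSource {
        private final String str;
        private int pos;

        StringSource(String str) {
            this.str = str;
        }

        // Returns the next code point, or -1 at the end of the source.
        int readChar() {
            if (pos >= str.length()) {
                return -1;
            }
            int cp = str.codePointAt(pos);
            pos += Character.charCount(cp);
            // Assumption of this sketch: unpaired surrogates become U+FFFD.
            if (cp >= 0xD800 && cp <= 0xDFFF) {
                return 0xFFFD;
            }
            return cp;
        }
    }

    // Drains the source, collecting code points until -1 is returned.
    static List<Integer> readAll(StringSource source) {
        List<Integer> out = new ArrayList<>();
        int cp;
        while ((cp = source.readChar()) != -1) {
            out.add(cp);
        }
        return out;
    }

    public static void main(String[] args) {
        // 'a' followed by U+1F600 written as a surrogate pair.
        System.out.println(readAll(new StringSource("a\ud83d\ude00")));
    }
}
```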
Read
public int Read(int[] chars, int index, int length)
Reads a sequence of Unicode code points from a data source.
Parameters:
chars - Output buffer.
index - Index in the output buffer to start writing to.
length - Maximum number of code points to write.
Returns:
The number of Unicode code points read, or 0 if the end of the source is reached.
Throws:
IllegalArgumentException - Either index or length is less than 0 or greater than chars's length, or chars's length minus index is less than length.
NullPointerException - The parameter chars is null.
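The argument checks listed under Throws can be sketched as follows. This is only an illustration under stated assumptions: ReadSketch, checkRange, and ArraySource are hypothetical names, and the source here is a plain int array rather than a normalizing reader. The read method returns the number of code points copied, which is 0 once the source is exhausted.

```java
final class ReadSketch {
    // Validates the buffer arguments in the order the documentation lists them.
    static void checkRange(int[] chars, int index, int length) {
        if (chars == null) {
            throw new NullPointerException("chars");
        }
        if (index < 0 || index > chars.length) {
            throw new IllegalArgumentException("index out of range: " + index);
        }
        if (length < 0 || length > chars.length) {
            throw new IllegalArgumentException("length out of range: " + length);
        }
        if (chars.length - index < length) {
            throw new IllegalArgumentException("not enough room after index");
        }
    }

    // Hypothetical source backed by an array of code points.
    static final class ArraySource {
        private final int[] data;
        private int pos;

        ArraySource(int[] data) {
            this.data = data;
        }

        // Copies up to length code points into chars, starting at index.
        int read(int[] chars, int index, int length) {
            checkRange(chars, index, length);
            int count = 0;
            while (count < length && pos < data.length) {
                chars[index + count] = data[pos++];
                count++;
            }
            return count;
        }
    }

    public static void main(String[] args) {
        ArraySource source = new ArraySource(new int[] {0x61, 0x62, 0x63});
        int[] buffer = new int[4];
        int n;
        // Keep reading until the source reports 0 code points.
        while ((n = source.read(buffer, 0, buffer.length)) > 0) {
            System.out.println("read " + n + " code points");
        }
    }
}
```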