## PeterO.Text.CharacterReader
public sealed class CharacterReader : PeterO.Text.ICharacterInput
A general-purpose character input for reading text from byte streams and text strings. When reading byte streams, this class supports the UTF-8 character encoding by default, but can be configured to support UTF-16 and UTF-32 as well.
### Member Summary
- [Read(int[], int, int)](#Read_int_int_int) - Reads a series of code points from a Unicode stream or a string.
- [ReadChar()](#ReadChar) - Reads the next character from a Unicode stream or a string.
### CharacterReader Constructor
public CharacterReader(string str);
Initializes a new instance of the PeterO.Text.CharacterReader class.
Parameters:
- str: A text string.
### CharacterReader Constructor
public CharacterReader(string str, bool skipByteOrderMark);
Initializes a new instance of the PeterO.Text.CharacterReader class.
Parameters:
- str: A text string.
- skipByteOrderMark: If true and the first character in the string is U+FEFF, skip that character.
Exceptions:
- System.ArgumentNullException: The parameter str is null.
### CharacterReader Constructor
public CharacterReader(string str, bool skipByteOrderMark, bool errorThrow);
Initializes a new instance of the PeterO.Text.CharacterReader class.
Parameters:
- str: A text string.
- skipByteOrderMark: If true and the first character in the string is U+FEFF, skip that character.
- errorThrow: If true, throws an exception when invalid encoding is encountered. If false, replaces the invalid sequence with U+FFFD (replacement character).
Exceptions:
- System.ArgumentNullException: The parameter str is null.
### CharacterReader Constructor
public CharacterReader(string str, int offset, int length);
Initializes a new instance of the PeterO.Text.CharacterReader class.
Parameters:
- str: A text string.
- offset: An index, starting at 0, showing where the desired portion of str begins.
- length: The length, in code units, of the desired portion of str (but not more than str's length).
Exceptions:
- System.ArgumentException: Either offset or length is less than 0 or greater than str's length, or str's length minus offset is less than length.
- System.ArgumentNullException: The parameter str is null.
### CharacterReader Constructor
public CharacterReader(string str, int offset, int length, bool skipByteOrderMark, bool errorThrow);
Initializes a new instance of the PeterO.Text.CharacterReader class.
Parameters:
- str: A text string.
- offset: An index, starting at 0, showing where the desired portion of str begins.
- length: The length, in code units, of the desired portion of str (but not more than str's length).
- skipByteOrderMark: If true and the first character in the string portion is U+FEFF, skip that character.
- errorThrow: If true, throws an exception when invalid encoding is encountered. If false, replaces the invalid sequence with U+FFFD (replacement character).
Exceptions:
- System.ArgumentNullException: The parameter str is null.
- System.ArgumentException: Either offset or length is less than 0 or greater than str's length, or str's length minus offset is less than length.
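As a rough illustration of how offset and length select a portion of a string, the sketch below reads only the word "world" out of a longer string. It assumes the project references the library that provides PeterO.Text.CharacterReader; the class name SubstringExample and the helper ReadPortion are hypothetical names introduced only for this example.

```csharp
using System;
using System.Text;
using PeterO.Text;

public static class SubstringExample {
  // Hypothetical helper: reads the portion of str that starts at offset and
  // spans length code units, and returns that portion as a new string.
  public static string ReadPortion(string str, int offset, int length) {
    var reader = new CharacterReader(str, offset, length);
    var sb = new StringBuilder();
    int c;
    // ReadChar returns -1 once the end of the selected portion is reached.
    while ((c = reader.ReadChar()) >= 0) {
      // Convert the code point back to UTF-16 text.
      sb.Append(char.ConvertFromUtf32(c));
    }
    return sb.ToString();
  }

  public static void Main() {
    // Offset 6 and length 5 select the last five code units of the string.
    Console.WriteLine(ReadPortion("hello world", 6, 5));
  }
}
```

An offset and length pair that falls outside the string should raise System.ArgumentException, as listed under Exceptions above.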
### CharacterReader Constructor
public CharacterReader(System.IO.Stream stream);
Initializes a new instance of the PeterO.Text.CharacterReader class. The reader decodes the stream as UTF-8, skips the byte-order mark (U+FEFF) if it appears first in the stream, and replaces invalid byte sequences with replacement characters (U+FFFD).
Parameters:
- stream: A readable data stream.
Exceptions:
- System.ArgumentNullException: The parameter stream is null.
### CharacterReader Constructor
public CharacterReader(System.IO.Stream stream, int mode);
Initializes a new instance of the PeterO.Text.CharacterReader class; will skip the byte-order mark (U+FEFF) if it appears first in the stream and replace invalid byte sequences with replacement characters (U+FFFD).
Parameters:
- stream: A readable byte stream.
- mode: The method to use when detecting encodings other than UTF-8 in the byte stream. This usually involves checking whether the stream begins with a byte-order mark (BOM, U+FEFF) or a non-zero basic code point (U+0001 to U+007F) before reading the rest of the stream. This value can be one of the following:
  - 0: UTF-8 only.
  - 1: Detect UTF-16 using BOM or non-zero basic code point, otherwise UTF-8.
  - 2: Detect UTF-16/UTF-32 using BOM or non-zero basic code point, otherwise UTF-8. (Tries to detect UTF-32 first.)
  - 3: Detect UTF-16 using BOM, otherwise UTF-8.
  - 4: Detect UTF-16/UTF-32 using BOM, otherwise UTF-8. (Tries to detect UTF-32 first.)
Exceptions:
- System.ArgumentNullException: The parameter stream is null.
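To make the mode values more concrete, here is a self-contained sketch of one plausible reading of the detection rules described above. It is not the library's implementation: the class EncodingSniffSketch, the method Guess, and the returned encoding names are all hypothetical, and the byte patterns are an assumption based on the standard UTF-16 and UTF-32 encodings of U+FEFF and of code points U+0001 to U+007F.

```csharp
using System;

public static class EncodingSniffSketch {
  // Illustrative only: guesses an encoding name from the first bytes of a
  // stream by following the mode table. Not the library's actual code.
  public static string Guess(byte[] b, int mode) {
    if (b == null) {
      throw new ArgumentNullException(nameof(b));
    }
    if (mode < 0 || mode > 4) {
      throw new ArgumentException("mode must be 0 through 4");
    }
    if (mode == 0) {
      return "UTF-8";
    }
    bool tryUtf32 = mode == 2 || mode == 4;
    bool useBasic = mode == 1 || mode == 2;
    if (tryUtf32 && b.Length >= 4) {
      // UTF-32 is checked first because its little-endian BOM (FF FE 00 00)
      // begins with the UTF-16 little-endian BOM (FF FE).
      if (b[0] == 0 && b[1] == 0 && b[2] == 0xfe && b[3] == 0xff) return "UTF-32BE";
      if (b[0] == 0xff && b[1] == 0xfe && b[2] == 0 && b[3] == 0) return "UTF-32LE";
      if (useBasic) {
        // A basic code point in UTF-32 is three zero bytes plus one byte
        // in the range 0x01 to 0x7F.
        if (b[0] == 0 && b[1] == 0 && b[2] == 0 && IsBasic(b[3])) return "UTF-32BE";
        if (IsBasic(b[0]) && b[1] == 0 && b[2] == 0 && b[3] == 0) return "UTF-32LE";
      }
    }
    if (b.Length >= 2) {
      if (b[0] == 0xfe && b[1] == 0xff) return "UTF-16BE";
      if (b[0] == 0xff && b[1] == 0xfe) return "UTF-16LE";
      if (useBasic) {
        // A basic code point in UTF-16 is one zero byte plus one byte
        // in the range 0x01 to 0x7F.
        if (b[0] == 0 && IsBasic(b[1])) return "UTF-16BE";
        if (IsBasic(b[0]) && b[1] == 0) return "UTF-16LE";
      }
    }
    return "UTF-8";
  }

  private static bool IsBasic(byte v) {
    return v >= 0x01 && v <= 0x7f;
  }

  public static void Main() {
    byte[] prefix = { 0xff, 0xfe, 0x00, 0x00 };
    Console.WriteLine(Guess(prefix, 3));
    Console.WriteLine(Guess(prefix, 4));
  }
}
```

In this sketch the same four bytes are classified differently under modes 3 and 4, which is one way to see why the UTF-32 check has to come first.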
### CharacterReader Constructor
public CharacterReader(System.IO.Stream stream, int mode, bool errorThrow);
Initializes a new instance of the PeterO.Text.CharacterReader class; will skip the byte-order mark (U+FEFF) if it appears first in the stream and a UTF-8 stream is detected.
Parameters:
- stream: A readable data stream.
- mode: The method to use when detecting encodings other than UTF-8 in the byte stream. This usually involves checking whether the stream begins with a byte-order mark (BOM, U+FEFF) or a non-zero basic code point (U+0001 to U+007F) before reading the rest of the stream. This value can be one of the following:
  - 0: UTF-8 only.
  - 1: Detect UTF-16 using BOM or non-zero basic code point, otherwise UTF-8.
  - 2: Detect UTF-16/UTF-32 using BOM or non-zero basic code point, otherwise UTF-8. (Tries to detect UTF-32 first.)
  - 3: Detect UTF-16 using BOM, otherwise UTF-8.
  - 4: Detect UTF-16/UTF-32 using BOM, otherwise UTF-8. (Tries to detect UTF-32 first.)
- errorThrow: If true, throws an exception when invalid encoding is encountered. If false, replaces the invalid sequence with U+FFFD (replacement character).
### CharacterReader Constructor
public CharacterReader(System.IO.Stream stream, int mode, bool errorThrow, bool dontSkipUtf8Bom);
Initializes a new instance of the PeterO.Text.CharacterReader class.
Parameters:
- stream: A readable byte stream.
- mode: The method to use when detecting encodings other than UTF-8 in the byte stream. This usually involves checking whether the stream begins with a byte-order mark (BOM, U+FEFF) or a non-zero basic code point (U+0001 to U+007F) before reading the rest of the stream. This value can be one of the following:
  - 0: UTF-8 only.
  - 1: Detect UTF-16 using BOM or non-zero basic code point, otherwise UTF-8.
  - 2: Detect UTF-16/UTF-32 using BOM or non-zero basic code point, otherwise UTF-8. (Tries to detect UTF-32 first.)
  - 3: Detect UTF-16 using BOM, otherwise UTF-8.
  - 4: Detect UTF-16/UTF-32 using BOM, otherwise UTF-8. (Tries to detect UTF-32 first.)
- errorThrow: If true, will throw an exception if invalid byte sequences (in the detected encoding) are found in the byte stream. If false, replaces those byte sequences with replacement characters (U+FFFD) as the stream is read.
- dontSkipUtf8Bom: If the stream is detected as UTF-8 (including when mode is 0) and this parameter is true, won't skip the BOM character if it occurs at the start of the stream.
Exceptions:
- System.ArgumentNullException: The parameter stream is null.
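The effect of dontSkipUtf8Bom can be sketched with a small in-memory stream that starts with the UTF-8 byte-order mark. The example assumes the project references the library that provides PeterO.Text.CharacterReader; the class BomExample and the helper FirstCodePoint are hypothetical names used only here.

```csharp
using System;
using System.IO;
using PeterO.Text;

public static class BomExample {
  // Hypothetical helper: returns the first code point read from a byte
  // array, using mode 0 (UTF-8 only) and errorThrow set to false.
  public static int FirstCodePoint(byte[] bytes, bool dontSkipUtf8Bom) {
    using (var stream = new MemoryStream(bytes)) {
      var reader = new CharacterReader(stream, 0, false, dontSkipUtf8Bom);
      return reader.ReadChar();
    }
  }

  public static void Main() {
    // EF BB BF is the UTF-8 encoding of U+FEFF; 0x41 is the letter A.
    byte[] data = { 0xef, 0xbb, 0xbf, 0x41 };
    Console.WriteLine("U+{0:X4}", FirstCodePoint(data, false));
    Console.WriteLine("U+{0:X4}", FirstCodePoint(data, true));
  }
}
```

Going by the parameter description, the first call should skip the BOM and yield U+0041, while the second should return the BOM itself (U+FEFF).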
### Read
public sealed int Read(int[] chars, int index, int length);
Reads a series of code points from a Unicode stream or a string.
Parameters:
- chars: An array where the code points that were read will be stored.
- index: An index, starting at 0, showing where the desired portion of chars begins.
- length: The number of elements in the desired portion of chars (but not more than chars's length).
Return Value:
The number of code points read from the stream. This can be less than the length parameter if the end of the stream is reached.
Exceptions:
- System.ArgumentNullException: The parameter chars is null.
- System.ArgumentException: Either index or length is less than 0 or greater than chars's length, or chars's length minus index is less than length.
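A minimal sketch of reading in chunks with Read follows. It assumes the project references the library that provides PeterO.Text.CharacterReader; ReadExample and CountCodePoints are hypothetical names. Because the return value at end of input is not spelled out above, the loop stops on any result of zero or less.

```csharp
using System;
using PeterO.Text;

public static class ReadExample {
  // Hypothetical helper: counts the code points in str by reading them
  // in chunks of up to four with Read.
  public static int CountCodePoints(string str) {
    var reader = new CharacterReader(str);
    var buffer = new int[4];
    int total = 0;
    while (true) {
      // Fill the whole buffer if possible; fewer elements may be returned
      // when the end of the input is near.
      int count = reader.Read(buffer, 0, buffer.Length);
      if (count <= 0) {
        break;
      }
      total += count;
    }
    return total;
  }

  public static void Main() {
    Console.WriteLine(CountCodePoints("hello world"));
  }
}
```

A small buffer is used here only to force several calls; a larger one would reduce the number of calls for long inputs.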
### ReadChar
public sealed int ReadChar();
Reads the next character from a Unicode stream or a string.
Return Value:
The next character, or -1 if the end of the string or stream was reached.
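The usual way to consume a reader one character at a time is to loop until ReadChar returns -1, as in this sketch. It assumes the project references the library that provides PeterO.Text.CharacterReader; ReadCharExample and CodePoints are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using PeterO.Text;

public static class ReadCharExample {
  // Hypothetical helper: collects every value returned by ReadChar
  // until the end of the string is signaled with -1.
  public static List<int> CodePoints(string str) {
    var reader = new CharacterReader(str);
    var result = new List<int>();
    int c;
    while ((c = reader.ReadChar()) >= 0) {
      result.Add(c);
    }
    return result;
  }

  public static void Main() {
    foreach (int cp in CodePoints("Hi!")) {
      Console.WriteLine("U+{0:X4}", cp);
    }
  }
}
```

The result is kept as int values rather than char because the return type must also be able to carry the -1 end marker.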