Class Utf8StringBuilder

java.lang.Object
org.eclipse.jetty.util.Utf8StringBuilder
All Implemented Interfaces:
CharsetStringBuilder
Direct Known Subclasses:
CharsetStringBuilder.ReportingUtf8StringBuilder, NullAppendable

public class Utf8StringBuilder extends Object implements CharsetStringBuilder

UTF-8 StringBuilder.

This class wraps a standard StringBuilder and provides methods to append UTF-8 encoded bytes, which are converted into characters.

This class is stateful: up to 4 calls to append(byte) may be needed before a character is appended to the string buffer.
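
The stateful behaviour can be illustrated with a minimal sketch that uses only the JDK's CharsetDecoder rather than Utf8StringBuilder itself, so that it runs without Jetty on the classpath. The class and method names are hypothetical, the sketch assumes well-formed input, and it is not how this class is implemented. It only shows that the decoded length does not grow until the last byte of a multi-byte sequence has arrived.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of Jetty.
class IncrementalUtf8Sketch
{
    /**
     * Feeds well-formed UTF-8 bytes to a decoder one at a time and records
     * how many chars have been decoded after each byte.
     */
    static int[] lengthsAfterEachByte(byte[] encoded)
    {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        // A well-formed UTF-8 sequence is at most 4 bytes long.
        ByteBuffer pending = ByteBuffer.allocate(4);
        // UTF-8 never decodes to more chars than it has bytes.
        CharBuffer out = CharBuffer.allocate(encoded.length);
        int[] lengths = new int[encoded.length];
        for (int i = 0; i < encoded.length; i++)
        {
            pending.put(encoded[i]);
            pending.flip();
            // endOfInput=false: an incomplete sequence is left unconsumed.
            decoder.decode(pending, out, false);
            // Keep any bytes of a partial sequence for the next round.
            pending.compact();
            lengths[i] = out.position();
        }
        return lengths;
    }
}
```

For the four-byte encoding of U+1F600, the recorded lengths stay at 0 for the first three bytes and become 2 after the fourth, because Java represents that code point as a surrogate pair.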

The UTF-8 decoding is done by this class and no additional buffers or Readers are used. The algorithm is fail-fast, in that errors are detected as the bytes are appended. However, no exceptions are thrown while appending; only the hasCodingErrors() method indicates the failure. Coding errors are otherwise replaced and may be present in the returned string, unless the build() method is used, which may throw CharacterCodingException. Already decoded characters may also be appended (e.g. append(char)), making this class suitable for decoding %-encoded strings of already decoded characters.
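
The difference between replacing coding errors and reporting them can be sketched with JDK classes alone. This is an analogy under stated assumptions, not Jetty's implementation: the class and method names are hypothetical, decodeReplacing stands in for the lenient behaviour (errors replaced, no exception) and decodeStrict stands in for the build() behaviour (CharacterCodingException on malformed or incomplete input).

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of Jetty.
class Utf8ErrorHandlingSketch
{
    /**
     * Lenient decoding: the JDK substitutes U+FFFD for malformed input
     * and never throws.
     */
    static String decodeReplacing(byte[] bytes)
    {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /**
     * Strict decoding: malformed or truncated input is reported as a
     * CharacterCodingException instead of being replaced.
     */
    static String decodeStrict(byte[] bytes) throws CharacterCodingException
    {
        return StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT)
            .decode(ByteBuffer.wrap(bytes))
            .toString();
    }
}
```

With the input bytes 'a', 0xFF, 'b', the lenient method returns the three-character string "a\uFFFDb", while the strict method throws, since 0xFF can never appear in well-formed UTF-8.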

  • Field Details

    • LOG

      protected static final org.slf4j.Logger LOG
    • REPLACEMENT

      public static final char REPLACEMENT
    • _state

      protected int _state
  • Constructor Details

    • Utf8StringBuilder

      public Utf8StringBuilder()
    • Utf8StringBuilder

      public Utf8StringBuilder(int capacity)
    • Utf8StringBuilder

      protected Utf8StringBuilder(StringBuilder buffer)
  • Method Details

    • length

      public int length()
      Specified by:
      length in interface CharsetStringBuilder
      Returns:
      the length in characters
    • hasCodingErrors

      public boolean hasCodingErrors()
      Returns:
      True if the decoded characters contained UTF-8 coding errors.
    • reset

      public void reset()
      Reset the appendable, clearing the buffer, resetting decoding state and clearing any errors.
      Specified by:
      reset in interface CharsetStringBuilder
    • partialReset

      public void partialReset()
      Partially reset the appendable: clear the buffer and clear any errors, but retain the decoding state of any partially decoded sequences.
    • checkCharAppend

      protected void checkCharAppend()
    • append

      public void append(char c)
      Specified by:
      append in interface CharsetStringBuilder
      Parameters:
      c - A decoded character to append
    • append

      public void append(String s)
    • append

      public void append(String s, int offset, int length)
    • append

      public void append(byte b)
      Specified by:
      append in interface CharsetStringBuilder
      Parameters:
      b - An encoded byte to append
    • append

      public void append(ByteBuffer buf)
      Specified by:
      append in interface CharsetStringBuilder
      Parameters:
      buf - Buffer of encoded bytes to append. The bytes are consumed from the buffer.
    • append

      public void append(byte[] b)
      Specified by:
      append in interface CharsetStringBuilder
      Parameters:
      b - Array of encoded bytes to append
    • append

      public void append(byte[] b, int offset, int length)
      Specified by:
      append in interface CharsetStringBuilder
      Parameters:
      b - Array of encoded bytes
      offset - offset into the array
      length - the number of bytes to append from the array.
    • append

      public boolean append(byte[] b, int offset, int length, int maxChars)
    • bufferAppend

      protected void bufferAppend(char c)
    • bufferReset

      protected void bufferReset()
    • appendByte

      public void appendByte(byte b) throws IOException
      Throws:
      IOException
    • isComplete

      public boolean isComplete()
      Returns:
      True if the appended sequences are complete UTF-8 sequences.
    • complete

      public void complete()
      Complete the appendable, adding a replacement character and coding error if the sequence is not currently complete.
    • toString

      public String toString()
      Overrides:
      toString in class Object
      Returns:
      The currently decoded string, excluding any partial sequences appended.
    • toPartialString

      public String toPartialString()
      Returns:
      The currently decoded string, excluding any partial sequences appended.
    • toCompleteString

      public String toCompleteString()
      Get the completely decoded string, which is equivalent to calling complete() then toString().
      Returns:
      The completely decoded string.
    • takeCompleteString

      public <X extends Throwable> String takeCompleteString(Supplier<X> onCodingError) throws X
      Take the completely decoded string.
      Type Parameters:
      X - The type of the exception thrown
      Parameters:
      onCodingError - A supplier of a Throwable to use if hasCodingErrors() returns true, or null for no error action
      Returns:
      The complete string.
      Throws:
      X - if hasCodingErrors() is true after complete().
    • takePartialString

      public <X extends Throwable> String takePartialString(Supplier<X> onCodingError) throws X
      Take the partially decoded string.
      Type Parameters:
      X - The type of the exception thrown
      Parameters:
      onCodingError - A supplier of a Throwable to use if hasCodingErrors() returns true, or null for no error action
      Returns:
      The partially decoded string.
      Throws:
      X - if hasCodingErrors() is true.
    • build

      public String build() throws CharacterCodingException
      Description copied from interface: CharsetStringBuilder

      Build the completed string and reset the buffer.

      Specified by:
      build in interface CharsetStringBuilder
      Returns:
      The decoded built string which must be complete in regard to any multibyte sequences.
      Throws:
      CharacterCodingException - If the bytes cannot be correctly decoded or a multibyte sequence is incomplete.