Class MarkdownTextSplitter

Text splitter for Markdown documents. It derives from the abstract base class for document transformation systems.

A document transformation system takes an array of Documents and returns an array of transformed Documents. The two arrays need not have the same length.

One example of this is a text splitter that splits a large document into many smaller documents.
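
To make the "arrays of different length" point concrete, here is a minimal sketch that does not use the library at all. The `Doc` interface and the `splitOnBlankLines` function are hypothetical names introduced only for illustration; the real Document type and splitting rules may differ.

```typescript
// Hypothetical minimal document shape, for illustration only.
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Toy transformer: splits each document on blank lines.
// One input document may yield several output documents.
function splitOnBlankLines(docs: Doc[]): Doc[] {
  const out: Doc[] = [];
  for (const doc of docs) {
    for (const part of doc.pageContent.split(/\n\s*\n/)) {
      const trimmed = part.trim();
      if (trimmed.length > 0) {
        // Each chunk keeps a copy of the parent's metadata.
        out.push({ pageContent: trimmed, metadata: { ...doc.metadata } });
      }
    }
  }
  return out;
}

const input: Doc[] = [
  {
    pageContent: "# Title\n\nFirst paragraph.\n\nSecond paragraph.",
    metadata: { source: "a.md" },
  },
];
const output = splitOnBlankLines(input);
console.log(input.length, output.length); // 1 3
```

One input document becomes three output documents, each carrying the metadata of its parent.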

Properties

chunkOverlap: number = 200
chunkSize: number = 1000
keepSeparator: boolean = false
lengthFunction: ((text) => number) | ((text) => Promise<number>)

Type declaration

    • (text): number
    • Parameters

      • text: string

      Returns number

Type declaration

    • (text): Promise<number>
    • Parameters

      • text: string

      Returns Promise<number>
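
Because lengthFunction may be either synchronous or asynchronous, calling code can simply await the result in both cases. The sketch below illustrates that pattern; `LengthFunction`, `measure`, `byCharacters`, and `byWords` are hypothetical names, not part of the documented API.

```typescript
// Mirrors the union type documented for lengthFunction.
type LengthFunction =
  | ((text: string) => number)
  | ((text: string) => Promise<number>);

// Awaiting a plain number just yields that number,
// so a single code path handles both variants.
async function measure(
  text: string,
  lengthFunction: LengthFunction
): Promise<number> {
  return await lengthFunction(text);
}

// Synchronous variant: count characters.
const byCharacters: LengthFunction = (text: string) => text.length;

// Asynchronous variant: count whitespace-separated words.
// This stands in for something like a tokenizer that must be awaited.
const byWords: LengthFunction = async (text: string) =>
  text.split(/\s+/).filter((word) => word.length > 0).length;

measure("one two three", byCharacters).then((n) => console.log(n)); // 13
measure("one two three", byWords).then((n) => console.log(n)); // 3
```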

separators: string[] = ...
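
The interaction between chunkSize and chunkOverlap can be pictured with a fixed-window sketch. This is a deliberate simplification and an assumption about intent: it ignores separators, keepSeparator, and lengthFunction, and it is not the library's splitting algorithm. The function name `windowChunks` is hypothetical.

```typescript
// Simplified fixed-window chunker, for intuition only.
// Each chunk holds at most chunkSize characters, and consecutive
// chunks share chunkOverlap characters.
function windowChunks(
  text: string,
  chunkSize: number,
  chunkOverlap: number
): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be in [0, chunkSize)");
  }
  const step = chunkSize - chunkOverlap; // how far each window advances
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // window reached the end
  }
  return chunks;
}

const chunks = windowChunks("abcdefghij", 4, 2);
console.log(chunks); // [ 'abcd', 'cdef', 'efgh', 'ghij' ]
```

With the documented defaults (chunkSize 1000, chunkOverlap 200), a window like this would advance 800 characters at a time.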

Methods

  • invoke: Invokes the document transformation by calling the transformDocuments method with the provided input.

    Parameters

    • input: Document<Record<string, any>>[]

      The input documents to be transformed.

    • Optional _options: Partial<BaseCallbackConfig>

      Optional configuration object to customize the behavior of callbacks.

    Returns Promise<Document<Record<string, any>>[]>

    A Promise that resolves to the transformed documents.
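
The delegation described for this method can be sketched as follows. `ToyTransformer`, `LineSplitter`, and `Doc` are hypothetical stand-ins written for this example; the method is called `invoke` here, and the options argument is accepted but ignored, which is an assumption of the sketch and may not match the library.

```typescript
// Hypothetical minimal document shape, for illustration only.
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Stand-in for an abstract document transformer base class.
abstract class ToyTransformer {
  abstract transformDocuments(docs: Doc[]): Promise<Doc[]>;

  // The entry point forwards its input to transformDocuments.
  // Options are ignored in this sketch.
  invoke(input: Doc[], _options?: Record<string, unknown>): Promise<Doc[]> {
    return this.transformDocuments(input);
  }
}

// Concrete subclass: emits one document per non-empty line.
class LineSplitter extends ToyTransformer {
  async transformDocuments(docs: Doc[]): Promise<Doc[]> {
    const out: Doc[] = [];
    for (const doc of docs) {
      for (const line of doc.pageContent.split("\n")) {
        if (line.trim().length > 0) {
          out.push({ pageContent: line, metadata: doc.metadata });
        }
      }
    }
    return out;
  }
}

const result = new LineSplitter().invoke([
  { pageContent: "# A\ntext\n\n## B", metadata: {} },
]);
result.then((docs) => console.log(docs.map((d) => d.pageContent)));
// [ '# A', 'text', '## B' ]
```

Subclasses only implement transformDocuments; callers can use the uniform entry point without knowing which transformer they hold.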

  • streamLog: Streams all output from a runnable, as reported to the callback system. This includes all inner runs of LLMs, Retrievers, Tools, etc. Output is streamed as Log objects. Each one includes a list of jsonpatch ops that describe how the state of the run changed in that step, as well as the final state of the run. Applying the jsonpatch ops in order reconstructs the state.

    Parameters

    • input: Document<Record<string, any>>[]
    • Optional options: Partial<BaseCallbackConfig>
    • Optional streamOptions: Omit<LogStreamCallbackHandlerInput, "autoClose">

    Returns AsyncGenerator<RunLogPatch, any, unknown>
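
To illustrate what "applying the jsonpatch ops in order" means, here is a small self-contained sketch. It supports only a subset of JSON Patch (the "add" and "replace" ops on object keys, plus "-" to append to an array) and skips path escaping. `PatchOp`, `applyOp`, and the sample patches are hypothetical; real RunLogPatch objects carry richer data.

```typescript
// Minimal subset of a JSON Patch operation: only "add" and "replace".
type PatchOp = { op: "add" | "replace"; path: string; value: unknown };

// Applies one op to a plain-object state by walking the "/"-separated path.
// Simplifications: no "~0"/"~1" escaping, no remove/move/copy/test ops.
function applyOp(state: Record<string, any>, op: PatchOp): void {
  const keys = op.path.split("/").slice(1); // drop the leading empty segment
  const last = keys.pop();
  if (last === undefined) throw new Error("empty path not supported");
  let target: any = state;
  for (const key of keys) target = target[key];
  if (Array.isArray(target) && last === "-") {
    target.push(op.value); // "-" means append to the array
  } else {
    target[last] = op.value;
  }
}

// Hypothetical sequence of streamed patches, one list of ops per step.
const patches: PatchOp[][] = [
  [
    { op: "replace", path: "/status", value: "running" },
    { op: "add", path: "/logs", value: [] },
  ],
  [{ op: "add", path: "/logs/-", value: "split 3 chunks" }],
  [{ op: "replace", path: "/status", value: "done" }],
];

const state: Record<string, any> = { status: "pending" };
for (const ops of patches) {
  for (const op of ops) applyOp(state, op);
}
console.log(state); // { status: 'done', logs: [ 'split 3 chunks' ] }
```

A consumer of the stream would fold each incoming patch into its local state in the same way, so the state is always current as of the latest step.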
