Class: SentenceSplitter
SentenceSplitter is our default text splitter. It supports splitting text into sentences, paragraphs, or fixed-length chunks with overlap.
Constructors
constructor
• new SentenceSplitter(chunkSize?, chunkOverlap?, tokenizer?, tokenizerDecoder?, paragraphSeparator?, chunkingTokenizerFn?)
Parameters
Name | Type | Default value |
---|---|---|
chunkSize | number | DEFAULT_CHUNK_SIZE |
chunkOverlap | number | DEFAULT_CHUNK_OVERLAP |
tokenizer | any | null |
tokenizerDecoder | any | null |
paragraphSeparator | string | "\n\n\n" |
chunkingTokenizerFn | any | undefined |
Defined in
TextSplitter.ts:33
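To illustrate how `chunkSize` and `chunkOverlap` interact, here is a minimal standalone sketch of fixed-length chunking with overlap. It is not the library's implementation: `chunkWithOverlap` is a hypothetical helper, and it assumes tokens are whitespace-separated words, whereas the class accepts a configurable `tokenizer`.

```typescript
// Hypothetical helper, not part of SentenceSplitter: fixed-length chunking with overlap.
// Assumption: a "token" is a whitespace-separated word.
function chunkWithOverlap(
  text: string,
  chunkSize: number,
  chunkOverlap: number,
): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be in the range [0, chunkSize)");
  }

  const tokens = text.split(/\s+/).filter((t) => t.length > 0);
  // Each new chunk starts this many tokens after the previous one,
  // so consecutive chunks share chunkOverlap tokens.
  const step = chunkSize - chunkOverlap;
  const chunks: string[] = [];

  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + chunkSize).join(" "));
    // Stop once the current window has reached the end of the input.
    if (start + chunkSize >= tokens.length) break;
  }
  return chunks;
}

const chunks = chunkWithOverlap("a b c d e f g", 3, 1);
console.log(chunks);
```

With a chunk size of 3 and an overlap of 1, each chunk repeats the last token of the previous chunk. A larger overlap preserves more context across chunk boundaries at the cost of producing more chunks.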
Properties
chunkOverlap
• Private chunkOverlap: number
Defined in
TextSplitter.ts:26
chunkSize
• Private chunkSize: number
Defined in
TextSplitter.ts:25
chunkingTokenizerFn
• Private chunkingTokenizerFn: any
Defined in
TextSplitter.ts:30
paragraphSeparator
• Private paragraphSeparator: string
Defined in
TextSplitter.ts:29
tokenizer
• Private tokenizer: any
Defined in
TextSplitter.ts:27
tokenizerDecoder
• Private tokenizerDecoder: any
Defined in
TextSplitter.ts:28
Methods
combineTextSplits
▸ combineTextSplits(newSentenceSplits, effectiveChunkSize): TextSplit[]
Parameters
Name | Type |
---|---|
newSentenceSplits | SplitRep[] |
effectiveChunkSize | number |
Returns
TextSplit[]
Defined in
TextSplitter.ts:153
getEffectiveChunkSize
▸ Private getEffectiveChunkSize(extraInfoStr?): number
Parameters
Name | Type |
---|---|
extraInfoStr? | string |
Returns
number
Defined in
TextSplitter.ts:72
getParagraphSplits
▸ getParagraphSplits(text, effectiveChunkSize?): string[]
Parameters
Name | Type |
---|---|
text | string |
effectiveChunkSize? | number |
Returns
string[]
Defined in
TextSplitter.ts:89
getSentenceSplits
▸ getSentenceSplits(text, effectiveChunkSize?): string[]
Parameters
Name | Type |
---|---|
text | string |
effectiveChunkSize? | number |
Returns
string[]
Defined in
TextSplitter.ts:115
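The two methods above suggest a two-stage split: first by paragraph, then by sentence. The following is a rough standalone sketch of that idea, not the library's code. `splitParagraphs`, `splitSentences`, and `splitIntoSentences` are hypothetical names, the sentence rule (terminal punctuation followed by whitespace) is an assumption, and the sketch ignores `effectiveChunkSize` entirely. Only the `"\n\n\n"` separator comes from the constructor defaults listed earlier.

```typescript
// Hypothetical helpers for illustration only; they do not mirror TextSplitter.ts.
const PARAGRAPH_SEPARATOR = "\n\n\n"; // default paragraphSeparator from the constructor table

function splitParagraphs(
  text: string,
  separator: string = PARAGRAPH_SEPARATOR,
): string[] {
  return text
    .split(separator)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
}

// Assumption: a sentence ends at ".", "!" or "?" followed by whitespace.
function splitSentences(paragraph: string): string[] {
  const sentences: string[] = [];
  const boundary = /[.!?]+\s+/g;
  let start = 0;
  let match: RegExpExecArray | null;
  while ((match = boundary.exec(paragraph)) !== null) {
    const end = match.index + match[0].length;
    // Keep the punctuation with the sentence and drop the trailing whitespace.
    sentences.push(paragraph.slice(start, end).trim());
    start = end;
  }
  // Whatever follows the last boundary is the final sentence.
  const rest = paragraph.slice(start).trim();
  if (rest.length > 0) {
    sentences.push(rest);
  }
  return sentences;
}

function splitIntoSentences(text: string): string[] {
  const result: string[] = [];
  for (const paragraph of splitParagraphs(text)) {
    result.push(...splitSentences(paragraph));
  }
  return result;
}

console.log(splitIntoSentences("First one. Second one!\n\n\nThird para? Yes"));
```

Because a boundary requires whitespace after the punctuation, a string such as `Version 1.5 is out.` stays in one piece. A production splitter would need to handle abbreviations and other edge cases that this sketch does not.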
processSentenceSplits
▸ Private processSentenceSplits(sentenceSplits, effectiveChunkSize): SplitRep[]
Parameters
Name | Type |
---|---|
sentenceSplits | string[] |
effectiveChunkSize | number |
Returns
SplitRep[]
Defined in
TextSplitter.ts:128
splitText
▸ splitText(text, extraInfoStr?): string[]
Parameters
Name | Type |
---|---|
text | string |
extraInfoStr? | string |
Returns
string[]
Defined in
TextSplitter.ts:233
splitTextWithOverlaps
▸ splitTextWithOverlaps(text, extraInfoStr?): TextSplit[]
Parameters
Name | Type |
---|---|
text | string |
extraInfoStr? | string |
Returns
TextSplit[]
Defined in
TextSplitter.ts:205