Item 43697573

SparkyMcUnicorn • 5 days ago

My understanding is that long-context models can create embeddings that are much better at capturing the overall meaning, but are less effective (without chunking) for documents that consist of short standalone sentences.

For example, "The configuration mentioned above is critical" now "knows" what configuration is being referenced, along with which project and anything else talked about in the document.
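
A rough sketch of that effect, not a real embedding model: the bag-of-words "embedding", the `contextual` mixing function, its `weight` parameter, and the sample document (including the project name) are all made up for illustration. Mixing in the whole-document vector is only a crude stand-in for a long-context model attending over the full text.

```python
import math
from collections import Counter

def bow(text):
    """Toy 'embedding': lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().replace(".", "").replace(",", "").split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def contextual(chunk, document, weight=0.5):
    """Hypothetical helper: add a down-weighted whole-document vector to the chunk vector."""
    vec = Counter()
    for term, count in bow(chunk).items():
        vec[term] += count
    for term, count in bow(document).items():
        vec[term] += weight * count
    return vec

# Made-up document; the last sentence only makes sense with the earlier ones.
document = (
    "Project Atlas uses nginx as a reverse proxy. "
    "Set the nginx timeout to 30 seconds. "
    "The configuration mentioned above is critical."
)
chunk = "The configuration mentioned above is critical."
query = bow("nginx timeout")

# Embedded alone, the chunk shares no terms with the query.
standalone_score = cosine(bow(chunk), query)
# With document context mixed in, the chunk picks up "nginx" and "timeout".
contextual_score = cosine(contextual(chunk, document), query)

print(f"standalone: {standalone_score:.3f}, contextual: {contextual_score:.3f}")
```

On its own the sentence has nothing in common with a query about nginx timeouts, while the context-aware version scores above zero because it carries terms from the rest of the document.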

mmstroik • 1 day ago

When you say long-context models are less effective for documents that consist of short sentences, do you mean that embedding models with long-context capabilities tend to be worse with shorter sentences, or just that _using_ their large context windows will be less effective for docs with short sentences?