Hello,
I built an interface where I search for text with Xapian, and I'd like
to implement highlighting of matching terms on query results.
I could not find anything in Xapian itself to help with that, so I
tried implementing my own. It works reasonably well, except for phrase
searches, where it highlights individual words even when they are not
consecutive (because the query is turned into a set of words).
Can one do better, without ending up knee-deep in yak fur?
Here's what I have so far:
def get_highlights(self, result: ResultEntry) -> Generator[str, None, None]:
    """
    Return text with highlighted search terms
    """
    # From https://github.com/daevaorn/djapian/issues/73
    search_terms = set(self.last_query)
    for text in result.entry.iter_text_paragraphs():
        if not text:
            continue
        words = text.split()
        found = False
        highlighted: List[str] = []
        for word in words:
            token = word.encode().lower()
            token = self.re_token.sub(b"", token)
            stemmed = b'Z' + self.stemmer(token)
            if token in search_terms or stemmed in search_terms:
                highlighted.append("<b>" + html.escape(word) + "</b>")
                found = True
            else:
                highlighted.append(html.escape(word))
        if found:
            yield " ".join(highlighted)
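
For comparison, here is a rough sketch of what consecutive (phrase-aware)
matching could look like. It is plain Python and does not use any Xapian
API. It assumes the phrase is available as an ordered list of normalized
words, which the set built above does not preserve. The names normalize()
and highlight_phrase() are hypothetical, the regular expression only stands
in for re_token, and stemming is left out to keep it short.

```python
import html
import re
from typing import Sequence

# Hypothetical normalizer: stands in for re_token in the method above.
_NON_WORD = re.compile(r"\W+")


def normalize(word: str) -> str:
    """Lowercase a word and strip punctuation from it."""
    return _NON_WORD.sub("", word.lower())


def highlight_phrase(text: str, phrase: Sequence[str]) -> str:
    """Bold only the words that match `phrase` consecutively and in order."""
    words = text.split()
    tokens = [normalize(w) for w in words]
    wanted = list(phrase)
    n = len(wanted)
    marked = [False] * len(words)
    if n:
        # Slide a window of len(phrase) words over the text and mark
        # every position covered by an exact in-order match.
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == wanted:
                for j in range(i, i + n):
                    marked[j] = True
    return " ".join(
        "<b>" + html.escape(w) + "</b>" if m else html.escape(w)
        for w, m in zip(words, marked)
    )
```

With a phrase of ["quick", "brown"], only the adjacent pair is bolded; a
later stray "quick" is left alone. Single-word terms could still go through
the set lookup as before.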
Enrico
--
GPG key: 4096R/634F4BD1E7AD5568 2009-05-08 Enrico Zini <enrico at
enricozini.org>