I want to index as much of the web as I can within a fixed limit on database size. How do I achieve that?

I want to crawl HTML web pages and, when a page's markup is reasonably clean (as on my own site, linked below), parse it and store it in a Xapian glass database. How do I do that? I only want the text, no images or videos.

Thanking you,
Sagar Acharya
https://humaaraartha.in
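
For concreteness, here is a rough Python sketch of the pipeline described above: strip a page down to its text, check a size budget, and index the text into a glass database. It assumes the Xapian Python bindings are installed. The names `TextExtractor`, `extract_text`, `dir_size`, `index_page` and the `max_bytes` parameter are made up for illustration, and measuring the on-disk size of the database directory is only one possible way to enforce the limit. Fetching and link-following are left out.

```python
import os
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Images and videos carry no text data, so they drop out naturally.
        if self.skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def extract_text(html):
    """Return the text content of an HTML string as one space-joined string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def dir_size(path):
    """Total size in bytes of all files under path (0 if it does not exist)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def index_page(db_path, url, html, max_bytes):
    """Index one page; return False without indexing once the budget is used up."""
    import xapian  # assumption: Xapian Python bindings are installed

    if dir_size(db_path) >= max_bytes:
        return False

    db = xapian.WritableDatabase(
        db_path, xapian.DB_CREATE_OR_OPEN | xapian.DB_BACKEND_GLASS
    )
    doc = xapian.Document()
    doc.set_data(url)  # store only the URL; the text lives in the index terms

    termgen = xapian.TermGenerator()
    termgen.set_stemmer(xapian.Stem("en"))
    termgen.set_document(doc)
    termgen.index_text(extract_text(html))

    # Unique ID term so re-crawling a URL replaces the old entry.
    # Note: Xapian limits term length, so very long URLs would need hashing.
    id_term = "Q" + url
    doc.add_boolean_term(id_term)
    db.replace_document(id_term, doc)
    db.commit()
    db.close()
    return True
```

A crawler loop would call `index_page` for each fetched page and stop when it returns `False`. Because the size check happens before each write, the database can overshoot `max_bytes` by roughly one page's worth of index data.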