Utilize the llms.txt file to improve the web crawler's experience
#3864
Labels: enhancement (New feature or request)
Please describe the feature you want
The llms.txt convention is gradually gaining popularity, and many companies, such as Anthropic, Perplexity, and Cloudflare, now support it. With llms.txt, a crawler can fetch a site's official plain-text documents directly instead of scraping HTML, which makes it easier for large language models (LLMs) to retrieve the critical information.
https://directory.llmstxt.cloud/ is a directory of sites that support llms.txt. It lists links to the llms.txt files of various documentation sites, and following a link gives access to the corresponding documents.
We could add support for this directory in the web crawler (although the directory does not appear to offer an API), or we could maintain our own list. When a user selects an llms.txt file, Tabby can download and index the corresponding text documents.
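To make the download-and-index idea concrete, here is a minimal sketch of how an llms.txt file might be turned into a list of documents to fetch. It assumes the layout described in the llms.txt proposal (an H1 title, a blockquote summary, and H2 sections containing markdown link lists). The names `llms_txt_url` and `parse_llms_txt` are hypothetical and are not part of Tabby.

```python
import re
from urllib.parse import urljoin

# Matches a markdown list item of the form "- [name](url): optional notes".
LINK_RE = re.compile(r"^[-*]\s*\[([^\]]+)\]\(([^)\s]+)\)(?::\s*(.*))?$")


def llms_txt_url(site_url: str) -> str:
    """Return the conventional llms.txt location at the root of a site (assumed)."""
    return urljoin(site_url, "/llms.txt")


def parse_llms_txt(text: str, base_url: str = "") -> dict:
    """Extract the title, summary, and linked documents from llms.txt content."""
    doc = {"title": None, "summary": None, "links": []}
    section = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("# ") and doc["title"] is None:
            doc["title"] = stripped[2:].strip()
        elif stripped.startswith("## "):
            # Remember the current section so each link can be grouped later.
            section = stripped[3:].strip()
        elif stripped.startswith(">") and doc["summary"] is None:
            doc["summary"] = stripped[1:].strip()
        else:
            match = LINK_RE.match(stripped)
            if match:
                name, url, notes = match.groups()
                doc["links"].append({
                    "section": section,
                    "name": name,
                    # Resolve relative links against the llms.txt location.
                    "url": urljoin(base_url, url),
                    "notes": notes or "",
                })
    return doc
```

Each entry in `links` is a candidate document that the crawler could download and pass to the existing indexing pipeline. Fetching is left out of the sketch on purpose, since it depends on how the crawler is integrated.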
Additional context
Tabby is already capable of parsing a crawled website into a structured document; we could reuse that capability to index the documents listed in an llms.txt file.
Please reply with a 👍 if you want this feature.