Utilize the llms.txt file to improve the web crawler's experience #3864

Open
zwpaper opened this issue Feb 18, 2025 · 0 comments
Labels
enhancement New feature or request

zwpaper (Member) commented Feb 18, 2025

Please describe the feature you want

The llms.txt convention is gaining popularity, and companies such as Anthropic, Perplexity, and Cloudflare already support it. An llms.txt file points to a site's official plain-text documentation, so large language models (LLMs) can retrieve the key information directly instead of scraping HTML pages.

https://directory.llmstxt.cloud/ is a directory of sites that support llms.txt. It links to the llms.txt file of each listed documentation site, and following a link leads to the corresponding documents.

We could add llms.txt support to the web crawler, either by using the directory above (which does not appear to offer an API) or by providing our own list. When a user selects an llms.txt file, Tabby would download and index the corresponding text documents.

Additional context

Tabby can already parse a crawled website into a structured document; we could reuse that capability to index llms.txt files.
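As a rough illustration of what indexing an llms.txt file could involve, here is a minimal Python sketch that extracts the title, summary, and per-section links from one. It assumes the layout described in the llms.txt proposal (an H1 title, a blockquote summary, and H2 sections containing markdown link lists). This is not Tabby's crawler code; `LlmsTxt`, `parse_llms_txt`, and the sample content are hypothetical names made up for this example.

```python
import re
from dataclasses import dataclass, field

# Matches a markdown list item such as "- [title](url): optional notes".
LINK_RE = re.compile(
    r"^\s*[-*]\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<notes>.*))?$"
)


@dataclass
class LlmsTxt:
    """Hypothetical container for the parts of an llms.txt file."""
    title: str = ""
    summary: str = ""
    # Maps a section name to its list of (link title, url) pairs.
    sections: dict = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    """Parse llms.txt content, assuming the layout from the llms.txt proposal."""
    doc = LlmsTxt()
    current = None  # name of the H2 section being read, if any
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("# ") and not doc.title:
            doc.title = stripped[2:].strip()
        elif stripped.startswith("> ") and not doc.summary:
            doc.summary = stripped[2:].strip()
        elif stripped.startswith("## "):
            current = stripped[3:].strip()
            doc.sections[current] = []
        else:
            match = LINK_RE.match(line)
            # Links that appear before any H2 section are ignored in this sketch.
            if match and current is not None:
                doc.sections[current].append((match.group("title"), match.group("url")))
    return doc


# Made-up sample content; example.com URLs are placeholders.
SAMPLE = """# Example Project

> Example summary line.

## Docs

- [Quick start](https://example.com/quickstart.md): How to get started
- [API](https://example.com/api.md)
"""

doc = parse_llms_txt(SAMPLE)
```

Each extracted URL could then be downloaded and handed to the existing document indexing step, with the section name kept as metadata.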


Please reply with a 👍 if you want this feature.
