Hi, guys!
First of all, I would like to express my gratitude: I have been using TabbyML with Qwen-2.5-Coder-32b-instruct for several months now, and it brings me joy and excitement every day!
Recently I discovered another model, Mixtral-8x7b-instruct, which responds roughly 3-4 times faster than Qwen! I can run Mixtral with koboldcpp, but no matter how hard I tried, I could not get it to run with TabbyML.
I even tried launching Mixtral-8x7b-instruct under the guise of Mistral-7b, but that did not work either.
Perhaps you could suggest a way to launch it through config parameters, or add support for this model. Although it is slightly larger than Qwen-32b, it runs significantly faster, even on CPU (it is a mixture-of-experts model, so only a fraction of its weights are active per token, which likely explains the speed).
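For reference, this is the kind of configuration I was hoping would work — a sketch only, since "Mixtral-8x7B-Instruct" is a hypothetical model id that the official registry does not currently provide:

```toml
# ~/.tabby/config.toml — a sketch; "Mixtral-8x7B-Instruct" is a hypothetical
# registry id, not one that Tabby currently ships.
[model.chat.local]
model_id = "Mixtral-8x7B-Instruct"
```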
Hi @scoute, thank you for the information — we are pleased that Tabby has been helpful.
I see that KoboldCPP is a fork of llama.cpp, and upstream llama.cpp has already implemented Mixtral support (ggml-org/llama.cpp#4406), so the integration should not be challenging. We will investigate this further at a later time.
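In the meantime, if anyone wants to sanity-check the model outside Tabby, the same GGUF that runs under KoboldCPP should load with llama.cpp's example server — a sketch only; the file name, quantization level, and context size below are illustrative:

```sh
# Serve a local Mixtral GGUF with llama.cpp's example server
# (requires a build that includes ggml-org/llama.cpp#4406).
# The file name and quantization level are placeholders.
./server -m ./mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf -c 4096 --port 8080
```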
Additionally, could you please share the hardware you use to run Qwen 32B? We would also like to hear about your user experience and the specific scenario in which you are using Tabby.