[ollama conf] Adding num_predict param for ollama models #4271

Open
TheStarAlight opened this issue Feb 21, 2025 · 0 comments
Labels
area:configuration Relates to configuration options · kind:enhancement Indicates a new feature request, improvement, or extension · needs-triage

Comments

@TheStarAlight

Validations

  • I believe this is a way to improve. I'll try to join the Continue Discord for questions
  • I'm not able to find an open issue that requests the same enhancement

Problem

I'm using Ollama as the model provider, running the official DeepSeek-R1 distilled 32B model on Ollama.
My Continue model config is:

"model": "deepseek-r1:32b",
"title": "DeepSeek R1-distilled 32B",
"provider": "ollama",
"contextLength": 128000

However, when I ask the model to edit a long piece of code, the response cuts off suddenly.
I suspect this is related to Ollama's limit on the length of the model response, which is controlled by num_predict (see https://github.com/ollama/ollama/blob/main/docs/modelfile.md#parameter).
Although Ollama's default num_predict is unlimited, I found that when I chat with the model, the log shows Continue passing a maxTokens of 4096:

==========================================================================
==========================================================================
##### Completion options #####
{
  "contextLength": 128000,
  "model": "deepseek-r1:32b",
  "maxTokens": 4096,
  "stop": [
    "<|begin▁of▁sentence|>",
    "<|end▁of▁sentence|>",
    "<|User|>",
    "<|Assistant|>"
  ]
}

##### Request options #####
{}

##### Prompt #####
<user>
How much is 40eV of single-photon energy converted into atomic units?

<assistant>

To convert 40 eV into atomic units (Hartrees), we use the conversion factor where 1 Hartree ≈ 27.2114 eV.

The calculation is:

Energy in atomic units = 40 eV / 27.2114 ≈ 1.47

Therefore, 40 eV of single-photon energy is approximately **1.47 Hartrees** when converted into atomic units.
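For reference, Ollama itself accepts num_predict per request through the options field of its API (or via PARAMETER num_predict in a Modelfile), so the cutoff seems to come from the maxTokens value that Continue sends rather than from Ollama. A minimal sketch of a raw request that lifts the limit (assuming the default endpoint at localhost:11434; 8192 and the prompt are just example values):

POST http://localhost:11434/api/generate
{
  "model": "deepseek-r1:32b",
  "prompt": "Edit this long piece of code ...",
  "options": {
    "num_predict": 8192
  }
}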

I think maxTokens should also become a tunable parameter in Continue's configuration to allow longer responses. I really need the AI to help me revise my LaTeX documents!

Solution

The solution is simply to add maxTokens as a tunable parameter in Continue's configuration to allow longer responses, for example as sketched below. Thank you for your attention :)
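Something along these lines would work for me (a minimal sketch only; the completionOptions / maxTokens field names are my suggestion for how it could look, mirroring the existing model options, and 8192 is just an example value):

"model": "deepseek-r1:32b",
"title": "DeepSeek R1-distilled 32B",
"provider": "ollama",
"contextLength": 128000,
"completionOptions": {
  "maxTokens": 8192
}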

@dosubot bot added the area:configuration and kind:enhancement labels on Feb 21, 2025