rlhf

Here are 201 public repositories matching this topic...

hiyouga / LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Updated Feb 20, 2025
Python

LAION-AI / Open-Assistant

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

python machine-learning ai nextjs discord-bot assistant language-model chatgpt rlhf

Updated Aug 17, 2024
Python

RUCAIBox / LLMSurvey

The official GitHub page for the survey paper "A Survey of Large Language Models".

natural-language-processing pre-training pre-trained-language-models in-context-learning large-language-models llm llms chain-of-thought chatgpt rlhf instruction-tuning

Updated Aug 20, 2024
Python

ymcui / Chinese-LLaMA-Alpaca-2

Phase 2 of the Chinese LLaMA-2 & Alpaca-2 LLM project, plus 64K ultra-long-context models

nlp yarn llama alpaca 64k large-language-models llm rlhf flash-attention llama2 llama-2 alpaca-2 alpaca2

Updated Sep 23, 2024
Python

InternLM / InternLM

Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).

chatbot chinese gpt pretrained-models llm long-context rlhf large-language-model flash-attention fine-tuning-llm

Updated Feb 7, 2025
Python

huggingface / alignment-handbook

Robust recipes to align language models with human and AI preferences

transformers llm rlhf

Updated Nov 21, 2024
Python

argilla-io / argilla

Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets

nlp machine-learning natural-language-processing ai weak-supervision developer-tools active-learning annotation-tool text-annotation weakly-supervised-learning human-in-the-loop mlops text-labeling gpt-4 llm langchain rlhf

Updated Feb 17, 2025
Python

opendilab / awesome-RLHF

A curated list of reinforcement learning with human feedback resources (continually updated)

reinforcement-learning deep-learning deep-reinforcement-learning large-language-models human-feedback rlhf

Updated Feb 19, 2025

hiyouga / ChatGLM-Efficient-Tuning

Efficient fine-tuning of ChatGLM-6B with PEFT

transformers pytorch lora language-model alpaca fine-tuning peft huggingface chatgpt rlhf chatglm qlora chatglm2

Updated Oct 12, 2023
Python

Kiln-AI / Kiln

The easiest tool for fine-tuning LLM models, synthetic data generation, and collaborating on datasets.

python windows macos machine-learning ai prompt ml collaboration openai dataset-generation synthetic-data fine-tuning prompt-engineering chain-of-thought rlhf ollama

Updated Feb 22, 2025
Python

Docta-ai / docta

A Doctor for your data

data language-model data-curation data-centric-ai data-diagnosis data-centric-machine-learning rlhf

Updated Jan 14, 2025
Python

argilla-io / distilabel

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

python ai openai synthetic-data synthetic-dataset-generation huggingface llms rlhf rlaif

Updated Feb 18, 2025
Python

PKU-Alignment / align-anything

Align Anything: Training All-modality Model with Feedback

chameleon multimodal dpo large-language-models rlhf vision-language-model

Updated Feb 19, 2025
Python

tatsu-lab / alpaca_eval

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

nlp deep-learning leaderboard evaluation instruction-following foundation-models large-language-models rlhf

Updated Dec 27, 2024
Jupyter Notebook

THUDM / WebGLM

WebGLM: An Efficient Web-enhanced Question Answering System (KDD 2023)

llm chatgpt rlhf webglm

Updated Dec 13, 2024
Python

PKU-Alignment / safe-rlhf

Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

Updated Jun 13, 2024
Python

transformerlab / transformerlab-app

Open Source Application for Advanced LLM Engineering: interact, train, fine-tune, and evaluate large language models on your own computer.

electron transformers llama lora mlx llms rlhf

Updated Feb 21, 2025
TypeScript

OpenLMLab / MOSS-RLHF

Secrets of RLHF in Large Language Models Part I: PPO

alignment ai-safety rlhf

Updated Mar 3, 2024
Python

THUDM / ImageReward

[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation

generative-model diffusion-models human-preferences rlhf

Updated Jan 24, 2025
Python

RLHFlow / RLHF-Reward-Modeling

Recipes for training reward models for RLHF.

llm rlhf reward-models llama3

Updated Feb 9, 2025
Python

