
'Use this Model' code snippets for timm models in Transformers could use improvements #1124

rwightman opened this issue Jan 21, 2025 · 9 comments


@rwightman
Contributor

rwightman commented Jan 21, 2025

There are a few issues with the code snippets currently shown for timm models used with Transformers (via the wrapper model):

  1. timm and transformers models don't show any pipeline snippet for image-feature-extraction task models
  2. timm models show the same AutoModel.from_pretrained snippet for both feature-extraction and image-classification models
  3. timm model snippets do not include the pre-processor creation that Transformers snippets include

The 1st issue may be partially addressed by #1120
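
For reference, this is roughly the pipeline snippet I'd expect for the image-feature-extraction case (just a sketch; the model id is illustrative, reused from the classification example below):

# Sketch of the missing image-feature-extraction pipeline snippet (illustrative model id)
from transformers import pipeline

pipe = pipeline("image-feature-extraction", model="timm/vit_base_patch16_224.augreg2_in21k_ft_in1k")
features = pipe("path/to/image.jpg")  # per-image features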

I don't immediately see a path to fixing 2 & 3. The snippets should be the same as for native Transformers models, so it's not a matter of missing snippet code; the problem is somewhere in how the AutoModel / Processor classes are identified.

For 2/3, if a timm model is tagged with the image-classification task, I'd expect:

# Load model directly
from transformers import AutoImageProcessor, AutoModelForImageClassification

processor = AutoImageProcessor.from_pretrained("timm/vit_base_patch16_224.augreg2_in21k_ft_in1k")
model = AutoModelForImageClassification.from_pretrained("timm/vit_base_patch16_224.augreg2_in21k_ft_in1k")

but we currently get this

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("timm/vit_base_patch16_224.augreg2_in21k_ft_in1k")

Am I correct that the fields of TransformersInfo in ModelData are derived from a combination of metadata in the model repo plus, possibly, the config and/or preprocessor config files? The code that populates them doesn't appear to be in this repo, but it seems the auto_model / processor fields are not populated for timm models in a way that results in correct snippets.

export interface TransformersInfo {
	/**
	 * e.g. AutoModelForSequenceClassification
	 */
	auto_model: string;
	/**
	 * if set in config.json's auto_map
	 */
	custom_class?: string;
	/**
	 * e.g. text-classification
	 */
	pipeline_tag?: PipelineType;
	/**
	 * e.g. "AutoTokenizer" | "AutoFeatureExtractor" | "AutoProcessor"
	 */
	processor?: string;
}
@coyotte508
Member

coyotte508 commented Jan 21, 2025

@rwightman
Contributor Author

rwightman commented Jan 22, 2025

@coyotte508 thanks! My day-to-day doesn't involve any internal repos, so I'm not an hf-internal member and can't see the code there.

Might be a good time for me to cc @pcuenca and @qubvel for visibility.

@coyotte508
Member

coyotte508 commented Jan 22, 2025

Looking at the two models, timm/vit_base_patch16_224.augreg2_in21k_ft_in1k vs facebook/deit-base-patch16-224, these are their respective transformersInfo objects:

{
    "auto_model": "AutoModel"
}

vs

{
    "auto_model": "AutoModelForImageClassification",
    "pipeline_tag": "image-classification",
    "processor": "AutoImageProcessor"
}

In https://huggingface.co/datasets/huggingface/transformers-metadata/blob/main/pipeline_tags.json#L1118 we do have a line {"model_class":"TimmWrapperForImageClassification","pipeline_tag":"image-classification","auto_class":"AutoModelForImageClassification"}

And in https://huggingface.co/datasets/huggingface/transformers-metadata/blob/main/frameworks.json#L248 we do have a line {"model_type":"timm_wrapper","pytorch":true,"tensorflow":false,"flax":false,"processor":"AutoImageProcessor"}

Btw those come from https://huggingface.co/datasets/huggingface/transformers-metadata/tree/main which is maintained by the transformers team / @LysandreJik

So in theory we have all the info we need to provide the correct transformersInfo, as long as we detect that the model_type is timm_wrapper.
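
Purely to illustrate the mapping (this is not actual Hub code; the variable names are made up for the sketch), combining those two metadata entries gives exactly the transformersInfo we'd want for a timm_wrapper classification model:

// Illustrative sketch only, not the internal implementation.
// pipelineEntry is the pipeline_tags.json line and frameworkEntry the frameworks.json line quoted above.
const pipelineEntry = {
	model_class: "TimmWrapperForImageClassification",
	pipeline_tag: "image-classification",
	auto_class: "AutoModelForImageClassification",
};
const frameworkEntry = { model_type: "timm_wrapper", processor: "AutoImageProcessor" };

const transformersInfo = {
	auto_model: pipelineEntry.auto_class, // "AutoModelForImageClassification"
	pipeline_tag: pipelineEntry.pipeline_tag, // "image-classification"
	processor: frameworkEntry.processor, // "AutoImageProcessor"
};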

Going back to https://huggingface.co/api/models/timm/vit_base_patch16_224.augreg2_in21k_ft_in1k, we have an empty config object, and in https://huggingface.co/api/models/facebook/deit-base-patch16-224?library=transformers we have this config:

  "config": {
    "architectures": [
      "ViTForImageClassification"
    ],
    "model_type": "vit"
  },

So the problem is probably the empty config for the timm model.

If we compare the two config.json, for the timm model: https://huggingface.co/timm/vit_base_patch16_224.augreg2_in21k_ft_in1k/blob/main/config.json and the facebook model: https://huggingface.co/facebook/deit-base-patch16-224/blob/main/config.json

The timm model lacks the model_type and architectures keys in its config.json. It does have an architecture field with value vit_base_patch16_224 though 🤔

The config we generate is based on the config.json, although some architectures can have config sub-keys coming from different JSON files, e.g. config.peft is generated from the adapter_config.json file if it's a PEFT model. (Edit: note that we have access to config when generating the snippets, for what it's worth.)

So my best guess would be to add something like architectures: ["TimmWrapperForImageClassification"], model_type: "timm_wrapper" in the config.json directly in the timm models.

Keep in mind I'm way out of my depth on the feasibility/reasonableness of this ask 🙏

@merveenoyan
Contributor

@coyotte508 is correct, they're inferred from config indeed

@julien-c
Member

So my best guess would be to add something like architectures: ["TimmWrapperForImageClassification"], model_type: "timm_wrapper" in the config.json directly in the timm models.

Yes, it's on the open source / transformers team side, but probably not very high priority imo.

@rwightman
Contributor Author

So it looks like the metadata does indeed have the right info. The config.json files for timm models are not Transformers configs though, so adding those fields there doesn't make sense; it'd be more a matter of inferring model_type = timm_wrapper from the fact that it's a timm model and then using the metadata values that are already there... something (fuzzily) along those lines.

@julien-c
Member

Can you explain what a "timm model in transformers" is, BTW? I haven't followed and I'm not clear on what it actually means.

@rwightman
Contributor Author

rwightman commented Jan 22, 2025

@julien-c You can use all of the timm models as image classifiers or feature extractors with transformers, including via the AutoModel/AutoProcessor and pipeline APIs (https://huggingface.co/blog/timm-transformers). It also lets timm models work with the HF Trainer, and you can push the models back to the Hub and they work with either timm or transformers. The Hub models remain natively in timm format (checkpoint formats, keys, etc. are timm), and the config.json remains timm, but the timm wrapper adapts the model & image processor for use in Transformers.

The reason for this issue is that the pipeline snippets for timm models (for the Transformers lib) should now be the same as for equivalent types of native Transformers models.


Pick a timm model on the Hub, e.g. https://huggingface.co/timm/vit_so150m_patch16_reg4_gap_256.sbb_e250_in12k

With transformers:

import transformers
pipe = transformers.pipeline("image-classification", model="timm/vit_so150m_patch16_reg4_gap_256.sbb_e250_in12k")
pipe('torch2up.jpg')

Out[8]: 
[{'label': 'chromatic color, chromatic colour, spectral color, spectral colour',
  'score': 0.9763004779815674},
 {'label': 'parallel', 'score': 0.005799838807433844},
 {'label': 'circle', 'score': 0.003302227472886443},
 {'label': 'triangle, trigon, trilateral', 'score': 0.0012768494198098779},
 {'label': 'graduated cylinder', 'score': 0.0011402885429561138}]

or

from transformers import (
    AutoModelForImageClassification,
    AutoImageProcessor,
)
image_processor = AutoImageProcessor.from_pretrained('timm/vit_so150m_patch16_reg4_gap_256.sbb_e250_in12k')
model = AutoModelForImageClassification.from_pretrained('timm/vit_so150m_patch16_reg4_gap_256.sbb_e250_in12k').eval()

@coyotte508
Member

The relevant code is here in huggingface.js:

export const transformers = (model: ModelData): string[] => {
	const info = model.transformersInfo;
	if (!info) {
		return [`# ⚠️ Type of model unknown`];
	}
	const remote_code_snippet = model.tags.includes(TAG_CUSTOM_CODE) ? ", trust_remote_code=True" : "";
	let autoSnippet: string;
	if (info.processor) {
		const varName =
			info.processor === "AutoTokenizer"
				? "tokenizer"
				: info.processor === "AutoFeatureExtractor"
					? "extractor"
					: "processor";
		autoSnippet = [
			"# Load model directly",
			`from transformers import ${info.processor}, ${info.auto_model}`,
			"",
			`${varName} = ${info.processor}.from_pretrained("${model.id}"` + remote_code_snippet + ")",
			`model = ${info.auto_model}.from_pretrained("${model.id}"` + remote_code_snippet + ")",
		].join("\n");
	} else {
		autoSnippet = [
			"# Load model directly",
			`from transformers import ${info.auto_model}`,
			`model = ${info.auto_model}.from_pretrained("${model.id}"` + remote_code_snippet + ")",
		].join("\n");
	}
	if (model.pipeline_tag && LIBRARY_TASK_MAPPING.transformers?.includes(model.pipeline_tag)) {
		const pipelineSnippet = ["# Use a pipeline as a high-level helper", "from transformers import pipeline", ""];
		if (model.tags.includes("conversational") && model.config?.tokenizer_config?.chat_template) {
			pipelineSnippet.push("messages = [", ' {"role": "user", "content": "Who are you?"},', "]");
		}
		pipelineSnippet.push(`pipe = pipeline("${model.pipeline_tag}", model="${model.id}"` + remote_code_snippet + ")");
		if (model.tags.includes("conversational") && model.config?.tokenizer_config?.chat_template) {
			pipelineSnippet.push("pipe(messages)");
		}
		return [pipelineSnippet.join("\n"), autoSnippet];
	}
	return [autoSnippet];
};

e.g. this line:

	const info = model.transformersInfo;

You can add something like this maybe:

	const info = model.transformersInfo;
	if (info && model.library_name === "timm" && model.pipeline_tag === "image-classification") {
		info.processor = ...;
		info.auto_model = ...;
	}

That would fix the snippets. We could also change our internal codebase to fix transformersInfo at the source, but that would change the API responses (/api/models/...) and I'm not sure how valid that is, given it is not actually "transformers".

So the ~4-line change in the huggingface.js snippets ⬆ is probably the simplest way.
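
For concreteness, here's roughly what that special case could look like with the values from the transformers-metadata entries above filled in (just a sketch of the idea, not a tested implementation; a similar branch would presumably be needed for the feature-extraction case):

	const info = model.transformersInfo;
	if (info && model.library_name === "timm" && model.pipeline_tag === "image-classification") {
		// values taken from the pipeline_tags.json / frameworks.json entries for timm_wrapper quoted earlier
		info.auto_model = "AutoModelForImageClassification";
		info.processor = "AutoImageProcessor";
	}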
