
Comments (9)

kamilc-bst commented on June 11, 2024

What if I have a multi-label classification problem and want to evaluate my eval dataset via "f1" with a specific averaging strategy like 'weighted' or 'micro'? Is this supported?

from setfit.

nsorros commented on June 11, 2024

Currently, we support any classification metric from the evaluate library: https://huggingface.co/docs/evaluate/index

@lewtun One limitation, though, seems to be that in the case of non-binary labels there is no way to pass an averaging strategy. This would need to happen here:

return metric_fn.compute(predictions=y_pred, references=y_test)

So in those cases you need to fall back to accuracy or write your own trainer. Would it be in scope to support those evaluations? I am happy to open a PR for this.
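A minimal sketch of the change being described: forwarding extra keyword arguments (e.g. an averaging strategy) through to `metric_fn.compute`, assuming `metric_fn` follows the `evaluate` library's `compute(predictions=..., references=...)` interface. The function name `evaluate_with_kwargs` and the `metric_kwargs` parameter are hypothetical, for illustration only:

```python
def evaluate_with_kwargs(metric_fn, y_pred, y_test, metric_kwargs=None):
    """Hypothetical helper: pass any extra metric options (such as
    {"average": "weighted"}) straight through to metric_fn.compute."""
    metric_kwargs = metric_kwargs or {}
    # Same call as in the trainer, plus the forwarded keyword arguments.
    return metric_fn.compute(predictions=y_pred, references=y_test, **metric_kwargs)
```

This is essentially what the `metric_kwargs` argument discussed later in this thread ended up providing.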


jmwoloso commented on June 11, 2024

Assuming I'm not missing something major, this seems like an easy fix. Here's my current implementation that I'm testing. Happy to start a formal PR for this if desired @lewtun

https://github.com/huggingface/setfit/compare/main...jmwoloso:setfit:fix_multilabel_metrics?expand=1

EDIT: I haven't added tests etc., which I'd need to do to make this a formal PR (and which I'm happy to do); this is just to unblock me from using SetFitTrainer at the moment.


tomaarsen commented on June 11, 2024

You can use "accuracy" or "f1" (now fully supported for multi-label by also using metric_kwargs) as simple low-effort solutions to evaluation, but you can also provide a function if you'd like. See this for more information:

metric (`str` or `Callable`, *optional*, defaults to `"accuracy"`):
The metric to use for evaluation. If a string is provided, we treat it as the metric name and load it with default settings.
If a callable is provided, it must take two arguments (`y_pred`, `y_test`).
metric_kwargs (`Dict[str, Any]`, *optional*):
Keyword arguments passed to the evaluation function if `metric` is an evaluation string like "f1".
For example, this is useful for providing an averaging strategy for computing f1 in a multi-label setting.
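Since the docstring says a callable `metric` must take `(y_pred, y_test)`, here is a minimal sketch of that form: a custom multi-label micro-F1 function. The computation is hand-rolled in pure Python for illustration; in practice you would likely just call `sklearn.metrics.f1_score` with `average="micro"`:

```python
def micro_f1(y_pred, y_test):
    """Custom metric callable: micro-averaged F1 over 2-D binary label
    matrices (lists of equal-length 0/1 label vectors)."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(y_pred, y_test):
        for p, t in zip(pred_row, true_row):
            tp += p and t            # predicted 1, actually 1
            fp += p and not t        # predicted 1, actually 0
            fn += (not p) and t      # predicted 0, actually 1
    denom = 2 * tp + fp + fn
    return {"f1": 2 * tp / denom if denom else 0.0}
```

Such a function can then be passed directly as `metric=micro_f1` instead of a metric-name string.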

I think this should allow for as much flexibility as is required, so I'll close this.

  • Tom Aarsen


lewtun commented on June 11, 2024

Hi @snayan06 thanks for your interest in setfit! Although accuracy is the default metric, you can specify a different one when you create the SetFitTrainer, e.g.:

from setfit import SetFitTrainer

trainer = SetFitTrainer(model=model, train_dataset=train_dataset, metric="f1")

Currently, we support any classification metric from the evaluate library: https://huggingface.co/docs/evaluate/index


snayan06 commented on June 11, 2024

Ok, thanks for answering. I was just looking to contribute and thought this was missing from setfit and would have been helpful to add. Thanks for pointing to the library as well.


snayan06 commented on June 11, 2024

Is there any way you track planned features that people can contribute to, or is it just through the issues? I'm new to open source, but I've been using huggingface/sbert models for a long time and thought it would be good if I could contribute in some way.


jmwoloso commented on June 11, 2024

Shouldn't this line in the same file (and method) handle multi-label, though?

metric_config = "multilabel" if self.model.multi_target_strategy is not None else None
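That line selects the multilabel configuration of the metric, which changes the expected input shape (2-D label matrices rather than 1-D label vectors), but an averaging strategy is still needed to reduce the per-label scores to a single number; that is the gap being discussed. A hand-rolled illustration of the difference (not the `evaluate` implementation) showing per-label F1 and its macro average:

```python
def per_label_f1(y_pred, y_true):
    """Per-column F1 for 2-D binary label matrices (lists of 0/1 rows)."""
    n_labels = len(y_true[0])
    scores = []
    for j in range(n_labels):
        tp = sum(p[j] and t[j] for p, t in zip(y_pred, y_true))
        fp = sum(p[j] and not t[j] for p, t in zip(y_pred, y_true))
        fn = sum((not p[j]) and t[j] for p, t in zip(y_pred, y_true))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores

def macro_f1(y_pred, y_true):
    # Macro averaging: mean of per-label F1 scores (each label weighted equally).
    scores = per_label_f1(y_pred, y_true)
    return sum(scores) / len(scores)
```

Micro averaging, by contrast, pools all label decisions into one confusion count before computing F1, so the two strategies generally give different numbers on the same predictions; this is why the metric config alone is not enough.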


joostjansen commented on June 11, 2024

I was wondering if there have been any updates regarding computing other metrics for multi-label output?


