
Comments (7)

Craigacp commented on July 19, 2024

I'm not sure I understand the question. At the moment Tribuo doesn't have any implementations of feature selection wrappers. To add one you need to implement org.tribuo.FeatureSelector with the desired algorithm. The SelectedFeatureSet produced by a run of the algorithm can be saved out, and you can produce a dataset containing only the selected features by constructing a SelectedFeatureDataset.

from tribuo.

Craigacp commented on July 19, 2024

You should keep the test set used by the wrapper completely separate from the test set used to evaluate the final classifier, so you need to split your data into at least three chunks: a train set for the wrapper, a test set for the wrapper, and a final test set. You can also train the final classifier on the wrapper's train and test sets combined if you want, but that's not necessary. Inside the wrapper you can also do cross validation, or randomly split the data again for each feature set. Essentially, all three of those options operate on whatever data you pass into the wrapper, which should be separate from your final test set.
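As a rough illustration of the three-chunk split described above, here is a minimal sketch in plain Java that partitions row indices into a wrapper train set, a wrapper test set, and a final test set. The class ThreeWaySplit and its parameter names are hypothetical and are not part of Tribuo; in Tribuo itself the same effect could presumably be had by applying TrainTestSplitter twice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper (not Tribuo API): splits row indices into three disjoint chunks.
final class ThreeWaySplit {
    final List<Integer> wrapperTrain; // used to fit the wrapper's inner model
    final List<Integer> wrapperTest;  // used to score each candidate feature set
    final List<Integer> finalTest;    // never shown to the wrapper

    private ThreeWaySplit(List<Integer> wrapperTrain, List<Integer> wrapperTest, List<Integer> finalTest) {
        this.wrapperTrain = wrapperTrain;
        this.wrapperTest = wrapperTest;
        this.finalTest = finalTest;
    }

    static ThreeWaySplit of(int numRows, double wrapperTrainFraction, double wrapperTestFraction, long seed) {
        if (numRows < 0 || wrapperTrainFraction < 0.0 || wrapperTestFraction < 0.0
                || wrapperTrainFraction + wrapperTestFraction > 1.0) {
            throw new IllegalArgumentException("Fractions must be non-negative and sum to at most 1");
        }
        List<Integer> indices = new ArrayList<>(numRows);
        for (int i = 0; i < numRows; i++) {
            indices.add(i);
        }
        // A fixed seed makes the split reproducible.
        Collections.shuffle(indices, new Random(seed));
        int trainEnd = (int) Math.round(numRows * wrapperTrainFraction);
        int testEnd = Math.min(numRows, trainEnd + (int) Math.round(numRows * wrapperTestFraction));
        return new ThreeWaySplit(
                List.copyOf(indices.subList(0, trainEnd)),
                List.copyOf(indices.subList(trainEnd, testEnd)),
                List.copyOf(indices.subList(testEnd, numRows)));
    }

    public static void main(String[] args) {
        ThreeWaySplit split = ThreeWaySplit.of(100, 0.6, 0.2, 42L);
        System.out.println(split.wrapperTrain.size() + " / "
                + split.wrapperTest.size() + " / " + split.finalTest.size());
    }
}
```

Only wrapperTrain and wrapperTest would be handed to the feature selection wrapper; finalTest stays untouched until the final classifier is evaluated.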


Mohammed-Ryiad-Eiadeh commented on July 19, 2024

That's all I need to know for now. If I have further concerns I may reopen this issue.


Mohammed-Ryiad-Eiadeh commented on July 19, 2024

Dear Adam,

I implemented a wrapper FS algorithm based on cuckoo search, and I would like your opinion on the following:

    // Load the CSV, using the "Class" column as the label.
    var data = new CSVLoader<>(new LabelFactory()).loadDataSource(Paths.get("C:\\Users\\20187\\Desktop\\o.csv"), "Class");

    // Split 50/50 into the parts the wrapper trains and tests on.
    var dataSplitter = new TrainTestSplitter<Label>(data, 0.5, Trainer.DEFAULT_SEED);
    var TrainingPart = new MutableDataset<Label>(dataSplitter.getTrain());
    var TestingPart = new MutableDataset<Label>(dataSplitter.getTest());

    // The test part is passed to the constructor, the train part to select().
    var opt = new CuckooSearchOptimizer(TestingPart,
            TransferFunction.TransferFunction_V2,
            50,
            2,
            2,
            0.1,
            1.5,
            10);

    var SFS = opt.select(TrainingPart);

This is what the algorithm looks like. My concern is passing the test part to the constructor, since I think the code could be cleaner. However, a wrapper FS method has to train and test each solution in the population, so it needs both a train and a test portion. My suggestion is to pass the data source to the FS algorithm instead, for example:

    var data = new CSVLoader<>(new LabelFactory()).loadDataSource(Paths.get("C:\\Users\\20187\\Desktop\\o.csv"), "Class");

    // Proposed alternative: the optimizer receives the whole data source.
    var opt = new CuckooSearchOptimizer(data,
            TransferFunction.TransferFunction_V2,
            50,
            2,
            2,
            0.1,
            1.5,
            10);

    var SFS = opt.getSelectedFeature();

There would also be some other methods to retrieve any further information needed.

Please tell me if there is a more appropriate solution for this.


Craigacp commented on July 19, 2024

I would pass the feature selection algorithm a dataset and have it split that internally, controlled by a parameter. DataSources should only be converted into Datasets, nothing should really be processing them in the DataSource form.
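One way to read that suggestion is sketched below in plain Java: the caller hands over a single collection of examples, and the wrapper splits it internally according to a constructor parameter. InternalSplitWrapper, trainProportion and fitAndScore are hypothetical names chosen for illustration, not Tribuo API, and a real implementation would work on a Tribuo Dataset rather than a List.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.function.ToDoubleBiFunction;

// Hypothetical sketch (not Tribuo API): the wrapper owns its inner train/test split.
final class InternalSplitWrapper<T> {
    private final double trainProportion; // fraction of the data used for inner training
    private final long seed;              // fixed seed so every candidate sees the same split

    InternalSplitWrapper(double trainProportion, long seed) {
        if (!(trainProportion > 0.0 && trainProportion < 1.0)) {
            throw new IllegalArgumentException("trainProportion must be in (0, 1)");
        }
        this.trainProportion = trainProportion;
        this.seed = seed;
    }

    // Scores one candidate: fitAndScore receives (innerTrain, innerTest) and returns a score.
    double score(List<T> data, ToDoubleBiFunction<List<T>, List<T>> fitAndScore) {
        List<T> shuffled = new ArrayList<>(data); // copy so the caller's list is not reordered
        Collections.shuffle(shuffled, new Random(seed));
        int cut = (int) Math.round(shuffled.size() * trainProportion);
        List<T> innerTrain = shuffled.subList(0, cut);
        List<T> innerTest = shuffled.subList(cut, shuffled.size());
        return fitAndScore.applyAsDouble(innerTrain, innerTest);
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            data.add(i);
        }
        InternalSplitWrapper<Integer> wrapper = new InternalSplitWrapper<>(0.7, 1L);
        // Dummy scorer that just reports the size of the inner test part.
        System.out.println(wrapper.score(data, (train, test) -> test.size()));
    }
}
```

With this shape the caller never constructs the wrapper's test part itself, so the only data it has to keep aside is the final test set.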


Mohammed-Ryiad-Eiadeh commented on July 19, 2024

Okay. Inside the algorithm I need to train some trainer, such as KNN (a lazy algorithm), in order to evaluate each solution in the population. I therefore need both train and test parts inside the algorithm, and I can't get that by passing only the training part. I would like to know your suggestion.


Mohammed-Ryiad-Eiadeh commented on July 19, 2024

I think 10-fold cross validation is suitable for this task, and it solved the issue I was asking about. Next I want to add some other constructors and write some comments. Thanks for your help. I will request to add the model to the Tribuo engine, and I may add more wrapper approaches for FS in the near future. The code looks like this:

    var data = new CSVLoader<>(new LabelFactory()).loadDataSource(Paths.get("C:\\Users\\20187\\Desktop\\o.csv"), "Class");

    var dataSplitter = new TrainTestSplitter<Label>(data, 0.5, Trainer.DEFAULT_SEED);
    var TrainingPart = new MutableDataset<Label>(dataSplitter.getTrain());
    var TestingPart = new MutableDataset<Label>(dataSplitter.getTest());

    // The optimizer no longer takes a dataset in its constructor.
    var opt = new CuckooSearchOptimizer(TransferFunction.TransferFunction_V2,
            50,
            2,
            2,
            0.1,
            1.5,
            20);

    // Select features on the training part only, then project that part onto them.
    var SFS = opt.select(TrainingPart);
    System.out.println(SFS.featureNames().size());
    var SFDS = new SelectedFeatureDataset<>(TrainingPart, SFS);
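For readers wondering how cross validation removes the need for a separate test part, here is a minimal plain-Java sketch of k-fold scoring over row indices. It is not the CuckooSearchOptimizer code from this thread, which is not shown; KFoldScorer and fitAndScore are hypothetical names, and the scorer stands in for training and evaluating a model such as KNN on one candidate feature set.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.function.ToDoubleBiFunction;

// Hypothetical sketch (not Tribuo API): average a candidate's score over k folds.
final class KFoldScorer {

    // Shuffles the row indices and deals them round-robin into k folds.
    static List<List<Integer>> folds(int numRows, int k, long seed) {
        if (k < 2 || k > numRows) {
            throw new IllegalArgumentException("k must be between 2 and numRows");
        }
        List<Integer> indices = new ArrayList<>(numRows);
        for (int i = 0; i < numRows; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));
        List<List<Integer>> folds = new ArrayList<>(k);
        for (int f = 0; f < k; f++) {
            folds.add(new ArrayList<>());
        }
        for (int i = 0; i < numRows; i++) {
            folds.get(i % k).add(indices.get(i));
        }
        return folds;
    }

    // Each fold is held out once; the remaining folds form the training rows.
    static double meanScore(List<List<Integer>> folds,
                            ToDoubleBiFunction<List<Integer>, List<Integer>> fitAndScore) {
        double total = 0.0;
        for (int held = 0; held < folds.size(); held++) {
            List<Integer> train = new ArrayList<>();
            for (int f = 0; f < folds.size(); f++) {
                if (f != held) {
                    train.addAll(folds.get(f));
                }
            }
            total += fitAndScore.applyAsDouble(train, folds.get(held));
        }
        return total / folds.size();
    }

    public static void main(String[] args) {
        List<List<Integer>> folds = folds(100, 10, 7L);
        // Dummy scorer that reports the held-out fold size instead of an accuracy.
        System.out.println(meanScore(folds, (train, test) -> test.size()));
    }
}
```

Because every row is held out exactly once, the wrapper only needs the single dataset passed to select, and the final test set stays outside it.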

