Comments (4)

mallick84 commented on May 28, 2024

Thank you. Let me check it.

from autoquant.

AdrianAntico commented on May 28, 2024

@mallick84 Try out this version. It looks like you might be referencing an older version of the function. Keep in mind some of the new args, such as TrainOnFull: set it to TRUE to train on the full data, or to FALSE to have the regression model insights returned.

# Build forecast
  Results <- RemixAutoML::AutoCatBoostCARMA(

    # data args
    data = data2,
    TimeWeights = 0.9999,
    TargetColumnName = "Weekly_Sales",
    DateColumnName = "Date",
    HierarchGroups = NULL,
    GroupVariables = c("Store","Dept"),
    TimeUnit = "weeks",
    TimeGroups = c("weeks","months"),

    # Production args
    TrainOnFull = TRUE,
    SplitRatios = c(1 - 2*30 / 143, 30 / 143, 30 / 143),
    PartitionType = "random",
    FC_Periods = 52,
    TaskType = "GPU",
    NumGPU = 1,
    Timer = TRUE,
    DebugMode = FALSE,

    # Target variable transformations
    TargetTransformation = FALSE,
    Methods = c("YeoJohnson", "BoxCox", "Asinh", "Log", "LogPlus1", "Sqrt", "Asin", "Logit"),
    Difference = FALSE,
    NonNegativePred = TRUE,
    RoundPreds = FALSE,

    # Calendar-related features
    CalendarVariables = c("week","wom","month","quarter"),
    HolidayVariable = c("USPublicHolidays"),
    HolidayLags = c(1,2,3),
    HolidayMovingAverages = c(2,3),

    # Lags, moving averages, and other rolling stats
    Lags = list("weeks" = c(1,2,3,4,5,8,9,12,13,51,52,53), "months" = c(1,2,6,12)),
    MA_Periods = list("weeks" = c(2,3,4,5,8,9,12,13,51,52,53), "months" = c(2,6,12)),
    SD_Periods = NULL,
    Skew_Periods = NULL,
    Kurt_Periods = NULL,
    Quantile_Periods = NULL,
    Quantiles_Selected = NULL,

    # Bonus features
    AnomalyDetection = NULL,
    XREGS = NULL,
    FourierTerms = 0,
    TimeTrendVariable = TRUE,
    ZeroPadSeries = NULL,
    DataTruncate = FALSE,

    # ML grid tuning args
    GridTune = FALSE,
    PassInGrid = NULL,
    ModelCount = 5,
    MaxRunsWithoutNewWinner = 50,
    MaxRunMinutes = 60*60,

    # ML evaluation output
    PDFOutputPath = NULL,
    SaveDataPath = NULL,
    NumOfParDepPlots = 0L,

    # ML loss functions
    EvalMetric = "RMSE",
    EvalMetricValue = 1,
    LossFunction = "RMSE",
    LossFunctionValue = 1,

    # ML tuning args
    NTrees = 1000L,
    Depth = 6L,
    L2_Leaf_Reg = NULL,
    LearningRate = NULL,
    Langevin = FALSE,
    DiffusionTemperature = 10000,
    RandomStrength = 1,
    BorderCount = 254,
    RSM = NULL,
    GrowPolicy = "SymmetricTree",
    BootStrapType = "Bayesian",
    ModelSizeReg = 0.5,
    FeatureBorderType = "GreedyLogSum",
    SamplingUnit = "Group",
    SubSample = NULL,
    ScoreFunction = "Cosine",
    MinDataInLeaf = 1)
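The SplitRatios expression in the call above is plain arithmetic on the 143 weekly time points per series. The sketch below only checks that arithmetic. The names `n_weeks` and `holdout` are introduced here for illustration, and reading the three ratios as train, validation, and test shares is an assumption rather than something stated in this thread.

```r
# Illustrative names (not RemixAutoML arguments)
n_weeks <- 143  # time points per Store/Dept series, as used in the call above
holdout <- 30   # weeks assigned to each of the two smaller partitions

# Same expression as in the call, written with the named values
SplitRatios <- c(1 - 2 * holdout / n_weeks, holdout / n_weeks, holdout / n_weeks)

# The three shares should add up to 1 (within floating point tolerance)
ratios_sum_to_one <- isTRUE(all.equal(sum(SplitRatios), 1))

# Convert the shares back to whole weeks per partition
weeks_per_partition <- round(SplitRatios * n_weeks)
print(weeks_per_partition)  # 83 30 30
```

If the series length or holdout size changes, recomputing the ratios this way keeps them summing to 1.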

mallick84 commented on May 28, 2024

I am still stuck trying to resolve it.

On Catboost 0.24.3

`### Load Walmart Data from Remix Institute's Box Account----

data1 <- data.table::fread("https://remixinstitute.box.com/shared/static/9kzyttje3kd7l41y1e14to0akwl9vuje.csv")
Downloaded 3087910 bytes...

# Subset for Stores / Departments With Full Series (143 time points each)

data2 <- data1[, Counts := .N, by = c("Store","Dept")][
  Counts == 143][, Counts := NULL]

# Subset Columns (remove IsHoliday column)

keep <- c("Store","Dept","Date","Weekly_Sales")
data2 <- data2[, ..keep]
data2 %>% glimpse()
Rows: 380,380
Columns: 4
$ Store 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,...
$ Dept 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,...
$ Date 2010-02-05, 2010-02-12, 2010-02-19, 2010-02-26, 2...
$ Weekly_Sales 24924.50, 46039.49, 41595.55, 19403.54, 21827.90, ...

# Build forecast

Results <- RemixAutoML::AutoCatBoostCARMA(

  # data args
  data = data2,
  TimeWeights = 0.9999,
  TargetColumnName = "Weekly_Sales",
  DateColumnName = "Date",
  HierarchGroups = NULL,
  GroupVariables = c("Store","Dept"),
  TimeUnit = "weeks",
  TimeGroups = c("weeks","months"),

  # Production args
  TrainOnFull = TRUE,
  SplitRatios = c(1 - 2*30 / 143, 30 / 143, 30 / 143),
  PartitionType = "random",
  FC_Periods = 52,
  TaskType = "GPU",
  NumGPU = 1,
  Timer = TRUE,
  DebugMode = FALSE,

  # Target variable transformations
  TargetTransformation = FALSE,
  Methods = c("YeoJohnson", "BoxCox", "Asinh", "Log", "LogPlus1", "Sqrt", "Asin", "Logit"),
  Difference = FALSE,
  NonNegativePred = TRUE,
  RoundPreds = FALSE,

  # Calendar-related features
  CalendarVariables = c("week","wom","month","quarter"),
  HolidayVariable = c("USPublicHolidays"),
  HolidayLags = c(1,2,3),
  HolidayMovingAverages = c(2,3),

  # Lags, moving averages, and other rolling stats
  Lags = list("weeks" = c(1,2,3,4,5,8,9,12,13,51,52,53), "months" = c(1,2,6,12)),
  MA_Periods = list("weeks" = c(2,3,4,5,8,9,12,13,51,52,53), "months" = c(2,6,12)),
  SD_Periods = NULL,
  Skew_Periods = NULL,
  Kurt_Periods = NULL,
  Quantile_Periods = NULL,
  Quantiles_Selected = NULL,

  # Bonus features
  AnomalyDetection = NULL,
  XREGS = NULL,
  FourierTerms = 0,
  TimeTrendVariable = TRUE,
  ZeroPadSeries = NULL,
  DataTruncate = FALSE,

  # ML grid tuning args
  GridTune = FALSE,
  PassInGrid = NULL,
  ModelCount = 5,
  MaxRunsWithoutNewWinner = 50,
  MaxRunMinutes = 60*60,

  # ML evaluation output
  PDFOutputPath = NULL,
  SaveDataPath = NULL,
  NumOfParDepPlots = 0L,

  # ML loss functions
  EvalMetric = "RMSE",
  EvalMetricValue = 1,
  LossFunction = "RMSE",
  LossFunctionValue = 1,

  # ML tuning args
  NTrees = 1000L,
  Depth = 6L,
  L2_Leaf_Reg = NULL,
  LearningRate = NULL,
  Langevin = FALSE,
  DiffusionTemperature = 10000,
  RandomStrength = 1,
  BorderCount = 254,
  RSM = NULL,
  GrowPolicy = "SymmetricTree",
  BootStrapType = "Bayesian",
  ModelSizeReg = 0.5,
  FeatureBorderType = "GreedyLogSum",
  SamplingUnit = "Group",
  SubSample = NULL,
  ScoreFunction = "Cosine",
  MinDataInLeaf = 1)
Learning rate set to 0.093159
Error in catboost::catboost.train(learn_pool = TrainPool, test_pool = TestPool, :
  c:/program files (x86)/go agent/pipelines/buildmaster/catboost.git/catboost/cuda/cuda_lib/cuda_base.h:281: CUDA error 35: CUDA driver version is insufficient for CUDA runtime version`

After updating catboost to 0.24.4

`### Load Walmart Data from Remix Institute's Box Account----

data1 <- data.table::fread("https://remixinstitute.box.com/shared/static/9kzyttje3kd7l41y1e14to0akwl9vuje.csv")
Downloaded 3087910 bytes...

# Subset Columns (remove IsHoliday column)----
keep <- c("Store","Dept","Date","Weekly_Sales")
data2 <- data2[, ..keep]
data2 %>% glimpse()
Rows: 380,380
Columns: 4
$ Store 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,...
$ Dept 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,...
$ Date 2010-02-05, 2010-02-12, 2010-02-19, 2010-02-26, 2010-03...
$ Weekly_Sales 24924.50, 46039.49, 41595.55, 19403.54, 21827.90, 21043....

# Subset for Stores / Departments With Full Series (143 time points each)----

data2 <- data1[, Counts := .N, by = c("Store","Dept")][
  Counts == 143][, Counts := NULL]

# Build forecast

Results <- RemixAutoML::AutoCatBoostCARMA(

  # data args
  data = data2,
  TimeWeights = 0.9999,
  TargetColumnName = "Weekly_Sales",
  DateColumnName = "Date",
  HierarchGroups = NULL,
  GroupVariables = c("Store","Dept"),
  TimeUnit = "weeks",
  TimeGroups = c("weeks","months"),

  # Production args
  TrainOnFull = TRUE,
  SplitRatios = c(1 - 2*30 / 143, 30 / 143, 30 / 143),
  PartitionType = "random",
  FC_Periods = 52,
  TaskType = "GPU",
  NumGPU = 1,
  Timer = TRUE,
  DebugMode = FALSE,

  # Target variable transformations
  TargetTransformation = FALSE,
  Methods = c("YeoJohnson", "BoxCox", "Asinh", "Log", "LogPlus1", "Sqrt", "Asin", "Logit"),
  Difference = FALSE,
  NonNegativePred = TRUE,
  RoundPreds = FALSE,

  # Calendar-related features
  CalendarVariables = c("week","wom","month","quarter"),
  HolidayVariable = c("USPublicHolidays"),
  HolidayLags = c(1,2,3),
  HolidayMovingAverages = c(2,3),

  # Lags, moving averages, and other rolling stats
  Lags = list("weeks" = c(1,2,3,4,5,8,9,12,13,51,52,53), "months" = c(1,2,6,12)),
  MA_Periods = list("weeks" = c(2,3,4,5,8,9,12,13,51,52,53), "months" = c(2,6,12)),
  SD_Periods = NULL,
  Skew_Periods = NULL,
  Kurt_Periods = NULL,
  Quantile_Periods = NULL,
  Quantiles_Selected = NULL,

  # Bonus features
  AnomalyDetection = NULL,
  XREGS = NULL,
  FourierTerms = 0,
  TimeTrendVariable = TRUE,
  ZeroPadSeries = NULL,
  DataTruncate = FALSE,

  # ML grid tuning args
  GridTune = FALSE,
  PassInGrid = NULL,
  ModelCount = 5,
  MaxRunsWithoutNewWinner = 50,
  MaxRunMinutes = 60*60,

  # ML evaluation output
  PDFOutputPath = NULL,
  SaveDataPath = NULL,
  NumOfParDepPlots = 0L,

  # ML loss functions
  EvalMetric = "RMSE",
  EvalMetricValue = 1,
  LossFunction = "RMSE",
  LossFunctionValue = 1,

  # ML tuning args
  NTrees = 1000L,
  Depth = 6L,
  L2_Leaf_Reg = NULL,
  LearningRate = NULL,
  Langevin = FALSE,
  DiffusionTemperature = 10000,
  RandomStrength = 1,
  BorderCount = 254,
  RSM = NULL,
  GrowPolicy = "SymmetricTree",
  BootStrapType = "Bayesian",
  ModelSizeReg = 0.5,
  FeatureBorderType = "GreedyLogSum",
  SamplingUnit = "Group",
  SubSample = NULL,
  ScoreFunction = "Cosine",
  MinDataInLeaf = 1)
Error in .Call("CatBoostHashStrings_R", as.character(preprocessed[[column_index]])) :
  "CatBoostHashStrings_R" not resolved from current namespace (catboost)`

Is there anything I am missing?

AdrianAntico commented on May 28, 2024

CatBoost v0.24.4 isn't working for R currently. The maintainers said it will be fixed in their next release, so you'll have to use v0.24.3 for now. See catboost/catboost#1525.
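A quick way to confirm which catboost release is installed before calling the function is sketched below. `catboost_version_ok()` is a hypothetical helper written for this thread, not part of catboost or RemixAutoML. It only flags the 0.24.4 release reported as broken here.

```r
# Hypothetical helper: TRUE unless the version string is the 0.24.4 release
# reported in this thread as not working from R.
catboost_version_ok <- function(v) {
  package_version(v) != package_version("0.24.4")
}

# Only inspect the installed package if catboost is actually available
if (requireNamespace("catboost", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("catboost"))
  if (!catboost_version_ok(installed)) {
    warning("catboost ", installed, " is installed; this thread suggests using 0.24.3 for now.")
  }
}
```

The check compares parsed version objects rather than raw strings, so a value such as "0.24.10" would not be confused with "0.24.1".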

As for setting up the function to work correctly, check out the example at the bottom of the help file. It shows you how to tune the function. Type this into your R console to see it: ?RemixAutoML::AutoCatBoostCARMA
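The earlier "CUDA error 35" message says the CUDA driver is older than the CUDA runtime, so the GPU path cannot start until the NVIDIA driver is updated. One possible stopgap is to pass "CPU" as the TaskType instead of "GPU". The sketch below assumes that TaskType accepts "CPU", and `choose_task_type()` is a hypothetical helper. Finding `nvidia-smi` on the PATH is only a rough heuristic and does not prove that the driver is recent enough.

```r
# Hypothetical helper: suggest a TaskType value from the path to nvidia-smi.
# An empty path (tool not found) falls back to "CPU".
choose_task_type <- function(smi_path = unname(Sys.which("nvidia-smi"))) {
  if (nzchar(smi_path)) "GPU" else "CPU"
}

# Value that could be passed as TaskType in the call shown in this thread
task_type <- choose_task_type()
print(task_type)
```

If the GPU run still fails with error 35 after this check, setting TaskType to "CPU" directly or updating the driver are the remaining options.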
