
Comments (3)

ShiqiYang2022 commented on September 2, 2024

Tag Nano, Re this https://github.com/gslab-econ/blp-instruments/issues/201#issuecomment-1888211730:

I implemented the solution you suggested in 1b6333c. The memory usage improved only marginally, because the exponent values in these lines are not large enough to be the bottleneck. This indicates that evaluating the exponential function in Matlab is memory-costly even when the exponents are small, since the cost is driven by the size of the array rather than the magnitude of its entries. Hence, my conclusion is that after 1b6333c we have already optimized this calculation, and there is little room to further reduce its memory usage.
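To make the point concrete, here is a minimal standalone illustration (not project code) of why the magnitude of the exponents does not matter for memory:

```matlab
% Minimal illustration, not project code: exp() produces a full output array
% the same size as its input, so its memory cost is driven by the array's
% dimensions, not by how small the exponent values are.
X = -1e-3 * rand(2e4, 1e3);   % small exponent values; ~160 MB of doubles
E = exp(X);                   % allocates another ~160 MB for the result
whos X E                      % both arrays occupy the same amount of memory
```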


ShiqiYang2022 commented on September 2, 2024

Tag Jesse Nano, I am summarizing what we have learned for each goal of https://github.com/gslab-econ/blp-instruments/issues/201#issue-2018943946:

Per the goals of this issue:

  1. Rerun /analysis_cluster/ again to further test our changes in #195. See here.
  2. Revise the code to profile only the 3 RCNL estimations in https://github.com/gslab-econ/blp-instruments/issues/195#issuecomment-1823738708. See here.
  3. Understand where we experienced the peak demand for computing resources (memory).
    Conclusion: Memory peaks when --
    (1) the splitapply() function is used to calculate derivatives related to the nonlinear parameters (D_nonlin);
    (2) matrix inverses are calculated -- specifically, when we calculate derivatives in rcnl_der() and the derivative of the utility in rcnl_ddelta(), both of which involve matrix inversion (a schematic sketch of this pattern follows the list);
    (3) occasionally, at some odd spots where I cannot explain why so much memory is consumed.
  4. Understand where and why we experienced the longest compute time.
    Conclusion: The splitapply() function consumes most of the total time (75%) per the RCNL profiling results.
  5. Determine the peak demand for memory.
    I used the job-monitoring tools in the SLURM system to export the memory each job demanded, and cleaned the outputs into joined_jobs.csv on Dropbox. The code is in the /issue/joboverview/ folder. The file covers all jobs in the full run of Nov. 2023 and can also serve as a benchmark for future runs.
    Conclusion: Per the analysis of joined_jobs.csv (a summary sketch also follows the list), peak memory shows a positive relationship only with market size. For market sizes [7500, 10000, 12500], the average peak memory is [10, 13, 16] GB.
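To make point 3 concrete, below is a schematic, hypothetical sketch of the pattern in (1)-(2). It is not the actual rcnl_der()/rcnl_ddelta() code; all variable names and the per-market kernel are placeholders, with a logit-style share Jacobian standing in for the real matrix.

```matlab
% Hypothetical sketch, not project code: splitapply() calls a kernel once per
% market, and inside each call a dense per-market matrix is built and a
% linear system is solved. Memory peaks while these matrices are alive.
market_id = repelem((1:50)', 200);             % 50 markets, 200 products each
s         = rand(numel(market_id), 1) / 250;   % placeholder shares (each market sums to < 1)
G         = findgroups(market_id);

per_market = @(sm) sum((diag(sm) - sm * sm') \ sm);  % logit-style Jacobian as a stand-in
D_nonlin   = splitapply(per_market, s, G);           % one kernel call per market
```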
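For point 5, the summary behind the market-size numbers is roughly the following; the column names here (market_size, peak_memory_gb) are assumptions, not checked against the actual joined_jobs.csv.

```matlab
% Rough illustration of the point-5 summary; column names are assumed.
T = readtable('joined_jobs.csv');
peak_by_size = groupsummary(T, 'market_size', 'mean', 'peak_memory_gb');
disp(peak_by_size)   % market_size 7500 / 10000 / 12500 -> roughly 10 / 13 / 16 GB on average
```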

To provide a high-level illustration of (3)-(5), please find the figure attached. Below is the monitoring record of job 10000_markets_default_tol_ic_RCNL23_0_0_1. This job ran from 1:28 PM to 4:33 PM two weeks ago (x-axis); I took a snapshot of its memory usage (y-axis) every second and plotted the time trend. The blue arrows indicate where the splitapply() function is running; in each loop, after the D_nonlin calculation finishes, the job shifts to the matrix-inverse calculation. The matrix-inverse calculation consumes somewhat less (but still a lot of) memory and is time-efficient, so we observe a quick dive-and-recover pattern after each horizontal stretch.

[Figure: memory-usage time trend for job 10000_markets_default_tol_ic_RCNL23_0_0_1 (RCNL)]

There were many hurdles to profiling and monitoring these jobs simply because they are so heavy, and I had to use multiple tools to figure out what each job looks like. I have (somewhat reluctantly) kept this post concise and focused on the key information, but if any part needs more clarification, just let me know!


ShiqiYang2022 commented on September 2, 2024

Tag Jesse Nano, here are the potential areas for improvement for the next steps in #205 from my side:

Memory

  1. For the splitapply() function, I think there is room to reduce memory, since splitapply() is only doing linear calculations and element-wise multiplication of matrices; intuitively it should cost less memory than the matrix-inverse functions. My sense is that splitapply() stores intermediate results as we loop over every i, resulting in the large memory usage (a loop-based alternative is sketched under Time below).

https://github.com/gslab-econ/blp-instruments/blob/8b9669a414e114208f09b542a651ef1e788ee769/analysis_cluster/code/rcnl_demand/rcnl_ddelta.m#L44

  2. For calculating matrix inverses, I do not think there is currently room for improvement without a fundamental modification of the code. I asked Christopher Conlon about reducing memory when inverting matrices during NBER IO, and he suggested moving those steps onto GPUs, as he feels it is efficient to let the GPU specialize in that kind of computation (a sketch follows this list).

  3. One obvious improvement is to request memory based on market size instead of uniformly requesting 20GB. I implemented this in [Commit].

  4. One question from my side is: what is the marginal benefit of reducing memory? Per Comment, the benefit of reducing memory usage is to lower the memory-per-CPU ratio from 20GB to 8GB, so that we could run up to 2.5 times as many jobs simultaneously. Note that if we want to cut total memory usage significantly, we would need to cut memory usage in both splitapply() and the matrix inversions, since their memory usage is similar and both are currently large, so the cost of reducing memory might be somewhat high. I suggest we decide whether to cut memory further after we revise the DGP and estimation process in #205.
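For point 2, here is a minimal sketch of what moving the linear algebra onto the GPU could look like (it requires the Parallel Computing Toolbox and a supported GPU; A and b are placeholders, not project variables):

```matlab
% Minimal GPU sketch, not project code: solve on the GPU with backslash
% instead of forming an explicit inverse on the CPU.
A  = rand(2000) + 2000 * eye(2000);   % placeholder, diagonally dominant matrix
b  = rand(2000, 1);
Ag = gpuArray(A);                     % transfer to the device once
bg = gpuArray(b);
xg = Ag \ bg;                         % dense solve runs on the GPU
x  = gather(xg);                      % bring the result back to host memory
```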

Time

Per [Comment], we spend most of the time in the splitapply() function, and I think it has room to speed up. Per public discussions (example here), splitapply() is slow when there are many groups. It looks promising to cut down the time cost of splitapply() in #205, and I propose to implement that (a sketch of common alternatives is below).
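As a starting point, here is a hedged sketch of two commonly suggested alternatives; variable names are placeholders, and per_market stands in for whatever kernel we apply to each market.

```matlab
% Sketch of alternatives to splitapply() when there are many groups.
G = findgroups(market_id);

% (a) When the per-group computation reduces to sums/means, accumarray
%     avoids the per-group function-call overhead entirely.
group_sums  = accumarray(G, s);             % within-market sums
group_means = accumarray(G, s, [], @mean);  % within-market means

% (b) For a general kernel, sort by group once and loop over contiguous
%     blocks instead of repeatedly indexing with G == g.
[Gs, order] = sort(G);
edges = [0; find(diff(Gs)); numel(Gs)];
out   = zeros(numel(edges) - 1, 1);
for k = 1:numel(edges) - 1
    idx    = order(edges(k) + 1 : edges(k + 1));   % rows belonging to market k
    out(k) = per_market(s(idx));                   % placeholder per-market kernel
end
```

Note that variant (b) also speaks to the intermediate-storage concern in the Memory section above, since only one market's block is in memory at a time.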

