Comments (13)

fsmosca commented on June 27, 2024

Does this happen when using stockfish?

from lakas.

ChrisWhittington commented on June 27, 2024

SF runs okay (so far) at concurrency 60, but it puts much less load on the CPU: 18% at concurrency 60. It still shows the demand spike at the end of each game cycle (each budget), to about 44%.
I also tried 500 game cycles and depth 9 to see if it loaded the CPU more. It did, but not by much.
Quite probably SF is way more efficiently organised, memory-wise, than mine. Even so, the doubling of CPU demand at cycle end is weird. It would be good if you checked it out, and batched or time-delayed things (?) so that not every core was being hit at once.

ChrisWhittington commented on June 27, 2024

Because the memory and CPU demand spikes (doubling) happen at the end of each budget, I wonder if you're closing all engines out, and then opening them up again, such that there's overlap? Are new invocations opening before the old ones have finished shutting down? That might account for the doubling.

fsmosca commented on June 27, 2024

> Because the memory and CPU demand spikes (doubling) happen at the end of each budget, I wonder if you're closing all engines out, and then opening them up again, such that there's overlap? New invocations opening before the old invocations are finished off with ending? That might account for the doubling.

Engines are restarted only after each budget.

fsmosca commented on June 27, 2024

> SF runs okay (so far) with 60 concurrency, but it puts much less load on the CPU. 18% at concurrency 60. But still shows the spike demand at the end of each game cycle (each budget) to about 44%

What time control or depth did you use for this test?

> Quite probably SF is way more efficiently organised, memory-wise, than mine. Even so, the doubling of CPU demand at cycle end is weird. Be good if checked it out, and batched or time delayed (?) so not everything core-wise was being hit at once?

After a budget, all engines are quit. Then it's time for nevergrad to update the params. I will try to add a log to measure the CPU and RAM usage and the time elapsed while nevergrad is updating.

ChrisWhittington commented on June 27, 2024

Giving back memory from many processes may be taking time?
It’s for sure there are 2xN engine threads running/opening/closing simultaneously because of the RAM usage doubling during the CPU usage spike. It can’t be nevergrad grabbing RAM, the amount is too large and more or less exactly matches what N engines grab.
Maybe it’s possible to await engine close signals before starting up again?
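
The idea of waiting for the old processes to exit before starting new ones can be sketched in Python as below. This is only an illustration of the principle, not how Lakas works: in Lakas the engines are started by cutechess-cli, not by Python directly. The helper names `stop_and_wait` and `start_batch` are hypothetical, and the "engines" are stand-in Python processes that just sleep.

```python
import subprocess
import sys


def stop_and_wait(procs, timeout_s=10.0):
    """Ask every process to stop, then block until each one has really exited."""
    for proc in procs:
        if proc.poll() is None:  # still running
            proc.terminate()
    for proc in procs:
        try:
            proc.wait(timeout=timeout_s)
        except subprocess.TimeoutExpired:
            proc.kill()  # last resort for a process that will not exit
            proc.wait()


def start_batch(n):
    """Start n stand-in 'engines' (here: Python processes that just sleep)."""
    cmd = [sys.executable, "-c", "import time; time.sleep(60)"]
    return [subprocess.Popen(cmd) for _ in range(n)]


if __name__ == "__main__":
    old = start_batch(4)
    stop_and_wait(old)        # nothing from the old batch is alive after this
    new = start_batch(4)      # so the two batches never overlap
    stop_and_wait(new)
```

Because `wait()` returns only once the operating system has reaped the process, the next batch cannot start while the previous one is still releasing its memory.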

fsmosca commented on June 27, 2024

Created a branch https://github.com/fsmosca/Lakas/tree/more_logging

You need psutil for this.
pip install psutil

It will log to match_lakas.txt for the usage of cpu when nevergrad updates its data.

sample:

Initial:

2021-02-13 09:53:42,317 |  12048 | INFO  | starting main()
2021-02-13 09:53:42,581 |  12048 | INFO  | budget 1, after asking recommendation      , proc_id: 12048, cpu_usage%: 0, num_threads: 8, proc_name: python
2021-02-13 09:53:42,584 |  12048 | INFO  | before a match starts                      , proc_id: 12048, cpu_usage%: 0, num_threads: 8, proc_name: python
budget 2, after asking recommendation      , proc_id: 12048, cpu_usage%: 12, num_threads: 8, proc_name: python
budget 3, after asking recommendation      , proc_id: 12048, cpu_usage%: 10, num_threads: 8, proc_name: python

Using stockfish with concurrency 6 on my 4-core/8-thread PC, the optimizer uses around 12% CPU in the python process alone. In my other tests it reached 25%, the highest I observed.

Note that match_lakas.txt can get very big, as it logs the engine output from cutechess-cli.
If you test it, just use a smaller budget of 4 or so, or interrupt after a couple of budgets.
To reduce the log remove the line
command += ' -debug'
from lakas.py

Later I will log the memory used.
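
For reference, a log line like the ones in the sample could be produced with psutil roughly as follows. This is a minimal sketch, not the code in the more_logging branch: the function names and the `mem_used_mb` field are assumptions made for illustration, and only the field layout is modelled on the sample above.

```python
import os


def usage_line(label, pid, cpu_percent, num_threads, name, rss_bytes=None):
    """Format one usage line in the style of the sample log."""
    line = (f"{label:<42}, proc_id: {pid}, cpu_usage%: {cpu_percent:.0f}, "
            f"num_threads: {num_threads}, proc_name: {name}")
    if rss_bytes is not None:
        # mem_used_mb is a hypothetical field name, not taken from Lakas.
        line += f", mem_used_mb: {rss_bytes / (1024 * 1024):.1f}"
    return line


def current_process_usage(label, interval_s=0.1):
    """Measure the current Python process with psutil and format the result."""
    import psutil  # third-party: pip install psutil

    proc = psutil.Process(os.getpid())
    cpu = proc.cpu_percent(interval=interval_s)  # blocks for interval_s seconds
    return usage_line(label, proc.pid, cpu, proc.num_threads(),
                      proc.name(), proc.memory_info().rss)


# Example call (needs psutil installed):
# print(current_process_usage("budget 1, after asking recommendation"))
```

Note that a per-process `cpu_percent` from psutil can exceed 100 on a multi-core machine, since it is measured relative to a single core.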

fsmosca commented on June 27, 2024

> Giving back memory from many processes may be taking time?

That is possible, I am working on logging memory usage.

> It’s for sure there are 2xN engine threads running/opening/closing simultaneously because of the RAM usage doubling during the CPU usage spike. It can’t be nevergrad grabbing RAM, the amount is too large and more or less exactly matches what N engines grab.
> Maybe it’s possible to await engine close signals before starting up again?

cutechess-cli has -wait N
Wait N milliseconds between games. The default is 0

I will add it as an option later.

Or you can modify the code at

Lakas/lakas.py

Line 299 in 1c5201e

command += ' -debug'

Just add
command += ' -wait 5000'
to wait for 5 seconds.
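
Making both flags optional instead of hard-coded could look something like this. It is a sketch only: `add_cutechess_options` is a hypothetical helper, not a function in lakas.py, and it assumes the command is built up as a plain string as in the line quoted above.

```python
def add_cutechess_options(command, wait_ms=0, debug=False):
    """Append optional cutechess-cli flags to an existing command string."""
    if wait_ms > 0:
        command += f' -wait {wait_ms}'  # milliseconds to wait between games
    if debug:
        command += ' -debug'            # verbose output, makes the log large
    return command


print(add_cutechess_options('cutechess-cli', wait_ms=5000))
```

With `wait_ms=0` and `debug=False` the command is returned unchanged, which matches the cutechess-cli default of no wait between games.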

fsmosca commented on June 27, 2024

The more_logging branch at https://github.com/fsmosca/Lakas/tree/more_logging has been updated:
v0.23.3

commit summary

  • Add --cutechess-debug flag
  • Add memory used by python when optimizer updates its data
  • Add --cutechess-wait option

There is a new v0.25.0 featuring movetime.

Example:
--move-time-ms 100

ChrisWhittington commented on June 27, 2024

--move-time-ms 10 seems to work (a budget runs about twice as fast as with --move-time-ms 25).

The default wait (the code looks like it defaults to 5000 ms, if I read it correctly) doesn't help with the CPU load, which is still doubling at the end of each budget.

It may be that I am just being too careless with RAM usage (256 GB available makes you lazy). When a process is ended, the RAM it gives back has to be cleared (Windows security reasons), so an exit/start overlap is going to have concurrency x 2 cores in use, with that many cores busy nulling out the returned RAM. No doubt the RAM is fragmented all over the place by the concurrent starts.
What is cutechess doing? A 5000 ms wait AFTER it closes an engine? Because that ought to work. Weird.

fsmosca commented on June 27, 2024

> the default wait (code looks like it defaults to 5000 ms, if I read it correct) doesn't help with the CPU load, which is still doubling at the end of each budget.

Yes, the default wait is 5000 ms in lakas.
Perhaps you can see in Task Manager which application is using more CPU after a budget. Is it your engine, python, cutechess, or some other application?

I will also try to log the cpu usage of all the process ids running.
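
A rough way to see which processes are busy at the end of a budget, using psutil, is sketched below. The function names and the filter strings are assumptions for illustration; this is not the logging that was added to Lakas.

```python
import time


def busiest(rows, top_n=5):
    """Return the top_n rows with the highest cpu_percent, busiest first."""
    return sorted(rows, key=lambda r: r["cpu_percent"], reverse=True)[:top_n]


def snapshot(name_filters, interval_s=1.0):
    """Measure CPU% and resident memory of processes whose name matches a filter."""
    import psutil  # third-party: pip install psutil

    wanted = []
    for proc in psutil.process_iter(["name"]):
        name = (proc.info["name"] or "").lower()
        if any(f in name for f in name_filters):
            wanted.append((proc, name))

    for proc, _ in wanted:
        try:
            proc.cpu_percent(None)  # first call only sets the reference point
        except psutil.Error:
            pass
    time.sleep(interval_s)

    rows = []
    for proc, name in wanted:
        try:
            rows.append({
                "pid": proc.pid,
                "name": name,
                "cpu_percent": proc.cpu_percent(None),
                "rss_mb": proc.memory_info().rss / 2**20,
            })
        except psutil.Error:  # the process exited or access was denied
            continue
    return rows


# Example call (needs psutil; the filter names are placeholders):
# for row in busiest(snapshot(["python", "cutechess", "stockfish"])):
#     print(row)
```

Running this in a loop around the end of a budget would show whether the spike comes from the engines, from cutechess, or from the python process.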

> What is cutechess doing? a 5000 ms wait AFTER it closes an engine?

From cutechess:
-wait N Wait N milliseconds between games. The default is 0.

Looks like:

  1. game starts
  2. game ends
  3. wait N
    ...

Don't know what it is doing after.

I will update the branch and log the cpu and mem usage of cutechess.

What happens if you increase the wait to, say, 10 s?
--cutechess-wait 10000

fsmosca commented on June 27, 2024

Master is now updated with some changes, including those from the more_logging branch.

Matthies commented on June 27, 2024

Maybe related to a known issue of cutechess: cutechess/cutechess#630
