
Comments (2)

feifeibear commented on July 28, 2024

40B model on 1 GPU.

Step 4 elaspe 34.75464367866516 s, 38.27567377588915 Tflops
CHUNK_LIST_prepare_device .... 42.78182029724121, 12.76010964694122 %
CHUNK_allocate_payload_cpu ... 219.98981094360352, 65.61418119535382 %
CLIENT_access ................ 230.78546500205994, 68.83409396529345 %
CLIENT_release ............... 5.45810604095459, 1.627935208537724 %
chunk_cpu_gpu_move ........... 47.5001916885376, 14.167411531003756 %
CLIENT_access_dist ........... 90.68537592887878, 27.047828544621243 %
chunk_gpu_cpu_move ........... 46.541945934295654, 13.881605064427777 %
CHUNK_LIST_chunk_move ........ 46.56918501853943, 13.889729396193417 %
FWD .......................... 44.62372398376465, 13.30947600947208 %
CLIENT_release_dist .......... 0.12889885902404785, 0.03844538551854307 %
BWD .......................... 60.193546295166016, 17.95333264054873 %
ADAM_prepare_data_grad_copy .. 18.356434106826782, 5.47499172084236 %
ADAM_prepare_data ............ 155.6247365474701, 46.41664820068707 %
ADAM_compute ................. 43.16696763038635, 12.874983725861044 %
ADAM_param_fp32_to_fp16 ...... 30.386908769607544, 9.063202197518356 %
ADAM_release_data ............ 0.3651151657104492, 0.1088992828228694 %
ADAM ......................... 230.46057200431824, 68.73719134997918 %
CHUNK_LIST_make_room ......... 5.053736686706543, 1.507327967840193 %
TOTAL ........................ 335.2778422832489
------------- DATA MOVE RESULTS --------------
chunk_cpu_gpu_move: 1122048.0 MB, 1461 times, 23621.967830305926 MB/s
chunk_gpu_cpu_move: 1095168.0 MB, 1426 times, 23530.77375720547 MB/s
ADAM_prepare_data_grad_copy: 387155.8593940735 MB, 4045 times, 21091.016759627062 MB/s
ADAM_param_fp32_to_fp16: 774311.718788147 MB, 4045 times, 25481.75349651229 MB/s
******************** LOSS ********************
[0.69580078125, 3.775390625, 164.875, 60.75, 6.90234375]
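
A note for reading the log above: the percentage column appears to be each timer divided by TOTAL, and each MB/s figure appears to be the data volume divided by the timer of the same name. The minimal Python sketch below checks this against a few rows copied from the log. The helper names `percent_of_total` and `bandwidth_mb_per_s` are hypothetical and are not part of PatrickStar.

```python
# Values copied verbatim from the profiler log above.
TOTAL_S = 335.2778422832489

timers_s = {
    "chunk_cpu_gpu_move": 47.5001916885376,
    "chunk_gpu_cpu_move": 46.541945934295654,
    "ADAM": 230.46057200431824,
}

moved_mb = {
    "chunk_cpu_gpu_move": 1122048.0,
    "chunk_gpu_cpu_move": 1095168.0,
}


def percent_of_total(seconds: float, total: float) -> float:
    """Share of the total wall time taken by one timer, in percent."""
    return 100.0 * seconds / total


def bandwidth_mb_per_s(megabytes: float, seconds: float) -> float:
    """Average transfer rate, assuming the timer covers only the copy."""
    return megabytes / seconds


if __name__ == "__main__":
    for name, secs in timers_s.items():
        print(f"{name}: {percent_of_total(secs, TOTAL_S):.2f} % of total")
    for name, mb in moved_mb.items():
        print(f"{name}: {bandwidth_mb_per_s(mb, timers_s[name]):.1f} MB/s")
```

The percentages in the log add up to well over 100 %, which suggests that some timers nest or overlap (for example, ADAM alongside the ADAM_* entries). That reading is an inference from the numbers, not something the log states.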

from patrickstar.

feifeibear commented on July 28, 2024

I made a mistake: the statistics above include the warmup iteration.

