nitrousnrg commented on July 21, 2024

Hi Chris, could you attach your motor config?
In particular, I would be looking for a switching frequency set so high that it breaks the RTOS timing. It happened to me, and it's the main reason the watchdog has been reworked.
More than 30 kHz is dangerous territory.

from bldc.

chris1seto commented on July 21, 2024

EDIT: Disregard, bad debugging info

chris1seto commented on July 21, 2024

Never mind, disregard the above comment. This happens with stock settings on a brand new flash of the firmware when configured for 4.12.

nitrousnrg commented on July 21, 2024

I flashed one of my palta boards with hw_410 here and I can't reproduce this issue.

  1. What do you mean by a brand new flash? Did you command a full chip erase from an stlink to ensure old configurations are erased?
  2. Are you using any app with the firmware?
  3. Are you using an encoder or anything else that adds CPU load?
  4. Is your crystal okay? The firmware now double-checks the timing with an independent watchdog clock.

chris1seto commented on July 21, 2024

Yes, I tried a full erase.

My hardware is both a Flipsky mini VESC and a Torque VESC from esk8.

Steps to repro:

git reset --hard; git pull origin master

Uncomment:
#define HW_SOURCE "hw_410.c" // Also for 4.11 and 4.12
#define HW_HEADER "hw_410.h" // Also for 4.11 and 4.12
and comment out the hw_60 lines.

Full erase with STLink, then:

make upload

After this the board never boots up to the point where VCP works, as it is always rebooting.

nitrousnrg commented on July 21, 2024

Steps I'm doing:

  1. qstlink2 --cli -e (full flash memory erase)
  2. Fresh clone from the repo
  3. make clean
  4. Edit conf_general
  5. make upload
  6. Connects OK to VESC Tool
  7. Just in case, program the latest firmware found here with VESC Tool: https://github.com/vedderb/bldc/blob/master/build_all/410_o_411_o_412/VESC_default.bin
  8. Power cycle turns out OK
  9. Store default config
  10. After power cycle, connects OK to VESC Tool

Do you know if there are other users with the same issue?
Thanks

nitrousnrg commented on July 21, 2024

To my comment above add flashing the bootloader before step 7.

chris1seto commented on July 21, 2024

So, I did not flash the bootloader, but it shouldn't make any difference, right? Looking through the code, it doesn't touch the watchdog (other than to abuse it to reset the board, lol). Anyone with a Torque or Flipsky ESC who can test? This is looking like a hardware issue, and it must be with the xtal.

chris1seto commented on July 21, 2024

Here's some more debug info. If I accidentally leave HW60 in as the selected config, USB works! So, what's the difference between 60 and 410 that affects this?

Edit: Also, I confirmed that both boards do have an xtal loaded, but I'm guessing everything must be ok on this front, because if the clock settings or xtal were incorrect, USB wouldn't work at all.

nitrousnrg commented on July 21, 2024

A significant difference is that HW6 defaults to FOC mode and HW4 defaults to BLDC mode...

nitrousnrg commented on July 21, 2024

Maybe adding this to hw_410.h could narrow it down:

// Default setting overrides
#ifndef MCCONF_DEFAULT_MOTOR_TYPE
#define MCCONF_DEFAULT_MOTOR_TYPE		MOTOR_TYPE_FOC
#endif

chris1seto commented on July 21, 2024

Yup! That fixes it. So now...

  • 410 is flashed with the above mod
  • Board reboots, USB VCP comes up
  • Change to BLDC
  • Reboot
  • Board boot loops

So there is an issue starting BLDC mode with the timeout.

chris1seto commented on July 21, 2024

More debugging info: this is absolutely related to the switching frequency. I configured my motor, everything worked great, so I set 29.5 kHz as my FOC switching frequency (everything still worked great) and then I rebooted. After the reboot, the VESC now boot loops. I bet the reason it fails with BLDC selected is that the switching frequency is very high (35 kHz) by default.

nitrousnrg commented on July 21, 2024

Yes, I think you are right.

So it's not a problem with the watchdog. The watchdog led you to discover that CPU usage hits 100% with your default configuration and the scheduler timing is failing.

In my palta hardware I added this limit a while ago to prevent exactly that:
#define HW_LIM_FOC_CTRL_LOOP_FREQ 10000.0, 30000.0 //at around 38kHz the RTOS starts crashing (26us FOC ISR)
https://github.com/vedderb/bldc/blob/master/hwconf/hw_palta.h#L268

IMO a line like that should be added to all hardware versions.

I don't use BLDC mode, but a similar limit should be implemented for that mode.
#define MCCONF_M_BLDC_F_SW_MAX 35000 // Maximum switching frequency in bldc mode
It's either decrease the frequency or optimize the code to make it run faster. (I'd decrease the frequency.)

The frequency limit depends on the CPU load. It looks like BLDC mode (or something else) has become more CPU intensive and now the CPU can't keep up.
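
As a rough illustration of such a limit, here is a minimal sketch that clamps a requested switching frequency into an allowed range. The function and macro names are hypothetical and not taken from the firmware; the 10 kHz and 30 kHz bounds simply mirror the palta values quoted above.

```c
#include <assert.h>

/* Hypothetical bounds, mirroring the 10000.0, 30000.0 pair from the palta header. */
#define SW_FREQ_MIN_HZ 10000.0f
#define SW_FREQ_MAX_HZ 30000.0f

/* Clamp a requested switching frequency into [lo_hz, hi_hz]. */
float clamp_switching_freq(float requested_hz, float lo_hz, float hi_hz) {
    assert(lo_hz <= hi_hz); /* the limits must form a valid range */
    if (requested_hz < lo_hz) {
        return lo_hz;
    }
    if (requested_hz > hi_hz) {
        return hi_hz;
    }
    return requested_hz;
}
```

With these bounds, a 35 kHz request would come back as 30 kHz, and a 20 kHz request would pass through unchanged.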

Now that we have a likely solution (or at least an explanation) I think we need @vedderb

Thanks for reporting!

chris1seto commented on July 21, 2024

And more debugging info... This goes beyond just the switching frequency. If I get a good auto detection in FOC with hall/general, and then reboot, everything is fine. If I take those settings, back them up to a file, and then reload the file, the VESC will boot loop. Even if I simply back up stock settings after a fresh erase/flash and restore them, the same thing happens.

chris1seto commented on July 21, 2024

And even more debugging info: if I do a fresh flash, load settings, leave the motor config untouched, but set the CAN baud to 1M and save, the VESC will boot loop on reboot.

nitrousnrg commented on July 21, 2024

When you are near the CPU limit, any configuration change can make it better or worse. An SPI encoder will require more CPU time, and so would a higher CAN packet decoding frequency.

The max frequency should be dialed down now, and then we can see how to continue. Profiling and optimizing code is an endless endeavor once you hit your resource limits; I'd rather limit the frequency than make the code less clear.

chris1seto commented on July 21, 2024

@nitrousnrg Oops, I didn't see your previous message until now. That said, my configuration isn't really anything interesting. It's a totally stock config other than CAN being 1M, and FOC with a slightly higher switching frequency in sensored mode. It seems a little unreasonable that this should be at the full limits of the hardware/RTOS?

nitrousnrg commented on July 21, 2024

Memory resources are plentiful, but you can easily max out the CPU if you run the core control loop at high frequencies. That's why my first question here was whether you are running > 30 kHz.
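
A back-of-the-envelope way to see this: the fraction of CPU time a periodic ISR consumes is roughly its frequency times its duration. The sketch below uses hypothetical helper names and, as an assumption, the 26 µs FOC ISR figure from the palta header comment quoted earlier in the thread; actual ISR times will differ per hardware and mode.

```c
#include <assert.h>

/* Fraction of CPU time (0..1 and beyond) spent in a periodic ISR:
 * loop frequency multiplied by the time each ISR run takes. */
double isr_cpu_load(double loop_freq_hz, double isr_time_us) {
    assert(loop_freq_hz >= 0.0 && isr_time_us >= 0.0);
    return loop_freq_hz * isr_time_us * 1e-6;
}

/* Highest loop frequency that keeps the ISR at or below max_load (0..1). */
double max_loop_freq_hz(double isr_time_us, double max_load) {
    assert(isr_time_us > 0.0);
    return max_load / (isr_time_us * 1e-6);
}
```

With 26 µs per ISR, 30 kHz works out to about 78% of CPU time for the ISR alone and 38 kHz to about 99%, which lines up with the crash frequency noted in that header comment.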

nitrousnrg commented on July 21, 2024

I just received a support ticket from a customer telling me that the latest firmware doesn't work for him in BLDC mode, so I think this has escalated to a critical bug that needs patching ASAP, before more users upgrade the firmware and brick devices.

chris1seto commented on July 21, 2024

@nitrousnrg Just a note, I encountered this running at 20 kHz (default FOC switching frequency) too. It does not appear to depend only on switching frequency. I don't know the codebase well enough to speculate on what might be going on, but it seems very sensitive to any kind of configuration changes.

nitrousnrg commented on July 21, 2024

Meh, customer installed a wrong resistor, totally unrelated. Too bad I emailed Benjamin about this.

vedderb commented on July 21, 2024

I was following the conversation, but have not been home for a few days so I could not test anything myself. Emailing me is not a problem :-) When I come home I will catch up with the pull requests and issues.

If a commit from back then would break things for HW4 I suspect that I would have heard a lot more by now, so I was kind of hoping that you would resolve the issue.

@chris1seto is it ok to close this issue, or do you still have the problem? If you do, can you make sure that your compiler is working properly and that you did not disable optimizations?

chris1seto commented on July 21, 2024

Hi Benjamin,

That's my feeling too: you'd have heard more if this were really broken, but it seems like it really is (or at least, I'm not sure what could be wrong in my configuration). My compiler should be working correctly, since I build other projects with it, and the optimization options should be set in the makefile, correct? I haven't changed the makefile or any part of the FW other than the general conf file (to target 410). I don't suppose anyone has an Esk8 Torque or Flipsky mini VESC they could test on?

Do you have any potential steps to try to debug? I could send you a binary of stock FW to compare to one generated by your build system, but I suspect that if we have differing versions, the binary could change slightly.

EDIT: I am using gcc-arm-none-eabi-8-2018-q4-major

nitrousnrg commented on July 21, 2024

@chris1seto, did you get the chance to confirm it's not a hardware issue? Can we close this issue?

chris1seto commented on July 21, 2024

Hi @nitrousnrg ,

It's definitely not a hardware issue. There's something else going on here in the bldc software, but I think Ben may need to look at it. Without disabling the watchdog, I cannot get the code to run on any of my 4.10 vescs. With the watchdog disabled the code seems to run fine, even if the scheduler is saturated.

nitrousnrg commented on July 21, 2024

Could you attach your motor config xml AND app xml?
I can try your binary as well if you want.

If the scheduler is saturated it should not run fine; the board should reset. That's the purpose of using a watchdog timer.
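
A minimal sketch of that idea, written as a generic pattern rather than the VESC firmware's actual watchdog code: each monitored task checks in once per period, and a supervisor only feeds the hardware watchdog when every task has reported. All names here are hypothetical, and the hardware feed itself is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical task IDs; not the firmware's actual thread list. */
enum { TASK_MOTOR = 0, TASK_CAN = 1, TASK_USB = 2, TASK_COUNT = 3 };

/* One bit per task that has reported in during the current period. */
static uint32_t checkins;

/* Called by each monitored task once per loop iteration. */
void task_checkin(int task_id) {
    assert(task_id >= 0 && task_id < TASK_COUNT);
    checkins |= (uint32_t)1 << task_id;
}

/* Called periodically by a supervisor. Returns true when the hardware
 * watchdog should be fed, and starts a new period either way. */
bool supervisor_should_feed(void) {
    const uint32_t all_tasks = ((uint32_t)1 << TASK_COUNT) - 1;
    bool healthy = (checkins == all_tasks);
    checkins = 0;
    return healthy;
}
```

If any one task is starved by a saturated scheduler, its bit stays clear, the supervisor stops feeding, and the hardware watchdog resets the board.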

With your files I can probe this deeper, thanks!

chris1seto commented on July 21, 2024

Hi @nitrousnrg See attached!! These are for a 6" garden variety hoverboard motor.
focworkingmini.zip

nitrousnrg commented on July 21, 2024

Thanks Chris,
please send me your compiled binary, because with the latest firmware taken from https://github.com/vedderb/bldc/blob/master/build_all/410_o_411_o_412/VESC_default.bin your configs don't brick a discovery board.

chris1seto commented on July 21, 2024

Hi @nitrousnrg, see attached: fw.zip

chris@itxdev:~/Vesc1/bldc$ arm-none-eabi-gcc -v
Using built-in specs.
COLLECT_GCC=arm-none-eabi-gcc
COLLECT_LTO_WRAPPER=/home/chris/opt/gcc-arm-none-eabi-8-2018-q4-major/bin/../lib/gcc/arm-none-eabi/8.2.1/lto-wrapper
Target: arm-none-eabi
Configured with: /tmp/jenkins/jenkins-GCC-8-build_toolchain_docker-519_20181216_1544945247/src/gcc/configure --target=arm-none-eabi --prefix=/tmp/jenkins/jenkins-GCC-8-build_toolchain_docker-519_20181216_1544945247/install-native --libexecdir=/tmp/jenkins/jenkins-GCC-8-build_toolchain_docker-519_20181216_1544945247/install-native/lib --infodir=/tmp/jenkins/jenkins-GCC-8-build_toolchain_docker-519_20181216_1544945247/install-native/share/doc/gcc-arm-none-eabi/info --mandir=/tmp/jenkins/jenkins-GCC-8-build_toolchain_docker-519_20181216_1544945247/install-native/share/doc/gcc-arm-none-eabi/man --htmldir=/tmp/jenkins/jenkins-GCC-8-build_toolchain_docker-519_20181216_1544945247/install-native/share/doc/gcc-arm-none-eabi/html --pdfdir=/tmp/jenkins/jenkins-GCC-8-build_toolchain_docker-519_20181216_1544945247/install-native/share/doc/gcc-arm-none-eabi/pdf --enable-languages=c,c++ --enable-plugins --disable-decimal-float --disable-libffi --disable-libgomp --disable-libmudflap --disable-libquadmath --disable-libssp --disable-libstdcxx-pch --disable-nls --disable-shared --disable-threads --disable-tls --with-gnu-as --with-gnu-ld --with-newlib --with-headers=yes --with-python-dir=share/gcc-arm-none-eabi --with-sysroot=/tmp/jenkins/jenkins-GCC-8-build_toolchain_docker-519_20181216_1544945247/install-native/arm-none-eabi --build=x86_64-linux-gnu --host=x86_64-linux-gnu --with-gmp=/tmp/jenkins/jenkins-GCC-8-build_toolchain_docker-519_20181216_1544945247/build-native/host-libs/usr --with-mpfr=/tmp/jenkins/jenkins-GCC-8-build_toolchain_docker-519_20181216_1544945247/build-native/host-libs/usr --with-mpc=/tmp/jenkins/jenkins-GCC-8-build_toolchain_docker-519_20181216_1544945247/build-native/host-libs/usr --with-isl=/tmp/jenkins/jenkins-GCC-8-build_toolchain_docker-519_20181216_1544945247/build-native/host-libs/usr --with-libelf=/tmp/jenkins/jenkins-GCC-8-build_toolchain_docker-519_20181216_1544945247/build-native/host-libs/usr --with-host-libstdcxx='-static-libgcc -Wl,-Bstatic,-lstdc++,-Bdynamic -lm' --with-pkgversion='GNU Tools for Arm Embedded Processors 8-2018-q4-major' --with-multilib-list=rmprofile
Thread model: single
gcc version 8.2.1 20181213 (release) [gcc-8-branch revision 267074] (GNU Tools for Arm Embedded Processors 8-2018-q4-major)

chris@itxdev:~/Vesc1/bldc$ git show -s --format=%H
fb94428
chris@itxdev:~/Vesc1/bldc$

chris@itxdev:~/Vesc1/bldc$ git diff
diff --git a/conf_general.h b/conf_general.h
index 61eed55..9f20ec4 100644
--- a/conf_general.h
+++ b/conf_general.h
@@ -61,14 +61,14 @@
//#define HW_SOURCE "hw_49.c"
//#define HW_HEADER "hw_49.h"

-//#define HW_SOURCE "hw_410.c" // Also for 4.11 and 4.12
-//#define HW_HEADER "hw_410.h" // Also for 4.11 and 4.12
+#define HW_SOURCE "hw_410.c" // Also for 4.11 and 4.12
+#define HW_HEADER "hw_410.h" // Also for 4.11 and 4.12

// Benjamins first HW60 PCB with PB5 and PB6 swapped
//#define HW60_VEDDER_FIRST_PCB

-#define HW_SOURCE "hw_60.c"
-#define HW_HEADER "hw_60.h"
+//#define HW_SOURCE "hw_60.c"
+//#define HW_HEADER "hw_60.h"

//#define HW_SOURCE "hw_r2.c"
//#define HW_HEADER "hw_r2.h"

nitrousnrg commented on July 21, 2024

Chris, your attached binary doesn't work on a discovery board, while mainstream binaries do work. Looks like a build issue.

Using built-in specs.
COLLECT_GCC=arm-none-eabi-gcc
COLLECT_LTO_WRAPPER=/usr/bin/../lib/gcc/arm-none-eabi/7.3.1/lto-wrapper
Target: arm-none-eabi
Configured with: /build/gcc-arm-none-eabi-2DWmz3/gcc-arm-none-eabi-7-2018q2/src/gcc/configure --target=arm-none-eabi --prefix=/build/gcc-arm-none-eabi-2DWmz3/gcc-arm-none-eabi-7-2018q2/install-native --libexecdir=/build/gcc-arm-none-eabi-2DWmz3/gcc-arm-none-eabi-7-2018q2/install-native/lib --infodir=/build/gcc-arm-none-eabi-2DWmz3/gcc-arm-none-eabi-7-2018q2/install-native/share/doc/gcc-arm-none-eabi/info --mandir=/build/gcc-arm-none-eabi-2DWmz3/gcc-arm-none-eabi-7-2018q2/install-native/share/doc/gcc-arm-none-eabi/man --htmldir=/build/gcc-arm-none-eabi-2DWmz3/gcc-arm-none-eabi-7-2018q2/install-native/share/doc/gcc-arm-none-eabi/html --pdfdir=/build/gcc-arm-none-eabi-2DWmz3/gcc-arm-none-eabi-7-2018q2/install-native/share/doc/gcc-arm-none-eabi/pdf --enable-languages=c,c++ --enable-plugins --disable-decimal-float --disable-libffi --disable-libgomp --disable-libmudflap --disable-libquadmath --disable-libssp --disable-libstdcxx-pch --disable-nls --disable-shared --disable-threads --disable-tls --with-gnu-as --with-gnu-ld --with-newlib --with-headers=yes --with-python-dir=share/gcc-arm-none-eabi --with-sysroot=/build/gcc-arm-none-eabi-2DWmz3/gcc-arm-none-eabi-7-2018q2/install-native/arm-none-eabi --with-host-libstdcxx='-static-libgcc -Wl,-Bstatic,-lstdc++,-Bdynamic -lm' --with-pkgversion='GNU Tools for Arm Embedded Processors 7-2018-q3-update' --with-multilib-list=rmprofile
Thread model: single
gcc version 7.3.1 20180622 (release) [ARM/embedded-7-branch revision 261907] (GNU Tools for Arm Embedded Processors 7-2018-q3-update)

My compiler version doesn't mention anything about Jenkins or Docker.

chris1seto commented on July 21, 2024

Where did you get your compiler package from? I got mine via the official tarball from here: https://developer.arm.com/tools-and-software/open-source-software/developer-tools/gnu-toolchain/gnu-rm/downloads (Linux x64)

Perhaps this is too much to ask, but would you mind downloading the tarball and using the prebuilt binaries within to build the source?

I agree that this certainly points to a build issue, and thus may not be a bug at this point, but I'm wondering what could be wrong here... I use this compiler for my full-time day job as an STM32/Arm Cortex-M3/M4F developer, so I would think I'd notice if something were wrong with it in my other projects. I'm more concerned about what's going on than anything...

Thanks!!

nitrousnrg commented on July 21, 2024

I followed the instructions here:
https://vesc-project.com/node/310

sudo add-apt-repository ppa:team-gcc-arm-embedded/ppa
sudo apt update
sudo apt install gcc-arm-embedded

You can also check if the mainstream binary I used bricks your board.

chris1seto commented on July 21, 2024

I'll go ahead and try this tomorrow. I guess if I can build a successful binary using those directions we can go ahead and close the bug report. I am extremely curious as to why the tarball release generates a binary that fails in this way though. Perhaps some kind of difference in optimization?

nitrousnrg commented on July 21, 2024

I'm baffled as well, but at the same time, I'm not. The purpose of me pushing a motor simulator into the VESC codebase is exactly this: to be able to automate tests on real hardware. If one day we bump the compiler version, we could hit a problem like this and the test tools will catch it for us.
On your PC it could be an environment variable issue, or maybe the IDE you're using. I'd try an Ubuntu virtual machine to be sure.
Keep us posted!

chris1seto commented on July 21, 2024

I haven't had time to test this, but also I don't want to just keep this open since it's pretty clear this is some kind of bizarre build system issue. I guess we can go ahead and close it. Man, I'd really love to know where the difference is though. I'm not even sure how to debug this because I bet different versions of gcc will emit slightly different code, although I'm sure for 99.9999% of differences, it will be inconsequential. But my point is, I'm not sure how you could even diff the disassembly to pinpoint it.

vedderb commented on July 21, 2024

I had a look, and the GCC version you are using is 8 whereas I have been using 7. That should be no problem, but I can give it a try with the same version you are using and see if I encounter the same problem. Will report back in a few days after testing.

chris1seto commented on July 21, 2024

Thanks Benjamin! That would be excellent!

Guillaume227 commented on July 21, 2024

I happen to also have a 4.10 Flipsky around so I tested the latest firmware on it.

  • I can reproduce the issue 'out of the box' with a fresh FW upload.
  • My debugging shows that it's related to the CAN reader thread:
    in particular, this line does not seem to return within the 10 ms it's supposed to.
    (chEvtWaitAnyTimeout(ALL_EVENTS, MS2ST(10)) == 0) {

I have tried reducing 10ms to 1ms or 100us but still get the board reset.
If I change it to just continue, it behaves fine.
Do you see that too?

tdaede commented on July 21, 2024

FWIW I can also reproduce this on a 4.12 VESC. I was able to bisect it to the same commit. I'm using GCC 9.2.1 from Fedora's repositories. I also tried @Guillaume227's suggestion of always continuing, but that was an incomplete fix: it gets farther, but USB never comes up.

tdaede commented on July 21, 2024

I just rebuilt the code with gcc-arm-none-eabi-7-2018-q2-update and now it works perfectly. So it is, in fact, the gcc version that matters.

lalten commented on July 21, 2024

Had the same issue and can confirm, current master works when compiled with gcc-arm-none-eabi-7-2018-q2 - but will boot loop when compiled with gcc-arm-none-eabi-9-2019-q4.
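
One way a build could surface this is to record which compiler major version produced the binary and compare it against a version known to work. This is only a sketch: the names are hypothetical, and treating GCC 7 as the known-good version is an assumption based solely on the reports in this thread. `__GNUC__` is the standard GCC predefined macro for the major version.

```c
#include <assert.h>

/* Assumption from this thread only: GCC 7 builds were reported to work,
 * while GCC 8 and GCC 9 builds were reported to boot loop. */
#define KNOWN_GOOD_GCC_MAJOR 7

/* Major version of the compiler that built this file.
 * Returns 0 when the compiler does not define __GNUC__. */
int compiler_major(void) {
#if defined(__GNUC__)
    return __GNUC__;
#else
    return 0;
#endif
}

/* Returns 1 if the given major version matches the known-good one, else 0. */
int is_known_good_major(int major) {
    assert(major >= 0);
    return major == KNOWN_GOOD_GCC_MAJOR;
}
```

Calling `is_known_good_major(compiler_major())` at startup, or printing `compiler_major()` alongside the firmware version, would make the toolchain visible in bug reports like this one.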
