Code Monkey home page Code Monkey logo

Comments (38)

Reflejo avatar Reflejo commented on May 25, 2024 4

Update for FreeBSD

tl;dr; I have a branch that compiles and passes all the tests on FreeBSD.

Take this as a very experimental first step.

Non comprehensive TODO list

  • Provide a more elegant shim layer instead of inlining #ifdefs which is nasty
  • Discuss / decide what to do with hot restart; is it ok to use files instead of abstract sockets on FreeBSD?
  • Is it ok to disable TCMalloc?
  • Make OSX pass all tests
  • consistent mutex is not available on Darwin.

See: https://github.com/lyft/envoy/compare/master...Reflejo:freebsd?expand=1

What was done

TCMalloc disabled (commit)

envoy compiled w/TCMalloc seg faults on OSX and creates deadlocks on FreeBSD on very specific situations while removing events with event_del. I got deep into this deadlock for hours before I realized that TCMalloc was contributing to this (probably indirectly but still).

Make currentThreadId multiplatform (commit)

I decided to use pthread_getthreadid_np on FreeBSD for convenience which I believe it was added around fbsd 9. Alternatively pthread_self could be used which is more portable.

Add kqueue implementation for monitoring symlinks (commit)

inotify is not available on BSD hence why I implemented the watcher using kqueue. There is a caveat though, envoy is mainly using the watcher for symlinks but freebsd doesn't support O_SYMLINK (i.e. get a file descriptor from a symlink). I included a kernel patch for freebsd11 adding this functionality. FWIW OSX supports O_SYMLINK.

Handle EAGAIN when querying SO_ERROR on bsd (commit)

When a connect() is performed on a non-blocking socket, it'll return -1/EINPROGRESS immediately and when select()'ed for writability it'll succeed or fail but when it fails the reason is hidden. getsockopt(,,SO_ERROR,,) solves this issue on some systems but on some cases such as FreeBSD getsockopt returns 0 even on EAGAIN.

This was a big problem on freebsd that'd hang connections so instead of trusting on SO_ERROR I'm also calling getpeername() which will return -1/ENOTCONN if the socket is not connected and read(,,1) will produce the right errno.

UUID implementation on FreeBSD (commit)

Envoy was getting uuids from /proc/sys/kernel/random/uuid on linux (which afaik uses urandom). Instead I changed it to use uuid_generate_random on Linux and hand craft a uuid using arc4random on bsd.

Hot restart on BSD doesn't use abstract sockets (commit)

envoy uses abstract sockets which are only supported on Linux. As a temporary hack; I'm just using concrete files on FreeBSD, but a decision needs to be taken for the whole hot restart system to make it more portable.

Fix tests on FreeBSD (commit)

Some tests were making assumptions that only worked on Linux (for example at what point on the event loop a connection would be open/ready). I fixed some tests that were using NonBlock.

Additionally I had to disable TcpClientConnectionImplTest.BadConnectNotConnRefused since FreeBSD will happily connect to almost any ip (even checking the kernel code I couldn't find a case that'd return ENETUNREACH on a common gw configuration).

OSX

The delta from here to make it compile in OSX is actually very small. With this change it'll compile but I still haven't try to make the tests pass.

cc @dch Since I'm on and off here (more off than on), feel free to take over this and let me know how I can help.

from envoy.

zuercher avatar zuercher commented on May 25, 2024 2

I'm going to start looking at this today, starting by porting over the changes mentioned above to fresh branch off master.

from envoy.

zuercher avatar zuercher commented on May 25, 2024 2

That's good, because I fell down a rabbit hole trying to make it smaller yesterday. But... I endedup upgrading to bazel 0.5.2 on the max and it turns out strip has the same crazy arguments program as ar if you use bazel 0.5.2. I "fixed it."

Anyway, tada:
#1346
#1348

from envoy.

timperrett avatar timperrett commented on May 25, 2024 2

Managed to compile a static binary using the instructions here: https://github.com/lyft/envoy/blob/master/bazel/README.md#quick-start-bazel-build-for-developers

Running bazel build -c dbg //test/... took 30 minutes to run, but it does indeed work.

Great stuff folks!

from envoy.

mattklein123 avatar mattklein123 commented on May 25, 2024 1

FYI @ldemailly and @9len have both expressed interest in finishing this up. I doubt it would really be that difficult at this point, especially I'm guessing if hot restart is just not supported on OSX (I think the MSFT folks are doing this also for the initial NT port). Please coordinate here so we don't duplicate effort.

from envoy.

mattklein123 avatar mattklein123 commented on May 25, 2024 1

@lizan @ldemailly @htuch can probably advise on bazel issues. I just skimmed this real quick and it would be nice to avoid copying so much bazel code into our repo somehow (sad that we already have to copy so much. Not sure what is possible. :(

Overall this looks good modulo the details. Exciting to see this come together.

@zuercher once you get it mostly working, please add the bazel files/changes in their own PR. It will make it easier to review the other stuff.

from envoy.

zuercher avatar zuercher commented on May 25, 2024 1

@lizan thanks -- I think that's enough for to make it better.

Current status:

  • it links and doesn't immediately crash on start with no config (output looks the same as linux)
  • the same branch links and passes tests under linux
  • Executed 143 out of 143 tests: 110 tests pass and 33 fail locally.

from envoy.

zuercher avatar zuercher commented on May 25, 2024 1

I've got it down to 3 failing tests:

  1. /test/common/network:proxy_protocol_test hangs
  2. /test/common/network:connection_impl_test asserts when the ConnectionImplTest destructor destroys the client ConnectionImpl (I think) because the connection hasn't been closed.
  3. /test/common/ssl:connection_impl_test hangs

from envoy.

zuercher avatar zuercher commented on May 25, 2024 1

I've gotten all the tests passing on osx (and don't seem to have caused any linux regressions).

@htuch The tools/osx/crosstools ar invocation is still broken with bazel's master. I'll see about filing an issue with them. Maybe no one else is linking statically on OSX? Meanwhile, I'll try to trim the changes down to copying stuff from the bazel repo at build time and applying a diff for the one file I changed.

I also need to circle back and clean up the watcher code, and then I should be able to make some PRs.

from envoy.

9len avatar 9len commented on May 25, 2024 1

We should probably set up a darwin CI build.

from envoy.

ramtej avatar ramtej commented on May 25, 2024

+1

from envoy.

mattklein123 avatar mattklein123 commented on May 25, 2024

@Reflejo if you are working on this can you keep notes in here on your progress (just so we don't duplicate effort).

Thanks,
Matt

from envoy.

Reflejo avatar Reflejo commented on May 25, 2024

I'll keep this issue updated; I'm planning on compile envoy on freebsd fwiw, but I think the delta between freebsd and darwin should be pretty small

from envoy.

dch avatar dch commented on May 25, 2024

hi folks, I started on a port for this to FreeBSD; I'm not a C/C++ programmer but will do what I can to get this into the ports tree. Feel free to contact me directly, I'm also in the IRC channel as dch. No idea why github has unassigned @Reflejo it wasn't intentional on my part 👎 sorry!

from envoy.

9len avatar 9len commented on May 25, 2024

Cc @zuercher who is looking at this on our end.

from envoy.

zuercher avatar zuercher commented on May 25, 2024

c.f. master...turbinelabs:osx

I maintained the author field on commits I copied from Refejlo:freebsd. Some further changes were necessary since envoy has changed a lot since February. The end result almost compiles. The final build failure in the linking step because of an unsupported linker flag. Looking at that next.

If someone who has a better understanding of bazel would care to suggest a better way to accomplish what I did in turbinelabs@88344c1, I'd be grateful. I ended up copying bazel's tools/osx (at SHA 7b85122) into envoy's bazel/tools/osx. This provides detection of various xcode cross compilers (envoy for Apple Watch 🤣). The one change I made in crosstool was to switch the flags for ar from ar -static -s -o to ar -r -s (turbinelabs@88344c1#diff-0cea4aa18c29d749a512a756ef2f98c1R159). I wasn't able to find docs for a version of ar that takes those flags (on my mac -static is taken to mean -s -t -a -t -i -c; -o is not recognized and taken as the first file argument).

Also, source/common/filesystem is pretty awful. I'm not sure what I've done is really better than a single header and source file with conditionally compiled parts. Is there a better way to do OS-specific code with bazel?

I kept @Reflejo's file-based hot restart implementation, since it wasn't obvious to me how to disable it completely.

from envoy.

zuercher avatar zuercher commented on May 25, 2024

Just realized those flags for ar seem like they would be valid for ld.

from envoy.

mattklein123 avatar mattklein123 commented on May 25, 2024

@zuercher if you want to get rid of hot restart, basically do this: https://github.com/lyft/envoy/blob/master/test/integration/server.cc#L20

in prod code.

from envoy.

htuch avatar htuch commented on May 25, 2024

The ar flag issue looks like it should be fixed upstream. Given how we're forking cc_configure.bzl, we might not be able to avoid some of this copy as we have things implemented today. It's possible within the cc_configure repository rule we could pull down stuff from the Bazel Github and inject it, that might be a lot cleaner for files that are essentially unchanged.

We wouldn't need to do any of this if bazelbuild/bazel#2840 was fixed upstream.

from envoy.

lizan avatar lizan commented on May 25, 2024

I am happy to help.

Also, source/common/filesystem is pretty awful. I'm not sure what I've done is really better than a single header and source file with conditionally compiled parts. Is there a better way to do OS-specific code with bazel?

You can use select function in BUILD files, to compile specific cc_library as deps.

from envoy.

Reflejo avatar Reflejo commented on May 25, 2024

I'm glad someone is picking this up. It's was on my backlog and never got back to it.

@zuercher Sounds like you are good, but do let me know if you need something from me. I got all the tests passing on FreeBSD at some point

from envoy.

Reflejo avatar Reflejo commented on May 25, 2024

That's fantastic @zuercher. Thanks for taking this

from envoy.

mattklein123 avatar mattklein123 commented on May 25, 2024

Awesome work @zuercher !

re: bazel, at this point, I would just copy the code in the same as we have done for all the other crosstools stuff, but up to you (please just do in dedicated PR).

from envoy.

mattklein123 avatar mattklein123 commented on May 25, 2024

I'm reopening this for tracking until we are sure this is fully done and to see if there are any needed doc updates.

from envoy.

mattklein123 avatar mattklein123 commented on May 25, 2024

@9len @zuercher unfortunately right now we won't have the ability to host OSX CI (our CI story is a mess and we need to get that in shape first before taking on potentially non-Linux targets). Some options for CI are:

  • Someone else sets up a CI job for OSX. We will setup whatever GH hooks you want to fire events into your system.
  • We will not block merge if that job is broken, but at least we will know about it.

There was a question that popped up in Gitter about the OSX build not working (I don't know details), so at minimum we should probably just go through the various ci/bazel/public docs and see if anything needs to be updated for doing an OSX build.

from envoy.

9len avatar 9len commented on May 25, 2024

looks like there's only one option :)

we can try to set it up.

from envoy.

ldemailly avatar ldemailly commented on May 25, 2024

travis has OSX support (it's a small pool and fairly slow but I've had successes with it) ?

from envoy.

mattklein123 avatar mattklein123 commented on May 25, 2024

Travis may have OSX support but Lyft and non-Istio Google do not have the resources to manage it right now. Like I said I'm happy to install hooks and make the tests non-blocking for merge but that's the best we can do right now. cc @htuch

from envoy.

9len avatar 9len commented on May 25, 2024

That's fine with us, and about what we expected.

We run CircleCI for our in-house stuff, would you be opposed to us using that for an OSX build? Sidesteps figuring out how to do both from one travis yaml file.

from envoy.

mattklein123 avatar mattklein123 commented on May 25, 2024

We run CircleCI for our in-house stuff, would you be opposed to us using that for an OSX build?

Opposite of opposed. Please do!!! :) I will setup the hooks to send them to you when you are ready. We can figure that out offline.

from envoy.

9len avatar 9len commented on May 25, 2024

cool. we'll prototype in our fork until we find something we like.

from envoy.

zuercher avatar zuercher commented on May 25, 2024

@lizan You noted a compile error on the original PR for this (#1348). All my changes have landed on master, do you still see that? If so, what version of bazel & xcode do you have installed?

from envoy.

mattklein123 avatar mattklein123 commented on May 25, 2024

@zuercher AFAWK this is done right? Can we tell people to start testing?

from envoy.

zuercher avatar zuercher commented on May 25, 2024

@mattklein123 Sure. I'm working on a CI build for now. That'll uncover some things (like missing tools, of which I'm keeping a list to update the README).

from envoy.

mattklein123 avatar mattklein123 commented on May 25, 2024

OK I will blast out to try to get some people testing. This is awesome!!

from envoy.

lizan avatar lizan commented on May 25, 2024

@zuercher I still see the same compile errors (on commit 5e35e19)

Errors: bazel build -c dbg //test/...

ERROR: /Users/zlizan/github/lizan/envoy/source/common/http/BUILD:11:1: C++ compilation of rule '//source/common/http:async_client_lib' failed (Exit 1).
source/common/http/async_client_impl.cc:16:77: error: default initialization of an object of const type 'const AsyncStreamImpl::NullRateLimitPolicy' without a user-provided default constructor
const AsyncStreamImpl::NullRateLimitPolicy AsyncStreamImpl::RouteEntryImpl::rate_limit_policy_;
                                                                            ^
                                                                                              {}
source/common/http/async_client_impl.cc:17:73: error: default initialization of an object of const type 'const AsyncStreamImpl::NullRetryPolicy' without a user-provided default constructor
const AsyncStreamImpl::NullRetryPolicy AsyncStreamImpl::RouteEntryImpl::retry_policy_;
                                                                        ^
                                                                                     {}
source/common/http/async_client_impl.cc:18:74: error: default initialization of an object of const type 'const AsyncStreamImpl::NullShadowPolicy' without a user-provided default constructor
const AsyncStreamImpl::NullShadowPolicy AsyncStreamImpl::RouteEntryImpl::shadow_policy_;
                                                                         ^
                                                                                       {}
source/common/http/async_client_impl.cc:19:73: error: default initialization of an object of const type 'const AsyncStreamImpl::NullVirtualHost' without a user-provided default constructor
const AsyncStreamImpl::NullVirtualHost AsyncStreamImpl::RouteEntryImpl::virtual_host_;
                                                                        ^
                                                                                     {}
source/common/http/async_client_impl.cc:20:78: error: default initialization of an object of const type 'const AsyncStreamImpl::NullRateLimitPolicy' without a user-provided default constructor
const AsyncStreamImpl::NullRateLimitPolicy AsyncStreamImpl::NullVirtualHost::rate_limit_policy_;
                                                                             ^
                                                                                               {}

bazel version:

Build label: 0.5.3-homebrew
Build target: bazel-out/darwin_x86_64-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Thu Jul 27 19:30:37 2017 (1501183837)
Build timestamp: 1501183837
Build timestamp as int: 1501183837

Xcode version:

Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk/usr/include/c++/4.2.1
Apple LLVM version 8.0.0 (clang-800.0.42.1)
Target: x86_64-apple-darwin16.7.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

from envoy.

zuercher avatar zuercher commented on May 25, 2024

I've updated the readme to indicate that XCode 8.3.3 is required.

from envoy.

mattklein123 avatar mattklein123 commented on May 25, 2024

Nice going to close this as complete since I think several people have got it working. We can track CI in a different PR/Issue.

from envoy.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.