Comments (21)

bitcrazed avatar bitcrazed commented on May 23, 2024 11

Thanks for posting this issue @wm4, though a polite ask: please file one issue per GitHub Issue - combining issues makes it very difficult to discuss and track over time. Thanks.

First, some context and background for others reading along who may be unfamiliar with some of these technologies:

On WSL

WSL currently comes in two versions:

  • WSL1 - runs unmodified Linux binaries atop a Linux-compatible layer in the NT kernel. All kernel operations are either provided by NT, or by WSL emulating Linux-specific kernel/OS behavior.

  • WSL2 - available in Windows 10 2004 or later, WSL2 runs unmodified Linux binaries in Linux containers (one per distro) atop a Linux kernel hosted in a lightweight VM that can boot from cold in < 2s. Because WSL2 runs atop a real Linux kernel, it can offer ~100% Linux syscall compatibility, and near-native IO perf when accessing files within the distro's filesystem.

Both versions of WSL also provide several useful integration features that enable you to:

  • Run unmodified Linux binaries on Windows, alongside your Windows apps and tools

  • Access your Linux distros' filesystems from Windows, and vice-versa

  • Execute Linux commands, scripts, and binaries from Windows, and vice-versa

The guidance re. WSL is, if you need to run Linux native binaries and tools, and/or build and run code that you plan to deploy & run in Linux environments, then run them in WSL.

On Cygwin

Cygwin delivers a collection of GNU shells and tools ported to run atop Windows/Win32. Cygwin is a great toolset for those who need to run key GNU tools and scripts on Windows, and for cross-platform projects that share the same build system but generate Windows executables and binaries. However, Cygwin does not run unmodified Linux binaries and so you cannot "apt install ... your way to happiness".

On The Issues Described

So, to the issues you describe above:

Building POSIX apps on Windows

If you're able to pass arguments to a build system to emit binaries for a given platform, you may want to explore that as an option rather than relying on ./configure to configure and build based on the environment. I know this isn't always possible, but if it is, one might then be able to build within WSL, but target Win32.

If you DO have to build on Windows for Windows, then why do builds run slower on Windows than on other platforms and why isn't porting easier?

In reverse order ...

POSIX compatibility

It doesn't matter which way one slices or dices it, Windows and POSIX (UNIX, BSD, Linux, etc.) have two very different and orthogonal philosophies, assumptions, architectures, and implementations: In *NIX, everything is a stream; in Windows, everything is an object. In *NIX, systems are constructed by chaining together lots of small tools that "do one thing and do it well"; in Windows, systems are built out of larger, more sophisticated apps and tools. Etc.

These differences manifest EVERYWHERE and, as you point out, can complicate porting from one to the other, especially while maintaining correct behaviors and expected levels of performance.

Performance of POSIX apps on Windows

Performance of POSIX apps on Windows is, indeed, a fundamental issue that is affected by some fundamental differences in the *NIX vs. NT IO subsystems:

For example, in POSIX systems, files and folders are enumerated by first opening the directory with opendir(), then repeatedly calling readdir() until it returns NULL, and then calling closedir(). If one then needs some/all of a file's attributes (length, last updated date/time, permissions, etc.), one must then call stat() on each file in question. In most *NIX systems, stat() is practically "free" from a performance perspective, and as a result, is called A LOT!

Windows has no direct equivalent to stat()! Why? The mechanism in Windows to enumerate the contents of a folder is to call FindFirstFile[Ex]() and then repeatedly call FindNextFile() until it returns zero, and then call FindClose(). Similar to POSIX, right? Yes, except in Windows, there's no need to then call stat() on each file in a folder to get its attributes because the file's attributes were already returned by Find[First|Next]File()!

So, if a POSIX app is naively ported to Windows, it can result in a list of files and their properties being enumerated twice!
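To make the double enumeration concrete, here is a minimal Python sketch. It is not taken from any project discussed here, and the helper names `sizes_listdir_then_stat` and `sizes_scandir` are made up for illustration. The first helper mirrors the readdir()-then-stat() pattern; the second uses `os.scandir()`, which the Python documentation describes as able to answer `DirEntry.stat()` on Windows from the data the directory enumeration already returned (except for symbolic links).

```python
import os
import tempfile


def sizes_listdir_then_stat(path):
    """POSIX-style pattern: enumerate names, then stat() every entry."""
    sizes = {}
    for name in os.listdir(path):
        full = os.path.join(path, name)
        # One extra metadata lookup per entry.
        sizes[name] = os.stat(full).st_size
    return sizes


def sizes_scandir(path):
    """Ask the directory enumeration itself for the metadata."""
    sizes = {}
    with os.scandir(path) as entries:
        for entry in entries:
            # On Windows this is normally served from the enumeration data;
            # on POSIX systems it still performs a stat() call.
            sizes[entry.name] = entry.stat().st_size
    return sizes


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(3):
            with open(os.path.join(tmp, f"file{i}.txt"), "w") as f:
                f.write("x" * (i + 1))
        # Both approaches report the same sizes; only the cost differs.
        print(sizes_listdir_then_stat(tmp) == sizes_scandir(tmp))
```

Both helpers return the same result, so the difference only shows up as extra per-file lookups in the first one.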

This is just one example and nicely demonstrates that this is not a simple issue to fix. There are many, MANY more, including how file information is cached, how files are deleted, copied and moved, etc.!

Stay Tuned!

However, don't think we're ignorant of these issues and not doing anything about them!

It's too early to discuss in detail yet, but we are working on a set of improvements to address some of the key, fundamental differences between POSIX and Win32, which we expect will provide substantial performance benefits for many POSIX apps on Windows, as well as many Windows-native apps!

We will share details when we have a better picture of what we'll be delivering and when. Until then, stay tuned!

from windows-dev-performance.

avih avatar avih commented on May 23, 2024 6

though a polite ask: please file one issue per GitHub Issue

An off-topic comment, pardon the irony of making it even more meta, but while that quote is definitely true pretty much everywhere, this approach can fail to grasp some bigger pictures.

With big-picture issues (which I do consider this one to be), there's also a value IMHO in being able to discuss them as a whole, rather than discussing each micro-issue on its own.

For what it's worth, personally I find the issue itself, the responses, and the discussion exceptionally on (this) topic and to the point, despite the seeming impossibility of doing that.

I applaud this discussion so far and all sides which take part in it, and hope to see other big-picture issues discussed as beautifully as here.

 avatar commented on May 23, 2024 2

please file one issue per GitHub Issue

I know, this issue is very broad. There are multiple possible reasons as to why Cygwin could be slow. A general mismatch between POSIX and win32 is often suspected (especially fork() performance in the case of shell scripts), it could be filesystem performance, or it could just be something sub-optimal that Cygwin or win32 does but which has not been identified as the cause yet. The stat() issue is also something I hadn't heard about before.

Which repository would be a better match for filing issues about win32/POSIX impedance mismatches, that are not necessarily about performance, but which affect developers? For now, this repository seems to allow performance-related issues only. So I just made this issue about performance. Such workarounds often end up in bad performance, as seen on Cygwin.

The guidance re. WSL is, if you need to run Linux native binaries and tools, and/or build and run code that you plan to deploy & run in Linux environments, then run them in WSL.

Yes, but it's still just a glorified VM. It doesn't help in the example of FFmpeg, unless you cross-compile to Windows. But there are problems with that (I could go into details). For other types of programs, this isn't feasible, because they need access to Windows APIs.

Cygwin delivers a collection of GNU shells and tools ported to run atop Windows/Win32. Cygwin is a great toolset for those who need to run key GNU tools and scripts on Windows, and for cross-platform projects that share the same build system but generate Windows executables and binaries.

That omits the quite important fact that Cygwin is a POSIX environment on top of Windows. It's not just for GNU tools. You can use it to port almost any kind of POSIX-compliant software. If carefully written, an application that didn't even attempt to target Cygwin will build and run just fine on Cygwin.

Even git for windows appears to use Cygwin, even though it's not a GNU tool. (I didn't look too closely though. I've only seen the dev folder, which seems to be an artifact of Cygwin going a bit too far to pretend to be Unix.)

However, Cygwin does not run unmodified Linux binaries and so you cannot "apt install ... your way to happiness".

No, that isn't Cygwin's goal. However, Cygwin has its own repository of pre-built binaries, which surely made a lot of people happy. In any case, I consider WSL1/2 to be out of scope wrt. this issue.

It doesn't matter which way one slices or dices it, Windows and POSIX (UNIX, BSD, Linux, etc.) have two very different and orthogonal philosophies, assumptions, architectures, and implementations: In *NIX, everything is a stream; in Windows, everything is an object.

I mean, that's certainly true, but on the other hand there are a lot of staggering similarities. For example, win32's HANDLE is extremely similar to a UNIX FD. At least on Linux, FDs are used whenever userspace needs a handle to a kernel object. There are many types of FDs that are not associated with any kind of byte stream (consider device files, memfd, epoll, signalfd, pidfd, listener-only sockets). HANDLE on win32 is surprisingly similar. It is used for file I/O, I/O completion ports (vaguely equivalent to epoll on a conceptual level), threads, and even devices (equivalent to device files on UNIX).

Microsoft's libc (MSVCRT) emulates some POSIX primitives to some degree. For example, the open/read/write functions, which all use UNIX FDs. And indeed, the libc just maps FDs to HANDLEs in a table. Portable programs can (mostly) just use open instead of CreateFile. But this "emulation" often has problems, so advanced portable programs keep doing similar stuff (like https://github.com/PowerShell/openssh-portable/blob/latestw_all/contrib/win32/win32compat/w32fd.c).

(And where win32 becomes a real pain is sockets, which are neither HANDLEs nor emulated FDs. They're their own thing, and it's awful. So awful.)

My point is, you shouldn't have to do this when porting to Windows.

Sorry, I guess that got quite offtopic wrt. the performance topic. Though going through these layers will also cost performance, and they require making a lot of choices that might impact performance.

A nice example which I've seen in libusb: they use win32 "events" to emulate wakeup pipes. Their central mainloop is a poll() call, which waits on all wakeup pipes. But unlike poll, WaitForMultipleObjects has a limit on the number of objects it can wait on. So they start an additional thread for every 64 objects, and at the end of the wait, they destroy the threads. Every time. Man, I sure hope I never run into this case on Windows with my libusb CLI program. Code: https://github.com/libusb/libusb/blob/master/libusb/os/poll_windows.c#L239
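For readers unfamiliar with the limit: WaitForMultipleObjects accepts at most MAXIMUM_WAIT_OBJECTS (64) handles per call, so an emulation layer has to split a larger wait set into groups. The Python sketch below only illustrates that partitioning step. `chunk_wait_set` is a hypothetical helper and this is not libusb's actual code.

```python
MAXIMUM_WAIT_OBJECTS = 64  # per-call limit of WaitForMultipleObjects


def chunk_wait_set(handles, limit=MAXIMUM_WAIT_OBJECTS):
    """Split a list of waitable objects into groups of at most `limit`."""
    return [handles[i:i + limit] for i in range(0, len(handles), limit)]


if __name__ == "__main__":
    groups = chunk_wait_set(list(range(150)))
    # Each group would need its own waiter (for example, one thread each).
    print([len(g) for g in groups])  # prints [64, 64, 22]
```

With 150 objects that is three waiters for a single logical poll() call, which is where the extra threads come from.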

Windows has no direct equivalent to stat()! Why? The mechanism in Windows to enumerate the contents of a folder is to call FindFirstFile[Ex]() and then repeatedly call FindNextFile() until it returns zero, and then call FindClose(). Similar to POSIX, right? Yes, except in Windows, there's no need to then call stat() on each file in a folder to get its attributes because the file's attributes were already returned by Find[First|Next]File()!

That doesn't seem to be ideal. This probably affects native windows programs as well. Listing directory contents isn't the only purpose of stat(). Often, you may want to run it on a single file, or on a separate list of files (maybe "git status"? I don't know), so I wonder what native win32 programs do in these cases.

nmoinvaz avatar nmoinvaz commented on May 23, 2024 2

What we lack in Win32 are some of the fundamental APIs that behave and perform as they do on POSIX systems.

There are some functions that are just missing, to name a few:
opendir, readdir, closedir, fsync, gettimeofday, getopt, getopt_long, getopt_long_only, strcasecmp

It seems like it would be an easy thing for Microsoft to add these and other missing functions compared to the amount of work it causes for developers all over the world.

sskras avatar sskras commented on May 23, 2024 2

Although the project is still pre-alpha, an interested person could just try running the midipix environment:
https://github.com/lalbornoz/midipix_build#1-what-is-midipix-and-how-is-it-different

Currently building it requires Linux.

It also requires a secret reference to a temporary, small code repo, which can be obtained by chatting on the #midipix IRC channel on Libera.Chat.

Also, I could try to share my own build from 2022.11.18 (if a person happens to trust that) via some web means.

In general, it uses NTAPI instead of WinAPI and is like 3-6 times faster than Cygwin.

bitcrazed avatar bitcrazed commented on May 23, 2024 1

@nmoinvaz Great point - we'll definitely discuss this with the VC libs team. It'd be great if we can close the gap between our current POSIX API support and modern-day POSIX API reality, esp. if there's a pretty close mapping between, for example opendir() / FindFirstFile() or fsync() / FlushFileBuffers().
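As one data point on how close such a mapping can be: Python's `os.fsync()` is documented as calling the native fsync() on Unix and the MS `_commit()` function on Windows, so portable code can already use a single call for both. A minimal sketch follows; `write_durably` is a made-up helper name, not an API from any library mentioned here.

```python
import os
import tempfile


def write_durably(path, data):
    """Write bytes and ask the OS to flush them to the storage device."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # drain Python's userspace buffer first
        os.fsync(f.fileno())  # fsync() on Unix, _commit() on Windows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "data.bin")
        write_durably(target, b"hello")
        print(os.path.getsize(target))  # prints 5
```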

Eli-Black-Work avatar Eli-Black-Work commented on May 23, 2024 1

@bitcrazed Haha, okay, no worries. Sorry; hard to read tone through the internet sometimes 🙂

driver1998 avatar driver1998 commented on May 23, 2024

It's too early to discuss in detail yet, but we are working on a set of improvements to address some of the key, fundamental differences between POSIX and Win32, which we expect will provide substantial performance benefits for many POSIX apps on Windows, as well as many Windows-native apps!

Would it be possible to provide some POSIX-like APIs in the Win32 subsystem? I know Windows once had a POSIX subsystem, but like WSL, being a subsystem means POSIX/Linux apps are separated from Win32 apps. Therefore people can't do things like call POSIX APIs from a Win32 app, or call Win32 APIs from a POSIX app, both of which are enabled by Cygwin.

bitcrazed avatar bitcrazed commented on May 23, 2024

One issue at a time

We'd prefer if specific issues are filed, e.g.

  • "Enumerating files in a folder takes longer on Windows than on Linux"
  • "Forking sub/worker processes is faster on Linux than on Windows"
  • "Windows should better support POSIX style ____"

Assuming underlying reasons for an issue should be avoided at the outset - the more specific and reproducible the issue, the better.

On POSIX issues

We welcome the discussion about POSIX compat in this repo - we intend to broaden our scope out to include such issues anyhow. The perf caveat just indicates that we'd prefer perf issues at this time as a way of gating input to a level we can handle as we build our team and skills here.

Literally on a call discussing this as I type ;)

On WSL

The point re. WSL was that WSL1 is not a VM. WSL2 uses our current VM infrastructure, but the underlying infrastructure should be considered an internal implementation detail.

But yes, WSL provides a parallel POSIX / Linux runtime environment - it doesn't add POSIX capability to Windows per se. We're actively working to figure out how we can better support POSIX apps & runtimes on Windows itself in the future. Stay tuned for more info.

On FDs vs. Handles

Except in specific cases, handles are to be considered per-process, unique, and opaque. They should (generally) not be shared across processes, and one should avoid assuming underlying layout and structure of the handle's internal implementation. There are also several different underlying types of HANDLE on Windows (e.g. file handles, GDI handles, Registry handles, Console handles), but again, they should simply be considered as unique and opaque.

FDs describe files and are unique to a machine, so may be shared across processes. FDs index into file table entries which index into inodes - a fact that is often assumed and utilized for better, or for worse.

On stat()

Of course, Win32 provides GetFileAttributesEx to query the attributes for a specific file, but it isn't as cheap as stat() is in POSIX based systems. On Windows this isn't a major perf issue because code doesn't HAVE to call stat() on each file in a list to obtain its attributes since those attributes are already returned during enumeration.

In our testing, individual or small batches of calls to stat(), which generally translate to calls to GetFileAttributesEx(), have little perf impact, but code which naively issues storms of stat() calls (often repeatedly and unnecessarily several times in a call stack) can show up as a major perf issue.
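One common way for ported code to avoid repeated lookups within a single operation is to remember the first result. The Python sketch below shows the idea only; `StatCache` is a hypothetical class, and it assumes the files do not change while the cache is alive (otherwise the cached results go stale).

```python
import os
import tempfile


class StatCache:
    """Remember stat() results for the duration of one operation."""

    def __init__(self):
        self._cache = {}
        self.lookups = 0  # number of real os.stat() calls made

    def stat(self, path):
        key = os.path.abspath(path)
        if key not in self._cache:
            self.lookups += 1
            self._cache[key] = os.stat(key)
        return self._cache[key]


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        cache = StatCache()
        for _ in range(3):
            cache.stat(f.name)
        print(cache.lookups)  # prints 1
```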

In closing

So, to summarize:

  1. We hear you and understand & appreciate you raising the issues above
  2. We are actively working on some of the root causes of many of the issues and scenarios discussed above. We will share details when appropriate to do so
  3. We encourage you to file individual specific issues with repro steps if possible in order to help identify and focus on issues that we can action into improvements

Many thanks.

bitcrazed avatar bitcrazed commented on May 23, 2024

@driver1998 Great question: VC++ already provides many POSIX APIs, which are implemented by calling Win32 APIs. What we lack in Win32 are some of the fundamental APIs that behave and perform as they do on POSIX systems. This is an area we're actively exploring as I type.

 avatar commented on May 23, 2024

Thanks, I appreciate that MS is working on this.

Though I'm getting confused about the following:

Except in specific cases, handles are to be considered per-process, unique, and opaque. They should (generally) not be shared across processes, and one should avoid assuming underlying layout and structure of the handle's internal implementation. There are also several different underlying types of HANDLE on Windows (e.g. file handles, GDI handles, Registry handles, Console handles), but again, they should simply be considered as unique and opaque.

There must be some sort of misunderstanding. win32 HANDLEs are not necessarily unique or process-local:

https://docs.microsoft.com/en-us/windows/win32/api/handleapi/nf-handleapi-duplicatehandle
https://docs.microsoft.com/en-us/windows/win32/sysinfo/handle-inheritance

Of course the HANDLE value itself will be different, but it still refers to the same kernel object.

FDs describe files and are unique to a machine, so may be shared across processes. FDs index into file table entries which index into inodes - a fact that is often assumed and utilized for better, or for worse.

A file descriptor is just an integer that can be used in a single process only. If you open() a file in one process, you can't use the same integer value in another process to perform a read(). FDs can be shared by fork() (then the integer value actually stays the same), or by sendmsg() when using unix domain sockets (the integer value may change in the target process).
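The "different number, same underlying object" idea can be seen within a single process using dup(), which is roughly a same-process analogue of the DuplicateHandle link above. A minimal Python sketch; `dup_demo` is a name made up for this example.

```python
import os


def dup_demo():
    """Write through two different FD numbers that name the same pipe end."""
    read_fd, write_fd = os.pipe()
    alias_fd = os.dup(write_fd)  # a new integer for the same pipe end
    try:
        os.write(write_fd, b"a")
        os.write(alias_fd, b"b")
        # Both writes land in the same pipe, in order.
        return alias_fd != write_fd, os.read(read_fd, 2)
    finally:
        for fd in (read_fd, write_fd, alias_fd):
            os.close(fd)


if __name__ == "__main__":
    print(dup_demo())  # prints (True, b'ab')
```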

A FD doesn't describe files either. A FD returned by socket() refers to an object in the network stack (often a network connection), a FD returned by memfd_create() refers to a block of memory, and a FD returned by epoll_create() doesn't even reference any kind of resources, just a specific kernel management object. It's possible that the Linux kernel has some sort of inode object per FD internally, but that's just an opaque implementation detail.
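A quick way to see that an FD need not be a file is to ask the kernel what each descriptor refers to. The Python sketch below is meant for POSIX systems only (on Windows, a socket's `fileno()` is not a C runtime FD, so `os.fstat()` on it is not expected to work). `describe_fd` is a hypothetical helper.

```python
import os
import socket
import stat
import tempfile


def describe_fd(fd):
    """Classify a descriptor by the file type bits fstat() reports."""
    mode = os.fstat(fd).st_mode
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISFIFO(mode):
        return "pipe"
    if stat.S_ISREG(mode):
        return "regular file"
    return "other"


if __name__ == "__main__":
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    read_fd, write_fd = os.pipe()
    with tempfile.TemporaryFile() as tmp:
        for fd in (sock.fileno(), read_fd, tmp.fileno()):
            print(fd, describe_fd(fd))
    sock.close()
    os.close(read_fd)
    os.close(write_fd)
```

On Linux this reports a socket, a pipe, and a regular file, each behind an ordinary small integer.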

 avatar commented on May 23, 2024

Indeed, as I have pointed out, a lot of programs have such wrappers. Often they even replace wrappers that already exist in the CRT, for implementation quality reasons. MinGW-w64 also has a bunch of these. (I wonder whether we can get an issue about this topic somewhere, without the focus on performance considerations, which was just my way to make this not out of scope, to be honest.)

bitcrazed avatar bitcrazed commented on May 23, 2024

@wm4 LOL 😁 Don't worry about the perf scoping right now - you're spot-on above in your observation that some of the POSIX API differences do impact perf, so you're in-scope. Plus we absolutely do plan on broadening scope of this repo to discuss developer productivity and other scenarios too - just wanted to gate the repo at launch so that we weren't deluged at the start 😜

orlando2378 avatar orlando2378 commented on May 23, 2024

Thanks for the very interesting thread.

We are working on a project with very similar issues, trying to run some Linux libraries (with lots of POSIX native calls) on Windows.

At first we went through WSL1+Docker and while we were okay with the lower performance, as @wm4 described, the solution is not well integrated considering deployment at scale of the application.

In order to provide better integration, we went down the path of compiling and running the libraries on Windows using Cygwin, with quite big performance issues and quite a few headaches.

I know that the goals of Cygwin and WSL differ and the historical differences between POSIX apps and Windows make the integration anything but easy, but at a higher level, what's Windows' answer to easily and tightly integrating Linux binaries into your Win application? @bitcrazed Will WSL2 answer this need somehow?

Again thanks for all the interesting points addressed here.

bitcrazed avatar bitcrazed commented on May 23, 2024

Hi @orlando2378 - thanks for sharing. Could I ask what the major perf issues were that you found when porting your Linux libraries to Windows?

The goal of WSL (regardless of version) is primarily to provide an environment in which you can run unmodified Linux binaries alongside all your favorite Windows apps and tools.

It is NOT a goal of WSL to enable one to build apps that contain Linux libs hosted and running in WSL within a Windows app process ... in fact, that'd be prohibitive in so many ways as to be impossible.

If you have code in a Linux lib project and want to reuse that code on Windows, then building it with MSYS/Cygwin is a great first step. If that code has perf etc. issues on Windows, you may need to adjust its implementation to better adapt to Windows' architecture/behaviors.

We are keen to figure out where we may be able to expose additional features in Windows that better support POSIX apps, but note that this will take some time to happen.

orlando2378 avatar orlando2378 commented on May 23, 2024

@bitcrazed Thanks for the prompt reply. The main issue we identified is very poor performance when using multithreading. With it disabled, we actually run faster than with it enabled. It seems like a common issue when using Cygwin, unfortunately.

Indeed we would need to adjust the implementation to adapt to Windows' needs, but that could require quite some work, especially on big projects, defeating a bit the whole purpose of having a POSIX compatibility layer in the first place. (I know, too idealistic :))

Is better support for POSIX apps on the near-term Windows roadmap, or is it something more long term?

bitcrazed avatar bitcrazed commented on May 23, 2024

@orlando2378

Without knowing anything about the nature of the perf issues you're seeing when "using multithreading" it's difficult to know if the root cause is simply in MSYS' implementation of threading, inherent perf issues mapping POSIX threads to Windows threads, perf issues in Windows threading, or something else.

We'd love it if you could file an issue detailing specifically what you're seeing with an easy to recreate repro case, etc. to help us narrow-down the root cause of the issue.

 avatar commented on May 23, 2024

Uh what? POSIX threading is quite straightforward and simple. The only problem I see is that win32 adds weird requirements, and possibly has worse scheduling and worse startup performance than Linux.

insinfo avatar insinfo commented on May 23, 2024

Anything new on better POSIX compatibility for the Windows kernel and C++ runtime?

jcrben avatar jcrben commented on May 23, 2024

@bitcrazed one thing that's interesting is that as I've shifted over from macOS to Windows - drawn by WSL as well as the broader Windows ecosystem - I've found myself using msys2 / git-for-windows / cygwin a lot. I still want my underlying host for the VM to be rock-solid and useful for scripts and services and I want those comfortable Linux tools available in Windows.

My main ask is that Windows just consider Cygwin as it makes updates so as not to break existing functionality. It's not replaced or deprecated by WSL for me.

The message I'm getting here from the replies is that this is something that you all are thinking about which is encouraging.

In perusing the commits to cygwin, I noticed this commit for example: Cygwin: Adjust CWD magic to accommodate for the latest Windows previews. It's nice that the git-for-windows maintainer @dscho works for Microsoft and submitted that patch but hopefully this is on more than just them. People outside of Microsoft have limited ability and motivation to make patches for "magic" updates to Windows.

bitcrazed avatar bitcrazed commented on May 23, 2024

Hey Ben. I left Microsoft last March, and returned back to the UK to try out this thing folks refer to as "retirement", so am not able to drive this issue internally any longer. However, the awesome @marcpems @snickler and others are working on a bunch of stuff that will help improve MSYS2 on Windows.

Also, the new Windows Developer Drive was conceived in large part to address the POSIX file IO perf issues I discuss above and should deliver very sizeable perf improvements when running POSIX workloads & scripts on Windows itself.

Rest assured that the team are working on improving the performance of many POSIX-first apps, tools, libs, etc. when running on Windows. Do file additional new issues, esp. if you can provide repro cases to demonstrate the biggest offenders - this will be super-useful to the team when trying to diagnose and remedy.

Thanks for your continued patience and support.
