hcsshim's Introduction

hcsshim

This package contains the Golang interface for using the Windows Host Compute Service (HCS) to launch and manage Windows Containers. It also contains other helpers and functions for managing Windows Containers such as the Golang interface for the Host Network Service (HNS), as well as code for the guest agent (commonly referred to as the GCS or Guest Compute Service in the codebase) used to support running Linux Hyper-V containers.

It is primarily used in the Moby and Containerd projects, but it can be freely used by other projects as well.

Building

While this repository can be used as a library of sorts to call the HCS APIs, a couple of binaries are built out of the repository as well. The main ones are the Linux guest agent and an implementation of the runtime v2 containerd shim API.

Linux Hyper-V Container Guest Agent

To build the Linux guest agent itself, all that's needed is to set your GOOS to "linux" and build out of ./cmd/gcs.

C:\> $env:GOOS="linux"
C:\> go build .\cmd\gcs\

or on a Linux machine

> go build ./cmd/gcs

If you want the agent packaged inside a rootfs to boot with, alongside all of the other tools, you'll need to provide a rootfs that it can be packaged into. An easy way is to export the rootfs of a container.

docker pull busybox
docker run --name base_image_container busybox
docker export base_image_container | gzip > base.tar.gz
make all BASE=./base.tar.gz

If the build is successful, in the ./out folder you should see:

> ls ./out/
delta.tar.gz  initrd.img  rootfs.tar.gz

Containerd Shim

For info on the Runtime V2 API, see the containerd Runtime V2 documentation.

Contrary to the typical Linux architecture of shim -> runc, the runhcs shim is used both to launch and manage the lifetime of containers.

C:\> $env:GOOS="windows"
C:\> go build .\cmd\containerd-shim-runhcs-v1

Then place the binary in the same directory as the Containerd binary in your environment. A default Containerd configuration file can be generated by running:

.\containerd.exe config default | Out-File "C:\Program Files\containerd\config.toml" -Encoding ascii

This config file will already have the shim set as the default runtime for CRI interactions.

To try out the shim with ctr.exe:

C:\> ctr.exe run --runtime io.containerd.runhcs.v1 --rm mcr.microsoft.com/windows/nanoserver:2004 windows-test cmd /c "echo Hello World!"

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit Microsoft CLA.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

We also require that contributors sign their commits using git commit --signoff to certify they either authored the work themselves or otherwise have permission to use it in this project. A range of commits can be signed off using git rebase --signoff.

Please see the developer certificate for more info, as well as to make sure that you can attest to the rules listed. Our CI uses the DCO GitHub app to ensure that all commits in a given PR are signed off.

Linting

Code must pass a linting stage, which uses golangci-lint. Since ./test is a separate Go module, the linter is run from both the root and the test directories. Additionally, the linter is run with GOOS set to both windows and linux.

The linting settings are stored in .golangci.yaml, and the linter can be run automatically with VSCode by adding the following to your workspace or folder settings:

    "go.lintTool": "golangci-lint",
    "go.lintOnSave": "package",

Additional editor integration options are also available.

Alternatively, golangci-lint can be installed and run locally:

# use . or specify a path to only lint a package
# to show all lint errors, use flags "--max-issues-per-linter=0 --max-same-issues=0"
> golangci-lint run

To run across the entire repo for both GOOS=windows and linux:

> foreach ( $goos in ('windows', 'linux') ) {
    foreach ( $repo in ('.', 'test') ) {
        pwsh -Command "cd $repo && go env -w GOOS=$goos && golangci-lint.exe run --verbose"
    }
}

Go Generate

The pipeline checks that auto-generated code, produced via go generate, is up to date. Similar to the linting stage, go generate is run in both the root and test Go modules.

This can be done via:

> go generate ./...
> cd test && go generate ./...

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Dependencies

This project requires Golang 1.18 or newer to build.

For system requirements to run this project, see the Microsoft docs on Windows Container requirements.

Reporting Security Issues

Security issues and bugs should be reported privately, via email, to the Microsoft Security Response Center (MSRC) at [email protected]. You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Further information, including the MSRC PGP key, can be found in the Security TechCenter.

For additional details, see Report a Computer Security Vulnerability on Technet.


Copyright (c) 2018 Microsoft Corp. All rights reserved.

hcsshim's Issues

hcsshim: timeout waiting for notification extra info

When trying to run my Rabbit image from my compose file I get this error below.

ERROR: for bin_Rabbit_1 Cannot start service Rabbit: container 4c49c5ce1c9be3f3deca474403b7a9df44ac09151bae5126c60768cf01767428 encountered an error during CreateContainer: hcsshim: timeout waiting for notification extra info: {"SystemType":"Container","Name":"4c49c5ce1c9be3f3deca474403b7a9df44ac09151bae5126c60768cf01767428","Owner":"docker","IgnoreFlushesDuringBoot":true,"LayerFolderPath":"C:\ProgramData\Docker\windowsfilter\4c49c5ce1c9be3f3deca474403b7a9df44ac09151bae5126c60768cf01767428","Layers":[{"ID":"dee86350-459f-580c-ae1e-fcc1bee0baa2","Path":"C:\ProgramData\Docker\windowsfilter\c056770fbd90992091b4d16fe9c2b608d739689b9b8d6f8b24edc6ecd36cfb3c"},{"ID":"cf555c82-e82c-5103-9328-02e79a453583","Path":"C:\ProgramData\Docker\windowsfilter\a1a2460d2ca841aac7e92a84fba5af4fa61274d960eff41ac1d5384bfff30efd"},{"ID":"d21d7c62-9717-5f8c-b40b-2e20f2c99b04","Path":"C:\ProgramData\Docker\windowsfilter\80fbafed497f150bbbdec469621cdb01494ac302443ee66be797ec167a88607c"},{"ID":"34db3c60-39de-51dc-bfbf-e9c907c6e86b","Path":"C:\ProgramData\Docker\windowsfilter\a10b48318221d634f2df7d4f6bbd8c4c24170cf92c8b54eaf15d21c6d12efe45"},{"ID":"5fb20f07-e3dc-5034-86a3-f1103c3377c3","Path":"C:\ProgramData\Docker\windowsfilter\4cfafd1cab11aa92b130ef6c3ed9a0f41d89ae500255424901c50408edbeb45b"},{"ID":"dbe867e1-5401-5747-9edc-02780de37593","Path":"C:\ProgramData\Docker\windowsfilter\802c32841d34c97eb63462666065b9fec57a4ea2aa9373bffe1425790078f4d1"},{"ID":"12b89f02-8d16-598e-86b0-f7c17f82612e","Path":"C:\ProgramData\Docker\windowsfilter\53801ceea5ae8088b78b6af799956087740807560aece1f504f1bb3c40efdee6"},{"ID":"28fb85e9-113c-5072-9ca5-d7de54103a5c","Path":"C:\ProgramData\Docker\windowsfilter\f6b75b2ad9713292ba588c5cd81a1efa8aadbebd6f23811cdd13327f0504d1fe"},{"ID":"2dea95ad-c46c-5ab3-a9eb-cac8df2c1451","Path":"C:\ProgramData\Docker\windowsfilter\33dcdab745abced6c32b832722b708284da1cc5ab049fe418af8c5ae42659670"},{"ID":"0df81b3f-652b-596c-8223-acaab6087dad","Path":"C:\ProgramData\Docker\windowsfilter\2
c2d427c6268e0520729be4107b6c3839fd63ffdd05236a5cc9cbdc6b3ce7190"},{"ID":"de2d5623-089a-5d2c-944e-9246b670b4e6","Path":"C:\ProgramData\Docker\windowsfilter\16f6cc2dd45bfbd1be8b3255612f7740731744ee6f6dbb3f049eefd535df962f"},{"ID":"fd01999c-0b74-515c-ad9b-71b4236015eb","Path":"C:\ProgramData\Docker\windowsfilter\8b8f0948e6aa5ad08b3042a03c0bbd5ffd971d9f7e26d052d5afde4abb1837ad"},{"ID":"fc9ade98-724f-5fbb-8363-1ba433028c3d","Path":"C:\ProgramData\Docker\windowsfilter\abe09c74ed9ef55cde9a138b9b1cba2a3d987a43d2a3d492018a9e8b2d2bd94e"},{"ID":"60162656-a118-5f4a-a081-7114cce85437","Path":"C:\ProgramData\Docker\windowsfilter\1d6314999ada0560529b2fbbb14d4f35341cd2911959c0fe9be85d736ff3ca29"},{"ID":"fa0bbe42-6d85-531a-bfcb-1822906ff2c3","Path":"C:\ProgramData\Docker\windowsfilter\beb26da51fdda5d9d72ba60069d9b65fe35052013fac5f765775fc6e9224bf6b"},{"ID":"1d9b3c2c-68e4-5e56-82a9-3073faa6b72a","Path":"C:\ProgramData\Docker\windowsfilter\f420d7b6053c27051b688473386b8b621cf2a6f3ecca9f2600dfda0f2de20a92"}],"MemoryMaximumInMB":3072,"HostName":"4c49c5ce1c9b","HvPartition":true,"EndpointList":["221d7a6a-4b70-4f8f-bccd-afa1b6deb906"],"HvRuntime":{"ImagePath":"C:\ProgramData\Docker\windowsfilter\beb26da51fdda5d9d72ba60069d9b65fe35052013fac5f765775fc6e9224bf6b\UtilityVM"},"AllowUnqualifiedDNSQuery":true}

If I run the Rabbit image using the docker run command, the image works fine. Only through compose does it give this error! I understand that it's timing out waiting for a notification. I just don't know how exactly to go about fixing this or what to look at.

legacyLayerReader should not ignore wcidirs file for Files directory

legacyLayerReader.Next has code that specifically ignores wcidirs files that aren't under "Files" hierarchy. Now that a wcidirs file for the Files folder itself is produced, we can eventually use that to restore permissions on the Files folder that correspond to permissions that were set by the user on the container root volume. For this to work, the reader needs to preserve the file content from Files.$wcidirs$ instead of ignoring it.

This change should be verified on the downlevel hosts (RS1) to ensure that it doesn't interfere with their functionality.

MaximumOutgoingBandwidthInBytes is not respected

To reproduce:

  1. Create container with endpoint on a NAT network
  2. Update endpoint to add a hcsshim.QosPolicy with MaximumOutgoingBandwidthInBytes: 1024
  3. Inside the container, upload a 100 KB file to a remote host

The upload will take < 1 second, as opposed to the expected 100 seconds
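For reference, the 100-second figure follows from dividing the file size by the configured cap. Below is a minimal Go sketch of that arithmetic, assuming the policy value is interpreted as bytes per second and that 100 KB means 100 × 1024 bytes. The helper expectedSeconds is hypothetical and is not part of hcsshim.

```go
package main

import "fmt"

// expectedSeconds returns the minimum transfer time for sizeBytes when
// throughput is capped at limitBytesPerSec. Hypothetical helper for
// illustration only; it is not part of hcsshim.
func expectedSeconds(sizeBytes, limitBytesPerSec uint64) float64 {
	return float64(sizeBytes) / float64(limitBytesPerSec)
}

func main() {
	const fileSize = 100 * 1024 // 100 KB upload from step 3 (assumed 1 KB = 1024 bytes)
	const limit = 1024          // MaximumOutgoingBandwidthInBytes from step 2 (assumed bytes/second)
	fmt.Printf("expected upload time: %.0f s\n", expectedSeconds(fileSize, limit))
}
```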

Windows Kernel version: 10.0 16299 (16299.15.amd64fre.rs3_release.170928-1534)

Expose typed errors as a stable interface to prevent unnecessary hcsshim dependencies

This is just a thought that I'm posting as an issue to keep track of it.

Currently, in order to determine if an error is caused by a missing resource in hcsshim, clients need to take a dependency on hcsshim and use hcsshim.IsNotFound(). This is better than raw errors and error strings, but I think it could be done in a more scalable way.

I wonder if we could do this in a more generic way in the future, such that clients who want to determine if an error is caused by a resource not existing do not need to take a dependency on hcsshim (assuming the hcsshim error filters a few layers, or even projects, above where it is read).

I think this would be best done with something like https://github.com/pkg/errors for Cause wrapping instead of my current implementation of getInnerError and an unexported (but guaranteed stable) isNotFound interface with some method such as .NotFound() and errors which implement that method. Clients could then just define the same interface in their code and at runtime determine if the error is caused by a missing resource without needing to know which underlying package caused the error.

This would be best defined outside hcsshim so that other platform layers could implement the same interface, and it is not just Microsoft/* that uses it. I don't have the time to look into if something like this already exists in Go, but storing this to come back to it at a later date. If not, pkg/errors might want to host something like this.

Unnecessary worker processes are created on Windows 10 Hyper-V isolation.

Starting a docker container with simple command: docker run --rm microsoft/nanoserver starts two vmwp.exe instances. One of them is killed when the container is stopped. The other one can only be stopped by killing it from the task manager.

One more thing, the unnecessary vmwp.exe instance creates two vmmem instances, while the correct one creates only one instance.

This causes some weird issues as some file handles are still used by the zombie instance and docker cannot get a handle to these files.

docker info: (I have debugged and found that the hcsshim.CreateContainer call is the one that creates two instances)

Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 7
Server Version: 18.03.0-ce
Storage Driver: windowsfilter (windows) lcow (linux)
 Windows:
 LCOW:
Logging Driver: json-file
Plugins:
 Volume: local
 Network: ics l2bridge l2tunnel nat null overlay transparent
 Log: awslogs etwlogs fluentd gelf json-file logentries splunk syslog
Swarm: inactive
Default Isolation: hyperv
Kernel Version: 10.0 17134 (17134.1.amd64fre.rs4_release.180410-1804)
Operating System: Windows 10 Enterprise
OSType: windows
Architecture: x86_64
CPUs: 8
Total Memory: 15.89GiB
Name: YUSUFG-PC
ID: 7BCW:GTLJ:2FQB:KV3J:NRRE:IHTW:6C3B:YOCE:65GK:NKHJ:IS3D:3LNJ
Docker Root Dir: C:\ProgramData\Docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: -1
 Goroutines: 21
 System Time: 2018-07-02T17:17:15.1751126+03:00
 EventsListeners: 0
Registry: https://index.docker.io/v1/
Labels:
Experimental: true
Insecure Registries:
 172.17.5.200:5000
 192.168.10.61:5000
 127.0.0.0/8
Live Restore Enabled: false

`wclayer create` fails on multiple parent layers

I was trying to create a writable scratch layer based on two read-only layers from microsoft/nanoserver:latest. After I imported the two layers, when I tried to run .\wclayer.exe create -l l_1 -l l_2 l_3 (l_1 and l_2 are the read-only layers, l_3 is the scratch layer I want to create), it failed with:

ERRO[0000] hcsshim::CreateScratchLayer failed in Win32: The system cannot find the path specified. (0x3) path=C:\Users\t-liazha\Desktop\test\l_3
hcsshim::CreateScratchLayer failed in Win32: The system cannot find the path specified. (0x3) path=C:\Users\t-liazha\Desktop\test\l_3

Does anyone have any idea about this?

Confusing parameter

In layer.go, the CreateSandboxLayer and CreateScratchLayer functions both have a parameter named parentId. It is natural to assume that this is the parent of the layer we want to create, but from line 34 of create.go we can deduce that parentId is actually the base layer ID, because the list of paths to read-only parent layers passed to wclayer create should end with the base layer. Interestingly, the parentId parameter is ignored inside both functions, so it makes no difference to execution.

I suggest we delete that parameter. In addition, for arguments that specify paths to parent layers, as in create, import, export, and mount, we probably need better documentation on the ordering and on how the list should be given. The list should end with the base layer rather than start with it, and every element should be preceded by -l, as in -l layer_3 -l layer_2 -l layer_1.

DNS lookup occasionally fails immediately after container create

We have found that DNS lookups occasionally fail when a process is executed in a newly created container. Here is a short Go program which can be used to reproduce the issue:

$env:ROOTFS_PATH=(docker inspect microsoft/windowsservercore:1709 | ConvertFrom-Json).GraphDriver.Data.Dir
$env:NETWORK_NAME="nat"

for ($i=0; $i -lt 20; $i++) {./main.exe }

The DNS lookup from inside the container will fail 0-6 times out of 20.

If we use a container image that has the DNSCache service turned off, the DNS lookup always succeeds. The Dockerfile we use for this is:

FROM microsoft/windowsservercore:1709

RUN powershell.exe -command "Set-ItemProperty -Path 'HKLM:\SYSTEM\CurrentControlSet\Services\dnscache' -Name Start -Value 4"

A reason of the 'The requested compute system operation is not valid in the current state' error

Hello,

I already created an issue in the moby repository, moby/moby#37395, but have not received any feedback yet.

We are seeing this error on a few of our Windows Server 2016 servers. When it occurs, docker ps usually hangs (if Docker has not been upgraded to 18.03-ee) or docker run gets stuck.

What could cause this error? What would you recommend to debug it and prevent it in the future?

Thanks!

Slow network performance on WinNAT due to receive segment coalescing (RSC)

As mentioned in docker/for-win#698 Docker for Windows (including Windows Server / Docker EE) can experience slow network performance with WinNAT due to receive segment coalescing (RSC).

I was able to reproduce the issue by having a long-running container that regularly does a git clone of the curl repository. After disabling RSC on all available network adapters, I no longer experienced slow network performance. Even just a regular curl download of large files (>1 GB) caused this issue.

Therefore I would like to suggest disabling RSC via hcsshim on the host adapter every time a virtual network switch is created for use with Docker.

Adding ACLPolicies to an endpoint after it has been created fails on 1803 technical preview

Tested on Kernel Version: 10.0 17093 (17093.1000.amd64fre.rs_prerelease.180202-1400)

If you attempt to modify a created HNS Endpoint to add ACL Policies, the update fails with HNS failed with error : The parameter is incorrect. Code to reproduce can be found here. To run:

$env:ROOTFS_PATH=(docker inspect microsoft/windowsservercore-insider:10.0.17093.1000 | ConvertFrom-Json).GraphDriver.Data.Dir 
$env:NETWORK_NAME="nat"

.\acl-repro.exe

Container ID: 1521223469702047900
2018/03/16 11:04:32 HNS failed with error : The parameter is incorrect.

This is a regression from 1709:

$env:ROOTFS_PATH=(docker inspect microsoft/windowsservercore:1709 | ConvertFrom-Json).GraphDriver.Data.Dir
$env:NETWORK_NAME="nat"

.\acl-repro.exe

Container ID: 1521223272627319900
added acl to the endpoint

syscallWatcher should support early cancellation

The syscallWatcher today effectively stacks goroutines over time. At the time of this post the defaultTimeout is 4 minutes, which means that every syscall (even a completed one) keeps a goroutine open (although sleeping) for 4 minutes. This pattern should support context.Context cancellation after the syscall returns, since we no longer need to monitor for a hung state.

It would likely look something like:

ctx, cancel := context.WithTimeout(context.Background(), defaultTimeout)
defer cancel()
go syscallWatcher(ctx, ...)
// make syscall
return result

So if the syscall returns we cancel the syscallWatcher and if it times out before returning we get the appropriate syscall hung state as expected.

Container network collides with link-local 169.254.169.253 DNS resolver

Description.

When an HNS NAT network exists on a Windows Server 1709 machine, the link-local 169.254.169.253 address cannot be used to resolve DNS names. In AWS there is a configuration option to add this address to the list of DNS resolvers for the public ethernet interface. When the HNS NAT network is removed (via Remove-HNSNetwork powershell command), that link-local IP address can be used to successfully resolve DNS names.

Steps to reproduce.

  1. Launch an AWS VM using the following AMI: Windows_Server-2016-English-Core-Containers-2018.07.11 (ami-d0a296af).
  2. Run nslookup google.com 169.254.169.253. This fails to resolve the google.com DNS name. See below for what this failure looks like.
PS C:\Users\Administrator> nslookup google.com 169.254.169.253
Server:  UnKnown
Address:  169.254.169.253

*** UnKnown can't find google.com: No response from server

If you remove the HNS NAT network with the following Go program (or the Remove-HNSNetwork PowerShell command), the nslookup google.com 169.254.169.253 command will succeed.

package main

import (
	"fmt"
	"os"

	"github.com/Microsoft/hcsshim"
)

func main() {
	nets, err := hcsshim.HNSListNetworkRequest("GET", "", "")
	if err != nil {
		fmt.Println(err.Error())
		os.Exit(1)
	}

	for _, n := range nets {
		fmt.Printf("deleting: %s\n", n.Name)
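		// Report a failed delete instead of silently ignoring it.
		if _, err := n.Delete(); err != nil {
			fmt.Printf("failed to delete %s: %v\n", n.Name, err)
		}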
	}
}

Windows 10

Just wondering if this works on standard Windows 10 or only on Windows Server.

I need a way for docker containers to just work on Windows, without virtualbox.

Unable to remove container network without vswitch

I somehow got into the following situation

> Get-ContainerNetwork

Name           Id                                   Subnets          Mode        SourceMac DNSServers DNSSuffix
----           --                                   -------          ----        --------- ---------- ---------
testNet        70c6f3b7-f5fe-462d-94f8-ecc73f83c2c3 {}               Transparent

> Get-ContainerNetwork | Remove-ContainerNetwork

Confirm
Remove-ContainerNetwork will remove the container network "".
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [S] Suspend  [?] Help (default is "Y"):
Remove-ContainerNetwork : The parameter is incorrect.
At line:1 char:24
+ Get-ContainerNetwork | Remove-ContainerNetwork
+                        ~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [Remove-ContainerNetwork], VirtualizationException
    + FullyQualifiedErrorId : OperationFailed,Microsoft.Containers.PowerShell.Cmdlets.RemoveContainerNetwork

> Get-ContainerNetwork

Name           Id                                   Subnets          Mode        SourceMac DNSServers DNSSuffix
----           --                                   -------          ----        --------- ---------- ---------
testNet        70c6f3b7-f5fe-462d-94f8-ecc73f83c2c3 {}               Transparent

> Get-VMSwitch
> Get-NetAdapter

Name                      InterfaceDescription                    ifIndex Status       MacAddress             LinkSpeed
----                      --------------------                    ------- ------       ----------             ---------
Ethernet0                 vmxnet3 Ethernet Adapter                      3 Up           00-0C-29-1A-52-31        10 Gbps

Notice how there is no HNS vSwitch

I managed to reproduce the above scenario by:

> New-ContainerNetwork -Name reproNet -Mode transparent -NetworkAdapterName Ethernet0

Name     Id                                   Subnets Mode        SourceMac DNSServers DNSSuffix
----     --                                   ------- ----        --------- ---------- ---------
reproNet 731b3b26-3d42-44af-b5ca-fe4253127900 {}      Transparent

> Get-VMSwitch

Name           SwitchType NetAdapterInterfaceDescription
----           ---------- ------------------------------
New HNS Switch External   vmxnet3 Ethernet Adapter

> Get-VMSwitch | Remove-VMSwitch

Confirm
Are you sure you want to remove the virtual switch "New HNS Switch"?
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [S] Suspend  [?] Help (default is "Y"):
Remove-VMSwitch : Failed while removing virtual Ethernet switch.
Switch delete failed, switch = 'bc7bce24-6256-4691-9c71-5aab5534531e': General access denied error (0x80070005).
At line:1 char:16
+ Get-VMSwitch | Remove-VMSwitch
+                ~~~~~~~~~~~~~~~
    + CategoryInfo          : PermissionDenied: (:) [Remove-VMSwitch], VirtualizationException
    + FullyQualifiedErrorId : AccessDenied,Microsoft.HyperV.PowerShell.Commands.RemoveVMSwitch

> Get-VMSwitch

Name           SwitchType NetAdapterInterfaceDescription
----           ---------- ------------------------------
New HNS Switch Private

Notice how vSwitch changed type to Private


> net stop hns
The Host Network Service service is stopping.
The Host Network Service service was stopped successfully.

> Get-VMSwitch

Name           SwitchType NetAdapterInterfaceDescription
----           ---------- ------------------------------
New HNS Switch Private

> Get-VMSwitch | Remove-VMSwitch

Confirm
Are you sure you want to remove the virtual switch "New HNS Switch"?
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [S] Suspend  [?] Help (default is "Y"):
Remove-VMSwitch : Failed while removing virtual Ethernet switch.
Switch delete failed, switch = 'bc7bce24-6256-4691-9c71-5aab5534531e': General access denied error (0x80070005).
At line:1 char:16
+ Get-VMSwitch | Remove-VMSwitch
+                ~~~~~~~~~~~~~~~
    + CategoryInfo          : PermissionDenied: (:) [Remove-VMSwitch], VirtualizationException
    + FullyQualifiedErrorId : AccessDenied,Microsoft.HyperV.PowerShell.Commands.RemoveVMSwitch

> Get-ContainerNetwork

Name           Id                                   Subnets          Mode        SourceMac DNSServers DNSSuffix
----           --                                   -------          ----        --------- ---------- ---------
reproNet       731b3b26-3d42-44af-b5ca-fe4253127900 {}               Transparent

> Get-ContainerNetwork | Remove-ContainerNetwork

Confirm
Remove-ContainerNetwork will remove the container network "".
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [S] Suspend  [?] Help (default is "Y"):

No error encountered, but you can still see the container network:

> Get-ContainerNetwork

Name           Id                                   Subnets          Mode        SourceMac DNSServers DNSSuffix
----           --                                   -------          ----        --------- ---------- ---------
reproNet       731b3b26-3d42-44af-b5ca-fe4253127900 {}               Transparent

And now it can't be removed:

> Get-ContainerNetwork | Remove-ContainerNetwork

Confirm
Remove-ContainerNetwork will remove the container network "".
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [S] Suspend  [?] Help (default is "Y"):
Remove-ContainerNetwork : The parameter is incorrect.
At line:1 char:24
+ Get-ContainerNetwork | Remove-ContainerNetwork
+                        ~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [Remove-ContainerNetwork], VirtualizationException
    + FullyQualifiedErrorId : OperationFailed,Microsoft.Containers.PowerShell.Cmdlets.RemoveContainerNetwork

Also, AFAIK calling *-ContainerNetwork automatically starts hns service:

> net start hns
The requested service has already been started.

More help is available by typing NET HELPMSG 2182.

> NET HELPMSG 2182

The requested service has already been started.  // haha thx

Now, after computer restart, I CAN remove "repro":

> Get-ContainerNetwork | Remove-ContainerNetwork

Confirm
Remove-ContainerNetwork will remove the container network "".
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [S] Suspend  [?] Help (default is "Y"):

> Get-ContainerNetwork
>

But, restarting DOES NOT HELP with removing the "original" testNet network, so reproduction isn't perfect - it works only until restart.

Question is, how can I remove testNet network? I suspect it has something to do with vswitch not existing, as in my reproduction, but also something else. Is there a way to "factory-reset" hns?

Extracting layer fails when container is running

When:

  1. A container is running with a given image (e.g made up of layers A,B,C )
  2. A new image is downloaded + extracted that shares a base layer with the given image (e.g. made up of layers A,D,E)

Then extracting the first new layer (layer D in this example) will fail.

To reproduce, run in one terminal:

PS C:\Users\vagrant> docker pull microsoft/windowsservercore:1709_KB4054517
1709_KB4054517: Pulling from microsoft/windowsservercore
5847a47b8593: Already exists
e50cc21fbc56: Already exists
Digest: sha256:65a11ae1d7096b850c02184cfe30b7ef5665357472a34098afdcda94546b91b8
Status: Downloaded newer image for microsoft/windowsservercore:1709_KB4054517
PS C:\Users\vagrant> docker run -it microsoft/windowsservercore:1709_KB4054517 cmd.exe

and then in a second terminal, pull a new version of the container image:

PS C:\Users\vagrant> docker pull microsoft/windowsservercore:1709_KB4056892
1709_KB4056892: Pulling from microsoft/windowsservercore
5847a47b8593: Already exists
9f887ccb8077: Extracting [==================================================>]  689.7MB/689.7MB
failed to register layer: re-exec error: exit status 1: output: remove \\?\C:\ProgramData\docker\windowsfilter\572e75dfac18f1e5a7cf134a584e3376fa75115d17e8508240d2f71a0cb9fa14\UtilityVM\Files\Windows\WinSxS\amd64_microsoft-windows-workstationservice_31bf3856ad364e35_10.0.16299.15_none_ef2643e047f6349e\wkssvc.dll: Access is denied.

From docker info:

Default Isolation: process
Kernel Version: 10.0 16299 (16299.15.amd64fre.rs3_release.170928-1534)

LayerFolderPath field value does not affect creating containers

We noticed that the value of LayerFolderPath seems to not matter at all. When we set it to any non-empty string, all our integration tests pass (note that we are only creating shared-kernel containers, not Hyper-V).

Does this value actually need to be set to a specific value? Is there some side effect we are not seeing that setting this property affects?

Also please let us know if there is a better medium to ask this question in as it is a very general question

Windows kernel version: 10.0 16299 (16299.15.amd64fre.rs3_release.170928-1534)

Unable to use Linux containers on Windows 10- Docker hv-sock proxy (vsudd) is not reachable

I am unable to run Windows containers or Linux containers on the latest version of Windows 10 Pro. Windows is already logged in. The error I get when I switch Docker to use Linux containers is:
Docker hv-sock proxy (vsudd) is not reachable
at Docker.Backend.ContainerEngine.Linux.ConnectToVsud(TaskCompletionSource`1 vmId) in C:\gopath\src\github.com\docker\pinata\win\src\Docker.Backend\ContainerEngine\Linux.cs:line 293
at Docker.Backend.ContainerEngine.Linux.DoStart(Settings settings, String daemonOptions) in C:\gopath\src\github.com\docker\pinata\win\src\Docker.Backend\ContainerEngine\Linux.cs:line 260
at Docker.Backend.ContainerEngine.Linux.Start(Settings settings, String daemonOptions) in C:\gopath\src\github.com\docker\pinata\win\src\Docker.Backend\ContainerEngine\Linux.cs:line 130
at Docker.Core.Pipe.NamedPipeServer.<>c__DisplayClass9_0.b__0(Object[] parameters) in C:\gopath\src\github.com\docker\pinata\win\src\Docker.Core\pipe\NamedPipeServer.cs:line 47
at Docker.Core.Pipe.NamedPipeServer.RunAction(String action, Object[] parameters) in C:\gopath\src\github.com\docker\pinata\win\src\Docker.Core\pipe\NamedPipeServer.cs:line 145

More details in HNS error

Hi,
When HNS cannot create a network, the error it returns is not very detailed:

Error response from daemon: HNS failed with error : Failed to create network

There can be many reasons (only one NAT network can exist, a problem with the network adapter, etc.).
Is it possible to include more details in the error? It would be helpful, especially for users who are just starting with Docker.

Build failure with Go 1.7.4

I am new to Go, but I think I am doing this right and it should work. The following uses Docker to reproduce the problem, to make sure the issue isn't related to my environment.

$ docker run -it golang:1.7.4
root@1880646cda2e:/go# export GOOS=windows
root@1880646cda2e:/go# export GOARCH=386
root@1880646cda2e:/go# go get github.com/microsoft/hcsshim
# github.com/microsoft/hcsshim
src/github.com/microsoft/hcsshim/hcsshim.go:151: type [1073741824]uint16 too large
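
The message follows from simple arithmetic: an array of 1073741824 (1 << 30) uint16 elements is 2 GiB (2147483648 bytes), which does not fit in a signed 32-bit size, and that is presumably why the GOARCH=386 build rejects the type. Assuming the offending line casts a pointer to such an array type in order to read a NUL-terminated UTF-16 string (an assumption; the line itself is not shown above), a minimal sketch that avoids the giant array type is to measure the string first and then slice exactly that many elements. The helper name utf16PtrToString is hypothetical, and unsafe.Add and unsafe.Slice need Go 1.17 or newer.

```go
package main

import (
	"fmt"
	"unicode/utf16"
	"unsafe"
)

// utf16PtrToString is a hypothetical helper: it reads a NUL-terminated
// UTF-16 string starting at p without casting to a huge array type.
func utf16PtrToString(p *uint16) string {
	if p == nil {
		return ""
	}
	// Count code units up to (not including) the terminating NUL.
	n := 0
	for q := unsafe.Pointer(p); *(*uint16)(q) != 0; q = unsafe.Add(q, unsafe.Sizeof(*p)) {
		n++
	}
	// Build a slice of exactly n elements and decode it.
	return string(utf16.Decode(unsafe.Slice(p, n)))
}

func main() {
	// Simulate a buffer handed back by a Windows API: UTF-16 plus a trailing NUL.
	buf := append(utf16.Encode([]rune("hcsshim")), 0)
	fmt.Println(utf16PtrToString(&buf[0]))
}
```

Because the slice length is computed at run time, no array type larger than the string itself ever appears, so the same code should compile for both 386 and amd64.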

Friendly name and Hyper-V Switch

Hello,

I'm not sure if this is the right place to ask, but:

We're trying to do some Docker-related networking using a Hyper-V switch. We need a way to associate a specific container with a port when it's connected to the switch. We noticed that the port/NIC "Friendly name" seen in the Hyper-V switch is the same as the id in the "EndpointList" returned by HNS. Is that the way to go?

Also, is there a way to specify an existing vSwitch in an optional parameter of a network driver? This is what happens when we try it ("vEthernet (DS)" is our switch):

`docker network create -d transparent -o com.docker.network.windowsshim.interface="vEthernet (DS)" mynetwork`
`Error response from daemon: HNS failed with error : The parameter is incorrect.`
time="2016-11-29T06:13:18.184073000-08:00" level=debug msg="Calling GET /_ping"
time="2016-11-29T06:13:18.190170400-08:00" level=debug msg="Calling POST /v1.26/networks/create"
time="2016-11-29T06:13:18.192964600-08:00" level=debug msg="form data: {\"Attachable\":false,\"CheckDuplicate\":true,\"Driver\":\"transparent\",\"EnableIPv6\":false,\"IPAM\":{\"Config\":[],\"Driver\":\"default\",\"Options\":{}},\"Internal\":false,\"Labels\":{},\"Name\":\"mynetwork\",\"Options\":{\"com.docker.network.windowsshim.interface\":\"vEthernet (DS)\"}}"
time="2016-11-29T06:13:18.194740200-08:00" level=debug msg="Allocating IPv4 pools for network mynetwork (035650ac76f2b632265f368911e4eb3349bb98b2fb132e83af2ba6ab71946361)"
time="2016-11-29T06:13:18.196157400-08:00" level=debug msg="RequestPool(LocalDefault, , , map[], false)"
time="2016-11-29T06:13:18.197505500-08:00" level=debug msg="RequestAddress(0.0.0.0/0, <nil>, map[RequestAddressType:com.docker.network.gateway])"
time="2016-11-29T06:13:18.197505500-08:00" level=debug msg="HNSNetwork Request ={\"Name\":\"035650ac76f2b632265f368911e4eb3349bb98b2fb132e83af2ba6ab71946361\",\"Type\":\"transparent\",\"NetworkAdapterName\":\"vEthernet (DS)\",\"Subnets\":[{\"AddressPrefix\":\"0.0.0.0/0\",\"GatewayAddress\":\"0.0.0.0\"}]} Address Space=[{0.0.0.0/0 0.0.0.0 []}]"
time="2016-11-29T06:13:19.550560300-08:00" level=debug msg="releasing IPv4 pools from network mynetwork (035650ac76f2b632265f368911e4eb3349bb98b2fb132e83af2ba6ab71946361)"
time="2016-11-29T06:13:19.550560300-08:00" level=debug msg="ReleaseAddress(0.0.0.0/0, 0.0.0.0)"
time="2016-11-29T06:13:19.552067700-08:00" level=debug msg="ReleasePool(0.0.0.0/0)"
time="2016-11-29T06:13:19.554412900-08:00" level=error msg="Handler for POST /v1.26/networks/create returned error: HNS failed with error : The parameter is incorrect. "

Traffic to containers via NAT stops working when using IPSec to encrypt network connections

Using Windows Server 1709 or 1803 we are attempting to use IPSec encryption along with Windows Containers using NAT. For example:

Working:
Client --(unencrypted TCP)--> Container Host --> NAT --> Container

Not working:
Client --(encrypted with IPSec)--> Container Host --> NAT --> Container

IPSec is being enabled via standard WFP configuration with:

New-NetIPsecRule -LocalAddress [local] -RemoteAddress [remote] -InboundSecurity Require -OutboundSecurity Require

We can reproduce this issue with Cloud Foundry which uses hcsshim as part of the https://github.com/cloudfoundry/winc component and we also see the same behavior using Docker, such as:

docker run -d -p 8080:80 --name aspnet microsoft/aspnet

It appears that this is a fundamental limitation with WinNAT / HNS / WFP but we aren't sure if some combination of settings can make this work.

hcsshim::ImportLayer failed in Win32: The system cannot find the path specified. (0x3)

Unfortunately I am again encountering "hcsshim::ImportLayer failed in Win32: The system cannot find the path specified. (0x3)" on Windows Server 1803 with Docker 18.03.1-ee-3:

re-exec error: exit status 1: output: time="2018-11-07T21:14:46+01:00" level=error msg="hcsshim::ImportLayer failed in Win32: The system cannot find the path specified. (0x3) layerId=\\\\?\\C:\\ProgramData\\docker\\windowsfilter\\dbb8fe7c9ff32c1c36b7efea1786dc52d1bd420849248300536adc27dd5189e8 flavour=1 folder=C:\\ProgramData\\docker\\tmp\\hcs132418239"
hcsshim::ImportLayer failed in Win32: The system cannot find the path specified. (0x3) layerId=\\?\C:\ProgramData\docker\windowsfilter\dbb8fe7c9ff32c1c36b7efea1786dc52d1bd420849248300536adc27dd5189e8 flavour=1 folder=C:\ProgramData\docker\tmp\hcs132418239
Client:
 Version:           18.06.1-ce
 API version:       1.37 (downgraded from 1.38)
 Go version:        go1.10.3
 Git commit:        e68fc7a
 Built:             Tue Aug 21 17:23:18 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.03.1-ee-3
  API version:      1.37 (minimum version 1.24)
  Go version:       go1.10.2
  Git commit:       b9a5c95
  Built:            Thu Aug 30 18:56:49 2018
  OS/Arch:          windows/amd64
  Experimental:     false

It happens during the process-level isolation build of this Dockerfile:

# escape=`

ARG BASE_TAG=1803

FROM microsoft/windowsservercore:${BASE_TAG}

SHELL ["powershell", "-command"]

RUN Invoke-WebRequest "https://go.microsoft.com/fwlink/p/?linkid=870807" -OutFile "C:\Windows\Temp\winsdksetup.exe"; `
    Start-Process -FilePath "C:\Windows\Temp\winsdksetup.exe" -ArgumentList /Quiet, /NoRestart -NoNewWindow -PassThru -Wait; `
    Remove-Item @('C:\Windows\Temp\*', 'C:\Users\*\Appdata\Local\Temp\*') -Force -Recurse; `
    Write-Host 'Checking INCLUDE ...'; `
    Get-Item -Path 'C:\Program Files (x86)\Windows Kits\10\Include\10.0.17134.0\shared'; `
    Get-Item -Path 'C:\Program Files (x86)\Windows Kits\10\Include\10.0.17134.0\um'; `
    Get-Item -Path 'C:\Program Files (x86)\Windows Kits\10\Include\10.0.17134.0\ucrt';

RUN Write-Host 'Updating INCLUDE ...'; `
    $env:INCLUDE = 'C:\Program Files (x86)\Windows Kits\10\Include\10.0.17134.0\shared;' + $env:INCLUDE; `
    $env:INCLUDE = 'C:\Program Files (x86)\Windows Kits\10\Include\10.0.17134.0\um;' + $env:INCLUDE; `
    $env:INCLUDE = 'C:\Program Files (x86)\Windows Kits\10\Include\10.0.17134.0\ucrt;' + $env:INCLUDE; `
    [Environment]::SetEnvironmentVariable('INCLUDE', $env:INCLUDE, [EnvironmentVariableTarget]::Machine);

CMD ["powershell"]

The base image is the following:

microsoft/windowsservercore   1803                1a4a9d0fd8af        4 weeks ago         4.93GB

@thaJeztah @jhowardmsft @jterry75 Would you mind taking another look at this? The previous issue was moby/moby#32838.

Hns: POST after DELETE doesn't work without a delay

Deleting an HNS network and then creating another one immediately afterwards doesn't work. Here's the error message:

Expected error:
    <*errors.errorString | 0xc0422f18b0>: {
        s: "HNS failed with error : Element not found. ",
    }
    HNS failed with error : Element not found.
not to have occurred

Here's a test case:

subnets1 := []hcsshim.Subnet{
    {
        AddressPrefix:  "172.100.0.0/20",
        GatewayAddress: "172.100.0.1",
    },
}
configuration1 := &hcsshim.HNSNetwork{
    Name:    "TestNetworkName1",
    Type:    "transparent",
    Subnets: subnets1,
}

subnets2 := []hcsshim.Subnet{
    {
        AddressPrefix:  "172.200.0.0/20",
        GatewayAddress: "172.200.0.1",
    },
}
configuration2 := &hcsshim.HNSNetwork{
    Name:    "TestNetworkName2",
    Type:    "transparent",
    Subnets: subnets2,
}

It("doesn't work if there's no delay after DELETE", func() {
    configBytes1, err := json.Marshal(configuration1)
    Expect(err).ToNot(HaveOccurred())
    response, err := hcsshim.HNSNetworkRequest("POST", "", string(configBytes1))
    Expect(err).ToNot(HaveOccurred())
    hnsID := response.Id

    _, err = hcsshim.HNSNetworkRequest("DELETE", hnsID, "")
    Expect(err).ToNot(HaveOccurred())

    //time.Sleep(time.Second * 20) // 20 second timeout "fixes" the issue

    configBytes2, err := json.Marshal(configuration2)
    Expect(err).ToNot(HaveOccurred())
    response, err = hcsshim.HNSNetworkRequest("POST", "", string(configBytes2))
    Expect(err).To(HaveOccurred()) // !!! ERROR

    _, err = hcsshim.HNSNetworkRequest("GET", hnsID, "")
    Expect(err).To(HaveOccurred()) // but can't GET the deleted network either
})

Note that sleeping for 20 seconds after deleting a network seems to "fix" the issue. 10 second timeout is not enough.
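
Rather than a fixed sleep, a caller could retry the second POST until it succeeds or a deadline passes. The following is only a workaround sketch, not a fix for HNS: retryUntil and errElementNotFound are hypothetical names, and the stubbed createNetwork function stands in for the real hcsshim.HNSNetworkRequest("POST", ...) call so the example runs anywhere.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errElementNotFound mimics the error text HNS returned in the test above.
var errElementNotFound = errors.New("HNS failed with error : Element not found. ")

// retryUntil calls op until it returns nil or the timeout would be exceeded,
// sleeping for interval between attempts. It returns the last error, wrapped.
func retryUntil(timeout, interval time.Duration, op func() error) error {
	deadline := time.Now().Add(timeout)
	for {
		err := op()
		if err == nil {
			return nil
		}
		// Stop if the next attempt would start after the deadline.
		if time.Now().Add(interval).After(deadline) {
			return fmt.Errorf("giving up after %v: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}

func main() {
	// Stand-in for the POST request: fails twice, then succeeds.
	attempts := 0
	createNetwork := func() error {
		attempts++
		if attempts < 3 {
			return errElementNotFound
		}
		return nil
	}
	err := retryUntil(30*time.Second, 10*time.Millisecond, createNetwork)
	fmt.Println(attempts, err)
}
```

With a real HNS call, a timeout of around 30 seconds and an interval of a second or two would match the timings observed above, but those numbers are a guess based on this one report.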

Update: I reproduced this in PowerShell, so this may be a problem with HNS itself. I posted an issue here: MicrosoftDocs/Virtualization-Documentation#516

`wclayer import` does not work properly with long path

I was using wclayer to import layers. It seems that if the target directory has a long path, wclayer does not work properly.

I was importing the base layer from microsoft/nanoserver. The layer can be downloaded from https://az896309.vo.msecnd.net/containers/microsoft/nanoserver:10.0.14393.447_en-us_full_spx2q06478MkUJl5hOPmCKDgYiXoSLt3. When I imported it with

.\wclayer.exe import -i .\nanoserver_10.0.14393.447_en-us_full_spx2q06478MkUJl5hOPmCKDgYiXoSLt3 C:\temp_short

the extracted folder has 1.13 GB. However, if I imported it with

.\wclayer.exe import -i .\nanoserver_10.0.14393.447_en-us_full_spx2q06478MkUJl5hOPmCKDgYiXoSLt3 C:\temp_long\abcdef\ghijklmn\opqrst\uvwxyz\abcd\efghijk\lmnopq\rst\uvwxyz\abcdefghijklmn\opqrstuvwxyz

the extracted folder has only 81.3 MB.

I have also tried to add long path prefix:

.\wclayer.exe import -i .\nanoserver_10.0.14393.447_en-us_full_spx2q06478MkUJl5hOPmCKDgYiXoSLt3 \\?\C:\temp_long\abcdef\ghijklmn\opqrst\uvwxyz\abcd\efghijk\lmnopq\rst\uvwxyz\abcdefghijklmn\opqrstuvwxyz

wclayer failed with mkdir \\?: The filename, directory name, or volume label syntax is incorrect., so I created the directory first and ran the command again. The extracted folder also has only 81.3 MB. Therefore, I suspect that if the target path is too long, some files get ignored during the import.
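
One way to check this suspicion would be to list which files in the layer end up past the legacy Windows MAX_PATH limit of 260 characters once they are joined to the target directory. The sketch below assumes that limit is the cause, which the report does not confirm. The helper names and the sample relative path are hypothetical, and backslashes are joined by hand so the result is the same on any host OS.

```go
package main

import (
	"fmt"
	"strings"
)

// maxPath is the legacy Windows MAX_PATH limit in characters,
// including the terminating NUL.
const maxPath = 260

// joinWindows joins path elements with backslashes regardless of host OS.
func joinWindows(elems ...string) string {
	return strings.Join(elems, `\`)
}

// exceedsMaxPath reports whether target\rel is too long for APIs that
// enforce MAX_PATH. len counts bytes, which equals characters for ASCII paths.
func exceedsMaxPath(target, rel string) bool {
	return len(joinWindows(target, rel))+1 > maxPath
}

func main() {
	// Hypothetical 180-character path of a file inside the layer.
	rel := `Files\Windows\WinSxS\` + strings.Repeat("x", 150) + `\file.dll`

	short := `C:\temp_short`
	long := `C:\temp_long\abcdef\ghijklmn\opqrst\uvwxyz\abcd\efghijk\lmnopq\rst\uvwxyz\abcdefghijklmn\opqrstuvwxyz`

	fmt.Println("short target too long:", exceedsMaxPath(short, rel))
	fmt.Println("long target too long: ", exceedsMaxPath(long, rel))
}
```

If the files missing from the 81.3 MB result are exactly the ones this check flags, that would support the theory that they are skipped because of path length.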

Implement proper signal support

For WCOW and LCOW we should actually make the SignalProcess call when a signal comes in. On Windows we can do this when the reported capabilities include SignalProcessSupported = true.

Stop using “Sandbox”

New code is being added that uses the term “sandbox” to refer to the top layer of a layer-based storage system. This term is not used outside of Microsoft and may be confused with a Kubernetes sandbox. We should replace this term with upper or writable layer.

symlinked directories don't work as bind mount source on 1803 technical preview

Tested on Kernel Version: 10.0 17093 (17093.1000.amd64fre.rs_prerelease.180202-1400)

If the source of a bind mount is a symlink, container creation will succeed, but the directory will not be accessible inside the container. This is a regression from 1709. To reproduce:

$dockerImage = "microsoft/windowsservercore-insider:10.0.17093.1000"

$mountDir = "$env:TEMP\mountdir"
mkdir $mountDir

echo hello > "$mountDir\hello.txt"

$symlink = "$env:TEMP\symlink"
cmd.exe /c "mklink /D $symlink $mountDir"

docker run -v"$symlink":c:\containerDir $dockerImage cmd.exe /C "type c:\containerDir\hello.txt"

cmd.exe /c "rmdir $symlink"
rm -r -force $mountDir

The output of this script:

    Directory: C:\Windows\TEMP                                                                                                                 
                                                                                                                                               
                                                                                                                                               
Mode                LastWriteTime         Length Name                                                                                          
----                -------------         ------ ----                                                                                          
d-----         3/9/2018   9:09 AM                mountdir                                                                                      
symbolic link created for C:\Windows\TEMP\symlink <<===>> C:\Windows\TEMP\mountdir                                                             
The create operation failed because the name contained at least one mount point which resolves to a volume to which the specified device object is not attached.

On 1709 (with the container image changed appropriately), running the script shows that the container mounts the symlinked directory correctly:

    Directory: C:\Windows\TEMP                                                    
                                                                                  
                                                                                  
Mode                LastWriteTime         Length Name                             
----                -------------         ------ ----                             
d-----         3/9/2018   9:07 AM                mountdir                         
symbolic link created for C:\Windows\TEMP\symlink <<===>> C:\Windows\TEMP\mountdir
hello                                                                             

connectex: A socket operation was attempted to an unreachable network AND HNS Unspecified error

Issue seems similar to #95

There are two test cases and two different errors, but I think the underlying root cause may be the same, that's why I report both errors in the same issue.

Here's the first test case:

for i := 0; i < numTries; i++ {
	subnets := []hcsshim.Subnet{
		{
			AddressPrefix:  "10.0.0.0/24",
			GatewayAddress: "10.0.0.1",
		},
	}
	configuration := &hcsshim.HNSNetwork{
		Type:               "transparent",
		NetworkAdapterName: "Ethernet0",
		Subnets:            subnets,
	}
	configBytes, _ := json.Marshal(configuration)
	resp, err := hcsshim.HNSNetworkRequest("POST", "", string(configBytes))
	Expect(err).ToNot(HaveOccurred())

	_, err = net.Dial("tcp", "10.7.0.54:8082")
	Expect(err).ToNot(HaveOccurred())
	// sometimes errors:
	// `dial tcp 10.7.0.54:8082: connectex: A socket operation was attempted to an
	// unreachable network.`

	hcsshim.HNSNetworkRequest("DELETE", resp.Id, "")
}

net.Dial sometimes fails with:

dial tcp 10.7.0.54:8082: connectex: A socket operation was attempted to an unreachable network.

A PowerShell script that roughly replicates it (but prints a different error):

1..50 | % { New-ContainerNetwork -Mode transparent -Name net1 -SubnetPrefix 10.0.0.0/24 -NetworkAdaptername Ethernet0; curl 10.7.0.54:8082; Remove-ContainerNetwork -Name net1 -Force; }

I sometimes get error:

curl : Unable to connect to remote server

which I believe to be a coarse error message encompassing aforementioned connectex... error.

Here's the second test case, which is very similar to the first one but we don't specify a subnet when creating the HNS network:

for i := 0; i < numTries; i++ {
	configuration := &hcsshim.HNSNetwork{
		Type:               "transparent",
		NetworkAdapterName: "Ethernet0",
	}
	configBytes, _ := json.Marshal(configuration)
	resp, err := hcsshim.HNSNetworkRequest("POST", "", string(configBytes))
	Expect(err).ToNot(HaveOccurred())
	// sometimes errors:
	// `HNS failed with error : Unspecified error`

	hcsshim.HNSNetworkRequest("DELETE", resp.Id, "")
}

This time, we get an HNS error when invoking the POST request on HNS:

HNS failed with error : Unspecified error

To replicate via powershell:

1..50 | % { New-ContainerNetwork -Mode transparent -Name net1 -NetworkAdaptername Ethernet0; Remove-ContainerNetwork -Name net1 -Force; }

Which sometimes returns:

New-ContainerNetwork : Unspecified Error

Release & revendor go-winio/hcsshim + opengcs

This is re this thread: https://github.com/Microsoft/hcsshim/pull/276#issuecomment-411589251.

We have a two part change between go-winio and hcsshim.

go-winio change is here: microsoft/go-winio#88 (comment)
hcsshim change is here: #276 (comment)

Note that there are also separate changes here:
hcsshim: #277 (comment)
opengcs: microsoft/opengcs#243 (comment)

These should also be submitted before we release and revendor, so we do not have to do this twice.

"Element not found" error when hot attaching endpoint to container

We've noticed that when we are creating many containers and attaching endpoints to them in parallel, we get an error from HotAttachEndpoint. The message is just "Element not found." It's not clear whether that is the container, the endpoint, or something else.

We haven't been able to find a way to recover from this error without destroying the container and endpoint and trying again. The state of an endpoint before attach appears identical whether the attach fails or not. So is the state after the attach (even if it fails). Retrying the attach does not work, nor does attempting to use the endpoint.

Here is a reproduction of the issue. For us, running this with 50 container creates in parallel usually fails on the first or second try. It's worth noting that we very rarely see this issue if we are only doing a few creates in parallel.
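
For readers without access to the linked reproduction, the general shape of such a parallel run can be sketched as follows. This is not the reporters' code: runParallel, attachFunc and errAttach are hypothetical names, and the stub in main simply fails every tenth call where a real test would create a container, create an endpoint, and call HotAttachEndpoint.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// attachFunc stands in for "create container, create endpoint, hot attach".
type attachFunc func(i int) error

// errAttach mimics the terse message reported by HNS.
var errAttach = errors.New("Element not found.")

// runParallel starts n attach operations at once and returns the
// index and error of every operation that failed.
func runParallel(n int, attach attachFunc) map[int]error {
	var (
		mu     sync.Mutex
		wg     sync.WaitGroup
		failed = make(map[int]error)
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			if err := attach(i); err != nil {
				// The map is shared between goroutines, so guard writes.
				mu.Lock()
				failed[i] = err
				mu.Unlock()
			}
		}(i)
	}
	wg.Wait()
	return failed
}

func main() {
	// Hypothetical stub: every tenth attach fails.
	failed := runParallel(50, func(i int) error {
		if i%10 == 0 {
			return errAttach
		}
		return nil
	})
	fmt.Printf("%d of 50 attaches failed\n", len(failed))
}
```

Recording the failing indices makes it easier to compare the state of endpoints that attached successfully with those that did not.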

Thanks,
@mdelillo @aminjam

runhcs outputs runc in a few places in help text

Should probably use the binary name instead.

binaryName = filepath.Base(os.Args[0])                                // With extension
binaryName = strings.TrimSuffix(binaryName, filepath.Ext(binaryName)) // Without extension

HNSEndpoint DNSServerList is ignored in 10.0.17733

After running the latest Windows Insider build for 2019 (10.0.17733), we noticed that the DNSServerList for HNSEndpoint is ignored.

Steps to reproduce this error:

  • Assuming we have a container and a nat network
  • When we run the following program with ContainerId and NetworkName
package main

import (
	"fmt"
	"os"

	"github.com/Microsoft/hcsshim"
)

func main() {
	containerName := os.Args[1]
	networkName := os.Args[2]
	endpoint := &hcsshim.HNSEndpoint{
		VirtualNetworkName: networkName,
		Name:               containerName,
	}

	endpoint.DNSServerList = "222.111.111.222,123.123.123.123"

	newEndpoint, err := endpoint.Create()
	if err != nil {
		fmt.Printf("Endpoint creation failed- %s\n", err.Error())
		os.Exit(1) // newEndpoint is not usable after a failed Create, so stop here
	}

	err = hcsshim.HotAttachEndpoint(containerName, newEndpoint.Id)
	if err != nil {
		fmt.Printf("Attaching endpoint failed\n")
	}

}
  • We expect (Get-DNSClientServerAddress).ServerAddresses to contain 222.111.111.222 and 123.123.123.123 in the container, but we observe that it is blank. This works as expected on 1709 and 1803 builds.

docker info:

Containers: 2
 Running: 0
 Paused: 0
 Stopped: 2
Images: 12
Server Version: master-dockerproject-2018-08-15
Storage Driver: windowsfilter
 Windows:
Logging Driver: json-file
Plugins:
 Volume: local
 Network: ics l2bridge l2tunnel nat null overlay transparent
 Log: awslogs etwlogs fluentd gelf json-file logentries splunk syslog
Swarm: inactive
Default Isolation: process
Kernel Version: 10.0 17733 (17733.1000.amd64fre.rs5_release.180803-1525)
Operating System: Windows Server Datacenter Version 1803 (OS Build 17733.1000)
OSType: windows
Architecture: x86_64
CPUs: 4
Total Memory: 32GiB
Name: WIN-8SUSKTQISJR
ID: BZ5Q:F572:PJZR:BLWE:TRIN:JWCP:FPSK:Q7ZH:UZCY:53NM:VCTY:HT5X
Docker Root Dir: C:\ProgramData\docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Typed Errors

There was recently a commit, c898547, that added another return value to the create process func. It is used for type checking the type of error.

This pkg is pretty small right now so I think we can make this even better. There are many different ways of handling typed errors in Go. We can do something like the syscall package does with its syscall.Errno, which is a uintptr. Or we can define an error type in the package with codes, so that we don't need multiple return values to get at the error type in a type-safe way.

I suggest doing something like this but I wanted to get your input on what you think before I open a PR.

Option 1:

var (
    WaitErrExecFailed = errors.New("hcsshim: wait exec failed")

    // Known Win32 RC values which should be trapped
    Win32PipeHasBeenEnded                 = errors.New("hcsshim: The pipe has been ended")
    Win32SystemShutdownIsInProgress       = errors.New("hcsshim: A system shutdown is in progress")
    Win32SpecifiedPathInvalid             = errors.New("hcsshim: The specified path is invalid")
    Win32SystemCannotFindThePathSpecified = errors.New("hcsshim: The system cannot find the path specified")
    Win32InvalidArgument                  = errors.New("hcsshim: An invalid argument was supplied")
)

By doing this we have typed errors that are easily comparable by the consumer.

_, _, _, _, err := CreateProcessInComputeSystem(id, true, ...)
if err != nil {
    if err == hcsshim.Win32InvalidArgument {
         // do specific stuff
    }
    return err
}

Option 2: if you want to keep the codes from the types above, you can do something like the syscall pkg does.

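type Errno uint32

func (e Errno) Error() string {
    switch e {
    case Win32InvalidArgument:
        return "hcsshim: An invalid argument was supplied"
    }
    // Fall back to a generic message so every code path returns a string.
    return "hcsshim: unknown error"
}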
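
To make the trade-off concrete, here is a minimal self-contained sketch of the Errno approach and how a consumer would compare against it. It is only an illustration: the constant names, their numeric values, and the createProcess stand-in are all hypothetical and are not the codes or functions hcsshim actually uses.

```go
package main

import (
	"errors"
	"fmt"
)

// Errno is an error code type in the style of syscall.Errno.
type Errno uint32

// The numeric values below are placeholders for illustration only.
const (
	ErrInvalidArgument Errno = iota + 1
	ErrPathNotFound
)

// Error makes Errno satisfy the error interface.
func (e Errno) Error() string {
	switch e {
	case ErrInvalidArgument:
		return "hcsshim: An invalid argument was supplied"
	case ErrPathNotFound:
		return "hcsshim: The system cannot find the path specified"
	}
	return fmt.Sprintf("hcsshim: unknown error %d", uint32(e))
}

// createProcess is a hypothetical stand-in for the real create call.
func createProcess(id string) error {
	if id == "" {
		return ErrInvalidArgument
	}
	return nil
}

func main() {
	err := createProcess("")
	// A direct comparison works because Errno is a comparable value type.
	fmt.Println(err == ErrInvalidArgument)
	// errors.Is also matches after the error has been wrapped with %w.
	wrapped := fmt.Errorf("create failed: %w", err)
	fmt.Println(errors.Is(wrapped, ErrInvalidArgument))
}
```

Either way the caller gets a single error return value and can still branch on the specific failure, which is the point of both options.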

What do you all think?

Is this what lets programs create files on my Windows system that I can't delete?

This issue is probably not so much about Docker, so please bear with me for a moment instead of insta-closing.

I uninstalled Docker for Windows but can't seem to delete the files it leaves in the C:\ProgramData\Docker\windowsfilter folder. The advice is to use a "dangerous" utility that makes use of hcsshim to handle the deletion.

This readme states that hcsshim is used by Docker (mostly*), so it seems reasonable that it's what Docker used to create the files I can't delete, even after taking ownership and trying to replace child permissions.

icacls "C:\ProgramData\Docker\" /T /C /grant Administrators:F
...
Successfully processed 212988 files; Failed processing 111741 files

Is this module creating files in such a way that they cannot be normally removed by a machine administrator?

If so, these files are holding 20 GB of my disk hostage that I would like to free up using conventional (and safe) windows deletion means.

  • The Visual Studio feedback collector tool (the one that captures recordings and ETW data) has also created files I can't remove. Is hcsshim used in that program as well?
