Comments (5)
Note that this issue is blocking .NET integration. @vincentkam to add a code example that reproduces the issue we are seeing with signals on Windows + .NET.
from drivers-atlas-testing.
@prashantmital note that using a tombstone file will not work if the workload executor is implemented as a Docker container. In such scenarios, the container would not have access to that file unless it were explicitly linked into the container, and even then it may still cause trouble. Another approach might be to use a lightweight RPC over domain sockets/named pipes, or to use something like a value in a MongoDB document. I still think the path of least complexity lies with using system signals, so I'm eager to see @vincentkam's code exemplifying the difficulty with trapping these signals on Windows.
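For reference, the tombstone-file idea discussed above amounts to the workload executor polling for a file and stopping once it appears. The following is a minimal Python sketch of that idea only, not astrolabe's or any driver's actual implementation; the function name `run_until_tombstone`, its parameters, and the `do_operation` callback are all hypothetical. It also assumes both processes see the same filesystem, which is exactly what may not hold for a containerized executor.

```python
import os
import time


def run_until_tombstone(tombstone_path, do_operation, poll_interval=0.0):
    """Call do_operation repeatedly until tombstone_path exists.

    Hypothetical helper: returns how many operations completed before
    the tombstone file was observed.
    """
    completed = 0
    while not os.path.exists(tombstone_path):
        do_operation()
        completed += 1
        if poll_interval:
            # Optional pause between checks to avoid a hot loop.
            time.sleep(poll_interval)
    return completed
```

The controlling process would signal "stop" simply by creating the file, so no OS signal has to cross the shell or container boundary.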
I haven't delved too deeply yet, but I know that on the JVM there is no straightforward way to install a signal handler. There is also no JVM support for domain sockets.
The following sample exemplifies the issues we've been running into in getting signal handling to work with a combination of Cygwin bash + Windows Python + dotnet.
https://github.com/vincentkam/drivers-atlas-testing/tree/dotnet-signaling-issues
The workload-executor is a Cygwin bash script adapted from the Python driver's bash script. It in turn executes the "native" workload executor, which in this case is basically Program.cs.
I've commented Program.cs to illustrate the flow a bit better, as I know not everyone has a Windows box handy, although a spawnhost with the dotnet toolchain installed should work if anyone wants to play with this example.
The TLDR is that something appears to be terminating the native workload executor before it can finish executing. I suspect it's a Cygwin bash problem because I see similar behavior when using kill -INT on the workload-executor bash script.
Here is a sample test run using Cygwin bash to invoke astrolabe, which was installed via pip using Python on Windows:
Vincent@Astorma:~/projects/drivers-atlas-testing/integrations/dotnet$ /cygdrive/c/Python38/Scripts/astrolabe.exe spec-tests validate-workload-executor -e workload-executor --connection-string mongodb://localhost
test_num_errors (astrolabe.validator.ValidateWorkloadExecutor) ... INFO:astrolabe.utils:Starting workload executor subprocess
INFO:astrolabe.utils:Started workload executor [PID: 19028]
INFO:astrolabe.utils:Waiting 1.0 seconds for the workload executor subprocess to start
+ set -o errexit
+ FRAMEWORK=netcoreapp2.1
+ MAGIC_FILE_NAME=nox
+ CONNECTION_STRING=mongodb://localhost
+ WORKLOAD_SPEC='{"database": "validation_db", "collection": "validation_coll", "testData": [{"_id": "validation_sentinel", "count": 0}], "operations": [{"object": "collection", "name": "updateOne", "arguments": {"filter": {"_id": "validation_sentinel"}, "update": {"$inc": {"count": 1}}}}, {"object": "collection", "name": "doesNotExist", "arguments": {"foo": "bar"}}]}'
+ echo I am 1307...
I am 1307...
+ rm -f nox
+ export MAGIC_FILE_NAME
+ trap 'echo You have activated my trap card; touch $MAGIC_FILE_NAME; wait $NATIVE_WORKLOAD_EXECUTOR_PID; exit $?' INT
+ export NATIVE_WORKLOAD_EXECUTOR_PID=1309
+ dotnet run --framework netcoreapp2.1 -p workload-executor.csproj mongodb://localhost '{"database": "validation_db", "collection": "validation_coll", "testData": [{"_id": "validation_sentinel", "count": 0}], "operations": [{"object": "collection", "name": "updateOne", "arguments": {"filter": {"_id": "validation_sentinel"}, "update": {"$inc": {"count": 1}}}}, {"object": "collection", "name": "doesNotExist", "arguments": {"foo": "bar"}}]}'
+ NATIVE_WORKLOAD_EXECUTOR_PID=1309
+ wait 1309
dotnet main> Magic: nox
dotnet main> Arg: mongodb://localhost
dotnet main> Arg: {"database": "validation_db", "collection": "validation_coll", "testData": [{"_id": "validation_sentinel", "count": 0}], "operations": [{"object": "collection", "name": "updateOne", "arguments": {"filter": {"_id": "validation_sentinel"}, "update": {"$inc": {"count": 1}}}}, {"object": "collection", "name": "doesNotExist", "arguments": {"foo": "bar"}}]}
INFO:astrolabe.utils:Stopping workload executor [PID: 19028]
dotnet int handler> The main program has been interrupted.
dotnet int handler> Key pressed: ControlBreak
dotnet int handler> Cancel property: False
dotnet int handler> Setting the Cancel property to true...
dotnet int handler> Spinning until 4s have elapsed. Time (ms) elapsed: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 ++ echo You have activated my trap card
7 7 7 You have activated my trap card
7 8 8 ++ touch nox
8 8 8 8 8 8 8 8 8 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 10 10 10 10 10 10 10 10 10 10 10 10 10 11 11 11 11 11 11 11 11 11 11 12 12 12 12 12 12 12 12 12 12 12 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 14 14 14 14 14 14 14 14 14 14 15 15 15 15 15 15 15 15 15 15 15 15 15 15 16 16 16 16 16 16 16 16 16 16 16 16 16 16 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 18 18 18 18 18 18 18 18 18 18 18 18 18 18 19 19 19 19 19 19 19 19 20 20 20 20 20 20 20 20 20 20 20 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 23 23 23 23 23 23 23 23 23 23 23 23 24 24 24 24 24 24 24 24 25 25 25 25 25 25 25 25 25 25 25 25 25 25 26 26 26 26 26 26 26 26 26 26 26 26 26 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 28 28 28 28 28 28 28 28 28 28 28 28 28 28 29 29 29 29 29 29 29 29 30 30 30 30 30 30 30 30 ++ wait 1309
INFO:astrolabe.utils:Stopped workload executor [PID: 19028]
INFO:astrolabe.utils:Reading sentinel file 'C:\\users\\vincent\\Projects\\drivers-atlas-testing\\integrations\\dotnet\\results.json'
ERROR:astrolabe.utils:Sentinel file not found
FAIL
test_simple (astrolabe.validator.ValidateWorkloadExecutor) ... INFO:astrolabe.utils:Starting workload executor subprocess
INFO:astrolabe.utils:Started workload executor [PID: 4416]
INFO:astrolabe.utils:Waiting 1.0 seconds for the workload executor subprocess to start
+ set -o errexit
+ FRAMEWORK=netcoreapp2.1
+ MAGIC_FILE_NAME=nox
+ CONNECTION_STRING=mongodb://localhost
+ WORKLOAD_SPEC='{"database": "validation_db", "collection": "validation_coll", "testData": [{"_id": "validation_sentinel", "count": 0}], "operations": [{"object": "collection", "name": "updateOne", "arguments": {"filter": {"_id": "validation_sentinel"}, "update": {"$inc": {"count": 1}}}}]}'
+ echo I am 1311...
I am 1311...
+ rm -f nox
+ export MAGIC_FILE_NAME
+ trap 'echo You have activated my trap card; touch $MAGIC_FILE_NAME; wait $NATIVE_WORKLOAD_EXECUTOR_PID; exit $?' INT
+ export NATIVE_WORKLOAD_EXECUTOR_PID=1313
+ dotnet run --framework netcoreapp2.1 -p workload-executor.csproj mongodb://localhost '{"database": "validation_db", "collection": "validation_coll", "testData": [{"_id": "validation_sentinel", "count": 0}], "operations": [{"object": "collection", "name": "updateOne", "arguments": {"filter": {"_id": "validation_sentinel"}, "update": {"$inc": {"count": 1}}}}]}'
+ NATIVE_WORKLOAD_EXECUTOR_PID=1313
+ wait 1313
dotnet main> Magic: nox
dotnet main> Arg: mongodb://localhost
dotnet main> Arg: {"database": "validation_db", "collection": "validation_coll", "testData": [{"_id": "validation_sentinel", "count": 0}], "operations": [{"object": "collection", "name": "updateOne", "arguments": {"filter": {"_id": "validation_sentinel"}, "update": {"$inc": {"count": 1}}}}]}
INFO:astrolabe.utils:Stopping workload executor [PID: 4416]
dotnet int handler> The main program has been interrupted.
dotnet int handler> Key pressed: ControlBreak
dotnet int handler> Cancel property: False
dotnet int handler> Setting the Cancel property to true...
dotnet int handler> Spinning until 4s have elapsed. Time (ms) elapsed: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 8 8 ++ echo You have activated my trap card
8 9 You have activated my trap card
9 9 9 ++ touch nox
9 10 10 10 10 10 10 10 10 10 10 10 11 11 11 11 11 11 11 11 11 11 11 11 11 11 12 12 12 12 12 12 12 12 12 13 13 13 14 14 14 14 14 14 14 14 14 14 14 14 15 15 15 15 15 15 15 15 16 16 16 16 16 16 16 16 16 17 17 18 18 18 18 18 18 18 18 18 18 18 18 19 19 19 19 19 19 19 19 19 20 20 20 20 20 20 20 20 21 21 21 21 23 24 24 24 24 24 25 25 25 25 25 25 25 25 25 25 25 25 26 26 26 26 26 26 26 26 26 26 26 27 27 27 29 29 29 29 29 29 29 29 30 30 30 30 30 30 30 30 30 30 30 31 31 31 31 31 31 32 32 32 32 32 32 32 33 33 33 33 33 33 33 33 33 34 34 34 34 34 35 35 35 35 35 35 36 36 36 36 36 36 36 36 36 37 37 37 37 37 37 38 38 38 38 38 38 39 39 39 40 40 40 40 41 41 41 41 41 41 42 42 42 42 42 42 42 43 43 43 43 43 44 44 44 44 44 44 44 45 45 45 45 46 46 46 46 46 46 46 46 46 47 47 47 47 47 48 48 48 48 48 48 48 48 48 48 48 49 49 49 49 49 49 49 49 50 50 50 50 50 50 50 50 50 50 50 50 50 51 51 51 51 51 51 51 51 51 51 51 52 52 52 52 52 53 53 53 53 53 54 54 54 54 54 54 54 54 54 55 55 55 55 55 56 56 56 56 56 56 56 56 57 57
58 58 58 58 59 59 59 59 59 59 59 59 60 60 60 60 60 60 60 60 61 61 61 61 61 61 61 61 61 62 62 62 62 63 63 63 63 63 63 63 63 63 63 63 63 64 ++ wait 1313
INFO:astrolabe.utils:Stopped workload executor [PID: 4416]
INFO:astrolabe.utils:Reading sentinel file 'C:\\users\\vincent\\Projects\\drivers-atlas-testing\\integrations\\dotnet\\results.json'
ERROR:astrolabe.utils:Sentinel file not found
FAIL
======================================================================
FAIL: test_num_errors (astrolabe.validator.ValidateWorkloadExecutor)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\python38\lib\site-packages\astrolabe\validator.py", line 122, in test_num_errors
    stats = self.run_test(driver_workload)
  File "C:\python38\lib\site-packages\astrolabe\validator.py", line 71, in run_test
    self.fail("The workload executor did not write a results.json "
AssertionError: The workload executor did not write a results.json file in the expected location, or the file that was written contained malformed JSON.
======================================================================
FAIL: test_simple (astrolabe.validator.ValidateWorkloadExecutor)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\python38\lib\site-packages\astrolabe\validator.py", line 92, in test_simple
    stats = self.run_test(driver_workload)
  File "C:\python38\lib\site-packages\astrolabe\validator.py", line 71, in run_test
    self.fail("The workload executor did not write a results.json "
AssertionError: The workload executor did not write a results.json file in the expected location, or the file that was written contained malformed JSON.
----------------------------------------------------------------------
Ran 2 tests in 16.298s
FAILED (failures=2)
See #79 for a proposed alternate strategy for communicating state between astrolabe and workload-executors.