Comments (5)

prashantmital commented on June 21, 2024

Note that this issue is blocking .NET integration. @vincentkam to add a code example that reproduces the issue we are seeing with signals on Windows + .NET.

from drivers-atlas-testing.

mbroadst commented on June 21, 2024

@prashantmital note that using a tombstone file will not work if the workload executor is implemented as a Docker container. In such scenarios, the container would not have access to that file unless it were explicitly mounted into the container, and even then it may still cause trouble. Another approach might be to use a lightweight RPC over domain sockets/named pipes, or to use something like a value in a MongoDB document. I still think the path of least complexity lies with using system signals, so I'm eager to see @vincentkam's code exemplifying the difficulty with trapping these signals on Windows.
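
For readers following along, the signal-based contract discussed here can be sketched roughly as below. This is only an illustration in Python, not code from astrolabe or from any driver's workload executor: `run_workload`, `do_one_operation`, and the `{"completed": ...}` layout of the results file are hypothetical names chosen for the example.

```python
import json
import signal


def run_workload(do_one_operation, results_path):
    """Run operations until SIGINT arrives, then write a results file.

    Hypothetical sketch: the function name, the callback, and the JSON
    layout are assumptions made for illustration only.
    """
    stop_requested = False

    def handle_interrupt(signum, frame):
        # Only record the request; the main loop performs the shutdown.
        nonlocal stop_requested
        stop_requested = True

    # signal.signal must be called from the main thread.
    previous_handler = signal.signal(signal.SIGINT, handle_interrupt)
    completed = 0
    try:
        while not stop_requested:
            do_one_operation()
            completed += 1
    finally:
        # Restore whatever handler was installed before we started.
        signal.signal(signal.SIGINT, previous_handler)

    # Write the results only after the loop has stopped cleanly.
    with open(results_path, "w") as f:
        json.dump({"completed": completed}, f)
    return completed
```

The handler does nothing except set a flag, so the results file is written by the main loop after it has stopped, rather than from inside the handler. Whether the process gets enough time to reach that point after the signal arrives is exactly the open question on Windows.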

jyemin commented on June 21, 2024

I haven't delved too deeply yet, but I know that on the JVM there is no straightforward way to install a signal handler. There is also no JVM support for domain sockets.

vincentkam commented on June 21, 2024

The following sample illustrates the issues we've been running into while getting signal handling to work with a combination of Cygwin bash + Windows Python + dotnet.
https://github.com/vincentkam/drivers-atlas-testing/tree/dotnet-signaling-issues

The workload-executor is a Cygwin bash script adapted from the Python driver's bash script. It in turn executes the "native" workload executor, which in this case is basically Program.cs.

I've commented Program.cs to illustrate the flow a bit better, as I know not everyone may have a Windows box handy, although a spawnhost with the dotnet toolchain installed should work if anyone wants to play with this example.

The TL;DR is that something appears to be terminating the native workload executor before it can finish executing. I suspect it's a Cygwin bash problem because I see similar behavior when using kill -INT on the workload-executor bash script.
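
To make the intended handshake easier to follow without opening the repository, here is a rough Python sketch of the polling side of the magic-file approach: the wrapper's trap touches the magic file, and the native executor is expected to notice it, stop, and write its results. This is not a port of Program.cs; `run_until_tombstone`, `do_one_operation`, `poll_interval`, and the `{"completed": ...}` results layout are hypothetical names used only for illustration.

```python
import json
import os
import time


def run_until_tombstone(do_one_operation, tombstone_path, results_path,
                        poll_interval=0.0):
    """Run operations until the tombstone file exists, then write results.

    Hypothetical sketch: names and the JSON layout are assumptions.
    """
    completed = 0
    while not os.path.exists(tombstone_path):
        do_one_operation()
        completed += 1
        if poll_interval:
            # Optional pause between checks for the tombstone file.
            time.sleep(poll_interval)

    # Write to a temporary file and rename it into place, so a reader
    # never observes a partially written results file.
    tmp_path = results_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"completed": completed}, f)
    os.replace(tmp_path, results_path)
    return completed
```

With this shape, the wrapper only has to create the file and wait for the child to exit. The failure described above would correspond to the process being killed somewhere before the final `os.replace` is reached.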

Here is a sample test run using Cygwin bash to invoke astrolabe, which was installed via pip using Python on Windows:

Vincent@Astorma:~/projects/drivers-atlas-testing/integrations/dotnet$ /cygdrive/c/Python38/Scripts/astrolabe.exe spec-tests validate-workload-executor -e workload-executor --connection-string mongodb://localhost
test_num_errors (astrolabe.validator.ValidateWorkloadExecutor) ... INFO:astrolabe.utils:Starting workload executor subprocess
INFO:astrolabe.utils:Started workload executor [PID: 19028]
INFO:astrolabe.utils:Waiting 1.0 seconds for the workload executor subprocess to start
+ set -o errexit
+ FRAMEWORK=netcoreapp2.1
+ MAGIC_FILE_NAME=nox
+ CONNECTION_STRING=mongodb://localhost
+ WORKLOAD_SPEC='{"database": "validation_db", "collection": "validation_coll", "testData": [{"_id": "validation_sentinel", "count": 0}], "operations": [{"object": "collection", "name": "updateOne", "arguments": {"filter": {"_id": "validation_sentinel"}, "update": {"$inc": {"count": 1}}}}, {"object": "collection", "name": "doesNotExist", "arguments": {"foo": "bar"}}]}'
+ echo I am 1307...
I am 1307...
+ rm -f nox
+ export MAGIC_FILE_NAME
+ trap 'echo You have activated my trap card; touch $MAGIC_FILE_NAME; wait $NATIVE_WORKLOAD_EXECUTOR_PID; exit $?' INT
+ export NATIVE_WORKLOAD_EXECUTOR_PID=1309
+ dotnet run --framework netcoreapp2.1 -p workload-executor.csproj mongodb://localhost '{"database": "validation_db", "collection": "validation_coll", "testData": [{"_id": "validation_sentinel", "count": 0}], "operations": [{"object": "collection", "name": "updateOne", "arguments": {"filter": {"_id": "validation_sentinel"}, "update": {"$inc": {"count": 1}}}}, {"object": "collection", "name": "doesNotExist", "arguments": {"foo": "bar"}}]}'
+ NATIVE_WORKLOAD_EXECUTOR_PID=1309
+ wait 1309
dotnet main> Magic: nox
dotnet main> Arg: mongodb://localhost
dotnet main> Arg: {"database": "validation_db", "collection": "validation_coll", "testData": [{"_id": "validation_sentinel", "count": 0}], "operations": [{"object": "collection", "name": "updateOne", "arguments": {"filter": {"_id": "validation_sentinel"}, "update": {"$inc": {"count": 1}}}}, {"object": "collection", "name": "doesNotExist", "arguments": {"foo": "bar"}}]}
INFO:astrolabe.utils:Stopping workload executor [PID: 19028]

dotnet int handler> The main program has been interrupted.
dotnet int handler>  Key pressed: ControlBreak
dotnet int handler>  Cancel property: False
dotnet int handler> Setting the Cancel property to true...
dotnet int handler> Spinning until 4s have elapsed. Time (ms) elapsed: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 ++ echo You have activated my trap card
7 7 7 You have activated my trap card
7 8 8 ++ touch nox
8 8 8 8 8 8 8 8 8 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 10 10 10 10 10 10 10 10 10 10 10 10 10 11 11 11 11 11 11 11 11 11 11 12 12 12 12 12 12 12 12 12 12 12 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 14 14 14 14 14 14 14 14 14 14 15 15 15 15 15 15 15 15 15 15 15 15 15 15 16 16 16 16 16 16 16 16 16 16 16 16 16 16 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 18 18 18 18 18 18 18 18 18 18 18 18 18 18 19 19 19 19 19 19 19 19 20 20 20 20 20 20 20 20 20 20 20 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 23 23 23 23 23 23 23 23 23 23 23 23 24 24 24 24 24 24 24 24 25 25 25 25 25 25 25 25 25 25 25 25 25 25 26 26 26 26 26 26 26 26 26 26 26 26 26 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 28 28 28 28 28 28 28 28 28 28 28 28 28 28 29 29 29 29 29 29 29 29 30 30 30 30 30 30 30 30 ++ wait 1309
INFO:astrolabe.utils:Stopped workload executor [PID: 19028]
INFO:astrolabe.utils:Reading sentinel file 'C:\\users\\vincent\\Projects\\drivers-atlas-testing\\integrations\\dotnet\\results.json'
ERROR:astrolabe.utils:Sentinel file not found
FAIL
test_simple (astrolabe.validator.ValidateWorkloadExecutor) ... INFO:astrolabe.utils:Starting workload executor subprocess
INFO:astrolabe.utils:Started workload executor [PID: 4416]
INFO:astrolabe.utils:Waiting 1.0 seconds for the workload executor subprocess to start
+ set -o errexit
+ FRAMEWORK=netcoreapp2.1
+ MAGIC_FILE_NAME=nox
+ CONNECTION_STRING=mongodb://localhost
+ WORKLOAD_SPEC='{"database": "validation_db", "collection": "validation_coll", "testData": [{"_id": "validation_sentinel", "count": 0}], "operations": [{"object": "collection", "name": "updateOne", "arguments": {"filter": {"_id": "validation_sentinel"}, "update": {"$inc": {"count": 1}}}}]}'
+ echo I am 1311...
I am 1311...
+ rm -f nox
+ export MAGIC_FILE_NAME
+ trap 'echo You have activated my trap card; touch $MAGIC_FILE_NAME; wait $NATIVE_WORKLOAD_EXECUTOR_PID; exit $?' INT
+ export NATIVE_WORKLOAD_EXECUTOR_PID=1313
+ dotnet run --framework netcoreapp2.1 -p workload-executor.csproj mongodb://localhost '{"database": "validation_db", "collection": "validation_coll", "testData": [{"_id": "validation_sentinel", "count": 0}], "operations": [{"object": "collection", "name": "updateOne", "arguments": {"filter": {"_id": "validation_sentinel"}, "update": {"$inc": {"count": 1}}}}]}'
+ NATIVE_WORKLOAD_EXECUTOR_PID=1313
+ wait 1313
dotnet main> Magic: nox
dotnet main> Arg: mongodb://localhost
dotnet main> Arg: {"database": "validation_db", "collection": "validation_coll", "testData": [{"_id": "validation_sentinel", "count": 0}], "operations": [{"object": "collection", "name": "updateOne", "arguments": {"filter": {"_id": "validation_sentinel"}, "update": {"$inc": {"count": 1}}}}]}
INFO:astrolabe.utils:Stopping workload executor [PID: 4416]

dotnet int handler> The main program has been interrupted.
dotnet int handler>  Key pressed: ControlBreak
dotnet int handler>  Cancel property: False
dotnet int handler> Setting the Cancel property to true...
dotnet int handler> Spinning until 4s have elapsed. Time (ms) elapsed: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 8 8 ++ echo You have activated my trap card
8 9 You have activated my trap card
9 9 9 ++ touch nox
9 10 10 10 10 10 10 10 10 10 10 10 11 11 11 11 11 11 11 11 11 11 11 11 11 11 12 12 12 12 12 12 12 12 12 13 13 13 14 14 14 14 14 14 14 14 14 14 14 14 15 15 15 15 15 15 15 15 16 16 16 16 16 16 16 16 16 17 17 18 18 18 18 18 18 18 18 18 18 18 18 19 19 19 19 19 19 19 19 19 20 20 20 20 20 20 20 20 21 21 21 21 23 24 24 24 24 24 25 25 25 25 25 25 25 25 25 25 25 25 26 26 26 26 26 26 26 26 26 26 26 27 27 27 29 29 29 29 29 29 29 29 30 30 30 30 30 30 30 30 30 30 30 31 31 31 31 31 31 32 32 32 32 32 32 32 33 33 33 33 33 33 33 33 33 34 34 34 34 34 35 35 35 35 35 35 36 36 36 36 36 36 36 36 36 37 37 37 37 37 37 38 38 38 38 38 38 39 39 39 40 40 40 40 41 41 41 41 41 41 42 42 42 42 42 42 42 43 43 43 43 43 44 44 44 44 44 44 44 45 45 45 45 46 46 46 46 46 46 46 46 46 47 47 47 47 47 48 48 48 48 48 48 48 48 48 48 48 49 49 49 49 49 49 49 49 50 50 50 50 50 50 50 50 50 50 50 50 50 51 51 51 51 51 51 51 51 51 51 51 52 52 52 52 52 53 53 53 53 53 54 54 54 54 54 54 54 54 54 55 55 55 55 55 56 56 56 56 56 56 56 56 57 57
 58 58 58 58 59 59 59 59 59 59 59 59 60 60 60 60 60 60 60 60 61 61 61 61 61 61 61 61 61 62 62 62 62 63 63 63 63 63 63 63 63 63 63 63 63 64 ++ wait 1313
INFO:astrolabe.utils:Stopped workload executor [PID: 4416]
INFO:astrolabe.utils:Reading sentinel file 'C:\\users\\vincent\\Projects\\drivers-atlas-testing\\integrations\\dotnet\\results.json'
ERROR:astrolabe.utils:Sentinel file not found
FAIL

======================================================================
FAIL: test_num_errors (astrolabe.validator.ValidateWorkloadExecutor)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\python38\lib\site-packages\astrolabe\validator.py", line 122, in test_num_errors
    stats = self.run_test(driver_workload)
  File "C:\python38\lib\site-packages\astrolabe\validator.py", line 71, in run_test
    self.fail("The workload executor did not write a results.json "
AssertionError: The workload executor did not write a results.json file in the expected location, or the file that was written contained malformed JSON.

======================================================================
FAIL: test_simple (astrolabe.validator.ValidateWorkloadExecutor)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\python38\lib\site-packages\astrolabe\validator.py", line 92, in test_simple
    stats = self.run_test(driver_workload)
  File "C:\python38\lib\site-packages\astrolabe\validator.py", line 71, in run_test
    self.fail("The workload executor did not write a results.json "
AssertionError: The workload executor did not write a results.json file in the expected location, or the file that was written contained malformed JSON.

----------------------------------------------------------------------
Ran 2 tests in 16.298s

FAILED (failures=2)

prashantmital commented on June 21, 2024

See #79 for a proposed alternate strategy for communicating state between astrolabe and workload-executors.
