The following exception occasionally takes down our computation. We don't know how to reproduce it in isolation: it happens only on large input datasets, after a few minutes of execution. I am not yet sure that our computation is fully deterministic across runs on the same input (we're verifying this now), so the intermittent failures may be due to non-deterministic behaviour in our code.
"ParsedInputPono" (renamed) is a plain-old-.NET-object with a few public fields (of type long, int, string) and a constructor that only copies its arguments to the respective fields.
We're running a debug build on Linux with Mono (I'll try to rerun with --debug to get line numbers) on about 16 machines with 4 workers each (-t 4).
Can you help us out?
Thank you.
00:07:58.2570129, Graph 0 failed on scheduler 3 with exception:
System.NullReferenceException: Object reference not set to an instance of an object
at Microsoft.Research.Naiad.Serialization.AutoGenerated.ParsedInputPono.TrySerializeMany (Microsoft.Research.Naiad.Serialization.SubArray`1& destination, ArraySegment`1 values) [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Serialization.SendBufferPage.WriteElements[ParsedInputPono] (NaiadSerialization`1 serializer, ArraySegment`1 elements) [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Serialization.AutoSerializedMessageEncoder`2[ParsedInputPono,Microsoft.Research.Naiad.Dataflow.Epoch].WriteElements (ArraySegment`1 records, Int32 srcVertexId) [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Serialization.AutoSerializedMessageEncoder`2[ParsedInputPono,Microsoft.Research.Naiad.Dataflow.Epoch].Write (ArraySegment`1 records, Int32 srcVertexId) [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Dataflow.Channels.RemoteMailbox`2[ParsedInputPono,Microsoft.Research.Naiad.Dataflow.Epoch].Send (Message`2 message, ReturnAddress from) [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Dataflow.Channels.BufferingPostbox`2[ParsedInputPono,Microsoft.Research.Naiad.Dataflow.Epoch].FlushBuffer (Int32 index) [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Dataflow.Channels.BufferingPostbox`2[ParsedInputPono,Microsoft.Research.Naiad.Dataflow.Epoch].InternalSend (Message`2 records) [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Dataflow.Channels.BufferingPostbox`2[ParsedInputPono,Microsoft.Research.Naiad.Dataflow.Epoch].Send (Message`2 records) [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Dataflow.VertexOutputBuffer`2[ParsedInputPono,Microsoft.Research.Naiad.Dataflow.Epoch].Send (Message`2 message) [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Dataflow.VertexOutputBufferPerTime`2[ParsedInputPono,Microsoft.Research.Naiad.Dataflow.Epoch].SendBuffer () [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Dataflow.VertexOutputBufferPerTime`2[ParsedInputPono,Microsoft.Research.Naiad.Dataflow.Epoch].Send (ParsedInputPono record) [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Frameworks.Lindi.ExtensionMethods+SelectManyVertex`3[System.String,ParsedInputPono,Microsoft.Research.Naiad.Dataflow.Epoch].OnReceive (Message`2 message) [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Dataflow.StandardVertices.UnaryVertex`3[System.String,ParsedInputPono,Microsoft.Research.Naiad.Dataflow.Epoch].<MakeStage>m__0 (Message`2 message, Microsoft.Research.Naiad.Dataflow.StandardVertices.UnaryVertex`3 vertex) [0x00000] in <filename unknown>:0
at (wrapper delegate-invoke) System.Action`2<Microsoft.Research.Naiad.Dataflow.Message`2<string, Microsoft.Research.Naiad.Dataflow.Epoch>, Microsoft.Research.Naiad.Dataflow.StandardVertices.UnaryVertex`3<string, ParsedInputPono, Microsoft.Research.Naiad.Dataflow.Epoch>>:invoke_void_T1_T2 (Microsoft.Research.Naiad.Dataflow.Message`2<string, Microsoft.Research.Naiad.Dataflow.Epoch>,Microsoft.Research.Naiad.Dataflow.StandardVertices.UnaryVertex`3<string, ParsedInputPono, Microsoft.Research.Naiad.Dataflow.Epoch>)
at Microsoft.Research.Naiad.Dataflow.Stage`2+<NewInput>c__AnonStorey2`1+<NewInput>c__AnonStorey3[Microsoft.Research.Naiad.Dataflow.StandardVertices.UnaryVertex`3[System.String,ParsedInputPono,Microsoft.Research.Naiad.Dataflow.Epoch],Microsoft.Research.Naiad.Dataflow.Epoch,System.String].<>m__0 (Message`2 m) [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Dataflow.ActionReceiver`2+<ActionReceiver>c__AnonStorey0[System.String,Microsoft.Research.Naiad.Dataflow.Epoch].<>m__0 (Message`2 m, ReturnAddress u) [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Dataflow.ActionReceiver`2[System.String,Microsoft.Research.Naiad.Dataflow.Epoch].OnReceive (Message`2 message, ReturnAddress from) [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Dataflow.Channels.PipelineChannel`2+Fiber[System.String,Microsoft.Research.Naiad.Dataflow.Epoch].Send (Message`2 records) [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Dataflow.VertexOutputBuffer`2[System.String,Microsoft.Research.Naiad.Dataflow.Epoch].Send (Message`2 message) [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Dataflow.VertexOutputBufferPerTime`2[System.String,Microsoft.Research.Naiad.Dataflow.Epoch].SendBuffer () [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Dataflow.VertexOutputBufferPerTime`2[System.String,Microsoft.Research.Naiad.Dataflow.Epoch].Send (System.String record) [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Dataflow.StreamingInputVertex`1[System.String].PerformAction (WorkItem workItem) [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Scheduling.Scheduler+WorkItem.Run () [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Scheduling.Scheduler.Schedule (WorkItem workItem) [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Scheduling.Scheduler.RunWorkItem (Int32 graphId) [0x00000] in <filename unknown>:0
at Microsoft.Research.Naiad.Scheduling.Scheduler.RunNotification (Int32 computationIndex) [0x00000] in <filename unknown>:0