
Comments (9)

tushu1232 commented on August 22, 2024

#22

from sparkbwa.

xubo245 commented on August 22, 2024

You can debug, or add log statements to trace the issue. Focus on the files involved, including the temporary files.
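For example, here is a minimal Java logging sketch (illustrative only: the class name and file path are hypothetical, not part of SparkBWA's code; only the standard Log4j 1.x API is assumed):

import org.apache.log4j.Logger;
import java.io.File;

public class TempFileTrace {
    private static final Logger LOG = Logger.getLogger(TempFileTrace.class);

    // Log whether an intermediate file exists and how large it is, e.g. right
    // before and right after the step that is expected to write it.
    public static void traceFile(String path) {
        File f = new File(path);
        LOG.info("Temp file " + path + " exists=" + f.exists() + " size=" + f.length() + " bytes");
    }
}

Calling traceFile on each intermediate file shows where an empty file first appears.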

from sparkbwa.

tushu1232 commented on August 22, 2024

@xubo245 We are running the program in a non-HDFS environment using IBM LSF as the scheduler, not YARN. I am attaching the entire run log to this comment.
BWASPARKRUN.txt

from sparkbwa.

xubo245 commented on August 22, 2024

Can you see the stderr log in the app-** directory of the work dir?

And can you list the files in the tmps dir? (maybe in workspace/tmps)

I cannot see the error in your BWASPARKRUN.txt.
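If it helps, here is a minimal self-contained Java sketch for listing that tmp directory (the default path below is only the guess from this comment and may differ on your cluster):

import java.io.File;

public class ListTmpFiles {
    public static void main(String[] args) {
        // The default path is only a guess; pass the real tmp dir as the first argument.
        String dir = args.length > 0 ? args[0] : "workspace/tmps";
        File[] entries = new File(dir).listFiles();
        if (entries == null) {
            System.out.println("Directory not found or not readable: " + dir);
            return;
        }
        for (File f : entries) {
            System.out.println(f.getName() + "  " + f.length() + " bytes");
        }
    }
}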

from sparkbwa.

salimbakker commented on August 22, 2024

SparkBWA generates empty SAM files

17/04/21 08:11:45 INFO ContainerManagementProtocolProxy: Opening proxy : slave2.hdp:45454
17/04/21 08:11:48 INFO YarnClusterSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (slave1.hdp:41948) with ID 1
17/04/21 08:11:48 INFO BlockManagerMasterEndpoint: Registering block manager slave1.hdp:38864 with 7.0 GB RAM, BlockManagerId(1, slave1.hdp, 38864)
17/04/21 08:11:48 INFO YarnClusterSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (slave.hdp:50548) with ID 2
17/04/21 08:11:48 INFO BlockManagerMasterEndpoint: Registering block manager slave.hdp:43602 with 7.0 GB RAM, BlockManagerId(2, slave.hdp, 43602)
17/04/21 08:11:48 INFO YarnClusterSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (slave2.hdp:49614) with ID 3
17/04/21 08:11:48 INFO BlockManagerMasterEndpoint: Registering block manager slave2.hdp:46273 with 7.0 GB RAM, BlockManagerId(3, slave2.hdp, 46273)
17/04/21 08:12:14 INFO YarnClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
17/04/21 08:12:14 INFO YarnClusterScheduler: YarnClusterScheduler.postStartHook done
17/04/21 08:12:14 INFO BwaInterpreter: [com.github.sparkbwa.BwaInterpreter] :: Starting BWA
17/04/21 08:12:14 INFO BwaInterpreter: [com.github.sparkbwa.BwaInterpreter] ::Not sorting in HDFS. Timing: 47447818973648
17/04/21 08:12:14 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 341.8 KB, free 341.8 KB)
17/04/21 08:12:14 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 28.3 KB, free 370.2 KB)
17/04/21 08:12:14 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.2.86:32844 (size: 28.3 KB, free: 1140.3 MB)
17/04/21 08:12:14 INFO SparkContext: Created broadcast 0 from textFile at BwaInterpreter.java:149
17/04/21 08:12:14 INFO FileInputFormat: Total input paths to process : 1
17/04/21 08:12:14 INFO SparkContext: Starting job: zipWithIndex at BwaInterpreter.java:152
17/04/21 08:12:14 INFO DAGScheduler: Got job 0 (zipWithIndex at BwaInterpreter.java:152) with 13 output partitions
17/04/21 08:12:14 INFO DAGScheduler: Final stage: ResultStage 0 (zipWithIndex at BwaInterpreter.java:152)
17/04/21 08:12:14 INFO DAGScheduler: Parents of final stage: List()
17/04/21 08:12:14 INFO DAGScheduler: Missing parents: List()
17/04/21 08:12:14 INFO DAGScheduler: Submitting ResultStage 0 (hdfs://master.hdp:8020/SparkBWA/ERR000589_1.filt.fastq MapPartitionsRDD[1] at textFile at BwaInterpreter.java:149), which has no missing parents
17/04/21 08:12:14 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onJobStart(EventLoggingListener.scala:173)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:34)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 21 more
17/04/21 08:12:14 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.0 KB, free 373.2 KB)
17/04/21 08:12:14 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1857.0 B, free 375.0 KB)
17/04/21 08:12:14 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.2.86:32844 (size: 1857.0 B, free: 1140.3 MB)
17/04/21 08:12:14 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1008
17/04/21 08:12:14 INFO DAGScheduler: Submitting 13 missing tasks from ResultStage 0 (hdfs://master.hdp:8020/SparkBWA/ERR000589_1.filt.fastq MapPartitionsRDD[1] at textFile at BwaInterpreter.java:149)
17/04/21 08:12:14 INFO YarnClusterScheduler: Adding task set 0.0 with 13 tasks
17/04/21 08:12:14 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, slave.hdp, partition 0,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:14 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, slave2.hdp, partition 1,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:14 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, slave1.hdp, partition 2,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:14 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on slave1.hdp:38864 (size: 1857.0 B, free: 7.0 GB)
17/04/21 08:12:14 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on slave1.hdp:38864 (size: 28.3 KB, free: 7.0 GB)
17/04/21 08:12:14 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on slave.hdp:43602 (size: 1857.0 B, free: 7.0 GB)
17/04/21 08:12:14 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on slave2.hdp:46273 (size: 1857.0 B, free: 7.0 GB)
17/04/21 08:12:15 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on slave.hdp:43602 (size: 28.3 KB, free: 7.0 GB)
17/04/21 08:12:15 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on slave2.hdp:46273 (size: 28.3 KB, free: 7.0 GB)
17/04/21 08:12:17 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, slave.hdp, partition 3,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:17 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 2556 ms on slave.hdp (1/13)
17/04/21 08:12:17 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, slave1.hdp, partition 4,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:17 INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 2662 ms on slave1.hdp (2/13)
17/04/21 08:12:17 INFO TaskSetManager: Starting task 5.0 in stage 0.0 (TID 5, slave2.hdp, partition 5,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:17 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 2904 ms on slave2.hdp (3/13)
17/04/21 08:12:18 INFO TaskSetManager: Starting task 6.0 in stage 0.0 (TID 6, slave.hdp, partition 6,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:18 INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID 3) in 1708 ms on slave.hdp (4/13)
17/04/21 08:12:18 INFO TaskSetManager: Starting task 7.0 in stage 0.0 (TID 7, slave2.hdp, partition 7,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:18 INFO TaskSetManager: Finished task 5.0 in stage 0.0 (TID 5) in 1406 ms on slave2.hdp (5/13)
17/04/21 08:12:19 INFO TaskSetManager: Finished task 4.0 in stage 0.0 (TID 4) in 2160 ms on slave1.hdp (6/13)
17/04/21 08:12:19 INFO TaskSetManager: Starting task 8.0 in stage 0.0 (TID 8, slave1.hdp, partition 8,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:20 INFO TaskSetManager: Starting task 9.0 in stage 0.0 (TID 9, slave.hdp, partition 9,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:20 INFO TaskSetManager: Finished task 6.0 in stage 0.0 (TID 6) in 1375 ms on slave.hdp (7/13)
17/04/21 08:12:20 INFO TaskSetManager: Starting task 10.0 in stage 0.0 (TID 10, slave2.hdp, partition 10,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:20 INFO TaskSetManager: Finished task 7.0 in stage 0.0 (TID 7) in 1522 ms on slave2.hdp (8/13)
17/04/21 08:12:21 INFO TaskSetManager: Starting task 11.0 in stage 0.0 (TID 11, slave1.hdp, partition 11,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:21 INFO TaskSetManager: Finished task 8.0 in stage 0.0 (TID 8) in 1652 ms on slave1.hdp (9/13)
17/04/21 08:12:21 INFO TaskSetManager: Starting task 12.0 in stage 0.0 (TID 12, slave.hdp, partition 12,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:21 INFO TaskSetManager: Finished task 9.0 in stage 0.0 (TID 9) in 1603 ms on slave.hdp (10/13)
17/04/21 08:12:21 INFO TaskSetManager: Finished task 11.0 in stage 0.0 (TID 11) in 877 ms on slave1.hdp (11/13)
17/04/21 08:12:22 INFO TaskSetManager: Finished task 10.0 in stage 0.0 (TID 10) in 1659 ms on slave2.hdp (12/13)
17/04/21 08:12:23 INFO TaskSetManager: Finished task 12.0 in stage 0.0 (TID 12) in 1581 ms on slave.hdp (13/13)
17/04/21 08:12:23 INFO DAGScheduler: ResultStage 0 (zipWithIndex at BwaInterpreter.java:152) finished in 8.808 s
17/04/21 08:12:23 INFO YarnClusterScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/04/21 08:12:23 INFO DAGScheduler: Job 0 finished: zipWithIndex at BwaInterpreter.java:152, took 8.884092 s
17/04/21 08:12:23 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onStageCompleted(EventLoggingListener.scala:170)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:32)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 21 more
17/04/21 08:12:23 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onJobEnd(EventLoggingListener.scala:175)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:36)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 21 more
17/04/21 08:12:23 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 341.9 KB, free 716.9 KB)
17/04/21 08:12:23 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 28.3 KB, free 745.2 KB)
17/04/21 08:12:23 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 192.168.2.86:32844 (size: 28.3 KB, free: 1140.3 MB)
17/04/21 08:12:23 INFO SparkContext: Created broadcast 2 from textFile at BwaInterpreter.java:149
17/04/21 08:12:23 INFO FileInputFormat: Total input paths to process : 1
17/04/21 08:12:23 INFO SparkContext: Starting job: zipWithIndex at BwaInterpreter.java:152
17/04/21 08:12:23 INFO DAGScheduler: Got job 1 (zipWithIndex at BwaInterpreter.java:152) with 13 output partitions
17/04/21 08:12:23 INFO DAGScheduler: Final stage: ResultStage 1 (zipWithIndex at BwaInterpreter.java:152)
17/04/21 08:12:23 INFO DAGScheduler: Parents of final stage: List()
17/04/21 08:12:23 INFO DAGScheduler: Missing parents: List()
17/04/21 08:12:23 INFO DAGScheduler: Submitting ResultStage 1 (hdfs://master.hdp:8020/SparkBWA/ERR000589_2.filt.fastq MapPartitionsRDD[8] at textFile at BwaInterpreter.java:149), which has no missing parents
17/04/21 08:12:23 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onJobStart(EventLoggingListener.scala:173)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:34)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 21 more
17/04/21 08:12:23 INFO MemoryStore: Block broadcast_3 stored as values in memory (estimated size 3.0 KB, free 748.2 KB)
17/04/21 08:12:23 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 1863.0 B, free 750.1 KB)
17/04/21 08:12:23 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on 192.168.2.86:32844 (size: 1863.0 B, free: 1140.3 MB)
17/04/21 08:12:23 INFO SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:1008
17/04/21 08:12:23 INFO DAGScheduler: Submitting 13 missing tasks from ResultStage 1 (hdfs://master.hdp:8020/SparkBWA/ERR000589_2.filt.fastq MapPartitionsRDD[8] at textFile at BwaInterpreter.java:149)
17/04/21 08:12:23 INFO YarnClusterScheduler: Adding task set 1.0 with 13 tasks
17/04/21 08:12:23 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 13, slave1.hdp, partition 0,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:23 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 14, slave2.hdp, partition 1,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:23 INFO TaskSetManager: Starting task 2.0 in stage 1.0 (TID 15, slave.hdp, partition 2,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:23 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on slave1.hdp:38864 (size: 1863.0 B, free: 7.0 GB)
17/04/21 08:12:23 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on slave.hdp:43602 (size: 1863.0 B, free: 7.0 GB)
17/04/21 08:12:23 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on slave2.hdp:46273 (size: 1863.0 B, free: 7.0 GB)
17/04/21 08:12:23 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on slave1.hdp:38864 (size: 28.3 KB, free: 7.0 GB)
17/04/21 08:12:23 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on slave2.hdp:46273 (size: 28.3 KB, free: 7.0 GB)
17/04/21 08:12:23 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on slave.hdp:43602 (size: 28.3 KB, free: 7.0 GB)
17/04/21 08:12:24 INFO TaskSetManager: Starting task 3.0 in stage 1.0 (TID 16, slave.hdp, partition 3,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:24 INFO TaskSetManager: Finished task 2.0 in stage 1.0 (TID 15) in 1111 ms on slave.hdp (1/13)
17/04/21 08:12:24 INFO TaskSetManager: Starting task 4.0 in stage 1.0 (TID 17, slave1.hdp, partition 4,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:24 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 13) in 1233 ms on slave1.hdp (2/13)
17/04/21 08:12:24 INFO TaskSetManager: Starting task 5.0 in stage 1.0 (TID 18, slave2.hdp, partition 5,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:24 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 14) in 1415 ms on slave2.hdp (3/13)
17/04/21 08:12:25 INFO TaskSetManager: Starting task 6.0 in stage 1.0 (TID 19, slave1.hdp, partition 6,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:25 INFO TaskSetManager: Finished task 4.0 in stage 1.0 (TID 17) in 861 ms on slave1.hdp (4/13)
17/04/21 08:12:25 INFO TaskSetManager: Starting task 7.0 in stage 1.0 (TID 20, slave.hdp, partition 7,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:25 INFO TaskSetManager: Finished task 3.0 in stage 1.0 (TID 16) in 1128 ms on slave.hdp (5/13)
17/04/21 08:12:25 INFO TaskSetManager: Starting task 8.0 in stage 1.0 (TID 21, slave2.hdp, partition 8,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:25 INFO TaskSetManager: Finished task 5.0 in stage 1.0 (TID 18) in 1075 ms on slave2.hdp (6/13)
17/04/21 08:12:26 INFO TaskSetManager: Starting task 9.0 in stage 1.0 (TID 22, slave1.hdp, partition 9,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:26 INFO TaskSetManager: Finished task 6.0 in stage 1.0 (TID 19) in 852 ms on slave1.hdp (7/13)
17/04/21 08:12:26 INFO TaskSetManager: Starting task 10.0 in stage 1.0 (TID 23, slave.hdp, partition 10,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:26 INFO TaskSetManager: Finished task 7.0 in stage 1.0 (TID 20) in 969 ms on slave.hdp (8/13)
17/04/21 08:12:27 INFO TaskSetManager: Starting task 11.0 in stage 1.0 (TID 24, slave2.hdp, partition 11,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:27 INFO TaskSetManager: Finished task 8.0 in stage 1.0 (TID 21) in 1091 ms on slave2.hdp (9/13)
17/04/21 08:12:27 INFO TaskSetManager: Starting task 12.0 in stage 1.0 (TID 25, slave1.hdp, partition 12,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:27 INFO TaskSetManager: Finished task 9.0 in stage 1.0 (TID 22) in 891 ms on slave1.hdp (10/13)
17/04/21 08:12:27 INFO TaskSetManager: Finished task 10.0 in stage 1.0 (TID 23) in 976 ms on slave.hdp (11/13)
17/04/21 08:12:28 INFO TaskSetManager: Finished task 12.0 in stage 1.0 (TID 25) in 861 ms on slave1.hdp (12/13)
17/04/21 08:12:28 INFO TaskSetManager: Finished task 11.0 in stage 1.0 (TID 24) in 1159 ms on slave2.hdp (13/13)
17/04/21 08:12:28 INFO DAGScheduler: ResultStage 1 (zipWithIndex at BwaInterpreter.java:152) finished in 4.737 s
17/04/21 08:12:28 INFO YarnClusterScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool
17/04/21 08:12:28 INFO DAGScheduler: Job 1 finished: zipWithIndex at BwaInterpreter.java:152, took 4.746554 s
17/04/21 08:12:28 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onStageCompleted(EventLoggingListener.scala:170)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:32)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 21 more
17/04/21 08:12:28 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onJobEnd(EventLoggingListener.scala:175)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:36)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 21 more
17/04/21 08:12:28 INFO MapPartitionsRDD: Removing RDD 6 from persistence list
17/04/21 08:12:28 INFO BlockManager: Removing RDD 6
17/04/21 08:12:28 INFO MapPartitionsRDD: Removing RDD 13 from persistence list
17/04/21 08:12:28 INFO BlockManager: Removing RDD 13
17/04/21 08:12:28 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onUnpersistRDD(EventLoggingListener.scala:186)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:50)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 21 more
17/04/21 08:12:28 INFO BwaInterpreter: [com.github.sparkbwa.BwaInterpreter] :: No sort with partitioning
17/04/21 08:12:28 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onUnpersistRDD(EventLoggingListener.scala:186)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:50)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 21 more
17/04/21 08:12:28 INFO BwaInterpreter: [com.github.sparkbwa.BwaInterpreter] :: Repartition with no sort
17/04/21 08:12:28 INFO BwaInterpreter: [com.github.sparkbwa.BwaInterpreter] :: End of sorting. Timing: 47461927545800
17/04/21 08:12:28 INFO BwaInterpreter: [com.github.sparkbwa.BwaInterpreter] :: Total time: 0.23514286920000002 minutes
17/04/21 08:12:28 INFO BwaAlignmentBase: [com.github.sparkbwa.BwaPairedAlignment] :: application_1492697141087_0027 - SparkBWA_ERR000589_1.filt.fastq-32-NoSort
17/04/21 08:12:28 INFO SparkContext: Starting job: collect at BwaInterpreter.java:305
17/04/21 08:12:28 INFO DAGScheduler: Registering RDD 3 (mapToPair at BwaInterpreter.java:152)
17/04/21 08:12:28 INFO DAGScheduler: Registering RDD 10 (mapToPair at BwaInterpreter.java:152)
17/04/21 08:12:28 INFO DAGScheduler: Registering RDD 17 (repartition at BwaInterpreter.java:281)
17/04/21 08:12:28 INFO DAGScheduler: Got job 2 (collect at BwaInterpreter.java:305) with 32 output partitions
17/04/21 08:12:28 INFO DAGScheduler: Final stage: ResultStage 5 (collect at BwaInterpreter.java:305)
17/04/21 08:12:28 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 4)
17/04/21 08:12:28 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 4)
17/04/21 08:12:28 INFO DAGScheduler: Submitting ShuffleMapStage 2 (MapPartitionsRDD[3] at mapToPair at BwaInterpreter.java:152), which has no missing parents
17/04/21 08:12:28 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onJobStart(EventLoggingListener.scala:173)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:34)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 20 more
17/04/21 08:12:28 INFO MemoryStore: Block broadcast_4 stored as values in memory (estimated size 5.3 KB, free 755.3 KB)
17/04/21 08:12:28 INFO MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 2.8 KB, free 758.2 KB)
17/04/21 08:12:28 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on 192.168.2.86:32844 (size: 2.8 KB, free: 1140.3 MB)
17/04/21 08:12:28 INFO SparkContext: Created broadcast 4 from broadcast at DAGScheduler.scala:1008
17/04/21 08:12:28 INFO DAGScheduler: Submitting 14 missing tasks from ShuffleMapStage 2 (MapPartitionsRDD[3] at mapToPair at BwaInterpreter.java:152)
17/04/21 08:12:28 INFO YarnClusterScheduler: Adding task set 2.0 with 14 tasks
17/04/21 08:12:28 INFO DAGScheduler: Submitting ShuffleMapStage 3 (MapPartitionsRDD[10] at mapToPair at BwaInterpreter.java:152), which has no missing parents
17/04/21 08:12:28 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 26, slave1.hdp, partition 0,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:28 INFO TaskSetManager: Starting task 1.0 in stage 2.0 (TID 27, slave2.hdp, partition 1,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:28 INFO TaskSetManager: Starting task 2.0 in stage 2.0 (TID 28, slave.hdp, partition 2,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:28 INFO MemoryStore: Block broadcast_5 stored as values in memory (estimated size 5.3 KB, free 763.4 KB)
17/04/21 08:12:28 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on slave1.hdp:38864 (size: 2.8 KB, free: 7.0 GB)
17/04/21 08:12:28 INFO MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 2.8 KB, free 766.3 KB)
17/04/21 08:12:28 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on 192.168.2.86:32844 (size: 2.8 KB, free: 1140.3 MB)
17/04/21 08:12:28 INFO SparkContext: Created broadcast 5 from broadcast at DAGScheduler.scala:1008
17/04/21 08:12:28 INFO DAGScheduler: Submitting 14 missing tasks from ShuffleMapStage 3 (MapPartitionsRDD[10] at mapToPair at BwaInterpreter.java:152)
17/04/21 08:12:28 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on slave.hdp:43602 (size: 2.8 KB, free: 7.0 GB)
17/04/21 08:12:28 INFO YarnClusterScheduler: Adding task set 3.0 with 14 tasks
17/04/21 08:12:28 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on slave2.hdp:46273 (size: 2.8 KB, free: 7.0 GB)
17/04/21 08:12:34 INFO TaskSetManager: Starting task 3.0 in stage 2.0 (TID 29, slave2.hdp, partition 3,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:34 INFO TaskSetManager: Finished task 1.0 in stage 2.0 (TID 27) in 6478 ms on slave2.hdp (1/14)
17/04/21 08:12:36 INFO TaskSetManager: Starting task 4.0 in stage 2.0 (TID 30, slave.hdp, partition 4,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:36 INFO TaskSetManager: Finished task 2.0 in stage 2.0 (TID 28) in 8047 ms on slave.hdp (2/14)
17/04/21 08:12:36 INFO TaskSetManager: Starting task 5.0 in stage 2.0 (TID 31, slave1.hdp, partition 5,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:36 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 26) in 8334 ms on slave1.hdp (3/14)
17/04/21 08:12:42 INFO TaskSetManager: Starting task 6.0 in stage 2.0 (TID 32, slave2.hdp, partition 6,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:42 INFO TaskSetManager: Finished task 3.0 in stage 2.0 (TID 29) in 7902 ms on slave2.hdp (4/14)
17/04/21 08:12:44 INFO TaskSetManager: Starting task 7.0 in stage 2.0 (TID 33, slave.hdp, partition 7,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:45 INFO TaskSetManager: Finished task 4.0 in stage 2.0 (TID 30) in 8723 ms on slave.hdp (5/14)
17/04/21 08:12:45 INFO TaskSetManager: Starting task 8.0 in stage 2.0 (TID 34, slave1.hdp, partition 8,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:45 INFO TaskSetManager: Finished task 5.0 in stage 2.0 (TID 31) in 8952 ms on slave1.hdp (6/14)
17/04/21 08:12:50 INFO TaskSetManager: Starting task 9.0 in stage 2.0 (TID 35, slave2.hdp, partition 9,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:50 INFO TaskSetManager: Finished task 6.0 in stage 2.0 (TID 32) in 8277 ms on slave2.hdp (7/14)
17/04/21 08:12:51 INFO TaskSetManager: Starting task 10.0 in stage 2.0 (TID 36, slave1.hdp, partition 10,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:51 INFO TaskSetManager: Finished task 8.0 in stage 2.0 (TID 34) in 6232 ms on slave1.hdp (8/14)
17/04/21 08:12:53 INFO TaskSetManager: Starting task 11.0 in stage 2.0 (TID 37, slave.hdp, partition 11,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:53 INFO TaskSetManager: Finished task 7.0 in stage 2.0 (TID 33) in 8497 ms on slave.hdp (9/14)
17/04/21 08:12:59 INFO TaskSetManager: Starting task 12.0 in stage 2.0 (TID 38, slave2.hdp, partition 12,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:59 INFO TaskSetManager: Finished task 9.0 in stage 2.0 (TID 35) in 8242 ms on slave2.hdp (10/14)
17/04/21 08:12:59 INFO TaskSetManager: Starting task 13.0 in stage 2.0 (TID 39, slave1.hdp, partition 13,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:59 INFO TaskSetManager: Finished task 10.0 in stage 2.0 (TID 36) in 7685 ms on slave1.hdp (11/14)
17/04/21 08:13:01 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 40, slave.hdp, partition 0,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:01 INFO TaskSetManager: Finished task 11.0 in stage 2.0 (TID 37) in 8151 ms on slave.hdp (12/14)
17/04/21 08:13:01 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on slave.hdp:43602 (size: 2.8 KB, free: 7.0 GB)
17/04/21 08:13:03 INFO TaskSetManager: Starting task 1.0 in stage 3.0 (TID 41, slave1.hdp, partition 1,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:03 INFO TaskSetManager: Finished task 13.0 in stage 2.0 (TID 39) in 3730 ms on slave1.hdp (13/14)
17/04/21 08:13:03 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on slave1.hdp:38864 (size: 2.8 KB, free: 7.0 GB)
17/04/21 08:13:08 INFO TaskSetManager: Starting task 2.0 in stage 3.0 (TID 42, slave2.hdp, partition 2,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:08 INFO TaskSetManager: Finished task 12.0 in stage 2.0 (TID 38) in 9660 ms on slave2.hdp (14/14)
17/04/21 08:13:08 INFO YarnClusterScheduler: Removed TaskSet 2.0, whose tasks have all completed, from pool
17/04/21 08:13:08 INFO DAGScheduler: ShuffleMapStage 2 (mapToPair at BwaInterpreter.java:152) finished in 40.556 s
17/04/21 08:13:08 INFO DAGScheduler: looking for newly runnable stages
17/04/21 08:13:08 INFO DAGScheduler: running: Set(ShuffleMapStage 3)
17/04/21 08:13:08 INFO DAGScheduler: waiting: Set(ResultStage 5, ShuffleMapStage 4)
17/04/21 08:13:08 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onStageCompleted(EventLoggingListener.scala:170)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:32)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 20 more
17/04/21 08:13:08 INFO DAGScheduler: failed: Set()
17/04/21 08:13:08 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on slave2.hdp:46273 (size: 2.8 KB, free: 7.0 GB)
17/04/21 08:13:09 INFO TaskSetManager: Starting task 3.0 in stage 3.0 (TID 43, slave.hdp, partition 3,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:09 INFO TaskSetManager: Finished task 0.0 in stage 3.0 (TID 40) in 8088 ms on slave.hdp (1/14)
17/04/21 08:13:10 INFO TaskSetManager: Starting task 4.0 in stage 3.0 (TID 44, slave1.hdp, partition 4,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:10 INFO TaskSetManager: Finished task 1.0 in stage 3.0 (TID 41) in 7719 ms on slave1.hdp (2/14)
17/04/21 08:13:16 INFO TaskSetManager: Starting task 5.0 in stage 3.0 (TID 45, slave.hdp, partition 5,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:16 INFO TaskSetManager: Finished task 3.0 in stage 3.0 (TID 43) in 6821 ms on slave.hdp (3/14)
17/04/21 08:13:16 INFO TaskSetManager: Starting task 6.0 in stage 3.0 (TID 46, slave2.hdp, partition 6,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:16 INFO TaskSetManager: Finished task 2.0 in stage 3.0 (TID 42) in 7732 ms on slave2.hdp (4/14)
17/04/21 08:13:17 INFO TaskSetManager: Starting task 7.0 in stage 3.0 (TID 47, slave1.hdp, partition 7,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:17 INFO TaskSetManager: Finished task 4.0 in stage 3.0 (TID 44) in 6100 ms on slave1.hdp (5/14)
17/04/21 08:13:24 INFO TaskSetManager: Starting task 8.0 in stage 3.0 (TID 48, slave.hdp, partition 8,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:24 INFO TaskSetManager: Finished task 5.0 in stage 3.0 (TID 45) in 7780 ms on slave.hdp (6/14)
17/04/21 08:13:24 INFO TaskSetManager: Starting task 9.0 in stage 3.0 (TID 49, slave2.hdp, partition 9,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:24 INFO TaskSetManager: Finished task 6.0 in stage 3.0 (TID 46) in 7916 ms on slave2.hdp (7/14)
17/04/21 08:13:25 INFO TaskSetManager: Starting task 10.0 in stage 3.0 (TID 50, slave1.hdp, partition 10,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:25 INFO TaskSetManager: Finished task 7.0 in stage 3.0 (TID 47) in 8519 ms on slave1.hdp (8/14)
17/04/21 08:13:31 INFO TaskSetManager: Starting task 11.0 in stage 3.0 (TID 51, slave.hdp, partition 11,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:31 INFO TaskSetManager: Finished task 8.0 in stage 3.0 (TID 48) in 7668 ms on slave.hdp (9/14)
17/04/21 08:13:32 INFO TaskSetManager: Starting task 12.0 in stage 3.0 (TID 52, slave2.hdp, partition 12,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:32 INFO TaskSetManager: Finished task 9.0 in stage 3.0 (TID 49) in 7587 ms on slave2.hdp (10/14)
17/04/21 08:13:33 INFO TaskSetManager: Starting task 13.0 in stage 3.0 (TID 53, slave1.hdp, partition 13,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:33 INFO TaskSetManager: Finished task 10.0 in stage 3.0 (TID 50) in 8168 ms on slave1.hdp (11/14)
17/04/21 08:13:37 INFO TaskSetManager: Finished task 13.0 in stage 3.0 (TID 53) in 4209 ms on slave1.hdp (12/14)
17/04/21 08:13:39 INFO TaskSetManager: Finished task 11.0 in stage 3.0 (TID 51) in 7536 ms on slave.hdp (13/14)
17/04/21 08:13:40 INFO TaskSetManager: Finished task 12.0 in stage 3.0 (TID 52) in 7997 ms on slave2.hdp (14/14)
17/04/21 08:13:40 INFO DAGScheduler: ShuffleMapStage 3 (mapToPair at BwaInterpreter.java:152) finished in 71.731 s
17/04/21 08:13:40 INFO DAGScheduler: looking for newly runnable stages
17/04/21 08:13:40 INFO YarnClusterScheduler: Removed TaskSet 3.0, whose tasks have all completed, from pool
17/04/21 08:13:40 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onStageCompleted(EventLoggingListener.scala:170)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:32)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 20 more
17/04/21 08:13:40 INFO DAGScheduler: running: Set()
17/04/21 08:13:40 INFO DAGScheduler: waiting: Set(ResultStage 5, ShuffleMapStage 4)
17/04/21 08:13:40 INFO DAGScheduler: failed: Set()
17/04/21 08:13:40 INFO DAGScheduler: Submitting ShuffleMapStage 4 (MapPartitionsRDD[17] at repartition at BwaInterpreter.java:281), which has no missing parents
17/04/21 08:13:40 INFO MemoryStore: Block broadcast_6 stored as values in memory (estimated size 8.3 KB, free 774.6 KB)
17/04/21 08:13:40 INFO MemoryStore: Block broadcast_6_piece0 stored as bytes in memory (estimated size 3.9 KB, free 778.5 KB)
17/04/21 08:13:40 INFO BlockManagerInfo: Added broadcast_6_piece0 in memory on 192.168.2.86:32844 (size: 3.9 KB, free: 1140.3 MB)
17/04/21 08:13:40 INFO SparkContext: Created broadcast 6 from broadcast at DAGScheduler.scala:1008
17/04/21 08:13:40 INFO DAGScheduler: Submitting 14 missing tasks from ShuffleMapStage 4 (MapPartitionsRDD[17] at repartition at BwaInterpreter.java:281)
17/04/21 08:13:40 INFO YarnClusterScheduler: Adding task set 4.0 with 14 tasks
17/04/21 08:13:40 INFO TaskSetManager: Starting task 0.0 in stage 4.0 (TID 54, slave1.hdp, partition 0,NODE_LOCAL, 2132 bytes)
17/04/21 08:13:40 INFO TaskSetManager: Starting task 1.0 in stage 4.0 (TID 55, slave2.hdp, partition 1,NODE_LOCAL, 2132 bytes)
17/04/21 08:13:40 INFO TaskSetManager: Starting task 2.0 in stage 4.0 (TID 56, slave.hdp, partition 2,NODE_LOCAL, 2132 bytes)
17/04/21 08:13:40 INFO BlockManagerInfo: Added broadcast_6_piece0 in memory on slave1.hdp:38864 (size: 3.9 KB, free: 7.0 GB)
17/04/21 08:13:40 INFO BlockManagerInfo: Added broadcast_6_piece0 in memory on slave2.hdp:46273 (size: 3.9 KB, free: 7.0 GB)
17/04/21 08:13:40 INFO BlockManagerInfo: Added broadcast_6_piece0 in memory on slave.hdp:43602 (size: 3.9 KB, free: 7.0 GB)
17/04/21 08:13:40 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 2 to slave2.hdp:49614
17/04/21 08:13:40 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 2 is 207 bytes
17/04/21 08:13:40 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 2 to slave.hdp:50548
17/04/21 08:13:40 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 2 to slave1.hdp:41948
17/04/21 08:13:47 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to slave.hdp:50548
17/04/21 08:13:47 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 1 is 188 bytes
17/04/21 08:13:48 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to slave1.hdp:41948
17/04/21 08:13:48 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to slave2.hdp:49614
17/04/21 08:14:04 INFO TaskSetManager: Starting task 3.0 in stage 4.0 (TID 57, slave.hdp, partition 3,NODE_LOCAL, 2132 bytes)
17/04/21 08:14:04 INFO TaskSetManager: Finished task 2.0 in stage 4.0 (TID 56) in 24645 ms on slave.hdp (1/14)
17/04/21 08:14:06 INFO TaskSetManager: Starting task 4.0 in stage 4.0 (TID 58, slave2.hdp, partition 4,NODE_LOCAL, 2132 bytes)
17/04/21 08:14:06 INFO TaskSetManager: Finished task 1.0 in stage 4.0 (TID 55) in 26426 ms on slave2.hdp (2/14)
17/04/21 08:14:07 INFO TaskSetManager: Starting task 5.0 in stage 4.0 (TID 59, slave1.hdp, partition 5,NODE_LOCAL, 2132 bytes)
17/04/21 08:14:07 INFO TaskSetManager: Finished task 0.0 in stage 4.0 (TID 54) in 27388 ms on slave1.hdp (3/14)
17/04/21 08:14:32 INFO TaskSetManager: Starting task 6.0 in stage 4.0 (TID 60, slave.hdp, partition 6,NODE_LOCAL, 2132 bytes)
17/04/21 08:14:32 INFO TaskSetManager: Finished task 3.0 in stage 4.0 (TID 57) in 27494 ms on slave.hdp (4/14)
17/04/21 08:14:38 INFO TaskSetManager: Starting task 7.0 in stage 4.0 (TID 61, slave2.hdp, partition 7,NODE_LOCAL, 2132 bytes)
17/04/21 08:14:38 INFO TaskSetManager: Finished task 4.0 in stage 4.0 (TID 58) in 31786 ms on slave2.hdp (5/14)
17/04/21 08:14:39 INFO TaskSetManager: Starting task 8.0 in stage 4.0 (TID 62, slave1.hdp, partition 8,NODE_LOCAL, 2132 bytes)
17/04/21 08:14:39 INFO TaskSetManager: Finished task 5.0 in stage 4.0 (TID 59) in 31780 ms on slave1.hdp (6/14)
17/04/21 08:14:57 INFO TaskSetManager: Starting task 9.0 in stage 4.0 (TID 63, slave.hdp, partition 9,NODE_LOCAL, 2132 bytes)
17/04/21 08:14:57 INFO TaskSetManager: Finished task 6.0 in stage 4.0 (TID 60) in 25588 ms on slave.hdp (7/14)
17/04/21 08:15:07 INFO TaskSetManager: Starting task 10.0 in stage 4.0 (TID 64, slave2.hdp, partition 10,NODE_LOCAL, 2132 bytes)
17/04/21 08:15:08 INFO TaskSetManager: Finished task 7.0 in stage 4.0 (TID 61) in 29562 ms on slave2.hdp (8/14)
17/04/21 08:15:28 INFO TaskSetManager: Starting task 11.0 in stage 4.0 (TID 65, slave1.hdp, partition 11,NODE_LOCAL, 2132 bytes)
17/04/21 08:15:28 INFO TaskSetManager: Finished task 8.0 in stage 4.0 (TID 62) in 49647 ms on slave1.hdp (9/14)
17/04/21 08:15:35 INFO TaskSetManager: Starting task 12.0 in stage 4.0 (TID 66, slave.hdp, partition 12,NODE_LOCAL, 2132 bytes)
17/04/21 08:15:35 INFO TaskSetManager: Finished task 9.0 in stage 4.0 (TID 63) in 38007 ms on slave.hdp (10/14)
17/04/21 08:15:40 INFO TaskSetManager: Starting task 13.0 in stage 4.0 (TID 67, slave2.hdp, partition 13,NODE_LOCAL, 2132 bytes)
17/04/21 08:15:40 INFO TaskSetManager: Finished task 10.0 in stage 4.0 (TID 64) in 32524 ms on slave2.hdp (11/14)
17/04/21 08:17:17 INFO TaskSetManager: Finished task 13.0 in stage 4.0 (TID 67) in 97403 ms on slave2.hdp (12/14)
17/04/21 08:17:20 INFO TaskSetManager: Finished task 12.0 in stage 4.0 (TID 66) in 104233 ms on slave.hdp (13/14)
17/04/21 08:18:24 INFO TaskSetManager: Finished task 11.0 in stage 4.0 (TID 65) in 175387 ms on slave1.hdp (14/14)
17/04/21 08:18:24 INFO DAGScheduler: ShuffleMapStage 4 (repartition at BwaInterpreter.java:281) finished in 284.201 s
17/04/21 08:18:24 INFO YarnClusterScheduler: Removed TaskSet 4.0, whose tasks have all completed, from pool
17/04/21 08:18:24 INFO DAGScheduler: looking for newly runnable stages
17/04/21 08:18:24 INFO DAGScheduler: running: Set()
17/04/21 08:18:24 INFO DAGScheduler: waiting: Set(ResultStage 5)
17/04/21 08:18:24 INFO DAGScheduler: failed: Set()
17/04/21 08:18:24 INFO DAGScheduler: Submitting ResultStage 5 (MapPartitionsRDD[22] at mapPartitionsWithIndex at BwaInterpreter.java:304), which has no missing parents
17/04/21 08:18:24 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onStageCompleted(EventLoggingListener.scala:170)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:32)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 20 more

This is the command I am running:
/usr/bin/spark-submit --class com.github.sparkbwa.SparkBWA --master yarn-cluster --driver-memory 2g --executor-memory 10g --executor-cores 1 --verbose --num-executors 32 /home/SparkBWA/target/SparkBWA-0.2.jar -m -r -p --index hdfs://master.hdp:8020/Data/HumanBase/hg19.fa -n 32 -w "-R @rg\tID:foo\tLB:bar\tPL:illumina\tPU:illumina\tSM:ERR000589" hdfs://master.hdp:8020/SparkBWA/ERR000589_1.filt.fastq hdfs://master.hdp:8020/SparkBWA/ERR000589_2.filt.fastq hdfs://master.hdp:8020/sample/output

from sparkbwa.

xubo245 avatar xubo245 commented on August 22, 2024

You can try running in local mode first, and the number of executors should be less than the number of workers.
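
For example, with the three workers visible in the log above (slave.hdp, slave1.hdp and slave2.hdp), a sketch of a submission that keeps the executor count at or below the worker count could look like this (only --num-executors is changed with respect to the command above; the value is an assumption, not a tested setting):

    /usr/bin/spark-submit --class com.github.sparkbwa.SparkBWA --master yarn-cluster \
        --driver-memory 2g --executor-memory 10g --executor-cores 1 --num-executors 3 \
        /home/SparkBWA/target/SparkBWA-0.2.jar -m -r -p \
        --index hdfs://master.hdp:8020/Data/HumanBase/hg19.fa -n 32 \
        -w "-R @rg\tID:foo\tLB:bar\tPL:illumina\tPU:illumina\tSM:ERR000589" \
        hdfs://master.hdp:8020/SparkBWA/ERR000589_1.filt.fastq \
        hdfs://master.hdp:8020/SparkBWA/ERR000589_2.filt.fastq \
        hdfs://master.hdp:8020/sample/output

To test locally first, --master yarn-cluster could be replaced with --master local[*]; again, this is only a suggestion to try, not a verified configuration.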

from sparkbwa.

jmabuin avatar jmabuin commented on August 22, 2024

Just to be clear, what do you mean when you say "non-hdfs environment"?

from sparkbwa.

jmabuin avatar jmabuin commented on August 22, 2024

I was taking a look at your execution command. The index must be on local disk, not in HDFS, and it must be available on all computing nodes in your cluster.
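
For illustration only, that means the --index argument should point at a plain filesystem path that exists on every node, e.g. (the /opt/genomes directory is hypothetical):

    --index /opt/genomes/hg19.fa

instead of --index hdfs://master.hdp:8020/Data/HumanBase/hg19.fa, with the hg19 index files copied to that same location on the master and on every slave.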

from sparkbwa.

jmabuin avatar jmabuin commented on August 22, 2024

We have finally been able to reproduce the empty SAM file error. It happens because BWA is not finding the index. In the YARN web interface the job appears to finish correctly, but internally the Spark executors are failing.

You can check whether this is happening to you by checking the executor logs:
yarn logs -applicationId your-app-id

You should find some kind of BWA error in the executor logs.
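
For example (a sketch; your-app-id is a placeholder for the real application id reported by YARN):

    yarn logs -applicationId your-app-id > app.log
    grep -iE "bwa|error" app.log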

from sparkbwa.
