

wonderfulping commented on May 28, 2024:
19/07/11 13:57:35 ERROR executor.Executor: Exception in task 3.0 in stage 2.0 (TID 1461)
java.lang.IllegalArgumentException: Wrong FS: file:/usr/local/spark-2.2.0/bin/index_metastore/catalog/parquet/hdfs/user/gsbdc/dbdatas/test/fact_ord_order_test/part-f-00003, expected: hdfs://nameservice1
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:870)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:866)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:866)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:859)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1819)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:141)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:138)
at scala.Option.map(Option.scala:146)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD.compute(ParquetStatisticsRDD.scala:138)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[The identical exception and stack trace repeat for tasks 14.0 (TID 1472), 9.0 (TID 1467), and 4.0 (TID 1462); only the part-f-000NN file name in the "Wrong FS" message changes.]

19/07/11 13:57:35 ERROR scheduler.TaskSetManager: Task 4 in stage 2.0 failed 1 times; aborting job
19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 1.0 in stage 2.0 (TID 1459, localhost, executor driver): java.lang.IllegalArgumentException: Wrong FS: file:/usr/local/spark-2.2.0/bin/index_metastore/catalog/parquet/hdfs/user/gsbdc/dbdatas/test/fact_ord_order_test/part-f-00001, expected: hdfs://nameservice1
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:870)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:866)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:866)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:859)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1819)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:141)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:138)
at scala.Option.map(Option.scala:146)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD.compute(ParquetStatisticsRDD.scala:138)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 12.0 in stage 2.0 (TID 1470, localhost, executor driver): java.lang.IllegalArgumentException: Wrong FS: file:/usr/local/spark-2.2.0/bin/index_metastore/catalog/parquet/hdfs/user/gsbdc/dbdatas/test/fact_ord_order_test/part-f-00012, expected: hdfs://nameservice1
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:870)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:866)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:866)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:859)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1819)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:141)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:138)
at scala.Option.map(Option.scala:146)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD.compute(ParquetStatisticsRDD.scala:138)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 9.0 in stage 2.0 (TID 1467, localhost, executor driver): java.lang.IllegalArgumentException: Wrong FS: file:/usr/local/spark-2.2.0/bin/index_metastore/catalog/parquet/hdfs/user/gsbdc/dbdatas/test/fact_ord_order_test/part-f-00009, expected: hdfs://nameservice1
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:870)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:866)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:866)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:859)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1819)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:141)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:138)
at scala.Option.map(Option.scala:146)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD.compute(ParquetStatisticsRDD.scala:138)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 3.0 in stage 2.0 (TID 1461, localhost, executor driver): java.lang.IllegalArgumentException: Wrong FS: file:/usr/local/spark-2.2.0/bin/index_metastore/catalog/parquet/hdfs/user/gsbdc/dbdatas/test/fact_ord_order_test/part-f-00003, expected: hdfs://nameservice1
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:870)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:866)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:866)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:859)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1819)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:141)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:138)
at scala.Option.map(Option.scala:146)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD.compute(ParquetStatisticsRDD.scala:138)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 2.0 in stage 2.0 (TID 1460, localhost, executor driver): java.lang.IllegalArgumentException: Wrong FS: file:/usr/local/spark-2.2.0/bin/index_metastore/catalog/parquet/hdfs/user/gsbdc/dbdatas/test/fact_ord_order_test/part-f-00002, expected: hdfs://nameservice1
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:870)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:866)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:866)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:859)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1819)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:141)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:138)
at scala.Option.map(Option.scala:146)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD.compute(ParquetStatisticsRDD.scala:138)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 7.0 in stage 2.0 (TID 1465, localhost, executor driver): java.lang.IllegalArgumentException: Wrong FS: file:/usr/local/spark-2.2.0/bin/index_metastore/catalog/parquet/hdfs/user/gsbdc/dbdatas/test/fact_ord_order_test/part-f-00007, expected: hdfs://nameservice1
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:870)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:866)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:866)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:859)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1819)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:141)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:138)
at scala.Option.map(Option.scala:146)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD.compute(ParquetStatisticsRDD.scala:138)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 11.0 in stage 2.0 (TID 1469, localhost, executor driver): java.lang.IllegalArgumentException: Wrong FS: file:/usr/local/spark-2.2.0/bin/index_metastore/catalog/parquet/hdfs/user/gsbdc/dbdatas/test/fact_ord_order_test/part-f-00011, expected: hdfs://nameservice1
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:870)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:866)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:866)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:859)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1819)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:141)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:138)
at scala.Option.map(Option.scala:146)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD.compute(ParquetStatisticsRDD.scala:138)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 14.0 in stage 2.0 (TID 1472, localhost, executor driver): java.lang.IllegalArgumentException: Wrong FS: file:/usr/local/spark-2.2.0/bin/index_metastore/catalog/parquet/hdfs/user/gsbdc/dbdatas/test/fact_ord_order_test/part-f-00014, expected: hdfs://nameservice1
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:870)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:866)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:866)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:859)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1819)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:141)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:138)
at scala.Option.map(Option.scala:146)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD.compute(ParquetStatisticsRDD.scala:138)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 15.0 in stage 2.0 (TID 1473, localhost, executor driver): java.lang.IllegalArgumentException: Wrong FS: file:/usr/local/spark-2.2.0/bin/index_metastore/catalog/parquet/hdfs/user/gsbdc/dbdatas/test/fact_ord_order_test/part-f-00015, expected: hdfs://nameservice1
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:870)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:866)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:866)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:859)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1819)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:141)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:138)
at scala.Option.map(Option.scala:146)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD.compute(ParquetStatisticsRDD.scala:138)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 13.0 in stage 2.0 (TID 1471, localhost, executor driver): java.lang.IllegalArgumentException: Wrong FS: file:/usr/local/spark-2.2.0/bin/index_metastore/catalog/parquet/hdfs/user/gsbdc/dbdatas/test/fact_ord_order_test/part-f-00013, expected: hdfs://nameservice1
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:870)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:866)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:866)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:859)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1819)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:141)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:138)
at scala.Option.map(Option.scala:146)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD.compute(ParquetStatisticsRDD.scala:138)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 10.0 in stage 2.0 (TID 1468, localhost, executor driver): java.lang.IllegalArgumentException: Wrong FS: file:/usr/local/spark-2.2.0/bin/index_metastore/catalog/parquet/hdfs/user/gsbdc/dbdatas/test/fact_ord_order_test/part-f-00010, expected: hdfs://nameservice1
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:870)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:866)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:866)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:859)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1819)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:141)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:138)
at scala.Option.map(Option.scala:146)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD.compute(ParquetStatisticsRDD.scala:138)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 2.0 (TID 1458, localhost, executor driver): java.lang.IllegalArgumentException: Wrong FS: file:/usr/local/spark-2.2.0/bin/index_metastore/catalog/parquet/hdfs/user/gsbdc/dbdatas/test/fact_ord_order_test/part-f-00000, expected: hdfs://nameservice1
(stack trace identical to the one above)

19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 6.0 in stage 2.0 (TID 1464, localhost, executor driver): java.lang.IllegalArgumentException: Wrong FS: file:/usr/local/spark-2.2.0/bin/index_metastore/catalog/parquet/hdfs/user/gsbdc/dbdatas/test/fact_ord_order_test/part-f-00006, expected: hdfs://nameservice1
(stack trace identical to the one above)

19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 5.0 in stage 2.0 (TID 1463, localhost, executor driver): java.lang.IllegalArgumentException: Wrong FS: file:/usr/local/spark-2.2.0/bin/index_metastore/catalog/parquet/hdfs/user/gsbdc/dbdatas/test/fact_ord_order_test/part-f-00005, expected: hdfs://nameservice1
(stack trace identical to the one above)

19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 8.0 in stage 2.0 (TID 1466, localhost, executor driver): java.lang.IllegalArgumentException: Wrong FS: file:/usr/local/spark-2.2.0/bin/index_metastore/catalog/parquet/hdfs/user/gsbdc/dbdatas/test/fact_ord_order_test/part-f-00008, expected: hdfs://nameservice1
(stack trace identical to the one above)

19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 24.0 in stage 2.0 (TID 1482, localhost, executor driver): TaskKilled (stage cancelled)
19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 27.0 in stage 2.0 (TID 1485, localhost, executor driver): TaskKilled (stage cancelled)
19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 23.0 in stage 2.0 (TID 1481, localhost, executor driver): TaskKilled (stage cancelled)
19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 29.0 in stage 2.0 (TID 1487, localhost, executor driver): TaskKilled (stage cancelled)
19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 16.0 in stage 2.0 (TID 1474, localhost, executor driver): TaskKilled (stage cancelled)
19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 25.0 in stage 2.0 (TID 1483, localhost, executor driver): TaskKilled (stage cancelled)
19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 31.0 in stage 2.0 (TID 1489, localhost, executor driver): TaskKilled (stage cancelled)
19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 28.0 in stage 2.0 (TID 1486, localhost, executor driver): TaskKilled (stage cancelled)
19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 20.0 in stage 2.0 (TID 1478, localhost, executor driver): TaskKilled (stage cancelled)
19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 19.0 in stage 2.0 (TID 1477, localhost, executor driver): TaskKilled (stage cancelled)
19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 26.0 in stage 2.0 (TID 1484, localhost, executor driver): TaskKilled (stage cancelled)
19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 17.0 in stage 2.0 (TID 1475, localhost, executor driver): TaskKilled (stage cancelled)
19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 18.0 in stage 2.0 (TID 1476, localhost, executor driver): TaskKilled (stage cancelled)
19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 21.0 in stage 2.0 (TID 1479, localhost, executor driver): TaskKilled (stage cancelled)
19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 22.0 in stage 2.0 (TID 1480, localhost, executor driver): TaskKilled (stage cancelled)
19/07/11 13:57:35 WARN scheduler.TaskSetManager: Lost task 30.0 in stage 2.0 (TID 1488, localhost, executor driver): TaskKilled (stage cancelled)
org.apache.spark.SparkException: Job aborted due to stage failure: Task 4 in stage 2.0 failed 1 times, most recent failure: Lost task 4.0 in stage 2.0 (TID 1462, localhost, executor driver): java.lang.IllegalArgumentException: Wrong FS: file:/usr/local/spark-2.2.0/bin/index_metastore/catalog/parquet/hdfs/user/gsbdc/dbdatas/test/fact_ord_order_test/part-f-00004, expected: hdfs://nameservice1
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:870)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:866)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:866)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:859)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1819)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:141)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD$$anonfun$1.apply(ParquetStatisticsRDD.scala:138)
at scala.Option.map(Option.scala:146)
at org.apache.spark.sql.execution.datasources.parquet.ParquetStatisticsRDD.compute(ParquetStatisticsRDD.scala:138)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1499)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1487)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1486)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1486)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:814)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1714)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1669)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1658)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:630)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2022)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2043)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2062)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2087)
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:936)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
at org.apache.spark.rdd.RDD.collect(RDD.scala:935)
at org.apache.spark.sql.execution.datasources.parquet.ParquetMetastoreSupport.createIndex(ParquetMetastoreSupport.scala:140)
at org.apache.spark.sql.execution.datasources.IndexedDataSource$$anonfun$createIndex$2.apply(IndexedDataSource.scala:108)
at org.apache.spark.sql.execution.datasources.IndexedDataSource$$anonfun$createIndex$2.apply(IndexedDataSource.scala:107)
at org.apache.spark.sql.execution.datasources.Metastore.create(Metastore.scala:162)
at org.apache.spark.sql.execution.datasources.IndexedDataSource.createIndex(IndexedDataSource.scala:107)
at org.apache.spark.sql.CreateIndexCommand.table(DataFrameIndexManager.scala:234)
... 50 elided
Caused by: java.lang.IllegalArgumentException: Wrong FS: file:/usr/local/spark-2.2.0/bin/index_metastore/catalog/parquet/hdfs/user/gsbdc/dbdatas/test/fact_ord_order_test/part-f-00004, expected: hdfs://nameservice1
(stack trace identical to the one above)
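The "Wrong FS" failure comes from Hadoop's `FileSystem.checkPath`: the index metastore resolved to a local `file:` path (the default working directory under the spark-2.2.0 install), while the `DistributedFileSystem` instance is bound to `hdfs://nameservice1`, so the URI schemes don't match. A minimal Python sketch of that scheme check (this is an illustration of the behavior, not Hadoop's actual source):

```python
from urllib.parse import urlparse

def check_path(path: str, expected_fs_uri: str) -> None:
    """Reject a path whose URI scheme differs from the filesystem's.

    Mirrors (loosely) what Hadoop's FileSystem.checkPath does: a filesystem
    bound to hdfs://nameservice1 must not be asked to operate on file: paths.
    """
    path_scheme = urlparse(path).scheme or "file"  # schemeless paths are treated as local here
    expected_scheme = urlparse(expected_fs_uri).scheme
    if path_scheme != expected_scheme:
        raise ValueError(f"Wrong FS: {path}, expected: {expected_fs_uri}")

# An HDFS path against an HDFS filesystem passes:
check_path("hdfs://nameservice1/user/gsbdc/dbdatas/test", "hdfs://nameservice1")

# A local metastore path against the same filesystem fails, as in the logs:
try:
    check_path("file:/usr/local/spark-2.2.0/bin/index_metastore", "hdfs://nameservice1")
except ValueError as e:
    print(e)
```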

from parquet-index.

wonderfulping commented on May 28, 2024

spark-shell --packages lightcopy:parquet-index:0.4.0-s_2.11

import com.github.lightcopy.implicits._

spark.index.create.indexBy("order_id", "user_id", "shop_id").table("default.fact_ord_order_test")

sadikovi commented on May 28, 2024

How did you create the "fact_ord_order_test" table? Can you include the output of the DESCRIBE command?

sadikovi commented on May 28, 2024

Why did you close the issue? Did you manage to resolve it?

wonderfulping commented on May 28, 2024

Yeah, already resolved by pointing the metastore to an HDFS path.
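For reference, parquet-index takes its metastore location from the `spark.sql.index.metastore` option; setting it to an HDFS URI keeps the index on the same filesystem as the table data instead of the local `file:` default. A sketch (the HDFS path below is a hypothetical example, not taken from this cluster):

```shell
# Hypothetical config fragment: put the index metastore on HDFS so it matches
# the table's filesystem (hdfs://nameservice1). Adjust the path as needed.
spark-shell \
  --packages lightcopy:parquet-index:0.4.0-s_2.11 \
  --conf spark.sql.index.metastore=hdfs://nameservice1/user/gsbdc/index_metastore
```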

wonderfulping commented on May 28, 2024

But I don't see enough improvement with an index.

wonderfulping commented on May 28, 2024

SELECT *
FROM dmcrm.fact_user a
LEFT JOIN ensights.l_user_tag_par b
ON a.h_user_key = b.h_user_key
where a.p_data_day = '2019-07-04'
limit 10;

wonderfulping commented on May 28, 2024

I added an index on table ensights.l_user_tag_par and used the indexed column in the JOIN condition; with or without the index, performance is almost the same.

wonderfulping commented on May 28, 2024

[two screenshots attached]
Like this.

wonderfulping commented on May 28, 2024

DDL:

CREATE TABLE ensights.l_user_tag_par (
l_user_tag_key STRING COMMENT 'MD5(upper(tag_id + ‘^’ + user_id))',
h_user_key STRING COMMENT 'MD5(upper(user_id))',
h_tag_key STRING COMMENT 'MD5(upper(id))',
process_id BIGINT,
rec_source STRING COMMENT 'JNDI NAME',
load_time STRING COMMENT 'now()'
)
STORED AS PARQUET
LOCATION 'hdfs://nameservice1/user/hive/warehouse/ensights.db/l_user_tag_par'
TBLPROPERTIES ('COLUMN_STATS_ACCURATE'='false', 'STATS_GENERATED_VIA_STATS_TASK'='true', 'numFiles'='0', 'numRows'='389349169', 'rawDataSize'='-1', 'totalSize'='24210861490')

sadikovi commented on May 28, 2024

It looks like dynamic filter pruning is not supported right now; we can't propagate that date filter from one table over to the other, indexed table. If you add a date filter directly on an indexed column, it should be faster.

But it would be great to fix this, of course.
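In other words, the index only helps when a predicate lands directly on an indexed column; a filter that reaches the table only through the join is not pushed down. A hedged sketch of a query shape that could use the index, assuming the `spark.index.table` reader mirrors the `spark.index.create...table` call used above (the key value is a made-up example):

```scala
// Hedged sketch, not verified against this cluster: filter on an indexed
// column (h_user_key) so parquet-index can prune files, instead of relying
// on the join to carry the predicate across tables.
import com.github.lightcopy.implicits._

val tagged = spark.index.table("ensights.l_user_tag_par")
  .filter($"h_user_key" === "0cc175b9c0f1b6a831c399e269772661") // hypothetical MD5 key

tagged.show()
```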
