Comments (5)
On my first try, the TransactionDate column did not appear under "Mappings" on my Copy activity, so I could not map it from source to destination. I had to re-import the schema from the Parquet file before it showed up. Maybe a bug, maybe a mistake on my part; just a heads-up in case you can repro.
from azure-synapse-analytics-workshop-400.
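For anyone hitting the same gap: re-importing the schema regenerates the column list in the activity's translator. A minimal sketch of what the corrected mapping looks like in the pipeline JSON (TabularTranslator is the standard Copy activity translator type; any column name other than TransactionDate below is a placeholder, not taken from the lab):

```json
{
  "type": "TabularTranslator",
  "mappings": [
    { "source": { "name": "TransactionDate" }, "sink": { "name": "TransactionDate" } },
    { "source": { "name": "ProfitAmount" },    "sink": { "name": "ProfitAmount" } }
  ]
}
```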
After correcting the mapping, I get the following error when I trigger the pipeline run:
{
"errorCode": "2200",
"message": "ErrorCode=ParquetJavaInvocationException,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=An error occurred when invoking java, message: java.lang.OutOfMemoryError:Direct buffer memory\ntotal entry:32\r\njava.nio.Bits.reserveMemory(Bits.java:658)\r\njava.nio.DirectByteBuffer.(DirectByteBuffer.java:123)\r\njava.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)\r\norg.apache.parquet.hadoop.codec.SnappyDecompressor.setInput(SnappyDecompressor.java:102)\r\norg.apache.parquet.hadoop.codec.NonBlockedDecompressorStream.read(NonBlockedDecompressorStream.java:46)\r\njava.io.DataInputStream.readFully(DataInputStream.java:195)\r\njava.io.DataInputStream.readFully(DataInputStream.java:169)\r\norg.apache.parquet.bytes.BytesInput$StreamBytesInput.toByteArray(BytesInput.java:251)\r\norg.apache.parquet.bytes.BytesInput.toByteBuffer(BytesInput.java:202)\r\norg.apache.parquet.column.impl.ColumnReaderImpl.readPageV1(ColumnReaderImpl.java:592)\r\norg.apache.parquet.column.impl.ColumnReaderImpl.access$300(ColumnReaderImpl.java:61)\r\norg.apache.parquet.column.impl.ColumnReaderImpl$3.visit(ColumnReaderImpl.java:541)\r\norg.apache.parquet.column.impl.ColumnReaderImpl$3.visit(ColumnReaderImpl.java:538)\r\norg.apache.parquet.column.page.DataPageV1.accept(DataPageV1.java:96)\r\norg.apache.parquet.column.impl.ColumnReaderImpl.readPage(ColumnReaderImpl.java:538)\r\norg.apache.parquet.column.impl.ColumnReaderImpl.checkRead(ColumnReaderImpl.java:530)\r\norg.apache.parquet.column.impl.ColumnReaderImpl.consume(ColumnReaderImpl.java:642)\r\norg.apache.parquet.column.impl.ColumnReaderImpl.(ColumnReaderImpl.java:358)\r\norg.apache.parquet.column.impl.ColumnReadStoreImpl.newMemColumnReader(ColumnReadStoreImpl.java:82)\r\norg.apache.parquet.column.impl.ColumnReadStoreImpl.getColumnReader(ColumnReadStoreImpl.java:77)\r\norg.apache.parquet.io.RecordReaderImplementation.(RecordReaderImplementation.java:270)\r\norg.apache.parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:140)\r\norg.apache.parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:106)\r\norg.apache.parquet.filter2.compat.FilterCompat$NoOpFilter.accept(FilterCompat.java:154)\r\norg.apache.parquet.io.MessageColumnIO.getRecordReader(MessageColumnIO.java:106)\r\norg.apache.parquet.hadoop.InternalParquetRecordReader.checkRead(InternalParquetRecordReader.java:136)\r\norg.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:194)\r\norg.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:122)\r\norg.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:126)\r\ncom.microsoft.datatransfer.bridge.parquet.ParquetBatchReaderBridge.(ParquetBatchReaderBridge.java:68)\r\ncom.microsoft.datatransfer.bridge.parquet.ParquetBatchReaderBridge.open(ParquetBatchReaderBridge.java:63)\r\ncom.microsoft.datatransfer.bridge.parquet.ParquetFileBridge.createReader(ParquetFileBridge.java:22)\r\n.,Source=Microsoft.DataTransfer.Richfile.ParquetTransferPlugin,''Type=Microsoft.DataTransfer.Richfile.JniExt.JavaBridgeException,Message=,Source=Microsoft.DataTransfer.Richfile.HiveOrcBridge,'",
"failureType": "UserError",
"target": "Copy sales",
"details": []
}
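The trace fails in SnappyDecompressor.setInput calling ByteBuffer.allocateDirect, i.e. the connector reserves a direct buffer for a whole compressed Parquet page at once, and the JVM's direct-memory pool runs out. The general mitigation is bounding buffer sizes (or using smaller files). As a self-contained illustration of the bounded-buffer idea only (using stdlib zlib in place of Snappy, which is a substitution for the sake of a runnable sketch, not what the connector does):

```python
import zlib

def decompress_bounded(compressed: bytes, max_out: int = 64 * 1024) -> int:
    """Stream-decompress, holding at most max_out output bytes at a time.

    Returns the total decompressed size; a real reader would hand each
    chunk to a consumer instead of just counting it.
    """
    d = zlib.decompressobj()
    data = compressed
    total = 0
    while data:
        chunk = d.decompress(data, max_out)  # emit at most max_out bytes
        total += len(chunk)
        data = d.unconsumed_tail             # input zlib has not consumed yet
    total += len(d.flush())                  # drain any buffered remainder
    return total

payload = b"x" * (1 << 20)                   # 1 MiB, highly compressible
print(decompress_bounded(zlib.compress(payload)))  # 1048576
```

Decompressing the whole 1 MiB in one call would need the full payload in memory at once; the capped loop never holds more than 64 KiB of output, which is the same trade the copy service makes when it picks smaller internal buffers.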
On my third attempt, I changed the copy activity's "Data integration unit" setting to "Auto", and the job completed successfully.
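For reference, the DIU setting lives on the Copy activity's typeProperties in the pipeline JSON. A minimal sketch (the source/sink types are guesses based on this lab's Parquet-to-dedicated-pool copy, and 32 is just an example value; omitting dataIntegrationUnits, or picking "Auto" in the Studio UI, lets the service choose):

```json
{
  "name": "Copy sales",
  "type": "Copy",
  "typeProperties": {
    "source": { "type": "ParquetSource" },
    "sink": { "type": "SqlDWSink" },
    "dataIntegrationUnits": 32
  }
}
```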
We have been seeing this out-of-memory error intermittently quite a few times. Our current best guess is that it is related to larger Parquet files.
I have run through this a few times now and have not seen the issue pop up again. Closing.