The U-SQL sample script that builds the MAG database in Visual Studio 2017 (Data Lake approach) fails on the Papers table. The error messages are provided below. Judging from the two examples, the cause appears to be some kind of text spillage across columns (?): in both, a title-like string is followed by a run of empty tab-separated fields, and the extractor reports that it cannot convert an empty string (Example 1 names column 'Year').
I have submitted the job several times and it fails each time, so the problem is reproducible. It may result from the 2019-01-25 build rather than from the script itself. Would you mind taking a look? I am keen to use MAG long term if I can get past this build problem. Full details are below. Many thanks for your help. Paul.
### Minimal steps to reproduce
> Build and submit CreateDatabase.usql in Visual Studio 2017, following the instructions for Visual Studio and using the Microsoft Academic 2019 blob. The job fails on Papers.txt.
### Any log messages given by the failure
> Example 1
[
{
"errorSource": "User",
"errorId": "VertexFailedFast",
"errorFields": [
{
"fieldName": "SEVERITY",
"fieldValue": "Error"
},
{
"fieldName": "DESCRIPTION",
"fieldValue": "Vertex failure triggered quick job abort. Vertex failed: SV49_Extract_Partition[4] with error: Vertex user code error."
},
{
"fieldName": "MESSAGE",
"fieldValue": "Vertex failed with a fail-fast error"
}
],
"innerError": {
"errorSource": "User",
"errorId": "E_RUNTIME_USER_EXTRACT_ROW_ERROR",
"errorFields": [
{
"fieldName": "SEVERITY",
"fieldValue": "Error"
},
{
"fieldName": "COMPONENT",
"fieldValue": "RUNTIME"
},
{
"fieldName": "MESSAGE",
"fieldValue": "Error occurred while extracting row after processing 4 record(s) in the vertex' input split. Column index: 7, column name: 'Year'."
},
{
"fieldName": "DIAGNOSTICCODE",
"fieldValue": "195887152"
}
],
"innerError": {
"errorSource": "User",
"errorId": "E_RUNTIME_USER_EXTRACT_COLUMN_CONVERSION_EMPTY_ERROR",
"errorFields": [
{
"fieldName": "SEVERITY",
"fieldValue": "Error"
},
{
"fieldName": "COMPONENT",
"fieldValue": "RUNTIME"
},
{
"fieldName": "DESCRIPTION",
"fieldValue": "Can not convert EMPTY string to proper type."
},
{
"fieldName": "MESSAGE",
"fieldValue": "Failure when attempting to convert empty column data."
},
{
"fieldName": "DETAILS",
"fieldValue": "Row Delimiter: 0x0\nColumn Delimiter: 0x9\nHEX: 6C 64 20 61 6E 64 20 75 6E 73 6F 6C 64 20 3A 20 61 6C 73 6F 20 61 6C 6C 20 62 75 69 6C 64 69 6E 67 73 20 61 6E 64 20 6F 74 68 65 72 20 69 6D 70 72 6F 76 65 6D 65 6E 74 73 09 09 09 ### 09 09 09 09 09 09 09 09 09 30\nTEXT: ld and unsold : also all buildings and other improvements\\t\\t\\t ### \\t\\t\\t\\t\\t\\t\\t\\t\\t0\n"
},
{
"fieldName": "DIAGNOSTICCODE",
"fieldValue": "195887144"
},
{
"fieldName": "RESOLUTION",
"fieldValue": "Check the input for errors and make sure that the target type in your EXTRACT schema is a nullable type such as int?, or use \"silent\" switch to ignore conversion errors for nullable types.\nConsider that ignoring \"invalid\" rows may influence job results and that types have to be nullable for conversion errors to be ignored."
}
],
"innerError": null
}
}
}
]
> Example 2
{
"errorSource": "User",
"errorId": "E_RUNTIME_USER_EXTRACT_COLUMN_CONVERSION_EMPTY_ERROR",
"errorFields": [
{
"fieldName": "SEVERITY",
"fieldValue": "Error"
},
{
"fieldName": "COMPONENT",
"fieldValue": "RUNTIME"
},
{
"fieldName": "DESCRIPTION",
"fieldValue": "Can not convert EMPTY string to proper type."
},
{
"fieldName": "MESSAGE",
"fieldValue": "Failure when attempting to convert empty column data."
},
{
"fieldName": "DETAILS",
"fieldValue": "Row Delimiter: 0x0\nColumn Delimiter: 0x9\nHEX: 20 2D 20 53 69 65 72 72 61 20 4E 65 76 61 64 61 20 45 63 6F 72 65 67 69 6F 6E 20 2D 20 53 4D 50 2F 45 42 4D 20 5B 64 73 31 30 36 33 5D 20 47 49 53 20 44 61 74 61 73 65 74 09 09 09 ### 09 09 09 09 09 09 09 09 09 30\nTEXT: - Sierra Nevada Ecoregion - SMP/EBM [ds1063] GIS Dataset\\t\\t\\t ### \\t\\t\\t\\t\\t\\t\\t\\t\\t0\n"
},
{
"fieldName": "DIAGNOSTICCODE",
"fieldValue": "195887144"
},
{
"fieldName": "RESOLUTION",
"fieldValue": "Check the input for errors and make sure that the target type in your EXTRACT schema is a nullable type such as int?, or use \"silent\" switch to ignore conversion errors for nullable types.\nConsider that ignoring \"invalid\" rows may influence job results and that types have to be nullable for conversion errors to be ignored."
}
],
"innerError": null
}
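
The HEX dumps in the DETAILS fields can be decoded to check the column layout around the failure. The snippet below is a minimal sketch in Python (not part of the sample script); `decode_hex_dump` is a hypothetical helper name, and reading the `###` token as the position where the extractor stopped is an assumption on my part. It decodes the tail of the Example 2 dump and counts the tab delimiters (0x09) that follow the title text.

```python
def decode_hex_dump(hex_dump: str) -> str:
    """Decode a space-separated HEX dump into text.

    Tokens that are not valid hex (such as the '###' marker found in the
    error output) are kept as-is, padded with spaces.
    """
    parts = []
    for token in hex_dump.split():
        try:
            parts.append(bytes.fromhex(token).decode("latin-1"))
        except ValueError:
            parts.append(f" {token} ")
    return "".join(parts)


# Tail of the HEX dump copied from Example 2.
tail = "44 61 74 61 73 65 74 09 09 09 ### 09 09 09 09 09 09 09 09 09 30"
text = decode_hex_dump(tail)
tab_count = text.count("\t")

print(repr(text))
print("tab delimiters after the title text:", tab_count)
```

If the run of empty fields is what it looks like, the row has a string of empty columns after the title, which would be consistent with the empty-string conversion error reported for a non-nullable column.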
### Expected/desired behavior
> The script builds the database.
### OS and Version?
Windows Data Science VM running Visual Studio 2017
### Versions
>
### Mention any other details that might be useful
I ran the job multiple times. It fails on the same table each time, but the details of the error vary.
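
To help narrow this down, the affected rows could be located with a quick scan of a local copy of Papers.txt. The following is a minimal sketch in Python, under several assumptions: rows are newline-terminated, columns are tab-separated (matching the `Column Delimiter: 0x9` in the logs), and the `Column index: 7` in the error message is zero-based. The function name `find_empty_column_rows` and the file path in the usage comment are hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def find_empty_column_rows(
    lines: Iterable[str], column_index: int = 7
) -> Iterator[Tuple[int, int, str]]:
    """Yield (row_number, field_count, preview) for every row whose field at
    column_index is empty or missing.

    column_index is assumed to be zero-based, as in the error message.
    """
    for row_number, line in enumerate(lines, start=1):
        fields = line.rstrip("\r\n").split("\t")
        value = fields[column_index] if column_index < len(fields) else ""
        if value.strip() == "":
            yield row_number, len(fields), line[:80]


# Usage against a local copy of the file (hypothetical path):
# with open("Papers.txt", encoding="utf-8") as handle:
#     for hit in find_empty_column_rows(handle):
#         print(hit)
```

The field count in each hit shows whether a row has fewer columns than expected (a broken row) or the expected number with an empty value (missing data that would need a nullable type, as the RESOLUTION text suggests).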