
microsoft-academic-graph-usql-samples's Introduction

page_type: sample
languages: usql
products: azure
description: U-SQL samples for analyzing data in Microsoft Academic Graph.
urlFragment: microsoft-academic-graph-usql-samples

Microsoft Academic Graph U-SQL Samples

U-SQL samples for analyzing data in Microsoft Academic Graph

Getting Started

Prerequisites

  1. Get started with Microsoft Academic Graph on Azure storage
  2. Get started using Microsoft Academic Graph on Azure Data Lake Analytics
  3. Get started with Power BI

Quickstart

  1. Download or clone the repository.
  2. Run the scripts in the Azure portal.

Resources

  1. Microsoft Academic Graph documentation
  2. Microsoft Academic Graph PySpark Samples

microsoft-academic-graph-usql-samples's People

Contributors

darrineide, microsoftopensource, msftgits, supernova-eng, v-rajagt-zz, wuchiehhan

microsoft-academic-graph-usql-samples's Issues

Create Database usql fails on partition SV49 (Papers.txt)

The U-SQL sample script that builds the MAG database in Visual Studio 2017 (the Data Lake approach) fails on the Papers table. The error messages are provided below. Judging from the examples, the cause appears to be some kind of text spillage across columns (?).

I have submitted the job several times with the same result, so the problem is reproducible. It may result from the 2019-01-25 build rather than from the script itself. Would you mind taking a look at this? I am keen to use MAG long term if I can get past the build problem. Full details below. Many thanks for your help. Paul.

  • [x] bug report -> please search issues before submitting
  • [ ] feature request
  • [ ] documentation issue or request
  • [x?] regression (a behavior that used to work and stopped in a new release)

### Minimal steps to reproduce
> Build and submit CreateDatabase.usql in Visual Studio 2017, following the instructions for Visual Studio and using the Microsoft Academic 2019 blob. The job fails on Papers.txt.

### Any log messages given by the failure
>  Example 1
[
  {
    "errorSource": "User",
    "errorId": "VertexFailedFast",
    "errorFields": [
      {
        "fieldName": "SEVERITY",
        "fieldValue": "Error"
      },
      {
        "fieldName": "DESCRIPTION",
        "fieldValue": "Vertex failure triggered quick job abort. Vertex failed: SV49_Extract_Partition[4] with error: Vertex user code error."
      },
      {
        "fieldName": "MESSAGE",
        "fieldValue": "Vertex failed with a fail-fast error"
      }
    ],
    "innerError": {
      "errorSource": "User",
      "errorId": "E_RUNTIME_USER_EXTRACT_ROW_ERROR",
      "errorFields": [
        {
          "fieldName": "SEVERITY",
          "fieldValue": "Error"
        },
        {
          "fieldName": "COMPONENT",
          "fieldValue": "RUNTIME"
        },
        {
          "fieldName": "MESSAGE",
          "fieldValue": "Error occurred while extracting row after processing 4 record(s) in the vertex' input split. Column index: 7, column name: 'Year'."
        },
        {
          "fieldName": "DIAGNOSTICCODE",
          "fieldValue": "195887152"
        }
      ],
      "innerError": {
        "errorSource": "User",
        "errorId": "E_RUNTIME_USER_EXTRACT_COLUMN_CONVERSION_EMPTY_ERROR",
        "errorFields": [
          {
            "fieldName": "SEVERITY",
            "fieldValue": "Error"
          },
          {
            "fieldName": "COMPONENT",
            "fieldValue": "RUNTIME"
          },
          {
            "fieldName": "DESCRIPTION",
            "fieldValue": "Can not convert EMPTY string to proper type."
          },
          {
            "fieldName": "MESSAGE",
            "fieldValue": "Failure when attempting to convert empty column data."
          },
          {
            "fieldName": "DETAILS",
            "fieldValue": "Row Delimiter: 0x0\nColumn Delimiter: 0x9\nHEX: 6C 64 20 61 6E 64 20 75 6E 73 6F 6C 64 20 3A 20 61 6C 73 6F 20 61 6C 6C 20 62 75 69 6C 64 69 6E 67 73 20 61 6E 64 20 6F 74 68 65 72 20 69 6D 70 72 6F 76 65 6D 65 6E 74 73 09 09 09 ### 09 09 09 09 09 09 09 09 09 30\nTEXT: ld and unsold : also all buildings and other improvements\\t\\t\\t ### \\t\\t\\t\\t\\t\\t\\t\\t\\t0\n"
          },
          {
            "fieldName": "DIAGNOSTICCODE",
            "fieldValue": "195887144"
          },
          {
            "fieldName": "RESOLUTION",
            "fieldValue": "Check the input for errors and make sure that the target type in your EXTRACT schema is a nullable type such as int?, or use \"silent\" switch to ignore conversion errors for nullable types.\nConsider that ignoring \"invalid\" rows may influence job results and that types have to be nullable for conversion errors to be ignored."
          }
        ],
        "innerError": null
      }
    }
  }
]

Example 2

{
  "errorSource": "User",
  "errorId": "E_RUNTIME_USER_EXTRACT_COLUMN_CONVERSION_EMPTY_ERROR",
  "errorFields": [
    {
      "fieldName": "SEVERITY",
      "fieldValue": "Error"
    },
    {
      "fieldName": "COMPONENT",
      "fieldValue": "RUNTIME"
    },
    {
      "fieldName": "DESCRIPTION",
      "fieldValue": "Can not convert EMPTY string to proper type."
    },
    {
      "fieldName": "MESSAGE",
      "fieldValue": "Failure when attempting to convert empty column data."
    },
    {
      "fieldName": "DETAILS",
      "fieldValue": "Row Delimiter: 0x0\nColumn Delimiter: 0x9\nHEX: 20 2D 20 53 69 65 72 72 61 20 4E 65 76 61 64 61 20 45 63 6F 72 65 67 69 6F 6E 20 2D 20 53 4D 50 2F 45 42 4D 20 5B 64 73 31 30 36 33 5D 20 47 49 53 20 44 61 74 61 73 65 74 09 09 09 ### 09 09 09 09 09 09 09 09 09 30\nTEXT:  - Sierra Nevada Ecoregion - SMP/EBM [ds1063] GIS Dataset\\t\\t\\t ### \\t\\t\\t\\t\\t\\t\\t\\t\\t0\n"
    },
    {
      "fieldName": "DIAGNOSTICCODE",
      "fieldValue": "195887144"
    },
    {
      "fieldName": "RESOLUTION",
      "fieldValue": "Check the input for errors and make sure that the target type in your EXTRACT schema is a nullable type such as int?, or use \"silent\" switch to ignore conversion errors for nullable types.\nConsider that ignoring \"invalid\" rows may influence job results and that types have to be nullable for conversion errors to be ignored."
    }
  ],
  "innerError": null
}


### Expected/desired behavior
> Script Builds database

### OS and Version?
> Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Windows Data Science VM running Visual Studio 2017

### Versions
>

### Mention any other details that might be useful

I ran the job multiple times. It fails on the same table each time, but the details of the error vary.
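
The RESOLUTION field in the logs above suggests checking the input for errors. A minimal Python sketch of such a check follows. It is not part of the sample scripts. The function name `find_empty_column_rows` is hypothetical, and the sketch assumes tab-delimited, newline-terminated rows and that the `Year` column sits at zero-based index 7 (taken from the error message, which may count columns differently).

```python
import io
from typing import Iterable, Iterator, Tuple


def find_empty_column_rows(
    lines: Iterable[str], column_index: int = 7
) -> Iterator[Tuple[int, str]]:
    """Yield (1-based row number, raw row) where the column is missing or empty.

    column_index=7 is an assumption based on "Column index: 7" in the error log.
    """
    for row_number, line in enumerate(lines, start=1):
        row = line.rstrip("\r\n")
        fields = row.split("\t")  # assumed column delimiter: tab (0x9 in the log)
        if column_index >= len(fields) or fields[column_index] == "":
            yield row_number, row


if __name__ == "__main__":
    # Tiny in-memory stand-in for a file such as Papers.txt (made-up rows).
    sample = io.StringIO(
        "1\ta\tb\tc\td\te\tf\t2015\tx\n"
        "2\ta\tb\tc\td\te\tf\t\tx\n"
    )
    for row_number, row in find_empty_column_rows(sample):
        print(row_number, repr(row))
```

To scan a real file, pass an open text-mode file object instead of the in-memory sample. Any rows it reports would be candidates for the empty-`Year` conversion error shown in the logs.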
