relational_data_augmentation's Introduction

โ– Must-Read Papers on Tabular Data Augmentation (TDA)

The papers listed here may not be from top venues, and some are not about purely relational data, but all are interesting work related to relational data augmentation and worth reading.

Year 2023

[SIGMOD] SANTOS: Relationship-based Semantic Table Union Search [paper] [official code]

[VLDB] Semantics-aware Dataset Discovery from Data Lakes with Contextualized Column-based Representation Learning [paper] [official code]

[ACL Findings] Automatic Table Union Search with Tabular Representation Learning [paper]

[VLDB] DeepJoin: Joinable Table Discovery with Pre-Trained Language Models [paper]

[SIGMOD Conference Companion] Table Discovery in Data Lakes: State-of-the-art and Future Directions [paper]

[SIGMOD] Watchog: A Light-weight Contrastive Learning based Framework for Column Annotation [paper]

Year 2022

[VLDB] Integrating Data Lake Tables [paper] [official code]

[SIGMOD] Leva: Boosting Machine Learning Performance with Relational Embedding Data Augmentation [paper]

[VLDB] MATE: multi-attribute table extraction [paper] [official code]

[VLDB] Selective data acquisition in the wild for model charging [paper]

[ICDE] Feature Augmentation with Reinforcement Learning [paper]

[VLDB] Coresets over multiple tables for feature-rich and data-efficient machine learning [paper] [official code]

[ICDE] A Sketch-based Index for Correlated Dataset Search [paper]

[TKDE] Data Lake Organization [paper]

[SIGMOD] Annotating Columns with Pre-trained Language Models [paper] [official code]

[WWW] StruBERT: Structure-aware BERT for Table Search and Matching [paper] [official code]

Year 2021

[SIGMOD] Rotom: A Meta-Learned Data Augmentation Framework for Entity Matching, Data Cleaning, Text Classification, and Beyond [paper] [official code]

[ICDE] Efficient Joinable Table Discovery in Data Lakes: A High-Dimensional Similarity-Based Approach [paper]

[VLDB] Automatic data acquisition for deep learning [paper]

[ICDE] Valentine: Evaluating Matching Techniques for Dataset Discovery [paper] [official code]

[VLDB] RONIN: data lake exploration [paper]

Year 2020

[VLDB] ARDA: automatic relational data augmentation for machine learning [paper]

[ICDE] Dataset Discovery in Data Lakes [paper]

[VLDB] Relational data synthesis using generative adversarial networks: a design space exploration [paper] [official code]

[VLDB] Sato: contextual semantic type detection in tables [paper] [official code]

[VLDB] TURL: table understanding through representation learning [paper] [official code]

[WWW] Novel Entity Discovery from Web Tables [paper]

[SIGMOD] Organizing Data Lakes for Navigation [paper]

Year 2019

[SIGMOD] JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes [paper]

[IJCAI] FakeTables: Using GANs to Generate Functional Dependency Preserving Tables with Bounded Real Data [paper]

[SIGIR] Table2Vec: Neural Word and Entity Embeddings for Table Population and Retrieval [paper] [official code]

[SIGKDD] Sherlock: A Deep Learning Approach to Semantic Data Type Detection [paper]

[CIKM] Auto-completion for Data Cells in Relational Tables [paper]

Year 2018

[VLDB] Table union search on open data [paper]

[VLDB] Open data integration [paper]

[ICDE] Aurum: A Data Discovery System [paper] [official code]

[VLDB] Data synthesis based on generative adversarial networks [paper]

Year 2017

[SIGIR] EntiTables: Smart Assistance for Entity-Focused Tables [paper]

Year 2016

[VLDB] LSH ensemble: internet-scale domain search [paper]

Year 2015

[BDC] Towards a Hybrid Imputation Approach Using Web Tables [paper]

Year 2013

[SIGMOD] InfoGather+: semantic matching and annotation of numeric and time-varying attributes in web tables [paper]

Year 2012

[SIGMOD] InfoGather: entity augmentation and attribute discovery by holistic matching with web tables [paper]

[SIGMOD] Finding related tables [paper]

relational_data_augmentation's People

Contributors

ciciliaclx
