Code Monkey home page Code Monkey logo

mdb-reader's Introduction

npm tests downloads license semantic-release

MDB Reader

JavaScript library to read data from Access databases.

Installation

npm install mdb-reader

or

yarn add mdb-reader

Compatibility

Node / JavaScript

Access Database versions

  • Access 97 (Jet 3)
  • Access 2000, XP and 2003 (Jet 4)
  • Access 2010 (ACE14)
  • Access 2013 (ACE15)
  • Access 2016 (ACE16)
  • Access 2019 (ACE17)

Encryption

  • Jet
  • Office Agile
  • Office RC4 Crypto API

Dependencies

To decrypt databases, this library requires a few dependencies:

Usage

import { readFileSync } from "fs";
import MDBReader from "mdb-reader";

const buffer = readFileSync("database.mdb");
const reader = new MDBReader(buffer);

reader.getTableNames(); // ['Cats', 'Dogs', 'Cars']

const table = reader.getTable("Cats");
table.getColumnNames(); // ['id', 'name', 'color']
table.getData(); // [{id: 5, name: 'Ashley', color: 'black'}, ...]

Examples

  • browser - Running in the browser with parcel
  • to-json - CLI script that accepts a database file name and outputs the data as JSON
  • sveltekit - Running with SvelteKit

API

MDBReader

class MDBReader {
    /**
     * @param buffer Buffer of the database.
     */
    constructor(
        buffer: Buffer,
        options?: {
            password?: string;
        }
    );

    /**
     * Date when the database was created
     */
    getCreationDate(): Date | null;

    /**
     * Database password
     */
    getPassword(): string | null;

    /**
     * Default sort order
     */
    getDefaultSortOrder(): Readonly<SortOrder>;

    /**
     * Returns an array of table names.
     *
     * @param normalTables Includes user tables.
     * @param systemTables Includes system tables.
     * @param linkedTables Includes linked tables.
     */
    getTableNames({
        normalTables,
        systemTables,
        linkedTables,
    }?: {
        normalTables: boolean;
        systemTables: boolean;
        linkedTables: boolean;
    }): string[];

    /**
     * Returns a table by its name.
     *
     * @param name Name of the table. Case sensitive.
     */
    getTable(name: string): Table;
}

Table

class Table {

    /**
     * Name of the table
     */
    readonly name: string,

    /**
     * Number of rows.
     */
    readonly rowCount: number;

    /**
     * Number of columns.
     */
    readonly columnCount: number;

    /**
     * Returns an ordered array of all column definitions.
     */
    getColumns(): Column[];

    /**
     * Returns a column definition by its name.
     *
     * @param name Name of the column. Case sensitive.
     */
    getColumn(name: string): Column;

    /**
     * Returns an ordered array of all column names.
     */
    getColumnNames(): string[];

    /**
     * Returns data from the table.
     *
     * @param columns Columns to be returned. Defaults to all columns.
     * @param rowOffset Index of the first row to be returned. 0-based. Defaults to 0.
     * @param rowLimit Maximum number of rows to be returned. Defaults to Infinity.
     */
    getData<TRow extends {
        [column in TColumn]: Value;
        TColumn extends string = string;
    }>(options?: {
        columns?: ReadonlyArray<TColumn>;
        rowOffset?: number;
        rowLimit?: number;
    }): TRow[];
}

Column

interface Column {
    /**
     * Name of the table
     */
    name: string;

    /**
     * Type of the table
     */
    type: ColumnType;
    size: number;

    fixedLength: boolean;
    nullable: boolean;
    autoLong: boolean;
    autoUUID: boolean;

    /**
     * Only exists if type = 'numeric'
     */
    precision?: number;

    /**
     * Only exists if type = 'numeric'
     */
    scale?: number;
}

Data Types

The data types returned by Table.getData() depends on the column type. Null values are always returned as null.

Column Type JavaScript Type
bigint bigint
binary Buffer
boolean boolean
byte number
complex number
currency string
datetime Date
datetimeextended string
double number
float number
integer number
long number
memo string
numeric string
ole Buffer
repid string
text string

Development

Build

To build the library, first install the dependencies, then run npm run build for a single build or npm run watch for automatic rebuilds.

npm install
npm run build

Tests

To run the tests, first install the dependencies, then run npm test. Watch mode can be started with npm test -- --watch.

npm install
npm test

Resources

MDB Tool

GitHub

Set of applications to read and write Access files, written in C. Main source of knowledge about the file structure and algorithms to extract data from it.

Jackcess

Jackcess

Java library to read and write Access files. It inspired the interface of this library. The databases used for testing are copied from the repository.

The unofficial MDB Guide

Tech Specs for the JET format used by Access 1997-2010

License

MIT

mdb-reader's People

Contributors

andipaetzold avatar andreasrueedlinger avatar dependabot[bot] avatar oliverdowling avatar paulnorstrom avatar redmer avatar renovate-bot avatar renovate[bot] avatar semantic-release-bot avatar sgvictorino avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

mdb-reader's Issues

Floating point issue in datetime parsing

Hi @andipaetzold
Thanks for this very useful libary to migrate old legacy data cross-platform!

Comparing the output of your library to the output of mdb-tools, I detected some discrepancy in the datetime parsing.
Example:

const td = 32234.339699074073;
const daysDiff = 25569;
return new Date((td - daysDiff) * 86400 * 1000);

This results in 1988-04-01T08:09:09.999Z instead of 1988-04-01T08:09:10.000Z due to the poor floating point arithmetic in JavaScript.

I suggest rounding the milliseconds passed to the Date constructor.

Linux: Unexpected token '?'

Hi, i am trying to run a script i wrote on a linux server but get an error (Windows works perfectly).

file:///root/mdb-reader/node_modules/mdb-reader/lib/node/MDBReader.js:14
        this.db = new Database(this.buffer, password ?? "");
                                                      ^

SyntaxError: Unexpected token '?'

Any idea how to fix this? Appeciate the help.

[browser] Use of operator ?? and ?.

Hello,

Currently, I'm trying to integrate mdb-reader in a vue application (vue2 with vuetify) and I encounter some issues because you use the ?? and ?. operators. These operators are pretty recent and not supported in many browser. Is it really needed in this library.

Best regards,

Data loss for string fields over 1893 characters

Fields that contain over ~1893 characters are getting cut very short, returning a string that somewhat resembles what should have been returned. For example, if the long string started with "Version 1 ... " followed by large amounts of text, its returning "Versgr " and nothing else

Decrypt all Codecs

  • Jet
  • MSISAM
  • Office Agile
  • Office Binary Doc RC4
  • Office ECMA Standard Encryption
  • Office RC4 Crypto API

Unable to import the package in Nestjs

Hi, I am trying to use this awesome package in Nestjs, but I receive this error, please help to solve this issue, thank you in advance.

const mdb_reader_1 = require("mdb-reader").default; ^ Error [ERR_REQUIRE_ESM]: require() of ES Module E:\api\node_modules\mdb-reader\lib\node\index.js from E:\api\dist\workshop\workshop.service.js not supported. Instead change the require of index.js in E:\api\dist\workshop\workshop.service.js to a dynamic import() which is available in all CommonJS modules. at Object.<anonymous> (E:\api\dist\workshop\workshop.service.js:12:22) at Object.<anonymous> (E:\api\dist\workshop\workshop.module.js:11:28) at Object.<anonymous> (E:\api\dist\app.module.js:17:27) at Object.<anonymous> (E:\api\dist\main.js:4:22)

Cannot read a table with a byte field

I'm trying to read the table tRelais in this mdb:
https://u.pcloud.link/publink/show?code=XZu5EHVZVqot9BBBU7FMwQHIVSJ4zHIwB9i7

If I read this table with "sysinfotool MDB file viewer 20.0", I read a full table with no empty data. But, if I read with this library, I found many empty object:

{NoEngageRelais: 1487, NoEngagement: 44242, NoSerieTour: 8096, NoAthlete: 13310, NoOrdreRelais: null}
{NoEngageRelais: 1488, NoEngagement: 44242, NoSerieTour: 8096, NoAthlete: null, NoOrdreRelais: null} ...

The field NoOrdreRelais has the type octet in french language... It is a small int with 8bits.

My script:

function loadRelais (reader) {
  let cmtrelays = new Map()
  const table = reader.getTable('tRelais')
  let rows = table.getData()
  for (let row of rows) {
    console.log(row)
    let uid = row.NoEngageRelais
    let noEngagement = row.NoEngagement
    let serieTour = row.NoSerieTour
    let athUID = row.NoAthlete
    let ord = row.NoOrdreRelais
    cmtrelays.set(uid, { uid, noEngagement, serieTour, athUID, ord })
  }
  return cmtrelays
}

GMT Problem

Hello,

If you look into the table tSerieTour and look the column HoraireSerieTour, you have dates ending by 0. When I read the table with mdb-reader, I got:
Sat Dec 30 1899 15:14:21 GMT+0009 (Central European Standard Time) object
There is a delta of 9 minutes and 21seconds. My computer is GMT+1 (Paris, Bruxelles, ...)

If you set the computer clock with GTM+0, there's no bug.

mdb with problem

Encoding Problem

I have a french MDB, using probably not an UTF-8 encoding. If I read the table 'tTour', i see:

{ NoTour: 1, NomTour: 'Engagement' }
{ NoTour: 2, NomTour: 'S�rie' }
{ NoTour: 3, NomTour: '16�me f.' }
{ NoTour: 4, NomTour: '8�me f.' }
{ NoTour: 5, NomTour: 'Quart f.' }
{ NoTour: 6, NomTour: 'Semi f.' }
{ NoTour: 7, NomTour: 'Finale'

I suspect an iso8859-15.

the mdb source:
https://u.pcloud.link/publink/show?code=XZnIlkVZLG3ffaEEef4XDRskx2F5qyBGbHYV

Is there an option to set the encoding source ?

unable to build browser example

Here are the commands I ran:

git clone https://github.com/andipaetzold/mdb-reader.git
cd mdb-reader/examples/browser
npm install
npm run dev

Here's the output:

> [email protected] dev
> rimraf dist && parcel

Server running at http://localhost:1234
× Build failed.

@parcel/core: Failed to resolve 'mdb-reader' from './examples/browser/src/index.mjs'

  /path/to/mdb-reader/examples/browser/src/index.mjs:2:23
    1 | import { Buffer } from "buffer/";
  > 2 | import MDBReader from "mdb-reader";
  >   |                       ^^^^^^^^^^^^
    3 |
    4 | const button = document.getElementById("button");

@parcel/resolver-default: Could not load './lib/browser/index.js' from module 'mdb-reader' found
in package.json#browser

  /path/to/mdb-reader/examples/browser/node_modules/mdb-reader/package.json:6:16
    5 |     "main": "lib/node/index.js",
  > 6 |     "browser": "lib/browser/index.js",
  >   |                ^^^^^^^^^^^^^^^^^^^^^^ './lib/browser/index.js' does not exist, did you mean './src/crypto/index.ts'?'
    7 |     "types": "lib/types/index.d.ts",
    8 |     "sideEffects": false,

Parcel is shutting down...

Do not read full data page when using rowOffset

The library currently always reads all records of a data page and then filters it using Array.slice. This means a performance loss, especially for large table records.

This should be improved such that only data is read that is actually returned.

Uncaught RangeError: offset is not uint

When I try to load this using the Browser example, I get:

Uncaught RangeError: offset is not uint
at checkOffset (index.js:1097:1)
at Uint8Array.readInt32LE (index.js:1220:1)
at readComplexOrLong (complexOrLong.js:2:1)
at readFieldValue (index.js:43:1)
at Table.getDataFromPage (Table.js:242:1)
at Table.getData (Table.js:132:1)
at new MDBReader (MDBReader.js:15:1)

Can't Import it

I am unable to use const MDBParser = require("mdb-reader"). I don't wanna change all other imports of other libraries being used.

Enhancement: Native ESM modules using CDN for browser example

First of all, thank you for this great library. It is tremendously helpful for client-side exploration of MS Access databases.

I noticed that the browser-based example is using parcel to bundle the assets. You are probably aware of this, but it is possible to eliminate the build/bundling step by referencing a CDN. For example:

import { Buffer } from 'https://cdn.jsdelivr.net/npm/buffer@6/+esm'
import MDBReader from 'https://cdn.jsdelivr.net/npm/[email protected]/+esm' // @latest or @2 can be used instead

This could prove helpful for importing this library directly in the browser or other JavaScript runtimes that don't use a package manager.

rowOffset and rowLimit

I am reading a table with 9855 records. I want to read 1000 at a time.
If I pass rowOffset:0, rowLimit:1000 I get 1000 records, if I pass rowOffset 1000, rowLimit: 1000 I get 0(zero) records

table.name:Products, table.rowCount:9855
table.getData({rowOffset:0,rowLimit:1000});
returned: 1000 records
table.getData({rowOffset:1000,rowLimit:1000});
returned: 0 records

and if I pass rowOffset 1000, rowLimit: 2000 it trows an error. So, I am not sure how it is supposed to work.

table.name:Products, table.rowCount:9855
table.getData({rowOffset:0,rowLimit:1000});
returned: 1000 records
table.getData({rowOffset:1000,rowLimit:2000});
(node:20761) UnhandledPromiseRejectionWarning: RangeError [ERR_OUT_OF_RANGE]: The value of "offset" is out of range. It must be >= 0 and <= 4094. Received 6400
    at boundsError (internal/buffer.js:81:9)
    at Buffer.readUInt16LE (internal/buffer.js:238:5)
    at Buffer.readUIntLE (internal/buffer.js:175:17)
    at D.getDataFromPage (/Users/fpw/Projects/migrandama/mystore/mdbImport2/node_modules/mdb-reader/lib/index.js:1:10704)
    at D.getData (/Users/fpw/Projects/migrandama/mystore/mdbImport2/node_modules/mdb-reader/lib/index.js:1:10030)

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Open

These updates have all been created already. Click a checkbox below to force a retry/rebase of any.

Ignored or Blocked

These are blocked by an existing closed PR and will not be recreated unless you click a checkbox below.

Detected dependencies

github-actions
.github/workflows/compressed-size.yml
  • actions/checkout v4
  • preactjs/compressed-size-action v2
  • ubuntu 22.04
.github/workflows/push.yml
  • actions/checkout v4
  • actions/setup-node v4
  • actions/checkout v4
  • actions/setup-node v4.0.1
  • actions/checkout v4
  • actions/setup-node v4
  • ubuntu 22.04
  • ubuntu 22.04
  • ubuntu 22.04
npm
examples/browser/package.json
  • buffer 6.0.3
  • mdb-reader 3.0.0
  • rimraf 5.0.5
examples/sveltekit/package.json
  • mdb-reader 3.0.0
  • @sveltejs/adapter-auto 3.0.1
  • @sveltejs/kit 2.0.6
  • @sveltejs/vite-plugin-svelte 3.0.1
  • svelte 4.2.8
  • svelte-check 3.6.2
  • tslib 2.6.2
  • typescript 5.3.3
  • vite 5.0.12
  • vite-plugin-node-polyfills 0.18.0
examples/to-json/package.json
  • mdb-reader 3.0.0
package.json
  • browserify-aes ^1.0.0
  • create-hash ^1.0.0
  • fast-xml-parser ^4.0.0
  • @semantic-release/changelog 6.0.3
  • @semantic-release/git 10.0.1
  • @tsconfig/node18 18.2.2
  • @tsconfig/strictest 2.0.2
  • @types/chai 4.3.11
  • @types/mocha 10.0.6
  • @types/mocha-each 2.0.4
  • @types/node 18.19.3
  • @typescript-eslint/eslint-plugin 7.0.1
  • @typescript-eslint/parser 7.0.1
  • chai 5.0.0
  • eslint 8.56.0
  • mocha 10.2.0
  • mocha-each 2.0.1
  • rimraf 5.0.5
  • semantic-release 23.0.1
  • ts-node 10.9.2
  • typescript 5.3.3
nvm
.nvmrc
  • node 20

  • Check this box to trigger a request for Renovate to run again on this repository

Error with nodejs example

I have this error with node 14.5 version:


file:///C:/Users/laure/Documents/Perso/Sources/live/node_modules/mdb-reader/lib/codec-handler/handlers/office/agile/EncryptionDescriptor.js:1
import { XMLParser } from "fast-xml-parser";
         ^^^^^^^^^
SyntaxError: The requested module 'fast-xml-parser' is expected to be of type CommonJS, which does not support named exports. CommonJS modules can be imported by importing the default export.
For example:
import pkg from 'fast-xml-parser';
const { XMLParser } = pkg;
    at ModuleJob._instantiate (internal/modules/esm/module_job.js:98:21)
    at async ModuleJob.run (internal/modules/esm/module_job.js:137:5)
    at async Loader.import (internal/modules/esm/loader.js:162:24)
    at async formattedImport (C:\Users\laure\Documents\Perso\Sources\live\node_modules\mocha\lib\esm-utils.js:7:14)
    at async Object.exports.loadFilesAsync (C:\Users\laure\Documents\Perso\Sources\live\node_modules\mocha\lib\esm-utils.js:55:20)
    at async singleRun (C:\Users\laure\Documents\Perso\Sources\live\node_modules\mocha\lib\cli\run-helpers.js:125:3)
    at async Object.exports.handler (C:\Users\laure\Documents\Perso\Sources\live\node_modules\mocha\lib\cli\run.js:362:5)

It seems the problem comes with 'src/codec-handler/handlers/office/agile/EncryptionDescriptor.js' file.

Best regards,

Unable to open MDB files -> RangeError: offset is not uint

Mode: Browser (Chrome)
mdb-reader version: 2.2.2
Node version: 14.19.1
Buffer: 6.0.3

When using this library, and passing in an MDB file from the test examples, e.g. test.mdb (test/data/v2000) when constructing a reader, I receive the error: Range Error: offset is not uint.

After debugging in Chrome, this is caused by: complexOrLong.js.readComplexOrLong(). This is calling buffer.readInt32LE() without passing an offset in. Other articles suggests that offset has a default version of 0 https://www.geeksforgeeks.org/node-js-buffer-readint32le-method/.

The buffer package however requires a value to be passed in https://github.com/feross/buffer/blob/795bbb5bda1b39f1370ebd784bea6107b087e3a7/index.js#L1127.

In this case should this package's code just be passing 0 in as the parameter value for offset?

If I manually edit those functions (integer.js, complexOrLong.js, float.js) in the node_modules folder for this package to pass 0 in as the index, the MDBReader is constructed and I can then query the data within it. Is there a mismatch between node versions?

Browser import issue

I'm trying to use MDBReader in the browser for a project that I'm serving with Web Dev Server, but keep hitting the following error:

Error while handling server request.
PluginError: Could not resolve import "crypto".

I've also tried directly referencing the built browser version

import { MDBReader } from 'mdb-reader/lib/browser/index.js';

but this yields a different error in lib/browser/environment/index.js where browserifyAES fails to import because it doesn't provide a default export.

Am I importing something incorrectly in either scenario?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.