npyjs's People

Contributors

adityasarwade, d4l3k, dependabot[bot], fil, ha-limlee, j6k4m8, jhughes982, koreanwglasses, rsxdalv

npyjs's Issues

Functionality to allow ArrayBuffer inputs along with file path

Hey @j6k4m8, I've used your package in a few packages and I think it is a life-saver.

I ran into an issue when I wanted to parse and open a .npy file from an ArrayBuffer in TypeScript.
I identified that the load function internally creates an ArrayBuffer which stores the result of the fetch request.

It would be really helpful if we could bypass this part of the code when the passed argument is already an ArrayBuffer.

Although I am not as experienced as my peers, I'm submitting a PR as it might be a useful addition.
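A minimal sketch of the bypass described above. The helper names `asArrayBuffer` and `loadNpy` are hypothetical and not part of npyjs; `parse` stands in for the library's parsing step and `fetchImpl` for whichever fetch implementation is in use. Binary input is handed to the parser directly, and only other inputs (such as URL strings) are fetched.

```javascript
// Hypothetical helper (not part of npyjs): return an ArrayBuffer when the
// caller already holds binary data, or null when the input must be fetched.
function asArrayBuffer(source) {
  if (source instanceof ArrayBuffer) {
    return source; // already binary, nothing to fetch
  }
  if (ArrayBuffer.isView(source)) {
    // Typed arrays and DataViews may cover only part of a larger buffer,
    // so copy out exactly the bytes they span.
    return source.buffer.slice(
      source.byteOffset,
      source.byteOffset + source.byteLength
    );
  }
  return null; // e.g. a URL string
}

// Hypothetical loader: skips the network round trip for binary input.
async function loadNpy(source, parse, fetchImpl = globalThis.fetch) {
  const direct = asArrayBuffer(source);
  if (direct !== null) {
    return parse(direct);
  }
  const response = await fetchImpl(source);
  return parse(await response.arrayBuffer());
}
```

With this shape, existing callers that pass a URL would behave as before, while callers that already hold the bytes avoid the fetch entirely.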

Bring npm package up to date with GitHub repo

The current version of the npyjs npm package seems to be behind the GitHub repo version. Updating the npm package to match the repo would be helpful.

You could consider automating publishing to npm through GitHub Actions (e.g. when pushing code or when accepting pull requests).

File failing to load.

When trying to open a file I get the following error. I tried both local and absolute paths, with and without the file:/// prefix.

TypeError: Only absolute URLs are supported
    at getNodeRequestOption

Various issues in code

Hello -- some things I noticed in the code which I think may be worth mentioning.

  1. Errors for headers larger than 255 bytes: In parse(), when reading the header length, a uint8 is being read from the DataView. It should actually be a uint16 in little-endian ordering. Existing code does not work for files with headers >255 bytes. Not a common situation but one that I've run into, especially when logging arrays of structures with named fields.

Fix would be replacing getUint8( offset ) with getUint16( offset, true ), where true indicates little-endian ordering.

  2. Errors when dtype description has more than one "(": When translating the header content from 'Python dict' format to 'JSON' format, there are three calls to String.replace() to replace single quotes with double quotes, and parentheses with square brackets. I'm guessing this is for py-tuple to js-array conversion. Two of the calls use regular expressions as the match argument, and one uses a string. Unfortunately, String.replace() does not behave consistently for these inputs. When the argument is a regular expression with the global flag, String.replace() replaces all instances of the match argument. When the argument is a string, it replaces only the first. In this case, ALL ")" characters are replaced with "]", but only the FIRST "(" is converted to a "[". I suspect this is not intended.

  3. Errors for data segments not aligned to word boundaries: Various arrayConstructor function values expect data samples to be aligned to a multiple of the element size (4-byte word boundaries for "<i4", for example). When this is not the case, an error is thrown. To read the data into arrays using these constructor functions when the first data address is not at such a boundary, the data should first be sliced into a new ArrayBuffer before being passed into the constructor.
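The three fixes above can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: the function names are hypothetical, the header layout assumed is format version 1.0 (6-byte magic string, 2-byte version, 2-byte header length), and the closing-parenthesis pattern differs slightly from the original so that commas between tuples are kept.

```javascript
// Sketch of the three fixes; all function names here are hypothetical.

// 1. Header length: a little-endian uint16 at byte offset 8
//    (after the 6-byte magic string and the 2-byte version).
function readHeaderLength(buffer) {
  return new DataView(buffer).getUint16(8, true);
}

// 2. Header text: use global regular expressions throughout so that every
//    parenthesis is converted, not just the first "(".
function headerToJson(text) {
  return text
    .replace(/True/g, "true")
    .replace(/False/g, "false")
    .replace(/'/g, '"')
    .replace(/\(/g, "[")
    .replace(/,?\s*\)/g, "]") // also drops the trailing comma in "(3,)"
    .replace(/,\s*}/g, "}"); // JSON does not allow a trailing comma
}

// 3. Data segment: copy it into a fresh ArrayBuffer so the typed array
//    starts at offset 0, whatever the header length was.
function readData(buffer, offset, ArrayType) {
  return new ArrayType(buffer.slice(offset));
}
```

The copy in `readData` costs one extra allocation; that is the price of not depending on where the header happens to end.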

Browser Version

This works great in the browser without the fetch import. I had to do a little fiddling to figure that out, and it looks like @Fil did too.

Have you considered publishing the decoding function alone so this library can be used directly in the browser?

Int16 dtype

Thanks for your code! I found, however, that Int16 files were not supported. This can be fixed easily enough by including the following dtype:

"<i2": {
  name: "int16",
  size: 16,
  arrayConstructor: Int16Array,
},

Thanks

How to load a local npy file using npyjs in React?

Loading a 2D array does not seem to work despite the code compiling. In the console I get this error:

Uncaught (in promise) SyntaxError: JSON.parse: unexpected character at line 1 column 1 of the JSON data

The code that I have is:

import ndarray from "ndarray";
import npyjs from "npyjs";

const NP = function () {
  let n = new npyjs();

  n.load("./embeddings.npy").then(res => {
    // res has { data, shape, dtype } members.
    const npyArray = ndarray(res.data, res.shape);
    console.log(npyArray);
  });
};
export default NP;

If I replace the file path with https://rawcdn.githack.com/aplbrain/npyjs/ba60a3a529f3210dd07d2ed05ab628939e18b6a7/test/data/4x4x4x4x4-float32.npy then it seems to work. Is there a way to load from local file paths?
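One way to narrow this down is sketched below, under an assumption: a development server may answer an unknown relative path with an HTML page rather than the .npy bytes, which could produce this kind of JSON.parse failure. The hypothetical `hasNpyMagic` helper (not part of npyjs) checks for the 6-byte magic string that begins every .npy file before any parsing is attempted.

```javascript
// Hypothetical check (not part of npyjs): does the buffer start with the
// .npy magic string, byte 0x93 followed by the ASCII characters "NUMPY"?
const NPY_MAGIC = [0x93, 0x4e, 0x55, 0x4d, 0x50, 0x59];

function hasNpyMagic(buffer) {
  if (buffer.byteLength < NPY_MAGIC.length) {
    return false;
  }
  const bytes = new Uint8Array(buffer, 0, NPY_MAGIC.length);
  return NPY_MAGIC.every((value, index) => bytes[index] === value);
}
```

If the check fails on the fetched response, decoding the first few bytes as text usually shows what the server actually returned.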

Does not work with structured arrays

.npy files can contain structured arrays, as described here, but they fail to open with the error:

Uncaught SyntaxError: Unexpected token ( in JSON at position 28

For example, create the structured array in Python:

np.save('test/out.npy', np.array([('Rex', 9, 81.0), ('Fido', 3, 27.0)],
             dtype=[('name', 'U10'), ('age', 'i4'), ('weight', 'f4')]))

Serve the out.npy file:

cd test
npx serve

Try to load that out.npy file:

> np.load("http://localhost:3000/out.npy")
Promise {
  <pending>,
  [Symbol(async_id_symbol)]: 4454,
  [Symbol(trigger_async_id_symbol)]: 5
}
> Uncaught SyntaxError: Unexpected token ( in JSON at position 28

Support for all numpy dtypes

NumPy supports the following dtypes as per the docs. However, npyjs only supports some of them:

character   description                                  supported
'?'         boolean                                      NO
'b'         (signed) byte                                NO
'B'         unsigned byte                                NO
'i'         (signed) integer                             YES
'u'         unsigned integer                             YES
'f'         floating-point                               YES
'c'         complex-floating point                       NO
'm'         timedelta                                    NO
'M'         datetime                                     NO
'O'         (Python) objects                             N/A
'S', 'a'    zero-terminated bytes (not recommended)      NO
'U'         Unicode string                               NO
'V'         raw data (void)                              NO

Would be great to add these in.
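A sketch of how two of the missing single-byte kinds might be registered, following the { name, size, arrayConstructor } entry shape used elsewhere in these issues. The descriptor strings "|b1" and "|i1" are the ones NumPy writes for boolean and int8 arrays. Storing booleans in a Uint8Array, the `extraDtypes` table, and the `toBooleans` helper are assumptions of this sketch, not npyjs behavior.

```javascript
// Hypothetical additions to a dtype table.
const extraDtypes = {
  "|b1": { name: "bool", size: 8, arrayConstructor: Uint8Array },
  "|i1": { name: "int8", size: 8, arrayConstructor: Int8Array },
};

// NumPy stores each boolean as one byte (0 or 1); convert to JS booleans.
function toBooleans(bytes) {
  return Array.from(bytes, (value) => value !== 0);
}
```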

Throw Errors from Load Function

Looking to use this library as I'll be loading in npy files. I was looking at the source code and noticed that you're catching potential errors in the load function and just printing them to console.error. I would like to suggest that those errors are allowed to propagate out of the function, so the developer can handle them instead of having them go to the console.

The callback might be an issue. My ideas on that would be to either add a second errorCallback, or a second error argument to the first callback. I'd be happy to put together a PR as well if that helps!

Edit: I should clarify, the second argument method I think should follow the error-first callback schema, so the result would then be in the second argument. This would be a change in the way the module works, and would possibly warrant a minor version bump if accepted. The errorCallback method would not change existing functionality as it could simply be omitted, which may be preferable.
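The two callback styles discussed above can be sketched as follows. Both wrappers are hypothetical illustrations, not npyjs APIs, and `parse` stands in for any step that may throw.

```javascript
// Hypothetical wrappers illustrating the two callback styles; `parse`
// stands in for any function that may throw.

// Option A: separate, optional error callback. Omitting it lets the
// error propagate to the caller instead of being logged and swallowed.
function parseWithErrorCallback(parse, input, onSuccess, onError) {
  let result;
  try {
    result = parse(input);
  } catch (error) {
    if (typeof onError === "function") {
      onError(error);
      return;
    }
    throw error;
  }
  onSuccess(result);
}

// Option B: a single error-first callback, (error, result).
function parseErrorFirst(parse, input, callback) {
  let result;
  try {
    result = parse(input);
  } catch (error) {
    callback(error, undefined);
    return;
  }
  callback(null, result);
}
```

In both variants the success callback runs outside the try block, so an exception thrown by the caller's own callback is not mistaken for a parsing failure.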

dtype '|u1'?

Someone sent me an .npy file with dtype '|u1', which results in an error. I had to add it to dtypes so that it would parse.

hexdump -C

00000000  93 4e 55 4d 50 59 01 00  76 00 7b 27 64 65 73 63  |.NUMPY..v.{'desc|
00000010  72 27 3a 20 27 7c 75 31  27 2c 20 27 66 6f 72 74  |r': '|u1', 'fort|
00000020  72 61 6e 5f 6f 72 64 65  72 27 3a 20 46 61 6c 73  |ran_order': Fals|
00000030  65 2c 20 27 73 68 61 70  65 27 3a 20 28 34 39 39  |e, 'shape': (499|
00000040  35 30 30 2c 29 2c 20 7d  20 20 20 20 20 20 20 20  |500,), }        |

Raw access for float16

Hi,

I am using your library for fast prototyping.

It was very convenient to use, but I had to patch it to read float16.

float16 is not supported by JS, but it is supported by WebGL. This allows fetching the data in an opaque manner: the shape remains the same, and the result is still tagged with dtype === 'float16'.

I also use a wrapper to read float16 values in JS, but I think it is not really needed at this stage.

Anyway, here is the setup.

import _npyjs from 'npyjs';

const npyjs = new _npyjs();

// Supports float16 as uint16
npyjs.dtypes['<f2'] = {
  name: 'float16',
  size: 16,
  arrayConstructor: Uint16Array,
};

Tell me if you prefer to have a PR.
Regards,
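For reading individual values on the JavaScript side, a decoder along these lines could turn the raw uint16 bit patterns into numbers. This is a sketch of the standard IEEE 754 half-precision layout (1 sign bit, 5 exponent bits, 10 fraction bits), not the wrapper mentioned above; `float16BitsToNumber` is a hypothetical name.

```javascript
// Hypothetical decoder: convert one IEEE 754 half-precision bit pattern
// (as stored in a Uint16Array) to a JavaScript number.
function float16BitsToNumber(bits) {
  const sign = bits & 0x8000 ? -1 : 1;
  const exponent = (bits >> 10) & 0x1f;
  const fraction = bits & 0x03ff;
  if (exponent === 0) {
    // Zero or subnormal: no implicit leading 1, fixed exponent of -14.
    return sign * Math.pow(2, -14) * (fraction / 1024);
  }
  if (exponent === 0x1f) {
    // All exponent bits set: infinity, or NaN if any fraction bit is set.
    return fraction ? NaN : sign * Infinity;
  }
  // Normal number: implicit leading 1 and an exponent bias of 15.
  return sign * Math.pow(2, exponent - 15) * (1 + fraction / 1024);
}
```

Applied to the Uint16Array returned for '<f2', `Float32Array.from(data, float16BitsToNumber)` would give an array of ordinary floats.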

Stating changes in accordance with Apache License

/*
 * Copyright 2023 aplbrain/npyjs
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/*
 * This file includes code adapted from the npyjs project,
 * which is licensed under the Apache License, Version 2.0.
 * The original code can be found at: https://github.com/aplbrain/npyjs/blob/master/index.js
 *
 * Modifications:
 * Added U1 support and fixed U1 -> u1 bug, added fortran comment and made a functional interface
 */

const dTypeMapping: Record<
  string,
  // "<u1" | "|u1" | "<u2" | "|i1" | "<i2" | "<u4" | "<i4" | "<u8" | "<i8" | "<f4" | "<f8" | "<U1",
  {
    name: string;
    size: number;
    arrayConstructor:
      | Uint8ArrayConstructor
      | Uint16ArrayConstructor
      | Int8ArrayConstructor
      | Int16ArrayConstructor
      | Int32ArrayConstructor
      | BigUint64ArrayConstructor
      | BigInt64ArrayConstructor
      | Float32ArrayConstructor
      | Float64ArrayConstructor
      | Uint32ArrayConstructor;
  }
> = {
  "<u1": {
    name: "uint8",
    size: 8,
    arrayConstructor: Uint8Array,
  },
  "|u1": {
    name: "uint8",
    size: 8,
    arrayConstructor: Uint8Array,
  },
  "<u2": {
    name: "uint16",
    size: 16,
    arrayConstructor: Uint16Array,
  },
  "|i1": {
    name: "int8",
    size: 8,
    arrayConstructor: Int8Array,
  },
  "<i2": {
    name: "int16",
    size: 16,
    arrayConstructor: Int16Array,
  },
  "<u4": {
    name: "uint32",
    size: 32,
    arrayConstructor: Uint32Array,
  },
  "<i4": {
    name: "int32",
    size: 32,
    arrayConstructor: Int32Array,
  },
  "<u8": {
    name: "uint64",
    size: 64,
    arrayConstructor: BigUint64Array,
  },
  "<i8": {
    name: "int64",
    size: 64,
    arrayConstructor: BigInt64Array,
  },
  "<f4": {
    name: "float32",
    size: 32,
    arrayConstructor: Float32Array,
  },
  "<f8": {
    name: "float64",
    size: 64,
    arrayConstructor: Float64Array,
  },
  "<U1": {
    name: "<U1", // no way to know when to use ucs2 vs ucs4
    size: 32,
    arrayConstructor: Uint32Array,
  },
};

export const parseNpy = (arrayBufferContents: ArrayBuffer) => {
  // const version = arrayBufferContents.slice(6, 8); // Uint8-encoded
  // The header length is a little-endian uint16, so headers over 255 bytes work.
  const headerLength = new DataView(arrayBufferContents.slice(8, 10)).getUint16(
    0,
    true
  );
  const offsetBytes = 10 + headerLength;

  const hcontents = new TextDecoder("utf-8").decode(
    new Uint8Array(arrayBufferContents.slice(10, 10 + headerLength))
  );
  const header = JSON.parse(
    hcontents
      // .toLowerCase() // True -> true
      .replace(/True/g, "true")
      .replace(/False/g, "false")
      .replace(/'/g, '"')
      .replace(/\(/g, "[") // global regex: a string pattern replaces only the first "("
      .replace(/,*\),*/g, "]")
  );
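  const shape = header.shape;
  const dtype = dTypeMapping[header.descr];
  if (!dtype) {
    throw new Error(`Unsupported dtype: ${header.descr}`);
  }
  // Copy the data segment so it starts at byte 0; typed-array views require
  // the byte offset to be a multiple of the element size.
  const nums = new dtype.arrayConstructor(arrayBufferContents.slice(offsetBytes));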

  // if fortran_order:
  //     array.shape = shape[::-1]
  //     array = array.transpose()

  return {
    dtype: dtype.name,
    data: nums,
    shape,
    fortranOrder: header.fortran_order,
  };
};
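To exercise a parser like this without a file on disk, a test buffer can be assembled by hand. A minimal sketch, assuming format version 1.0 with a header padded with spaces to a 64-byte boundary and terminated by a newline; `buildNpy` is a hypothetical helper, not part of npyjs.

```javascript
// Hypothetical test helper: wrap typed-array data in a version 1.0 .npy
// container. `descr` is a dtype string such as "<f4"; `shape` is an array.
function buildNpy(descr, shape, data) {
  const shapeText =
    shape.length === 1 ? `(${shape[0]},)` : `(${shape.join(", ")})`;
  let header =
    `{'descr': '${descr}', 'fortran_order': False, 'shape': ${shapeText}, }`;
  // Pad with spaces so that preamble (10 bytes) + header + "\n" is a
  // multiple of 64 bytes.
  const unpadded = 10 + header.length + 1;
  header += " ".repeat((64 - (unpadded % 64)) % 64) + "\n";

  const bytes = new Uint8Array(10 + header.length + data.byteLength);
  bytes.set([0x93, 0x4e, 0x55, 0x4d, 0x50, 0x59, 1, 0]); // magic + version
  new DataView(bytes.buffer).setUint16(8, header.length, true);
  bytes.set(new TextEncoder().encode(header), 10);
  bytes.set(
    new Uint8Array(data.buffer, data.byteOffset, data.byteLength),
    10 + header.length
  );
  return bytes.buffer;
}
```

The returned ArrayBuffer can be passed straight to a function with the same signature as parseNpy above, for example `buildNpy("<f4", [2, 2], new Float32Array([1, 2, 3, 4]))`.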

Release including types

Thanks for this awesome package!

The current release doesn't include any of the types found here. A quick release that includes these types would be great. For now, we have just copy-pasted that file into our project.

TypeScript Support

I'm using the code from this library in a TypeScript/Deno environment. It would be nice to have a TS implementation. If there isn't one planned, I can take a shot at it! I have a rough, typed implementation. Let me know what you think and how I can test or help implement this.
