
scodec's Introduction

scodec


Scala combinator library for working with binary data.

Design Constraints

This library focuses on contract-first and pure functional encoding and decoding of binary data. The following design constraints are considered:

  • Binary structure should mirror protocol definitions and be self-evident under casual reading
  • Mapping binary structures to types should be statically verified
  • Encoding and decoding should be purely functional
  • Failures in encoding and decoding should provide descriptive errors
  • Compiler plugin should not be used

As a result, the library is implemented as a combinator-based DSL. Performance is considered but yields to the above design constraints.

Acknowledgements

scodec 1.x used Shapeless and was heavily influenced by scala.util.parsing.combinator. As of scodec 2.x, the library only depends on the standard library.

Administrative

This project is licensed under a 3-clause BSD license.

The scodec channel on Typelevel Discord is a good place to go for help.

Introduction

The primary abstraction is a Codec[A], which supports encoding a value of type A to a BitVector and decoding a BitVector to a value of type A.

The codecs object provides a number of predefined codecs and combinators.

    import scodec.*
    import scodec.bits.*
    import scodec.codecs.*

    // Create a codec for an 8-bit unsigned int followed by an 8-bit unsigned int followed by a 16-bit unsigned int
    val firstCodec = uint8 :: uint8 :: uint16

    // Decode a bit vector using that codec
    val result: Attempt[DecodeResult[(Int, Int, Int)]] = firstCodec.decode(hex"102a03ff".bits)
    // Successful(DecodeResult((16, 42, 1023), BitVector(empty)))

    // Sum the result
    val add3 = (_: Int) + (_: Int) + (_: Int)
    val sum: Attempt[DecodeResult[Int]] = result.map(_.map(add3.tupled))
    // Successful(DecodeResult(1081, BitVector(empty)))

Automatic case class binding is supported via tuples:

    case class Point(x: Int, y: Int, z: Int)

    val pointCodec: Codec[Point] = (int8 :: int8 :: int8).as[Point]

    val encoded: Attempt[BitVector] = pointCodec.encode(Point(-5, 10, 1))
    // Successful(BitVector(24 bits, 0xfb0a01))

    val decoded: Attempt[DecodeResult[Point]] = pointCodec.decode(hex"fb0a01".bits)
    // Successful(DecodeResult(Point(-5, 10, 1), BitVector(empty)))

Codecs can also be derived, resulting in usage like:

    case class Point(x: Int, y: Int, z: Int) derives Codec

    val encoded: Attempt[BitVector] = Codec.encode(Point(-5, 10, 1))
    // Successful(BitVector(96 bits, 0xfffffffb0000000a00000001))

    val decoded: Attempt[DecodeResult[Point]] = Codec.decode[Point](hex"fffffffb0000000a00000001".bits)
    // Successful(DecodeResult(Point(-5, 10, 1), BitVector(empty)))

New codecs can be created by implementing the Codec trait directly, though typically new codecs are created by applying one or more combinators to existing codecs.
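
For example, a small sketch of the combinator route, mapping a predefined codec onto a made-up domain type (Celsius is hypothetical, not part of the library):

    import scodec.*
    import scodec.codecs.*

    // Hypothetical domain type, for illustration only.
    case class Celsius(degrees: Int)

    // A new codec built by transforming int16 rather than implementing Codec directly.
    val celsius: Codec[Celsius] = int16.xmap(Celsius.apply, _.degrees)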

See the guide for detailed documentation, and the ScalaDoc for API reference.

Ecosystem

Many libraries have support for scodec; several are listed under Examples below.

Examples

There are various examples in the test directory, including codecs for UDP datagrams, MPEG packets, and libpcap files.

The protocols module of fs2 has production quality codecs for the above examples.

The skunk database library uses scodec to communicate with Postgres.

The hippo project uses scodec to parse .hprof files.

The bitcoin-scodec library has a codec for the Bitcoin network protocol.

The scodec-msgpack library provides codecs for MessagePack.

The fs2-http project uses FS2, scodec, and shapeless to implement a minimal HTTP client and server.

The scodec-bson library implements BSON codecs and combinators.

Testing Your Own Codecs

If you're creating your own Codec instances, scodec publishes some of its test tooling in the scodec-testkit module.

Getting Binaries

libraryDependencies += "org.scodec" %%% "scodec-core" % 
  (if (scalaVersion.value.startsWith("2.")) "1.11.9" else "2.1.0")

Building

This project uses sbt and requires node.js to be installed in order to run Scala.js tests. To build, run sbt publishLocal.

Code of Conduct

See the Code of Conduct.


scodec's Issues

Add support for lifting Encoder/Decoder to Codec

Often, users need encoding or decoding but not both. Codec has some "forgetful" combinators that convert to an encoder/decoder and forget the dual behavior. For example, calling map on a Codec results in a Decoder. However, the vast majority of combinators require a codec instead of an encoder or decoder.

See this blog post for an example of how this impacts users: http://searler.github.io/scala/2014/09/06/crippled-codec.html

Replicating the combinators for Encoder and Decoder results in a lot of code duplication. Instead, we can lift an Encoder to a Codec by implementing decode with an error (left) result. Similarly, we can lift a Decoder to a Codec by implementing encode with an error result.
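
A minimal sketch of the encoder half (the decoder direction is symmetric; liftToCodec is a hypothetical name):

import scodec.*
import scodec.bits.BitVector

// Hypothetical sketch: lift an Encoder to a Codec whose decode always fails.
def liftToCodec[A](encoder: Encoder[A]): Codec[A] = new Codec[A] {
  def sizeBound: SizeBound = encoder.sizeBound
  def encode(a: A): Attempt[BitVector] = encoder.encode(a)
  def decode(bits: BitVector): Attempt[DecodeResult[A]] =
    Attempt.failure(Err("decoding not supported: this codec was lifted from an Encoder"))
}

Newer scodec versions expose this idea as encodeOnly on Encoder and decodeOnly on Decoder.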

Codec.upcast throws "java.lang.InternalError: Malformed class name" for nested classes

The underlying issue is the call to getSimpleName.
See https://issues.scala-lang.org/browse/SI-2034

java.lang.InternalError: Malformed class name
at java.lang.Class.getSimpleName(Class.java:1195)
at scodec.Codec$$anon$4.encode(Codec.scala:341)

  final def upcast[B >: A](implicit m: Manifest[A]): Codec[B] = new Codec[B] {
    def sizeBound: SizeBound = self.sizeBound
    def encode(b: B) = b match {
      case a: A => self encode a
      case _ => Attempt.failure(Err(s"${b.getClass.getSimpleName} is not a ${m.runtimeClass.getSimpleName}"))
    }
    def decode(bv: BitVector) = self decode bv
    override def toString = self.toString
  }
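
A sketch of one possible workaround until the code above is fixed, falling back to the full class name when getSimpleName throws (safeSimpleName is a hypothetical helper):

def safeSimpleName(cls: Class[_]): String =
  try cls.getSimpleName
  catch { case _: InternalError => cls.getName }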

listDelimited and vectorDelimited do not handle a doubled delimiter

Hi,
Going back to the example:

scala> codec.decode(ascii.encode("i am delimited").require).require.value // List("i", "am", "delimited")
res106: List[String] = List(i, am, delimited)

scala> codec.decode(ascii.encode("i am  delimited").require).require.value // List("i", "am", "delimited")
res107: List[String] = List(i, am)

I don't think this is the expected behaviour, is it?

Regards,

Olivier.

bytes(8).encode() doesn't return an error when supplied a 7-byte buffer

> println(scodec.codecs.bytes(8).encode(ByteVector.fill(7)(1)))
Successful(BitVector(64 bits, 0x0101010101010100))

It should return an error (codecs/package.scala:126):

/**
 * Encodes by returning the supplied byte vector if its length is `size` bytes, otherwise returning error;
 * [...]
 */
def bytes(size: Int): Codec[ByteVector] = new Codec[ByteVector] {
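
Until that's fixed, a hedged sketch of a stricter wrapper (strictBytes is a hypothetical name):

import scodec.*
import scodec.bits.ByteVector
import scodec.codecs

// Hypothetical wrapper: reject byte vectors whose length is not exactly `size` on encode.
def strictBytes(size: Int): Codec[ByteVector] =
  codecs.bytes(size).exmap(
    bv => Attempt.successful(bv),
    bv =>
      if (bv.size == size) Attempt.successful(bv)
      else Attempt.failure(Err(s"expected $size bytes but got ${bv.size}"))
  )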

MpegPacketExample mishandles adaptationField

There are a couple of related problems with regard to MPEG transport stream packets which contain an adaptation field:

  1. The adaptationField doesn't include the adaptation field length byte (which is the first byte in the adaptation field). This means that the adaptationFieldFlags bits are all read from the wrong byte location, so are not valid.

  2. If the packet contains both an adaptation field and a payload, the sample code doesn't extract the field length, so it makes no attempt to shorten the payload by that size, instead specifying a fixed 184 (which is only valid when there's no adaptation field). It should instead subtract the adaptation field length from 184 (see the sketch after this list). Since there aren't 184 bytes left in such a packet, it ends up throwing: java.lang.IllegalArgumentException: payload: cannot acquire 1472 bits from a vector that contains 1464 bits

  3. The adaptationField also doesn't account for:

  • transport private data
  • adaptation extension
  • stuffing bytes
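
A sketch of the shape of a fix for the first two points (names hypothetical; this is not the example's actual code):

import scodec.*
import scodec.bits.ByteVector
import scodec.codecs.*

// Hypothetical sketch: read the adaptation_field_length byte first, then size the
// adaptation data and the payload from it. A 188-byte packet has a 4-byte header,
// leaving 184 bytes, of which 1 is the length byte itself.
val adaptationFieldAndPayload: Codec[(ByteVector, ByteVector)] =
  uint8.flatZip { adaptationLength =>
    bytes(adaptationLength) :: bytes(183 - adaptationLength)
  }.xmap(
    { case (_, fieldAndPayload) => fieldAndPayload },
    { case (field, payload) => (field.size.toInt, (field, payload)) }
  )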

Add additional padding combinators for encoding

There are a number of combinators that can be used to pad the encoded representation:

  • constantLenient
  • ignore
  • fixedSizeBytes
  • fixedSizeBits

These have various limitations, illustrated by some examples.

In particular:

  • zero fill only
  • trailing fill only
  • coupling to representation of padded codec
  • coupling to data being encoded

Additional combinators would primarily provide convenience since the current code suffices to create an implementation (see above examples).
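
For concreteness, a hedged sketch of one such convenience, padding to a fixed byte count with an arbitrary fill byte (paddedBytes is a hypothetical name and semantics):

import scodec.*
import scodec.bits.*

// Hypothetical combinator: encode `inner`, then pad with `pad` bytes up to `size` bytes.
def paddedBytes[A](size: Int, pad: Byte)(inner: Codec[A]): Codec[A] = new Codec[A] {
  private val totalBits = size.toLong * 8
  def sizeBound: SizeBound = SizeBound.exact(totalBits)
  def encode(a: A): Attempt[BitVector] =
    inner.encode(a).flatMap { bits =>
      if (bits.size > totalBits || bits.size % 8 != 0)
        Attempt.failure(Err(s"cannot pad ${bits.size} bits to $size bytes"))
      else Attempt.successful(bits ++ BitVector(ByteVector.fill((totalBits - bits.size) / 8)(pad)))
    }
  def decode(b: BitVector): Attempt[DecodeResult[A]] =
    b.acquire(totalBits) match {
      case Left(err)    => Attempt.failure(Err(err))
      case Right(block) => inner.decode(block).map(r => DecodeResult(r.value, b.drop(totalBits)))
    }
}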

I would be willing to provide a contribution if such extension is deemed appropriate.

Thanks

Support for RLE sequence codecs

Right now, both list and vector happily chew through the ENTIRE INPUT VECTOR without remorse. I'd like a version that feels remorse.

Correct me if I'm wrong, but the primary advantage to such an encoding (aside from saving four bytes) is that you can code an infinite sequence with the same codec that you would use for a finite sequence. However, you can achieve this benefit with RLE by simply using a negative run length (since Java ints are always signed!). Additionally, this benefit cannot be reaped in any case due to the API (List and Vector).

RLE is very hard to achieve from the outside, since I would need to proactively trim the BitVector, but I don't necessarily know the width of each of my values. In fact, for variable-width values like strings, it would be outright impossible without writing a completely green-field Codec.

Oh, in related news, the string codec is undelimited, so there is no difference between the coding for List() and List("") in the codec defined by list(utf8). This could either be viewed as an issue with utf8 or an issue with list. The sort-of-functional workaround is to do something like list(0x00 ~> utf8 <~ 0x01), but that's ugly and prone to errors.
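
For what it's worth, a sketch of the usual workaround: prefix the element count and each string's byte length, so List() and List("") encode differently and decoding is no longer greedy:

import scodec.*
import scodec.codecs.*

// Count-prefixed list of length-prefixed UTF-8 strings.
val strings: Codec[List[String]] = listOfN(int32, variableSizeBytes(int32, utf8))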

Decode until ByteVector delimiter

Hi! I'm trying to write an implementation of the Firmata protocol using scodec, but I couldn't find a way to implement some of the messages. In the message below, the firmware name starts after position 3 and continues until I read 0xF7, so I have no way to know in advance how long the string is:

0  START_SYSEX       (0xF0)
1  queryFirmware     (0x79)
2  major version     (0-127)
3  minor version     (0-127)
4  first char of firmware name (LSB)
5  first char of firmware name (MSB)
6  second char of firmware name (LSB)
7  second char of firmware name (MSB)
... for as many bytes as it needs
N  END_SYSEX         (0xF7)

Query Firmware Name and Version

This also happens with other messages, like the Capability Query.
Is there any way to implement this?

Thanks!!
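
One possible approach, sketched as a custom codec that scans for the one-byte terminator (assumes the region is byte-aligned; bytesUntil is hypothetical):

import scodec.*
import scodec.bits.*

// Hypothetical codec: decode all bytes before a one-byte terminator, consuming it.
def bytesUntil(terminator: Byte): Codec[ByteVector] = new Codec[ByteVector] {
  def sizeBound: SizeBound = SizeBound.unknown
  def encode(bv: ByteVector): Attempt[BitVector] =
    Attempt.successful((bv :+ terminator).bits)
  def decode(b: BitVector): Attempt[DecodeResult[ByteVector]] = {
    val all = b.bytes
    val i = all.indexOfSlice(ByteVector(terminator))
    if (i < 0) Attempt.failure(Err(f"terminator 0x$terminator%02x not found"))
    else Attempt.successful(DecodeResult(all.take(i), all.drop(i + 1).bits))
  }
}

// e.g. the firmware name region: bytesUntil(0xF7.toByte), then xmap to a String.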

case class auto derivation

Is scodec able to auto derive a codec for something like case class Yolo(req: String, a: String, b: Seq[String], c: Seq[String], d: Seq[Seq[String]])?

variable length list codec

We have blobs generated from existing systems, whose protocol consists of nested blocks. The last block can contain fewer elements than usual; the file ends there.

def list[A](codec: Codec[A]): Codec[List[A]] = new ListCodec(codec)

scodec.codecs.ListCodec is capable of handling this case via a limit argument, but scodec.codecs#list doesn't expose it.

Is there a reason why this particular API is not published? We would appreciate a method like listOfAtLeastN or something similar.
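
Until something like that is published, a hedged sketch against the public API only (listOfAtMostN is a hypothetical name):

import scodec.*
import scodec.bits.BitVector
import scala.collection.mutable.ListBuffer

// Hypothetical combinator: decode at most `limit` elements, stopping early when
// the input is exhausted (matching the "short last block" case). Encode is unchanged.
def listOfAtMostN[A](limit: Int, codec: Codec[A]): Codec[List[A]] = new Codec[List[A]] {
  def sizeBound: SizeBound = SizeBound.unknown
  def encode(as: List[A]): Attempt[BitVector] =
    as.foldLeft(Attempt.successful(BitVector.empty)) { (acc, a) =>
      acc.flatMap(bits => codec.encode(a).map(bits ++ _))
    }
  def decode(b: BitVector): Attempt[DecodeResult[List[A]]] = {
    val acc = ListBuffer.empty[A]
    var remaining = b
    var failure: Option[Err] = None
    while (failure.isEmpty && acc.size < limit && remaining.nonEmpty)
      codec.decode(remaining) match {
        case Attempt.Successful(DecodeResult(a, rest)) => acc += a; remaining = rest
        case Attempt.Failure(err)                      => failure = Some(err)
      }
    failure.fold(Attempt.successful(DecodeResult(acc.toList, remaining)))(Attempt.failure)
  }
}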

Infinite loop while decoding using listDelimiter

Hi,
I seem to have hit some kind of internal issue, which for me is critical:

listDelimited(hex"3a".bits, bytes).decode(hex"22993a01353a90223a093af20129260750062759082049964318053a2016283a203a003af230493af9783a003a22".bits)

goes into an infinite loop and I'm not exactly sure why...

Regards,

Olivier.

zlib codec eats remaining bits

scala> val codec = zlib(int32) ~ int32
codec: scodec.Codec[(Int, Int)] = (scodec.codecs.ZlibCodec@22aea051, 32-bit signed integer)

scala> val bits = codec.encode((1, 2)).require
bits: scodec.bits.BitVector = BitVector(128 bits, 0x789c6360606004000005000200000002)

scala> codec.decode(bits)
res1: scodec.Attempt[scodec.DecodeResult[(Int, Int)]] = Failure(cannot acquire 32 bits from a vector that contains 0 bits)

scala> zlib(int32).decode(bits)
res2: scodec.Attempt[scodec.DecodeResult[Int]] = Successful(DecodeResult(1,BitVector(empty)))

Codec Generator

In order to make things faster in the long run, would it be possible to consider implementing scodec as a compile-time codec generator (similar to a parser generator)? Perhaps with the option to hook into the generation phase so you can generate the codecs in other languages as well, using the same code base.

Codec for recursive data structure

I posted a question on Stack Overflow, but figured here might be a better place to ask:

http://stackoverflow.com/questions/32790957/define-codec-for-recursive-data-structure

I have a class looking like this,

case class Foo ( bar: Int, foos: Vector[Foo] )

to define a Codec[Foo], I tried this,

def fc = shapeless.Lazy((int32 ~~ vector(fc)).widenOpt( (Foo.apply _).tupled, Foo.unapply _ ))

But this did not work, and scodec threw a StackOverflowError. What is the right way of defining such a Codec?
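
A sketch of the usual knot-tying with scodec's lazily combinator, written against a recent scodec (the 1.x shapeless spelling differs); a count-prefixed vector also avoids greedy decoding:

import scodec.*
import scodec.codecs.*

case class Foo(bar: Int, foos: Vector[Foo])

// lazily defers construction until first use, so the recursive reference
// does not evaluate eagerly and blow the stack at definition time.
lazy val fooCodec: Codec[Foo] = lazily {
  (int32 :: vectorOfN(int32, fooCodec)).as[Foo]
}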

codec.dropUnits is very slow to compile for large case classes

See http://stackoverflow.com/questions/28109303/dropunit-on-hlisted-codecs-doesnt-seem-to-work

The following case class didn't finish compiling in ~10m using 1.7.0-RC1:

import scodec._
import bits._
import codecs._
import implicits._

case class Big(
  a: Int,
  b: Int,
  c: Int,
  d: Int,
  e: Int,
  f: Int,
  g: Int,
  h: Int,
  j: Int,
  k: Int,
  l: Int,
  m: Int,
  n: Int,
  o: Int,
  p: Int
)

object Big {
  implicit val codec: Codec[Big] = (
    constant(bin"0000") :: int32 :: int32 :: int32 :: int32 :: int32 ::
    constant(bin"1111") :: int32 :: int32 :: int32 :: int32 :: int32 ::
    constant(bin"0000") :: int32 :: int32 :: int32 :: int32 :: int32
  ).dropUnits.as[Big]
}

Incorrect documentation example

If you go here and then scroll down to "HList Codecs," you'll see two examples of making a Codec using ::. The second example contains a :: HNil in the type of threeInts, but the codec made similarly in the first example does not. This causes the first example to not compile; that line should be: val codec: Codec[Int :: Int :: String :: HNil] = uint8 :: uint8 :: string.

Also, after talking with @mpilquist, we came to the conclusion that it should be made explicitly clear that one needs to have import shapeless.{::, HNil} in order to have those examples work.

v1.1.0 packaged with stale classfile

        18: invokeinterface #48,  2           // InterfaceMethod scalaz/$bslash$div.flatMap:(Lscala/Function1;)Lscalaz/$bslash$div;

Well… that seems wrong. Here's how you can reproduce this issue:

package com.rr.experiment

import org.specs2.ScalaCheck
import org.specs2.mutable._

import org.scalacheck._

import scalaz._

import scodec._
import scodec.bits._
import scodec.codecs._

import shapeless._

object BasicSpecs extends Specification with ScalaCheck {

  "simple encoding and decoding" should {
    "code a point2" in check { (x: Int, y: Int) =>
      val codec = (int32 :: int32).as[Point2]

      val p = Point2(x, y)

      val \/-((BitVector.empty, p2)) = for {
        bits <- codec.encode(p)
        result <- Codec.decode(bits)(codec)
      } yield result

      p2 mustEqual p
    }

  }

  case class Point2(x: Int, y: Int)
}

There are slightly more minimal versions, but this is sufficient to see the problem. Run it with the following SBT project:

net.virtualvoid.sbt.graph.Plugin.graphSettings

organization := "com.richrelevance"

name := "scodec-experiments"

version := "0.1-SNAPSHOT"

scalaVersion := "2.11.2"

libraryDependencies := Seq(
  "org.scalaz"    %% "scalaz-core" % "7.0.6",
  "org.typelevel" %% "scodec-core" % "1.1.0",
  "com.chuusai"   %% "shapeless"   % "2.0.0",
  //
  "org.specs2"     %% "specs2"     % "2.4"    % "test",
  "org.scalacheck" %% "scalacheck" % "1.11.5" % "test")

The results?

[error]    IncompatibleClassChangeError: : Found class scalaz.$bslash$div, but interface was expected  (Encoder.scala:79)
[error] scodec.EncoderFunctions$class.encodeBoth(Encoder.scala:79)
[error] scodec.Codec$.encodeBoth(Codec.scala:164)
[error] scodec.codecs.HListCodec$$anon$2.encode(HListCodec.scala:21)
[error] scodec.codecs.HListCodec$$anon$2.encode(HListCodec.scala:20)
[error] scodec.Codec$$anon$1.encode(Codec.scala:33)
[error] com.rr.experiment.BasicSpecs$$anonfun$1$$anonfun$apply$1$$anonfun$apply$2.apply(BasicSpecs.scala:22)
[error] com.rr.experiment.BasicSpecs$$anonfun$1$$anonfun$apply$1$$anonfun$apply$2.apply(BasicSpecs.scala:16)
[error] org.scalacheck.Prop$$anonfun$forAllShrink$1.org$scalacheck$Prop$$anonfun$$result$1(Prop.scala:623)
[error] org.scalacheck.Prop$$anonfun$forAllShrink$1.apply(Prop.scala:659)
[error] org.scalacheck.Prop$$anonfun$forAllShrink$1.apply(Prop.scala:616)
[error] org.scalacheck.Prop$$anon$1.apply(Prop.scala:309)
[error] org.scalacheck.Prop$$anonfun$forAllShrink$1.org$scalacheck$Prop$$anonfun$$result$1(Prop.scala:623)
[error] org.scalacheck.Prop$$anonfun$forAllShrink$1.apply(Prop.scala:659)
[error] org.scalacheck.Prop$$anonfun$forAllShrink$1.apply(Prop.scala:616)
[error] org.scalacheck.Prop$$anon$1.apply(Prop.scala:309)
[error] org.scalacheck.Test$.org$scalacheck$Test$$workerFun$1(Test.scala:325)
[error] org.scalacheck.Test$.check(Test.scala:376)
[error] com.rr.experiment.BasicSpecs$.checkScalaCheckProperty(BasicSpecs.scala:13)
[error] com.rr.experiment.BasicSpecs$.checkProperty(BasicSpecs.scala:13)
[error] com.rr.experiment.BasicSpecs$.checkProp(BasicSpecs.scala:13)
[error] com.rr.experiment.BasicSpecs$.check(BasicSpecs.scala:13)
[error] com.rr.experiment.BasicSpecs$$anonfun$1$$anonfun$apply$1.apply(BasicSpecs.scala:16)
[error] com.rr.experiment.BasicSpecs$$anonfun$1$$anonfun$apply$1.apply(BasicSpecs.scala:16)

Somehow, EncoderFunctions#encodeBoth was compiled believing that Scalaz's \/ class was in fact a trait and not a class. The only way I can think of this happening is if EncoderFunctions$class.class were compiled with some older version of Scalaz and SBT's dependency resolution didn't realize that recompilation was needed. (shakes fist at sbt-release plugin)

Anyway, 1.2.0 doesn't have this problem.

"failed to encode size of [$a]" may lead to data leaks

When encoding sensitive data, the last thing we want is to see "failed to encode size of [{ 'password': 'Supercalifragilisticexpialidocious' }]" end up in our log.

This message is generated at VariableSizeCodec:18, and is nigh impossible to avoid.

Perhaps a message that omits encoded payload would do? e.g.:
"failed to encode size for $codec using $sizeCodec"?

OOM error while compiling

Hello,

I'm currently using v1.7.1

I have a small/moderately sized scodec project, and it appears I've reached a tipping point with the compiler somewhere. It will take over 10 minutes to compile, and will then throw an out of memory error. I have 2GB allocated to sbt.

Here is a stack trace:

java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at sbt.ConcurrentRestrictions$$anon$4.take(ConcurrentRestrictions.scala:188)
    at sbt.Execute.next$1(Execute.scala:83)
    at sbt.Execute.processAll(Execute.scala:86)
    at sbt.Execute.runKeep(Execute.scala:66)
    at sbt.EvaluateTask$.liftedTree1$1(EvaluateTask.scala:342)
    at sbt.EvaluateTask$.run$1(EvaluateTask.scala:341)
    at sbt.EvaluateTask$.runTask(EvaluateTask.scala:361)
    at sbt.Aggregation$$anonfun$3.apply(Aggregation.scala:64)
    at sbt.Aggregation$$anonfun$3.apply(Aggregation.scala:62)
    at sbt.EvaluateTask$.withStreams(EvaluateTask.scala:293)
    at sbt.Aggregation$.timedRun(Aggregation.scala:62)
    at sbt.Aggregation$.runTasks(Aggregation.scala:71)
    at sbt.Aggregation$$anonfun$applyTasks$1.apply(Aggregation.scala:32)
    at sbt.Aggregation$$anonfun$applyTasks$1.apply(Aggregation.scala:31)
    at sbt.Command$$anonfun$applyEffect$2$$anonfun$apply$3.apply(Command.scala:60)
    at sbt.Command$$anonfun$applyEffect$2$$anonfun$apply$3.apply(Command.scala:60)
    at sbt.Aggregation$$anonfun$evaluatingParser$4$$anonfun$apply$5.apply(Aggregation.scala:153)
    at sbt.Aggregation$$anonfun$evaluatingParser$4$$anonfun$apply$5.apply(Aggregation.scala:152)
    at sbt.Act$$anonfun$sbt$Act$$actParser0$1$$anonfun$sbt$Act$$anonfun$$evaluate$1$1$$anonfun$apply$10.apply(Act.scala:244)
    at sbt.Act$$anonfun$sbt$Act$$actParser0$1$$anonfun$sbt$Act$$anonfun$$evaluate$1$1$$anonfun$apply$10.apply(Act.scala:241)
    at sbt.Command$.process(Command.scala:92)
    at sbt.MainLoop$$anonfun$1$$anonfun$apply$1.apply(MainLoop.scala:98)
    at sbt.MainLoop$$anonfun$1$$anonfun$apply$1.apply(MainLoop.scala:98)
    at sbt.State$$anon$1.process(State.scala:184)
    at sbt.MainLoop$$anonfun$1.apply(MainLoop.scala:98)
    at sbt.MainLoop$$anonfun$1.apply(MainLoop.scala:98)
    at sbt.ErrorHandling$.wideConvert(ErrorHandling.scala:17)
    at sbt.MainLoop$.next(MainLoop.scala:98)
    at sbt.MainLoop$.run(MainLoop.scala:91)
    at sbt.MainLoop$$anonfun$runWithNewLog$1.apply(MainLoop.scala:70)
    at sbt.MainLoop$$anonfun$runWithNewLog$1.apply(MainLoop.scala:65)
    at sbt.Using.apply(Using.scala:24)
    at sbt.MainLoop$.runWithNewLog(MainLoop.scala:65)
    at sbt.MainLoop$.runAndClearLast(MainLoop.scala:48)
    at sbt.MainLoop$.runLoggedLoop(MainLoop.scala:32)
    at sbt.MainLoop$.runLogged(MainLoop.scala:24)
    at sbt.StandardMain$.runManaged(Main.scala:53)
    at sbt.xMain.run(Main.scala:28)
    at xsbt.boot.Launch$$anonfun$run$1.apply(Launch.scala:57)
    at xsbt.boot.Launch$.withContextLoader(Launch.scala:77)
    at xsbt.boot.Launch$.run(Launch.scala:57)
    at xsbt.boot.Launch$$anonfun$explicit$1.apply(Launch.scala:45)
    at xsbt.boot.Launch$.launch(Launch.scala:65)
    at xsbt.boot.Launch$.apply(Launch.scala:16)
    at xsbt.boot.Boot$.runImpl(Boot.scala:32)
    at xsbt.boot.Boot$.main(Boot.scala:21)
    at xsbt.boot.Boot.main(Boot.scala)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at scala.collection.mutable.ListBuffer.$plus$eq(ListBuffer.scala:170)
    at scala.collection.immutable.List.loop$1(List.scala:184)
    at scala.collection.immutable.List.mapConserve(List.scala:189)
    at scala.reflect.internal.tpe.TypeMaps$TypeMap.mapOver(TypeMaps.scala:115)
    at scala.reflect.internal.tpe.TypeMaps$SubstMap.apply(TypeMaps.scala:716)
    at scala.reflect.internal.tpe.TypeMaps$SubstMap.apply(TypeMaps.scala:685)
    at scala.collection.immutable.List.loop$1(List.scala:173)
    at scala.collection.immutable.List.mapConserve(List.scala:189)
    at scala.reflect.internal.tpe.TypeMaps$TypeMap.mapOver(TypeMaps.scala:157)
    at scala.reflect.internal.tpe.TypeMaps$SubstMap.apply(TypeMaps.scala:716)
    at scala.reflect.internal.Types$Type.subst(Types.scala:705)
    at scala.reflect.internal.Types$Type.instantiateTypeParams(Types.scala:470)
    at scala.reflect.internal.Types$ArgsTypeRef.asSeenFromInstantiated$1(Types.scala:1855)
    at scala.reflect.internal.Types$ArgsTypeRef.transform(Types.scala:1882)
    at scala.reflect.internal.Types$AliasTypeRef$class.betaReduce(Types.scala:2043)
    at scala.reflect.internal.Types$AliasArgsTypeRef.betaReduce(Types.scala:2326)
    at scala.reflect.internal.Types$AliasTypeRef$class.dealias(Types.scala:2014)
    at scala.reflect.internal.Types$AliasArgsTypeRef.dealias(Types.scala:2326)
    at scala.tools.nsc.typechecker.Implicits$ImplicitSearch.loop$1(Implicits.scala:519)
    at scala.tools.nsc.typechecker.Implicits$ImplicitSearch.checkCompatibility(Implicits.scala:556)
    at scala.tools.nsc.typechecker.Implicits$ImplicitSearch.isPlausiblyCompatible(Implicits.scala:369)
    at scala.tools.nsc.typechecker.Implicits$ImplicitSearch$ImplicitComputation.survives(Implicits.scala:816)
    at scala.tools.nsc.typechecker.Implicits$ImplicitSearch$ImplicitComputation$$anonfun$19$$anonfun$20.apply(Implicits.scala:872)
    at scala.tools.nsc.typechecker.Implicits$ImplicitSearch$ImplicitComputation$$anonfun$19$$anonfun$20.apply(Implicits.scala:872)
    at scala.collection.TraversableLike$$anonfun$filterImpl$1.apply(TraversableLike.scala:259)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at scala.collection.TraversableLike$class.filterImpl(TraversableLike.scala:258)
    at scala.collection.TraversableLike$class.filter(TraversableLike.scala:270)
    at scala.collection.AbstractTraversable.filter(Traversable.scala:104)
    at scala.tools.nsc.typechecker.Implicits$ImplicitSearch$ImplicitComputation$$anonfun$19.apply(Implicits.scala:872)
    at scala.tools.nsc.typechecker.Implicits$ImplicitSearch$ImplicitComputation$$anonfun$19.apply(Implicits.scala:871)
    at scala.collection.immutable.List.flatMap(List.scala:327)
[error] java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded

Some of my case classes are large (some with 38 fields), I use dropUnits and Discriminator frequently. I also use scala Enumerations heavily. Just from the stack trace it seems lost searching for implicits or something. I can try monkeying around with how I'm using implicits, as I'm actually not relying on that feature too heavily.

It's worth noting that my compile times were pretty quick until recently when I added a few case classes that had many fields in them. I see #41 was fixed in 1.7.0 so I doubt the same thing is going on.

Let me know if you'd like some more information or maybe a stripped-down gist of my coding style with scodec.

Thanks!

publish for 2.12.0-RC2?

scodec-bits for 2.12.0-RC1 is on Maven Central, but scodec-core wasn't. Did you hit any difficulties with RC1?

lookahead codec encodes

The lookahead codec should not encode the value; it should only decode to true/false. Currently, lookahead encodes with whatever codec is passed to it.

discriminated lazy codec failing roundtrip

Hi, here's a test that's not passing that I believe should be:

    "support discriminated recursive codec" in {
      sealed trait Message
      case class Batch(msgs: List[Message]) extends Message
      case class Single(msg: String) extends Message

      lazy val codec: Codec[Message] = lazily {
        discriminated[Message].by(uint8).
          typecase(0, utf8.as[Single]).
          typecase(1, list(codec).as[Batch])
      }

      val msg: Message = Batch(List(Single("first"), Single("second"), Single("third")))

      roundtrip(codec, msg)
    }

The output when run is:

[info] - should support discriminated recursive codec *** FAILED *** (122 milliseconds)
[info]   Batch(List(Single(firstsecondthird))) did not equal Batch(List(Single(first), Single(second), Single(third))) (CodecSuite.scala:26)

The strings inside each Single are concatenated when they shouldn't be, and so the roundtrip fails.

I've pushed this failing test to my fork on branch bug/lazy. I can PR the branch with the broken test, if that would be useful. I put the test in HListCodecTest.scala.
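
For reference, the greedy utf8 codec looks like the culprit rather than lazily itself: the first Single decoded consumes all remaining bytes. A sketch with explicit sizes, assuming utf8_32 and listOfN:

    lazy val codec: Codec[Message] = lazily {
      discriminated[Message].by(uint8).
        typecase(0, utf8_32.as[Single]).             // size-prefixed string
        typecase(1, listOfN(int32, codec).as[Batch]) // count-prefixed list
    }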

Diverging implicit expansion when looking up implicit codec defined in companion object of sealed trait with more than one subtype

Consider:

  sealed trait Foo
  object Foo {
    case object Bar extends Foo
    case object Baz extends Foo
    implicit val codec: Codec[Foo] = provide(Bar)
  }

Looking up this implicit with Codec[Foo] fails to compile as of 1.4.0 with the error:

[error] ...: diverging implicit expansion for type scodec.Codec[DerivedCodecsExample.this.Foo]
[error] starting with method derive in object Codec

Removing the case objects (and adjusting the codec to provide(null)) fixes the error, presumably because there's no LabelledGeneric[Foo] instance anymore, which prevents Codec.derive from being called.

parsing variable record length

Hi,
I am supposed to parse an old mainframe IBM VRL file. It has 7 different types of records, each with a different length. I got this to work using Option and different case classes for each record type.

However what I would like to do is this:

case class Header(length: Int)
case class SubRecord1 (field1: Int, field2: Long) extends Header
case class SubRecord1A(field3: Long) extends SubRecord1

and in the end I want to read from an input stream and get a stream of Header back (well, actually the correct subclasses)

so I would need to do something like this:

implicit val header: Codec[Header] = {
  int32 >>:~ { length =>
    if (length == 6)
      decodeSubRecord1A
    else
      decodeSubRecord2
    // and so on
  }
}.as[Header]

Is there any way of doing this?

Regards
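
A hedged sketch of one way to shape this, treating the length as a discriminator (types, fields, and lengths are hypothetical; note that a case class cannot extend another case class, hence the sealed trait):

import scodec.*
import scodec.codecs.*

sealed trait Record
case class SubRecord1(field1: Int, field2: Long) extends Record
case class SubRecord1A(field3: Long) extends Record

// The leading int32 length selects which record body to decode.
val recordCodec: Codec[Record] =
  discriminated[Record].by(int32)
    .typecase(6, int64.xmap[SubRecord1A](SubRecord1A.apply, _.field3))
    .typecase(12, (int32 :: int64).as[SubRecord1])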

variableSizeBytes implicitly requires byte quantization

When I write it like that, it seems obvious, but it's far less obvious when you're using it. Behold:

    "roundtrip a rle vector containing bool" in {
      val codec: Codec[Vector[Boolean]] = variableSizeBytes(int32, vector(bool))

      roundtripAll(codec, Seq(Vector(), Vector(true), Vector(false, false, true, false)))
    }

This doesn't work. We found this (or rather, @alissapajer found it) when using optional(bool, int32), which is a bit more obfuscated, but ultimately the problem is bool. It's a codec which codes into single bits, meaning that the bit length of the coded vector is not necessarily a multiple of 8, meaning that it doesn't round trip back through the byte quantizing. The following does work:

    "roundtrip a rle vector containing bool" in {
      val codec: Codec[Vector[Boolean]] = variableSizeBits(int32, vector(bool))

      roundtripAll(codec, Seq(Vector(), Vector(true), Vector(false, false, true, false)))
    }

The culprit is here, inside of variableSizeBytesLong:

    private val codec = variableSizeBitsLong(size.xmap[Long](_ * 8, _ / 8), value, sizePadding * 8)

Obviously the integer division by 8 is lossy, and that's precisely what is showing up in the round trip failure.

@alissapajer and I talked about this quite a bit, and while yes, it is right there in the name that you're dealing with bytes and not bits, it's still very surprising and, in a real sense, composition-breaking. I'm not sure exactly what to do, but it tripped us up for a while.

Support coproduct reordering with `.as[X]`

Coproduct codecs can be converted to sealed trait codecs using codec.as. For example:

sealed trait Sprocket
case class Wocket(...) extends Sprocket
case class Woozle(...) extends Sprocket

(woozleCodec :+: wocketCodec).as[Sprocket]

However, the component codecs must be specified in the same order as Generic[Sprocket]#Out. Hence, (wocketCodec :+: woozleCodec).as[Sprocket] fails to compile.

Add .as[X] method to CoproductCodecBuilder

CoproductCodecBuilder supports xmap and exmap for deferred codec transformations. Those methods were originally added in support of Codec.coproduct, which needs to defer the transform until after the user provides discriminator information.

However, the .auto method on CoproductCodecBuilder#NeedDiscriminators relies on knowing the final codec type to look up implicit Discriminator instances. In order to use .auto with a manually created CoproductCodecBuilder, users currently must write something like:

val g = Generic[Foo]
(subtype1 :+: subtype2 :+: ... :+: subtypeN).xmap(g.from, g.to).discriminatedBy(uint8).auto

The following should be allowed instead.

(subtype1 :+: subtype2 :+: ... :+: subtypeN).as[Foo].discriminatedBy(uint8).auto

Not entirely obvious limitations on lengths of certain ops

I am reading some hprof files using scodec (scodec_2.11-1.2.0.jar).
Some parts of these are quite large. I'm just playing around at the moment (with a 700 MB heap dump) but have noticed a few oddities.

For example, ignore takes an Int rather than a Long. I got around this using this function:

// maxIgnore (not shown in the original post) is presumably the largest
// bit count passed to a single codecs.ignore call, as an Int.
def longSkipper(lim: Long) = {
    val maxInts = lim / maxIgnore
    val rem = lim - (maxInts * maxIgnore)
    val lengths = rem.toInt :: List.fill(maxInts.toInt)(maxIgnore)

    lengths.map(codecs.ignore).reduce(_ <~ _)
}

This works fine. Then I wanted to use fixedSize to limit reading a segment. I tried making a FixedLongSizeCodec, just changing the type of the limit. This looks like it should work from the types, but it still has issues, because in a few places there are calls like:

bytesNeededForBits(m2).toInt

ByteVector seems to be the limiting factor here, as it is quite heavily based on Int indexes.
I gave up on the idea of working around this from the outside at this point.

So my questions are:
a) Will you accept changes to make these kinds of situations work?
b) Do you already have a plan for dealing with this?

Anyway, I am happy to have a go at this, but thought I'd ask first.

Missing scodec.Codec[Command] implicit because of class with non-value fields

I'm sorry for posting it here but this issue is blocking me.

SO question http://stackoverflow.com/questions/41997534/missing-scodec-codeccommand-implicit-because-of-class-with-non-value-fields

I'm trying to use discriminators in an existing project, and something is wrong with my classes, I guess.

Consider this scodec example. If I change TurnLeft and its codec to

sealed class TurnLeft(degrees: Int) extends Command {
  def getDegrees: Int = degrees
}
implicit val leftCodec: Codec[TurnLeft] = uint8or16.xmap[TurnLeft](v => new TurnLeft(v), _.getDegrees)

I get

Error:(x, x) could not find Lazy implicit value of type scodec.Codec[Command]
    val codec: Codec[Either[UnrecognizedCommand, Command]] = discriminatorFallback(unrecognizedCodec, Codec[Command])

It all works if I make degrees a val field. I suspect it's something tricky with shapeless. What should I do to make it work?

Sample project that demonstrates the issue is here.

README.md seems to be outdated

I'm using Kryo at the moment. I've heard about scodec and decided to try it.

Sadly none of the examples from README.md work for me.

Since it doesn't say which dependencies to use, I took the latest from Maven:

          "org.scodec" %% "scodec-core" % "1.10.3",
          "org.scodec" %% "scodec-bits" % "1.1.2",
          "org.scodec" %% "scodec-scalaz" % "1.4.1",

The following code can't find a codec for String:

    val test = "sdfsfd"
    import scodec._
    import scodec.bits._
    import codecs._
    val encoded = Codec.encode(test)

The error is:

Error:(296, 31) could not find Lazy implicit value of type scodec.Codec[String]
    val encoded = Codec.encode(test)

I was able to compile it by changing it to

val encoded = Codec.encode(test)(codecs.string(Charset.defaultCharset()))

but I doubt that's the intended way of using it.

Then the following code

    val firstCodec = (uint8 ~ uint8 ~ uint16)
    val result: DecodeResult[(Int ~ Int ~ Int)] = Codec.decode(firstCodec, BitVector(0x10, 0x2a, 0x03, 0xff))
    val add3 = (_: Int) + (_: Int) + (_: Int)
    val sum: DecodeResult[Int] = result map add3

doesn't compile because there is no decode method with that signature. I guess I can change it to

val result: Attempt[DecodeResult[((Int, Int), Int)]] = Codec.decode(BitVector(0x10, 0x2a, 0x03, 0xff))(firstCodec)

but the return type is different, so result map add3 doesn't compile because there is no map method.

Could you please update the readme to match the latest version of the library?

Thank you.

Mis-use of ByteBuffer#array could lead to issues

Just noticed this line:

BitVector(ByteVector.view(ByteBuffer.allocate(8).order(byteOrder).putDouble(value).array)).right

ByteBuffer#array is not guaranteed to return a properly sized array and it may be padded. You'll need to trim the array (or use ByteBuffer's get method with an array). I've been bitten by this before - it is not a fun one to debug.

variableSizeWithEndOfBlockMarker*

Sometimes when dealing with variable-size encodings we might not know the size, but there will be an end-of-block marker. Hence, is it possible to add the above, so that decoding ends when the delimiter is encountered? In this scheme you have to consider how to escape the delimiter if the encoded stream can legally contain it. You should be able to supply some of the common conventions.

Should ~> really require a Monoid?

The reason I ask is the relatively common use-case of a static delimiter preceding a field. For example:

constant(BitVector(0x01B)) ~> int32 :: constant(BitVector(0x02B)) ~> int32

Unfortunately, the above doesn't work unless I provide a Monoid[Unit]. Since you're really just using the monoid for the zero, wouldn't it make more sense to straight-up require Codec[Unit] on the left-hand side? It's obviously quite easy to xmap any codec to Codec[Unit] (and in fact, I would argue this should itself be a combinator).
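
A sketch of the suggested combinator (asUnit is a hypothetical name; newer scodec has a unit method on Codec in this spirit):

import scodec.*

// Forget a codec's decoded value and encode a fixed one, yielding Codec[Unit].
def asUnit[A](codec: Codec[A])(a: A): Codec[Unit] =
  codec.xmap[Unit](_ => (), _ => a)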

I just really, really don't like providing Monoid[Unit]. :-)

Standard Codecs for Literals encoded as Char

Some protocols use plain chars to represent numbers and other types. So is it possible to have some standard codecs for such decoding?

E.g.

  • +1_000.3
  • -1,000
  • 1k
  • 0b101010101011
  • -0xHA
  • 0o123
  • etc.
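
A hedged sketch of what one such codec might look like, for fixed-width ASCII decimal integers (asciiInt is hypothetical; each format above would need its own parser):

import scodec.*
import scodec.codecs.*

// Hypothetical codec: an integer written as exactly `digits` ASCII decimal digits,
// zero-padded on encode.
def asciiInt(digits: Int): Codec[Int] =
  fixedSizeBytes(digits.toLong, ascii).exmap(
    s => s.toIntOption.map(Attempt.successful)
          .getOrElse(Attempt.failure(Err(s"not a number: '$s'"))),
    i => {
      val s = i.toString
      if (s.length > digits) Attempt.failure(Err(s"$i does not fit in $digits digits"))
      else Attempt.successful("0" * (digits - s.length) + s)
    }
  )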
