
audio's Introduction

audio

audio is a generic Go package designed to define a common interface to analyze and/or process audio data.

At the heart of the package is the Buffer interface and its implementations:

  • FloatBuffer
  • Float32Buffer
  • IntBuffer

Decoders, encoders, processors, analyzers and transformers can be written to accept or return these types and share a common interface.

The idea is that audio libraries can accept this interface or one of its implementations as input and return an audio.Buffer, which allows all audio libraries to be chained together.

Performance

The buffer implementations are designed so that a buffer can be reused and mutated in place, avoiding allocation penalties.
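A minimal sketch of the reuse pattern, using a local stand-in type rather than the package's own buffers (the `intBuffer` type and the `fill` helper are hypothetical names introduced for illustration): the slice is allocated once and every pass overwrites it in place.

```go
package main

import "fmt"

// intBuffer is a hypothetical stand-in for a reusable sample buffer;
// it is not the package's IntBuffer.
type intBuffer struct {
	Data []int
}

// fill overwrites buf.Data in place with up to len(buf.Data) samples taken
// from src starting at offset, and returns how many samples were written.
func fill(buf *intBuffer, src []int, offset int) int {
	if offset >= len(src) {
		return 0
	}
	return copy(buf.Data, src[offset:])
}

func main() {
	src := []int{1, 2, 3, 4, 5, 6, 7}
	buf := &intBuffer{Data: make([]int, 3)} // allocated once, before the loop

	for offset := 0; ; {
		n := fill(buf, src, offset)
		if n == 0 {
			break
		}
		// Only the first n entries are valid on a short final pass.
		fmt.Println(buf.Data[:n])
		offset += n
	}
}
```

The loop never allocates: each pass reuses the same backing array, which is the property the buffer implementations are meant to preserve.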

It is recommended to avoid using Float32Buffer unless performance is critical. The major drawback of using float32s is that the Go standard library was designed to work with float64 (the math package, for example, operates on float64 values), so access to standard packages is limited.

Usage

Examples of how to use this interface are available under the go-audio organization.

audio's People

Contributors

bovarysme, glaslos, kybin, lsegal, mattetti, shabbyrobe, velovix

audio's Issues

Play sample loud and choppy

Trying to use PortAudio to play... It's super loud and choppy.

package main

import (
	"flag"
	"fmt"
	"io"
	"os"

	"strings"

	"github.com/go-audio/audio"
	"github.com/go-audio/wav"
	"github.com/gordonklaus/portaudio"
)

var (
	flagInput  = flag.String("input", "/Users/brydavis/sounds/The Malinchak Show Episode 108.wav", "The file to convert")
	flagFormat = flag.String("format", "wav", "The format to convert to (wav or aiff)")
	flagOutput = flag.String("output", "out", "The output filename")
)

func main() {
	flag.Parse()
	if *flagInput == "" {
		fmt.Println("Provide an input file using the -input flag")
		os.Exit(1)
	}
	switch strings.ToLower(*flagFormat) {
	case "aiff", "aif":
		*flagFormat = "aiff"
	case "wave", "wav":
		*flagFormat = "wav"
	default:
		fmt.Println("Provide a valid -format flag")
		os.Exit(1)
	}
	f, err := os.Open(*flagInput)
	if err != nil {
		panic(err)
	}
	defer f.Close()

	var buf *audio.IntBuffer

	wd := wav.NewDecoder(f)

	if !wd.WasPCMAccessed() {
		err := wd.FwdToPCM()
		if err != nil {
			panic(err)
		}
	}

	framePerBuffer := 2048

	buf = &audio.IntBuffer{Format: wd.Format(), Data: make([]int, framePerBuffer)}
	var n int
	var doneReading bool

	portaudio.Initialize()
	defer portaudio.Terminate()

	out := make([]float32, framePerBuffer)

	stream, err := portaudio.OpenDefaultStream(0, 2, 44100, framePerBuffer, &out)

	if err != nil {
		fmt.Printf("unable to open port audio stream for playback: %s\n", err.Error())
		os.Exit(1)
		return
	}

	defer stream.Close()

	if err := stream.Start(); err != nil {
		fmt.Printf("unable to open port audio stream for playback: %s\n", err.Error())
		os.Exit(1)
		return
	}

	defer stream.Stop()

	for err == nil {
		n, err = wd.PCMBuffer(buf)
		if err != nil && err != io.EOF && err != io.ErrUnexpectedEOF {
			fmt.Println(err)
		}

		if n != len(buf.Data) {
			buf.Data = buf.Data[:n]
			doneReading = true
		}

		intToF32Copy(out, buf.Data)

		// write to the stream
		if err := stream.Write(); err != nil {
			fmt.Println(err)
		}

		if doneReading {
			break
		}
	}

}

// portaudio doesn't support float64 so we need to copy our data over to the
// destination buffer.
func intToF32Copy(dst []float32, src []int) {
	for i := range src {
		dst[i] = float32(src[i])
	}
}
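One possible explanation, offered as an assumption rather than a confirmed diagnosis: intToF32Copy copies the raw integer samples (up to ±32767 for 16-bit audio) into a float32 stream, while PortAudio's float32 sample format expects values in the range [-1, 1]. The sketch below scales by the source bit depth instead. The function name `intToF32Normalized` and its `bitDepth` parameter are hypothetical; the bit depth would have to be taken from the decoded file.

```go
package main

import "fmt"

// intToF32Normalized is a hypothetical replacement for intToF32Copy. It scales
// signed integer PCM samples of the given bit depth into the [-1, 1) range and
// returns the number of samples written to dst.
func intToF32Normalized(dst []float32, src []int, bitDepth int) int {
	// Full-scale magnitude, e.g. 32768 for 16-bit audio.
	scale := float32(int64(1) << uint(bitDepth-1))

	n := len(src)
	if len(dst) < n {
		n = len(dst)
	}
	for i := 0; i < n; i++ {
		dst[i] = float32(src[i]) / scale
	}
	// Zero the tail so a short final read does not replay stale samples.
	for i := n; i < len(dst); i++ {
		dst[i] = 0
	}
	return n
}

func main() {
	src := []int{-32768, 0, 16384}
	dst := make([]float32, 4)
	intToF32Normalized(dst, src, 16)
	fmt.Println(dst)
}
```

Two other things in the program may also be worth checking, again as guesses: the stream is opened with 2 output channels while the buffers hold only framePerBuffer samples in total, and the sample rate is hard-coded to 44100 rather than read from the file's format.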

float32 buffer to int buffer conversion should be corrected

I wanted to convert an ogg file to aiff format, so I used oggvorbis to decode an ogg audio file, and the package gave me a decoded float32 array. I then used go-audio/aiff to encode the float32 PCM data to aiff. Later I found that the encoder only accepts an IntBuffer, so I made a FloatBuffer and used its AsIntBuffer method to do the conversion, but this method didn't give me the proper result. I looked into the code and found that inside the AsIntBuffer method the float value is simply converted to int using int(float_value). This is absolutely incorrect. After some research, I wrote this to resolve the problem, and luckily it works fine.

//ignore the magical unsafe statement...
package main

import (
	"io/ioutil"
	"log"
	"reflect"
	"unsafe"
)

func main() {
	//test.raw is a pcm file,format float32
	data, err := ioutil.ReadFile("test.raw")
	if err != nil {
		log.Fatalln(err)
	}
	var f32Buf []float32
	for len(data) > 0 {
		buf := data[:4]
		data = data[4:]
		f := *(*float32)(unsafe.Pointer((*reflect.SliceHeader)(unsafe.Pointer(&buf)).Data))
		f32Buf = append(f32Buf, f)
	}
	iBuf := convert(f32Buf)
	var oBuf []byte
	for _, v := range iBuf {
		header := &reflect.SliceHeader{
			Data: uintptr(unsafe.Pointer(&v)),
			Cap:  2,
			Len:  2,
		}
		oBuf = append(oBuf, *(*[]byte)(unsafe.Pointer(header))...)
	}

	//output is a pcm file,format int
	ioutil.WriteFile("output.raw", oBuf, 0600)
}

func convert(f32buf []float32) (i16Buf []int16) {
	for _, v := range f32buf {
		sample := v * 32767
		i16Buf = append(i16Buf, int16(sample))
	}
	return
}

I wonder whether the current implementation should be considered a bug or not, but at the very least the documentation should be clearer so that people won't be misled.
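For comparison, the same raw-file conversion can be sketched without unsafe, using encoding/binary and math.Float32frombits. This is an illustration, not the package's method: it assumes the input is little-endian float32 PCM and the output is 16-bit little-endian PCM, and it adds clamping, which the snippet above does not do. The function name `float32BytesToInt16Bytes` is hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// float32BytesToInt16Bytes converts little-endian float32 PCM bytes into
// little-endian 16-bit PCM bytes. Trailing bytes that do not form a whole
// float32 are ignored.
func float32BytesToInt16Bytes(data []byte) []byte {
	out := make([]byte, 0, len(data)/2)
	var b [2]byte
	for ; len(data) >= 4; data = data[4:] {
		f := math.Float32frombits(binary.LittleEndian.Uint32(data))
		switch {
		case f != f: // NaN: treat as silence
			f = 0
		case f > 1: // clamp so out-of-range input cannot wrap around
			f = 1
		case f < -1:
			f = -1
		}
		s := int16(f * 32767) // truncates toward zero
		binary.LittleEndian.PutUint16(b[:], uint16(s))
		out = append(out, b[:]...)
	}
	return out
}

func main() {
	// 0.5, -1.0 and 2.0 encoded as little-endian float32 values.
	in := []byte{
		0x00, 0x00, 0x00, 0x3F,
		0x00, 0x00, 0x80, 0xBF,
		0x00, 0x00, 0x00, 0x40,
	}
	fmt.Printf("% X\n", float32BytesToInt16Bytes(in))
}
```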

is this API appropriate, especially for real time use

This discussion is a follow-up to the initial proposal. The two main questions that were raised are:

  • Is this API appropriate for real-time usage (especially with regard to allocation and memory size)?
  • Is the interface too big or not adequate?

@egonelbre @kisielk @nigeltao @taruti all brought up good points and Egon is working on a counter proposal focusing on smaller interfaces with compatibility with types commonly found in the wild (int16, float32).

As mentioned in the original proposal, I'd like this organization to be a special interest group of people interested in doing more/better audio in Go. I have to admit my focus hasn't been real-time audio and I very much appreciate the provided feedback. We all know this is a challenging issue which usually results in a lot of libraries doing things in very different ways. However, I do want to believe that we, as a community and with the support of the core team, can come up with a solid API for all Go audio projects.

Acronyms should have a consistent case

According to the Code Review Comments document, which provides style guidelines for Go code:

Words in names that are initialisms or acronyms (e.g. "URL" or "NATO") have a consistent case. For example, "URL" should appear as "URL" or "url" (as in "urlPony", or "URLPony"), never as "Url". Here's an example: ServeHTTP not ServeHttp.

For that reason, I think functions IeeeFloatToInt and IntToIeeeFloat should be renamed to IEEEFloatToInt and IntToIEEEFloat, respectively.

I know this is bikeshedding, but if this library is to become an official interface for Go audio, I think this kind of stuff is somewhat important.

dsp package?

Trying to figure out where a good place to put commonly used DSP functions would be. It seems like they would be used across a lot of the packages in the project, so maybe a dsp package? Should it be github.com/go-audio/dsp or github.com/go-audio/audio/dsp?

Default Stereo formats have Single channel

The formats in formats.go are identical whether they are stereo or mono. We'd probably expect Channels to be 2 for stereo formats. If this is intentional, maybe there should be a note about it?

format converter example

From @kisielk:

I think the best thing to do would be to write a simple application like a format converter and suss out some of the API. It would:

  • Accept a file as an input
  • Have [a] format flag(s) to specify the desired output format
  • An argument for output filename

Then the application could detect the format of the input file from a known set, and perform a conversion to the desired output type. This is pretty much the bare minimum "round trip" that the audio API would need to be able to handle (apart from the slightly more simple case of synthesizing audio to an output file).

code styling

I'd like to discuss code styling for a minute. It would be good to agree on a general approach to the code we'd like to write. Here are some examples; I'd like to hear more about what you think.

Embedded type and optimized struct sizes

type Format struct {
	SampleRate uint32
	Channels   int16
}

type Buffer32 struct {
	Format
	Data []float32
}

vs pointers and convenient field types

type Format struct {
	SampleRate int
	Channels   int
}

type Buffer32 struct {
	Format *Format
	Data []float32
}

Convenient but straight forward constructor

func NewBuffer32(format Format, sampleCount int) *Buffer32 {
	return &Buffer32{
		Format: format,
		Data:   make([]float32, sampleCount, sampleCount),
	}
}

vs DIY approach. (Note that in this code, the sample count should probably be multiplied by the number of channels, which is an easy thing to forget.)

&Buffer32{
	Format: format,
	Data:   make([]float32, sampleCount),
}
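To make the channel-count point concrete, here is a sketch of a constructor that takes a frame count and does the multiplication itself. The types mirror the sketch types from this discussion (the pointer variant) and are illustrative only, not a settled API; the `frameCount` parameter name is an assumption.

```go
package main

import "fmt"

// Format and Buffer32 mirror the sketch types discussed in this thread;
// they are illustrative and not the package's final API.
type Format struct {
	SampleRate int
	Channels   int
}

type Buffer32 struct {
	Format *Format
	Data   []float32
}

// NewBuffer32 allocates room for frameCount frames, that is
// frameCount * Channels interleaved samples, so callers cannot
// forget the multiplication.
func NewBuffer32(format *Format, frameCount int) *Buffer32 {
	return &Buffer32{
		Format: format,
		Data:   make([]float32, frameCount*format.Channels),
	}
}

func main() {
	stereo := &Format{SampleRate: 44100, Channels: 2}
	buf := NewBuffer32(stereo, 1024)
	fmt.Println(len(buf.Data)) // prints 2048
}
```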

Explicit, manually duplicated functions

func (buffer *Buffer32) Process32(output *Buffer32) error {
	// stuff
	return nil
}

func (buffer *Buffer64) Process64(output *Buffer64) error {
	// stuff
	return nil
}

vs interface and type switching

func Process(buf1, buf2 Buffer) error {
	// Dispatch on the concrete type behind the Buffer interface.
	switch buf1.(type) {
	case *Buffer32:
		// some stuff
	case *Buffer64:
		// some other stuff
	}
	return nil
}

vs using go generate for any functions implemented the same way in float32 and 64

I personally like:

2 (pointers and convenient field types): I don't think the convenience of having a field in a type we can easily manipulate trumps the memory size gain. Plus using a pointer would allow us to provide predefined formats.

3 (constructor): Constructors are often convenient, especially as entry points and when the setup isn't trivial. This example is a good one since it was written by hand and technically has a bug. This bug would be mitigated by a constructor. It does come at the price of a bigger API and potentially surprising behaviors.

Finally 5 (explicit, duplicated functions): It might be a bit of a pain at first but it works better with code completion (easier for people not knowing the API), and it can be optimized (custom logic and/or SIMD), but it can be the source of more bugs (fixing something on the float32 side but not float64, for instance). 7 (code generation) is nice but it always feels a bit clunky to have to move things outside of the code generation path because of some edge cases.

@egonelbre @kisielk @nigeltao what do you think?

Incorrect conversion of Float to Int

Hi,

I tried to convert a Float32Buffer to an IntBuffer, resulting in silence. The problem is that the current implementation has a flaw:

newB.Data[i] = int(buf.Data[i])

Float audio samples use the value range -1.0 to 1.0, so simply casting them to int won't do the job: the conversion truncates toward zero, so almost every sample becomes 0, hence the silence. For a proper conversion, the bit depth of the integer (8/16/32/64) must be known. Here is a good Stack Overflow article on how to do it right.

The problem is that within a Float32Buffer, the bit depth is not available :-(
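A minimal sketch of a bit-depth-aware conversion, assuming the caller supplies the target bit depth explicitly since the buffer does not carry it. The function name `float32ToInt` and its signature are hypothetical and not part of the package.

```go
package main

import "fmt"

// float32ToInt scales samples in [-1, 1] to signed integers of the given bit
// depth and returns the number of samples written to dst. Out-of-range input
// is clamped; the fractional part is truncated toward zero.
func float32ToInt(dst []int, src []float32, bitDepth int) int {
	// Largest positive sample value, e.g. 32767 for 16 bits.
	fullScale := float64(int64(1)<<uint(bitDepth-1)) - 1

	n := len(src)
	if len(dst) < n {
		n = len(dst)
	}
	for i := 0; i < n; i++ {
		f := float64(src[i])
		if f > 1 {
			f = 1
		} else if f < -1 {
			f = -1
		}
		dst[i] = int(f * fullScale)
	}
	return n
}

func main() {
	src := []float32{1, -1, 0, 0.5, 2}
	dst := make([]int, len(src))
	float32ToInt(dst, src, 16)
	fmt.Println(dst) // prints [32767 -32767 0 16383 32767]
}
```

Scaling by 2^(bitDepth-1) - 1 keeps the result symmetric around zero; other conventions (for example scaling by 2^(bitDepth-1) and clamping the positive peak) are also in common use.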
