Code Monkey home page Code Monkey logo

itertools's Introduction

Itertools

Extra iterator adaptors, functions and macros.

Please read the API documentation here.

How to use with Cargo:

[dependencies]
itertools = "0.13.0"

How to use in your crate:

use itertools::Itertools;

How to contribute

If you're not sure what to work on, try checking the help wanted label.

See our CONTRIBUTING.md for a detailed guide.

License

Dual-licensed to be compatible with the Rust project.

Licensed under the Apache License, Version 2.0 https://www.apache.org/licenses/LICENSE-2.0 or the MIT license https://opensource.org/licenses/MIT, at your option. This file may not be copied, modified, or distributed except according to those terms.

itertools's People

Contributors

bluss avatar bors[bot] avatar bsteinb avatar danieleades avatar easyoakland avatar flo-l avatar gin-ahirsch avatar hellow554 avatar jojojet avatar jswrenn avatar justusfluegel avatar kinto-b avatar malbarbo avatar mankinskin avatar milibopp avatar mitchmindtree avatar mmirate avatar nbraud avatar orium avatar owen-ch-leung avatar philippe-cholet avatar phimuemue avatar rkarp avatar sagebati avatar shepmaster avatar skifire13 avatar ten0 avatar tranzystorekk avatar vadixidav avatar xaeroxe avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

itertools's Issues

Breaking changes in 0.5

  • #86
  • Possibly depend on use crate either #129
  • Remove .map_fn(), MapFn
  • Remove .mend_slices
  • Remove reexports itertools::{enumerate, rev} (exist in the free module)
  • Remove .into_rc()
  • Remove .sort_by (deprecated)
  • Remove ZipSlices (doesn't live up to its perf promise anymore due to optimizer changes)
  • Remove ZipTrusted (will have new life with specialization)
  • Rename Group by lazy and chunks lazy?
  • Remove Stride/StrideMut? Are separate crate material
  • Remove Linspace (doesn't work properly without using crate num-traits)
  • Remove itertools::equal (Iterator.eq is the same).
  • Remove dropn (can't use .nth() like dropping does).

Group by that merges same key elements

After a quick glance I couldn't find a function like that. I believe it would be useful to have for unsorted Vecs, instead of creating multiple groups only for consecutive elements.

Thoughts?

cannot use .rev() on (i..j).combinations()

Would be nice to have that

fn main() {
    let v: Vec<_> = (0..5).combinations().rev().collect();
}
cargo build --release
   Compiling palindrome v0.1.0 (file:///Users/SumProxy/Projects/rust/palindrome)
src/main.rs:43:43: 43:48 error: the trait `core::iter::DoubleEndedIterator` is not implemented for the type `itertools::adaptors::Combinations<core::ops::Range<_>>` [E0277]
src/main.rs:43     let v: Vec<_> = (0..5).combinations().rev().collect();
                                                         ^~~~~
src/main.rs:43:43: 43:48 help: run `rustc --explain E0277` to see a detailed explanation
src/main.rs:43:49: 43:58 error: no method named `collect` found for type `core::iter::Rev<itertools::adaptors::Combinations<core::ops::Range<_>>>` in the current scope
src/main.rs:43     let v: Vec<_> = (0..5).combinations().rev().collect();
                                                               ^~~~~~~~~
src/main.rs:43:49: 43:58 note: the method `collect` exists but the following trait bounds were not satisfied: `itertools::adaptors::Combinations<core::ops::Range<_>> : core::iter::DoubleEndedIterator`, `core::iter::Rev<itertools::adaptors::Combinations<core::ops::Range<_>>> : core::iter::Iterator`
error: aborting due to 2 previous errors
Could not compile `palindrome`.

To learn more, run the command again with --verbose.

Index trait error in stride.rs compliation.

When compiling src/lib.rs, I get the following error:

src/stride.rs:141:5: 148:6 error: method `index` has an incompatible type for trait: expected type parameter but found &-ptr
src/stride.rs:141     fn index<'b>(&'b self, i: &uint) -> &'b A
src/stride.rs:142     {
src/stride.rs:143         assert!(*i < self.size_hint().val0());
src/stride.rs:144         unsafe {
src/stride.rs:145             let ptr = self.begin.offset(self.stride * (*i as int));
src/stride.rs:146             mem::transmute(ptr)

This is compiling with rustc version 0.11.0, on Arch Linux x86_64.

Add minmax/minmax_by_key

Sometimes I'd like to keep track of minimum and maximum without iterating twice (if that's even possible). Itertools might be a good place for such adapters?

Improve the fast case for .group_by(), .chunks()

We could keep some kind of flag for non-buffering, so that the simplest use case of .group_by_lazy() is as fast as possible.

The simplest case is: For each group, the group is iterated until its end, before the next iteration of the group's iterator.

Add function to convert Option<IntoIterator> into Iterator

I've run into a situation where I could use a function that takes an Option<T> where T: IntoIterator and return a struct that implements the Iterator trait. The behavior would be the same as the contained iterator, or std::iter::Empty if there is no contained iterator.

First of all, is there any way to do this without defining a new struct? If not, would this function be a useful addition to itertools? I'm willing to do the implementation, I just wanted to get the go-ahead first.

prepend/append convenience methods

I have quite some instances of combining iter::once and Iterator::chain in my code, do you think it would make sense to add two convenience methods for this? Looking at Vec, push may be a better name for append, but I'm not sure what to call prepend then. The argument order of prepend may be confusing, either way.

Use #[must_use]

Mimic libstd and put #[must_use] on adaptors where it makes sense.

Add a `skip_every` iterator

The name and semantics can be changed if necessary, but I'd like an iterator that skips every nth element, and I wasn't able to find an existing one in reading through the docs. For example:

let items = vec![1, 2, 3, 4, 5, 6, 7, 8];
assert_eq!(vec![1, 2, 4, 5, 7, 8], items.iter().skip_every(3).collect());

Stop using function pointers

Function pointers are useful in place of closures, because we can actually return them without boxing; however, the virtual calls are hard for the compiler to inline, leading to degradation, see #47.

For itertools 0.4.0, remove function pointers. Possible workarounds: Wrapper types. Unit structs as statically compiled function values.

Add reductions()

Unless I missed it, there doesn't seem to currently exist a function to get an iterator of the intermediate values of a reduction function. Here is an example of how this can be used to easily create a cumulative sum iterator:

#[test]
fn test_cumsum() {
    let expected = vec![1, 3, 6, 10, 15];
    let result: Vec<_> = (1..6).reductions(|a,b| a+b).collect();
    assert_eq!(expected, result);
}

For me this use-case up when I needed a cumulative sum over an addition operator without inverse, to optimize sums of certain structure.

Does this seem like something that could be part of this library? (The name "reductions" is not great, as rust uses "fold" instead of "reduce", there's probably some other term for this.)


I came up with the following implementation for my own use, though me being new to Rust it likely does everything wrong:

pub struct Reductions<I, R> where I : Iterator {
    iter: I,
    reductor: R,
    prev: Option<I::Item>
}

impl<I, R> Iterator for Reductions<I, R>
        where I: Iterator, I::Item: Clone,
              R: FnMut(I::Item, I::Item) -> I::Item {
    type Item = I::Item;

    fn next(&mut self) -> Option<Self::Item> {
        self.iter.next().map(|item| {
            let value = match self.prev.take() {
                Some(prev) => (self.reductor)(prev, item),
                None => item
            };
            self.prev = Some(value.clone());
            value
        })
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        self.iter.size_hint()
    }
}

impl<I, R> ExactSizeIterator for Reductions<I, R>
    where I: Iterator, I::Item: Clone,
          R: FnMut(I::Item, I::Item) -> I::Item {}

pub trait IteratorExt : Iterator {
    fn reductions<R>(self, reductor: R) -> Reductions<Self, R>
            where R: FnMut(Self::Item, Self::Item) -> Self::Item,
                  Self: Sized {
        Reductions {
            iter: self,
            reductor: reductor,
            prev: None
        }
    }
}

impl<T> IteratorExt for T where T: Iterator { }

Add a .windows(n) iterator adapter

Similar to .windows() in std, just not only for slices but for arbitrary iterators. I saw that .chunks() is already implemented as chunks_lazy(). Would you find eg. windows_lazy() implemented in a similar fashion useful?

I've read the code of chunks_lazy(), it's quite complicated. But with a few hints from somebody who knows the code I could probably implement windows_lazy() myself.

step() slowness

I encountered a case where the step() method leads to incredibly slow code:

#![feature(step_by)]
extern crate time;
use time::PreciseTime;
extern crate itertools;
use itertools::Itertools;

fn calc2(n: usize) -> Vec<usize> {
    let mut t = vec![0; n] ;
    for i in 2..n {
        if t[i] == 0 {
            t[i] = i - 1;
            for j in (2*i..n).step_by(i) {
                if t[j] == 0 {
                    t[j] = j;
                }
                t[j] = t[j] * (i - 1) / i;
            }
        }
    }
    t
}

fn calc1(n: usize) -> Vec<usize> {
    let mut t = vec![0; n] ;
    for i in 2..n {
        if t[i] == 0 {
            t[i] = i - 1;
            for j in (2*i..n).step(i) {
                if t[j] == 0 {
                    t[j] = j;
                }
                t[j] = t[j] * (i - 1) / i;
            }
        }
    }
    t
}

fn main() {
    let nb = 100_001;
    let start = PreciseTime::now();
    let s = calc2(nb).into_iter().fold(0, |a, c| a + c);
    println!("sum (step_by): {} {:?}", s, start.to(PreciseTime::now()));

    let start = PreciseTime::now();
    let s = calc1(nb).into_iter().fold(0, |a, c| a + c);
    println!("sum (step): {} {:?}", s, start.to(PreciseTime::now()));
}

Replacing step() with the unstable step_by() gives a 100 fold improvement in execution time, as can be observed on the playground here.

0.4.0 checklist / upcoming breaking changes

ticks are for my WIP

  • Remove deprecated items (icompr, fn_map, etc.)
  • Mark Slice trait unsafe
  • Rename MergeAscend to MergeFn (?)
  • Generalize .format to debug / display(?)
  • Merge / MergeBy without fn() #49
  • Dedup without fn() #49
  • Unique without fn() #49
  • MendSlices without fn() #49
  • Rename sort_by โ†’ sorted_by
  • Remove times()
  • interleave_shortest not fused

Crate Issue for Itertools 0.4.11 for Rust 1.0 & 1.1

Running travis builds here and seem to see an issue for itertools. Is there some breaking change that prevents 1.0 & 1.1 from working with the new version? It seems to work fine for rustc > 1.1

 Downloading itertools v0.4.11
Unable to get packages from source
Caused by:
  failed to parse manifest at `/home/travis/.cargo/registry/src/github.com-1ecc6299db9ec823/itertools-0.4.11/Cargo.toml`
Caused by:
  expected a value of type `string` for the key `lib.name`

Document fusing and other iterator protocol details

  • Adaptor is fused: guaranteed to return None indefinitely after end of iterator.
  • Adaptor is well behaved: Does not call .next() on non-fused iterator after it has returned None once.

All of our adaptors must be well behaved, some guarantee being fused.

with_value and into_parts for PushBackN

Would be nice to have these implemented for PushBackN:

    /// Create a `PutBackN` along with the `value` to put back.
    #[inline]
    pub fn with_value(value: Vec<I::Item>, it: I) -> Self {
        PutBackN {
            top: value,
            iter: it,
        }
    }

    /// Split the `PutBackN` into its parts.
    #[inline]
    pub fn into_parts(self) -> (Vec<I::Item>, I) {
        let PutBackN{top, iter} = self;
        (top, iter)
    }

Why the extra impl for ZipSlices?

I was looking at the ZipSlices impl and noticed that you have the signatures impl<'a, 'b, A, B> ZipSlices<&'a [A], &'b [B]> and impl<T, U> ZipSlices<T, U> where T: Slice, U: Slice. The latter seems a lot more general, and I was wondering why the former is still in the library? Are there some non-trivial reasons for this?

'sorted()' method

It would be useful to have sorted() method for regular sorting that calls sorted_by(Ordering::cmp) internally. What do you think?

Enhancement: Coalesce-style functions

I've used the following functions in two completely different unrelated projects already. They basically take an Iterator<Result> (or Iterator<Option>) and return the first Ok Result or the last Err.

trait RetryIterator: Iterator {
    fn retry_results<T, E>(&mut self) -> std::result::Result<T, E> where Self: Iterator<Item=std::result::Result<T, E>> {
        let mut last=None;
        loop {
            let cur=self.next();
            match cur {
                Some(v @ Ok(_)) => { return v; },
                Some(e @ Err(_)) => { last=Some(e); },
                None => if let Some(e) = last { return e; },
            }
        }
    }

    fn retry_options<T>(&mut self) -> Option<T> where Self: Iterator<Item=Option<T>> {
        let mut last=None;
        loop {
            let cur=self.next();
            match cur {
                Some(v @ Some(_)) => { return v; },
                Some(e @ None) => { last=Some(e); },
                None => if let Some(e) = last { return e; },
            }
        }
    }
}

impl<T: Iterator> RetryIterator for T {}

I'm happy to turn this into PR but first wanted to discuss what would be the best way to implement this.

Kmerge fails

This simple test crashes itertools:

#[test]
fn kmerge_reverse() {
    let its = vec![3, 2, 1, 0].into_iter().map(|s| (s..10).step(4));
    it::assert_equal(its.kmerge(), (0..10));
}

The error is:

        thread 'kmerge_reverse' panicked at 'Failed assertion Some(1) == Some(0) for iteration 0', src/lib.rs:1585

If I print the sequences I'm getting the following:

---- kmerge_reverse stdout ----
        [[3, 7], [2, 6], [1, 5, 9], [0, 4, 8]]

So nothing wrong with the sequences.

I also have an issue with getting the wrong (unsorted) data. But I have not found a simple example to reproduce.

Breakage with recent nightly

   Compiling itertools v0.4.13
/Users/alexbool/.cargo/registry/src/github.com-1ecc6299db9ec823/itertools-0.4.13/src/free.rs:218:1: 224:2 error: type mismatch resolving `<<J as std::iter::IntoIterator>::IntoIter as std::iter::Iterator>::Item == <I as std::iter::IntoIterator>::Item`:
 expected type parameter,
    found associated type [E0271]
/Users/alexbool/.cargo/registry/src/github.com-1ecc6299db9ec823/itertools-0.4.13/src/free.rs:218 pub fn merge<I, J>(i: I, j: J) -> Merge<I::IntoIter, J::IntoIter>
                                                                                                 ^
/Users/alexbool/.cargo/registry/src/github.com-1ecc6299db9ec823/itertools-0.4.13/src/free.rs:218:1: 224:2 help: run `rustc --explain E0271` to see a detailed explanation
/Users/alexbool/.cargo/registry/src/github.com-1ecc6299db9ec823/itertools-0.4.13/src/free.rs:218:1: 224:2 note: required by `adaptors::Merge`
error: aborting due to previous error
Build failed, waiting for other jobs to finish...
error: Could not compile `itertools`.

alexbool@alexbool-osx ~/D/I/i/r/portfolio> rustc -vV
rustc 1.10.0-nightly (22ac88f1a 2016-05-11)
binary: rustc
commit-hash: 22ac88f1a47a82195a49fbff3cf24a2c395d7a81
commit-date: 2016-05-11
host: x86_64-apple-darwin
release: 1.10.0-nightly

starmap

It'd be nice to have a starmap method. The main aim is to skip having to write closures after things like zip. If you're not familiar with it from Python, a starmap function unpacks the tuples in the input iterable into the function being mapped.

A simple but unstable implementation is

fn starmap<F>(self, f: F) -> iter::Map<Self, FnZipper<F>>
    where Self: Sized, F: Fn<Self::Item>
{
    self.map(FnZipper(f))
}

where

pub struct FnZipper<F>(F);

impl<F, Args> FnOnce<(Args,)> for FnZipper<F>
    where F: FnOnce<Args>
{
    type Output = F::Output;
    extern "rust-call" fn call_once(self, (args,): (Args,)) -> Self::Output {
        self.0.call_once(args)
    }
}

impl<F, Args> FnMut<(Args,)> for FnZipper<F>
    where F: FnMut<Args>
{
    extern "rust-call" fn call_mut(&mut self, (args,): (Args,)) -> Self::Output {
        self.0.call_mut(args)
    }
}

// For completeness
impl<F, Args> Fn<(Args,)> for FnZipper<F>
    where F: Fn<Args>
{
    extern "rust-call" fn call(&self, (args,): (Args,)) -> Self::Output {
        self.0.call(args)
    }
}

A stable implementation would probably requires macros to handle all the wanted cases, and macros would also let us also support more sophisticated flattening like for

xs.zip(ys).zip(zs).starmap(|x, y, z| ...)

Thoughts?

Iterator adapter like std::iter::Inspect, but which passes a mutable reference to the closure

I don't know how common of a use-case this is, but I can see some utility in an iterator adapter which can mutate items via a closure before yielding them. I was going to open this as an issue on rust-lang/rfcs but I figured it would be better to incubate the idea here first.

Basically a more succinct version of this:

my_iter.map(|mut val| {
    val.mutate();
    val
})

As for what to call it, I'm not sure. Perhaps as a parallel to .inspect() but implying the possibility of change. Here's some of my ideas so far:

  • .mutate()
  • .edit() (as a reference to sed)
  • .update()
  • .probe()
  • .modify()
  • .tweak()

fold_options?

Would it make sense to add an Option<T> equivalent to fold_results?

Integrate with either

Related to https://internals.rust-lang.org/t/where-would-a-join-that-writes-to-some-impl-write-go/3521/8 and rust-lang/rust#34143.

It would be nice if itertools integrated with either to provide:

  • left, right -- Return iterators mapping over Either::Left/Either::Right.
  • interleave_either, interleave_either_shortest, intersperse_either, and chain_either -- Same as the existing adapters but these would work with distinct types.

If the above rust PR were merged, it would be possible to write: some_string.extend(items.intersperse_either(", ")).

Make Stride use RandomAccessIterators

As it is now, the Stride struct only operates on slices. We can instead make the Stride struct use any iterator that implements the RandomAccessIterator trait. The benefits of doing this are that we can stride over more than just slices, the Stride struct should be simpler, and we won't need to use unsafe blocks.

I thought there might be a performance decrease in switching to RandomAccessIterator-based strides, especially when composing strides. I implemented the RandomAccessIterator-based strides, and there is indeed a performance drop when compiling without optimization. However, when compiling with optimization (level 2), the RandomAccessIterator-based stride actually performs better than the slice-based stride, often by at least a factor of 2. I have the tests at awelkie/rust-itertools on the "stride_performance" branch.

Would you mind checking my performance comparison? If the RandomAccessIterator-based strides are indeed faster with optimization, and you think it's a good idea, then I would be happy to re-implement the Stride struct.

Add Windows-like Iterator that returns static-sized tuple elements instead of slices.

The Windows iterator in the std lib returns its values as slices of elements.
This makes it unpleasant to work with since pattern matching and borrowing of slices is more restricted than (for example) for tuples.

Besides that a tuple-based Windows iterator should not involve runtime overhead and should make it possible to make compile time assumptions about the correctness of the implementation. For example, it can be checked at compiletime if accesses to iterator elements are "out-of-bounds".

E.g.:

    let primes = &[2, 3, 5, 7, 11, 13, 17, ...];
    for (fst, snd) in primes.iter().windows(2) {
        println!("({}, {})", fst, snd);
    }

Investigate .merge()

@frankmcsherry reported a big runtime improvement by switching from .merge(j) to .merge_by(j, Ord::cmp), with merge::wrapper showing up in the profile.

  • .partial_ord() call being harmful?
  • function pointer vs inlined closure issue?

(Profile says: It's both)

Add a generic Chain iterator

It would be nice to have a Chain iterator in this library that could chain several iterators together instead of just too as it is supported in the std lib at the moment.
This should behave similar to the currently implemented Zip::new((..)).

Simple-silly example code:

let v = vec![4,5,6];
let words: Vec<String> = vec!["Hello", "World", "!"];
for x in Chain::new((&[1,2,3], v.iter(), words.iter().map(|word| word.len()))) {
    println!("{}", x);
}

Important would be a possibility to also create a way to chain a bunch of iterators over mutable elements to obtain yet another iterator over mutable data.

A (possibly buggy) example code:

let (ax, bx, cx) = (&[1,2,3], &[4,5,6], &[7,8,9]);
for mut x in Chain::new((ax.iter_mut(), bx.iter_mut(), cx.iter_mut())) {
    x = x*x + x;
}

Make a Product adapter for the Itertools trait

In addition to iproduct!(iter_a, iter_b), Itertools should have something like iter_a.times(iter_b) or iter_a.cartesian_product(iter_b). This would let you take the product of iterators inline, functional style.

Maybe a method like this:

pub trait Itertools: Iterator {
    ...
    fn cartesian_product<U: Iterator>(self, other: U) -> Product<Item, Self, U> 

This would allow you to do much better inline operations, like:

oranges.iter().map(apple_for_orange).filter(is_granny_smith).cartesian_product(
    guests.iter().map(favorite_desert)
).map(more_nonsense).etc()

This shouldn't require higher-kinded types either.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.