Code Monkey home page Code Monkey logo

Comments (25)

tibbe avatar tibbe commented on July 16, 2024

I wont have time to look into this any time soon, but patches are welcome. :)

from containers.

foxik avatar foxik commented on July 16, 2024

I second that. Patches are welcome :-)

from containers.

hvr avatar hvr commented on July 16, 2024

What's the story with all those #if __GLASGOW_HASKELL__ >= ... ? Is that preferred over #if MIN_VERSION_BASE(...) (although I saw that as well)?

from containers.

tibbe avatar tibbe commented on July 16, 2024

I think MIN_VERSION_base is more correct.

from containers.

foxik avatar foxik commented on July 16, 2024

The members of Foldable class are probably dependent on base version, not on GHC version (even in little people use different base versions) -- would it be possible to change the __GLASGOW_HASKELL__ to __MIN_VERSION_base__? We use __MIN_VERSION_base__ for example in IntSet/Base.hs, in the following way:

-- We want to be able to compile without cabal. Nevertheless
-- #if defined(MIN_VERSION_base) && MIN_VERSION_base(4,5,0)
-- does not work, because if MIN_VERSION_base is undefined,
-- the last condition is syntactically wrong.
#define MIN_VERSION_base_4_5_0 0
#ifdef MIN_VERSION_base
#if MIN_VERSION_base(4,5,0)
#undef MIN_VERSION_base_4_5_0
#define MIN_VERSION_base_4_5_0 1
#endif
#endif

Also, could you add {-# INLINE #-} pragma for the new class methods? We have it for the other Foldable methods so that when the Foldable dictionary is unknown, the method is replaced at compile time.

from containers.

hvr avatar hvr commented on July 16, 2024

I can do the __MIN_VERSION_base__ change, but that would effectively mean adding significant amount of CPP boilerplate in additional modules for the base-4.6 and base-4.8 boundaries. Can't we just use a commonly included .h file, and then simply just define a MIN_VERSION_base(x,y,z) macro manually there?

To be more specific:

#ifndef MIN_VERSION_base
#include "../../macros.h"
#endif

whereas macros.h would construct a proper MIN_VERSION_base(a,b,c) macro based on the __GLASGOW_HASKELL__ value.


As for the INLINE, I'm happy to do that if it's useful, but I'm totally confused about

We have it for the other Foldable methods so that when the Foldable dictionary is unknown, the method is replaced at compile time.

Did you really mean "unknown"? Why doesn't GHC automatically inline such a trivial function? can you give a simple code example where I can see the effect you're describing?

from containers.

foxik avatar foxik commented on July 16, 2024

Can't we just use a commonly included .h file, and then simply just define a MIN_VERSION_base(x,y,z) macro manually there?

Yes, that would probably be best. The base_version.h would look something like

#ifndef MIN_VERSION_base
#ifdef __GLASGOW_HASKELL__
... try defining MIN_VERSION_base according to __GLASGOW_HASKELL__
#else
... define MIN_VERSION_base to be 0 for every version (or maybe return 1 for minimal base version specified in cabal)
#endif

And then do #include base_version.h every needed module (no need to include it conditionally). BTW, we have already include-dir: include in the cabal file.

(When you are at it, could you please also use this file in IntSet/Base and remove the MIN_VERSION_base definitions from that file? Thanks.)

Did you really mean "unknown"? Why doesn't GHC automatically inline such a trivial function? can you give a simple code example where I can see the effect you're describing?

Sorry, I was rewriting the sentence and left the negation there. What I really ment was known. I understand that GHC may inline it automatically but I would still prefer to explicitly spell out the {-# INLINE #-} pragma -- one of the issues I am not sure about is what will happen if the right side is itself {-# INLINE #-} (for example foldl or foldl') -- then it may look like non-trivial function, but we still want to inline it.

from containers.

tibbe avatar tibbe commented on July 16, 2024

Is it really worth being able to compile without cabal? If you really want to you can use

ghc -O2 Data/Map -optP-include -optPdist/build/autogen/cabal_macros.h

and only run cabal once to generate that file.

(Although, if this makes the containers package not work with the Emacs haskell-mode and/or ghc-mod, that would be a shame.)

from containers.

hvr avatar hvr commented on July 16, 2024

@tibbe, Fwiw, ghc-mod mentions explicit support for cabal_macros.h in its changelog, and haskell-mode does support Cabal via cabal repl

from containers.

tibbe avatar tibbe commented on July 16, 2024

I'll leave it up to @foxik.

from containers.

foxik avatar foxik commented on July 16, 2024

I frequently load the containers files directly to ghci, so I would be happy if I could continue doing that.

But maybe it would be enough just to add

#ifndef MIN_VERSION_base
#define MIN_VERSION_base(x,y,z) 0
#endif

which I can do myself.

So just use MIN_VERSION_base(x,y,z) please and I will add and comment these no-min-version-base-what-should-I-do-oh-my-oh-my defaults.

from containers.

hvr avatar hvr commented on July 16, 2024

ignore that updated commit, it's still missing the INLINEs

from containers.

foxik avatar foxik commented on July 16, 2024

It is a shame we cannot implement logarithmic elem for Set. Nevertheless, is there any reason why we could not implement minimum and maximum for Map and IntMap?

Also, could we implement sum and product using foldl' (instead of foldMap which is used in the default definition)?

Moreover, even if we cannot implement logarithmic elem, maybe we could implement optimized linear version for Set.

Thoughts?

from containers.

hvr avatar hvr commented on July 16, 2024
  • As for elem on Set: we'd need to capture the Ord dict. It's possible but I'm not sure if it's worth it.
  • minimum/maximum on Map/IntMap does not refer to the keys but rather the values
  • sum/product: I think you're right; we should use foldl'/foldr' instead (I wonder why the default impl uses foldMap

from containers.

ekmett avatar ekmett commented on July 16, 2024

@hvr: The default definition is set up to maximally avoid bottoms. You have to fear infinite-snoc-list-like structures when infinitely left associating trees. You don't have that problem here, so going foldl or foldr works fine -- all maps are finite.

Strictifying sum is one of those design decisions that putting it in the class leaves open to implementors.

Changing sum to be strict by default everywhere would be a big jump and ran into resistance last time it was suggested as I recall, but folks can chip away at it a project at a time. ;)

@foxik

As a thought exercise, if not a practical approach, there /is/ an option to get a log time elem for Set, but you'd probably have to make every other operation pay a tax.

Basically what you'd need to do is differentiate the root of the Set, so we can know it is either

data Set a = Empty | NonEmpty (Set' a)

And in the NonEmpty case you'd store

data Set a = Empty | NonEmpty (a -> a -> Ordering) (Set' a)

or

data Set a where
Empty :: Set a
NonEmpty :: Ordering a => Set' a -> Set a

and have the existing Set (probably with the Empty case stripped out) moved to Set'. The problem is removing the Empty case as i recall might require you to change the way some of the tree algorithms fold back up, and probably incurs a non-trivial cost.

Once you have that it is actually possible to do the elem search in log time, because you have access to the dictionary.

The cost is constant factors everywhere and that a couple of mapMonotone like functions need to pick up a previously missing Ord constraint.

from containers.

treeowl avatar treeowl commented on July 16, 2024

This does not look like a clean approach. I missed the missing-Ord complication when I suggested adding elem to the Foldable class. I think that was an error on my part. elem should probably just be a function (the cases where it can be optimized cleanly seem ... unlikely). What the Foldable class should really have are methods elemOrd :: Ord a => a -> f a -> Bool and (perhaps, with difficulty) elemHash :: Hashable a => a -> f a -> Bool.

from containers.

hvr avatar hvr commented on July 16, 2024

@treeowl

I don't think that having elem<Class> :: Class a => a -> f a -> Bool for each class(combination) would scale. Also it would leak implementation details making it more difficult to write generic code, as you can't know whether elemHash or elemOrd will be more efficient.

from containers.

hvr avatar hvr commented on July 16, 2024

@ekmett I assume you pull the Empty out in

data Set a where
  Empty :: Set a
  NonEmpty :: Ordering a => Set' a -> Set a

in order to accommodate the

empty :: Set a

constructor w/o having to add an Ord a => constraint there. But what about the existing

singleton :: a -> Set a

constructor?

from containers.

ekmett avatar ekmett commented on July 16, 2024

Ah. In the version of this I built I had that one separated as well. That is what I get for doing this from memory. =)

from containers.

foxik avatar foxik commented on July 16, 2024

This is a general issue -- for all containers, we pass the dictionary to every function, instead of storing it in the structure. If we stored it in the structure, it would also allow us to use user-specific instance (i.e., user specified hash or reversed ordering for toList return decreasing list).

We might face this issue again in the future when we have Modifiable or Insertible or similar.

Nevertheless, if we changed it for Set, we should maybe change it for all containers, which could result in quite different API :-)

BTW, I am thinking for some time that we should separate the singleton case for Set and Map. I have some benchmarks showing both memory reduction (20-30%) and speedup (5% for most operations, 10% for folds). But the code gets more complicated, especially in union-like methods and in rebalancing, where we decompose two trees, so we have up to 9 instead of 4 cases.

BTW2, we could create specialized versions of Foldable.elem which would kick in if the Foldable dictionary is known. I am not that familiar with rules and specialization, so I am not sure if it is possible to say for everything with known Ord instance do this or if we had to explicitly spell out basic types like Int etc.

@hvr Oh, thanks for the minimum/maximum, I did not realize it. I believe we should create specialized versions using foldl' for all these methods (including elem), but I can do it myself if you want.

from containers.

tibbe avatar tibbe commented on July 16, 2024

I don't think we should worry so much about elem. It's a very uninteresting method in Foldable as it cannot be guaranteed to be better than O(n). We ought to have a FiniteMap or similar class which contains an elem that can be constrained by some class constraint (using constraint kinds) and have map-like types implement that type class.

Nevertheless, if we changed it for Set, we should maybe change it for all containers, which could result in quite different API :-)

Also in quite different performance as GHC will no longer be able to do call-site specialization (i.e. INLINABLE specialization.)

from containers.

hvr avatar hvr commented on July 16, 2024

@foxik

Oh, thanks for the minimum/maximum, I did not realize it. I believe we should create specialized versions using foldl' for all these methods (including elem), but I can do it myself if you want.

That'd be great; ...so are you taking it from here?

from containers.

foxik avatar foxik commented on July 16, 2024

@hvr Yes, I am. But please do not remove your commit until I merge it and close this issue :-)

from containers.

treeowl avatar treeowl commented on July 16, 2024

I got myself confused last night. Now I realize that the problem implementing elem for Set is entirely the fault of Set (well, the fault of Haskell 98, really, but in a GHC context, Set), and Set should bear the blame for that problem. In the GHC world, it would be much better to give Set a more constrained type with a GADT.

from containers.

foxik avatar foxik commented on July 16, 2024

Done. I merged the original proposal and added specializations to other base-4.8 foldable methods. The elem could be probably implemented using Foldable.fold, but I spelled out the specialization.

from containers.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.