Comments (17)
Thanks, I'm glad to hear that BiFunction
is at least an option. You have found a clever way to show how much more complex that makes the signature, though ;) (I do seriously think that the complexity may explain the difference between Collectors.toMap
, which has complex signature that we don't want to become yet more complex (so it uses BinaryOperator
), and Map.merge
, which is relatively straightforward (so it uses BiFunction
). But I'm speculating.
I wonder if the sweet spot might be BiFunction<V, V, @Nullable V>
: We don't do the PECS wildcards that explode the length of the type (the way that BiFunction<? super V, ? super V, ? extends @Nullable V>
would), but we have enough distinct types that we can put @Nullable
where we want it.
Still, this matters relatively rarely, so it's not necessarily worth going more complex than BinaryOperator
.
And I can give some data about how rarely! I ran some tests overnight, and I have a couple different things to report:
- My proposal with
@NonNull
from my earlier post is complex enough that it causes problems...- ...for our Checker Framework users. The problem appears to be only in type inference, which is an area that the Checker Framework owners have been actively working on, so it might go away in time.
- ...for some Kotlin experiments that we're doing. The problem there might also be only type inference, but I'm not going to look into it any more deeply because the Checker Framework issue is already a blocker for doing it now.
- We do have several different users who rely on being able to return
null
from themergeFunction
oftoImmutableMap
. (I don't think we have any tests of our own, though... :)) Most (maybe all?) of them fit into the category of "If we see any duplicates, then drop the key altogether," which I will note does not actually work if there might be 3, 5, 7, etc. copies of a key, IIUC :) So the functionality gets used, which is all the more reason to document and test it.
So I think my plan now is:
- Document the null handling.
- Test the null handling.
- Maybe change the nullness annotations someday if the Checker Framework issue gets fixed.
- Maybe or maybe not agonize over
BinaryOperator
vs.BiFunction
if we follow up on #6822
from guava.
Thanks also for catching toImmutableTable. I dug up the review, which I had forgotten about. It turns out that the author and I both complained about the JDK's null behavior, with the author even citing the "if you had three duplicates you'd be back to discarding two and putting in one" behavior that I "discovered" above... :)
If nulls are used to eliminate duplicates then yes, it is going to be error prone. However, I can imagine some exotic use cases, involving toImmutableTable
but also toImmutableRangeMap
, which we all need desperately ;), where you may want to use nulls as a way to avoid storing neutral elements in some collection / table / map (as being uninteresting or for whatever other reason). Merging -x
and x
could then result in null instead of 0
, symmetric difference of two identical sets could result in null instead of empty set and so on. I know that this can be accomplished with post-processing (e.g. streaming, filtering out "zeroes", recollecting), and that in general it would be less error-prone method than the one involving nulls, but we are where we are.
That still gives us the option to change toImmutableTable, presumably to match toTable. That said, consistency might be less important there: I'd expect users to less commonly migrate from toTable to toImmutableTable than from toMap to toImmutableMap, since toImmutableTable has been available just as long as toTable.
I would love to see consistency between toImmutableTable
and toTable
. Firstly, it may be useful to some. Furthermore, consistency may be "less important", but it is still important, I hope :) After all, it may be the only quality that that led me - and consequently you - to this issue.
from guava.
Nice catch, thanks! As you say:
in case of Guava, having "just one"
V
forces both parameters and return type to be annotated together
That does suggest that your proposed ImmutableRangeMap.toImmutableRangeMap
overload would benefit from using BiFunction
, following the Map.merge
precedent you cited instead of the Collectors.toMap
precedent. (Maybe there was an argument for having toImmutableMap
follow Collectors.toMap
so as to ease migration between the two, but that argument wouldn't apply for ImmutableRangeMap
, which has no JDK equivalent.)
For ImmutableMap.toImmutableMap
, we can't change the actual type used at this point. We could maybe change to something like:
public static <T extends @Nullable Object, K, V extends @Nullable Object>
Collector<T, ?, ImmutableMap<K, @NonNull V>> toImmutableMap(
Function<? super T, ? extends K> keyFunction,
Function<? super T, ? extends @NonNull V> valueFunction,
BinaryOperator<V> mergeFunction) {
That would let existing users stay unchanged (I hope! I'd want to test to see what happens in practice) while allowing anyone who wants to return null
from mergeFunction
to do so—at the cost of having the inputs to mergeFunction
also change to have nullable types, which can be inconvenient (but is often fine, like if your merge function is (first, second) -> second
).
I think that would at least be preferable to the alternative of...
public static <T extends @Nullable Object, K, V>
Collector<T, ?, ImmutableMap<K, V>> toImmutableMap(
Function<? super T, ? extends K> keyFunction,
Function<? super T, ? extends V> valueFunction,
BinaryOperator<@Nullable V> mergeFunction) {
...which would foist the nullable mergeFunction
inputs onto everyone.
Even if we don't change the types, we should document the behavior.
(I've been assuming in all this that the current behavior is what we want. I think it probably is, if only to ease migration from Collectors.toMap
. Maybe someone else on the team remembers specific discussions of this.)
Thought on any of that?
from guava.
That does suggest that your proposed
ImmutableRangeMap.toImmutableRangeMap
overload would benefit from usingBiFunction
, following theMap.merge
precedent you cited instead of theCollectors.toMap
precedent. (Maybe there was an argument for havingtoImmutableMap
followCollectors.toMap
so as to ease migration between the two, but that argument wouldn't apply forImmutableRangeMap
, which has no JDK equivalent.)
It needs to be BinaryOperator
. Not only there's this pleasant analogy between Map.merge
<-> Collectors.toMap
vs RangeMap.merge
<-> ImmutableRangeMap.toImmutableRangeMap
, BiFunction
<-> BinaryOperator
in both cases. What's more important, there are some some serious wildcard-related issues with signatures like...
Function<? super T, Range<K>> keyFunction,
Function<? super T, ? extends V> valueFunction,
BiFunction<? super V, ? super V, ? super V> mergeFunction
It seems that BinaryOperator<V>
is a neccessity.
(EDIT: See next comment.)
Even if we don't change the types, we should document the behavior.
Please do!
Thought on any of that?
I guess that's all :-)
from guava.
I've made stupid mistake. mergeFunction
can and possibly should be declared like...
Function<? super T, Range<K>> keyFunction,
Function<? super T, ? extends V> valueFunction,
BiFunction<? super V, ? super V, ? extends V> mergeFunction
... which is fine and causes no compilation errors (which previously were caused by using super
instead of extends
in the last type param). Even so, I think I would still prefer to use BinaryOperator<V>
because of similarity to Collectors.toMap
. Thanks for the hint anyway!
from guava.
I have a change out for review for those first two bullets.
I'm going to generalize this issue to also cover potentially changing the nullness annotations at some future point.
Thanks again for the report.
from guava.
Unfortunately there are still some problems I missed before (sorry for that!). You will likely have to generalize at lest once more.
ImmutableSortedMap.toImmutableSortedMap
and Maps.toImmutableEnumMap
need merge/null-related documentation and tests too. This seems to be unquestionable. But there's more.
Tables.toTable
and ImmutableTable.toImmutableTable
are already documented and tested (that's good news), but ImmutableTable.toImmutableTable
behave differently than, say, ImmutableMap.toImmutableMap
(Tables.toTable
accepts nulls from mergeFunction
, ImmutableTable.toImmutableTable
does not). This may be an argument for changing ImmutableMap.toImmutableMap
and other toImmutableXYZ
collectors to align with ImmutableTable.toImmutableTable
(current behavior was unspecified so there may be still room for change, even if breaking for some; EDIT: this would make part of the annotation-related problems go away too!). WDYT?
from guava.
Here's current, undocumented behavior.
@Test
void toImmutableSortedMap() {
ImmutableSortedMap<Integer, Integer> map = Stream.of(0, 0).collect(ImmutableSortedMap
.toImmutableSortedMap(Comparator.<Integer>naturalOrder(), e -> e, e -> e, (a, b) -> null));
assertThat(map).isEmpty();
}
@Test
void toImmutableEnumMap() {
ImmutableMap<RoundingMode, RoundingMode> map = Stream
.of(RoundingMode.UP, RoundingMode.UP, RoundingMode.DOWN, RoundingMode.DOWN)
.collect(Maps.toImmutableEnumMap(e -> e, e -> e, (a, b) -> null));
assertThat(map).isEmpty();
}
from guava.
That's a tad embarrassing, given that I've been actively working on the various immutable-collection Collector
APIs... :) I think toImmutableBiMap
even popped into my head at one point, but I reassured myself that it doesn't have a mergeFunction
overload. Clearly my brain was determined not to see the remaining work, so we needed someone else to do so. Thanks.
Thanks also for catching toImmutableTable
. I dug up the review, which I had forgotten about. It turns out that the author and I both complained about the JDK's null
behavior, with the author even citing the "if you had three duplicates you'd be back to discarding two and putting in one" behavior that I "discovered" above... :)
I'm inclined to think that a change to toImmutableMap
and friends is more dangerous than we'd like now. And the current behavior might even be the least bad choice if we were designing the method today, given the JDK precedent (and given that users might like to be able to s/toMap/toImmutableMap/g
without fear, assuming that they don't use null keys).
That still gives us the option to change toImmutableTable
, presumably to match toTable
. That said, consistency might be less important there: I'd expect users to less commonly migrate from toTable
to toImmutableTable
than from toMap
to toImmutableMap
, since toImmutableTable
has been available just as long as toTable
.
And that brings us to the fact that we were reviewing toImmutableTable
and toTable
on the same day, discussed (and documented and tested, as you saw) their null-handling behavior, and still chose different policies. Hmm.
I think it would be a step forward to test and document the current ImmutableSortedMap.toImmutableSortedMap
and Maps.toImmutableEnumMap
behavior. Beyond that... hmm.
from guava.
That still gives us the option to change
toImmutableTable
, presumably to matchtoTable
.
It's also about consistency between toImmutableTable
and toImmutableMap
...
from guava.
True, I have focused entirely on the "remove duplicates" use case, neglecting legitimate use cases like the one in the new test that we'll be releasing of these days:
toImmutableMap(
Map.Entry::getKey,
Map.Entry::getValue,
(a, b) -> {
int result = a + b;
return result == 0 ? null : result;
});
In fairness, I have done that because "remove duplicates" was the only use case I encountered in practice with toImmutableMap
in our codebase. And while I don't want to say we'd be "doing people a favor" by breaking their code (which does work if there are never more than two copies), especially if the alternative is to make their code throw but only in case of duplicates, I also don't feel confident that the null->remove behavior is necessarily the one that's going to lead to more resilient programs.
But consistency is still a good argument. And while there's always danger in changing behavior, the danger is minimal when the old behavior was "throw an exception."
from guava.
I also don't feel confident that the null->remove behavior is necessarily the one that's going to lead to more resilient programs.
Sadly, it probably will lead to less resilient programs. It's almost too tricky to use. Note that even in the example you gave in your last post, things may go wrong, depending on what happens before. It's easy to assume that the resulting map won't contain 0 values, but that's guaranteed only if there are no mappings to 0 in the collected entries. This fact is easy to miss when writing code and even easier to miss when reading.
Maybe it would be best to have @NonNull V
since the beginning (it's too late now), but I still think the next best option is to stick to consistency (although it's getting harder and harder to believe ;)).
from guava.
It's easy to assume that the resulting map won't contain 0 values, but that's guaranteed only if there are no mappings to 0 in the collected entries.
Yikes. That does sour me a little more on the idea.
On the other side: One other lesson that I could stand to remind myself of is that, as much as we hate for our libraries to promote fragile coding practices (like "removing duplicates" that works only with an even number of copies), we've gone too far the other direction at least once, leading to (e.g.) crashes in Android apps because we threw an exception.
I'm thinking here of ImmutableMap
, whose build()
method rejects duplicate keys. (That's surprisingly similar to what we're discussing here.) It's a very principled thing to do: There's a whole class of security bugs that arises when one part of a system keeps the first entry for a key and another part keeps the last. And there are cases in which a duplicate indicates some kind of problem. Maybe rejecting duplicates was a good choice for stateless server apps. But we kept hearing of cases in which it crashed apps or long-running pipelines or other stateful processes. That led us to introduce buildKeepingLast()
.
(Not that I'm advocating for having two methods here. Without a lot more users who might benefit, I don't see enough justification for a toImmutableMapDroppingDuplicates
.)
If only we had universal nullness checking, in which case we could have made compilers reject any attempt to add null
from the beginning!
[edit: It might be interesting to see whether the handful of "duplicate-removing" users I found inserted the duplicate-removal logic after seeing an exception in prod. It might also be interesting to see whether other people have seen the exception in prod, filed bugs, and resolved them in some other way. Maybe we're breaking people's apps when they really just want deduplication, or maybe we're catching real issues.]
from guava.
(It turns out that we have a toImmutableMapIgnoringDuplicateEntries
inside Google. (It's not in ImmutableMap
itself but in a sort of "contrib
" package.) It's a bit different than the use case above: It permits duplicate key-value entries (and includes them (well, one of them, but it shouldn't matter which :)) in the map), but if it sees a key with two different values, it throws an exception. It has more usages than I would have guessed—definitely more than the handful I saw who were attempting removal of all duplicate keys.)
from guava.
@cpovirk You seem to be hoping to publish a release soon. I'd like to ask if you are absolutely sure that toImmutabeMap
null handling have to be different than toImmutableTable
? If not, then maybe it would be wise to undo #6826 and wait until there are no doubts? BTW what about toImmutableSortedMap
/ toImmutableEnumMap
? It seems that they are still untested and undocumented.
You probably have more important things to do and I understand that. I just wanted to remind you that publishing a release including #6826 will make it's changes permanent (behavior will change from unspecified to guaranteed).
from guava.
Thanks. I still find everything about this kind of gross... :\
Mainly, though, I think that we have clear evidence that some users depend on the until-recently undocumented behavior. A change to that behavior is probably more likely to ruin someone's day than to improve it. If anything, I could see reconsidering toImmutableTable
, since we could hope that no one depends on the NullPointerException
there.... Oh, and maybe we'll see fit to change up the toImmutableMap
nullness annotations someday, too.
from guava.
Looking at this again after a teammate flagged it in a followup to #7077, I think the problem with the Kotlin experiment is that I'm changing ImmutableMap.toImmutableMap
but not changing the signature that we give Kotlin for Collectors.toMap
(which toImmutableMap
calls). (Or at least that's part of the problem.)
Currently, the blocker to doing anything here remains that we're on an older version of the Checker Framework with type-inference problems. If that problem goes away, then I should try again with a corresponding change to the toMap
signature used by Kotlin.
from guava.
Related Issues (20)
- Is `com.google.guava:listenablefuture:9999.0-empty-to-avoid-conflict-with-guava` really necessary? HOT 2
- SmoothRateLimiter.SmoothBursty obtaining too many permits without failure HOT 1
- Certain versions of Guava on Maven central no longer working on Java 11 HOT 2
- guava-gwt should publish Gradle module metadata or somehow fix POM HOT 2
- java.lang.NoSuchMethodError: 'void com.google.common.base.Preconditions.checkState(boolean, java.lang.String, long) HOT 1
- `Suppliers.memoize()` thread pinning HOT 6
- Copying a filtered collection with ImmutableSet.copyOf() should require visiting each element only once HOT 1
- com.google.common.collect.Range#hasLowerBound determination error after deserialization HOT 7
- InternetDomainName.topPrivateDomain() throws exception for Amazonaws-Domain HOT 4
- Use Gradle module available-at tag instead of files for redirecting jre vs android consumers HOT 5
- Failing while building guava version v31.0.1 using mvn clean install HOT 3
- Make `Striped.custom` public HOT 3
- Improve `guava-android` Animal Sniffer compatibility testing for Java 8+ APIs HOT 2
- 2024年, 一元机场是不是跑路了? HOT 1
- Add Duration support for CacheBuilder.expireAfterWrite like in other guava projects HOT 4
- class file for com.google.errorprone.annotations.CompatibleWith not found HOT 3
- Use GitHub Actions for most release automation HOT 1
- RateLimiter.tryAcquire blocks when it should not HOT 1
- Implement of RateLimiter.setRate can be simplified
- The PSL has been not been updated for over 3 months HOT 4
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from guava.