Comments (17)

natemcmaster avatar natemcmaster commented on June 16, 2024 1

Resolved in 41553a4

from commandlineutils.

natemcmaster avatar natemcmaster commented on June 16, 2024

Labeling as help-wanted. Let me know if you want to tackle it. Let's create a design first.

MadbHatter avatar MadbHatter commented on June 16, 2024

I would like to tackle this, but I am not sure how.
Maybe add a Property like app.CommandSuggestion?
And edit the behaviour of HandleUnexpectedArg and ShowHint?

e.g. instead of:

if (_currentCommand.ThrowOnUnexpectedArgument)
{
    _currentCommand.ShowHint();
    throw new CommandParsingException(_currentCommand, $"Unrecognized {argTypeName} '{_enumerator.Current.Raw}'");
}

something like:

if (_currentCommand.ThrowOnUnexpectedArgument)
{
    if (_currentCommand.CommandSuggestion)
    {
        // Calculate the Levenshtein distance and, if it is close enough, display a suggestion?
        _currentCommand.ShowHint(suggestion: true);
    }
    else
    {
        _currentCommand.ShowHint();
    }
    throw new CommandParsingException(_currentCommand, $"Unrecognized {argTypeName} '{_enumerator.Current.Raw}'");
}

I guess I will think this over and most importantly get some shuteye/sleep.

natemcmaster avatar natemcmaster commented on June 16, 2024

Yes, I think you are headed in the right direction. I would make the change in HandleUnexpectedArg, too. What do you think of this?

  • Add an overload to ShowHint: internal void ShowHint(ParameterType paramType, string value)
  • Inside ShowHint, compute the distance to known commands or options. If the distance is below a given threshold, include the nearest command name or option flag in the hint output.

As for which algorithm and which threshold to select, I'm not sure; that requires more investigation. There are half a dozen good algorithms for string comparison, including Levenshtein distance. I'd have to play with them first to see what kind of suggestions they would make.

MadbHatter avatar MadbHatter commented on June 16, 2024

What do you think of this?

Seems good to me.
Should we show only the best match or the best matches (If there are for some reason really similar commands/options)?

As for which algorithm and which threshold to select, I'm not sure; that requires more investigation. There are half a dozen good algorithms for string comparison, including Levenshtein distance. I'd have to play with them first to see what kind of suggestions they would make.

I will see if I can whip up a short demo of different string comparison algorithms and thresholds.

MadbHatter avatar MadbHatter commented on June 16, 2024

This is not representative, seeing as:
A: The mistypes are not natural at all.
B: I used only the default settings (i.e. the default constructor) and did not try to tweak the algorithms.
C: I did not read up on all the algorithms, so it's pretty much guaranteed that I misused some of them.

This is only meant as a showcase and uses the algorithms from SimMetrics.

The mangled input is generated the following way:
Mangle at least 1 char and up to (input.Length/2 - Math.Log(input.Length)) chars.
Then flip a coin and, if it comes up heads, append 2-6 random chars.
The classes from SimMetrics return a double from 0 to 1.
I just selected the string that the given algorithm scored highest (i.e. nearest to 1).
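
For readers who want to reproduce a similar experiment, the mangling procedure described above could look roughly like the following. This is only a sketch and not the code behind the numbers in this thread: the class name InputMangler, the replacement alphabet, rounding the upper bound up, and letting positions repeat are all assumptions.

```csharp
using System;
using System.Text;

// Hypothetical helper, not the code used for the results in this thread.
public static class InputMangler
{
    // Assumption: replacement characters are drawn from ASCII letters only.
    private const string Alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    public static string Mangle(string input, Random rng)
    {
        if (string.IsNullOrEmpty(input)) return input;

        var chars = input.ToCharArray();

        // Upper bound from the description: Length/2 - ln(Length).
        // Assumption: round up, and never go below one mangled character.
        int max = Math.Max(1, (int)Math.Ceiling(input.Length / 2.0 - Math.Log(input.Length)));
        int count = rng.Next(1, max + 1); // the upper bound of Next is exclusive

        for (int i = 0; i < count; i++)
        {
            // Positions may repeat and a replacement may equal the original
            // character, so fewer than 'count' visible changes are possible.
            chars[rng.Next(chars.Length)] = Alphabet[rng.Next(Alphabet.Length)];
        }

        var result = new StringBuilder(new string(chars));

        // Coin flip: on "heads", append 2 to 6 random characters.
        if (rng.Next(2) == 0)
        {
            int extra = rng.Next(2, 7);
            for (int i = 0; i < extra; i++)
            {
                result.Append(Alphabet[rng.Next(Alphabet.Length)]);
            }
        }

        return result.ToString();
    }
}
```

Passing a seeded Random (for example new Random(42)) makes a run repeatable, which helps when comparing metrics against the same set of mangled inputs.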

Mangled Strings (Example):

Input       Output
config      conbig
init        iGit
clone       cWone\b
add         adl
commit      cSmmit
push        Qush
status      stitus
remote      vemote
checkout    OheckOutlq
branch      brancM
pull        pkllKejto
merge       mQrgepdD
diff        difNDr
tag         BagxjkL
log         yog
fetch       fetcYCDfBc
reset       reseNvSHmh
grep        gfepBVGe
The DistanceStringMetrics (for the case above):

Metric                              Correct    Wrong
SmithWatermanGotohWindowedAffine    18         0
SmithWatermanGotoh                  18         0
MongeElkan                          18         0
JaroWinkler                         18         0
Levenstein                          18         0
Jaro                                18         0
SmithWaterman                       17         1
QGramsDistance                      16         2
NeedlemanWunch                      11         7
ChapmanLengthDeviation              4          14
ChapmanMeanLength                   1          17
CosineSimilarity                    1          17
OverlapCoefficient                  1          17
EuclideanDistance                   1          17
JaccardSimilarity                   1          17
DiceSimilarity                      1          17
MatchingCoefficient                 1          17
BlockDistance                       1          17

The same run 1000 times gives the following:

Metric                              Correct    Wrong
JaroWinkler                         17791      209
Jaro                                17747      253
Levenstein                          17668      332
SmithWatermanGotohWindowedAffine    17159      841
SmithWatermanGotoh                  17159      841
MongeElkan                          17159      841
QGramsDistance                      17028      972
SmithWaterman                       16735      1265
NeedlemanWunch                      11810      6190
ChapmanLengthDeviation              3010       14990
CosineSimilarity                    1132       16868
OverlapCoefficient                  1132       16868
EuclideanDistance                   1132       16868
JaccardSimilarity                   1132       16868
DiceSimilarity                      1132       16868
MatchingCoefficient                 1132       16868
BlockDistance                       1132       16868
ChapmanMeanLength                   1000       17000

natemcmaster avatar natemcmaster commented on June 16, 2024

Really interesting data. Thanks for doing some research here. It looks to me like Jaro(Winkler) or Levenshtein are good candidates. I'd be okay with a PR that uses either of these.

MadbHatter avatar MadbHatter commented on June 16, 2024

From Wikipedia:

Informally, the Jaro distance between two words is the minimum number of single-character transpositions required to change one word into the other.

The JaroWinkler implementation favors words that start correctly and make mistakes down the line.
e.g. for commit: comtat (favored) vs aftmit. The amount of changes is the same, but the Similarity value is different.

the Levenshtein distance allows deletion, insertion and substitution;
the Damerau–Levenshtein distance allows insertion, deletion, substitution, and the transposition of two adjacent characters;

In my opinion, Damerau–Levenshtein seems to be the best algorithm for this case.
I will start writing up a test implementation for CommandLineUtils today/tomorrow.
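
To make the difference concrete, here is a minimal sketch of the restricted Damerau–Levenshtein distance (also called optimal string alignment), which counts insertions, deletions, substitutions, and swaps of two adjacent characters. It is an illustration only, not the implementation from the prototype; the class and method names are made up for this example.

```csharp
using System;

// Hypothetical helper; names are illustrative and not part of CommandLineUtils.
public static class EditDistance
{
    // Optimal string alignment variant of Damerau-Levenshtein.
    // Each insertion, deletion, substitution, or adjacent swap costs 1.
    public static int DamerauLevenshtein(string a, string b)
    {
        var d = new int[a.Length + 1, b.Length + 1];

        // Turning a prefix into the empty string takes one deletion per char,
        // and building a prefix from nothing takes one insertion per char.
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i;
        for (int j = 0; j <= b.Length; j++) d[0, j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;

                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1,      // deletion
                             d[i, j - 1] + 1),     // insertion
                    d[i - 1, j - 1] + cost);       // substitution or match

                // Two adjacent characters swapped, e.g. "hc" vs "ch".
                if (i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1])
                {
                    d[i, j] = Math.Min(d[i, j], d[i - 2, j - 2] + 1);
                }
            }
        }

        return d[a.Length, b.Length];
    }
}
```

With this variant, a swapped pair such as fethc for fetch counts as a single edit, whereas plain Levenshtein counts it as two substitutions. The unrestricted Damerau–Levenshtein distance differs slightly because it also allows further edits between the swapped characters.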

MadbHatter avatar MadbHatter commented on June 16, 2024

Hey,
I created a basic implementation over at https://github.com/MadbHatter/CommandLineUtils.
See src/Internal/StringDistance, ShowHint and ThrowOnUnexpectedArgument.

It's not the most beautiful thing, but it's something to build on.
Things to do/that are missing:

  • A threshold (added)
  • Modify NormalizeDistance to account for 0 distance/length inputs (at the moment it's a DivideByZeroException waiting to happen)
  • Added subcommands via flattening of the tree structure (maybe change this to only catch direct subcommands, not sure)
  • Modify the behaviour so it only considers one ParameterType? Not sure if we want this.
  • An option to disable the suggestion?
  • Unit tests for some things (StringDistance.GetBestMatch)

I will be thankful for any input :)

natemcmaster avatar natemcmaster commented on June 16, 2024

Some thoughts:

A threshold (added)

Yes, needed. Some inputs will be just completely off. How did you pick a threshold?

Added subcommands via flattening of the tree structure (maybe change this to only catch direct subcommands, not sure)
Modify the behaviour so it only considers one ParameterType? Not sure if we want this.

I think it would be best to only make suggestions that would be correct if not for the typo. For example, I would consider this a bad suggestion:

$ git fethc
command 'fethc' not found
Did you mean `--force`?

An option to disable the suggestion?

Maybe, but not essential IMO.

MadbHatter avatar MadbHatter commented on June 16, 2024

Yes, needed. Some inputs will be just completely off. How did you pick a threshold?

I normalized the distance between two strings, i.e. 1.0d - distance/length,
meaning that 1 is a perfect match and 0 a complete mismatch.
I then added a threshold to GetBestMatchIndex. At the moment I have it at 0.33, i.e. 1 in 3 has to be right.
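
As a rough illustration of the normalization and threshold described here (not the prototype's actual code), the sketch below turns an edit distance into a similarity between 0 and 1 and returns the best candidate at or above the threshold. Dividing by the longer of the two lengths, treating two empty strings as identical, and the names SuggestionSketch, Similarity, and BestMatch are assumptions made for this example; plain Levenshtein is used to keep it short.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; names are illustrative and not part of CommandLineUtils.
public static class SuggestionSketch
{
    // Plain Levenshtein distance, keeping only two rows of the table.
    public static int Distance(string a, string b)
    {
        var prev = new int[b.Length + 1];
        var curr = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) prev[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            curr[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                curr[j] = Math.Min(Math.Min(prev[j] + 1, curr[j - 1] + 1), prev[j - 1] + cost);
            }
            var tmp = prev; prev = curr; curr = tmp;
        }

        return prev[b.Length];
    }

    // 1.0 means identical, 0.0 means nothing in common.
    // Assumption: "length" is the longer of the two strings.
    public static double Similarity(string a, string b)
    {
        int length = Math.Max(a.Length, b.Length);
        if (length == 0) return 1.0; // both empty: avoid dividing by zero
        return 1.0 - (double)Distance(a, b) / length;
    }

    // Returns the most similar candidate, or null if none reaches the threshold.
    public static string BestMatch(string input, IEnumerable<string> candidates, double threshold = 0.33)
    {
        string best = null;
        double bestScore = double.NegativeInfinity;

        foreach (var candidate in candidates)
        {
            double score = Similarity(input, candidate);
            if (score >= threshold && score > bestScore)
            {
                best = candidate;
                bestScore = score;
            }
        }

        return best;
    }
}
```

For example, SuggestionSketch.BestMatch("fethc", new[] { "fetch", "force", "push" }) picks fetch, while an input with nothing in common with any candidate yields null, so no suggestion would be shown.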

I think it would be best to only make suggestions that would be correct if not for the typo. For example, I would consider this a bad suggestion:

Hmm, fine, I will change it.
My idea was more to make this possible:

myprogram copy a.exe b.exe
command copy not found
Did you mean the option --copy?

But I see your reasoning.

Maybe, but not essential IMO.

At least an option to hide the suggestion for hidden commands? (Does this library support that? I'm not sure at the moment.)

natemcmaster avatar natemcmaster commented on June 16, 2024

Did you mean the option --copy?

I like this idea. Sorry, maybe wasn't clear. What I meant was that we shouldn't make suggestions that would be incorrect. In this case, changing copy to --copy would be correct because --copy is an option on myprogram. Maybe I didn't understand what you meant by "Added subcommands via flattening of the tree structure". I thought you were implying that you might do this:

Usage:
myprogram subcommand [--copy]
$ myprogram --cpy
Did you mean `--copy`?

In this case, it would be a bad suggestion because you can't execute myprogram --copy, but must do myprogram subcommand --copy.

At least an option to hide the suggestion for hidden commands

Already exists. A ShowInHelpText property exists on commands, options, and args. If this is set to false, I would expect it to be excluded from hints.
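
Putting the two points above together, candidate collection could be sketched as follows: only the direct subcommands and options of the command being parsed are considered, and anything with ShowInHelpText set to false is skipped. CommandInfo and OptionInfo are simplified, hypothetical stand-ins invented for this example; they are not the library's real types, and only the ShowInHelpText name is taken from this thread.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-ins for the library's command and option types.
public sealed class OptionInfo
{
    public string LongName = "";
    public bool ShowInHelpText = true;
}

public sealed class CommandInfo
{
    public string Name = "";
    public bool ShowInHelpText = true;
    public List<CommandInfo> Commands = new List<CommandInfo>();
    public List<OptionInfo> Options = new List<OptionInfo>();
}

public static class SuggestionCandidates
{
    // Collects names that would be valid at the current position:
    // direct subcommands and options of 'current', minus hidden ones.
    // Options of nested subcommands are deliberately not included.
    public static IEnumerable<string> For(CommandInfo current)
    {
        var commands = current.Commands
            .Where(c => c.ShowInHelpText)
            .Select(c => c.Name);

        var options = current.Options
            .Where(o => o.ShowInHelpText)
            .Select(o => "--" + o.LongName);

        return commands.Concat(options);
    }
}
```

With the usage shown above, the candidates for myprogram would contain subcommand but not --copy, so myprogram --cpy would not be answered with a suggestion that cannot be executed at that position.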

MadbHatter avatar MadbHatter commented on June 16, 2024

Sorry, I am on mobile.

In this case, it would be a bad suggestion because you can't execute myprogram --copy, but must do myprogram subcommand --copy.

Ah. That was not my intention, but I wasn't clear enough. At the moment it does this because I didn't have the time to fix it. My final intention is for it to be aware of valid subcommands and other parameters only. It would also output the whole necessary command structure to prevent confusion, e.g.
mycommand --magic vs. only --magic

Already exists. A ShowInHelpText property exists on commands,

Ah good to know. I must include this.

Thanks for your thoughts.

jerriep avatar jerriep commented on June 16, 2024

@natemcmaster I know this is not the same thing, but are there plans to support tab completion like the .NET CLI does?

I see their commands have this thing called WithSuggestionsFrom, which allows them to do that:
https://github.com/dotnet/cli/blob/8c937a0db08e56660aca456ac088f2d0e70735ab/src/dotnet/commands/dotnet-add/dotnet-add-package/AddPackageParser.cs#L25

Sorry for always trying to add more to your plate ;) I am building a CLI to manage GitHub Issues and this may be a pretty nifty feature to help users quickly type the correct repo name or issue number.

natemcmaster avatar natemcmaster commented on June 16, 2024

I originally opened #9 for this, but didn't get much feedback on it. No plans to implement it at the moment, but I'm open to taking a PR if someone wants to design and build it.

jerriep avatar jerriep commented on June 16, 2024

Thanks Nate, I would be interested to look into that, but for now I want to get this GitHub CLI out of the door. I am learning a lot along the way and there are a bunch of other helper methods and utilities I think we can add.

natemcmaster avatar natemcmaster commented on June 16, 2024

I've put this in the 2.3.0 milestone. @MadbHatter let me know if you are interested in contributing your prototype. It seems like a really good starting point.
