Code Monkey home page Code Monkey logo

Comments (10)

UncleGene avatar UncleGene commented on August 23, 2024

Flay compares "normalized" code, so it cannot tell how may lines are duplicated (potentially - different for each file). You can see duplicated code by running flay in a diff mode (requires ruby2ruby),

Potentially code can be modified to approximate number of lines, but there is not enough information preserved during parsing. It is pretty common request; Ryan, would it be possible to change parser to preserve this information in sexp?

from flay.

ggallen avatar ggallen commented on August 23, 2024

Well, the normalized code has the start line information. I would expect it has line information for each line in the duplicated block. Assuming that is the case the number of lines could be calculated.

And as for the number of duplicated lines being different per file, that is true. However I would expect (especially with cut and paste) that this would be the exception, not the norm. And if they are different I would like to know that as well. Couldn't each instance contain a start/end line number to solve this problem?

IMO this is a big issue for using flay on my project. It actually creates work for me in that I have to look at each instance and compare them visually to see where the duplication ends, which is of course error-prone. And I'd rather not have to install yet another tool to work around this - this seems like information flay has to have and it just needs to output it.

To me this is similar to a compiler issuing a message that says "error on line 4". That only tells me there is a problem on line 4, but not what the problem is. Is it a syntax error? A type error? Undefined variable? A good compiler (actually, probably any compiler) gives you enough detail and context to (hopefully) easily identify and fix the problem.

from flay.

zenspider avatar zenspider commented on August 23, 2024

THIS is the big issue? Not the fact that your example is the 1970th thing in the output? :P

This "creates work for me in that I have to look at each instance" is exactly the point of flay. It tells you there is work you need to do. It is NEVER the intention that it is some automatic "remove all the dupes for me" type of tool. You're programming, thinking is what you do.

You know it starts on line 64. You know it is a defn node (a method)... ie a whole method. You know it has 32 nodes within. And if you run with -d, you'll see the whole thing. How is that not enough info?

from flay.

ggallen avatar ggallen commented on August 23, 2024

You are missing my point and inferring things. I never said I expected flay to automatically "remove all the dupes". I simply asked for it to give me more information.

This "creates work for me in that I have to look at each instance" is exactly the point of flay.

And this cuts what I actually said at a convenient point for you to bolster your argument. If you include the rest of the statement - "and compare them visually to see where the duplication ends" - it takes on an entirely different meaning.

Of course there is work to be done there. As you say, that is the point. But how much work? 1 line? 2 lines? 40 lines? I have no idea based on the output. As it is, I have to go into the code and look. Knowing how big the duplication is could certainly help me prioritize the issues.

All I knew initially was it started on line 64 (actually 44 or 33, take your pick). I did not know that a defn == method. How was I to know that? I certainly didn't know it had 32 nodes within. I think you may be assuming a familiarity with the internals of flay and the algorithms it uses that novice users may not have and may never acquire. At least I certainly don't have it - but I have already learned a great deal.

I am only requesting information that it seems should be provided and that I would expect the parser to have handy already. To me this is information that would enhance flay.

It is pretty common request;

And judging by this comment, I would have to conclude that I am not alone.

from flay.

zenspider avatar zenspider commented on August 23, 2024

"convenient point for you to bolster your argument" ? Maybe I'm reading more hostility into this response than there actually is, but it is rubbing me the wrong way and I'm simply not in the mood right now to deal with this.

From the readme http://docs.seattlerb.org/flay/:

  % flay -v --diff ~/Work/svn/ruby/ruby_1_8/lib/cgi.rb
  Processing /Users/ryan/Work/svn/ruby/ruby_1_8/lib/cgi.rb...

  Matches found in :defn (mass = 184)
    A: /Users/ryan/Work/svn/ruby/ruby_1_8/lib/cgi.rb:1470
    B: /Users/ryan/Work/svn/ruby/ruby_1_8/lib/cgi.rb:1925

  A: def checkbox_group(name = "", *values)
  B: def radio_group(name = "", *values)
       if name.kind_of?(Hash) then
         values = name["VALUES"]
         name = name["NAME"]
       end
       values.collect do |value|
         if value.kind_of?(String) then
  A:       (checkbox(name, value) + value)
  B:       (radio_button(name, value) + value)
         else
           if (value[(value.size - 1)] == true) then
  A:         (checkbox(name, value[0], true) + value[(value.size - 2)])
  B:         (radio_button(name, value[0], true) + value[(value.size - 2)])
           else
  A:         (checkbox(name, value[0]) + value[(value.size - 1)])
  B:         (radio_button(name, value[0]) + value[(value.size - 1)])
           end
         end
       end.to_s
     end

  IDENTICAL Matches found in :for (mass*2 = 144)
    A: /Users/ryan/Work/svn/ruby/ruby_1_8/lib/cgi.rb:2160
    B: /Users/ryan/Work/svn/ruby/ruby_1_8/lib/cgi.rb:2217

     for element in ["HTML", "BODY", "P", "DT", "DD", "LI", "OPTION", "THEAD", "TFOOT", "TBODY", "COLGROUP", "TR", "TH", "TD", "HEAD"] do
       methods = (methods + (("          def #{element.downcase}(attributes = {})\n" + nO_element_def(element)) + "          end\n"))
     end
  ...

from flay.

UncleGene avatar UncleGene commented on August 23, 2024

The tone of this conversation is not too constructive, but I'll risk to continue here.

Flay is perfectly suited for command-line use. But if you look at clients that are trying to use it as a library (MetricFu, and, looking at results, CodeClimate too) with their own presentation layer, they all are making futile attempts to mitigate absence of this information (it is common problem for flog and flay). It would be really nice if not only start, but also the end (at least line #) position can be preserved for the node.

from flay.

ggallen avatar ggallen commented on August 23, 2024

I apologize if my post was received in a negative light. That was not my intent.

I was simply trying to get across two points:

  1. A novice flay user (for example, me) may have no idea where the duplication ends without comparing two of the duplicated areas manually.
  2. I would expect both a start and an end line to the duplicated block, and I would think the parser readily has that information. I could very well be wrong.

Apologies again if I offended anyone.

from flay.

zenspider avatar zenspider commented on August 23, 2024

@UncleGene I can totally accept your feedback on that, but since this (writing tools to interface Flay directly) isn't the case with the OP, I consider this a separate issue. I welcome you to file an issue on sexp_processor (as this is really about the API on Sexp itself more than anything).

@ggallen I don't know how many times I need to show you that the information is already there (and documented!) and even better than what you're asking for. Asking a novice flay user to read the readme is not too much to ask.

from flay.

ggallen avatar ggallen commented on August 23, 2024

@zenspider Yes, I agree I can figure it out from the diffs. I will try that approach. And FYI, I am trying to use flay within metric_fu and send the results to SonarQube via a plugin. Hence the need for either the end line or the number of lines in the duplicated block. Thanks.

from flay.

ggallen avatar ggallen commented on August 23, 2024

@zenspider I have to retract my previous statement. I can sort of figure it out from the diffs.

The end result here for me is to use the flay output to display duplicated code in SonarQube. SonarQube expects the start line and number of lines in the duplication in the input, and in the UI displays the actual duplicated source code block for inspection by the user.

Using the output of flay with the --diff I can try to simulate this. By counting the lines on the diff output and associating those counts with the correct file I can calculate the number of lines for each duplicated block in each file. However, the numbers are not always correct.

After poking around the flay code I realized that the diff output code is generated by using ruby2ruby on the sexp block of the duplicated code. While this makes perfect sense and presents the user with a well-formatted block of code showing the duplcation, it may not match up with the actual source. In particular, comment lines are not included and multiple statements on a single line are changed to two lines.

The end result is that the number of lines I calculate and pass to SonarQube is not always correct. Close, but not correct.

from flay.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.