Browsers swallow the first newline after a textarea tag, as per the HTML spec. The tex

Hmm, you are right the documentation is not clear here <a href="http://www.w3.org/TR/h

From the <a href="http://www.w3.org/TR/html401/appendix/notes.html#notes-line-breaks"

I think the opinion of a core team member would be better on this. <a class="user-ment

Good call, <a class="user-mention notranslate" data-hovercard-type="user" data-hoverca

textarea helper swallows leading newline,about rails/rails

Comments (46)

vandrijevik commented on May 13, 2024

Can you please point where the HTML spec states that a newline after <textarea> should be silently ignored? Neither http://dev.w3.org/html5/markup/textarea.html nor http://dev.w3.org/html5/markup/syntax.html#replaceable-character-data mention anything on this topic.

from rails.

Fjan commented on May 13, 2024

Hmm, you are right the documentation is not clear here http://www.w3.org/TR/html4/interact/forms.html#h-17.7 but the example given shows that the newline gets swallowed. I tested this behaviour in all browsers I have here and they do consistently eat the newline after the text area (a quick Google will confirm this as well).

As an aside, the textarea helper in ASP.NET does emit an additional new line, perhaps this is why all browser vendors choose to swallow it. This blog pots has a nice illustration: http://haacked.com/archive/2008/11/18/new-line-quirk-with-html-textarea.aspx

from rails.

matiaskorhonen commented on May 13, 2024

From the HTML 4.01 specification appendices:

SGML (see [ISO8879], section 7.6.1) specifies that a line break immediately
following a start tag must be ignored, as must a line break immediately
before an end tag. This applies to all HTML elements without exception.

Unfortunately I can't find the same text in the HTML5 spec, but it would seem that this is expected browser behaviour for the foreseeable future. Especially as the HTML5 spec mostly formalizes existing browser behaviour.

from rails.

dmathieu commented on May 13, 2024

This applies to all HTML elements without exception.

Based on that, it shouldn't only apply to textareas, but all tags.

So instead of <textarea></textarea>, we should have <textarea>\n</textarea> right ?

from rails.

Fjan commented on May 13, 2024

The textarea tag is the only one that has the problem where a leading \n can be swallowed, the other HTML tags are not sensitive to white space. So it would seem superfluous to add a \n inside every tag.

from rails.

Fjan commented on May 13, 2024

Thanks for fixing that. I looked at the fix, and in this case it might actually be a bit more elegant to simply prefix the content with a newline. That way we don't have to put that extra parameter in for the 99% of the cases where it isn't needed. Good for performance too:

content_tag("textarea", "\n#{html_escape(options.delete('value') || value_before_type_cast(object))}", options)

from rails.

Fjan commented on May 13, 2024

oh, wait, the html_escape should go on the outside, I tested it and it works fine, so:

content_tag("textarea", html_escape("\n#{options.delete('value') || value_before_type_cast(object)}"), options)

from rails.

dmathieu commented on May 13, 2024

I think the opinion of a core team member would be better on this. @josevalim what do you think ?

from rails.

benatkin commented on May 13, 2024

Good call, @dmathieu. Perhaps we could also see if and how another popular form plugin handles this. I'm taking a look at https://github.com/justinfrench/formtastic

from rails.

parndt commented on May 13, 2024

This never actually got closed?

from rails.

Fjan commented on May 13, 2024

Yep, this bug has been in there since rails 1.0.
The suggested patch is a bit unwieldy though, since it can also be fixed by simply adding a single newline in the right place instead of adding an extra parameter to every content tag, perhaps that's why it didn't get picked up.

from rails.

jeremy commented on May 13, 2024

Crazy. @Fjan, @mathieu, got a simpler patch without the extra param? If this is broken everywhere, it should be default.

from rails.

codykrieger commented on May 13, 2024

@jeremy There's a simpler patch - better?

from rails.

Fjan commented on May 13, 2024

@codykrieger Thanks, but this adds a hash lookup to every generated tag. Why not use the 1 line patch I suggested above? (excuse me for not doing it myself, I don't have git here)

from rails.

codykrieger commented on May 13, 2024

I actually get a bunch of failures in the tests when using your patch:

Finished tests in 80.100596s, 42.4341 tests/s, 192.9074 assertions/s.

  1) Failure:
test_text_area_with_escapes(FormHelperTest) [/Users/codykrieger/projects/forks/rails/actionpack/test/template/form_helper_test.rb:445]:
<"<textarea cols=\"40\" id=\"post_body\" name=\"post[body]\" rows=\"20\">\nBack to &lt;i&gt;the&lt;/i&gt; hill and over it again!</textarea>"> expected to be == to
<"<textarea cols=\"40\" id=\"post_body\" name=\"post[body]\" rows=\"20\">\nBack to &amp;lt;i&amp;gt;the&amp;lt;/i&amp;gt; hill and over it again!</textarea>">.

  2) Failure:
test_text_area_with_html_entities(FormHelperTest) [/Users/codykrieger/projects/forks/rails/actionpack/test/template/form_helper_test.rb:460]:
<"<textarea cols=\"40\" id=\"post_body\" name=\"post[body]\" rows=\"20\">\nThe HTML Entity for &amp; is &amp;amp;</textarea>"> expected to be == to
<"<textarea cols=\"40\" id=\"post_body\" name=\"post[body]\" rows=\"20\">\nThe HTML Entity for &amp;amp; is &amp;amp;amp;</textarea>">.

  3) Failure:
test_content_tag_with_newline(TagHelperTest) [/Users/codykrieger/projects/forks/rails/actionpack/test/template/tag_helper_test.rb:92]:
<"<textarea>\nlimelight</textarea>"> expected to be == to
<"<textarea>limelight</textarea>">.

3399 tests, 15452 assertions, 3 failures, 0 errors, 0 skips
rake aborted!

Looks like things are being escaped oddly.

UPDATE: Turning your patch into content_tag("textarea", "\n#{html_escape(options.delete('value') || value_before_type_cast(object))}".html_safe, options) fixed the first two failures.

from rails.

Fjan commented on May 13, 2024

Yeah, the corrected one was actually what I suggested on July 1 (you probably picked the earlier one from an (untested) comment in response to the initial test).

from rails.

rafaelfranca commented on May 13, 2024

I opened a new pull request with a simpler solution. Could you check if it is satisfactory

from rails.

Fjan commented on May 13, 2024

I haven't tried it but I think the html_safe on the first line is not correct (the ERB:Util term is already html_safe and the value_before_type_cast may not be html safe). Just removing it should fix it.

from rails.

rafaelfranca commented on May 13, 2024

@Fjan no. It is need. The ERB::Util expression is

ERB::Util.html_escape(options.delete('value') || value_before_type_cast(object))

This will output a html_safe string.

So if I call

"\n".html_safe + ERB::Util.html_escape(options.delete('value') || value_before_type_cast(object))

Is the same thing that

"\n#{ERB::Util.html_escape(options.delete('value') || value_before_type_cast(object))}".html_safe

But the result of this expression need to be a html safe string, or the content_tag will escape the html again.

from rails.

Fjan commented on May 13, 2024

My point is that value_before_type_cast(object) may contain unsafe characters so you create a potential HTML injection exploit by doing that.

from rails.

rafaelfranca commented on May 13, 2024

But it is escaped by ERB::Util.html_escape

from rails.

Fjan commented on May 13, 2024

Oh, ok, sorry. That seems kind of pointless though: escaping it and then marking it html safe since the content_tag does the same thing.

from rails.

rafaelfranca commented on May 13, 2024

I can change this line to:

"\n".html_safe + ERB::Util.html_escape(options.delete('value') || value_before_type_cast(object))

But this is the same thing that the current implementation.

If the result of this concatenation is not HTML safe the content_tag will escape the HTML again. So some tests will be broken.

Like these one:

  1) Failure:
test_text_area_with_escapes(FormHelperTest) [/Users/rafaelmfranca/Projects/github/rails/actionpack/test/template/form_helper_test.rb:485]:
--- expected
+++ actual
@@ -1,2 +1,2 @@
 <textarea cols="40" id="post_body" name="post[body]" rows="20">
-Back to &lt;i&gt;the&lt;/i&gt; hill and over it again!</textarea>
+Back to &amp;lt;i&amp;gt;the&amp;lt;/i&amp;gt; hill and over it again!</textarea>


  2) Failure:
test_text_area_with_html_entities(FormHelperTest) [/Users/rafaelmfranca/Projects/github/rails/actionpack/test/template/form_helper_test.rb:500]:
--- expected
+++ actual
@@ -1,2 +1,2 @@
 <textarea cols="40" id="post_body" name="post[body]" rows="20">
-The HTML Entity for &amp; is &amp;amp;</textarea>
+The HTML Entity for &amp;amp; is &amp;amp;amp;</textarea>

from rails.

rafaelfranca commented on May 13, 2024

Hey @Fjan. I refactored the code to leave the escape logic only with the content_tag. Now the code is cleaner.

Thanks to review it.

from rails.

Fjan commented on May 13, 2024

Ok, great. Thanks very much for taking the time to do this. I think this bug is probably the longest standing one in Rails, it has been there since Rails 1. Cheers.

from rails.

mhfs commented on May 13, 2024

@jeremy @josevalim I guess this one can be closed since @rafaelfranca's #5190/#5191 have been merged.

from rails.

rafaelfranca commented on May 13, 2024

What?! Github didn't close this?

from rails.

jcoleman commented on May 13, 2024

@Fjan I'd like to go back to @codykrieger's original patch. I know it adds a hash lookup, but the "simpler" solution smacks of a hacky premature optimization.

Also, the currently implemented solution breaks textarea tags in all apps using HAML (which obviously has a huge user base in the Rails community.) The real problem here isn't just that HAML does something odd, it could easily break other extensions because the newline isn't part of the tag content (though the current patch treats it as such.) The newline is actually part of the tag not the content.

Anyone have more input on this? @josevalim: if I opened a pull request to go back to @codykrieger's patch would that be acceptable?

from rails.

Fjan commented on May 13, 2024

Adding a hash look up for EVERY single HTML tag generated by Rails instead of adding a few bytes to the code base... "premature" optimization would be the last qualification I would think off. Hacky I can live with, although you could argue that your preferred solution of "magically" adding a newline to the content is more of a hack than magically adding it to the tag where it belongs.

Would you care to elaborate on the HAML problem? Why would an extension break because of an additional new line? Does HAML output an incorrect text area tag as well?

from rails.

jcoleman commented on May 13, 2024

The reason I think it's hacky and premature optimization is that technically based on the HTML specs quoted, every tag should have a newline. Now we all know this isn't necessary, but it is true nonetheless. So the newline only being added to the textarea tag is to address browser implementations not being compatible with Rails not generating newlines in each tag.

When you magically add it to the content (rather than to the tag) it breaks extensions like HAML because the content has changed (rather than the tag.) In the specific HAML case, HAML automatically preserves whitespace in textarea tags (so it turns newlines into their respective HTML codes) since indentation obviously shouldn't happen inside textarea content (or rather, the browser eats indentation, but you want the content whitespace to be preserved.) Since you've added a newline to the content, HAML preserves that whitespace and browsers add an additional newline. So every time a form containing one is submitted and an object updated an saved, it gets an extra newline tacked onto the beginning. So the patch is actually causing the reverse problem of the problem it intends to prevent.

Extensions need to be able to access the actual content rather than modified content. And I doubt that a single hash lookup here (even though it's in every tag generated) really causes any measurable performance difference even on high usage apps. Hash lookups are performed in constant time and are in general extremely cheap, and there are lots of other parts of the view generation process (like view contexts for each partial) that cause exponentially higher overhead than a single hash lookup.

If you want to avoid the hash lookup, we could follow the spec and add newlines to every tag, but I don't think that would be appreciated...

from rails.

jcoleman commented on May 13, 2024

To be fair, just going with the original hash-lookup based patch won't automatically fix HAML, but it will making patching HAML far less dirty (and make it resilient to different versions of Rails with or without this fix.)

from rails.

jcoleman commented on May 13, 2024

By "far less dirty" I mean that to patch HAML with Rails the way it currently works, then HAML will have to either check Rails versions or magically divine whether the newline beginning the textarea tag content is actually part of the content or is just part of the tag. There's no way of knowing definitively the way it's currently implement.

from rails.

Fjan commented on May 13, 2024

Well, opinions can differ on what's a worthwhile optimization, but I would venture that an additional lookup on every HTML tag on every page generated by every rails application in the world causes a larger CO2 footprint than you as a person. (That's true of any optimization in a hot part of a frequently used code base, but this is a very hot part of a very large installed base.)

If HAML's behaviour is that closely tied to the behavior of the Rails content_tag it should just be tied to the correct Rails version by its gem spec. After the fix HAML can assume a newline is already added, older Rails users get an older HAML gem, no hacking or version detecting needed.

from rails.

jcoleman commented on May 13, 2024

I realize that this is a hot piece of code, but I'd suggest that the other parts of this method still dwarf the cost of an additional hash lookup. For example, html escaping and the tag_options calls are both rather expensive.

Unless there are benchmarks to back up the hash lookup being that consequential, I'd still say this is a premature optimization that shouldn't be the sole determinant of how this gets patched.

We shouldn't break correctness (the \n isn't actually part of the content--though the current patch acts like it is) just to micro-optimize.

from rails.

Fjan commented on May 13, 2024

I agree the rest of the code base is wasting CPU cycles left and right (and that makes me sad). To give you one example of the numbers we are talking about: Twitter 20B page views, 890 tags on the page (I did a quick grep/google). You now already have something with 15 zeroes, let's say a server can do a million a second, then you are still left with 10^9 s = 31 additional servers per year just for Twitter. I am by no means a tree hugger, but you shouldn't think too lightly about your impact on global energy consumption and your responsibilities as a programmer.

Anyway, I agree with you that the current implementation is a hack, we just differ on whether the added cost weighs up against making it slightly less hacky, especially since it's not really broken.

from rails.

jcoleman commented on May 13, 2024

Assuming 20 billion pages views a year, and 1000 tags on the page, then if a server can process a million per second you're looking at 232 additional server days, so under one extra server a year.

Interestingly enough, in a quick benchmark (in an entirely unoptimized Ruby on my MacBook, I find that Ruby can do more than 20 million hash lookups in a second.

So I'd say again, I don't think this is really an issue from a performance perspective.

Plus, the whole point of Ruby is to optimize for the user rather than for the machine. This case is very much optimizing for the machine rather than for the user.

I realize that in this case the only problem is for plugins, but this creates a maintenance trap in the Rails code too. If you're working in the content_tag method, you expect content to be just that, the content. Not some modified variable with additional non-content information tacked on. The method that logically should know about this (separation of concerns) is content_tag--not the method calling it to create a textarea tag.

from rails.

Fjan commented on May 13, 2024

I don't feel like doing the math again, but even taking your number and adding that there additional logic and parameters being passed and that there's a lot more apps out there than twitter (and that the 20B number was from 2009) I'm sure I can come up with a large number.

Perhaps this is not the best place to discuss this, but the principle of Ruby is to make it an optimal experience for the end-user. Matz himself writes mostly in Ansi C to give you that. Rails should follow that philosophy and do the same for its end users, not do something that's optimal for the HAML maintainers.

Edit: I just noticed your comment in another thread that points out the original fix may have a bug. That changes matters, of course

from rails.

jcoleman commented on May 13, 2024

I definitely understand your concerns re: speed. Rails 3 is already significantly slower than Rails 2 in some cases. I'd just much rather squeeze that performance out of things that can be optimized without making things harder to maintain in Rails or in other projects.

Thanks for all the push back though. In the end, all the discussion hopefully pushes us all into the best of both worlds.

from rails.

angelomichel commented on May 13, 2024

Who came up with this idea that some element should have a newline character in a textarea? It makes NO sense at all. Basic HTML always says < pre >< / pre > isn't < pre >[ new line here]< / pre > Why would a textarea be different?

Google Chrome users now have an issue thanks to this (if I google this problem, I see allot of frustrated users by enabling this!). They all get free bonus unwanted characters before text_area's.

PLEASE roll this back to normal state

Or is there any reason why anyone should/would want this character injected?

from rails.

jcoleman commented on May 13, 2024

@angelomichel Actually, the HTML spec came up with the idea. The character needs to be injected because browser (including most versions of Chrome) actually swallow a leading newline, so if you don't have one, you actually lose real information from the textarea (assuming the real content started with a newline.)

If you're using Haml, then the extra leading spaces/newlines actually being in the content is a known bug, and there is a fixed version out as well. Check the Haml Github repo issues tracker for more information.

from rails.

codykrieger commented on May 13, 2024

@angelomichel, see this comment: #393 (comment)

If Chrome isn't ignoring the first newline inside of textarea tags, it's broken.

from rails.

jcoleman commented on May 13, 2024

@codykrieger @angelomichel Chrome doesn't ignore the leading newline in some specific versions when the newline is encoded as an HTML entity. This is caused by Haml, but there is a fixed version of Haml to correct this behavior (the leading newline shouldn't be encoded.)

from rails.

Fjan commented on May 13, 2024

@codykrieger Actually, the SGML spec that preceded the HTML spec already had it.

@angelomichel The reasoning is probably that you can nicely format the HTML / SGML source by adding an extra newline after a tag. All other tags also allow an extra new line there but all other tags are insensitive to white space so that's why textarea needs a different treatment.

from rails.

es128 commented on May 13, 2024

Perhaps this could be changed to only insert the newline if there is content going into the textarea.

Currently, when this ends up being inserted as the html entity 
 in a textarea that would otherwise be empty, it breaks the placeholder attribute on Mobile Safari.

I haven't tested, but I suspect that Mobile Safari handles an actual newline, as opposed to the entity code, correctly.

from rails.

jcoleman commented on May 13, 2024

@es128 Are you using Haml? The html entity is not inserted by Rails. Rails inserts an actual newline. Haml decides to encode. If you look at the relevant bugs in the Haml repository on Github, you'll find that there are multiple workarounds and fixes available. I believe that latest version of Haml actually completely resolves this (though it's not the nicest of solutions.)

from rails.

es128 commented on May 13, 2024

@jcoleman Yes, I'm using Haml. Thanks for the tip.

from rails.

textarea helper swallows leading newline about rails HOT 46 CLOSED

Comments (46)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent