
Comments (13)

jazeee commented on June 7, 2024

For now, revert to jazeee:spiderable-longer-timeout 1.2.2 which works, but does not handle 404s.
We will have to look at it.
@dr-dimitru have you seen this issue?

from jazeee-meteor-spiderable.

dr-dimitru commented on June 7, 2024

@Buom01 First, try to access the Node app directly, without the proxy. Could you provide a link (it should include the port, like 3000; I tried port 300 but got no response)?
BTW (see image below), something is wrong with your setup: your server is not supposed to respond with 204.
[Screenshot taken 2015-08-03 at 7:46:15 pm]

dr-dimitru commented on June 7, 2024

@jazeee I've never met such an issue before.

Buom01 commented on June 7, 2024

Sadly, when I talked about a proxy, I meant the reverse proxy that my host set up to give me port 80 (shared hosting...).

"BTW (see image below) it is something wrong with your setup, your server not supposed to response with 204"

... Now I remember that my host was recently under maintenance, and I think they changed their proxy config (...).

Solution found!
I changed ROOT_URL to point at Node.js directly, like this:

process.env.ROOT_URL = "http://" + process.env.OPENSHIFT_NODEJS_IP + ":" + process.env.OPENSHIFT_NODEJS_PORT;

And not like this (that is, not pointing to http://test-buom01.rhcloud.com/):

process.env.ROOT_URL = "http://" + (process.env.OPENSHIFT_APP_DNS || "localhost:8000");

One idea is to derive it from the $PORT and $BIND_IP environment variables rather than from $ROOT_URL. (I don't know the default value of $ROOT_URL.)
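A minimal sketch of that idea, for illustration only. The helper name buildRootUrl is hypothetical, the fallback to the OPENSHIFT_NODEJS_* variables mirrors the snippet above, and the final default of localhost:3000 is an assumption, not something this package defines:

```javascript
// Hypothetical helper: derive a ROOT_URL from the bind address and port
// instead of the public DNS name that sits behind the host's reverse proxy.
function buildRootUrl(env) {
  // Prefer BIND_IP/PORT, then the OpenShift variables used above.
  // The "localhost" and "3000" defaults are assumptions for this sketch.
  var host = env.BIND_IP || env.OPENSHIFT_NODEJS_IP || "localhost";
  var port = env.PORT || env.OPENSHIFT_NODEJS_PORT || "3000";
  return "http://" + host + ":" + port;
}
```

It could be used as process.env.ROOT_URL = buildRootUrl(process.env); before the app starts.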

I am keeping this app in its current state so I can notify my host that there are bugs in their Apache proxy config.

Thank you very much 👍

jazeee commented on June 7, 2024

Ok, sounds like it was a configuration/maintenance issue.
Thanks.

jazeee commented on June 7, 2024

I found that this issue does exist, and is due to a bug in the phantomjs script.

It appears to depend on whether one uses a proxy or something similar, for example nginx in front of Meteor to provide SSL.

One way to test this: create a Meteor server with an nginx wrapper for SSL, then run the phantomjs script:

phantomjs --load-images=no --ssl-protocol=TLSv1 --ignore-ssl-errors=true --web-security=false jazeee-meteor-spiderable/lib/phantom_script.js https://your-server

This returns a 204. The reason is that https://github.com/jazeee/jazeee-meteor-spiderable/blob/master/lib/phantom_script.js#L60 is not correct.

page.onResourceReceived(...) triggers on every resource response, including image assets and the Meteor WebSocket connection. The first part of that function correctly checks the URL; the remaining part, however, accepts any status. Since this seems unnecessary, I am removing that part to fix this bug.

EDIT:
When I test the current script, I see the URL: https://.../sockjs/383/l2de9dnl/xhr_send, with status 204, which is the wrong status for the page. If I remove the problem code, I get the right result.
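To illustrate the point, here is a rough sketch of taking the status only from the response whose URL matches the requested page. It is not the code in phantom_script.js. The function name pickPageStatus is hypothetical, the plain objects stand in for what PhantomJS passes to page.onResourceReceived (only the url, status and stage fields are used), and exact string matching of URLs is an assumption:

```javascript
// Sketch only: pick the status of the page itself and ignore every other
// resource (assets, XHR polling, sockjs traffic).
function pickPageStatus(requestedUrl, responses) {
  var status = null; // null means no response was seen for the page itself
  responses.forEach(function (response) {
    // Skip anything that is not the requested page.
    if (response.url !== requestedUrl) { return; }
    // PhantomJS reports a resource at stage "start" and again at "end";
    // only the "end" report is used here.
    if (response.stage !== "end") { return; }
    status = response.status;
  });
  return status;
}
```

With this filter, a trailing 204 from a sockjs xhr_send request no longer overwrites the page's own status.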

dr-dimitru commented on June 7, 2024

Without this logic statement you will not receive the status code on redirects.

dr-dimitru commented on June 7, 2024

I believe the developer should work around this on the proxy server side, because we use nginx and SSL in production with this package without any issues.

jazeee commented on June 7, 2024

I don't think it makes sense to rely on the response code of an arbitrary resource. For example, as part of the processing, I see a response code of 204 for the URL https://.../sockjs/383/l2de9dnl/xhr_send

This is the last URL that is processed before phantomJS finishes. It is quite complex to debug these issues, and other users will have serious problems that will appear random. For example, it works locally, but doesn't work due to some arbitrary nginx configuration, version, or other arbitrary server setup/load etc.

Since this issue affects the majority of normal cases, we will have to look at other options for redirects.
For example, if it is specifically handling redirects, then we should capture that in a different way, and handle it specifically.

dr-dimitru commented on June 7, 2024

I believe it should work with any kind of redirect. BTW, an HTTP redirect is exposed via response.redirectURL. But I partly agree with you, so we should pre-program support for "complete" responses, like 200, 302, 404, 400, 500, etc., and avoid "non-complete" ones, like 206, 204, etc.
And add all supported codes to the docs. What do you think?
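A possible shape for that check, as a sketch only. The name isCompleteStatus is hypothetical, and the list of "non-complete" codes contains just the two named above; the "etc." is deliberately left unfilled:

```javascript
// Sketch: statuses that should not be treated as the page's final status.
// Only the two codes named in the discussion are listed here.
var NON_COMPLETE_STATUSES = [204, 206];

// Returns true when a status may be reported as the page's final status.
function isCompleteStatus(status) {
  if (typeof status !== "number") { return false; }
  return NON_COMPLETE_STATUSES.indexOf(status) === -1;
}
```

Any code that ends up in that list would also be the natural candidate for the "supported codes" section of the docs.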

jazeee commented on June 7, 2024

Quite possible. My changes seem to also break 404, but I think we will have to be very careful about these issues. (In a separate ticket).
I wouldn't want phantom to respond with a 404 just because an image file or asset is missing.

Of course, the primary goal is to make the successful paths work, in most people's scenarios.

jazeee commented on June 7, 2024

Just a note, if I log all response statuses during a redirect test, within onResourceReceived, I get about 15x200, then 204 then 4x200 then 204, then 200. These are most likely due to the various JavaScript pages or websocket connections occurring during the page load. I don't see any redirect status codes. If I test the same, but directly to meteor, bypassing nginx, I see about 12x200 codes.

Not sure if it is a PhantomJS issue, a polling issue, or something else. In any case, I doubt that we can count on the intermediate status codes being representative of the page's final code.

In reality, I think we have to specify something else, such as adding Spiderable.responseCode = 404, to the IronRouter/Meteor code.

dr-dimitru commented on June 7, 2024

It is easily solved at this statement
The line you removed (maybe it was incorrect) shouldn't have been removed; it should be replaced with something more complex that handles redirects (when the page's URL is not equal to the originally requested URL). As it stands, you have broken correct redirects and every kind of response except 200 (which is set by default).
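One way the "something more complex" might look, as a hedged sketch rather than a proposed patch. The function name resolveStatusAcrossRedirects is hypothetical, the input objects stand in for PhantomJS onResourceReceived responses (url, status, stage, redirectURL), and it assumes redirectURL holds an absolute URL that can be compared directly. Whether the crawler should see the redirect's own code or the final page's code is left open here, so both are returned:

```javascript
// Sketch: follow the redirect chain that starts at the requested URL and
// ignore every response that is not part of that chain.
function resolveStatusAcrossRedirects(requestedUrl, responses) {
  var currentUrl = requestedUrl; // URL currently expected to answer for the page
  var firstStatus = null;        // status of the originally requested URL
  var finalStatus = null;        // status of the last URL in the chain
  responses.forEach(function (response) {
    if (response.stage !== "end" || response.url !== currentUrl) { return; }
    if (firstStatus === null) { firstStatus = response.status; }
    finalStatus = response.status;
    // On a redirect, start tracking the target URL instead.
    if (response.redirectURL) { currentUrl = response.redirectURL; }
  });
  return { finalUrl: currentUrl, firstStatus: firstStatus, finalStatus: finalStatus };
}
```

Responses for unrelated resources, such as a 204 from a sockjs request, never match currentUrl and so cannot change the result.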
