
pushpin's Issues

Feature Request: Replace Path

I want to proxy routes from http://localhost:7999/something/* to http://backend:8080/something_else/*, replacing something with something_else.
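For concreteness, the requested behavior might be expressed with a hypothetical replace_path route property (not an existing Pushpin routes option as far as I know; the existing match properties can match a prefix but not substitute one):

```
*,path_beg=/something backend:8080,replace_path=/something_else
```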

websockets

(adapted from prior discussion)

Here's an idea for "Websocket GRIP":

  1. Websocket connections are forwarded to an origin server (or balanced to a set of origin servers). Messages would be forwarded in either direction as-is, with the exception of instructions as discussed below.

  2. Instead of supplying GRIP instructions in the body of an HTTP response, the origin server would supply them as the content of the first websocket message. Further messages would be relayed as-is back to the client. The origin could also choose to provide no instructions and let its first message be relayed back as normal. So there should be a way for the initial response message to be uniquely identified as instructions vs regular traffic. Example of instructions could be channel binding as per HTTP GRIP.

  3. The origin would be allowed to close its websocket connection with the proxy without the proxy closing its websocket connection to the client. This could be the default behavior if the connection is bound to one or more channels.

With this basic framework you can at least get something similar to HTTP streaming, where each client has a connection open with the server receiving one-way data (server->client). For example, say your website broadcasts news items to everyone visiting. On page load, a websocket connection could be made to the proxy, forwarded to origin, origin responds with binding to a shared channel (say "news"), and then origin closes the connection. Whenever the origin has data to broadcast to all listeners, it would publish on the "news" channel by making some API call to the origin (possibly via REST). The proxy would then multicast this message to all connected clients. The origin server would not need to maintain any open connections.
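The publish step in this example could be sketched as follows, assuming Pushpin's default HTTP publish endpoint (POST to /publish/ on port 5561; adjust for your setup). The build_publish_body and publish helper names are illustrative:

```python
import json
from urllib import request

def build_publish_body(channel, text):
    """Build a GRIP publish payload carrying a WebSocket message."""
    return json.dumps({
        "items": [{
            "channel": channel,
            "formats": {"ws-message": {"content": text}},
        }]
    }).encode("utf-8")

def publish(text, url="http://localhost:5561/publish/"):
    """POST a news item; Pushpin multicasts it to all 'news' subscribers."""
    req = request.Request(url, data=build_publish_body("news", text),
                          headers={"Content-Type": "application/json"})
    return request.urlopen(req)

# publish("breaking: ...")  # requires a running Pushpin instance
```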

There is interest in having two-way communication where the inbound messages are funneled through a limited set of connections between the proxy and the origin. To go about this, a sharing policy could be configured in advance. Only the first websocket connection request would truly be forwarded (i.e. headers and all). Future connections that match on the policy would not cause additional connections to be made between the proxy and the origin (or at least, not more than one per proxy node or such).

Additionally, the connection could be configured to use multiplexing wrapping. This could be selected on the fly in the instructions response from origin. Without wrapping, the origin would not be able to tell which clients sent which messages that it receives, and messages sent from the origin would be copied out to all of the shared clients by the proxy. With wrapping, some client identification would be tagged on every incoming message sent to the origin, and outbound messages sent from origin would also need to be tagged with destination client identification.

There is also interest in some kind of stickiness. If the goal is for every message from the same client to end up going down the same proxy->origin connection then I believe this is exactly what would happen in the above design. If a client was disconnected from the proxy and reconnected to a different proxy node though, then messages from that new client connection might go down a different proxy->origin connection.

Pushpin not able to start

When I try to start Pushpin, I get the following output:

starting...
starting mongrel2 (http:3000)
starting m2adapter
starting zurl
starting pushpin-proxy
starting pushpin-handler
started
Traceback (most recent call last):
  File "/usr/local/bin/pushpin", line 59, in <module>
    runner.run(exedir, config_file, verbose)
  File "/usr/local/lib/pushpin/runner/runner.py", line 111, in run
    p.wait()
  File "/usr/local/lib/pushpin/runner/processmanager.py", line 90, in wait
    p.check()
  File "/usr/local/lib/pushpin/runner/processmanager.py", line 41, in check
    raise RuntimeError("process exited unexpectedly: %s" % self.name)
RuntimeError: process exited unexpectedly: m2adapter

I checked mongrel2 and I am able to run it on its own without any issues. I also checked the m2adapter logs in the var/www/pushpin log directory, but there is nothing useful there:

[INFO] 2015-04-07 14:08:25.554 starting...
[INFO] 2015-04-07 14:09:19.096 starting...
[INFO] 2015-04-07 14:12:22.728 starting...
[INFO] 2015-04-07 15:06:56.012 starting...
[INFO] 2015-04-07 19:05:10.023 starting...
[INFO] 2015-04-07 19:13:03.515 starting...
[INFO] 2015-04-07 19:15:24.216 starting...
[INFO] 2015-04-07 19:15:57.014 starting...
[INFO] 2015-04-07 19:23:19.709 starting...
[INFO] 2015-04-07 19:28:20.920 starting...
[INFO] 2015-04-07 19:29:00.137 starting...

Is any additional configuration required for m2adapter?

Apache Setup

Hi,

I want to test your package, but I get an error when trying the URL with curl -i or in my browser.

The error says Error while proxying to origin.

Can you give an example of how to configure the server?

index path_beg

Matching path_beg in routes is currently a linear search. To better support configurations with many paths on the same domain, the lookup should be improved.

In practice this probably affects nobody other than some older features in the Fanout Cloud.

Examples of using stats via HTTP

Hi @jkarneges,

Having a lot of fun playing with pushpin! Love the approach, I never realised how much time I was spending solving "socket" problems vs application problems before, and it's so nice to have it handled so simply.

I saw somewhere that for scaling multiple instances one could use a stat service for querying subscriptions.

Is this an HTTP endpoint or ZMQ? If so, what are its specs? How can one get at this data?

Many Thanks!

Checking Pushpin with PVS-Studio static analyzer

Hello,
I have found some weaknesses and bugs using PVS-Studio tool. PVS-Studio is a static code analyzer for C, C++ and C#: https://www.viva64.com/en/pvs-studio/

Analyzer warnings:

In addition, I suggest having a look at the emails sent from @pvs-studio.com.

Best regards,
Phillip Khandeliants

debug mode

When there is a failure communicating with an origin server, Pushpin responds with uninformative errors. This is on purpose, to prevent leaking internal details to the outside world. However it makes diagnosing problems difficult. There should be a way to opt-in to a debug mode in which case Pushpin should respond with more verbose error responses indicating exactly what is wrong.

Ignore messages from self

Here's how it will likely work:

  1. The backend will assign a user ID to the connection via GRIP instructions.
  2. Published messages can contain a sending user ID as well as an indication that the message should not be delivered to any connections that match the sender.

Request body size bug

When Pushpin gets a request, it compares the entire request size (headers and body) against MAX_HEADERS_SIZE, so the total request size is limited by this value.

In src/handler/httpserver.cpp, in void processIn():

Line 219: QByteArray buf = sock->readAll(); // reads the entire request so far
Line 223: if(inBuf.size() + buf.size() > MAX_HEADERS_SIZE) // compares total request size against the headers-only limit

Crashes after backend was unavailable for some time

I set up pushpin to work like this:

Nginx:38080 -> Pushpin:7999 -> rails:3000 for websockets enabled paths. Other requests are not proxied through Pushpin.

Routes:

*,sockjs=/api/v1/sjs,sockjs_as_path=/api/v1/pp rails:3000,over_http

If I stop my rails service for some time while clients were connected and restart the service again, Pushpin will crash with the following stack trace:

pushpin_1  | [DEBUG] 2017-04-05 16:10:53.973 [m2a] m2: OUT [pushpin-m2-7999 4:X 21, 26:3:ctl,16:6:cancel,4:true!}]]
pushpin_1  | [DEBUG] 2017-04-05 16:10:53.974 [m2a] m2: IN pushpin-m2-7999 21 @* 17:{"METHOD":"JSON"},21:{"type":"disconnect"},
pushpin_1  | [DEBUG] 2017-04-05 16:10:53.974 [m2a] m2: pushpin-m2-7999 id=21 disconnected
pushpin_1  | [DEBUG] 2017-04-05 16:10:53.975 [m2a] m2: IN pushpin-m2-7999 21 @* 17:{"METHOD":"JSON"},21:{"type":"disconnect"},
pushpin_1  | [DEBUG] 2017-04-05 16:10:53.975 [m2a] m2: pushpin-m2-7999 id=21 disconnected
pushpin_1  | [DEBUG] 2017-04-05 16:10:53.975 [proxy] wsproxysession: 0x55ace41db350 connected
pushpin_1  | [DEBUG] 2017-04-05 16:10:53.975 [proxy] wsproxysession: 0x55ace41db350 grip enabled, message-prefix=[m:]
pushpin_1  | [DEBUG] 2017-04-05 16:10:53.976 [proxy] wscontrol: OUT { "items": [ { "ttl": 60, "type": "here", "cid": "d2377f6b-7b87-4317-aaa7-9f40186917af", "uri": "ws://pushpin:7999/api/v1/pp?t=1491408653941" } ] }
pushpin_1  | [INFO] 2017-04-05 16:10:53.976 [proxy] GET ws://pushpin:7999/api/v1/pp?t=1491408653941 -> rails:3000[http] ref=http://localhost:38080/app code=101 0
pushpin_1  | pushpin-proxy: zhttprequest.cpp:1186: virtual void ZhttpRequest::beginResponse(int, const QByteArray&, const HttpHeaders&): Assertion `d->state == Private::ServerReceiving || d->state == Private::ServerResponseWait' failed.
pushpin_1  | [ERR] 2017-04-05 16:10:53.979 proxy: Exited unexpectedly
pushpin_1  | [INFO] 2017-04-05 16:10:53.980 stopping m2 http:7999
pushpin_1  | [INFO] 2017-04-05 16:10:53.980 [zurl] stopping...
pushpin_1  | [INFO] 2017-04-05 16:10:53.980 [zurl] stopped
pushpin_1  | [DEBUG] 2017-04-05 16:10:53.980 [zurl] curl: Closing connection 1
pushpin_1  | [DEBUG] 2017-04-05 16:10:53.980 [zurl] socketFunction: CURL_POLL_REMOVE 26
pushpin_1  | [INFO] 2017-04-05 16:10:53.980 [handler] stopping...
pushpin_1  | [INFO] 2017-04-05 16:10:53.981 [m2a] stopping...
pushpin_1  | [INFO] 2017-04-05 16:10:53.981 [m2a] stopped
nginx_1    | 172.19.0.1 - - [05/Apr/2017:16:10:53 +0000] "GET /api/v1/sjs/305/db15fqwl/websocket HTTP/1.1" 101 612 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.98 Safari/537.36" "-"
nginx_1    | 2017/04/05 16:10:53 [error] 8#8: *137 upstream prematurely closed connection while reading response header from upstream, client: 172.19.0.1, server: domain2.com, request: "POST /api/v1/sjs/601/vig4ufk2/xhr?t=1491408653941 HTTP/1.1", upstream: "http://172.19.0.3:7999/api/v1/sjs/601/vig4ufk2/xhr?t=1491408653941", host: "localhost:38080", referrer: "http://localhost:38080/app"
nginx_1    | 172.19.0.1 - - [05/Apr/2017:16:10:53 +0000] "POST /api/v1/sjs/601/vig4ufk2/xhr?t=1491408653941 HTTP/1.1" 502 575 "http://localhost:38080/app" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.98 Safari/537.36" "-"
pushpin_1  | [INFO] 2017-04-05 16:10:53.985 [handler] stopped
pushpin_1  | [ERR] 2017-04-05 16:10:53.987 m2 http:7999: Exited uncleanly
pushpin_1  | [INFO] 2017-04-05 16:10:53.988 stopped

Properly publish response status

I am trying to pass status in my publish request. I tried something like:

    {
      "items" => [{
        "channel" => 'mychannel',
        "http-response" => {
          "headers" => {
            "Content-Type" => "application/json"
          },
          "body" => body.to_json,
          "status" => 204
        }
      }]
    }

I also tried passing this in the headers section, and as a code attribute. Can you tell me what I am doing wrong? Also, how can I pass a status code in a hold instruction response?
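One shape worth trying: per the GRIP http-response format, the status field is typically named code rather than status. A hedged sketch (verify against the GRIP spec for your Pushpin version):

```python
import json

# Publish item using the GRIP "http-response" format. The status code
# field is named "code"; 204 mirrors the question, though most clients
# ignore a body on a 204 response.
item = {
    "items": [{
        "channel": "mychannel",
        "http-response": {
            "code": 204,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"ok": True}),
        },
    }]
}
body = json.dumps(item)
```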

Using Python sortedcontainers vs blist

First, thanks for a great package. I'm exploring real-time Django solutions and am currently evaluating this.

I noticed that the project is using the Python blist module, in particular the sorteddict data type is used for expiring subs by time. As the author of the sortedcontainers module, I wondered why. SortedContainers is a pure-Python implementation of SortedList, SortedDict, and SortedSet data types. For your use pattern of getting, setting, deleting, and iterating sortedcontainers.SortedDict looks faster than blist.sorteddict. Take a look at the benchmarks at http://www.grantjenks.com/docs/sortedcontainers/performance.html#id2

But then I noticed that your typical install instructions are a list of Debian packages, and sortedcontainers doesn't have a Debian package (though someone recently requested one: grantjenks/python-sortedcontainers#29), so I wanted to ask:

  1. How did you choose blist?
  2. Is the lack of a Debian package for sortedcontainers a deal breaker?
  3. Would you accept a pull request that changed the dependency and made the code a bit more idiomatic? In particular, I'd be really curious to test any benchmark that might be affected by such a change.

Thanks for reading.
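For context, the access pattern under discussion (expiring subscriptions ordered by time) can be illustrated with a stdlib-only sketch; blist.sorteddict or sortedcontainers.SortedDict would provide the sorted ordering directly, and ExpiringSubs is an illustrative name:

```python
import bisect
import time

class ExpiringSubs:
    """Subscriptions ordered by expiry time: the get/set/delete/iterate
    pattern a sorted dict serves in Pushpin. Stdlib sketch only."""

    def __init__(self):
        self._by_time = []   # sorted list of (expires_at, channel)
        self._expires = {}   # channel -> expires_at

    def touch(self, channel, ttl, now=None):
        """Set or refresh a subscription's expiry."""
        now = time.time() if now is None else now
        old = self._expires.get(channel)
        if old is not None:
            self._by_time.remove((old, channel))
        expires_at = now + ttl
        self._expires[channel] = expires_at
        bisect.insort(self._by_time, (expires_at, channel))

    def expire(self, now=None):
        """Pop and return all channels whose expiry has passed."""
        now = time.time() if now is None else now
        expired = []
        while self._by_time and self._by_time[0][0] <= now:
            _, channel = self._by_time.pop(0)
            del self._expires[channel]
            expired.append(channel)
        return expired
```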

WebSocket reliable mode

It should be possible to enable a reliable mode for WebSocket connections, similar to what can be done with HTTP connections. The server should be able to supply a prev ID when subscriptions are assigned to connections, and Pushpin should be able to query for missed messages per-channel by last known ID. The feature should work for both WebSocket targets and HTTP targets (WebSocket-over-HTTP).

The big question is what protocol to use for the recovery requests. For WebSocket targets, it could be done using WebSocket messages, however there may be value in having it work over HTTP. For HTTP targets, it could be done using WebSocket messages encapsulated in WebSocket-over-HTTP protocol, but it would be nice if requests could use GET and reuse existing constructs like Grip-Link, Grip-Last, etc.

Note that if the protocol will use WebSocket messages, we'll likely need a way for Pushpin to send control messages to the backend which is not yet possible.

HTTP streaming reliability

Note: This approach falls within the scope of a patent application filed by Fanout. However, if implemented, the Pushpin license (AGPL) ensures that the necessary rights would be granted to users.

Publish-subscribe services are unreliable by design. Since Pushpin can be used to provide historical and streaming content over a single connection, there is an opportunity for the two data sources to be unified in a way that creates a reliable stream.

When creating a stream hold, any channel to be subscribed should include a prev-id value. A next link should also be provided:

HTTP/1.1 200 OK
Content-Type: text/plain
Grip-Hold: stream
Grip-Channel: fruit; prev-id=3
Grip-Link: </fruit/?after=3>; rel=next

{... initial response ...}

The prev-id here would have a similar effect as it does with response holds, in that if the ID doesn't match the last known ID published to the channel, then Pushpin will make the request to the origin again. The difference is that with a stream hold, the subscription will be activated immediately and the initial response data sent. Then the next link will be retrieved, and any content received will be dumped into the already-open response. If another next link is received then Pushpin will follow it, until the end is hit (a hold instruction, or no link/hold, which would cause the response to close).

When Pushpin retrieves a next link as a result of a publish conflict, it will also include a Grip-Last header indicating the last known ID received on a given channel. This should take precedence over whatever checkpoint information may have been encoded in the link.

GET /fruit/?after=3 HTTP/1.1
Grip-Last: fruit; last-id=7

For example, if the origin server received the above request, then the last-id of 7 would be used as the basis for determining the response content rather than the after query param.
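The precedence rule could be sketched as a small origin-side helper (resume_point is a hypothetical name; it parses only the single-channel header form shown above):

```python
def resume_point(headers, query_after=None):
    """Pick where response content should resume: a Grip-Last header,
    e.g. "fruit; last-id=7", takes precedence over an ?after= checkpoint
    encoded in the link. Returns (channel, last_id)."""
    grip_last = headers.get("Grip-Last")
    if grip_last and "last-id=" in grip_last:
        channel, _, params = grip_last.partition(";")
        last_id = params.split("last-id=", 1)[1].strip()
        return channel.strip(), last_id
    return None, query_after
```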

Note: this approach could be applied to WebSockets as well, to be explored in a separate issue.

runner unified log

By default, the runner program should unify the output of all subprocesses into a single log file rather than creating a bunch of log files.

It should also be possible to have the runner output the logs to stdout.

mongrel2 exits uncleanly

Sometimes Pushpin will log a message like this on exit:

[ERR] 2016-10-22 14:09:37.809 m2 http:7999: Exited uncleanly

This is a bug in Mongrel2 (a dependency of Pushpin), and not Pushpin itself, but since users may see this error I figured it would be good to have an issue filed until it is fixed. Note that the bug is harmless.

configure: --qtselect shouldn't be needed

If both Qt 5 and Qt 4 dev environments are installed, you need to explicitly specify --qtselect=5 when running configure. However, Pushpin requires Qt 5 so it should just do the right thing automatically. Fixing this will require fixing qconf.

SUB to stats socket not working

The demo Python program (pasted below) that connects to the stats socket and prints the decoded messages is stuck at the sock.recv() and does not print anything, although I can successfully have pushpin play along with clients and origin server. The permissions on the socket look good (srwxr-xr-x 1 pushpin zurl ). Am I missing anything? Thanks!

import sys
import tnetstring
import zmq

ctx = zmq.Context()
sock = ctx.socket(zmq.SUB)
sock.connect('ipc://var/run/pushpin/pushpin-stats')
sock.setsockopt(zmq.SUBSCRIBE, '')

while True:
    m_raw = sock.recv()
    mtype, mdata = m_raw.split(' ', 1)
    if mdata[0] != 'T':
        print 'unsupported format'
        continue
    m = tnetstring.loads(mdata[1:])
    print '%s %s' % (mtype, m)

Pushpin retries the requests indefinitely if channel name changes between requests

We have a unique use case in which each client gets a unique channel that is used to publish a message only once. This worked perfectly in 1.14, but in 1.15 the first request on each new channel is always retried. So if the channel changes between requests, Pushpin retries the requests indefinitely.

Sample server code (Ruby) to reproduce the problem:

require 'sinatra'
require 'json'
require 'pp'
require 'securerandom'
get '/hold' do
  #pp request.env
  headers \
    'Grip-Hold' => 'response',
    'Grip-Channel' => SecureRandom.uuid,
    'Content-Type' => 'application/json'
  '{"resp": "timeout"}'
end

curl http://localhost:7999/hold

Release notifications

I maintain Arch Linux's AUR packages for pushpin and zurl. It would be great if there was a way to know when new versions were released (a mailing list or some other similar method) so I could update said packages in a timely fashion.

optimize disconnect handling rate

Currently, m2adapter processes disconnect events at a rate of 100/sec. Ideally we'd be able to handle a larger rate. Things to consider:

  • If there are too many pending disconnects, we should remove connection structures but send nothing to other components, or even ignore disconnect events completely. Connection structures will eventually be cleaned up when sessions time out. This is pretty much how things work with TCP, when the OS can't keep up with close packets.
  • Simply removing connection structures in m2adapter can cause internal overload. This is because m2adapter responds to batched keep-alives with individual cancels, so when pushpin-handler pings a thousand connections it thinks might exist, m2adapter will slap pushpin-handler around. m2adapter should respond with batched cancels, or it should ignore keep-alives for sessions it doesn't recognize.
  • Removing session structures in pushpin-handler is somewhat expensive due to having to muck with a bunch of tables and send stats. If there are a lot of disconnects occurring at once, we should consider a cheaper processing route and not bother sending stats. Downstream stats listeners would have to time out connections on their own in that case.

Setting hold instruction timeout

I set a response via a GRIP hold instruction, but the timeout is much bigger than my server's timeout. Is there a way to set a custom timeout (e.g. 5 seconds)?

more practical logging

Default log level should produce data interesting to users. It shouldn't contain debug-ish output.

m2adapter: request attempt and initial response info
proxy: withhold log line until after proxying, then output a single line
handler: per-connection subscription changes. global subscription here/gone. incoming publish. publish being relayed.

automatic keep-alive with websockets

Pushpin's HTTP streaming mode enables specifying keep alive instructions, e.g.:

Grip-Hold: stream
Grip-Keep-Alive: still here\n; format=cstring; timeout=30

It would be good to have something similar for WebSockets.

Proposal: a new control message type keep-alive:

c{"type": "keep-alive", "content": "still here", "timeout": 30}

If set more than once, the previous setting would be overwritten. Omit content to disable keep alives.
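A sketch of building the proposed control message (keep_alive_message is an illustrative helper; the message type itself is only a proposal here, following the c-prefixed form shown above):

```python
import json

def keep_alive_message(content="still here", timeout=30):
    """Build the proposed keep-alive control message. Passing
    content=None omits the field, which per the proposal would
    disable keep-alives."""
    msg = {"type": "keep-alive", "timeout": timeout}
    if content is not None:
        msg["content"] = content
    return "c" + json.dumps(msg)
```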

route review

routes:

*,path_beg=/test localhost:8080,path_beg=/test
*,ssl=no,path_beg=/v2 acceptor:8080,path_beg=/v2,path_rem=3

In my scenario, I would like traffic to localhost:7999/test to go to localhost:8080/test (where I am not hosting anything, so I expect an error), and I get the expected result until I hit a route that matches the second rule. Then the second route responds to requests meant for the first route.

In the second route, I would like localhost:7999/v2/abc to go to acceptor:8080/abc, but it is proxying the whole path, to acceptor:8080/v2/abc. Also, when I make changes to the routes, they are not always picked up unless I restart Pushpin -- I don't think that's expected, right?

ImportError: No module named runner

I'm trying Pushpin, following the documentation.
When I start Pushpin with the example config, I get the following error:

$ git clone https://github.com/fanout/pushpin
$ cd pushpin/examples/config/
$ pushpin --verbose
Traceback (most recent call last):
  File "/usr/local/Cellar/pushpin/1.5.0/libexec/bin/pushpin", line 57, in <module>
    from runner import runner
ImportError: No module named runner

I installed via Homebrew and the Pushpin version is 1.5.0.

$ brew install pushpin
$ pushpin --version
pushpin 1.5.0

Is any kind of additional configuration required?

Connection reset when filesize is big

I am using hapi as the API server, and testing via the request module.

test_request.js

fs.createReadStream('./workbooks/tmp/somefile.csv')
  .pipe(
    request({
      method: 'POST',
      url: 'https://some.domain.com/some_route',
      qs: {
        params: {
          param_a: 'some_param'
        },
        email: '[email protected]',
        prj: 'someprj'
      },
      agentOptions: {
        ca: fs.readFileSync('./ssl/cert.pem')
      }      
    },function(err,response,body){
      if (err) console.log(err);
      console.log(body);
    })
  )
  .on('error', function(err){
    console.log(err.stack);
  });

hapi_route.js

module.exports = {
  method  : "POST",
  path    : "/some_route",
  config: { 
    auth: false,
    payload: {
      maxBytes: 100*1024*1024,
      output: 'stream',
      parse: false
    }
  },
  handler : handleRoute
};

I get this error

Bad Request
Error: write ECONNRESET
    at errnoException (net.js:905:11)
    at Object.afterWrite (net.js:721:19)

if I just send the content as a chunk

test_request.js

request({
      method: 'POST',
      url: 'https://some.domain.com/some_route',
      qs: {
        params: {
          param_a: 'some_param'
        },
        email: '[email protected]',
        prj: 'someprj'
      },
      agentOptions: {
        ca: fs.readFileSync('./ssl/cert.pem')
      } ,
     body: fs.readFileSync('somefile.csv')
    },function(err,response,body){
      if (err) console.log(err);
      console.log(body);
    })

hapi_route.js

module.exports = {
  method  : "POST",
  path    : "/some_route",
  config: { 
    auth: false,
    payload: {
      maxBytes: 100*1024*1024,
      parse: false
    }
  },
  handler : handleRoute
};

I get this error

{ [Error: socket hang up] code: 'ECONNRESET' }

however, if I try with a smaller file, everything works fine. I can't seem to find any configuration to control the max payload size in Pushpin. Is there some equivalent of nginx's client_max_body_size?

ability to disable message reordering

Normally, if id and prev-id fields are used on published items, Pushpin will reorder messages before delivery. To do this, Pushpin will hold on to a message before delivering it, if it is received out of order. This might not be desired if you are already ensuring in-order publishing some other way.

For example, if you have exactly one worker that only publishes one message at a time, then all messages will be received by Pushpin in order. If this worker then fails to publish for some reason, causing a gap in the message sequence, the next message received will be held on to in expectation of preceding messages that will never come. So if you already know you are publishing in order, Pushpin's own reordering could actually cause unnecessary problems.

This could be implemented as an app-level configuration or a message-level flag.
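For reference, published items carrying the sequencing fields look like this; the no-reorder flag shown in a comment is a hypothetical name for the proposed message-level flag, not an existing option:

```python
import json

# Two in-order items. Pushpin uses id/prev-id to detect gaps and, today,
# buffers out-of-order arrivals until the gap is filled.
items = [
    {"channel": "news", "id": "2", "prev-id": "1",
     # "no-reorder": True,  # hypothetical flag to skip reorder buffering
     "formats": {"http-stream": {"content": "first\n"}}},
    {"channel": "news", "id": "3", "prev-id": "2",
     "formats": {"http-stream": {"content": "second\n"}}},
]
body = json.dumps({"items": items})
```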

I'm not 100% sure if this is something we should implement but I wanted to file an issue so it's not forgotten.

Django WebSocket Browser: connection refused

I have a Django endpoint that works well with an HTTP curl from the CLI, but the connection is refused when attempting to create a WebSocket object in the browser.

JS Console in Chrome:
var ws = new WebSocket('ws://localhost:7999/dashboard/stream/');

WebSocket connection to 'ws://localhost:7999/dashboard/stream/' failed: Error in connection establishment: net::ERR_CONNECTION_REFUSED

I enabled over_http with no luck.

The Django server registers no incoming requests when attempting the WebSocket connection from the browser.

WebSocket-over-HTTP retry requests

Currently, if Pushpin makes a request to the origin server and it fails, then the associated WebSocket client connection will be closed. Instead, Pushpin should retry the request a few times (with backoff delay) before failing the connection. This way, WS connections always survive origin server restarts, even if they happen to send messages while the origin is restarting.

Probably we should only retry requests if we were unable to submit the request data. If the request fails after Pushpin has submitted the POST request and application/websocket-events body, then it should not be retried, else the origin might receive duplicate events.

publish to close websockets

Pushpin's HTTP streaming mode enables the publisher to close connections. We should support something similar with WebSockets, e.g.:

"ws-message": {
  "action": "close"
}
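A full publish body wrapping the proposed format might look like this (the action field is part of the proposal here, not an existing ws-message option):

```python
import json

# Proposed publish item that would close all WebSocket subscribers of a
# channel, analogous to the close action in HTTP streaming mode.
item = {
    "items": [{
        "channel": "room1",
        "formats": {"ws-message": {"action": "close"}},
    }]
}
body = json.dumps(item)
```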

command to trigger stream recovery

The HTTP streaming reliability feature makes a request to the origin server if received messages are incorrectly sequenced or after a period of inactivity. For quiet channels, this may mean that it takes a while to recover. If the user is aware of publishing issues (e.g. the publisher crashed), it would be useful if the user could explicitly invoke the recovery request rather than having to wait for new messages to be published or for the timeout to expire.

Consider exposing a new command (via zmq command interface as well as HTTP) called recover, with potential parameters to fine tune:

  • channel: recover only subscribers of a certain channel.
  • channel-prefix: recover any subscribers that match.
  • prev-id: recover only subscribers who previously received an item with this ID.

Default would be to recover all subscribers.

Docker-compose with Pushpin

I was trying out the Docker image from johnjelinek, but I'm having trouble getting it to work with Docker Compose due to mongrel2 needing a tty.

I got the following error:

(errno: Inappropriate ioctl for device) getlogin failed and no LOGNAME env variable, how'd you do that?

I am not too familiar with mongrel2; any ideas what I need to tweak so that an automated script can start Pushpin without a tty?

Configure failure with Qt >= 5.10

The failure is caused by https://github.com/fanout/pushpin/blob/2a386e5ec9219003edc8185e0b91fc595dc1f6a2/configure#L216

?.?.? fails to match a two-digit minor version. So any version > 5.9 will fail.

The error is

==> ./configure --prefix=/usr/local/Cellar/pushpin/1.17.0 --configdir=/usr/local/etc --rundir=/usr/local/var/run --logdir=/usr/local/var/log --extraconf=QMAKE_MACOSX_DEPLOYMENT_TARGET=10.11
Configuring Pushpin ...
Verifying Qt build environment ... fail

Reason: Unable to find the 'qmake' tool for Qt 4 or 5.

Be sure you have a proper Qt 4.0+ build environment set up.  This means not
just Qt, but also a C++ compiler, a make tool, and any other packages
necessary for compiling C++ programs.

If you are certain everything is installed, then it could be that Qt is not
being recognized or that a different version of Qt is being detected by
mistake (for example, this could happen if $QTDIR is pointing to a Qt 3
installation).  At least one of the following conditions must be satisfied:

 1) --qtdir is set to the location of Qt
 2) $QTDIR is set to the location of Qt
 3) QtCore is in the pkg-config database
 4) qmake is in the $PATH

This script will use the first one it finds to be true, checked in the above
order.  #3 and #4 are the recommended options.  #1 and #2 are mainly for
overriding the system configuration.

I can work around it by string replacing "?.?.?" with "?.??.?"
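The glob behavior can be demonstrated with Python's fnmatch, which applies the same one-character-per-? rule as the shell pattern in configure:

```python
from fnmatch import fnmatch

# Each "?" matches exactly one character, so a two-digit minor
# version like 5.10.1 cannot match the "?.?.?" pattern.
assert fnmatch("5.9.1", "?.?.?")
assert not fnmatch("5.10.1", "?.?.?")
assert fnmatch("5.10.1", "?.??.?")  # the workaround pattern
```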

ability to provide initial streaming content via a series of requests

If the origin needs to return a lot of initial content for an HTTP streaming request, then it may be desirable to have the origin only return a portion of the content and wait for the proxy to fetch the next part.

Some reasons this would be useful:

  1. Origin can return large initial content to the client without having to buffer the entire content at once.
  2. For the origin to properly generate GRIP instructions it may need to know the last part of the initial content before doing so. However, since instructions are provided in the headers, this might require the origin to buffer the whole response in order to know what headers to set before beginning to send the response body. Breaking up the content into a series of requests would let the origin wait until the last request before generating GRIP instructions. Trailing HTTP headers could theoretically be used to supply instructions after sending a single streamed response, but this only works with chunked encoding and is too obscure of an HTTP feature to rely on.
  3. Paging may be useful as part of a reliability mechanism (to be discussed separately).

We could introduce a new response header telling Pushpin how to do the paging:

HTTP/1.1 200 OK
Content-Type: text/plain
Grip-Link: </stream/?after=nextID>; rel=next

{... first part of content ...}

We use a Link-style header here in case we need to support additional link types in the future. We call it Grip-Link rather than reusing Link so that Pushpin can remove the header before relaying to the client (as Pushpin does with all such Grip-* headers).

Pushpin would then page through all of the response data before holding (subscribing) the request to channels. The expected usage is that the origin server would omit the Grip-Hold header from all pages until the last one. If a response contains neither Grip-Hold nor a Grip-Link of type next, then Pushpin would close the response to the client after all data has been sent. If a response contains both headers, the hold would win.
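The origin side of this flow could be sketched as follows. This is a minimal illustration, not part of any real API: `build_page_response` and the in-memory `pages` mapping are hypothetical, standing in for however the origin stores its initial content.

```python
# Sketch of an origin serving initial content in pages, assuming a
# hypothetical ordered mapping of page ID -> body text.

def build_page_response(pages, after=None):
    """Return (headers, body) for one page of initial content.

    Intermediate pages carry a Grip-Link header pointing at the next
    page; the final page carries the hold instructions instead.
    """
    ids = list(pages)
    # Start from the beginning, or from just after the requested ID.
    start = 0 if after is None else ids.index(after) + 1
    page_id = ids[start]
    body = pages[page_id]

    headers = {"Content-Type": "text/plain"}
    if start + 1 < len(ids):
        # More pages remain: tell Pushpin where to fetch the next one.
        headers["Grip-Link"] = "</stream/?after=%s>; rel=next" % page_id
    else:
        # Last page: attach the hold so Pushpin subscribes the request.
        headers["Grip-Hold"] = "stream"
        headers["Grip-Channel"] = "mychannel"
    return headers, body
```

Because only the final page carries Grip-Hold, the origin never has to know its instructions before the last request arrives, which is the point of the proposal.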

wildcard subscriptions

It would be useful to subscribe to channels by wildcard. For example, a publisher could publish a message to channel A.B.C, and a receiver with subscription to A.* could receive the message.

Wildcard subscriptions are not uncommon in publish-subscribe systems, however they tend to work differently in each system. For example, ZeroMQ supports filtering by prefix. Bayeux supports segmented channels and trailing wildcards of either single segment or multi-segment (e.g. /a/* matches /a/b, and /a/** matches /a/b/c/d). At least in these two cases, wildcarding is effectively suffix only, which is pretty conservative and seems like something we could reasonably do in Pushpin.

Proposal:

  • If a subscribed channel ends with a single asterisk, then it should count as a wildcard. For example: Grip-Channel: foo.*.
  • Pushpin should subscribe to the channel without the asterisk on its ZeroMQ SUB socket. A subscription to foo.* would use foo. as the SUB socket filter.

Note that ZeroMQ being limited to filtering by prefix doesn't necessarily mean that we can't support more complex wildcarding. For example, a mid-string wildcard like foo*bar could be supported by setting the SUB socket to use foo as the filter, but then only deliver to clients if the full expression matches. However we should only implement something like this if there is a real need. For most applications needing wildcards, suffix-only is usually enough.
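The mapping described above could be sketched like this. The helper names are illustrative, and the `matches` check is only needed for the hypothetical mid-string case; for the proposed trailing-asterisk form, the SUB prefix filter alone is sufficient.

```python
# Sketch of the proposed wildcard handling: a trailing "*" maps to a
# ZeroMQ SUB prefix filter, with an optional full-pattern check for
# hypothetical mid-string wildcards like "foo*bar".
import fnmatch

def sub_filter(channel):
    """Return the prefix to use as the ZeroMQ SUB socket filter."""
    star = channel.find("*")
    if star == -1:
        return channel          # exact subscription, no wildcard
    return channel[:star]       # everything before the first wildcard

def matches(channel, pattern):
    """Full match, needed only when the pattern isn't suffix-only."""
    return fnmatch.fnmatchcase(channel, pattern)
```

So a subscription to foo.* would set a SUB filter of foo., and ZeroMQ's native prefix matching handles delivery; foo*bar would set a filter of foo and then require the `matches` check before delivering to clients.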

Handle large initial content with streaming

Pushpin limits all GRIP responses to 100,000 bytes (as defined by MAX_ACCEPT_RESPONSE_BODY). However, it would be ideal if the server could send an initial response body with no size limit.

At first glance it might seem like we should just not have a limit on the response size. However this is easier said than done as Pushpin currently buffers the entire GRIP response before processing it. The code could be changed to stream out the response to the client as it is being received from the origin server, but doing that has a couple of problems:

  1. Streaming out an unbuffered initial response would not work with JSON-style instructions. It might be achievable with a streaming parse on the JSON input but this seems impractical.
  2. For the origin to properly generate GRIP instructions, it may need to know the last part of the initial content first. However, since instructions are provided in the headers, this might require the origin to buffer the whole response in order to know what headers to set before beginning to send the response body. This means we'd still have the same buffer-size problem, just in the origin server rather than in Pushpin. Trailing HTTP headers could theoretically be used to supply instructions after sending the initial response, but they only work with chunked encoding and are probably too obscure an HTTP feature to rely on.

Instead of allowing the origin to provide a huge response, the proposed solution is to have Pushpin "page" through a series of responses. Hold instructions would be included on the last page. This way the instructions are generated at the end, and would work fine with either headers-style or JSON-style instructions.

We could introduce a new response header telling Pushpin how to do the paging:

HTTP/1.1 200 OK
Content-Type: text/plain
Grip-Link: </stream/?after=nextID>; rel=next

{... first part of content ...}

We use a Link-style header here in case we need to support additional link types in the future. We call it Grip-Link rather than reusing Link so that Pushpin can remove the header before relaying to the client (as Pushpin does with all such Grip-* headers).

The server would still need to keep responses within the Pushpin limit. We'll add a request header to help with that:

GET /stream/?after=firstID HTTP/1.1
Grip-Response-Max-Size: 100000

Pushpin would then page through all of the response data before holding (subscribing) the request to channels. The expected usage is that the origin server would omit the Grip-Hold header from all pages until the last one. If a response contains neither Grip-Hold nor a Grip-Link of type next, then Pushpin would close the response to the client after all data has been sent. If a response contains both headers, the hold would win. The next link may still be useful as part of a reliability mechanism (to be discussed separately).
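An origin honoring the proposed Grip-Response-Max-Size request header would just need to split its content so that no page exceeds the advertised limit. A minimal sketch (the function name is illustrative):

```python
# Sketch of splitting initial content into pages that each fit within
# the limit advertised by the proposed Grip-Response-Max-Size header.

def paginate(content, max_size):
    """Split `content` (bytes) into chunks no larger than max_size."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    return [content[i:i + max_size]
            for i in range(0, len(content), max_size)]
```

Each chunk would then be served as one page of the Grip-Link sequence described above.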

UPDATE: With the new Grip-Status feature there should no longer be much need for JSON-style instructions. We could consider streaming initial responses of any length without needing paging, for origin servers that are able to do so (and that understand the implications of having to provide hold headers in advance). This is a more practical default approach than requiring servers to limit response sizes.

Create /var/run/pushpin directory

I installed pushpin using apt-get, and got this when I tried to run it:

Traceback (most recent call last):
  File "/usr/bin/pushpin", line 56, in <module>
    runner.run(exedir, config_file, verbose)
  File "/usr/lib/pushpin/runner/runner.py", line 58, in run
    m2sqlpath = services.write_mongrel2_config(configdir, os.path.join(configdir, "mongrel2.conf.template"), rundir, logdir, http_port, https_ports, m2sh_bin)
  File "/usr/lib/pushpin/runner/services.py", line 34, in write_mongrel2_config
    compile_template(configpath, genconfigpath, vars)
  File "/usr/lib/pushpin/runner/services.py", line 11, in compile_template
    f = open(outfilename, "w")
IOError: [Errno 2] No such file or directory: '/var/run/pushpin/mongrel2.conf'

Workaround: Create /var/run/pushpin manually.

client side possibility?

hello

is there a library or a way to use a JavaScript WebSocket to listen on a channel?
I am getting a net::ERR_EMPTY_RESPONSE.

thanks

SockJS support

In a project I need to build a chat server, and I wanted to use Pushpin as a proxy between a Laravel (PHP) app and SockJS on the client.
At the moment I can connect the client successfully and receive messages broadcast on a channel. But when I try to send a message, it does not seem to call the backend (the backend only gets called on the initial connection).

I am using the "fanout/laravel-grip" package and the following code (for now it should be an echo server; if that works I can start building the actual functionality)...

        \Log::info('(' . time() . ') chat');

        \LaravelGrip\verify_is_websocket();

        /** @var WebSocketContext $context */
        $context = \LaravelGrip\get_wscontext();

        if ($context->is_opening()) {
            $context->accept();
            $context->subscribe('test');
        }

        while($context->can_recv()) {
            $message = $context->recv();
            \Log::info('Received message', ['message' => $message]);

            if (is_null($message)) {
                $context->close();
                break;
            }

            $context->send($message);

            \LaravelGrip\publish('test', new WebSocketMessageFormat($message));
        }

        return null;

If I look into the Laravel logfile, I can see the log message only on the initial client connection, not when the client sends a message.

The route is

*,debug,sockjs=/sockjs,sockjs_as_path=/chat localhost:80,over_http

When connecting a regular WebSocket to /chat it works without a problem. Does SockJS require any special handling compared to regular WebSockets? I assumed that SockJS-emulated WebSockets behave like regular ones (i.e. when a message is received, a POST request is sent to the backend, after which the backend handles the message and sends a response/publishes a message on a channel/etc.)...
