lua-io-nginx-module's Introduction

Name

lua-io-nginx-module - Nginx C module to take over the Lua file operations. It's based on Nginx's thread pool.

Status

This Nginx module is currently considered experimental.

Synopsis

# Configure a thread pool with 16 threads and a task queue that holds up to 64k (65536) tasks.
thread_pool default threads=16 max_queue=65536;

http {
  
  ...
    
  server {
      listen *:8080;
      lua_io_thread_pool default;
      location /read_by_line {
          lua_io_read_buffer_size 8192;
          content_by_lua_block {
              local ngx_io = require "ngx.io"
              local filename = "/tmp/foo.txt"
              local file, err = ngx_io.open(filename, "r")
              assert(file and not err)

              -- Iterate over the file, reading one line at a time.
              for line in file:lines() do
                  ngx.say(line)
              end

              local ok, err = file:close()
              assert(ok and not err)
          }
      }

      location /read_by_bytes {
          content_by_lua_block {
              local ngx_io = require "ngx.io"
              local filename = "/tmp/foo.txt"

              local file, err = ngx_io.open(filename, "r")
              assert(file and not err)

              while true do
                  -- Read the file 512 bytes at a time.
                  local data, err = file:read(512)
                  if err ~= nil then
                      ngx.log(ngx.ERR, "file:read() error: ", err)
                      break
                  end

                  if data == nil then
                      break
                  end

                  ngx.print(data)
              end

              local ok, err = file:close()
              assert(ok and not err)
          }
      }

      location /write {
          lua_io_write_buffer_size 4k;
          content_by_lua_block {
              local ngx_io = require "ngx.io"

              local length = tonumber(ngx.var.http_content_length)
              if not length then
                  return ngx.exit(200)
              end

              local sock, err = ngx.req.socket()
              if not sock then
                  ngx.log(ngx.ERR, "ngx.req.socket() failed: ", err)
                  return ngx.exit(500)
              end

              local file, err = ngx_io.open("/tmp/foo.txt", "w")
              assert(file and not err)

              repeat
                  local size = length > 4096 and 4096 or length
                  length = length - size
                  local data, err = sock:receive(size)
                  if err then
                      ngx.log(ngx.ERR, "sock:receive() failed: ", err)
                      return
                  end

                  local bytes, err = file:write(data)
                  assert(bytes == size)
                  assert(not err)
              until length == 0

              local ok, err = file:close()
              assert(ok and not err)

              return ngx.exit(200)
          }
      }
  }
}

Description

This Nginx C module provides basic file operation APIs that never block Nginx's event loop. For now, it leverages Nginx's thread pool: an I/O operation may be offloaded to one of the free threads, and the current Lua coroutine (light thread) yields until the operation is done. In the meantime, Nginx goes on processing other events.

Note that the time taken by a single I/O operation is not reduced; the work is merely moved from the main thread (the one that runs the event loop) to a separate thread. The overhead may even be slightly higher because of the extra task handoff, lock waiting, Lua coroutine resumption (which can only happen in the next event loop iteration), and so forth. Nevertheless, once the operation is offloaded, the main thread no longer blocks on I/O, and this is the fundamental advantage over the native Lua I/O library.

The APIs are similar to those of the Lua I/O library, but the internal implementation is completely different: it does not use the stream file facilities in libc (although it tries to stay consistent with them), and the buffer is maintained inside this module, following Cosocket's internals.

If you want to learn more about Nginx's thread pool, just try this article.

Prerequisites

This Nginx C module relies on the lua-nginx-module and the thread pool option, so configure your Nginx build as follows:

./auto/configure --with-threads --add-module=/path/to/lua-nginx-module/ --add-module=/path/to/lua-io-nginx-module/

Due to some existing limitations in ngx_lua, you must place the --add-module=/path/to/lua-nginx-module/ before --add-module=/path/to/lua-io-nginx-module/. These limitations might be eliminated in the future if ngx_lua exposes more C functions and data structures.

Directives

lua_io_thread_pool

Syntax: lua_io_thread_pool thread-pool-name;
Default: lua_io_thread_pool default;
Context: http, server, location, if in location

Specifies which thread pool should be used. Note that the thread pool itself must be configured with the thread_pool directive.

lua_io_log_errors

Syntax: lua_io_log_errors on | off;
Default: lua_io_log_errors off;
Context: http, server, location, if in location

Specifies whether error messages are logged when failures occur. If you are already doing proper error handling and logging in your Lua code, then it is recommended to turn this directive off to prevent data flushing in your nginx error log files (which is usually rather expensive).

lua_io_read_buffer_size

Syntax: lua_io_read_buffer_size size;
Default: lua_io_read_buffer_size 4k/8k;
Context: http, server, location, if in location

Specifies the buffer size used by the reading operations.

lua_io_write_buffer_size

Syntax: lua_io_write_buffer_size size;
Default: lua_io_write_buffer_size 4k/8k;
Context: http, server, location, if in location

Specifies the buffer size used by the writing operations.

Data is cached in this buffer until the buffer overflows or you explicitly call one of the "flushable" APIs (such as file:flush).

You can set this value to zero to always "write through the cache", so that no data is held back in the buffer.

APIs

To use these APIs, just import this module by:

local ngx_io = require "ngx.io"

ngx_io.open

Syntax: local file, err = ngx_io.open(filename [, mode])
Context: rewrite_by_lua*, access_by_lua*, content_by_lua*, ngx.timer.*, ssl_certificate_by_lua*, ssl_session_fetch_by_lua*

Opens a file and returns the corresponding file object. In case of failure, nil and a Lua string will be given, which describes the error reason.

The first parameter is the target file name that would be opened. When filename is a relative path, the nginx prefix will be placed in front of filename, for instance, if the filename is "foo.txt", and you start your Nginx by nginx -p /tmp, then file /tmp/foo.txt will be opened.

The second, optional parameter specifies the open mode and can be any of the following:

  • "r": read mode (the default);
  • "w": write mode;
  • "a": append mode;
  • "r+": update mode, all previous data is preserved;
  • "w+": update mode, all previous data is erased (file will be truncated);
  • "a+": append update mode, previous data is preserved, writing is only allowed at the end of file.
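
The open modes and the nil-plus-error convention can be sketched as follows. This is a minimal sketch, not code from the module: the helper name append_line is hypothetical, and the module table is passed in as io_mod only so the snippet stays self-contained (in OpenResty you would pass the result of require "ngx.io").

```lua
-- Hypothetical helper (not part of ngx.io): appends one line to a file.
-- io_mod is expected to behave like the table returned by require "ngx.io".
local function append_line(io_mod, path, line)
    -- "a" is append mode: previous data is preserved.
    local file, err = io_mod.open(path, "a")
    if not file then
        return nil, "open failed: " .. tostring(err)
    end

    local n, werr = file:write(line .. "\n")

    -- Always close the file, even when the write reported an error.
    local ok, cerr = file:close()

    if werr then
        return nil, "write failed: " .. werr
    end
    if not ok then
        return nil, "close failed: " .. tostring(cerr)
    end

    return n
end
```

Checking the first return value of open before using the file object avoids the assert-style failure handling used in the synopsis, which aborts the whole handler on error.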

file:read

Syntax: local data, err = file:read([format])
Context: rewrite_by_lua*, access_by_lua*, content_by_lua*, ngx.timer.*, ssl_certificate_by_lua*, ssl_session_fetch_by_lua*

Reads some data from the file, according to the given formats, which specify what to read.

The available formats are:

  • "*a": reads the whole file, starting at the current position. On end of file, it returns nil.
  • "*l": reads the next line (skipping the end of line), returning nil on end of file. This is the default format.
  • number: reads a string with up to this number of characters, returning nil on end of file. If number is zero, it reads nothing and returns an empty string, or nil on end of file.

On success, the requested data is returned as a Lua string. In case of failure, nil and an error message will be given.

This method is a synchronous operation and is 100% nonblocking.
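
As an illustration of the formats above, the following sketch reads the first line with "*l" and the remainder with "*a". The helper name read_header_and_body and the "empty file" message are hypothetical; the only assumption about file is that it follows the file:read behaviour documented here (nil on end of file, nil plus an error message on failure).

```lua
-- Hypothetical helper (not part of ngx.io): returns the first line and the
-- rest of the file as two strings, or nil, nil, err on failure.
local function read_header_and_body(file)
    -- "*l" reads one line and drops the end-of-line character.
    local header, err = file:read("*l")
    if err then
        return nil, nil, err
    end
    if header == nil then
        return nil, nil, "empty file"
    end

    -- "*a" reads everything from the current position; nil means the file
    -- ended right after the first line.
    local body, err2 = file:read("*a")
    if err2 then
        return nil, nil, err2
    end

    return header, body or ""
end
```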

file:write

Syntax: local n, err = file:write(data)
Context: rewrite_by_lua*, access_by_lua*, content_by_lua*, ngx.timer.*, ssl_certificate_by_lua*, ssl_session_fetch_by_lua*

Writes data to the file. Note data might be cached in the write buffer if suitable.

The number of bytes written will be returned. In case of failure, 0 and an error message will be given.

This method is a synchronous operation and is 100% nonblocking.

CAUTION: If you opened the file with the append mode, then writing is only allowed at the end of file. The adjustment of the file offset and the write operation are performed as an atomic step, which is guaranteed by the write and writev system calls.

file:seek

Syntax: local offset, err = file:seek([whence] [, offset])
Context: rewrite_by_lua*, access_by_lua*, content_by_lua*, ngx.timer.*, ssl_certificate_by_lua*, ssl_session_fetch_by_lua*

Sets and gets the file position, measured from the beginning of the file, to the position given by offset plus a base specified by the string whence, as follows:

  • "set": base is position 0 (beginning of the file);
  • "cur": base is current position;
  • "end": base is end of file;

In case of success, function seek returns the final file position, measured in bytes from the beginning of the file. If this method fails, it returns nil, plus a string describing the error.

The default value for whence is "cur", and for offset is 0. Therefore, the call file:seek() returns the current file position, without changing it; the call file:seek("set") sets the position to the beginning of the file (and returns 0); and the call file:seek("end") sets the position to the end of the file, and returns its size.

Cached write buffer data will be flushed to the file and cached read buffer data will be dropped. This method is a synchronous operation and is 100% nonblocking.

CAVEAT: You should always call this method before you switch the I/O operations from read to write and vice versa.
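
The whence and offset defaults can be combined to query a file's size without losing the current position. The sketch below is only an illustration: file_size is a hypothetical helper name, and file is assumed to follow the file:seek semantics described above. Keep in mind that, as documented, every seek flushes cached write data and drops cached read data.

```lua
-- Hypothetical helper (not part of ngx.io): returns the file size in bytes
-- and restores the position the file had before the call.
local function file_size(file)
    -- file:seek() is file:seek("cur", 0): report the position, do not move.
    local cur, err = file:seek()
    if not cur then
        return nil, err
    end

    -- Seeking to the end returns the offset of the end, i.e. the size.
    local size, err2 = file:seek("end")
    if not size then
        return nil, err2
    end

    -- Go back to where we started.
    local ok, err3 = file:seek("set", cur)
    if not ok then
        return nil, err3
    end

    return size
end
```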

file:flush

Syntax: local ok, err = file:flush([sync])
Context: rewrite_by_lua*, access_by_lua*, content_by_lua*, ngx.timer.*, ssl_certificate_by_lua*, ssl_session_fetch_by_lua*

Saves any written data to file. In case of success, it returns 1 and if this method fails, nil and a Lua string will be given (as the error message).

The sole, optional parameter sync specifies whether this method should also call fsync and wait until the data has been saved to storage. The default is false.

This method is a synchronous operation and is 100% nonblocking.
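
A minimal sketch of how buffered writes and file:flush fit together, assuming file is an object opened in a write mode through ngx_io.open. The helper name write_and_sync is hypothetical and not part of this module.

```lua
-- Hypothetical helper (not part of ngx.io): writes every chunk, then forces
-- the data out of the write buffer and onto storage.
local function write_and_sync(file, chunks)
    local total = 0

    for _, chunk in ipairs(chunks) do
        -- file:write may only copy the data into the write buffer.
        local n, err = file:write(chunk)
        if err then
            return nil, err
        end
        total = total + n
    end

    -- Passing true asks file:flush to call fsync as well.
    local ok, err = file:flush(true)
    if not ok then
        return nil, err
    end

    return total
end
```

Calling file:flush(true) once after a batch of writes, rather than after every write, keeps the number of fsync calls low while still giving a clear point at which the data is expected to be durable.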

file:lines

Syntax: local iter = file:lines()
Context: rewrite_by_lua*, access_by_lua*, content_by_lua*, ngx.timer.*, ssl_certificate_by_lua*, ssl_session_fetch_by_lua*

Returns an iterator that, each time it is called, returns a new line from the file. Therefore, the construction

for line in file:lines() do body end

will iterate over all lines of the file.

The iterator behaves like file:read("*l"), and you can safely mix it with the other read methods.
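
To illustrate mixing the two read styles, the sketch below takes the first line with file:read("*l") and then iterates over the remaining lines with file:lines(). The helper name split_first_line is hypothetical; file is assumed to behave as documented in this section.

```lua
-- Hypothetical helper (not part of ngx.io): returns the first line and a
-- table holding all remaining lines.
local function split_first_line(file)
    local first, err = file:read("*l")
    if err then
        return nil, nil, err
    end

    -- The iterator continues from wherever the previous read stopped.
    local rest = {}
    for line in file:lines() do
        rest[#rest + 1] = line
    end

    return first, rest
end
```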

file:close

Syntax: local ok, err = file:close()
Context: rewrite_by_lua*, access_by_lua*, content_by_lua*, ngx.timer.*, ssl_certificate_by_lua*, ssl_session_fetch_by_lua*

Closes the file. Any cached write buffer data will be flushed to the file. This method is a synchronous operation and is 100% nonblocking.

In case of success, this method returns 1 while nil plus a Lua string will be returned if errors occurred.

Author

Alex Zhang (张超), UPYUN Inc.

lua-io-nginx-module's Issues

crash with ngx.io

Hi,

Your module looks very useful. Thank you for your work.

I built the library with OpenResty 1.15.8.1rc1
https://github.com/openresty/docker-openresty/blob/master/xenial/Dockerfile

Configuration of the module is default (as on the main page).

I constantly get crashes when serving multiple requests.

In my use case, many clients write and read files simultaneously (file sizes range from 500 bytes to 1 GB), and every 1-5 minutes nginx crashes, always with the same stack trace:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `nginx: worker process '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 ngx_http_lua_io_file_finalize (r=0x3730333d646e6f63, ctx=0x7ff24166e4e0)
at /tmp/lua-io-nginx-module/src/ngx_http_lua_io_module.c:1492
1492 *ctx->cleanup = NULL;
[Current thread is 1 (Thread 0x7ff257c31740 (LWP 99))]

With the standard Lua io library it works as expected.

I can provide more information if you tell me what else I should gather.

Thank you. BR Alex.

Unsafe usage of memory from request pool

Unfortunately, my PR (#8) is insufficient to fully solve the memory issues with this module. It helps, but it doesn't solve them completely: it is still possible for a request to finish and release memory that is in use, or about to be used, by tasks.

To quote the lua module: https://github.com/openresty/lua-nginx-module/blob/c9a0808c89219f74e4d20cef44a322a1bbe72df3/src/ngx_http_lua_worker_thread.c#L69

 * Re-implement ngx_thread_task_alloc to avoid alloc from request pool
 * since the request may exit before worker thread finish.

Fortunately, this means it is now possible to implement this module in Lua!

Bad file descriptor error at file:close()

Hi!

The following error occurs very often:
close() failed (9: Bad file descriptor)

It happens at the end of a complex PUT workflow (described below), after the main data has already been written and re-sent, when a small piece of data should be placed in a separate file (point 5 of the PUT workflow). The exact code:

local io = require "ngx.io"

local file,err = io.open( filename, "w+" )
if err then return nil, err end

for name, value in pairs(headers) do
  if value ~= nil then
    local res,err = file:write(name:lower()..": "..value.."\n")
    if not res then return nil,err end
  end
end

return file:close()

The final file:close is the call that returns the error "close() failed (9: Bad file descriptor)".

The request pattern is as follows. There are many PUT/HEAD/GET/DELETE requests at 1Gbps speed. Most PUT requests write file data locally and send it to another service using this library:
https://github.com/ledgetech/lua-resty-http

so chunks are read, written to a local file and forwarded to the other service.

GET requests read a file and "print" its data to the output.

Chunk size is 65536 bytes.

PUT is the most interesting case, since the HTTP request that forwards the data is sent to a local "proxy_pass" location, which passes the query on to an external service.

So PUT sequence is:

  1. Get PUT request
  2. Open local file to write
  3. Connect to local (127.0.0.1:80/local_proxy_pass) location with proxy_pass using resty.http library
    3.1. for body reader an iterator function is passed where
    3.1.1. receive a chunk data with socket:receive(current_chunk_size)
    3.1.2. write data to local file
    3.1.3. return data
  4. When all data is read - close the file
  5. At very end of the request create and write a small file.

Input and external requests are secured with https.

Memory usage after free

We upgraded to a newer OpenResty and began to see segfaults.

Valgrind output:

==32455== Invalid read of size 8
==32455==    at 0x20C72D: ngx_thread_pool_handler (ngx_thread_pool.c:387)
==32455==    by 0x20A923: ngx_epoll_notify_handler (ngx_epoll_module.c:456)
==32455==    by 0x20B47B: ngx_epoll_process_events (ngx_epoll_module.c:901)
==32455==    by 0x1FACE4: ngx_process_events_and_timers (ngx_event.c:247)
==32455==    by 0x208F79: ngx_worker_process_cycle (ngx_process_cycle.c:750)
==32455==    by 0x205988: ngx_spawn_process (ngx_process.c:199)
==32455==    by 0x207FC4: ngx_start_worker_processes (ngx_process_cycle.c:359)
==32455==    by 0x207692: ngx_master_process_cycle (ngx_process_cycle.c:131)
==32455==    by 0x1C6789: main (nginx.c:382)
==32455==  Address 0x88cb6c0 is 0 bytes inside a block of size 192 free'd
==32455==    at 0x48369AB: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==32455==    by 0x1CA22F: ngx_destroy_pool (ngx_palloc.c:85)
==32455==    by 0x39FC7C: ngx_http_lua_close_fake_connection (ngx_http_lua_util.c:3812)
==32455==    by 0x39FA77: ngx_http_lua_close_fake_request (ngx_http_lua_util.c:3733)
==32455==    by 0x39F960: ngx_http_lua_finalize_fake_request (ngx_http_lua_util.c:3693)
==32455==    by 0x3C56AD: ngx_http_lua_ssl_cert_aborted (ngx_http_lua_ssl_certby.c:422)
==32455==    by 0x1CA1F1: ngx_destroy_pool (ngx_palloc.c:57)
==32455==    by 0x23172B: ngx_http_close_connection (ngx_http_request.c:3731)
==32455==    by 0x22BF36: ngx_http_ssl_handshake_handler (ngx_http_request.c:879)
==32455==    by 0x22BCE0: ngx_http_ssl_handshake (ngx_http_request.c:789)
==32455==    by 0x22AE56: ngx_http_init_connection (ngx_http_request.c:384)
==32455==    by 0x1FDC9F: ngx_event_accept (ngx_event_accept.c:308)
==32455==  Block was alloc'd at
==32455==    at 0x483577F: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==32455==    by 0x201E4F: ngx_alloc (ngx_alloc.c:22)
==32455==    by 0x1CA58A: ngx_palloc_large (ngx_palloc.c:220)
==32455==    by 0x1CA375: ngx_palloc (ngx_palloc.c:131)
==32455==    by 0x1CA78E: ngx_pcalloc (ngx_palloc.c:302)
==32455==    by 0x20C272: ngx_thread_task_alloc (ngx_thread_pool.c:219)
==32455==    by 0x412779: ngx_http_lua_io_thread_post_read_task (ngx_http_lua_io.c:248)
==32455==    by 0x411D05: ngx_http_lua_io_file_read_helper (ngx_http_lua_io_module.c:1750)
==32455==    by 0x40F98B: ngx_http_lua_io_file_read (ngx_http_lua_io_module.c:748)
==32455==    by 0x4D01AD5: lj_BC_FUNCC (in /usr/local/lib/libluajit-5.1.so.2.1.0)
==32455==    by 0x39B129: ngx_http_lua_run_thread (ngx_http_lua_util.c:1090)
==32455==    by 0x3C5A20: ngx_http_lua_ssl_cert_by_chunk (ngx_http_lua_ssl_certby.c:533)

After working through the changes, the cause appears to be the new luajit2.

It appears to be memory being freed too early. Perhaps the thread pool task memory is allocated from the request pool?

Crash in ngx_http_lua_io_read_all

#0  0x00005603b2e4b020 in ngx_http_lua_io_read_all ()
#1  0x00005603b2e4a3e7 in ngx_http_lua_io_file_do_read ()
#2  0x00005603b2e49fce in ngx_http_lua_io_prepare_retvals ()
#3  0x00005603b2e49672 in ngx_http_lua_io_resume ()
#4  0x00005603b2e49514 in ngx_http_lua_io_content_wev_handler ()
#5  0x00005603b2e495db in ngx_http_lua_io_thread_event_handler ()
#6  0x00005603b2c91d14 in ngx_thread_pool_handler ()
#7  0x00005603b2c8ff0b in ngx_epoll_notify_handler ()
#8  0x00005603b2c90a63 in ngx_epoll_process_events ()
#9  0x00005603b2c7f9c8 in ngx_process_events_and_timers ()
#10 0x00005603b2c8e362 in ngx_worker_process_cycle ()
#11 0x00005603b2c8ac63 in ngx_spawn_process ()
#12 0x00005603b2c8ddf1 in ngx_reap_children ()
#13 0x00005603b2c8cb39 in ngx_master_process_cycle ()
#14 0x00005603b2c4a7a5 in main ()

attempt to write data on a closed file object

The case is:

  1. Lua module main (for content_by_lua):
local _M = {}
_M.fd_cache = {}   -- cache for file objects created by ngx_io.open; key is file_path, value is the file object
......
return _M
  2. Create the file and cache the file object:
local fd1, err = ngx_io.open(file_path1, "a+")
-- check the result of open
_M.fd_cache[file_path1] = fd1
-- fd2 and fd3 are created the same way
  3. When a request arrives:
-- get the request args
-- get the file path to write from the args, e.g. file_path1
-- get the file object: fd1 = _M.fd_cache[file_path1]
-- call fd1:write(str_data), which fails with err: attempt to write data on a closed file object

I am sure I never call fd1:close() in my Lua project.

Thank you.
