
cdn's People

Contributors

adamkdean, carlbuelow777, davidmacp, eduardoboucas, greenkeeper[bot], greenkeeperio-bot, jean-luc, jimlambie, josephdenne, mingard, philip-hunt, snyk-bot

cdn's Issues

A black border is added to images

Exploration of alternatives to imagemagick-native

Explore the potential for replacing imagemagick-native with gd or gmic to reduce dependencies, improve compatibility and improve performance.

imagemagick-native is currently not compatible with recent releases of Node, which is less than ideal.

Add support for content aware crop generation

I've been playing with smartcrop.js, an algorithm that, given an image and dimensions for a crop, uses image processing to find good crops automatically.

It currently supports Sharp and Image Magick as image operation processors, but I've created an adapter module for lwip, looking to integrate it with CDN.

I see this being useful in two ways:

  1. A mode for getting an automatically cropped image, for when we want to display an image in various aspect ratios but don't have human-generated crops;
  2. A mode for getting the coordinates of the best crop (by requesting a response in JSON format), to be used as a default or suggested crop in a crop editing tool.

This test suite shows the algorithm selecting crops for over 100 images, and this test bed shows a live preview with any uploaded images.

If we're happy with this, I can look into the integration with CDN soon.

Images being force downloaded

The latest build is forcing images to download and those images are corrupt once downloaded.

For example:

http://54.229.54.139/jpg/40/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/1/1100/612/uploads/image/homepage-promotions/qol-chooser-568f8aae94aee.jpg

Signed URL support

The addition of support for signed URLs to enable the access of private assets in S3.

Auth token flow

Obtaining an access token

The authentication endpoint should be at /token, in line with the endpoint for authentication in DADI API.

It should also require a payload of clientId and secret. For example:

{ "clientId": "test", "secret": "ewjkgdiuwt567832tgjhgurdweknbdkwjegg6r723ghvfj_" }

The credentials provided should match those given in the config. In the event that they do not match, the application should return a 401: Unauthorised.

In the event of successful authentication the API should return a bearer token with a 200 header. For example:

{
  "accessToken": "243606ed-e43f-41c6-8e53-75e5e2f85b82",
  "tokenType": "Bearer",
  "expiresIn": 1800
}

Here expiresIn is whatever `config.auth.tokenTtl` is set to.

Authenticated requests

When a user has a token, it should be sent in the request as an Authorization header in place of the current x-access-token header. For example:

Authorization: Bearer 243606ed-e43f-41c6-8e53-75e5e2f85b82

Token store and expiry

Please use https://github.com/simonlast/node-persist to store auth tokens on the filesystem. An example can be found in https://github.com/dadi/cleanse

Errors

If no Authorization header is present, the application MUST return the following:

Status: 401 Unauthorized
Header: WWW-Authenticate: Bearer realm="example" (where example is the domain used for the application)

If an Authorization header is present but the token is invalid/expired, the application MUST return the following:

Status: 401 Unauthorized
Header: 
WWW-Authenticate: Bearer realm="example",
                 error="invalid_token",
                 error_description="The access token has expired"

GIF seems to crash app

A shot of the console errors found on Lifestyle when a GIF resize is called. Appears to crash the app, and further images exhibit a 504. Those images can then be opened in isolation, and will generate perfectly, suggesting that the app was previously out-of-service.

screen shot on 2016-02-12 at 17-34-07

Images with spaces in the filename are not served

Spaces are encoded as %20 by the browser, but images are not found on the filesystem (where they are saved unencoded)

Example:
this image with spaces in the filename is not being shown
http://bantam.test.empireonline.com:8080/jpg/70/0/0/640/480/aspectfit/0/0/0/0/0/0/c/articles/578f6fa76c7bafe3054f1635/Finding%20Dory%20-%20Seals.jpg

but it exists on the filesystem

-rw-r--r-- 1 www-data www-data 150728 Jul 20 15:20 /data/asset-store/c/articles/578f6fa76c7bafe3054f1635/Finding Dory - Seals.jpg

Please note that previous versions of CDN were able to serve images, for example this is 0.1.11-beta
http://54.229.5.231:8080/jpg/70/0/0/640/480/0/0/0/1/0/North/0/0/0/0/0/c/celebrity/5698144c01aa022c6edd2986/DRAKE%20WHO%20WORE%20IT%20BEST_950x953.png
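
One possible fix, sketched under the assumption that the request path is mapped directly onto the asset store, is to decode the URL path before the filesystem lookup. `resolveAssetPath` and its arguments are hypothetical names, not part of CDN.

```javascript
const path = require('path')

// Hypothetical helper: maps a URL-encoded request path onto the asset store.
function resolveAssetPath (assetRoot, requestPath) {
  // Browsers send spaces as %20, but files are stored unencoded on disk
  const decoded = decodeURIComponent(requestPath)
  const resolved = path.posix.join(assetRoot, decoded)

  // Refuse paths that escape the asset root after decoding (e.g. %2e%2e/)
  const rootWithSlash = assetRoot.endsWith('/') ? assetRoot : assetRoot + '/'
  if (!resolved.startsWith(rootWithSlash)) {
    throw new Error('Path is outside the asset root')
  }

  return resolved
}
```

Decoding makes traversal sequences such as `%2e%2e` meaningful, which is why the sketch re-checks the resolved path against the root.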

Ability to post a recipe

Suggest that the recipe is posted to the intended path + /config e.g. /new-crop-path/config

with content

{
  "recipe": "new-crop-path",
  "settings": {
    "format": "jpg",
    "quality": 70,
    "ratio": "1-1"
  }
}

Remove node-canvas dependency

canvas is only required for extracting colours from an image, but requires a complex setup on all platforms. There must be another way...

Block origin access

Media can have an origin-block boolean which, when true, blocks CDN from outputting the original file. This is especially useful when working with large images or where licensing is granted based on maximum usable resolution.

Suggest this is stored in the CDN db rather than using AWS permissions, to avoid reliance on a third party.

Server Error when using S3

We are seeing this error when using S3 for image storage in the latest build:

PermanentRedirect: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.

Proposed new URL Syntax

The current path based syntax for working with images is very fragile. Adding, removing or modifying parameters will instantly break all clients using the CDN. I therefore would like to propose a new url structure based primarily on query values.

mycdn/{storage_adapter}/{image_path}?{operation}={value}

So for instance, media.whatcar.com/s3/wc/reviews/porsche911.jpg?cropX=600&cropY=400

In addition, with the potential switch to libgd (#16), we could pair the new image processing library with the new syntax. Clients can then change at their own pace instead of being forced to update.
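
To illustrate why the query-based form is less fragile, here is a minimal sketch of how such a URL could be parsed. `parseRequestUrl` and the returned property names are hypothetical; the sketch assumes a full URL including the scheme.

```javascript
// Hypothetical parser for the proposed syntax:
// mycdn/{storage_adapter}/{image_path}?{operation}={value}
function parseRequestUrl (requestUrl) {
  const url = new URL(requestUrl)
  const segments = url.pathname.split('/').filter(Boolean)

  return {
    storageAdapter: segments[0],
    imagePath: segments.slice(1).join('/'),
    // Operations the handler does not know about stay in the map and can be ignored
    operations: Object.fromEntries(url.searchParams)
  }
}
```

Because operations are named rather than positional, adding or removing one does not shift the meaning of the others.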

Standalone generator of responsive images with art direction

It would be interesting to create a standalone web app to showcase CDN features, more specifically on-the-fly crop generation and the content-aware cropping mode.

People would be able to upload an image, select the breakpoints they want and the tool would generate all the necessary markup (this pen being an example of what could be generated).

References:

SSL handling

X-Post from: dadi/web#99

At the moment configuration appears to enable SSL as being either on or off. SSL needs to be supported in a variety of configurations:

  • Just SSL
  • SSL, with support for routing traffic from HTTP to HTTPS
  • HTTPS alongside HTTP (either/or support)

In addition the method of recognising/enforcing an SSL request needs attention. In a load balanced setup the load balancer will usually handle SSL, piping requests as HTTP to the end instances.
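
As a rough sketch of recognising an SSL request behind a load balancer, the check below looks at the `X-Forwarded-Proto` header that SSL-terminating proxies commonly set. `isSecureRequest` and the `trustProxy` flag are hypothetical names, and the flag is an assumed config option.

```javascript
// Hypothetical check: was the original client request made over HTTPS?
// Only enable `trustProxy` when the instance is reachable solely through
// the load balancer, since clients can otherwise spoof the header.
function isSecureRequest (req, trustProxy) {
  // Direct TLS connection to this instance
  if (req.socket && req.socket.encrypted) return true

  if (trustProxy) {
    // Load balancers that terminate SSL commonly set X-Forwarded-Proto
    const forwarded = req.headers['x-forwarded-proto'] || ''
    return forwarded.split(',')[0].trim().toLowerCase() === 'https'
  }

  return false
}
```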

Add ENV variable names for all sensitive properties

Config settings can be loaded from the config files, from environment variables or from the command line when launching the app.

All sensitive settings need to have an "env": "xxx" property added to the config schema
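
A minimal sketch of what this could look like follows. The setting keys, the variable name `AUTH_SECRET`, the `resolveSetting` helper and the precedence order (environment variable, then config file, then default) are all assumptions made for illustration, and command-line arguments are left out.

```javascript
// Hypothetical schema entries; the keys and variable names are assumptions
const schema = {
  'auth.secret': { default: '', env: 'AUTH_SECRET' },
  'server.port': { default: 8080 }
}

// Resolves a setting: environment variable first, then config file value, then default
function resolveSetting (key, fileConfig, env = process.env) {
  const entry = schema[key]

  if (entry.env && env[entry.env] !== undefined) return env[entry.env]
  if (fileConfig[key] !== undefined) return fileConfig[key]

  return entry.default
}
```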

CDN crashes if cache is enabled and JSON format is requested

This error shows an empty buffer passed to the getColours method:

[2016-12-31 10:40:00.936] [LOG]   <Buffer >
[2016-12-31 10:40:00.959] [LOG]   []
{"name":"dadi-cdn","hostname":"Jamess-MacBook.local","pid":3861,"level":50,"err":{"message":"Cannot read property 'getHex' of undefined","name":"TypeError","stack":"TypeError: Cannot read property 'getHex' of undefined\n    at /Users/jameslambie/projects/dadi/product/cdn/dadi/lib/handlers/image.js:650:35\n    at /Users/jameslambie/projects/dadi/product/cdn/node_modules/node-vibrant/lib/vibrant.js:58:18\n    at Jimp.<anonymous> (/Users/jameslambie/projects/dadi/product/cdn/node_modules/node-vibrant/lib/image/node.js:50:47)\n    at Jimp.throwError (/Users/jameslambie/projects/dadi/product/cdn/node_modules/jimp/index.js:83:44)\n    at new Jimp (/Users/jameslambie/projects/dadi/product/cdn/node_modules/jimp/index.js:212:31)\n    at new JimpImage (/Users/jameslambie/projects/dadi/product/cdn/node_modules/node-vibrant/lib/image/node.js:47:7)\n    at Vibrant.module.exports.Vibrant.getPalette (/Users/jameslambie/projects/dadi/product/cdn/node_modules/node-vibrant/lib/vibrant.js:54:20)\n    at Vibrant.module.exports.Vibrant.getSwatches (/Users/jameslambie/projects/dadi/product/cdn/node_modules/node-vibrant/lib/vibrant.js:72:17)\n    at /Users/jameslambie/projects/dadi/product/cdn/dadi/lib/handlers/image.js:644:7\n    at getColours (/Users/jameslambie/projects/dadi/product/cdn/dadi/lib/handlers/image.js:640:10)\n    at PassThroughExt.<anonymous> (/Users/jameslambie/projects/dadi/product/cdn/dadi/lib/handlers/image.js:620:7)\n    at emitNone (events.js:85:20)\n    at PassThroughExt.emit (events.js:179:7)\n    at /Users/jameslambie/projects/dadi/product/cdn/node_modules/readable-stream/lib/_stream_readable.js:875:14\n    at _combinedTickCallback (internal/process/next_tick.js:67:7)\n    at process._tickDomainCallback (internal/process/next_tick.js:122:9)"},"msg":"Cannot read property 'getHex' of undefined","time":"2016-12-31T02:40:00.962Z","v":0}

The addition of routes

The addition of a conditional routing concept to DADI CDN, whereby you can configure rules to select different recipes based on information derived about a user.

You can think of a route as a collection of recipes that are dynamically chosen on the basis of known data.

Recipe selection can be made on the basis of:

  • The device being used
  • Network speed
  • Current location (continent, region, country, city)
  • Language

For example: if I requested image X on an iPhone, I would receive recipe Y, but if I requested image X on a laptop I would receive recipe Z.

For device and language lookup, DADI CDN will use device sniffing of request headers. For example: https://www.npmjs.com/package/mobile-detect

For location and network lookup DADI CDN will make use of the DADI location and netspeed APIs.

Note: the DADI location API requires a clientId and secret and as such the use of location and network speed will be optional, configured within the main config file

Routes will be held in /workspace/routes as JSON files on disk, as per recipes.

Example route:

{
    "route": "thumbnail",
    "conditions": [
        {
            "condition": {
                "device": "desktop",
                "network": "broadband",
                "location": "london",
                "language": "en"
            },
            "recipe": "desktop-london-en-thumbnail-xlarge"
        },
        {
            "condition": {
                "device": "desktop",
                "network": "mobile"
            },
            "recipe": "mobile-thumbnail-xlarge"
        },
        {
            "condition": {
                "device": "mobile",
                "network": "mobile"
            },
            "recipe": "mobile-thumbnail"
        }
    ],
    "else": "thumbnail"
}

Routes will specify one or more conditions against which to match a user request, along with a recipe to load in the event of a positive match, and a fallback recipe in the event that no matches are found.

Route files will be named in the format {ROUTE-NAME}.json, where {ROUTE-NAME} is the URL string for the route and where {ROUTE-NAME} also matches the route within the JSON.

A route is called in the same way as a recipe. For example:

http://youdomain.com/example-route-name/image-filename.png

Note: routes take precedence over recipes, so in the event that there is a route called "thumbnail" and a recipe called "thumbnail", it is the route that is parsed.
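
The matching described above could be sketched as follows. `selectRecipe` and `requestData` are hypothetical names, and the "first matching condition wins" order is an assumption, since the proposal does not say how ties are resolved.

```javascript
// Hypothetical matcher: returns the recipe for the first condition whose
// properties all equal the corresponding values derived for the request.
function selectRecipe (route, requestData) {
  const match = route.conditions.find(({ condition }) =>
    Object.keys(condition).every(key => condition[key] === requestData[key])
  )

  // Fall back to the route's "else" recipe when nothing matches
  return match ? match.recipe : route.else
}
```

With the example route, a request identified as `{ device: 'mobile', network: 'mobile' }` would select "mobile-thumbnail", and a request matching no condition would fall back to "thumbnail".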

Closed issues?

Please comment on and then close any of the issues that have been resolved.

Middleware support

The ability to extend the JSON image response to augment the data points available.

For example, adding support for a facial recognition API (such as Lambda Labs) would allow the automatic tagging of celebrities.

Image upload

A useful addition to CDN would be POST support, where it would accept file data posted from a form and store the image in the same location as the retrieval mechanism uses.

CDN should allow configuration options for:

  • upload enabled/disabled
  • authentication required
  • the format for the new path (for example a chunked string, giving a series of directories beneath the configured root 01abc/34f3e/124ff/00aec/etc/etc/etc)

It should be possible to upload using both the S3 and disk storage adapters, but not the HTTP remote server adapter.
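
As an illustration of the chunked path format, the sketch below derives the directories from a hash of the file contents. Hashing with SHA-1, the chunk size of 5 and the name `chunkedPath` are assumptions; the proposal only specifies that the path is a chunked string.

```javascript
const crypto = require('crypto')

// Hypothetical helper: derives a chunked directory path from the file contents.
function chunkedPath (fileBuffer, fileName, chunkSize = 5) {
  const hash = crypto.createHash('sha1').update(fileBuffer).digest('hex')

  // Split the 40-character hex digest into fixed-size directory names
  const chunks = hash.match(new RegExp(`.{1,${chunkSize}}`, 'g'))

  return chunks.join('/') + '/' + fileName
}
```

Deriving the path from the contents means the same file always lands in the same place, regardless of storage adapter.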

Installation fails on standard AWS ubuntu server

With latest node 4.x (4.4.4)

It fails also with node 6, 5.5.1, 4.2.4 and works only with 0.10.37.

But 4.x is what we run for the other DADI applications and what we would need to support.

If necessary, please update the installation instructions here
https://github.com/dadi/cdn/blob/docs/docs/installGuide.ubuntu.md

make: *** [Release/obj.target/imagemagick/src/imagemagick.o] Error 1
make: Leaving directory `/data/cdn/node_modules/imagemagick-native/build'
gyp ERR! build error 
gyp ERR! stack Error: `make` failed with exit code: 2
gyp ERR! stack     at ChildProcess.onExit (/usr/lib/node_modules/npm/node_modules/node-gyp/lib/build.js:276:23)
gyp ERR! stack     at emitTwo (events.js:87:13)
gyp ERR! stack     at ChildProcess.emit (events.js:172:7)
gyp ERR! stack     at Process.ChildProcess._handle.onexit (internal/child_process.js:200:12)
gyp ERR! System Linux 3.13.0-74-generic
gyp ERR! command "/usr/bin/nodejs" "/usr/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js" "rebuild"
gyp ERR! cwd /data/cdn/node_modules/imagemagick-native
gyp ERR! node -v v4.4.4
gyp ERR! node-gyp -v v3.3.1

Missing image rewrite

Handling missing assets in Web relies on JavaScript manipulation and has been a global requirement.

Adding a configurable fallback image for CDN resolves this problem, however it does reduce our ability to debug in both Publish and Web.

As a way to achieve both, we could serve the asset with a 410 (gone) response code which would:

  • Stop Google indexing the asset
  • Be useful in differentiating between successful and unsuccessful resolution for debugging.

Add cache control header configuration

CDN requires cache control headers configurable by mime type:

For example, the Apache directive below sets a long cache expiration for assets that are not going to change:

# 1 month for all static assets
<FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|js|css|swf)$">
  Header set Cache-Control "max-age=2592000, public"
</FilesMatch>
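
An equivalent in CDN could be a lookup keyed by MIME type. The config shape, the `cacheControlFor` helper and every value other than the 30-day `max-age=2592000` are assumptions made for illustration.

```javascript
// Hypothetical config shape: Cache-Control values keyed by MIME type
const cacheControl = {
  default: 'public, max-age=3600',
  mimeTypes: {
    'image/jpeg': 'public, max-age=2592000',
    'image/png': 'public, max-age=2592000',
    'text/css': 'public, max-age=86400'
  }
}

// Returns the configured header value, or the default for unlisted types
function cacheControlFor (mimeType, config = cacheControl) {
  return config.mimeTypes[mimeType] || config.default
}
```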

Incorrect application of parameters when cropping

Existing Monocle Image URL - using Symphony JIT Image Manipulation

https://images.monocle.com/5/2776/1561/0/746/549/309/uploads/image/article/_rc16026-57850acda818c.jpg

Translated to DADI CDN, URL Syntax V2
https://images.monocle.com/uploads/image/article/_rc16026-57850acda818c.jpg?resize=crop&crop=0,746,2776,2307&w=549&h=309

Formula for crop coordinates

  • //images.monocle.com/5/r/b/l/t/w/h/uploads/image/article/_rc16026-57850acda818c.jpg
  • //images.monocle.com/uploads/image/article/_rc16026-57850acda818c.jpg?resize=crop&crop=l,t,l+r,t+b&w=w&h=h
  • l: left
  • t: top
  • r: right
  • b: bottom
  • w: width to resize to, after crop
  • h: height to resize to, after crop
lt---------+
|          |
|          |
|          |
+----------rb
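
The formula can be checked with a small converter. `jitToCdn` is a hypothetical helper written only to illustrate the mapping; it assumes the JIT path always has the mode segment followed by r, b, l, t, w and h.

```javascript
// Hypothetical converter from a JIT-style path to the URL Syntax V2 form.
// JIT order after the mode segment: r, b, l, t, w, h, then the image path.
function jitToCdn (jitPath) {
  const [, r, b, l, t, w, h, ...rest] = jitPath.split('/').filter(Boolean)
  const [right, bottom, left, top] = [r, b, l, t].map(Number)

  // crop=l,t,l+r,t+b
  const crop = [left, top, left + right, top + bottom].join(',')

  return `/${rest.join('/')}?resize=crop&crop=${crop}&w=${w}&h=${h}`
}
```

Applied to the Monocle path above, this gives `crop=0,746,2776,2307&w=549&h=309`, matching the translated URL.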

In the example above using Symphony's JIT service, the image is cropped and then resized.
In CDN we don't perform this final resize step.

width & height are both used, but only if two crop coordinates are supplied. w & h are then
used to determine the final size of the crop rectangle.

If four coordinates are supplied, they are used to size the crop rectangle. The specified w & h
parameters are ignored. (Note in the code block below I have added a resize call to test the theory
using Monocle's images, and it works as expected, giving the same output as the existing JIT service)

I believe the solution could be to use the originalWidth & originalHeight when only two crop
coordinates are specified, rather than the w & h parameters, which should then be used to resize the resulting crop.

case 'crop':
  if (options.crop) {
    var coords = options.crop.split(',').map((coordStr) => {
      return parseInt(coordStr)
    })

    // Reduce 1 pixel on the edges
    coords[2] = (coords[2] > 0) ? (coords[2] - 1) : coords[2]
    coords[3] = (coords[3] > 0) ? (coords[3] - 1) : coords[3]

    if (coords.length === 2) {
      batch.crop(coords[0], coords[1], width - coords[0], height - coords[1])
    }
    else if (coords.length === 4) {
      // image.crop(left, top, right, bottom)
      batch.crop(coords[0], coords[1], coords[2], coords[3])

      // NOTE: added just now to test the theory above
      if (width && height) {
        batch.resize(width, height, filter)
      }
    }

Asset revisions

I propose we introduce an asset revision identifier in the new URL syntax, allowing for a site to refer to a specific git revision of assets. This would solve the issue of syncing backend/frontend deploys (if cdn is updated automatically the backend could refer to any version it wants), as well as removing the need for cache busting (as the paths would change when the version changes). Finally it would make it possible for different projects/subsites to use different asset versions if needed.

I can see two ways of doing this

  • Either the revision is "per-repo" and refers to the sha of the latest commit of the repository that feeds the cdn
  • Revision per file, meaning when a deploy happens only files that change will have their paths changed.

The second alternative (per-file revision, using the file's checksum) would make the frontend/backend integration more difficult to implement, but would provide clear cache performance benefits over repository-wide revisions (only files that change would be "invalidated").
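
A per-file revision could be sketched as below. Embedding the checksum in the filename, using SHA-256 truncated to 8 characters, and the name `revisionedPath` are all assumptions; the proposal does not fix where in the URL the revision goes.

```javascript
const crypto = require('crypto')
const path = require('path')

// Hypothetical helper: embeds a short content checksum in the asset path,
// so the path only changes when the file itself changes.
function revisionedPath (assetPath, contents, length = 8) {
  const checksum = crypto.createHash('sha256').update(contents).digest('hex').slice(0, length)
  const { dir, name, ext } = path.posix.parse(assetPath)

  return path.posix.join(dir, `${name}.${checksum}${ext}`)
}
```

Unchanged files keep their path across deploys, so their cached copies stay valid.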

Bandwidth dependent asset delivery

The addition of bandwidth measurement as part of the first HTTP request made by a client, in order to deliver variable resolution/quality content dependent on the network speed between infrastructure and device.

Available bandwidth will be made available as a variable for use within routes.

External image load

CDN currently handles retrieving assets from disk, S3 and a configurable remote server. A very useful addition would be to allow requesting publicly available images via CDN, where they could be processed and cached.

There is already an http storage module which is used for the remote server, so perhaps all we need is an extension to determine that a request is for an external image.

Audio file section with optional transcription

Endpoint

{cdn.host}/path/to/audio?clip={"from": "01:20:04", "to": "01:24:01", "transcript": true, "format": "video/audio"}

Video

  • Format could have an optional coverImage attribute
  • If transcript is true, embed transcript in video output
  • Use gsspeech-api or something similar

Audio

Simply a cut down clip

SSL support

We need integrated support for HTTPS to provide certificate coverage for standalone instances (outside of a clustered setup where support is offloaded to a load balancer or proxy).

HTTPS enabled (true/false) and certificate locations (public key, private key, intermediary) should be configurable within the main config file. Certificates should live outside of the main application directory (i.e. include a default location of /etc/ssl/whatever).

Plugin/extension support

To keep the core of CDN as clean and lightweight as possible it requires the addition of support for plugins/addons/extensions.

Plugins should:

  • be configurable
  • conform to a standard
  • run in an isolated process to the core app
  • be able to operate on image and JSON responses

Lifted from issue #13:
The ability to extend the JSON image response to augment the data points available.

For example, adding support for a facial recognition API (such as Lambda Labs) would allow the automatic tagging of celebrities.

Dynamic text compositing

Enable single and multiline text overlays with full typographic control.

Explore the ability to pull in copy from DADI API to enable realtime embedded copy.

@abovebored can you feed into nice to have functionality please. E.g. typesetting, leading, tracking, backgrounds and transparencies...

Configurable robots.txt

If you visit the CDN url /robots.txt you get an error:

{"statusCode":404,"detail":"'robots.txt' is not a valid route, recipe, processor or image format"}

Would be handy to be able to configure a correct response for indexing robots.

Quality change ignored when reformatting PNG as JPG

When you simultaneously change the format of a PNG to JPG and apply a quality change, the output remains full quality.

This means there is no way to change from lossless to lossy and degrade the quality.

I may be able to take a look at this shortly @jimlambie

Extend the concept of environment to allow config switching by domain

We need to be able to test the integration between DADI applications within a staging environment which is as-live; i.e. the instance set that is then imaged and pushed into production.

Rather than having to play with config files or environment variables manually, the ability to automatically switch based on domain is desired. This would enable the use of different config files based on the domain used, allowing us to test the integration between applications without modification immediately prior to deployment.

Example domain pattern:

client.product.env.app.dadi.technology

Example domains:

  • Test: bauer.empire.test.api.dadi.technology
  • QA: bauer.empire.qa.api.dadi.technology
  • Live: empireonline.com
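
A minimal sketch of deriving the environment from the hostname follows. `environmentForHost` is a hypothetical name, and treating any hostname that does not follow the pattern as "live" is an assumption based on the Live example above.

```javascript
// Hypothetical lookup: derives the environment from a hostname that follows
// client.product.env.app.dadi.technology, falling back to a default otherwise.
function environmentForHost (host, fallback = 'live') {
  // Strip any port before matching
  const hostname = host.split(':')[0]
  const match = /^([^.]+)\.([^.]+)\.([^.]+)\.([^.]+)\.dadi\.technology$/.exec(hostname)

  if (!match) return { env: fallback }

  const [, client, product, env, app] = match
  return { client, product, env, app }
}
```

The returned `env` value could then be used to pick the config file to load.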

Addition of a status endpoint

A new endpoint at /status would enable a finer-grained approach to platform monitoring. It would enable us to make decisions relating to load (e.g. at load balancer level) on the basis of much more than just whether or not a CDN instance is responding to ping on port 80, and would allow us to more easily visualise stack performance.

/status would ideally return:

  • An overall health indicator (green/amber/red)
  • Messaging relating to the health indicator
  • CPU load and load averages
  • Memory utilisation, as well as free and freeable memory
  • Uptime
  • The version of DADI CDN being used
  • An alert where new version(s) are available

/status should be authenticated (using the same keys as used for the invalidation API), and also optional (enabled by default). The port for /status should also be configurable.
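
Part of the payload could be assembled from Node's `os` module, as in the sketch below. The field names, the `buildStatus` helper and the green/amber/red thresholds are assumptions. Authentication, health messaging and the new-version alert are left out.

```javascript
const os = require('os')

// Hypothetical payload builder; thresholds and field names are assumptions.
// `stats` defaults to live values but can be injected for testing.
function buildStatus (version, stats = {
  loadAverage: os.loadavg(),
  cpuCount: os.cpus().length,
  freeMemory: os.freemem(),
  totalMemory: os.totalmem(),
  uptime: process.uptime()
}) {
  // Normalise the 1-minute load average by the number of cores
  const load = stats.loadAverage[0] / stats.cpuCount
  const memoryUsed = 1 - stats.freeMemory / stats.totalMemory

  let health = 'green'
  if (load > 0.7 || memoryUsed > 0.8) health = 'amber'
  if (load > 1 || memoryUsed > 0.95) health = 'red'

  return { health, version, ...stats }
}
```

Note that `os.loadavg()` returns zeros on Windows, so the load part of the indicator is only meaningful on Unix-like hosts.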

Dimension values do not take into account pixel ratio

Compression issues - examples included
