capture-all-screens's Introduction

Capture all screens explainer

Introduction

getAllScreensMedia is an API that allows clients to capture all monitors attached to a device at once, without user interaction. It is only available in managed sessions, for allowlisted web apps, and the user must be informed at all times that capture is taking place. The web app cannot suppress the usage indicator.

Motivation

Web developers have expressed interest in such an API in order to meet legal and internal compliance requirements.

Use cases

  • Contact centers may require full documentation of the information provided, for compliance and/or training purposes.
  • In the financial industry, a consultant may provide financial advice digitally, and some jurisdictions may require complete documentation of that information by law.
  • Prisons may require full traceability of internet usage as a condition for allowing inmates to access the internet.

API

getAllScreensMedia can be used similarly to getDisplayMedia with a few differences:

  • getAllScreensMedia returns a promise to a sequence of MediaStreams (one MediaStream per monitor).
  • Constraints cannot be passed in the getAllScreensMedia call. The appropriate constraints may differ from monitor to monitor, and the information needed to choose them (resolution, size, …) is likely to be available only after the monitor has been captured. Constraints can be applied to each returned MediaStreamTrack with applyConstraints.
  • getAllScreensMedia will only be available for web apps allowlisted by a policy.
  • Usage indicators must be shown to the user at all times.
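
Because constraints are applied after capture, an app would typically walk the returned streams and constrain each track individually. The following is a minimal sketch of that pattern, not a definitive implementation: constrainAllScreens and pickConstraints are hypothetical names, the 15 fps and 1920 px limits are arbitrary, and whether a given constraint is honored for screen-capture tracks depends on the user agent.

```javascript
// Sketch only: `constrainAllScreens` and `pickConstraints` are hypothetical
// app-level helpers, not part of the proposed API.
async function constrainAllScreens(mediaStreams, pickConstraints) {
  const constrainedTracks = [];
  for (const stream of mediaStreams) {
    for (const track of stream.getVideoTracks()) {
      // The monitor's actual settings are only known once it is captured.
      const settings = track.getSettings();
      await track.applyConstraints(pickConstraints(settings));
      constrainedTracks.push(track);
    }
  }
  return constrainedTracks;
}

// Example policy (assumed values): cap the frame rate everywhere and cap
// the width of monitors wider than 1920 pixels.
function pickConstraints(settings) {
  const constraints = { frameRate: { max: 15 } };
  if (settings.width > 1920) {
    constraints.width = { max: 1920 };
  }
  return constraints;
}
```

In a page, the streams passed in would be the array that getAllScreensMedia resolves with.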

Each MediaStream returned by getAllScreensMedia will contain a ScreenCaptureMediaStreamTrack (a subclass of MediaStreamTrack), which provides access to monitor details analogous to those exposed by the getScreenDetails API.

partial interface MediaDevices {
  Promise<sequence<MediaStream>> getAllScreensMedia();
};

interface ScreenCaptureMediaStreamTrack : MediaStreamTrack {
  ScreenDetailed screenDetailed();
};
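
As an illustration of how screenDetailed() might be used, the sketch below builds a small description of each captured monitor. describeScreens is a hypothetical helper, and the fields read from the returned object (label, isPrimary, left, top, width, height) are assumed to match the ScreenDetailed interface of the Window Management API.

```javascript
// Sketch only: `describeScreens` is a hypothetical helper. It assumes each
// video track is a ScreenCaptureMediaStreamTrack exposing screenDetailed().
function describeScreens(mediaStreams) {
  return mediaStreams.flatMap((stream) =>
    stream.getVideoTracks().map((track) => {
      const details = track.screenDetailed();
      return {
        streamId: stream.id,
        label: details.label,
        isPrimary: details.isPrimary,
        // Position and size of the monitor in the multi-screen layout.
        left: details.left,
        top: details.top,
        width: details.width,
        height: details.height,
      };
    })
  );
}
```

Such a description could, for example, be stored alongside each recording so that it is clear which monitor a file belongs to.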

Example usage

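// `saveToFile` is a placeholder for app code that records a stream to a file.
const files = [];
try {
  // Resolves with one MediaStream per attached monitor.
  const mediaStreams = await navigator.mediaDevices.getAllScreensMedia();
  mediaStreams.forEach((mediaStream) => {
    files.push(saveToFile(mediaStream));
  });
} catch (e) {
  console.error('Unable to acquire screen captures:', e);
}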
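
Since the API is only exposed to allowlisted web apps, a caller may want to distinguish "API not present", "origin not allowlisted" and other failures. The sketch below is one possible way to do that under stated assumptions: captureAllScreens and its result shape are hypothetical, and the NotAllowedError name is taken from the spec text quoted in the issues further down this page.

```javascript
// Sketch only: `captureAllScreens` and its { ok, reason } result shape are
// hypothetical. `mediaDevices` is injected so the function can be tested;
// in a page it would be navigator.mediaDevices.
async function captureAllScreens(mediaDevices) {
  // Feature-detect: the method is absent where the API is not exposed.
  if (typeof mediaDevices.getAllScreensMedia !== 'function') {
    return { ok: false, reason: 'unsupported' };
  }
  try {
    const streams = await mediaDevices.getAllScreensMedia();
    return { ok: true, streams };
  } catch (e) {
    // Assumption: a non-allowlisted origin is rejected with NotAllowedError.
    const reason = e.name === 'NotAllowedError' ? 'not-allowlisted' : 'error';
    return { ok: false, reason, error: e };
  }
}
```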

Alternatives considered

One alternative is to first pursue getDisplayMediaSet (mediacapture-screen-share/issues/204), which allows the user to choose multiple surfaces, then add a managed-only setting on top which would auto-accept all monitors. This approach was abandoned due to insufficient Web-developer interest in the former without the latter.

capture-all-screens's People

Contributors

shangl, yoavweiss, eladalon1983, beaufortfrancois, marcoscaceres

capture-all-screens's Issues

Snapshots and short-lived indicators

The spec discusses user-facing indicators that a capture is ongoing. However, if an app uses the API to grab snapshots every N seconds, the indicators might appear so briefly that the user misses them. iOS and Android tackle this risk by ensuring that user-facing indicators only disappear after at least K seconds. I think it would be good for this spec to make similar requirements, or at least recommendations.
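
To make the suggestion concrete, here is a minimal sketch of the "stay visible for at least K seconds" logic. It is illustrative only: CaptureIndicator is a hypothetical class, a real user agent would implement this natively, and the sketch measures K from the end of the most recent capture, which is one possible interpretation.

```javascript
// Sketch only: `CaptureIndicator` is hypothetical and models the indicator
// state a user agent might keep. Timestamps are passed in as milliseconds.
class CaptureIndicator {
  constructor(minVisibleMs) {
    this.minVisibleMs = minVisibleMs; // K, in milliseconds
    this.activeCaptures = 0;
    this.lastCaptureEndMs = -Infinity; // no capture has ended yet
  }

  captureStarted() {
    this.activeCaptures += 1;
  }

  captureStopped(nowMs) {
    this.activeCaptures = Math.max(0, this.activeCaptures - 1);
    this.lastCaptureEndMs = nowMs;
  }

  // Visible while any capture is active, and for K ms after the last one ends.
  isVisible(nowMs) {
    if (this.activeCaptures > 0) {
      return true;
    }
    return nowMs - this.lastCaptureEndMs < this.minVisibleMs;
  }
}
```

With this logic, an app that grabs a snapshot every N seconds with N smaller than K would keep the indicator on continuously.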

Specification is too tied to enterprise/admin use cases

Reading through the spec, I was struck by this particular paragraph (emphasis mine):

The user agent MUST obtain permission by checking an allowlist of origins specified by an administrator or device owner. If the origin is not in the allowlist defined by the administrator, reject p with a new DOMException object whose name attribute has the value NotAllowedError and return p.

This is the first (and last) time a device administrator is mentioned in the spec, so for one thing, if that's the only intended mode, then it should be explained up front as such ("this is an API to allow administrators to grant websites access to all screens"). But this seems a little weird to me, from a platform perspective, to have getDisplayMedia be a general web API that lets users grant access to a selected display, and getAllScreensMedia be essentially the same concept only it picks all displays at once, but it only works when administrators explicitly allowlist the origin.

In my opinion, getAllScreensMedia should be available on all origins to all users, and should basically have the same permission model as getDisplayMedia only allowing multiple displays to be selected instead of one. This logically suggests that the dialog should be a multi-select where you pick which displays to return (similar to how you can show a file picker that only allows one file, or that allows multiple files to be picked).

I see in your Alternatives Considered section, you mention w3c/mediacapture-screen-share-extensions#8, in which @eladalon1983 proposes the same conclusion I came to above. It says "This approach was abandoned due to insufficient Web-developer interest in the former without the latter." (But the thread looks like there were some positive responses?) I think going with that approach would make the web API side pretty straightforward, as it has exactly the same security story as getDisplayMedia; it only has a bit of new UX and API surface.

Now clearly there's an overriding use case to allow administrators to allowlist certain origins and give them access to the displays without prompting. I see that as a private agreement at the user agent level (not part of the spec), similar to how administrators in browsers can auto-grant certain permissions. So I would suggest that we don't make admin allowlisting a requirement of the spec, but rather a power that user agents can implicitly grant to administrators.

Improve the usage example in the explainer

Currently the explainer example is very vague and it is not clear what it is doing. index and screenDetailed are not used, files is not defined and comments about what the example is trying to achieve would be helpful.

Vote: Adopt spec by Screen Capture Community Group

This issue collects votes for the Screen Capture Community Group to adopt the Capture All Screens spec.

The recommended format for casting your vote is to use one of the following:

  1. I support adopting the Capture All Screens specification.
  2. I support adopting the Capture All Screens specification; we at {company} intend to use it soon for {purpose}.
  3. I object to the adoption of the Capture All Screens specification. My list of blocking-issues is {list-blocking-issues}.

Mentioning which company you work with, and whether you have a specific use-case in mind for this API, helps browser vendors set the prioritization of implementing this API. For some browser vendors, such "Web developer signals" are taken into account when deciding whether an API is ready to be shipped.

Is this on-track?

Hello,

This is a highly useful API for the agent screen capture use case, and I wanted to understand whether it is on track for shipping and when it is expected to ship. I noticed that it is available under a Chrome flag until Chrome 130, so is it planned to ship after that?

Apologies if I am missing something basic here; I am quite new to Chrome feature releases and not yet familiar with the process.

Thanks,
Neeraj

Warning before capture starts

As of the time of writing, the spec only mentions that the user should be made aware of ongoing captures. Maybe users should be made aware, before they start interacting with the device, that their session may start being captured at any time, because the system is configured to allow that. (And if so - should they be made aware of which origins? Might be overkill.)
