An R package that provides utilities for processing and analyzing the files exported from a recorded 'Zoom' meeting. This includes analyzing data captured through video cameras and microphones, the text-based chat, and metadata. You can analyze aspects of the conversation among meeting participants and their emotional expressions throughout the meeting.
There is currently no good way to associate a given person's video feed with their Zoom display name, short of a pre-existing repository of identified faces that is linked to a unique individual identifier, which is itself linked to the Zoom display name.
Develop a function to extract the user's Zoom display name from the video feed.
Using either the raw audio file or decision rules applied to the transcript (e.g., a sharp cut from one speaker to the next), we need to create the ability to measure the prevalence of interruptions in the audio transcript.
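The transcript-based route might flag an utterance as an interruption when it begins before the previous utterance by a different speaker has ended. A minimal base-R sketch, assuming hypothetical column names (userName, utteranceStartSeconds, utteranceEndSeconds) that may differ from the package's actual transcript output:

```r
# Hypothetical column names; adjust to the package's actual transcript output.
countInterruptions <- function(transcript) {
  transcript <- transcript[order(transcript$utteranceStartSeconds), ]
  n <- nrow(transcript)
  if (n < 2) return(0L)
  prevEnd     <- transcript$utteranceEndSeconds[-n]
  prevSpeaker <- transcript$userName[-n]
  curStart    <- transcript$utteranceStartSeconds[-1]
  curSpeaker  <- transcript$userName[-1]
  # Interruption: a different speaker starts before the prior utterance ends
  sum(curStart < prevEnd & curSpeaker != prevSpeaker)
}

# Toy example: B begins 0.5 s before A has finished
transcript <- data.frame(
  userName              = c("A", "B", "A"),
  utteranceStartSeconds = c(0, 4.5, 10),
  utteranceEndSeconds   = c(5, 9, 12)
)
countInterruptions(transcript)  # 1
```

This only captures overlap-style interruptions; back-to-back "sharp cut" turns would need an additional rule on the gap between utterances.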
Provide simple graphing functionality that produces images showing each individual's speaking time, chat contributions, and sentiment of speech/text. As a template, the existing layout of meetingmeasures.com could be used.
One area for dramatic enhancement is to use the raw audio file as the basis for analysis. The software Praat provides one possible path for doing this.
In the near term, it would be great to be able to mark instances of laughter in the course of a virtual meeting.
To use the video functions, users will have to install ffmpeg; this is the case whether they use the av package, the magick package, or a direct system call. Guidance on how to install ffmpeg needs to be added to Part04 of the guide.
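As a starting point for Part04, these are the typical installation commands by platform (verify against the official ffmpeg documentation before publishing):

```shell
# macOS (Homebrew)
brew install ffmpeg

# Debian / Ubuntu
sudo apt-get install ffmpeg

# Windows (Chocolatey); alternatively, download a static build from
# ffmpeg.org and add its bin/ directory to the PATH
choco install ffmpeg
```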
There is not currently a good indicator of when a Zoom recording begins. This information is needed to precisely sync the video file with the chat file. One way of getting this information is to extract it from the Zoom video feed.
Currently, textConversationAnalysis assesses nothing beyond the sentiment of speech. Other available models should be explored for the following:
Add options for other kinds of sentiment analysis to reduce the dependency on AWS; for example, basic lexical options in existing R packages could be used.
Draw from other standardized dictionaries to assess other attributes of speech.
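As a rough illustration of the lexical approach, a dictionary-based scorer needs nothing beyond base R; a real implementation could instead draw on lexicons bundled with existing packages such as syuzhet or tidytext. The tiny lexicon here is a toy stand-in, not a real dictionary:

```r
# Toy lexicon; swap in a standardized dictionary for real use.
lexicon <- c(great = 1, good = 1, happy = 1, bad = -1, terrible = -1)

scoreUtterance <- function(text, dict = lexicon) {
  # Tokenize on anything that is not a letter or apostrophe
  words <- tolower(unlist(strsplit(text, "[^A-Za-z']+")))
  sum(dict[words], na.rm = TRUE)  # words absent from the lexicon score NA
}

scoreUtterance("That was a great, great meeting")  # 2
scoreUtterance("terrible audio today")             # -1
```

The same lookup pattern generalizes to any standardized dictionary (e.g., for other attributes of speech), which is why a local lexicon is an easy way to drop the AWS dependency.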
I am in the process of releasing the latest version of paws.common, 0.6.0 (paws-r/paws#657). paws.common 0.6.0 comes with a new xml parser that has increased performance (paws-r/paws#621). However, I am going around all paws-dependent packages to ensure the new xml parser doesn't break any existing code.
Is it possible to run your existing tests with paws.common 0.6.0? If not, please let me know how to set up my environment so that I can test these changes on your behalf.
To install paws.common 0.6.0 you can use r-universe:
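A minimal sketch of the r-universe install, assuming the repository is named paws-r.r-universe.dev (an assumption; verify the exact URL against the actual release announcement):

```r
# Assumption: the paws project publishes to an r-universe repository
# named "paws-r"; confirm the URL before relying on it.
install.packages(
  "paws.common",
  repos = c("https://paws-r.r-universe.dev", "https://cloud.r-project.org")
)
```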
Side note: from memory, I believe all the current unit tests are mocked. I am happy to investigate an integration test to ensure the new parser doesn't break anything.
Write a function that pulls together and creates individual-level and meeting-level output from anything in the batchOut object. This will create usable datasets at the individual and meeting levels as efficiently as possible.
The user would supply both the name of the chat file and the name of the transcript file. The wrapper would run textConversationAnalysis on both. In addition to the output for chat/transcript, the wrapper would also output a combination that assesses someone's total contributions in a standardized way. For example, it would give total speaking time, total number of messages/characters in chat, and comparisons of each. This would be useful for understanding someone's tendency to use one channel or the other for making contributions.
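The combined-output step might be sketched as follows, with hypothetical column names (userName, utteranceTimeWindow, message) standing in for whatever the analysis functions actually return:

```r
# Combine per-person speech and chat contributions; column names are
# illustrative, not the package's actual output format.
combineContributions <- function(transcript, chat) {
  speak <- aggregate(
    list(totalSpeakingSeconds = transcript$utteranceTimeWindow),
    by = list(userName = transcript$userName), FUN = sum
  )
  msgs <- aggregate(
    list(numChatMessages = chat$message),
    by = list(userName = chat$userName), FUN = length
  )
  chars <- aggregate(
    list(totalChatChars = nchar(chat$message)),
    by = list(userName = chat$userName), FUN = sum
  )
  out <- merge(merge(speak, msgs, all = TRUE), chars, all = TRUE)
  # Standardize within channel so cross-channel tendencies are comparable
  out$speakShare <- out$totalSpeakingSeconds /
    sum(out$totalSpeakingSeconds, na.rm = TRUE)
  out$chatShare  <- out$totalChatChars / sum(out$totalChatChars, na.rm = TRUE)
  out
}

transcript <- data.frame(userName = c("A", "B", "A"),
                         utteranceTimeWindow = c(10, 5, 5))
chat <- data.frame(userName = c("A", "B", "B"),
                   message  = c("hi", "hello there", "ok"))
combineContributions(transcript, chat)
```

Comparing speakShare against chatShare for each person is one simple way to express the speech-versus-chat tendency described above.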
Add an option to the turnTaking function called "flattenSelf". If TRUE, this would compress sequential utterances by the same speaker into a single long utterance. It might also be good to include options to compress based on same speaker + time.
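A minimal sketch of the proposed flattenSelf behavior using base R's rle(), with illustrative column names:

```r
# Collapse consecutive same-speaker utterances into single utterances.
# Column names are illustrative of a transcript data frame.
flattenSelf <- function(transcript) {
  runs <- rle(transcript$userName)
  grp  <- factor(rep(seq_along(runs$lengths), runs$lengths),
                 levels = seq_along(runs$lengths))
  data.frame(
    userName              = runs$values,
    utteranceMessage      = as.vector(tapply(transcript$utteranceMessage,
                                             grp, paste, collapse = " ")),
    utteranceStartSeconds = as.vector(tapply(transcript$utteranceStartSeconds,
                                             grp, min)),
    utteranceEndSeconds   = as.vector(tapply(transcript$utteranceEndSeconds,
                                             grp, max)),
    row.names = NULL
  )
}

transcript <- data.frame(
  userName              = c("A", "A", "B", "A"),
  utteranceMessage      = c("Hi", "there", "Hello", "Yes"),
  utteranceStartSeconds = c(0, 2, 5, 8),
  utteranceEndSeconds   = c(1, 3, 7, 9)
)
flattenSelf(transcript)  # 3 rows: "Hi there" (A), "Hello" (B), "Yes" (A)
```

The speaker + time variant would just add a condition on the gap between consecutive utterances when forming the run groups.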
Few people will have access to the Zoom API for their institution. To accommodate batch processing of meetings, there needs to be a function that reads a batch file with necessary information for each meeting -- ideally this would connect to the meeting id in the file read by zoomParticipantsInfo.
There are efficient ways to build this that impose file-naming constraints on the user. Alternatively, the batch file itself could contain full information (e.g., paths for each file--info, transcript, chat, audio, and video).
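A sketch of the full-information variant: a batch file that carries explicit paths for every meeting, read and validated without imposing any file-naming convention (the column names are illustrative):

```r
# Write a toy batch file so the example is self-contained
batchPath <- tempfile(fileext = ".csv")
writeLines(c(
  "meetingId,participantsPath,transcriptPath,chatPath,audioPath,videoPath",
  "m001,./m001_info.csv,./m001_transcript.vtt,./m001_chat.txt,./m001_audio.m4a,./m001_video.mp4"
), batchPath)

readBatchFile <- function(path) {
  batch <- read.csv(path, stringsAsFactors = FALSE)
  required <- c("meetingId", "participantsPath", "transcriptPath",
                "chatPath", "audioPath", "videoPath")
  missing <- setdiff(required, names(batch))
  if (length(missing) > 0) {
    stop("Batch file is missing columns: ", paste(missing, collapse = ", "))
  }
  batch
}

batch <- readBatchFile(batchPath)
```

Each row could then be dispatched to the per-meeting processing functions, keyed on meetingId to connect back to the zoomParticipantsInfo output.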
This function would read the output of videoFaceAnalysis and produce simple group-level metrics. For example, these could include average age, age diversity, gender composition, and group-level aggregations of the emotion metrics. It should also include metrics for face detection, which could be used -- in combination with the participantsInfo file -- as a proxy for video on/off.
The package is more powerful if users take the time to set up AWS access. Instructions need to be added for precisely how to do this, aimed at folks who may be less savvy with code and APIs.
The initial version of zoomGroupStats had functions for transcribing audio files using AWS. I never really liked the transcriptions that came out of those, so I haven't yet ported them over. At some point, though, it will be necessary to add this functionality because not everyone has access to Zoom Cloud Recording.
The size of someone's face in the gallery video depends on two factors: (1) the size of the person's face within their own video feed and (2) the number of video tiles in the gallery. Both of these are dynamic throughout a video feed.
It would be useful to (1) detect how many videos are active on the gallery feed and (2) use this as a way of generating a standardized face-size measure.
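One way to standardize, assuming the gallery is laid out as a near-square grid of equal tiles: express the detected face's area relative to a single tile rather than the full frame. All function names and inputs here are illustrative:

```r
# Standardize face size by the area of one gallery tile, so the measure is
# comparable as tiles are added or removed. Grid layout is an assumption.
standardizedFaceSize <- function(faceWidth, faceHeight,
                                 frameWidth, frameHeight, numTiles) {
  cols <- ceiling(sqrt(numTiles))        # approximate near-square grid
  rows <- ceiling(numTiles / cols)
  tileArea <- (frameWidth / cols) * (frameHeight / rows)
  (faceWidth * faceHeight) / tileArea
}

# The same absolute face size reads as larger when more tiles share a frame
standardizedFaceSize(200, 200, 1920, 1080, numTiles = 4)
standardizedFaceSize(200, 200, 1920, 1080, numTiles = 9)
```

This still requires the tile count from step (1), e.g., inferred from the detected face positions or from the gallery layout.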
The initial version of videoFaceAnalysis used the magick package to break down a video file into still frames. The current version relies on ffmpeg. To make this functionality more accessible:
Provide an option for a user to specify a directory that contains all image files from a video that has been pre-split.
Provide an option for a user to specify using magick to break down the file rather than ffmpeg.
One of the most challenging aspects of running larger-scale virtual meeting studies is the lack of a unique identifier for meeting participants. This functionality would take in the participants info file, the transcript, and the chat, and output a sheet containing all unique display names (along with any email addresses). The user would then use fuzzy matching or a more manual process to provide a new unique identifier for each display name that shows up in the meeting.
Also, we need to check the alignment of display names across these files so that we have an accurate reading of someone's time in the meeting.
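The roster-building step might look like this in base R, with illustrative column names; the indivId column is left blank for the researcher to fill via fuzzy or manual matching:

```r
# Gather every display name across the three sources into one sheet.
# Column names (userName, userEmail) are illustrative.
buildNameRoster <- function(participants, transcript, chat) {
  roster <- data.frame(
    displayName = c(participants$userName, transcript$userName, chat$userName),
    userEmail   = c(participants$userEmail,
                    rep(NA_character_, nrow(transcript) + nrow(chat)))
  )
  # One row per unique display name, keeping the first known email
  roster <- aggregate(userEmail ~ displayName, data = roster,
                      FUN = function(x) x[!is.na(x)][1],
                      na.action = na.pass)
  roster$indivId <- NA_character_  # to be filled in by the researcher
  roster
}

participants <- data.frame(userName = c("Ana B", "Jo"),
                           userEmail = c("ana@x.org", "jo@x.org"))
transcript <- data.frame(userName = c("Ana B", "Jo", "Jo"))
chat <- data.frame(userName = c("Ana"))
roster <- buildNameRoster(participants, transcript, chat)
```

In this toy example, "Ana" in the chat surfaces as a separate row from "Ana B" in the participants file, which is exactly the kind of misalignment the manual/fuzzy matching step would resolve.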
This will add metadata about the processed videos to the master batchOut object when someone runs batchGrabVideoStills(). It will record where to find the images, how many there are, and what the sampling window was.