
studio-support's Introduction

If you have found a bug, have a question, or have a suggestion for how we can do better, please create a ticket here. We'll try to handle it as soon as possible.

This repository doesn't contain source code; it exists specifically as a place to collect feedback about bugs, new features, etc.

DVC Studio

studio-support's People

Contributors

shcheklein

Forkers

hercules261188

studio-support's Issues

SAML/SSO setup for Studio

How is access set up for organizations with GitHub? Adding this as we had a question from a prospect: "If we have SAML/SSO setup with our GitHub Enterprise Organization, do we also need SAML/SSO setup for DVC Studio, or is access based on repo access within GitHub?"

Metric Changes/Comparison Sidebar Float Column Headers

Hi,

When using the comparison mode between two commits with a long list of metrics, I've noticed that scrolling makes the column headers disappear. I have to keep scrolling up to remind myself which columns belong to which commit. It would be nice to have them float over the data, as in the main view showing commit histories.

Turn off a certain filter without deleting it

If you have a number of filters set up, it is sometimes nice to be able to turn off a certain filter. At the moment you need to remove it, so you lose it completely. It would be nice to be able to disable a filter without deleting it. Thanks!

Having different names for multiple views created out of the same repository

Currently, when I create multiple views for the same repository, they all get the same name.
Different names would be useful when I need to share my experiments with different stakeholders across the board:

  1. Peers looking through the experiments and contributing to the project.
  2. Managers looking through the metrics and a selected list of experiments that I can slice and dice for them.

Hide branches without any metric values

If there are a lot of branches that have nothing related to DVC, the view becomes very polluted. If I'm not mistaken, you can't do anything useful with these branches or get any useful information from them (though maybe having .dvc in the default repo branch allows tracking metrics in files in other branches; I don't know about this).

To get an example, import this repo: https://github.com/iterative/yolov5
Or see the View https://studio.iterative.ai/user/aguschin/views/yolov5-ypd0c4rbtj

It has only one branch which has .dvc and dvc.yaml and it is shown at the bottom.

My guess is that this is related to https://github.com/iterative/viewer/issues/1372

The situation in which this happened: there was a large repo to which DVC was being applied. To try out DVC, one specific branch was created.

Also, if there are situations in which these branches could be useful, we could make this an option in Settings.

Azure repos support

It would be interesting to get support for other Git hosts such as Azure Repos.
I'm really looking forward to trying Studio!

More detailed error report

When I click on Show Warnings, I get a list of error messages saying 1 commit(s) failed to parse, but I don't know how to find which commit caused the error or how to get more details:

Screenshot from 2021-06-25 11-58-53

I think it would be nice to have the option to see a more detailed error report.

Unexpected Error When Running a new Experiment in Studio

I just made a commit to my GitHub repository and tried to run a new experiment in Studio.

I have set up a GitHub Action that runs a CML workflow on a self-hosted server. By running a new experiment in Studio, I expected it to make a commit to my GitHub repository and trigger the Action to run the training process. But then I ran into a problem.
The screenshot below shows what I did.

image

I guess the problem occurs when committing to GitHub, because my GitHub repository didn't receive any update.

Maybe I should set up a token somewhere to provide access to my private repository?

Any suggestions are welcome!

Add option to hide unrelated commits

Currently we show all commits in Studio regardless of the changes in anything we track (parameters, metrics, outputs), but it is a common case when there are a lot of commits which aren't related to any changes in ML stuff. To name a few: changes in CI workflow files, changing documentation, editing configuration files which aren't related to ML model (.gitignore for example), etc, etc.

A cleaner commit history that shows only the commits that introduce changes may be more convenient for the user. As we already have "Delta mode", this is not very important, but I suppose there could be situations in which such an option would be convenient.

related to #15

Add support for combining metrics

It is always nice to be able to combine different plots from the same experiment. The classic example would be comparing train and validation loss, or relating learning rate schedule changes to training loss.

image

Feature request: allow to select "default" branch

If you have no .dvc in the master branch, Studio won't parse it, for example:
image

Suggestion: add an option to select some branch to be "default", probably in the repo settings in Studio.

Motivation: This could be inconvenient if you already have a repo without DVC and are now trying out DVC in some other branch, to later create a PR. Of course, as a workaround, you could fork the repo and add DVC in the master branch, but in this case you lose all the CI/CD setup (runners, variables, etc.), and if you want to use DVC in CI, you will need to set up CI/CD again in your fork, which could be a waste of time.

It's unclear to me how frequent this scenario is, but it definitely exists. Feel free to add +1 and leave a comment if you need this functionality, or suggest how to deal with this more gracefully.

Metrics not showing if output is too big

Description

I am using DVC Studio to track metrics of my project. The outputs of the model training stage are

  • a csv that contains the test set predictions
  • the model
  • some plots including a confusion matrix and some linecharts
  • a full metrics JSON file that is tracked as an output, not a metric (it has too many fields and is too detailed for display)
  • compact_metrics.json, which is tracked as a metric

When I run the training with the full dataset, DVC Studio breaks and does not display the confusion matrix, and some of the metrics are missing too (screenshot attached). The UI says that one big file failed to parse. Everything shows up in GitHub and CML, though.

When I limit the size of the dataset, everything works as expected.

I suspect that the issue is that the confusion matrix input file and the test set predictions are too big for DVC Studio (they are about 10 - 20 MB, around 100-200k lines at most), so Studio does not load them.

Feature request:

  • display which files are too big, and the ability to set a threshold
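
The request above could be sketched roughly as follows. This is only an illustration in Python, not how Studio handles files internally: the function name oversized_files, the 10 MiB default, and the example file names and sizes are all assumptions made up for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical default threshold; the real limit used by Studio is not stated in this issue.
DEFAULT_MAX_BYTES = 10 * 1024 * 1024  # 10 MiB


def oversized_files(
    sizes: Dict[str, int], max_bytes: int = DEFAULT_MAX_BYTES
) -> List[Tuple[str, int]]:
    """Return (path, size) pairs for files larger than max_bytes, largest first."""
    too_big = [(path, size) for path, size in sizes.items() if size > max_bytes]
    return sorted(too_big, key=lambda item: item[1], reverse=True)


# Made-up file names and sizes, loosely modelled on the outputs listed above.
example_sizes = {
    "predictions.csv": 18 * 1024 * 1024,
    "confusion_matrix.csv": 12 * 1024 * 1024,
    "compact_metrics.json": 2 * 1024,
}

for path, size in oversized_files(example_sizes):
    print(f"{path}: {size / (1024 * 1024):.1f} MiB exceeds the threshold")
```

A report like this would let a user see which outputs were skipped, and passing a different max_bytes stands in for the configurable threshold.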

Keep up the great work, I love your products! :)

BR,

András K

Update view popup pushes bottom scrollbar off the screen

When a new commit is added, a popup shows up at the top of the page asking to refresh the view. This popup causes the bottom scrollbar to be pushed off the bottom of the screen. This means that you first have to scroll down in order to scroll left or right on the metrics panel. This is in contrast to the "Review tracking scope" popup at the top that does not push the bar off the bottom. I think the desired behaviour is to have the view resize to accommodate the popup rather than requiring scrolling.

Metrics from today do not show in Trends

I have a model run from today at 09:33 which is not showing up in the trend plots when I select '1 week', '1 month' or '1 year':

image

It does appear when I select 'all time' (see brown dot to the far right):

image

Add `live` metrics support

When working on some deep learning problems, it's common to have experiments running for hours (or even days). Some alternatives to Studio (e.g. wandb or mlflow) have some sort of "view" for individual experiments that is updated as the training goes on.

Within the "iterative ecosystem" there is the possibility of using dvclive with DVC in order to "see a plot for metrics logged during the model training".

I think it would be a good feature to have something similar (or even that same .html) integrated inside Studio.

Not all repositories are shown when using GitLab

On the Add a view page I can't find some of the repositories that I have on GitLab. I tried to search for the missing repositories using the search bar at the top of the page, but I didn't find any of them.
I'm the owner of the missing GitLab repositories, so I don't think it is a permission issue.

Expected Behavior

On the Add a view page, all the repositories from GitLab should be displayed.

Current Behavior

Only a few repositories from GitLab are shown on the Add a view page.

Steps to reproduce

  1. Connect a GitLab account to Studio
  2. Click on the upper right button Add a view

Possible solution

In my GitLab account I have access to a lot of projects. Maybe the issue is caused by a limit in the GitLab API: I saw in the GitLab API documentation that the API uses a pagination system (https://docs.gitlab.com/ee/api/#pagination).
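
If pagination is indeed the cause, the client would need to keep requesting pages until GitLab stops announcing a next one. The sketch below illustrates that loop in Python. It assumes GitLab's offset-based pagination, where each response carries an X-Next-Page header that is empty on the last page. The names fetch_page and fake_fetch_page are hypothetical stand-ins for an HTTP call to the projects endpoint, and nothing here is Studio's actual code.

```python
from typing import Callable, Dict, List, Tuple

# A page is (items, response headers); this shape is an assumption for the sketch.
Page = Tuple[List[dict], Dict[str, str]]


def list_all_projects(fetch_page: Callable[[int], Page]) -> List[dict]:
    """Collect items from every page by following the X-Next-Page header."""
    projects: List[dict] = []
    page = 1
    while True:
        items, headers = fetch_page(page)
        projects.extend(items)
        next_page = headers.get("X-Next-Page", "")
        if not next_page:  # empty or missing header: this was the last page
            break
        page = int(next_page)
    return projects


def fake_fetch_page(page: int) -> Page:
    """Hypothetical stand-in for a real HTTP request; serves two small pages."""
    data = {1: [{"id": 1}, {"id": 2}], 2: [{"id": 3}]}
    next_page = str(page + 1) if page + 1 in data else ""
    return data[page], {"X-Next-Page": next_page}


print([project["id"] for project in list_all_projects(fake_fetch_page)])
```

A client that reads only the first page would stop at the first two projects here, which matches the symptom of seeing only a subset of repositories.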

Error while accessing GitLab view

When I try to open a GitLab view, the following error is shown:

Something went wrong
c[p] is undefined
smth_wrong

Expected Behavior

The view should be displayed.

Current Behavior

Error instead of a view.

Steps to reproduce

  1. Connect a GitLab account to Studio
  2. Create a group in GitLab
  3. Add a project to this group
  4. Try to import this project into DVC Studio

Workaround

Import only personal projects.

Show pushed DVC experiments

If a user runs experiments through the dvc exp run workflow, they may push experiments to GitHub or another Git server (via dvc exp push). It would be great to see the results of these experiments in Studio.

This may be useful in several scenarios:

  • Sharing results with leadership while also showing a comparison with other experiments tried. Leaders may need to see other experiments because they need to provide input on which experiment to choose, or for audit purposes.
  • Team experimentation. If multiple team members are running different experiments, it's useful to be able to compare them all side by side.
  • Individual user who wants to review old experiments. If I need to review results from a previously pushed experiment, I don't want to have to pull all of the experiment info locally to check its performance.

Can't integrate with my GitHub repos and "Configure" link not working

Hi,
I can't currently see any of my repos. If I then click on Configure Git integration settings (see below)
image

I'm directed to this page, but when I click on the Configure button, the link returns a 404 response.

image

image

Any idea what the issue is or why the link does not work? (I have granted Iterative Studio access to my GitHub account.)
Thank you.

Only show true experiment commits & Declutter UI

I was delighted by the new update that hid commits that do not change metrics compared to the default branch.

Unfortunately, cosmetic commits and so on are still shown! I'd suggest showing commits that change metrics compared to the previous commit, instead of compared to the head of the default branch! This would declutter the UI a LOT.

Another option would be to only show branch heads by default, and expand them on request. Right now only expanded branches have any metrics shown.

E.g., I often create a new feature and train a model to see if it works, then I need to change a bunch of stuff to make CI/CD and unit tests work.
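
The first suggestion above, comparing each commit with its predecessor rather than with the head of the default branch, could look roughly like this. It is a minimal sketch in Python under stated assumptions: commits are given oldest to newest, each is a plain dict with hypothetical "sha" and "metrics" keys, and the first commit is always kept. It does not reflect how Studio stores or filters commits.

```python
from typing import Dict, List

Commit = Dict[str, object]


def commits_changing_metrics(commits: List[Commit]) -> List[str]:
    """Return SHAs of commits whose metrics differ from the previous commit.

    Commits must be ordered oldest to newest; the first one is always kept.
    """
    kept: List[str] = []
    previous_metrics: object = None
    for index, commit in enumerate(commits):
        if index == 0 or commit["metrics"] != previous_metrics:
            kept.append(str(commit["sha"]))
        previous_metrics = commit["metrics"]
    return kept


# Made-up history: two commits only touch CI or docs and leave metrics unchanged.
history = [
    {"sha": "a1", "metrics": {"acc": 0.90}},
    {"sha": "b2", "metrics": {"acc": 0.90}},  # e.g. a CI-only change
    {"sha": "c3", "metrics": {"acc": 0.93}},
    {"sha": "d4", "metrics": {"acc": 0.93}},  # e.g. a unit-test fix
]

print(commits_changing_metrics(history))  # ['a1', 'c3']
```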

Dangling experiment persists in the view even after the force parsing

Related to #20.

After the initial problem, the user sees a dangling branch with an experiment that is not even present in the repository. A force import does not help. It might be related to how the GitLab runner is handled after the experiment run, but this needs more investigation.

image

Adding a view encounters a network error

After I install the GitHub app and grant permissions to repos, I encounter an unexpected network error when I add a view.
The repository is under my company's organization, but I have admin permission for those repos and can see the installed applications.

Trace id xRL1QW2ZG-Fkpu-PtkhLe

image

DVC Studio 'Run new experiment' fields

It would be nice to have support for a multiline text field for experiment configuration.
For example, if I need to pass several elements to the config as a list, I must write them in one tiny line, which is not very convenient.
Here's an example:

  callbacks:
    - callback: 'TensorBoard'
      args:
        log_dir: '../logs'
    - callback: 'ModelCheckpoint'
      args:
        filepath: 'data/checkpoints/ckpt_{epoch:03d}'
        save_best_only: True
        monitor: 'val_output_1_map'

which gets transformed into something like this:
image

Feature Request: Branch Comparison Mode

I currently manage different models (for example small, medium and large). I do this by having a branch for each model type. Since these models serve different purposes (accuracy/runtime tradeoff), I need to maintain top-performing models for each of these types. It would be great to have a mode where DVC Studio only displays the HEAD of each branch, to get a quick comparison across different branches. This differs from the current view, which displays the last 3 commits from each branch. Ideally this would also be a toggleable viewing mode.
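
As a rough illustration of the requested mode, the Python sketch below keeps only the newest commit of each branch. It assumes each branch is given as a list of commits ordered newest first. The function name branch_heads, the dict layout, and the sample accuracy values are all hypothetical and exist only for this example.

```python
from typing import Dict, List


def branch_heads(branches: Dict[str, List[dict]]) -> Dict[str, dict]:
    """Map each branch name to its newest commit (commits are listed newest first)."""
    return {name: commits[0] for name, commits in branches.items() if commits}


# Made-up data: one branch per model size, as described above.
branches = {
    "small": [{"sha": "s2", "accuracy": 0.81}, {"sha": "s1", "accuracy": 0.79}],
    "medium": [{"sha": "m3", "accuracy": 0.86}, {"sha": "m2", "accuracy": 0.85}],
    "large": [{"sha": "l1", "accuracy": 0.90}],
}

for name, head in branch_heads(branches).items():
    print(name, head["sha"], head["accuracy"])
```

Toggling the mode off would simply mean showing the full commit lists again instead of the reduced mapping.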

Non-responsive Tracking Scope Settings

I have a private view that tracks a project with > 200 tracked metrics/params. For obvious reasons, the table does not show all of these metrics/data/params in the table, but there appears to be a bug with updating the "Tracking Scope" settings in this situation. I have tried numerous times to adjust the settings to <10 values, or even clear out all tracking scope, but every time the view updates/re-imports after clicking save, it still comes back trying to display all of the possible values (> 200) again and shows the warning to reduce scope.

Add Support for Notes

I'm not sure how this would be implemented or if it's possible, but it would be great to have git notes show up in the UI. Ideally they would be editable as well. Documentation of git notes can be found here. This would make tracking experiments through Studio a more holistic experience. Git notes have some advantages over tags (limited in length and formatting) and commit messages (immutable).

Multi remote feature

Hello, we have a DVC repository with 2 remotes and intentionally no default remote. We need that because we have a data residency constraint by region.

Would it be possible to specify multiple data remotes per project?

`Experiments`: Show status of CI workflows

Given that:

DVC Studio uses your regular CI/CD setup (e.g. GitHub Actions) to run the experiments.

I think it would be nice to display the status of the CI/CD workflows in order to be able to better monitor the experiments directly from Studio.

`Trends`: Expand view to make it similar to `Show Plots`

The current Trends view is not very friendly to use when multiple metrics are selected and used to generate the trends. There is not much room to compare and visualize, and I find myself constantly scrolling.

I think that it would be nice to expand Trends in order to have a UX similar to the current state of the Show Plots view.


Background use case:

I have a dataset that gets updated every ~day. The dataset has ~20 classes and I generate a metric for each class (counting the current number of instances in the dataset). I find Trends really useful to visualize the evolution of the dataset but the current view has some limitations (described above).

Tensorboard support

CML already supports comments that link to public tensorboards on tensorboard.dev, and they have an open issue for self-hosted tensorboards: iterative/cml#607.

It would be nice to provide:

  1. Hosting or support for self-hosted private tensorboards.
  2. Embedding tensorboards to view in studio.

Confusion matrix not plotting correctly

I have a plots CSV file which creates a confusion matrix. It plots correctly when I use dvc plots show ./confmat.csv locally. My dvc.yaml looks like:

    plots:
    - ./confmat.csv:
        cache: false
        template: confusion
        x: actual
        y: predicted

However, it does not plot correctly in DVC Studio. All I get is an empty plot:
image

Thanks.

Selected columns change when I open a new PR

I have a very large number of metrics that I filter down to 12 using the tracking scope (those that contain the word weighted):

image

However, sometimes I see columns in the view which are not selected in the tracking scope.

image

I remove them in the columns menu, but after every PR they come back and I need to remove them again.

Nested branches in DVC Studio GUI

The bug occurs when DVC Studio tries to show a branch that has no unique commits that differ from the branch it was created from.
Let me give you an example:

Example

So, it looks like "Some branch" is nested in Master and all commits of "Some branch" belong to Master, and DVC Studio shows it that way.
But this isn't correct. "Some branch" is a separate branch as far as Git is concerned, but DVC Studio treats it as a commit of Master. As a result, the search and filter features of DVC Studio don't work correctly, because they can't operate on a branch that is recognized as a commit.

GUI Example
The image above shows what I'm talking about.

Bug: Changes not reflected in Studio View

If I change the mandatory columns in my project, the change has no effect, and I get the same UI as before the changes. On the other hand, if I delete the view and create a new one with different mandatory fields, it works.

Gerrit support

I'd like to use CML / Studio on a vanilla Git repo that uses Gerrit for code review.

Publish a list of our IP addresses

In #24 we discovered that GitHub Enterprise Cloud has an IP allow-list feature. Administrators of organizations can configure an allow list of IP addresses that can access resources in their organization.

We currently do not publish such a list of IP addresses. Publishing one and supporting it requires a commitment from Iterative + has operational challenges and costs.

This issue exists to gauge interest in publishing such a list. Please leave a 👍 or ❤️ reaction in case you would like this.
