
color_cluster's Introduction


color_cluster's People

Contributors

giovana-morais


color_cluster's Issues

change how `.pkl` files are named

currently, files are saved simply as file_name.pkl, regardless of how many colors were extracted from them.

the problem is that the clustering method (kmeans or vector_quantization) can get confused when it finds a file with the right name but a different number of colors than the one it received as a parameter.

to fix this, it is probably enough to include the number of colors in the file name, for example file_name_3.pkl for just three clusters.
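a minimal sketch of that naming scheme, assuming the cache file lives next to the image unless another folder is given. the helper name `palette_cache_path` and the `cache_dir` parameter are hypothetical and do not come from the repository.

```python
from pathlib import Path


def palette_cache_path(image_path, n_colors, cache_dir=None):
    """Build the .pkl path for an image's palette, including the color count.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    image_path = Path(image_path)
    # Default to the image's own folder when no cache folder is given.
    directory = Path(cache_dir) if cache_dir is not None else image_path.parent
    # Putting n_colors in the name keeps palettes of different sizes apart.
    return directory / f"{image_path.stem}_{n_colors}.pkl"
```

with this, a palette extracted with three colors and one extracted with five colors from the same image end up in different files, so a lookup for one size can no longer pick up the other.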

ignore .pickle files when reading images

when a .pickle file is in the same folder as files with image extensions such as .jpg or .jpeg, the program crashes because it tries to apply image functions to it.

maybe save pickle files in another folder, or just ignore them when reading the images.
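the second option could look roughly like the sketch below, which keeps only files whose extension is on an allow-list. the function name `list_image_files` and the exact set of extensions are assumptions, not something the project defines.

```python
from pathlib import Path

# Assumed allow-list; extend it with whatever formats the project supports.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def list_image_files(folder):
    """Return the image files in `folder`, skipping .pickle/.pkl and anything else."""
    return sorted(
        path
        for path in Path(folder).iterdir()
        # Compare lowercase suffixes so "photo.JPG" is accepted as well.
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )
```

an allow-list is used instead of excluding `.pickle` explicitly, so other stray files in the folder are skipped too.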

calculate image similarity

after getting every image's palette, we need to compare them and check how similar they are, so we can group the images.

currently, we have a method get_image_clusters that should do this, but it fails because of the clusters' dimensions. i'm not sure if we can reuse the method or if we'll have to implement another one from scratch, but this image-grouping method must be implemented.

major steps

  1. refactor get_colors
  2. order palettes so comparison may be easier (issue #7)
  3. refactor/rewrite get_image_clusters
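one possible way to make two palettes comparable, sketched under assumptions: sort each palette by an approximate brightness value, then average the Euclidean distance between colors at the same position. the ordering criterion is a guess (issue #7 is not reproduced here), and `order_palette` and `palette_distance` are hypothetical names, not existing functions in the project.

```python
import math


def luminance(color):
    """Approximate brightness of an (r, g, b) color using Rec. 709 weights."""
    r, g, b = color
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def order_palette(palette):
    """Sort colors from darkest to brightest so positions line up across palettes."""
    return sorted(palette, key=luminance)


def palette_distance(palette_a, palette_b):
    """Mean Euclidean distance between two equally sized, ordered palettes."""
    if not palette_a or len(palette_a) != len(palette_b):
        raise ValueError("palettes must be non-empty and have the same number of colors")
    pairs = zip(order_palette(palette_a), order_palette(palette_b))
    return sum(math.dist(a, b) for a, b in pairs) / len(palette_a)
```

a smaller distance means more similar palettes. requiring equal sizes keeps every palette at the same dimension before comparison, which is where the current method is reported to fail.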

set `black` as code formatter

it would be nice to standardize the code using a formatter like black. flake8 may also be an option for getting nicer, more readable code.

identify color palette with Pillow

Pillow has a quantize method that converts an image to use only the n dominant colors. although I believe this method ALSO uses KMeans under the hood, I think it would be nice to try it too and then compare performance.
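a rough sketch of extracting a palette with `Image.quantize`, assuming an RGB input. as far as Pillow's documentation goes, the default method for RGB images is median cut, and k-means refinement is only applied when the `kmeans` argument is set, so it may be worth checking that before comparing it against the KMeans approach. the function name `dominant_colors` is hypothetical.

```python
from PIL import Image


def dominant_colors(image, n_colors):
    """Return up to n_colors (r, g, b) tuples, most frequent first."""
    # quantize() yields a palette ("P" mode) image with at most n_colors entries.
    quantized = image.convert("RGB").quantize(colors=n_colors)
    palette = quantized.getpalette()  # flat list: [r0, g0, b0, r1, g1, b1, ...]
    counts = quantized.getcolors()    # list of (pixel_count, palette_index)
    return [
        tuple(palette[3 * index: 3 * index + 3])
        for _, index in sorted(counts, reverse=True)
    ]
```

usage would be along the lines of `dominant_colors(Image.open(path), 3)`, where `path` points to one of the images being analyzed.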

create README

it must contain a detailed explanation of the following cases:

  1. argument is a folder
  • if this happens, images should be grouped by their similarity and saved in different folders to represent the clusters. or maybe just output a list of each cluster's members.
  2. argument is a file
  • if only a file is provided, the code should return only the clusters of that image.
  3. a file and a folder are passed
  • basically, the program should take the image and return ONLY the folder images related to that image.
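the three cases could be told apart with something like the sketch below, which only inspects the paths it receives and does no clustering itself. the function name `select_mode` and the returned labels are made up for illustration.

```python
from pathlib import Path


def select_mode(paths):
    """Classify the command-line paths into one of the three README cases."""
    files = [p for p in paths if Path(p).is_file()]
    folders = [p for p in paths if Path(p).is_dir()]
    if len(paths) == 2 and len(files) == 1 and len(folders) == 1:
        return "related_images"  # a file and a folder are passed
    if len(paths) == 1 and folders:
        return "group_folder"    # argument is a folder
    if len(paths) == 1 and files:
        return "image_clusters"  # argument is a file
    raise ValueError("expected a file, a folder, or one file plus one folder")
```

raising on any other combination keeps unsupported inputs, such as two folders or a missing path, from silently falling into one of the documented cases.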
