WebGenerator

Easily generate probabilistic datasets of web interfaces and content. The tool lets you generate HTML files, their corresponding screenshots, and a JSON file with the labeled HTML elements, so you can train supervised and unsupervised models. You can also set the probabilities and generation options of each batch to suit your needs.

This development is kindly supported by the awesome SDAS Group.

Some selected examples

Example 1

Example 2

Example 3

A full dataset of 1000 elements at 800x600 size, generated with the tool, can be viewed here and downloaded here. The dataset contains a folder with CSS, js, and HTML files, image folders, and JSON files. The html directory holds the HTML files, named with the rw prefix (rw_0.html, row_1.html,.., row_n.html). The CSS folder contains the Bootstrap distribution file with the web page's color palette, plus another file with the CSS rules needed for the sidebar and any extra styling. The js folder contains the required jQuery and Bootstrap JavaScript files.
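As a rough illustration of how such a dataset might be consumed, the sketch below pairs each HTML file with a screenshot and a label file that share its name. The folder names `images` and `json`, the `.png` extension, and the idea that all three files share a stem are assumptions made for this example; adjust them to the layout of the dataset you actually downloaded.

```python
from pathlib import Path
from typing import Dict, Optional


def pair_samples(root: Path) -> Dict[str, Dict[str, Optional[Path]]]:
    """Map each HTML file stem to its (assumed) screenshot and JSON label file."""
    samples: Dict[str, Dict[str, Optional[Path]]] = {}
    for html_file in sorted((root / "html").glob("*.html")):
        stem = html_file.stem
        # Assumption: screenshots and labels reuse the HTML file's stem.
        screenshot = root / "images" / f"{stem}.png"
        labels = root / "json" / f"{stem}.json"
        samples[stem] = {
            "html": html_file,
            "screenshot": screenshot if screenshot.exists() else None,
            "labels": labels if labels.exists() else None,
        }
    return samples
```

Entries whose screenshot or label file is missing are kept with `None`, which makes incomplete samples easy to spot before training.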

Requirements

Browser and driver

The Chrome driver allows WebGenerator to manage browser instances in order to take the screenshots and create tag annotations of the inner HTML elements.

  1. If you already have a Chrome or Chromium browser installed, you can skip this step. Otherwise, you can download either a setup program or a zip file with the software. In this case we recommend downloading Chromium from this builds website. You should select "Archive" (Zip folder) or Installer.
  2. Next, download the Chrome Driver from here. Make sure the driver and the browser have the SAME VERSION. Once you have downloaded the driver, extract it and put the file in your browser's executable folder. If you installed Chrome, the path could be C:/Program Files/Google/Chrome/Application.
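The version requirement in step 2 can be checked with a small helper. This is a minimal sketch that compares only the major version number, which is an assumption about how strict the match needs to be; the function names are hypothetical and not part of WebGenerator.

```python
import re
from typing import Optional


def major_version(version_output: str) -> Optional[int]:
    """Extract the major version from text such as 'ChromeDriver 114.0.5735.90'."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    return int(match.group(1)) if match else None


def versions_match(browser_output: str, driver_output: str) -> bool:
    """Return True when both version strings share the same major version."""
    browser = major_version(browser_output)
    driver = major_version(driver_output)
    return browser is not None and browser == driver
```

The driver's version string can be obtained, for example, from `chromedriver --version`; the browser's version is shown on its "About" page.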

You can always check the official Selenium documentation.
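To give an idea of what the driver is used for, here is a minimal sketch of taking a screenshot of a local HTML file with Selenium and headless Chrome. It is not how WebGenerator's own screenshot code is written; the helper names and the default 800x600 window size are assumptions for illustration, and the requested window size may differ slightly from the final image size.

```python
from pathlib import Path


def file_url(html_path: str) -> str:
    """Turn a local HTML path into a file:// URL the browser can open."""
    return Path(html_path).resolve().as_uri()


def take_screenshot(html_path: str, png_path: str,
                    width: int = 800, height: int = 600) -> bool:
    """Open a local HTML file in headless Chrome and save a PNG screenshot."""
    # Imported here so file_url() stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")
    options.add_argument(f"--window-size={width},{height}")

    # Expects a Chrome driver that Selenium can locate on this machine.
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(file_url(html_path))
        return driver.save_screenshot(png_path)
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()
```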

Installation

Simply git clone this repository or download the zip folder:

git clone https://github.com/agsoto/webgenerator.git
cd webgenerator

Then install the dependencies:

pip install -r requirements.txt

Since the screen capturing feature depends on the Selenium driver, you should add the driver's path to the system's environment variables. See how to set environment variables on Windows and Mac. If you're using Linux, you can create a symbolic link instead: ln -s path-to-executable-driver chromedriver.

However, if you don't want to add an environment variable, you can set the path to the driver when using the ScreenShutter class:

ScreenShutter(driver_path="path-to-executable-driver")

This optional parameter can be set as shown on line 18 of the Main.py file.
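To decide which of the two options applies on a given machine, a quick check of the PATH can help. The sketch below uses only the standard library; `find_driver` is a hypothetical helper name, and the executable name `chromedriver` is assumed.

```python
import shutil
from typing import Optional


def find_driver(name: str = "chromedriver") -> Optional[str]:
    """Return the full path of the driver if it is on PATH, otherwise None."""
    return shutil.which(name)
```

If `find_driver()` returns `None`, the driver is not reachable through the environment, and passing `driver_path` explicitly is the simpler route.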

Execution

There's a code example of the use of the generator in the Main.py file. Once you're all set just run:

python ./Main.py

Potential Applications

This dataset has potential applications in web GUI generation. Here you will find three example deep learning models:

  • GAN: To generate web GUI images from WebGenerator images.
  • Fast RCNN: To detect components in images of web pages.
  • Pix2Pix: To generate web GUI images from image edges (Canny mask).

GAN

Faster RCNN

Pix2Pix

Generation Probabilities

The parameters for the WebLayoutProbabilities object (that is used for the generation), are described below.

| Param # | Name | Type | Description |
|---|---|---|---|
| 1 | with_sidebar_p | float | Probability that the Sidebar is present |
| 2 | with_header_p | float | Probability that the Header is present |
| 3 | with_navbar_p | float | Probability that the Navbar is present |
| 4 | with_footer_p | float | Probability that the Footer is present |
| 5 | layouts_p | list[4] | List with the probabilities for each possible layout. The sum of the probabilities should be 1 |
| 6 | boxed_body_p | float | Probability that the page's Body is boxed inside a container |
| 7 | big_header_p | float | Probability of having a big header (a big header is considered 50% or more of the screen height) |
| 8 | sidebar_first_p | float | Probability of the Sidebar being at the left side of the Body |
| 9 | navbar_first_p | float | Probability of the Navbar being above the Header |
| 10 | bg_color_classes_p | list[3] | List with the probabilities for the combination of Bootstrap's CSS background color classes. The sum of the probabilities should be 1 |
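The constraints in the table (single probabilities between 0 and 1, list entries summing to 1) can be checked before a batch is generated. The sketch below works on a plain dictionary rather than on the WebLayoutProbabilities object, whose constructor is not shown here; `validate_probabilities` and `sample_layout` are hypothetical helpers, and `sample_layout` only illustrates how a list such as layouts_p can drive a random choice, not how the tool samples internally.

```python
import math
import random
from typing import Dict, List, Union

Prob = Union[float, List[float]]

# Expected list lengths, taken from the parameter table.
LIST_PARAMS = {"layouts_p": 4, "bg_color_classes_p": 3}


def validate_probabilities(params: Dict[str, Prob]) -> List[str]:
    """Return a list of problems; an empty list means the values look consistent."""
    problems = []
    for name, value in params.items():
        if name in LIST_PARAMS:
            expected = LIST_PARAMS[name]
            if len(value) != expected:
                problems.append(f"{name}: expected {expected} entries, got {len(value)}")
            elif not math.isclose(sum(value), 1.0, abs_tol=1e-9):
                problems.append(f"{name}: probabilities sum to {sum(value)}, not 1")
        elif not 0.0 <= value <= 1.0:
            problems.append(f"{name}: {value} is outside [0, 1]")
    return problems


def sample_layout(layouts_p: List[float], rng: random.Random) -> int:
    """Draw a layout index with the given weights (illustrative only)."""
    return rng.choices(range(len(layouts_p)), weights=layouts_p, k=1)[0]
```

Passing a seeded `random.Random` instance keeps the draws reproducible, which is convenient when regenerating the same batch.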

Contributors

agsoto, magohector


webgenerator's Issues

AttributeError: module 'enum' has no attribute 'OrderedDict'

This was tested using Python versions 3.7 to 3.11 and by installing the dependencies from requirements.txt. When executing Main.py from the master branch, the following error is produced:

  File "./Main.py", line 1, in <module>
    from Randomization.WebLayout import WebLayoutProbabilities
  File "/home/user/dataset/generation/webgenerator/Randomization/WebLayout.py", line 3, in <module>
    from Layout.WebLayout import WebLayout
  File "/home/user/dataset/generation/webgenerator/Layout/WebLayout.py", line 10, in <module>
    from Layout.HtmlComponent import EmptyHtmlComponent
  File "/home/user/dataset/generation/webgenerator/Layout/HtmlComponent.py", line 3, in <module>
    from Core.Enums import *
  File "/home/user/dataset/generation/webgenerator/Core/Enums.py", line 9, in <module>
    class MeasureUnit(enum.OrderedDict):
AttributeError: module 'enum' has no attribute 'OrderedDict'

This appears to be resolved in the dev branch, where class MeasureUnit(enum.OrderedDict) has been changed to class MeasureUnit(enum.Enum).
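For context, OrderedDict is provided by the collections module and is not a documented member of enum, which is why the base class lookup fails. A minimal sketch of an Enum-based definition is shown below; the members and the `css_value` helper are hypothetical, since the actual contents of Core/Enums.py are not reproduced here.

```python
import enum


class MeasureUnit(enum.Enum):
    # Hypothetical members for illustration; the real ones live in Core/Enums.py.
    PIXEL = "px"
    PERCENT = "%"


def css_value(amount: int, unit: MeasureUnit) -> str:
    """Format a CSS length such as '50%' from an amount and a unit."""
    return f"{amount}{unit.value}"
```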
