The CodeGen paper & GitHub repo trained and released code-generation models of various sizes (up to 16.1B parameters), focusing primarily on Python code.
I found that the fastest & smallest multi-language model would get confused between programming languages, so I wanted to fine-tune it to focus on a single language.
I took the smallest model (350M parameters) and fine-tuned it twice:
- on HTML from The Stack dataset (model card: 350m_html)
- on CSS from The Stack dataset (model card: 350m_css)
Each model is an autoregressive transformer that predicts the next token given all previous tokens.
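To make the autoregressive idea concrete, here is a minimal sketch of the greedy decoding loop: the model repeatedly predicts one next token from the full context and appends it. The `toy_next_token` function is a hypothetical stand-in for the real transformer, used purely for illustration.

```python
def generate(next_token, prompt, max_new_tokens):
    """Greedy autoregressive loop: extend `prompt` one token at a time."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tok = next_token(tokens)  # predict from the full context so far
        if tok is None:           # end-of-sequence signal
            break
        tokens.append(tok)
    return tokens

# Toy "model": completes "<div" with ">" and then stops.
def toy_next_token(tokens):
    table = {"<div": ">", ">": None}
    return table.get(tokens[-1])

print(generate(toy_next_token, ["<div"], 5))  # → ['<div', '>']
```

The real models do the same thing, except the next-token choice comes from a 350M-parameter transformer sampling over a code vocabulary rather than a lookup table.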
The first model generates the HTML, which is then fed into the CSS model to generate the styling.
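The chaining can be sketched as follows. `html_model` and `css_model` are hypothetical callables standing in for the two fine-tuned checkpoints (e.g. wrapping a `transformers` generation call); the key point is only that stage two is conditioned on stage one's output.

```python
def generate_page(html_model, css_model, prompt):
    """Two-stage pipeline: generate structure first, then style it."""
    html = html_model(prompt)  # stage 1: HTML model produces the markup
    css = css_model(html)      # stage 2: CSS model is prompted with that markup
    return f"<style>{css}</style>\n{html}"

# Stub models for illustration only.
html_stub = lambda p: f"<button>{p}</button>"
css_stub = lambda h: "button { color: red; }"

print(generate_page(html_stub, css_stub, "Sign up"))
```

In practice each stub would be replaced by a call that tokenizes the prompt, runs `model.generate`, and decodes the result with the matching checkpoint.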