haskell-css-syntax's Introduction

This package provides functions to parse a CSS file into a stream of tokens, and from that into rules or declarations. The exact algorithm is defined in the CSS Syntax Module Level 3 specification (it is the same algorithm that Blink has used since mid-2015, when it replaced its old Bison-based parser).

Note: Only the tokenizer is currently implemented here. Parsing of the token stream into rules or declarations is available in Stylist Traits.

Motivation

I needed a library which would allow me to process a CSS file, modify all image URLs, and write the result to disk. Existing CSS libraries for Haskell either focus only on generating CSS (through a DSL) or parse the source into a representation that is too high-level, which makes this kind of processing difficult.

A token stream is just the right level of abstraction for that task. The spec defines <url-token> and <function-token>, which is what I needed to extract the image URLs from the source.

More advanced processing likely requires a higher-level abstraction than the token stream provides. For example, to expand vendor prefixes you need to parse the token stream into a list of rules and declarations, so you can pick out the declarations you want to process.

Motivation 2

I (the second author) needed to preprocess HTML in real time to make it responsive. Among other things, this requires parsing style="..." attributes, which can contain any amount of junk, so I optimized the parser/serializer heavily while keeping all the tests passing.

Motivation 3

Haskell CSS Syntax is an adopted component of the Argonaut Stack browser engine. A full styling engine can readily be implemented around it!

Tokenizer

The tokenizer uses a fast, low-level parser (20-50 MB/s on average CSS files) to convert the input to a list of tokens. This process removes all comments and collapses consecutive whitespace into a single space character (U+0020). There may be other cases where the tokenizer loses information from the input stream.
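As a quick sketch of that behaviour, using the same tokenize function from Data.CSS.Syntax.Tokens shown in the example further down, a comment and a run of whitespace both shrink away (exact token constructor names depend on the library version):

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.CSS.Syntax.Tokens (tokenize)

main :: IO ()
main =
    -- The comment is dropped, and each run of spaces and newlines
    -- collapses into a single whitespace token.
    print (tokenize "a   {\n  /* comment */  color: red;\n}")
```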

Serializer

The serializer converts a list of tokens back into a string. Serialization round-trips: tokenizing an input produces the same token list as tokenizing, serializing, and tokenizing it again. The tokenize-serialize pair runs at about 15 MB/s or more.
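The round-trip property can be stated directly as a Haskell predicate (a sketch, assuming tokenize and serialize as exported by Data.CSS.Syntax.Tokens and a derived Eq instance for Token):

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.CSS.Syntax.Tokens (tokenize, serialize)
import Data.Text (Text)

-- Tokenizing an input yields the same token list as tokenizing,
-- serializing, and tokenizing it again. Note the stronger property
-- (serialize . tokenize == id) does NOT hold, since comments and
-- redundant whitespace are dropped during tokenization.
roundTrips :: Text -> Bool
roundTrips css = tokenize (serialize (tokenize css)) == tokenize css
```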

Example

In the following example I replace all URLs in the source CSS file with a link to a predefined image.

import qualified Data.Text.IO as T
import           Data.CSS.Syntax.Tokens (Token(..), tokenize, serialize)

main :: IO ()
main = do
    source <- T.readFile "path-to-your.css"
    let tokens = tokenize source

    putStrLn $ "The CSS file has " ++ show (length tokens)
        ++ " tokens"

    let newTokens = map f tokens

    T.writeFile "path-to-lolcatted.css" $
        serialize newTokens

f :: Token -> Token
f (Url _) = Url "http://lol.cat/img.png"
f x       = x

haskell-css-syntax's People

Contributors

alcinnz, vshabanov, werehamster


haskell-css-syntax's Issues

Please publish fixes to support Text v2 to Hackage

I've been making great use of your CSS lexer to develop my own (now NLnet-funded) browser engine, & am very grateful for your work. Thank you!

But now I've got another component depending on Text v2 (which has switched from UTF-16 to UTF-8 native encoding), leading to a dependency clash holding me up from publishing a component. You've recently accepted a pull request addressing this, but the changes are not yet on Hackage.

I am willing to take over this project if you need me to.

Parser

Hi,
is this project still active? Do you plan to include a parser that takes the token stream and spits out a higher-level representation of the CSS?
Btw, I believe there's a "not" missing in "Parsing the token stream into rules or declarations is available as of yet." in your readme.

I am looking for a way to find all @import declarations in a file, and am not sure how to do that on the token level.
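At the token level, one workable approach is to scan for an AtKeyword "import" token and pick up the url- or string-token that follows it. A sketch (the constructor names AtKeyword, Url, String, and Whitespace are assumed from this library's Token type; note that @import url("...") with a quoted argument tokenizes as a Function "url" token instead, which this sketch does not handle):

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.CSS.Syntax.Tokens (Token(..), tokenize)
import Data.Text (Text)

-- Collect the target of every @import at the token level:
-- an AtKeyword "import" token followed, after optional
-- whitespace, by either a <url-token> or a <string-token>.
importTargets :: [Token] -> [Text]
importTargets (AtKeyword "import" : rest) =
    case dropWhile (== Whitespace) rest of
        Url u    : rest' -> u : importTargets rest'
        String s : rest' -> s : importTargets rest'
        rest'            -> importTargets rest'
importTargets (_ : rest) = importTargets rest
importTargets []         = []
```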
