simpatico's Introduction

DEPRECATED

Go to Simon's fork for a totally revamped version of this project.

simpatico

A C source code style checker.

What is it?

This style marker should enforce the rules laid out in 'csse2310-style-guide.pdf', which was created for the computer science course 'CSSE2310' at the University of Queensland.

Motivation

If/when this project becomes more reliable than the previous implementations of the automarker, the tutors of this course will switch to using 'simpatico' for their marking.

Also, many hours of marking time are quite expensive for such a menial task, so this project could save money as well as time and effort.

Current method

Currently a C++ program called 'vera++' tokenises the C source code input and feeds it to a very large Tcl script, which generates the errors.

This script has many issues and is terribly hard to modify. One of its major problems is that it generates a large number of style errors for validly styled C code.

After the automarker has finished, the course tutors must go through the generated errors extremely carefully to validate them. Very frequently the tutors make mistakes.

Error Format

Each style error must be declared in the format: filename:lineNumber: [CATEGORY] Description
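
As a rough illustration of this format, here is a minimal Python sketch. The StyleError record and its field names are assumptions made for this example and are not taken from simpatico's code.

```python
from collections import namedtuple

# Hypothetical record type; the field names are assumptions for illustration.
StyleError = namedtuple(
    "StyleError", "filename line_number category description")


def format_error(error):
    """Render one error as 'filename:lineNumber: [CATEGORY] Description'."""
    return "{0}:{1}: [{2}] {3}".format(
        error.filename, error.line_number, error.category, error.description)


example = StyleError("test.c", 3, "INDENTATION", "expected 4, got 8")
print(format_error(example))  # prints: test.c:3: [INDENTATION] expected 4, got 8
```

Keeping the error as a record and formatting it only at output time would let the same data be sorted or grouped by category before printing.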

The categories are roughly described below. More details are given in rules.md and in csse2310-style-guide.pdf.

NAMING

  • variables
  • defines
  • functions
  • typedefs

BRACES

  • space before brace
  • correct placement
  • correct alignment

INDENTATION

  • multiples of four spaces
  • nesting correctly indented
  • line continuation

WHITESPACE

  • grammatical spacing around assignment operators
  • correctly spaced vertically

COMMENTS

  • globals
  • functions (parameters esp.)
  • lengthy or tricky code

OVERALL

  • no function over 50 lines
  • modularity / no excessive duplication of code

LINE-LENGTH

  • all lines must be shorter than 80 chars (including \r)
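
A line-length check along these lines could be sketched as follows. This is only an illustration: the function name, the message wording, and the reading of the rule as "80 or more characters is a violation" are assumptions, not simpatico's actual behaviour.

```python
# Assumption: a line with 80 or more characters violates the rule.
MAX_LINE_LENGTH = 80


def check_line_lengths(filename, source):
    """Return LINE-LENGTH error strings for the given source text.

    Lines are split on '\n' only, so a trailing '\r' (Windows line
    endings) still counts towards the length of its line.
    """
    errors = []
    for number, line in enumerate(source.split("\n"), start=1):
        if len(line) >= MAX_LINE_LENGTH:
            errors.append(
                "{0}:{1}: [LINE-LENGTH] line is {2} chars long".format(
                    filename, number, len(line)))
    return errors
```

When reading a file in Python 3, opening it with newline="" keeps any '\r' characters in the text, so they are counted as the rule requires.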

Usage

./simpatico.py file1.c file2.c

simpatico's People

Contributors

jgat, pat-laub, sjshaw

simpatico's Issues

Incorrectly marks multiline assignments as style violations

Example:

int main() {
    char* test = "This is a really long string and I am just going to keep "
        "typing random stuff to make it more than 80 chars";
}

Outputs:

>>> check_all()
test.c:
   3: Indentation error (expected 4, got 8)

Possible fix:

  • Check whether the line contains a ';'. If it does not, the next line's indentation needs to be 8.
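
The possible fix above could be sketched roughly as below. The function and parameter names are hypothetical. Treating lines that end in '{' or '}' as complete is an extra assumption, and the issue does not say whether the 8 is absolute or relative to the enclosing block, so it is left as a parameter.

```python
def expected_indent(previous_line, block_indent, continuation_indent=8):
    """Guess the indentation expected for the line after previous_line.

    If the previous line ends a statement (';') or opens/closes a block
    ('{' or '}'), the next line uses the normal block indentation.
    Otherwise the statement is assumed to continue on the next line.
    """
    stripped = previous_line.rstrip()
    if not stripped or stripped.endswith((";", "{", "}")):
        return block_indent
    return continuation_indent


# The second line of the example has no ';', so line 3 may be indented by 8.
print(expected_indent('    char* test = "This is a really long string "', 4))
```

This does not handle comments, preprocessor directives, or labels such as 'case x:', which a real implementation would need to treat separately.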

Variables declared in separate .c/.h files

Oh yeah, and while I remember, my script currently fails to correctly identify variable declarations where the type of the variable is not declared in the current file (you'll see this if you run it over multi-file assignment 3 code).

I know why - it's because the variable naming code requires a list of types, which it builds through an initial parse of the file - but it's a bit annoying to fix. I think we'd have to switch to a proper tokeniser in this respect.

Sean Purdon

Need python wrapper around vera++ tokenizer

To get vera to output basic tokens you run 'vera++ -rule DUMP source.c' and
this will spit out something like the tokens below. To use in the main python script
we need a wrapper around either this output, or a replication of the original tcl
library (see http://www.inspirel.com/vera/ce/doc/tclapi.html), or something in
between.

There are a variety of ways to merge C++ code with python code, some of which
involve python changes only, others of which involve modifying the C++ (using
Boost or otherwise) so that C++ objects become 'native' inside python.

Note: we probably only have binaries for the vera++ Windows install, so unless we
create a CMake file to cross-compile the tokenizer, any C++ changes would
restrict the platform (which isn't the end of the world, but...)

We need to:

  1. decide on what functionality we need from the tokenizer in the python script
  2. choose which method of integration is appropriate
  3. actually code that up.

Example vera++ output:

Tokens in file test.c:
1/0 ccomment /* hello world */
1/17 newline

2/0 newline

3/0 pp_hheader #include <stdio.h>
3/18 newline

4/0 newline

5/0 identifier main
5/4 leftparen (
5/5 rightparen )
5/6 space
5/7 leftbrace {
5/8 newline

6/0 space
6/4 identifier printf
6/10 leftparen (
6/11 stringlit "hello, world\n"
6/27 rightparen )
6/28 semicolon ;
6/29 newline

7/0 space
7/4 return return
7/10 space
7/11 intlit 0
7/12 semicolon ;
7/13 newline

8/0 rightbrace }
8/1 newline

9/0 eof
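
One possible shape for a wrapper around this output is sketched below in Python. The Token record, the regular expression, and the parse_dump name are all assumptions for illustration; the sketch only relies on the 'line/column type text' layout visible in the example above.

```python
import re
from collections import namedtuple

# Hypothetical token record; the field names are assumptions.
Token = namedtuple("Token", "line column kind text")

# Matches lines such as '6/4 identifier printf' or '1/17 newline'.
TOKEN_LINE = re.compile(r"^(\d+)/(\d+) (\S+)(?: (.*))?$")


def parse_dump(output):
    """Turn the text printed by the DUMP rule into a list of Tokens."""
    tokens = []
    for raw in output.splitlines():
        match = TOKEN_LINE.match(raw)
        if match is None:
            # Header ('Tokens in file ...') or the blank line that
            # follows each newline token.
            continue
        line, column, kind, text = match.groups()
        tokens.append(Token(int(line), int(column), kind, text or ""))
    return tokens
```

Tokens whose text itself spans several lines (for example a multi-line block comment) would not be reassembled by this sketch, so it is a starting point rather than a complete parser.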

Variable name checking

Some bugs in checking variable names and struct names:

  • Globally declared variables do not appear to be checked
  • Function parameters are not checked
  • Multiple variables declared on the same line are not checked
  • Function pointers are not checked
  • struct names are not checked (e.g. typedef struct bobStruct { } or struct bobStruct { })

Need some unit tests

We really need some test cases to get this project started.

To start with, we don't need a really formal collection using python's 'unittest' framework;
we can start with some ad hoc tests as a basis and formalise later.
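
An ad hoc test setup of this kind might look like the Python sketch below. The count_long_lines function is only a stand-in checker written for this example; it is not simpatico's API, and the case names are made up.

```python
def run_ad_hoc_tests(check, cases):
    """Run check(source) on each case and return the names that fail.

    Each case is a (name, source, expected) tuple.
    """
    failures = []
    for name, source, expected in cases:
        if check(source) != expected:
            failures.append(name)
    return failures


def count_long_lines(source):
    """Stand-in checker: count lines with 80 or more characters."""
    return sum(1 for line in source.split("\n") if len(line) >= 80)


CASES = [
    ("short line passes", "int x;\n", 0),
    ("long line flagged", "/*" + "x" * 80 + "*/\n", 1),
]

print(run_ad_hoc_tests(count_long_lines, CASES))  # prints: []
```

Keeping the cases in a plain list should make it straightforward to move them into 'unittest' test methods later.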

Need someone to test the tokenizer_unix branch

Basically just need to check that the tokenizer (vera++) compiles on other people's machines (Linux/Mac).
It works on my Mac, but I might have forgotten about setup steps I've performed in the past.
