data-frame's Introduction

Lisp-Stat

An environment for statistical computing
Explore the docs »

Report Bug · Request Feature · Reference Manual

Table of Contents

  1. About the Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Resources
  6. Contributing
  7. License
  8. Contact

About the Project

Lisp-Stat provides support for vectorized mathematical operations, and a comprehensive set of statistical methods that are implemented using the latest numerical algorithms. In addition, Common Lisp provides a dynamic programming environment (REPL), an excellent object-oriented facility (CLOS) and meta-object protocol (MOP).

Lisp-Stat is fully functional today, and most of the XLISP-STAT libraries can be ported with the aid of a compatibility package, XLS-compat. This gives Lisp-Stat a head start on ecosystem development.

Getting Started

To get a local copy up and running follow these steps:

Prerequisites

An ANSI Common Lisp implementation. Developed and tested with SBCL and CCL.

Note: CCL is in poor condition these days, and we can no longer support it due to serious problems with numerical accuracy. See issue 390 for just one of the problems. A shame, because it's a great environment to work in.

Installation

Lisp-Stat is composed of several systems that are designed to be independently useful. For example, you can use select to obtain selections from two-dimensional arrays without loading all of Lisp-Stat.
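As a rough illustration of using one system on its own, the sketch below loads only select and takes a row and a column from a two-dimensional array. It assumes Quicklisp is installed and that the select system is available in your dist; the variable name *m* is made up for this example.

```lisp
;; Assumption: Quicklisp is installed and can find the SELECT system.
(ql:quickload :select)

;; A small 3x3 array to select from (hypothetical example data).
(defparameter *m* #2A((1 2 3)
                      (4 5 6)
                      (7 8 9)))

;; T selects every index along an axis; a single integer drops that axis.
(select:select *m* 0 t)   ; the first row
(select:select *m* t 1)   ; the second column
```

Nothing else from Lisp-Stat needs to be loaded for this to work.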

The easy way

Lisp-Stat has many dependencies, and the easiest way to load it is with a package manager, such as Quicklisp or CLPM. The install is a one-liner:

(clpm-client:sync :sources "clpi") ;sources may vary
(ql:quickload :lisp-stat)

From source

To make the system accessible to ASDF (a build facility, similar to make in the C world), clone the repository in a directory ASDF knows about. By default the common-lisp directory in your home directory is known. Create this if it doesn't already exist and then:

  1. Clone the repositories
cd ~/common-lisp && \
git clone https://github.com/Lisp-Stat/data-frame.git && \
git clone https://github.com/Lisp-Stat/dfio.git && \
git clone https://github.com/Lisp-Stat/special-functions.git && \
git clone https://github.com/Lisp-Stat/numerical-utilities.git && \
git clone https://github.com/Lisp-Stat/array-operations.git && \
git clone https://github.com/Lisp-Stat/documentation.git && \
git clone https://github.com/Lisp-Stat/distributions.git && \
git clone https://github.com/Lisp-Stat/plot.git && \
git clone https://github.com/Lisp-Stat/select.git && \
git clone https://github.com/Lisp-Stat/cephes.cl.git && \
git clone https://github.com/Symbolics/alexandria-plus && \
git clone https://github.com/Lisp-Stat/statistics.git && \
git clone https://github.com/Lisp-Stat/lla.git && \
git clone https://github.com/Lisp-Stat/smoothers && \
git clone https://github.com/Lisp-Stat/lisp-stat.git
  2. Reset the ASDF source-registry to find the new system (from the REPL)
    (asdf:clear-source-registry)
  3. Load the system
    (asdf:load-system :lisp-stat)
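If loading fails, it can help to check which of the cloned systems ASDF can actually see. The following is a minimal sketch of such a check; the helper name missing-systems is hypothetical and not part of Lisp-Stat or ASDF. It relies only on asdf:find-system, which returns NIL instead of signalling an error when its second argument is NIL.

```lisp
(require :asdf)

;; MISSING-SYSTEMS is a hypothetical helper written for this example.
(defun missing-systems (names)
  "Return the subset of NAMES that ASDF cannot locate."
  (remove-if (lambda (name) (asdf:find-system name nil)) names))

;; List any of these systems that are not yet visible to ASDF.
(missing-systems '(:data-frame :select :lisp-stat))
```

An empty result suggests the clones are in a directory ASDF knows about; any names returned point to repositories that are missing or misplaced.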

If you have installed the slime ASDF extensions, you can invoke this with a comma (',') from the slime REPL.

You'll need to use Quicklisp, CLPM or manually obtain the remaining third-party dependencies.

Running Tests

To run the lisp-stat tests, evaluate this form: (asdf:test-system :lisp-stat)

Usage

Create a data frame from a file named sg-weather.csv on the local disk:

(defparameter *df*
	(read-csv #P"LS:DATA;sg-weather.csv"))

For more examples, please refer to the Documentation.

Roadmap

See the open issues for a list of proposed features (and known issues).

Resources

This system is part of the Lisp-Stat project; that should be your first stop for information. Also see the community page for more information.

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated. Please see CONTRIBUTING for details on the code of conduct, and the process for submitting pull requests.

License

Distributed under the MS-PL License. See LICENSE for more information.

Contact

Project Link: https://github.com/lisp-stat/lisp-stat

data-frame's People

Contributors

snunez1, tpapp

data-frame's Issues

Data Frame - summarize-columns example

in https://lisp-stat.dev/docs/getting-started/data-frame/ documentation

(asdf:load-system :lisp-stat)
(in-package :ls-user)
(defdf mtcars (read-csv rdata:mtcars))
(summarize-column 'mtcars:mpg)

The value
#(21 21 22.8d0 21.4d0 18.7d0 18.1d0 14.3d0 24.4d0 22.8d0
19.2d0 17.8d0 16.4d0 17.3d0 15.2d0 10.4d0 10.4d0 14.7d0
32.4d0 30.4d0 33.9d0 21.5d0 15.5d0 15.2d0 13.3d0 19.2d0
27.3d0 26 30.4d0 15.8d0 19.7d0 15 21.4d0)

is not of type
SYMBOL
[Condition of type TYPE-ERROR]

Restarts:
0: [RETRY] Retry SLIME REPL evaluation request.
1: [*ABORT] Return to SLIME's top level.
2: [ABORT] abort thread (#<THREAD "repl-thread" RUNNING {25C391B1}>)

Backtrace:
0: (DATA-FRAME::SUMMARIZE-GENERIC-VARIABLE #(21 21 22.8d0 21.4d0 18.7d0 18.1d0 ...))
1: (SB-INT:SIMPLE-EVAL-IN-LEXENV (SUMMARIZE-COLUMN (QUOTE MTCARS:MPG)) #)
2: (EVAL (SUMMARIZE-COLUMN (QUOTE MTCARS:MPG)))
--more--

Some systems failed to build for Quicklisp dist

Building with SBCL 2.0.5 / ASDF 3.3.1 for quicklisp dist creation.

Trying to build commit id 6c747d6

data-frame fails to build with the following error:

; caught ERROR:
;   READ error during COMPILE-FILE: Symbol "MAP-ARRAY" not found in the ARRAY-OPERATIONS/ALL package. Line: 25, Column: 19, File-Position: 754 Stream: #<SB-INT:FORM-TRACKING-STREAM for "file /home/quicklisp/quicklisp-controller/dist/build-cache/data-frame/13a7e77710f0eb439a8ff4c9b4241e8613d79087/data-frame-20210523-git/src/missing.lisp" {1006B66DE3}>
...
Unhandled UIOP/LISP-BUILD:COMPILE-FILE-ERROR in thread #<SB-THREAD:THREAD "main thread" RUNNING {1000A18083}>: COMPILE-FILE-ERROR while compiling #<CL-SOURCE-FILE "data-frame" "missing">

data-frame/tests fails to build because of a failure in data-frame.

Full log here

Some systems failed to build for Quicklisp dist

Building with SBCL 2.3.6.173-55d27b14b / ASDF 3.3.5 for quicklisp dist creation.

Trying to build commit id d378e9f

data-frame fails to build with the following error:

Unhandled SB-INT:SIMPLE-FILE-ERROR in thread #<SB-THREAD:THREAD tid=3619261 "main thread" RUNNING {1001710003}>: Failed to find the TRUENAME of /home/quicklisp/quicklisp-controller/dist/build-cache/data-frame/a705d11eecd0ad126891bc05ffce80053df27a3f/data-frame-20230902-git/src/random-sample.lisp: No such file or directory

data-frame/tests fails to build with the following error:

Unhandled SB-INT:SIMPLE-FILE-ERROR in thread #<SB-THREAD:THREAD tid=3619257 "main thread" RUNNING {1001710003}>: Failed to find the TRUENAME of /home/quicklisp/quicklisp-controller/dist/build-cache/data-frame/a705d11eecd0ad126891bc05ffce80053df27a3f/data-frame-20230902-git/src/random-sample.lisp: No such file or directory

Full log here

No applicable method for generic function column-length when given data-frame

Hello, I'm trying to run lisp-stat on linux, in SBCL 2.4.0 with quicklisp 2021-02-13 and asdf 3.3.1. I'm trying to run the example labeled "Simple Bar Chart" in the plotting examples of the docs. When I try to run

(ql:quickload :lisp-stat)
(ql:quickload :plot/vega)
(plot:plot
 (vega:defplot simple-bar-chart
   `(:mark :bar
     :data (:values ,(plist-df '(:a #(A B C D E F G H I)
                                 :b #(28 55 43 91 81 53 19 87 52))))
     :encoding (:x (:field :a :type :nominal :axis ("labelAngle" 0))
                :y (:field :b :type :quantitative)))))

I get the error:

There is no applicable method for the generic function
  #<STANDARD-GENERIC-FUNCTION DATA-FRAME::COLUMN-LENGTH (1)>
when called with arguments
  (#<DATA-FRAME (9 observations of 2 variables)>).
   [Condition of type SB-PCL::NO-APPLICABLE-METHOD-ERROR]

I'm still new to Common Lisp, so I'm wondering if something in my environment is different, since I don't see a method for data-frame here on GitHub either.

[BUG] [CSV] First Column Cannot be removed

The first column cannot be removed from the dataframe read from csv file.

#20

I see. I think it's great that you're learning Common Lisp with Lisp-Stat. Lisp-Stat is stable enough to be used for analysis, but the edge cases aren't well covered in the test suites. For example removing all the columns of a data-frame isn't something that you normally see in practice because that would leave you no data to work with! You can remove one column though: (remove-columns df1 '(name)) works. The examples in the IPS (Introduction to the Practice of Statistics) repo might be helpful if you're looking to use JupyterLab for your analysis.

However, I'm glad you're testing these things! We'll fix everything you find. Would you mind opening up a new issue for this latest bug?

defdf unlessf problem

Apparently I'm in some sort of version hell with various latest repos from github HEADs for data-frame, plot, numerical-utilities, etc... But that's neither here nor there (sort of).

My problem is that defdf-env uses the unlessf macro which used to live in numerical-utilities. Only it's not there in the latest. Now it's in alexandria+ (good luck googling that).

There's no indication where this symbol comes from. Apparently one is supposed to know that this is in alexandria+ (occasionally called "alexandria-plus") which has its own repo. Can we at least document where random symbols like this are coming from? Maybe use alexandria+:unlessf at the code site instead of just importing it?

If I sound grumpy it's because trying to build plot is a nightmare for someone with decades of lisp experience. Not exactly fun for new users, I'd imagine...

Some systems failed to build for Quicklisp dist

Building with SBCL 2.3.6 / ASDF 3.3.5 for quicklisp dist creation.

Trying to build commit id 74fbfa8

data-frame fails to build with the following error:

; caught ERROR:
;   READ error during COMPILE-FILE: Symbol "MAP-ARRAY" not found in the ARRAY-OPERATIONS/ALL package. Line: 28, Column: 19, File-Position: 830 Stream: #<SB-INT:FORM-TRACKING-STREAM for "file /home/quicklisp/quicklisp-controller/dist/build-cache/data-frame/e068a24f92b8e1cec7a5d2064866edf60b6aef11/data-frame-20230722-git/src/missing.lisp" {1013538ED3}>
...
Unhandled UIOP/LISP-BUILD:COMPILE-FILE-ERROR in thread #<SB-THREAD:THREAD tid=1026045 "main thread" RUNNING {1001770003}>: COMPILE-FILE-ERROR while compiling #<CL-SOURCE-FILE "data-frame" "missing">

data-frame/tests fails to build because of a failure in data-frame.

Full log here

Possibly make aops:dims ignore row-names column

aops:dims for data-frame includes the column typically called row-name. This is often in a CSV with a blank column name, e.g. "". By convention this is typically the row names for the data set. R ignores these, and dplyr has a small toolbox for working with row names.

How we handle row names should be thought through before making changes, so I'm leaving this as is for the moment and living with the inconsistency between what's reported by data-frame (which removes the row-name column) and the more array/matrix oriented aops:dims.

Upgrade to clunit2

clunit is abandoned, and clunit2 is its replacement. When testing under clunit2, the deffixture is broken and many of the tests fail because the variables defined there do not exist.

Windowing functions

Does data-frame implement a facility for easily computing windowing functions? For example in python/pandas we can trivially calculate a rolling moving average with the baked-in data frame API. Pandas contains a rich set of common window functions and also lets you supply your own function via lambda.

What's nice is the library handles bounds-checking and whatnot automatically in a convenient and ergonomic manner.

Is there a way to do this with data-frame? If there's no built-in, what's the "canonical" way to iterate over two columns in lockstep, using one to compute the values of the other? I'm visualizing a pretty messy LOOP construct and hoping there's an easier way.
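For reference, here is one possible sketch in plain Common Lisp, not a data-frame facility. It assumes each column is available as an ordinary vector (how to extract it from a data frame is not shown here); the function name rolling-mean and the variable names are hypothetical.

```lisp
;; ROLLING-MEAN is a hypothetical helper, not part of data-frame.
(defun rolling-mean (column window)
  "Return a vector the same length as COLUMN. Element i holds the mean of the
WINDOW values ending at index i, or NIL where fewer than WINDOW values exist."
  (let* ((n (length column))
         (result (make-array n :initial-element nil))
         (sum 0))
    (dotimes (i n result)
      (incf sum (aref column i))                 ; add the newest value
      (when (>= i window)
        (decf sum (aref column (- i window))))   ; drop the value leaving the window
      (when (>= i (1- window))
        (setf (aref result i) (/ sum window))))))

;; Iterate over two columns in lockstep with MAP, which accepts several sequences.
(defparameter *prices* #(1 2 3 4 5))
(defparameter *means* (rolling-mean *prices* 3))
(defparameter *deviation*
  (map 'vector
       (lambda (price mean) (and mean (- price mean)))
       *prices* *means*))
```

The running sum keeps the cost linear in the column length, and the NIL entries mark positions where the window is not yet full, so no separate bounds checking is needed when combining the columns.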

Some systems failed to build for Quicklisp dist

Building with SBCL 2.0.5 / ASDF 3.3.1 for quicklisp dist creation.

Trying to build commit id 6c747d6

data-frame fails to build with the following error:

; caught ERROR:
;   READ error during COMPILE-FILE: Symbol "MAP-ARRAY" not found in the ARRAY-OPERATIONS/ALL package. Line: 25, Column: 19, File-Position: 754 Stream: #<SB-INT:FORM-TRACKING-STREAM for "file /home/quicklisp/quicklisp-controller/dist/build-cache/data-frame/13a7e77710f0eb439a8ff4c9b4241e8613d79087/data-frame-20210528-git/src/missing.lisp" {1006B6EDE3}>
...
Unhandled UIOP/LISP-BUILD:COMPILE-FILE-ERROR in thread #<SB-THREAD:THREAD "main thread" RUNNING {1000A18083}>: COMPILE-FILE-ERROR while compiling #<CL-SOURCE-FILE "data-frame" "missing">

data-frame/tests fails to build because of a failure in data-frame.

Full log here

Some systems failed to build for Quicklisp dist

Building with SBCL 2.2.7.28-02bc916fd / ASDF 3.3.5 for quicklisp dist creation.

Trying to build commit id b916229

data-frame fails to build with the following error:

; caught ERROR:
;   READ error during COMPILE-FILE: Package SB-CLTL2 does not exist. Line: 565, Column: 44, File-Position: 23051 Stream: #<SB-INT:FORM-TRACKING-STREAM for "file /home/quicklisp/quicklisp-controller/dist/build-cache/data-frame/0afa85af3d8f216024d8a7bf148125780a7d7725/data-frame-20220926-git/src/data-frame.lisp" {10136D6A93}>
...
Unhandled UIOP/LISP-BUILD:COMPILE-FILE-ERROR in thread #<SB-THREAD:THREAD "main thread" RUNNING {10016C8003}>: COMPILE-FILE-ERROR while compiling #<CL-SOURCE-FILE "data-frame" "data-frame">

data-frame/tests fails to build because of a failure in data-frame.

Full log here

Data Frame 'Summary' function Error

in https://lisp-stat.dev/docs/getting-started/data-frame/ documentation

(asdf:load-system :lisp-stat)
(in-package :ls-user)
(defdf mtcars (read-csv rdata:mtcars))
(summary mtcars)

The value
#(21 21 22.8d0 21.4d0 18.7d0 18.1d0 14.3d0 24.4d0 22.8d0
19.2d0 17.8d0 16.4d0 17.3d0 15.2d0 10.4d0 10.4d0 14.7d0
32.4d0 30.4d0 33.9d0 21.5d0 15.5d0 15.2d0 13.3d0 19.2d0
27.3d0 26 30.4d0 15.8d0 19.7d0 15 21.4d0)

is not of type
SYMBOL
[Condition of type TYPE-ERROR]

Restarts:
0: [RETRY] Retry SLIME REPL evaluation request.
1: [*ABORT] Return to SLIME's top level.
2: [ABORT] abort thread (#<THREAD "new-repl-thread" RUNNING {2511E8F9}>)

Backtrace:
0: (DATA-FRAME::SUMMARIZE-GENERIC-VARIABLE #(21 21 22.8d0 21.4d0 18.7d0 18.1d0 ...))
1: ((:METHOD SUMMARY (DATA-FRAME)) #<DATA-FRAME MTCARS (32 observations of 12 variables)> #<SYNONYM-STREAM :SYMBOL SWANK::CURRENT-STANDARD-OUTPUT {23A70B79}>) [fast-method]
2: (SB-INT:SIMPLE-EVAL-IN-LEXENV (SUMMARY MTCARS) #)
3: (EVAL (SUMMARY MTCARS))
--more--
