
stdlib's People

Contributors

14ngiestas, adenchfi, aman-godara, aradi, arjenmarkus, awvwgk, certik, chuckyvt, degawa, ejovo13, fiolj, gareth-nx, ghbrown, hsnyder, hugomvale, ivan-pi, jalvesz, jhenneberg, jim-215-fisher, jvdp1, lewisfish, mardiehl, milancurcic, nshaffer, perazz, sakamoti, scivision, wclodius2, zbeekman, zoziha

stdlib's Issues

Proposal for bit-reproducible numerical operations

@marshallward wrote in #1 (comment):

I would like to see greater support for bit-reproducible numerical operations. This is a very high priority for us since our models are used in weather and climate forecasting, and much of our time is devoted to bit reproducibility of our numerical calculations.

A common problem is intrinsic Fortran reduction operations like sum(), where the order is ambiguous (deliberately, one might say), and therefore not reproducible. A more serious problem for us is transcendental functions, like exp() or cos(), which will give different results for different optimizations, and we typically cannot say where it was invoked (libm? Vendor library? etc.).

A standard library may be a place to provide bit-reproducible implementations.
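To illustrate the reduction part of this, here is a minimal sketch of a summation with a fixed left-to-right order. The module and function names (`repro_sum_sketch`, `ordered_sum`) are hypothetical, not an existing stdlib API. It only pins the order at the source level; a compiler run with value-unsafe optimizations (fast-math style flags) may still reassociate the additions, so the build flags matter too.

```fortran
module repro_sum_sketch
  use, intrinsic :: iso_fortran_env, only: dp => real64
  implicit none
  private
  public :: ordered_sum   ! hypothetical name, for illustration only

contains

  !> Sum the elements of x strictly from the first to the last element.
  pure function ordered_sum(x) result(s)
    real(dp), intent(in) :: x(:)
    real(dp) :: s
    integer :: i

    s = 0.0_dp
    do i = 1, size(x)
      s = s + x(i)   ! the order of additions is spelled out explicitly
    end do
  end function ordered_sum

end module repro_sum_sketch
```

Unlike the intrinsic sum(), the result here does not depend on how the processor chooses to group the terms, as long as the compiler honours the written order.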

How to implement same procedures for different numeric kinds

This question comes up in #34 and elsewhere. How to implement specific procedures that work on different kinds (sp, dp, qp, int8, int16, int32, int64) as well as characters, where the body of the procedure is the same (can be copy/pasted entirely without breaking it). Let's first just focus on this scenario, and we can consider more complex cases later.

I know of a few approaches:

  1. Repeat the code, that is, implement all specific procedures explicitly. That's what I did in functional-fortran, see https://github.com/wavebitscientific/functional-fortran/blob/master/src/lib/mod_functional.f90. Repeating is fine if you do it once and forget about it. The upside is that you can see the specific code and it needs no extra tooling. The downside is combinatorial explosion if you have procedures that are to handle all combinations of types and kinds. Most procedures are rather simple (one or two arguments), and I ended up with > 3K lines of code for 23 generic procedures. Most work was in editing the argument types to specific procedures, and less work was in copy/pasting of the repeatable content. I don't recommend this approach for stdlib.

  2. Approach 1 can be somewhat eased by explicitly typing out the interfaces, and using #include 'procedure_body.inc', defined in a separate file. Then your procedure body collapses to one line. This reduces the total amount of code, but not so much the amount of work needed, as most work is in spelling out the interfaces. This approach still doesn't need extra tooling as a C preprocessor comes with all compilers that I'm aware of.

  3. Use a custom preprocessor or templating tool. For example, take a function that returns the set of unique elements of an array:

pure recursive function set(x) result(res)
  integer, intent(in) :: x(:) !! Input array
  integer, allocatable :: res(:)
  if(size(x) > 1)then
    res = [x(1), set(pack(x(2:), .not. x(2:) == x(1)))]
  else
    res = x
  endif
end function set

A template could look like this:

pure recursive function set(x) result(res)
  {int*, real*}, intent(in) :: x(:) !! Input array
  {int*, real*}, allocatable :: res(:)
  ... ! body omitted for brevity
end function set

or similar, where the custom preprocessor would spit out specific procedures for all integer and real kinds. Some additional or alternative syntax would be needed if you wanted all combinations of type kinds between arguments.

There may be tools that do this already, and I think @zbeekman mentioned one that he uses. In general, for stdlib I think this is the way to go because we are likely to see many procedures that support multiple arguments with inter-compatible type kinds. The downside (strong downside IMO) is that we're likely to introduce a tool dependency that also depends on another language. If the community agrees, we can use this thread to review existing tools and which would be most fitting for stdlib.

Let's say we pick a tool to do the templating for us, we have two choices:

a) Have user build specifics from templates. In this scenario, the user must install the templating tool in order to build stdlib. I think we should avoid this.
b) Use the templating tool as developers only, and maintain the pre-built specifics in the repo. This means that when we're adding new code that will work on many type kinds, we use the tool on our end to generate the source, and commit that source to the repo (alongside the templates in a separate, "for developers" directory).

Assuming we can find a fitting tool, I'm in favor of the 3b approach here. There may be other approaches I'm not aware of or forgot about. What do you think and any other ideas?
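For reference, this is roughly what the hand-written (or tool-generated) specifics look like for the set example, reduced to two kinds. It is only a sketch: the module name `set_sketch` and the specific names `set_int32` and `set_real32` are made up for illustration. The bodies are identical apart from the declarations, which is exactly the repetition a template would remove.

```fortran
module set_sketch
  use, intrinsic :: iso_fortran_env, only: int32, real32
  implicit none
  private
  public :: set

  ! One generic name, one specific procedure per type/kind.
  interface set
    module procedure set_int32, set_real32
  end interface set

contains

  pure recursive function set_int32(x) result(res)
    integer(int32), intent(in) :: x(:)   !! Input array
    integer(int32), allocatable :: res(:)
    if (size(x) > 1) then
      ! Keep x(1), then recurse on the remaining elements that differ from it.
      res = [x(1), set(pack(x(2:), x(2:) /= x(1)))]
    else
      res = x
    end if
  end function set_int32

  pure recursive function set_real32(x) result(res)
    real(real32), intent(in) :: x(:)   !! Input array
    real(real32), allocatable :: res(:)
    if (size(x) > 1) then
      res = [x(1), set(pack(x(2:), x(2:) /= x(1)))]
    else
      res = x
    end if
  end function set_real32

end module set_sketch
```

With 3b, a file like this would be the committed output, and the template would live next to it in the developers' directory.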

Which implementation languages can be used in stdlib

Obviously Fortran.

In #19, it looks like also using C might be beneficial.

I was hoping to avoid depending on C++, because that will make stdlib much simpler to distribute (and statically link, etc.). Down the road I would like stdlib to be shipped by default with compilers. So to make it as simple as possible to distribute will be key.

Even simpler would be if we can stay in pure Fortran, but I don't know if we can do that.

test_savetxt_qp fails with ifort

Using "ifort (IFORT) 18.0.1 20171018" on linux x86_64, test_savetxt_qp fails

The routine qsavetxt writes one number per line, which is allowed by the star format.

I propose a fix in https://github.com/pdebuyl/stdlib/tree/qsavetxt_format_string , tested with said ifort and with gfortran 6.4.0 and 8.3.0.

I have not opened a PR yet, as I don't know if ifort compatibility is considered necessary at this point. I would support keeping stdlib working with several compilers early on.

Manual Makefiles: Do we want them and how to maintain them?

Related to #2 and #7. Let's discuss here whether we should maintain manual Makefiles (besides CMake).

First some pros and cons to this.

Pros:

  • People who don't have CMake can still build the code with just make;
  • Comes built-in on most Linux systems;
  • Others?

Cons:

  • Need to be maintained and updated often. This is especially problematic with fast-moving and experimental APIs.
  • Others?

Second, how much is this desired by the community? Use this issue to speak up.

Overall I like the convenience of plain Makefiles, and I like being able to see exactly what they're doing. However, in the presence of a working CMake setup, I'm not likely to use them, so I don't care as much about them. I never use systems that don't have access to CMake.

Third, if we do want manual Makefiles, how do we develop and maintain them? Should every contributor be responsible for adding their code to the Makefile? Otherwise, would we need a volunteer Makefile maintainer?

Use autoformatting, enforced by CI

@zbeekman wrote:

Best thing I've found so far is findent and eclint for Fortran. findent works off an actual AST and claims to test against all code examples in the latest Modern Fortran Explained by MRC, so it should be robust. Probably won't help with this particular style issue, but I haven't been using it in earnest yet as I just discovered it (and added it to Homebrew).

Parallel linalg

The modern Fortran API for a serial linear algebra (#10) seems natural.

How would that be extended to work in parallel using co-arrays? If there is a similarly "natural" parallel API for linear algebra in modern Fortran, it would be a good candidate for inclusion in stdlib. We could have different backends that do the work (Scalapack, ..., perhaps even our own simpler reference implementation using co-arrays directly). That way, if somebody writes a faster third-party library, it could be plugged in as a backend, and user codes would not need to change because they would already be using the stdlib API for parallel linear algebra.

Facilitate default values of optional arguments

An annoyance with optional arguments is handling their default values. This usually looks something like:

function mylog(x, base) result(y)
    real, intent(in) :: x
    real, intent(in), optional :: base
    real :: y

    real :: base_
    
    base_ = 10.0
    if (present(base)) base_ = base
    
    y = log(x)/log(base_)
end function mylog

I propose to introduce a module that exports a generic function default which will in many cases allow for the elimination of the local copy of the optional argument. The above example could be rewritten

function mylog(x, base) result(y)
    use default_values, only: default
    real, intent(in) :: x
    real, intent(in), optional :: base
    real :: y
    
    y = log(x)/log(default(10.0, base))
end function mylog

This is a convenience, but it's incredibly handy. The module is very simple to write. See, e.g., this CLF post by Beliavsky, where I first learned of this trick. I put this in all my serious codes, as it removes a lot of the tedium and error potential with optional arguments.

I can easily spin up an illustrative PR that can be fleshed out once we've decided how we're going to automate generic interfaces.
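A minimal sketch of what such a module could contain, covering only default real and default integer. The module name `default_values` and the generic name `default` come from the example above; the specific names and the dummy argument names (`fallback`, `opt`) are assumptions. It relies on the standard rule that an absent optional dummy may be passed on to another optional dummy.

```fortran
module default_values
  implicit none
  private
  public :: default

  interface default
    module procedure default_real, default_integer
  end interface default

contains

  !> Return opt if it is present, otherwise the fallback value.
  pure function default_real(fallback, opt) result(val)
    real, intent(in) :: fallback
    real, intent(in), optional :: opt
    real :: val
    if (present(opt)) then
      val = opt
    else
      val = fallback
    end if
  end function default_real

  !> Same logic for default integers.
  pure function default_integer(fallback, opt) result(val)
    integer, intent(in) :: fallback
    integer, intent(in), optional :: opt
    integer :: val
    if (present(opt)) then
      val = opt
    else
      val = fallback
    end if
  end function default_integer

end module default_values
```

Supporting more kinds is the same body again with different declarations, which is why this ties in with the question of automating generic interfaces.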

Documentation

Use this issue to discuss how best to document the stdlib. I put a placeholder Markdown file here.

Pretty printing of matrices (and multidimensional arrays)

Currently, standard Fortran's print *, A prints a 2D array A as a 1D list of numbers. Instead, I would like stdlib to have a function print_array (we can discuss a better name) that would print the array the way NumPy does:

>>> numpy.arange(10000).reshape(250,40)
array([[   0,    1,    2, ...,   37,   38,   39],
       [  40,   41,   42, ...,   77,   78,   79],
       [  80,   81,   82, ...,  117,  118,  119],
       ..., 
       [9880, 9881, 9882, ..., 9917, 9918, 9919],
       [9920, 9921, 9922, ..., 9957, 9958, 9959],
       [9960, 9961, 9962, ..., 9997, 9998, 9999]])

or Julia:

julia> B = [1 2; 3 4; 5 6; 7 8; 9 10]
5×2 Array{Int64,2}:
 1   2
 3   4
 5   6
 7   8
 9  10

Julia can also use nice unicode characters for ... and vertical ... if the array is too large.

Then we should use this function at subroutine print_array(a) and in other places.

Then compilers could perhaps optionally use such a print_array as the default for the Fortran language's print statement.
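A rough sketch of the row-by-row idea for default integer matrices, without the "..." truncation that NumPy and Julia apply to large arrays. The module name and the helper `digits_of` are hypothetical. The column width is computed from the widest entry so that the columns line up.

```fortran
module print_array_sketch
  implicit none
  private
  public :: print_array

contains

  !> Print a 2D integer array one row per line, with right-aligned columns.
  subroutine print_array(a)
    integer, intent(in) :: a(:, :)
    character(len=32) :: fmt
    integer :: i, width

    ! Widest entry, including a possible minus sign.
    width = max(digits_of(maxval(a)), digits_of(minval(a)))
    ! Build a format such as (*(i4, 1x)); *( ) repeats for the whole row.
    write (fmt, '(a, i0, a)') '(*(i', width, ', 1x))'

    do i = 1, size(a, 1)
      print fmt, a(i, :)
    end do
  end subroutine print_array

  !> Number of characters needed to write n with the i0 edit descriptor.
  function digits_of(n) result(w)
    integer, intent(in) :: n
    integer :: w
    character(len=32) :: buf
    write (buf, '(i0)') n
    w = len_trim(buf)
  end function digits_of

end module print_array_sketch
```

Calling print_array(reshape([(i, i = 1, 10)], [5, 2])) prints five rows of two numbers each; note that reshape fills column by column, so the layout differs from the Julia example above.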

Proposal for a common data model module to set default KINDS

The following (sorry about the length) is a module that sets defaults for real and integer KIND parameters (what I call a data model) but allows users to define their own or change the defaults using preprocessor defines. I think something like this, along with some module procedures for checking things like storage size and interrogating machine parameters à la R1MACH from Linpack etc., should be a basic part of the library. This would allow users to build different versions with different combinations of integer and real default types. I prefer this approach over trying to provide a different routine for each possible combination of integer and real types. Again, that's a personal preference, but I prefer having four different versions of a library over trying to provide four different versions of a subroutine/function and providing a generic interface. Also, I think part of the coding standard for the library should forbid real(8), real*8, implicit double precision etc.

*** dataModel.F90 ***

Module dataModel

! Define default data model precisions (kinds) for a project
! The default data model is assumed to be 32 bit integers and
! 64 bit reals

! Define intrinsic KIND parameters for Modern Fortran programs. 
! If we have a Fortran 2008 compliant version of ISO_FORTRAN_ENV, we will 
! use it; if not we will create our own versions of the equivalent Fortran 
! integer and real kinds using SELECTED_INT_KIND and SELECTED_REAL_KIND. We
! also define a "machine zero" function for both 32 and 64 bit Real values 

  Use ISO_FORTRAN_ENV

#ifdef HAVE_USER_DATAMODEL
  USE USERDATAMODEL, ONLY: USER_DEFAULT_INT, USER_DEFAULT_REAL
#endif

  Implicit NONE 

#ifdef NO_F2008_ENV

! Define Integer and Real intrinsic KINDS with same names as in Fortran 2008
! ISO_FORTRAN_ENV
   
  Integer, Parameter :: INT8    = SELECTED_INT_KIND(2) 
  Integer, Parameter :: INT16   = SELECTED_INT_KIND(4) 
  Integer, Parameter :: INT32   = SELECTED_INT_KIND(9)
  Integer, Parameter :: INT64   = SELECTED_INT_KIND(18)
  Integer, Parameter :: REAL32  = SELECTED_REAL_KIND(P=6,  R=37)
  Integer, Parameter :: REAL64  = SELECTED_REAL_KIND(P=15, R=307)

#endif

#ifdef NO_REAL128
!  Integer, Parameter :: REAL128 = REAL64
#else
#ifdef NO_F2008_ENV
  Integer, Parameter :: REAL128 = SELECTED_REAL_KIND(p=33, R=4931)
#endif
#endif

#ifdef HAVE_USER_DATAMODEL

! Set DEFAULT_INT and DEFAULT_REAL to USER DATAMODEL
! values. Otherwise use native defaults

  Integer, Parameter :: DEFAULT_INT  = USER_DEFAULT_INT
  Integer, Parameter :: DEFAULT_REAL = USER_DEFAULT_REAL

#else 

! Set default values to 32 bit integers and 64 bit reals but allow
! users to change this at compile time by setting -DI8INT or -DR4REAL
! to select 64 bit integers and/or 32 bit reals

#ifdef I8INT 
  Integer, Parameter :: DEFAULT_INT  = INT64
#else
  Integer, Parameter :: DEFAULT_INT  = INT32
#endif

#ifdef R4REAL
  Integer, Parameter :: DEFAULT_REAL = REAL32
#else
#ifdef R16REAL
  Integer, Parameter :: DEFAULT_REAL = REAL128
#else
  Integer, Parameter :: DEFAULT_REAL = REAL64
#endif
#endif

#endif

! Define short(er) names and common aliases for the default kind parameters

  Integer, Parameter :: WP   = DEFAULT_REAL 
  Integer, Parameter :: IWP  = DEFAULT_INT 
  Integer, Parameter :: QP   = REAL128

! Define a C_ENUM type for use with Fortran ENUMERATOR variables. This
! should be just C_INT but we define an explicit C_ENUM to handle
! the possibility that it's not. This should have been included in
! the Fortran 2003 C-Interop facility but for some reason known
! only to the standards folks was not. 

  Enum, BIND(C)
    ENUMERATOR :: dummy
  End Enum

  Private :: dummy

  Integer, Parameter :: C_ENUM=KIND(dummy)

  Public :: INT8, INT16, INT32, INT64, REAL32, REAL64, REAL128, C_ENUM, WP, &
            IWP, QP

End Module dataModel

Build systems

Use this issue to discuss and propose build systems and/or methods.

Some candidates so far:

  • autotools (configure && make && make install);
  • CMake
  • Hand-written Makefiles

Interface to POSIX I/O API

stdlib may include a module which provides an interface to the POSIX I/O calls in the C standard library. Such a module would support higher level functionality proposed in #14 on Unix-like platforms.

This module, or specific components, could be conditionally integrated into stdlib by the build system (CMake, autotools, etc) depending on whether they are available.

Such a module could also be extended to include more POSIX calls, e.g. thread support, memory allocation, lower-level system calls. For now, I'd say to keep the scope more narrow in order to keep it achievable.
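As a small illustration of what one such binding might look like, here is a sketch that wraps the POSIX access() call through iso_c_binding to test whether a path exists. The module name `posix_sketch` and the wrapper `path_exists` are hypothetical. It assumes F_OK is 0 (true on Linux, macOS and the BSDs, but POSIX only specifies the symbolic name) and that the default character kind is the same as c_char.

```fortran
module posix_sketch
  use, intrinsic :: iso_c_binding, only: c_char, c_int, c_null_char
  implicit none
  private
  public :: path_exists

  ! Assumption: F_OK == 0. A real module would obtain this from the C side.
  integer(c_int), parameter :: F_OK = 0_c_int

  interface
    ! int access(const char *path, int amode);
    function c_access(path, mode) bind(C, name='access') result(rc)
      import :: c_char, c_int
      character(kind=c_char), intent(in) :: path(*)
      integer(c_int), value :: mode
      integer(c_int) :: rc
    end function c_access
  end interface

contains

  !> True if the file or directory named by path exists.
  function path_exists(path) result(exists)
    character(*), intent(in) :: path
    logical :: exists
    ! C expects a null-terminated string.
    exists = c_access(trim(path) // c_null_char, F_OK) == 0_c_int
  end function path_exists

end module posix_sketch
```

On platforms without access(), the build system would simply leave such a module out, as suggested above.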

Optimization, Root finding, and Equation Solvers

I'm willing to spearhead a module or set of modules containing:

  • general constrained and unconstrained nonlinear optimization (sqp, global optimization, etc.)
  • nonlinear equation solving (newton, least squares, etc.)
  • scalar minimizers and root solvers (fmin, zeroin, etc.)

I have already modernized some classic codes and there are also some others with permissive licenses floating around. I am willing to merge mine into the stdlib. I need to gather them together and provide a nice interface so that a user can call any of the solvers in a similar way. We can have a discussion on what that interface would look like. Here are some example codes:

  • slsqp (SQP)
  • pikaia (genetic algorithm)
  • Brent codes zeroin and fmin
  • Minpack
  • Other derivative-free bracketed root solvers such as bisection, Anderson-Bjorck, Muller, Pegasus, ...
  • Basic Newton and/or Broyden solvers
  • FilterSD

Eventually we could even provide (optional) integrations with larger third-party and/or commercial libraries such as IPOPT and/or SNOPT

See also

Note that some of these in various Python packages are just wrappers to Fortran 77 codes (such as slsqp).
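To give the interface discussion something concrete to react to, here is a sketch of a plain bisection solver built around an abstract interface for the function being solved. All names (`root_sketch`, `scalar_function`, `bisect`) are hypothetical, and this is textbook bisection, not one of the modernized codes listed above. Other bracketed solvers could share the same argument list.

```fortran
module root_sketch
  use, intrinsic :: iso_fortran_env, only: dp => real64
  implicit none
  private
  public :: scalar_function, bisect

  abstract interface
    function scalar_function(x) result(y)
      import :: dp
      real(dp), intent(in) :: x
      real(dp) :: y
    end function scalar_function
  end interface

contains

  !> Find a root of f in [a, b] (a < b), given that f(a) and f(b) differ in sign.
  function bisect(f, a, b, tol) result(root)
    procedure(scalar_function) :: f
    real(dp), intent(in) :: a, b, tol
    real(dp) :: root
    real(dp) :: lo, hi, flo, fmid
    integer :: iter
    integer, parameter :: max_iter = 200   ! guards against tol below machine spacing

    lo = a
    hi = b
    flo = f(lo)
    if (flo * f(hi) > 0.0_dp) error stop 'bisect: root is not bracketed'

    do iter = 1, max_iter
      if (hi - lo <= tol) exit
      root = 0.5_dp * (lo + hi)
      fmid = f(root)
      if (flo * fmid <= 0.0_dp) then
        hi = root          ! sign change is in the lower half
      else
        lo = root          ! sign change is in the upper half
        flo = fmid
      end if
    end do
    root = 0.5_dp * (lo + hi)
  end function bisect

end module root_sketch
```

A call such as bisect(f, 0.0_dp, 2.0_dp, 1.0e-12_dp) with f(x) = x**2 - 2 converges towards sqrt(2).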

Proposal for high level I/O

Branched from #1 (comment)


One personal challenge I have with stock Fortran are its somewhat awkward and low-level I/O facilities -- open, read, write, inquire, rewind, and close. I often wished for a higher-level interface, like what you get with Python's open() -- you open a file with a function, get a file-like instance with methods that let you do stuff with it.

This would do away with unit numbers, which I don't think application developers should have to deal with. It could also be a solution to the problem that allocatable character strings must be pre-allocated before use in a read statement.

Is there anything similar out there for Fortran? Would this be of interest to people here? I'd use it.

@jvdp1 wrote: I would use it too.

@cmacmackin wrote: I'd personally like something along these lines. However, the problem is in defining methods on the file-object; these would need to know the number and type of arguments at compile-time. It would be impractical to produce methods with every conceivable permutation of object types. It would also require variadic functions, which are not available. As such, this can not be implemented well in Fortran, although perhaps something would be possible if we were to wrap some C-routines and pass in deferred-type objects.
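For what it's worth, a restricted version that only deals with whole lines of text avoids the variadic problem. The sketch below is one possible shape, with hypothetical names (`file_sketch`, `file_t`, `write_line`, `read_line`): the unit number is hidden inside the object, and read_line grows an allocatable string so the caller does not have to pre-allocate it.

```fortran
module file_sketch
  implicit none
  private
  public :: file_t

  type :: file_t
    private
    integer :: unit = -1
  contains
    procedure :: open => file_open
    procedure :: write_line => file_write_line
    procedure :: read_line => file_read_line
    procedure :: close => file_close
  end type file_t

contains

  subroutine file_open(self, name, mode)
    class(file_t), intent(inout) :: self
    character(*), intent(in) :: name, mode
    select case (mode)
    case ('r')
      open (newunit=self%unit, file=name, status='old', action='read')
    case ('w')
      open (newunit=self%unit, file=name, status='replace', action='write')
    case default
      error stop 'file_t%open: mode must be "r" or "w"'
    end select
  end subroutine file_open

  subroutine file_write_line(self, line)
    class(file_t), intent(in) :: self
    character(*), intent(in) :: line
    write (self%unit, '(a)') line
  end subroutine file_write_line

  !> Read one record of any length; iostat is 0 on success, negative at end of file.
  subroutine file_read_line(self, line, iostat)
    class(file_t), intent(in) :: self
    character(:), allocatable, intent(out) :: line
    integer, intent(out) :: iostat
    character(len=256) :: chunk
    integer :: nread

    line = ''
    do
      ! Non-advancing read: take the record in chunks until end of record.
      read (self%unit, '(a)', advance='no', iostat=iostat, size=nread) chunk
      if (iostat == 0 .or. is_iostat_eor(iostat)) line = line // chunk(:nread)
      if (iostat /= 0) exit
    end do
    if (is_iostat_eor(iostat)) iostat = 0
  end subroutine file_read_line

  subroutine file_close(self)
    class(file_t), intent(inout) :: self
    close (self%unit)
    self%unit = -1
  end subroutine file_close

end module file_sketch
```

Usage would look like call f % open('data.txt', 'r') followed by call f % read_line(line, ios). Formatted reads of arbitrary argument lists are still out of reach with this approach, as noted above.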

packaging ecosystem experiment

Sorry if this post is irrelevant to this repo. I am posting here because I understand people here are mostly package maintainers, so I think they are the right audience for this post. I can move elsewhere, so as not to clutter this place, if this project takes off.

I just saw this repo for the first time and my first fear was "it is not going to work". A standard library in C, Python, IDL or other languages I have experience with would hold the most essential functions possible. Here, half of the proposals or more are extremely particular features. They do not fit in stdlib. These proposals are awesome but more suitable for a packaging ecosystem (which we discussed in the other repo) than for an stdlib.

I am so happy that there is this movement, because since I started using Fortran I've felt frustration and I thought nobody shared it. Then I realized some people have the same issues, and the founders of the fortran-lang project have made the amazing effort to organize this chaotic movement into streamlined and targeted action. Since it's Christmas time I wanted to express my thankfulness for this from the bottom of my heart.

I wanted to ask if anyone is willing to participate in the following experimental project.

How about we take a dozen packages (each of us created or maintains at least one) and attempt to make an experimental packaging ecosystem that will hold all of them? We would use currently existing and mature tools (make, gcc, gfortran) just to make it work on one platform (I think it should be linux/unix because it's the easiest and free). If that takes off and we reach critical mass, we might expand to cover all needs (Windows, other compilers).

I am willing to put in my effort (as I have a bit of free time now); however, I have had no experience with packaging other than pypi and rpm (which is mostly binary packages). I know some people have mentioned they worked with some source-based packaging systems that could work for Fortran.

In this issue, instead of discussing whether it's a good idea or not, I would like to collect suggestions and advice of tools and solutions that would make it possible in the fastest time.

We need:

  • people who have experiences with packaging systems to share them and/or use their expertise to help setting things up
  • package developers/maintainers that are willing to add their packages
  • people who can provide some basic infrastructure for testing (I can share my RPi server for the start)
  • users to test the solutions

Again, despite my personal feelings, I do not want to argue which of the solutions (stdlib vs packaging ecosystem) is superior. I just want to make a working demonstration product in a short time.

Module naming convention

Tangential to #3. The choice of a module naming pattern is important because a module with a generic name can cause conflict in the client code. In the absence of namespaces, the standard library modules should have a specific enough prefix to prevent such conflicts.

One approach could be to use

  • stdlib for the top-level module;
  • stdlib_<group> for specific modules, e.g. stdlib_collections, stdlib_sorting, etc. you get the idea.

In recent years, I have tended to name modules with a mod_ prefix, e.g. mod_functional. I see in the wild that module_ and m_ prefixes, or even an _m suffix, are used. If I name the source file the same as the module, then I can easily see which source files contain modules and which don't. To me, this is useful if I haven't looked at the library in a long time, but I can't think of any other benefit to this "convention". If this convention is used for stdlib, then we'd have:

  • mod_stdlib
  • mod_stdlib_<group>

However, this seems unnecessarily verbose, and most library files are likely to be modules anyway (with the exception of tests which would be programs in their own directory, see #7). Thus, if we use the stdlib_ prefix universally, I don't think we need mod_ or similar.

Reading and writing common image formats, ppm, tiff, jpeg, png

Here is an example implementation of loading and saving ppm: https://github.com/certik/fortran-utils/blob/b43bd24cd421509a5bc6d3b9c3eeae8ce856ed88/src/ppm.f90. The advantage of the ppm format is that it is simple to write such readers and writers. One can then use external tools (such as pnmtopng) to convert to more common formats. So perhaps tiff, jpeg and png are not initially needed, and ppm might be enough to allow working with images in Fortran.

Prior art:

  • Matlab: imread can read BMP, JPEG, PNG, CUR, PPM, GIF, PBM, RAS, HDF4, PCX, TIFF, ICO, PGM and XWD files. See also imwrite.
  • SciPy: imread uses Python Imaging Library (PIL) to read the image; PIL supports: BMP, DIB, EPS, GIF, ICNS, ICO, IM, JPEG, MSP, PCX, PNG, PPM, SGI, SPIDER, TGA, TIFF, WebP, XBM. See also imsave.
  • Julia: Images.jl package has a nice comparison page with SciPy and Matlab; can read at least PNG, GIF, TIFF, JPEG.

Workflow for contributors

Use this issue to discuss the workflow that we should adopt. This will eventually become part of the Workflow doc

For the time being:

  • Use #1 to broadly discuss what should be part of stdlib;
  • Open a new issue for a specific proposal;
  • Once the proposal gets reasonably enough discussion and support from the community, open a pull request to add code to the stdlib;
  • We can flesh out the specific requirements regarding quality control and similar in this thread.

We can also discuss the preferred git/GitHub workflows for contributing to the repo, such as feature branches, pull requests, etc.

What should be part of stdlib?

Existing libraries, for inspiration or adoption


First issue in this repo which evolved from this thread. This is a broad, open ended, high-level issue, so feel free to go wide and crazy here.

To propose a specific module, procedure, or derived type, please open a new issue. You can follow the same format as in Fortran Proposals.

Wishlist from upthread

From @apthorpe:


From @FortranFan:

  • Containers
    • string type
    • bitsets
    • Enhanced 'array' types such as vectors, singly-linked and doubly-linked lists, etc.
    • Adapters such as stacks, queues, etc.
    • Associative ones such as dictionaries (maps), hash_sets, etc.
  • Algorithms
    • Generic methods for sort, findloc, etc. that can work with any type, intrinsic and derived,
    • Operations and permutations on a range of elements such as merge/union, difference,
      etc.
  • Utilities
    • Iterator-like facilities which make it easy to work with Containers,
    • Operator (<, >, ==, etc. ) and assignment(=) overload abstractions that perhaps make
      the use of standard algorithms more efficient?
    • Miscellaneous other functions, subroutines (like generic swap), datetime, named
      constants, etc.
  • Special
    • Any basic facilities (extensions perhaps to ABSTRACT INTERFACE block?) needed toward
      "special" functions such as Variadic ones in the language e.g., MAX, MIN
    • Ability to "overload" array subsection notation facility with Containers that standard Fortran
      provides with its 2 built-in containers: arrays and CHARACTER intrinsic type.
    • Any special mechanisms that can help aid with improved constructors of arrays/containers
      and derived types ('classes'). I envision certain fundamental 'computer engineering' aspects
      being pursued here that can enable, say, efficient operation on the diagonal of a matrix or
      initialization to an identity matrix; or efficient 'dynamic' construction of 'classes' in Fortran similar
      to that is achieved near universally using new keyword in other languages.

From @zbeekman:

  • Strings
    • Conversion to/from integer/real/logical (all kinds of each)
    • Conversion on string concatenation
    • raw string processing functions inspired by Ruby & Python
    • string class to make using all the machinery easier via TBPs
  • Files
    • For now just name manipulations like dirname, basename, etc.
  • OS/Environment integration
    • is_a_tty(), OS%env("HOME"), .envExists. "USER", etc.
  • Unit testing & assertions stuff
    • Subtest summaries w/ color
    • File and line number triggering failures
  • Error Stack class/object
    • Maintain a call-stack
    • Raise errors, but optionally trap them later with good call stack including line number and file

Linked list

Problem

Linked list is one of the essential data structures beside an array. It allows you to add, insert, or remove elements in constant time, without re-allocating the whole structure.

Fortran doesn't have a linked list. There are 3rd party libraries, but no obvious go-to solution. Fortran stdlib should have a linked list. I would use it.

Examples

What kind of data can the linked list hold?

There are various levels of capability we could pursue:

  1. Single type: Basically just like an array, but allows insertion in constant time;
  2. Elements can be of any intrinsic type in a single list;
  3. Can take intrinsic type and user-defined derived types (is this even possible in current Fortran?)

API

I don't know, something like this?

use stdlib_experimental_collections, only: List
type(List) :: a = List()

call a % append(42)
call a % append(3.141)
call a % append('text')
print *, a % get(2) ! prints 3.141
call a % remove(3) ! a is now List([42, 3.141])
call a % insert(2, 'hello') ! a is now List([42, 'hello', 3.141])

a = List([1, 2, 3]) ! instantiate a list from an array
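On the question of whether level 3 is possible in current Fortran: unlimited polymorphic components (class(*), available since Fortran 2003) allow one list to hold values of any intrinsic or derived type. The sketch below shows only append and get, with hypothetical names (`list_sketch`, `list_t`, `node_t`). It is not a proposed stdlib implementation; it has no finalizer, so nodes are never freed, and get walks the list from the head.

```fortran
module list_sketch
  implicit none
  private
  public :: list_t

  type :: node_t
    class(*), allocatable :: value          ! holds a value of any type
    type(node_t), pointer :: next => null()
  end type node_t

  type :: list_t
    private
    type(node_t), pointer :: head => null()
    type(node_t), pointer :: tail => null()
    integer :: n = 0
  contains
    procedure :: append
    procedure :: get
    procedure :: length
  end type list_t

contains

  !> Add a copy of value at the end of the list (constant time).
  subroutine append(self, value)
    class(list_t), intent(inout) :: self
    class(*), intent(in) :: value
    type(node_t), pointer :: new

    allocate (new)
    allocate (new%value, source=value)
    if (associated(self%tail)) then
      self%tail%next => new
    else
      self%head => new
    end if
    self%tail => new
    self%n = self%n + 1
  end subroutine append

  !> Return a copy of the i-th element (linear time).
  function get(self, i) result(value)
    class(list_t), intent(in) :: self
    integer, intent(in) :: i
    class(*), allocatable :: value
    type(node_t), pointer :: cur
    integer :: k

    if (i < 1 .or. i > self%n) error stop 'list_t%get: index out of range'
    cur => self%head
    do k = 2, i
      cur => cur%next
    end do
    allocate (value, source=cur%value)
  end function get

  pure function length(self) result(n)
    class(list_t), intent(in) :: self
    integer :: n
    n = self%n
  end function length

end module list_sketch
```

The price of class(*) is that the caller needs a select type construct to do anything with a retrieved element, so print *, a % get(2) as written in the API above would not work directly.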

linspace and logspace

The linspace API is implemented here: https://github.com/certik/fortran-utils/blob/b43bd24cd421509a5bc6d3b9c3eeae8ce856ed88/src/mesh.f90#L157

Matlab's linspace.

The logspace is similar, but I don't have it implemented yet. Historically I have used a function called meshexp, which is more general: it allows you to change the gradation of the mesh, which Matlab's logspace does not. NumPy's logspace lets you set a different base, which changes the gradation, so I think my meshexp can be implemented using NumPy's logspace. NumPy also has geomspace, where you specify the end points directly (just like in my meshexp), but it does not let you change the gradation. So I think there is room for meshexp; perhaps we should change the name to be consistent with the other functions.
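For concreteness, here is a sketch of the two simpler functions, following the NumPy conventions: linspace(a, b, n) returns n equally spaced points including both end points, and logspace(a, b, n) returns 10 raised to those points. The module name and the choice of double precision only are assumptions, and this is not the implementation from the fortran-utils link above.

```fortran
module mesh_sketch
  use, intrinsic :: iso_fortran_env, only: dp => real64
  implicit none
  private
  public :: linspace, logspace

contains

  !> n equally spaced points from a to b, end points included.
  pure function linspace(a, b, n) result(x)
    real(dp), intent(in) :: a, b
    integer, intent(in) :: n
    real(dp) :: x(max(n, 0))
    integer :: i

    if (n == 1) then
      x(1) = a           ! a single point: return the start, as NumPy does
    else
      do i = 1, n        ! zero iterations when n <= 0
        x(i) = a + (b - a) * real(i - 1, dp) / real(n - 1, dp)
      end do
    end if
  end function linspace

  !> n points from 10**a to 10**b, equally spaced in the exponent.
  pure function logspace(a, b, n) result(x)
    real(dp), intent(in) :: a, b
    integer, intent(in) :: n
    real(dp) :: x(max(n, 0))
    x = 10.0_dp**linspace(a, b, n)
  end function logspace

end module mesh_sketch
```

A meshexp-style function with adjustable gradation would need at least one more parameter than either of these.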

Continuous Integration

What should we use for continuous integration?

GitLab's CI tools are amazing, but on GitHub I struggle every time I want to set up something. What's the state of the art here for a Fortran+C project, ideally with builds for as many platforms as we can get, and ideally free of charge?

@zbeekman @jacobwilliams @scivision

We have some time to explore and decide, but at the point when we have some code + tests in, we should CI.

Proposal for ascii

This module should include functions for character classification and conversion (lower, upper). I have prepared a basic implementation at https://github.com/ivan-pi/fortran-ascii.

The plan is to cover the same functionality as found in the C, C++, and D libraries:

@zbeekman has already opened an issue (see ivan-pi/fortran-ascii#1) on dealing with different character kinds. The problem is that the ascii and iso_10646 character sets need not be supported by the compilers. Even if they are supported their bitwise representation might be different from the default kind.

I realized while creating these functions, that agreeing upon a style guide #3 and documentation #4 early on would be helpful to improve future pull requests. Some agreement upon unit testing will also be necessary.
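To show the flavour of what is meant, here is a sketch of a few such functions for default-kind characters. It is not the code from the fortran-ascii repository, and the module name is made up. It uses iachar and achar, which work in terms of ASCII codes regardless of the processor's native collating sequence.

```fortran
module ascii_sketch
  implicit none
  private
  public :: is_upper, is_lower, to_lower, to_upper

  ! Distance between upper- and lower-case letters in ASCII.
  integer, parameter :: case_offset = iachar('a') - iachar('A')

contains

  elemental function is_upper(c) result(res)
    character(len=1), intent(in) :: c
    logical :: res
    res = iachar(c) >= iachar('A') .and. iachar(c) <= iachar('Z')
  end function is_upper

  elemental function is_lower(c) result(res)
    character(len=1), intent(in) :: c
    logical :: res
    res = iachar(c) >= iachar('a') .and. iachar(c) <= iachar('z')
  end function is_lower

  !> Lower-case version of c; non-letters are returned unchanged.
  elemental function to_lower(c) result(t)
    character(len=1), intent(in) :: c
    character(len=1) :: t
    if (is_upper(c)) then
      t = achar(iachar(c) + case_offset)
    else
      t = c
    end if
  end function to_lower

  !> Upper-case version of c; non-letters are returned unchanged.
  elemental function to_upper(c) result(t)
    character(len=1), intent(in) :: c
    character(len=1) :: t
    if (is_lower(c)) then
      t = achar(iachar(c) - case_offset)
    else
      t = c
    end if
  end function to_upper

end module ascii_sketch
```

Other character kinds would need their own specifics, which is where the issue mentioned above comes in.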

cc: @jacobwilliams

Define a base user class to support ADTs, sorting etc.

I implement a user base class to support some of the Abstract Data Types (lists etc.) and sorting codes I've implemented. It contains no data but defines dummy procedures for things I need to do to support sorting, generic lists etc.: mainly relational operators (>, <, >=, <=, ==), assignment, and a print method. I implement this as a concrete (non-abstract) class to avoid having to override all the methods, as would be required with an abstract class with deferred abstract interfaces for the procedures, since I might not need all of the procedures defined in the concrete class in the extended class. I think we will need something similar to this (or maybe a God or World class à la Java that all classes are derived from) to support user-defined types.

Discuss and possibly change `sp`, `dp`, `qp` kinds constants

We have not reached an agreement if we should be using sp, dp, qp or some other names. This is a subset of the issue #25. This current issue is only for the naming convention. Anything else should be discussed in #25.

This is not a pressing issue, as for now we use sp, dp, qp as placeholders to allow us to move on to implementing actual functionality. But we definitely have to reach an agreement before we consider moving from experimental to main.

I was hoping doing a survey of all open source Fortran projects, as well as some closed source that I have access to, and then we'll see what the large community is actually using. Then we can decide what to do.

Implement standard assert subroutine and associated macros

Per @certik's request, I propose we extend his stdlib_experimental_error.f90 code into a standard assert subroutine and supplement it with some preprocessor macros. As an example, here is my implementation of an assert routine and the associated macros:

assertions.f90

assert.txt

The associated preprocessor macros are
assert_macros.txt

Note that these are my implementations of similar routines and macros found in the FTL project.
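Since the attachments are not reproduced here, the following is only a minimal sketch of what an assert subroutine of this kind might look like. The module name `assert_sketch` and the argument names are assumptions and are not taken from the attached files or from stdlib_experimental_error.f90.

```fortran
! Hypothetical sketch; not the attached assertions.f90.
module assert_sketch
    use, intrinsic :: iso_fortran_env, only: error_unit
    implicit none
    private
    public :: assert

contains

    !> Stop the program with a diagnostic if condition is false.
    subroutine assert(condition, message, file, line)
        logical, intent(in) :: condition
        character(len=*), intent(in), optional :: message
        character(len=*), intent(in), optional :: file
        integer, intent(in), optional :: line

        if (condition) return

        ! Build the diagnostic piece by piece on a single output line.
        write(error_unit, '(a)', advance='no') 'Assertion failed'
        if (present(file)) write(error_unit, '(a,a)', advance='no') ' in ', file
        if (present(line)) write(error_unit, '(a,i0)', advance='no') ' at line ', line
        if (present(message)) write(error_unit, '(a,a)', advance='no') ': ', message
        write(error_unit, '(a)') ''
        error stop 1
    end subroutine assert

end module assert_sketch
```

A call such as `call assert(n > 0, "n must be positive")` would then abort with a message on the error unit. The optional `file` and `line` arguments are there so that a preprocessor macro could fill them in, for example from `__FILE__` and `__LINE__`, assuming the build runs a C-style preprocessor over the sources.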

Sparse matrix support

Prior art in other languages:

In Fortran (I keep this list updated with all implementations posted in this issue):

I really like SciPy's simple non-OO implementation, and I have ported it to modern Fortran in the link above. If people also want an OO implementation, it can be built on top as an option.

One thing I found out is that the indices must be sorted, and the overall speed depends very much on how quickly they can be sorted. I ended up using quicksort, but a specialized sorting algorithm (such as Timsort) might be even faster, because in practice the indices contain subsections that are already sorted (typically coming from some local-to-global mapping, as in finite elements) even though they are not sorted overall.
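To make the role of the ordering step concrete, here is a minimal sketch that converts unsorted coordinate (COO) triplets to compressed sparse row (CSR) storage. It uses a counting pass over the row indices rather than the quicksort mentioned above, so it is a different technique from the author's port. The module and argument names are hypothetical, duplicate entries are not summed, and column indices within a row keep their input order.

```fortran
! Hypothetical sketch; not the implementation linked in this issue.
module sparse_sketch
    implicit none
    private
    public :: coo_to_csr

contains

    !> Convert COO triplets (rows, cols, vals) to 1-based CSR arrays.
    pure subroutine coo_to_csr(nrows, rows, cols, vals, row_ptr, col_idx, csr_vals)
        integer, intent(in) :: nrows
        integer, intent(in) :: rows(:), cols(:)
        real, intent(in) :: vals(:)
        integer, intent(out) :: row_ptr(nrows + 1)
        integer, intent(out) :: col_idx(size(rows))
        real, intent(out) :: csr_vals(size(rows))

        integer :: next(nrows)
        integer :: k, r, dest

        ! Count the entries of each row, stored one slot to the right.
        row_ptr = 0
        do k = 1, size(rows)
            row_ptr(rows(k) + 1) = row_ptr(rows(k) + 1) + 1
        end do

        ! Prefix sum: row r occupies positions row_ptr(r) .. row_ptr(r+1)-1.
        row_ptr(1) = 1
        do r = 1, nrows
            row_ptr(r + 1) = row_ptr(r) + row_ptr(r + 1)
        end do

        ! Scatter each entry into the next free slot of its row.
        next = row_ptr(1:nrows)
        do k = 1, size(rows)
            dest = next(rows(k))
            col_idx(dest) = cols(k)
            csr_vals(dest) = vals(k)
            next(rows(k)) = dest + 1
        end do
    end subroutine coo_to_csr

end module sparse_sketch
```

The counting pass groups entries by row in time linear in the number of entries. Sorting the column indices within each row, if required, would still need a comparison sort on each row segment.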

PGI / Flang compilers not working

Due to bug(s) in PGI / Flang, the stdlib io and optval don't build. There are spurious errors like

PGF90-F-0000-Internal compiler error. interf:new_symbol, symbol not found     630  (src/tests/io/test_loadtxt.f90: 2)

When fiddling around with the code, PGI will even give a similar error for implicit none.

Basically, this looks a lot like other cases where the bugs were in PGI / Flang rather than something being wrong with stdlib. I attempted some workarounds in https://github.com/scivision/stdlib/tree/qp_opt

One could file a bug report with each of PGI and Flang, as the error is effectively identical, perhaps boiling it down to a minimum working example.

String handling routines

Let's start a discussion on routines for string handling and manipulation. The thread over at j3-fortran already collected some ideas:

  • split - given a separator, splits the string into some form of array
  • upper/lower - convert a character string to all upper/lower case

The discussion also mentioned the proposed iso_varying_string module, which was supposed to include some string routines. I found three distinct implementations of this module:

I also found the following Fortran libraries targeting string handling:

It is likely that several of the tools in the list of popular Fortran projects also contain some tools for working with strings. Given the numerous implementations it seems like this is one of the things where the absence of the standard "... led to everybody re-inventing the wheel and to an unnecessary diversity in the most fundamental classes" to borrow the quote of B. Stroustrup in a retrospective of the C++ language.

For comparison here are some links to descriptions of string handling functions in other programming languages:

Obviously, for now we should not aim to cover the full set of features available in other languages. Since the scope is quite big, it might be useful to break this issue into smaller issues for distinct operations (numeric conversions, comparisons, finding the occurrence of a string in a larger string, joining and splitting, regular expressions).

My suggestion would be to start with some of the easy functions like capitalize, count, endswith, startswith, upper, lower, and the conversion routines from numeric types to strings and vice-versa.
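As an illustration of how small these first routines could be, here is a minimal sketch of three of them. The module name `strings_sketch` and the spellings `starts_with`, `ends_with` and `to_string` are assumptions made for this example. Only default integers are handled in the conversion, and trailing blanks count as part of the string.

```fortran
! Hypothetical sketch of a few easy string routines.
module strings_sketch
    implicit none
    private
    public :: starts_with, ends_with, to_string

contains

    !> True if string begins with prefix (trailing blanks are significant).
    pure logical function starts_with(string, prefix)
        character(len=*), intent(in) :: string, prefix
        starts_with = .false.
        if (len(prefix) > len(string)) return
        starts_with = string(1:len(prefix)) == prefix
    end function starts_with

    !> True if string ends with suffix (trailing blanks are significant).
    pure logical function ends_with(string, suffix)
        character(len=*), intent(in) :: string, suffix
        ends_with = .false.
        if (len(suffix) > len(string)) return
        ends_with = string(len(string) - len(suffix) + 1:) == suffix
    end function ends_with

    !> Decimal representation of a default integer, without padding.
    pure function to_string(i) result(s)
        integer, intent(in) :: i
        character(len=:), allocatable :: s
        character(len=32) :: buffer
        ! Internal write into a local buffer; i0 uses the minimal width.
        write(buffer, '(i0)') i
        s = trim(buffer)
    end function to_string

end module strings_sketch
```

Callers holding fixed-length, blank-padded strings would pass `trim(string)` to get the usual behaviour. Whether a library version should trim implicitly is one of the design questions this issue would need to settle.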

License

How should Fortran stdlib be licensed?

I initially chose my favorite license, which is MIT. We can discuss and change this if another license is preferred.

I think it's important that the license:

  • is friendly for commercial / closed-source projects;
  • grants copyright to contributors

Is there anything else I'm missing?

Consider using swig-fortran to access C++ STL

I'll just throw this out as an alternative to building stdlib from scratch. Seth Johnson at ORNL has a project that can generate SWIG bindings to the C++ STL for Fortran (see https://github.com/swig-fortran). I ran across his project a few months back and thought it interesting, but I didn't do a deep dive into what issues were involved in using it for things like lists. More info is in the following two PDFs:
Johnson_Automated_Fortran-C++_Bindings_ArXiv.pdf
siam-cse-johnsonsr.pdf

Message for errors inside stdlib?

In stdlib_experimental_error.f90, there is the following example:
call error_stop("Invalid argument")
A similar case can be found in stdlib_experimental_io.f90.

When running a large program, messages such as "Invalid argument" are quite useless because they do not say where the error occurred. Should we discuss and agree on a good way to format error messages, e.g.,

ERROR (_name_of_the_function_): Invalid argument (_argument_)
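One way to produce messages in the format above is a small helper that assembles the text before it is passed on. The following is a minimal sketch in which the module name `error_sketch` and the function `format_error` are hypothetical and not part of stdlib.

```fortran
! Hypothetical helper that builds a message in the proposed format.
module error_sketch
    implicit none
    private
    public :: format_error

contains

    !> Return "ERROR (procedure_name): message (argument)".
    !> The trailing "(argument)" part is added only when argument is present.
    pure function format_error(procedure_name, message, argument) result(text)
        character(len=*), intent(in) :: procedure_name, message
        character(len=*), intent(in), optional :: argument
        character(len=:), allocatable :: text

        text = 'ERROR (' // procedure_name // '): ' // message
        if (present(argument)) text = text // ' (' // argument // ')'
    end function format_error

end module error_sketch
```

A call site could then read `call error_stop(format_error("loadtxt", "Invalid argument", "filename"))`, where the procedure and argument names shown are placeholders chosen for this example.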

Proposal for linalg

eig, eigh, inv, solve, det, svd, ... . A possible implementation is here: https://github.com/certik/fortran-utils/blob/b43bd24cd421509a5bc6d3b9c3eeae8ce856ed88/src/linalg.f90.

All these functions will be implemented in the stdlib_linalg module, and they would probably just call Lapack. The idea is for these to be general routines that just work, with a simple, intuitive interface and the highest performance achievable given the simple API. One can always achieve higher performance with more specialized routines for a particular problem (and a more complicated API), but that is not the point here. Rather, we would like Matlab / NumPy style routines for linear algebra.

In particular, let's start with eig, for the eigenvalues and eigenvectors of a general (non-symmetric) matrix. Both NumPy and Matlab have a very similar interface called eig, which I propose we use:

Julia seems to have more of an "object oriented" interface called eigen: https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/index.html#LinearAlgebra.eigen, which uses some Julia language features to emulate the Matlab style vals, vecs = eigen([1.0 0.0 0.0; 0.0 3.0 0.0; 0.0 0.0 18.0]).

How should stdlib handle single, double, quadruple precision types

When we declare a real variable in our API, we have to use some variable to represent the real kind. Available options for single, double and quadruple precision that people have used in codes:

  1. real(sp), real(dp), real(qp)
  2. real(real32), real(real64), real(real128)
  3. real(r32), real(r64), real(r128)
  4. real(float32), real(float64), real(float128)
  5. real(f32), real(f64), real(f128)
  6. real(wp) (for default real)
  7. real(r4), real(r8), real(r16)

I will keep appending to this list if we find some code that uses different names.

Those variables must be defined somewhere. I will use case 1. below; for the other cases we simply substitute different names. The available options:

a. stdlib_types module that provides sp, dp, qp for single, double and quadruple precision as follows:

integer, parameter :: sp=kind(0.), &             ! single precision
                      dp=kind(0.d0), &           ! double precision
                      qp=selected_real_kind(32)  ! quadruple precision

The main idea behind this option is that there is a module in stdlib that provides the types and every other module uses it. Proposal #13 is similar. There are several options for how the types are defined inside the module: one can also define sp and dp using selected_real_kind. Another alternative is to define sp, dp and qp inside the module using iso_fortran_env, as in b. The module stdlib_types could also be given a different name.

b. use iso_fortran_env, only: sp=>real32, dp=>real64, qp=>real128 (if the case 2. above is used, then one does not need to rename, so it simplifies to just use iso_fortran_env, only: real32, real64, real128). Unlike a., this option does not introduce a new module in stdlib. One simply uses iso_fortran_env everywhere directly.

c. use iso_c_binding, only: sp=>c_float, dp=>c_double, qp=>c_float128. Unlike a., this option does not introduce a new module in stdlib.

I will keep this list updated if more options become available.
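For comparison with option a., the variant that defines the kinds inside a stdlib module but takes them from iso_fortran_env could be sketched as follows. The module name `kinds_sketch` is hypothetical. Note that real128 may hold a negative value on a compiler without a 128-bit real kind, in which case declaring real(qp) variables would fail to compile.

```fortran
! Hypothetical sketch: option a. with the kinds taken from iso_fortran_env.
module kinds_sketch
    use, intrinsic :: iso_fortran_env, only: real32, real64, real128
    implicit none
    private
    public :: sp, dp, qp

    integer, parameter :: sp = real32   ! single precision
    integer, parameter :: dp = real64   ! double precision
    integer, parameter :: qp = real128  ! quadruple precision, if supported

end module kinds_sketch
```

Client code would then write `use kinds_sketch, only: dp` and declare `real(dp) :: x`, so a later change to the definitions would touch only this one module.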

Parallel algorithms in stdlib

Most of the features discussed so far are serial.

  1. Should we include parallel algorithms in stdlib?

  2. Should we use co-arrays or MPI?

  3. What would be some good initial parallel algorithms to start with?

Directory structure

Mostly cosmetic, but affects quality of life: What kind of directory structure should we use for the library?

Potentially touching on #2 and #3.

I tend to like structures like this:

stdlib/
  src/
    lib/
      mod_stdlib.f90
      subdir1/
      subdir2/
      ...
    tests/

C-style formatting

In #14 @milancurcic indicated he'd like to have something along the lines of a printf function, like in C. I suggested a way to do this:

C-style formatting is something I'd very much like in Fortran. I think it may help some newcomers to the language as well because this kind of formatting is more common in other languages. But that's for another proposal. :)

This actually wouldn't be too difficult to implement in a standard library. We'd just write a series of wrappers in C, taking different numbers of void* arguments. We'd then use interoperability to call these from Fortran and wrap them in a generic block. We could have versions accepting between, say, 1 and 30 arguments (tedious, but could be automatically generated), which should be enough for anyone.

I went on to comment:

My suggestion of calls to C was specifically for a printf function. This would avoid the combinatorial explosion because printf works on void* data types. These can be passed in from Fortran using an assumed-type argument, type(*). The interface would look something like this:

#include <stdio.h>

/* One wrapper per argument count; each argument arrives as an untyped pointer. */
void printf_wrapper1(const char *format, void *arg1) {
    printf(format, arg1);
}

void printf_wrapper2(const char *format, void *arg1, void *arg2) {
    printf(format, arg1, arg2);
}

interface printf
    subroutine printf_wrapper1(format_str, arg1) bind(c)
        character(len=1), dimension(*), intent(in) :: format_str
        type(*), intent(in) :: arg1
    end subroutine printf_wrapper1

    subroutine printf_wrapper2(format_str, arg1, arg2) bind(c)
        character(len=1), dimension(*), intent(in) :: format_str
        type(*), intent(in) :: arg1, arg2
    end subroutine printf_wrapper2
end interface printf

The main complication with this is how to convert between Fortran and C strings. It wouldn't be hard to provide wrapper routines which do this for the format string, but string arguments to printf could be more of a challenge.
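For the format string, the conversion amounts to copying the characters into a c_char array and appending a null terminator. Here is a minimal sketch, in which the module name `c_string_sketch` and the function `to_c_string` are hypothetical. It assumes that trailing blanks should be dropped and that the default character kind can be assigned to c_char, which holds on common compilers.

```fortran
! Hypothetical sketch of a Fortran-to-C string conversion.
module c_string_sketch
    use, intrinsic :: iso_c_binding, only: c_char, c_null_char
    implicit none
    private
    public :: to_c_string

contains

    !> Copy f_string, without trailing blanks, into a null-terminated array.
    pure function to_c_string(f_string) result(c_string)
        character(len=*), intent(in) :: f_string
        character(kind=c_char, len=1) :: c_string(len_trim(f_string) + 1)
        integer :: i

        do i = 1, len_trim(f_string)
            c_string(i) = f_string(i:i)
        end do
        ! C routines such as printf rely on the terminating null character.
        c_string(size(c_string)) = c_null_char
    end function to_c_string

end module c_string_sketch
```

The result can be passed directly to a bind(c) dummy argument declared as an assumed-size character array, such as format_str in the interface above.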

Define a consistent naming scheme for derived types

I think we should have, as part of the style guide and/or coding practices, a standard naming convention (or two) for derived types. I currently append an _t suffix to all derived type names, a la

Type :: user_t

However, as I use more extended types and classes, I'm considering using either _et or _c to mark an extended type, a la

Type :: user_et or user_c

Just an idea I would like to throw out.

Governance

I feel that it is important to have some sort of formal governance. This should be in place in advance of when conflict arises, which is when you actually need it. While this issue is related to #5 (workflow), it is, in my opinion, distinct.

A case study

I was invited to be a Mac Homebrew maintainer in July 2018, right before a period of drawn-out conflict, mostly between a very active maintainer and the project's "Lead Maintainer". In addition, this conflict was born out of a technical issue, with vocal users and contributors fanning the flames. If the organization had had a more formal governance model, then:

  1. The decision making process for technical decisions would have some basis in a pre-established framework and therefore appear less arbitrary and personal
  2. A formal procedure would have been in place for resolving both technical and personal conflicts

Within 6 months the organization had its first ever in-person meeting, which I had the pleasure of attending, and ratified more formal bylaws. Since then I have never seen the same level of conflict between maintainers, or between users/contributors and maintainers (or the project itself).

Issues to be addressed in bylaws or less formal governance

An imperfect list of the questions that need to be answered follows.

  1. How are controversial decisions made? Who gets the final say?
  2. Should there be a technical steering committee? Or a project leader? Or both?
  3. What sort of decisions should be put to a vote among some sort of membership?
  4. How are differing levels of responsibility defined?
  5. What is fortran-lang's formal or informal relationship to J3/WG5?
  6. How can we set up an infrastructure and process to ensure the continued health of, and enthusiasm about, this project as it (hopefully) grows?
  7. How can we divide responsibilities amongst maintainers and contributors so that important decisions don't get made under the radar, while avoiding unnecessary duplication of effort and reducing or preventing bikeshedding?

Caveats

Of course this is a very new & young project, and as such adopting formal bylaws is probably overkill. It's certainly not fun, and too much bureaucracy can certainly be harmful. But, giving people a framework for how decisions are made and finalized---especially controversial ones---makes the outcome easier to understand and tolerate when it isn't in your favor, and having a process to deal with controversial decisions and conflict is very nice to have BEFORE you need to use & apply it.

Preprocessor

Will stdlib require a preprocessor? It is mentioned in #13, #35, and #72.

I suggest that this is discussed as it impacts the way the code is built and possibly the way users of stdlib will call certain features. Not that I have any definitive opinion on the topic, but as there was a discussion on cmake, compiler support and CI, this seems appropriate also for preprocessing.
