fortran-lang / stdlib
Fortran Standard Library
Home Page: https://stdlib.fortran-lang.org
License: MIT License
@marshallward wrote in #1 (comment):
I would like to see greater support for bit-reproducible numerical operations. This is a very high priority for us since our models are used in weather and climate forecasting, and much of our time is devoted to bit reproducibility of our numerical calculations.
A common problem is intrinsic Fortran reduction operations like sum(), where the order is ambiguous (deliberately, one might say), and therefore not reproducible. A more serious problem for us is transcendental functions, like exp() or cos(), which will give different results for different optimizations, and we typically cannot say where it was invoked (libm? Vendor library? etc.).
A standard library may be a place to provide bit-reproducible implementations.
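To illustrate the reduction-order point, here is a minimal sketch of a summation whose order is fixed by the source code. The module and function names (`reproducible_sum_sketch`, `sum_fixed_order`) are hypothetical and are not part of stdlib. It assumes IEEE 754 double precision and a compiler that is not asked to reassociate floating-point operations (for example through fast-math style flags).

```fortran
module reproducible_sum_sketch
  use iso_fortran_env, only: real64
  implicit none
  private
  public :: sum_fixed_order   ! hypothetical name, for illustration only

contains

  !> Sum the elements of x strictly from first to last.
  pure function sum_fixed_order(x) result(s)
    real(real64), intent(in) :: x(:)
    real(real64) :: s
    integer :: i
    s = 0.0_real64
    do i = 1, size(x)
      s = s + x(i)
    end do
  end function sum_fixed_order

end module reproducible_sum_sketch

program demo_sum_fixed_order
  use iso_fortran_env, only: real64
  use reproducible_sum_sketch, only: sum_fixed_order
  implicit none
  real(real64) :: a(3), b(3)

  ! The same three values in two different orders.
  a = [1.0e16_real64, 1.0_real64, -1.0e16_real64]
  b = [1.0e16_real64, -1.0e16_real64, 1.0_real64]

  ! Rounding depends on the order of the additions, so the two sums
  ! need not agree even though the operands are identical.
  print *, sum_fixed_order(a), sum_fixed_order(b)
end program demo_sum_fixed_order
```

A fixed order only addresses the reduction problem. Reproducible transcendental functions would need their own implementations, which this sketch does not attempt.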
This question comes up in #34 and elsewhere. How to implement specific procedures that work on different kinds (sp, dp, qp, int8, int16, int32, int64) as well as characters, where the body of the procedure is the same (can be copy/pasted entirely without breaking it). Let's first just focus on this scenario, and we can consider more complex cases later.
I know of a few approaches:
1. Repeat the code, that is, implement all specific procedures explicitly. That's what I did in functional-fortran, see https://github.com/wavebitscientific/functional-fortran/blob/master/src/lib/mod_functional.f90. Repeating is fine if you do it once and forget about it. The upside is that you can see the specific code and it needs no extra tooling. The downside is combinatorial explosion if you have procedures that are to handle all combinations of types and kinds. Most procedures are rather simple (one or two arguments), and I ended up with > 3K lines of code for 23 generic procedures. Most work was in editing the argument types to specific procedures, and less work was in copy/pasting of the repeatable content. I don't recommend this approach for stdlib.
2. Approach 1 can be somewhat eased by explicitly typing out the interfaces, and using #include 'procedure_body.inc', defined in a separate file. Then your procedure body collapses to one line. This reduces the total amount of code, but not so much the amount of work needed, as most work is in spelling out the interfaces. This approach still doesn't need extra tooling as a C preprocessor comes with all compilers that I'm aware of.
3. Use a custom preprocessor or templating tool. For example, a function that returns the set (unique elements) of an array:
pure recursive function set(x) result(res)
  integer, intent(in) :: x(:) !! Input array
  integer, allocatable :: res(:)
  if (size(x) > 1) then
    res = [x(1), set(pack(x(2:), .not. x(2:) == x(1)))]
  else
    res = x
  end if
end function set
A template could look like this:
pure recursive function set(x) result(res)
  {int*, real*}, intent(in) :: x(:) !! Input array
  {int*, real*}, allocatable :: res(:)
  ... ! body omitted for brevity
end function set
or similar, where the custom preprocessor would spit out specific procedures for all integer and real kinds. Some additional or alternative syntax would be needed if you wanted all combinations of type kinds between arguments.
There may be tools that do this already, and I think @zbeekman mentioned one that he uses. In general, for stdlib I think this is the way to go because we are likely to see many procedures that support multiple arguments with inter-compatible type kinds. The downside (strong downside IMO) is that we're likely to introduce a tool dependency that also depends on another language. If the community agrees, we can use this thread to review existing tools and which would be most fitting for stdlib.
Let's say we pick a tool to do the templating for us, we have two choices:
a) Have user build specifics from templates. In this scenario, the user must install the templating tool in order to build stdlib. I think we should avoid this.
b) Use the templating tool as developers only, and maintain the pre-built specifics in the repo. This means that when we're adding new code that will work on many type kinds, we use the tool on our end to generate the source, and commit that source to the repo (alongside the templates in a separate, "for developers" directory).
Assuming we can find a fitting tool, I'm in favor of the 3b approach here. There may be other approaches I'm not aware of or forgot about. What do you think and any other ideas?
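For reference, this is roughly the kind of source a templating tool would have to generate for the set example: one generic interface plus one specific procedure per kind, with identical bodies. It is only a sketch covering two kinds; the module name `set_generic_sketch` and the specific names `set_int32` and `set_real32` are made up for illustration.

```fortran
module set_generic_sketch
  use iso_fortran_env, only: int32, real32
  implicit none
  private
  public :: set

  ! One generic name that resolves to the kind-specific procedures below.
  interface set
    module procedure set_int32, set_real32
  end interface set

contains

  pure recursive function set_int32(x) result(res)
    integer(int32), intent(in) :: x(:)
    integer(int32), allocatable :: res(:)
    if (size(x) > 1) then
      res = [x(1), set_int32(pack(x(2:), .not. x(2:) == x(1)))]
    else
      res = x
    end if
  end function set_int32

  ! Same body as above; only the declared type and kind differ.
  pure recursive function set_real32(x) result(res)
    real(real32), intent(in) :: x(:)
    real(real32), allocatable :: res(:)
    if (size(x) > 1) then
      res = [x(1), set_real32(pack(x(2:), .not. x(2:) == x(1)))]
    else
      res = x
    end if
  end function set_real32

end module set_generic_sketch

program demo_set
  use iso_fortran_env, only: int32, real32
  use set_generic_sketch, only: set
  implicit none
  integer(int32), allocatable :: u(:)
  real(real32), allocatable :: v(:)

  u = set([1_int32, 2_int32, 2_int32, 3_int32, 1_int32])
  v = set([1.0_real32, 2.0_real32, 2.0_real32])
  print *, u
  print *, v
  if (size(u) /= 3) error stop 'unexpected number of unique integers'
  if (size(v) /= 2) error stop 'unexpected number of unique reals'
end program demo_set
```

With seven or more kinds, and with procedures taking several arguments, the number of such near-identical specifics grows quickly, which is the motivation for generating them.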
Here is how NumPy and Pandas do it:
and @jacobwilliams wrote a module for it here: https://github.com/jacobwilliams/fortran-csv-module
Obviously Fortran.
In #19, it looks like also using C might be beneficial.
I was hoping to avoid depending on C++, because that will make stdlib much simpler to distribute (and statically link, etc.). Down the road I would like stdlib to be shipped by default with compilers. So making it as simple as possible to distribute will be key.
Even simpler would be if we can stay in pure Fortran, but I don't know if we can do that.
Using "ifort (IFORT) 18.0.1 20171018" on Linux x86_64, test_savetxt_qp fails.
The routine qsavetxt writes one number per line, which is allowed by the star format.
I propose a fix in https://github.com/pdebuyl/stdlib/tree/qsavetxt_format_string , tested with said ifort and with gfortran 6.4.0 and 8.3.0.
I have not opened a PR yet, as I don't know if ifort compatibility is considered necessary at this point. I would support keeping stdlib working with several compilers early on.
Related to #2 and #7. Let's discuss here whether we should maintain manual Makefiles (besides CMake).
First some pros and cons to this.
Pros:
Cons:
Second, how much is this desired by the community? Use this issue to speak up.
Overall I like the convenience of plain Makefiles, and I like being able to see exactly what they're doing. However, in presence of a working CMake setup, I'm not likely to use them so I don't care as much for it. I never ever use systems that don't have access to CMake.
Third, if we do want manual Makefiles, how do we develop and maintain them? Should every contributor be responsible for adding their code to the Makefile? Otherwise, would we need a volunteer Makefile maintainer?
@zbeekman wrote:
Best thing I've found so far is findent and eclint for Fortran. findent works off an actual AST and claims to test against all code examples in the latest Modern Fortran Explained by MRC, so it should be robust. Probably won't help with this particular style issue, but I haven't been using it in earnest yet as I just discovered it (and added it to Homebrew).
The modern Fortran API for a serial linear algebra (#10) seems natural.
How would that be extended to work in parallel using co-arrays? If there is a similar "natural" parallel API for linear algebra using modern Fortran, then that would be a good candidate for inclusion into stdlib, and we can have different backends that do the work (Scalapack, ..., perhaps even our own simpler reference implementation using co-arrays directly), that way if somebody writes a faster 3rd party library, then it could be plugged in as a backend, and user codes do not need to change, because they would already be using the stdlib API for parallel linear algebra.
An annoyance with optional arguments is handling their default values. This usually looks something like:
function mylog(x, base) result(y)
  real, intent(in) :: x
  real, intent(in), optional :: base
  real :: y
  real :: base_
  base_ = 10.0
  if (present(base)) base_ = base
  y = log(x)/log(base_)
end function mylog
I propose to introduce a module that exports a generic function default which will in many cases allow for the elimination of the local copy of the optional argument. The above example could be rewritten
function mylog(x, base) result(y)
  use default_values, only: default
  real, intent(in) :: x
  real, intent(in), optional :: base
  real :: y
  y = log(x)/log(default(10.0, base))
end function mylog
This is a convenience, but it's incredibly handy. The module is very simple to write. See, e.g., this CLF post by Beliavsky, where I first learned of this trick. I put this in all my serious codes, as it removes a lot of the tedium and error potential with optional arguments.
I can easily spin up an illustrative PR that can be fleshed out once we've decided how we're going to automate generic interfaces.
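A minimal sketch of what such a default_values module could look like, covering only default real and default integer. The specific procedure names (`default_real`, `default_integer`) and the dummy argument names are assumptions; the actual module may differ. It relies on the standard rule that an absent optional dummy argument may be passed on to another optional dummy argument.

```fortran
module default_values
  implicit none
  private
  public :: default

  interface default
    module procedure default_real, default_integer
  end interface default

contains

  !> Return opt if it is present, otherwise the fallback value.
  pure function default_real(fallback, opt) result(val)
    real, intent(in) :: fallback
    real, intent(in), optional :: opt
    real :: val
    if (present(opt)) then
      val = opt
    else
      val = fallback
    end if
  end function default_real

  pure function default_integer(fallback, opt) result(val)
    integer, intent(in) :: fallback
    integer, intent(in), optional :: opt
    integer :: val
    if (present(opt)) then
      val = opt
    else
      val = fallback
    end if
  end function default_integer

end module default_values

program demo_default
  use default_values, only: default
  implicit none

  print *, mylog(100.0), mylog(8.0, 2.0)
  if (abs(mylog(100.0) - 2.0) > 1.0e-5) error stop 'base-10 default not applied'
  if (abs(mylog(8.0, 2.0) - 3.0) > 1.0e-5) error stop 'explicit base ignored'

contains

  function mylog(x, base) result(y)
    real, intent(in) :: x
    real, intent(in), optional :: base
    real :: y
    ! base may be absent here; default() then returns 10.0.
    y = log(x)/log(default(10.0, base))
  end function mylog

end program demo_default
```

Supporting every kind would again require one specific per type and kind, so this module is itself a candidate for whatever generic-interface automation is chosen.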
Use this issue to discuss how best to document stdlib. I put a placeholder Markdown file here.
Currently the standard Fortran's print *, A prints a 2D array A as a 1D list of numbers. Rather, I would like stdlib to have a function print_array (we can discuss a better name) that would print the array the way NumPy does:
>>> numpy.arange(10000).reshape(250,40)
array([[   0,    1,    2, ...,   37,   38,   39],
       [  40,   41,   42, ...,   77,   78,   79],
       [  80,   81,   82, ...,  117,  118,  119],
       ...,
       [9880, 9881, 9882, ..., 9917, 9918, 9919],
       [9920, 9921, 9922, ..., 9957, 9958, 9959],
       [9960, 9961, 9962, ..., 9997, 9998, 9999]])
or Julia:
julia> B = [1 2; 3 4; 5 6; 7 8; 9 10]
5×2 Array{Int64,2}:
1 2
3 4
5 6
7 8
9 10
Julia can also use nice unicode characters for ... and vertical ... if the array is too large.
Then we should use this function at stdlib/src/tests/loadtxt/test_loadtxt.f90 (line 21 in ae5591f).
Then compilers can perhaps optionally use such print_array as the default in Fortran's print statement.
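As a starting point for discussion, here is a minimal sketch of a print_array for default-integer 2D arrays that prints one matrix row per line. It does not attempt the "..." truncation that NumPy and Julia apply to large arrays, and the module name `print_array_sketch` is hypothetical.

```fortran
module print_array_sketch
  implicit none
  private
  public :: print_array

contains

  !> Print a 2D integer array with one matrix row per output line.
  subroutine print_array(a)
    integer, intent(in) :: a(:,:)
    integer :: i
    do i = 1, size(a, 1)
      ! Unlimited repeat count (Fortran 2008): as many items as the row has.
      print '(*(1x, i0))', a(i, :)
    end do
  end subroutine print_array

end module print_array_sketch

program demo_print_array
  use print_array_sketch, only: print_array
  implicit none
  integer :: a(2, 3)

  ! reshape fills column by column, so the first row is 1 3 5.
  a = reshape([1, 2, 3, 4, 5, 6], [2, 3])
  call print_array(a)
end program demo_print_array
```

A real implementation would also need other types and kinds, column alignment, and an upper limit on the number of rows and columns shown.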
The following (sorry about the length) is a module that sets defaults for real and integer KIND parameters (what I call a data model) but allows users to define their own or change the defaults using preprocessor defines. I think something like this along with some module procedures for checking things like storage size and interrogating machine parameters à la R1MACH from Linpack etc. should be a basic part of the library. This would allow users to build different versions with different combinations of integer and real default types. I prefer this approach over trying to provide a different routine for each possible combination of integer and real types. Again, that's a personal preference, but I prefer having 4 different versions of a library over trying to provide four different versions of a subroutine/function and providing a generic interface. Also, I think part of the coding standard for the library should forbid real(8), real*8, implicit double precision etc.
*** dataModel.F90 ***
Module dataModel
! Define default data model precisions (kinds) for a project
! The default data model is assumed to be 32 bit integers and
! 64 bit reals
! Define intrinsic KIND parameters for Modern Fortran programs.
! If we have a Fortran 2008 compliant version of ISO_FORTRAN_ENV, we will
! use it; if not we will create our own versions of the equivalent Fortran
! integer and real kinds using SELECTED_INT_KIND and SELECTED_REAL_KIND. We
! also define a "machine zero" function for both 32 and 64 bit Real values
Use ISO_FORTRAN_ENV
#ifdef HAVE_USER_DATAMODEL
USE USERDATAMODEL, ONLY: USER_DEFAULT_INT, USER_DEFAULT_REAL
#endif
Implicit NONE
#ifdef NO_F2008_ENV
! Define Integer and Real intrinsic KINDS with same names as in Fortran 2008
! ISO_FORTRAN_ENV
Integer, Parameter :: INT8 = SELECTED_INT_KIND(2)
Integer, Parameter :: INT16 = SELECTED_INT_KIND(4)
Integer, Parameter :: INT32 = SELECTED_INT_KIND(9)
Integer, Parameter :: INT64 = SELECTED_INT_KIND(18)
Integer, Parameter :: REAL32 = SELECTED_REAL_KIND(P=6, R=37)
Integer, Parameter :: REAL64 = SELECTED_REAL_KIND(P=15, R=307)
#endif
#ifdef NO_REAL128
! Integer, Parameter :: REAL128 = REAL64
#else
#ifdef NO_F2008_ENV
Integer, Parameter :: REAL128 = SELECTED_REAL_KIND(p=33, R=4931)
#endif
#endif
#ifdef HAVE_USER_DATAMODEL
! Set DEFAULT_INT and DEFAULT_REAL to USER DATAMODEL
! values. Otherwise use native defaults
Integer, Parameter :: DEFAULT_INT = USER_DEFAULT_INT
Integer, Parameter :: DEFAULT_REAL = USER_DEFAULT_REAL
#else
! Set default values to 32 bit integers and 64 bit reals but allow
! users to change this at compile time by setting -DI8INT or -DR4REAL
! to select 64 bit integers and/or 32 bit reals
#ifdef I8INT
Integer, Parameter :: DEFAULT_INT = INT64
#else
Integer, Parameter :: DEFAULT_INT = INT32
#endif
#ifdef R4REAL
Integer, Parameter :: DEFAULT_REAL = REAL32
#else
#ifdef R16REAL
Integer, Parameter :: DEFAULT_REAL = REAL128
#else
Integer, Parameter :: DEFAULT_REAL = REAL64
#endif
#endif
#endif
! Define short(er) names and common aliases for the default kind parameters
Integer, Parameter :: WP = DEFAULT_REAL
Integer, Parameter :: IWP = DEFAULT_INT
Integer, Parameter :: QP = REAL128
! Define a C_ENUM type for use with Fortran ENUMERATOR variables. This
! should be just C_INT but we define an explicit C_ENUM to handle
! the possibility that it's not. This should have been included in
! the Fortran 2003 C-Interop facility but for some reason known
! only to the standards folks was not.
Enum, BIND(C)
ENUMERATOR :: dummy
End Enum
Private :: dummy
Integer, Parameter :: C_ENUM=KIND(dummy)
Public :: INT8, INT16, INT32, INT64, REAL32, REAL64, REAL128, C_ENUM, WP, &
IWP, QP
End Module dataModel
Use this issue to discuss and propose build systems and/or methods.
Some candidates so far:
configure && make && make install
stdlib may include a module which provides an interface to the POSIX I/O calls in the C standard library. Such a module would support higher level functionality proposed in #14 on Unix-like platforms.
This module, or specific components, could be conditionally integrated into stdlib by the build system (CMake, autotools, etc) depending on whether they are available.
Such a module could also be extended to include more POSIX calls, e.g. thread support, memory allocation, lower-level system calls. For now, I'd say to keep the scope more narrow in order to keep it achievable.
I'm willing to spearhead a module or set of modules containing:
I have already modernized some classic codes and there are also some others with permissive licenses floating around. I am willing to merge mine into the stdlib. I need to gather them together and provide a nice interface so that a user can call any of the solvers in a similar way. We can have a discussion on what that interface would look like. Here are some example codes:
Eventually we could even provide (optional) integrations with larger third-party and/or commercial libraries such as IPOPT and/or SNOPT
Note that some of these in various Python packages are just wrappers to Fortran 77 codes (such as slsqp).
Branched from #1 (comment)
One personal challenge I have with stock Fortran is its somewhat awkward and low-level I/O facilities -- open, read, write, inquire, rewind, and close. I often wished for a higher-level interface, like what you get with Python's open() -- you open a file with a function, get a file-like instance with methods that let you do stuff with it.
This would do away with unit numbers, which I don't think application developers should have to deal with. It could also be a solution to the problem that allocatable character strings must be pre-allocated before use in a read statement.
Is there anything similar out there for Fortran? Would this be of interest to people here? I'd use it.
@jvdp1 wrote: I would use it too.
@cmacmackin wrote: I'd personally like something along these lines. However, the problem is in defining methods on the file-object; these would need to know the number and type of arguments at compile-time. It would be impractical to produce methods with every conceivable permutation of object types. It would also require variadic functions, which are not available. As such, this can not be implemented well in Fortran, although perhaps something would be possible if we were to wrap some C-routines and pass in deferred-type objects.
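As a small first step in this direction, a function can at least hide the choice of unit number by using newunit. The sketch below is not an existing stdlib API: the name `open_file`, the `mode` argument, and its 'r'/'w' values are assumptions modeled loosely on Python's open(). It still returns a plain unit number and so does not address the variadic read/write problem raised above.

```fortran
module file_open_sketch
  implicit none
  private
  public :: open_file   ! hypothetical name, for illustration only

contains

  !> Open a text file and return the unit chosen by the processor.
  function open_file(filename, mode) result(u)
    character(*), intent(in) :: filename
    character(*), intent(in), optional :: mode   ! 'r' (default) or 'w'
    integer :: u
    character(:), allocatable :: mode_

    mode_ = 'r'
    if (present(mode)) mode_ = mode

    select case (mode_)
    case ('r')
      open(newunit=u, file=filename, status='old', action='read')
    case ('w')
      open(newunit=u, file=filename, status='replace', action='write')
    case default
      error stop 'open_file: unsupported mode'
    end select
  end function open_file

end module file_open_sketch

program demo_open_file
  use file_open_sketch, only: open_file
  implicit none
  integer :: u, n

  ! Write a value, read it back, then remove the scratch file.
  u = open_file('demo_open_file.txt', 'w')
  write(u, *) 42
  close(u)

  u = open_file('demo_open_file.txt')
  read(u, *) n
  close(u)
  if (n /= 42) error stop 'round trip through the file failed'

  open(newunit=u, file='demo_open_file.txt', status='old')
  close(u, status='delete')
end program demo_open_file
```

A derived type holding the unit, with type-bound procedures for common cases such as reading a whole line into an allocatable string, could be layered on top of this.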
What minimal CMake version should we require?
Sorry if this post is irrelevant to this repo. I am posting here because I understand people here are mostly package maintainers, so I think it's the right recipients for this post. I can move away to not mess this place if this project takes off.
I just saw this repo for the first time and my first fear was "it is not going to work". The standard library in C, Python, IDL or other languages I have experience with holds the most essential functions possible. Here half of the proposals or more are extremely particular features. They do not fit in stdlib. These proposals are awesome but more suitable for a packaging ecosystem (that we discussed in the other repo) than a stdlib.
I am so happy that there is this movement because since I started using Fortran I've felt frustration and I thought nobody shared it. Then I realized some people have the same issues, and the founders of the fortran-lang project have made the amazing effort to organize this chaotic movement into streamlined and targeted action. Since it's Christmas time I wanted to express my thankfulness for this from the bottom of my heart.
I wanted to ask if anyone is willing to participate in the following experimental project.
How about we try to take a dozen of packages (each of us created or maintains at least one) and attempt to make an experimental packaging ecosystem that will hold all of them. So we use currently existing and mature tools (make, gcc, gfortran) just to make it work on one platform (I think it should be linux/unix because it's the easiest and free). If that takes off and we reach the critical mass, we might expand to cover all needs (Windows, other compilers).
I am willing to put my effort (as I have a bit of free time now), however I have had no experience with packaging other than pypi and rpm (which is mostly binary packages). I know some people have mentioned they worked with some of source based packaging systems that could work for Fortran.
In this issue, instead of discussing whether it's a good idea or not, I would like to collect suggestions and advice of tools and solutions that would make it possible in the fastest time.
We need:
Again, despite my personal feelings, I do not want to argue which of the solutions (stdlib vs packaging ecosystem) is superior. I just want to make a working demonstration product in a short time.
Tangential to #3. The choice of a module naming pattern is important because a module with a generic name can cause conflict in the client code. In the absence of namespaces, the standard library modules should have a specific enough prefix to prevent such conflicts.
One approach could be to use
stdlib for the top-level module;
stdlib_<group> for specific modules, e.g. stdlib_collections, stdlib_sorting, etc. You get the idea.
In recent years, I tend to name modules with a mod_ prefix, e.g. mod_functional. I see in the wild that module_, m_ prefixes, or even _m suffix are used. If I name the source file the same as the module, then I can easily see which source files contain modules and which don't. To me, this is useful if I haven't looked at the library in a long time, but now I can't think of any other benefit to this "convention". If this convention is used for stdlib, then we'd have:
mod_stdlib
mod_stdlib_<group>
However, this seems unnecessarily verbose, and most library files are likely to be modules anyway (with the exception of tests which would be programs in their own directory, see #7). Thus, if we use the stdlib_ prefix universally, I don't think we need mod_ or similar.
Here is an example implementation of loading and saving ppm: https://github.com/certik/fortran-utils/blob/b43bd24cd421509a5bc6d3b9c3eeae8ce856ed88/src/ppm.f90. The advantage of the ppm format is that it is simple to write such readers and writers. Then one can use external tools (such as pnmtopng) to convert to more common formats. So perhaps tiff, jpeg and png are not initially needed and ppm might be enough to allow working with images in Fortran.
Prior art:
Use this issue to discuss the code style for the stdlib.
The most widely supported elements of style will eventually be merged into the Style Guide for contributors.
Use this issue to discuss the workflow that we should adopt. This will eventually become part of the Workflow doc
For the time being:
We can also discuss the preferred git/GitHub workflows for contributing to the repo, such as feature branches, pull requests, etc.
First issue in this repo which evolved from this thread. This is a broad, open ended, high-level issue, so feel free to go wide and crazy here.
To propose a specific module, procedure, or derived type, please open a new issue. You can follow the same format as in Fortran Proposals.
From @apthorpe:
From @FortranFan:
From @zbeekman:
Without the datafiles, the test would fail on my machine when running an out of source build.
The branch here https://github.com/pdebuyl/stdlib/tree/cmake_copy_dat_files makes a copy of the data files. I preferred to open an issue since I don't know if a PR would be of any use here.
A linked list is one of the essential data structures besides an array. It allows you to add, insert, or remove elements in constant time, without re-allocating the whole structure.
Fortran doesn't have a linked list. There are 3rd party libraries, but no obvious go-to solution. Fortran stdlib should have a linked list. I would use it.
There are various levels of capability we could pursue:
I don't know, something like this?
use stdlib_experimental_collections, only: List
type(List) :: a
a = List()
call a % append(42)
call a % append(3.141)
call a % append('text')
print *, a % get(2) ! prints 3.141
call a % remove(3) ! a is now List([42, 3.141])
call a % insert(2, 'hello') ! a is now List([42, 'hello', 3.141])
a = List([1, 2, 3]) ! instantiate a list from an array
The linspace API is implemented here: https://github.com/certik/fortran-utils/blob/b43bd24cd421509a5bc6d3b9c3eeae8ce856ed88/src/mesh.f90#L157
Matlab's linspace.
The logspace is similar, but I don't have it implemented yet. Historically I have used a function called meshexp, which is more general: it allows you to change the gradation of the mesh, which Matlab's logspace does not allow. NumPy's logspace allows setting a different base, which changes the gradation. So I think my meshexp can be implemented using NumPy's logspace. NumPy also has geomspace, where you can specify the end points directly (just like in my meshexp), but it does not allow changing the gradation. So I think there is room for meshexp; perhaps we should change the name somehow to be consistent with the other functions.
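For concreteness, a minimal sketch of a linspace with NumPy-like semantics (both end points included, n points in total). This is not the fortran-utils implementation linked above; the module name `linspace_sketch` and the choice to return the first end point when n is 1 are assumptions.

```fortran
module linspace_sketch
  use iso_fortran_env, only: dp => real64
  implicit none
  private
  public :: linspace

contains

  !> Return n evenly spaced points from a to b, end points included.
  pure function linspace(a, b, n) result(x)
    real(dp), intent(in) :: a, b
    integer, intent(in) :: n
    real(dp) :: x(n)
    integer :: i

    if (n == 1) then
      x(1) = a   ! avoid dividing by n - 1 = 0
      return
    end if
    do i = 1, n
      x(i) = a + (b - a) * real(i - 1, dp) / real(n - 1, dp)
    end do
  end function linspace

end module linspace_sketch

program demo_linspace
  use iso_fortran_env, only: dp => real64
  use linspace_sketch, only: linspace
  implicit none
  real(dp) :: x(5)

  x = linspace(0.0_dp, 1.0_dp, 5)
  print *, x
  if (any(abs(x - [0.0_dp, 0.25_dp, 0.5_dp, 0.75_dp, 1.0_dp]) > 1.0e-12_dp)) &
    error stop 'linspace returned unexpected points'
end program demo_linspace
```

A logspace with a base argument could be built on top of this by raising the base to each point returned by linspace.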
What should we use for continuous integration?
GitLab's CI tools are amazing, but on GitHub I struggle every time I want to set up something. What's the state-of-the-art here for a Fortran+C project, ideally with builds for as many platforms we can get, and ideally free of charge?
@zbeekman @jacobwilliams @scivision
We have some time to explore and decide, but at the point when we have some code + tests in, we should CI.
This module should include functions for character classification and conversion (lower, upper). I have prepared a basic implementation at https://github.com/ivan-pi/fortran-ascii.
The plan is to cover the same functionality as found in the C, C++, and D libraries:
@zbeekman has already opened an issue (see ivan-pi/fortran-ascii#1) on dealing with different character kinds. The problem is that the ascii and iso_10646 character sets need not be supported by the compilers. Even if they are supported their bitwise representation might be different from the default kind.
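To make the scope concrete, here is a minimal sketch of one classification and one conversion function for the default character kind. It is not the code from ivan-pi/fortran-ascii; the module name `ascii_sketch` is made up. It uses iachar and achar, which are defined in terms of the ASCII collating sequence, so it does not depend on how the default kind orders characters.

```fortran
module ascii_sketch
  implicit none
  private
  public :: is_alpha, to_upper

contains

  !> True if c is an ASCII letter (A-Z or a-z).
  elemental function is_alpha(c) result(res)
    character(len=1), intent(in) :: c
    logical :: res
    integer :: k
    k = iachar(c)
    res = (k >= 65 .and. k <= 90) .or. (k >= 97 .and. k <= 122)
  end function is_alpha

  !> Map a-z to A-Z and leave every other character unchanged.
  elemental function to_upper(c) result(t)
    character(len=1), intent(in) :: c
    character(len=1) :: t
    integer :: k
    k = iachar(c)
    if (k >= 97 .and. k <= 122) then
      t = achar(k - 32)
    else
      t = c
    end if
  end function to_upper

end module ascii_sketch

program demo_ascii
  use ascii_sketch, only: is_alpha, to_upper
  implicit none
  character(len=6) :: s
  integer :: i

  s = 'hello1'
  do i = 1, len(s)
    s(i:i) = to_upper(s(i:i))
  end do
  print *, s
  if (s /= 'HELLO1') error stop 'to_upper gave an unexpected result'
  if (.not. is_alpha('q') .or. is_alpha('7')) error stop 'is_alpha misclassified'
end program demo_ascii
```

Other character kinds would need separate specifics, which is the issue raised in ivan-pi/fortran-ascii#1.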
I realized while creating these functions, that agreeing upon a style guide #3 and documentation #4 early on would be helpful to improve future pull requests. Some agreement upon unit testing will also be necessary.
cc: @jacobwilliams
In order to more easily judge how Fortran developers actually use Fortran in real production codes (this can be useful for issues like #25), let's maintain a list of popular open source Fortran projects here, sorted by the number of stars at GitHub (in parentheses).
The list was moved to a Wiki:
https://github.com/fortran-lang/stdlib/wiki/List-of-popular-open-source-Fortran-projects
I implement a user base class to support some of the Abstract Data Types (lists etc) and sorting codes I've implemented. It contains no data but defines dummy procedures for things I need to do to support sorting, generic lists etc., mainly relational operators (>, <, >=, <=, ==, assignment etc.) and a print method. I implement this as a concrete (non-abstract) class to avoid having to override all the methods, as would be required with an abstract class with deferred abstract interfaces for the procedures, since I might not need all of the procedures defined in the concrete class in the extended class. I think we will need something similar to this (or maybe a God or World class à la Java that all classes are derived from) to support user defined types.
We have not reached an agreement if we should be using sp, dp, qp or some other names. This is a subset of the issue #25. This current issue is only for the naming convention. Anything else should be discussed in #25.
This is not a pressing issue, as for now we use sp, dp, qp as placeholders to allow us to move on to implement actual functionality. But we definitely have to reach an agreement before we consider moving from experimental to main.
I was hoping to do a survey of all open source Fortran projects, as well as some closed source ones that I have access to, and then we'll see what the larger community is actually using. Then we can decide what to do.
Per @certik's request, I propose we extend his stdlib_experimental_error.f90 code to a standard assert subroutine and supplement it with some pre-processor macros. As an example, here is my implementation of an assert routine and the associated macros:
assertions.f90
The associated preprocessor macros are
assert_macros.txt
Note, these are my implementations of similar routines and macros found in the FTL project.
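Since the attachments are not reproduced here, the following is only a generic sketch of what a minimal assert subroutine might look like. It is not the attached assertions.f90, and the module name `assert_sketch` and the argument names are assumptions. The preprocessor macros (which would typically add file and line information) are left out.

```fortran
module assert_sketch
  use iso_fortran_env, only: error_unit
  implicit none
  private
  public :: assert

contains

  !> Stop the program with a nonzero code if condition is false.
  subroutine assert(condition, message)
    logical, intent(in) :: condition
    character(*), intent(in), optional :: message

    if (condition) return
    if (present(message)) then
      write(error_unit, '(a)') 'Assertion failed: ' // message
    else
      write(error_unit, '(a)') 'Assertion failed'
    end if
    error stop 1
  end subroutine assert

end module assert_sketch

program demo_assert
  use assert_sketch, only: assert
  implicit none
  integer :: n

  n = 3
  call assert(n > 0, 'n must be positive')
  print *, 'all assertions passed'
end program demo_assert
```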
Prior art in other languages:
In Fortran (I keep this list updated with all implementations posted in this issue):
I really like the SciPy simple non-OO implementation, and I have ported it to modern Fortran in the link above. If people also want an OO implementation, then it can be built on top as an option.
One thing that I found out is that one must sort the indices and the overall speed very much depends on how quickly one can sort them. I ended up using quicksort, but it might be even faster to use some specialized sorting algorithm (such as Timsort) because in practice, the indices have subsections that are already sorted (typically coming from some local to global mapping as in finite elements), but overall they are not sorted.
Here is a 2D implementation https://github.com/certik/fortran-utils/blob/b43bd24cd421509a5bc6d3b9c3eeae8ce856ed88/src/mesh.f90#L164, but NumPy's meshgrid is much more general. Here is Matlab's meshgrid.
Due to bug(s) in PGI / Flang, the stdlib io and optval don't build. There are spurious errors like
PGF90-F-0000-Internal compiler error. interf:new_symbol, symbol not found 630 (src/tests/io/test_loadtxt.f90: 2)
Fiddling around with the code, PGI will even give a similar error for implicit none.
Basically this seems a lot like other cases where there were bugs in PGI / Flang and not something wrong with stdlib. I did some attempts at workarounds in https://github.com/scivision/stdlib/tree/qp_opt
One could file a bug report with each of PGI and Flang, as the error is effectively identical, perhaps boiling it down to a minimum working example.
We can use FFTPACK, or my own modern Fortran refactoring of it: https://github.com/certik/hfsolver/blob/b4c50c1979fb7e468b1852b144ba756f5a51788d/src/fourier.f90 and we need to allow optionally using MKL or FFTW.
NumPy uses FFTPACK that they transformed into C and heavily modified: https://github.com/numpy/numpy/tree/a9bb517554004cf2ce7a4be93bcbfb63ee149844/numpy/fft. We could use it also, but I think it might be valuable to stay in Fortran (see #20).
Let's start a discussion on routines for string handling and manipulation. The thread over at j3-fortran already collected some ideas:
The discussion also mentioned the proposed iso_varying_string module, which was supposed to include some string routines. I found three distinct implementations of this module:
iso_varying_string proposal; the module dates back to 1998)
I also found the following Fortran libraries targeting string handling:
sub, gsub, split, join, and conversion on concatenation. WIP though
It is likely that several of the tools in the list of popular Fortran projects also contain some tools for working with strings. Given the numerous implementations it seems like this is one of the things where the absence of the standard "... led to everybody re-inventing the wheel and to an unnecessary diversity in the most fundamental classes" to borrow the quote of B. Stroustrup in a retrospective of the C++ language.
For comparison here are some links to descriptions of string handling functions in other programming languages:
Obviously, for now we should not aim to cover the full set of features available in other languages. Since the scope is quite big, it might be useful to break this issue into smaller issues for distinct operations (numeric conversions, comparisons, finding the occurrence of a string in a larger string, joining and splitting, regular expressions).
My suggestion would be to start with some of the easy functions like capitalize, count, endswith, startswith, upper, lower, and the conversion routines from numeric types to strings and vice-versa.
Implementation: https://github.com/certik/fortran-utils/blob/b43bd24cd421509a5bc6d3b9c3eeae8ce856ed88/src/utils.f90#L176
The interface is compatible with NumPy (e.g., you can do savetxt from Fortran and do loadtxt from NumPy and it just works, and vice versa).
The loadtxt has the argument as allocatable, intent(out), because you typically do not know the size of the matrix ahead of time.
How should Fortran stdlib be licensed?
I initially chose my favorite license, which is MIT. We can discuss and change this if another license is preferred.
I think it's important that the license:
What else or anything I'm missing?
I'll just throw this out as an alternative to building stdlib from scratch. Seth Johnson at ORNL has a project that can generate SWIG bindings to the C++ STL for Fortran (see https://github.com/swig-fortran). I ran across his project a few months back and thought it interesting but didn't do a deep dive into what issues were involved in using it for things like lists etc. More info in the following two PDFs:
Johnson_Automated_Fortran-C++_Bindings_ArXiv.pdf
siam-cse-johnsonsr.pdf
In stdlib_experimental_error.f90, there is the following example:
call error_stop("Invalid argument")
A similar case can be found in stdlib_experimental_io.f90.
When running a large program, messages such as "Invalid argument" are quite useless. Should we discuss and agree on a good way to mention error messages, e.g.,
ERROR (_name_of_the_function_): Invalid argument (_argument_)
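One way to get messages in that shape is a small helper that builds the text before it is handed to error_stop. The sketch below is only an illustration of the proposed format: `format_error`, the module name `error_message_sketch`, and the argument names are hypothetical and not part of stdlib_experimental_error.f90.

```fortran
module error_message_sketch
  implicit none
  private
  public :: format_error   ! hypothetical helper, for illustration only

contains

  !> Build "ERROR (procedure): message (argument)"; the argument part is optional.
  pure function format_error(procedure_name, message, argument) result(text)
    character(*), intent(in) :: procedure_name
    character(*), intent(in) :: message
    character(*), intent(in), optional :: argument
    character(:), allocatable :: text

    text = 'ERROR (' // procedure_name // '): ' // message
    if (present(argument)) text = text // ' (' // argument // ')'
  end function format_error

end module error_message_sketch

program demo_format_error
  use error_message_sketch, only: format_error
  implicit none
  character(:), allocatable :: msg

  msg = format_error('loadtxt', 'Invalid argument', 'filename')
  print '(a)', msg
  if (msg /= 'ERROR (loadtxt): Invalid argument (filename)') &
    error stop 'unexpected message format'
end program demo_format_error
```

The resulting string could then be passed to the existing error_stop, so callers keep a single place that decides how the program terminates.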
eig, eigh, inv, solve, det, svd, ... . A possible implementation is here: https://github.com/certik/fortran-utils/blob/b43bd24cd421509a5bc6d3b9c3eeae8ce856ed88/src/linalg.f90.
All these functions will be implemented in the stdlib_linalg module, and they would probably just call Lapack. The general idea is for these to be general routines that will just work, with a simple intuitive interface, and the highest performance given the simple API. One can always achieve higher performance with more specialized routines for a particular problem (and a more complicated API), but that is not the point here. Rather, we would like Matlab / NumPy style routines to do linear algebra.
In particular, let's start with eig, for the eigenvalues and eigenvectors of a general (non-symmetric) matrix. Both NumPy and Matlab have a very similar interface called eig, that I propose we use:
Julia seems to have more of an "object oriented" interface called eigen: https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/index.html#LinearAlgebra.eigen, which uses some Julia language features to emulate the Matlab style vals, vecs = eigen([1.0 0.0 0.0; 0.0 3.0 0.0; 0.0 0.0 18.0]).
When we declare a real variable in our API, we have to use some variable to represent the real kind. Available options for single, double and quadruple precision that people have used in codes:
1. real(sp), real(dp), real(qp)
2. real(real32), real(real64), real(real128)
3. real(r32), real(r64), real(r128)
4. real(float32), real(float64), real(float128)
5. real(f32), real(f64), real(f128)
6. real(wp) (for default real)
7. real(r4), real(r8), real(r16)
I will keep appending to this list if we find some code that uses different names.
Those variables must be defined somewhere. I will use case 1. below; for the other cases we simply substitute different names. The available options:
a. A stdlib_types module that provides sp, dp, qp for single, double and quadruple precision as follows:
integer, parameter :: sp=kind(0.), & ! single precision
dp=kind(0.d0), & ! double precision
qp=selected_real_kind(32) ! quadruple precision
The main idea behind this option is that there is a module in stdlib that provides the types and every other module uses it. The proposal #13 is similar to it. There are several options for how the types are defined inside the module: one can also define sp and dp using selected_real_kind. Another alternative is to define sp, dp and qp using iso_fortran_env, as in b. The module stdlib_types could also be named differently.
b. use iso_fortran_env, only: sp=>real32, dp=>real64, qp=>real128 (if case 2. above is used, then one does not need to rename, so it simplifies to just use iso_fortran_env, only: real32, real64, real128). Unlike a., this option does not introduce a new module in stdlib. One simply uses iso_fortran_env everywhere directly.
c. use iso_c_binding, only: sp=>c_float, dp=>c_double, qp=>c_float128. Unlike a., this option does not introduce a new module in stdlib.
I will keep this list updated if more options become available.
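As a rough illustration of how a. and b. can be combined, the sketch below defines the case 1. names inside a stdlib_types module on top of the iso_fortran_env constants. The module name is taken from option a.; everything else is an assumption for illustration, not an agreed design.

```fortran
module stdlib_types
  use, intrinsic :: iso_fortran_env, only: real32, real64, real128
  implicit none
  private
  public :: sp, dp, qp

  integer, parameter :: sp = real32   ! single precision
  integer, parameter :: dp = real64   ! double precision
  integer, parameter :: qp = real128  ! quadruple precision (negative if the compiler has no such kind)
end module stdlib_types
```

Other modules would then write use stdlib_types, only: dp and declare variables as real(dp). Switching the naming scheme to any of the other cases would only require editing this one module.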
Most of the features discussed so far are serial.
Should we include parallel algorithms in stdlib?
Should we use co-arrays or MPI?
What would be some good initial parallel algorithms to start with?
In #14 @milancurcic indicated he'd like to have something along the lines of a printf function, like in C. I suggested a way to do this:
C-style formatting is something I'd very much like in Fortran. I think it may help some newcomers to the language as well because this kind of formatting is more common in other languages. But that's for another proposal. :)
This actually wouldn't be too difficult to implement in a standard library. We'd just write a series of wrappers in C, taking different numbers of void* arguments. We'd then use interoperability to call these from Fortran and wrap them in a generic block. We could have versions accepting between, say, 1 and 30 arguments (tedious, but could be automatically generated), which should be enough for anyone.
I went on to comment:
My suggestion of calls to C was specifically for a printf function. This would avoid the combinatorial explosion because the wrappers take void* arguments. These can be passed in from Fortran using an assumed-type argument, type(*). The interface would look something like this:
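#include <stdio.h>

/* Sketch only: printf receives the pointers themselves, so value
   conversions such as %d would still need the arguments dereferenced. */
void printf_wrapper1(const char *format, void *arg1) {
    printf(format, arg1);
}

void printf_wrapper2(const char *format, void *arg1, void *arg2) {
    printf(format, arg1, arg2);
}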
interface printf
  subroutine printf_wrapper1(format_str, arg1) bind(c)
    use, intrinsic :: iso_c_binding, only: c_char
    character(kind=c_char, len=1), dimension(*), intent(in) :: format_str
    type(*), intent(in) :: arg1
  end subroutine printf_wrapper1
  subroutine printf_wrapper2(format_str, arg1, arg2) bind(c)
    use, intrinsic :: iso_c_binding, only: c_char
    character(kind=c_char, len=1), dimension(*), intent(in) :: format_str
    type(*), intent(in) :: arg1, arg2
  end subroutine printf_wrapper2
end interface printf
The main complication with this is how to convert between Fortran and C strings. It wouldn't be hard to provide wrapper routines which do this for the format string, but string arguments to printf could be more of a challenge.
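For the format string, the conversion could look roughly like the sketch below. The module name c_string_demo and the function to_c_string are hypothetical, and the sketch assumes that the default character kind is the same as c_char, which holds on common compilers but is not guaranteed.

```fortran
module c_string_demo
  use, intrinsic :: iso_c_binding, only: c_char, c_null_char
  implicit none
  private
  public :: to_c_string

contains

  ! Copy a Fortran string into a c_char array and append the terminating NUL.
  ! Trailing blanks are kept; call trim() first if they are not wanted.
  pure function to_c_string(str) result(cstr)
    character(len=*), intent(in) :: str
    character(kind=c_char, len=1) :: cstr(len(str) + 1)
    integer :: i

    do i = 1, len(str)
      cstr(i) = str(i:i)
    end do
    cstr(len(str) + 1) = c_null_char
  end function to_c_string

end module c_string_demo
```

The result can be passed directly as an assumed-size character(kind=c_char) dummy argument, such as format_str in the interface above.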
I think we should have, as part of the style guide and/or coding practices, a standard naming convention (or two) for derived types. I currently append an _t suffix to all derived type names, à la
Type :: user_t
However, as I use more extended types and classes, I'm considering using either _et or _c to denote an extended type, à la
Type :: user_et or user_c
Just an idea I would like to throw out
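To make the proposal concrete, here is a minimal sketch of how the two suffixes would read side by side. The module name naming_demo and the components name and id are made up for illustration; only the user_t and user_et names come from the proposal.

```fortran
module naming_demo
  implicit none
  private
  public :: user_t, user_et

  ! Base derived type, using the _t suffix.
  type :: user_t
    character(len=:), allocatable :: name
  end type user_t

  ! Extended type, using the proposed _et suffix.
  type, extends(user_t) :: user_et
    integer :: id = 0
  end type user_et

end module naming_demo
```

With this convention, a declaration such as type(user_et) :: admin signals at the point of use that the type extends another one.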
I feel that it is important to have some sort of formal governance. This should be in place in advance of when conflict arises, which is when you actually need it. While this issue is related to #5 (workflow), it is, in my opinion, distinct.
I was invited to be a Mac Homebrew maintainer in July 2018, right before a period of drawn-out conflict, mostly between a very active maintainer and the project's "Lead Maintainer". In addition, this conflict was born out of a technical issue, with vocal users and contributors fanning the flames. If the organization had had a more formal governance model, then:
Within 6 months the organization had its first ever in-person meeting, which I had the pleasure of attending, and ratified more formal bylaws. Since then I have never seen the same level of conflict between maintainers, or between users/contributors and maintainers (or the project itself).
An imperfect list of the questions that need to be answered follows.
Of course this is a very new & young project, and as such adopting formal bylaws is probably overkill. It's certainly not fun, and too much bureaucracy can certainly be harmful. But, giving people a framework for how decisions are made and finalized---especially controversial ones---makes the outcome easier to understand and tolerate when it isn't in your favor, and having a process to deal with controversial decisions and conflict is very nice to have BEFORE you need to use & apply it.
Will stdlib require a preprocessor? It is mentioned in #13, #35 and #72.
I suggest that this is discussed as it impacts the way the code is built and possibly the way users of stdlib will call certain features. Not that I have any definitive opinion on the topic, but as there was a discussion on cmake, compiler support and CI, this seems appropriate also for preprocessing.