Fungicide is a benchmark suite for Befunge-98 interpreters. It consists of a set of synthetic benchmarks; the results may or may not be representative of performance on "practical" programs. ("Practical" is in quotes because the context is Befunge-98, which is decidedly not practical.)
There used to be a rankings page here with data on the interpreters of the time. That page has been retired because the data is now outdated and the community is no longer as active, so an update would not be very useful either.
Download
The source code is available here:
File | Type | Size (octets) | Last modified |
---|---|---|---|
Fungicide 1.0 source code | xz-compressed tarball | 72 352 | 2014-12-29 |
Internals
Fungicide suffers from a lack of proper documentation. The information in the remainder of this page should be part of a provided readme, but currently there's no such thing.
Benchmarks
Each individual benchmark is identified by a script which produces it (that script is confusingly often also referred to as a benchmark) and the parameter passed to the script. That parameter is the problem size, and its meaning depends on the benchmark in question.
Benchmarks tend to end with the f.@ sequence to report completion: what's interesting is what comes before that.
Brief descriptions of each benchmark follow.
- horizontal.b98: The > character repeated the given number of times.
- vertical.b98: The v character repeated the given number of times, on separate lines.
- diagdown.b98: 11x followed by the given number of z in a diagonal line.
- diagup.b98: Like diagdown.b98, but traversing the diagonal line northeast instead of southeast.
- hollow-square.b98: The edge of a square whose edge length is the given number, traversed clockwise using >v^<.
- filled-square.b98: A filled square whose edge length is the given number: each of the cells in the square is traversed using >v^<.
The above all have "-p" forms as well. This signifies that instead of placing all the code to be executed into the file directly, much of it is first generated using the p instruction. For example, horizontal-p.b98 first loops the given number of times, placing > instructions as it goes, resulting in an in-memory copy of horizontal.b98, which is then executed. This not only benchmarks file loading versus runtime space manipulation, it also allows for long "diagdown" and "diagup" benchmarks without requiring lots of storage space for files that would consist mostly of indentation.
filled-square.b98 has two "-p" forms, "horizontal" and "vertical". They differ in the order in which the square is built: rowwise and columnwise respectively.
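To make the file-based layouts concrete, here is a rough Python sketch of what three of the generated programs might look like. It is not one of Fungicide's generator scripts: the function names are made up for illustration, and the way the f.@ terminator is attached (inline for horizontal, after a > turn for vertical, continued along the diagonal for diagdown) is an assumption, since the page only says that benchmarks tend to end with f.@.

```python
def horizontal(n, tail="f.@"):
    """n copies of '>' on one line, followed by the assumed terminator."""
    return ">" * n + tail + "\n"


def vertical(n, tail=">f.@"):
    """n copies of 'v', one per line.

    The assumed tail starts with '>' so the southbound IP turns east
    and runs into f.@ on the last line.
    """
    return "v\n" * n + tail + "\n"


def diagdown(n, tail="f.@"):
    """'11x' followed by n 'z' cells (and the assumed tail) on a diagonal.

    '11x' sets the delta to (1, 1), so after the x at column 2 of row 0
    the i-th following cell is visited at column 3 + i of row 1 + i.
    """
    lines = ["11x"]
    for i, cell in enumerate("z" * n + tail):
        lines.append(" " * (3 + i) + cell)
    return "\n".join(lines) + "\n"
```

In this sketch every row of a diagdown file is one character longer than the previous one, so the file size grows quadratically with the problem size while only one cell per row is executed. That is the "mostly indentation" cost that the "-p" forms avoid.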
Further benchmarks are listed below.
- push.b98: The f instruction repeated the given number of times.
- pushpop.b98: The :$ instruction sequence repeated the given number of times.
- yn-rep.b98: The yn instruction sequence repeated the given number of times.
- y-rep-n.b98: The y0 instruction sequence repeated the given number of times, followed by an n prior to exit.
- fork.b98: Spawns the given number (must be a power of two) of threads using t, then runs them all into an @ for termination.
Measurements
Time is measured simply from a Perl script using the POSIX gettimeofday() function.
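The idea is plain wall-clock timing around a single interpreter run. The following is a minimal Python stand-in for that idea, not the Perl script shipped with Fungicide; the function name time_run is made up, and time.perf_counter() is swapped in for gettimeofday().

```python
import subprocess
import time


def time_run(argv):
    """Return the wall-clock seconds taken by one run of the command argv.

    argv is a list such as [interpreter_path, benchmark_path]; both are
    supplied by the caller. The command's standard output is discarded.
    """
    start = time.perf_counter()
    subprocess.run(argv, stdout=subprocess.DEVNULL, check=False)
    return time.perf_counter() - start
```

Note that a measurement taken this way includes process startup and file loading, not just the time spent executing instructions.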
Memory usage is measured using a Python script which repeatedly reads the /proc/<pid>/smaps pseudofile, summing up any "Shared" and "Private" values. This is done as often as possible in a busy loop while the interpreter process is still alive.
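A sketch of this kind of sampling loop is given below. It is not the script from the package: the function names are invented, and keeping the largest sample as the result is an assumption, since the description above does not say how the individual samples are combined.

```python
import subprocess


def smaps_total_kb(text):
    """Sum every 'Shared*' and 'Private*' value (in kB) in smaps text."""
    total = 0
    for line in text.splitlines():
        if line.startswith(("Shared", "Private")):
            # Lines look like "Private_Dirty:        12 kB".
            total += int(line.split()[1])
    return total


def peak_usage_kb(proc):
    """Busy-loop over /proc/<pid>/smaps while the child is still running.

    proc is a subprocess.Popen object. Returning the maximum of the
    samples is an assumption made for this sketch.
    """
    peak = 0
    while proc.poll() is None:
        try:
            with open(f"/proc/{proc.pid}/smaps") as f:
                peak = max(peak, smaps_total_kb(f.read()))
        except OSError:
            # The process may exit between poll() and the read.
            pass
    return peak
```

A caller would start the interpreter with subprocess.Popen and pass the resulting object to peak_usage_kb. Because the loop never sleeps, it occupies a CPU core for as long as the interpreter runs.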
Process
First, the interpreter is run on a benchmark once and its time and memory use are measured. Memory usage is assumed to not vary, and thus it is measured only this one time per benchmark. Based on the time it took to run this, there are three possible continuations:
- If the time was less than a minute, it is run once more. This result is discarded. (A sort of cache-cleaning thing between the memory and time measurements.) Then, ten time-measured runs are performed.
- If the time was instead less than ten minutes, three time-measured runs are performed, with no in-between run.
- If the time exceeded ten minutes, it is run only once more for time measuring.
Thus, we may have either two, four, or eleven temporal measurements in total. The representative one, for analysis, is chosen as the mean of all but the maximum of these (which, in almost all cases, simply cuts off the memory-measuring run).
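The decision rule and the choice of the representative time can be written down compactly. The Python below is only an illustration of the rules as stated above, with invented function names; how a first run of exactly ten minutes is treated is not specified, and this sketch puts it in the last case.

```python
from statistics import mean


def follow_up_runs(first_time_s):
    """Return (discarded_runs, timed_runs) to perform after the first run."""
    if first_time_s < 60:
        return 1, 10
    if first_time_s < 600:
        return 0, 3
    return 0, 1


def representative_time(times_s):
    """Mean of all time measurements except the single largest one.

    times_s holds every time measurement for one benchmark, including
    the first (memory-measuring) run, so it has 2, 4 or 11 entries.
    """
    ordered = sorted(times_s)
    return mean(ordered[:-1])
```

For example, with measurements of 9.0, 1.0, 2.0 and 3.0 seconds, the 9.0 is dropped and the representative time is 2.0 seconds.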
Usage
At least the following software is required:
In addition, the following software is optional:
On the hardware side, the busy-looping nature of the memory use measurement tool means that having at least two CPU cores is recommended.
Given an extracted Fungicide package and all required dependencies, the following steps will run the benchmarks and generate visualizations:
- Write a file called "interpreters.dat" in the package root, containing lines of the form "interpreter-name /path/to/binary".
- Run "runallruns.sh", which will deposit raw results in the "data" directory and report its progress. At this point, the benchmarks have been run.
- Process the data by running "process.sh", populating the "preprocessed-data" directory.
- Run "make-all-analyses.sh", creating graphs and tables in the "plotstables" directory.
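As a rough illustration, the steps above could be driven from a small Python helper like the one below. The interpreter names and paths passed in are placeholders chosen by the caller, and running the three scripts directly as executables from the package root is an assumption of this sketch.

```python
import os
import subprocess

# The three scripts named in the steps above, in the order they are run.
PIPELINE = ["runallruns.sh", "process.sh", "make-all-analyses.sh"]


def interpreters_dat(interpreters):
    """Format a {name: path} mapping as "interpreter-name /path/to/binary" lines."""
    return "".join(f"{name} {path}\n" for name, path in interpreters.items())


def run_pipeline(root, interpreters):
    """Write interpreters.dat into the package root, then run each script."""
    with open(os.path.join(root, "interpreters.dat"), "w") as f:
        f.write(interpreters_dat(interpreters))
    for script in PIPELINE:
        # Assumes the scripts are executable and expect to run from root.
        subprocess.run([os.path.join(root, script)], cwd=root, check=True)
```

With check=True the helper stops at the first script that fails, so later steps never run on incomplete data.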