Discussion:
An idea, now that we have dynamic loading
John Wiegley
2018-05-06 01:44:42 UTC
Permalink
It occurred to me today that we have tons of examples of how Lisp functions
can be written in a flavor of C. This "compilation" is typically done by hand by
a few experts.

However, what if we had a compiler from Emacs Lisp -> Lisp-flavored C, which
could turn .el files into .c files suitable for compiling into .so's that can
be loaded into Emacs?
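(The loading half of this already works: an Emacs built with module
support can load such a .so directly. A minimal sketch, where
"my-compiled-subr.so" is a hypothetical module that such a compiler
might produce:)

  ;; `module-file-suffix' is non-nil only in builds with dynamic
  ;; module support; the .so name here is purely hypothetical.
  (when (bound-and-true-p module-file-suffix)
    (module-load (expand-file-name "my-compiled-subr.so")))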
--
John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2
Óscar Fuentes
2018-05-06 02:25:17 UTC
Permalink
Post by John Wiegley
It occurred to me today that we have tons of examples of how Lisp functions
can be written in a flavor of C. This "compilation" is typically done by hand by
a few experts.
However, what if we had a compiler from Emacs Lisp -> Lisp-flavored C, which
could turn .el files into .c files suitable for compiling into .so's that can
be loaded into Emacs?
Mandatory mention:

JIT Compilation for Emacs

http://tromey.com/blog/?p=982

IMHO and based on my experience with other projects, this is the way
Elisp should go for improving performance. It is doubtful that the
approach you mention would be any better in terms of raw performance,
apart from decreasing load time. Anyway, if speed is really important,
nothing beats a pure C implementation.
Eli Zaretskii
2018-05-11 17:38:35 UTC
Permalink
Date: Fri, 11 May 2018 22:22:47 +0700
John even asked for the creation of such a suite of performance tests, but
AFAIK no one has picked up the gauntlet so far.
Would it suffice to use Emacs Lisp to run these performance tests
(i.e. using benchmark.el)?
I don't see why not.
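For instance, a minimal sketch using the `benchmark-run' macro from
benchmark.el (the timed form is arbitrary):

  ;; `benchmark-run' returns (ELAPSED-SECONDS GC-RUNS GC-SECONDS).
  (require 'benchmark)
  (benchmark-run 1000
    (make-list 10000 'x))
  ;; => (0.345678 12 0.098765)   ; illustrative numbers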
Even so, how would one account for external factors in the operating
system? Perhaps when the tests are performed, the deltas between
commits could be given as percentages instead of absolute CPU time. Once we have
that, it would be great to have tests run multiple times and/or on
various devices to refine the data further.
Performance tests should indeed compare several versions on the same system.
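As a sketch of the percentage idea (the two timings are made-up
illustrative values):

  ;; Report the delta between two runs of the same benchmark on the
  ;; same machine as a percentage rather than as raw CPU time.
  (let* ((old 0.85)   ; seconds under commit A (made-up)
         (new 1.02)   ; seconds under commit B (made-up)
         (delta (* 100.0 (/ (- new old) old))))
    (message "Delta: %+.1f%%" delta))
  ;; => "Delta: +20.0%"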
An area that would be interesting to look at is memory usage of
functions over commits, but I don't know of a way of measuring memory
usage in Emacs (especially over an extended period, to analyze things
such as maximum memory usage).
Memory analysis is tricky on modern systems, but it isn't impossible.
One could start using the values reported by process-attributes.
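For example, a minimal sketch of sampling Emacs's own footprint that
way (which attributes are available varies by platform):

  ;; `rss' and `vsize' are reported in kilobytes; entries can be
  ;; missing on some systems, so check for nil.
  (let ((attrs (process-attributes (emacs-pid))))
    (list :rss-kb (cdr (assq 'rss attrs))
          :vsize-kb (cdr (assq 'vsize attrs))))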
Phillip Lord
2018-05-14 11:37:09 UTC
Permalink
Post by Eli Zaretskii
Even so, how would one account for external factors in the operating
system? Perhaps when the tests are performed, the deltas between
commits could be given as percentages instead of absolute CPU time. Once we have
that, it would be great to have tests run multiple times and/or on
various devices to refine the data further.
Performance tests should indeed compare several versions on the same system.
I wonder if there is a way to work out a baseline value. Benchmarks
could then look for multiples of this.

So, something like time to create a big list (for CPU), and time to read
a defined file (for I/O). Then you could say "this file should parse in
1x I/O baseline + 10x CPU baseline".

That way the benchmarks could become part of the test set. If things
got significantly slower, tests would fail.
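A rough sketch of how that could look as an ERT test (the function
names and the 10x factor are invented for illustration):

  ;; Fail the test when a workload exceeds a multiple of a crude
  ;; CPU baseline measured on the same machine and build.
  (require 'ert)
  (require 'benchmark)

  (defun my-cpu-baseline ()
    "Seconds needed to build big lists: a crude CPU baseline."
    (car (benchmark-run 10 (make-list 100000 'x))))

  (ert-deftest my-search-speed ()
    "Fail if this search loop takes more than 10x the CPU baseline."
    (let ((baseline (my-cpu-baseline))
          (elapsed (car (benchmark-run 10
                          (with-temp-buffer
                            (dotimes (_ 1000) (insert "lorem ipsum dolor\n"))
                            (goto-char (point-min))
                            (while (re-search-forward "dolor" nil t)))))))
      (should (< elapsed (* 10 baseline)))))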

Phil
Eli Zaretskii
2018-05-14 16:18:48 UTC
Permalink
Date: Mon, 14 May 2018 12:37:09 +0100
Post by Eli Zaretskii
Performance tests should indeed compare several versions on the same system.
I wonder if there is a way to work out a baseline value. Benchmarks
could then look for multiples of this.
But such a baseline would also have to be specific to a platform and
a given set of build options.
So, something like time to create a big list (for CPU), and time to read
a defined file (for I/O). Then you could say "this file should parse in
1x I/O baseline + 10x CPU baseline".
You assume linear scalability, but that is not necessarily so. The
ratio between performance indices of different pieces of code could vary
depending on the build options and the underlying OS.

Btw, IME I/O is mostly negligible in Emacs applications, and generally
is not interesting for the issue at hand. Only CPU is important, and
maybe also memory usage.
Stefan Monnier
2018-05-14 16:30:37 UTC
Permalink
Post by Eli Zaretskii
Post by Phillip Lord
So, something like time to create a big list (for CPU), and time to read
a defined file (for I/O). Then you could say "this file should parse in
1x I/O baseline + 10x CPU baseline".
You assume linear scalability, but that is not necessarily so.
The ratio between performance indices of different pieces of code could vary
depending on the build options and the underlying OS.
Another issue is that performance measuring is notoriously difficult
(even on an otherwise idle machine, let alone on some server that has
other tasks running at the same time). So you might be able to catch
the "10x slower" case easily with a fairly high confidence that the
problem is indeed that the code got slower, but if you want to catch the
"20% slowdown" with any kind of confidence (without being drowned in
false positives), you'll need either a very tight control on the test
runs, or a good statistical analysis.
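A minimal sketch of the statistical side (the sample count and timed
form are arbitrary, and `my-timing-samples' is a made-up helper):

  ;; Repeat a benchmark and compute mean and standard deviation, so a
  ;; slowdown can be judged against run-to-run noise.
  (require 'benchmark)

  (defun my-timing-samples (n thunk)
    "Collect N wall-clock timings of calling THUNK."
    (let (samples)
      (dotimes (_ n)
        (push (car (benchmark-run 1 (funcall thunk))) samples))
      samples))

  (let* ((samples (my-timing-samples 30 (lambda () (make-list 100000 'x))))
         (n (float (length samples)))
         (mean (/ (apply #'+ samples) n))
         (sd (sqrt (/ (apply #'+ (mapcar (lambda (s) (expt (- s mean) 2))
                                         samples))
                      (1- n)))))
    (message "mean %.6fs, sd %.6fs" mean sd))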


Stefan
Phillip Lord
2018-05-15 13:24:58 UTC
Permalink
Post by Stefan Monnier
Post by Eli Zaretskii
Post by Phillip Lord
So, something like time to create a big list (for CPU), and time to read
a defined file (for I/O). Then you could say "this file should parse in
1x I/O baseline + 10x CPU baseline".
You assume linear scalability, but that is not necessarily so.
The ratio between performance indices of different pieces of code could vary
depending on the build options and the underlying OS.
Another issue is that performance measuring is notoriously difficult
(even on an otherwise idle machine, let alone on some server that has
other tasks running at the same time). So you might be able to catch
the "10x slower" case easily with a fairly high confidence that the
problem is indeed that the code got slower, but if you want to catch the
"20% slowdown" with any kind of confidence (without being drowned in
false positives), you'll need either a very tight control on the test
runs, or a good statistical analysis.
Indeed, it would be very broad-brush. Alternatively, it would have to be
tied very tightly to a specific infrastructure.

Phil

John Wiegley
2018-05-11 07:56:01 UTC
Permalink
Do we have profiles of the uses where there's a perceived latency? Are we
sure the latency is caused by Lisp code?
No, this is just an assumption on my part; I don't have any knowledge, or even a
reasonably good opinion, that it would make things any better. I suspect that
just getting Gnus to use lexical binding everywhere would be a bigger win for
less effort.
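(For reference, that just means adding the first-line cookie to each
file; a minimal sketch, with a made-up file name:)

  ;;; gnus-foo.el --- example -*- lexical-binding: t; -*-
  ;; With this cookie, `let' bindings and function arguments are
  ;; lexical by default, which generally byte-compiles to faster
  ;; code than dynamic binding.
  (let ((hits 0))
    (mapc (lambda (_) (setq hits (1+ hits))) '(a b c))
    hits) ; => 3; the lambda captures `hits' as a lexical closure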
--
John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2
John Wiegley
2018-05-10 20:31:51 UTC
Permalink
TT> https://github.com/tromey/el-compilador

TT> This still has a bunch of bugs but it has successfully turned some elisp
TT> into C. I have an emacs branch where I replaced some of the code in subr.c
TT> with elisp.

Very nice! This is just what I was thinking of, so what do you need to help
this project mature?
--
John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2
Tom Tromey
2018-05-14 21:44:15 UTC
Permalink
TT> https://github.com/tromey/el-compilador

TT> This still has a bunch of bugs but it has successfully turned some elisp
TT> into C. I have an emacs branch where I replaced some of the code in subr.c
TT> with elisp.

John> Very nice! This is just what I was thinking of, so what do you need to help
John> this project mature?

I think the main thing is feeling like it matters. One of my long-term
goals was to have more of Emacs be written in Emacs Lisp, and this was a
supporting technology for that. (Other pieces were the FFI and gcc-jit
wrappers.)

However when hacking on it I ended up taking two different tacks: one
was rewriting bits of the C core in elisp; and the other was trying to
compile bits of elisp (say, stuff from subr.el or something) to C. It
wasn't clear to me which of these was more fruitful.

Anyway, the backend has some bugs, and it generates "Emacs C core"-style
C code, not "Emacs dynamic module"-style C code.


If you're just interested in performance, though, maybe my JIT is a
better choice. I haven't tried submitting the JIT here, yet, mainly
because there's still one more feature I want to finish (improved
calling convention, plus maybe inlining); but also because Stefan
pointed out a very interesting JIT paper to me, which I think would
require a ground-up rewrite (avoiding libjit).

Tom
John Wiegley
2018-05-14 23:36:58 UTC
Permalink
TT> Anyway, the backend has some bugs, and it generates "Emacs C core"-style C
TT> code, not "Emacs dynamic module"-style C code.

Could some of our internal C sources be replaced by Emacs Lisp, which the
Makefile would compile into C as part of the regular build process? This would
allow testing using the Emacs Lisp version, with compilation into C for the
sake of performance. And maybe benchmarking would even rule out the need to
compile that file in the end...
--
John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2