[Emacs-diffs] master 9ce1d38: Use curved quotes in core elisp diagnostics
Dmitry Gutov
2015-08-16 21:44:28 UTC
branch: master
commit 9ce1d38890a77e93af0d20f51c53419c097200d3
Use curved quotes in core elisp diagnostics
And there we go, sneaking curved quotes everywhere in the Elisp source
files. FTR, I hate that.
In the core elisp files, use curved quotes in diagnostic formats,
so that they follow user preference as per ‘text-quoting-style’
rather than being hard-coded to quote `like this'.
Why do you need that? Doesn't (format "Expand `%s'? " string) use the
"preferred quoting style"?

And if not, why don't you change it so it does?
Alan Mackenzie
2015-08-16 22:53:46 UTC
Post by Dmitry Gutov
branch: master
commit 9ce1d38890a77e93af0d20f51c53419c097200d3
Use curved quotes in core elisp diagnostics
And there we go, sneaking curved quotes everywhere in the Elisp source
files. FTR, I hate that.
No big surprises here, but so do I.

Paul, people have got to edit these files. Doing this has rarely been
easy. Polluting them globally with non-working characters is going to
make this even more difficult and much less pleasant.
Post by Dmitry Gutov
In the core elisp files, use curved quotes in diagnostic formats,
so that they follow user preference as per ‘text-quoting-style’
rather than being hard-coded to quote `like this'.
This user's preference is to be able to type quote marks in Emacs source
files directly without mincing workarounds, and to be able to search for
them likewise.

Revert this change, please.
--
Alan Mackenzie (Nuremberg, Germany).
Drew Adams
2015-08-16 23:16:29 UTC
Post by Alan Mackenzie
Post by Dmitry Gutov
Use curved quotes in core elisp diagnostics
And there we go, sneaking curved quotes everywhere in the Elisp
source files. FTR, I hate that.
No big surprises here, but so do I.
And I. (No surprise here either.)

It was obvious that this "experiment" would not be just that.
Post by Alan Mackenzie
Paul, people have got to edit these files. Doing this has rarely
been easy. Polluting them globally with non-working characters
is going to make this even more difficult and much less pleasant.
But surely no such consideration is as important as Paul's sense
that `...' is, hmm, ugly?

A small sacrifice for Paul's cosmetology.
A giant leap of some kind.
The Eagle will keep landing.
Post by Alan Mackenzie
Revert this change, please.
Good luck with that.
Paul Eggert
2015-08-18 05:20:00 UTC
what benefit is there in putting
any real curly quotes into those doc strings?
Format strings are easier to read and use, particularly by novices, if
characters typically stand for themselves. For example, typical Emacs users who
type ‘C-h f length RET’ will see text with curved quotes like this in their
*Help* buffers:

... To get the number of bytes, use ‘string-bytes’. ...

Users can cut this text from *Help* and paste it directly into their source
code's doc strings; this is simpler than having to change those quotation marks
to be grave accent and apostrophe.
Bastien
2015-08-18 08:56:18 UTC
Post by Paul Eggert
Format strings are easier to read and use, particularly by novices, if
characters typically stand for themselves.
Did we ever receive a complaint from a novice about `...' readability?
Paul Eggert
2015-08-18 17:34:45 UTC
Post by Bastien
Post by Paul Eggert
Format strings are easier to read and use, particularly by novices, if
characters typically stand for themselves.
Did we ever receive a complaint from a novice about `...' readability?
Most novices don't bother to write bug reports -- they don't even know how to
write bug reports. But yes, people occasionally gripe about the use of grave
accent to quote, and this can hurt Emacs's reputation among people who may not
know it better. For example, <http://wordyenglish.com/musing/typography.html>
(2007) says:

"the problem with the GNU is that even today, in 2007, where curly quotes have
been widely available in word processors for over a decade (and Unicode have
been practical and widely available for at least 5 years...), they are still
using plain ASCII hacks. (in general, GNU and the Open Source morons have like a
5 to 10 years lag in adopting technology, for reasons that are inadvertently
intentional and or simply incapable)"

And here we are in 2015, with the quote problem still only partly fixed.
Andreas Schwab
2015-08-18 18:05:50 UTC
Post by Paul Eggert
And here we are in 2015, with the quote problem still only partly fixed.
In 2015, people use `...`.

Andreas.
--
Andreas Schwab, ***@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Chad Brown
2015-08-18 18:32:36 UTC
Post by Andreas Schwab
Post by Paul Eggert
And here we are in 2015, with the quote problem still only partly fixed.
In 2015, people use `...`.
I work more in prose now than in code, and I’ve received numerous
complaints from editors in the last ~3 years about text that uses
that. In practice, it's currently hard for me to type that exact
string, because all of the programs I use that aren’t Emacs or some
of the terminal console interfaces automatically turn it into ‘…’.

I’ve mostly given up on using Emacs for long-form prose work for
this reason (which is too bad, because the editing support for prose
is still quite good, and also long-since pushed down into muscle
memory). With all this in mind, I’ve been watching these exploding
threads with a mixture of hope and dread, because the modernization
effort would be very nice for me, but I’m seeing a lot of older
people whose opinions and efforts I respect argue for "just use
old-style ascii quotes."

I hope that helps. Thanks again to Paul for trying to push Emacs forward.
~Chad
Alan Mackenzie
2015-08-18 19:42:53 UTC
Hello, Chad.
Post by Chad Brown
Post by Andreas Schwab
Post by Paul Eggert
And here we are in 2015, with the quote problem still only partly fixed.
In 2015, people use `...`.
I work more in prose now than in code, and I’ve received numerous
complaints from editors in the last ~3 years about text that uses
that. In practice, it's currently hard for me to type that exact
string, because all of the programs I use that aren’t Emacs or some
of the terminal console interfaces automatically turn it into ‘…’.
Yet, for lots of programming languages, it is essential to be able to
type `, ', and even .... (That's three dots followed by a full stop!)
Post by Chad Brown
I’ve mostly given up on using Emacs for long-form prose work for
this reason (which is too bad, because the editing support for prose
is still quite good, and also long-since pushed down into muscle
memory).
It seems what you're missing in Emacs is some sort of "literary mode"
which would make the translations you need.

When you say "still quite good", this gives the impression of a steady
diminution of quality, which hasn't yet dropped below some sort of
threshold. Is this what you actually mean?
Post by Chad Brown
With all this in mind, I’ve been watching these exploding threads with
a mixture of hope and dread, because the modernization effort would be
very nice for me, but I’m seeing a lot of older people whose opinions
and efforts I respect argue for "just use old-style ascii quotes."
I think you may have misunderstood what the controversy is about. It's
not about adding or not adding support for curly quotes and curly
ellipses in users' documents. It's about whether or not to replace `
and ' in Emacs's source files' error messages and comments with curly
quotes, for a perceived aesthetic benefit, at the cost of making those
files more difficult to maintain, particularly in environments which
aren't X-windows.
Post by Chad Brown
I hope that helps. Thanks again to Paul for trying to push Emacs forward.
We're _all_ trying to push Emacs forward. It's just we sometimes don't
fully agree what direction forward is. :-)
Post by Chad Brown
~Chad
--
Alan Mackenzie (Nuremberg, Germany).
Drew Adams
2015-08-18 20:17:56 UTC
Post by Alan Mackenzie
I think you may have misunderstood what the controversy is about.
It's not about adding or not adding support for curly quotes and curly
ellipses in users' documents. It's about whether or not to replace
` and ' in Emacs's source files' error messages and comments with
curly quotes, for a perceived aesthetic benefit, at the cost of making
those files more difficult to maintain, particularly in environments which
aren't X-windows.
AND at the cost of confusing ordinary text quotation (which, naturally,
uses curly quotes, both double and single) with mention, within passages
of ordinary text, of things that the text wants to talk about: code
fragments such as symbols and sexps, as well as key sequences, file names,
and URLs. IOW, the kinds of things that are often distinguished from
ordinary text by setting them off in a fixed-width font (such as Courier).

IOW, regardless of the (important) problems Alan mentions, about
difficulty of use, burden of maintenance, misfit with existing
code-oriented tools, and complication of user and core code (layers and
layers of ugly workarounds) -- i.e., even if those problems were not a
problem, it is misguided to treat setting off code etc. using ordinary
text quotation. It introduces confusion and is information lossy.
Post by Alan Mackenzie
We're _all_ trying to push Emacs forward. It's just we sometimes
don't fully agree what direction forward is. :-)
Not to mention the effect on many others who will deal with the
consequences without having voiced anything about the direction.
(And yes, I'm sure there will be many end users who will not get
beyond a first blush of "Look Ma, curly quotes now! How pretty!")
Dmitry Gutov
2015-08-18 20:53:29 UTC
Post by Chad Brown
because all of the programs I use that aren’t Emacs or some
of the terminal console interfaces automatically turn it into ‘…’.
All programs? None of the popular programming text editors do that.

You might be thinking of LibreOffice Writer.
Post by Chad Brown
With all this in mind, I’ve been watching these exploding
threads with a mixture of hope and dread, because the modernization
effort would be very nice for me, but I’m seeing a lot of older
people whose opinions and efforts I respect argue for "just use
old-style ascii quotes."
Since you're writing prose, the discussion should be irrelevant to you.

It's only about which quotes to use in the Emacs Lisp source files.

For inserting proper punctuation marks automatically when writing prose,
you should check out
http://www.emacswiki.org/emacs/TypographicalPunctuationMarks
Paul Eggert
2015-08-18 21:43:05 UTC
Post by Dmitry Gutov
Since you're writing prose, the discussion should be irrelevant to you.
It's only about which quotes to use in the Emacs Lisp source files.
That remark could easily be misinterpreted by those not closely following the
discussion. We're not talking about ", `, and ' in Emacs Lisp syntax --
nobody's proposing changing that. We're talking only about the prose that
appears in commentary and strings of Emacs Lisp source files, and this makes
Chad Brown's remarks about prose relevant to the discussion.
Dmitry Gutov
2015-08-19 01:09:22 UTC
Post by Paul Eggert
We're talking only
about the prose that appears in commentary and strings of Emacs Lisp
source files, and this makes Chad Brown's remarks about prose relevant
to the discussion.
Not at all. I'm pretty sure the "numerous complaints from editors" Chad
mentioned were not about contents of docstrings in Elisp Chad wrote.

Whatever choice is made WRT docstring syntax, it should have no
bearing on people's ability to write prose (books, articles, blog posts)
and typography used therein.
Richard Stallman
2015-08-19 18:14:30 UTC
[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]
Post by Dmitry Gutov
For inserting proper punctuation marks automatically when writing prose,
you should check out
http://www.emacswiki.org/emacs/TypographicalPunctuationMarks
If this is a good feature, it should not be relegated to a wiki. It
should be available in Emacs and documented in the Emacs Manual.
--
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.
Dmitry Gutov
2015-08-20 13:56:56 UTC
Post by Richard Stallman
If this is a good feature, it should not be relegated to a wiki. It
should be available in Emacs and documented in the Emacs Manual.
No argument from me here (but I'm not the target audience, so I don't
really know how valuable it is, or whether that implementation works well).

typopunc.el is authored by Oliver Scholz who seems to have copyright
assignment signed, but the wiki page contains a lot of code snippets
seemingly required to get the most out of the package.
Eli Zaretskii
2015-08-21 07:51:12 UTC
typopunc.el is authored by Oliver Scholz who seems to have copyright
assignment signed, but the wiki page contains a lot of code snippets
seemingly required to get the most out of the package.
At least some parts of that should be simply added to iso-transl.el.
Richard Stallman
2015-08-19 01:22:14 UTC

Tex mode automatically turns " into `` or '' according to the context.
Couldn't we easily have a mode for use in Text mode, that turns ' into
‘ or ’ according to the context? (Maybe we already do.)
Paul Eggert
2015-08-19 03:52:52 UTC
Post by Richard Stallman
Tex mode automatically turns " into `` or '' according to the context.
Couldn't we easily have a mode for use in Text mode, that turns ' into
‘ or ’ according to the context? (Maybe we already do.)
Yes we do, in Emacs master. It's Electric Quote mode. You can try it out by
putting this into your ~/.emacs file:

(if (fboundp 'electric-quote-mode)
    (electric-quote-mode))
Richard Stallman
2015-08-20 16:54:37 UTC
it may be a bit much to ask him to try out the bleeding-edge master version.
Even if he decides not to do the work to try it, he might feel touched
that we are concerned about users like him.
Dmitry Gutov
2015-08-21 00:48:43 UTC
We can do that after Emacs 25 comes out. That feature isn't in Emacs
24.5, and it may be a bit much to ask him to try out the bleeding-edge
master version.
Would it be too much trouble for you to provide a version of Emacs built
from master to some of your students who expressed the confusion about
the quoting method?
Paul Eggert
2015-08-21 01:35:57 UTC
Would it be too much trouble for you to provide a version of Emacs built from
master to some of your students who expressed the confusion about the quoting
method?
I could try that with new students, yes. (The old ones have mostly moved on....)
Dmitry Gutov
2015-08-18 20:47:36 UTC
Post by Paul Eggert
Most novices don't bother to write bug reports -- they don't even know
how to write bug reports.
Bug reports are written by users who are at least a little experienced,
sure, but we shouldn't assume that every such user has necessarily
become accustomed to Emacs's quirks, and wouldn't call out this problem,
if it were a real problem.
Post by Paul Eggert
But yes, people occasionally gripe about the
use of grave accent to quote, and this can hurt Emacs's reputation among
people who may not know it better. For example,
I sincerely hope the whole effort wasn't kicked off by this Xah Lee's
rant. It's pretty shallow. And the author should really "know Emacs
better" by now.
Post by Paul Eggert
"the problem with the GNU is that even today, in 2007, where curly
quotes have been widely available in word processors for over a decade
(and Unicode have been practical and widely available for at least 5
years...), they are still using plain ASCII hacks. (in general, GNU and
the Open Source morons have like a 5 to 10 years lag in adopting
technology, for reasons that are inadvertently intentional and or simply
incapable)"
"morons"... yeah.
Post by Paul Eggert
And here we are in 2015, with the quote problem still only partly fixed.
One would have to define the "problem" first.

In 2015, the documentation markup languages (Markdown, Asciidoc, etc)
support rich content (images, hyperlinks, document structure), and
decoupling markup from presentation (usually through rendering into HTML).

Yet here we are, not talking about any big features, and instead
discussing using unicode quotes in the markup (which none of the modern
markup languages do), because it's "easier" if the markup and
presentation are the same. That's a step back, if anything.
Richard Stallman
2015-08-19 01:24:49 UTC
Post by Paul Eggert
what benefit is there in putting
any real curly quotes into those doc strings?
Format strings are easier to read and use, particularly by novices, if
characters typically stand for themselves.
The issue at hand is doc strings in the source code, not format strings.
Those are a different issue.
Post by Paul Eggert
... To get the number of bytes, use ‘string-bytes’. ...
Users can cut this text from *Help* and paste it directly into their source
code's doc strings; this is simpler than having to change those quotation marks
It's a bad practice to copy text from a help buffer into
a doc string in the source code. There are several constructs
used in source code doc strings that get converted into something
different in a help buffer. So if you want to copy one doc string
into another, you should copy from source code.

I doubt that case arises very often, though. If the user's new
function is so similar to an existing function that he wants to copy
from its doc string, he probably wrote his function by modifying the
code of that function, and probably already has its doc string.
Paul Eggert
2015-08-19 04:52:17 UTC
Post by Richard Stallman
The issue at hand is doc strings in the source code, not format strings.
True. Sorry about the confusion.
Post by Richard Stallman
It's a bad practice to copy text from a help buffer into
a doc string in the source code
That's a bit like saying it's bad practice to copy from a text file into a Lisp
string. Although such copying can lead to problems because arbitrary text can
be misinterpreted in a Lisp string, as a practical matter it's often quite
convenient to copy snippets of text files into Lisp strings or vice versa, and
it often works well, either because users know the text doesn't have string
escapes or delimiters, or because they repair any glitches that may arise.

And it's not just *Help* buffers. I often cut and paste from documentation into
code. The Emacs info files have used curved quotes for some time, and it's
helpful to have these work in doc strings too. As a simple example, the Elisp
manual says this:

-- Function: eql value1 value2
This function acts like ‘eq’ except when both arguments are
numbers. It compares numbers by type and numeric value,...

If I were to write a similar function I might copy some of this into the new
function's doc string and then adapt it, like this:

(defun eqfuzz (value1 value2)
"Compare two values but use fuzzy comparison on numbers.
This function acts like ‘eq’ except when both arguments are numbers....

This sort of editing is natural and convenient, and there's no compelling reason
to prohibit it.
Paul Eggert
2015-08-17 03:44:25 UTC
Doesn't (format "Expand `%s'? " string) use the "preferred quoting style"?
No, as that would break usages such as (format "\\`%s\\'" (car e)). This sort
of thing is reasonably common, so it'd be a compatibility problem to change
‘format’ to replace accent-grave.
This user's preference is to be able to type quote marks in Emacs source
files directly without mincing workarounds, and to be able to search for
them likewise.
You can continue to type quote marks as before, and Emacs will behave as it
always has. The only change here is that if you want quotes in diagnostics
to come out in the user's preferred style, a "mincing workaround" of some sort
is needed.

By the way, it's easy to search for quote marks in either style: just type C-s `
or C-s '. And it's easy to type curved quote marks by using Electric Quote
mode. These are both new features. If you don't want to use these new
features, that's fine too.
Dmitry Gutov
2015-08-17 11:33:52 UTC
Post by Paul Eggert
No, as that would break usages such as (format "\\`%s\\'" (car e)).
This sort of thing is reasonably common, so it'd be a compatibility
problem to change ‘format’ to replace accent-grave.
But it's not a compatibility problem to replace a relatively-rare
combination of characters? Just because the odds of it coming up are lower?
Paul Eggert
2015-08-17 16:41:01 UTC
Post by Paul Eggert
No, as that would break usages such as (format "\\`%s\\'" (car e)).
This sort of thing is reasonably common, so it'd be a compatibility
problem to change ‘format’ to replace accent-grave.
But it's not a compatibility problem to replace a relatively-rare combination of
characters? Just because the odds of it coming up are lower?
Yes, of course. When we change how Emacs behaves, it's important to see how
likely these changes are to break existing uses in significant ways. If the odds are
significantly nonzero, as would be the case if we changed the behavior of
(format "...`..." ...), then we have a significant compatibility problem. In
contrast, if the odds are very low -- the latter being true for
(substitute-command-keys "...`...") -- then we should be OK.
Bastien Guerry
2015-08-17 11:09:33 UTC
Hi Paul,
Post by Paul Eggert
By the way, it's easy to search for quote marks in either style: just
type C-s ` or C-s '. And it's easy to type curved quote marks by
using Electric Quote mode. These are both new features. If you don't
want to use these new features, that's fine too.
I frequently use find-grep to find bits of code I'm interested in.

I sometimes use something like

find . -type f -exec grep --color -nH -e "\`.\+'" {} +

from M-x find-grep to match variables in docstrings among various
files.

If the code is using ‘...’ quotes, how can I grep to match quoted
strings? Also, electric-quote-mode does not work in the minibuffer,
and I just don't know how to type ‘ ... any hint here?

Thanks,
--
Bastien
Paul Eggert
2015-08-17 16:39:38 UTC
Post by Bastien Guerry
I sometimes use something like
find . -type f -exec grep --color -nH -e "\`.\+'" {} +
from M-x find-grep to match variables in docstrings among various
files.
If the code is using ‘...’ quotes, how can I grep to match quoted
strings?
You can use curved quotes in the grep pattern. I just now tried the above
command in the Emacs source code and got 108,000 hits, mostly false alarms. I
had better luck with this:

find . -name test -prune -o '(' -name '*.el' -o -name '*.c' ')' -exec \
grep --color -nH -Ee \
"[\`'‘][[:alpha:]][[:alnum:]]*-[[:alnum:]-]*[[:alnum:]][’']" \
{} +

This yields only 39,000 hits, mostly not false alarms. Although this is the
sort of thing one would want to package up rather than type by hand, the curved
quotes aren't the major reason for needing packaging.
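
As a rough illustration of the difference between the two patterns above (a sketch only, not part of the original exchange): the script below runs the simple pattern and the narrower one against a small made-up file. The file contents and the variable names are invented for the demo, and it assumes GNU grep (for `\+`) and a shell whose source is read as UTF-8.

```shell
#!/bin/sh
# Hypothetical sample input; these four lines are invented for the demo.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
(message "Buffer ‘read-only-mode’ is on")
(message "See `string-bytes' for details")
(setq form `(a ,b)) ; it's a backquote, not a quoted name
(defmacro m (x) `(car ',x))
EOF

# Simple pattern: a grave accent, anything, then an apostrophe.
naive_re="\`.\+'"
# Narrower pattern: either style of opening quote, a hyphenated
# symbol name, then either style of closing quote.
refined_re="[\`'‘][[:alpha:]][[:alnum:]]*-[[:alnum:]-]*[[:alnum:]][’']"

# Count matching lines for each pattern.
naive_hits=$(grep -c -e "$naive_re" "$tmp")
refined_hits=$(grep -c -E -e "$refined_re" "$tmp")

echo "simple pattern:   $naive_hits matching lines"
echo "narrower pattern: $refined_hits matching lines"
rm -f "$tmp"
```

On this sample the simple pattern also counts the two backquote forms, which are not quoted names, and misses the curved-quote line; the narrower pattern picks up the quoted symbols in both styles but, by design, only hyphenated ones.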
Paul Eggert
2015-08-17 17:25:40 UTC
how do you enter the curved quote in the minibuffer?
FWIW i use a Thinkpad X200.
C-x 8 [ and C-x 8 ] should work on any platform. A-[ and A-] might work on your
Thinkpad, depending on how it's set up.
Bastien
2015-08-17 17:47:43 UTC
Post by Paul Eggert
how do you enter the curved quote in the minibuffer?
FWIW i use a Thinkpad X200.
C-x 8 [ and C-x 8 ] should work on any platform. A-[ and A-] might
work on your Thinkpad, depending on how it's set up.
Okay, thanks.
--
Bastien
Dmitry Gutov
2015-08-17 11:47:26 UTC
Post by Paul Eggert
You can continue to type quote marks as before, and Emacs will behave as
it always has. The only change here is that if you want quotes in
diagnostics to come out in the user's preferred style, a "mincing
workaround" of some sort is needed.
You could use something like uLSQM", employed in the C code.
Post by Paul Eggert
By the way, it's easy to search for quote marks in either style: just
type C-s ` or C-s '. And it's easy to type curved quote marks by using
Electric Quote mode. These are both new features. If you don't want to
use these new features, that's fine too.
Enabling electric-quote-mode means *all* quotes will be translated to
curvy ones in the source code one types from there on, automatically.

Bye-bye straight quotes.
Paul Eggert
2015-08-17 16:42:07 UTC
Post by Dmitry Gutov
You could use something like uLSQM", employed in the C code.
You mean, instead of this:

(message "Buffer ‘%s’ is read only." buf)

we do this:

(message (concat "Buffer " uLSQM "%s" uRSQM " is read only.") buf)

where uLSQM and uRSQM are global string constants? I did consider that, but
rejected it as it would make Lisp code too hard to read and maintain. uLSQM and
uRSQM are needed in C code due to C99's limitations, but at least there it is
reasonably rare (fewer than a hundred uses) and C is already pretty ugly so we
can put up with it. If we did the above sort of thing to Lisp we'd need to
uglify thousands of uses, which would not be a good thing.

By the way I considered many other possibilities, and you're welcome to bring up
other alternatives you're interested in. Maybe we can come up with something
better. The point of this change is to fix a real problem, after all, not to
stir up trouble.
Post by Dmitry Gutov
Enabling electric-quote-mode means *all* quotes will be translated to curvy ones in the source code one types from there on, automatically.
No, just the grave accents in strings and comments, and any matching
apostrophes; this is a considerably lighter touch than what the above remark
implies. Occasionally one needs to type C-q ` when one really needs a grave
accent inside a string, but that's pretty rare unless one is hard-coding
grave-accent quoting which is something we should be moving away from nowadays
anyway.
Paul Eggert
2015-08-17 18:38:23 UTC
(message "Buffer %qs is read only." buf)
I could easily implement that and turn existing uses of "‘%s’" to "%qs", if you
think it'd help. It's a bit more complicated and it would not address all the
uses of curved quotes in diagnostics, but it would address many of them.
Paul Eggert
2015-08-17 23:55:41 UTC
Post by Paul Eggert
It's a bit more complicated and it would not
address all the uses of curved quotes in diagnostics, but it would
address many of them.
The other uses, which currently don't employ a formatting sequence, can be
changed to use %qs as well. It'll just be less of a mechanical conversion.
Do you mean replacing this sort of thing:

(message "Press ‘?’ or ‘h’ for help, ‘q’ to quit")

with this?

(message "Press %qs or %qs for help, %qs to quit" "?" "h" "q")

If so, this doesn't sound like a good idea, as it would make the code harder to
read. Also, it wouldn't suffice for code like this:

(insert (symbol-name type)
        (format " is a type (of kind ‘"))
(help-insert-xref-button (symbol-name metatype)
                         'cl-help-type metatype)
(insert (format "’)"))

which formats the matching quotes separately. Of course in general one could
rewrite even the latter example to use %qs (if only to grab the quote characters
out of the result string and reuse them individually!) but the rewritten version
would be significantly harder to read and maintain.

As we need to support formatting individual curved quotes anyway, there is an
argument for keeping it simple and omitting the attached patch for paired
quotes. With all this in mind, do you still think the complexity of the
attached draft patch is a good idea?
Dmitry Gutov
2015-08-18 11:31:53 UTC
(message "Press ‘?’ or ‘h’ for help, ‘q’ to quit")
with this?
(message "Press %qs or %qs for help, %qs to quit" "?" "h" "q")
Yes.
If so, this doesn't sound like a good idea, as it would make the code
harder to read.
It looks okay to me, but if you don't like it, the first option is
available as well: two format sequences, one for opening quote, and one
for closing. That would be more cumbersome, though.

I've taken the idea for %qs from GCC. Do you know if they handle the
above kind of situation somehow specially?
(insert (symbol-name type)
        (format " is a type (of kind ‘"))
(help-insert-xref-button (symbol-name metatype)
                         'cl-help-type metatype)
(insert (format "’)"))
But this is not about diagnostic messages anymore, right? At the moment,
IIRC, these situations are handled by substitute-command-keys (and
there's no need to have curly quotes in the strings here).
which formats the matching quotes separately. Of course in general one
could rewrite even the latter example to use %qs (if only to grab the
quote characters out of the result string and reuse them individually!)
but the rewritten version would be significantly harder to read and
maintain.
I'm definitely not suggesting that.
As we need to support formatting individual curved quotes anyway, there
is an argument for keeping it simple and omitting the attached patch for
paired quotes. With all this in mind, do you still think the complexity
of the attached draft patch is a good idea?
This patch solves the problem of "curved quotes in core elisp
diagnostics", which you've felt necessary to resolve with
9ce1d38890a77e93af0d20f51c53419c097200d3, kicking off this discussion.

So yes, I think it's valuable.

And if your point is that by having this logic in `format', we won't
need it in `substitute-command-keys', then I stand by the assertion that
a separate, different, function should translate the quotes.

text -> (substitute-command-keys) -> text with "escaped" text prop
-> (translate-quotes) -> text with non-escaped straight quotes
replaced with curly ones

`format' can't serve as `translate-quotes', because, like you said,
"\\`%s\\'" is a relatively common format string.
Paul Eggert
2015-08-19 06:28:32 UTC
Post by Dmitry Gutov
So yes, I think it's valuable.
OK, thanks for reviewing; I installed the %q patch, and followed up with
simplifications that it allows at the C level, mostly by using %qs in formats
instead of uLSQM and uRSQM. This makes the C code easier to read and so is a
clear win. Doing something similar in Elisp code makes the code harder to read,
though, so I held off on that.
Post by Dmitry Gutov
two format sequences, one for opening quote, and one for closing. That
would be more cumbersome, though.
Yes, it would be more cumbersome. Instead of the current:

(message "Press ‘?’ or ‘h’ for help, ‘q’ to quit")

we would have something like:

(message "Press %<?%> or %<h%> for help, %<q%> to quit")

which is harder to read and is more error-prone. This less-readable approach is
used in GCC's source code, which must port to old-fashioned C++ compilers that
lack support for multibyte characters in strings. It is not needed for Emacs
Lisp, and we shouldn't insist on it there.
Post by Dmitry Gutov
Post by Paul Eggert
(insert (symbol-name type)
        (format " is a type (of kind ‘"))
(help-insert-xref-button (symbol-name metatype)
                         'cl-help-type metatype)
(insert (format "’)"))
But this is not about diagnostic messages anymore, right? At the moment, IIRC,
these situations are handled by substitute-command-keys (and there's no need to
have curly quotes in the strings here).
At the moment these situations are handled by ‘format’. Simply changing the
code to use ‘substitute-command-keys’ and ASCII characters would not work in
cases like the above, because (insert (substitute-command-keys "')")) would
insert an apostrophe regardless of user text-quoting preference. Of course
there are workarounds but the workarounds are clumsy. (Also, the
ASCII-only-source approach would typically be a tad slower, which is annoying. :-)
Post by Dmitry Gutov
text -> (substitute-command-keys) -> text with "escaped" text prop
-> (translate-quotes) -> text with non-escaped straight quotes replaced
As mentioned above it's not obvious how to get that to work; even if we did so
it'd be more complicated and error-prone, and a bit slower. Hardly seems worth it.
Dmitry Gutov
2015-08-19 13:30:43 UTC
Permalink
Post by Paul Eggert
OK, thanks for reviewing; I installed the %q patch, and followed up with
simplifications that it allows at the C level, mostly by using %qs in
formats instead of uLSQM and uRSQM. This makes the C code easier to
read and so is a clear win. Doing something similar in Elisp code makes
the code harder to read, though, so I held off on that.
Please don't take it as a review: I only verified that it corresponded
to my proposal. But if you're not going to use the proposal as stated,
and are still keeping curlies in Elisp source code, then, from where I'm
standing, the new format sequence is not exactly justified.
Post by Paul Eggert
(message "Press %<?%> or %<h%> for help, %<q%> to quit")
That looks quite nice to me.
Post by Paul Eggert
which is harder to read and is more error-prone.
The %qs approach should be the least error-prone.
Post by Paul Eggert
At the moment these situations are handled by ‘format’. Simply changing
the code to use ‘substitute-command-keys’ and ASCII characters would not
work in cases like the above, because (insert (substitute-command-keys
"')")) would insert an apostrophe regardless of user text-quoting
preference.
Why? IIRC, when discussing a Lisp solution, you voted in favor of simple
translation logic, one that didn't even check for pairings.
substitute-command-keys could do the same.
Post by Paul Eggert
Post by Dmitry Gutov
text -> (substitute-command-keys) -> text with "escaped" text prop
-> (translate-quotes) -> text with non-escaped straight quotes replaced
As mentioned above it's not obvious how to get that to work; even if we
did so it'd be more complicated and error-prone, and a bit slower.
Hardly seems worth it.
As I've explained several times, we'll need this (or maybe some other,
comparable) complexity to solve related issues.
Richard Stallman
2015-08-20 16:56:05 UTC
Permalink
[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

Would someone like to summarize the various alternatives now
being considered for quotation in format strings?
--
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.
Paul Eggert
2015-08-20 19:32:01 UTC
Permalink
Post by Richard Stallman
Would someone like to summarize the various alternatives now
being considered for quotation in format strings?
Perhaps best illustrated with examples.

Here's what's implemented in master now (either works):

(format "‘add-to-list’ can't use var ‘%s’; use ‘push’ or ‘cl-pushnew’" sym)

(format "%qs can't use var %qs; use %qs or %qs"
        "add-to-list" sym "push" "cl-pushnew")

Digraphs were proposed, but not implemented:

(format "%<add-to-list%> can't use var %<%s%>; use %<push%> or %<cl-pushnew%>" sym)

There are lots of other possibilities of course, but these have been the main
ones considered so far.
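
To make the digraph alternative concrete, here is a rough sketch of how %< and %> could be expanded before the string reaches `format'. It is only an illustration under stated assumptions: the helper my-expand-quote-digraphs and its STYLE argument are hypothetical names, and this is not the proposal's actual implementation (which was never written).

```elisp
;; Hypothetical helper, not part of Emacs: expand the proposed %< and %>
;; digraphs into quote characters, then hand the result to `format'.
(defun my-expand-quote-digraphs (fmt &optional style)
  "Return FMT with %< and %> replaced by opening and closing quotes.
STYLE is the symbol `grave', `straight', or anything else for curved
quotes.  A literal %% is passed through untouched."
  (let ((open  (pcase style ('grave "`") ('straight "'") (_ "\u2018")))
        (close (pcase style ((or 'grave 'straight) "'") (_ "\u2019"))))
    (replace-regexp-in-string
     "%\\([%<>]\\)"
     (lambda (m)
       (cond ((string= m "%<") open)
             ((string= m "%>") close)
             (t m)))                    ; "%%" stays as it is
     fmt t t)))

;; Example call; `sym' in the thread's example becomes a literal symbol.
(format (my-expand-quote-digraphs
         "%<add-to-list%> can't use var %<%s%>; use %<push%>" 'grave)
        'foo)
```

The expansion runs before `format', so ordinary directives such as %s nested inside a digraph pair keep working.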
Dmitry Gutov
2015-08-19 22:38:09 UTC
Permalink
Do you like trigraphs too?!? (Sorry, couldn't resist. :-) Whatever
:)
niceness it has is clearly trumped by the niceness of using quotation
marks to represent quotation marks, and by the better compatibility of
quotation marks when running in older Emacs versions.
I can buy the "compatibility" argument. This approach is basically
predicated on using electric-quote-mode, however, and that brings in
curly quotes in docstrings.
Paul Eggert
2015-08-20 13:55:30 UTC
Permalink
Post by Dmitry Gutov
when discussing a Lisp solution, you voted in favor of simple
translation logic, one that didn't even check for pairings.
substitute-command-keys could do the same.
I suppose it could, yes. I'll look into that.
After looking into it I'm not so sure it's a good idea. Although it simplifies
substitute-command-keys, quite a few docstrings quote with apostrophes only
when quoting non-symbols, e.g., for rfc2368-mailto-prequery-index:

"Describes the portion of the url between 'mailto:' and '?'."

and it's not right to turn these into right single quotation marks:

"Describes the portion of the url between ’mailto:’ and ’?’."

Although we could change all these docstrings to grave or curved or double
quoting and that would make Emacs a bit more consistent, this is not clearly an
improvement and there still would be a problem with users' own docstrings.

This sort of thing is why Electric Quote mode doesn't transform isolated
apostrophes, and there is something to be said for keeping
substitute-command-keys somewhat consistent with that mode.

So I'm inclined to leave this alone, and stick with the less-aggressive approach
that substitute-command-keys is currently using.
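
The difference between the two approaches can be sketched in a few lines. This is only an illustration: both function names are hypothetical, and neither is claimed to match what substitute-command-keys actually does internally.

```elisp
;; Hypothetical illustrations, not Emacs APIs.
(defun my-translate-all-apostrophes (text)
  "Aggressive variant: turn every apostrophe in TEXT into U+2019."
  (replace-regexp-in-string "'" "\u2019" text t t))

(defun my-translate-paired-quotes (text)
  "Conservative variant: only rewrite a grave accent paired with an apostrophe.
Text quoted 'like this', with apostrophes on both sides, is left alone."
  (replace-regexp-in-string "`\\([^`'\n]*\\)'" "\u2018\\1\u2019" text t))

(let ((doc "Describes the portion of the url between 'mailto:' and '?'."))
  (list (my-translate-all-apostrophes doc)   ; both marks become U+2019
        (my-translate-paired-quotes doc)))   ; unchanged
```

The conservative variant still converts a conventional `car' reference, but it leaves the apostrophe-only quoting in the rfc2368 docstring untouched.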
Artur Malabarba
2015-08-17 23:25:03 UTC
Permalink
(message "Buffer %qs is read only." buf)
I could easily implement that and turn existing uses of "‘%s’" to "%qs",
if you think it'd help.
I think so.
Me too.
Alan Mackenzie
2015-08-17 12:15:13 UTC
Permalink
Hello, Paul.
Post by Paul Eggert
Doesn't (format "Expand `%s'? " string) use the "preferred quoting style"?
No, as that would break usages such as (format "\\`%s\\'" (car e)). This sort
of thing is reasonably common, so it'd be a compatibility problem to change
‘format’ to replace accent-grave.
This user's preference is to be able to type quote marks in Emacs source
files directly without mincing workarounds, and to be able to search for
them likewise.
You can continue to type quote marks as before, and Emacs will behave as it
always has.
This may be true, but is completely irrelevant to the points I am making.
Post by Paul Eggert
The only change here is that if you want quotes in diagnostics to
come out in the user's preferred style, a "mincing workaround" of some
sort is needed.
The main change here is that from now on, particularly if so-called
Electric Quote mode [*] is used, we're going to end up with a chaotic
mix of ascii quotes and curly quotes in our source code. So are users
in their source code.

[*] "Electric Quote Mode" isn't an electric mode at all (see definition
in electric.el) - it's a translation mode, and would, perhaps, be less
confusing if renamed "Translate Quote Mode".

These source files, stuffed full of non-working characters, are more
difficult to work with, whether with less, or with grep, or whatever,
and even inside of Emacs.
Post by Paul Eggert
By the way, it's easy to search for quote marks in either style: just type C-s `
or C-s '. And it's easy to type curved quote marks by using Electric Quote
mode.
Electric Quote Mode is a global mode. It desperately needs to be buffer
local. As it is, if EQM is enabled, non-working characters are going to
be accidentally inserted into buffers where they do not belong.
Post by Paul Eggert
These are both new features. If you don't want to use these new
features, that's fine too.
I don't want to be lumbered with editing source files with non-working
characters.

Paul, these changes of yours are simply the Wrong Thing. The fact that
one "needs" such flaccid workarounds like EQM and the folding of quote
characters with quote characters in (some of) the searching code should
be taken as a hint just to stop and think hard.

I predict that there will be an open-ended series of bugs following from
this, little things that just don't quite work anymore, and their fixing
will be difficult and introduce yet more complexity and more little
things that don't quite work.

Emacs is steadily getting more complicated, and these changes are
gratuitous complexity. I would say that Emacs's biggest danger at the
moment is disappearing up its own complexity. We should be resisting
this process, not accelerating it.
--
Alan Mackenzie (Nuremberg, Germany).
Drew Adams
2015-08-17 14:40:55 UTC
Permalink
Post by Alan Mackenzie
Paul, these changes of yours are simply the Wrong Thing. The fact
that one "needs" such flaccid workarounds like EQM and the folding of
quote characters with quote characters in (some of) the searching code
should be taken as a hint just to stop and think hard.
Amen. Handwriting on wall. Look. See.
Post by Alan Mackenzie
I predict that there will be an open-ended series of bugs following
from this, little things that just don't quite work anymore, and their
fixing will be difficult and introduce yet more complexity and more
little things that don't quite work.
Emacs is steadily getting more complicated, and these changes are
gratuitous complexity. I would say that Emacs's biggest danger at
the moment is disappearing up its own complexity. We should be
resisting this process, not accelerating it.
Yes. As I said a while back:

So far, my impression is that there have been a shiPload of
deep changes that try to work around all kinds of problems
that have been introduced - and to work around the problems
that those workarounds have introduced...

I haven't seen such an invasive change in a long time (never?).
Makes Rube Goldberg machines look Occam-elegant. And for
what?

For what? Seriously.
Eli Zaretskii
2015-08-17 16:53:08 UTC
Permalink
This change (or maybe one of the previous ones in this series) caused
the following messed-up message to be displayed during the build:

./temacs --batch --load loadup bootstrap
[...]
Loading minibuffer...
Loading d:/gnu/git/emacs/trunk/lisp/abbrev.el (source)...
lisp/abbrev.el: Γארwith-wrapper-hookΓאש is an obsolete macro (as of 24.4); use a <foo>-function variable modified by `add-function'.
Loading d:/gnu/git/emacs/trunk/lisp/simple.el (source)...
lisp/simple.el: Γארwith-wrapper-hookΓאש is an obsolete macro (as of 24.4); use a <foo>-function variable modified by `add-function'.

This is displayed on a Windows console which is incapable of showing
curved quotes. Its codeset is cp862.

To reproduce this, I do the following:

. remove lisp/abbrev.elc and lisp/simple.elc
. touch some of the src/*.c files
. type "make" and watch the fun

All the other diagnostics I've seen are perfectly legible.

Let me know if I can help you with additional information.
Paul Eggert
2015-08-17 19:06:09 UTC
Permalink
Post by Eli Zaretskii
This is displayed on a Windows console which is incapable of showing
curved quotes. Its codeset is cp862.
Thanks for reporting that. I reproduced the problem on my Fedora box by running
with --batch in the C locale, and fixed it in master commit
7f2b98d09d113e0f9b1fffb0524622adfafe3ac4. Please give it a try. If the fix
doesn't work for you, could you please investigate why the new using_utf8
function returns true in your environment? It should return false.
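
The fix itself is in C and is not shown in this thread. As a rough Lisp-level analogue of the same idea (fall back to ASCII quotes when the curved ones cannot be shown), one could write something like the sketch below. The function name my-quote-pair is hypothetical, and `char-displayable-p' is used here as an assumption about how to test the terminal, not as a description of what using_utf8 does.

```elisp
;; Hypothetical sketch, not the code from the commit mentioned above.
(defun my-quote-pair ()
  "Return a cons (OPEN . CLOSE) of quote strings suited to this display.
Curved quotes are used only when both characters look displayable;
otherwise fall back to the traditional grave accent and apostrophe."
  (if (and (char-displayable-p #x2018)
           (char-displayable-p #x2019))
      (cons "\u2018" "\u2019")
    (cons "`" "'")))

(defun my-quote (name)
  "Return NAME wrapped in the quote pair chosen by `my-quote-pair'."
  (let ((pair (my-quote-pair)))
    (concat (car pair) name (cdr pair))))
```

On a console like the cp862 one described above, the intent is that (my-quote "with-wrapper-hook") yields the ASCII form rather than undisplayable bytes.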
Eli Zaretskii
2015-08-17 19:34:46 UTC
Permalink
Date: Mon, 17 Aug 2015 12:06:09 -0700
Post by Eli Zaretskii
This is displayed on a Windows console which is incapable of showing
curved quotes. Its codeset is cp862.
Thanks for reporting that. I reproduced the problem on my Fedora box by running
with --batch in the C locale, and fixed it in master commit
7f2b98d09d113e0f9b1fffb0524622adfafe3ac4. Please give it a try.
That fixed the problem, thanks.
Paul Eggert
2015-08-17 16:53:37 UTC
Permalink
Post by Alan Mackenzie
The main change here is that from now on, particularly if so-called
Electric Quote mode [*] is used, we're going to end up with a chaotic
mix of ascii quotes and curly quotes in our source code.
Although I also would prefer a simpler approach (one that consistently uses
curved quotes), you've objected to that, necessitating a "chaotic" compromise.
Post by Alan Mackenzie
The fact that
one "needs" such flaccid workarounds like EQM and the folding of quote
characters with quote characters in (some of) the searching code should
be taken as a hint just to stop and think hard.
For years Emacs has had significant problems in editing and searching and
generating non-ASCII text. Making Emacs better in this area will inevitably
have teething problems, and we'll inevitably come up with worse solutions before
coming up with better ones. But we shouldn't just do nothing: these are real
problems that need to be addressed.

Insisting that Emacs developers live and work in an ASCII ghetto has contributed
to these problems, as it has led us to discount the importance of non-ASCII
editing in the real world. (I've been as guilty of this as the next guy, by the
way -- I'm not trying to cast aspersions on anybody in particular.) It'll be
helpful to break out of these old mindsets, and if regularly using a few
non-ASCII characters in Emacs source will help us do that, then that'll be a
good thing.
Dmitry Gutov
2015-08-17 18:22:19 UTC
Permalink
Post by Paul Eggert
Although I also would prefer a simpler approach (one that consistently
uses curved quotes), you've objected to that, necessitating a "chaotic"
compromise.
"Necessitating", here, is a matter of opinion.
Post by Paul Eggert
Insisting that Emacs developers live and work in an ASCII ghetto has
contributed to these problems, as it has led us to discount the
importance of non-ASCII editing in the real world. (I've been as guilty of
this as the next guy, by the way -- I'm not trying to cast aspersions on
anybody in particular.) It'll be helpful to break out of these old
mindsets, and if regularly using a few non-ASCII characters in Emacs
source will help us do that, then that'll be a good thing.
The image that comes to mind, is of an odd kid who's late to a party (to
which he's been invited for the first time), who wants to fit in, and to
that end, drinks too much. It makes me sad.

Unicode characters are good, we should use them in text, but not in the
basic syntax of the language or its environment.
Stephen J. Turnbull
2015-08-18 03:55:37 UTC
Permalink
Post by Dmitry Gutov
Unicode characters are good, we should use them in text, but not in the
basic syntax of the language or its environment.
Oh, so you want a computer language where characters are used only in
strings? Good trick, that.

Seriously, now that we do have Unicode, and good implementations of it
(although Emacs's isn't complete yet, it's certainly usable), there's
really no excuse for _a priori_ restricting the character set used in
a computer language. Yes, discipline is necessary: the *size* of the
character set (aside from identifier constituents) should not be
expanded without good reason. But which characters are used shouldn't
be decided on the basis of historically limited charsets. They should
be chosen because they are appropriate to their syntactic roles.

Backward compatibility is important. The old-timers have a point --
the ASCII workarounds we've used for decades still work, and adding
new synonyms or changing the syntax to substitute more accurate
versions is costly to experienced users and developers. I personally
agree with Paul -- the appropriate place to experiment with this kind
of thing is with string conventions that don't change the meaning of
Lisp programs per se, although they do affect parsing of output and
editing Lisp programs. It's all about eating your own dogfood. But
although I like these changes, they are hardly a no-brainer.

On the other hand, not liking input methods? That's not admissible:
Emacs is the world's biggest, most complex input method, and that is
its primary mission. If you can handle Emacs, you can learn a couple
dozen additional keystroke combinations to input new syntactically
significant characters (and surely the extended repertoire will
include only a few such for many years -- "a couple dozen" is a
generous concession to reactionary fears).
Dmitry Gutov
2015-08-18 10:51:45 UTC
Permalink
Post by Stephen J. Turnbull
Oh, so you want a computer language where characters are used only in
strings? Good trick, that.
Ha-ha.
Post by Stephen J. Turnbull
Seriously, now that we do have Unicode, and good implementations of it
(although Emacs's isn't complete yet, it's certainly usable), there's
really no excuse for _a priori_ restricting the character set used in
a computer language.
In theory.
Post by Stephen J. Turnbull
Yes, discipline is necessary: the *size* of the
character set (aside from identifier constituents) should not be
expanded without good reason. But which characters are used shouldn't
be decided on the basis of historically limited charsets. They should
be chosen because they are appropriate to their syntactic roles.
Historically limited or not, my keyboard, in English layout, only
contains a given set of characters. And those are the ones we're
comfortable typing.
Post by Stephen J. Turnbull
Emacs is the world's biggest, most complex input method, and that is
its primary mission. If you can handle Emacs, you can learn a couple
dozen additional keystroke combinations to input new syntactically
significant characters (and surely the extended repertoire will
include only a few such for many years -- "a couple dozen" is a
generous concession to reactionary fears).
Why would I want to handle them? Having to use input methods adds a
certain constant overhead, motoric and mental.

That might be fine for a language like APL, where you're forced to use
an input method almost everywhere. There, you sacrifice the ease of
input for succinctness across the board. That won't happen for Emacs Lisp.
Óscar Fuentes
2015-08-18 12:31:46 UTC
Permalink
"Stephen J. Turnbull" <***@xemacs.org> writes:

[snip]
Post by Stephen J. Turnbull
Seriously, now that we do have Unicode, and good implementations of it
(although Emacs's isn't complete yet, it's certainly usable), there's
really no excuse for _a priori_ restricting the character set used in
a computer language.
[snip]

My experience says that there are plenty of *reasons* for being
extremely cautious about using anything other than ASCII in programs,
(including (doc-)strings, to a lesser extent.)

Just see the issues faced on this topic of curved quotes, and it is a
single case of Unicode use.

I'm all for modernization of Emacs, but this change just doesn't make
sense to me.
Alan Mackenzie
2015-08-17 17:35:51 UTC
Permalink
Hello, Paul.
Post by Paul Eggert
Post by Alan Mackenzie
The main change here is that from now on, particularly if so-called
Electric Quote mode [*] is used, we're going to end up with a chaotic
mix of ascii quotes and curly quotes in our source code.
Although I also would prefer a simpler approach (one that consistently uses
curved quotes), you've objected to that, necessitating a "chaotic" compromise.
Post by Alan Mackenzie
The fact that one "needs" such flaccid workarounds like EQM and the
folding of quote characters with quote characters in (some of) the
searching code should be taken as a hint just to stop and think
hard.
For years Emacs has had significant problems in editing and searching and
generating non-ASCII text.
Just to be entirely clear, I've got nothing against "non-ASCII" text.
I've been typing and editing text with characters like £, ä, ß, Ü for
decades, and would not use a program which prevented me from doing so.

What I object to is _non-working_ characters - characters which appear
on nobody's keyboard (see Bastien's question about typing curly quotes)
and are problematic to display (See Eli's recent post, for example). I
would object equally if you were, say, to insist that all symbol names
had to be written in Greek characters, and you converted all our source
files overnight to this convention. With all due respect to our friends
who grok Greek.
Post by Paul Eggert
Making Emacs better in this area will inevitably have teething
problems, and we'll inevitably come up with worse solutions before
coming up with better ones.
Imposing non-working characters on Emacs hackers has no connection with
problems users may have with non-ascii (working-)characters. They're
two distinct themes.
Post by Paul Eggert
But we shouldn't just do nothing: these are real problems that need to
be addressed.
The two problems are separate. I'm not aware of the problems with
non-ascii working-characters, probably because I use only Latin based
scripts. If there are problems there, let them be solved!

But I dispute that using ` and ' as quoting marks, in contexts used by
hackers, is a problem. If it is, it is less of a problem than those
caused by having non-working characters throughout our sources.
Post by Paul Eggert
Insisting that Emacs developers live and work in an ASCII ghetto has contributed
to these problems, as it has led us to discount the importance of non-ASCII
editing in the real world. (I've been as guilty of this as the next guy, by the
way -- I'm not trying to cast aspersions on anybody in particular.)
Again, nobody's insisting this - the topic is working characters vs.
non-working characters, not ascii vs. the rest.
Post by Paul Eggert
It'll be helpful to break out of these old mindsets, and if regularly
using a few non-ASCII characters in Emacs source will help us do that,
then that'll be a good thing.
Again, nobody objects to the non-ascii characters - it's the non-working
characters which are the problem.
--
Alan Mackenzie (Nuremberg, Germany).
Alan Mackenzie
2015-08-18 10:39:21 UTC
Permalink
Hello again, Paul.
What I object to is _non-working_ characters - characters which appear
on nobody's keyboard (see Bastien's question about typing curly quotes)
and are problematic to display (See Eli's recent post, for example).
As for the philosophical issue, I'm afraid we'll just have to disagree.
Sorry, you've lost me. Which philosophical issue would that be?
In my experience, hackers prefer to scratch the itches they feel
personally. Discouraging non-ASCII characters even in relatively
innocuous contexts like doc strings and diagnostics is connected to
putting off the task of making it easier to edit text with these
characters.
Again, you're conflating the two different issues of non-ascii characters
and non-working characters. These issues are separate, and don't
influence each other. Imposing the non-working characters left and right
curly quotes on Emacs hackers will surely make no difference as to how
easy or difficult it is to edit Russian, or Greek, or Japanese or Korean
in Emacs. Or am I missing something?
--
Alan Mackenzie (Nuremberg, Germany).
Paul Eggert
2015-08-18 16:45:38 UTC
Permalink
Post by Alan Mackenzie
Sorry, you've lost me. Which philosophical issue would that be?
you're conflating the two different issues of non-ascii characters
and non-working characters.
We disagree about this. They're not orthogonal issues. They're so closely
related that they're almost the same issue. Emacs currently makes it harder to
deal with non-ASCII and/or non-working characters than it could. Restricting
our source code to ASCII and/or working characters has helped us put our heads
in the sand about the problem.
Alan Mackenzie
2015-08-18 17:17:15 UTC
Permalink
Hello, Paul.
Post by Paul Eggert
Post by Alan Mackenzie
Sorry, you've lost me. Which philosophical issue would that be?
you're conflating the two different issues of non-ascii characters
and non-working characters.
We disagree about this. They're not orthogonal issues. They're so closely
related that they're almost the same issue.
Well, as I said, I edit texts with non-ascii characters frequently, and
don't experience any particular difficulty with them. Having to type in
a decimal/hex code for a non-working character (or, even worse, having
to look up an input method for it) just stops me in my tracks. An
example is when I reply to Óscar, "Ó" being outside my working character
set.
Post by Paul Eggert
Emacs currently makes it harder to deal with non-ASCII and/or
non-working characters than it could.
Could you give an example of this (pertaining, preferably, to non-ascii
working characters)? Thinking about it, maybe having to use Latin keys
in lots of bindings when one's keyboard's preferred layout is non-Latin
could be quite tiresome. But I've never seen anybody complaining about
this.
Post by Paul Eggert
Restricting our source code to ASCII and/or working characters has
helped us put our heads in the sand about the problem.
Whatever problem that might be, the solution surely cannot be
artificially to inflict it on ourselves.
--
Alan Mackenzie (Nuremberg, Germany).
Paul Eggert
2015-08-18 19:25:18 UTC
Permalink
Post by Alan Mackenzie
Well, as I said, I edit texts with non-ascii characters frequently, and
don't experience any particular difficulty with them. Having to type in
a decimal/hex code for a non-working character (or, even worse, having
to look up an input method for it) just stops me in my tracks. An
example is when I reply to Óscar, "Ó" being outside my working character
set.
Post by Paul Eggert
Emacs currently makes it harder to deal with non-ASCII and/or
non-working characters than it could.
Could you give an example of this (pertaining, preferably, to non-ascii
working characters)?
You gave an example in your previous paragraph, where you're stopped in your
tracks if you have to type "Ó" into Emacs.
Post by Alan Mackenzie
Whatever problem that might be, the solution surely cannot be
artificially to inflict it on ourselves.
There's nothing artificial about using a character to represent itself in
typical usage in a doc string or a diagnostic. What's artificial is requiring
users to laboriously type and read ASCII-only circumlocutions instead.
Alan Mackenzie
2015-08-18 20:42:11 UTC
Permalink
Hello, Paul.
Post by Paul Eggert
Post by Alan Mackenzie
Well, as I said, I edit texts with non-ascii characters frequently, and
don't experience any particular difficulty with them. Having to type in
a decimal/hex code for a non-working character (or, even worse, having
to look up an input method for it) just stops me in my tracks. An
example is when I reply to Óscar, "Ó" being outside my working character
set.
Post by Paul Eggert
Emacs currently makes it harder to deal with non-ASCII and/or
non-working characters than it could.
Could you give an example of this (pertaining, preferably, to non-ascii
working characters)?
You gave an example in your previous paragraph, where you're stopped in your
tracks if you have to type "Ó" into Emacs.
OK, but I can't really see the connection between this (and what Chad
Brown was unhappy about) and the replacement in our source code of ` and
' by curlies. Nobody having to type "Ó" on a Spanish keyboard layout
would have any trouble.
Post by Paul Eggert
Post by Alan Mackenzie
Whatever problem that might be, the solution surely cannot be
artificially to inflict it on ourselves.
There's nothing artificial about using a character to represent itself in
typical usage in a doc string or a diagnostic. What's artificial is requiring
users to laboriously type and read ASCII-only circumlocutions instead.
There is nothing laborious about hitting key 41 or key 40. Anybody who
finds this laborious will not be programming in lisp for very long.

There is nothing indirect about using ` and ' to "stand for themselves",
in quoting things. I've never heard of an Emacs hacker experiencing
difficulty reading things quoted `like this'.

On the contrary, holding down <AltGr> while typing on the numeric
keypad, successively 2, 0, 1, 8 or 2, 0, 1, 9 is laborious indeed, even
assuming that these codes have been retained in memory. For that is
what a user with a normal keyboard layout, outside of Emacs, will be
forced to do. As we know there are workarounds inside Emacs to help
with this.
--
Alan Mackenzie (Nuremberg, Germany).
Paul Eggert
2015-08-18 21:40:17 UTC
Permalink
Post by Alan Mackenzie
Nobody having to type "Ó" on a Spanish keyboard layout
would have any trouble.
Spanish keyboards typically do not have "Ó", so your restrictive definition of
"working" would say that "Ó" is trouble even on a Spanish keyboard.
Post by Alan Mackenzie
On the contrary, holding down <AltGr> while typing on the numeric
keypad, successively 2, 0, 1, 8 or 2, 0, 1, 9 is laborious indeed, even
assuming that these codes have been retained in memory. For that is
what a user with a normal keyboard layout, outside of Emacs, will be
forced to do.
The criterion cannot be that any text editor in any configuration should be able
to edit Emacs source code with no trouble. That hasn't ever been true. All
that's needed is that people be able to edit their source code in Emacs, with
occasional use by other text editors, when properly configured. All modern text
editors can handle UTF-8 text files when properly configured, so this is not a
problem. I just now checked 'less' and 'vim', for example, and they work fine
with UTF-8 curved quotes even on the Linux console if it's properly configured.
Óscar Fuentes
2015-08-18 22:44:55 UTC
Permalink
Post by Paul Eggert
Post by Alan Mackenzie
Nobody having to type "Ó" on a Spanish keyboard layout
would have any trouble.
Spanish keyboards typically do not have "Ó", so your restrictive
definition of "working" would say that "Ó" is trouble even on a
Spanish keyboard.
Every Spaniard who ever used a typewriter knows that accented letters
are composed like this: `o. This is not a problem at all and the
approach generalizes to other cases, such as French and Catalonian,
which have more accent types than Spanish. A Spaniard would have no
problem at all typing the French è, for instance. Furthermore, a
Spaniard would have no problem using a USA keyboard with the
USA-international input method: we quickly figure out that ñ is ~n.

However, I need to learn how to write the curly quotes, and then
remember the method. This makes things harder for us occasional
contributors, not to mention beginners. I'm pretty sure that copy&paste
will become more popular with this convention.

I admit that curly quotes are nicer, but that's easily achievable: make
them appear on *Help* and other buffers that show docstrings. Dmitry
suggests this, and his comment about modern markup languages restricting
themselves to ASCII is something to think about.

If your goal is to make Emacs more Unicode-friendly, this is not the way
to go. Identify the problematic areas, taking feedback from users like
Chad Brown, and then proceed to fix those shortcomings with those users
acting as "customers" (i.e. judges of the proposed solution.)

I admit that I'm intrigued by your plan about how this change will
initiate an evolution of the Emacs input system that will make it easier to
type exotic characters (defining "exotic" by "something that it is
infrequent in your daily usage.") Maybe describing the specific
user-visible improvements that this change will help to bring into
reality would buy you more support.
Bastien
2015-08-18 23:11:37 UTC
Permalink
Post by Óscar Fuentes
I admit that curly quotes are nicer, but that's easily achievable: make
them appear on *Help* and other buffers that show docstrings. Dmitry
suggests this, and his comment about modern markup languages restricting
themselves to ASCII is something to think about.
1+
Paul Eggert
2015-08-18 23:41:01 UTC
Permalink
Post by Óscar Fuentes
I admit that curly quotes are nicer, but that's easily achievable: make
them appear on *Help* and other buffers that show docstrings.
That was done weeks ago. This thread is about something else: quotes in
diagnostics. It's not as easy to get diagnostics right, which is why we're
having this discussion.
Post by Óscar Fuentes
Maybe describing the specific
user-visible improvements that this change will help to bring into
reality would buy you more support.
I have no secret master plan for revolutionizing Emacs. I'm just trying to make
improvements one step at a time. If it helps, I think diagnostics are the last
major component of getting Emacs quoting to follow common modern practice
(controlled by user preference of course).
Óscar Fuentes
2015-08-19 00:29:29 UTC
Permalink
Post by Paul Eggert
Post by Óscar Fuentes
I admit that curly quotes are nicer, but that's easily achievable: make
them appear on *Help* and other buffers that show docstrings.
That was done weeks ago. This thread is about something else: quotes
in diagnostics. It's not as easy to get diagnostics right, which is
why we're having this discussion.
I thought that, in this subthread, we were discussing the introduction
of curly quotes in source code. Sorry for the confusion.
Post by Paul Eggert
Post by Óscar Fuentes
Maybe describing the specific
user-visible improvements that this change will help to bring into
reality would buy you more support.
I have no secret master plan for revolutionizing Emacs. I'm just
trying to make improvements one step at a time. If it helps, I think
diagnostics are the last major component of getting Emacs quoting to
follow common modern practice (controlled by user preference of
course).
There are some people (me included) that doubt that using hard-to-type
chars in source code is an improvement, quite the opposite. I've yet to
see a justification for this change. It would be great to have an
explanation (as specific as possible) about how it will help to improve
Emacs.
Óscar Fuentes
2015-08-19 00:38:19 UTC
Permalink
Post by Óscar Fuentes
Post by Paul Eggert
Post by Óscar Fuentes
I admit that curly quotes are nicer, but that's easily achievable: make
them appear on *Help* and other buffers that show docstrings.
That was done weeks ago. This thread is about something else: quotes
in diagnostics. It's not as easy to get diagnostics right, which is
why we're having this discussion.
I thought that, in this subthread, we were discussing the introduction
of curly quotes in source code. Sorry for the confusion.
Oh, but the diagnostics are part of the source code, of course. I was
not so confused after all.

I know that docstrings are displayed with curly quotes. My point (and I
think Dmitry's) is that the change should end there. Maybe *transform*
the diagnostics too, but that's not so important, because diagnostics are
not intended to be part of regular reading by users.

[snip]
Paul Eggert
2015-08-19 18:40:26 UTC
Permalink
Perhaps the thread was initially about quotes in diagnostics, but a
few days ago the issue of doc strings seemed to come up. That is why
I started posting.
Did I misunderstand?
I think so. Your first message about doc strings, dated Monday:

http://lists.gnu.org/archive/html/emacs-devel/2015-08/msg00608.html

was in reply to a Sunday message about diagnostics:

http://lists.gnu.org/archive/html/emacs-devel/2015-08/msg00566.html

The docstring changes were mostly done weeks ago and haven't changed much
recently. The diagnostics changes are within the past week: they were initiated
by Bug#21222 and they kicked off this thread.
Wolfgang Jenkner
2015-08-20 13:37:10 UTC
Permalink
Post by Paul Eggert
If it helps, I think
diagnostics are the last major component of getting Emacs quoting to
follow common modern practice (controlled by user preference of
course).
I'd think that "common modern practice" for GNU projects is described in
(info "(gettext) po/LINGUAS").

IIUC, the description there recommends that the usual "ugly" quotes
should be used in source files and translated at run time according to
the locale.

So, provided things are not messed up by a third party, for the en
locale you get curly quotes only if you set a variant locale like
en@quot or en@boldquot (after making sure that your system supports
those locales). E.g., bash provides such message files.
Paul Eggert
2015-08-20 20:23:36 UTC
Permalink
Post by Wolfgang Jenkner
I'd think that "common modern practice" for GNU projects is described in
(info "(gettext) po/LINGUAS").
IIUC, the description there recommends that the usual "ugly" quotes
should be used in source files and translated at run time according to
the locale.
gettext caters to C and that part of its manual makes sense for languages like
C, because portable C programs can't easily put non-ASCII characters into string
literals and C programmers don't have a practical alternative to putting "ugly"
quotes of some sort into their string literals. This sort of uglification isn't
necessary for Emacs Lisp, though.

If Emacs were to use gettext, presumably it would wrap every English-language
diagnostic in a call to gettext, even diagnostics that don't involve quotes.
For example:

(format (gettext "Buffer %S has a running process; kill it? ")
        (buffer-name (current-buffer)))
(format (gettext "‘add-to-list’ can't use var ‘%s’") sym)
...

We'd also need to ask the Translation Project to maintain a set of translations for
all the Emacs diagnostics. This would work regardless of the quoting style used
in the source code, and would be a good project to do, if someone wanted to do
it. It'd be a much bigger project than fixing quoting, though, as the set of
changes involved to Emacs would be more complicated than the above simple
examples suggest.
Eli Zaretskii
2015-08-21 07:41:53 UTC
Permalink
Date: Thu, 20 Aug 2015 15:37:10 +0200
Post by Paul Eggert
If it helps, I think
diagnostics are the last major component of getting Emacs quoting to
follow common modern practice (controlled by user preference of
course).
I'd think that "common modern practice" for GNU projects is described in
(info "(gettext) po/LINGUAS").
IIUC, the description there recommends that the usual "ugly" quotes
should be used in source files and translated at run time according to
the locale.
That description is for messages in locale-specific languages. If
someone wants to work on infrastructure for translating Emacs messages
to other languages, then they should include in that infrastructure
facilities for converting the quotes to locale-specific conventions
(or maybe rely on the translators to replace the quotes as part of the
translated messages, as gettext-based projects do).

But as long as Emacs speaks to the users only in US English, it is IMO
wrong to replace the quotes by locale-specific ones, exactly for the
same reason it is wrong to sort multi-lingual text in a buffer using
locale-specific sorting rules: those locale-specific rules were
invented for sorting the locale's language(s), not for sorting
multi-lingual text. E.g., it would be very weird for me to see a
message quoting, say, Chinese text with German-style quotes just
because I happen to be in the de_DE locale!
Stephen J. Turnbull
2015-08-19 06:31:47 UTC
Permalink
Dmitry suggests this, and his comment about modern markup languages
restricting themselves to ASCII is something to think about.
Not really. No chicken developed from that egg because there was no
chicken to lay the egg in the first place. By and large programmers'
environments are deficient in respect of input methods, especially in
the U.S., and until a few years ago solid multilingual Unicode
environments weren't really available (and still aren't on Windows, if
I understand Eli's descriptions correctly). So programmers (who
design markup languages) restrict themselves to ASCII-based markup.
It's only become reasonable to think about going beyond ASCII in the
last 5 years or so (if you want to maintain fairly general appeal).
And there's the counterexample of Xe[La]TeX, which in fact developed
for Mac, the most complete Unicode implementation available at the
time -- a single anecdote, but very suggestive IMHO.

Emacs is the perfect environment to experiment with *discoverable*
*multilingual* input methods. AFAIK, they don't exist yet,
*anywhere*. Apple is going backwards, even. Microsoft doesn't have
them, either. The proprietary technology is quite good -- within the
context of monolingual environments (which is where the money is, even
in Europe the number of companies where individuals need multilingual
environments is limited). But they require effort for neophytes to
learn, and are less than useful for "inputting 'exotic' characters".
As far as I can tell, there's nothing better out there for free
software, either -- we're now on our fourth or fifth generation of new
input management frameworks for GNOME and/or KDE, and *still* the most
frequent n00b question on the Tokyo Linux Users Group[sic] is "I just
upgraded MyDistro and now I can't input Japanese in WhateverOffice".

My Chinese students and Buddhist scholar friends all use Macs because
it's very easy to switch among input methods (Chinese, Japanese, and
Sanskrit are radically different -- it's sort of possible to share an
input method between Chinese and Japanese, but it's very painful).
But all of these methods are monolingual, and must be learned
separately (or "taught", as most "learn" the user's habits, changing
priorities in the dictionaries and storing common sequences of words
for "predictive translation").

Emacs at least has Quail, giving language flexibility as good or
better than Apple, although the input methods themselves are static,
so aren't as user-friendly as the proprietary ones that "learn" the
users' habits. And (one small step for Emacs, one giant step for
mankind) Quail methods are self-documenting (although again
discoverability needs to be improved for the purpose of "typing
'exotic' characters").
I admit that I'm intrigued by your plan about how this change will
initiate an evolution on Emacs input system that will make it easier to
type exotic characters (defining "exotic" by "something that is
infrequent in your daily usage.")
By giving people an itch they want to scratch. Most people will just
cut'n'paste or add ad hoc keybindings for the characters they need.
Some people will do more, and sooner or later one of them will come up
with a much better way to do input methods. It's not obvious to me
what that will be, and it's probably useless to ask Paul what it will
be too.

David K pointed out that there are some useful ideas in x-symbol.
That might be one place to look.

Also, besides input methods, it will likely lead to improvements in
other technologies such as searching (adding character classes of
"cognates" such as ` and ‘, for example -- this is useful for
repertoires like Japanese which has about a dozen variants on open
parenthesis more or less commonly used in text, as well as a pile of
numeral variants used for paragraph numbering, and the like).

Those opposed to the change will cry YAGNI, and that's true -- if you
live in an 8-bit world anyway, you just can't afford that kind of
redundancy. But like it or not, the world is now mostly Unicode and
that will only increase. Japanese is probably the most perverse
character set in existence, but I believe Chinese and Korean also have
similar issues of many classes of characters that have redundant
functionality, and it shows up in other places (eg, arrows and
emoticons).
Maybe describing the specific user-visible improvements that this
change will help to bring into reality would buy you more support.
The user-visible improvements have been described and are easily
visible to the eye desiring to see them. Tastes just differ here; the
people who don't like the change see little to no improvement, and
IIUC Drew even considers it a clear step backward aesthetically.
Óscar Fuentes
2015-08-19 06:58:24 UTC
Permalink
"Stephen J. Turnbull" <***@xemacs.org> writes:

[snip]
Post by Stephen J. Turnbull
Emacs is the perfect environment to experiment with *discoverable*
*multilingual* input methods.
In my experiments with Unicode in source code, input was the least of
the concerns (as long as Emacs was used.) Display was the no-no part.
Not because of Emacs bugs or limitations, by the way.

If the goal is to improve input for other scripts, I have no idea how
forcing curly quotes in Emacs' source code will help.

[snip]
Post by Stephen J. Turnbull
Maybe describing the specific user-visible improvements that this
change will help to bring into reality would buy you more support.
The user-visible improvements have been described and are easily
visible to the eye desiring to see them.
Maybe I skipped the post that described the improvements, but what I've
seen so far is "because it is modern", "because beginners think our
current style is weird", "because other GNU projects use it", "because
this will help improve our Unicode support"... Am I missing something?

[snip]
Stephen J. Turnbull
2015-08-19 09:09:19 UTC
Permalink
Post by Óscar Fuentes
Maybe I skipped the post that described the improvements, but what
I've seen so far is "because it is modern",
I think you misread. The point is that modernizing projects are
already doing it and the sky didn't fall, not that calling a feature
"modern" makes it attractive.
Post by Óscar Fuentes
"because beginners think our current style is weird",
Indeed, they do. So do a lot of people who are way beyond beginners.
Post by Óscar Fuentes
"because other GNU projects use it",
Indeed, they do.
Post by Óscar Fuentes
"because this will help improve our Unicode support"...
I think it will, as I gather Paul does.
Post by Óscar Fuentes
Am I missing something?
"It improves readability." But that's a matter of taste.
Andreas Schwab
2015-08-19 09:13:49 UTC
Permalink
Post by Stephen J. Turnbull
Post by Óscar Fuentes
"because beginners think our current style is weird",
Indeed, they do. So do a lot of people who are way beyond beginners.
Should we switch to `...` then ?

Andreas.
--
Andreas Schwab, SUSE Labs, ***@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
Andreas Schwab
2015-08-19 14:47:52 UTC
Permalink
Post by Andreas Schwab
Should we switch to `...` then ?
Looks like a deprecated Python idiom to me, so I'd say no.
Everyone is using it, so it must be cool.

Andreas.
--
Andreas Schwab, SUSE Labs, ***@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
Stephen J. Turnbull
2015-08-19 16:10:48 UTC
Permalink
Post by Andreas Schwab
Everyone is using it, so it must be cool.
To borrow a line from Eli, you are confusing me with someone else. I
never argued that, and I don't recall anybody in favor of the change
arguing it either.
Dmitry Gutov
2015-08-20 23:46:15 UTC
Permalink
Post by Andreas Schwab
Should we switch to `...` then ?
Looks like a deprecated Python idiom to me,
You might want to look into modern documentation markup languages, then.
Eli Zaretskii
2015-08-19 14:16:28 UTC
Permalink
Date: Wed, 19 Aug 2015 15:31:47 +0900
until a few years ago solid multilingual Unicode environments
weren't really available (and still aren't on Windows, if I
understand Eli's descriptions correctly)
Not quite. Solid multilingual environments do exist on Windows, just
not for console (i.e. text-mode) programs. GUI programs on MS-Windows
have any number of facilities for presenting fully functional
multilingual environments, and Emacs uses these facilities to a large
degree, both for keyboard input and for display.
Emacs is the perfect environment to experiment with *discoverable*
*multilingual* input methods. AFAIK, they don't exist yet,
*anywhere*. Apple is going backwards, even. Microsoft doesn't have
them, either.
If you are implying that input methods don't exist on Windows, then
this is false: they do.

FWIW, I don't understand and don't share people's gripes about input
methods in Emacs. I use them quite a lot, and actually prefer them to
whatever keyboards I have installed on my systems: the Emacs input
methods are easier to learn and modify, work the same on any system,
and there's an easy way of showing which key does what.
Stephen J. Turnbull
2015-08-19 16:03:54 UTC
Permalink
Post by Eli Zaretskii
If you are implying that input methods don't exist on Windows, then
this is false: they do.
No, I'm saying that the people around me who have to do both Japanese
and Chinese input prefer Mac, largely because it's easier to configure
Mac with multiple input methods and switch among them when doing email
and the like.

That may just be that Macs are generally easier to configure without
paying a consulting firm $500 for the Power User course, of course. I
don't have personal experience.
Richard Stallman
2015-08-19 18:16:03 UTC
Permalink
[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]
Post by Stephen J. Turnbull
Emacs is the perfect environment to experiment with *discoverable*
*multilingual* input methods.
Hear, hear!

Maybe we should bind the INSERT key to a new facility for easily
discoverable insertion of characters. If they fit, it could also
provide ways to insert other things -- buffers, files, the kill ring
and registers, effectively becoming a prefix key for all kinds of
insertion.

This would make the command overwrite-mode harder to type. I don't
know how many users would mind that; perhaps we should conduct a poll.
INSERT INSERT could run overwrite-mode.

What do people think?
--
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.
Eli Zaretskii
2015-08-21 07:17:11 UTC
Permalink
Date: Wed, 19 Aug 2015 14:16:03 -0400
Maybe we should bind the INSERT key to a new facility for easily
discoverable insertion of characters. If they fit, it could also
provide ways to insert other things -- buffers, files, the kill ring
and registers, effectively becoming a prefix key for all kinds of
insertion.
This would make the command overwrite-mode harder to type. I don't
know how many users would mind that; perhaps we should conduct a poll.
INSERT INSERT could run overwrite-mode.
If we do that, there should be a prominent indication that the INSERT
"discovery" mode was activated. I frequently press that key by
accident, when I actually want to press "Delete" or "Home" that are
nearby.
Richard Stallman
2015-08-19 18:14:44 UTC
Permalink
[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]
Post by Óscar Fuentes
I admit that curly quotes are nicer, but that's easily achievable: make
them appear on *Help* and other buffers that show docstrings.
Haven't we already done that?
--
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.
Alan Mackenzie
2015-08-18 23:15:29 UTC
Permalink
Hello, Paul.
Post by Paul Eggert
Post by Alan Mackenzie
Nobody having to type "Ó" on a Spanish keyboard layout
would have any trouble.
Spanish keyboards typically do not have "Ó", so your restrictive definition of
"working" would say that "Ó" is trouble even on a Spanish keyboard.
I think you know full well what I mean by "working character", and that
my definition wasn't meant to be water tight in the way you're now
trying to pick holes in. Spanish keyboards DO have "Ó", for any
sensible value of "have".
Post by Paul Eggert
Post by Alan Mackenzie
On the contrary, holding down <AltGr> while typing on the numeric
keypad, successively 2, 0, 1, 8 or 2, 0, 1, 9 is laborious indeed, even
assuming that these codes have been retained in memory. For that is
what a user with a normal keyboard layout, outside of Emacs, will be
forced to do.
The criterion cannot be that any text editor in any configuration should be able
to edit Emacs source code with no trouble. That hasn't ever been true.
I think it pretty much has been, certainly for all but a few special
files, up until the last few days.
Post by Paul Eggert
All that's needed is that people be able to edit their source code in
Emacs, with occasional use by other text editors, when PROPERLY
CONFIGURED. All MODERN text editors can HANDLE UTF-8 text files when
PROPERLY CONFIGURED, so this is not a problem. I just now checked
'less' and 'vim', for example, and they WORK FINE with UTF-8 curved
quotes even on the Linux console if it's PROPERLY CONFIGURED.
Paul, that paragraph is pretty much content free, without having some
good idea of what those vague bits (I've capitalised them) mean.

I think you're agreeing with me that to work with curly quotes using
these tools, you're going to be having to enter their hex codes, as
described above. In the file versions without the curlies, this simply
didn't arise. Putting the curly quotes into our source files has made
them more difficult to work with. This is a Bad Thing.
--
Alan Mackenzie (Nuremberg, Germany).
Paul Eggert
2015-08-19 04:24:20 UTC
Permalink
Post by Alan Mackenzie
I think you know full well what I mean by "working character",
No, actually, I don't. From my point of view "working character" is a slippery
notion that mutates when we try to pin it down. Perhaps we should stick with
"non-ASCII character"; that's clear.
Post by Alan Mackenzie
Spanish keyboards DO have "Ó", for any sensible value of "have".
Only if properly configured. If you use these keyboards with the wrong
settings, Compose ' O won't work, and there's no "Ó" key on the keyboard so
you'll be stuck. In these respects "Ó", "‘" and "’" are all in the same
category on a Spanish keyboard. And this is OK.
Post by Alan Mackenzie
Post by Paul Eggert
The criterion cannot be that any text editor in any configuration should be able
to edit Emacs source code with no trouble. That hasn't ever been true.
I think it pretty much has been, certainly for all but a few special
files, up until the last few days.
No, it hasn't been true *at all*. It's quite common for Japanese keyboards to
lack a ‘\’ key, for example. And this is OK too.
Post by Alan Mackenzie
I think you're agreeing with me that to work with curly quotes using
these tools, you're going to be having to enter their hex codes
Not at all. That would be silly. I never use hex codes to enter these characters.
Óscar Fuentes
2015-08-19 07:37:13 UTC
Permalink
Post by Paul Eggert
Post by Alan Mackenzie
I think you know full well what I mean by "working character",
No, actually, I don't. From my point of view "working character" is a
slippery notion that mutates when we try to pin it down. Perhaps we
should stick with "non-ASCII character"; that's clear.
Post by Alan Mackenzie
Spanish keyboards DO have "Ó", for any sensible value of "have".
Only if properly configured. If you use these keyboards with the
wrong settings, Compose ' O won't work, and there's no "Ó" key on the
keyboard so you'll be stuck. In these respects "Ó", "‘" and "’" are
all in the same category on a Spanish keyboard. And this is OK.
Paul, as a Spaniard with 30+ years of experience with computers, I can
assure you that accented letters have been a non-issue for a *long* time.
I can't remember having issues with accents since I started using PCs, that's
20+ years. The only way you can have problems with accented letters is
if you lie to the OS saying that your keyboard is not Spanish.

However, I'm lost if you ask me how to input "‘" and "’" (see, I
copy&pasted from your text). Probably I would use the Emacs Unicode
facilities looking for something that looks like those chars, and hope I
arrive at the actual chars and not at something with the same looks but
that's not quite the same.

[snip]
Nicolas Richard
2015-08-19 10:10:58 UTC
Permalink
Post by Paul Eggert
Only if properly configured. If you use these keyboards with the
wrong settings, Compose ' O won't work, and there's no "Ó" key on the
keyboard so you'll be stuck. In these respects "Ó", "‘" and "’" are
all in the same category on a Spanish keyboard. And this is OK.
I don't know about Spanish keyboards.

On a French/Belgian keyboard, however, some accented letters have their
own key (é è à ù), and for the rest of them we have dead keys. So for Ó,
I have <dead-acute> O.

The curly quotes OTOH are inconvenient to type for me. In emacs I now
know I can use "C-x 8 [" but that's 6 keys to press (numbers are shifted keys
unless I move to the numpad, and opening square bracket is obtained by
pressing AltGr and another key). I don't know yet how to type it outside
emacs. M-a M-k. I just found out how to type them outside emacs:
<compose> < ' is ‘ and <compose> > ' is ’.
--
Nico
Óscar Fuentes
2015-08-20 13:31:45 UTC
Permalink
Paul Eggert <***@cs.ucla.edu> writes:

[snip]
Post by Óscar Fuentes
The only way you can have problems with accented letters is
if you lie to the OS saying that your keyboard is not Spanish.
Yes, and I've lied to the OS about that sort of thing with localized
keyboards; it happens.
Maybe that's because you are not a typical user. In my decades-long
experience supporting non-techy users, I barely recall a case involving
a misconfigured keyboard.
We can't expect Emacs to work well in every such misconfiguration.
My point was about the difficulty of typing those curly quotes. That's
on a properly configured keyboard. BTW, the X <compose> method (which
doesn't work on my Kubuntu 15.04 + USA layout international variant, but
that's maybe a misconfiguration caused by my efforts to make it work
in a sensible way) is an aberration, IMHO. It *always* takes me more
time to figure out which key combination gives the char that I want than
searching the web and paste it. And when I use some <compose>
combination, and manage to remember it (difficult) then it is broken on
the next OS upgrade.

I'm sure that Emacs can do better than <compose>. I *think* that it
already does better. What I don't know is how forcing curly quotes on
the source code can help.
Andreas Schwab
2015-08-20 07:20:28 UTC
Permalink
Post by Nicolas Richard
<compose> < ' is ‘ and <compose> > ' is ’.
Yes, that's a common combination, standard in GTK+ and usable in both
Emacs and other apps when running under Gnome.
This has nothing to do with gnome. It's a standard X11 feature that
works everywhere.

Andreas.
--
Andreas Schwab, SUSE Labs, ***@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
Marcin Borkowski
2015-08-19 14:26:57 UTC
Permalink
For the record: instead of M-a M-k, I'd use C-x <backspace>.

Best,
--
Marcin Borkowski
http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
Faculty of Mathematics and Computer Science
Adam Mickiewicz University
Yuri Khan
2015-08-18 15:09:51 UTC
Permalink
Post by Alan Mackenzie
What I object to is _non-working_ characters - characters which appear
on nobody's keyboard (see Bastien's question about typing curly quotes)
and are problematic to display (See Eli's recent post, for example).
It’s these two problems which need fixed.

We need keyboards (or input methods) which offer a convenient way to
enter typographically correct quotes and dashes and other essential
punctuation, because otherwise people fall back to their
easier-to-enter ASCII substitutes.

We need terminals which are capable of displaying the whole repertoire
of Unicode, because otherwise we have to make a choice of the subset
we’d like to be able to see.

Making these two long-standing problems more visible is a good thing.

(As far as I am concerned, both are solved problems already. It’s just
that the solutions are not mainstream enough.)
Andreas Schwab
2015-08-18 15:24:47 UTC
Permalink
Post by Yuri Khan
(As far as I am concerned, both are solved problems already. It’s just
that the solutions are not mainstream enough.)
In X11 a whole lot of special characters are available via Compose, but
they are not easily discoverable (see
/usr/share/X11/locale/en_US.UTF-8/Compose for the full list).

Andreas.
--
Andreas Schwab, SUSE Labs, ***@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
Yuri Khan
2015-08-18 15:48:23 UTC
Permalink
Post by Andreas Schwab
Post by Yuri Khan
(As far as I am concerned, both are solved problems already. It’s just
that the solutions are not mainstream enough.)
In X11 a whole lot of special characters are available via Compose, but
they are not easily discoverable (see
/usr/share/X11/locale/en_US.UTF-8/Compose for the full list).
Better than that: enabling the misc:typo XKB option, along with any of
the options of the lv3:* group, puts many useful characters directly
on the keyboard, by adding two more character layers (in addition to
normal and Shift layers).

There are tools (based on xkbprint) that show the resulting layout.
GNOME 2 used to make it reasonably discoverable; I don’t know about
GNOME 3.
Alan Mackenzie
2015-08-18 15:48:57 UTC
Permalink
Hello, Yuri.
Post by Yuri Khan
Post by Alan Mackenzie
What I object to is _non-working_ characters - characters which appear
on nobody's keyboard (see Bastien's question about typing curly quotes)
and are problematic to display (See Eli's recent post, for example).
It’s these two problems which need fixed.
We need keyboards (or input methods) which offer a convenient way to
enter typographically correct quotes and dashes and other essential
punctuation, because otherwise people fall back to their
easier-to-enter ASCII substitutes.
There are already input methods for curly quotes (C-x 8 [ and C-x 8 ], I
believe), but whether these will ever count as "convenient", I somehow
doubt. Even typing [ and ] on a German keyboard layout (just as an
example) is somewhat less than convenient.

One solution is to enhance our own personal keyboard layouts, but that
isn't a solution for our users. Nothing practical we can come up with
is going to be as easy as just hitting key number 41 and key number 40.
Post by Yuri Khan
We need terminals which are capable of displaying the whole repertoire
of Unicode, because otherwise we have to make a choice of the subset
we’d like to be able to see.
Such terminals probably exist. However, they're not what "everybody" is
using. The Linux virtual terminal, which I use, is currently limited to
256 distinct glyphs. I've had a look at the code, with a view to
enhancing it, but it is not well maintained and easily adaptable code.
Displaying the curly quotes in it is not unproblematic. Yesterday, Eli
Z. reported a problem on an MS-Windows terminal which couldn't display
these characters at all.

One of the outstanding features of Emacs is that it will run equally
well in "any" environment. This property is well worth preserving.
Post by Yuri Khan
Making these two long-standing problems more visible is a good thing.
Visible to whom? We definitely don't want to make them visible to our
users. That will just make them annoyed and resentful.
Post by Yuri Khan
(As far as I am concerned, both are solved problems already. It’s just
that the solutions are not mainstream enough.)
I'd be interested in hearing a bit more about what you see as the
solutions. Usually, you don't get something for nothing, and I'd bet
that these solutions come with their own disadvantages, compared with
what "everybody" is currently using.
--
Alan Mackenzie (Nuremberg, Germany).
Yuri Khan
2015-08-18 17:08:27 UTC
Permalink
Post by Alan Mackenzie
There are already input methods for curly quotes (C-x 8 [ and C-x 8 ], I
believe), but whether these will ever count as "convenient", I somehow
doubt. Even typing [ and ] on a German keyboard layout (just as an
example) is somewhat less than convenient.
Correct. Using Emacs as the input method is not feasible for users of
other applications.

My current method of entering curly quotes is comparable to entering
brackets on a German keyboard — hold down right Alt, press a single
key. I find it good enough for such infrequent characters.
Post by Alan Mackenzie
Post by Yuri Khan
We need terminals which are capable of displaying the whole repertoire
of Unicode, because otherwise we have to make a choice of the subset
we’d like to be able to see.
Such terminals probably exist. However, they're not what "everybody" is
using.
“Everybody” is not using a terminal at all. “Everybody” uses a
graphical desktop. Including, in some circumstances, a terminal
emulator. Xterm, for one, does not support all of Unicode equally
well, but curly quotes are unproblematic.
Post by Alan Mackenzie
The Linux virtual terminal, which I use, is currently limited to
256 distinct glyphs.
Actually 512 if you sacrifice 8 of the 16 colors, and there is the
possibility of replacing it with fbterm or other framebuffer-based
terminals.
Post by Alan Mackenzie
Yesterday, Eli
Z. reported a problem on an MS-Windows terminal which couldn't display
these characters at all.
I used to use Windows, including the Windows console, as my primary
environment. It displays most of the European part of Unicode
all right, once you configure it to use a TrueType or OpenType font.
CJK is harder (because most Han characters want to occupy two
character cells each) and RTL is harder still, but, again, curly
quotes are unproblematic.

Some applications (notably, ports of Unix utilities) have problems
displaying Unicode on the Windows console. That is a bug in those
applications.
Post by Alan Mackenzie
Post by Yuri Khan
(As far as I am concerned, both are solved problems already. It’s just
that the solutions are not mainstream enough.)
I'd be interested in hearing a bit more about what you see as the
solutions. Usually, you don't get something for nothing, and I'd bet
that these solutions come with their own disadvantages, compared with
what "everybody" is currently using.
For output, the solution is a graphical environment. With TrueType,
OpenType or otherwise vector-based scalable fonts, rendered through a
facility which supports Unicode, ligatures, combining diacritics, RTL,
complex scripts, rich formatting and whatnot.


For input, my current setup involves two layouts (for English/Latin
and Russian/Cyrillic), which differ in their 1st and 2nd levels, but
have common 3rd and 4th levels, activated with right Alt with and
without Shift. This accommodates like 99.99% of my typing needs. For
the remaining cases, I resort to a character map application or to
Emacs’ insert-char.

I don’t think I am losing much for it. I am vaguely aware that having
two Alt keys is more convenient than just one but, to be frank, I also
under-use right Shift and right Ctrl. (My first computer did not have
a right Shift; that might have influenced my typing habits.)

New keyboard designs are emerging which provide more keys intended to
be pressed with thumbs. This is ideal for multi-level layouts. The
classic AT keyboard with only 0.5 to 1.5 keys per thumb needs to give
way.
Eli Zaretskii
2015-08-18 18:12:09 UTC
Permalink
Date: Tue, 18 Aug 2015 23:08:27 +0600
I used to use Windows, including the Windows console, as my primary
environment. It displays most of the European part of Unicode
all right, once you configure it to use a TrueType or OpenType font.
Europe is but a small part of the world, and the fonts available for
that on the Windows console are ugly and hard to read.

So these problems are much more significant than you are trying
to convince us they are.
curly quotes are unproblematic.
Not true, see above.
Some applications (notably, ports of Unix utilities) have problems
displaying Unicode on the Windows console. That is a bug in those
applications.
LOL. I have yet to see a native Windows port of a significant Unix
utility that doesn't have that "bug". About the only one I know about
is the stand-alone Info reader from the latest Texinfo 6.0 release.

And once again, when all is said and done, and the ported code handles
Unicode characters correctly when writing to the console, you are back
at the console font problem I mentioned above: anything outside the
European locales is downright impossible, and inside Europe you have
only the ugly Lucida Console font.

Let's not pretend this is easy, especially on Windows and in any other
non-UTF-8 locale.
Yuri Khan
2015-08-19 04:45:32 UTC
Permalink
Post by Eli Zaretskii
Post by Yuri Khan
I used to use Windows, including the Windows console, as my primary
environment. It displays most of the European part of Unicode
all right, once you configure it to use a TrueType or OpenType font.
Europe is but a small part of the world, and the fonts available for
that on the Windows console are ugly and hard to read.
The Windows console has a two-stage mechanism for configuring the font.

First, the registry specifies a subset of fonts which can be used in
the console. Initially this subset only contains Lucida Console or
Courier New. As far as I can tell, Far East localizations add a
CJK-enabled font. Googling for “add windows console font” will give
you the exact steps.

After that, chosen fonts appear in the console properties dialog.

I used Andale Mono and subsequently Liberation Mono with great success.
Post by Eli Zaretskii
Post by Yuri Khan
Some applications (notably, ports of Unix utilities) have problems
displaying Unicode on the Windows console. That is a bug in those
applications.
LOL. I have yet to see a native Windows port of a significant Unix
utility that doesn't have that "bug".
Ubiquity of a bug does not imply that it can or should go unfixed.
Eli Zaretskii
2015-08-19 14:14:30 UTC
Permalink
Date: Wed, 19 Aug 2015 10:45:32 +0600
Post by Eli Zaretskii
Post by Yuri Khan
I used to use Windows, including the Windows console, as my primary
environment. It displays most of the European part of Unicode
all right, once you configure it to use a TrueType or OpenType font.
Europe is but a small part of the world, and the fonts available for
that on the Windows console are ugly and hard to read.
The Windows console has a two-stage mechanism for configuring the font.
First, the registry specifies a subset of fonts which can be used in
the console. Initially this subset only contains Lucida Console or
Courier New. As far as I can tell, Far East localizations add a
CJK-enabled font.
This is the year 2015. We are way past the "localization" phase of the 1990s,
when certain features existed only in certain locales. Features that
require installation of extra language packs, like Far-Eastern
localizations, and aren't available otherwise, cannot be relied upon.
They don't exist for all practical purposes. You cannot tell your
users to install those localizations, because most of them won't. The
result is that displaying CJK text on a Windows console only works in
CJK locales, and similarly with other scripts outside of Europe. That
flies in the face of any decent multilingual environment such as
Emacs.

We need these features working out of the box, on any end-user's
machine. Until then, they don't exist, and therefore Emacs in its
console mode cannot provide a decent multilingual environment on
Windows.
I used Andale Mono and subsequently Liberation Mono with great success.
Good for you. But this doesn't solve the problem for others. Very
few people will invest a significant amount of their time into
tinkering with their systems. A solution that relies on that will not
fly.
Post by Eli Zaretskii
Post by Yuri Khan
Some applications (notably, ports of Unix utilities) have problems
displaying Unicode on the Windows console. That is a bug in those
applications.
LOL. I have yet to see a native Windows port of a significant Unix
utility that doesn't have that "bug".
Ubiquity of a bug does not imply that it can or should go unfixed.
But it does say something about the scale of the problem. It's not
like a "non-buggy" port exists somewhere and you can tell people to
install it instead of the buggy one they have. Fixing this is hard,
because Windows doesn't yet fully support the UTF-8 codepage, so any
program that uses 'char *' for text strings needs radical changes to
fix this "bug". It's a small wonder that there are almost no programs
that have this fixed.
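
To illustrate the byte-level side of this, here is a small Python sketch (the helper names are hypothetical, not from any port discussed here). It shows that a single curved quote is three bytes in UTF-8, one byte in Windows-1252, and not representable at all in the old OEM code page 437, so byte strings cannot simply be passed through unchanged.

```python
LEFT_QUOTE = "\u2018"  # LEFT SINGLE QUOTATION MARK


def encoded_or_none(text, codepage):
    """Return text encoded in codepage, or None if it cannot be represented."""
    try:
        return text.encode(codepage)
    except UnicodeEncodeError:
        return None


def misread(text, written_as, read_as):
    """Encode text in one code page, then decode the bytes in another."""
    return text.encode(written_as).decode(read_as)


utf8_bytes = encoded_or_none(LEFT_QUOTE, "utf-8")    # three bytes
ansi_bytes = encoded_or_none(LEFT_QUOTE, "cp1252")   # a single byte
oem_bytes = encoded_or_none(LEFT_QUOTE, "cp437")     # None: no such character
garbled = misread(LEFT_QUOTE, "utf-8", "cp1252")     # three unrelated characters
```

The last line is the familiar failure mode: UTF-8 bytes written by one program and interpreted in the system code page by another.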

Your misrepresentation of these problems does a disservice to the
people who read it.
Stephen J. Turnbull
2015-08-19 05:19:17 UTC
Permalink
And once again, when all is said and done, and the [Windows-]
ported code handles Unicode characters correctly when writing to the
console, you are back at the console font problem I mentioned above:
anything outside the European locales is downright impossible, and
inside Europe you have only the ugly Lucida Console font.
Not to prejudge the overall issue, as there are many users of free
environments opposed to the curvely quotes, but seriously: you
complain that you use non-free software and its vendor doesn't support
a proposed Emacs feature?[1] Surely this should not be a consideration
in Emacs development -- if Emacs features can be supported on Windows
or Mac, fine, do it, but Windows or Mac lack of support for an Emacs
feature is no reason to avoid installing that feature.

OTOH Alan's issue about the small glyph repertoire of the Linux
console is a valid concern, although the availability of capable
alternative consoles weakens that particular point to some degree.


Footnotes:
[1] And to add insult to injury, a feature that was in part bought,
paid for, and advocated by that vendor?
Stephen J. Turnbull
2015-08-19 16:05:29 UTC
Permalink
Post by Stephen J. Turnbull
Not to prejudge the overall issue, as there are many users of free
environments opposed to the curvely quotes, but seriously: you
complain that you use non-free software and its vendor doesn't support
a proposed Emacs feature?
I didn't complain. You are confusing me with someone else.
I apologize for the confusion, and wonder why you didn't say the same
thing that I did to that someone else.<wink/>
Richard Stallman
2015-08-19 01:19:11 UTC
Permalink
[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

I use non-ASCII characters every day while editing text in French and
Spanish. I also sometimes need to enter names in other languages,
such as Hungarian and Turkish, which is why I have asked for better
support for occasionally inputting unusual (for me) non-ASCII
characters not supported by my input method.

For this reason, and on general principles, I am enthusiastically in
support of improving all the aspects of Emacs support for non-ASCII
characters.

But that is no reason to put non-ASCII characters into the conventions
for Emacs Lisp source code.
--
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.
Richard Stallman
2015-08-18 03:44:45 UTC
Permalink
[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]
Post by Paul Eggert
Post by Alan Mackenzie
The main change here is that from now on, particularly if so-called
Electric Quote mode [*] is used, we're going to end up with a chaotic
mix of ascii quotes and curly quotes in our source code.
Although I also would prefer a simpler approach (one that consistently uses
curved quotes), you've objected to that, necessitating a "chaotic" compromise.
There is a much simpler solution: don't use curly quotes in doc
strings in source code.
Post by Paul Eggert
For years Emacs has had significant problems in editing and searching and
generating non-ASCII text. Making Emacs better in this area will inevitably
have teething problems, and we'll inevitably come up with worse solutions before
coming up with better ones.
These are totally separate areas. By all means work on improving them,
but that is no reason to have curly quotes in doc strings in source code.
--
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.