Discussion:
Sphinx 1.6 em/en dash conversion change?
s***@gmail.com
2017-06-26 06:39:56 UTC
Hi,

From what I can see Sphinx 1.6 has moved over to converting double dashes
(--) to em dashes and triple dashes (---) to en dashes. Unfortunately in
Sphinx 1.5 and below it looks like double dashes went to *en* dashes and
triple dashes went to *em* dashes.

This change means it's not possible to have a project that typesets the
same way in both Sphinx 1.5 and 1.6, as older versions have no concept of
setting the dash format used. Would it be possible to change the dash
formats back to match what happened in 1.5, and have new projects set a
configuration option that explicitly asks for double dash to go to em and
triple dash to go to en?
--
You received this message because you are subscribed to the Google Groups "sphinx-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sphinx-dev+***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Sitsofe Wheeler
2017-06-26 16:56:31 UTC
Hi,
Post by s***@gmail.com
Hi,
From what I can see Sphinx 1.6 has moved over to converting double dashes
(--) to em dashes and triple dashes (---) to en dashes. Unfortunately in
Sphinx 1.5 and below it looks like double dashes went to *en* dashes and
triple dashes went to *em* dashes.
This change means it's not possible to have a project that will typeset the
same way in both 1.5 and 1.6 versions of Sphinx as older versions have no
concept of setting the dash format used. Would it be possible to change the
dash formats back to match what happened in 1.5 but make all new projects
write a configuration option that explicitly asks for double dash to go to
em and triple dash to go to en?
There was no such intentional change. What is your docutils version,
and are you using Sphinx 1.6.1 or 1.6.2?
1.6.2
Maybe some extension is interfering? It would be interesting to know.
With docutils 0.13.1 I consistently get
<p>Two hyphens –</p>
<p>Three hyphens —</p>
in the HTML output from ``--`` and ``---`` respectively in the rst source file,
with Sphinx 1.6.1, 1.6.2, and current HEAD.
Hmm. My docutils is 0.13.1 and I'm trying to convert the fio
documentation into HTML.
From https://github.com/sphinx-doc/sphinx/blob/7ffd6ccee8b0c6316159c4295e2f44f8c57b90d6/sphinx/util/smartypants.py
def educate_tokens(text_tokens, attr='1', language='en'):
    [...]
    # Parse attributes:
    # 0 : do nothing
    # 1 : set all
    [...]
    elif attr == "1": # Do everything, turn all options on.
        do_quotes = True
        do_backticks = 1
        do_dashes = 1
    [...]
    if do_dashes == 1:
        text = smartquotes.educateDashes(text)

Looking at https://sourceforge.net/p/docutils/code/HEAD/tree/tags/docutils-0.13.1/docutils/utils/smartquotes.py
I see this:
[...]
default_smartypants_attr = "1"
[...]
def educateDashes(text):
    """
    Parameter: String (unicode or bytes).
    Returns: The `text`, with each instance of "--" translated to
             an em-dash character.
    """

    text = re.sub(r"""---""", smartchars.endash, text) # en (yes, backwards)
    text = re.sub(r"""--""", smartchars.emdash, text) # em (yes, backwards)
    return text

So double dash goes to em (which is the longer dash).
Since Sphinx 1.6 such conversion is handled by docutils.
http://docutils.sourceforge.net/docs/user/config.html#smart-quotes
The Docutils source code is able to convert ``--`` to an EM dash and
``---`` to an EN dash instead, but for versions < 0.14 the smart quotes
transform action is hard-coded,
and it maps ``--`` to EN dash and ``---`` to EM dash.
When Docutils>=0.14 is used, Sphinx patches nothing, but uses a derived
class for various reasons, and it could benefit from the class attribute
added in Docutils 0.14 called ``smartquotes_action``.
With Docutils<0.14 Sphinx needs to override more, and it could
take this opportunity to achieve the same as ``smartquotes_action``.
This would be the situation I'm in.
Thus, Sphinx could provide a user config setting to influence this
``smartquotes_action``. But it does not so far.
As far as I can tell, Docutils has no user interface for its 0.14
``smartquotes_action``.
It is only customizable at the developer level.
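For anyone following along, here is a rough sketch of how an action string like the one behind ``smartquotes_action`` might select the dash handling. It assumes the flag letters follow the SmartyPants convention (``d``, ``D``, ``i``) and that the hard-coded value in older Docutils is ``qDe``. Neither detail is stated in this thread, and ``dash_mode`` is a hypothetical helper, not a Docutils function.

```python
def dash_mode(attr):
    """Return which dash conversion an action string would select.

    Hypothetical helper. The letter meanings are an assumption taken
    from the SmartyPants convention, not from anything in this thread.
    """
    mode = "none"
    if "d" in attr:
        mode = "plain"        # the educateDashes-style conversion
    if "D" in attr:
        mode = "old-school"   # "--" -> en dash, "---" -> em dash
    if "i" in attr:
        mode = "inverted"     # "--" -> em dash, "---" -> en dash
    return mode


# 'qDe' is assumed here to be the hard-coded action in Docutils < 0.14.
print(dash_mode("qDe"))
# An action containing 'i' would give the mapping asked for at the top of the thread.
print(dash_mode("qie"))
```

If those assumptions hold, a user-facing setting would only need to pass a different string through to the transform.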
--
Sitsofe | http://sucs.org/~sits/
Sitsofe Wheeler
2017-06-26 23:59:54 UTC
Post by Sitsofe Wheeler
Looking at
https://sourceforge.net/p/docutils/code/HEAD/tree/tags/docutils-0.13.1/docutils/utils/smartquotes.py
[...]
default_smartypants_attr = "1"
[...]
"""
Parameter: String (unicode or bytes).
Returns: The `text`, with each instance of "--" translated to
an em-dash character.
"""
text = re.sub(r"""---""", smartchars.endash, text) # en (yes, backwards)
text = re.sub(r"""--""", smartchars.emdash, text) # em (yes, backwards)
return text
So double dash goes to em (which is the longer dash).
But the actual docutils method used is
smartquotes.educateDashesOldSchool(text)
Have you tested?
I thought I had, but retrying just now shows exactly the behaviour you
described, so I was wrong. Thanks for your patience and sorry for the
noise!
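To make the difference concrete, here is a minimal sketch of the two substitution orders using only ``re``. ``educate_dashes`` copies the ``educateDashes`` body quoted above. ``educate_dashes_old_school`` is an assumption about what ``educateDashesOldSchool`` does, based on the HTML output reported earlier in the thread. The constant and function names are illustrative, not the docutils API.

```python
import re

EN_DASH = "\u2013"  # the shorter dash
EM_DASH = "\u2014"  # the longer dash


def educate_dashes(text):
    # Same order as the quoted educateDashes: "---" -> en, then "--" -> em.
    text = re.sub(r"---", EN_DASH, text)
    text = re.sub(r"--", EM_DASH, text)
    return text


def educate_dashes_old_school(text):
    # Assumed behaviour of educateDashesOldSchool: "---" -> em, then "--" -> en.
    text = re.sub(r"---", EM_DASH, text)
    text = re.sub(r"--", EN_DASH, text)
    return text


sample = "Two hyphens -- and three hyphens ---"
print(educate_dashes(sample))
print(educate_dashes_old_school(sample))
```

Both functions replace ``---`` before ``--``, since otherwise the first two hyphens of a triple would be consumed by the double-hyphen rule. Only the second function matches the en/em output seen in the HTML above.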
--
Sitsofe | http://sucs.org/~sits/