LaTeX Sentence Spacing

As a follow-up to my general post about sentence spacing, here are some brief notes about managing sentence spacing in LaTeX, followed by a not-so-brief explanation of what LaTeX is doing.

By default, LaTeX adds slightly more space between sentences than it does between words.  The space between sentences is about 33 percent wider than the space between words.

You can disable this extra space entirely with the \frenchspacing directive.  From the point LaTeX encounters that directive until the end of the document (or until it reaches a \nonfrenchspacing directive), it will use the same amount of space between sentences as between words.
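
As a minimal sketch of how the two directives scope, the following test document switches the extra sentence space off and back on.  The sample sentences are placeholders; only \frenchspacing and \nonfrenchspacing matter here.

```latex
\documentclass{article}
\begin{document}

% Default behavior: extra space after the period.
First sentence. Second sentence.

% From here on, sentence spaces are the same as word spaces.
\frenchspacing
First sentence. Second sentence.

% Restore LaTeX's default sentence spacing.
\nonfrenchspacing
First sentence. Second sentence.

\end{document}
```

Both directives obey TeX's usual grouping, so wrapping \frenchspacing and some text in braces should limit the change to that group.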

LaTeX tries to figure out where sentences end and apply the extra space on its own.  It usually does a good job, but sometimes it guesses wrong.  Fortunately, it has options for you to adjust its word and sentence spacing yourself, so you can almost always get it to do what you really want.

§ LaTeX’s Basic Rules

The basic rules are these:

  1. A sentence-ending punctuation character (a period, exclamation point, or question mark) followed by a space is considered to signal the end of a sentence.
  2. There can be any number of right parentheses, quote marks, and right brackets between the punctuation and the space.
  3. However, if the character immediately before the punctuation is a capital letter, LaTeX will not end a sentence there.

So these would all be seen by LaTeX as sentence breaks and would get extra space added during text layout:

…thank you. You're…
…the best.  No one…
…else is.         Except…
``Stop!'' The word…
…really right?)  Anyway…

These, however, would not be treated as the ends of sentences:

A Ph.D. in what…
She was SHOUTING.  I didn't like it…

§ Non-sentence Punctuation

If you have some punctuation that makes LaTeX think there’s a sentence where there isn’t, you have two options available to you.

A lot of the time, the false sentences come from things like abbreviated titles, e.g. Mr. Rogers.  In those cases, you would probably prefer to bind the two words tightly together.  For that, you can use a tilde to add a nonbreaking space, which LaTeX also calls a tie:

Mr.~Rogers

LaTeX will not break a line at a nonbreaking space.  When justifying lines, it treats the tie as a normal inter-word space, without any of the extra width or extra stretch that an end-of-sentence space gets.

In other cases, you might want LaTeX to treat the space after the punctuation as a normal space that can be wrapped and stretched as needed.  There are two ways to do that.

One option is to use the \@ macro between the punctuation and the space.  This interrupts LaTeX’s sentence-ending calculations and causes it to treat the subsequent space like a normal inter-word space.

It was David vs.\@ Goliath all over again. 
``What are you doing?\@'' she asked. 
``What are you doing?''\@ she asked. 
There are many options, e.g.\@ a polar bear. 

Alternatively, you can use \ (a backslash followed by a space) to explicitly tell LaTeX to use a normal space at that location, regardless of what its calculations say:

It was David vs.\ Goliath all over again. 
``What are you doing?''\ she asked. 
There are many options, e.g.\ a polar bear. 

The second option is, as you can see, a little more concise.

§ Unrecognized Sentence Punctuation

Conversely, sometimes you have sentences that end with a capital letter right before the period (or other end-of-sentence punctuation).  In that case, you can put \@ just before the punctuation to make sure LaTeX adds the extra space you want between that sentence and the next one:

He had a PhD\@. That worried me. 

As in the previous examples, \@ interrupts LaTeX’s special spacing calculations.  When placed before sentence punctuation, it causes LaTeX to ignore the immediately-preceding character, which means its special “capital before punctuation” rule never comes into play.

§ Sentences Without Recognized Punctuation

On rare occasions, I’ve run into cases where I’d like to have sentence spacing after characters that LaTeX doesn’t normally recognize as sentence-ending.

For example:

I thought---  No, that's not right. 

The best approach I’ve found for this is:

I thought---\spacefactor3000{}  No, that's not right. 

The short explanation for that wordy construct is that it’s forcing LaTeX’s layout engine to apply end-of-sentence semantics at the macro location.  (It might or might not help to know that \@ is basically equivalent to \spacefactor1000{}.)
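
If you need this more than once, one option is to wrap the construct in a macro.  This is only a sketch: \sentencespace is a hypothetical name made up for this example, not a standard LaTeX command.

```latex
\documentclass{article}
% \sentencespace is a hypothetical helper name, not part of LaTeX.
% It sets the space factor to 3000 so that the next space is treated
% as an end-of-sentence space.  The {} ends the number.
\newcommand{\sentencespace}{\spacefactor3000{}}
\begin{document}
% The {} after the macro name keeps TeX from swallowing the space.
I thought---\sentencespace{} No, that's not right.
\end{document}
```

Note that \spacefactor can only be set while TeX is building a line of text, so a macro like this has to be used inside a paragraph, not between paragraphs.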

§ More Details than You Probably Want

The above should be sufficient if you just want to know how to get or suppress LaTeX’s end-of-sentence spacing as needed.  But if you want to know what’s going on under the hood, feel free to read on.

# Space Factors

Every character in TeX has a numeric “space factor” assigned.  That space factor primarily affects the rate at which space after the character is allowed to grow or shrink as TeX adjusts the width of a line to make it justified.  When TeX expands a space, it does it in proportion to the space factor divided by 1000.

Most characters have a space factor of 1000.  That means most inter-word spaces use a proportion of 1 for expansion or, in other words, TeX will expand or shrink all of the spaces by the same amount all the time.

Some characters have a slightly larger space factor.  Commas have a space factor of 1250, for instance, and semicolons have a space factor of 1500.  So let’s say there’s a comma in a line of text and TeX wants to make the line wider.  For every 1 point of space that TeX adds to the “normal” spaces in the line (the ones with a space factor of 1000), it will add 1.25 points to the space after the comma.  (When shrinking space, TeX uses the inverse of that proportion.  So if the normal spaces were decreased by 1 point, the space after the comma would only be decreased by 1/1.25 or 0.8 points.)
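
Written out as a formula, the scaling looks roughly like this.  The symbol f for the space factor and the "normal" subscripts are notation introduced for this sketch, and \text assumes the amsmath package is loaded.

```latex
% f is the space factor of the character before the space.
\[
  \text{stretch} = \text{stretch}_{\text{normal}} \times \frac{f}{1000},
  \qquad
  \text{shrink} = \text{shrink}_{\text{normal}} \times \frac{1000}{f}
\]
% For a comma, f = 1250:
\[
  \frac{1250}{1000} = 1.25,
  \qquad
  \frac{1000}{1250} = 0.8
\]
```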

# Widening a Space Based on the Space Factor

In addition to these rules about growing and shrinking spaces, TeX has another rule, which is that if a character’s space factor is greater than or equal to 2000, it automatically adds an extra amount to the width of the following space.  That extra amount is defined by the font, but most fonts are pretty similar to the default Computer Modern.  Ten-point Computer Modern uses a width of 3.3333 points for normal spaces and adds an extra 1.1111 points when the “extra space” rule is triggered.

The three standard end-of-sentence punctuation marks—period, question mark, and exclamation point—all have space factors of 3000.  Colons have a space factor of 2000.  So all four of those characters will trigger the addition of extra width to any space that directly follows them.  Spaces after end-of-sentence punctuation will grow three times faster than normal spaces and shrink at one third their rate.  Spaces after colons will grow twice as fast and shrink at half the rate of normal spaces.
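
As a worked example, here is the space after a period in ten-point Computer Modern.  The width numbers come from the previous section.  The normal stretch and shrink of about 1.6667 and 1.1111 points are that font's usual inter-word values, which aren't listed above, so treat them as an assumption of this sketch.  The align* environment and \text assume the amsmath package.

```latex
\begin{align*}
  \text{width}   &= 3.3333\,\text{pt} + 1.1111\,\text{pt} = 4.4444\,\text{pt} \\
  \text{stretch} &= 1.6667\,\text{pt} \times \frac{3000}{1000} \approx 5.0\,\text{pt} \\
  \text{shrink}  &= 1.1111\,\text{pt} \times \frac{1000}{3000} \approx 0.3704\,\text{pt}
\end{align*}
```

The natural width works out to 4.4444 / 3.3333, or about 1.33 times a normal space, which is where the "33 percent wider" figure comes from.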

# Skipping Some Characters’ Space Factors

A few other punctuation characters have space factors of zero.  These are all characters that sometimes appear between sentence punctuation and the space after the sentence.  They include the right parenthesis, single quote mark, and right bracket.

When TeX encounters a character with a space factor of zero, it carries over the space factor from the previous character.  This allows the spacing algorithm to effectively ignore some characters.  As an example, consider the string “a.) ”.  The “a” has a space factor of 1000.  The “.” has a space factor of 3000.  The “)” has a space factor of zero, which means TeX will carry over the previous value of 3000.  When it finally reaches the space, TeX will use the 3000 value to add extra space and grow the space at three times the rate of a normal space.
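
If you want to check these numbers yourself, a small test document along these lines should print the codes that a default LaTeX setup assigns.  \showsf is a hypothetical helper defined just for this example, not a standard command.

```latex
\documentclass{article}
% \showsf{label}{character} is a hypothetical helper: it prints the label
% followed by the space factor code currently assigned to the character.
\newcommand{\showsf}[2]{#1: \the\sfcode`#2\relax\par}
\begin{document}
\showsf{lowercase a}{a}        % expected: 1000
\showsf{capital A}{A}          % expected: 999
\showsf{comma}{,}              % expected: 1250
\showsf{semicolon}{;}          % expected: 1500
\showsf{colon}{:}              % expected: 2000
\showsf{period}{.}             % expected: 3000
\showsf{right parenthesis}{)}  % expected: 0
\showsf{single quote}{'}       % expected: 0
\showsf{right bracket}{]}      % expected: 0
\end{document}
```

The expected values assume nothing has changed the defaults.  After \frenchspacing, for instance, the punctuation codes should all read 1000.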

# Capital Letters

The final rule TeX has is that if a character has a space factor less than 1000 (but greater than zero) and the next character has a space factor greater than 1000, that next character’s space factor is reduced to just 1000.

All capital letters have a space factor of 999.  That means any of the special-spacing characters effectively lose their special space factor after a capital letter.  The spaces after “PhD.”, “NASA:”, and “FOMO;” will be normal spaces and will not have any extra width added to them.
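
One way to watch these rules interact is to have TeX print the running space factor right after some text.  This is a sketch for experimenting; the expected values in the comments follow from the rules described above and assume LaTeX's default settings.

```latex
\documentclass{article}
\begin{document}
% \the\spacefactor typesets the space factor in effect at that point.
a.\the\spacefactor\par    % lowercase before the period: expect 3000
A.\the\spacefactor\par    % capital before the period: expect 1000
A\@.\the\spacefactor\par  % \@ resets the factor first: expect 3000
a.)\the\spacefactor\par   % the ")" carries the period's factor: expect 3000
\end{document}
```

The printed digits are ordinary characters with a space factor of 1000 of their own, so this trick changes the spacing after it and is only useful for inspection.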

# Setting the Space Factor Explicitly

You can set a space factor explicitly at any point with the \spacefactor command, followed by the new value.  As usual in TeX, you can have an explicit assignment (\spacefactor=1000) or allow TeX to figure it out implicitly (\spacefactor 1000 or \spacefactor1000).  Also as usual, the command will consume any space characters after it, so you need an empty statement after it if you want to preserve the space (\spacefactor1000{}).

# Putting it All Together

All of that should explain why the LaTeX macro \@ is equivalent to \spacefactor1000{}.  When placed between a capital letter and a period, it forces the pre-period space factor to 1000, which in turn allows the period to trigger the usual extra space and altered growth and shrinking behaviors.  When placed between a period and a space, it forces the space to see a space factor of 1000 and be treated as a normal space.

The “\ ” (backslash space) command always inserts a space as if the space factor were 1000, which is why it’s equivalent to (and can be seen as a shorthand for) “\@ ” (backslash at space).

§ Reference Material

The bulk of this information can be found in TeX by Topic, chapter 20.  See also the LaTeX2e reference on \spacefactor, an answer to “Is it possible to have non-french spacing without extra stretch?” on the TeX Stack Exchange, and a similar answer to “What is the proper use of \@ (i.e., backslash-at)?”.

Sentence Spacing

One lump or two?

There is an ongoing debate in some parts of the Internet about how much space should go after the end of a sentence.  Practically every publisher and quite a lot of other people will say there should be one space.  A minority of people—principally Gen X and older, I suspect—will say there should be two spaces.

I have opinions.

In short:  Two spaces are too much.  One space is okay, but it can feel a little crowded.  Adding just a little extra space—about 33 percent more—after a sentence gives a nice balance.  This blog does the last of these.

§ A Brief History of Sentence Spacing

In the earliest days of the printing press, the standard among typesetters was to have about three times more space between sentences than between words.  But even their inter-word spaces were much larger than we use today.  In numeric terms, there was a space of one em between sentences, and one-third of an em between words.  An “em”, for our purposes, is simply a unit of measurement in printing that scales in proportion to the size of the type.  For example, text set at 72 points will have an em size of one inch, while 36-point text will have an em size of half an inch.  This proportionality makes the em a convenient reference for the sizes of things in typeset text.

Here’s how those earliest spacing conventions might look with some sample text:

As Gregor Samsa awoke one morning from uneasy dreams, he found himself transformed in his bed into a gigantic insect. When he lifted his head a little, he could see his dome-like brown belly. The bed quilt could hardly keep in position and was about to slide off completely. His numerous legs, which were pitifully thin compared to the rest of his bulk, waved helplessly before his eyes. It was no dream. His room, a regular human bedroom, only rather too small, lay quiet between the four familiar walls.

When typewriters, with their fixed character widths, became common, people tried to adapt printing rules that had been developed for proportional typefaces.  Typewriters’ fixed-width spaces were a little larger than the printers’ third-of-an-em proportional-type spaces.  Typing conventions eventually settled on a single fixed-width space between words and two fixed-width spaces between sentences.

If it were typed according to these rules, our sample text would look something like this:

As Gregor Samsa awoke one morning from uneasy dreams, he found himself transformed in his bed into a gigantic insect.  When he lifted his head a little, he could see his dome-like brown belly.  The bed quilt could hardly keep in position and was about to slide off completely.  His numerous legs, which were pitifully thin compared to the rest of his bulk, waved helplessly before his eyes.  It was no dream.  His room, a regular human bedroom, only rather too small, lay quiet between the four familiar walls. 

During the twentieth century, publishers gradually began to close up spacing in their printing, both between words and between sentences.  In the first part of the century, inter-word spacing remained roughly a third of an em, but inter-sentence spacing shrank to half an em.  That gave something along these lines:

As Gregor Samsa awoke one morning from uneasy dreams, he found himself transformed in his bed into a gigantic insect. When he lifted his head a little, he could see his dome-like brown belly. The bed quilt could hardly keep in position and was about to slide off completely. His numerous legs, which were pitifully thin compared to the rest of his bulk, waved helplessly before his eyes. It was no dream. His room, a regular human bedroom, only rather too small, lay quiet between the four familiar walls.

During the second half of the twentieth century, professional publishers continued to close up inter-word and inter-sentence spacing.  By the end of the century, most publications used just a quarter of an em between both words and sentences.  I have seen speculation that the convergence between inter-word and inter-sentence spacing grew out of increasing use of automated typesetting.  (“Two Spaces - an Old Typists' Habit?” gives a brief history of automated typesetting and makes a plausible case for how the extra inter-sentence space was lost.)

This sort of spacing is what you’re almost certainly used to seeing everywhere, but here’s what it looks like with our sample text:

As Gregor Samsa awoke one morning from uneasy dreams, he found himself transformed in his bed into a gigantic insect. When he lifted his head a little, he could see his dome-like brown belly. The bed quilt could hardly keep in position and was about to slide off completely. His numerous legs, which were pitifully thin compared to the rest of his bulk, waved helplessly before his eyes. It was no dream. His room, a regular human bedroom, only rather too small, lay quiet between the four familiar walls.

Now, well into the twenty-first century, pretty much every newspaper, magazine, and book publisher uses the same amount of space between sentences as between words, and that space is generally about a quarter of an em.  Every major style guide recommends typing a single space after a sentence.  The only significant exceptions, as far as I can tell, are scientific journals that use LaTeX for their typesetting (and which haven’t explicitly turned off LaTeX’s extra inter-sentence spacing).

§ Readability Studies

Practically every study done on the readability effects of end-of-sentence spacing has been inconclusive.  The one exception is “Are two spaces better than one?” by Johnson, Bui, and Schmitt.  That study purported to find that putting two spaces after sentences improved readability exclusively for people who themselves put two spaces after sentences when writing.  The findings are even weaker than that summary implies, however, for reasons that are covered well by Matthew Butterick.  In short, the test conditions were well outside normal reading conditions and there were unexplored statistical differences between the overall reading abilities of one- and two-space-using readers.

In other words, no one has really demonstrated a tangible benefit to any variation of end-of-sentence spacing.

§ Aesthetics

So that leaves just the question of aesthetics.  What looks good?

As noted previously, nearly every newspaper, magazine, and book you read uses the same amount of space at the end of a sentence as between words.  Nearly every website you read is the same, unless the website author has gone out of their way to do something different.  Everyone reads things with just one space between sentences all the time, so there’s the aesthetics of familiarity, if nothing else.

My personal view is a little more nuanced than just one space versus two.  Using a single space is serviceable and two spaces are probably a bad idea, but I think I can do better than either.

Let’s look again at our example text with one space after each sentence:

As Gregor Samsa awoke one morning from uneasy dreams, he found himself transformed in his bed into a gigantic insect. When he lifted his head a little, he could see his dome-like brown belly. The bed quilt could hardly keep in position and was about to slide off completely. His numerous legs, which were pitifully thin compared to the rest of his bulk, waved helplessly before his eyes. It was no dream. His room, a regular human bedroom, only rather too small, lay quiet between the four familiar walls.

I find that readable, but it also feels a bit crowded.  The sentences seem to run together without any room to breathe.

Let’s put two spaces after each sentence and see how that looks:

As Gregor Samsa awoke one morning from uneasy dreams, he found himself transformed in his bed into a gigantic insect.  When he lifted his head a little, he could see his dome-like brown belly.  The bed quilt could hardly keep in position and was about to slide off completely.  His numerous legs, which were pitifully thin compared to the rest of his bulk, waved helplessly before his eyes.  It was no dream.  His room, a regular human bedroom, only rather too small, lay quiet between the four familiar walls.

To me, that feels like too much space.  As I read the paragraph, the extra space between sentences feels almost like an interruption each time.

I mentioned earlier that some publishers still use LaTeX’s inter-sentence spacing.  In particular, LaTeX allocates 33 percent more space at the end of a sentence than it does between words.  (For very specific numbers, see this Stack Exchange answer on the topic.)  If a standard inter-word space is now a quarter of an em, that should lead to inter-sentence spacing of a third of an em.  Here’s what that looks like:

As Gregor Samsa awoke one morning from uneasy dreams, he found himself transformed in his bed into a gigantic insect. When he lifted his head a little, he could see his dome-like brown belly. The bed quilt could hardly keep in position and was about to slide off completely. His numerous legs, which were pitifully thin compared to the rest of his bulk, waved helplessly before his eyes. It was no dream. His room, a regular human bedroom, only rather too small, lay quiet between the four familiar walls.

Personally, I like that balance of spacing.  There’s a very slight extra bit of space between sentences, but not enough that it really draws attention to itself.  This is often the sort of spacing I use when I’m writing longer-form things in HTML that don’t need to adhere to some surrounding style conventions.  For reference, I’m using Unicode character U+2004 THREE-PER-EM SPACE between sentences.  It can be represented in HTML with the numeric character reference &#x2004;.

But I actually do something slightly different on this blog.  Because I’m picky about a variety of minutiae, I want this blog to look “right” (according to me) in all sorts of environments, including in text-mode web browsers (and, I guess, in other browsers that use monospaced fonts).  So this blog ends sentences with a combination of a regular space character and U+2009 THIN SPACE (or &#x2009; in HTML).  In a monospaced font, those will be shown as two full-width spaces.  In proportional fonts, the thin space is typically somewhere between a sixth and a fifth of an em.  In the font I use, it’s closer to a fifth.  Consequently, in most browsers those two spaces will add up to just a tiny bit more than my preferred total of a third of an em between sentences.

That combination of spaces looks like this (and also like every regular bit of text on this blog):

As Gregor Samsa awoke one morning from uneasy dreams, he found himself transformed in his bed into a gigantic insect.  When he lifted his head a little, he could see his dome-like brown belly.  The bed quilt could hardly keep in position and was about to slide off completely.  His numerous legs, which were pitifully thin compared to the rest of his bulk, waved helplessly before his eyes.  It was no dream.  His room, a regular human bedroom, only rather too small, lay quiet between the four familiar walls.

The Chicago Manual of Style, a fairly conservative style guide, has a blog article about their one-space recommendation, as well as their own history of sentence spacing in publishing.

The Associated Press Stylebook doesn’t have anything online that’s publicly accessible, but a Journalist’s Resource basic guide to the AP style summarizes the guidance as, “Use only one space after the end of a sentence. Period.”

The Modern Language Association, whose style guide is used in many academic settings, says to use one space after a period unless specifically directed otherwise by an instructor.

The American Psychological Association, one of the last holdouts for using two spaces after a sentence, updated their guidelines in 2019 to recommend a single space.

Microsoft Word started flagging two-space sentences as errors in 2020.

Matthew Butterick, a well-regarded typographer online, calls the use of two spaces a “typewriter habit”—a holdover from the days of typewriters that no longer serves any useful purpose.

It’s a little difficult to find a comprehensive online LaTeX reference about its sentence spacing.  It doesn’t help that many LaTeX people seem a bit bitter about the ubiquity of using a single space to separate sentences.  (For example.)  Probably some of the most thorough coverage is in (some of) the answers to the TeX Stack Exchange question “Double space between sentences”.

§ In Conclusion

Don’t type two spaces after sentences in a medium with a proportionally-spaced font.  The closest typographic convention to this practice hasn’t been in use professionally for over a century, and it doesn’t look great alongside modern fonts and typography.

If your only choices are between using two standard spaces and a single standard space after your sentences, use the single space.

In many environments, the choice between one and two spaces is a false dichotomy.  If you can, try adding just a little extra space between your sentences.  I find that sentences look good with about 33 percent more space between them than between the words they contain.  I probably wouldn’t add more than about 50 percent of my inter-word spacing to the ends of my sentences.

Now with Hugo

A little while ago, I finally got around to revamping my website generation.  I’d been using Blosxom for nearly two decades, but I’d also been meaning to move to something a bit newer for a while.  So now I’m using Hugo.

Most of my stuff ported over without too much difficulty.  I’d been using Markdown with Blosxom, so it translated pretty cleanly to Hugo, with just the addition of some front matter to each file.  Some of the Blosxom plugins I used had their own de facto front matter, so it was just a matter of changing it into YAML.

Overall, Hugo is very nice.  Its server mode and draft posts are particularly useful as I write things.  There were many times I wanted to preview a post in Blosxom and didn’t have a good way to do so.  Hugo solves that problem.  Draft posts let me work on things in the same repository as my published work without making things live until they’re ready.

Because Hugo facilitates working with the website as a whole much more readily, I also finally developed a proper deployment mechanism.  It’s just a Makefile with a few lines to generate intermediate files, run hugo to generate the HTML, and then scp the files to my webhost.  But having that automation makes it far easier to update things.  I could have done something similar with Blosxom, but Hugo has better affordances for it.

While I was at it, I also reworked my website theme.  I’ve had a lot more design practice since I put together the old theme.  While I might not say I’m good at this yet, I think I’ve at least gotten better.  You can compare, say, the old version and new version of my ebook post to see some of the difference.

I doubt anyone other than me really cares about the particulars of this website’s design.  But in case you do, the theme is available on GitLab.

Common X11 Compose Key Combinations

X11 has a useful feature called the compose key.  After pressing the compose key, you can type a sequence of keys on the keyboard to get various Unicode characters.  For example, Compose o c generates a copyright symbol (©).  This allows for typing a lot of extended characters with a US keyboard map (without the need to have dead keys or other non-US layout features).

There are, however, a lot of key combinations.  See the X11R7.7 documentation for its full list.  So this page focuses on the patterns in the key combinations and restricts itself to the ones I use more often.

Note: I use the XKB configuration option compose:prsc on my systems to map my Print Screen key to the compose key.  How I do that varies.  On some systems, I have Option "XkbOptions" "compose:prsc" in the InputClass section for my keyboard in an xorg.conf file.  On other systems, where it isn’t as easy to modify the system config files, I run setxkbmap -option compose:prsc automatically when I log in.

You can see what other keys are available to map as a compose key with:

grep "compose:" /usr/share/X11/xkb/rules/base.lst

Anyway, on to the list.

§ Quote Marks

The less-than and greater-than keys (or left and right angle brackets, if you prefer) can be combined with single and double quotes to make curly left and right quotes.  They can be typed in either order; both < ' and ' < result in a left single quote mark.

A comma can be used instead of an angle bracket to produce the low version of the quote symbol, for languages that use it to open quotes.

Combined with       <    >    ,
' (single quote)    ‘    ’    ‚
" (double quote)    “    ”    „

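Guillemets are the result of doubling angle brackets.  Use a leading period for single guillemets.

Combination  Result  Unicode Name
< <          «       U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
> >          »       U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
. <          ‹       U+2039 SINGLE LEFT-POINTING ANGLE QUOTATION MARK
. >          ›       U+203A SINGLE RIGHT-POINTING ANGLE QUOTATION MARK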

§ Accented Characters

Most letters can be combined with another character to add an accent.  The letter and accent character can be typed in either order; both ' a and a ' will give “á”.  Typing the letter last is preferred, since there are some multi-accent combinations that only make sense if you type all of the accents first.

The standard accent characters are:

Character          Accent Type        Example    Note
' (single quote)   acute accent       ' a → á
` (backtick)       grave accent       ` a → à
^ (caret)          circumflex         ^ a → â    > (greater than sign) also works: > a → â
~ (tilde)          tilde              ~ a → ã
" (double quote)   diaeresis; umlaut  " a → ä
* (asterisk)       ring               * a → å    Lowercase o also works, but only if it precedes the letter to be modified: o a → å
, (comma)          cedilla            , c → ç    Also works as an ogonek: , a → ą
/ (forward slash)  slash              / o → ø
- (dash)           macron             - a → ā    _ (underscore) also works: _ a → ā
. (period)         dot                . e → ė    Also removes the dot from a lowercase i: . i → ı

In similar ways, the < (less than symbol) combines as a caron (e.g. < e → ě); - (dash) combines as a stroke (e.g. - d → đ); and = (equals sign) combines as a double acute accent (e.g. = o → ő).  I don’t really use those, though.

There are some patterns for characters with multiple additions (e.g. * ' A → Ǻ, LATIN CAPITAL LETTER A WITH RING ABOVE AND ACUTE), but (1) I don’t use them often enough to need to remember the key combinations, and (2) many of them rely on keyboard layouts that can directly generate precomposed characters with one of the two modifiers.

§ Prime Symbols

I use these with some regularity, but they’re unfortunately not available as compose character combinations.  In some media, the HTML codes suffice.  In others, I can sometimes type the codepoint in hex.

Symbol           Character  Codepoint  HTML
(Single) Prime   ′          U+2032     &prime;
Double Prime     ″          U+2033     &Prime;
Triple Prime     ‴          U+2034     &tprime;
Quadruple Prime  ⁗          U+2057     &qprime;

In many programs, I can press Ctrl-Shift-U, followed by the Unicode codepoint, followed by the space or enter key.  So I can get a prime symbol with Ctrl-Shift-U 2 0 3 2 Enter.  That works for my terminal program and Firefox, at least.

In Emacs, I have to use C-x 8 Ret 2 0 3 2 Ret (et al.).

Regardless, I have to actually remember the codepoint, which is less convenient than the mnemonics afforded by compose key combinations.

§ Fractions

Many fractions can be created by typing two digits in sequence.  The first digit will be the numerator and the second will be the denominator.

All denominators from two to ten can be used with a numerator of one:

  • 1 2 → ½
  • 1 3 → ⅓
  • 1 4 → ¼
  • 1 5 → ⅕
  • 1 6 → ⅙
  • 1 7 → ⅐
  • 1 8 → ⅛
  • 1 9 → ⅑
  • 1 1 0 → ⅒

For non-unit numerators, all multiples of ⅙ and ⅛ are available, in their simplified forms (which means all multiples of ⅓ and ¼ are also available).

Multiples of ⅙:

  • 1 6 → ⅙
  • 1 3 → ⅓
  • 1 2 → ½
  • 2 3 → ⅔
  • 5 6 → ⅚

Multiples of ⅛:

  • 1 8 → ⅛
  • 1 4 → ¼
  • 3 8 → ⅜
  • 1 2 → ½
  • 5 8 → ⅝
  • 3 4 → ¾
  • 7 8 → ⅞

And, weirdly, 0 3 composes to ↉.

§ Subscripts and Superscripts

All of the digits, as well as the characters plus sign (+), equals sign (=), left parenthesis, and right parenthesis can be superscripted or subscripted by prefixing them with a caret or underscore, respectively.  The letters i and n can be superscripted with the sequences ^ _ i and ^ _ n, respectively.

For example:

  • ^ 2 → ²
  • _ 8 → ₈
  • _ ) → ₎
  • ^ _ i → ⁱ

§ Math Symbols

A number of math symbols are available.  Unfortunately, the Unicode character U+2212 MINUS SIGN (−) is not available, even though it’s a better choice than the plain dash in mathematical contexts.  The same techniques described in “Prime Symbols” above can be used to insert a minus sign by its codepoint.  You can also use &minus; in HTML.

Most math symbol combinations can be typed in either order, but the inclusive inequalities (≤ and ≥) need the equals sign to come after the less-than or greater-than character.

Combination  Result  Unicode Name
: -          ÷       U+00F7 DIVISION SIGN
- :
- ,          ¬       U+00AC NOT SIGN
, -
+ -          ±       U+00B1 PLUS-MINUS SIGN
- +
/ =          ≠       U+2260 NOT EQUAL TO
= /

§ Circled Numbers

Any one- or two-digit number can be put in a circle by surrounding it with parentheses.  The same works for single upper- and lowercase letters.

For example:

  • ( 1 ) → ①
  • ( 4 2 ) → ㊷
  • ( S ) → Ⓢ

Note: For the copyright and registered trademark symbols, use o c and o r (or their variants; see “Other Characters” below).  The simple circled letters are different codepoints.

§ Currency

Key combinations for currency symbols often cover every ordering and capitalization combination.  They include:

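Combination  Result  Unicode Name
C =          €       U+20AC EURO SIGN
= C
c =
= c
E =
= E
e =
= e
C |          ¢       U+00A2 CENT SIGN
| C
c |
| c
L -          £       U+00A3 POUND SIGN
- L
l -
- l
Y =          ¥       U+00A5 YEN SIGN
= Y
y =
= y
Y -          ¥       U+00A5 YEN SIGN
- Y
y -
- y
O x          ¤       U+00A4 CURRENCY SIGN
o X
o x
X o
x O
x o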

§ Whitespace

Two space characters turn into a nonbreaking space.  (Or you can use &nbsp; in HTML.)

A space and a period turn into U+2008 PUNCTUATION SPACE.  The Unicode specification says this is a “space equal to narrow punctuation of a font”, or about the width of a period.

Using space characters other than the normal space is, of course, risky, since many programs won’t indicate that the alternate characters are any different from the normal ones.

§ Other Characters

Combination  Result  Unicode Name            Note
- - -        —       U+2014 EM DASH
- - .        –       U+2013 EN DASH
o o          °       U+00B0 DEGREE SIGN      I find o o the easiest to type.
0 *
* 0
. ^          ·       U+00B7 MIDDLE DOT       You can think of this as either a raised dot (combining a period with a caret) or a smaller bullet (using a dash with a period instead of the bullet combination's equals sign with a period).
^ .
. -
. =          •       U+2022 BULLET
s o          §       U+00A7 SECTION SIGN     Either order; upper- or lowercase
o s
t m          ™       U+2122 TRADE MARK SIGN  Any mix of capitalization
T m
t M
s m          ℠       U+2120 SERVICE MARK     Any mix of capitalization
S m
s M
o c          ©       U+00A9 COPYRIGHT SIGN   Any mix of capitalization.  C O and C o also work, but combinations starting with lowercase c don't (they give different characters).  So I prefer to always start with the o.
O c
o C
o r          ®       U+00AE REGISTERED SIGN  Any mix of capitalization.  R O also works, but none of the other capitalization mixes make characters with the "r" first.  So, just as with the copyright symbol, I prefer to always start with the o.
O r
o R

Where to Get (DRM-free) Ebooks

This post grew out of a comment I made on Reddit in response to someone who was frustrated with the Kindle walled garden and wanted more generally usable books.

§ Context

I like to read.  I have no idea how many books I’ve read in my lifetime, but I own hundreds and hundreds of physical books, at least half of which I’ve read; my digital library has around nine hundred books, most of which I have yet to read; and I’ve read many more books besides the ones I own.  (When I was a teenager I’d walk out of the public library with a literal armful of books, read them, then return two weeks later to do the same thing all over again.)  These days I prefer to read ebooks.  It’s easier to manage my ebook library and I can carry a lot more books around with me in ebook form as compared to physical form.  (No more armfuls of physical books.)  It really helps that I have a tablet that doubles as an excellent ebook reader.

But I also prefer to actually own the things I’ve nominally purchased.  Many ebooks, including everything in Amazon’s Kindle ecosystem, come with digital rights management, or DRM.  DRM gives a book’s publisher control over how you use your copies of their books.  It’s theoretically intended to impede piracy, but (a) it’s not too hard to bypass if you’re actually intent on pirating the material, and (b) it effectively means that you don’t fully own things you’ve purchased unless you go out of your way to bypass it.  It has enabled things like Amazon removing a copy of Nineteen Eighty-Four from a high school student’s Kindle library.

As a matter of principle, I will not pay for DRM-encumbered digital media.  If I buy something, I want to feel I actually own it, which means I don’t have to rely on someone else mediating my use of the media.  So here’s where I get my DRM-free ebooks.

§ The List

When I want to buy a particular book, my first stop is  They have a good selection of books, and they clearly indicate whether a given book has DRM or not.

Many book publishers or imprints have their own book stores, and some of those offer DRM-free copies of their books.  Some of the ones I know of are:

  • InformIT for several of Pearson’s imprints, including Addison-Wesley.  InformIT ebooks are DRM-free, but are digitally watermarked to connect them to your account.
  • No Starch Press has books on programming, computers, and other “geek entertainment”.  They’ve got a very good line of books for getting kids into programming.
  • Baen’s Ebook Store primarily sells science fiction/fantasy books.

I don’t have a problem with digital watermarks like the ones InformIT uses.  They don’t prevent any personal use of the watermarked ebook; all they do is allow the publisher to take a copy shared online and track it back to the person who originally bought it.  For the most part, digital watermarking doesn’t restrict use of the book any more than copyright law restricts use of a physical book.

In addition to their ebook store linked above, Baen also has the Baen Free Library, which offers a periodically rotating selection of their books completely free (and also without DRM).

Tor Books, another science fiction/fantasy publisher, doesn’t have its own storefront (it sells through, among others, and appears to generally do DRM-free books), but it does have an Ebook of the Month Club.  To join the club, you simply sign up for their newsletter.  Every month you’ll get a link to download a free, DRM-free book from their catalog.  [The Ebook of the Month Club may have been discontinued.  The preceding link now redirects to the main website, and it’s been a while since one of their newsletters mentioned a free book.]

A good resource for out-of-copyright books is Standard Ebooks.  It’s a volunteer-run organization dedicated to turning existing books into high-quality ebooks.  Because of the nature of their operation, they mostly focus on books that are no longer restricted by copyrights.  Their books are all well-typeset, with a uniform appearance and consistent, well-curated metadata.  (If you’re so inclined, you can also contribute your own time and talents to their efforts.)

Standard Ebooks largely stands as an alternative to Project Gutenberg.  Project Gutenberg also has ebook versions of many out-of-copyright books, but Project Gutenberg focuses more on quantity than quality.  (Also, they’ve been around a lot longer; Project Gutenberg will celebrate its 50th anniversary later this year.)  In general, if a book is available from both Standard Ebooks and Project Gutenberg, get it from Standard Ebooks.  But if Standard Ebooks doesn’t have it and Project Gutenberg does, Project Gutenberg’s copy will be serviceable, even if it’s not necessarily formatted prettily and has occasional typos from the automated optical character recognition.

I also follow Humble Bundle.  They periodically offer ebook bundles which are typically DRM-free and available in EPUB, PDF, and MOBI formats.  (Their quality has been gradually declining over time, unfortunately.  For example, some recent book bundles have not had all of the formats for every book.)  Not all of Humble’s partners have good books (looking at you, Packt), and I’m not always interested in the style, genre, or even just the particular selection of books in a bundle.  But I have gotten some good books out of the bundles over the years, so I keep following them.

§ Other Recommendations

I have, over time, received recommendations from other people for additional sources of DRM-free ebooks.  Anything in this section is from those recommendations.  I haven’t used these sources extensively, if at all, so take these with however many grains of salt you need.

The FreeEBOOKS subreddit is more focused on ebooks that don’t cost anything, as opposed to ebooks without DRM, and they don’t restrict themselves by ereader compatibility.  Nevertheless, many of the free books they link to are available as DRM-free EPUBs.

Verso Books is an independent publisher primarily focused on politically left-oriented content.  Books are DRM-free but watermarked.

Smashwords is an ebook store that focuses on self-published authors.  Their official position on DRM is that they think it’s a bad idea, but the decision of whether to use it is up to books’ authors and publishers, not them.  But, as of 2023, none of their books have DRM, and they say that if they add DRM-encumbered books at some point, those books will be clearly labeled.

Leanpub is a combination storefront and publisher.  They don’t use DRM on any of the books they publish and sell.

§ Non-EPUB Books

My preference for ebooks is for the EPUB format.  It’s an open standard, has broad compatibility, and is adaptable to a variety of readers and environments.  But there are some sources for books that use other formats, most often PDF.  PDF isn’t great as an ebook format because it presupposes a page size and that page size is quite often either A4 or US letter paper.  Most ereaders have smaller screens than that, which means the text is either small and annoying to read or you have to zoom in and pan around to read everything.  But sometimes a PDF is the best option for a particular book.

The Internet Archive Text Archive has scans of millions of books.  Everything they’ve scanned is available in a web browser where you can page through the scanned images.  If the book is still covered by copyright, you have to create an account and check the book out in order to read it.  Checkouts last for an hour and they can be renewed.  This whole system is pretty convenient in my experience, especially for doing research, when you don’t necessarily need a book for longer than it takes to look up and read a section, take any needed notes, and check the section’s cross-references.

Some Internet Archive books are also available in EPUB, PDF, and other formats.  In those cases, you can download the file and do whatever you like with it.

An alternate entry point to the Internet Archive Text Archive is the Internet Archive Open Library.  It facilitates finding books in all sorts of libraries, but the Internet Archive Text Archive is one of the sources checked.  The site tends to be better for finding either physical or browser-readable books, rather than ereader-compatible books, though.

Wikibooks uses a wiki as a platform for collaborative authoring of books.  Most if not all of the books on Wikibooks are nonfiction reference material.  If you read the books on their website, you’ll always get the most up-to-date text, but many books can be downloaded in PDF format.  A handful are also available in EPUB.

§ Dishonorable Mention

O’Reilly, a publisher of high-quality computer-related and technical books, used to sell DRM-free copies of their books.  If you’ve previously bought any of those, you can still access them through, but you can’t buy new copies, as far as I can tell.  O’Reilly seems to be moving instead to subscription-based access to their ebooks, while still selling the physical versions.  They still have O’Reilly Open Books, which links to all of the books they’ve published under open licenses of various sorts, but very few of them are available in EPUBs.  Most of the open book links go to webpage versions of the books, which aren’t as easy to get into an ebook reader as a premade EPUB is.