Sentence Spacing

One lump or two?

There is an ongoing debate in some parts of the Internet about how much space should go after the end of a sentence.  Practically every publisher and quite a lot of other people will say there should be one space.  A minority of people—principally Gen X and older, I suspect—will say there should be two spaces.

I have opinions.

In short:  Two spaces are too much.  One space is okay, but it can feel a little crowded.  Adding just a little extra space—about 33 percent more—after a sentence gives a nice balance.  This blog does the latter.

§ A Brief History of Sentence Spacing

In the earliest days of the printing press, the standard among typesetters was to have about three times as much space between sentences as between words.  But even their inter-word spaces were much larger than the ones we use today.  In numeric terms, there was a space of one em between sentences, and one-third of an em between words.  An “em”, for our purposes, is simply a unit of measurement in printing that scales in proportion to the size of the type.  For example, text set at 72 points will have an em size of one inch, while 36-point text will have an em size of half an inch.  This proportionality makes the em a convenient reference for the sizes of things in typeset text.
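The arithmetic behind those measurements is easy to check.  Here is a minimal sketch, assuming the 72-points-per-inch convention implied by the example above; the function names are mine and purely illustrative.

```python
from fractions import Fraction

POINTS_PER_INCH = 72  # assumption: the 72-points-per-inch convention


def em_in_inches(type_size_pt):
    """An em is as wide as the type size, so convert points to inches."""
    return Fraction(type_size_pt, POINTS_PER_INCH)


def space_in_points(type_size_pt, ems):
    """Width in points of a space measured as a fraction of an em."""
    return Fraction(type_size_pt) * Fraction(ems)


if __name__ == "__main__":
    print(em_in_inches(72))  # one inch
    print(em_in_inches(36))  # half an inch
    # The early printing conventions, applied to 12-point type:
    print(space_in_points(12, Fraction(1, 3)))  # word space, in points
    print(space_in_points(12, 1))               # sentence space, in points
```

Using exact fractions rather than floats keeps values like a third of an em from picking up rounding noise.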

Here’s how those earliest spacing conventions might look with some sample text:

As Gregor Samsa awoke one morning from uneasy dreams, he found himself transformed in his bed into a gigantic insect. When he lifted his head a little, he could see his dome-like brown belly. The bed quilt could hardly keep in position and was about to slide off completely. His numerous legs, which were pitifully thin compared to the rest of his bulk, waved helplessly before his eyes. It was no dream. His room, a regular human bedroom, only rather too small, lay quiet between the four familiar walls.

When typewriters, with their fixed character widths, became common, people tried to adapt printing rules that had been developed for proportional typefaces.  Typewriters’ fixed-width spaces were a little larger than the printers’ third-of-an-em proportional-type spaces.  Typing conventions eventually settled on a single fixed-width space between words and two fixed-width spaces between sentences.

If it were typed according to these rules, our sample text would look something like this:

As Gregor Samsa awoke one morning from uneasy dreams, he found himself transformed in his bed into a gigantic insect.  When he lifted his head a little, he could see his dome-like brown belly.  The bed quilt could hardly keep in position and was about to slide off completely.  His numerous legs, which were pitifully thin compared to the rest of his bulk, waved helplessly before his eyes.  It was no dream.  His room, a regular human bedroom, only rather too small, lay quiet between the four familiar walls. 

During the twentieth century, publishers gradually began to close up spacing in their printing, both between words and between sentences.  In the first part of the century, inter-word spacing remained roughly a third of an em, but inter-sentence spacing shrank to half an em.  That gave something along these lines:

As Gregor Samsa awoke one morning from uneasy dreams, he found himself transformed in his bed into a gigantic insect. When he lifted his head a little, he could see his dome-like brown belly. The bed quilt could hardly keep in position and was about to slide off completely. His numerous legs, which were pitifully thin compared to the rest of his bulk, waved helplessly before his eyes. It was no dream. His room, a regular human bedroom, only rather too small, lay quiet between the four familiar walls.

During the second half of the twentieth century, professional publishers continued to close up inter-word and inter-sentence spacing.  By the end of the century, most publications used just a quarter of an em between both words and sentences.  I have seen speculation that the convergence between inter-word and inter-sentence spacing grew out of increasing use of automated typesetting.  (“Two Spaces - an Old Typists' Habit?” gives a brief history of automated typesetting and makes a plausible case for how the extra inter-sentence space was lost.)

This sort of spacing is what you’re almost certainly used to seeing everywhere, but here’s what it looks like with our sample text:

As Gregor Samsa awoke one morning from uneasy dreams, he found himself transformed in his bed into a gigantic insect. When he lifted his head a little, he could see his dome-like brown belly. The bed quilt could hardly keep in position and was about to slide off completely. His numerous legs, which were pitifully thin compared to the rest of his bulk, waved helplessly before his eyes. It was no dream. His room, a regular human bedroom, only rather too small, lay quiet between the four familiar walls.

Now, well into the twenty-first century, pretty much every newspaper, magazine, and book publisher uses the same amount of space between sentences as between words, and that space is generally about a quarter of an em.  Every major style guide recommends typing a single space after a sentence.  The only significant exceptions, as far as I can tell, are scientific journals that use LaTeX for their typesetting (and which haven’t explicitly turned off LaTeX’s extra inter-sentence spacing).

§ Readability Studies

Practically every study done on the readability effects of end-of-sentence spacing has been inconclusive.  The one exception is “Are two spaces better than one?” by Johnson, Bui, and Schmitt.  That study purported to find that putting two spaces after sentences improved readability exclusively for people who themselves put two spaces after sentences when writing.  The findings are even weaker than that summary implies, however, for reasons that are covered well by Matthew Butterick.  In short, the test conditions were well outside normal reading conditions and there were unexplored statistical differences between the overall reading abilities of one- and two-space-using readers.

In other words, no one has really demonstrated a tangible benefit to any variation of end-of-sentence spacing.

§ Aesthetics

So that leaves just the question of aesthetics.  What looks good?

As noted previously, nearly every newspaper, magazine, and book you read uses the same amount of space at the end of a sentence as between words.  Nearly every website you read is the same, unless the website author has gone out of their way to do something different.  Everyone reads things with just one space between sentences all the time, so there’s the aesthetics of familiarity, if nothing else.

My personal view is a little more nuanced than just one space versus two.  Using a single space is serviceable and two spaces are probably a bad idea, but I think I can do better than either.

Let’s look again at our example text with one space after each sentence:

As Gregor Samsa awoke one morning from uneasy dreams, he found himself transformed in his bed into a gigantic insect. When he lifted his head a little, he could see his dome-like brown belly. The bed quilt could hardly keep in position and was about to slide off completely. His numerous legs, which were pitifully thin compared to the rest of his bulk, waved helplessly before his eyes. It was no dream. His room, a regular human bedroom, only rather too small, lay quiet between the four familiar walls.

I find that readable, but it also feels a bit crowded.  The sentences seem to run together without any room to breathe.

Let’s put two spaces after each sentence and see how that looks:

As Gregor Samsa awoke one morning from uneasy dreams, he found himself transformed in his bed into a gigantic insect.  When he lifted his head a little, he could see his dome-like brown belly.  The bed quilt could hardly keep in position and was about to slide off completely.  His numerous legs, which were pitifully thin compared to the rest of his bulk, waved helplessly before his eyes.  It was no dream.  His room, a regular human bedroom, only rather too small, lay quiet between the four familiar walls.

To me, that feels like too much space.  As I read the paragraph, the extra space between sentences feels almost like an interruption each time.

I mentioned earlier that some publishers still use LaTeX’s inter-sentence spacing.  In particular, LaTeX allocates 33 percent more space at the end of a sentence than it does between words.  (For very specific numbers, see this Stack Exchange answer on the topic.)  If a standard inter-word space is now a quarter of an em, that should lead to inter-sentence spacing of a third of an em.  Here’s what that looks like:

As Gregor Samsa awoke one morning from uneasy dreams, he found himself transformed in his bed into a gigantic insect. When he lifted his head a little, he could see his dome-like brown belly. The bed quilt could hardly keep in position and was about to slide off completely. His numerous legs, which were pitifully thin compared to the rest of his bulk, waved helplessly before his eyes. It was no dream. His room, a regular human bedroom, only rather too small, lay quiet between the four familiar walls.

Personally, I like that balance of spacing.  There’s a very slight extra bit of space between sentences, but not enough that it really draws attention to itself.  This is often the sort of spacing I use when I’m writing longer-form things in HTML that don’t need to adhere to some surrounding style conventions.  For reference, I’m using Unicode character U+2004 THREE-PER-EM SPACE between sentences.  It can be represented in HTML using the numeric character reference &#x2004;.

But I actually do something slightly different on this blog.  Because I’m picky about a variety of minutiae, I want this blog to look “right” (according to me) in all sorts of environments, including in text-mode web browsers (and, I guess, in other browsers that use monospaced fonts).  So this blog ends sentences with a combination of a regular space character and U+2009 THIN SPACE (or &thinsp;).  In a monospaced font, those will be shown as two full-width spaces.  In proportional fonts, the thin space is typically somewhere between a sixth and a fifth of an em.  In the font I use, it’s closer to a fifth.  Consequently, in most browsers those two spaces will add up to just a tiny bit more than my preferred total of a third of an em between sentences.

That combination of spaces looks like this (and also like every regular bit of text on this blog):

As Gregor Samsa awoke one morning from uneasy dreams, he found himself transformed in his bed into a gigantic insect.  When he lifted his head a little, he could see his dome-like brown belly.  The bed quilt could hardly keep in position and was about to slide off completely.  His numerous legs, which were pitifully thin compared to the rest of his bulk, waved helplessly before his eyes.  It was no dream.  His room, a regular human bedroom, only rather too small, lay quiet between the four familiar walls.
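Either kind of spacing can be applied mechanically.  The following is a minimal sketch, not the tooling this blog actually uses.  It assumes the source text marks sentence breaks with two plain spaces after closing punctuation, which avoids having to detect sentences; the function name and regular expression are mine.

```python
import re

THREE_PER_EM_SPACE = "\u2004"
THIN_SPACE = "\u2009"

# Assumption: the source text marks sentence breaks with two plain spaces
# after closing punctuation, which sidesteps real sentence detection.
_SENTENCE_GAP = re.compile(r'(?<=[.!?"”’)])  (?=\S)')


def respace_sentences(text, gap=THREE_PER_EM_SPACE):
    """Replace each two-space sentence gap with the given spacing string."""
    return _SENTENCE_GAP.sub(gap, text)


if __name__ == "__main__":
    sample = "It was no dream.  His room lay quiet."
    # A single third-of-an-em space between sentences.
    print(repr(respace_sentences(sample)))
    # A regular space followed by a thin space.
    print(repr(respace_sentences(sample, " " + THIN_SPACE)))
```

Passing the gap as a parameter makes it easy to compare the two approaches on the same text.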

The Chicago Manual of Style, a fairly conservative style guide, has a blog article about their one-space recommendation, as well as their own history of sentence spacing in publishing.

The Associated Press Stylebook doesn’t have anything online that’s publicly accessible, but a Journalist’s Resource basic guide to the AP style summarizes the guidance as, “Use only one space after the end of a sentence. Period.”

The Modern Language Association, whose style guide is used in many academic settings, says to use one space after a period unless specifically directed otherwise by an instructor.

The American Psychological Association, one of the last holdouts for using two spaces after a sentence, updated their guidelines in 2019 to recommend a single space.

Microsoft Word started flagging two-space sentences as errors in 2020.

Matthew Butterick, a well-regarded typographer online, calls the use of two spaces a “typewriter habit”—a holdover from the days of typewriters that no longer serves any useful purpose.

It’s a little difficult to find a comprehensive online LaTeX reference about its sentence spacing.  It doesn’t help that many LaTeX people seem a bit bitter about the ubiquity of using a single space to separate sentences.  (For example.)  Probably some of the most thorough coverage is in (some of) the answers to the TeX Stack Exchange question “Double space between sentences”.

§ In Conclusion

Don’t type two spaces after sentences in a medium with a proportionally-spaced font.  The closest typographic convention to this practice hasn’t been in use professionally for over a century, and it doesn’t look great alongside modern fonts and typography.

If your only choices are between using two standard spaces and a single standard space after your sentences, use the single space.

In many environments, the choice between one and two spaces is a false dichotomy.  If you can, try adding just a little extra space between your sentences.  I find that sentences look good with about 33 percent more space between them than between the words they contain.  I probably wouldn’t add more than about 50 percent of my inter-word spacing to the ends of my sentences.


Now with Hugo

A little while ago, I finally got around to revamping my website generation.  I’d been using Blosxom for nearly two decades, but I’d also been meaning to move to something a bit newer for a while.  So now I’m using Hugo.

Most of my stuff ported over without too much difficulty.  I’d been using Markdown with Blosxom, so it translated pretty cleanly to Hugo, with just the addition of some front matter to each file.  Some of the Blosxom plugins I used had their own de facto front matter, so it was just a matter of changing it into YAML.

Overall, Hugo is very nice.  Its server mode and draft posts are particularly useful as I write things.  There were many times I wanted to preview a post in Blosxom and didn’t have a good way to do so.  Hugo solves that problem.  Draft posts let me work on things in the same repository as my published work without making things live until they’re ready.

Because Hugo facilitates working with the website as a whole much more readily, I also finally developed a proper deployment mechanism.  It’s just a Makefile with a few lines to generate intermediate files, run hugo to generate the HTML, and then scp the files to my webhost.  But having that automation makes it far easier to update things.  I could have done something similar with Blosxom, but Hugo has better affordances for it.
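The real mechanism is a Makefile, which isn’t reproduced here.  As a rough stand-in, the same build-then-copy sequence could be sketched in Python.  Everything below is an assumption for illustration: the destination string is hypothetical, the intermediate-file step is omitted because its details aren’t given, and “public” is simply Hugo’s default output directory.

```python
import subprocess


def deploy_commands(destination, output_dir="public"):
    """Return the commands to build the site and copy it to the webhost.

    `destination` is a hypothetical scp target such as "user@example.com:www/".
    "public" is Hugo's default output directory.
    """
    return [
        ["hugo"],                                # generate the HTML
        ["scp", "-r", output_dir, destination],  # copy the output directory
    ]


def deploy(destination, dry_run=True):
    """Run (or, by default, just print) each deployment command in order."""
    for command in deploy_commands(destination):
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    deploy("user@example.com:www/")  # dry run: only prints the commands
```

Defaulting to a dry run means nothing is uploaded until the commands have been checked by eye.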

While I was at it, I also reworked my website theme.  I’ve had a lot more design practice since I put together the old theme.  While I might not say I’m good at this yet, I think I’ve at least gotten better.  You can compare, say, the old version and new version of my ebook post to see some of the difference.

I doubt anyone other than me really cares about the particulars of this website design.  But in case you do, the theme is available on GitLab.


Common X11 Compose Key Combinations

X11 has a useful feature called the compose key.  After pressing the compose key, you can type a sequence of keys on the keyboard to get various Unicode characters.  For example, Compose o c generates a copyright symbol (©).  This allows for typing a lot of extended characters with a US keyboard map (without the need to have dead keys or other non-US layout features).

There are, however, a lot of key combinations.  See the X11R7.7 documentation for its full list.  So this page focuses on the patterns in the key combinations and restricts itself to the ones I use more often.

Note: I use the XKB configuration option compose:prsc on my systems to map my Print Screen key to the compose key.  How I do that varies.  On some systems, I have Option "XkbOptions" "compose:prsc" in the InputClass section for my keyboard in an xorg.conf file.  On other systems, where it isn’t as easy to modify the system config files, I run setxkbmap -option compose:prsc automatically when I log in.

You can see what other keys are available to map as a compose key with:

grep "compose:" /usr/share/X11/xkb/rules/base.lst
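If you want the same list in a script, here is a minimal Python sketch of that search.  It assumes each matching line holds an option name followed by whitespace and a description.  Unlike the grep command, it only matches lines whose first field starts with “compose:”.

```python
def compose_options(lines):
    """Collect XKB options whose name starts with "compose:".

    Each matching line is assumed to hold an option name followed by
    whitespace and a human-readable description.
    """
    options = {}
    for line in lines:
        parts = line.split(None, 1)
        if parts and parts[0].startswith("compose:"):
            options[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return options


if __name__ == "__main__":
    # The same file the grep command reads.
    with open("/usr/share/X11/xkb/rules/base.lst", encoding="utf-8") as listing:
        for name, description in compose_options(listing).items():
            print(f"{name}\t{description}")
```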

Anyway, on to the list.

§ Quote Marks

The less-than and greater-than keys (or left and right angle brackets, if you prefer) can be combined with single and double quotes to make curly left and right quotes.  They can be typed in either order; both < ' and ' < result in a left single quote mark.

A comma can be used instead of an angle bracket to produce the low version of the quote symbol, for languages that use it to open quotes.

     <    >    ,
'    ‘    ’    ‚
"    “    ”    „

Guillemets are the result of doubling angle brackets.  Use a leading period for single guillemets.

Combination  Result  Unicode Name
< <          «       U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
> >          »       U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
. <          ‹       U+2039 SINGLE LEFT-POINTING ANGLE QUOTATION MARK
. >          ›       U+203A SINGLE RIGHT-POINTING ANGLE QUOTATION MARK

§ Accented Characters

Most letters can be combined with another character to add an accent.  The letter and accent character can be typed in either order; both ' a and a ' will give “á”.  Typing the letter last is preferred, since there are some multi-accent combinations that only make sense if you type all of the accents first.

The standard accent characters are:

Character           Accent Type         Example    Note
' (single quote)    acute accent        ' a → á
` (backtick)        grave accent        ` a → à
^ (caret)           circumflex          ^ a → â    > (greater than sign) also works: > a → â
~ (tilde)           tilde               ~ a → ã
" (double quote)    diaeresis; umlaut   " a → ä
* (asterisk)        ring                * a → å    Lowercase o also works, but only if it precedes the letter to be modified: o a → å
, (comma)           cedilla             , c → ç    Also works as an ogonek: , a → ą
/ (forward slash)   slash               / o → ø
- (dash)            macron              - a → ā    _ (underscore) also works: _ a → ā
. (period)          dot                 . e → ė    Also removes the dot from a lowercase i: . i → ı

In similar ways, the < (less than symbol) combines as a caron (e.g. < e → ě); - (dash) combines as a stroke (e.g. - d → đ); and = (equals sign) combines as a double acute accent (e.g. = o → ő).  I don’t really use those, though.

There are some patterns for characters with multiple additions (e.g. * ' A → Ǻ, LATIN CAPITAL LETTER A WITH RING ABOVE AND ACUTE), but (1) I don’t use them often enough to need to remember the key combinations, and (2) many of them rely on keyboard layouts that can directly generate precomposed characters with one of the two modifiers.
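To see what “precomposed” means here, the sketch below builds accented letters from a base letter plus a Unicode combining mark and lets normalization pick the single precomposed character.  This is only an illustration.  X11 resolves compose sequences through a lookup table of key sequences, not through normalization, and the key-to-mark mapping is my own assumption based on the accent keys in the table above.  The slash is left out because ø has no decomposition into a letter plus a combining mark.

```python
import unicodedata

# Assumption: this mapping mirrors the accent keys in the table above.
# X11 itself uses a lookup table of key sequences, not Unicode normalization.
COMBINING_MARKS = {
    "'": "\u0301",  # combining acute accent
    "`": "\u0300",  # combining grave accent
    "^": "\u0302",  # combining circumflex accent
    "~": "\u0303",  # combining tilde
    '"': "\u0308",  # combining diaeresis
    "*": "\u030a",  # combining ring above
    ",": "\u0327",  # combining cedilla
    "-": "\u0304",  # combining macron
    ".": "\u0307",  # combining dot above
}


def accent(key, letter):
    """Combine a letter with the accent for `key`, precomposed if possible."""
    return unicodedata.normalize("NFC", letter + COMBINING_MARKS[key])


if __name__ == "__main__":
    for key, letter in [("'", "a"), (",", "c"), ("*", "a"), ("-", "a")]:
        print(key, letter, "→", accent(key, letter))
```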

§ Prime Symbols

I use these with some regularity, but they’re unfortunately not available as compose character combinations.  In some media, the HTML codes suffice.  In others, I can sometimes type the codepoint in hex.

Symbol            Character  Codepoint  HTML
(Single) Prime    ′          U+2032     &prime;
Double Prime      ″          U+2033     &Prime;
Triple Prime      ‴          U+2034     &tprime;
Quadruple Prime   ⁗          U+2057     &qprime;

In many programs, I can press Ctrl-Shift-U, followed by the Unicode codepoint, followed by the space or enter key.  So I can get a prime symbol with Ctrl-Shift-U 2 0 3 2 Enter.  That works for my terminal program and Firefox, at least.

In Emacs, I have to use C-x 8 Ret 2 0 3 2 Ret (et al.).

Regardless, I have to actually remember the codepoint, which is less convenient than the mnemonics afforded by compose key combinations.
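When the codepoint is all you have, a short script can at least confirm which character it names.  This is a minimal sketch using only Python’s standard library; the helper name is mine.

```python
import html
import unicodedata

PRIME_CODEPOINTS = [0x2032, 0x2033, 0x2034, 0x2057]


def describe(codepoint):
    """Format a codepoint as 'U+XXXX <character> <Unicode name>'."""
    char = chr(codepoint)
    return f"U+{codepoint:04X} {char} {unicodedata.name(char)}"


if __name__ == "__main__":
    for codepoint in PRIME_CODEPOINTS:
        print(describe(codepoint))
    # The HTML entities decode to the same characters.
    print(html.unescape("&prime; &Prime;"))
```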

§ Fractions

Many fractions can be created by typing two digits in sequence.  The first digit will be the numerator and the second will be the denominator.

All denominators from two to ten can be used with a numerator of one:

  • 1 2 → ½
  • 1 3 → ⅓
  • 1 4 → ¼
  • 1 5 → ⅕
  • 1 6 → ⅙
  • 1 7 → ⅐
  • 1 8 → ⅛
  • 1 9 → ⅑
  • 1 1 0 → ⅒

For non-unit numerators, all multiples of ⅙ and ⅛ are available, in their simplified forms (which means all multiples of ⅓ and ¼ are also available).

Multiples of ⅙:

  • 1 6 → ⅙
  • 1 3 → ⅓
  • 1 2 → ½
  • 2 3 → ⅔
  • 5 6 → ⅚

Multiples of ⅛:

  • 1 8 → ⅛
  • 1 4 → ¼
  • 3 8 → ⅜
  • 1 2 → ½
  • 5 8 → ⅝
  • 3 4 → ¾
  • 7 8 → ⅞

And, weirdly, 0 3 composes to ↉.
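The “simplified forms” rule is easier to see in code.  The sketch below works from Unicode’s own list of single-character fractions and looks them up by value, so 2/6 and 1/3 land on the same character.  It says which fraction characters exist, not which compose sequences X11 defines for them; the function name and the lookup approach are mine.

```python
import unicodedata
from fractions import Fraction

# Code points of the single-character fractions: ¼ ½ ¾, ⅐ through ⅞, and ↉.
_FRACTION_CODEPOINTS = [*range(0x00BC, 0x00BF), *range(0x2150, 0x215F), 0x2189]

# unicodedata.numeric() returns a float, so round it back to an exact fraction.
_BY_VALUE = {
    Fraction(unicodedata.numeric(chr(cp))).limit_denominator(10): chr(cp)
    for cp in _FRACTION_CODEPOINTS
}


def fraction_char(numerator, denominator):
    """Return the single character for a fraction, or None if there is none."""
    return _BY_VALUE.get(Fraction(numerator, denominator))


if __name__ == "__main__":
    for pair in [(1, 2), (2, 6), (6, 8), (2, 7)]:
        print(pair, fraction_char(*pair))
```

Because the lookup is by value, any zero numerator maps to ↉, not just 0/3.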

§ Subscripts and Superscripts

All of the digits, as well as the characters plus sign (+), equals sign (=), left parenthesis, and right parenthesis can be superscripted or subscripted by prefixing them with a caret or underscore, respectively.  The letters i and n can be superscripted with the sequences ^ _ i and ^ _ n, respectively.

For example:

  • ^ 2 → ²
  • _ 8 → ₈
  • _ ) → ₎
  • ^ _ i → ⁱ
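For text that already exists, the same substitutions can be done in bulk.  This is a minimal sketch using translation tables.  It covers only the characters named above and leaves everything else unchanged; the function names are mine.

```python
# Translation tables for the characters that have superscript or subscript
# forms in the list above; any other character passes through unchanged.
_SUPERSCRIPTS = str.maketrans("0123456789+=()in", "⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁼⁽⁾ⁱⁿ")
_SUBSCRIPTS = str.maketrans("0123456789+=()", "₀₁₂₃₄₅₆₇₈₉₊₌₍₎")


def superscript(text):
    """Replace each supported character with its superscript form."""
    return text.translate(_SUPERSCRIPTS)


def subscript(text):
    """Replace each supported character with its subscript form."""
    return text.translate(_SUBSCRIPTS)


if __name__ == "__main__":
    print("x" + superscript("2"))
    print("H" + subscript("2") + "O")
    print("a" + superscript("n+1"))
```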

§ Math Symbols

A number of math symbols are available.  Unfortunately, the Unicode character U+2212 MINUS SIGN (−) is not available, even though it’s a better choice than the plain dash in mathematical contexts.  The same techniques described in “Prime Symbols” above can be used to insert a minus sign by its codepoint.  You can also use &minus; in HTML.

Most math symbol combinations can be typed in either order, but the inclusive inequalities (≤ and ≥) need the equals sign to come after the less-than or greater-than character.

Combination  Result  Unicode Name
x x          ×       U+00D7 MULTIPLICATION SIGN
: -          ÷       U+00F7 DIVISION SIGN
- :
- ,          ¬       U+00AC NOT SIGN
, -
+ -          ±       U+00B1 PLUS-MINUS SIGN
- +
/ =          ≠       U+2260 NOT EQUAL TO
= /
< =          ≤       U+2264 LESS-THAN OR EQUAL TO
> =          ≥       U+2265 GREATER-THAN OR EQUAL TO

§ Circled Numbers

Any one- or two-digit number can be put in a circle by surrounding it with parentheses.  The same works for single upper- and lowercase letters.

For example:

  • ( 1 ) → ①
  • ( 4 2 ) → ㊷
  • ( S ) → Ⓢ

Note: For the copyright and registered trademark symbols, use o c and o r (or their variants; see “Other Characters” below).  The simple circled letters are different codepoints.
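The circled characters sit in a few contiguous Unicode blocks, so their codepoints can be computed.  The sketch below does that for numbers and single ASCII letters.  Unicode itself only defines circled numbers from 0 through 50, so the function returns None beyond that; I haven’t checked whether the compose table covers every one of them.  The function name is mine.

```python
def circled(text):
    """Return the circled form of a number 0-50 or a single ASCII letter."""
    if text.isascii() and text.isdigit():
        number = int(text)
        if number == 0:
            return "\u24ea"                    # circled digit zero
        if 1 <= number <= 20:
            return chr(0x2460 + number - 1)    # ① through ⑳
        if 21 <= number <= 35:
            return chr(0x3251 + number - 21)   # ㉑ through ㉟
        if 36 <= number <= 50:
            return chr(0x32B1 + number - 36)   # ㊱ through ㊿
        return None
    if len(text) == 1 and text.isascii() and text.isalpha():
        base = 0x24B6 if text.isupper() else 0x24D0  # Ⓐ or ⓐ
        return chr(base + ord(text.upper()) - ord("A"))
    return None


if __name__ == "__main__":
    for item in ["1", "42", "S", "s", "99"]:
        print(item, circled(item))
```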

§ Currency

Key combinations for currency symbols often cover every ordering and capitalization combination.  They include:

Combination  Result  Unicode Name
C E          ₠       U+20A0 EURO-CURRENCY SIGN
C =          €       U+20AC EURO SIGN
= C
c =
= c
E =
= E
e =
= e
C |          ¢       U+00A2 CENT SIGN
| C
c |
| c
L -          £       U+00A3 POUND SIGN
- L
l -
- l
Y =          ¥       U+00A5 YEN SIGN
= Y
y =
= y
Y -
- Y
y -
- y
O X          ¤       U+00A4 CURRENCY SIGN
O x
o X
o x
X O
X o
x O
x o

§ Whitespace

Two space characters turn into a nonbreaking space.  (Or you can use &nbsp; in HTML.)

A space and a period turn into U+2008 PUNCTUATION SPACE.  The Unicode specification says this is a “space equal to narrow punctuation of a font”, or about the width of a period.

Using space characters other than the normal space is, of course, risky, since many programs won’t indicate that the alternate characters are any different from the normal ones.
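One way to reduce that risk is to make the alternate spaces visible when checking a piece of text.  This is a minimal sketch that labels every space separator other than the plain space; the function name and the label format are mine.

```python
import unicodedata


def reveal_spaces(text):
    """Replace every space separator except the plain space with a label."""
    pieces = []
    for char in text:
        # "Zs" is the Unicode category for space separators.
        if char != " " and unicodedata.category(char) == "Zs":
            pieces.append(f"<U+{ord(char):04X} {unicodedata.name(char)}>")
        else:
            pieces.append(char)
    return "".join(pieces)


if __name__ == "__main__":
    print(reveal_spaces("10\u00a0km"))
    print(reveal_spaces("One.\u2008Two."))
```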

§ Other Characters

Combination  Result  Unicode Name                          Note
- - -        —       U+2014 EM DASH
- - .        –       U+2013 EN DASH
. .          …       U+2026 HORIZONTAL ELLIPSIS
- >          →       U+2192 RIGHTWARDS ARROW
< -          ←       U+2190 LEFTWARDS ARROW
o o          °       U+00B0 DEGREE SIGN                    I find o o the easiest to type.
0 *
* 0
. ^          ·       U+00B7 MIDDLE DOT                     You can think of this as either a raised dot (combining a period with a caret) or a smaller bullet (using a dash with a period instead of the bullet combination's equals sign with a period).
^ .
. -
. =          •       U+2022 BULLET
? ?          ¿       U+00BF INVERTED QUESTION MARK
! ?          ‽       U+203D INTERROBANG
? !          ⸘       U+2E18 INVERTED INTERROBANG
s o          §       U+00A7 SECTION SIGN                   Either order; upper- or lowercase
o s
S O
O S
P P          ¶       U+00B6 PILCROW SIGN
t m          ™       U+2122 TRADE MARK SIGN                Any mix of capitalization
T M
T m
t M
s m          ℠       U+2120 SERVICE MARK                   Any mix of capitalization
S M
S m
s M
o c          ©       U+00A9 COPYRIGHT SIGN                 Any mix of capitalization.  C O and C o also work, but combinations starting with lowercase c don't (they give different characters).  So I prefer to always start with the o.
O C
O c
o C
o r          ®       U+00AE REGISTERED SIGN                Any mix of capitalization.  R O also works, but none of the other capitalization mixes make characters with the "r" first.  So, just as with the copyright symbol, I prefer to always start with the o.
O R
O r
o R
T H          Þ       U+00DE LATIN CAPITAL LETTER THORN
t h          þ       U+00FE LATIN SMALL LETTER THORN
D H          Ð       U+00D0 LATIN CAPITAL LETTER ETH
d h          ð       U+00F0 LATIN SMALL LETTER ETH
S S          ẞ       U+1E9E LATIN CAPITAL LETTER SHARP S
s s          ß       U+00DF LATIN SMALL LETTER SHARP S

Éowyn Challenge – Walking to Mordor and Back

A couple of months ago I got a FitBit Luxe.  I’ve long used the pedometer on my phone for step tracking, but it’s not always accurate and I don’t always have it on me.  (Plus, I really wanted longitudinal heart rate data, which isn’t feasible with something like a phone.)

To “celebrate” having an always-on health tracker, I’m going to start on a project I came across a while back but for which I hadn’t put together the pieces of participation until now: taking the One Ring to Mordor with Frodo.

§ Walking to Mordor and Back

This is most directly influenced by Nerd Fitness’s “Walking to Mordor and Back” Google spreadsheet.  It’s a fitness motivation tool that, in turn, is based on the Éowyn Challenge.  The Éowyn Challenge takes all the traveling various characters do in The Lord of the Rings and breaks it into segments, as shown on their “Walk to Rivendell and Beyond” page.  The goal of the Éowyn Challenge is to pick a segment and track your personal progress in traveling the same distance as the characters.  Each segment has a breakdown by day and mile of where the characters went and what they did.

Nerd Fitness has taken all of Frodo’s segments from the Éowyn Challenge (Hobbiton to Mount Doom back to Hobbiton and then to the Grey Havens) and put them into a spreadsheet for ease of tracking.  They omit all the mile-by-mile details in favor of looking at the overall distance traveled—a bit over 3,600 miles.

§ My Plan

Nerd Fitness’s spreadsheet is designed around having a group of participants and adding their miles to get a total.  (Which is not how group travel works, but we’ll ignore that for now.)  I’m just going to track my own miles and see how long it takes me to hit all the landmarks.  I expect it’ll be significantly longer than it took Frodo.

I have an IFTTT rule to add each day’s fitness summary to a Google spreadsheet.  So I added a sheet to my copy of the “Walking to Mordor and Back” document; that sheet just imports all the lines from my FitBit spreadsheet and calculates a running mileage total.  I’ve modified the main sheet of the “Walking to Mordor and Back” to automatically pull data from that secondary sheet.  It’ll update the mile total and arrival dates all on its own.
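The actual work happens in spreadsheet formulas, which aren’t shown here.  The same logic, a running total checked against each landmark’s distance, can be sketched in Python.  The function name and data layout are my own assumptions, and the figures in the usage example are made-up placeholders, not real mileage or real landmark distances.

```python
from itertools import accumulate


def arrival_dates(daily_miles, landmarks):
    """Map each landmark to the first date the running total reaches it.

    daily_miles: list of (date_string, miles) pairs in chronological order.
    landmarks: dict of landmark name -> cumulative distance in miles.
    """
    dates = [day for day, _ in daily_miles]
    totals = list(accumulate(miles for _, miles in daily_miles))
    arrivals = {}
    for name, distance in landmarks.items():
        arrivals[name] = next(
            (day for day, total in zip(dates, totals) if total >= distance),
            None,  # not reached yet
        )
    return arrivals


if __name__ == "__main__":
    # Placeholder values for illustration only.
    log = [("01-01", 3.0), ("01-02", 4.5), ("01-03", 2.5)]
    marks = {"first stop": 5, "second stop": 10, "far away": 100}
    print(arrival_dates(log, marks))
```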

With all of that automated, I just have to wear the FitBit and live as usual.  As long as I remember to check on the spreadsheet periodically, I’ll be able to see how I’m doing.

I’m starting the mile accumulation today, January 1.  Let’s see how long it takes me to finish!


Where to Get (DRM-free) Ebooks

This post grew out of a comment I made on Reddit in response to someone who was frustrated with the Kindle walled garden and wanted more generally-usable books.

§ Context

I like to read.  I have no idea how many books I’ve read in my lifetime, but I own hundreds and hundreds of physical books, at least half of which I’ve read; my digital library has around nine hundred books, most of which I have yet to read; and I’ve read many more books besides the ones I own.  (When I was a teenager I’d walk out of the public library with a literal armful of books, read them, then return two weeks later to do the same thing all over again.)  These days I prefer to read ebooks.  It’s easier to manage my ebook library and I can carry a lot more books around with me in ebook form as compared to physical form.  (No more armfuls of physical books.)  It really helps that I have a tablet that doubles as an excellent ebook reader.

But I also prefer to actually own the things I’ve nominally purchased.  Many ebooks, including everything in Amazon’s Kindle ecosystem, come with digital rights management, or DRM.  DRM gives a book’s publisher control over how you use your copies of their books.  It’s theoretically intended to impede piracy, but (a) it’s not too hard to bypass if you’re actually intent on pirating the material, and (b) it effectively means that you don’t fully own things you’ve purchased unless you go out of your way to bypass it.  It has enabled things like Amazon removing a copy of Nineteen Eighty-Four from a high school student’s Kindle library.

As a matter of principle, I will not pay for DRM-encumbered digital media.  If I buy something, I want to feel I actually own it, which means I don’t have to rely on someone else mediating my use of the media.  So here’s where I get my DRM-free ebooks.

§ The List

When I want to buy a particular book, my first stop is eBooks.com.  They have a good selection of books, and they clearly indicate whether a given book has DRM or not.

Many book publishers or imprints have their own book stores, and some of those offer DRM-free copies of their books.  Some of the ones I know of are:

  • InformIT for several of Pearson’s imprints, including Addison-Wesley.  InformIT ebooks are DRM-free, but are digitally watermarked to connect them to your account.
  • No Starch Press has books on programming, computers, and other “geek entertainment”.  They’ve got a very good line of books for getting kids into programming.
  • Baen’s Ebook Store primarily sells science fiction/fantasy books.

I don’t have a problem with digital watermarks like the ones InformIT uses.  They don’t prevent any personal use of the watermarked ebook; all they do is allow the publisher to take a copy shared online and track it back to the person who originally bought it.  For the most part, digital watermarking doesn’t restrict use of the book any more than copyright law restricts use of a physical book.

In addition to their ebook store linked above, Baen also has the Baen Free Library, which has a periodically-rotating selection of their books completely for free (and also without DRM).

Tor Books, another science fiction/fantasy publisher, doesn’t have its own storefront (it sells through ebooks.com, among others, and appears to generally do DRM-free books), but it does have an Ebook of the Month Club.  To join the club, you simply sign up for their newsletter.  Every month you’ll get a link to download a free, DRM-free book from their catalog.  [The Ebook of the Month Club may have been discontinued.  The preceding link now redirects to the main website, and it’s been a while since one of their newsletters mentioned a free book.]

A good resource for out-of-copyright books is Standard Ebooks.  It’s a volunteer-run organization dedicated to turning existing books into high-quality ebooks.  Because of the nature of their operation, they mostly focus on books that are no longer restricted by copyrights.  Their books are all well-typeset, with a uniform appearance and consistent, well-curated metadata.  (If you’re so inclined, you can also contribute your own time and talents to their efforts.)

Standard Ebooks largely stands as an alternative to Project Gutenberg.  Project Gutenberg also has ebook versions of many out-of-copyright books, but Project Gutenberg focuses more on quantity than quality.  (Also, they’ve been around a lot longer; Project Gutenberg will celebrate its 50th anniversary later this year.)  In general, if a book is available from both Standard Ebooks and Project Gutenberg, get it from Standard Ebooks.  But if Standard Ebooks doesn’t have it and Project Gutenberg does, Project Gutenberg’s copy will be serviceable, even if it’s not necessarily formatted prettily and has occasional typos from the automated optical character recognition.

I also follow Humble Bundle.  They periodically offer ebook bundles which are typically DRM-free and available in EPUB, PDF, and MOBI formats.  (Their quality has been gradually declining over time, unfortunately.  For example, some recent book bundles have not had all of the formats for every book.)  Not all of Humble’s partners have good books (looking at you, Packt), and I’m not always interested in the style, genre, or even just the particular selection of books in a bundle.  But I have gotten some good books out of the bundles over the years, so I keep following them.

§ Other Recommendations

I have, over time, received recommendations from other people for additional sources of DRM-free ebooks.  Anything in this section is from those recommendations.  I haven’t used these sources extensively, if at all, so take these with however many grains of salt you need.

The FreeEBOOKS subreddit is more focused on ebooks that don’t cost anything, as opposed to ebooks without DRM, and they don’t restrict themselves by ereader compatibility.  Nevertheless, many of the free books they link to are available as DRM-free EPUBs.

Verso Books is an independent publisher primarily focused on politically left-oriented content.  Books are DRM-free but watermarked.

Smashwords is an ebook store that focuses on self-published authors.  Their official position on DRM is that they think it’s a bad idea but the decision of whether to use it is up to books’ authors and publishers, not them.  But, as of 2023, none of their books have DRM and they say if they add DRM-encumbered books at some point, such books would be clearly labeled as such.

Leanpub is a combination storefront and publisher.  They don’t use DRM on any of the books they publish and sell.

§ Non-EPUB Books

My preference for ebooks is for the EPUB format.  It’s an open standard, has broad compatibility, and is adaptable to a variety of readers and environments.  But there are some sources for books that use other formats, most often PDF.  PDF isn’t great as an ebook format because it presupposes a page size and that page size is quite often either A4 or US letter paper.  Most ereaders have smaller screens than that, which means the text is either small and annoying to read or you have to zoom in and pan around to read everything.  But sometimes a PDF is the best option for a particular book.

The Internet Archive Text Archive has scans of millions of books.  Everything they’ve scanned is available in a web browser where you can page through the scanned images.  If the book is still covered by copyright, you have to create an account and check the book out in order to read it.  Checkouts last for an hour and they can be renewed.  This whole system is pretty convenient in my experience, especially for doing research, when you don’t necessarily need a book for longer than it takes to look up and read a section, take any needed notes, and check the section’s cross-references.

Some Internet Archive books are also available in EPUB, PDF, and other formats.  In those cases, you can download the files and do whatever you like with them.

An alternate entry point to the Internet Archive Text Archive is the Internet Archive Open Library.  It facilitates finding books in all sorts of libraries, but the Internet Archive Text Archive is one of the sources checked.  The site tends to be better for finding either physical or browser-readable books, rather than ereader-compatible books, though.

Wikibooks uses a wiki as a platform for collaborative authoring of books.  Most if not all of the books on Wikibooks are nonfiction reference material.  If you read the books on their website, you’ll always get the most up-to-date text, but many books can be downloaded in PDF format.  A handful are also available in EPUB.

§ Dishonorable Mention

O’Reilly, a publisher of high-quality computer-related and technical books, used to sell DRM-free copies of their books.  If you’ve previously bought any of those, you can still access them through members.oreilly.com, but you can’t buy new copies, as far as I can tell.  O’Reilly seems to be moving instead to subscription-based access to their ebooks, while still selling the physical versions.  They still have O’Reilly Open Books, which links to all of the books they’ve published under open licenses of various sorts, but very few of them are available as EPUBs.  Most of the open book links go to webpage versions of the books, which aren’t as easy to get into an ebook reader as a premade EPUB is.