
Wednesday, December 9, 2009

Dash it all, Holmes!


Translated into English, that means the proper way to put hyphens in words. You know, the little line in between syllables.

A hyphenation algorithm is a set of rules (used by people or a computer program) that decides at which points a word can be split using a hyphen. For example, a hyphenation algorithm might decide how the word "impeachment" can be broken.

but not

This is handy for line justification and line wrapping, which are cosmetic processes applied to a paragraph to keep the right margin smooth. Notice the next paragraph's appearance.

The rules of word-breaking are very complex, and there are many exceptions (much to the irritation of editors). In the simplest terms, you should break only multi-syllable words, and only between syllables. Of course, it gets much more complicated when you care about how the line fits in a paragraph or on a page. One of the most important rules is NOT to break words across columns or page breaks.
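For the curious, here is a minimal Python sketch of the simplest kind of hyphenation algorithm: a lookup in a dictionary of allowed break points. The dictionary entries and the function names (split_options, fit_word) are made up for illustration, and it only handles lowercase words. Real systems use far larger dictionaries or pattern-based rules.

```python
# Hypothetical exception dictionary: "-" marks each allowed break point.
# A real system would hold thousands of entries or use pattern rules.
BREAK_POINTS = {
    "hyphenation": "hy-phen-ation",
    "algorithm": "al-go-rithm",
}


def split_options(word):
    """Return every (head, tail) pair the dictionary allows for `word`.

    The head already carries its trailing hyphen. Unknown words return an
    empty list, meaning "do not break this word".
    """
    marked = BREAK_POINTS.get(word)
    if marked is None:
        return []
    pieces = marked.split("-")
    options = []
    for i in range(1, len(pieces)):
        head = "".join(pieces[:i]) + "-"
        tail = "".join(pieces[i:])
        options.append((head, tail))
    return options


def fit_word(word, room):
    """Pick the longest head (hyphen included) that fits in `room` characters."""
    best = None
    for head, tail in split_options(word):
        if len(head) <= room:
            best = (head, tail)  # later options are longer, so keep the last fit
    return best


print(split_options("hyphenation"))
print(fit_word("hyphenation", 7))
```

If only seven characters remain on the line, fit_word picks the break after "hyphen"; with fewer than three characters of room it returns None and the whole word moves to the next line.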

An important math rule that overrides hyphenation dictionaries says you never separate math characters, currency symbols, etc. from the numbers preceding or following them.

As you can see here, $100
is not the same as $-
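A computer can enforce this rule with a simple check before it breaks a line. The sketch below is only an illustration: the function name break_allowed and the list of "glued" symbols are my own assumptions, and it tests this one rule and nothing else (it knows nothing about syllables).

```python
# Symbols that must stay attached to a neighbouring number.
# This list is illustrative, not exhaustive.
GLUED_SYMBOLS = set("$€£%+-=×÷")


def break_allowed(text, i):
    """Return True if `text` may be split between text[i-1] and text[i]."""
    if i <= 0 or i >= len(text):
        return False  # there is nothing to split at either end
    left, right = text[i - 1], text[i]
    symbol_then_digit = left in GLUED_SYMBOLS and right.isdigit()
    digit_then_symbol = left.isdigit() and right in GLUED_SYMBOLS
    return not (symbol_then_digit or digit_then_symbol)


print(break_allowed("$100", 1))  # the split would fall between "$" and "1"
print(break_allowed("50%", 2))   # the split would fall between "0" and "%"
```

Both calls report that the break is not allowed, so the dollar sign and the percent sign stay with their numbers.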


Here's the Excel Math "house rule" on hyphenation and justification for Lesson Sheets:  DON'T. EVER.

Please don't confuse the hyphenation algorithm with the rules for inserting hyphens when you create compound words, such as mother-of-pearl.

Since computers have taken over most hyphenation tasks, there are new computer-variant characters, such as the soft hyphen or non-breaking hyphen. These trigger special instructions in the hyphenation algorithm.

Strictly speaking, the hyphen is NOT the same as the minus sign (which is longer than a hyphen) or any of the dashes.

When a publisher or its computer system must distinguish between these characters (which seem so very much alike), we rely on Unicode, a standard that assigns each character its own code number.

Unicode provides a unique number for every character, no matter what the computer platform, no matter what the application program, no matter what the human language being represented.

A Unicode code number is shown in this form: a U, a +, and at least four hexadecimal digits (0-9 and A-F):

hyphen U+2010
non-breaking hyphen U+2011
minus U+2212
figure dash U+2012 (also used as minus sign)
en dash U+2013
em dash U+2014
horizontal bar U+2015 (longer than em dash)
macron U+00AF (short, high overline)
overline U+203E (over other characters)
underscore U+005F (forces words apart visually while keeping them joined)
underline U+0332 (under other characters)
short stroke overlay U+0335 (through other characters, with inter-character gaps)
continuous strikethrough U+0336 (through other characters with no gaps)
diagonal stroke overlay U+0337 (through other characters, diagonally)
soft (invisible) hyphen  U+00AD (invisible but indicates a preferred word break point)
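If you would like to check a few of these yourself, Python's standard unicodedata module can report the official name for a character. This is just a small sketch; the helper name describe and the choice of which characters to list are mine.

```python
import unicodedata

# A few of the look-alike characters, written as Unicode escapes.
LOOKALIKES = ["\u2010", "\u2212", "\u2013", "\u2014", "\u00ad"]


def describe(ch):
    """Format a character as 'U+XXXX OFFICIAL NAME'."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in LOOKALIKES:
    print(describe(ch))

# They may look alike on screen, but the computer sees different characters.
print("\u2010" == "\u2212")
```

The final line prints False: to the computer, a hyphen and a minus sign are two distinct characters, however similar they look in print.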

I'm going to stop here with the special characters. Isn't this a pain in the neck ... er, really fun stuff?

If you want more Unicode, you can go to the Unicode Consortium's website.

Math meanings can be dramatically changed by hyphen placement. An example? Try this:
  • three-hundred-year-old trees means some trees that are 300 years old.
  • three hundred-year-old trees means 3 trees that are 100 years old each.
  • three hundred year-old trees means 300 trees that are 1 year old each.
Writing word problems about big old trees is a tricky business!
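To see how mechanical the difference is, here is a toy Python sketch that reads the three tree phrases above and works out how many trees there are and how old each one is. Everything in it (the tiny number-word table, the function names, the assumption that the phrase contains "year-old") is invented for this example and would not cope with real-world text.

```python
# Tiny number-word table: just enough for the tree examples.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "hundred": 100}


def words_to_number(words):
    """Turn a list like ["three", "hundred"] into 300."""
    total = 0
    for w in words:
        if w == "hundred":
            total = max(total, 1) * 100  # a bare "hundred" counts as 100
        else:
            total += NUMBER_WORDS[w]
    return total


def read_trees(phrase):
    """Return (count, age_in_years); count is None if the phrase gives none."""
    tokens = phrase.lower().split()
    # The token ending in "year-old" carries the age; words before it are the count.
    idx = next(i for i, t in enumerate(tokens) if t.endswith("year-old"))
    count_words = tokens[:idx]
    age_words = tokens[idx].split("-")[:-2]  # drop "year" and "old"
    count = words_to_number(count_words) if count_words else None
    age = words_to_number(age_words) if age_words else 1
    return count, age


print(read_trees("three-hundred-year-old trees"))  # → (None, 300)
print(read_trees("three hundred-year-old trees"))  # → (3, 100)
print(read_trees("three hundred year-old trees"))  # → (300, 1)
```

The hyphens decide which number words belong to the age and which belong to the count, which is exactly why moving one hyphen changes the answer to the word problem.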
