Why Word is Bad for the Web

Every so often, you might see text on the web that appears to be corrupted in some way. It's full of odd foreign letters to the point where it's almost unreadable, and it takes ages to load. Believe it or not, the culprit is very often a program that many people use every day: Microsoft Word, the world's most popular word processor. Word might be perfectly good for producing documents to print and email to people, but it is bad for the web.

The Quote Problem.

All those foreign letters you see in that text were originally nothing more than an attempt to make documents a tiny bit nicer to look at. You see, the design of the keyboard comes from the age of typewriters, and the symbols on it are the ones a typewriter could produce. We're stuck with our keyboard designs, but they were never meant to account for all the extra letters and characters included in modern fonts. This led to the quote problem.

What's the quote problem? Well, to answer that, take a look at your keyboard. Notice how there's only one kind of double-quote mark: the straight one. Worse, when you want a single quote, you have to use the same key as for apostrophes! Now, if you were writing on paper, you'd put differently shaped quotes at the start and end of a quote, instead of just making straight lines. Altogether, things that would be represented by five different marks on paper (opening and closing double quotes, opening and closing single quotes, and the apostrophe) only get two symbols on the keyboard.

Long ago, Microsoft decided to solve this problem. First, they set up Word to look for quote marks and replace them with nicer, curly quotes, known as 'smart quotes'. Then, they took some character codes that the standard Latin character set reserves for control codes rather than printable characters (hey, what could anyone ever want those for?) and decided that they would represent these new, pretty quotes.

Everything was fine until, years later, people started copying text they'd written in Word and pasting it onto the web. Because Microsoft didn't stick to any international standard when they chose how to represent their smart quotes, the quotes ended up displaying as all sorts of unintended strange letters in web browsers. Word's users never meant to do this, but Word had gone ahead and done it for them, because the smart quotes feature is turned on by default!

Not so smart after all, was it?
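
To see the garbling for yourself, here is a minimal Python sketch. It assumes the curly quotes were stored in Windows-1252, the code page Windows uses for Western European text, and the names straighten_quotes and sample are made up for illustration. It shows the single bytes that code page uses for curly quotes, what a reader gets when text is decoded with the wrong encoding, and one simple way to swap the curly marks back for plain keyboard quotes.

```python
# Map each curly mark to the plain keyboard character it replaced.
SMART_TO_PLAIN = str.maketrans({
    "\u2018": "'",   # opening single quote
    "\u2019": "'",   # closing single quote / apostrophe
    "\u201c": '"',   # opening double quote
    "\u201d": '"',   # closing double quote
})


def straighten_quotes(text: str) -> str:
    """Hypothetical helper: replace curly quotes with straight ones."""
    return text.translate(SMART_TO_PLAIN)


sample = "\u201cIt\u2019s fine,\u201d she said."

# Windows-1252 stores each curly mark as one byte in the 0x80-0x9F range.
word_bytes = sample.encode("cp1252")

# A reader expecting UTF-8 cannot decode those lone bytes at all.
misread_as_utf8 = word_bytes.decode("utf-8", errors="replace")

# UTF-8 text misread as Windows-1252 turns one apostrophe into three characters.
misread_as_cp1252 = "it\u2019s".encode("utf-8").decode("cp1252")

print(word_bytes[:1])             # b'\x93'
print(repr(misread_as_utf8))      # each curly mark becomes '\ufffd'
print(misread_as_cp1252)          # it’s
print(straighten_quotes(sample))  # "It's fine," she said.
```

Straightening the quotes loses the nicer typography, so treat it as a fallback. Where you control the page, declaring the correct character encoding is usually the better fix.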

Terrible HTML.

Of course, there's more to all this. When Microsoft finally caught on that the web was going to be big, they quickly added web features to Word, not least the ability to save documents as HTML. Unfortunately for the rest of the world, though, Microsoft again failed to stick to the standards. They made up their own HTML tags to represent the layout of Word documents, purely to make sure that the documents would still look the same if people opened them in Word again and saved them in another format. These proprietary tags now pollute HTML documents all over the web, simply because the people who created the pages by saving as HTML in Word don't know enough to remove them, and the extra markup makes pages load much more slowly.

Worse, even if you do remove all the Word-specific tags from the documents, the leftover HTML is still a nightmare. Presumably Microsoft decided to re-use the HTML generation engine from FrontPage, with the same kinds of results – a complete and utter mess.
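
If you are stuck with a page that was saved from Word, a rough first pass at cleaning it might look like the Python sketch below. It is an illustration under assumptions, not a complete cleaner: strip_word_markup is a made-up name, and the patterns only target a few kinds of markup commonly found in Word's HTML output (conditional comments mentioning mso, namespaced tags such as o:p, class names starting with Mso, and style attributes containing mso- properties). Regular expressions are a blunt tool for HTML, so a real page may need a proper parser.

```python
import re

# Conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
CONDITIONAL = re.compile(r"<!--\[if.*?<!\[endif\]-->", re.DOTALL)
# Namespaced tags such as <o:p> and </o:p>; any text between them is kept.
NAMESPACED_TAG = re.compile(r"</?[a-zA-Z0-9]+:[^>]*>")
# class attributes whose value starts with "Mso", quoted or unquoted.
MSO_CLASS = re.compile(r"""\s+class=("Mso[^"]*"|'Mso[^']*'|Mso\w*)""")
# style attributes that contain Word-only "mso-" properties.
MSO_STYLE = re.compile(r"""\s+style=("[^"]*mso-[^"]*"|'[^']*mso-[^']*')""")


def strip_word_markup(html: str) -> str:
    """Hypothetical helper: remove some common Word-only markup."""
    for pattern in (CONDITIONAL, NAMESPACED_TAG, MSO_CLASS, MSO_STYLE):
        html = pattern.sub("", html)
    return html


dirty = (
    "<p class=MsoNormal style='mso-margin-top-alt:auto'>"
    "Hello<o:p></o:p></p>"
    "<!--[if gte mso 9]><xml>settings</xml><![endif]-->"
)
print(strip_word_markup(dirty))  # <p>Hello</p>
```

Note that this drops a whole style attribute when it contains any mso- property, including any ordinary CSS declared alongside it. That is a deliberate simplification for the sketch.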

Smart Tags.

Do you think it ends there? Amazingly, it doesn't. For their latest versions of Word, Microsoft decided it'd be great to add something they called 'smart tags', a kind of 'link' that adds contextual information to things you type. For example, if you type an address in your document, Word offers a link from that address through to a map. Useful? Very rarely.

The problem comes when documents containing smart tags are saved as HTML – the tags are saved too! This means that documents all over the web have odd text linked to completely frivolous places, simply because Word thought it looked like an address. Not only do these links take ages to load correctly, but they're ugly too.
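
Smart tags that have already leaked into a page can be unwrapped so that only the plain text remains. The Python sketch below assumes the tags use a namespace prefix of the form st1:, which is the prefix usually seen in Word's saved HTML. The function name unwrap_smart_tags and the sample markup are made up for illustration.

```python
import re

# Matches an element like <st1:City w:st="on">Paris</st1:City>.
# The "st1" style prefix is an assumption about how the page was saved.
SMART_TAG = re.compile(r"<(st\d+:\w+)\b[^>]*>(.*?)</\1>", re.DOTALL)


def unwrap_smart_tags(html: str) -> str:
    """Hypothetical helper: keep the text inside smart tags, drop the tags."""
    previous = None
    # Repeat until nothing changes, so nested smart tags are unwrapped too.
    while previous != html:
        previous = html
        html = SMART_TAG.sub(r"\2", html)
    return html


page = (
    '<p>Meet me in <st1:place w:st="on">'
    '<st1:City w:st="on">Paris</st1:City></st1:place>.</p>'
)
print(unwrap_smart_tags(page))  # <p>Meet me in Paris.</p>
```

The cleaner fix is to switch smart tags off in Word before saving, so they never reach the page in the first place.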

What might Microsoft Word unleash upon the web next? We can only wait in fear.