Ways to read and navigate documents

Consultation documents are hardly ‘light reading’, so we thought it might be useful to round up the ways you can more easily read and navigate documents on the WriteToReply site. We’ll use the BBC’s Project Canvas document as our example, not least because we want to encourage you to read the consultation before the deadline on 17th April. We’ve previously written about how you can use the RSS news feeds to read both documents and comments on your own Netvibes or Pageflakes dashboard. There’s a lot more we could write about the use of feeds, and later in this post we provide a new example.

Table of Contents

Obviously, the first thing you see on a WriteToReply document site is the Table of Contents. No real innovation here, except that you can see how many comments have been made, which might indicate the ‘hot spots’ in the text. If you’ve not got a lot of time, you could go directly to those sections.

Most Commented

Even more useful is the Most Commented box on the right of the page. This ranks the hot spots so you can immediately see what’s of most interest to other people.
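
For the curious, the ranking idea behind a box like this can be sketched in a few lines of Python. This is only an illustration, not the code that runs the site: the `Section` type, the `most_commented` function, and the section titles and comment counts are all made up for the example.

```python
from typing import NamedTuple


class Section(NamedTuple):
    title: str
    comment_count: int


def most_commented(sections, limit=5):
    """Return up to `limit` sections that have comments, busiest first."""
    commented = [s for s in sections if s.comment_count > 0]
    return sorted(commented, key=lambda s: s.comment_count, reverse=True)[:limit]


# Hypothetical titles and counts, for illustration only.
sections = [
    Section("Introduction", 2),
    Section("Section A", 11),
    Section("Section B", 0),
    Section("Section C", 7),
]

for s in most_commented(sections, limit=3):
    print(f"{s.comment_count:>3}  {s.title}")
```

Sections with no comments are left out, so the list only ever shows genuine hot spots.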


It’s worth remembering that you can search across the entire document for keywords and phrases. Enclose specific phrases in double quotes.

Coming Soon – Serialised Daily Subscriptions

We’re testing a brand new feature that will allow you to have one section of the document delivered to you each day. Rather than receiving the entire document at once and having it nag at you in your feed reader, you’ll receive a single section each day, making the document easier to find time for and digest. Just subscribe to the ‘Serialised Feed’ as you would any other feed and look forward to tomorrow’s instalment!
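
As a rough sketch of the idea (not the feature's actual implementation), a serialised feed only needs to know when you subscribed and which section falls due today. The `section_for_day` function, the section names and the dates below are hypothetical, and we assume here that the first section arrives on the day you subscribe.

```python
from datetime import date


def section_for_day(sections, subscribed_on, today):
    """Return the section due on `today`, or None outside the serial run."""
    day_index = (today - subscribed_on).days
    if 0 <= day_index < len(sections):
        return sections[day_index]
    return None


# Hypothetical section names and dates, for illustration only.
sections = ["Part 1", "Part 2", "Part 3"]
subscribed_on = date(2024, 1, 1)

print(section_for_day(sections, subscribed_on, date(2024, 1, 2)))
```

Once every section has been delivered the function returns `None`, which a feed could treat as "nothing new today".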

Hyperlinked Word Cloud

We’ve pointed out how you can navigate comments via our CommentCloud, and we thought it might also be a useful way of navigating the document itself. You’ll see a link to the WordCloud in every document sidebar.
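
If you are wondering how a word cloud decides which words to show, the usual approach is simply to count how often each word appears and ignore very common ones. The sketch below assumes that approach; the `word_weights` function, the stop-word list and the sample sentence are our own inventions for illustration and are not taken from the site or from the Project Canvas document.

```python
import re
from collections import Counter

# A deliberately tiny stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "and", "will", "be", "of", "to", "a", "in", "for"}


def word_weights(text, top_n=10):
    """Return the `top_n` most frequent words as (word, count) pairs."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return counts.most_common(top_n)


# Made-up sample text, for illustration only.
sample = (
    "The platform will be open. "
    "The platform will support open standards and open access."
)

for word, count in word_weights(sample, top_n=2):
    print(word, count)
```

In a cloud, each count would then be mapped to a font size, and each word linked to the places where it appears.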

Semantic Tagging

We’ve recently started using OpenCalais, a semantic technology that analyses the document and automatically produces tags from names, facts and events. It produces a far greater number of relevant tags than we would normally include and offers a useful, visual way of mining the content of the text.

The Calais Web Service automatically creates rich semantic metadata for the content you submit – in well under a second. Using natural language processing, machine learning and other methods, Calais analyzes your document and finds the entities within it. But, Calais goes well beyond classic entity identification and returns the facts and events hidden within your text as well.


We also produce an eBook of every document we re-publish so that it can be read offline on a variety of eBook readers. Again, you’ll find a link in the sidebar of every document.


If you’re one of the millions of iPhone, iPod Touch or Google Android owners, each document is specially formatted for your phone. The sections currently appear in reverse order, which we should be able to fix with a bit of free time. In the meantime, just navigate to the first section and bookmark it.

Embedded version

Finally, we also provide an embedded version of the original PDF, which we host on Scribd. This allows you to read the document full screen and print it for reading on the train 😉