The Insane New Article Parser

Read It Later 3.0 will be made up of six major releases.  Today, I’m releasing the first: the third-generation Text View (now named the Article View), which lands alongside version 2.3 of Read It Later for iPhone/iPad.

The New Article View

The new Article View is, without a doubt, the most advanced and accurate text parser on the market today.

Distilled. The new Article View is the cleanest text parser available.  It’s highly skilled in presenting just the article in its purest form, with none of the extraneous page content you’ll see in similar parsers.

Images. View all the images from the article, including any captions and even photographers’ credits.

Videos. Includes all embedded videos within the article.

Meta. Includes information about the article like date published and authors.

Fast. It’s fast.  Really fast.  Even with images, downloading is faster than ever.

Author Bylines

The Article View allows great content to be read the way it was meant to be: with a focus solely on the words themselves.  But writing great content is hard work, which is why I’ve worked hard to ensure the new Article Parser maintains credits to authors and photographers.

Other text parsers rarely make any effort to preserve attribution to the author of the content.  (When they do, the byline is surrounded by junk content from the header of the page because the parser failed to grab the content accurately.)  The connection an author has with their readership is of the utmost importance, and the new parser does everything it can to preserve that.  Authors are always attributed and their names are linked to their bio (when available).

No Longer the ‘Text View’

You’ll notice that I am no longer referring to the Article View as the ‘text view’.  The first reason is simply that the Article View now offers images and video, not just text.  The second reason is that calling it the ‘text view’ leads to some incorrect assumptions about what it is supposed to do.  While working on the new parser, I went over more than 2,000 reports from users about text views that didn’t work.  A lot of these were for non-article content like bus schedules, homepages, PDFs, etc.  For non-article content, you should be using Read It Later’s incredibly accurate full-web view/downloader.  (And this choice just got a lot easier.)

Technology Behind the Parser

You’ll find that the parser works on an amazing number of sites.  It’s built on natural language detection and therefore works with some of the ugliest code and layouts imaginable.  It does not rely on site-specific rules (e.g. “if nytimes, use this section”).  I also will never ask a publisher to use proprietary tags or modify their site’s layout to make it work with my parser.  There are very few instances where Read It Later’s Article Parser should have a problem finding an article, but in the rare case that it does, the parser is to blame, not the publisher.
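To give a rough idea of what finding an article without site-specific rules can look like, here is a toy sketch in Python.  It is not the Article Parser’s actual code, and it is far simpler than real natural language detection.  It assumes one common heuristic: blocks that read like prose (many words, sentence punctuation, few links) are more likely to be the article than navigation or footers.  The names BlockCollector, score and extract_article_text, along with the scoring weights, are made up for this illustration.

```python
from html.parser import HTMLParser

# Tags treated as block boundaries (an assumption made for this sketch).
BLOCK_TAGS = {"p", "div", "td", "li", "article", "section"}


class BlockCollector(HTMLParser):
    """Collects text per block-level element and counts how much of it is link text.

    Simplification: script and style contents are not filtered out.
    """

    def __init__(self):
        super().__init__()
        self.blocks = [["", 0]]  # each entry: [text, link_char_count]
        self.in_link = 0

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.blocks.append(["", 0])
        elif tag == "a":
            self.in_link += 1

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.blocks.append(["", 0])
        elif tag == "a" and self.in_link:
            self.in_link -= 1

    def handle_data(self, data):
        block = self.blocks[-1]
        block[0] += data
        if self.in_link:
            block[1] += len(data)


def score(text, link_chars):
    """Higher scores mean the block looks more like prose (hypothetical weights)."""
    text = " ".join(text.split())
    if not text:
        return 0.0
    words = len(text.split())
    punctuation = sum(text.count(c) for c in ".!?,")
    link_density = min(1.0, link_chars / len(text))
    return (words + 2 * punctuation) * (1.0 - link_density)


def extract_article_text(html):
    """Returns the whitespace-normalized text of the most prose-like block."""
    collector = BlockCollector()
    collector.feed(html)
    collector.close()
    best = max(collector.blocks, key=lambda b: score(b[0], b[1]))
    return " ".join(best[0].split())


SAMPLE = """
<div><a href="/">Home</a> <a href="/news">News</a> <a href="/about">About</a></div>
<p>The city council voted on Tuesday to extend the bus schedule. Riders had asked for later service, and the vote passed easily.</p>
<div><a href="/terms">Terms</a> | <a href="/privacy">Privacy</a></div>
"""

if __name__ == "__main__":
    print(extract_article_text(SAMPLE))
```

Nothing in the sketch mentions a particular site; the navigation and footer blocks lose simply because they are almost entirely link text.  A real parser has to do much more (merge neighboring paragraphs, keep images and bylines, cope with malformed markup), but the principle of judging content by how it reads rather than where it sits is the same.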

Where To Find the New Article Parser

iPhone/iPad Users:

All new items downloaded after upgrading to version 2.3 will use the new parser.  To get the new Article View for any of your previously downloaded text views, you’ll need to redownload them.  (See How-To Redownload.)

While text views in Read It Later Free will utilize the new Article Parser for text, viewing images and embedded media is a Pro-only feature.  If you haven’t already upgraded to Pro, here is what else you are missing.

Web/Digest Users:

The web interface will utilize the new Article Parser for text, but viewing images and embedded media is a Digest-only feature.  To view images and embedded media on the web, you’ll need Read It Later Digest.  You can learn about Digest here.

Browser Extension Users:

The updated Article Parser will be coming to your extensions very soon.

Major Article View Updates in Version 2.3 of RIL for iOS

Version 2.3 of Read It Later for iOS is available today, and it brings a number of new features centered around the new Article View.  Read about the new features in version 2.3 here.