Team blog: Developers

Fun with Node 8.9.0 and jsdoc

lib.reviews is powered by Node.js. This post is very much about the internals, so only read on if you care. :-)

We’ve just upgraded the site to a major new release of Node: 8.9.0. Many excellent blog posts have been written about the new features in this series of Node. Personally, I’m most excited about async/await.

In modern web development, you’re often dealing with two kinds of operations: synchronous ones (executed immediately, blocking other code until they finish) and asynchronous ones (effectively running “in the background”).

For example, when you run an expensive database query, you don’t want it to keep other visitors of the site waiting—it should run in the background. But the application needs to know when the query is finished. Promises are one way to organize such asynchronous execution sequences.

Unfortunately, as you deal with more complex sequences of events, using only promises can also make code increasingly difficult to read. Here’s an example from the sync-all script, which is run every 24 hours to fetch information from sites like Wikidata and Open Library:

Thing
  .filterNotStaleOrDeleted()
  .then(things => {
    things.forEach(thing => thing.setURLs(thing.urls));
    let updates = things.map(thing => 
      limit(() => thing.updateActiveSyncs())
    );
    Promise
      .all(updates)
      .then(_updatedThings => {
        console.log('All updates complete.');
      });
  });

What’s going on here? We’re getting a list of all “Things” (review subjects), excluding old and deleted revisions. Then, for each thing, we reset the settings for updating information from external websites. We build an array of asynchronously run promises which contact external websites like Wikidata and Open Library. The limit() call throttles these requests to two at a time.

The main readability problem is the increasing nesting. If you add .catch() blocks to Promises, it can be even more difficult to follow what’s going on, and to make sure all your brackets are in the right place.

Here’s what this sequence looks like with async/await:

  const things = await Thing.filterNotStaleOrDeleted();
  things.forEach(thing => thing.setURLs(thing.urls));
  await Promise.all(
    things.map(thing => 
      limit(() => thing.updateActiveSyncs())
    )
  );

It’s a lot easier to see what’s going on. And this isn’t even accounting for the greater simplicity of success/error handling. Under the hood, async/await works with Promises, and there are many situations where using Promises directly is fine (note even the second version uses Promise.all). But for more complex operations, it really makes a difference.
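To make the error handling point concrete, here is a minimal, self-contained sketch of the same sequence wrapped in a single try/catch. Note that `createLimit` is a simplified stand-in I wrote for illustration, not the throttling helper the real script uses, and `syncAll`, `getThings` and the returned result object are likewise hypothetical names rather than actual lib.reviews code.

```javascript
// Hypothetical stand-in for the throttling helper: runs at most `max`
// of the queued functions at the same time.
function createLimit(max) {
  let active = 0;
  const queue = [];
  const next = () => {
    if (active >= max || queue.length === 0) return;
    active++;
    const { fn, resolve, reject } = queue.shift();
    Promise.resolve()
      .then(fn)
      .then(resolve, reject)
      .then(() => {
        // Runs after success or failure, freeing a slot for the next task.
        active--;
        next();
      });
  };
  return fn => new Promise((resolve, reject) => {
    queue.push({ fn, resolve, reject });
    next();
  });
}

// Same shape as the sequence above; one try/catch covers both the
// initial query and every individual update.
async function syncAll(getThings, limit) {
  try {
    const things = await getThings();
    things.forEach(thing => thing.setURLs(thing.urls));
    await Promise.all(
      things.map(thing => limit(() => thing.updateActiveSyncs()))
    );
    return { ok: true, count: things.length };
  } catch (error) {
    return { ok: false, message: error.message };
  }
}
```

With plain promises, the equivalent would need a `.catch()` on both the outer and the inner chain (or careful returning of the inner promise). A real script would probably log the error and exit rather than return an object; that part is only there to keep the sketch easy to try out.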

While I’m at it, I’m adding standardized documentation in jsdoc format to modules as I go. Essentially, these are code comments in a special syntax that can be used to generate HTML output. You can find the generated result here; it will be updated every 24 hours from the currently deployed codebase.
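For readers who haven’t seen jsdoc before, a comment in that style looks roughly like this. The `truncate` function is made up for illustration and is not taken from the lib.reviews codebase.

```javascript
/**
 * Shorten a string to a maximum length, appending an ellipsis if it was cut.
 *
 * @param {String} text - the string to shorten
 * @param {Number} [maxLength=80] - maximum length of the result,
 *  including the ellipsis
 * @returns {String} the original string, or a shortened copy ending in "…"
 */
function truncate(text, maxLength = 80) {
  if (text.length <= maxLength) return text;
  return text.slice(0, maxLength - 1) + '…';
}
```

The `@param` and `@returns` tags are what the generator turns into the parameter tables and return value descriptions in the HTML output.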


New screencast is up

A lot has happened since the last screencast, from October 2016: we got full-text search, integration with Open Library and Wikidata, a rich-text editor, and other goodies. Here’s an updated screencast (YouTube version) that gives a brief overview:


Open Library autocomplete search is live

On the heels of basic Open Library support announced last week, we now have an autocomplete search box for book titles as well. This has necessitated a bit of a redesign of the relevant part of the review form. Here’s what it looks like to perform an Open Library search:

Open Library search box

What’s new here is the dropdown that lets you choose between Wikidata and Open Library. In future, other sources like OpenStreetMap may make an appearance here as well.

The actual search is, I think, quite a bit nicer than the title search on OpenLibrary.org itself. If you search on OpenLibrary.org directly, you won’t get an autocomplete match for titles like “the wealt” that would match the title “The Wealth of Nations”. In our case this works just fine. Our search is also not sensitive to the word order, often producing more results at the cost of some irrelevant ones.

Unlike OpenLibrary.org’s search, our search attempts a match against both the stemmed version of the words you enter (e.g., “dog” will match both “dog” and “dogs”) and against the wildcard version (“dog” will also match “dogcatcher”). To do this we have to fire off two requests per query. I’ve put some notes together on OL’s GitHub repository in case there’s interest in building on these improvements for the native search.
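The two-requests-per-query idea can be sketched as follows. This is an illustration under assumptions, not our actual implementation: `fetchResults` is a hypothetical function that performs one search request and resolves to an array of `{ key, title }` objects, and the trailing `*` as wildcard syntax is likewise an assumption.

```javascript
// `fetchResults(queryString)` is a hypothetical function that performs one
// search request and resolves to an array of { key, title } objects.
async function searchStemmedAndWildcard(query, fetchResults) {
  // Fire both requests in parallel and wait for both to finish.
  const [stemmed, wildcard] = await Promise.all([
    fetchResults(query),       // stemmed match: "dog" finds "dog" and "dogs"
    fetchResults(`${query}*`)  // assumed wildcard syntax: "dog" finds "dogcatcher"
  ]);
  // Merge both result lists, dropping duplicates by key.
  const seen = new Set();
  const merged = [];
  for (const result of [...stemmed, ...wildcard]) {
    if (!seen.has(result.key)) {
      seen.add(result.key);
      merged.push(result);
    }
  }
  return merged;
}
```

Listing the stemmed results first is a design choice in this sketch; a real autocomplete might well prefer a different ranking.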

Finally, this search has a little extra feature: you can narrow search results by author by adding an author’s name (partial or full) separated with “;” after the title. This is a bit more obscure than something like “author:”, but I figured it’s nice to have a shortcut for something you may want to do very frequently—and it’s documented in the help that’s shown next to the input.
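Splitting such an input into its title and author parts could look like the sketch below. The function name and the shape of the returned object are hypothetical; only the “;” separator comes from the actual feature.

```javascript
// Split "title; author" input into its two parts. The author part is
// optional and may be a partial name.
function parseBookQuery(input) {
  const separatorIndex = input.indexOf(';');
  if (separatorIndex === -1) {
    return { title: input.trim(), author: undefined };
  }
  const title = input.slice(0, separatorIndex).trim();
  const author = input.slice(separatorIndex + 1).trim();
  // Treat a dangling ";" with nothing after it as "no author given".
  return { title, author: author || undefined };
}
```

For instance, `parseBookQuery('the wealth; smith')` yields the title `'the wealth'` and the author `'smith'`, while input without a “;” yields only a title.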


Introducing basic support for Open Library metadata

In addition to descriptions and labels from Wikidata, we now also extract authors, titles and subtitles from Open Library URLs. If you haven’t heard of it, Open Library is a fabulous project by the Internet Archive that’s both a structured wiki with data about books, and an actual library.

After making a simple user account, you can “check out” up to 5 books at a time, chosen from those of which the Archive holds a physical copy. You can either read them online or download (DRM-protected) PDF files. As of now, the number of books available stands at a staggering 522,358.

We use Open Library as a free catalog that doesn’t have the onerous licensing terms of WorldCat. In future, we’ll add more metadata fields like publication year, number of pages, and so on. For now, when you add an Open Library URL to an existing review subject page, the result is something like this:

Open Library imported data

Editions vs. Works

Open Library distinguishes between “editions” and “works”. A work encompasses all translations and releases of a book, while an edition is a specific one. Information like the number of pages and the publication year obviously varies widely across editions, which is why we won’t include it until we have a concept of “editions” on our side.

We’ll likely want to generalize that concept, since it is applicable in other domains as well: the different versions of a movie, the generations of a product, and so on. This is tricky stuff—at what point is a product so different that it merits its own top level record?

Right now, we’re mushing information together if you provide multiple Open Library URLs for the same item. Modelling out the relations between things without adding too much complexity will be one of the biggest challenges in the future.

Search

We don’t have an Open Library powered search box on the “New review” page yet, as we do for Wikidata. Adding the search box is relatively straightforward, though the search results can be a bit frustrating due to word stemming rules that don’t play well with autocomplete search. Nonetheless, some search is better than no search, so we’ll add that in the near future.


Support for more links related to a review subject

Until recently, we only showed you one link for every review subject (like a movie or book)—the one that was added alongside the first review. We now have an interface for managing links. If you’re logged in with a trusted user account (i.e. you have written at least one sane review), you’ll see a “Manage links” button on review subject pages (say, “New Internationalist”). If you click it, you’ll see this interface:

Manage links interface

As always, this should work just fine without JavaScript if that’s your thing; you’ll just have to make do without the “Add more” button and submit the form several times if you want to add more than a couple of links.

The “primary” link here is typically the official website of a movie, book, or product if one exists. Additional links can go to databases, review sites, or anything else that seems appropriate. The software will attempt to automatically classify links to commonly used sites like Wikidata, IMDb, OpenStreetMap, etc. On the page, it will look like this (different example):
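As a rough idea of how such classification can work, here is a minimal sketch that matches a link’s hostname against a small table of known sites. The table, the identifiers and the function name are all hypothetical and much simpler than what a production rule set would need.

```javascript
// Hypothetical rule table: one hostname pattern per known site.
const KNOWN_SITES = [
  { id: 'wikidata', pattern: /(^|\.)wikidata\.org$/ },
  { id: 'imdb', pattern: /(^|\.)imdb\.com$/ },
  { id: 'openstreetmap', pattern: /(^|\.)openstreetmap\.org$/ },
  { id: 'openlibrary', pattern: /(^|\.)openlibrary\.org$/ }
];

// Returns the id of the first matching site, 'other' for unknown hosts,
// or 'invalid' if the string cannot be parsed as a URL.
// Note: URL is a global in current Node releases; on Node 8 it would have
// to be loaded from the built-in 'url' module instead.
function classifyLink(link) {
  let hostname;
  try {
    hostname = new URL(link).hostname.toLowerCase();
  } catch (error) {
    return 'invalid';
  }
  const match = KNOWN_SITES.find(site => site.pattern.test(hostname));
  return match ? match.id : 'other';
}
```

Matching on the parsed hostname rather than on the raw string avoids false positives such as a link to some other site that merely contains “imdb.com” in its path.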

Metadata example

As before, if one of the links goes to Wikidata, we automatically extract and index summaries from there. In future we’ll add additional interfaces to more sites to pull over metadata where appropriate.

There’s one catch: any link can be used for exactly one review subject. That should usually not be a problem, but there may be cases like “product guides” that are applicable to multiple review subjects. We may create targeted features for some of those cases down the road.


You can now edit descriptions, or keep them in sync with Wikidata

When you have a review subject like Les Misérables, it’s good to have additional metadata. Are you talking about the book by Victor Hugo? One of the many film and TV adaptations? The musical? In future we’re planning to add metadata that’s useful for specific domains (e.g., opening times for restaurants, or publisher info for books), but for now, we’ve added support for a short text description.

Wikidata, which we support as a source for selecting review subjects, already has these descriptions for many items. For example, it describes the book “Nutshell” (reviews) as the “17th novel by English author and screenwriter Ian McEwan”. When you review an item via Wikidata, this description (in all available languages) is automatically imported. We’ve now also added these descriptions to all reviews that had a Wikidata URL associated with them, and they will be updated automatically every 24 hours.

An automatically updated description looks like this:

Screenshot of automatically imported description

For reviews that are not associated with a Wikidata item, you can add or edit the description after writing a review.


Translating our FAQ and Terms of Use

In addition to the user interface, it is now also possible to translate our FAQ and Terms of Use via translatewiki.net. This is done through the “page translation” interface. To get started, see the “Content pages” section on this page. Note that you have to be a translatewiki.net translator — see our introductory blog post for more information. Confused? Find us in our chatroom and we’ll try to help.

For these long texts, we will manually import translations when they are completed, and less frequently synchronize smaller updates.

Big thanks to Nemo bis for helping to get this set up, and as always, thanks to all translators. :-)


Mailing list is back up

Our discussion list is online again. To participate, simply subscribe.

If you’re unfamiliar with mailing lists, think of it like a web forum, but through email. You can also post through the web interface if you prefer.

We intend to use the list primarily for long-form discussion about policies, community development, and so on.

Why was the list down? We migrated from the Mailman 2 series to the Mailman 3 series of the software. It adds features like the web interface for posting, searchable archives, social logins, and much more, making the experience much more comparable to modern forum software.


Review anything from Wikidata

You can now review anything that has an entry on Wikidata, Wikimedia’s universal database of concepts. That means almost anything that has a Wikipedia entry, as well as lots of things that don’t — e.g., scientific papers, or individual episodes of webcomics. All you need to do is select the “Search Wikidata” function when starting a new review, like so:

Example of a Wikidata search

We’re excluding some Wikimedia-specific stuff from the search results — if you get a “no relevant results” message, that’s why.

Of course, when you select a Wikidata item, you don’t have to manually add a label or description. They’ll be imported from Wikidata in all supported languages. In future we’ll import additional properties as well, and keep them in sync with the Wikidata source.
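To illustrate the import step: Wikidata’s `wbgetentities` API returns labels and descriptions keyed by language code, each as an object with a `value` property. The sketch below reduces one such entity to plain per-language strings. The function name and the list of supported languages are hypothetical, and the sample entity in the usage note is hand-written rather than a real API response.

```javascript
// Reduce a Wikidata entity (in the shape returned by wbgetentities) to
// plain { languageCode: text } maps for labels and descriptions.
// Languages missing from the entity are simply left out.
function extractMultilingual(entity, supportedLanguages) {
  const pick = field => {
    const result = {};
    for (const language of supportedLanguages) {
      const entry = (entity[field] || {})[language];
      if (entry && entry.value) result[language] = entry.value;
    }
    return result;
  };
  return { label: pick('labels'), description: pick('descriptions') };
}
```

Given an entity with an English label “Nutshell”, an English description and a German label, calling `extractMultilingual(entity, ['en', 'de', 'fr'])` would produce a label map with `en` and `de` entries and a description map with only an `en` entry.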

If you don’t have JavaScript enabled, the search feature won’t show up, but Wikidata URLs will still be treated magically: labels and descriptions will be fetched on our end.

This feature is brand new, and there are certainly going to be some issues with it. If you find one, please report it. :-)


Labeling things

When you review something by picking a URL, you have to give a name to the thing you review — the title of a book, the name of a product, and so on. Before today, you had to do this in an extra step after writing a review. Now you can do it from the “New review” form using the “label” field. It will be automatically completed for you if we already know something about that URL, like so:

Screenshot of URL completion

In future, we’ll support finding things to review by picking them from sources like Wikidata and OpenStreetMap, and may also do other things to fetch information automatically from known sites like Amazon.com. For now, this should already take a little friction out of the process.

Note that this change requires JavaScript — without JavaScript, you’ll get the same user experience as before the change.

