Unglue.it Website is now Open Source

As part of our shift to operation as a community-supported 501(c)(3) not-for-profit organization, we’ve opened up the source code to the Unglue.it web application and website. You can now report issues, help us fix bugs, or run your own version of unglue.it from the git repository on GitHub. (You can’t use the name unglue.it without our permission; the name is a trademark of the Free Ebook Foundation.)

Unglue.it is a Django application written in Python running with a MySQL backend on Amazon Web Services. We use Vagrant to build production and test servers; we use a Jenkins instance for continuous integration and testing.

In the coming weeks and months, we’ll be adding our development roadmap to GitHub, and we’ll mark issues that are suitable to be worked on by volunteers. The main focus of Unglue.it has shifted from crowdfunding for free ebooks to the cataloguing and distribution of free ebooks, but this isn’t so obvious from the website design and documentation. We started Unglue.it before practices such as responsive design matured; we want to make it work much better on mobile.

We’re particularly happy with the work we’ve done to make free books available via APIs: any facet or list on the website can be accessed as an ONIX, MARC, or OPDS feed, and there are also facilities to push ebooks via FTP to other sites. Code that imports ebooks from other sources (ONIX, MARC, OAI-PMH) has been more work because metadata is always messy.

Other areas of our code show the signs of disruptions long past, particularly the payment module, which was designed for PayPal, redesigned for Amazon Payments, then redesigned again for Stripe. Not something we’d wish on anyone, but it works!

The trickiest part of opening up the source code has been password hygiene. We had to comb through the entire git history (over 6,000 commits!) to find and deactivate passwords, accounts and secret keys that had been put into the repo. To allow us to continue using the open repo without exposing secrets, we’re using Ansible Vault to encrypt all the secrets. A master key to the vault decrypts the vault during the server configuration process; this master key never leaves the secure environment of the admin’s computer.
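For anyone facing the same cleanup, here is a rough sketch of how the combing could be automated. It is not the tooling we used; the two patterns, the function names, and the idea of scanning `git log -p --all` output are illustrative assumptions, and a real audit would need many more rules plus a human reading every hit.

```python
import re
import subprocess

# Illustrative patterns only (an assumption, not a complete rule set).
SECRET_PATTERNS = {
    # AWS access key IDs are "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # Quoted values assigned to names containing password, secret, token, etc.
    "assigned_secret": re.compile(
        r"""(?i)\b\w*(?:password|passwd|secret|api_key|token)\w*\s*[:=]\s*['"][^'"]{6,}['"]"""
    ),
}


def find_secret_candidates(text):
    """Return (line_number, rule_name, stripped_line) for suspicious lines."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, name, line.strip()))
    return hits


def scan_history(repo_path):
    """Scan every patch in a repository's history (hypothetical helper)."""
    log = subprocess.run(
        ["git", "-C", repo_path, "log", "-p", "--all"],
        capture_output=True, encoding="utf-8", errors="replace", check=True,
    )
    return find_secret_candidates(log.stdout)
```

Anything the scan turns up still has to be deactivated at the service that issued it; deleting a line from the current tree does nothing for the copies sitting in old commits.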

There isn’t a master key to building a strong community around a project for the public benefit. Luckily, we can get some pointers by reading Karl Fogel’s open-licensed book “Producing Open Source Software”, a new version (2.0) of which is available on Unglue.it!

DOAB and Project Gutenberg books in Unglue.it

Slow and steady. That’s how we’ve been improving Unglue.it, turning it into a better place to find free ebooks. A lot of that work has been invisible; our new APIs are being used by organizations like New York Public Library to offer ebooks that deliver value without draining acquisition budgets. We’ve also installed tools that ebook creators will be able to use to better understand how their ebooks are being used. We’ve improved our data model to support relationships between works. So for example, when Peter Suber’s book on Open Access is translated into another language, links between the works are displayed on the unglue.it page. Similarly, Richard Herley’s The Stone Arrow is linked to its sequel, The Flint Lord. And have you noticed that author names are clickable?

Our biggest effort over the last year has been the expansion of our database of free ebooks. Two big sources are worth noting:

  • Directory of Open Access Books (DOAB). DOAB has been tracking books written by academics and published with peer-review, often by university presses. Any book that’s in DOAB now has a page in Unglue.it, and it’s labeled as such. We’ve added a DOAB facet so you can restrict your browsing to books from DOAB. You can use the DOAB label as a mark of quality and know that a book is being relied upon by scholars, scientists, and researchers.
  • Project Gutenberg. Project Gutenberg is the oldest and largest collection of public domain ebooks. Through GITenberg, we’ve been exploring ways to make this collection more discoverable and maintainable. So far, we’ve loaded about 5,000 ebooks from GITenberg into Unglue.it. GITenberg allows programmatic access to the ebooks, unlike Project Gutenberg, so Unglue.it can do things like send them to your Kindle. You can use GitHub to suggest improvements to these books, and to their metadata. And we’ve added a Project Gutenberg facet to help you browse these books.

For both DOAB and Project Gutenberg, your Unglue.it “Faves” help us rank the books, and help other ungluers (and our library partners) know which of them to pay more attention to.

We have a lot of improvements to make. Don’t hesitate to make suggestions, either in the comments here or by email to unglue.it support. Another way you can support Unglue.it is to put our featured ebook widget on your website.

Free eBooks by ISBN

After reflecting on the coming demise of xISBN, we decided to add an endpoint for free ebooks to the unglue.it API.

The API documentation is at https://unglue.it/api/help

With an API key, you can check if there’s a free ebook for any ISBN. ISBNs can be 10 or 13 digits, and can include dashes. This service returns all free-licensed ebooks for a work associated with an ISBN, and for each ebook includes information about file type, rights, and the provider hosting the file.
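Since the service accepts all of these forms, no client-side cleanup is required. If you would still like to catch typos before spending a request, a minimal sketch of ISBN normalization follows. The function names are hypothetical and the check-digit arithmetic is the standard ISBN-10 and ISBN-13 scheme, not code taken from Unglue.it.

```python
import re


def _isbn13_check_digit(first12):
    """Compute the ISBN-13 check digit (weights alternate 1, 3, 1, 3, ...)."""
    total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(first12))
    return str((10 - total % 10) % 10)


def normalize_isbn(raw):
    """Return a 13-digit ISBN string, or None if `raw` is not a valid ISBN."""
    chars = raw.replace("-", "").replace(" ", "").upper()
    if re.fullmatch(r"[0-9]{9}[0-9X]", chars):
        # ISBN-10: weights run 10 down to 1, and "X" stands for the value 10.
        values = [10 if c == "X" else int(c) for c in chars]
        if sum(v * w for v, w in zip(values, range(10, 0, -1))) % 11 != 0:
            return None
        first12 = "978" + chars[:9]
        return first12 + _isbn13_check_digit(first12)
    if re.fullmatch(r"[0-9]{13}", chars):
        return chars if _isbn13_check_digit(chars[:12]) == chars[12] else None
    return None
```

For instance, `normalize_isbn("0-7653-3369-4")` and `normalize_isbn("978-0-7653-3369-8")` both give `"9780765333698"`, while a string with a wrong check digit gives `None`.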

For example, here’s how to get a list of ebook files for “Homeland”.

JSON: https://unglue.it/api/v1/free/?isbn=9780765333698&format=json&api_key={your_api_key}&username={your_username}

 {
  "meta": {"total_count": 3},
  "objects": [
   {"filetype": "pdf", "href": "/download_ebook/2576/", "provider": "Internet Archive", "rights": "CC BY-NC-ND"},
   {"filetype": "epub", "href": "/download_ebook/2577/", "provider": "Internet Archive", "rights": "CC BY-NC-ND"},
   {"filetype": "mobi", "href": "/download_ebook/2578/", "provider": "Internet Archive", "rights": "CC BY-NC-ND"}
  ]
 }
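Here is a minimal sketch of calling the endpoint from Python and picking out download links. It uses only the standard library. The helper names are made up for illustration, and it assumes that the response is a JSON object with "meta" and "objects" keys and that each "href" is relative to https://unglue.it; check both against the API documentation before relying on them.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen

SITE = "https://unglue.it"  # assumed base for the relative "href" values


def free_ebooks_url(isbn, api_key, username, fmt="json"):
    """Build the request URL from the query parameters in the example above."""
    query = urlencode(
        {"isbn": isbn, "format": fmt, "api_key": api_key, "username": username}
    )
    return f"{SITE}/api/v1/free/?{query}"


def download_links(payload, filetype=None):
    """List (filetype, absolute URL, rights) for each ebook in a parsed response."""
    return [
        (book["filetype"], urljoin(SITE, book["href"]), book["rights"])
        for book in payload.get("objects", [])
        if filetype is None or book["filetype"] == filetype
    ]


def fetch_free_ebooks(isbn, api_key, username):
    """Call the endpoint and return the parsed JSON (needs network and a real key)."""
    with urlopen(free_ebooks_url(isbn, api_key, username)) as response:
        return json.load(response)
```

With a real key, `download_links(fetch_free_ebooks("9780765333698", key, user), "epub")` would give just the EPUB entries, each with its license, so you can decide what you are allowed to do with the file.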

XML: https://unglue.it/api/v1/free/?isbn=9780765333698&format=xml&api_key={your_api_key}&username={your_username}

 <objects type="list">
  <provider>Internet Archive</provider>
  <rights>CC BY-NC-ND</rights>
  <provider>Internet Archive</provider>
  <rights>CC BY-NC-ND</rights>
  <provider>Internet Archive</provider>
  <rights>CC BY-NC-ND</rights>
 </objects>
 <meta type="hash"><total_count type="integer">3</total_count></meta>

We’ll soon be integrating GITenberg ebooks into this feed, too.

Unglue.it Goes Non-Profit

Since its beginning 4 years ago, Unglue.it has been a part of Gluejar, Inc., a privately held for-profit company. We initially thought Unglue.it would be mostly about crowdfunding books into the public commons. While unglue.it has always put a public benefit at the center of its mission, the for-profit status made sense for a crowdfunding business. Over the past two years, Unglue.it has shifted into the nuts and bolts of distributing and promoting freely-licensed ebooks, because we realized how dysfunctional the commercial ebook supply chain had become. The for-profit status made less and less sense.

Over the last year, we’ve also started working on GITenberg, an effort to improve the ebooks in Project Gutenberg. To our great surprise and pleasure, we got grant funding for this work from the Knight Foundation, and fiscal sponsorship from the Miami Foundation. Suddenly, our eyes opened to the realization that we would be better able to continue our work as part of a non-profit entity.

So a bunch of us have created the Free Ebook Foundation. It will be the corporate home for both Unglue.it and GITenberg. There might even be some new projects. We’re really excited about it.

There’s a lot to do in setting up a non-profit. We’ve applied for charitable tax status (it usually takes several months to receive it). We’ll be creating new accounts for the Foundation and transferring over licenses, subscriptions, and assets. We hope to have everything switched over by the end of the year. Unglue.it users should not notice much difference.

Q. Will Unglue.it continue to be developed?

A. Yes! The combination with GITenberg gives us more resources to work on Unglue.it. Expect new distribution agreements to be announced soon.

Q. Can I donate to the Free Ebook Foundation?

A. Not just yet. When we receive confirmation of our tax status, we’ll start offering ways for you to support us on the website.

Q. Will the software running Unglue.it be released as open source?

A. We expect that most of the software will be released under appropriate open source licenses.

Q. Will Unglue.it continue to run crowd-funding campaigns?

A. Yes, to the extent that doing so is consistent with its tax status.

Q. Will Gluejar, Inc. continue to exist?

A. Yes. Eric Hellman will continue his patent and privacy consulting as businesses of Gluejar. He has helped companies invalidate bad patents through the Inter Partes Review process, and has begun helping libraries identify privacy leakages in their digital services.

Q. The logo is ugly. Could I design you a better one?

A. Oooooh please!

Unglue.it joins GITenberg

Unglue.it has been of two minds about public domain ebooks. On the one hand, we recognize that the public domain contains the greatest literary works ever produced, and ebooks of these works need to be on any serious reader’s ebook shelf. On the other hand, there are plenty of web sites already focused on the public domain; Project Gutenberg is the granddaddy of them all. Other websites (Manybooks.net and Feedbooks.com, to call out the best) have done a pretty good job of taking public domain books and making them easy to find and download. Some sites are based in countries where books enter the public domain sooner than in the US. You can get books like “The Great Gatsby” on Feedbooks or ebooks@Adelaide, even while they’re under copyright here in the US.

To be frank, the websites focusing on public domain books haven’t met many of the needs of libraries. I can search for Huckleberry Finn on many library catalogs and be told that the only ebook held by the library is checked out. There’s a good reason for that. Many of the ebooks in Project Gutenberg aren’t formatted so well for epub or mobi, despite the high quality of the plain text digitization. With 50,000 texts in Project Gutenberg, it’s hard to tell which ones are top quality and which ones would cause support problems for overworked librarians. We loaded a few hundred titles from Project Gutenberg into Unglue.it to see what happened.

As you might expect, these classics accumulated a lot of faves, and so we’d occasionally go and clean up some ebook files. Since we use GitHub to manage our website code, the natural thing to do was to put the cleaned-up ebook files in GitHub, in case someone else wanted to use them – there was no obvious way to get them into Project Gutenberg itself. I thought it would be cool if more of Project Gutenberg was in GitHub.

Then I discovered GITenberg. Back in 2012, Seth Woodworth, an ebook technologist, wanted nicer ebook editions of classics from Project Gutenberg. And GitHub was the obvious platform for collaboration. So he created a GitHub organization, named it “GITenberg”, and created thousands of GitHub repositories for Gutenberg texts. It was a no-brainer for Unglue.it to join the effort.


When I heard about the Knight News Challenge for Libraries, I suggested to Seth that GITenberg might have a chance. Working together, we wrote up a proposal, adding some library spin.

There were 676 entrants in the News Challenge, and believe it or not, GITenberg was one of 22 entries to receive funding. The team has been awarded a $35,000 “Prototype Grant”, which will allow us to spend some real development time to start turning the idea into something that really works. More to the point, we have a deadline (in late June!) for demonstrating the GITenberg concept.

But aside from 45,000+ repos on GitHub (a significant achievement by itself), GITenberg has so far been more concept than reality. If you try to adopt a repo and submit a pull request, you’ll become aware that the GITenberg of today is more of a sketch than a working system. To make it a working system, we’ll have to assemble a lot of cooperating components. Thankfully most of the components we need exist, and people are working on them. This became very clear at the Hack Day sponsored by New York Public Library in January.

So what does this all mean for Unglue.it?

The obvious benefit is that the quality of public domain ebooks in unglue.it will get a big boost if GITenberg succeeds – the work in GITenberg will be 100% free and open, and Unglue.it will be making sure that all that data flow really works. But in the bigger picture, the machinery that gets built for GITenberg will offer solutions for free ebooks in general. New ways to collaborate around free and open metadata are something that Unglue.it really needs if it is to become the comprehensive database for freely licensed ebooks that we’ve been striving towards.