Monday, December 11, 2023

My learnings from the BUB2 tool

Welcome back, wiki friends! It's been a while since I last shared my learnings with the Wikimedia community. Today's post is about a tool I've been developing and maintaining for a long time now: Book Uploader Bot 2 (BUB2).

What is BUB2?

BUB2 is a tool for the Wikisource community that uploads books, publications, newspapers, gazettes, etc. from public libraries such as Google Books, Panjab Digital Library, and Trove Digital Library to the Internet Archive. The process is completely automated: the user just enters the URL (or book ID), and the upload happens asynchronously.
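
To make the "enter a URL or book ID, upload happens asynchronously" idea concrete, here is a minimal sketch of that flow. It is not BUB2's actual code: the function names (`extract_book_id`, `submit`, `upload_worker`), the assumption that the book ID sits in an `id` query parameter, and the in-memory queue are all hypothetical, and the real download and upload step is replaced by a placeholder.

```python
import queue
import threading
from urllib.parse import parse_qs, urlparse


def extract_book_id(user_input: str) -> str:
    """Return a book ID from either a bare ID or a URL.

    Assumption for this sketch: URLs carry the ID in an `id` query
    parameter, e.g. https://books.google.com/books?id=abc123
    """
    text = user_input.strip()
    if "://" not in text:
        return text  # no scheme, so treat the input as a bare ID
    ids = parse_qs(urlparse(text).query).get("id")
    if not ids:
        raise ValueError(f"no 'id' parameter found in {text!r}")
    return ids[0]


def submit(jobs: queue.Queue, user_input: str) -> str:
    """Queue an upload job and return immediately with the book ID."""
    book_id = extract_book_id(user_input)
    jobs.put(book_id)
    return book_id


def upload_worker(jobs: queue.Queue, results: list) -> None:
    """Process queued jobs in the background until a None sentinel arrives."""
    while True:
        book_id = jobs.get()
        if book_id is None:
            break
        # Placeholder: a real worker would download the book and
        # upload it to the Internet Archive here.
        results.append((book_id, "uploaded"))


if __name__ == "__main__":
    jobs = queue.Queue()
    results = []
    worker = threading.Thread(target=upload_worker, args=(jobs, results))
    worker.start()
    submit(jobs, "https://books.google.com/books?id=abc123")  # hypothetical ID
    jobs.put(None)  # tell the worker to stop once the queue is drained
    worker.join()
    print(results)
```

The point of the queue is that `submit` returns right away, so the person requesting the upload does not have to wait while the (potentially slow) transfer runs in the background.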

Motivation behind BUB2

Wikisource is a growing free-content library of source texts. It aims to source, digitize, and centralize texts in all languages for the public to use freely. In line with that mission, volunteers from all around the world contribute to growing this library. One source of these texts is online digital texts that are in the public domain. Volunteers often had to use multiple websites to get a book uploaded to the Internet Archive, and then from the Internet Archive to Wikisource, which involved a lot of manual work.
I quickly recognized this pain point and, being a developer, wanted to solve it for the community. So I decided to automate the process and developed BUB2, which has helped solve this problem for the masses.

Learnings from the tool

  1. Solve for the masses: While talking and working with Wikisource volunteers, I quickly realized that source texts are language-dependent. Each language community has its own digital library, and if we want to solve for the masses, we have to support as many libraries as possible. That's why we prioritized Google Books, which has books in many languages, and Panjab Digital Library for Punjabi Wikisource. Recently, we added support for Trove for Australian source texts.
  2. Many hands make light work: I quickly realized that BUB2 is a never-ending project. With new libraries, direct integration with the Wikimedia ecosystem, and much more to build, I would need more helping hands. So I opened up the project to open-source programs such as GirlScript Summer of Code and Hacktoberfest, and more recently I've been mentoring at Outreachy for the Wikimedia Foundation. My aim is to spread awareness about the Wikimedia mission and to bring in more long-term contributors who keep contributing to BUB2 and, subsequently, to other projects as well.
  3. Metadata is important: The metadata for a source text is just as important as the text itself. It includes the author, publisher, publication date, etc., and I realized that without proper metadata, an item could easily be flagged as vandalism. So while developing the tool I kept metadata a high priority, because I knew that in the future all of it would become direct Wikidata entries. :)
  4. Minimalistic design: The initial design for BUB2 was quick and dirty. Since I had to release the tool early, I didn't focus much on the design. But as we decided to implement more and more features, the design started becoming a bottleneck. I decided to do a design overhaul and followed the Wikimedia Codex design guidelines to redesign the BUB2 tool, which made it much easier to implement the long-term goals.

Future of BUB2 tool

There are a lot of cool things planned for the BUB2 tool in the future to improve the overall productivity of Wikisource volunteers. Highlighting some of them below:
  1. Integration with the Wikimedia ecosystem: Direct upload to Wikimedia Commons, automated entries in Wikidata, and creation of a Wikisource index page, all with just one click!
  2. Adding support for regional libraries: Addition of more regional libraries to support different Wikisource communities
  3. Real-time updates of the upload via email: Real-time updates of each stage to the contributor
  4. Upload via Internet Archive: Upload other files from the Internet Archive to Wikimedia Commons, thereby replacing IA Upload.
Not to mention, I'll be working hard to make BUB2 win the Coolest Tool of the Year! 😎

Contributing to BUB2

If you're a developer and are interested in contributing to BUB2, feel free to check out the GitHub repository. It has all the instructions, from setting up to contributing. You can check out the list of issues on Phabricator. Just assign yourself one and get started on the roller coaster ride. :)
