What is srt3?

srt3 is a simple Python library for parsing, modifying, and composing SRT files.
Feb 13, 2022
7/12/2021

What are Captions?

Closed captioning and subtitling are both processes of displaying text on a video screen or display to provide additional or interpretive information to the viewer. This is beneficial for those who are hard of hearing or who do not speak the language a video uses. Viewers are not the only people who benefit from captions: studies show that using captions improves creator metrics such as watch time. Captions and subtitles commonly use the SubRip file format (.srt), which is what this program modifies.
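For readers who have not seen a SubRip file, here is a minimal sketch of the format and how it can be read. Each caption is a block made of an index, a `start --> end` timing line, and the text, with blocks separated by a blank line. The sample text and the helper names (`parse_timestamp`, `parse_blocks`) are made up for illustration and are not srt3's API.

```python
import re
from datetime import timedelta

# Hypothetical two-caption SubRip file.
SAMPLE = """\
1
00:00:01,000 --> 00:00:03,500
First caption line

2
00:00:04,000 --> 00:00:06,250
Second caption line
"""

# SubRip timestamps look like HH:MM:SS,mmm (comma before milliseconds).
TIMESTAMP = re.compile(r"(\d+):(\d{2}):(\d{2}),(\d{3})")


def parse_timestamp(text):
    """Convert 'HH:MM:SS,mmm' into a timedelta."""
    hours, minutes, secs, millis = map(int, TIMESTAMP.fullmatch(text.strip()).groups())
    return timedelta(hours=hours, minutes=minutes, seconds=secs, milliseconds=millis)


def parse_blocks(srt_text):
    """Return (index, start, end, content) tuples, one per caption block."""
    parsed = []
    for block in srt_text.strip().split("\n\n"):
        lines = block.splitlines()
        start_text, end_text = lines[1].split(" --> ")
        content = "\n".join(lines[2:])
        parsed.append((int(lines[0]), parse_timestamp(start_text), parse_timestamp(end_text), content))
    return parsed


blocks = parse_blocks(SAMPLE)
```

This sketch assumes well-formed input; a real parser has to tolerate missing indexes, extra blank lines, and other quirks.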

After I released the Yung Quant EP, I wished to add captions to all of my videos. Unfortunately, the available tools at the time were tedious to use. Adobe Premiere Pro’s captioning tools prior to v15 did not allow you to copy, paste, or remove captions en masse. Automated captioning was underdeveloped for rhythmic sounds, which resulted in more work than using a manual method. You also could not (and still can’t) remove captions while also readjusting the subsequent ones.

Without the ability to readjust captions in bulk, making simple translations between srt files becomes tedious. For example, the difference between the GME Music Video and GME Audio is the intro and the skit that occurs in the middle. Without srt3, you would have to modify each caption one by one, which doubles the work needed to caption a single video (and its respective audio). With srt3, I can run a copy operation on repeated sections (such as a chorus), and then run a remove by insert operation to translate the Music Video captions into the Audio captions.

The Goal of srt3

srt3 is a library and command line interface tool that makes the captioning process easier through the following operations: add, remove, copy (in between/to files), split. These operations have two modes that determine whether subsequent captions are modified (“by insert”) or not. This allows the user to parse, modify, and compose SRT files.
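The difference between the two modes can be sketched as follows. This is an illustration under stated assumptions, not srt3's actual implementation: the `Caption` class, the `remove` signature, and the `by_insert` flag are hypothetical names, and the sketch only looks at where a caption starts (captions that straddle the window boundary are ignored).

```python
from dataclasses import dataclass, replace
from datetime import timedelta


@dataclass(frozen=True)
class Caption:
    # Hypothetical stand-in for a parsed caption.
    start: timedelta
    end: timedelta
    content: str


def remove(captions, start, end, by_insert=False):
    """Drop captions that begin inside [start, end).

    With by_insert=True, captions that begin at or after `end` are also
    moved earlier by the length of the removed window.
    """
    shift = end - start
    result = []
    for caption in captions:
        if start <= caption.start < end:
            continue  # caption begins inside the removed window
        if by_insert and caption.start >= end:
            caption = replace(caption, start=caption.start - shift, end=caption.end - shift)
        result.append(caption)
    return result


def seconds(n):
    return timedelta(seconds=n)


captions = [
    Caption(seconds(0), seconds(2), "intro"),
    Caption(seconds(5), seconds(7), "skit"),
    Caption(seconds(10), seconds(12), "chorus"),
]

# Plain remove: the chorus keeps its original timing.
plain = remove(captions, seconds(5), seconds(10))
# Remove by insert: the chorus moves up to fill the gap left by the skit.
shifted = remove(captions, seconds(5), seconds(10), by_insert=True)
```

In the plain case the chorus still starts at 10 seconds; in the by insert case it starts at 5 seconds, which is the behavior needed to turn the Music Video captions into the Audio captions.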

How does srt3 work?

You can learn about the technical details of srt3 by reading the documentation or by viewing the project on GitHub. srt3 is written in Python. Unit tests and integration tests are automated using tox in combination with other testing, coverage, and linting libraries. The documentation is generated using Sphinx. srt3 is available on PyPI.

Project Timeline

I started working on a library that modified SubRip files in April of 2021. After about a week, I realized that the goal I had for the project (which would become srt3) was much more monumental than I imagined. Instead of reinventing the wheel, I decided to search GitHub for open source projects that I could contribute to (if necessary). Unfortunately, none of the projects I found had what I needed, but the original srt project provided enough of a framework to contribute to.

On April 19th, I committed the first of many operations that would be added to the srt tools library: srt-remove. I created a pull request for this commit, and waited for feedback while optimizing the code in the process. On May 4th, I committed this optimization and alerted the maintainer (Chris Down). Unfortunately, the maintainer found the code over-optimized and didn’t see a reason for the srt-remove functionality. The maintainer was correct about the code being over-optimized, as it used a binary search, O(log n), when a linear pass with generators, O(n), was sufficient. As a result, I re-optimized the code (justified by profiling) while also providing its use-case.
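The trade-off can be illustrated with a small sketch. This is not the code from the pull request: the function names and sample data are hypothetical, and it only shows two ways to find the first caption that starts at or after a given timestamp.

```python
from bisect import bisect_left
from datetime import timedelta

# Hypothetical caption start times, already sorted.
starts = [timedelta(seconds=s) for s in (0, 5, 10, 15, 20)]


def first_at_or_after_linear(starts, timestamp):
    """O(n): scan lazily with a generator and stop at the first match."""
    return next((i for i, s in enumerate(starts) if s >= timestamp), len(starts))


def first_at_or_after_bisect(starts, timestamp):
    """O(log n): binary search, valid only because `starts` is sorted."""
    return bisect_left(starts, timestamp)


target = timedelta(seconds=7)
linear_index = first_at_or_after_linear(starts, target)
bisect_index = first_at_or_after_bisect(starts, target)
```

Both functions return the same index. The binary search is faster on large sorted inputs, but it requires sorted, indexable data, while the generator version works on any stream of captions and is easier to read. For subtitle files with a few hundred captions, the difference is rarely measurable.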

srt3 split edge case diagram
Don’t forget your edge cases.

On May 12th, Chris commented on the re-optimized code, which was patched accordingly. In this patch, two things occurred:

  1. A misunderstanding between what I was creating (a tool to remove captions based on time), and what he thought I was creating (a tool to remove captions based on content).
  2. I stated that a review would need to be made within three weeks if he wanted me to add the copy functionality. This was the result of a time constraint on my end rather than a demand: I wouldn’t be able to finish the functionality (on his time) if he waited more than three weeks. However, it was perceived as a demand.

This is where the problems began. On May 19th, I pointed out the misunderstanding with a commit that added unit tests. Chris replied stating that he did not have time to review (but did have time to comment) and that the use-case I needed wasn’t needed. Then this happened…

Huh? lines-matching is not supposed to have this functionality at all. Let me just point you at this. 🙂

Chris Down (cdown)

It may not have been the right thing for me to do, but as soon as he started mentioning (rather arbitrary) UNIX conventions, I decided then and there that I’d be doing a bit of trolling. I hit him back with this, a true bozo-blaster. The fact that you could not use the srt tools along with the library was insidious, according to the conventions he provided.

Chris stated that he would not include the remove functionality because it was “at odds with how srt is designed”. Essentially, he didn’t see a use-case for it, which meant there wasn’t one. As a result, he wouldn’t even review the code. At this point, I knew it was over. I had one final commit to make, but it had to be convincing. It had to draw him in. A few hours later, I came up with this masterpiece. Keep in mind that Chris maintains systemd, which is an init system and service manager for Linux.

srt pull request comment
I don’t use Arch, by the way.

I got blocked, so I had to fork the project. I opened a ticket (as you can't fork while blocked) and gave the representative a full 5 stars, because her forking skills were impeccable. The trolling was over. It was time for the real work to begin: from May 20 to July 12, 2021, I refactored the entire library by merging the srt package with the command line interface srt-tools while maintaining modularity. I was also able to add the following operations: add, remove, find, match [process], paste (copy), and split. The entire timeline can be viewed using Git Commit History.

To avoid another cdown situation, I decided to add Contribution Guidelines, which make the project easy to contribute to. The srt-tools GitHub documentation was upgraded along with the Read the Docs website using Sphinx with reStructuredText (not Markdown). Finally, I added the full package to PyPI, allowing other programmers to `pip install -U srt3`: srt3 v1.0.0 was completed. The next day, all my videos were captioned, along with some of my collaborators’. The new workflow was complete.

Outcome

srt3 was the first tool I created in order to improve my content creation process. Adobe Premiere Pro 2021 v15 introduced a captioning workflow that allowed the modification of captions in bulk, along with other machine learning features. While this has covered many use-cases I solve using srt3, it didn't make the library irrelevant in my workflow. As srt3 grows in popularity, more people use it as a library to create other srt functionalities. In addition, libraries that use the original srt library can be upgraded for enhanced use. One example is an internal subsync fork that assists with captioning audio that has been vocal-stretched.
