Creating a small extension
Over the past years I have created, used and reused a lot of code. If the code is not intended to be public, most of the time you can get away with something that just works ™ and make some concessions on quality and performance. Recently the battery of my laptop has degraded enough that anything that takes a jab at the CPU drains it faster than it should. In my daily usage, something I noticed had a big effect on it was watching YouTube videos in a browser, and since I usually only listen to the audio and don’t even care about the video, I decided to look for an extension to help me. I found basically only 2 that claimed to do what I wanted: one was not even working, the other was but had some issues, so I decided to look at the code and see if I could make it more suitable for my needs.
The code I was looking at is here; it’s very clean and to the point and, more importantly, it works. If you are not familiar with how YouTube delivers its content I’ll leave some resources at the end of this post, but the tl;dr is as follows:
YT's video page > ytInitialPlayerResponse loads > gets available streams (they could be encrypted or not) > player gets the resources > play
The available streams differ from video to video, but you usually get a lot of video-only streams, a few audio-only streams and even fewer video+audio streams; the player later decides what to play and how to play it depending on your network connection and whatnot. Some streams have a URL that you can just play or download, but others don’t: those have a CIPHER instead that needs to be decrypted in order to get the stream URL (that’s what yt-dlp does). The streams are identified by itag numbers: if you know the number you know what you’ll get, so if you need the audio-only streams you would look for those itags. The extension in this case takes a different route: it goes through all the URLs coming from googlevideo(dot)com (that’s where the actual videos are stored), and if a URI contains “mime=audio” you got yourself an audio stream that completely bypassed the need to be deciphered. Very clever.
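To make the “mime=audio” check concrete, here is a minimal sketch of how such a test could look. It is not the original extension’s code: the function name `isAudioStreamUrl` is made up for illustration, and it assumes the stream host is `googlevideo.com` (or a subdomain) and that the MIME type travels in a `mime` query parameter.

```javascript
// Hypothetical helper: decides whether a requested URL looks like an
// audio-only stream, based on its host and its "mime" query parameter.
function isAudioStreamUrl(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a parseable absolute URL
  }

  // Assumption: streams are served from googlevideo.com or one of its subdomains.
  const host = url.hostname;
  const isStreamHost =
    host === "googlevideo.com" || host.endsWith(".googlevideo.com");

  // searchParams.get() decodes the value, so "audio%2Fwebm" becomes "audio/webm".
  const mime = url.searchParams.get("mime") || "";

  return isStreamHost && mime.startsWith("audio");
}
```

Parsing the URL instead of doing a plain substring search avoids matching “mime=audio” when it happens to appear in some unrelated part of the address.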
I only had two main issues with it:
- The extension uses a browser action to enable/disable its functionality; I’d rather have something handier, like a toggle.
- If you get more than one “mime=audio” URL (on average you get four), the stream resets its playback to accommodate the newly found URL. Under good network conditions you’ll probably get one playback reset, if any, but on a low-quality network you may get a lot more. For me this was annoying.
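One way to avoid the repeated resets is to stick with the first audio URL found for a given video and ignore the ones that show up later. The sketch below only illustrates that idea and is not necessarily how any real extension does it; `pickAudioUrl`, `chosenAudioUrl` and the `videoId` key are hypothetical names.

```javascript
// Remembers the first audio URL seen for each video (hypothetical bookkeeping).
const chosenAudioUrl = new Map();

// Returns the URL the player should use and whether it changed.
// Only the first candidate for a video is accepted; later ones are ignored,
// so the caller has no reason to reset playback again.
function pickAudioUrl(videoId, candidateUrl) {
  if (!chosenAudioUrl.has(videoId)) {
    chosenAudioUrl.set(videoId, candidateUrl);
    return { url: candidateUrl, changed: true };
  }
  return { url: chosenAudioUrl.get(videoId), changed: false };
}
```

The caller would swap the player source only when `changed` is true. The trade-off is that the first URL found is not necessarily the best-quality one.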
You can look at my code here and see how all that and more was addressed; the code is heavily commented.
Now to the actual reason for this post. I originally intended this to be just for me, but then I decided to make it public since it could be useful to others, especially now that Mozilla will allow the use of extensions on mobile Firefox on December 14. After making very limited changes to the original code I mostly got what I wanted, but after deciding to make the code public and publish the extension to Mozilla’s Firefox add-ons store, here’s what I had to add to it:
- Migrate to manifest v3: this allows for some future-proofing and easy porting to Chrome, but it also adds a lot of work, see below.
- Optional permissions: with manifest v3 you can use optional permissions so your extension doesn’t require “access your data on all sites”. This is good, but these optional permissions are not asked for or granted by default in Firefox (hence the “optional”). In Chrome they are, but that could and probably will change in the future. So now there’s an extra step where you need to add some sort of onboarding process to explain all this to the user, which requires more coding and testing in HTML, CSS and JavaScript.
- Explaining: You need to explain to users what the extension does and how to use it. This seems simple enough, but it took a lot of time and thought to edit images and write/re-write text, and this needs to be added not only to the extension itself but also to its website and add-ons page.
- Desktop/Mobile UX: Sometimes you can have a single logic that works for both and save a lot of time and code, but in other cases, like this one, it’s just not possible.
- Testing: Lots of it. It’s never enough, even if you have several testers, but you should at least try to test as much as possible. Using console.log() is invaluable for this: having something to look back at as soon as you see unexpected behavior can save a lot of time.
- Limitations: Know what the extension does and, more importantly, what it doesn’t or shouldn’t do. You can always add to it later or do something new, but it helps you focus to know what it should actually do and nothing more.
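For the optional permissions point, the onboarding logic can be sketched roughly as follows. This is only an illustration under some assumptions: the origin patterns are guesses, the function names are hypothetical, and the permissions object is passed in as a parameter (in a real extension it would be `browser.permissions` in Firefox or `chrome.permissions` in Chrome) so the logic can be tried outside a browser.

```javascript
// Assumed host patterns; the real extension may need different ones.
const REQUIRED_ORIGINS = ["*://*.youtube.com/*", "*://*.googlevideo.com/*"];

// Resolves to true when the host permissions have not been granted yet,
// meaning the onboarding page should be shown.
async function needsOnboarding(permissionsApi, origins = REQUIRED_ORIGINS) {
  const granted = await permissionsApi.contains({ origins });
  return !granted;
}

// Asks the user for the host permissions. In a browser this has to be
// called directly from a user action handler (for example a button click),
// otherwise the request is rejected.
function requestHostAccess(permissionsApi, origins = REQUIRED_ORIGINS) {
  return permissionsApi.request({ origins });
}
```

In the onboarding page, `needsOnboarding(browser.permissions)` would run on load to decide what to display, and the “grant access” button’s click handler would call `requestHostAccess(browser.permissions)` without awaiting anything first, so the call still counts as user-initiated.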
Let’s see how long it takes Google to send me a ToS violation takedown…
Resources:
Reverse-Engineering YouTube: Revisited
Youtube > Data API