Cpt. Kobra

Software engineer & host of the Post Mortem podcast. I write about the stuff I build and learn.

How to process & export rich text in Qt

As a freelance Software Engineer, I get the chance to work on a vast set of technologies. For one of my clients, I often need to implement features in a C++ backend using the Qt framework.

While Qt makes lots of things easy with its built-in modules, sometimes it's unexpectedly hard to achieve something you thought would be trivial.
I recently ran into such a case with rich text processing. So let's dive in and see how you can iterate over rich text in Qt and export the text and its formatting information as JSON.

Exporting rich text from Qt, your options

  1. Build your own HTML parser to handle the rich text.
    • pro: flexible
    • con: so flexible you're likely to have bug reports filed every week due to an unexpected input (:
  2. Use a QTextDocument and its setHtml method.
    • pro: handles the HTML processing for you.
    • con: requires familiarity with QTextDocument and its related classes.

Dissecting the QTextDocument class

Extract from the Qt doc, source: https://doc.qt.io/qt-5/qtextblock.html

In a nutshell:

The tricky part is that formatting options are spread across the QTextBlock and QTextFragment classes.

For example, you'll find alignment information (whether the text should be displayed on the left, right, center, or justified) in the format class of the QTextBlock. To be exact, you'll find this information in the QTextBlockFormat class, accessible from the QTextBlock.blockFormat() method.

On the other hand, if you want to know whether a portion of text is in italics, you'll have to look at the QTextFragment level. The styling class for the QTextFragment is QTextCharFormat, accessible from the QTextFragment.charFormat() method.

C++ code snippet

The following snippet iterates over a rich text document and recovers the text and formatting options into a JSON document.

#include <QJsonArray>
#include <QJsonDocument>
#include <QJsonObject>
#include <QTextBlock>
#include <QTextDocument>
#include <QTextFragment>

QJsonDocument exportRichTextToJson(const QString &myRichTextToProcess) {

    // 1. Make a document out of the rich text to enable programmatic access to its content
    QTextDocument document;
    document.setHtml(myRichTextToProcess);
    if (document.isEmpty())
        return QJsonDocument();

    QJsonObject jsonObject;
    QJsonArray fragments;

    // 2. Iterate over the document's blocks
    for (QTextBlock block = document.begin(); block != document.end(); block = block.next()) {
        for (QTextBlock::iterator frag = block.begin(); !frag.atEnd(); ++frag) {
            QJsonObject fragment;

            // Recover block-level formatting options
            fragment["alignment"] = getBlockAlignment(block);  // custom function, returns `left`, `right`, `center`, or `justify`

            QTextFragment currentFragment = frag.fragment();
            if (currentFragment.isValid()) {
                fragment["text"] = currentFragment.text();

                // Recover fragment-level formatting options
                const auto charFormat = currentFragment.charFormat();
                fragment["italics"] = charFormat.fontItalic();
                fragment["bold"] = isCharBold(charFormat);  // custom function, see below
                fragment["fontSize"] = charFormat.fontPointSize();

                fragments.append(fragment);
            }
        }
    }
    jsonObject["fragments"] = fragments;
    return QJsonDocument(jsonObject);
}
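
The snippet above relies on two small custom helpers, getBlockAlignment and isCharBold, whose implementation isn't shown. Here's a minimal sketch of what they could look like, based on the QTextBlockFormat and QTextCharFormat APIs discussed earlier (one possible implementation, not the definitive one):

#include <QFont>
#include <QString>
#include <QTextBlock>
#include <QTextCharFormat>

// Map the block's alignment flags to the strings used in the JSON output
QString getBlockAlignment(const QTextBlock &block) {
    const Qt::Alignment alignment = block.blockFormat().alignment();
    if (alignment & Qt::AlignRight)   return QStringLiteral("right");
    if (alignment & Qt::AlignHCenter) return QStringLiteral("center");
    if (alignment & Qt::AlignJustify) return QStringLiteral("justify");
    return QStringLiteral("left");  // Qt defaults to left alignment
}

// Qt stores boldness as a font weight, so anything at or above QFont::Bold counts as bold
bool isCharBold(const QTextCharFormat &charFormat) {
    return charFormat.fontWeight() >= QFont::Bold;
}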

Finding bottlenecks in your python app

"Early optimization is the root of all evil." is a common mantra in Software Engineering.

Why profiling?

Optimizing for a specific computation when the bottleneck is at the network level isn't the best use of your time. How do you find out which module needs a critical redesign, a better choice of algorithms, or data structures?

You can use your intuition and test the waters with a few print statements here and there. Let's say you time a DB operation against an image processing task to see which one needs attention first: comparing both execution times can help you prioritize your energy, but you seriously risk seeing only part of the picture if the dominant term in your program's congestion is actually the API authentication flow.

In a nutshell, it's easy to make wrong assumptions about your system bottlenecks and invest lots of time and effort for marginal performance gains. You need a systematic way to assess the resource usage of your app.

That's exactly what profiling is about.

A brief note on monitoring vs profiling:

  • Monitoring your system is about having metrics on the health of your system in prod in (as close as possible to) real-time. You can monitor the CPU usage, memory, bandwidth, number of incoming requests, errors, etc.
  • Profiling is about identifying resource consumption of the different functions and instructions of your program. Think of profiling as doing a live dissection on your service to look into its deepest internals. From the high-level API calls to library functions, deep down to OS-level operations like read/write. You usually profile your program when the prototype is working and you're looking for the most critical element to optimize, if necessary.

Profiling is particularly useful when you have no clue over what could be your system bottleneck. Start your program, profile it with a typical workload, visualize the profiling report, and voilà. You now know it's this sneaky image serialization task you didn't even suspect that's taking way too long!
Note that you can profile different metrics, like run time or memory, depending on your needs.

Profiling your python app

Fortunately, we have several profiling tools readily available for use in Python. Among them: cProfile and py-spy.

  • cProfile comes from the Python Standard Library. One drawback of cProfile is the non-negligible footprint it has on the actual code you want to monitor. You're likely to have to change the source code of your Python app to profile it with cProfile, for instance by adding a context manager (available since Python 3.8) like so:
import cProfile

with cProfile.Profile() as pr:
    # ... do something ...
    ...

pr.print_stats()  # or pr.dump_stats("profile.out") to save the report to a file

  • py-spy is a third-party library with 7K GitHub ⭐️ at the time of this writing, which seems surprisingly low to me given how well it works and the problem it solves. The big advantage of py-spy over cProfile is py-spy's ability to profile your app without changing the Python source code at all (see the example commands below). It's almost too good to be true, right? This comes at a security price: py-spy needs elevated privileges on the host to read information about the process it profiles.
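
For illustration, attaching py-spy to an already running process looks roughly like this (the PID and output paths are made up; check py-spy record --help for the exact options in your version):

# Sample the process for 30 seconds and write an SVG flame graph
py-spy record -o profile.svg --pid 12345 --duration 30

# Or emit a report you can open directly in speedscope
py-spy record -o profile.speedscope.json --format speedscope --pid 12345 --duration 30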

Note that both cProfile and py-spy generate a profiling report that is rather dull and hard to interpret past a dozen functions in your program. Luckily for us, the open-source community offers strong tools to help you visualize those reports; I'm thinking of the beautiful speedscope app for profiling results visualization.

Example of profiling result visualization with the speedscope app, source: https://github.com/jlfwong/speedscope

Profiling your python app running in docker

py-spy offers a convenient way to profile Python apps running within a Docker container; it's even easier if you're using docker-compose.

This blog post from Jan Pieter Bruins Slot does a nice job at proposing a minimum working example of how to profile your python app through docker.

Note that I'd recommend dropping the py-spy service from your production docker-compose.yaml. While it's OK to pay the security price of enabling the SYS_PTRACE capability (which lets a container read process memory) in a dev environment, it has little benefit in prod and exposes you to potential security risks.

To address this, you can maintain two docker-compose files:

  • One for your dev environment, dev.docker-compose.yaml, with a py-spy service (see the sketch below)
  • And another docker-compose file for your prod without the py-spy service, your prod.docker-compose.yaml
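
The post linked above wires py-spy up as a dedicated service; an even simpler sketch of the same idea, assuming a service named app whose image has py-spy installed, is to grant the capability directly to that service:

# dev.docker-compose.yaml - minimal sketch, service name and layout are made up
services:
  app:
    build: .
    cap_add:
      - SYS_PTRACE   # lets py-spy read process memory; leave this out of prod.docker-compose.yaml

You can then attach to the app's main process with something like docker-compose -f dev.docker-compose.yaml exec app py-spy top --pid 1.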

Wrap up

  • Profiling is not so hard in python!
  • Our intuitions for optimizations are often wrong.
  • py-spy and the speedscope app even make it fun to explore the call stack!
  • Profiling helps you get a better understanding of the order of magnitude of the different tasks of your pipeline.

ffmpeg, this underrated tool

A few days ago, I needed to convert a recording from .mkv to .mp4. On Google you find a bunch of free online tools for format conversion, but uploading an hour-long video just to do a format conversion, and then downloading it back, seems overkill.

Then I remembered this CLI I heard about a while ago: ffmpeg. As per the official documentation:

FFmpeg is the leading multimedia framework, able to decode, encode, transcode, mux, demux, stream, filter and play pretty much anything that humans and machines have created. It supports the most obscure ancient formats up to the cutting edge. No matter if they were designed by some standards committee, the community or a corporation.

Lots of quick edits and format conversions can be achieved through ffmpeg, limiting the need for advanced tools like Adobe Premiere Pro or Media Encoder. You can change the format, trim a video, add subtitles, adjust audio, convert a video to a gif 🙂 (or even a gif to a video 🙃 ). All of this from the command line.

The complete doc can be overwhelming, but many GitHub gists propose the most commonly used commands in an easy-to-read fashion; see this ffmpeg cheat sheet.

In my case, it turned out I only needed to run:

ffmpeg -i input.mkv output.mp4
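
For the other edits mentioned above, here are a couple of one-liners in the same spirit (file names are placeholders; double-check the options against the ffmpeg documentation):

# Cut a 30-second clip starting at the 1-minute mark, without re-encoding
ffmpeg -ss 00:01:00 -i input.mp4 -t 30 -c copy clip.mp4

# Turn a video into a gif, downscaled to 480px wide at 12 frames per second
ffmpeg -i input.mp4 -vf "fps=12,scale=480:-1" output.gif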

Even if a cheat sheet solves the task at hand, it's worth investing some time in understanding the concepts at play when dealing with audio/video formats.

Below are some concepts that helped me decipher the ffmpeg doc. 

Container

A container format is a file format that enables multiple data streams to be embedded into a single file. A codec is required to decode the content of the container (several codecs actually, as audio and video are each encoded with a different codec).

For example, mp4 is a container format.

Muxer / Demuxer

aka multiplexing / demultiplexing

  • multiplexing: combining several signals into a single one (e.g., a video stream plus an audio stream into one output file)
  • demultiplexing: splitting a single signal into several sub-signals (e.g., extracting the audio and the video streams from a single file); see the ffmpeg one-liners below
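
To make this concrete, here is what demuxing and muxing can look like with ffmpeg (file names are placeholders, and the stream copy assumes the mp4 carries an AAC audio track):

# Demux: extract the audio stream from a video file without re-encoding it
ffmpeg -i input.mp4 -vn -c:a copy audio.aac

# Mux: combine a video stream and an audio stream into a single mp4 container
ffmpeg -i video.mp4 -i audio.aac -map 0:v -map 1:a -c copy combined.mp4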

Some popular codecs for audio...

  • mp3
  • aac


...and for videos

  • h264
  • h265
  • vp9
  • av1


Bonus: Adaptive bitrate / adaptive streaming

Quick refresher, the bitrate corresponds to the number of bits required to encode one second of video. The higher the video quality, the higher its bitrate.

One challenge when streaming videos from mobile devices is the variation of the network quality. Let's say you start watching a 1080p video from the city center with good 4G network. Then, you take the train to the countryside. The network degrades to a capricious 3G and the video buffers or stops.

Adaptive bitrate got your back!

With adaptive bitrate, the video is exposed in multiple renditions on the server: 240p, 720p, 1080p, 4K, etc. The client downloads a table mapping each quality to its video chunks and starts playing the best quality it can afford given the available bandwidth. In the city center, the client consumes the video chunks from the 1080p URL; in the train, as the network degrades, the device starts consuming video chunks from the 720p URL. Pretty cool!
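
As a toy illustration of the client-side logic (the renditions, bitrates, and bandwidth numbers are made up; real HLS/DASH players are far more sophisticated):

# Rough bitrate each rendition needs, in kbit/s (made-up numbers)
RENDITIONS = {
    "240p": 400,
    "720p": 2500,
    "1080p": 5000,
    "4K": 15000,
}

def pick_rendition(measured_bandwidth_kbps):
    """Return the best rendition whose bitrate fits the currently measured bandwidth."""
    affordable = {name: rate for name, rate in RENDITIONS.items() if rate <= measured_bandwidth_kbps}
    if not affordable:
        return "240p"  # nothing fits, fall back to the lowest quality
    return max(affordable, key=affordable.get)

print(pick_rendition(8000))  # city center, good 4G -> 1080p
print(pick_rendition(3000))  # train, capricious 3G -> 720p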

See this great article for more details on adaptive streaming: 

Laukens, N. (2011). Adaptive Streaming — a brief tutorial. Retrieved May 16th, 2021, from https://tech.ebu.ch/docs/techreview/trev_2011-Q1_adaptive-streaming_laukens.pdf.

5 Tips to get better at your next technical interview

To become a Software Engineer at a BigTech, you need to ace several rounds of technical interviews. In most rounds, you're asked to implement a solution to a challenging programming exercise and explain your reasoning.

There are plenty of resources available online detailing the concepts you should master (graph traversal, big O notation, dynamic programming, etc.) to maximize your chances on interview day. But there's one thing that's seriously overlooked in the literature: how hard it is to be consistent in your practice and how to make a habit out of it.

Vincent and I met when we were preparing for GAFAM interviews. We exchanged some tips on how to best prepare, and a year and a half later, we've been practicing software engineering interview questions every Saturday since. We even do it on Twitch, one Leetcode question solved live every Saturday morning @ 09:30am Paris time - come practice with us! 😉

We learned a few useful things along the way that helped us stick to a practice agenda, and we even managed to make it a moment we look forward to in the week! Here are the tips:

5 reasons why you need a sparring partner to prepare for technical interviews

  1. Accountability. You can't just "not show up" when your sparring partner is waiting for you to practice on a Saturday morning, you have to commit.
  2. Undivided Attention. When you hear your colleague typing or writing down the layout of their algorithm on a draft paper, there's a genuine study-room ambience in the air. You're in an environment that pushes you to focus on the task at hand, making it less likely you'll get distracted by your phone, and helping you resist the urge to check social media, at least for now.
  3. Enjoying the process. Practicing with a friend removes the stress induced by preparing for interview questions. Here it's about solving a genuinely interesting problem in a judgement free environment. Plus, you can compare your approaches, learn from each other, and share some tips.
  4. Thinking out loud. At the end of each session, do a walkthrough of your implementation, like you'd do in a real interview. That's a great exercise to get used to expressing concisely the logic of a potentially complex solution.
  5. Consistency. It's easy to skip a session when practicing alone; there'll always be a good excuse. But consistency is key, even after you've landed a job at your dream company. It's really more about the journey than the outcome, since mastering each and every data structure and algorithm is almost impossible in a lifetime; there'll always be something to learn.

On top of those 5 reasons why, here are some tips to make the practice easier and help you get the most out of your preparation.

Getting the most out of your preparation

After you've practiced a few questions, you'll realize you're more comfortable in some areas and a little less in others. That's a great opportunity to build yourself an agenda for the topics you want to get better at. You can schedule 4 to 5 sessions on graph traversal problems with your sparring partner if it's what's causing you the most trouble at the moment.

Also, there's the practice time and the follow-up time. Following up is crucial to ensure you understand what you could transpose from this problem and its solution to other problems.

Some days, even after spending several hours on a problem you just don't find an answer. That's ok. Take a look at a curated answer and follow closely the resolution process and then, try to re-implement the solution. Every week we share a curated answer to a Leetcode question in our substack and a walkthrough of our implementation in a Youtube video (in French). Hope you'll find it useful!

Finally, it's a good habit to take some notes on the problems you tackled, especially the ones you struggled with, so that you can come back to them. Any note-taking app will be good enough to keep track of your progress.

Final word

A French rap music artist once said

"C'est pas parce que tu ne joues plus que le jeu s'arrête" - Kaaris, Zoo, album Or Noir.

Roughly: just because you're no longer playing doesn't mean the game has stopped.

The same goes for learning new algorithms and data structures: just because you enjoy your current position doesn't mean you should stop preparing for the next one. These things take time.

Why you should use a Reference Manager, as a Software Engineer

The little secret no one told the non-academic folks.

We tend to see the world through the lens of the accumulated media that influence us. The same goes for your Software Engineering career; every technical blog post, research paper, book, online course, video, etc. you consume shapes the way you tackle a problem. Some pieces of information stand out more than others. And some pieces even give you that Ahah! moment where a foggy concept becomes crystal clear. It would be great to keep those eye-opening resources at hand whenever we need a refresher or want to share them with a friend.

How can we keep track of those valuable chunks of information spread out on the Internet?

Our digital knowledge is scattered among various services and keeping track of a piece of information depends on the format of the media.
You might add the last book you read to your Goodreads bookshelf. If it's a Youtube video, you like it and maybe add it to a playlist. Research papers you read, annotate, and then never come back to because you forgot what they relate to. Maybe you use Pocket for interesting blog articles and read 1 out of 10 bookmarked posts. Each of those systems has its own metadata system, and information retrieval becomes tedious.

A use case - Indexing research papers

I tried several techniques to organize my PDFs. For research papers, I used to have a tag system with the colored labels on macOS: red for an unread paper, green for read, yellow for outstanding. But this 1-dimensional approach is limited. If you store all those papers in a single directory you quickly get overwhelmed, so why not use several folders? OK, let's take an example: I have two folders, one for papers related to Distributed Computing, one for Machine Learning. Now let's say I find a great paper on managing ML models in a distributed environment, where do I put it? In the Distributed Computing folder? Or in the Machine Learning folder? Do I create a Distributed Machine Learning folder? All those answers make information retrieval harder.

A solution, using a Reference Management Software

We can take inspiration from the Academic world. They read conference proceedings and other long, scary PDFs almost every day. Their career is built on their intellectual capital and their ability to connect the dots between results of past experiments to come up with innovative ways to address a problem and move Science forward. To achieve this goal, they need tools to organize their knowledge with little distraction. We share at least this minimal set of features with Academic folks:

  • Storing a reference and its metadata - a Write

  • Retrieving a reference efficiently with multiple search criteria - a Read

In Database speak, the system we aim to design shall sustain a heavy read rate and a low write rate. It's OK that it takes more time for us to "write" a new record in the reference management system (adding metadata: author, publisher, source url, year, keywords, abstract, etc.) since it allows cheap read operations. The faster you retrieve information, the more likely you'll re-use the tool over time.

It literally takes less than a minute to add an entry to your reference manager. You're helping your future self by doing this and adding those tiny bits of metadata that make that paper, blog, or video meaningful to you.

A reference manager gives you that one-stop shop for the pieces of information that build up your intellectual capital. Usage of such a tool pays off in the long term, like 2+ years IMO. Indeed, a folder-based approach is good enough to get started. But once you realize how hard it is to maintain your knowledge base, consider using a reference manager.

Helping your future you..

...and others along the way

So what are the benefits of using a reference manager?

  1. It enforces a consistent metadata system and naming conventions which speeds-up information retrieval.

  2. You now have your one-stop shop for all your resources that have contributed to your intellectual growth. Books, papers, but also Youtube videos and other non-academic formats. I even have a record for this amazing Youtube video distilling the intuition behind the Fourier Transform. I could never get such a visual intuition through a textbook. (I just typed Fourier in my Bookends to get that link :) )

  3. It becomes easier to share a reference with a friend. Discussing with a colleague about string encoding? It's likely they'll be interested in reading Spolsky, J. (2004). The absolute minimum every software developer absolutely, positively must know about unicode and character sets (no excuses!). Retrieved April 19, 2021, from https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/

  4. Low on disk space? It's OK, you can store the metadata of a book/paper/pdf/video and not keep the actual data on your disk. The wise man knows where to look it up.

Getting practical, two interesting reference management systems

I hesitated between Zotero and Bookends but ultimately went with Bookends as my reference manager on macOS. They offer very similar features, and if I were a Linuxian, Zotero would be my go-to solution.

Their interfaces have few distractions, keeping the focus on the content. The several panels and options make the tools overwhelming at first, but the key features of adding and searching for a document are intuitive to use.

Zotero is open source and completely free, while Bookends has a price tag of 63€ as a one-time purchase (I'd never go for a subscription service for something I know I'll use my whole career if I can answer the need with a one-time payment).

After trying them both, the UI and iCloud sync feature of Bookends made me choose this one.

The secret for a great podcast interview

Preparing for your next interview

In my previous article, I wrote about how to find new guests for your show and how you can reach out to them.
Now, let's focus on how you can best prepare for your conversation.

Doing a solid interview preparation enables two things:
1) You know exactly what you're looking for when discussing with your guest. You've got your direction clearly defined.
2) Freed from the hassle of thinking about what to ask next, you can listen to the speaker and give them your undivided attention.

How to prepare?

The Prep. Document

In my first episodes of Post Mortem I did not use any preparation document (or prep. doc. for short). For sure, I did some research on the guest, had some notes and a few questions. The problem with this first approach is that I often missed covering parts of the story. If you don't have a clear goal guiding your interview, you easily get sidetracked.

In the Gimlet Podcast How to Get Good Tape, they advocate for having a "Prep. Document". So what's a prep document and how can it help?

A Prep. Document is a roadmap for your interview.

  • At the top of it there's the "North Star" ⭐️ . It's the idea that drives the whole discussion, that one thing you want to get out of the interview and you won't leave until you have it. Write it as a short guiding principle that will help you keep your interview focused.
  • Then you have an outline; a few chapters with bullet points for the questions you want to ask. Note: use open-ended questions and aim for feelings or stories, they're more likely to get engaging responses compared to closed yes/no questions.

A template

Here's the prep document I personally use for my episodes of the Post Mortem Podcast:

North Star 🌟

💡 The thing you don't want to leave the interview without.
A 1 ~ 2 line statement with the goal of the interview

...

Outline

💡 A list of elements and points I want to discuss with the interviewee, split into chapters.
It's best to use open-ended questions.
Make sure to identify the questions you want an answer to beforehand.

If it's all you're asking, it's all you'll get

Aim for either feelings or stories with each question.

I.

a.

b.

c.

II.

a.

b.

c.

Resources

Useful links, blogs, videos, etc. to help understand the issue discussed in the episode

I have an additional Resources section to keep track of documents I found during the interview prep - blogs, research papers, videos, etc. - that could be interesting to share with the audience through the show notes. I can't emphasize enough how useful it is to have a reference management tool to help you with this task, but that will be the subject of another article ;)

Wrapping it up

Having a prep document gives you confidence in running the interview. Most importantly, since you don't have to constantly think about what to ask next, you can truly listen to what your guest has to say.

Finding guests for your podcast

I host the Post Mortem podcast where software engineers come on the show and reflect on a challenging project or incident. Since each episode is a two-person interview, I need to find a new guest every month.

Here's the flow I've refined over the past 6 months to ease this booking process:

  1. Identify potential guests - be curious & prioritize
  2. Reach out - a few rules of engagement
  3. Follow up - some bookings take more time than others.

Now, let's break down each step.


1. Finding potential guests

aka lead generation

How do I find new guests? was the question I asked myself the most when I started my podcast. Indeed, past the few people in your existing network who have a relevant experience to share on the show, you feel the fear rising as you wonder how to cold call/email someone to come on your show.

One piece of advice is to look differently at the media you consume every day. Whenever you see a news article, blog post, tweet, or video that relates to your topic, write down the poster's info in a spreadsheet. I use one looking like the table below:

Name     | Contact info        | Contacted? | Answered? | Last contacted on | VIP? | Notes
John Doe | @johnDoe on Twitter | Yes        | No        | 2021-03-25        | Yes  | John wrote a great article on his blog on how he resolved a major incident in a data center
Carly Ra | @carlyRa on Twitter | No         | No        | N/A               | No   | Carly regularly tweets on how incident management is dealt with at her company
...      | ...                 | ...        | ...       | ...               | ...  | ...

Note: You don't need to contact this person as soon as you think they'd be a great guest, but you must write down their info so that you don't forget to do it.

It only takes a minute to write down a line in your spreadsheet, but crafting a proposal message where you pitch your show and make it echo with the person's experience is a much harder task. While most cells are relatively self-explanatory, the "VIP" one might need some context. Here, VIP is to be understood as:

"Is this person someone I really want to book for the show due to their high expertise in the domain or large audience base? Am I ready to put a substential amount of time and energy in reaching out and following up?"

Having this priority clearly defined helps to focus on guests that are likely to bring compelling narratives to your audience.

Now, whenever you watch a YouTube video related to your topic, and find that the speaker is a great storyteller: take note of it. It's only a few words in your spreadsheet. The hard work is in the next step, reaching out.

2. Reaching out

Doing your research

Before even writing your message, do some research on the guest. What do they care about? What are their contributions to the field? What would they be excited and passionate to talk about? You can find inspiration by checking their blog entries, past tweets, or LinkedIn profile. This research step is fundamental, and sometimes you'll realize the person's expertise is just not aligned with what you're looking for. That's still a win: you saved their time, yours, and your audience's.

The first message

People are more active on certain platforms. Some prefer email, others Twitter, etc. So the first step is to choose a medium your guest is more likely to answer on and to compose a message according to this medium's rules of engagement. LinkedIn limits you to 300 characters in your connection note, so you have to be concise and convincing. An email can be longer, but you might get classified as spam or left unread. In any case, be professional in your first messages.

Now about the message: it has to be tailored for each guest. For sure you can keep some elements pitching your show from one intro message to the next, but the core of the message differs. Make it clear why this person's experience would be valuable to your audience; it's an opportunity to give them a voice to share something they care about. Specific is terrific - if you can pinpoint some recent work of your guest that connects with the show, make it explicit that you'd love to make an episode about their experience X, how it felt, and what they learnt from it. When crafting this message, your subtext must already propose a narrative you could adopt during the episode.

I also include a link to one of my episodes in the intro message. I pick an episode I think this person would appreciate. It makes the proposal serious, and they can check that the podcast is real. But keep in mind that it's unlikely someone's going to give 20~30min of their attention to a stranger on the internet. Still, they're likely to check the link and read the titles and show notes of an episode or two to assess the show's professionalism.

A Sales' job

Booking guests is a Sales job; it requires resilience. You're playing the long game and have to be consistent in your effort. It's ok to send follow-up messages, perhaps via a different channel, to get the conversation started. Your best ally for this is to set yourself reminders: "in 10 days, if I haven't heard back from them, I send them another email with additional info." Think about the doubts that person may have about coming on your show and answer them in those follow-up emails. Use your favorite reminders app and commit.

Know to stop when you get a "No", but not before. It's a Sales job. You have to push yourself. If you're convinced this person's story would be valuable to your audience and resonates with what they're interested in, you must give your best shot at having those guests on the show. And don't treat a "no answer" as a no. Life happens and people are busy. A podcast proposal is likely to be at the bottom of their todo list. Maybe they're still considering coming. That's where follow-up messages can clear some doubts they have.

Make it easy to say yes. People don't like making choices, so you have to lower the barriers: clear any logistics issues they might have (how the recording is going to happen, how much time it will take, etc.)

"Hi X,

Did you get a chance to think about dropping by for an episode of the Post Mortem podcast?

Here are some additional infos: a show is 20 to 30mins long and it should take us 40mins to record. The audience is composed of software engineers, especially data people. Your last blog post about your incident management at "Tech Cie" is a fantastic story!

Do you have some time this week for a chat? Maybe Tuesday afternoon? Here's my calendar, "link to a calendly to ease the booking"

Best regards,"

Tracking messages sent and following up

After each message, update the "last contacted on" column of your contact spreadsheet. This tool is there to help you compare what you expected to do (e.g., how many people I expected to reach out to this month) against the reality (e.g., how many people I actually reached out to this month).

Takeaways

  • Do your prep. job, research the guest interests and past experiences.
  • Tailor the invitation request.
  • Follow up to clear potential doubts
  • Make it easy for the guest to say yes.

The advantage of having a two-stage pipeline is to decouple two steps requiring two different levels of effort.
1) Adding a row to your spreadsheet is quick and easy; you can do it at any time of the day.
2) Crafting an engaging intro message to convince your guest requires more time and focus; you'll need a good 1hr and your undivided attention.

Accept that some won't even answer your best piece. Life happens and people have a thousand things to do instead of coming on your show. What you can do, though, is make it as easy and compelling as possible for them to tell their story on your show. Everything else is out of your realm of action, so there's no need to worry.


Shameless plug: here's the first episode of the Post Mortem Podcast, where we discuss an incident during a migration project at the ad retargeting company Criteo. I met the guest, Nicolas, at the TensorFlow World conference the year before; he was giving a talk and we hit it off at lunch. Talk about serendipity, little did we know the next year we would record a podcast episode together!

Getting better at git once the basics are covered

git learning curve is weird

One can understand the benefits of using a Version Control System (VCS) in minutes, be familiar with the main commands in a few hours, but it can take years before confidently running a command with --force.

Worse, sometimes you just don't know that a command exists, one that could save you hours!

I had an ahah! moment three months ago when discovering git bisect: it helps you find a faulty commit in O(log(N)) steps, where N is the number of commits to inspect. Instead of checking out every commit, binary search helps you reduce the number of commits to check. With a build time of 7 minutes on a C++ project, each rebuild I could avoid was worth it.
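
For reference, a typical bisect session looks roughly like this (the tag and the test script are hypothetical):

git bisect start
git bisect bad                  # the commit currently checked out is broken
git bisect good v1.2.0          # the last version known to work
# git now checks out a commit halfway in between: build, test, then tell git what you found
git bisect good                 # or `git bisect bad`, and repeat until the faulty commit is identified
git bisect reset                # return to the branch you started from

# Or let git drive the search with a script that exits with a non-zero status on failure
git bisect run ./build_and_test.sh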

Sometimes the surprise is even bigger as you discover not a command, but an argument of a command you use every day. Yeah, I'm looking at you, git log -S. I discovered the git pickaxe a few weeks ago, after 3+ years of using git. This gem saved me 15+ minutes on its first use. Philip Potter has a great intro to the git pickaxe command in his blog post.
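
For example (the search strings are made up), the pickaxe lists the commits that added or removed occurrences of a given string:

git log -S "connectTimeout" --oneline     # commits where the number of occurrences of the string changed
git log -G "connect.*Timeout" --oneline   # same idea, but with a regex matched against the diff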


Understanding the git learning curve

We can break down git's learning curve into three steps:

  1. The initial learning curve
  2. A plateau once you master the basics
  3. A stairway-like shape where each bump means mastering an advanced concept

The git learning curve

Let's break this down

  1. During the initial learning curve, you're interested in becoming operational with git promptly. You learn the basics for creating a branch, adding changes, committing, push/pull and that's usually good enough to get working on a project within a team.

  2. Once you're familiar with those 101 skills, you enter a plateau. Indeed, the commands above address 80% of your needs with a small time investment. Following the 80/20 principle, mastering the remaining 20% would be a lot of effort with little substantial gain. This is especially true if you're lucky enough to have a git guru in your team. You know, the person you can come to and ask for help when your repo is in a terrible state and you seek redemption; we've all been there (:

  3. Eventually, you'll want to fix your repo state by yourself instead of asking the local git guru. That's your first step on the stairway to git mastery. You'll put in the energy required to understand once and for all how git reset works and what the difference is between git reset --soft and git reset --hard (see the sketch below). Each of those little victories is a step forward on your path towards git mastery.
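
A minimal illustration of that difference (try it on a toy repo, not on work you care about):

git reset --soft HEAD~1   # undo the last commit but keep its changes staged
git reset --hard HEAD~1   # undo the last commit and discard its changes from the working tree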

Progress in git comes from the willingness to confront yourself with a repo state you'd rather avoid. The more challenging repo states you encounter, the more likely you are to understand git internals.

How to encounter bumps more often?

Problem: when you're working on a feature and must ship asap, you are not incentivized to learn all the internals of how git works when a corrupted repo state arises. You're more interested in solving the issue at hand.

So we go to Stack Overflow, copy/paste the most relevant one-liner, and give a +1 to its author. Problem solved. Or is it? Next time a similar bump arises, we're likely to go for that one-liner copy/paste once again. We can do better.

I found two tricks to make room for learning advanced git techniques:

  1. Practice on a toy repo where I don't care about breaking things.
  2. Use the amazing Learning Git Branching tutorial to develop visual intuition over what a command does.

    As git is mostly trees and pointers to nodes, a graph visualisation is super helpful to demystify an obscure command you'd be afraid to run otherwise.

Also, I found that the Atlassian doc on advanced git usage is amazingly well written and easy to digest. It's not exhaustive but it does a better job at conveying the intuition behind a command compared to the official git doc IMO.

Practice makes perfect

At each corrupted repo state or complex branching, rebasing, etc. operation you encounter: give it a solid 10 mins of thinking and make sure you leave the situation with a clear understanding of what the problem was, how you solved it and how you'd solve it if it appeared again.

You can take notes on it, blog about it, or answer a related SO question to help you commit to this learning ritual.

Closing thoughts

Think of what Rousseau said in his Discours sur l’origine et les fondements de l’inégalité parmi les hommes, 1755:

"Laissez à l’homme civilisé le temps de rassembler toutes ses machines autour de lui, on ne peut douter qu’il ne surmonte facilement l’homme sauvage ; mais si vous voulez voir un combat plus inégal encore, mettez-les nus et désarmés vis-à-vis l’un de l’autre, et vous reconnaîtrez bientôt quel est l’avantage d’avoir sans cesse toutes ses forces à sa disposition, d’être toujours prêt à tout événement, et de se porter, pour ainsi dire, toujours tout entier avec soi."

An approximate english translation:

"Allow the civilized man time to gather all his machines around him, there can be no doubt that he easily overcomes the savage man; but if you want to see an even more unequal fight, put them naked and disarmed vis-à-vis each other, and you will soon recognize what is the advantage of having all your strength constantly at your disposal, to be always ready for any event, and to carry oneself, so to speak, always completely with oneself. "

The "civilized man" would be the you using an IDE with a nice git GUI. The "savage man", would be the you using git commands in the terminal.

For mundane tasks, the IDE-enhanced approach can help you gain time. But be aware that when things get complicated, like in the case of merge conflicts or when investigating the root cause of a bug, the git CLI is a powerful tool.

As scary as they look at first, it's by taking the time to decipher new git commands you encounter that you become more proficient with the tool.

Protips

  • Setting an alias for long git commands can remove the friction preventing you from using them regularly in the terminal. My personal favorite is an alias for git log with nice formatting:
alias glola="git log --graph --pretty='%Cred%h%Creset -%C(auto)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --all"

As an illustration, let's say I'd like to see the commits I made over the past two days on a repo: I'd type glola --since="two days".

The OhMyZsh framework has a git plugin with tons of aliases. I just use a subset of them, but it's already a huge time saver. Here is a list of useful git aliases.

Notes on audio processing

Lost in the spectral domain

Starting my journey with deep learning for audio processing

As I'm getting into machine learning applied to audio signals, I realize that there are very few resources available to craft yourself a curriculum. On the other hand, googling for deep learning applied to image processing or NLP will return more results than you can use.

Of course more content, blog posts, GitHub repos, online courses, etc. will become available over time. But still, how do you get started in the field in early 2021?

Motivating examples, speech-to-text and text-to-speech

Speech to text (stt) and text to speech (tts) are two problems where deep learning seems promising, especially compared to traditional ML approaches using complex feature engineering.

Image processing had its ImageNet moment, NLP has BERT, and audio has ... ? Yeah.. not so obvious.

Even if academia and Big Tech are moving fast, like FAIR with wav2vec, those models are still not as widely deployed and used as their image/NLP counterparts. We don't have that many tutorials and that much feedback available on how to use them and what kinds of problems they answer well or not.

Learning tts and stt; your options

Ok, let's say you want to be a text-to-speech expert; there are several ways to get ramped up:

  1. Reading research papers
  2. Watching YouTube videos
  3. Attending online courses
  4. Forking GitHub repos
  5. Reading books

Let's review each method and identify its pros and cons:


  1. Research papers are great if you already know the basics of the field. Otherwise, prepare to be overwhelmed with new concepts at each line you read.

    Having done some signal processing in College, I started my journey reading papers, like the ones from Google's Tacotron series.

    • Pro - reading papers gives you access to the "ground truth" that inspire many implementations you'll find on GitHub.
    • Con - The academic jargon doesn't aim to explain concepts in layman's terms. It's hard to distinguish between the key concepts the authors use to solve a problem and the low-level tricks they use to beat the previous SOTA.
    • Protips:
      • Papers can be biased towards their proposed architecture. Researchers don't have incentives to fairly acknowledge the advantages of other approaches, especially if they come from a competing University/Big Tech.
      • Read the abstract and conclusion first. Only then, decide if it's worth reading.
  2. I had strong priors against watching YouTube videos for educational purposes. It's so easy to consume an endless stream of media without being proactive. Still, when you start in a domain, what you look for is intuition about the key concepts. Armed with this intuition, you'll develop your understanding of the field enough to formulate specific questions.

    I found this video from Ubisoft La Forge to be particularly good at conveying intuition behind implementing a tts pipeline. (Note: La Forge is the Ubisoft team responsible for making viable products out of AI proof of concepts.)

    They describe their ongoing work to have a text-to-speech model generating non-playable characters' voices in real time when you're playing an open world like Assassin's Creed, Watch Dogs, or Far Cry. That's badass. And entertaining.

    • Pro - Entertaining, low barrier to entry (whereas committing to read a research paper requires lots of focus and mental energy).
    • Con - Watch out for the rabbit hole. Youtube's goal is to maximize your watch time.
    • Protips
      • As you watch such videos, be active. Take notes of the concepts you don't understand and check them out once the video is over. Ideally, don't stop the video while taking notes. You just need to write a few words like "Griffin Lim algorithm" when you hear a new concept.
      • Watch videos in 1.5x speed to optimize your time.
  3. I'm a big fan of Andrew Ng's online courses on Coursera. His courses contributed to making Coursera the success it is today. But if Coursera has plenty of resources available for image and NLP, it doesn't have a single course on deep learning for audio (as of April 2021).

    The existing courses for audio are old-fashioned signal processing approaches that are not fit to answer problems such as text-to-speech or speech-to-text. Guess we'll wait ¯\_(ツ)_/¯

    Sure enough, Coursera isn't the only online course provider. YouTube offers a promising alternative before "established" providers catch up. Valerio Velardo made audio for DL his sweet spot, with several series mixing theoretical concepts and hands-on implementation.

    • Pro - Courses offer a well-defined syllabus that you can cherry-pick from.
    • Con - For an unknown teacher, it's harder to assess their expertise in the domain.
    • Protips:
      • Run a brief background check on the course teacher to assess their credentials in the field. Previous related blog posts? Videos?
  4. If well-crafted courses for DL audio are scarce, there are plenty of GitHub repos implementing research papers, some from the papers' authors themselves, some from outsiders who want to contribute to open source software.
    e.g., this voice cloning app built by a student during their Master's thesis.

    • Pro - If you have a clear use case and the repo addresses it, it's your lucky day. 🍀
    • Con - Open source is messy; you're likely to spend hours struggling with the project installation and replicating the repo maintainer's results. Hello Python dependency hell. Let alone retraining the model for your use case. Implementing research models in an industry-usable fashion is a skill to practice.
    • Protips:
      • Some repos offer an interesting collection of cherry-picked papers, blog articles, and videos. They're a good starting point. See this repo for a collection of curated papers and blog posts on tts and stt.
  5. Books are great reference companions, but they're expensive, both in $ and in reading time.
    Worse, for technical concerns like library versions, they're outdated before reaching print. Who wants to follow a TensorFlow v1 tutorial in 2021?

    • Pro - They're worth it for understanding key concepts and previous SOTA approaches, and books are structured documents that make information retrieval easy. Plus, you can check reviews of the book before committing to it.
    • Con - Books get outdated fast for technical considerations and implementations. Prefer an online documentation.
    • Protips:
      • Check the publication date and the library versions a book relies on before following its code examples; prefer online documentation for anything implementation-specific.

Closing thoughts 🧭

The best way to get lost is to jump in the train without knowing where you want to go.

Having a well-defined motivating use case will give you a clear compass to establish your syllabus. This compass will help you when you're wondering whether you should watch another YouTube video, or start getting your hands on the code.

I hope the year 2021 will see the rise of audio-expert YouTubers making the field more accessible.