Bonjour

Notes on Text-To-Speech

Posted on November 4, 2023

Recently, I’ve spent quite a lot of time studying Text-to-Speech (TTS) systems, more specifically the acoustic model. To be fair, this module is the most actively researched part of the whole pipeline. In the beginning, admittedly, I was not a big fan of speech processing because of the vagueness of some... [Read More]

Attention and self-attention

Posted on January 1, 2023

Another post inspired by an interview failure. During a chat about a position in the speech processing area, the interviewer asked me about the differences between attention and self-attention. At that time, I only had a vague understanding of the two, so I started to invent something about them to make... [Read More]

SSD vs YOLO

Posted on December 24, 2022

To be honest, I’m truly fed up with revising knowledge before an interview, especially object detection algorithms like SSD and YOLO. Every time I prepare for an interview in computer vision, I read about these two architectures and still can’t tell the exact differences between them. So today, I decided... [Read More]

SOLO-Segmenting Objects by Locations

Posted on February 22, 2021

In the previous post, I discussed YOLACT, a common instance segmentation architecture, and my grudges against it and its variants. In this post, I will introduce another architecture, which I believe is a more efficient design: SOLO. [Read More]

Instance Segmentation-YOLACT

Posted on February 6, 2021

Recently, I reluctantly took on a task involving image segmentation. Why the reluctance? As a deep learning engineer, I don’t really believe in the power of this branch, since convolutional networks don’t work well at the pixel level (I think!!!). Anyway, I gave it a try and now I want to share with... [Read More]