Recently, I’ve spent quite a lot of time studying Text-to-Speech (TTS) systems, and more specifically the acoustic model. To be fair, this module is the most actively researched one in the whole pipeline. In the beginning, admittedly, I wasn’t a big fan of speech processing because of the vagueness of some...
[Read More]
Attention and self-attention
Another post inspired by an interview failure. During a chat about a position in the speech processing area, the interviewer asked me about the differences between attention and self-attention. At the time, I only had a vague understanding of the two, so I started to invent something about them to make...
[Read More]
SSD vs YOLO
To be honest, I’m truly fed up with revising before interviews, especially object detection algorithms like SSD and YOLO. Every time I prepare for a computer vision interview, I read about these two architectures, and I still can’t tell the exact differences between them. So today, I decide...
[Read More]
SOLO-Segmenting Objects by Locations
In the previous post, I discussed YOLACT, a common instance segmentation architecture, and my grudges against it and its variants. In this post, I will introduce another architecture, which I believe is a more efficient design: SOLO.
[Read More]
Instance Segmentation-YOLACT
Recently, I reluctantly took on a task involving image segmentation. Why the reluctance? As a deep learning engineer, I don’t really believe in the power of this branch, since convolutional networks don’t work well at the pixel level (I think!!!). Anyway, I gave it a try and now I want to share with...
[Read More]