Clustering and Retrieval (Part 2)

In this post, I will dive deeper into clustering techniques. Clustering, or segmentation, has wide applications, particularly in RecSys, where it gives you insight into users’ preferences based on their activities. For example, assuming that we have a list of articles that user A reads every day, we can... [Read More]

Clustering and Retrieval (Part 1)

At the moment, we are working on a big personalized news distribution project for 24h News. In this project, we assume that each reader of this page has their own taste, so we can’t deliver the same content to everyone: we have to explore their favorites and then deliver... [Read More]

Amazon Personalize

At re:Invent 2018, the CEO of Amazon Web Services introduced Amazon Personalize, a service that helps customers set up their own recommender system (RecSys) without prior expertise. I am quite surprised that they didn’t launch this service earlier, given that recommendation is really their strong point. I have been assigned a task to... [Read More]

HDFS Recovery Process

In the previous post, I finished presenting the architecture of HDFS. Today, I focus on fault tolerance, in other words, the recovery process of HDFS. As you may know, HDFS developers assume that this platform will run on unstable systems. Fault tolerance can’t be ignored if... [Read More]

HDFS Architecture

In the previous post, I presented an overview of the Hadoop platform and its components. In this post, I will dive more deeply into the storage system of Hadoop: the HDFS architecture. Happy reading! [Read More]