Clustering and Retrieval (Part 1)

We are currently working on a large personalized news distribution project for 24h News. In this project, we assume that each reader of the site has their own tastes, so we can’t deliver the same content to everyone: we have to explore their favorites and then deliver... [Read More]

Amazon Personalize

At re:Invent 2018, the CEO of Amazon Web Services introduced Amazon Personalize, a service that helps customers build their own recommender system (RecSys) without machine learning expertise. I am quite surprised that they didn’t launch this service earlier, given that recommendation is really their strong point. I have been assigned a task to... [Read More]

HDFS Recovery Process

In the previous post, I finished presenting the architecture of HDFS. Today, I focus on fault tolerance, in other words, the recovery process of HDFS. As you may know, HDFS developers assume that the platform will run on unreliable systems. Fault tolerance can’t be ignored if... [Read More]

HDFS Architecture

In the previous post, I presented an overview of the Hadoop platform and its components. In this post, I will dive deeper into Hadoop’s storage system: the HDFS architecture. Happy reading! [Read More]

Introduction to Apache Hadoop Architecture

Recently, I took an online course in order to increase my salary. It covers Big Data and Apache’s tools in this area, so I decided to write a series about the topic. This series will act as a summary of my learning and also help to enrich... [Read More]