Blog Content & Writing — Design Spec

Date: 2026-05-31
Scope: Two sequential sub-projects — (1) metadata for all 28 posts, (2) structural edits for 6 recent posts
Order: Metadata first (unblocks features), structure second


Sub-project 1: Metadata

Goal

Add tags, subtitles, and missing front matter to all 28 posts. Publish the untracked LLM architectures post.

Tag Taxonomy

9 tags. Each post gets one or two tags; the only exception is the meta post 2018-01-04-first-post.md, which gets none.

Tag              Used for
deep-learning    CNN posts, distributed training, triplet loss, YOLACT, SOLO
computer-vision  Image similarity, object detection, pose extraction, YOLACT, SOLO
nlp              NER tagger, retrieval, attention, TTS
llm              Attention, TTS, LLM architectures post
systems          Recommendation system, K8s, SageMaker
infra            Hadoop, HDFS, K8s, distributed TensorFlow
recsys           Multi-armed bandits, Amazon Personalize, retrieval, recommendation system
statistics       Randomization, clustering, mixture models
aws              SageMaker, Amazon Personalize

Tag Assignments (all 28 posts)

File                                            Tags
2018-01-04-first-post.md                        (none; meta post)
2018-01-08-convnet-element.md                   deep-learning
2018-01-12-convnet-architecture.md              deep-learning, computer-vision
2018-01-16-challenges.md                        deep-learning
2018-01-18-distributed-tensorflow.md            deep-learning, infra
2018-02-25-improvement-distributed-training.md  deep-learning, infra
2018-03-20-image-similarity.md                  computer-vision, deep-learning
2018-06-01-sagemaker-tut.md                     aws, systems
2018-10-20-multi-armed-bandits.md               recsys, statistics
2018-10-28-randomization.md                     statistics
2018-10-30-apache-hadoop-introduction.md        infra
2018-10-8-ner-tagger.md                         nlp, deep-learning
2018-11-07-hdfs-1.md                            infra
2018-11-10-hdfs-2.md                            infra
2019-01-12-amazon-personalize.md                aws, recsys
2019-01-16-retrieval.md                         nlp, recsys
2019-02-22-clustering.md                        statistics, deep-learning
2019-03-04-mixture-models.md                    statistics
2019-05-19-triplet.md                           deep-learning, computer-vision
2019-09-06-pose-extraction.md                   computer-vision, deep-learning
2021-02-06-yolact.md                            computer-vision, deep-learning
2021-02-22-solo.md                              computer-vision, deep-learning
2022-12-24-object-detection.md                  computer-vision, deep-learning
2023-01-01-attention.md                         nlp, deep-learning
2023-11-4-tts.md                                nlp, deep-learning
2025-02-12-notes-recommendation-system.md       recsys, systems
2025-02-18-notes-k8s.md                         infra, systems
2026-03-26-llm-architectures.md                 llm, deep-learning
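
The taxonomy and the two-tag limit can be checked mechanically before committing. The following is a minimal sketch, assuming each post's tags have already been read from its front matter into a list. The helper name check_tags and the sample assignments dict are illustrative and not part of the repository.

```python
# Hypothetical helper: validates tag assignments against the taxonomy above.
ALLOWED_TAGS = {
    "deep-learning", "computer-vision", "nlp", "llm", "systems",
    "infra", "recsys", "statistics", "aws",
}
MAX_TAGS = 2
UNTAGGED_OK = {"2018-01-04-first-post.md"}  # meta post, intentionally untagged


def check_tags(filename, tags):
    """Return a list of problems for one post (an empty list means OK)."""
    problems = []
    if not tags and filename not in UNTAGGED_OK:
        problems.append(f"{filename}: no tags")
    if len(tags) > MAX_TAGS:
        problems.append(f"{filename}: {len(tags)} tags, maximum is {MAX_TAGS}")
    for tag in tags:
        if tag not in ALLOWED_TAGS:
            problems.append(f"{filename}: unknown tag '{tag}'")
    return problems


# A few rows copied from the table above, plus one deliberately bad row.
assignments = {
    "2018-01-04-first-post.md": [],
    "2018-06-01-sagemaker-tut.md": ["aws", "systems"],
    "2026-03-26-llm-architectures.md": ["llm", "deep-learning"],
    "example-bad-post.md": ["nlp", "llm", "transformers"],  # hypothetical file
}

for name, tags in assignments.items():
    for problem in check_tags(name, tags):
        print(problem)
```

Only the hypothetical bad row should produce output: one message for exceeding the limit and one for the unknown tag.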

Front Matter Fields

Add to every post that is missing them:

  • subtitle: — one sentence describing the post (written from content, see below)
  • author: hoangbm
  • social-share: true

Do NOT add image: to posts that don’t have one — adding placeholder images is out of scope.
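
The "add only what is missing" rule can be sketched as a small script. This is an illustration under stated assumptions, not the actual tooling: it assumes each post opens with a front matter block delimited by two "---" lines, it detects existing keys with a simple regular expression rather than a YAML parser, and the function name add_missing_fields and the sample post are hypothetical. The subtitle is written as a double-quoted string because several subtitles in this spec contain ": ", which is not valid in an unquoted YAML value.

```python
import json
import re

# Defaults from the list above. "subtitle" is supplied per post, and "image"
# is deliberately absent so that it is never added.
DEFAULTS = {"author": "hoangbm", "social-share": "true"}


def add_missing_fields(text, subtitle):
    """Insert missing front matter keys; never overwrite existing ones."""
    lines = text.split("\n")
    if not lines or lines[0].strip() != "---":
        raise ValueError("post does not start with front matter")
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        raise ValueError("front matter is not closed")

    # Collect the top-level keys that are already present.
    existing = set()
    for line in lines[1:end]:
        match = re.match(r"^([A-Za-z0-9_-]+):", line)
        if match:
            existing.add(match.group(1))

    # json.dumps yields a double-quoted string, which YAML also accepts.
    wanted = {"subtitle": json.dumps(subtitle, ensure_ascii=False), **DEFAULTS}
    additions = [f"{key}: {value}" for key, value in wanted.items() if key not in existing]
    return "\n".join(lines[:end] + additions + lines[end:])


# Hypothetical post that already has an author but no subtitle.
post = "---\nlayout: post\ntitle: Clustering\nauthor: hoangbm\n---\nBody text.\n"
print(add_missing_fields(post, "From K-Means to DBSCAN: clustering algorithms compared"))
```

In the sample, author is left untouched, while subtitle and social-share are appended just before the closing "---".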

Subtitles (all posts)

File                                            Subtitle
2018-01-04-first-post.md                        Why I started writing about machine learning
2018-01-08-convnet-element.md                   The building blocks of convolutional neural networks
2018-01-12-convnet-architecture.md              A tour of landmark CNN architectures from AlexNet to ResNet
2018-01-16-challenges.md                        Common pitfalls when training deep neural networks
2018-01-18-distributed-tensorflow.md            Setting up distributed training with TensorFlow
2018-02-25-improvement-distributed-training.md  Techniques for faster and more stable distributed training
2018-03-20-image-similarity.md                  Learning visual representations for similarity search
2018-06-01-sagemaker-tut.md                     Training and deploying models with Amazon SageMaker
2018-10-20-multi-armed-bandits.md               Balancing exploration and exploitation in recommendation
2018-10-28-randomization.md                     Why randomization matters for valid experiment design
2018-10-30-apache-hadoop-introduction.md        A practical introduction to the Hadoop ecosystem
2018-10-8-ner-tagger.md                         Building a named entity recognition system from scratch
2018-11-07-hdfs-1.md                            How the Hadoop Distributed File System stores data
2018-11-10-hdfs-2.md                            HDFS architecture: NameNode, DataNode, and replication
2019-01-12-amazon-personalize.md                Using Amazon Personalize for production recommendation
2019-01-16-retrieval.md                         Candidate retrieval in large-scale recommendation systems
2019-02-22-clustering.md                        From K-Means to DBSCAN: clustering algorithms compared
2019-03-04-mixture-models.md                    Gaussian mixture models and the EM algorithm
2019-05-19-triplet.md                           Learning embeddings with triplet loss
2019-09-06-pose-extraction.md                   Estimating human pose from images
2021-02-06-yolact.md                            Real-time instance segmentation with YOLACT
2021-02-22-solo.md                              Instance segmentation without bounding boxes
2022-12-24-object-detection.md                  A personal guide to modern object detection
2023-01-01-attention.md                         Understanding the attention mechanism in transformers
2023-11-4-tts.md                                How modern text-to-speech systems work
2025-02-12-notes-recommendation-system.md       A practical three-stage recommendation system pipeline
2025-02-18-notes-k8s.md                         Kubernetes concepts for ML engineers
2026-03-26-llm-architectures.md                 Encoders, decoders, and how inference actually works

LLM Post Publishing

_posts/2026-03-26-llm-architectures.md and images/llm/ are currently untracked. Add the missing front matter (subtitle, author, social-share, tags), then stage both paths with git add and commit them.
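
Publishing takes two git steps: git add stages the untracked paths, and git commit records them. A minimal sketch of that sequence follows, assuming it runs from the repository root. The commit message and the function names are placeholders, and the script only prints the commands unless dry_run is set to False.

```python
import shlex
import subprocess

POST = "_posts/2026-03-26-llm-architectures.md"
IMAGES = "images/llm/"


def publish_commands(message):
    """Return the git invocations as argv lists: stage first, then commit."""
    return [
        ["git", "add", POST, IMAGES],
        ["git", "commit", "-m", message],
    ]


def publish(message, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in publish_commands(message):
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


# Placeholder commit message; prints the two commands without running them.
publish("Publish LLM architectures post")
```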


Sub-project 2: Post Structure (6 recent posts)

Goal

Improve readability and section structure for the 6 posts from 2022 onwards. Each post gets what it specifically needs — not a template.

Guiding Principles

  • Opening: every post gets an explicit first paragraph answering “what will I learn here?” — if it already exists, improve it; if missing, add it
  • Sections: ## for main sections, ### for subsections; every top-level section uses ##, with no mixing of heading levels
  • Closing: every post ends with a named section (title varies by post: “Putting it together”, “In practice”, “Final thoughts”, “Where to go next”) — no generic “Conclusion”
  • Paragraphs: split any prose block longer than 5 lines
  • Transitions: add a linking sentence between sections where the jump is abrupt
  • Examples: add a short concrete example (pseudocode, analogy, or real-world scenario) where a concept is abstract and has none
  • No rewriting of technical content — structure and readability only
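
Three of the principles above (heading levels, the closing section name, and paragraph length) lend themselves to a quick automated check. The sketch below is one possible linter, not an agreed tool. It assumes "5 lines" means source lines in the Markdown file, it counts list items as prose, and the name lint_structure is hypothetical. Front matter and fenced code blocks are skipped so that YAML keys and code comments are not mistaken for prose or headings.

```python
import re

MAX_PARAGRAPH_LINES = 5  # assumption: source lines, not rendered lines


def lint_structure(markdown):
    """Return a list of (line_number, message) for structural issues."""
    issues = []
    lines = markdown.split("\n")
    in_fence = False
    in_front_matter = bool(lines) and lines[0].strip() == "---"
    run_start, run_len = None, 0  # current block of consecutive prose lines
    last_main_heading = None      # (line_number, title) of the last "##" heading

    def close_run():
        nonlocal run_start, run_len
        if run_len > MAX_PARAGRAPH_LINES:
            issues.append((run_start, f"prose block of {run_len} lines"))
        run_start, run_len = None, 0

    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        if in_front_matter:
            if number > 1 and stripped == "---":
                in_front_matter = False
            continue
        if stripped.startswith("```"):
            close_run()
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        heading = re.match(r"^(#+)\s+(.*)$", line)
        if heading:
            close_run()
            level, title = len(heading.group(1)), heading.group(2).strip()
            if level not in (2, 3):
                issues.append((number, f"heading level {level}, expected ## or ###"))
            if level == 2:
                last_main_heading = (number, title)
            continue
        if not stripped:
            close_run()
            continue
        if run_len == 0:
            run_start = number
        run_len += 1
    close_run()

    if last_main_heading and last_main_heading[1].lower() == "conclusion":
        issues.append((last_main_heading[0], 'generic closing section "Conclusion"'))
    return issues


# Hypothetical post with a level-1 heading and a generic closing section.
sample = "\n".join([
    "---", "title: Example", "---",
    "# Top heading",
    "Intro line.",
    "",
    "## Conclusion",
    "Bye.",
])
for issue in lint_structure(sample):
    print(issue)
```

The remaining principles (openings, transitions, examples) need editorial judgment and are left to manual review.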

Per-Post Edits

2022-12-24-object-detection.md

Longest post. Needs: section heading cleanup (## consistency), split long paragraphs, add linking sentences between the major algorithm families (anchor-based vs anchor-free), closing section “In practice”.

2023-01-01-attention.md

Dense mathematical content. Needs: stronger opening that contextualises why attention matters, short pseudocode example for the attention computation, closing section “Where to go next” pointing toward the transformer series.
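
As a possible starting point for that example, here is a sketch of standard scaled dot-product attention, softmax(Q K^T / sqrt(d_k)) V, in plain Python. It assumes the post covers this standard formulation; the notation and variable names are illustrative and may need to be aligned with the post's own symbols.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.

    queries: n vectors of length d_k
    keys:    m vectors of length d_k
    values:  m vectors of length d_v
    Returns n output vectors of length d_v.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


# A zero query scores every key equally, so the output is the mean of the values.
result = attention([[0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [3.0, 2.0]])
print(result)
```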

2023-11-4-tts.md

Needs: opening paragraph (currently jumps straight to architecture), add one concrete listening analogy for the vocoder section, closing section “Final thoughts”.

2025-02-12-notes-recommendation-system.md

“Notes” format — keep the loose structure. Needs: opening paragraph framing the three-stage pipeline, closing section “Putting it together” that ties the three stages back to each other.

2025-02-18-notes-k8s.md

“Notes” format — keep the loose structure. Needs: opening paragraph stating the audience (“for ML engineers, not ops”), closing section “In practice” with a one-paragraph summary of how these concepts connect.

2026-03-26-llm-architectures.md

Already has a good structure and opening. Needs: split one or two long paragraphs in the decoder section, add a brief closing section “Putting it together” that recaps encoder vs decoder vs encoder-decoder in two sentences.

Out of Scope

  • Structural edits to posts from 2018–2021 (they still receive metadata in Sub-project 1)
  • Adding new technical content or correcting factual claims
  • Reformatting code blocks or diagrams
  • Adding images

Files Changed

Sub-project 1:

  • All 28 _posts/*.md files: add tags (none for 2018-01-04-first-post.md), plus subtitle, author, and social-share where missing
  • _posts/2026-03-26-llm-architectures.md — also newly tracked (was untracked)
  • images/llm/ — newly tracked

Sub-project 2:

  • _posts/2022-12-24-object-detection.md
  • _posts/2023-01-01-attention.md
  • _posts/2023-11-4-tts.md
  • _posts/2025-02-12-notes-recommendation-system.md
  • _posts/2025-02-18-notes-k8s.md
  • _posts/2026-03-26-llm-architectures.md