Transformer Feed-Forward Layer (Medium) | AI Code Lab | AmanAI Lab


© 2026 AmanAI Lab. All rights reserved.


Transformer Feed-Forward Layer

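A standard transformer block has two sub-layers: self-attention and a position-wise feed-forward network (FFN). The FFN expands each token's representation from d_model to d_ff dimensions, applies a nonlinearity, then projects it back to d_model.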

Architecture

FFN(x) = W2 × GELU(W1 × x + b1) + b2

- W1: d_model → d_ff (d_ff is typically 4 × d_model)

- W2: d_ff → d_model

- GELU activation: GELU(x) = x × Φ(x), where Φ is the standard normal CDF (smoother than ReLU)
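
The formula above can be sketched in plain Python for a single token vector. This is a minimal illustration, not the platform's reference solution: `gelu` is the name used in the test cases, while `ffn` and the row-major weight layout (W1 as d_ff rows of length d_model, W2 as d_model rows of length d_ff) are assumptions made for this sketch.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def ffn(x, W1, b1, W2, b2):
    """Compute W2 * GELU(W1 * x + b1) + b2 for one token vector x.

    Assumed layout (hypothetical, for illustration only):
    W1 is a list of d_ff rows, each of length d_model;
    W2 is a list of d_model rows, each of length d_ff.
    """
    # Expand: d_model -> d_ff, then apply GELU element-wise.
    hidden = [
        gelu(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, b1)
    ]
    # Contract: d_ff -> d_model.
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(W2, b2)
    ]
```

Because the hidden layer is recomputed independently for each token, the same function can be applied to every position in a sequence without any interaction between positions.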

Example (d_model=2, d_ff=4)

With random weights, verify that the output shape matches the input shape.

Round to **5 decimal places**.
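
One way to run this shape check is sketched below, using d_model=2 and d_ff=4 as in the example. The helper names (`random_matrix`, `ffn_token`), the seed, the Gaussian initialisation, the zero biases, and the sample tokens are all assumptions chosen for illustration; the problem statement does not prescribe them.

```python
import math
import random


def gelu(x):
    # Exact GELU via the error function.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def random_matrix(rows, cols, rng):
    # Hypothetical initialiser: standard normal entries.
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def ffn_token(x, W1, b1, W2, b2):
    # Expand to d_ff, apply GELU, contract to d_model, round to 5 decimals.
    hidden = [gelu(sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [round(sum(w * h for w, h in zip(row, hidden)) + b, 5)
            for row, b in zip(W2, b2)]


d_model, d_ff = 2, 4
rng = random.Random(0)  # fixed seed so the run is repeatable
W1, b1 = random_matrix(d_ff, d_model, rng), [0.0] * d_ff
W2, b2 = random_matrix(d_model, d_ff, rng), [0.0] * d_model

# Three sample tokens, each of length d_model.
tokens = [[1.0, -0.5], [0.3, 2.0], [0.0, 0.0]]
outputs = [ffn_token(t, W1, b1, W2, b2) for t in tokens]

# Output shape matches input shape: same number of tokens, same width.
assert len(outputs) == len(tokens)
assert all(len(o) == d_model for o in outputs)
```

With zero biases, an all-zero token maps to an all-zero output, which makes a convenient sanity check alongside the shape assertions.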


Test Cases (2 visible · 1 hidden)

Case 1: GELU at 0

Input: gelu(0.0)

Expected: 0.0

Case 2: GELU at 1

Input: gelu(1.0)

Expected: 0.84134
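
The expected value 0.84134 for gelu(1.0) is consistent with the exact, erf-based definition of GELU. The widely used tanh approximation gives a slightly different result at 1.0 (the gap is roughly 1.5e-4), so it would not match after rounding to 5 decimal places. The sketch below compares the two; `gelu_exact` and `gelu_tanh` are names chosen here for illustration.

```python
import math


def gelu_exact(x: float) -> float:
    # x * Phi(x), with Phi computed from the error function.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    # Common tanh-based approximation of GELU.
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


# The exact form reproduces both visible test cases.
assert round(gelu_exact(0.0), 5) == 0.0
assert round(gelu_exact(1.0), 5) == 0.84134

# The approximation agrees at 0 but not at 1 once rounded to 5 decimals.
assert round(gelu_tanh(0.0), 5) == 0.0
assert round(gelu_tanh(1.0), 5) != 0.84134
```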
