AI Model Compression Part V: The LSTM: A New Kind of Memory

After the initial struggles with maintaining context over time, the next leap in AI’s memory capabilities came with the Long Short-Term Memory (LSTM) architecture, which gave machines a more selective way to retain and process information.
Birth of the Long Short-Term Memory
In 1997, Hochreiter and Schmidhuber introduced the Long Short-Term Memory (LSTM) architecture. But to understand its genius, let's first understand the human memory system it mirrors:
Human Memory System                        LSTM Gates
-------------------                        ----------
Attention Filter (what to remember)     →  Input Gate
Working Memory (current state)          →  Memory Cell
Memory Consolidation (what to forget)   →  Forget Gate
Memory Retrieval (what to use)          →  Output Gate
Explore how memory systems in AI evolved in Part IV of this series.
The mathematics of the LSTM tells this story of selective memory:
Input Gate:
i_t = σ(W_i[h_(t-1), x_t] + b_i)
Forget Gate:
f_t = σ(W_f[h_(t-1), x_t] + b_f)
Memory Cell:
c_t = f_t ⊙ c_(t-1) + i_t ⊙ tanh(W_c[h_(t-1), x_t] + b_c)
Output Gate:
o_t = σ(W_o[h_(t-1), x_t] + b_o)
Each equation handles one aspect of this selective memory:
- The Input Gate decides what new information is worth remembering
- The Forget Gate determines what old memories can fade
- The Memory Cell maintains the current state of understanding
- The Output Gate chooses what memories are relevant now
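To make the four equations concrete, here is a minimal sketch of a single LSTM time step in plain Python. It is an illustration, not a reference implementation: the helper names (`affine`, `lstm_step`, `init_params`), the dictionary layout of the weights, and the small random initialization are all assumptions made for this example. The final line, h_t = o_t ⊙ tanh(c_t), is the standard hidden-state update; it is not listed among the equations above but is needed to show how the output gate is applied.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, v):
    """Compute W v + b, where W is a list of rows and b, v are lists."""
    return [sum(w * x for w, x in zip(row, v)) + b_k for row, b_k in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step. `p` is a dict of weights (hypothetical layout)."""
    v = h_prev + x_t  # list concatenation: [h_(t-1), x_t]
    i_t = [sigmoid(z) for z in affine(p["W_i"], p["b_i"], v)]    # input gate
    f_t = [sigmoid(z) for z in affine(p["W_f"], p["b_f"], v)]    # forget gate
    o_t = [sigmoid(z) for z in affine(p["W_o"], p["b_o"], v)]    # output gate
    g_t = [math.tanh(z) for z in affine(p["W_c"], p["b_c"], v)]  # candidate memory
    # Memory cell: keep part of the old state, write part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    # Standard hidden-state update (assumed; not shown in the equations above).
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

def init_params(input_size, hidden_size, seed=0):
    """Small random weights and zero biases; an arbitrary choice for the demo."""
    rng = random.Random(seed)
    p = {}
    for gate in "ifco":
        p["W_" + gate] = [
            [rng.gauss(0.0, 0.1) for _ in range(hidden_size + input_size)]
            for _ in range(hidden_size)
        ]
        p["b_" + gate] = [0.0] * hidden_size
    return p

if __name__ == "__main__":
    params = init_params(input_size=3, hidden_size=4)
    h, c = [0.0] * 4, [0.0] * 4
    for x in ([1.0, 0.5, -0.5], [0.0, 1.0, 0.0], [0.2, 0.2, 0.2]):
        h, c = lstm_step(x, h, c, params)
    print("hidden state:", h)
    print("cell state:  ", c)
```

Plain lists keep the sketch dependency-free; in practice the same arithmetic would normally be written with an array library so that each gate is a single matrix multiplication.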
The Dance of Memory: How LSTMs Learn
Think of an LSTM as a master storyteller, constantly deciding:
- Which details to emphasize
- Which to let fade
- How to connect distant events
- When to recall earlier information
This mirrors how human consciousness works with memory:
Example: Reading a Mystery Novel
Human Process             LSTM Process
-------------             ------------
Note key clues         →  Input Gate activates for important information
Hold suspects in mind  →  Memory Cell maintains key details
Discard red herrings   →  Forget Gate removes irrelevant information
Connect final clues    →  Output Gate retrieves stored information
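The mystery-novel analogy can be traced through the memory cell equation with a toy example. The sketch below follows a single scalar cell and sets the gate values by hand to exactly 0 or 1. That is a simplifying assumption for illustration only: in a trained LSTM the gates are sigmoid outputs strictly between 0 and 1, and they are computed from the input and previous hidden state rather than chosen. The function names and the specific numbers (0.9, 0.3, 50 steps) are hypothetical.

```python
import math

def update_cell(c_prev, f_t, i_t, candidate):
    """Scalar form of c_t = f_t * c_(t-1) + i_t * candidate."""
    return f_t * c_prev + i_t * candidate

def read_novel():
    """Trace one cell value through the four stages of the analogy."""
    trace = {}
    c = 0.0

    # Note a key clue: input gate open, so the candidate value is written.
    c = update_cell(c, f_t=1.0, i_t=1.0, candidate=0.9)
    trace["noted"] = c

    # Hold it in mind for 50 steps: forget gate open, input gate closed,
    # so new candidates (0.3 here) are ignored and the stored value persists.
    for _ in range(50):
        c = update_cell(c, f_t=1.0, i_t=0.0, candidate=0.3)
    trace["held"] = c

    # Connect final clues: the output gate exposes the stored value
    # through the standard hidden-state update h_t = o_t * tanh(c_t).
    o_t = 1.0
    trace["recalled"] = o_t * math.tanh(c)

    # Discard a red herring: forget gate closed, so the old value is erased.
    c = update_cell(c, f_t=0.0, i_t=0.0, candidate=0.0)
    trace["discarded"] = c
    return trace

if __name__ == "__main__":
    for stage, value in read_novel().items():
        print(f"{stage:>9}: {value:.4f}")
```

With the forget gate held at 1 and the input gate at 0, the stored value survives all 50 steps unchanged, which is the mechanism that lets the cell connect distant events in a sequence.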