Which network is designed to address vanishing gradient by maintaining memory across time steps?

Prepare for the GARP Risk and AI (RAI) Exam with targeted quizzes. Utilize flashcards, multiple-choice questions, and detailed explanations to enhance learning. Ace your exam with our comprehensive quiz!

Multiple Choice

Which network is designed to address vanishing gradient by maintaining memory across time steps?

Explanation:
The idea being tested is how a network can keep information across many steps of a sequence without gradients fading away during learning. Long Short-Term Memory networks do this by introducing a cell state that carries memory along the time axis with minimal disruption. The gates—forget, input, and output—control what memory to keep, what new information to add, and what part of that memory affects the output. This setup lets gradients flow through many time steps more reliably, so the network can learn long-range dependencies instead of only short-term patterns. Other options don’t target this capability in the same way. Autoencoders focus on compressing and reconstructing data rather than maintaining temporal memory across steps. Standard RNNs can remember some past information but suffer from vanishing gradients when the sequence is long, making it hard to learn long-range relationships. CNNs are designed for spatial patterns and lack a mechanism to maintain state over time steps, so they aren’t suited for preserving memory across time in the same way.

The idea being tested is how a network can keep information across many steps of a sequence without gradients fading away during learning. Long Short-Term Memory networks do this by introducing a cell state that carries memory along the time axis with minimal disruption. The gates—forget, input, and output—control what memory to keep, what new information to add, and what part of that memory affects the output. This setup lets gradients flow through many time steps more reliably, so the network can learn long-range dependencies instead of only short-term patterns.

Other options don’t target this capability in the same way. Autoencoders focus on compressing and reconstructing data rather than maintaining temporal memory across steps. Standard RNNs can remember some past information but suffer from vanishing gradients when the sequence is long, making it hard to learn long-range relationships. CNNs are designed for spatial patterns and lack a mechanism to maintain state over time steps, so they aren’t suited for preserving memory across time in the same way.

Subscribe

Get the latest from Examzify

You can unsubscribe at any time. Read our privacy policy