Fine-grain pipelining is difficult to achieve in forward dynamic programming (DP) architectures. This correspondence proposes novel computation techniques that achieve it. We implement the sequential DP algorithm using fewer, finer-grain pipelined processors, and increase hardware efficiency by using a novel computation sequence. We also use look-ahead computation to obtain a concurrent DP algorithm, and combine it with an appropriate computation sequence to achieve further pipelining. The finer-grain pipelined architectures are mapped to ring and mesh processor arrays; they achieve approximately the same iteration rate as the coarse-grain pipelined architectures while using much less hardware. The design of interleaved architectures using multiple clocks is also outlined.
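
As an illustration only, the look-ahead idea can be sketched in software under the common assumption that each forward DP stage is a matrix-vector product in the (min, +) algebra. The formulation, the function names, and the two-stage look-ahead depth below are assumptions made for this sketch and are not taken from the correspondence itself. Because (min, +) products are associative, two consecutive stage matrices can be combined ahead of time, so the state recursion advances two stages per step.

```python
# Sketch only: forward DP written as f_{k+1} = A_k (x) f_k in the (min, +) algebra.
# All names here are hypothetical and chosen for illustration.

def minplus_matvec(A, f):
    """(min, +) matrix-vector product: result[i] = min_j (A[i][j] + f[j])."""
    return [min(a + x for a, x in zip(row, f)) for row in A]


def minplus_matmul(A, B):
    """(min, +) matrix-matrix product: result[i][j] = min_k (A[i][k] + B[k][j])."""
    inner = len(B)
    cols = len(B[0])
    return [
        [min(row[k] + B[k][j] for k in range(inner)) for j in range(cols)]
        for row in A
    ]


def sequential_dp(stages, f0):
    """Apply the stages one at a time (the sequential recursion)."""
    f = f0
    for A in stages:
        f = minplus_matvec(A, f)
    return f


def lookahead_dp(stages, f0):
    """Two-stage look-ahead: combine A_{k+1} (x) A_k first, then update the state.

    The combined matrices do not depend on the state f, so in hardware they
    could be formed off the critical recursion loop.
    """
    f = f0
    for k in range(0, len(stages) - 1, 2):
        combined = minplus_matmul(stages[k + 1], stages[k])
        f = minplus_matvec(combined, f)
    if len(stages) % 2:  # one stage left over when the stage count is odd
        f = minplus_matvec(stages[-1], f)
    return f


stages = [
    [[1, 4], [2, 0]],
    [[3, 1], [5, 2]],
    [[0, 6], [7, 1]],
]
f0 = [0, 0]
result_sequential = sequential_dp(stages, f0)
result_lookahead = lookahead_dp(stages, f0)
```

Both functions return the same final cost vector for the same inputs; the look-ahead version trades extra matrix-matrix work for a shorter dependence chain on the state, which is the kind of concurrency that further pipelining can exploit.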