This paper presents two classes of novel folded VLSI architectures for implementing discrete wavelet transforms: the word-level folded architecture and the bit-level folded, or digit-serial, architecture. In the word-level folded architecture, the computations of all wavelet levels are folded onto the same low-pass and high-pass filters. The number of registers in the folded architecture is minimized using generalized lifetime analysis, and the converter units are synthesized with a minimum number of registers using forward-backward allocation. The advantage of the word-level folded architecture is low latency; its drawbacks are increased hardware area, less than 100 percent hardware utilization, and the complex routing and interconnection required by its converters. These drawbacks are eliminated in the alternative bit-level folded digit-serial architecture, which requires simpler control circuits, routing, and interconnection and achieves complete hardware utilization, at the expense of increased system latency and some constraints on the word-length. We propose the word-level folded architecture for latency-critical applications and the bit-level folded digit-serial architecture for applications in which latency is less critical.
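
As a rough software analogy for folding all wavelet levels onto one filter pair, the sketch below reuses a single low-pass/high-pass pair for every decomposition level. It is not the hardware design described here: the averaging/differencing (Haar-style) coefficients, the periodic boundary handling, and the names `filter_and_decimate` and `folded_dwt` are assumptions made for illustration, and the sketch runs the levels one after another without modelling scheduling, registers, or converters.

```python
from typing import List, Sequence, Tuple

# Assumed 2-tap averaging/differencing filters (illustrative, not from the paper).
LOWPASS: Tuple[float, ...] = (0.5, 0.5)
HIGHPASS: Tuple[float, ...] = (0.5, -0.5)


def filter_and_decimate(x: Sequence[float], h: Sequence[float]) -> List[float]:
    """Apply FIR filter h to x and keep every second output.

    Computes y[i] = sum_k h[k] * x[(2*i + k) mod N], i.e. the input is
    treated as periodic (an assumption of this sketch).
    """
    n = len(x)
    return [
        sum(h[k] * x[(2 * i + k) % n] for k in range(len(h)))
        for i in range(n // 2)
    ]


def folded_dwt(x: Sequence[float], levels: int) -> Tuple[List[float], List[List[float]]]:
    """Multi-level DWT in which every level reuses the same filter pair.

    Returns the final approximation and the detail coefficients of each
    level, finest level first.
    """
    approx = list(x)
    details: List[List[float]] = []
    for _ in range(levels):
        if len(approx) < 2 or len(approx) % 2 != 0:
            raise ValueError("each level needs an even-length input of at least 2 samples")
        # The same two filters serve every level; only their input changes.
        details.append(filter_and_decimate(approx, HIGHPASS))
        approx = filter_and_decimate(approx, LOWPASS)
    return approx, details


if __name__ == "__main__":
    approx, details = folded_dwt([4, 6, 10, 12, 8, 6, 5, 5], levels=3)
    print(approx)
    print(details)
```

Because each level halves the number of samples, the filter pair has progressively less work at coarser levels, which is what makes sharing one pair across all levels plausible in the first place.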