Semi-parallel, or folded, VLSI architectures are used whenever hardware resources need to be saved at design time. Most recent applications that are based on Projective Geometry (PG) based balanced bipartite graph also fall in this category. In this paper, we provide a high-level, top-down design methodology to design optimal semi-parallel architectures for applications, whose Data Flow Graph (DFG) is based on PG bipartite graph. Such applications have been found e.g. in error-control coding and matrix computations. Unlike many other folding schemes, the topology of connections between physical elements does not change in this methodology. Another advantage is the ease of implementation. To lessen the throughput loss due to folding, we also incorporate a multi-tier pipelining strategy in the design methodology. The design methodology has been verified by implementing a synthesis tool in C++, which has been verified as well. The tool is publicly available. Further, a complete decoder was manually protototyped before the synthesis tool design, to verify all the algorithms evolved in this paper, towards various steps of refinement. Another specific high-performance design of an LDPC decoder based on this methodology was worked out in past, and has been patented as well.