Polar codes are the first error-correcting codes proven to achieve channel capacity, but only asymptotically, as the code length tends to infinity. For finite code lengths, the working frequency of existing decoder architectures is limited by the partial sums computation unit. In this paper, we explain how the partial sums computation can be seen as a matrix multiplication. We then investigate an efficient hardware implementation of this product, which uses reduced logic resources and interconnections. Formalized architectures to compute the partial sums and to generate the bits of the generator matrix $\kappa^{\otimes n}$ are presented. The proposed architecture removes the multiplexing resources otherwise used to assign the required partial sums to each processing element.
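
To make the matrix-multiplication view concrete, the following minimal software sketch may help. It assumes the standard 2x2 Arikan kernel [[1, 0], [1, 1]] and builds the generator matrix as its n-th Kronecker power; all function names are hypothetical and illustrative only. It models arithmetic over GF(2), not the hardware architecture itself, and it shows that multiplying the decoded bit vector by the generator matrix, XOR-accumulating one generator row per decoded bit, and running the usual butterfly network all give the same sums.

```python
# Assumption: standard 2x2 polar kernel; the abstract does not spell it out.
KERNEL = [[1, 0], [1, 1]]


def kronecker_power(kernel, n):
    """Return the n-th Kronecker power of a binary matrix (list of rows)."""
    result = [[1]]
    for _ in range(n):
        # result (x) kernel: every entry of a result row scales a kernel row.
        result = [
            [a * b for a in row_r for b in row_k]
            for row_r in result
            for row_k in kernel
        ]
    return result


def partial_sums_by_product(u_hat, generator):
    """Compute u_hat * generator over GF(2) as a plain matrix product."""
    n_cols = len(generator[0])
    return [
        sum(u_hat[i] * generator[i][j] for i in range(len(u_hat))) % 2
        for j in range(n_cols)
    ]


def partial_sums_incremental(u_hat, generator):
    """Same product, updated bit by bit: XOR in row i whenever u_hat[i] is 1."""
    sums = [0] * len(generator[0])
    for i, bit in enumerate(u_hat):
        if bit:
            sums = [s ^ g for s, g in zip(sums, generator[i])]
    return sums


def butterfly_encode(u):
    """Reference XOR butterfly network (no bit-reversal permutation)."""
    x = list(u)
    step = 1
    while step < len(x):
        for start in range(0, len(x), 2 * step):
            for k in range(start, start + step):
                x[k] ^= x[k + step]
        step *= 2
    return x


if __name__ == "__main__":
    generator = kronecker_power(KERNEL, 2)  # 4x4 generator matrix
    u_hat = [1, 0, 1, 1]                    # example decoded bits
    print(partial_sums_by_product(u_hat, generator))
    print(partial_sums_incremental(u_hat, generator))
    print(butterfly_encode(u_hat))
```

The incremental variant reflects how a decoder obtains bits one at a time: each newly decided bit only requires one row of the generator matrix, which is why generating those matrix bits on the fly is of interest.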