The GES framework postulates a hierarchical order between grounded, embodied, and situated representations. Against this background, the present study investigated the relation between two effects: (i) semantic priming between number cues and words whose referents are located up or down in the world, depending on the number's magnitude, which is supposed to be grounded (cf. Lachmair et al., 2014), and (ii) the compatibility between number cues and the grammatical word form of the words, depending on the number's multitude, which is supposed to be embodied (cf. Roettger and Domahs, 2015). In two experiments, words referring to objects up or down in the world and spatially neutral words were presented after the numbers “1” and “9.” In Experiment 1, words were presented in singular word form, and in Experiment 2 in plural word form. For the first time, Virtual Reality was used in such an experimental setup in order to reduce spatial predispositions of participants and to provide a homogeneous experimental environment for replication purposes. According to GES, spatial semantic priming was expected to occur in both grammatical word forms, whereas the compatibility with grammatical number was expected to occur only for the plural word form due to its markedness. The results of Experiment 1 supported the spatial-semantic-priming hypothesis but not the grammatical-number hypothesis. The results of Experiment 2 supported only the grammatical-number hypothesis. It is argued that the grounded spatial effect in Experiment 1 was not affected by grammatical number, whereas in Experiment 2 this effect vanished due to an activated embodied reference frame based on grammatical number.