Humans tend to be inaccurate and inconsistent when estimating a large number of objects. Furthermore, we modify our estimates when feedback or a reference array is provided, indicating that the mappings between perceived numerosities and their corresponding numerals are largely malleable in response to calibration. However, responses to calibration vary greatly across individuals. Using uncalibrated and calibrated numerosity estimation conditions, the current study explored the factors underlying individual differences in the extent and nature of the malleability of numerosity estimation performance as a result of calibration in a sample of 71 undergraduate students. We found that individual differences in performance were reliable across conditions, and participants’ responses to calibration varied greatly. Participants who were less consistent or had more proportionally spaced (i.e., linear) estimates before calibration tended to shift the distributions of their estimates to a greater extent. Higher calculation competence also predicted an increase in how linear participants’ estimates were after calibration. Moreover, the effect of calibration was not continuous across numerosities within participants. This suggests that the mechanisms underlying numeral-numerosity mappings may be less systematic than previously thought and likely depend on cognitive mechanisms beyond representation of numerosities. Taken together, these findings suggest that the mappings between numerosities and numerical symbols may not be stable and direct, but transient and mediated by task-related (e.g., strategic) mechanisms. Estimation skills may not simply be foundational for math competence; math competence may also influence estimation skills. Therefore, numerosity estimation tasks are not a pure measure of number representations.