We bring the theory of rough paths to the study of non-parametric statistics on streamed
data. We discuss the problem of regression where the input variable is a stream of
information, and the dependent response is also (potentially) a stream.
A certain graded feature set of a stream, known in the rough path literature as the
signature, has a universality property that allows linear regression to be used, at
least formally, to characterise the functional relationship between the independent
explanatory variables and the conditional distribution of the dependent response.
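For orientation, the signature and the resulting linear model can be written out as follows. The notation (a path $X$ of bounded variation on $[0,T]$ with values in $\mathbb{R}^d$, a linear functional $\ell$, a truncation level $N$, and a noise term $\varepsilon$) is assumed here for illustration and is not fixed by the text above.

```latex
% Signature of X over [0,T]: the sequence of iterated integrals, graded by level n.
S(X)_{0,T} = \bigl(1,\ X^{1}_{0,T},\ X^{2}_{0,T},\ \dots\bigr),
\qquad
X^{n}_{0,T} = \int_{0 < t_1 < \cdots < t_n < T}
  dX_{t_1} \otimes \cdots \otimes dX_{t_n}
  \in (\mathbb{R}^{d})^{\otimes n}.

% Linear regression on the signature, truncated at level N:
Y = \bigl\langle \ell,\ S^{N}(X)_{0,T} \bigr\rangle + \varepsilon,
\qquad
S^{N}(X)_{0,T} = \bigl(1,\ X^{1}_{0,T},\ \dots,\ X^{N}_{0,T}\bigr).
```

The universality referred to above is, informally, the statement that a continuous function of the stream can be approximated arbitrarily well on compact sets by such linear functionals as $N$ grows.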
This approach, via linear regression on the signature of the stream, is almost totally
general, and yet it still allows explicit computation. The grading allows truncation
of the feature set and so leads to an efficient local description for streams (rough
paths). In the statistical context, this method offers potentially significant, even
transformational, dimension reduction.
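As a minimal sketch of the idea (not the authors' implementation), the following computes the signature of a piecewise-linear path truncated at level 2 and fits an ordinary least-squares model on those features. The function names `signature_level2` and `make_dataset`, the time-augmented random-walk data, and the toy response (the time integral of the path) are all assumptions made for illustration; the response is chosen because it is exactly a level-2 signature term, so a linear model can recover it.

```python
import numpy as np

def signature_level2(path):
    """Signature of a piecewise-linear path, truncated at level 2.

    path: array of shape (n_points, d). Returns the flat feature vector
    (1, level-1 terms, level-2 terms) of length 1 + d + d*d.
    """
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    s1 = np.zeros(d)        # level 1: total increment so far
    s2 = np.zeros((d, d))   # level 2: iterated integrals so far
    for delta in np.diff(path, axis=0):
        # Chen's identity: append a linear segment with increment `delta`,
        # whose own signature is (1, delta, delta (x) delta / 2).
        s2 = s2 + np.outer(s1, delta) + 0.5 * np.outer(delta, delta)
        s1 = s1 + delta
    return np.concatenate(([1.0], s1, s2.ravel()))

def make_dataset(n_paths, n_steps, rng):
    """Hypothetical toy data: time-augmented random walks on [0, 1]."""
    t = np.linspace(0.0, 1.0, n_steps + 1)
    feats, targets = [], []
    for _ in range(n_paths):
        steps = rng.normal(scale=n_steps ** -0.5, size=n_steps)
        x = np.concatenate(([0.0], np.cumsum(steps)))
        feats.append(signature_level2(np.column_stack((t, x))))
        # Toy response: trapezoid-rule time integral of x.
        targets.append(np.sum(0.5 * (x[1:] + x[:-1]) * np.diff(t)))
    return np.array(feats), np.array(targets)

rng = np.random.default_rng(0)
X_train, y_train = make_dataset(200, 50, rng)
X_test, y_test = make_dataset(50, 50, rng)

# Linear regression on the truncated signature features.
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
max_err = np.max(np.abs(X_test @ coef - y_test))
print("max out-of-sample error:", max_err)
```

Truncating at level 2 keeps only 1 + d + d^2 features per stream regardless of how many sample points the stream contains, which is the sense in which the grading gives a compact description. Higher truncation levels would be needed for responses that are not captured by low-order iterated integrals.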
By way of illustration, our approach is applied to stationary time series, including
the familiar AR and ARCH models. In the numerical examples we examined, our predictions
achieve accuracy similar to that of the Gaussian Process (GP) approach at much lower
computational cost, especially when the sample size is large.