68
views
1
recommends
+1 Recommend
1 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      The Shallow Gibbs Network, Double Backpropagation and Differential Machine learning

      Preprint
      In review

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          We have built a Shallow Gibbs Network model as a Random Gibbs Network Forest to reach the performance of the Multilayer feedforward Neural Network in a few numbers of parameters, and fewer backpropagation iterations. To make it happens, we propose a novel optimization framework for our Bayesian Shallow Network, called the {Double Backpropagation Scheme} (DBS) that can also fit perfectly the data with appropriate learning rate, and which is convergent and universally applicable to any Bayesian neural network problem. The contribution of this model is broad. First, it integrates all the advantages of the Potts Model, which is a very rich random partitions model, that we have also modified to propose its Complete Shrinkage version using agglomerative clustering techniques. The model takes also an advantage of Gibbs Fields for its weights precision matrix structure, mainly through Markov Random Fields, and even has five (5) variants structures at the end: the Full-Gibbs, the Sparse-Gibbs, the Between layer Sparse Gibbs which is the B-Sparse Gibbs in a short, the Compound Symmetry Gibbs (CS-Gibbs in short), and the Sparse Compound Symmetry Gibbs (Sparse-CS-Gibbs) model. The Full-Gibbs is mainly to remind fully-connected models, and the other structures are useful to show how the model can be reduced in terms of complexity with sparsity and parsimony. All those models have been experimented with the Mulan project multivariate regression dataset, and the results arouse interest in those structures, in a sense that different structures help to reach different results in terms of Mean Squared Error (MSE) and Relative Root Mean Squared Error (RRMSE). For the Shallow Gibbs Network model, we have found the perfect learning framework : it is the $(l_1, \boldsymbol{\zeta}, \epsilon_{dbs})-\textbf{DBS}$ configuration, which is a combination of the \emph{Universal Approximation Theorem}, and the DBS optimization, coupled with the (\emph{dist})-Nearest Neighbor-(h)-Taylor Series-Perfect Multivariate Interpolation (\emph{dist}-NN-(h)-TS-PMI) model [which in turn is a combination of the research of the Nearest Neighborhood for a good Train-Test association, the Taylor Approximation Theorem, and finally the Multivariate Interpolation Method]. It indicates that, with an appropriate number $l_1$ of neurons on the hidden layer, an optimal number $\zeta$ of DBS updates, an optimal DBS learnnig rate $\epsilon_{dbs}$, an optimal distance \emph{dist}$_{opt}$ in the research of the nearest neighbor in the training dataset for each test data $x_i^{\mbox{test}}$, an optimal order $h_{opt}$ of the Taylor approximation for the Perfect Multivariate Interpolation (\emph{dist}-NN-(h)-TS-PMI) model once the {\bfseries DBS} has overfitted the training dataset, the train and the test error converge to zero (0).

          Related collections

          Author and article information

          Journal
          ScienceOpen Preprints
          ScienceOpen
          13 April 2021
          Affiliations
          [1 ] Department of Mathematics and Statistics, Université de Montréal, 2920, chemin de la Tour, H3T 1J4, Montreal, Québec, Canada
          Article
          10.14293/S2199-1006.1.SOR-.PPS25DJ.v2

          This work has been published open access under Creative Commons Attribution License CC BY 4.0 , which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Conditions, terms of use and publishing policy can be found at www.scienceopen.com .

          The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

          Computer science, Statistics, Mathematics

          Taylor Theorem, MultivariateRegression, Neural Networks, Probability and stochastic processes, Graphical models, Structured models, Gibbs Fields, Sparse Models , Compound Symmetry, Double Backpropagation

          Comments

          Comment on this article