147
views
0
recommends
+1 Recommend
1 collections
    0
    shares
      Version and Review History
       
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      The Shallow Gibbs Network, Double Backpropagation and Differential Machine learning

      Preprint
      research-article
      This is not the latest version for this article. If you want to read the latest version, click here.
      Bookmark

            Abstract

            We have built a Shallow Gibbs Network model as a Random Gibbs Network Forest to reach the performance of the Multilayer feedforward Neural Network in a few numbers of parameters, and fewer backpropagation iterations. To make it happens, we propose a novel optimization framework for our Bayesian Shallow Network, called the {Double Backpropagation Scheme} (DBS) that can also fit perfectly the data with appropriate learning rate, and which is convergent and universally applicable to any Bayesian neural network problem. The contribution of this model is broad. First, it integrates all the advantages of the Potts Model, which is a very rich random partitions model, that we have also modified to propose its Complete Shrinkage version using agglomerative clustering techniques. The model takes also an advantage of Gibbs Fields for its weights precision matrix structure, mainly through Markov Random Fields, and even has five (5) variants structures at the end: the Full-Gibbs, the Sparse-Gibbs, the Between layer Sparse Gibbs which is the B-Sparse Gibbs in a short, the Compound Symmetry Gibbs (CS-Gibbs in short), and the Sparse Compound Symmetry Gibbs (Sparse-CS-Gibbs) model. The Full-Gibbs is mainly to remind fully-connected models, and the other structures are useful to show how the model can be reduced in terms of complexity with sparsity and parsimony. All those models have been experimented with the Mulan project multivariate regression dataset, and the results arouse interest in those structures, in a sense that different structures help to reach different results in terms of Mean Squared Error (MSE) and Relative Root Mean Squared Error (RRMSE). For the Shallow Gibbs Network model, we have found the perfect learning framework : it is the $(l_1, \boldsymbol{\zeta}, \epsilon_{dbs})-\textbf{DBS}$ configuration, which is a combination of the \emph{Universal Approximation Theorem}, and the DBS optimization, coupled with the (\emph{dist})-Nearest Neighbor-(h)-Taylor Series-Perfect Multivariate Interpolation (\emph{dist}-NN-(h)-TS-PMI) model [which in turn is a combination of the research of the Nearest Neighborhood for a good Train-Test association, the Taylor Approximation Theorem, and finally the Multivariate Interpolation Method]. It indicates that, with an appropriate number $l_1$ of neurons on the hidden layer, an optimal number $\zeta$ of DBS updates, an optimal DBS learnnig rate $\epsilon_{dbs}$, an optimal distance \emph{dist}$_{opt}$ in the research of the nearest neighbor in the training dataset for each test data $x_i^{\mbox{test}}$, an optimal order $h_{opt}$ of the Taylor approximation for the Perfect Multivariate Interpolation (\emph{dist}-NN-(h)-TS-PMI) model once the {\bfseries DBS} has overfitted the training dataset, the train and the test error converge to zero (0).

            Content

            Author and article information

            Journal
            ScienceOpen Preprints
            ScienceOpen
            10 April 2021
            Affiliations
            [1 ] Department of Mathematics and Statistics, Université de Montréal, 2920, chemin de la Tour, H3T 1J4, Montreal, Québec, Canada
            Article
            10.14293/S2199-1006.1.SOR-.PPS25DJ.v1
            b49facb8-7d6c-4939-8654-289873a7f33a

            This work has been published open access under Creative Commons Attribution License CC BY 4.0 , which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Conditions, terms of use and publishing policy can be found at www.scienceopen.com .


            The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
            Computer science,Statistics,Mathematics
            MultivariateRegression,Neural Networks,Probability and stochastic processes,Graphical models,Structured models,Gibbs Fields,Sparse Models ,Compound Symmetry,Double Backpropagation,Taylor Theorem

            Comments

            Comment on this article