The Meta Model and Meta Meta-Model of Deep Learning
The model for deep learning consists of a computational graph that are most conveniently constructed by composing layers with other layers. Most introductory texts emphasize the individual neuron, but in practice it is the collective behavior of a layer of neurons that is important. So from an abstraction perspective, the layer is the right level to think about.
Underneath these layers are the computational graph, it’s main purpose is to orchestrate the computation of the forward and backward phases of the network. From the perspective of optimizing the performance, this is an important abstraction to have. However, it is not at the ideal level to reason how it all should work.
Deep Learning frameworks have evolved to develop models that ease construction of DL architectures. Theano has Blocks, Lasagne and Keras. Tensorflow has Keras and TF-Slim. Keras is based on Torch, so by default has a high-level modular API. Many other less popular frameworks like Nervana, CNTK, MXNet and Chainer do have high level model APIs. All these APIs however describe models. What then is a Deep Learning meta-model? Is there even a meta meta-model?