Talk Abstract:
Part of the power of neural networks comes from the fact that they are very generic, almost "blind", tools for extracting useful information directly from data. Unlike more conventional data-analysis approaches, they do not assume any knowledge of the statistical model, or of the structure, relations, or constraints in the data, but instead use a universal network structure to learn and represent all types of models. On the other hand, if we do have some such knowledge of the model, in principle we should be able to make the inference algorithms more efficient.
How do we adjust the structure of a neural network to reflect our knowledge of the statistical model and the structure in the data? That is the topic of this talk. Conceptually, we need to identify what happens inside a neural network during the learning process: what statistical quantities are being calculated, and how they are stored inside the network. To that end, we formulate a new problem called the "universal feature selection" problem, in which we must select from high-dimensional data a low-dimensional feature that can be used to solve not one, but a family of inference problems. We solve this problem and show that 1) the solution is closely related to a number of concepts in information theory and statistics, such as the HGR maximal correlation and common information; and 2) a number of learning algorithms, including PCA, CCA, matrix factorization, and neural networks, implicitly solve the same problem. We then demonstrate how such theoretical understanding of neural networks can help us establish performance limits, design network parameters more systematically, and incorporate specific domain knowledge in the design of new network structures.
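For readers unfamiliar with it, the HGR (Hirschfeld-Gebelein-Rényi) maximal correlation mentioned above has a standard definition in the statistics literature; a brief statement follows (the feature functions f and g are generic notation for this sketch, not taken from the talk):

\rho(X; Y) \;=\; \sup_{f,\, g} \; \mathbb{E}\big[ f(X)\, g(Y) \big], \quad \text{subject to } \mathbb{E}[f(X)] = \mathbb{E}[g(Y)] = 0, \;\; \mathbb{E}[f^2(X)] = \mathbb{E}[g^2(Y)] = 1.

That is, one searches over all zero-mean, unit-variance transformations of X and of Y for the pair that is most correlated; the optimizing pair (f, g) can itself be read as a learned low-dimensional feature of the data.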
Speaker Bio:
Lizhong Zheng received the B.S. and M.S. degrees, in 1994 and 1997 respectively, from the Department of Electronic Engineering, Tsinghua University, China, and the Ph.D. degree, in 2002, from the Department of Electrical Engineering and Computer Sciences, University of California, Berkeley. Since 2002, he has been working at MIT, where he is currently a Professor of Electrical Engineering. His research interests include information theory, statistical inference, communications, and network theory. He received the Eli Jury Award from UC Berkeley in 2002, the IEEE Information Theory Society Paper Award in 2003, the NSF CAREER Award in 2004, and the AFOSR Young Investigator Award in 2007. He served as an associate editor for the IEEE Transactions on Information Theory, and as general co-chair of the IEEE International Symposium on Information Theory in 2012. He is an IEEE Fellow.