Frontiers of Electrical and Electronic Engineering

ISSN 2095-2732

ISSN 2095-2740(Online)

CN 10-1028/TM

Volume 5, Issue 3

Research articles
Emerging themes on information theory and Bayesian approach
Lei XU, Yanda LI
Front. Electr. Electron. Eng. 2010, 5(3): 237-240.
https://doi.org/10.1007/s11460-010-0100-4

Abstract   PDF (95KB)
Information geometry in optimization, machine learning and statistical inference
Shun-ichi AMARI
Front. Electr. Electron. Eng. 2010, 5(3): 241-260.
https://doi.org/10.1007/s11460-010-0101-3

Abstract   PDF (374KB)
The present article gives an introduction to information geometry and surveys its applications in machine learning, optimization, and statistical inference. Information geometry is explained intuitively by using divergence functions introduced on a manifold of probability distributions and other general manifolds. They give a Riemannian structure together with a pair of dual flatness criteria, and many manifolds are dually flat. When a manifold is dually flat, a generalized Pythagorean theorem and a related projection theorem hold. They provide useful means for various approximation and optimization problems. We apply them to alternating minimization problems, Ying-Yang machines, and the belief propagation algorithm in machine learning.
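The generalized Pythagorean theorem mentioned in the abstract can be illustrated numerically. This is a sketch, not taken from the paper: the joint distribution is arbitrary, and the m-projection onto the dually flat family of independent distributions is the product of the marginals, for which D(p||r) = D(p||q) + D(q||r) holds exactly for any product distribution r.

```python
import numpy as np

# Joint distribution p(x, y) over a 2x2 outcome space (arbitrary example values).
p = np.array([[0.30, 0.20],
              [0.10, 0.40]])

# m-projection of p onto the dually flat manifold of independent
# distributions: the product of its marginals, q = p_X (x) p_Y.
q = np.outer(p.sum(axis=1), p.sum(axis=0))

# Any other independent (product) distribution r.
r = np.outer([0.5, 0.5], [0.25, 0.75])

def kl(a, b):
    """Kullback-Leibler divergence D(a || b) in nats."""
    return float(np.sum(a * np.log(a / b)))

# Generalized Pythagorean theorem: D(p||r) = D(p||q) + D(q||r).
assert abs(kl(p, r) - (kl(p, q) + kl(q, r))) < 1e-12
```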
RESEARCH ARTICLE
Orthogonal nonnegative learning for sparse feature extraction and approximate combinatorial optimization
Erkki OJA, Zhirong YANG
Front. Electr. Electron. Eng. 2010, 5(3): 261-273.
https://doi.org/10.1007/s11460-010-0106-y

Abstract   HTML   PDF (320KB)

Nonnegativity has been shown to be a powerful principle in linear matrix decompositions, leading to sparse component matrices in feature analysis and data compression. The classical method is Lee and Seung's Nonnegative Matrix Factorization (NMF). A standard way to form learning rules is by multiplicative updates, which maintain nonnegativity. Here, a generic principle is presented for forming multiplicative update rules that integrate an orthonormality constraint into nonnegative learning. The principle, called Orthogonal Nonnegative Learning (ONL), is rigorously derived from the Lagrangian technique. As examples, the proposed method is applied to transform NMF and its variant, Projective Nonnegative Matrix Factorization (PNMF), into their orthogonal versions. It is well known that orthogonal nonnegative learning can give very useful approximate solutions for problems involving non-vectorial data, for example, those requiring binary solutions. Combinatorial optimization is replaced by continuous-space gradient optimization, which is often computationally lighter. It is shown how the multiplicative update rules obtained by the proposed ONL principle can find a nonnegative and highly orthogonal matrix for an approximated graph partitioning problem. The empirical results on various graphs indicate that our nonnegative learning algorithms not only outperform those without the orthogonality condition, but also surpass other existing partitioning approaches.
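The classical Lee-Seung multiplicative updates that ONL builds on can be sketched in a few lines. This is a baseline sketch on assumed toy data, not the paper's orthogonal variant:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy nonnegative data matrix V, approximated as V ~ W @ H with inner rank 3.
V = rng.random((20, 12))
W = rng.random((20, 3)) + 0.1
H = rng.random((3, 12)) + 0.1

err_before = np.linalg.norm(V - W @ H)

# Lee-Seung multiplicative updates for the Euclidean objective ||V - WH||^2;
# nonnegativity is preserved because every factor in the updates is >= 0.
eps = 1e-9  # guard against division by zero
for _ in range(200):
    H *= (W.T @ V) / (W.T @ W @ H + eps)
    W *= (V @ H.T) / (W @ H @ H.T + eps)

err_after = np.linalg.norm(V - W @ H)
assert err_after < err_before  # the objective is non-increasing
```

ONL additionally enforces approximate orthonormality of one factor via Lagrangian terms folded into the same multiplicative form.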

Research articles
Basics of estimation
Jorma RISSANEN
Front. Electr. Electron. Eng. 2010, 5(3): 274-280.
https://doi.org/10.1007/s11460-010-0104-0

Abstract   PDF (166KB)
This paper outlines a theory of estimation, where optimality is defined for all sizes of data, not only asymptotically. Also, a single principle is needed to cover estimation of both real-valued parameters and their number. To achieve this we have to abandon the traditional assumption that the observed data have been generated by a "true" distribution, and that the objective of estimation is to recover this distribution from the data. Instead, the objective in this theory is to fit "models", as distributions, to the data in order to find their regular statistical features. The performance of a fitted model is measured by the probability it assigns to the data: a large probability means a good fit and a small probability a bad fit. Equivalently, the negative logarithm of the probability should be minimized, which has the interpretation of a code length. There are three equivalent characterizations of optimal estimators: the first defined by estimation capacity, the second by necessary conditions for optimality for all data, and the third by the complete Minimum Description Length (MDL) principle.
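The code-length view of model fit can be illustrated with a toy Bernoulli example (illustrative only, not from the paper): a model assigning probability P to the data can encode it in -log2(P) bits, so a better-fitting model yields a shorter code.

```python
import math

# Observed bit string: 6 ones, 2 zeros.
data = [1, 1, 1, 0, 1, 1, 0, 1]

def code_length_bits(theta, bits):
    """Negative log2-probability of the data under Bernoulli(theta)."""
    p = 1.0
    for b in bits:
        p *= theta if b == 1 else (1.0 - theta)
    return -math.log2(p)

fair = code_length_bits(0.5, data)     # fair coin: exactly 8.0 bits
fitted = code_length_bits(0.75, data)  # maximum-likelihood parameter 6/8
assert fair == 8.0
assert fitted < fair  # better fit = higher probability = shorter code
```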
RESEARCH ARTICLE
Bayesian Ying-Yang system, best harmony learning, and five action circling
Lei XU
Front. Electr. Electron. Eng. 2010, 5(3): 281-328.
https://doi.org/10.1007/s11460-010-0108-9

Abstract   HTML   PDF (2265KB)

First proposed in 1995 and systematically developed in the past decade, Bayesian Ying-Yang learning ("Ying" is spelled "Yin" in Chinese Pinyin; the term "Ying-Yang" has been deliberately adopted since 1995 to keep its original harmony with "Yang") is a statistical approach to a two-pathway intelligent system via two complementary Bayesian representations of a joint distribution on the external observation X and its inner representation R, which can be understood from the perspective of the ancient Ying-Yang philosophy. We have q(X,R) = q(X|R)q(R) as Ying, which is primary, with its structure designed according to the tasks of the system, and p(X,R) = p(R|X)p(X) as Yang, which is secondary, with p(X) given by samples of X while the structure of p(R|X) is designed from Ying according to a Ying-Yang variety preservation principle, i.e., p(R|X) is designed as a functional with q(X|R) and q(R) as its arguments. We call this pair a Bayesian Ying-Yang (BYY) system. A Ying-Yang best harmony principle is proposed for learning all the unknowns in the system, with the help of an implementation featuring a five-action circling under the name of the A5 paradigm. Interestingly, it coincides with the famous ancient WuXing theory, which provides a general guide to keeping the A5 circling well balanced towards a Ying-Yang best harmony. BYY learning provides not only a general framework that accommodates typical learning approaches from a unified perspective but also a new road that leads to improved model selection criteria, Ying-Yang alternative learning with automatic model selection, and coordinated implementation of Ying-based model selection and Yang-based learning regularization.
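The two complementary factorizations of the joint distribution can be checked numerically on a toy discrete example (illustrative values only, not from the paper): both the Ying and the Yang pathway reconstruct the same joint.

```python
import numpy as np

# A toy joint distribution over observation X (rows) and inner
# representation R (columns).
joint = np.array([[0.10, 0.25],
                  [0.35, 0.30]])

pX = joint.sum(axis=1)            # p(X); in practice given by the samples
qR = joint.sum(axis=0)            # q(R)
qX_given_R = joint / qR           # q(X|R): each column sums to 1
pR_given_X = joint / pX[:, None]  # p(R|X): each row sums to 1

ying = qX_given_R * qR            # Ying:  q(X,R) = q(X|R) q(R)
yang = pR_given_X * pX[:, None]   # Yang:  p(X,R) = p(R|X) p(X)

# Perfect Ying-Yang harmony: both pathways recover the same joint.
assert np.allclose(ying, joint) and np.allclose(yang, joint)
```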

This paper aims at an introduction to BYY learning with a twofold purpose. On one hand, we introduce the fundamentals of BYY learning, including system design principles of least redundancy versus variety preservation, global learning principles of Ying-Yang harmony versus Ying-Yang matching, and local updating mechanisms of rival penalized competitive learning (RPCL) versus maximum a posteriori (MAP) competitive learning, as well as learning regularization by data smoothing and an induced bias cancellation (IBC) prior. We also introduce basic implementing techniques, including apex approximation, primal gradient flow, Ying-Yang alternation, and the Sheng-Ke-Cheng-Hui law. On the other hand, we provide a tutorial on learning algorithms for a number of typical learning tasks, including Gaussian mixtures, factor analysis (FA) with independent Gaussian, binary, and non-Gaussian factors, local FA, temporal FA (TFA), the hidden Markov model (HMM), hierarchical BYY, three-layer networks, mixtures of experts, radial basis functions (RBFs), and subspace based functions (SBFs). This tutorial introduces BYY learning algorithms in comparison with typical algorithms, particularly against the benchmark of the expectation-maximization (EM) algorithm for maximum likelihood. These algorithms are summarized in a unified Ying-Yang alternation procedure, with the major parts sharing the same expression and the differences characterized by a few options in some subroutines. Additionally, a new insight is provided on the ancient Chinese philosophy of Yin-Yang and WuXing from the perspective of information science and intelligent systems.

Research articles
An information theory perspective on computational vision
Alan YUILLE
Front. Electr. Electron. Eng. 2010, 5(3): 329-346.
https://doi.org/10.1007/s11460-010-0107-x

Abstract   PDF (682KB)
This paper introduces computer vision from an information theory perspective. We discuss how vision can be thought of as a decoding problem whose goal is to find the most efficient encoding of the visual scene. This requires probabilistic models capable of capturing the complexity and ambiguities of natural images. We start by describing classic Markov Random Field (MRF) models of images. We stress the importance of having efficient inference and learning algorithms for these models and emphasize those approaches that use concepts from information theory. Next, we introduce more powerful image models that have recently been developed and that are better able to deal with the complexities of natural images. These models use stochastic grammars and hierarchical representations. They are trained using images from increasingly large databases. Finally, we describe how techniques from information theory can be used to analyze vision models and measure the effectiveness of different visual cues.
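MRF image models of the kind mentioned above define a Gibbs distribution P(x) proportional to exp(-E(x)). A tiny 1-D smoothness prior (an illustrative sketch, not a model from the paper) shows how lower-energy, smoother configurations receive higher probability:

```python
import itertools
import math

# Ising-style pairwise potential: each pair of unequal neighboring
# "pixels" costs beta units of energy, so smooth label strings are favored.
def energy(x, beta=1.0):
    return beta * sum(int(a != b) for a, b in zip(x, x[1:]))

# Enumerate all binary label strings of length 4 and normalize exactly.
labels = list(itertools.product([0, 1], repeat=4))
Z = sum(math.exp(-energy(x)) for x in labels)         # partition function
prob = {x: math.exp(-energy(x)) / Z for x in labels}  # Gibbs distribution

# Smooth configurations are more probable than fragmented ones.
assert prob[(0, 0, 0, 0)] > prob[(0, 1, 0, 1)]
assert abs(sum(prob.values()) - 1.0) < 1e-12
```

Exact enumeration only works at toy scale; the abstract's point is precisely that realistic models need efficient approximate inference.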
The new AI is general and mathematically rigorous
Jürgen SCHMIDHUBER
Front. Electr. Electron. Eng. 2010, 5(3): 347-362.
https://doi.org/10.1007/s11460-010-0105-z

Abstract   PDF (274KB)
Most traditional artificial intelligence (AI) systems of the past decades are either very limited, or based on heuristics, or both. The new millennium, however, has brought substantial progress in the field of theoretically optimal and practically feasible algorithms for prediction, search, inductive inference based on Occam’s razor, problem solving, decision making, and reinforcement learning in environments of a very general type. Since inductive inference is at the heart of all inductive sciences, some of the results are relevant not only for AI and computer science but also for physics, provoking nontraditional predictions based on Zuse’s thesis of the computer-generated universe. We first briefly review the history of AI since Gödel’s 1931 paper, then discuss recent post-2000 approaches that are currently transforming general AI research into a formal science.
Network coding theory: An introduction
Raymond W. YEUNG
Front. Electr. Electron. Eng. 2010, 5(3): 363-390.
https://doi.org/10.1007/s11460-010-0103-1

Abstract   PDF (646KB)
For a long time, store-and-forward had been the transport mode in network communications. In other words, information had been regarded as a commodity that only needs to be routed through the network, possibly with replication at the intermediate nodes. In the late 1990s, a new concept called network coding fundamentally changed the way a network can be operated. Under the paradigm of network coding, information can be processed within the network for the purpose of transmission. It was demonstrated that, compared with store-and-forward, the network throughput can generally be increased by employing network coding. Since then, network coding has made a significant impact on different branches of information science. The impact of network coding has gone as far as mathematics, physics, and biology. This expository work aims to be an introduction to this fast-growing subject, with a detailed discussion of the basic theoretical results.
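The throughput gain can be seen in the classic butterfly-network example, sketched here as a standard textbook instance rather than taken from the paper: two source bits must reach both sinks across a middle link of capacity one, and a single XOR at the bottleneck node lets each sink recover both bits, which store-and-forward cannot do in one use of the link.

```python
def butterfly(b1: int, b2: int):
    """Deliver both source bits to both sinks of the butterfly network."""
    coded = b1 ^ b2           # the bottleneck node forwards b1 XOR b2
    sink1 = (b1, b1 ^ coded)  # sink 1 gets b1 directly, decodes b2
    sink2 = (b2 ^ coded, b2)  # sink 2 gets b2 directly, decodes b1
    return sink1, sink2

# Both sinks recover (b1, b2) for every input combination.
for b1 in (0, 1):
    for b2 in (0, 1):
        s1, s2 = butterfly(b1, b2)
        assert s1 == (b1, b2) and s2 == (b1, b2)
```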
Bioinformatics — Mining the genome for information
Runsheng CHEN, Geir SKOGERBØ
Front. Electr. Electron. Eng. 2010, 5(3): 391-404.
https://doi.org/10.1007/s11460-010-0109-8

Abstract   PDF (258KB)
Since the launch of the human genome sequencing project in the 1990s, genomic research has achieved substantial results. By the beginning of the present century, the complete genomes of several model organisms had been sequenced, including a number of prokaryotic microorganisms and the eukaryotes yeast (Saccharomyces cerevisiae), nematode (C. elegans), fruit fly (Drosophila melanogaster), and thale cress (Arabidopsis thaliana), as well as the major part of the human genome. These achievements signified that a new era of data mining and analysis of the human genome had commenced. The language of human genetics would gradually be read and understood, and the genetic information underlying metabolism, development, differentiation, and evolution would progressively become known to mankind. Large amounts of data are already accumulating, but at present many of the rules that should guide the understanding of this information are still unknown. Bioinformatics research is thus not only becoming more important, but is also faced with severe challenges as well as great opportunities.
Life and information
Yanda LI
Front. Electr. Electron. Eng. 2010, 5(3): 405-410.
https://doi.org/10.1007/s11460-010-0102-2

Abstract   PDF (254KB)
A computer virus is in essence merely a small piece of program, and similarly, an organism may be considered an information system in nature. This paper analyzes the above idea in several ways: 1) a DNA sequence satisfies the basic requirements of an information system; 2) the control of a man and that of a robot both obey the principles of cybernetics; 3) how a man can have ideas while a robot has no such capacity; 4) the advantages of understanding a living organism from the point of view of information systems.