Abstract
〈Vol.12 No.5(2019.9)〉

Titles
[CONTRIBUTED PAPERS]


■ Adaptive Multi-point Temperature Control for Microwave Heating Process via Multi-Rate Sampling

Chongqing University・Shan LIANG,Tong LIU,
Junrong SONG,Qingyu XIONG,and Kai WANG

Microwave heating has gradually spread from domestic microwave ovens to industrial material processing because of its substantial advantages, such as high efficiency, freedom from pollution, and selective heating. Unfortunately, temperature non-uniformity, which may cause thermal runaway, remains an obstacle to the development of microwave energy applications. In addition, a common problem in microwave heating systems is that microwave power transmission operates on a faster time scale than the temperature detection period. Thus, to ensure global temperature uniformity and to enhance the system's adaptivity to deviations in the temperature detection position in a microwave heating system with input constraints, a multi-rate simple adaptive multi-point temperature control strategy based on almost strictly positive real conditions is proposed. Multi-rate sampling and the lifting technique are used to handle the case in which the system has fewer inputs than outputs. Finally, simulation results demonstrate the effectiveness of the proposed control strategy.


■ Self-Interference Suppression Based on Sampled-Data H-infinity Control for Baseband Signal Subspaces

Tokyo Institute of Technology・Hampei SASAHARA,
The University of Kitakyushu/Indian Institute of Technology Bombay・Masaaki NAGAHARA,
Osaka City University・Kazunori HAYASHI,and
Kyoto University/CentraleSupelec・Yutaka YAMAMOTO

In this paper, we propose a design method for self-interference cancelers for in-band full-duplex wireless relaying that takes baseband signal subspaces into account. We model the relaying system with self-interference as a sampled-data feedback control system. Then we formulate the design problem as a sampled-data H∞ control problem with a generalized sampler and a generalized hold. The problem can be reduced to a discrete-time ℓ2-induced norm optimization problem by explicitly considering the subspace spanned by baseband signals. Moreover, for implementation we also adopt ideal uniform samplers and zero-order holds with digital filters and up/down samplers. Under these implementation constraints, we reformulate the problem as a standard discrete-time H∞ control problem by using the discrete-time lifting technique. Simulation results are shown to illustrate the effectiveness of the proposed method.


■ Proposal and Evaluation of Detour Path Suppression Method in PS Reinforcement Learning

Meiji University・Daisuke SHIRAISHI,
National Institution for Academic Degrees and Quality Enhancement of Higher Education・Kazuteru MIYAZAKI,
and Meiji University・Hiroaki KOBAYASHI

Profit sharing (PS) is a well-known reinforcement learning method. In a PS method, a reward is generally distributed with a geometrically decreasing function, and the common ratio of the function is called the discount rate. A large discount rate increases the learning speed, but a non-optimal policy may be learned. On the other hand, a small discount rate improves the performance of the policy, but the learning may not proceed smoothly because of the shallow learning depth. In this paper, to cope with these problems, we propose a method that reinforces the detour path and the non-detour path with different discount rates. Finally, this method is applied to a maze problem and an altruistic multi-agent environment to confirm its effectiveness.
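
The geometrically decreasing reward distribution described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `distribute_reward`, the representation of an episode as a list of (state, action) rules, and the reward and discount-rate values are all hypothetical. It covers only the basic single-rate distribution, not the proposed detour suppression.

```python
def distribute_reward(episode, reward, discount_rate):
    """Assign credit to each (state, action) rule in an episode.

    The rule fired just before the reward receives the full reward;
    each earlier rule receives the previous credit multiplied by
    discount_rate (the common ratio of the geometric sequence).
    """
    credits = {}
    credit = reward
    for rule in reversed(episode):
        # Accumulate, since the same rule may appear more than once.
        credits[rule] = credits.get(rule, 0.0) + credit
        credit *= discount_rate
    return credits


# Hypothetical three-step episode that ends with a reward of 100.
episode = [("s0", "right"), ("s1", "right"), ("s2", "up")]
credits = distribute_reward(episode, reward=100.0, discount_rate=0.5)

for rule in episode:
    print(rule, credits[rule])
```

With a discount rate close to 1, rules far from the reward still receive substantial credit, which speeds up learning but also reinforces detours. With a small discount rate, credit vanishes within a few steps, which corresponds to the shallow learning depth mentioned in the abstract.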


■ Utilizing Observed Information for No-Communication Multi-Agent Reinforcement Learning toward Cooperation in Dynamic Environment

The University of Electro-Communications・Fumito UWANO and Keiki TAKADAMA

This paper proposes a multi-agent reinforcement learning method without communication for dynamic environments, called Profit minimizing reinforcement learning with oblivion of memory (PMRL-OM). PMRL-OM extends PMRL and defines a memory range so that only valuable information from the environment is utilized. Since agents do not require information observed before an environmental change, they utilize only the information acquired after a certain iteration, which is determined by the memory range. In addition, PMRL-OM improves the update function for the goal value, which serves as a priority of purpose, and updates the goal value based on newer information. To evaluate the effectiveness of PMRL-OM, this study compares PMRL-OM with PMRL in five dynamic maze environments: state changes for two types of cooperation, position changes for two types of cooperation, and a case combining these four. The experimental results revealed that: (a) PMRL-OM was an effective method for cooperation in all five dynamic environments examined in this study; (b) PMRL-OM was more effective than PMRL in these dynamic environments; and (c) PMRL-OM performed well with a memory range of 100 to 500.


■ Extended Transfer-Function and Pole-Zero Cancellation in Linear Time-Varying Systems

Kanazawa University・Ichiro JIKUYA and Shingo KONDO

Pole-zero cancellation is a well-known and important concept in linear time-invariant systems. In contrast, transfer functions, poles, and zeros are not defined for linear time-varying systems. In this paper, we attempt to generalize the concept of pole-zero cancellation to linear time-varying systems. We first introduce the new concept of an extended transfer-function for linear time-varying systems, defined in the time domain instead of the frequency domain. We then propose a computational procedure for pole-zero cancellation in linear time-varying systems. We finally discuss the meaning of the proposed computational procedure despite the lack of poles and zeros in linear time-varying systems. The proposed concept and computational procedure are illustrated by a numerical example.
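
For reference, the time-invariant notion that the paper generalizes can be illustrated with a standard textbook example. The transfer function below is chosen only for illustration and does not come from the paper; the extended transfer-function for time-varying systems is not reproduced here.

```latex
G(s) = \frac{s + 1}{(s + 1)(s + 2)} = \frac{1}{s + 2}
```

The zero at $s = -1$ cancels the pole at $s = -1$, so the input-output behavior is first order even though a second-order realization still contains the mode $e^{-t}$, which is then uncontrollable or unobservable.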