Vol.12,No.3

Abstract
〈Vol.12 No.3(2019.5)〉

Titles
[Special Issue on SICE Annual Conference 2018]

■ Realization of Stable Trajectory in Mass Measurement System Using Relay Feedback of Velocity and Restoring Force Compensation
■ Greedy Action Selection and Pessimistic Q-value Updating in Multi-Agent Reinforcement Learning with Sparse Interaction
■ Self-Triggered Optimal Control Based on Path Search Algorithm
■ MPC-Based Optimal Control for Diesel Engine Coupled with Lean NOx Trap System
■ Mistaken Pedal Pressing during Emergency Braking by Analyzing Pedal Behaviors
■ Combination of Statistical Access Point Selection Methods Based on RSSI in Indoor Positioning System
■ Detecting Position and Direction of a Device by Swept Frequency of Microwave Using Two-Dimensional Communication System

[CONTRIBUTED PAPERS]

■ Acquiring Classifiers for Bipolarized Reward by XCS in a Continuous Reward Environment

■ Realization of Stable Trajectory in Mass Measurement System Using Relay Feedback of Velocity and Restoring Force Compensation

Saitama University・Takeshi MIZUNO,Keisuke NISHIZAWA,Masaya TAKASAKI,Yuji ISHINO,Masayuki HARA,and Daisuke YAMAGUCHI

A mass measurement system using relay feedback of velocity was developed. In this system, the velocity of the object was fed back through a relay with hysteresis. Although the efficacy of the proposed method had already been confirmed experimentally, the system was subject to drift, mainly because only the velocity was fed back. To prevent such drift, a spring element was added to produce a restoring force. However, smaller drifts still occurred, and the trajectory of the object fluctuated. To eliminate these fluctuations, intermediate control is introduced. It succeeds in eliminating the fluctuation of the trajectory in the developed apparatus.
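The principle of relay feedback of velocity can be illustrated with a minimal sketch. It assumes an idealized frictionless mass driven by a constant-magnitude force whose sign flips when the velocity crosses a hysteresis threshold; under that assumption the velocity is a triangle wave with period T = 4·m·v0/F, so the mass follows from the measured period. The function names, parameter values, and the ideal model itself are illustrative assumptions and are not taken from the paper; the spring element and the drift compensation are not modeled.

```python
def simulate_relay_period(mass, force=1.0, v_threshold=0.1, dt=1e-4, n_switches=6):
    """Simulate m * dv/dt = u, where u = +/-force is set by a relay with
    hysteresis on the velocity. Returns the mean oscillation period.
    Idealized model (assumption): no friction, no spring, no sensor delay."""
    v = 0.0
    u = force            # relay output; stays +force until v reaches +v_threshold
    t = 0.0
    switch_times = []
    while len(switch_times) < n_switches:
        v += (u / mass) * dt   # explicit Euler step of the velocity
        t += dt
        if u > 0 and v >= v_threshold:
            u = -force         # upper threshold reached: push the other way
            switch_times.append(t)
        elif u < 0 and v <= -v_threshold:
            u = force          # lower threshold reached: push back
            switch_times.append(t)
    # Consecutive switches are half a period apart.
    half_periods = [b - a for a, b in zip(switch_times, switch_times[1:])]
    return 2.0 * sum(half_periods) / len(half_periods)


def estimate_mass(period, force=1.0, v_threshold=0.1):
    """Invert T = 4 * m * v_threshold / force for the mass m."""
    return force * period / (4.0 * v_threshold)


if __name__ == "__main__":
    period = simulate_relay_period(mass=2.0)
    print("period:", period, "estimated mass:", estimate_mass(period))
```

In this ideal model only the velocity is regulated, so nothing constrains the position; that is consistent with the drift the abstract attributes to velocity-only feedback.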

■ Greedy Action Selection and Pessimistic Q-value Updating in Multi-Agent Reinforcement Learning with Sparse Interaction

Tottori University・Toshihiro KUJIRAI,and Takayoshi YOKOTA

Although multi-agent reinforcement learning (MARL) is a promising method for learning a collaborative action policy that enables each agent to accomplish specified tasks, MARL suffers from an exponentially growing state-action space. This state-action space can be dramatically reduced by assuming sparse interaction. We previously proposed three methods (greedily selecting actions, switching between Q-value update equations on the basis of the state of each agent in the next step, and their combination) for improving the performance of coordinating Q-learning (CQ-learning), a typical method for multi-agent reinforcement learning with sparse interaction. We have now modified the learning algorithm used in the combination of these two methods so that it can cope with interference among more than two agents. Evaluation of this enhanced method using two additional maze games from three perspectives (the number of steps to a goal, the number of augmented states, and the computational cost) demonstrated that the modified algorithm improves the performance of CQ-learning.
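As background for the terms used above, the following is a minimal sketch of standard single-agent tabular Q-learning with greedy action selection. It is not the CQ-learning algorithm or the pessimistic update described in the abstract, whose details are not given here; the state and action names, the learning rate `alpha`, and the discount factor `gamma` are illustrative assumptions.

```python
from collections import defaultdict


def greedy_action(q, state, actions):
    """Pick the action with the highest Q-value in this state (no exploration)."""
    return max(actions, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Standard Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    target = reward + gamma * max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (target - q[(state, action)])


if __name__ == "__main__":
    actions = ["left", "right"]          # hypothetical action set
    q = defaultdict(float)               # Q-table, all entries start at 0
    q_update(q, "s0", "right", reward=1.0, next_state="s1", actions=actions)
    print(greedy_action(q, "s0", actions))
```

Under sparse interaction, each agent would keep such a table over its own states and augment it with joint information only where agents interfere, which is what keeps the state-action space small.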

■ Self-Triggered Optimal Control Based on Path Search Algorithm

The University of Electro-Communications・Yoshiki NAGATANI,Kenji SAWADA,and Seiichi SHIN

This paper proposes a path search formulation, together with a solution method, for a finite optimal control problem in self-triggered control systems. Previous methods for this problem have high computational complexity. To reduce calculation time, this paper focuses on the formulation and its data structure and reformulates the optimal control problem as a path search problem. We also consider a graph data structure for the path search problem. The key point is the sharing of vertices, which reduces calculation time. We compare the calculation time of the path search algorithm with that of the mixed-logical dynamical method.
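The idea of recasting such a problem as a path search with shared vertices can be sketched on a toy example. The sketch assumes a scalar integer system x[k+1] = x[k] + u, a small input set, a choice of holding interval at each trigger, a quadratic stage cost, and a fixed cost per trigger; none of these are taken from the paper. Vertices are (time step, state) pairs stored in a dictionary, so different input sequences that reach the same pair share one vertex, and Dijkstra's algorithm finds the cheapest path to the horizon.

```python
import heapq
from math import inf


def self_triggered_path_search(x0, horizon, inputs=(-1, 0, 1),
                               intervals=(1, 2, 3), trigger_cost=1.0):
    """Toy self-triggered optimal control solved as a shortest-path problem.
    Each edge = one trigger: pick input u and hold it for h steps.
    Returns (minimum cost, list of (u, h) decisions)."""
    start = (0, x0)
    best = {start: 0.0}      # shared vertices: one entry per (step, state)
    parent = {}
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best[node]:
            continue         # stale heap entry
        k, x = node
        if k == horizon:     # first terminal vertex popped is optimal
            plan = []
            while node in parent:
                node, u, h = parent[node]
                plan.append((u, h))
            return cost, plan[::-1]
        for u in inputs:
            for h in intervals:
                if k + h > horizon:
                    continue
                xi, edge = x, trigger_cost
                for _ in range(h):           # accumulate cost while u is held
                    edge += xi * xi + u * u
                    xi += u
                nxt = (k + h, xi)
                new_cost = cost + edge
                if new_cost < best.get(nxt, inf):
                    best[nxt] = new_cost
                    parent[nxt] = (node, u, h)
                    heapq.heappush(heap, (new_cost, nxt))
    return inf, []


if __name__ == "__main__":
    print(self_triggered_path_search(x0=2, horizon=3))
```

Without vertex sharing, the search would enumerate a tree whose size grows exponentially with the horizon; keying vertices by (time step, state) collapses equivalent branches, which is the kind of saving the abstract attributes to sharing.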

■ MPC-Based Optimal Control for Diesel Engine Coupled with Lean NOx Trap System

Sophia University・Fuguo XU,and Tielong SHEN

In this paper, an on-board optimal control problem for diesel engines with a lean NOx trap (LNT) is investigated. First, a second-order LNT model based on mass conservation and energy conservation is constructed. Then, the optimal control problem is formulated as a continuous receding horizon problem subject to the dynamical model constraint and discretized into a nonlinear programming problem by using the multiple shooting method. A sequential quadratic programming approach is employed to derive a numerical solution. Finally, simulations are conducted on the MATLAB/Simulink platform under a standard driving cycle and a random driving cycle, with comparison to a dynamic-programming-based control scheme. Simulation results verify the effectiveness of the proposed control scheme.

■ Mistaken Pedal Pressing during Emergency Braking by Analyzing Pedal Behaviors

Doshisha University・Rahadian YUSUF,Ivan TANEV,and Katsunori SHIMOHARA

Affective computing has been used to improve computer usability and user interfaces by considering the user's emotions. There is still ample scope for exploring affective computing, especially for applications in driver assistance systems that recognize a driver's emotions during emergency braking, a topic that has been studied extensively by several researchers. Pressing the accelerator by mistake during emergency braking is a serious case that leads to accidents, especially for elderly drivers and beginners. This paper briefly reviews affective computing, emergency braking, and mistaken pedal pressing, and proposes a possible approach for improving existing driving assistance systems by analyzing the driver's pedal pressing behavior. The study is based on the idea that emergency braking behavior should not mistakenly occur on the accelerator pedal, and we used evolutionary computing to investigate this idea. Results from our reviews and experiments showed that current affective computing technology might be insufficient to recognize mistaken pedal pressing using facial expressions or similar emotions. However, we found that analyzing the driver's pedal pressing behavior can recognize mistaken pedal pressing during emergency braking. This can provide further options to improve driving safety, especially for elderly drivers and beginners, as we focus on the driver aspects.

■ Combination of Statistical Access Point Selection Methods Based on RSSI in Indoor Positioning System

Waseda University・Shigeyuki TATENO,Tong LI,Yu WU,and Ziyuan WANG

Recently, as wireless infrastructure has developed widely and smartphones have become necessary in daily life, indoor positioning devices and applications have become increasingly popular. Previous studies have proposed several methods based on different wireless communication technologies. Among them, methods using received signal strength indicator (RSSI) values and trilateration are mainly used to obtain positioning results. However, due to abnormal RSSI values caused by noise and environmental influences, the accuracy of these methods is not satisfactory. Therefore, a new method that can reduce the positioning error is necessary. In this paper, to improve the positioning accuracy beyond that of trilateration, an access point selection method and a kernel density estimation method are combined to obtain estimated points. Experiments are conducted in actual environments, and the results show that the proposed method is effective in improving the positioning accuracy.
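The baseline that the abstract improves on, RSSI-based trilateration, can be sketched as follows. The sketch assumes the common log-distance path loss model to convert RSSI to distance and exactly three access points at known 2D coordinates; the reference RSSI at 1 m, the path loss exponent, and the function names are illustrative assumptions, not values from the paper. The access point selection and kernel density estimation steps of the proposed method are not shown.

```python
def rssi_to_distance(rssi, rssi_at_1m=-40.0, path_loss_exponent=2.0):
    """Log-distance path loss model (assumed):
    rssi = rssi_at_1m - 10 * n * log10(d)  =>  d = 10 ** ((rssi_at_1m - rssi) / (10 * n))."""
    return 10 ** ((rssi_at_1m - rssi) / (10.0 * path_loss_exponent))


def trilaterate(aps, dists):
    """Estimate (x, y) from three access points and their estimated distances.
    Subtracting the first circle equation from the other two gives a
    2x2 linear system, solved here with Cramer's rule."""
    (x1, y1), (x2, y2), (x3, y3) = aps
    d1, d2, d3 = dists
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("access points are collinear; position is not unique")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


if __name__ == "__main__":
    aps = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]     # hypothetical AP layout in meters
    rssi = [-47.0, -51.1, -47.0]                   # hypothetical readings in dBm
    dists = [rssi_to_distance(r) for r in rssi]
    print(trilaterate(aps, dists))
```

Because the distance is exponential in the RSSI, a few dB of noise on a single access point shifts the estimate noticeably, which is the sensitivity that motivates selecting access points statistically.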

■ Detecting Position and Direction of a Device by Swept Frequency of Microwave Using Two-Dimensional Communication System

Keio University・Junya TAIRA,Suzanne LOW,Maki SUGIMOTO,and Yuta SUGIURA

We propose a system for detecting the position of a device with an embedded antenna by sensing the electrical power from a two-dimensional communication (2DC) sheet. The system obtains a characteristic power pattern at each position by sweeping the frequency of the microwave supplied to the 2DC sheet. Our system uses a machine learning technique to learn from the accumulated power-pattern data and detect the position of a device. The position-detection accuracy of our system was 79.1 % when the antenna was moved in 12 mm intervals. In addition to detecting the position of a device, we also estimated its direction.

■ Acquiring Classifiers for Bipolarized Reward by XCS in a Continuous Reward Environment

The University of Electro-Communications・Takato TATSUMI,and Keiki TAKADAMA

In data mining, it is important to clarify how effective the acquired rules are and which elements are affected by rule evaluation. The extended learning classifier system (XCS) reveals factors that affect the classifier (rule) evaluation by generalizing multiple classifiers that acquire the same reward (evaluation value) into a generalized classifier. In a real-world problem, because the reward of a classifier varies, XCS cannot acquire the generalized classifier. As useful classifiers with a narrow range of acquired rewards are required, this paper proposes a new XCS (XCS based on Reward Bipolarization: XCS-RB) that acquires classifiers that receive only high rewards and classifiers that receive only low rewards. XCS-RB was applied to problems such as predicting the ratio of deep sleep time on the night of a given day from the care plan implemented in a care house, and predicting the calculation time of a matrix-matrix product using an SGEMM GPU kernel. XCS-RB acquired rules indicating care plans that lead to deep sleep and parameter settings with short calculation times. XCS-RB was able to acquire generalized classifiers that do not conflict with the input data; this paper thus demonstrates the potential advantages of XCS-RB.
