IEICE TRANS. COMMUN., VOL.E106–B, NO.2 FEBRUARY 2023

INVITED PAPER Special Section on Emerging Communication Technologies in Conjunction with Main Topics of ICETC 2021

Machine Learning in 6G Wireless Communications

Tomoaki OHTSUKI†a), Fellow

Manuscript received July 8, 2022.
Manuscript revised July 20, 2022.
Manuscript publicized August 10, 2022.
† The author is with the Faculty of Science and Technology, Keio University, Yokohama-shi, 223-8522 Japan.
a) E-mail: ohtsuki@ics.keio.ac.jp
DOI: 10.1587/transcom.2022CEI0002
Copyright © 2023 The Institute of Electronics, Information and Communication Engineers

SUMMARY Mobile communication systems are not only the core of the Information and Communication Technology (ICT) infrastructure but also that of our social infrastructure. The 5th generation mobile communication system (5G) has already been launched and is in use. 5G is expected to serve various use cases in industry and society. Thus, many companies and research institutes are now working to improve the performance of 5G, that is, 5G Enhancement, and to develop the next generation of mobile communication systems (Beyond 5G (6G)). 6G is expected to meet requirements that are highly demanding even compared with those of 5G, such as extremely high data rate, extremely large coverage, extremely low latency, extremely low energy, extremely high reliability, extreme massive connectivity, and so on. Artificial intelligence (AI) and machine learning (ML), together AI/ML, will play more important roles than ever in 6G wireless communications, given the above extremely high requirements for a diversity of applications, including new combinations of the requirements for new use cases. We can say that AI/ML will be essential for 6G wireless communications. This paper introduces some ML techniques and applications in 6G wireless communications, mainly focusing on the physical layer.

key words: artificial intelligence (AI), machine learning (ML), deep learning (DL), neural network (NN), deep neural network (DNN), 6G, deep transfer learning (DTL)

1. Introduction

Digital Transformation (DX), which transforms society, economy, and industry using digital technology represented by rapidly developing AI (Artificial Intelligence), is attracting much attention. Information and Communication Technology (ICT) infrastructure plays an important role in DX. It is no exaggeration to say that mobile communication systems, represented by the 5th generation mobile communication system (5G), are the core of the ICT infrastructure. 5G is expected to serve various use cases in industry and society. 5G has three functional requirements: enhanced Mobile Broadband (eMBB), Ultra-Reliable and Low Latency Communications (URLLC), and Massive Machine Type Communications (mMTC). In 5G (New Radio (NR) Release 15), which is currently in service, best-effort services that emphasize downlink speed are mainly realized as a result of standardization in 3GPP, focusing on eMBB and some URLLC among them [1]. In the future, it is expected that services that take advantage of large data uploads and services that guarantee communication quality, particularly for industrial applications, will be required. Therefore, 5G Enhancement is expected to improve the performance of the uplink and realize communication quality assurance.

Mobile communication systems evolve roughly every 10 years, and by 2030, when the next generation of mobile communication systems (Beyond 5G (6G)) is expected to be in use, various social issues and use cases are expected to be addressed. 6G will be required to support the data traffic that is expected to continuously increase as mobile communication services become more sophisticated and diverse. Also, 6G will be required to meet the extremely high-performance requirements that will support the resolution of social issues and new use cases in the 2030s. The Ministry of Internal Affairs and Communications (MIC) has presented the following three social images for the 2030s, when 6G is expected to be used, in “Beyond 5G Promotion Strategy - Roadmap towards 6G -” [2]: “Inclusive society,” “Sustainable society,” and “Dependable society.”

The year 2030 is also the target year for achieving the Sustainable Development Goals (SDGs) adopted at the United Nations Summit in 2015. 6G is expected to support the realization of these goals as a social infrastructure. According to the “Beyond 5G Promotion Strategy,” in addition to further upgrading of the characteristic functions of 5G, such as eMBB, URLLC, and mMTC, 6G must be equipped with four new functions: “ultra low power consumption,” “autonomy,” “scalability,” and “ultra security and resiliency.” In addition to the above functions, [1] also lists lower cost (lower cost per bit) and sensing as requirements.

To meet those high requirements, various techniques need to be developed and used in 6G. Several companies and research institutes have issued white papers about B5G and 6G [1]–[8]. Those white papers share many common requirements, such as the following from [1].

• Extreme high data rate/capacity
  – Peak data rate > 100 Gbps exploiting new spectrum bands
  – > 100× capacity
  – Extreme-high uplink capacity
• Extreme low latency
  – E2E very low latency < 1 ms
  – Always low latency
• Extreme coverage extension
  – Gbps coverage everywhere
  – New coverage areas, e.g., sky (10000 m), sea
    (200 NM), space, etc.
• Extreme high reliability
  – Guaranteed QoS for wide range of use cases (up to 99.99999% reliability)
  – Secure, private, safe, resilient, ...
• Extreme low energy & cost
  – Affordable mmW/THz NW & devices
  – Devices free from battery charging
• Extreme massive connectivity
  – Massive connected devices (10 M/km²)
  – Sensing capabilities & high-precision positioning (< 1 cm)

To meet these high requirements, the white papers mention many key techniques, such as the following from [5].

• AI/ML-driven air interface design and optimization
• Expansion into new spectrum bands and new cognitive spectrum sharing methods
• The integration of localization and sensing capabilities into system definition
• The achievement of extreme performance requirements on latency and reliability
• New network architecture paradigms involving sub-networks and RAN-core convergence
• New security and privacy schemes

As mentioned above, AI/ML will be essential for 6G wireless communications with the extremely high requirements for a diversity of applications, including new combinations of the requirements for new use cases. Note that AI is a simulation of human intelligence or experience by machines, while ML is an application of AI with the ability to automatically learn and improve from experience without being explicitly programmed. AI is a much broader concept than ML. A white paper on ML in 6G wireless communication networks has been issued by the University of Oulu [8]. The white paper presents several applications of ML in each layer (physical layer, medium access control (MAC) layer, and application layer) and also for the security of wireless networks. As applications of ML at the physical layer, the following areas are shown: channel coding, synchronization, positioning, channel estimation, beamforming, and physical layer optimization with ML. As applications of ML at the MAC layer, the following use cases are shown: Federated Learning (FL) for orientation and mobility prediction in wireless virtual reality networks, predictive resource allocation in machine-type communications, predictive power management, and asymmetric traffic accommodation.

In [9], more applications of AI/ML are presented for the physical layer, network layer, and application layer. As applications of ML at the physical layer, the following areas are shown: channel tracking/equalization/decoding, pathloss prediction/estimation, intelligent beamforming, modulation mode selection, anti-jamming, channel access control, spectrum sensing/management/allocation, physical-layer security, and so on. As applications of ML at the network layer, the following areas are shown: caching, traffic classification, anomaly detection, throughput optimization, latency minimization, attack detection, intelligent routing, traffic prediction/control, access control, source encoding/decoding, and so on.

There are also ongoing standardization efforts to exploit AI in cellular systems. The third generation partnership project (3GPP) has standardized a network data analytics function (NWDAF) for data collection and analytics in automated cellular networks [10]. In addition to 3GPP, the O-RAN Alliance aims to realize an intelligent radio access network (RAN). Networks are now required to support a wide variety of applications and are becoming increasingly complex. In such a case, it may become difficult to optimize operations and networks manually as in the past. It will become essential to realize more autonomous and automated operations utilizing AI and ML. To realize such a vision, the O-RAN Alliance is studying RAN configurations (architectures) that can optimize network design and operations while utilizing AI/ML, and open interfaces are also being considered. The function called “RIC (RAN Intelligent Controller)” specified by the O-RAN Alliance is positioned at the center of the realization of this intelligent RAN.

In this paper, we introduce some ML techniques and their applications in 6G wireless communications, mainly focusing on the physical layer. We first introduce end-to-end learning of communication systems through neural network-based autoencoders [11]. We then introduce some ML techniques for massive multiple-input multiple-output (MIMO). We introduce a neural network-based belief propagation (BP) algorithm for massive MIMO signal detection [12], [13]. This algorithm is based on the idea of deep unfolding, which unfolds the iterations of an inference algorithm into a layer-wise structure like a neural network [14]. We also introduce signal detection based on the BP algorithm with a deep learning (DL)-based denoising technique based on the deep image prior (DIP) [15]. In massive MIMO systems, as the name suggests, the number of antennas is large, so the number of channels that need to be estimated is also large. Due to the time-varying characteristics of the channel, the length of pilot signals is limited, so the number of orthogonal pilot signals is finite. Thus, the same pilot signals are reused in neighboring cells, which deteriorates the channel estimation performance. This defect is referred to as pilot contamination. We introduce two neural network-based schemes to reduce the effects of pilot contamination [16]. We also consider channel state information (CSI) feedback, where the amount of feedback information is the issue in massive MIMO. We introduce the neural network-based CSI feedback scheme, where we explain the idea of deep transfer learning (DTL) and present the DTL-based CSI feedback scheme [17].

In B5G and 6G, it is also essential to utilize new spectrum bands such as mmWave bands and terahertz frequency
bands to achieve an extremely high data rate. However, systems using high-frequency bands, such as mmWave systems, suffer from severe pathloss. Thus, it is essential to use beamforming with large antenna array gains in mmWave communications, which in turn requires a large number of antennas. Since the power consumption and cost of radio-frequency (RF) chains are both high, current mmWave systems generally employ not fully digital beamforming, in which each antenna is attached to an RF chain, but hybrid beamforming. Hybrid beamforming is effective but requires a large overhead in the beam search/selection phase. We introduce our proposed DL-based analog beam selection scheme with low overhead [18]. Finally, we conclude this paper and discuss the future direction of ML in 6G wireless communications.

Fig. 1 Modeling the communications system by autoencoder, where s is the transmitted information, x is the transmitted signal, y is the received signal, ŝ is the estimated transmitted information, f(·) is the transmitter function, g(·) is the receiver function, and p(y|x) is the conditional probability density function of the channel.

Note that some wireless scenarios mentioned in this paper are also studied in 5G. However, as mentioned above, there are new extremely high requirements for new use cases in 6G. Also, there are new combinations of the requirements for new use cases in 6G. Those are the differences between 5G and 6G even for the same wireless scenarios.

2. AI-Based Wireless Communications

2.1 End-to-End Learning of Communication Systems Through Neural Network-Based Autoencoders

In general, a communication system consists of a transmitter, a channel, and a receiver, where the transmitter and the receiver are split into multiple signal processing blocks. Conventional signal processing tries to optimize each block separately or sometimes jointly. Block-based optimization provides good performance in general. However, it does not always provide the best possible end-to-end performance. Joint optimization can generally provide better performance than block-based optimization, but joint optimization of multiple signal blocks is often computationally prohibitive. A learned end-to-end optimization through deep learning can provide superior performance.

The general communication systems mentioned above can be seen as a kind of autoencoder [19]. Thus, an autoencoder has been used to model the communication system and optimize it in an end-to-end manner as shown in Fig. 1 [11]. In general, an autoencoder is used to find a lower-dimensional representation of its input at an intermediate layer, like compression, while it can still reconstruct the input signal as the output of the decoder. On the other hand, the autoencoder modeling the communication system tries to learn representations of the transmitted signals x of the messages s that are robust to the channel impairments, such as noise, fading, distortion, and so on, so that the transmitted message can be recovered with a small probability of error. This autoencoder is sometimes referred to as the “channel autoencoder.” [11] presents a comparison of the block error rate (BLER) performance between Hamming (7,4) coded binary phase-shift keying (BPSK) with maximum-likelihood decoding and that of the trained autoencoder (7,4). The autoencoder is shown to achieve the same BLER performance as that of the Hamming (7,4) code with maximum-likelihood decoding. Thus, it can be said that the autoencoder has learned the encoder and decoder functions without any prior knowledge.

One of the drawbacks of the end-to-end learning of communication systems through autoencoders is that the gradient of the instantaneous channel transfer function must be known, which is not practical. Moreover, the channel usually comprises some processes of the transmitter and the receiver, such as quantization, which are non-differentiable, and thus gradient-based training through backpropagation cannot be used.

To overcome these drawbacks, [20] proposes a learning algorithm that enables the training of communication systems with an unknown channel model or with non-differentiable components. The proposed algorithm iterates between the training of the receiver using the true gradient of the loss and that of the transmitter using an approximation of the loss function gradient. The proposed algorithm is shown to achieve performance identical to that of training with a channel model using backpropagation on additive white Gaussian noise (AWGN) and Rayleigh block-fading (RBF) channels.

2.2 Deep Learning for Massive MIMO

2.2.1 Massive MIMO Detection

Massive MIMO that uses a massive number of antenna elements on the transmitter side is one of the key technologies of 5G and 6G. Massive MIMO can achieve high spectral efficiency and accommodate a large number of users. However, in a massive MIMO system, it is difficult to detect signals from a large number of users. Also, the complexity of signal detection becomes high. As simple signal detection techniques, linear detection methods such as zero-forcing (ZF) and minimum mean squared error (MMSE) are known. However, those need the inverse matrix calculation, which results in a large computational complexity for massive MIMO systems. The signal detection technique
based on the belief propagation (BP) algorithm, referred to as BP detection, is one of the promising techniques [21]–[25]. The BP algorithm calculates a marginal probability of unobserved variables by message passing in a factor graph. In BP detection, symbol replicas are generated from the propagated messages at each iteration, and the log-likelihood ratio (LLR) of each symbol is updated as a message by removing the interfering component of the received signals. BP detection can achieve near-optimal detection performance with lower complexity [21]–[25]. Moreover, BP detection does not require the matrix-inversion calculation, which is attractive for massive MIMO systems.

However, there are some issues in BP detection. Errors occur in the propagated messages due to residual interference and noise in the received signals after interference removal. Also, due to multiple loops included in the MIMO channel, a message with an error propagates throughout the factor graph, which degrades the convergence performance and the detection performance. A damping factor is introduced to control the message updates and improve the BP detection performance. The damping factors take a weighted average of two successive messages so that the detection performance and convergence performance can be improved. In conventional works, the damping factors are generally tuned heuristically. However, a heuristic-based selection often results in suboptimal performance.

Table 1 Similarities between BP FG and DNN.

BP FG                    DNN
Nodes                    Neurons
Transmitted signals      Input data
Received signals         Output data
l-th iteration           l-th hidden layer
Belief messages          Hidden signals
Message updating rules   Mapping function between layers
Correction factor        Parameters

Fig. 2 A structure of DNN-dBP with node selection [13].

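The damping step described above can be illustrated with a small, self-contained sketch. It is not the detector of [12] or [13]: the function names `damped_update` and `toy_message_rule`, the toy update rule, and the damping value 0.5 are assumptions chosen only to show how averaging two successive messages can stabilize an iteration that otherwise oscillates.

```python
def damped_update(prev_msg, new_msg, eta):
    """Weighted average of two successive messages.

    eta = 0 keeps the newly computed message unchanged (no damping);
    eta close to 1 keeps most of the previous message.
    """
    return (1.0 - eta) * new_msg + eta * prev_msg


def toy_message_rule(msg):
    # Hypothetical stand-in for one BP message computation. Its slope of
    # -1.5 makes the undamped iteration oscillate with growing amplitude.
    return -1.5 * msg + 1.0


def iterate(eta, num_iters=30, start=0.0):
    """Run the toy iteration with a fixed damping factor eta."""
    msg = start
    for _ in range(num_iters):
        msg = damped_update(msg, toy_message_rule(msg), eta)
    return msg


undamped = iterate(eta=0.0)  # diverges
damped = iterate(eta=0.5)    # settles at the fixed point 0.4
print(f"undamped: {undamped:.3e}, damped: {damped:.6f}")
```

In this toy case a single hand-picked damping factor is enough. In a real detector the best value depends on the channel and the iteration index, which is the motivation for learning the factors instead of tuning them by hand.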
Neural networks have the ability to learn the fundamental information of the model. Deep unfolding is a technique to unfold the iterations of an inference algorithm into a layer-wise structure like neural networks [14]. Deep unfolding capitalizes on the well-known signal processing model and the ability of DL. It can solve problems for which precise modeling is not available. It can also approximate computationally complex operations by a deep neural network (DNN). In deep unfolding, model parameters are de-coupled across layers and can be trained easily and discriminatively using gradient-based methods. There are similarities between the message passing factor graph and DNN as shown in Table 1 [12]. Thus, DNN is employed to improve the convergence performance of BP, which is referred to as DNN-based damped BP (DNN-dBP) [12]. DNN-dBP trains the damping factors by unfolding the BP iteration to the neural network. By using the trained damping factors, it is possible to improve the convergence performance of BP. In [13], we derived the damping factors that are robust to the channel mismatches between training and testing using DNN-dBP. Fig. 2 shows the structure of the proposed DNN-dBP with node selection. In this method, observation nodes to be updated in one iteration are selected so that the spatial correlation becomes low. Thus, the channel correlation among the selected nodes in BP detection is lowered and the convergence performance of BP is improved. Therefore, the damping factors derived based on this method are robust to the channel mismatches between training and testing.

Fig. 3 A heatmap of the received signal.

In the context of image processing, deep image prior (DIP) has been reported as a method to remove noise without the need for teacher data [15]. In general, DIP learns a single input image and optimizes the parameters of the convolutional neural network (CNN) by the gradient descent method to obtain a reconstructed image. DIP can be said to exploit the difference in the learning speed of neural networks for images. It has been shown in [15] that DIP learns faster for natural images than for random images such as noise. DIP can use this difference in the learning speed to output a clean image, i.e., an image with reduced noise, by stopping learning before learning noise. In [26], a heatmap of the received signal is generated as shown in Fig. 3 using the “receive antenna index” and “time index” as dimensions. In each heatmap, each receive antenna receives the same transmitted symbols after interference removal, so the correlation is high in the domain of the receive antenna. If the inter-frame channel is assumed to be constant, each receive antenna receives a symbol pattern of transmitted symbols at each time, resulting in high correlations in the time domain. Based on these correlations, DIP can reduce residual interference and noise. In [26], we introduced a
massive MIMO BP detection using DIP with a DNN-trained scaling factor. In BP detection, we create the heatmap of the received signals after interference removal at each iteration so that its entries are correlated. By applying DIP to the heatmap of the received signals, it is possible to reduce residual interference and noise. After applying DIP, the variance of the interference and noise components changes. To bring the variance closer to its true value, we scale it. Because it is difficult to calculate the value of the variance after applying DIP theoretically, we train the scaling factors offline using DNN-BP. By scaling the variance, it is possible to improve the reliability of the message. Figure 4 shows the BER performance versus SNR in dB in the correlated channel (the correlation factor ρ = 0.3) for 16×16 MIMO with QPSK modulation and 7 BP iterations. It can be seen that the BER performance is improved by applying DIP. It can also be seen that the BER performance of the proposed method with the trained scaling factor is better than that without the trained one.

2.2.2 Pilot Contamination

In massive MIMO, the number of channels that need to be estimated is large. Since the number of orthogonal pilot signals is limited when we limit their length, the same pilot signals need to be reused in neighboring cells. The degradation of the channel estimation performance caused by reusing the same pilot signals is referred to as pilot contamination. In [27], a covariance-aided channel estimation is proposed, in which the MMSE channel estimation is derived. This scheme can remove the pilot contamination completely when the covariance matrices satisfy a certain non-overlapping condition. However, this assumption is not so practical.
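To make the non-overlapping condition concrete, the following NumPy sketch compares a least-squares (LS) estimate with a covariance-aided MMSE estimate when a neighboring cell reuses the same pilot. It is a toy model, not the scheme of [27]: the array size, the DFT-based covariance matrices `R_h` and `R_g`, the noise level, and all function names are assumptions. The two covariance matrices are built to span orthogonal subspaces, which is the idealized case in which the contamination can be filtered out.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 8              # number of base-station antennas (assumed)
noise_var = 0.01   # noise variance after pilot correlation (assumed)
trials = 2000

# Unitary DFT basis; the desired user occupies the first four beams and
# the interfering user the last four, so the subspaces do not overlap.
U = np.fft.fft(np.eye(M)) / np.sqrt(M)
U_h, U_g = U[:, :4], U[:, 4:]
R_h = U_h @ U_h.conj().T   # covariance of the desired channel
R_g = U_g @ U_g.conj().T   # covariance of the interfering channel


def crandn(*shape):
    """Circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


# Columns are independent channel realizations.
h = U_h @ crandn(4, trials)                 # desired channel
g = U_g @ crandn(4, trials)                 # channel of the user reusing the pilot
n = np.sqrt(noise_var) * crandn(M, trials)

# After correlating with the shared pilot, the LS estimate contains both channels.
y_ls = h + g + n

# Covariance-aided MMSE filter applied to the LS estimate.
W = R_h @ np.linalg.inv(R_h + R_g + noise_var * np.eye(M))
h_mmse = W @ y_ls


def nmse(est, ref):
    """Normalized mean squared error between an estimate and the true channel."""
    return float(np.sum(np.abs(est - ref) ** 2) / np.sum(np.abs(ref) ** 2))


nmse_ls = nmse(y_ls, h)
nmse_mmse = nmse(h_mmse, h)
print(f"NMSE LS: {nmse_ls:.3f}, NMSE MMSE: {nmse_mmse:.4f}")
```

With overlapping subspaces the same filter can only suppress part of the interference, which is one way to read the remark that the condition is hard to satisfy in practice.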
Fig. 4 BER performance vs SNR (dB) per receive antenna in the correlated channel (ρ = 0.3) for 16×16 MIMO with QPSK modulation and 7 BP iterations.

Fig. 5 A framework for the proposed methods [16]. The upper part and the lower part show the structure in the neural network-based estimation using the fully connected layers and the CNN-based estimation using the convolutional layers, respectively.

Recently, DL has been expected to improve the channel estimation performance in massive MIMO. In [28], DL is integrated into direction-of-arrival (DoA) estimation and channel estimation in massive MIMO systems. In [16], we propose two methods of DL-aided channel estimation to reduce the effects of pilot contamination. One uses a neural network consisting of fully connected layers, while the other uses a CNN. Figure 5 shows the frameworks of the proposed methods, where the upper and lower parts show the structure in the neural network-based estimation using the fully connected layers and the CNN-based estimation using the convolutional layers, respectively. Neural networks, particularly CNNs, can extract features of spatial information from the contaminated signals. It is shown that the former method is better in terms of the training speed, while the latter one can estimate the channel more accurately.
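As a rough illustration of learning an estimator from contaminated signals, the sketch below fits a single linear layer by least squares on simulated data. This is a deliberately simplified stand-in and not either network of [16]: the toy channel model, the sample counts, and all names are assumptions, and closed-form least squares replaces gradient-based training. The point is only that a data-driven mapping can pick up spatial structure without being given covariance matrices.

```python
import numpy as np

rng = np.random.default_rng(1)
M, noise_var = 8, 0.01            # assumed toy parameters
n_train, n_test = 2000, 1000

U = np.fft.fft(np.eye(M)) / np.sqrt(M)   # unitary DFT basis


def crandn(*shape):
    """Circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


def simulate(num):
    """Return (contaminated LS estimates, true channels), one sample per row."""
    h = crandn(num, 4) @ U[:, :4].T      # desired user: first four beams
    g = crandn(num, 4) @ U[:, 4:].T      # pilot-sharing user: last four beams
    n = np.sqrt(noise_var) * crandn(num, M)
    return h + g + n, h


y_train, h_train = simulate(n_train)
y_test, h_test = simulate(n_test)

# "Training": choose the weights of one linear layer that minimize
# ||y_train @ W - h_train||^2 (closed-form least squares).
W, *_ = np.linalg.lstsq(y_train, h_train, rcond=None)
h_learned = y_test @ W


def nmse(est, ref):
    """Normalized mean squared error between an estimate and the true channel."""
    return float(np.sum(np.abs(est - ref) ** 2) / np.sum(np.abs(ref) ** 2))


nmse_ls = nmse(y_test, h_test)          # using the contaminated estimate directly
nmse_learned = nmse(h_learned, h_test)  # after the learned linear layer
print(f"NMSE LS: {nmse_ls:.3f}, NMSE learned: {nmse_learned:.4f}")
```

A fully connected or convolutional network would add non-linear layers and be trained by gradient descent, but its inputs and targets would be organized in the same way.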

2.3 Deep Transfer Learning

Transfer learning (TL) is a machine learning method where a model trained for a task is used as a starting point for a model on a different related task. TL is a popular technique in DL, for example in computer vision and natural language processing, where a large amount of computation and time is required to train a model from scratch.

Fig. 6 The system model of the CSI feedback scheme based on DTL [17].

Fig. 7 The NMSE performance versus the number of target data samples where the target channel is CDL-A.

In TL, a domain and a task are defined. A domain $\mathcal{D}$ is defined as a pair $\mathcal{D} = \{\mathcal{X}, P(X)\}$, which consists of a feature space $\mathcal{X}$ and a marginal distribution $P(X)$ over the feature space, where $X = \{x_1, \ldots, x_n\} \in \mathcal{X}$. A task is defined as a pair $\mathcal{T} = \{\mathcal{Y}, f(\cdot)\}$, where $\mathcal{Y}$ is the label space and $f(\cdot)$ is a function that predicts the label $y_i \in \mathcal{Y}$ corresponding to $x_i$. Using the definitions of a domain and a task, TL can be defined as follows [29]:

Transfer learning: Given a source domain $\mathcal{D}_S$, a source task $\mathcal{T}_S$, a target domain $\mathcal{D}_T$, and a target task $\mathcal{T}_T$, the aim of TL is to improve the learning of the target prediction function $f_T(\cdot)$ in $\mathcal{D}_T$ using the knowledge in $\mathcal{D}_S$ and $\mathcal{T}_S$, where $\mathcal{D}_S \neq \mathcal{D}_T$ or $\mathcal{T}_S \neq \mathcal{T}_T$.
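The definition above can be illustrated with a minimal sketch, assuming a linear model in place of a DNN. A model fitted on a data-rich source task is used as the starting point for a related target task that has only a few samples, and is compared with training from scratch on the same few samples. The dimensions, sample counts, learning rate, and function names are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 20                                               # feature dimension (assumed)
w_source = rng.standard_normal(d)                    # true source-task weights
w_target = w_source + 0.1 * rng.standard_normal(d)   # related target task


def make_data(w, num, noise_std=0.05):
    """Draw num samples of a noisy linear task with weights w."""
    X = rng.standard_normal((num, d))
    y = X @ w + noise_std * rng.standard_normal(num)
    return X, y


def gradient_descent(w_init, X, y, steps=50, lr=0.01):
    """Plain gradient descent on the mean squared error."""
    w = w_init.copy()
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w


def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))


# Source task: plenty of data, fitted in closed form.
X_s, y_s = make_data(w_source, 5000)
w_pretrained, *_ = np.linalg.lstsq(X_s, y_s, rcond=None)

# Target task: only 10 samples, fewer than the number of weights.
X_t, y_t = make_data(w_target, 10)
X_test, y_test = make_data(w_target, 1000)

w_finetuned = gradient_descent(w_pretrained, X_t, y_t)   # start from the source model
w_scratch = gradient_descent(np.zeros(d), X_t, y_t)      # start from zero

mse_finetuned = mse(w_finetuned, X_test, y_test)
mse_scratch = mse(w_scratch, X_test, y_test)
print(f"fine-tuned: {mse_finetuned:.3f}, from scratch: {mse_scratch:.3f}")
```

The benefit comes entirely from the source and target tasks being related. If `w_target` were unrelated to `w_source`, the pretrained starting point would offer no such advantage.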
Deep transfer learning: DTL is a method that combines deep learning with TL. A TL task defined by $\langle \mathcal{D}_S, \mathcal{T}_S, \mathcal{D}_T, \mathcal{T}_T \rangle$ is a DTL task when the target prediction function $f_T(\cdot)$ for $\mathcal{T}_T$ is a non-linear function approximated by a DNN.

DTL has been applied to wireless communications as well, such as CSI feedback, beamforming, signal detection, physical layer security, and so on. In [17], DTL is used to generate the CSI feedback deep learning model for each target channel model, where the Clustered Delay Line (CDL) channel model [30] is used to simulate the wireless environments. Specifically, the DNN is trained as the source model by using a large number of CDL-A samples as source data. The source model is then fine-tuned with a small number of CDL-B, CDL-C, CDL-D, and CDL-E samples, i.e., target data, respectively. Based on this procedure, a target model for each target channel can be obtained with a small number of samples and a short training time. Figure 6 shows the system model of the CSI feedback scheme based on DTL [17]. Figure 7 shows the NMSE performance of the DTL scheme [17] in FDD massive MIMO systems where the target channel is CDL-A. The frequencies of the uplink and downlink channels are set to 2.0 GHz and 2.1 GHz, respectively. The numbers of antennas of UE and BS are 2 and 32, respectively. The number of subcarriers is set to 72 with a spacing of 15 kHz, and the number of OFDM symbols to 14. The estimated CSI of UE and feedback CSI of BS are assumed to be error-free. The compression ratio is set to 1/8. The number of source data samples used to train the source model is set to 50,000, and that of target data samples used for fine-tuning is varied as 200, 500, 1000, 2000, and 4000. The red dotted line with the label “CDL-A (NLOS)” represents the NMSE performance of the source model trained using CDL-A as the source data. We can see that there is a performance degradation, but the DTL scheme achieves good NMSE performance with a small number of target data samples. We can also see that in the DTL scheme, different source models provide different NMSE performances. In this environment where the target channel is CDL-A (NLOS), the DTL scheme provides the best NMSE performance when the source model is CDL-B (NLOS) and CDL-C (NLOS). Thus, source model selection is important for the DTL scheme. Some discussion about the source model selection criteria in the DTL scheme can be found in [31].

2.4 mmWave Communications

In wireless communications, there have been continuous and tremendous efforts to increase capacity by expanding spectrum and improving spectral efficiency and spatial reuse. It is very important to utilize new spectrum bands such as mmWave bands and terahertz frequency bands. A significant amount of research has been ongoing to improve and realize mmWave systems. However, mmWave systems suffer from severe pathloss. Thus, it is essential to use beamforming with large antenna array gains in mmWave communications. In mmWave communications, the power consumption and cost of RF chains are both high. Hybrid beamforming is a promising technique to balance tradeoffs between cost and performance. Since mmWave communications need to use a large number of antennas, channel estimation is also a challenging task. To address this challenge, a switched beamforming scheme has been proposed [32] in which the best beams to steer are found within the codebook. Among beam selection schemes, an exhaustive search scheme achieves the best performance but requires a
large overhead, particularly when a large number of beams are employed [33]. A hierarchical beam search proposed in [34] can reduce the beam training overhead by two-stage beam training. In the hierarchical beam search scheme, the BS and UE, equipped with multiple-tier codebooks, sweep wider beams first and iteratively narrow the search space for the best narrow beam. The hierarchical beam search scheme can provide a good trade-off among performance, time, and overhead. In [35], a beam selection scheme using DL is proposed to reduce the overhead. The DL model estimates the qualities (received power) of all the beams from a few beam measurements. The authors introduce a DL-based image reconstruction approach to beam selection, where the received power matrix is transformed into a power map by assigning each received power to the corresponding color. However, since the beams used for measurements are selected randomly, the performance of the scheme can be largely affected by the beam searching area [35].

In [18], we proposed a DL-based low-overhead analog beam selection scheme in which beams of two different widths are steered: wide beams for pilot signals and narrow beams for data signals. To change the beam widths without losing beamforming gain, a balance beam, which concentrates the radiation pattern over the target area, is implemented in our proposed scheme. Based on the wide-beam measurements, the proposed super-resolution-inspired DL predicts the beam qualities (received powers) of the narrow beams, where the spatial correlation in the beam qualities is exploited with a CNN to improve the estimation accuracy. Moreover, the proposed scheme predicts beam qualities to reduce the frequency of beam training. The proposed scheme transmits the pilot signal only every other channel coherence time to reduce the training overhead. The current received powers with narrow beams are predicted based on the past pilot signals. Thus, the training time can be reduced by half. To capture spatiotemporal correlations, the proposed model is designed with a convolutional long short-term memory (LSTM) network. Figure 8 illustrates the idea of the proposed super-resolution-inspired DL scheme. Here, the received power matrix obtained by 4 × 4 DFT beams is transformed into a power map by assigning each received power to the corresponding color. The low-resolution beam domain image is input to the super-resolution-inspired DL network to output the high-resolution beam domain image corresponding to the power map obtained by, for example, 8 × 8 DFT beams. It is shown in [18] that the proposed beam selection achieves a performance comparable to that of the exhaustive search scheme. Note that the number of beam measurements per coherence time is 8 for the proposed scheme and 64 for the exhaustive search scheme.

Fig. 8 Estimation of received power of narrow beam from that of wide beam based on super resolution [18].

3. Conclusions

In this paper, I presented an overview of some ML techniques and applications in 6G wireless communications, mainly focusing on the physical layer. One of the challenges in applying ML to real systems is the dynamic environment. Wireless communication environments change dynamically. ML makes inferences and predictions using data. Therefore, if the statistical properties of the data change over time, the performance of a system using ML may degrade. For using ML in wireless communications, DTL, whose applications in wireless communications I introduced in this paper, is one of the promising solutions. Another solution is meta-learning, which learns how to learn [17]. Another challenge is that ML, particularly DL-based solutions, usually requires a large amount of training data and computational resources. To apply DL-based solutions, we need to carefully consider those requirements.

A common problem with AI is that the parameters obtained as a result of training are difficult to interpret. That is, it is difficult to interpret why the characteristics obtained by AI are the way they are. This is called the interpretability problem. However, to use AI in a real system, it is necessary to be able to understand and explain why such characteristics are obtained. Explainable AI (XAI), which refers to ML models whose results and the processes leading to them are interpretable by humans, has been actively studied in recent years. A typical technique to realize XAI is LIME [36], which is a local approximation approach that represents the AI’s decision logic for specific input data in an interpretable form. XAI is also an important technology for 6G.

References

[1] NTT DOCOMO, “White Paper: 5G Evolution and 6G (Version 4.0),” Jan. 2022. https://www.docomo.ne.jp/english/binary/pdf/corporate/technology/whitepaper_6g/DOCOMO_6G_White_PaperEN_v4.0.pdf
[2] The Ministry of Internal Affairs and Communications (MIC), “Beyond 5G Promotion Strategy — Roadmap towards 6G —,” June 2020. https://www.soumu.go.jp/main_sosiki/joho_tsusin/eng/presentation/pdf/Beyond_5G_Promotion_Strategy-Roadmap_towards_6G-.pdf
[3] National Institute of Information and Communications Technology (NICT), “Beyond 5G/6G White Paper (English Version 2.0),” June 2022. https://beyond5g.nict.go.jp/images/download/NICT_B5G6G_WhitePaperEN_v2_0.pdf
[4] KDDI Corporation, KDDI Research, Inc., “Beyond 5G/6G White Paper (Version 2.0.1),” Oct. 2021. https://www.kddi-research.jp/sites/default/files/kddi_whitepaper_en/pdf/KDDI_B5G6G_WhitePaperEN_2.0.1.pdf
[5] NOKIA Bell Labs, “Communications in the 6G Era,” Sept. 2020. https://d1p0gxnqcu0lvz.cloudfront.net/documents/Asset_20200909
220306.pdf
[6] H. Viswanathan and P.E. Mogensen, “Communications in the 6G Era,” IEEE Access, vol.8, pp.57063–57074, 2020.
[7] Ericsson, “6G – Connecting a cyber-physical world,” Feb. 2022. https://www.ericsson.com/4927de/assets/local/reports-papers/white-papers/6g--connecting-a-cyber-physical-world.pdf
[8] 6G Flagship, University of Oulu, “6G Visions on paper,” https://www.6gflagship.com/white-papers/
[9] J. Du, C. Jiang, J. Wang, Y. Ren, and M. Debbah, “Machine learning for 6G wireless networks: Carrying forward enhanced bandwidth, massive access, and ultrareliable/low-latency service,” IEEE Veh. Technol. Mag., vol.15, no.4, pp.122–134, Dec. 2020.
[10] 3GPP TR 23.791, “Study of Enablers for Network Automation for 5G,” June 2019.
[11] T. O’Shea and J. Hoydis, “An introduction to deep learning for the physical layer,” IEEE Trans. Cogn. Commun. Netw., vol.3, no.4, pp.563–575, Dec. 2017.
[12] X. Tan, W. Xu, K. Sun, Y. Xu, Y. Be’ery, X. You, and C. Zhang, “Improving massive MIMO message passing detectors with deep neural network,” IEEE Trans. Veh. Technol., vol.69, no.2, pp.1267–1280, Feb. 2020.
[13] J. Tachibana and T. Ohtsuki, “Learning and analysis of damping factor in massive MIMO detection using BP algorithm with node selection,” IEEE Access, vol.8, pp.96859–96866, 2020.
[14] J.R. Hershey, J.L. Roux, and F. Weninger, “Deep unfolding: Model-based inspiration of novel deep architectures,” arXiv:1409.2574v4, 2014.
[15] V. Lempitsky, A. Vedaldi, and D. Ulyanov, “Deep image prior,” Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., pp.9446–9454, June 2018.
[16] H. Hirose, T. Ohtsuki, and G. Gui, “Deep learning-based channel estimation for massive MIMO systems with pilot contamination,” IEEE Open J. Veh. Technol., vol.2, pp.67–77, 2021.
[17] J. Zeng, J. Sun, G. Gui, B. Adebisi, T. Ohtsuki, H. Gacanin, and H. Sari, “Downlink CSI feedback algorithm with deep transfer learning for FDD massive MIMO system,” IEEE Trans. Cogn. Commun. Netw., vol.7, no.4, pp.1253–1265, Dec. 2021.
[18] H. Echigo, Y. Cao, M. Bouazizi, and T. Ohtsuki, “A deep learning-based low overhead beam selection in mmWave communications,” IEEE Trans. Veh. Technol., vol.70, no.1, pp.682–691, Jan. 2021.
[19] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning, MIT Press, Cambridge, MA, USA, 2016.
[20] F.A. Aoudia and J. Hoydis, “Model-free training of end-to-end communication systems,” IEEE J. Sel. Areas Commun., vol.37, no.11, pp.2503–2516, Nov. 2019.
[21] W. Fukuda, T. Abiko, T. Nishimura, T. Ohgane, Y. Ogawa, Y. Ohwatari, and Y. Kishiyama, “Low-complexity detection based on belief propagation in a massive MIMO system,” Proc. VTC 2013-Spring, Dresden, Germany, pp.1–5, June 2013.
[22] J. Yang, C. Zhang, X. Liang, S. Xu, and X. You, “Improved symbol based belief propagation detection for large-scale MIMO,” Proc. IEEE Workshop Signal Process. Syst. (SiPS), pp.1–6, Oct. 2015.
[23] F. Long, T. Lv, R. Cao, and H. Gao, “Single edge based belief propagation algorithms for MIMO detection,” Proc. IEEE Sarnoff Symposium, Princeton, USA, pp.1–5, May 2011.
[24] Y. Gao, H. Niu, and T. Kaiser, “Massive MIMO detection based on belief propagation in spatially correlated channels,” Proc. IEEE 11th Int. ITG Conf. Syst. Commun. Coding, pp.1–6, Feb. 2017.
[25] T. Takahashi, S. Ibi, T. Ohgane, and S. Sampei, “On normalized belief of Gaussian BP in correlated large MIMO channels,” 2016 International Symposium on Information Theory and Its Applications (ISITA), pp.458–462, Nov. 2016.
[26] J. Tachibana and T. Ohtsuki, “Massive MIMO detection with BP algorithm using DIP for removing residual interference and noise and DNN for learning scaling factor,” IEICE Technical Report, RCS2020-253, March 2021 (in Japanese).
[27] H. Yin, D. Gesbert, M. Filippou, and Y. Liu, “A coordinated approach to channel estimation in large-scale multiple-antenna systems,” IEEE J. Sel. Areas Commun., vol.31, no.2, pp.264–273, Feb. 2013.
[28] H. Huang, J. Yang, H. Huang, Y. Song, and G. Gui, “Deep learning for super-resolution channel estimation and DOA estimation based massive MIMO system,” IEEE Trans. Veh. Technol., vol.67, no.9, pp.8549–8560, Sept. 2018.
[29] S.J. Pan and Q. Yang, “A survey on transfer learning,” IEEE Trans. Knowl. Data Eng., vol.22, no.10, pp.1345–1359, Oct. 2010.
[30] 3GPP TR 38.901, “Study on channel model for frequencies from 0.5 to 100 GHz,” 3rd Generation Partnership Project; Technical Specification Group Radio Access Network, 2017.
[31] M. Inoue and T. Ohtsuki, “Evaluation of source data selection for DTL based CSI feedback method in FDD massive MIMO systems,” IEICE Technical Report, RCS2022-62, June 2022 (in Japanese).
[32] P. Wang, Y. Li, L. Song, and B. Vucetic, “Multi-gigabit millimeter wave wireless communications for 5G: From fixed access to cellular networks,” IEEE Commun. Mag., vol.53, no.1, pp.168–178, Jan. 2015.
[33] L. Wei, Q. Li, and G. Wu, “Exhaustive, iterative and hybrid initial access techniques in mmwave communications,” 2017 IEEE Wireless Communications and Networking Conference (WCNC), pp.1–6, 2017.
[34] M. Giordani, M. Mezzavilla, C.N. Barati, S. Rangan, and M. Zorzi, “Comparative analysis of initial access techniques in 5G mmWave cellular networks,” 2016 Annual Conference on Information Science and Systems (CISS), pp.268–273, 2016.
[35] C.-H. Lin, W.-C. Kao, S.-Q. Zhan, and T.-S. Lee, “Bsnet: A deep learning-based beam selection method for mmWave communications,” 2019 IEEE 90th Vehicular Technology Conference (VTC2019-Fall), pp.1–6, 2019.
[36] M.T. Ribeiro, S. Singh, and C. Guestrin, ““Why should I trust you?” Explaining the predictions of any classifier,” KDD’16: Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp.1135–1144, Aug. 2016.
Tomoaki Ohtsuki received the B.E., M.E.,
and Ph.D. degrees in Electrical Engineering
from Keio University, Yokohama, Japan in 1990,
1992, and 1994, respectively. From 1994 to 1995
he was a Post Doctoral Fellow and a Visiting Re-
searcher in Electrical Engineering at Keio Uni-
versity. From 1993 to 1995 he was a Special
Researcher of Fellowships of the Japan Society
for the Promotion of Science for Japanese Junior
Scientists. From 1995 to 2005 he was with the
Science University of Tokyo. In 2005 he joined
Keio University. He is now a Professor at Keio University. From 1998 to
1999 he was with the department of electrical engineering and computer
sciences, University of California, Berkeley. He is engaged in research
on wireless communications, optical communications, signal processing,
and information theory. Dr. Ohtsuki is a recipient of the 1997 Inoue
Research Award for Young Scientist, the 1997 Hiroshi Ando Memorial
Young Engineering Award, Ericsson Young Scientist Award 2000, 2002
Funai Information and Science Award for Young Scientist, IEEE the 1st
Asia-Pacific Young Researcher Award 2001, the 5th International Commu-
nication Foundation (ICF) Research Award, 2011 IEEE SPCE Outstand-
ing Service Award, the 27th TELECOM System Technology Award, ETRI
Journal’s 2012 Best Reviewer Award, 9th International Conference on Com-
munications and Networking in China 2014 (CHINACOM ’14) Best Paper
Award, 2020 Yagami Award, and The 26th Asia-Pacific Conference on
Communications (APCC2021) Best Paper Award. He has published more
than 240 journal papers and 470 international conference papers. He served
as a Chair of IEEE Communications Society, Signal Processing for Com-
munications and Electronics Technical Committee. He served as a technical
editor of the IEEE Wireless Communications Magazine and an editor of
Elsevier Physical Communications. He is now serving as an Area Editor
of the IEEE Transactions on Vehicular Technology and an editor of the
IEEE Communications Surveys and Tutorials. He is also serving as the
IEEE Communications Society, Asia Pacific Board Director. He has served
as general co-chair, symposium co-chair, and TPC co-chair of many conferences,
including IEEE GLOBECOM 2008, SPC, IEEE ICC 2011, CTS,
IEEE GLOBECOM 2012, SPC, IEEE ICC 2020, SPC, IEEE APWCS, IEEE
SPAWC, and IEEE VTC. He gave tutorials and keynote speeches at many in-
ternational conferences including IEEE VTC, IEEE PIMRC, IEEE WCNC,
and so on. He was Vice President and President of the Communications
Society of the IEICE. He is a senior member and a distinguished lecturer of
the IEEE, a fellow of the IEICE, and a member of the Engineering Academy
of Japan.
