NEC UPF Whitepaper
(UPF+PGW-U/SGW-U+GGSN-U)
[…] planes. The control plane (C-plane) nodes provide […] traditionally been collocated, are now completely decoupled. This is known as Control and User Plane […]

• IPv4, IPv6 and IPv4/v6 packet […]
• QoS policy definition for IP traffic based on DSCP
• N+1 node redundancy
(PGW-U/SGW-U+GGSN-U) on top of 5G U-plane
function, enabling investment cost reduction.
This whitepaper introduces the details of NEC’s
high-performance UPF and the technology behind
it, together with our extensive product lineup and
use cases.
Performance test results
for containerized UPF software
NIC                 Intel® Ethernet Network Adapter E810-2CQDA2 ×4
Host OS             CentOS Linux 8.2
Host OS kernel      4.18.0-193.28.1.el8_2.x86_64
Kubernetes Ver.     1.19.3
No. of QERs         2 QERs per session
No. of URRs         2 URRs per session
Ratio of UL/DL      UL:DL = 1:3
User Packet Size    800 bytes
[Figure: eight UPF Pods on one server, 80 Gbps per Pod, 640 Gbps per server]
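The headline figure follows directly from the per-Pod result, and the stated packet size and UL/DL ratio allow a rough estimate of what it means in packets per second. The sketch below is only back-of-the-envelope arithmetic. It assumes that 1 Gbps is 10^9 bits per second, that throughput is counted over the 800-byte user packets alone (no GTP-U or outer-header overhead), and that the 640 Gbps is the sum of uplink and downlink. None of these assumptions is stated in the test conditions, and the function names are illustrative.

```python
# Figures taken from the test conditions and result above.
PODS_PER_SERVER = 8
GBPS_PER_POD = 80
USER_PACKET_BYTES = 800
UL_SHARE, DL_SHARE = 1, 3  # UL:DL = 1:3


def server_throughput_gbps(pods=PODS_PER_SERVER, per_pod_gbps=GBPS_PER_POD):
    """Aggregate throughput when every Pod sustains its per-Pod rate."""
    return pods * per_pod_gbps


def packet_rate_mpps(gbps, packet_bytes=USER_PACKET_BYTES):
    """Approximate packet rate in millions of packets per second.

    Assumption: only user packet bytes are counted, so encapsulation
    overhead is ignored and the real rate may differ.
    """
    bits_per_packet = packet_bytes * 8
    return gbps * 1e9 / bits_per_packet / 1e6


def split_ul_dl(gbps, ul=UL_SHARE, dl=DL_SHARE):
    """Split a total rate into (uplink, downlink) by the UL:DL ratio."""
    total = ul + dl
    return gbps * ul / total, gbps * dl / total


if __name__ == "__main__":
    total = server_throughput_gbps()
    print("server throughput [Gbps]:", total)
    print("approx. packet rate per server [Mpps]:", packet_rate_mpps(total))
    print("approx. packet rate per Pod [Mpps]:", packet_rate_mpps(GBPS_PER_POD))
    print("UL/DL split [Gbps]:", split_ul_dl(total))
```

Under these assumptions the server handles on the order of 100 million packets per second, or about 12.5 million per Pod, which is why the per-packet optimizations in the following section matter.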
NEC’s high performance
UPF architecture
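[Figure: UPF software architecture. Each core (Core #0, Core #1, ... Core #N) runs the packet flow GTP Decap, PDR LookUp, QoS (Policing/Marking), Charging, GTP Encap. The server has two CPU sockets (CPU socket #0 and CPU socket #1), each with its own main memory. Callouts: "Maximizing CPU utilization" and "Optimizing CPU processing through bulk processing".]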
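The per-core stages named in the architecture figure, together with its bulk-processing callout, can be illustrated with a toy pipeline. The sketch below is not NEC's implementation. The `Packet` and `Context` types, the stage bodies, and `BURST_SIZE` are all hypothetical. Python cannot show the instruction-cache benefit itself. It only contrasts the two loop orders: running all stages for one packet at a time, versus running one stage over a whole burst before moving to the next stage.

```python
from dataclasses import dataclass, field

BURST_SIZE = 32  # hypothetical burst size; the whitepaper does not state one


@dataclass
class Packet:
    teid: int                  # tunnel ID used here as the session key
    payload_len: int
    rule: str = ""             # filled in by the PDR lookup stage
    dscp: int = 0              # filled in by the QoS marking stage
    encapsulated: bool = True  # packets arrive GTP-encapsulated in this toy


@dataclass
class Context:
    pdr_table: dict                                  # teid -> rule name
    usage_bytes: dict = field(default_factory=dict)  # teid -> counted bytes


def gtp_decap(pkt, ctx):
    pkt.encapsulated = False


def pdr_lookup(pkt, ctx):
    pkt.rule = ctx.pdr_table.get(pkt.teid, "default")


def qos_mark(pkt, ctx):
    # Toy policy: mark "voice" traffic with DSCP 46 (EF), everything else 0.
    pkt.dscp = 46 if pkt.rule == "voice" else 0


def charging(pkt, ctx):
    ctx.usage_bytes[pkt.teid] = ctx.usage_bytes.get(pkt.teid, 0) + pkt.payload_len


def gtp_encap(pkt, ctx):
    pkt.encapsulated = True


STAGES = [gtp_decap, pdr_lookup, qos_mark, charging, gtp_encap]


def process_per_packet(packets, ctx):
    """Run every stage for one packet before touching the next packet."""
    for pkt in packets:
        for stage in STAGES:
            stage(pkt, ctx)


def process_in_bulk(packets, ctx, burst_size=BURST_SIZE):
    """Run one stage over a whole burst, then the next stage, and so on."""
    for start in range(0, len(packets), burst_size):
        burst = packets[start:start + burst_size]
        for stage in STAGES:
            for pkt in burst:
                stage(pkt, ctx)
```

Both functions produce the same packets and the same usage counters. The difference is that the bulk variant keeps the same stage code running across many packets in a row, which is the access pattern that benefits from instruction caching in a compiled data plane.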
2. Faster memory access with HugePages

Linux provides a virtual memory space that abstracts the physical memory space, making it easier for applications to handle. Virtual and physical memory spaces are divided into pages (indicating size), and each page is managed by assigning an address number to it. The default size of a page is 4 KB.

When an application needs to look up data on the memory, it specifies the data location by using the virtual memory address. Then, the processor looks up the conversion table on the kernel to translate the virtual memory address into the physical memory address and accesses the data after understanding its location on the physical memory.

When the application is handling a huge volume of data, the frequency of address translation by the kernel becomes high if the page size is small, which can degrade the application performance. UPF is one such application that consumes a huge volume of data, with gigabytes of memory to perform a large amount of packet forwarding.

To improve the situation, NEC has enabled Linux’s HugePages function and configured the UPF memory’s page size to 1 GB. This allows a huge memory to be looked up efficiently, reducing the processing overhead.

3. Faster memory access by prefetching data

Prefetching is a process in which the data that a program references or uses is fetched from the main memory (DRAM) to the CPU’s cache memory (D-Cache) ahead of time. If the program fetches data by accessing the main memory after a process starts, that process will be delayed by the time taken to access DRAM. Prefetching can reduce this delay. NEC’s UPF proactively uses prefetching to reduce processing overhead, which boosts the processing speed.

4. Optimizing CPU processing through bulk processing

The CPU has a mechanism to cache application instructions (I-Cache) just like data cache. If a process needs to be executed multiple times, instructions are cached the second time onwards for faster processing. Processing instructions multiple times simultaneously with this mechanism is called bulk processing.

Packet forwarding in UPF essentially repeats the same processing steps for each packet. NEC decided to aggregate multiple packets at the same time in each processing step so that they are processed in bulk. This boosts the efficient use of instruction caching and improves the throughput accordingly.

5. Avoiding communication between CPU sockets

In a server with two Intel® Xeon® Scalable processors, the memory is split and connected under each CPU socket. So when the memory connected under processor B is accessed from processor A, communication takes place via Intel® Ultra Path Interconnect (UPI), which is a communication path that connects processors. However, UPI capacity can be a bottleneck in processes that handle large volumes of data, slowing down the processing speed. To overcome this bottleneck, NEC’s UPF software is designed to prevent memory access across UPI.

6. Offloading to NIC the user identification processing

To increase the efficiency of packet forwarding in UPF, the same user’s packets must be sent to the same CPU core for user-specific processing.

To achieve this, until now the user identification for received packets was performed in the UPF software, which was then distributed to the same CPU core that is assigned to handle a specific user’s traffic. But in this case, a switch would occur between the core that performs first-level of processes for user identification, and the core that performs subsequent user-specific processes of the packet forwarding. The core was poorly utilized, and the process itself was complicated.

Now, NEC has offloaded the user-identification process to NIC by using the Dynamic Device Personalization (DDP) function provided by Intel NIC (E810). The DDP function performs user identification for a packet on the NIC, and forwards the packet to a specific core assigned to that user.

The DDP function improves the utilization of CPU cores because the UPF software doesn’t need to perform the user identification process anymore. In addition, packets no longer need to be redistributed (switched) between cores, reducing the time lost in the switching process. The overall software processing has become simpler, with significant improvements in processing efficiency and speed.
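The user-to-core steering described above can be modeled in a few lines of software. This is only an illustration of the idea, not the E810 DDP mechanism or its configuration. The sketch assumes that a user is identified by the UE IP address (the source address of an uplink user packet and the destination address of a downlink one), and it uses a CRC32 hash as a stand-in for whatever hash the NIC applies. The `Packet` type and all function names are hypothetical.

```python
import zlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    direction: str  # "UL" (from the UE) or "DL" (towards the UE)
    src_ip: str     # user-plane source address
    dst_ip: str     # user-plane destination address


def user_key(pkt: Packet) -> bytes:
    """Assumed user identity: the UE's IP address in either direction."""
    ue_ip = pkt.src_ip if pkt.direction == "UL" else pkt.dst_ip
    return ue_ip.encode("ascii")


def queue_for_packet(pkt: Packet, num_queues: int) -> int:
    """Pick an Rx queue from a stable hash of the user key.

    Stand-in for NIC-side steering: the same user always maps to the
    same queue, and therefore to the same core.
    """
    return zlib.crc32(user_key(pkt)) % num_queues


def steer(packets, num_queues):
    """Group packets by the queue (core) that would receive them."""
    queues = defaultdict(list)
    for pkt in packets:
        queues[queue_for_packet(pkt, num_queues)].append(pkt)
    return queues


if __name__ == "__main__":
    traffic = [
        Packet("UL", "10.0.0.1", "203.0.113.5"),
        Packet("DL", "203.0.113.5", "10.0.0.1"),
        Packet("UL", "10.0.0.2", "198.51.100.9"),
    ]
    for queue_id, pkts in sorted(steer(traffic, num_queues=8).items()):
        print(queue_id, [p.direction + ":" + user_key(p).decode() for p in pkts])
```

Because the queue is chosen from the user key before the packet reaches software, the core that receives a packet is also the core that runs the user-specific processing, so no software redistribution step is needed.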
[Figure: Traditional architecture. The NIC uses RSS to place packets on Rx queues 1 to N. First level of packet distribution: distribute packets to each core without identifying the user. Rx (received) task threads on Core#1 to Core#N each run the Rx process, user identification, and packet redistribution into a soft queue. Second level of packet distribution: re-distribute packets based on user identification. Worker task threads on Core#N+1 to Core#N+M each run the user-specific process and the Tx process, sending through Tx queues 1 to M to the NIC.]

[Figure: Optimized architecture. User identification processing is offloaded to the NIC, and RSS distributes packets to each core based on the NIC's (DDP function) user identification. Each core thread (Core#1, Core#2, ...) runs a combined Rx (received) + Worker task: Rx process, user-specific process, and Tx process, from its Rx queue to its Tx queue. Packets don't need to be re-distributed between cores, improving processing efficiency and maximizing CPU utilization.]
UPF product lineup for a variety of
deployment scenarios and use cases
[Figure: deployment scenarios showing RAN sites, UPFs, and the 5G core (control plane)]
Realizing B2B2X business models

NEC UPF easily supports a wide range of use cases because it can be deployed on a variety of platforms. In addition to the commercial networks of mobile network operators and the local 5G networks of enterprises, it is also expected to make business development easier for the B2B2X model, in which CSPs collaborate with industry partners to deliver solutions for various industries.

For example, if a 5G solution for smart factories is to be delivered through a B2X model, the C-plane can be offered as a managed service by deploying the 5G core in a central data center, while the UPF and RAN equipment are deployed in the user environment (factory).

In this case, U-plane traffic stays within the local network, so that you can securely handle data that you don’t want to share externally. This can be used in various scenarios, for example to perform consolidated analysis of a factory’s operation and management by connecting to the company’s data network via UPF.

Further, the U-plane can be flexibly deployed wherever needed in the network, so cost-effective and optimal U-plane deployment is possible in usage scenarios with different requirements like high bandwidth and ultra-low latency.

In addition to UPF, NEC’s 5G core network consists of containerized cloud-native components, and uses a 3GPP-compliant open architecture. 5G is a technology that is likely to be deployed in diverse businesses. In this context, NEC’s 5G products facilitate rapid, flexible, and powerful 5G deployments for users in all scenarios.