
arXiv:1609.05510 [physics.optics] v3 1 January 2017

Attojoule Optoelectronics for Low-Energy Information Processing and Communications – a Tutorial Review
David A. B. Miller, Fellow, IEEE

Abstract: Optics offers unique opportunities for reducing energy in information processing and communications while simultaneously resolving the problem of interconnect bandwidth density inside machines. Such energy dissipation overall is now at environmentally significant levels; the source of that dissipation is progressively shifting from logic operations to interconnect energies. Without the prospect of substantial reduction in energy per bit communicated, we cannot continue the exponential growth of our use of information. The physics of optics and optoelectronics fundamentally addresses both interconnect energy and bandwidth density, and optics may be the only scalable solution to such problems. Here we summarize the corresponding background, status, opportunities, and research directions for optoelectronic technology and novel optics, including sub-femtojoule devices in waveguide and novel 2D array optical systems. We compare different approaches to low-energy optoelectronic output devices and their scaling, including lasers, modulators and LEDs, optical confinement approaches (such as resonators) to enhance effects, and the benefits of different material choices, including 2D materials and other quantum-confined structures. With such optoelectronic energy reductions, and the elimination of line charging dissipation by the use of optical connections, the next major interconnect dissipations are in the electronic circuits for receiver amplifiers, timing recovery and multiplexing. We show we can address these through the integration of photodetectors to reduce or eliminate receiver circuit energies, free-space optics to eliminate the need for timing and multiplexing circuits (while also solving bandwidth density problems), and using optics generally to save power by running large synchronous systems. One target concept is interconnects from ~ 1 cm to ~ 10 m that have the same energy (~ 10 fJ/bit) and simplicity as local electrical wires on chip.

Index Terms: Optical interconnections, optical communications, space-division multiplexing, wavelength-division multiplexing, integrated optoelectronics, quantum-confined Stark effect, optical resonators, optical arrays, optoelectronic devices, optical computing

I. INTRODUCTION

ENERGY already limits our ability to process and communicate information. It constrains the design of information processing machines for simple reasons of power delivery, battery life, power dissipation and heat removal. The fraction of energy used for handling information has risen to a level that is environmentally significant [1], [2]. For these reasons, if we cannot continue reducing the energy required to handle each bit, then we cannot continue our exponential growth in the use of information.

In the early days of transistors and integrated circuits, much of the power was in the logic devices themselves. Over time, ever smaller transistors (“Moore’s Law” [3]) reduced that logic energy per bit. That reduction is continuing, even if at a slower pace [4], [5], [6]. But the energy to send information inside electronic machines does not scale down in the same way, especially for longer connections. As a result, most of the energy dissipated inside electronic machines is used to communicate; for example, even on silicon chips 50 – 80% of gates are for the “repeater” amplifiers in long interconnect lines on the chip [7], and information also has to be driven off chip [5], and over data links and networks [8], [9], at much greater energies per bit.

The remarkable and growing role of optics in the past few decades has enabled a continuing [10], [11] exponential growth of long-distance communications; the capacity of an individual optical fiber has grown at a rate comparable to Moore’s law [12]. Increasingly, optics is allowing higher densities of communication inside large systems, as in optical data links in data centers [13], [14]. But now we are facing a need to have optics help at shorter distances, and not just to enable higher interconnect densities. Now a key question is whether optics can reduce energy in interconnects inside cabinets, racks, and circuit boards, down at least to the edges of the chips themselves, and possibly even on the chip. This question is critical: if we cannot solve these problems with optics, it is not clear that we have any other way of tackling them.

A. Goals for this review

At the time of writing this review, we are approximately at the point where, with current and emerging technology, optics is poised to provide at least modest energy reductions for data links compared to electrical approaches, even for relatively short links between cabinets and in backplanes or module connections [14], [15], [16], [17], [18], [19], [20].

The main point of this tutorial review is to expose the opportunities, requirements and challenges if we are to take such reduction in energy for communication substantially further, possibly by orders of magnitude. The main focus will be on potential applications within racks, or possibly local groups of racks, and down to the chip, or possibly for longer interconnects on chip – essentially, lengths from ~ 1 cm to ~

This project was supported by Multidisciplinary University Research Initiative grant (Air Force Office of Scientific Research, FA9550-12-1-0024).
D. A. B. Miller is with the Ginzton Laboratory, Stanford University, Stanford, CA 94305-4088 (e-mail: dabm@ee.stanford.edu).

10 m. For this article, we will call all such communications links “interconnects”. Such interconnects may correspond to the majority of power dissipation in large information processing and communications systems today.

We can propose several key goals for such an approach.
1) We should move from energies for such ~ 1 cm to ~ 10 m interconnects that are currently in the range of picojoules or larger total energy per bit, down towards ~ 10 fJ or lower total energy per bit.
2) Such interconnects should look, both in use and in energy, as simple as a short electrical wire.
3) This interconnect approach should have sufficient density largely to eliminate the bandwidth bottleneck in current interconnect systems.
4) Such an optical technology should be one that can be the mainstream technology for all communications at distances from ~ 1 cm up to ~ 10 m, so we get these benefits for wide ranges of systems.

These goals are aggressive and even radical. Nonetheless, we argue here for how we could reach them, and we propose various promising research directions and opportunities. Such an optical approach would transform the power dissipation of modules, boards, server racks, internet routers, and supercomputers, while freeing the architectures from their current bandwidth constraints.

B. How optics can reduce interconnect energy

There are two major ways in which optics can reduce energy for interconnects.

1) Avoid charging electrical lines
The charging and discharging of electrical wires is the ultimate source of dissipation in simple electrical interconnects [2], [21], [22], [23]; optics can eliminate this through “quantum impedance conversion” [21].
Such optical interconnects become attractive energetically when the energy to run the optoelectronic devices – the photodetectors, modulators and/or lasers – becomes less than the charging energy of the corresponding electrical line. This requirement drives us to make low-energy, “attojoule” optoelectronic devices intimately integrated with electronics. Work towards such device technologies is under way and there are several promising directions.

2) Eliminate electronic circuitry used to run links
This second way in which optics can reduce energy for interconnects is less obvious, but equally important: optics can eliminate the power dissipation of the electronic circuits used to operate data links. For links much longer than simple chip-to-chip lines, and possibly even at that level to some degree, both optical and electrical links currently have to add various other circuits to ensure reliable communications. These circuits include receiver amplifiers, clock and data recovery (CDR) circuits, line coders (to allow AC coupled amplifiers and also CDR), and serialization and deserialization (SERDES) circuits (i.e., for time multiplexing and demultiplexing), in addition to the basic line or output driver circuits¹. Such circuits together currently consume energies of the order of picojoules per bit (see, e.g., [24]). If we do not eliminate such energies, then we will see limited additional energy benefit from “attojoule” optoelectronics for any such longer links.

Fortunately, optics has additional features that can eliminate such circuits. In addition to saving energy, such an approach can also help simplify the interconnect so that quite long interconnects look as simple as short local wires. The integration of low-capacitance photodetectors can largely eliminate the dissipation from electronic circuits used for receiver amplifiers, in what we will call “receiverless” or “near-receiverless” operation, and we will discuss this below. To eliminate the line coder, CDR and SERDES circuits, we can exploit two other features of optics, which we will also discuss below:
i. optics offers very large numbers and densities of physical channels for links of all lengths [25], including very large parallelism with free-space array optics, which means we can choose to avoid SERDES and line coding while simultaneously eliminating bandwidth density problems;
ii. optics offers the possibility of very large (e.g., ~ 10 m) synchronous zones [22] because of the timing precision and stability of optical channels, which means we can avoid CDR.

These latter two features of optics have not been part of much recent discussion, but they represent a substantial opportunity, at least as important as the reduction of energy in optoelectronic devices themselves².

C. Organization of this paper

This paper is organized as follows. In Section II we will summarize some of the background in energy in information processing and communication systems. Section III examines some general aspects of energy dissipation in optoelectronic devices and their scaling to attojoule energy ranges. Section IV summarizes approaches and mechanisms for low-energy optoelectronic output devices, including modulators and light-emitters. Section V discusses photodetectors together with their receiver circuits. Section VI compares long, medium and short distance optical communication systems, showing in particular the different requirements for short distance interconnects. In Section VII, we concentrate on the specific issues and opportunities in optical systems themselves in short distance interconnects. Section VIII discusses the issues and power dissipation of circuits to deal with timing problems in links, and how to eliminate these using optics. Section IX gives a sketch of an example optical system approach for exploiting the many benefits of optics for reducing energy in information processing. Conclusions and recommendations for key research directions are summarized in Section X. To make the article easier to read, various detailed topics are covered in Appendices. Appendix A in particular is an extended discussion

¹ Electrical links may also have to add equalization and multilevel signaling circuits to allow sufficient data rates in the presence of signal distortion and loss on electrical lines.
² This article may represent the first substantial exposition of such ideas to use optics to eliminate line coding, SERDES and CDR circuits.

of physical mechanisms for optical modulators.

In giving numerical examples, for the sake of definiteness, we will typically calculate for devices and systems running at ~ 1.5 μm wavelength. That wavelength is certainly consistent with many current technologies, such as silicon photonics and fiber telecommunications. This choice is not meant to be restrictive, however; for short interconnects especially, other wavelengths are possible, including near-infrared such as 850 nm wavelength, or even visible wavelengths, though in broad terms the choice of wavelength does not substantially change the arguments and conclusions here.

We should emphasize before going any further that this article is only intended to provide the context and overall background for research in this area. Because of its wide scope, it cannot review any area in great detail. As a result, the references and citations here can only be representative rather than comprehensive. Though we attempt to cite some seminal work, generally we reference just recent representative examples in many fields. This should allow readers themselves to trace backwards for more depth, but this author apologizes to the authors of the many worthy papers that are not credited appropriately.

II. ENERGY IN INFORMATION PROCESSING AND COMMUNICATION

A. Growth in information communications

Since the beginning of the internet, the bandwidth of information communicated over it has grown remarkably. The total bit rate for internet traffic surpassed that of telephone traffic approximately at the beginning of the 21st century [26], at which time internet traffic was growing at a rate of approximately a factor of 100 per decade [26]. Total internet traffic as of 2016 is estimated at ~ 280 Tb/s ($280 \times 10^{12}$ b/s) [27]. To get some sense of scale, we can compare to voice data rates; at ~ 32 kb/s for a voice channel, such a data rate corresponds approximately to everyone in the world talking at once. One current estimate [27] predicts a further factor of 3 increase in internet traffic over 5 years.

Though such an internet data rate may seem large, there is much more data sent over shorter links. One estimate is that ~ $10^6$ bits are communicated inside a data center for every 1 bit that leaves it [28]. Already in 2012, a network connecting servers inside just one data center had a capacity of >1 Pb/s ($10^{15}$ b/s) [29]; such data center network traffic largely does not count the communication of information inside the racks of servers or within the servers themselves, which can only be larger.

To get a sense of interconnect traffic at shorter distances deeper inside information processing machines, we can look at the interconnect rates associated directly with silicon chips themselves. An example graphics processor chip [5] has a peak data rate on and off the chip of 1.4 Tb/s, so just 200 such chips are capable of generating as much information transmission as the entire global long-distance internet traffic. Another recent processor chip [30] has interconnections to off-chip memory with 1.28 Tb/s bandwidth, and other input and output (I/O) connections supporting more than 600 Gb/s, for a total of nearly 2 Tb/s off-chip bandwidth.

TABLE I
ENERGIES FOR COMMUNICATIONS AND COMPUTATIONS

Operation                                     Energy per bit      References and notes
Wireless data                                 10 – 30 μJ          [31]
Internet: access                              40 – 80 nJ          [8]; (a), (b)
Internet: routing                             20 nJ               [9]; (c)
Internet: optical WDM links                   3 nJ                [32]; (d)
Reading DRAM                                  5 pJ                [5]; (e)
Communicating off chip                        1 – 20 pJ           [5], [15], [16]
Data link multiplexing and timing circuits    ~ 2 pJ              [24]
Communicating across chip                     600 fJ              [5]; (f)
Floating point operation                      100 fJ              [5]; (g)
Energy in DRAM cell                           10 fJ               [33]; (h)
Switching CMOS gate                           ~ 50 aJ – 3 fJ      [4], [6], [34], [35]; (i)
1 electron at 1 V, or 1 photon at 1 eV        0.16 aJ (160 zJ)

WDM – wavelength division multiplexing
DRAM – dynamic random-access memory
CMOS – complementary metal-oxide-semiconductor transistor
(a) Uses projections to 2016 from [8].
(b) Presumes wired connections (optical or electrical) to the network.
(c) Total for 20 “hops” per internet connection, and derating energies from the 2008 numbers in [9] using a factor of 0.74 per year (from [8]) for improved electronics energy efficiency.
(d) Total for 20 “hops” per internet connection, and using projections to 2016 in [32].
(e) Rounded sum of the DRAM access and interface energies as projected for 2017 in [5], for off-chip DRAM.
(f) Based on 2017 projections in [5] for a 10 mm line on the chip.
(g) Double-precision fused multiply-add (floating-point) operation using the projection in [5] of ~ 6.5 pJ in 2017 for this 64-bit operation to calculate energy per bit.
(h) Based on the relative constancy of DRAM cell capacitance at greater than ~ 20 fF, and a ~ 1 V charging voltage.
(i) We might estimate a lower limit ~ 10 aJ for switching a gate based on projected reductions in transistor capacitance, referenced in [34], and simulations of ~ 20 aF gate capacitance in current technologies [35], but such an energy is just for charging the gate itself, and further parasitic capacitance of at least ~ 40 aF is likely [35], even if we completely neglect other load capacitances and the fact that “complementary” electronic technology uses two transistors per stage. On this basis, and allowing some room for continued improvement, we quote the minimum of ~ 50 aJ. A projected overall energy per logic gate operation in an optimized processor core is ~ 200 aJ [4], which includes leakage power dissipation and some local connection and other energies. Current logic gate operating energies in systems with a fan-out of 3 are ~ 3 fJ [6].

The communications traffic inside the chip itself again can only be larger still. That same recent processor chip [30], for example, has an on-chip network supporting more than 4 Tb/s of bisection bandwidth³, and the total bandwidth in and out of the “level 3” (L3) cache memory on the chip is 12.8 Tb/s. We

³ Bisection bandwidth is the amount of data traffic that we would find if we divided a data network into two parts, and counted the traffic passing from one part to the other; usually, this will refer to the largest possible number we would find from any such division into two parts.
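As a rough check of the scale comparisons made in Section II.A, the short sketch below redoes the arithmetic using only the approximate figures quoted in the text (~ 280 Tb/s of internet traffic, ~ 32 kb/s per voice channel, ~ 1.4 Tb/s per graphics chip). The function and constant names are illustrative choices for this sketch, not anything defined in the cited references.

```python
# Back-of-envelope check of the traffic figures quoted in Section II.A.
# All inputs are the approximate values given in the text.

INTERNET_TRAFFIC_BPS = 280e12  # ~280 Tb/s total internet traffic (2016 estimate, [27])
VOICE_CHANNEL_BPS = 32e3       # ~32 kb/s for one voice channel
GPU_CHIP_IO_BPS = 1.4e12       # ~1.4 Tb/s peak on/off-chip rate of the example chip [5]


def equivalent_voice_channels(traffic_bps: float,
                              channel_bps: float = VOICE_CHANNEL_BPS) -> float:
    """Number of simultaneous voice channels carrying the same total bit rate."""
    return traffic_bps / channel_bps


def chips_to_match(traffic_bps: float,
                   chip_bps: float = GPU_CHIP_IO_BPS) -> float:
    """Number of chips whose combined off-chip rate equals the given traffic."""
    return traffic_bps / chip_bps


if __name__ == "__main__":
    voices = equivalent_voice_channels(INTERNET_TRAFFIC_BPS)
    chips = chips_to_match(INTERNET_TRAFFIC_BPS)
    print(f"Equivalent voice channels: {voices:.2e}")
    print(f"Chips needed to match internet traffic: {chips:.0f}")
```

With these inputs the voice-channel count comes out at several billion, which is the sense in which the traffic corresponds to "everyone in the world talking at once", and the chip count reproduces the figure of 200 quoted in the text.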

can generally expect yet more on-chip traffic into and out of lower level cache memory and within logic operations themselves.

As we look to reduce the energy in handling information, the energy in all such interconnects inside machines will be particularly critical.

B. Overall energy consumption

Information processing and computing, including data centers, personal computers and networks, were estimated to consume 4.6% of world electricity production in 2012 [1]; the growth rate of that consumption exceeds the growth rate in electricity generation capacity. If wireless communications and displays are included, the total rises to ~ 9% of electricity consumption. With such growth in the amount of information we are handling, information processing and communications cannot continue to grow at their recent exponential rates without continued, major reductions in the energy per bit.

C. Energy per bit in communications and processing

To understand where the energy is consumed, we can look at the approximate energies per bit in various processing and communications operations in Table I. Actual numbers can vary considerably, of course, and they will change as technology advances, but the overall orders of magnitude here give us insight, nonetheless. We can examine these energies in a few categories, starting from the smaller energies at the bottom of the table and working up to the larger energies at the top.

1) Energies for logic operations
The energies for logic operations themselves are small, ranging from possibly as low as ~ 10 – 100 aJ per bit inside a given logic gate to ~ 100 fJ per bit in a complicated operation such as floating point multiplication. Such energies have decreased over the decades as transistors have become smaller.

Note that even these small energies are much larger than the energy associated with one electron or one photon. Modern low-energy electronic devices work with relatively large numbers of electrons; even 10 aJ corresponds to ~ 60 electrons at 1 V. Changing to information processing systems that would use energies much smaller than 10 aJ would raise serious issues of statistical fluctuations; though we can consider reliable systems based on “unreliable” components⁴, such systems would require a major change in the paradigm of digital information processing as we know it.

For much of the history of Moore’s law, as the transistors became smaller, so also did the voltage to run them, following a rule known as Dennard scaling [36], [37]. The “dynamic” energy in operating a logic gate comes largely from charging and discharging the capacitances of the transistor itself and of the local wiring. (Logic gates can also dissipate “static” power even when they are not operating, such as through leakage currents.) The reduction in operating voltage meant the “dynamic” energy dissipation shrank even faster than the reduction in size would suggest.

More recently, however, the reduction of transistor operating voltage has largely stopped. This is because low gate voltages lead to a correspondingly smaller potential barrier between the source and the drain of a transistor in its “off” state, which leads to leakage current; the potential barrier height becomes too close to the average thermal energy of an electron at room temperature $T$ ($k_B T \approx 25$ meV, where $k_B$ is Boltzmann’s constant). So, to minimize the “static” power dissipation in chips, the operating voltage of logic gates is decreasing only slowly if at all [6], [37], [38]. Operating voltages of a substantial fraction of a volt (e.g., 0.8 V) are typical [6]. This approximate constancy of transistor operating voltage has also meant that the voltages on the interconnect lines on chips have stopped decreasing, which influences the energy of electrical interconnections, as we discuss below.

Present CMOS technology is based on FinFET structures [19], [39], [40] or approaches like fully-depleted silicon on insulator (FDSOI) [19], but the scaling approach and arguments here are quite different from Dennard scaling. One main point of such devices is to reduce drain-source leakage currents and related “short-channel” effects. The minimum dimension in such transistors is typically not now the gate or channel length, but rather the effective thickness of the channel; an effectively thinner channel allows it to be more fully depleted of carriers (electrons or holes) and reduces the drain-source leakage and “short-channel” effects.

Nonetheless, with smaller sizes in the devices the capacitances overall may still scale down [6], allowing correspondingly lower logic energies per bit. The combination of logic, local interconnection and leakage energies may, however, lead to a saturation in the total energy per bit in logic operations within a processing core [4], possibly in the range of ~ 100 aJ/bit.

2) Clock speeds and power dissipation in electronic chips
We might think we run electronic processor chips at clock rates of ~ 2 – 3 GHz because the transistors are slow. In fact, the basic operation speed of an electronic gate, even when driving a standard “fan-out” load of 3 other gates, would be ~ 3 ps with current technology [6].

In modern electronic processor chips, we limit clock speeds for two main reasons related to power dissipation:
(1) running transistors faster requires somewhat higher voltages [38], which means more energy per bit;
(2) increasing clock speeds mean more switching transitions per second – so more power dissipation even for the same energy per bit – but chips are already limited by the ability to extract heat from them [5], [38].

Note that, as we scale down transistors and wires, the total capacitance per unit area of the chip in wires and logic gates does not decrease; in fact, device [6] and wiring capacitance per unit chip area can even increase somewhat⁵. So, for a given clock frequency, we could actually have more power dissipation per unit area as we charge and discharge device and

⁴ The human brain is a good example of a system that can work well based on a somewhat statistical operation of potentially unreliable individual parts.
⁵ For example, wires of smaller cross-section could lead to more total length of wiring in a given area; since wire capacitance per unit length is largely constant (see Section II.D below), that would mean more wiring capacitance.
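Several of the small-energy figures quoted in Section II.C and in the notes to Table I follow from one-line calculations. The sketch below reproduces them under stated assumptions: room temperature is taken as 300 K (the text does not give a value), and the DRAM cell entry is assumed to be the stored energy $(1/2)CV^2$ for the ~ 20 fF and ~ 1 V of note (h). The function names are illustrative only.

```python
# Worked arithmetic for energies quoted in Section II.C and Table I.
# Assumptions (not stated in the text): room temperature = 300 K, and the
# DRAM cell energy is the stored energy (1/2) C V^2.

ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs (exact SI value)
BOLTZMANN_J_PER_K = 1.380649e-23       # joules per kelvin (exact SI value)


def electrons_for_energy(energy_j: float, voltage_v: float) -> float:
    """Number of electrons that carry energy_j when each gains e*V."""
    return energy_j / (ELEMENTARY_CHARGE_C * voltage_v)


def thermal_energy_mev(temperature_k: float = 300.0) -> float:
    """Thermal energy k_B*T expressed in milli-electron-volts."""
    return BOLTZMANN_J_PER_K * temperature_k / ELEMENTARY_CHARGE_C * 1e3


def stored_capacitor_energy_j(capacitance_f: float, voltage_v: float) -> float:
    """Energy (1/2) C V^2 stored on a capacitor charged to voltage_v."""
    return 0.5 * capacitance_f * voltage_v ** 2


def energy_per_bit_j(total_energy_j: float, bits: int) -> float:
    """Average energy per bit for an operation on a word of `bits` bits."""
    return total_energy_j / bits


if __name__ == "__main__":
    print(f"Electrons in 10 aJ at 1 V: {electrons_for_energy(10e-18, 1.0):.1f}")
    print(f"k_B*T at 300 K: {thermal_energy_mev():.2f} meV")
    print(f"DRAM cell, 20 fF at 1 V: {stored_capacitor_energy_j(20e-15, 1.0) * 1e15:.1f} fJ")
    print(f"64-bit FMA at 6.5 pJ: {energy_per_bit_j(6.5e-12, 64) * 1e15:.1f} fJ/bit")
```

The results land close to the values in the text: roughly 60 electrons for 10 aJ at 1 V, a thermal energy near 25 meV, about 10 fJ for the DRAM cell, and about 100 fJ per bit for the floating-point operation.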

wiring capacitance⁶.

3) Energies for interconnects inside chips and off chips
As we move up Table I, we see that the energy to communicate bits across a chip (e.g., ~ 600 fJ/bit) can be larger than some quite substantial and complicated logical operation, like a floating point multiplication (e.g., ~ 100 fJ/bit), on those same bits. Similarly, the energy stored in a DRAM⁷ cell itself is quite small, at ~ 10 fJ. But, especially if the DRAM cell is on a different chip, the energy to read that cell becomes totally dominated by interconnection energy, and can be almost three orders of magnitude higher (e.g., 5 pJ).

In communicating off chip, interconnecting on short lines to adjacent chips may just involve charging line lengths of the order of centimeters to the logic voltage, but even that simple operation can lead to picojoule energies per bit communicated (see Section II.D below).

Longer connections off chip may use lower voltage signaling or more sophisticated links, but the energy of these may not be lower than the on-chip or local simple interconnections, leading to multiple picojoules of dissipation per bit, in part because of the more sophisticated receiver and transmitter circuits required (see Section II.D below). A significant amount of energy per bit can also be used in the circuits that multiplex to higher bit rates per line or channel for what we can call data links; as mentioned above, such circuits perform functions like line coding, CDR, and SERDES, in addition to receiver amplification and line or output drivers. We will return to discuss such dissipations below (see Sections V and VIII).

4) Long-distance telecommunications
As we move to long distance, it might seem obvious that the majority of the energy for telecommunications networks for the internet should be in the long-distance links themselves. Long distance optical links consume relatively low energy per bit, however, primarily because of the very low loss of optical fibers [9]. Because of switching of information, such as internet packets, in the many routers along the way, the larger part of that energy in the core of internet transmission is actually dissipated in the routers [9]. And that energy is actually the energy dissipated inside electronic machines, which, as we will see, is predominantly interconnect energy at short distances.

5) Access networks and connections
The largest amounts of energy per bit in internet and telecommunications networks can be associated with the last connections to the user (sometimes called “access” connections). Wireless connections, as in WiFi and mobile cellular connections, consume particularly large energies per bit [31]. For fixed connections over fiber or cable, the access network and any modem connecting the customer to the network tend to have a relatively fixed power, so the energy per bit depends on the bandwidth to the customer [8]. As that bandwidth rises, the energy cost for access reduces, possibly below the next largest energy cost, which is the energy dissipated inside the routers.

6) General conclusions on energies in information processing and communication
We conclude, first, that the energy in information processing and communications is predominantly in sending the information from one point to another, not in the logical processing itself, and second, with the possible exception of wireless links, most of that energy is in local electrical interconnects inside information processing and switching systems. Hence, we should move to optics and optoelectronics for such local interconnects if we want to reduce energy per bit overall. At the present time, we appear to have no viable new approach other than optics for solving interconnect energy and density problems inside machines.

D. Physics of electrical interconnect energies

The energy for communicating in electrical wires essentially is bounded by the energy required to charge up the appropriate line capacitance to the driving signal voltage. For short interconnections, that capacitance will be the total capacitance of the line, and the drive voltage will be essentially the same as the logic voltage; that is mostly the situation for interconnect lines on chips. For longer connections off chips, only the line length corresponding to one bit (that is, ~ one clock cycle) needs to be charged for each bit; but such lengths can be substantial (e.g., up to 30 cm at 1 GHz or 1 ns, and up to 3 cm at 10 GHz or 100 ps).

1) Capacitance of electrical lines
To understand the dissipation in electrical signaling, we need to understand the capacitance of lines. One key point is that the capacitance of electrical lines per unit length only depends on the relative geometry of the line, not the size scale. And that dependence on geometry is predominantly logarithmic for lines where the size of the conductors is comparable to their separation [2], [21], [22], [23]. For example, the capacitance per unit length of a coaxial line only depends on the logarithm of the ratio of the outer and inner conductor radii, not on the actual cross-sectional size or overall diameter of the line.

Fig. 1. A typical interconnection line will have cross-sectional dimensions that are similar in both directions, so ~ w in the figure, and a separation h between the two conductors that is also similar. This balances the need to keep the overall cross-section of the line relatively small so we can have high densities of interconnections, while avoiding large capacitances from conductors that are very close. A line above a ground plane is shown here, but because of the approximately logarithmic dependence of capacitance on geometry, the results are similar for all such lines. A typical value of capacitance per unit length is ~ 2 pF/cm ≡ 200 aF/μm.

When we are trying to get reasonably large densities of

⁶ Hence with future electronic technologies we would even have an argument to drive us, paradoxically, to lower rather than higher clock frequencies.
⁷ DRAM - Dynamic Random-Access Memory
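The scaling arguments of Section II.D can be illustrated numerically. The sketch below uses the standard textbook expression for the capacitance per unit length of a coaxial line, $C' = 2\pi\epsilon_0\epsilon_r / \ln(b/a)$, to show that shrinking the whole cross-section leaves $C'$ unchanged; it also estimates the line length occupied by one bit and the energy stored in charging a line. The specific radii, the use of the vacuum speed of light as an upper bound on signal velocity, and the function names are assumptions made for this sketch.

```python
import math

# Illustrative sketch of the line-capacitance arguments in Section II.D.
# Assumptions: ideal coaxial geometry, signal velocity bounded by the vacuum
# speed of light, and stored energy (1/2) C V^2 for a line charged to V.

VACUUM_PERMITTIVITY_F_PER_M = 8.8541878128e-12
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def coax_capacitance_per_m(inner_radius_m: float, outer_radius_m: float,
                           relative_permittivity: float = 1.0) -> float:
    """Capacitance per unit length of a coaxial line: 2*pi*eps0*eps_r / ln(b/a)."""
    if not 0.0 < inner_radius_m < outer_radius_m:
        raise ValueError("require 0 < inner radius < outer radius")
    return (2.0 * math.pi * VACUUM_PERMITTIVITY_F_PER_M * relative_permittivity
            / math.log(outer_radius_m / inner_radius_m))


def max_bit_length_m(bit_rate_hz: float) -> float:
    """Upper bound on the line length occupied by one bit (signal at speed c)."""
    return SPEED_OF_LIGHT_M_PER_S / bit_rate_hz


def line_charging_energy_j(cap_per_m: float, length_m: float,
                           voltage_v: float) -> float:
    """Energy (1/2) C V^2 stored when a line of given length is charged to V."""
    return 0.5 * cap_per_m * length_m * voltage_v ** 2


if __name__ == "__main__":
    # Same shape at two very different size scales (radii are hypothetical).
    big = coax_capacitance_per_m(0.5e-3, 1.75e-3)
    small = coax_capacitance_per_m(0.5e-8, 1.75e-8)
    print(f"C' large line: {big * 1e10:.3f} pF/cm, small line: {small * 1e10:.3f} pF/cm")

    print(f"Bit length at 1 GHz: {max_bit_length_m(1e9) * 100:.1f} cm")
    print(f"Bit length at 10 GHz: {max_bit_length_m(10e9) * 100:.1f} cm")

    # ~2 pF/cm (= 2e-10 F/m) line, 1 cm long, charged to 0.8 V.
    energy = line_charging_energy_j(2e-10, 0.01, 0.8)
    print(f"Stored energy for 1 cm at 0.8 V: {energy * 1e12:.2f} pJ")
```

With the typical ~ 2 pF/cm from Fig. 1 and a logic voltage of 0.8 V, a 1 cm line stores a little under a picojoule, which is in line with the statement that even centimeter-scale lines lead to picojoule-scale energies per bit. The dissipation per full charge and discharge cycle is of the same order as the stored energy.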

interconnections, we do not want to waste cross-sectional area by separating the two conductors in a line by a large amount; anyway, doing so would only reduce the capacitance approximately logarithmically. As a result, lines typically have a separation between the conductors in the line that is comparable to the cross-sectional dimension of the smaller of the conductors. See Fig. 1.

Hence, the capacitances per unit length of all electrical transmission or interconnect lines are very similar, within factors of order unity. Typical 50 Ω coaxial cable with ~ 1 cm diameter has a capacitance of ~ 1.5 pF/cm. Interconnect lines on chip with only 80 nm center-to-center spacing (so ~ ×10⁵ smaller in linear size, and possibly ~ ×10¹⁰ smaller cross-sectional area, than the coaxial cable) also have a capacitance of ~ 2 pF/cm (≡ 200 aF/μm) [6].

2) Capacitive charging energies on lines

Because the voltages for logic operation on chips are not reducing substantially, as discussed above, and the capacitances per unit length of wires are relatively fixed and bounded by physical laws, the energies to communicate logic levels across the chip have not reduced significantly in recent years.

Charging a capacitance to a voltage V leads to an energy $(1/2)CV^2$ stored on the capacitor, with an equal energy dissipated in the series resistance through which the capacitor is charged (see, e.g., [41] for a discussion of capacitive charging energies). When the capacitor is discharged, this energy is dissipated into the discharging resistance, for a total dissipated energy⁸ of $CV^2$. On this basis, we can see that charging a line of ~ 1 cm length to some fraction of a volt to send information down it leads to dissipated energies approaching or on the scale of a picojoule, which is the source of the 600 fJ/bit energy for communicating across a chip⁹ in Table I.

Fig. 2 The total capacitance of the transistors in a small logic gate is comparable to that of a wire to another nearby gate.

A key point in comparing interconnect and logic energies, however, is to note that the capacitance of the transistors in a logic gate is comparable to the capacitance of a wire from one logic gate to another that is relatively close by [34], [42]. For example, [42] estimates that the signal only has to go a distance ~ 3 times the width of a transistor for the energy to charge the wire to become equal to the energy to switch the transistor (see Fig. 2). Because most signals go further than this and essentially all will go at least this far, it is simple to conclude that the majority of dynamic energy dissipation¹⁰ in electronics is for communication, not logic.

3) Energies for electrical off-chip communications

There is no simple answer for calculating the electrical energy per bit for connections off chip. Driving high-speed or high-data-rate signals on electrical lines over even 10’s of centimeters can also be difficult because of loss and signal distortion on electrical lines [25]. As a result, such electrical connections may change to links where the format of the signaling can be quite different from simple “on-off” signals at the logic voltage.

In such links, it is possible to have lower voltage signaling or to allow complex modulation formats that can increase the number of bits per symbol sent, which would tend to reduce energy per bit, but that decrease can be more than offset by the necessity to run the required complex electronic circuitry to support the signaling. Typically, such links with more complex modulation formats are designed to increase the data capacity of lines, not to reduce energy per bit communicated.

Additionally, such links often require clocking to establish the necessary timing for signals, and clock recovery circuitry can consume significant power (e.g., 50% of the total receiver power in a recent example [43]). Even on chips themselves, the power to run the clocking inside logic blocks can also be comparable to other power dissipations [44].

As a result of these various factors, energies per bit for off-chip electrical interconnects can typically range from picojoules per bit to significantly higher energies [5], [16]. This issue of off-chip connection energy and the difficulty in reducing it is well-known also in the context of supercomputers and their future scaling [15], for example.

E. Physics of optical interconnect energies

1) Quantum impedance conversion

The key reason why optics can save energy compared to electrical approaches in simple interconnects is that in optics we do not have to charge the line or other electromagnetic medium to the signal voltage; instead, we only have to charge or discharge the optoelectronic detector (or whatever is the equivalent load capacitance of the detector and the circuit to which it is connected).

⁸ Incidentally, it is common to quote an energy of $(1/4)CV^2$ per bit for communications involving a capacitance C. This can be correct for the following reason. If the bit changes from one state to another, we dissipate $(1/2)CV^2$, either in the charging resistance or in the discharging resistance. On the average, for any two bit sequence, in an effectively random string of bits, half the time the next bit has the same value as the current one, so we change from one state to another every 2 bits, on the average; hence we dissipate $(1/2)CV^2$ on the average every two bits. So, we dissipate $(1/4)CV^2$ per bit, on the average.
⁹ Long connections on chips are often broken up into shorter lengths of line with “repeater” amplifiers between these short lengths. This is to reduce delay. The capacitance and the resistance of the line are both proportional to length, so the overall charging time of the line grows as the square of the length; hence, breaking the line up into sections with intervening repeater amplifiers can reduce the overall delay. Even with repeaters, the effective signal propagation velocity on such lines can be, e.g., only ~ 1/5 or less of the velocity of light (see, e.g., [23]), leading to significant “latency” or delay problems in systems.
¹⁰ Dynamic energy is associated with the active processing of information, as opposed to static, background power dissipation.

Fig. 3(a) illustrates this point. The core physics is the photo-electric effect. The voltage that we can generate in a photodiode even in a simple photovoltaic mode is comparable to the photon

energy in electron volts, and we can generate something close to one electron of current for each absorbed photon. The detection of light is a quantum-mechanical process of absorbing photons, not a classical process of measuring the voltage in the light beam itself (see, e.g., [84]).

In a classical electromagnetic beam of power P propagating in free space, the power in the beam can be written as $P = V_{RMS}^2 / Z_o$; here $V_{RMS}$ is the root mean square (RMS) voltage from one side of the (linearly polarized) beam to the other and $Z_o \approx 377\,\Omega$ is the impedance of free space¹¹. Then

$V_{RMS} = \sqrt{P Z_o}$ (1)

For an example power of 1 nW in a beam, the classical voltage would therefore be $V_{RMS} \approx 600\,\mu\mathrm{V}$.

Fig. 3 (a) Illustration of quantum impedance conversion, in which a beam with a small classical voltage can generate a much larger voltage at the output of a photodiode. (b) A reverse-biased photodetector with a resistive load, and (c) with a capacitive load, including the diode’s own capacitance.

The photodetector, however, does not measure classical voltage; it counts photons, and can give ~ 1 electron of current for each absorbed photon. For photon energy¹² $h\nu \equiv 1\,\mathrm{eV}$ (where h is Planck’s constant and ν is the optical frequency), then absorbing this 1 nW of power could give a current¹³ of ~ 1 nA. The photodiode could also give a voltage¹⁴ of up to 1 V. 1 nA in a 1 GΩ load resistor¹⁵ would correspond to 1 V. The key point here is that the voltage in the load resistor can be much larger than any classical voltage in the light beam.

A simple photodiode therefore can transform power propagating in a low impedance medium into power in a high impedance load, and it can do so with some reasonable efficiency. This process can be called “quantum impedance conversion” [21], [22].

The circuit of Fig. 3(b) may be more practical, with the photodiode operated in reverse bias. In this case, the diode can likely generate ~ 1 electron per photon, giving a current $I_{PC} = Pe / h\nu$ (where e is the magnitude of the electronic charge) for an absorbed optical power P, and an output voltage $V_{OUT} = R_{LOAD} P e / h\nu$.

We can also think of this process in terms of optical energies, rather than powers; indeed, we may well be operating with a circuit more like that of Fig. 3(c), which has no load resistor¹⁶. Here, absorbing some amount of optical energy from a pulse can lead to an output voltage because the photogenerated charge can charge (or discharge) the capacitance $C_{Diode}$ of the diode and any load capacitance $C_{Load}$, such as an input to a transistor or logic gate, to change the output voltage $V_{OUT}$.

A given optical energy $E_A$ made up of photons of energy $h\nu$ corresponds to a number of photons $N_{ph} = E_A / h\nu$. If we absorb all that energy, generating ~ 1 electron per photon, then a total charge $Q_{PC} = E_A e / h\nu$ will flow in the circuit. This will lead to a change in output voltage $\Delta V_{OUT} = E_A e / (h\nu\, C_{TOT})$ where $C_{TOT}$ is the total capacitance¹⁷ $C_{TOT} = C_{Load} + C_{Diode}$. For $C_{TOT} = 1\,\mathrm{fF}$, $h\nu \equiv 1\,\mathrm{eV}$ (so $h\nu / e = 1\,\mathrm{V}$), and an optical pulse of energy $E_A = 1\,\mathrm{fJ}$, then $\Delta V_{OUT} = 1\,\mathrm{V}$.

We will discuss device capacitances below in Section V; we might hypothesize a photodetector of $C_{Diode} \approx 100\,\mathrm{aF}$, corresponding to a 1 μm cube, connected to a transistor with input capacitance of 100 aF through some wire of capacitance 100 aF, for a $C_{TOT} = 300\,\mathrm{aF}$. Then ~ 200 aJ of optical energy in an optical pulse at 1.55 μm wavelength (0.8 eV photon energy) efficiently absorbed in such a detector would lead to $\Delta V_{OUT} \approx 0.8\,\mathrm{V}$, which is more than the input voltage swing required to switch a logic gate¹⁸.

Hence with low-capacitance photodetectors connected to a low-capacitance load, such as the input to a CMOS inverter circuit, even optical energies less than 1 fJ could give enough voltage change to switch a logic gate, without any electronic amplification (a so-called “receiverless” mode of operation [45], [46]). We discuss this benefit in detail in Section V.

¹¹ We could use other somewhat different impedances if the electromagnetic beam was propagating in a dielectric, such as glass, or in a transmission line, but the essence of this argument is not changed by that.
¹² Because the wavelength of light (in free space) λ = c/ν and the photon energy in electron volts, $h\nu_{eV}$, is the energy hν in joules divided by the magnitude of the electronic charge e, then $h\nu_{eV} = hc/e\lambda \cong 1.24/\lambda_{microns}$, where $\lambda_{microns}$ is the wavelength in microns (micrometers). This relation, and the complementary one $\lambda_{microns} \cong 1.24/h\nu_{eV}$, are very convenient. So for $h\nu_{eV}$ = 1 eV, λ ≅ 1.24 μm, and for λ = 1.55 μm, $h\nu_{eV}$ ≅ 0.8 eV.
¹³ This would be the so-called “short-circuit” current of the photodiode.
¹⁴ This would be an “open-circuit” voltage under “flat-band” conditions.
¹⁵ This example is somewhat simplified because we will not simultaneously obtain “short circuit” current and “open-circuit” conditions, and there are some other practical limits with diodes,
¹⁶ Of course, such a simple circuit with no load resistor would have difficulty resetting itself; once triggered with an optical pulse, the resulting voltage change would remain unless some other leakage current discharged it. Later, in Section VIII B, we will discuss “dual-rail” operation with stacked pairs of diodes, which avoids this difficulty for circuits with no load resistor.
¹⁷ The diode and load capacitances are effectively in parallel in a circuit like this. To change the voltage $V_{OUT}$ we have to charge or discharge both capacitances.
¹⁸ Indeed 0.8 V corresponds to a typical supply voltage for CMOS logic circuits [6].

2) Additional physical benefits of optics

Using optical interconnections brings many additional benefits. See also [2], [22], [23]. The interconnect bandwidth densities, especially for connections off chip, and the precision of timing possible with optics will turn out to be particularly interesting. We will come back in Section IX to discuss an example system that could simultaneously take maximum

advantage of all these benefits of optics to minimize energy dissipation overall.

a) Density of interconnects

A major benefit of optics is that it allows very high densities of information to flow, in the sense of Gb/s per square millimeter of cable cross-section or Gb/s per linear millimeter of the edge of some card or board; this is one of the major reasons that optical interconnects are in current use for longer distances inside large machines. Optical fibers can carry high data rates over very thin (e.g., 125 μm diameter) “wires”. Smaller waveguides (e.g., ~ 0.2 – 3 μm cross-sectional dimensions) are also possible on substrates, as in silicon photonics [47], [48], [49], [50], [51], [52], [53], [54], [55], [56], [57], [58], [59], [60] and integrated III-V photonics [61]. There are the additional opportunities of much larger amounts of information transmission using wavelength division multiplexing (WDM) (use of multiple different wavelengths as independent channels) or space-division multiplexing (SDM); SDM could use multiple spatial modes in a fiber (mode-division multiplexing) or free-space, two-dimensional interconnects off the surface of the chip (see Section VII below).

Electrical interconnects run into a basic limitation [2], [25] on bit rate B that is proportional to the total cross sectional area A of the wiring and inversely proportional to the square $l^2$ of the length l of the wiring, i.e., $B \lesssim B_o A / l^2$ where the prefactor $B_o \sim 10^{15} - 10^{16}$ b/s. This limit, which results from the resistance and capacitance of electrical wires, applies to simple “on-off” signaling. It severely restricts the amount of information we can send through wiring systems¹⁹.

This “aspect ratio limit” [25] is routinely encountered on chips, on boards, and transmission lines. It can be avoided to some degree by using sophisticated signaling techniques, such as equalization and/or multilevel signaling and modem technologies, so as to approach the Shannon limit for such electrical connections; that, however, requires more complex transmitter and receiver circuits, which in turn lead to increasing energy per bit. Since optical connections do not have the resistance of electrical wires, they completely avoid this particular limit, and can exceed it in practice by many orders of magnitude²⁰.

b) Signal integrity

Another key benefit of optical connections is that they can avoid some of the problems of the propagation of high-frequency electrical signals. Over the scale of a machine, such as meters or 10’s of meters, there can be negligible distortion of optical signals due to dispersion, even for picosecond pulses (see Section VIII B below for example calculations). Electrical cables, by contrast, show very large pulse broadening even for much longer pulse widths [25]. Any crosstalk or loss in optical signals is also essentially independent of the signal bandwidth²¹, so in general the optics itself in optical links can be designed to support very large signal bandwidths over the size scales of physical information processing systems.

Since optical signals operate by transmitting and detecting photons rather than measuring classical voltages, all optical connections intrinsically offer voltage isolation, just like inserting “optical isolators” in every link. This means ground voltage variations over systems do not matter in optically interconnected systems.

c) Timing precision

As discussed above, optics can deliver even short pulses without significant distortion over quite large distances; that could allow electrical systems to be clocked optically with very little “jitter”²², for example, into the sub-picosecond range [62]. So optics can be used for low-jitter clock distribution.

One additional aspect of optics that has not been substantially exploited is that such timing or clocking pulses can be delivered with a very well defined absolute delay [22]. Electrical wires have an effective delay that depends on the variation of the resistance in the wire with temperature because the slope of the rising or falling edge of electrical pulses depends on that resistance. As a practical matter, we typically do not rely on long electrical wires having any particular predictable delay, and we recover the clock phase (i.e., timing) using clock recovery circuitry and associated buffering.

The delay on optical fibers is, however, quite precisely predictable and substantially independent of temperature over the 1 – 10 m distances involved inside a system (see Section VIII B); it could substantially reduce power dissipation in links because it could eliminate clock recovery circuitry entirely. When using modulators as the output devices, we can also automatically retime the output signals by having the optical input to the modulators be such well-timed pulses [63], [64], as we also discuss in Section VIII B below.

¹⁹ For example, a coaxial cable, 1 cm² in cross-sectional area and 10 m long, would be able to carry ~ 1 Gb/s in simple on/off signaling (the ~ 10¹⁵ value of $B_o$ is appropriate for such an “LC” transmission line) [25]. A line on a chip with ~ 1 μm² cross-sectional area with a simple on/off signaling at 2 Gb/s, could have a length up to ~ 7 mm (the ~ 10¹⁶ value of $B_o$ is appropriate for such an “RC” transmission line) [25].
²⁰ A hypothetical electrical line 125 μm in diameter and 60 km long would be able to carry about 0.03 b/s with simple signaling (the capacitance of the wire would take ~ 30 s to charge up through its own series resistance). An optical fiber of the same dimensions can carry bandwidths exceeding 10 Gb/s with simple on/off keying on one frequency channel [10], and may have many Tb/s of capacity with sophisticated signaling and multiplexing [11], [12].
²¹ For modulation bandwidths (e.g., GHz to 100’s of GHz) that are small compared to the carrier frequencies of optics (e.g., 200 THz), that modulation makes essentially no difference to the loss in propagating optical signals, nor to the cross-talk between adjacent waveguides or beams. If the system is running with wavelength-division multiplexing, of course the modulation can induce cross-talk between channels of different center wavelengths.
²² Jitter is the pulse-to-pulse variation in the timing of a pulse in a pulse train, usually viewed as being from random or unpredictable causes.

3) Conclusions for physical benefits of optics for interconnects

In summary, optics offers various physical benefits compared to electrical lines:
• it can reduce interconnect energy by eliminating the charging of electrical lines;
• it can send information over large distances at high rates without additional loss or distortion;

• it can allow very high densities of high-bandwidth connections;
• it can offer very precise timing and retiming of signals.
We will discuss these various points below in more depth in Sections VII and VIII.

III. SCALING OPTOELECTRONICS INTO THE ATTOJOULE RANGE

The core energy benefit of optics in reducing the energy per bit for interconnects in simple connections requires that the energy to operate the optoelectronic device is itself lower than the energy required to charge an equivalent length of electrical line. Hence, the operating energy of optoelectronic devices is a very important consideration. Here we look at the prospects and approaches that could allow us to scale to very low operating energies in optoelectronics, ideally even into the sub-femtojoule or “attojoule” range.

The energy involved in operating optoelectronic devices themselves can be separated approximately into two parts:
(A) the electrostatic (capacitive) energies required to swing the necessary voltages across the device, as either a photodetector or an output device like a laser or modulator
(B) the other energies involved in running some devices, such as the energy to inject carriers into a light emitter or change carrier density in some modulators.
There will be yet other energies in operating a system, especially from optical losses; we will return to such energies later, however, concentrating here only on these specific energies involved in running the devices themselves.

A. Electrostatic energies

If our goal is to make devices that operate with energies less than a femtojoule, then we must make sure that the capacitive charging and discharging energy for the devices themselves is less than this amount. To get a sense of scale, first we can look at the capacitance of a simple cube of semiconductor between two opposite surfaces, sketched in Fig. 4.

Fig. 4. A cube of semiconductor material, with dielectric constant $\varepsilon_r$, and sides of length d, area A of each cube surface, with capacitance between two opposing faces of $C \sim \varepsilon_r \varepsilon_o A / d$.

For the sake of definiteness, we will take the dielectric constant of semiconductor material to be $\varepsilon_r \approx 12$, which is a typical value. Because of this large dielectric constant, to a rough approximation, we will neglect the fringing fields, and treat this as a simple “plane-parallel capacitor” between opposing surfaces of this cube. With capacitor plate area A and separation d, the capacitance would be

$C = \varepsilon_r \varepsilon_o A / d$ (2)

where $\varepsilon_o \approx 8.85 \times 10^{-12}$ F/m is the electric constant (permittivity of free space). Hence for a cube of side d (and hence of plate area $A = d^2$), the capacitance is

$C = \varepsilon_r \varepsilon_o d$ (3)

So for a 1 μm cube, the capacitance is ~ 100 aF. If we presume that running some device requires charging or discharging the device capacitance by 1 V, for example, then we can see that with such a micron size, the resulting total energy to charge the device would be $\sim CV^2 \approx 100\,\mathrm{aJ}$. For a 100 nm cube, the energy would be ~ 10 aJ. For some waveguide device that was, say, 300 nm wide, 200 nm thick, and 3 μm long, then the capacitance between the top and bottom faces (i.e., over the 200 nm thickness) would be ~ 450 aF; for 1 V drive, the associated energy would be ~ 450 aJ.

Since integrated semiconductor devices are not likely to be more than ~ 1 μm thick²³, this simple approximation tells us that, for 1 V operation, or indeed operation at typical logic swing or supply voltages (e.g., 0.8 V [6]), the optoelectronic devices have to be micron or sub-micron in size if we are to run at single femtojoule or sub-femtojoule operating energies²⁴.

Above in considering wires, we presumed ~ 200 aF/μm wire capacitance (see, e.g., [6]). So, if the capacitance of the wire that connects the device to its associated driver or receiver electronics is not to dominate the capacitance overall for such sub-femtojoule optoelectronics, then such connecting wiring also needs to be on a scale of no more than a few microns. Connection to photodetectors should use particularly short wires; any increase in overall input capacitance can cause the entire operating energy per bit to scale nearly in proportion, a point we discuss in greater detail in Sections V C and IX A.

The transistors to which the photodetector devices connect will have input capacitances in the range of ~ 20 aF to ~ 100 aF if they are near to minimum-size transistors (see [4], [6], [34], [35], and the discussion in footnote (i) of Table I), so their capacitance may not dominate overall, but should be included in the overall capacitance.

The simple overall conclusion here on electrostatic energies is that, if we are running optoelectronics at voltage swings comparable with the logic voltages, then the devices have to be micron or sub-micron in size and they have to be integrated right beside the associated electronics (e.g., within a few microns or less); otherwise electrostatic operating energies will raise the total energy out of the sub-femtojoule range. Hence the integration technology has to be a core part of any serious proposal for attojoule optoelectronic devices.

²³ Fabrication in substantially planar structures using lithography typically uses thickness of this scale or smaller, and layered growth techniques in general also use such thicknesses, for example.
²⁴ Note also that these capacitances and energies are slight under-estimates since they are neglecting fringing fields; the fact that these are under-estimates reinforces the need for small sizes.
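
As a rough numerical check of the electrostatic estimates in Section III A, the parallel-plate expression of Eq. (2) can be evaluated for the geometries quoted in the text. The following is only a minimal sketch under the text's own assumptions (relative dielectric constant 12, fringing fields neglected, 1 V swing); the function names are hypothetical and not from the original paper. The last helper rearranges the Section II E relation $\Delta V_{OUT} = E_A e / (h\nu\, C_{TOT})$ to estimate the optical energy needed for a given voltage swing, assuming one electron per absorbed photon.

```python
EPS_0 = 8.85e-12  # F/m, permittivity of free space (value quoted in the text)
EPS_R = 12.0      # relative dielectric constant assumed in the text


def plate_capacitance(area_m2, separation_m, eps_r=EPS_R):
    """Parallel-plate capacitance C = eps_r * eps_0 * A / d, in farads (Eq. (2))."""
    return eps_r * EPS_0 * area_m2 / separation_m


def cube_capacitance(side_m, eps_r=EPS_R):
    """Capacitance between opposite faces of a cube of side d (Eq. (3))."""
    return plate_capacitance(side_m ** 2, side_m, eps_r)


def charging_energy(capacitance_f, swing_v=1.0):
    """Total dissipated energy ~ C * V^2 for one charge/discharge cycle, in joules."""
    return capacitance_f * swing_v ** 2


def optical_energy_for_swing(c_total_f, swing_v, photon_energy_ev):
    """Optical energy E_A = C_TOT * dV * (h*nu / e), in joules.

    photon_energy_ev is numerically equal to h*nu/e in volts.
    Assumes ~1 electron per absorbed photon and complete absorption.
    """
    return c_total_f * swing_v * photon_energy_ev


if __name__ == "__main__":
    geometries = {
        "1 um cube": cube_capacitance(1e-6),
        "100 nm cube": cube_capacitance(100e-9),
        "300 nm x 3 um guide, 200 nm thick": plate_capacitance(300e-9 * 3e-6, 200e-9),
    }
    for name, cap in geometries.items():
        print(f"{name}: C = {cap * 1e18:.0f} aF, "
              f"C*V^2 at 1 V = {charging_energy(cap) * 1e18:.0f} aJ")

    # 300 aF total input capacitance, 0.8 V swing, 0.8 eV photons (1.55 um)
    e_opt = optical_energy_for_swing(300e-18, 0.8, 0.8)
    print(f"optical energy for 0.8 V on 300 aF: {e_opt * 1e18:.0f} aJ")
```

With these inputs the cube values come out close to the ~ 100 aF and ~ 10 aJ figures in the text, the waveguide capacitance lands somewhat above the rounded ~ 450 aF, and the detector example gives slightly under the ~ 200 aJ quoted in Section II E.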

B. Operating energies

To give some sense of energies for optical output devices (i.e., modulators and light emitters) and some of the requirements to achieve them, we can perform a simple scaling of two such devices that are each based on strong microscopic mechanisms, namely III-V semiconductor lasers and quantum-confined Stark effect (QCSE) electro-absorption modulators [65], [66], [67], [68].

TABLE II
EXAMPLE LASER AND MODULATOR ENERGY SCALING

Active device volume            Operating energy    Optical concentration factor
(1 μm)³ (a)
  laser                         ~ 160 fJ            ~ 5 (b)
  modulator                     ~ 5 fJ              ~ 1 (c)
(300 nm)³ = 0.027 μm³
  laser                         ~ 4300 aJ           ~ 200
  modulator                     ~ 135 aJ            ~ 40
(100 nm)³ = 10⁻³ μm³
  laser                         ~ 160 aJ            ~ 5×10³
  modulator                     ~ 5 aJ              ~ 10³
(10 nm)³ = 10⁻⁶ μm³ (e.g., a quantum dot)
  laser                         ~ 160 zJ (d)        ~ 5×10⁶
  modulator                     ~ 5 zJ (e)          ~ 10⁶

(a) E.g., an active (gain) region 50 nm thick, 200 nm wide, and 100 μm long, as in some hypothetical quantum well edge-emitting laser, or 300 nm thick, 350 nm wide and 10 μm long as in some hypothetical short modulator.
(b) If the gain material entirely filled the mode cross-section, a gain ~ 100 cm⁻¹ would allow a laser of ~ 100 μm length to work with only a very weak resonator. For the 50×200 nm hypothetical gain cross-section of note (a) above, in a hypothetical mode cross-section of 300×350 nm², so about ×10 larger than the gain cross-section, because of this mode overlap of only 1/10 with the gain material, we would only obtain a gain of about 10% in one pass, so we would need cavity mirrors of ~ 90% (power) reflectivity R to reach threshold, which would correspond to a concentration factor γ ~ 5 (see below).
(c) No resonator is required for a 10 μm long QCSE modulator, as in [78], because the absorption in a single pass is large enough.
(d) This energy is equal to 1 eV and corresponds to one electron-hole pair in the quantum dot. 1 zJ ≡ 10⁻²¹ J.
(e) This energy corresponds approximately to a charge of one electron on one face of the dot and a charge of one hole (or one less electron) on the opposite face, with a corresponding voltage between the faces of ~ 100 mV.

Such laser and modulator devices are in wide practical use today in telecommunications and other applications, and they represent realistic examples of efficient device approaches with well-understood physics and technology. Both already exploit quantum-confinement benefits through the use of quantum well structures. QCSE modulators typically use III-V quantum wells, but they can also use germanium quantum wells on silicon substrates [41], [67], [68], [69], [70], [71], [72], [73], [74], [75], [76], [77], [78], [79], [80], [81], [82], [83], with performance comparable to or better than their III-V counterparts [82].

We estimate energies for different device active volumes in Table II. For laser energies, we presume that the device volume has to have an injected carrier (pair) density of 10¹⁸ cm⁻³ so that it has enough gain to lase and that the resulting gain is ~ 100 cm⁻¹. These are typical orders of magnitude for operating semiconductor lasers²⁵. In calculating operating energies, we presume we require 1 eV of energy to inject or create each carrier pair. With such an assumed required carrier density to get the laser gain medium to be sufficiently above threshold, then the energy required to operate the laser is proportional to the volume of the device. Of course, to make a small device work, we may also need to concentrate the optical field, as in a resonator, and we discuss this point in Section III C below.

For modulator energies, we presume that the modulator requires an electric field E of ~ 10⁵ V/cm to operate; this is a typical value of operating field for strong QCSE absorption edge shifts in a modulator. For a given operating field, there is therefore a corresponding electrostatic energy density, and this contribution to the operating energy is therefore proportional to the volume. For a semiconductor relative dielectric constant $\varepsilon_r$ ~ 12, we obtain the modulator energies shown²⁶ in Table II, presuming an energy equivalent to $(1/2)CV^2 \equiv \int_{\mathrm{volume}} (1/2)\,\varepsilon_r \varepsilon_o E^2 \, dv$ where the integral is over the device volume.

C. Optical concentration factor

To make the optoelectronic devices work, especially for the smaller active volumes of material, we may need to increase the energy density in the electromagnetic field by some optical “concentration factor” so that there is enough interaction with the active material – e.g., for a laser operating above threshold, a modulator with enough contrast ratio, a light-emitting diode (LED) with strong enough spontaneous emission into a given mode, or a detector with enough absorption. That concentration might involve a resonator, a sub-wavelength waveguide (e.g., using metals), a structure with reduced group velocity, or some other approach (see Fig. 5).

There are many ways to define such concentration; several terms like cavity quality factor Q, cavity finesse F, and Purcell enhancement factor $F_P$, are well known from analysis of resonators. Partly because we want to include more than just resonator approaches, instead we use a simple and general “optical concentration factor” γ. Appendix B gives the relation between these various terms²⁷.

²⁵ Note that such a carrier density also corresponds to one carrier (pair) in a quantum dot of (10 nm)³ volume, which also makes physical sense for the approximate threshold for population inversion in a quantum dot.
²⁶ Electroabsorption modulators like QCSE devices can also have dissipation from the photocurrent that can be generated from absorbed photons. We have not included that here, though it has been analyzed elsewhere [41]. If the devices are designed to operate with low drive voltages ~ 1 V or less [41], then this dissipation is essentially just part of the optical loss in using optical modulators, and is best counted there rather than here. High bias voltages would, however, lead to magnification of that dissipation.
²⁷ Briefly, a cavity of finesse F increases the concentration factor by F/π (Eq. (19)), and in a resonator structure, $F_P$ and γ are essentially the same concept,
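
The operating-energy column of Table II follows from two volume-proportional estimates described in Section III B: 1 eV per injected carrier pair at a density of 10¹⁸ cm⁻³ for the laser, and the electrostatic energy density $(1/2)\varepsilon_r \varepsilon_o E^2$ at a field of 10⁵ V/cm for the modulator. The sketch below reproduces those numbers to the table's rounding, assuming a uniform field and uniform carrier density over the active volume; the function names and the rounded value of the elementary charge are choices made here for illustration, not taken from the paper.

```python
E_CHARGE = 1.602e-19  # C, elementary charge (1 eV = 1.602e-19 J)
EPS_0 = 8.85e-12      # F/m, permittivity of free space


def laser_energy(volume_m3, carrier_density_cm3=1e18, energy_per_pair_ev=1.0):
    """Energy to inject the assumed carrier-pair density into the active volume, in joules."""
    pairs = carrier_density_cm3 * 1e6 * volume_m3  # convert cm^-3 to m^-3
    return pairs * energy_per_pair_ev * E_CHARGE


def modulator_energy(volume_m3, field_v_per_cm=1e5, eps_r=12.0):
    """Electrostatic energy (1/2) eps_r eps_0 E^2 times volume, for a uniform field, in joules."""
    field_v_per_m = field_v_per_cm * 100.0
    return 0.5 * eps_r * EPS_0 * field_v_per_m ** 2 * volume_m3


if __name__ == "__main__":
    # Cube sides used for the four rows of Table II
    for side_m in (1e-6, 300e-9, 100e-9, 10e-9):
        volume = side_m ** 3
        print(f"side {side_m * 1e9:.0f} nm: "
              f"laser {laser_energy(volume):.3g} J, "
              f"modulator {modulator_energy(volume):.3g} J")
```

Because both estimates scale linearly with volume, each factor of 10 reduction in linear size lowers the energy by a factor of 1000, which is the pattern seen going from the (1 μm)³ row to the (100 nm)³ and (10 nm)³ rows.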

We define our optical concentration factor γ as follows. We cross-sections. Generally, reducing the group velocity will
presume we have some material of refractive index n. The reduce the operating energy as long as at least some, and ideally
wavelength inside the material is λn = λ / n where λ is the free- all, of the corresponding increase in energy density is in the
space wavelength. First, we consider a “reference structure” active medium itself.
that is a square dielectric waveguide of cross-sectional Nanometallic or plasmonic concentration in subwavelength
dimensions λ n in each direction, as sketched in Fig. 5 (a). We waveguides could reduce the operating energy for devices like
emitters or modulators, both by reducing the cross-sectional
presume we are propagating unit optical power through this
area in which the propagating light is confined (see, e.g., [85],
guide, and for simplicity we presume the power is all confined
[86]), and possibly also by leading to slower group velocity
within this square cross-section. Essentially, this is like a
(see, e.g., [87], [88]). If the use of metals leads to greater loss,
dielectric waveguide near to the minimum practical size. There
though, we may be losing overall in device performance, so
are, however, no mirrors or resonator structures in this reference
such metallic structures need a careful analysis to be sure of
structure. As a result of our unit power propagating through this
their benefits. Dielectric waveguide structures can also reduce
structure, there is some average electromagnetic energy density
group velocity in devices (see, e.g., discussion in [49]).
$U_1$ inside the material²⁸.
Other structures might have some other average energy density $U_S$ when we are propagating unit power through them. Then, we define our optical concentration factor as

$\gamma = U_S / U_1$ (4)

With this definition, our reference structure has $\gamma = 1$.
Hypothetically, our reference device of any given kind (i.e., a
detector, modulator or emitter) is one that runs using such a
reference structure29, with a length L sufficient to give enough
absorption or absorption change, refractive index change,
stimulated emission gain, or other emission for a functioning
device.
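
To make definition (4) concrete for a simple guided structure, one can use the fact that power P flowing through a cross-section A at group velocity $v_g$ corresponds to an average energy density $U = P / (A v_g)$, taking the power to be uniformly confined to that cross-section as the text presumes for the reference guide. The sketch below compares a hypothetical guide against the λ/n by λ/n reference structure; the function names, the 1.55 μm wavelength, and the index of 3.5 are illustrative assumptions, not values from this section.

```python
C_LIGHT = 2.998e8  # m/s, velocity of light in free space


def energy_density(power_w, area_m2, group_velocity_m_s):
    """Average energy density U = P / (A * v_g), in J/m^3, for uniformly confined power."""
    return power_w / (area_m2 * group_velocity_m_s)


def concentration_factor(area_m2, slowdown, wavelength_m, index):
    """gamma = U_S / U_1 for a guide of the given area whose group velocity is c / (slowdown * n).

    The reference structure has cross-section (wavelength / n)^2 and velocity c / n.
    """
    lambda_n = wavelength_m / index
    u_ref = energy_density(1.0, lambda_n ** 2, C_LIGHT / index)
    u_s = energy_density(1.0, area_m2, C_LIGHT / (slowdown * index))
    return u_s / u_ref


if __name__ == "__main__":
    wavelength, n = 1.55e-6, 3.5  # assumed illustrative values
    lambda_n = wavelength / n
    # Reference structure: gamma is 1 by definition
    print(concentration_factor(lambda_n ** 2, 1.0, wavelength, n))
    # Quarter of the reference cross-section with group velocity halved
    print(concentration_factor(lambda_n ** 2 / 4, 2.0, wavelength, n))
```

In this picture, shrinking the cross-section and slowing the group velocity both raise γ in proportion, which is why either can compensate for a smaller active volume.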
Devices such as photodetectors, lasers, LEDs, and modulators
using changes in absorption coefficient or refractive index are
all quantum-mechanically based on transition rates proportional
to the electromagnetic energy density, as in (single-photon)
emission or absorption processes (or the corresponding virtual
transition rates in the case of changes in linear refractive index
[84]). As a result, if we want to retain the same overall effect of
the material on the light, reducing the active material volume Fig. 5. Illustration of optical concentration factors γ and electromagnetic energy
densities US for various example structures using dielectric materials of
by some factor β requires we compensate by increasing the refractive index n. For free-space wavelength λ, the wavelength inside the
electromagnetic energy density with some optical concentration device is λn = λ/n. c is the velocity of light in free space. (a) Hypothetical
factor γ = β to keep the device functioning. So, if we want to “reference” device structure, with a dielectric guide of size λn in both directions
that confines the propagating light within it. By definition for this “reference”
use a smaller volume of active material for the device, we need structure, γ = 1 and the electromagnetic energy density US = U1. (We presume
to increase γ proportionately. for simplicity that the light has phase and group velocities of c/n in such a
Note any approach that increases the electromagnetic energy guide.) (b) A waveguide with the light confined in some smaller cross-section
AC, such as by metal walls. The light might also be propagating with some group
concentration while reducing the device active volume by the velocity vg slowed down by some factor 1/η compared to the phase velocity c/n,
same factor will reduce the operating energy for such devices. i.e., vg = c/ηn, giving γ = ηλn²/AC. (c) A high-finesse resonator structure with
That increased electromagnetic energy density can be from mirrors of intensity reflectivity R and a corresponding finesse F ~ π/(1-R).
resonators, from slower group velocity (which necessarily Fig. 5 illustrates the optical concentration factors
requires energy storage somewhere30), or reduced waveguide corresponding to various simple situations. Fig. 5 (a) shows the
with FP ≅ 0.477γ. Note that Q is the finesse F multiplied by the cavity length in
half-wavelengths.
28
In waveguide structures, there may be low actual energy density at the
walls and a higher density in the middle – e.g., perhaps twice as high as the
average energy density – and in resonator structures there may be standing wave
patterns in which the peak energy density is up to twice as high as the average.
Though we could incorporate such effects more precisely in our definition here,
for our order-of-magnitude arguments, we simply ignore such effects on the
scale of factors of two, and work with the overall average energy densities. We
also presume the phase velocity and group velocity in such a reference structure
are both just c/n, where c is the velocity of light in free space, so we are
neglecting minor possible effects on these from this waveguide structure with a
wavelength-scale cross-section.
29
Formally, a conventional laser cannot run with such a structure because
there is no resonator, but we can equivalently presume a hypothetical device
that is about one “gain” length long, i.e., a gain of a factor of e.
30
After a pulse enters a structure, its energy has to be stored inside the
structure somewhere until it exits the structure again. Unless we have some
“side” resonator or other energy store, the energy will be stored as
electromagnetic energy inside the material. If the light energy is propagating at
a group velocity vg = vp/η, where vp = c/n is the usual phase velocity, so it has
been slowed down by a factor 1/η, then the energy density must have increased
by a factor η so the total power propagating remains the same.

reference structure. Fig. 5 (b) shows a structure with some energy is concerned, it does not matter what specific length L
hypothetical waveguide of some much smaller cross-sectional of device design we have used to achieve this. In the device
area AC; such structures are possible with metals, for example design, we could completely fill the cross-section of the device
(though in practice such approaches can lead to substantial loss with the active material, or instead we could just fill some
if the guide is too long – see, e.g., [85], [86]). In such small central slice or layer, but keep the total volume of active
waveguide structures, or in structures with slow-light material the same by correspondingly increasing the length L.
propagation, the group velocity vg might also be reduced by Equivalently, the “fill factor” – the average fraction of the
some factor η, e.g., giving vg = c/ηn, which necessarily leads cross-section of the waveguide filled by the active material –
to an increase in energy density31 of a factor η. Fig. 5 (c) shows does not matter for the energy as long as we are still using the
a resonator with cavity mirrors of some at least moderately large same total volume of active material interacting with the same
(power) reflectivity R. That resonator leads to an increase of electromagnetic energy density. If we have a smaller “fill
optical energy density by a factor γ ≃ 1/(1 − R) ≃ F/π (see factor”, we might, however, choose to increase the optical
concentration factor rather than increase the device length.
Eqs. (19), (20), and (21) in Appendix B.) We should note, too, that for resonators, if we make them
For example, for an absorption modulator, for whatever is the longer for the same finesse (and hence the same concentration
absorption coefficient change Δα we can make to run the factor), then the Q factor will rise in proportion, which leads to
device, in our reference device the length L needs to be such tighter requirements on resonator tuning. So, if we are using
that ΔαL ~ 1 or larger to give a strong modulation. If we make resonators, short structures in which the material fills the mode
the device shorter than this by some factor β (i.e., a total length are preferable to longer ones in which the material only fills a
L/β) so as to reduce the active volume, then we could make small fraction of the mode. For the same reason of limiting the
some change to the optics, such as a resonator, to increase the required Q, it is preferable to have strong microscopic
effective optical energy density by a factor γ = β to retain optoelectronic effects that can give large absolute values of gain
approximately the same overall device performance with this or of changes in absorption or refractive index since those can
smaller active volume. Similarly, in a refractive modulator result in shorter devices and hence lower Q structures for the
using our “reference” structure, we would need to make an same optical concentration factor.
The required optical concentration factors for the smaller
optical path length change ~ λ/2 in the length L to make some
volumes in Table II are based on a simple scaling from the (1
useful device. If we reduce the device length to L/β, then we
μm)3 case, in proportion as the volume of active material goes
will need increased optical energy density γ = β for the device down. The order-of-magnitude energy numbers here for the
still to work. (We will discuss use of resonators with absorptive (1μm)3 active volume are comparable to those of actual
and refractive modulation effects in greater detail in Section IV demonstrated devices. The 160 fJ for the laser with (1 μm)3
D below.) active volume should be comparable to the energy to turn on an
In the case of a laser, the gain per pass has to be sufficient to efficient conventional edge-emitting laser. 56 fJ/bit has been
overcome the loss through the mirrors. If we reduce the length reported for surface emitting lasers [89] with a 3.5 μm diameter
to L/β, then we have to increase the optical energy density in the aperture32. Presuming an active region thickness ~ 0.1 μm or
cavity by γ = β for the laser still to work. In some resonator less, this number is also in reasonable agreement with the
structure, this is equivalent to reducing the leakage through the estimated 160 fJ in our approximate analysis33.
mirrors by a factor γ; for example, increasing the mirror Research demonstrations using photonic crystal resonators
reflectivity from 95% to 99% corresponds to increasing γ (and and/or quantum dot materials (e.g., [90], [91], [92]) can also be
finesse F) by a factor of 5. compared34 with the projections in Table II.
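The mirror-reflectivity example can be checked with a minimal sketch that uses the approximations quoted in the text for moderately large reflectivity, γ ≃ 1/(1 − R) and F ≃ π/(1 − R). The function names are illustrative, and `required_reflectivity` simply inverts the same approximation, so it is only meaningful where that approximation holds.

```python
import math


def resonator_concentration(reflectivity):
    """Approximate optical concentration factor, gamma ~ 1/(1 - R)."""
    return 1.0 / (1.0 - reflectivity)


def finesse(reflectivity):
    """Approximate finesse, F ~ pi/(1 - R)."""
    return math.pi / (1.0 - reflectivity)


def required_reflectivity(gamma):
    """Mirror reflectivity needed for a target gamma, from gamma ~ 1/(1 - R)."""
    return 1.0 - 1.0 / gamma


# Raising R from 95% to 99% raises gamma (and F) by about a factor of 5.
gain = resonator_concentration(0.99) / resonator_concentration(0.95)
print(round(gain, 3))

# Shortening a device by beta = 20 needs gamma = 20, i.e. R of about 95%.
print(round(required_reflectivity(20.0), 3))
```

The same ratio applies to the finesse, since both quantities scale as 1/(1 − R) in this approximation.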
For the modulator with 1 μm3 active volume, the 5 fJ is
Note, though, in these scaling arguments, that as long as we
comparable with the operating energy (including the bias
keep the same electromagnetic energy density interacting with
field35) for a compact QCSE modulator [41], [78].
the same total volume of active material, as far as operating
One caution for using small volumes of active material is that
31
Here we presume the resulting increased energy density is all in the active
material, though that may not always be the case in such guides; some energy
might be stored in the metal guiding layers, for example.
32
Such VCSEL technology is actively researched for optical interconnect
applications [129], with impressive system demonstrations with total energies
per bit in the range of a few picojoules [18].
33
Such surface-emitting structures may also have higher concentration
factors than we have suggested in Table II as being approximately the minimum
required since they operate with very high reflectivity mirrors.
34
For lower energy lasers, researchers have exploited photonic crystal cavity
structures that allow particularly small active volumes and high Q factors,
allowing strong optical concentration. For example, [90] shows 13 fJ/bit
operation in a laser with a 0.18 μm3 active volume, a number in rough agreement
with our scaling here for such a volume. [91] has shown a low-threshold
electrically-pumped nanocavity laser using layers of quantum dots in a photonic
crystal cavity. [92] shows a single quantum dot lasing in a cavity with a reported
concentration factor ~ 30,000. Such dots may be somewhat larger in volume
than our hypothetical 10 nm cube, by a factor of, e.g., 3 or so [93], so our simple
scaling would suggest at least ~ 10^6 required optical concentration, a factor of
30 higher than used by [92]. An important difference here, though, is that the
experiments in [92] are conducted at a temperature of ~ 10 K, not at room
temperature, and we could expect much greater gain per injected carrier pair as
a result, so less optical concentration may be required. There may also be some
additional benefit from the greater quantum confinement in the quantum dot as
compared to the quantum well gain media presumed in our scaling.
35
Actual energy per bit can be lower because it is not necessary to swing
over the entire bias voltage to run QCSE devices [41]. Sub-femtojoule per bit
can be deduced in that case for this modulator.

in practice we may need high Q cavities to exploit them. For broad range of available approaches to optical output devices
modulators in particular that is problematic because we need to generally. Here, we briefly summarize and compare various of
match the narrow resonance with some operating wavelength, the options and their properties and requirements. See also [94]
to a precision ~ 1/Q. That poses fabrication and operational for another discussion of potential low-energy optoelectronics.
problems (e.g., temperature drift, feedback stabilization), Fig. 6 shows various device configurations with conventional
especially for Q values of 1000 or more. Even for lasers, if they waveguides, ring or disk resonators, and “surface-normal”
are to be matched to specific operating wavelengths in some structures in which the light comes in and/or out perpendicular
WDM system, we would have similar problems. Modulator to the surface, either with or without a resonant cavity. There
devices with Q < 100 might be usable without such tuning can be many variations in such structures, and photonic crystal
problems, however. See Appendix B for a more detailed or nanocavity structures are also possible.
discussion.
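As a rough illustration of the tuning problem, the sketch below combines two statements from the text: Q is the finesse multiplied by the cavity length in half-wavelengths, and the resonance must be matched to the operating wavelength to a fractional precision of about 1/Q. The function names and the example cavity are hypothetical, and the numbers are order-of-magnitude estimates only.

```python
def q_factor(finesse, cavity_length_m, wavelength_in_material_m):
    """Q as finesse times the cavity length measured in half-wavelengths."""
    half_wavelengths = cavity_length_m / (wavelength_in_material_m / 2.0)
    return finesse * half_wavelengths


def wavelength_tolerance_nm(free_space_wavelength_nm, q):
    """Approximate matching precision, wavelength/Q, in nanometres."""
    return free_space_wavelength_nm / q


# Hypothetical cavity: finesse 100, 2 um long, 0.5 um wavelength in the material.
print(q_factor(100.0, 2.0e-6, 0.5e-6))

# Tolerance at a 1500 nm free-space wavelength for two Q values.
for q in (1000.0, 100.0):
    print(q, wavelength_tolerance_nm(1500.0, q))
```

With these assumed numbers, Q = 1000 corresponds to a tolerance of about 1.5 nm and Q = 100 to about 15 nm, which is one way to see why lower-Q modulators are easier to use without tuning.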
Note in these scaling arguments that the operating energies of
quantum well electroabsorption modulator devices are lower
than those of lasers; generally, lower operating energy densities
are required in these modulator cases. We could, for example,
propose a quantum-well electroabsorption modulator of total
volume ~ (300nm)3, which might correspond to some
waveguide resonator with a cross-section of 200×300 nm and a
length of 450 nm (about 1 wavelength in a typical
semiconductor in a device operating at a free-space wavelength
of 1.5 μm). Only a moderate optical concentration factor of ~
40 would be required to run such a device, which could mean a
relatively low-finesse resonator that therefore did not have to be
fabricated to extremely high precision. According to the scaling
in Table II, such a device would have an operating energy of ~
135 aJ.
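The numbers in this example follow from the simple volume scaling described above, which can be sketched as follows. The sketch assumes the 5 fJ figure quoted for the 1 μm3 modulator as the reference energy and a reference concentration factor of 1 at that volume; the second value is an assumption made here for illustration, since Table II itself is not reproduced in this section. Function names are illustrative.

```python
def scaled_energy_aj(ref_energy_fj, volume_um3, ref_volume_um3=1.0):
    """Operating energy in attojoules, scaled in proportion to active volume."""
    return ref_energy_fj * 1000.0 * volume_um3 / ref_volume_um3


def required_concentration(volume_um3, ref_volume_um3=1.0, ref_gamma=1.0):
    """Concentration factor scaled inversely with active volume (assumed ref_gamma)."""
    return ref_gamma * ref_volume_um3 / volume_um3


# 200 nm x 300 nm cross-section, 450 nm long, expressed in micrometres.
volume = 0.2 * 0.3 * 0.45          # equals (0.3 um)^3 = 0.027 um^3
energy = scaled_energy_aj(5.0, volume)
gamma = required_concentration(volume)
print(round(volume, 4), round(energy, 1), round(gamma, 1))
```

Under these assumptions the energy comes out at about 135 aJ and the concentration factor at about 37, consistent with the "~ 40" quoted in the text.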
D. Conclusions for scaling to attojoule optoelectronic
devices
The key conclusion of this scaling argument is fundamentally
optimistic for attojoule optoelectronics: even if we only
consider known mechanisms already widely exploited
technologically, sub-femtojoule optoelectronic output devices
are physically quite possible. Whether the extreme case of the
(10 nm)3 active volume is practical is very much a speculative
question, and that case here is included largely for comparison
purposes. However, we can be cautiously optimistic that
devices in the (300 nm)3 active volume range are quite possible,
and perhaps even devices in the (100 nm)3 active volume range are viable Fig. 6. Various configurations for emitter and modulator devices. (a) A
waveguide containing active electroabsorptive (EA) material for a “single-
without drastic technological efforts. pass” optical modulator. (b) A resonator structure for a laser or a cavity-
The challenges are that we will have to make the devices enhanced electroabsorptive or electrorefractive (ER) device. (c) A disk (or ring)
small, into the range of 100’s of nm or smaller, and they will resonator with active gain, electroabsorption or electrorefraction material side-
have to be very well integrated with their associated electronics coupled to a passive waveguide. (d) A reflection modulator for use with
electroabsorption or electrorefraction material for a “surface-normal” device
if we are to obtain the full energy benefits. We will also have to (the top surface may also be anti-reflection coated). (e) A vertical cavity
consider seriously and critically any required approaches to structure for a surface-emitting laser or a resonant cavity modulator. The bottom
concentrating optical fields, such as the use of resonators or mirror may be designed for near 100% reflection so light only leaves from the
top. (f) A classic Mach-Zehnder waveguide interferometer structure for an
conceivably other approaches such as nanometallics (e.g., electrorefractive modulator. Two beamsplitters, nominally with split ratio
plasmonics) or slow light, with any associated loss being a 50:50, split an input beam along two arms. Changing the phase shift in one arm
major issue; furthermore, the issues of fabrication precision and compared to the other changes the division of power between the two output
ports on the right, allowing amplitude (or power) modulation from a simple
operational stabilization for resonators with Q > 30 need to be phase shift.
carefully examined for any proposed approach.
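The Mach-Zehnder behavior described in the caption of Fig. 6 (f) can be illustrated with a minimal sketch of an idealized, lossless interferometer with exact 50:50 beamsplitters. The function name is illustrative, and which physical port corresponds to the cosine or sine term depends on the labeling convention, so the port names here are assumptions.

```python
import math


def mach_zehnder_outputs(phase_shift_rad, input_power=1.0):
    """Power at the two output ports of an ideal Mach-Zehnder interferometer.

    phase_shift_rad is the relative phase between the two arms. The two
    outputs always sum to the input power in this lossless idealization.
    """
    port_a = input_power * math.cos(phase_shift_rad / 2.0) ** 2
    port_b = input_power * math.sin(phase_shift_rad / 2.0) ** 2
    return port_a, port_b


# Zero shift, quarter-wave shift, and half-wave shift between the arms.
for phase in (0.0, math.pi / 2.0, math.pi):
    a, b = mach_zehnder_outputs(phase)
    print(round(phase, 3), round(a, 3), round(b, 3))
```

In this idealization a half wave of relative phase shift moves essentially all of the power from one output port to the other, which is the amplitude modulation from a phase shift that the caption describes.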
A. Qualitative comparison of light-emitters and modulators
IV. OPTOELECTRONIC OUTPUT DEVICE APPROACHES Before comparing specific device mechanisms, we can make
some general comparisons between the two approaches of light-
In our argument so far, we considered only two example
emitters and modulators, as summarized in Table III. In general,
approaches to optical output devices; as we think about
the choice between these two approaches is not simple because
potential low energy optoelectronics, we should look at the
it involves benefits and problems that emerge when we consider

the larger system in which we are using the devices; as a result,
a simple comparison on one parameter or on one strength or
weakness is not generally sufficient to make a choice.

TABLE III
COMPARISON OF LIGHT-EMITTERS AND MODULATORS AS OUTPUT DEVICES
Light emitters - Pro
• No additional optics to get the light to the output device
• Only need to turn the lasers on for the channels in use
Light emitters - Con
• Difficulty of monolithic integration with electronics
• Difficulty of wavelength control for individual emitters, limiting the use of any dispersive optics (e.g., diffractive optics) and wavelength division multiplexing
• May require optical isolation
• May require polarization and mode control
• Relaxation oscillation limit to frequency response, increasing power densities at high speeds
• May have timing issues from turn-on delay [95]
• All power dissipation is on-chip
• Issues with temperature variation because of
  o different shifts of bandgap and resonator wavelengths
  o decrease of laser gain with increasing temperature
Modulators - Pro
• Centralized wavelength, mode, and polarization control, and optical isolation, all at the laser power source
• Can be driven by optical pulses for precise signal timing, including whole arrays of modulators synchronously
• Only the modulator drive power is on-chip
• Many approaches tolerant to high-temperature operation
• Can be compatible with wavelength division multiplexing, even for untuned or low Q modulators
Modulators - Con
• Separate light source required
• Needs optics to split and deliver the power to the many modulators
• All illuminated modulators consume at least the optical drive power even if not driving any signals

Some features that might be viewed as weaknesses can also
be strengths. For example, modulators obviously require an
external light source and optics to distribute that to the devices,
but that also means that we only need to perform the optical
isolation and stabilize the polarization, mode form, and
wavelength of that one source; we may also be able to exploit
the optics to distribute a synchronized set of readout optical
pulses, derived from pulsing the one source, to all of the
modulators [63], [64]. We will return to such points when we
discuss systems in Sections VII, VIII and IX.
B. Efficiency
1) Device power efficiency
Any optical output device that is to allow low energy
optoelectronics must both operate at low total energy, and be
very efficient in delivering the necessary modulated optical
output power; if it is not, we have to increase its optical output
power, and hence its overall power dissipation, so that we can
deliver sufficient energy to the photodetector at the other end of
the interconnect [41]. As we will see later when we discuss
receivers, simply increasing the sensitivity of the receiver to
make up for low emitter efficiency or high background loss or
absorption in a modulator itself also leads to greater power
dissipation. Hence, we have to try to
• avoid light emitters where substantial efficiency compromises have to be made to allow integration
• avoid low-efficiency emitters, even if they have low operating energies
• avoid modulators with significant background loss
2) Single spatial mode operation
A second point about efficiency is that the emitted power must
be in such a form that it can be efficiently delivered to the
photodetector at the other end of the link. For reasons that will
become clearer once we discuss optics and receivers below, this
argues strongly that all light emitters in the system emit into a
single spatial mode, whether they are the optical output devices
themselves or are the optical power source for modulators;
indeed, on one argument, only the optical power in the most
strongly coupled optical mode from the output device to the
photodetector is useful, and the rest of the power is wasted (see
Section VII A below).
For modulators, since they will likely be powered by some
external optical power source laser, it is relatively easy to make
such a laser operate in a single spatial mode; as long as the
intervening optics is of reasonably high quality, then
modulators will anyway be operating on single-mode beams.
For light-emitters as output devices, we should make sure that
they emit with high efficiency into one spatial mode. For laser
output devices, we will have to take some care to make sure the
spatial mode is controlled, most likely to be in the lowest spatial
mode.
If we want to use light-emitting diodes (LEDs) as the optical
output devices, we need to construct them in such a way that
they emit predominantly into one spatial mode. Typical LEDs
are not constructed this way, though as we consider the
possibility of making very small LEDs, it becomes more
feasible to consider such single-mode devices.
C. Light emitters as output devices
1) Lasers
Most lasers in use today in information processing and
communication are semiconductor lasers. These have the
advantages of small size, which in turn is because of the very
high gain per unit length possible in semiconductors. They can
operate at high speed in direct modulation, though with some
limits from the relaxation oscillation frequency (see, e.g., [96]
for a recent discussion) that tends to require higher power
dissipation for faster modulation rates.
As we have argued above, if such lasers are to have
sufficiently low operating energies, then we may need to
change from conventional edge-emitting or surface emitting
cavities towards nanoresonator structures [97], as in research
examples like [91] and [92], or other structures with greater

optical concentration. Whether lasers with metallic
confinement are viable depends on loss; though demonstrated
examples may be small [97], [98], [99], their efficiency or
operating energy may be limited as a result (see, e.g., [85] for
an analysis of metallic loss in small semiconductor structures in
nanometallic waveguides). Small size is of little use here if it
results in larger overall operating energy.
Whether we can exploit gain media other than
semiconductors is an open question; if their gain is lower, then
it may be difficult to get the required performance.
Higher-dimensional quantum confinement, as in semiconductor
quantum wires and quantum dots, can offer somewhat better
gain because of the more concentrated densities of states and
possibly improved electron-hole overlap, though these
advantages may be somewhat offset by the lower “filling
factor” in the use of such structures. Possibly an ideal material
would be some relatively dense collection of uniform quantum
dots (see, e.g., [100] for a recent example of improving
fabrication approaches); size and shape uniformity is, however,
important if all the gain is to be concentrated at one operating
wavelength so that the device remains efficient.
2) Light-emitting diodes
As mentioned above, normally LEDs would be ruled out
because of their typical optical inefficiency from emitting into
large numbers of spatial modes. One of the major opportunities
for light emitters, however, is that LEDs intrinsically become
more interesting as we make them small. One reason is that a
small LED cannot avoid emitting into only a small number of
modes; indeed, an LED with a subwavelength volume can only
really emit into one spatial mode (or two, including
polarization), which is the mode that is essentially in all
directions at once.
A second reason for small LEDs is that the use of strong
optical concentration as discussed above will lead to Purcell
enhancement of the spontaneous emission rate into the modes
with strong optical concentration; indeed, as discussed in
Appendix B, the Purcell enhancement factor FP is essentially36
the optical concentration factor γ we mentioned above in the
discussion of optical concentration.
Such Purcell enhancement can also avoid speed limitations
that otherwise apply to LEDs because it correspondingly
reduces the spontaneous emission lifetime that governs the
dynamics of LED modulation. See, e.g., [101] for an example
of a nanocavity LED exploiting Purcell enhancement for single
mode operation at low energy. Another interesting recent
example [102] uses a nanoantenna to enhance spontaneous
emission, and [103] uses nanometallic guides. LEDs have the
additional advantage that, unlike lasers, they are not “threshold”
devices – no particular level of drive is required to get them to
work.
Of course, any serious proposition for the use of LEDs would
need to show substantial levels of efficiency in the generation
of light as well as emission into predominantly a single spatial
mode, but LEDs become a serious candidate for low-energy
light emitters as we move to smaller sizes and energies.
D. Modulators
Modulators come in two basic types: ones that operate by
changes in the optical absorption of a material
(electroabsorption), and ones that use changes in optical path
length or refractive index (electrorefraction).
Both kinds of devices can provide amplitude modulation. In
devices without resonators, an electroabsorption device in
which the increase in the absorption coefficient corresponds to
~ 1 or more absorption lengths in the device length will give
useful modulation; in the case of electrorefraction, simple
two-beam interference, as in a Mach-Zehnder interferometer [104]
(see Fig. 6 (f)), for example, allows amplitude modulation by
changing from constructive to destructive interference by
inducing ~ 1 half wave of relative path length difference
between the two arms37.
Electrorefraction devices have the advantage that they can be
used to switch a light beam from one path to another, as in
Mach-Zehnder or directional coupler devices, for example.
Generally, electroabsorption devices cannot efficiently switch
beams between different paths, for the relatively obvious reason
that in one state they are absorbing the beam power.
1) Materials criteria for optically efficient modulators
For modulators, we obviously must care about the ability to
make some change in absorption coefficient or in refractive
index; but, we also care about any overall loss. For example, a
particularly important criterion for using a modulator in a
system is the absolute difference ΔT in the transmission of the
modulator in its two states [41]; indeed, for some optical input
power P to the modulator, the useful optical signal power that
leaves the modulator is PΔT. So, if ΔT becomes smaller, we
will have to increase the power P in proportion. Hence,
background loss becomes very important in a modulator.
In Appendix C, we give an extended discussion of the
consequences for modulator materials of this requirement of
high ΔT. Here we can briefly summarize the key results.
• For electroabsorptive materials, presume we have a
material with a background absorption coefficient (i.e., the
absorption coefficient in the “transmitting” state) of αtrans
and a larger “absorbing” state value of αabs = ραtrans, so
that the ratio of the “off” to “on” absorption coefficients is
ρ. To avoid a rapidly increasing system loss penalty, in
practice we require
$$\rho = \frac{\alpha_{\mathrm{abs}}}{\alpha_{\mathrm{trans}}} \geq 2 \qquad (5)$$
• For electrorefractive materials with a background optical
absorption coefficient of α, so that we get enough path
length change without absorbing too much power, for the
available change Δn in refractive index, we require
36
Formally, FP ≅ 0.477γ in resonator structures, as discussed in Appendix B.
37
Other electrorefractive approaches without resonators, such as devices that
might deflect a beam out of the way by changing the optical path in one half of
the beam, or devices in which we cause a beam to “leak” out of a waveguide by
making a mode unguided as a result of an index change, tend to have similar
requirements on effective required path length change.

$$\frac{\Delta n}{\alpha} \geq \frac{\lambda}{2} \qquad (6)$$
• These materials criteria remain essentially the same in
devices with resonators. Use of resonators does not help us
avoid these materials criteria.
The criteria (5) and (6) can be quite difficult to meet, and
various otherwise promising mechanisms cannot achieve them.
2) Microscopic mechanisms for optical modulation
There is a broad range of mechanisms that have been
proposed and investigated for modulating light in response to
electrical drive. We are not aware of a broad comparative
review of these in the literature. Because of the breadth of this
topic and the level of discussion of physical mechanisms
required, we give this detailed treatment in Appendix A, and
summarize some key conclusions here as they relate to
energies.
a) Electroabsorption mechanisms
The strongest modulation mechanism overall is likely the
electroabsorption from the QCSE [65], [66], which is seen in
quantum well layered semiconductor structures and other
quantum-confined structures; we have already given estimates
of the required energies in Table II. It is a mechanism that
results directly from the electric field applied to the structure. It
is seen in direct gap semiconductor materials and near the direct
gap of indirect-gap materials like germanium.
A related electroabsorption mechanism, commonly called the
Franz-Keldysh effect (FKE), is seen in bulk materials near to
their direct bandgap; it is somewhat weaker and shows less
abrupt changes in absorption38, but is still a viable strong
mechanism.
The other main category of mechanisms for changing
absorption in semiconductor structures involves band-filling –
that is, filling up the “bottom” of a band (usually the conduction
band) with carriers (usually electrons) so as substantially to
eliminate the possibility of any absorption into those states,
thereby removing substantial absorption from some region of
the spectrum for photon energies near to the semiconductor
bandgap energy.
The resulting magnitudes of the changes in absorption from
band-filling are similar to or, under strong excitation, larger
than those of the QCSE and FKE; the carrier densities required
for operation are similar to those required to turn on lasers, so
the operating energies of those devices would be similar to the
laser energies in Table II. This category of mechanisms has
various other names, including Pauli blocking, Burstein-Moss
shift and phase-space filling, and there are some subtleties to
the physics, including the influence of excitonic effects, that are
not conveyed by these names.
None of these relatively strong electroabsorption mechanisms
appear either to be available or usable in silicon itself, however,
because they are only seen at or near direct band gaps39.
b) Electrorefraction mechanisms
Any change in optical absorption spectrum results in a change
in the refractive index spectrum through the Kramers-Kronig
relations. Hence, there are relatively strong electrorefraction
mechanisms associated with the QCSE and band-filling
electroabsorption mechanisms, and these can make functioning
devices that are competitive with other electrorefractive
approaches. One difficulty with such mechanisms is that, in
practice, to satisfy the condition (6), the operating photon
energy has to be moved to significantly below the band gap
energy (i.e., to longer wavelengths) to get away from strong
background absorption near the band gap energy. The usable
strength of the refractive effect is therefore weaker because the
refractive effects fall off as we move away from the region
where the absorption is being changed.
Hence such purely electrorefractive devices using these
mechanisms have to be longer (e.g., 10’s to 100’s of microns
instead of a few microns) and can therefore have ~ ×10 – 100
higher operating energies than their purely electroabsorptive
counterparts. The combination of electroabsorptive and
electrorefractive effects can lead to an attractive low energy
modulation mechanism in resonant devices, however,
somewhat enhancing performance compared to purely
electroabsorptive devices (see, e.g., [79]). Again, this class of
bandgap resonant electrorefraction mechanisms is not
practically available in silicon.
A mechanism that does exist in silicon, and has therefore been
widely investigated and used very successfully in devices (see,
e.g., [105]), is the “free carrier plasma” (FCP) refractive index
change associated with changes in carrier (electron and/or hole)
densities [106]. This mechanism is not resonant with any
bandgap energy40. It is, however, relatively weak, being a
further ~ ×10 weaker than the index changes per unit carrier
density in the bandgap-resonant “band filling” mechanisms.
Overall, this FCP electrorefractive mechanism in silicon is
~ ×1000 weaker for making a device than the best
electroabsorption mechanism (QCSE), as is borne out in device
performance; a simple Mach-Zehnder FCP modulator without
any optical concentration will require a few pJ/bit [104],
whereas a short QCSE electroabsorption modulator with no
resonator requires a few fJ/bit or less [41], [78]. As a result, for
low-energy devices, the FCP requires very high optical
concentration to operate, as in high-Q ring [107], [108] or disk
[105] resonator structures, with all of the problems, such as
tuning, associated with that.
The final main electrorefractive mechanism of interest is the
Pockels effect – a linear change of refractive index with electric
field. This mechanism is seen in materials like lithium niobate
38
QCSE appears more as a shift of a relatively abrupt absorption edge, tuned to a given operating wavelength or to compensate for changes in bandgap
whereas the FKE appears more as a broadening of an edge. With the QCSE, it energy with temperature.
39
is possible to pre-bias the structure to just below the voltage at which the Silicon’s corresponding direct band gap is at a photon energy range (~
absorption edge reaches the operating wavelength or photon energy of interest, 4eV) in the ultraviolet, where there is also very strong background absorption
and then apply only a small additional drive voltage to shift the absorption edge from other transitions.
40
past that wavelength, thereby reducing the dynamic power dissipation for It is associated with the plasmon absorption resonance that is typically in
modulation [41]. For the same reason, the QCSE allows the device to be voltage the far infrared frequency range in typical semiconductor situations.
(which is widely used in telecommunications modulators), in III-V semiconductors, and in electro-optic polymers, with all of these mechanisms being strong enough to demonstrate viable devices. It is not, however, seen in bulk silicon because of silicon’s crystal symmetry properties.

The energy required for the Pockels effect does not have the same scaling as the other mechanisms discussed; in fact, in the absence of background losses such as waveguide propagation loss, there would be no actual minimum energy – doubling the modulator length would actually halve the energy required⁴¹. As a practical matter, for reasonable lengths of devices the energies required to operate Pockels-effect devices are not likely to be lower than hypothetical similar devices using other good electrorefractive mechanisms. With very good device engineering, however, including optical concentration from nanometallic waveguides and slow group velocity, devices with ~ 25 fJ/bit have been demonstrated [87], [88], [109] using electro-optic polymers in a device  10 μm long.

⁴¹ Doubling the length and therefore halving the required refractive index change would double the active volume. Since the change in refractive index in the Pockels effect is proportional to the applied electrostatic field E, halving the required refractive index change would halve the required field. But, the electrostatic energy density is proportional to E², so it would reduce by a factor of 4, hence halving the required electrostatic energy overall.

c) Use of two-dimensional materials

Two-dimensional (2D) materials like graphene or MoS₂ have emerged in recent years as intriguing new opportunities for optoelectronic devices. We compare the resulting mechanisms to others in Appendix A; the comparison to quantum well structures is particularly useful because 2D materials and quantum wells share much basic physics.

The simplest way to state the conclusions of this comparison is to say that, broadly, the useful strengths of mechanisms like band-filling, in terms of the energies required, are essentially the same in 2D materials and quantum wells, though 2D materials may offer the possibility of large total changes in absorption in small overall volumes, which could help in avoiding high-Q structures. But, electroabsorption mechanisms like QCSE, if they exist at all in given 2D materials, are practically weaker there. For the QCSE, the 2D materials are actually too thin; the ~ 10 nm thickness of quantum wells is close to some kind of optimum.

Therefore, 2D materials may offer many interesting opportunities, such as the ease of applying them to diverse substrates, but they do not currently appear to offer large energy advantages for optoelectronic devices, and are in practice missing a key strong mechanism (the QCSE).

d) Conclusions on energies for modulator mechanisms

With the exception of the particularly strong QCSE or the somewhat weaker FKE, other electroabsorptive mechanisms will require operating energy densities and optical concentration factors comparable to those for lasers in Table II. The corresponding electrorefractive mechanisms are generally effectively weaker for device operation than their electroabsorptive counterparts (e.g., by  ×10 − 100 ), so would require either longer lengths (and larger energies) or higher optical concentration factors. The widely-used FCP effect in silicon is about another factor of 10 weaker from the point of view of operating energy, so needs particularly long devices or high optical concentration factors. Pockels-effect devices can work well, though they do not appear in practice to offer effects for devices that are stronger than the other electrorefractive effects considered. 2D materials may be interesting for many reasons, but they do not currently appear to offer substantially lower device energies compared to quantum well structures.

Overall, modulator mechanisms can offer operating energies ranging from somewhat worse than laser energies to much better, including the lowest energy microscopic mechanisms for output devices.

V. PHOTODETECTORS AND RECEIVER CIRCUITS

If we think about qualities of a good photodetector, considered as a device on its own, we might look for good efficiency, in terms of photocurrent or photovoltaic power generation for every incident photon, and very low intrinsic noise; both of these attributes would obviously contribute to the ultimate sensitivity possible in some optical receiver. For long distance communications, such ultimate sensitivity is very important. The size and capacitance of the photodetector would be secondary attributes; a good receiver design can give very good sensitivity even with large photodetector capacitance (see, e.g., [19], [110]).

As we think about short distance interconnects, however, the requirements change substantially. Specifically, we need to minimize the total energy to communicate a bit. That energy must include the energy of all circuits, including the output driver circuit and, especially, the receiver circuit. Receiver circuits can dissipate substantial energies, in some cases possibly even being the largest single contributor to the power consumption overall in a link [17].

When we optimize for minimum total energy per bit, the required criteria for the photodetector change substantially. One key and surprising conclusion is that we will likely not run the interconnect in a noise-limited fashion [111]. This is a very different approach compared to that in long-distance or even medium distance communications. We should remember, however, that short electrical wire interconnects also do not run anywhere close to a noise limit, so this is a common aspect of short distance connections.

Indeed, one goal in the design of short optical interconnects could be to make them appear as close to the behavior of an electrical short wire interconnect as possible; there is no overhead on such a connection for low-noise amplification, line coding, CDR or SERDES – we simply put the signal on one end of the line and it appears at the other. That simplicity is essential for minimizing energy dissipation in short connections; use of low-energy optoelectronics might enable us to extend that simplicity and low overall energy to much longer connections.

A. Receiver circuit energies

The issue of increased power dissipation for high-sensitivity
receivers is well understood from classic receiver design analysis. [110] shows⁴² that for a field-effect transistor (FET) front-end amplifier circuit, the minimum overall noise from thermal (Johnson) noise is obtained when the total of the photodetector capacitance and any stray and/or wiring capacitance at the input is equal to the physical input capacitance of the FET. This means that such a receiver designed for optimum sensitivity with respect to thermal noise will have an FET size that grows with the size of that photodetector and wiring capacitance, with a corresponding increase in the static current in the FET channel when it is biased as an AC amplifier.

⁴² See Eq. 4.65 of [110] and associated text.

So, even if we consider a noise-limited approach, to reduce power dissipation overall, it can be useful to reduce the total input capacitance connected to the transistor, including the detector capacitance. (See also [19] for a recent analysis of noise in optical interconnect receiver circuits⁴³.)

⁴³ See, for example, the terms proportional to the photodetector capacitance and inversely proportional to the square root of the transimpedance amplifier power dissipation in determining the minimum possible received optical power in Eqs. (12) of [19]; low total input capacitance and high amplifier dissipation improve sensitivity in such a noise-limited receiver.

To understand the energies involved in receiver circuits, consider, for example, a recent low-energy photodiode and receiver design [112]. The photodiode has ~ 8 fF or less capacitance and the hybrid (solder-bump) packaging technique adds about another 25 fF for a total capacitance of ~ 30 fF. This example gives a receiver circuit operating at 170 fJ/bit at 25 Gb/s with -14.9 dBm noise-limited sensitivity. Such a receiver circuit energy per bit is impressively low; other circuits (see [112] for comparisons) can dissipate as much as several pJ/bit.

In the work of [112], including the input coupling loss of ~ 6 dB, the effective responsivity of the photodetector is 0.2 A/W. -14.9 dBm is equivalent to a power of 32.3 μW, so at 25 Gb/s the photodetector is receiving an optical energy of ~ 1.3 fJ/bit, which will be generating ~ 260 aC/bit of charge in the photodetector. In a capacitance of ~ 30 fF that charge will give a voltage swing of ~ 8.6 mV. So the effective voltage gain of this amplifier system, including the front end amplifier and the sense-amplifier circuits, is ~ 50 – 100 to get a final logic level output swing that is a substantial fraction of a volt. But, the energy cost of this sensitivity and noise-limited operation is ~ 170 fJ/bit when working with this ~ 30 fF input capacitance.

B. Low-capacitance front ends and receiverless operation

Now suppose that we were able to make a small photodetector (as discussed above in Section III A), integrated very close to the input of a CMOS gate, with a total capacitance of the photodetector, the connecting wiring and the transistor input of, say, ~ 300 aF. Then that same 260 aC of optically-created charge in the receiver of [112] would itself generate a logic-level swing ~ 0.8 V [6] to drive the CMOS gate. That would completely eliminate the 170 fJ/bit of receiver circuit energy, allowing the receiving system to operate at ~ 1 fJ/bit total energy. Such an extreme system with no voltage amplifier, and relying on a full logic voltage swing from the photodetector itself, can be called a “receiverless” system [45], [46].

This receiverless approach can be a good starting point for considering designs and energy savings from low photodetector capacitance. The resulting electrical input circuits can be extremely simple, being just CMOS gates, for example.

C. Near-receiverless operation

Such a “receiverless” approach may not represent the very lowest possible total energy per bit for such links with low detector capacitance; it may be that we can take what we can call a “near-receiverless” approach⁴⁴. In such an approach, conceptually we start with a receiverless design, and then add some receiver gain, but only insofar as we are reducing the total energy per bit of the system. The energy required to give the additional receiver gain must be lower than the energy saved as a result of needing less optical source power.

⁴⁴ This term “near-receiverless” is one that we are introducing here.

It might seem obvious that adding more receiver gain would always reduce the total energy per bit because it would allow lower transmitted power. But, adding gain stages does increase receiver power dissipation as well. And, if we increase receiver gain so much that we start to approach a noise-limited design, the receiver power dissipation can rise substantially [111]. We discuss this point in more detail in Appendix D. There is therefore a balance between receiver gain and power dissipation on the one hand and transmitter power dissipation on the other. For long links with high loss and/or high bit rates, such noise-limited receivers typically are required for functioning links, but once we consider short links and more limited bit rates, optimizing for minimum total energy per bit can lead to quite different conclusions, especially if we have low photodetector capacitance.

One conclusion from our analyses in Appendices D and E and previous discussions [111] is that possibly about one gain stage might be advantageous in such a near-receiverless design for low-loss optical links with low photodetector capacitance, and this gain stage design would not be a noise-limited one; this optimum design would still lead to voltage swings at the receiver input that are much larger than any effective noise voltage. The conclusion that only about one such simple gain stage would be required is why we can call this approach “near-receiverless”. With the example numbers we consider here, that receiver amplifier circuit could consume up to a few fJ/bit of energy and still lead to overall energy reductions.

D. Low-capacitance photodetectors

To understand the possibilities for operating with low-capacitance photodetectors, we can examine some orders of magnitude for capacitance, as shown⁴⁵ in Table IV.

⁴⁵ Many of the capacitance numbers here are as discussed in Section III A above when we were considering electrostatic energies of output devices.

Historically, photodetectors in telecommunication systems had relatively large capacitances such as ~ 1 pF; the detector and the receiver circuit might be made in different technologies with different materials, and a simple wire bond between the two (with a capacitance that could easily also be ~ 1 pF) allowed
simple manufacture. Receiver power dissipation was also a relatively unimportant issue in such systems.

We see, however, from Table IV that, if we could make detectors with size scales ~ 1 × 1 × 1 μm³ to 100 × 100 × 100 nm³, the detector capacitance can be comparable to or lower than the input capacitance of the small transistor to which it would be connected. Fortunately, if we use a direct absorption mechanism in a semiconductor, we can obtain strong absorption typically in one to a few microns of length, so even without any optical concentration to increase the absorption per unit length, relatively compact and effective photodetectors are possible (e.g., [113], [114]). Such direct absorption mechanisms are available at commonly used telecommunications wavelengths in III-V semiconductors (e.g., InGaAs) and across the direct gap of germanium.

TABLE IV
CAPACITANCE (C) OF SMALL STRUCTURES

Structure                                          C               References and notes
100 × 100 μm square conventional photodetector     ~1 pF           (a)
5 × 5 μm CMOS photodetector                        4 fF            [46]; (b)
Wire capacitance, per μm                           ~200 aF         [6]
FinFET input capacitance                           ~20 – 200 aF    [35]; (c)
1 × 1 × 1 μm³ cube of semiconductor                ~100 aF         (d)
100 × 100 × 100 nm³ cube of semiconductor          ~10 aF          (d)
10 × 10 × 10 nm³ cube of semiconductor             ~1 aF           (d)

(a) Assuming a 1 μm thick depletion region and a semiconductor with dielectric constant ~ 12.
(b) This is a lateral p-i-n silicon detector, operated at ~425 nm wavelength where silicon has strong optical absorption.
(c) The ~20 aF capacitance is simulated [35] for a single-fin FinFET, at fin widths of ≥8 nm. The larger number of 200 aF is to account for the possible use of FinFETs with more fins, as is common in circuits, and some parasitic capacitances.
(d) Assumes only the plane-parallel capacitance between two opposing faces, neglecting any fringing capacitance, and assuming a typical semiconductor dielectric constant of ~ 12.

We would, however, have to keep the connection to the transistor short – e.g., < 1 μm – if that wiring capacitance is not to dominate. That means that we need monolithic or at least very intimate integration of the photodetectors with the electronics to which they are connected.

A good example of a detector that could be monolithically integrated with silicon for receiverless operation is a germanium waveguide detector on silicon, with a 1.3 × 4 μm² footprint, ~ 1 μm height, and ~1.2 fF capacitance [113]. Such a device has no optical concentration, so the prospects for reduced capacitance in a smaller device with some concentration, such as a low-Q resonator, are promising.

Avalanche gain in the detector itself is another approach to reducing the required optical input energy. Such detectors have been demonstrated in germanium structures on or with silicon [115], [116], with gains of up to 12 [115], for example. See also the work with III-V nanoneedle structures [117], [118], [119], including examples on silicon [117], [119].

Nanometallic resonator photodetector structures have been demonstrated, allowing high responsivity structures in germanium [120]. This work showed up to ~1 A/W responsivity in a lateral resonant cavity structure 975 nm wide and ~ 300 nm thick. In such structures, Q  100 . Such structures can also exploit photoconductive gain, an alternative approach to avalanche gain for useful current gain from the detector itself.

Another approach for concentration with metals uses nanometallic (or plasmonic) dipole antennas to concentrate into a ~ 100 × 100 × 100 nm³ detector volume [121]. Nanometallics can also enhance other photodetector structures [118], [119], including those with avalanche gain. Mie and other resonances in dielectric structures such as nanowires [122], Fano resonance modification of those [123], and nanocavities [124] are other possible approaches for moderate Q resonances for photodetection.

Note, incidentally, that nanometallic or plasmonic concentration into such small detector volumes is one of the cases where such use of metals can make overall sense despite the loss problems with such use of metals. Suppose metallic optical concentration allows a detector volume that is smaller by a factor of 10, which might reduce the capacitance by a factor of 10 as a result. Even if that metallic concentration is only 30% efficient because of metallic losses, then we may still be winning by a factor of about 3 (10 × 0.3) in reducing the energy of the system. See, for example, [85] for an analysis of a photodetector in a structure with metallic concentration, including metallic losses. Note also that the use of metals, with their very large effective dielectric constants, is likely the only way to concentrate light into deeply subwavelength volumes.

In general, we can conclude that the concept of very low capacitance photodetectors with reasonable efficiency is quite viable, especially with some moderate amount of optical concentration from resonators or nanometallic (plasmonic) structures. A key point, however, is that such photodetectors must be integrated very close to the electronics. The required closeness of integration here (e.g., < 1 μm given the ~ 200 aF/μm wiring capacitance) may mandate a monolithic integration approach if we are to get the major benefits possible here. If we take this approach of small photodetectors and tight integration, though, we can avoid the dissipation of noise-limited receivers (see Appendix E for a discussion of noise in these cases).

VI. COMPARISON OF LONG, MEDIUM AND SHORT DISTANCE SYSTEMS

Table V summarizes some of the key attributes and technologies for the use of optics in sending information at different length scales. Optics has been overwhelmingly successful in long-distance telecommunications; arguably
nearly all the information we send over nearly all the distance we send it travels over optical fiber. The modern internet would be impossible without the dramatic increase in information transmission optical fiber technology has enabled. A key requirement for long distances is that we get the maximum information over a given fiber over the longest possible span.

Optics is increasingly used at medium distances, such as those between racks inside data centers and large information processing machines. Here one key driver for the use of optics is that otherwise we run out of space for wiring the connections – connection and bandwidth density become important.

At shorter distances, such as inside racks and down towards the edges of chips themselves, optics is not yet a dominant technology, but increasing the density of connections and reducing energy per bit communicated become major system requirements.

No such simple table can be comprehensive, of course, and there are many technologies not mentioned in Table V that underlie the entire table, such as semiconductor electronics and optoelectronics. The details of such a table are also open to debate.

A main point, though, is that the requirements on the technology change substantially as we move to shorter distances, especially for the shortest distances. This is important because much of the investment and technological development has obviously been for the longer distances, but we cannot simply take the same approaches at the shortest distances. We need to view components and systems very differently at short distances, and there are substantially different challenges and opportunities.

We should not doubt that there are significant interconnection problems currently constraining systems at medium and short distances. For example, the “byte per FLOP” problem in supercomputers is well known [17]; it is very desirable in computer architectures to be able to access a byte of information from memory for each floating-point operation (FLOP), but modern machines fall well below this goal. This problem has proved quite intractable so far by electrical approaches; such machines are unable to transfer enough information between the memory and the processors – they operate as if they are in a permanent and severe information “traffic jam”. At the present time, there appears to be no physical solution other than optics for major improvements in the information density for such relatively short interconnects.

In the following Sections VII and VIII, we will look at some of the key different requirements and opportunities for short interconnects. Specifically, we consider optics for dense, short interconnects, and issues and opportunities related to clocking, timing, and time-multiplexing. We need to minimize the total energy per bit communicated while also enabling very high densities of interconnections; these two requirements tend to work with, not against, one another, though they lead to approaches quite different from current medium and, especially, long interconnects.

TABLE V
COMPARISON OF LONG, MEDIUM AND SHORT DISTANCE OPTICAL COMMUNICATION

Long-distance telecommunications (> 1 km)
  • Key benefits of optics
      o Very large data rates over very long distances
  • Key requirements
      o Maximum capacity (b/s) over the longest span
      o Maximum capacity per fiber
  • Key technologies
      o Single-mode fibers for low dispersion communication
      o Optical amplifiers for maximum distance
      o WDM for maximum capacity
      o Low-noise receivers for maximum distance
      o Coding and error correction for maximum distance
      o Advanced modulation formats for maximum capacity
  • Emerging possibilities
      o SDM for higher fiber capacity

Medium-distance data links (~ 10 m – ~ 1 km)
  • Key benefits of optics
      o High density of connections
      o Enable flat networks within data centers [14]
      o Reduce overall power dissipation in data centers
  • Key requirements
      o High density, low cost, connections between racks
  • Key technologies
      o Dense integrated optics and optoelectronics
      o Array optics (e.g., linear arrays of fibers)
      o Line coding to avoid AC coupling issues
  • Emerging possibilities
      o SDM for higher density connections

Short-distance interconnects (< 10 m)
  • Key benefits of optics
      o Very low energy per bit communicated
      o Very high density of connections
      o Signal integrity
          • Signal timing, voltage isolation, low pulse distortion
  • Key requirements
      o Very low energy optoelectronics
      o Minimize energy per bit overall, including dissipation in any electronic circuits
      o Integration for very low energy, very high density, very low cost per connection
      o Tolerant to component and operating condition (e.g., temperature) variations
  • Key technologies
      o Silicon-compatible integration
      o Very low capacitance photodetectors and integration
      o Very dense, array optics
  • Emerging possibilities
      o Free-space and/or SDM for very high densities, allowing moderate clock rates that minimize energy per bit
      o Large synchronous zones to eliminate retiming power

WDM – wavelength-division multiplexing
SDM – space-division multiplexing (as in multiple modes or cores per fiber)
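
As a numerical illustration of why Table V lists very low capacitance photodetectors as a key technology at short distances, the receiver estimates of Section V and the cube entries of Table IV can be reproduced with a few lines of arithmetic. The following is only a sketch under stated assumptions: the function names are ours, the input numbers are those quoted in the text for the receiver of [112], and the capacitance uses the simple plane-parallel formula with dielectric constant ~ 12 and no fringing fields, as in note (d) of Table IV.

```python
EPS0 = 8.854e-12  # vacuum permittivity, F/m


def dbm_to_watts(p_dbm):
    """Convert a power in dBm to watts."""
    return 1e-3 * 10 ** (p_dbm / 10)


def photo_charge_per_bit(p_dbm, bit_rate, responsivity):
    """Return (optical energy per bit in J, photogenerated charge per bit in C)."""
    energy = dbm_to_watts(p_dbm) / bit_rate
    return energy, responsivity * energy


def plate_capacitance(area_m2, gap_m, eps_r=12.0):
    """Plane-parallel capacitance, neglecting fringing (assumption of Table IV, note d)."""
    return EPS0 * eps_r * area_m2 / gap_m


# Numbers quoted in Section V.A: -14.9 dBm at 25 Gb/s, 0.2 A/W effective responsivity
energy, charge = photo_charge_per_bit(-14.9, 25e9, 0.2)
swing_30fF = charge / 30e-15    # hybrid-integrated detector, ~30 fF total
swing_300aF = charge / 300e-18  # hypothetical tightly integrated detector, ~300 aF total

print(f"optical energy per bit: {energy * 1e15:.2f} fJ")
print(f"charge per bit:         {charge * 1e18:.0f} aC")
print(f"swing into 30 fF:       {swing_30fF * 1e3:.1f} mV")
print(f"swing into 300 aF:      {swing_300aF:.2f} V")

# Table IV cube entries: opposing faces of area side^2 separated by side
for side in (1e-6, 100e-9, 10e-9):
    c = plate_capacitance(side * side, side)
    print(f"{side * 1e9:6.0f} nm cube: {c * 1e18:.1f} aF")
```

With these inputs the sketch gives roughly 1.3 fJ/bit, 260 aC/bit, a swing of about 8.6 mV into 30 fF and somewhat under 0.9 V into 300 aF, in line with the ~ 0.8 V logic-level swing discussed in Section V.B; the cube capacitances come out near 100 aF, 10 aF and 1 aF.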
VII. OPTICS FOR SHORT-DISTANCE INTERCONNECT SYSTEMS

A key guiding principle for short distance interconnects is that we must optimize the entire interconnect for minimum total energy. That principle leads to some consequences and novel opportunities for optics.
• First, the need to minimize energy overall, and hence minimize optical loss, pushes us to use what we could call “mode-matched” and/or diffraction-limited optics.
• Second, optics also offers the opportunity at short distances to work with very large numbers of channels, which, obviously, can improve interconnect density.
• Third, and less obviously, we can trade off numbers of channels to reduce energy by eliminating electronic link circuitry.

We will discuss the first two of these here, and we will return to the third point in Section VIII when we consider clocking and time-multiplexing.

A. “Mode-matched” and diffraction-limited optics

Long-distance communication uses single-mode fiber in part because it avoids the problems that arise from light in many different spatial modes propagating at different speeds, which would lead to pulse dispersion. At medium distances, such pulse dispersion is less important, and multimode fibers can be used; multimode fibers are more tolerant of alignment precision, allowing lower cost systems, and they can also be designed to minimize dispersion.

In such multimode systems, it makes little difference in which mode or modes the signal propagates; a large detector can collect the power in all the spatial modes. With an appropriate receiver amplifier, there is no sensitivity penalty for using such a large detector, and we can let the light scatter into the many modes of a multimode fiber or waveguide.

At short distances, however, to reduce or eliminate receiver energy dissipation, we want to work with the smallest possible photodetector to reduce capacitance. In the receiverless limit, the operating energy is proportional to the photodetector capacitance until that capacitance becomes comparable to the capacitance of any wiring that connects the photodetector to the transistor, and/or to the transistor input capacitance itself. Given our discussion of capacitances above in Table IV, the size scale at which the photodetector capacitance will be comparable to transistor input capacitance in a well-integrated system is at a wavelength scale or smaller.

Suppose we design a photodetector so that it is “minimum-sized” – that is, it has as small an area as possible to collect essentially all the light in at least one form of input beam. In conventional optics, it will then have some size of the order of a square half-wavelength in area, as sketched in Fig. 7 (a), to absorb the light as efficiently as possible from one specific tightly focused spatial mode or “spot”. A key point, though, is that it will not then efficiently absorb much light at all from any other spatial mode (in the same polarization) [125].

Fig. 7. Sketch of (a) a single beam focused with an appropriately large convergence angle θ towards an approximately minimum sized spot, of area A_min ~ (λ/2)², (b) two beams focused to two spots, and requiring a total detector area ~ 2 A_min, and (c) N (= 9 here) beams focused to N spots on a total area ~ N A_min.

To absorb a second spatial mode efficiently, we would have to at least double the detector area, as sketched in Fig. 7 (b). That doubling is relatively obvious if we think in terms of “spots” that should not overlap; we might think there is some other set of propagating beam shapes that could avoid this problem, but in fact that cannot be done⁴⁶ [126] (see Appendix F).

⁴⁶ The approximate “counting non-overlapping spots” heuristic approach is backed up by a much more general and rigorous theory of coupled channels between surfaces and volumes [126]; that approach is based on a sum rule of coupling strengths for the optimum orthogonal channels (or “communications modes”), and can be regarded as a generalized theory of diffraction [126].

Quite generally, then, for plane absorbing surfaces on photodetectors, their area has to grow proportionately with the number N of modes they are to detect, which means their capacitance also grows by a factor N.

In a receiverless system dominated by photodetector capacitance, if we increase the detector area by N to collect the power from N spatial modes, this corresponding growth in capacitance means the voltage swing generated by the optical input energy is therefore reduced by N, so we have to increase the total transmitted optical energy by N to restore the voltage swing. So, making a detector that is efficient for detecting a signal in any of N modes can lead to an increase in required system energy per bit by a factor of N in a receiverless system.

We might imagine that we could make some piece of optics ahead of the photodiode that would somehow recombine the incoherent power from multiple different spatial modes into one; that, however, would violate the Second Law of Thermodynamics⁴⁷ [125], as well as some basic optics [125].

⁴⁷ If we could do that, we could combine the power from two cool black bodies to heat up a warmer one, for example.

Hence, in a receiverless system, if we create the light in multiple different modes or if we let it scatter into many different modes, effectively we can make little or no use of that power in different modes. Essentially, we cannot usefully get back any power that we launch incoherently into other modes⁴⁸.

⁴⁸ For optical concentration into a minimum-sized photodetector, the best we can do in general in a multimode system where we do not know the relative coherence between the power in the different modes is to concentrate the power from the most powerful mode into the minimum-sized photodetector; all power

In general, we can only use the power in the most strongly coupled mode; power in other modes is wasted. So, we want to run with optics that creates and retains the power of a given signal in one spatial mode. In free-space optics, this requirement equivalently means we want to run with diffraction-limited optics since otherwise we are leaking power (by aberrations) into other spatial modes.

This conclusion also has implications for the use of LEDs. If we allow the LEDs to emit into multiple spatial modes, then only the power in the most powerful mode is useful to us; the rest is wasted in a receiverless system. So, LEDs become interesting in receiverless systems if and only if they are essentially emitting into only a single spatial mode. That is by no means impossible, however, and the smaller we make LEDs, the easier it becomes to move towards such a situation. LEDs with significant Purcell enhancement in a specific mode become quite attractive options (see, e.g., [96] for a recent discussion).

B. Beam couplers

One other consequence of the need to work with such “mode-matched” optics is that any optics that is to couple from one form of light beam to another, such as a grating coupler to couple from free-space to waveguide optics, has to be mode-matched; that is, if we are only coupling into a single mode, as in a single-mode waveguide or a minimum-sized detector, then we can only couple from a single mode, that is, from a specific beam shape and alignment, and the resulting coupler has to couple exactly only these two modes to one another [125]. It is not sufficient that a coupler has no absorption losses, for example. Any coupler that is to be efficient must be matched specifically to the modes it is coupling. The alignment tolerance of an efficient coupler is fixed by the sizes of the beams being coupled; that tolerance is not something we can design to be any better than that [125].

Beam couplers have received considerable attention (see, e.g., discussion in [49]), based especially on approaches like grating couplers and inverse tapers. It is an interesting question whether nanophotonics could enable other approaches. Novel mode converters based on arbitrary and computational approaches in compact nanophotonic structures [128], [129], [130], [131], [132] have been designed. Extending this approach could be a promising direction for improving coupler efficiency yet further.

There are also novel possibilities for self-aligning couplers that could adjust themselves after fabrication [133], [134], [135] and compensate for aberrations, imperfections, misalignment, and even some mixing from scattering between modes.

C. Large numbers of channels

Optics has at least two ways⁴⁹ in which we can substantially increase the number of available channels: (i) wavelength-division multiplexing (WDM); and (ii) space-division multiplexing (SDM). In WDM, we exploit the very high carrier frequency of light (e.g., 200 THz at 1.5 μm wavelength); we can put many channels of different carrier frequencies on one spatial mode, but still close enough in frequency that their propagation behavior is essentially the same or similar, as in the use of ~ 50 channels on 100 GHz spacing in the telecommunications C-band. In SDM, we might try to exploit some moderate number of different orthogonal spatial modes in a single-core or multiple-core fiber [12] or in a free-space link between buildings, or a very large number of modes, such as 1000’s to 10,000’s of channels in optical imaging links between chips [2], [136].

Incidentally, the issue of the number of available spatial channels in an optical system, either in free space or in fibers, and the optimum choice of the optical modes for communications in optical systems is one over which there has been some confusion recently; for example, orbital angular momentum modes are sometimes discussed as if they represent an additional set of degrees of freedom for SDM communication, beyond conventional spatial or polarization degrees of freedom, which is not the case. These points are discussed in Appendix F. A simple formula [126] for the diffraction limit to the number of separable channels between two parallel surfaces of areas A_T and A_R, separated by a distance L and operating at a wavelength λ, is (for a given polarization)

$$N_C \approx \frac{A_T A_R}{L^2 \lambda^2} \qquad (7)$$

which is derived as Eq. (35) in Appendix F.

1) Wavelength-division multiplexing in short distance interconnects

There are two basic approaches to WDM for short-distance interconnects: we can use passive optics to split different wavelengths to photodetectors and to combine signals on different wavelengths from modulators that do not themselves need to be tuned or resonant (see, e.g., [137], [138], [139]); or we can use resonator modulators and/or photodetectors that themselves extract the WDM channels by tuning to specific wavelengths (see, e.g., systems using sets of microring or microdisk resonators, each tuned to a chosen different wavelength [14], [17], [20], [105]). See also [140] for a critical analysis of WDM approaches for dense interconnections.

To use either approach for short distance interconnects, we may need to use micro- and/or nano-photonic approaches; the wavelength separator must be very compact if we are to achieve the large number of interconnect channels we would need off a chip. For passive splitters, conventional approaches like arrayed waveguide gratings may be too large to allow one for every spatial channel at short distances, though compact devices have been demonstrated [141]. Echelle gratings are another relatively compact passive micro-optical approach [139]. Solving this problem could be a promising direction for nanophotonics; there are several novel possibilities here, including superprism wavelength splitters [142], waveguide

⁴⁸ (continued) in other modes is useless [125]. Even if all the scattering is coherent, undoing arbitrary coherent cross-coupling into other modes, though now understood to be possible in principle [127], would be hard to apply to complex scattering.

⁴⁹ We can also use different polarizations, but that only gives a doubling of the number of channels.

nanophotonic wavelength splitters [143], [131], [144], as well as conventional approaches exploiting nanophotonic fabrication (see, e.g., [140]). Such systems could have the additional advantage of being able to interface directly with medium- or long-distance WDM systems.

Whether we can use dense WDM techniques (e.g., with many 10’s of different wavelengths) for large-scale short distance interconnects is an open question (see, e.g., [140]); we re-encounter the issue of fabricating or adjusting large numbers of systems with high precision that we found above when considering high-Q resonators. Note that 100 GHz in 200 THz is 1 part in 2000, and any system that pulls out one such channel needs at least that precision to operate. Possibly we can adjust systems in real time to allow such tuning precision, at some cost in complexity and power (see, e.g., analysis by [105]).

2) Dense waveguides

One obvious form of SDM is to use multiple separate waveguides. Technologies like silicon photonics can operate with waveguides that might be as small as ~ 200 × 300 nm²; such an approach allows quite dense waveguide circuits. In a planar structure, we can therefore have dense waveguide arrays, possibly up to many thousands per centimeter of overall width. If we do use such small waveguides, there are some other considerations, such as loss and crosstalk, and the issue of just what waveguide size to use in various applications is a matter of debate [145].

Such a waveguide technology can be used within either a set of waveguides on a chip, or possibly on some “interposer” secondary waveguide structure onto which multiple chips are attached (see, e.g., [58], [146], [147]). Just what density of connections we could make to some such interposer structure is an open question; couplers between chip waveguides and waveguides on some interposer might require sizes larger than the waveguides because of diffraction. If we tried some simple butt-coupling approach of face-to-face coupling of guides, we would require alignment tolerances between guides on different chips on a scale much smaller than the guide cross-section; that could be challenging with small guides at micron or sub-micron sizes. So, whether it is practically possible to have 1000’s of waveguided connections off a chip to such an interposer is still arguably quite speculative.

Plasmonic or nanometallic guides can operate with even smaller cross-sections, such as ~ 80 nm (see, e.g., [86]); at such sizes much smaller than dielectric guides, their losses are, however, relatively high, such as a loss-limited propagation distance ~ 10 μm (see, e.g., [85], [86]). Such nanometallic or plasmonic waveguides and related “antenna” concentrator structures (see, e.g., [121]) could be very useful at distance scales of microns or shorter; they represent the only way to guide light controllably at few-micron or sub-micron scales, and the only way to concentrate light directly into sub-wavelength structures. For longer distances, however, to reduce loss, they would have to be made with larger cross-sections, and then it is no longer clear that they offer advantages compared to the dielectric guides we could then make at similar cross-sectional sizes (see, e.g., [148] for a critical discussion).

3) Free-space and space-division multiplexed optics in short distance interconnects

The core idea of free-space optics, and more generally of SDM optics in which beams may overlap as they propagate, is that with one optical system, we can handle multiple beams of light or spatial modes at once. We start out with signals in separate “spots” or single-mode guides at the transmitter end. In the middle of the optical system, the resulting beams may all be in modes that overlap, but the optical system will separate these out to similar spots or single-mode guides at the receiver end, giving multiple separate channels for communication, as in Fig. 7.

Of course, this idea is routine in classical optics – imaging optics with a simple lens does exactly this function. Such imaging optics can form the basis for free-space optics for interconnection with many 1000’s or 10,000’s of beams [136], and we will return to this point below.

a) Few-mode SDM systems

For small numbers of modes, e.g., from a few modes up to possibly 10’s of modes, it may also be possible to run separate spatial channels through a single optical fiber. That possibility is relatively straightforward if the fiber has multiple separate cores with negligible optical coupling between the cores. More intriguing is the possibility of operating with overlapping modes in fibers. That possibility requires some way to transform in and out of the overlapping fiber modes to connect to separate spots or waveguides at the ends of the system, which is an interesting area for novel optics [10], [11], [12], [149], [150]. Recently, it has been understood, at least in principle, how to solve such separation problems even in the general case of arbitrary overlapping but orthogonal beams [133], [134], [135].

A subtler issue is that, with overlapping modes or even loosely coupled cores in one fiber, there will in general be scattering between the modes. That scattering is not in general predictable; it can result from imperfections and it can change in time because of, for example, temperature fluctuations or mechanical bending or vibrations.

Scattering in and out of different modes can additionally lead to variations in group delay, which can impact the use of SDM in long connections [151]. In short connections, such group delay variation might not be as much of a problem, but we would still need to undo the scattering to separate the overlapping information channels again. Use of electronic techniques to undo the effects of the coupling, such as MIMO⁵⁰ algorithms in digital signal processing (DSP) circuits [152], can handle both group delay variation and separation of cross-coupled channels. Those MIMO algorithms and processing might make sense at longer distances. The power consumption of such circuits could rule out such approaches in short

⁵⁰ MIMO – multiple-input, multiple-output – approaches come originally from wireless communications technology, where many transmitting and receiving antennas may be used at once. Signal processing techniques can separate out the channels from the signals from the multiple antennas, including undoing the effects of the delay variation from signals propagating along different paths in a scattering environment.

interconnects, however.

It is possible in principle to undo such scattering [127] using purely optical self-configuring techniques running with low power feedback loops [133], [134], [135], and such a scheme has recently been demonstrated based on these architectures and algorithms [153]. For such schemes to be practical for short distances, we would, however, require optical phase shifters that can run at very low power; possibly micromechanical approaches could achieve such low-power phase shifting [154], [155], though this remains speculative. Such schemes also might take up significant chip area, which could limit their use somewhat.

b) Systems with very large numbers of modes or beams

If we consider free-space optical systems, we can consider very large numbers of modes or beams. With reasonable design we can suppress most undesired scattering between such modes (for example, any good imaging system, like a camera lens, will have very little scattering between different image pixels). Such systems can routinely support millions of modes, pixels or resolution elements, even in quite compact, millimeter-scale optical systems like cell-phone cameras.

Free-space optical approaches have been researched in various functioning systems and technologies. For example, a six-stage digital system with more than 65,000 light beams, using imaging interconnects between stages, has been successfully demonstrated [136], as have various other free-space optical systems and approaches [156], [157], [158], [159], [160], [161], [162].

Generating arrays of 1000’s of light beams with low loss from one source is straightforward using Dammann-grating spot array generators⁵¹ [163]. Other diffractive optics in planar structures [161] can offer more complex interconnection patterns, and further approaches are available for specific regular interconnection networks [162], [163]. Various other micro-optical techniques are also possible, including lenslet arrays. (See, e.g., [164] for an extensive discussion of such free-space optics, including various micro- and nano-optical approaches and technologies.)

Though such approaches have been successfully researched, they have not yet been exploited to any great degree in short interconnects, in part because we have not yet needed the densities of connections they can provide. The time when we may need such densities may be approaching, however, and there are other benefits, including reduction of energy for clocking and timing that we will discuss below.

We might think there would be problems with letting the light leave the waveguides and propagate through free space, but, as stated, we routinely do this in imaging systems without major difficulties. Furthermore, though we use the term “free-space”, we do not necessarily mean we are propagating through air; instead we could use bulk glass or plastic, so we can readily avoid problems such as dust or turbulence.

We might think it would be difficult to align so many beams. In fact, though, approaches like Dammann grating spot array generators [163] easily and efficiently generate customizable and very regular arrays, with a geometric precision guaranteed by lithography. In using such arrays, we only need to ensure alignment of a few parameters; once we have set those, the entire array is aligned. As with any optical alignment even of one beam, we need to set the overall position in three spatial dimensions and in two angles, and we need to focus the beam; but then, to align the entire array, we only need one additional angle (rotation about the beam array axis) and one additional factor, which is the overall physical size scale of the array of spots. Such spot array generators are diffractive elements, and as such will have some wavelength dependence, but, as discussed in Appendix F, these do not appear to be major limitations.

Fig. 8. Illustration of 32×32 arrays of 10×10 μm² areas for optical spots on a chip or other substrate. (a) Directly side-by-side, taking up 320×320 μm² area. (b) An optional array of lenslets, e.g., on 31.25 μm centers, shown above the array of spot areas. Such lenslets can take an array of larger side-by-side spots and focus them onto the small spot areas on the chip. (c) A 32×32 array, spaced apart on 31.25 μm centers, possibly using lenslets as in (b), taking up 1×1 mm² area.

A simple calculation can show the orders of magnitude possible with such “free-space” optics (see Fig. 8). Suppose, for example, we allocate a surface area of ≈ 10 × 10 μm² for each optical “spot” on the surface of the chip. (Such an area corresponds approximately to the size of the optical spot in a single mode fiber, and could correspond to the area of a grating coupler or some other structure for converting between free-space and waveguide propagation.) If we arranged such areas side by side in a 32 × 32 array, giving 1024 spots or channels, they would occupy a chip area of only ~ 320 × 320 μm² altogether. Even at an on-chip clock rate ~ 2 GHz on each such channel, the bandwidth density here would be 20 Tb/s/mm² (2000 Tb/s/cm²). Also shown in Fig. 8 is the possible use of arrays of lenslets to concentrate from larger

⁵¹ Such an approach uses one lens to collimate a beam from a source like a fiber output or a laser, a diffractive optical element, which is a lithographically fabricated plane structure, that generates beams at multiple different angles from the collimated beam, and then a second lens to turn the multiple different angles into spots on the output plane. See [163].

spots onto spot areas spaced apart, e.g., on 31.25 μm centers, leading to an expanded total area of ≈ 1 × 1 mm².

Even if we expanded to 62.5 μm center-to-center spacing, the total area required would only be ≈ 2 × 2 mm² for these 1024 channels. Such a spacing would allow significant room between the spot areas or corresponding output couplers for waveguides to route optical signals and/or optical power beams, a concept we will discuss in Section IX below. Such a 2 × 2 mm² cross-section system could carry thousands of channels over distances of many centimeters with only simple lenses. See Appendix F for a calculation of the numbers of channels available in free space systems.

These areas are much less than the overall surface area of a chip, which can be up to a few square centimeters. Even running at only an on-chip clock rate of ~ 2 GHz, and even allowing 2 physical “beams” or channels for each data channel (as in “dual rail” operation – see below in Section VIIIB), such a system with 1024 beams in ≈ 2 × 2 mm² area corresponds to 1 Tb/s of data and a bandwidth density of 25 Tb/s/cm².

Suppose even that we wanted to couple directly into a 32 × 32 array of conventional optical fibers, which would therefore imply 125 μm center spacing; such an array butt-coupled or imaged at unity magnification onto the surface of a chip would only require a 4 × 4 mm² chip surface area.

Hence, operating with ~ 1000s of channels of free space connections in and out of the surface of a chip corresponds to relatively straightforward optics that also need not use a large fraction of the chip surface area. Such a number does not approach any limits in optical design, required area, alignment or wavelength precision. Significantly larger numbers of channels, e.g., up to 10’s of thousands, might be possible if desired. Such optics need not occupy a large fraction of the chip area, leaving considerable room for other functions, such as heat-sinking or electrical connections.

Communication of 10,000’s of channels over distances of many meters is also straightforward with a single free-space optical system that could be similar to two telephoto lenses “staring” at each other (see Appendix F).

VIII. CLOCKING, DATA RETIMING, AND TIME-MULTIPLEXING

A. Timing problems and resulting power dissipation

There is an important aspect of dissipation in interconnect systems that so far we have overlooked – the energy required for clocking, data retiming, and time-multiplexing in interconnect links. One reason we have not considered these aspects so far is that they are mostly not problems in transmitting and receiving devices themselves⁵²; rather they arise as problems from electronic circuits.

In conventional digital logic systems, not only do we need well-defined logic levels for “1” and “0” in terms of some signal amplitude like voltage; we also need logic signals to fit into well-defined time slots. Obviously, if some 2-input AND gate is to give meaningful outputs, the two inputs must be representing valid logic levels at the same time, and we must only look at the gate output at a time when both inputs are valid. In typical logic systems, we do this by also applying a “clock” in the system to define the valid time windows, and we may also use additional circuitry like latches to “freeze” signals so they are valid in the desired time slots. The distribution of the required clock signal itself can be regarded as an interconnect problem, and that distribution can also take up a significant fraction of the chip power (see, e.g., [45], [165]).

If we think of relatively “long” interconnects, which here could be as short as across a chip or our “short distance” interconnects between chips, boards or cabinets, then two further issues arise:
(1) the interconnects themselves can have significant delay, the delay is likely not an integer number of clock cycles, and that delay is also somewhat unpredictable in electrical wiring⁵³;
(2) the clock frequencies in use at the two ends of the interconnect may not even be the same.

In data links, the first of these two problems can be handled by circuitry that performs just clock phase recovery, effectively from the data itself; the second problem can be addressed by recovering both the clock phase and the clock frequency from the data. Both clock phase and clock frequency recovery can require significant circuitry, including delay- and/or phase-locked loops and data buffering for retiming; such circuitry obviously dissipates power. Collectively, these issues of recovering the clock phase and/or frequency and of retiming the data are referred to as “clock and data recovery” (CDR).

Typically, on links we have also wanted to get the maximum amount of data on a given physical channel; so we may time-multiplex the data from the lower frequency of the circuit’s basic logic operations to some higher frequency, and similarly time-demultiplex it at the receiving end. Hence, we have the additional power consumption of the time-multiplexing and demultiplexing circuitry (otherwise known as serialization and deserialization or “SERDES” circuitry). That circuitry necessarily has to run at some significant multiple of the logic circuit frequency, which typically will mean it is consuming more energy per bit operation than a logic gate itself does. Furthermore, with such time-multiplexing, the problems of clock recovery become worse; now we must recover a higher frequency clock, which will also mean we need an even better timing precision in the recovery of the clock window.

For example, [24] shows that the electronic circuit functions of line coding (for receiver AC coupling), CDR, and SERDES can together consume ~ 20 mW for a 10 Gb/s channel, so ~ 2 pJ/bit. Of this energy, more than half (so, ~ 1 pJ/bit) is consumed by the SERDES circuitry. All this energy is in addition to any energy to run the optical signal receiver and transmitter circuits and devices. 12 – 14% of the power is in the CDR, so > 100 fJ/bit for just that portion, in addition to the SERDES dissipation.

⁵² Turn-on delay in lasers can contribute to timing variability, however [95].

⁵³ The rise times of signals on electrical lines depend on the line resistance, but the temperature coefficient of the resistance of, e.g., copper is such that the rise time is not reliably predictable, and hence the effective signal delay is not predictable in practice on long electrical lines [22], at least not to within some small fraction of a clock cycle.
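The energy-per-bit figures quoted from [24] in Section VIII.A follow from dividing the circuit power by the bit rate. The short sketch below reproduces that arithmetic; the function and variable names are illustrative assumptions, and the only inputs are the numbers stated in the text (~ 20 mW, 10 Gb/s, a 12 – 14% CDR share, and "more than half" for SERDES).

```python
def energy_per_bit_j(power_w: float, bit_rate_bps: float) -> float:
    """Average energy per bit: power divided by bit rate."""
    return power_w / bit_rate_bps


# Figures quoted in the text from [24]; treated here as round numbers.
LINK_POWER_W = 20e-3   # line coding + CDR + SERDES together
BIT_RATE_BPS = 10e9    # one 10 Gb/s channel

total_j_per_bit = energy_per_bit_j(LINK_POWER_W, BIT_RATE_BPS)

# CDR share of 12-14% of the power (range as stated in the text).
cdr_low_j = 0.12 * total_j_per_bit
cdr_high_j = 0.14 * total_j_per_bit

# "More than half" for SERDES, so 50% is only a lower bound.
serdes_floor_j = 0.5 * total_j_per_bit

print(f"total:  {total_j_per_bit * 1e12:.2f} pJ/bit")
print(f"CDR:    {cdr_low_j * 1e15:.0f} to {cdr_high_j * 1e15:.0f} fJ/bit")
print(f"SERDES: at least {serdes_floor_j * 1e12:.2f} pJ/bit")
```

With these inputs the CDR share works out to roughly 240 to 280 fJ/bit, consistent with the "> 100 fJ/bit" statement above.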
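The spot-array estimates of Section VII.C (1024 spots, ~ 2 GHz clock, various pitches) can be checked with a few lines of arithmetic. The sketch below is a minimal illustration under stated assumptions: one bit per clock cycle per data channel, square arrays, and, for the diffraction-limited reach from Eq. (7), N_C ≈ A_T A_R / (L² λ²), an assumed wavelength of 1.5 μm borrowed from the WDM example in the text. The function names are hypothetical.

```python
import math


def spot_array(n_side, pitch_m, clock_hz, beams_per_data_channel=1):
    """Return (side length in m, aggregate data rate in b/s, density in b/s per m^2)."""
    side_m = n_side * pitch_m
    data_channels = (n_side * n_side) / beams_per_data_channel
    # Assumption: each data channel carries one bit per clock cycle.
    rate_bps = data_channels * clock_hz
    return side_m, rate_bps, rate_bps / side_m**2


def max_separation_m(n_channels, area_t_m2, area_r_m2, wavelength_m):
    """Invert N_C = A_T*A_R/(L^2*lambda^2) for the largest L supporting n_channels."""
    return math.sqrt(area_t_m2 * area_r_m2 / n_channels) / wavelength_m


CLOCK_HZ = 2e9

# (a) 10 um spots directly side by side.
_, dense_rate, dense_density = spot_array(32, 10e-6, CLOCK_HZ)
dense_tbps_per_mm2 = dense_density * 1e-6 / 1e12

# (b) 62.5 um pitch with dual-rail signalling (2 beams per data channel).
spaced_side_m, dual_rate, dual_density = spot_array(
    32, 62.5e-6, CLOCK_HZ, beams_per_data_channel=2
)
dual_tbps = dual_rate / 1e12
dual_tbps_per_cm2 = dual_density * 1e-4 / 1e12

# (c) 125 um pitch, as for an array of conventional fibers.
fiber_side_mm = spot_array(32, 125e-6, CLOCK_HZ)[0] * 1e3

# Diffraction-limited reach for 1024 channels between two 2 x 2 mm apertures.
# Assumption: wavelength of 1.5 um.
reach_m = max_separation_m(1024, 4e-6, 4e-6, 1.5e-6)

print(f"side by side: {dense_tbps_per_mm2:.1f} Tb/s per mm^2")
print(f"dual rail:    {dual_tbps:.3f} Tb/s, {dual_tbps_per_cm2:.1f} Tb/s per cm^2")
print(f"fiber array:  {fiber_side_mm:.1f} mm on a side")
print(f"reach:        {reach_m * 100:.1f} cm")
```

Under these assumptions the sketch gives 20 Tb/s per mm² for the dense array, about 1 Tb/s and 25.6 Tb/s per cm² for the dual-rail case, a 4 mm side for the fiber-pitch array, and a diffraction-limited reach of about 8 cm, in line with the "many centimeters" estimate.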

The reason for such energies in SERDES and CDR is clear from our earlier discussion of energies to run logic gates (see Table I). Because running just one gate to perform just one logic operation requires several femtojoules at a minimum (and possibly considerably more), every time we “touch” a bit in some operation or perform some other logical operation, we dissipate at least such femtojoule energies. Each bit is “touched” multiple times in a time-multiplexed link; for example, SERDES circuits typically require clocked latching and time (de)multiplexing of each bit at transmit and receive, as well as other logic operations such as byte realignment. As a result, no such time-multiplexed link can approach the energies of a simple local interconnect in an electronic circuit.

So, in some hypothetical future link using low-energy optoelectronics, in which we may have eliminated the receiver circuit power dissipation by our receiverless or near-receiverless approaches, unless we somehow also reduce SERDES and CDR powers by orders of magnitude, we cannot take much advantage of the benefits of the new optoelectronics approaches.

Fortunately, however, there are ways in which optics can eliminate both CDR and SERDES and their associated power dissipation. These approaches are somewhat radical from the perspectives of interconnect systems as we currently know them, but because of the growing importance of these issues, we need to consider these optical approaches seriously.

B. Optical approaches to eliminating line coding, CDR and SERDES

Optics has three major advantages that are not yet greatly exploited in short interconnects:
(1) optical delay is very predictable, allowing possibly larger synchronous systems, such as an entire rack or set of racks [22];
(2) optics can support short pulses over moderate lengths, allowing very precise clocking from the fast rise times of the optical pulses [45], [62];
(3) in systems with modulators, we can read out the modulators using optical pulses, automatically retiming the data as it is read out [63], [64].

We also need to consider two other aspects of optical receivers – namely,
• AC coupling, which typically leads to the requirement of line coding circuitry, and
• gain control.

We would also like to use optics to avoid both of these additional circuit issues.

1) Optical delay variability and precision

Optical fibers have a change of refractive index with temperature of ~ 10⁻⁵ per K (or per degree Celsius). With a temperature range of 100 K (or Celsius degrees) for the system, we could, for example, have optical fiber connections as long as ~ 3 m – 10 m for only ~ 10 ps – 30 ps timing uncertainty from thermally-induced propagation delay variation in the fiber. Delays of this magnitude are likely small compared to the clock period of typical logic circuits; clock frequencies of ~ 2 GHz would have total clock periods of ~ 500 ps.

2) Short pulse propagation in fibers

From the usual relation between frequency bandwidth and pulse time duration, as in Fourier transforms, a pulse of full width at half maximum (FWHM) Δτ has a minimum frequency FWHM bandwidth Δf given by an “uncertainty principle” relation [84], which, for a Gaussian pulse shape as an example, takes the form

$$\Delta f \, \Delta \tau \approx 0.44 \qquad (8)$$

Mode-locked lasers can generate pulses of quality comparable to such minimum uncertainty principle limits⁵⁴, for example, and likely a well-designed low-chirp modulator can also generate such high-quality pulses from a continuous-wave beam.

For example, a ~ 10 ps pulse has a bandwidth Δf ≈ 44 GHz. Near 1.55 μm wavelength, this is equivalent to a wavelength spread ≈ 0.35 nm. Typical long-distance telecommunications fiber is designed⁵⁵ to have a dispersion ~ 10 – 20 ps/nm-km [10]. So such a 10 ps pulse would have a spread of < 7 ps in one kilometer length⁵⁶. Hence over lengths even up to 100’s of meters, such pulse dispersion may present no problems, and for distances of meters or 10’s of meters, it is essentially completely negligible. Even pulses ~ 1 ps duration would show only moderate spreading over 10 m.

We also know that we can use such short pulses to deliver very precise clocking to electronic systems⁵⁷, with sub-picosecond precision demonstrated [62]. So optics, then, is a very good way to deliver precise and accurate clocking to electronic systems, even up to overall size scales ~ 10 m. One caveat is that we would not in general be able to inject the clock signal optically for all the points on a chip that need to be clocked; on a chip there is a very large number of such points, and we would not have enough optical power in practice to clock all such points directly. We could, however, eliminate some of the upper layers in the clock distribution tree on a chip with optics [45]. The main benefit of optical clocking for large systems, though, may be in its ability to run that entire large system synchronously, avoiding the CDR and SERDES power on the longer links (e.g., off chip) as discussed above.

3) Data retiming by pulsed optical readout of modulators

One additional benefit of using short-pulse optics with a modulator-based approach is that we can read the data out of modulators and retime the data as a result, with no additional power dissipation required for that retiming [63], [64].

⁵⁴ Such pulses are known as “time-bandwidth limited”.

⁵⁵ It is also possible to design fibers with lower or even near-zero dispersion [10]; finite dispersion in long-distance fibers can also be a deliberate system choice to avoid various problems in fiber transmission.

⁵⁶ This calculation is a simplistic linear addition of the calculated pulse dispersion spread to the pulse width, and may therefore be an over-estimate. More correctly for Gaussian-like pulses, we might add in quadrature (the square root of the sum of the squares of the pulse widths or spreading).

⁵⁷ Note that in general, not only can optical clocking deliver very precise timing in terms of predictability and length of the optical pulses; effectively, the optical pulse also leads to a much faster rising voltage edge than can be generated by conventional electrical means on chip, with further improvements in the resulting circuit performance [64], [166].
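The thermal timing estimate in "Optical delay variability and precision" can be sketched as follows. This is a rough illustration only: it assumes the delay change is simply the fiber length times the index change divided by the vacuum speed of light, and it neglects thermal expansion of the fiber and any difference between phase and group index. The function name and defaults are illustrative, with the default values taken from the text (~ 10⁻⁵ per K, 100 K range, ~ 2 GHz clock).

```python
C_M_PER_S = 299_792_458.0  # vacuum speed of light


def thermal_delay_variation_s(length_m, dn_dt_per_k=1e-5, delta_t_k=100.0):
    """Change in propagation delay from a temperature-induced index change.

    Assumption: delay = n * L / c, so the variation is L * (dn/dT) * dT / c;
    thermal expansion of the fiber itself is ignored.
    """
    return length_m * dn_dt_per_k * delta_t_k / C_M_PER_S


CLOCK_HZ = 2e9
clock_period_s = 1.0 / CLOCK_HZ

delay_3m_s = thermal_delay_variation_s(3.0)
delay_10m_s = thermal_delay_variation_s(10.0)

for length_m, delay_s in ((3.0, delay_3m_s), (10.0, delay_10m_s)):
    fraction = delay_s / clock_period_s
    print(f"{length_m:4.0f} m: {delay_s * 1e12:5.1f} ps "
          f"({fraction * 100:.1f}% of a {clock_period_s * 1e12:.0f} ps clock period)")
```

Under these assumptions the variation is about 10 ps for 3 m and about 33 ps for 10 m, that is, a few percent of a 500 ps clock period.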
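The pulse-spreading numbers in "Short pulse propagation in fibers" can be reproduced from Eq. (8) together with the usual conversion between frequency and wavelength spread, Δλ ≈ λ² Δf / c. The sketch below is a minimal illustration; the helper names are hypothetical, and the dispersion values are the ends of the ~ 10 – 20 ps/nm-km range quoted in the text. It also compares the simple linear addition with the quadrature addition mentioned in footnote 56.

```python
import math

C_M_PER_S = 299_792_458.0  # vacuum speed of light
GAUSSIAN_TBP = 0.44        # time-bandwidth product from Eq. (8)


def min_bandwidth_hz(pulse_fwhm_s):
    """Minimum FWHM bandwidth of a Gaussian pulse of the given FWHM duration."""
    return GAUSSIAN_TBP / pulse_fwhm_s


def wavelength_spread_nm(bandwidth_hz, wavelength_m):
    """Convert a frequency spread to a wavelength spread near a center wavelength."""
    return wavelength_m**2 * bandwidth_hz / C_M_PER_S * 1e9


def dispersion_spread_ps(d_ps_per_nm_km, spread_nm, length_km):
    """Pulse spreading from chromatic dispersion: D * delta_lambda * length."""
    return d_ps_per_nm_km * spread_nm * length_km


PULSE_PS = 10.0
bandwidth_hz = min_bandwidth_hz(PULSE_PS * 1e-12)
spread_nm = wavelength_spread_nm(bandwidth_hz, 1.55e-6)

spread_1km_ps = dispersion_spread_ps(20.0, spread_nm, 1.0)    # upper end of range
spread_10m_ps = dispersion_spread_ps(20.0, spread_nm, 0.01)

# Two ways of combining the original width with the spreading (footnote 56).
width_linear_ps = PULSE_PS + spread_1km_ps
width_quadrature_ps = math.hypot(PULSE_PS, spread_1km_ps)

print(f"bandwidth:          {bandwidth_hz / 1e9:.0f} GHz")
print(f"wavelength spread:  {spread_nm:.2f} nm")
print(f"spread over 1 km:   {spread_1km_ps:.1f} ps")
print(f"spread over 10 m:   {spread_10m_ps:.3f} ps")
print(f"width after 1 km:   {width_linear_ps:.1f} ps linear, "
      f"{width_quadrature_ps:.1f} ps in quadrature")
```

At the upper end of the dispersion range this gives about 44 GHz, 0.35 nm and about 7 ps of spreading over 1 km, and well under 0.1 ps over 10 m; the quadrature sum is noticeably smaller than the linear one, as footnote 56 anticipates.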

The idea here is shown in Fig. 9. The data from the electronic logic circuit drives a modulator or array of modulators. Then we read out the modulator(s) with a short pulse, or an array of optical short pulses, as might be generated using a Dammann grating from one short pulse source. As long as the optical pulse readout comes at some time that the electrical data is valid, all the data read out now acquires the timing of the readout pulse. Hence we can remove the timing skew (different fixed delay on different logic paths) by simple choice of optical path lengths, and we can largely eliminate the jitter (statistically varying delay from noise or power supply fluctuations, for example), retiming the data precisely to the optical clock. Effectively, the optical pulse readout is also removing the need for a set of data registers and their clocking.

Fig. 9. If optical modulators are read out with synchronized optical pulse trains as inputs, then we can remove (a) jitter (random variations in signal timing) and (b) skew (different timings on different signal channels) that are present in the original electrical drive inputs to the optical modulators, leaving retimed and synchronized signals on the optical pulses after the modulators. (After [64]).

Note that, as long as we design the optical system so that readout pulses arrive within the clock window, we need no electronic circuitry at all to achieve this retiming. We also need to make no change to the optical modulators as long as they are capable of handling the optical bandwidth of the pulses (which devices like electroabsorption modulators can certainly do); specifically, we do not need to change the way we drive them electronically or speed them up in any way. There need be no change in the modulator design or increase in its size or decrease in packing density to exploit this approach. We also do not need fast receiver amplifiers or electronic sampling circuitry. Simple “receiverless” operation (see Fig. 10) or integrating receiver front-ends need no modification to work with such pulsed input, and indeed can then perform better than they otherwise do [64], [166]. Of course, large numbers of spatial channels are required if we eliminate time multiplexing, but as we have discussed in Section VII, free-space optics can offer such numbers.

4) Optically modulo-synchronous volumes
We could extend the use of the precision of timing available in optics and optical fibers to what we could call optical modulo-synchronous volumes (an idea and a terminology that we are introducing here⁵⁸). By modulo-synchronous we mean that all propagation delays are either one clock cycle or integer numbers of clock cycles, to some accuracy of a small fraction of a clock cycle, so they have the same or similar time delay, modulo a clock period.

The idea of such modulo-synchronous volumes is that we would completely remove the need for clock phase and frequency recovery on all interconnects, from a chip-sized length scale up to a scale of possibly many cabinets in size (e.g., ~ 1 cm up to ~ 10 m). All signals on such interconnects would be delivered within a known fixed part of the clock cycle window throughout this modulo-synchronous volume, with effectively integer numbers of clock cycles of delays. Within what we could call a module, that delay would be within one clock cycle, or whatever substantial fraction of that we normally consider for reliable combinational logic operations. Between modules and racks, the additional delay would be integer numbers of clock cycles. The overall optically modulo-synchronous volume could be of order ~ 10 m in size, eliminating all clock recovery within a rack or even a set of racks.

The optical requirements for such a modulo-synchronous system are quite modest, especially if we decide to run the system at the moderate, few-GHz clock rates of modern electronics (clock rates that are chosen so as to minimize and control power dissipation)⁵⁹. The simple action of cutting optical fiber cables to specific lengths within ~ a few mm is then sufficient to allow modulo-synchronous operation over ~ 10 m size scales, with the additional propagation delays precise to timescales ~ 10 ps. Pulses from modulated conventional or mode-locked lasers provide suitable optical sources. We give some specific example calculations for such systems in Appendix G.

5) Avoiding AC coupling and gain control problems
When we make a receiver circuit, especially one with a small-signal amplifier at its front end, we have to consider the referencing of the voltage “midpoint” or signal “zero” of the amplifier input and the corresponding equilibrium voltage output from the photodetector (i.e., the average voltage output through some long string of 1’s and 0’s); in general, these voltages will not be the same. This DC offset could cause significant problems, especially if it is comparable to or larger than the input sensitivity of the receiver amplifier.

A typical solution to such a problem is to AC-couple the amplifier input to make it insensitive to such static DC offsets, for example by putting a capacitor between the photodetector output and the amplifier input. That AC coupling leads to
58
There are many existing terms describing different kinds and levels of synchronization in signaling, but not apparently one that explicitly describes this particular concept of signals that arrive at a well-defined narrow window within the clock cycle, though with delays of possibly integer numbers of cycles. We might just loosely call such a system “synchronous”, though that would miss the notion of delay by integer numbers of clock cycles. The reason why there may not apparently be an existing term to describe this kind of synchronization may be because such a concept is not easy to achieve with wires because of their effectively varying propagation delays.
59
Communication inside modules of the scale of 10 cm within a total of one clock cycle is then straightforward if we are propagating at light velocity.

another problem, however; if the data corresponds to a very long string of 1’s or a very long string of 0’s, the capacitor will essentially block such sequences⁶⁰. As a result, we may add “line coding”, in which the actual data signal is “coded”⁶¹ before transmission into a different one that avoids such strings, and then “decoded” at the receiver end. One example is “8b/10b” coding [24]. That line coding adds circuit complexity and power dissipation. In the example we quote above [24], the power dissipation associated with that line coding was ~ 15 – 20 % of the energy per bit, so ~ 300 – 400 fJ/bit. This is a significant energy, so it is important to try to eliminate it also.

Fig. 10. Dual-rail signaling. Equal input power beams are modulated by a pair of modulators, electrically stacked and driven at their center point by a voltage Vin. The resulting pair of modulated beams are transmitted to a pair of electrically stacked detectors, where they lead to an output voltage at their center point of Vout.

If we have large numbers of optical channels available, as in some free-space system, for example, there is one interesting optical option to avoid these problems – namely, “dual-rail” operation, in which we use a pair of beams A and B to represent one signal. Here, a logic 1 is represented by beam A being bright and beam B being dark (or less bright), and a logic 0 corresponds to the opposite. Then if we use a pair of photodetectors in a “stacked” configuration at the receiver, we can avoid AC coupling and all of the associated coding and decoding power (see Fig. 10). Such dual-rail optical approaches were successfully employed in large digital optical system demonstrations [136], [156], [157].

This approach has several additional benefits:
• it avoids any requirement of high on/off contrast in a light beam, because it is a differential approach that only works with the difference between the powers, not the absolute values;
• it avoids any need for gain control when used in a receiverless or near receiverless mode; even with arbitrarily large over-drive of the optical inputs, the output voltage at the center point will either saturate at the supply rails (for photoconductors) or at voltages no more than the diode forward voltage past those supply rails (as in so-called “diode-clamped” receivers [167], [168]);
• it can operate as an analog data latch when using photodiodes; if we receive optical pulses into such a detector pair driving a high-impedance input like a CMOS FET gate, then, in the “dark” between the arrival of the pulses, there is essentially no path for the charge to leak off the photodetectors – at least one of the diodes is always in reverse bias in such a scheme – so the logic state voltage is remembered until being reset by the arrival of the next pair of data pulses.

IX. AN EXAMPLE PHYSICAL ARCHITECTURE FOR ATTOJOULE OPTOELECTRONICS

To illustrate how these various optical techniques could be used in a large system to reduce power dissipation, we sketch a physical architecture here. This example exploits the various approaches outlined above to eliminate the energies of receiver amplifiers, line coding, CDR, SERDES, and a significant portion of clock distribution power generally, while allowing large interconnect bandwidth densities. This is not meant to represent some optimum architecture, or to exclude other approaches; instead, it is just an example to show potential viability and performance. If we could generate the necessary low-energy optoelectronics integrated with their electronic circuits, this example approach is otherwise one that reasonably could be engineered; the other optics required are well within the capabilities of current engineering if we chose to pursue them.

A. System interconnect energies

We presume first that we are going to run the entire system at 2 GHz clock rate⁶², consistent with power-efficient silicon chips, and we presume the optical modulo-synchronous approach for the system. We drive all longer interconnects using optical pulses through modulators. Such interconnects would predominantly be off-chip, but could include some of the longer on-chip interconnects also in the optically hybrid waveguide/free-space architecture we will discuss. Note that these interconnects could be as long as ~ 10 m within the modulo-synchronous approach.

Hypothetically, we would operate receiverless or near-receiverless photodetector pairs integrated very close to minimum-sized transistors, using a dual-rail approach. We presume the total capacitance of the photodetector pair, the input transistors, and any wiring connecting them is ~ 100 aF.
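The timing figures behind the modulo-synchronous assumption (a 2 GHz clock, links up to ~ 10 m, fibers cut to within a few mm) can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not the paper's own calculation (which is given in Appendix G). The group index of 1.5 and all function names are assumptions introduced here for illustration.

```python
# Minimal sketch of modulo-synchronous timing arithmetic.
# From the text: 2 GHz clock, ~10 m volume, fibers cut to within a few mm.
# Assumed here (not from the text): group index 1.5 and all function names.

C_VACUUM = 299_792_458.0   # speed of light in vacuum, m/s
GROUP_INDEX = 1.5          # assumed typical value for silica fiber
CLOCK_HZ = 2e9             # clock rate presumed for the example system


def delay_s(length_m, group_index=GROUP_INDEX):
    """Propagation delay through a fiber of the given length."""
    return length_m * group_index / C_VACUUM


def length_for_cycles(n_cycles, clock_hz=CLOCK_HZ, group_index=GROUP_INDEX):
    """Fiber length whose delay equals exactly n_cycles clock periods."""
    return n_cycles * C_VACUUM / (group_index * clock_hz)


def phase_offset_cycles(length_m, clock_hz=CLOCK_HZ, group_index=GROUP_INDEX):
    """Delay modulo one clock period, as a signed fraction of a cycle."""
    cycles = delay_s(length_m, group_index) * clock_hz
    return cycles - round(cycles)


if __name__ == "__main__":
    print(f"one-cycle fiber length: {length_for_cycles(1) * 100:.1f} cm")
    print(f"delay error per mm of fiber: {delay_s(1e-3) * 1e12:.1f} ps")
    link = length_for_cycles(100)  # a link of about 10 m
    print(f"100-cycle link: {link:.3f} m, "
          f"offset {phase_offset_cycles(link):+.4f} cycles")
    print(f"same link cut 2 mm long: "
          f"offset {phase_offset_cycles(link + 2e-3):+.4f} cycles")
```

Under the assumed index, one clock period corresponds to roughly 10 cm of fiber and each millimeter of length error to about 5 ps, so a cutting tolerance of a few mm is consistent with the ~ 10 ps precision quoted in the text.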
60
Such sequences also cause problems with clock recovery because there are no “transitions” to use to estimate the clock cycle time.
61
This “line coding” is quite different from coding we might use for error correction to counteract the effects of noise, and would be in addition to that. In general, in short interconnects, we would try to avoid running near any noise limits anyway, and would certainly want to avoid the yet further complexity and power dissipation of error correction on every link.
62
Note that the major electrical interconnect connections to memory chips, such as the DDR4 specification, are specified to run just at such low GHz rates, simply running large numbers of lines to achieve large aggregate data rates.
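The reason dual-rail signaling (Fig. 10) needs neither line coding nor gain control can be seen from its decision rule alone: the received bit depends only on which beam of the pair is brighter. The following minimal Python sketch models just that rule, not the stacked-photodiode circuit itself. The function names and the contrast values (`bright`, `dim`) are hypothetical choices for illustration.

```python
# Minimal sketch of dual-rail signaling (illustrative names and values).
# Logic 1: beam A brighter than beam B; logic 0: the opposite.

def encode_dual_rail(bit, bright=1.0, dim=0.5):
    """Return (power_A, power_B) for one bit; bright/dim set the contrast."""
    return (bright, dim) if bit else (dim, bright)


def decode_dual_rail(power_a, power_b):
    """Decide the bit from the sign of the power difference only."""
    return 1 if power_a > power_b else 0


def transmit(bits, scale=1.0, bright=1.0, dim=0.5):
    """Encode, apply a common power scaling (loss or over-drive), decode."""
    received = []
    for bit in bits:
        a, b = encode_dual_rail(bit, bright, dim)
        received.append(decode_dual_rail(scale * a, scale * b))
    return received


if __name__ == "__main__":
    data = [1] * 20 + [0] * 20          # long runs with no transitions
    for scale in (0.01, 1.0, 100.0):    # weak, nominal, heavily over-driven
        assert transmit(data, scale) == data
    print("long runs decoded correctly at all power levels")
```

Long runs of identical bits decode correctly even with a modest 2:1 contrast, and a common scaling of both beams does not change the result, which is the behavior the text attributes to the differential approach.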

Fig. 11. Sketch of an optical platform for dense, low-energy interconnects, shown at multiple different length scales, from the transistors up to free-space
arrays off a larger chip. (The figure is not to scale, especially for the size of the transistors, which would be relatively much smaller than depicted here.) (a)
A pair of photodetectors is integrated beside the gate of the corresponding transistor input (here shown in the form of a FinFET structure). (b) A dual-rail
optical beam pair A and B are connected through “transistor-layer” couplers and short (e.g., ~ 1 μm) waveguides to the photodetectors. (c) A photonics layer
(e.g., as in silicon photonics) sits on top of the electrical wiring layers of the chip. Here it contains couplers to couple the input beam pair A and B through
waveguides in the photonics layer, to a pair of “photonics layer” optical couplers that here focus the light through transparent regions in the electrical wiring
layer onto the “transistor layer” optical couplers. (d) Elsewhere on the chip, electrical “via” connections through the electrical wiring layer connect from
output transistors to modulators that are in waveguides in the photonic layer. Optically, power is fed into the modulator waveguide from a power light beam
through an input coupler, and an output coupler couples the resulting modulated power to an output beam. (e) and (f) show portions of input and output
couplers and beam arrays. (g) shows a larger picture of the photonic layer on top of the entire chip. Here we envisage various two-dimensional coupler arrays:
input and array coupler and beam arrays; a power array coupler and beam array; and linear arrays of fiber inputs and outputs. (h) shows spot array generator
optics fed by input power from some central optical power source through a fiber, and how multiple chips might be connected laterally and vertically using
free-space connections. Such connections could include array coupler optics laterally between adjacent chips, as well as other array connections possibly
vertically in and out from other modules or boards.

This is a moderately aggressive target, but not unreasonable as a stretch goal given our arguments above.

By use of some moderate optical confinement (e.g., 10) in the photodetectors and some photodetector material (e.g., germanium) operated at its direct gap, we presume these photodetectors have close to unit quantum efficiency (1 electron of current for each incident photon). Hence a received energy of ~ 100 aJ would be sufficient to swing a logic level at the transistor input, even without additional gain.

We now further presume the minimum required optical input energy per bit could be reduced to ~ 30 aJ with some avalanche, photoconductive or transistor amplifier gain, without substantial energy cost compared to the optical transmitter energies; so, we are taking a “near-receiverless” approach at the input.

As in Fig. 11, we presume that we have one or more free-space array units, each of, say, 1024 optical spatial channels, coming on or off each chip. We could configure these as 512 logical channels in a dual-rail approach. At 2 GHz clock rate, that would correspond to ~ 1 Tb/s data rate on or off the chip in such a unit. We also presume that the total optical system loss from the power source laser to the photodetector, including loss from finite modulator contrast, is 19 dB (a factor of 80)⁶³ (see, e.g., [19]). Then the required optical energy per bit would be 30 aJ × 80 = 2.4 fJ. The total optical power for one such unit of 512 channels would therefore be 2.4 fJ × 1 Tb/s = 2.4 mW.

If we assume the laser source driving this system has a “wall-plug” efficiency of 30% (which is an aggressive target), then the total power to run the laser is 8 mW (or 8 fJ per bit). If we use optical modulators that themselves operate with energy < 1 fJ/bit, which is already possible with quantum well modulators [41], [78], and we assume a similar electrical circuit energy per bit to drive the modulators, then we end up with a total system energy per bit < 10 fJ/bit, or < 10 mW to drive 1 Tb/s of interconnect.

Note that this hypothetical interconnect can drive connections over the entire modulo-synchronous volume at the same energy per bit. Hence the potential here is to reduce interconnect energies by ~ 2 orders of magnitude or more compared to current approaches. The energy per bit here is so low that it would be energetically favorable to use it for longer on-chip interconnects, which may otherwise take 100’s of fJ/bit (see Table I).

B. Optical platform concept

Here, we sketch an optical platform approach that could provide the necessary bandwidths, connections and energies. See Fig. 11. This is very much in the spirit of a “straw man” proposal, i.e., one that is intended to generate discussion, rational criticism and comparison, and stimulate improved or alternative proposals. We will describe this progressively from the smallest, “transistor” level up to the largest meter-scale level and beyond.

In this example, we presume first that we have integrated photodetectors right on top of the transistors, in what we call the “transistor” layer (Fig. 11 (a) and (b)). We use dual-rail signaling, so these photodetectors are electrically “stacked” as in Fig. 10 (though they are physically side-by-side in the integration here), and we presume the center point of this stack directly drives the gate or gates of a CMOS stage (here depicted like a FinFET with a single “fin”).

Since the detectors will need to be driven by two beams A and B that may need optical spacing of ~ a micron or more just to separate them, the light from these beams is optically routed from couplers in the “transistor” layer through waveguides in the “transistor” layer to the two detectors, thereby avoiding the capacitance of electrical wiring (at ~ 200 aF/μm) that would be required if we spaced the detectors themselves by microns. Possibly the waveguides here are nanometallic or plasmonic, or some combined metal-dielectric guide. Possibly there is some optical resonance in the overall detector structure (e.g., Fabry-Perot, Mie or other shape resonance). Possibly the detectors use germanium or III-V materials integrated with an underlying silicon electronics platform.

Overall, the choice of integrating the detectors right with the transistors is made so as to eliminate high electrical receiver power dissipation. The goal is to achieve total capacitance at the input, including photodetectors, parasitics and transistor input capacitance ~ 100 aF per detector while achieving reasonably efficient optical coupling into the detectors. The precise design of this integrated transistor/photodetector/waveguide/coupler structure to achieve efficient coupling to the detectors with low parasitic capacitance is an interesting and substantial research challenge for nanotechnology and nanophotonics.

In this proposal, above the electrical wiring layers on the chip we add a photonics layer (see Fig. 11 (c) and (d)), such as a silicon photonics layer. This layer contains waveguides, optical couplers to the photodetectors, optical couplers to external beams (in free space or other guided wave structures like fibers), and optical output devices (modulators or lasers). Possibly this optical layer is hybrid-attached after separate fabrication on another temporary substrate.

This approach of putting an optical layer on top of the electrical wiring layers may allow some separation of the electrical and optical fabrication requirements. For example, optical waveguides work with lower loss with a relatively thick dielectric layer (e.g., microns) underneath the waveguides, but electronic processes typically do not use such thick layers. Additionally, it may be somewhat easier to manufacture sophisticated integrated photonic structures, such as those requiring advanced materials like quantum wells for modulator or laser structures, if we separate them from the electronic fabrication itself.

Functionally, putting this optical layer on top means we do not have to route optical waveguides in between wires inside the electrical wiring layers themselves; we only need to allow occasional transparent regions vertically in the wiring layers to pass light beams through to the detectors and/or local couplers and waveguides on the “transistor” layer. The use of this separate layer on the top also ensures that the entire area is available for optical waveguides, couplers, and output devices.

One disadvantage of putting the optical output devices in the photonic layer is that we will necessarily have some capacitance to connect to them. For a ~ 5 μm vertical “via” wiring through the electrical wiring layers, we should expect a capacitance of ~ 1 fF. That may be tolerable for an output device that itself might have 1 fJ of operating energy anyway, though it would be undesirable for the photodetectors, which is why we have put them here on the transistor layer in this example; moving the photodetectors up to the photonics layer might make integration easier, at some cost (~ 1 fF) in the input capacitance and required optical energies, though it would avoid the additional loss of optical coupling down to the transistor layer. For the output devices, the energy to charge and discharge this “via” capacitance is also not “magnified” substantially by the system, being essentially just an additive energy. (In contrast, increasing the required optical energy at the photodetector is likely essentially to scale up the entire energy of the system
63
Possibly we could presume less loss than this. This number, though, is one
estimated for real systems in current research demonstrations [19].

proportionately.) We are presuming here that the electrically-driven devices in the photonic layer are otherwise attached without additional substantial capacitance, however.

One concept as shown in Fig. 11 (e), (f) and (g) is that we would group input and output couplers in arrays. We could drive an entire chip optically with a single pulsed optical power source, for example delivered through a fiber from the central laser, as shown in Fig. 11 (h), and distributed to 1000’s of power input couplers using Dammann grating spot array generator optics [163], here presumed miniaturized to a millimeter scale. This optical power would then provide the input power to modulators through waveguides (and could also be used for clocking inputs to the chip).

The modulators would be driven electrically as in Fig. 11 (d), and the optical output from those modulators would be fed through waveguides to an array of free-space output signal couplers. From modulator arrays on other chips we would have free-space arrays of input signal beams into the chip, which would eventually be coupled to the photodetectors as in Fig. 11 (a), (b) and (c).

Signal arrays could be fed from chip to chip using free-space optics, such as the “free-space channel couplers” in Fig. 11 (h), which could be plastic or glass channels (or possibly even mostly empty space), with appropriate mirrors. Possibly the optics would use lenslet arrays, as in Fig. 8, directly above the optical input and output couplers. The optics might use only the lenslet arrays, mirrors, and an imaging lens, or could also include additional imaging optics. The lenslet array can also likely be aligned very precisely using some planar alignment technique to micron or even sub-micron accuracy to the couplers, reducing positional alignment tolerances in the rest of the free-space optics. (See Appendix F for a discussion of the optical design of such free-space coupler optics.)

The free-space connections do not need to go through solid channels, nor do they need to be only between adjacent chips. (Remember, too, that light beams in free space can pass right through one another, so crossing arrays of light beams pose no problem.) Also shown in Fig. 11 (e) are arrays of inputs and outputs for other free-space array connections, possibly to adjacent boards, for example. A silicon photonics platform is of course also capable of making fiber connections on and off the edge, which could be useful for making particularly long connections or connecting to external networks; we have sketched those also in Fig. 11 (g) and (h).

Note in an optical system like this that the optical loss in propagating from one device to another is essentially determined by coupler losses, not propagation loss. There is essentially no loss on propagating either through “free-space” or through optical fibers over the distances up to 10 m considered here.

There could be many reasons why we might consider fiber connections over the longer distances of meters inside a system, and indeed we have presumed some fiber-based distribution of optical power here. But we should note also that free-space connections of thousands of channels can work over longer distances even up to 10’s of meters if we choose to engineer them. Actual free-space connections over meters pose no basic problems for optics (see Appendix F for calculations of numbers of channels as limited by diffraction in such longer connections). Conventional imaging optics routinely handle millions of resolution elements. We might need some autofocus and autoalignment approaches, but since those would be done on entire optical systems, the amortized cost of those per channel would be relatively small. Note again that consumer cameras routinely operate with many millions of pixels, and with both autofocus and image stabilization performed optomechanically in the optical system.

Note, incidentally, that with our system here hypothetically requiring 2.4 mW of optical power for every 1 Tb/s of interconnect, it is quite conceivable to run the interconnects for an entire large system from one centralized laser source. A 1 W source, such as a single semiconductor laser amplified by an erbium fiber amplifier, would provide enough power for over 200 chips, and support a total interconnect bandwidth of over 200 Tb/s – a bandwidth that, incidentally, is comparable to the entire long-distance internet bandwidth.

X. CONCLUSIONS

A. Using optics to reduce the energy for handling information

In this paper, we have argued that energy consumption and dissipation are the dominant limit on our ability to continue to scale information processing and communications; if we do not reduce the energy per bit processed and/or communicated, we will not be able to continue the exponential growth in the amount of information we consume.

We have next argued that most of that energy is in the communication of information, especially over the distances within an information processing or switching machine. We have seen that it is difficult to reduce that energy if we stay with purely electrical approaches.

Progressively, we have then argued that, because of the different physics of optical communications compared to that of electrical wires, optics can reduce that communications energy. This potential reduction comes in two forms.

First, we can avoid the charging and discharging of lines that leads to the majority of the dissipation in electrical connections at short distances; we propose to do this by substituting optical interconnects, which have no such dissipation, for essentially all off-chip interconnects (and possibly some connections on chips). The technical challenge then becomes one of reducing the energies required to run the optoelectronic devices themselves. That challenge leads us to the need for attojoule optoelectronics, both in photodetection and in optical output devices like lasers, LEDs and modulators.

If we can eliminate most of the detector capacitance, down to levels ~ 100 aF or smaller, with such attojoule optoelectronic approaches, and integrate them directly on electronics, then we can largely also eliminate the substantial power dissipation of receiver amplifier circuits; we would then move to operational modalities that we call “receiverless” (no electronic receiver amplifier) or “near-receiverless” (only simple low-energy receiver amplifiers). (The receiver energy this eliminates is currently of the order of 100’s of fJ/bit or higher.)

Second, we can go on to propose the use of other features of optics, especially its abilities (i) to deliver very precise and predictable timing in volumes up to ~ 10 m in size and (ii) to offer very large numbers of channels, especially in free-space connections. As a result, we can eliminate other high-dissipation electronic circuits normally associated with interconnect and data links – circuits that currently can dissipate picojoules or more per bit; specifically, we argue we can eliminate line coding, CDR and SERDES circuits entirely.

The net result of these eliminations of line charging and of most or all of the circuitry commonly associated with longer links is that we can propose that we could make essentially all links within a system look like short on-chip interconnects, up to and beyond entire cabinets of electronics, both functionally and in their energy use.

A stretch goal for such an approach is a total energy of ~ 10 fJ/bit communicated, and we have sketched a “straw-man” system that arguably could work towards such a goal. Note that such a goal, if achieved, would correspond to 10 mW of total dissipation for each Tb/s of communication inside an entire system up to ~ 10 m in size. That energy per bit is therefore 2 to 3 orders of magnitude lower than current approaches at length scales from chip-to-chip interconnections to longer connections. Such an energy is even less than that of current electrical interconnects across a chip itself.

In the proposed “straw-man” approach, the optics can also operate at very high interconnect bandwidth densities. Particularly if we make the transition to free-space optics for some of the connections, we may be able to break the interconnect “byte-per-flop” limits that severely constrain architectures today.

With this example approach, we can see that we are substantially addressing all four goals originally set out in Section I for our attojoule optoelectronics interconnects.

B. Key research directions

This proposal is certainly speculative, but it is meant to be one that is physically realistic and could reasonably be engineered. It does not require the discovery of any new physical mechanisms beyond those we already understand and in materials we currently use. Indeed, part of our analysis shows that existing known mechanisms used in current devices and applications offer energies at least as low or lower than more exotic recent proposals such as 2D materials (see Appendix A). That is not to say we should not continue to explore novel material approaches, especially if they are somehow more convenient in operation or integration, for example, but we do not apparently fundamentally require either them or any other more fundamental breakthrough to meet the kinds of targets discussed here.

Note, incidentally, that we have focused here exclusively on the use of optics and electronics to reduce energy by solving problems of interconnects. We have not proposed optical or optoelectronic approaches to logic itself. We have addressed this point elsewhere [34], showing the challenges in such logic for any mainstream use⁶⁴; arguably, the case for more optics in interconnects is much stronger.

There are, however, various areas of technological research that will be very important if we are to work towards realizing the goals we set out for interconnects.

1) Nanoscale integration of photodetectors and electronics
Perhaps the most important direction and opportunity in nanotechnology required here is the intimate integration of photodetectors right beside or even on top of transistors (see, e.g., [169], [170]). Such an approach would seek to minimize capacitance, towards the range of 10’s of attofarads, while also combining good optical coupling into the detector, possibly including nanoresonator structures and/or nanometallic or plasmonic elements.

We note that, once we reach “receiverless” or “near-receiverless” operation, the overall operating energy of the system can scale down, largely in proportion, as this input capacitance is reduced and the optical coupling efficiency is increased. For maximum benefit, this photodetector integration should be directly within the fabrication of the logic technology or “transistor” layer; moving it to higher layers of the fabrication adds the capacitance of the resulting longer electrical connections between the photodetector and the transistor.

2) Low-loss mode coupling
The overall operating energy improves, essentially in proportion, as we reduce optical loss in the system. Most optical loss is in the couplers between one device or optical layer and another, not in the actual propagation of light within guides or free space. Optical coupling devices themselves generally are not lossy in the sense of having optical absorption. Rather, the losses could all be viewed as mode mismatch. Not all the incident light in its input mode (e.g., a free-space beam) is coupled into the output light in its output mode (e.g., a single-mode guide); the shape of the actual output beam does not match the shape of the desired output mode. (In this sense, all uncoupled or scattered light is merely light left in some other, undesired mode.) Such precise mode-conversion has been a problem in optics for some time.
64
One major challenge is that nearly all such optical proposals do not meet for all aspects such a technology if we were to supplant CMOS logic; this does
even the qualitative requirements for logic devices and systems [34]. With the not therefore seem a particularly promising direction with substantial and
techniques discussed here, we could, however, make new lower-energy unequivocal benefit. We are not arguing against research here on truly novel
versions of previous functionally successful devices [64], and we could even ideas; the promise, too, of some fully quantum operations for some possible
argue that we could now make such functionally viable optoelectronic devices quantum computing systems certainly remains a worthwhile long term goal for
operate with possibly only hundreds of attojoules. But, at that point we would fundamental research. But, we have argued here that we have a clear and
merely just be competitive with the transistor for logic operations. We would convincing case now for advancing and exploiting low-energy optoelectronics
also have to create other technologies such as dense local optical wiring. Now, to solve the problems of interconnects for all longer wires. Those problems have
we could conceive of some solutions there, such as nanometallic concentration existed for some time, with no apparent path to better solutions other than a
and waveguides. But, we would need very large numbers and very high yields change to such optics.
Recently, however, there have been substantial advances in techniques to allow arbitrary design of optical nanostructures [128], [129], [130], [131], [132]; such design, together with nano-scale fabrication techniques, could allow a new generation of low-loss couplers, in part because such nano-fabrication could allow the incorporation of the full design complexity needed to match precisely from one mode shape to another. Additionally, there are approaches to self-aligning couplers that could adjust themselves after fabrication [133], [134], [135]. Such low-loss coupling – from large beams to small beams, from free space to waveguides, from one guide to another – is both a critical requirement and a major opportunity for these emerging design techniques. Since there are likely many such mode conversion interfaces in the whole optical path, the research target here for a coupler is to move from loss of a few decibels to loss of a few percent.
3) Free-space micro-array optics and systems
Free-space array optics would allow very high densities of connections in and out of chips and modules, solving the bandwidth bottleneck, and enable us to save energy by eliminating much of the electronic circuitry of current links. Compact, dense, self-aligning, free-space systems are now quite feasible, and a broad range of micro-optical technology exists. Following on previous successful laboratory demonstrations of free-space digital systems, the research goal now would be to generate technology for arrays of 1000’s to 10,000’s of beams (i) with millimeter cross-sections and centimeter lengths for on-board or on-module connections, possibly in rigid and manufacturable structures and (ii) in self-aligning free-space array optics for board-to-board or even cabinet-to-cabinet connections.
4) Extending integrated optics technologies
We need to be able to make large numbers of optical devices, such as waveguides and beam couplers, ideally integrated with active optical devices, such as photodetectors, modulators, lasers and LEDs. An integration platform like silicon photonics [47], [48], [49], [50], [51], [52], [53], [54], [55], [56], [57], [58], [59], [60] gives a good basis, allowing large numbers of optical components and complex optical circuits.
A key research direction will involve augmenting such a platform with monolithic or heterogeneous integration of other materials or structures so we can reach energy and performance targets especially for output devices. Such additions could include
• III-V materials
• quantum well or other quantum-confined structures [171], [172] in III-Vs or germanium [41], [78]
• integration of materials other than silicon, either in monolithic form, including novel nanoscale integration approaches that can avoid problems with lattice mismatch [117], [119], [124], [173], [174], [175], or using heterogeneous integration of III-V device structures [50], [171], [172], [176], [177] or other materials such as organics [87], [109]
• technology for electrically connecting such optoelectronics onto (or into) electronics with negligible additional capacitance, such as some direct-bonding technique right on top of the chip wiring layer
• micro- and nano-mechanical technologies for tuning and adjustment of optical devices and circuits.
5) Low-energy output devices and their integration
Devices exploiting the relatively weak optical modulation mechanisms available in silicon have been engineered to a remarkable degree and their feasibility and challenges for systems have been deeply analyzed (see, e.g., [105]). Other microscopic mechanisms are much stronger, as we have discussed. For example, a hypothetical QCSE electroabsorption modulator using germanium [41], [78] or III-V quantum wells with a (300 nm)³ active volume could be an attractive approach. There are also many promising directions such as nanoneedle and nanocavity growth on silicon for lasers [54], [174] and LEDs (as well as photodetectors) [117], [119], [124], [174], [175] that could address integration issues.
The research goal here should be to exploit the stronger microscopic physics of such effects to achieve a sub-femtojoule device working over the entire C-band while eliminating the need for any post-fabrication trimming or active temperature stabilization. Any new device approach here would, however, have to have some credible path by which an integrated system could be made, with very large numbers of devices at high yields.

C. Final conclusions
We have taken a broad view here of the motivations and technological opportunities, from environmental limits on information processing and computing through to fundamental optics and quantum mechanical mechanisms, for using optics and optoelectronics to reduce energy in handling information. As we said earlier, this article cannot be a deep review of any topic; its main goal is rather to clarify research directions, questions, and opportunities.
We have considered novel and even radical approaches to complete systems; having such a complete system proposal is important because it enforces an intellectual honesty on our optimistic conclusions of real benefit – we cannot just push show-stopping difficulties “under the rug” in the hope that someone else will deal with them. Though we have proposed an entire platform example here, from the transistor level up to long fiber connections, it is just that – an “existence proof” example. There may be many other valid approaches.
Though we have identified many technological challenges that would need to be addressed to realize the full benefits envisaged here, a solution to any one of these challenges, such as better integration, lower energy devices, or lower loss coupling, will be useful on its own. Complete success in all aspects at once is not necessary for useful progress.
Overall, our conclusion here is strongly optimistic: optics offers real opportunities for substantial reduction in energy and improvements in performance in systems that handle information, and these opportunities should stimulate many exciting and worthwhile research and technology directions in optics and optoelectronics. Indeed, without optics, we may have no other solutions to eliminating much of the energy we use to handle information.
APPENDIX A – MICROSCOPIC MECHANISMS FOR OPTICAL MODULATION AND THEIR ENERGY REQUIREMENTS
In this Appendix, we will give a more detailed discussion and comparison of the energy requirements of various mechanisms currently understood for making optical modulators that could operate at GHz or higher rates.
Such mechanisms fall broadly into two categories: those that work by electrically-induced changes in optical absorption (i.e., electroabsorption), and those that work by electrically-induced changes in refractive index (i.e., electrorefraction).
D. Electroabsorption mechanisms and approaches
There are two main categories of electroabsorption mechanism: (1) absorption changes as a direct result of electric field in the material, and (2) absorption changes resulting from electrical control of carrier (i.e., electron and/or hole) density in the material.
1) Electric field mechanisms
A set of related mechanisms are found for electroabsorption with photon energies near the direct band-gap energy of a semiconductor, usually exploited for photon energies in the region just below that nominal band-gap energy (so at wavelengths longer than the bandgap wavelength). These are (i) the Franz-Keldysh effect (FKE) [178], [179], [180], [181], [182], [183], (ii) exciton broadening (bulk excitonic electroabsorption) [66], [181], [182], and (iii) the quantum-confined Stark effect (QCSE) [65], [66].
The FKE and exciton broadening are seen in bulk semiconductors. Exciton broadening is also seen in quantum well layered structures for applied electric fields parallel to the layers [66], and the QCSE is observed for applied electric fields perpendicular to the quantum well layers [65], [66]. The QCSE is also present in quantum wires and quantum dots when the field is applied along one of the confinement directions [183]. Fig. 12 compares experimental data for germanium⁶⁵ bulk FKE and for germanium quantum-well QCSE.

Fig. 12. Comparison of FKE (data after [184]) and QCSE (data after [82]) electroabsorption. The FKE data is taken in a SiGe diode structure with a ~2 μm thick depletion region with ~ 0.6% fractional Si content (so technically a Si₀.₀₀₆Ge₀.₉₉₄ alloy) so as to shift the absorption edge slightly to shorter wavelengths from that of pure Ge. The QCSE data is taken in a Ge/SiGe heterostructure diode with a ~220 nm thick depletion region containing 5 Ge quantum wells, each ~ 14 nm thick, with 18 nm Si₀.₁₉Ge₀.₈₁ barriers between them. The effective absorption coefficient for the QCSE is calculated using the total thickness of the wells plus barriers for the effective optical thickness of the structure, so an effective absorption coefficient of 314 cm⁻¹ in the figure is equivalent to ~0.1% probability of a photon being absorbed as it tries to pass through one quantum well from one side to the other. For the QCSE diode, fields are calculated from voltages by adding on a built-in field equivalent to 0.8 V across the 220 nm-thick depletion region; this built-in field, which would correspond to that in a homostructure diode with a ~0.8 eV bandgap energy (like the direct bandgap of Ge), is only an estimate because this is a heterostructure diode that contains contact regions with direct bandgaps larger than this, but at the same time there are lower, indirect gaps present from the Ge materials and possibly in the contact regions also.

Incidentally, it is not necessary that the lowest bandgap energy in a semiconductor is the direct gap in order to see such electroabsorption mechanisms. These electroabsorption effects can be seen at the direct gap even in materials that are themselves indirect. A good example here is germanium, which shows all these electroabsorptive effects at its direct gap energy in appropriate structures [41], [67], [68], [69], [70], [71], [72], [73], [74], [75], [76], [77], [78], [79], [80], [81], [82], [83], [185], as in Fig. 12.
Optical absorption across the direct gap in semiconductors is described in the simplest model as being between plane-wave “Bloch” states for electrons in the valence and the conduction band; this is a “non-excitonic” model. Though it is simple, and does describe some features, it is both qualitatively incorrect – it does not actually predict the spectral shape of the absorption – and quantitatively quite inaccurate – it substantially underestimates the strength of that absorption.
What is missing from this “plane wave” approach is that the actual final state is that of an electron-hole pair; because of their Coulomb (electrostatic) attraction, they are much more likely to be in the same place than is estimated based on the “plane wave” approach. In this electron-hole pair model, the probability that we will absorb a photon to create a pair in a given state is proportional to the probability that the electron and hole will be found in the same unit cell in the resulting state [186], [187]. There are both bound states (“excitons”) of these electron-hole pairs that appear just below the bandgap energy [186] as strong absorption lines, and also so-called “Sommerfeld” enhancement of the absorption above the bandgap energy (see, e.g., [188] for expressions for both aspects for 2D and 3D cases).
In many bulk semiconductors at room temperature, the exciton absorption peaks associated with the bound states are already so broadened by lifetime effects (such as ionization by optical phonons [189]) that they are often not clearly resolved; the excitonic effects are, however, still strongly affecting the shape and strength of the optical absorption spectrum. When we quantum-confine electrons and holes in semiconductors at sizes comparable to or smaller than the size of the lowest-energy (“1s”) exciton – so ~ 10 nm, for example – in one or more dimensions, we increase the probability of finding the electron and hole in the same place⁶⁶.

⁶⁵ Technically, this data is for a Si₀.₀₀₆Ge₀.₉₉₄ alloy.
⁶⁶ Somewhat surprisingly, confining in that one direction also leads to the exciton being smaller in the other two directions (see, e.g., the analysis by [188] for the 2D case), which further enhances excitonic effects.
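
The Fig. 12 caption equates an effective absorption coefficient of 314 cm⁻¹ with a ~0.1% probability of absorbing a photon in one pass through one quantum well. The short Python check below reproduces that figure using Beer-Lambert attenuation. It assumes that the thickness attributed to one well is one well plus one barrier (14 nm + 18 nm), which is our reading of the caption's "total thickness of the wells plus barriers"; the function and variable names are ours.

```python
import math

# Values quoted in the Fig. 12 caption
ALPHA_PER_CM = 314.0   # effective absorption coefficient, cm^-1
WELL_NM = 14.0         # Ge quantum well thickness, nm
BARRIER_NM = 18.0      # SiGe barrier thickness, nm


def absorption_probability(alpha_per_cm: float, thickness_nm: float) -> float:
    """Single-pass absorption probability 1 - exp(-alpha * d) (Beer-Lambert)."""
    thickness_cm = thickness_nm * 1e-7  # 1 nm = 1e-7 cm
    return 1.0 - math.exp(-alpha_per_cm * thickness_cm)


# Assumption: each well "owns" one well thickness plus one barrier thickness,
# because the effective coefficient is defined over wells plus barriers.
period_nm = WELL_NM + BARRIER_NM
p_per_well = absorption_probability(ALPHA_PER_CM, period_nm)

print(f"absorption probability per well: {100.0 * p_per_well:.3f} %")
```

With these inputs the product of absorption coefficient and thickness is about 1.0 × 10⁻³, consistent with the ~0.1% per well stated in the caption.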
As a result, excitonic effects are enhanced in such quantum-confined structures, often allowing the associated peaks to be clearly resolved at room temperature even when they are barely resolved in the equivalent bulk material [187], [189], [190]. Hence enhanced excitonic effects in optical absorption are a particularly important consequence and benefit of quantum confinement in nanostructures.
a) Franz-Keldysh and exciton broadening electroabsorption
If we neglect the excitonic effects for the moment and consider the effects of electric fields on the absorption near the direct bandgap energy in bulk semiconductors, then we calculate the FKE [178], [179], [180], [183], which leads to a “tail” on the absorption that extends into the bandgap region.
The electroabsorption very near to the direct bandgap energy in bulk semiconductor materials can be dominated by another effect – what we are calling exciton broadening electroabsorption; this is the lifetime broadening of the excitonic absorption lines resulting from the field-ionization of the bound excitonic states in the electric field [181], [182]. Nonetheless, the qualitative effect is similar once we are significantly below the energy of the main exciton absorption peak, with the appearance of an electrically-controllable absorption tail that extends smoothly below the bandgap energy to longer wavelengths.
Though the exciton broadening electroabsorption is quite sensitive to field in the region very near to the (main) exciton absorption peak, it is likely not usable there because it does not have enough absorption coefficient contrast, as required in criterion (5) above, so this general category of electroabsorption effects in bulk semiconductors near the bandgap energy is only usable at energies moderately below the bandgap energy where the “zero-field” background absorption is small (and where the mechanism is typically described and modelled as being the FKE even if there may be some excitonic broadening effects also present). This mechanism is exploited successfully for optical modulators (see, e.g., [185] for a recent example).
b) Quantum-confined Stark effect
In a quantum well structure, such as a ~ 10 nm thick layer of a narrower bandgap semiconductor sandwiched between layers of wider bandgap semiconductors, the allowed states in the quantum-confinement direction (the direction perpendicular to the layers) become quantized. If we neglect excitonic effects for the moment, the absorption between the resulting valence and conduction sub-bands would lead to an absorption spectrum that is a set of “steps” [187]. Excitonic effects are also strong in such quantum wells, however, and as a result we can clearly see strong excitonic peaks associated with each such step, even at room temperature.
If we apply an electric field in the direction perpendicular to the layers, we can shift the energies of the confined states inside the well, leading to energy shifts of the sub-bands, which in turn would lead to shifts in the “steps”. In particular, the lowest “step” would move to lower photon energies. Fig. 13 shows the quantum mechanics behind the majority of the shift in the absorption edge in an example calculation. With applied field, the lowest electron and highest hole confined states (which are the edges of the sub-bands) in the well move towards one another in energy, thereby moving the lowest absorption step to lower photon energies. We see in this typical quantum well that the separation between these states changes from 819 meV (≅ 1514 nm wavelength) at zero field to 799 meV (≅ 1552 nm wavelength) for 10⁵ V/cm (10 V/μm) applied perpendicular to this 12 nm thick layer, which would shift the absorption edge to lower energies by 20 meV (≅ 38 nm change in wavelength).

Fig. 13. Calculations of the electron (conduction band) and (heavy) hole (valence band) energies and wavefunctions for the edges of the sub-bands, numbered n_e (n_h) for the conduction (valence) sub-bands, in a 12 nm-thick In₀.₄₇Ga₀.₅₃As quantum well (the composition that lattice matches to InP). These calculations use the simplifying approximation of infinitely high potential barriers on either side, using the analytic model for “tilted” potential wells [84], at 0 V/cm and at 10⁵ V/cm (10 V/μm). The bandgap energy of the unconfined material is taken as 750 meV, with effective masses of 0.041 m_o (electron) and 0.46 m_o (heavy hole), where m_o is the free electron mass.

We see also that the wavefunctions for the electron (i.e., in the conduction band) and the hole (i.e., in the valence band) are distorted by the applied field. For these closest electron and hole levels, this distortion reduces the “overlap” integral between the wavefunctions, which leads to some loss in the corresponding “height” of the absorption step with field. This kind of behavior is clear in the QCSE spectra of Fig. 12 for the longest wavelength “step” in the absorption. In our discussion of the QCSE so far, we have neglected⁶⁷ excitonic effects. A key additional point, however, is that, unlike the behavior with bulk materials, the excitons are not rapidly field-ionized even for strong fields applied perpendicular to the layers; that is because the walls of the quantum well hold the exciton together. Hence, we see clear shifting of the absorption steps while retaining strong and relatively sharp excitonic peaks, which is the mechanism known as the quantum-confined Stark effect

⁶⁷ Formally, if we neglect the excitonic effects in the QCSE model, then the resulting behavior is essentially the quantum-confined version of the (non-excitonic) FKE mechanism. We can show that the electroabsorption spectrum of the (non-excitonic) quantum well electroabsorption would tend towards the (non-excitonic) FKE spectrum as we increased the width of the layer [194].
(QCSE) [65], [66]⁶⁸. These excitonic peaks are visible, for example, in the QCSE data of Fig. 12, where they are seen as the slight peaks in the various spectra (e.g., near 1420 nm in the QCSE spectrum at -0.1 V).
This mechanism can equivalently be regarded as a giant Stark shift of the exciton, and is formally equivalent to the electric field shift of the ground state of a hydrogen atom if we were able to confine it between two “walls” less than ~ 1 Å (0.1 nm) and apply an electric field of ~ 10 – 100 V/Å.
From a practical point of view, the QCSE offers an electroabsorption in which we can shift a relatively abrupt and strong (e.g., 100’s to 1000’s of cm⁻¹) absorption by large amounts (e.g., even as much as ~ 100 meV). The required electric fields are in the range of 10⁴ – 10⁵ V/cm (1 – 10 V/μm), which can be applied using reverse-biased diode structures, for example.
Comparing the QCSE to the FKE in similar materials, as in Fig. 12, we see first that both effects are capable of producing absorption coefficient changes ~ 100 cm⁻¹ to nearly ~ 1000 cm⁻¹ for photon energies in the region just below the bandgap energy (wavelengths longer than the bandgap wavelength). With the QCSE it is easier to get large contrasts in the absorption coefficient between the “on” and “off” states, which is an important criterion for devices. The abruptness of the QCSE absorption edge means that, unlike the case of the FKE, the device can be tuned by biasing so that the absorption edge is shifted close to the operating wavelength, and then the device can be operated by applying a small additional bias to shift the absorption edge just past the operating wavelength.
This level of electroabsorption can be exploited in waveguide structures that, for the case of the QCSE, can be similar to those used for semiconductor lasers; indeed, QCSE modulators are widely used today in optical telecommunications, where they are often integrated with semiconductor lasers.
The QCSE electroabsorption effects are also large enough to give strong modulation of light in micron-thick structures, allowing modulators that can operate directly on light propagating perpendicular to the surface either with (e.g., [75], [79]) or without resonators, or enabling particularly compact low-energy waveguide modulators. Indeed, a short (10 μm long) waveguide QCSE modulator, without any resonant cavity, has already shown sub-femtojoule operation [41], [78]. Such devices can be run with the ~ 1 V drive swing readily available from CMOS electronics [41], [78], [79], and test structures show the potential for total voltage ~ 1 V [82].
The QCSE may represent the strongest and most energy-efficient high-speed optical modulation mechanism available. Physics experiments confirm its operation to picosecond time scales [69], and the speed limit is likely sub-picosecond [191].
QCSE modulators can exploit other forms of quantum well structures, such as coupled wells [192], [193], which can offer some improved electroabsorption in specific cases [193]. Whether such coupled wells offer substantial benefits can depend on the abruptness of the optical absorption edge since their strongest effects correspond to “clearing out” a region of absorption as the coupling is turned on and off with field; if the edge is not abrupt, then the “cleared out” region may not have sufficient absorption contrast.
Such electric-field electroabsorption devices can have some temperature dependence because, like lasers, the bandgap energy does move with temperature, generally by ~ 0.3 – 0.5 meV/K. In the case of QCSE modulators, this may be less of a problem because the modulator can be voltage-tuned to compensate for temperature variations, and if we operate at high field, the absorption change may be sufficiently broad in wavelength range that no temperature compensation is necessary. For example, the QCSE electroabsorption in germanium quantum wells on silicon can be voltage tuned to work with good absorption coefficient contrast over ~ 125 nm of wavelength range [82], which corresponds to ~ 150 K temperature range given a measured 0.46 meV/K (0.84 nm/K in wavelength shift) [68]. Even with somewhat lower applied fields, as in [79], allowing ~ 500 cm⁻¹ absorption change at high absorption contrast over ~60 nm wavelength range would be sufficient for a 70 K operating temperature range.
Because such modulators can run well even when hot (e.g., 100 °C [69]), they can also be temperature tuned by heating, which is generally easier to achieve and more energy-efficient than cooling.
2) Carrier density mechanisms
a) Free-carrier plasma
For photon energies far below the bandgap in direct gap materials or for indirect materials with populations only in the “indirect” valleys, there are absorptive and refractive effects associated with “free” carrier densities N_e and N_h (conventionally given in units of “per cubic centimeter” – cm⁻³) in the conduction and valence bands respectively. For silicon, the absorption coefficient (in cm⁻¹) for such “free-carrier plasmas” at an example (free-space) wavelength of 1.55 μm is given approximately by [106]

$$\alpha_{fc} \approx 8.5 \times 10^{-18} N_e + 6.0 \times 10^{-18} N_h \qquad (9)$$

So, a carrier density of 10¹⁸ cm⁻³ leads to absorption coefficients of ~ 10 cm⁻¹.
Such models are approximately justifiable from a Drude free-carrier plasma approach, though the situation with holes is more complicated, in part because of absorption between different valence bands.
For a typical III-V material, free-carrier absorption associated with holes is thought to dominate, for example in the operation of lasers [195], and the hole absorption numbers are comparable to those in silicon (e.g., 13 cm⁻¹ at 10¹⁸ cm⁻³ in InGaAsP near its bandgap wavelength).
Such absorption coefficients are essentially too small to be attractive for compact electroabsorption modulators, but are large enough to be a nuisance in giving background absorption

⁶⁸ There is a small additional shift of the binding energy of the exciton itself, though this is relatively small compared to the shifts of the electron and hole “single-particle” levels [66], [68].
in high-Q structures. The change in absorption with carrier density can also influence the behavior of refractive modulators based on high-Q structures.
b) Band-filling mechanisms
As we add electrons or holes to a semiconductor, in one simple (non-excitonic) view, we can start to fill up the bands, and that band filling blocks the possibility of further absorption into the states that are already occupied, as sketched in Fig. 14. In direct bandgap materials, such as many III-V semiconductors, the electron effective mass can be quite small, and hence the density of available electron states per unit energy is also quite small. As a result, with moderately large densities of carriers, such as ~ 10¹⁸ cm⁻³, a “pool” of electrons collects in the bottom of the conduction band, effectively blocking absorption over a substantial spectral range.

Fig. 14. On the left we see a simple picture of valence and conduction bands in a direct-gap semiconductor. On the left, the valence band is full of electrons, and the conduction band is empty. A photon of energy just above the bandgap energy can be absorbed to take an electron from the valence band to the conduction band. If, however, we add a large number of electrons to the conduction band, they will fill the lowest states in a kind of “pool” of electrons that collects at the bottom of the conduction band. Now the absorption of the photon is blocked because the final state for the electron is already occupied.

The detailed physics of such mechanisms is somewhat more complicated than this non-excitonic description suggests. The presence of free carriers also effectively screens the interaction between the electron and hole in excitons, so there is an additional benefit from the disappearance of the excitonic peak from such screening. There is also some bandgap renormalization – a shrinkage of the bandgap with increasing carrier density – that partly counteracts the band filling. See, e.g., [196] for a discussion of such mechanisms. A general term to cover the resulting changes in absorption spectrum is “phase-space absorption quenching” [197], though the more informal and less accurate “band-filling” is more common. Band filling is also sometimes called Pauli blocking or Burstein-Moss shift.
A good example of band filling is given by a quantum well in a field effect transistor structure [196], [197]. Relatively complete quenching of the absorption with α_abs/α_trans ≥ 2 is possible over at least 40 meV in photon energy range near 0.8 eV at room temperature [196] with sheet carrier concentrations just under 10¹² cm⁻² in a 10 nm thick quantum well (an effective volume density therefore just under 10¹⁸ cm⁻³). Such quenching corresponds to ~ 1% change in the transmission of light through a single quantum well [196], [197]. If we presume it takes ~ 1 eV of energy for each added electron in the structure (consistent with bias voltages ~ 1 V), then the energy density to run such a device is comparable to that of the light emitter device in Table I (e.g., 160 fJ/μm³). The absorption changes here are somewhat larger than in the QCSE, so the devices could be somewhat smaller even than the QCSE devices. Hence, devices made using this mechanism would lie somewhere between the light-emitter and modulator numbers in Table I.
Optical modulators based on band-filling in graphene have been proposed – see, e.g., [198], [199], [200], [201]. Based on current understanding, however, the required operating energies for these would be considerably higher than those for quantum wells, for example. We will discuss the comparison of 2D materials and quantum wells below.
E. Electrorefraction mechanisms and approaches
There are several different mechanisms for changing the refractive index of a material under some kind of electrical control. We will consider two basic categories: (i) electric-field mechanisms that work as a result of some microscopic polarization of the electron wavefunctions; and (ii) band-filling mechanisms that work as a result of the change of carrier (electron and/or hole) density in the material. There are other ways of changing refractive index, such as heating (which is quite a useful mechanism in silicon photonics for tuning and slow switching), molecular reorientation (as in liquid crystals), and change of physical state (as in phase change materials like GST), but we will not consider these further here, mostly because they will not generally be fast enough for modulating interconnect or communications signals.
All these refractive mechanisms can be understood through the Kramers-Kronig relations (see, e.g., [202] for a classical discussion and [84] for the relation to quantum mechanical approaches) as resulting from changes in the optical absorption spectrum. Indeed, these relations show that any change in absorption at any wavelength will in general lead to changes in refractive index at all other wavelengths (and vice versa).
Classic electrorefraction mechanisms such as the Pockels effect and the Kerr effect are not usually described in terms of absorption changes, in part because these mechanisms are typically employed in a spectral region far from the wavelengths where any absorption changes are taking place (the absorption changes may be at very short wavelengths). These mechanisms, as a result, are in practice generally not resonant and vary little with wavelength.
One mechanism associated with free carriers is a result of the plasmon absorption peak that results from free carrier densities; in semiconductors at normal carrier densities, that plasmon absorption is at long, far infrared wavelengths, and a direct calculation using a Drude model for the plasma behavior can be useful, at least for electrons.
Other mechanisms like refractive index changes from band filling and from electroabsorption near to the bandgap energy are generally best understood and calculated working directly from the known changes in absorption spectrum near the band gap energy and using the Kramers-Kronig relations explicitly.
When we are working at photon energies or wavelengths close to where the major absorption changes are occurring, such
Kramers-Kronig calculations will typically show changes in the real and imaginary parts of the dielectric constant that are of comparable magnitude. (The real part is responsible for refractive effects and the imaginary part for absorptive effects.) See, for example, the dielectric constant or susceptibility near a typical atomic absorption line to understand this point [84]. But the absorption itself in such regions is usually too large to make much direct use of such large refractive changes because of our criterion (6). Hence we typically need to move to a spectral region where the absorption and/or induced absorption are lower, which means we also get lower refractive index changes. As a result, for a given such resonant mechanism, refractive devices tend to have to be longer, and hence have lower energy efficiency, than the corresponding absorptive devices.

1) Electric field mechanisms – Pockels effect

The Pockels effect is a linear change of refractive index with applied electric field, and is an example of a second-order nonlinear optical effect (sometimes described in terms of a coefficient $\chi^{(2)}$). Since the sign of the refractive index change would obviously therefore be reversed if we reversed the direction of the electric field, any material that shows a Pockels effect must look different in two opposite directions. A classic Pockels effect material like lithium niobate, which has a strong Pockels effect, has such a property, and lithium niobate modulators are extensively used in telecommunications. III-V materials like GaAs have potentially usable Pockels effect for electric fields in certain directions. Silicon, however, because it does not have the right symmetry properties, does not show a Pockels effect.

If we strongly strain silicon, such as by depositing layers of a material like SiN on it under appropriate conditions, it then acquires the necessary asymmetry. Such strained silicon [203], [204] can show refractive index changes up to $\Delta n \sim 3.5 \times 10^{-5}$ with effective applied electric fields $\approx 5 \times 10^{3}$ V/cm. The corresponding effective electrooptic coefficient $r_{33} \approx 2.2$ pm/V can be comparable to that of III-V materials, though it is about an order of magnitude smaller than that in lithium niobate, which has $r_{33} \approx 33$ pm/V [204]. One other current approach is to try to hybridize lithium niobate on silicon [205], [206] for such electrorefractive modulators.

Organic materials can have larger electro-optic coefficients of $r_{33} \approx 170$ pm/V [207], and they can be successfully exploited to demonstrate relatively low energies in optical modulators [87], [88], [109]; for these demonstrations, using a plasmonic waveguide with a 90 nm gap (and electrode spacing) and exploiting additional field concentration effects from the slow group velocity in the guide, this work shows a 10 μm long device with ±3 V drive, on an estimated capacitance of 2.8 fF, for an energy of ~ 25 fJ/bit.

One interesting point about Pockels effect devices is that, in principle, there is no specific minimum energy required to run them, even without resonators. To understand this, suppose we decide to double the length of some Pockels effect device; in that case, we can get the same path length change with half the electric field. But that means we only need ¼ as much electrostatic energy density, and hence half the energy overall. Equivalently, we may have doubled the capacitance C by doubling the length, but we have halved the voltage V, hence halving the resulting $(1/2)CV^2$ operating energy. In some waveguide device, there is no specific limit to how low the energy can go if we can make the waveguide arbitrarily long. In practice, however, Pockels effects are sufficiently weak that the length of the waveguide is set by other practical considerations, such as waveguide loss or other practical limits on length, and devices like that of [88] may well represent the limits of low-energy operation for known materials in devices without resonators.

It is possible to make asymmetric quantum well structures (e.g., [193]), which would technically give Pockels effects in refractive index, though it is possibly simpler just to regard those as variants of the QCSE electrorefraction.

2) Electric field mechanisms – Kerr and QCSE

The Kerr effect is a quadratic variation of refractive index with electric field, and is technically a third-order nonlinear optical effect (sometimes described in terms of a coefficient $\chi^{(3)}$). No particular material symmetry is required for the Kerr effect, and it will exist in principle in essentially any material. Because it is third-order, however, at least for non-resonant mechanisms, it is generally weak, and therefore not of great interest for low-energy modulators in conventional materials.

The electroabsorption mechanisms discussed above all have electrorefraction associated with them, and that electrorefraction can be calculated quite effectively based on the Kramers-Kronig relations, usually from empirical absorption spectra. If the change in the absorption coefficient spectrum when we apply the field is $\Delta\alpha(\omega)$, then in practice, we can deduce the change $\Delta n(\omega)$ in refractive index at some (angular) frequency $\omega = 2\pi f$ (where f is the conventional frequency in cycles per second) using the integral [189]

$$\Delta n(\omega) = \frac{c}{\pi}\, P \int_0^{\infty} \frac{\Delta\alpha(\omega')}{\omega'^2 - \omega^2}\, d\omega' \qquad (10)$$

Here c is the velocity of light in free space. The “P” here means to take the principal value, which means technically we have to avoid the singularity at $\omega = \omega'$. The integrand just on the two sides of the singularity will actually cancel out so there is no actual divergence in the resulting integral.⁶⁹

⁶⁹ One practical way to handle this numerically is to add a small positive quantity δ to the denominator when performing the integral in Eq. (10), decreasing the value of δ until it makes no further significant difference to the result in some wavelength range of interest.

Writing $\Delta\omega = \omega' - \omega$, we can rewrite the resonant denominator as $\omega'^2 - \omega^2 = (\omega' + \omega)\Delta\omega$. Hence a change in absorption at one frequency $\omega'$ gives rise to a change in refractive index that falls off approximately as $1/\Delta\omega$ as we move away in frequency.

For the resulting refractive index changes induced by the QCSE, see, for example, the calculations in [79], [208]. In the vicinity of the exciton resonance itself, the resulting index
changes are quite large, in the range of $\Delta n \approx 0.01$ to 0.04 [79], [208]. In that region, however, it is difficult to satisfy the criterion (6) for refractive devices because the absorption is too high. It is worth noting, however, that a “hybrid” resonator modulator using both electroabsorptive effects combined with simultaneous electrorefractive shifts of the cavity resonance can be quite effective in this region, with the electrorefractive effects significantly improving the performance of the modulator [79].

If we want to make a more purely electrorefractive modulator, we need to move to photon energies somewhat below the bandgap energy where the background absorption is smaller [79], [208]. This is quite a viable strategy for electrorefractive devices based on the QCSE, which then are quite competitive with, say, lithium niobate approaches [209], [210]. [209] shows a switching device operating with a 675 μm long active region and 2.5 V drive swing, in a device operating with photon energies significantly below the bandgap energy, and made to satisfy the additional design constraint of polarization-insensitive operation.

The data and calculations of [79] suggest a non-resonator electrorefractive device with germanium quantum wells might be possible at 1.55 μm wavelength, with a background absorption of $\approx 30$ cm$^{-1}$ [77] and an index change $\Delta n \approx 2.3 \times 10^{-3}$ (satisfying condition (6)) at an operating field $\approx 10^{5}$ V/cm and a length ~ 330 μm. (Note, incidentally, that the indirect absorption tail in germanium [77] is generally not strong enough to preclude such refractive modulators.) In a hypothetical 200 nm × 300 nm waveguide, the operating energy would be $\approx 100$ fJ with a drive voltage swing of ~ 2 V. Since there is no optical field concentration in such a hypothetical device, we can see that the basic energy requirements of such QCSE electrorefractive mechanisms are comparable to the lowest-energy demonstrated Pockels-effect devices [87], [88], [109], which have significant optical field concentration from nanometallic waveguides and group velocity effects.

3) Carrier density mechanisms

a) Free-carrier plasma

The refractive index change in silicon from the presence of free carrier densities at an example (free-space) wavelength of 1.55 μm is given by [106]

$$\Delta n_{fc} \approx -\left[ 8.8 \times 10^{-22} N_e + 8.5 \times 10^{-18} (N_h)^{0.8} \right] \qquad (11)$$

For a representative carrier concentration of $10^{18}$ cm$^{-3}$ electrons or holes, which corresponds to moderately strong doping or carrier injection, we would have changes of refractive index of ~ $-8.8 \times 10^{-4}$ for electrons and ~ $-2.1 \times 10^{-3}$ for holes. Devices based on this mechanism have been extensively researched (see, e.g., the reviews of silicon optical modulators [211] and of silicon photonics generally [49]). Simple Mach-Zehnder devices based on this approach tend to have energies in the low picojoule range [104]. In addition to the use of high-Q resonators, such as rings, other approaches, such as photonic crystal waveguides, can allow the devices to be shortened, reducing operating energies.

The lowest energies demonstrated may be in microdisk resonators [105], a version of the ring-resonator approach. With a Q-factor of ~ 10,000, ~ 1 fJ/bit can be obtained in a 4.8 μm diameter device. Some degree of tuning is possible without use of thermal mechanisms, and system-level choice of devices at run-time can avoid other thermal tuning at some cost of electronic energy dissipation. Overall, with additional feedback and control electronics, $\approx 10$ fJ/bit operating energy is projected for such a device used in conjunction with ~ 10 nm CMOS electronics; this energy is dominated by the monitor receiver energy.

b) Band-filling mechanisms

As we approach the bandgap energy, the changes in refractive index from band filling start to dominate over the simple free-carrier refractive effects discussed above. These band-filling effects can be much stronger.

At low carrier densities in direct gap III-V materials, neglecting excitonic effects, [212] gives

$$\Delta n \approx -1.7 \times 10^{-17}\, \frac{N_e}{(\hbar\omega_{eV})^2\, T}\, J\!\left( \frac{\hbar\omega - E_G}{k_B T} \right) \qquad (12)$$

where the electron density $N_e$ is in cm$^{-3}$, $k_B$ is Boltzmann’s constant, $E_G$ is the bandgap energy, $\hbar\omega$ is the photon energy (necessarily in eV in the denominator expression), T is the temperature in kelvin, and $J(\varepsilon)$ is a resonant function that is ~ 0.5 at a photon energy an amount $\Delta E \approx k_B T$ below the bandgap energy, and falling off with photon energy approximately $\propto 1/\Delta E$ as the separation $\Delta E$ below the bandgap energy increases. So for a photon energy of ~ 0.8 eV (corresponding to ~ 1.5 μm wavelength), at room temperature this expression gives, in magnitude,

$$|\Delta n| \approx 4 \times 10^{-20} N_e \qquad (13)$$

at about $\Delta E \approx k_B T$ below the bandgap energy.

[213] estimate a modal index change of -0.0005 in increasing the carrier sheet concentration from $10^{12}$ cm$^{-2}$ to $2 \times 10^{12}$ cm$^{-2}$ in a 10 nm thick InGaAs quantum well with a Γ factor of 0.03. Hence, the index change is equivalent to ~ 0.0005/0.03 = 0.017. Now, $10^{12}$ cm$^{-2}$ is equivalent to a volume density of $10^{18}$ cm$^{-3}$, so the index change here is, in magnitude,

$$|\Delta n| \approx 1.7 \times 10^{-20} N_e \qquad (14)$$

[187] also estimates $|\Delta n| \approx 4 \times 10^{-20} N_e$ in GaAs about $k_B T$ below the absorption edge at low densities in a quantum well. Note here we are attributing this index change to the electron density; because of its low effective mass and the resulting low density of states for electrons, the filling of the conduction band is largely responsible for the band-filling effect that is behind this index change. Note also these estimates of the band-filling index change are about 10 times larger than the silicon free-carrier index change in (11).

4) General conclusions on electrorefractive modulators

Electrorefractive effects can certainly be a viable choice for low energy modulators, though the devices will generally have
somewhat higher energies than the best electroabsorption devices. For those devices based on the refractive consequences of changes in absorption spectra near the bandgap energy (so, band-filling and QCSE devices), for the same operating energy density (e.g., $10^{18}$ cm$^{-3}$ carrier density or $10^{5}$ V/cm operating field), the electrorefractive devices generally have to be longer (e.g., 100’s of microns long without resonators) to work than the corresponding electroabsorptive device (e.g., microns long). The resulting operating energies for such electrorefractive devices are therefore going to be correspondingly larger than the electroabsorptive devices. So we might expect QCSE electrorefractive devices to be more comparable to those of light emitters in Table I, and band-filling electrorefractive devices to have higher energy than the light-emitter numbers in Table I. Devices based on the best conventional Pockels-effect materials, such as the organic materials of [88], could have operating energies comparable to those of the QCSE electrorefractive devices in comparable structures, but still significantly larger than the best electroabsorption modulators [41], [78].

For silicon devices based on free-carrier plasma effects, for the same energy densities, devices without resonators would need to be ~ 10 times longer than the band-filling or QCSE electrorefractive devices, with correspondingly larger energies of operation (hence the picojoule energies [104] of simple Mach-Zehnder modulators in silicon). Hence, low energy silicon modulators have to use large amounts of optical energy concentration to reach low operating energies (e.g., Q-factors of 1000’s or higher).

In general, though electrorefractive devices remain an interesting option, it is harder to scale them down into deeply sub-femtojoule operating energies without substantial optical field concentration, and the silicon free-carrier mechanism may already be operating at close to the lowest possible energies in recent impressive demonstrations [105].

F. Comparison of quantum wells and 2D materials

There has been considerable recent interest in 2D materials like graphene or MoS2 for their potential in optics [214]. One often-cited attribute is that graphene, a material in the form of a sheet that is only one atom thick, can have an absorbance (the probability that a photon would be absorbed in passing through the sheet)

$$A \approx \pi \alpha_{fs} = \frac{e^2}{4 \varepsilon_o \hbar c} \approx 2.3\% \qquad (15)$$

which raises many interesting questions and possibilities for optical and optoelectronic devices.

This absorption is particularly broad band, and we can expect many novel possibilities for integration in which such a layered material can be conveniently integrated with other electronic and optical structures. Optical absorption modulators based on band filling have been proposed and demonstrated [198], [199], [200], [201]; [201] shows ~ 1 pJ/bit operation in a 40 μm-long device integrated with silicon technology, for example, which is competitive with silicon Mach-Zehnder modulator [104] approaches.

Our main interest here is in the possibilities of low energy devices. To understand the energy requirements, we can usefully compare these 2D materials with a quantum well structure, which is itself already in many ways a 2D material. First, we note that an expression similar to (15) can also apply to quantum wells. If, for example, we take a simplified model of direct gap optical absorption in semiconductors (neglecting excitonic effects) [84], use a simple two-band k.p model for the semiconductor [84], and use the 2D density of states (as appropriate for a quantum well), then we derive the same expression as (15) for absorption just above the bandgap energy, with the only difference being that the result is divided by the refractive index of the material in which the quantum well is embedded. (That same factor would likely apply also to a graphene layer embedded in another material since it just comes from the electromagnetics of such a problem.)

A quantum well empty of carriers shows strong excitonic enhancement of absorption near the bandgap energy. Graphene does not show corresponding excitonic effects in the infrared or visible for two reasons: first, we are not operating near to a bandgap energy; and, second, the high carrier densities we need to use in devices when shifting the Fermi level to coincide appropriately with the operating photon energy would strongly screen any excitonic effects. In quantum wells, likely at least partially as a result of their excitonic enhancement, even with the reduction in absorption from the refractive index, experimentally, a single quantum well can show a measured > 2% absorption near its bandgap energy ([196] shows > 4% relative change in transmission in a double pass through a single quantum well as it is filled with sufficient carriers for band filling). Hence, a quantum well can have similar absorption to a single graphene sheet.

1) Band-filling modulation energies

Both the quantum well and graphene show absorption modulation by band filling, but the sheet carrier density required in graphene is much higher. To make the graphene transparent at a photon energy of 0.8 eV, we would need to fill the conduction (or valence) band up to a Fermi energy $E_F$ of 0.4 eV. In graphene, using the standard expression $E_F = \hbar v_F \sqrt{\pi n_e}$ where $n_e$ is the sheet (i.e., per unit area) electron (or hole) density and the Fermi velocity $v_F \approx 10^{6}$ m/s [214], we would require $n_e \sim 1.2 \times 10^{13}$ cm$^{-2}$, which is significantly higher than the $< 10^{12}$ cm$^{-2}$ for the quantum well structure, as discussed above.

Possibly a fairer comparison is to ask by how much we would have to increase the sheet carrier density in the graphene for modulation, compared to some starting concentration. With sufficient carrier density to shift the transparency edge to ~ 0.8 eV photon energy, the edge of the absorption spectrum of graphene has a width of ~ 0.15 eV for a factor of 2 change in absorption [214]. Moving $E_F$ from 0.3625 eV to 0.4375 eV to move the transparency edge by ~ 0.15 eV requires an additional sheet carrier density of ~ $4.4 \times 10^{12}$ cm$^{-2}$, which is still ~ 5 times the required sheet density in the quantum well case.

Graphene does have the significant qualitative feature that the
precise operating wavelength can be set as necessary over a very wide range, just by changing the bias. Nonetheless, this mechanism in graphene does not offer lower energies than the quantum well approach, and its operating energies would lie somewhat above those shown for the light emitter in Table I.

2) Electroabsorption mechanism

It does not currently appear that 2D materials like graphene or MoS2 offer useful electric-field-driven electroabsorptive effects for modulators. Graphene itself does not have a bandgap that would allow the excitonic and band-edge electroabsorption mechanisms, at least in the visible or near-infrared. Single layer MoS2 does have a direct bandgap and strong excitonic effects [215]. [216] shows QCSE in MoS2 with fields > MV/cm, corresponding to electron-hole pairs effectively confined within each layer of MoS2. Though shifts of up to 16 meV are observed here, even with these large fields, these shifts are not apparently large compared to the exciton linewidth [217]; hence, they may not be particularly useful for optical modulators, because the background absorption is quite an important parameter as in criterion (5) above.

In effect, MoS2 is arguably too thin for good QCSE. In quantum well materials there is effectively an optimum thickness for QCSE electroabsorption, which is typically ~ 10 nm. If the layer is thicker, the QCSE shifts are larger, but the absorption strength of the shifted absorption steps falls off too quickly with field (because of the separation of the electron and hole states to opposite sides of the well, decreasing the overlap of their wavefunctions that is necessary for optical absorption). If the layer is too thin, the quantum confinement energies become larger and the wavefunctions are too difficult to perturb, requiring larger fields. Also, often such thin layers have larger broadening of the absorption edge for any of a number of different reasons, effectively eliminating the necessary absorption coefficient contrast between “absorbing” and “non-absorbing” states as required by criterion (5).

The conclusion here is that, though there may be some viable and interesting prospects for modulators using 2D materials, and these may have some qualitative advantages, they currently do not appear to offer any basic energy advantage over structures like quantum wells, and may actually require larger operating energies. Possibly other such materials not yet investigated for optoelectronic device use may offer additional opportunities. We note, for example, that the related layered material WS2 [218] does show very strong excitonic effects, with a particularly strong and clearly resolved peak.

APPENDIX B – OPTICAL CONCENTRATION AND USE OF RESONATORS

G. Optical concentration factor

Here we briefly discuss the relation between our concept of an optical concentration factor γ and various other terms used for concentrated electromagnetic fields. Formal definitions of finesse F, quality factor Q, and Purcell enhancement factor can be found in standard references, so here we will concentrate on an informal approach emphasizing the physical meanings.

For electromagnetic fields at the resonance frequency of a cavity, Q can be thought of as

$$Q = 2\pi \times \left( \frac{\text{energy stored within the cavity}}{\text{energy lost during one cycle of oscillation}} \right) \qquad (16)$$

and cavity finesse F can loosely be considered either as

$$F = 2\pi \times \left( \frac{\text{energy stored within the cavity}}{\text{energy lost during one cavity round trip}} \right) \qquad (17)$$

or equivalently as

$$F \approx 2\pi \times \left( \text{number of cavity round trips a photon makes before being lost} \right) \qquad (18)$$

at least for high-finesse cavities.

For both finesse F and quality factor Q, the loss in question can be from absorption, scattering, escape through the mirrors, or any combination of these.

From our statement Eq. (18) above, instead of a photon making just one pass through the material in the cavity, it will now make $F/\pi$ passes (note that one round trip corresponds to two passes through the material), so the average energy density in the cavity is magnified by this amount. If the optical concentration factor in some propagating mode was originally $\gamma_o$, then on adding some cavity of finesse F for that mode, the new optical concentration factor is

$$\gamma = \gamma_o \frac{F}{\pi} \qquad (19)$$

Consider a cavity of length L in which the only loss mechanism is the transmission of light through mirrors, with (intensity) reflectivities R at each of the two ends of the cavity. The probability that the photon leaves the cavity on hitting one of the mirrors is $1 - R$, which will be a small number for high-reflectivity mirrors. So the probability that the photon is lost to the cavity in a round trip is approximately the sum of these small probabilities for the two mirrors, giving a probability of loss per round trip of $2(1 - R)$, and therefore an average number of round trips before being lost of $1/[2(1 - R)]$. So, we arrive at the expression for such a cavity

$$F \approx \pi / (1 - R) \qquad (20)$$

and from Eqs. (19) and (20), the optical concentration factor is

$$\gamma \approx \frac{\gamma_o}{1 - R} \qquad (21)$$

The relation between Q and F for high-finesse cavities can be stated as

$$Q = \frac{2L}{\lambda_n} F \qquad (22)$$

where $\lambda_n$ is the wavelength inside the material. For a refractive index n, and a free-space wavelength λ, $\lambda_n = \lambda / n$. So Q is larger than F by a factor that is the length of the cavity in half-wavelengths in the material. We can see this relation also from Eqs. (16) and (17). Light propagates one wavelength in the material (i.e., $\lambda_n$) in one cycle. It therefore requires $2L/\lambda_n$ cycles for a round trip; to get to F from Q, we need to divide by $2L/\lambda_n$.
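As an illustrative sketch only (it is not part of the original analysis), the high-finesse relations in Eqs. (18) to (22) can be evaluated numerically for a two-mirror cavity whose only loss is mirror transmission. The function name `cavity_figures` and the example values (R = 0.99, L = 5 μm, λ = 1.55 μm, n = 3.5, γ_o = 1) are assumptions chosen purely for illustration.

```python
import math


def cavity_figures(reflectivity, length, free_space_wavelength, index, gamma_0=1.0):
    """High-finesse estimates for a cavity whose only loss is mirror transmission.

    Names and example values are illustrative assumptions; lengths may be in any
    consistent unit. The relations follow Eqs. (18)-(22) of the text.
    """
    loss_per_round_trip = 2.0 * (1.0 - reflectivity)  # two mirrors, each losing 1 - R
    round_trips = 1.0 / loss_per_round_trip           # mean round trips before loss
    finesse = 2.0 * math.pi * round_trips             # Eq. (18); equals pi / (1 - R), Eq. (20)
    gamma = gamma_0 * finesse / math.pi               # Eq. (19)
    wavelength_in_material = free_space_wavelength / index
    q_factor = (2.0 * length / wavelength_in_material) * finesse  # Eq. (22)
    return finesse, gamma, q_factor


# Hypothetical example: R = 0.99, 5 um long cavity, 1.55 um light, index 3.5.
finesse, gamma, q_factor = cavity_figures(0.99, 5.0, 1.55, 3.5)
print(f"F = {finesse:.0f}, gamma = {gamma:.0f}, Q = {q_factor:.0f}")
```

With these assumed values the finesse is about 314 and the concentration factor about 100, while Q is larger than F by the cavity length in half-wavelengths (here about 22.6), which illustrates why Q and F should not be used interchangeably.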
We see from Eq. (19), incidentally, that finesse F rather than the quality factor Q is a more direct measure of the increase of optical concentration factor resulting from the use of a cavity.

The Purcell enhancement factor $F_P$ is typically defined in terms of the ratio $Q / V_{\lambda_n}$ where V is the cavity volume expressed in units of $\lambda_n^3$, in which case the definition is

$$F_P \equiv \frac{3}{4\pi^2} \frac{Q}{V_{\lambda_n}} \qquad (23)$$

Substituting from Eq. (22)

$$F_P \equiv \frac{3}{4\pi^2} \frac{2L}{\lambda_n} \frac{F}{V_{\lambda_n}} = \frac{3}{2\pi^2} \frac{F}{A_{\lambda_n}} \qquad (24)$$

where $A_{\lambda_n}$ is the cross-sectional area of the cavity in square wavelengths. A guide of cross-sectional area $A_{\lambda_n}$ without a resonator would have a field concentration factor $\gamma_o = 1 / A_{\lambda_n}$. So, using Eqs. (19) and (24), we have, for some resonator structure,

$$F_P = \frac{3}{2\pi} \gamma \qquad (25)$$

Hence, for resonator structures, the concept of Purcell enhancement factor $F_P$ and our optical concentration factor γ are essentially the same, differing only by a numerical factor $3/2\pi \approx 0.477$. Equivalently, Purcell enhancement factor is effectively defined for a somewhat smaller cross-sectional area than our reference structure, e.g., a square cross-section of area $(3/2\pi)\lambda_n^2$, or a circle of radius $\sqrt{3/2\pi^2}\,\lambda_n$, for example, instead of the square $\lambda_n^2$ reference cross-section we use for γ.

One could argue that we should just use the Purcell factor rather than introducing our optical concentration factor; in response, we would argue that our factor is more directly intuitive and applies to a wider range of structures, not being restricted to resonators.

The term “local density of states” is sometimes used to cover broader cases that do not necessarily involve resonators, but it is arguably a deeply confusing and unfortunate terminology,⁷⁰ especially for situations that do not involve resonators, so we avoid it. Essentially, the ratio of the local density of states to the density of states (modes) in free space would correspond loosely to our optical concentration factor γ, however.

⁷⁰ In quantum mechanics, as in Fermi’s Golden rule (see, e.g., [84]), the transition rate for a process like optical absorption or emission can be proportional to the square, $|\mu|^2$, of a matrix element between initial and final states, and to the density ρ of available final states. One view of resonators is to say that they concentrate the optical density of states by some factor, and that concentration therefore enhances the transition rate; and this is a common view in discussing Purcell enhancement (introduced in Purcell’s original description [219]). However, if we consider a resonator in space, or inside some large box, the resonator has almost no effect on the density of states of this larger system. In that view, what happens is that, for those modes of the overall system that happen to correspond to strong resonance within the resonator, the mode amplitude is strongly enhanced inside the resonator, which leads to a much larger $|\mu|^2$ for all such modes. In this case, it is the matrix element between the initial and final states that is enhanced, because the optical states of interest correspond to ones with much larger field concentration inside the resonator where the active material is. Now, in one view, the difference between these two pictures does not matter, at least for resonators; both will give the same answer if we come up with some supposed factor for the enhancement of the density of states by the resonator. However, once we consider other situations, such as the enhancement of optical field near some metallic tip, there is no obvious resonator, and no obvious way to define a true density of states that has been enhanced. Any increase in optical interaction for materials near such a tip is arguably physically from the increased optical field, not from any change in the density of optical states. Nonetheless, it is common to describe such enhanced interactions in terms of an effective “local density of states”, even though, in this author’s opinion, that terminology bears little or no relation to the actual physics. As a result, though, we will avoid using the term “local density of states” here, using the more physical idea of optical concentration. Incidentally, though the various terminologies might make this seem to be a confused topic where no clarity is possible, a direct quantum mechanical approach here is quite straightforward and will give unambiguous answers. For example, we could model the resonator system by putting it in a large box, and then evaluating all the electromagnetic modes of that large box, including the resonator. Then we could calculate a property like absorption or spontaneous or stimulated emission using those modes rather than plane waves, following a standard quantum optical approach, e.g., as in [84]; the result of such a calculation is quite independent of any of the definitions of terms like finesse, quality factor, or local density of states.

H. Use of high-Q resonators

Though it is the finesse F, rather than the cavity Q, that determines the concentration factor, to make small devices work using resonators, in practice we typically need to increase the Q factor, not just the finesse. With most microscopic mechanisms that we use for devices, we are limited in the absolute values we can have for processes such as absorption or absorption changes, gain, or refractive index change. For light emitters or modulators, beyond some level of excitation or drive, we will reach some limit on these changes; either the basic properties of the material itself or our practical inability to drive it more strongly (such as practical voltage limits) may prevent us from increasing the amount of emission or gain or of absorption or refractive index changes.

Hence, even if we fill the active cross-section of the waveguide or resonator with the active material, we will still need some product of length and concentration factor to get the device to work. For resonator approaches, that product is essentially the Q of the resonator – Q is finesse F multiplied by the length of the cavity in half-wavelengths, as stated above. So Q is often the quantity quoted in devices rather than finesse F. It is still correct, however, as implied by Table II, that we need specific levels of concentration factor γ (and hence of finesse in cavity approaches) to make devices using specific active volumes. The energy numbers in Table II presume we are operating the microscopic mechanisms at some typical practical level of excitation or drive.

Note, though, that it is the cavity Q that determines how precisely the resonator has to be tuned. The resonant frequency f is inversely proportional to the cavity length L, and the frequency width Δf of the resonance is $\Delta f \approx f / Q$. So, to hit the resonant frequency within a resonance width, the length of the cavity has to be correct to within a precision ΔL given, in magnitude, by

$$\left| \frac{\Delta L}{L} \right| = \left| \frac{\Delta f}{f} \right| \approx \frac{1}{Q} \qquad (26)$$

so, a fractional precision of ~ $1/Q$. Hence, if we require high-Q resonators, we have to deal with this tuning precision either

in the original fabrication, in some post-fabrication trimming, or in some feedback adjustment in operation.

In fabrication, lithography might allow length precision of ~ a few nanometers. Suppose that our device of interest has to be on the scale of only a few microns in size so that the energy can be low enough and the density of devices high enough; then it would be difficult to set the operating wavelength of the device to better than ~ 1 part in 1000 directly in fabrication. Furthermore, device-by-device trimming to compensate for that lack of fabrication precision might not be feasible financially for the large numbers of devices we might need.

For light emitters, we could argue that the precise wavelength may not matter much, though that does mean that we cannot use other narrow band or wavelength-sensitive optics in the rest of the optical system; dense wavelength division multiplexed systems might therefore also not be possible with such lasers as sources without some further tuning.

For modulators, if they need high $Q$’s just to function sufficiently well, we could propose some active tuning stabilization for every device, but that raises two other issues: we would need additional detection and feedback loops for every device (as well as some wavelength reference), and we would need some physical resonator tuning mechanism for every device. There could be many different approaches to resonator tuning, but current approaches such as thermal tuning tend to consume significant energy; other microscopic mechanisms for changing refractive index can lead to loss (e.g., as in tuning by changing carrier density) and may also not be able to give large enough refractive index changes to tune a small device. One possible approach might be micromechanical tuning, which might not require any static power dissipation. Even if we can devise an approach that allows such tuning of each resonator, the additional system complexity and power dissipation associated with such tuning could be prohibitive for any large number of modulator devices, so we should be cautious in proposing $Q$’s beyond 1000 for any modulator device to be used in large numbers. As noted above, however, electroabsorptive devices can likely achieve low enough energies without such high $Q$’s, so they remain an attractive modulator option.

One further important issue is that resonator wavelengths will in general drift with temperature. In real systems, we should expect that the entire system should be able to operate over some significant environmental temperature range, such as at least the commercial range of 0° – 70°C; local temperatures on a silicon chip can also vary substantially from position to position on the chip, possibly by as much as 40°C [220]. A typical order of magnitude for the change in refractive index with temperature is $dn/dT \sim 10^{-4}\,\mathrm{K}^{-1}$ in a semiconductor [221] and $\sim 10^{-5}\,\mathrm{K}^{-1}$ in glass [22]. One promising approach to such an issue is to compensate the refractive index change of one material with an opposite change in another [221], [222], [223].

If we consider only moderate $Q$ resonators, however, we may not need any tuning or compensation. For a semiconductor resonator with $dn/dT \sim 10^{-4}\,\mathrm{K}^{-1}$, a 100°C temperature variation corresponds to a change in index of $\approx 10^{-2}$ and a corresponding fractional change in the resonant frequency or wavelength. For example, for a $Q \approx 30$ or smaller, such a fractional change would be significantly less than the fractional linewidth ($\approx 1/Q$) of the resonator. For a $Q$ of this magnitude, it might also be possible to operate over most or all of the telecommunications C-band (1530 – 1565 nm wavelength) without tuning since that wavelength range corresponds to a fractional range of $\approx 1/44$. Hence, such a $Q \approx 30$ device could be quite a practical option.

APPENDIX C – MATERIALS CRITERIA FOR MODULATORS

As mentioned in the main text, an important criterion for a modulator is the absolute difference $\Delta T$ in the transmission of the modulator in its two states [41]; this gives the fraction of the input optical power that is usefully available to drive the detector and receiving circuit. In general, when trying to maximize energy efficiency overall, optimizing $\Delta T$ is more important than optimizing contrast ratio itself [41]. As a result, a good device not only should have some significant contrast ratio between high and low transmission, but it should also have a high maximum transmission. Hence, background loss in modulators is particularly important. This leads to important consequences for the properties we require from electroabsorption and electrorefraction materials.

a) Criteria for electroabsorptive materials

For the moment, presume we have a device without any resonator. Suppose the background absorption coefficient of the material (i.e., the absorption coefficient in the “transmitting” or “on” state) is $\alpha_{\mathrm{trans}}$. For an electroabsorptive modulator, suppose the absorption coefficient in the “absorbing” or “off” state is some larger amount $\alpha_{\mathrm{abs}} = \rho\,\alpha_{\mathrm{trans}}$, so that the ratio of the “off” to “on” absorption coefficients is $\rho$. For a length $L$, the “on” and “off” transmissions will be $T_{\mathrm{on}} = \exp(-\alpha_{\mathrm{trans}} L)$ and $T_{\mathrm{off}} = \exp(-\alpha_{\mathrm{abs}} L)$, respectively, with the difference being $\Delta T = T_{\mathrm{on}} - T_{\mathrm{off}}$.

An electroabsorptive material at a given wavelength and operating field will have some specific absorption coefficient ratio $\rho$. A simple maximization by differentiation (setting $d(\Delta T)/dL = 0$, which gives $\rho \exp[-(\rho - 1)\alpha_{\mathrm{trans}} L] = 1$) shows that the largest $\Delta T$ is obtained for a length $L$ such that

$$\alpha_{\mathrm{trans}} L = \frac{\ln \rho}{\rho - 1} \qquad (27)$$

with a resulting maximum $\Delta T$ of

$$\Delta T_{\max} = \rho^{-1/(\rho - 1)} - \rho^{-\rho/(\rho - 1)} \qquad (28)$$

This value of $\Delta T_{\max}$ rises monotonically from 0 for $\rho = 1$ (so no contrast in absorption coefficients) through 3.5% (−14.5 dB) for a low absorption coefficient contrast of $\rho = 1.1$, ~15% (−8.3 dB) for $\rho = 1.5$, 25% (−6 dB) for $\rho = 2$, ~50% (−3 dB) for $\rho = 4.5$, and continuing to rise, but with progressively decreasing further benefits, for increasingly larger $\rho$ (e.g., ~70% (−1.6 dB) for $\rho = 10$).
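The percentages quoted for Eqs. (27) and (28) can be reproduced with a few lines of Python. This is only a minimal sketch: the function names `optimum_alpha_length` and `delta_t_max` are invented here for illustration, and the calculation assumes the simple single-pass, resonator-free model of this subsection, with the length condition applied to the “on”-state (transmitting) absorption coefficient.

```python
import math

def optimum_alpha_length(rho):
    """Product (on-state absorption coefficient) x (length) that maximizes Delta T, Eq. (27)."""
    if rho <= 1.0:
        raise ValueError("rho must exceed 1 for any modulation")
    return math.log(rho) / (rho - 1.0)

def delta_t_max(rho):
    """Largest Delta T = T_on - T_off for absorption contrast ratio rho, Eq. (28)."""
    x = optimum_alpha_length(rho)
    t_on = math.exp(-x)          # equals rho**(-1/(rho-1))
    t_off = math.exp(-rho * x)   # equals rho**(-rho/(rho-1))
    return t_on - t_off

# Contrast ratios quoted in the text.
for rho in (1.1, 1.5, 2.0, 4.5, 10.0):
    dt = delta_t_max(rho)
    print(f"rho = {rho:4.1f}: alpha*L = {optimum_alpha_length(rho):.3f}, "
          f"Delta T_max = {100 * dt:.1f}% ({10 * math.log10(dt):.1f} dB)")
```

Evaluating other values of $\rho$ in the same way shows how quickly the usable fraction of optical power falls once the contrast ratio drops below about 2.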
A reasonable approximate conclusion from this analysis is

that we need an absorption contrast ratio

$$\rho = \frac{\alpha_{\mathrm{abs}}}{\alpha_{\mathrm{trans}}} \geq 2 \qquad (29)$$

if we are to have a modulator that is reasonably (i.e., >25%) efficient in using the optical power. The penalty for lower absorption coefficient contrast increases steeply as $\rho$ reduces below about 2.

No matter how strong the optical absorption in the material is, we will have an optically very inefficient design unless we have at least about a factor of 2 contrast between the “off” and “on” absorption coefficients. This turns out to be quite a demanding criterion for electroabsorptive materials, and rules out several electroabsorptive mechanisms.

Such an example design using a material with $\rho = 2$ would have a length

$$L = \frac{\ln 2}{\alpha_{\mathrm{trans}}} \approx \frac{0.693}{\alpha_{\mathrm{trans}}} \qquad (30)$$

so about 70% of an absorption length in the “on” state, and it would modulate from a high transmission of 50% to a low transmission of 25%.

b) Criteria for electrorefractive materials

For an electrorefractive modulator, to maximize $\Delta T$ we also want to avoid having too much loss. With a background optical absorption coefficient of $\alpha$, in a simple modulator without a resonator, we would therefore want to keep the length $L$ of the modulator less than about one absorption length, i.e., $L \leq 1/\alpha$. If we have no resonator, then we need to have sufficient refractive index change $\Delta n$ in the device length $L$ to give the desired change in optical path, roughly $\Delta n\,L \geq \lambda/2$. (Note that $\lambda$ here is the free-space wavelength, not the wavelength in the material.) Hence a desirable criterion for an electrorefractive material is

$$\frac{\Delta n}{\alpha} \geq \frac{\lambda}{2} \qquad (31)$$

This can be a surprisingly difficult criterion to meet for many otherwise promising mechanisms for refractive index change, as we will discuss below. A key difficulty is that it can be difficult to find any high-speed mechanism that can in practice and under reasonable operating conditions give $\Delta n$ much greater than about $10^{-3}$ while still satisfying this criterion (6). That has been a long-standing problem in electrorefractive devices in general. As a result, electrorefractive devices without resonators tend to need to be quite long, e.g., $L \approx 750$ μm for $\Delta n \approx 10^{-3}$ and $\lambda = 1$ μm. Organic polymer electrooptic materials have been projected to offer up to $\Delta n \approx 1\%$ at a field of $10^6$ V/cm, and a device in a 90 nm wide plasmonic waveguide, with additional field concentration from group velocity effects, has been able to operate with a 10 μm length [88], which may represent the shortest refractive modulator without a resonator. (This device operates at ~ 25 fJ/bit energy.)

c) Materials criteria and use of resonators

Both electroabsorptive and electrorefractive modulators can also exploit resonators. The use of resonators can allow us to work with shorter devices. Loosely, for a cavity finesse $F$, since the photon now makes $\approx F/\pi$ passes through the cavity (see Appendix B), we only need to pick up $\sim \pi/F$ as much path length change or absorption in each pass, so the device can be shorter by a factor $\sim \pi/F$.

Our analyses above for the case without a resonator lead to a device no longer than ~ 1 absorption length for the background or “on” state absorption. The amount of background loss we can tolerate per pass in the resonator case also has to go down by a similar factor $\sim \pi/F$, however. Then, this background absorptive loss per pass at most remains comparable to the loss through the mirrors per pass; that amount of loss is at a point where we are beginning to substantially affect the operation of the cavity because of this background absorption loss.

Hence, the material requirement (6) remains the same for electrorefractive modulators with resonators; essentially, we are dividing both sides of the equation by $F/\pi$, which leaves the material criterion the same.

It is also practically the case that to make substantial modulation in an absorptive device in a cavity, we will need at least roughly to double the amount of absorption; that would change the absorption per pass from being comparable to the mirror loss to being substantially greater than the mirror loss, thereby substantially changing the transmission of the resonator. So, we should expect the criterion (5) also to remain approximately valid for electroabsorptive modulators with resonators.

These arguments are loose, and for any specific resonator design we should perform the actual analysis to get detailed answers for performance, but the basic conclusion is that changing to a resonator design does not substantially change the underlying requirements (5) and (6) on the materials. See, e.g., Refs. [75], [79] for recent example analysis and design of such devices.

Note, incidentally, that the use of asymmetric Fabry-Perot resonators – a useful trick to enhance the contrast ratio in absorptive modulators (as in [75], [79], for example) – in practice makes little difference to the total $\Delta T$ for the modulator, so it does not change the material requirements here.

APPENDIX D – EXAMPLE ANALYSIS OF “NEAR-RECEIVERLESS” OPERATION

We can make a simple estimate of how much energy we can tolerate to run a receiver amplifier so that we are benefitting overall in reducing the total energy to run the entire system. Suppose, for example, that the effective optical loss⁷¹ of the system from the optical power source to the receiving photodetector is some factor $L_{SP}$, that the “wall-plug”
⁷¹ Effective optical loss would include all actual loss factors together with a factor for the increased optical power required because of the limited difference in optical transmission between the “low” and “high” transmission states of a modulator (see Appendix C and Section IV D).

efficiency⁷² of the optical power source is a factor $\eta_S$, and that, in a receiverless system, we need an optical energy $E_R$ per bit at the receiver. Then the corresponding transmitter “wall-plug” energy per bit for the receiverless system is

$$E_T = E_R\,\eta_S\,L_{SP} \qquad (32)$$

Adding in a receiver gain of some factor $g$ would reduce the required electrical “wall-plug” energy per bit by a factor $g$ because it would correspondingly reduce the required transmitter energy per bit to $E_T/g$. So the transmitter energy saved would be an amount

$$\Delta E_T = E_T - \frac{E_T}{g} = E_T\left(1 - \frac{1}{g}\right) \qquad (33)$$

Presuming we are thinking of adding a receiver gain stage with $g$ significantly greater than 1 (e.g., 3 – 10)⁷³, then the factor $1 - (1/g)$ is not far from 1, and the energy saved at the transmitter by adding the gain stage will be approximately $E_T$, i.e., $\Delta E_T \approx E_T$. So for any energy benefit in adding such a receiver amplifier, the energy per bit to run the additional receiver amplifier circuit, $E_{\mathrm{gain}}$, should be at least somewhat less than the energy per bit $E_T$ currently being dissipated at the transmitter or there is no point in adding the gain stage. So

$$E_{\mathrm{gain}} < E_R\,\eta_S\,L_{SP} \;(= E_T) \qquad (34)$$

Once we integrate photodetectors with very low total capacitance, the optical input energy required for receiverless operation ($E_R$) becomes small, and the energy $E_{\mathrm{gain}}$ we can afford to spend on a receiver gain stage for any net energy benefit also becomes small. Nonetheless, $E_T$ may still be a significant number, such as 10’s of fJ in such a hypothetical future system (see the discussion in Section IX A). So spending up to a few fJ per bit on a gain stage might make sense. Any such circuit would have to be quite simple, however, such as one CMOS stage of gain, to hit such an energy target, and would be unlikely to be designed as a noise-limited amplifier stage [111].

APPENDIX E – NOISE IN LOW-CAPACITANCE AND RECEIVERLESS OPERATION

One legitimate question is whether we truly can avoid problems of noise in receiverless or near-receiverless operation. We might seriously consider two potential sources of noise – Johnson (or thermal) noise, and shot (or Poissonian statistics) noise. The simplest answer, which is certainly valid for the receiverless case, is that, since these noise sources do not matter for ordinary electronic logic gates operating at logic voltage swings, then they do not matter when those same logic voltage swings are generated by photodetectors.

Note that, since a photon energy of ~0.8 eV is also equal to the energy of an electron at a logic voltage of 0.8 V, the numbers of photons and the numbers of electrons to drive a gate with an efficient detector are essentially the same, so if shot noise does not matter for the transistor, then it does not matter in the receiverless photodetector case.

In optical communications, analysis of the statistics of photons gives required minimum numbers of photons of 20 – 100 to avoid bit errors from photon statistics [224], depending on the specific statistical assumptions and the required bit error rate. For ~ 1 eV photons, a received optical energy of 100 aJ/bit corresponds to ~ 600 photons/bit, so at such a level we are likely far from shot noise being a significant problem. We might need to reconsider this, however, if we were to consider operating at ~ 10 aJ/bit levels.

For thermal noise, we can estimate this by considering what is sometimes referred to as “$kT/C$” noise. If we charge a capacitor $C$ through a resistor, and consider thermal noise in the resistor as a noise source, the resulting fluctuation of the voltage on the capacitor is essentially independent of the resistor value; this independence is because, though the thermal noise (voltage)$^2$ per unit bandwidth is proportional to the resistor value, the bandwidth of the RC circuit is inversely proportional to the resistor value; so, the resistor value cancels out in the algebra. As a result, using standard Johnson noise analysis, the standard deviation of the voltage on the capacitor is $v_n = \sqrt{k_B T / C}$, where $k_B$ is Boltzmann’s constant, regardless of the resistor used to charge it.

Such noise only appears if we have a resistor of some kind connected to the capacitor, which is not necessarily the case in an optical receiver. But, even assuming we have such a resistance to charge or discharge the photodetector capacitance, this noise is not likely to present much of a problem for a receiverless approach; for 1 fF, $v_n \approx 2$ mV, and even for 10 aF, $v_n \approx 20$ mV, both of which are much less than a logic swing.

If we do use some moderate signal amplification at the output of the photodetector in a “near-receiverless” approach, as long as that is only some small factor, such as 3 – 5, such noise sources are still not likely to be much of a problem, though we should likely analyze the noise in such cases with amplification.

APPENDIX F – FREE SPACE OPTICAL SYSTEMS

I. Diffraction limit to the number of free-space channels

One important number we need to understand is the limiting number of possible separate channels we can have for communication between two surfaces, as determined from the laws of diffraction. For each polarization, we will not in practice be able to exceed this number, and any optical system will have to be designed so that it does not attempt to violate this limit. Fortunately, this problem is well understood, both intuitively and somewhat more rigorously (see, e.g., [126]).

We can think of a free-space optical system in which we are communicating between one essentially plane “transmitting” surface and another (parallel) “receiving” surface, as sketched in Fig. 15. The area or “aperture” of the transmitting (receiving) surface is $A_T$ ($A_R$). The solid angle subtended by the transmitting surface at the receiving surface is $\Omega_T \approx A_T / L^2$, where $L$ is the separation of the surfaces, and we are taking a “paraxial” approximation, presuming $L$ is much greater than the
⁷² By “wall-plug” efficiency we mean the ratio of useful optical power out from a light source to total electrical power in to the light source.

⁷³ There may not be much point in adding in gain much less than this, and just one CMOS inverter stage is likely to add gain of at least such an amount.

linear dimensions of either area. Similarly, the solid angle subtended by the receiving surface at the position of the transmitting surface is $\Omega_R \approx A_R / L^2$.

Fig. 15. Optical apertures and solid angles for calculating the number of communication modes between surfaces.

For a wavelength $\lambda$ in the medium between the surfaces, the physics of diffraction sets the practical number of orthogonal (i.e., spatially separable) spatial channels or “communications modes” between the surfaces as [126]

$$N_C \approx \frac{\Omega_R A_T}{\lambda^2} = \frac{\Omega_T A_R}{\lambda^2} = \frac{A_T A_R}{L^2 \lambda^2} \qquad (35)$$

We can if we want think of this as if we had some lens in the “transmitting” aperture focusing to the smallest spots allowed by diffraction at the position of the “receiving” aperture, with $N_C$ corresponding to the number of resolvable or approximately non-overlapping spots we could form or approximately non-overlapping positions we could focus a light beam on the receiving surface given the cross-sectional areas of both surfaces.

The minimum size of spot we can form is limited by diffraction; indeed, we could get intuitively to the result Eq. (35) by presuming that a spot of area $A_S = m\lambda^2$ (for some number $m$) has a corresponding diffraction solid angle of $\Omega_S = 1/m$ steradians; that is essentially equivalent to saying that a spot of lateral dimension $d$ (e.g., in, say, the “vertical” direction) has a corresponding diffraction angle in radians (in that same “vertical” plane) of $\theta_d \approx \lambda/d$, which is a standard type of result in diffraction theory: a small spot must have large diffraction angle, and equivalently it takes a large convergence angle to focus to a small spot⁷⁴.

Note this problem is symmetric – we could also consider this in terms of a lens in the receiving aperture capturing the light from multiple spots in the transmitting aperture, where those spots are as small as we can allow if their resulting diffracted beam just fits within the receiving aperture.

Of course such a counting is loose because it requires a choice of just how far apart we think spots have to be to count as “non-overlapping”. More rigorously, we can formally solve such problems in a generalized fashion [126] to find the optimum best-coupled channels – the “communications modes”, which we can do by performing the singular value decomposition of the coupling “diffraction” operator between the surfaces⁷⁵. If we do so, we get the same result for the number here, so this result is quite rigorous⁷⁶. So, for a given pair of such surfaces, we can state quite definitely the maximum number of orthogonal spatial channels we have for communication for a given polarization.

Recently, there has been some confusion about whether the use of different forms of beam can somehow increase the number of channels – that is, essentially violating Eq. (35). The fact that orbital angular momentum modes [225], [226] can be described in terms of an angular momentum “quantum number” could lead to the mistaken impression that this angular momentum is somehow an additional degree of freedom of the light field, and hence could increase the number of channels in the system beyond the result Eq. (35). In fact, this is not the case. Such angular momentum beams are merely a different choice of basis on which to represent spatial beams; they do not increase the number of available spatial channels as given by Eq. (35). They are also not necessarily the optimum modes for any given problem. Indeed, if we restrict ourselves to only using angular momentum beams that have a “ring”-like form, such beams use the available aperture of the optical system very inefficiently; instead, we would have to use all of the radial forms of beams with the same angular momentum to make good use of the available optical aperture. Specific analysis of information capacity of optical channels using angular momentum and other approaches [227] confirms such a conclusion. The true optimum choice of modes for a given power coupling linear optical problem (the communications modes) can be established by performing the singular value decomposition of the coupling operator, and that process does not violate Eq. (35); indeed, it actually proves Eq. (35) [126].
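As a rough numerical illustration of Eq. (35), the sketch below evaluates the channel count for one pair of apertures and cross-checks it against the Gaussian-spot picture described in the footnote on Gaussian beam spots. The function names and the specific numbers (2 mm × 2 mm apertures, 2 cm separation, 1.5 μm wavelength) are assumptions chosen for illustration, and the Gaussian check uses only the far-field growth $w(z) \approx \lambda z / \pi w_0$.

```python
import math

def n_channels(area_t, area_r, separation, wavelength):
    """Diffraction-limited channel count per polarization, Eq. (35); all inputs in SI units."""
    return area_t * area_r / (separation * wavelength) ** 2

def n_channels_gaussian(w0, area_r):
    """Gaussian-spot picture: receiving area divided by the focused spot area pi*w0^2."""
    return area_r / (math.pi * w0 ** 2)

# Illustrative (assumed) numbers: 2 mm x 2 mm apertures, 2 cm apart, 1.5 um wavelength.
A = 2e-3 * 2e-3      # aperture area, m^2
L = 2e-2             # separation, m
lam = 1.5e-6         # wavelength in the medium, m

n_eq35 = n_channels(A, A, L, lam)
print(f"Eq. (35): N_C ~ {n_eq35:.0f}")

# Cross-check: pick w0 so the beam radius at distance L fills the transmitting
# aperture, i.e. pi * w(L)^2 = A with w(L) ~ lam * L / (pi * w0).
w_at_L = math.sqrt(A / math.pi)
w0 = lam * L / (math.pi * w_at_L)
n_gauss = n_channels_gaussian(w0, A)
print(f"Gaussian-spot estimate: N_C ~ {n_gauss:.0f}")
```

The two estimates agree because the focused spot area works out to $\lambda^2 L^2 / A_T$, so dividing $A_R$ by it reproduces the right-hand side of Eq. (35).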
⁷⁴ We could work out an explicit example using Gaussian beam spots. As is conventional, we can define such a spot at its focus (e.g., on the receiving surface) to have an electric field amplitude of the form $\exp(-r^2/w_o^2)$ for some spot radius parameter $w_o$ and with $r$ being the distance from the center of the spot in the plane of the transmitting or receiving surface. As we move away from the focus, the beam stays Gaussian in shape, of a form $\exp(-r^2/w^2)$ but with $w$ growing with distance $z$ from the focus approximately as $w(z) \cong \lambda z / \pi w_o$ as the spot expands due to diffraction. If we take the effective area of the spots to be $\pi w_o^2$ on the surface where they are focused, and consider them to be focused from a transmitting surface of area $A_T = \pi [w(L)]^2$, then we will get exactly Eq. (35).

⁷⁵ The resulting optimal choice of “communications modes” functions for the case of rectangular or circular apertures are versions of so-called prolate spheroidal functions, which are not generally spot-like functions on either surface; all such functions on both surfaces actually essentially fill the aperture of both surfaces. See, e.g., [126].

⁷⁶ Technically, there is a sum rule for the sum of the squares of the “coupling strengths” between orthogonal source functions on one surface and resulting orthogonal wave functions on the other [126]. For plane parallel surfaces in the paraxial approximation, those couplings are strong up to a number given by the result Eq. (35), by which point the sum rule is essentially exhausted. Any other coupled sources and waves beyond this point have very small coupling, and can generally be neglected. This sum rule is the rigorous generalization of diffraction.

J. Calculations of number of channels

Suppose now we consider an optical system in which, for simplicity, the two areas are equal, i.e., $A_T = A_R \equiv A$, as we might use in connecting between chips as in Fig. 11 (h). Then from Eq. (35), the number of orthogonal spatial channels between the surfaces is limited by diffraction to

$$N_C \approx \frac{A^2}{L^2 \lambda^2} \qquad (36)$$

For example, consider some optics for communicating between 2 × 2 mm$^2$ arrays on chip to adjacent chips over 4 cm distance. For $\lambda \approx 1.5$ μm, $L = 2$ cm (the distance to an imaging lens) and $A \equiv 2 \times 2$ mm$^2$, then diffraction limits us to $N_C \approx 17{,}800$ channels. Hence, 1024 channels based on output couplers and lenslets [160] on 62.5 μm centers can readily be coupled through a free space channel of 2 × 2 mm$^2$ cross-section over centimeters with only a single imaging lens in the path; even increasing the density to 4096 channels on 31.25 μm centers should be viable optically. So, for example, a 1 cm focal length lens 2 cm from the “transmitting” lenslet plane would image to a final “receiving” plane a total of 4 cm away from the transmitting plane, as sketched in Fig. 16. Of course, it is straightforward to add mirror surfaces, as in Figs. D2 (b) and XIA (h), in the regions between the lenses, to deflect the beam sideways as required.

Fig. 16. (a) Sketch (not to scale) of an optical system from an output coupler plane through a lenslet array (only 4 lenslets are shown here for graphic clarity) and a field lens, an imaging lens, another field lens, and another lenslet array, onto an input coupler plane. (b) Optics shown “folded” by mirrors at the two ends for coupling to chip surfaces, at close to actual size for a ~ 2×2 mm cross-section and a ~ 4 cm distance.

Here we have also included 2 cm focal length “field lenses” above each microlens array; the one in front of the “transmitting” lenslet plane effectively captures all the diverging light from the emitting microlenses so it passes through the imaging lens aperture, and the one at the final “receiving” lenslet plane effectively “straightens out” the light so that it is focused by the lenslets onto its optic axis. This makes the system from the initial output coupler plane to the final input coupler plane what is sometimes called “telecentric”. These field lenses allow the whole system never to exceed 2 × 2 mm$^2$ in cross-section.

There are many ways such an optical system could be constructed, including substantially solid elements like gradient index (GRIN) optics, and we will not go into these here; our point here is just to illustrate the magnitudes of capacities of simple systems. There is also nothing special about the 4 cm distance illustrated here for such a 2 mm cross-section. Any shorter distance moves the optical system further away from any diffraction limits for the same number of channels. It is also possible to build “relaying” optical systems, with lenses spaced by twice their focal length, to extend to longer paths with the same number of channels.

Suppose we consider another example, this time hypothetically communicating through free space between two telephoto lenses, each with aperture of $A \approx 25$ cm$^2$, “staring” at each other over a separation distance of $L = 5$ m. Then we would calculate the maximum number of channels as limited by diffraction as $N_C \approx 110{,}000$. So such a hypothetical cabinet-to-cabinet link could readily carry 10’s of thousands of channels.

K. Wavelength dependence and Dammann grating spot array generators

Since a spot array generator is a diffractive optical element, the overall size of the spot array scales with the operating wavelength, so that wavelength needs to be set to sufficient precision. For an array size of, say, 32 × 32 spots (so 1024 spots), in which we want the positions of the spots in the diagonal corners relative to those in the center to be correct to, say, 1/10 of the spot size, we need a relative precision of the wavelength of 1 part in $10 \times \sqrt{16^2 + 16^2} \approx 226$. At 1550 nm wavelength, that corresponds to a wavelength precision of ~ 7 nm, or an optical frequency precision of ~ 860 GHz. This is a relatively slack tolerance for optical wavelength, especially if we are setting this in some single, centralized laser.

Incidentally, the fact that we have such a tolerance to the precise laser frequency means that we could also operate with pulsed light with pulse widths down to a few picoseconds without causing problems for such spot array generation. The usefulness of this will become apparent below when we discuss clocking and timing.

APPENDIX G – EXAMPLE OPTICAL REQUIREMENTS FOR MODULO-SYNCHRONOUS SYSTEMS

Here we will illustrate the requirements and capabilities of optics for modulo-synchronous systems, in which propagation delays longer than one clock cycle are preset to match the clock cycle timing.

For example, suppose we run the entire system at a 2 GHz clock rate. Such a clock rate, which is in line with current practice for chips, means the chips can be run efficiently at relatively low power dissipations and with relatively full utilization of the chip’s capability for information processing without exceeding thermal limits. That range of rates allows

optical interconnect path lengths of up to ~15 cm in air or ~10 cm in glass or plastic path for a full clock cycle, or ~7.5 cm in air or 5 cm in glass for a half clock cycle. This is enough distance to consider groups of chips in a module within distances of several centimeters, all run with communications on a one-clock-cycle-or-less communication pattern.

Driving such a system with optical pulses so that the optics does not add substantial timing uncertainty (and could possibly reduce that uncertainty) would suggest that the pulses are some small fraction of a clock cycle, for example 10% or shorter. For 2 GHz clocks, this would suggest 50 ps pulses, or shorter. Such pulses can be generated by optical mode-locked sources or by direct modulation of a semiconductor power source laser.

Within a multiple-chip module on a scale of centimeters, such chip-to-chip connections can be done largely or even totally using free-space optics together with possibly some secondary optical waveguide layer for forming specific and moderately complex interconnection patterns (as discussed in Section IX).

Between more distant parts within the optically modulo-synchronous volume, we might use optical fiber connections. As discussed above in Section VIIIB, distances of many meters are possible with only a few 10’s of picoseconds variability in pulse arrival time from the variation of fiber refractive index with temperature. Since that temperature variation is not a significant problem, to get a specific delay from propagation in a fiber, we need to cut it to the correct length. To ensure propagation delay times in the fiber within, say, 30 ps precision within a clock cycle, fiber lengths would have to be cut to specific clock-cycle lengths to within 6 mm precision, which is eminently feasible with simple techniques. Even 1 mm should be straightforward with simple cutting jigs even allowing for

large number of individuals on these topics over many years, a list that would be too long to include and essentially impossible to construct reliably or completely. He would, however, particularly like to acknowledge stimulating and informative conversations with Tony Heinz, Joseph Kahn, Ashok Krishnamoorthy, and Jelena Vuckovic during the preparation of this review.

REFERENCES

[1] W. Van Heddeghem, S. Lambert, B. Lannoo, D. Colle, M. Pickavet, P. Demeester, “Trends in worldwide ICT electricity consumption from 2007 to 2012,” Computer Communications 50, 64–76 (2014) http://dx.doi.org/10.1016/j.comcom.2014.02.008
[2] D. A. B. Miller, “Device Requirements for Optical Interconnects to Silicon Chips,” Proc. IEEE vol. 97, 1166 - 1185 (2009)
[3] M. M. Waldrop, “More than Moore,” Nature vol. 530, pp. 144-147, Feb. 16, 2016, doi:10.1038/530144a
[4] D. J. Frank, W. Haensch, G. Shahidi, and O. Dokumaci, “Optimizing CMOS technology for maximum performance,” IBM J. Res. & Dev. 50 (4/5), 419-431 (July/September 2006)
[5] S. W. Keckler, W. J. Dally, B. Khailany, M. Garland, and D. Glasco, “GPUs and the Future of Parallel Computing,” IEEE Micro 31, No.5, 7 – 17 (Sept./Oct. 2011)
[6] International Technology Roadmap for Semiconductors 2.0 2015 Edition http://www.itrs2.net/itrs-reports.html downloaded August 3, 2016
[7] I. L. Markov, “Limits on fundamental limits to computation,” Nature 512, 147–154 (14 August 2014) doi:10.1038/nature13570
[8] J. Baliga, R. Ayre, K. Hinton, and R. S. Tucker, “Energy consumption in wired and wireless access networks,” IEEE Communications Magazine 49, Issue 6, 7 June, 2011, pp. 70 – 77
[9] R. S. Tucker, R. Parthiban, J. Baliga, K. Hinton, R. W. A. Ayre, and W. V. Sorin, “Evolution of WDM Optical IP Networks: A Cost and Energy Perspective,” J. Lightwave Technol. 27, 243-252 (2009)
[10] E. Agrell, M. Karlsson, A. R. Chraplyvy, D. J. Richardson, P. M Krummrich, P. Winzer, K. Roberts, J. K. Fischer, S. J Savory, B. J Eggleton, M. Secondini, F. R Kschischang, A. Lord, J. Prat, I. Tomkos, J. E Bowers, S. Srinivasan, M. Brandt-Pearce and N. Gisin, “Roadmap of optical communications,” J. Opt. 18 (2016) 063002 (40pp) http://dx.doi.org/10.1088/2040-8978/18/6/063002
end polishing length loss and variation. Hence we could [11] P. J. Winzer, “Spatial Multiplexing in Fiber Optics: The 10X Scaling of
interconnect the larger units within the optically modulo- Metro/Core Capacities,” Bell Labs Tech. J. 19 22–30 (2014) DOI:
10.15325/BLTJ.2014.2347431
synchronous volume with fibers of lengths corresponding to [12] D. J. Richardson, J. M. Fini & L. E. Nelson, “Space-division multiplexing
integer numbers of clock cycle delays. in optical fibres,” Nature Photonics 7, 354–362 (2013)
This modulo-synchronous approach would require that the doi:10.1038/nphoton.2013.94
[13] A. Vahdat, H. Liu, X. Zhao, and C. Johnson, "The Emerging Optical Data
clock frequency is specified and fixed for the entire system (and Center," in Optical Fiber Communication Conference/National Fiber
indeed for the fiber cable manufacture so they can cut to the Optic Engineers Conference 2011, OSA Technical Digest (CD) (Optical
correct length), but that in itself poses no substantial Society of America, 2011), paper OTuH2.
doi:10.1364/OFC.2011.OTuH2
engineering challenge. For a maximum modulo-synchronous [14] D. Liang, M. Fiorentino, R. G. Beausoleil, “VLSI Photonics for High-
fiber cable length of, say, 10 m, which would correspond to ~ Performance Data Centers,” Chapter 18 in Silicon Photonics III (L. Pavesi
100 clock cycles at ~ 2 GHz, and specifying that the timing and D. J. Lockwood eds.) (Springer, 2016), Volume 122 of the series
Topics in Applied Physics, pp. 489-516
precision is to be better than, say, 10 ps within a clock cycle [15] K. Bergman, J. Shalf and T. Hausken, “Optical Interconnects and Extreme
even for the longest (~10 m) cable, or 1/50 of a clock cycle, then Computing,” Optics and Photonics News, vol. 27, Issue 4, pp. 32-39,
our clock frequency precision only has to be set to a precision 2016. doi: 10.1364/OPN.27.4.000032
of 1/10,000, which should represent no substantial engineering [16] S. Rumley, R. P. Polster, K. Bergman, S. Hammond, A. F. Rodrigues,
“End-to-end Modeling and Optimization of Power Consumption in HPC
challenge at such frequencies. Interconnects,” HUCAA Workshop at ICPP (Aug 2016) (in press)
Note also that, in such a modulo-synchronous system, we http://lightwave.ee.columbia.edu/files/Rumley2016b.pdf
would deliver the clock itself optically from a centralize clock [17] A. V. Krishnamoorthy, H. Schwetman, X. Zheng, and R. Ho, "Energy-
Efficient Photonics in Future High-Connectivity Computing Systems," J.
source to boards, modules, or even to chips, with the clock Lightwave Technol. 33, 889-900 (2015) DOI:
distribution itself being modulo-synchronous, thereby 10.1109/JLT.2015.2395453
establishing a uniform, synchronous clock throughout the [18] J. Proesel, C. Schow, and A. Rylyakov, "Ultra Low Power 10- to 25-Gb/s
CMOS-Driven VCSEL Links," in Optical Fiber Communication
system. Conference, OSA Technical Digest (Optical Society of America, 2012),
paper OW4I.3. doi:10.1364/OFC.2012.OW4I.3
ACKNOWLEDGMENT [19] J. Li, X. Zheng, A. V. Krishnamoorthy, and J. F. Buckwalter, “Scaling
Trends for Picojoule-per-Bit WDM Photonic Interconnects in CMOS SOI
The author has benefitted from many conversations with a
and FinFET Processes,” J. Lightwave Technol. 34, 2730-2742 (2016) DOI: 10.1109/JLT.2016.2542065
[20] R. Beausoleil, M. McLaren, and N. Jouppi, “Photonic architectures for high-performance data centers,” IEEE J. Sel. Top. Quantum Electron. 19(2), 3700109 (2013) doi:10.1109/JSTQE.2012.2236080
[21] D. A. B. Miller, "Optics for low-energy communication inside digital processors: quantum detectors, sources, and modulators as efficient impedance converters,” Optics Letters, 14, 146–148, (1989).
[22] D. A. B. Miller, “Physical Reasons for Optical Interconnection,” Int. J. Optoelectronics 11 (3), 155-168 (1997).
[23] D. A. B. Miller, “Rationale and Challenges for Optical Interconnects to Electronic Chips,” Proc. IEEE 88, 728-749 (2000).
[24] Y. Audzevich, P. M. Watts, A. West, A. Mujumdar, S. W. Moore, and A. W. Moore, “Power Optimized Transceivers for Future Switched Networks,” IEEE Trans. VLSI Syst. 22, No. 10, 2081-2092 (2014)
[25] D. A. B. Miller and H. M. Ozaktas, “Limit to the Bit-Rate Capacity of Electrical Interconnects from the Aspect Ratio of the System Architecture,” J. Parallel and Distributed Computing 41, 42–52 (1997).
[26] M. Hilbert and P. Lopez, “The World’s Technological Capacity to Store, Communicate, and Compute Information,” Science 332, 60 (2011)
[27] Cisco Systems, Inc., “The Zettabyte era: trends and analysis,” http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/vni-hyperconnectivity-wp.pdf , downloaded June 20, 2016
[28] G. Astfalk, “Why optical data communications and why now?” Appl. Phys. A. 95, 933-940 (2009)
[29] A. Singh, J. Ong, A. Agarwal, G. Anderson, A. Armistead, R. Bannon, S. Boving, G. Desai, B. Felderman, P. Germano, A. Kanagala, J. Provost, J. Simmons, E. Tanda, J. Wanderer, U. Hölzle, S. Stuart, and A. Vahdat, “Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network,” SIGCOMM ’15 August 17-21, 2015, London, United Kingdom, pp. 183-197 DOI: http://dx.doi.org/10.1145/2785956.2787508
[30] K. Aingaran, S. Jairath, G. Konstadinidis, S. Leung, P. Loewenstein, C. McAllister, S. Phillips, Z. Radovic, R. Sivaramakrishnan, D. Smentek, T. Wicki, “M7: Oracle’s next-generation SPARC processor,” IEEE Micro 35, Issue 2, 36-45 (Mar.-Apr. 2015) DOI: 10.1109/MM.2015.35
[31] N. Balasubramanian, A. Balasubramanian, and A. Venkataramani “Energy Consumption in Mobile Phones: A Measurement Study and Implications for Network Applications,” IMC’09, November 4–6, 2009, Chicago, Illinois, USA, ACM 978-1-60558-770-7/09/11
[32] K. Hinton, J. Baliga, M. Feng, R. Ayre, and R. S. Tucker, “Power Consumption and Energy Efficiency in the Internet,” IEEE Network 25, 6 - 12 (March/April 2011)
[33] K. Kim, “From the future Si technology perspective: Challenges and opportunities,” 2010 IEEE International Electron Devices Meeting (IEDM), 6-8 Dec. 2010, San Francisco, pp. 1.1.1 – 1.1.9 DOI:10.1109/IEDM.2010.5703274
[34] D. A. B. Miller, “Are optical transistors the next logical step?” Nature Photonics 4, 3 - 5 (2010) doi:10.1038/nphoton.2009.240
[35] A. Pandey, S. Raycha, S. Maheshwaram, S. K. Manhas, S. Dasgupta, A. K. Saxena, and B. Anand “Effect of Load Capacitance and Input Transition Time on FinFET Inverter Capacitances,” IEEE Trans. Electron Dev. 61, 30-36 (2014) DOI: 10.1109/TED.2013.2291013
[36] R. H. Dennard, F. H. Gaensslen, V. L. Rideout, E. Bassous, and A. R. LeBlanc, “Design of ion-implanted MOSFET's with very small physical dimensions,” IEEE J. Solid-State Circuits, 9, 256-268 (1974)
[37] M. Bohr, “A 30 Year Retrospective on Dennard’s MOSFET Scaling Paper,” IEEE SSCS Newsletter 12 Winter 2007, pp. 11 – 13
[38] S. Borkar, “The Exascale challenge,” Proceedings of 2010 International Symposium on VLSI Design, Automation and Test, pp. 2-3 (2010)
[39] D. Hisamoto, W.-C. Lee, J. Kedzierski, H. Takeuchi, K. Asano, C. Kuo, E. Anderson, T.-J. King, J. Bokor, and C. Hu, “FinFET—A Self-Aligned Double-Gate MOSFET Scalable to 20 nm,” IEEE Trans. Electron Dev. 47, 2320-2325 (2000)
[40] P. Wambacq, B. Verbruggen, K. Scheir, J. Borremans, M. Dehan, D. Linten, V. De Heyn, G. Van der Plas, A. Mercha, B. Parvais, C. Gustin, V. Subramanian, N. Collaert, M. Jurczak, and S. Decoutere, “The Potential of FinFETs for Analog and RF Circuit Applications,” IEEE Transactions on Circuits and Systems I: Regular Papers 54, 2541-2551 (2007) DOI: 10.1109/TCSI.2007.907866
[41] D. A. B. Miller, "Energy consumption in optical modulators for interconnects," Opt. Express 20, A293-A308 (2012)
[42] S. Rakheja, A. Ceyhan, and A. Naeemi, “Interconnect considerations,” in CMOS and Beyond, ed. T.-J. K. Liu and K. Kuhn (Cambridge, 2015), pp. 381-412 DOI: http://dx.doi.org/10.1017/CBO9781107337886.021
[43] M. Raj, S. Saeedi, and A Emami, “A 4-to-11GHz Injection-Locked Quarter-Rate Clocking for an Adaptive 153fJ/b Optical Receiver in 28nm FDSOI CMOS,” 2015 IEEE Int. Solid-State Circuits Conf. Digest of Tech. Papers (ISSCC) [0193-6530], Vol. 58, p. 404 (2015)
[44] R. S. Shelar and M. Patyra, “Impact of local interconnects on timing and power in a high performance microprocessor,” ISPD '10 Proceedings of the 19th international symposium on physical design, ACM, pp. 145-152 (2010) doi: 10.1145/1735023.1735060
[45] C. Debaes, A. Bhatnagar, D. Agarwal, R. Chen, G. A. Keeler, N. C. Helman, H. Thienpont, and D. A. B. Miller, “Receiver-less Optical Clock Injection for Clock Distribution Networks,” IEEE J. Sel. Top. Quantum Electron. 9, 400-409 (2003)
[46] S. Latif, S. E. Kocabas, L. Tang, C. Debaes & D. A. B. Miller, “Low capacitance CMOS silicon photodetectors for optical clock injection”, Appl. Phys. A – Materials Science and Processing 95, 1129-1135 (2009)
[47] D. Thomson, A. Zilkie, J. E. Bowers, T. Komljenovic, G. T. Reed, L. Vivien, D. Marris-Morini, E. Cassan, L. Virot, J.-M. Fédéli, J.-M. Hartmann, J. H. Schmid, D.-X. Xu, F. Boeuf, P. O’Brien, G. Z. Mashanovich and M. Nedeljkovic. “Roadmap on silicon photonics,” J. Opt. 18 (2016) 073003 (20pp) doi:10.1088/2040-8978/18/7/073003
[48] A. L. Lentine, C. T. Derose, P. S. Davids, J. D. Nicolas, W. A. Zortman, J. A. Cox, A. Jones, D. C. Trotter, A. T. Pomerene, A. L. Starbuck, D. J. Savignon, M. Wiwi, and P. B. Chu, “Silicon Photonics Platform for National Security Applications,” IEEE Aerospace Conference pp. 1–9 (2015) DOI: 10.1109/AERO.2015.7119249
[49] H. Subbaraman, X. Xu, A. Hosseini, X. Zhang, Y. Zhang, D. Kwong, and R. T. Chen, "Recent advances in silicon-based passive and active optical interconnects," Opt. Express 23, 2487-2511 (2015) doi: 10.1364/OE.23.002487
[50] T. Komljenovic, M. Davenport, J. Hulme, A. Y. Liu, C. T. Santis, A. Spott, S. Srinivasan, E. J. Stanton, C. Zhang, and J. E. Bowers, "Heterogeneous Silicon Photonic Integrated Circuits," J. Lightwave Technol. 34, 20-35 (2016) DOI: 10.1109/JLT.2015.2465382
[51] T. Lipka, L. Moldenhauer, J. Müller, and H. K. Trieu, "Photonic integrated circuit components based on amorphous silicon-on-insulator technology," Photon. Res. 4, 126-134 (2016) doi: 10.1364/PRJ.4.000126
[52] S. Zhu and G.-Q. Lo, “Vertically Stacked Multilayer Photonics on Bulk Silicon Toward Three-Dimensional Integration,” J. Lightwave Technol. 34, 386-392 (2016) DOI: 10.1109/JLT.2015.2499761
[53] Y. Li, Y. Zhang, L. Zhang, and A. W. Poon, "Silicon and hybrid silicon photonic devices for intra-datacenter applications: state of the art and perspectives [Invited]," Photon. Res. 3, B10-B27 (2015) doi: 10.1364/PRJ.3.000B10
[54] C. Sun, M. Georgas, J. Orcutt, B. Moss, Y.-H. Chen, J. Shainline, M. Wade, K. Mehta, K. Nammari, E. Timurdogan, D. Miller, O. Tehar-Zahav, Z. Sternberg, J. Leu, J. Chong, R. Bafrali, G. Sandhu, M. Watts, R. Meade, M. Popović, R. Ram and V. Stojanović, “A Monolithically-Integrated Chip-to-Chip Optical Link in Bulk CMOS,” IEEE J. Solid-State Circuits 50, 828-844 (2015) DOI: 10.1109/JSSC.2014.2382101
[55] P. De Dobbelaere, A. Ayazi, Y. Chi, A. Dahl, S. Denton, S. Gloeckner, K. Hon, S. Hovey, Y. Liang, M. Mack, G. Masini, A. Mekis, M. Peterson, T. Pinguet, J. Schramm, M. Sharp, C. Sohn, K. Stechschulte, P. Sun, G. Vastola, L. Verslegers, and R. Zhou, "Packaging of Silicon Photonics Systems," in Optical Fiber Communication Conference, OSA Technical Digest (online) (Optical Society of America, 2014), paper W3I.2. DOI: 10.1364/OFC.2014.W3I.2
[56] M. J. R. Heck, J. F. Bauters, M. L. Davenport, J. K. Doylend, S. Jain, G. Kurczveil, S. Srinivasan, Y. Tang, and J. E. Bowers, “Hybrid Silicon Photonic Integrated Circuit Technology,” IEEE J. Sel. Topics Quantum Electron. 6100117 (2013) DOI: 10.1109/JSTQE.2012.2235413
[57] Y. H. D. Lee and M. Lipson, “Back-End Deposited Silicon Photonics for Monolithic Integration on CMOS,” IEEE J. Sel. Top. Quantum Electron. 19, 8200207 (2013) DOI: 10.1109/JSTQE.2012.2209865
[58] Y. Arakawa, T. Nakamura, Y. Urino, and T. Fujita, “Silicon photonics for next generation system integration platform,” IEEE Communications Magazine 51, Issue 3, 72-77 (March 2013) DOI: 10.1109/MCOM.2013.6476868
[59] J. S. Orcutt, B. Moss, C. Sun, J. Leu, M. Georgas, J. Shainline, E. Zgraggen, H. Li, J. Sun, M. Weaver, S. Urošević, M. Popović, R. J. Ram, and V. Stojanović, "Open foundry platform for high-performance electronic-photonic integration," Opt. Express 20, 12222-12232 (2012) doi: 10.1364/OE.20.012222
[60] D. Van Thourhout, T. Spuesens, S. K. Selvaraja, L. Liu, G. Roelkens, R Kumar, G Morthier, P. Rojo-Romeo, F. Mandorlo, P. Regreny, O. Raz, C. Kopp, and L. Grenouillet, “Nanophotonic Devices for Optical Interconnect,” IEEE J. Sel. Topics Quantum Electron. 16, 1363-1375 (2010) DOI: 10.1109/JSTQE.2010.2040711
[61] D. F. Welch, F. A. Kish, R. Nagarajan, C. H. Joyner, R. P. Schneider, Jr., V. G. Dominic, M. L. Mitchell, S. G. Grubb, T.-K. Chiang, D. D. Perkins, and A. C. Nilsson, “The Realization of Large-Scale Photonic Integrated Circuits and the Associated Impact on Fiber-Optic Communication Systems,” J. Lightwave Technol. 24, 4674-4683 (2006) DOI: 10.1109/JLT.2006.885769
[62] D. A. B. Miller, A. Bhatnagar, S. Palermo, A. Emami-Neyestanak, and M. A. Horowitz, “Opportunities for Optics in Integrated Circuits Applications,” International Solid State Circuits Conference, 2005, Digest of Technical Papers, IEEE 2005, Paper 4.6, Pages 86-87
[63] G. A. Keeler, B. E. Nelson, D. Agarwal, and D. A. B. Miller, “Skew and Jitter Removal Using Short Optical Pulses for Optical Interconnection,” IEEE Photonics Technol. Lett. 12, 714-716 (2000).
[64] G. A. Keeler, B. E. Nelson, D. Agarwal, C. Debaes, N. C. Helman, A. Bhatnagar, and D. A. B. Miller, “The Benefits of Ultrashort Optical Pulses in Optically-Interconnected Systems,” IEEE J. Sel. Top. Quantum Electron. 9, 477-485 (2003)
[65] D. A. B. Miller, D. S. Chemla, T. C. Damen, A. C. Gossard, W. Wiegmann, T. H. Wood and C. A. Burrus, "Bandedge Electro-absorption in Quantum Well Structures: The Quantum Confined Stark Effect,” Phys. Rev. Lett. 53, 2173–2177 (1984).
[66] D. A. B. Miller, D. S. Chemla, T. C. Damen, A. C. Gossard, W. Wiegmann, T. H. Wood and C. A. Burrus, "Electric Field Dependence of Optical Absorption near the Bandgap of Quantum Well Structures,” Phys. Rev. B32, 1043–1060 (1985).
[67] Yu-Hsuan Kuo, Yong-Kyu Lee, Yangsi Ge, Shen Ren, Jonathan E. Roth, Theodore I. Kamins, David A. B. Miller & James S. Harris, “Strong quantum-confined Stark effect in germanium quantum-well structures on silicon,” Nature 437, 1334-1336 (27 October 2005) doi:10.1038/nature04204
[68] Y.-H. Kuo, Y. K. Lee, Y. Ge, S. Ren, J. E. Roth, T. I. Kamins, D. A. B. Miller, and J. S. Harris Jr., “Quantum-Confined Stark Effect in Ge/SiGe Quantum Wells on Si for Optical Modulators,” IEEE J. Sel. Top. Quantum Electron. 12, 1503-1513 (2006)
[69] S. A. Claussen, E. Tasyurek, J. E. Roth, and D. A. B. Miller, "Measurement and modeling of ultrafast carrier dynamics and transport in germanium/silicon-germanium quantum wells," Opt. Express 18, 25596-25607 (2010)
[70] J. E. Roth, O. Fidaner, R. K. Schaevitz, Y.-H. Kuo, T. I. Kamins, J. S. Harris, and D. A. B. Miller, "Optical modulator on silicon employing germanium quantum wells," Opt. Express 15, 5851-5859 (2007) http://www.opticsinfobase.org/abstract.cfm?URI=oe-15-9-5851
[71] J. E. Roth, O. Fidaner, E. H. Edwards, R. K. Schaevitz, Y.-H. Kuo, N. C. Helman, T. I. Kamins, J. S. Harris, and D. A. B. Miller, “C-band side-entry Ge quantum-well electroabsorption modulator on SOI operating at 1 V swing,” Electronics Lett. 44, 49 – 50 (2008)
[72] R. K. Schaevitz, J. E. Roth, S. Ren, O. Fidaner, and D. A. B. Miller, “Material Properties in Si-Ge/Ge Quantum Wells,” IEEE J. Sel. Top. Quantum Electron. 14, 1082-1089 (2008)
[73] S. Ren, Y. Rong, T. I. Kamins, J. S. Harris, and D. A. B. Miller, “Selective epitaxial growth of Ge/Si0.15Ge0.85 quantum wells on Si substrate using reduced pressure chemical vapor deposition,” Appl. Phys. Lett. 98, 151108 (2011)
[74] S. Ren, T. I. Kamins, and D. A. B. Miller, “Thin Dielectric Spacer for the Monolithic Integration of Bulk Germanium Quantum Wells with Silicon-on-Insulator Waveguides,” IEEE Photonics Journal 3, No. 4, 739 – 747 (August 2011)
[75] R. M. Audet, E. H. Edwards, P. Wahl, and D. A. B. Miller, “Investigation of limits to the optical performance of asymmetric Fabry-Perot electroabsorption modulators,” IEEE J. Quantum Electron. 48, 198 – 209 (2012) 10.1109/JQE.2011.2167960
[76] R. K. Schaevitz, E. H. Edwards, J. E. Roth, E. T. Fei, Y. Rong, P. Wahl, T. I. Kamins, J. S. Harris, and D. A. B. Miller, “Simple Electroabsorption Calculator for Designing 1310nm and 1550nm Modulators Using Germanium Quantum Wells,” IEEE J. Quantum Electron. 48, 187 – 197 (2012) DOI: 10.1109/JQE.2011.2170961
[77] R. K. Schaevitz, D. S. Ly-Gagnon, J. E. Roth, E. H. Edwards, and D. A. B. Miller, “Indirect absorption in germanium quantum wells,” AIP Advances 1, 032164 (2011)
[78] S. Ren, Y. Rong, S. A. Claussen, R. K. Schaevitz, T. I. Kamins, J. S. Harris, and D. A. B. Miller, “Ge/SiGe Quantum Well Waveguide Modulator Monolithically Integrated with SOI Waveguides,” IEEE Photonics Technol. Lett. 24, 461 – 463 (2012) doi: 10.1109/LPT.2011.2181496
[79] R. M. Audet, E. H. Edwards, K. C. Balram, S. A. Claussen, R. K. Schaevitz, E. Tasyurek, Y. Rong, E. I. Fei, T. I. Kamins, J. S. Harris, and D. A. B. Miller, “Surface-Normal Ge/SiGe Asymmetric Fabry-Perot Optical Modulators Fabricated on Silicon Substrates,” J. Lightwave Technol. 31, 3995-4003 (2013)
[80] S. A. Claussen, K. C. Balram, E. T. Fei, T. I. Kamins, J. S. Harris, and D. A. B. Miller, "Selective area growth of germanium and germanium/silicon-germanium quantum wells in silicon waveguides for on-chip optical interconnect applications," Opt. Mater. Express 2, 1336-1342 (2012)
[81] E. H. Edwards, R. M. Audet, E. T. Fei, S. A. Claussen, R. K. Schaevitz, E. Tasyurek, Y. Rong, T. I. Kamins, J. S. Harris, and D. A. B. Miller, "Ge/SiGe asymmetric Fabry-Perot quantum well electroabsorption modulators," Opt. Express 20, 29164-29173 (2012) http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-20-28-29164
[82] E. H. Edwards, L. Lever, E. T. Fei, T. I. Kamins, Z. Ikonic, J. S. Harris, R. W. Kelsall, and D. A. B. Miller, "Low-voltage broad-band electroabsorption from thin Ge/SiGe quantum wells epitaxially grown on silicon," Opt. Express 21, 867-876 (2013)
[83] P. Chaisakul, D. Marris-Morini, M.-S. Rouifed, G. Isella, D. Chrastina, J. Frigerio, X. Le Roux, S. Edmond, J.-R. Coudevylle, and L. Vivien, "23 GHz Ge/SiGe multiple quantum well electro-absorption modulator," Opt. Express 20, 3219-3224 (2012)
[84] D. A. B. Miller, Quantum Mechanics for Scientists and Engineers (Cambridge, 2008)
[85] D.-S. Ly-Gagnon, S. E. Kocabas, and D. A. B. Miller, “Characteristic Impedance Model for Plasmonic Metal Slot Waveguides,” IEEE J. Selected Topics in Quantum Electronics, 14 (6), 1473 – 1478 (2008) DOI: 10.1109/JSTQE.2008.917534
[86] D.-S. Ly-Gagnon, K. C. Balram, J. S. White, P. Wahl, M. L. Brongersma, and D. A. B. Miller, “Routing and Photodetection in Subwavelength Plasmonic Slot Waveguides,” Nanophotonics 1, 9–16, (2012) DOI: 10.1515/nanoph-2012-0002
[87] C. Haffner, W. Heni, Y. Fedoryshyn, A. Josten, B. Baeuerle, C. Hoessbacher, Y. Salamin, U. Koch, N. Dordevic, P. Mousel, R. Bonjour, A. Emboras, D. Hillerkuss, P. Leuchtmann, D. L. Elder, L. Dalton, C. Hafner, and J. Leuthold, “Plasmonic Organic Hybrid Modulators—Scaling Highest Speed Photonics to the Microscale,” Proc. IEEE (to be published) DOI: 10.1109/JPROC.2016.2547990
[88] C. Haffner, W. Heni, Y. Fedoryshyn, J. Niegemann, A. Melikyan, D. L. Elder, B. Baeuerle, Y. Salamin, A. Josten, U. Koch, C. Hoessbacher, F. Ducry, L. Juchli, A. Emboras, D. Hillerkuss, M. Kohl, L. R. Dalton, C. Hafner and J. Leuthold, “All-plasmonic Mach–Zehnder modulator enabling optical high-speed communication at the microscale,” Nature Photonics 9, 525–528 (2015) doi:10.1038/nphoton.2015.127
[89] P. Moser, J. A. Lott, P. Wolf, G. Larisch, H. Li, N. N. Ledentsov and D. Bimberg, “56 fJ dissipated energy per bit of oxide-confined 850 nm VCSELs operating at 25 Gbit/s,” Electronics Letters 48, 1276 (2012)
[90] S. Matsuo, A. Shinya, T. Kakitsuka, K. Nozaki, T. Segawa, T. Sato, Y. Kawaguchi and M. Notomi, “High-speed ultracompact buried heterostructure photonic-crystal laser with 13 fJ of energy consumed per bit transmitted,” Nature Photonics 4, 648 - 654 (2010) doi:10.1038/nphoton.2010.177
[91] B. Ellis, M. A. Mayer, G. Shambat, T. Sarmiento, J. Harris, E. E. Haller, and J. Vuckovic, “Ultralow-threshold electrically pumped quantum-dot photonic-crystal nanocavity laser,” Nature Photonics 5, 297-300 (2011)
[92] M. Nomura, N. Kumagai, S. Iwamoto, Y. Ota and Y. Arakawa, “Laser oscillation in a strongly coupled single-quantum-dot–nanocavity system,” Nature Physics 6, 279 - 283 (2010) doi:10.1038/nphys1518
[93] J. Wu, P. Jin, “Self-assembly of InAs quantum dots on GaAs(001) by molecular beam epitaxy,” Front. Phys. 10, 108101 (2015) DOI 10.1007/s11467-014-0422-4
[94] R. G. Beausoleil, “Large Scale Integrated Photonics for Twenty-First Century Information Technologies - A “Moore’s Law” for Optics,” Found. Phys. 44, 856-872 (2014) doi:10.1007/s10701-013-9771-z
[95] D. M. Cutrer and K. Y. Lau, “Ultralow power optical interconnect with zero-biased, ultralow threshold laser-how low a threshold is low enough?” IEEE Photonics Technology Letters 7, 4-6, (1995)
[96] K. L. Tsakmakidis, R. W. Boyd, E. Yablonovitch, and X. Zhang, "Large spontaneous-emission enhancements in metallic nanostructures: towards
LEDs faster than lasers," Opt. Express 24, 17916-17927 (2016) doi: 10.1364/OE.24.017916
[97] M. T. Hill and M. C. Gather, “Advances in small lasers,” Nature Photonics 8, 908–918 (2014) doi:10.1038/nphoton.2014.239
[98] M. T. Hill, Y.-S. Oei, B. Smalbrugge, Y. Zhu, T. de Vries, P. J. van Veldhoven, F. W. M. van Otten, T. J. Eijkemans, J. P. Turkiewicz, H. de Waardt, E. J. Geluk, S.-H. Kwon, Y.-H. Lee, R. Nötzel and M. K. Smit, “Lasing in metallic-coated nanocavities,” Nature Photonics 1, 589 - 594 (2007) doi:10.1038/nphoton.2007.171
[99] C. Z. Ning, “Semiconductor nanolasers,” Phys. Status Solidi B 247, No. 4, 774 – 788 (2010)
[100] O. Chen, J. Zhao, V. P. Chauhan, J. Cui, C. Wong, D. K. Harris, H. Wei, H.-S. Han, D. Fukumura, R. K. Jain, and M. G. Bawendi, “Compact high-quality CdSe/CdS core/shell nanocrystals with narrow emission linewidths and suppressed blinking,” Nat Mater. 12(5), 445–451 (2013) doi:10.1038/nmat3539
[101] G. Shambat, B. Ellis, A. Majumdar, J. Petykiewicz, M. A. Mayer, T. Sarmiento, J. Harris, E. E. Haller and J. Vuckovic, “Ultrafast direct modulation of a single-mode photonic crystal nanocavity light-emitting diode,” Nature Communications 2, 539 (2011) DOI: 10.1038/ncomms1543
[102] M. S. Eggleston, K. Messer, L. Zhang, E. Yablonovitch, and M. C. Wu, “Optical antenna enhanced spontaneous emission,” PNAS 112, no. 6, 1704–1709 (2015) doi: 10.1073/pnas.1423294112
[103] K. C. Y. Huang, M.-K. Seo, T. Sarmiento, Y. Huo, J. S. Harris, and M. L. Brongersma, “Electrically driven subwavelength optical nanocircuits,” Nature Photonics 8, 244–249 (2014) doi:10.1038/nphoton.2014.2
[104] W. M. J. Green, M. J. Rooks, L. Sekaric, and Y. A. Vlasov, "Ultra-compact, low RF power, 10 Gb/s silicon Mach-Zehnder modulator," Opt. Express 15, 17106-17113 (2007) doi: 10.1364/OE.15.017106
[105] E. Timurdogan, C. M. Sorace-Agaskar, J. Sun, E. S. Hosseini, A. Biberman, and M. R. Watts, “An ultralow power athermal silicon modulator,” Nature Communications 5, 4008 (2014) doi:10.1038/ncomms5008
[106] R. A. Soref, and B. R. Bennett, “Electrooptical effects in silicon,” IEEE J. Quantum Electron, 23, No. 1, 123 – 129 (1987)
[107] W. Bogaerts, P. De Heyn, T. Van Vaerenbergh, K. De Vos, S. K. Selvaraja, T. Claes, P. Dumon, P. Bienstman, D. Van Thourhout, and R. Baets, “Silicon microring resonators,” Laser Photonics Rev. 6, No. 1, 47–73 (2012) DOI 10.1002/lpor.201100017
[108] G. Li, A. V. Krishnamoorthy, I. Shubin, J. Yao, Y. Luo, H. Thacker, X. Zheng, K. Raj, and J. E. Cunningham, “Ring Resonator Modulators in Silicon for Interchip Photonic Links,” IEEE J. Sel. Topics Quantum Electron. 19, 3401819 (2013) DOI: 10.1109/JSTQE.2013.2278885
[109] C. Koos, J. Leuthold, W. Freude, M. Kohl, L. Dalton, W. Bogaerts, A. L. Giesecke, M. Lauermann, A. Melikyan, S. Koeber, S. Wolf, C. Weimann, S. Muehlbrandt, K. Koehnle, J. Pfeifle, W. Hartmann, Y. Kutuvantavida, S. Ummethala, R. Palmer, D. Korn, L. Alloatti, P. C. Schindler, D. L. Elder, T. Wahlbrink, and J. Bolten, "Silicon-Organic Hybrid (SOH) and Plasmonic-Organic Hybrid (POH) Integration," J. Lightwave Technol. 34, 256-268 (2016) DOI: 10.1109/JLT.2015.2499763
[110] R. G. Smith, S. D. Personick, “Receiver design for optical fiber communication systems,” in “Semiconductor devices for optical communication,” (Springer, 1982), pp. 89–160
[111] A. V. Krishnamoorthy and D. A. B. Miller, “Scaling Optoelectronic-VLSI Circuits into the 21st Century: A Technology Roadmap,” IEEE J. Selected Topics in Quantum Electronics 2 (1), 55–76 (April 1996)
[112] S. Saeedi, S. Menezo, G. Pares, and A. Emami, “A 25 Gb/s 3D-Integrated CMOS/Silicon-Photonic Receiver for Low-Power High-Sensitivity Optical Communication,” J. Lightwave Technol. 34, 2924 – 2933 (2015) DOI: 10.1109/JLT.2015.2494060
[113] C. T. DeRose, D. C. Trotter, W. A. Zortman, A. L. Starbuck, M. Fisher, M. R. Watts, and P. S. Davids, "Ultra compact 45 GHz CMOS compatible Germanium waveguide photodiode with low dark current," Opt. Express 19, 24897-24904 (2011) doi: 10.1364/OE.19.024897
[114] K. Nozaki, S. Matsuo, T. Fujii, K. Takeda, M. Ono, A. Shakoor, E. Kuramochi, and M. Notomi, "Photonic-crystal nano-photodetector with ultrasmall capacitance for on-chip light-to-voltage conversion without an amplifier," Optica 3, 483-492 (2016) doi: 10.1364/OPTICA.3.000483
[115] Z. Huang, C. Li, D. Liang, K. Yu, C. Santori, M. Fiorentino, W. Sorin, S. Palermo, and R. G. Beausoleil, "25 Gbps low-voltage waveguide Si–Ge avalanche photodiode," Optica 3, 793-798 (2016) doi: 10.1364/OPTICA.3.000793
[116] N. J. D. Martinez, C. T. Derose, R. W. Brock, A. L. Starbuck, A. T. Pomerene, A. L. Lentine, D. C. Trotter, and P. S. Davids, "High performance waveguide-coupled Ge-on-Si linear mode avalanche photodiodes," Opt. Express 24, 19072-19081 (2016) doi: 10.1364/OE.24.019072
[117] L. C. Chuang, F. G. Sedgwick, R. Chen, W. S. Ko, M. Moewe, K. W. Ng, T.-T. D. Tran, and C. Chang-Hasnain, “GaAs-Based Nanoneedle Light Emitting Diode and Avalanche Photodiode Monolithically Integrated on a Silicon Substrate,” Nano Lett. 11, 385-390 (2011)
[118] P. Senanayake, C.-H. Hung, A. Farrell, D. A. Ramirez, J. Shapiro, C.-K. Li, Y.-R. Wu, M. M. Hayat, and D. L. Huffaker, “Thin 3D Multiplication Regions in Plasmonically Enhanced Nanopillar Avalanche Detectors,” Nano Lett 12 (12), 6448–6452 (2012) DOI: 10.1021/nl303837y
[119] X. Dai, M. Tchernycheva, and C. Soci, “Compound Semiconductor Nanowire Photodetectors,” Semiconductors and Semimetals Volume 94, 75–107 (2016) http://dx.doi.org/10.1016/bs.semsem.2015.08.001
[120] K. C. Balram, R. M. Audet, and D. A. B. Miller, "Nanoscale resonant-cavity-enhanced germanium photodetectors with lithographically defined spectral response for improved performance at telecommunications wavelengths," Opt. Express 21, 10228-10233 (2013) http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-21-8-10228
[121] L. Tang, S. E. Kocabas, S. Latif, A. K. Okyay, D.-S. Ly-Gagnon, K. C. Saraswat and D. A. B. Miller, “Nanometre-Scale Germanium Photodetector Enhanced by a Near-Infrared Dipole Antenna,” Nature Photonics 2, 226 – 229 (2008) doi:10.1038/nphoton.2008.30
[122] Z. Wang and B. Nabet, “Nanowire Optoelectronics,” Nanophotonics 4, 491–502 (2015) DOI: 10.1515/nanoph-2015-0025
[123] P. Fan, Z. Yu, S. Fan and M. L. Brongersma, “Optical Fano resonance of an individual semiconductor nanostructure,” Nature Materials 13, 471–475 (2014) doi:10.1038/nmat3927
[124] R. Chen, K. W. Ng, W. S. Ko, D. Parekh, F. Lu, T.-T. D. Tran, K. Li and C. Chang-Hasnain, “Nanophotonic integrated circuits from nanoresonators grown on silicon,” Nature Communications 5, 4325 (2014) doi:10.1038/ncomms5325
[125] D. A. B. Miller, "All linear optical devices are mode converters," Opt. Express 20, 23985-23993 (2012) http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-20-21-23985
[126] D. A. B. Miller, “Communicating with Waves Between Volumes – Evaluating Orthogonal Spatial Channels and Limits on Coupling Strengths,” Appl. Opt. 39, 1681–1699 (2000).
[127] D. A. B. Miller, “Sorting out light,” Science 347, 1423-1424 (2015) DOI: 10.1126/science.aaa6801
[128] Y. Jiao, S. H. Fan, and D. A. B. Miller, “Demonstration of Systematic Photonic Crystal Device Design and Optimization by Low Rank Adjustments: An Extremely Compact Mode Separator,” Optics Letters 30, Issue 2, 141-143 (January 2005) http://www.opticsinfobase.org/ol/abstract.cfm?URI=ol-30-2-141
[129] V. Liu, D. A. B. Miller, and S. H. Fan, “Highly Tailored Computational Electromagnetics Methods for Nanophotonic Design and Discovery,” Proc. IEEE 101, No. 2, 484 – 493 (2013) DOI: 10.1109/JPROC.2012.2207649
[130] J. Lu and J. Vučković, "Objective-first design of high-efficiency, small-footprint couplers between arbitrary nanophotonic waveguide modes," Opt. Express 20, 7221-7236 (2012) doi: 10.1364/OE.20.007221
[131] J. S. Jensen and O. Sigmund, “Topology optimization for nano-photonics,” Laser & Photon. Rev. 5, 308-321 (2011) DOI 10.1002/lpor.201000014
[132] C. M. Lalau-Keraly, S. Bhargava, O. D. Miller, and E. Yablonovitch, "Adjoint shape optimization applied to electromagnetic design," Opt. Express 21, 21693-21701 (2013) http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-21-18-21693
[133] D. A. B. Miller, "Self-aligning universal beam coupler," Opt. Express 21, 6360-6370 (2013) http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-21-5-6360
[134] D. A. B. Miller, "Self-configuring universal linear optical component," Photon. Res. 1, 1-15 (2013). http://www.opticsinfobase.org/prj/abstract.cfm?URI=prj-1-1-1 http://dx.doi.org/10.1364/PRJ.1.000001
[135] D. A. B. Miller, "Perfect optics with imperfect components," Optica 2, 747-750 (2015). doi: 10.1364/OPTICA.2.000747
[136] F. B. McCormick, T. J. Cloonan, F. A. P. Tooley, A. L. Lentine, J. M. Sasian, J. L. Brubaker, R. L. Morrison, S. L. Walker, R. J. Crisci, R. A. Novotny, S. J. Hinterlong, H. S. Hinton, and E. Kerbis, "Six-stage digital free-space optical switching network using symmetric self-electro-optic-effect devices," Appl. Opt. 32, 5153-5171 (1993)
[137] E. A. De Souza, M. C. Nuss, W. H. Knox, and D. A. B. Miller, [157] H. S. Hinton, T. J. Cloonan, F. B. McCormick, A. L. Lentine, and F. A.
"Wavelength-division multiplexing with femtosecond pulses,” Optics P. Tooley, “Free-space digital optical systems,” Proc. IEEE 82, 1632 –
Letters, 20, 1166–1168 (1995). 1649 (1994) DOI: 10.1109/5.333743
[138] B. E. Nelson, G. A. Keeler, D. Agarwal, N. C. Helman, and D. A. B. [158] D. V. Plant and A. G. Kirk, “Optical interconnects at the chip and board
Miller, “Wavelength Division Multiplexed Optical Interconnect Using level: challenges and solutions,” Proc. IEEE 88, 806 –818 (2000)
Short Pulses,” IEEE J. Sel. Top. Quantum Electron. 9, 486-491 (2003) [159] C. Debaes, M. Vervaeke, V. Baukens, H. Ottevaere, P. Vynck, P.
[139] H. D. Thacker, X. Zheng, J. Lexau, R. Shafiiha, I. Shubin, S. Lin, S. Tuteleers, B. Volckaerts, W. Meeus, M. Brunfaut, J. Van Campenhout, A.
Djordjevic, P. Amberg, E. Chang, F. Liu, J. Simons, J.-H. Lee, A. Abed, Hermanne, H. Thienpont, “Low-cost microoptical modules for mcm level
H. Liang, Y. Luo, J. Yao, D. Feng, M. Asghari, R. Ho, K. Raj, J. E. optical interconnections,” IEEE J. Sel. Top. Quantum Electron. 9, 518-
Cunningham, and A. V. Krishnamoorthy, "An all-solid-state, WDM 530 (2003)
silicon photonic digital link for chip-to-chip communications," Opt. [160] M. P. Christensen, P. Milojkovic, M. J. McFadden, and M. W. Haney,
Express 23, 12808-12822 (2015) doi: 10.1364/OE.23.012808 “Multiscale optical design for global chip-to-chip optical interconnections
[140] A. L. Lentine and C. T. DeRose, “Challenges in the implementation of and misalignment tolerant packaging,” IEEE J. Sel. Top. Quantum
dense wavelength division multiplexed (DWDM) optical interconnects Electron. 9, 548- 556 (2003)
using resonant silicon photonics”, Proc. SPIE 9772, Broadband Access [161] J. Jahns and A. Huang, "Planar integration of free-space optical
Communication Technologies X, 977207 (February 12, 2016) components," Appl. Opt. 28, 1602-1605 (1989)
doi:10.1117/12.2217429 [162] N. Streibl, K.-H. Brenner, A. Huang, J. Jahns, J. Jewell, A. W. Lohmann,
[141] K. Sasaki, F. Ohno, A. Motegi, and T. Baba, “Arrayed waveguide grating D. A. B. Miller, M. Murdocca, M. E. Prise, and T. Sizer, "Digital optics,"
of 70×60 μm2 size based on Si photonic wire waveguides,” Electronics Proc. IEEE 77, 1954-1969 (1989)
Lett. 41, 801-802 (2005) DOI: 10.1049/el:20051541 [163] R. L. Morrison, S. L. Walker, and T. J. Cloonan, "Beam array generation
[142] M. Gerken and D. A. B. Miller “Multilayer Thin-Film Structures with and holographic interconnections in a free-space optical switching
High Spatial Dispersion,” Applied Optics 42, 1330 – 1345 (2003) network," Appl. Opt. 32, 2512-2518 (1993) doi: 10.1364/AO.32.002512
[143] V. Liu, Y. Jiao, D. A. B. Miller, and S. Fan, "Design methodology for [164] J. Jahns and S. Helfert, Introduction to Micro- and Nanooptics (Wiley,
compact photonic-crystal-based wavelength division multiplexers," Opt. 2012)
Lett. 36, 591-593 (2011) [165] K.-N. Chen, M. J. Kobrinsky, B. C. Barnett, and R. Reif, “Comparisons
[144] A.Y. Piggott, J. Lu, K. G. Lagoudakis, J. Petykiewicz, T. M. Babinec, and of Conventional, 3-D, Optical, and RF Interconnects for On-Chip Clock
J. Vučković, “Inverse design and demonstration of a compact and Distribution,” IEEE Trans. Electron Devices 51, 233 – 239 (2004)
broadband on-chip wavelength demultiplexer,” Nature Photonics 9, 374– [166] L. Boivin, M. C. Nuss, J. Shah, D. A. B. Miller, and H. A. Haus, “Receiver
377 (2015) doi:10.1038/nphoton.2015.69 Sensitivity Improvement by Impulsive Coding,” IEEE Photonics
[145] G. T. Reed, G. Z. Mashanovich, W. R. Headley, S. P. Chan, B. D. Technol. Lett. 9 (5), 684–686 (May 1997).
Timotijevic and F. Y. Gardes, “Silicon Photonics: Are Smaller Devices [167] A. L. Lentine, D. A. B. Miller "Evolution of the SEED technology:
Always Better?” Japanese Journal of Applied Physics 45, No. 8B, 6609- bistable logic gates to optoelectronic smart pixels" IEEE J. of Quantum
6615 (2006) DOI: 10.1143/JJAP.45.6609 Electronics, 29, 655–669 (1993)
[146] A. V. Krishnamoorthy, R. Ho, X. Zheng, H. Schwetman, J. Lexau, P. [168] A. L. Lentine, L. M. F. Chirovsky, and T. K. Woodward, “Optical energy
Koka, G. Li, I. Shubin, and J. E. Cunningham, “Computer systems based considerations for diode-clamped smart pixel optical receivers,” IEEE J.
on silicon photonic interconnects,” Proc. IEEE 97(7), 1337–1361 (2009) Quantum Electron. 30, no. 5, 1167-1171 (1994)
DOI: 10.1109/JPROC.2009.2020712 [169] R. W. Going, J. Loo, T.-J. K. Liu, and M. C. Wu, “Germanium Gate
[147] L. Li, Y. Zou, H. Lin, J. Hu, X. Sun, N.-N. Feng, S. Danto, K. Richardson, PhotoMOSFET Integrated to Silicon Photonics,” IEEE J. Sel. Top.
T. Gu, and M. Haney, “A Fully-Integrated Flexible Photonic Platform for Quantum Electron. 20, 8201607 (2014) DOI:
Chip-to-Chip Optical Interconnects,” J. Lightwave Technol. 31, 4080- 10.1109/JSTQE.2013.2294470
4086 (2013) DOI: 10.1109/JLT.2013.2285382 [170] A. K. Okyay, D. Kuzum, S. Latif, D. A. B. Miller and K. C. Saraswat,
[148] R. S. Tucker, “Energy Consumption in Digital Optical ICs with Plasmon “Silicon germanium CMOS optoelectronic switching device: Bringing
Waveguide Interconnects,” IEEE Photonics Technol. Lett. 19, 2036 – light to latch,” IEEE Trans. Electron Devices 54, 3252-3259 (2007) DOI:
2038 (2007) 10.1109/TED.2007.908903
[149] R. G. H. van Uden, R. Amezcua Correa, E. Antonio Lopez, F. M. [171] G. Kurczveil, D. Liang, M. Fiorentino, and R. G. Beausoleil, "Robust
Huijskens, C. Xia, G. Li, A. Schülzgen, H. de Waardt, A. M. J. Koonen hybrid quantum dot laser for integrated silicon photonics," Opt. Express
and C. M. Okonkwo, “Ultra-high-density spatial division multiplexing 24, 16167-16174 (2016) http://dx.doi.org/10.1364/OE.24.016167
with a few-mode multicore fibre,” Nature Photonics 8, 865–870 (2014) [172] A. Lee, Q. Jiang, M. Tang, A. Seeds, and H. Liu, "Continuous-wave
doi:10.1038/nphoton.2014.243 InAs/GaAs quantum-dot laser diodes monolithically grown on Si
[150] P. J. Winzer, “Making spatial multiplexing a reality,” Nature Photonics 8, substrate with low threshold current densities," Opt. Express 20, 22181-
345–348 (2014) doi:10.1038/nphoton.2014.58 22187 (2012) doi: 10.1364/OE.20.022181
[151] S. O. Arık, K.-Po Ho, and J. M. Kahn, “Group Delay Management and [173] H. Sun, F. Ren, K. W. Ng, T.-T. D. Tran, K. Li, and C. J. Chang-Hasnain,
Multiinput Multioutput Signal Processing in Mode-Division Multiplexing “Nanopillar Lasers Directly Grown on Silicon with Heterostructure
Systems,” J. Lightwave Technol. 34, 2867-2880 (2016) DOI: Surface Passivation,” ACS Nano 8 (7), 6833-6839 (2014) DOI:
10.1109/JLT.2016.2530978 10.1021/nn501481u
[152] R. Ryf, S. Randel, A. H. Gnauck, C. Bolle, A. Sierra, S. Mumtaz, M. [174] R. Chen, T.-T. D. Tran, K. W. Ng, W. S. Ko, L. C. Chuang, F. G.
Esmaeelpour, E. C. Burrows, R.-J. Essiambre, P. J. Winzer, D. W. Sedgwick and C. Chang-Hasnain, “Nanolasers grown on silicon,” Nature
Peckham, A. H. McCurdy, and R. Lingle, Jr., “Mode-Division Photonics 5, 170-175 (2011)
Multiplexing Over 96 km of Few-Mode Fiber Using Coherent 6x6 MIMO [175] F. Lu, T.-T. D. Tran, W. S. Ko, K. W. Ng, R. Chen, and C. Chang-
Processing,” J. Lightwave Technol. 30, 521-531 (2012) Hasnain, "Nanolasers grown on silicon-based MOSFETs," Opt. Express
[153] F. Morichetti, A. Annoni, S. Grillanda, N. Peserico, M. Carminati, P. 20, 12171-12176 (2012) doi: 10.1364/OE.20.012171
Ciccarella, G. Ferrari, E. Guglielmi, M. Sorel, and A. Melloni, “4-Channel [176] D. Liang, G. Roelkens, R. Baets, and J. E. Bowers, “Hybrid Integrated
All-Optical MIMO Demultiplexing on a Silicon Chip,” Optical Fibers Platforms for Silicon Photonics,” Materials 3, 1782-1802 (2010)
Conference OFC’16, Anaheim California March 24, 2016, Paper ThE3.7 doi:10.3390/ma3031782
[154] B. S. Dennis, M. I. Haftel, D. A. Czaplewski, D. Lopez, G. Blumberg and [177] K. Tanabe, K. Watanabe and Y. Arakawa, “III-V/Si hybrid photonic
V. A. Aksyuk, “Compact nanomechanical plasmonic phase modulators,” devices by direct fusion bonding,” Scientific Reports 2, 349 (2012)
Nature Photonics 9, 267–273 (2015) doi:10.1038/nphoton.2015.40 doi:10.1038/srep00349
[155] F. Chollet, “Devices Based on Co-Integrated MEMS Actuators and [178] W. Franz, “Einfluß eines elektrischen Feldes auf eine optische
Optical Waveguide: A Review,” Micromachines, 7(2), 18 (2016) Absorptionskante,” Z. Naturforschung 13a 484–489 (1958)
doi:10.3390/mi7020018 [179] L. V. Keldysh, “The effect of a strong electric field on the optical
[156] F. B. McCormick, T. J. Cloonan, A. L. Lentine, J. M. Sasian, R. L. properties of insulating crystals,” J. Exptl. Theoret. Phys. (U.S.S.R.) 34,
Morrison, M. G. Beckman, S. L. Walker, M. J. Wojcik, S. J. Hinterlong, 1138-1141 (May, 1958); translation: Soviet Physics JETP 34(7), No.5,
R. J. Crisci, R. A. Novotny, and H. S. Hinton, "Five-stage free-space 788-790 (1958)
optical switching network with field-effect transistor self-electro-optic- [180] K. Tharmalingam, “Optical Absorption in the Presence of a Uniform
effect-device smart-pixel arrays," Appl. Opt. 33, 1601-1618 (1994) Field,” Phys. Rev. 130, 2204 – 2206 (1963)
[181] J. D. Dow and D. Redfield, “Electroabsorption in Semiconductors: The [201] M. Liu, X. Yin, and X. Zhang, “Double-Layer Graphene Optical
Excitonic Absorption Edge,” Phys. Rev. B 1, 3358 – 3371 (1970) Modulator,” Nano Lett. 12 (3), 1482–1485 (2012) DOI:
[182] F. L. Lederman and J. D. Dow, “Theory of electroabsorption by 10.1021/nl204202k
anisotropic and layered semiconductors. I. Two-dimensional excitons in [202] S. L. Chuang, Physics of Photonic Devices (2nd edition) (Wiley,2009)
a uniform electric field,” Phys. Rev. B13, 1633-1642 (1976) [203] B. Chmielak, M. Waldow, C. Matheisen, C. Ripperda, J. Bolten, T.
[183] D. A. B. Miller, D. S. Chemla, and S. Schmitt-Rink, "Electroabsorption Wahlbrink, M. Nagel, F. Merget, and H. Kurz, "Pockels effect based fully
of highly confined systems: Theory of the quantum-confined Franz- integrated, strained silicon electro-optic modulator," Opt. Express 19,
Keldysh effect in semiconductor quantum wires and dots,” Appl. Phys. 17212-17219 (2011) doi: 10.1364/OE.19.017212
Lett., 52, 2154–2156, (1988). http://www.opticsinfobase.org/ome/abstract.cfm?URI=ome-2-10-1336
[184] Y. Luo, X. Zheng, G. Li, I. Shubin, H. Thacker, J. Yao, J.-H. Lee, D. Feng, [204] P. Damas, X. Le Roux, D. Le Bourdais, E. Cassan, D. Marris-Morini, N.
J. Fong, C.-C. Kung, S. Liao, R. Shafiiha, M. Asghari, K. Raj, A. V. Izard, T. Maroutian, P. Lecoeur, and L. Vivien, "Wavelength dependence
Krishnamoorthy and J. E. Cunningham, “Strong Electro-Absorption in of Pockels effect in strained silicon waveguides," Opt. Express 22, 22095-
GeSi Epitaxy on Silicon-on-Insulator (SOI),” Micromachines 3(2), 345- 22100 (2014) doi: 10.1364/OE.22.022095
363 (2012) doi:10.3390/mi3020345 [205] P. O. Weigel, M. Savanier, C. T. DeRose, A. T. Pomerene, A. L. Starbuck,
[185] N.-N. Feng, D. Feng, S. Liao, X. Wang, P. Dong, H. Liang, C.-C. Kung, A. L. Lentine, V. Stenger and S. Mookherjea, “Lightwave Circuits in
W. Qian, J. Fong, R. Shafiiha, Y. Luo, J. Cunningham, A. V. Lithium Niobate through Hybrid Waveguides with Silicon Photonics,”
Krishnamoorthy, and M. Asghari, "30GHz Ge electro-absorption Scientific Reports 6, 22301 (2016) doi:10.1038/srep22301
modulator integrated with 3μm silicon-on-insulator waveguide," Opt. [206] L. Chen, J. Nagy, and R. M. Reano, "Patterned ion-sliced lithium niobate
Express 19, 7062-7067 (2011) for hybrid photonic integration on silicon," Opt. Mater. Express 6, 2460-
[186] R. J. Elliott, “Intensity of Optical Absorption by Excitons,” Phys. Rev. 2467 (2016) doi: 10.1364/OME.6.002460
108, 1384 – 1389 (1957) [207] D. L. Elder, S. J. Benight, J. Song, B. H. Robinson, and L. R. Dalton,
[187] D. S. Chemla and D. A. B. Miller, "Room-Temperature Excitonic “Matrix-Assisted Poling of Monolithic Bridge-Disubstituted Organic
Nonlinear-Optical Effects in Semiconductor Quantum-Well Structures,” NLO Chromophores,” Chem. Mater., 26 (2), 872–874 (2014) DOI:
J. Opt. Soc. Am. B2, 1155–1173 (1985). 10.1021/cm4034935
[188] M. Shinada and S. Sugano, “Interband Optical Transitions in Extremely [208] J. S. Weiner, D. A. B. Miller, and D. S. Chemla, "Quadratic Electro-Optic
Anisotropic Semiconductors. I. Bound and Unbound Exciton Effect due to the Quantum-Confined Stark Effect in Quantum Wells,”
Absorption,” J. Phys. Soc. Jpn. 21, 1936-1946 (1966) Appl. Phys. Lett. 50, 842–844, (1987)
http://dx.doi.org/10.1143/JPSJ.21.1936 [209] J. E. Zucker, K. L. Jones, T. H. Chiu, B. Tell, and K. Brown-Goebeler,
[189] D. S. Chemla, D. A. B. Miller, P. W. Smith, A. C. Gossard and W. “Strained quantum wells for polarization-independent electrooptic
Wiegmann, "Room Temperature Excitonic Nonlinear Absorption and waveguide switches,” J. Lightwave Technol. 10, 1926-1930 (1992)
Refraction in GaAs/AlGaAs Multiple Quantum Well Structures,” IEEE J. [210] P. W. Juodawlkis, F. J. O'Donnell, R. J. Bailey, J. J. Plant, K. G. Ray, D.
Quantum Electron. QE-20, 265–275 (1984) C. Oakley, A. Napoleone, M. R. Watts, and G. E. Betts, “InGaAsP/InP
[190] D. A. B. Miller, D. S. Chemla, D. J. Eilenberger, P. W. Smith, A. C. quantum-well electrorefractive modulators with sub-volt Vpi,” Proc.
Gossard, and W. T. Tsang, "Large Room-Temperature Optical SPIE 5435, Enabling Photonic Technologies for Aerospace Applications
Nonlinearity in GaAs/Ga1-xAlxAs Multiple Quantum Well Structures,” VI, (3 August 2004) doi: 10.1117/12.546786
Appl. Phys. Lett. 41, 679–681, (1982). [211] G. T. Reed, G. Mashanovich, F. Y. Gardes, and D. J. Thomson, “Silicon
[191] S. Schmitt-Rink, D. S. Chemla, W. H. Knox, D. A. B. Miller, "How fast optical modulators,” Nat. Photonics 4(8), 518–526 (2010)
is excitonic electroabsorption?" Optics Letters, 15, 60–62, (1990). doi:10.1038/nphoton.2010.179
[192] M. N. Islam, R. L. Hillman, D. A. B. Miller, D. S. Chemla, A. C. Gossard, [212] D. A. B. Miller, C. T. Seaton, M. E. Prise and S. D. Smith, "Bandgap
and J. H. English, "Electroabsorption in GaAs/AlGaAs Coupled Quantum Resonant Nonlinear Refraction in III-V Semiconductors,” Phys. Rev. Lett.
Well Waveguides,” Appl. Phys. Lett. 50, 1098–1100, (1987) 47, 197–200 (1981).
[193] K. W. Goossen, J. E. Cunningham, D. A. B. Miller, W. Y. Jan, A. L. [213] D. J. Bossert and D. Gallant, “Gain, Refractive Index, and a-Parameter in
Lentine, A. M. Fox, and N. K. Ailawadi, “Low Field Electroabsorption InGaAs-GaAs SQW Broad-Area Lasers,” IEEE Photonics Technol. Lett.
and Self-Biased Self-Electrooptics Effect Device Using Slightly 8, 322-324 (1996)
Asymmetric Coupled Quantum Wells,” Paper MB3, Topical Meeting on [214] K. F. Mak, L. Ju, F. Wang, and T. F. Heinz, “Optical spectroscopy of
Quantum Optoelectronics, Salt Lake City, March 1991 (Optical Society graphene: From the far infrared to the ultraviolet,” Solid State Commun.
of America, 1991) 152, 1341-1349 (2012) http://dx.doi.org/10.1016/j.ssc.2012.04.064
[194] D. A. B. Miller, D. S. Chemla and S. Schmitt-Rink, "Relation Between [215] J. G. Kim, W. S. Yun, S. Jo, J. D. Lee and C.-H. Cho, “Effect of interlayer
Electroabsorption in Bulk Semiconductors and in Quantum Wells: The interactions on exciton luminescence in atomic-layered MoS2 crystals,”
Quantum-Confined Franz-Keldysh Effect,” Phys. Rev. B33, 6976–6982 Scientific Reports 6, 29813 (2016) doi:10.1038/srep29813
(1986). [216] J. Klein, J. Wierzbowski, A. Regler, J. Becker, F. Heimbach, K. Müller,
[195] C. H. Henry, R. A. Logan, F. R. Merritt, and J. P. Luongo, “The Effect of M. Kaniber, and J. J. Finley, “Stark Effect Spectroscopy of Mono- and
Intervalence Band Absorption on the Thermal Behavior of InGaAsP Few-Layer MoS2,” Nano Lett. 16 (3), 1554–1559 (2016) DOI:
Lasers,” IEEE J. Quantum Electron. QE-19, 947-952 (1983) 10.1021/acs.nanolett.5b03954
[196] D. S. Chemla, I. Bar-Joseph, J. M. Kuo, T. Y. Chang, C. Klingshirn, G. [217] B. Mukherjee, F. Tseng, D. Gunlycke, K. K. Amara, G. Eda, and E.
Livescu, and D. A. B. Miller, “Modulation of absorption in field-effect Simsek, "Complex electrical permittivity of the monolayer molybdenum
quantum well structures,” IEEE J. Quantum Electron., 24, 1664–1676, disulfide (MoS2) in near UV and visible," Opt. Mater. Express 5, 447-455
(1988). (2015) doi: 10.1364/OME.5.000447
[197] D. S. Chemla, I. Bar-Joseph, C. Klingshirn, D. A. B. Miller, J. M. Kuo, [218] Y. Li, A. Chernikov, X. Zhang, A. Rigosi, H. M. Hill, A. M. van der
and T. Y. Chang, "Optical Reading of Field-Effect Transistors by Phase- Zande, D. A. Chenet, E.-M. Shih, J. Hone, and T. F. Heinz, “Measurement
Space Absorption Quenching in a Single InGaAs Quantum Well of the optical dielectric function of monolayer transition-metal
Conducting Channel,” Appl. Phys. Lett. 50, 585–587, (1987). dichalcogenides: MoS2, MoSe2, WS2, and WSe2,” Phys. Rev. B 90,
[198] S. J. Koester and M. Li, “Waveguide-Coupled Graphene 205422 (2014) DOI: 10.1103/PhysRevB.90.205422
Optoelectronics,” IEEE J. Selected Topics Quantum Electron. 20, [219] E. M. Purcell "Spontaneous emission probabilities at radio frequencies"
6000211 (2013) DOI: 10.1109/JSTQE.2013.2272316 Phys. Rev. 69, 681 (1946)
[199] M. Kleinert, F. Herziger, P. Reinke, C. Zawadzki, D. de Felipe, W. [220] A. Shakouri, “Nano-scale thermal transport and microrefrigerators on a
Brinker, H.-G. Bach, N. Keil, J. Maultzsch, and M. Schell, "Graphene- chip,” Proc. IEEE 94, 1613–1638 (2006) DOI:
based electro-absorption modulator integrated in a passive polymer 10.1109/JPROC.2006.879787
waveguide platform," Opt. Mater. Express 6, 1800-1807 (2016) doi: [221] L. Lu, L. Zhou, X. Sun, J. Xie, Z. Zou, H. Zhu, X. Li, and J. Chen,
10.1364/OME.6.001800 "CMOS-compatible temperature-independent tunable silicon optical
[200] M. Mohsin, D. Schall, M. Otto, A. Noculak, D. Neumaier, and H. Kurz, lattice filters," Opt. Express 21, 9447-9456 (2013) doi:
"Graphene based low insertion loss electro-absorption modulator on SOI 10.1364/OE.21.009447
waveguide," Opt. Express 22, 15292-15297 (2014) doi: [222] F. Qiu, A. M. Spring, H. Miura, D. Maeda, M.-A Ozawa, K. Odoi, and S.
10.1364/OE.22.015292 Yokoyama, “Athermal Hybrid Silicon/Polymer Ring Resonator Electro-
Optic Modulator,” ACS Photonics 3, 780−783 (2016) DOI:
10.1021/acsphotonics.5b00695
[223] K. Shang, B. Guan, S. T. S. Cheung, L. Liao, J. Basak, H.-F. Liu, and S.
J. B. Yoo, "CMOS-compatible, athermal silicon ring modulators clad with
titanium dioxide," Opt. Express 21, 13958-13968 (2013) doi:
10.1364/OE.21.013958
[224] G. P. Agrawal, Fiber-Optic Communication Systems – Fourth Edition
(Wiley, 2010)
[225] S. Franke-Arnold, L. Allen, and M. Padgett, “Advances in optical angular
momentum,” Laser & Photonics Reviews 2, 299-313 (2008)
[226] N. Bozinovic, Y. Yue, Y. Ren, M. Tur, P. Kristensen, H. Huang, A. E.
Willner, S. Ramachandran, “Terabit-Scale Orbital Angular Momentum
Mode Division Multiplexing in Fibers,” Science 340, 1545-1548 (2013)
DOI: 10.1126/science.1237861
[227] N. Zhao, X. Li, G. Li, and J. M. Kahn, “Capacity limits of spatially
multiplexed free-space communication,” Nature Photonics 9, 822–826
(2015) doi:10.1038/nphoton.2015.214

David A. B. Miller (M’83–F’95) received the Ph.D. degree in
physics from Heriot-Watt University, Edinburgh, U.K., in
1979. He was with Bell Laboratories from 1981 to 1996, as a
Department Head from 1987. He is currently the W. M. Keck
Professor of Electrical Engineering, and a Co-Director of the
Stanford Photonics Research Center at Stanford University,
Stanford, CA. He was President of the IEEE Lasers and Electro-
Optics Society (now Photonics Society) in 1995. His research
interests include physics and devices in nanophotonics,
nanometallics, and quantum-well optoelectronics, and
fundamentals and applications of optics in information sensing,
switching, and processing. He has published more than 260
scientific papers and the text Quantum Mechanics for Scientists
and Engineers (Cambridge, U.K.: Cambridge Univ. Press,
2008), and holds 73 patents. Dr. Miller has received numerous
awards. He is a Fellow of the Optical Society of America
(OSA), the American Physical Society (APS), the Royal
Society, and the Royal Society of Edinburgh, and a Member of
the National Academy of Sciences and the National Academy
of Engineering.
