Oriole’s fast optical reconfigurable network

- Start-up Oriole Networks has developed a photonic network to link numerous accelerator chips in an artificial intelligence (AI) data centre.
- The fast photonic network is reconfigurable every 100 nanoseconds and is designed to replace tiers of electrical switches.
- Oriole says its photonic networking saves considerable power and ensures the network is no longer a compute bottleneck.
Georgios Zervas, CTO of Oriole Networks.
In a London office bathed in spring sunlight, the team from Oriole Networks, a University College London (UCL) spin-out, detailed its vision for transforming AI and high-performance computing (HPC) data centres.
Oriole has developed a networking solution, dubbed Prism, that uses fast reconfigurable optical circuit switches to replace the tiers of electrical packet switches used to connect racks of AI processors in the data centre.
Electrical switches perform a crucial role in the data centre by enabling the scaling of AI computers comprising thousands of accelerator chips. Such chips, whether graphics processing units (GPUs), tensor processing units (TPUs), or, more generically, xPUs, are used to tackle large AI computational workloads.
The workloads include imprinting learning onto large AI models (training) and, once a model is trained, inferencing, where the model shares its knowledge when prompted.
Oriole’s novel network is based on optical circuit switches that can switch rapidly in response to changes in the workload, allocating xPU resources as required, something electrical packet switches already do very well but at a considerable power cost.
Origins
Founded in 2023, Oriole builds on over a decade of research by Georgios Zervas and his team at UCL.
The start-up has raised $35 million, including a $22 million Series A led by Ian Hogarth of investment firm Plural, a technology entrepreneur and Chair of the UK’s AI Security Institute.
The view from Oriole's London office.
The company, now 50-strong, has two UK offices, in London and Paignton, and one in Palo Alto.
Oriole’s team blends photonics expertise, including the former Lumentum coherent-transceiver group in Paignton, with networking and programmable-logic talent from Intel’s former Altera division west of London, whose design work addressed hyperscalers’ needs.
AI data centre metrics
Power is a key constraint limiting the productivity of an AI data centre.
“You can only get so much power to a data centre site,” says Joost Verberk, vice president, business development and marketing at Oriole. “Once that is determined, everything else follows; the systems and networking must be as power efficient as possible so that all the power can go to the GPUs.”
Joost Verberk
Oriole highlights two metrics Nvidia’s Jensen Huang used at the company’s recent GTC event to quantify AI data centre efficiency.
One is tokens per second per megawatt (tokens/s/MW). Tokens are data elements, such as a portion of a word or a strip of pixels from a digital image, that are fed to or produced by an AI model. The more tokens created, the more productive the data centre.
The second metric is response speed, measured in tokens per second (tokens/s), which gauges how quickly the model responds to a user.
Oriole says these two metrics are not always aligned, but the goal is to use less power while producing more tokens faster.
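To see how the two metrics interact, consider a minimal worked example. All figures below are illustrative assumptions, not Oriole’s or Nvidia’s numbers:

```python
# Illustrative only: made-up figures showing how the two GTC metrics relate.

site_power_mw = 50.0          # total site power budget (MW), assumed
network_power_mw = 5.0        # power consumed by networking (MW), assumed
tokens_per_second = 2.0e9     # aggregate token throughput today, assumed

# Efficiency metric: tokens produced each second per megawatt consumed.
print(f"Efficiency: {tokens_per_second / site_power_mw:.1e} tokens/s/MW")

# If the network's power draw were halved and the freed megawatts went to
# GPUs, and throughput scaled with GPU power (both assumptions), the same
# site would produce more tokens at the same total power.
gpu_power_mw = site_power_mw - network_power_mw
new_gpu_power_mw = gpu_power_mw + network_power_mw / 2
new_tokens = tokens_per_second * new_gpu_power_mw / gpu_power_mw
print(f"Efficiency after: {new_tokens / site_power_mw:.1e} tokens/s/MW")
```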
Discussing tokens implies that the data centre’s hardware is used for inference. However, Oriole stresses that training AI models using less power is also a goal. Oriole’s optical networking solution can be applied to both inference and training.
Going forward, only a handful of companies, such as hyperscalers, will train the largest AI models. Many smaller AI clusters will be deployed and used for inference.
“By 2030, 80 per cent of AI will be inferencing,” says James Regan, CEO of Oriole.
Networking implications
Inferencing, by its nature, means that the presented AI tasks change continually. One implication is that the networking linking the AI processors must be dynamic: grabbing processors for a given task and releasing them on completion.
Georgios Zervas, Oriole's CTO, points out that while Nvidia uses the same GPU for training and inferencing, Google’s latest TPU, Ironwood, has inferencing enhancements. Google also has AI computing clusters dedicated to inference jobs.
Amazon Web Services, meanwhile, has separate accelerator chips for inferencing and training. The two have different interconnect bandwidth (input-output, or I/O) requirements, with the inferencing chip’s being lower.
For training, the pattern of data exchange between the xPUs, which depends on how the task is parallelised, is highly predictable. “You can create a series of optical short-lasting circuits that minimise collective communication time,” says Zervas. However, the switches must be deterministic and synchronous. “You should not have [packet] queues,” he says.
Inferencing, which may access many AI ‘mixture of experts’ models, requires a more dynamic system. “Different tokens will go to different sets of experts, spread across the xPUs,” says Zervas. “Sometimes, some xPUs batch the queries and then flush them out.”
The result is non-deterministic traffic, closer to the traffic patterns of traditional cloud data centres. Here, the network must be reconfigured quickly, in hundreds of nanoseconds.
“What we say is that a nanosecond-speed optical circuit switch has a place wherever any electrical packet switch has a place,” says Zervas. It’s still a circuit switch, stresses Zervas, even at such fast switching speeds, since there is a guaranteed path between two points. This is unlike ‘best effort’ traffic in a traditional electrical switch, where packets can be dropped.
“In our case, that link can last just as short as [the duration of] a packet,” says Zervas. “Our switches can be reconfigured every 100 nanoseconds.”
Once the link is established, data is sent to the other end without encountering queuing. Or, as Zervas puts it, the switching matches the granularity of packets yet has the delivery guarantees that only a circuit can provide.
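A back-of-the-envelope sum shows why a 100-nanosecond slot is packet-like in granularity; it assumes the 800-gigabit line rate of Prism’s optics, described below:

```python
# How much data fits into one 100 ns circuit at an 800 Gb/s line rate?

line_rate_bps = 800e9   # 800 Gb/s optics
slot_s = 100e-9         # 100 ns reconfiguration interval

bits_per_slot = line_rate_bps * slot_s   # 80,000 bits
bytes_per_slot = bits_per_slot / 8       # 10,000 bytes

# Roughly 10 kB per slot: about one jumbo Ethernet frame (9 kB), or a
# handful of standard 1,500-byte packets, so each short-lived circuit
# carries about a packet's worth of data.
print(f"{bytes_per_slot:,.0f} bytes per 100 ns slot")
```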
Optics’ growing role in data centre networking
Currently, protocols such as InfiniBand or Ethernet are used to connect racks of xPUs in what is commonly referred to as the scale-out network. For xPUs to talk to each other, a traditional Clos or ‘fat tree’ architecture comprising a hierarchy of electrical switches is used.
Because of the distances spanned within a data centre, pluggable optical transceivers carry traffic from an xPU, via its network interface card, across the switching network to the destination network interface card and xPU.
In a newer development, Broadcom and Nvidia have announced switches that integrate the optics with the switch silicon. Using such co-packaged optics circumvents the need for pluggable optical transceivers on the front panel of an electrical switch platform.
Google has also developed its data centre architecture to include optical circuit switches instead of the top tier of large electrical switches. In such a hybrid network, electrical switches still dominate the overall network. However, using the optical layer saves cost and power and allows Google to reconfigure the interconnect between its TPU racks as it moves workloads around.
However, Google’s optical circuit switches reconfigure far more slowly than Oriole’s, certainly not in nanoseconds.
With its Prism architecture, Oriole is taking the radical step of replacing all the electrical switching, not just the top tier. The result is a flat passive optical network. (See diagram below.)
“Switching happens at the edge [of the network] and the core is fully passive; it is made just of glass,” says Verberk.
The resulting network has zero packet loss and is highly synchronous. Eliminating electrical switches reduces overall power and system complexity while delivering direct, high-speed xPU-to-xPU connectivity.
Prism architecture
Oriole’s first announcement is the Prism architecture, which hinges on three system components:
- A PCI Express (PCIe)-based network interface card.
- A novel pluggable module, the XTR, that includes the optical transceiver and switching.
- A photonic router that houses athermal arrayed waveguide gratings (AWGs) to route the different wavelengths of light. The router box is passive and has no electronics.
“You go optically from the GPU out to another GPU, and the only [electrical-optical] conversion that happens is at the network interface card next to each GPU,” says Verberk.
The PCIe-based network interface card uses 800-gigabit optics and integrates with standard software ecosystems.
Built around an FPGA that includes Arm processors, the card supports protocols like Nvidia’s NCCL (Nvidia Collective Communications Library) and AMD’s RCCL (ROCm Communication Collectives Library) via plugins, ensuring compatibility with existing AI software frameworks.
The network interface card acts as a deterministic data transport, mapping the collective operations used for AI computation (Message Passing Interface-style operations such as all-reduce and scatter-gather) to optical paths with minimal latency.
For training, the card’s scheduler maps the deterministic traffic patterns directly to wavelengths and fibres; for inference, it reconfigures dynamically based on workload demands, using a standard direct memory access (DMA) engine.
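As an illustration of such scheduling, the sketch below compiles the steps of a ring all-reduce, one of the collectives named above, into short-lived circuits. The Circuit descriptor and its fields are assumptions for exposition; Oriole has not published its scheduler interface:

```python
# Hypothetical sketch: compiling a ring all-reduce into pre-computed optical
# circuits. The Circuit fields are illustrative assumptions, not Oriole's API.

from dataclasses import dataclass

@dataclass(frozen=True)
class Circuit:
    src: int         # sending xPU
    dst: int         # receiving xPU
    fibre: int       # fibre in the ribbon, selecting the destination rack
    wavelength: int  # laser colour, selecting the node within that rack
    timeslot: int    # which 100 ns slot the circuit occupies

def ring_allreduce_schedule(n_xpus: int, nodes_per_rack: int) -> list[Circuit]:
    """Ring all-reduce: in step k, every xPU i sends a chunk to xPU (i+1) % n.
    Because the pattern is fully deterministic, every circuit can be
    computed ahead of time, as the scheduler does for training traffic."""
    circuits = []
    for step in range(2 * (n_xpus - 1)):   # reduce-scatter, then all-gather
        for i in range(n_xpus):
            dst = (i + 1) % n_xpus
            circuits.append(Circuit(
                src=i,
                dst=dst,
                fibre=dst // nodes_per_rack,
                wavelength=dst % nodes_per_rack,
                timeslot=step,
            ))
    return circuits

# Example: 8 xPUs, 4 per rack; show the circuits for the first timeslot.
for c in ring_allreduce_schedule(8, 4)[:8]:
    print(c)
```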
The XTR pluggable module is the heart of Prism’s switching capability. “Within a pluggable form factor unit, we do transmission, reception, and switching,” says Zervas.
The photonic network combines three dimensions of switching: optical wavelengths, space switching, and time slots (time-division multiplexing).
The wavelength, or colour, is selected using a fast tunable laser.
The space switching inside the XTR pluggable refers to the selected fibre path. “You have a ribbon of fibres, and you can choose which fibres you want to go to,” says Regan.
James Regan
The time dimension refers to 100-nanosecond time slots, the time it takes for the tunable laser to settle on a new wavelength. Rapid colour changes can thus be used to route data to specific nodes.
“The modulated channel can determine which communication group or cluster you can go to, and the fibre route can determine the logical rack you’re going to, and then the colour of light you’re carrying can determine the node ID within the rack,” says Zervas.
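In code form, Zervas’s description amounts to a three-part destination address, something like the following hypothetical mapping (the field names are assumptions, not Oriole’s):

```python
# Hypothetical sketch of Prism's three switching dimensions as an address:
# the modulated channel picks the communication group, the fibre picks the
# logical rack, and the wavelength picks the node ID within the rack.

def optical_address(group: int, rack: int, node: int) -> dict:
    return {
        "channel": group,      # communication group or cluster
        "fibre": rack,         # which fibre of the ribbon to transmit on
        "wavelength": node,    # laser colour = node ID within the rack
    }

# Example: reach node 3 in logical rack 12 of group 0.
print(optical_address(group=0, rack=12, node=3))
```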
The photonic routers, passive arrayed waveguide gratings, form Prism’s core. “They’re just glass, which means they are athermal,” says Regan, highlighting their reliability and zero power consumption. These N-by-N arrayed waveguide gratings route light based on wavelength and fibre selection, acting like prisms.
“On one port, let’s say the input port, we have a colour red; if it’s red, it comes to the first output, if it’s blue, to the second, if it’s purple, to the third, etc.,” says Zervas.
Stacked racks of multiple arrayed waveguide gratings can handle large-scale clusters, maintaining a single optical hop for consistent signal-to-noise ratio and insertion loss.
"Every node to every other node goes through this only once, ensuring uniform performance across thousands of GPUs," says Zervas.
Prism’s power and compute efficiencies
In an example 8,000-GPU cluster, Prism eliminates 128 leaf and 64 spine electrical switches, cutting the number of optical transceivers by 60 per cent.
For even larger AI clusters of over 16,000 GPUs, where a third tier of switching is typically needed, Prism reduces the number of transceivers by 77 per cent.
Prism thus reduces overall power not only by cutting optical transceivers but also by removing the electrical switching and the associated cooling it needs.
Unlike Ethernet packet switching, Prism’s optical circuits guarantee delivery without queuing, reconfiguring every 100 nanoseconds to match packet durations.
For training, Prism reduces communication overhead to under 1 per cent, compared with the tens of per cent typical in existing networks. This means the GPUs rarely wait for data and spend their time computing.
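A rough sum shows what that overhead reduction buys, taking 30 per cent as an assumed baseline within ‘tens of per cent’ and assuming throughput scales with the time GPUs spend computing:

```python
# Rough illustration: effective GPU throughput if communication overhead
# falls from an assumed 30% baseline to Prism's claimed under-1%.

baseline_overhead = 0.30   # assumed: "typically tens of per cent"
prism_overhead = 0.01      # "under 1 per cent"

baseline_util = 1 - baseline_overhead   # GPUs computing 70% of the time
prism_util = 1 - prism_overhead         # GPUs computing 99% of the time

print(f"Throughput gain: {prism_util / baseline_util:.2f}x "
      f"({baseline_util:.0%} -> {prism_util:.0%} utilisation)")
```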
Market and deployment strategy
Oriole targets three segments: enterprises such as financial traders; HPC users such as car makers; and switch makers and hyperscalers.
“Our potential customer base is much wider,” says Regan, contrasting with chip-level optical input-output players focusing on specific chip vendors and hyperscalers.
Prism also features an Ethernet gateway that allows integration with existing data centres, avoiding a rip-and-replace. “You could just do that in the pieces of your data centre where you need it, or where you do new builds,” says Regan.
Oriole’s roadmap includes lab demonstrations this summer, alpha hardware by early 2026, deployable products by the end of 2026, and production ramp-up in 2027. Manufacturing is outsourced to high-volume contract manufacturers.
Challenges and outlook
Convincing hyperscalers to adopt a non-standard software stack remains a hurdle. “It becomes a collaboration,” says Zervas, noting the hyperscalers’ use of proprietary protocols.
Oriole’s full-stack approach, spanning from Nvidia’s CUDA libraries down to photonic circuits, does set it apart.
“It’s not often you bump into a company that has deep expertise in both [photonics and computing],” says Regan, contrasting Oriole with photonics-only or computing-only competitors.
“We’re building something here,” says Regan. “We’re building a major European player for networking, for AI, and arbitrary workloads into the future.”