OpenCL Software & BSPs
2017-03-20T10:42:41+00:00

Nallatech FPGA Accelerators support the Altera SDK for Open Computing Language (OpenCL™)

Nallatech’s partnership with Altera now enables programmers with little or no knowledge of FPGA technology to target Nallatech FPGA Accelerators using an industry-standard high-level programming language: OpenCL.
The Altera Software Development Kit (SDK) for OpenCL provides an environment in which programmers can develop their own massively parallel, compute-intensive applications while reducing power consumption and total cost of ownership.
Several Nallatech FPGA Accelerators are OpenCL compatible through optimized Board Support Packages (BSPs). The OpenCL BSPs abstract hardware-layer considerations away from the FPGA programmer, simplifying the FPGA programming model and debug cycle.
The OpenCL SDK lets FPGA enthusiasts and experts create optimized FPGA applications in a software environment, reducing their solution’s time to market.

The OpenCL Software Language

The Open Computing Language (OpenCL) standard is the first open, royalty-free, unified programming model for accelerating algorithms on heterogeneous systems. OpenCL allows a C-based programming language to be used for developing code across different platforms such as central processing units (CPUs), graphics processing units (GPUs), digital signal processors (DSPs), and field-programmable gate arrays (FPGAs).
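
To give a flavor of the C-based kernel language, here is a minimal sketch of an element-wise vector addition kernel. The `__kernel` and `__global` qualifiers and `get_global_id()` are standard OpenCL C. Everything inside the `#ifndef __OPENCL_VERSION__` guards (the empty macros, `emulated_global_id`, and `run_vector_add`) is a hypothetical host-side shim added only so the same source also builds with an ordinary C compiler; it is not part of any SDK.

```c
#ifndef __OPENCL_VERSION__
/* Host-only shims (assumptions for illustration). An OpenCL C compiler
   defines __OPENCL_VERSION__ and skips this section entirely. */
#include <stddef.h>
#define __kernel
#define __global
static size_t emulated_global_id;   /* stand-in for the work-item index */
static size_t get_global_id(unsigned int dim)
{
    (void)dim;                       /* this sketch is one-dimensional */
    return emulated_global_id;
}
#endif

/* The kernel itself: each work-item adds one pair of elements. */
__kernel void vector_add(__global const float *a,
                         __global const float *b,
                         __global float *c)
{
    size_t i = get_global_id(0);
    c[i] = a[i] + b[i];
}

#ifndef __OPENCL_VERSION__
/* Hypothetical host-only driver: runs the kernel once per work-item,
   sequentially, to mimic an n-element launch. */
static void run_vector_add(const float *a, const float *b, float *c, size_t n)
{
    for (emulated_global_id = 0; emulated_global_id < n; ++emulated_global_id)
        vector_add(a, b, c);
}
#endif
```

On a real device the work-items run in parallel rather than in a loop; the sequential driver only checks the kernel logic on the host.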

The OpenCL industry standard enables engineering teams to target FPGA-based products without descending to the level of detail required of hardware and firmware engineers programming in HDL. Existing CPU/GPU C or OpenCL code can be recompiled with the Altera OpenCL Software Development Kit to make use of the FPGA hardware resources.

When porting existing code or developing new algorithms, OpenCL is the new standard for reducing time to market for FPGA-based accelerator products.

FPGA Programming with OpenCL

OpenCL allows the programmer to construct a dedicated FPGA accelerator, with hardware-level optimizations performed automatically from the OpenCL code. The key FPGA features and benefits are abstracted in the syntax, and the programmer relies on the compiler to create highly parallel applications. The reconfigurable FPGA logic allows dedicated, optimized blocks to be generated for specific hardware functions.

Historically, FPGAs have been used as integer arithmetic accelerators. The Arria 10 FPGA family now also delivers higher FLOPS through dedicated floating-point resources (up to 1.5 TFLOPS), which OpenCL software leverages seamlessly, allowing an entirely new range of applications to benefit from FPGAs.

Previous generations of FPGA accelerators were limited by their IO throughput or memory bandwidth. The OpenCL Software Development Kit helps balance the high computing power of the FPGA logic with IO speed, enabling high-speed kernel-to-kernel and kernel-to-IO data transfers through the OpenCL channel extension.

The channel feature, combined with a highly flexible memory configuration in which internal and on-board memory can be customized to fit the application’s needs in a way that differs from GPUs, provides the platform for Nallatech FPGA accelerators to serve as optimized stream computing nodes in customers’ infrastructures.

The OpenCL Software Development Kit enables:

  • Thousands of parallel kernel executions
  • Configurable FPGA logic optimized for integer arithmetic
  • New dedicated floating-point FPGA resources (up to 1.5 TFLOPS)
  • Configurable local and global memory
  • Kernel-to-kernel / kernel-to-IO high bandwidth channels
  • Low Power

Altera Tool Flow

The Altera OpenCL SDK is a development environment for the software programmer; FPGA design considerations are abstracted away and handled automatically by the compiler. The flow is based on a debug and optimization cycle carried out in software, so the FPGA compilation needs to be performed only a limited number of times, once most of the application has been designed and optimized.

  1. Emulate to verify functionality
  2. Optimize OpenCL for the FPGA architecture (over 300 optimizations)
    1. Increase parallelism
    2. Ensure pipelining
    3. Use FPGA hardware resources
  3. Profile kernel performance
  4. Compile to the FPGA hardware target

The Altera SDK for OpenCL is in full production release. By following simple design guidelines, programmers can write OpenCL code with gate-level performance and port kernel code from platform to platform with minimal effort. The OpenCL SDK is the most efficient path to production and deployment for FPGA Accelerator solutions.

What is a BSP?

Nallatech’s expertise in FPGA-based hardware and algorithm acceleration is concentrated in the OpenCL Board Support Packages. The on-board resources and the FPGA low-level resources are automatically leveraged by the BSPs allowing the programmer to focus on the algorithm rather than its physical implementation in the FPGA.

Nallatech BSP offerings are tailored to specific needs. For compute-intensive applications, HPC BSPs maximize the FPGA’s resource utilization. For data streaming acceleration, the network-stream-enabled MAC BSPs provide a data flow straight to the FPGA fabric for in-stream bit operations.

Altera’s OpenCL SDK, combined with Nallatech’s BSP, enables the use of the newly available OpenCL channel feature. Channels are an OpenCL construct that allows high-bandwidth kernel-to-kernel or IO-to-kernel data transfers. These channels can take advantage of the high bandwidth of the FPGA fabric’s local memory.
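
Conceptually, a channel behaves like a bounded first-in, first-out queue between two kernels, so data can flow from a producer stage to a consumer stage without a round trip through global memory. The sketch below is a plain C software model of that idea, not SDK code: `channel_model`, `channel_write`, `channel_read`, and the depth of 8 are all hypothetical names and values chosen for illustration. In the Altera SDK the corresponding constructs are, to our understanding, the `cl_altera_channels` extension with `write_channel_altera()` and `read_channel_altera()`; check the programming guide for your SDK release.

```c
#include <stddef.h>

#define CHANNEL_DEPTH 8   /* hypothetical FIFO depth for this sketch */

/* Software model of a channel: a bounded first-in, first-out queue. */
typedef struct {
    int    data[CHANNEL_DEPTH];
    size_t head;    /* index of the next element to read */
    size_t count;   /* number of elements currently queued */
} channel_model;

/* Non-blocking write: returns 1 on success, 0 if the channel is full. */
static int channel_write(channel_model *ch, int value)
{
    if (ch->count == CHANNEL_DEPTH)
        return 0;
    ch->data[(ch->head + ch->count) % CHANNEL_DEPTH] = value;
    ch->count++;
    return 1;
}

/* Non-blocking read: returns 1 and stores the value, or 0 if empty. */
static int channel_read(channel_model *ch, int *value)
{
    if (ch->count == 0)
        return 0;
    *value = ch->data[ch->head];
    ch->head = (ch->head + 1) % CHANNEL_DEPTH;
    ch->count--;
    return 1;
}

/* A producer stage streams the squares 1*1 .. n*n into the channel and a
   consumer stage sums them. The stages are interleaved here; on an FPGA
   they would run concurrently as separate kernels. */
static long stream_sum_of_squares(int n)
{
    channel_model ch = { {0}, 0, 0 };
    long sum = 0;
    int next = 1;
    int value;

    while (next <= n || ch.count > 0) {
        /* Producer: write while input remains and the channel has room. */
        while (next <= n && channel_write(&ch, next * next))
            next++;
        /* Consumer: drain whatever is currently available. */
        while (channel_read(&ch, &value))
            sum += value;
    }
    return sum;
}
```

The point of the model is that the intermediate values never occupy more than `CHANNEL_DEPTH` slots, however long the stream is.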

Fully Integrated Solutions

Nallatech OpenCL-capable FPGA Accelerators are available as fully integrated, production-ready solutions. The BSP can be installed and deployed from a single installer on the development and runtime systems. Nallatech also offers BSP Debug Kits, which include the Altera Quartus II / OpenCL SDK licenses for customers who require them.

Nallatech OpenCL BSPs also include several features to facilitate production system deployment:

  • Board health status (power consumption & temperature)
  • Altera PCIe Hard IP cores (tested across industry standard systems)
  • Flash recovery mechanisms

We also provide pre-installed, ready-to-use integrated servers with all the software and hardware pieces included.

Accelerator Metrics

The Nallatech FPGA Accelerators compatible with OpenCL are based on two FPGA families: Arria 10 and Stratix V. When choosing an FPGA Accelerator to fit their system’s requirements, customers must first consider both the top-level FPGA resource requirements of their algorithm and the capabilities of the FPGA Accelerator.

Nallatech offers multiple FPGA Accelerators and Board Support Packages to target these needs. The following sections describe Nallatech’s BSP IP offering.

Host to Global Memory Bandwidth | 8-lane PCIe 2.0 | 8-lane PCIe 2.0 | 8-lane PCIe 3.0 | 16-lane PCIe 3.0 (2 x 8-lane)
Global Memory Depth | 8 GB (up to 16 GB) | 32 GB | 8 GB (up to 32 GB) | 32 GB
IO to Kernel Bandwidth | Up to 2 x 10 GbE (MAC BSP) | Up to 4 x 10 GbE (MAC BSP) | Up to 2 x 40G Serial Links (not GbE) or up to 2 x 10 GbE (MAC BSP)
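
As a rough guide to the interface figures above, the theoretical bandwidth of a PCIe link can be estimated from the per-lane transfer rate and line encoding: 5 GT/s with 8b/10b encoding for PCIe 2.0, and 8 GT/s with 128b/130b encoding for PCIe 3.0. The C sketch below does this arithmetic. The function names are illustrative, and the results are upper bounds that ignore protocol overhead, so measured throughput on any given board will be lower.

```c
/* Theoretical PCIe bandwidth per direction, in gigabytes per second.
   generation: 2 or 3. lanes: link width (for example 8 or 16).
   Returns 0.0 for generations this sketch does not cover. */
static double pcie_bandwidth_gbytes(int generation, int lanes)
{
    double gigatransfers;   /* GT/s per lane */
    double efficiency;      /* payload bits per transmitted bit */

    switch (generation) {
    case 2:
        gigatransfers = 5.0;
        efficiency = 8.0 / 10.0;      /* 8b/10b encoding */
        break;
    case 3:
        gigatransfers = 8.0;
        efficiency = 128.0 / 130.0;   /* 128b/130b encoding */
        break;
    default:
        return 0.0;
    }
    /* Divide by 8 to convert gigabits to gigabytes. */
    return gigatransfers * efficiency * lanes / 8.0;
}

/* Raw Ethernet line rate in gigabytes per second for a group of ports. */
static double ethernet_bandwidth_gbytes(int ports, double gbits_per_port)
{
    return ports * gbits_per_port / 8.0;
}
```

Under these assumptions an 8-lane PCIe 2.0 link works out to 4 GB/s per direction, an 8-lane PCIe 3.0 link to roughly 7.9 GB/s, and two 10 GbE ports to 2.5 GB/s.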



The High Performance Computing BSP, or HPC BSP, provides the larger amount of FPGA resources to the user algorithm.

OpenCL HPC BSP - High Performance Computing - Board Support Package
Use the OpenCL SDK features to maximize the FPGA fabric utilization by replicating multiple parallel instances of your optimized OpenCL kernel code.
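
Replication of this kind is typically requested with a kernel attribute rather than by copying code. The sketch below assumes the Altera SDK's `num_compute_units` kernel attribute, wrapped in a hypothetical `COMPUTE_UNITS` macro; the factor of 4 is arbitrary, and the XOR mask is a toy stand-in for a real workload, not actual encryption. The host-side shims (`emulated_global_id`, `run_xor_mask`, and the empty macros) are assumptions added so the source also builds with an ordinary C compiler.

```c
#ifndef __OPENCL_VERSION__
/* Host-only shims (assumptions for illustration). */
#include <stddef.h>
#define __kernel
#define __global
#define COMPUTE_UNITS(n)              /* replication is a no-op on the host */
static size_t emulated_global_id;     /* stand-in for the work-item index */
static size_t get_global_id(unsigned int dim)
{
    (void)dim;
    return emulated_global_id;
}
#else
/* Device build: ask the compiler for n copies of the kernel pipeline. */
#define COMPUTE_UNITS(n) __attribute__((num_compute_units(n)))
#endif

/* Toy byte-wise XOR mask; each work-item handles one byte. */
COMPUTE_UNITS(4)
__kernel void xor_mask(__global const unsigned char *in,
                       __global unsigned char *out,
                       unsigned char key)
{
    size_t i = get_global_id(0);
    out[i] = (unsigned char)(in[i] ^ key);
}

#ifndef __OPENCL_VERSION__
/* Hypothetical host-only driver: one sequential call per work-item. */
static void run_xor_mask(const unsigned char *in, unsigned char *out,
                         unsigned char key, size_t n)
{
    for (emulated_global_id = 0; emulated_global_id < n; ++emulated_global_id)
        xor_mask(in, out, key);
}
#endif
```

The kernel body is unchanged by the attribute; only the amount of FPGA fabric devoted to it grows, which is why the achievable replication factor depends on the resources the BSP leaves free.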

High Bandwidth Kernel-to-Kernel Channel Support

Typical Applications:

  • Encryption
  • Compression
  • Etc.

Resource Usage:

  • PCIe Host Interface
  • On-board Global Memory Buffers


The MAC BSP connects the FPGA fabric to the network; the FPGA fabric is connected to a high bandwidth data stream via 10 GbE MAC cores.
Stream Accelerator
The network packets are passed directly to the OpenCL kernel, and the UDP packet fields can be accessed as standard OpenCL constructs. The kernel code can filter packets by analyzing their headers or modify the packets’ payloads.
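
As an illustration of header-based filtering, the sketch below checks the destination port of a UDP datagram. It is written in plain C and assumes the kernel sees the datagram as a raw byte buffer starting at the UDP header; how the MAC BSP actually presents packet fields to a kernel is not described here, so `read_be16` and `udp_matches_port` are hypothetical helpers. The 8-byte header layout (source port, destination port, length, checksum, each 16 bits in network byte order) is the standard UDP format.

```c
#include <stddef.h>
#include <stdint.h>

#define UDP_HEADER_BYTES 8   /* source port, destination port, length, checksum */

/* Read a 16-bit big-endian (network byte order) field. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns 1 if the datagram is addressed to wanted_port, otherwise 0.
   udp points at the first byte of the UDP header; len is the number of
   bytes available from that point. */
static int udp_matches_port(const uint8_t *udp, size_t len, uint16_t wanted_port)
{
    if (len < UDP_HEADER_BYTES)
        return 0;                               /* truncated header: drop */
    return read_be16(udp + 2) == wanted_port;   /* bytes 2-3: destination port */
}
```

A filtering kernel would forward only the packets for which such a test succeeds and discard the rest before they reach host memory.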

High Bandwidth Kernel-to-Kernel & IO-to-Kernel Channel Support

Typical Applications:

  • Network Packet Filtering
  • In Stream Processing
  • Etc.

Resource Usage:

  • PCIe Host Interface
  • On-board Global Memory Buffers
  • 10 GbE MAC Streaming Interfaces

Customized BSP

Nallatech can also develop Customized Board Support Packages for your specific needs. Multiple IO protocols are supported by Nallatech FPGA Accelerators. Our team of FPGA Acceleration Experts can work with your organization to develop a customized Board Support Package.

Get Started!

Contact Nallatech

Let's discuss how your program needs can be met using our custom solutions.
Get Started!

20 years of Experience & Innovation

Nallatech’s partnership with Altera in supporting the OpenCL SDK is the logical continuation of 20 years of experience promoting high-level language programming of FPGAs. Understanding customers’ system challenges and identifying the best approach to accelerate and optimize each customer’s application is in Nallatech’s DNA.

Nallatech Focus

Nallatech believes FPGA-based products should be:

  • Intuitive & Easy to Use
  • Flexible
  • Production Ready
  • Easy to deploy
  • Integrated in Customer’s System

We believe OpenCL serves these goals and we are excited to see you succeed!

A Team of System Acceleration Experts

At Nallatech we have assembled a top-notch design and engineering team that can engage with you to ensure your program’s success. We work best when we are deeply engaged with customers at the early development stage, leveraging our multiple disciplines to deliver a solution on-time, on-budget and on-spec with minimum risk.

Nallatech Design Services Key Value:

  • Reduced Risk
  • Lower Costs
  • Faster time to Market

Nallatech’s R&D Department is constantly providing new solutions to the industry’s challenges. Find out more in our White Papers & Publications sections.

OPRA FAST Decoder

The OPRA FAST parser example parses incoming compressed OPRA FAST data and returns a subset of fields over Ethernet. This example is based on the Nallatech MAC BSP and shows how to implement a subset of UDP operations in OpenCL kernel code.

Read more: Altera OPRA FAST Parser Design

Pre-Compiled AOCX

Nallatech provides pre-compiled packages for all the example designs available on the Altera Examples Pages.

Active Community

The Altera OpenCL SDK has attracted wide attention since its release. A community of users has formed and is now very active on the Altera OpenCL Forum.

Research Papers, White Papers & Articles can be found on the Altera OpenCL Publication Page.
