Roofline News

Updates and insights on how Roofline enables easy edge AI deployment
December 26, 2025

Dropping our biggest feature to close the year: Unlock your full SoC with heterogeneous execution.

Today, we are showing a capability the edge AI ecosystem has been missing for years. Modern edge SoCs are hardware powerhouses: CPU, GPU, and NPU on a single chip, each with complementary strengths. But until now, software could only use them one at a time, forcing them into slow lockstep. The result: idle hardware and wasted performance.

That gap ends today. roofline now enables true asynchronous heterogeneous execution, across devices, across vendors, fully end-to-end.

In this demo, we showcase Qwen3-0.6B on NXP Semiconductors' i.MX 95. The model is executed heterogeneously on the CPU and GPU in parallel, with parts of the prefill and decode workloads offloaded to the GPU.
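To make the idea of asynchronous heterogeneous execution concrete, here is a minimal conceptual sketch (not roofline's actual implementation or API): each device pulls its own work items independently instead of running in lockstep, so shards of a "prefill" workload placed on different compute units proceed in parallel. The device names and the shard/placement scheme are illustrative assumptions.

```python
# Conceptual sketch of asynchronous heterogeneous execution (illustrative
# only, not roofline's API): two simulated devices each own a work queue,
# so the "cpu" and "gpu" never wait for each other between shards.
from concurrent.futures import ThreadPoolExecutor

def run_on_device(device, shard):
    # Stand-in for launching a compiled kernel on one compute unit.
    return [device, sum(shard)]

def heterogeneous_prefill(token_shards, placement):
    # placement maps shard index -> device; one single-worker executor per
    # device models an independent, asynchronously-draining command queue.
    pools = {d: ThreadPoolExecutor(max_workers=1) for d in set(placement)}
    futures = [pools[placement[i]].submit(run_on_device, placement[i], s)
               for i, s in enumerate(token_shards)]
    results = [f.result() for f in futures]  # submission order preserved
    for p in pools.values():
        p.shutdown()
    return results

shards = [[1, 2], [3, 4], [5, 6], [7, 8]]
placement = ["cpu", "gpu", "cpu", "gpu"]
print(heterogeneous_prefill(shards, placement))
# [['cpu', 3], ['gpu', 7], ['cpu', 11], ['gpu', 15]]
```

The key point of the sketch is the per-device queue: work for one device is never blocked behind work scheduled on another, which is what distinguishes asynchronous heterogeneous execution from lockstep offloading.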

Get all the details in our full case study, where we outline our heterogeneous execution infrastructure, show how it extends to NPUs, and apply it across NXP Semiconductors, Qualcomm, and Apple SoCs: case study: asynchronous heterogeneous execution

#EdgeAI #AIDeployment #AICompiler #MLIR #Roofline

October 31, 2025

Thank you, HiPEAC, for featuring roofline in the latest HiPEAC magazine! Check out the article to learn more about our retargetable edge AI compiler and recent progress.

https://www.hipeac.net/news/magazine/7172/hipeacinfo-76/#/

October 27, 2025

We are thrilled to be part of the next edition of the LLVM Foundation Developers’ Meeting this week.

We will be presenting some of roofline's latest compiler work. We are especially proud of the joint work by Ege Beysel and Andrzej Warzyński from Arm on MLIR data tiling for Arm's Scalable Vector Extension (SVE). In addition, Maximilian Bartel will give a deep dive into critical data layout transformations for tensor computations in MLIR.

Our team will be in Santa Clara all week, so please reach out to Maximilian Bartel, Christopher M., or Ege Beysel if you're around!

October 13, 2025

Enabling specific devices for AI deployment is one thing; making it developer-friendly is another. Most edge AI toolchains are complex and hard to use.

roofline changes that. In this short demo, our CTO Maximilian Bartel shows how simple deployment can be with our SDK: By changing just one line of code, he compiles and runs Qwen 2.5 0.5B on CPU (Arm Cortex-A) and GPU (Arm Immortalis). No manual configuration, no hassle.
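Since roofline's SDK interface is not documented here, the following is a purely hypothetical sketch of what a one-line retarget looks like in principle: the function name `compile_model` and the target strings are illustrative placeholders, not the real API. The point being demonstrated is that switching execution targets is a single-argument change on top of a shared compiler pipeline.

```python
# Hypothetical sketch, NOT roofline's actual API. A retargetable compiler
# lowers the model through a shared IR (e.g. MLIR) and only swaps the
# backend selected by `target`, so retargeting is a one-line change.
def compile_model(model, target):
    # Placeholder for the real compile step: same front end, different backend.
    return f"{model} -> {target}"

cpu_artifact = compile_model("qwen2.5-0.5b", target="arm-cortex-a")    # CPU build
gpu_artifact = compile_model("qwen2.5-0.5b", target="arm-immortalis")  # same call, new target
print(cpu_artifact)
print(gpu_artifact)
```

Everything upstream of the target choice (model import, quantization, graph transformations) stays identical, which is what makes the switch a one-line edit rather than a port to a different toolchain.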

This is what retargetable AI deployment truly means. We unlock full SoCs for the edge AI products you dream of.

July 15, 2025

Edge SoCs are becoming increasingly complex and heterogeneous. Unlocking the entire chip for AI workloads (CPU, GPU, NPU) is key to building performant edge AI products. For example, this paves the way for on-device agentic AI, where multiple models can run in parallel across different compute cores.

roofline's SDK provides one integrated toolchain to easily switch execution targets. Enabled by our MLIR-based compiler, retargeting takes just one line of code.

In this demo, we showcase the migration of Qwen3 from Qualcomm's Oryon™ CPU to the Adreno™ GPU, realizing a speed-up of ~2x, without any switching of tools or manual rewrites.

July 4, 2025

We are honored to announce that roofline has been selected by the European Innovation Council (EIC) Accelerator, receiving €2.5M in grant funding and a pre-committed equity investment for our next funding round.

The EIC Accelerator is Europe’s most competitive deep tech funding program. In this round, only 40 startups were selected from nearly 1,000 applicants: https://lnkd.in/e4aeD_sc

The recognition of our project "Retargetable AI Compiler Technology for Scalable Edge Deployment of Next-Generation AI Models" not only validates the potential of our technology, it fuels our mission to enable the edge AI products you dream of. Thank you to the European Innovation Council and SMEs Executive Agency (EISMEA) and everyone who supported us on this journey.

June 19, 2025

On-device image-to-text with multimodal LLMs. LLMs are getting smaller and now fit on edge systems. But bringing them into products and unlocking disruptive use cases remains a challenge. Common edge AI deployment tools cannot keep up with the pace of AI innovation, especially with cutting-edge models like multimodal LLMs.

Here is a look at what roofline's MLIR-based compiler can do. We run an image-to-text task using Google DeepMind's Gemma-3-4B, fully compiled, on real Qualcomm edge hardware:

🖼️ Input: Camera view of a mobile robot in an aisle.

💬 Output: Natural language reasoning. The mobile robot decides to slow down and adjust its path.

⚡ Performance: ~9x faster than TorchInductor.

Curious? Let's talk
