In recent months, Majestic Labs has been considered one of the most mysterious and intriguing companies operating in Israel. Founded by a team of former executives from the engineering departments of Google and Meta, it raised 100 million dollars while promising to develop an artificial intelligence server that would beat Nvidia on cost-effectiveness per AI processing operation (price per token).
CEO Ofer Shaham, who founded Meta’s chip lab at Mark Zuckerberg’s invitation, had not revealed until today (Tuesday) how the company intends to compete with Nvidia. In his first interview with the Israeli media, Shaham explains what lies behind the server that aims to solve one of the major bottlenecks in the artificial intelligence servers of companies such as Nvidia and AMD – a severe shortage of memory.
Ofer Shaham and his American co-founders, Sha Rabii and Masumi Reynders, were not content with developing a new graphics processor; they engineered a new server from the ground up, named “Prometheus”, on the assumption that relieving the bottleneck in artificial intelligence processing starts there.
Each of the Israeli company’s servers is equipped with memory chips providing roughly 100 times the memory of a standard Nvidia server built around Blackwell (B200) processors. Thanks to a structure that gives each processor extended access to memory, Majestic Labs’ processors have 128 terabytes of memory available, and the server’s architecture – the way the central memory components are wired to the various chips – effectively provides a thousand times the memory available to an average Nvidia Blackwell processor, which is estimated at only 192 gigabytes.
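The arithmetic behind these ratios can be sketched roughly. Note one assumption that is not in the article: the “standard Nvidia server” baseline is taken here to be an 8-GPU Blackwell machine.

```python
# Back-of-the-envelope check of the memory figures quoted above.
# Assumption (mine, not Majestic Labs'): a "standard Nvidia server"
# is an 8-GPU Blackwell (B200) box; 1 TB = 1024 GB.

B200_MEMORY_GB = 192        # per-GPU memory quoted in the article
GPUS_PER_SERVER = 8         # assumed 8-GPU server configuration
MAJESTIC_MEMORY_TB = 128    # per-server figure quoted in the article

majestic_gb = MAJESTIC_MEMORY_TB * 1024
server_ratio = majestic_gb / (B200_MEMORY_GB * GPUS_PER_SERVER)  # vs. a whole server
per_gpu_ratio = majestic_gb / B200_MEMORY_GB                     # vs. a single GPU

print(f"vs. an 8-GPU server: ~{server_ratio:.0f}x")   # ~85x, close to the ~100x claim
print(f"vs. a single B200:   ~{per_gpu_ratio:.0f}x")  # ~683x, the order of the ~1000x claim
```

Under this assumption the quoted “100 times” and “a thousand times” figures come out as round-number approximations of the per-server and per-processor ratios.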
Shaham does not reveal what type of memory it is, but admits it is not the type common in Nvidia processors – High Bandwidth Memory (HBM). According to assessments, the company purchases the memory components from the three major memory chip makers: Micron, Samsung and SK Hynix. Instead of Nvidia’s GPUs, Majestic Labs offers its own processors, called Ignite, or AIU for short. These were developed on the basis of ARM’s intellectual property, Majestic’s original design, and the open-source RISC-V platform, which makes it possible to plan and execute the calculation operations of artificial intelligence in a way adapted to the requirements of different companies.
Since Nvidia’s dominance is reflected not only in the supply of graphics processors but also in its control of the CUDA software platform for artificial intelligence, the AIU processors were developed to let AI programmers build applications with tools such as PyTorch, the common development framework for AI experts in the Nvidia ecosystem, and OpenAI’s Triton, which has become CUDA’s major competitor – even though Nvidia has recently become one of OpenAI’s biggest backers.
The architectural problem and the difference from Nvidia
Majestic Labs’ new architecture eliminates, according to the CEO, the need for a large supply of communication processors – the role played in Nvidia servers by chips from the former Mellanox.
“We don’t need communication between the processors, because the communication is done through the memory – just as it happens in standard computing, in the communication between cores of a standard multi-core processor,” says Shaham. “The need for so many communication chips in the server farms was born out of the small amount of memory given to each of the Nvidia processors. It forces the processors to communicate with each other.
“This is why, in order to provide one coherent server, Nvidia had to build an expensive hybrid creature – a server cabinet with 72 processors connected to each other by high-bandwidth links (Nvidia’s premium server, known as NVLink – AG). But due to the inefficiency of their operation, the processors very quickly hit the ceiling of the memory they can work with and wait for data to reach them, standing idle in the meantime. And with the growing parameter counts of the frontier models released by companies such as Anthropic, OpenAI and Google (Gemini), companies are forced to purchase more and more servers to achieve this enormous processing power.
“The frontier models, such as those of Gemini or GPT, struggle to fit even in the memory of 10 graphics processors, so they require entire server cabinets of 72 Nvidia processors, which are regarded as the only configuration that can handle the largest models.
“Jensen Huang himself recently presented a slide showing that even in such a cabinet the models already struggle, and that already at 400,000 tokens (the basic processing unit of AI – AG) there is a decline in performance – not to mention the 5-trillion-parameter models that will be launched toward the end of this year or the beginning of next.
“The need to cram in so many graphics processors to handle these models dictates energy consumption that grows exponentially and is unsustainable. Then you hit the memory ceiling – what the industry calls the ‘memory wall’ – and the processors sit idle half the time, burning energy and waiting for data to flow to them. The result is diminishing returns and growing energy consumption for every graphics chip added. It’s not just a lack of memory – it’s simply an architectural problem in the model under which artificial intelligence works today.”
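The “memory wall” can be illustrated with a rough calculation of how many 192 GB GPUs are needed just to hold a large model’s weights. The precision assumption below (1 byte per parameter) is mine, for illustration only; in practice KV caches, activations and overhead push the real number much higher.

```python
# Rough illustration of the "memory wall": how many 192 GB GPUs are
# needed merely to store a model's weights, ignoring KV cache,
# activations and overhead (which make the real requirement larger).
# Assumption: weights stored at 8-bit precision (1 byte per parameter).

import math

def gpus_for_weights(params_trillions, bytes_per_param=1, gpu_mem_gb=192):
    """Minimum GPU count whose combined memory holds the raw weights."""
    weight_gb = params_trillions * 1e12 * bytes_per_param / 1e9
    return math.ceil(weight_gb / gpu_mem_gb)

print(gpus_for_weights(5))   # the 5-trillion-parameter scale mentioned above -> 27
```

Even under this optimistic accounting, a 5-trillion-parameter model spills far past 10 GPUs, which is consistent with the article’s point that such models push toward full 72-processor cabinets.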
This problem dictates huge capital expenditures: among the five cloud giants alone – Amazon, Google, Microsoft, Meta and Tesla – they stood at 443 billion dollars last year and are expected to grow to 602 billion dollars this year.
Does this mean that you will market servers that will be cheaper than those of Nvidia?
“We do not compete on price per processing unit but on price per result – the cost per token. We offer a ‘machine’ capable of producing 10 to 50 times more tokens per megawatt for every dollar invested in building a server farm.
“I have a client currently building a server farm with an electrical capacity of 500 megawatts. What he asks me is not necessarily how much the server will cost him, but how many tokens he can sell per megawatt – and I can give him up to 50 times what is accepted in the market. Offering a cheaper product is not necessarily a sustainable model for us in this case; that way you get a ‘race to the bottom’. I don’t have a cost advantage over Nvidia, because there is also a volume game here: they can supply large quantities at low cost and offer discounts. We want to sell our product at a good profit.”
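Shaham’s “price per result” framing can be put into a toy calculation. All concrete numbers below are hypothetical placeholders; only the 500 MW farm size and the up-to-50x multiplier come from the interview.

```python
# Toy "cost per token" comparison for a 500 MW server farm.
# The baseline throughput is a normalized placeholder, not a real figure.

FARM_MW = 500                    # farm size mentioned in the interview
BASELINE_TOKENS_PER_MW = 1.0     # hypothetical normalized market baseline
EFFICIENCY_MULTIPLIER = 50       # the up-to-50x claim quoted above

baseline_output = FARM_MW * BASELINE_TOKENS_PER_MW
majestic_output = baseline_output * EFFICIENCY_MULTIPLIER

# At equal capital and power cost, cost per token falls by the same factor:
cost_per_token_ratio = baseline_output / majestic_output
print(cost_per_token_ratio)   # 1/50 of the baseline cost per token
```

The point of the sketch is only that, with power and capital held fixed, selling per token means any throughput multiplier translates one-to-one into a cost-per-token divisor.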
The finished product will reach customers next year
According to Shaham, Majestic’s servers and chips are built primarily for inference and for running agents, rather than for training models, although they could be adapted to that as well. The company focuses on language models and on graph- or table-based neural networks, and less on image and video models.
When will you start selling the servers and processors?
“We are already working with several customers in the prototype phase, but the finished product will be shipped to our first customers next year. We are already accepting orders and working with several customers to better adapt our product to their needs.”
A company that has proven to customers – most likely cloud giants – that it can increase the processing efficiency of AI models by up to 50 times sounds like a very attractive asset to companies like Nvidia or one of the cloud giants. Have you received purchase offers?
“I was recently asked: are we building a product, or is the company the product? The answer is that we are building a product – I built such products for Google, for Meta, and the first processors for DARPA (the American defense research agency – AG). We are here to build a product that can solve problems for entire industries.
“It’s clear that things may happen along the way – but our goal is to build a product that our customers, the companies building server farms for AI processing, enjoy: one that improves their energy consumption and as a result saves them much of the energy expenditure they are still planning for in the long term. We had to leave our previous jobs at the technology giants to understand what the bottlenecks were – and they arrived much faster than we thought. When we founded the company two and a half years ago, we said the memory problem would become the industry’s biggest headache; today it is ten times more severe than we thought it would be at this stage.”













