d-Matrix CEO Showcases Digital In-Memory Compute Solutions at IESA Vision Summit Amid Accelerating AI Innovation in India
Bangalore, January 30, 2024 – Sid Sheth, CEO of d-Matrix, delivered the keynote presentation at the 2024 IESA Vision Summit, sharing an update on the state of generative AI and the immense opportunity it presents for India’s economy. Sheth detailed d-Matrix’s first-of-its-kind digital in-memory compute (DIMC) solutions, which address current barriers to the commercial viability of generative AI, and highlighted the company’s fast-growing operations in India’s tech corridor and around the world.
Key Points
- India’s Vision for the Future of AI – India is uniquely poised for economic transformation: research projects that generative AI will add $1.2 trillion to India’s GDP by FY30, and the startup landscape continues to boom with 102 unicorns and $65 billion in foreign direct investment. The Ministry of Electronics and Information Technology (MeitY) has outlined a clear vision for the future of AI, prioritizing collaboration across sectors and initiatives such as Project Bhashini, which will help democratize access to digital advancements.
- Current Barriers to Generative AI Deployment – Today, applications such as ChatGPT run on GPUs, which are not optimized for inference, resulting in costly upfront purchases, prohibitively high electricity usage and a large carbon footprint for companies running AI applications. To make generative AI widely accessible, it must be delivered sustainably and at a more affordable cost, which requires a purpose-built solution that is performance- and power-optimized for AI inference.
- d-Matrix’s Digital In-Memory Solution – The d-Matrix DIMC solution tackles the memory bandwidth, memory capacity and compute challenges of AI inference. The d-Matrix Jayhawk 2 solution delivers up to 150 TB/s of memory bandwidth, a significant jump from the 3 to 4 TB/s that High Bandwidth Memory (HBM) can stream on modern GPUs; the sketch after this list illustrates why that bandwidth gap matters for inference. d-Matrix has created an architecture that significantly enhances key metrics for large transformer-based inference, improving performance by orders of magnitude:
- Total cost of ownership (TCO) improvement of 13-27x compared to GPUs when running LLaMA2-13B models with 4K context
- 20x better power efficiency, 20x lower latency and 40x higher memory bandwidth
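To make the bandwidth comparison above concrete, here is a minimal back-of-envelope sketch (not a d-Matrix benchmark) of why memory bandwidth caps autoregressive inference speed. It assumes a weights-dominated workload, in which roughly all of a model’s parameters are streamed from memory for each generated token, so LLaMA2-13B in FP16 moves about 26 GB per token; the 3.5 TB/s and 150 TB/s figures are illustrative values taken from the bullets above.

```python
# Back-of-envelope estimate of memory-bandwidth-bound decode speed.
# Assumption: in autoregressive inference, generating each token requires
# streaming (roughly) all model weights from memory, so the decode rate
# is capped at bandwidth / bytes_moved_per_token.

MODEL_PARAMS = 13e9      # LLaMA2-13B parameter count
BYTES_PER_PARAM = 2      # FP16 weights
bytes_per_token = MODEL_PARAMS * BYTES_PER_PARAM  # ~26 GB per token

def max_tokens_per_second(bandwidth_tb_s: float) -> float:
    """Upper bound on single-stream decode rate when bandwidth-bound."""
    return (bandwidth_tb_s * 1e12) / bytes_per_token

for label, bw in [("HBM on a modern GPU (~3.5 TB/s)", 3.5),
                  ("DIMC (up to 150 TB/s, per the figures above)", 150.0)]:
    print(f"{label}: ~{max_tokens_per_second(bw):.0f} tokens/s ceiling")
```

This simple ratio ignores batching, KV-cache traffic and quantization, but it shows how a bandwidth-bound inference workload scales almost linearly with memory bandwidth, which is the premise behind the DIMC architecture.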
- Immense Opportunity with the d-Matrix Team – Headquartered in Silicon Valley, d-Matrix has raised $160 million and established design centers in Bengaluru, Sydney and Toronto, growing to a team of more than 100 (30% of whom hold PhDs). The company is moving into an expanded facility and is actively hiring across roles at d-matrix.ai/careers/.