Spiking Brain-inspired Large Models
Author: LuxaK
Uploaded: 2025-09-09
Views: 654
This document introduces SpikingBrain, a family of brain-inspired large language models (LLMs) designed to address the efficiency bottlenecks of Transformer-based LLMs, focusing on efficient long-context training and inference on a MetaX GPU cluster. SpikingBrain combines linear and hybrid-linear attention architectures with adaptive spiking neurons, along with algorithmic optimizations such as conversion-based training and a dedicated spike coding framework. On the systems side, it includes customized training frameworks, operator libraries, and parallelism strategies tailored to the MetaX hardware. The paper presents SpikingBrain-7B and SpikingBrain-76B, demonstrating the feasibility of large-scale LLM development on non-NVIDIA platforms. These models match Transformer baselines in performance while using significantly less training data and achieving markedly better long-sequence training efficiency. The research explores the potential of brain-inspired mechanisms to drive the next generation of efficient and scalable large model design.
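As a rough sketch of two ingredients mentioned above (the recurrent state behind linear attention and spike-based activation coding), the Python below is illustrative only: the function names, the integrate-and-fire rule, and the parameters theta, beta, and steps are assumptions for exposition, not the paper's actual formulation.

import numpy as np

def linear_attention_step(S, q, k, v):
    # One recurrent step of (unnormalized) linear attention. The state S
    # accumulates outer products of keys and values, so per-token cost is
    # O(d*d) and independent of sequence length -- the source of the
    # long-context efficiency gain.
    S = S + np.outer(k, v)          # update the running key-value summary
    return S, S.T @ q               # read out with the current query

def adaptive_spike_encode(x, theta=1.0, beta=0.1, steps=8):
    # Integrate-and-fire encoder turning a continuous activation vector
    # into spike rates. The per-unit threshold grows by beta after each
    # spike (a simple adaptive-threshold rule; theta, beta, and steps are
    # illustrative choices, not values from the paper).
    v = np.zeros_like(x)            # membrane potentials
    thr = np.full_like(x, theta)    # adaptive thresholds
    spikes = np.zeros_like(x)
    for _ in range(steps):
        v = v + x                   # integrate the input
        fired = v >= thr
        spikes += fired
        v = np.where(fired, v - thr, v)         # soft reset on spiking units
        thr = np.where(fired, thr + beta, thr)  # threshold adaptation
    return spikes / steps           # rate-coded approximation of x

# Tiny usage example: spike-encode key/value activations, then run one step.
d = 4
rng = np.random.default_rng(0)
S = np.zeros((d, d))
q, k, val = rng.random(d), rng.random(d), rng.random(d)
S, out = linear_attention_step(S, q, adaptive_spike_encode(k), adaptive_spike_encode(val))
print(out)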
#LargeLanguageModels #BrainInspired #SpikingNeuralNetworks #Efficiency #MetaX
paper - http://arxiv.org/pdf/2509.05276v1
subscribe - https://t.me/arxivpaper
donations:
USDT: 0xAA7B976c6A9A7ccC97A3B55B7fb353b6Cc8D1ef7
BTC: bc1q8972egrt38f5ye5klv3yye0996k2jjsz2zthpr
ETH: 0xAA7B976c6A9A7ccC97A3B55B7fb353b6Cc8D1ef7
SOL: DXnz1nd6oVm7evDJk25Z2wFSstEH8mcA1dzWDCVjUj9e
created with NotebookLM