Artificial intelligence has been a widely discussed topic in 2023, with Google, Meta, and Microsoft showing off impressive product line-ups and sharing ambitious visions for harnessing the power of AI.
Amid all the chaos surrounding AI, Apple has chosen to remain quiet, or at least to take its time demonstrating its AI capabilities, and many people are curious about what it is doing to stay competitive in the AI arms race. It is clear that Apple has been actively involved in various AI initiatives for several years, even as users have struggled to get tools like ChatGPT working smoothly on their iPhones.
But be prepared for a turn. Apple recently published research describing a remarkable technique for running large AI models on iPhones: it optimizes how model weights are read from flash storage to speed up bulky LLMs. When Apple brings advanced AI to the iPhone, it will mark another important milestone. In fact, Apple recently shared two research papers that highlight major advances in artificial intelligence and demonstrate its commitment to innovation: one presents a method for creating 3D avatars, the other a way to make language-model inference more efficient.
The LLM research, titled “LLM in a flash: Efficient Large Language Model Inference with Limited Memory,” was published on December 12. Together with the avatar work, it has the potential to greatly improve the iPhone experience, and it could give users access to advanced AI systems directly on their iPhones and iPads. The paper focuses on running large language models efficiently on devices with limited DRAM capacity. DRAM is the main working memory in PCs and smartphones alike, valued for its speed, density, affordability, and relatively low power consumption.
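To see why limited DRAM is the bottleneck, a rough back-of-envelope calculation helps. The figures below are illustrative assumptions, not numbers from Apple's paper:

```python
# Why a ~7B-parameter model strains a phone's DRAM.
# All figures below are illustrative assumptions.

params = 7e9               # parameter count of a ~7B model such as Falcon 7B
bytes_per_param = 2        # fp16 weights

model_gb = params * bytes_per_param / 1e9
print(f"Model weights: ~{model_gb:.0f} GB")          # ~14 GB

dram_budget_gb = 8         # assumed high-end phone DRAM, shared with the OS
print(f"Fits in {dram_budget_gb} GB of DRAM? {model_gb < dram_budget_gb}")  # False
```

The weights alone exceed the memory budget, which is why the paper keeps them in the much larger flash storage and streams only what each inference step needs into DRAM.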
These are some of the key findings that could give Apple an edge over its competitors.
The paper tackles the problem of running LLMs whose parameters exceed the available DRAM capacity. Its solution is to store the model parameters in flash memory and transfer them to DRAM only as needed, guided by an inference cost model that optimizes flash-to-DRAM transfers around the differing latency and throughput characteristics of flash and DRAM.
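A minimal sketch of what such a cost model captures, under assumed hardware numbers rather than Apple's implementation or measurements: each flash read pays a fixed setup latency plus a throughput-bound transfer time, so fewer, larger reads win.

```python
# Toy flash-to-DRAM read cost model. The bandwidth and latency numbers
# are assumptions for illustration, not measurements from the paper.

FLASH_THROUGHPUT_GBPS = 1.0    # assumed sustained read bandwidth
FLASH_READ_LATENCY_S = 1e-4    # assumed fixed setup cost per read

def read_cost_seconds(num_reads: int, chunk_bytes: int) -> float:
    """Total time to issue num_reads reads of chunk_bytes each."""
    transfer = num_reads * chunk_bytes / (FLASH_THROUGHPUT_GBPS * 1e9)
    latency = num_reads * FLASH_READ_LATENCY_S
    return transfer + latency

# Moving the same ~64 MB as many small reads vs. a few large ones:
print(read_cost_seconds(16384, 4 * 1024))       # ~1.7 s  (16384 x 4 KB)
print(read_cost_seconds(16, 4 * 1024 * 1024))   # ~0.07 s (16 x 4 MB)
```

This asymmetry between latency and throughput is exactly what motivates the two techniques described next.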
The paper introduces two techniques: windowing and row-column bundling. Windowing reduces data transfer by reusing neurons activated for recent tokens, while row-column bundling increases the size of the chunks read from flash, making each read more efficient. A minimal sketch of the windowing idea follows.
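In this illustration (the names and the window size are hypothetical, not from the paper), we track which neurons are already resident in DRAM from the last few tokens and fetch only the difference:

```python
# Hedged sketch of windowing: keep FFN neurons activated for the last
# WINDOW tokens resident in DRAM and fetch only the newly needed ones.
from collections import deque

WINDOW = 5                       # assumed sliding-window length
recent = deque(maxlen=WINDOW)    # per-token sets of activated neuron ids
resident: set[int] = set()       # neuron ids currently held in DRAM

def neurons_to_fetch(active_now: set[int]) -> set[int]:
    """Return only the neurons that require a flash read for this token."""
    global resident
    to_fetch = active_now - resident     # incremental flash reads only
    recent.append(active_now)
    resident = set().union(*recent)      # evict neurons outside the window
    return to_fetch

print(neurons_to_fetch({1, 2, 3}))   # first token: everything is fetched
print(neurons_to_fetch({2, 3, 4}))   # only neuron 4 triggers a flash read
```

Row-column bundling complements this: storing a neuron's up-projection row and down-projection column contiguously lets one larger read retrieve both, which, as the cost model above shows, is far cheaper than two small reads.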
The paper also exploits the natural sparsity of the feed-forward network (FFN) layers, predicting which neurons will be active and selectively loading only those parameters. Careful memory management rounds out the approach, optimizing how the loaded data is handled in DRAM to reduce unnecessary overhead.
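Here is a toy sketch of sparsity-aware selective loading. All names and sizes are hypothetical; the paper uses a small learned predictor, whereas this demo fakes one for brevity:

```python
# Toy FFN forward pass that "loads" only the rows predicted to be active.
import numpy as np

rng = np.random.default_rng(0)
D_MODEL, D_FF = 8, 32                          # toy dimensions
W_UP = rng.standard_normal((D_FF, D_MODEL))    # pretend these live in flash
W_DOWN = rng.standard_normal((D_FF, D_MODEL))

def toy_predictor(x: np.ndarray, k: int = 4) -> np.ndarray:
    """Stand-in for a cheap learned predictor of active neurons.
    (This demo peeks at the full matrix; a real predictor must not.)"""
    return np.argsort(W_UP @ x)[-k:]

def ffn_selective(x: np.ndarray) -> np.ndarray:
    active = toy_predictor(x)
    w_up, w_down = W_UP[active], W_DOWN[active]   # simulated selective load
    h = np.maximum(w_up @ x, 0.0)                 # ReLU over the active subset
    return w_down.T @ h                           # skipped neurons contribute ~0

print(ffn_selective(rng.standard_normal(D_MODEL)).shape)   # (8,)
```

The memory-management piece then governs how these selectively loaded rows are placed and evicted in DRAM so that repeated loads do not incur copying and reallocation overhead.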
The researchers demonstrate the methodology on models such as OPT 6.7B and Falcon 7B. According to the paper, the approach yields a significant speedup over traditional loading: roughly 4-5x on CPU and 20-25x on GPU.
Applied to real-world situations, both models showed notable gains in resource-limited settings.
Apple's research demonstrates an innovative method for running LLMs effectively on hardware with limited resources, setting the stage for future work on on-device inference and next-generation user experiences.
What does it mean for iPhone users?
From a user's perspective, efficient LLM inference with limited memory could be a major win for both Apple and iPhone owners. If LLMs can run efficiently within limited DRAM, users could enjoy enhanced AI capabilities on their iPhones and iPads: better language processing, more capable voice assistants, stronger privacy since data stays on the device, reduced reliance on internet bandwidth, and, most importantly, advanced AI that is accessible and responsive for every iPhone user.
Despite the promising strides in Apple's AI research, experts urge caution. Some suggest the tech giant should act carefully and responsibly when moving research results into real-world products; others emphasize the importance of privacy protections, safeguards against potential misuse, and assessment of the overall impact.