Abolfazl Younesi, Abbas Shabrang Maryan, Elyas Oustad, Zahra Najafabadi Samani, Mohsen Ansari, Thomas Fahringer
Splitwise, a Lyapunov-assisted deep reinforcement learning (DRL) framework, optimizes LLM deployment by adaptively partitioning models across edge devices and the cloud, significantly reducing latency and energy consumption compared to existing methods.
Deploying large language models (LLMs) on devices like smartphones or small computers is difficult because such devices often lack sufficient compute power and memory. Offloading to the cloud can help, but it can be slow and expensive. The new system, called Splitwise, uses Lyapunov-assisted deep reinforcement learning to split the model's workload between the device and the cloud, making inference faster and more energy efficient. The system adapts to changes in network speed and can recover from connection problems. Experiments show that it outperforms older methods in both latency and energy consumption.
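To illustrate the kind of edge-cloud partitioning decision described above, here is a minimal sketch of a Lyapunov-style drift-plus-penalty split-point selector. All names, parameters, and the cost model are illustrative assumptions, not the paper's actual algorithm or API: the DRL agent in Splitwise would learn this policy rather than enumerate candidates as done here.

```python
def choose_split(num_layers, edge_ms, cloud_ms, link_ms, edge_mj, q, v=1.0):
    """Pick how many layers (k) to run on the edge device before offloading
    the rest to the cloud, by minimizing a drift-plus-penalty score:
        score(k) = v * energy(k) + q * latency(k)
    where q plays the role of a Lyapunov queue backlog weighting latency
    and v trades off energy cost. (Hypothetical cost model.)
    """
    best_k, best_score = 0, float("inf")
    for k in range(num_layers + 1):  # k layers on the edge, rest in the cloud
        latency = edge_ms * k + cloud_ms * (num_layers - k)
        if k < num_layers:
            # Intermediate activations must cross the network link.
            latency += link_ms
        energy = edge_mj * k  # device-side energy grows with on-device layers
        score = v * energy + q * latency
        if score < best_score:
            best_k, best_score = k, score
    return best_k
```

With an expensive link, running everything on the device wins despite the energy cost; with a cheap link and a fast cloud, full offloading wins instead. A learned policy generalizes this trade-off to time-varying bandwidth without enumerating every split.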