Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/179743
Title: Efficient inference offloading for mixture-of-experts large language models in internet of medical things
Authors: Yuan, Xiaoming
Kong, Weixuan
Luo, Zhenyu
Xu, Minrui
Keywords: Computer and Information Science
Issue Date: 2024
Source: Yuan, X., Kong, W., Luo, Z. & Xu, M. (2024). Efficient inference offloading for mixture-of-experts large language models in internet of medical things. Electronics, 13(11), 2077. https://dx.doi.org/10.3390/electronics13112077
Journal: Electronics 
Abstract: Despite recent significant advancements in large language models (LLMs) for medical services, the deployment difficulties of LLMs in e-healthcare hinder complex medical applications in the Internet of Medical Things (IoMT). People are increasingly concerned about e-healthcare risks and privacy protection. Existing LLMs face difficulties in providing accurate medical questions and answers (Q&As) and meeting the deployment resource demands in the IoMT. To address these challenges, we propose MedMixtral 8x7B, a new medical LLM based on the mixture-of-experts (MoE) architecture with an offloading strategy, enabling deployment on the IoMT, improving the privacy protection for users. Additionally, we find that the significant factors affecting latency include the method of device interconnection, the location of offloading servers, and the speed of the disk.
URI: https://hdl.handle.net/10356/179743
ISSN: 2079-9292
DOI: 10.3390/electronics13112077
Schools: School of Computer Science and Engineering 
Rights: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Journal Articles

Files in This Item:
File: electronics-13-02077.pdf
Size: 1.09 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.