Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/179743
Title: | Efficient inference offloading for mixture-of-experts large language models in internet of medical things |
Authors: | Yuan, Xiaoming; Kong, Weixuan; Luo, Zhenyu; Xu, Minrui |
Keywords: | Computer and Information Science |
Issue Date: | 2024 |
Source: | Yuan, X., Kong, W., Luo, Z. & Xu, M. (2024). Efficient inference offloading for mixture-of-experts large language models in internet of medical things. Electronics, 13(11), 2077-. https://dx.doi.org/10.3390/electronics13112077 |
Journal: | Electronics |
Abstract: | Despite recent significant advancements in large language models (LLMs) for medical services, the deployment difficulties of LLMs in e-healthcare hinder complex medical applications in the Internet of Medical Things (IoMT). People are increasingly concerned about e-healthcare risks and privacy protection. Existing LLMs face difficulties in providing accurate medical questions and answers (Q&As) and meeting the deployment resource demands in the IoMT. To address these challenges, we propose MedMixtral 8x7B, a new medical LLM based on the mixture-of-experts (MoE) architecture with an offloading strategy, enabling deployment on the IoMT, improving the privacy protection for users. Additionally, we find that the significant factors affecting latency include the method of device interconnection, the location of offloading servers, and the speed of the disk. |
URI: | https://hdl.handle.net/10356/179743 |
ISSN: | 2079-9292 |
DOI: | 10.3390/electronics13112077 |
Schools: | School of Computer Science and Engineering |
Rights: | © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
Fulltext Permission: | open |
Fulltext Availability: | With Fulltext |
Appears in Collections: | SCSE Journal Articles |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
electronics-13-02077.pdf | | 1.09 MB | Adobe PDF | View/Open |
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.