TrustFUL

Project Highlights

Project Title: Low-Parameter Transformer with Temporal Dependency Hierarchical Propagation for Health Informatics

Keywords: Foundation Models, Healthcare
Introduction:
Transformers based on the Self-Attention (SA) mechanism have demonstrated unrivaled superiority in numerous areas. Compared to RNN-based networks, Transformers can learn the temporal dependency representation of an entire sequence in parallel, while efficiently dealing with long-range dependencies. However, the O(L²) computational complexity of the SA mechanism (where L denotes the length of the sequence) and its high memory usage make Transformer-based models prohibitively expensive to build. To address these challenges, we propose a Transformer-like model, HPformer: Low-Parameter Transformer with Temporal Dependency Hierarchical Propagation. HPformer first chunks the sequence into K segments, then leverages a hierarchical propagation mechanism with O(L) computational complexity to learn the temporal dependencies between and within the segments, and ultimately generates K vectors as Key matrices. This reduces the complexity of the SA mechanism from O(L²) to O(L log L). In addition, we employ a strategy of sharing Key and Value matrices between layers to build the HPformer, thus reducing memory usage.
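The idea of attending over K segment-level keys rather than L token-level keys can be illustrated with a minimal NumPy sketch. This is not the HPformer implementation: each segment is summarised here by a plain mean, which stands in for the hierarchical propagation step, and the function name `segment_summary_attention` is hypothetical. The point is only that the score matrix has L × K entries instead of L × L.

```python
import numpy as np

def segment_summary_attention(x, num_segments):
    """Attend L queries over K segment summaries instead of L keys.

    Assumption: each segment is summarised by its mean, a simple
    stand-in for the hierarchical propagation described in the text.
    """
    seq_len, dim = x.shape
    # Split the sequence into K contiguous segments (sizes may differ by one).
    segments = np.array_split(x, num_segments, axis=0)
    keys = np.stack([seg.mean(axis=0) for seg in segments])   # shape (K, dim)
    values = keys  # the same summaries serve as values, for illustration only
    scores = x @ keys.T / np.sqrt(dim)                        # shape (L, K)
    scores = scores - scores.max(axis=1, keepdims=True)       # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)    # row-wise softmax
    return weights @ values, weights
```

Calling `segment_summary_attention(x, 3)` on a 12 × 4 input returns a 12 × 4 output and a 12 × 3 attention matrix whose rows sum to one.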
Publication:

Wu Lee, Yuliang Shi, Han Yu, Lin Cheng, Xinjun Wang, Zhongmin Yan & Fanyu Kong. HPformer: Low-parameter transformer with temporal dependency hierarchical propagation for health informatics. IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE (2025).

Project Title: Multi-Session Budget Optimization for Forward Auction-based Federated Learning

Keywords: Auction-based Federated Learning, Temporal Budget Optimization
Introduction:
Auction-based Federated Learning (AFL) has emerged as an important research field in recent years. The prevailing strategies for FL data consumers (DCs) assume that the entire team of the required data owners (DOs) for an FL task must be assembled before training can commence. In practice, a DC can trigger the FL training process multiple times. DOs can thus be gradually recruited over multiple FL model training sessions. Existing bidding strategies for AFL DCs are not designed to handle such scenarios. Therefore, the problem of multi-session AFL remains open. To address this problem, we propose the Multi-session Budget Optimization Strategy for forward Auction-based Federated Learning (MBOS-AFL). Based on hierarchical reinforcement learning, MBOS-AFL jointly optimizes inter-session budget pacing and intra-session bidding for AFL DCs, with the objective of maximizing the total utility. Extensive experiments on six benchmark datasets show that it significantly outperforms seven state-of-the-art approaches. On average, MBOS-AFL achieves 12.28% higher utility, 14.52% more data acquired through auctions for a given budget, and 1.23% higher test accuracy achieved by the resulting FL model compared to the best baseline. To the best of our knowledge, it is the first budget optimization decision support method with budget pacing capability designed for DCs in multi-session forward AFL.
Publication:

Xiaoli Tang, Han Yu, Zengxiang Li & Xiaoxiao Li, "Multi-Session Budget Optimization for Forward Auction-based Federated Learning," in Proceedings of the 42nd International Conference on Machine Learning (ICML'25), 2025.

Project Title: Efficient Heterogeneity-Aware Federated Active Data Selection

Keywords: Federated Learning, Active Learning, Data Selection
Introduction:
Training effective AI models usually requires large amounts of labeled data, which is costly and challenging to gather, especially when data privacy regulations restrict direct data sharing among institutions. Existing techniques often fail to efficiently select the most useful data points when data is distributed unevenly across multiple clients. To address this, we developed Federated Active data selection by Leverage score sampling (FALE), a novel method combining federated learning with active learning. FALE utilizes a privacy-preserving federated singular value decomposition to understand data distribution securely across different clients. Based on this analysis, FALE employs a leverage-score sampling strategy to select globally informative data points efficiently. It then securely trains a robust global AI model using these selected data points, without compromising client privacy. FALE significantly reduces redundant labeling efforts and enhances the accuracy of AI models in decentralized environments. Experiments on multiple benchmark datasets demonstrate that FALE consistently outperforms existing methods, achieving better performance with fewer labeled data points. Thus, FALE makes decentralized AI training more practical, efficient, and privacy-aware, with broad implications for secure and collaborative machine learning applications.
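Leverage-score sampling itself can be sketched in a few lines. The example below is a centralized illustration only: it uses an ordinary `np.linalg.svd` on a single feature matrix in place of the privacy-preserving federated SVD, and the function names and the `rank`, `budget` and `seed` parameters are hypothetical. The leverage score of a data point is the squared norm of its row in the top-`rank` left singular vectors; points are then drawn for labeling with probability proportional to that score.

```python
import numpy as np

def leverage_scores(features, rank):
    """Rank-`rank` leverage score of each row of `features`."""
    # Thin SVD; a plain centralized SVD is assumed here for illustration.
    u, _, _ = np.linalg.svd(features, full_matrices=False)
    return np.sum(u[:, :rank] ** 2, axis=1)

def select_by_leverage(features, rank, budget, seed=0):
    """Sample `budget` distinct row indices, proportionally to leverage."""
    scores = leverage_scores(features, rank)
    probs = scores / scores.sum()
    rng = np.random.default_rng(seed)
    return rng.choice(len(features), size=budget, replace=False, p=probs)
```

Because the singular vectors are orthonormal, the scores of all rows sum to `rank`, which is a convenient sanity check.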
Publication:

Ying-Peng Tang, Chao Ren, Xiaoli Tang, Sheng-Jun Huang, Lizhen Cui & Han Yu, "Efficient Heterogeneity-Aware Federated Active Data Selection," in Proceedings of the 42nd International Conference on Machine Learning (ICML'25), 2025.

Project Title: Personalized Federated Class-Incremental Learning with Mixture of Frequency Aggregation

Keywords: Foundation Models, Federated Learning, Frequency Domain Aggregation, LoRA
Introduction:
Federated learning (FL) has emerged as a promising paradigm for privacy-preserving collaborative machine learning. However, extending FL to class incremental learning settings introduces three key challenges: 1) spatial heterogeneity due to non-IID data distributions across clients, 2) temporal heterogeneity due to sequential arrival of tasks, and 3) resource heterogeneity due to diverse client capabilities. Existing approaches generally address these challenges in isolation, potentially leading to interference between updates, catastrophic forgetting, or excessive communication overhead. In this project, we built personalized Federated class-incremental parameter efficient fine-tuning with Mixture of Frequency aggregation (pFedMixF), a novel framework that simultaneously addresses all three heterogeneity challenges through frequency domain decomposition. Our key insight is that assigning orthogonal frequency components to different clients and tasks enables interference-free learning to be achieved with minimal communication costs. We further design an Auto-Task Agnostic Classifier that automatically routes samples to task-specific classifiers while adapting to heterogeneous class distributions. To the best of our knowledge, it is the first federated LoRA tuning approach based on frequency domain aggregation.
Publication:

Yifei Zhang, Hao Zhu, Alysa Ziying Tan, Dianzhi Yu, Longtao Huang & Han Yu, "pFedMixF: Personalized Federated Class-Incremental Learning with Mixture of Frequency Aggregation," in Proceedings of the 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR'25), pp. 30640-30650, 2025.

Project Title: Federated Textual Gradient (FedTextGrad)

Keywords: Foundation Models, Closed-Source, Textual Gradient, Federated Learning, Prompt Optimization
Introduction:
Recent studies highlight the promise of LLM-based prompt optimization, especially with TextGrad, which automates "differentiation" via texts and backpropagates textual feedback provided by LLMs. This approach facilitates training in various real-world applications that do not support numerical gradient propagation or loss calculation. It opens new avenues for optimization in decentralized, resource-constrained environments, suggesting that users of black-box LLMs (e.g., ChatGPT) could enhance components of LLM agentic systems (such as prompt optimization) through collaborative paradigms like federated learning (FL). In this project, we systematically explore the potential and challenges of incorporating textual gradient into FL. Our contributions are fourfold. Firstly, we introduce a novel FL paradigm, Federated Textual Gradient (FedTextGrad), that allows FL clients to upload their locally optimized prompts derived from textual gradients, while the FL server aggregates the received prompts through text summarization. Unlike traditional FL frameworks, which are designed for numerical aggregation, FedTextGrad is specifically tailored for handling textual data, expanding the applicability of FL to a broader range of problems that lack well-defined numerical loss functions. Secondly, building on this design, we conduct extensive experiments to explore the feasibility of federated textual gradients. Our findings highlight the importance of properly tuning key factors (e.g., local steps) in FL training to effectively integrate textual gradients. Thirdly, we highlight a major challenge in federated textual gradient aggregation: retaining essential information from distributed prompt updates. Concatenation often produces prompts that exceed the LLM API's context window, while summarization can degrade performance by generating overly condensed or complex text that lacks key context. Last but not least, in response to this issue, we improve the vanilla variant of FedTextGrad by providing actionable guidance to the LLM when summarizing client prompts by leveraging the Uniform Information Density principle. Such a design reduces the complexity of the aggregated global prompt, thereby better incentivizing the LLM's reasoning ability. Through this principled study, we enable the adoption of textual gradients in FL for optimizing LLMs, identify important issues, and pinpoint future directions, thereby opening up a new research area that warrants further investigation. Our code is available here.
Publication:

Minghui Chen, Ruinan Jin, Wenlong Deng, Yuanyuan Chen, Zhi Huang, Han Yu & Xiaoxiao Li, "Can Textual Gradient Work in Federated Learning?," in Proceedings of the 13th International Conference on Learning Representations (ICLR'25), 2025.

Project Title: LLM for Medical Quality Control Governance

Keywords: Foundation Models, Healthcare, Quality Control, Auditing
Introduction:
Medical quality control (MQC) indicators are essential for evaluating the performance of healthcare institutions to ensure high-quality patient care. In this project, we design, implement and deploy the Intelligent EMR-LLM platform for Medical Quality Control (IMQC), a large language model (LLM)-empowered system that automatically computes MQC indicators to enhance the quality of medical services in Shanghai. It consists of an LLM (i.e., EMR-LLM) for processing electronic medical records (EMRs). With EMR-LLM, IMQC translates existing MQC indicators into a standardized representation language and automatically computes them based on EMRs. Since its deployment in February 2024, IMQC has been adopted by the Shanghai Medical Quality Management Center and associated hospitals. So far, it has processed 1,245 medical quality indicators for secondary- and tertiary-level hospitals, achieving an MQC evaluation accuracy of 93.31%, which is comparable to that of human experts. It has significantly improved efficiency, increasing throughput from 10 EMRs per hour per human expert to over 1,000 EMRs per hour on average using a single H800 GPU. Over the first round of deployment in Shanghai, it is estimated that IMQC saves around 3.42 million RMB per month in manpower costs compared to traditional reporting methods. The successful deployment of IMQC sets a precedent for other regions to adopt similar AI-driven solutions to enhance medical quality control.
Publication:

Qi Ye, Guangya Yu, Jingping Liu, Erzhen Chen, Chenjie Dong, Xiaosheng Lin, Zelei Liu, Han Yu & Tong Ruan, "IMQC: A Large Language Model Platform for Medical Quality Control," in Proceedings of the 37th Annual Conference on Innovative Applications of Artificial Intelligence (IAAI-25), 2025. (Innovative Application of AI Award)

Project Title: Federated Multi-Foundation Agent System for Smart Healthcare

Keywords: Multi-Agent Decision Support, Foundation Models, Personalization, Privacy-Preservation
Introduction:
Performing diagnosis and helping patients manage treatment require complex reasoning and decision-making by doctors, often based on uncertain information. In recent years, artificial intelligence (AI), especially large language models (LLMs), has achieved significant advancement, prompting the development of decision support systems for doctors to help improve clinical care. This has resulted in the emergence of a plethora of multi-agent systems (MAS) with foundation model-based agents providing interactive decision support based on queries about patient conditions. Existing approaches focus on building MAS with a mixture of medical expert agents attempting to answer queries based on their respective specialization. Nevertheless, such a complex, sequential and interactive decision-making process requires coordination among agents with diverse roles. Such capabilities are currently missing. In this project, we develop the Multi-Agent Collaborative Decision Support System for Healthcare (MAC-Health) to bridge this gap. It consists of two main types of agents: 1) medical domain expert agents, which are medical Q&A agents possibly finetuned on hospital data through federated learning (FL), and 2) ecosystem management agents, which play supplementary roles such as coordination, prompt generation, dynamic context-aware workflow triggering, uncertainty-based follow-up questioning, optimization of MAS organizational structure, and uncovering potential issues among diagnostic responses from medical domain expert agents. In this way, the system is designed to provide transparent decision support for medical professionals, potentially serving as an AI copilot for doctors in action, or a learning companion for medical school students. The project is currently ongoing. Any interested collaborators are welcome!
Publications:

Tao Fan, Hanlin Gu, Xuemei Cao, Chee Seng Chan, Qian Chen, Yiqiang Chen, Yihui Feng, Yang Gu, Jiaxiang Geng, Bing Luo, Shuoling Liu, Win Kent Ong, Chao Ren, Jiaqi Shao, Chuan Sun, Xiaoli Tang, Hong Xi Tae, Yongxin Tong, Shuyue Wei, Fan Wu, Wei Xi, Mingcong Xu, He Yang, Xin Yang, Jiangpeng Yan, Hao Yu, Han Yu, Teng Zhang, Yifei Zhang, Xiaojin Zhang, Zhenzhe Zheng, Lixin Fan & Qiang Yang. Ten challenging problems in federated foundation models. IEEE Transactions on Knowledge and Data Engineering, IEEE (2025).

Chao Ren, Han Yu, Hongyi Peng, Xiaoli Tang, Bo Zhao, Liping Yi, Alysa Ziying Tan, Yulan Gao, Anran Li, Xiaoxiao Li, Zengxiang Li & Qiang Yang. Advances and open challenges in federated foundation models. IEEE Communications Surveys and Tutorials, IEEE (2025).

Project Title: Local and Global Calibration in Federated Learning via Aggregated Parameterized Scaler

Keywords: Model Calibration, Personalization, Privacy-Preservation
Introduction:
Local calibration in federated learning (FL) is easy, but how about global calibration without global validation datasets? FedCal aggregates local scalers into a global scaler. The scaler possesses robust generalization capabilities to handle potential discrepancies between local and global data distributions. Scaling and calibration maintain model accuracy. The scaler is aggregatable, and the scaler aggregation strategy does not require direct access to local data distributions.
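As a rough illustration of what aggregating local scalers into a global scaler can look like, the sketch below averages scaler parameters across clients, weighted by local sample counts. This weighting scheme, the dictionary layout of the parameters, and the function name `aggregate_scalers` are assumptions made for the example; the source does not specify FedCal's aggregation rule. Note that only scaler parameters and sample counts reach the server, not the local data.

```python
import numpy as np

def aggregate_scalers(local_scalers, num_samples):
    """Weighted average of per-client scaler parameters.

    local_scalers: list of dicts mapping parameter name -> np.ndarray
    num_samples:   list of local dataset sizes (assumed weighting scheme)
    """
    total = float(sum(num_samples))
    names = local_scalers[0].keys()
    return {
        name: sum((n / total) * scaler[name]
                  for scaler, n in zip(local_scalers, num_samples))
        for name in names
    }
```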
Publication:

Hongyi Peng, Han Yu, Xiaoli Tang & Xiaoxiao Li, "FedCal: Achieving Local and Global Calibration in Federated Learning via Aggregated Parameterized Scaler," in Proceedings of the 41st International Conference on Machine Learning (ICML'24), pp. 40331-40346, 2024.

Project Title: Federated Neuro-Symbolic Learning

Keywords: Explainability, Personalization, Privacy-Preservation
Introduction:
Neuro-symbolic learning (NSL) models complex symbolic rule patterns into latent variable distributions by neural networks, which reduces rule search space and generates unseen rules to improve downstream task performance. Centralized NSL learning involves directly acquiring data from downstream tasks, which is not feasible for federated learning (FL). To address this limitation, we shift the focus from such a one-to-one interactive neuro-symbolic paradigm to a one-to-many Federated Neuro-Symbolic Learning framework (FedNSL) with latent variables as the FL communication medium. Built on the basis of our novel reformulation of the NSL theory, FedNSL is capable of identifying and addressing rule distribution heterogeneity through a simple and effective Kullback-Leibler (KL) divergence constraint on rule distribution applicable under the FL setting. It further theoretically adjusts variational expectation maximization (V-EM) to reduce the rule search space across domains. This is the first incorporation of distribution-coupled bilevel optimization into FL.
Publication:

Pengwei Xing, Songtao Lu & Han Yu, "Federated Neuro-Symbolic Learning," in Proceedings of the 41st International Conference on Machine Learning (ICML'24), pp. 54635-54655, 2024.

Project Title: LR-XFL: Logical Reasoning-based Explainable Federated Learning

Keywords: Explainability, Personalization, Privacy-Preservation
Introduction:
Federated learning (FL) is an emerging approach for training machine learning models collaboratively while preserving data privacy. The need for privacy protection makes it difficult for FL models to achieve global transparency and explainability. To address this limitation, we incorporate logic-based explanations into FL by proposing the Logical Reasoning-based eXplainable Federated Learning (LR-XFL) approach. Under LR-XFL, FL clients create local logic rules based on their local data and send them, along with model updates, to the FL server. The FL server connects the local logic rules through a proper logical connector that is derived based on properties of client data, without requiring access to the raw data. In addition, the server also aggregates the local model updates with weight values determined by the quality of the clients' local data as reflected by their uploaded logic rules. The explicit rule evaluation and expression under LR-XFL enable human experts to validate and correct the rules on the server side, hence improving the global FL model robustness to errors. It has the potential to enhance the transparency of FL models for areas like healthcare and finance where both data privacy and explainability are important.
Publication:

Yanci Zhang & Han Yu, "LR-XFL: Logical Reasoning-based Explainable Federated Learning," in Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI-24), pp. 21788-21796, 2024.

Project Title: HiFi-Gas: Hierarchical Federated Learning Incentive Mechanism Enhanced Gas Usage Estimation

Keywords: Fairness, Privacy-Preservation, Incentive Mechanism Design, Hierarchical FL
Introduction:
Gas usage estimation plays a critical role in various aspects of the power generation and delivery business, including budgeting, resource planning, and environmental preservation. Federated Learning (FL) has demonstrated its potential in enhancing the accuracy and reliability of gas usage estimation by enabling distributedly owned data to be leveraged, while ensuring privacy and confidentiality. However, to effectively motivate stakeholders to contribute their high-quality local data and computational resources for this purpose, incentive mechanism design is key. In this project, we design the Hierarchical FL Incentive mechanism for the Gas usage estimation (HiFi-Gas) system. It is designed to cater to the unique structure of gas companies and their affiliated heating stations. HiFi-Gas provides effective incentivization in a hierarchical federated learning framework that consists of a horizontal federated learning (HFL) component for effective collaboration among gas companies and multiple vertical federated learning (VFL) components for the gas company and its affiliated heating stations. To motivate active participation and ensure fairness among gas companies and heating stations, we incorporate a multi-dimensional contribution-aware reward distribution function that considers both data quality and model contributions. Since its deployment in the ENN Group in December 2022, HiFi-Gas has successfully provided incentives for gas companies and heating stations to actively participate in FL training, resulting in more than 12% higher average gas usage estimation accuracy and substantial gas procurement cost savings. This implementation marks the first successful deployment of a hierarchical FL incentive approach in the energy industry.
Publication:

Hao Sun, Xiaoli Tang, Chengyi Yang, Zhenpeng Yu, Xiuli Wang, Qijie Ding, Zengxiang Li & Han Yu, "HiFi-Gas: Hierarchical Federated Learning Incentive Mechanism Enhanced Gas Usage Estimation," in Proceedings of the 36th Annual Conference on Innovative Applications of Artificial Intelligence (IAAI-24), pp. 22824-22832, 2024. (Innovative Application of AI Award)

Project Title: Transformer-empowered Multi-modal Item Embedding for Enhanced Image Search in E-Commerce

Keywords: Foundation Models
Introduction:
Over the past decade, significant advances have been made in the field of image search for e-commerce applications. Traditional image-to-image retrieval models, which focus solely on image details such as texture, tend to overlook useful semantic information contained within the images. As a result, the retrieved products might possess similar image details, but fail to fulfil the user's search goals. Moreover, the use of image-to-image retrieval models for products containing multiple images results in significant online product feature storage overhead and complex mapping implementations. In this project, we design the Multi-modal Item Embedding Model (MIEM) to address these limitations. It is capable of utilizing both textual information and multiple images about a product to construct meaningful product features. By leveraging semantic information from images, MIEM effectively supplements the image search process, improving the overall accuracy of retrieval results. MIEM has become an integral part of the Shopee image search platform, with its features covering over 400 million products. Since its deployment in March 2023, it has achieved a remarkable 9.90% increase in terms of clicks per user and a 4.23% boost in terms of orders per user for the image search feature on the Shopee e-commerce platform.
Publication:

Chang Liu, Peng Hou, Anxiang Zeng & Han Yu, "Transformer-empowered Multi-modal Item Embedding for Enhanced Image Search in E-Commerce," in Proceedings of the 36th Annual Conference on Innovative Applications of Artificial Intelligence (IAAI-24), pp. 22770-22778, 2024. (Innovative Application of AI Award)

Project Title: IBCA: An Intelligent Platform for Social Insurance Benefit Qualification Status Assessment

Keywords: Fairness, Explainability, Privacy-Preservation
Introduction:
Social insurance benefits qualification assessment is an important task to ensure that retirees enjoy their benefits according to the regulations. It also plays a key role in curbing social security fraud. In this project, we develop the Intelligent Benefit Certification and Analysis (IBCA) platform, an AI-empowered platform for verifying the status of retirees to ensure proper disbursement of funds in Shandong province, China. Based on an improved Gated Recurrent Unit (GRU) neural network, IBCA aggregates missing value interpolation, temporal information, and global and local feature extraction to perform accurate retiree survival rate prediction. Based on the predicted results, a reliability assessment mechanism based on Variational Auto-Encoder (VAE) and Monte-Carlo Dropout (MC Dropout) is executed to perform reliability assessment. Deployed since November 2019, the IBCA platform has been adopted by 12 cities across the Shandong province, handling over 50 terabytes of data. It has empowered human resources and social services, civil affairs, and health care institutions to collaboratively provide high-quality public services. Under the IBCA platform, the efficiency of resource utilization as well as the accuracy of benefit qualification assessment have been significantly improved.
Publication:

Yuliang Shi, Lin Cheng, Cheng Jiang, Hui Zhang, Guifeng Li, Xiaoli Tang, Han Yu, Zhiqi Shen & Cyril Leung, "IBCA: An Intelligent Platform for Social Insurance Benefit Qualification Status Assessment," in Proceedings of the 36th Annual Conference on Innovative Applications of Artificial Intelligence (IAAI-24), pp. 22815-22823, 2024. (Innovative Application of AI Award)

Project Title: FedOBD: Efficient Training of Large-Scale Industrial Fault Diagnostic Models through Federated Opportunistic Block Dropout

Keywords: Robustness, Privacy-Preservation, Large-Scale Models
Introduction:
Artificial intelligence (AI)-empowered industrial fault diagnostics is important in ensuring the safe operation of industrial applications. Since complex industrial systems often involve multiple industrial plants (possibly belonging to different companies or subsidiaries) with sensitive data collected and stored in a distributed manner, collaborative fault diagnostic model training often needs to leverage federated learning (FL). As industrial fault diagnostic models are often large in scale and communication channels in such systems are often not exclusively used for FL model training, existing deployed FL model training frameworks cannot train such models efficiently across multiple institutions. In this project, we develop and deploy the Federated Opportunistic Block Dropout (FedOBD) approach for industrial fault diagnostic model training. By decomposing large-scale models into semantic blocks and enabling FL participants to opportunistically upload selected important blocks in a quantized manner, it significantly reduces the communication overhead while maintaining model performance. Since its deployment in ENN Group in February 2022, FedOBD has served two coal chemical plants across two cities in China to build industrial fault prediction models. It helped the company reduce the training communication overhead by over 70% compared to its previous AI Engine, while maintaining model performance at over 85% test F1 score. To our knowledge, it is the first successfully deployed dropout-based FL approach.
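The two ingredients named above, uploading only the most important blocks and sending them in quantized form, can be sketched as follows. The importance measure (L2 norm of each block's change since the last round), the `keep_ratio` parameter, and the uniform 8-bit quantizer are all assumptions chosen for illustration; they are not taken from the FedOBD papers, and the function names are hypothetical.

```python
import numpy as np

def select_blocks(old_blocks, new_blocks, keep_ratio):
    """Return names of the blocks that changed most since the last round."""
    # Assumed importance measure: L2 norm of the parameter change per block.
    importance = {
        name: float(np.linalg.norm(new_blocks[name] - old_blocks[name]))
        for name in new_blocks
    }
    num_keep = max(1, int(round(keep_ratio * len(importance))))
    ranked = sorted(importance, key=importance.get, reverse=True)
    return ranked[:num_keep]

def quantize_uint8(block):
    """Uniform 8-bit quantization (a simple stand-in quantizer)."""
    lo, hi = float(block.min()), float(block.max())
    scale = (hi - lo) / 255.0 or 1.0   # avoid division by zero for constant blocks
    q = np.round((block - lo) / scale).astype(np.uint8)
    return q, lo, scale

def dequantize(q, lo, scale):
    """Server-side reconstruction of an uploaded block."""
    return q.astype(np.float64) * scale + lo
```

With this quantizer, each reconstructed value differs from the original by at most half a quantization step, while each parameter is sent as a single byte plus two floats per block.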
Publications:

Yuanyuan Chen, Zichen Chen, Pengcheng Wu & Han Yu, "FedOBD: Opportunistic Block Dropout for Efficiently Training Large-scale Neural Networks through Federated Learning," in Proceedings of the 32nd International Joint Conference on Artificial Intelligence (IJCAI'23), pp. 3541-3549, 2023.

Yuanyuan Chen, Zichen Chen, Sheng Guo, Yansong Zhao, Zelei Liu, Pengcheng Wu, Chengyi Yang, Zengxiang Li & Han Yu, "Efficient Training of Large-Scale Industrial Fault Diagnostic Models through Federated Opportunistic Block Dropout," in Proceedings of the 35th Annual Conference on Innovative Applications of Artificial Intelligence (IAAI-23), pp. 15485-15493, 2023. (Innovative Application of AI Award)

Project Title: HACFL: Hierarchical Auctioning in Crowd-based Federated Learning

Keywords: Fairness, Privacy-Preservation, Data Valuation, Trading
Introduction:
In open collaborative federated learning (FL) taking place within a network of participants, anyone can initiate an FL model training task. Participants can either bid to join an FL task, or help refer others in their own networks. Currently, there is a lack of simulation and benchmarking tools to support research in this domain. In this research, we built Hierarchical Auctioning in Crowd-based Federated Learning (HACFL), a benchmark platform which enables simulations of FL networks with any given topology and reputation-aware hierarchical auction-based FL team formation to support research in this domain. It consists of a configurable back-end simulation system and a web-based interactive user interface, allowing end users and researchers to visualize trust-based open collaborative FL training processes. Results show that leveraging such an ecosystem of FL participants not only improves model performance, but also improves social welfare.
Publication:

Yulan Gao, Yansong Zhao & Han Yu, "Multi-Tier Client Selection for Mobile Federated Learning Networks," in Proceedings of the 2023 IEEE International Conference on Multimedia and Expo (ICME'23), pp. 666-671, 2023.

Project Title: Contribution-Aware Federated Learning (CAreFL)

Keywords: Explainability, Fairness, Privacy-Preservation
Introduction:
Artificial intelligence (AI) is a promising technology to transform the healthcare industry. Due to the highly sensitive nature of patient data, federated learning (FL) is often leveraged to build models for smart healthcare applications. Existing deployed FL frameworks cannot address the key issues of varying data quality and heterogeneous data distributions across multiple institutions in this sector. In this project, we design, develop and deploy the Contribution-Aware Federated Learning (CAreFL) framework for smart healthcare. It provides fair and explainable FL participant contribution evaluation in an efficient and privacy-preserving manner, and optimizes the FL model aggregation approach based on the evaluation results. Since its deployment in Yidu Cloud Technology Inc. in 2021, CAreFL has served 8 well-established medical institutions in China to build healthcare decision support models. It can perform contribution evaluations 2.84 times faster than the best existing approach, and has improved the average accuracy of the resulting models by 2.62% compared to the previous system (which is significant in industrial settings). To our knowledge, it is the first contribution-aware federated learning framework successfully deployed in the healthcare industry.
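Participant contribution evaluation of this kind is commonly framed in terms of Shapley values, as the GTG-Shapley publication listed for this project suggests. The sketch below computes exact Shapley values by enumerating every ordering of the clients, which is only feasible for a handful of participants; it is not the GTG-Shapley algorithm, and `utility` is a hypothetical callback that returns the model quality achieved by a given coalition of clients.

```python
from itertools import permutations

def shapley_values(clients, utility):
    """Exact Shapley value of each client under `utility`.

    clients: list of hashable client identifiers
    utility: function mapping a frozenset of clients to a number
             (a hypothetical stand-in for coalition model quality)
    """
    values = {c: 0.0 for c in clients}
    orders = list(permutations(clients))
    for order in orders:
        coalition = []
        previous = utility(frozenset())
        for client in order:
            coalition.append(client)
            current = utility(frozenset(coalition))
            values[client] += current - previous   # marginal contribution
            previous = current
    return {c: v / len(orders) for c, v in values.items()}
```

The values always sum to the utility of the full coalition minus that of the empty one, so they can be read directly as each participant's share of the overall gain.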
Publications:

Zelei Liu, Yuanyuan Chen, Yansong Zhao, Han Yu, Yang Liu, Renyi Bao, Jinpeng Jiang, Zaiqing Nie, Qian Xu & Qiang Yang, "Contribution-Aware Federated Learning for Smart Healthcare," in Proceedings of the 34th Annual Conference on Innovative Applications of Artificial Intelligence (IAAI-22), pp. 12396-12404, 2022. (Innovative Application of AI Award)

Zelei Liu, Yuanyuan Chen, Han Yu, Yang Liu & Lizhen Cui. GTG-Shapley: Efficient and accurate participant contribution evaluation in federated learning. ACM Transactions on Intelligent Systems and Technology, ACM (2022).

Project Title: CrowdFL: A Marketplace for Crowdsourced Federated Learning

Keywords: Privacy-Preservation, Data Valuation, Trading, Crowdsourcing
Introduction:
Amid data privacy concerns, Federated Learning (FL) has emerged as a promising machine learning paradigm that enables privacy-preserving collaborative model training. However, there exists a need for a platform that matches data owners (supply) with model requesters (demand). In this project, we built CrowdFL, a platform to facilitate the crowdsourcing of FL model training. It coordinates client selection, model training, and reputation management, which are essential steps for the FL crowdsourcing operations. By implementing model training on actual mobile devices, we demonstrate that the platform improves model performance and training efficiency. To the best of our knowledge, it is the first platform to support crowdsourcing-based FL on edge devices.
Publication:

Daifei Feng, Cicilia Helena, Wei Yang Bryan Lim, Jer Shyuan Ng, Hongchao Jiang, Zehui Xiong, Jiawen Kang, Han Yu, Dusit Niyato & Chunyan Miao, "CrowdFL: A Marketplace for Crowdsourced Federated Learning," in Proceedings of the 36th AAAI Conference on Artificial Intelligence (AAAI-22), 2022.

Project Title: FedVision: An Online Visual Object Detection Platform Powered by Federated Learning

Keywords: Privacy-Preservation, Visualization, Process Management
Introduction:
Visual object detection is a computer vision-based artificial intelligence (AI) technique which has many practical applications (e.g., fire hazard monitoring). However, due to privacy concerns and the high cost of transmitting video data, it is highly challenging to build object detection models on centrally stored large training datasets following the current approach. Federated learning (FL) is a promising approach to resolve this challenge. Nevertheless, there is currently no easy-to-use tool that enables computer vision application developers who are not experts in federated learning to conveniently leverage this technology and apply it in their systems. In this project, we built FedVision - a machine learning engineering platform to support the development of federated learning powered computer vision applications. The platform has been deployed through a collaboration between WeBank and Extreme Vision to help customers develop computer vision-based safety monitoring solutions in smart city applications. Over four months of usage, it has achieved significant efficiency improvement and cost reduction while removing the need to transmit sensitive data for three major corporate customers. To the best of our knowledge, this is the first real application of FL in computer vision-based tasks.
Publications:

Yang Liu, Anbu Huang, Yun Luo, He Huang, Youzhi Liu, Yuanyuan Chen, Lican Feng, Tianjian Chen, Han Yu & Qiang Yang, Federated learning-powered visual object detection for safety monitoring. AI Magazine, vol. 42, no. 2, AAAI Press (2021).

Yang Liu, Anbu Huang, Yun Luo, He Huang, Youzhi Liu, Yuanyuan Chen, Lican Feng, Tianjian Chen, Han Yu & Qiang Yang, "FedVision: An Online Visual Object Detection Platform Powered by Federated Learning," in Proceedings of the 32nd Annual Conference on Innovative Applications of Artificial Intelligence (IAAI-20), pp. 13172-13179, 2020. (Innovative Application of AI Award)