-------- Forwarded Message --------
Dear Colleagues,
We are guest-editing a special issue on "Information and Data
Quality for Intelligent Systems" for the journal Information
Discovery and Delivery. Details are below:
Special issue: Information and Data Quality for Intelligent
Systems
Journal: Information Discovery and Delivery
Link to the journal website:
https://www.emeraldgrouppublishing.com/calls-for-papers/information-and-data-quality-intelligent-systems
Introduction: Data quality (DQ) is critical to successfully
implementing artificial intelligence (AI) systems. However, both
AI researchers and practitioners overwhelmingly concentrate on
models and algorithms while undervaluing the impact of DQ (Sambasivan
et al., 2021). AI research is shifting from
model-centric AI toward data-centric AI (Ng, 2021). In 2021
alone, 108,000 journal and conference articles were
published on DQ in AI and closely related areas. The
demand for DQ assurance in high-stakes domains such as medicine,
law, and cybersecurity is even more intense and urgent (Sambasivan
et al., 2021). Recently, techniques such as semi-supervised
learning (SSL), transfer learning (TL), few-shot learning (FSL),
active learning (AL), and generative adversarial networks (GANs)
have been proposed by natural language processing (NLP) and
machine learning (ML) researchers to improve model performance
when the training data are of low quality or insufficient
in quantity (Lourentzou, 2019). However, DQ
assessment, assurance, and improvement across the
whole life cycle of building an intelligent system have not yet
been well investigated (Chen et al., 2021).
This special issue aims to bring AI and information science
researchers together to understand the challenges, investigate the
problems, propose solutions, exchange ideas, share resources, and
look for new research directions in the field of data quality for
intelligent systems (DQIS). This
special issue focuses on how to use state-of-the-art (SOTA)
technologies in the assessment, assurance, and improvement of big
data for building high-quality intelligent systems. Big data are
harvested to build intelligent systems that support a broad
array of applications, from biomedicine, healthcare, education, and
legal intelligence to smart cities and autopilot (Roh et al., 2021).
DQ can significantly affect the quality of the intelligent
systems built on those data. The special issue seeks to publish
research and practice on the quantitative and systematic evaluation
and improvement of DQ in specific applications and
domains, and on the roles and best practices of different ML and deep
learning (DL) techniques for DQ improvement.
Indicative list of anticipated article topics:
* Data quality assessment for machine learning and deep learning
(including the definition of dimensions, measurement, and evaluation
techniques)
* Data quality management in high-stakes domains (e.g., legal,
medical, cybersecurity)
* Quality evaluation of knowledge graphs and ontology systems
* Experimental studies of the impact of data quality on the
performance of machine learning and deep learning
* Techniques for data quality issue detection and data quality
improvement
* Data augmentation using transfer learning, fine-tuning,
semi-supervised learning, GANs, and other current techniques
* Exploratory data analysis
* Data security and privacy
* Fairness in machine learning (e.g., how to handle missing data)
* Ethics in machine learning (e.g., biased data leads to biased
results)
* The role of human factors in data quality assurance
* Other related topics
Guest Editors:
* Dr. Junhua Ding, Department of Information Science, University
of North Texas, Denton, Texas, USA, Email:
junhua.ding@unt.edu
* Dr. Haihua Chen, Department of Information Science, University
of North Texas, Denton, Texas, USA, Email:
Haihua.chen@unt.edu
* Dr. Lei Li, Department of Information Management, Beijing Normal
University, Beijing, China, Email:
leili@bnu.edu.cn
* Dr. Ismini Lourentzou, Department of Computer Science, Virginia
Tech, Blacksburg, Virginia, USA, Email:
ilourentzou@vt.edu
Important Dates:
* First announcement/CfP: December 15, 2021
* Second CfP: April 15, 2022
* Final Reminder: May 23, 2022
* Submissions due: May 30, 2022
* Papers sent to reviewers: June 7, 2022
* Reviews due: July 31, 2022
* Author notification: August 31, 2022
* Final papers: September 21, 2022
References:
[1] Chen, Haihua, Jiangping Chen, and Junhua Ding. "Data
Evaluation and Enhancement for Quality Improvement of Machine
Learning." IEEE Transactions on Reliability, vol. 70, no. 2,
pp. 831-847, June 2021. DOI: 10.1109/TR.2021.3070863.
[2] Lourentzou, Ismini. "Data Quality in the Deep Learning Era:
Active Semi-Supervised Learning and Text Normalization for Natural
Language Understanding." Dissertation, University of Illinois at
Urbana-Champaign, 2019.
[3] Ng, Andrew. "A Chat with Andrew on MLOps: From Model-Centric
to Data-Centric AI." 2021. [Online; accessed 12-01-2021].
[4] Roh, Yuji, Geon Heo, and Steven Euijong Whang. "A Survey on
Data Collection for Machine Learning: A Big Data - AI Integration
Perspective." IEEE Transactions on Knowledge and Data
Engineering, vol. 33, no. 4, pp. 1328-1347, 1 April 2021. DOI:
10.1109/TKDE.2019.2946162.
[5] Sambasivan, Nithya, Shivani Kapania, Hannah Highfill, Diana
Akrong, Praveen Paritosh, and Lora M. Aroyo. "'Everyone wants to
do the model work, not the data work': Data Cascades in
High-Stakes AI." In Proceedings of the 2021 CHI Conference on
Human Factors in Computing Systems, pp. 1-15, 2021. DOI:
10.1145/3411764.3445518.
We welcome any questions or further discussion. Thank you.
Best,
Haihua Chen, PhD
Clinical Assistant Professor in Data Science
Department of Information Science
University of North Texas
1155 Union Circle #311068
Denton, Texas 76203-5017
Email:
Haihua.chen@unt.edu
Phone: 940-220-0057
Homepage:
https://iia.ci.unt.edu/haihua-chen/