___________________________________________________________
CALL FOR PAPERS
ACM Journal of Data and Information Quality
Special Issue on Deep Learning for Data Quality
___________________________________________________________
* Guest Editors:
-Paolo Papotti, EURECOM (France)
-Donatello Santoro, Università degli Studi della Basilicata (Italy)
-Saravanan Thirumuruganathan, QCRI (Qatar)
* Context:
Deep learning (DL) has recently been used successfully for monitoring
and improving data quality (DQ). Examples include data integration
tasks such as entity resolution and schema matching, data cleaning
tasks such as error detection and repair, and data curation in
general.
The data curation community has successfully leveraged deep learning
techniques spanning from word embeddings to transformers to achieve
state-of-the-art performance on well-established data quality
benchmarks.
Nevertheless, there is still an open debate on which technical
solution performs best for relational data and under which setting.
Despite a promising start, deep learning for data quality has a long
way to go in achieving the human-level performance that it has
achieved in domains such as computer vision, natural language
processing, and speech recognition. While there have been some
substantial improvements in specific tasks such as entity resolution
and data repair/imputation, many of the other data quality tasks (such
as data discovery, data profiling, data integration, record fusion)
are yet to fully benefit from the DL revolution. Also, it is not clear
how to push DL techniques to get the same level of adaptation achieved
by more traditional logic-based methods. For example, interpretability
of the models is a key stumbling block.
How can one develop DQ explanations that can be consumed by
non-experts?
Should the explanation be generated individually for each error?
Or can it be summarized so that the user gets a high-level overview?
Finally, DL data quality tools need novel explanation algorithms,
which are not a priority for DL researchers because the architecture
is quite specific.
This special issue focuses on deep learning used for assessing and
improving the quality of data. Thus, the issue is addressed to those
members of the data science community proposing novel methods,
architectures and algorithms capable of integrating, cleaning and
profiling relational data sources with supervised and unsupervised
approaches.
* Topics:
The goal of this special issue is to collect recent advances,
innovations, and practices in ML, data and software engineering for
building techniques, solutions, and systems that support users in
assessing and improving relational data quality. The topics of
interest are inspired by the themes above and include, but are not
limited to:
- Deep learning methods for data integration and data cleaning
- Deep learning methods for metadata discovery/profiling, including
  constraint discovery
- Making deep learning methods for data quality interpretable
- Experimental studies of deep learning methods for data quality
- Deep learning methods for curating data in domain-specific
  applications
- Scalability of deep learning methods for data quality (speeding up
  DL for DQ using GPUs)
- Characterization of data quality tasks that are more amenable to
  deep learning
- Reducing the need for large amounts of training data in supervised
  approaches (weak- and self-supervision for data quality)
- Combination of logic-based and DL-based methods for data quality
* Expected contributions:
We welcome four types of research contributions:
- Full research papers describing a novel contribution to the field
  (up to 25 pages)
- Experience papers discussing important lessons learned (up to 20
  pages)
- Vision and Challenge papers (up to 7 pages)
- Survey papers (up to 30 pages)
* Submission Format:
JDIQ welcomes manuscripts that extend prior published work, provided
they contain at least 30% new material, and that the significant new
contributions are clearly identified in the introduction.
Submission guidelines with LaTeX (preferred) or Word templates are
available at:
http://jdiq.acm.org/authors.cfm#subm
To submit, select the paper type "SI: Deep Learning for Data Quality".
* Important Dates:
- Submission deadline: March 1, 2021
- First notification: May 15, 2021
- Revised manuscripts deadline: July 15, 2021
- Final notification: September 15, 2021
- Camera-ready manuscripts: October 15, 2021
- Estimated publication date: January 2022