Greetings,
We would like to invite you to submit to FinLayout, a shared task on
document layout analysis in the financial domain, held in conjunction with
IJCAI-ECAI-2022 (Messe Wien, Vienna, Austria, July 23-25, 2022) as part
of the FinNLP-2022 workshop.
Shared Task URL:
https://sites.google.com/nlg.csie.ntu.edu.tw/finnlp-2022/shared-task-finlayout
Workshop URL:
https://sites.google.com/nlg.csie.ntu.edu.tw/finnlp-2022/home
Registration Form:
https://docs.google.com/forms/d/1EZTTheA4rLomtLOHKU0xQnraCRc1obuk1IGR1TCqwBc/edit?usp=sharing
=====Introduction=====
The 1st edition of FinLayout introduces a shared task on document layout
analysis in the financial domain. Visual features provide strong
indications about the structure of a document. Most approaches to
document layout analysis therefore rely on image-based methods to
extract the structure of each page. Deep neural networks developed for
computer vision have proven effective at analyzing the layout of
document images.
=====Task Description=====
In this shared task, we propose to recognize the layout of financial
documents, represented through four labels: Text, Title, Figure, and List.
As the training set, we propose to use PubLayNet (Zhong et al., 2019), a
large dataset (~100 GB) of document images whose layout is annotated with
bounding boxes. The dataset contains over 1 million PDF articles that are
publicly available on PubMed Central, a free full-text archive of
biomedical and life sciences journal literature at the U.S. National
Institutes of Health's National Library of Medicine.
The goal is to demonstrate that models trained on PubLayNet, which
contains scientific articles, can accurately recognize the layout of
other types of documents, in particular financial documents, thereby
showing the effectiveness of transfer learning.
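As an illustration, PubLayNet distributes its ground truth in COCO-style JSON (`images`, `annotations`, `categories`). The sketch below shows how one might index bounding boxes per page and keep only the four task labels; the miniature in-memory annotation dict is a made-up stand-in for a real annotation file, not part of the official release.

```python
from collections import defaultdict

# The four layout labels used in the FinLayout task.
TASK_LABELS = {"text", "title", "figure", "list"}

def index_annotations(coco):
    """Group bounding boxes by page image, keeping only the task labels.

    `coco` is a dict in COCO layout (`images`, `annotations`,
    `categories`). Returns {file_name: [(label, [x, y, w, h]), ...]}.
    """
    cat_name = {c["id"]: c["name"].lower() for c in coco["categories"]}
    file_name = {im["id"]: im["file_name"] for im in coco["images"]}
    boxes = defaultdict(list)
    for ann in coco["annotations"]:
        label = cat_name[ann["category_id"]]
        if label in TASK_LABELS:  # drop categories outside the task
            boxes[file_name[ann["image_id"]]].append((label, ann["bbox"]))
    return dict(boxes)

# Toy annotation dict standing in for a PubLayNet shard (illustrative only).
toy = {
    "images": [{"id": 1, "file_name": "page_0001.png"}],
    "categories": [
        {"id": 1, "name": "text"}, {"id": 2, "name": "title"},
        {"id": 3, "name": "list"}, {"id": 4, "name": "table"},
        {"id": 5, "name": "figure"},
    ],
    "annotations": [
        {"image_id": 1, "category_id": 2, "bbox": [50, 40, 500, 30]},
        {"image_id": 1, "category_id": 1, "bbox": [50, 90, 500, 200]},
        {"image_id": 1, "category_id": 4, "bbox": [50, 310, 500, 150]},
    ],
}

indexed = index_annotations(toy)
print(indexed["page_0001.png"])  # the "table" box is filtered out
```

In practice, the same filtering would be applied when converting PubLayNet annotations into the label set evaluated by the shared task.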
=====Reference=====
X. Zhong, J. Tang, and A. Jimeno Yepes, "PubLayNet: Largest Dataset Ever for Document Layout Analysis," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), 2019.
G. Paaß and I. Konya, "Machine Learning for Document Structure Recognition," in Modeling, Learning, and Processing of Text Technological Data Structures. Springer, 2011, pp. 221–247.
A. M. Namboodiri and A. K. Jain, "Document Structure and Layout Analysis," in Digital Document Processing. Springer, 2007, pp. 29–48.
M. El Haj, P. Rayson, S. Young, and M. Walker, "Detecting Document Structure in a Very Large Corpus of UK Financial Reports," in Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC-2014), 2014, pp. 1335–1338.
C. Ramakrishnan, A. Patnia, E. Hovy, and G. A. Burns, "Layout-Aware Text Extraction from Full-Text PDF of Scientific Articles," Source Code for Biology and Medicine, vol. 7, no. 1, p. 7, May 2012. [Online]. Available: https://doi.org/10.1186/1751-0473-7-7
S. Tuarob, P. Mitra, and C. L. Giles, "A Hybrid Approach to Discover Semantic Hierarchical Sections in Scholarly Documents," in Proceedings of the 13th International Conference on Document Analysis and Recognition (ICDAR), Aug. 2015, pp. 1081–1085.
S. Budhiraja and V. Mago, "A Supervised Learning Approach for Heading Detection," CoRR, vol. abs/1809.01477, 2018. [Online]. Available: http://arxiv.org/abs/1809.01477
A. R. Katti, C. Reisswig, C. Guder, S. Brarda, S. Bickel, J. Höhne, and J. B. Faddoul, "Chargrid: Towards Understanding 2D Documents," in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018, pp. 4459–4469.
Y. Xu, M. Li, L. Cui, S. Huang, F. Wei, and M. Zhou, "LayoutLM: Pre-training of Text and Layout for Document Image Understanding," in Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2020.
M. Aggarwal, M. Sarkar, H. Gupta, and B. Krishnamurthy, "Multi-Modal Association Based Grouping for Form Structure Extraction," in Proceedings of the 2020 IEEE Winter Conference on Applications of Computer Vision (WACV), 2020, pp. 2064–2073.
W. Lin et al., "ViBERTgrid: A Jointly Trained Multi-modal 2D Document Representation for Key Information Extraction from Documents," in J. Lladós, D. Lopresti, and S. Uchida (eds.), Document Analysis and Recognition – ICDAR 2021, Lecture Notes in Computer Science, vol. 12821. Springer, 2021.
=====Registration=====
To register your interest in participating in the FinLayout shared task,
please use the following Google form:
https://docs.google.com/forms/d/1EZTTheA4rLomtLOHKU0xQnraCRc1obuk1IGR1TCqwBc/edit?usp=sharing
=====Prize=====
A USD 1,000 prize will be awarded to the best-performing team.
=====Important Dates=====
April 12, 2022: First announcement of the shared task and start of registration.
April 20, 2022: Release of training set and scoring scripts.
May 20, 2022: Release of test set.
May 26, 2022: System output submission deadline.
May 30, 2022: Release of results.
May 30, 2022: Shared task title and abstract due.
June 06, 2022: Shared task paper submissions due.
June 17, 2022: Registration deadline.
June 17, 2022: Camera-ready version of shared task paper due.
July 23-25, 2022: FinNLP-2022 workshop @ IJCAI-ECAI-2022.
=====Contact=====
For any questions about the shared task, please contact us at
fin.layout.task@gmail.com.
=====Shared Task Co-organizers - Fortia Financial Solutions=====
Willy Au
Abderrahim AIT AZZI
Sandra Bellato
Mei Gan
Juyeon KANG
_______________________________________________
AISWorld mailing list
AISWorld@lists.aisnet.org