
Artificial Neural Networks-Driven High Precision Tabular Information Extraction from Datasheets

dc.contributor.author: Fernandes, Johan
dc.contributor.supervisor: Kantarci, Burak
dc.date.accessioned: 2022-03-11T14:27:33Z
dc.date.available: 2022-09-11T09:00:07Z
dc.date.issued: 2022-03-11
dc.description.abstract: Global organizations have adopted Industry 4.0 practices to stay viable through the information shared in billions of digital documents. The information in these documents is vital to their daily operations, and the most critical information is typically laid out in tabular format so that it can be conveyed concisely. Extracting this critical data and providing access to the latest information can help institutions make evidence-based, data-driven decisions. Assembling such data for analysis can further enable organizations to automate processes such as manufacturing. A generalized solution for table text extraction must handle variations in page content and table layout in order to extract the text accurately. We hypothesize that a table text extraction pipeline can extract this data in three stages. The first stage identifies the images that contain tables and detects the table regions. The second stage takes each detected table region and detects the table's rows and columns. The last stage extracts the text from the cell locations generated by the intersections of the detected rows and columns. For the first stage of the pipeline, we propose TableDet: a deep learning (artificial neural network) based methodology that solves table detection and table image classification in datasheet (document) images in a single inference. TableDet utilizes a Cascade R-CNN architecture with Complete IoU (CIoU) loss at each box head and a deformable convolution backbone to capture the variations of tables that appear at multiple scales and orientations. It also detects text and figures to enhance its table detection performance. We demonstrate the effectiveness of training TableDet with a dual-step transfer learning process and fine-tuning it with Table Aware Cutout (TAC) augmented images.
TableDet achieves the highest F1 score for table detection against state-of-the-art solutions on ICDAR 2013 (complete set), ICDAR 2017 (test set) and ICDAR 2019 (test set), with 100%, 99.3% and 95.1% respectively. We show that this enhanced table detection performance can be leveraged for the table image classification task by adding a classification head that comprises three conditions. On this task, TableDet achieves 100% recall and above 92% precision on all three test sets. These classification results indicate that all images with tables, along with a significantly reduced number of images without tables, would be promoted to the next stage of the table text extraction pipeline. For the second stage, we propose TableStrDet, a deep learning (artificial neural network) based approach that recognizes the structure of the table regions detected in stage 1 by detecting and classifying rows and columns. TableStrDet comprises two Cascade R-CNN architectures, each with a deformable convolution backbone and CIoU loss to improve detection performance. One architecture detects and classifies columns as regular columns (columns without a merged cell) or irregular columns (groups of regular columns that share a merged cell). The other detects and classifies rows as regular rows (rows without a merged cell) or irregular rows (groups of regular rows that share a merged cell). Both architectures work in parallel to provide the results in a single inference. We show that detecting these four classes of objects with TableStrDet enhances the quality of table structure detection by capturing table contents that may or may not have hierarchical layouts, as evaluated on two public test sets. On the TabStructDB test set we achieve 72.7% and 78.5% weighted average F1 scores for rows and columns respectively; on the ICDAR 2013 test set we achieve 90.5% and 89.6% weighted average F1 scores for rows and columns respectively.
Furthermore, we show that TableStrDet has higher generalization potential across the available datasets.
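The CIoU loss named in the abstract is the published Complete IoU loss (Zheng et al., 2020), applied at each box head of the Cascade R-CNN. As a point of reference, here is a minimal, dependency-free sketch of that loss for a single pair of axis-aligned boxes; the thesis applies it inside the detector's regression heads, and this standalone function is only an illustration of the formula, not the thesis implementation:

```python
import math

def ciou_loss(box_a, box_b):
    """CIoU loss between boxes given as (x1, y1, x2, y2).

    CIoU = IoU - d^2/c^2 - alpha*v, and loss = 1 - CIoU, where d is the
    distance between box centers, c the diagonal of the smallest enclosing
    box, and v penalizes aspect-ratio mismatch.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection over union.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    iou = inter / (area_a + area_b - inter)

    # Squared distance between box centers, normalized by the squared
    # diagonal of the smallest enclosing box.
    d2 = (((ax1 + ax2) - (bx1 + bx2)) / 2) ** 2 + (((ay1 + ay2) - (by1 + by2)) / 2) ** 2
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2

    # Aspect-ratio consistency term and its trade-off weight alpha.
    v = (4 / math.pi ** 2) * (math.atan((ax2 - ax1) / (ay2 - ay1))
                              - math.atan((bx2 - bx1) / (by2 - by1))) ** 2
    alpha = v / (1 - iou + v) if (1 - iou + v) > 0 else 0.0

    return 1 - (iou - d2 / c2 - alpha * v)
```

Unlike plain IoU loss, the center-distance and aspect-ratio terms keep the gradient informative even when predicted and ground-truth boxes barely overlap, which is why CIoU is a common choice for box regression.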
dc.embargo.terms: 2022-09-11
dc.identifier.uri: http://hdl.handle.net/10393/43374
dc.identifier.uri: http://dx.doi.org/10.20381/ruor-27591
dc.language.iso: en
dc.publisher: Université d'Ottawa / University of Ottawa
dc.subject: Deep Learning
dc.subject: Information Extraction
dc.subject: Table Detection
dc.subject: Document Processing
dc.subject: Supply Chain Optimization
dc.subject: Image Processing
dc.title: Artificial Neural Networks-Driven High Precision Tabular Information Extraction from Datasheets
dc.type: Thesis
thesis.degree.discipline: Génie / Engineering
thesis.degree.level: Masters
thesis.degree.name: MCS
uottawa.department: Science informatique et génie électrique / Electrical Engineering and Computer Science

Files

Original bundle

Name: Fernandes_Johan_2022_thesis.pdf
Size: 9.4 MB
Format: Adobe Portable Document Format
License bundle

Name: license.txt
Size: 6.65 KB
Format: Item-specific license agreed upon to submission