
Artificial Neural Networks-Driven High Precision Tabular Information Extraction from Datasheets

dc.contributor.author: Fernandes, Johan
dc.contributor.supervisor: Kantarci, Burak
dc.date.accessioned: 2022-03-11T14:27:33Z
dc.date.available: 2022-09-11T09:00:07Z
dc.date.issued: 2022-03-11
dc.description.abstract: Global organizations have adopted Industry 4.0 practices to stay viable through the information shared in billions of digital documents. The information in these documents is vital to their daily operations, and the most critical information is typically laid out in tabular format so that it can be conveyed concisely. Extracting this critical data and providing access to the latest information can help institutions make evidence-based, data-driven decisions. Assembling such data for analysis can further enable organizations to automate processes such as manufacturing. A generalized solution for table text extraction must handle variations in page content and table layout in order to extract the text accurately. We hypothesize that a table text extraction pipeline can extract this data in three stages. The first stage identifies the images that contain tables and detects the table regions. The second stage takes each detected table region and detects the table's rows and columns. The last stage extracts the text from the cell locations generated by the intersections of the detected rows and columns. For the first stage of the pipeline, we propose TableDet: a deep learning (artificial neural network) based methodology that solves table detection and table image classification in datasheet (document) images in a single inference. TableDet utilizes a Cascade R-CNN architecture with Complete IoU (CIoU) loss at each box head and a deformable convolution backbone to capture the variations of tables that appear at multiple scales and orientations. It also detects text and figures to enhance its table detection performance. We demonstrate the effectiveness of training TableDet with a dual-step transfer learning process and fine-tuning it with Table Aware Cutout (TAC) augmented images.
TableDet achieves the highest F1 score for table detection against state-of-the-art solutions on ICDAR 2013 (complete set), ICDAR 2017 (test set) and ICDAR 2019 (test set), with 100%, 99.3% and 95.1% respectively. We show that this enhanced table detection performance can be leveraged for the table image classification task by adding a classification head that comprises three conditions. On this task, TableDet achieves 100% recall and above 92% precision on all three test sets. These classification results indicate that all images with tables, along with a significantly reduced number of images without tables, would be promoted to the next stage of the table text extraction pipeline. For the second stage, we propose TableStrDet, a deep learning (artificial neural network) based approach that recognizes the structure of the table regions detected in stage 1 by detecting and classifying rows and columns. TableStrDet comprises two Cascade R-CNN architectures, each with a deformable convolution backbone and CIoU loss to improve detection performance. One architecture detects and classifies columns as regular columns (columns without a merged cell) or irregular columns (groups of regular columns that share a merged cell). The other detects and classifies rows as regular rows (rows without a merged cell) or irregular rows (groups of regular rows that share a merged cell). Both architectures work in parallel to provide the results in a single inference. We show that detecting these four classes of objects with TableStrDet enhances the quality of table structure detection by capturing table contents that may or may not have hierarchical layouts, as evaluated on two public test sets. On the TabStructDB test set we achieve 72.7% and 78.5% weighted average F1 scores for rows and columns respectively; on the ICDAR 2013 test set we achieve 90.5% and 89.6% weighted average F1 scores for rows and columns respectively.
Furthermore, we show that TableStrDet has higher generalization potential across the available datasets.
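The CIoU loss named in the abstract is the published Complete IoU loss (Zheng et al., 2020), applied at each box head of the Cascade R-CNN. As a point of reference, here is a minimal, dependency-free sketch of that loss for a single pair of axis-aligned boxes; the thesis applies it inside the detector's regression heads, and this standalone function is only an illustration of the formula, not the thesis implementation:

```python
import math

def ciou_loss(box_a, box_b):
    """CIoU loss between boxes given as (x1, y1, x2, y2).

    CIoU = IoU - d^2/c^2 - alpha*v, and loss = 1 - CIoU, where d is the
    distance between box centers, c the diagonal of the smallest enclosing
    box, and v penalizes aspect-ratio mismatch.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection over union.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    iou = inter / (area_a + area_b - inter)

    # Squared distance between box centers, normalized by the squared
    # diagonal of the smallest enclosing box.
    d2 = (((ax1 + ax2) - (bx1 + bx2)) / 2) ** 2 + (((ay1 + ay2) - (by1 + by2)) / 2) ** 2
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2

    # Aspect-ratio consistency term and its trade-off weight alpha.
    v = (4 / math.pi ** 2) * (math.atan((ax2 - ax1) / (ay2 - ay1))
                              - math.atan((bx2 - bx1) / (by2 - by1))) ** 2
    alpha = v / (1 - iou + v) if (1 - iou + v) > 0 else 0.0

    return 1 - (iou - d2 / c2 - alpha * v)
```

Unlike plain IoU loss, the center-distance and aspect-ratio terms keep the gradient informative even when predicted and ground-truth boxes barely overlap, which is why CIoU is a common choice for box regression.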
dc.embargo.terms: 2022-09-11
dc.identifier.uri: http://hdl.handle.net/10393/43374
dc.identifier.uri: http://dx.doi.org/10.20381/ruor-27591
dc.language.iso: en
dc.publisher: Université d'Ottawa / University of Ottawa
dc.subject: Deep Learning
dc.subject: Information Extraction
dc.subject: Table Detection
dc.subject: Document Processing
dc.subject: Supply Chain Optimization
dc.subject: Image Processing
dc.title: Artificial Neural Networks-Driven High Precision Tabular Information Extraction from Datasheets
dc.type: Thesis
thesis.degree.discipline: Génie / Engineering
thesis.degree.level: Masters
thesis.degree.name: MCS
uottawa.department: Science informatique et génie électrique / Electrical Engineering and Computer Science

Files

Original bundle

Name: Fernandes_Johan_2022_thesis.pdf
Size: 9.4 MB
Format: Adobe Portable Document Format
License bundle

Name: license.txt
Size: 6.65 KB
Format: Item-specific license agreed upon to submission