Document Layout Analysis: A Comprehensive Survey

Authors: Galal M. Binmakhashen, Sabri A. Mahmoud

Article No.: 109, Pages 1-36. Published: 16 October 2019.

Abstract

Document layout analysis (DLA) is a preprocessing step of document understanding systems. It is responsible for detecting and annotating the physical structure of documents. DLA has several important applications, such as document retrieval, content categorization, and text recognition. The objective of DLA is to ease the subsequent analysis and recognition phases by identifying the homogeneous blocks of a document and determining their relationships. The DLA pipeline consists of several phases that can vary among DLA methods, depending on the documents’ layouts and the final analysis objectives. In this regard, a universal DLA algorithm that fits all types of document layouts or satisfies all analysis objectives has not yet been developed. In this survey paper, we present a critical study of different document layout analysis techniques. The study highlights the motivations for pursuing DLA and comprehensively discusses the different phases of DLA algorithms based on a general framework formed as an outcome of reviewing the research in the field. The DLA framework consists of preprocessing, layout analysis strategies, post-processing, and performance evaluation phases. Overall, the article delivers an essential baseline for pursuing further research in document layout analysis.
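To make the four framework phases concrete, the following is a minimal sketch in plain Python, not the survey's own method. It assumes a tiny grayscale page stored as nested lists, uses a fixed global threshold for preprocessing, a horizontal projection-profile split as a stand-in layout analysis strategy, a gap-based merge for post-processing, and interval intersection-over-union for evaluation. All function names, the threshold of 128, and the `max_gap` parameter are illustrative assumptions.

```python
from typing import List, Tuple

Image = List[List[int]]   # grayscale rows, 0 (black) to 255 (white)
Block = Tuple[int, int]   # (first_row, last_row), both inclusive


def binarize(gray: Image, threshold: int = 128) -> Image:
    """Preprocessing: fixed global threshold; 1 = ink, 0 = background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def segment_rows(binary: Image) -> List[Block]:
    """Layout analysis: cut the page at rows that contain no ink."""
    blocks: List[Block] = []
    start = None
    for y, row in enumerate(binary):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                      # a new block begins
        elif not has_ink and start is not None:
            blocks.append((start, y - 1))  # the block ended on the previous row
            start = None
    if start is not None:                  # block that runs to the page bottom
        blocks.append((start, len(binary) - 1))
    return blocks


def merge_close(blocks: List[Block], max_gap: int = 1) -> List[Block]:
    """Post-processing: merge blocks separated by at most max_gap empty rows."""
    merged: List[Block] = []
    for top, bottom in blocks:
        if merged and top - merged[-1][1] - 1 <= max_gap:
            merged[-1] = (merged[-1][0], bottom)
        else:
            merged.append((top, bottom))
    return merged


def row_iou(a: Block, b: Block) -> float:
    """Evaluation: intersection-over-union of two row intervals."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)
    union = (a[1] - a[0] + 1) + (b[1] - b[0] + 1) - inter
    return inter / union


# Hypothetical 7x4 page: two close text rows, a wider gap, then one more row.
page = [
    [255, 255, 255, 255],
    [255,  10,  20, 255],
    [255, 255, 255, 255],
    [ 30, 255, 255,  40],
    [255, 255, 255, 255],
    [255, 255, 255, 255],
    [255,   0,   0, 255],
]
raw_blocks = segment_rows(binarize(page))
blocks = merge_close(raw_blocks, max_gap=1)
score = row_iou(blocks[0], (1, 4))  # compare against a made-up ground-truth block
```

Each phase is a separate function so that any one of them could be swapped, for example an adaptive binarization in place of the fixed threshold, without touching the others. This mirrors the point that the phases vary among DLA methods.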

References

Mudit Agrawal and David Doermann. 2009. Voronoi++: A dynamic page segmentation approach based on Voronoi and Docstrum features. In The International Conference on Document Analysis and Recognition. IEEE, 1011--1015.

Mudit Agrawal and David Doermann. 2010. Context-aware and content-based dynamic Voronoi page segmentation. In The 8th IAPR International Workshop on Document Analysis Systems. ACM Press, New York, 73--80.

Prakash K. Aithal, G. Rajesh, Dinesh U. Acharya, and P. C. Siddalingaswamy. 2013. A fast and novel skew estimation approach using radon transform. International Journal of Computer Information Systems and Industrial Management Applications 5 (2013), 337--344.

Alireza Alaei, Umapada Pal, and P. Nagabhushan. 2011. A new scheme for unconstrained handwritten text-line segmentation. Pattern Recognition 44, 4 (2011), 917--928.

Michele Alberti, Mathias Seuret, Vinaychandran Pondenkandath, Rolf Ingold, and Marcus Liwicki. 2017. Historical document image segmentation with LDA-initialized deep neural networks. In The 4th International Workshop on Historical Document Imaging and Processing. ACM, 95--100.

Adnan Amin and Sue Wu. 2005. A robust system for thresholding and skew detection in mixed text/graphics documents. International Journal of Image and Graphics 5, 2 (Apr. 2005), 247--265.

Khalid M. Amin, Mohamed Abd Elfattah, Aboul Ella Hassanien, and Gerald Schaefer. 2014. A binarization algorithm for historical Arabic manuscript images using a neutrosophic approach. In The 9th International Conference on Computer Engineering 8 Systems. IEEE, 266--270.

A. Antonacopoulos, C. Clausner, C. Papadopoulos, and S. Pletschacher. 2011. Historical document layout analysis competition. In International Conference on Document Analysis and Recognition. IEEE, 1516--1520.

A. Antonacopoulos, B. Gatos, and D. Karatzas. 2003. ICDAR 2003 page segmentation competition. In The 7th International Conference on Document Analysis and Recognition. 688--692.

A. Antonacopoulos and R. T. Ritchings. 1995. Representation and classification of complex-shaped printed regions using white tiles. In The 3rd International Conference on Document Analysis and Recognition, Vol. 2. IEEE Comput. Soc. Press, 1132--1135.

A. Antonacopoulos, S. Pletschacher, D. Bridson, and C. Papadopoulos. 2009. ICDAR2009 page segmentation competition. In The 10th International Conference on Document Analysis and Recognition. 1370--1374.

Apostolos Antonacopoulos and David Bridson. 2007. Performance analysis framework for layout analysis methods. In The 9th International Conference on Document Analysis and Recognition (ICDAR), Vol. 2. IEEE, 1258--1262.

Apostolos Antonacopoulos, David Bridson, Christos Papadopoulos, and Stefan Pletschacher. 2009. A realistic dataset for performance evaluation of document layout analysis. In The 10th International Conference on Document Analysis and Recognition. IEEE, 296--300.

Apostolos Antonacopoulos, Christian Clausner, Christos Papadopoulos, and Stefan Pletschacher. 2013. ICDAR2013 competition on historical newspaper layout analysis (HNLA’13). In The 12th International Conference on Document Analysis and Recognition. IEEE, 1454--1458.

Apostolos Antonacopoulos, Christian Clausner, Christos Papadopoulos, and Stefan Pletschacher. 2015. ICDAR2015 competition on recognition of documents with complex layouts. In The 13th International Conference on Document Analysis and Recognition. IEEE, 1151--1155.

Manivannan Arivazhagan, Harish Srinivasan, and Sargur Srihari. 2007. A statistical approach to line segmentation in handwritten documents. In Document Recognition and Retrieval XIV, Xiaofan Lin and Berrin A. Yanikoglu (Eds.). International Society for Optics and Photonics, 65000T.

Nikolaos Arvanitopoulos and Sabine Susstrunk. 2014. Seam carving for text line extraction on color and grayscale historical manuscripts. In The 14th International Conference on Frontiers in Handwriting Recognition. IEEE, 726--731.

Abedelkadir Asi, Rafi Cohen, Klara Kedem, and Jihad El-Sana. 2015. Simplifying the reading of historical manuscripts. In The 13th International Conference on Document Analysis and Recognition. IEEE, 826--830.

Abedelkadir Asi, Rafi Cohen, Klara Kedem, Jihad El-Sana, and Itshak Dinstein. 2014. A coarse-to-fine approach for layout analysis of ancient manuscripts. In The 14th International Conference on Frontiers in Handwriting Recognition. 140--145.

Abedelkadir Asi, Raid Saabni, and Jihad El-Sana. 2011. Text line segmentation for gray scale historical document images. In The Workshop on Historical Document Imaging and Processing. ACM Press, New York, 120.

Bruno Tenório Ávila and Rafael Dueire Lins. 2005. A fast orientation and skew detection algorithm for monochromatic document images. In The ACM Symposium on Document Engineering. ACM Press, New York, 118.

Micheal Baechler, Marcus Liwicki, and Rolf Ingold. 2013. Text line extraction using DMLP classifiers for historical manuscripts. In The 12th International Conference on Document Analysis and Recognition. IEEE, 1029--1033.

A. Bagdanov and J. Kanai. 1997. Projection profile based skew estimation algorithm for JBIG compressed images. In The 4th International Conference on Document Analysis and Recognition, Vol. 1. IEEE Comput. Soc., 401--405.

Itay Bar-Yosef, Nate Hagbi, Klara Kedem, and Itshak Dinstein. 2009. Line segmentation for degraded handwritten historical documents. In The 10th International Conference on Document Analysis and Recognition. IEEE, 1161--1165. http://ieeexplore.ieee.org/document/5277595/.

P. Barlas, S. Adam, C. Chatelain, and T. Paquet. 2014. A typed and handwritten text block segmentation system for heterogeneous and complex documents. In The 11th IAPR International Workshop on Document Analysis Systems. IEEE, 46--50.

J. Bernsen. 1986. Dynamic thresholding of gray level images. In The International Conference on Pattern Recognition. 1251--1255.

Fadi Biadsy, Jihad El-Sana, and Nizar Habash. 2006. Online Arabic handwriting recognition using hidden Markov models. In The 10th International Workshop on Frontiers in Handwriting Recognition. Suvisoft.

Thomas M. Breuel. 2003. High performance document layout analysis. In Symposium on Document Image Understanding Technology 3 (2003), 209--218.

D. Bridson and A. Antonacopoulos. 2008. A geometric approach for accurate and efficient performance evaluation of layout analysis methods. In The 19th International Conference on Pattern Recognition. IEEE, 1--4.

Syed Saqib Bukhari, Mayce Ibrahim Ali Al Azawi, Faisal Shafait, and Thomas M. Breuel. 2010. Document image segmentation using discriminative learning over connected components. In The 8th IAPR International Workshop on Document Analysis Systems. ACM Press, New York, 183--190.

Syed Saqib Bukhari, T. M. Breuel, Abedelkadir Asi, and Jihad El-Sana. 2012. Layout analysis for arabic historical document images using machine learning. In The International Conference on Frontiers in Handwriting Recognition. IEEE, 639--644.

Syed Saqib Bukhari, Faisal Shafait, and Thomas M. Breuel. 2009. Script-independent handwritten textlines segmentation using active contours. In The 10th International Conference on Document Analysis and Recognition. IEEE, 446--450. http://ieeexplore.ieee.org/document/5277636/.

Syed Saqib Bukhari, Faisal Shafait, and Thomas M. Breuel. 2011. Improved document image segmentation algorithm using multiresolution morphology. In International Society for Optics and Photonics, Gady Agam and Christian Viard-Gaudin (Eds.). International Society for Optics and Photonics, 78740D.

Marius Bulacu, Rutger Van Koert, Lambert Schomaker, and Tijn van der Zant. 2007. Layout analysis of handwritten historical documents for searching the archive of the cabinet of the Dutch Queen. In The 9th International Conference on Document Analysis and Recognition. IEEE, 351--361.

Mark J. Burge and Gladys Monagan. 1995. Using the Voronoi tessellation for grouping words and multipart symbols in documents. In The SPIE International Symposium on Optics, Imaging and Instrumentation, Robert A. Melter, Angela Y. Wu, Fred L. Bookstein, and William D. K. Green (Eds.). International Society for Optics and Photonics, 116--124.

C. Clausner A. Antonacopoulos C. Papadopoulos, S. Pletschacher. 2013. The IMPACT dataset of historical document images. In The 2nd International Workshop on Historical Document Imaging and Processing. 123--130.

Yang Cao, Shuhua Wang, and Heng Li. 2003. Skew detection and correction in document images based on straight-line fitting. Pattern Recognition Letters 24, 12 (2003), 1871--1879.

Samuele Capobianco, Leonardo Scommegna, and Simone Marinai. 2018. Historical handwritten document segmentation by using a weighted loss. In IAPR Workshop on Artificial Neural Networks in Pattern Recognition. Springer, 395--406.

R. Cattoni, T. Coianiz, S. Messelodi, and Cm Modena. 1998. Geometric layout analysis techniques for document image understanding: A review. ITC-First Technical Report (1998), 1--68.

F. Cesarini, M. Gori, S. Marinai, and G. Soda. 1999. Structured document segmentation and representation by the modified X-Y tree. In The 5th International Conference on Document Analysis and Recognition. IEEE, 563--566.

Nabendu Chaki, Soharab Hossain Shaikh, and Khalid Saeed. 2014. A comprehensive survey on image binarization techniques. In Exploring Image Binarization Techniques. Springer India, 5--15.

Kai Chen, Cheng-Lin Liu, Mathias Seuret, Marcus Liwicki, Jean Hennebert, and Rolf Ingold. 2016. Page segmentation for historical document images based on superpixel classification with unsupervised feature learning. In The 12th IAPR Workshop on Document Analysis Systems (DAS). IEEE, 299--304.

Kai Chen, Mathias Seuret, Jean Hennebert, and Rolf Ingold. 2017. Convolutional neural networks for page segmentation of historical document images. In The 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). IEEE, 965--970.

Kai Chen, Mathias Seuret, Marcus Liwicki, Jean Hennebert, and Rolf Ingold. 2015. Page segmentation of historical document images with convolutional autoencoders. In The 13th International Conference on Document Analysis and Recognition (ICDAR). IEEE, 1011--1015.

Kai Chen, Hao Wei, Jean Hennebert, Rolf Ingold, and Marcus Liwicki. 2014. Page segmentation for historical handwritten document images using color and texture features. In The 14th International Conference on Frontiers in Handwriting Recognition. 488--493.

Yiping Chen and Liansheng Wang. 2017. Broken and degraded document images binarization. Neurocomputing 237 (2017), 272--280.

Atul K. Chhabra and Ihsin T. Phillips. 1997. The second international graphics recognition contest-raster to vector conversion: A report. In International Workshop on Graphics Recognition. Springer, 390--410.

Rafi Cohen, Abedelkadir Asi, Klara Kedem, Jihad El-Sana, and Itshak Dinstein. 2013. Robust text and drawing segmentation algorithm for historical documents. In The 2nd International Workshop on Historical Document Imaging and Processing. 110--117.

Rafi Cohen, Itshak Dinstein, Jihad El-Sana, and Klara Kedem. 2014. Using scale-space anisotropic smoothing for text line extraction in historical documents. In International Conference Image Analysis and Recognition. Springer International Publishing, 349--358.

Laboratoire National de metrologie et d’Essais (LNE). 2013. MAURDOR campaign. http://www.maurdor-campaign.org/index.php?id=83&L==1.

Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. 2009. Imagenet: A large-scale hierarchical image database. In IEEE Conference on Computer Vision and Pattern Recognition. 248--255.

Markus Diem, Florian Kleber, and Robert Sablatnig. 2011. Text classification and document layout analysis of paper fragments. In The International Conference on Document Analysis and Recognition. IEEE, 854--858.

Markus Diem, Florian Kleber, and Robert Sablatnig. 2012. Skew estimation of sparsely inscribed document fragments. In The 10th IAPR International Workshop on Document Analysis Systems. IEEE, 292--296.

David Doermann Elena Zotkina, Himanshu Suri. 2013. GEDI: Groundtruthing Environment for Document Images. https://lampsrv02.umiacs.umd.edu/projdb/project.php?id=53.

Boris Epshtein. 2011. Determining document skew using inter-line spaces. In The International Conference on Document Analysis and Recognition. IEEE, 27--31.

Sébastien Eskenazi, Petra Gomez-Krämer, and Jean-Marc Ogier. 2015. The Delaunay document layout descriptor. In ACM Symposium on Document Engineering. ACM Press, New York, 167--175.

Sébastien Eskenazi, Petra Gomez-Krämer, and Jean-Marc Ogier. 2017. A comprehensive survey of mostly textual document segmentation algorithms since 2008. Pattern Recognition 64 (2017), 1--14.

Jonathan Fabrizio. 2014. A precise skew estimation algorithm for document images using KNN clustering and Fourier transform. In The International Conference on Image Processing. IEEE, 2585--2588.

Andreas Fischer, Micheal Baechler, Angelika Garz, Marcus Liwicki, and Rolf Ingold. 2014. A combined system for text line extraction and handwriting recognition in historical documents. In The 11th IAPR International Workshop on Document Analysis Systems. 71--75.

Andreas Fischer, Volkmar Frinken, Alicia Fornés, and Horst Bunke. 2011. Transcription alignment of latin manuscripts using hidden Markov models. In The Workshop on Historical Document Imaging and Processing. ACM, 29--36.

Gaofeng Meng, Chunhong Pan, Nanning Zheng, and Chen Sun. 2010. Skew estimation of document images using bagging. IEEE Transactions on Image Processing 19, 7 (jul 2010), 1837--1846.

Angelika Garz, Markus Diem, and Robert Sablatnig. 2010. Detecting text areas and decorative elements in ancient manuscripts. In The 12th International Conference on Frontiers in Handwriting Recognition. IEEE, 176--181.

Angelika Garz, Andreas Fischer, Robert Sablatnig, and Horst Bunke. 2012. Binarization-free text line segmentation for historical documents based on interest point clustering. In The 10th IAPR International Workshop on Document Analysis Systems. IEEE, 95--99.

Angelika Garz and Robert Sablatnig. 2010. Multi-scale texture-based text recognition in ancient manuscripts. In The 16th International Conference on Virtual Systems and Multimedia. IEEE, 336--339.

Angelika Garz, Robert Sablatnig, and Markus Diem. 2011. Layout analysis for historical manuscripts using SIFT features. In The International Conference on Document Analysis and Recognition. 508--512.

B. Gatos, N. Papamarkos, and C. Chamzas. 1997. Skew detection and text line position determination in digitized documents. Pattern Recognition 30, 9 (1997), 1505--1519.

B. Gatos, N. Stamatopoulos, and G. Louloudis. 2011. ICDAR2009 handwriting segmentation contest. International Journal on Document Analysis and Recognition (IJDAR) 14, 1 (2011), 25--33.

Basilios Gatos, Pratikakis Ioannis, and Stavros J. Perantonis. 2004. An adaptive binarization technique for low quality historical documents. In Document Analysis Systems VI. Springer, Springer Berlin, 102--113.

Basilis Gatos, Nikolaos Stamatopoulos, and Georgios Louloudis. 2010. ICFHR2010 handwriting segmentation contest. In The 12th International Conference on Frontiers in Handwriting Recognition. IEEE, 737--742.

Tobias Grüning, Gundram Leifert, Tobias Strauß, and Roger Labahn. 2018. A two-stage method for text line detection in historical documents. arXiv preprint arXiv:1802.03345 (2018).

Karim Hadjar and Rolf Ingold. 2004. Physical layout analysis of complex structured arabic documents using artificial neural nets. In Lecture Notes in Computer Science. Springer Berlin, 170--178.

Sheng He and Lambert Schomaker. 2019. DeepOtsu: Document enhancement and binarization using iterative deep learning. Pattern Recognition 91 (2019), 379--390.

S. C. Hinds, J. L. Fisher, and D. P. D’Amato. 1990. A document skew detection method using run-length encoding and the hough transform. In The 10th International Conference on Pattern Recognition, Vol. I. IEEE Comput. Soc. Press, 464--468.

Jaekyu Ha, R. M. Haralick, and I. T. Phillips. 1995. Document page decomposition by the bounding-box project. In The 3rd International Conference on Document Analysis and Recognition. IEEE Comput. Soc. Press, 1119--1122.

Anil K. Jain and Yu Zhong. 1996. Page segmentation using texture analysis. Pattern Recognition 29, 5 (May 1996), 743--770.

N. Journet, V. Eglin, J. Y. Ramel, and R. Mullot. 2005. Text/graphic labelling of ancient printed documents. In The 8th International Conference on Document Analysis and Recognition. IEEE, 1010--1014 Vol. 2.

Nicholas Journet, Jean-Yves Ramel, Rémy Mullot, and Véronique Eglin. 2008. Document image characterization using a multiresolution analysis of the texture: Application to old documents. International Journal of Document Analysis and Recognition (IJDAR) 11, 1 (Jun 2008), 9--18.

Hao Wei Marcus Liwicki Rolf Ingold Kai Chen, Mathias Seuret. 2015. Document, image, and video analysis DLA tool. http://diuf.unifr.ch/main/hisdoc/divadia.

Rangachar Kasturi, Lawrence O’Gorman, and Venu Govindaraju. 2002. Document image analysis: A primer. Sadhana 27, 1 (2002), 3--22.

N. Khorissi, A. Namane, A. Mellit, F. Abdati, Z. A. Bensalama, and A. Guessoum. 2007. Application of the wavelet and the Hough transform for detecting the skew angle in arabic printed documents. In The 9th International Symposium on Signal Processing and Its Applications. IEEE, 1--4. http://ieeexplore.ieee.org/document/4555586/.

Koichi Kise, Akinori Sato, and Motoi Iwata. 1998. Segmentation of page images using the area Voronoi diagram. Computer Vision and Image Understanding 70, 3 (1998), 370--382.

Koichi Kise. 2014. Page segmentation techniques in document analysis. In Handbook of Document Image Processing and Recognition. Springer London, London, 135--175.

K. Kise, A. Sato, and K. Matsumoto. 1997. Document image segmentation as selection of Voronoi edges. In The Workshop on Document Image Analysis. IEEE Comput. Soc, 32--39.

Florian Kleber, Robert Sablatnig, Melanie Gau, and Heinz Miklas. 2008. Ancient document analysis based on text line extraction. In The 19th International Conference on Pattern Recognition. IEEE, 1--4.

M. Krishnamoorthy, G. Nagy, S. Seth, and M. Viswanathan. 1993. Syntactic segmentation and labeling of digitized pages from technical journals. IEEE Transactions on Pattern Analysis and Machine Intelligence 15, 7 (1993), 737--747.

Victor Lavrenko, Toni M. Rath, and Raghavan Manmatha. 2004. Holistic word recognition for handwritten historical documents. In The 1st International Workshop on Document Image Analysis for Libraries. IEEE, 278--287.

Daniel S. Le, George R. Thoma, and Harry Wechsler. 1994. Automated page orientation and skew angle detection for binary document images. Pattern Recognition 27, 10 (1994), 1325--1344.

Shutao Li, Qinghua Shen, and Jun Sun. 2007. Skew detection using wavelet decomposition and projection profile analysis. Pattern Recognition Letters 28, 5 (2007), 555--562.

L. Likforman-Sulem, A. Hanimyan, and C. Faure. 1995. A Hough based algorithm for extracting text lines in handwritten documents. In The 3rd International Conference on Document Analysis and Recognition. IEEE Comput. Soc. Press, 774--777.

Laurence Likforman-Sulem, Abderrazak Zahour, and Bruno Taconet. 2007. Text line segmentation of historical documents: A survey. International Journal of Document Analysis and Recognition (IJDAR) 9, 2--4 (Sept. 2007), 123--138. http://link.springer.com/10.1007/s10032-006-0023-z

N. Liolios, N. Fakotakis, and G. Kokkinakis. 2001. Improved document skew detection based on text line connected-component clustering. In The International Conference on Image Processing (Cat. No.01CH37205), Vol. 1. IEEE, 1098--1101.

G. Louloudis, B. Gatos, and C. Halatsis. 2007. Text line detection in unconstrained handwritten documents using a block-based Hough transform approach. In The 9th International Conference on Document Analysis and Recognition. IEEE, 599--603.

G. Louloudis, B. Gatos, I. Pratikakis, and C. Halatsis. 2009. Text line and word segmentation of handwritten documents. Pattern Recognition 42, 12 (2009), 3169--3183.

Scott Lowther, Vinod Chandran, and Subramanian Sridharan. 2002. An accurate method for skew determination in document images. In Digital Image Computing Techniques and Applications, Vol. 1. 25--29.

Yue Lu and Chew Lim Tan. 2003. A nearest-neighbor chain based approach to skew estimation in document images. Pattern Recognition Letters 24, 14 (2003), 2315--2323.

Yue Lu, Zhe Wang, and Chew Lim Tan. 2004. Word grouping in document images based on Voronoi tessellation. In International Workshop on Document Analysis Systems. Springer Berlin, 147--157.

Simon M. Lucas. 2005. ICDAR 2005 text locating competition results. In The 8th International Conference on Document Analysis and Recognition. IEEE, 80--84.

Song Mao, Azriel Rosenfeld, and Tapas Kanungo. 2003. Document structure analysis algorithms: A literature survey. SPIE 5010, Document Recognition and Retrieval X 5010, 1 (2003), 197.

Simone Marinai, Marco Gori, and Giovanni Soda. 2005. Artificial neural networks for document analysis and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 27, 1 (2005), 23--35.

Gale L. Martin. 1993. Centered-object integrated segmentation and recognition of overlapping handprinted characters. Neural Computation 5, 3 (1993), 419--429.

Maroua Mehri, Petra Gomez-Krämer, Pierre Héroux, Alain Boucher, and Rémy Mullot. 2013. Texture feature evaluation for segmentation of historical document images. In The 2nd International Workshop on Historical Document Imaging and Processing. ACM Press, New York, 102.

Maroua Mehri, Pierre Héroux, Petra Gomez-Krämer, and Rémy Mullot. 2017. Texture feature benchmarking and evaluation for historical document image analysis. International Journal on Document Analysis and Recognition (IJDAR) 20, 1 (2017), 1--35.

Maroua Mehri, Nibal Nayef, Pierre Héroux, Petra Gomez-Krämer, and Rémy Mullot. 2015. Learning texture features for enhancement and segmentation of historical document images. In The 3rd International Workshop on Historical Document Imaging and Processing. ACM Press, New York, 47--54.

G. Nagy. 2000. Twenty years of document image analysis in PAMI. IEEE Transactions on Pattern Analysis and Machine Intelligence 22, 1 (2000), 38--62.

George Nagy and Sharad Seth. 1984. Hierarchical representation of optically scanned documents. In The International Conference on Pattern Recognition. IEEE, 347--349.

Y. Nakano, Y. Shima, H. Fujisawa, J. Higashino, and M. Fujinawa. 1990. An algorithm for the skew normalization of document image. In The 10th International Conference on Pattern Recognition, Vol. 2. IEEE Comput. Soc. Press, 8--13.

N. Nandini, K. Srikanta Murthy, and G. Hemantha Kumar. 2008. Estimation of skew angle in binary document images using hough transform. World Academy of Science, Engineering and Technology 18 (2008), 44--49.

Wayne; Niblack. 1986. An Introduction to Digital Image Processing. Prentice-Hall, Englewood Cliffs NJ. 115--116 pages.

Nikos Nikolaou, Michael Makridis, Basilis Gatos, Nikolaos Stamatopoulos, and Nikos Papamarkos. 2010. Segmentation of historical machine-printed documents using adaptive run length smoothing and skeleton segmentation paths. Image and Vision Computing 28, 4 (Apr. 2010), 590--604.

Konstantinos Ntirogiannis, Basilis Gatos, and Ioannis Pratikakis. 2014. ICFHR2014 competition on handwritten document image binarization (H-DIBCO 2014). In The 14th International Conference on Frontiers in Handwriting Recognition. IEEE, 809--813.

L. O’Gorman. 1993. The document spectrum for page layout analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 15, 11 (1993), 1162--1173.

Oleg Okun, Matti Pietikäinen, O. Okun, and M. Pietikäinen. 1999. A survey of texture-based methods for document layout analysis. In Workshop on Texture Analysis in Machine Vision. 137--148.

Sofia Ares Oliveira, Benoit Seguin, and Frederic Kaplan. 2018. dhSegment: A generic deep-learning approach for document segmentation. In The 16th International Conference on Frontiers in Handwriting Recognition. IEEE, 7--12.

Nobuyuki Otsu. 1979. A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man, and Cybernetics 9, 1 (1979), 62--66.

U. Pal and B. B. Chaudhuri. 1996. An improved document skew angle estimation technique. Pattern Recognition Letters 17, 8 (1996), 899--904.

G. S. Peake and T. N. Tan. 1997. A general algorithm for document skew angle estimation. In The International Conference on Image Processing. IEEE Comput. Soc., 230--233.

Ihsin T. Phillips and Atul K. Chhabra. 1999. Empirical performance evaluation of graphics recognition systems. IEEE Transactions on Pattern Analysis and Machine Intelligence 21, 9 (1999), 849--870.

Ihsin T. Phillips, Jisheng Liang, Atul K. Chhabra, and Robert Haralick. 1997. A performance evaluation protocol for graphics recognition systems. In International Workshop on Graphics Recognition. Springer, 372--389.

Stefan Pletschacher and Apostolos Antonacopoulos. 2010. The PAGE (page analysis and ground-truth elements) format framework. In The 20th International Conference on Pattern Recognition. IEEE, 257--260.

Wolfgang Postl. 1986. Detection of linear oblique structures and skew scan in digitized documents. In The 8th International Conference on Pattern Recognition. 687--689.

Ioannis Pratikakis, Konstantinos Zagoris, George Barlas, and Basilis Gatos. 2016. ICFHR2016 handwritten document image binarization contest (H-DIBCO 2016). In The 15th International Conference on Frontiers in Handwriting Recognition (ICFHR). IEEE, 619--623.

Ioannis Pratikakis, Konstantinos Zagoris, George Barlas, and Basilis Gatos. 2017. ICDAR2017 competition on document image binarization (DIBCO 2017). In The 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Vol. 1. IEEE, 1395--1403.

Lorenzo Quirós. 2018. Multi-task handwritten document layout analysis. arXiv preprint arXiv:1806.08852 (2018).

Lorenzo Quirós, Llu´s Serrano, Vicente Bosch, Alejandro H. Toselli, Rosa Congost, Enric Saguer, and Enrique Vidal. 2018. HTR Dataset ICFHR 2018. https://zenodo.org/record/1322666#.XHOanOgzaUk.

Irina Rabaev, Ofer Biller, Jihad El-Sana, Klara Kedem, and Itshak Dinstein. 2013. Text line detection in corrupted and damaged historical manuscripts. In The 12th International Conference on Document Analysis and Recognition. IEEE, 812--816.

J. Y. Ramel, S. Leriche, M. L. Demonet, and S. Busson. 2007. User-driven page layout analysis of historical printed books. International Journal of Document Analysis and Recognition (IJDAR) 9, 2--4 (Apr. 2007), 243--261.

Marte A. Ramírez-Ortegón, Lilia L. Ramírez-Ramírez, Ines Ben Messaoud, Volker Märgner, Erik Cuevas, and Raúl Rojas. 2014. A model for the gray-intensity distribution of historical handwritten documents and its application for binarization. International Journal on Document Analysis and Recognition 17, 2 (2014), 139--160.

Tony M. Rath and Rudrapatna Manmatha. 2007. Word spotting for historical documents. International Journal on Document Analysis and Recognition 9, 2 (2007), 139--152.

Ahsen Raza, Imran Siddiqi, Ali Abidi, and Fahim Arif. 2012. An unconstrained benchmark urdu handwritten sentence database with automatic line segmentation. In International Conference on Frontiers in Handwriting Recognition. IEEE, 491--496.

Olaf Ronneberger, Philipp Fischer, and Thomas Brox. 2015. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-assisted Intervention. Springer, 234--241.

Raid Saabni, Abedelkadir Asi, and Jihad El-Sana. 2014. Text line extraction for historical document images. Pattern Recognition Letters 35, 1 (2014), 23--33.

Raid Saabni and Jihad El-Sana. 2011. Language-independent text lines extraction using seam carving. In The International Conference on Document Analysis and Recognition. IEEE, 563--568.

Rana S. M. Saad, Randa I. Elanwar, N. S. Abdel Kader, Samia Mashali, and Margrit Betke. 2016. BCE-Arabic-v1 dataset: Towards interpreting arabic document images for people with visual impairments. In The 9th ACM International Conference on PErvasive Technologies Related to Assistive Environments - PETRA. ACM Press, New York, New York, USA, 1--8.

T. Saitoh, M. Tachikawa, and T. Yamaai. 1993. Document image segmentation and text area ordering. In The 2nd International Conference on Document Analysis and Recognition. IEEE Comput. Soc. Press, 323--329.

P. Saragiotis and N. Papamarkos. 2008. Local skew correction in documents. International Journal of Pattern Recognition and Artificial Intelligence 22, 4 (2008), 691--710.

M. Sarfraz, S. A. Mahmoud, and Z. Rasheed. 2007. On skew estimation and correction of text. In Computer Graphics, Imaging and Visualisation. IEEE, 308--313.

Eric Saund, Jing Lin, and Prateek Sarkar. 2009. PixLabeler: User interface for pixel-level labeling of elements in document images. In The 10th International Conference on Document Analysis and Recognition. IEEE, 646--650.

J. Sauvola and M. Pietikainen. 1995. Skew angle detection using texture direction analysis. In The 9th Scandinvian Conference on Image Analysis. 1099--1106.

J. Sauvola and M. Pietikäinen. 2000. Adaptive document image binarization. Pattern Recognition 33, 2 (2000), 225--236.

Seong-Whan Seong-Whan Lee and Dae-Seok Dae-Seok Ryu. 2001. Parameter-free geometric document layout analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 23, 11 (2001), 1240--1256.

Mathias Seuret, Michele Alberti, Marcus Liwicki, and Rolf Ingold. 2017. PCA-initialized deep neural networks applied to document image analysis. In The 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). IEEE, 877--882.

F. Shafait and T. M. Breuel. 2011. The effect of border noise on the performance of projection-based page segmentation methods. IEEE Transactions on Pattern Analysis and Machine Intelligence 33, 4 (2011), 846--851.

F. Shafait, D. Keysers, and T. M. Breuel. 2006. Pixel-accurate representation and evaluation of page segmentation in document images. In The 18th International Conference on Pattern Recognition. IEEE, 872--875.

Faisal Shafait, Joost van Beusekom, Daniel Keysers, and Thomas M. Breuel. 2008. Background variability modeling for statistical layout analysis. In The 19th International Conference on Pattern Recognition. IEEE, 1--4.

Mahnaz Shafii and Maher Sid-Ahmed. 2015. Skew detection and correction based on an axes-parallel bounding box. International Journal on Document Analysis and Recognition (IJDAR) 18, 1 (2015), 59--71.

Asif Shahab. 2013. UW3 and UNLV Datasets. http://www.iapr-tc11.org/mediawiki/index.php/Table_Ground_Truth_for_the_UW3_and_UNLV_datasets.

Zhixin Shi and Venu Govindaraju. 2004. Line separation for complex document images using fuzzy runlength. In The 1st International Workshop on Document Image Analysis for Libraries. 306--312.

Zhixin Shi, Srirangaraj Setlur, and Venu Govindaraju. 2009. A steerable directional local profile technique for extraction of handwritten Arabic text lines. In The 10th International Conference on Document Analysis and Recognition. IEEE, 176--180.

Frank Y. Shih and Shy-Shyan Chen. 1996. Adaptive document block segmentation and classification. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 26, 5 (1996), 797--802.

P. Shivakumara, G. Hemantha Kumar, D. S. Guru, and P. Nagabhushan. 2005. A novel technique for estimation of skew in binary text document images based on linear regression analysis. Sadhana 30, 1 (2005), 69--85.

Fotini Simistira, Manuel Bouillon, Mathias Seuret, Marcel Würsch, Michele Alberti, Rolf Ingold, and Marcus Liwicki. 2017. ICDAR2017 competition on layout analysis for challenging medieval manuscripts. In The 14th IAPR International Conference on Document Analysis and Recognition. IEEE, 1361--1370.

A. Simon, J.-C. Pret, and A. P. Johnson. 1997. A fast algorithm for bottom-up document layout analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 19, 3 (1997), 273--277.

Brij Mohan Singh, Rahul Sharma, Debashis Ghosh, and Ankush Mittal. 2014. Adaptive binarization of severely degraded and non-uniformly illuminated documents. International Journal on Document Analysis and Recognition (IJDAR) 17, 4 (2014), 393--412.

Chandan Singh, Nitin Bhatia, and Amandeep Kaur. 2008. Hough transform based fast skew detection and accurate skew correction methods. Pattern Recognition 41, 12 (2008), 3528--3546.

Bolan Su, Shijian Lu, and Chew Lim Tan. 2010. Binarization of historical document images using the local maximum and minimum. In The 8th IAPR International Workshop on Document Analysis Systems. ACM Press, New York, 159--166.

Wassim Swaileh, Kamel Ait Mohand, and Thierry Paquet. 2015. Multi-script iterative steerable directional filtering for handwritten text line extraction. In The 13th International Conference on Document Analysis and Recognition. IEEE, 1241--1245.

D. Sylwester and S. Seth. 1995. A trainable, single-pass algorithm for column segmentation. In The 3rd International Conference on Document Analysis and Recognition, Vol. 2. IEEE Comput. Soc. Press, 615--618.

Thomas Breuel and Faisal Shafait. 2010. AutoMLP: Simple, effective, fully automated learning rate and size adjustment. In The Learning Workshop, Utah.

Tuan Anh Tran, In-Seop Na, and Soo-Hyung Kim. 2015. Hybrid page segmentation using multilevel homogeneity structure. In The 9th International Conference on Ubiquitous Information Management and Communication. ACM Press, New York, 1--6.

Tuan Anh Tran, In Seop Na, and Soo Hyung Kim. 2016. Page segmentation using minimum homogeneity algorithm and adaptive mathematical morphology. International Journal on Document Analysis and Recognition (IJDAR) 19, 3 (Sep. 2016), 191--209.

Nikos Vasilopoulos and Ergina Kavallieratou. 2017. Complex layout analysis based on contour classification and morphological operations. Engineering Applications of Artificial Intelligence 65 (2017), 220--229.

Friedrich M. Wahl, Kwan Y. Wong, and Richard G. Casey. 1982. Block segmentation and text extraction in mixed text/image documents. Computer Graphics and Image Processing 20, 4 (Dec. 1982), 375--390.

Hao Wei, Micheal Baechler, Fouad Slimane, and Rolf Ingold. 2013. Evaluation of SVM, MLP and GMM classifiers for layout analysis of historical documents. In The 12th International Conference on Document Analysis and Recognition. IEEE, 1220--1224.

Hao Wei, Kai Chen, Rolf Ingold, and Marcus Liwicki. 2014. Hybrid feature selection for historical document layout analysis. In The 14th International Conference on Frontiers in Handwriting Recognition. 87--92.

Hao Wei, Kai Chen, Anguelos Nicolaou, Marcus Liwicki, and Rolf Ingold. 2014. Investigation of feature selection for historical document layout analysis. In The 4th International Conference on Image Processing Theory, Tools and Applications. 1--6.

Florian Westphal, Niklas Lavesson, and Håkan Grahn. 2018. Document image binarization using recurrent neural networks. In The 13th IAPR International Workshop on Document Analysis Systems (DAS). IEEE, 263--268.

Christoph Wick and Frank Puppe. 2018. Fully convolutional neural networks for page segmentation of historical document images. In The 13th IAPR International Workshop on Document Analysis Systems (DAS). IEEE, 287--292.

Chung-Chih Wu, Chien-Hsing Chou, and Fu Chang. 2008. A machine-learning approach for analyzing document layout structures with two reading orders. Pattern Recognition 41, 10 (2008), 3200--3213.

Yi Xiao and Hong Yan. 2003. Text region extraction in a document image based on the Delaunay tessellation. Pattern Recognition 36, 3 (2003), 799--809.

Yi Xiao and Hong Yan. 2004. Location of title and author regions in document images based on the Delaunay triangulation. Image and Vision Computing 22, 4 (2004), 319--329.

H. Yan. 1993. Skew correction of document images using interline cross-correlation. Graphical Models and Image Processing 55, 6 (1993), 538--543.

Younki Min, Sung-Bae Cho, and Yillbyung Lee. 1996. A data reduction method for efficient document skew estimation based on Hough transformation. In The 13th International Conference on Pattern Recognition, Vol. 3. IEEE, 732--736.

Bin Yu and Anil K. Jain. 1996. A robust and fast skew detection algorithm for generic documents. Pattern Recognition 29, 10 (Oct. 1996), 1599--1629.

Yue Lu and C. L. Tan. 2005. Constructing area Voronoi diagram in document images. In The 8th International Conference on Document Analysis and Recognition, Vol. 1. IEEE, 342--346.

A. Zahour, B. Taconet, P. Mercy, and S. Ramdane. 2001. Arabic hand-written text-line extraction. In The 6th International Conference on Document Analysis and Recognition. IEEE Comput. Soc., 281--285.