A quantitative database extracted from scientific publications is being developed as part of the Information Infrastructure that supports geological research on the Russian Far East. Building on an analysis of existing methods and technologies for processing scientific publications, new technological approaches are applied to the automatic extraction and processing of quantitative data from these publications.
Processing of scientific publications, automatic data extraction from scientific publications, geology of the Russian Far East