The 22nd ACM Symposium on Document Engineering

September 20th, 2022 to September 23rd, 2022
Virtual Event (Hosted from San Jose, CA, USA)


DocEng will be a virtual event. See for information on how to access the virtual events.

Papers will be presented as pre-recorded videos followed by a live Q&A.

All times shown are in Pacific Daylight Time (PDT).

Tuesday, September 20th

Time (PDT) | Session
6:00am ‑ 6:10am | Welcome
6:10am ‑ 6:45am | Competitions Report
6:45am ‑ 7:00am | Break and Networking
7:00am ‑ 8:00am | Birds of a Feather
8:00am ‑ 8:30am | Break and Networking

Wednesday, September 21st

Time (PDT) | Session | Authors
6:00am ‑ 6:10am | Welcome
6:10am ‑ 7:00am | Keynote: Documents Past | David Brailsford
7:00am ‑ 7:30am | Graphical Document Representation for French Newsletters Analysis | Alexis Blandin, Farida Said, Jeanne Villaneau and Pierre-François Marteau
7:30am ‑ 7:45am | Break
7:45am ‑ 8:00am | A cascaded approach for page-object detection in scientific papers | Erika Spiteri Bailey, Alexandra Bonnici and Stefania Cristina
8:00am ‑ 8:15am | From Print to Online Newspapers on Small Displays: A Relayouting Approach Aimed at Preserving Entry Points | Sebastian Gallardo, Dorian Mazauric and Pierre Kornprobst
8:15am ‑ 8:45am | Long-Term Lifecycle-Related Management of Digital Building Documents: Towards a Holistic and Standard-based Concept for a Technical and Organizational Solution in Building Authorities | Uwe M. Borghoff, Eberhard Pfeiffer and Peter Rödig
8:45am ‑ 9:00am | Theory Entity Extraction for Social and Behavioral Sciences Papers using Distant Supervision | Xin Wei, Lamia Salsabil and Jian Wu
9:00am ‑ 9:30am | Birds of a Feather Report

Thursday, September 22nd

Time (PDT) | Session | Authors
6:00am ‑ 6:10am | Welcome
6:10am ‑ 7:00am | Keynote: Documents Present | Laurie Byrum
7:00am ‑ 7:30am | Tab this Folder of Documents: Page Stream Segmentation of Business Documents | Thisanaporn Mungmeeprued, Yuxin Ma, Nisarg Mehta, and Aldo Lipani
7:30am ‑ 7:45am | Break
7:45am ‑ 8:00am | Modifying PDF Sewing Patterns for Use With Projectors | Charlotte Curtis
8:00am ‑ 8:15am | SeNMFk-SPLIT: Large Corpora Topic Modeling by Semantic Non-negative Matrix Factorization with Automatic Model Selection | Maksim Eren, Nick Solovyev, Manish Bhattarai, Kim Rasmussen, Charles Nicholas and Boian Alexandrov
8:15am ‑ 8:45am | Downstream Transformer Generation of Question-Answer Pairs with Preprocessing and Postprocessing Pipelines | Cheng Zhang, Hao Zhang, Yicheng Sun and Jie Wang
8:45am ‑ 9:00am | Academic writing and publishing beyond documents | Cerstin Mahlow and Michael Piotrowski
9:00am ‑ 9:15am | OCR with Transformers and CTC | Israel Campiotti and Roberto Lotufo
9:15am ‑ 9:30am | Optical Character Recognition Guided Image Super Resolution | Maximilian Schulze, Philipp Hildebrandt, Sarel Cohen, Vanja Doskoč, Raid Saabni and Tobias Friedrich

Friday, September 23rd

Time (PDT) | Session | Authors
6:00am ‑ 6:10am | Welcome
6:10am ‑ 7:00am | Keynote: Documents Future | Tong Sun
7:00am ‑ 7:30am | Panel Discussion
7:30am ‑ 7:45am | Best Paper Award
7:45am ‑ 8:00am | Anonymizing and Obfuscating PDF Content while Preserving Document Structure | Charlotte Curtis
8:00am ‑ 8:15am | Scholarly Big Data Quality Assessment: A Case Study of Document Linking and Conflation with S2ORC | Jian Wu, Ryan Hiltabrand, Dominik Soos, and Lee Giles
8:15am ‑ 8:30am | Detecting malware using text documents extracted from spam email through machine learning | Luis Ángel Redondo-Gutierrez, Francisco Jáñez-Martino, Eduardo Fidalgo, Enrique Alegre, Víctor González-Castro, and Rocío Alaiz-Rodríguez
8:30am ‑ 8:45am | Triplet Transformer Network for Multi-Label Document Classification | Johannes Melsbach, Sven Stahlmann, Stefan Hirschmeier and Detlef Schoder
8:45am ‑ 9:00am | Data harvesting for Chinese procurement documents | Danrun Cao, Oussama Ahmia, Nicolas Béchet and Pierre-François Marteau
9:00am ‑ 9:30am | Concluding DocEng 2022 and looking ahead to DocEng 2023