Removing Unwanted Text from Architectural Images with Multi-Scale Deformable Attention-Based Machine Learning

Michael Lystbæk*, Archontis Giannakidis, Michail Beliatis, Martin Olsen

*Corresponding author for this work

Publication: Contribution to book/anthology/report/proceeding › Conference contribution in proceedings › Research › peer review

3 Citations (Scopus)

Abstract

Dataset development for machine learning (ML) is considered a challenging and time-consuming process because of the significant resources needed for preprocessing. Automated pipelines for retrieving and preprocessing large amounts of data are not always readily available. This article examines the benefits of applying self-attention-based ML object detection autonomously in the image preprocessing stage. We focus on the case where the required preprocessing is the removal of text (noise) from architectural images. The model we develop is based on the state-of-the-art Text Spotting Transformer (TESTR) framework.
Using our TESTR model, we demonstrate that identifying and removing unwanted text annotations from architectural floor plan image datasets is feasible. The impact of inference threshold and image scale on removal performance is investigated. Optimal thresholds are derived that balance leaving text in place against inadvertently removing building content. The lower the image scale, the worse the object detection performance. Our pipeline could be used to preprocess very large image datasets, removing obsolete or unwanted annotations and features to improve performance during generative adversarial network (GAN) model training.
This could boost efforts to make artificial intelligence systems automatically offer suggestions and refine the building design. The application of TESTR to the architectural image data preprocessing stage as a tool for text and numerical content removal has shown promise. The recognizer decoder of TESTR provides the ability to retain the removed content information for further labeling.
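
The thresholded removal step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not call TESTR. It assumes a detector has already produced scored text regions, and it simplifies those regions to axis-aligned boxes (a text spotter such as TESTR typically returns polygons). The names `Detection` and `remove_text`, the grayscale list-of-lists image, and the white fill value are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumption: a grayscale image stored row-major, 0 = black, 255 = white.
Image = List[List[int]]


@dataclass
class Detection:
    """Hypothetical detection record: a box, a confidence score, and the recognized text."""
    box: Tuple[int, int, int, int]  # (x0, y0, x1, y1), with x1 and y1 exclusive
    score: float
    text: str = ""


def remove_text(image: Image, detections: List[Detection],
                threshold: float, fill: int = 255) -> Tuple[Image, List[Detection]]:
    """Paint over every detection whose score reaches the threshold.

    Returns a cleaned copy of the image and the list of removed detections,
    so the recognized text stays available for later labeling.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]  # leave the input image untouched
    removed: List[Detection] = []
    for det in detections:
        if det.score < threshold:
            continue  # low-confidence region: keep the pixels as they are
        x0, y0, x1, y1 = det.box
        # Clip the box to the image bounds before filling.
        x0, y0 = max(0, x0), max(0, y0)
        x1, y1 = min(width, x1), min(height, y1)
        for y in range(y0, y1):
            for x in range(x0, x1):
                cleaned[y][x] = fill
        removed.append(det)
    return cleaned, removed


# Toy example: a 6x4 black image with one confident and one doubtful detection.
image = [[0] * 6 for _ in range(4)]
detections = [
    Detection((1, 1, 3, 3), 0.9, "KITCHEN"),
    Detection((4, 0, 6, 2), 0.2, "?"),
]
cleaned, removed = remove_text(image, detections, threshold=0.5)
```

In this sketch, raising `threshold` leaves more text in the image, while lowering it risks painting over regions that are actually building content. That is the trade-off the abstract refers to when it speaks of deriving optimal thresholds.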
Original language: English
Title: 2023 IEEE International Conference on Imaging Systems and Techniques (IST)
Publisher: IEEE
Publication date: Oct. 2023
ISBN (Electronic): 979-8-3503-3083-0, 979-8-3503-3084-7
DOI
Status: Published - Oct. 2023
Name: IEEE International Conference on Imaging Systems and Techniques Proceedings
ISSN: 2832-4234
