
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss"
	xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
>
  <channel>
    <atom:link href="https://bartzi.de//rss.xml" rel="self" type="application/rss+xml" />
    <title>Bartzi.de Research</title>
    <link>https://bartzi.de/</link>
    <description>Additional content for my research articles.</description>
    <image>
      <url>https://bartzi.de//favicons/favicon-32x32.png</url>
      <title>Bartzi.de Research</title>
      <link>https://bartzi.de/</link>
      <width>32</width>
      <height>32</height>
    </image>
    
        <item>
          <guid>https://bartzi.de//research</guid>
          <title>Research</title>
          <description>A list of all publications I authored or co-authored, with links to further material for each publication</description>
          <link>https://bartzi.de//research</link>
          <pubDate>Sat, 14 Feb 2026 15:23:08 +0100</pubDate>
          <category>research</category>
          <content:encoded><![CDATA[
            <div style="margin: 50px 0; font-style: italic;">
              If anything looks wrong, 
              <strong>
                <a href="https://bartzi.de//research">
                  read on the site!
                </a>
              </strong>
            </div>

            <p>My research focus lies in the field of computer vision, especially in the subdomain of unconstrained scene text recognition. This page lists all publications that I authored or co-authored, together with links to further material for each publication.</p>
<h1 id="publications"><a class="heading-link" title="Permalink" aria-hidden="true" href="#publications"><span>#</span></a>Publications</h1>
<ul><li>C Bartz, H Rätz, H Yang, J Bethge, C Meinel, <strong>“Synthesis in Style: Semantic Segmentation of Historical Documents using Synthetic Data”</strong> [<a href="https://arxiv.org/abs/2107.06777" rel="nofollow">arXiv preprint</a>][<a href="https://github.com/Bartzi/synthesis-in-style" rel="nofollow">code</a>]</li>
<li>C Bartz, H Rätz, C Meinel, <strong>“Handwriting Classification for the Analysis of Art-Historical Documents”</strong> [<a href="https://link.springer.com/chapter/10.1007/978-3-030-68796-0_40" rel="nofollow">FAPER 2020</a>][<a href="https://arxiv.org/abs/2011.02264" rel="nofollow">pdf</a>][<a href="https://github.com/hendraet/handwriting-classification" rel="nofollow">code</a>][<a href="/research/handwriting_classification">models</a>]</li>
<li>N Jain, C Bartz, T Bredow, E Metzenthin, J Otholt, R Krestel, <strong>“Semantic Analysis of Cultural Heritage Data: Aligning Paintings and Descriptions in Art-Historic Collections”</strong> [<a href="https://link.springer.com/chapter/10.1007/978-3-030-68796-0_37" rel="nofollow">FAPER 2020</a>][<a href="https://github.com/HPI-DeepLearning/semantic_analysis_of_cultural_heritage_data" rel="nofollow">Code</a>]</li>
<li>C Bartz, J Bethge, H Yang, C Meinel, <strong>“One Model to Reconstruct Them All: A Novel Way to Use the Stochastic Noise in StyleGAN”</strong> [<a href="https://arxiv.org/abs/2010.11113" rel="nofollow">arXiv preprint</a>][<a href="https://github.com/Bartzi/one-model-to-reconstruct-them-all" rel="nofollow">code</a>][<a href="/research/one_model_to_reconstruct_them_all">models</a>]</li>
<li>C Bartz, L Seidel, DH Nguyen, J Bethge, H Yang, C Meinel, <strong>“Synthetic Data for the Analysis of Archival Documents: Handwriting Determination”</strong> [<a href="http://www.dicta2020.org/wp-content/uploads/2020/09/9_CameraReady.pdf" rel="nofollow">DICTA 2020</a>][<a href="https://github.com/Bartzi/handwriting-determination" rel="nofollow">code</a>][<a href="/research/handwriting_determination">models</a>]</li>
<li>C Bartz, N Jain, R Krestel, <strong>“Automatic Matching of Paintings and Descriptions in Art-Historic Archives using Multimodal Analysis”</strong>, [<a href="https://www.aclweb.org/anthology/2020.ai4hi-1.4.pdf" rel="nofollow">AI4HI</a>]</li>
<li>J Bethge, C Bartz, H Yang, C Meinel, <strong>“BMXNet 2: An Open Source Framework for Low-bit Networks-Reproducing, Understanding, Designing and Showcasing”</strong> [<a href="https://dl.acm.org/doi/abs/10.1145/3394171.3414539" rel="nofollow">ACM MM 2020</a>][<a href="https://github.com/hpi-xnor/BMXNet-v2" rel="nofollow">code</a>]</li>
<li>J Bethge, C Bartz, H Yang, Y Chen, C Meinel, <strong>“MeliusNet: An Improved Network Architecture for Binary Neural Networks”</strong> [<a href="https://openaccess.thecvf.com/content/WACV2021/html/Bethge_MeliusNet_An_Improved_Network_Architecture_for_Binary_Neural_Networks_WACV_2021_paper.html" rel="nofollow">WACV 2021</a>][<a href="https://arxiv.org/abs/2001.05936" rel="nofollow">arXiv preprint</a>][<a href="https://github.com/hpi-xnor/BMXNet-v2" rel="nofollow">code</a>]</li>
<li>C Bartz, J Bethge, H Yang, C Meinel, <strong>“KISS: Keeping it Simple for Scene Text Recognition”</strong> [<a href="https://arxiv.org/abs/1911.08400" rel="nofollow">pdf</a>][<a href="https://github.com/Bartzi/kiss" rel="nofollow">code</a>][<a href="/research/kiss">models</a>]</li>
<li>C Bartz, H Yang, J Bethge, C Meinel, <strong>“LoANs: Weakly Supervised Object Detection with Localizer Assessor Networks”</strong> [<a href="https://link.springer.com/chapter/10.1007/978-3-030-21074-8_29" rel="nofollow">AMV-18</a>][<a href="https://arxiv.org/abs/1811.05773" rel="nofollow">pdf</a>][<a href="https://github.com/Bartzi/loans" rel="nofollow">code</a>][<a href="/research/loans">models</a>]</li>
<li>J Bethge, H Yang, C Bartz, C Meinel, <strong>“Learning to train a binary neural network”</strong>, [<a href="https://arxiv.org/abs/1809.10463" rel="nofollow">arXiv preprint</a>][<a href="https://github.com/Jopyth/BMXNet" rel="nofollow">code</a>]</li>
<li>C Bartz, H Yang, C Meinel, <strong>“SEE: Towards Semi-Supervised End-to-End Scene Text Recognition”</strong> [<a href="https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16270" rel="nofollow">AAAI-18</a>][<a href="http://arxiv.org/abs/1712.05404" rel="nofollow">pdf</a>][<a href="https://github.com/Bartzi/see" rel="nofollow">code</a>][<a href="/research/see">models</a>]</li>
<li>C Bartz, H Yang, C Meinel, <strong>“STN-OCR: A single Neural Network for Text Detection and Text Recognition”</strong> [<a href="https://arxiv.org/abs/1707.08831" rel="nofollow">arXiv preprint</a>][<a href="https://github.com/Bartzi/stn-ocr" rel="nofollow">code</a>][<a href="/research/stn-ocr">models</a>]</li>
<li>C Bartz, T Herold, H Yang, C Meinel, <strong>“Language Identification Using Deep Convolutional Recurrent Neural Networks”</strong> [<a href="https://link.springer.com/chapter/10.1007/978-3-319-70136-3_93" rel="nofollow">ICONIP 2017</a>] [<a href="https://arxiv.org/abs/1708.04811" rel="nofollow">pdf</a>][<a href="https://github.com/HPI-DeepLearning/crnn-lid" rel="nofollow">code</a>]</li>
<li>H Yang, M Fritzsche, C Bartz, C Meinel, <strong>“BMXNet: An Open-Source Binary Neural Network Implementation Based on MXNet”</strong> [<a href="https://dl.acm.org/citation.cfm?id=3129393&CFID=780103033&CFTOKEN=26246456" rel="nofollow">ACM MM 2017</a>][<a href="https://arxiv.org/abs/1705.09864" rel="nofollow">pdf</a>][<a href="https://github.com/hpi-xnor" rel="nofollow">code</a>]</li>
<li>H Yang, C Wang, C Bartz, C Meinel, <strong>“SceneTextReg: A Real-Time Video OCR System”</strong> [<a href="https://dl.acm.org/citation.cfm?id=2973811" rel="nofollow">ACM MM 2016</a>][<a href="https://hpi.de/fileadmin/user_upload/fachgebiete/meinel/tele-task/papers/ACMMM16-yang.pdf" rel="nofollow">pdf</a>]</li>
<li>C Wang, H Yang, C Bartz, C Meinel, <strong>“Image captioning with deep bidirectional LSTMs”</strong> [<a href="https://dl.acm.org/citation.cfm?id=2964299" rel="nofollow">ACM MM 2016</a>][<a href="https://arxiv.org/abs/1604.00790" rel="nofollow">pdf</a>][<a href="https://github.com/deepsemantic/image_captioning" rel="nofollow">code</a>]</li></ul>
          ]]></content:encoded>
          <media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://bartzi.de///images/posts/blog-posts.jpg"/>
          <media:content xmlns:media="http://search.yahoo.com/mrss/" medium="image" url="https://bartzi.de///images/posts/blog-posts.jpg"/>          
        </item>
      
        <item>
          <guid>https://bartzi.de//research/handwriting_classification</guid>
          <title>Handwriting Classification for the Analysis of Art-Historical Documents</title>
          <description>This page contains models and train/test data for our approach described in the paper "Handwriting Classification for the Analysis of Art-Historical Documents"</description>
          <link>https://bartzi.de//research/handwriting_classification</link>
          <pubDate>Thu, 05 Nov 2020 15:10:00 +0100</pubDate>
          <category>research</category>
          <content:encoded><![CDATA[
            <div style="margin: 50px 0; font-style: italic;">
              If anything looks wrong, 
              <strong>
                <a href="https://bartzi.de//research/handwriting_classification">
                  read on the site!
                </a>
              </strong>
            </div>

            <p>This page contains models and train/test data for our approach described in the paper <a href="https://arxiv.org/abs/2011.02264" rel="nofollow">Handwriting Classification for the Analysis of Art-Historical Documents</a>.</p>
<h2 id="traintest-data"><a class="heading-link" title="Permalink" aria-hidden="true" href="#traintest-data"><span>#</span></a>Train/Test Data</h2>
<p>You can download the train and test data for training a classifier based on the <code>GANWriting</code> dataset from <a href="/media/research/handwriting_classification/ganwriting_train_test.tar.bz2"><code>ganwriting_train_test.tar.bz2</code></a>.
The train and test data for creating a model on the 5CHPT dataset can be found in <a href="/media/research/handwriting_classification/5CHPT_train_test.tar.bz2"><code>5CHPT_train_test.tar.bz2</code></a>.</p>
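<p>If you prefer to fetch and unpack one of the archives from a script, the following is a minimal Python sketch. The URL is the link above resolved against the site root; the target directory name is our own choice here:</p>
<pre><code># Minimal sketch: download one of the .tar.bz2 archives and unpack it.
# URL taken from the link above; "data" as target directory is an assumption.
import tarfile
import urllib.request

url = ("https://bartzi.de/media/research/handwriting_classification/"
       "ganwriting_train_test.tar.bz2")
archive, _ = urllib.request.urlretrieve(url, "ganwriting_train_test.tar.bz2")

# tarfile handles the bzip2 compression transparently via mode "r:bz2".
with tarfile.open(archive, "r:bz2") as tar:
    tar.extractall("data")
</code></pre>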
<h2 id="models"><a class="heading-link" title="Permalink" aria-hidden="true" href="#models"><span>#</span></a>Models</h2>
<p>We also provide pretrained models.
<a href="/media/research/handwriting_classification/ganwriting_models.tar.bz2"><code>ganwriting_models.tar.bz2</code></a> contains pretrained models trained on the <code>GANWriting</code> dataset, whereas <a href="/media/research/handwriting_classification/5CHPT_models.tar.bz2"><code>5CHPT_models.tar.bz2</code></a> contains models trained on the 5CHPT dataset.</p>
          ]]></content:encoded>
          
                    
        </item>
      
        <item>
          <guid>https://bartzi.de//research/handwriting_determination</guid>
          <title>Synthetic Data for the Analysis of Archival Documents: Handwriting Determination</title>
          <description>This page contains all data necessary to generate training data, train a model, or use a trained model for our paper.</description>
          <link>https://bartzi.de//research/handwriting_determination</link>
          <pubDate>Mon, 26 Oct 2020 15:19:00 +0100</pubDate>
          <category>research</category>
          <content:encoded><![CDATA[
            <div style="margin: 50px 0; font-style: italic;">
              If anything looks wrong, 
              <strong>
                <a href="https://bartzi.de//research/handwriting_determination">
                  read on the site!
                </a>
              </strong>
            </div>

            <h1 id="synthetic-data-for-the-analysis-of-archival-documents-handwriting-determination"><a class="heading-link" title="Permalink" aria-hidden="true" href="#synthetic-data-for-the-analysis-of-archival-documents-handwriting-determination"><span>#</span></a>Synthetic Data for the Analysis of Archival Documents: Handwriting Determination</h1>
<p>This page contains all data necessary to generate training data, train a model, or use a trained model for our paper.</p>
<h2 id="trained-model"><a class="heading-link" title="Permalink" aria-hidden="true" href="#trained-model"><span>#</span></a>Trained Model</h2>
<p><a href="/media/research/handwriting_determination/model.zip"><code>model.zip</code></a> contains the pre-trained model that we used for our experiments.</p>
<h2 id="training-data"><a class="heading-link" title="Permalink" aria-hidden="true" href="#training-data"><span>#</span></a>Training Data</h2>
<p><a href="/media/research/handwriting_determination/training_data.zip"><code>training_data.zip</code></a> contains all training data we used to create our model in <code>model.zip</code>. Be aware, the file is quite large (&gt; 3GB).</p>
<h2 id="generating-your-own-data"><a class="heading-link" title="Permalink" aria-hidden="true" href="#generating-your-own-data"><span>#</span></a>Generating Your Own Data</h2>
<p><a href="/media/research/handwriting_determination/generation_data.zip"><code>generation_data.zip</code></a> contains the directory structure necessary to work with our data generator. Because of Copyright issues, we are not able to provide you with all data we used. However, we supply information on how to get the data in each sub-directory in a <code>README</code>.</p>
          ]]></content:encoded>
          
                    
        </item>
      
        <item>
          <guid>https://bartzi.de//research/one_model_to_reconstruct_them_all</guid>
          <title>One Model to Reconstruct Them All: A Novel Way to Use the Stochastic Noise in StyleGAN</title>
          <description>This page contains trained models for a range of experiments from our paper "One Model to Reconstruct Them All: A Novel Way to Use the Stochastic Noise in StyleGAN"</description>
          <link>https://bartzi.de//research/one_model_to_reconstruct_them_all</link>
          <pubDate>Thu, 22 Oct 2020 11:32:00 +0200</pubDate>
          <category>research</category>
          <content:encoded><![CDATA[
            <div style="margin: 50px 0; font-style: italic;">
              If anything looks wrong, 
              <strong>
                <a href="https://bartzi.de//research/one_model_to_reconstruct_them_all">
                  read on the site!
                </a>
              </strong>
            </div>

            <p>This page contains several models for our paper “One Model to Reconstruct Them All: A Novel Way to Use the Stochastic Noise in StyleGAN” (<a href="https://arxiv.org/abs/2010.11113" rel="nofollow">Preprint Here</a>).</p>
<p>We provide models for a range of reconstruction experiments, denoising experiments, and experiments with our different training strategies.</p>
<h2 id="reconstruction-experiments"><a class="heading-link" title="Permalink" aria-hidden="true" href="#reconstruction-experiments"><span>#</span></a>Reconstruction Experiments</h2>
<p>Here, we provide models for our reconstruction experiments shown in Table 1.
We provide models for our experiments on the FFHQ dataset and also on the LSUN Church dataset. Besides the models, you will also find a link to the page where we logged the training run, giving you access to all log information and the hyperparameters used.
You can download each model by clicking the respective link in the attachment section; a scripted download sketch follows the lists below.</p>
<h3 id="ffhq-experiments"><a class="heading-link" title="Permalink" aria-hidden="true" href="#ffhq-experiments"><span>#</span></a>FFHQ Experiments</h3>
<ul><li>Stylegan 1, W Only (Z), <a href="/media/research/one_model_to_reconstruct_them_all/ffhq_stylegan_1_w_only.zip"><code>ffhq_stylegan_1_w_only.zip</code></a>, <a href="https://wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/20znopao" rel="nofollow">WandB</a></li>
<li>Stylegan 1, W Plus, <a href="/media/research/one_model_to_reconstruct_them_all/ffhq_stylegan_1_w_plus.zip"><code>ffhq_stylegan_1_w_plus.zip</code></a>, <a href="https://wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/b7xems29" rel="nofollow">WandB</a></li>
<li>Stylegan 2, W Only (Z), <a href="/media/research/one_model_to_reconstruct_them_all/ffhq_stylegan_2_w_only.zip"><code>ffhq_stylegan_2_w_only.zip</code></a>, <a href="https://wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/2xtyfi5v" rel="nofollow">WandB</a></li>
<li>Stylegan 2, W Plus, <a href="/media/research/one_model_to_reconstruct_them_all/ffhq_stylegan_2_w_plus.zip"><code>ffhq_stylegan_2_w_plus.zip</code></a>, <a href="https://wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/3vx67aji" rel="nofollow">WandB</a></li></ul>
<h2 id="lsun-church-experiments"><a class="heading-link" title="Permalink" aria-hidden="true" href="#lsun-church-experiments"><span>#</span></a>LSUN Church Experiments</h2>
<ul><li>Stylegan 1, W Only (Z), <a href="/media/research/one_model_to_reconstruct_them_all/lsun_church_stylegan_1_w_only.zip"><code>lsun_church_stylegan_1_w_only.zip</code></a>, <a href="https://wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/2ch4qvel" rel="nofollow">WandB</a></li>
<li>Stylegan 1, W Plus, <a href="/media/research/one_model_to_reconstruct_them_all/lsun_church_stylegan_1_w_plus.zip"><code>lsun_church_stylegan_1_w_plus.zip</code></a>, <a href="https://wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/3o0p71t7" rel="nofollow">WandB</a></li>
<li>Stylegan 2, W Only (Z), <a href="/media/research/one_model_to_reconstruct_them_all/lsun_church_stylegan_2_w_only.zip"><code>lsun_church_stylegan_2_w_only.zip</code></a>, <a href="https://wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/126ncuqu" rel="nofollow">WandB</a></li>
<li>Stylegan 2, W Plus, <a href="/media/research/one_model_to_reconstruct_them_all/lsun_church_stylegan_2_w_plus.zip"><code>lsun_church_stylegan_2_w_plus.zip</code></a>, <a href="https://wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/15jejim5" rel="nofollow">WandB</a></li></ul>
<h2 id="denoising-experiments"><a class="heading-link" title="Permalink" aria-hidden="true" href="#denoising-experiments"><span>#</span></a>Denoising Experiments</h2>
<p>Here, we provide only our best models trained for color and black and white denoising.</p>
<ul><li>Stylegan 2, W Plus, Denoise, <a href="/media/research/one_model_to_reconstruct_them_all/stylegan2_wplus_denoising.zip"><code>stylegan2_wplus_denoising.zip</code></a>, <a href="https://app.wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/5vneqdzv" rel="nofollow">WandB</a></li>
<li>Stylegan 2, W Plus, Denoise, Black and White, <a href="/media/research/one_model_to_reconstruct_them_all/stylegan2_wplus_denoising_black_and_white.zip"><code>stylegan2_wplus_denoising_black_and_white.zip</code></a>, <a href="https://wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/3lmtt63t" rel="nofollow">WandB</a></li></ul>
<h2 id="different-training-strategies"><a class="heading-link" title="Permalink" aria-hidden="true" href="#different-training-strategies"><span>#</span></a>Different Training Strategies</h2>
<p>We provide the models we used to create the interpolation results shown in Figure 13 of our paper: a model trained using the two-network strategy (denoted as two-stem in our code) and a model trained using the learning rate strategy.</p>
<ul><li>Stylegan 1, W Plus, Two Network, LSUN Church, <a href="/media/research/one_model_to_reconstruct_them_all/lsun_church_stylegan_1_w_plus_two_networks.zip"><code>lsun_church_stylegan_1_w_plus_two_networks.zip</code></a>, <a href="https://wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/176x4g52" rel="nofollow">WandB</a></li>
<li>Stylegan 1, W Plus, Learning Rate, LSUN Church, <a href="/media/research/one_model_to_reconstruct_them_all/lsun_church_stylegan_1_w_plus_learning_rate.zip"><code>lsun_church_stylegan_1_w_plus_learning_rate.zip</code></a>, <a href="https://wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/x5u0oowj" rel="nofollow">WandB</a></li></ul>
<p>If you are interested in any other models, feel free to open an issue on GitHub and ask us!</p>
          ]]></content:encoded>
          
                    
        </item>
      
        <item>
          <guid>https://bartzi.de//research/kiss</guid>
          <title>KISS: Keeping It Simple for Scene Text Recognition</title>
          <description>This page contains the trained model and also annotation files for training data and evaluation data for our paper "KISS: Keeping It Simple for Scene Text Recognition"</description>
          <link>https://bartzi.de//research/kiss</link>
          <pubDate>Tue, 19 Nov 2019 16:31:00 +0100</pubDate>
          <category>research</category>
          <content:encoded><![CDATA[
            <div style="margin: 50px 0; font-style: italic;">
              If anything looks wrong, 
              <strong>
                <a href="https://bartzi.de//research/kiss">
                  read on the site!
                </a>
              </strong>
            </div>

            <p>This page contains the trained model and also annotation files for training data and evaluation data for our paper “KISS: Keeping It Simple for Scene Text Recognition”. You can get the paper from <a href="https://arxiv.org/abs/1911.08400" rel="nofollow">here</a>.</p>
<h1 id="training-annotations"><a class="heading-link" title="Permalink" aria-hidden="true" href="#training-annotations"><span>#</span></a>Training Annotations</h1>
<p>In order to get the data we used for training, please follow the instructions in our <a href="https://github.com/Bartzi/kiss#image-data" rel="nofollow">Github repository</a>.
You can find the train annotation files for the MJSynth and the SynthAdd dataset in <a href="/media/research/kiss/train_annotations.zip"><code>train_annotations.zip</code></a>.</p>
<h1 id="evaluation-annotations"><a class="heading-link" title="Permalink" aria-hidden="true" href="#evaluation-annotations"><span>#</span></a>Evaluation Annotations</h1>
<p>Follow the instructions in our <a href="https://github.com/Bartzi/kiss#evaluation-data" rel="nofollow">Github repository</a> to get the evaluation data, prepare the directories as indicated in the <code>notes</code> column, and download the annotation files from here. You can find the annotations for evaluation in the following files (a scripted download sketch follows the list):</p>
<ul><li>ICDAR2013: <a href="/media/research/kiss/icdar2013.zip"><code>icdar2013.zip</code></a></li>
<li>ICDAR2015: <a href="/media/research/kiss/icdar2015.zip"><code>icdar2015.zip</code></a></li>
<li>CUTE80: <a href="/media/research/kiss/cute80.zip"><code>cute80.zip</code></a></li>
<li>IIIT5K: <a href="/media/research/kiss/iiit5k.zip"><code>iiit5k.zip</code></a></li>
<li>SVT: <a href="/media/research/kiss/svt.zip"><code>svt.zip</code></a></li>
<li>SVTP: <a href="/media/research/kiss/svtp.zip"><code>svtp.zip</code></a></li></ul>
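<p>Since each benchmark has its own annotation archive, fetching and unpacking all of them can be scripted. A minimal Python sketch; the filenames are the ones linked above, and the per-benchmark target directories are an assumption:</p>
<pre><code># Minimal sketch: download each evaluation annotation archive and unpack it
# into a directory named after the benchmark (the layout is an assumption).
import urllib.request
import zipfile

BASE = "https://bartzi.de/media/research/kiss/"
benchmarks = ["icdar2013", "icdar2015", "cute80", "iiit5k", "svt", "svtp"]

for benchmark in benchmarks:
    archive = f"{benchmark}.zip"
    urllib.request.urlretrieve(BASE + archive, archive)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(benchmark)
</code></pre>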
<h1 id="best-model"><a class="heading-link" title="Permalink" aria-hidden="true" href="#best-model"><span>#</span></a>Best Model</h1>
<p>You can find our best model in <a href="/media/research/kiss/model.zip"><code>model.zip</code></a>.</p>
          ]]></content:encoded>
          
                    
        </item>
      
        <item>
          <guid>https://bartzi.de//research/loans</guid>
          <title>LoANs: Weakly Supervised Object Detection with Localizer Assessor Networks</title>
          <description>This page provides you with access to data necessary to reproduce the results we reported in our paper "LoANs: Weakly Supervised Object Detection with Localizer Assessor Networks"</description>
          <link>https://bartzi.de//research/loans</link>
          <pubDate>Tue, 13 Nov 2018 15:40:00 +0100</pubDate>
          <category>research</category>
          <content:encoded><![CDATA[
            <div style="margin: 50px 0; font-style: italic;">
              If anything looks wrong, 
              <strong>
                <a href="https://bartzi.de//research/loans">
                  read on the site!
                </a>
              </strong>
            </div>

            <p>This page provides you with access to the data necessary to reproduce the results we reported in our paper “LoANs: Weakly Supervised Object Detection with Localizer Assessor Networks”.</p>
<p>We provide our datasets and also access to auxiliary data necessary to generate your own datasets. Furthermore, we provide access to some models that we trained using the data provided here.</p>
<h1 id="sheep-dataset"><a class="heading-link" title="Permalink" aria-hidden="true" href="#sheep-dataset"><span>#</span></a>Sheep Dataset</h1>
<p>The first dataset we provide is the sheep dataset, which features a lawn mower with an orange sheep on top. We provide the whole dataset, including annotations, and all files necessary to recreate the dataset.</p>
<h2 id="dataset"><a class="heading-link" title="Permalink" aria-hidden="true" href="#dataset"><span>#</span></a>Dataset</h2>
<p>The file <a href="/media/research/loans/sheep_dataset.zip"><code>sheep_dataset.zip</code></a> consists the complete train dataset for training localizer and assessor in the respective folders.
The file <a href="/media/research/loans/backgrounds_templates_bboxes.zip"><code>backgrounds_templates_bboxes.zip</code></a> consists all auxiliary file necessary to create the dataset. </p>
<h2 id="trained-models"><a class="heading-link" title="Permalink" aria-hidden="true" href="#trained-models"><span>#</span></a>Trained Models</h2>
<p>The file <a href="/media/research/loans/sheep_models.zip"><code>sheep_models.zip</code></a> contains two of our trained models on the sheep dataset. The first model is a <code>Resnet 18</code> based localizer that was trained on an image size of <code>224x224</code> pixels. The second model is a <code>Resnet 50</code> based localizer trained on an image size of <code>224x224</code> pixels.</p>
<h1 id="figure-skating-dataset"><a class="heading-link" title="Permalink" aria-hidden="true" href="#figure-skating-dataset"><span>#</span></a>Figure Skating Dataset</h1>
<p>We also provide everything you need to redo our figure skating experiments. This includes all images and also some trained models.</p>
<h2 id="dataset-1"><a class="heading-link" title="Permalink" aria-hidden="true" href="#dataset-1"><span>#</span></a>Dataset</h2>
<p>In this section we describe the files that we uploaded for working with the figure skating dataset.</p>
<h3 id="assessor"><a class="heading-link" title="Permalink" aria-hidden="true" href="#assessor"><span>#</span></a>Assessor</h3>
<p>If you want the background images we used for training the assessor, please download <a href="/media/research/loans/figure_skating_backgrounds.zip"><code>figure_skating_backgrounds.zip</code></a>. You can find the template images for training the assessor in <a href="/media/research/loans/figure_skating_templates.zip"><code>figure_skating_templates.zip</code></a>.
If you want the already prepared assessor train dataset, you can download <a href="/media/research/loans/figure_skating_assessor_datasets.zip"><code>figure_skating_assessor_datasets.zip</code></a>. It includes two datasets: one used for training the <code>Resnet 18</code> based localizer (<code>reference_iou_flipped</code>) and one used for training the <code>Resnet 50</code> based localizer (<code>reference_iou_75_100</code>).</p>
<h3 id="localizer"><a class="heading-link" title="Permalink" aria-hidden="true" href="#localizer"><span>#</span></a>Localizer</h3>
<p>If you are interested in the train datasets for the localizer, you can download the datasets with and without noise in <a href="/media/research/loans/figure_skate_localizer_datasets.zip"><code>figure_skate_localizer_datasets.zip</code></a> (<strong>file size 19GB</strong>).
Furthermore, you can download the evaluation dataset <a href="/media/research/loans/figure_skating_evaluation_dataset.zip"><code>figure_skating_evaluation_dataset.zip</code></a>.</p>
<h2 id="models"><a class="heading-link" title="Permalink" aria-hidden="true" href="#models"><span>#</span></a>Models</h2>
<p>The file <a href="/media/research/loans/figure_skating_models.zip"><code>figure_skating_models.zip</code></a> contains the best two models that we trained on the figure skating datset. The first model is a <code>Resnet 18</code> based localizer that has been trained with the noisy dataset and an output size of <code>50x100</code> pixels. The second model is a <code>Resnet 50</code> based localizer that has been trained on the dataset without noise and an output size of <code>75x100</code> pixels.</p>
<h1 id="slides-and-poster"><a class="heading-link" title="Permalink" aria-hidden="true" href="#slides-and-poster"><span>#</span></a>Slides and Poster</h1>
<p>We also provide the slides of our talk and the poster we presented at the “1st International Workshop on Advanced Machine Vision for Real-life and Industrially Relevant Applications”. If you want to download the slides or the poster, please download the file <a href="/media/research/loans/talk.pdf"><code>talk.pdf</code></a> or <a href="/media/research/loans/poster.pdf"><code>poster.pdf</code></a>, respectively.</p>
          ]]></content:encoded>
          <media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://bartzi.de///images/posts/blog-posts.jpg"/>
          <media:content xmlns:media="http://search.yahoo.com/mrss/" medium="image" url="https://bartzi.de///images/posts/blog-posts.jpg"/>          
        </item>
      
        <item>
          <guid>https://bartzi.de//research/see</guid>
          <title>SEE: Towards Semi-Supervised End-to-End Scene Text Recognition</title>
          <description>This page grants access to our generated SVHN datasets and trained models that are mentioned in our AAAI-18 paper "SEE: Towards Semi-Supervised End-to-End Scene Text Recognition"</description>
          <link>https://bartzi.de//research/see</link>
          <pubDate>Thu, 09 Nov 2017 18:14:00 +0100</pubDate>
          <category>research</category>
          <content:encoded><![CDATA[
            <div style="margin: 50px 0; font-style: italic;">
              If anything looks wrong, 
              <strong>
                <a href="https://bartzi.de//research/see">
                  read on the site!
                </a>
              </strong>
            </div>

            <p>This page grants access to our generated SVHN datasets and trained models that are mentioned in our AAAI-18 paper <strong>“SEE: Towards Semi-Supervised End-to-End Scene Text Recognition”</strong> (preprint available <a href="http://arxiv.org/abs/1712.05404" rel="nofollow">here</a>). The code for this publication is available <a href="https://github.com/Bartzi/see" rel="nofollow">here</a>.</p>
<h1 id="svhn-experiments"><a class="heading-link" title="Permalink" aria-hidden="true" href="#svhn-experiments"><span>#</span></a>SVHN Experiments</h1>
<p>The page about our previous arXiv publication “STN-OCR: A single Neural Network for Text Detection and Text Recognition” contains all data necessary for redoing our experiments on the SVHN datasets. You can find the page <a href="/research/stn-ocr">here</a>. Please note: the models on that page won’t work with the code for this paper, but only with the code for the STN-OCR paper, which is the predecessor of this work.</p>
<p>We’ve also prepared a video that shows the train progress of our model on the SVHN dataset with randomly placed numbers. You can find the video <a href="https://youtu.be/GSq3_GeDZKk" rel="nofollow">here</a>.</p>
<h1 id="fsns"><a class="heading-link" title="Permalink" aria-hidden="true" href="#fsns"><span>#</span></a>FSNS</h1>
<p>For training and evaluating on the FSNS dataset we prepared the dataset as described <a href="https://github.com/Bartzi/see" rel="nofollow">here</a>.</p>
<p>The file <a href="/media/research/see/fsns_model.zip"><code>fsns_model.zip</code></a> contains our best performing model on the FSNS dataset. You can use this data with the <code>evaluation.py</code> script in our Github repository to evaluate the model.</p>
<p>For this dataset, we also prepared a video showing the train progress on this dataset. You can find the video <a href="https://youtu.be/5lt6dAbbsu4" rel="nofollow">here</a>.</p>
<h1 id="text-recognition"><a class="heading-link" title="Permalink" aria-hidden="true" href="#text-recognition"><span>#</span></a>Text Recognition</h1>
<p>Although not mentioned in the paper, we also performed pure text recognition experiments on already extracted text lines (analogous to the experiments described in our <code>STN-OCR</code> paper).</p>
<p>The file <a href="/media/research/see/test_recognition_model.zip"><code>text_recognition_model.zip</code></a> contains a text recognition model and also a small example dataset including all necessary files. If you want to use this dataset, you will need to adapt some filepaths!</p>
<h1 id="supplementary-material"><a class="heading-link" title="Permalink" aria-hidden="true" href="#supplementary-material"><span>#</span></a>Supplementary Material</h1>
<p>The file <a href="/media/research/see/supplementary_material.pdf"><code>supplementary_material.pdf</code></a> contains supplementary material for our paper.</p>
          ]]></content:encoded>
          <media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://bartzi.de///images/posts/blog-posts.jpg"/>
          <media:content xmlns:media="http://search.yahoo.com/mrss/" medium="image" url="https://bartzi.de///images/posts/blog-posts.jpg"/>          
        </item>
      
        <item>
          <guid>https://bartzi.de//research/stn-ocr</guid>
          <title>STN-OCR: A single Neural Network for Text Detection and Text Recognition</title>
          <description>This page grants access to our purpose-built SVHN datasets and trained models that we created for our experiments described in "STN-OCR: A single Neural Network for Text Detection and Text Recognition"</description>
          <link>https://bartzi.de//research/stn-ocr</link>
          <pubDate>Mon, 24 Jul 2017 16:42:00 +0200</pubDate>
          <category>research</category>
          <content:encoded><![CDATA[
            <div style="margin: 50px 0; font-style: italic;">
              If anything looks wrong, 
              <strong>
                <a href="https://bartzi.de//research/stn-ocr">
                  read on the site!
                </a>
              </strong>
            </div>

            <p>This page grants access to our purpose-built SVHN datasets and trained models that we created for our experiments described in <a href="https://arxiv.org/abs/1707.08831" rel="nofollow">STN-OCR</a> (the code is available <a href="https://github.com/Bartzi/stn-ocr" rel="nofollow">here</a>).</p>
<h1 id="svhn-experiments"><a class="heading-link" title="Permalink" aria-hidden="true" href="#svhn-experiments"><span>#</span></a>SVHN Experiments</h1>
<p>In the attachment <code>svhn_dataset_and_models.zip</code> you can find two datasets we used for our training on SVHN data. We created these datasets using the original SVHN dataset images and our scripts that can be found <a href="https://github.com/Bartzi/stn-ocr/tree/master/datasets/svhn" rel="nofollow">here</a>.
You can also find prepared evaluation data in the <code>evaluation</code> folder.</p>
<p>Besides these datasets, you can also find several models:</p>
<ol><li>a model trained on original SVHN data</li>
<li>a model trained on SVHN data evenly distributed on a grid</li>
<li>a model trained on SVHN house number crops randomly placed in an image</li></ol>
<h1 id="text-recognition"><a class="heading-link" title="Permalink" aria-hidden="true" href="#text-recognition"><span>#</span></a>Text Recognition</h1>
<p>Unfortunately, we cannot provide the dataset we used for these experiments, as it is too large and we have not been able to find a suitable place to host it. If you know a good place, please let us know by opening an issue in our Github repository.</p>
<p>However, the file <a href="/media/research/stn-ocr/text_recognition_model.zip"><code>text_recognition_model.zip</code></a> contains a model trained to perform text recognition on already cropped scene text images. This model can be used with the <code>eval_text_recognition.py</code> script from our repository on Github.</p>
<h1 id="fsns"><a class="heading-link" title="Permalink" aria-hidden="true" href="#fsns"><span>#</span></a>FSNS</h1>
<p>For our training we used the standard FSNS dataset. Please see <a href="https://github.com/Bartzi/stn-ocr/blob/master/README.md#fsns" rel="nofollow">this</a> file for downloading and preparing the FSNS dataset for usage with our system.</p>
<p>The file <a href="/media/research/stn-ocr/fsns_model.zip"><code>fsns_model.zip</code></a> contains our best performing model trained on the FSNS dataset. This model can be used with the <code>eval_fsns_model.py</code> script from our repository.</p>
<h1 id="supplementary-material"><a class="heading-link" title="Permalink" aria-hidden="true" href="#supplementary-material"><span>#</span></a>Supplementary Material</h1>
<p>In the paper, we mentioned some videos that show how the model learns to find the regions of text and the pretraining steps that we performed.
You can find the videos in <a href="/media/research/stn-ocr/videos.zip"><code>videos.zip</code></a>.</p>
          ]]></content:encoded>
          <media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://bartzi.de///images/posts/blog-posts.jpg"/>
          <media:content xmlns:media="http://search.yahoo.com/mrss/" medium="image" url="https://bartzi.de///images/posts/blog-posts.jpg"/>          
        </item>
      
  </channel>
</rss>