
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss"
	xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
>
  <channel>
    <atom:link href="https://bartzi.de//rss.xml" rel="self" type="application/rss+xml" />
    <title>Bartzi.de Research</title>
    <link>https://bartzi.de/</link>
    <description>Additional content for my research articles.</description>
    <image>
      <url>https://bartzi.de//favicons/favicon-32x32.png</url>
      <title>Bartzi.de Research</title>
      <link>https://bartzi.de/</link>
      <width>32</width>
      <height>32</height>
    </image>
    
        <item>
          <guid>https://bartzi.de//research</guid>
          <title>Research</title>
          <description>A list of all publications I authored or co-authored, with links to further material for each publication</description>
          <link>https://bartzi.de//research</link>
          <pubDate>Sat, 14 Feb 2026 15:23:08 +0100</pubDate>
          <category>research</category>
          <content:encoded><![CDATA[
            <div style="margin: 50px 0; font-style: italic;">
              If anything looks wrong, 
              <strong>
                <a href="https://bartzi.de//research">
                  read on the site!
                </a>
              </strong>
            </div>

            <p>My research focus lies in the field of computer vision, especially in the subdomain of unconstrained scene text recognition. This page lists all publications that I authored or co-authored, together with links to further material for each publication.</p>
<h1 id="publications"><a class="heading-link" title="Permalink" aria-hidden="true" href="#publications"><span>#</span></a>Publications</h1>
<ul><li>C Bartz, H Rätz, H Yang, J Bethge, C Meinel, <strong>“Synthesis in Style: Semantic Segmentation of Historical Documents using Synthetic Data”</strong> [<a href="https://arxiv.org/abs/2107.06777" rel="nofollow">arXiv preprint</a>][<a href="https://github.com/Bartzi/synthesis-in-style" rel="nofollow">code</a>]</li>
<li>C Bartz, H Rätz, C Meinel, <strong>“Handwriting Classification for the Analysis of Art-Historical Documents”</strong> [<a href="https://link.springer.com/chapter/10.1007/978-3-030-68796-0_40" rel="nofollow">FAPER 2020</a>][<a href="https://arxiv.org/abs/2011.02264" rel="nofollow">pdf</a>][<a href="https://github.com/hendraet/handwriting-classification" rel="nofollow">code</a>][<a href="/research/handwriting_classification">models</a>]</li>
<li>N Jain, C Bartz, T Bredow, E Metzenthin, J Otholt, R Krestel, <strong>“Semantic Analysis of Cultural Heritage Data: Aligning Paintings and Descriptions in Art-Historic Collections”</strong> [<a href="https://link.springer.com/chapter/10.1007/978-3-030-68796-0_37" rel="nofollow">FAPER 2020</a>][<a href="https://github.com/HPI-DeepLearning/semantic_analysis_of_cultural_heritage_data" rel="nofollow">Code</a>]</li>
<li>C Bartz, J Bethge, H Yang, C Meinel, <strong>“One Model to Reconstruct Them All: A Novel Way to Use the Stochastic Noise in StyleGAN”</strong> [<a href="https://arxiv.org/abs/2010.11113" rel="nofollow">arXiv preprint</a>][<a href="https://github.com/Bartzi/one-model-to-reconstruct-them-all" rel="nofollow">code</a>][<a href="/research/one_model_to_reconstruct_them_all">models</a>]</li>
<li>C Bartz, L Seidel, DH Nguyen, J Bethge, H Yang, C Meinel, <strong>“Synthetic Data for the Analysis of Archival Documents: Handwriting Determination”</strong> [<a href="http://www.dicta2020.org/wp-content/uploads/2020/09/9_CameraReady.pdf" rel="nofollow">DICTA 2020</a>][<a href="https://github.com/Bartzi/handwriting-determination" rel="nofollow">code</a>][<a href="/research/handwriting_determination">models</a>]</li>
<li>C Bartz, N Jain, R Krestel, <strong>“Automatic Matching of Paintings and Descriptions in Art-Historic Archives using Multimodal Analysis”</strong>, [<a href="https://www.aclweb.org/anthology/2020.ai4hi-1.4.pdf" rel="nofollow">AI4HI</a>]</li>
<li>J Bethge, C Bartz, H Yang, C Meinel, <strong>“BMXNet 2: An Open Source Framework for Low-bit Networks-Reproducing, Understanding, Designing and Showcasing”</strong> [<a href="https://dl.acm.org/doi/abs/10.1145/3394171.3414539" rel="nofollow">ACM MM 2020</a>][<a href="https://github.com/hpi-xnor/BMXNet-v2" rel="nofollow">code</a>]</li>
<li>J Bethge, C Bartz, H Yang, Y Chen, C Meinel, <strong>“MeliusNet: An Improved Network Architecture for Binary Neural Networks”</strong> [<a href="https://openaccess.thecvf.com/content/WACV2021/html/Bethge_MeliusNet_An_Improved_Network_Architecture_for_Binary_Neural_Networks_WACV_2021_paper.html" rel="nofollow">WACV 2021</a>][<a href="https://arxiv.org/abs/2001.05936" rel="nofollow">arXiv preprint</a>][<a href="https://github.com/hpi-xnor/BMXNet-v2" rel="nofollow">code</a>]</li>
<li>C Bartz, J Bethge, H Yang, C Meinel, <strong>“KISS: Keeping it Simple for Scene Text Recognition”</strong> [<a href="https://arxiv.org/abs/1911.08400" rel="nofollow">pdf</a>][<a href="https://github.com/Bartzi/kiss" rel="nofollow">code</a>][<a href="/research/kiss">models</a>]</li>
<li>C Bartz, H Yang, J Bethge, C Meinel, <strong>“LoANs: Weakly Supervised Object Detection with Localizer Assessor Networks”</strong> [<a href="https://link.springer.com/chapter/10.1007/978-3-030-21074-8_29" rel="nofollow">AMV-18</a>][<a href="https://arxiv.org/abs/1811.05773" rel="nofollow">pdf</a>][<a href="https://github.com/Bartzi/loans" rel="nofollow">code</a>][<a href="/research/loans">models</a>]</li>
<li>J Bethge, H Yang, C Bartz, C Meinel, <strong>“Learning to train a binary neural network”</strong>, [<a href="https://arxiv.org/abs/1809.10463" rel="nofollow">arXiv preprint</a>][<a href="https://github.com/Jopyth/BMXNet" rel="nofollow">code</a>]</li>
<li>C Bartz, H Yang, C Meinel, <strong>“SEE: Towards Semi-Supervised End-to-End Scene Text Recognition”</strong> [<a href="https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16270" rel="nofollow">AAAI-18</a>][<a href="http://arxiv.org/abs/1712.05404" rel="nofollow">pdf</a>][<a href="https://github.com/Bartzi/see" rel="nofollow">code</a>][<a href="/research/see">models</a>]</li>
<li>C Bartz, H Yang, C Meinel, <strong>“STN-OCR: A single Neural Network for Text Detection and Text Recognition”</strong> [<a href="https://arxiv.org/abs/1707.08831" rel="nofollow">arXiv preprint</a>][<a href="https://github.com/Bartzi/stn-ocr" rel="nofollow">code</a>][<a href="/research/stn-ocr">models</a>]</li>
<li>C Bartz, T Herold, H Yang, C Meinel, <strong>“Language Identification Using Deep Convolutional Recurrent Neural Networks”</strong> [<a href="https://link.springer.com/chapter/10.1007/978-3-319-70136-3_93" rel="nofollow">ICONIP 2017</a>] [<a href="https://arxiv.org/abs/1708.04811" rel="nofollow">pdf</a>][<a href="https://github.com/HPI-DeepLearning/crnn-lid" rel="nofollow">code</a>]</li>
<li>H Yang, M Fritzsche, C Bartz, C Meinel, <strong>“BMXNet: An Open-Source Binary Neural Network Implementation Based on MXNet”</strong> [<a href="https://dl.acm.org/citation.cfm?id=3129393&CFID=780103033&CFTOKEN=26246456" rel="nofollow">ACM MM 2017</a>][<a href="https://arxiv.org/abs/1705.09864" rel="nofollow">pdf</a>][<a href="https://github.com/hpi-xnor" rel="nofollow">code</a>]</li>
<li>H Yang, C Wang, C Bartz, C Meinel, <strong>“SceneTextReg: A Real-Time Video OCR System”</strong> [<a href="https://dl.acm.org/citation.cfm?id=2973811" rel="nofollow">ACM MM 2016</a>][<a href="https://hpi.de/fileadmin/user_upload/fachgebiete/meinel/tele-task/papers/ACMMM16-yang.pdf" rel="nofollow">pdf</a>]</li>
<li>C Wang, H Yang, C Bartz, C Meinel, <strong>“Image captioning with deep bidirectional LSTMs”</strong> [<a href="https://dl.acm.org/citation.cfm?id=2964299" rel="nofollow">ACM MM 2016</a>][<a href="https://arxiv.org/abs/1604.00790" rel="nofollow">pdf</a>][<a href="https://github.com/deepsemantic/image_captioning" rel="nofollow">code</a>]</li></ul>
          ]]></content:encoded>
          <media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://bartzi.de///images/posts/blog-posts.jpg"/>
          <media:content xmlns:media="http://search.yahoo.com/mrss/" medium="image" url="https://bartzi.de///images/posts/blog-posts.jpg"/>          
        </item>
      
        <item>
          <guid>https://bartzi.de//research/handwriting_classification</guid>
          <title>Handwriting Classification for the Analysis of Art-Historical Documents</title>
          <description>This page contains models and train/test data for our approach described in the paper "Handwriting Classification for the Analysis of Art-Historical Documents"</description>
          <link>https://bartzi.de//research/handwriting_classification</link>
          <pubDate>Thu, 05 Nov 2020 15:10:00 +0100</pubDate>
          <category>research</category>
          <content:encoded><![CDATA[
            <div style="margin: 50px 0; font-style: italic;">
              If anything looks wrong, 
              <strong>
                <a href="https://bartzi.de//research/handwriting_classification">
                  read on the site!
                </a>
              </strong>
            </div>

            <p>This page contains models and train/test data for our approach described in the paper <a href="https://arxiv.org/abs/2011.02264" rel="nofollow">Handwriting Classification for the Analysis of Art-Historical Documents</a>.</p>
<h2 id="traintest-data"><a class="heading-link" title="Permalink" aria-hidden="true" href="#traintest-data"><span>#</span></a>Train/Test Data</h2>
<p>You can download the train and test data for training a classifier based on the <code>GANWriting</code> dataset from <a href="/media/research/handwriting_classification/ganwriting_train_test.tar.bz2"><code>ganwriting_train_test.tar.bz2</code></a>.
The train and test data for creating a model on the 5CHPT dataset can be found in <a href="/media/research/handwriting_classification/5CHPT_train_test.tar.bz2"><code>5CHPT_train_test.tar.bz2</code></a>.</p>
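<p>If you prefer to fetch and unpack one of the archives from a script, the following is a minimal Python sketch. The URL is the link above resolved against the site root; the target directory name is our own choice here:</p>
<pre><code># Minimal sketch: download one of the .tar.bz2 archives and unpack it.
# URL taken from the link above; "data" as target directory is an assumption.
import tarfile
import urllib.request

url = ("https://bartzi.de/media/research/handwriting_classification/"
       "ganwriting_train_test.tar.bz2")
archive, _ = urllib.request.urlretrieve(url, "ganwriting_train_test.tar.bz2")

# tarfile handles the bzip2 compression transparently via mode "r:bz2".
with tarfile.open(archive, "r:bz2") as tar:
    tar.extractall("data")
</code></pre>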
<h2 id="models"><a class="heading-link" title="Permalink" aria-hidden="true" href="#models"><span>#</span></a>Models</h2>
<p>We also provide pretrained models.
<a href="/media/research/handwriting_classification/ganwriting_models.tar.bz2"><code>ganwriting_models.tar.bz2</code></a> contains pretrained models trained on the <code>GANWriting</code> dataset, whereas <a href="/media/research/handwriting_classification/5CHPT_models.tar.bz2"><code>5CHPT_models.tar.bz2</code></a> contains models trained on the 5CHPT dataset.</p>
          ]]></content:encoded>
          
                    
        </item>
      
        <item>
          <guid>https://bartzi.de//research/handwriting_determination</guid>
          <title>Synthetic Data for the Analysis of Archival Documents: Handwriting Determination</title>
          <description>This page contains all data necessary to generate training data, train a model, or use a trained model for our paper.</description>
          <link>https://bartzi.de//research/handwriting_determination</link>
          <pubDate>Mon, 26 Oct 2020 15:19:00 +0100</pubDate>
          <category>research</category>
          <content:encoded><![CDATA[
            <div style="margin: 50px 0; font-style: italic;">
              If anything looks wrong, 
              <strong>
                <a href="https://bartzi.de//research/handwriting_determination">
                  read on the site!
                </a>
              </strong>
            </div>

            <h1 id="synthetic-data-for-the-analysis-of-archival-documents-handwriting-determination"><a class="heading-link" title="Permalink" aria-hidden="true" href="#synthetic-data-for-the-analysis-of-archival-documents-handwriting-determination"><span>#</span></a>Synthetic Data for the Analysis of Archival Documents: Handwriting Determination</h1>
<p>This page contains all data necessary to generate training data, train a model, or use a trained model for our paper.</p>
<h2 id="trained-model"><a class="heading-link" title="Permalink" aria-hidden="true" href="#trained-model"><span>#</span></a>Trained Model</h2>
<p><a href="/media/research/handwriting_determination/model.zip"><code>model.zip</code></a> contains the pre-trained model that we used for our experiments.</p>
<h2 id="training-data"><a class="heading-link" title="Permalink" aria-hidden="true" href="#training-data"><span>#</span></a>Training Data</h2>
<p><a href="/media/research/handwriting_determination/training_data.zip"><code>training_data.zip</code></a> contains all training data we used to create our model in <code>model.zip</code>. Be aware, the file is quite large (&gt; 3GB).</p>
<h2 id="generating-your-own-data"><a class="heading-link" title="Permalink" aria-hidden="true" href="#generating-your-own-data"><span>#</span></a>Generating Your Own Data</h2>
<p><a href="/media/research/handwriting_determination/generation_data.zip"><code>generation_data.zip</code></a> contains the directory structure necessary to work with our data generator. Because of Copyright issues, we are not able to provide you with all data we used. However, we supply information on how to get the data in each sub-directory in a <code>README</code>.</p>
          ]]></content:encoded>
          
                    
        </item>
      
        <item>
          <guid>https://bartzi.de//research/one_model_to_reconstruct_them_all</guid>
          <title>One Model to Reconstruct Them All: A Novel Way to Use the Stochastic Noise in StyleGAN</title>
          <description>This page contains trained models for a range of experiments from our paper "One Model to Reconstruct Them All: A Novel Way to Use the Stochastic Noise in StyleGAN"</description>
          <link>https://bartzi.de//research/one_model_to_reconstruct_them_all</link>
          <pubDate>Thu, 22 Oct 2020 11:32:00 +0200</pubDate>
          <category>research</category>
          <content:encoded><![CDATA[
            <div style="margin: 50px 0; font-style: italic;">
              If anything looks wrong, 
              <strong>
                <a href="https://bartzi.de//research/one_model_to_reconstruct_them_all">
                  read on the site!
                </a>
              </strong>
            </div>

            <p>This page contains several models for our paper “One Model to Reconstruct Them All: A Novel Way to Use the Stochastic Noise in StyleGAN” (<a href="https://arxiv.org/abs/2010.11113" rel="nofollow">Preprint Here</a>).</p>
<p>We provide models for a range of reconstruction experiments, denoising experiments, and experiments with our different training strategies.</p>
<h2 id="reconstruction-experiments"><a class="heading-link" title="Permalink" aria-hidden="true" href="#reconstruction-experiments"><span>#</span></a>Reconstruction Experiments</h2>
<p>Here, we provide models for our reconstruction experiments shown in Table 1.
We provide models for our experiments on the FFHQ dataset and also on the LSUN Church dataset. Besides the models, you will also find a link to the page where we logged the training run, giving you access to all log information and the hyperparameters used.
You can download each model by clicking the respective link in the attachment section; a scripted download sketch follows the lists below.</p>
<h3 id="ffhq-experiments"><a class="heading-link" title="Permalink" aria-hidden="true" href="#ffhq-experiments"><span>#</span></a>FFHQ Experiments</h3>
<ul><li>Stylegan 1, W Only (Z), <a href="/media/research/one_model_to_reconstruct_them_all/ffhq_stylegan_1_w_only.zip"><code>ffhq_stylegan_1_w_only.zip</code></a>, <a href="https://wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/20znopao" rel="nofollow">WandB</a></li>
<li>Stylegan 1, W Plus, <a href="/media/research/one_model_to_reconstruct_them_all/ffhq_stylegan_1_w_plus.zip"><code>ffhq_stylegan_1_w_plus.zip</code></a>, <a href="https://wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/b7xems29" rel="nofollow">WandB</a></li>
<li>Stylegan 2, W Only (Z), <a href="/media/research/one_model_to_reconstruct_them_all/ffhq_stylegan_2_w_only.zip"><code>ffhq_stylegan_2_w_only.zip</code></a>, <a href="https://wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/2xtyfi5v" rel="nofollow">WandB</a></li>
<li>Stylegan 2, W Plus, <a href="/media/research/one_model_to_reconstruct_them_all/ffhq_stylegan_2_w_plus.zip"><code>ffhq_stylegan_2_w_plus.zip</code></a>, <a href="https://wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/3vx67aji" rel="nofollow">WandB</a></li></ul>
<h2 id="lsun-church-experiments"><a class="heading-link" title="Permalink" aria-hidden="true" href="#lsun-church-experiments"><span>#</span></a>LSUN Church Experiments</h2>
<ul><li>Stylegan 1, W Only (Z), <a href="/media/research/one_model_to_reconstruct_them_all/lsun_church_stylegan_1_w_only.zip"><code>lsun_church_stylegan_1_w_only.zip</code></a>, <a href="https://wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/2ch4qvel" rel="nofollow">WandB</a></li>
<li>Stylegan 1, W Plus, <a href="/media/research/one_model_to_reconstruct_them_all/lsun_church_stylegan_1_w_plus.zip"><code>lsun_church_stylegan_1_w_plus.zip</code></a>, <a href="https://wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/3o0p71t7" rel="nofollow">WandB</a></li>
<li>Stylegan 2, W Only (Z), <a href="/media/research/one_model_to_reconstruct_them_all/lsun_church_stylegan_2_w_only.zip"><code>lsun_church_stylegan_2_w_only.zip</code></a>, <a href="https://wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/126ncuqu" rel="nofollow">WandB</a></li>
<li>Stylegan 2, W Plus, <a href="/media/research/one_model_to_reconstruct_them_all/lsun_church_stylegan_2_w_plus.zip"><code>lsun_church_stylegan_2_w_plus.zip</code></a>, <a href="https://wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/15jejim5" rel="nofollow">WandB</a></li></ul>
<h2 id="denoising-experiments"><a class="heading-link" title="Permalink" aria-hidden="true" href="#denoising-experiments"><span>#</span></a>Denoising Experiments</h2>
<p>Here, we provide only our best models trained for color and black and white denoising.</p>
<ul><li>Stylegan 2, W Plus, Denoise, <a href="/media/research/one_model_to_reconstruct_them_all/stylegan2_wplus_denoising.zip"><code>stylegan2_wplus_denoising.zip</code></a>, <a href="https://app.wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/5vneqdzv" rel="nofollow">WandB</a></li>
<li>Stylegan 2, W Plus, Denoise, Black and White, <a href="/media/research/one_model_to_reconstruct_them_all/stylegan2_wplus_denoising_black_and_white.zip"><code>stylegan2_wplus_denoising_black_and_white.zip</code></a>, <a href="https://wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/3lmtt63t" rel="nofollow">WandB</a></li></ul>
<h2 id="different-training-strategies"><a class="heading-link" title="Permalink" aria-hidden="true" href="#different-training-strategies"><span>#</span></a>Different Training Strategies</h2>
<p>We provide the models we used to create the interpolation results shown in Figure 13 of our paper: a model trained using the two-network strategy (denoted as two-stem in our code) and a model trained using the learning rate strategy.</p>
<ul><li>Stylegan 1, W Plus, Two Network, LSUN Church, <a href="/media/research/one_model_to_reconstruct_them_all/lsun_church_stylegan_1_w_plus_two_networks.zip"><code>lsun_church_stylegan_1_w_plus_two_networks.zip</code></a>, <a href="https://wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/176x4g52" rel="nofollow">WandB</a></li>
<li>Stylegan 1, W Plus, Learning Rate, LSUN Church, <a href="/media/research/one_model_to_reconstruct_them_all/lsun_church_stylegan_1_w_plus_learning_rate.zip"><code>lsun_church_stylegan_1_w_plus_learning_rate.zip</code></a>, <a href="https://wandb.ai/hpi/One%20Model%20to%20Generate%20them%20All/runs/x5u0oowj" rel="nofollow">WandB</a></li></ul>
<p>If you are interested in any other models, feel free to open an issue on GitHub and ask us!</p>
          ]]></content:encoded>
          
                    
        </item>
      
        <item>
          <guid>https://bartzi.de//research/kiss</guid>
          <title>KISS: Keeping It Simple for Scene Text Recognition</title>
          <description>This page contains the trained model and also annotation files for training data and evaluation data for our paper "KISS: Keeping It Simple for Scene Text Recognition"</description>
          <link>https://bartzi.de//research/kiss</link>
          <pubDate>Tue, 19 Nov 2019 16:31:00 +0100</pubDate>
          <category>research</category>
          <content:encoded><![CDATA[
            <div style="margin: 50px 0; font-style: italic;">
              If anything looks wrong, 
              <strong>
                <a href="https://bartzi.de//research/kiss">
                  read on the site!
                </a>
              </strong>
            </div>

            <p>This page contains the trained model and also annotation files for training data and evaluation data for our paper “KISS: Keeping It Simple for Scene Text Recognition”. You can get the paper from <a href="https://arxiv.org/abs/1911.08400" rel="nofollow">here</a>.</p>
<h1 id="training-annotations"><a class="heading-link" title="Permalink" aria-hidden="true" href="#training-annotations"><span>#</span></a>Training Annotations</h1>
<p>In order to get the data we used for training, please follow the instructions in our <a href="https://github.com/Bartzi/kiss#image-data" rel="nofollow">Github repository</a>.
You can find the train annotation files for the MJSynth and the SynthAdd dataset in <a href="/media/research/kiss/train_annotations.zip"><code>train_annotations.zip</code></a>.</p>
<h1 id="evaluation-annotations"><a class="heading-link" title="Permalink" aria-hidden="true" href="#evaluation-annotations"><span>#</span></a>Evaluation Annotations</h1>
<p>Follow the instructions in our <a href="https://github.com/Bartzi/kiss#evaluation-data" rel="nofollow">Github repository</a> to get the evaluation data, prepare the directories as indicated in the <code>notes</code> column, and download the annotation files from here. You can find the annotations for evaluation in the following files (a scripted download sketch follows the list):</p>
<ul><li>ICDAR2013: <a href="/media/research/kiss/icdar2013.zip"><code>icdar2013.zip</code></a></li>
<li>ICDAR2015: <a href="/media/research/kiss/icdar2015.zip"><code>icdar2015.zip</code></a></li>
<li>CUTE80: <a href="/media/research/kiss/cute80.zip"><code>cute80.zip</code></a></li>
<li>IIIT5K: <a href="/media/research/kiss/iiit5k.zip"><code>iiit5k.zip</code></a></li>
<li>SVT: <a href="/media/research/kiss/svt.zip"><code>svt.zip</code></a></li>
<li>SVTP: <a href="/media/research/kiss/svtp.zip"><code>svtp.zip</code></a></li></ul>
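<p>Since each benchmark has its own annotation archive, fetching and unpacking all of them can be scripted. A minimal Python sketch; the filenames are the ones linked above, and the per-benchmark target directories are an assumption:</p>
<pre><code># Minimal sketch: download each evaluation annotation archive and unpack it
# into a directory named after the benchmark (the layout is an assumption).
import urllib.request
import zipfile

BASE = "https://bartzi.de/media/research/kiss/"
benchmarks = ["icdar2013", "icdar2015", "cute80", "iiit5k", "svt", "svtp"]

for benchmark in benchmarks:
    archive = f"{benchmark}.zip"
    urllib.request.urlretrieve(BASE + archive, archive)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(benchmark)
</code></pre>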
<h1 id="best-model"><a class="heading-link" title="Permalink" aria-hidden="true" href="#best-model"><span>#</span></a>Best Model</h1>
<p>You can find our best model in <a href="/media/research/kiss/model.zip"><code>model.zip</code></a>.</p>
          ]]></content:encoded>
          
                    
        </item>
      
        <item>
          <guid>https://bartzi.de//research/loans</guid>
          <title>LoANs: Weakly Supervised Object Detection with Localizer Assessor Networks</title>
          <description>This page provides you with access to data necessary to reproduce the results we reported in our paper "LoANs: Weakly Supervised Object Detection with Localizer Assessor Networks"</description>
          <link>https://bartzi.de//research/loans</link>
          <pubDate>Tue, 13 Nov 2018 15:40:00 +0100</pubDate>
          <category>research</category>
          <content:encoded><![CDATA[
            <div style="margin: 50px 0; font-style: italic;">
              If anything looks wrong, 
              <strong>
                <a href="https://bartzi.de//research/loans">
                  read on the site!
                </a>
              </strong>
            </div>

            <p>This page provides you with access to the data necessary to reproduce the results we reported in our paper “LoANs: Weakly Supervised Object Detection with Localizer Assessor Networks”.</p>
<p>We provide our datasets and also access to auxiliary data necessary to generate your own datasets. Furthermore, we provide access to some models that we trained using the data provided here.</p>
<h1 id="sheep-dataset"><a class="heading-link" title="Permalink" aria-hidden="true" href="#sheep-dataset"><span>#</span></a>Sheep Dataset</h1>
<p>The first dataset we provide is the sheep dataset, which features a lawn mower with an orange sheep on top. We provide the whole dataset, including annotations, and all files necessary to recreate the dataset.</p>
<h2 id="dataset"><a class="heading-link" title="Permalink" aria-hidden="true" href="#dataset"><span>#</span></a>Dataset</h2>
<p>The file <a href="/media/research/loans/sheep_dataset.zip"><code>sheep_dataset.zip</code></a> consists the complete train dataset for training localizer and assessor in the respective folders.
The file <a href="/media/research/loans/backgrounds_templates_bboxes.zip"><code>backgrounds_templates_bboxes.zip</code></a> consists all auxiliary file necessary to create the dataset. </p>
<h2 id="trained-models"><a class="heading-link" title="Permalink" aria-hidden="true" href="#trained-models"><span>#</span></a>Trained Models</h2>
<p>The file <a href="/media/research/loans/sheep_models.zip"><code>sheep_models.zip</code></a> contains two of our trained models on the sheep dataset. The first model is a <code>Resnet 18</code> based localizer that was trained on an image size of <code>224x224</code> pixels. The second model is a <code>Resnet 50</code> based localizer trained on an image size of <code>224x224</code> pixels.</p>
<h1 id="figure-skating-dataset"><a class="heading-link" title="Permalink" aria-hidden="true" href="#figure-skating-dataset"><span>#</span></a>Figure Skating Dataset</h1>
<p>We also provide everything you need to redo our figure skating experiments. This includes all images and also some trained models.</p>
<h2 id="dataset-1"><a class="heading-link" title="Permalink" aria-hidden="true" href="#dataset-1"><span>#</span></a>Dataset</h2>
<p>In this section we describe the files that we uploaded for working with the figure skating dataset.</p>
<h3 id="assessor"><a class="heading-link" title="Permalink" aria-hidden="true" href="#assessor"><span>#</span></a>Assessor</h3>
<p>If you want the background images we used for training the assessor, please download <a href="/media/research/loans/figure_skating_backgrounds.zip"><code>figure_skating_backgrounds.zip</code></a>. You can find the template images for training the assessor in <a href="/media/research/loans/figure_skating_templates.zip"><code>figure_skating_templates.zip</code></a>.
If you want the already prepared assessor train dataset, you can download <a href="/media/research/loans/figure_skating_assessor_datasets.zip"><code>figure_skating_assessor_datasets.zip</code></a>. It includes two datasets: one used for training the <code>Resnet 18</code> based localizer (<code>reference_iou_flipped</code>) and one used for training the <code>Resnet 50</code> based localizer (<code>reference_iou_75_100</code>).</p>
<h3 id="localizer"><a class="heading-link" title="Permalink" aria-hidden="true" href="#localizer"><span>#</span></a>Localizer</h3>
<p>If you are interested in the train datasets for the localizer, you can download the datasets with and without noise in <a href="/media/research/loans/figure_skate_localizer_datasets.zip"><code>figure_skate_localizer_datasets.zip</code></a> (<strong>file size 19GB</strong>).
Furthermore, you can download the evaluation dataset <a href="/media/research/loans/figure_skating_evaluation_dataset.zip"><code>figure_skating_evaluation_dataset.zip</code></a>.</p>
<h2 id="models"><a class="heading-link" title="Permalink" aria-hidden="true" href="#models"><span>#</span></a>Models</h2>
<p>The file <a href="/media/research/loans/figure_skating_models.zip"><code>figure_skating_models.zip</code></a> contains the best two models that we trained on the figure skating datset. The first model is a <code>Resnet 18</code> based localizer that has been trained with the noisy dataset and an output size of <code>50x100</code> pixels. The second model is a <code>Resnet 50</code> based localizer that has been trained on the dataset without noise and an output size of <code>75x100</code> pixels.</p>
<h1 id="slides-and-poster"><a class="heading-link" title="Permalink" aria-hidden="true" href="#slides-and-poster"><span>#</span></a>Slides and Poster</h1>
<p>We also provide the slides of our talk and the poster we presented at the “1st International Workshop on Advanced Machine Vision for Real-life and Industrially Relevant Applications”. If you want to download the slides or the poster, please download the file <a href="/media/research/loans/talk.pdf"><code>talk.pdf</code></a> or <a href="/media/research/loans/poster.pdf"><code>poster.pdf</code></a>, respectively.</p>
          ]]></content:encoded>
          <media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://bartzi.de///images/posts/blog-posts.jpg"/>
          <media:content xmlns:media="http://search.yahoo.com/mrss/" medium="image" url="https://bartzi.de///images/posts/blog-posts.jpg"/>          
        </item>
      
        <item>
          <guid>https://bartzi.de//research/see</guid>
          <title>SEE: Towards Semi-Supervised End-to-End Scene Text Recognition</title>
          <description>This page grants access to our generated SVHN datasets and trained models that are mentioned in our AAAI-18 paper "SEE: Towards Semi-Supervised End-to-End Scene Text Recognition"</description>
          <link>https://bartzi.de//research/see</link>
          <pubDate>Thu, 09 Nov 2017 18:14:00 +0100</pubDate>
          <category>research</category>
          <content:encoded><![CDATA[
            <div style="margin: 50px 0; font-style: italic;">
              If anything looks wrong, 
              <strong>
                <a href="https://bartzi.de//research/see">
                  read on the site!
                </a>
              </strong>
            </div>

            <p>This page grants access to our generated SVHN datasets and trained models that are mentioned in our AAAI-18 paper <strong>“SEE: Towards Semi-Supervised End-to-End Scene Text Recognition”</strong> (preprint available <a href="http://arxiv.org/abs/1712.05404" rel="nofollow">here</a>). The code for this publication is available <a href="https://github.com/Bartzi/see" rel="nofollow">here</a>.</p>
<h1 id="svhn-experiments"><a class="heading-link" title="Permalink" aria-hidden="true" href="#svhn-experiments"><span>#</span></a>SVHN Experiments</h1>
<p>The page about our previous arXiv publication “STN-OCR: A single Neural Network for Text Detection and Text Recognition” contains all data necessary for redoing our experiments on the SVHN datasets. You can find the page <a href="/research/stn-ocr">here</a>. Please note: the models on that page won’t work with the code for this paper, but only with the code for the STN-OCR paper, which is the predecessor of this work.</p>
<p>We’ve also prepared a video that shows the train progress of our model on the SVHN dataset with randomly placed numbers. You can find the video <a href="https://youtu.be/GSq3_GeDZKk" rel="nofollow">here</a>.</p>
<h1 id="fsns"><a class="heading-link" title="Permalink" aria-hidden="true" href="#fsns"><span>#</span></a>FSNS</h1>
<p>For training and evaluating on the FSNS dataset we prepared the dataset as described <a href="https://github.com/Bartzi/see" rel="nofollow">here</a>.</p>
<p>The file <a href="/media/research/see/fsns_model.zip"><code>fsns_model.zip</code></a> contains our best performing model on the FSNS dataset. You can use this data with the <code>evaluation.py</code> script in our Github repository to evaluate the model.</p>
<p>For this dataset, we also prepared a video showing the train progress on this dataset. You can find the video <a href="https://youtu.be/5lt6dAbbsu4" rel="nofollow">here</a>.</p>
<h1 id="text-recognition"><a class="heading-link" title="Permalink" aria-hidden="true" href="#text-recognition"><span>#</span></a>Text Recognition</h1>
<p>Although not mentioned in the paper, we also performed pure text recognition experiments on already extracted text lines (analogous to the experiments described in our <code>STN-OCR</code> paper).</p>
<p>The file <a href="/media/research/see/test_recognition_model.zip"><code>text_recognition_model.zip</code></a> contains a text recognition model and also a small example dataset including all necessary files. If you want to use this dataset, you will need to adapt some filepaths!</p>
<h1 id="supplementary-material"><a class="heading-link" title="Permalink" aria-hidden="true" href="#supplementary-material"><span>#</span></a>Supplementary Material</h1>
<p>The file <a href="/media/research/see/supplementary_material.pdf"><code>supplementary_material.pdf</code></a> contains supplementary material for our paper.</p>
          ]]></content:encoded>
          <media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://bartzi.de///images/posts/blog-posts.jpg"/>
          <media:content xmlns:media="http://search.yahoo.com/mrss/" medium="image" url="https://bartzi.de///images/posts/blog-posts.jpg"/>          
        </item>
      
        <item>
          <guid>https://bartzi.de//research/stn-ocr</guid>
          <title>STN-OCR: A single Neural Network for Text Detection and Text Recognition</title>
          <description>This page grants access to our purpose-built SVHN datasets and trained models that we created for our experiments described in "STN-OCR: A single Neural Network for Text Detection and Text Recognition"</description>
          <link>https://bartzi.de//research/stn-ocr</link>
          <pubDate>Mon, 24 Jul 2017 16:42:00 +0200</pubDate>
          <category>research</category>
          <content:encoded><![CDATA[
            <div style="margin: 50px 0; font-style: italic;">
              If anything looks wrong, 
              <strong>
                <a href="https://bartzi.de//research/stn-ocr">
                  read on the site!
                </a>
              </strong>
            </div>

            <p>This page grants access to our purpose-built SVHN datasets and trained models that we created for our experiments described in <a href="https://arxiv.org/abs/1707.08831" rel="nofollow">STN-OCR</a> (the code is available <a href="https://github.com/Bartzi/stn-ocr" rel="nofollow">here</a>).</p>
<h1 id="svhn-experiments"><a class="heading-link" title="Permalink" aria-hidden="true" href="#svhn-experiments"><span>#</span></a>SVHN Experiments</h1>
<p>In the attachment <code>svhn_dataset_and_models.zip</code> you can find two datasets we used for our training on SVHN data. We created these datasets using the original SVHN dataset images and our scripts that can be found <a href="https://github.com/Bartzi/stn-ocr/tree/master/datasets/svhn" rel="nofollow">here</a>.
You can also find prepared evaluation data in the <code>evaluation</code> folder.</p>
<p>Besides these datasets, you can also find several models:</p>
<ol><li>a model trained on original SVHN data</li>
<li>a model trained on SVHN data evenly distributed on a grid</li>
<li>a model trained on SVHN house number crops randomly placed in an image</li></ol>
<h1 id="text-recognition"><a class="heading-link" title="Permalink" aria-hidden="true" href="#text-recognition"><span>#</span></a>Text Recognition</h1>
<p>Unfortunately, we cannot provide the dataset we used for these experiments, as it is too large and we have not been able to find a suitable place to host it. If you know a good place, please let us know by opening an issue in our Github repository.</p>
<p>However, the file <a href="/media/research/stn-ocr/text_recognition_model.zip"><code>text_recognition_model.zip</code></a> contains a model trained to perform text recognition on already cropped scene text images. This model can be used with the <code>eval_text_recognition.py</code> script from our repository on Github.</p>
<h1 id="fsns"><a class="heading-link" title="Permalink" aria-hidden="true" href="#fsns"><span>#</span></a>FSNS</h1>
<p>For our training we used the standard FSNS dataset. Please see <a href="https://github.com/Bartzi/stn-ocr/blob/master/README.md#fsns" rel="nofollow">this</a> file for downloading and preparing the FSNS dataset for usage with our system.</p>
<p>The file <a href="/media/research/stn-ocr/fsns_model.zip"><code>fsns_model.zip</code></a> contains our best performing model trained on the FSNS dataset. This model can be used with the <code>eval_fsns_model.py</code> script from our repository.</p>
<h1 id="supplementary-material"><a class="heading-link" title="Permalink" aria-hidden="true" href="#supplementary-material"><span>#</span></a>Supplementary Material</h1>
<p>In the paper, we mentioned some videos that show how the model learns to find the regions of text and the pretraining steps that we performed.
You can find the videos in <a href="/media/research/stn-ocr/videos.zip"><code>videos.zip</code></a>.</p>
          ]]></content:encoded>
          <media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://bartzi.de///images/posts/blog-posts.jpg"/>
          <media:content xmlns:media="http://search.yahoo.com/mrss/" medium="image" url="https://bartzi.de///images/posts/blog-posts.jpg"/>          
        </item>
      
  </channel>
</rss>