
The Street View Text (SVT) dataset contains 647 words and 3796 letters in 249 images harvested from Google Street View. It is more challenging than the dataset used in the previous experiment: text appears in different orientations, the variety of fonts is greater, and the images are noisy. The ground-truth format also differs from the previous experiment, as only some words are annotated (see Figure 11). On the annotated words, the proposed method achieved a recall of 32.9% using the same evaluation protocol as in the previous section (see Figure 12 for output examples).

K. Wang, B. Babenko, and S. Belongie. End-to-end scene text recognition. In ICCV, 2011.
K. Wang and S. Belongie. Word spotting in the wild. In ECCV, 2010.

