In the last several years, deep learning has become the method of choice for many applications within computer vision. The subfield of word image retrieval, typically called word spotting, is no different. In this talk, I will present our most recent work on word spotting, which has achieved state-of-the-art results on multiple benchmarks. Our model, called Ctrl-F-Net, is based on a convolutional neural network and allows us to search for word images using text strings, much like a search engine. Furthermore, the model is segmentation-free, meaning that it simultaneously segments and retrieves word images on a manuscript page. As a real-world use case, we apply our model to a collection of historical Swedish manuscripts, obtained in collaboration with the Department of History, to aid them in their research. Finally, I will present some recent model developments guided by a series of ablation studies.