From pixel to visual resonances: Images with voices

Published 2016-07-12
Pilar Rosado Rodrigo, Eva Figueras Ferrer, Ferran Reverter Comes

Abstract

The objective of our research is to develop a series of computer vision programs that search for analogies in large datasets (in this case, collections of images of abstract paintings) based solely on their visual content, without textual annotation. We have programmed an algorithm based on a specific model of image description used in computer vision. This approach involves placing a regular grid over the image and selecting a pixel region around each node. Dense features computed over this regular grid with overlapping patches are used to represent the images. By analysing the distances between the whole set of image descriptors, we are able to group them according to their similarity, and each resulting group determines what we call a 'visual word'. This model is called the Bag-of-Words representation. Given the frequency with which each visual word occurs in each image, we apply pLSA (Probabilistic Latent Semantic Analysis), a statistical model that classifies images according to their formal patterns fully automatically, without any textual annotation. In this way, we hope to develop a tool for both producing and analysing works of art.
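
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes raw pixel intensities as patch descriptors (the abstract does not name the descriptor), plain k-means as the grouping step, and random arrays in place of paintings. The function names, patch size, grid step, vocabulary size and number of latent topics are all hypothetical choices made for illustration.

```python
import numpy as np

def dense_patches(image, patch=8, step=4):
    """Flattened patches on a regular grid; step < patch makes them overlap."""
    h, w = image.shape
    rows = range(0, h - patch + 1, step)
    cols = range(0, w - patch + 1, step)
    return np.array([image[r:r + patch, c:c + patch].ravel()
                     for r in rows for c in cols], dtype=float)

def assign(x, centers):
    """Index of the nearest centre (squared Euclidean distance) for each row of x."""
    d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def kmeans(x, k, iters=20, seed=0):
    """Plain k-means (an assumed grouping method); each centre is one visual word."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        labels = assign(x, centers)
        for j in range(k):
            members = x[labels == j]
            if len(members):  # keep the old centre if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers

def bow_histogram(descriptors, centers):
    """Bag-of-Words: how often each visual word occurs in one image."""
    return np.bincount(assign(descriptors, centers), minlength=len(centers))

def plsa(counts, n_topics, iters=50, seed=0):
    """EM for pLSA on a (n_images, n_words) count matrix.

    Returns P(z|d) with shape (n_images, n_topics)
    and P(w|z) with shape (n_topics, n_words).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    eps = 1e-12  # guards against division by zero
    for _ in range(iters):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z)
        post = p_z_d[:, :, None] * p_w_z[None, :, :]
        post /= post.sum(axis=1, keepdims=True) + eps
        # M-step: re-estimate both distributions from expected counts
        weighted = counts[:, None, :] * post
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
    return p_z_d, p_w_z

# Synthetic stand-ins for greyscale paintings (random data, illustration only).
rng = np.random.default_rng(42)
images = [rng.random((32, 32)) for _ in range(6)]
descriptors = [dense_patches(img) for img in images]
vocabulary = kmeans(np.vstack(descriptors), k=10)
counts = np.array([bow_histogram(d, vocabulary) for d in descriptors])
p_z_d, p_w_z = plsa(counts, n_topics=3)
groups = p_z_d.argmax(axis=1)  # most probable latent topic per image
```

Each row of `p_z_d` is a mixture over latent topics for one image, so images whose rows are similar can be treated as sharing formal patterns. On real paintings a gradient-based descriptor computed on the same grid would likely be a better choice than raw pixels.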

How to Cite

Rosado Rodrigo, Pilar, Eva Figueras Ferrer, and Ferran Reverter Comes. 2016. “From Pixel to Visual Resonances: Images With Voices”. AusArt 4 (1). https://doi.org/10.1387/ausart.16670.

Keywords

ARTIFICIAL VISION, BAG-OF-WORDS MODEL, CBIR (CONTENT-BASED IMAGE RETRIEVAL), PLSA (PROBABILISTIC LATENT SEMANTIC ANALYSIS), VISUAL WORD
