The diversity in forms of documents (multimedia, multilingual, with or without a structure) and in their uses encourages different communities to mingle more and more.
Information Retrieval, Document and Semantic Web (Recherche d’information, document et web sémantique) is a meeting point for the scientific and industrial communities that are interested in information retrieval, the semantic web, the analysis of documents (texts, images, sounds, videos, etc.) or document collections.
Crowdsourcing has been widely deployed to address some challenges in digital humanities, such as the transcription of old handwritten documents. Such an approach is especially useful to tackle existing limits in automatic (...)
The automatic detection of changes in forests (deforestation, reforestation) relies on various data sets. This article reviews data sets both global and local that can be used to evaluate tasks of land cover classification, (...)
In the last decade, political injunctions to curate and share research data have increased significantly. A survey conducted in 2017 at Rennes 2, a French Humanities and Social Sciences university, enabled us to question (...)
In this article we try to tackle some problems arising with noisy and heterogeneous data in the domain of digital humanities. We investigate a corpus known as the mazarinades corpus which gathers around 5,500 documents in (...)
Over the last decades, there has been an increasing use of information systems, resulting in an exponential increase in textual data. Although the volumetric dimension of these textual data has been resolved, its (...)
With the extremely rapid growth in the number of digital documents in our societies, automatic keyword indexing has become a central research issue in information retrieval and document management. Several scientific (...)
This article presents the eXenSa contribution to the 2016 DEFT shared task. The proposed task consists in indexing bibliographic records with keywords chosen by professional indexers. We propose a statistical approach (...)
This paper presents the 2016 edition of the DEFT text mining challenge. This edition addresses the keyword-based indexing of scientific papers with the aim of simulating a professional indexer. The corpus is composed of (...)
This article presents the participation of the TALN group at LINA in the défi fouille de textes (DEFT) 2016. We propose a new method, TopicCoRank, developed specifically for automatic keyphrase annotation, extracting the (...)
This short paper gives an overview of the presentations and discussions held during the "Computational Journalism" workshop. This workshop was proposed by Laurent Amsaleg (CNRS, IRISA), Vincent Claveau (CNRS, IRISA) and (...)
Editor in Chief
Université Nice Sophia Antipolis
IRIT – Université Paul Sabatier
LIPN – Université Paris 13