This special issue of the journal Open ISI brings together extended versions of articles from the last two editions of INFORSID. It is not devoted solely to the “best” of INFORSID 2019 and 2020; it also reflects the diversity of our field. It contains a selection of articles that we chose for their quality and also for the perspective they offer on research in information systems (IS).
This paper presents the ongoing work of a PhD in history at CY Cergy Paris University, in partnership with the Institut National du Patrimoine. The current and increasingly rapid deployment of digital devices in archaeology remains largely unquestioned, especially with regard to research activities in the discipline. While new digital tools and practices strive to be accessible to as many people as possible, they also leave some archaeologists behind and reveal risks of division within work collectives. Drawing on observant participation, the author proposes a reflexive study that combines epistemology, the history of science and technology, and the sociology of professional archaeological organisations. The effects studied concern archaeology as a discipline, and archaeologists as a set of individual skills, collective practices and professional identities. The study material consists of numerous observations and feedback drawn from more than ten years of experience in archaeological field data acquisition at the Institut national de recherches archéologiques préventives (Inrap) and in several collective multidisciplinary research projects.
From a Web of documents of uncertain commercial interest, driven by pioneers who believed in knowledge sharing, the Web later evolved into a collaborative, real-time form made profitable by advertising. Advertising itself has evolved towards targeted advertising, including behavioral advertising based on the massive collection of usage traces. These traces come from various tracking mechanisms, including IP addresses (IP tracking), the now familiar cookies, and fingerprints (e.g. browser fingerprinting and canvas fingerprinting). While collection was initially limited to the workstation (mainly through the browser), it was later extended to smartphones and connected objects. This led to the trace marketing and attention economy that digital natives were confronted with at an early age. Various countermeasures were gradually deployed by users (settings and extensions, e.g. ad blockers), by anonymization services (e.g. VPNs and proxies), by the publishers themselves, or by the regulator (e.g. the GDPR). This paper first presents the structure of the online advertising sector, followed by a state of the art of the tracking tools deployed there. It then provides an inventory and analysis of the countermeasures deployed and of their effectiveness. We show in particular how rapidly the techniques used are evolving, and how heterogeneous the coverage offered by supposedly equivalent protective tools is.
Companies trying to build new solutions using blockchain are confronted with a plethora of competing technologies, each with many control knobs that require fine-tuning by experts. Existing studies that build decision models for blockchain adoption or selection lack an automated way to use non-functional requirements to provide recommendations. This article, extended from previous work, introduces BLADE (BLockchain Automated DEcision Engine), a decision support tool for blockchain that takes high-level requirements and preferences into account when making recommendations. A knowledge base of blockchain solutions is constructed from documentation, white papers and academic papers. This allows BLADE to execute an automated multi-criteria decision process that returns the most relevant solution based on requirements and preferences derived from the ISO 25010 software quality standard. The tool is implemented within a web platform that makes it easy to capture user requirements and preferences for recommendation. Finally, the proposed approach is validated on a supply chain management case study. This study is a first step towards a solution supporting the design and implementation of end-to-end blockchain applications. Although BLADE is still limited in scope, future work will add more blockchain alternatives and more flexible requirement inputs.
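To give a rough idea of what an automated multi-criteria decision process can look like, the sketch below ranks alternatives with a simple weighted sum. This is an assumption made for illustration: the abstract does not state which decision method BLADE uses, and the function name `rank_alternatives`, the alternative names, the criteria and all scores and weights are hypothetical.

```python
from typing import Dict, List, Tuple


def rank_alternatives(
    scores: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Rank alternatives by a normalized weighted sum of their criterion scores.

    scores  maps each alternative to its per-criterion score in [0, 1].
    weights maps each criterion to the user's preference (any positive scale).
    A criterion missing from an alternative is scored 0.0.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalize the preferences so that they sum to 1.
    normalized = {criterion: w / total for criterion, w in weights.items()}
    ranked = [
        (name, sum(normalized[c] * crit.get(c, 0.0) for c in normalized))
        for name, crit in scores.items()
    ]
    # Best alternative first.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Hypothetical knowledge base: two made-up alternatives, two made-up criteria.
example_scores = {
    "alternative_a": {"performance": 0.9, "security": 0.2},
    "alternative_b": {"performance": 0.4, "security": 0.8},
}
# Hypothetical user preferences: security matters three times as much.
example_weights = {"performance": 1.0, "security": 3.0}
example_ranking = rank_alternatives(example_scores, example_weights)
```

With these made-up inputs, the security-heavy preference places `alternative_b` ahead of `alternative_a`. A weighted sum is only one of many multi-criteria methods; it is used here because it is the simplest to read.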
With the rapid growth of available textual data, text mining has attracted particular interest. Because such data are unstructured, they require substantial preprocessing. Among the preprocessing steps, stemming algorithms conflate the variants of words into their stems. However, the most popular algorithms are rule-based, and therefore highly language-dependent. In contrast, corpus-based stemmers often exhibit significant algorithmic complexity, which makes them inefficient. Nor do they necessarily provide the extracted stems, which are required for certain text mining tasks. We propose a new approach, RFreeStem, that is corpus-based and can therefore be applied to many languages. The implementation of our method is flexible and efficient, since it relies on a single pass over the words’ n-grams. We also detail a method to extract the stems. Our experiments show that RFreeStem improves the results of text mining tasks, even more than the reference Porter stemmer, while providing a stemming solution for under-resourced languages, which do not benefit from a version of Porter.
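The general idea of a corpus-based, rule-free stemmer can be illustrated with a small prefix-frequency heuristic. This is not the RFreeStem algorithm, whose details are not given in the abstract; it is a minimal sketch under assumed parameters (`min_len`, `min_support`) and with hypothetical function names, in which a single pass over the vocabulary counts leading n-grams and each word is mapped to its longest sufficiently frequent prefix.

```python
from collections import Counter
from typing import Iterable


def build_prefix_counts(vocabulary: Iterable[str], min_len: int = 4) -> Counter:
    """Count, in one pass, how many distinct words share each leading n-gram."""
    counts: Counter = Counter()
    for word in set(vocabulary):
        for n in range(min_len, len(word) + 1):
            counts[word[:n]] += 1
    return counts


def stem(word: str, counts: Counter, min_len: int = 4, min_support: int = 3) -> str:
    """Return the longest prefix of `word` shared by at least `min_support` words.

    Words shorter than `min_len`, or with no frequent prefix, are returned as-is.
    """
    for n in range(len(word), min_len - 1, -1):
        if counts[word[:n]] >= min_support:
            return word[:n]
    return word


# Hypothetical toy vocabulary; no language-specific rule is involved.
toy_vocabulary = ["connect", "connected", "connecting", "connection", "cat"]
toy_counts = build_prefix_counts(toy_vocabulary)
toy_stems = {w: stem(w, toy_counts) for w in toy_vocabulary}
```

On this toy vocabulary the four "connect" variants are conflated to the same stem and "cat" is left unchanged. The heuristic is sensitive to `min_support`: with a lower threshold, "connecting" and "connection" would instead share the longer prefix "connecti".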
Extracting value from social network data is a task whose complexity is driven by the speed, volume and variability of the data. Users develop multiple uses of these systems, which increases semantic variability. Analytics results must be produced as soon as possible (ideally in real time) to be relevant. Business knowledge is also essential, and it is usually acquired through exploratory analysis. Accordingly, systems that harvest, store and analyze data from social networks have to support large streams of data, real-time analysis and exploratory analysis. Architecture styles and patterns make it possible to take these specificities into account by proposing techniques to handle such data and thus facilitate their processing. These architectures have to be formalized in order to check that essential properties are fulfilled, to understand their behaviour, and to anticipate the effects that components can have on one another when they are combined in the same architecture, all before the architecture is developed and put into production. In this article, we propose an architecture pattern, the Lambda+ Architecture, inspired by the Lambda Architecture and adapted to the processing of Big Data. We propose a formalization of architectures based on category theory, and an implementation of our pattern to analyze Twitter data.
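For readers unfamiliar with the original Lambda Architecture that this pattern builds on, the sketch below shows its basic principle: an append-only master dataset, a batch view recomputed from it, a speed view covering recent data, and queries that merge both. It is a single-process toy under stated assumptions, not the Lambda+ Architecture or the authors' implementation; the class name `LambdaSketch` and the hashtag-counting use case are hypothetical.

```python
from collections import Counter
from typing import List


class LambdaSketch:
    """Toy, single-threaded illustration of the Lambda Architecture principle."""

    def __init__(self) -> None:
        self.master: List[str] = []           # append-only master dataset
        self.batch_view: Counter = Counter()  # recomputed from the master dataset
        self.speed_view: Counter = Counter()  # increments since the last batch run

    def ingest(self, hashtag: str) -> None:
        """Store the raw event and update the real-time (speed) view."""
        self.master.append(hashtag)
        self.speed_view[hashtag] += 1

    def run_batch(self) -> None:
        """Recompute the batch view from scratch, then reset the speed view."""
        self.batch_view = Counter(self.master)
        self.speed_view.clear()

    def query(self, hashtag: str) -> int:
        """Serve a count by merging the batch view and the speed view."""
        return self.batch_view[hashtag] + self.speed_view[hashtag]


# Hypothetical stream of hashtags.
sketch = LambdaSketch()
for tag in ["a", "b", "a"]:
    sketch.ingest(tag)
count_before_batch = sketch.query("a")  # served by the speed view only
sketch.run_batch()
sketch.ingest("a")
count_after_batch = sketch.query("a")   # batch view plus speed view
```

In a real deployment the batch and speed layers run concurrently on distributed systems, so the hand-over between views needs more care than the simple reset used here.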