Saneifar Hassan, Bonniol Stéphane, Poncelet Pascal, Roche Mathieu. 2015. Recognition of logical units in log files. Intelligent Data Analysis, 19 (2) : 431-448.
Published version
- English
Access restricted to Cirad staff. Use subject to authorization by the author or by Cirad. document_575757.pdf (578kB)
Quartile: Q4, Subject: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
HCERES list of journals (in SSH): yes
HCERES journal theme(s) (in SSH): Psychology-ethology-ergonomics
Abstract: With the development of new technologies, more and more information is stored in log files. Analyzing such logs can be very useful for the decision maker. Probably one of the best-known examples is Web log file analysis, where many efficient tools have been proposed to extract the top-k accessed pages, the best users, or even the patterns describing the behavior of users on a Web site. These tools take advantage of the well-formed structure of the data. Unfortunately, log files from the industrial world have very heterogeneous and complex structures (e.g., tables, lists, data blocks). For experts, analyzing logs to find messages that help to better understand the causes of a failure, to determine whether a problem has already occurred in the past, or even to identify the main consequences of a failure is a hard, tedious, time-consuming and error-prone task. There is thus a need for new tools that help experts easily recognize the appropriate parts of logs. Passage retrieval methods have proved to be very useful for extracting relevant parts of documents. In this paper we propose a new approach for automatically splitting log files into relevant segments based on their logical units. We characterize the complex logical units found in logs according to their syntactic characteristics. We also introduce the notion of generalized vs-grams, which is used to automatically extract the syntactic characteristics of special structures found in log files. Experiments are performed on real datasets from the industrial world to demonstrate the efficiency of our proposal for the recognition of complex logical units.
Agrovoc keywords: data analysis, data processing, information processing, technology, industry, computer applications, software, decision support, research support
Agris classification: U10 - Computer science, mathematics and statistics
U30 - Research methods
Cirad strategic field: Outside strategic lines (2014-2018)
Authors and affiliations
- Saneifar Hassan, LIRMM (FRA)
- Bonniol Stéphane, Satin Technologies (FRA)
- Poncelet Pascal, LIRMM (FRA)
- Roche Mathieu, CIRAD-ES-UMR TETIS (FRA) ORCID: 0000-0003-3272-8568
Source: Cirad - Agritrop (https://agritrop.cirad.fr/575757/)