Data mining

The computing activity dedicated to the intensive exploitation of large data volumes mainly supports the research of the Ondes (Waves) team, in particular its work on ambient-noise methods. This expertise sits at the interface between geophysicists, HPC specialists and database specialists, within national and European projects.

The codes and associated documentation developed as part of the "Data mining" activity are available on the GRICAD GitLab forge (or on the OSUG forge for projects not yet migrated). Note that access to the software forges and/or documentation of some of these projects is restricted.

WIN->MSEED format conversion tools / pre-processing / correlations / doublets & inversion (Whisper and F-Image projects):
- code for the WIN to MSEED conversion tools
- Whisper project wiki and journal, documentation of the Japanese data, documentation and code for data pre-processing
- documentation and code for the correlation/doublets/inversion tools

Beamforming tools (Imag’In, RESOLVE projects):
- RESOLVE wiki
- documentation and code
- MFP code documentation
- documentation and code (old version, no longer maintained)

Visualization tools for beamforming outputs (collaboration with R. Blanch and M. Ortega from LIG):
- documentation and code

Template matching tools (EventDetection project):
- documentation and code

NoiseCorr_DBF: correlation and double beamforming tools (sanjacinto project):
- documentation and code
- documentation (old wiki)

Tools for detecting timing errors on dense networks (IWORMS project):
- online journal
- documentation and code
- wiki

Tools for manipulating and reorganizing value-added datasets in HDF5 format (Utils project):
- documentation and code

Tools for measuring flow velocity and particle concentration based on Acoustic Particle Image Velocimetry (ImVort project, for Imagerie-Vorticité):
- documentation and code

Prototyping tools linking the data of the RESIF data center to the CIMENT-GRICAD HPC infrastructures (Resif-Summer-Ciment project and code)

Others:

Link to the materials of the internal HDF5 training for RESIF, SIG and IPGP staff

Link to the CiGri training, and presentation slides

Link to the site's training offer: tools for data processing, software development and computing (updated as the sessions take place, full version on request)

The domain expertise of the technical staff involved covers:

- optimization of sequential codes (numerical methods, choice of languages, ...)
- application parallelization (MPI, OpenMP)
- application deployment
- grid computing (CiGri v3)
- iRODS: transfer techniques, metadata management, ...
- parallel I/O
- HDF5 data format
- signal processing: handling of data gaps ("holes"), data decimation, ...
- Fortran / C / Python3 / Bash shell
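
As an illustration of the gap ("hole") handling mentioned under signal processing, here is a minimal sketch in Python. It is not taken from any of the project codes listed on this page: the `Segment` class, the `find_gaps` function and the `tolerance` parameter are hypothetical names introduced for the example, and it assumes that a recording is described as a list of contiguous segments with a start time (in seconds), a number of samples and a sampling rate.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical description of one contiguous block of samples."""
    start: float          # time of the first sample, in seconds
    npts: int             # number of samples in the block
    sampling_rate: float  # samples per second (Hz)

    @property
    def end(self) -> float:
        # Time of the last sample of the block
        return self.start + (self.npts - 1) / self.sampling_rate


def find_gaps(segments, tolerance=0.5):
    """Return a list of (gap_start, gap_end, missing_samples) tuples.

    A gap is reported when the next segment starts later than the
    expected next sample time by more than `tolerance` sample intervals.
    """
    gaps = []
    ordered = sorted(segments, key=lambda s: s.start)
    for prev, nxt in zip(ordered, ordered[1:]):
        dt = 1.0 / prev.sampling_rate
        expected = prev.end + dt          # where the next sample should be
        delta = nxt.start - expected      # positive if samples are missing
        if delta > tolerance * dt:
            gaps.append((expected, nxt.start, round(delta / dt)))
    return gaps


if __name__ == "__main__":
    # Two blocks sampled at 100 Hz with half a second of missing data
    trace = [Segment(0.0, 100, 100.0), Segment(1.5, 50, 100.0)]
    for gap_start, gap_end, missing in find_gaps(trace):
        print(f"gap from {gap_start:.2f} s to {gap_end:.2f} s, {missing} samples missing")
```

Once detected, gaps can be filled, masked or used to split the processing windows, depending on the method applied downstream; the sketch only covers detection and ignores overlapping segments.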

Contacts for the "Data mining" activity:

- Michel Campillo, Philippe Roux, Florent Brenguier: researchers and project leaders (F-Image, RESOLVE, Pacific)
- Albanne Lecointre: CNRS research engineer (IR, BAP E), iWORMS project manager, Ondes (Waves) team, GeoData service (deputy head of the service)

Specific skills

- Fortran, C, Python3
- MPI, OpenMP, grid computing, ...
- Scientific computing libraries: BLAS, LAPACK, SciPy, Intel MKL
- h5py, obspy, numpy, scipy, matplotlib, ...
- File formats: HDF5, SEED, (NetCDF3)
- data scanning: msi
- format conversion: sac2mseed, win2sac_32
- metadata extraction: rdseed

- Data type: seismological

Hardware and software resources used by the "Data mining" activity

- Link to the CIMENT/GRICAD computing center
- Link to the ISTerre computing resources

Links with other IT activities and resources at ISTerre and OSUG

==> ISTerre Data Centre
==> Laboratory IT resources
==> OSUG Storage Center

Links with other ISTerre technical platforms

Last updated on 07/04/2022