Traits database

Quick update and way forward

June 2025

Romain Frelat
 FRB-CESAB

Karim Barkaoui
 CIRAD

Elena Kazakou
 CNRS

Species list

Files in fellow-traits/data/raw-data/species-list



  • Merge files (in analyses/01_get_specieslist.R)
  • Harmonize spellings with the function clean_species_list():
    • remove punctuation: ? . , ()
    • remove sp, spp, cf, complex, …
    • remove author names and years when included
    • make sure only the first letter of the genus is uppercase
    • remove “Arbre”, “Espece”, “Inconue”, “Repousse”, …
    • remove taxa that are coarser than family: Dicotyledonae, Bryophyta
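
The cleaning steps above can be sketched in base R. This is only an illustration, not the actual clean_species_list(): the function name clean_species_sketch() and the toy input are hypothetical, author names are handled only when they appear in parentheses, and the lists of qualifiers and labels stop where the bullets above stop.

```r
# Hypothetical sketch of the cleaning steps; the real clean_species_list()
# in the project may be implemented differently.
clean_species_sketch <- function(x) {
  # Drop parenthesised text (assumed to hold author names) and 4-digit years
  x <- gsub("\\([^)]*\\)", " ", x, perl = TRUE)
  x <- gsub("\\b[0-9]{4}\\b", " ", x, perl = TRUE)
  # Remove the punctuation marks listed above: ? . , ( )
  x <- gsub("[?.,()]", " ", x, perl = TRUE)
  # Remove qualifiers (sp, spp, cf, complex) as whole words
  x <- gsub("\\b(sp|spp|cf|complex)\\b", " ", x,
            ignore.case = TRUE, perl = TRUE)
  # Collapse repeated spaces and trim both ends
  x <- trimws(gsub("\\s+", " ", x, perl = TRUE))
  # Keep only the first letter (genus initial) in uppercase
  x <- paste0(toupper(substr(x, 1, 1)), tolower(substring(x, 2)))
  # Drop non-taxon labels and taxa coarser than family
  # (only the entries quoted on the slide; the full list is not shown there)
  drop <- c("Arbre", "Espece", "Inconue", "Repousse",
            "Dicotyledonae", "Bryophyta")
  x <- x[!(x %in% drop) & nzchar(x)]
  unique(x)
}

# Toy input, made up for illustration
raw_names <- c("Poa annua (L.) 1753", "festuca cf. rubra", "Bromus sp.",
               "Arbre", "VICIA SATIVA", "Bryophyta", "Poa annua?")
cleaned <- clean_species_sketch(raw_names)
```

Removing punctuation before the qualifiers means that "sp." and "sp" are treated the same way, and lowercasing after both steps lets the label filter catch "ARBRE" as well as "Arbre".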

Number of unique taxa: 2286

Species list

Taxonomic backbone

Taxa with no exact match in Taxref: 112
Taxa not found in Taxref with fuzzy matching: 51
Taxa not found in GBIF: 4 (Anisantha bromus, Festuca schedonorus, Ornithogalum muscari, Picris helminthotheca)
Number of accepted taxa: 2027


    FAMILY      GENUS    SPECIES SUBSPECIES    VARIETY 
        15        241       1640        120         11 
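
The two-step matching reported above (exact match first, then fuzzy match) can be illustrated with base R. This is a sketch under assumptions: the real workflow queries Taxref and GBIF, whereas here the backbone is a plain character vector, and match_backbone_sketch() and the threshold max_dist are hypothetical.

```r
# Hypothetical sketch: exact matching, then fuzzy matching by edit distance.
# `backbone` stands in for the reference names; it is not the Taxref API.
match_backbone_sketch <- function(taxa, backbone, max_dist = 2) {
  exact <- match(taxa, backbone)
  result <- data.frame(
    taxon = taxa,
    matched = backbone[exact],
    type = ifelse(is.na(exact), "none", "exact"),
    stringsAsFactors = FALSE
  )
  todo <- which(is.na(exact))
  if (length(todo) > 0) {
    # Levenshtein distance between each unmatched name and every backbone name
    d <- adist(taxa[todo], backbone)
    best <- apply(d, 1, which.min)
    best_d <- d[cbind(seq_along(todo), best)]
    # Accept the closest name only if it is within max_dist edits
    ok <- best_d <= max_dist
    result$matched[todo[ok]] <- backbone[best[ok]]
    result$type[todo[ok]] <- "fuzzy"
  }
  result
}

# Toy example with one misspelling and one name absent from the backbone
backbone <- c("Poa annua", "Festuca rubra", "Vicia sativa")
taxa <- c("Poa annua", "Festuca rubrra", "Ornithogalum muscari")
res <- match_backbone_sketch(taxa, backbone)
```

Names still flagged "none" after this step correspond to the taxa that would need a second source (GBIF in the workflow above) or a manual check.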

Synonyms

46k known synonyms were retrieved from
the original species list, Taxref and GBIF.

specialist: a taxon listed in only one database
generalist: a taxon listed in 50% of the databases

sp_class
specialist      other generalist 
       854       1018        155 
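
A minimal sketch of how such a classification could be computed from a taxon-by-database presence matrix. The function classify_taxa_sketch() and the toy matrix are hypothetical, and the sketch assumes that "50%" means "at least 50% of the databases" and that the specialist rule takes priority.

```r
# Hypothetical sketch: classify taxa by the number of databases listing them.
# presence: logical matrix, one row per taxon, one column per database.
classify_taxa_sketch <- function(presence) {
  n_db <- rowSums(presence)
  share <- n_db / ncol(presence)
  # Assumption: generalist means listed in at least 50% of the databases
  sp_class <- ifelse(n_db == 1, "specialist",
                     ifelse(share >= 0.5, "generalist", "other"))
  factor(sp_class, levels = c("specialist", "other", "generalist"))
}

# Toy presence matrix with 4 taxa and 5 databases (made-up values)
presence <- rbind(
  taxon_a = c(TRUE, FALSE, FALSE, FALSE, FALSE),
  taxon_b = c(TRUE, TRUE, FALSE, FALSE, FALSE),
  taxon_c = c(TRUE, TRUE, TRUE, FALSE, FALSE),
  taxon_d = c(TRUE, TRUE, TRUE, TRUE, TRUE)
)
sp_class <- classify_taxa_sketch(presence)
table(sp_class)
```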

Trait databases

Trait coverage per database

Trait coverage per taxonomic rank

Taxa with no trait information.

Abies
Acacia
Agrimonia agrimonoides
Agropyron
Amaranthaceae
Apiaceae
Asparagaceae
Aster
Boraginaceae
Brassicaceae
Bryum dichotomum
Caryophyllaceae
Chaenomeles x superba
Chrysanthemum
Cochlearia
Cosmos
Crambe abyssinica
Dysphania aristata
Geraniaceae
Glyceria
Imbribryum subapiculatum
Lamiaceae
Lavandula
Leontodon autumnale
Liliaceae
Lunaria
Moehringia
Orchis
Paronychia
Piptatherum
Poaceae
Primulaceae
Pulmonaria
Rhizogemma staphylina
Riccia sorocarpa
Riccia warnstorfii
Roemeria hispida
Rosaceae
Rubiaceae

Trait coverage per species frequency

Trait completeness

Comparison - Plant height (m)

   spvignes FlorealData    Lososova        BIEN        GIFT    Ecoflora      filled 
       1888        1578         168         732         593        1000         132 

Comparison - SLA

Hodgson    GIFT    BIEN  filled 
   1090     875     866     589 

Comparison - Seed mass (mg)

Lososova     BIEN     GIFT Ecoflora Biolflor   filled 
     305      728      365     1271     1534      228 

Comparison - Life form

Way forward - open questions


Missing trait information?

  • which traits are missing? where to find them?
  • are relevant trait databases missing?

How to get a clean and complete trait database?

  • do we need information for all species? can we discard the rare taxa / non-weed taxa?
  • how to combine trait information from multiple sources? biases?
  • can we measure missing traits? can we guesstimate some of them?
  • should we do trait imputation based on other traits or taxonomy?
  • most categorical traits should be cleaned (misspellings, irrelevant categories, …)
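
To make two of the options above concrete (combining sources, then imputing from taxonomy), here is one possible sketch. It is not a decision on the method: it assumes a fixed priority order between sources and uses the genus median as fallback. The function fill_trait_sketch(), the column names source_a and source_b, and all values are hypothetical.

```r
# Hypothetical sketch: take the first available value across sources in
# priority order, then fill remaining gaps with the genus median.
fill_trait_sketch <- function(traits, sources, genus) {
  genus <- as.character(genus)
  value <- rep(NA_real_, nrow(traits))
  origin <- rep(NA_character_, nrow(traits))
  for (s in sources) {
    # Only fill taxa that have no value yet from a higher-priority source
    take <- is.na(value) & !is.na(traits[[s]])
    value[take] <- traits[[s]][take]
    origin[take] <- s
  }
  # Genus median of observed values; NA for genera without any value
  genus_median <- tapply(value, genus, median, na.rm = TRUE)
  gap <- is.na(value)
  value[gap] <- unname(genus_median[genus[gap]])
  # Record which values come from imputation rather than a source
  origin[gap & !is.na(value)] <- "genus_median"
  data.frame(value = value, origin = origin, stringsAsFactors = FALSE)
}

# Toy data with made-up values: 4 taxa, 2 sources
traits <- data.frame(source_a = c(0.3, NA, NA, NA),
                     source_b = c(0.5, 0.4, NA, NA))
genus <- c("Poa", "Poa", "Poa", "Vicia")
filled <- fill_trait_sketch(traits, c("source_a", "source_b"), genus)
```

Keeping the origin column makes it possible to check later whether results are sensitive to the imputed values or to the choice of source priority.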

What’s next?

  • a first stepping stone for multiple sub-projects
  • in itself, is it a publishable output?
  • how to continue? who takes the lead from now?