LivingNER terminology: NCBI Taxonomy translated to Spanish
Barcelona Supercomputing Center
Description
The official NCBI Taxonomy FTP dump (https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/), with its terms translated into Spanish by a neural machine translation system fine-tuned for the biomedical domain.
We have added the following terms:
tax_id | name_txt | unique name | name class | Spanish name |
---|---|---|---|---|
2560602 | Mumps orthorubulavirus | Mumps orthorubulavirus | scientific name | Paperas ortorubulavirus |
2560526 | Human orthorubulavirus 4 | Human orthorubulavirus 4 | scientific name | orthorubulavirus humano 4 |
2847144 | hepatitis C virus genotype 1a | hepatitis C virus genotype 1a | scientific name | virus de la hepatitis C genotipo 1a |
2560525 | Human orthorubulavirus 2 | Human orthorubulavirus 2 | scientific name | orthorubulavirus humano 2 |
_NOCODE_ | out of NCBI Taxonomy scope | out of NCBI Taxonomy scope | NA | mención no codificable a NCBI Taxonomy |
The first four were added because they appear in the LivingNER corpus and are present in the browser version of NCBI Taxonomy.
The last one (_NOCODE_) was added to mark mentions in the LivingNER corpus that are not present in NCBI Taxonomy.
Format:
Tab-separated file with the following columns:
- tax_id: the ID of the node associated with this name
- name_txt: the name itself
- unique name: the unique variant of this name, if the name is not unique
- name class: the type of name (synonym, common name, scientific name, ...)
- Spanish name: the name translated into Spanish
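The column layout above can be read with Python's standard csv module. The following is a minimal sketch, not an official loader: the names TaxonomyName and read_terminology are hypothetical, and it assumes the file has no header row and exactly five tab-separated fields per line (if a header row exists, it would simply show up under the key "tax_id" and can be dropped).

```python
import csv
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class TaxonomyName(NamedTuple):
    """One row of the terminology (hypothetical helper type)."""
    tax_id: str        # kept as a string because of the special _NOCODE_ entry
    name_txt: str
    unique_name: str
    name_class: str
    spanish_name: str


def read_terminology(lines: Iterable[str]) -> Dict[str, List[TaxonomyName]]:
    """Group rows by tax_id; rows without exactly five fields are skipped."""
    by_id: Dict[str, List[TaxonomyName]] = defaultdict(list)
    # QUOTE_NONE: treat quote characters inside names as ordinary text.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 5:
            continue  # blank or malformed line
        record = TaxonomyName(*(field.strip() for field in row))
        by_id[record.tax_id].append(record)
    return dict(by_id)


# Two of the rows listed in the description, used as in-memory sample input.
sample = [
    "2560602\tMumps orthorubulavirus\tMumps orthorubulavirus\t"
    "scientific name\tPaperas ortorubulavirus\n",
    "_NOCODE_\tout of NCBI Taxonomy scope\tout of NCBI Taxonomy scope\t"
    "NA\tmención no codificable a NCBI Taxonomy\n",
]
terms = read_terminology(sample)
print(terms["2560602"][0].spanish_name)  # prints: Paperas ortorubulavirus
```

For the real file, the same function can be given an open file handle (opened with encoding="utf-8" and newline=""). A tax_id may map to several rows, one per name class, which is why each key holds a list.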
Please cite if you use this dataset:
A. Miranda-Escalada, E. Farré-Maduell, S. Lima-López, D. Estrada, L. Gascó, M. Krallinger, Mention detection, normalization & classification of species, pathogens, humans and food in clinical documents: Overview of LivingNER shared task and resources, Procesamiento del Lenguaje Natural (2022).
@article{amiranda2022nlp,
title={Mention detection, normalization \& classification of species, pathogens, humans and food in clinical documents: Overview of LivingNER shared task and resources},
author={Miranda-Escalada, Antonio and Farr{\'e}-Maduell, Eul{\`a}lia and Lima-L{\'o}pez, Salvador and Estrada, Darryl and Gasc{\'o}, Luis and Krallinger, Martin},
journal = {Procesamiento del Lenguaje Natural},
year={2022}
}
Resources
For more information visit https://temu.bsc.es/livingner/ or email us at encargo-pln-life@bsc.es
Check out the translator demo: https://textmining.bsc.es/translator
Files
Size: 267.4 MB
md5: 8fb3d39c479de41060f461423e21cef1