Annotation tagset mapping and transliteration

Procedure for harmonising two annotation tagsets:

The user uploads a set of source annotated documents (in any tagset) together with one or more documents annotated with the gold (target) tagset.
The system reads both tagsets and offers mappings (an option must be selected for each tag).
The system generates the source documents in the new annotation scheme.
The user downloads the results.
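
The mapping step can be illustrated with a minimal sketch. It is not the tool's own code: it assumes the documents are in BRAT standoff format, and the function name `remap_ann` and the example mapping are hypothetical. Tags missing from the mapping are left unchanged here, although the procedure above expects an option to be chosen for every tag.

```python
def remap_ann(ann_text: str, mapping: dict[str, str]) -> str:
    """Rewrite entity tags in BRAT standoff text according to `mapping` (hypothetical helper)."""
    out = []
    for line in ann_text.splitlines():
        fields = line.split("\t")
        # Text-bound annotations have an ID starting with "T" and a
        # second field of the form "TAG start end".
        if len(fields) >= 2 and fields[0].startswith("T"):
            tag, _, offsets = fields[1].partition(" ")
            # Assumption of this sketch: unmapped tags are kept as they are.
            fields[1] = f"{mapping.get(tag, tag)} {offsets}"
        out.append("\t".join(fields))
    return "\n".join(out)


source = "T1\tPER 0 4\tJohn\nT2\tLOC 14 22\tNew York"
print(remap_ann(source, {"PER": "PERSON"}))
```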

Upload a .zip file with the following structure:

f.zip
├── gold/
│   ├── rnd1.ann
│   └── rndm2.ann
└── to_map/
    ├── x1.ann
    ├── x2.xml
    └── x3.txt

Download Sample Archive

BRAT → CoNLL02 (.py source)

Upload a .zip file with the following structure:

f.zip
├── f1.ann
├── f1.txt
├── f2.ann
└── f2.txt

Download Sample Archive
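
As a rough illustration of what such a conversion involves (this is not the linked .py source), the sketch below turns one BRAT .txt/.ann pair into token-per-line output with BIO tags. The function name `brat_to_conll`, the simple regex tokenizer, and the decision to skip discontinuous spans and to omit sentence boundaries are all assumptions of the sketch.

```python
import re


def brat_to_conll(text: str, ann_text: str) -> str:
    """Convert BRAT standoff annotations to 'token tag' lines (hypothetical helper)."""
    # Collect (start, end, tag) spans from text-bound annotation lines.
    spans = []
    for line in ann_text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 2 and fields[0].startswith("T"):
            if ";" in fields[1]:
                continue  # discontinuous spans are ignored in this sketch
            tag, start, end = fields[1].split()
            spans.append((int(start), int(end), tag))

    rows = []
    # Assumed tokenization: runs of word characters, or single punctuation marks.
    for match in re.finditer(r"\w+|[^\w\s]", text):
        label = "O"
        for start, end, tag in spans:
            if match.start() < end and match.end() > start:  # token overlaps the span
                prefix = "B-" if match.start() <= start else "I-"
                label = prefix + tag
                break
        rows.append(f"{match.group()} {label}")
    return "\n".join(rows)


txt = "John lives in New York."
ann = "T1\tPER 0 4\tJohn\nT2\tLOC 14 22\tNew York"
print(brat_to_conll(txt, ann))
```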

CoNLL02 → BRAT (.py source)

Upload a .zip file with the following structure:

f.zip
├── f1.conll
└── f2.conll

Download Sample Archive
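
The reverse direction can be sketched in a similar way (again, not the linked .py source). The hypothetical `conll_to_brat` below assumes that the token is the first column and the BIO tag the last column of each line, and it rebuilds the text by joining tokens with single spaces, since the original spacing cannot be recovered from token-per-line input.

```python
def conll_to_brat(conll_text: str) -> tuple[str, str]:
    """Return (text, ann) reconstructed from 'token ... tag' lines (hypothetical helper)."""
    tokens = []
    entities = []   # each entity is [tag, start, end]
    current = None  # entity currently being extended, if any
    offset = 0
    for line in conll_text.splitlines():
        parts = line.split()
        if not parts:
            current = None  # a blank line (sentence boundary) closes any open entity
            continue
        token, label = parts[0], parts[-1]
        start, end = offset, offset + len(token)
        tag = label[2:] if label[:2] in ("B-", "I-") else None
        if tag is None:
            current = None
        elif label.startswith("I-") and current is not None and current[0] == tag:
            current[2] = end  # extend the open entity
        else:
            current = [tag, start, end]  # B- tag, or I- without a matching open entity
            entities.append(current)
        tokens.append(token)
        offset = end + 1  # account for the single space used when joining
    text = " ".join(tokens)
    ann = "\n".join(
        f"T{i}\t{tag} {s} {e}\t{text[s:e]}"
        for i, (tag, s, e) in enumerate(entities, start=1)
    )
    return text, ann


text, ann = conll_to_brat("John B-PER\nlives O\nin O\nNew B-LOC\nYork I-LOC\n. O")
print(text)
print(ann)
```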

BRAT → XML

Upload a .zip file with the following structure (please remove attributes from the .ann files, if any):

files.zip
├── f1.ann
├── f1.txt
├── f2.ann
└── f2.txt

Download Sample Archive
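
Attributes can be stripped from an .ann file before upload with a few lines of Python. This is a minimal sketch with a hypothetical function name; it relies on the BRAT standoff convention that attribute lines carry IDs beginning with "A" (or "M" in older files).

```python
def strip_attributes(ann_text: str) -> str:
    """Drop attribute lines (IDs starting with 'A' or 'M') from BRAT standoff text."""
    kept = [
        line
        for line in ann_text.splitlines()
        if not line.startswith(("A", "M"))
    ]
    return "\n".join(kept)


ann = "T1\tPER 0 4\tJohn\nA1\tNegation T1\nT2\tLOC 14 22\tNew York"
print(strip_attributes(ann))
```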

XML → BRAT

Upload a .zip file with the following structure:

files.zip
├── f1.xml
└── f2.xml

Download Sample Archive

XML → CoNLL02

Upload a .zip file with the following structure:

files.zip
├── f1.xml
└── f2.xml

Download Sample Archive

Perform NER with spaCy (output in BRAT and XML formats)

Upload a .zip file with the following structure:

f.zip
├── f1.txt
└── f2.xml

Download Sample Archive: English, Serbian

Information about models: English, Serbian

Visualisation & Automatic Annotation

Perform NER with StanfordNER (output in XML format)

Upload a .zip file with the following structure:

f.zip
├── f1.txt
└── f2.xml

Download Sample Archive: English, Serbian

Information about models: Serbian, English, German.

NER stats on .ann files

Upload a .zip file with the following structure:

f.zip
├── eng/
│   ├── f1.ann
│   └── f2.ann
└── slv/
    ├── f1.ann
    └── f2.ann

Download Sample Archive
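
The page does not say which statistics are reported. As one plausible example, the sketch below counts entity tags per top-level directory of an archive laid out as above; the function name `ann_stats` and the choice of statistic are assumptions.

```python
import zipfile
from collections import Counter, defaultdict
from pathlib import PurePosixPath


def ann_stats(zip_source):
    """Count entity tags in .ann files, grouped by top-level directory (hypothetical helper)."""
    stats = defaultdict(Counter)
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            path = PurePosixPath(name)
            # Only consider .ann files that sit inside a directory.
            if path.suffix != ".ann" or len(path.parts) < 2:
                continue
            group = path.parts[0]  # e.g. "eng" or "slv"
            for line in archive.read(name).decode("utf-8").splitlines():
                fields = line.split("\t")
                # Text-bound annotations: ID starts with "T", tag is the
                # first item of the second field.
                if len(fields) >= 2 and fields[0].startswith("T"):
                    stats[group][fields[1].split(" ")[0]] += 1
    return {group: dict(counts) for group, counts in stats.items()}
```

Calling `ann_stats("f.zip")` returns a dictionary keyed by directory name, each value mapping a tag to its number of occurrences.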

Gemini tool for NER evaluation, GitHub

Upload a .zip file with the following structure:

f.zip
├── gold/
│   ├── f1.ann
│   ├── f2.ann
│   └── f3.ann
└── to_eval/
    ├── f1.xml
    ├── f2.xml
    └── f3.xml

The gold/ directory can contain both .xml and .ann files.
The to_eval/ directory can contain .xml and .ann files (.txt files are also accepted, provided they are well-formed).
The files to be evaluated and their corresponding gold standards must have the same name.
If at least one file is XML, the evaluation also produces a visualisation (generated .html files).
Download Sample Archive
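
Before uploading, the naming requirement can be checked locally. The sketch below is a hypothetical helper, not part of the tool; it assumes that "same name" means the same base name regardless of extension (as in gold/f1.ann and to_eval/f1.xml above) and that both directories sit at the top level of the archive.

```python
import zipfile
from pathlib import PurePosixPath


def unmatched_files(zip_source):
    """Return base names that appear in only one of gold/ and to_eval/."""
    stems = {"gold": set(), "to_eval": set()}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            path = PurePosixPath(name)
            # Keep files directly inside gold/ or to_eval/; skip directory entries.
            if len(path.parts) == 2 and path.parts[0] in stems and path.suffix:
                stems[path.parts[0]].add(path.stem)
    # Symmetric difference: names lacking a counterpart on the other side.
    return sorted(stems["gold"] ^ stems["to_eval"])
```

An empty list from `unmatched_files("f.zip")` means every gold file has a counterpart to evaluate and vice versa.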