🎯elinor 101

What is elinor about and how does it work?

elinor is an annotation and ontology management tool. Whether you are curious about elinor and our approach to NLP or are getting started with elinor yourself, this document is for you. It covers the design philosophy of elinor and gives a high-level overview of the annotation process.

Philosophy

We at &effect developed elinor after we couldn't find any tool that met our needs. There are excellent tools for annotating when you have just a few labels (e.g. Prodigy). However, we often had large corpora to annotate and, at the same time, large codebooks or taxonomies (sometimes with thousands of labels). Annotating against such large, structured vocabularies is a form of natural language processing that stays as close as possible to the domain and incorporates experts' domain knowledge.

With elinor, we want to return the domain to the center of natural language processing.

Understanding your domain: Whether you want to classify documents, find mentions of entities in texts, or disambiguate entity candidates, explicitly stating hierarchies, concept relations, and synonyms helps you understand what you want to model. Encoding your domain knowledge in an ontology creates transparency and clarity.

Making predictions interoperable: Using explicit data schemas, you can use predictions from your trained model in other contexts and combine them with other datasets. Ontologies are nothing new (not by a long shot), but linked data is having a revival, and for good reason.

Interpretable and smaller models: By infusing your domain knowledge into text transformer models (e.g., with siamese neural networks), your models can be considerably smaller than full-blown transfer-learned foundation models. Your results can be as accurate or even better, while being easier to interpret.
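To make the idea of explicitly stated hierarchies and synonyms more concrete, here is a minimal sketch in plain Python. It is not elinor's data model or API: the `Concept` class, its field names, and the toy energy ontology are assumptions made up for illustration, loosely following the SKOS terms prefLabel, altLabel, and broader.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    """One ontology concept (hypothetical structure, loosely SKOS-like)."""
    id: str
    pref_label: str                                       # cf. skos:prefLabel
    alt_labels: List[str] = field(default_factory=list)   # cf. skos:altLabel (synonyms)
    broader: Optional[str] = None                         # cf. skos:broader (parent id)


def build_index(concepts: List[Concept]) -> Dict[str, Concept]:
    """Map every lower-cased label (preferred or synonym) to its concept."""
    index: Dict[str, Concept] = {}
    for concept in concepts:
        for label in [concept.pref_label, *concept.alt_labels]:
            index[label.lower()] = concept
    return index


def ancestors(concept: Concept, by_id: Dict[str, Concept]) -> List[str]:
    """Follow the broader-links upwards and collect the preferred labels."""
    path: List[str] = []
    current = concept
    while current.broader is not None:
        current = by_id[current.broader]
        path.append(current.pref_label)
    return path


# Made-up toy ontology, for illustration only.
concepts = [
    Concept("energy", "Energy"),
    Concept("renewables", "Renewable energy", ["renewables", "green energy"], broader="energy"),
    Concept("solar", "Solar power", ["photovoltaics", "PV"], broader="renewables"),
]
by_id = {c.id: c for c in concepts}
label_index = build_index(concepts)

# A synonym found in a text resolves to its concept and its place in the hierarchy.
match = label_index["pv"]
print(match.pref_label, "->", ancestors(match, by_id))
# prints: Solar power -> ['Renewable energy', 'Energy']
```

Writing the synonyms and parent links down explicitly is what lets a mention such as "PV" be traced back to "Energy" without any model involved.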

The Project Workflow

Here is a prototypical workflow for developing an NLP product with elinor:

  1. Develop an ontology: What do you want to model? Which categories are needed, and what concepts do you expect to find in the documents? We believe ontology development is an iterative process, so start with some first ideas. If you want to use an existing ontology or taxonomy, just upload it using .rdf or .json files (coming soon). Read more Ontologies

  2. Upload documents: Upload documents that you are interested in. So far, you can only upload .csv files, but more formats are coming soon. You can structure documents in folders and reorganize them in the document hub. Read more Documents

  3. Create a project: Create a project that is linked to the ontology. Here you can also start writing annotation guidelines that explain how annotators should label the documents. These guidelines will probably change as you go, because you will need to add or adjust rules to keep annotations consistent. Read more Projects

  4. Create experiments and assign documents to annotators: First, you might want to test your ontology and annotation guidelines. For this purpose, you can give several documents to multiple annotators to check their annotation agreement. Then you can assign larger chunks of documents to different annotators to process them faster. Read more Experiments

  5. Annotate documents: After documents are assigned, the annotators can start processing the documents. You can assign word sequence labels as well as document labels. The project manager can conveniently monitor the progress in the experiment view. Read more Annotation

  6. Export your results: You can export annotated documents conveniently in .json format. You can then feed them as training data into your machine-learning model. You can also directly analyze the results to understand how topics are distributed in your data. Read more Annotation
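Two of the steps above lend themselves to a short sketch: checking annotation agreement (step 4) and looking at how labels are distributed in an export (step 6). The code below is an illustration under stated assumptions, not elinor's implementation. Cohen's kappa is one common agreement measure for two annotators, and the source does not say which measure elinor uses. The JSON layout with `document_id` and `labels` keys is invented for this example, and the real export format may differ.

```python
import json
from collections import Counter
from typing import Dict, List


def cohen_kappa(labels_a: List[str], labels_b: List[str]) -> float:
    """Cohen's kappa for two annotators who labelled the same documents."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty set of documents")
    n = len(labels_a)
    # Share of documents on which both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


def label_distribution(records: List[Dict]) -> Counter:
    """Count how often each document label occurs in the exported records."""
    return Counter(label for record in records for label in record["labels"])


# Step 4: two annotators labelled the same four documents (made-up labels).
annotator_a = ["health", "health", "energy", "energy"]
annotator_b = ["health", "energy", "energy", "energy"]
kappa = cohen_kappa(annotator_a, annotator_b)
print(f"kappa = {kappa:.2f}")  # prints: kappa = 0.50

# Step 6: a hypothetical export; the real .json schema is not specified here.
export_json = """
[
  {"document_id": "d1", "labels": ["health"]},
  {"document_id": "d2", "labels": ["energy", "health"]}
]
"""
distribution = label_distribution(json.loads(export_json))
print(distribution.most_common())  # prints: [('health', 2), ('energy', 1)]
```

A low kappa in a first test experiment is usually a hint that the ontology or the annotation guidelines need another iteration before larger batches are assigned.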

Glossary

Term: Description

Natural Language Processing (NLP): Using computational techniques to analyze, structure, and process natural language and speech.

Annotation: Annotation is the process of attaching labels to words, phrases, or documents.

Gold standard: A gold standard is a data set of documents for which human annotators have added labels. This data set is usually used for training or fine-tuning machine learning models.

Ontology: An ontology is a collection of concepts that represent important ideas or topics of your domain. In other tools these might be called codebooks or taxonomies. Our ontologies are SKOS-compliant and can therefore be used for many use cases other than annotating text.

SKOS: Simple Knowledge Organization System is a W3C recommendation designed to represent thesauri, classification schemes, taxonomies, subject-heading systems, or any other type of structured controlled vocabulary. (Source: Wikipedia)

Project: Projects structure your annotation workflow. In projects, you can start experiments to achieve specific goals. You can also develop annotation guidelines that help annotators label documents consistently.

Experiment: An experiment is a collection of task assignments. It also serves as a milestone in the annotation process.

Task (Assignment): A task is the assignment of one document to one annotator within one experiment. This matters because the same document might be annotated by different people in different experiments or projects.
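The relationship between experiments and tasks described in the glossary can be sketched as a small data structure. This is only an illustration of the definitions above: the `Task` class, its field names, and the helper function are hypothetical and do not reflect elinor's internal model.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Task:
    """One document assigned to one annotator within one experiment."""
    document_id: str
    annotator: str
    experiment: str


def annotators_per_document(tasks: List[Task], experiment: str) -> Dict[str, Set[str]]:
    """For one experiment, list which annotators were assigned each document."""
    result: Dict[str, Set[str]] = defaultdict(set)
    for task in tasks:
        if task.experiment == experiment:
            result[task.document_id].add(task.annotator)
    return dict(result)


# Made-up assignments: doc-1 is annotated twice in the pilot and once more later.
tasks = [
    Task("doc-1", "alice", "pilot"),
    Task("doc-1", "bob", "pilot"),
    Task("doc-1", "alice", "main"),
    Task("doc-2", "bob", "main"),
]

pilot = annotators_per_document(tasks, "pilot")
main = annotators_per_document(tasks, "main")
```

Because a task is identified by document, annotator, and experiment together, the same document can appear in several tasks without ambiguity, for example once per annotator in an agreement test and again in a later experiment.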
