Frequently Asked Questions

General Questions

What can I do with Tesla?
Tesla is a highly flexible framework

  • for developing your own linguistic components to tackle a specific linguistic task
  • for ensuring interaction between linguistic components
  • for setting up and persisting empirical linguistic experiments
  • for producing large scale linguistic results

You can use Tesla to test hypotheses on existing or custom corpora, to develop new algorithms for corpus linguistics, to evaluate different approaches on the same task, or to publish results of an analysis in a way which allows other scientists to reproduce your experiment.

Which Operating Systems are supported by Tesla?
Tesla was developed and tested on Windows (XP, 7), Linux (Kubuntu) and Mac OS X (10.5 and 10.6). It requires Java 6 and uses Eclipse 3.6, so any platform that supports this software should also support Tesla.
Binary executables are available for the operating systems mentioned above, in both 32-bit and 64-bit versions - have a look at the Download section.
Do I need additional software, such as a database?
Short answer: No, except Java 6, as mentioned above.
Long answer: Not necessarily... Tesla uses a relational database to store information about experiments, and in a server environment you might want to replace the internal database with a more flexible one. Also, some components use a relational database to store annotations - such components will probably not perform well without an advanced database. In that case we recommend installing PostgreSQL, because this is the database we use (and test against) ourselves.
Is Tesla a desktop application? Is it a server application?
It's both. Tesla's desktop application is based on Eclipse and can be used to create and analyze experiments, or to develop new components. These components can be installed on the Tesla Server, which will execute the experiments.
So I have to download, install and configure two applications?
No. The Tesla Client contains a fully-functional Server, which you can start without the need to use a command line. Download the client, unzip it and launch the application - that's it.
How can I run Tesla on a dedicated server?
Download the standalone server from the Download section and unzip it. Modify the file configuration/core/tesla.server.properties to accept remote connections, then run java -jar tesla.server.jar in Tesla's base directory. The tutorial How to set up a standalone server provides detailed instructions.
How can I access a remote installation of Tesla?
After installing Tesla on a remote server, start the Client, go to Tesla's Server preferences and enter the server URL. You can then work with Tesla just as you would with the local version. However, you will not be able to debug components or upload new ones.
I can't connect to the remote server! Oh, and the server runs behind a firewall...
Modify the server's firewall settings so that they allow connections on ports 1199, 4561, 4562, 4563 and 4564. These are Tesla's default ports - you can change them in configuration/core/tesla.server.properties on the server side and in Tesla's Connection Preferences on the client side.
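To see which of the default ports are actually reachable from the client machine, a small connection test can help. The following is a minimal sketch and not part of Tesla: the class TeslaPortCheck, the method isReachable and the host name tesla.example.org are made up for illustration, and the sketch assumes the ports accept plain TCP connections.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Tesla: tests whether TCP ports accept connections.
class TeslaPortCheck {

    // Tesla's default ports as listed in this FAQ.
    static final int[] DEFAULT_PORTS = {1199, 4561, 4562, 4563, 4564};

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out or unresolved host: treat all as "not reachable".
            return false;
        }
    }

    public static void main(String[] args) {
        // "tesla.example.org" is a placeholder; pass your server's host name as first argument.
        String host = args.length > 0 ? args[0] : "tesla.example.org";
        for (int port : DEFAULT_PORTS) {
            String state = isReachable(host, port, 2000) ? "reachable" : "blocked or closed";
            System.out.println(host + ":" + port + " is " + state);
        }
    }
}
```

A port is also reported as "blocked or closed" when the Tesla server is not running, so start the server before testing the firewall.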

Conceptual Questions

What is the difference between Tesla and other frameworks, like UIMA?
UIMA was designed as a framework which allows the integration of existing tools and applications, by defining an XML-based exchange format called common analysis structure (CAS).
However, XML cannot reflect everything an object-oriented programming language offers, which can lead to framework-related restrictions when developing new components: For instance, you cannot define custom functions or methods in XML, and you had better not try to encode large (binary) data structures, such as vectors, as XML is too verbose for that.
Thus, we developed an alternative approach, where components communicate with the help of (Java-) interfaces, resolving the disadvantages of a mainly data-driven framework.
So Tesla will solve all problems? Should I switch from XYZ to Tesla?
Not in general - it depends on your needs. The benefits of Tesla's object-oriented approach become more important the more complex the algorithms of shared objects are. In general, you might want to use Tesla if your approach is experimental and code-driven, or if you'd like to use flexible workflows with fine-grained components which can exchange temporary, active data structures, not only target structures. Tesla is mostly a laboratory for computational linguistics, not a framework for repetitive workflow execution.
What is a Role?
Think of a role as a generic task description, such as POS-tagging, named entity recognition or categorization. These tasks can be realized in different ways: In the case of clustering, for instance, several algorithms with different approaches exist, like k-means clustering, EM-based clustering, or hierarchical clustering - each approach requires different data structures and has different configuration options. However, they also have things in common: A clustering algorithm groups data and assigns elements to one or more clusters.
The Tesla Role System provides a way to organize such approaches in an abstract way, independent of algorithms, data structures, or anything else: developers can define role hierarchies for a task, creating a base role with the functions all implementations have in common, which is then further specified in its sub roles. To get an idea, take a look at the Tesla Role System section for an overview of the currently implemented roles, or look at the tutorial How to create custom roles to learn how to create custom roles.
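The idea of a base role with sub roles can be sketched in plain Java interfaces. This is only an illustration of the concept under our own assumptions, not Tesla's actual role API: the names Clusterer, FixedSizeClusterer, RoundRobinClusterer and RoleDemo are hypothetical, and the "algorithm" is a toy that merely distributes elements in turn.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical base role: what every clustering implementation has in common.
interface Clusterer {
    // Groups the elements into clusters.
    List<List<String>> cluster(List<String> elements);
}

// Hypothetical sub role: adds a configuration option only some algorithms have.
interface FixedSizeClusterer extends Clusterer {
    int getNumberOfClusters();
}

// Toy implementation of the sub role: assigns elements to k clusters in turn.
class RoundRobinClusterer implements FixedSizeClusterer {
    private final int k;

    RoundRobinClusterer(int k) {
        if (k < 1) {
            throw new IllegalArgumentException("k must be at least 1");
        }
        this.k = k;
    }

    public int getNumberOfClusters() {
        return k;
    }

    public List<List<String>> cluster(List<String> elements) {
        List<List<String>> clusters = new ArrayList<List<String>>();
        for (int i = 0; i < k; i++) {
            clusters.add(new ArrayList<String>());
        }
        for (int i = 0; i < elements.size(); i++) {
            clusters.get(i % k).add(elements.get(i));
        }
        return clusters;
    }
}

class RoleDemo {
    // A consumer that depends only on the base role, so any sub role can be plugged in.
    static int countNonEmptyClusters(Clusterer clusterer, List<String> elements) {
        int count = 0;
        for (List<String> cluster : clusterer.cluster(elements)) {
            if (!cluster.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Clusterer clusterer = new RoundRobinClusterer(2);
        System.out.println(clusterer.cluster(Arrays.asList("a", "b", "c")));
    }
}
```

Because countNonEmptyClusters only knows the base role, a different implementation can be swapped in without touching the consuming code.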
What are the benefits of the Tesla Role System?
For developers, the Role System can be seen as a kind of API, which makes it easy to use third party components - a role's functionality is well-documented and self-explanatory. It is easy to extend roles and implement custom methods, as the Role System guarantees compatibility with super roles. There is no need to convert your custom data structures into some framework-given representation, and there is also no need to analyze third party annotations to query for the data you're interested in.
For users, the Role System improves compatibility and exchangeability of components, and simplifies the creation of new experiment workflows.
What is a Component?
Tesla components are encapsulated units which process raw texts or annotations on texts to generate new annotations. Tesla components interact with each other via their results (annotations), on which they provide specific access (adapters).
What is an Annotation?
An Annotation is the fundamental act of associating some content with an experiment, a single text or a region in a text. This definition is based loosely on the same definition in the Atlas framework, with the significant difference that Annotations in Tesla can have a range spanning not only parts of a text but also one or more entire texts. This makes it possible not only to annotate tokens but also to build stereotypes, and to represent both with the same mechanism.
What is an Adapter?
Adapters are interfaces to existing or newly produced annotations of components. Each component owns one or more InputAdapters which enable access to the DataObjects consumed by the component. At the same time, a component provides an OutputAdapter which manages the persistence of the DataObjects it produces.
What is an Experiment?
An Experiment is a combination of various components and Texts, with the aim of analysing a linguistic issue and testing a hypothesis by computational means. An experiment can also involve testing a new component. Tesla maintains a seamless audit trail of the complete experiment: The components involved in the experiment, including their configuration and the processed documents, are stored in Tesla's database. This puts researchers in a position to fully interpret and understand an experiment and also complies with fundamental requirements of scientific research. It is not necessary to modify any source code of the respective components in order to perform an experiment. This opens new vistas to linguists with unique expertise in their respective field - they do not necessarily require any programming skills in order to use the system.
What is a DataObject? What kind of data structures are supported by Tesla?
A DataObject is the content of an Annotation. One of the goals of Tesla is to support every data structure that can be implemented in Java, from simple "structs" of primitive data types to highly complex data types such as graphs of several thousand Java objects that contain not only the usual get/set-methods, but also more complex methods. Every existing Java class can be converted into a data structure Tesla can use, although there still are limitations. A Tesla DataObject has to fulfil three conditions:

  • It has to implement java.io.Serializable, just as any class that is referenced as a field within the DataObject must do
  • It has to implement Tesla's DataObject-Interface, which means that a field named "id" of type "long" has to be present, combined with the two methods "getId()" and "setId()"
  • It must be persistable in the database of your choice.
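The first two conditions can be illustrated with a short sketch. Note that SketchDataObject below is a hypothetical stand-in we define ourselves, since the exact signature of Tesla's DataObject interface is not given here; TokenCount and DataObjectDemo are likewise made-up names. The third condition depends on the chosen database and is not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for Tesla's DataObject interface; the real interface may differ.
interface SketchDataObject {
    long getId();
    void setId(long id);
}

// Example data structure: serializable, with an "id" field of type long and its accessors.
class TokenCount implements Serializable, SketchDataObject {
    private static final long serialVersionUID = 1L;

    private long id;
    private final String token; // String is itself Serializable
    private final int count;

    TokenCount(String token, int count) {
        this.token = token;
        this.count = count;
    }

    public long getId() { return id; }
    public void setId(long id) { this.id = id; }
    String getToken() { return token; }
    int getCount() { return count; }
}

class DataObjectDemo {
    // Serializes an object to bytes and reads it back, to check that it really is serializable.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T object) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(object);
        out.close();
        ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        try {
            return (T) in.readObject();
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws Exception {
        TokenCount original = new TokenCount("corpus", 3);
        original.setId(42L);
        TokenCount copy = roundTrip(original);
        System.out.println(copy.getId() + " " + copy.getToken() + " " + copy.getCount());
    }
}
```

A round trip like this is a quick way to find fields whose classes do not implement java.io.Serializable, because writeObject fails with a NotSerializableException in that case.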

User Questions

Which corpora and document formats are supported?
The answer depends on the definition of 'support': In general, you can process any text document in any encoding. Beyond that, Tesla contains Reader Components for the TIGER and the BNC corpus which convert the annotations of these corpora into the Tesla Role System. Additionally, a general-purpose reader (based on Apache Tika) can be used to extract the textual content of various document formats, such as RTF, PDF, MS Office, ODF and HTML.
An overview of all Readers is given in the Tesla Reader section.
Which components are available?
Currently there are components for Sentence Detection, Tokenization, POS-Tagging, Named Entity Recognition, Word Vector Generation, Clustering and Syntax Parsing, but some of them are still work in progress. Visit the Tesla Components section for a complete overview.

Developer Questions

How can I implement a Tesla component? What is the Tesla IDE?
Tesla is not only a linguistic component framework, but also (thanks to Eclipse as the underlying framework) a linguistic IDE. If you have ever used Eclipse, you'll easily understand the IDE concepts that Tesla adds. As an example: To create a new component you use the "Tesla Component Wizard", which looks much like the default "Java Project Wizard" of Eclipse, but has some additional fields and functions that are needed for Tesla. Tesla's IDE also features Role Wizards, Quick Fixes and more - you can even debug a Component running on a server! There's still much to do, but we're working hard to make the development of Tesla components as simple as "Hello World". If you're interested in developing new components, you might want to read the How to create a new Tesla Component section.
What kind of Adapter should I use?
That choice depends on the kind of data your Component will produce. We're currently using three different adapter families: one for relational databases, one for object databases and one for a custom database for token-based annotations. You can use one of the related classes as a superclass of your adapter.
Which programming languages are supported?
Tesla is written in Java, so Java is the recommended programming language. However, we're currently adding support for other languages, such as Python (through Jython) or Scala.
How can I get the source code?
We're currently planning to migrate from our SVN repository to GitHub - please check this page again in a few weeks.

Even more Questions

How do I configure...?
You're right, we need to add more documentation to our website. Most server configuration is done by modifying the files found in the configuration directory of the Tesla server - these files will be documented soon.
My question is not listed here - what should I do?
Please become a member of the Tesla Mailing List spinfo-tesla and ask your question there.