Projects

Ongoing Projects

Artificial Intelligence for Defence

The AI4DEF project is set to pave the way for accelerated development and application of AI in defence to maintain European sovereignty and excellence in this area. The project aims to demonstrate the ability and benefits of AI systems to strengthen situational awareness in various situations where decisions must be made. With its four use cases, AI4DEF will cover a functional approach and a technology assessment to match these functional capabilities. The use cases derive from military needs and build on existing solutions and technologies.

Earth Observation Multi-mission federation layer

The European space industry is moving towards an architecture where satellites and ground sensors are integrated into a federation system to make Earth observation assets available to business and public services. The DOMINO-E project aims at solving the key challenge of availability and reactivity of Earth observation from space. The technology will enable multi-mission accessibility in a scalable and automated way that allows the end-user to address a variety of acquisition assets by implementing a multi-mission and multi-sensor federation layer and using scheduling and optimization algorithms.

HumanE-AI-Net

The HumanE AI Net brings together top European research centers, universities and key industrial champions into a network of centers of excellence that goes beyond a narrow definition of AI and combines world leading AI competence with key players in related areas such as HCI, cognitive science, social sciences and complexity science. The project aims to develop the scientific foundations and technological breakthroughs needed to shape the AI revolution in a direction that is beneficial to humans both individually and societally, and adheres to European ethical values and social, cultural, legal, and political norms. The goal is to facilitate AI systems that enhance human capabilities and empower individuals and society as a whole while respecting human autonomy and self-determination.

A Competitive Intelligence Cloud/High Performance Computing Platform for Artificial Intelligence-based Science, Technology and Innovation Policy Making

The objective of IntelComp is to deliver a platform that provides tools for assisting the whole spectrum of Science, Technology and Innovation (STI) policy, i.e., agenda setting, modeling design, implementation, monitoring and evaluation. It will do so by involving multi-disciplinary teams to co-develop innovative analytics services, Natural Language Processing pipelines and Artificial Intelligence workflows and by exploiting open data, services and computational resources from the EOSC, HPC environments and federated distributed operations at the European Union, national and regional level. It will ensure a cooperative environment where different actors can visualize, interact and analyze information.

NExt ApplicationS of Quantum Computing

The NEASQC project brings together academic experts and industrial end-users to investigate and develop a new breed of Quantum-enabled applications that can take advantage of NISQ (Noisy Intermediate-Scale Quantum) systems in the near future. An important objective of the project is to build an active European community of applied QC. A ready-to-install quantum programming environment (QPE) will be built and made available for free to the community.

EUropean Cyber and INFormation warfare toolbox

EUCINF will address the development of a coherent European library of configurable software components that are easy to integrate into Cyber and Information Warfare systems, with capabilities in detection, analysis, fusion, and threat targeting to support activities of Cyber and Operational Centres. The project will study, design, prototype, test and demonstrate cutting-edge capabilities in the domain of Cyber and Information Warfare through a toolbox, i.e. a holistic system which embeds a coherent set of components, an interoperability Framework and its associated Testbed, and a Store able to host components and their associated metadata.

STARLIGHT

Law enforcement agencies' (LEAs) data-rich environments provide the opportunity to adopt artificial intelligence tools and capabilities that improve investigatory practices and limit the criminal misuse of AI. Through STARLIGHT, LEAs will collaboratively develop their autonomy and resilience in the use of AI for tackling major criminal threats.

STARLIGHT aims to create a community that brings together LEAs, researchers, industry and practitioners in the security ecosystem under a coordinated and strategic effort to bring AI into operational practices.

Ease the Engagement of Low-Tech users to the AI-on-Demand platform through AI

The StairwAI project targets low-tech users with the goal of facilitating their engagement on the AI on-demand Platform. This will be achieved through a new service layer enriching the functionalities of the on-demand platform and containing: (1) a multi-lingual interaction layer enabling conversations with the Platform in the user’s own language, (2) a horizontal matchmaking service for the automatic discovery of AI assets (tools, data sets, AI experts, consultants, papers, courses etc.) meeting the user’s business needs, and (3) a vertical matchmaking service that will dimension and provision hardware resources through a proper hardware provider (HPC, Cloud and Edge infrastructures).

NLTP: The National Language Technology Platform

NLTP brings together the latest language technologies to give European public entities easy-to-use tools to simplify communication in many languages. The project aims to unite the most advanced LT tools and solutions developed in the CEF AT and other European and national programmes in a novel state-of-the-art, Artificial Intelligence (AI) driven software solution – NLTP. NLTP will provide national public administrations, SMEs and the general public with mature, tightly integrated Machine Translation (MT) and other LT services (e.g. terminology management, translation memories, speech tools for selected languages, etc.) that will serve as an efficient way to enable multilingual access to information and public online services.

European Language Data Space

Through the Language Data Space (LDS) relevant stakeholders, e.g., from the publishing, language technology or press industry, will be able to share and also monetise their language data and other language resources (e.g., language models) through a single platform, taking EU values and compliance with EU rules fully into account. With the creation of the LDS platform, we aim at marking a turning point in the approach to the collection of language resources: the LDS will help European industry to compete globally with the language technology services provided by US or Chinese companies, and to build trust throughout the language data sharing process.

Completed Projects

Analysis and Evaluation of Comparable Corpora for Under Resourced Areas of machine Translation

The aim of the ACCURAT project was to research methods and techniques to overcome one of the central problems of machine translation (MT) – the lack of linguistic resources for under-resourced areas of machine translation. The main goal was to find, analyze and evaluate novel methods that exploit comparable corpora in order to compensate for the shortage of linguistic resources, and ultimately to significantly improve MT quality for under-resourced languages and narrow domains.

Project website

A European AI On Demand Platform and Ecosystem

AI4EU aimed to build a comprehensive European AI-on-demand platform to lower barriers to innovation, to boost technology transfer and catalyse the growth of start-ups and SMEs in all sectors through Open calls and other actions. The platform acts as a broker, developer and one-stop shop providing and showcasing services, expertise, algorithms, software frameworks, development tools, components, modules, data, computing resources, prototyping functions and access to funding.

Big Data Value ecosystem

The mission of BDVe is to support the Big Data Value PPP in realizing a vibrant data-driven EU economy; in other words, BDVe supports the PPP so that its implementation is a success.
Behind that mission, there are multiple goals to achieve, one of them being a more competitive landscape of European Big Data providers.

CLARA: Common Language Resources and their Applications

CLARA aimed to provide researcher training in crucial areas related to language resources and infrastructure. The scope of the project was broad and included infrastructure design, lexical semantic modelling, domain modelling, multimedia and multimodal communication, applications, such as machine translation, and parsing technologies and grammar models. The project has resulted in new theoretical insights, new resources and tools and a new generation of researchers who can perform advanced research and development in language resources and technologies.

Project website

Cross-language Information Retrieval and Organisation of Text and Audio Documents

The aim of the CLARITY project was to develop cross-lingual information retrieval (CLIR) techniques for English -> Finnish, Swedish, Latvian and Lithuanian, i.e. low-density languages with minimal translation resources, and to investigate techniques of document organisation and presentation in concept hierarchies and by document genres and filters. Clarity was a fully-fledged retrieval system that supported the user during the whole process of query formulation, text retrieval and document browsing.

CLEOPATRA: Cross-lingual Event-centric Open Analytics Research Academy

The CLEOPATRA Marie Skłodowska-Curie Innovative Training Network aimed to make sense of the massive digital coverage generated by the intense disruption in Europe over the past decade – including appalling terrorist incidents and the dramatic movement of refugees and economic migrants. CLEOPATRA offered a unique interdisciplinary and cross-sectoral research and training programme, which explored how we can begin to analyse and understand the major events that influence and shape our lives and our societies.

META-FORUM 2020: Project Expo

Cost-effective, Multilingual, Privacy-driven voice-enabled Services

COMPRISE defines a fully private-by-design methodology and tools that reduce the cost and increase the inclusiveness of voice interaction technology through research advances on privacy-driven data transformations, personalised learning, automatic labelling, and integrated translation. This leads to a holistic easy-to-use software development kit interoperating with a cloud-based resource platform. The sustainability of this new ecosystem is being demonstrated for three sectors with high commercial impact: smart consumer apps, e-commerce, and e-health.

enetCollect - European Network for Combining Language Learning with Crowdsourcing Techniques

enetCollect aimed to unlock the crowdsourcing potential available for all languages and to trigger an innovation breakthrough in the production of language learning material, such as lesson or exercise content, and of language-related datasets, such as NLP language resources.

EASTIN CL (ICT PSP programme) – Crosslingual and Multimodal Search in a Portal for Support of Assisted Living

The project supports the e-inclusion of disabled and elderly people by providing crosslingual and multimodal support for accessing information bases on assistive tools and technology. Recent efforts have linked national assistive technology information bases into a European portal called EASTIN. The objective of EASTIN-CL was to enhance this portal by creating a front-end that makes it more accessible using language technology: multilingual technology allowing users to search the data in their native language, and multimodal technology allowing them to access the portal not just in written but also in spoken communication.

ELRC

European Language Resource Coordination

The objective of the project is to identify and gather language and translation data relevant to public administration across all 30 European countries. The ELRC action manages, maintains and coordinates the relevant language resources in all official languages of the EU and CEF associated countries. These activities will help to improve the quality, coverage and performance of automated translation solutions in the context of current and future CEF digital services.

European Language Grid

The European Language Grid (ELG) project addresses the fragmentation of the European Language Technology (LT) landscape by establishing the ELG as the primary platform for LT in Europe. The ELG aims to be a scalable cloud platform, providing, in an easy-to-integrate way, access to hundreds of commercial and non-commercial Language Technologies for all European languages, including running tools and services as well as data sets and resources.

European Language Resource Equality

Twenty-four official languages and more than 60 regional and minority languages constitute the fabric of the EU’s linguistic landscape. However, language barriers still hamper communication and the free flow of information across the EU. The primary goal of ELE is to prepare the European Language Equality Programme, in the form of a strategic research, innovation and implementation agenda and a roadmap for achieving full digital language equality in Europe by 2030.

EuroTermBank (eContent project) – Collection of Pan-European Terminology Resources through Cooperation of Terminology Institutions

The EuroTermBank project focused on harmonisation and consolidation of terminology work in new EU member states, transferring experience from other European Union terminology networks and accumulating competencies and efforts of the accession countries. The EuroTermBank project resulted in a centralized online terminology bank for languages of new EU member countries, interlinked to other terminology banks and resources.

Project website

eRMIONE (eTEN project) – E-Learning Resource Management Service for Interoperability Networks in the European Cultural Heritage Domain

The eRMIONE project aimed at making available a range of services supporting e-learning and improving knowledge acquisition, targeted to actors operating in the cultural heritage domain all over Europe. The final output of the eRMIONE project was an e-learning resource management service that delivers European cultural heritage material online to courses to bring enriched cultural exchanges to students at Higher Education Institutions from different countries.

FREME - Open Framework of E-Services for Multilingual and Semantic Enrichment of Digital Content

FREME addressed the general systemic and technological challenges of validating that multilingual and semantic technologies are ready for integration in real-life business cases in an innovative way. These technologies are capable of processing (harvesting and analysing) content, capturing datasets, and adding value throughout content and data value chains across sectors, countries, and languages.

Project website

Knowledge Complexity

The project is undertaking a 15-month investigation of the ways in which a focus on 'big data' in ICT research elides important issues about the information environment we live in. For its part in the project, Tilde will examine one of the greatest challenges for Big Data: the analysis and processing of multilingual content in unstructured texts.

LetsMT! - Platform for Online Sharing of Training Data and Building User Tailored Machine Translation

To fully exploit the huge potential of existing open SMT technologies, the project proposed to build an innovative online collaborative platform for data sharing and MT building. This platform supports upload of public as well as proprietary MT training data and building of multiple MT systems, public or proprietary, by combining and prioritizing this data.

Project website

Building the Legal Knowledge Graph for Smart Compliance Services in Multilingual Europe

Lynx envisioned an ecosystem of smart cloud services to better manage compliance documents: a one-stop shop for SMEs and companies operating internationally that seek legal information and knowledge-based services. Lynx relies on a Legal Knowledge Graph of heterogeneous compliance data sources (legislation, case law, standards, industry norms and best practices), duly interlinked and integrated. This ecosystem enables smart search, smart assistance and smart referencing of case law, as well as Artificial Intelligence technologies and machine translation of regulatory compliance documents.

MATT (EUREKA project) – Web-based Multilingual Automated Terminology Translation System

The goal of the project MATT was to develop a new web-based translation system for automated translation of multilingual terminology that bridges the gap between traditional local (desktop) translation tools and terminology data on the Internet. This unique translation technology is meant for both professional translators using specialised translation environments (for example, SDL Trados, Wordfast, Kilgray MemoQ), and for various experts and other users requiring easy access to high quality term resources from standard office environments (Microsoft Word, Microsoft PowerPoint, OpenOffice Writer, etc). The platform for multilingual terminology translation is also made available to machine translation technologies.

META-NORD (ICT PSP project) – Baltic and Nordic Parts of the European Open Linguistic Infrastructure

The META-NORD project aimed to establish an open linguistic infrastructure in the Baltic and Nordic countries to serve the needs of the industry and research communities. The project focused on 8 European languages - Danish, Estonian, Finnish, Icelandic, Latvian, Lithuanian, Norwegian and Swedish - that each have fewer than 10 million speakers. The project assembled, linked across languages, and made widely available language resources of different types used by different categories of user communities in academia and industry to create products and applications that facilitate linguistic diversity in the EU.

Project website

MIAUCE (FP6 project) – Multi Modal Interaction Analysis and Exploration of Users within a Controlled Environment

The project aimed to investigate and develop techniques to analyse the multi-modal behaviour of users within the context of real applications. The multi-modal behaviour takes the form of eye gaze/fixation, eye blink and body movement. The techniques were developed and validated within the context of three different application domains: Security, Customized marketing, and Interactive web TV.

MLi (FP7 project) – Towards a MultiLingual Data & Services Infrastructure

The MLi Support Action is working to deliver the strategic vision and operational specifications needed for building a comprehensive European MultiLingual data & services Infrastructure, along with a multiannual plan for its development and deployment, and to foster multi-stakeholder alliances ensuring its long-term sustainability.

Project website

eTranslation TermBank

Federated eTranslation TermBank network

The project stimulates the collection and provision of terminological resources in sector-specific domains and languages of interest to sector-specific digital public services.

Open Data Incubator for Europe (ODINE)

As part of its ODINE incubator project, Tilde gathered, created, and contributed new Multilingual Open Data sets for EU languages, which enable the language technology community to develop key services such as machine translation systems.

Project website

QT21 (H2020 project): Quality Translation 21

The project aimed to develop (1) substantially improved statistical and machine-learning based translation models for challenging languages and resource scenarios, (2) improved evaluation and continuous learning from mistakes, guided by a systematic analysis of quality barriers, informed by human translators, (3) all with a strong focus on scalability, to ensure that learning and decoding with these models is efficient and that reliance on data (annotated or not) is minimised.

Empowering and educating young people for the internet by playing

The Internet has become an integral part of children and young people’s lives. The basic mission of the RAYUELA project is to empower and educate young people, through play, in the benefits, risks and threats intrinsically linked to the use of the Internet, thus preventing and mitigating cybercriminal behavior. RAYUELA’s main goal is to better understand the drivers and human factors affecting certain relevant forms of cyber criminality, as well as to empower and educate young people.

SAFE (EUROSTARS project) – Social Analytics for Financial Engineering

The project result is a web-based news service presenting real-time social sentiment about a set of financial products. The news is drawn from multilingual (Latvian, Swedish, German, Dutch, Polish, French) social media sources (blogs, feeds). Tilde ensured multilingual social media translation for social sentiment analysis using mature SMT (statistical machine translation) systems specially adapted for the social network and financial domains. The news feed will be available as a free version listing the sentiment only, and as a paid subscription-based feed offering added services (links to the originating news message, personalization and archive functionality).

Project description

SEMO

The retrieval of metadata from various documents and their conversion into another format is one of the most significant problems faced by document processing systems. The goal of the SEMO project was to develop a novel intelligent technology that retrieves metadata from documents both in paper and electronic format regardless of their type, structure and language. With the successful implementation of the project, a universal technology is created suitable for use in various document processing systems.

SOLIM (EUROSTARS project) – Spatial Ontology Language for Multimedia Information Modelling

The objective of the SOLIM project was to improve context-aware information analysis by expansion of state of the art ontology languages and their support for automated reasoning by adding a spatial dimension. This enables semantic systems to venture beyond a static world and add the concepts of space and change.

TTC (FP7 project) – Terminology Extraction, Translation Tools and Comparable Corpora

The TTC project aimed at leveraging machine translation tools (MT tools), computer-assisted translation tools (CAT tools) and multilingual content management tools by automatically generating bilingual terminologies from comparable corpora in several European languages (i.e. English, French, German and Latvian) as well as in Chinese and Russian. Terms in different languages are aligned based on the similarity of the words next to them in the corpora (their immediate vicinity); this approach is known as lexical context analysis. The system generates candidate translations for single- or multi-word terms. The approach relies on the one-to-one relation between terms and concepts.
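
The idea of aligning terms by the words around them can be illustrated with a minimal Python sketch. It is not the TTC implementation: the function names, the toy English and French sentences, the small seed dictionary and the window size are all assumptions made for illustration. Each term gets a vector of neighbouring-word counts, the source vector is mapped into the target language through the seed dictionary, and candidate translations are ranked by cosine similarity.

```python
from collections import Counter
from math import sqrt


def context_vector(term, sentences, window=2):
    """Count the words found within `window` tokens of each occurrence of `term`."""
    counts = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok == term:
                lo = max(0, i - window)
                counts.update(tokens[lo:i] + tokens[i + 1:i + 1 + window])
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_candidates(term, src_sents, candidates, tgt_sents, seed_dict, window=2):
    """Rank target-language candidates by how similar their contexts are to the term's."""
    src_vec = context_vector(term, src_sents, window)
    # Map source context words into the target language (hypothetical seed dictionary).
    mapped = Counter()
    for word, count in src_vec.items():
        if word in seed_dict:
            mapped[seed_dict[word]] += count
    scored = [(c, cosine(mapped, context_vector(c, tgt_sents, window))) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy comparable corpora (illustrative data only).
english = [["the", "wind", "turbine", "generates", "power"],
           ["a", "turbine", "generates", "electric", "power"]]
french = [["la", "turbine", "produit", "courant"],
          ["le", "moulin", "moud", "grain"]]
seed = {"generates": "produit", "power": "courant"}

ranking = rank_candidates("turbine", english, ["turbine", "moulin"], french, seed)
print(ranking[0][0])  # best-scoring candidate
```

In this toy setup the French candidate whose neighbours match the translated English context ranks first. A real system would also need term extraction, multi-word term handling and a much larger seed dictionary.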

TRIPOD (FP6 project) – TRI-Partite Multimedia Object Description

The Tripod project aimed to automatically build rich multi-faceted text and semantic descriptions of the landscape and permanent man-made features pictured in a photograph, and to create a more advanced image search engine. Tripod augmented images with spatial data to compute contextual information about the location and features of the actual landscape pictured. Using 3D models, buildings and landscape features contained in the image are identified and located within the picture. Techniques from Web search and text summarisation were applied to automatically create textual descriptions of the photographs, producing a rich, readable and multifaceted caption that goes well beyond mere location and encompasses culturally encoded notions, such as socially connoted language of place (suburb, west end, etc.).

TaaS (FP7 project) – Terminology as a Service

The TaaS project addressed the need for instant access to the most up-to-date terms, user participation in the acquisition and sharing of multilingual terminological data, and efficient solutions for terminology reuse. The developed cloud-based TaaS platform provides the following online core terminology services: 1) automatic extraction of monolingual term candidates from user-uploaded documents using state-of-the-art terminology extraction techniques; 2) automatic recognition of translation equivalents for the extracted terms in user-defined target language(s) from different public and industry terminology databases; 3) automatic acquisition of translation equivalents for terms not found in term banks from parallel/comparable web data using state-of-the-art terminology extraction and bilingual terminology alignment methods; 4) facilities for users to clean automatically acquired terminology; 5) facilities for terminology sharing and reuse: APIs and export tools for sharing the resulting terminological data with major term banks and reusing it in different user applications.

Project website

User Focused Marian

The University of Edinburgh, Unbabel and Tilde will work towards improving the open-source Marian neural automated translation toolkit. This project will improve the Marian toolkit by adding new features commonly requested by its users: factors, forced translation, on-the-fly domain adaptation from translation memories, and GPU efficiency.
