Internationalization

Multilingual support in OERSI

Add new language

To add a new language to OERSI, we need to extend the frontend labels, the vocabularies, and the multilingual search. Please create a new issue in sidre-frontend for this with the title "Add translation into <YOUR-LANGUAGE>".

Frontend-Labels

OERSI-internal:

  • Create a language.json inside the new folder. It can be generated with get-language-labels.py (uses Wikidata).
  • In every existing translation.json, add a HEADER.CHANGE_LANGUAGE entry for the new language. Write the label in the new language itself, in all existing files.
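The second step can be scripted. The following is a minimal sketch, not part of the OERSI tooling. It assumes a layout of one folder per language containing a translation.json, and it assumes that HEADER.CHANGE_LANGUAGE is a nested JSON object keyed by language code. The function name add_language_label and both assumptions are hypothetical, so check them against the actual files before use.

```python
import json
from pathlib import Path


def add_language_label(locales_dir: Path, code: str, label: str) -> list[Path]:
    """Add HEADER.CHANGE_LANGUAGE.<code> = label to every translation.json.

    Assumed (hypothetical) layout: <locales_dir>/<language>/translation.json
    Returns the list of files that were actually modified.
    """
    changed = []
    for path in sorted(locales_dir.glob("*/translation.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        # Assumption: the key is stored as nested objects, not as a dotted string.
        entries = data.setdefault("HEADER", {}).setdefault("CHANGE_LANGUAGE", {})
        if entries.get(code) != label:
            entries[code] = label
            # ensure_ascii=False keeps labels such as "Español" readable.
            text = json.dumps(data, ensure_ascii=False, indent=2) + "\n"
            path.write_text(text, encoding="utf-8")
            changed.append(path)
    return changed
```

Calling add_language_label(Path("locales"), "es", "Español") would add the same label, written in the new language, to every existing file. Running it a second time changes nothing.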

Vocabularies

The ttl files of all vocabularies used by OERSI need to be extended: every skos:Concept entry needs an additional skos:prefLabel entry for the new language.

Example:

skos:prefLabel "Softwareanwendung"@de, "Software Application"@en, "TRANSLATE ME"@<YOUR-NEW-ISO639-1-CODE> .
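To find concepts that still lack a label in the new language, a simple line-based check can help. This is a rough sketch, not a Turtle parser: it assumes that all labels of a concept sit on a single line starting with skos:prefLabel, as in the example above. The function name labels_missing_language is hypothetical. For files formatted differently, a proper RDF library would be the safer choice.

```python
import re

# Matches a quoted literal followed by a language tag, e.g. "Course"@en
LANG_TAG = re.compile(r'"(?:[^"\\]|\\.)*"@([A-Za-z-]+)')


def labels_missing_language(ttl_text: str, code: str) -> list[str]:
    """Return the skos:prefLabel lines that carry no label tagged with `code`.

    Assumption: every skos:prefLabel statement fits on one line.
    """
    missing = []
    for line in ttl_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("skos:prefLabel"):
            tags = {tag.lower() for tag in LANG_TAG.findall(stripped)}
            if code.lower() not in tags:
                missing.append(stripped)
    return missing
```

Passing the content of a ttl file and the new language code returns the lines that still need a translation.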

Vocabularies:

External contributors: please attach these files to your merge request or issue.

OERSI-internal: please create a pull request directly in the corresponding GitHub repositories.

SIDRE-internal: we need to add the new language to the synonyms process (example file and/or automatic process). To do this, add the new language code to the Ansible variable internationalization_tool_search_index_output_languages.


Experimental / under development


We implemented a first, experimental feature that includes translations of the OERSI keywords into other languages in the search. It can be activated via the configuration in sidre-setup. A search for a German keyword will then, for example, also return the results for the English/Spanish/French/… keyword.

A tool (internationalization-tool) provides the translations of the keywords (based on Wikidata translations). From these translations, a synonym file for Elasticsearch is generated, which can be included in the configuration. An example of such a synonym file can be found here.
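To illustrate the idea, the sketch below turns a set of keyword translations into lines in the Solr synonym format that Elasticsearch accepts, where comma-separated terms on one line are treated as equivalent. It is not the code of the internationalization-tool. The function name synonym_lines and the shape of the input dictionary are assumptions made for this example.

```python
def synonym_lines(translations: dict[str, dict[str, str]]) -> list[str]:
    """Build one line of equivalent synonyms per keyword.

    `translations` maps a keyword to its translations by language code
    (hypothetical input shape). Terms containing a comma are skipped,
    because the comma is the separator in the synonym format.
    """
    lines = []
    for keyword in sorted(translations):
        terms = []
        for term in [keyword, *translations[keyword].values()]:
            term = term.strip()
            if term and "," not in term and term not in terms:
                terms.append(term)
        # A keyword without any distinct translation yields no synonym line.
        if len(terms) > 1:
            lines.append(", ".join(terms))
    return lines
```

Writing the returned lines to a text file, one per line, gives a file that a synonym token filter can load.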

The feature can be enabled or disabled via a toggle (Ansible variable search_index_features_use_synonyms in all.yml). If you change this setting, the setup adjusts the mapping and reindexes the index automatically. The synonym file is placed in the OERSI configuration directory (usually ~/conf) under the filename synonym.txt, and the setup links it into the Elasticsearch config directory. For testing purposes, the sample file can be used during the first phase of the feature development (set elasticsearch_synonyms_file: "synonyms-example.txt" and Ansible will copy it). The following configuration enables the feature using the example file:

search_index_features_use_synonyms: true
elasticsearch_synonyms_file: "synonyms-example.txt"

For now, the tool has to be activated separately. The following configuration enables the feature with automatic keyword translations via the internationalization-tool:

search_index_features_use_synonyms: true
internationalization_tool_install: "{{ search_index_features_use_synonyms }}"

The synonyms are applied at search time in Elasticsearch, not at index time. If the synonym file changes, the search analyzers need to be reloaded so that the changes take effect in the search. This can be done with the script reload-search_analyzer.sh.
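For reference, Elasticsearch exposes a reload search analyzers API at /<index>/_reload_search_analyzers. The sketch below shows how such a request could be sent from Python. It is not a copy of reload-search_analyzer.sh, whose contents are not shown here. The base URL and the index name oer_data in the usage note are assumptions. The call only has an effect when the synonym filter is defined as updateable in the index settings.

```python
import urllib.request
from urllib.parse import quote


def build_reload_request(base_url: str, index: str) -> urllib.request.Request:
    """Build the POST request for the reload search analyzers API."""
    url = f"{base_url.rstrip('/')}/{quote(index, safe='')}/_reload_search_analyzers"
    return urllib.request.Request(url, method="POST")


def reload_search_analyzers(base_url: str, index: str) -> str:
    """Send the request and return the raw JSON response body."""
    request = build_reload_request(base_url, index)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

A call such as reload_search_analyzers("http://localhost:9200", "oer_data") would trigger the reload on a local instance, provided that the index exists under that (assumed) name and no authentication is required.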