Parallel text typology dataset

Kurfalı, Murathan

doi:10.5281/zenodo.7506220

https://doi.org/10.5281/zenodo.7506220

This repository contains data accompanying the paper "Neural models can sometimes discover typological generalizations", currently being submitted for publication. It contains the following information for 1295 different languages:

- language vector representations from a range of neural models
- automatically derived lists of affixes
- automatically derived lists of inflectional paradigms
- typological features derived from annotation projection, and statistics on dependency relations
- typological features derived from classifiers trained on language vectors and typological databases
- automatically derived word lists
- data needed for automatic evaluation of language representations (code in separate repository)

Note that the multilingual word embeddings described in the paper are very large and are therefore distributed in a separate public repository.

The computations were enabled by resources provided by the Swedish National Infrastructure for Computing (SNIC) at C3SE, partially funded by the Swedish Research Council through grant agreement no. 2018-05973. This work was funded in part by the Swedish Research Council through grant agreement no. 2019-04129.

Citation and access

Data access level:

Data are freely accessible

Creator/Principal investigator(s):

Research principal:

Stockholm University

Citation:

License:

Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

Language:

English

Administrative information

Topic and keywords

Relations

Metadata
