Researchdata.se

SweDiagnostics
https://doi.org/10.23695/YEPN-SE26
I. IDENTIFYING INFORMATION
Title*: SuperLim Diagnostic Dataset, v1.1
Subtitle:
Created by*: Felix Morger, Gothenburg University (felix.morger@gu.se)
Publisher(s)*: Språkbanken Text (sb-info@svenska.gu.se)
Link(s) / permanent identifier(s)*: https://spraakbanken.gu.se/en/resources/superlim
License(s)*: CC BY 4.0
Abstract*: Manual Swedish translation of all 1106 sentence pairs of the SuperGLUE diagnostic dataset.
Funded by*: Vinnova (grants no. 2020-02523, 2021-04165)
Cite as:
Related datasets: SuperLim, SuperGLUE diagnostic dataset, FraCaS test suite

II. USAGE
Key applications: Fine-grained analysis of system performance on a broad range of linguistic phenomena.
Intended task(s)/usage(s): Natural language inference.
Recommended evaluation measures: Krippendorff's alpha (the official SuperLim measure), Matthews' correlation coefficient.
Dataset function(s): Diagnostics
Recommended split(s): Test only

III. DATA
Primary data*: Text
Language*: Swedish
Dataset in numbers*: 1106 sentence pairs
Nature of the content*: Pairs of sentences annotated with their inference relation and with the linguistic phenomena that account for their differences.
Format*: JSONL and TSV. Nine columns/objects: id; four columns with information about the relevant linguistic phenomena; domain; label; premise; hypothesis.
Data source(s)*: SuperGLUE Diagnostic Dataset: Pruksachatkun, Yada & Nangia, Nikita & Singh, Amanpreet & Michael, Julian & Hill, Felix & Levy, Omer & Bowman, Samuel. (2019). SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems.
Data collection method(s)*: See original source.
Data selection and filtering*: See original source.
Data preprocessing*: See original source.
Data labeling*: Some data labels (annotations) were changed to fit the Swedish examples, but in general the aim was to keep such changes to a minimum.
Annotator characteristics:

IV. ETHICS AND CAVEATS
Ethical considerations: See original data source.
Things to watch out for: See original data source.

V. ABOUT DOCUMENTATION
Data last updated*: 2023-03-01, v1.1
Which changes have been made, compared to the previous version*: Minor format changes
Access to previous versions:
This document created*: 2021-06-04, Felix Morger.
This document last updated*: 2023-04-02, Aleksandrs Berdicevskis.
Where to look for further details:
Documentation template version*: v1.1

VI. OTHER
Related projects:
References:
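
As a rough illustration of how the JSONL format and one of the recommended measures fit together, the sketch below parses JSONL records and computes Matthews' correlation coefficient between gold and predicted labels. It is a minimal sketch under stated assumptions, not the official SuperLim evaluation code: the field names id, label, premise and hypothesis come from the format description above, while the label values ("entailment", "not_entailment"), the sample sentences, and the helper names load_jsonl and matthews_corrcoef are hypothetical and chosen only for this example. The multi-class form of the coefficient is used so that the function does not depend on how many label values the dataset actually has.

```python
import json
import math
from collections import Counter


def load_jsonl(text):
    """Parse JSONL text: one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def matthews_corrcoef(gold, pred):
    """Multi-class Matthews correlation coefficient.

    For two labels this reduces to the familiar binary formula.
    Returns 0.0 when the denominator is zero (e.g. constant predictions).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    total = len(gold)
    correct = sum(g == p for g, p in zip(gold, pred))
    gold_counts = Counter(gold)   # how often each label truly occurs
    pred_counts = Counter(pred)   # how often each label is predicted
    numerator = correct * total - sum(
        gold_counts[k] * pred_counts[k] for k in gold_counts
    )
    denominator = math.sqrt(
        (total**2 - sum(v * v for v in pred_counts.values()))
        * (total**2 - sum(v * v for v in gold_counts.values()))
    )
    return numerator / denominator if denominator else 0.0


# Invented sample records (NOT taken from the dataset); label values are assumed.
SAMPLE = """
{"id": 1, "label": "entailment", "premise": "A", "hypothesis": "B"}
{"id": 2, "label": "not_entailment", "premise": "C", "hypothesis": "D"}
{"id": 3, "label": "entailment", "premise": "E", "hypothesis": "F"}
{"id": 4, "label": "not_entailment", "premise": "G", "hypothesis": "H"}
"""

records = load_jsonl(SAMPLE)
gold = [r["label"] for r in records]
# Hypothetical system output with one mistake on the third item.
pred = ["entailment", "not_entailment", "not_entailment", "not_entailment"]

print(f"MCC: {matthews_corrcoef(gold, pred):.3f}")
```

Krippendorff's alpha, the official SuperLim measure, is not covered by this sketch; an established implementation would be a safer choice than a hand-written one for reported results.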
