Researchdata.se

The Arabic E-Book Corpus
https://doi.org/10.23695/XWZ6-JV19
The Arabic E-Book Corpus is a freely available collection of 1,745 books (81.5 million words) published by the Hindawi Foundation between 2008 and 2024. The books span various genres, including non-fiction, novels, children's literature, poetry, and plays. The corpus is available for download in two versions: HTML and unformatted plain text. The plain-text version will be appropriate for most purposes. For additional detail, see Hallberg, A. (2025). An 81-million-word multi-genre corpus of Arabic books. Data in Brief, 60, 111456.
