Productivity of Derivational Affixes in Surani Kurdish Language: Corpus-based Approach

Document Type : Original Article

Authors

1 PhD in linguistics, Department of Linguistics, Faculty of Language and Humanities, Bu-Ali Sina University, Hamedan, Iran

2 Master's student in linguistics, Department of English Language and Linguistics, Faculty of Language and Literature, University of Kurdistan, Sanandaj, Iran

3 Associate professor in general linguistics, Department of English Language and Linguistics, Faculty of Language and Literature, University of Kurdistan, Sanandaj, Iran.

Abstract

The present research investigates the productivity of derivational affixes in Sorani Kurdish, based on the “Kurdpress” corpus. In morphology, the productivity of a suffix is expressed as the number of new words created by that suffix. Accordingly, the study takes a corpus-based approach to the frequency of Kurdish derivational affixes in word-final position. In the first step, a list of Kurdish suffixes was compiled. Then the required data were collected from the Kurdpress news network with a web crawler written in Python. The productivity of each suffix was measured with the P value criterion. The results show that the suffixes “چی، دان، ین، زن، یلانە” are among the most productive in Kurdish, each with a productivity rate of more than 50%. The study also discusses the difference between the productivity of an affix and its importance: the importance of an affix is reflected in the number of words formed with it that are in use in the vocabulary of the language.
 
Extended abstract
1. Introduction
Affixes in a language differ in how productive they are in word formation. Some affixes generate more new words than others and can be considered more active in the process. Their productivity may also change over time, which shifts their usage. This study investigates the derivational affixes of Kurdish and measures their productivity on a large body of data. It also examines how the variables involved relate to the productivity rates of these affixes. The primary objective is to identify which derivational affix of Kurdish is the most productive, and thereby to fill a gap in the literature on this aspect of the language.
 
2. Theoretical framework
Studies in morphology are typically concerned with topics such as affixation, the position and types of affixes, affix productivity, affix importance, and changes in affixes over time. The two types of affix found in most languages are prefixes and suffixes. A prefix is a bound morpheme attached to the beginning of a base; a suffix is a bound morpheme attached to the end of a base. Some affixes are more productive in word formation and create a greater number of new words, while others are less productive. Productivity may also change over time: an affix that is highly productive in one period can become less productive in another. Affix productivity is defined as the number of new words created by an affix in word formation, and the importance of an affix is measured by the number of words that have been formed with it over time (diachronically). The token frequency of a unit is the number of times it occurs in the text, counting every repetition. Its type frequency is the number of distinct forms in which it occurs in the corpus, disregarding repetition. Affix productivity is calculated by dividing the number of words with the affix that occur only once, known as hapax legomena, by the total number of tokens in the corpus. The formulas for affix productivity and importance are presented below.
 
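Hapax Token Rate: $\mathrm{HTR} = \dfrac{n_1}{N}$
Type Token Rate: $\mathrm{TTR} = \dfrac{V}{N}$
Here $n_1$ is the number of hapax legomena formed with the affix, $V$ is the number of types formed with it, and $N$ is the number of tokens.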
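As a worked illustration of the two rates, the following minimal Python sketch computes them from raw counts. The function names and the example counts (12 hapax legomena, 40 types, 200 tokens) are hypothetical and are not taken from the study's data.

```python
def hapax_token_rate(hapax_count: int, token_count: int) -> float:
    """HTR: number of hapax legomena divided by the number of tokens."""
    if token_count <= 0:
        raise ValueError("token_count must be positive")
    return hapax_count / token_count


def type_token_rate(type_count: int, token_count: int) -> float:
    """TTR: number of types divided by the number of tokens."""
    if token_count <= 0:
        raise ValueError("token_count must be positive")
    return type_count / token_count


# Hypothetical counts for one suffix (illustration only, not study data).
example_htr = hapax_token_rate(hapax_count=12, token_count=200)
example_ttr = type_token_rate(type_count=40, token_count=200)
print("HTR = {:.2%}, TTR = {:.2%}".format(example_htr, example_ttr))
```

Because every hapax legomenon is also a type, HTR can never exceed TTR when both use the same token count.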
 
3. Methodology
For corpus-based studies in Kurdish linguistics, the most important issue is the availability of suitable data for analysis. The first step was therefore to collect sufficient data, using the Kurdish language corpus and web crawling. The initial version of the corpus consisted of 69,000 news documents from different news categories, collected with a web crawler written in Python 3.4 that targeted news sources.
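The crawler itself is not described in detail here. As a rough sketch of one step such a crawler needs, the code below extracts the visible text from an already downloaded HTML page using only the Python standard library. The class name `TextExtractor`, the function `extract_text`, and the sample page are assumptions made for illustration; the page structure of the actual news site is not known from this text.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and ignores <script> and <style> content."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def extract_text(html):
    """Return the visible text of one HTML document as a single string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


# Hypothetical page content; a real crawler would first download each page.
sample_page = (
    "<html><head><style>p{}</style></head><body>"
    "<h1>Title</h1><p>Hello <b>world</b></p>"
    "<script>var x=1;</script></body></html>"
)
print(extract_text(sample_page))
```

A full crawler would additionally fetch pages, follow links to news items, and store one document per item; those parts depend on the site and are left out.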
In the second step, a list of derivational suffixes in Kurdish was compiled from Kurdish grammar references, including a description of the word structure of the Sorani dialect. Since the number of words that occur only once (hapax legomena) is commonly used to examine the productivity of suffixes, the frequency of every word in the corpus was calculated with Python scripts once the data and the suffix list were ready. Words with a frequency of one were then identified, and the number of such words formed with each suffix was counted. Suffixes that produced fewer than five hapax legomena were removed to obtain more accurate results.
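The counting procedure described above can be sketched as follows. This is a simplified illustration, not the study's script: the names `tokenize` and `suffix_statistics` are hypothetical, the tokens are artificial Latin-letter strings rather than Kurdish data, and matching a suffix by plain word ending is an assumption that would also count words that merely end in the same letters.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into word tokens (runs of Unicode word characters)."""
    return re.findall(r"\w+", text)


def suffix_statistics(tokens, suffixes, min_hapax=5):
    """Count types, tokens and hapax legomena for each suffix.

    Suffixes with fewer than `min_hapax` hapax legomena are dropped,
    mirroring the threshold of five mentioned in the text.
    """
    freq = Counter(tokens)
    stats = {}
    for suffix in suffixes:
        # Words ending in the suffix and longer than the suffix itself.
        words = {
            word: count
            for word, count in freq.items()
            if word.endswith(suffix) and len(word) > len(suffix)
        }
        hapax = sum(1 for count in words.values() if count == 1)
        if hapax < min_hapax:
            continue
        stats[suffix] = {
            "types": len(words),
            "tokens": sum(words.values()),
            "hapax": hapax,
        }
    return stats


# Artificial toy tokens; the threshold is lowered to 2 for the tiny sample.
toy_tokens = tokenize("aadan bbdan bbdan ccdan ddger")
toy_stats = suffix_statistics(toy_tokens, ["dan", "ger"], min_hapax=2)
print(toy_stats)
```

In the toy sample the suffix "ger" is dropped because it has only one hapax legomenon. On real data, overlapping suffixes (one suffix being the ending of another) would need a longest-match rule or a morphological check.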
 
4. Results and discussion
After the derivational suffixes were identified, the numbers of types, tokens, and hapax legomena were counted for each morpheme occurring at the end of the examined words. The results, along with the productivity and importance values of the morphemes, are displayed in a table within the article. Productivity is generally regarded as a spectrum, with less productive units on the left and more productive units on the right. Another noteworthy point is that, in studies of word structure, the frequency of the examined units alone cannot give accurate results, because various factors may influence the frequency of words.
 
5. Conclusion and Suggestions
This study discusses the central concept of productivity. The suffixes “chi”, “dan”, “yan”, “zan”, and “ylaneh” were found to be among the most productive, each with a productivity rate above 50%. The average productivity rate of suffixes in Kurdish is 37.25%. Future research could complement this study by examining the prefixes of Kurdish, the order of affixes, and the hierarchy that affects the productivity and importance of affixes.
 
Select Bibliography
Kohanzad P., Fallahi M., Pahlevanzadeh B. A Corpus-based Study of the Productivity of Derivational Affixes in Persian. Journal of Researches in Linguistics. 2021; 2(23): 219-240. [in Persian]
Badakhshan E. Kurdish corpus project. International Institute for the Study of Kurdish Societies First Biennial Conference, Germany, Frankfurt. 2017; 16-19.
Gaeta L., Ricca D. Productivity in Italian word formation: A variable-corpus approach. Berlin: De Gruyter Mouton. 2006; 44(1): 57-89.
Montero-Fleta B. Suffixes in word-formation processes in scientific English. LSP Journal-Language for special purposes, professional communication, knowledge management and cognition. 2011; 2(2): 4-14.
Motsch W. On inactivity, productivity and analogy in derivational processes. In The Contribution of Word-Structure-Theories to the Study of Word Formation. 2018; 1-30.
Stefanowitsch A. Corpus linguistics: A guide to the methodology. Berlin: Language Science Press; 2020. DOI: 10.5281/zenodo.3735822
Ten H. P. Productivity and Anticipation in Language Processing. SKASE Journal of Theoretical Linguistics. 2020; 17(4): 23-36.
