Doctorate (PhD)
Science and Technology
University of Southampton
Area of Excellence | Academic and Research |
Published Research |
|
Paper (1): | |
Paper title: | A Large-Scale Dataset of Popular Open Source Projects |
Link to paper: | http://www.jcomputers.us/vol14/jcp1404-01.pdf |
Publication date: | 11/11/2019 |
Abstract: | Online open source software repositories offer a wealth of information about software artifacts and the development process, making them a valuable source of research data. Mining software repositories and retrieving project data from them provides an opportunity to build large-scale datasets of selected, high-quality, real project data. Such datasets could be used to empirically validate assumptions, test hypotheses, and verify anecdotal claims about software development processes and the resulting artifacts. Moreover, publishing them would make replication and verification of studies possible, which in turn can enhance research quality. Thus, in this work, we publish a large-scale dataset of 4349 projects in 11 general-purpose programming languages, gathered from GitHub repositories where a primary language can be identified. Uses of such a dataset range from empirically validating claims in the software engineering field to serving as machine learning training and test sets. |
Scientific Conferences |
|
Conference (1): | |
Conference title: | ACM SIGSOFT Software Engineering |
Date held: | 16/03/2019 |
Location: | Finland, Oulu |
Type of participation: | Paper |
Contribution title: | Assessing Programming Language Impact on Software Development Productivity Based on Mining OSS Repositories |
Contribution summary: | This study investigates the impact of high-level, general-purpose programming languages on software development productivity and quality. In particular, it compares scripting languages with traditionally compiled system programming languages to examine differences, if any. The research data comes from open source repositories gathered from GitHub. The results will be based on the analysis of possibly the largest open source dataset, examining a population of 15,000 projects and including a sample of 4349 projects where a main language can be identified. The investigation so far has revealed considerable differences in productivity between the two language groups. |
Conference (2): | |
Conference title: | ACM SIGPLAN International Conference on Systems, Programming, Languages, and Applications |
Date held: | 19/10/2019 |
Location: | Greece, Athens |
Type of participation: | Paper |
Contribution title: | An Empirical Study of Programming Language Effect on Open Source Software Development |
Contribution summary: | Language designers and early adopters make various claims about their languages to differentiate them from others and attract users. Unfortunately, some of these claims are not supported by strong evidence. Moreover, the nature of languages as a special kind of software tool makes it difficult to find objective measures to quantify and compare them per se. One approach to providing objective information about languages is empirical comparison. Hence, this research studies the usage and practice of programming languages by mining modern, popular, existing software repositories in order to understand and characterize their effect on open source software development. That is, it compares open source projects written in different languages to understand the similarities and examine the differences among them. |