Erfan Al-Hossami


2023

pdf bib
Socratic Questioning of Novice Debuggers: A Benchmark Dataset and Preliminary Evaluations
Erfan Al-Hossami | Razvan Bunescu | Ryan Teehan | Laurel Powell | Khyati Mahajan | Mohsen Dorodchi
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)

Socratic questioning is a teaching strategy where the student is guided towards solving a problem on their own, instead of being given the solution directly. In this paper, we introduce a dataset of Socratic conversations where an instructor helps a novice programmer fix buggy solutions to simple computational problems. The dataset is then used for benchmarking the Socratic debugging abilities of GPT-based language models. While GPT-4 is observed to perform much better than GPT-3.5, its precision, and recall still fall short of human expert abilities, motivating further work in this area.

2021

pdf bib
TeamUNCC@LT-EDI-EACL2021: Hope Speech Detection using Transfer Learning with Transformers
Khyati Mahajan | Erfan Al-Hossami | Samira Shaikh
Proceedings of the First Workshop on Language Technology for Equality, Diversity and Inclusion

In this paper, we describe our approach towards utilizing pre-trained models for the task of hope speech detection. We participated in Task 2: Hope Speech Detection for Equality, Diversity and Inclusion at LT-EDI-2021 @ EACL2021. The goal of this task is to predict the presence of hope speech, along with the presence of samples that do not belong to the same language in the dataset. We describe our approach to fine-tuning RoBERTa for Hope Speech detection in English and our approach to fine-tuning XLM-RoBERTa for Hope Speech detection in Tamil and Malayalam, two low resource Indic languages. We demonstrate the performance of our approach on classifying text into hope-speech, non-hope and not-language. Our approach ranked 1st in English (F1 = 0.93), 1st in Tamil (F1 = 0.61) and 3rd in Malayalam (F1 = 0.83).

pdf bib
Shellcode_IA32: A Dataset for Automatic Shellcode Generation
Pietro Liguori | Erfan Al-Hossami | Domenico Cotroneo | Roberto Natella | Bojan Cukic | Samira Shaikh
Proceedings of the 1st Workshop on Natural Language Processing for Programming (NLP4Prog 2021)

We take the first step to address the task of automatically generating shellcodes, i.e., small pieces of code used as a payload in the exploitation of a software vulnerability, starting from natural language comments. We assemble and release a novel dataset (Shellcode_IA32), consisting of challenging but common assembly instructions with their natural language descriptions. We experiment with standard methods in neural machine translation (NMT) to establish baseline performance levels on this task.