UMBC at SemEval-2018 Task 8: Understanding Text about Malware

Ankur Padia, Arpita Roy, Taneeya Satyapanich, Francis Ferraro, Shimei Pan, Youngja Park, Anupam Joshi, Tim Finin


Abstract
We describe the systems developed by the UMBC team for SemEval-2018 Task 8, SecureNLP (Semantic Extraction from CybersecUrity REports using Natural Language Processing). We participated in three of the sub-tasks: (1) classifying sentences as relevant or irrelevant to malware, (2) predicting token labels for sentences, and (4) predicting attribute labels from the Malware Attribute Enumeration and Characterization vocabulary for defining malware characteristics. We achieve F1 scores of 50.34/18.0 (dev/test), 22.23 (test data), and 31.98 (test data) for Task 1, Task 2, and Task 4, respectively. We also make our cybersecurity embeddings publicly available at http://bit.ly/cyber2vec.
Anthology ID:
S18-1142
Volume:
Proceedings of the 12th International Workshop on Semantic Evaluation
Month:
June
Year:
2018
Address:
New Orleans, Louisiana
Editors:
Marianna Apidianaki, Saif M. Mohammad, Jonathan May, Ekaterina Shutova, Steven Bethard, Marine Carpuat
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
878–884
URL:
https://aclanthology.org/S18-1142
DOI:
10.18653/v1/S18-1142
Cite (ACL):
Ankur Padia, Arpita Roy, Taneeya Satyapanich, Francis Ferraro, Shimei Pan, Youngja Park, Anupam Joshi, and Tim Finin. 2018. UMBC at SemEval-2018 Task 8: Understanding Text about Malware. In Proceedings of the 12th International Workshop on Semantic Evaluation, pages 878–884, New Orleans, Louisiana. Association for Computational Linguistics.
Cite (Informal):
UMBC at SemEval-2018 Task 8: Understanding Text about Malware (Padia et al., SemEval 2018)
PDF:
https://aclanthology.org/S18-1142.pdf