Creating a Comprehensive Dataset of Bugs with Associated Reports, Commits, and Forum Posts for Deep Learning/Machine Learning
For a college project, I need to create a dataset that includes multiple bugs, with each bug having a bug report, GitHub commit, and Stack Overflow post associated with it. I’m looking for methods to link these elements together and am considering similarity scoring techniques like BM25. Are there any other suggestions?
Extract relevant documentation for my bug reports
For a college project, I need to find easy ways to create a dataset of bugs (deep learning/machine learning oriented) that includes bug reports from Bugzilla and GitHub, along with relevant posts from Stack Overflow and other forums.
I found that I can use similarity scoring methods like BM25, but I am looking for more alternatives.
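To make the BM25 idea concrete, here is a minimal sketch of how I understand the scoring would work when ranking candidate forum posts against one bug report. It is plain Python with no external libraries. The function names (`tokenize`, `bm25_scores`), the parameter defaults (`k1=1.5`, `b=0.75`), and the sample texts are my own assumptions for illustration, not taken from any real dataset or tool. The IDF uses the common "+1 inside the log" variant so that scores stay non-negative.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Hypothetical, very simple tokenizer: lowercase alphanumeric words.
    return re.findall(r"[a-z0-9_]+", text.lower())


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25.

    k1 and b are commonly used default values, assumed here.
    """
    docs = [tokenize(d) for d in documents]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n if n else 0.0

    # Document frequency: number of documents containing each term.
    df = Counter()
    for d in docs:
        df.update(set(d))

    q_terms = tokenize(query)
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in q_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Made-up example texts, only to show the call pattern.
bug_report = "Loss becomes NaN during training with Adam optimizer"
candidates = [
    "How to center a div in CSS",
    "Training loss is NaN after a few epochs using the Adam optimizer",
    "Installing CUDA drivers on Ubuntu",
]
scores = bm25_scores(bug_report, candidates)
best = max(range(len(candidates)), key=scores.__getitem__)
print(best, candidates[best])
```

In this toy example, only the second candidate shares any terms with the bug report, so it ranks first. On real data I would expect to need better tokenization (stack traces, API names) and a score threshold before accepting a link.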