Research Artifacts

Open-source tools, datasets, and frameworks developed during my research. All artifacts are available on GitHub to support reproducibility and follow-up work.

Featured Projects

CodeAgent

EMNLP'24

Autonomous communicative agents for code review. A multi-agent framework in which LLM agents communicate with one another to review code changes, identify bugs, and suggest improvements.

GitHub Paper Demo

Patcherizer

ICSE'24

Learning to represent patches for automatic program repair. A neural architecture that learns semantic representations of code changes for patch correctness prediction and patch generation.

GitHub Paper Dataset

Security & Analysis Tools

MultiSEM

ICLR'24

Multilevel semantic embedding framework for security patch detection. Uses hierarchical embeddings to identify security-related patches from large-scale code repositories.

GitHub Paper

SilentPatchDetector

TOSEM'25

Just-in-time detection system for silent security patches. Automatically identifies undocumented security fixes in software repositories using machine learning.

GitHub Paper

Bug Detection & Repair

BugRMSys

EMSE'24

App-review-driven collaborative bug-finding system. Leverages user reviews and NLP to automatically identify and localize bugs in mobile applications.

GitHub Paper

SynFix

ACL'25

Synchronous repair framework for codebases via relation graphs. Enables coordinated fixes across multiple related code locations using graph-based program analysis.

GitHub Paper

Datasets & Benchmarks

CodeReviewBench

Dataset

Large-scale dataset of code reviews from open-source projects. Contains 50K+ review comments with associated code changes for training and evaluating code review models.

Dataset Documentation

PatchCorrectnessDataset

Dataset

Curated dataset of program patches with correctness labels. Includes 10K+ patches from real-world projects annotated for correctness, used for training patch assessment models.

Dataset Kaggle

Collaborative Projects

MetaTPTrans

AAAI'23

Meta-learning approach for multilingual code representation. Enables cross-lingual transfer learning for code understanding tasks across multiple programming languages.

GitHub Paper

HedgeCode

ICSE'25

Multi-task hedging contrastive learning framework for code search. Aims to improve code retrieval accuracy by training with contrastive objectives across multiple tasks.

GitHub Paper