The rise of sophisticated phishing attacks has exposed the limitations of traditional detection systems in dynamic, multimodal threat landscapes. This thesis systematically investigates the application of Large Language Models (LLMs) to phishing detection across websites and emails, and proposes a unified framework that combines multimodal evidence with structured reasoning. Three core experiments were conducted: a component-level ablation study assessing the individual contribution of each evidence type (e.g., URL, OCR, HTML, VirusTotal); a benchmark comparison against state-of-the-art phishing detection tools; and a cross-model evaluation of LLMs on email phishing classification. The datasets were curated to reflect real-world conditions, with phishing samples sourced from PhishTank and emails drawn from verified corpora. Each model was queried in a zero-shot setting using standardized prompt structures and no fine-tuning, so that commercial and open-source LLMs were compared under the same conditions. The framework performed strongly, achieving an F1-score of 98.03% on website detection with GPT-3.5-turbo and 95.25% accuracy on email detection with NVIDIA’s Nemotron Ultra. These results indicate that LLMs can adapt to adversarial scenarios while offering high interpretability and operational flexibility. The study further contributes a transparent benchmarking protocol and practical recommendations for deploying LLMs in cybersecurity pipelines. Overall, this work establishes a foundation for LLM-based threat detection and motivates future research on interpretable, zero-shot security frameworks.
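To make the zero-shot, standardized-prompt setup and the component-level ablation more concrete, the following is a minimal sketch of how one prompt template could serve every evidence configuration. It is not the thesis's actual prompt: the instruction wording, the function names `build_prompt` and `ablation_prompts`, and the leave-one-out scheme are all assumptions made for illustration. Only the four evidence labels come from the text above.

```python
from typing import Dict, Iterable

# Evidence types named in the abstract; the ordering here is an assumption.
EVIDENCE_ORDER = ("URL", "OCR", "HTML", "VirusTotal")

# Hypothetical instruction text, not the wording used in the thesis.
INSTRUCTION = (
    "You are a security analyst. Using only the evidence below, answer with "
    "one word, 'phishing' or 'legitimate', followed by a one-sentence reason."
)


def build_prompt(evidence: Dict[str, str],
                 include: Iterable[str] = EVIDENCE_ORDER) -> str:
    """Assemble a zero-shot prompt from the selected evidence types.

    Evidence types that are excluded or missing are omitted, so the same
    template is reused unchanged for every configuration.
    """
    selected = set(include)
    sections = [
        f"[{name}]\n{evidence[name]}"
        for name in EVIDENCE_ORDER
        if name in selected and name in evidence
    ]
    return INSTRUCTION + "\n\n" + "\n\n".join(sections)


def ablation_prompts(evidence: Dict[str, str]) -> Dict[str, str]:
    """Return one leave-one-out prompt per evidence type, keyed by the dropped type."""
    return {
        dropped: build_prompt(
            evidence, [name for name in EVIDENCE_ORDER if name != dropped]
        )
        for dropped in EVIDENCE_ORDER
    }
```

Because no examples or fine-tuning are involved, the only thing that varies between runs in this sketch is which evidence sections appear in the prompt, which is what lets the contribution of each evidence type be compared in isolation.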
Use Your Cell Phone as a Document Camera in Zoom
From Computer
Log in and start your Zoom session with participants

From Phone
To use your cell phone as a makeshift document camera