INTRODUCTION
I'm Thillai, a
Data Science Student
Researcher
Cosmophile
"I am currently pursuing a Master's in Data Science at Stony Brook University, following the completion of my Bachelor's degree in Computer Science and Engineering from Vellore Institute of Technology. My academic and research interests have been strongly centered on Machine Learning, Deep Neural Networks, Computer Vision and Generative AI models, with a focus on pushing the boundaries of what these technologies can achieve. Throughout my studies, I've conducted research that has led to several published papers, demonstrating my commitment to advancing the field. I am currently expanding my expertise by working on research involving Large Language Models (LLMs) for vision, exploring their potential to reshape how we understand and interact with visual data."
Biography
With a background in Computer Science and Engineering from Vellore Institute of Technology, Vellore, my journey began with a fascination for Data Science, driven by the intriguing ability of machines to learn and interpret data autonomously. This passion led me to upskill in Python and machine learning algorithms, where I started building projects that provided hands-on experience and deepened my expertise. My career began as a Machine Learning Intern at BillOK, where I developed an automated invoice processing system utilizing R-CNN architecture. This experience ignited my interest in AI research, leading me to an opportunity as a Research Intern at the Indian Institute of Technology Tirupati. There, I worked on liver lesion diagnosis using Uniformer Transformer architecture and earned recognition by placing among the top 15 teams globally in MICCAI's LLD MMRI challenge. Further expanding my research capabilities, I joined the Indian Space Research Organisation's Liquid Propulsion Systems Centre as a Research Intern, focusing on building a component labeling model for X-ray radiography exposures obtained from welding. Working with esteemed organizations while pursuing my undergraduate studies provided me with invaluable insights and a clear direction for my future endeavors, shaping my understanding and passion for advancing the field of Data Science. Motivated to tackle existing problems in scientific domains using deep learning, I collaborated with professors to pursue my own research ideas. This effort culminated in the publication of three research papers, reflecting the impact and depth of my work. Currently, I am furthering my education through a Master’s program in Data Science at Stony Brook University, where I am delving into advanced topics, engaging in innovative research, and collaborating with leading experts. 
My aim is to deepen my understanding of the field and develop novel solutions that address complex challenges, ultimately making a significant impact through data-driven advancements.
Education
M.S. in Data Science
Stony Brook University
Expected Graduation Date: May 2026
B.Tech. in Computer Science and Engineering
Vellore Institute of Technology
Graduation Date: May 2024
Experience
Computer Vision Researcher
Image processing of X-Ray Radiography Exposures
ISRO Liquid Propulsion Systems Centre
May 2023 – Aug 2024
Research Intern
Liver Lesion Diagnosis on Multi-phase MRI Images using transformer architecture
Dr. Gorthi, Indian Institute of Technology Tirupati
May 2023 – Jul 2023
Machine Learning Intern
Automated Invoice Processing and ML workflow Optimization
BillOK
Dec 2022 – Apr 2023
Competencies
Technical Competencies
These are the technical skills that I have acquired thus far in my computer science career.
Languages
- Python
- Golang
- Java
- SQL
- TypeScript
- JavaScript
- MATLAB
- C
- C++
- Bash
- R
- HTML
- CSS
Frameworks/Tools
- TensorFlow
- PyTorch
- Docker
- Kubernetes
- Flask
- Angular
- AWS
- Linux
- Anaconda
- IBM Watson Studio
- Google Cloud Platform
- Hugging Face
- Azure
Research interests
- Machine learning
- Computer Vision
- Natural Language Processing
- Data science
- Artificial Intelligence
- Deep learning
- Generative AI
- Large Language Models
- Quantum Computing
- Astroinformatics
- Quantum Machine Learning
- AI for High-Energy Physics
- Cloud Computing
Highlights
Featured Highlights
Here are some awards, articles, documents, certificates, and whatever else I am proud of.
Delegate for HPAIR Asia Conference 2022
I had the privilege of being selected as a delegate for the prestigious HPAIR (Harvard College Project for Asian and International Relations) Asia Conference 2022 in New Delhi. During this global event, I showcased my ideas and insights on how AI can be harnessed to address critical global crises and climate change challenges. The conference provided a unique platform to engage with global leaders, innovators, and professionals from diverse fields. Through meaningful discussions, I exchanged valuable strategies, gained new perspectives, and established connections with like-minded individuals who are passionate about creating a positive impact on the world.
Deep Learning-driven Detection of Nuclear Fusion Ignition: Illuminating the Path to Clean and Sustainable Energy
Read Paper Here
Nuclear fusion, a promising energy source with virtually limitless potential, requires precise detection methods. In this study, I investigated three leading deep learning architectures—Transformers, Long Short-Term Memory (LSTM), and ResNet50—for the binary classification of nuclear fusion events, aiming to identify the most effective model. The results showed that Transformers achieved the highest accuracy, outperforming LSTM and ResNet50. A deeper analysis examined how each model processed the data: Transformers used attention weights, LSTM captured temporal dependencies, and ResNet50 learned hierarchical features. These insights highlight the strengths and weaknesses of each architecture. Given the pressing need for sustainable energy, these findings contribute to the development of more reliable fusion energy systems. By leveraging deep learning, this research advances fusion detection technologies, supporting global efforts toward a cleaner and more sustainable energy future.
Martian Terrain Classification through Federated Learning: A Decentralized Approach for Understanding the Mars
Read Paper Here
Exploring the Martian landmass is crucial for advancing space research, offering key insights into planetary evolution and the potential for habitability beyond Earth. In this research, I developed a novel approach for multi-class classification of Martian terrain into seven distinct categories: crater, dark dune, slope streak, bright dune, impact ejecta, swiss cheese, and spider. Accurate terrain classification not only deepens our understanding of Martian geological processes but also aids in selecting optimal landing sites and planning safe traverses for future exploration missions. This study leverages federated learning, a decentralized machine learning paradigm, combined with the DenseNet-121 architecture to train models across distributed data sources while preserving data privacy. Extensive experimentation using the HiRISE dataset from the Mars Reconnaissance Orbiter demonstrated the effectiveness of this approach in achieving robust performance. The federated DenseNet-121 model presents a promising solution for efficient multi-class Martian terrain classification, contributing to future space missions and enhancing our understanding of Mars.
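The decentralized training idea can be illustrated with a small sketch of federated averaging. The description above does not say which aggregation rule the study used, so the sample-count-weighted average below is an assumption, and `federated_average`, the flat parameter lists, and the client sizes are hypothetical names and values chosen only for illustration.

```python
from typing import List, Sequence


def federated_average(
    client_params: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Average client parameter vectors, weighted by local sample count."""
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_params[0])
    if any(len(p) != dim for p in client_params):
        raise ValueError("all clients must share one parameter shape")
    # Only parameters are sent to the server; the raw images stay on each client.
    return [
        sum(p[i] * n for p, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]


# Hypothetical round: three clients holding 100, 300, and 600 images.
global_params = federated_average(
    [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]],
    [100, 300, 600],
)
```

In a real setup each list would hold the flattened weights of a locally trained DenseNet-121, and the averaged result would be sent back to the clients for the next round.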
Research and Development Head of ACM-VIT Chapter
As the Research and Development Head of the ACM-VIT chapter for 2023, I played a pivotal role in fostering a research-oriented culture within the student community. My tenure was marked by the organization and execution of various events and workshops, primarily focused on Data Science and other emerging disciplines. I worked diligently to inspire and engage students in innovative research projects, creating opportunities for hands-on learning and exploration. Additionally, I embraced my role as a team player and leader, offering guidance and support to aspiring juniors, helping them to navigate and excel in their respective areas of interest. Through these efforts, I aimed to cultivate a vibrant research environment that encouraged academic and professional growth.
Exploring Huntington’s Disease Diagnosis via Artificial Intelligence Models: A Comprehensive Review
Read Paper Here
Huntington’s Disease (HD) is a devastating neurodegenerative disorder characterized by progressive motor dysfunction, cognitive impairment, and psychiatric symptoms. Early and accurate diagnosis of HD is crucial for effective intervention and patient care. This review surveys the use of Artificial Intelligence (AI) algorithms in the diagnosis of HD. It systematically analyses the existing literature to identify key trends, methodologies, and challenges in this emerging field, and highlights the potential of machine learning (ML) and deep learning (DL) approaches to automate HD diagnosis through the analysis of clinical, genetic, and neuroimaging data. It also discusses the limitations and ethical considerations associated with these models and suggests future research directions aimed at improving the early detection and management of HD. The review is intended as a resource for researchers, clinicians, and healthcare professionals interested in the intersection of machine learning and neurodegenerative disease diagnosis.
Automated Glitch Detection in LIGO Data Streams Leveraging Deep Learning Architectures
Built a continual learning architecture for LIGO glitch detection that adapts to successive data streams while maintaining the model's effectiveness over time. A Vision Transformer was combined with a continual learning framework to preserve knowledge from prior data streams and prevent catastrophic forgetting, resulting in an accuracy of 93.4%.
Mediquill Large Language Model (LLM)
Fine-tuned the Llama-2 model, which boasts 7 billion parameters, using an extensive and carefully curated dataset of medical question and answer pairs. This enhancement enabled the model to effectively comprehend and address a wide range of medical inquiries, including accurate diagnoses, recommended treatments, detailed symptom descriptions, and information about various medications. The training process involved optimizing the model's ability to understand complex medical terminology and provide reliable, contextually relevant responses, significantly improving its utility for medical professionals and researchers.
Super-Resolution Astronomical Image Denoiser
Developed a Super-Resolution Generative Adversarial Network (SRGAN) to effectively remove noise from galaxy images, significantly improving the Peak Signal-to-Noise Ratio (PSNR) by 32.7% and reducing image noise, measured by Root Mean Square Error (RMSE), by 37.9%. Applied advanced transfer learning techniques to fine-tune the model specifically for astronomical data, resulting in enhanced clarity of celestial features by 27.3% and a 19.8% improvement in Structural Similarity Index (SSIM) scores. These improvements allowed for more precise and detailed analysis of astronomical imagery, benefiting research in astrophotography and space exploration.
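For readers unfamiliar with the two error metrics quoted above, here is a minimal sketch of how RMSE and PSNR are commonly computed. It assumes images are flattened into lists of pixel intensities and that the peak value is 255. Both are illustrative assumptions, not details of the project's evaluation code, and `rmse` and `psnr` are hypothetical helper names.

```python
import math
from typing import Sequence


def rmse(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """Root mean square error between two equally sized pixel sequences."""
    if not reference or len(reference) != len(estimate):
        raise ValueError("inputs must be non-empty and the same length")
    squared = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return math.sqrt(squared / len(reference))


def psnr(
    reference: Sequence[float],
    estimate: Sequence[float],
    max_value: float = 255.0,  # assumed peak intensity (8-bit images)
) -> float:
    """Peak signal-to-noise ratio in decibels; higher means less noise."""
    err = rmse(reference, estimate)
    if err == 0:
        return math.inf  # identical images
    return 20 * math.log10(max_value / err)


# Hypothetical 4-pixel example: every pixel is off by exactly one level.
clean = [10.0, 20.0, 30.0, 40.0]
denoised = [11.0, 21.0, 31.0, 41.0]
example_rmse = rmse(clean, denoised)
example_psnr = psnr(clean, denoised)
```

Because PSNR is a logarithmic function of RMSE, a lower RMSE always gives a higher PSNR for the same peak value.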
AI-powered Virtual Fitness Trainer
Developed an AI model designed to assist with workout sessions by accurately tracking exercise counts and body movements using the Mediapipe library. The model leverages pose estimation to detect key body landmarks and calculates appropriate body part angles to monitor and assess exercises such as pull-ups, push-ups, squats, walking, and sit-ups. By providing real-time feedback and precise count tracking, the model helps users maintain proper form and achieve consistent workout results, enhancing the overall effectiveness and safety of exercise routines.
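The angle-based counting described above can be sketched as follows. This is a simplified illustration rather than the project's actual code: it assumes each landmark has already been extracted as an `(x, y)` pair, and `joint_angle`, `RepCounter`, and the 90 and 160 degree thresholds are hypothetical names and values.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def joint_angle(a: Point, b: Point, c: Point) -> float:
    """Angle at joint b, in degrees within [0, 180], between b->a and b->c."""
    raw = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0])
        - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    angle = abs(raw)
    # Fold reflex angles back into the [0, 180] range.
    return 360.0 - angle if angle > 180.0 else angle


class RepCounter:
    """Count one repetition each time the joint bends and then re-extends."""

    def __init__(self, down_below: float = 90.0, up_above: float = 160.0):
        self.down_below = down_below  # assumed "bent" threshold
        self.up_above = up_above      # assumed "extended" threshold
        self.stage = "up"
        self.count = 0

    def update(self, angle: float) -> int:
        if angle < self.down_below:
            self.stage = "down"
        elif angle > self.up_above and self.stage == "down":
            self.stage = "up"
            self.count += 1
        return self.count


# Hypothetical elbow angles over successive frames of two push-ups.
counter = RepCounter()
for frame_angle in [170, 120, 80, 120, 170, 85, 165]:
    counter.update(frame_angle)
```

Using two separate thresholds rather than one keeps small jitters in the estimated pose from being counted as extra repetitions.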