Building with AI, One Step at a Time

Data Scientist and MS student at IIIT Hyderabad, focusing on NLP and Vision-Language Models for document understanding and reasoning.

Akhil Theerthala

About Me

I am a Data Scientist at Perfios Software Solutions and am currently pursuing an MS in Data Science at IIIT Hyderabad. My work primarily involves using NLP and Vision-Language Models to help machines better understand documents.

I enjoy working across the machine learning lifecycle, turning research ideas into practical applications. My open-source work on the Kuvera personal finance models has been downloaded over 51,000 times on Hugging Face.

97.5% Latency Reduction
27.6% Accuracy Improvement

Apr 2025 – Present

Senior Member Data Scientist

Perfios Software Solutions

Experimenting with VLMs (like PaliGemma2) for financial data, building reasoning workflows, and developing algorithms for document readability.

Jun 2023 – Apr 2025

Member Data Scientist

Perfios Software Solutions

Reduced inference latency from 8 s to 200 ms (a 97.5% reduction) via distillation; improved table detection accuracy by 27.6%; enhanced table structure recognition (TSR) with semantic row detection.

Aug 2019 – May 2023

B.Tech in Aerospace Engineering

IIT Kharagpur

Publications

Selected Writings

Technical articles on machine learning, reasoning systems, and applied research.

Jan 2026

Density vs. Diversity in Data Selection

Analyzing dense vs. diverse sampling strategies for VLM training on 15k-sample synthetic datasets.

Apr 2025

Creating a Reasoning Dataset with No Budget

How I ranked 1st globally in the Reasoning Dataset Creation Challenge using synthetic data.

Feb 2025

From Training Language Models to DeepSeek-R1

An overview of how training regimes evolved from classic approaches to modern reasoning models.

Feb 2025

7 Practical PyTorch Tips

Lessons learned from production PyTorch development covering best practices and optimization.


Notable Projects

Recognition & Certifications

Awards & Impact

Won the Circle of Excellence award at Perfios (2026). Creator of Kuvera datasets and models with 51,000+ downloads.

Publications

AAAI 2026 Workshop (FinForge) and FinNLP @ EMNLP 2025 on behaviour-aware personal finance LLMs.

Certifications

The Reasoning Course (Hugging Face), Generative AI with LLMs (DeepLearning.AI), ML in Production (Coursera).