Education
- MSc CS with Specialization in Multimedia, University of Alberta, Canada - Nov 2024
- B.Tech in Electrical Engineering, Aligarh Muslim University, India - June 2022
Work experience
Okaki Health Intelligence Inc
Machine Learning Engineering Intern | Nov 2023 – Nov 2024 | Calgary, AB, Canada
- NLP Pipeline Engineering & Deployment: Engineered NLP pipelines for automating cognitive assessments like MoCA, optimizing AI-driven analytics, and integrating scalable architectures using Azure OpenAI.
- Advancing Audio-Speech NLP: Scaled speech-to-text integration with Whisper, WhisperX, and Azure Speech models for precise transcription and diarization. Applied object-oriented design principles to improve the transcription library’s modularity and maintainability.
- Real-Time AI Assistant: Developed and optimized a real-time medical assistant for clinical assessments with vision capabilities, utilizing LiveKit’s Pipeline.
💡 Tech Stack: Python, NLP, Whisper, WhisperX, OpenAI API, Azure AI, LiveKit, OOP, Healthcare AI
Ersilia Open Source Initiative
Outreachy Intern | Dec 2022 – Mar 2023 | Barcelona, Spain
- Deep Learning & NLP Integration: Developed deep learning models (e.g., Biomed-RoBERTa) for drug taxonomy using textual bioassay data embeddings.
- Software Development & Optimization: Built the Auto-TabNet package, leveraging Optuna for hyperparameter tuning. Contributed to debugging and documentation.
💡 Tech Stack: Python, Deep Learning, NLP, Biomed-RoBERTa, Optuna, Auto-TabNet
Emplay Inc
Data Science Intern | Oct 2021 – Apr 2022 | Dublin, CA, USA
- API Development: Developed and deployed the Topic-Suggestion API for generating relevant keyphrases and tags.
- Keyphrase Extraction: Implemented supervised keyphrase extraction methods, achieving a precision score of 0.37.
- NLP Content Moderation: Built a content moderation pipeline to detect toxicity using pre-trained language models.
💡 Tech Stack: Python, NLP, Keyphrase Extraction, API Development, Content Moderation
Dalhousie University
MITACS Globalink Research Intern | Jun 2021 – Sep 2021 | Halifax, NS, Canada
- Enhanced Natural Language Understanding: Optimized pre-processing of datasets for compositional tasks, enabling efficient pre-training of language models like ALBERT-xxlarge-v2, BERT-large-uncased, and RoBERTa-large.
- Advanced Semantic Evaluation: Improved semantic evaluation techniques, increasing accuracy by 15–20% for ALBERT-xxlarge-v2 on benchmark datasets (COPA, Winogender, aNLI, PDP) for zero-shot common-sense reasoning.
💡 Tech Stack: Python, NLP, ALBERT, BERT, RoBERTa, Zero-Shot Learning, Semantic Evaluation
Skills
Programming Languages
- Python, SQL
Programming Frameworks
- PyTorch, Flask, Hugging Face, LangChain, FastAPI
Tools
- Docker, Azure AI, AWS Bedrock, Linux Command Line, Git, Pandas, NumPy
Technologies
- Deep Learning, Natural Language Processing (NLP), Vector Databases, Transformer Architectures, Prompt Engineering, Retrieval-Augmented Generation (RAG)
Publications
Machine Learning for Classification of Antithetical Emotional States
2022 IEEE Xplore
J. Sharma, R. Maheshwari, Y. U. Khan
Evaluating Performance of Different Machine Learning Algorithms for the Acute EMG Hand Gesture Datasets
2022 Journal of Electronics and Informatics
J. Sharma, R. Maheshwari, S. Khan, A. A. Khan
Evaluating CNN with Oscillatory Activation Function
2022 arXiv Preprint
J. Sharma