
Hello! I’m Sowmya Jayaram Iyer.
I specialize in developing Machine Learning, Deep Learning, and Computer Vision algorithms. I aim to produce documented, reproducible code that integrates into existing ecosystems and is helpful to the AI community. I am currently pursuing a Master's in Computer Science at Purdue University (2023). My current work focuses on Transformer-based Visual Question Answering models and cloud ML/deployable AI.

Specialisation
AI solutions for real-world problems. Also well-versed in Software development and Cloud-based AI.
GANs | Transformers | Autoencoders | RNN | Adversarial Learning | Neural Network Architectures (CNN | GNN | BNN) | Transfer Learning | Gaussian Mixture Models

Visual Question Answering | Image/ scene segmentation | Object Detection | Human Detection | Video Stabilization | Image Processing/ Augmentation | Audio Processing | Pose Estimation (2D/3D) | Projective Transformation

Amazon EC2 | Amazon SageMaker | Amazon Comprehend | Amazon Elastic Inference | Amazon Augmented AI

Web Development | Programming | Algorithm Optimisation
My Skills
I am a quick learner and specialize in a multitude of skills required for AI, cloud-based solutions, and programming.
GitHub Projects
Speech to Indian Sign Language Medium


A translation medium from text/speech to Indian Sign Language (ISL) is built using supervised learning. Human 2D pose is estimated from ISL educational videos. The project tackles problems in pose data such as missing joint information and occlusions. The 2D pose data is then transformed into 3D pose data using a fully connected depth-regressor model that recovers the depth information for each frame. The architecture performs consistently on both GPU and CPU when handling missing or occluded joints and estimating depth. This creates the first Indian Sign Language motion capture database. For the translation medium, the nltk library is used to convert English input text to ISL grammar (order of sentence parts and tense). The output 3D data, stored as a .bvh file, is rendered in Blender to produce the animated ISL output.
Robustness of Bayesian-approach to Gradient-based Attacks


Bayesian models are used to defend ML models against gradient-based adversarial attacks on images. Adversarial (perturbed) images can reduce an ML model's performance by forcing mispredictions. BNNs take both model and data uncertainty into account, which gives them probabilistic robustness against adversarial attacks. Using Variational Inference and the Local Reparameterization Trick, Bayesian AlexNet (BAlexNet), Bayesian 3conv3fc (Simple CNN), and Bayesian LeNet (BLeNet) outperform their baseline counterparts. The MNIST and CIFAR10 datasets are made adversarial with both large-epsilon and small-epsilon perturbations (epsilon sets the strength of the attack) using FGSM, BIM, and PGD attacks. On adversarial images, the Bayesian models improve prediction accuracy over the baseline models by at least 60%, reducing misclassification.
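A minimal sketch of a single FGSM step, assuming a toy logistic-regression model in place of the project's CNNs. The weights, input, and function names are hypothetical and only illustrate how epsilon scales the perturbation; BIM and PGD repeat a similar step several times.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Probability of class 1 under a toy logistic-regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def fgsm(weights, bias, x, y, epsilon):
    """One FGSM step: x_adv = x + epsilon * sign(dL/dx), clipped to [0, 1]."""
    p = predict(weights, bias, x)
    # For binary cross-entropy loss, dL/dx_i = (p - y) * w_i.
    grad = [(p - y) * w for w in weights]
    sign = [(g > 0) - (g < 0) for g in grad]
    return [min(1.0, max(0.0, xi + epsilon * s)) for xi, s in zip(x, sign)]

# Hypothetical toy model and input (assumptions, not taken from the project).
WEIGHTS = [2.0, -3.0, 1.0]
BIAS = 0.0
x = [0.6, 0.2, 0.5]   # "pixel" values in [0, 1]
y = 1                 # true label

for eps in (0.1, 0.3):
    x_adv = fgsm(WEIGHTS, BIAS, x, y, eps)
    print(eps, round(predict(WEIGHTS, BIAS, x), 3), round(predict(WEIGHTS, BIAS, x_adv), 3))
```

A larger epsilon moves every input value further in the direction that increases the loss, so the model's confidence in the true class drops more sharply.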
T5 Transformer for Title generation


Conditional text generation with the T5 (text-to-text) transformer is used to generate a creative, context-based title for a given movie/TV show description (1024 tokens). A dataset compiling Netflix, Amazon Prime, Hulu, and Disney Plus movies/TV shows was created to fine-tune the conditional generative transformer. The model achieved a rougeLsum score of 24.9. The fine-tuned model has been deployed on Hugging Face as a package and can be imported directly as a pre-trained model.
NamaSign: Indian Sign Language Alphabet Detection


Implemented image classification using VGG16 and ResNet18 (FastAI) on a cluttered ISL alphabet image dataset covering different skin colors, backgrounds, and lighting conditions. Achieved 93.9% accuracy with VGG16 and 91.8% accuracy with ResNet18. These results were used to build a video-based ISL-spelling-to-text translation application that can bridge communication between people with hearing difficulties and those without.
QoS-Class label-based Network Traffic Classification


Network traffic is classified using deep packet inspection of pcap traces. Packet information (source_ip, dest_ip, etc.) is used to predict QoS labels for network classification. Data covering a period of three weeks is obtained from the MAWI Working Group Traffic Archive. Using several classification models (Decision Tree, Random Forest, Naive Bayes, Logistic Regression), achieved a train accuracy of 99.9% and a test accuracy of 94.5%.
Improving Hardware Resource allocation for Networked Applications using UNIKERNELS


Designed and implemented a solution to reduce the effect of cache contention on network I/O performance, and demonstrated the results. Code sections that access the cache are partitioned and isolated using Intel RDT for cache partitioning and the Unikraft unikernel library. This design overcomes a drawback of traditional systems, where heavily CPU-bound or memory-bound applications, as observed in ML and DL, benefit little from DDIO tuning based on cache partitioning. Throughput increases for most benchmarks when moving from an OS to unikernels, due to the smaller memory footprint and reduced cache contention.
Blood Vessel segmentation in fundus (eye) images for Diabetic Retinopathy


A U-Net segmentation model (CNN) accurately segments blood vessels in retinal images, which are used to identify various stages of Diabetic Retinopathy. U-Net allows for more paths for information flow and can be viewed as an ensemble of FCNs. On the DRIVE and CHASE DB1 datasets, the model achieves an AUC-ROC of 0.98127, surpassing other state-of-the-art deep-learning-based methods.
Multi Modal Transformers - Comprehensive Review


This project presents a comprehensive survey of Transformer techniques specifically geared towards multimodal data up until 2022. The contributions of this survey include:
(1) a theoretical review of Multimodal learning, Transformers, Vision Transformers, and Multimodal Transformers,
(2) a review of multimodal Transformers through the perspective of two important applications- specific multimodal tasks,
(3) a summary of the common challenges faced by existing Transformer models and the designs they share.
(AI-Led) Categorization of Driver stops Application


An unsupervised hierarchical clustering model predicts delivery-driver behavior for B2C companies that provide electric-vehicle-based delivery services. Stops are analyzed and classified as hubs, driver locations, illegal stops, and delivery locations. Data from electric two-wheelers, with GPS location firing every few seconds, was enriched with location descriptions extracted using the Google Places API and clustered into the labels listed above.
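A minimal sketch of how GPS stops could be grouped hierarchically, assuming single-linkage agglomerative clustering on great-circle distance with a distance cut-off. The linkage choice, the 50 m threshold, and the coordinates are assumptions for illustration; labelling the resulting clusters with Google Places descriptions is not shown.

```python
import math

def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(h))

def cluster_stops(points, max_gap_m):
    """Single-linkage agglomerative clustering: keep merging the two closest
    clusters until the closest pair is farther apart than max_gap_m."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(haversine_m(points[p], points[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > max_gap_m:
            break
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical GPS fixes: three near one location, two near another.
STOPS = [(12.9716, 77.5946), (12.9717, 77.5947), (12.9800, 77.6000),
         (12.9801, 77.6001), (12.9716, 77.5945)]
print(cluster_stops(STOPS, max_gap_m=50))
```

The pairwise search is quadratic per merge, which is fine for a small example; a library implementation would be the practical choice for a full GPS trace.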
CareerQuest.in Website


A web development project that suggests relevant courses to users based on their career paths. It scrapes courses from existing MOOC platforms based on subject/career preferences using keyword matching, and lets users compare ratings and prices across different websites.
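A minimal sketch of keyword matching over already-scraped course records, assuming single-word keywords and a simple ranking by number of keyword hits, then price. The record fields, platform names, and sample data are hypothetical; the scraping step itself is not shown.

```python
import re

def tokenize(text):
    """Lower-case a title and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9+#]+", text.lower()))

def rank_courses(courses, keywords):
    """Keep courses whose title contains at least one keyword;
    order by most keyword hits, then by lowest price."""
    wanted = {k.lower() for k in keywords}
    scored = []
    for course in courses:
        hits = len(wanted & tokenize(course["title"]))
        if hits:
            scored.append((hits, course))
    scored.sort(key=lambda item: (-item[0], item[1]["price"]))
    return [course for _, course in scored]

# Hypothetical scraped records (platforms, prices, and ratings are made up).
COURSES = [
    {"title": "Machine Learning with Python", "platform": "SiteA", "price": 49.0, "rating": 4.6},
    {"title": "Python for Beginners", "platform": "SiteB", "price": 0.0, "rating": 4.2},
    {"title": "Intro to Baking", "platform": "SiteA", "price": 10.0, "rating": 4.8},
    {"title": "Python Machine Learning Bootcamp", "platform": "SiteB", "price": 19.0, "rating": 4.4},
]

for course in rank_courses(COURSES, ["python", "machine"]):
    print(course["platform"], course["title"], course["price"], course["rating"])
```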
Experience
Machine Learning Engineer (Part-Time)

Created a LightGBM-based jewellery purchase prediction model to promote an EMI-based jewellery purchase scheme for Kalyan Jewellers' customers. LightGBM was initially trained on a year of user features such as income, city, and previous purchases, achieving an accuracy of 95%. In deployment, the model also updates this score after every user activity, such as a new purchase from any branch. Based on a t-test, the model is expected to increase jewellery purchases in 2023 by 15%.
Machine Learning Engineer (Part-Time)

Developed a website tool that generates creative bios with a fine-tuned GPT-2 based on users' profiles, in English and in their mother tongue, in collaboration with the technical team at KalyanMatrimony.com. The feature was integrated into the existing website by deploying GPT-2 with FastAPI on an EC2 instance, launched and managed on Amazon Elastic Container Service with Auto Scaling. It was offered to users as a campaign to generate a summarized creative bio for their profiles. The feature resulted in a 20% increase in profile views for users, indicating increased customer satisfaction.
AI/ ML Engineer

Created an ensemble model using Graph boosting and LightGBM to increase subscription-prediction accuracy from 96.5% to 99%, using user features such as login_count, message_count, and income. The subscription-prediction score is calculated on a parallel voting basis, which increases AUC from 0.95 (individual models) to 1.0 (ensemble) when trained on 20000 users' activity over six months. The projected increase in sales is about 20% for the year 2022-2023.
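A minimal sketch of one way to combine two models' scores by voting, assuming "parallel voting" means soft voting: each model scores the same user independently and the probabilities are averaged. The function names, equal weights, and 0.5 threshold are assumptions for illustration, not the deployed configuration.

```python
def soft_vote(probabilities, weights=None):
    """Weighted average of positive-class probabilities from several models."""
    if weights is None:
        weights = [1.0] * len(probabilities)
    return sum(w * p for w, p in zip(weights, probabilities)) / sum(weights)

def predict_subscription(probabilities, threshold=0.5):
    """Return 1 if the averaged score reaches the threshold, else 0."""
    return int(soft_vote(probabilities) >= threshold)

# Hypothetical scores for one user from the two individual models.
scores = [0.9, 0.4]
print(soft_vote(scores), predict_subscription(scores))
```

Averaging lets a confident model outweigh an uncertain one, which is one reason an ensemble can score higher than either model alone.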
AI/ ML Engineer

Developed a subscription-prediction model for users of KalyanMatrimony.com. Deployed a Graph boosting model that uses a year of user features such as login_count, message_count, and income to achieve an accuracy of 96.5%. In deployment, the model also updates the subscription-prediction score daily, based on every user activity. For the year 2021-2022, this model increased the company's tele-sales by 40%.
AI/ ML Intern

Developed a real-time ID verification process from videos using video enhancement and the U-Net deep learning model, with 89.8% accuracy. Rapidly prototyped new data processing capabilities in video enhancement using Python to confirm the feasibility of integration into existing systems. Collaborated with multi-disciplinary product development teams (Agile, Jira) to identify performance improvement opportunities and integrate trained models.
Deep Learning Intern

Identified new problem areas in the check verification system and detected check fraud and errors using transfer learning on a custom synthetic dataset (VGG16, VGG19, 96% accuracy). Also incorporated signature verification, using thin-plate spline warping for spatial transformation, into the existing product "SnapCheck".