
Research Interests & Projects

My primary areas of interest are AI in Bioinformatics, Genome Data Science and Health Inequality. I'm keen on exploring Geometric Deep Learning for drug discovery and development applications. In the context of Genomics, I am also eager to integrate perspectives of interpretability and robustness in my research. 

Research on nORFs and the Novel Proteins they Encode 

PI, Reporting Manager & CEO: Dr. Sudhakaran Prabakaran, NonExomics 

Funded by Illumina and AWS 

At the R&D division of a biotech startup, I develop ML frameworks for Proteogenomics and Structural Genomics data analysis. I work on diverse ML research problems associated with novel Open Reading Frames (nORFs): structure prediction, protein clustering, subcellular localization, variant prioritization, protein-protein interactions, and the evolution of novel proteins. Under Dr. Matt Wayland, I have also contributed to developing our enterprise's cloud and data infrastructure.

Links: None (the proprietary algorithms, pipelines, and research ideas are confidential as per the CDA).


Analysis of Abnormalities in Wireless Capsule Endoscopy Images

PI: Prof. Chandra Sekhar Seelamantula, Spectrum Lab @IISc 

Funded by: Bill & Melinda Gates Foundation, Robert Bosch Centre for Cyber-Physical Systems

Worked on classification, semantic segmentation, instance segmentation, clustering, and artificial image synthesis tasks for abnormalities in Wireless Capsule Endoscopy (WCE) images. Our team has developed an end-to-end WCE system that can screen WCE lesions within ten minutes. We worked in collaboration with the Command Hospital Air Force, Bangalore, and Kasturba Medical College, Manipal.


Fast Steerable Bilateral Edge Detectors

PI: Prof. Chandra Sekhar Seelamantula, Spectrum Lab @IISc

A government-funded project on sidewalk detection

A novel, noise-robust, and computationally efficient algorithm for real-time bilateral edge detection. I integrated concepts of steerability in designing this algorithm, which led to a reasonable tradeoff between accuracy and computation time. The algorithm is inspired by Sanjay Ghosh et al.'s research on 'Fast Bilateral Filtering Using Fourier Kernels'.


Transliteration of Kannada Text from a Camera Captured Scene-Image on an Android Platform

PI: Prof. K.R. Ramakrishnan, CVAI Lab @IISc 

Funded by IISc

Designed a pre-processing algorithm for raw textual scene images to suit the requirements of OCR engines such as Tesseract, and subsequently trained Tesseract on combinations of various Kannada fonts for optimal recognition of the Kannada script. I also designed a Convolutional Neural Network for Kannada character recognition. Finally, I developed an Android application that pre-processes the image and transliterates the textual content with the help of Tesseract tools for Android. Our results for Kannada character recognition (accuracy: 96.8%) surpassed state-of-the-art results by almost 2%.


Undergrad Projects

Alongside coursework, I worked on research projects with the faculty at the Manipal Institute of Technology. My college projects are in Predictive Analytics, Data Mining and ML. 

Handwritten Hindi Numeral Recognition Using Clustering Techniques

PI: Prof. Nisha P. Shetty

Hindi script is complex due to concavities, holes and curvatures. I evaluated traditional and deep clustering techniques for Hindi numeral recognition. Subsequently, I modified the CNN modules in the recurrent framework in Jianwei Yang et al.’s ‘Joint Unsupervised Learning of Deep Representations and Image Clusters’ to obtain normalized mutual information (NMI) values as high as 8.78, surpassing pre-existing results. Additionally, I preprocessed images using de-noising and contrast adjustment techniques.


Brochure Design using Emotion Detection and Neural Style Transfer 

PI: Prof. Nisha P. Shetty

Corporations and other agencies rely on templates or human expertise to design brochures. We aim to aid this process through automation. Based on a survey of 150 members, we categorized textures from the DTD data set by the emotions they evoke. We train two emotion detection algorithms (based on ANNs and SVMs) on the Twitter data set and use the results to predict the emotion associated with the catalogue content. Following this, a plain content image and an emotionally relevant texture image are combined via image style transfer. In this project, we couple concepts of image style transfer with textual images, evaluate textual emotion detection methodologies, and present the designed brochures.


A Study of Various Varieties of Distributed Data Mining Architectures

PIs: Prof. Nisha P. Shetty, Prof. Balachandra Muniyal

A review of Distributed Data Mining (DDM) architectures. This study covers the GRID architecture, DDM in P2P Networks, Knowledge Grid Services, Web Service Resource Framework (WSRF), and the Globus Toolkit. We also compare the traditional warehousing models to recent DDM systems.


Landslide Prediction Using Classifier Models 

PI: Prof. Chethan Sharma

Team Members: Arpit Garg

Fatalities due to landslides are often overlooked. Dr. David Petley reported 32,322 deaths caused by landslides between 2004 and 2010. Further, his study shows that Asia has the highest number of fatalities from landslides. These numbers inspired us to predict the occurrence of landslides globally using classifier models like the Random Forest and Naive Bayes classifiers. We performed extensive EDA on NASA's Landslide data set. In particular, we analyzed trends in parameters like rainfall, temperature, area, and population.


© 2023 by SUKRITI PAUL. 
