We spoke with Dr. Tushar Sandhan, Assistant Professor in the Department of Electrical Engineering at IIT Kanpur, about his interdisciplinary research that bridges artificial intelligence, computer vision, and the human sense of smell. While his primary work lies in visual intelligence and multimodal AI, his curiosity has led him beyond conventional boundaries into the challenging domain of olfaction, one of the most complex and least digitized human senses. Through this conversation, he reflects on what motivated this
unconventional shift, how AI models originally designed for vision can be adapted to study
smell, and what this means for the future of machine perception.
Can Machines Smell?
Interviewer:
Your research primarily focuses on computer vision, but you have also explored the area of
olfaction [“digitized smell,” which simulates the human sense of smell using electronic noses]. What
inspired you to apply visual intelligence to something that is notoriously difficult to digitize?
Professor Tushar Sandhan:
You correctly mentioned that this sense is notoriously difficult to digitize. There have even
been a couple of Nobel Prizes related to olfaction, when researchers discovered how neurons
work in this process. But research is not compartmentalized; it is not limited by the boundaries of departments or disciplines. True research explores curiosity without barriers. What we often call interdisciplinary research is really about collaboration across disciplines.
Computer vision itself is image processing combined with intelligence, which comes from neural
networks or AI. Neural networks are networks of neurons, and olfaction is also based on
neurons acting in patterns to give us the sense of smell. Similarly, in vision, the optic nerves collect signals from the retina and send them to the brain. Smell works in a comparable way: the signals from fragrances, environments, or individuals are processed through neurons.
Dogs, for example, have an extremely acute sense of smell. Even on campus, you’ll notice dog lovers and dog haters. That curiosity about how animals quantify smell led to this research.
Chemists, computer scientists, and electrical engineers were already working on this problem
and asked for my help. That is how I gravitated toward this research.
AI Models and Multimodal Research
Interviewer:
Could you give us a brief idea about the work and the models you’re using?
Professor Tushar Sandhan:
Our work started with olfaction but extends far beyond it. We work with biosignals such as ECG and EEG, and with signals generated by living organisms, including plant leaves, to detect stress in agriculture. We also analyze microscopic data, satellite imagery, multispectral data, SAR
images, and data from UAVs and aircraft.
This work involves multimodal data: images, language, speech, and audio. Traditional
convolutional neural networks alone are insufficient. We use attention mechanisms, transformer
architectures with cross-attention across modalities, diffusion models for data generation, and
physics- or domain-inspired models.
We are not merely users of AI models. While models like DeepSeek or LLaMA exist for text, our
focus is on combining modalities and creating new architectures. I encourage my research
students to create novel models, even if the contribution is small, rather than simply using
existing ones.
Vibrational Theory vs Lock-and-Key Model in Olfaction
Interviewer:
Your Scientific Reports paper claims to predict odor from vibrational spectra using a
data-driven approach. How does this differ from the traditional lock-and-key model?
Professor Tushar Sandhan:
In research, multiple hypotheses exist for any problem. Over time, experimental validation
strengthens certain hypotheses. In olfaction, there are two major theories: the vibrational theory
and the lock-and-key model.
We worked with different molecular structures: acetic acid, hydrogen sulfide, and many others,
each producing distinct smells, from pungent to kerosene-like. One hypothesis suggests
vibrations interact with neurons to create signals. The lock-and-key model suggests molecules
bind to receptors like keys in locks.
We are not arguing which theory is superior. Instead, we ask whether chemical properties can
be mapped to smell perception. By averaging responses from multiple individuals, we examine
whether AI can correctly associate chemical structures with human olfactory responses.
Turning Smell into Vision
Interviewer:
Effectively, this work turned smell into an image so a computer vision model could process it.
Why was this conversion necessary?
Professor Tushar Sandhan:
Sometimes solving a problem in a different domain allows you to use familiar tools. Computer
vision excels at detecting correlations between nearby pixels using convolutional neural
networks. In olfaction, neurons in the nasal cavity are spatially correlated, and chemical stimuli
spread across regions rather than acting independently.
This spatial correlation makes vision-based modeling suitable. We projected olfactory responses
into a visual domain, and it worked. We have now created olfactory embeddings that can be
used with generative models like diffusion models to visualize smell, essentially generating
images from odors.
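To make the projection idea concrete, here is a minimal editorial sketch (not code from the research itself): a flat vector of olfactory receptor or sensor responses is arranged on a 2D grid so that an image-style model can exploit local spatial correlation. The 8x8 grid size and row-major layout are illustrative assumptions.

```python
# Hypothetical sketch: reshape a flat list of olfactory sensor readings
# into a 2D "image" so convolutional models can exploit the spatial
# correlation between neighbouring receptors/sensors.

def responses_to_image(responses, rows, cols):
    """Reshape a flat list of sensor readings into a rows x cols grid."""
    if len(responses) != rows * cols:
        raise ValueError("need exactly rows * cols readings")
    return [responses[r * cols:(r + 1) * cols] for r in range(rows)]

# Example: 64 simulated receptor activations -> an 8x8 single-channel image
readings = [i / 63 for i in range(64)]
image = responses_to_image(readings, 8, 8)
print(len(image), len(image[0]))  # 8 8
```

In a real pipeline, the grid layout would be chosen so that physically or chemically related channels end up adjacent, which is what makes convolutional filters meaningful here.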
Data Scarcity and Model Training
Interviewer:
Olfaction research likely suffers from limited datasets. How do you train robust models with only
dozens or hundreds of samples?
Professor Tushar Sandhan:
That is a major challenge. Smell categories are numerous, and not everyone can identify all
odors. We use techniques such as SMOTE and guided SMOTE to generate synthetic samples
without disrupting data distributions. We also use specialized loss functions like focal loss to
handle class imbalance.
Another approach is transfer learning: training models on related chemical-property datasets
and then fine-tuning them for olfaction. We combine all these methods to improve robustness.
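As an illustration of the SMOTE idea (a simplified sketch, not the guided variant mentioned above, which additionally steers where samples are placed), a synthetic minority-class sample is created by linearly interpolating between two real minority samples:

```python
import random

# Simplified SMOTE-style oversampling: synthesize a minority-class point
# by linear interpolation between two existing minority samples.
# Real SMOTE interpolates toward one of the k nearest neighbours; picking
# any other minority sample is a simplification for brevity.

def smote_sample(minority, rng):
    a, b = rng.sample(minority, 2)      # two distinct minority points
    lam = rng.random()                  # interpolation factor in [0, 1)
    return [ai + lam * (bi - ai) for ai, bi in zip(a, b)]

rng = random.Random(0)
minority = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
synthetic = smote_sample(minority, rng)
# The synthetic point lies on a segment between two real samples, so it
# stays inside the convex hull of the minority data.
print(all(0.0 <= x <= 1.0 for x in synthetic))  # True
```

Because new points are interpolations of existing ones, the minority class gains density without the model memorizing exact duplicates, which is the "without disrupting data distributions" property mentioned above.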
Handling Class Imbalance in Smell Categories
Interviewer:
Labels like “fruity” appear frequently, while others like “fennel” appear less than 1% of the time.
How did your model handle this imbalance?
Professor Tushar Sandhan:
Fruity smells are easier to detect, while fennel is pungent but rare. We used focal loss and
cost-sensitive learning to emphasize underrepresented classes. Additionally, we extracted
molecular features that strongly correlate with these rare smells, enabling better detection
despite limited data.
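The focal-loss mechanism can be sketched in a few lines (an editorial illustration with the commonly used default values of alpha and gamma, not the settings tuned for this work): the (1 - p)^gamma factor shrinks the loss on easy, well-classified examples, so rare classes such as "fennel" keep contributing to training.

```python
import math

# Binary focal loss: down-weights easy, well-classified examples so that
# rare, hard examples dominate the gradient. alpha = 0.25 and gamma = 2.0
# are the commonly cited defaults, used here purely for illustration.

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """p: predicted probability of the positive class; y: true label (0/1)."""
    p_t = p if y == 1 else 1.0 - p          # probability of the true class
    a_t = alpha if y == 1 else 1.0 - alpha
    return -a_t * (1.0 - p_t) ** gamma * math.log(p_t)

# A confident, correct prediction contributes almost nothing...
easy = focal_loss(0.9, 1)
# ...while a misclassified rare positive keeps a large loss.
hard = focal_loss(0.1, 1)
print(hard > easy)  # True
```

Cost-sensitive learning achieves a similar effect by scaling each class's loss by a fixed weight (e.g. inverse class frequency) rather than by the per-example difficulty term.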
Practical Applications of Machine Olfaction
Interviewer:
What practical applications do you envision for machine olfaction?
Professor Tushar Sandhan:
Humans have a limited sense of smell compared to animals like dogs. Machine olfaction can
help compensate for this by providing visual or textual cues about chemical environments. This
could aid safety, environmental monitoring, and human decision-making.
We can also convert smells into visuals or text for education and training, allowing a generalized
understanding of smell rather than purely subjective perception.
Future Directions: Digitizing Taste
Interviewer:
Could similar methods be used to digitize taste?
Professor Tushar Sandhan:
Taste, like smell and touch, is subjective and experimentally challenging. However, if we can
digitize olfaction and vision, there is no fundamental reason taste cannot be modeled as well.
Written and Interviewed by: Aarzoo Yadav, Suhani Joshi, Anandan Iyer
Edited by: Yeva Gupta