#3 Shin-Sound Quality Evaluation (JSAE Spring Conference 2023, Yokohama / Inter-noise 2023, Chiba)


At the end of 2020, when the world slowed down under the pandemic, I decided to take an unexpected path: studying artificial intelligence. The spark came from a playful idea—reversing the initials of my institute, Yokohama Institute of Acoustics (YIA), into AIY—AI Ypsilon. It was a joke at first, but it became a calling.

For more than two decades, I had been working on sound quality. Yet something inside me said, “It’s time to flip the script.” I remembered a favorite computer game from my youth, Wizardry. In Scenario 1, the player defeats the evil wizard Werdna. In Scenario 4, the story reverses—the player becomes Werdna himself. Everything is inverted. I wanted to create the same kind of “reverse scenario” for sound quality evaluation.

I also remembered a book I read in my childhood, “Hints for Invention,” which mentioned Dr. Leo Esaki, winner of the Nobel Prize in Physics. While Western scientists focused on increasing the purity of materials to improve diode precision, he asked the opposite question: “What happens if impurities are added?” This reverse way of thinking led him to his discovery that impurities could actually stabilize materials. That, too, was an inversion of perspective.


So, in the quiet of lockdown, I sat at my desk with Python code. Lock down, write code, learn AI. I encountered surprising ideas, like Frederick Jelinek’s provocative quote: “Every time I fire a linguist, the performance of the speech recognizer goes up.” At first I was shocked. Could AI really replace human expertise? Over time, I came to see the truth: AI was not a replacement but a powerful partner.
My answer was simple yet radical: apply image recognition AI to sound, an eye for an ear. Instead of only analyzing harmonics, I explored recurrence plots, visual patterns that reveal time-domain behavior. By late 2022, I was ready. And then, as if the world itself confirmed my choice, the ChatGPT boom erupted in 2023. I carried this momentum into my presentations.
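For readers unfamiliar with recurrence plots, the idea can be sketched in a few lines of NumPy: embed the signal in a small phase space with a time delay, then mark every pair of embedded points that lie close together. This is a minimal illustration, not the author's actual pipeline; the embedding dimension, delay, and threshold values below are arbitrary choices for demonstration.

```python
import numpy as np

def recurrence_plot(signal, dim=3, delay=4, threshold=0.1):
    """Binary recurrence plot of a 1-D signal.

    Embeds the signal in `dim` dimensions with time `delay`,
    then marks pairs of embedded points closer than `threshold`.
    """
    n = len(signal) - (dim - 1) * delay
    # Time-delay embedding: each row is one point in phase space.
    emb = np.column_stack(
        [signal[i * delay : i * delay + n] for i in range(dim)]
    )
    # Pairwise Euclidean distances between all embedded points.
    dists = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    # 1 where two moments of the signal revisit the same state.
    return (dists < threshold).astype(np.uint8)

# Example: a pure tone produces a regular, diagonally striped pattern,
# while noise produces a speckled one.
t = np.linspace(0.0, 1.0, 400)
tone = np.sin(2 * np.pi * 10 * t)
rp = recurrence_plot(tone, dim=3, delay=4, threshold=0.2)
```

The resulting binary matrix can be rendered (e.g. with `matplotlib.pyplot.imshow`) and treated as an ordinary image, which is what makes it a natural input for image recognition models.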


At the JSAE Spring Conference, I revisited a joke I had made in 2015, about preparing for “Psycho Gundam.” This time, Psycho Gundam had become real—its name was Artificial Intelligence. Step by step, I showed how image recognition AI could reveal patterns in sound: from detecting simple tones and noise, to discovering unique flower-like patterns in Ferrari engine sounds. I called them “Ferrari flowers,” a poetic glimpse into the hidden beauty of acoustics.
For the first time, I presented sound quality evaluation without psychoacoustic parameters. It felt liberating. And yet, I realized something deeper: classical methods and AI must work together. This union is what I now call “Shin-Sound Quality Evaluation Aufhebung”—a synthesis, not a replacement.
As I stood there, I felt as if I had landed on a new planet—a “sound quality planet.” One small step for artificial intelligence, but an unforgettable leap for my own human intelligence.
