Rung-Yu Tseng, Tao-Wei Wang, Szu-Wei Fu, Chia-Ying Lee, Yu Tsao
Speech perception is key to verbal communication. For people with hearing loss, the ability to recognize speech is limited, particularly in noisy environments or in situations without visual cues, such as phone calls where lip-reading is unavailable. This study investigated how the intelligibility of vocoded speech in cochlear implant (CI) simulation can be improved through two potential methods: speech enhancement (SE) and audiovisual integration. A fully convolutional neural network (FCN) using an intelligibility-oriented objective function was recently proposed and shown to improve speech intelligibility as an advanced denoising SE approach. Furthermore, audiovisual integration is reported to yield better speech comprehension than audio-only information. An experiment was designed to test speech intelligibility using tone-vocoded speech in CI simulation with a group of normal-hearing listeners. Experimental results confirmed the effectiveness of FCN-based denoising SE and audiovisual integration on vocoded speech. The results also suggest that these two methods could be combined in a CI processor to improve speech intelligibility for CI users under noisy conditions.
Cheng-Yuan Liou, Tai-Hei Wu, Chia-Ying Lee
This paper constructs a tree structure for musical rhythm using the L-system. It models the structure as an automaton and derives its complexity. It also solves the complexity for the L-system. This complexity can resolve the similarity between trees and serves as a measure of psychological complexity for rhythms. It resolves the music complexity of various compositions including the Mozart effect K488. Keywords: music perception, psychological complexity, rhythm, L-system, automata, temporal associative memory, inverse problem, rewriting rule, bracketed string, tree similarity
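The keywords "rewriting rule" and "bracketed string" can be illustrated with a minimal sketch of a bracketed L-system. The rule `F -> F[F]F`, the function names, and the use of nesting depth as a summary of the tree are all hypothetical choices made for illustration; the abstract does not give the paper's actual rules or its complexity measure.

```python
def expand(axiom, rules, generations):
    """Apply the rewriting rules in parallel to every symbol, `generations` times."""
    s = axiom
    for _ in range(generations):
        # Symbols without a rule (here the brackets) are copied unchanged.
        s = "".join(rules.get(ch, ch) for ch in s)
    return s


def max_depth(bracketed):
    """Return the deepest bracket nesting, i.e. the depth of the encoded tree."""
    depth = deepest = 0
    for ch in bracketed:
        if ch == "[":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == "]":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced closing bracket")
    if depth != 0:
        raise ValueError("unbalanced opening bracket")
    return deepest


# Hypothetical rule: every segment F spawns one branch [F] and continues.
RULES = {"F": "F[F]F"}

second_generation = expand("F", RULES, 2)
```

In this toy case, two generations give `F[F]F[F[F]F]F[F]F`, a bracketed string whose tree has nesting depth 2.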
Milton Gomez, Marie McGraw, Saranya Ganesh S., Frederick Iat-Hin Tam, Ilia Azizi, Samuel Darmon, Monika Feldmann, Stella Bourdin, Louis Poulain--Auzéau, Suzana J. Camargo, Jonathan Lin, Dan Chavas, Chia-Ying Lee, Ritwik Gupta, Andrea Jenney, Tom Beucler
TCBench is a benchmark for evaluating global, short- to medium-range (1-5 days) forecasts of tropical cyclone (TC) track and intensity. To allow a fair and model-agnostic comparison, TCBench builds on the IBTrACS observational dataset and formulates TC forecasting as predicting the time evolution of an existing tropical system conditioned on its initial position and intensity. TCBench includes state-of-the-art dynamical (TIGGE) and neural weather models (AIFS, Pangu-Weather, FourCastNet v2, GenCast). If not readily available, baseline tracks are consistently derived from model outputs using the TempestExtremes library. For evaluation, TCBench provides deterministic and probabilistic storm-following metrics. On 2023 test cases, neural weather models skillfully forecast TC tracks, while skillful intensity forecasts require additional steps such as post-processing. Designed for accessibility, TCBench helps AI practitioners tackle domain-relevant TC challenges and equips tropical meteorologists with data-driven tools and workflows to improve prediction and TC process understanding. By lowering barriers to reproducible, process-aware evaluation of extreme events, TCBench aims to democratize data-driven TC forecasting.
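As an illustration of a deterministic storm-following metric, the sketch below computes the mean great-circle distance between forecast and observed storm centers at matching lead times. This is a common way to express track error, not necessarily the definition TCBench uses; the function names and the spherical-Earth radius are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius for a spherical approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def mean_track_error_km(forecast, observed):
    """Average center-position error over paired (lat, lon) points."""
    if not forecast or len(forecast) != len(observed):
        raise ValueError("tracks must be non-empty and of equal length")
    errors = [great_circle_km(f[0], f[1], o[0], o[1]) for f, o in zip(forecast, observed)]
    return sum(errors) / len(errors)
```

Pairing by lead time is left to the caller: both lists are expected to hold positions valid at the same times.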
Ching-Chih Sung, Cheng-Hung Hsin, Yu-Anne Shiah, Bo-Jyun Lin, Yi-Xuan Lai, Chia-Ying Lee, Yu-Te Wang, Borchin Su, Yu Tsao
This paper presents EffortNet, a novel deep learning framework for decoding individual listening effort from electroencephalography (EEG) during speech comprehension. Listening effort represents a significant challenge in speech-hearing research, particularly for aging populations and those with hearing impairment. We collected 64-channel EEG data from 122 participants during speech comprehension under four conditions: clean, noisy, MMSE-enhanced, and Transformer-enhanced speech. Statistical analyses confirmed that alpha oscillations (8-13 Hz) exhibited significantly higher power during noisy speech processing compared to clean or enhanced conditions, confirming their validity as objective biomarkers of listening effort. To address the substantial inter-individual variability in EEG signals, EffortNet integrates three complementary learning paradigms: self-supervised learning to leverage unlabeled data, incremental learning for progressive adaptation to individual characteristics, and transfer learning for efficient knowledge transfer to new subjects. Our experimental results demonstrate that EffortNet achieves 80.9% classification accuracy with only 40% training data from new subjects, significantly outperforming conventional CNN (62.3%) and STAnet (61.1%) models. The probability-based metric derived from our model revealed that Transformer-enhanced speech elicited neural responses more similar to clean speech than MMSE-enhanced speech. This finding contrasted with subjective intelligibility ratings but aligned with objective metrics. The proposed framework provides a practical solution for personalized assessment of hearing technologies, with implications for designing cognitive-aware speech enhancement systems.
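The alpha-band (8-13 Hz) power mentioned above can be illustrated for a single channel with a plain discrete Fourier transform. This is a minimal sketch, not the paper's analysis pipeline, which the abstract does not describe; the function name, the mean removal, and the relative-power normalization are assumptions.

```python
import cmath
import math


def band_power_fraction(signal, fs, f_lo=8.0, f_hi=13.0):
    """Fraction of non-DC spectral power that falls inside [f_lo, f_hi] Hz."""
    n = len(signal)
    mean = sum(signal) / n
    x = [v - mean for v in signal]  # remove the DC offset
    total = band = 0.0
    # Naive O(n^2) DFT over the positive frequencies; fine for short segments.
    for k in range(1, n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power = abs(coeff) ** 2
        freq = k * fs / n  # frequency of bin k in Hz
        total += power
        if f_lo <= freq <= f_hi:
            band += power
    return band / total if total > 0 else 0.0


# Synthetic two-second segments sampled at 128 Hz (illustrative values only).
FS = 128
alpha_like = [math.sin(2 * math.pi * 10 * t / FS) for t in range(2 * FS)]
beta_like = [math.sin(2 * math.pi * 20 * t / FS) for t in range(2 * FS)]
```

A 10 Hz sinusoid puts nearly all of its power inside the band, while a 20 Hz sinusoid puts nearly none there. For real recordings a windowed estimator such as Welch's method would usually be preferred.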
Jonathan Lin, Raphael Rousseau-Rizzi, Chia-Ying Lee, Adam Sobel
An open-source, physics-based tropical cyclone downscaling model is developed, in order to generate a large climatology of tropical cyclones. The model is composed of three primary components: (1) a random seeding process that determines genesis, (2) an intensity-dependent beta-advection model that determines the track, and (3) a non-linear differential equation set that determines the intensification rate. The model is entirely forced by the large-scale environment. Downscaling ERA5 reanalysis data shows that the model is generally able to reproduce observed tropical cyclone climatology, such as the global seasonal cycle, genesis locations, track density, and lifetime maximum intensity distributions. Inter-annual variability in tropical cyclone count and power-dissipation is also well captured, on both basin-wide and global scales. Regional tropical cyclone hazard estimated by this model is also analyzed using return period maps and curves. In particular, the model is able to reasonably capture the observed return period curves of landfall intensity in various sub-basins around the globe. The incorporation of an intensity-dependent steering flow is shown to lead to regionally dependent changes in power dissipation and return periods. Advantages and disadvantages of this model, compared to other downscaling models, are also discussed.
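The track component can be sketched as one time step of a beta-advection scheme, in which the storm moves with a weighted mean of lower- and upper-level environmental winds plus a constant beta drift. The weights, reference intensity, drift velocity, and function names below are hypothetical placeholders, and the rule that stronger storms follow a deeper-layer flow is an assumed form of the intensity dependence; the abstract does not give the model's actual formulation.

```python
import math

EARTH_RADIUS_M = 6.371e6  # assumed mean Earth radius


def steering_weight(intensity_ms, w_weak=0.8, w_strong=0.5, v_ref=60.0):
    """Hypothetical weight on the lower-level wind, decreasing with intensity."""
    frac = min(max(intensity_ms / v_ref, 0.0), 1.0)
    return w_weak + (w_strong - w_weak) * frac


def advect(lat, lon, intensity_ms, wind_low, wind_high,
           beta_drift=(0.0, 2.5), dt_s=3600.0):
    """Advance the storm center by one forward-Euler step.

    Winds and drift are (eastward, northward) components in m/s;
    lat and lon are in degrees. Not valid at the poles.
    """
    w = steering_weight(intensity_ms)
    u = w * wind_low[0] + (1.0 - w) * wind_high[0] + beta_drift[0]
    v = w * wind_low[1] + (1.0 - w) * wind_high[1] + beta_drift[1]
    dlat = math.degrees(v * dt_s / EARTH_RADIUS_M)
    dlon = math.degrees(u * dt_s / (EARTH_RADIUS_M * math.cos(math.radians(lat))))
    return lat + dlat, lon + dlon
```

With no environmental wind, the storm in this sketch drifts poleward at the assumed beta-drift speed, roughly 0.08 degrees of latitude per hour for 2.5 m/s.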