Alternatively, rhythm could involve a measure specifically of periodic modulation patterns. Pitch and reverberation
may also implicate dedicated mechanisms. Pitch is largely conveyed by harmonically related frequencies, which are not made explicit by the pair-wise correlations across frequency found in our current model (see also Figure S5). Accounting for pitch is thus likely to require a measure of local harmonic structure (de Cheveigne, 2004). Reverberation is also well understood from a physical generative standpoint, as linear filtering of a sound source by the environment (Gardner, 1998), and is used to judge source distance (Zahorik, 2002) and environment properties. However, a listener has access only to the result of environmental filtering, not to the source or the filter, implying that reverberation must be reflected in something measured from the sound signal (i.e., a statistic). Our synthesis method provides
an unexplored avenue for testing theories of the perception of these sound properties.

One other class of failures involved mixtures of two sounds that overlap in peripheral channels but are acoustically distinct, such as broadband clicks and slow bandpass modulations. These failures likely result because the model statistics are averages over time, and combine measurements that should be segregated. This suggests a more sophisticated form of estimating statistics, in which averaging is performed after (or in alternation with) some sort of clustering operation, a key ingredient in recent models of stream segregation (Elhilali and Shamma, 2008).

Recognition is challenging because the sensory input arising from different exemplars of a particular category
in the world often varies substantially. Perceptual systems must process their input to obtain representations that are invariant to the variation within categories, while maintaining selectivity between categories (DiCarlo and Cox, 2007). Our texture model incorporates an explicit form of invariance by representing all possible exemplars of a given texture (Figure S2) with a single set of statistic values. Moreover, different textures produce different statistics, providing an implicit form of selectivity. However, our model captures texture properties with a large number of simple statistics that are partially redundant. Humans, in contrast, categorize sounds into semantic classes, and seem to have conscious access to a fairly small set of perceptual dimensions. It should be possible to learn such lower-dimensional representations of categories from our sound statistics, combining the full set of statistics into a small number of “metastatistics” that relate to perceptual dimensions. We have found, for instance, that most of the variance in statistics over our collection of sounds can be captured with a moderate number of their principal components, indicating that dimensionality reduction is feasible.
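
The dimensionality-reduction observation above can be illustrated with a minimal sketch. It is not the analysis code used for the paper: the data are synthetic (a hypothetical matrix with one row per sound and one column per statistic, generated from a few latent dimensions plus noise), the function names `explained_variance_ratio` and `components_needed` are ours, and the 90% threshold is an arbitrary choice for illustration. The sketch standardizes each statistic, obtains the principal-component variances from a singular value decomposition, and counts how many components are needed to reach a target fraction of the total variance.

```python
import numpy as np


def explained_variance_ratio(stats):
    """Fraction of total variance captured by each principal component.

    stats: array of shape (n_sounds, n_statistics), one row per sound.
    Returned ratios are sorted from largest to smallest and sum to 1.
    """
    # Center and standardize each statistic so that differences in scale
    # between statistics do not dominate the decomposition.
    centered = stats - stats.mean(axis=0)
    scale = centered.std(axis=0)
    scale[scale == 0] = 1.0  # constant columns stay at zero after centering
    z = centered / scale
    # Squared singular values are proportional to the variance along each
    # principal component; the constant of proportionality cancels below.
    singular_values = np.linalg.svd(z, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


def components_needed(ratios, target=0.9):
    """Smallest number of leading components whose ratios sum to >= target."""
    cumulative = np.cumsum(ratios)
    k = int(np.searchsorted(cumulative, target)) + 1
    return min(k, len(ratios))  # guard against floating-point shortfall


if __name__ == "__main__":
    # Synthetic stand-in for a collection of sounds (an assumption, not the
    # paper's data): 200 "sounds", 60 "statistics", 5 underlying dimensions.
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(200, 5))
    mixing = rng.normal(size=(5, 60))
    stats = latent @ mixing + 0.1 * rng.normal(size=(200, 60))

    ratios = explained_variance_ratio(stats)
    print("components needed for 90% of variance:", components_needed(ratios))
```

On real sound statistics, the shape of the cumulative variance curve, rather than any single threshold, would indicate how many "metastatistics" are worth retaining; whether those components align with perceptual dimensions is an empirical question the sketch does not address.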