In order to do inference, we constantly need to make use of categories and concepts: it is neither possible nor desirable to deal with every unique arrangement of quarks and leptons on an individual basis. Fortunately, we can talk about repeatable higher-level regularities in the world instead: we can distinguish particular configurations of matter as instantiations of object concepts like chair or human, and say that these objects have particular properties, like red or alive.
The sheer number of distinct configurations in which matter could be arranged is unimaginably vast, but the superexponential conceptspace of the number of different ways to categorize these possible objects is even vaster.
For example, given an object that can either have or not have each of n properties, there are 2^n different descriptions corresponding to the possible objects of that kind (a number exponential in n). The number of possible concepts, each of which either includes a given description or doesn't, is one exponential higher: 2^(2^n)
Without an inductive bias, restricting attention to only a small portion of possible concepts, it's not possible to navigate the conceptspace: to learn a concept, a "fully general" learner would need to see all the individual examples that define it. Using probability to mark the extent to which each possibility belongs to a concept is another approach to express prior information and its control over the process of learning.