Reconceiving Machine Learning

Machine learning is a cottage industry, not an engineering discipline. Nearly every new problem is solved from scratch. There is no reuse of previous solutions. There is no language to describe problems.

This lack of modularity and composability limits the advancement of the field. Reinvention is rife.

My research agenda is to develop a composable basis for machine learning. My detailed plan is partially explained in this formal research proposal.

One way of summarising what I want to do is captured by the following seven "ilities", which are central to NICTA's overall strategic plan for Making Sense of Data (MSOD).

P1.          Composability

o   This is software reuse, algorithm/method reuse, and conceptualisation of problems in a modular way. It means building software components that can be glued to one another and to software that does other things. It is also reuse at the problem level.

o   One wants to be able to shrink-wrap the components, that is, to describe them in terms of the problems they solve. This requires a taxonomy and structure of problems.

o   Compare the declarative and procedural approaches: in order to compose, one needs to describe clearly what each component does. This suggests the need for improved languages and frameworks for doing so.

o   Composability is perhaps the most important design principle used to build any large system, and it sits at the centre of notions of peer-to-peer (see the Peer-to-peer manifesto for how far this idea can be taken).

o   The component architecture needs to occur at multiple levels of abstraction: individual software components and data components; aggregations of software (via middleware etc) through to a componentised way of viewing algorithms and problems.

o   The composability should reach through to control of sensing – choice of sensing.

o   We want composability to inform all design decisions in constructing the software platform and in addition to influence the very way any new MSOD problem is approached.
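
As a rough illustration of the shrink-wrapping idea, the Python sketch below attaches a declaration of what each component consumes and produces to the code that does the work, and refuses to glue together components whose declarations do not match. Everything in it is hypothetical: the names Component, consumes, produces and then, and the string labels standing in for a real taxonomy of problems, are assumptions made for the example and do not come from any existing platform.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Component:
    """Hypothetical shrink-wrapped component: code plus a declaration of what it does."""
    name: str
    consumes: str               # label for the kind of input (stand-in for a problem taxonomy)
    produces: str               # label for the kind of output
    run: Callable[[Any], Any]   # the procedural part

    def then(self, other: "Component") -> "Component":
        """Compose two components, checking only their declarations."""
        if self.produces != other.consumes:
            raise TypeError(
                f"cannot compose {self.name} -> {other.name}: "
                f"{self.produces!r} does not match {other.consumes!r}"
            )
        return Component(
            name=f"{self.name} >> {other.name}",
            consumes=self.consumes,
            produces=other.produces,
            run=lambda x: other.run(self.run(x)),
        )


# Two toy components (assumed for illustration only).
centre = Component(
    "centre", "real_vector", "real_vector",
    lambda xs: [x - sum(xs) / len(xs) for x in xs],
)
sign = Component(
    "sign", "real_vector", "binary_labels",
    lambda xs: [1 if x > 0 else -1 for x in xs],
)

pipeline = centre.then(sign)
labels = pipeline.run([1.0, 2.0, 6.0])   # mean is 3.0, so labels == [-1, -1, 1]
```

The composite is itself a Component, so it can be composed again; composing in the wrong order (sign.then(centre)) fails on the declarations alone, before any data is touched. A string label is far too weak a description language for real use, which is exactly the gap the bullets above point at.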

P2.          Scalability

o   Massive data sets are becoming more prevalent. A major bottleneck for many applications is the computation required. The challenge is therefore to develop means of making sense of data that exploit the complex computational hardware available and that avoid computation unlikely to pay off.

o   The data may not all be in one place: it may be distributed or streaming.

o   Addressing this (competitively!) requires one to exploit fully the available hardware (which can vary significantly). It is not sufficient to just develop parallelisable algorithms, although that is a start. There are deep interactions with issues such as composability when one considers parallel implementations.

P3.          Embeddability

o   In order to have a large impact, MSOD technology needs to be embedded pervasively. This connects to Embedded Systems but is broader. It also captures the notion of a system in the loop (control). More generally, it is all about making the technology easy to use.

o   Faster processing allows embeddability, but creates a challenge too – how to explore architectures (FPGA, GPU, multicore) in a manner that does not lock in particular techniques

o   Choice of languages and information architecture is crucial

o   It is related to the notion of distributed inference or taking the inference to the data rather than the data to the inferrer.

P4.          Layerability

o   By this I mean the ability to build hierarchical systems. It is related to composability but is different. At present much of MSOD starts from scratch each time. Furthermore, the typical notion is that you have a database, you make sense of it, and then you stop. But you will often want to make databases of things you have already made sense of. There are many challenges to do with how to represent the outputs of an MSOD system, and how to combine them with reasoning systems and notions such as ontologies.

o   This is not merely building multi-layer representations, although they are in some way relevant.

o   Bridge the symbolic – sub-symbolic gap; take the output of some MSOD technology as the input for others (e.g. reasoning with probability distributions (or other uncertainty calculi primitives) rather than raw data).

o   This suggests a need to consider alternative uncertainty calculi in MSOD (hardly studied to date)

o   This is related to the issue of working at the right level of abstraction
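
To make the idea of reasoning with probability distributions rather than raw data a little more concrete, here is a minimal Python sketch of two layers. The first turns a raw reading into a distribution over labels; the second consumes only those distributions. The functions first_layer and prob_any_fault, the "fault"/"ok" labels, and the independence assumption in the second layer are all invented for the example.

```python
from typing import Dict, List

Distribution = Dict[str, float]   # label -> probability


def first_layer(reading: float) -> Distribution:
    """Hypothetical lower layer: map a raw reading in [0, 1] to a belief over labels."""
    p = min(max(reading, 0.0), 1.0)
    return {"fault": p, "ok": 1.0 - p}


def prob_any_fault(beliefs: List[Distribution]) -> float:
    """Hypothetical upper layer: reason over beliefs (assuming independence), not raw data."""
    p_none = 1.0
    for belief in beliefs:
        p_none *= belief["ok"]
    return 1.0 - p_none


readings = [0.25, 0.25, 0.25]
beliefs = [first_layer(r) for r in readings]

# Passing distributions upward keeps the uncertainty available to the next layer.
p_any = prob_any_fault(beliefs)

# Passing hard labels upward discards it: every reading thresholds to "ok".
hard_labels = ["fault" if b["fault"] > 0.5 else "ok" for b in beliefs]
```

With these readings no single item looks faulty once thresholded, yet the upper layer, working from the distributions, puts the chance of at least one fault above one half. Which uncertainty representation should cross the boundary between layers is the open question; probabilities are just the most familiar choice.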

P5.          Reliability

o   In order to achieve larger impact, MSOD technology needs to be made reliable. The challenges here are to develop better understanding of performance, new ways of determining performance etc.

o   Furthermore, many real problems are plagued with imperfections that affect reliability (missing data, noise, misleading and mislabelled data etc)

o   At present performance assessment is typically considered something done after the construction of a solution; but there are frameworks for estimating how well you are doing as you do the inference. Indeed the whole issue of performance assessment cries out for systematisation. This would allow a much richer set of possibilities to be deployed for any particular problem.

o   Making performance assessment itself composable and layerable is a particular challenge. With the exception of a pure Bayesian approach, none of the standard methods compose at all.
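
One way to estimate how well you are doing as you do the inference is to score each prediction before the learner sees the corresponding outcome, often called prequential or test-then-train evaluation. It is used here as one illustrative example of such a framework; the text above does not single it out. The Python sketch below is a minimal version under stated assumptions: the function prequential_loss, the squared-error loss and the running-mean learner are all chosen for the example.

```python
from typing import Any, Callable, Iterable, List, Tuple


def prequential_loss(
    stream: Iterable[Tuple[Any, float]],
    predict: Callable[[Any, Any], float],
    update: Callable[[Any, Any, float], Any],
    state: Any,
) -> Tuple[Any, List[float]]:
    """Test-then-train: score each prediction before learning from its outcome."""
    total = 0.0
    running_average = []
    for n, (x, y) in enumerate(stream, start=1):
        y_hat = predict(state, x)        # predict first ...
        total += (y - y_hat) ** 2        # ... score with squared error ...
        state = update(state, x, y)      # ... only then learn from the outcome
        running_average.append(total / n)
    return state, running_average


# A toy learner (assumed for illustration): predict the running mean of past outcomes.
def mean_predict(state, x):
    count, mean = state
    return mean


def mean_update(state, x, y):
    count, mean = state
    count += 1
    return (count, mean + (y - mean) / count)


stream = [(None, 2.0), (None, 2.0), (None, 2.0)]
final_state, history = prequential_loss(stream, mean_predict, mean_update, (0, 0.0))
# history[k] is the average loss over the first k + 1 predictions
```

The performance estimate is available at every step and needs no held-out data, since each outcome serves as a test case before it serves as training data. How such an estimate should be combined across components or layers is not addressed by this sketch.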

P6.          Plurality

o   Real problems are not solved by techniques of just one flavour. But much MSOD research is heavily technique-driven. A challenge is to turn the field around to become more problem-driven, technique-eclectic and pluralistic. In order to do that, some languages for comparing and relating problems as well as techniques need to be developed.

o   Plurality can occur at all different levels of abstraction:

§  Platform: it is not necessary to impose a rigid monolithic coding framework to allow different software components to interoperate. Whilst this has been generally understood for a long time (consider, for example, the .NET Framework, the principal design features of which include interoperability, a common runtime engine, language independence and portability / platform agnosticism), most MSOD solutions (it is probably misleading to describe them as “platforms”) require you to change too much of what you currently do.

P7.          Auditability  

o   This is not the same as reliability. It concerns being able to understand the inductive inference that has occurred. It is related to reasoning systems. But typically inductive and deductive problems are addressed by distinct communities.

o   How should reasoning be structured so that inductive inference can best plug into it? Conversely, how should the inductive step be structured (and evaluated) such that it fits best into a higher-order reasoning system?

o   These sorts of questions are hardly even asked at present because there is no common task language that would allow them to be posed precisely enough to be solved.