The most important task is learning a large enough
feature representation. This can also be called
solving the perception problem, or perceptual AI.
The representation can be learned by feeding
enough data to an unsupervised learning algorithm.
There are many unsupervised learning algorithms;
recently popular ones are stacked autoencoders
(and their generalization, encoder graphs). It
might be that only a single layer is required, in
which case something similar to a large Singular
Value Decomposition would suffice.
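As a rough sketch of the single-layer case, here is how a linear "featurizer" could be fitted with a truncated SVD over character n-gram counts; the tiny corpus, the character n-grams, and the two components are illustrative assumptions, and scikit-learn is just one convenient implementation.

    # Minimal sketch of the "single layer" case: character n-gram counts reduced by a
    # truncated SVD. The tiny corpus and 2 components are illustrative placeholders;
    # a real representation would be trained on far more text with far more components.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import TruncatedSVD

    corpus = [
        "the cat sat on the mat",
        "the dog sat on the log",
        "a program that prints hello world",
        "unlabelled text scraped from the web",
    ]

    vectorizer = CountVectorizer(analyzer="char", ngram_range=(1, 3))
    counts = vectorizer.fit_transform(corpus)      # raw counts, no labels involved
    featurizer = TruncatedSVD(n_components=2)      # the "single layer" representation
    features = featurizer.fit_transform(counts)    # one dense feature vector per document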
The data for this would be videos (images + audio),
audio, images, and text. None of it needs to be
labelled. It is sufficient to scrape data from the
web, such as from Common Crawl, and feed it in.
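A hedged sketch of the data side, assuming the scraped text has already been saved to local plain-text files (Common Crawl's WET files, for instance, are extracted plain text); the data/ path and *.txt layout are hypothetical.

    # Sketch: stream unlabelled text from disk so it can be fed to the featurizer above.
    # The "data/*.txt" layout is hypothetical; any dump of scraped text would do.
    import glob

    def iter_documents(pattern="data/*.txt"):
        for path in glob.glob(pattern):
            with open(path, encoding="utf-8", errors="ignore") as f:
                for line in f:
                    line = line.strip()
                    if line:
                        yield line             # no labels anywhere, just raw text

    corpus = list(iter_documents())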
To make a speaking computer, you would train
perceptrons on the outputs of the feature
representation. A perceptron is a simple machine
learning algorithm that is trained to answer a
yes/no question. A typical adult has a vocabulary
of between 10,000 and 50,000 words, and a
perceptron would be trained for each of those
words. The parameters for each perceptron would
number in the millions to billions. Modern software
such as Vowpal Wabbit can handle this.
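A minimal sketch of that supervised step, with placeholder features and labels: one binary perceptron per vocabulary word, each answering whether its word should be said for a given input. scikit-learn's Perceptron stands in here for a large-scale learner such as Vowpal Wabbit.

    # Sketch: one yes/no perceptron per vocabulary word, trained on featurized inputs.
    # The 3-word vocabulary, 2-column features, and random labels are placeholders;
    # real feature vectors would have millions/billions of columns.
    import numpy as np
    from sklearn.linear_model import Perceptron

    vocabulary = ["hello", "cat", "mat"]                 # a real one has 10,000-50,000 words
    X = np.random.rand(100, 2)                           # stand-in for featurizer outputs
    labels = {w: np.random.randint(0, 2, size=100) for w in vocabulary}

    perceptrons = {}
    for word in vocabulary:
        clf = Perceptron()
        clf.fit(X, labels[word])                         # supervised: needs labelled pairs
        perceptrons[word] = clf

    # To "speak", ask every perceptron whether its word applies to a new input.
    spoken = [w for w, clf in perceptrons.items() if clf.predict(X[:1])[0] == 1]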
The Steps
- train a large feature representation
(unsupervised learning). Requires no labelled data.
- train perceptrons for each specific ability you
want (supervised learning). Requires labelled data
(input-output pairs).
Using the above approach, unfriendly AI is not
possible unless you explicitly train it on a
supervised dataset to be "evil", or feed it a
dataset of a person's life so that it carries out
what that person would do.
The large feature representation only needs to
be trained once. Afterwards, specific applications
can be built on this base, and these could be as
powerful as any human ability.
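A short sketch of the train-once idea, assuming the fitted vectorizer and featurizer objects from the earlier sketch; joblib is one common way to persist them, and the file name is arbitrary.

    # Sketch: persist the representation trained once, then reuse it in later applications.
    import joblib

    joblib.dump((vectorizer, featurizer), "featurizer.joblib")   # done once, offline

    # A later application loads the same base and only trains its own perceptrons on top.
    vectorizer, featurizer = joblib.load("featurizer.joblib")
    base = featurizer.transform(vectorizer.transform(["some new input text"]))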
For example, to train a "computer programmer",
give it a dataset that consists of objectives and
the stream of activity of a human programmer.
For example,
- objective : "write a program that prints hello world in python"
- output : the keyboard character stream of the programmer
who completed this task.
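One record of such a dataset might look like the sketch below; the field names and the keystroke string are illustrative, not a fixed schema.

    # Sketch: a single objective/keystroke record for the "computer programmer" dataset.
    record = {
        "objective": "write a program that prints hello world in python",
        "keystrokes": 'print("hello world")\n',   # the character stream the programmer typed
    }
    dataset = [record]                            # many such pairs in practice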
You would gather more data of this type. You
would then chop the keyboard stream into pieces,
one piece for every character typed. You would
then have rows of data. Each row would have one
output column, the character, and a large set of
input columns: the characters that were typed
before that particular instance of the character.
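A minimal sketch of that chopping step: slide over the keystroke stream and emit one row per character, with the preceding characters as the input and that character as the output. The context length of 10 is an arbitrary choice.

    # Sketch: turn a keystroke stream into (preceding characters, next character) rows.
    def make_rows(stream, context=10):
        rows = []
        for i, ch in enumerate(stream):
            preceding = stream[max(0, i - context):i]   # input columns
            rows.append((preceding, ch))                # (input, output)
        return rows

    rows = make_rows('print("hello world")\n')
    # rows[5] == ('print', '(')  -- the five characters typed so far predict '('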
Example: A simple "the cat sat on the mat" speaking AI
Splitting the sentence
"the cat sat on the mat" at the "t" of "(sa)t" would give
- output : t
- input : t,h,e, ,c,a,t, ,s,a
The input would be put through the "featurizer",
and the output of that, which would constitute
millions/billions of columns, would be fed into
the perceptron for the character "t", which
answers yes/no to whether a "t" should be typed.
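Putting the example together as a self-contained sketch: chop the sentence into rows, run the contexts through a small stand-in featurizer, and train one yes/no perceptron per character. The n-gram settings, five SVD components, and single training sentence are illustrative only.

    # Sketch: end-to-end "the cat sat on the mat" speaking AI, illustrative scale only.
    import numpy as np
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.linear_model import Perceptron

    sentence = "the cat sat on the mat"
    # One row per character: (preceding characters, character to type).
    rows = [(sentence[:i], sentence[i]) for i in range(1, len(sentence))]

    # Stand-in "featurizer": character n-gram counts reduced by a small SVD.
    # A real representation would have millions/billions of columns.
    vectorizer = CountVectorizer(analyzer="char", ngram_range=(1, 3))
    counts = vectorizer.fit_transform([ctx for ctx, _ in rows])
    featurizer = TruncatedSVD(n_components=5)
    X = featurizer.fit_transform(counts)

    # One yes/no perceptron per character that appears in the sentence.
    chars = sorted(set(ch for _, ch in rows))
    perceptrons = {
        c: Perceptron().fit(X, np.array([1 if ch == c else 0 for _, ch in rows]))
        for c in chars
    }

    # Ask every perceptron about the context "the cat sa"; with enough data the
    # "t" perceptron is the one that should answer yes.
    context = featurizer.transform(vectorizer.transform(["the cat sa"]))
    guesses = [c for c, clf in perceptrons.items() if clf.predict(context)[0] == 1]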
In Summary
It is possible to create AI, for any precise
definition of the term, by training a large enough
feature representation once and then giving it
datasets that correspond to the abilities you
would like it to achieve. It is possible to do
this now with current supercomputers, datasets,
and algorithms.