Curso IA programamos.es -- cómo entienden el mundo los orden (eng) - Contenido educativo
People perceive the world through their senses, but how do machines perceive
00:00:01
the world? Computers use different types of sensors, like microphones,
00:00:05
cameras, radars, or GPS receivers, among others, to receive information from the environment
00:00:11
that surrounds them and build a representation of their surroundings. But computers can only
00:00:17
work with numbers, so all the information they receive from their sensors has to be
00:00:22
stored as a set of numbers. For example, a black and white image is encoded
00:00:27
as a matrix of numbers, where each value indicates the brightness of each pixel.
00:00:31
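To make the idea of an image as a matrix concrete, here is a minimal sketch. The 5x5 image, the 0 to 255 brightness scale, and the `render` helper are assumptions chosen for illustration; they do not come from the video.

```python
# A tiny 5x5 grayscale "image": 0 is black, 255 is white (an assumed 8-bit scale).
image = [
    [0,   0,   0,   0, 0],
    [0, 255, 255, 255, 0],
    [0, 255,   0, 255, 0],
    [0, 255, 255, 255, 0],
    [0,   0,   0,   0, 0],
]


def render(matrix, threshold=128):
    """Draw the matrix as text: '#' for bright pixels, '.' for dark ones."""
    return "\n".join(
        "".join("#" if value >= threshold else "." for value in row)
        for row in matrix
    )


# The computer stores only the numbers; the picture is our reading of them.
print(render(image))
```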
If the image is in color, three numbers are stored for each pixel, representing the
00:00:36
brightness of the red, green, and blue components. Sounds are also encoded as a
00:00:41
series of numbers, indicating the waveform values at different moments,
00:00:46
taking hundreds or thousands of samples per second.
00:00:51
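The same idea can be sketched for color and sound. The example below assumes a 0 to 255 scale for each color component and a pure 440 Hz tone sampled 8000 times per second; these values and the `sample_tone` helper are illustrative choices, not details from the video.

```python
import math

# One color pixel: three brightness values (red, green, blue), assumed 0-255.
orange_pixel = (255, 165, 0)


def sample_tone(frequency_hz, sample_rate_hz, duration_s):
    """Return the waveform values of a pure tone, measured at regular moments."""
    n_samples = round(sample_rate_hz * duration_s)
    return [
        math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]


# 10 milliseconds of sound at 8000 samples per second become 80 numbers.
samples = sample_tone(frequency_hz=440, sample_rate_hz=8000, duration_s=0.01)
print(orange_pixel)
print(len(samples), [round(s, 3) for s in samples[:4]])
```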
Does the fact that a machine can receive information from the world already make it an artificial intelligence system?
00:00:54
Well, no. For us to consider it one, it needs to be able to extract meaning from that information.
00:01:01
Let's think about a supermarket door that opens when a sensor detects movement.
00:01:07
The system is too simple to be able to perceive who or what is entering
00:01:12
and make decisions based on this meaning.
00:01:17
And thanks to this limitation, we can enjoy wonderful videos of wild animals strolling through supermarket aisles, as Touretzky and Gardner-McCune joke in their chapter on AI literacy in this magnificent work.
00:01:20
But how do computers extract meaning from a set of numbers that represents
00:01:35
an image, for example?
00:01:42
This signal-to-meaning transformation occurs in progressive stages through
00:01:44
a process called feature extraction.
00:01:50
On the screen, we have an image of a handwritten number 4 that the computer
00:01:59
has already encoded as a matrix of numbers from its camera data.
00:01:59
But how could it know that it is a 4 and not a 1 or a 7?
00:02:04
By looking for specific combinations of values representing light and dark pixels
00:02:09
in small areas of the image, in this case 3x3 pixels, the location and orientation
00:02:20
of the different edges in the image can be detected.
00:02:20
Thus, the result of applying a filter to detect left edges is shown in the
00:02:24
image on the right, where areas detected as left edges appear
00:02:30
marked in red. The opposite areas, in this case the right edges, are shown in blue.
00:02:34
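The filtering step described above can be sketched in a few lines. The kernel values below are an assumption (a common vertical-edge filter), since the video does not state which numbers it uses, and `apply_filter` is a hypothetical helper name. Positive results mark a dark-to-bright change from left to right, presumably what the video shows in red as left edges. Negative results mark the opposite change, presumably the blue right edges.

```python
# Assumed 3x3 filter: positive where pixels go from dark (left) to bright (right).
LEFT_EDGE = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]


def apply_filter(image, kernel):
    """Slide the 3x3 kernel over the image and sum the element-wise products."""
    height, width = len(image), len(image[0])
    output = []
    for r in range(height - 2):
        row = []
        for c in range(width - 2):
            total = sum(
                kernel[i][j] * image[r + i][c + j]
                for i in range(3)
                for j in range(3)
            )
            row.append(total)
        output.append(row)
    return output


# A bright vertical stroke (1) on a dark background (0).
image = [
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
]

response = apply_filter(image, LEFT_EDGE)
print(response)  # [[3, 3, -3, -3]]
```

The two positive values sit over the left side of the stroke and the two negative values over its right side. A filter for upper edges would use the same procedure with the kernel rotated so that the rows, rather than the columns, carry the -1 and 1 values.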
Now let's apply a filter to detect upper edges. See? So, through this
00:02:40
progressive, multi-stage process of feature extraction, where different
00:02:49
types of filters are used and combined, a signal is transformed into meaning. With sounds, something
00:03:02
very similar is done, for example for speech recognition, since each vowel and each
00:03:02
consonant can be associated with different patterns in a spectrogram, which is a visual
00:03:13
representation that makes it possible to identify the variations in frequency and intensity of the
00:03:13
sound. But there are AI systems that can not only convert
00:03:22
audio into text, but also seem to understand these texts. But how can this be?
00:03:22
How is this possible? Well, that's precisely what we'll see in the next video.
00:03:28
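The spectrogram mentioned earlier for speech recognition can be sketched as follows: the sound is cut into short frames, and the strength of each frequency is measured in every frame. This is a minimal illustration using a naive discrete Fourier transform. The sample rate, frame size, test tones, and function names are all assumptions, and real speech systems use faster and more refined methods.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Strength of each frequency bin in one frame (naive discrete Fourier transform)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2)
    ]


def spectrogram(samples, frame_size):
    """Split the samples into consecutive frames and analyze each one."""
    return [
        dft_magnitudes(samples[start:start + frame_size])
        for start in range(0, len(samples) - frame_size + 1, frame_size)
    ]


SAMPLE_RATE = 800  # samples per second (assumed for the example)
FRAME_SIZE = 80    # each frequency bin then spans 10 Hz

# Half a second of a 100 Hz tone followed by half a second of a 200 Hz tone.
signal = [math.sin(2 * math.pi * 100 * n / SAMPLE_RATE) for n in range(400)]
signal += [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(400)]

frames = spectrogram(signal, FRAME_SIZE)
for magnitudes in (frames[0], frames[-1]):
    strongest_bin = magnitudes.index(max(magnitudes))
    print(strongest_bin * SAMPLE_RATE / FRAME_SIZE, "Hz")  # 100.0 Hz, then 200.0 Hz
```

Each row of `frames` is one moment in time and each column one frequency, so the change from the first tone to the second shows up as the strongest column moving from one row to a later one.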
- Language(s):
- Subtitle language(s):
- Author(s):
- Programamos.es
- Uploaded by:
- David G.
- License:
- Attribution
- Views:
- 12
- Date:
- March 29, 2024 - 17:53
- Visibility:
- Public
- School:
- IES MARIE CURIE Loeches
- Duration:
- 03′ 33″
- Aspect ratio:
- 1.78:1
- Resolution:
- 854x480 pixels
- Size:
- 20.11 MBytes