Recognizing audio - Educational content

Uploaded on 6 August 2025 by David G.

11 views

When you access the application at urlv2-learningml.org, alongside text, image and number recognition, 00:00:00
you'll see there's a new type of recognition, which is sound. 00:00:16
Let's see how it works. 00:00:19
We click on it and the three phases of supervised learning appear, just like in the rest of the recognitions. 00:00:22
The training phase to collect data, learning to build the model and the testing phase. 00:00:27
We're going to build a model that will be capable of distinguishing my voice from a whistle and from the background noise in the room. 00:00:33
So we create the three classes we need, in this case voice, whistle and background. 00:00:39
Good, and now it's about adding examples of voice, whistle or background sound. 00:00:54
I'm going to start with the voice because while I speak and explain how the recording works we'll be collecting voice samples. 00:00:59
When we want to collect sound samples, we simply click record and then you'll see it will start 00:01:05
collecting sample recordings of about one second duration, more or less, and automatically; that is, 00:01:11
it will keep recording until we stop it. If we press stop, the recording stops. In this case 00:01:16
it has collected 12 recordings of approximately one second of my voice. If we want to play them back 00:01:21
to see what it has recorded, we click here and we see how it has been collecting the different things 00:01:26
I've been saying. Here the interesting thing is mainly to capture the timbre, because that's what 00:01:31
this recognizes quite well: the timbre of sounds. If we don't like any of the samples, we can simply 00:01:36
delete it. Let's imagine we don't want number 12: we click on the trash can button and it's deleted. 00:01:42
Now we're going to take sound samples. It's very important that, since the samples we take are 00:01:46
about one second long, what gets recorded during that second or so really is what we want. 00:01:52
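The automatic collection of roughly one-second samples described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `split_into_clips`, the tiny sample rate and the synthetic recording are hypothetical, and the video does not show how LearningML actually segments audio.

```python
def split_into_clips(samples, sample_rate, clip_seconds=1.0):
    """Split a list of audio samples into consecutive fixed-length clips.

    Trailing samples that do not fill a whole clip are dropped.
    """
    clip_len = int(sample_rate * clip_seconds)
    if clip_len <= 0:
        raise ValueError("clip length must be at least one sample")
    n_clips = len(samples) // clip_len
    return [samples[i * clip_len:(i + 1) * clip_len] for i in range(n_clips)]


# Hypothetical 12.5-second recording at 8 samples per second (tiny on purpose).
recording = list(range(100))
clips = split_into_clips(recording, sample_rate=8)
print(len(clips), "clips of", len(clips[0]), "samples each")

# Discarding an unwanted sample, like pressing the trash can on number 12.
kept = clips[:-1]
print(len(kept), "clips kept")
```

Because every clip is so short, a single second of the wrong sound produces a wholly mislabeled sample, which is why reviewing and deleting bad clips matters.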
That's why it's good to review afterwards how the samples were collected, to see if it really 00:01:57
recorded what we wanted. We always have to keep in mind that data quality is fundamental to then 00:02:02
obtain a good model. Well, let's collect whistle sounds. I'll stop it. Well, 13 samples. Since the 00:02:08
number 13 brings bad luck and we're going to be a little superstitious, we'll take advantage and 00:02:33
delete the last sample. Good, and now we're going to take 12 background samples. I'll simply press 00:02:37
record, stay quiet, and it will capture whatever ambient noise there is: a bit of the fan motor; anyway, 00:02:44
there's always noise everywhere we go. Good, 12 samples, more or less. Remember that it's important 00:02:50
that, whatever the data samples are, whether texts, numbers or, in this case, sound, 00:03:09
each class has more or less the same number of samples, which is what's called a balanced 00:03:15
data set. Good, we now have the sample data set. Now it's time for learning, that is, building the 00:03:20
model. We click here and, well, the machine learning algorithm will analyze that data to build a model 00:03:28
capable of recognizing those three timbres. Good, it has been trained. It took 9.3 seconds and now 00:03:33
we're going to test it. To test it, well, we do the same as when we collected data. We press the 00:03:46
record button, in this case from the testing phase and see what happens. Well, first we stay quiet to 00:03:53
see if it picks up the background. Perfect, it picked up the background noise. Now I'm going to 00:03:59
speak. Hello, hello, hello. And again it got it right, it recognized the voice. And now I'm going 00:04:06
to make a small whistle. And we see it recognized the whistle. And well, this is how to build sound 00:04:14
recognition models. Well, next I'm going to make a program with Scratch that uses the model we just 00:04:21
created for sound recognition. We click on the cat and we'll see that in the LearningML blocks 00:04:29
there's a new block called record audio. This block works very similarly to this record button: 00:04:34
when executed, it records a sound of approximately one second duration, and that sound is converted 00:04:40
into a vector, a multidimensional vector, which is what will really be passed to the machine 00:04:46
learning algorithm to recognize it. And how is classification performed? Well, as we do with the 00:04:51
rest of classification problems, with this classify item block. What happens is that here we're going 00:04:57
to place audio as an argument. Let's see, let's try it. First we're going to execute it with silence 00:05:01
to see if it detects the background. Very good, now I'm going to execute it while speaking. 00:05:09
Hello, hello, hello, hello. 00:05:20
And now I'm going to execute it while whistling. 00:05:23
As we can see, it works exactly the same 00:05:27
as the rest of the recognitions, 00:05:30
but in this case recording samples of one second duration. 00:05:33
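To make the idea of "a sound converted into a vector and then classified" more concrete, here is a minimal sketch. It is not LearningML's method: the video does not say which features or algorithm the tool uses. The two features (RMS energy and zero-crossing rate), the nearest-centroid rule, the helper names and the synthetic sine tones standing in for voice, whistle and background are all assumptions made purely for illustration.

```python
import math


def features(clip):
    """Turn a clip (list of floats) into a vector: [RMS energy, zero-crossing rate]."""
    rms = math.sqrt(sum(x * x for x in clip) / len(clip))
    crossings = sum(1 for a, b in zip(clip, clip[1:]) if (a < 0) != (b < 0))
    return [rms, crossings / (len(clip) - 1)]


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def classify(vector, centroids):
    """Return the label whose centroid is closest to the given vector."""
    return min(centroids, key=lambda label: math.dist(vector, centroids[label]))


def tone(freq, amplitude, rate=800, seconds=1.0):
    """Synthetic one-second stand-in for a recorded clip (hypothetical data)."""
    n = int(rate * seconds)
    return [amplitude * math.sin(2 * math.pi * freq * i / rate) for i in range(n)]


# Two synthetic training clips per class, so the classes are balanced.
training = {
    "voice": [tone(20, 0.8), tone(25, 0.7)],
    "whistle": [tone(200, 0.8), tone(220, 0.7)],
    "background": [tone(5, 0.01), tone(8, 0.02)],
}
centroids = {
    label: centroid([features(clip) for clip in clips])
    for label, clips in training.items()
}

# Classify a new clip, analogous to passing audio to the classify block.
prediction = classify(features(tone(210, 0.75)), centroids)
print(prediction)
```

A real tool would typically use a much richer vector and a trained model; the point here is only the flow from a one-second clip, to a vector, to a class label.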
And with this we could make some type of program. 00:05:37
For example, imagine making a model that recognizes the words up, down, left and right, 00:05:40
and then with Scratch making a program that moves the cat based on what the user is saying, 00:05:46
so that it goes up when up is said, down when down is said, etc. Well, that will be the subject of 00:05:52
another, later video. For now we'll stick with this so you get an idea of how this new LearningML 00:05:59
functionality works. 00:06:04
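The voice-controlled cat proposed at the end could be sketched like this in Python rather than Scratch. The step size, the coordinate convention and the `move` helper are hypothetical; the video only describes the idea and leaves the actual program for a later video.

```python
# Hypothetical mapping from recognized word to a movement of 10 units.
MOVES = {"up": (0, 10), "down": (0, -10), "left": (-10, 0), "right": (10, 0)}


def move(position, label):
    """Return the new (x, y) position after applying the recognized label."""
    dx, dy = MOVES.get(label, (0, 0))  # unknown labels leave the sprite in place
    x, y = position
    return (x + dx, y + dy)


# Simulated sequence of labels, as a sound model might return them one per second.
position = (0, 0)
for word in ["up", "up", "right", "down", "background"]:
    position = move(position, word)
print(position)
```

Ignoring labels that are not commands means a "background" class can be kept in the model so that silence does not move the cat.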
Language(s):
en
Subjects:
Technology
Tags:
Artificial Intelligence
Educational levels:
  • Compulsory Secondary Education
    • Ordinary
      • First Cycle
        • First Year
        • Second Year
      • Second Cycle
        • Third Year
        • Fourth Year
        • Curricular Diversification 1
        • Curricular Diversification 2
    • Compensatory
Author(s):
Juan David Rodríguez García
Uploaded by:
David G.
License:
Attribution - NonCommercial - ShareAlike
Views:
11
Date:
6 August 2025 - 0:29
Visibility:
Public
School:
IES MARIE CURIE Loeches
Duration:
06′ 07″
Aspect ratio:
1.78:1
Resolution:
1920x1080 pixels
Size:
73.51 MBytes
