Recognizing audio | Mediateca de EducaMadrid

When you access the application urlv2-learningml.org, in addition to text, image and number recognition, 00:00:00

you'll see there's a new type of recognition which is sound. 00:00:16

Let's see how it works. 00:00:19

We click on it and the three phases of supervised learning appear, just like in the rest of the recognitions. 00:00:22

The training phase to collect data, learning to build the model and the testing phase. 00:00:27

We're going to build a model that will be capable of telling my voice apart from a whistle and from the background noise in the room. 00:00:33

So we create the three classes we need, in this case voice, whistle and background. 00:00:39

Good, and now it's about adding examples of voice, whistle or background sound. 00:00:54

I'm going to start with the voice because while I speak and explain how the recording works we'll be collecting voice samples. 00:00:59

When we want to collect sound samples, we simply click record, and then you'll see it will start 00:01:05

collecting sample recordings of about one second each, more or less, and automatically; that is, 00:01:11

it will keep recording until we stop it. If we press stop, the recording stops. In this case 00:01:16

it has collected 12 recordings of approximately one second of my voice. If we want to play them back 00:01:21

to see what it has recorded, we click here and we hear how it has been collecting the different things 00:01:26

I've been saying. Here the interesting thing is to capture the timbre, because that's what 00:01:31

this recognizes quite well: the timbre of sounds. If we don't like any of the samples, we can simply 00:01:36

delete it. Let's imagine we don't want number 12: we click on the trash can button and it's deleted. 00:01:42

Now we're going to take sound samples. It's very important that the samples we take, since they're 00:01:46

one second long, really record what we want during that second or so of recording. 00:01:52

That's why it's good to review afterwards how the samples were collected, to see if it really 00:01:57

recorded what we wanted. We always have to keep in mind that data quality is fundamental to then 00:02:02

obtain a good model. Well, let's collect whistle sounds. I'll stop it. Well, 13 samples. Since 00:02:08

number 13 brings bad luck and we're going to be a little superstitious, we'll take advantage and 00:02:33

delete the last sample. Good, and now we're going to take 12 background samples. I'll simply press 00:02:37

record, stay quiet and it will capture the ambient noise there is, a bit of the fan motor, anyway 00:02:44

there's always noise everywhere we go. Good, 12 samples more or less. Remember that, 00:02:50

whatever the data samples are, whether images, texts, numbers or in this case sound, 00:03:09

it's important that each class has more or less the same number of samples, what's called a balanced 00:03:15

data set. Good, we now have the sample data set. Now it's time for learning, that is, building the 00:03:20

model. We click here and well, the machine learning algorithm will analyze that data to build a model 00:03:28

capable of recognizing those three timbres. Good, it has been trained. It took 9.3 seconds and now 00:03:33

we're going to test it. To test it, well, we do the same as when we collected data. We press the 00:03:46

record button, in this case from the testing phase and see what happens. Well, first we stay quiet to 00:03:53

see if it picks up the background. Perfect, it picked up the background noise. Now I'm going to 00:03:59

speak. Hello, hello, hello. And again it got it right, it recognized the voice. And now I'm going 00:04:06

to make a small whistle. And we see it recognized the whistle. And well, this is how to build sound 00:04:14

recognition models. Well, next I'm going to make a program with Scratch that uses the model we just 00:04:21

created for sound recognition. We click on the cat and we'll see that in the LearningML blocks 00:04:29

there's a new block called record audio. This block works very similarly to this record button: 00:04:34

when executed, it records a sound of approximately one second duration, and that sound is converted 00:04:40

into a vector, a multidimensional vector, which is what will really be passed to the machine 00:04:46

learning algorithm to recognize it. And how is classification performed? Well, as we do with the 00:04:51

rest of classification problems, with this classify item block. What happens is that here we're going 00:04:57

to place audio as an argument. Let's see, let's try it. First we're going to execute it with silence 00:05:01

to see if it detects the background. Very good, now I'm going to execute it while speaking. 00:05:09

Hello, hello, hello, hello. 00:05:20

And now I'm going to execute it while whistling. 00:05:23

As we can see, it works exactly the same 00:05:27

as the rest of the recognitions, 00:05:30

but in this case recording samples of one second duration. 00:05:33
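
The idea described above, a one-second recording turned into a multidimensional vector that is then classified, can be sketched roughly as follows. This is only an illustration under stated assumptions: the video does not say which features or which algorithm LearningML uses, so the per-frame energy and zero-crossing features, the nearest-centroid classifier, the 8000 Hz sample rate and the synthetic tones standing in for voice, whistle and background are all hypothetical.

```python
import math
import random

SAMPLE_RATE = 8000  # assumed sample rate in Hz; not stated in the video


def features(clip, frames=4):
    """Turn a clip into a vector: RMS energy and zero-crossing rate per frame."""
    size = len(clip) // frames
    vector = []
    for i in range(frames):
        frame = clip[i * size:(i + 1) * size]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        vector += [rms, crossings / len(frame)]
    return vector


def centroid(vectors):
    """Average several feature vectors component by component."""
    return [sum(column) / len(column) for column in zip(*vectors)]


def classify(vector, centroids):
    """Return the label whose centroid is closest to the vector."""
    return min(centroids, key=lambda label: math.dist(vector, centroids[label]))


def tone(freq, amplitude, rng):
    """Synthetic one-second clip: a sine tone plus a little noise."""
    return [amplitude * math.sin(2 * math.pi * freq * t / SAMPLE_RATE) + rng.gauss(0, 0.01)
            for t in range(SAMPLE_RATE)]


rng = random.Random(0)
# Hypothetical stand-ins for the recorded samples: three clips per class.
training = {
    "voice": [tone(f, 0.5, rng) for f in (140, 150, 160)],
    "whistle": [tone(f, 0.5, rng) for f in (1900, 2000, 2100)],
    "background": [tone(0, 0.0, rng) for _ in range(3)],
}
centroids = {label: centroid([features(c) for c in clips])
             for label, clips in training.items()}

print(classify(features(tone(2050, 0.5, rng)), centroids))
```

Each class contributes the same number of clips here, in line with the balanced data set mentioned earlier.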

And with this we could make some type of program. 00:05:37

For example, imagine making a model that recognizes the words up, down, left and right, 00:05:40

and then with Scratch make a program that moves the cat based on what the user is saying, 00:05:46

that goes up when up is said, down when down is said, etc. Well, that will be the subject of 00:05:52

another later video. For now we'll stick with this so you get an idea of how this new LearningML 00:05:59

functionality works. 00:06:04
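
The program suggested at the end, moving the cat according to the recognized word, could be sketched like this. It is written in Python rather than Scratch blocks, and the step size of 10, the `MOVES` table and the `move` function are hypothetical names chosen for illustration; the video only names the four words.

```python
# Hypothetical mapping from a recognized label to a change in (x, y).
# As on the Scratch stage, y grows upward.
MOVES = {"up": (0, 10), "down": (0, -10), "left": (-10, 0), "right": (10, 0)}


def move(position, label):
    """Return the new position after hearing a label; unknown labels do nothing."""
    dx, dy = MOVES.get(label, (0, 0))
    x, y = position
    return (x + dx, y + dy)


position = (0, 0)
# Labels as they might come back from the classifier, one per recording.
for heard in ["up", "up", "left", "background"]:
    position = move(position, heard)

print(position)  # (-10, 20)
```

Ignoring any label that is not one of the four words keeps the cat still when the model hears only background noise.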
