As we already know, computers can only work with numbers, and yet there are computer systems capable of understanding our texts. How does this happen? The mechanism used translates words or phrases into a numerical representation known as embeddings. As Jeremy Howard says in his book "Deep Learning for Coders with fastai and PyTorch: AI Applications Without a PhD", the artificial intelligence community sometimes likes to use rather pompous names for concepts that are actually very simple. And with embeddings, this is somewhat the case. Let's see how they are built.

Imagine we are in a situation where a numerical representation, using two numbers, has already been assigned to a set of words. Where would we place the word "apple"? Near position A there are several round objects. Near position B there are words related to construction. But at position C, we would have the word "apple" close to other words related to fruits. This would be a good location, since the goal of embeddings is for similar words to correspond to nearby points, and for different words to correspond to distant points.

Let's see another example. Suppose we have already assigned a numerical representation to the words "dog", "puppy", and "calf". Where would we place the word "cow"?
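The idea of "similar words at nearby points" can be sketched with a few hand-made two-number embeddings. The coordinates below are invented purely for illustration; real embeddings are learned from data, not written by hand.

```python
import math

# Hypothetical 2-number embeddings, invented for illustration only:
# fruit words cluster in one region, construction words in another.
embeddings = {
    "pear":   (8.0, 2.0),
    "banana": (8.5, 1.5),
    "brick":  (1.0, 9.0),
    "cement": (1.5, 8.5),
    "apple":  (8.2, 1.8),   # position C, near the fruits
}

def distance(a, b):
    """Euclidean distance between two 2-D embeddings."""
    return math.sqrt((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2)

apple = embeddings["apple"]
print(distance(apple, embeddings["pear"]))   # small: similar words, nearby points
print(distance(apple, embeddings["brick"]))  # large: different words, distant points
```

With "apple" placed at position C, its distance to the fruits comes out small and its distance to the construction words comes out large, which is exactly the property we want.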
All three positions could make sense, but if we place it at position C, we would be capturing some relationships between the words, which is precisely another goal of embeddings. In this case, we would be capturing two analogies. On one hand, "puppy" is to "dog" as "calf" is to "cow". And on the other, "puppy" is to "calf" as "dog" is to "cow". Thus, this embedding would be capturing two properties of the words: age and size. And that, basically, is what embeddings are. The difference is that the ones we use in real applications have hundreds or thousands of dimensions, meaning that a word is translated into a vector of hundreds or thousands of numbers.

As detailed in the article associated with this video, these embeddings allow visualizations and classroom activities that are very interesting and that could be the 21st-century equivalent of learning to explore a dictionary.

But these word embeddings have certain limitations when it comes to recognizing phrases, since the same word can mean different things depending on the context. Fortunately, since transformers were born, with their attention mechanism that allows understanding the context, we also have embeddings capable of coherently assigning a numerical representation to complete phrases.
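The two analogies can be expressed as vector arithmetic: the displacement that takes "puppy" to "dog" (a change in age) should also take "calf" to "cow". The toy vectors below are invented for illustration, with one coordinate standing in for age and the other for size; real embeddings learn such directions implicitly across hundreds of dimensions.

```python
# Toy 2-D embeddings, invented for illustration: the first number
# stands in for age, the second for size of the animal.
puppy = (0.0, 2.0)   # young, small
dog   = (1.0, 2.0)   # adult, small
calf  = (0.0, 9.0)   # young, large
cow   = (1.0, 9.0)   # adult, large  (position C)

def add(a, b):
    return (a[0] + b[0], a[1] + b[1])

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1])

# "puppy" is to "dog" as "calf" is to "cow":
# the same age displacement maps puppy -> dog and calf -> cow.
print(add(calf, sub(dog, puppy)))   # (1.0, 9.0), i.e. cow

# "puppy" is to "calf" as "dog" is to "cow":
# the same size displacement maps puppy -> calf and dog -> cow.
print(add(dog, sub(calf, puppy)))   # (1.0, 9.0), i.e. cow
```

Both analogies land on the same point, which is why placing "cow" at position C captures the structure between the four words.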
Thus, we can see that the phrase "I like basketball more than anything" is semantically closer to "I love basketball" than to the phrase "I love football", even though these last two share more words in common. And there are even multilingual phrase embeddings, where phrases that mean the same thing in different languages receive a close numerical representation.

As we will see in future installments, these word and phrase embeddings are the basis of large language models like GPT-3 and BLOOM. But until we get there, don't stop playing with the challenges and tasks we propose on our website, as they will allow you to interact directly with the internal workings of many of the artificial intelligence systems we use daily.
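The basketball example can be sketched with cosine similarity, the usual way of comparing phrase embeddings. The three vectors below are invented for illustration; in practice they would come from a trained sentence encoder, not be written by hand.

```python
import math

# Hypothetical sentence embeddings, invented for illustration.
vectors = {
    "I like basketball more than anything": [0.90, 0.80, 0.10],
    "I love basketball":                    [0.85, 0.75, 0.15],
    "I love football":                      [0.20, 0.30, 0.90],
}

def cosine(a, b):
    """Cosine similarity: close to 1.0 means very similar meaning."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query = vectors["I like basketball more than anything"]
print(cosine(query, vectors["I love basketball"]))  # high similarity
print(cosine(query, vectors["I love football"]))    # lower similarity
```

Even though "I love basketball" and "I love football" share more surface words, the embedding places the two basketball phrases closer together, because similarity is measured in meaning space, not by word overlap.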