As we already know, computers can only work with numbers, and yet there are computer systems that are capable of understanding our texts. How does this happen? The mechanism used is one that translates words or phrases into a numerical representation known as embeddings. As Jeremy Howard mentions in his book AI Applications Without a PhD, the artificial intelligence community sometimes likes to use rather pompous names for concepts that are actually very simple, and this is somewhat the case with embeddings.

Let's see how they are built. Imagine we are in a situation where a numerical representation has already been assigned to a set of words using two numbers. Where would we place the word apple? Near position A there are several round objects. Near B there are words that have to do with construction. But in position C we would have the word apple near others related to fruit. This would be a good location, since the objective of embeddings is that similar words correspond to nearby points and different words correspond to distant points.

Let's see another example. Suppose we have already assigned a numerical representation to the words dog, puppy and calf. Where would we place the word cow? All three positions could make some sense, but if we place it in position C we would be capturing some relationships between the words, which is precisely another of the objectives of embeddings. In this case we would be capturing two analogies: on the one hand, puppy is to dog as calf is to cow; and on the other, puppy is to calf as dog is to cow. Thus, this embedding would be capturing two properties of the words: age and size.

And that is basically what embeddings are. The difference is that the ones we use in real applications have hundreds or thousands of dimensions; that is to say, a word translates to a vector of hundreds or thousands of numbers.

As we detail in the article associated with this video, these embeddings allow for visualizations and classroom activities that are very interesting and that could be the 21st-century equivalent of learning to explore a dictionary. But these word embeddings have certain limitations when it comes to recognizing sentences, since the same word can mean different things depending on the context.
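To make the analogy idea concrete, here is a minimal sketch with toy two-dimensional vectors. The coordinates and the axis interpretations ("size" and "age") are invented for illustration and do not come from any real embedding model; the point is only to show that the offset from puppy to dog roughly matches the offset from calf to cow.

```python
import numpy as np

# Toy 2-D "embeddings": first coordinate ~ size, second coordinate ~ age.
# These numbers are made up purely to illustrate the analogy idea.
words = {
    "puppy": np.array([0.2, 0.1]),   # small, young
    "dog":   np.array([0.3, 0.9]),   # small, adult
    "calf":  np.array([0.8, 0.1]),   # large, young
    "cow":   np.array([0.9, 0.9]),   # large, adult
}

# If the embedding captures age and size, the vector offsets line up:
# puppy -> dog ("growing up") should be close to calf -> cow.
grow_up_small = words["dog"] - words["puppy"]
grow_up_large = words["cow"] - words["calf"]
print(grow_up_small, grow_up_large)  # both roughly [0.1, 0.8]

# Equivalently, we can "solve" the analogy puppy : dog :: calf : ?
candidate = words["calf"] + grow_up_small
closest = min(words, key=lambda w: np.linalg.norm(words[w] - candidate))
print(closest)  # -> "cow"
```

Real word embeddings work the same way, only with hundreds of dimensions learned from text rather than two hand-picked ones.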
Fortunately, since the arrival of transformers, with their attention mechanism that allows context to be understood, we now also have embeddings that are capable of assigning a numerical representation to complete sentences in a coherent way. Thus, we can see that the sentence "Nothing pleases me more than basketball" is semantically closer to "I love basketball" than "I love football" is, despite the fact that these last two share more identical words.

There are even multilingual sentence embeddings, in which sentences that mean the same thing in different languages receive close numerical representations.

As we will see in upcoming episodes, these word and sentence embeddings are the foundation of large language models like GPT-3 and BLOOM. But until we get there, keep playing with the challenges and tasks we propose on our website, as they will let you interact directly with the internal workings of many of the artificial intelligence systems we use daily.
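As a rough sketch of how the sentence comparison above could be reproduced in practice, the snippet below uses the sentence-transformers library with a multilingual model. The library and the model name paraphrase-multilingual-MiniLM-L12-v2 are assumptions for illustration, not something mentioned in the video; any sentence-embedding model would serve the same purpose.

```python
# Hedged sketch: requires `pip install sentence-transformers`.
from sentence_transformers import SentenceTransformer, util

# Assumed model choice; a multilingual sentence-embedding model.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

sentences = [
    "Nothing pleases me more than basketball",
    "I love basketball",
    "I love football",
]
embeddings = model.encode(sentences)  # one vector per sentence

# Cosine similarity: higher means semantically closer.
print(util.cos_sim(embeddings[0], embeddings[1]))  # "Nothing pleases..." vs. "I love basketball"
print(util.cos_sim(embeddings[1], embeddings[2]))  # "I love basketball" vs. "I love football"

# Because the model is multilingual, a translation should also land nearby,
# e.g. the Spanish "Me encanta el baloncesto" vs. "I love basketball".
print(util.cos_sim(model.encode("Me encanta el baloncesto"), embeddings[1]))
```

If the embedding behaves as described in the video, the first similarity should come out higher than the second even though the last two sentences share more words.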