
The AI content + data mandate and personal branding

  • Alan Morrison 

Fair Data Forecast Interview with Andrea Volpini, CEO of WordLift

Andrea Volpini believes every user who wants to build a personal brand online has to proactively curate their online presence first. He sees structured data (semantic entity and attribute metadata such as Schema.org markup) as key to building a cohesive, disambiguated personal presence on the web.

Volpini has been working with web technology since its early days in the 1990s. He latched onto Tim Berners-Lee's semantic web vision in the 2000s, and then pivoted to linked data in the late 2000s and early 2010s. Since 2012, he's been focused on knowledge graph curation as a key means of search engine optimization (SEO).

More recently, as CEO of WordLift, he’s helped customers add Schema markup to their websites automatically and led them to the beginnings of knowledge-graph based conversational AI. It’s a feedback loop approach, a way to harness the power of dialogue between humans and machines, just as SEO has been a means of getting feedback from search engines.

Unfortunately, as Volpini points out, the Attention Economy of the 2000s, by taking so much data private, has been our data equivalent of the Middle Ages: a form of data feudalism. As a result, users have been forced to become proactive about the content they're creating and to manage that content, and its data, as an asset.

In this Fair Data Forecast interview, Volpini says,

“Any individual that wants to create an impact needs to create content and data at the same time. Because without data, you are going to be controlled by others who will train better models and absorb whatever ability or expertise you might have.

“But if you start by creating your data, then it's like you are piling up some value or you're creating some assets. You can immediately create value with SEO, which is a good starting point.”

Volpini’s shed valuable light on the sometimes inscrutable topic of AI-assisted SEO. Hope you enjoy our conversation.

Interview with Andrea Volpini of WordLift

Edited Transcript

AM: Welcome to the Fair Data Forecast, Andrea Volpini. I’m just wondering how you got started with this years ago. I know WordLift has been around for several years. When did it start? 

AV: We incorporated the company about five years ago in 2017 and I was coming out of research work financed by the European Commission on semantic web technologies applied to content management systems. 

I've been working on the internet since the early days, the mid '90s. I specialized in web publishing platforms. The Internet service provider phase before that was primarily about connecting people.

And then I began working on creating a platform for publishing content on the web. As the complexity of the data we were managing grew, and as the web evolved to become more and more a place where people could find different types of services with rich data, I started to look for a solution for organizing this vast amount of content.

One of my previous companies worked on the content management system for the Italian Parliament. And so as a parliament you deal with a lot of proceedings and laws and different steps of the legislative process. 

And so as we were creating this content management system, we found a deep need for a standard that would allow us to add metadata to the vast amounts of content that made the website so important in those years.

I mean, we're talking about the 2000s, so very early phases of the digital ecosystem for public administration. And so all of a sudden I started to look at the inventor of the World Wide Web, Tim Berners-Lee, as a source of inspiration, as he was asking people to converge on this vision of the semantic web.

And so the first research programs were really in 2009, 2010, I think. At that time I was working with some universities here in Italy, trying to find different models for letting people organize ontologies for specific knowledge domains and publish content along with data on the web.

And so that’s how I got started.

AM: A lot of our readership will be familiar with “structured data” that’s in relational databases, but not so familiar with the kind of structured information that you’re well versed in. 

Can you paint a picture of a bit of the metadata landscape that you’re a part of and how that fits in with the more traditionally structured tabular data? 

AV: Let me continue a bit with the story of how we got into SEO and into creating a platform for building knowledge graphs. 

We were dealing with a content management system with all different types of databases, and people were reaching out, trying to get something that could be published on the Web. Because of course, it became immediately clear that if you represented a group inside a large organization, you needed to be facing the public on the web.

And this was in 2010 and 2011. But in order for tabular data to be represented as web pages, we needed to create some level of mapping between the structured content and whatever was available in the databases.

And so we started to deal with the problem of mapping data in and out and also structuring web pages that could represent this data. And then we realized that the Web could actually become a large database: rather than being made of web pages, it could be made of tabular data that could be connected and made accessible and computable by using some standardized form of metadata.

So linked data was born. Now linked data is a metadata standard that allows us on the open web to describe the same data structure that we might have inside our own databases. But we make it accessible in a web format that enables the open data movement. 

In those early days, we moved from the semantic web into the linked data standard with the idea of creating a metadata layer that could be interoperable on the Web. Of course, (content) metadata had existed starting from the first XML files, and within the content ecosystem, standards like the Darwin Information Typing Architecture (DITA) were providing a solution for people who wanted to label the fields inside their systems.

But linked data was making this possible at web scale, and it brought forward the idea of creating these vocabularies that would describe the data in such a way that others could understand it and compute over it.

And the vision of the semantic web was also introducing the concept of agents: AI that would reason over this web data. In 2011, with this research project that I was involved with, we started to create a prototype for WordPress to help people publish this linked data.

And of course, it was immediately clear that there was potential within the context of SEO, because at that point Google, Bing, and other commercial search engines were also realizing that in order to crawl and make sense of the web, they needed a layer of interoperable metadata, and they could tap into this linked data standard and make use of it in a simplified form. And so we've been on this journey of search engine optimization for these search engines for over ten years now.
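To make the idea concrete, here is a minimal sketch, not WordLift's implementation, of how a few facts can be published as interoperable linked data triples using Python's rdflib library and the Schema.org vocabulary; the entity URIs are hypothetical placeholders.

```python
# A minimal linked data sketch using rdflib and the Schema.org vocabulary.
# The entity URIs below are hypothetical placeholders.
from rdflib import Graph, Literal, Namespace, RDF, URIRef

SCHEMA = Namespace("https://schema.org/")

g = Graph()
g.bind("schema", SCHEMA)

person = URIRef("https://example.org/entity/andrea-volpini")
org = URIRef("https://example.org/entity/wordlift")

# Each statement is a subject-predicate-object triple that any
# linked-data-aware consumer (a crawler, a search engine) can parse.
g.add((person, RDF.type, SCHEMA.Person))
g.add((person, SCHEMA.name, Literal("Andrea Volpini")))
g.add((person, SCHEMA.worksFor, org))
g.add((org, RDF.type, SCHEMA.Organization))
g.add((org, SCHEMA.name, Literal("WordLift")))

# Serialize to Turtle, one of the standard linked data formats.
print(g.serialize(format="turtle"))
```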

AM: How has SEO changed over that time? 

AV: The first projects that we did with WordLift were very hard to prove, because we were asking the client not only to work on metadata that would be handed to a third party that gave no evidence of how it was going to use that metadata, but also to take on the pretty massive work of adopting the new emerging standard called Schema.org.

All of a sudden, the concept that had previously been the “semantic web,” used within the context of research groups and university academic work, became a standard of the Web. And so people had to cope with the idea that they had to understand what a linked data vocabulary is, because Schema.org is a linked data vocabulary.

It's interoperable, it has its own taxonomy, and it's structured. But there was no proof of how the search engines were going to use this additional metadata. For us, coming from the experience that we had and the vision that Tim Berners-Lee gave us, it was clear that it was a turning point in the web.

In the ‘90s, it was important to publish web pages in order to claim your existence. Then in the beginning of 2011, 2012, it was important that you also started to publish some metadata in order to exist. 

But there wasn’t any proof of the pudding. In a way, it was very hard to justify a semantic project at that time. 

AM: Just to give you an example of just what you’re talking about and see if it resonates, back when I was at PwC, I’d talk to data scientists and I’d say, well, why aren’t you using semantically structured data?

Isn’t that going to help you with your data science goals? And the question back to me was, how is that data different from Wikipedia? Did you ever develop an answer for that kind of question? 

AV: I think that at that time the work that was done inside communities like Wikipedia, DBpedia, and then eventually Wikidata, was foundational for the development of the web of data as we know it and use it today. 

And so we ended up working on Wikipedia more than we expected to because of the similarities that you described. In a way, at that time, our claim was to let users build their own Wikipedia using structured data.

It was clear that a formalized structure could help any publisher affirm and create its own authority on the web, much like Wikipedia did. And this connection between schema markup and the Wikipedia community kept growing over the years, until Google started building its own knowledge graph in 2012. To create its knowledge graph, Google started by ingesting Wikipedia, and then eventually started to create facts from data that it could crawl from the web using structured data, and from data coming from Freebase, the large knowledge base created to structure content. [Google had acquired Freebase in 2010.]

There has always been a strong connection between Wikipedia and structured data. But we have always positioned structured data, even these days, as a way of creating your own Wikipedia. Because you might not be eligible to be on Wikipedia, but you are eligible to create your own graph. And at that point we also evolved and stopped talking about linked data. Instead, we started to talk about knowledge graphs, because in 2012, when Google introduced the Google Knowledge Graph and its famous motto “from strings to things,” the knowledge graph became a concept that a lot of people could understand.

AM: Exactly. And there’s been a lot of development over the past decade in the area of knowledge graphs. And of course, people have different things in mind when they use the term knowledge graph. Can you talk about how the WordLift conceptualization of knowledge graph differs from others? 

AV: We were among the first to look at this linked data stack in the context of search engine optimization. On the SEO front, at that time there were very few people talking about structured data. Then they had to start thinking about knowledge graphs.

In 2012, when Google created its own knowledge graph, at least in the SEO industry, there was no accepted notion of entities and concepts. And in the ETL sector, there was little attention paid to information extraction and knowledge management. 

People started to create knowledge graphs for addressing use cases that were way more complex than SEO. So the world of people building graph databases was primarily focused on finance and a few other sectors where it would be easy to justify an investment in a knowledge graph, whereas we were the first to say, okay, we can use this technology that we call a knowledge graph, which is actually the evolution of the linked data stack, to do SEO.

Why so? Because at that point, for me, it became very clear that what we used to call optimization for search engines (SEO) was actually a data curation and publishing activity. So if I had better data and if I was able to publish it and make it accessible to the search engine in the most interoperable form, which seemed to be the linked data standard, then we could create an edge for a client. 

But of course, in the early days it was very hard to prove it because it was unclear how the search engine would use it. But as soon as the knowledge graph arrived, then it became evident that Schema.org was delivering an explicit benefit.

You could see the impact on the search engine results page (SERP) because of the rich features that were presented when you were using specific markup. And it was having an implicit impact by helping you support some synthetic queries.

A basic example: if you asked about the CEO of WordLift, then using structured data the search engine became capable of creating a synthetic query for Andrea Volpini and combining the results coming from the query about Volpini with the results coming from “coffee,” therefore creating a more accurate representation of results for that query.
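For readers who haven't seen such markup, here is an illustrative sketch (Python emitting Schema.org JSON-LD) of the kind of Person entity a page might embed so a search engine can connect a person to an organization and resolve a query like “CEO of WordLift.” The @id value is a hypothetical placeholder, not WordLift's generated output.

```python
import json

# Illustrative Schema.org markup for a Person entity. Embedded in a page as
# <script type="application/ld+json">...</script>, it lets a search engine
# connect the person to the organization and the job title.
person_markup = {
    "@context": "https://schema.org",
    "@type": "Person",
    "@id": "https://example.org/entity/andrea-volpini",  # placeholder entity URI
    "name": "Andrea Volpini",
    "jobTitle": "CEO",
    "worksFor": {
        "@type": "Organization",
        "name": "WordLift",
        "url": "https://wordlift.io/",
    },
}

print(json.dumps(person_markup, indent=2))
```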

And so these are implicit mechanics that made it possible to prove that there was actually a return on investment (ROI). In the first years, I was very cautious about talking about ROI because we couldn't offer enough evidence of what that ROI was.

Was any ROI really there? Would it justify the cost of setting up a knowledge graph? We couldn’t answer that question. But right now, ten years later, I can say that if you give me a dollar, I can give you at least three dollars back in terms of additional traffic that I can create. 

This is not applicable to each and every website. So it depends on the vertical. It depends on the data that you have. It depends on the data that your competitors have. But my clients get at least a 3X return on investment. 

AM: What’s the thinking behind the 3X factor? Can you tell us how you got to that figure and what it means in terms of articulated results? 

AV: In the COVID phase, we had to redefine our offering and we started to more aggressively work on the ecommerce vertical. 

And there was a reason why ecommerce was booming. It became an opportunity to do SEO in the ecommerce space also because Google wanted to fight against Amazon for getting the attention on all of these transactional queries that before were only the interest of Amazon. 

And so Google started to open up the organic results to ecommerce sites by providing free listings. So even though support for ecommerce sites was already in our roadmap, during COVID we had to accelerate that support at a tremendous pace, because Google was going more into these queries and there was tremendous demand from people staying at home.

And so at that point, in order to evaluate the impact, it wasn't just about clicks and impressions as it was before. I also had the opportunity to look at the purchase orders on the pages where we apply our advanced linked data markup.

And so we started to work on A/B testing with the clients' data science teams to evaluate the impact not only on the search traffic, but also on the actual purchases. And then we could see that in terms of additional sales, we could create an impact.

AM: Can you talk about the metrics that might be relevant to what you just said from the  customer side, or who’s seeing the ROI impact? 

AV: Of course, calculating the ROI on an editorial website is a completely different game, because you have to take into account how the website is monetizing traffic. And therefore, in that context, we would feel more confident looking at the increase in organic traffic. But again, we would apply A/B testing and causal impact analysis in order to make sure that we can isolate external factors.

Because the problem is that at that point, we started to sell SEO as a product, which is what we do now. And so in order to sell it as a product, we had to work on the way in which we could measure this impact. 

And if it’s an editorial site, we have to look at the clicks and impressions or any web metrics that the client is measuring. In some cases, it can be, I don’t know, the number of tickets that the client is selling with his editorial content. 

In some other cases, it can be the number and the quality of the leads that they are acquiring through the organic channel. So, depending on the use case, we have to be very clear on what we can measure. 

And, of course, not every site is a good fit. If your monetization strategy is not strong enough, we may not be a good solution. We might be too expensive. On the contrary, if you are, let’s say, generating leads at, I don’t know, 30 euro per lead, then organic can deliver a 3X return. 

So depending on the business value that the traffic creates, we are able to look at the impact.

AM: We've been talking on the enterprise side here for the most part. What about the personal side?

I help run a personal knowledge graph working group. I started it with George Anadiotis, who has published a book with Ivo Velichkov on that topic. Our thinking was, to begin with, that we need to get more people involved in thinking about knowledge graphs and how to contribute to them, regardless of whether they’re actually builders or not. 

Ivo was keen on how to do his own structured version of Roam Research-style note taking. Some of that book has to do with that kind of use case. 

But I sent you sort of a story about a comic who is just getting started and they want to do their own promotion and have their own web presence. And for those kinds of people who are just trying to bootstrap their own brand, how does this kind of thing fit in with what they’re trying to do? 

Is it too much of a stretch? Is there a place where they can land that’s easier for them? What would you say to somebody like that? 

AV: We have a lot of cases like that. On the personal branding front, we have created, as a matter of fact, a strategic partnership with Jason Barnard of Kalicube, specifically to address the personal branding area. 

And let's look at a few cases that we've worked on in the past. Matt Artz is one of the anthropologists doing very interesting research on how anthropology can impact user experience studies. And he came to us after listening to a podcast by Jason Barnard to say, hey, can I build a knowledge graph on my website?

Would this help in creating a knowledge panel and therefore provide better, stronger authority in the academic sector as well as in the business sector? We've been working with Matt since then, and I met him in real life for the first time a few months back when I was in New York for the Knowledge Graph Conference.

And I had the pleasure of inviting him. Matt, besides being a brilliant researcher, has also been following the strategy of building a knowledge graph on his own personal website and promoting his personal podcast on anthropology and UX.

And he saw that, as soon as he got into the Google knowledge graph, he was able to get more visibility and to grow his network. So building a knowledge graph is not per se an enterprise effort; quite the opposite.

I mean, everyone should create a knowledge graph, much like everyone should create a website. And how can this be done? In a way, the practice is similar to the information architecture strategies that you would apply to your own project.

So are you going to create a section about your books? Are you going to create a section about your biography? Are you going to create a section about your other projects? How are you presenting yourself to others in a digital ecosystem? 

I mean, in a way, we're back to the point where Ted Nelson started. How do we connect things? How do we connect concepts? And so we have a lot of these cases where people, individuals, came to us to build a graph.

And then they have seen the benefit of building a knowledge graph, because Google has been able to do the disambiguation. If you look at me, for example, I'm called Andrea Volpini, but there is a world-famous tennis player also called Andrea Volpini.

And then there is an Olympic champion swimmer who is also called Andrea Volpini. And then there is a musician called Andrea Volpini. I mean, even with an Italian name like mine, there are at least four notable people with the same name.

So how do we help the search engine understand who’s who and does that create an impact? Well, sure it does, because you can now ask the search engine, how old am I, and what are the organizations that I have contributed to, and what is my mom’s name? 

And of course, if you apply this to a creator or a top manager, you're going to see a tremendous impact.
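One common way to give the search engine that help is to extend the person's markup with sameAs links pointing to other well-known identifiers for the same entity. The sketch below is illustrative only; the Wikidata item and profile URLs are placeholders, not Volpini's actual markup.

```python
import json

# Disambiguation sketch: sameAs links tie this particular "Andrea Volpini"
# to other identifiers for the same person, so namesakes are not conflated.
# The Wikidata ID and profile URLs below are hypothetical placeholders.
person_markup = {
    "@context": "https://schema.org",
    "@type": "Person",
    "@id": "https://example.org/entity/andrea-volpini",
    "name": "Andrea Volpini",
    "jobTitle": "CEO",
    "sameAs": [
        "https://www.wikidata.org/wiki/Q00000000",   # placeholder Wikidata item
        "https://twitter.com/example",               # placeholder social profile
        "https://www.linkedin.com/in/example/",      # placeholder social profile
    ],
}

print(json.dumps(person_markup, indent=2))
```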

AM: I even remember back in the day, maybe this was in the 2010s, just before or after knowledge graph was announced at Google, I remember filling out a form to disambiguate my web persona. 

AV: Right. And so they were reaching out to users, individual users, to do that at that point. And now it’s possible for you to just do that sort of thing yourself and do it your own way. 

We've also done a lot of work to help, when possible and when it makes sense, to connect your entity with the equivalent entity on Wikidata, or to publish the same data that you create with WordLift on DBpedia.

We are part of the DBpedia Databus, which means that if there are concepts that a client is publishing on its own website that we think could contribute to the broader knowledge, we publish them into DBpedia, with links back and forth between DBpedia and your personal knowledge graph.

And until now, though, the primary use case was feeding Google and then starting to use this data in the information architecture of your website. Because in our product we started to create widgets for enticing people to click from one concept to another, to display maybe a context card on a specific term, in order to help people understand what we meant when we described, I don't know, the semantic web or any other concept that we care about.

But right now it's even more interesting to have this data. And it's interesting not just for the large organization, but also for the individual. Because if I have a well-curated knowledge graph about the things that I have been working on, then I can create an assistant that is fed in context with structured data.

So ontology prompting and fine-tuning models with knowledge graphs: it's what we have been doing since 2020, over three years now. And we have had tremendous success with structured data.

Finally, it's becoming way more important than Google itself. Until now, I had to prove the return on investment primarily by looking at metrics from a third party over which, in the end, I have no control.

I can now show you that I can create better content if you have better data. 

AM: Let’s bring in the elephant in the room, which is so-called AI, and your product is AI enabled. 

And let's think about your own personal information. How do you assert ownership of that information and make it truly your own? You talked about how to make it more authoritative, but there really has to be territory that you stake out yourself.

If you’re like the anthropologist you mentioned, you want custody of this sphere of information that you’re creating, and you want other people to have access to it. But you might want to impose certain restrictions on the access as well. 

Have you been thinking along those lines? Have people been asking you about that? 

AV: Only to some extent. So we know that we can apply a license to your knowledge graph. And this is helpful for understanding what can be done with this data that you’re making available. 

But this year we are also starting to invest more in creating triples that are for private use only. Because until now, Google, Bing, and the other search engines were for us the primary data consumers for structured data.

But now I see the advantage of keeping something for yourself only that might not be made available to others. And the reason is that in one of my latest blog posts I created an agent that represents me as author of the blog post and that allows you to chat with the content of the article and with myself as author. 

And I fed the system with everything that I’ve written on the blog, my personal entity on the knowledge graph of the blog and that specific article. So there is an agent created with this orchestrator framework. 

In that specific case I'm using LangChain, with an index for the content of the article, an index for the previous content that I've written on the blog, and then a system prompt that taps into who I am, so my entity page content. Therefore I'm creating an agent that acts as Andrea Volpini.

And if you ask, who are you? It will say, I'm Andrea Volpini. I'm the CEO of WordLift. But then you can also ask, okay, Andy, I don't want to read the article. It's too complicated or too long.

Tell me, what is this for: a travel agency or an ecommerce brand? Explain to me in layman's terms, for this specific target, what you've written about here. And then you realize that the generative web is creating tremendous opportunity, but it's all about the data.
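As a rough illustration of the pattern Volpini describes (not WordLift's actual code), here is a minimal sketch using the classic LangChain interfaces: one retrieval index over the article, one over previous posts, and a system prompt built from the author's entity description. The file names, model choice, and chain wiring are assumptions.

```python
# Minimal sketch of an "author agent": retrieval over the article and past
# posts, plus a persona system prompt built from the author's entity data.
# Classic LangChain interfaces are assumed; file names and model are placeholders.
from langchain.chat_models import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.schema import HumanMessage, SystemMessage

embeddings = OpenAIEmbeddings()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)

# Two separate indexes: the current article and the blog's back catalogue.
article_chunks = splitter.split_text(open("article.txt").read())
blog_chunks = splitter.split_text(open("previous_posts.txt").read())
article_index = FAISS.from_texts(article_chunks, embeddings)
blog_index = FAISS.from_texts(blog_chunks, embeddings)

# Persona prompt drawn from the author's entity page (hypothetical content).
entity_description = open("author_entity.txt").read()
system_prompt = (
    "You are the author of this blog. Answer as the author, in the first person.\n"
    f"Who you are:\n{entity_description}"
)

def ask_author(question: str) -> str:
    # Retrieve supporting passages from both indexes and hand them to the model.
    context = article_index.similarity_search(question, k=3)
    context += blog_index.similarity_search(question, k=2)
    context_text = "\n\n".join(doc.page_content for doc in context)
    llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
    messages = [
        SystemMessage(content=system_prompt),
        HumanMessage(content=f"Context:\n{context_text}\n\nQuestion: {question}"),
    ]
    return llm(messages).content

print(ask_author("Explain this article in plain terms for an ecommerce brand."))
```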

And I like a lot this image that Tony Seale shared the other day, that AI is the tip of an iceberg, but the data underneath is what's creating the actual wow effect.

AM: Yeah, exactly. And it seems like trying to get all these different technology tribes to work in the same direction on this stuff is a huge challenge.

I know something about the Internet identity folks from the IIW workshop here in the Valley, and they are very smart people working on very smart things, but they're not working on what you just talked about.

You know what I'm saying? How do we get all these people together to make a bigger impact than we're making? How do we get the tribes to work with each other better?

AV: We went through the Middle Ages on the web, and we haven't clearly realized it: as Facebook and Google and the other technology stacks emerged, we actually destroyed the ability of the web to become an interconnected ecosystem, and we devalued the power of interoperable standards. Web identity is a beautiful standard, but the problem is that it's an island.

Because at the moment we have kind of lost this connectedness, and the same applies, of course, to our little world of SEO. Why shouldn't Google make its own entities in the Knowledge Graph dereferenceable and accessible using the linked data standard, if linked data is what they use in the end to organize and improve their information?

But we went through a middle age. We went through a period of time where the corporate interest made it such that there wasn’t any interest in sharing data and there was no interest in sharing standards. 

And I do have hope that this will change. We cannot go through a middle age in the era of AI, simply because it would give tremendous power to a few and limit others from prospering on the web.

AM: Which leads us to another initiative that we have percolating in our personal knowledge graph working group. I don't know if you know Gyuri Lajos, but Gyuri is working on what he calls the IndyWeb and the IndyHub.

He is trying to enable an open peer-to-peer environment that’s based on the Interplanetary File System (IPFS) that makes a lot of the collaboration environment automated. Of course the IPFS is not mature yet, and it’s not really working as well as the traditional web works.

Any thoughts on so-called decentralized web? Or the kind of approach where you could start with your own environment and have your own access control? 

AV: In a granular way. I think that the Solid project by Tim Berners-Lee is something that we are starting to get ready for. It goes in the direction of decentralizing the architecture. Because in the end, as you mentioned, we have to start from the infrastructure before we get up through the different layers to the apps.

If the infrastructure becomes decentralized, then it’s easier to kind of share the value across the different peers. So I can see that the evolution towards a decentralized semantic web is becoming a need. 

But of course there are a lot of other forces that are coming into play and I think that still people are underestimating the value of knowledge in the era of large language models because large language models have been attracting a lot of attention these days. 

And then we kind of get lost again; the fog rolls in. Where is the data coming from? Who owns the data? What is the lineage of this data? How do we track the usage of this data? What is fair use? What is not fair use?

And we work with a lot of publishers, and of course we represent a company that does AI. So, good or bad, we do have a lot of visibility, but we also hear a lot of concern, especially from creators and publishers.

In the previous months we went through real fear. Because if all of a sudden people can generate your content, because a model has been trained on your content without you knowing it, then it gets scary, because it's obscure.

And there is also a misalignment between the creator and the user of the data. And so again, we need a decentralized web and we need something like Solid simply because in the context of artificial intelligence, we can’t let a few companies control the Internet. 

AM: Yes, we’re talking at a level where the engineers are trying to do work and collaborate and create the kinds of alignment that you’re speaking of. 

But let’s just bring this back to the individual user one more time here. 

If they’re reading and thinking about these things, what would you say to that person? I mean, say they’ve got a presence, but it’s not structured at all. The person doesn’t know if it’s fulfilling the goals that the person has set out for their content. 

How would you start if you were that person? 

AV: So in the '90s or early 2000s, I would have said to that person that he or she needed a website, and couldn't build their own digital presence only through services provided by others.

And right now, what I would advocate and what I would suggest is that any individual that wants to create an impact needs to create content and data at the same time. Because without data, you are going to be controlled by others who will train better models and absorb whatever ability or expertise you might have.

But if you start by creating your data, then it’s like you are piling up some value or you’re creating some assets. You can immediately create value with SEO, which is a good starting point. 

I still remain SEO driven and focused, because I realized that SEO for me has been the way in which I could start to build the semantic web cost-effectively.

And the same applies to an individual. I mean, are you using advanced SEO techniques for creating an impact? Because if you are not, you’re missing out. And if you are, then most probably you are extensively curating data because this is how SEO works in 2023. 

AM: It sounds like the so-called metadata that you’re creating is the key to your own personal control over what you’re creating. 

AV: Yes, and you can measure the impact even at the individual level, because the features that you will trigger on Google, on Bing, and in AI systems are so relevant for your personal reputation that you will see an immediate reward for that work on structuring content and metadata at the same time.

So that’s for me, the starting point. Then you will realize that you’re not really working for Google or for Bing or for whatever chatbot you are targeting. You are actually creating the foundation for your personal growth. 

Because with that data, then you can train your own system. Then maybe you can work in context and personalize your digital assistant, and whatnot.

AM: Yeah, it makes a whole lot of sense. It really does. I've seen a graphic that Moz published in their tutorial on how to get started with SEO. And Schema.org is at the top of the pyramid, and then there are other things that create the foundation of the pyramid.

So you're creating your content, but the schema is at the top. Or how would you say it?

AV: With traditional SEO, schema is an advanced technique because it's complicated, especially if it's not automated and you have to write it by hand. It's not the starting point.

If your website has indexing problems, there is no advantage in creating schema. So while I could agree at some level that the schema is at the top, I could also start with the schema at the bottom.

Because when we create an experience online, whether today we can say it’s a chatbot or it’s a website, we start from architecting information, and we start by looking at the personas that will access this information. 

When doing so, using a vocabulary such as Schema provides you with insights into how you want to organize your content.

So Schema is also helpful for getting people to think in terms of information architecture. And so you will have Schema at the beginning and not just at the end.

AM: You made me think of IMDb, the film and video media database. It has all sorts of information about the different roles performers have played over the years, but also whether they’ve been an actor or a director, etc. 

It’s kind of fantastic. But what is missing from IMDb that could help actors build up their presence and their brand? I think IMDb is the de facto standard in the industry for that type of information. 

AV: But of course, how accessible is that data? How interoperable is that data? How connected is that data with other data? They could have done a lot more there. If you think about it, Wikidata has had a way larger impact simply because it’s using open standards and it’s accessible. 

IMDb is still privately owned, and of course it's managed like a private database. There are some projects that make that data accessible through linked data, but they could have done more in connecting that information with other pieces of information, I think.

But that's the principle. I mean, imagine if you could describe your work experience on your site the way IMDb does, and you could do that using an interoperable standard like Schema. What does that mean?

It means that, at the moment, if you need to create a quick and easy description or biography using GPT, you can feed in the Schema class or the Schema attributes and then ask the model to describe your work experience to a six-year-old, or to another creator, or whoever.

But if you do not have that data organized and structured, every time you will start by recollecting memories of what you have worked on in the past. For that reason, you will not create a sustainable system for sharing your expertise.
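Here is a hedged sketch of that idea, assuming the OpenAI Python client and a hypothetical Schema.org-style profile: the structured attributes become the only source material the model is asked to describe, retargeted for a chosen audience.

```python
import json
from openai import OpenAI  # assumes the OpenAI Python client and an API key

# Hypothetical Schema.org-style profile; in practice this would come from
# your own knowledge graph rather than being hard-coded.
profile = {
    "@type": "Person",
    "name": "Jane Example",
    "jobTitle": "Design Anthropologist",
    "alumniOf": "Example University",
    "knowsAbout": ["anthropology", "user experience research", "podcasting"],
}

audience = "a six-year-old"
prompt = (
    "Using only the structured profile below, write a short biography "
    f"aimed at {audience}.\n\n{json.dumps(profile, indent=2)}"
)

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```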

AM: Yes. I know that Kingsley Idehen of OpenLink, for example, posts a lot on Twitter about how he’s generating this entity schema information. And I’m wondering if you’re just getting started, how do you do something in a more automated fashion to start building the graph? 

AV: I think that we’re currently trying to approach two sides of the coin. So on one side, we want to use language models for accelerating the growth and the nurturing of the knowledge graph that we create. 

Because creating and maintaining a knowledge graph costs money. By using a language model, we can extract structure from unstructured content more quickly compared to the old NLP techniques that we used.

On the other side, we are using the structured data in the graph to fine tune language models. So private, dedicated language models that are trained with your data and your content. And by combining these two processes, we can, you know, very easily create thousands of FAQ answers or hundreds of thousands of product descriptions. 
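For the first direction, extracting structure from unstructured content, a rough sketch (again assuming the OpenAI client; the product text and prompt are illustrative) might ask the model to emit Schema.org JSON-LD that is then validated before being loaded into the graph.

```python
import json
from openai import OpenAI  # assumes the OpenAI Python client and an API key

# Illustrative unstructured input; not real product data.
product_text = (
    "The Aurora 500 is a stainless-steel espresso machine with a 1.5 litre "
    "tank, a 15-bar pump, and a two-year warranty. Price: 249 euro."
)

prompt = (
    "Extract a Schema.org Product entity from the text below. "
    "Return valid JSON-LD only, using @context, @type, name, description, "
    "and offers with price and priceCurrency.\n\n" + product_text
)

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    response_format={"type": "json_object"},  # ask for bare JSON output
    messages=[{"role": "user", "content": prompt}],
)

# Parse the model's output; in a real pipeline this would be validated
# against the Schema.org vocabulary before being added to the graph.
product_jsonld = json.loads(response.choices[0].message.content)
print(json.dumps(product_jsonld, indent=2))
```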

So these are use cases where there is a lot of volume and a lot of need for quality, but at the same time there is no economy in having someone write these pieces of content one by one. The point is, you need to have control over your data because it's a game changer.

Now, you can start to see that it's a game changer in SEO, but then you will realize that it's a game changer in anything that you do with content or with search within your strategy.

AM: Andrea, thanks so much for edifying me and our audience. 

AV: Thanks to you, Alan.