This gig will end soon!

  1. Even the smartest AI algorithms are utterly useless without content to learn from. Today they download this content from the internet freely and effortlessly. But this gravy train will soon come to an end!
  2. The reason is that content owners will want to charge for it, and some already do 😉 It was recently revealed that Reddit alone will earn a whopping $203 million from licensing posts and comments from its site.
  3. There’s a chance to earn your commission from such agreements. To do this, you need to create a platform like this and connect as many content sites to it as possible:

Project Essence

“Bots are already here. They are downloading your data,” TollBit warns content websites. This refers to bots launched by AI developers to download data from websites for their machines to learn from.

“But we will help you make money from this,” promises the startup to website owners.

To do this, the website owner can set prices for bot downloads of their content on the TollBit platform.

Prices for different types of content can be set differently – for example, higher for the latest news or exclusive content. Moreover, different prices can be charged for different AI machines. This depends on how wealthy the website owner considers the developers of these AI machines to be 😉

The platform supports flexible, dynamic pricing. The most obvious option is a price per request that depends on the total number of requests from a specific bot. Prices can also depend on specific keywords that appear in such requests.
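To make the pricing idea more concrete, here is a minimal sketch of how a price per request could depend on volume and keywords. It is not TollBit's actual implementation: the `PricingRule` structure, the `price_for_request` function, the tier thresholds, and the surcharge values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PricingRule:
    """Hypothetical pricing rule a website owner might configure."""
    base_price: float  # USD per request
    # (minimum requests so far, multiplier applied to the base price)
    volume_tiers: list[tuple[int, float]] = field(default_factory=list)
    # keyword -> flat surcharge in USD added when the keyword is in the query
    keyword_surcharges: dict[str, float] = field(default_factory=dict)


def price_for_request(rule: PricingRule, requests_so_far: int, query: str = "") -> float:
    """Return the price of the next request from one bot."""
    multiplier = 1.0
    # Walk the tiers in ascending order; the highest tier reached wins.
    for threshold, tier_multiplier in sorted(rule.volume_tiers):
        if requests_so_far >= threshold:
            multiplier = tier_multiplier
    price = rule.base_price * multiplier

    # Add a surcharge for every configured keyword found in the query.
    words = set(query.lower().split())
    for keyword, surcharge in rule.keyword_surcharges.items():
        if keyword in words:
            price += surcharge
    return round(price, 6)


# Example values are assumptions, not real TollBit prices.
rule = PricingRule(
    base_price=0.01,
    volume_tiers=[(1000, 0.8), (10000, 0.5)],
    keyword_surcharges={"exclusive": 0.02},
)
print(price_for_request(rule, 0))
print(price_for_request(rule, 20000, "exclusive interview"))
```

In this sketch a volume discount is modeled with multipliers below 1.0, but the same structure would work for an owner who prefers to charge heavy users more.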

At the technical level, the TollBit platform blocks requests from any bots it recognizes by the bot’s name transmitted in the request or by the IP addresses of the servers from which these requests are initiated.

Requests are only allowed from those bots that transmit a special token in the request. This token is allocated by the platform to the AI machine developer after the relevant agreement is concluded with the website owner.
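The blocking and token logic described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the bot names, the IP range (a reserved documentation range), the `X-Access-Token` header name, the token value, and the choice of the HTTP 402 status code are all hypothetical, and the real platform may work quite differently.

```python
import ipaddress

# Hypothetical lists; a real platform would maintain these centrally.
KNOWN_BOT_AGENTS = ("examplebot", "sample-ai-crawler")
KNOWN_BOT_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]  # documentation range
# Tokens issued to AI developers after a contract is signed (made-up value).
VALID_TOKENS = {"demo-token-123": "examplebot"}


def is_known_bot(user_agent: str, remote_ip: str) -> bool:
    """Recognize a bot by its name in the request or by its server's IP address."""
    agent = user_agent.lower()
    if any(name in agent for name in KNOWN_BOT_AGENTS):
        return True
    address = ipaddress.ip_address(remote_ip)
    return any(address in network for network in KNOWN_BOT_NETWORKS)


def decide(headers: dict, remote_ip: str) -> int:
    """Return the HTTP status code the site would answer with."""
    if not is_known_bot(headers.get("User-Agent", ""), remote_ip):
        return 200  # ordinary visitor, serve the page
    if headers.get("X-Access-Token") in VALID_TOKENS:
        return 200  # paying bot, serve the page
    return 402  # recognized bot without a token: Payment Required (assumed choice)


print(decide({"User-Agent": "Mozilla/5.0"}, "198.51.100.7"))
print(decide({"User-Agent": "ExampleBot/1.0"}, "198.51.100.7"))
print(decide({"User-Agent": "ExampleBot/1.0", "X-Access-Token": "demo-token-123"}, "198.51.100.7"))
```

Note that both checks rely on what the bot chooses to reveal or where it runs, so a bot that hides its name and uses unknown servers would slip through a filter this simple.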

The platform keeps track of authorized requests to automatically generate and send an invoice to the AI machine developer at the end of each reporting period. In addition, the platform ensures that the frequency of requests does not exceed the set threshold and that their total number stays within the maximum agreed with the website owner.
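A rough sketch of such metering might look like the code below. It assumes a sliding 60-second window for the frequency limit and a simple per-period cap on the total; the `UsageMeter` class, its method names, and the invoice fields are hypothetical, since the source does not describe how TollBit implements this.

```python
from collections import defaultdict


class UsageMeter:
    """Hypothetical per-token metering with a rate limit and a period cap."""

    def __init__(self, max_per_minute: int, max_total: int):
        self.max_per_minute = max_per_minute
        self.max_total = max_total
        self.recent = defaultdict(list)    # token -> timestamps of recent requests
        self.total = defaultdict(int)      # token -> requests in this period
        self.charges = defaultdict(float)  # token -> amount owed in this period

    def record(self, token: str, price: float, now: float) -> bool:
        """Count one request; return False if it exceeds a limit."""
        # Keep only timestamps from the last 60 seconds.
        window = [t for t in self.recent[token] if now - t < 60.0]
        self.recent[token] = window
        if len(window) >= self.max_per_minute or self.total[token] >= self.max_total:
            return False
        window.append(now)
        self.total[token] += 1
        self.charges[token] += price
        return True

    def invoice(self, token: str) -> dict:
        """Build the invoice for the period and reset the counters."""
        result = {
            "token": token,
            "requests": self.total[token],
            "amount_due": round(self.charges[token], 2),
        }
        self.total[token] = 0
        self.charges[token] = 0.0
        return result


meter = UsageMeter(max_per_minute=2, max_total=3)
for timestamp in (0.0, 1.0, 2.0, 61.5, 200.0):
    print(timestamp, meter.record("demo-token-123", 0.5, timestamp))
print(meter.invoice("demo-token-123"))
```

Timestamps are passed in explicitly here so the behavior is easy to follow; a live service would read the clock itself.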

The startup claims that a website owner can connect their site to the platform in 15 minutes. After that, the platform starts demanding money from AI machine developers automatically.

As soon as the platform sees blocked requests from a bot, it messages the bot’s developers through the contacts it has and informs them that the bot’s access to a certain website is blocked. The message includes a link to a ready-made contract with prices for access to this content, which the developer only needs to sign and report back to the platform.

TollBit was founded last year, but it has already managed to sign its first clients among the owners of content websites.

And now the startup has raised its first $7 million in investments.

What’s interesting

The problems for AI machine developers began in December of last year, when the New York Times filed a lawsuit against OpenAI and Microsoft over the “millions of New York Times articles” used to train their AI machines.

By January, it was known that OpenAI was in talks with CNN, Fox, Time, and a dozen other publishers to license their content for training its AI machine.

In February, it was reported that Google had reached an agreement with the popular user forum site Reddit to license its content for training Google’s AI machine. The deal was reportedly worth around $60 million.

At the same time, Reddit was preparing for an IPO, and its filing revealed that its data licensing contracts totaled $203 million, of which $66.4 million is expected to turn into actual revenue for Reddit by the end of 2024.

Any smart AI algorithm is completely useless without data to train on. Improving algorithms requires an increasing amount of data. A graph compiled by Reuters shows that AI machines’ appetites for data have grown fourfold since 2022.

Thus, today’s TollBit is targeting a large and growing market that has suddenly opened up with the development of AI technologies and the emergence of more AI machines.

It is worth noting separately that the current functionality of the TollBit platform is still very primitive as it only applies to bots that crawl publicly available internet pages.

The catch is that only 4% of all data actually present on the internet is publicly available – while 96% of data is inaccessible to ordinary bots.

6% of all data belongs to the so-called “Dark Web,” which consists of carefully hidden and specially encrypted data, including personal correspondence and various illegal information.

The remaining 90%, however, is perfectly normal “Deep Web” data, which, despite its normality, is still inaccessible to bots. For example, bots cannot access password-protected pages, such as content behind a paid subscription. The same goes for databases in which users have to search by keywords to see matching records: bots would need to spend a long time selecting and trying out such keywords to download the contents of such databases.

In principle, owners of subscription services and databases may not mind earning some extra money by licensing this data. But a platform intended for this purpose must be able to open up and price access to such content, which is technically more difficult than blocking or allowing bot access to public pages of websites.

It turns out that 90% of the technical work on developing platforms for data licensing to AI machine developers is still ahead 😉

Where to go

The topic of licensing content for training AI machines is very relevant, given the rapid pace of AI development.

So the obvious direction to move in is to create a content licensing platform that is easy and convenient to use for both content owners and AI machine developers.

The most promising segment, in my opinion, is licensing data from the “deep web” since it accumulates perhaps the most valuable and unique content, which cannot be extracted by other means.

However, if this direction seems interesting to you, you need to move fast, because there is not room for many such platforms in any one geographical market. After all, a website owner only needs to integrate with one of these platforms, and after some time owners will simply choose among those that have already made a name for themselves.

Therefore, the key part of promoting such platforms is to conclude agreements and integrate with providers of valuable content. And AI machine developers will prefer to work through platforms that have managed to gather a larger number of suppliers.

About the Company

TollBit

Website: tollbit.com

Last round: $7M, 05.03.2024

Total investments: $7M, rounds: 1
