in my free time, i’ll start coding a new service: content provider

you know i played with ico-content a long time ago, and as it was crap, i decided to code a new version…

but that’s not enough for me, as it would not bring in a lot of paying clients. so i came up with a solution a long time ago, and i think it’s time we do it. instead of the autonomous ico-content that grabs content from answers.yahoo.com, we will become content providers.

It is instant. even if the server gets shut down it’s not a problem, because our job is to give instant access to our database for a short window of time, and the rest is done on the client side.

explanation:

as we have the content, instead of the client connecting to yahoo, they connect to us. if they pay for 30 posts, they connect once, our grabber on their vBulletin grabs the 30 posts, structured properly with user info etc, and that’s it. it takes 5 seconds. the feed is in php, it integrates directly into vBulletin, and it requires site authentication, so it is impossible to have leechers. we use the same technique as yahoo.

you can see it here: Yahoo! Answers – YDN

look at the « TRY IT NOW » part of the page, where you can choose the action and then the output… the PHP version is a serialized DB insert… the only thing you need on your side is the same structure for your DB… How Magnificent.

So, what to do:

1- create an automated process that will grab everything coming from answers.yahoo.com. the content is grabbed completely, with questions and answers. the users are grabbed too, with their avatars etc… they have their userid, so we will be able to change some details afterward (because user nicks sometimes suck, are emails, etc).

2- each entry is a post, no matter what it is called, so each post will have its keywords based on the content and the title, not only the title. we do not keep yahoo’s own filtering structure, because it is not done by a robot: it is based on human evaluation, which is really skewed by a lot of things (the « sex » keyword really means nothing in « house demolition », but you see it often)… so the structure of the threads/posts will change once they are inserted in the database.

3- the client selects the number of new threads based on keywords x, and selects whether they want random posting generated « live » on their site or « old posts » generated, which would be based on the original content dateline, possibly giving the client forum older content than it had before. put differently, you can choose to have an old existing forum or a completely new one with new users.

4- the client pays for a grab-key, and when authenticated, a product file is sent to the client that will grab exactly what they want. the process is instant, and if the client chooses the « old posts » technique, the posts are all posted on the forum at the same time. if it is the « live » version, it takes time, because the threads and answers are posted freely, at random, based on the traffic of the client site.
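for step 2, here is a minimal sketch of keywords built from the title and the content together. it is in python just for illustration (the real feed is php), and everything in it is an assumption on my part: the `extract_keywords` name, the tiny `STOP_WORDS` list and the simple word-count ranking are hypothetical, not what the engine actually does.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {
    "the", "and", "for", "you", "how", "what", "with",
    "that", "this", "are", "can", "does", "have", "needs",
}


def extract_keywords(title, body, limit=5):
    """Rank keywords by how often they appear in title + body combined."""
    text = f"{title} {body}".lower()
    words = re.findall(r"[a-z']+", text)
    # Ignore very short words and stop words before counting.
    counts = Counter(w for w in words if len(w) > 2 and w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(limit)]


if __name__ == "__main__":
    print(extract_keywords(
        "house demolition cost",
        "the demolition of my house needs a demolition permit",
        limit=2,
    ))
```

the point is only that a word has to show up in the actual text to become a keyword, so a category label stuck on by a human never makes it in by itself.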

…on our side, we store each thread/post ID sent to that client, so if they come back, we do not send the same content again. we do it on our side, because if they decide to delete everything on their forum, we are not responsible for the loss…. 🙂 and if they want to retrieve the same content again, we can!
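the bookkeeping above could look roughly like this. again a python sketch under assumptions: the `DeliveryLog` class and its method names are hypothetical, and it keeps everything in memory, where the real thing would sit in the master database.

```python
from collections import defaultdict


class DeliveryLog:
    """Remembers which thread IDs each client has already received."""

    def __init__(self):
        # client_id -> set of thread IDs already sent (in-memory stand-in for a DB table)
        self._sent = defaultdict(set)

    def pick_new(self, client_id, candidate_ids, count):
        """Return up to `count` IDs not yet sent to this client, and record them."""
        already = self._sent[client_id]
        fresh = [tid for tid in candidate_ids if tid not in already][:count]
        already.update(fresh)
        return fresh

    def resend(self, client_id):
        """Return everything previously sent, e.g. after the client wiped their forum."""
        return sorted(self._sent[client_id])


if __name__ == "__main__":
    log = DeliveryLog()
    print(log.pick_new("client-a", [101, 102, 103, 104], 2))
    print(log.pick_new("client-a", [101, 102, 103, 104], 2))
    print(log.resend("client-a"))
```

keeping the log per client means two different forums can still buy the same thread, but one forum never gets it twice unless they ask for it.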



given the growing number of requests for new content, paid posting, fake users etc, i think it’s the best thing we can do.

i’m just talking about yahoo’s service here because i have not checked elsewhere yet, but once yahoo is tested, we can add more of these content providers. yahoo is free, which works in our favor.

the system is really easy to deal with. grabbing content like rss/xml is quite simple. we just need a master database, but i can do it right now for testing…

if you think the MTF can offer something to provide content, just tell me… 🙂

3 replies to “in my free time, i’ll start coding a new service: content provider”

  1. kinda… once on our server, the client comes and grabs already formatted threads… the same as the yahoo thing, but in better shape… it looks strange, but a lot of yahoo stuff has long titles and no content… my engine already checks this, and we can do more.
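a rough idea of what a "long title, no content" check could look like, in python. this is only a sketch: the `has_real_content` name and the 40-character threshold are assumptions for illustration, not the rules the engine really uses.

```python
def has_real_content(title, body, min_body_chars=40):
    """Reject entries whose body is empty, too short, or just the title repeated."""
    title = title.strip()
    body = body.strip()
    if len(body) < min_body_chars:
        return False
    # A body that only repeats the title adds nothing to the thread.
    if body.lower() == title.lower():
        return False
    return True


if __name__ == "__main__":
    long_title = "how much does it cost to demolish a two storey house and who pays for it?"
    print(has_real_content(long_title, ""))
    print(has_real_content("house demolition", "we had ours done last year and the permit took longer than the work."))
```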