Topic: An interesting experiment with this board's content

The Gorn

  • I absolutely DESPISE improvised sulfur-charcoal-saltpeter cannons made out of hollow tree branches filled with diamonds as projectiles.
  • Trusted Member
  • Wise Sage
  • ******
  • Posts: 22324
  • Gorn Classic, user of Gornix
An interesting experiment with this board's content
« on: February 23, 2018, 06:18:01 pm »
This is a technical experiment with the content from this board.

As you guys may know, this forum software (SMF) has a "Who's Online" display. (It's a link at the bottom of the home page of the forum in the "Info Center" block under a heading labeled Who's Online.)

Ordinary users see only a cut-down version of this display, without IP addresses (for privacy's sake).

A moderator or admin sees something that looks like the following:

[Screenshot: the moderator view of Who's Online, including IP addresses and an "Action" column]

Mostly this display shows bots and search engines constantly visiting the forum, and it's always changing. But if you look at this image you'll see a couple of lines in the "Action" column labeled "Viewing the topic".

I've noticed that during the day these "topic" lines can be fairly numerous - perhaps 8-10 on each viewing.

Today I realized something significant: these topic views MAY reflect interest from real visitors, possibly arriving through some search engine that is spidering the site.

I've noticed that when I click through these linked articles, they're often fairly classic, lengthy threads we had here a few years ago.

This might be a key to understanding how to attract more users to this site: monitor which threads are being requested the most.

So, I just built a simple shell script that runs every 10 minutes on my Linux desktop as a cron job:

1) It requests the URL of "Who's Online" (using wget).
2) The text from that page is run through grep and sed, in order to locate all lines containing the "Viewing the topic" text and then pull the URL out of each one.
3) The URLs collected on each pass are appended, along with a timestamp, to a file containing the collected links.
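
The three steps above can be sketched roughly as follows. This is a minimal sketch, not the actual script: the forum URL, the output file name, and the `--run` switch are all hypothetical, and it assumes the page marks each topic view as the text "Viewing the topic" followed by a link, one per line of HTML. It also assumes the page can be fetched without logging in; otherwise wget would need a saved session, for example via its `--load-cookies` option.

```shell
#!/bin/sh
# Hypothetical values: neither the URL nor the file name is from the original post.
WHO_URL="https://forum.example.com/index.php?action=who"
OUT_FILE="$HOME/viewed_topics.txt"

# Step 2: read HTML on stdin and print the href of the link that follows
# "Viewing the topic" on each matching line (assumes one such link per line).
extract_topic_urls() {
    grep -i 'viewing the topic' |
        sed -n 's/.*[Vv]iewing the topic[^<]*<a href="\([^"]*\)".*/\1/p'
}

# Steps 1 and 3: fetch the page, extract the URLs, and append each one
# to the collection file with a timestamp in front.
collect_once() {
    stamp=$(date '+%Y-%m-%d %H:%M:%S')
    wget -q -O - "$WHO_URL" | extract_topic_urls |
        while IFS= read -r url; do
            printf '%s %s\n' "$stamp" "$url"
        done >> "$OUT_FILE"
}

# Only touch the network when asked to, so the functions can be tried on their own.
if [ "${1:-}" = "--run" ]; then
    collect_once
fi
```

A crontab line such as `*/10 * * * * /path/to/collect_topics.sh --run` (path hypothetical) would match the ten-minute schedule described above.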

I could put the script on one of my internet hosting servers to run 24 hours a day, but at night the activity dies down, so having it run only while my PC is active is just fine.

I'm noticing that almost every thread being viewed by bots, search engines or whatever is pretty high quality and interesting to revisit.

If anyone is interested, I'll post a cleaned-up list to the private section.
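
For anyone who wants to see which threads come up most often in a file of collected links, a tally along these lines might do. It is only a sketch, and it assumes each line of the file ends with the URL (for example a timestamp followed by the link); the exact layout of the file is not given here, and the function name is made up.

```shell
# Read collected lines on stdin (URL assumed to be the last field)
# and print "count URL" pairs, most frequently seen first.
rank_topics() {
    awk '{ print $NF }' | sort | uniq -c | sort -rn
}
```

Running `rank_topics < viewed_topics.txt | head` (file name hypothetical) would then show the ten most-requested threads.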
Gornix is protected by the GPL. *

* Gorn Public License. Duplication by inferior sentient species prohibited.

unix

  • Trusted Member
  • Wise Sage
  • ******
  • Posts: 3946
Re: An interesting experiment with this board's content
« Reply #1 on: February 24, 2018, 08:35:09 am »
Hm, interesting. I didn't realize that.
Brawndo. It's got what plants crave.

The Gorn

Re: An interesting experiment with this board's content
« Reply #2 on: February 24, 2018, 08:40:00 am »
Here's an example of the output. It's not live, because I uploaded it manually, but it's for my own amusement, so there's no need to make it evergreen.

You can go to any of the URLs in this file.

http://posteddocs.nfshost.com/links.txt