
How and Why to Blog – Epistemology – weblog

Blog – a portmanteau: web log => weblog => we blog => blog

like a Captain’s log stardate 2013…

Now a Blog is a diary – the primary index is date-of-posting.

But back in the day it was a weblog: an annotated bookmarks file, or a web history dumped and sometimes quickly appraised.

Surfing meant following links, not jumping in from search. Search was rubbish in the 1990s.

AltaVista ( an early search engine ) only ever indexed at most 3% of the web (and less each day, as the web grew faster than the AltaVista spider could crawl new content).

It had no PageRank and was keyword-based only – synonyms were bugs, and it was misdirectable / spammable with huge lists of user-invisible keywords.

<head> metadata was much more important.

To find a site again one had to retrace each click.

Fundamentally blogging is a research tool – mostly read by the author.

If I write a how-to or tutorial it is mostly for me, when six months later I need to do the same thing again.

The difficulty is: “I only realised today the relevance of a site I saw a link to yesterday, but I can’t find it in my history and can’t remember how to get there.”

Folksonomies like tags and tag-clouds, as well as hierarchical categories, came later.

A blog was curated: pruned, appraised links, sometimes sorted by topic and often ‘gems’ or best-in-class.

‘weBlog’ was synonymous with ‘Links page’

Even early Google had limited reach.

This was the primary mode before Google made search work.

‘Surf the Web’ meant to catch an info-wave – the first links from a search engine were usually irrelevant synonymic mistakes – if a page was on track, then hit its links for curated, on-subject gems.

The Web is a web of links. Links were the ONLY method to get to most of the web – from one place to the next, following a link trail – search engines had tiny indexes and next to no intelligence before Google and PageRank ( PageRank itself is based on counting inbound links as citations ).
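A toy sketch of that idea (nothing like Google’s production system – the graph and the page names here are made up): each link is a citation vote, redistributed repeatedly until the scores settle.

```python
# Toy PageRank: a link from A to B counts as a citation vote for B.
# The graph below is invented for illustration.
links = {
    "blogA": ["blogB", "linksPage"],
    "blogB": ["linksPage"],
    "linksPage": ["blogA"],
}
damping = 0.85
rank = {page: 1.0 / len(links) for page in links}

for _ in range(50):  # power iteration until the scores settle
    new_rank = {page: (1 - damping) / len(links) for page in links}
    for page, outgoing in links.items():
        share = damping * rank[page] / len(outgoing)
        for target in outgoing:
            new_rank[target] += share
    rank = new_rank

for page, score in sorted(rank.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.3f}")
```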

Today links are sent and most pages come from search – one rarely follows a trail or ‘seam/vein’ of information.

Back then, pages reached by such a trail would more than likely be on track and best-in-class – then go to their links pages… rinse… repeat… er… profit.

So an info-wave was caught and one surfed it, twisting the board by leaning into the flow – thus ‘surfing’. The term is now an anachronism; Google made it somewhat irrelevant.

Web rings had ‘next site’ buttons at the bottom of every page.

Next / Prev meant whole websites, whereas today they mean in-article pagination or the next entry in a diary.

HackerNews ( Silicon Valley Startups and Comp Sci – http://news.ycombinator.net ) for 2013-Aug-08 21:30 (save) —

$ wget -r -l 1 to capture the discussions and then the links from the discussions — caveat: the menu, logo and footer links are NOT the discussion. (A fuller sketch follows the list below.)

*** discussions and links should be captured

*** In-page linked images and code – and CSS – should be grabbed; HTML alone is useless

*** Screenshots – as a baseline format (standards / browsers change) and for thumbnails
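A minimal sketch of that capture step – assuming Python plus a wget binary on the PATH, and assuming discussion pages live at item?id=N (the menu / logo / footer links don’t match that pattern). Screenshots would need a headless browser on top of this, and the archive directory name is made up.

```python
import re
import subprocess
import urllib.request

# news.ycombinator.net now redirects here
FRONT_PAGE = "https://news.ycombinator.com/"

html = urllib.request.urlopen(FRONT_PAGE).read().decode("utf-8")

# Discussion pages live at item?id=N - menu, logo and footer links don't match.
ids = sorted(set(re.findall(r"item\?id=(\d+)", html)))

for item_id in ids:
    url = f"{FRONT_PAGE}item?id={item_id}"
    # -p grabs page requisites (images, CSS), -k rewrites links for local
    # viewing, -P puts everything under one archive directory.
    subprocess.run(["wget", "-p", "-k", "-P", "hn-archive", url], check=False)
```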

(*) The question is how to blog effectively:

** Primary audience: my own notes – other readers are secondary

** Need to save archival copies of links in case they go down / change

*** Should changes, i.e. updates or further comments, be recorded / subscribed to – a wikiwiki style of svn edit history – or git-recorded diff patches — [A] Minor issue

**** Permalinks are not a solution to this – but permalinks should be saved, as should the original link

** Surf method – load the news page – open discussions in new tabs – sometimes the original article (but the discussion keeps this linked from the title – the reverse is not true: the discussion is not linked from the original article)

** Note HN churns fast and expires further pages (? – are pages timestamped to the 1st page to stop articles repeating — reddit repeats, HN doesn’t), so pulling pages 2 and 3 immediately is good, or it can’t be done at all.

-(actionable)- should script pulling the 1st 250 pages — or as many as needed until repeats are most of the links — for all sites, either for cache expiry or for reasons of churn (see the sketch below).
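A sketch of that pagination script – assuming today’s news?p=N URL scheme (the 2013-era ‘More’ links used expiring fnid tokens, which is exactly the churn problem above), and stopping once most of a page’s links have been seen before. The 250 and the 80% threshold are arbitrary.

```python
import re
import time
import urllib.request

BASE = "https://news.ycombinator.com/news?p={}"  # assumed pagination scheme
seen = set()

for page in range(1, 251):  # 'the 1st 250 pages', or stop early on churn
    html = urllib.request.urlopen(BASE.format(page)).read().decode("utf-8")
    ids = set(re.findall(r"item\?id=(\d+)", html))
    if not ids:
        break  # ran off the end of the feed
    if len(ids & seen) / len(ids) > 0.8:
        break  # repeats are most of the links: we've looped
    seen |= ids
    with open(f"hn-page-{page:03d}.html", "w", encoding="utf-8") as f:
        f.write(html)
    time.sleep(1)  # be polite to the server
```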

With RSS we can subscribe to sites — the ‘new queue’ — and so never miss a post.
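A minimal polling sketch, stdlib only – assuming a plain RSS 2.0 feed (the HN feed URL below is real, though a library like feedparser would be the robust choice). The seen set would need persisting between runs to really never miss a post.

```python
import urllib.request
import xml.etree.ElementTree as ET

FEED = "https://news.ycombinator.com/rss"
seen_links = set()  # persist this between runs

root = ET.fromstring(urllib.request.urlopen(FEED).read())

for item in root.iter("item"):  # RSS 2.0 <channel><item> entries
    title = item.findtext("title", default="")
    link = item.findtext("link", default="")
    if link and link not in seen_links:
        seen_links.add(link)
        print(f"NEW: {title} -> {link}")
```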

(Q) Why is Web History so very broken – I want everything I see saved as I saw it, and a branching, searchable, sortable history.

History is worse now than in Netscape 3 back in 1996.
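A sketch of the searchable / sortable half of that wish, assuming SQLite built with the FTS5 extension (most builds are); the schema and names are mine, and the branching part – a tree of referrers rather than a flat list – is left out.

```python
import sqlite3
import time

db = sqlite3.connect("history.db")
# Full-text index over everything seen, as it was seen.
db.execute("CREATE VIRTUAL TABLE IF NOT EXISTS visits "
           "USING fts5(url, title, body, visited_at)")

def record_visit(url, title, body):
    # Save the page text itself, not just the URL.
    db.execute("INSERT INTO visits VALUES (?, ?, ?, ?)",
               (url, title, body, time.strftime("%Y-%m-%d %H:%M:%S")))
    db.commit()

def search_history(query):
    # Searchable and sortable: full-text match, newest first.
    return db.execute("SELECT url, title, visited_at FROM visits "
                      "WHERE visits MATCH ? ORDER BY visited_at DESC",
                      (query,)).fetchall()

record_visit("http://example.org/", "Example", "a page about epistemology")
print(search_history("epistemology"))
```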