Author Archives: sinclair

Binary File Hex Viewers on Ubuntu 12.04

I tried the command-line beav and the GNOME hex editor Jeex.

As these are both binary files, I need a binary viewer / hex viewer. I will try beav first:

sudo apt-get install beav

Running beav on the binary from demo-words.sh, based on the first billion bytes of Wikipedia:

beav vectors.bin

results in:

0: 37 31 32 39 30 20 32 30  30 0A 3C 2F 73 3E 20 07  71290 200.</s> .
10: F2 DE 3A 42 8B 9D 3A 0D  43 FC 3A E8 25 46 3A A3  ..:B..:.C.:.%F:.
20: 0E D9 BA C1 C0 02 BA 6F  1C D6 BA 0A 52 A3 B9 41  .......o....R..A
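The start of the dump is readable enough to decode by hand. As a rough sketch, assuming the layout the dump suggests (an ASCII header line "vocabulary-size vector-length", then each word followed by a space and little-endian 32-bit floats), the first bytes can be unpacked as below. The helper name parseHeader is made up for illustration and is not part of word2vec.

```javascript
// The first 19 bytes of vectors.bin, copied from the beav dump above.
const bytes = new Uint8Array([
  0x37, 0x31, 0x32, 0x39, 0x30, 0x20, 0x32, 0x30, 0x30, 0x0a, // "71290 200\n"
  0x3c, 0x2f, 0x73, 0x3e, 0x20,                               // "</s> "
  0x07, 0xf2, 0xde, 0x3a,                                     // first float
]);

// Hypothetical helper: assumes an ASCII header line, then "word " + floats.
function parseHeader(buf) {
  let pos = 0;
  const readUntil = (stopByte) => {
    let text = "";
    while (pos < buf.length && buf[pos] !== stopByte) {
      text += String.fromCharCode(buf[pos++]);
    }
    pos++; // skip the stop byte itself
    return text;
  };
  const [vocabSize, dimensions] = readUntil(0x0a).split(" ").map(Number);
  const firstWord = readUntil(0x20);
  // Assumption: values are 32-bit little-endian floats, as written on x86.
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
  const firstValue = view.getFloat32(pos, true);
  return { vocabSize, dimensions, firstWord, firstValue };
}

const header = parseHeader(bytes);
console.log(header);
```

Read this way, the file appears to hold 71290 words with 200 values each, and the first entry is the token </s> with a first component of roughly 0.0017.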

Trying the Jeex hex editor:

sudo apt-get install jeex

Why early man didn’t develop Hex is beyond me. After all, we have 10 fingers: that’s enough for signed 8 bit numbers with a carry bit, much more useful than decimal.

Well, much as I love the sight of Hex (flashback to 1970s NASCOM Z80 Hex code!), this is impenetrable, with no easily discernible patterns. So I need a text file, and word2vec has a flag for that: -binary 0 writes the vectors as text instead of binary.

What are Cosine Distance and Cosine Similarity?

Cosine Similarity is the cosine of the angle between two vectors, which is equal to their dot product divided by the product of their magnitudes. Cosine distance is commonly taken as 1 minus the cosine similarity. ( wikipedia / wolfram )

\text{similarity} = \cos(\theta) = {A \cdot B \over \|A\| \|B\|} = \frac{ \sum\limits_{i=1}^{n}{A_i \times B_i} }{ \sqrt{\sum\limits_{i=1}^{n}{(A_i)^2}} \times \sqrt{\sum\limits_{i=1}^{n}{(B_i)^2}} }

It is used in word2vec to find words that are close by.
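As a minimal sketch of the formula above, here is cosine similarity for two plain Javascript arrays of numbers. The function name cosineSimilarity is my own choice, and this is not how word2vec itself is implemented.

```javascript
// Cosine similarity of two equal-length numeric arrays:
// dot(a, b) / (|a| * |b|), following the formula above.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let sumSquaresA = 0;
  let sumSquaresB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    sumSquaresA += a[i] * a[i];
    sumSquaresB += b[i] * b[i];
  }
  // Note: the result is NaN if either vector is all zeros.
  return dot / (Math.sqrt(sumSquaresA) * Math.sqrt(sumSquaresB));
}

console.log(cosineSimilarity([1, 0], [0, 1])); // perpendicular vectors
console.log(cosineSimilarity([1, 2], [2, 4])); // same direction, different length
```

Perpendicular vectors score 0, and vectors pointing the same way score 1 (give or take floating-point rounding) whatever their lengths.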

Cosine similarity does not account for magnitude, only the angle between the vectors. It can be calculated quickly on sparse matrices, because only the non-zero entries contribute to the calculation, and so it has found a place in text classification.
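To illustrate why sparse data is cheap here, the sketch below stores each vector as a Map from index to non-zero value and only visits stored entries. The representation, the function names and the sample word counts are all assumptions made for the example.

```javascript
// Sparse vectors as Map(index -> non-zero value); missing indexes count as 0.
function sparseNorm(v) {
  let sumSquares = 0;
  for (const x of v.values()) sumSquares += x * x;
  return Math.sqrt(sumSquares);
}

function sparseCosine(a, b) {
  // Walk the smaller map; only indexes present in both vectors contribute.
  const [small, large] = a.size <= b.size ? [a, b] : [b, a];
  let dot = 0;
  for (const [index, x] of small) {
    const y = large.get(index);
    if (y !== undefined) dot += x * y;
  }
  return dot / (sparseNorm(a) * sparseNorm(b));
}

// Hypothetical word counts for two short documents, keyed by vocabulary index.
const docA = new Map([[3, 2], [17, 1], [9000, 4]]);
const docB = new Map([[3, 1], [42, 5]]);
const score = sparseCosine(docA, docB);
console.log(score);
```

The work is proportional to the number of stored entries, not to the size of the vocabulary, and scaling every count in a document leaves the score unchanged.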

 

CSV to JSON-P, a Javascript Array converter in awk

Instead of converting a CSV into JSON, it is sometimes more convenient to convert it straight to a Javascript Array with Awk.

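To import word vectors from word2vec into Javascript I used a quick awk script to add the syntactic sugar that turns the CSV rows into an array of objects.

Now the array can be used in Javascript. I am calling this CSV to Javascript import JSON-P (strictly speaking, JSONP wraps the data in a callback function call, whereas here it is simply assigned to a variable).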
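The awk script itself is not reproduced here. As a rough sketch of the same transformation written in Javascript instead, assuming each CSV row is a word followed by its numbers with no quoted fields, something like the following would do. The function name csvToJsArray, the variable name vectors and the { word, vector } object shape are all hypothetical.

```javascript
// Hypothetical converter: each CSV row is assumed to be "word,v1,v2,...".
// Quoted fields and embedded commas are not handled.
function csvToJsArray(csvText, variableName) {
  const rows = csvText
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const [word, ...values] = line.split(",");
      return { word, vector: values.map(Number) };
    });
  // For plain words and numbers, JSON text is also a valid Javascript literal.
  return "var " + variableName + " = " + JSON.stringify(rows) + ";\n";
}

const sampleCsv = "king,0.5,-0.25\nqueen,0.75,0.125\n";
const script = csvToJsArray(sampleCsv, "vectors");
console.log(script);
```

Saving that output as a .js file gives a script that defines a global array when the page loads it.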

JSON-P is good because the data is ready for use by scripts with no additional steps. The MIME type is text/javascript, so just include the file with a script tag in the HTML and the data is ready. Give the file a .js extension for maximum compatibility.

<script type="text/javascript" src="vectors.js"></script>

Including JSON files & Javascript in HTML

JSON (Javascript Object Notation) is a convenient data format based on Javascript syntax.

How to read it into Javascript is another matter. From Wikipedia:

"The official MIME type for JSON text is 'application/json'. Although most modern implementations have adopted the official MIME type, many applications continue to provide legacy support for other MIME types. Many service providers, browsers, servers, web applications, libraries, frameworks, and APIs use, expect, or recognize the (unofficial) MIME type 'text/json' or the content-type 'text/javascript'."

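var my_JSON_object;
var url = "vectors.json"; // hypothetical path; point this at your own JSON file
var http_request = new XMLHttpRequest();
http_request.open("GET", url, true); // true = asynchronous request
http_request.onreadystatechange = function () {
    var done = 4, ok = 200;
    if (http_request.readyState === done && http_request.status === ok) {
        // responseText is the raw JSON string; parse it into an object
        my_JSON_object = JSON.parse(http_request.responseText);
    }
};
http_request.send(null);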

After tearing your hair out, why not just use JSON-P? It works everywhere instantly.

 

 

Python Simple HTTP Server – Web Serve Files over HTTP to properly simulate file permissions

Problem : Everything worked until I put it online; why did the site stop working once it was uploaded to the server?

When a web page pulls in Javascript files, libraries, includes, jQuery, WebGL textures and other HTTP-included files, I find it best to open the page over HTTP from a web server rather than straight from disk, because browsers apply cross-site permission restrictions to some file types. I ran into this problem when including images as WebGL textures using three.js.

[SOLVED]
Python Simple HTTP Server – runs in a directory and serves those files and folders over HTTP on localhost port 8000

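Navigate to the file directory in a Bash prompt ( or shell or terminal ) and run the command below. This is the Python 2 module; on Python 3 the equivalent is python3 -m http.server.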

python -m SimpleHTTPServer

and open http://localhost:8000 in a web-browser

which will load an index page with the files as links or index.html if that is present

And the HTTP transactions are logged to the console :D

Vary the port by adding a port number, such as 8030, to the command thus :

python -m SimpleHTTPServer 8030

and you can serve multiple directories at once by running one server per directory, each on its own port.

As a bonus, other machines on your local network can connect using the IP address of the server ( not 127.0.0.1, which is localhost ) thus

http://192.168.1.5:8000

or use a dynamic DNS service, a DMZ or port forwarding on your router to serve the site globally ( probably super insecure, so do not ever do this :( )