Category Archives: Command Line

Ubuntu Bash Readline ( Command Line ) Shell Terminal Keyboard Shortcuts

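ctrl+Y - yank ( paste ) from the kill ring
alt+Y - after ctrl+Y, cycle through older kill ring entries
ctrl+k - kill ( cut ) from the cursor to the end of the line
ctrl+u - kill ( cut ) from the cursor back to the start of the line
ctrl+a - start of line
ctrl+e - end of line
ctrl+spc - set the mark ( begin select )
ctrl+r - reverse history search
sudo !! - rerun the previous command with sudo ( make me a sandwich )
cd !$ - reuse the last argument of the previous command ( e.g. after an ls of that directory )
alt+. - insert the last argument of the previous command ; repeat to step back through history
alt+f, alt+b - move forward / back a word, same as ctrl + -> / ctrl + <-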


Binary File Hex Viewers on Ubuntu 12.04

I tried the command line viewer beav and the GNOME hex editor Jeex.

As the files are binaries I am looking for a binary viewer / hex viewer. I will try beav first :

sudo apt-get install beav

Running beav on the binary vectors file that word2vec built from the Wikipedia text :

beav vectors.bin

results in :

0: 37 31 32 39 30 20 32 30  30 0A 3C 2F 73 3E 20 07  71290 200.</s> .
10: F2 DE 3A 42 8B 9D 3A 0D  43 FC 3A E8 25 46 3A A3  ..:B..:.C.:.%F:.
20: 0E D9 BA C1 C0 02 BA 6F  1C D6 BA 0A 52 A3 B9 41  .......o....R..A
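
To make the layout of that dump less mysterious, here is a minimal Python sketch that formats bytes the same way: a hex offset, two groups of eight hex bytes, then the printable ASCII with dots for everything else. The function name hexdump and the sample bytes ( copied from the first row above ) are my own choices for illustration ; this is not how beav itself is implemented.

```python
def hexdump(data, width=16):
    """Format bytes as rows of: hex offset, hex bytes, printable ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Hex bytes, split into two groups of eight like the dump above.
        left = " ".join(f"{b:02X}" for b in chunk[:8])
        right = " ".join(f"{b:02X}" for b in chunk[8:])
        # Printable ASCII (0x20-0x7E) is shown as-is, anything else as a dot.
        text = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)
        # Pad each group to 23 characters so short final rows stay aligned.
        lines.append(f"{offset:X}: {left:<23}  {right:<23}  {text}")
    return lines


# The first 16 bytes of vectors.bin, copied from the dump above.
sample = bytes.fromhex("37 31 32 39 30 20 32 30 30 0A 3C 2F 73 3E 20 07")

for line in hexdump(sample):
    print(line)
```

The dots in the right hand column are therefore not data, just a stand-in for bytes that have no printable character.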

Trying the jeex hex editor :

sudo apt-get install jeex

Why early man didn’t develop Hex is beyond me. After all, we have 10 fingers : that’s enough for signed 8 bit numbers with a carry bit, much more useful than decimal.

Well, much as I love the sight of Hex ( flashback to 1970s NASCOM Z80 Hex code ! ), this is impenetrable, with no easily discernible patterns. So I need a text file, and word2vec has a flag for that ( -binary 0 writes the vectors as text ).
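
The dump is not quite as random as it looks. As far as I can tell from it, the file starts with a text header ( vocabulary size and vector length, here 71290 and 200 ), then each word followed by a space and its vector as raw 32 bit floats. Below is a hedged Python sketch of reading that layout. The function names are hypothetical, little-endian floats are an assumption ( true on a typical x86 box ), and the demo data is made up apart from the four bytes 07 F2 DE 3A taken from the dump.

```python
import io
import struct


def read_header(stream):
    """Read the text header line: b'<vocab_size> <dimensions>\\n'."""
    vocab_size, dims = stream.readline().split()
    return int(vocab_size), int(dims)


def read_entry(stream, dims):
    """Read one word (terminated by a space) and its float32 vector."""
    word = bytearray()
    while True:
        ch = stream.read(1)
        if ch in (b" ", b""):
            break
        if ch != b"\n":  # skip the newline that separates entries
            word.extend(ch)
    raw = stream.read(4 * dims)
    # Assumption: little-endian 32 bit floats.
    vector = struct.unpack(f"<{dims}f", raw)
    return word.decode("utf-8", errors="replace"), vector


# A tiny made-up file in the assumed layout: 2 words, 3 dimensions each.
fake = io.BytesIO(
    b"2 3\n"
    + b"</s> " + struct.pack("<3f", 0.5, -1.0, 2.0) + b"\n"
    + b"the " + struct.pack("<3f", 0.25, 0.0, -4.0) + b"\n"
)
vocab_size, dims = read_header(fake)
entries = [read_entry(fake, dims) for _ in range(vocab_size)]
print(vocab_size, dims, entries)

# The four bytes that follow "</s> " in the real dump, decoded as one float.
first_value = struct.unpack("<f", bytes.fromhex("07F2DE3A"))[0]
print(first_value)
```

So the "impenetrable" part of the dump is simply floating point numbers, which never look like much in hex.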

CSV to JSON-P, a Javascript Array converter in awk

Instead of converting a CSV into JSON it is sometimes more convenient to convert the CSV to a Javascript array with awk.

To import word vectors from word2vec into Javascript I used a quick awk script to add the syntactic sugar to make an array of objects :

Now the array can be used in Javascript. This is a JSON-P style import, CSV to Javascript.

JSON-P is good because the data is ready for use by scripts with no additional steps. The MIME type is text/javascript : just include the file with a script tag in the HTML and the data is ready. Give the file a .js extension for maximum compatibility.

<script type="text/javascript" src="vectors.js"></script>
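
For anyone without awk to hand, the same kind of conversion can be sketched in Python. This is not the awk script I used, only a minimal illustration under assumptions : each CSV row is taken to be a word followed by its numbers, and the variable name vectors and the keys word and vector are hypothetical choices.

```python
import csv
import io
import json


def csv_to_js(csv_text, var_name="vectors"):
    """Turn CSV rows of 'word,n1,n2,...' into a Javascript array of objects."""
    rows = csv.reader(io.StringIO(csv_text))
    items = [
        {"word": row[0], "vector": [float(x) for x in row[1:]]}
        for row in rows
        if row  # ignore blank lines
    ]
    # JSON is valid Javascript, so the assignment is all the sugar needed.
    return f"var {var_name} = {json.dumps(items)};\n"


js_source = csv_to_js("scotland,0.5,-0.25\nengland,1.0,2.0\n")
print(js_source)
```

Writing the returned string to vectors.js gives a file that the script tag above can load directly.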

Python Simple HTTP Server – Web Serve Files over HTTP to properly simulate file permissions

Problem : everything worked locally, but the site stopped working once it was uploaded to the server.

When working with Javascript files, libraries, includes, jQuery, WebGL textures and other files that a web page pulls in, I find it best to open the page over HTTP from a web server, because browsers apply cross site permission restrictions to some file types when a page is opened straight from disk. I have run into this problem when including images as WebGL textures using three.js.

Python Simple HTTP Server – runs in a directory and serves those files and folders over HTTP on localhost port 8000

Navigate to the directory in a Bash prompt ( or shell or terminal ) and run :

python -m SimpleHTTPServer

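and open http://localhost:8000 in a web browser,

which will load index.html if that is present, or otherwise an index page with the files as links.

The HTTP transactions are logged to the console :D

( SimpleHTTPServer is the Python 2 module ; on Python 3 the equivalent is python3 -m http.server )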

Vary the port by adding a port number, e.g. 8030, to the command thus :

python -m SimpleHTTPServer 8030

and you can serve multiple directories at once, each on its own port.

As a bonus you can connect over your local network by using the IP address of the server in place of localhost,

or use a dynamic DNS service, DMZ or port forwarding on your router to serve the site globally ( probably super insecure, so do not ever do this :( )
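
If you would rather start the server from a script, here is a minimal sketch using Python 3's http.server module ( needs Python 3.7 or later ). The helper name serve_directory is hypothetical, and binding to 127.0.0.1 only is my own cautious choice so that nothing is exposed beyond the local machine.

```python
import functools
import http.server
import threading


def serve_directory(directory, port=8000):
    """Serve `directory` over HTTP on localhost in a background thread."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # Binding to 127.0.0.1 keeps the server reachable from this machine only.
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Calling serve_directory once per directory with a different port each time gives the multiple directory setup described above ; call shutdown() on the returned server to stop it. Because the thread is a daemon, the servers also stop when the script exits.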


Downloading and Running Word2Vec from svn on Ubuntu 12.10

Downloading word2vec is easy with svn :

svn checkout w2v

then :


which automagically elicited the following

gcc word2vec.c -o word2vec -lm -pthread -Ofast -march=native -Wall -funroll-loops -Wno-unused-result
word2vec.c: In function ‘TrainModelThread’:
word2vec.c:363:36: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
word2vec.c:369:50: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
gcc word2phrase.c -o word2phrase -lm -pthread -Ofast -march=native -Wall -funroll-loops -Wno-unused-result
gcc distance.c -o distance -lm -pthread -Ofast -march=native -Wall -funroll-loops -Wno-unused-result
gcc word-analogy.c -o word-analogy -lm -pthread -Ofast -march=native -Wall -funroll-loops -Wno-unused-result
gcc compute-accuracy.c -o compute-accuracy -lm -pthread -Ofast -march=native -Wall -funroll-loops -Wno-unused-result
chmod +x *.sh

then I ran the 1st demo:


which resulted in downloading the file text8 from Matt Mahoney's site, which is the first 100 million characters of cleaned-up Wikipedia text.

Then a training phase ramped both my CPUs to 100% and got me training at 27.16k words per second. w00t !

Enter word or sentence (EXIT to break): scotland

Word: scotland  Position in vocabulary: 1105

Word       Cosine distance
england        0.797283
wales        0.674080
scots        0.622237
ireland        0.607295
somerset        0.576749
cornwall        0.564767
scottish        0.555280
britain        0.540651
lulach        0.523911
tudor        0.508784
queen        0.508427
brittany        0.489793
elizabeth        0.477710
edward        0.476515
wessex        0.473759
earls        0.472889
dunkeld        0.465925
peerage        0.457085
jannaeus        0.456586
henry        0.449374
viii        0.448380
navarre        0.446313
king        0.442532
crowned        0.441560
shropshire        0.440221
ulster        0.439687
thrones        0.439417
victoria        0.439166
vii        0.438213
moray        0.438193
essex        0.436330
isles        0.432125
yorkshire        0.430498
gruoch        0.429486
aberdeen        0.428796
hertfordshire        0.428162
royalists        0.427168
pinewood        0.427026
afonso        0.426304
conqueror        0.426080
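
A note on that second column : although the tool labels it "Cosine distance", the numbers behave like cosine similarity, where a bigger number means a closer word. A minimal Python sketch of the idea follows. The two dimensional toy vectors are made up for illustration ( the real ones have 200 dimensions ), and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(word, vectors, top_n=3):
    """Rank every other word by cosine similarity to `word`, best first."""
    target = vectors[word]
    scores = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


# Made-up toy vectors, NOT taken from the trained model.
toy = {
    "scotland": [1.0, 0.0],
    "england": [4.0, 3.0],
    "banana": [0.0, 1.0],
}
ranking = nearest("scotland", toy)
print(ranking)
```

The length of a vector does not matter here, only its direction, which is why the scores in the table all sit between -1 and 1.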

and running :


results in :

make: Nothing to be done for `all'.
Starting training using file text8
words processed: 14700K     Vocab size: 3944K
real    0m20.519s
user    0m17.849s
sys    0m2.164s

and running :


results in :

make: Nothing to be done for `all'.
Note that for the word analogy to perform well, the models should be trained on much larger data sets
Example input: paris france berlin
Starting training using file text8
Vocab size: 71290
Words in train file: 16718843
Alpha: 0.023731  Progress: 5.14%  Words/thread/sec: 25.92k  ^C
real    0m26.163s
user    0m39.730s
sys    0m0.452s

and letting the training run to completion this time results in :


make: Nothing to be done for `all'.
Note that for the word analogy to perform well, the models should be trained on much larger data sets
Example input: paris france berlin
Starting training using file text8
Vocab size: 71290
Words in train file: 16718843
Alpha: 0.000121  Progress: 99.58%  Words/thread/sec: 24.97k
real    6m13.423s
user    11m13.314s
sys    0m1.568s
Enter three words (EXIT to break): king queen paris

Word: king  Position in vocabulary: 187

Word: queen  Position in vocabulary: 903

Word: paris  Position in vocabulary: 1055

Word              Distance
la        0.460125
commune        0.435542
lausanne        0.430589
vevey        0.429535
bologna        0.418084
poissy        0.416120
lyon        0.415239
sur        0.405876
madrid        0.403547
les        0.401034
nantes        0.400419
le        0.390821
du        0.390574
sorbonne        0.390027
boulogne        0.389495
conservatoire        0.388148
hospital        0.383842
complutense        0.382987
salle        0.382847
nationale        0.382411
distillery        0.381971
villa        0.379842
cegep        0.378001
louvre        0.377497
technologie        0.376842
baume        0.376681
marie        0.375048
tijuana        0.373825
chapelle        0.373256
universitaire        0.369375
dame        0.368908
universite        0.367017
grenoble        0.366289
henares        0.364779
lleida        0.359695
ghent        0.359106
puerta        0.358269
escuela        0.357838
fran        0.357790
apartment        0.357674

Enter three words (EXIT to break): lord cat crazy

Word: lord  Position in vocabulary: 757

Word: cat  Position in vocabulary: 2601

Word: crazy  Position in vocabulary: 9156

Word              Distance
rip        0.558223
dog        0.526483
ass        0.515598
haired        0.508471
bionic        0.506194
noodle        0.503654
bites        0.499798
prionailurus        0.493895
leopard        0.493136
blonde        0.492464
sloth        0.491286
coyote        0.486539
stump        0.486430
felis        0.486219
eyed        0.485090
bonzo        0.484414
iris        0.482825
slim        0.482549
candy        0.481998
rhino        0.481062
nails        0.478900
blossom        0.478873
hitch        0.476750
mom        0.475003
ugly        0.474717
shit        0.474513
mickey        0.473128
goat        0.471524
inu        0.469157
cheesy        0.467842
daddy        0.467046
kid        0.466047
watermelon        0.464281
naughty        0.463567
funny        0.463252
rabbit        0.462738
kiss        0.461589
reaper        0.460838
chupacabra        0.458305
girl        0.458293
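
As I understand it, the analogy tool takes three words a b c, builds the vector b - a + c and lists the words whose vectors point most nearly the same way, leaving out the three input words. Here is a hedged Python sketch of that arithmetic. The four dimensional toy vectors are invented so that the answer comes out cleanly, and the function names are hypothetical ; a model trained on text8 is nowhere near this tidy, as the output above shows.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, vectors, top_n=3):
    """Answer 'a is to b as c is to ?' by ranking words near b - a + c."""
    target = [
        vb - va + vc
        for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])
    ]
    scores = [
        (word, cosine(target, vec))
        for word, vec in vectors.items()
        if word not in (a, b, c)  # the input words are not candidates
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


# Invented toy vectors with dimensions (male, female, royal, city).
toy = {
    "king": [1.0, 0.0, 1.0, 0.0],
    "queen": [0.0, 1.0, 1.0, 0.0],
    "man": [1.0, 0.0, 0.0, 0.0],
    "woman": [0.0, 1.0, 0.0, 0.0],
    "paris": [0.0, 0.0, 0.0, 1.0],
}
result = analogy("king", "queen", "man", toy)
print(result)
```

With these toy vectors queen - king + man lands exactly on woman. With real vectors, an input such as king queen paris only works well if the model has seen enough text, which is what the note about much larger data sets in the tool's own output is getting at.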