This software package contains a Barnes-Hut implementation of the t-SNE algorithm. The implementation is described in this paper.
Installation
On Linux or OS X, compile the source using the following command:
g++ sptree.cpp tsne.cpp tsne_main.cpp -o bh_tsne -O2
The executable will be called bh_tsne.
On Windows using Visual C++, do the following in your command line:
- Find the vcvars64.bat file in your Visual C++ installation directory. This file may be named vcvars64.bat or something similar. For example:
  // Visual Studio 12
  "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\amd64\vcvars64.bat"
  // Visual Studio 2013 Express:
  C:\VisualStudioExp2013\VC\bin\x86_amd64\vcvarsx86_amd64.bat
- From cmd.exe, go to the directory containing that .bat file and run it.
- Go to the bhtsne directory and run:
  nmake -f Makefile.win all
The executable will be called windows\bh_tsne.exe.
Usage
The code comes with wrappers for Matlab and Python. These wrappers write your data to a file called data.dat, run the bh_tsne binary, and read the result file result.dat that the binary produces. There are also external wrappers available for Torch, R, and Julia. Writing your own wrapper should be straightforward; please refer to one of the existing wrappers for the format of the data and result files.
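The write/run/read cycle that the wrappers perform can be sketched in Python as follows. The binary layout used here (a small header of counts and parameters followed by row-major doubles) is an assumption made purely for illustration, as are the helper names `write_data` and `read_result`. The bundled bhtsne.py and fast_tsne.m define the actual format and should be treated as the reference.

```python
import struct

# NOTE: hypothetical layout for illustration only; check bhtsne.py or
# fast_tsne.m for the real field order before relying on this.
HEADER = "=iiddi"  # n_samples, n_dims, theta, perplexity, no_dims


def write_data(path, rows, theta=0.5, perplexity=50.0, no_dims=2):
    """Write the assumed header, then the samples as row-major doubles."""
    n, d = len(rows), len(rows[0])
    with open(path, "wb") as f:
        f.write(struct.pack(HEADER, n, d, theta, perplexity, no_dims))
        for row in rows:
            f.write(struct.pack("={}d".format(d), *row))


def read_result(path):
    """Read two ints (n, d) followed by n * d doubles; return a list of rows."""
    with open(path, "rb") as f:
        n, d = struct.unpack("=ii", f.read(8))
        flat = struct.unpack("={}d".format(n * d), f.read(8 * n * d))
    return [list(flat[i * d:(i + 1) * d]) for i in range(n)]
```

A wrapper built on this pattern would call `write_data("data.dat", rows)`, launch the bh_tsne binary from the same working directory (for example with `subprocess.check_call`), and then call `read_result("result.dat")`.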
Demonstration of usage in Matlab:
filename = websave('mnist_train.mat', 'https://github.com/awni/cs224n-pa4/blob/master/Simple_tSNE/mnist_train.mat?raw=true');
load(filename);
numDims = 2; pcaDims = 50; perplexity = 50; theta = .5; alg = 'svd';
map = fast_tsne(digits', numDims, pcaDims, perplexity, theta, alg);
gscatter(map(:,1), map(:,2), labels');
Demonstration of usage in Python:
import numpy as np
import bhtsne
data = np.loadtxt("mnist2500_X.txt", skiprows=1)
embedding_array = bhtsne.run_bh_tsne(data, initial_dims=data.shape[1])
Python Wrapper
Usage:
python bhtsne.py [-h] [-d NO_DIMS] [-p PERPLEXITY] [-t THETA]
                  [-r RANDSEED] [-n INITIAL_DIMS] [-v] [-i INPUT]
                  [-o OUTPUT] [--use_pca] [--no_pca] [-m MAX_ITER]
Below are the various options the wrapper program bhtsne.py expects:
- -h, --help: show this help message and exit
- -d NO_DIMS, --no_dims NO_DIMS
- -p PERPLEXITY, --perplexity PERPLEXITY
- -t THETA, --theta THETA
- -r RANDSEED, --randseed RANDSEED
- -n INITIAL_DIMS, --initial_dims INITIAL_DIMS
- -v, --verbose
- -i INPUT, --input INPUT: the input file, expects a TSV with the first row as the header.
- -o OUTPUT, --output OUTPUT: a TSV file having each row as the d-dimensional embedding.
- --use_pca
- --no_pca
- -m MAX_ITER, --max_iter MAX_ITER
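To illustrate the command-line workflow, the sketch below writes a TSV input file with a header row, assembles an argument list from the options listed above, and reads the output TSV back. The helper names (`write_input_tsv`, `build_command`, `read_embedding_tsv`), the file names, and the default parameter values are hypothetical; only the flags come from the usage text. It also assumes the output file holds one tab-separated row of floats per sample with no header row.

```python
import csv
import sys


def write_input_tsv(path, header, rows):
    """Write samples as TSV; the first row is the header the wrapper expects."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows)


def build_command(input_path, output_path, no_dims=2, perplexity=50, theta=0.5):
    """Assemble an argv list using only flags from the usage text."""
    return [
        sys.executable, "bhtsne.py",
        "-i", input_path,
        "-o", output_path,
        "-d", str(no_dims),
        "-p", str(perplexity),
        "-t", str(theta),
    ]


def read_embedding_tsv(path):
    """Parse the output TSV into float rows (assumes no header row)."""
    with open(path, newline="") as f:
        reader = csv.reader(f, delimiter="\t")
        return [[float(x) for x in row] for row in reader if row]
```

Run from the repository directory, `subprocess.check_call(build_command("input.tsv", "embedding.tsv"))` would invoke the wrapper, after which `read_embedding_tsv("embedding.tsv")` returns one list of coordinates per input row.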