t-viSNE: Interactive Assessment and Interpretation of t-SNE Projections https://doi.org/10.1109/TVCG.2020.2986996

Angelos Chatzimparmpas 5c64e6ea35 updated a little bit visualization Former-commit-id: `f2e3631fa3`		6 years ago
Stored_Analyses	three plots NB	7 years ago
__pycache__	metrics	6 years ago
cachedir/joblib/tsneGrid	updated a little bit visualization	6 years ago
css	updated a little bit visualization	6 years ago
data	fixed metrics	6 years ago
js	updated a little bit visualization	6 years ago
modules	update	6 years ago
textures	fixed css	7 years ago
.gitignore	Initial commit	7 years ago
LICENSE	Add README.md	7 years ago
LICENSE.txt	test	6 years ago
Makefile.win	test	6 years ago
README.md	test	6 years ago
bh_tsne	test	6 years ago
bhtsne.py	test	6 years ago
fast_tsne.m	test	6 years ago
index.html	final version	6 years ago
requirements.txt	new	6 years ago
run.sh	new	6 years ago
sptree.cpp	test	6 years ago
sptree.h	test	6 years ago
tsne.cpp	test	6 years ago
tsne.h	test	6 years ago
tsneGrid.py	final version	6 years ago
tsne_main.cpp	test	6 years ago
vptree.h	test	6 years ago

README.md

This software package contains a Barnes-Hut implementation of the t-SNE algorithm. The implementation is described in this paper.

Note: This repository is a version of t-SNE modified to support ongoing research. It may be slightly slower than the original. If you're just trying to run t-SNE, check the original repository that we forked from.

Installation

On Linux or OS X, compile the source using the following command:

g++ sptree.cpp tsne.cpp tsne_main.cpp -o bh_tsne -O2

The executable will be called bh_tsne.

On Windows using Visual C++, do the following in your command line:

Find the vcvars64.bat file in your Visual C++ installation directory. Depending on the Visual Studio edition, the file may have a slightly different name, such as vcvarsx86_amd64.bat. For example:

  // Visual Studio 12
  "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\amd64\vcvars64.bat"

  // Visual Studio 2013 Express:
  C:\VisualStudioExp2013\VC\bin\x86_amd64\vcvarsx86_amd64.bat

From cmd.exe, change to the directory containing that .bat file and run it.
Then change to the bhtsne directory and run:

  nmake -f Makefile.win all

The executable will be called windows\bh_tsne.exe.

Usage

The code comes with wrappers for Matlab and Python. These wrappers write your data to a file called data.dat, run the bh_tsne binary, and read the result file result.dat that the binary produces. There are also external wrappers available for Torch, R, and Julia. Writing your own wrapper should be straightforward; please refer to one of the existing wrappers for the format of the data and result files.
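The write/run/read cycle described above can be sketched in Python as follows. This is a minimal illustration, not the code in bhtsne.py: the binary layout (a header of two ints, two doubles, and two ints, followed by the samples as raw doubles, and a result file of two ints followed by the embedding as raw doubles) is an assumption made for the example, and the helper names write_data_file, read_result_file, and run_wrapper are hypothetical. The wrappers in this repository remain the authoritative reference for the real file format.

```python
import os
import struct
import subprocess
import tempfile

# ASSUMED header layout: n, d, theta, perplexity, no_dims, max_iter.
# Check bhtsne.py / fast_tsne.m for the format the binary really expects.
HEADER = "=iiddii"


def write_data_file(path, rows, theta=0.5, perplexity=30.0, no_dims=2, max_iter=1000):
    """Write the assumed header followed by every sample as raw doubles."""
    n, d = len(rows), len(rows[0])
    with open(path, "wb") as f:
        f.write(struct.pack(HEADER, n, d, theta, perplexity, no_dims, max_iter))
        for row in rows:
            f.write(struct.pack("={}d".format(d), *row))


def read_result_file(path):
    """Read an ASSUMED result layout: two ints (n, dims), then n * dims doubles."""
    with open(path, "rb") as f:
        n, dims = struct.unpack("=ii", f.read(8))
        row_format = "={}d".format(dims)
        return [list(struct.unpack(row_format, f.read(8 * dims))) for _ in range(n)]


def run_wrapper(rows, binary="./bh_tsne", **params):
    """Write data.dat, run the binary next to it, and read back result.dat."""
    with tempfile.TemporaryDirectory() as workdir:
        write_data_file(os.path.join(workdir, "data.dat"), rows, **params)
        # The binary is run with the temporary directory as its working directory.
        subprocess.check_call([os.path.abspath(binary)], cwd=workdir)
        return read_result_file(os.path.join(workdir, "result.dat"))
```

Running everything inside a temporary directory keeps data.dat and result.dat from colliding when several embeddings are computed at the same time.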

Demonstration of usage in Matlab:

filename = websave('mnist_train.mat', 'https://github.com/awni/cs224n-pa4/blob/master/Simple_tSNE/mnist_train.mat?raw=true');
load(filename);
numDims = 2; pcaDims = 50; perplexity = 50; theta = .5; alg = 'svd';
map = fast_tsne(digits', numDims, pcaDims, perplexity, theta, alg);
gscatter(map(:,1), map(:,2), labels');

Demonstration of usage in Python:

import numpy as np
import bhtsne

data = np.loadtxt("mnist2500_X.txt", skiprows=1)

embedding_array = bhtsne.run_bh_tsne(data, initial_dims=data.shape[1])

Python Wrapper

Usage:

python bhtsne.py [-h] [-d NO_DIMS] [-p PERPLEXITY] [-t THETA]
                  [-r RANDSEED] [-n INITIAL_DIMS] [-v] [-i INPUT]
                  [-o OUTPUT] [--use_pca] [--no_pca] [-m MAX_ITER]

The wrapper program bhtsne.py accepts the following options:

-h, --help: show this help message and exit
-d NO_DIMS, --no_dims NO_DIMS
-p PERPLEXITY, --perplexity PERPLEXITY
-t THETA, --theta THETA
-r RANDSEED, --randseed RANDSEED
-n INITIAL_DIMS, --initial_dims INITIAL_DIMS
-v, --verbose
-i INPUT, --input INPUT: the input file; expects a TSV file whose first row is the header.
-o OUTPUT, --output OUTPUT: the output file; a TSV file in which each row is a d-dimensional embedding.
--use_pca
--no_pca
-m MAX_ITER, --max_iter MAX_ITER
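As a rough sketch of how the -i and -o options fit together, the following Python helpers prepare a TSV input file with a header row, build a command line from the flags listed above, and parse a TSV output file. The helper names (write_input_tsv, build_command, run_bhtsne, read_output_tsv), the file names, and the parameter values are hypothetical and chosen only for illustration. The sketch also assumes that the output file has no header row and that bhtsne.py is in the current directory.

```python
import csv
import subprocess
import sys


def write_input_tsv(path, header, rows):
    """Write a tab-separated file whose first row is the header."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows)


def build_command(input_path, output_path, no_dims, perplexity, theta):
    """Assemble the argument list using the -i, -o, -d, -p and -t flags."""
    return [sys.executable, "bhtsne.py",
            "-i", input_path, "-o", output_path,
            "-d", str(no_dims), "-p", str(perplexity), "-t", str(theta)]


def run_bhtsne(input_path, output_path, no_dims, perplexity, theta):
    """Run the wrapper script; raises CalledProcessError on a non-zero exit."""
    subprocess.check_call(
        build_command(input_path, output_path, no_dims, perplexity, theta))


def read_output_tsv(path):
    """Parse the output file into a list of rows of floats (no header assumed)."""
    with open(path, newline="") as f:
        return [[float(v) for v in row]
                for row in csv.reader(f, delimiter="\t") if row]
```

A typical sequence would be write_input_tsv("input.tsv", header, rows), then run_bhtsne("input.tsv", "output.tsv", 2, 30.0, 0.5), then read_output_tsv("output.tsv").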