Type: Dataset
Tags: Dataset
Bibtex:
@article{,
title = {Documents and Dependencies: an Exploration of Vector Space Models for Semantic Composition},
journal = {},
author = {Alona Fyshe},
year = {2013},
url = {https://www.cs.cmu.edu/~afyshe/},
abstract = {This zip should contain 4 files:
- README.txt (this file)
- doc2Dep20MWU57k_1000concat2000.tab
- doc2Dep20MWU57k_1000concat2000.txt
- doc2Dep20MWU57k_1000concat2000.mat

****doc2Dep20MWU57k_1000concat2000.txt****
This file contains the 54975 word-units with POS tags. The order of the words in this file corresponds to the order of the rows in doc2Dep20MWU57k_1000concat2000.tab

****doc2Dep20MWU57k_1000concat2000.tab****
This tab-separated-value file contains the concatenated SVD matrices, created as described in "Documents and Dependencies: an Exploration of Vector Space Models for Semantic Composition" (Fyshe 2013). The size of the matrix is 54975x2000. The first 1000 dimensions (1-1000) are Document dimensions, the second 1000 (1001-2000) are Dependency dimensions. The rows appear in the same order as the word-units in doc2Dep20MWU57k_1000concat2000.txt

****doc2Dep20MWU57k_1000concat2000.mat****
For convenience, this is the data contained in doc2Dep20MWU57k_1000concat2000.tab and doc2Dep20MWU57k_1000concat2000.txt saved into two MATLAB variables: count_matrix holds the concatenated SVD matrices (tab file), and words holds the word-units (txt file).

Questions may be directed to Alona Fyshe, afyshe at cs dot cmu dot edu.}
}
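The row and column layout described in the README can be illustrated with a short loading sketch. It uses only the Python standard library and is not part of the original distribution. It assumes that the .txt file holds one word-unit per line, that the .tab file has no header row, and that both are UTF-8 text; none of this is stated in the README. The function names (`load_word_units`, `load_matrix`, `split_doc_dep`, `load_vectors`) are hypothetical and chosen for illustration only.

```python
import csv

# Per the README: the first 1000 columns are Document dimensions,
# the remaining 1000 are Dependency dimensions.
N_DOC_DIMS = 1000


def load_word_units(txt_path):
    """Read word-units (with POS tags), assuming one per line."""
    with open(txt_path, encoding="utf-8") as f:
        return [line.rstrip("\r\n") for line in f if line.strip()]


def load_matrix(tab_path):
    """Read the tab-separated matrix, assuming no header row."""
    with open(tab_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f, delimiter="\t")
        # Skip blank rows and empty cells (e.g. from a trailing tab).
        return [[float(x) for x in row if x] for row in reader if row]


def split_doc_dep(vector, n_doc=N_DOC_DIMS):
    """Split one row into its (Document, Dependency) halves."""
    return vector[:n_doc], vector[n_doc:]


def load_vectors(txt_path, tab_path):
    """Pair each word-unit with its row; rows share the word order."""
    words = load_word_units(txt_path)
    matrix = load_matrix(tab_path)
    if len(words) != len(matrix):
        raise ValueError(
            f"{len(words)} word-units but {len(matrix)} matrix rows"
        )
    return dict(zip(words, matrix))
```

With the real files, `load_vectors("doc2Dep20MWU57k_1000concat2000.txt", "doc2Dep20MWU57k_1000concat2000.tab")` should yield 54975 entries of 2000 values each if the assumptions above hold. Plain Python lists are memory-hungry at that size, so an array library is likely a better fit for the full matrix.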