ratarmount indexes for PMC OpenAccess subset
rngadam@coderbunker.com

Folder: oa_bulk-ratarmount_indexes_compressed (18 files)
comm_use.0-9A-B.txt.tar.gz.index.sqlite.gz 18.99MB
comm_use.A-B.xml.tar.gz.index.sqlite.gz 22.96MB
comm_use.C-H.txt.tar.gz.index.sqlite.gz 23.29MB
comm_use.C-H.xml.tar.gz.index.sqlite.gz 28.76MB
comm_use.I-N.txt.tar.gz.index.sqlite.gz 23.96MB
comm_use.I-N.xml.tar.gz.index.sqlite.gz 30.19MB
comm_use.O-Z.txt.tar.gz.index.sqlite.gz 30.64MB
comm_use.O-Z.xml.tar.gz.index.sqlite.gz 45.13MB
mount.sh 1.02kB
non_comm_use.0-9A-B.txt.tar.gz.index.sqlite.gz 11.00MB
non_comm_use.A-B.xml.tar.gz.index.sqlite.gz 10.66MB
non_comm_use.C-H.txt.tar.gz.index.sqlite.gz 15.41MB
non_comm_use.C-H.xml.tar.gz.index.sqlite.gz 14.65MB
non_comm_use.I-N.txt.tar.gz.index.sqlite.gz 28.51MB
non_comm_use.I-N.xml.tar.gz.index.sqlite.gz 28.19MB
non_comm_use.O-Z.txt.tar.gz.index.sqlite.gz 17.31MB
non_comm_use.O-Z.xml.tar.gz.index.sqlite.gz 11.84MB
README.md 1.04kB
Type: Dataset

Metadata:
@article{,
title= {ratarmount indexes for PMC OpenAccess subset},
journal= {},
author= {rngadam@coderbunker.com},
year= {},
url= {},
abstract= {## the problem

The PMC Open Access bulk article set (commercial and non-commercial use) is a hefty
set of files that weighs in at 79G compressed and 388G uncompressed.

Decompressing the archives can by itself take hours.

A bittorrent mirror of the archives is available at:

https://academictorrents.com/details/06d6badd7d1b0cfee00081c28fddd5e15e106165

## the solution

ratarmount (https://github.com/mxmlnkn/ratarmount), a python application, uses FUSE
(through fusepy) to mount a compressed archive as a disk, allowing us to randomly
access files in the archive without first decompressing it.

To achieve good performance, it creates an index (an sqlite database per archive).

This set of indexes still weighs in at 1.4G uncompressed (345M compressed).
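
Since each index is an ordinary sqlite database, it can be opened with any sqlite client. The sketch below only lists the tables an index contains and assumes nothing about ratarmount's schema. The function name `list_index_tables` is made up for this illustration, and the file name in the usage comment is one of the indexes in this set after decompression.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_index_tables(index_path):
    """Return the names of the tables stored in one decompressed index file."""
    # Open the database read-only so that looking at an index never modifies it.
    uri = Path(index_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as connection:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


# Usage (path is relative to wherever the indexes were decompressed):
# print(list_index_tables("comm_use.A-B.xml.tar.gz.index.sqlite"))
```

Opening the file with `mode=ro` is a cautious choice: ratarmount is the only program meant to write to these files.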

## usage

* decompress all indexes into the same directory where you downloaded oa_bulk
* install ratarmount
* use ratarmount to mount the oa_bulk archives on the disk

a sample script ```mount.sh``` is provided
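
The steps above can also be sketched in python. This is not the content of ```mount.sh```, only a minimal illustration under two assumptions: ratarmount is invoked as `ratarmount <archive> <mount point>`, and it picks up an index named `<archive>.index.sqlite` sitting next to the archive (which matches the file names in this set). The helper names `decompress_indexes` and `mount_commands` and the directory names in the usage comment are hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def decompress_indexes(index_dir, archive_dir):
    """Gunzip every *.index.sqlite.gz from index_dir into archive_dir."""
    written = []
    for compressed in sorted(Path(index_dir).glob("*.index.sqlite.gz")):
        # "x.tar.gz.index.sqlite.gz" becomes "x.tar.gz.index.sqlite"
        target = Path(archive_dir) / compressed.name[: -len(".gz")]
        with gzip.open(compressed, "rb") as source, open(target, "wb") as sink:
            shutil.copyfileobj(source, sink)
        written.append(target)
    return written


def mount_commands(archive_dir, mount_root):
    """Build one ratarmount command per index found in archive_dir."""
    commands = []
    for index in sorted(Path(archive_dir).glob("*.tar.gz.index.sqlite")):
        # Strip ".index.sqlite" to get the archive the index belongs to.
        archive = index.with_name(index.name[: -len(".index.sqlite")])
        # Mount each archive under its own directory, e.g. mnt/comm_use.A-B.xml
        mount_point = Path(mount_root) / archive.name[: -len(".tar.gz")]
        commands.append(["ratarmount", str(archive), str(mount_point)])
    return commands


# Usage (hypothetical directory names; needs ratarmount installed):
# import subprocess
# decompress_indexes("oa_bulk-ratarmount_indexes_compressed", "oa_bulk")
# for command in mount_commands("oa_bulk", "mnt"):
#     Path(command[-1]).mkdir(parents=True, exist_ok=True)
#     subprocess.run(command, check=True)
```

Building the commands as lists and running them separately makes it easy to print and review them before anything is mounted.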

## distribution

we also use bittorrent to distribute the set of indexes. },
keywords= {PMC, PubMed, ratarmount},
terms= {},
license= {CC BY 4.0},
superseded= {}
}

Citation:
rngadam@coderbunker.com. (2020). ratarmount indexes for PMC OpenAccess subset [Data set]. Academic Torrents. https://academictorrents.com/details/e95526a0bc4f39a5bbf423b24708d65fa4542d20
