ratarmount indexes for PMC OpenAccess subset
rngadam@coderbunker.com

folder oa_bulk-ratarmount_indexes_compressed (18 files)
comm_use.0-9A-B.txt.tar.gz.index.sqlite.gz 18.99MB
comm_use.A-B.xml.tar.gz.index.sqlite.gz 22.96MB
comm_use.C-H.txt.tar.gz.index.sqlite.gz 23.29MB
comm_use.C-H.xml.tar.gz.index.sqlite.gz 28.76MB
comm_use.I-N.txt.tar.gz.index.sqlite.gz 23.96MB
comm_use.I-N.xml.tar.gz.index.sqlite.gz 30.19MB
comm_use.O-Z.txt.tar.gz.index.sqlite.gz 30.64MB
comm_use.O-Z.xml.tar.gz.index.sqlite.gz 45.13MB
mount.sh 1.02kB
non_comm_use.0-9A-B.txt.tar.gz.index.sqlite.gz 11.00MB
non_comm_use.A-B.xml.tar.gz.index.sqlite.gz 10.66MB
non_comm_use.C-H.txt.tar.gz.index.sqlite.gz 15.41MB
non_comm_use.C-H.xml.tar.gz.index.sqlite.gz 14.65MB
non_comm_use.I-N.txt.tar.gz.index.sqlite.gz 28.51MB
non_comm_use.I-N.xml.tar.gz.index.sqlite.gz 28.19MB
non_comm_use.O-Z.txt.tar.gz.index.sqlite.gz 17.31MB
non_comm_use.O-Z.xml.tar.gz.index.sqlite.gz 11.84MB
README.md 1.04kB
Type: Dataset

Metadata:
@article{,
title= {ratarmount indexes for PMC OpenAccess subset},
journal= {},
author= {rngadam@coderbunker.com},
year= {},
url= {},
abstract= {## the problem

The PMC Open Access bulk article set (commercial and non-commercial use) is a hefty
collection of files that weighs in at 79G compressed and 388G uncompressed.

Decompressing the archives alone can take hours.

A BitTorrent mirror exists at:

https://academictorrents.com/details/06d6badd7d1b0cfee00081c28fddd5e15e106165

## the solution

ratarmount (https://github.com/mxmlnkn/ratarmount), a Python application, uses FUSE
(through fusepy) to mount a compressed archive as a disk, allowing us to randomly
access files in the archive without decompressing it first.

To achieve good performance, it creates an index (one SQLite database per archive).

This set of indexes still weighs in at 1.4G uncompressed (345M compressed).
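
Since each index is an ordinary SQLite database, it can be inspected with standard tools once decompressed. The sketch below only lists the tables an index contains; it assumes nothing about ratarmount's schema, and the function name `list_index_tables` is hypothetical.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_index_tables(index_path):
    """Return the names of the tables stored in one decompressed index file."""
    # Open read-only so that inspecting an index can never modify it.
    uri = Path(index_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as connection:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]
```

For example, `list_index_tables("comm_use.A-B.xml.tar.gz.index.sqlite")` would return the table names found in that index, whatever they turn out to be.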

## usage

* decompress all indexes into the same directory where you downloaded oa_bulk
* install ratarmount
* use ratarmount to mount the oa_bulk archives on the disk

a sample script, ```mount.sh```, is provided
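
The steps above can also be sketched in Python. This is not the content of ```mount.sh```, only an illustration under two assumptions: that ratarmount picks up an index stored next to its archive as `<archive>.index.sqlite`, and that it is invoked as `ratarmount <archive> <mount point>`. The function names and the mount root are hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def decompress_indexes(directory):
    """Decompress every *.index.sqlite.gz in `directory`, next to the archives."""
    indexes = []
    for compressed in sorted(Path(directory).glob("*.index.sqlite.gz")):
        target = compressed.with_suffix("")  # drops only the trailing ".gz"
        if not target.exists():  # skip indexes that were already decompressed
            with gzip.open(compressed, "rb") as source, open(target, "wb") as sink:
                shutil.copyfileobj(source, sink)
        indexes.append(target)
    return indexes


def mount_commands(directory, mount_root):
    """Build one ratarmount command per decompressed index (assumed CLI form)."""
    commands = []
    for index in sorted(Path(directory).glob("*.index.sqlite")):
        # The archive name is the index name without the ".index.sqlite" suffix.
        archive = index.with_name(index.name[: -len(".index.sqlite")])
        mount_point = Path(mount_root) / archive.name
        commands.append(["ratarmount", str(archive), str(mount_point)])
    return commands
```

Each returned command can then be run with `subprocess.run(command, check=True)`, after making sure the corresponding archive has finished downloading.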

## distribution

we also use BitTorrent to distribute the set of indexes. },
keywords= {PMC, PubMed, ratarmount},
terms= {},
license= {CC BY 4.0},
superseded= {}
}

Citation:
rngadam@coderbunker.com. (2020). ratarmount indexes for PMC OpenAccess subset [Data set]. Academic Torrents. https://academictorrents.com/details/e95526a0bc4f39a5bbf423b24708d65fa4542d20
