ratarmount indexes for PMC OpenAccess subset
rngadam@coderbunker.com

folder oa_bulk-ratarmount_indexes_compressed (18 files)
comm_use.0-9A-B.txt.tar.gz.index.sqlite.gz 18.99MB
comm_use.A-B.xml.tar.gz.index.sqlite.gz 22.96MB
comm_use.C-H.txt.tar.gz.index.sqlite.gz 23.29MB
comm_use.C-H.xml.tar.gz.index.sqlite.gz 28.76MB
comm_use.I-N.txt.tar.gz.index.sqlite.gz 23.96MB
comm_use.I-N.xml.tar.gz.index.sqlite.gz 30.19MB
comm_use.O-Z.txt.tar.gz.index.sqlite.gz 30.64MB
comm_use.O-Z.xml.tar.gz.index.sqlite.gz 45.13MB
mount.sh 1.02kB
non_comm_use.0-9A-B.txt.tar.gz.index.sqlite.gz 11.00MB
non_comm_use.A-B.xml.tar.gz.index.sqlite.gz 10.66MB
non_comm_use.C-H.txt.tar.gz.index.sqlite.gz 15.41MB
non_comm_use.C-H.xml.tar.gz.index.sqlite.gz 14.65MB
non_comm_use.I-N.txt.tar.gz.index.sqlite.gz 28.51MB
non_comm_use.I-N.xml.tar.gz.index.sqlite.gz 28.19MB
non_comm_use.O-Z.txt.tar.gz.index.sqlite.gz 17.31MB
non_comm_use.O-Z.xml.tar.gz.index.sqlite.gz 11.84MB
README.md 1.04kB
Type: Dataset

Bibtex:
@article{,
title= {ratarmount indexes for PMC OpenAccess subset},
journal= {},
author= {rngadam@coderbunker.com},
year= {},
url= {},
abstract= {## the problem

The PMC Open Access bulk article set (commercial and non-commercial) is a hefty collection of files
that weighs in at 79G compressed and 388G uncompressed.

Decompressing the archives alone can take hours.

A BitTorrent mirror exists at:

https://academictorrents.com/details/06d6badd7d1b0cfee00081c28fddd5e15e106165

## the solution

ratarmount (https://github.com/mxmlnkn/ratarmount), a Python application, uses
FUSE (through fusepy) to mount a compressed archive as a filesystem, giving us random
access to the files in the archive without decompressing it first.

To achieve good performance, it creates an index (an sqlite database per archive).

This set of indexes still weighs in at 1.4G uncompressed (345M compressed).

## usage

* decompress all indexes into the same directory where you downloaded oa_bulk
* install ratarmount
* use ratarmount to mount the oa_bulk archives on the disk

A sample script, ```mount.sh```, is provided.
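
The steps above could be sketched roughly as follows. This is not the content of the provided script: the function names, the `mnt/` mount-point layout, and the directory argument are hypothetical choices made for illustration. It also assumes that ratarmount was installed (for example with `pip install ratarmount`) and that it picks up an index named `<archive>.index.sqlite` sitting next to the archive, which is its documented default location.

```shell
#!/bin/sh
# Illustrative sketch only; names and layout are assumptions.

# Derive a mount-point path from an archive name,
# e.g. comm_use.C-H.txt.tar.gz -> mnt/comm_use.C-H.txt
mountpoint_for() {
    base=$(basename "$1" .tar.gz)
    printf 'mnt/%s\n' "$base"
}

# Decompress every *.index.sqlite.gz found in directory $1,
# keeping the compressed copies and skipping indexes already unpacked.
decompress_indexes() {
    for idx in "$1"/*.index.sqlite.gz; do
        [ -e "$idx" ] || continue
        [ -e "${idx%.gz}" ] && continue
        gunzip -k "$idx"
    done
}

# Mount every *.tar.gz archive found in directory $1 under mnt/.
mount_all() {
    for archive in "$1"/*.tar.gz; do
        [ -e "$archive" ] || continue
        target=$(mountpoint_for "$archive")
        mkdir -p "$target"
        ratarmount "$archive" "$target"
    done
}
```

With the archives and compressed indexes together in a directory such as `oa_bulk`, calling `decompress_indexes oa_bulk` followed by `mount_all oa_bulk` would expose each archive under `mnt/`. On Linux a mount can usually be released again with `fusermount -u` on the mount point.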

## distribution

We also use BitTorrent to distribute the set of indexes. },
keywords= {PMC, PubMed, ratarmount},
terms= {},
license= {CC BY 4.0},
superseded= {}
}

