ratarmount indexes for PMC OpenAccess subset
rngadam@coderbunker.com

folder oa_bulk-ratarmount_indexes_compressed (18 files)
comm_use.0-9A-B.txt.tar.gz.index.sqlite.gz 18.99MB
comm_use.A-B.xml.tar.gz.index.sqlite.gz 22.96MB
comm_use.C-H.txt.tar.gz.index.sqlite.gz 23.29MB
comm_use.C-H.xml.tar.gz.index.sqlite.gz 28.76MB
comm_use.I-N.txt.tar.gz.index.sqlite.gz 23.96MB
comm_use.I-N.xml.tar.gz.index.sqlite.gz 30.19MB
comm_use.O-Z.txt.tar.gz.index.sqlite.gz 30.64MB
comm_use.O-Z.xml.tar.gz.index.sqlite.gz 45.13MB
mount.sh 1.02kB
non_comm_use.0-9A-B.txt.tar.gz.index.sqlite.gz 11.00MB
non_comm_use.A-B.xml.tar.gz.index.sqlite.gz 10.66MB
non_comm_use.C-H.txt.tar.gz.index.sqlite.gz 15.41MB
non_comm_use.C-H.xml.tar.gz.index.sqlite.gz 14.65MB
non_comm_use.I-N.txt.tar.gz.index.sqlite.gz 28.51MB
non_comm_use.I-N.xml.tar.gz.index.sqlite.gz 28.19MB
non_comm_use.O-Z.txt.tar.gz.index.sqlite.gz 17.31MB
non_comm_use.O-Z.xml.tar.gz.index.sqlite.gz 11.84MB
README.md 1.04kB
Type: Dataset

Bibtex:
@article{,
title= {ratarmount indexes for PMC OpenAccess subset},
journal= {},
author= {rngadam@coderbunker.com},
year= {},
url= {},
abstract= {## the problem

The PMC Open Access bulk article set (commercial and non-commercial) is a hefty set of files
that weighs in at 79G compressed and 388G uncompressed.

Decompressing the archives can by itself take hours.

A BitTorrent mirror is available at:

https://academictorrents.com/details/06d6badd7d1b0cfee00081c28fddd5e15e106165

## the solution

ratarmount (https://github.com/mxmlnkn/ratarmount), a Python application, uses FUSE
(through fusepy) to mount a compressed archive as a disk, which gives us random access
to the files in the archive without decompressing it first.

To achieve good performance, it creates an index (one SQLite database per archive).

This set of indexes still weighs in at 1.4G uncompressed (345M compressed).
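
Since each index is an ordinary SQLite file, it can be inspected with standard tools before mounting anything. The snippet below is a minimal sketch, not part of the dataset: `list_tables` is a hypothetical helper that only queries SQLite's built-in `sqlite_master` catalog, so it makes no assumption about how ratarmount lays out its tables.

```python
import sqlite3
from contextlib import closing


def list_tables(index_path):
    """Return the names of the tables stored in an SQLite file."""
    # sqlite_master is SQLite's own catalog table; nothing here
    # depends on the ratarmount index schema.
    with closing(sqlite3.connect(index_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]
```

For example, `list_tables("comm_use.A-B.xml.tar.gz.index.sqlite")` would list the tables of one decompressed index.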

## usage

* decompress all indexes into the directory where you downloaded oa_bulk
* install ratarmount
* use ratarmount to mount the oa_bulk archives on the disk

A sample script, ```mount.sh```, is provided.
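
The steps above can also be sketched in Python. This is an illustration under stated assumptions, not a copy of the provided script: `decompress_indexes` and `mount_commands` are hypothetical helper names, the per-archive mount point layout is an arbitrary choice, and the sketch relies on ratarmount picking up an index named after the archive with `.index.sqlite` appended when it sits in the same directory.

```python
import gzip
import shutil
from pathlib import Path


def decompress_indexes(directory):
    """Gunzip every *.index.sqlite.gz in `directory`, next to the archives."""
    written = []
    for gz_path in sorted(Path(directory).glob("*.index.sqlite.gz")):
        out_path = gz_path.with_suffix("")  # drops only the trailing .gz
        if not out_path.exists():
            with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
                shutil.copyfileobj(src, dst)
        written.append(out_path)
    return written


def mount_commands(directory, mount_root):
    """Build one `ratarmount <archive> <mountpoint>` command per archive."""
    commands = []
    for archive in sorted(Path(directory).glob("*.tar.gz")):
        # Assumed layout: one mount point per archive, named after it.
        mountpoint = Path(mount_root) / archive.name[: -len(".tar.gz")]
        commands.append(["ratarmount", str(archive), str(mountpoint)])
    return commands
```

Each returned command can then be run with `subprocess.run(cmd, check=True)`. Creating the mount point directories beforehand is a safe precaution.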

## distribution

We also use BitTorrent to distribute the set of indexes. },
keywords= {PMC, PubMed, ratarmount},
terms= {},
license= {CC BY 4.0},
superseded= {}
}


