backpan-index's Introduction

NAME
    BackPAN::Index - An interface to the BackPAN index

SYNOPSIS
        use BackPAN::Index;
        my $backpan = BackPAN::Index->new;

        # These are all DBIx::Class::ResultSets
        my $files    = $backpan->files;
        my $dists    = $backpan->dists;
        my $releases = $backpan->releases("Acme-Pony");

        # Use DBIx::Class::ResultSet methods on them
        my $release = $releases->single({ version => '1.23' });

        my $dist = $backpan->dist("Test-Simple");
        my $dist_releases = $dist->releases;

DESCRIPTION
    This downloads, caches and parses the BackPAN index into a local
    database for efficient querying.

    It's a pretty thin wrapper around DBIx::Class, returning
    DBIx::Class::ResultSet objects, which makes it efficient and flexible.

    The Comprehensive Perl Archive Network (CPAN) is a very useful
    collection of Perl code. However, to keep CPAN relatively small,
    authors can delete older versions of their modules so that CPAN holds
    only the latest version. BackPAN is where these deleted versions are
    backed up. It is effectively a full CPAN mirror, only without the
    deletions. This module provides an index of BackPAN and some handy
    methods.

METHODS
  new
        my $backpan = BackPAN::Index->new(\%options);

    Create a new object representing the BackPAN index.

    It will, if necessary, download the BackPAN index and compile it into a
    database for efficient storage. Initial creation is slow, but it will be
    cached.

    new() takes the following options:

   update
    Because it is rather large, BackPAN::Index caches a copy of the BackPAN
    index and builds a local database to speed access. This flag controls
    whether the local index is updated.

    If true, forces an update of the BackPAN index.

    If false, the index will never be updated even if the cache is expired.
    A new index is always created if one does not exist, regardless of this
    flag.

    By default the index is cached and checked for updates according to
    "cache_ttl".

   cache_ttl
    How many seconds before checking for an updated index.

    Defaults to an hour (3600 seconds).

   debug
    If true, debug messages will be printed.

    Defaults to false.

   releases_only_from_authors
    If true, only files in the "authors" directory will be considered as
    releases. If false any file in the index may be considered for a
    release.

    Defaults to true.

   cache_dir
    Location of the cache directory.

    Defaults to whatever App::Cache does.

   backpan_index_url
    URL to the BackPAN index.

    Defaults to a sensible location.
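    As a rough illustration of how the options above might be combined,
    here is a minimal sketch. The six-hour TTL is an arbitrary example
    value, and the first call may be slow because the index has to be
    downloaded and compiled.

```perl
use strict;
use warnings;
use BackPAN::Index;

# Build (or reuse) the local index with a few explicit options.
my $backpan = BackPAN::Index->new({
    cache_ttl => 6 * 60 * 60,   # example value: check for a newer index every six hours
    debug     => 1,             # print debug messages
});

printf "%d files indexed\n", $backpan->files->count;
```

    Passing update => 1 instead would force a fresh download regardless of
    the cache age, and update => 0 would suppress updates entirely.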

  files
        my $files = $backpan->files;

    Returns a ResultSet representing all the files on BackPAN.

  files_by
        my $files = $backpan->files_by($cpanid);
        my @files = $backpan->files_by($cpanid);

    Returns all the files by a given $cpanid.

    In list context, returns a list of BackPAN::Index::File objects. In
    scalar context, returns a ResultSet.
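
    The difference between the two calling styles can be sketched as
    follows. The CPAN ID used here is only an example value; the same
    pattern should apply to dists_by and releases_by.

```perl
use strict;
use warnings;
use BackPAN::Index;

my $backpan = BackPAN::Index->new;
my $cpanid  = "MSCHWERN";   # example author ID; substitute any CPAN ID

# Scalar context: a ResultSet, so the count is computed by the database
# without loading every row.
my $files_rs = $backpan->files_by($cpanid);
printf "%s has %d files on BackPAN\n", $cpanid, $files_rs->count;

# List context: the File objects themselves are loaded into memory.
my @files = $backpan->files_by($cpanid);
printf "%d File objects loaded\n", scalar @files;
```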

  dists
        my $dists = $backpan->dists;

    Returns a ResultSet representing all the distributions on BackPAN.

  dist
        my $dist = $backpan->dist($dist_name);

    Returns a single BackPAN::Index::Dist object for $dist_name.

  dists_by
        my $dists = $backpan->dists_by($cpanid);
        my @dists = $backpan->dists_by($cpanid);

    Returns the dists which contain at least one release by the given
    $cpanid.

    In scalar context, returns a ResultSet. In list context, returns a list
    of the Dists.

  dists_changed_since
        my $dists = $backpan->dists_changed_since($time);

    Returns a ResultSet of distributions which have had releases at or after
    $time.

  releases
        my $all_releases  = $backpan->releases();
        my $dist_releases = $backpan->releases($dist_name);

    Returns a ResultSet representing all the releases on BackPAN. If a
    $dist_name is given it returns the releases of just one distribution.

  release
        my $release = $backpan->release($dist_name, $version);

    Returns a single BackPAN::Index::Release object for the given $dist_name
    and $version.

  releases_by
        my $releases = $backpan->releases_by($cpanid);
        my @releases = $backpan->releases_by($cpanid);

    Returns all the releases of a single author.

    In list context, returns a list of Releases. In scalar context, returns
    a ResultSet representing those releases.

  releases_since
        my $releases = $backpan->releases_since($time);

    Returns a ResultSet of releases which were released at or after $time.
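
    A minimal sketch of the time-based queries, assuming $time is given in
    seconds since the Unix epoch (the text above does not state the unit, so
    treat that as an assumption):

```perl
use strict;
use warnings;
use BackPAN::Index;

my $backpan = BackPAN::Index->new;

# Assumption: $time is seconds since the Unix epoch.
my $one_week_ago = time - 7 * 24 * 60 * 60;

my $recent_releases = $backpan->releases_since($one_week_ago);
my $changed_dists   = $backpan->dists_changed_since($one_week_ago);

printf "%d releases across %d distributions in the last week\n",
    $recent_releases->count, $changed_dists->count;
```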

EXAMPLES
    The real power of BackPAN::Index comes from DBIx::Class::ResultSet. It's
    very flexible and very powerful, but it's not always obvious how to get
    it to do things. Here are some examples.

        # How many files are on BackPAN?
        my $count = $backpan->files->count;

        # How big is BackPAN?
        my $size = $backpan->files->get_column("size")->sum;

        # What are the names of all the distributions?
        my @names = $backpan->dists->get_column("name")->all;

        # What path contains this release?
        my $path = $backpan->release("Acme-Pony", '1.01')->path;

        # Get all the releases of Moose ordered by version
        my @releases = $backpan->dist("Moose")->releases
                                              ->search(undef, { order_by => "version" });

AUTHOR
    Michael G Schwern <[email protected]>

COPYRIGHT
    Copyright 2009, Michael G Schwern

LICENSE
    This module is free software; you can redistribute it or modify it under
    the same terms as Perl itself.

SEE ALSO
    DBIx::Class::ResultSet, BackPAN::Index::File, BackPAN::Index::Release,
    BackPAN::Index::Dist

    Repository: <http://github.com/acme/parse-backpan-packages>

    Bugs: <http://rt.cpan.org/Public/Dist/Display.html?Name=Parse-BACKPAN-Packages>

backpan-index's People

Contributors

acme, book, ilmari, rafl, schwern, sergeyromanov

backpan-index's Issues

Make ordering releases by date easier.

Since the date is stored in the files table, not the releases table, you can't do a simple order_by on releases.

It might also be nice to have Distribution->releases order by date automatically.

Turn on Travis

Hey @book, would you turn on Travis for this repo? I don't think I can do it, I'm not an admin for this repo.

You'll want to merge the associated pull request first so there's a travis configuration file.

Fix how the database age is stored

Right now it's based on the file time, which is problematic (just look at the code).

Instead, store it in an extra table in the DB. This can also be set only AFTER a complete update is done, which eliminates the need for an empty database check.

Add a schema version for the BackPAN::Index

Force a deletion of the database if the schemas don't match.

If they do match, then on update delete all the rows and re-insert them rather than deleting the database. This allows a failed update to roll back and leave the user with a usable database.

Ability to change the BackPAN root URL

The root BackPAN URL used to build the URL for fetching a file is hard-coded in BackPAN::Index::File.

  1. Make backpan_root a proper r/w accessor.
  2. Allow changing the root URL in BackPAN::Index->new.

That second bit will be tricky because DBIx::Class is the thing usually creating BackPAN::Index::File objects and changing how it does that initialization might be hard.

Update from "recent" indexes

CC @neilbowers

The slowest part of BackPAN::Index is building the database. The whole thing has to be downloaded, read and rebuilt on every change.

If we had "recent" indexes like on CPAN, this could be done much faster.

BackPAN::Index::Create would be changed to...

  1. Build indexes which only go back to a certain date. An hour. A day. A week. A month.
  2. Put them on BackPAN mirrors like backpan-index-recent-1h.txt.gz

BackPAN::Index would be changed to...

  1. When building the database, note the newest file time as the age of the index (do not use the index file time since that will not accurately reflect the age of the backpan mirror it was built from) in a new table in the database.
  2. Try to retrieve the appropriate "recent" index (e.g. if your database is 6 days old, get backpan-index-recent-1w.txt.gz).
  3. If not available, get the normal index file.
  4. Update from the file, ignore any file in the index which is older than the database.
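
The selection in step 2 could be sketched roughly as below. This is only an illustration of the proposal, not existing BackPAN::Index code: the function name is hypothetical, and only the "1h" and "1w" file names appear in the proposal, so the "1d" and "1m" names (and treating a month as 30 days) are assumptions made by analogy.

```perl
use strict;
use warnings;

# Hypothetical windows, smallest first. Only the "1h" and "1w" file names
# come from the proposal; "1d" and "1m" (30 days) are assumed by analogy.
my @RECENT_INDEXES = (
    [ 60 * 60,           "backpan-index-recent-1h.txt.gz" ],
    [ 24 * 60 * 60,      "backpan-index-recent-1d.txt.gz" ],
    [ 7 * 24 * 60 * 60,  "backpan-index-recent-1w.txt.gz" ],
    [ 30 * 24 * 60 * 60, "backpan-index-recent-1m.txt.gz" ],
);

# Return the smallest "recent" index that still covers a database of the
# given age in seconds, or nothing when only the full index will do.
sub recent_index_for_age {
    my ($age) = @_;
    for my $entry (@RECENT_INDEXES) {
        my ($window, $file) = @$entry;
        return $file if $age <= $window;
    }
    return;
}

my $six_days = 6 * 24 * 60 * 60;
print recent_index_for_age($six_days), "\n";   # prints backpan-index-recent-1w.txt.gz
```

When the function returns nothing, the caller would fall back to the normal index file, as in step 3.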

What do you think?

Add a glossary

People get confused about the difference between a distribution, a module and a release.

Change args to BackPAN::Index->new

update                      If true, force an update. If false, never update.
ttl                         Time for the cache to live.
debug                       Turn on debugging.
releases_only_from_authors  Only the authors directory is considered for releases.
cache_dir                   The directory to put the cache in.
backpan_index_url           Where to get the BackPAN index file.

Add a "short path" feature to BackPAN::Index::Release

BackPAN::Index::Release->path is too much for a lot of purposes. authors/id/S/SO/SOMEONE/Foo-Bar-1.23.tar.gz is useful if you want to build a URL to fetch the file, but if all you want is a canonical name for a release archive, SOMEONE/Foo-Bar-1.23.tar.gz is enough.

I've been calling this "short_path", though I have some reservations that the name isn't descriptive enough.
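
One way the shortening could work is sketched below as a standalone function. It is not the module's implementation; it simply assumes release paths follow the authors/id/X/XY/CPANID/archive layout shown above and leaves any other path untouched.

```perl
use strict;
use warnings;

# Strip the "authors/id/X/XY/" prefix from a release path, leaving
# "CPANID/archive". Paths that do not match are returned unchanged.
sub short_path {
    my ($path) = @_;
    (my $short = $path) =~ s{^authors/id/[^/]/[^/]{2}/}{};
    return $short;
}

print short_path("authors/id/S/SO/SOMEONE/Foo-Bar-1.23.tar.gz"), "\n";
# prints SOMEONE/Foo-Bar-1.23.tar.gz
```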
