belikor / lbrytools
Python library with useful methods built on top of the lbrynet client from the LBRY project
License: MIT License
This follows from the closed issues #6 and #7.
At the moment the generic function space.cleanup_space looks at all claims. The parameter never_delete lists channels whose claims should not be deleted.
Maybe a new parameter keep could be used to avoid deleting selected claims, either by 'name' or by 'claim_id', regardless of the channel.
keep = ["is-justin-bieber-a-christian-singer:d",
"b17e56e5f3b476b7f7a82916a340028aa9292f87"]
dd = t.cleanup_space(main_dir="/opt", size=1000, percent=90,
keep=keep, what="media")
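Inside the cleanup loop, the new parameter could be honored with a small predicate; a minimal sketch, where should_keep is a hypothetical helper and the claim keys 'claim_name' and 'claim_id' are assumptions based on the `lbrynet file list` output (matching is simplified to exact string comparison):

```python
def should_keep(claim, keep):
    """Return True if this claim matches an entry of the proposed `keep`
    list, either by its 'name' or by its 'claim_id', so that
    cleanup_space would skip it regardless of the channel."""
    keep = keep or []
    return claim.get("claim_name") in keep or claim.get("claim_id") in keep
```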
With LBRY, when a claim is downloaded, it downloads blob files that are stored in the blobfiles directory. In Linux this is normally
/home/user/.local/share/lbrynet/blobfiles
However, if the claim is re-uploaded, for example if the file is re-encoded, the blobs will be different. A new set of blobs will have to be downloaded, but the old blobs will remain in the system, taking hard drive space.
A function needs to be created to examine the blobfiles directory so that only the currently managed claims have blobs. All other blobs, which are not tied to a specific claim, should be deleted so that they don't take unnecessary space in the system.
Each claim with a URI or 'claim_id' will have a "manifest" blob file. This blob file is named after the 'sd_hash' of the claim. This information is found under a specific key in the dictionary representing the claim, item["value"]["source"]["sd_hash"].
Inside this manifest blob file there is JSON data listing all blobs that make up the claim. Therefore, by examining this manifest blob file, we can know whether all of its blobs are present in the blobfiles directory or not.
We can get all claims with search.sort_files (lbrynet file list), and examine the 'sd_hash' of each of them, to find all blobs in blobfiles.
All additional blobs that don't seem to belong to any claim, that is, that are not contained in any manifest blob file, should be considered orphaned, and thus can be deleted from the system.
Reference documentation on how content is encoded in LBRY using blobs: https://lbry.tech/spec#data
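The steps above could be sketched as follows. This is only an outline, assuming the manifest (sd) blob is plain JSON with a "blobs" list whose entries carry a "blob_hash" key, as described in the spec:

```python
import json
import os

def find_orphan_blobs(blobfiles_dir, sd_hashes):
    """Return the blobs in blobfiles_dir not referenced by any claim.

    sd_hashes: the 'sd_hash' of every currently managed claim, taken
    from item["value"]["source"]["sd_hash"] of `lbrynet file list`.
    """
    known = set(sd_hashes)

    # Read each manifest blob (named after its sd_hash) and collect
    # the hashes of the data blobs that make up the stream.
    for sd_hash in sd_hashes:
        manifest = os.path.join(blobfiles_dir, sd_hash)
        if not os.path.exists(manifest):
            continue
        with open(manifest) as fd:
            data = json.load(fd)
        for blob in data.get("blobs", []):
            # The final stream terminator entry has no 'blob_hash'.
            if "blob_hash" in blob:
                known.add(blob["blob_hash"])

    # Anything in the directory that is not a known hash is orphaned.
    return [b for b in os.listdir(blobfiles_dir) if b not in known]
```

The returned list could then be passed to a deletion routine, or just printed for review before anything is removed.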
Most claims, when resolved online, have the same type of data in the output dictionary; instead of handling the printing with special functions, we could simplify this by using a single function with many options.
For example, print_claims.print_sch_claims can already print a list of claims with a lot of information, which is controlled by the parameters given to it. We can display block height, claim_id, type, channel name, title, and others.
This function could be used in all methods that require printing claims.
print_sch_claims(claims,
blocks=False, claim_id=False,
typ=False, ch_name=False,
title=False, sanitize=False,
start=1, end=0,
reverse=False,
file=None, fdate=None, sep=";")
If we have downloaded many claims from the same channel, we may want to delete the older ones from that channel, without considering other channels.
At the moment the generic function clean.cleanup_space looks at all claims. The parameter never_delete lists channels to avoid. Maybe a new parameter only_delete could do the opposite, and consider only the channels to clean up, as opposed to avoid.
channels = ["@lbry", "@odysee"]
dd = t.cleanup_space(main_dir="/opt", size=1000, percent=90,
only_delete=channels, what="media")
A new function could be used to remove all older videos from a specific channel, and only leave a select number of the newest ones.
# leave only the 3 newest videos of the channel
channels = ["@lbry", "@odysee"]
dd = t.cleanup_channels(channels=channels, number=3, what="media")
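The selection logic behind such a cleanup_channels function could look like this; a sketch only, assuming each claim dictionary carries a sortable 'release_time' field:

```python
def older_claims(claims, number=3):
    """Return the claims that cleanup_channels would delete:
    everything except the `number` newest ones, ordered by the
    assumed 'release_time' field."""
    ordered = sorted(claims, key=lambda c: c["release_time"], reverse=True)
    return ordered[number:]
```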
We need a function that takes a list of claims to delete, just like download.redownload_claims takes a list of claims to download or re-download.
dd = t.remove_claims(ddir="/opt", file="remove.txt")
dd = t.remove_claims(ddir="/opt", start=10, end=20, file="remove.txt")
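A sketch of how remove_claims could work; the file is assumed to list one claim ID per line, and file_delete is the SDK method that removes a downloaded claim:

```python
def read_claim_list(path, start=1, end=0):
    """Read claim IDs (one per line) from `path`, keeping only the
    entries from line `start` to line `end` (0 means until the end)."""
    with open(path) as fd:
        ids = [line.strip() for line in fd if line.strip()]
    return ids[start - 1:end] if end else ids[start - 1:]

def remove_claims(path, start=1, end=0, server="http://localhost:5279"):
    """Delete each listed claim through the daemon's file_delete method."""
    import requests  # only needed when actually talking to the daemon
    for cid in read_claim_list(path, start, end):
        msg = {"method": "file_delete",
               "params": {"claim_id": cid,
                          "delete_from_download_dir": True}}
        requests.post(server, json=msg).json()
```

The ddir handling and error reporting of the real function are omitted here; this only shows the list-plus-range mechanics.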
These tools were developed to make lbrynet simpler to use with many files, particularly to download many claims from multiple channels, and thus help with seeding them.
It would be ideal to integrate them into the lbrynet program itself so they can be used by expert users, and by the LBRY Desktop application to manage many claims in a graphical way.
Currently we communicate with the lbrynet daemon through the subprocess module.
import subprocess

get_cmd = ["lbrynet",
           "get",
           "lbry://@asaaa#5/a#b"]
output = subprocess.run(get_cmd,
                        capture_output=True,
                        check=True,
                        text=True)
Instead of doing it this way, we could also use the requests module, by sending JSON messages directly to the daemon running on localhost.
import requests
server = "http://localhost:5279"
json = {"method": "get",
"params": {"uri": "astream#bcd03a"}}
requests.post(server, json=json).json()
This is probably better and faster.
The reason the current functions use subprocess is merely historical, as this code is based on tuxfoo's lbry-seedit, and that's how he did it.
This is a similar implementation but using json messages.
https://odysee.com/@BrendonBrewer:3/channel-download:8
The lbrytools.print_summary function is able to create a CSV file with all claims downloaded so far.
p = print_summary(title=True, typ=False, path=False,
cid=True, blobs=True, ch=False, name=True)
It would not be too hard to place this information in an XML format (.nfo) so that it can be read by Kodi and Jellyfin.
When it is cycling through the list of claims, it could write one .nfo file per claim.
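A minimal sketch of such an export, assuming the claim dictionary has 'claim_name', 'claim_id', and 'title' keys (the exact keys would come from `lbrynet file list`); the tag layout follows the common Kodi "movie" .nfo structure:

```python
import os
import xml.etree.ElementTree as ET

def write_nfo(claim, out_dir):
    """Write a minimal Kodi/Jellyfin 'movie' .nfo file for one claim."""
    root = ET.Element("movie")
    ET.SubElement(root, "title").text = claim.get("title",
                                                  claim["claim_name"])
    # Kodi uses <uniqueid> entries to store external identifiers.
    ET.SubElement(root, "uniqueid", type="lbry").text = claim["claim_id"]
    path = os.path.join(out_dir, claim["claim_name"] + ".nfo")
    ET.ElementTree(root).write(path, encoding="utf-8",
                               xml_declaration=True)
    return path
```

Real claims would also supply plot, release date, and channel fields, which map naturally to further SubElement calls.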
This follows from the closed issue #6.
At the moment the generic function clean.cleanup_space looks at all claims. The parameter never_delete lists channels to avoid when deleting claims.
Maybe a new parameter only_delete could do the opposite, and consider only these channels when cleaning up older files.
channels = ["@lbry", "@odysee"]
dd = t.cleanup_space(main_dir="/opt", size=1000, percent=90,
only_delete=channels, what="media")
Internally, the function would use the methods clean.ch_cleanup and clean.ch_cleanup_multi introduced in e45bf17.
These functions delete all videos from a particular channel, leaving only the newest ones, as established by the number=x parameter.
Currently the lbrynet command is run through subprocess; if the returncode is 1, this indicates an error, and it abruptly terminates the Python script.
import subprocess
import sys

get_cmd = ["lbrynet",
           "get",
           "lbry://@asaaa#5/a#b"]
output = subprocess.run(get_cmd,
                        capture_output=True,
                        check=False,  # with check=True a failure raises before the test below
                        text=True)

if output.returncode == 1:
    print(f"Error: {output.stderr}")
    sys.exit(1)
Although lbrynet seems to be quite stable and rarely returns an error, this should be handled better; exit should be avoided, as it terminates the complete script wherever the function is being used.
This returncode may be exclusive to running subprocess. Maybe by solving issue #1, we can avoid this altogether.
As mentioned in #1, the reason we do it like this at the moment is historical. We started by copying the code from another programmer, and we just continued that.
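One way to handle failures without killing the caller's script is to catch the error and return None; a sketch, where run_cmd is a hypothetical helper, not an existing lbrytools function:

```python
import subprocess

def run_cmd(cmd):
    """Run a command such as ["lbrynet", "get", uri] and return its
    stdout, or None on failure, instead of calling sys.exit()."""
    try:
        output = subprocess.run(cmd, capture_output=True,
                                check=True, text=True)
    except (subprocess.CalledProcessError, FileNotFoundError) as err:
        # Report the problem and let the caller decide what to do next.
        print(f"Command failed: {err}")
        return None
    return output.stdout
```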
How do I put it inside a Python script and list the links to the video pages of a channel?
Most of the time I get the error "NameError: name 'lbrytools' is not defined".
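The NameError means lbrytools was never imported in the script. Assuming the library is on the Python path, the channel's claims could be fetched with ch_search_latest and their URLs turned into web links; the small converter below is a hypothetical helper, and 'canonical_url' is an assumed claim key:

```python
def to_odysee(canonical_url):
    """Turn an LBRY canonical URL into an Odysee video-page link
    (Odysee URLs use ':' where LBRY URLs use '#')."""
    return (canonical_url.replace("lbry://", "https://odysee.com/", 1)
                         .replace("#", ":"))

# Hypothetical usage, assuming lbrytools is importable:
#
#   import lbrytools as t
#   claims = t.ch_search_latest("@some-channel", number=10)
#   for claim in claims:
#       print(to_odysee(claim["canonical_url"]))
```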
Various operations such as download and delete can only be performed on downloadable content, that is, streams.
At the moment many functions depend on searching multiple claims from a channel by using search_ch.ch_search_latest.
Instead of returning all claim types (streams, reposts, collections, livestreams) from the search, we should add an option to only return streams:
claims = ch_search_latest("@some-chn", number=12, only_streams=False)
streams = ch_search_latest("@some-chn", number=12, only_streams=True)
This can be implemented by specifying the claim_type and using the has_source parameter of claim_search in lbrynet:
lbrynet claim search --channel=@some-chn --claim_type=stream --has_source=True
Livestreams are of type 'stream', but they don't have a source, so they are not downloadable and should be avoided.
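Via the JSON-RPC interface, the same search corresponds to a claim_search message like the one built below; a sketch only, where streams_query is a hypothetical helper:

```python
def streams_query(channel, number=12):
    """Build the claim_search message for the proposed stream-only
    search: claim_type='stream' plus has_source=True excludes
    livestreams, which have no downloadable source."""
    return {"method": "claim_search",
            "params": {"channel": channel,
                       "claim_type": "stream",
                       "has_source": True,
                       "page_size": number}}

# Hypothetical usage against a running daemon:
#   import requests
#   output = requests.post("http://localhost:5279",
#                          json=streams_query("@some-chn")).json()
```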
The program zeedit.py is contained in a single file, and uses lbrytools as its backend.
The program lbrydseed (https://github.com/belikor/lbrydseed) also uses lbrytools as its backend.
Therefore, this repository, lbrytools, should contain only the code of the library, and it should be placed at the top level of the repository so that it can be used as a submodule by other programs, that is, zeedit and lbrydseed.
In turn, a new zeedit repository should be created to contain only the code for this program, and it should have lbrytools as a submodule, just like lbrydseed.
At the moment we call the SDK methods by using requests.
import requests

server = "http://localhost:5279"
msg = {"method": "claim_search",
       "params": {"channel": "@example",
                  "page": 1}}
output = requests.post(server, json=msg).json()
if "error" in output:
    return False
result = output["result"]
This essentially creates a wrapper around the lbrynet methods.
Instead, we can use the already existing wrapper https://github.com/osilkin98/PyBRY
This allows us to use any method defined in lbrynet like a native Python method.
import pybry
lbry = pybry.LbrydApi()
output = lbry.claim_search(channel="@example", page=1)
result = output[0]
This would make access to the SDK more uniform, and all methods of the SDK would be available with all their parameters; therefore, there would be no need to write more wrappers as we do now.