belikor / lbrytools
Python library with useful methods built on top of the lbrynet client from the LBRY project
License: MIT License
This follows from the closed issues #6 and #7.
At the moment the generic function space.cleanup_space looks at all claims. The parameter never_delete lists channels whose claims should not be deleted.
Maybe a new parameter keep could be used to avoid deleting selected claims, either by 'name' or by 'claim_id', regardless of the channel.
keep = ["is-justin-bieber-a-christian-singer:d",
"b17e56e5f3b476b7f7a82916a340028aa9292f87"]
dd = t.cleanup_space(main_dir="/opt", size=1000, percent=90,
keep=keep, what="media")
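Inside the cleanup loop, the new parameter could be honored with a small predicate; a minimal sketch, where should_keep is a hypothetical helper and the claim keys 'claim_name' and 'claim_id' are assumptions based on the `lbrynet file list` output (matching is simplified to exact string comparison):

```python
def should_keep(claim, keep):
    """Return True if this claim matches an entry of the proposed `keep`
    list, either by its 'name' or by its 'claim_id', so that
    cleanup_space would skip it regardless of the channel."""
    keep = keep or []
    return claim.get("claim_name") in keep or claim.get("claim_id") in keep
```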
With LBRY, when a claim is downloaded, it downloads blob files that are stored in the blobfiles directory. In Linux this is normally
/home/user/.local/share/lbrynet/blobfiles
However, if the claim is re-uploaded, for example if the file is re-encoded, the blobs will be different. A new set of blobs will have to be downloaded, but the old blobs will remain in the system, taking hard drive space.
A function needs to be created to examine the blobfiles directory so that only the currently managed claims have blobs. All other blobs, which are not tied to a specific claim, should be deleted so that they don't take unnecessary space in the system.
Each claim with a URI or 'claim_id' will have a "manifest" blob file. This blob file is named after the 'sd_hash' of the claim. This information is found under a specific key in the dictionary representing the claim, item["value"]["source"]["sd_hash"].
Inside this manifest blob file there is JSON data listing all blobs that make up the claim. Therefore, by examining this manifest blob file, we can know whether all of its blobs are present in the blobfiles directory or not.
We can get all claims with search.sort_files (lbrynet file list), and examine the 'sd_hash' of each of them, to find all blobs in blobfiles.
All additional blobs that don't seem to belong to any claim, that is, that are not contained in any manifest blob file, should be considered orphaned, and thus can be deleted from the system.
Reference documentation on how content is encoded in LBRY using blobs: https://lbry.tech/spec#data
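The steps above could be sketched as follows. This is only an outline, assuming the manifest (sd) blob is plain JSON with a "blobs" list whose entries carry a "blob_hash" key, as described in the spec:

```python
import json
import os

def find_orphan_blobs(blobfiles_dir, sd_hashes):
    """Return the blobs in blobfiles_dir not referenced by any claim.

    sd_hashes: the 'sd_hash' of every currently managed claim, taken
    from item["value"]["source"]["sd_hash"] of `lbrynet file list`.
    """
    known = set(sd_hashes)

    # Read each manifest blob (named after its sd_hash) and collect
    # the hashes of the data blobs that make up the stream.
    for sd_hash in sd_hashes:
        manifest = os.path.join(blobfiles_dir, sd_hash)
        if not os.path.exists(manifest):
            continue
        with open(manifest) as fd:
            data = json.load(fd)
        for blob in data.get("blobs", []):
            # The final stream terminator entry has no 'blob_hash'.
            if "blob_hash" in blob:
                known.add(blob["blob_hash"])

    # Anything in the directory that is not a known hash is orphaned.
    return [b for b in os.listdir(blobfiles_dir) if b not in known]
```

The returned list could then be passed to a deletion routine, or just printed for review before anything is removed.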
Most claims, when resolved online, have the same type of data in the output dictionary; instead of handling the printing with special functions, we could simplify this by using a single function with many options.
For example, print_claims.print_sch_claims can already print a list of claims with a lot of information, which is controlled by the parameters given to it. We can display block height, claim_id, type, channel name, title, and others.
This function could be used in all methods that require printing claims.
print_sch_claims(claims,
blocks=False, claim_id=False,
typ=False, ch_name=False,
title=False, sanitize=False,
start=1, end=0,
reverse=False,
file=None, fdate=None, sep=";")
If we have downloaded many claims from the same channel, we may want to delete the older ones from that channel, without considering other channels.
At the moment the generic function clean.cleanup_space looks at all claims. The parameter never_delete lists channels to avoid. Maybe a new parameter only_delete could do the opposite, and consider only the channels to clean up, as opposed to avoid.
channels = ["@lbry", "@odysee"]
dd = t.cleanup_space(main_dir="/opt", size=1000, percent=90,
only_delete=channels, what="media")
A new function could be used to remove all older videos from a specific channel, and only leave a select number of the newest ones.
# leave only the 3 newest videos of the channel
channels = ["@lbry", "@odysee"]
dd = t.cleanup_channels(channels=channels, number=3, what="media")
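The selection logic behind such a cleanup_channels function could look like this; a sketch only, assuming each claim dictionary carries a sortable 'release_time' field:

```python
def older_claims(claims, number=3):
    """Return the claims that cleanup_channels would delete:
    everything except the `number` newest ones, ordered by the
    assumed 'release_time' field."""
    ordered = sorted(claims, key=lambda c: c["release_time"], reverse=True)
    return ordered[number:]
```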
We need a function that takes a list of claims to delete, just like download.redownload_claims takes a list of claims to download or re-download.
dd = t.remove_claims(ddir="/opt", file="remove.txt")
dd = t.remove_claims(ddir="/opt", start=10, end=20, file="remove.txt")
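A sketch of how remove_claims could work; the file is assumed to list one claim ID per line, and file_delete is the SDK method that removes a downloaded claim:

```python
def read_claim_list(path, start=1, end=0):
    """Read claim IDs (one per line) from `path`, keeping only the
    entries from line `start` to line `end` (0 means until the end)."""
    with open(path) as fd:
        ids = [line.strip() for line in fd if line.strip()]
    return ids[start - 1:end] if end else ids[start - 1:]

def remove_claims(path, start=1, end=0, server="http://localhost:5279"):
    """Delete each listed claim through the daemon's file_delete method."""
    import requests  # only needed when actually talking to the daemon
    for cid in read_claim_list(path, start, end):
        msg = {"method": "file_delete",
               "params": {"claim_id": cid,
                          "delete_from_download_dir": True}}
        requests.post(server, json=msg).json()
```

The ddir handling and error reporting of the real function are omitted here; this only shows the list-plus-range mechanics.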
These tools were developed to make lbrynet simpler to use with many files, particularly to download many claims from multiple channels, and thus help with seeding them.
It would be ideal to integrate them into the lbrynet program itself so they can be used by expert users, and by the LBRY Desktop application to manage many claims in a graphical way.
Currently we communicate with the lbrynet daemon through the subprocess module.
import subprocess

get_cmd = ["lbrynet",
           "get",
           "lbry://@asaaa#5/a#b"]
output = subprocess.run(get_cmd,
                        capture_output=True,
                        check=True,
                        text=True)
Instead of doing it this way, we could also use the requests module, by sending JSON messages directly to the daemon running on localhost.
import requests
server = "http://localhost:5279"
json = {"method": "get",
"params": {"uri": "astream#bcd03a"}}
requests.post(server, json=json).json()
This is probably better and faster.
The reason the current functions use subprocess is merely historical, as this code is based on tuxfoo's lbry-seedit, and that's how he did it.
This is a similar implementation but using json messages.
https://odysee.com/@BrendonBrewer:3/channel-download:8
The lbrytools.print_summary function is able to create a CSV file with all claims downloaded so far.
p = print_summary(title=True, typ=False, path=False,
cid=True, blobs=True, ch=False, name=True)
It would not be too hard to place this information in an XML format (.nfo) so that it can be read by Kodi and Jellyfin.
When it is cycling through the list of claims, it could write one .nfo file per claim.
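A minimal sketch of such an export, assuming the claim dictionary has 'claim_name', 'claim_id', and 'title' keys (the exact keys would come from `lbrynet file list`); the tag layout follows the common Kodi "movie" .nfo structure:

```python
import os
import xml.etree.ElementTree as ET

def write_nfo(claim, out_dir):
    """Write a minimal Kodi/Jellyfin 'movie' .nfo file for one claim."""
    root = ET.Element("movie")
    ET.SubElement(root, "title").text = claim.get("title",
                                                  claim["claim_name"])
    # Kodi uses <uniqueid> entries to store external identifiers.
    ET.SubElement(root, "uniqueid", type="lbry").text = claim["claim_id"]
    path = os.path.join(out_dir, claim["claim_name"] + ".nfo")
    ET.ElementTree(root).write(path, encoding="utf-8",
                               xml_declaration=True)
    return path
```

Real claims would also supply plot, release date, and channel fields, which map naturally to further SubElement calls.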
This follows from the closed issue #6.
At the moment the generic function clean.cleanup_space looks at all claims. The parameter never_delete lists channels to avoid when deleting claims.
Maybe a new parameter only_delete could do the opposite, and consider only these channels when cleaning up older files.
channels = ["@lbry", "@odysee"]
dd = t.cleanup_space(main_dir="/opt", size=1000, percent=90,
only_delete=channels, what="media")
Internally, the function would use the methods clean.ch_cleanup and clean.ch_cleanup_multi introduced in e45bf17.
These functions delete all videos from a particular channel, leaving only the newest ones, as established by the number=x parameter.
Currently the lbrynet command is run through subprocess; if the returncode is 1, this indicates an error, and it abruptly terminates the Python script.
import subprocess
import sys

get_cmd = ["lbrynet",
           "get",
           "lbry://@asaaa#5/a#b"]
output = subprocess.run(get_cmd,
                        capture_output=True,
                        check=False,  # with check=True a failure raises before the test below
                        text=True)

if output.returncode == 1:
    print(f"Error: {output.stderr}")
    sys.exit(1)
Although lbrynet seems to be quite stable and rarely returns an error, this should be handled better; exit should be avoided, as it terminates the complete script wherever the function is being used.
This returncode may be exclusive to running subprocess. Maybe by solving issue #1, we can avoid this altogether.
As mentioned in #1, the reason we do it like this at the moment is historical. We started by copying the code from another programmer, and we just continued that.
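One way to handle failures without killing the caller's script is to catch the error and return None; a sketch, where run_cmd is a hypothetical helper, not an existing lbrytools function:

```python
import subprocess

def run_cmd(cmd):
    """Run a command such as ["lbrynet", "get", uri] and return its
    stdout, or None on failure, instead of calling sys.exit()."""
    try:
        output = subprocess.run(cmd, capture_output=True,
                                check=True, text=True)
    except (subprocess.CalledProcessError, FileNotFoundError) as err:
        # Report the problem and let the caller decide what to do next.
        print(f"Command failed: {err}")
        return None
    return output.stdout
```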
How do I put it inside a Python script and list the links to the video pages of a channel?
Most of the time I get the error "NameError: name 'lbrytools' is not defined".
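The NameError means lbrytools was never imported in the script. Assuming the library is on the Python path, the channel's claims could be fetched with ch_search_latest and their URLs turned into web links; the small converter below is a hypothetical helper, and 'canonical_url' is an assumed claim key:

```python
def to_odysee(canonical_url):
    """Turn an LBRY canonical URL into an Odysee video-page link
    (Odysee URLs use ':' where LBRY URLs use '#')."""
    return (canonical_url.replace("lbry://", "https://odysee.com/", 1)
                         .replace("#", ":"))

# Hypothetical usage, assuming lbrytools is importable:
#
#   import lbrytools as t
#   claims = t.ch_search_latest("@some-channel", number=10)
#   for claim in claims:
#       print(to_odysee(claim["canonical_url"]))
```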
Various operations such as download and delete can only be performed on downloadable content, that is, streams.
At the moment many functions depend on searching multiple claims from a channel by using search_ch.ch_search_latest.
Instead of returning all claim types (streams, reposts, collections, livestreams) from the search, we should add an option to only return streams:
claims = ch_search_latest("@some-chn", number=12, only_streams=False)
streams = ch_search_latest("@some-chn", number=12, only_streams=True)
This can be implemented by specifying the claim_type and using the has_source parameter of claim_search in lbrynet:
lbrynet claim search --channel=@some-chn --claim_type=stream --has_source=True
Livestreams are of type 'stream', but they don't have a source, so they are not downloadable and should be avoided.
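Via the JSON-RPC interface, the same search corresponds to a claim_search message like the one built below; a sketch only, where streams_query is a hypothetical helper:

```python
def streams_query(channel, number=12):
    """Build the claim_search message for the proposed stream-only
    search: claim_type='stream' plus has_source=True excludes
    livestreams, which have no downloadable source."""
    return {"method": "claim_search",
            "params": {"channel": channel,
                       "claim_type": "stream",
                       "has_source": True,
                       "page_size": number}}

# Hypothetical usage against a running daemon:
#   import requests
#   output = requests.post("http://localhost:5279",
#                          json=streams_query("@some-chn")).json()
```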
The program zeedit.py is contained in a single file, and uses lbrytools as its backend.
The program lbrydseed (https://github.com/belikor/lbrydseed) also uses lbrytools as its backend.
Therefore, this repository, lbrytools, should contain only the code of the library, and it should be placed at the top level of the repository so that it can be used as a submodule by other programs, that is, zeedit and lbrydseed.
In turn, a new zeedit repository should be created to contain only the code for this program, and it should have lbrytools as a submodule, just like lbrydseed.
At the moment we call the SDK methods by using requests.
import requests

server = "http://localhost:5279"
msg = {"method": "claim_search",
       "params": {"channel": "@example",
                  "page": 1}}
output = requests.post(server, json=msg).json()
if "error" in output:
    return False
result = output["result"]
This essentially creates a wrapper around the lbrynet methods.
Instead, we can use the already existing wrapper https://github.com/osilkin98/PyBRY
This allows us to use any method defined in lbrynet like a native Python method.
import pybry
lbry = pybry.LbrydApi()
output = lbry.claim_search(channel="@example", page=1)
result = output[0]
This would make access to the SDK more uniform, and all methods of the SDK would be available with all their parameters; therefore, there would be no need to write more wrappers as we do now.