Comments (15)
Another alternative is to keep a database of known lectures (maybe a .yml in the repo). When a user has a lecture we haven't seen before, we can prompt them to register it in the database (for example by asking for the correctly formatted name, who's running it, etc.).
Implementation options include:
- Automatic issue creation
- Automatic PR creation
- Cloud-hosted service, e.g. an S3 bucket for high availability, plus an API that the CLI app can call to post the suggested course information
from lecture-hoarder.
To maintain current levels of abstraction, this should be implemented as a `category` field in the `Podcast` class.
How to organise the categories is then up to the calling method.
from lecture-hoarder.
Related to #25
from lecture-hoarder.
We should only download the latest year by default.
from lecture-hoarder.
Problems with categorisation include:
- Courses with unusual code formats
- Courses from the Masters year using a mix of 6 or 4 as the year identifier (e.g. COMP40901 and COMP60411)
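A minimal sketch of how the year identifier could be read from a course code while handling both problems above. The code shape (four letters followed by five digits), the `year_of_study` name and the decision to treat 6 as another spelling of year 4 are assumptions made for illustration, not something the project defines.

```python
import re
from typing import Optional

# Assumed code shape: four letters followed by five digits, e.g. COMP40901.
COURSE_CODE = re.compile(r"^([A-Z]{4})(\d)(\d{4})$")

# Assumption: a leading 6 is treated as another spelling of year 4 (Masters).
YEAR_ALIASES = {6: 4}


def year_of_study(course_code: str) -> Optional[int]:
    """Return the year identifier from a course code, or None if unrecognised."""
    match = COURSE_CODE.match(course_code.strip().upper())
    if match is None:
        # Unusual code format: let the caller decide how to categorise it.
        return None
    year = int(match.group(2))
    return YEAR_ALIASES.get(year, year)
```

Returning `None` for unrecognised codes keeps the fallback decision with the caller rather than guessing a category.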
from lecture-hoarder.
We could attempt to parse the publish date and use that to figure out the year + semester
from lecture-hoarder.
Having a migration handler would also be useful
I don't think this is worthwhile
from lecture-hoarder.
We could attempt to parse the publish date and use that to figure out the year + semester
A good approach I hadn't considered, we could easily categorise by academic year.
Publish dates are included within the course listing as well as the individual lecture page.
from lecture-hoarder.
Implementation options include:
- Adding a `publish_date` field to the `Podcast` class. Categorisation is then performed by the file download handler
- Adding a `category` field to the `Podcast` class. The `PodcastProvider` determines the category, but it's up to the download handler to determine how categories should be implemented
- New abstraction layer for handling categorisation
from lecture-hoarder.
Implementation as an abstraction layer would involve:
- Adding `publish_date` to the `Podcast` class
- New package and classes for `CategorisationHandler`. `CategorisationHandler` classes contain a `get_category` method, accepting a `Podcast` as an argument and returning a string category name
- Adding a `category` field to the `Download` class, assigned by the constructor
- It is then left to the download provider to implement storage of lectures in categories
However, this approach would make it significantly harder to download only the current year's podcasts, would add a large amount of additional logic, and would likely provide no benefit: I can't imagine a strong enough use case for different category providers.
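For reference, a rough sketch of what the abstraction layer described above could look like. The class and method names follow the list above. The `name` field, the constructor signature of `Download` and the `CalendarYearCategorisationHandler` example are assumptions added only to make the sketch runnable.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import date


@dataclass
class Podcast:
    name: str  # assumed field, for illustration only
    publish_date: date


class CategorisationHandler(ABC):
    """Maps a podcast to a string category name."""

    @abstractmethod
    def get_category(self, podcast: Podcast) -> str:
        ...


class CalendarYearCategorisationHandler(CategorisationHandler):
    """Hypothetical handler: categorise by the calendar year of publication."""

    def get_category(self, podcast: Podcast) -> str:
        return str(podcast.publish_date.year)


class Download:
    def __init__(self, podcast: Podcast, handler: CategorisationHandler) -> None:
        self.podcast = podcast
        # The category is assigned once, by the constructor.
        self.category = handler.get_category(podcast)
```

How a `Download` with a given `category` is stored on disk would still be left to the download provider.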
from lecture-hoarder.
The approach will be to store the date in the `Podcast` class and expose a method for calculating the academic year.
Categorisation into academic years will be an intrinsic property of the download provider.
A new setting will be introduced to control whether all podcasts will be downloaded, or just ones from the current year. The default will be to just download podcasts from the current year.
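A minimal sketch of this approach. It assumes the academic year starts in September, which is not stated anywhere in this thread. The `name` field, `academic_year_of`, `select_podcasts` and the `current_year_only` parameter are hypothetical names standing in for the real fields and setting.

```python
import datetime
from dataclasses import dataclass
from typing import Iterable, List

# Assumption: the academic year begins in September.
ACADEMIC_YEAR_START_MONTH = 9


def academic_year_of(day: datetime.date) -> int:
    """Return the calendar year in which the academic year containing `day` began."""
    if day.month >= ACADEMIC_YEAR_START_MONTH:
        return day.year
    return day.year - 1


@dataclass
class Podcast:
    name: str  # assumed field, for illustration only
    date: datetime.date

    def academic_year(self) -> int:
        return academic_year_of(self.date)


def select_podcasts(
    podcasts: Iterable[Podcast],
    today: datetime.date,
    current_year_only: bool = True,  # hypothetical setting, defaulting to current year
) -> List[Podcast]:
    """Keep every podcast, or only those from the current academic year."""
    if not current_year_only:
        return list(podcasts)
    current = academic_year_of(today)
    return [p for p in podcasts if p.academic_year() == current]
```

Passing `today` in explicitly, rather than reading the clock inside the function, keeps the filter easy to test.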
from lecture-hoarder.
Actually, extraction of podcast dates from the course page listing is more complex than initially expected - currently we get the list of podcasts from the sidebar nav, whereas for the dates we need to parse the main central listing. Unfortunately, this requires supporting pagination too (which will involve performing additional slow web requests).
The option of only extracting the date from the individual podcast page remains, but this will result in additional overhead of still queueing downloads that will be skipped.
Looks like pagination is the way to go here
from lecture-hoarder.
Actually, pagination is only implemented client side here, meaning that the logic can be kept simple.
Why the designers of the podcast service would choose to cripple the UX in this way is beyond me.
from lecture-hoarder.
After initially running the directory creation for podcasts, an issue has arisen: one lecturer managed to upload a podcast with a date from the incorrect year, causing a new directory to be created just for the one mislabelled podcast.
This problem is fundamental to the approach of identifying the academic year per podcast.
A solution is to extract the academic year from the course name, which, in my experience, always contains the academic year. However, I don't know how well this is enforced across other subjects.
from lecture-hoarder.
Lectures are now categorised into academic years, based purely on the course name.
I'm going to close this issue as its main purpose is complete, then open a new issue regarding only downloading podcasts from the current year.
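A minimal sketch of extracting the academic year from a course name. The thread does not show the actual course name format, so the pattern below (a year pair such as "2018/19" or "2018-2019" somewhere in the name), the function name and the example course name used in the usage line are assumptions, not the project's real parsing logic.

```python
import re
from typing import Optional

# Assumed format: a start year followed by "/" or "-" and the end year,
# written with either two or four digits (e.g. "2018/19", "2018-2019").
ACADEMIC_YEAR = re.compile(r"\b(20\d{2})\s*[/-]\s*(\d{4}|\d{2})\b")


def academic_year_from_course_name(course_name: str) -> Optional[str]:
    """Return a directory-safe label such as "2018-19", or None if not found."""
    match = ACADEMIC_YEAR.search(course_name)
    if match is None:
        return None
    start = int(match.group(1))
    end = int(match.group(2))
    # Reject pairs that are not consecutive years, e.g. "2018/21".
    if end % 100 != (start + 1) % 100:
        return None
    return f"{start}-{(start + 1) % 100:02d}"
```

For example, `academic_year_from_course_name("Example Course 2018/19")` gives `"2018-19"`, and a name with no recognisable year pair gives `None`, so such courses can be handled separately.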
from lecture-hoarder.