podcastindex-org / docs-api

Developer documentation for the podcastindex.org api.

Home Page: https://podcastindex-org.github.io/docs-api/

License: MIT License

docs-api's Issues

Always get time window error

I am developing an application in C for AmigaOS 4, called MediaVault (https://github.com/walkero-gr/mediavault), and I want to add support for podcasts.

I use libcurl to make the calls and add the headers as documented, but I always get the "+/- 3 mins window" error, and I am not sure whether that is the actual problem or something else. I have double- and triple-checked the Authorization header and the information in it, and I still get the same message.

So my question is whether there is a way to see a log of the request as it is received by your servers, or some other way to find out why it fails.

Thank you so much for your help and time.

Repository has no license

Hi,

I can't find a license for the code, which means that nobody can legally use or contribute to it. Can you please add a license?

Subscription API

Hello @daveajones ,

As mentioned on the Mastodon thread already by other people, I can add that the subscriptions should be cross app (API Key).
For me the biggest challenge here would be controlling the access to someone's subscription without an actual user authorization concept.

I don't know if you are familiar with gpodder.net, it is another open index that supports user registration and they provide an API for user data, including subscriptions: https://gpoddernet.readthedocs.io/en/latest/api/

Reference:

Mastodon thread on subscription API - https://podcastindex.social/@dave/105238950629893429

(How to contribute to) stop words list

Hi
I was just listening to the third episode of the podcast and got interested by the idea of a stop words list. Full description search would be cool, and as I understand from what you discussed, a good stop word list is a must to filter out irrelevant, generic words. Also it's one of the things I could help with as a non-programmer :)

Do such lists exist already, which could be used and expanded? Could/Should we already start building such lists ahead of an introduction of full search?

I imagine it could be helpful to have a simple text file that we can all contribute to here on GH (via PRs), but one per language to make it easier to manage (e.g. it'd be easier for me to review the list with Dutch only, rather than a massive list that includes German, English and Swahili words).

Thanks for this great work! (And sorry for posting this in the API Repo, didn't see any other that seemed fitting.)

recent/episode search results ordering

My current experience with the recent/episode API is that the results come newest first, then successively older. If I then use the before= modifier, the results are indeed the ones before the indicated ID, but in the opposite order: the oldest of that set first, then successively newer.

The workaround is easy: I sort each batch of results myself. But the order without "before" is probably the right one, and it should be the order even when the search is modified (or, in the future, filtered).
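
A minimal sketch of the client-side workaround described above. It assumes each result carries a `datePublished` Unix timestamp; that field name is an assumption for illustration, not something confirmed in this issue.

```javascript
// Sort a batch of results newest-first without mutating the input array.
// `datePublished` (Unix seconds) is an assumed field name.
function sortNewestFirst(items) {
  return [...items].sort((a, b) => b.datePublished - a.datePublished);
}

const batch = [
  { id: 1, datePublished: 100 },
  { id: 3, datePublished: 300 },
  { id: 2, datePublished: 200 },
];
console.log(sortNewestFirst(batch).map((item) => item.id)); // [ 3, 2, 1 ]
```

Copying the array before sorting keeps the original batch intact in case the caller still needs the order the API returned.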

How bad is it if my apiSecret is not so secret?

Hi all,

I am a developer of a browser extension to subscribe and listen to podcasts (https://github.com/podStation/podStation).

As it is a client side application, the API secret would be exposed to people using the software (could be obfuscated, but not really secret).

How bad is it to disclose the API secret?

Best Regards,
Guilherme.

Duplicate entries

A big problem of public podcast directories is that they can quickly fill up with duplicates. This can be observed in the wild on gpodder.net (GitHub), where most search terms return a pretty big number of old or broken feeds, as well as unofficial mirrors.

Does Podcastindex.org have a strategy on how to deal with duplicate feed submissions?

Please add a soundbites stats endpoint

I would love to be able to see stats about soundbites (total number in the database, new in the last day, new in the last 7 days, etc.).

I would like to fetch all recent feed data since a particular time

When hitting the recent feed data API, I get at most 1000 records, even with max set to 1000. I want to retrieve all recent feed data when there are more than 1000 records. Please implement pagination so that all recent feed data can be retrieved.

Example links do not work

Example links like https://api.podcastindex.org/api/1.0/search/byperson?q=adam%20curry&pretty do not work. They yield "Authorization header value either not set or blank". Please see: https://podcastindex-org.github.io/docs-api/#overview--authentication-details.

Categories documentation

Calling the search endpoint now returns a feed object with a categories property. This value is a { number: string } and I'm unsure what the key (number) means.
My assumption is that it references some enum like object, but there is no documentation and I'm unsure where to look.

Missing podcasts/trending

No documentation for podcasts/trending

Trending, with the inclusion of certain categories, returns an HTTP ERROR 500

The following are all listed as categories:

How-To
Self-Improvement
Video-Games
Climate
Weather
Tabletop
Role-Playing

But when trying to use them with trending (https://api.podcastindex.org/api/1.0/podcasts/trending?max=10&lang=en&cat=How-To), an HTTP ERROR 500 is returned. All the other categories work fine.

Recent items can't walk successively older recent items

There probably needs to be an extra optional URL argument "before=". The query would return recent items starting at the next one older than the given ID (and exclude, etc. could still apply). This lets a client walk back in time lazily, as needed to support a given user's actions.
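
The lazy walk described above could look roughly like this on the client side. This is only a sketch: `fetchPage` is a hypothetical caller-supplied function standing in for the real request, and the `before` and `max` argument names follow the wording of this issue rather than any documented behaviour.

```javascript
// Lazily walk back through recent items, one batch at a time.
// `fetchPage({ before, max })` is a hypothetical function that resolves to an
// array of items, each with a numeric `id`.
async function* walkBack(fetchPage, max = 10) {
  let before; // undefined on the first call: start from the newest items
  while (true) {
    const items = await fetchPage({ before, max });
    if (items.length === 0) return; // nothing older is left
    yield items;
    // The next request asks for items older than the oldest one seen so far.
    before = Math.min(...items.map((item) => item.id));
  }
}

// Stand-in data source so the sketch runs without network access.
const fakeIds = [9, 8, 7, 6, 5, 4, 3, 2, 1];
async function fakeFetchPage({ before = Infinity, max }) {
  return fakeIds.filter((id) => id < before).slice(0, max).map((id) => ({ id }));
}

(async () => {
  for await (const batch of walkBack(fakeFetchPage, 4)) {
    console.log(batch.map((item) => item.id));
  }
})();
```

Because it is a generator, a client only triggers the next request when the user actually scrolls further back.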

Empty categories for feed is an array, but with categories is an object

There's an odd type mismatch between feeds with no categories and feeds that have them. Without categories, an empty array [] is returned (see: https://api.podcastindex.org/api/1.0/podcasts/byfeedid?id=233796); for a feed with categories, however, an object of categoryId: categoryName is returned (see: https://api.podcastindex.org/api/1.0/podcasts/byfeedid?id=183562).

I'm trying to produce JSON Schema definitions for the endpoints and this discrepancy is causing me some issues. If it is intentional, I'll try to work around it, but on the surface, it feels a little weird.
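
Until the response shape is consistent, a client could normalise the value before validating it. A minimal sketch, assuming the only two shapes are the ones described above (an empty array, or an object keyed by category ID); `normalizeCategories` is a name made up for this example.

```javascript
// Return categories as a plain object no matter which shape the API sent:
// [] (no categories) or { "9": "Business", ... } (categories present).
function normalizeCategories(categories) {
  if (categories === null || categories === undefined || Array.isArray(categories)) {
    return {};
  }
  return { ...categories };
}

console.log(normalizeCategories([])); // {}
console.log(normalizeCategories({ 9: 'Business', 55: 'News' })); // { '9': 'Business', '55': 'News' }
```

With this in place, a schema only needs to describe the object form.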

Low res or fixed size images

Have you considered adding low res fixed size images (as in itunes API)?

episodes byITunesId duration

The Schema for episodes By iTunes Id states duration to be in minutes but it seems to be in seconds.

All the best

Sig

New red logo request

Noting the new, red, logo - could we have a copy of this in SVG format, please - and ideally, the logos in a Github repo of their own?

(JPG is particularly poor for anything red, because of the algorithms used).

new endpoint: /episodes/byguid

Need an endpoint to satisfy this issue in the namespace:

Podcastindex-org/podcast-namespace#352

Explain how 'trending' is determined

Hello,

I proposed to replace the iTunes podcast suggestions in AntennaPod with suggestions based on the Podcast Index. One of the arguments was PI cannot provide a list of podcasts popular in the user's country. However, there is a 'Trending' end-point now, so I assume we could use that.

How is it defined, though? What makes a podcast Trending? (It'd be great if that could also be described in the docs.)

Trending
get /podcasts/trending
This call returns the podcasts/feeds that in the index that are trending.

Thanks!

400 - "Bad Request" errors

Not sure if this is the best place to post this, but since I don't see a repo for the API backend I thought I would add it here. I'm getting frequent errors when using the API, with the message:

invalid json response body at https://api.podcastindex.org/api/1.0/search/byterm?q=javascript reason: Unexpected end of JSON input

This happens when I use the API directly from Postman and when I use Google Cloud Functions or my local NodeJS/Express app. Sometimes the API request returns a 200 response with results, but 75% of the time I see these errors.

When I use the API directly with Postman, the response is an HTML page with a "Bad request" heading. When I use NodeJS (either with Cloud Functions or with my ExpressJS app), I get an HTTP error with the message above.

Below is code from my ExpressJS app:

require('dotenv').config()

const express = require('express')
const cors = require('cors')
const bodyParser = require('body-parser')
const fetch = require('node-fetch')
const crypto = require('crypto')

const PORT = 2345

// Podcast Index
const API_BASE_URL = process.env.API_BASE_URL
const API_KEY = process.env.API_KEY
const API_SECRET = process.env.API_SECRET

const ts = Math.floor(Date.now() / 1000)
const authString = `${API_KEY}${API_SECRET}${ts.toString()}`
const authHeader = crypto.createHash('sha1').update(authString).digest('hex')

const headers = {
  'Content-Type': 'application/json',
  'User-Agent' : "AdamZ's homemade podcast-fetcher 0.01a",
  'X-Auth-Date': ts,
  'X-Auth-Key': API_KEY,
  'Authorization': authHeader
}

const server = express()

server.use(cors())
server.use(bodyParser.urlencoded({ extended: true }))
server.use(bodyParser.json())
server.use(bodyParser.raw())

server.post('/search/term', async (req, res, next) => {
  const { term } = req.body

  const url = `${API_BASE_URL}search/byterm?q=${term}`

  try {
    const results = await fetch(url, { headers })
    const data = await results.json()
    res.json({
      message: `Podcasts by term: ${term}`,
      data
    })
  } catch (err) {
    next(err)
  }
})

server.listen(PORT, (err) => {
  if (err) {
    console.error(err)
  } else {
    console.log(`Podcastindex server is listening on port: ${PORT}`)
  }
})
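
One thing that may be worth ruling out: the code above computes `ts` and the hash once at startup, so every later request reuses the same timestamp, and the search term goes into the URL unencoded. If the API enforces the "+/- 3 mins window" mentioned in another issue on this page, requests sent after that window would be rejected. That is an assumption about the cause, not a confirmed diagnosis. A sketch that builds the headers per request, reusing the header names and SHA-1 scheme from the code above (`buildAuthHeaders` and `buildSearchUrl` are names made up for this example):

```javascript
const crypto = require('crypto');

// Build fresh auth headers for a single request. Mirrors the header names and
// hash used in the code above; `now` is injectable so the function is testable.
function buildAuthHeaders(apiKey, apiSecret, now = Date.now()) {
  const ts = Math.floor(now / 1000); // Unix time in seconds
  const hash = crypto
    .createHash('sha1')
    .update(`${apiKey}${apiSecret}${ts}`)
    .digest('hex');
  return {
    'User-Agent': 'example-client/0.1', // placeholder value
    'X-Auth-Date': String(ts),
    'X-Auth-Key': apiKey,
    Authorization: hash,
  };
}

// Encode the search term so spaces and symbols survive in the query string.
function buildSearchUrl(baseUrl, term) {
  return `${baseUrl}search/byterm?q=${encodeURIComponent(term)}`;
}

console.log(buildSearchUrl('https://api.podcastindex.org/api/1.0/', 'adam curry'));
// https://api.podcastindex.org/api/1.0/search/byterm?q=adam%20curry
```

In the route handler, this would mean calling `buildAuthHeaders(API_KEY, API_SECRET)` inside the handler rather than at module load.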

Search for author/owner only

I was just thinking of a practical use-case for AntennaPod: it would be great if in the 'add podcast' screen, one could tap on a podcast, preview an episode, get hooked, subscribe and then tap on the Author to discover more podcasts from the same publisher (e.g. public broadcaster channel, big news outlet).

Of course AntennaPod could just take the author name (that's displayed in the app) and use it for general search. However, currently searching for Radio France will also return podcasts from the BBC (on the Tour de France) and others.

Being able to check only the author+ownerName fields would be helpful.

(Sidenote: thanks @stevencrader for making the Postman Collection and Environment available! Not a developer at all and Postman is a great way to start experimenting with the API :) )

Add both description and encoded content fields when retrieving episodes

I had a look at the source for the partytime parser and it seems that when creating the 'description' field, it uses the rss description first and then falls back to using content:encoded otherwise.

Often, podcasters will use both of these, with a plain text, shorter description in the description field, and the long form show notes in the content:encoded field.

It would be useful to have both of these.

I know that the encoded content could be long, so is probably not a good fit for the episode list APIs, but could it possibly be added to the episodes/byid endpoint as an optional flag?

Thanks for the great work!

Differently ordered results returned when changing the max value in the search API

According to the documentation, changing the max value in the request limits the number of results. The issue is that the results for max=10 and max=20 come in a different order: the first ten feeds with max=20 are not the same as the feeds with max=10. This is not expected, because the docs say the results are ordered by the last-released episode.
My test search term is verge.

Thanks for the great podcast search engine, so excited!

Ratelimiting

So, like I told @daveajones, I found out there's a rate limiting system in place... the hard way :}
So thanks so much for the warning and plz update the docs <3

And also:

As a developer, when I create a service on the API, I would like to be informed of the current status of my requests and limits, so I can avoid hitting that 429 or predict when I will be able to send new requests.

Normally this is done through headers, plus some explanation in the documentation on how to interpret these headers. A countdown header and knowing cooldown/reset time would suffice I think.
Protip: custom headers usually start with x- and I have a feeling that might help pushing them through Cloudflare.
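
For what it's worth, here is how a client might use such a header if one were provided. This sketch relies on `Retry-After`, which is a standard HTTP header; whether this API sends it (or any custom x- header) is an assumption, and `retryDelayMs` is a name made up for this example.

```javascript
// Work out how long to wait after an HTTP 429 response.
// Falls back to `fallbackMs` when the header is missing or unparseable.
function retryDelayMs(retryAfter, fallbackMs = 60000, now = Date.now()) {
  if (retryAfter === null || retryAfter === undefined || retryAfter === '') {
    return fallbackMs;
  }
  const seconds = Number(retryAfter);
  if (Number.isFinite(seconds)) {
    return Math.max(0, seconds * 1000); // delay-in-seconds form
  }
  const date = Date.parse(retryAfter); // HTTP-date form
  return Number.isNaN(date) ? fallbackMs : Math.max(0, date - now);
}

console.log(retryDelayMs('30')); // 30000
console.log(retryDelayMs(null)); // 60000
```

A countdown or reset-time header as suggested above could be handled the same way, with only the parsing step changed.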

Support for Cross Origin Resource Sharing (CORS)

Hi All,

First of all, thanks for this wonderful initiative.

I did some initial tests with the API and noticed that it does not return any headers for Cross Origin Resource Sharing (CORS, https://developer.mozilla.org/pt-PT/docs/Web/HTTP/CORS).

Do you plan to support CORS in the future?
I am planning to build an offline first Progressive Web App sometime in the future and this would be a big blocker for using the API from this type of app.

Best Regards,
Guilherme.

Error in schema documentation: newestItemPublishTime

The schema in the documentation contains newestItemPublishedTime, while the API returns newestItemPublishTime
https://podcastindex-org.github.io/docs-api/#get-/podcasts/trending

Carriage return and newline characters in dump

When I try to load the CSV dump of the podcasts into sqlite3 on Ubuntu, I still get a few faulty lines. These lines belong to the following podcasts:

https://api.podcastindex.org/api/1.0/podcasts/byfeedid?id=194753
https://api.podcastindex.org/api/1.0/podcasts/byfeedid?id=878138
https://api.podcastindex.org/api/1.0/podcasts/byfeedid?id=1081531

I'm sure it has something to do with \r and \n characters, and it might very well have to do with the platform you're working on. On my Linux machine it seems that I can find the conflicting lines by searching in vim for /"\r\n[^\"]/

This is also why CSV is not really the format to hold such rich content from so many sources. CSV by default works with commas, quotes and newlines, but the last is defined differently on different platforms. It's not impossible, but again @daveajones, I'd gently recommend distributing a mysqldump-generated SQL file. This would also help with importing the values as their correct types, instead of having to parse everything from strings :)
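
As a stopgap on the consumer side, line breaks inside quoted fields could be flattened before import. A minimal sketch, assuming the dump uses standard CSV quoting (a literal quote written as ""); `flattenQuotedNewlines` is a name made up for this example and has not been run against the actual dump.

```javascript
// Replace CR/LF characters that occur inside double-quoted CSV fields with a
// space, leaving the line breaks that separate records untouched.
function flattenQuotedNewlines(csvText) {
  let inQuotes = false;
  let out = '';
  for (const ch of csvText) {
    if (ch === '"') inQuotes = !inQuotes; // an escaped "" toggles twice
    if (inQuotes && (ch === '\r' || ch === '\n')) {
      // Emit at most one space per run of line-break characters, and none if
      // the output already ends with a space.
      if (!out.endsWith(' ')) out += ' ';
    } else {
      out += ch;
    }
  }
  return out;
}

const sample = '1,"line one\r\nline two",x\r\n2,"ok",y\r\n';
console.log(JSON.stringify(flattenQuotedNewlines(sample)));
// "1,\"line one line two\",x\r\n2,\"ok\",y\r\n"
```

For a large dump this would need to run over a stream rather than one big string, but the quote-tracking logic stays the same.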

New Feeds endpoint returns feeds that By Feed ID endpoint does not recognize

I am getting data from https://api.podcastindex.org/api/1.0/recent/newfeeds?max=1000&feedid=5902722

Then I try to get the details for those new feeds at https://api.podcastindex.org/api/1.0/podcasts/byfeedid?id=5902723.

5902723, 5902726, 5902971, 5902737, 5902969, 5902962, 5903003, 5903014, 5902994, 5903044, 5903054, 5903063, 5903042 and 5903056 all return:

{
status: 'true',
query: { id: '5902723' },
feed: [],
description: 'No feeds match this id.'
}

Were these created and then deleted? If so (or even if not), how can you tell when feeds have been deleted from podcastIndex?

Feature request for soundbites API

Hi All,

I want to use the recent soundbites API for discovery, but the API currently returns neither the episode title nor the podcast name, so I don't have any human-readable information to show (the soundbite title is optional, and in most of my tests with the API it currently comes back empty).

Could you add more information to the results of the recent soundbites API? For me the idea would be (in order of importance to me):

episode title
podcast title
podcast feed url
podcast id

The podcast feed url and id are not so important, as I don't need to show them to the user. I only need them in case the user wants to subscribe to that podcast, and I can already get them from the episode id (which is returned) with a call to the episode API.

Episode.feedId has become Episode.feedid?

I noticed this broke my unit test in my js library for "Episodes By Feed Id"

Maybe it's just a typo or perhaps a change in case?

Small hole in API to fetch dope on an episode

If you find a feed out there in the wild, you can get podcastindex's information on it by looking up via its GUID or URL.

If you have a particular episode of that out-in-the-wild feed, there's no easy way to get podcastindex's information on that episode; you need its (podcastindex-assigned) ID (which you don't have unless you found it through podcastindex) or its Apple ID (which often is not there). To get its podcastindex episode ID, you thus have to pull all the episodes of the feed, looking for the one whose GUID matches.

I suggest at some point supporting a search "episodes/byfeedid?id=X&guid=Y". From the wild, you can get the podcastindex feedid with one lookup, and then do this query to get directly to the episode's information. (Note: the feed and the episodes each have their own "guid", so there is a case for naming the argument "epguid", for "episode's GUID".)

Please consider this low priority; my current needs only apply to podcasts which have current subscribers through podbrowser, and I can handle this query locally. But I wouldn't mind supporting all episodes of any podcast at all, and this search would help with that quite a bit.
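
The local fallback mentioned above (pull the feed's episodes, then match on GUID) is short enough to sketch. It assumes each episode object exposes the RSS GUID as `guid` and the index-assigned ID as `id`; both field names are assumptions for illustration.

```javascript
// Find one episode in an already-fetched list by the GUID from the feed's RSS.
// Returns undefined when no episode matches.
function findEpisodeByGuid(episodes, guid) {
  return episodes.find((episode) => episode.guid === guid);
}

const episodes = [
  { id: 101, guid: 'abc-1', title: 'First' },
  { id: 102, guid: 'abc-2', title: 'Second' },
];
console.log(findEpisodeByGuid(episodes, 'abc-2').id); // 102
```

The cost is in fetching the full episode list first, which is exactly what the proposed query would avoid.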

As a user of the API I would like to be able to fetch all episodes from a podcast

Currently, when consuming the episodes API, only a maximum of 40 episodes is returned if the max parameter is omitted.

One can provide a very large number for max (e.g. 10000), but it would also be good if a user of the API could pass 0 for max to mean an unlimited number of episodes.

An alternative would be a better support for pagination:

the response would return a field with the total count of episodes
another parameter should be introduced as being the starting index

Entries submitted by iTunesID not with iTunesID

"PODNEWS Mx" seems brand new in Apple iTunes, with an ID 1536114728.

My script successfully added it at :44 using https://api.podcastindex.org/api/1.0/add/byitunesid - here's my logfile entry

2020-10-28 03:44:15: PODCASTINDEX INSERT FROM SEARCH PODNEWS Mx {"status":"true","feedId":1319023,"existed":"false","description":"Feed added successfully. Please allow 15-20 minutes for it to be searchable in the index."}

It's now in PodcastIndex under 1319023 - but the response appears not to include the iTunesID. Here's the chunk I'm pulling out of the API

 "title":"PODNEWS Mx",
         "url":"https:\/\/anchor.fm\/s\/29f8d170\/podcast\/rss",
         "originalUrl":"https:\/\/anchor.fm\/s\/29f8d170\/podcast\/rss",
         "link":"https:\/\/anchor.fm\/marco-edivaldo",
         "description":"Como est\u00e1n mis queridos amantes de los podcast, les damos la bienvenida a PODNEWS , el mejor canal de la plataforma con las mejores noticias del planeta, en donde seg\u00fan Andr\u00e9s Carrera y yo Marco Edivaldo se deben contar.",
         "author":"Marco Edivaldo",
         "ownerName":"Marco Edivaldo",
         "image":"https:\/\/d3t3ozftmdmh3i.cloudfront.net\/production\/podcast_uploaded\/6941724\/6941724-1602803535283-b589817b185bc.jpg",
         "artwork":"https:\/\/d3t3ozftmdmh3i.cloudfront.net\/production\/podcast_uploaded\/6941724\/6941724-1602803535283-b589817b185bc.jpg",
         "lastUpdateTime":1603856680,
         "lastCrawlTime":1603856680,
         "lastParseTime":1603856681,
         "lastGoodHttpStatusTime":1603856680,
         "lastHttpStatus":200,
         "contentType":"application\/rss+xml; charset=utf-8",
         "itunesId":null,
         "generator":"Anchor Podcasts",
         "language":"es-mx",
         "type":0,

Given that my script added this feed using https://api.podcastindex.org/api/1.0/add/byitunesid, it should include the iTunesID, yes?

Further attempts to submit it via https://api.podcastindex.org/api/1.0/add/byitunesid - since my script doesn't know it exists in the database - are now returning blank responses by the looks of things.

Request: return the podcast GUID in /podcasts/byitunesid

Hello!

It would be a great help if the results of the /podcasts/byitunesid matched the results of podcasts/byfeedid (and by guid).

Specifically, I've noticed that the /byitunesid call does not return the podcastGuid.

Test URLs:

https://api.podcastindex.org/api/1.0/podcasts/byfeedid?id=473264&pretty

https://api.podcastindex.org/api/1.0/podcasts/byitunesid?id=1246288327&pretty

Missing hub/pubnotify

No docs for hub/pubnotify

https://api.podcastindex.org/api/1.0/hub/pubnotify?id=FEEDID&update

Blank/empty categories being returned by search

Looking at https://api.podcastindex.org/api/1.0/search/byterm?q=podnews

This query is returning an empty category for some feeds.

As an example, for the German-language id 1188972 ("podnews.de (Feed aller Podcast-Folgen)") it returns:
"categories":{"9":"Business","55":"News","56":"Daily","102":"Technology","":""}

The last, blank, pair of ID/category should probably not be there.
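
Until that is fixed server-side, a client could drop the blank pair itself. A minimal sketch using the response fragment quoted above; `dropBlankCategories` is a name made up for this example.

```javascript
// Remove category entries whose ID or name is an empty string.
function dropBlankCategories(categories) {
  return Object.fromEntries(
    Object.entries(categories).filter(([id, name]) => id !== '' && name !== '')
  );
}

const raw = { 9: 'Business', 55: 'News', 56: 'Daily', 102: 'Technology', '': '' };
console.log(dropBlankCategories(raw));
// { '9': 'Business', '55': 'News', '56': 'Daily', '102': 'Technology' }
```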

Reduce code example locations

Code examples are currently located in 3 separate locations:

Docs section: Example Code
Example Code repo
Individual repos

They should be centralized (de-duped/linked). Also, a best practice is to show curl as a vanilla example and libraries for language specific implementation.

API returns 400 HTTP status code when no podcast or episode is found, and no CORS headers

Hello all,

the API currently returns HTTP 400 (Bad Request) when the requested podcast or episode is not found in the index.
400 usually means that the request is malformed, which is not the case here, so it should probably return 404.

Additionally, when it returns 400, the CORS headers are missing, and thus access through the browser by other domains is not possible.

Use case: When importing OPML feeds I use the podcasts/byfeedurl to check if I can get the podcast information and later episodes from the feed.

Best Regards,
Guilherme.

Necessity to have a RESTful API

HTTP APIs are complex because you can do pretty much anything and it's hard to specify and document. And that's why conventions are important. And in the case of podcastindex, I think that leveraging the HTTP standard and designing a truly RESTful API is particularly important. This way, it would be more likely to become a standard built on top of another standard, something with a solid foundation, that's easy to extend and evolve.

In that regard, it's important to:

identify what "resources" are part of your API: in this case "podcasts" and "episodes" are the main types of resources, and "episodes" only exist within "podcasts" so they are hierarchically related (or not, if you also want to be able to retrieve episodes individually, outside of their feeds)
use HTTP verbs: instead of using GET everywhere and just changing the URL, you should only use GET to retrieve resources, POST to create new resources, PUT or PATCH to update existing resources and so on

That's the minimum, but there are a lot of other considerations, which are perfectly summed up in this video.

Those conventions don't give an answer for every single edge case, but they give a good basis for a highly standard API that is easy to navigate, document, interact with and grow over time.

Here are some examples of how this would change the current API:

Retrieve all podcasts: GET /api/1.0/podcasts (probably necessary to implement some pagination there)
Query podcasts by term: GET /api/1.0/podcasts?q=[term]
Query podcasts by feedId (assuming this feed id is a unique id specific to your index): GET /api/1.0/podcasts/[feedId]
Query podcasts by iTunes ID: GET /api/1.0/podcasts?itunesid=[itunesid]
Add a podcast: POST /api/1.0/podcasts?url=[feed url] (or pass the feed url as a JSON or form-encoded body)
Get all episodes of a specific podcast for which we know the feed ID (again assuming that this is THE unique identifier for podcasts in your index): GET /api/1.0/podcasts/[feed id]/episodes
Get all episodes of a podcast for which you only have the feed url or the iTunes ID (this one is a tricky one because there are several ways to approach it depending on the use case): GET /api/1.0/podcasts/episodes?itunesid=[iTunes ID] or GET /api/1.0/episodes?itunesid=[iTunes ID]
Query podcasts recently updated (stick with the generic endpoint structure and play with query parameters): GET /api/1.0/podcasts?sort=lastUpdateTime&order=desc&offset=0&max=10 (sort, order, offset and max are pretty standard pagination query parameters that let you define how many results you want from what starting point, sorted by which field and in what order)

These are just some examples that would need to be expanded on if you are willing to do so. Designing APIs is part of what I do for a living so I can help. Another part of what I do is develop mobile and web apps and I wanted to dive into that, but I really think an important first step is to redesign the API if you really want to maximize adoption and evolutivity.

Just my 2 cents.

ByTerm Search "Atari"

The search term "Atari" up until recently would return a bunch of podcasts about the Atari game console. Now when doing a byTerm search for Atari, I get a bunch of Spanish podcasts that don't appear to have any connection to the game console. Atari game console podcasts are no longer easily discoverable.

Episodes: Documentation for default value of the "max" parameter

The max parameter defaults to 40, but this is not documented.

Moving the official docs here

I'm no markdown pro, so if anyone wants to get in and clean this file up and make it look pretty, I'll just change the links on the site to point here instead of internal since this document is getting better. I like the clickable live look at what the responses should look like, but I can make some dummy responses to link to instead.

API Hole

There needs to be an API call to get the list of categories and a search feature that searches in a category/categories.

Here is my story:

As a Developer, I want to provide my user with a selectable list of categories to search, so that the user may search for a Podcast with the name "Agenda" and/or the person "Adam" or "John" in the "Comedy" and/or "Drama" categories.

API rate limits

In #11, daveajones mentioned that the purpose of the API keys is to block keys that are "hammering the API". Do you have a rough estimate of what you mean by that? Are 10k requests per day considered okay? What about 100k requests (for things like live suggestions while typing the search term)?

Missing /categories/list

No documentation for /categories/list endpoint

Sorting of `/search/byterm` endpoint

Hello,

In AntennaPod we got this request:

When I search for a podcast I often find podcasts that are old and inactive. It would be awesome to search for podcasts that have at least released a podcast in the last 30 days.

It would be great if the date of the most recent episode was shown in the search results, and that the search results could be sorted by most recent episode date.

The /search/byterm endpoint already provides lastUpdateTime. At AntennaPod we could already leverage that information, to display the date and sort it as per the request (although I feel it doesn't make a whole lot of sense).

But it got me thinking: how are episodes sorted, currently? Feed ID, 'relevance'?
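
On the client side, the filter-and-sort part of the request could be sketched with the `lastUpdateTime` field that /search/byterm already returns. Note the caveat raised above: `lastUpdateTime` may not be the same as the date of the most recent episode, so this is only an approximation. `recentlyActive` is a name made up for this example.

```javascript
const SECONDS_PER_DAY = 86400;

// Keep feeds updated within the last `days` days, most recently updated first.
// `lastUpdateTime` is treated as Unix seconds; `nowSeconds` is injectable for tests.
function recentlyActive(feeds, days = 30, nowSeconds = Math.floor(Date.now() / 1000)) {
  const cutoff = nowSeconds - days * SECONDS_PER_DAY;
  return feeds
    .filter((feed) => feed.lastUpdateTime >= cutoff) // filter() returns a new array
    .sort((a, b) => b.lastUpdateTime - a.lastUpdateTime);
}

const now = 10000000;
const feeds = [
  { title: 'Stale', lastUpdateTime: now - 40 * SECONDS_PER_DAY },
  { title: 'Ten days', lastUpdateTime: now - 10 * SECONDS_PER_DAY },
  { title: 'Yesterday', lastUpdateTime: now - 1 * SECONDS_PER_DAY },
];
console.log(recentlyActive(feeds, 30, now).map((feed) => feed.title));
// [ 'Yesterday', 'Ten days' ]
```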

Add "lang" parameter to filter results by language(s)

It would be useful to be able to filter results ("/search/byterm" & "/recent/episodes") by "lang".

Not sure if "lang" should accept a single value or comma-separated values to enable results in multiple languages for multilingual users.

new endpoint: recent/modifiedfeeds

Is there currently a way to see which feeds have been updated since a certain date? I can't tell if this is what recent/feeds accomplishes, since it maxes out at 1000 and there is no way to paginate beyond that.

New endpoint "/collection"

It would be useful to have an endpoint (for example, /collection) that could accept a limited array of feedIds to enable getting updates to a collection of feeds.

At the moment, if someone wanted to get updates to multiple feeds at a time, they'd have to send separate requests for each feed.

There should be a documented limit on the array of feedIds. API consumers should manage their collections, and if the length of their feedIds is greater than said limit, they could send multiple requests, each with a different subset of feedIds.
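
The consumer-side batching described above could be sketched like this. The endpoint and its limit do not exist yet, so the limit is simply a parameter here, and `chunkFeedIds` is a name made up for this example.

```javascript
// Split a list of feed IDs into batches no longer than `limit`.
function chunkFeedIds(feedIds, limit) {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError('limit must be a positive integer');
  }
  const chunks = [];
  for (let i = 0; i < feedIds.length; i += limit) {
    chunks.push(feedIds.slice(i, i + limit));
  }
  return chunks;
}

console.log(chunkFeedIds([1, 2, 3, 4, 5], 2)); // [ [ 1, 2 ], [ 3, 4 ], [ 5 ] ]
```

Each batch would then go out as one request to the proposed endpoint.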
