Hi @masalmon,
Did you get this error while trying to download any other data?
This Indian Railways dataset seems quite big, and the way the data.gov.in API is set up, it only allows fetching 100 records per API call. The fetch_data() function was written to make multiple API calls to download the entire dataset. To avoid this, I have added a max_obs parameter (defaulting to 500) to the fetch_data() function. This limits the number of API calls being made (500 / 100 = 5 calls, in this case), which might resolve the error you are getting.
Could you try again after re-installing this package?
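For illustration, here is a rough sketch of how a max_obs cap bounds the number of paged calls. It is not the actual fetch_data() implementation: fetch_page, fetch_capped, calls_needed and the fake data source are hypothetical names invented for this example, and the only facts taken from the thread are the page size of 100 and the default cap of 500.

```r
# Hypothetical sketch, NOT the ogdindiar internals.
# `fetch_page(offset, limit)` stands in for one API call and is assumed to
# return a data frame with at most `limit` rows starting after `offset`.

# Number of API calls needed to honour a given cap
calls_needed <- function(max_obs, page_size = 100) {
  ceiling(max_obs / page_size)
}

fetch_capped <- function(fetch_page, max_obs = 500, page_size = 100) {
  pages <- list()
  n_fetched <- 0
  offset <- 0
  while (n_fetched < max_obs) {
    page <- fetch_page(offset, page_size)
    if (nrow(page) > 0) {
      pages[[length(pages) + 1]] <- page
      n_fetched <- n_fetched + nrow(page)
    }
    # A short (or empty) page means the source has no more rows
    if (nrow(page) < page_size) break
    offset <- offset + page_size
  }
  if (length(pages) == 0) return(data.frame())
  # Trim in case the last page pushed us past max_obs
  head(do.call(rbind, pages), max_obs)
}

# Fake in-memory "API" with 1234 rows, used only to exercise the loop
fake_source <- data.frame(id = seq_len(1234))
fake_page <- function(offset, limit) {
  idx <- seq_len(nrow(fake_source))
  fake_source[idx > offset & idx <= offset + limit, , drop = FALSE]
}

calls_needed(500)                               # 5
nrow(fetch_capped(fake_page, max_obs = 500))    # 500
nrow(fetch_capped(fake_page, max_obs = 2000))   # 1234
```

With the default cap the loop stops after 5 calls even though the fake source holds more rows, which is the behaviour described above.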
from ogdindiar.
Thanks for being so reactive! 👍
100 is a very small limit. 😞 With the API of the OpenAQ platform, for which I've written an R package, I've been luckier than you: the limit is 1000, they do paging, and you can get the total number of measurements, so you know how many calls you need to make. In your case, you have to do it "blindly" because the API doesn't return the number of lines in the original file, what a pity!
No I didn't get the error with other datasets I had tried. They were much smaller.
I have installed the new version and I got this error
lala <- fetch_data("b46200c1-ca9a-4bbe-92f8-b5039cc25a12", max_obs=70000)
Error in function (type, msg, asError = TRUE) :
Unknown SSL protocol error in connection to data.gov.in:443
Then I did it a second time and got a new error
lala <- fetch_data("b46200c1-ca9a-4bbe-92f8-b5039cc25a12", max_obs=70000)
Error in function (type, msg, asError = TRUE) :
SSL read: error:00000000:lib(0):func(0):reason(0), errno 10053
I tried with a limit closer to the number of lines in the timetable (69007)
lala <- fetch_data("b46200c1-ca9a-4bbe-92f8-b5039cc25a12", max_obs=69010)
Error in function (type, msg, asError = TRUE) :
Failed to connect to data.gov.in port 443: Timed out
Is the data too big? It's quite a limitation of the API, ah! But I guess I could still use it if I used the filter argument and queried only the things that interest me (like all trains from Hyderabad). In this case, I really wanted the whole thing.
I have two suggestions (because some tables will be bigger than 100 or 500 lines without being as big as the train timetable 😄 ):
- having a verbose option so when it is TRUE you can say "querying observations X to Y" (so the user doesn't get desperate and knows something is going on)
- If the calls stop because max_obs is reached, but not because all data was retrieved (in my case if I set 500 as the max_obs I get no error and I cannot know whether there are more lines in the original table), then give a warning, e.g. "There were more observations than this, try again with a higher max_obs".
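To make the two suggestions concrete, here is a minimal sketch of what they could look like. It is only an illustration under assumptions: fetch_with_feedback, fetch_page and the fake pager are hypothetical names, not part of ogdindiar, and since the API does not report a total row count, the warning can only say that there may be more observations.

```r
# Hypothetical sketch of a verbose option and a truncation warning.
# `fetch_page(offset, limit)` stands in for one API call (assumed helper).
fetch_with_feedback <- function(fetch_page, max_obs = 500, page_size = 100,
                                verbose = FALSE) {
  pages <- list()
  n_fetched <- 0
  offset <- 0
  exhausted <- FALSE  # becomes TRUE once the source returns a short page
  while (n_fetched < max_obs) {
    if (verbose) {
      message("querying observations ", offset + 1, " to ", offset + page_size)
    }
    page <- fetch_page(offset, page_size)
    if (nrow(page) > 0) {
      pages[[length(pages) + 1]] <- page
      n_fetched <- n_fetched + nrow(page)
    }
    if (nrow(page) < page_size) {
      exhausted <- TRUE
      break
    }
    offset <- offset + page_size
  }
  # The loop ended because of the cap, not because the data ran out.
  # Without a total count this is only a hint: a table whose size is an
  # exact multiple of page_size would also trigger it.
  if (!exhausted) {
    warning("max_obs reached; there may be more observations, ",
            "try again with a higher max_obs", call. = FALSE)
  }
  if (length(pages) == 0) return(data.frame())
  head(do.call(rbind, pages), max_obs)
}

# Fake 250-row source, used only to exercise the function
src <- data.frame(id = seq_len(250))
pager <- function(offset, limit) {
  idx <- seq_len(nrow(src))
  src[idx > offset & idx <= offset + limit, , drop = FALSE]
}

small <- suppressWarnings(fetch_with_feedback(pager, max_obs = 200))  # would warn
full  <- fetch_with_feedback(pager, max_obs = 500, verbose = TRUE)    # no warning
```

Using message() for progress keeps it separate from the returned data and lets users silence it with suppressMessages().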
Thanks again for your help and your package!
Sure @masalmon, I'll incorporate your suggestions. :)
Cool, thank you!
I was also thinking that your package needs use cases. The open data platform is a goldmine! Maybe in the coming weeks I'll do something with the trains (e.g. querying all trains from a city and making a map of them). I'm sure it would motivate people to use the package and the data. 😄 And then you could add cool pictures/GIFs made from the data to the README as a teaser. 😆
Sounds good, thanks! :)