altmetric / embiggen
A Ruby library to expand shortened URLs
Home Page: https://rubygems.org/gems/embiggen
License: MIT License
The reason we have two separate expansion methods (Embiggen::URI#expand and Embiggen::URI#expand!) is to communicate the following things to the user:
At the same time, we want to provide a graceful API which tolerates failures for users that don't mind if URI expansion fails (and so doesn't unnecessarily raise exceptions).
Rather than having two separate methods (one graceful, one exception-happy), we could instead have a single Embiggen::URI#expand method which returns a sum type to communicate some of the above without disrupting control flow:
Embiggen::URI('http://www.altmetric.com').expand
#=> #<ExpandedURI http://www.altmetric.com>
Embiggen::URI('http://toomanyredirec.ts').expand
#=> #<ShortenedURI http://toomanyredirec.ts>
Embiggen::URI('http://badshortenedu.rl').expand
#=> #<BadShortenedURI http://badshortenedu.rl>
These types could wrap a standard Ruby URI object so they can be used seamlessly with other libraries.
This would then be graceful (viz. failure to expand would neither raise an exception nor return nil) but users could check the return value if they wanted to ensure expansion was successful (we could provide an Embiggen::URI#expanded? helper rather than switching on type).
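A minimal sketch of how such result types might look, assuming each one wraps a standard URI using Ruby's SimpleDelegator. The class names come from the proposal above; the class bodies and the expanded? implementations are assumptions for illustration, not Embiggen's actual code:

```ruby
require 'delegate'
require 'uri'

# Hypothetical result types: each wraps a standard URI and forwards
# unknown methods to it, so callers can treat the result as a URI.
class ExpandedURI < SimpleDelegator
  def expanded?
    true
  end
end

class ShortenedURI < SimpleDelegator
  def expanded?
    false
  end
end

class BadShortenedURI < SimpleDelegator
  def expanded?
    false
  end
end

result = ShortenedURI.new(URI('http://toomanyredirec.ts'))
result.expanded? # false, and no exception was raised
result.host      # forwarded to the wrapped URI
```

Delegation keeps the wrapped URI usable by other libraries, while expanded? spares callers from switching on the result's class.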
Hello!
I saw this example on the README
# Custom logic to attempt to expand every URI
class ExpandEverything
  def self.include?(_uri)
    true
  end
end
Embiggen.configure do |config|
  config.shorteners = ExpandEverything
end
But I think it's not working, because Embiggen::URI doesn't stop on 200 responses; it just keeps following redirects until the domain is not in Configuration.shorteners.
Am I right?
It would be super helpful to have something like Configuration.shorteners = false and just follow all redirects until a 200 is returned.
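For illustration, the requested behaviour could be sketched roughly as below. The follow_until_final helper, its limit parameter, and the block that looks up a Location header are all hypothetical and not part of Embiggen; in real use the block would issue an HTTP request and return the Location header only for redirect responses:

```ruby
require 'uri'

# Hypothetical helper: keeps following Location headers until the lookup
# block reports no further redirect (returns nil) or the limit is reached.
# If the limit is reached, the last URI visited is returned.
def follow_until_final(uri, limit: 5, &location_for)
  limit.times do
    location = location_for.call(uri)
    return uri if location.nil?

    # URI.join also copes with relative Location headers.
    uri = URI.join(uri.to_s, location)
  end
  uri
end

# Stubbed redirect table standing in for real HTTP responses.
redirects = {
  'http://a.example/1' => 'http://b.example/2',
  'http://b.example/2' => '/final'
}
final = follow_until_final(URI('http://a.example/1')) { |u| redirects[u.to_s] }
```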
While RFC 7231 specifies that Location headers should be valid URI references, we have seen headers with unescaped characters in them, e.g. http://shar.es/1p3DSI:
$ curl -I http://shar.es/1p3DSI
HTTP/1.1 301 Moved Permanently
Location: http://paradrasi.gr/2015/04/19/ούτε-σεισμοί-ούτε-λιμοί-ούτε-καταποντ/
When passed to URI, this fails with an InvalidURIError as URIs must be ASCII-only.
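One possible workaround, sketched with the standard library only: percent-encode the non-ASCII characters before handing the header to URI. The escape_non_ascii helper is hypothetical and assumes the header value is a UTF-8 string; it is not how Embiggen itself handles this case:

```ruby
require 'uri'

# Hypothetical helper: percent-encodes every non-ASCII character as its
# bytes (UTF-8 bytes for a UTF-8 string) and leaves ASCII untouched.
def escape_non_ascii(location)
  location.gsub(/[^[:ascii:]]/) do |char|
    char.bytes.map { |byte| format('%%%02X', byte) }.join
  end
end

location = 'http://paradrasi.gr/2015/04/19/ούτε-σεισμοί-ούτε-λιμοί-ούτε-καταποντ/'
uri = URI(escape_non_ascii(location))
uri.host # "paradrasi.gr"
```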
Hi, I am trying to add a gem to my code and the restriction to addressable ~> 2.3 is blocking it.
Would there be any problem bumping it to version 2.5?
Hi Folks...
I'm wondering if you could add perma.cc to embiggen. While perma.cc isn't strictly a link shortener, for purposes of altmetrics, it's usually desirable to look at the parent page.
require 'json'
require 'net/http'
require 'uri'

# Returns the original URL archived by a perma.cc link, or the input unchanged.
def parsePermaCC(incomingurl)
  if incomingurl.include? "perma.cc"
    archive = incomingurl.split("perma.cc/")[1]
    # https://perma.cc/docs/developer#get-one-archive
    url = "https://api.perma.cc/v1/public/archives/#{archive}"
    uri = URI(url)
    response = JSON.parse(Net::HTTP.get(uri))
    return URI(response['url'])
  else
    return incomingurl
  end
end
This is some Ruby I hacked up (I usually code in Python, so this is probably all sorts of badwrong) which takes a URI and finds the perma.cc parent URL, or just passes the URL through if it isn't a perma.cc link. I'm thinking an invocation for this can go into follow().
Let me know if this is something you folks would like to include here, or if I should go raise a ticket in altmetric for this.
Cheers,
-Brian
As embiggen handles Timeout::Error and Errno::ECONNRESET, I was wondering if it should also handle Errno::EHOSTUNREACH and return the original URL.
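A rough sketch of what that could look like. The NETWORK_ERRORS list and the expand_gracefully helper are hypothetical names for illustration; the block stands in for whatever performs the real HTTP request:

```ruby
require 'timeout'
require 'uri'

# Hypothetical list of errors to tolerate: the two mentioned above plus
# the proposed Errno::EHOSTUNREACH.
NETWORK_ERRORS = [Timeout::Error, Errno::ECONNRESET, Errno::EHOSTUNREACH].freeze

# Hypothetical helper: yields the URI to a block that performs the
# expansion and falls back to the original URI on a tolerated error.
def expand_gracefully(uri)
  yield uri
rescue *NETWORK_ERRORS
  uri
end

original = URI('http://unreachable.example/abc')
result = expand_gracefully(original) { |_uri| raise Errno::EHOSTUNREACH }
```

Any error outside the list still propagates, so unexpected failures are not silently swallowed.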
I see that http://longurl.org/services and the code have to and tk as domains for URL shortening services. And Dot TK does in fact let you create your own .tk domain to shorten any URL you want.
But not all .tk (and .to?) domains are URL shortening services, I'm afraid. Here is one example.
$ pry
[1] pry(main)> RUBY_VERSION
=> "2.2.2"
[2] pry(main)> require "embiggen"
=> true
[3] pry(main)> url = "http://crafttalk.tk/index.php/260155-working-with-a-carpeting-cleaner-recommendations-and-suggestion/0"
=> "http://crafttalk.tk/index.php/260155-working-with-a-carpeting-cleaner-recommendations-and-suggestion/0"
[4] pry(main)> Embiggen::URI(url).expand
=> #<URI::Generic /index.php/260155-locate-a-excellent-web-host-with-these-ideas/0>
[5] pry(main)> Embiggen::URI(url).expand!
=> #<URI::Generic /index.php/260155-locate-a-excellent-web-host-with-these-ideas/0>
[6] pry(main)> Embiggen::URI(url).shortened?
=> true
A bit unexpected to only get the path back (a new path). A closer look at that:
[19] pry(main)> uri = URI(Addressable::URI.parse(url).normalize.to_s)
=> #<URI::HTTP http://crafttalk.tk/index.php/260155-working-with-a-carpeting-cleaner-recommendations-and-suggestion/0>
[35] pry(main)> http = ::Net::HTTP.new(uri.host, uri.port)
=> #<Net::HTTP crafttalk.tk:80 open=false>
[36] pry(main)> response = http.head(uri.request_uri)
=> #<Net::HTTPMovedPermanently 301 Moved Permanently readbody=true>
[37] pry(main)> response.is_a?(::Net::HTTPRedirection)
=> true
[38] pry(main)> response.fetch('Location')
=> "/index.php/260155-locate-a-excellent-web-host-with-these-ideas/0"
Not sure what the right thing to do here is; maybe there are people expecting this gem to be able to expand arbitrary .to and .tk domains.
Maybe it is possible to detect that we only got a path from the Location header, and prepend the scheme and host (and port if non-standard). But then #shortened? isn't really true, in some cases. :)
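As a sketch of that idea, Ruby's standard URI.join resolves a path-only Location against the URI that was requested, keeping the scheme, host and port. The absolute_location helper is a hypothetical name, not something Embiggen provides:

```ruby
require 'uri'

# Hypothetical helper: resolves a Location header against the URI that
# was requested. Absolute Location values pass through unchanged.
def absolute_location(request_uri, location)
  URI.join(request_uri.to_s, location)
end

request = URI('http://crafttalk.tk/index.php/260155-working-with-a-carpeting-cleaner-recommendations-and-suggestion/0')
resolved = absolute_location(request, '/index.php/260155-locate-a-excellent-web-host-with-these-ideas/0')
resolved.host # "crafttalk.tk"
```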
So the easiest option seems to be to just remove to and tk.
(This issue is slightly related to #2)
As raised by @sferik on Twitter:
I’d encourage you to reconsider the necessity of the existence of the shorteners class method. It will be a maintenance nightmare.
It still seems valuable to provide a default list of shorteners but to encourage users to supply their own.
The current implementation interrogates Configuration.shorteners with any? and expects to be yielded a series of domains that respond to to_s.
If the interface between Embiggen::URI and Configuration.shorteners were simplified to include? (or something similar) then it would be possible to replace a static list of shorteners with much more powerful options, e.g.
require 'bitly'
require 'forwardable'

class BitlyDomains
  extend Forwardable
  attr_reader :client
  def_delegator :client, :pro?, :include?

  def initialize(client)
    @client = client
  end
end

Bitly.use_api_version_3
Bitly.configure do |config|
  config.api_version = 3
  config.access_token = ENV.fetch('BITLY_ACCESS_TOKEN')
end

Embiggen.configure do |config|
  config.shorteners = BitlyDomains.new(Bitly.client)
end
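A further sketch of what an include?-only interface would permit, this time without any external service. The SubdomainShorteners class is hypothetical, and it assumes include? is handed a URI object, as in the ExpandEverything example from the README:

```ruby
require 'set'
require 'uri'

# Hypothetical shortener list that also matches subdomains of the
# configured domains. It only needs to respond to include?.
class SubdomainShorteners
  def initialize(domains)
    @domains = Set.new(domains)
  end

  def include?(uri)
    host = uri.host.to_s
    @domains.any? { |domain| host == domain || host.end_with?(".#{domain}") }
  end
end

shorteners = SubdomainShorteners.new(%w[bit.ly t.co])
shorteners.include?(URI('http://www.bit.ly/abc'))
```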
I noticed today that t.co links are no longer being expanded.
irb(main):024:0> uri=Embiggen::URI('t.co/WPaJlkl1X6').expand
=> #<URI::Generic t.co/WPaJlkl1X6>
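One thing that may be worth checking, although the report does not establish it as the cause: Ruby's URI parses a string without a scheme as a relative reference, so it has no host to compare against a list of shortener domains. The with_default_scheme helper below is hypothetical:

```ruby
require 'uri'

without_scheme = URI('t.co/WPaJlkl1X6')
without_scheme.class # URI::Generic
without_scheme.host  # nil, the whole string is treated as a path

# Hypothetical helper: prefixes a scheme when the string has none.
def with_default_scheme(string, scheme = 'http')
  string.match?(%r{\A[a-z][a-z0-9+.-]*://}i) ? string : "#{scheme}://#{string}"
end

with_scheme = URI(with_default_scheme('t.co/WPaJlkl1X6'))
with_scheme.host # "t.co"
```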
Embiggen::URI("http://t.").expand returns http://t/. I think it should raise a BadShortenedURI error.
Hello, would you mind making the addressable gem version constraint a little more flexible?
Instead of:
s.add_dependency('addressable', '~> 2.3')
allow anything between 2.3 and 3.0
s.add_dependency('addressable', '~> 2', '>= 2.3')
I need this because a lot of the Google Cloud SDK gems use version 2.5 and above.
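When weighing constraints like these, RubyGems' own classes can report which versions a requirement admits. A small sketch, where the allows? helper is a hypothetical convenience and nothing here is read from embiggen's gemspec:

```ruby
require 'rubygems'

# Hypothetical helper: true if the version satisfies all constraints.
def allows?(version, *constraints)
  Gem::Requirement.new(*constraints).satisfied_by?(Gem::Version.new(version))
end

allows?('2.5.0', '~> 2', '>= 2.3') # true
allows?('3.0.0', '~> 2', '>= 2.3') # false
allows?('2.5.0', '~> 2.3')         # true
allows?('2.5.0', '~> 2.3.0')       # false
```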
Embiggen currently uses Ruby's standard Net::HTTP library to perform HTTP requests but this is not open for extension (e.g. to configure proxies or swap out entirely). It might be better to provide an adapter interface for other HTTP client libraries.
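A rough sketch of what such an adapter interface might look like. Everything here is an assumption for illustration: the location_for method name, both adapter classes and the expand_once helper are hypothetical and do not exist in Embiggen:

```ruby
require 'net/http'
require 'uri'

# Hypothetical adapter contract: respond to location_for(uri) and return
# the Location header for a redirect, or nil otherwise.
class NetHTTPAdapter
  def location_for(uri)
    http = Net::HTTP.new(uri.host, uri.port)
    http.use_ssl = uri.scheme == 'https'
    response = http.head(uri.request_uri)
    response['Location'] if response.is_a?(Net::HTTPRedirection)
  end
end

# A stub adapter backed by a hash, e.g. for tests without network access.
class StubAdapter
  def initialize(redirects)
    @redirects = redirects
  end

  def location_for(uri)
    @redirects[uri.to_s]
  end
end

# Hypothetical single expansion step written against the contract only.
def expand_once(uri, adapter)
  location = adapter.location_for(uri)
  location ? URI.join(uri.to_s, location) : uri
end

stub = StubAdapter.new('http://bit.ly/abc' => 'http://www.altmetric.com/')
expanded = expand_once(URI('http://bit.ly/abc'), stub)
```

Because expand_once depends only on location_for, a proxy-aware client or a different HTTP library could be swapped in by writing another small adapter class.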