medium_to_ghost's Introduction

Medium to Ghost 2.0

Feeling locked into Medium.com? Instantly move all your content (formatted posts + images) to an open source Ghost 2.0 blog!

Migrate your data out of Medium to Ghost

This code converts all your Medium.com posts to a Ghost 2.0.x import file. With that, you can import all your content into a Ghost blog (hosted anywhere) in seconds. Your posts keep the same formatting and all your images are migrated over too.

Why?

Medium.com is a nice platform for creating blog posts. I use it and enjoy it.

But you should never feel like your content is locked into someone else's privately-owned platform. This gives you the option to move your content to your own blog if you want to do it. It's also a quick way to back up all your Medium.com content (especially your images which they don't export) in case the site disappears someday.

I hacked this together quickly to move my blog, Machine Learning is Fun!, from Medium to a self-hosted Ghost site. Hopefully it's useful to someone else too. More options are always good, right?

Requirements

  • A blog running Ghost v2.0.3+ (NOT Ghost 1.x). Self-hosted and professionally hosted Ghost instances are both fine.
  • A Medium.com account where you've previously written content.
  • Python 3.6+ to run this program

Installing this program

After you've installed Python 3.6+, you can install this program by opening up a terminal window and running this command:

Mac / Linux

pip3 install medium_to_ghost

Windows

pip install medium_to_ghost

How to use this to export your Medium content

  1. Install Python 3.6+. Lower versions won't work!
  2. Install this program (See "Installing this program")
  3. Go to https://medium.com/me/settings and find Download your information. There's a button to export your data and email it to you.
  4. Wait for the email from Medium and download your zip file. This will give you a file called medium-export.zip
  5. Run python3 -m medium_to_ghost.medium_to_ghost medium-export.zip which will produce medium_export_for_ghost.zip. This new zip file contains all your converted Medium posts and images from your posts. Make sure to put the full path to the zip file if it's not in the current directory. This may take a few minutes if you have lots of images in your posts since they all have to be downloaded.
  6. Go into Ghost 2.0.3+, navigate to /ghost/, click on 'Labs', and choose to import that zip file.
  7. That's it!
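
Step 5 asks for the full path when the zip file is not in the current directory. As a minimal sketch (the helper name build_command is hypothetical and not part of medium_to_ghost), the path can be resolved and checked before the converter is started:

```python
import sys
from pathlib import Path


def build_command(zip_path):
    """Return the converter command line for an existing Medium export zip."""
    # expanduser() handles paths such as ~/Downloads/medium-export.zip
    resolved = Path(zip_path).expanduser().resolve()
    if not resolved.is_file():
        raise FileNotFoundError(f"No export zip at {resolved}")
    # Same module invocation as step 5, run with the current interpreter.
    return [sys.executable, "-m", "medium_to_ghost.medium_to_ghost", str(resolved)]
```

The returned list can be handed to subprocess.run(..., check=True). A missing file then fails early with the resolved path in the message, which makes a wrong working directory easier to spot.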

What gets moved over

When exporting content from Medium, the following features are supported:

  • Both published articles and drafts are moved over. So even if you are in the middle of writing a new post, it should be a seamless transition.
  • Most Medium.com content is replicated perfectly in Ghost, including text formatting, embedded GitHub gists, image cards with captions, Upscribe mailing list signup forms, etc.
  • If your Medium posts have a featured image, that will come over automatically too.

What is lost when moving over

  • Comments are not moved over to Ghost
  • Story highlights are not moved over to Ghost
  • I tried to make the Ghost output as similar to Medium as possible. However, there may be bugs or types of Medium content I haven't seen before, so always check the results in Ghost carefully. I just made sure it worked for all my articles. No warranty! :)

Warnings!

  • Hopefully this code will work for you, but it may have bugs and cause your computer to explode. Make sure you test everything out on a test Ghost instance before you import anything into a live blog.
  • Starting with Ghost v2.17, you can set the canonical URL of a post. This tool will attempt to automatically set up the imported Ghost posts to point back to the original Medium URLs to avoid any SEO impact of switching blogging platforms. If you don't want your posts to point back to Medium, you will need to go in and remove the Canonical URL setting for each imported post.
  • Ghost 2.0.3+ has a bug with image paths in import files. This tool may need to be updated when that bug is fixed in order for it to keep working, but it works for now (Checked up to Ghost 2.18.3).
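
If removing the canonical URL post by post in the Ghost editor is too tedious, the setting could in principle be stripped from the import JSON before importing. The sketch below is only an illustration: it assumes the value is stored under a key named canonical_url somewhere in the JSON inside the generated zip, which should be checked against the actual file first. The function name and sample data are hypothetical.

```python
import json


def strip_canonical_urls(node):
    """Recursively drop every "canonical_url" key from parsed JSON data."""
    if isinstance(node, dict):
        return {
            key: strip_canonical_urls(value)
            for key, value in node.items()
            if key != "canonical_url"  # assumed field name, verify in your file
        }
    if isinstance(node, list):
        return [strip_canonical_urls(item) for item in node]
    return node


# Made-up sample data, not real tool output.
sample = {"data": {"posts": [{"title": "Hello", "canonical_url": "https://medium.com/@someone/hello"}]}}
print(json.dumps(strip_canonical_urls(sample)))
```

Because the function walks the whole structure, it does not depend on where exactly the posts sit in the file.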

medium_to_ghost's People

Contributors

ageitgey, ferdi2005, urish

medium_to_ghost's Issues

Import error on Windows Subsystem for Linux and Windows

Hi!

When I try to run your program, I get this error:

brandon@DESKTOP-KSVAT30:/mnt/c/Users/Brandon/Desktop/med/medium_to_ghost-master/medium_to_ghost$ python3 medium_to_ghost.py medium-export.zip
Traceback (most recent call last):
  File "medium_to_ghost.py", line 3, in <module>
    from medium_to_ghost.medium_post_parser import convert_medium_post_to_ghost_json
  File "/mnt/c/Users/Brandon/Desktop/med/medium_to_ghost-master/medium_to_ghost/medium_to_ghost.py", line 3, in <module>
    from medium_to_ghost.medium_post_parser import convert_medium_post_to_ghost_json
ModuleNotFoundError: No module named 'medium_to_ghost.medium_post_parser'; 'medium_to_ghost' is not a package

Even though it is installed:

brandon@DESKTOP-KSVAT30:/mnt/c/Users/Brandon/Desktop/med/medium_to_ghost-master/medium_to_ghost$ pip3 install medium_to_ghost
Collecting medium_to_ghost
  Using cached https://files.pythonhosted.org/packages/27/20/37193588e828fc6afdf8bcdae124af39091ed71b831e4a81eabdf4ab8e2e/medium_to_ghost-0.0.2-py3-none-any.whl
Collecting Click>=6.0 (from medium_to_ghost)
  Using cached https://files.pythonhosted.org/packages/fa/37/45185cb5abbc30d7257104c434fe0b07e5a195a6847506c074527aa599ec/Click-7.0-py2.py3-none-any.whl
Collecting beautifulsoup4 (from medium_to_ghost)
  Using cached https://files.pythonhosted.org/packages/3f/ef/40271f62429deec36f2d040283e722856abcfd34bac063435a2213b77bef/beautifulsoup4-4.7.0-py3-none-any.whl
Collecting soupsieve>=1.2 (from beautifulsoup4->medium_to_ghost)
  Using cached https://files.pythonhosted.org/packages/ef/06/53edcae4edea76b38a325980dd35aed3b39f9bd0ef27b9d33f2e6dc4c7f6/soupsieve-1.6.2-py2.py3-none-any.whl
Installing collected packages: Click, soupsieve, beautifulsoup4, medium-to-ghost
Successfully installed Click-7.0 beautifulsoup4-4.7.0 medium-to-ghost-0.0.2 soupsieve-1.6.2

Switching to Windows:

brandon@DESKTOP-KSVAT30:/mnt/c/Users/Brandon/Desktop/med/medium_to_ghost-master/medium_to_ghost$ logout

C:\Users\Brandon\Desktop\med\medium_to_ghost-master\medium_to_ghost>pip install medium_to_ghost
Requirement already satisfied: medium_to_ghost in c:\users\brandon\appdata\local\programs\python\python35\lib\site-packages (0.0.2)
Requirement already satisfied: Click>=6.0 in c:\users\brandon\appdata\local\programs\python\python35\lib\site-packages (from medium_to_ghost) (7.0)
Requirement already satisfied: beautifulsoup4 in c:\users\brandon\appdata\local\programs\python\python35\lib\site-packages (from medium_to_ghost) (4.7.0)
Requirement already satisfied: soupsieve>=1.2 in c:\users\brandon\appdata\local\programs\python\python35\lib\site-packages (from beautifulsoup4->medium_to_ghost) (1.6.2)
C:\Users\Brandon\Desktop\med\medium_to_ghost-master\medium_to_ghost>python medium_to_ghost.py medium-export.zip
Traceback (most recent call last):
  File "medium_to_ghost.py", line 3, in <module>
    from medium_to_ghost.medium_post_parser import convert_medium_post_to_ghost_json
  File "C:\Users\Brandon\Desktop\med\medium_to_ghost-master\medium_to_ghost\medium_to_ghost.py", line 3, in <module>
    from medium_to_ghost.medium_post_parser import convert_medium_post_to_ghost_json
ImportError: No module named 'medium_to_ghost.medium_post_parser'; 'medium_to_ghost' is not a package
Python 3.5.0 (v3.5.0:374f501f4567, Sep 13 2015, 02:27:37) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import medium_to_ghost
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Brandon\Desktop\med\medium_to_ghost-master\medium_to_ghost\medium_to_ghost.py", line 3, in <module>
    from medium_to_ghost.medium_post_parser import convert_medium_post_to_ghost_json
ImportError: No module named 'medium_to_ghost.medium_post_parser'; 'medium_to_ghost' is not a package

I've tried to google around and I couldn't fix this :(
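
The tracebacks above show medium_to_ghost.py importing itself twice, which suggests the script was started from inside the package directory, so Python finds the file medium_to_ghost.py before the installed package of the same name. (The Windows session is also running Python 3.5, below the 3.6+ the README requires.) A small diagnostic sketch using only the standard library; the helper name describe_import is hypothetical:

```python
import importlib.util


def describe_import(name):
    """Report whether `name` resolves to a package, a plain module, or nothing."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return ("missing", None)
    # Packages have submodule search locations; single-file modules do not.
    kind = "package" if spec.submodule_search_locations is not None else "module"
    return (kind, spec.origin)


print(describe_import("medium_to_ghost"))
```

If this reports a "module" whose origin is the local medium_to_ghost.py, the local file is shadowing the installed package. Running python3 -m medium_to_ghost.medium_to_ghost from a different directory, as the README describes, should avoid that.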

IndexError: list index out of range

INFO:root:Parsing posts/2019-01-01_Bitcoin-By-the-Numbers--2018-Recap-68a91789d804.html
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/jameson/.local/lib/python3.6/site-packages/medium_to_ghost/medium_to_ghost.py", line 111, in <module>
    main()
  File "/home/jameson/.local/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/jameson/.local/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/jameson/.local/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/jameson/.local/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/jameson/.local/lib/python3.6/site-packages/medium_to_ghost/medium_to_ghost.py", line 98, in main
    exported_posts = parse_posts(posts)
  File "/home/jameson/.local/lib/python3.6/site-packages/medium_to_ghost/medium_to_ghost.py", line 53, in parse_posts
    converted_post = convert_medium_post_to_ghost_json(name, content)
  File "/home/jameson/.local/lib/python3.6/site-packages/medium_to_ghost/medium_post_parser.py", line 79, in convert_medium_post_to_ghost_json
    parser.feed(post_html_content)
  File "/usr/lib/python3.6/html/parser.py", line 111, in feed
    self.goahead(0)
  File "/usr/lib/python3.6/html/parser.py", line 163, in goahead
    self.handle_data(unescape(rawdata[i:j]))
  File "/home/jameson/.local/lib/python3.6/site-packages/medium_to_ghost/medium_post_parser.py", line 459, in handle_data
    self.cards[-1][1]["caption"] = data
IndexError: list index out of range

HTML file attached:
2019-01-01_Bitcoin-By-the-Numbers--2018-Recap-68a91789d804.txt

Still, index out of range error

File "/usr/local/lib/python3.7/site-packages/medium_to_ghost/medium_post_parser.py", line 459, in handle_data
self.cards[-1][1]["caption"] = data
IndexError: list index out of range
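
Both reports fail on the same statement, self.cards[-1][1]["caption"] = data, which raises IndexError whenever caption text arrives while the card list is still empty. The following is a defensive sketch rather than the project's actual fix. It assumes cards is a list of [card_name, payload_dict] pairs, as the indexing in the traceback suggests, and the function name is hypothetical:

```python
def set_caption_on_last_card(cards, caption):
    """Attach a caption to the most recent card; return False if there is none."""
    if not cards:
        # Caption text arrived before any card was parsed: nothing to attach it to.
        return False
    cards[-1][1]["caption"] = caption
    return True
```

Returning False instead of raising lets the caller decide whether to log the orphaned caption or emit it as ordinary text.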

Proposal / Setup a docker version

Hi,

I would like to help create a Docker version of this tool. I don't have a Medium blog myself, so I have a question: do you have a Medium export I could work with? If not, I will create a dummy user :)

Cheers!

feature request: generate redirects json

It would be cool to also generate the redirects.json, so that search engines and users who have bookmarked pages on Medium can still reach them on the new Ghost blog.
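
A rough sketch of what such a generator might look like. Ghost accepts a redirects file containing a list of objects with "from", "to" and "permanent" keys. Everything else here is an assumption: that old Medium paths end in a 12-character hex post ID (as the export filenames on this page do), that the Ghost slug is the same text without the ID in lowercase, and that the old URLs are served from a domain that now points at Ghost. The function name is hypothetical.

```python
import json
import re

# Assumed shape of the trailing Medium post ID, e.g. "-7c8d533bfdad".
MEDIUM_ID_SUFFIX = re.compile(r"-[0-9a-f]{12}$")


def build_redirects(medium_slugs):
    """Map old Medium-style paths to assumed Ghost paths as redirect entries."""
    redirects = []
    for slug in medium_slugs:
        ghost_slug = MEDIUM_ID_SUFFIX.sub("", slug).lower()
        redirects.append({"from": f"/{slug}", "to": f"/{ghost_slug}/", "permanent": True})
    return redirects


print(json.dumps(build_redirects(["10-ways-to-fail-a-Startup-Weekend-7c8d533bfdad"]), indent=2))
```

Redirects of this kind only help when the blog used a custom domain on Medium. Links on medium.com itself cannot be redirected from the Ghost side.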

embedded tweets get lost

When importing a post converted with this tool that contains embedded tweets, the line with the tweet shows up as an empty quote block.

urllib.error.HTTPError: HTTP Error 400: Bad Request

Hi, I get the following error when I use the script.

INFO:root:Parsing posts/2013-11-05_10-ways-to-fail-a-Startup-Weekend-7c8d533bfdad.html
INFO:root:Downloading https://cdn-images-1.medium.com/fit/t/NaN/NaN/1*BkLPtHoNPLVLxf1mBqXsvQ.jpeg to exported_content/downloaded_images/10-ways-to-fail-a-Startup-Weekend
Traceback (most recent call last):
  File "/usr/local/bin/medium_to_ghost", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/medium_to_ghost/medium_to_ghost.py", line 98, in main
    exported_posts = parse_posts(posts)
  File "/usr/local/lib/python3.7/site-packages/medium_to_ghost/medium_to_ghost.py", line 53, in parse_posts
    converted_post = convert_medium_post_to_ghost_json(name, content)
  File "/usr/local/lib/python3.7/site-packages/medium_to_ghost/medium_post_parser.py", line 84, in convert_medium_post_to_ghost_json
    new_image_path = download_image_with_local_cache(url, cache_folder)
  File "/usr/local/lib/python3.7/site-packages/medium_to_ghost/image_downloader.py", line 32, in download_image_with_local_cache
    local_filename, headers = urllib.request.urlretrieve(url, local_destination)
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 247, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request

Do you need more info?
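
The failing URL contains NaN/NaN where image dimensions would normally appear, which may be why the image server rejects it. One way a converter could tolerate such images is to skip them instead of aborting the whole run. This is a sketch only, not the tool's current behavior. The function name is hypothetical, and the retrieve parameter exists just so the logic can be exercised without network access:

```python
import logging
import urllib.error
import urllib.request


def download_or_skip(url, destination, retrieve=urllib.request.urlretrieve):
    """Download url to destination; log and return None on HTTP or network errors."""
    try:
        local_filename, _headers = retrieve(url, destination)
        return local_filename
    except urllib.error.URLError as err:  # HTTPError is a subclass of URLError
        logging.warning("Skipping %s: %s", url, err)
        return None
```

A caller would then leave the original remote URL in the post, or drop the image card, whenever None comes back.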

Skip posts with a blank title

My export contained 7 posts with no title. I think it would be a good idea to skip them when generating the JSON, because they block the import into Ghost.
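
A minimal sketch of the suggested filtering, assuming each converted post is a dict with a "title" key (the function name is hypothetical, and the real structure used by the tool may differ):

```python
def drop_untitled_posts(posts):
    """Keep only posts whose "title" contains non-whitespace text."""
    return [post for post in posts if (post.get("title") or "").strip()]
```

An alternative would be to substitute a placeholder title such as "(untitled)" so that no content is silently dropped.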

Accents are lost

The title
Coprocesseur “flottant” ta mère
converted URL becomes
coprocesseur---ottant--ta-m-re
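
The dropped characters suggest that non-ASCII letters (and possibly an "ﬂ" ligature in the title) are being replaced with dashes instead of being reduced to their ASCII base letters. Below is a sketch of a slug function that normalizes first, using only the standard library. It is an illustration, not the tool's actual slug code, and the function name is hypothetical:

```python
import re
import unicodedata


def slugify(title):
    """Build an ASCII slug, keeping base letters of accented characters."""
    # NFKD splits "è" into "e" plus a combining accent and expands ligatures.
    normalized = unicodedata.normalize("NFKD", title)
    # Drop whatever still is not ASCII (combining marks, curly quotes, ...).
    ascii_only = normalized.encode("ascii", "ignore").decode("ascii")
    # Collapse every run of other characters into a single dash.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_only.lower())
    return slug.strip("-")
```

With this approach the title from the report would become coprocesseur-flottant-ta-mere.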

urlopen error [SSL: CERTIFICATE_VERIFY_FAILED]

I am getting an error because of some failed validation when I follow the README

self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1056)

This happens when I run python3 -m medium_to_ghost.medium_to_ghost medium-export.zip on macOS 10.14.4 with Python 3.7.3.

"Unable to find medium-export.zip" error

Hello!

I'm so close to being able to do this. I've done all the necessary steps. When I run the last command, python3 -m medium_to_ghost.medium_to_ghost medium-export.zip, I get the "Unable to find medium-export.zip" error. My zip file is in my Downloads folder. Am I missing anything?

[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1122)

Starts to parse posts, downloads first HTML file and then:

Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 1342, in do_open
h.request(req.get_method(), req.selector, req.data, headers,
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 1255, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 1301, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 1250, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 1010, in _send_output
self.send(msg)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 950, in send
self.connect()
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 1424, in connect
self.sock = self._context.wrap_socket(self.sock,
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ssl.py", line 500, in wrap_socket
return self.sslsocket_class._create(
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ssl.py", line 1040, in _create
self.do_handshake()
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ssl.py", line 1309, in do_handshake
self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1122)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/medium_to_ghost/medium_to_ghost.py", line 111, in <module>
main()
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/medium_to_ghost/medium_to_ghost.py", line 98, in main
exported_posts = parse_posts(posts)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/medium_to_ghost/medium_to_ghost.py", line 53, in parse_posts
converted_post = convert_medium_post_to_ghost_json(name, content)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/medium_to_ghost/medium_post_parser.py", line 90, in convert_medium_post_to_ghost_json
new_image_path = download_image_with_local_cache(url, cache_folder)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/medium_to_ghost/image_downloader.py", line 34, in download_image_with_local_cache
local_filename, headers = urllib.request.urlretrieve(url, local_destination)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 239, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 214, in urlopen
return opener.open(url, data, timeout)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 517, in open
response = self._open(req, data)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 534, in _open
result = self._call_chain(self.handle_open, protocol, protocol +
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 494, in _call_chain
result = func(*args)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 1385, in https_open
return self.do_open(http.client.HTTPSConnection, req,
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 1345, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1122)>
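
"unable to get local issuer certificate" generally means that Python could not find a usable set of root CA certificates. On macOS, the python.org installers ship an "Install Certificates.command" script in the Python application folder for this purpose. The sketch below is a diagnostic only, with a hypothetical function name: it reports where this Python looks for CA certificates and whether those locations exist.

```python
import os
import ssl


def describe_ca_paths():
    """Return where this Python looks for CA certificates and whether they exist."""
    paths = ssl.get_default_verify_paths()
    return {
        "cafile": paths.cafile,
        "cafile_exists": bool(paths.cafile) and os.path.exists(paths.cafile),
        "capath": paths.capath,
        "capath_exists": bool(paths.capath) and os.path.isdir(paths.capath),
        "openssl_cafile": paths.openssl_cafile,
    }


print(describe_ca_paths())
```

If neither location exists, installing the certificates for that Python installation is the likely next step.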

Can't launch on macOS 10.14

If I launch it as described in the README, I receive:
/Library/Frameworks/Python.framework/Versions/3.7/Resources/Python.app/Contents/MacOS/Python: can't open file 'medium_to_ghost.py': [Errno 2] No such file or directory

If I clone the project, navigate to the folder, and run python3 ./medium_to_ghost/medium_to_ghost.py /path-to.zip, I receive:

Traceback (most recent call last):
  File "./medium_to_ghost/medium_to_ghost.py", line 3, in <module>
    from medium_to_ghost.medium_post_parser import convert_medium_post_to_ghost_json
  File "/Users/oleksiikryvonosov/upblog/medium_to_ghost/medium_to_ghost/medium_to_ghost.py", line 3, in <module>
    from medium_to_ghost.medium_post_parser import convert_medium_post_to_ghost_json
