mikhailklassen / mining-the-social-web-3rd-edition Goto Github PK

The official online compendium for Mining the Social Web, 3rd Edition (O'Reilly, 2018)

License: Other

Jupyter Notebook 95.07% Python 0.34% HTML 3.94% CSS 0.65%

mining-the-social-web-3rd-edition's Introduction

Mining the Social Web, 3rd Edition

The official code repository for Mining the Social Web, 3rd Edition (O'Reilly, 2019). The book is available from Amazon and Safari Books Online.

The notebooks folder of this repository contains the latest bug-fixed sample code used in the book chapters.

Quickstart

The easiest way to start playing with code right away is to use Binder. Binder is a service that takes a GitHub repository containing Jupyter Notebooks and spins up a cloud-based server to run them. You can start experimenting with the code without having to install anything on your machine. Click the badge above, or follow this link to get started right away.

NOTE: Binder will not save your files on its servers. During your next session, it will be a completely fresh instantiation of this repository. If you need a more persistent solution, consider running the code on your own machine.

Getting started on your own machine using Docker

Install Docker
Install repo2docker: pip install jupyter-repo2docker
From the command line:

repo2docker https://github.com/mikhailklassen/Mining-the-Social-Web-3rd-Edition

This will create a Docker container from the repository directly. It takes a while to finish building the container, but once it's done, you will see a URL printed to screen. Copy and paste the URL into your browser.

A longer set of instructions can be found here.

Getting started on your own machine from source

If you are familiar with git and have a git client installed on your machine, simply clone the repository to your own machine. However, it is up to you to install all the dependencies for the repository. The necessary Python libraries are detailed in the requirements.txt file. The other requirements are detailed in the Requirements section below.

If you prefer not to use a git client, you can instead download a zip archive directly from GitHub. The only disadvantage of this approach is that in order to synchronize your copy of the code with any future bug fixes, you will need to download the entire repository again. You are still responsible for installing any dependencies yourself.

Install all the prerequisites using pip:

pip install -r requirements.txt

Once you're done, step into the notebooks directory and launch the Jupyter notebook server:

jupyter notebook

Side note on MongoDB

If you wish to complete all the examples in Chapter 9, you will need to install MongoDB. We do not provide support on how to do this. This is for more advanced users and is really only relevant to a few examples in Chapter 9.

Contributing

There are several ways in which you can contribute to the project. If you discover a bug in any of the code, the first thing to do is to create a new issue under the Issues tab of this repository. If you are a developer and would like to contribute a bug fix, please feel free to fork the repository and submit a pull request.

The code is provided "as-is" and we make no guarantees that it is bug-free. Keep in mind that we access the APIs of various social media platforms and their APIs are subject to change. Since the start of this project, various social media platforms have tightened the permissions on their platform. Getting full use out of all the code in this book may require submitting an application the social media platform of your choice for approval. Despite these restrictions, we hope that the code still provides plenty of flexibility and opportunities to go deeper.

mining-the-social-web-3rd-edition's People

Contributors

Stargazers

Watchers

Forkers

kqsmea8 gser billchiu-tw mervesarac sqliam afzalupal alhadbarbadikar dbbevan louismoura pegayus iss-lab anurag3 jacoblgoodman billyotieno nubiofs djoguns marcrichardklenk john2912 italoadler nick-chervov krissdap ambica777 stjordanis idwaker bkawan mobilenjoy sandy1811 lzluan echoxiong chissycode davemee saroshkhan avianaglobal mr-reddy doolingdavid marcosint stonejia kma86 m4ld1to singhsanjee wtiwana bys83 tkunwar aymar73 mrquestionmark toddmath king-qamata alexander-resources rmanca batermj bitsofsteve giennely wenchenli kenseii sunnybaison kamaha12000 azurah jbdatascience sling428 m4tt30rru njisrawi mawhy davi16sm bsraya jinyuc kyuhwas bigdatasciencegroup knut0815 expertmaksud chenjiayu0808 hedgehogtw rickykiet83 vivekgoyanka wahyuak blablaper anaivatco seancarverphd angelo337 mahamega noxiguru dushansilva venkateshkunduru123 karanamvijaykumar rajeshoo7 rudrigosouza mfranzo leodenale bentosilva aashishpt qwshy xanadu31 rolando7 lixx11 deeplaunch joeyhou wilberhdez destiniwright bhavikarao xingkailiu schrominger

mining-the-social-web-3rd-edition's Issues

Binder link is not working

When I follow your binder link I get the following error:
Could not resolve ref for gh:mikhailklassen/Mining-the-Social-Web-3rd-Edition/binder. Double check your URL.

Clicking on the page failed with the following:
ERROR: Cannot uninstall 'terminado'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.
Removing intermediate container 25bb4c539125
The command '/bin/sh -c ${KERNEL_PYTHON_PREFIX}/bin/pip install --no-cache-dir -r "binder/requirements.txt"' returned anon-zero code: 1

can't build it on windows 10 and linux 16.04

Error:

/opt/conda/bin/npm pack '@jupyter-widgets/jupyterlab-manager@^0.35'
jupyter-widgets-jupyterlab-manager-0.35.0.tgz

Errored, use --debug for full output:
ValueError:
"@jupyter-widgets/[email protected]" is not compatible with the current JupyterLab
Conflicting Dependencies:
JupyterLab Extension Package

=0.19.1 <0.20.0 >=0.16.0 <0.17.0 @jupyterlab/application
=2.2.1 <3.0.0 >=1.1.0 <2.0.0 @jupyterlab/coreutils
=0.19.2 <0.20.0 >=0.16.0 <0.17.0 @jupyterlab/notebook
=0.19.1 <0.20.0 >=0.16.0 <0.17.0 @jupyterlab/rendermime
=3.2.1 <4.0.0 >=2.0.0 <3.0.0 @jupyterlab/services

Proposed solution:
Instead of
jupyter labextension install @jupyter-widgets/jupyterlab-manager@^0.35 &&
jupyter labextension install jupyterlab_bokeh@^0.5.0 && \

use:
jupyter labextension install @jupyter-widgets/jupyterlab-manager &&
jupyter labextension install jupyterlab_bokeh && \

Unable to use token to sign into Jupyter I get Invalid credentials

In using the docker image, I am able to load the URL, but the token returns:
'Invalid credentials'

I'm using Windows 10 Pro Docker version 18.09.1, build 4c52b90

AttributeError: module 'urllib' has no attribute 'urlencode'

I am using Docker to run Jupyter Notebook on Mac OS Mojave 10.14.13 and "Appendix B - OAuth Primer" section "Example 2. Facebook OAuth 2.0 Flow" can't compile and generate the following error:

AttributeError Traceback (most recent call last)
in
62 )
63
---> 64 oauth_url = 'https://facebook.com/dialog/oauth?' + urllib.urlencode(args)
65
66 Timer(1, lambda: display(JS("window.open('%s')" % oauth_url))).start()

AttributeError: module 'urllib' has no attribute 'urlencode'

To resolve the issue I have found help on this Stack Overflow link https://stackoverflow.com/questions/28906859/module-has-no-attribute-urlencode and their proposed solution is :

urllib has been split up in Python 3. The urllib.urlencode() function is now urllib.parse.urlencode(), and the urllib.urlopen() function is now urllib.request.urlopen().

By changing line 64 oauth_url = 'https://facebook.com/dialog/oauth?' + urllib.urlencode(args) to oauth_url = 'https://facebook.com/dialog/oauth?' + urllib.parse.urlencode(args) the error goes away.

FileNotFoundError: [Errno 2] No such file or directory: 'resources/ch02-facebook/access_token.txt'

FileNotFoundError Traceback (most recent call last)
in
69 webserver.run(host='0.0.0.0')
70
---> 71 access_token = open(OAUTH_FILE).read()
72
73 print(access_token)

FileNotFoundError: [Errno 2] No such file or directory: 'resources/ch02-facebook/access_token.txt'

This error is generating the following error on localhost:5000

Internal Server Error
The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.

Permission Denied

I'm unable to clone the repo from either Windows 10 or Unbuntu CLI. Any thoughts?

404 : Not Found

I am using Docker on Mac OS Mojave 10.14.13 and "Appendix C - Python & Jupyter Notebook Tips" section "Serving Static Content" can't compile and generate the following error:

404 : Not Found
You are requesting a page that does not exist!

Error: string indices must be integers in function below

for post in recent_posts['data']:
url = post['images']['low_resolution']['url']
label_image(URL=url)

Drive has not been shared err msg

I followed the instruction of installing docker for windows and on the 5th bullet - navigating to the cloned directory - i wrote docker compose up - after 15 min of setup and installation i've got the following error message:

ERROR: for mtsw Cannot create container for service mtsw: b'Drive has not been shared'
ERROR: Encountered errors while bringing up the project.

is there any chance someone can assist here? thanks
Kobi

Invalid Scopes: friends_likes

I am using Docker to run Jupyter Notebook on Mac OS Mojave 10.14.13 and "Appendix B - OAuth Primer" section "Example 2. Facebook OAuth 2.0 Flow" have Facebook generating the following error after having resolved issue #15 :

Invalid Scopes: friends_likes. This message is only shown to developers. Users of your app will ignore these permissions if present. Please read the documentation for valid permissions at: https://developers.facebook.com/docs/facebook-login/permissions

To resolve the issue I change EXTENDED_PERMS = ['user_likes', 'friends_likes'] to EXTENDED_PERMS = ['user_likes' by removing the friends_likes permission.

"Invalid scope field(s): public_content" while generating Instagram Code

Hello
In Chapter 3 the first block in the jupiter notebook gives me an URL. And when I click it, I get the error, instead of the code i need:
error_type | "OAuthException"
code | 400
error_message | "Invalid scope field(s): public_content"
I updated the Strings for variables
CLIENT_ID
CLIENT_SECRET
REDIRECT_URI

Please have a look at this, since we need this for our university class. Thank you!
Best regards,
Nico

Yahoo Link no longer exist

In chapter 1 - mining the twitter, there is a reference to check the link of Yahoo!Where On Earth ID, but it's no longer exists. could you replace a new link? or it doesn't work the ID?

The Yahoo! Where On Earth ID for the entire world is 1.

See https://dev.twitter.com/docs/api/1.1/get/trends/place and

http://developer.yahoo.com/geo/geoplanet/

Opening

In Windows PowerShell when past “docker pull mikhailklassen/mining-the-social-web-3rd-edition:latest,” shown in Step 3, screan shows, “Status: Image is up to date for mikhailklassen/mining-the-social-web-3rd-edition:latest.” I then navigate, “cd .\Desktop\” >> “ls”. When I try opening, “Mining-the-Social-Web-3rd-Edition-master (1)” the computer displays message
Set-Location : A positional parameter cannot be found that accepts argument '1'.
At line:1 char:1

cd .\Mining-the-Social-Web-3rd-Edition-master (1)\

  + CategoryInfo          : InvalidArgument: (:) [Set-Location], ParameterBindingException
  + FullyQualifiedErrorId : PositionalParameterNotFound,Microsoft.PowerShell.Commands.SetLocationCommand

I can then not copy any steps you did in the GitHub video. Mining the Social Web 3rd Edition is taking up lots of space on my computer already. What is going on? Is there another tutorial I should watch?

Appendix A : missing docker-compose.yml

The Jupyter notebook for Appendix A indicates that after the git clone, there should be a file "docker-compose.yml" that can be used in the step "docker-compose up". However this file is missing in the 3rd edition.

docker_compose.yml

After cloning the repo i could not find the docker_compose.yml file in the folder. Please i need he;o with this

Error with NLTK

I try Exploring text data with NLTK from https://github.com/mikhailklassen/Mining-the-Social-Web-3rd-Edition/blob/master/notebooks/Chapter%205%20-%20Mining%20Text%20Files.ipynb
And I constantly get an error. What could be the reason? Thanks.
`---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in
22
23 # Frequent collocations in the text (usually meaningful phrases)
---> 24 text.collocations()
25
26 # Frequency analysis for words of interest

~\Anaconda3\lib\site-packages\nltk\text.py in collocations(self, num, window_size)
442
443 collocation_strings = [
--> 444 w1 + " " + w2 for w1, w2 in self.collocation_list(num, window_size)
445 ]
446 print(tokenwrap(collocation_strings, separator="; "))

~\Anaconda3\lib\site-packages\nltk\text.py in (.0)
442
443 collocation_strings = [
--> 444 w1 + " " + w2 for w1, w2 in self.collocation_list(num, window_size)
445 ]
446 print(tokenwrap(collocation_strings, separator="; "))

ValueError: too many values to unpack (expected 2)`

Unable to get the docker file working.

I've tried both options provided to install the docker, but neither works for me. In each instance I get an error message:

Option 1 fails right away with this:
Error response from daemon: Get https://registry-1.docker.io/v2/mikhailklassen/mining-the-social-web-3rd-edition/manifests/latest: unauthorized: incorrect username or password

Building the docker myself, this is how far I get:
Sending build context to Docker daemon 85.71MB
Step 1/25 : FROM jupyter/minimal-notebook
Get https://registry-1.docker.io/v2/jupyter/minimal-notebook/manifests/latest: unauthorized: incorrect username or password

I have the following Docker version: Docker version 18.09.2, build 6247962

Please help! I'm so excited to get into this book and am stuck at step 0

Can't create the docker image with repository2docker on mac

I had some problems with the following command:

repo2docker https://github.com/mikhailklassen/Mining-the-Social-Web-3rd-Edition

It stuck at step 18 (copy some files) in the image build job and filled the docker harddisk.
The workaround is to create the image directly in the official repo2docker image:

docker run -v /var/run/docker.sock:/var/run/docker.sock jupyter/repo2docker:master repo2docker --user-id 88 --user-name jupyter https://github.com/mikhailklassen/Mining-the-Social-Web-3rd-Edition

Then you can start the image with docker run <container-id> on your mac.

Is it possible to update the instructions to build the image directly in the repo2docker container? So you don't need to install pip or jupyter-repo2docker anymore

installation of repo2docker fails due to "terminado"

Hi!

On my (Oracle Virtual Box virtual machine) virtual ubuntu 18.04 I was unable to pull the repository with
jupyter-repo2docker https://github.com/mikhailklassen/Mining-the-Social-Web-3rd-Edition
or
repo2docker https://github.com/mikhailklassen/Mining-the-Social-Web-3rd-Edition

I get the following error:
Attempting uninstall: terminado Found existing installation: terminado 0.8.3
ERROR: Cannot uninstall 'terminado'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.

It does neither work when installing it just in the system plain, in a conda or in a virtualenv environment.
for any help how to install the repo to go on with the books exercises I'd be really grateful.
Thank you.
best MH

script instagram not working

hay, MR.
I just bought your book on Google Book, I am very interested in your book, but there are some obstacles in running a Python script, can you help with the query constraints?

docer build failure with SpecsConfigurationConflictError

My machine (Mac 10.14.4) have Python 2.7.15, no conda. Do you know why this happen?

SpecsConfigurationConflictError: Requested specs conflict with configured specs.
requested specs:
- h5py=2.7
- matplotlib=2.1
- pandas=0.22
- scikit-image=0.13
- scipy=1.0
- statsmodels=0.8
pinned specs:
- python=3.7
Use 'conda config --show-sources' to look for 'pinned_specs' and 'track_features'
configuration parameters. Pinned specs may also be defined in the file
/opt/conda/conda-meta/pinned.

Problem Client_ID: Please help!

docker-compose.yml not found

Problem code - changed API - solution?

Hello!

I tried to follow the instructions in my copy of "Mining the Social Web" (3rd edition) for mining Instagram, but soon noticed that I couldn't follow the mentioned steps. Aparently Instagram's API changed a few days ago (15/10/2019). I tried to go on and I went to their new platform (https://developers.facebook.com/products/instagram/) to set up my app: "My Apps" --> "Create App" --> name --> "Add Product" ---> "Instagram" --> "Set Up".

The problem is that I now have an App ID and an App Secret, but no Client ID or Secret and no "Redirect URL" either. The page shown in figure 3-1 of the book is no longer available. I thought maybe I could still run the notebook with the App ID and App Secret, but that did not work at all. When I clicked on the URL I simply received following error message: {"error_type": "OAuthException", "code": 400, "error_message": "Invalid Client ID"}. I then changed CLIENT to APP in the code, but that did not change the outcome.

Could somene with more experience tell me whether this issue can be solved or how I might still use this notebook with Instagram's new API?

Thank you in advance,

ldegreve

Chapter 4 cannot produce "linkedin_connections.json"

After successfully geocoding locations of linkedin connections, one cannot save the connections data into json. The reported error after running the cell is the following:

#Save the processed data
CONNECTIONS_DATA = 'linkedin_connections.json'

# Loop over contacts and update the location information to store the
# string address, also adding latitude and longitude information
def serialize_contacts(contacts, output_filename):
    for c in contacts:
        location = c['Location']
        if location != None:
            # Convert the location to a string for serialization
            c.update([('Location', location.address)])
            c.update([('Lat', location.latitude)])
            c.update([('Lon', location.longitude)])
    f = open(output_filename, 'w')
    f.write(json.dumps(contacts, indent=1))
    f.close()
    return
serialize_contacts(contacts, CONNECTIONS_DATA)

AttributeError Traceback (most recent call last)
in
16 f.close()
17 return
---> 18 serialize_contacts(contacts, CONNECTIONS_DATA)

in serialize_contacts(contacts, output_filename)
9 if location != None:
10 # Convert the location to a string for serialization
---> 11 c.update([('Location', location.address)])
12 c.update([('Lat', location.latitude)])
13 c.update([('Lon', location.longitude)])

AttributeError: 'str' object has no attribute 'address'

python-linkedin is deprecated and should use python-linkedin-v2

Hi all,

If you see an error like:

requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://api.linkedin.com/v1/people/~?format=json&oauth2_access_token=AQXqVw1bCm7yl11gxVi46pvvznAh4ZsN_i7I1O1VxPdoY75nIilzZ5h9nqsClHHXRckR-NLrn3ALOHQltR4rSncuBvqsj4tPiJ6yoTp4urG4Bxgi0_bS0nlOAR97r1SkYkqhDLi_U7zYtudQwjdfff12d4yqXdUa-R0Ew6eDrukzvFli3EX_axrEao9L6MudMSAndkPncoLncgmTBVhfIMwoy0FVvUaqliX82K3ySkfhfJIjktiHNhB1JF17ceVsBVK1H5ShmoJkh__lFe-PVeSV41YP2yhAIewKZUEKpDVXxmZTELJXVVbtOovMVRcMjz6npXgpvmZNYOvPO6eJNvsleXr5zw

You are using the v1 api, the author should just update the example oauth code base for the flask example to use the:
https://pypi.org/project/python-linkedin-v2/

can't build image on OSX Mojave

svn cache (/opt/conda/conda-bld/svn_cache)
Size: 0 B

Total: 0 B
Removing /opt/conda/conda-bld/src_cache
Removing /opt/conda/conda-bld/git_cache
Removing /opt/conda/conda-bld/hg_cache
Removing /opt/conda/conda-bld/svn_cache
Enabling notebook extension jupyter-js-widgets/extension...
- Validating: OK
Node v8.10.0
No other errors were seen in the log.

/opt/conda/bin/npm pack '@jupyter-widgets/jupyterlab-manager@^0.35'
jupyter-widgets-jupyterlab-manager-0.35.0.tgz

Errored, use --debug for full output:
ValueError:
"@jupyter-widgets/[email protected]" is not compatible with the current JupyterLab
Conflicting Dependencies:
JupyterLab Extension Package

=0.19.1 <0.20.0 >=0.16.0 <0.17.0 @jupyterlab/application
=2.2.1 <3.0.0 >=1.1.0 <2.0.0 @jupyterlab/coreutils
=0.19.2 <0.20.0 >=0.16.0 <0.17.0 @jupyterlab/notebook
=0.19.1 <0.20.0 >=0.16.0 <0.17.0 @jupyterlab/rendermime
=3.2.1 <4.0.0 >=2.0.0 <3.0.0 @jupyterlab/services
The command '/bin/sh -c conda install --quiet --yes 'blas==openblas' 'ipywidgets=7.2' 'pandas=0.22*' 'flask=0.12.2' 'numexpr=2.6*' 'matplotlib=2.1*' 'scipy=1.0*' 'seaborn=0.8*' 'scikit-learn=0.19*' 'scikit-image=0.13*' 'sympy=1.1*' 'cython=0.28*' 'patsy=0.5*' 'statsmodels=0.8*' 'cloudpickle=0.5*' 'dill=0.2*' 'numba=0.38*' 'bokeh=0.12*' 'sqlalchemy=1.2*' 'hdf5=1.10*' 'h5py=2.7*' 'vincent=0.4.' 'beautifulsoup4=4.6.' 'jpype1' 'protobuf=3.*' 'xlrd' && conda remove --quiet --yes --force qt pyqt && conda clean -tipsy && jupyter nbextension enable --py widgetsnbextension --sys-prefix && jupyter labextension install @jupyter-widgets/jupyterlab-manager@^0.35 && jupyter labextension install jupyterlab_bokeh@^0.5.0 && npm cache clean --force && rm -rf $CONDA_DIR/share/jupyter/lab/staging && rm -rf /home/$NB_USER/.cache/yarn && rm -rf /home/$NB_USER/.node-gyp && fix-permissions $CONDA_DIR && fix-permissions /home/$NB_USER' returned a non-zero code: 1

AttributeError: 'bytes' object has no attribute 'encode'

I am using Docker on Mac OS Mojave 10.14.13 and "Appendix C - Python & Jupyter Notebook Tips" section "Interact with the Virtual Machine without an SSH Client Using Python" has the following error :

AttributeError Traceback (most recent call last)
in
2
3 # Run a command just as you would in a terminal
----> 4 r = envoy.run('ps aux | grep jupyter') # show processes containing 'jupyter'
5
6 # Print its standard output

/opt/conda/lib/python3.6/site-packages/envoy/core.py in run(command, data, timeout, kill_timeout, env, cwd)
212 cmd = Command(c)
213 try:
--> 214 out, err = cmd.run(data, timeout, kill_timeout, env, cwd)
215 status_code = cmd.returncode
216 except OSError as e:

/opt/conda/lib/python3.6/site-packages/envoy/core.py in run(self, data, timeout, kill_timeout, env, cwd)
91 thread.join(timeout)
92 if self.exc:
---> 93 raise self.exc
94 if _is_alive(thread) :
95 _terminate_process(self.process)

/opt/conda/lib/python3.6/site-packages/envoy/core.py in target()
78 if sys.version_info[0] >= 3:
79 self.out, self.err = self.process.communicate(
---> 80 input = bytes(self.data, "UTF-8") if self.data else None
81 )
82 else:

/opt/conda/lib/python3.6/subprocess.py in communicate(self, input, timeout)
841
842 try:
--> 843 stdout, stderr = self._communicate(input, endtime, timeout)
844 finally:
845 self._communication_started = True

/opt/conda/lib/python3.6/subprocess.py in _communicate(self, input, endtime, orig_timeout)
1494 stderr = self._fileobj2output[self.stderr]
1495
-> 1496 self._save_input(input)
1497
1498 if self._input:

/opt/conda/lib/python3.6/subprocess.py in _save_input(self, input)
1570 if input is not None and (
1571 self.encoding or self.errors or self.universal_newlines):
-> 1572 self._input = self._input.encode(self.stdin.encoding,
1573 self.stdin.errors)
1574

AttributeError: 'bytes' object has no attribute 'encode'

Issues restarting container

I am using macOS Mojave, Version 10.14.6.

I created a container using repo2Docker.

After I shut down the Jupyter server and the container, how do I start the container again? According to https://towardsdatascience.com/docker-without-the-hassle-b98447caedd8, I should type the following in the terminal: docker run -p 12345:8888 <IMAGE ID> jupyter notebook --ip 0.0.0.0. When I run this command, I am asked to paste a URL in my browser, but I fail to connect using the provided URL.

Can't repo2docker the url

repo2docker https://github.com/mikhailklassen/Mining-the-Social-Web-3rd-Edition

Docker client initialization error: Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory')).
Check if docker is running on the host.

Is it the problem with my pc? I'm using mac

Appendix C examples - wrong paths

Some paths are broken in notebooks/_Appendix C - Python & Jupyter Notebook Tips.ipynb

Starting docker with this command (on Mac):

docker run --rm -p 8888:8888 -v "$PWD"/notebooks:/home/jovyan/notebooks mtsw3e:latest

This code throws FileNotFoundError: [Errno 2] No such file or directory: '/home/jovyan/notebooks/README.md':

shared_folder="/home/jovyan/notebooks"
README = os.path.join(shared_folder, "README.md")

Because README.md is at the root of the repository, but only "/notebooks/" are visible inside of a container.

The following documentation is not accurate as well:

Shared Folders

The Docker container maps the top level directory of your GitHub checkout (the directory containing README.md) on your host machine to its /home/jovyan/notebooks folder and automatically synchronizes files between the guest and host environments as an incredible convenience to you.

OAUTH_FILE

Hello,

I have followed all the steps to create a twitter developet account. However, I didn't understand what should I fill on oauth_file line ?

It appears It could not find directory

OAUTH_FILE = "/tmp/twitter_oauth\

intel i5 9600k Hyper-V N/A

Good day everyone!
Is it some way to use all information from book without using Docker? Just learned that my i5 9600K don't support hyper-v technology, so I'm unable to use pre-installed OS. Maybe you can share a short guide if it's possible. Thank you.

bash: line 9: man: command not found

I am using Docker on Mac OS Mojave 10.14.13 and "Appendix C - Python & Jupyter Notebook Tips" section "Bash Cell Magic" can't compile and generate the following error:

bash: line 9: man: command not found

NameError: name 'xrange' is not defined

"NameError: name 'xrange' is not defined" in section "Dictionary Comprehensions" in "Appendix C - Python & Jupyter Notebook Tips". The following code a_dict = { k : k*k for k in xrange(10) } should be changed to a_dict = { k : k*k for k in range(10) }

The twython library has not been installed

I am running the docker container on Windows 10 Professional. When I go through the directions, and am doing the Chapter 1 - Mining Twitter notebook, I get the error, "/opt/conda/lib/python3.6/site-packages/nltk/twitter/init.py:20: UserWarning: The twython library has not been installed. Some functionality from the twitter package will not be available.
warnings.warn("The twython library has not been installed. "

I attempted to install Downloading and Extracting Packages
blas-2.4 | 6 KB libgfortran-ng-7.3.0 | 1.3 MB python-3.6.7 pyjwt-1.7.1 | 17 KB pycurl-7.43.0.2 | 185 KB scipy-1.0.1 liblapacke-3.8.0 | 6 KB numpy-1.11.3 | 3.6 MB libssh2-1.8.1 | 242 KB libblas-3.8.0 | 6 KB blinker-1.4 | 13 KB cryptography-2.6.1 | 607 KB scikit-learn-0.19.1 | 5.2 MB requests-oauthlib-1. | 19 KB liblapack-3.8.0 | 6 KB libopenblas-0.2.20 | 8.8 MB oauthlib-3.0.1 | 82 KB openblas-0.3.5 twython-3.7.0 | 26 KB libcblas-3.8.0 | 6 KB Preparing transaction: done
Verifying transaction: failed the library with a !conda install twython, but that gave me the following error:
| ##################################### | 100%
| ##################################### | 100%
| 34.5 MB | ##################################### | 100%
| ##################################### | 100%
| ##################################### | 100%
| 17.8 MB | ##################################### | 100%
| ##################################### | 100%
| ##################################### | 100%
| ##################################### | 100%
| ##################################### | 100%
| ##################################### | 100%
| ##################################### | 100%
| ##################################### | 100%
| ##################################### | 100%
| ##################################### | 100%
| ##################################### | 100%
| ##################################### | 100%
| 15.8 MB | ##################################### | 100%
| ##################################### | 100%
| ##################################### | 100%

RemoveError: 'requests' is a dependency of conda and cannot be removed from
conda's operating environment.

I can create my own python environment and go that route, but was wondering if this issue had impacted other users. Thank you.

fatal: not a git repository

I am using Docker on Mac OS Mojave 10.14.13 and "Appendix C - Python & Jupyter Notebook Tips" section "Using Bash Cell Magic to Update Your Source Code" can't compile and generate the following error:

fatal: not a git repository (or any parent up to mount point /home/jovyan)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).

CalledProcessError Traceback (most recent call last)
in
----> 1 get_ipython().run_cell_magic('bash', '', 'ls ../\n\n# Displays the status of the local repository\ngit status\n\n# Execute "git pull" to perform an update\n')

/opt/conda/lib/python3.6/site-packages/IPython/core/interactiveshell.py in run_cell_magic(self, magic_name, line, cell)
2321 magic_arg_s = self.var_expand(line, stack_depth)
2322 with self.builtin_trap:
-> 2323 result = fn(magic_arg_s, cell)
2324 return result
2325

/opt/conda/lib/python3.6/site-packages/IPython/core/magics/script.py in named_script_magic(line, cell)
140 else:
141 line = script
--> 142 return self.shebang(line, cell)
143
144 # write a basic docstring:

</opt/conda/lib/python3.6/site-packages/decorator.py:decorator-gen-109> in shebang(self, line, cell)

/opt/conda/lib/python3.6/site-packages/IPython/core/magic.py in (f, *a, **k)
185 # but it's overkill for just that one bit of state.
186 def magic_deco(arg):
--> 187 call = lambda f, *a, **k: f(*a, **k)
188
189 if callable(arg):

/opt/conda/lib/python3.6/site-packages/IPython/core/magics/script.py in shebang(self, line, cell)
243 sys.stderr.flush()
244 if args.raise_error and p.returncode!=0:
--> 245 raise CalledProcessError(p.returncode, cell, output=out, stderr=err)
246
247 def _run_script(self, p, cell, to_close):

CalledProcessError: Command 'b'ls ../\n\n# Displays the status of the local repository\ngit status\n\n# Execute "git pull" to perform an update\n'' returned non-zero exit status 128.

Incorrect Chapter reference in LinkedIn Notebook

Awesome book and Jupyter notebooks!
There are a couple spots where the notebook for Chapter 4 on LinkedIn references the subdirectory as ch03-linked - though it is actually ch04-linkedin. If I knew how I'd create a pull request, sorry not to be more helpful!

unable to evaluate symlinks in Dockerfile path: <path...> : no such file or directory

< user1@arch:mtswdir > $ sudo systemctl enable docker.service
[sudo] senha para user1:
< user1@arch:mtswdir > $ sudo systemctl start docker.service
< user1@arch:mtswdir > $ sudo docker pull mikhailklassen/mining-the-social-web-3rd-edition:latest
latest: Pulling from mikhailklassen/mining-the-social-web-3rd-edition
Digest: sha256:54fe654add76497888f987ce36f4cd257b0163aa310ea0f6a0beb68c85b01344
Status: Image is up to date for mikhailklassen/mining-the-social-web-3rd-edition:latest
< user1@arch:mtswdir > $ sudo docker tag mikhailklassen/mining-the-social-web-3rd-edition:latest mtsw3e:latest
< user1@arch:mtswdir > $ sudo docker build -t mtsw3e .
unable to prepare context: unable to evaluate symlinks in Dockerfile path: lstat /mnt/sda2/Development/CODIGOS/mtswdir/Dockerfile: no such file or directory

401 - Unauthorized

I am using Docker to run Jupyter Notebook on Mac OS Mojave 10.14.13. I had an issue with "Chapter 1 - Mining Twitter" section "Retrieving trends" generating the following errors :

The Twitter API attempts to returned HTTP status codes :

401 - Unauthorized

Twitter API error messages :

{"errors":[{"message":"Could not authenticate you","code":32}]}

To resolve the issue I changed the authorization section "Authorizing an application to access Twitter account data" with the code from Appendix B section "Example 1. Twitter OAuth 1.0a Flow".

Docker Install

I have never used Python, C++, or HTML. The YouTube video does not show what is written. After copy and pasting what was needed in Appendix A for docker into Windows PowerShell, I could not see what Mikhail typed.

repo2docker quits during installation of feedparser: "error in feedparser setup command: use_2to3 is invalid."

I did some research and it appears that the problem is the use of "setuptools" with a version>57.5.

But I have no clue how I can tell the script that is running to use an older version of setuptools...

Docker compose up failure: Py 3.7 or deprecated packages?

Hi,
I did manage to run the Jupyter notebook a few weeks ago. However, on upgrading to Python 3.7 and Anaconda I am getting the following error on docker-compose up

I have gone through both Stack overflow and closed issues on Github, but could not find any specific solution. Seems to be related to either deprecated packages and/or Py 3.7 not supported. Any help appreciated.

Thanks,

SSL: CERTIFICATE_VERIFY_FAILED

I followed all steps in the book till example 1-1. I created my keys and registered an app on the website. The keys are working because I tested with another older python program. (with the twurl module disabling the ssl certification checks)
The error here is 'SSL: CERTIFICATE_VERIFY_FAILED'
Is there an extra step I need to fro to get it working? I am working with python 3.8 and I have the Jupiyter notebooks installed.
The first part of the code is ok. I created a twitter_api object successfully
twitter_api = twitter.Twitter(auth=auth)
but this fails:
world_trends = twitter_api.trends.place(_id=1)
with the error
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1108)>

Anybody with the same problem?

Unable to create Docker virtual machine - ERROR: Cannot uninstall 'terminado'.

I am unable to create the Docker virtual env as I keep encountering the error: Cannot uninstall 'terminate'. What should I do?

Attempting uninstall: terminado
Found existing installation: terminado 0.8.3
ERROR: Cannot uninstall 'terminado'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.
Removing intermediate container 59b2cd2c6875
The command '/bin/sh -c ${KERNEL_PYTHON_PREFIX}/bin/pip install --no-cache-dir -r "binder/requirements.txt"' returned a non-zero code: 1(base)

GraphAPIError: (#100) Pages Public Content Access requires either app secret proof or an app token

Thanks for publishing good book. I am write now working with Mining facebook notebook. When I search something facebook, i get results as per the page. For instance, i searched Mining the social web as per your example, I got results in facebook page.

Now when i do the same thing using facebook-sdk, i get no results. Do i need any permissions to access search results. I have posted in the forum https://stackoverflow.com/questions/59113788/fetch-search-results-using-facebook-sdk-in-python

In addition i get the below error GraphAPIError: (#100) Pages Public Content Access requires either app secret proof or an app token when i try to run the cell that executes the search "Mining the social Web "

Sometimes cannot find Jupyter notebook server on port 8888 on Windows 10

Some configurations appear to have this issue. This GitHub issue is a note to myself to follow up. See this StackOverflow discussion:

https://stackoverflow.com/questions/40678940/cannot-find-jupyter-notebook-server-on-port-8888-windows-10

Facebook/Meta Open Graph is now Deprecated

I realize for some they may be wondering why the IMDB example may not work for them (if it does do explain how?), and I can largely attribute it after experimenting and researching (Facebook left almost no explanation), that the Facebook Open Graph API implementation is now deprecated, source:

https://forum.unity.com/threads/facebook-has-deprecated-the-open-graph-objects-and-stories-what-are-alternatives.505757/

I have largely passed over this in the chapter but if anyone has any updates I would appreciate it.

Binder error

The quickstart Binder link https://mybinder.org/v2/gh/mikhailklassen/Mining-the-Social-Web-3rd-Edition/binder?filepath=notebooks%2F results in an error on the Binder website: Could not resolve ref for gh:mikhailklassen/Mining-the-Social-Web-3rd-Edition/binder. Double check your URL.

FileNotFoundError: [Errno 2] No such file or directory: 'resources/ch03-linkedin/linkedin.authorization_code'

I am using Docker to run Jupyter Notebook on Mac OS Mojave 10.14.13 and "Appendix B - OAuth Primer" section "Example 3. Using LinkedIn OAuth credentials to receive an access token an authorize an application" can't compile and generate the following error :

FileNotFoundError Traceback (most recent call last)
in
56 # seems to need full automation because the authorization code expires very quickly.
57
---> 58 auth.authorization_code = open(OAUTH_FILE).read()
59 auth.get_access_token()
60

FileNotFoundError: [Errno 2] No such file or directory: 'resources/ch03-linkedin/linkedin.authorization_code'