machine-learning's Introduction

Deprecated Repository

This repository is deprecated. Currently enrolled learners, if any, can:

machine-learning

Content for Udacity's Machine Learning curriculum, which includes projects and their descriptions.

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. Please refer to Udacity Terms of Service for further information.

machine-learning's Issues

Various typos on Smartcab Project

File smartcab.ipynb

Question 3

Given that the agent is driving randomly, does the rate of reliabilty make sense?
Should be
Given that the agent is driving randomly, does the rate of reliability make sense?

Question 5

Given what you know about the evironment and how it is simulated,
Should be
Given what you know about the environment and how it is simulated,

Improve Q-Learning Driving Agent

(the default threshold is 0.01)
Should be
(the default threshold is 0.05) - as written in line 111 of smartcab/simulator.py

When improving on your Q-Learning implementation, consider the impliciations it creates
Should be
When improving on your Q-Learning implementation, consider the implications it creates

Optional: Future Rewards - Discount Factor gamma

Including future rewards in the algorithm is used to aid in propogating positive rewards
Should be
Including future rewards in the algorithm is used to aid in propagating positive rewards

File smartcab/agent.py

def learn()

line 112
receives an award
should be
receives a reward

Numpy and Pandas Tutorial: Submitting answers to some coding quizzes results in an import error

Submitting Answer results in import error, because it imports the wrong function.

For example, in Quiz 11, Average Bronze Medals:

Traceback (most recent call last):
  File "vm_main.py", line 33, in <module>
    import main
  File "/tmp/vmuser_bjdwrarirz/main.py", line 2, in <module>
    import aiMain
  File "/tmp/vmuser_bjdwrarirz/aiMain.py", line 2, in <module>
    from student import avg_medal_count as student_code
ImportError: cannot import name avg_medal_count

However, the function that is provided is named avg_bronze_medal_count

Similar problem in Quiz 14, Olympics Medal Points

Typo for Epsilon in Simulator

In the simulator, line 298 reads:
print "espilon = {:.4f}; alpha = {:.4f}".format(a.epsilon, a.alpha)
but it should be:
print "epsilon = {:.4f}; alpha = {:.4f}".format(a.epsilon, a.alpha)

Epsilon was misspelled.

Error in tester code for robot motion planning

I think there's an error in the tester for the provided capstone project (robot motion planning). Line 105 (checking if the robot has reached its goal) reads:

if robot_pos['location'][0] in goal_bounds and robot_pos['location'][1] in goal_bounds:

Which means position (6, 6) will evaluate to True even if the goal is (6, 7), because both elements in the location tuple are found in the goal_bounds tuple. I believe the line should be simply:

if robot_pos['location'] == goal_bounds:

Is it alright if I test this and make a pull request to fix the issue?
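
The difference between the two checks can be sketched as follows. This is a minimal illustration that assumes, as the issue describes, that the goal is the single cell (6, 7) and that goal_bounds is a tuple holding those two coordinates; the function names are hypothetical and not taken from the tester.

```python
# Assumption (from the issue text): the goal is one cell, stored as a tuple.
goal_bounds = (6, 7)


def reached_goal_membership(location, bounds):
    """Check as quoted in the issue: each coordinate is tested with `in`."""
    return location[0] in bounds and location[1] in bounds


def reached_goal_equality(location, bounds):
    """Check proposed in the issue: the location must match the goal exactly."""
    return tuple(location) == tuple(bounds)


for location in [(6, 7), (6, 6), (7, 6), (5, 7)]:
    print(location,
          reached_goal_membership(location, goal_bounds),
          reached_goal_equality(location, goal_bounds))
```

If goal_bounds instead lists the row and column indices of a square goal area, the membership test accepts every cell of that square, which may be the intended behaviour; in that case the equality check would be too strict.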

Naive Bayes Metrics

The metrics in the Naive Bayes example are not reported properly, because precision_score, recall_score and f1_score assume [1, 0] as the positive and negative labels. The code should pass the positive label explicitly via the parameter pos_label="spam".
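
A minimal sketch of the suggested change, assuming the labels are the strings "spam" and "ham" (the toy label lists below are made up for illustration and are not the tutorial's data):

```python
from sklearn.metrics import f1_score, precision_score, recall_score

# Hypothetical toy labels; the real notebook derives these from its dataset.
y_true = ["spam", "ham", "spam", "ham", "spam"]
y_pred = ["spam", "spam", "spam", "ham", "ham"]

# pos_label tells each metric which class counts as "positive".
precision = precision_score(y_true, y_pred, pos_label="spam")
recall = recall_score(y_true, y_pred, pos_label="spam")
f1 = f1_score(y_true, y_pred, pos_label="spam")

print("Precision:", precision)
print("Recall:", recall)
print("F1:", f1)
```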

Boston Housing Error

Getting the following error in the complexity curves section when running the code.

/usr/local/lib/python2.7/site-packages/sklearn/utils/validation.py:395: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
  DeprecationWarning)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-9-56bd0b56ae45> in <module>()
----> 1 vs.ModelComplexity(X_train, y_train)

/Users/emi862/Workspaces/Projects/Machine-Learning/udacity-machine-learning-projects/boston_housing/visuals.pyc in ModelComplexity(X, y)
     80     # Calculate the training and testing scores
     81     train_scores, test_scores = curves.validation_curve(DecisionTreeRegressor(), X, y, \
---> 82         param_name = "max_depth", param_range = max_depth, cv = cv, scoring = 'r2')
     83 
     84     # Find the mean and standard deviation for smoothing

/usr/local/lib/python2.7/site-packages/sklearn/learning_curve.pyc in validation_curve(estimator, X, y, param_name, param_range, cv, scoring, n_jobs, pre_dispatch, verbose)
    352         estimator, X, y, scorer, train, test, verbose,
    353         parameters={param_name: v}, fit_params=None, return_train_score=True)
--> 354         for train, test in cv for v in param_range)
    355 
    356     out = np.asarray(out)[:, :2]

/usr/local/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in __call__(self, iterable)
    756             # was dispatched. In particular this covers the edge
    757             # case of Parallel used with an exhausted iterator.
--> 758             while self.dispatch_one_batch(iterator):
    759                 self._iterating = True
    760             else:

/usr/local/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in dispatch_one_batch(self, iterator)
    606                 return False
    607             else:
--> 608                 self._dispatch(tasks)
    609                 return True
    610 

/usr/local/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in _dispatch(self, batch)
    569         dispatch_timestamp = time.time()
    570         cb = BatchCompletionCallBack(dispatch_timestamp, len(batch), self)
--> 571         job = self._backend.apply_async(batch, callback=cb)
    572         self._jobs.append(job)
    573 

/usr/local/lib/python2.7/site-packages/sklearn/externals/joblib/_parallel_backends.pyc in apply_async(self, func, callback)
    107     def apply_async(self, func, callback=None):
    108         """Schedule a func to be run"""
--> 109         result = ImmediateResult(func)
    110         if callback:
    111             callback(result)

/usr/local/lib/python2.7/site-packages/sklearn/externals/joblib/_parallel_backends.pyc in __init__(self, batch)
    324         # Don't delay the application, to avoid keeping the input
    325         # arguments in memory
--> 326         self.results = batch()
    327 
    328     def get(self):

/usr/local/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in __call__(self)
    129 
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
    132 
    133     def __len__(self):

/usr/local/lib/python2.7/site-packages/sklearn/cross_validation.pyc in _fit_and_score(estimator, X, y, scorer, train, test, verbose, parameters, fit_params, return_train_score, return_parameters, error_score)
   1663             estimator.fit(X_train, **fit_params)
   1664         else:
-> 1665             estimator.fit(X_train, y_train, **fit_params)
   1666 
   1667     except Exception as e:

/usr/local/lib/python2.7/site-packages/sklearn/tree/tree.pyc in fit(self, X, y, sample_weight, check_input, X_idx_sorted)
   1027             sample_weight=sample_weight,
   1028             check_input=check_input,
-> 1029             X_idx_sorted=X_idx_sorted)
   1030         return self
   1031 

/usr/local/lib/python2.7/site-packages/sklearn/tree/tree.pyc in fit(self, X, y, sample_weight, check_input, X_idx_sorted)
    238         if len(y) != n_samples:
    239             raise ValueError("Number of labels=%d does not match "
--> 240                              "number of samples=%d" % (len(y), n_samples))
    241         if not 0 <= self.min_weight_fraction_leaf <= 0.5:
    242             raise ValueError("min_weight_fraction_leaf must in [0, 0.5]")

ValueError: Number of labels=312 does not match number of samples=1

Unable to open project using Jupyter - see error

Here is the error that pops up when trying to open the file "titanic_survival_exploration.ipynb" from the browser session in Jupyter:

Unreadable Notebook: C:\Users\magnus\Google Drive\Machine Learning\Udacity NanoDegree\Projects\02 - Titanic\titanic_survival_exploration.ipynb NotJSONError("Notebook does not appear to be JSON: u'\n\n\n\n\n\n\n<html lan...",)

error: boston_housing

I'm getting the following error. Could you please let me know what is missing here?

(Screenshot "capture" of the error; image not included.)

sklearn.cross_validation deprecated in favor of the model_selection module

Full trace:

anaconda3/lib/python3.6/site-packages/sklearn/cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
  "This module will be removed in 0.20.", DeprecationWarning)

This module is used in section 3.1
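
One way to keep a notebook working across versions is to try the newer module first and fall back to the old one. The sketch below uses train_test_split as an example; which function section 3.1 actually imports is an assumption here, and the data is made up.

```python
# Prefer the newer module; fall back for scikit-learn releases before 0.18.
try:
    from sklearn.model_selection import train_test_split
except ImportError:
    from sklearn.cross_validation import train_test_split

# Hypothetical toy data, only to show that the call is unchanged.
X = [[i] for i in range(10)]
y = list(range(10))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
print(len(X_train), len(X_test))
```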

Spam example - maketrans

Hi,
In the spam filter example, the use of maketrans raises an error.
First: type object 'str' has no attribute 'maketrans'
After I changed str to string, the error became: Maketrans() takes exactly 2 arguments

Following is the code in the sample.

sans_punctuation_documents = []
import string

for i in lower_case_documents:
    sans_punctuation_documents.append(i.translate(str.maketrans('', '', string.punctuation)))
print(sans_punctuation_documents)
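
The first error suggests the notebook is being run under Python 2, where str has no maketrans method. A minimal sketch of a helper that works on both interpreters is shown below; strip_punctuation is a hypothetical name, and the sample documents are made up.

```python
import string


def strip_punctuation(text):
    """Remove ASCII punctuation from text on Python 3 or Python 2."""
    try:
        # Python 3: build a translation table that deletes punctuation.
        table = str.maketrans('', '', string.punctuation)
        return text.translate(table)
    except AttributeError:
        # Python 2 byte strings: translate(table, deletechars).
        return text.translate(None, string.punctuation)


lower_case_documents = ['hello, how are you!', 'win money, win from home.']
sans_punctuation_documents = [strip_punctuation(d) for d in lower_case_documents]
print(sans_punctuation_documents)
```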

Typos in finding_donors project Evaluating Model Performance Section

In section Evaluating Model Performance, subsection Metrics and the Naive Predictor there are some typos.

  1. The pseudo-company CharityML is referred to as UdacityML.
  2. "would is appropriate." should read "would be appropriate."
  3. "Supverised Learning Models" should read "Supervised Learning Models".
  4. "Note: Dependent on which algorithm you chose," should read "Note: Depending on which algorithm you chose,".

testing_data is not defined

  • "testing_data" variable is not defined.

At machine-learning/projects/smartcab/visuals.py file,
at calculate_safety function, line 37,

if minor >= len(testing_data)/2:

"testing_data" variable is not defined.

Image-classification One Hot Encoding test passing with wrong solution

Hello,
I found that the provided unit test for one_hot_encode is passing with following code:

def one_hot_encode(x):
    """
    One hot encode a list of sample labels. Return a one-hot encoded vector for each label.
    : x: List of sample Labels
    : return: Numpy array of one-hot encoded labels
    """
    # TODO: Implement Function
    y = np.zeros((len(x), 10))
    return y

It seems that the third test: assert np.array_equal(enc_labels, new_enc_labels) is always passing.
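
A stricter check could verify the content of the encoding rather than only comparing two calls with each other. The following is a rough sketch with hypothetical helper names, assuming 10 classes labelled 0 to 9; it is not the project's actual unit test.

```python
import numpy as np


def one_hot_encode(x, n_classes=10):
    """Reference encoding: row i has a single 1 in column x[i]."""
    x = np.asarray(x)
    encoded = np.zeros((len(x), n_classes))
    encoded[np.arange(len(x)), x] = 1
    return encoded


def is_valid_one_hot(encoded, labels, n_classes=10):
    """Return True only if encoded is a correct one-hot encoding of labels."""
    encoded = np.asarray(encoded)
    labels = np.asarray(labels)
    if encoded.shape != (len(labels), n_classes):
        return False
    only_zeros_and_ones = np.all((encoded == 0) | (encoded == 1))
    one_per_row = np.all(encoded.sum(axis=1) == 1)
    right_column = np.array_equal(encoded.argmax(axis=1), labels)
    return bool(only_zeros_and_ones and one_per_row and right_column)


labels = [0, 3, 9, 3, 1]
print(is_valid_one_hot(one_hot_encode(labels), labels))       # correct encoding
print(is_valid_one_hot(np.zeros((len(labels), 10)), labels))  # all-zeros solution
```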

Boston housing project's "fitting a model" code is broken

Commit 890eb59 changed the n_iter argument to n_splits in the ShuffleSplit class constructor. The scikit-learn version currently recommended throughout the course is 0.17.x, which does not have n_splits yet. Since the course materials also recommend always keeping an up-to-date revision of this repository, it contains broken code out of the box. Moreover, the ShuffleSplit constructor in scikit-learn 0.18 has a different signature altogether, so this change will not work there either.

Please consider reverting the mentioned commit. Thanks!
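
For comparison, here is a minimal sketch of the newer interface, assuming scikit-learn 0.18 or later. In the older sklearn.cross_validation module the constructor takes the number of samples as its first argument together with n_iter, whereas sklearn.model_selection.ShuffleSplit takes n_splits and receives the data later through split(). The array below is made up for illustration.

```python
import numpy as np
from sklearn.model_selection import ShuffleSplit

# Hypothetical data: 10 samples with 2 features each.
X = np.arange(20).reshape(10, 2)

# The sample count is no longer passed to the constructor.
cv = ShuffleSplit(n_splits=10, test_size=0.2, random_state=0)

# Each split is a pair of index arrays (train_indices, test_indices).
splits = list(cv.split(X))
train_idx, test_idx = splits[0]
print(len(splits), len(train_idx), len(test_idx))
```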

Error initializing GUI objects;

For the smartcab project, I encountered an error when initializing the GUI.

Simulator.init(): Error initializing GUI objects; display disabled.
error: File is not a Windows BMP file
Simulator.run(): Trial 0

Pygame and libpng are installed successfully.

tests for smartcab/agent.py

When working on smartcab, it's difficult to figure out when my smartcab is having trouble because of the learning rate I am working with, or whether I messed up the implementation of some other function.

It would be great if we had some quick tests for those functions.

logs folder?

I found that there is no logs folder, which is required for course submission. I tried to create one in my home folder, but the simulation did not produce any log files. Any idea how to fix this?

[smartcab] Incomplete reward system

While working on this project I detected a small flaw in the reward system.

File environment.py lines 328-341

        # Agent wants to perform no action:
        elif action == None:
            if light == 'green' and inputs['oncoming'] != 'left': # No oncoming traffic
                violation = 1 # Minor violation

       # (...)

        # Did the agent attempt a valid move?
        if violation == 0:
            if action == agent.get_next_waypoint(): # Was it the correct action?
                reward += 2 - penalty # (2, 1)
            elif action == None and light != 'green': # Was the agent stuck at a red light?
                reward += 2 - penalty # (2, 1)
            else: # Valid but incorrect
                reward += 1 - penalty # (1, 0)

This doesn't cover the case where the agent wants to turn left and there is oncoming traffic forward or right. The optimal policy would be to take no action (None), but None, right or forward would result in the same reward being applied: Valid but incorrect case.

The solution would be to expand:

        # Agent wants to perform no action:
        elif action == None:
            if light == 'green' and (inputs['oncoming'] != 'left' or waypoint != 'left'): # No oncoming traffic
                violation = 1 # Minor violation

to:

elif action == None and light != 'green':
    reward += 2 - penalty # (2, 1)
elif action == None and light == 'green' and inputs['oncoming'] in ['forward', 'right']:
    reward += 2 - penalty # (2, 1)

and expand:

elif action == None and light != 'green':
    reward += 2 - penalty # (2, 1)

to:

elif action == None and light != 'green':
    reward += 2 - penalty # (2, 1)
elif action == None and light == 'green' and inputs['oncoming'] in ['forward', 'right']:
    reward += 2 - penalty # (2, 1)

Smart cab simulator records success for failed trip

It appears that the smartcab simulator.py recorded a success in the sim_improved-learning.csv file for the last testing trial (10 of 10), even though the Python output showed that the agent ran out of time and did not reach the destination.
Python output shows
Trial Aborted!
Agent did not reach the destination.

Simulation ended. . .

However the csv file shows a 1 in the success column.
This may be tricky to recreate. I can provide the files or code to recreate the issue.

Error in argument type for function visuals.survival_stats

In notebook titanic_survival_exploration.ipynb, there is a TypeError when calling the function visuals.survival_stats:
TypeError: Cannot concatenate list of ['DataFrame', 'Series']

This could be fixed by transforming outcomes from pandas.Series to pandas.DataFrame by means of the function pandas.Series.to_frame():
outcomes.to_frame()
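
A minimal sketch of the suggested conversion, using made-up passenger data; the column names and the concat call are assumptions for illustration and may differ from what visuals.survival_stats does internally.

```python
import pandas as pd

# Hypothetical stand-ins for the notebook's `data` and `outcomes`.
data = pd.DataFrame({"Sex": ["male", "female", "female"], "Age": [22, 38, 26]})
outcomes = pd.Series([0, 1, 1], name="Survived")

# Convert the Series to a one-column DataFrame before combining.
outcomes_frame = outcomes.to_frame()
all_data = pd.concat([data, outcomes_frame], axis=1)
print(all_data)
```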

Student_Admissions.ipynb notebook for Student Admissions mini-project is already completed

Hi all.

The other day I went to complete the Student Admissions mini-project presented in the Deep Learning module (Deep Neural Networks section, part 32) and found that the notebook checked into both this repo at projects/practice_projects/imdb/Student_Admissions.ipynb and the https://github.com/udacity/aind2-dl repo has already been completed.

smartcab project missing folder

In the smartcab project, students are asked to set 'log_metrics' to True to record logs to /logs/. However, this folder is missing, and students will run into an error like:

No such file or directory: 'logs/sim_no-learning.csv'

One quick fix is to add the logs folder manually.

PS. The README.md states that there is a logs folder, but there is none.
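
As an alternative to adding the folder by hand, the directory could be created in code before the log file is opened. The helper below is a hypothetical sketch, not part of simulator.py; it avoids the exist_ok argument so that it also runs on Python 2.

```python
import os


def ensure_log_dir(path="logs"):
    """Create the log directory if it does not exist yet and return its path."""
    if not os.path.isdir(path):
        os.makedirs(path)
    return path


# Example use before writing a log file (the file name is taken from the error above):
# log_path = os.path.join(ensure_log_dir("logs"), "sim_no-learning.csv")
```

Note that a relative path such as "logs" is resolved against the directory the simulation is started from, so the folder needs to sit there rather than in the home folder.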

Typos in Creating Customer Segments (Unsupervised Learning Project)

Question 2

Which feature did you attempt to predict? What was the reported prediction score? Is this feature is necessary for identifying customers' spending habits?

Question 5

Hint: (...) The rate of increase or decrease is based on the indivdual feature weights.

Implementation: Dimensionality Reduction

(...) Additionally, if a signifiant amount of variance is explained by only two or three dimensions, the reduced data can be visualized afterwards.

"Answer" is missing.

At machine-learning/projects/smartcab/smartcab.ipynb,
Question 3,
There is no "Answer" section.

There is a typo

There is a typo after In [13]:
Examining the survival statistics, the majority of males younger then 10 survived the ship sinking, whereas most males age 10 or older did not survive the ship sinking.

It should be 'than' instead of then.

Converting the code from Python2 to Python3

As Python's scientific libraries (including Jupyter) are gradually dropping support for Python 2, it might be a good idea to have our code in Python 3 rather than Python 2.

Naive_Bayes_tutorial.ipynb Step 2: Removing all punctuations fails

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-27-56c5cbdbde76> in <module>()
      6 
      7 for i in lower_case_documents:
----> 8     sans_punctuation_documents.append(i.translate(str.maketrans('', '', string.punctuation)))
      9 print(sans_punctuation_documents)

AttributeError: type object 'str' has no attribute 'maketrans'

Line 8 should be:

sans_punctuation_documents.append(string.translate(i, table=None, deletions=string.punctuation))

or similar.

Comment bug in agent.py

In the LearningAgent class, the following comment block exists in two places (in build_state, lines 59-61, and in createQ, lines 88-90):

    # When learning, check if the state is in the Q-table
    #   If it is not, create a dictionary in the Q-table for the current 'state'
    #   For each action, set the Q-value for the state-action pair to 0

I believe this comment is correct in createQ method, but not the build_state method. It definitely shouldn't be in both places. We've had several students understandably confused by this, so I promised to see if we could get it corrected in the github repository.

Deprecated function is used, and causing warning messages.

At machine-learning/projects/smartcab/visuals.py file,
pd.rolling_mean() is used, and it causes unnecessary warnings.
The warning should be suppressed, or the function should be replaced with another one.

visuals.py:74: FutureWarning: pd.rolling_mean is deprecated for Series and will be removed in a future version, replace with
Series.rolling(window=10,center=False).mean()
data['average_reward'] = pd.rolling_mean(data['net_reward'] / (data['initial_deadline'] - data['final_deadline']), 10)
visuals.py:75: FutureWarning: pd.rolling_mean is deprecated for Series and will be removed in a future version, replace with
Series.rolling(window=10,center=False).mean()
data['reliability_rate'] = pd.rolling_mean(data['success']*100, 10) # compute avg. net reward with window=10
visuals.py:78: FutureWarning: pd.rolling_mean is deprecated for Series and will be removed in a future version, replace with
Series.rolling(window=10,center=False).mean()
(data['initial_deadline'] - data['final_deadline']), 10)
visuals.py:80: FutureWarning: pd.rolling_mean is deprecated for Series and will be removed in a future version, replace with
Series.rolling(window=10,center=False).mean()
(data['initial_deadline'] - data['final_deadline']), 10)
visuals.py:82: FutureWarning: pd.rolling_mean is deprecated for Series and will be removed in a future version, replace with
Series.rolling(window=10,center=False).mean()
(data['initial_deadline'] - data['final_deadline']), 10)
visuals.py:84: FutureWarning: pd.rolling_mean is deprecated for Series and will be removed in a future version, replace with
Series.rolling(window=10,center=False).mean()
(data['initial_deadline'] - data['final_deadline']), 10)
visuals.py:86: FutureWarning: pd.rolling_mean is deprecated for Series and will be removed in a future version, replace with
Series.rolling(window=10,center=False).mean()
(data['initial_deadline'] - data['final_deadline']), 10)
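
The replacement named in the warning can be sketched as follows. The DataFrame is a made-up stand-in for the simulator log, and the window is shortened to 2 so the result is easy to follow; visuals.py uses a window of 10.

```python
import pandas as pd

# Hypothetical log data; the real file has one row per trial.
data = pd.DataFrame({"success": [1, 0, 1, 1]})

# Old form: pd.rolling_mean(data['success'] * 100, 2)
# New form: call .rolling(...).mean() on the Series itself.
data["reliability_rate"] = (data["success"] * 100).rolling(window=2, center=False).mean()
print(data)
```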

Error in lecture video?

In Lesson 1: Deep Neural Networks, Section 28. Neural Network Architecture., in the second video, at the 1’08" mark, he combines the weights into a more compact form. The weights from x1 are 5 and -2 and from x2 are 7 and -3. Shouldn’t the weights from x1 be 5 and 7 and from x2, -2 and -3? It seems the “inner” weights should be swapped, because when I write the equations down they make sense, but not in the form as presented in the lecture video.

Link to thread in discussion forum.

finding_donors features scaling

I think there is an error in the "Normalizing Numerical Features" section of the finding_donors notebook. There the MinMaxScaler is applied to data, but I think it would be better applied to features_raw; otherwise the log-transform made before would be useless.

scaler = MinMaxScaler()  
numerical = ['age', 'education-num', 'capital-gain', 'capital-loss', 'hours-per-week']
features_raw[numerical] = scaler.fit_transform(data[numerical]) 
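
A minimal sketch of the proposed change, in which the scaler is fitted on features_raw so that the earlier log-transform is kept. The toy values and the log(x + 1) transform are assumptions for illustration, not copied from the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical raw data with one heavily skewed column.
data = pd.DataFrame({"age": [20.0, 40.0, 60.0],
                     "capital-gain": [0.0, 99.0, 9999.0]})

# Assumed earlier step: log-transform the skewed feature.
features_raw = data.copy()
features_raw["capital-gain"] = np.log(features_raw["capital-gain"] + 1)

# Scale the already transformed features, not the original data.
numerical = ["age", "capital-gain"]
scaler = MinMaxScaler()
features_raw[numerical] = scaler.fit_transform(features_raw[numerical])
print(features_raw)
```

With these toy values the middle capital-gain entry scales to 0.5, whereas scaling the untransformed data would give roughly 0.01, which shows how fitting on data discards the log-transform.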

A Small Typo

Under the "Implement a Q-Learning Driving Agent" section, the word iterative is misspelled as interative.

...based on the reward received and the interative update rule implemented.

Load Titanic_Survival_Exploration.ipynb failed

When I open this file with the command "jupyter notebook Titanic_Survival_Exploration.ipynb", the web page shows this alert:

Error loading notebook
Unreadable Notebook: /Users/Documents/WORK/project0_titanic/Titanic_Survival_Exploration.ipynb NotJSONError('Notebook does not appear to be JSON: u'\n\n\n\n\n<html lang="e...',)

Could you please help me fix this problem?
Thanks!

[Bug] Oncoming traffic not perceived when turning left

(Screenshot hosted on Imgur; image not included.)

As one can see in the screenshot above, a negative reward is given in a situation where the agent wants to turn left on a green light but there is oncoming traffic. It is as if the reward system doesn't know about the oncoming traffic.

In a situation like this (the agent intends to turn left, the light is green, and oncoming traffic is going forward or right), the agent's action should be None, but the reward system wants us to turn left.

Image classification project environment instructions

Everywhere in the nanodegree materials (including the Deep Learning course) it is stated that the Python version being used is 2.7; however, the image classification project apparently relies on 3.x. There are no instructions for setting up its environment either. I managed to do this on my own by trial and error, but it really should be specified somewhere in the course materials so that students don't waste time trying to set it up with Python 2.

UI suggestion: allow for comments on specific videos

While going through the lessons, I noticed that although each lesson as a whole is great, some videos could use improvement. It would be great to be able to add quick notes on specific videos rather than a comment for the entire lesson.
