mgalarnyk / datasciencecoursera
Data Science Repo and blog for Johns Hopkins Coursera Courses. Please let me know if you have any questions.
I think there is an error in Stanford_Machine_Learning/Week6/MachineLearningSystemDesign.md. It marks the statement "A good classifier should have both a high precision and high recall on the cross validation set." as False, but I think it should be True, because the course video says so. Moreover, I tried it in the quiz, and the result is as follows:
Matlab/Octave:
J = (1/(2*m)) *sum( (((X*theta)-y).^2))
Python:
s = np.power(( X.dot(theta) - np.transpose([y]) ), 2)
J = (1.0/(2*m)) * s.sum( axis = 0 )
They look equivalent, except that the Python version has np.transpose([y]).
Why is it needed?
BTW, my Octave version of this cost function is the same as yours.
This is probably not a bug, but it is confusing. You've done a nice job on the Python version. It would be an improvement to at least add a comment about that. I really wanted to do the assignment in NumPy, but Ng's tutorial on Matlab was so easy to follow that I just did the Octave version. Now I can compare the syntax. A tabula rasa!
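A likely reason for the np.transpose([y]) call, assuming y is loaded as a 1-D array of shape (m,) while X.dot(theta) has shape (m, 1), is NumPy broadcasting: subtracting the two directly yields an (m, m) matrix instead of an element-wise difference. The sketch below uses made-up toy data, and compute_cost is a hypothetical helper name that only mirrors the Python lines quoted above.

```python
import numpy as np

# Made-up toy data (assumption): m = 3 examples, an intercept column plus one feature.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 2.0, 3.0])   # 1-D array, shape (3,)
theta = np.zeros((2, 1))        # column vector, shape (2, 1)

predictions = X.dot(theta)      # shape (3, 1)

# Without reshaping, (3, 1) - (3,) broadcasts to a (3, 3) matrix of pairwise differences.
wrong_diff = predictions - y

# np.transpose([y]) turns y into a (3, 1) column, so the subtraction is element-wise.
right_diff = predictions - np.transpose([y])


def compute_cost(X, y, theta):
    """Hypothetical helper mirroring the Python cost function quoted in the issue."""
    m = len(y)
    s = np.power(X.dot(theta) - np.transpose([y]), 2)
    return (1.0 / (2 * m)) * s.sum(axis=0)


J = compute_cost(X, y, theta)
print(wrong_diff.shape, right_diff.shape, J)
```

In Octave, y is typically already an m x 1 column, which would explain why the Matlab/Octave line needs no such reshaping.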
library(data.table)  # needed for rbindlist() and the dt[...] syntax below
pollutantmean <- function(directory, pollutant, id = 1:332) {
  ### Format number with fixed width and then append .csv to number
  fileNames <- paste0(directory, '/', formatC(id, width = 3, flag = "0"), ".csv")
  ### Read in all files and combine them into one large data.table
  lst <- lapply(fileNames, data.table::fread)
  dt <- rbindlist(lst)
  ### Return the mean of the requested pollutant column, ignoring NAs
  if (pollutant %in% names(dt)) {
    return(dt[, lapply(.SD, mean, na.rm = TRUE), .SDcols = pollutant][[1]])
  }
}
### Example usage
pollutantmean(directory = '~/Desktop/specdata', pollutant = 'sulfate', id = 20)
**Q1: Could you please explain what you have done in the highlighted portion (.SD and then .SDcols)?
Q2: And also this: .(n = .N)?**
{--
library(data.table)  # needed for rbindlist() and the dt[...] syntax below
complete <- function(directory, id = 1:332) {
  ### Format number with fixed width and then append .csv to number
  fileNames <- paste0(directory, '/', formatC(id, width = 3, flag = "0"), ".csv")
  ### Read in all files and combine them into one large data.table
  lst <- lapply(fileNames, data.table::fread)
  dt <- rbindlist(lst)
  ### Count the complete (no NA) rows for each monitor ID
  return(dt[complete.cases(dt), .(nobs = .N), by = ID])
}
### Example usage
complete(directory = '~/Desktop/specdata', id = 20:30)
--}
| Answer | Explanation |
|---|---|
| α=0.3 is an effective choice of learning rate. | We want gradient descent to quickly converge to the minimum, so the current setting of α seems to be good. X [WRONG] |
This is wrong. The learning rate α=0.3 still looks high compared with 0.1. The right answer should be: Rather than use the current value of α, it'd be more promising to try a smaller value of α (say α=0.1).
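The effect of the learning rate can be checked empirically. The sketch below is not the quiz's setup (its plot is not available here); it runs batch gradient descent on made-up data and reports whether the cost decreases for a few values of α. The function name gradient_descent_costs and the data are assumptions for illustration only.

```python
import numpy as np


def gradient_descent_costs(X, y, alpha, num_iters=50):
    """Run batch gradient descent for linear regression; return the cost after each step."""
    m, n = X.shape
    theta = np.zeros(n)
    costs = []
    for _ in range(num_iters):
        error = X.dot(theta) - y
        theta = theta - (alpha / m) * X.T.dot(error)
        costs.append((1.0 / (2 * m)) * np.sum((X.dot(theta) - y) ** 2))
    return costs


# Made-up data (assumption): intercept column plus one unscaled feature, with y = 2 * x.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])

for alpha in (0.01, 0.1, 0.3):
    costs = gradient_descent_costs(X, y, alpha)
    monotone = all(b <= a for a, b in zip(costs, costs[1:]))
    trend = "decreasing" if monotone else "not steadily decreasing"
    print(f"alpha={alpha}: final cost {costs[-1]:.4g} ({trend})")
```

On this particular toy problem the largest value of α makes the cost grow rather than shrink, while the smaller values converge. Which values work depends on the data and feature scaling, so this does not by itself settle the quiz answer.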
Thank you for providing solutions for the course. As I was doing the exercises, I found something worth mentioning. For Week 2, Question 4, the answer is different: two options are correct. Please check it. The transpose of v (1 × 7) multiplied by w (7 × 1) gives a single number. Maybe the options changed over time, because this option was not present.
Again, thanks for your work. I will check out your content on YouTube too.
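The dimension argument in this report can be verified with a few lines of NumPy. This is a minimal sketch with made-up values, assuming v and w are both 7 × 1 column vectors as described; the actual quiz vectors are not known here.

```python
import numpy as np

# Made-up 7-dimensional column vectors (assumption: both are 7 x 1, as in the report).
v = np.arange(1.0, 8.0).reshape(7, 1)   # entries 1, 2, ..., 7
w = np.ones((7, 1))

product = v.T.dot(w)         # (1 x 7) times (7 x 1) gives shape (1, 1)
as_scalar = product.item()   # extract the single number

print(product.shape, as_scalar)
```

The result has shape (1, 1), that is, a single number, which matches the claim that the transpose of v times w is a scalar.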
It would be very nice to get an explanation of the answer.
Seriously, I am not able to classify the problem.
Why is it sometimes classification and sometimes regression?
Hi,
I have recently started working on prediction. Please help me with how to predict weather using previous data (not from the skymate/AccuWeather site).
Please post any queries.
[email protected]
Thanks,
Venkat
Thank you for the great job with the course materials. I would like to reuse some of the quiz questions for my statistics course with attribution.
Please let me know if that is okay.
Maybe eventually you could add a CC-BY license if you want to encourage reuse?
https://github.com/santisoler/cc-licenses?tab=readme-ov-file#cc-attribution-40-international
Unfortunately, the information in the questionnaires is outdated. The questions have already been changed.