aml4td / website
Website sources for Applied Machine Learning for Tabular Data
Home Page: https://aml4td.org/
License: Other
We should use ggblend
I'd be interested in helping with a Python computing supplement.
Did you have a format in mind? It seems likely that after the setup section, most sections could be tightly coupled between the R and Python versions, which suggests that having two independent repositories may not be ideal. I think Quarto supports panelsets (as "tabsets"); that strikes me as a nice way to display the two, but it would also mean that both versions of the code need to be updated whenever a change is made.
One other thing that would be nice to decide on early: which Python plotting library to use? plotnine mimics ggplot, matplotlib is already used by sklearn+pandas, others are slicker...
@kjell-stattenacity I don't have the bib entry for `tragrisso2023`.
Originally posted by @topepo in #32 (comment)
Slated to go here, talk about the full "let the machine work it out" viewpoint versus curated model development.
A few implementations, specifically torch and libsvm/kernlab, don't have good control over random number usage. Also, we have seen differences in results across Intel and Apple silicon chips (but that seems to be getting better).
We have some places where we programmatically write out results in-line. If the results change, our encoded conclusions might no longer be valid.
We can take a few key objects and save their results once their usage is finalized. Then we can use testthat to verify that those results are the same (or within some tolerance). Since the project is almost structured like an R package, this means that we can use devtools::test() to check for consistency of results.
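A minimal sketch of what such a check might look like, assuming the reference values are stored as an `.rds` file. The object names, the metric values, and the tolerance below are all hypothetical placeholders, not results from the book:

```r
library(testthat)

# Hypothetical stand-in for a finalized result; in the project this would
# be saved once, when the object's usage is finalized.
reference <- list(rmse = 0.5213, r_squared = 0.8471)
ref_file <- tempfile(fileext = ".rds")
saveRDS(reference, ref_file)

# Hypothetical stand-in for the same quantities recomputed on a later
# render, with a tiny numerical difference (e.g., from a different chip).
current <- list(rmse = 0.5213 + 1e-7, r_squared = 0.8471)

test_that("key results match the saved reference values", {
  saved <- readRDS(ref_file)
  # The tolerance is an assumption; it would be chosen per object.
  expect_equal(current$rmse, saved$rmse, tolerance = 1e-4)
  expect_equal(current$r_squared, saved$r_squared, tolerance = 1e-4)
})
```

In a package-like layout, a file like this would presumably live under `tests/testthat/` so that `devtools::test()` picks it up, with the reference file committed to the repository rather than written to a temporary location.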
I don't like the quarto default:
The ones in Interactively exploring high-dimensional data and models in R (examples) are beautiful and Mine's class notes ones are also an improvement.
Wording edits/additions for website
Although it won't affect images, we should have some css for this
@kjell-stattenacity is `miller1984selection` the best reference for FSA and interactions?
Originally posted by @topepo in #32 (comment)
For regression models.
This section needs a little more regarding skewness and other aspects of variables.
For the numeric predictors chapter, we had previously talked about transformations for percentages and proportions (like the arcsine; see this reference).
Also, describe some transformations based on conventions or scientific knowledge.
renv::snapshot()
R/post-process-bib-file.R
)_cache
and _freeze
quarto render
quarto preview
DESCRIPTION
usethis::use_github_release()
quarto publish gh-pages --no-render
We'll talk a lot about model complexity in regards to
Should we have an initial section on it though?
-1
to the hashing values leads to "fewer collisions"; it depends on what exactly you mean by a collision, and I'm not familiar enough with the cryptography literature to say. But in a parametric model, it's still enforcing some arbitrary constraint.