braverock / factoranalytics

R 99.13% TeX 0.87%

factoranalytics's Introduction

Factor Analytics for asset return data

The FactorAnalytics project is an open source package containing fitting and analysis methods for the three main types of factor models commonly used in conjunction with portfolio construction, optimization and risk management. These include:

fundamental factor models
time series factor models
statistical factor models

The purpose of this project is to provide basic features and capabilities close to those of commercial portfolio optimization and risk management products.

In addition, the package contains functions for related risk and performance attribution estimates including:

volatility
VaR
ES
factor-contributed vs idiosyncratic returns
factor model Monte Carlo
multiple imputation methods for simulating returns and backfilling unequal histories
functions to create tabular displays of risk and performance reports
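
As a rough illustration of the factor-contributed versus idiosyncratic split listed above, the sketch below fits a time series factor model with base R's lm() on simulated data and decomposes portfolio variance into the two parts. The data, dimensions, and variable names are assumptions made for this example; it does not use the FactorAnalytics API.

```r
set.seed(1)
n_obs <- 120

# Simulated factor returns (two factors) and three assets; illustrative data only
f <- matrix(rnorm(n_obs * 2, sd = 0.04), ncol = 2,
            dimnames = list(NULL, c("F1", "F2")))
B_true <- matrix(c(1.0, 0.5,
                   0.8, -0.2,
                   1.2, 0.3), ncol = 2, byrow = TRUE)
eps <- matrix(rnorm(n_obs * 3, sd = 0.02), ncol = 3)
r <- f %*% t(B_true) + eps
colnames(r) <- c("A1", "A2", "A3")

# One multivariate least-squares fit: one column of coefficients per asset
fit <- lm(r ~ f)
B <- t(coef(fit)[-1, , drop = FALSE])      # assets x factors loadings
D <- diag(apply(residuals(fit), 2, var))   # diagonal residual variances

# Factor model covariance: factor part plus idiosyncratic part
Sigma <- B %*% cov(f) %*% t(B) + D

# Equal-weight portfolio variance split into its two sources
w <- rep(1 / 3, 3)
factor_var <- drop(t(w) %*% B %*% cov(f) %*% t(B) %*% w)
idio_var   <- drop(t(w) %*% D %*% w)
port_var   <- drop(t(w) %*% Sigma %*% w)

print(c(factor = factor_var, idiosyncratic = idio_var, total = port_var))
```

The residual variances here use the plain sample variance for brevity; a production fit would typically adjust the degrees of freedom for the number of estimated coefficients.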

Installing FactorAnalytics

Installing from source is the preferred method. To do so, your local git installation needs to support Git Large File Storage (LFS). Instructions for installing LFS can be found in the Git LFS documentation.

Plenty of documentation exists for cloning GitHub repos and building R packages locally. A simple search specifying your IDE and OS should do.

Note a bug in remotes/devtools prevents the package from being loaded using install_github("braverock/FactorAnalytics"). Please see r-lib/remotes/issues/637.
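
A minimal sketch of the source install driven from R, assuming git and git-lfs are already installed and on the PATH. The clone location and the run_install switch are hypothetical choices for this example; nothing is cloned or installed unless the switch is set to TRUE.

```r
repo_url <- "https://github.com/braverock/FactorAnalytics.git"
dest <- file.path(tempdir(), "FactorAnalytics")  # hypothetical clone location
run_install <- FALSE                             # set to TRUE to actually run the steps

# Commands to run in order: enable LFS, then clone (LFS files are fetched on clone)
steps <- list(
  c("git", "lfs", "install"),
  c("git", "clone", repo_url, dest)
)

if (run_install) {
  for (s in steps) {
    status <- system2(s[1], args = s[-1])
    if (status != 0) stop("command failed: ", paste(s, collapse = " "))
  }
  # Install the cloned source directory; package dependencies must already be installed
  install.packages(dest, repos = NULL, type = "source")
}
```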

Presentations and vignettes on FactorAnalytics

Fundamental Factor Models vignette

From the first paragraph

The overarching long-term goal of the fundamental factor model (Ffm) development in the FactorAnalytics packages is to replicate a large proportion of the non-proprietary models and model fitting and analysis methodology that is contained in commercial portfolio construction and risk management products such as MSCI Barra, Axioma, Northfield, etc. Furthermore, our goals include the implementation of cutting-edge methods to support portfolio construction and risk management that are not widely available in commercial products, such as global optimization, unequal histories and other missing data handling, highly robust covariance matrix estimators and their application to multivariate exposures and returns outliers, optimal bias robust regression, factor model Monte Carlo, new methods for handling serial correlation that improve upon traditional HAC methods, etc.

https://github.com/braverock/FactorAnalytics/blob/master/vignettes/Fundamental-Factor-Models-FactorAnalytics.pdf

R/Finance 2017, Chicago

R Script and slides used in Prof. Douglas Martin's "Fundamental Factor Models in FactorAnalytics" Pre-Conference Seminar.

Boston useR Group 2017

Background slide deck for the Boston useR group talk by Prof. Doug Martin.


factoranalytics's Issues

-R CMD check ERROR concerning "Stock.df"

Missing data set "Stock.df" is making a bunch of tests fail.

Running grep -rnw -e 'Stock.df' in the directory ~/R/FactorAnalytics/R lists the following files:

print.ffm.R:21:#' data(Stock.df)
plot.ffm.R:106:#' data(Stock.df)
fmVaRDecomp.R:80:#' data(Stock.df)
predict.ffm.R:34:#' data(Stock.df)
summary.ffm.R:38:#' data(Stock.df)
fmCov.R:61:#' data(Stock.df)
fmEsDecomp.R:81:#' data(Stock.df)

-R CMD check ERROR concerning "stocks145scores6"

Missing data set "stocks145scores6" is making a bunch of tests fail, as it doesn't exist.

Running grep -rnw -e 'stocks145scores6' in the R directory (~/R/FactorAnalytics/R) reveals the following list of 9 files that need to be fixed.

portVolDecomp.R:43:#' data("stocks145scores6")
portVolDecomp.R:44:#' dat = stocks145scores6
portVaRDecomp.R:78:#' data("stocks145scores6")
portVaRDecomp.R:79:#' dat = stocks145scores6
portEsDecomp.R:77:#' data("stocks145scores6")
portEsDecomp.R:78:#' dat = stocks145scores6
portSdDecomp.R:71:#' data("stocks145scores6")
portSdDecomp.R:72:#' dat = stocks145scores6
riskDecomp.R:70:#' data("stocks145scores6")
riskDecomp.R:71:#' dat = stocks145scores6
repExposures.R:41:#' data("stocks145scores6")
repExposures.R:42:#' dat = stocks145scores6
tsPlotMP.R:27:#' data("stocks145scores6")
tsPlotMP.R:28:#' dat = stocks145scores6
repReturn.R:39:#' data("stocks145scores6")
repReturn.R:40:#' dat = stocks145scores6
repRisk.R:81:#' data("stocks145scores6")
repRisk.R:82:#' dat = stocks145scores6

Questions on data sets methodology

Hi! First of all, thank you for making your package available!

I am using the factorDataSetDjia5Yrs data set you shared to fit some fundamental factors models with fitFfm() and I'd like to ask some questions about some variables definitions.

Specifically, on factors, how is MKTCAP computed? Is it inflation-adjusted? Then, how is SIZE calculated?
On assets returns, it seems RETURN is the 'raw' return? What does RETURN.OLD stand for?

I checked the manual and ffmVignette, but wasn't able to find descriptions.

-R CMD Check NOTE: checking Rd files

man/FactorAnalytics-package.Rd: non-ASCII input and no declared encoding

prepare_Rd: readGithubFAdata.Rd:12-14: Dropping empty section \value

problem found in ‘FactorAnalytics-package.Rd’

-R CMD Check Warning: replacing previous import ‘data.table::last’ by ‘xts::last’ when loading ‘FactorAnalytics’

Running grep -rnw last R/*.R | grep xts gets these:

R/fitFfmDT.R:675: residuals1 <- residuals1[, which(!is.na(xts::last(residuals1)))]
R/fitFfmDT.R:677: # resid.var <- resid.var[which(!is.na(xts::last(residuals1)))]

Running grep -rnw last R/*.R gets a lot but here are the interesting files to refactor and look at

R/fitFfM2_rolling.R:54: names(rollind) <- uniqueDates[sapply(rollind, last)]
R/fitFfM2_rolling.R:78: rebalDate <- uniqueDates[last(Tindx[rollind[[i]]])] # names ?
R/fitFfmDT.R:668: resid.cov <- diag(as.numeric(w[last(index(w)),])) # use the last estimate
R/fitFfmDT.R:1178: first(SpecObj$dataDT[[d_]]), last(SpecObj$dataDT[[d_]])))
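
The warning arises because both data.table and xts export a function named last, and whichever is imported second replaces the first. One way to sidestep the ambiguity, sketched below, is to call each function with an explicit namespace prefix (or, equivalently in a package, to import only the intended one with importFrom). The sample data are made up for this example, and the block only runs if both packages are installed.

```r
if (requireNamespace("xts", quietly = TRUE) &&
    requireNamespace("data.table", quietly = TRUE)) {

  # Illustrative three-day price series
  prices <- xts::xts(c(10, 11, 12), order.by = as.Date("2021-01-01") + 0:2)

  # Explicit prefixes make it unambiguous which 'last' is meant
  last_row  <- xts::last(prices)                 # last observation, still an xts object
  last_elem <- data.table::last(c(10, 11, 12))   # last element of a plain vector

  print(last_row)
  print(last_elem)
}
```

Unqualified calls such as last(index(w)) then become the lines worth reviewing, since their behavior depends on import order.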

-R CMD Check errors for datasets "wtsStocks145GmvLo.rda"

Running grep -rnw wtsStocks145GmvLo R/*.R returns the following files.

R/datasets.R:133:#' @name wtsStocks145GmvLo
R/datasets.R:139:#' @Usage data("wtsStocks145GmvLo")
R/portEsDecomp.R:84:#' data("wtsStocks145GmvLo")
R/portEsDecomp.R:85:#' wtsStocks145GmvLo = round(wtsStocks145GmvLo,5)
R/portEsDecomp.R:96:#' portEsDecomp(fit.cross, weights = wtsStocks145GmvLo)
R/portSdDecomp.R:78:#' data("wtsStocks145GmvLo")
R/portSdDecomp.R:79:#' wtsStocks145GmvLo = round(wtsStocks145GmvLo,5)
R/portSdDecomp.R:90:#' portSdDecomp(fit.cross, wtsStocks145GmvLo)
R/portVaRDecomp.R:85:#' data("wtsStocks145GmvLo")
R/portVaRDecomp.R:86:#' wtsStocks145GmvLo = round(wtsStocks145GmvLo,5)
R/portVaRDecomp.R:97:#' portVaRDecomp(fit.cross, weights = wtsStocks145GmvLo)
R/portVolDecomp.R:50:#' data("wtsStocks145GmvLo")
R/portVolDecomp.R:51:#' wtsStocks145GmvLo = round(wtsStocks145GmvLo,5)
R/repExposures.R:48:#' data("wtsStocks145GmvLo")
R/repExposures.R:49:#' wtsStocks145GmvLo = round(wtsStocks145GmvLo,5)
R/repExposures.R:57:#' repExposures(fit.cross, wtsStocks145GmvLo, isPlot = FALSE, digits = 4)
R/repExposures.R:58:#' repExposures(fit.cross, wtsStocks145GmvLo, isPrint = FALSE, isPlot = TRUE,
R/repExposures.R:60:#' repExposures(fit.cross, wtsStocks145GmvLo, isPlot = TRUE, which = 1,
R/repExposures.R:62:#' repExposures(fit.cross, wtsStocks145GmvLo, isPrint = FALSE, isPlot = TRUE,
R/repReturn.R:46:#' data("wtsStocks145GmvLo")
R/repReturn.R:47:#' wtsStocks145GmvLo = round(wtsStocks145GmvLo,5)
R/repReturn.R:56:#' repReturn(fit.cross, wtsStocks145GmvLo, isPlot = FALSE, digits = 4)
R/repReturn.R:57:#' repReturn(fit.cross, wtsStocks145GmvLo, isPrint = FALSE, isPlot = TRUE,
R/repReturn.R:59:#' repReturn(fit.cross, wtsStocks145GmvLo, isPrint = FALSE, isPlot = TRUE,
R/repReturn.R:61:#' repReturn(fit.cross, wtsStocks145GmvLo, isPrint = FALSE, isPlot = TRUE,
R/repRisk.R:88:#' data("wtsStocks145GmvLo")
R/repRisk.R:89:#' wtsStocks145GmvLo = round(wtsStocks145GmvLo,5)
R/repRisk.R:99:#' repRisk(fit.cross, wtsStocks145GmvLo, risk = "Sd", decomp = 'FPCR',
R/repRisk.R:102:#' repRisk(fit.cross, wtsStocks145GmvLo, risk = c("VaR", "ES"), decomp = 'FPCR',
R/repRisk.R:105:#' repRisk(fit.cross, wtsStocks145GmvLo, risk = "Sd", decomp = 'FPCR',
R/riskDecomp.R:77:#' data("wtsStocks145GmvLo")
R/riskDecomp.R:78:#' wtsStocks145GmvLo = round(wtsStocks145GmvLo,5)
R/riskDecomp.R:88:#' portES.decomp = riskDecomp(fit.cross, weights = wtsStocks145GmvLo, risk = "ES", portDecomp = TRUE)

-R CMD Check WARNING: checking for missing documentation entries

Undocumented data sets:

‘r.M’
‘r.W’
‘managers’
‘stocksCRSPweekly’

Add script to import larger data from "extra_data" folder

-R CMD Check ERROR: checking examples

Running examples in ‘FactorAnalytics-Ex.R’ failed
The error most likely occurred in:

Name: chartRobRisk

Title: 'riskBudget' Plots

Aliases: chartRobRisk

** Examples

data("RussellData")
Warning in data("RussellData") : data set ‘RussellData’ not found
rf = RussellData[, 16]
Error: object 'RussellData' not found
Execution halted

error in vignette fitFmDT 06 14 20

vif(fitDjia5Yr, isPlot = T, isPrint = F, lwd = .7,stripText.cex = .8,axis.cex=.8)
Error in [.data.table(ffmObj$data, , exposure.vars, drop = FALSE) :
j (the 2nd argument inside [...]) is a single symbol but column name 'exposure.vars' is not found. Perhaps you intended DT[, ..exposure.vars]. This difference to data.frame is deliberate and explained in FAQ 1.1.
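
The error message points at a known difference between data.frame and data.table: inside DT[, j], a bare symbol is looked up as a column name, not as a variable holding column names. The sketch below shows the two standard data.table idioms for selecting columns named in a character vector. The table contents are invented for illustration, and the block only runs if data.table is installed.

```r
if (requireNamespace("data.table", quietly = TRUE)) {
  # Toy exposure table; column names and values are illustrative only
  DT <- data.table::data.table(
    TICKER = c("A", "B"),
    SIZE   = c(1.2, 0.7),
    P2B    = c(3.1, 2.4)
  )
  exposure.vars <- c("SIZE", "P2B")

  # Option 1: the '..' prefix tells data.table to look the name up in the calling scope
  by_prefix <- DT[, ..exposure.vars]

  # Option 2: with = FALSE restores data.frame-style evaluation of j
  by_with <- DT[, exposure.vars, with = FALSE]

  print(by_prefix)
  print(identical(by_prefix, by_with))
}
```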

Update datasets in fitFfm_vignette_04_09.Rnw

Uses the old Stocks.df file. Need to replace it with stocksCRSP and scoresSPGMI.

fitFfm_vignette_04_09.Rnw

Update README

The current README has good information, but could benefit from an improved layout, instructions, and content edits. These include, but aren't limited to:

Add relevant badges that communicate various statuses to the user
Edit and expand the install instructions if needed
Update and double-check data sources
Add a few examples to get folks started right away

Install of master branch failing - "ERROR: lazydata failed for package 'FactorAnalytics'"

Hi, I've tried to install your package using:

devtools::install_github("braverock/FactorAnalytics", ref="master")

It fails, however, with the message (see screenshot below):

I've installed git lfs, as suggested in the package documentation, and so I was wondering if this is a bug or something else?

Thanks very much

fitTsfmMT.r conversion to excess returns broken

See

https://github.com/braverock/FactorAnalytics/blob/master/R/fitTsfmMT.R#L111-L115

Document each dataset in separate .R files

Get the following warning on build: invalid input found on input connection '~/R/FactorAnalytics/R/datasets.R'

Related to issue #35, concerning the R/datasets.R file.

Each data set needs to be broken out into separate .R files. Good examples for data set documentation can be found below

The Stocks.df Problem

Stocks.df contains a single data frame named stock. The stock data is used in Sangeetha's fitFfm vignette, which we are not going to use initially, and perhaps never. It is also used as an example in a number of functions, as shown in the spreadsheet issuesFAdatasetsFunctions 07 23 21.xlsx, which I will put in the sandbox. I note that Sangeetha gives no source for the stock data frame. There are two possible solutions:

1. Delete Stocks.df/stock, and comment out the example segments in the various functions that make use of stock.
2. Create a new data set from stocksCRSP and scoresSPGMI and swap it in wherever Stocks.df appears.

At this point I strongly prefer option 1. Because the examples would only be commented out, I could come back later and do option 2.

Demo folder

Add a demo folder for a variety of useful scripts, especially the R scripts for static vignettes.

Remove legacy functions and Data no longer in use

Do a thorough review of all functions in the R/ folder. Some are obvious candidates for removal, but others that were once useful may no longer be necessary, or should be moved to the sandbox for refactoring.

-R CMD Check Warning: replacing previous import ‘data.table::first’ by ‘xts::first’ when loading ‘FactorAnalytics’

Similar to #29

Triggered when running the package

fitTsfmUpDn errors in do.call(lmrobdet.control, control[m2, drop = TRUE]) :

Also, part of #43 needs further investigation

Name: fitTsfmUpDn

Title: Fit a up and down market factor model using time series

regression

Aliases: fitTsfmUpDn

** Examples

load data

data(managers, package = 'PerformanceAnalytics')

Make syntactically valid column names

colnames(managers)
[1] "HAM1" "HAM2" "HAM3" "HAM4" "HAM5"
[6] "HAM6" "EDHEC LS EQ" "SP500 TR" "US 10Y TR" "US 3m TR"
colnames(managers) <- make.names( colnames(managers))
colnames(managers)
[1] "HAM1" "HAM2" "HAM3" "HAM4" "HAM5"
[6] "HAM6" "EDHEC.LS.EQ" "SP500.TR" "US.10Y.TR" "US.3m.TR"

example: Up and down market factor model with LS fit

fitUpDn <- fitTsfmUpDn(asset.names=colnames(managers[,(1:6)]),mkt.name="SP500.TR", data=managers, fit.method="LS",control=NULL)
Error in do.call(lmrobdet.control, control[m2, drop = TRUE]) :
second argument must be a list
Calls: fitTsfmUpDn -> fitTsfm -> do.call
Execution halted
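
The message "second argument must be a list" is what do.call() raises when its args argument is not a list, which is the case when control = NULL is passed through unchanged. The sketch below reproduces that behavior and shows one possible guard. make_control and call_with_control are hypothetical stand-ins invented for this example, not functions from FactorAnalytics or RobStatTM.

```r
# Hypothetical stand-in for a control constructor such as lmrobdet.control
make_control <- function(tol = 1e-6, max.it = 50) {
  list(tol = tol, max.it = max.it)
}

# Passing NULL straight through reproduces the failure
failure <- tryCatch(
  do.call(make_control, NULL),
  error = function(e) conditionMessage(e)
)
print(failure)

# Guard: treat a NULL control as "no overrides" by substituting an empty list
call_with_control <- function(control = NULL) {
  args <- if (is.null(control)) list() else as.list(control)
  do.call(make_control, args)
}

str(call_with_control())                     # defaults
str(call_with_control(list(max.it = 200)))   # one override
```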

-R CMD Check WARNING: checking Rd \usage sections

Undocumented arguments in documentation object 'calcFLAM'
‘specObj’ ‘modelStats’ ‘fitResults’ ‘...’

Undocumented arguments in documentation object 'convert'
‘SpecObj’ ‘FitObj’ ‘RegStatsObj’ ‘...’

Undocumented arguments in documentation object 'convert.ffmSpec'
‘...’

Undocumented arguments in documentation object 'fitFfm'
‘GARCH.MLE’

Undocumented arguments in documentation object 'fitFfmDT'
‘lmrobdet.control.para.list’ ‘...’

Undocumented arguments in documentation object 'tsPlotMP'
‘ret’
Documented arguments not in \usage in documentation object 'tsPlotMP':
‘data’

Functions with \usage entries need to have the appropriate \alias
entries, and all their arguments documented.
The \usage entries must correspond to syntactically valid R code.
See chapter ‘Writing R documentation files’ in the ‘Writing R
Extensions’ manual.

Plots not working in fitFfm_vignette_04_09.Rnw

Various plots no longer work, debug/troubleshoot @sangeeuw

Failed to install because of "Error in read.dcf(path)"

Error in read.dcf(path) :
Found continuation line starting ' time series, fun ...' at begin of record.
10. read.dcf(path)
9. is.data.frame(x)
8. colnames(read.dcf(path))
7. read_dcf(tmp)
6. remote_package_name.github_remote(remote)
5. remote_package_name(remote)
4. FUN(X[[i]], ...)
3. vapply(remotes, install_remote, ..., FUN.VALUE = character(1))
2.
install_remotes(remotes, auth_token = auth_token, host = host,
dependencies = dependencies, upgrade = upgrade, force = force,
quiet = quiet, build = build, build_opts = build_opts, repos = repos,
type = type, ...)
1.
devtools::install_github("braverock/factorAnalytics")

-R CMD Check WARNING: checking for code/documentation mismatches

Data with usage in documentation object 'CommonFactors' but not in code:
‘CommonFactors’

Data with usage in documentation object 'RussellData' but not in code:
‘RussellData’

Data with usage in documentation object 'factorDataSPGMI' but not in code:
‘factorDataSPGMI’

Data with usage in documentation object 'factorDataSetDjia14Yrs' but not in code:
‘factorDataSetDjia’

Data with usage in documentation object 'scoresSPGMIraw' but not in code:
‘scoresSPGMIraw’

Data with usage in documentation object 'stocksCRSPscoresSPGMI' but not in code:
‘stocksCRSPscoresSPGMI’

Data with usage in documentation object 'stocksCRSPscoresSPGMIraw' but not in code:
‘stocksCRSPscoresSPGMIraw’

Data with usage in documentation object 'stocksWithFactorsData300' but not in code:
‘stocksWithFactorsData300’

Data with usage in documentation object 'wtsDjiaGmv' but not in code:
‘wtsDjiaGmv’

Data with usage in documentation object 'wtsStocks145Gmv' but not in code:
‘wtsStocks145Gmv’

Set fitSfm aside for now to expedite preparation of FactorAnalytics for CRAN submission

Justin and I agreed to put the fitSfm vignette and all related functions and data sets in the sandbox, with the goal of adding this very nice capability back into FA in the second CRAN release.

-R CMD Check WARNING: Found the following significant warnings

Note: possible error in 'lmrobdetMM(data = x, ': unused arguments (mxr = 200, mxf = 200, mxs = 200)
Note: possible error in 'standardizeExposures(obj = rollingObject, ': unused argument (obj = rollingObject)
Note: possible error in 'extractRegressionStats(obj = rollingObject, ': unused argument (obj = rollingObject)

-R CMD Check ERROR: Running examples in ‘FactorAnalytics-Ex.R’ failed

The error most likely occurred in:

Name: fitTsfmLagLeadBeta

Title: Fit a lagged and lead Betas factor model using time series

regression

Aliases: fitTsfmLagLeadBeta

** Examples

load data from the database

data(managers, package = 'PerformanceAnalytics')

example: A lagged Beetas model with LS fit

fit <- fitTsfmLagLeadBeta(asset.names=colnames(managers[,(1:6)]),LagLeadBeta=2,LagOnly=TRUE, mkt.name="SP500 TR",rf.name="US 3m TR",data=managers)
Error in [.xts(data.xts, , factor.names) : subscript out of bounds
Calls: fitTsfmLagLeadBeta -> fitTsfm -> merge -> merge.xts -> [ -> [.xts
Execution halted
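
One plausible cause, not confirmed by the log above, is a mismatch between the names passed in ("SP500 TR", "US 3m TR") and column names that have been made syntactically valid somewhere along the way, as make.names() does in the fitTsfmUpDn example. The base R sketch below reproduces a "subscript out of bounds" error from such a mismatch and shows a defensive check. The toy matrix and the check_names helper are assumptions made for illustration.

```r
set.seed(7)
# Toy return matrix with the same style of column names as 'managers'
ret <- matrix(rnorm(12), ncol = 3,
              dimnames = list(NULL, c("HAM1", "SP500 TR", "US 3m TR")))

# Column names get sanitized, but the caller still uses the original name
colnames(ret) <- make.names(colnames(ret))
mkt.name <- "SP500 TR"

failure <- tryCatch(ret[, mkt.name], error = function(e) conditionMessage(e))
print(failure)

# Hypothetical helper: fail early with a message that names the missing columns
check_names <- function(data, wanted) {
  missing_cols <- setdiff(wanted, colnames(data))
  if (length(missing_cols) > 0) {
    stop("columns not found: ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Sanitizing the requested name the same way makes the lookup succeed
mkt.fixed <- make.names(mkt.name)
check_names(ret, mkt.fixed)
head(ret[, mkt.fixed])
```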

stocksCRSPweekly.rda

The current stocksCRSPweekly.rda file in the folder data_extra does not have all the components needed. See the yellow-highlighted text in the updated file FactorAnalytics Data 08 15 21.docx in the sandbox

FA attached packages and packages loaded via a namespace

Among the current FA attached packages, I tentatively believe that rrcov and robustbase should not be needed, at least not as attached packages. RobStatTM, however, should likely remain an attached package because its lmrobdetMM() function, among others, is needed as a robust regression option for fitTsfm and fitFfm. Consequently, Justin will remove rrcov and robustbase as attached packages and we will see what happens.

As for packages loaded via a namespace, there are currently 118, which seems way over the top. We need to review which of these we really need, and pare the list down as much as possible.
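
To make the distinction concrete, the base R sketch below lists which packages are attached (on the search path) and which are only loaded via a namespace in the current session. Running it right after library(FactorAnalytics) would be one way to reproduce a count like the one quoted above; the variable names are arbitrary.

```r
# Packages attached to the search path
attached <- .packages()

# All namespaces currently loaded, whether attached or not
loaded <- loadedNamespaces()

# Namespaces pulled in as dependencies without being attached
loaded_only <- setdiff(loaded, attached)

cat("attached:   ", length(attached), "\n")
cat("loaded only:", length(loaded_only), "\n")
print(sort(loaded_only))
```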

Plots not working in fitTsfm_vignette

plot(fit.sub, plot.single=TRUE, asset.name="HAM1")

Make a plot selection (or 0 to exit):

1: Actual and fitted asset returns (not working)
2: Actual vs fitted asset returns
3: Residuals vs fitted asset returns
4: Sqrt. of modified residuals vs fitted
5: Residuals with standard error bands (not working)
6: Time series of squared residuals (not working)
7: Time series of absolute residuals (not working)
8: SACF and PACF of residuals
9: SACF and PACF of squared residuals
10: SACF and PACF of absolute residuals
11: Non-parametric density of residuals with normal overlaid
12: Non-parametric density of residuals with skew-t overlaid
13: Histogram of residuals with non-parametric density and normal overlaid
14: QQ-plot of residuals
15: CUSUM test-Recursive residuals
16: CUSUM test-LS residuals
17: Recursive estimates (RE) test of LS regression coefficients
18: Rolling regression over a 24-period observation window
19: Asset returns vs factor returns (single factor model)

Install failed by LazyData: true

@JustinMShea As the title says, I had an install error when running devtools::install_github("braverock/FactorAnalytics"). (After the merge conflict in #41 is resolved.)

* installing *source* package ‘FactorAnalytics’ ...
** using staged installation
** R
** data
*** moving datasets to lazyload DB
Warning: file ‘StockReturns.rda’ has magic number 'versi'
  Use of save versions prior to 2 is deprecated
Error in load(zfile, envir = tmp_env) : 
  bad restore file magic number (file may be corrupted) -- no data loaded
ERROR: lazydata failed for package ‘FactorAnalytics’

Removing the LazyData: true in the DESCRIPTION allows the package to be installed, but I'm not sure if this will mess up the compression.

It's not really helpful, but a similar issue here for reference: carlanetto/M4comp2018#1.
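
A magic number of 'versi' is consistent with the .rda file being a Git LFS pointer file, which is plain text beginning with "version https://git-lfs.github.com/spec/v1", not the actual data. The sketch below checks the first bytes of a file for that signature. The is_lfs_pointer helper and the temporary files are assumptions for this example, and whether this is the cause in any given install would need to be verified against the files on disk.

```r
# Hypothetical helper: TRUE if the file starts with the bytes of "version"
is_lfs_pointer <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))
  header <- readBin(con, what = "raw", n = 7L)
  # Compare raw bytes; binary files may contain embedded NULs
  identical(header, charToRaw("version"))
}

# Demo on two temporary files: a fake pointer and a genuine .rda
pointer_file <- tempfile(fileext = ".rda")
writeLines(c("version https://git-lfs.github.com/spec/v1", "size 0"), pointer_file)

real_file <- tempfile(fileext = ".rda")
x <- 1:10
save(x, file = real_file)

pointer_result <- is_lfs_pointer(pointer_file)
real_result <- is_lfs_pointer(real_file)
print(c(pointer = pointer_result, real = real_result))
```

Applied over a package's data/ directory, for example with list.files(..., full.names = TRUE) and vapply, a check like this would flag files that were cloned without LFS content.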

migrate from testthat to tinytest

Much more stable, eliminates issues related to sprawling dependencies that have cropped up in the past with testthat.

https://github.com/markvanderloo/tinytest

Fix test issues related to these missing data -R CMD check ERROR
── Error (test-repExposures.R:3:1): (code run outside of test_that()) ────────
Error: object 'stocks145scores6' not found
── Error (test-repReturn.R:4:1): (code run outside of test_that()) ───────────
Error: object 'stocks145scores6' not found
── Error (test-tsPlotMP.R:4:1): (code run outside of test_that()) ────────────
Error: object 'stocks145scores6' not found

Error in data("factorDataSetDjia5Yrs")

Hi, I followed the steps to install "FactorAnalytics" and "ExpectedReturns".

However, when I run the following code chunk:
library(ExpectedReturns)
data("FF3.monthly")
data("factorDataSetDjia5Yrs")

"FF3.monthly" can be correctly loaded.
"factorDataSetDjia5Yrs" will give the following error.

Error in load(zfile, envir = tmp_env) : 
  bad restore file magic number (file may be corrupted) -- no data loaded
In addition: Warning message:
file ‘factorDataSetDjia5Yrs.rda’ has magic number 'versi'
  Use of save versions prior to 2 is deprecated

My environment is R = 4.1.0, RStudio-1.4.1717 (Windows 10).

Thanks and look forward.

fitSfm functions and data: to keep or not to keep in FA 1.0

This functionality is really very nice, as is Sangeetha's vignette, and the R script for the vignette seems to run without errors (maybe one warning). So it is highly desirable to leave the functionality in the FA package. The problem is that we need adequate sources for all of the following data:

StockReturns.rda which contains the 120 x 15 data frame r.M, and the 182 x 1618 data frame r.W, and TreasuryYields.rda, which contains only the 4714 x 11 xts object tr.yields.

Brian and Justin, it would be great if you could review the information in the data manual pages, and let Sangeetha and me know what is adequate, what is not adequate, and what Sangeetha needs to provide if we are going to keep the data.

-R CMD Check NOTE: checking R code for possible problems

plot.ffm: no visible binding for global variable ‘TICKER’
plot.ffm: no visible global function definition for ‘chart.Boxplot’
plot.ffm: no visible global function definition for ‘corrplot.mixed’
plot.tsfm: no visible global function definition for ‘addLegend’
plot.tsfm: no visible global function definition for ‘efp’
plot.tsfm: no visible global function definition for ‘rollapply’
plot.tsfm: no visible global function definition for ‘corrplot.mixed’

Undefined global functions or variables:
TICKER addLegend chart.Boxplot corrplot.mixed efp rollapply

fitFfm versus fitFfmDT

We need to decide whether or not to keep fitFfm, given that FactorAnalytics benefits greatly from Mido's addition of the fitFfmDT data.table version. Justin and I discussed this and believe that the best route would be to keep fitFfm by adding an optional argument for specifying whether or not the fitFfmDT version is used. My suggestion is the default DT = TRUE. We should discuss pros and cons with Mido, come to a consensus among the three of us, and then get Brian's thoughts.

Add Doug's "Fundamental Factor Models in FactorAnalytics" paper as a static vignette

For now, use static PDFs and follow Mark van der Loo's guidance here

http://www.markvanderloo.eu/yaRb/2019/01/11/add-a-static-pdf-vignette-to-an-r-package/

- R CMD Check WARNING: checking S3 generic/method consistency

convert:
function(x, ...)
convert.ffmSpec:
function(SpecObj, FitObj, RegStatsObj)

print:
function(x, ...)
print.ffmSpec:
function(SpecObj)

See section ‘Generic functions and methods’ in the ‘Writing R
Extensions’ manual.
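
The check expects every S3 method to accept the arguments of its generic, in the same order and including "..." when the generic has it. The sketch below shows conforming signatures on a toy class. The toySpec class, its fields, and the toy convert generic are invented for this example and only mirror the shape of the ffmSpec case.

```r
# Toy object standing in for a specification object
spec <- structure(list(asset.var = "TICKER", ret.var = "RETURN"),
                  class = "toySpec")

# print generic is function(x, ...), so the method starts with x and keeps ...
print.toySpec <- function(x, ...) {
  cat("toySpec with asset.var =", x$asset.var, "and ret.var =", x$ret.var, "\n")
  invisible(x)
}

# Toy generic with the same signature as reported for 'convert'
convert <- function(x, ...) UseMethod("convert")

# Extra arguments are allowed, but x comes first and ... is retained
convert.toySpec <- function(x, FitObj = NULL, RegStatsObj = NULL, ...) {
  list(spec = unclass(x), fit = FitObj, stats = RegStatsObj)
}

print(spec)
str(convert(spec))
```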

-R CMD Check NOTE: 'LazyDataCompression' is specified without 'LazyData'

Check CRSP data transformations

Check all data transforms in the sandbox/CRSP folder. Switch transforms to xts and related functions if needed in freqChange.R. Recreate the data sets that were created and marked with "Dan's code" if needed. These include:

Risk-free montly.csv
risk-free weekly.csv

Problem with vignette causing compile errors

The problem is due to a change in how data.table inherits data.frame features.

fitSfm function, the Sfm vignette and data used

fitSfm and the vignette that Sangeetha wrote are very nice, and it will be great to have them in FA. The fitSfm vignette currently uses the following data sets:
StockReturns
r.M originally from Berndt (1991)
r.W source S+FinMetrics Berndt.dat & folio.dat

TreasuryYields
tr.yields text file from companion web site to Ruppert (2010)

We need to replace the above data sets with ones we are free to use. For the r.M (monthly) and r.W (weekly) stock data, we can use stocksCRSP, and we can replace tr.yields with identical or similar series from FRED. This requires revising the vignette and likely the examples used for other functions. I think we should proceed using those data sets to see if anything is currently broken. If nothing is broken, or the broken things are very easily fixed, and there is no build error or warning, then we are in good shape: we only have to replace the data sets, with the attendant revision of the fitSfm vignette and the various examples that use the data, and it will be justified to make that effort.

Move all import and importFrom from zzz.R to relevant .R function files

Related to #56. This will give us an opportunity to review all imports, and decide what is best moved to "Suggests", "Imports", or "Depends" in the DESCRIPTION file.

fitFfm errors out when data object contains a single TICKER

Examples removed from fmXXDecomp.R manual pages, and saved for future replacement

Manual page examples to be removed from factor model risk decomposition
functions for now, due to removal of data sets for the examples, and saved
here for adding back once new data sets are provided for these examples

fmSdDecomp.R:

Statistical Factor Model:

data(StockReturns)
sfm.pca.fit <- fitSfm(r.M, k=2)
decomp <- fmSdDecomp(sfm.pca.fit)
decomp$pcSd

fmVaRDecomp.R:

Statistical Factor Model:

data(StockReturns)
sfm.pca.fit <- fitSfm(r.M, k=2)
VaR.decomp <- fmVaRDecomp(sfm.pca.fit, type="normal")
VaR.decomp$cVaR

Fundamental Factor Model:

data(Stocks.df)
exposure.vars <- c("BOOK2MARKET", "LOG.MARKETCAP")
fit <- fitFfm(data=stock, asset.var="TICKER", ret.var="RETURN",
date.var="DATE", exposure.vars=exposure.vars)
VaR.decomp <- fmVaRDecomp(fit, type="normal")
VaR.decomp$cVaR

fmESDecomp.R:

Statistical Factor Model:

data(StockReturns)
sfm.pca.fit <- fitSfm(r.M, k=2)
ES.decomp <- fmEsDecomp(sfm.pca.fit, type="normal")
ES.decomp$cES

Fundamental Factor Model:

data(Stocks.df)
exposure.vars <- c("BOOK2MARKET", "LOG.MARKETCAP")
fit <- fitFfm(data=stock, asset.var="TICKER", ret.var="RETURN",
date.var="DATE", exposure.vars=exposure.vars)
ES.decomp <- fmEsDecomp(fit, type="normal")
head(ES.decomp$cES)

fmCov.R

Statistical Factor Model:

data(StockReturns)
sfm.pca.fit <- fitSfm(r.M, k=2)
fmCov(sfm.pca.fit)

Fundamental factor Model:

data(Stocks.df)
exposure.vars <- c("BOOK2MARKET", "LOG.MARKETCAP", "GICS.SECTOR")
fit2 <- fitFfm(data=stock, asset.var="TICKER", ret.var="RETURN",
date.var="DATE", exposure.vars=exposure.vars)
fmCov(fit2)

summary.ffm

#' data(Stocks.df)
#' exposure.vars <- c("BOOK2MARKET", "LOG.MARKETCAP", "GICS.SECTOR")
#' fit2 <- fitFfm(data=stock, asset.var="TICKER", ret.var="RETURN",
#' date.var="DATE", exposure.vars=exposure.vars)
#'
#' # summary of factor returns estimated in each time period
#' summary(fit2)
#'
#' # summary of lm fit for a single period
#' summary(fit2$factor.fit[[1]])

-R CMD Check errors for datasets "managers.rda"

Running grep -rnw managers R/*.R returns the following

R/cusumActMgr.R:9:#' Assessing the performance of the active managers is hard because active returns (i.e. portfolio return - benchmark return) are noisy.
R/cusumActMgr.R:11:#' the standard deviation of active returns. Empirical studies of active managers across a wide range of asset classes suggest that an
R/cusumActMgr.R:14:#' inefficiencies sometimes appear in their place. Consequently, the majority of active managers deliver active returns and IR close to 0, and even
R/fitTsfm.control.R:168:#' data(managers)
R/fitTsfm.control.R:169:#' fit <- fitTsfm(asset.names=colnames(managers[,(1:6)]),
R/fitTsfm.control.R:170:#' factor.names=colnames(managers[,(7:9)]),
R/fitTsfm.control.R:171:#' data=managers, variable.selection="subsets",
R/fitTsfm.R:126:#' data(managers)
R/fitTsfm.R:127:#' colnames(managers) = gsub(" ",'.',colnames(managers))
R/fitTsfm.R:129:#' fit <- fitTsfm(asset.names=colnames(managers[,(1:6)]),
R/fitTsfm.R:130:#' factor.names=colnames(managers[,(7:9)]), data=managers)
R/fitTsfm.R:141:#' fit.sub <- fitTsfm(asset.names=colnames(managers[,(1:6)]),
R/fitTsfm.R:142:#' factor.names=colnames(managers[,(7:9)]),
R/fitTsfm.R:143:#' data=managers, variable.selection="subsets",
R/fitTsfm.R:147:#' fit.lar <- fitTsfm(asset.names=colnames(managers[,(1:6)]),
R/fitTsfm.R:148:#' factor.names=colnames(managers[,(7:9)]),
R/fitTsfm.R:149:#' rf.name="US.3m.TR", data=managers,
R/fmCov.R:50:#' data(managers)
R/fmCov.R:51:#' fit <- fitTsfm(asset.names=colnames(managers[, (1:6)]),
R/fmCov.R:52:#' factor.names=c("EDHEC.LS.EQ","SP500.TR"), data=managers)
R/fmEsDecomp.R:67:#' data(managers)
R/fmEsDecomp.R:68:#' fit.macro <- fitTsfm(asset.names=colnames(managers[,(1:6)]),
R/fmEsDecomp.R:69:#' factor.names=colnames(managers[,(7:8)]), data=managers)
R/fmmcSemiParam.R:61:#' data(managers)
R/fmmcSemiParam.R:62:#' fit <- fitTsfm(asset.names=colnames(managers[,(1:6)]),
R/fmmcSemiParam.R:63:#' factor.names=colnames(managers[,(7:9)]), data=managers)
R/fmmcSemiParam.R:67:#' fmmc.returns <- fmmcSemiParam(factor.ret=managers[,(7:9)], beta=fit$beta,
R/fmmcSemiParam.R:73:#' rownames(resid.par) <- colnames(managers[,(1:6)])
R/fmmcSemiParam.R:74:#' fmmc.returns.CF <- fmmcSemiParam(factor.ret=managers[,(7:9)], beta=fit$beta,
R/fmmcSemiParam.R:81:#' rownames(resid.par) <- colnames(managers[,(1:6)])
R/fmmcSemiParam.R:82:#' fmmc.returns.skewt <- fmmcSemiParam(factor.ret=managers[,(7:9)],
R/fmSdDecomp.R:58:#' data(managers)
R/fmSdDecomp.R:59:#' fit.macro <- fitTsfm(asset.names=colnames(managers[,(1:6)]),
R/fmSdDecomp.R:60:#' factor.names=colnames(managers[,(7:9)]),
R/fmSdDecomp.R:61:#' rf.name="US.3m.TR", data=managers)
R/fmVaRDecomp.R:66:#' data(managers)
R/fmVaRDecomp.R:67:#' fit.macro <- fitTsfm(asset.names=colnames(managers[,(1:6)]),
R/fmVaRDecomp.R:68:#' factor.names=colnames(managers[,(7:8)]), data=managers)
R/portEsDecomp.R:62:#' data(managers)
R/portEsDecomp.R:63:#' fit.macro <- FactorAnalytics::fitTsfm(asset.names=colnames(managers[,(1:6)]),
R/portEsDecomp.R:64:#' factor.names=colnames(managers[,(7:9)]),
R/portEsDecomp.R:65:#' rf.name=colnames(managers[,10]), data=managers)
R/portEsDecomp.R:73:#' names(wts) <- colnames(managers)[1:6]
R/portSdDecomp.R:56:#' data(managers)
R/portSdDecomp.R:57:#' fit.macro <- FactorAnalytics::fitTsfm(asset.names=colnames(managers[,(1:6)]),
R/portSdDecomp.R:58:#' factor.names=colnames(managers[,(7:9)]),
R/portSdDecomp.R:59:#' rf.name=colnames(managers[,10]), data=managers)
R/portSdDecomp.R:67:#' names(wts) <- colnames(managers)[1:6]
R/portVaRDecomp.R:62:#' data(managers)
R/portVaRDecomp.R:63:#' fit.macro <- FactorAnalytics::fitTsfm(asset.names=colnames(manager[,(1:6)]),
R/portVaRDecomp.R:64:#' factor.names=colnames(managers[,(7:9)]),
R/portVaRDecomp.R:65:#' rf.name=colnames(managers[,10]), data=manager)
R/portVaRDecomp.R:73:#' names(wts) <- colnames(managers)[1:6]
R/portVolDecomp.R:35:#' data(managers)
R/portVolDecomp.R:36:#' fit.macro <- FactorAnalytics::fitTsfm(asset.names=colnames(manager[,(1:6)]),
R/portVolDecomp.R:37:#' factor.names=colnames(managers[,(7:9)]),
R/portVolDecomp.R:38:#' rf.name=colnames(managers[,10]), data=manager)
R/repRisk.R:68:#' data(managers)
R/repRisk.R:69:#' fit.macro <- FactorAnalytics::fitTsfm(asset.names=colnames(managers[,(1:6)]),
R/repRisk.R:70:#' factor.names=colnames(managers[,(7:9)]),
R/repRisk.R:71:#' rf.name=colnames(managers[,10]), data=managers)
R/riskDecomp.R:52:#' data(managers)
R/riskDecomp.R:53:#' fit.macro <- FactorAnalytics::fitTsfm(asset.names=colnames(managers[,(1:6)]),
R/riskDecomp.R:54:#' factor.names=colnames(managers[,(7:9)]),
R/riskDecomp.R:55:#' rf.name=colnames(managers[,10]), data=managers)
R/riskDecomp.R:64:#' names(wts) <- colnames(managers)[1:6]

stocksCRSP$GICS data issues

Conversations with Doug Martin indicate the following minor fixes:

Transform GICS from factor to character
Replace missing values coded as "-" with NA
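
A minimal base R sketch of the two fixes, assuming that missing GICS codes are stored as the string "-". The toy data frame, tickers, and codes are made up for illustration and are not taken from stocksCRSP.

```r
# Toy stand-in for the relevant columns; values are illustrative only
stocks <- data.frame(
  TICKER = c("AAA", "BBB", "CCC"),
  GICS   = factor(c("45", "-", "20")),
  stringsAsFactors = FALSE
)

# Fix 1: factor to character (as.character returns the labels, not the integer codes)
stocks$GICS <- as.character(stocks$GICS)

# Fix 2: recode the "-" placeholder as a proper missing value
stocks$GICS[stocks$GICS == "-"] <- NA_character_

str(stocks)
```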

Update functions in fitFfm_vignette_04_09.Rnw

Replace fitFfm with fitFfmDT. See 'mido_versionOfSangeetha_fitFfmVignette.R' written by @midoshammaa

-R CMD Check: NOTE installed size is 8.9Mb

We have reduced the main data/ folder to half its size, but we need to get it under 5 Mb, or lobby for an exception.

sub-directories of 1Mb or more:
data 7.0Mb

fitTsfmLagLeadBeta, fitTsfmUpDn, fitTsfmMT

Justin and I discussed that these functions should be options in the function fitTsfm.
Probably all three of these functions can go there as is, but we should check the option arguments of each, and how we pass them, as an immediate priority.

Also, fitTsfmLagLeadBeta currently implements the Dimson method without the correction for bias due to serial correlation in the market returns. That correction needs to be added as an option, and the thought is to further enhance this thin-trading beta capability to include the Scholes-Williams method.
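
For readers unfamiliar with the Dimson approach, the sketch below shows its core idea in base R: regress asset returns on contemporaneous and lagged market returns and sum the slope coefficients. It uses simulated data with one lag and omits both lead terms and the serial-correlation correction mentioned above; it is not the fitTsfmLagLeadBeta implementation.

```r
set.seed(42)
n <- 200

# Simulated market returns and their one-period lag
mkt  <- rnorm(n, sd = 0.04)
lag1 <- c(NA, mkt[-n])

# Thinly traded asset: reacts partly to this period's market move, partly to last period's
asset <- 0.6 * mkt + 0.4 * lag1 + rnorm(n, sd = 0.01)

# Dimson-style regression; lm() drops the first row, where the lag is NA
fit <- lm(asset ~ mkt + lag1)
dimson_beta <- sum(coef(fit)[c("mkt", "lag1")])

# Naive beta from the contemporaneous market return only
naive_beta <- coef(lm(asset ~ mkt))[["mkt"]]

print(c(naive = naive_beta, dimson = dimson_beta))
```

In this setup the naive beta reflects only the contemporaneous response, while the summed coefficients recover the total market sensitivity built into the simulation.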
