The MatchIt package gets very slow when I add fixed effects. Are there any ways to mak

very slow when using fixed effects about matchit HOT 10 CLOSED

kosukeimai commented on July 28, 2024

very slow when using fixed effects

from matchit.

Comments (10)

ngreifer commented on July 28, 2024

Hello,

Can you be more specific about what functions in MatchIt are running slow? I can think of a few reasons why certain functions might run slow and some ways to speed them up, but without further information, there is little I can do.

If you are estimating propensity scores within matchit(), then the estimation of the propensity scores could be slow due to the fixed effects. By default, the estimation of propensity scores is performed by glm(), so it may be that glm() is running slowly. One way to get around this would be to estimate the propensity scores outside matchit() using a package specifically designed to handle fixed effects quickly, such as the fixest package, and then supply those propensity scores to matchit() with the distance argument. For example, if your fixed effect variable is called cl (i.e., for cluster), you could run the following:

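library(MatchIt)
# Estimate the propensity score with cl absorbed as a fixed effect
fefit <- fixest::feglm(treat ~ X1 + X2 | cl, data = data, family = binomial)
ps <- fitted(fefit)  # fitted probabilities on the response scale
# Supply the pre-estimated propensity scores through the distance argument
m.out <- matchit(treat ~ X1 + X2, data = data, distance = ps)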

Some other propensity score estimation methods, such as cbps, may simply be incompatible with fixed effects, so you should avoid using them.

If you are performing Mahalanobis distance or genetic matching, matchit() may need to invert and multiply huge matrices if there are lots of fixed effects and many units. This cannot be avoided except by excluding the fixed effects from the calculation of the Mahalanobis distance.
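As a rough sketch of what excluding the fixed effects looks like, the example below leaves the cluster variable out of the formula so that only the covariates enter the Mahalanobis distance. The data are simulated and the names data, cl, X1, X2, and treat are hypothetical stand-ins for your own.

```r
library(MatchIt)

# Simulated data (hypothetical): 20 clusters and two continuous covariates
set.seed(1)
n <- 400
data <- data.frame(
  cl = factor(sample(1:20, n, replace = TRUE)),
  X1 = rnorm(n),
  X2 = rnorm(n)
)
data$treat <- rbinom(n, 1, plogis(-1 + 0.5 * data$X1 - 0.3 * data$X2))

# cl is left out of the formula, so the Mahalanobis distance is
# computed from X1 and X2 only
m.maha <- matchit(treat ~ X1 + X2, data = data,
                  method = "nearest", distance = "mahalanobis")
m.maha
```

If matches should still stay within clusters, exact = ~cl can be added to the call without putting cl into the distance calculation.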

If summary() is running slow after including fixed effects in the matchit() model formula, that is because summary() needs to compute balance on every fixed effect individually, which can take a long time. You can avoid this by using the first method I recommended so that the fixed effects are included in the propensity score but not in the matchit() object, or by using cobalt to assess balance instead of MatchIt since cobalt offers finer control of which covariates are included.
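A minimal sketch of the cobalt route, assuming simulated data with hypothetical names (data, cl, X1, X2, treat): the fixed effects stay in the matchit() formula, but balance is requested only for the covariates of interest by passing a separate formula and the matching weights to bal.tab().

```r
library(MatchIt)
library(cobalt)

# Simulated data (hypothetical): 10 clusters and two continuous covariates
set.seed(2)
n <- 400
data <- data.frame(
  cl = factor(sample(1:10, n, replace = TRUE)),
  X1 = rnorm(n),
  X2 = rnorm(n)
)
data$treat <- rbinom(n, 1, plogis(-1 + 0.5 * data$X1 - 0.3 * data$X2))

# The fixed effects enter the propensity score model through the formula
m.out <- matchit(treat ~ X1 + X2 + cl, data = data)

# Assess balance on X1 and X2 only, using the matching weights,
# so no balance statistics are computed for the levels of cl
b <- bal.tab(treat ~ X1 + X2, data = data,
             weights = m.out$weights, method = "matching")
b
```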

Let me know if any of this helped, or please provide more detail so I can better address the problem.

Noah

kosukeimai commented on July 28, 2024

I would suggest exact matching, that is, matching within the groups that define the fixed effects. The idea of fixed effects is basically within-group comparison, and matching exactly on groups is usually a better strategy. This paper and this one examine the equivalence (or lack thereof) between fixed effects and matching. The first paper is about one-way fixed effects, while the second is about two-way fixed effects.

waynelapierre commented on July 28, 2024

Thanks so much! The fixest method fixed my problem! I have a follow-up question: how can I specify that each treated observation's match must be in the same group (such as industry, year, etc.) and that the matched distance cannot be higher than 0.1? If some treated observations do not have a match that satisfies these requirements, they should be dropped from the treated group.

ngreifer commented on July 28, 2024

Use the exact argument to request exact matching on those characteristics, i.e., exact = ~industry+year. This ensures that each treated unit's match is within the same industry and year. Use the caliper argument to restrict the distance between matches. By default, the caliper is in standard deviation units of the distance measure (i.e., propensity score). Use the std.caliper argument to control whether the caliper should be in raw units. For example, caliper = .1, std.caliper = FALSE ensures that each treated unit's match has a propensity score within .1 of the treated unit's propensity score. You can also place calipers on individual covariates in addition to the propensity score. Any treated units that don't have matches that satisfy the exact and caliper restrictions will be dropped.
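Put together, a call along these lines might look like the sketch below. The data are simulated, the names data, industry, year, X1, X2, and treat are hypothetical, and the propensity score is estimated with glm() only to keep the example self-contained.

```r
library(MatchIt)

# Simulated data (hypothetical): 3 industries x 3 years, two covariates
set.seed(3)
n <- 2000
data <- data.frame(
  industry = sample(c("A", "B", "C"), n, replace = TRUE),
  year = sample(2018:2020, n, replace = TRUE),
  X1 = rnorm(n),
  X2 = rnorm(n)
)
data$treat <- rbinom(n, 1, plogis(-1.5 + 0.5 * data$X1 - 0.3 * data$X2))

# Pre-estimated propensity scores, one per row of data
ps <- as.numeric(fitted(glm(treat ~ X1 + X2 + industry + factor(year),
                            data = data, family = binomial)))

# Match within industry-year cells, with a raw-unit caliper of 0.1
# on the propensity score
m.out <- matchit(treat ~ X1 + X2, data = data, distance = ps,
                 exact = ~ industry + year,
                 caliper = 0.1, std.caliper = FALSE)

# Check the result: the propensity score gap within each matched pair
md <- match.data(m.out)
gap <- tapply(md$distance, md$subclass, function(d) diff(range(d)))
max(gap)
```

Treated units with no control in the same industry-year cell within the caliper are left unmatched, so they do not appear in the output of match.data().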

waynelapierre commented on July 28, 2024

Thanks so much. I just want to make sure that the variables supplied to the exact argument do not have to be among the variables in the matching formula. For example, matchit(y ~ x, exact = ~z + h, data = data) should work.

ngreifer commented on July 28, 2024

My understanding is that you aren't using any variables for matching except the propensity score, which is supplied to distance. The variables in the main formula are used solely for balance checking with summary() but will not affect the match if you provide already-estimated propensity scores to the distance argument (unless you're using genetic matching).

The variables in exact and caliper just need to be in the dataset supplied to data and don't need to be specified anywhere else, so that example you provided should work fine as long as z and h are in data.

waynelapierre commented on July 28, 2024

OK. Thanks for the clarification.

waynelapierre commented on July 28, 2024

It seems that when I supply matchit()'s distance argument with the fitted values from a feglm model, setting caliper to 0.1 and std.caliper to FALSE does not drop the matched observations with a distance higher than 0.1. Is this a bug?

ngreifer commented on July 28, 2024

You need to provide more information for me to help you. Please provide your code and the results that you think are in error and I can try to assess.

waynelapierre commented on July 28, 2024

My bad, I mistyped it. Thanks again for the great package and help!
