
CART

Data source: Data were analysed from the Pathways of Care Longitudinal Study (POCLS), a prospective longitudinal cohort of 4126 children aged 0-17 years who entered the out-of-home care (OOHC) system (i.e., on interim care and protection orders) in NSW, Australia, between 2010 and 2011 (the POCLS population cohort).

Statistical analysis: CART analyses were conducted to identify subgroups of children with similar levels of socio-emotional difficulties. First, we calculated frequencies and percentages to describe potential factors related to socio-emotional difficulties of children in OOHC in four groups: demographic characteristics, pre-care maltreatment, placement experiences and caregiver-related characteristics. Second, we conducted bivariable logistic regression analyses to assess the association, measured as crude odds ratios (ORs) with 95% confidence intervals (CIs), between children's socio-emotional difficulties (i.e., "clinical" vs. "non-clinical") and individual risk factors in these four groups.

Finally, we conducted the CART analyses (Breiman, Friedman, Stone, & Olshen, 1984) to identify subgroups of children with similar levels of socio-emotional difficulties. We retained age at the interview and removed age at first entry into OOHC and duration in care from the model to avoid multicollinearity. The CART method applies if-then splitting rules, partitioning a sample into progressively smaller but increasingly homogeneous subsets with regard to a given outcome, in this case socio-emotional difficulties (Morgan, 2014). The Gini index (Gini impurity) was used to measure node impurity, that is, the probability that a randomly chosen observation in a node would be incorrectly classified if it were labelled according to the class distribution in that node (Timofeev, 2004). The algorithm starts by selecting the variable (root node) that provides the most efficient pathway to a decision (i.e., minimises the error measures) and then splits the dataset into sub-groups. This process is applied recursively until the sub-groups reach a minimum size (a minimum node size of 10 is commonly recommended) or until no further improvement in explaining the variation in the outcome can be achieved by adding more factors (Therneau & Atkinson, 1997). At each partitioning, the node where the sample is split is referred to as the "parent node", and the nodes into which the sample is further divided are called the "child nodes". The final node, referred to as a "leaf", gives the prevalence of the outcome (Morgan, 2014; Timofeev, 2004). For each split, the CART process accommodates missing values and retains variables based on the impurity measures (Hayes, Usami, Jacobucci, & McArdle, 2015; Therneau & Atkinson, 1997).

We randomly split the sample into a training set (80% of the full sample; n = 1230) and a validation set (the remaining 20%) (Vabalas, Gowen, Poliakoff, & Casson, 2019). A cross-validation procedure was used to assess the classification accuracy of the model by repeatedly refitting the tree on subsets of the training sample. The accuracy rate, measured as the area under the curve (AUC), i.e., the probability of correctly distinguishing a child with clinical socio-emotional difficulties from a child without clinical socio-emotional difficulties in a randomly selected pair of children, together with sensitivity (true positive rate: the ability to correctly identify children who had clinical socio-emotional difficulties) and specificity (true negative rate: the ability to correctly identify children who had no clinical socio-emotional difficulties), was calculated to evaluate the model's classification performance.
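As an illustration of the second step, the sketch below shows how crude ORs and 95% CIs for a single candidate factor could be obtained from a bivariable logistic regression. It is a minimal sketch in Python using statsmodels, not the study code; the variable names (df, clinical, placement_stability) are hypothetical placeholders rather than the actual POCLS variables.

```python
# Minimal sketch (not the study code): crude ORs with 95% CIs from a
# bivariable logistic regression, one candidate risk factor at a time.
# Variable names are hypothetical placeholders.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def crude_or(df: pd.DataFrame, outcome: str, factor: str) -> pd.DataFrame:
    """Fit outcome ~ factor and return the crude OR with its 95% CI."""
    fit = smf.logit(f"{outcome} ~ {factor}", data=df).fit(disp=0)
    params = fit.params.drop("Intercept")
    ci = fit.conf_int().drop("Intercept")
    return pd.DataFrame({
        "OR": np.exp(params),
        "CI_lower": np.exp(ci[0]),
        "CI_upper": np.exp(ci[1]),
    })

# Example call with a hypothetical categorical factor:
# crude_or(df, "clinical", "C(placement_stability)")
```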

Due to the sensitivity of the data, this script does not contain the data.
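Because the data cannot be shared, the following sketch illustrates the CART workflow described above (80/20 train-validation split, Gini-based splitting with a minimum leaf size of 10, cross-validation, and AUC, sensitivity and specificity on the held-out sample) on placeholder inputs. It uses scikit-learn's DecisionTreeClassifier as a stand-in for the CART implementation; the feature matrix X and binary outcome y are assumed, and, unlike the procedure described above, the sketch assumes complete cases rather than handling missing values within the splits.

```python
# Minimal sketch (not the study code) of the CART workflow described above.
# X: table of candidate factors, y: binary outcome (1 = "clinical").
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import roc_auc_score, confusion_matrix

def run_cart(X, y, seed=2010):
    # 80% training / 20% validation split, stratified on the outcome.
    X_train, X_valid, y_train, y_valid = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )

    # CART-style classification tree: Gini impurity, minimum leaf size of 10.
    tree = DecisionTreeClassifier(
        criterion="gini", min_samples_leaf=10, random_state=seed
    )

    # Cross-validated AUC on the training set (repeated tree fitting).
    cv_auc = cross_val_score(tree, X_train, y_train, cv=10, scoring="roc_auc")

    # Refit on the full training set and evaluate on the held-out 20%.
    tree.fit(X_train, y_train)
    prob_valid = tree.predict_proba(X_valid)[:, 1]
    auc = roc_auc_score(y_valid, prob_valid)

    tn, fp, fn, tp = confusion_matrix(y_valid, tree.predict(X_valid)).ravel()
    sensitivity = tp / (tp + fn)  # correctly identified "clinical" children
    specificity = tn / (tn + fp)  # correctly identified "non-clinical" children

    return {"cv_auc": cv_auc.mean(), "auc": auc,
            "sensitivity": sensitivity, "specificity": specificity}
```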

