
sentibank's People

Contributors

nick-sh-oh

Forkers

quantdev05

sentibank's Issues

Documentation for sentibank

Hi,

Thanks for the last reply. I was also wondering whether there is any documentation for sentibank, particularly for the processed dictionaries.

Improving SO-CAL and VADER

Hi,

Thank you for providing this open resource. I wanted to ask if it would be possible to expand the lexicons in sentiment dictionaries like SO-CAL and VADER by considering their custom algorithms. For example, could we expand existing lexicon entries by adding common negators (e.g. "not") or intensifiers (e.g. "very") in front? This could transform entries like "great" into variations such as "not great" or "really great."

I read your documentation on enhancing resources like the Dictionary of Affective Language (DAL) and Norms of Valence, Arousal and Dominance (NoVAD). Since you reflected authors' discussion points to improve these lexicons, I was wondering if similar enhancement techniques could be applied to VADER and SO-CAL as well.
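To make the question concrete, here is a minimal sketch of the kind of expansion described above. It is not how VADER or SO-CAL handle negation or intensification internally; the modifier lists and multipliers below are hypothetical values chosen only for illustration.

```python
# Hypothetical modifier tables: the multipliers are illustrative assumptions,
# not values taken from VADER or SO-CAL.
NEGATORS = {"not": -0.5}                        # flips and dampens the score
INTENSIFIERS = {"very": 1.25, "really": 1.15}   # amplifies the score


def expand_lexicon(lexicon):
    """Return a copy of `lexicon` with "<modifier> <word>" entries added."""
    modifiers = {**NEGATORS, **INTENSIFIERS}
    expanded = dict(lexicon)
    for word, score in lexicon.items():
        for modifier, factor in modifiers.items():
            # setdefault keeps any entry that already exists in the lexicon
            expanded.setdefault(f"{modifier} {word}", round(score * factor, 3))
    return expanded


base = {"great": 3.0, "bad": -2.5}
expanded = expand_lexicon(base)
```

With these assumed multipliers, "great" yields the extra entries "not great", "very great" and "really great", each with a rescaled score.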

Thanks!

Econ/Finance Specific Sentiment Dictionary

Hello,

I appreciate the invaluable resource you've provided; it has significantly streamlined my research process.

My current focus revolves around sentiment analysis of microblogs related to economics and finance. Considering this, I'm curious if there are any plans to incorporate dictionaries specific to these domains in the future.

Thank you again for your assistance!

Sentiment Dictionary for Organisational Culture & Environment

While collecting and processing the dictionaries, we realised that most existing sentiment dictionaries apply to either the general domain (e.g. VADER) or the financial domain (e.g. MASTER). Other than in political science (e.g. the Manifesto Corpus), we found no existing sentiment dictionary for other social domains. But as Loughran and McDonald (2011) commented, 'words have many meanings, and a word categorisation scheme derived for one discipline might not translate effectively into (other) discipline'.

We propose building a sentiment dictionary that measures sentiment in textual data relevant to organisational culture and environment. Here is a brief research design sketch:

1. Data Collection:

  • Subreddits about Organisational Culture & Environment
    [r/workingmoms, r/antiwork, r/WorkReform, r/Work, r/union, r/StrikeAction, …]
  • Glassdoor employee reviews of S&P 500 companies from 2018 to 2022.

2. Filtering:

  • N-grams (incl. unigrams) that appear in at least 5% of the sampled data (Loughran and McDonald, 2011; Bodnaruk, Loughran and McDonald, 2015)
  • Constructing a core dictionary inspired by established sources (Strapparava and Valitutti, 2004; Hutto and Gilbert, 2014). For the organisational culture domain, we can draw on dictionaries such as:
    (i) Lasswell Value Dictionary - Sociological classification of language into four deference domains - ‘power’, ‘rectitude’, ‘respect’ and ‘affiliation’ - and four welfare domains - ‘wealth’, ‘well-being’, ‘enlightenment’ and ‘skill’. Provides resources for understanding values and motivations;
    (ii) VADER - Western-style emoticons (e.g. “D=”) and emojis, acronyms and initialisms (e.g. “AITA”, “WTF”);
    (iii) Harvard IV-4 - Particularly ‘Social Relations’ (words about social roles, groups and interactions), ‘Communication’ (words related to communication modes), ‘Motivation’ (words related to needs, goals and achievement).
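The frequency filter in the first bullet could be sketched as follows. This assumes the 5% threshold refers to the share of documents (posts or reviews) containing the n-gram, and it uses naive whitespace tokenisation; the function names and the `max_n` parameter are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def frequent_ngrams(documents, max_n=3, min_share=0.05):
    """Keep n-grams (n = 1..max_n) found in at least `min_share` of documents."""
    doc_freq = Counter()
    for doc in documents:
        tokens = doc.lower().split()  # naive tokenisation, an assumption
        seen = set()
        for n in range(1, max_n + 1):
            seen.update(ngrams(tokens, n))
        doc_freq.update(seen)         # count each n-gram once per document
    threshold = min_share * len(documents)
    return {gram for gram, count in doc_freq.items() if count >= threshold}


sample = [
    "promote personal growth",
    "toxic management",
    "promote personal growth now",
    "great team",
]
# A 50% share is used here only because the toy sample is tiny.
kept = frequent_ngrams(sample, min_share=0.5)
```

Counting document frequency rather than raw occurrences keeps a single long post from pushing an n-gram over the threshold on its own.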

3. Expanding Verb-forms: Suppose we filtered “promote personal growth” from the previous step. We consider variations of “promote” and expand n-grams by adding “encourage personal growth”, “advance personal growth”, “assist personal growth”, “aid personal growth”, and so on.

  • 2of12inf, a collection of word inflections (Loughran and McDonald, 2011)
  • WordNet synsets (Strapparava and Valitutti, 2004; Valitutti, Strapparava and Stock, 2004)
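A minimal sketch of this expansion step is shown below. The `VERB_VARIANTS` table is a hypothetical, hand-written stand-in for lookups against 2of12inf or WordNet synsets, so that the example runs without external data.

```python
# Hypothetical stand-in for 2of12inf / WordNet lookups: maps a verb to
# alternative verbs that may replace it in an n-gram.
VERB_VARIANTS = {
    "promote": ["encourage", "advance", "assist", "aid"],
}


def expand_verb_forms(ngram, variants=VERB_VARIANTS):
    """Return the n-gram followed by copies with each listed verb swapped in."""
    tokens = ngram.split()
    expanded = [ngram]
    for i, token in enumerate(tokens):
        for alternative in variants.get(token, []):
            expanded.append(" ".join(tokens[:i] + [alternative] + tokens[i + 1:]))
    return expanded


candidates = expand_verb_forms("promote personal growth")
```

The generated candidates would still need to pass the labelling step, since a synonym can shift the sentiment of the phrase.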

4. Labelling:

  • Sentiment Labelling: Wisdom-of-the-Crowd approach, in which multiple independent raters rate each lexicon entry on a scale from -X to +X (Hutto and Gilbert, 2014)
  • ESG Labelling: Sub-components of Social & Governance theme (following ISSB, https://www.ifrs.org/groups/international-sustainability-standards-board/#resources)
    S: (i) Human Capital - All aspects of human capital management including employment practices, talent development, safety, and the labour standards of suppliers; (ii) Product Liability - The potential for products to cause harm because of quality failures, safety failures, financial harm, privacy violations or data leaks, chemical harm, other health or demographic risk, and the potential benefits of responsible investment to improve product quality, safety, or impact; (iii) Stakeholder Opposition - Societal opposition to the company because of controversial sourcing techniques or locations, or other conflicts with local communities; (iv) Social Opportunities - The potential to benefit society by improving access to products.
    G: (i) Corporate Governance - Factors relating to the quality of corporate oversight, including the structure and composition of the board of directors, shareholder ownership structure and control, CEO pay practices, and accounting quality; (ii) Corporate Behaviour - Evidence of the ethical behaviour of the company, including anticompetitive practices, corruption, and tax shielding and transparency.
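For the sentiment-labelling step, aggregating the independent ratings might look like the sketch below. The screening rule (drop entries with a zero mean or with high disagreement between raters) is loosely modelled on the approach reported by Hutto and Gilbert (2014); the `max_std` cut-off of 2.5 and the sample ratings are assumptions for illustration.

```python
from statistics import mean, stdev


def aggregate_ratings(ratings, max_std=2.5):
    """Average per-entry ratings, dropping neutral or contested entries.

    `ratings` maps each lexicon entry to a list of at least two rater scores.
    """
    lexicon = {}
    for entry, scores in ratings.items():
        average = mean(scores)
        # Keep entries with a non-zero mean and limited rater disagreement.
        if average != 0 and stdev(scores) < max_std:
            lexicon[entry] = round(average, 2)
    return lexicon


# Hypothetical ratings on a -4..+4 scale from four raters.
raw = {
    "promote personal growth": [3, 2, 3, 2],
    "work": [-4, 4, -3, 3],
    "toxic management": [-3, -3, -2, -4],
}
labelled = aggregate_ratings(raw)
```

Here "work" is dropped because its ratings average to zero, while the other two entries keep their mean score.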
