
dna's Introduction

Discourse Network Analyzer (DNA)

The Java software Discourse Network Analyzer (DNA) is a qualitative content analysis tool with network export facilities. You import text files and annotate statements that persons or organizations make, and the program will return network matrices of actors connected by shared concepts.

  • Download the latest release of the software.

  • Annotate documents, such as newspaper articles or speeches, with statements of what actors say; then export network data.

  • You can use the stand-alone software visone (or any other network analysis software) for analyzing the resulting networks.

  • The software comes with an R package called rDNA for remote controlling DNA and for further ways of analyzing the networks.
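To make "actors connected by shared concepts" concrete, here is a minimal sketch of how such a matrix could be derived from coded statements. It is not DNA's actual export code: the `Statement` record, the class name, and the counting rule (one tie per concept on which two actors take the same stance) are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class CongruenceSketch {

    // Hypothetical, simplified statement: who said what, and with which stance.
    record Statement(String actor, String concept, boolean agreement) {}

    // Sorted distinct actor names; index i labels row and column i of the matrix.
    static List<String> actors(List<Statement> statements) {
        Set<String> names = new TreeSet<>();
        for (Statement s : statements) {
            names.add(s.actor());
        }
        return new ArrayList<>(names);
    }

    // Cell [i][j] counts the distinct (concept, agreement) pairs that actors i and j share.
    static int[][] congruence(List<Statement> statements) {
        List<String> names = actors(statements);
        List<Set<String>> stances = new ArrayList<>();
        for (String name : names) {
            Set<String> own = new HashSet<>();
            for (Statement s : statements) {
                if (s.actor().equals(name)) {
                    own.add(s.concept() + "|" + s.agreement());
                }
            }
            stances.add(own);
        }
        int n = names.size();
        int[][] matrix = new int[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                if (i == j) {
                    continue; // the diagonal stays at zero
                }
                Set<String> shared = new HashSet<>(stances.get(i));
                shared.retainAll(stances.get(j));
                matrix[i][j] = shared.size();
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        List<Statement> statements = List.of(
            new Statement("Greenpeace", "Emissions should be capped", true),
            new Statement("WWF", "Emissions should be capped", true),
            new Statement("OilCorp", "Emissions should be capped", false));
        System.out.println(actors(statements));
        System.out.println(Arrays.deepToString(congruence(statements)));
    }
}
```

In this toy example only the two actors who agree on the same concept end up connected; the actor who disagrees stays an isolate.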

DNA 3.0 was first released on 12 June 2022. It constitutes a major rewrite from the previous version DNA 2.0 beta 25. DNA 3 comes with many new features and improvements. The release page contains all the details (scroll to version 3.0.7 for the first DNA 3 release).

If you require the latest (non-release) version of the DNA jar file from GitHub, you can clone the git repository to your computer and execute ./gradlew build on your terminal or command line. This will build the jar file and store it in the build/ directory of the cloned repository. Alternatively, you can try to download the latest artifact from the build process under GitHub Actions by clicking on the latest build and scrolling down to "Artifacts". However, it is usually recommended to use the most recent release version.


rDNA 3.0: Connecting DNA to R

The R package rDNA connects DNA to R for data exchange and analysis.

Please note that the current version 3.0 does not have the full functionality of the old 2.0 version yet. It can create networks, but please use the old version for now if you require more complex data management and analysis functionality in R. It is possible to import DNA 2 data into DNA 3 at any point (but not the other way around). New R functions will be added in the future.

To install the new rDNA 3 directly from GitHub, try the following code in R:

# install.packages("remotes")
remotes::install_github("leifeld/dna/rDNA/rDNA@*release",
                        INSTALL_opts = "--no-multiarch")

Note that the package relies on rJava, which needs to be installed first.

Installation of the old rDNA 2.1.18

For data management, you may still want to use the old rDNA 2.1.18 with DNA 2.0 beta 25. You can install the package directly from GitHub as well. However, you will need to download the correct JAR file and store it either in your working directory or (recommended) in the library path of the installed R package in the "extdata" subdirectory. The following code can do this for you:

# install.packages("remotes")
remotes::install_github("leifeld/dna/rDNA@v2.1.18",
                        INSTALL_opts = "--no-multiarch")

# find out where to store the JAR file
dest <- paste0(dirname(system.file(".", package = "rDNA")),
               "/extdata/dna-2.0-beta25.jar")

# download JAR file and store in library path
u <- "https://github.com/leifeld/dna/releases/download/v2.0-beta.25/dna-2.0-beta25.jar"
download.file(url = u, destfile = dest, mode = "wb")

Documentation and community

  • This tutorial on YouTube describes installation of DNA, basic data coding, network export, and network analysis using visone. The video clip is 18 minutes long.


  • See the bibliography for several hundred publications and theses using discourse network analysis or the DNA software.

  • The introductory chapter (Leifeld 2017) in the Oxford Handbook of Political Networks is recommended as a primer (chapter; preprint).

  • The previous version of DNA and rDNA came with a detailed manual of more than 100 pages. It is outdated, but perhaps still useful.

  • If you have questions or want to report bugs, please create an issue in the issue tracker.

  • Join the DNA community on Matrix. Matrix is a chat protocol. It's similar to Slack, Discord, or WhatsApp, but without the corporate shackles. It's free, open-source, decentralised, and secure. We have set up a public space called #dna:yatrix.org with separate chat rooms for installation, research, and development. It's really easy to join: You first create an account on one of the many Matrix servers (we use and recommend yatrix.org), then download one of the many Matrix clients on your phone, computer, or the web (e.g., Element) to use the account with, and finally join #dna:yatrix.org. To simplify the process, you can just click on this invitation link for some sensible default choices. Make sure you join all four public rooms (you can mute their notifications as needed) and look at the rules in the #dna-welcome room upon arrival.

Support the project

Please consider contributing to the project by:

  • telling other people about the software,
  • citing our underlying research in your publications,
  • reporting or fixing issues, or
  • starting pull requests to contribute bug fixes or new functionality.

Some suggestions of new functionality you could add via pull requests:

  • Import filters for loading data from Nvivo, MaxQDA, and other software into DNA.
  • Export filters for exporting networks to Gephi and other network analysis software.
  • Analysis functions or unit tests for the rDNA package.
  • Publications for the bibliography.
  • Bug fixes.

dna's People

Contributors

brandenberger, elebarreras, jbgruber, krgaric, leifeld, rakandirbas, shrakulkarni, timhenrichsen


dna's Issues

Allow combinations of filters for statements

As of version 1.25, it is only possible to apply one statement filter at a time (e.g., only filter by category or organization or agreement, etc.). It should be possible, for example, to list all statements matching a certain organization, person, category AND agreement pattern at the same time.
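One way the requested AND-combination could work is to treat each filter as a predicate and chain them. The following is only a sketch under assumptions: the `Statement` record and the `filter` helper are hypothetical and do not reflect DNA's internal data model.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class StatementFilterSketch {

    // Hypothetical statement with the four variables named in the issue.
    record Statement(String person, String organization, String category, boolean agreement) {}

    // Keeps only the statements that satisfy every filter in the list (logical AND).
    static List<Statement> filter(List<Statement> statements,
                                  List<Predicate<Statement>> filters) {
        // Start from "accept everything" and AND each filter onto it.
        Predicate<Statement> combined = filters.stream().reduce(s -> true, Predicate::and);
        return statements.stream().filter(combined).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Statement> statements = List.of(
            new Statement("Smith", "EPA", "Cap and trade is the solution.", true),
            new Statement("Jones", "EPA", "Cap and trade is the solution.", false),
            new Statement("Miller", "API", "Cap and trade is the solution.", true));
        Predicate<Statement> byOrganization = s -> s.organization().equals("EPA");
        Predicate<Statement> byAgreement = Statement::agreement;
        System.out.println(filter(statements, List.of(byOrganization, byAgreement)));
    }
}
```

An empty filter list returns all statements, so the single-filter behaviour described in the issue is just the special case of a list with one element.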

First submitted: 2011-03-27

Add Pajek support

DNA 1.29: There should be an export function for Pajek .NET files. Exporting .DL files and converting them to .NET files in Ucinet does not work for non-Western languages because Ucinet does not support Unicode.
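As a rough illustration of what such an export function might produce, the sketch below writes an undirected, weighted one-mode matrix in the Pajek layout (`*Vertices` followed by `*Edges`) and saves it as UTF-8. The class and method names are hypothetical, and whether a particular Pajek version reads UTF-8 labels correctly is an assumption that would need to be checked.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class PajekSketch {

    // Builds the text of a Pajek file from vertex labels and a symmetric weight matrix.
    static String toPajek(String[] labels, double[][] matrix) {
        StringBuilder sb = new StringBuilder();
        sb.append("*Vertices ").append(labels.length).append("\n");
        for (int i = 0; i < labels.length; i++) {
            // Pajek vertex ids start at 1; embedded double quotes would break the label.
            String label = labels[i].replace("\"", "'");
            sb.append(i + 1).append(" \"").append(label).append("\"\n");
        }
        sb.append("*Edges\n");
        for (int i = 0; i < labels.length; i++) {
            for (int j = i + 1; j < labels.length; j++) { // upper triangle only
                if (matrix[i][j] != 0) {
                    sb.append(i + 1).append(" ").append(j + 1)
                      .append(" ").append(matrix[i][j]).append("\n");
                }
            }
        }
        return sb.toString();
    }

    // Writes the file as UTF-8 so that non-Western labels are not mangled on disk.
    static void write(Path file, String[] labels, double[][] matrix) throws IOException {
        Files.write(file, toPajek(labels, matrix).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String[] labels = {"Actor A", "Actor B", "Actor C"};
        double[][] matrix = {{0, 2, 0}, {2, 0, 1}, {0, 1, 0}};
        System.out.print(toPajek(labels, matrix));
    }
}
```

A directed network would use an `*Arcs` section and loop over the full matrix instead of the upper triangle.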

First submitted: 2012-08-07

Select actors by their type in the network export window

In the network export window, one should be able to select actors based on their type. For instance, there could be another exclude or include list, or a combo box. Make sure the feature is also implemented in rDNA.

First submitted: 2011-02-18

Show empty export options panel where appropriate

In the network export window, there is a panel for custom options. However, some algorithms do not have any custom options. But if these algorithms are activated, the custom options panel is not emptied.

First submitted: 2011-02-18

Quotation marks in person, organization and category tags

Problem: Users can enter quotation marks and possibly other characters that disturb some of the output export formats.

Solution: Forbid quotation marks in the person, organization and category field in the GUI.

First submitted: 2011-02-18

Statement popup windows cover text portions when activated from the sidebar

DNA 1.29: Statement popup windows are placed at a fixed position at the bottom of the text panel if they are activated from the statement sidebar. They sometimes cover the statement text, which is inconvenient. The popup window should be placed somewhere else where it does not cover the text, or the text should be centered on the page.

The issue was reported by Myanna Lahsen.

First submitted: 2012-06-04

Replace XML data format by an embedded database

Replacing the XML data format contained in the .dna files by a binary database like H2 or SQLite would have several advantages:

  1. speed

  2. possibly an undo/redo function

  3. possibly compression

  4. better data integrity due to foreign keys

  5. binary blobs can be integrated (possibly useful for future extensions)

  6. crash prevention: changes are directly saved to disk

  7. less redundancy due to foreign keys and queries.

First submitted: 2011-12-04

Check DNA files for strange symbols

When people use the XML import format to import massive loads of articles, there may be double spaces, escape sequences, question marks, quotation marks, angle brackets or other strange symbols which can corrupt the XML format or the import process. Avoid this by filtering out these characters whenever a file is opened.
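A possible shape for such a clean-up step is sketched below. It is an assumption-laden illustration, not the project's implementation: it drops code points that XML 1.0 does not allow, collapses runs of spaces, and escapes markup characters instead of deleting them, so ordinary punctuation in the article text survives. All names are hypothetical.

```java
class XmlCleanSketch {

    // True if the code point is allowed by the XML 1.0 "Char" production.
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Removes disallowed control characters and collapses repeated spaces.
    static String clean(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints()
            .filter(XmlCleanSketch::isValidXmlChar)
            .forEach(sb::appendCodePoint);
        return sb.toString().replaceAll(" {2,}", " ");
    }

    // Escapes characters that would otherwise be read as XML markup.
    static String escape(String text) {
        return text.replace("&", "&amp;")   // must come first
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        String raw = "He said:  \"CO2 < 350 ppm\" & left.";
        System.out.println(escape(clean(raw)));
    }
}
```

Escaping rather than removing keeps the character offsets of the visible text predictable, which matters if statement positions are stored as start and stop indices.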

First submitted: 2011-02-18

Improve handling of the bottom bar

The bottom bar is only accessible from the "Extras" menu. Switching between the different panels in the bottom bar is also only possible via the menu. There should be buttons inside the panels which can close the bottom bar or show the next panel.

First submitted: 2011-02-18

Automatically replace text in the New Article window

There is a text saying "(paste the contents of the article here by highlighting this text and replacing it using Ctrl-V)" in the New Article window. It would be nicer if this text was in light gray and if it disappeared automatically as soon as the text pane grabbed focus.

First submitted: 2011-02-18

ImportHTMLWebpageTag: error messages for xml-elements

In ImportHTMLWebpageTag.java / ImportHTML.java / ImportWebpage.java add error messages if the chosen xml-elements cannot be found in the corresponding document/webpage. This could be similar to the warning message that was added in 2.0 alpha 8 for the date extraction.

e.g.:
try {
    // look up the element chosen by the user; warn if the lookup fails
    String title = file.select(titleElement).text();
} catch (NullPointerException e) {
    String message = "\n Date not extractable.\nUse 'set date manually'-option.";
    JOptionPane.showMessageDialog(new JFrame(), message, "Warning", JOptionPane.ERROR_MESSAGE);
}

Exclude actors/categories with less than n statements from export

There should be an option in the network export window and also in rDNA which lets the user constrain the set of actors and/or concepts by their frequency. For example, only include those categories in the export which occur at least ten times in the whole time period or in the whole file. Or export only those actors who make at least five statements (or five different statements, alternatively).
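The frequency constraint could be implemented as a counting pass followed by a filter. The sketch below assumes a hypothetical `Statement` record and counts raw statements per actor; counting distinct statements, as the issue suggests as an alternative, would need a `distinct()` step first.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

class FrequencyFilterSketch {

    // Hypothetical, simplified statement.
    record Statement(String actor, String concept) {}

    // Names of all actors who made at least n statements.
    static Set<String> actorsWithAtLeast(List<Statement> statements, long n) {
        Map<String, Long> counts = statements.stream()
            .collect(Collectors.groupingBy(Statement::actor, Collectors.counting()));
        return counts.entrySet().stream()
            .filter(e -> e.getValue() >= n)
            .map(Map.Entry::getKey)
            .collect(Collectors.toCollection(TreeSet::new));
    }

    // Statements restricted to those frequent actors, ready for export.
    static List<Statement> keepFrequentActors(List<Statement> statements, long n) {
        Set<String> keep = actorsWithAtLeast(statements, n);
        return statements.stream()
            .filter(s -> keep.contains(s.actor()))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Statement> statements = List.of(
            new Statement("A", "c1"), new Statement("A", "c2"),
            new Statement("B", "c1"));
        System.out.println(actorsWithAtLeast(statements, 2));
        System.out.println(keepFrequentActors(statements, 2).size());
    }
}
```

The same pattern applies to concepts by grouping on `Statement::concept` instead.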

First submitted: 2011-02-18

Gephi interoperability via GEXF file format; dynamic data?

Create an export filter for Gephi .gexf files. Possibly also for longitudinal data? Check what the longitudinal data requirements look like. If an event-based export (= time stamps) is not possible, perhaps with a time window approach as in the SoNIA export function?

First submitted: 2011-02-18

rDNA does not fully work on MacOS

A problem occurs when trying to run rDNA on MacOS 10.5 or 10.6. It is possible to compile, install and load the package in R, but executing the dna.gui() command fails. The error messages are "Apple AWT Java VM was loaded on first thread -- can't start AWT." and "Error in .jnew("dna/Dna") : java.lang.NoClassDefFoundError: dna/Dna". However, DNA (without the R package rDNA) works well on MacOS.

Update: The problem is apparently caused by the fact that java.awt.Color (and possibly other AWT or Swing classes) are loaded in the main thread (i.e., in the very first thread that is started). A possible solution could be to start a new thread for the rest of DNA immediately after starting the main class of DNA (or any class associated with the DNA JAR file).

First submitted: 2011-03-17

Speed up network export

Speed up computations by optimizing the export function (especially congruence networks and the time window algorithm are slow).

First submitted: 2011-03-27

Collaborative editing functionality

There could be a function for collaborative editing with several research assistants at the same time. For example, the file is distributed to several people, and each article and statement tag has an attribute which saves the current user and the date and time of the modification. There could be a function which can automatically merge the files later on.

Or, alternatively, the software could establish an online connection to a server where a file with the current category and actor set is located. Users could propose new categories or actors in real time, so there would be no conflicts.

First submitted: 2011-02-18

Automatic backups

Implement a function which creates automatic backups of the current .dna file every n minutes, where n can be set somewhere in the options. Ask before saving because saving takes a couple of seconds and may distract the user.
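A timer-based backup could look roughly like the sketch below. It makes simplifying assumptions: it copies the .dna file that is already on disk to a time-stamped sibling file (so it does not capture unsaved in-memory changes), it leaves out the "ask before saving" dialog, and all names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class BackupSketch {

    private static final DateTimeFormatter STAMP =
        DateTimeFormatter.ofPattern("yyyyMMdd-HHmmss");

    // Copies the file next to itself, e.g. data.dna -> data.dna.20240101-120000.bak
    static Path backupOnce(Path source) throws IOException {
        String name = source.getFileName() + "." + LocalDateTime.now().format(STAMP) + ".bak";
        Path target = source.resolveSibling(name);
        return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }

    // Runs backupOnce every n minutes on a daemon thread until the service is shut down.
    static ScheduledExecutorService schedule(Path source, long minutes) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "dna-backup");
            t.setDaemon(true); // do not keep the application alive just for backups
            return t;
        });
        timer.scheduleAtFixedRate(() -> {
            try {
                backupOnce(source);
            } catch (IOException e) {
                System.err.println("Backup failed: " + e.getMessage());
            }
        }, minutes, minutes, TimeUnit.MINUTES);
        return timer;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("example", ".dna");
        System.out.println("Backup written to " + backupOnce(file));
    }
}
```

Running the copy on a background thread avoids the multi-second pause in the GUI that the issue is worried about.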

First submitted: 2011-02-18

DNA is slow (almost crashes) when an article is very long

DNA 1.29: The longer an article text, the slower DNA gets. This is related to a function which adds yellow background color to the statement text. For long texts, this procedure takes a very long time. Computing time required for these operations seems to be non-linearly related to the article length. For long documents, DNA thus appears as if it crashed.

Note: The problem may get better with the new database format planned for version 2.0.

The bug was reported by Gabriela Couto and Myanna Lahsen.

First submitted: 2012-05-09

Unable to export dynamic visualizations (Commetrix + SoNIA)

When trying to export dynamic networks, be that in Commetrix SQL or SoNIA format, the "Calculating" window appears and stays there, animated, forever. I have to terminate/kill the corresponding java process 'manually' using top (Ubuntu) or Task Manager (Windows). This happens with:

  • Ubuntu 14.04 (using Open JDK 7 or Oracle java 7)
  • Windows XP (Oracle java 7)
  • Windows 7 pro (Oracle java 7)

I have the following terminal output under Ubuntu:

Exception in thread "Thread-0" java.lang.NullPointerException
at dna.ExportWindow$FileExporter.run(ExportWindow.java:1424)
at java.lang.Thread.run(Thread.java:745)

Index out of bounds exception with dna.network()

What steps will reproduce the problem?

library(rDNA)
dna.init("dna-1.28.jar")
current.data <- "2011-03-22.dna"
aff.109.no <- dna.network(
  current.data,
  algorithm="affiliation",
  agreement="no",
  include.isolates=TRUE,
  start.date="03.01.2005",
  stop.date="02.01.2007",
  exclude.categories=c(
    "CO2 legislation will not hurt the economy.",
    "Cap and trade is the solution.",
    "Emissions legislation should regulate CO2."
  ),
  invert.categories=TRUE
)

What is the expected output? What do you see instead?

expected: should export properly; actual output: "Creating matrix object... Fehler in .jcall(export, "[[D", "matrixObject") : java.lang.IndexOutOfBoundsException: Index: 5, Size: 5" ("Fehler in" is "Error in" in a German R locale)

What version of the product are you using? On what operating system?

1.28

First submitted: 2011-10-05

GUI does not work properly when started from R

What steps will reproduce the problem?

  1. open R and initialize rDNA
  2. run dna.gui()
  3. try to open a file, export etc. in the GUI

What is the expected output? What do you see instead?

Displays lots of exceptions for most advanced functions.

What version of the product are you using? On what operating system?

1.27

First submitted: 2011-09-12

Include statement frequency in visone export

It would be cool if the statement frequency of each actor was included as a node attribute when exporting to a .graphml file for visone. The statement frequency should capture how often (including repetitions/duplicates) an actor refers to the concepts selected for export in total during the specified time period.

First submitted: 2011-07-26

Statement popup window too wide when long entries have been coded

Problem: When very long person, organization, or category entries have been encoded, the statement popup window becomes very wide and sometimes extends beyond the borders of the screen. Usually, having such long codes is a bad idea in the first place. However, a future release should take care of that and should either restrict the length of the entries to a certain maximum length or only display the first 200 characters or so in the list. Reported by Christofer Edling.
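The "only display the first 200 characters" option could be as small as the helper sketched here. The method name and the choice of a three-dot marker are assumptions; the full entry would still be stored, and only the displayed label is shortened.

```java
class LabelSketch {

    // Shortens an entry for display; the result is never longer than max characters.
    static String abbreviate(String entry, int max) {
        if (max < 4) {
            throw new IllegalArgumentException("max must leave room for the marker");
        }
        if (entry.length() <= max) {
            return entry;
        }
        return entry.substring(0, max - 3) + "...";
    }

    public static void main(String[] args) {
        String category = "A very long category label that nobody should really use as a code";
        System.out.println(abbreviate(category, 30));
    }
}
```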

First submitted: 2012-11-13

Auto-detect new statements

There should be a function which can auto-detect statements in the text. Perhaps via artificial neural networks? Or perhaps via wisdom of the crowds? Reading in large text corpora and training the software? Or semantic parsing? Or simply a regex search with user-defined terms? Or an online community of user-defined terms?
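Of the options listed, the regex search with user-defined terms is the simplest to sketch. The code below is a hypothetical illustration: it looks for whole-word occurrences of reporting verbs supplied by the user and returns their character offsets, which a coder could then inspect as candidate statement locations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class StatementFinderSketch {

    // A hit in the text: character offsets plus the matched term as it appears.
    record Match(int start, int end, String text) {}

    // Finds whole-word, case-insensitive occurrences of any user-defined term.
    static List<Match> find(String document, List<String> terms) {
        List<Match> matches = new ArrayList<>();
        if (terms.isEmpty()) {
            return matches; // an empty alternation would match everywhere
        }
        // Pattern.quote treats each term literally, so users cannot break the regex.
        String alternation = terms.stream()
            .map(Pattern::quote)
            .collect(Collectors.joining("|"));
        Pattern pattern = Pattern.compile("\\b(?:" + alternation + ")\\b",
            Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        Matcher m = pattern.matcher(document);
        while (m.find()) {
            matches.add(new Match(m.start(), m.end(), m.group()));
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "The minister said taxes must rise. Greenpeace claimed otherwise.";
        for (Match match : find(text, List.of("said", "claimed"))) {
            System.out.println(match);
        }
    }
}
```

This only flags positions in the text. Deciding who the speaker is and what concept they refer to would still be left to the human coder or to one of the more ambitious approaches mentioned above.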

First submitted: 2011-02-18

graphML bug when actor type etc. is missing

What steps will reproduce the problem?

  1. load file with some actors associated with a type, alias, and note, and with some actors without these details filled out in the actor manager
  2. export an affiliation network in graphML format

What is the expected output? What do you see instead?

should export to a graphML file, but displays nullpointer exception instead

What version of the product are you using? On what operating system?

1.27

Please provide any additional information below:

reported by Sarah Burridge, UMN

First submitted: 2011-09-15

Add Ucinet .##h and .##d file import filter in rDNA

Add a function to rDNA which can import binary UCINET .##d and .##h files. There should be a parameter that indicates whether the result should be a matrix or a network object.

First submitted: 2011-02-18

Right to left

Hi
Thanks for this great tool. We have just started using it and it seemed to be the one we were looking for.
One problem though, we are analyzing Hebrew text - right to left data, and the copied text is aligned to the wrong side.
We would really appreciate if you can add a check-box to mark the text as right to left and change the alignment.

Thanks
Jonathan
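Besides a manual check-box, the direction could also be guessed from the text itself. The sketch below uses `java.text.Bidi` from the Java standard library to test whether the first strongly directional character is right-to-left; the class and method names are hypothetical, and wiring the result into DNA's text pane is left as an assumption.

```java
import java.text.Bidi;

class RtlSketch {

    // True if the base direction of the text is right-to-left (e.g. Hebrew or Arabic).
    static boolean isRightToLeft(String text) {
        if (text.isEmpty()) {
            return false;
        }
        // The flag falls back to left-to-right when the text has no strong directional character.
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        return !bidi.baseIsLeftToRight();
    }

    public static void main(String[] args) {
        System.out.println(isRightToLeft("\u05E9\u05DC\u05D5\u05DD")); // Hebrew word
        System.out.println(isRightToLeft("Hello"));
        // In a Swing GUI, the result could be passed on via
        // textPane.setComponentOrientation(ComponentOrientation.RIGHT_TO_LEFT).
    }
}
```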

Exception after editing actor attributes and closing the file without saving

What steps will reproduce the problem?
1. Open a DNA file
2. Change the actor type of some organizations
3. Save the file
4. Close the DNA file

What is the expected output? What do you see instead?
Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException
at dna.ActorManager$6.valueChanged(ActorManager.java:234)
at javax.swing.JList.fireSelectionValueChanged(JList.java:1782)
at javax.swing.JList$ListSelectionHandler.valueChanged(JList.java:1796)
at javax.swing.DefaultListSelectionModel.fireValueChanged(DefaultListSelectionModel.java:184)
at javax.swing.DefaultListSelectionModel.fireValueChanged(DefaultListSelectionModel.java:164)
at javax.swing.DefaultListSelectionModel.fireValueChanged(DefaultListSelectionModel.java:211)
at javax.swing.DefaultListSelectionModel.removeIndexInterval(DefaultListSelectionModel.java:677)
at javax.swing.plaf.basic.BasicListUI$Handler.intervalRemoved(BasicListUI.java:2597)
at javax.swing.AbstractListModel.fireIntervalRemoved(AbstractListModel.java:178)
at javax.swing.DefaultListModel.removeAllElements(DefaultListModel.java:402)
at dna.DnaContainer.clear(DnaContainer.java:35)
at dna.Dna.clearSpace(Dna.java:1019)
at dna.Dna.closeDnaFile(Dna.java:1007)
at dna.Dna.access$1(Dna.java:989)
at dna.Dna$14.actionPerformed(Dna.java:1290)

What version of the product are you using? On what operating system?
1.25, Ubuntu 10.10

First submitted: 2011-03-22

Statement frequency cannot be determined; graphML affiliation network

What steps will reproduce the problem?

  1. open Dana's climate politics dataset
  2. export an affiliation network to a graphML file
  3. look at the error log

What is the expected output? What do you see instead?

expected: should export properly; actual output: "statement frequency cannot be determined for actor xy", then NullPointerException in line 2681 of the export class.

What version of the product are you using? On what operating system?

1.28

First submitted: 2011-10-05

DNA quits when pressing 'cancel'

When closing DNA, the program asks whether the file should be saved. Yes, no, or cancel. When pressing 'cancel', DNA is closed nevertheless. This should not happen.
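The intended behaviour can be stated as a small decision function over the value returned by the save dialog. This is a sketch with hypothetical names, not the actual DNA code; the key point is that both "Cancel" and closing the dialog with the window's X button must leave the program running.

```java
import javax.swing.JOptionPane;

class CloseDecisionSketch {

    enum Action { SAVE_AND_EXIT, EXIT, STAY_OPEN }

    // Maps the result of a yes/no/cancel dialog to what the program should do next.
    static Action decide(int option) {
        switch (option) {
            case JOptionPane.YES_OPTION:
                return Action.SAVE_AND_EXIT;
            case JOptionPane.NO_OPTION:
                return Action.EXIT;
            default:
                // CANCEL_OPTION and CLOSED_OPTION both mean: do not quit.
                return Action.STAY_OPEN;
        }
    }

    public static void main(String[] args) {
        System.out.println(decide(JOptionPane.CANCEL_OPTION));
    }
}
```

For this to work in a Swing application, the main frame would typically be set to `WindowConstants.DO_NOTHING_ON_CLOSE` and the exit would be triggered from a window listener only when the decision is not `STAY_OPEN`.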

First submitted: 2011-04-07

Statements are not displayed correctly in the statement list when escape characters are present

DNA 1.29: When invisible escape characters or HTML tags are present, the text portion that is highlighted in the text is displayed correctly, but it is shifted by one or more characters in the statement panel in the side bar. The same problem occurs when exporting a list of statements.

The bug report was submitted by Christopher Schulz along with a reproducible .dna file.

First submitted: 2012-09-23

Ability to remove a type from the drop-down list in the actor attribute manager

The user can enter attributes for persons and organizations in the attribute manager (available when clicking on "Show persons [organizations] in bottom bar" in the "Extras" menu). In the second column of the table, the user can select among the colored actor types specified in the list next to the table. However, when a type is removed from this list, it is still available in the drop-down list of the second table column. When removing an item from the list on the right, it should also be removed from the drop-down list.

First submitted: 2011-02-18

rDNA: exporting multiple times results in an error due to low memory

What steps will reproduce the problem?

run dna.network() on a large .dna file several times in a row (perhaps 10 times)

What is the expected output? What do you see instead?

expected: should import the data correctly into R; instead: gets slower and then crashes with an error message after a couple of rounds

What version of the product are you using? On what operating system?

1.23 through 1.28

Please provide any additional information below:

seems to be a memory or garbage collection issue

First submitted: 2011-10-16

ImportHTMLWebpageTag: buffering

In ImportHTMLWebpageTag.java, show a loading indicator while documents are being imported (e.g., importing a list of 4 links takes about 10 seconds).

More flexible tags

At the moment, only statement tags (i.e., with the four variables person, organization, category and agreement) can be attached to text portions. The software might benefit from allowing more flexible tags, possibly custom tags created by the user. One possibility could be just to add another tag type called "annotation" or "comment", which just has a free form text field. But it could be nice to be able to infer the relations between three or four category types, for example. It would also be good to have the option of establishing direct ties within a category. The problem is quite complex because it affects the data structure in DNA, the XML file structure, the GUI and also the export functions.

First submitted: 2011-02-18

Undo/redo function

There should be an undo/redo function, especially for editing the contents of statements, but if possible also for other actions, such as inserting, renaming or removing articles, or inserting/deleting statements, or editing actor attributes.
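A common way to structure this is the command pattern with two stacks, sketched below under assumptions: the `Edit` interface and `UndoSketch` class are hypothetical, and each user action (editing a statement, renaming an article, and so on) would have to be wrapped in its own `Edit`. Swing's `javax.swing.undo.UndoManager` offers a ready-made alternative along the same lines.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class UndoSketch {

    // One reversible user action.
    interface Edit {
        void apply();
        void revert();
    }

    private final Deque<Edit> undoStack = new ArrayDeque<>();
    private final Deque<Edit> redoStack = new ArrayDeque<>();

    // Executes a new edit; anything that was undone before can no longer be redone.
    void perform(Edit edit) {
        edit.apply();
        undoStack.push(edit);
        redoStack.clear();
    }

    // Reverts the most recent edit; returns false if there is nothing to undo.
    boolean undo() {
        if (undoStack.isEmpty()) {
            return false;
        }
        Edit edit = undoStack.pop();
        edit.revert();
        redoStack.push(edit);
        return true;
    }

    // Re-applies the most recently undone edit; returns false if there is nothing to redo.
    boolean redo() {
        if (redoStack.isEmpty()) {
            return false;
        }
        Edit edit = redoStack.pop();
        edit.apply();
        undoStack.push(edit);
        return true;
    }

    public static void main(String[] args) {
        String[] category = {"old category"};
        UndoSketch history = new UndoSketch();
        history.perform(new Edit() {
            public void apply() { category[0] = "new category"; }
            public void revert() { category[0] = "old category"; }
        });
        history.undo();
        System.out.println(category[0]);
    }
}
```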

First submitted: 2011-03-28
