webermarcolivier / statannot
Add statistical annotations (p-value significance) on an existing boxplot generated by seaborn boxplot
License: MIT License
I am running three comparisons with an independent t-test and the annotation shows nicely on the plot but the p values are not adjusted following the Bonferroni procedure. I think I have the right statannot version as I have the StatResult.py file in my statannot folder. I am at a loss to find out what the problem is.
When I check the analysis I can see that there is no adjustment as shown below.
Could you help me to "activate" the adjustment?
for res in test_results: print(res)
print("\nStatResult attributes:", test_results[0].__dict__.keys())
I get:
AxesSubplot(0.125,0.125;0.775x0.755)
[{'pvalue': 1.210296633675592e-07, 'test_short_name': 't-test_ind', 'formatted_output': 't-test independent samples, P_val=1.210e-07 stat=5.397e+00', 'box1': 'Donation? Yes', 'box2': 'Donation? No'}, {'pvalue': 0.002277227049961783, 'test_short_name': 't-test_ind', 'formatted_output': 't-test independent samples, P_val=2.277e-03 stat=-3.074e+00', 'box1': 'Control', 'box2': 'Donation? Yes'}, {'pvalue': 0.003862565144342565, 'test_short_name': 't-test_ind', 'formatted_output': 't-test independent samples, P_val=3.863e-03 stat=2.904e+00', 'box1': 'Control', 'box2': 'Donation? No'}]
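For reference, the Bonferroni adjustment can be checked by hand on the three raw p-values printed above. This is only a minimal sketch in plain Python, assuming the usual definition p_adjusted = min(n * p, 1.0); the helper name bonferroni_adjust is hypothetical and is not part of statannot.

```python
# Raw p-values copied from the printed test results above.
raw_pvalues = [1.210296633675592e-07, 0.002277227049961783, 0.003862565144342565]


def bonferroni_adjust(pvalues):
    """Multiply each p-value by the number of comparisons, capped at 1.0.

    Hypothetical helper for illustration; not a statannot function.
    """
    n = len(pvalues)
    return [min(p * n, 1.0) for p in pvalues]


adjusted = bonferroni_adjust(raw_pvalues)
for raw, adj in zip(raw_pvalues, adjusted):
    print("raw p = {:.3e} -> Bonferroni-adjusted p = {:.3e}".format(raw, adj))
```

If the values reported by the library match the raw column rather than the adjusted one, no correction was applied.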
In the current implementation, in the case of a barplot, the y positions are computed based on the real data points. This makes sense in the case of the boxplot, where all the data points are plotted. However, the final seaborn barplot output is composed of the bar + error bars. We should only take into account the y position of the error bars.
Here's the add_stat_annotation code I'm currently using:
stat_tests = add_stat_annotation(ax=g1, data=prop_df, x='group', y='skew',
                                 order=group_list,
                                 box_pairs=k_pairs,
                                 text_annot_custom=k_annot,
                                 perform_stat_test=False, pvalues=k_pvals,
                                 loc='inside', verbose=0)
(Where k_pairs, k_annot, and k_pvals are lists of groups, annotation strings, and p-values.) It throws the error:
--> 627 y_stack_max = max(ymaxs)
628 if loc == 'inside':
629 ax.set_ylim((ylim[0], max(1.03*y_stack_max, ylim[1])))
ValueError: max() arg is an empty sequence
This error occurs whether or not I manually set the y limits for the axes on my figure. Note that I am applying this statistical annotation to each plot created with matplotlib's plt.subplots.
I try to run a simple plot with the annotation but do not manage to make it work. I have checked that my packages are up to date:
(I use Anaconda and Spyder)
sns: '0.9.0'
numpy: '1.15.4'
matplotlib: '3.0.2'
pd: 0.23.4
scipy: '1.1.0'
My code produces the right figure but without annotation and spits the following error message:
TypeError: add_stat_annotation() got an unexpected keyword argument 'box_pairs'
Any idea why? Any help much appreciated!
Marie
The code I use :
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
from statannot import add_stat_annotation
df = sns.load_dataset("tips")
x = "day"
y = "total_bill"
order = ['Sun', 'Thur', 'Fri', 'Sat']
ax = sns.boxplot(data=df, x=x, y=y, order=order)
test_results = add_stat_annotation(ax, data=df, x=x, y=y, order=order,
                                   box_pairs=[("Thur", "Fri"), ("Thur", "Sat"), ("Fri", "Sun")],
                                   test='Mann-Whitney', text_format='star', loc='outside', verbose=2)
plt.savefig('example_non-hue_outside.png', dpi=300, bbox_inches='tight')
test_results
Thanks, now I understand much better. Generalizing a bit the problem, it would be nice to identify these "clusters" of annotations and only stack the lines independently in each one. This would improve a lot the layout of the annotations. For example, in grouped (hued) boxplots, as in your example, annotations could be stacked in each group of boxes independently. The clusters can be easily determined by identifying groups of non-overlapping segments on the x axis.
I will try to work on such a solution, it shouldn't be too difficult.
Originally posted by @webermarcolivier in #19 (comment)
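The clustering idea described above (groups of annotation segments that do not overlap each other on the x axis) can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the library's implementation: the function name cluster_segments is hypothetical, each annotation is reduced to an (x1, x2) tuple, and segments that merely touch at an endpoint are counted as overlapping.

```python
def cluster_segments(segments):
    """Group (x1, x2) segments into clusters of overlapping segments.

    Two segments end up in the same cluster when they overlap directly or
    through a chain of overlapping segments. Touching endpoints count as an
    overlap here (an assumption; use `<` instead of `<=` to change that).
    """
    clusters = []
    current, current_end = [], None
    # Normalise each segment to (left, right) and sweep from left to right.
    for x1, x2 in sorted((min(s), max(s)) for s in segments):
        if current and x1 <= current_end:
            current.append((x1, x2))
            current_end = max(current_end, x2)
        else:
            if current:
                clusters.append(current)
            current, current_end = [(x1, x2)], x2
    if current:
        clusters.append(current)
    return clusters


# Three hue groups centred on x = 0, 1, 2, with two annotations in the middle one.
segments = [(-0.2, 0.2), (0.8, 1.2), (1.8, 2.2), (0.8, 1.0)]
clusters = cluster_segments(segments)
for cluster in clusters:
    print(cluster)
```

Each cluster could then be stacked independently, so an annotation in one group of boxes would not push up the annotations in another group.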
Is there an easy way to show only the comparisons that turn out to be significant? I would like to compute all comparisons, but showing the non-significant ones makes the plot very busy.
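One possible workaround, sketched here in plain Python, is to compute the p-values first, keep only the pairs below a chosen threshold, and pass the reduced lists to the annotation call. The helper keep_significant and the p-values below are hypothetical placeholders, not real results and not part of statannot.

```python
def keep_significant(box_pairs, pvalues, alpha=0.05):
    """Return only the pairs (and their p-values) with p <= alpha.

    Hypothetical helper for illustration; not a statannot function.
    """
    kept = [(pair, p) for pair, p in zip(box_pairs, pvalues) if p <= alpha]
    return [pair for pair, _ in kept], [p for _, p in kept]


box_pairs = [("Thur", "Fri"), ("Thur", "Sat"), ("Fri", "Sun")]
pvalues = [0.62, 0.047, 0.003]  # placeholder numbers, not real test results

sig_pairs, sig_pvalues = keep_significant(box_pairs, pvalues)
print(sig_pairs)
print(sig_pvalues)

# The filtered lists could then be passed with perform_stat_test=False, e.g.
#   add_stat_annotation(ax, data=df, x=x, y=y, box_pairs=sig_pairs,
#                       perform_stat_test=False, pvalues=sig_pvalues, ...)
# It is probably safest to skip the call entirely when sig_pairs is empty.
```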
The pairwise tests are run without any multiple-comparisons correction, such as the Bonferroni correction. A test might show up as significant even though it would not be significant after correction. Is it possible to add this feature?
First of all, thank you for a very useful tool.
For some reason, the annotations are not aligned for me and I can't figure out why.
I'm using the code below, and the annotations are far to the right of the boxplot.
I am running the code on Anaconda 1.9.7, JupyterLab 1.1.4, Python 3.7.3.
Do you know where the problem could be?
ax = sns.boxplot(data=df[['ACTB','GAPDH','HPRT1']])
fig = plt.gcf()
test_results = add_stat_annotation(ax, df,
                                   box_pairs=[('ACTB','GAPDH'), ('ACTB','HPRT1'), ('GAPDH','HPRT1')],
                                   test='t-test_ind', text_format='star', loc='outside', verbose=1)
There is a crash when trying to plot annotations for a barplot with statannot-0.2.3 and seaborn-0.10.0.
Consider the following code:
import numpy as np
import pandas as pd
import statannot
import seaborn as sns
import matplotlib.pyplot as plt
np.random.seed(42)
df = pd.DataFrame({
    'y': np.random.random(size=50),
    'x': np.random.choice(['X1', 'X2'], size=50),
})
ax = sns.barplot(x='x', y='y', data=df)
statannot.add_stat_annotation(
    ax, plot='barplot',
    data=df, x='x', y='y',
    box_pairs=[('X1', 'X2')],
    test='Mann-Whitney', text_format='simple'
)
plt.show()
It crashes with the error
Traceback (most recent call last):
File "test.py", line 21, in <module>
test='Mann-Whitney', text_format='simple'
File "../python/site-packages/statannot/statannot.py", line 442, in add_stat_annotation
errcolor=".26", errwidth=None, capsize=None, dodge=True)
TypeError: __init__() missing 1 required positional argument: 'seed'
This bug can be fixed by adding e.g. seed=None, to the plotter constructor call in statannot/statannot/statannot.py, lines 438 to 442 (commit 1835078).
Is it possible to add the text annotations in the same format as box_pairs?
E.g.
add_stat_annotation(ax, data=df, x=x, y=y, order=order,
                    box_pairs=[("Thur", "Fri"), ("Thur", "Sat"), ("Fri", "Sun")],
                    text=['this', 'that', 'the other'], loc='outside', verbose=2)
I am writing to ask for the easiest way to get the exact p-value without also printing the type of test used (i.e. text_format='full'). Essentially, I want the "simple" output, but with the actual p-value rather than the simplified form (e.g. <0.05).
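One way to get there, sketched below, is to format the labels yourself and hand them to the annotation call as custom text. The helper exact_pvalue_label and the p-values are hypothetical placeholders; the text_annot_custom, pvalues and perform_stat_test arguments are the ones used in the add_stat_annotation call quoted earlier on this page, and whether they are available depends on the installed version.

```python
def exact_pvalue_label(pval, fmt="{:.3e}"):
    """Format a p-value as 'p = <value>' without any test name.

    Hypothetical helper for illustration; not a statannot function.
    """
    return "p = " + fmt.format(pval)


pvalues = [1.21e-07, 0.00228, 0.0386]  # placeholder numbers, not real results
labels = [exact_pvalue_label(p) for p in pvalues]
print(labels)

# The labels could then be supplied as custom annotation text, e.g.
#   add_stat_annotation(ax, data=df, x=x, y=y, box_pairs=box_pairs,
#                       perform_stat_test=False, pvalues=pvalues,
#                       text_annot_custom=labels, ...)
```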
Hey, do you think it would be possible to apply the library to plots that group values with hue?
https://seaborn.pydata.org/generated/seaborn.boxplot.html?highlight=boxplot#seaborn.boxplot
Do the error bars on the barplot represent standard deviation or standard error? Neither seems to correspond to the values for my data.
When trying to visualize statistical significance between two boxplots plotted using the hue variable, I get the following error:
ValueError: boxPairList contains an unvalid box pair.
When printing the .group_names or .hue_names of the BoxPlotter object, nothing is returned, which is probably causing the error.
Any ideas on how to solve this?
Please allow use of "list of lists" for box_pairs.
Thank you.
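Until lists of lists are accepted directly, a small conversion step is one option. The sketch below is plain Python and makes no claim about how statannot validates its input; the helper name to_box_pairs is hypothetical. It also handles the nested form used for hue-grouped plots.

```python
def to_box_pairs(pairs):
    """Convert a list of lists into a list of tuples, one level of nesting deep.

    Hypothetical helper for illustration; not a statannot function.
    """
    def as_tuple(box):
        # A box is either a plain category or a [category, hue] list.
        return tuple(box) if isinstance(box, list) else box

    return [tuple(as_tuple(box) for box in pair) for pair in pairs]


simple_pairs = to_box_pairs([["Thur", "Fri"], ["Thur", "Sat"]])
hue_pairs = to_box_pairs([[["30-39", "Equator"], ["30-39", "Posterior"]]])
print(simple_pairs)
print(hue_pairs)
```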
Hi, is there a way to remove NS p-values so that no lines or values are drawn for non-significant pairs?
Thank you!
Thanks for the great package. However, I had a problem running the demo.
Seaborn in version 0.9.0
pvalue annotation legend:
ns: 5.00e-02 < p <= 1.00e+09
*: 1.00e-02 < p <= 5.00e-02
**: 1.00e-03 < p <= 1.00e-02
***: 1.00e-04 < p <= 1.00e-03
****: p <= 1.00e-04
()
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-4-3cef1a3195f1> in <module>()
1 ax = sns.boxplot(x="day", y="total_bill", data=df)
2 add_statistical_test_annotation(ax, df, [("Thur", "Fri"), ("Thur", "Sat"), ("Fri", "Sun")],
----> 3 test='Mann-Whitney', order=None, textFormat='star', loc='outside', verbose=2);
4 plt.savefig('example1.png', dpi=300, bbox_inches='tight')
/home/magnus/opt/statannot/statannot.py in add_statistical_test_annotation(ax, df, catPairList, xlabel, ylabel, test, order, textFormat, loc, pvalueThresholds, color, lineYOffsetAxesCoord, lineHeightAxesCoord, yTextOffsetPoints, linewidth, fontsize, verbose)
85 x1 = np.where(catValues == cat1)[0][0]
86 x2 = np.where(catValues == cat2)[0][0]
---> 87 cat1YMax = g[ylabel].max()[cat1]
88 cat2YMax = g[ylabel].max()[cat2]
89 cat1Values = g.get_group(cat1)[ylabel].values
/usr/local/lib/python2.7/site-packages/pandas/core/series.pyc in __getitem__(self, key)
599 key = com._apply_if_callable(key, self)
600 try:
--> 601 result = self.index.get_value(self, key)
602
603 if not is_scalar(result):
/usr/local/lib/python2.7/site-packages/pandas/indexes/base.pyc in get_value(self, series, key)
2151 raise InvalidIndexError(key)
2152 else:
-> 2153 raise e1
2154 except Exception: # pragma: no cover
2155 raise e1
KeyError: 'Thur'
Hi,
My data frame input has no nans, but it still raised the error nanargmax. How do I resolve this error? Thank you very much
File "", line 54, in
test='t-test_paired', text_format='full', loc='inside', verbose=2)
File "/anaconda3/lib/python3.6/site-packages/statannot/statannot.py", line 519, in add_stat_annotation
(y_stack_arr[0, :] <= x2))])
File "<array_function internals>", line 6, in nanargmax
File "/anaconda3/lib/python3.6/site-packages/numpy/lib/nanfunctions.py", line 551, in nanargmax
raise ValueError("All-NaN slice encountered")
ValueError: All-NaN slice encountered
Thanks for such a useful package! A couple of quick ideas in case you have time to make any changes.
- Add an option to pass equal_var=False to scipy.stats.ttest_ind for a Welch's test. This could also just be made the default, as it is usually a better choice than a standard t-test that assumes equal variance; see e.g. here for one reference.
Is the multiple comparisons problem tackled somehow in the library? If not, adding an option to take this into account would be great. I am happy to help with that if needed.
Improve documentation of the add_stat_annotation function.
I'm getting this error when trying to add statistical annotation on a boxplot
This happens with both code that worked a couple of weeks ago and the example case at the statannot front page.
I updated both packages with no success.
Very grateful for any help!
Would it be possible to include an ANOVA test using the boxplots/bargraphs where you could compare multiple groups and get a single line across indicating significant difference between groups with small vertical lines indicating which groups are included?
This image was created in MS Excel and the lines were added afterwards, but the line at the top is what I am asking about for statannot.
I was in dire need for statistical annotations and successfully used your library. Thanks a lot for your work, it generally worked great.
Let me still suggest a couple of improvements. I know, this is a potpourri of different things. You can filter the requests one by one and create new issues in case you agree with them.
- Add a function add_annotated_brackets() that is completely agnostic about any testing and p-values. This function would create the brackets and annotations according to the formatting settings. add_stat_annotation() could then just call add_annotated_brackets().
- Add arguments annotat_kws and line_kws that are forwarded to ax.annotate() and ax.plot()/lines.Line2D(), respectively. This would give the user a bit more stylistic freedom (font color, line styles, ...).
- I think ax.annotate(..., clip_on=False, ...) (around line 590) is wrong; it should be ax.annotate(..., clip_on=(loc=="inside")). At least the setting is not consistent with the ax.plot() command a couple of lines earlier. If the stats annotation falls outside plt.ylim(), the bracket line is clipped, but not the annotation text.
- add_annotated_brackets() should permit setting or adjusting the y-locations of the brackets freely. The default choice of max(y)+margin is good, but it might be useful to adjust this (e.g. to have all brackets at the same y).
You certainly know that it is not recommended to use private parts of other libraries. On the other hand, I can completely understand why you decided to use seaborn's _Plotters nevertheless. Reverse-engineering the seaborn plots from their Artists would be a nightmare. I recently asked the seaborn crew / Michael Waskom why seaborn is not extended with an annotation infrastructure. See here for the feature request and Michael's response.
I am trying to add statistical significance between groups (Equator and Posterior) and age. I am able to get that part working, however, when I add the second group (Equator vs Equator for different age groups) it treats the data backwards. I'm not sure how to resolve this.
I added the two lists "Equator vs Posterior" (lines 1-5) and "Equator vs Age" (lines 6-10) together to compare statistical significance.
[(('30-39', 'Equator'), ('30-39', 'Posterior')),
(('60-69', 'Equator'), ('60-69', 'Posterior')),
(('40-49', 'Equator'), ('40-49', 'Posterior')),
(('50-59', 'Equator'), ('50-59', 'Posterior')),
(('70-79', 'Equator'), ('70-79', 'Posterior')),
(('30-39', 'Equator'), ('30-39', 'Equator')),
(('30-39', 'Equator'), ('60-69', 'Equator')),
(('30-39', 'Equator'), ('40-49', 'Equator')),
(('30-39', 'Equator'), ('50-59', 'Equator')),
(('30-39', 'Equator'), ('70-79', 'Equator'))]
However, when I try to plot, statannot treats the "Equator vs Age" groups as "Posterior"
SteadyState_Peel_force_ageGroup_BoxPlot_WithData.pdf
f, ax = plt.subplots()
ax = sns.boxplot(x=AG, y=ss_mN, hue=R, hue_order=[Eq, Po], data=df)
# Statistical test for differences
# List of groups (AgeGroups)
hue_order = list(df[AG].unique())
# Create combinations to compare
box_pairs_1 = [((Age_Group_i, Eq), (Age_Group_i, Po))
               for Age_Group_i in hue_order]
# Add equator age groups
# Create combinations to compare
box_pairs_2 = [((LAG[0], Eq), (Age_Group_i, Eq))
               for Age_Group_i in hue_order]
box_pairs = box_pairs_1 + box_pairs_2
test_results = add_stat_annotation(ax, plot='boxplot', data=df,
                                   x=AG, y=ss_mN,
                                   hue=R, box_pairs=box_pairs,
                                   test='t-test_ind', text_format='star',
                                   loc='outside', verbose=2,
                                   comparisons_correction=None,
                                   line_offset=0.0,
                                   line_offset_to_box=0.0,
                                   line_height=0.015,
                                   fontsize='small')  # 'bonferroni'
boxPlotBlackBorder()  # Make borders black
Any idea how I can resolve this?
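Two things stand out in the snippet above, independent of how statannot handles them. The sixth pair in the list compares ('30-39', 'Equator') with itself, and sns.boxplot receives hue_order=[Eq, Po] while add_stat_annotation does not, even though the add_stat_annotation docstring asks for the same hue_order to be passed to both. Whether either one causes the mislabelling is an assumption, not something confirmed here. A minimal sketch of building the pairs without the self-comparison, using the age groups from the list above:

```python
# Age groups and region labels taken from the pair list above.
age_groups = ['30-39', '40-49', '50-59', '60-69', '70-79']
Eq, Po = 'Equator', 'Posterior'

# Equator vs Posterior within each age group.
within_age = [((age, Eq), (age, Po)) for age in age_groups]

# Reference age group vs every other age group (Equator only),
# skipping the comparison of the reference group with itself.
reference = age_groups[0]
across_age = [((reference, Eq), (age, Eq))
              for age in age_groups if age != reference]

box_pairs = within_age + across_age
for pair in box_pairs:
    print(pair)
```

Passing hue_order=[Eq, Po] to add_stat_annotation as well would then keep both calls consistent.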
Hi,
I am kind of stuck trying to fix this error. Perhaps it's an implementation issue on my side, though importing the statannot module itself works without a problem. I also noticed that in statannot.py the function has a different name (add_stat_annotation, not add_statistical_test_annotation), though several blogs from late 2018 recommended the latter function name. This is my code:
from statannot import add_statistical_test_annotation
(...)
with sns.axes_style(style='ticks'):
    g = sns.catplot("Category", "Week 2", data=df, kind="box")
    g.set_axis_labels("Category", str(input_parameter))
    add_statistical_test_annotation(g, df, [("SHAM", "iCMP"), ("PBS", "iCMP"), ("eGFP", "iCMP")],
                                    test='Mann-Whitney', order=None, textFormat='star', loc='inside', verbose=2)
plt.show()
Error:
from statannot import add_statistical_test_annotation
ImportError: cannot import name 'add_statistical_test_annotation'
Any help is greatly appreciated!
Hello,
I would like to know if there is a way to have all annotations in a straight line.
When there are many boxes, the stacked annotations do not leave much space for the results. However, I can't find a way to have all the annotations in a straight line.
Thank you!
Clément
This tool is excellent. Thank you for all your hard work. Is it possible to hide non-significant results where multiple comparisons are made? So only show the annotations for significant results?
Thanks!
It would be nice if a permutation test were implemented as an alternative to the t-test and co.
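For anyone who needs this before it exists in the library, a two-sample permutation test on the difference in means is short enough to write by hand. The sketch below uses only the standard library; the function name permutation_test, the choice of test statistic, and the Monte Carlo settings are all assumptions for illustration, not something statannot provides.

```python
import random


def permutation_test(sample1, sample2, n_permutations=10000, seed=0):
    """Two-sided Monte Carlo permutation test on the difference in means.

    Hypothetical helper for illustration; not a statannot function.
    Returns an estimated p-value in (0, 1].
    """
    rng = random.Random(seed)
    n1 = len(sample1)

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(sample1) - mean(sample2))
    pooled = list(sample1) + list(sample2)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n1]) - mean(pooled[n1:]))
        # Small tolerance so that ties are not lost to rounding error.
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction: the observed labelling counts as one permutation.
    return (hits + 1) / (n_permutations + 1)


p_separated = permutation_test([1, 2, 3, 4, 5], [11, 12, 13, 14, 15])
p_identical = permutation_test([1, 2, 3], [1, 2, 3])
print(p_separated, p_identical)
```

The resulting p-values could then be drawn with perform_stat_test=False and the pvalues argument.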
Using the source code that produced this example figure but altering the boxplot width:
ax = sns.boxplot(data=df, x=x, y=y, hue=hue, width=0.25)
It would be nice if the lines were aligned to the boxplot centers even for non-default width values.
(Obviously this is an extreme case where the lines would become too small, but this issue also holds for simpler cases, e.g. 2 x 2 comparisons.)
Hey there!
I'm getting a RuntimeError when running this:
data
Out[66]:
Control FBS 1% FBS 3%
0 0.494348 1.196539 0.921887
1 0.556027 0.940206 1.153515
2 0.445540 NaN 1.108820
3 0.464731 0.931461 0.956742
4 0.393526 0.894547 1.073090
5 0.479290 NaN 1.099829
6 0.683442 0.936075 NaN
7 0.667166 NaN NaN
8 0.526530 NaN NaN
9 0.499731 NaN NaN
statannot.add_stat_annotation(ax, data, boxPairList=[('Control','FBS 1%'), ('Control','FBS 3%'), ('FBS 1%','FBS 3%')], test='t-test', textFormat='star',loc='outside', verbose=1)
pvalue annotation legend:
ns: 5.00e-02 < p <= 1.00e+09
*: 1.00e-02 < p <= 5.00e-02
**: 1.00e-03 < p <= 1.00e-02
***: 1.00e-04 < p <= 1.00e-03
****: p <= 1.00e-04
Traceback (most recent call last):
File "<ipython-input-67-57fb82b05cc8>", line 1, in <module>
statannot.add_stat_annotation(ax, data, boxPairList=[('Control','FBS 1%'), ('Control','FBS 3%'), ('FBS 1%','FBS 3%')], test='t-test', textFormat='star',loc='outside', verbose=1)
File "C:\Users\analysesbiophysique\.spyder-py3\projets\Imaging\packages\statannot.py", line 218, in add_stat_annotation
bbox = ann.get_window_extent()
File "C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\text.py", line 2323, in get_window_extent
text_bbox = Text.get_window_extent(self, renderer=renderer)
File "C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\text.py", line 920, in get_window_extent
raise RuntimeError('Cannot get window extent w/o renderer')
RuntimeError: Cannot get window extent w/o renderer
Do you have any idea why this is happening?
Thanks!
I found the one-sided option only for Mann-Whitney. Is there no one-sided t-test?
I wanted to add Chi-Squared tests to be able to compare categorical data and visualize significant differences. I was successful in doing so and was curious if anyone has any suggestions on improvement like adding multiple groups to perform a Chi-Squared test to look at multiple groups instead of just two.
I also wanted to have a significant difference asterisk to match the asterisk when I use the figures in LaTeX. I had to modify the symbol which is below.
I am using my own data set but any categorical data would work for this. There are four Failure Code values (0, 1, 2, 3), two age groups (+/- 60 yrs).
Here is how I would add the annotation for the significant differences:
ax = sns.countplot(x='Failure Code', hue='Age60',
                   hue_order=[r'Age $<$ 60', r'Age $\geq$ 60'], data=df_no_Nan)
# Statistical test for differences
hue_order = list(df_no_Nan['Failure Code'].unique())  # List of groups (failure codes)
# Create combinations to compare
box_pairs_1 = [((FailureCodei, r'Age $<$ 60'), (FailureCodei, r'Age $\geq$ 60'))
               for FailureCodei in hue_order]
box_pairs = box_pairs_1
test_results = add_stat_annotation(ax, plot='countplot', data=df_no_Nan,
                                   x='Failure Code', y='failure code', hue='Age60',
                                   box_pairs=box_pairs,
                                   test='chisquare', text_format='star',
                                   loc='inside', verbose=2,
                                   comparisons_correction=None)  # 'bonferroni'
Here is the updated code of statannot:
import warnings
import matplotlib.pyplot as plt
from matplotlib import lines
import matplotlib.transforms as mtransforms
from matplotlib.font_manager import FontProperties
import numpy as np
import pandas as pd
import seaborn as sns
from seaborn.utils import remove_na
import pdb
from .utils import raise_expected_got, assert_is_in
from .StatResult import StatResult
from scipy import stats
DEFAULT = object()
def stat_test(
    box_data1,
    box_data2,
    test,
    comparisons_correction=None,
    num_comparisons=1,
    **stats_params
):
    """Get formatted result of two sample statistical test.
    Arguments
    ---------
    box_data1, box_data2
    test: str
        Statistical test to run. Must be one of:
        - `Levene`
        - `Mann-Whitney`
        - `Mann-Whitney-gt`
        - `Mann-Whitney-ls`
        - `t-test_ind`
        - `t-test_welch`
        - `t-test_paired`
        - `Wilcoxon`
        - `Kruskal`
        - `chisquare`
    comparisons_correction: str or None, default None
        Method to use for multiple comparisons correction. Currently only the
        Bonferroni correction is implemented.
    num_comparisons: int, default 1
        Number of comparisons to use for multiple comparisons correction.
    stats_params
        Additional keyword arguments to pass to scipy stats functions.
    Returns
    -------
    StatResult object with formatted result of test.
    """
    # Check arguments.
    assert_is_in(
        comparisons_correction,
        ['bonferroni', None],
        label='argument `comparisons_correction`',
    )
    # Switch to run scipy.stats hypothesis test.
    if test == 'Levene':
        stat, pval = stats.levene(box_data1, box_data2, **stats_params)
        result = StatResult(
            'Levene test of variance', 'levene', 'stat', stat, pval
        )
    elif test == 'Mann-Whitney':
        u_stat, pval = stats.mannwhitneyu(
            box_data1, box_data2, alternative='two-sided', **stats_params
        )
        result = StatResult(
            'Mann-Whitney-Wilcoxon test two-sided',
            'M.W.W.',
            'U_stat',
            u_stat,
            pval,
        )
    elif test == 'Mann-Whitney-gt':
        u_stat, pval = stats.mannwhitneyu(
            box_data1, box_data2, alternative='greater', **stats_params
        )
        result = StatResult(
            'Mann-Whitney-Wilcoxon test greater',
            'M.W.W.',
            'U_stat',
            u_stat,
            pval,
        )
    elif test == 'Mann-Whitney-ls':
        u_stat, pval = stats.mannwhitneyu(
            box_data1, box_data2, alternative='less', **stats_params
        )
        result = StatResult(
            'Mann-Whitney-Wilcoxon test smaller',
            'M.W.W.',
            'U_stat',
            u_stat,
            pval,
        )
    elif test == 't-test_ind':
        stat, pval = stats.ttest_ind(a=box_data1, b=box_data2, **stats_params)
        result = StatResult(
            't-test independent samples', 't-test_ind', 'stat', stat, pval
        )
    elif test == 't-test_welch':
        stat, pval = stats.ttest_ind(
            a=box_data1, b=box_data2, equal_var=False, **stats_params
        )
        result = StatResult(
            'Welch\'s t-test independent samples',
            't-test_welch',
            'stat',
            stat,
            pval,
        )
    elif test == 't-test_paired':
        stat, pval = stats.ttest_rel(a=box_data1, b=box_data2, **stats_params)
        result = StatResult(
            't-test paired samples', 't-test_rel', 'stat', stat, pval
        )
    elif test == 'Wilcoxon':
        # Take `zero_method` out of a copy of the keyword arguments so that it
        # is not passed to scipy twice when the caller supplies it.
        wilcoxon_params = dict(stats_params)
        zero_method_default = "pratt" if len(box_data1) <= 20 else "wilcox"
        zero_method = wilcoxon_params.pop('zero_method', zero_method_default)
        print("Using zero_method ", zero_method)
        stat, pval = stats.wilcoxon(
            box_data1, box_data2, zero_method=zero_method, **wilcoxon_params
        )
        result = StatResult(
            'Wilcoxon test (paired samples)', 'Wilcoxon', 'stat', stat, pval
        )
    elif test == 'Kruskal':
        stat, pval = stats.kruskal(box_data1, box_data2, **stats_params)
        result = StatResult(
            'Kruskal-Wallis paired samples', 'Kruskal', 'stat', stat, pval
        )
    elif test == 'chisquare':
        # Compare the number of observations in the two boxes.
        stat, pval = stats.chisquare([box_data1.count(), box_data2.count()], **stats_params)
        result = StatResult(
            'ChiSquare categorical groups', 'ChiSquare', 'stat', stat, pval
        )
    else:
        result = StatResult(None, '', None, None, np.nan)
    # Optionally, run multiple comparisons correction.
    if comparisons_correction == 'bonferroni':
        result.pval = bonferroni(result.pval, num_comparisons)
        result.test_str = result.test_str + ' with Bonferroni correction'
    elif comparisons_correction is None:
        pass
    else:
        # This should never be reached because `comparisons_correction` must
        # be a valid correction method or None.
        raise RuntimeError('Unexpectedly reached end of switch.')
    return result
def bonferroni(p_values, num_comparisons='auto'):
    """Apply Bonferroni correction for multiple comparisons.
    The Bonferroni correction is defined as
        p_corrected = min(num_comparisons * p, 1.0).
    Arguments
    ---------
    p_values: scalar or list-like
        One or more p_values to correct.
    num_comparisons: int or `auto`
        Number of comparisons. Use `auto` to infer the number of comparisons
        from the length of the `p_values` list.
    Returns
    -------
    Scalar or numpy array of corrected p-values.
    """
    # Input checks.
    if np.ndim(p_values) > 1:
        raise_expected_got(
            'Scalar or list-like', 'argument `p_values`', p_values
        )
    if num_comparisons != 'auto':
        try:
            # Raise a TypeError if num_comparisons is not numeric, and raise
            # an AssertionError if it isn't int-like.
            assert np.ceil(num_comparisons) == num_comparisons
        except (AssertionError, TypeError):
            raise_expected_got(
                'Int or `auto`', 'argument `num_comparisons`', num_comparisons
            )
    # Coerce p_values to numpy array.
    p_values_array = np.atleast_1d(p_values)
    if num_comparisons == 'auto':
        # Infer number of comparisons
        num_comparisons = len(p_values_array)
    elif len(p_values_array) > 1 and num_comparisons != len(p_values_array):
        # Warn if multiple p_values have been passed and num_comparisons is
        # set manually.
        warnings.warn(
            'Manually-specified `num_comparisons={}` differs from number of '
            'p_values to correct ({}).'.format(
                num_comparisons, len(p_values_array)
            )
        )
    # Apply correction by multiplying p_values and thresholding at p=1.0.
    # The multiplication is done out of place so that an array passed in by
    # the caller is not modified.
    p_values_array = np.minimum(p_values_array * num_comparisons, 1.0)
    if len(p_values_array) == 1:
        # Return a scalar if input was a scalar.
        return p_values_array[0]
    else:
        return p_values_array
def pval_annotation_text(x, pvalue_thresholds):
    single_value = False
    # `np.array` is a function, not a type, so check against `np.ndarray`.
    if isinstance(x, np.ndarray):
        x1 = x
    else:
        x1 = np.array([x])
        single_value = True
    # Sort the threshold array
    pvalue_thresholds = pd.DataFrame(pvalue_thresholds).sort_values(by=0, ascending=False).values
    x_annot = pd.Series(["" for _ in range(len(x1))])
    for i in range(0, len(pvalue_thresholds)):
        if i < len(pvalue_thresholds)-1:
            condition = (x1 <= pvalue_thresholds[i][0]) & (pvalue_thresholds[i+1][0] < x1)
            x_annot[condition] = pvalue_thresholds[i][1]
        else:
            # Smallest threshold: include the boundary, matching the printed
            # legend (p <= threshold).
            condition = x1 <= pvalue_thresholds[i][0]
            x_annot[condition] = pvalue_thresholds[i][1]
    return x_annot if not single_value else x_annot.iloc[0]
def simple_text(pval, pvalue_format, pvalue_thresholds, test_short_name=None):
    """
    Generates simple text for test name and pvalue
    :param pval: pvalue
    :param pvalue_format: format string for pvalue
    :param test_short_name: Short name of test to show
    :param pvalue_thresholds: String to display per pvalue range
    :return: simple annotation
    """
    # Sort thresholds
    thresholds = sorted(pvalue_thresholds, key=lambda x: x[0])
    # Test name if passed
    text = test_short_name + " " if test_short_name else ""
    for threshold in thresholds:
        if pval < threshold[0]:
            pval_text = "p ≤ {}".format(threshold[1])
            break
    else:
        # No threshold matched: show the formatted p-value itself.
        pval_text = "p = {}".format(pvalue_format).format(pval)
    return text + pval_text
# The default value `plot='boxplot'` was removed, so `plot` is now a required argument.
def add_stat_annotation(ax, plot,
                        data=None, x=None, y=None, hue=None, units=None, order=None,
                        hue_order=None, box_pairs=None, width=0.8,
                        perform_stat_test=True,
                        pvalues=None, test_short_name=None,
                        test=None, text_format='star', pvalue_format_string=DEFAULT,
                        text_annot_custom=None,
                        loc='inside', show_test_name=True,
                        pvalue_thresholds=DEFAULT, stats_params=dict(),
                        comparisons_correction='bonferroni',
                        use_fixed_offset=False, line_offset_to_box=None,
                        line_offset=None, line_height=0.02, text_offset=1,
                        color='0.2', linewidth=1.5,
                        fontsize='medium', verbose=1):
    """
    Optionally computes statistical test between pairs of data series, and add statistical annotation on top
    of the boxes/bars. The same exact arguments `data`, `x`, `y`, `hue`, `order`, `width`,
    `hue_order` (and `units`) as in the seaborn boxplot/barplot function must be passed to this function.
    This function works in one of the two following modes:
    a) `perform_stat_test` is True: statistical test as given by argument `test` is performed.
    b) `perform_stat_test` is False: no statistical test is performed, list of custom p-values `pvalues` are
       used for each pair of boxes. The `test_short_name` argument is then used as the name of the
       custom statistical test.
    :param plot: type of the plot, one of 'boxplot', 'barplot' or 'countplot'.
    :param line_height: in axes fraction coordinates
    :param text_offset: in points
    :param box_pairs: can be of either form: For non-grouped boxplot: `[(cat1, cat2), (cat3, cat4)]`. For boxplot grouped by hue: `[((cat1, hue1), (cat2, hue2)), ((cat3, hue3), (cat4, hue4))]`
    :param pvalue_format_string: defaults to `"{:.3e}"`
    :param pvalue_thresholds: list of lists, or tuples. Default is: For "star" text_format: `[[1e-4, "****"], [1e-3, "***"], [1e-2, "**"], [0.05, "*"], [1, "ns"]]`. For "simple" text_format : `[[1e-5, "1e-5"], [1e-4, "1e-4"], [1e-3, "0.001"], [1e-2, "0.01"]]`
    :param pvalues: list or array of p-values for each box pair comparison.
    :param comparisons_correction: Method for multiple comparisons correction. `bonferroni` or None.
    """
    def find_x_position_box(box_plotter, boxName):
        """
        boxName can be either a name "cat" or a tuple ("cat", "hue")
        """
        if box_plotter.plot_hues is None:
            cat = boxName
            hue_offset = 0
        else:
            cat = boxName[0]
            hue = boxName[1]
            hue_offset = box_plotter.hue_offsets[
                box_plotter.hue_names.index(hue)]
        group_pos = box_plotter.group_names.index(cat)
        box_pos = group_pos + hue_offset
        return box_pos
    def get_box_data(box_plotter, boxName):
        """
        boxName can be either a name "cat" or a tuple ("cat", "hue")
        Here we really have to duplicate seaborn code, because there is not
        direct access to the box_data in the BoxPlotter class.
        """
        # A conditional expression also works when the category name is falsy (e.g. 0).
        cat = boxName if box_plotter.plot_hues is None else boxName[0]
        index = box_plotter.group_names.index(cat)
        group_data = box_plotter.plot_data[index]
        if box_plotter.plot_hues is None:
            # Draw a single box or a set of boxes
            # with a single level of grouping
            box_data = remove_na(group_data)
        else:
            hue_level = boxName[1]
            hue_mask = box_plotter.plot_hues[index] == hue_level
            box_data = remove_na(group_data[hue_mask])
        return box_data
    # Set default values if necessary
    if pvalue_format_string is DEFAULT:
        pvalue_format_string = '{:.3e}'
        simple_format_string = '{:.2f}'
    else:
        simple_format_string = pvalue_format_string
    if pvalue_thresholds is DEFAULT:
        if text_format == "star":
            pvalue_thresholds = [[0.0001, r"${****}$"], [0.001, r"${***}$"],
                                 [0.01, r"${**}$"], [0.05, r"$*$"], [1, "ns"]]
        else:
            pvalue_thresholds = [[1e-5, "1e-5"], [1e-4, "1e-4"],
                                 [1e-3, "0.001"], [1e-2, "0.01"]]
    fig = plt.gcf()
    # Validate arguments
    if perform_stat_test:
        if test is None:
            raise ValueError("If `perform_stat_test` is True, `test` must be specified.")
        if pvalues is not None or test_short_name is not None:
            raise ValueError("If `perform_stat_test` is True, custom `pvalues` "
                             "or `test_short_name` must be `None`.")
        valid_list = ['t-test_ind', 't-test_welch', 't-test_paired',
                      'Mann-Whitney', 'Mann-Whitney-gt', 'Mann-Whitney-ls',
                      'Levene', 'Wilcoxon', 'Kruskal', 'chisquare']
        if test not in valid_list:
            raise ValueError("test value should be one of the following: {}."
                             .format(', '.join(valid_list)))
    else:
        if pvalues is None:
            raise ValueError("If `perform_stat_test` is False, custom `pvalues` must be specified.")
        if test is not None:
            raise ValueError("If `perform_stat_test` is False, `test` must be None.")
        if len(pvalues) != len(box_pairs):
            raise ValueError("`pvalues` should be of the same length as `box_pairs`.")
    if text_annot_custom is not None and len(text_annot_custom) != len(box_pairs):
        raise ValueError("`text_annot_custom` should be of same length as `box_pairs`.")
    assert_is_in(
        loc, ['inside', 'outside'], label='argument `loc`'
    )
    assert_is_in(
        text_format,
        ['full', 'simple', 'star'],
        label='argument `text_format`'
    )
    assert_is_in(
        comparisons_correction,
        ['bonferroni', None],
        label='argument `comparisons_correction`'
    )
    if verbose >= 1 and text_format == 'star':
        print("p-value annotation legend:")
        pvalue_thresholds = pd.DataFrame(pvalue_thresholds).sort_values(by=0, ascending=False).values
        for i in range(0, len(pvalue_thresholds)):
            if i < len(pvalue_thresholds)-1:
                print('{}: {:.2e} < p <= {:.2e}'.format(pvalue_thresholds[i][1],
                                                        pvalue_thresholds[i+1][0],
                                                        pvalue_thresholds[i][0]))
            else:
                print('{}: p <= {:.2e}'.format(pvalue_thresholds[i][1], pvalue_thresholds[i][0]))
        print()
ylim = ax.get_ylim()
yrange = ylim[1] - ylim[0]
if line_offset is None:
if loc == 'inside':
line_offset = 0.05
if line_offset_to_box is None:
line_offset_to_box = 0.06
# 'outside', see valid_list
else:
line_offset = 0.03
if line_offset_to_box is None:
line_offset_to_box = line_offset
else:
if loc == 'inside':
if line_offset_to_box is None:
line_offset_to_box = 0.06
elif loc == 'outside':
line_offset_to_box = line_offset
y_offset = line_offset*yrange
y_offset_to_box = line_offset_to_box*yrange
    if plot == 'boxplot':
        # Create the same plotter object as seaborn's boxplot
        box_plotter = sns.categorical._BoxPlotter(
            x, y, hue, data, order, hue_order, orient=None, width=width, color=None,
            palette=None, saturation=.75, dodge=True, fliersize=5, linewidth=None)
    elif plot == 'barplot':
        # Create the same plotter object as seaborn's barplot
        box_plotter = sns.categorical._BarPlotter(
            x, y, hue, data, order, hue_order,
            estimator=np.mean, ci=95, n_boot=1000, units=None, seed=None,
            orient=None, color=None, palette=None, saturation=.75,
            errcolor=".26", errwidth=None, capsize=None, dodge=True)
    elif plot == 'countplot':
        # Create the same plotter object as seaborn's countplot
        box_plotter = sns.categorical._CountPlotter(
            x, y, hue, data, order, hue_order,
            estimator=np.mean, ci=95, n_boot=1000, units=None, seed=None,
            orient=None, color=None, palette=None, saturation=.75,
            errcolor=".26", errwidth=None, capsize=None, dodge=True)
    # Build the list of box data structures with the x and ymax positions
    group_names = box_plotter.group_names
    hue_names = box_plotter.hue_names
    if box_plotter.plot_hues is None:
        box_names = group_names
        labels = box_names
    else:
        box_names = [(group_name, hue_name) for group_name in group_names for hue_name in hue_names]
        labels = ['{}_{}'.format(group_name, hue_name) for (group_name, hue_name) in box_names]
    if test == 'chisquare':
        box_structs = [{'box': box_names[i],
                        'label': labels[i],
                        'x': find_x_position_box(box_plotter, box_names[i]),
                        'box_data': get_box_data(box_plotter, box_names[i]),
                        'ymax': np.amax(get_box_data(box_plotter, box_names[i]).count()) if
                        len(get_box_data(box_plotter, box_names[i])) > 0 else np.nan}
                       for i in range(len(box_names))]
    else:
        box_structs = [{'box': box_names[i],
                        'label': labels[i],
                        'x': find_x_position_box(box_plotter, box_names[i]),
                        'box_data': get_box_data(box_plotter, box_names[i]),
                        'ymax': np.amax(get_box_data(box_plotter, box_names[i])) if
                        len(get_box_data(box_plotter, box_names[i])) > 0 else np.nan}
                       for i in range(len(box_names))]
    # Sort the box data structures by position along the x axis
    box_structs = sorted(box_structs, key=lambda x: x['x'])
    # Add the index position in the list of boxes along the x axis
    box_structs = [dict(box_struct, xi=i) for i, box_struct in enumerate(box_structs)]
    # Same data structure list with access key by box name
    box_structs_dic = {box_struct['box']: box_struct for box_struct in box_structs}
    # Build the list of box data structure pairs
    box_struct_pairs = []
    for i_box_pair, (box1, box2) in enumerate(box_pairs):
        valid = box1 in box_names and box2 in box_names
        if not valid:
            raise ValueError("box_pairs contains an invalid box pair.")
        # i_box_pair will keep track of the original order of the box pairs.
        box_struct1 = dict(box_structs_dic[box1], i_box_pair=i_box_pair)
        box_struct2 = dict(box_structs_dic[box2], i_box_pair=i_box_pair)
        if box_struct1['x'] <= box_struct2['x']:
            pair = (box_struct1, box_struct2)
        else:
            pair = (box_struct2, box_struct1)
        box_struct_pairs.append(pair)
    # Draw first the annotations with the shortest between-boxes distance, in order to reduce
    # overlapping between annotations.
    box_struct_pairs = sorted(box_struct_pairs, key=lambda x: abs(x[1]['x'] - x[0]['x']))
    # Build array that contains the x and y_max position of the highest annotation or box data at
    # a given x position, and also keeps track of the number of stacked annotations.
    # This array will be updated when a new annotation is drawn.
    y_stack_arr = np.array([[box_struct['x'] for box_struct in box_structs],
                            [box_struct['ymax'] for box_struct in box_structs],
                            [0 for i in range(len(box_structs))]])
    if loc == 'outside':
        y_stack_arr[1, :] = ylim[1]
    ann_list = []
    test_result_list = []
    ymaxs = []
    y_stack = []
    for box_struct1, box_struct2 in box_struct_pairs:
        box1 = box_struct1['box']
        box2 = box_struct2['box']
        label1 = box_struct1['label']
        label2 = box_struct2['label']
        box_data1 = box_struct1['box_data']
        box_data2 = box_struct2['box_data']
        x1 = box_struct1['x']
        x2 = box_struct2['x']
        xi1 = box_struct1['xi']
        xi2 = box_struct2['xi']
        ymax1 = box_struct1['ymax']
        ymax2 = box_struct2['ymax']
        i_box_pair = box_struct1['i_box_pair']
        # Find y maximum for all the y_stacks *in between* the box1 and the box2
        i_ymax_in_range_x1_x2 = xi1 + np.nanargmax(
            y_stack_arr[1, np.where((x1 <= y_stack_arr[0, :]) &
                                    (y_stack_arr[0, :] <= x2))])
        ymax_in_range_x1_x2 = y_stack_arr[1, i_ymax_in_range_x1_x2]
        if perform_stat_test:
            result = stat_test(
                box_data1,
                box_data2,
                test,
                comparisons_correction,
                len(box_struct_pairs),
                **stats_params
            )
        else:
            test_short_name = test_short_name if test_short_name is not None else ''
            result = StatResult(
                'Custom statistical test',
                test_short_name,
                None,
                None,
                pvalues[i_box_pair]
            )
        result.box1 = box1
        result.box2 = box2
        test_result_list.append(result)
        # Skip pairs that are not significantly different, so that only significant
        # brackets are drawn (https://github.com/webermarcolivier/statannot/issues/25)
        if result.pval > 0.05:
            print(result.box1, 'and', result.box2,
                  'did not show significant differences and the p value = {}'.format(result.pval))
            continue
        else:
            print(result.box1, 'and', result.box2,
                  'did show significant differences and the p value = {}'.format(result.pval))
        if verbose >= 1:
            print("{} v.s. {}: {}".format(label1, label2, result.formatted_output))
        if text_annot_custom is not None:
            text = text_annot_custom[i_box_pair]
        else:
            if text_format == 'full':
                text = "{} p = {}".format('{}', pvalue_format_string).format(result.test_short_name, result.pval)
            elif text_format is None:
                text = None
            # Compare strings with ==, not `is`: identity of string literals is not guaranteed.
            elif text_format == 'star':
                text = pval_annotation_text(result.pval, pvalue_thresholds)
            elif text_format == 'simple':
                test_short_name = show_test_name and test_short_name or ""
                text = simple_text(result.pval, simple_format_string, pvalue_thresholds, test_short_name)
        yref = ymax_in_range_x1_x2
        yref2 = yref
        # Choose the best offset depending on whether there is an annotation below
        # at the x position in the range [x1, x2] where the stack is the highest
        if y_stack_arr[2, i_ymax_in_range_x1_x2] == 0:
            # there is only a box below
            offset = y_offset_to_box
        else:
            # there is an annotation below
            offset = y_offset
        y = yref2 + offset
        h = line_height*yrange
        line_x, line_y = [x1, x1, x2, x2], [y, y + h, y + h, y]
        if loc == 'inside':
            ax.plot(line_x, line_y, lw=linewidth, c=color)
        elif loc == 'outside':
            line = lines.Line2D(line_x, line_y, lw=linewidth, c=color, transform=ax.transData)
            line.set_clip_on(False)
            ax.add_line(line)
        # why should we change here the ylim if at the very end we set it to the correct range????
        # ax.set_ylim((ylim[0], 1.1*(y + h)))
        if text is not None:
            ann = ax.annotate(
                text, xy=(np.mean([x1, x2]), y + h),
                xytext=(0, text_offset), textcoords='offset points',
                xycoords='data', ha='center', va='bottom',
                fontsize=fontsize, clip_on=False, annotation_clip=False)
            ann_list.append(ann)
            plt.draw()
            y_top_annot = None
            got_mpl_error = False
            if not use_fixed_offset:
                try:
                    bbox = ann.get_window_extent()
                    bbox_data = bbox.transformed(ax.transData.inverted())
                    y_top_annot = bbox_data.ymax
                except RuntimeError:
                    got_mpl_error = True
            if use_fixed_offset or got_mpl_error:
                if verbose >= 1:
                    print("Warning: cannot get the text bounding box. Falling back to a fixed"
                          " y offset. Layout may be not optimal.")
                # We will apply a fixed offset in points,
                # based on the font size of the annotation.
                fontsize_points = FontProperties(size='medium').get_size_in_points()
                offset_trans = mtransforms.offset_copy(
                    ax.transData, fig=fig, x=0,
                    y=1.0*fontsize_points + text_offset, units='points')
                y_top_display = offset_trans.transform((0, y + h))
                y_top_annot = ax.transData.inverted().transform(y_top_display)[1]
        else:
            y_top_annot = y + h
        y_stack.append(y_top_annot)  # remark: y_stack is not really necessary if we have the stack_array
        ymaxs.append(max(y_stack))
        # Fill the highest y position of the annotation into the y_stack array
        # for all positions in the range x1 to x2
        y_stack_arr[1, (x1 <= y_stack_arr[0, :]) & (y_stack_arr[0, :] <= x2)] = y_top_annot
        # Increment the counter of annotations in the y_stack array
        y_stack_arr[2, xi1:xi2 + 1] = y_stack_arr[2, xi1:xi2 + 1] + 1
    # Check to see if there are actual significant differences. If none were drawn,
    # ymaxs is empty and the y limits are left untouched.
    if len(ymaxs) > 0:
        y_stack_max = max(ymaxs)
        if loc == 'inside':
            ax.set_ylim((ylim[0], max(1.03*y_stack_max, ylim[1])))
        elif loc == 'outside':
            ax.set_ylim((ylim[0], ylim[1]))
    return ax, test_result_list
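For readers unfamiliar with the 'bonferroni' option validated above, here is a minimal sketch of the textbook Bonferroni adjustment. It is an illustration only, not statannot's own implementation; the helper name bonferroni_adjust and the example p-values are made up for this sketch.

```python
def bonferroni_adjust(pvalues):
    """Multiply each p-value by the number of comparisons, capped at 1.0."""
    n_comparisons = len(pvalues)
    return [min(1.0, p * n_comparisons) for p in pvalues]


# Illustrative raw p-values for three pairwise comparisons (not real data).
raw_pvalues = [0.01, 0.04, 0.5]
adjusted_pvalues = bonferroni_adjust(raw_pvalues)

for raw, adjusted in zip(raw_pvalues, adjusted_pvalues):
    print("raw = {:.3f}  adjusted = {:.3f}".format(raw, adjusted))
```

With three comparisons, a raw p-value of 0.04 is no longer below 0.05 after adjustment, which is why the number of box pairs is passed along with each test in the loop above.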
Could you please add a feature to add_stat_annotation so that it works with a y-axis in log scale?
The output of the code in the notebook is:
pvalue annotation legend:
ns: 5.00e-02 < p <= 1.00e+09
*: 1.00e-02 < p <= 5.00e-02
**: 1.00e-03 < p <= 1.00e-02
***: 1.00e-04 < p <= 1.00e-03
****: p <= 1.00e-04
Since a p-value is always smaller than or equal to one, shouldn't the ns line be: ns: 5.00e-02 < p <= 1?
In case this was intentional, please ignore my comments. Otherwise, I think the following patch fixes the issue:
diff --git a/statannot.py b/statannot.py
index 45b90b3..925f408 100644
--- a/statannot.py
+++ b/statannot.py
@@ -52,7 +52,7 @@ def add_stat_annotation(ax,
data=None, x=None, y=None, hue=None, order=None, hue_order=None,
boxPairList=None,
test='Mann-Whitney', textFormat='star', loc='inside',
- pvalueThresholds=[[1e9,"ns"], [0.05,"*"], [1e-2,"**"], [1e-3,"***"], [1e-4,"****"]],
+ pvalueThresholds=[[1,"ns"], [0.05,"*"], [1e-2,"**"], [1e-3,"***"], [1e-4,"****"]],
color='0.2', lineYOffsetAxesCoord=None, lineHeightAxesCoord=0.02, yTextOffsetPoints=1,
linewidth=1.5, fontsize='medium', useFixedOffset=False, verbose=1):
"""
Hello, is it possible to use add_stat_annotation with a boxplot that has orient='h'? In other words, can the annotations be rotated sideways?
Hello,
It seems that the test_short_name parameter could also be useful when statannot is responsible for performing the tests.
It could enable customization of the text shown on the plots with text_format = "full" or "simple", if someone needs "Wil." instead of "Wilcoxon" on a crowded plot, for example.
What do you think?
Thanks
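To make the request concrete, here is a small sketch of how a user-supplied short name could feed into the "full" annotation text. The helper format_full_annotation is hypothetical and not part of statannot; it only mirrors the "<name> p = <value>" layout used for text_format='full'.

```python
def format_full_annotation(pvalue, test_short_name, pvalue_format_string="{:.3e}"):
    """Build a '<name> p = <value>' label from a p-value and a display name."""
    return "{} p = {}".format(test_short_name, pvalue_format_string.format(pvalue))


# Hypothetical mapping from test names to shorter display names.
display_names = {"Wilcoxon": "Wil."}

label = format_full_annotation(0.002277227049961783, display_names["Wilcoxon"])
print(label)
```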
I have called
fig, ax = plt.subplots(...)
sns.violinplot(ax=ax[0], ...)
add_stat_annotation(ax[0], ...)
ax[0].set_yscale("log")
For both loc='inside' and 'outside', I get very wrong limits on the y-axis. The limits can still be set manually with ax.set_ylim(...), and then everything is fine again. If I skip the set_yscale call, everything works fine.
Is this behaviour expected?
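One plausible contributor, offered as an assumption rather than a confirmed diagnosis: bracket offsets computed as a fraction of the linear y range become very large relative to small values once the axis is logarithmic. The sketch below shows how an offset could instead be taken in log10 space. The helper offset_on_log_axis is hypothetical and not part of statannot.

```python
import math


def offset_on_log_axis(y, ylim, fraction):
    """Shift y upward by `fraction` of the axis height, measured in log10 space.

    Assumes y and both axis limits are strictly positive.
    """
    log_min = math.log10(ylim[0])
    log_max = math.log10(ylim[1])
    return 10 ** (math.log10(y) + fraction * (log_max - log_min))


ylim = (1.0, 1000.0)
y_data_max = 10.0

# Linear rule: offset is a fraction of the linear range.
y_linear = y_data_max + 0.05 * (ylim[1] - ylim[0])
# Log-space rule: offset is the same fraction of the range in decades.
y_log = offset_on_log_axis(y_data_max, ylim, 0.05)

print("linear rule: {:.2f}, log-space rule: {:.2f}".format(y_linear, y_log))
```

On this example the linear rule places the bracket far above the data, while the log-space rule keeps it a visually constant distance above it.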
Hi! I am attempting to use add_stat_annotation to add t-test statistics for group differences to a bar plot, but I keep getting 'nan' for the returned p and t values. I have resorted to computing these beforehand with scipy's ttest_ind function and adding them manually with add_stat_annotation, but it would be great to know if there is a way to fix this! I also tried the other available tests and got nans for all except the Mann-Whitney test. Of note, my x variable is a categorical variable containing 1s, 2s, and 3s, and my y variable is a continuous variable with no 0s or nans. Thanks for any insight!
Hi,
This package will make life so easy! But how do I install it?
Hi,
Thank you so much for making this wonderful plugin; it helps me a lot in my research.
I have one request, and I wonder whether you would consider it for your future plans.
In some cases, we may only want to keep the significance symbols and remove all the "ns" labels.
As far as I can tell, this is not implemented in the current version. Would it be possible to add it in the near future?
Again, this is just a small suggestion, and thanks a lot for your contributions already!
shirley
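Until such an option exists, one workaround is to compute the p-values first, keep only the significant pairs, and pass the filtered lists with perform_stat_test=False. The sketch below shows only the filtering step. The helper keep_significant, the pair names, and the p-values are illustrative assumptions.

```python
def keep_significant(box_pairs, pvalues, alpha=0.05):
    """Return only the pairs (and their p-values) with p <= alpha."""
    kept = [(pair, p) for pair, p in zip(box_pairs, pvalues) if p <= alpha]
    significant_pairs = [pair for pair, _ in kept]
    significant_pvalues = [p for _, p in kept]
    return significant_pairs, significant_pvalues


# Illustrative pairs and precomputed p-values (not real data).
box_pairs = [("A", "B"), ("A", "C"), ("B", "C")]
pvalues = [0.001, 0.2, 0.04]

significant_pairs, significant_pvalues = keep_significant(box_pairs, pvalues)
print(significant_pairs)
print(significant_pvalues)
```

The two returned lists have matching lengths, which is what the validation of `pvalues` against `box_pairs` requires.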
Readme should be updated with the latest interface and examples.
In issue #53, @normarius raised the topic of factoring out the plotting of annotations and the computation of statistical tests into two separate functions.
- Your toolbox currently offers two principal features that are blended into one API function: (1) compute and format p-values, and (2) create annotated brackets. I'd suggest factoring out a function add_annotated_brackets() that is completely agnostic about any testing and p-values. This function creates the brackets and annotations according to the formatting settings. add_stat_annotation() could then just call add_annotated_brackets().
- You could also better factor out the statistical testing part, so that users can intervene on the p-values, filter the brackets of interest, etc. (see e.g. #50), before they actually add the brackets.
- Personally, I think there are plenty of other libraries offering statistical testing. Keeping up with all methodologies (Bonferroni, bootstrapping, ANOVA) is possibly difficult. I'd therefore put the primary focus on how to print these p-values (or other information), and make it as easy as possible for the user to create these annotations using the seaborn semantics.
- The reason for these suggestions: with the current solution, one has to adjust several default parameters to use custom annotations and to disable the testing. It took me a while to figure this out.
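As a rough sketch of what the test-agnostic part of such a split might look like, the geometry of a bracket can be computed without any knowledge of p-values. The helper names below are hypothetical. They only return coordinates, which a plotting function could then hand to ax.plot and ax.annotate.

```python
def bracket_coords(x1, x2, y, height):
    """Return the x and y vertices of a bracket spanning x1..x2, starting at y."""
    line_x = [x1, x1, x2, x2]
    line_y = [y, y + height, y + height, y]
    return line_x, line_y


def bracket_label_anchor(x1, x2, y, height):
    """Return the (x, y) point above which the annotation text is centered."""
    return ((x1 + x2) / 2.0, y + height)


line_x, line_y = bracket_coords(0, 2, 10.0, 0.5)
anchor = bracket_label_anchor(0, 2, 10.0, 0.5)
print(line_x, line_y, anchor)
```

Keeping this step free of statistics means any text, whether a star label, a formatted p-value, or a custom string, can be attached to the same bracket.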
It would be really cool if you add a way to use your function with seaborn.FacetGrid.
This code changes the color of the bars but not the text. How can I change the text color in this plot?
add_stat_annotation(ax, data=exc154_data, hue="Genotype", x="Tissue", y="Transcript Density", order=["Antenna","Early Pulse"],hue_order=["WT","Exc154 homozygous","Exc154 hemizygous"], box_pairs=box_pairs, test='Mann-Whitney', text_format='star', loc='inside', verbose=2,color='0.7')
When I tried to apply add_stat_annotation() to a barplot, I got the following error:
Traceback (most recent call last):
File "plotting.py", line 63, in result_plot
add_stat_annotation(ax, plot="barplot", data=df,
File "/home/myhome/pyenv/py37/lib/python3.7/site-packages/statannot/statannot.py", line 442, in add_stat_annotation
errcolor=".26", errwidth=None, capsize=None, dodge=True)
TypeError: __init__() missing 1 required positional argument: 'seed'
I am using seaborn 0.11.0. I checked seaborn's source code and indeed there is a required seed argument. I think this is a simple bug to fix. Also, it would be nice to have some barplot examples in the Jupyter Notebook, so that the barplot annotation functionality is tested :)
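Because the plotter classes are private seaborn internals whose signatures can change between releases, one defensive idea is to check a constructor's required parameters before calling it. The sketch below uses only the standard library. The two stand-in functions are hypothetical and merely imitate an older and a newer signature; they are not seaborn code.

```python
import inspect


def missing_required_args(func, supplied_names):
    """List parameters of `func` that have no default and were not supplied."""
    missing = []
    for name, param in inspect.signature(func).parameters.items():
        if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
            continue
        if param.default is param.empty and name not in supplied_names:
            missing.append(name)
    return missing


# Hypothetical stand-ins for two versions of a plotter constructor.
def plotter_old(x, y, ci, n_boot):
    pass


def plotter_new(x, y, ci, n_boot, seed):
    pass


supplied = {"x", "y", "ci", "n_boot"}
print(missing_required_args(plotter_old, supplied))
print(missing_required_args(plotter_new, supplied))
```

A caller could use such a check to raise a clearer error message, or to add seed=None only when the installed version requires it.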
Thanks for your good work on this tool, I really appreciate it. Is there any plan to support seaborn catplot?
When I subset a pandas DataFrame with .isin() and pass the result to
add_stat_annotation(ax[0,0], data=SubsetDf, x=x, y=y, hue=hue, box_pairs=box_pairs, perform_stat_test=False, pvalues=SubsetDf["P-val"], text_annot_custom=SubsetDf["Text"], loc='inside', order=order, verbose=0)
I get TypeError: unhashable type: 'numpy.ndarray'.
SubsetDf is reindexed after subsetting.
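The traceback is not shown, so the following is only a guess: box names are used as dictionary keys inside the function, and a box pair whose elements are arrays or lists rather than tuples or strings cannot be hashed. A small normalization step, sketched here with a hypothetical helper to_hashable, converts nested sequences into tuples before they are passed as box_pairs.

```python
def to_hashable(item):
    """Recursively convert lists (or other iterables) into tuples; keep strings as-is."""
    if isinstance(item, (str, bytes)):
        return item
    try:
        return tuple(to_hashable(sub) for sub in item)
    except TypeError:
        # Not iterable: a plain scalar, which is already hashable.
        return item


# Illustrative pairs built from lists, as they might come out of a DataFrame.
raw_pairs = [[["A", "x"], ["A", "y"]], ["B", "C"]]
clean_pairs = [to_hashable(pair) for pair in raw_pairs]
print(clean_pairs)

# Each box name can now serve as a dictionary key.
lookup = {box: index for index, (box, _) in enumerate(clean_pairs)}
print(lookup)
```

Passing list(SubsetDf["P-val"]) rather than the Series itself may also be worth trying, since the p-values are looked up by position.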