datasalt / pangool
Tuple MapReduce for Hadoop: Hadoop API made easy
Home Page: http://datasalt.github.io/pangool/
License: Apache License 2.0
And probably not IOException either.
Hi, @pereferrera
The Solr index files were output to the /src/test/resources/solr-es/tmp/hadoop-$user/mapred/local/solr_attempt_local_0001_m_000000_0.1/ folder, not to the expected out-com.datasalt.pangool.solr.TestSolrOutputFormat/part-00000.
This caused all the assertTrue(new File(OUTPUT + "/part-00000/data/index").exists()); assertions to fail.
I ran the TopicalWordCount example and encountered the following problem. Apparently I didn't have sufficient privileges to create files somewhere; the location is read from "hadoop.tmp.dir" in the code (InstancesDistributor.java) if not set explicitly. However, the default of "hadoop.tmp.dir" is "/tmp/hadoop-${user.name}", which is a local-disk path.
It works after I set this parameter, but.....
Exception in thread "main" org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=bjdata, access=WRITE, inode="home":hadoop:supergroup:rwxr-xr-x
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:96)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:58)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.(DFSClient.java:2836)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:500)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:206)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:484)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:465)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:372)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:364)
at com.datasalt.pangool.utils.InstancesDistributor.distribute(InstancesDistributor.java:77)
at com.datasalt.pangool.tuplemr.TupleMRBuilder.createJob(TupleMRBuilder.java:240)
at com.mediav.pangoolCase.TopicalWordCount.run(TopicalWordCount.java:102)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at com.mediav.pangoolCase.TopicalWordCount.main(TopicalWordCount.java:113)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
I tried to run Pangool on Hadoop 0.23, but it does not work well. The unit tests fail with errors.
The logs:
java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at com.datasalt.pangool.tuplemr.mapred.lib.output.ProxyOutputFormat.createOutputFormatIfNeeded(ProxyOutputFormat.java:89)
at com.datasalt.pangool.tuplemr.mapred.lib.output.ProxyOutputFormat.checkOutputSpecs(ProxyOutputFormat.java:79)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:417)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:332)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1236)
at com.shicheng.yarn.WordCountPg.run(WordCountPg.java:108)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
at com.shicheng.yarn.WordCountPgTest.test(WordCountPgTest.java:36)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at org.junit.runners.BlockJUnit4ClassRunner.runNotIgnored(BlockJUnit4ClassRunner.java:79)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:71)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:49)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
Currently it lacks some functionality that TupleMRBuilder has.
Ideally we shouldn't be replicating code. MapOnlyJobBuilder is a nice minimal API, but maybe it could be built on top of TupleMRBuilder. TupleMRBuilder would then need to be adapted to support map-only Jobs.
Because nulls are now possible in Pangool, TupleTextOutputFormat should be able to serialize them with a "nullString".
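A minimal sketch of the idea, using a hypothetical toTextLine helper (not the real TupleTextOutputFormat API): null fields are rendered with a configurable nullString marker instead of crashing or emitting an empty cell.

```java
import java.util.StringJoiner;

public class NullStringDemo {
    // Hypothetical sketch: render one tuple row as delimited text,
    // substituting a configurable marker string for null fields.
    static String toTextLine(Object[] fields, char separator, String nullString) {
        StringJoiner joiner = new StringJoiner(String.valueOf(separator));
        for (Object field : fields) {
            joiner.add(field == null ? nullString : field.toString());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // A row with a null second field, rendered with Hive-style "\N".
        System.out.println(toTextLine(new Object[]{"foo", null, 42}, '\t', "\\N"));
    }
}
```

A symmetric nullString check would be needed on the TupleTextInputFormat side so round-trips preserve nulls.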
I packaged pangool-core and ran all the test cases, but found that test1() and test2() in TestRollup failed.
The reason I found is that test1() and test2() can't clean the output folder.
Has anyone else encountered the same problem?
The TupleTextInputFormat is not properly skipping headers. Instead, it deletes the first character of the file.
Provide better integration with SOLR indexing than Hadoop's SOLR-1301. The idea is that we can leverage Tuples to index them directly, without DocumentConverters.
Additionally, test that with Pangool's Multiple Outputs we can build more than one kind of index in the same Job (which is not possible in Hadoop).
Is Pangool able to work with Hadoop Streaming?
As pointed out by Alexei: https://groups.google.com/forum/#!topic/pangool-user/iWf3BODBI9o
It would be nice to have the possibility of specifying different group bys for the same Pangool Job. Pig seems to have this possibility built into its optimizer. Even though I'm not sure whether this is always the most efficient way of executing several groupBys over the same input, it makes sense for Pangool to support such advanced usages, because the philosophy of Pangool is to make "low-level MapReduce easier". If there are low-level use cases that we don't natively support, I think it is worth taking a look and seeing how reasonable it is to implement them.
Alexei proposes an API based on declaring "Streams", which would be quite different from the current API, where you have to specify intermediate schemas. At first sight I wouldn't recommend such an API revamp, because this is still a particular use case and it doesn't make sense to me to make it so "first class".
We can follow up the discussion here. I have been thinking about it and have a few ideas on the matter:
addGroupBy(groupBy)
.withSchema(schema1);
So the user can either behave as today (use a common group by) or add K particular groupBys to the same Job, each associated with one or more Schemas:
addGroupBy(groupBy)
.withSchema(schema1)
.withSchema(schema2);
The Pangool Mapper would check this and internally emit as many Tuples as needed if one schema has more than one groupBy, assigning different stream ids.
addGroupBy(groupBy)
.withSchema(schema1)
.withSchema(schema2)
.withReducer(reducer);
So the Pangool Reducer would instantiate several reducers in this case and delegate to one or another depending on the "stream id". If these methods are used, the TupleMRBuilder won't complain if setTupleReducer() hasn't been called. If both are used, the most particular one wins: schemas with a particular reducer associated use that reducer, and the rest use the "general" reducer.
addGroupBy(groupBy)
.withSchema(schema1, orderBy1)
.withSchema(schema2, orderBy2)
.withReducer(reducer);
This all seems a bit complex, but it is genuinely challenging to stay as backwards-compatible as possible while remaining coherent with the overall idea of the API and of Pangool.
That would be my two cents as to how this could be implemented.
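The builder calls proposed above could look roughly like the following sketch. All class and method names here are illustrative stand-ins for discussion, not Pangool's actual API; schemas and reducers are represented by plain strings just to show the fluent shape.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the proposed multi-groupBy builder API.
public class MultiGroupByBuilder {
    static class GroupByDef {
        final List<String> groupByFields;
        final List<String> schemas = new ArrayList<>();
        String reducer; // particular reducer for this groupBy, if any

        GroupByDef(List<String> groupByFields) { this.groupByFields = groupByFields; }

        GroupByDef withSchema(String schema) { schemas.add(schema); return this; }
        GroupByDef withReducer(String reducer) { this.reducer = reducer; return this; }
    }

    final List<GroupByDef> groupBys = new ArrayList<>();

    // Each call registers one particular groupBy for the Job.
    GroupByDef addGroupBy(String... fields) {
        GroupByDef def = new GroupByDef(Arrays.asList(fields));
        groupBys.add(def);
        return def;
    }

    public static void main(String[] args) {
        MultiGroupByBuilder builder = new MultiGroupByBuilder();
        builder.addGroupBy("word")
               .withSchema("schema1")
               .withSchema("schema2")
               .withReducer("countReducer");
        System.out.println(builder.groupBys.size() + " groupBy registered");
    }
}
```

The "most particular wins" resolution would then be a lookup: if a GroupByDef has its own reducer, use it; otherwise fall back to the Job-level one.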
Something that remains on my mind is the possibility of closing the loop and giving Pangool all the convenient serialization features that are still missing: Lists and Maps (a Set being a particular case of a Map).
Currently it is possible to serialize them using Avro, but the integration code required doesn't look very nice. Pangool could add a wrapper to make this a little nicer, delegating the serialization to Avro, but then it wouldn't be possible to serialize Lists of arbitrary Objects.
While it is true that full built-in serialization was never the main idea of Pangool, there is no reason why new features that pay off, are easy to implement, and make sense within the whole codebase shouldn't be implemented.
What's more, looking at the current code, it doesn't seem difficult to add proper built-in serialization support for (typed) Lists or Maps. A custom FieldSerialization could be implemented that writes the list length first and then calls the delegate code in SimpleTupleSerializer to serialize the list's typed values.
This would allow for arbitrary typed lists, with the element type defined by a Pangool Field, so the method in Field would be something like:
public static Field createListField(String name, Field type)
Therefore it would be possible to serialize lists of lists of lists. Or lists of Tuples. Or anything which is possible due to this recursion.
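The length-prefix idea can be sketched with plain Java streams. Here the elements are Strings for simplicity; in Pangool the element type would come from the Field and the per-element delegate would be SimpleTupleSerializer rather than writeUTF.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ListSerializationDemo {
    // Write the list size first, then delegate each element.
    static byte[] writeList(List<String> list) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(list.size());      // length prefix
            for (String element : list) {
                out.writeUTF(element);      // per-element delegate
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read the length prefix, then that many elements.
    static List<String> readList(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            int size = in.readInt();
            List<String> list = new ArrayList<>(size);
            for (int i = 0; i < size; i++) {
                list.add(in.readUTF());
            }
            return list;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> original = Arrays.asList("a", "b", "c");
        System.out.println(readList(writeList(original)).equals(original));
    }
}
```

Because the element writer can itself be a list writer, this framing recurses naturally, which is what enables lists of lists of lists.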
Open questions would then be:
This issue can be reproduced by implementing a MapOnlyMapper that overrides setup() and doesn't call super.setup(context).
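A minimal stand-alone reproduction of that pattern, using illustrative classes rather than the real MapOnlyMapper: the base class initializes state in setup(), so a subclass overriding setup() without calling super.setup() leaves that state null.

```java
public class SetupOverrideDemo {
    static class BaseMapper {
        protected StringBuilder collector; // initialized by the framework hook
        public void setup() { collector = new StringBuilder(); }
    }

    static class BrokenMapper extends BaseMapper {
        @Override
        public void setup() {
            // BUG: forgets super.setup(), so 'collector' stays null
        }
    }

    static class FixedMapper extends BaseMapper {
        @Override
        public void setup() {
            super.setup(); // keep the base-class initialization
        }
    }

    public static void main(String[] args) {
        BrokenMapper broken = new BrokenMapper();
        broken.setup();
        FixedMapper fixed = new FixedMapper();
        fixed.setup();
        System.out.println("broken null: " + (broken.collector == null)
                + ", fixed initialized: " + (fixed.collector != null));
    }
}
```

A defensive fix on the framework side is to do the critical initialization outside the overridable setup() hook, so user overrides cannot skip it.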
Like in Storm:
String word = tuple.getString(0);
Such a method could hide the need for converting from UTF8 to Java String.
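A sketch of such an accessor, with a stand-in Tuple class (not Pangool's ITuple): the getter normalizes whatever is stored, so callers never touch raw UTF-8 bytes.

```java
import java.nio.charset.StandardCharsets;

public class GetStringDemo {
    static class Tuple {
        private final Object[] fields;
        Tuple(Object... fields) { this.fields = fields; }

        // Accept either a ready String or raw UTF-8 bytes and normalize,
        // hiding the conversion from the caller.
        String getString(int i) {
            Object value = fields[i];
            if (value instanceof byte[]) {
                return new String((byte[]) value, StandardCharsets.UTF_8);
            }
            return value == null ? null : value.toString();
        }
    }

    public static void main(String[] args) {
        Tuple tuple = new Tuple("hello".getBytes(StandardCharsets.UTF_8), "world");
        System.out.println(tuple.getString(0) + " " + tuple.getString(1));
    }
}
```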
Hi, @pereferrera @epalace @ivanprado
Right now, Pangool has three serialization mechanisms: Hadoop, Protostuff, and Thrift. It does not support Protocol Buffers, which is also widely used.
Do you have plans to add Protocol Buffers serialization to Pangool in the near future?
java.lang.NullPointerException
at com.datasalt.pangool.utils.TupleToAvroRecordConverter.toRecord(TupleToAvroRecordConverter.java:95)
at com.datasalt.pangool.tuplemr.mapred.lib.output.TupleOutputFormat$TupleRecordWriter.write(TupleOutputFormat.java:84)
at com.datasalt.pangool.tuplemr.mapred.lib.output.TupleOutputFormat$TupleRecordWriter.write(TupleOutputFormat.java:63)
at com.datasalt.pangool.tuplemr.mapred.lib.output.PangoolMultipleOutputs.write(PangoolMultipleOutputs.java:364)
at com.datasalt.pangool.tuplemr.mapred.lib.output.PangoolMultipleOutputs.write(PangoolMultipleOutputs.java:340)
at com.datasalt.pangool.tuplemr.MultipleOutputsCollector.write(MultipleOutputsCollector.java:46)
AbstractHadoopTestLibrary.withTupleOutput keeps a map with all the tuples from TupleOutputFormat, but the same instance is reused, so all the entries end up holding the content of the last tuple. We need to modify TupleInputReader to accept a new Tuple instance on every nextKeyValue() call, or create a separate method such as nextTuple(ITuple tuple).
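The reuse bug can be illustrated outside Hadoop with a plain map standing in for the mutable tuple: storing the shared instance makes every entry see the last value, while a defensive copy (or a fresh instance per record) preserves each one.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TupleReuseDemo {
    // Simulates a reader that reuses one mutable "tuple" across records.
    // With defensiveCopy=false every stored entry ends up holding the last
    // value (the bug); with defensiveCopy=true each entry keeps its own.
    static Object firstStoredCount(boolean defensiveCopy) {
        List<Map<String, Object>> stored = new ArrayList<>();
        Map<String, Object> reused = new HashMap<>(); // the single reused instance
        for (int i = 0; i < 3; i++) {
            reused.put("count", i);
            stored.add(defensiveCopy ? new HashMap<>(reused) : reused);
        }
        return stored.get(0).get("count");
    }

    public static void main(String[] args) {
        System.out.println("shared: " + firstStoredCount(false)
                + ", copied: " + firstStoredCount(true));
    }
}
```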
I ran the test and got errors. I debugged it and found it failed in FileOutputCommitter.moveTaskOutputs, but I still don't know the root cause.
Here are my testcase logs:
13/01/24 01:06:05 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
13/01/24 01:06:05 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
13/01/24 01:06:05 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
13/01/24 01:06:05 INFO input.FileInputFormat: Total input paths to process : 1
13/01/24 01:06:06 INFO mapred.JobClient: Running job: job_local_0001
13/01/24 01:06:06 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
13/01/24 01:06:06 INFO input.FileInputFormat: Total input paths to process : 1
13/01/24 01:06:06 INFO mapred.MapTask: io.sort.mb = 100
13/01/24 01:06:06 INFO mapred.MapTask: data buffer = 79691776/99614720
13/01/24 01:06:06 INFO mapred.MapTask: record buffer = 262144/327680
13/01/24 01:06:06 INFO input.DelegatingMapper: [profile] Got input split. Going to look at DC.
13/01/24 01:06:06 INFO input.DelegatingMapper: [profile] Finished. Calling run() on delegate.
13/01/24 01:06:06 INFO mapred.MapTask: Starting flush of map output
13/01/24 01:06:06 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
13/01/24 01:06:06 INFO mapred.MapTask: Finished spill 0
13/01/24 01:06:06 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
13/01/24 01:06:06 INFO mapred.LocalJobRunner:
13/01/24 01:06:06 INFO mapred.TaskRunner: Task attempt_local_0001_m_000000_0 is allowed to commit now
13/01/24 01:06:06 WARN mapred.TaskRunner: Failure committing: java.io.IOException: Failed to save output of task: attempt_local_0001_m_000000_0
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:154)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:165)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:165)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:165)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:165)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitTask(FileOutputCommitter.java:118)
at com.datasalt.pangool.tuplemr.mapred.lib.output.ProxyOutputFormat$ProxyOutputCommitter.commitTask(ProxyOutputFormat.java:157)
at org.apache.hadoop.mapred.Task.commit(Task.java:779)
at org.apache.hadoop.mapred.Task.done(Task.java:691)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:309)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
13/01/24 01:06:06 WARN mapred.LocalJobRunner: job_local_0001
java.io.IOException: Failed to save output of task: attempt_local_0001_m_000000_0
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:154)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:165)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:165)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:165)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:165)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitTask(FileOutputCommitter.java:118)
at com.datasalt.pangool.tuplemr.mapred.lib.output.ProxyOutputFormat$ProxyOutputCommitter.commitTask(ProxyOutputFormat.java:157)
at org.apache.hadoop.mapred.Task.commit(Task.java:779)
at org.apache.hadoop.mapred.Task.done(Task.java:691)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:309)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
13/01/24 01:06:07 INFO mapred.JobClient: map 0% reduce 0%
13/01/24 01:06:07 INFO mapred.JobClient: Job complete: job_local_0001
13/01/24 01:06:07 INFO mapred.JobClient: Counters: 8
13/01/24 01:06:07 INFO mapred.JobClient: FileSystemCounters
13/01/24 01:06:07 INFO mapred.JobClient: FILE_BYTES_READ=25861
13/01/24 01:06:07 INFO mapred.JobClient: FILE_BYTES_WRITTEN=41068
13/01/24 01:06:07 INFO mapred.JobClient: Map-Reduce Framework
13/01/24 01:06:07 INFO mapred.JobClient: Combine output records=6
13/01/24 01:06:07 INFO mapred.JobClient: Map input records=6
13/01/24 01:06:07 INFO mapred.JobClient: Spilled Records=6
13/01/24 01:06:07 INFO mapred.JobClient: Map output bytes=24
13/01/24 01:06:07 INFO mapred.JobClient: Map output records=6
13/01/24 01:06:07 INFO mapred.JobClient: Combine input records=6
HCatalogInputFormat always returns Strings for partition columns, even when these partitions are declared as other types (e.g. INT). In this case, Pangool fails because it does not automatically convert from String to the target type.
Is that an HCatalog bug? Or is it a problem in our integration?
The documentation (tutorial, etc.) seems a bit outdated by now. Also, since we don't automate uploading new javadocs, they seem outdated as well.
We should specifically add a new section illustrating TupleTextInputFormat, which is a very powerful tool in Pangool.
Anything else that comes to mind should be added to this ticket.
The following exception is thrown when an empty group reaches a RollupReducer:
java.lang.NullPointerException
at com.datasalt.pangool.tuplemr.mapred.RollupReducer.run(RollupReducer.java:140)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:662)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:425)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
This only affects the Rollup functionality.
I suspect there is some strange behavior when using Rollup with Hadoop 0.20.2.
I have a Job that uses Rollup and behaves oddly only with Hadoop 0.20.2. These counters show it:
Map output records=15088
Reduce input records=12378
There shouldn't be a difference between reduce input records and map output records in this case. When using Hadoop 0.20.205.0, everything is fine:
Map output records=15038
Reduce input records=15038
The current MOUT is not acting as a full OutputFormat. The MOUT life cycle is managed by calls in the setup(), write() and cleanup() methods of mappers and reducers, and the commit is handled in the cleanup() method. That is not right: it should be handled by an OutputCommitter.
On the other hand, the component responsible for committing the results, including the MOUT results, is the principal OutputFormat. That means MOUT only works if the OutputFormat extends FileOutputFormat, which is not right either. MOUT should work independently of the principal OutputFormat.
The solution should be to capture the principal OutputFormat's signals through some kind of ProxyOutputFormat and forward them to the different OutputFormats, the principal one and the MOUT. We should also have some kind of ProxyOutputCommitter that sends the signals properly to both output systems.
Currently we are using the Hadoop Deserializer class for the custom Pangool serialization: http://pangool.net/userguide/custom_serialization.html
But some serialization mechanisms need the buffer size in order to deserialize properly (e.g. Protocol Buffers). With the current API, the serialization has to write the buffer size to disk... but that is something Pangool has already done! So we would be writing the buffer size twice.
That can be avoided by passing the buffer size to the deserialize() method. It would mean we would have to stop using the Hadoop Deserializer class and use our own new interface instead.
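A sketch of what such an interface might look like; the name and shape are assumptions for discussion, not an actual Pangool or Hadoop API. The caller already knows the payload length from Pangool's own framing and simply passes it down.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class LengthAwareDeserializerDemo {
    // Hypothetical replacement for Hadoop's Deserializer: the buffer size
    // is provided by the framework instead of being re-written to disk.
    interface LengthAwareDeserializer<T> {
        T deserialize(InputStream in, int length);
    }

    // Trivial implementation: the payload is raw UTF-8 text whose size is
    // known from the framing the caller already wrote.
    static final LengthAwareDeserializer<String> UTF8 = (in, length) -> {
        try {
            byte[] buffer = new byte[length];
            new DataInputStream(in).readFully(buffer);
            return new String(buffer, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    };

    public static void main(String[] args) {
        byte[] payload = "hello".getBytes(StandardCharsets.UTF_8);
        String result = UTF8.deserialize(new ByteArrayInputStream(payload), payload.length);
        System.out.println(result);
    }
}
```

A Protocol Buffers implementation would do the same: read exactly length bytes and hand them to the message parser, with no second size prefix on disk.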
Currently one can read any TupleFile and process any output from a TupleMRBuilder without needing to care about the Schema, which is good. But if you have a set of Schemas and evolve them, for example by adding one or two fields, and you want to use these instead of the ones that are saved in the TupleFiles, you have to write custom business logic for copying the input Tuple to another Tuple (with the new Schema).
The idea would be to allow the user to provide an optional input Schema in TupleReader or TupleMRBuilder, so that if a Schema is provided then the Tuple is instantiated with the user's Schema instead of the file schema, so that maybe not all the fields are read, or some new fields are not written (in that case we have to think of a reasonable API for providing default values) - also note that since #18 it is now possible to serialize nulls in a Tuple.
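The projection step could look like this sketch, with schemas modeled as ordered field-name lists purely for illustration (the real feature would work on Pangool Schema objects): fields present in the file keep their value, fields only in the user's schema get a default.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SchemaEvolutionDemo {
    // Project a tuple read from a file onto a user-supplied target schema.
    static Map<String, Object> project(Map<String, Object> fileTuple,
                                       List<String> targetSchema,
                                       Object defaultValue) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (String field : targetSchema) {
            // Keep the file's value when present, otherwise fill the default.
            result.put(field, fileTuple.getOrDefault(field, defaultValue));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> fromFile = new LinkedHashMap<>();
        fromFile.put("word", "foo");
        fromFile.put("count", 3);
        // Evolved schema drops "count" and adds "lang".
        Map<String, Object> evolved =
                project(fromFile, Arrays.asList("word", "lang"), null);
        System.out.println(evolved);
    }
}
```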
This should allow the use of stateful serializers/deserializers for OBJECT fields.
Currently the serialization used for OBJECT fields is defined statically in Hadoop's "io.serialization" property, so all the fields with the same class share the same stateless serializer.
This new feature must allow, for instance, multiple Avro fields within a tuple to be serialized and deserialized with their own particular schemas.
For the next Pangool release I'm working on a utility called "Mutator" that allows Schemas to be evolved conveniently. Currently it has these methods:
minusFields(Schema schema, String... minusFields) - returns a Schema like the argument one minus the fields passed as second parameter.
subSetOf(Schema schema, String... subSetFields) - returns a Schema like the argument one containing only the fields passed as second parameter.
superSetOf(Schema schema, Field... newFields) - returns a Schema like the argument one but adding new Fields to it.
jointSchema(Schema leftSchema, Schema rightSchema) - returns a joint Schema of two Schemas, where the left one has priority over the right (first all fields from the left are added, then the ones in the right that are not in left).
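Modeling schemas as ordered field-name lists (purely for illustration; the real Mutator operates on Pangool Schema and Field objects), the semantics of three of these operations can be sketched as:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MutatorDemo {
    // Schema minus the given fields, keeping the original order.
    static List<String> minusFields(List<String> schema, String... minus) {
        List<String> result = new ArrayList<>(schema);
        result.removeAll(Arrays.asList(minus));
        return result;
    }

    // Schema restricted to the given fields, keeping the original order.
    static List<String> subSetOf(List<String> schema, String... keep) {
        List<String> result = new ArrayList<>(schema);
        result.retainAll(Arrays.asList(keep));
        return result;
    }

    // Joint schema: all left fields first, then right fields not in left.
    static List<String> jointSchema(List<String> left, List<String> right) {
        List<String> result = new ArrayList<>(left); // left has priority
        for (String field : right) {
            if (!result.contains(field)) {
                result.add(field);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> schema = Arrays.asList("a", "b", "c");
        System.out.println(minusFields(schema, "b"));                      // [a, c]
        System.out.println(subSetOf(schema, "c", "a"));                    // [a, c]
        System.out.println(jointSchema(schema, Arrays.asList("b", "d"))); // [a, b, c, d]
    }
}
```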
Shout if you have more suggestions or feedback.
Pangool serializes everything (InputFormat, Combiner, Mapper, Reducer, OutputFormat...) so that we can use instances instead of classes. We try to use the DistributedCache when possible (it is not possible for InputFormat and OutputFormat). The current approach works fine; however, instance files may accumulate in "hadoop.tmp.dir" and, if they are not deleted, the cost of loading one increases. There are two things that must be revised:
Currently the main bottleneck is in the Combiner when there are a lot of files. The cost of loading the instance for everything else is negligible. That makes me think of another idea:
In any case, the other things should be done as well.
Pangool should work with the new Hadoop version 0.23. Create a branch to migrate the code.
It should allow printing tuples as CSV or with any other separator.
If the user tries to write to a multiple output that hasn't been configured, and the job doesn't have a default multiple output defined, successive calls become very slow. The reason is that, before checking whether the multiple-output name is valid, a series of expensive actions are performed (reflection, output-format instantiation, ...). This can be seen in the PangoolMultipleOutputs class, in the getRecordWriter() method.
Like : Field field = AvroField.create("my_Avro_field", myAvroSchema, false)
Right now Pangool serializes Thrift using TBinaryProtocol, but it could be interesting to use TCompactProtocol, which uses less space. The idea is to make the protocol selection configurable.
An example use case is a Parquet file where we select different sets of columns and assign each set to a different Mapper. Right now, Pangool overrides the input assignment for the same Path, silently keeping only the last one. A small change is needed for Pangool to be able to delegate to multiple mappers and input formats for the same Path.
A small refactoring in serialization code should allow the possibility of adding stateful serialization for Tuples inside Tuples.
As a first attempt, create an example of CrossProduct with two sources.
Then build a generic N-CrossProduct task that is scalable and assumes that the dataset may not fit in memory, so multiple Job phases will be required.
Add support for nulls. Some schema fields could be marked as nullable; those fields would support null values. The idea is to implement an efficient serialization and comparison that:
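One possible encoding for nullable fields, shown here only as an illustration of the idea (not Pangool's actual wire format), is a leading null bitmask followed by the non-null values; this demo handles up to 8 int fields.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class NullableFieldsDemo {
    // Write a 1-byte null bitmask (one bit per field), then only the
    // non-null values, so nulls cost a single bit each.
    static byte[] write(Integer[] fields) {
        if (fields.length > 8) throw new IllegalArgumentException("demo supports <= 8 fields");
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            int mask = 0;
            for (int i = 0; i < fields.length; i++) {
                if (fields[i] == null) mask |= (1 << i);
            }
            out.writeByte(mask);                     // null bitmask first
            for (Integer field : fields) {
                if (field != null) out.writeInt(field); // non-null values only
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Integer[] read(byte[] data, int numFields) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            int mask = in.readByte() & 0xFF;
            Integer[] fields = new Integer[numFields];
            for (int i = 0; i < numFields; i++) {
                fields[i] = ((mask >> i) & 1) == 1 ? null : in.readInt();
            }
            return fields;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Integer[] original = {7, null, 42};
        System.out.println(Arrays.toString(read(write(original), 3)));
    }
}
```

For comparison, the bitmask can be consulted first so that null sorts consistently before (or after) any non-null value without deserializing the payload.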