Second, Apache SystemML provides automatic optimization according to data and cluster characteristics to ensure both efficiency and scalability. Apache SystemML runs in MapReduce or Spark environments.
You'll learn about Apache Spark and the DML scripting language, but probably the most important takeaway will be how to implement an advanced ML system in a parallel, distributed environment.
Apache SystemML will benefit from contributions in several areas. Data scientists can contribute new algorithms or enhance existing ones by making them more robust and accurate. Engineers can build support for other distributed platforms, help with the parser, or improve the performance of the runtime.
Apache SystemML promises to greatly improve the productivity of analysts and data scientists by providing (1) DML, a declarative, R-like language for flexibly expressing custom analytics, and (2) data independence from the underlying input formats and physical data representations.