Oracle Machine Learning for R (OML4R) makes the open source R statistical programming language and environment ready for the enterprise and big data. Designed for problems involving both large and small volumes of data, OML4R integrates R with Oracle Database.
Data scientists and other R users can apply the R ecosystem to data managed by Oracle Database. R provides a suite of software packages for data manipulation, graphics, statistical functions, and machine learning algorithms. Oracle Machine Learning for R extends R’s capabilities in three primary areas: transparent access to and manipulation of database data from R, in-database machine learning algorithms, and ease of deployment using embedded R execution.
Oracle Machine Learning also supports a "drag and drop" graphical user interface, Oracle Data Miner, that is integrated with Oracle SQL Developer and is capable of executing user-defined R functions as part of user-created analytics workflows.
Transparency layer - Leverage R data.frame proxy objects so that data remains in database tables and views. Overloaded R functions translate selected R functionality to equivalent SQL for in-database processing, parallelism, scalability, and security, so data scientists can use familiar R syntax to manipulate data that stays in the database. The OREdplyr package provides overloaded versions of functions from the popular open source dplyr package.
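As a rough illustration of the proxy idea, the following base R sketch overloads subset() for a hypothetical db_proxy class, so that familiar R syntax builds a SQL string instead of touching any rows. The class name, its fields, and the to_sql() helper are assumptions made for this example and are not part of the OML4R API. A real translation layer would also have to map R operators and functions to their SQL equivalents.

```r
# Hypothetical proxy: stores a table name and pending operations, never the rows.
db_proxy <- function(table) {
  structure(list(table = table, where = character(0), cols = "*"),
            class = "db_proxy")
}

# Overload the generic subset() for the proxy class. The R expressions are
# captured unevaluated and recorded as text rather than applied to data.
subset.db_proxy <- function(x, subset, select, ...) {
  if (!missing(subset)) {
    x$where <- c(x$where, deparse(substitute(subset)))
  }
  if (!missing(select)) {
    x$cols <- all.vars(substitute(select))
  }
  x
}

# Hypothetical helper: render the recorded operations as one SQL statement.
to_sql <- function(p) {
  sql <- paste("SELECT", paste(p$cols, collapse = ", "), "FROM", p$table)
  if (length(p$where) > 0) {
    sql <- paste(sql, "WHERE", paste(p$where, collapse = " AND "))
  }
  sql
}

sales <- db_proxy("SALES")  # "SALES" is a made-up table name
big   <- subset(sales, AMOUNT > 100, select = c(CUST_ID, AMOUNT))
query <- to_sql(big)
cat(query, "\n")
# SELECT CUST_ID, AMOUNT FROM SALES WHERE AMOUNT > 100
```

The point of the design is that the R session only ever holds a description of the query. The work is deferred until the SQL is sent to the database.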
Machine Learning Algorithms - R users can take advantage of Oracle Machine Learning’s library of in-database, parallel algorithms using the R language. Users can specify machine learning models using the familiar R formula syntax. Algorithms support classification, regression, anomaly detection, clustering, feature extraction, time series, and association rules.
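For readers less familiar with it, the R formula syntax names the target on the left of ~ and the predictors on the right. The sketch below uses base R's lm() on the built-in mtcars data purely as a stand-in. The in-database algorithms are assumed to accept a formula written in the same style, and their function names are intentionally not shown here.

```r
# A formula: predict mpg from wt and hp.
f <- mpg ~ wt + hp

# The formula itself can be inspected without fitting anything.
target     <- all.vars(f)[1]                  # left-hand side variable
predictors <- attr(terms(f), "term.labels")   # right-hand side terms

# Fit and score with an ordinary in-memory model (stand-in only).
fit    <- lm(f, data = mtcars)
scores <- predict(fit, newdata = head(mtcars, 3))
```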
Embedded R Execution - Manage and invoke user-defined R functions in Oracle Database for data-parallel, task-parallel, and non-parallel execution. These functions may also use third-party R packages, e.g., from the CRAN repository, so data scientists who need techniques beyond the in-database algorithms to satisfy unique requirements can draw on the wider R ecosystem.
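The three execution modes can be sketched in base R as follows. This is only a conceptual illustration on the built-in iris data: lapply() runs sequentially in the local session and stands in for database-managed invocations, and fit_chunk() is a hypothetical user-defined function. OML4R's own interface for these modes is not shown.

```r
# Hypothetical user-defined function: fit a simple model to one chunk of data.
fit_chunk <- function(df) coef(lm(Sepal.Length ~ Petal.Length, data = df))

# Non-parallel: a single invocation on the whole data set.
whole <- fit_chunk(iris)

# Data-parallel: one invocation per partition of the data (here, per species).
by_species <- lapply(split(iris, iris$Species), fit_chunk)

# Task-parallel: one invocation per task index, e.g. repeated bootstrap fits.
set.seed(1)
boot <- lapply(1:5, function(i) {
  fit_chunk(iris[sample(nrow(iris), replace = TRUE), ])
})
```

In the data-parallel and task-parallel cases the invocations are independent of one another, which is what allows them to be distributed.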
Integrated Text Mining - The in-database algorithms accept text columns from tables and views and automate term and theme extraction. The extracted data is combined with other predictors when building models and scoring data.
Partitioned Models - With in-database models, users can automatically create ensembles of models, where each component model is built on a user-specified partition of the data. Scoring is simplified because the ensemble is used as a single integrated model.
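The idea of a partitioned model can be sketched in base R as one component model per partition, wrapped in a single object that routes each row to the right component at scoring time. The partitioned_lm class and build_partitioned() are hypothetical names used only for this illustration, with lm() and the built-in mtcars data as stand-ins. The in-database implementation is not shown and may differ.

```r
# Build one component model per value of the partition column `by`.
build_partitioned <- function(formula, data, by) {
  models <- lapply(split(data, data[[by]]), function(d) lm(formula, data = d))
  structure(list(models = models, by = by), class = "partitioned_lm")
}

# Score through the single wrapper: each row goes to its partition's model.
# Rows whose partition value was not seen during the build are not handled.
predict.partitioned_lm <- function(object, newdata, ...) {
  keys <- as.character(newdata[[object$by]])
  out  <- numeric(nrow(newdata))
  for (k in unique(keys)) {
    rows <- keys == k
    out[rows] <- predict(object$models[[k]], newdata[rows, , drop = FALSE])
  }
  out
}

pm     <- build_partitioned(mpg ~ wt, mtcars, by = "cyl")
scores <- predict(pm, mtcars)
```

The caller scores with one predict() call and does not need to know how many component models exist.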