
Commit 6659d93

Merge pull request #11 from learningOrchestra/joubert_learningupdate
Joubert learningupdate
2 parents ad904df + 910cafd commit 6659d93

36 files changed

Lines changed: 985 additions & 959 deletions

README.md

Lines changed: 11 additions & 4 deletions
@@ -19,12 +19,19 @@ pip install learning-orchestra-client
 
 # Usage
 
-Each functionality in learningOrchestra is contained in its own class. Check the [python client docs](https://learningorchestra.github.io/pythonClient/) for all the available.
+Each interoperable REST API service described in Learning Orchestra is translated
+into Python. Details are in the [python client docs](https://learningorchestra.github.io/pythonClient/).
+Furthermore, some extra method calls are included in the Python client API to simplify
+the Machine Learning services even further. For instance, the REST API is asynchronous,
+except for GET HTTP requests, but the Python client also enables synchronous API calls.
+The wait API method, useful for receiving notifications from ML pipelines, is another
+important example of an extension of the original REST API.
+
 
 # Example
 
-* [Here](examples/titanic.py) has an example using the [Titanic Dataset](https://www.kaggle.com/c/titanic/overview):
-* [Here](examples/sentiment_analysis.py) has an example using the [Sentiment Analysis On IMDb reviews](https://www.kaggle.com/avnika22/imdb-perform-sentiment-analysis-with-scikit-learn):
-* [Here](examples/mnist_async.py) has an example using the [MNIST Dataset](http://yann.lecun.com/exdb/mnist/):
+* [Here](pipeline/titanic.py) is an example using the [Titanic Dataset](https://www.kaggle.com/c/titanic/overview):
+* [Here](pipeline/imdb.py) is an example using the [Sentiment Analysis On IMDb reviews](https://www.kaggle.com/avnika22/imdb-perform-sentiment-analysis-with-scikit-learn):
+* [Here](pipeline/mnist_async.py) is an example using the [MNIST Dataset](http://yann.lecun.com/exdb/mnist/):
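For readers of this diff, a minimal sketch of the synchronous/asynchronous duality the new Usage text describes. The module path comes from the rename further down this page; the Builder class name, its cluster-address constructor argument and the classifier identifiers are assumptions not shown in this commit, so check the python client docs for the real entry points.

# Sketch only: class name, constructor argument and classifier names are assumptions.
from learning_orchestra_client.builder.builder import Builder

builder = Builder("http://my-cluster:5000")   # hypothetical cluster address
pyspark_code = "# pyspark pre-processing code goes here"  # placeholder modeling code

# Synchronous form: blocks until the predictions are stored.
builder.run_spark_ml_sync(
    train_dataset_name="titanic_train",
    test_dataset_name="titanic_test",
    modeling_code=pyspark_code,
    model_classifiers=["lr", "dt"])

# Asynchronous form: returns at once; wait() is the matching barrier.
builder.run_spark_ml_async(
    train_dataset_name="titanic_train",
    test_dataset_name="titanic_test",
    modeling_code=pyspark_code,
    model_classifiers=["lr", "dt"])
builder.wait("titanic_train")   # pipeline name, per the wait() docstring below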

learning_orchestra_client/builder/__init__.py

Whitespace-only changes.

learning_orchestra_client/builder.py renamed to learning_orchestra_client/builder/builder.py

Lines changed: 44 additions & 37 deletions
@@ -1,6 +1,6 @@
-from .observer import Observer
-from ._response_treat import ResponseTreat
-from ._entity_reader import EntityReader
+from learning_orchestra_client.observe.observe import Observer
+from learning_orchestra_client.util._response_treat import ResponseTreat
+from learning_orchestra_client.util._entity_reader import EntityReader
 import requests
 from typing import Union
 
@@ -26,20 +26,19 @@ def run_spark_ml_sync(self,
                           model_classifiers: list,
                           pretty_response: bool = False) -> Union[dict, str]:
         """
-        description: This method resource join several steps of machine
-        learning workflow (transform, tune, train and evaluate) coupling in
-        a unique resource, builder creates several model predictions using
-        your own modeling code using a defined set of classifiers. This is made
-        synchronously, the caller waits until the model predictions are inserted
-        into the Learning Orchestra storage mechanism.
+        description: This method call runs several steps of a machine
+        learning pipeline (transform, tune, train and evaluate, for instance)
+        using a model code and several classifiers. It represents a way to run
+        an entire pipeline. The caller waits until the method execution ends,
+        since it is a synchronous method.
 
         train_dataset_name: Represent final train dataset.
         test_dataset_name: Represent final test dataset.
-        modeling_code: Represent Python3 code for pyspark preprocessing model
-        model_classifiers: list of initial classifiers to be used in model
-        pretty_response: returns indented string for visualization if True
+        modeling_code: Represents the Python3 code of the pyspark pre-processing model
+        model_classifiers: list of initial classifiers to be used in the model
+        pretty_response: if True it returns an indented string, useful for visualization
 
-        return: The resulted predictions URIs.
+        return: The set of predictions (their URIs).
         """
 
         request_body_content = {
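To make the synchronous call above concrete, a hedged sketch of a possible invocation; the parameter names match the signature in this hunk, while the Builder class name, its constructor argument and the classifier identifiers are assumptions (see the note under the README diff).

# Assumptions: Builder class name, constructor argument and classifier names.
from learning_orchestra_client.builder.builder import Builder

builder = Builder("http://my-cluster:5000")   # hypothetical cluster address
modeling_code = """
# pyspark pre-processing of the train/test datasets would go here
"""
predictions = builder.run_spark_ml_sync(
    train_dataset_name="titanic_train",
    test_dataset_name="titanic_test",
    modeling_code=modeling_code,
    model_classifiers=["lr", "dt", "rf"],
    pretty_response=True)
print(predictions)   # per the docstring: the predictions' URIs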
@@ -63,22 +62,19 @@ def run_spark_ml_async(self,
                            model_classifiers: list,
                            pretty_response: bool = False) -> Union[dict, str]:
         """
-        description: This method resource join several steps of machine
-        learning workflow (transform, tune, train and evaluate) coupling in
-        a unique resource, builder creates several model predictions using
-        your own modeling code using a defined set of classifiers. This is made
-        asynchronously, the caller does not wait until the model predictions are
-        inserted into the Learning Orchestra storage mechanism. Instead, the
-        caller receives a JSON object with a URL to proceed future calls to
-        verify if the model predictions are inserted.
+        description: This method call runs several steps of a machine
+        learning pipeline (transform, tune, train and evaluate, for instance)
+        using a model code and several classifiers. It represents a way to run
+        an entire pipeline. The caller does not wait until the method execution
+        ends, since it is an asynchronous method.
 
         train_dataset_name: Represent final train dataset.
         test_dataset_name: Represent final test dataset.
-        modeling_code: Represent Python3 code for pyspark preprocessing model
-        model_classifiers: list of initial classifiers to be used in model
-        pretty_response: returns indented string for visualization if True
+        modeling_code: Represents the Python3 code of the pyspark pre-processing model
+        model_classifiers: list of initial classifiers to be used in the model
+        pretty_response: if True it returns an indented string, useful for visualization
 
-        return: The resulted predictions URIs.
+        return: the URL to retrieve the Spark pipeline result
         """
 
         request_body_content = {
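The asynchronous variant returns immediately; a short sketch of capturing its result, under the same assumptions about the Builder class, constructor argument and classifier name as in the earlier notes.

# Assumptions: Builder class name, constructor argument and classifier name.
from learning_orchestra_client.builder.builder import Builder

builder = Builder("http://my-cluster:5000")   # hypothetical cluster address
response = builder.run_spark_ml_async(
    train_dataset_name="mnist_train",
    test_dataset_name="mnist_test",
    modeling_code="# pyspark pre-processing code goes here",
    model_classifiers=["lr"])
# Per the docstring, the call returns at once with the URL of the pipeline
# result; it can be polled later or awaited with wait() (see the last hunk).
print(response)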
@@ -95,10 +91,10 @@ def run_spark_ml_async(self,
     def search_all_builders(self, pretty_response: bool = False) \
             -> Union[dict, str]:
         """
-        description: This method retrieves all model predictions metadata, it
+        description: This method retrieves all model predictions metadata. It
         does not retrieve the model predictions content.
 
-        pretty_response: If true return indented string, else return dict.
+        pretty_response: If true it returns a string, otherwise a dictionary.
 
         return: A list with all model predictions metadata stored in Learning
         Orchestra or an empty result.
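A one-line sketch of listing the stored metadata, again assuming the Builder class name and constructor argument from the earlier notes.

# Assumptions: Builder class name and constructor argument.
from learning_orchestra_client.builder.builder import Builder

builder = Builder("http://my-cluster:5000")   # hypothetical cluster address
# Metadata only; the predictions content itself comes from
# search_builder_register_predictions (next hunk).
print(builder.search_all_builders(pretty_response=True))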
@@ -119,16 +115,16 @@ def search_builder_register_predictions(self,
         description: This method is responsible for retrieving the model
         predictions content.
 
-        pretty_response: If true return indented string, else return dict.
+        pretty_response: If true it returns a string, otherwise a dictionary.
         builder_name: Represents the model predictions name.
         query: Query to make in MongoDB (default: empty query)
         limit: Number of rows to return in pagination (default: 10) (maximum is
         set at 20 rows per request)
         skip: Number of rows to skip in pagination (default: 0)
 
-        return: A page with some tuples or registers inside or an error if there
-        is no such projection. The current page is also returned to be used in
-        future content requests.
+        return: A page with some tuples or registers inside or an error if the
+        pipeline runs incorrectly. The current page is also returned to be used
+        in future content requests.
         """
 
         response = self.__entity_reader.read_entity_content(
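A hedged pagination sketch for the content call above; the parameter names come from the docstring, while the Builder class name, constructor argument and builder name are assumptions.

# Assumptions: Builder class name, constructor argument and builder name.
from learning_orchestra_client.builder.builder import Builder

builder = Builder("http://my-cluster:5000")   # hypothetical cluster address
page = builder.search_builder_register_predictions(
    builder_name="titanic_train",   # the model predictions name
    limit=20,                       # capped at 20 rows per request
    skip=0,                         # raise in steps of `limit` to page through
    pretty_response=True)
print(page)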
@@ -140,7 +136,7 @@ def search_builder(self, builder_name: str, pretty_response: bool = False) \
             -> Union[dict, str]:
         """
         description: This method is responsible for retrieving a specific
-        model prediction metadata.
+        model metadata.
 
         pretty_response: If true return indented string, else return dict.
         builder_name: Represents the model predictions name.
@@ -161,11 +157,11 @@ def delete_builder(self, builder_name: str, pretty_response: bool = False) \
             -> Union[dict, str]:
         """
         description: This method is responsible for deleting a model prediction.
-        The delete operation is always synchronous because it is very fast,
+        The delete operation is always asynchronous,
         since the deletion is performed in background.
 
-        pretty_response: If true return indented string, else return dict.
-        builder_name: Represents the projection name.
+        pretty_response: If true it returns a string, otherwise a dictionary.
+        builder_name: Represents the pipeline name.
 
         return: JSON object with an error message, a warning message or a
         correct delete message
@@ -177,5 +173,16 @@ def delete_builder(self, builder_name: str, pretty_response: bool = False) \
 
         return self.__response_treat.treatment(response, pretty_response)
 
-    def wait(self, dataset_name: str) -> dict:
-        return self.__observer.wait(dataset_name)
+    def wait(self, dataset_name: str, timeout: int = None) -> dict:
+        """
+        description: This method is responsible for creating a synchronization
+        barrier for the run_spark_ml_async method.
+
+        dataset_name: Represents the pipeline name.
+        timeout: Represents the time in seconds to wait for a builder to
+        finish its run.
+
+        return: JSON object with an error message, a warning message or a
+        correct execution of a pipeline
+        """
+        return self.__observer.wait(dataset_name, timeout)
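Finally, a sketch of the barrier pattern this new wait() enables for run_spark_ml_async, under the same assumptions about the Builder class name, constructor argument and classifier names used in the earlier notes.

# Assumptions: Builder class name, constructor argument and classifier names.
from learning_orchestra_client.builder.builder import Builder

builder = Builder("http://my-cluster:5000")   # hypothetical cluster address
builder.run_spark_ml_async(
    train_dataset_name="titanic_train",
    test_dataset_name="titanic_test",
    modeling_code="# pyspark pre-processing code goes here",
    model_classifiers=["lr", "dt"])
# Block until the asynchronous pipeline finishes or the timeout (in seconds)
# expires; the first argument is the pipeline name, per the docstring above.
result = builder.wait("titanic_train", timeout=600)
print(result)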

learning_orchestra_client/dataset.py

Lines changed: 0 additions & 169 deletions
This file was deleted.
