Commit 328e8c6

fixed docs
1 parent 06de83d commit 328e8c6

1 file changed

Lines changed: 78 additions & 68 deletions

README.md

@@ -88,7 +88,7 @@ print(data_type_handler.change_file_type(
     type_fields))


-preprocessing_code = '''python
+preprocessing_code = '''
 from pyspark.ml import Pipeline
 from pyspark.sql.functions import (
     mean, col, split,
@@ -198,14 +198,14 @@ print(model_builder.create_model(
 ```
 # Function APIs

-## DatabaseApi
+## Database API

 ### read_resume_files

 ```python
 read_resume_files(pretty_response=True)
 ```
-* `pretty_response`: return indented string to visualization
+* `pretty_response`: returns indented `string` for visualization (default: `True`, returns `dict` if `False`)
 (default `True`, if `False`, return dict)

 ### read_file
@@ -214,156 +214,167 @@ read_resume_files(pretty_response=True)
 read_file(filename, skip=0, limit=10, query={}, pretty_response=True)
 ```

-* `filename` : filename of file
-* `skip`: number of rows amount to skip in pagination (default `0`)
-* `limit`: number of rows to return in pagination (default `10`)
-(max setted in `20` rows per request)
-* `query`: query to make in mongo (default empty query)
-* `pretty_response`: return indented string to visualization
-(default `True`, if `False`, return dict)
+* `filename`: name of the file
+* `skip`: number of rows to skip in pagination (default: `0`)
+* `limit`: number of rows to return in pagination (default: `10`)
+(maximum is set at `20` rows per request)
+* `query`: query to make in MongoDB (default: empty query)
+* `pretty_response`: returns indented `string` for visualization (default: `True`, returns `dict` if `False`)

 ### create_file

 ```python
 create_file(filename, url, pretty_response=True)
 ```

-* `filename`: filename of file to be created
-* `url`: url to csv file
-* `pretty_response`: return indented string to visualization
-(default `True`, if `False`, return dict)
+* `filename`: name of file to be created
+* `url`: URL of the CSV file
+* `pretty_response`: returns indented `string` for visualization
+(default: `True`, returns `dict` if `False`)

 ### delete_file

 ```python
 delete_file(filename, pretty_response=True)
 ```

-* `filename`: file filename to be deleted
-* `pretty_response`: return indented string to visualization
-(default `True`, if `False`, return dict)
+* `filename`: name of the file to be deleted
+* `pretty_response`: returns indented `string` for visualization
+(default: `True`, returns `dict` if `False`)

-## Projection
+## Projection API

 ### create_projection

 ```python
 create_projection(filename, projection_filename, fields, pretty_response=True)
 ```

-* `filename`: filename of file to make projection
-* `projection_filename`: filename used to create projection
+* `filename`: name of the file to make projection
+* `projection_filename`: name of file used to create projection
 * `fields`: list with fields to make projection
-* `pretty_response`: return indented string to visualization
-(default `True`, if `False`, return dict)
+* `pretty_response`: returns indented `string` for visualization
+(default: `True`, returns `dict` if `False`)

-## DataTypeHandler
+## Data type handler API

 ### change_file_type

 ```python
 change_file_type(filename, fields_dict, pretty_response=True)
 ```

-* `filename`: filename of file
+* `filename`: name of file
 * `fields_dict`: dictionary with `field`:`number` or `field`:`string` keys
-* `pretty_response`: return indented string to visualization
-(default `True`, if `False`, return dict)
+* `pretty_response`: returns indented `string` for visualization
+(default: `True`, returns `dict` if `False`)
+
+## Histogram API

-## Histogram
 ### create_histogram
+
 ```python
 create_histogram(filename, histogram_filename, fields,
                  pretty_response=True)
 ```

-* `filename`: filename of file to make histogram
-* `histogram_filename`: filename used to create histogram
+* `filename`: name of file to make histogram
+* `histogram_filename`: name of file used to create histogram
 * `fields`: list with fields to make histogram
-* `pretty_response`: return indented string to visualization
-(default `True`, if `False`, return dict)
+* `pretty_response`: returns indented `string` for visualization
+(default: `True`, returns `dict` if `False`)
+
+## t-SNE API

-## Tsne
 ### create_image_plot
+
 ```python
 create_image_plot(tsne_filename, parent_filename,
                   label_name=None, pretty_response=True)
 ```

-* `parent_filename`: filename of file to make histogram
-* `tsne_filename`: filename used to create image plot
-* `label_name`: label name to dataset with labeled tuples (default `None`, to
+* `parent_filename`: name of file to make histogram
+* `tsne_filename`: name of file used to create image plot
+* `label_name`: label name to dataset with labeled tuples (default: `None`, to
 datasets without labeled tuples)
-* `pretty_response`: return indented string to visualization
-(default `True`, if `False`, return dict)
+* `pretty_response`: returns indented `string` for visualization
+(default: `True`, returns `dict` if `False`)

 ### read_image_plot_filenames
+
 ```python
 read_image_plot_filenames(pretty_response=True)
 ```

-* `pretty_response`: return indented string to visualization
-(default `True`, if `False`, return dict)
+* `pretty_response`: returns indented `string` for visualization
+(default: `True`, returns `dict` if `False`)

 ### read_image_plot
+
 ```python
 read_image_plot(tsne_filename, pretty_response=True)
 ```

 * `tsne_filename`: filename of a created image plot
-* `pretty_response`: return indented string to visualization
-(default `True`, if `False`, return dict)
+* `pretty_response`: returns indented `string` for visualization
+(default: `True`, returns `dict` if `False`)

 ### delete_image_plot
+
 ```python
 delete_image_plot(tsne_filename, pretty_response=True)
 ```

 * `tsne_filename`: filename of a created image plot
-* `pretty_response`: return indented string to visualization
-(default `True`, if `False`, return dict)
+* `pretty_response`: returns indented `string` for visualization
+(default: `True`, returns `dict` if `False`)
+
+## PCA API

-## Pca
 ### create_image_plot
+
 ```python
 create_image_plot(pca_filename, parent_filename,
                   label_name=None, pretty_response=True)
 ```

-* `parent_filename`: filename of file to make histogram
+* `parent_filename`: name of file to make histogram
 * `pca_filename`: filename used to create image plot
-* `label_name`: label name to dataset with labeled tuples (default `None`, to
+* `label_name`: label name to dataset with labeled tuples (default: `None`, to
 datasets without labeled tuples)
-* `pretty_response`: return indented string to visualization
-(default `True`, if `False`, return dict)
+* `pretty_response`: returns indented `string` for visualization
+(default: `True`, returns `dict` if `False`)

 ### read_image_plot_filenames
+
 ```python
 read_image_plot_filenames(pretty_response=True)
 ```

-* `pretty_response`: return indented string to visualization
-(default `True`, if `False`, return dict)
+* `pretty_response`: returns indented `string` for visualization
+(default: `True`, returns `dict` if `False`)

 ### read_image_plot
+
 ```python
 read_image_plot(pca_filename, pretty_response=True)
 ```

 * `pca_filename`: filename of a created image plot
-* `pretty_response`: return indented string to visualization
-(default `True`, if `False`, return dict)
+* `pretty_response`: returns indented `string` for visualization
+(default: `True`, returns `dict` if `False`)

 ### delete_image_plot
+
 ```python
 delete_image_plot(pca_filename, pretty_response=True)
 ```

 * `pca_filename`: filename of a created image plot
-* `pretty_response`: return indented string to visualization
-(default `True`, if `False`, return dict)
+* `pretty_response`: returns indented `string` for visualization
+(default: `True`, returns `dict` if `False`)

-## ModelBuilder
+## Model builder API

 ### create_model

@@ -372,12 +383,12 @@ create_model(training_filename, test_filename, preprocessor_code,
              model_classificator, pretty_response=True)
 ```

-* `training_filename`: filename to be used in training
-* `test_filename`: filename to be used in test
-* `preprocessor_code`: python3 code for pyspark preprocessing model
-* `model_classificator`: list of initial from classificators to be used in model
-* `pretty_response`: return indented string to visualization
-(default `True`, if `False`, return dict)
+* `training_filename`: name of file to be used in training
+* `test_filename`: name of file to be used in test
+* `preprocessor_code`: Python 3 code for PySpark preprocessing model
+* `model_classificator`: list of initials of the classifiers to be used in the model
+* `pretty_response`: returns indented `string` for visualization
+(default: `True`, returns `dict` if `False`)

 #### model_classificator

@@ -395,15 +406,15 @@ create_model(training_filename, test_filename, preprocessor_code, ["lr", "nb"])

 #### preprocessor_code environment

-The python 3 preprocessing code must use the environment instances in bellow:
+The Python 3 preprocessing code must use the environment instances listed below:

 * `training_df` (Instantiated): Spark Dataframe instance training filename
 * `testing_df` (Instantiated): Spark Dataframe instance testing filename

-The preprocessing code must instantiate the variables in below, all instances must be transformed by pyspark VectorAssembler:
+The preprocessing code must instantiate the variables listed below. All instances must be transformed by PySpark's `VectorAssembler`:

-* `features_training` (Not Instantiated): Spark Dataframe instance for train the model
-* `features_evaluation` (Not Instantiated): Spark Dataframe instance for evaluating trained model accuracy
+* `features_training` (Not Instantiated): Spark Dataframe instance for training the model
+* `features_evaluation` (Not Instantiated): Spark Dataframe instance for evaluating the trained model
 * `features_testing` (Not Instantiated): Spark Dataframe instance for testing the model

 In case you don't want to evaluate the model, set `features_evaluation` as `None`.
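
As a rough illustration of the contract described in this hunk, a minimal preprocessing string might look like the sketch below. The column names `Age` and `Fare` are placeholders that do not come from the README, and the sketch assumes the string is executed with `training_df` and `testing_df` already defined. Any label-column handling the classifiers may require is not shown.

```python
# Hypothetical minimal preprocessing code; the column names are assumptions.
preprocessing_code = '''
from pyspark.ml.feature import VectorAssembler

# Placeholder numeric columns; replace with columns from your own dataset.
feature_columns = ["Age", "Fare"]

# Pack the chosen columns into a single "features" vector column.
assembler = VectorAssembler(inputCols=feature_columns, outputCol="features")

# training_df and testing_df are provided by the execution environment.
features_training = assembler.transform(training_df)
features_testing = assembler.transform(testing_df)

# No evaluation split in this sketch.
features_evaluation = None
'''
```

The string would then be passed as the `preprocessor_code` argument of `create_model`.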
@@ -414,8 +425,7 @@ In case you don't want to evaluate the model, set `features_evaluation` as `None
 self.fields_from_dataframe(dataframe, is_string)
 ```

-This method returns string or number fields as a string list from a DataFrame.
+This method returns `string` or `number` fields as a `string` list from a DataFrame.

 * `dataframe`: DataFrame instance
-* `is_string`: Boolean parameter, if `True`, the method returns the string DataFrame fields, otherwise, returns the numbers DataFrame fields.
-
+* `is_string`: Boolean parameter (if `True`, the method returns the string fields of the DataFrame; otherwise, it returns the number fields)
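
One way a helper with this behavior could be written is sketched below. It is not the project's implementation: the function name `fields_from_dtypes` is hypothetical, it works on the `(name, type)` pairs that PySpark exposes as `DataFrame.dtypes`, and it assumes every non-`string` column counts as a number field.

```python
def fields_from_dtypes(dtypes, is_string):
    """Return column names whose type is (or is not) 'string'.

    `dtypes` is a list of (column_name, type_name) pairs, the shape
    PySpark returns from `DataFrame.dtypes`.
    """
    return [
        name for name, type_name in dtypes
        if (type_name == "string") == is_string
    ]


# Hand-written pairs standing in for `dataframe.dtypes`.
example_dtypes = [("Name", "string"), ("Age", "int"), ("Fare", "double")]
string_fields = fields_from_dtypes(example_dtypes, is_string=True)
number_fields = fields_from_dtypes(example_dtypes, is_string=False)
```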
