@@ -88,7 +88,7 @@ print(data_type_handler.change_file_type(
8888 type_fields))
8989
9090
91- preprocessing_code = ''' python
91+ preprocessing_code = '''
9292from pyspark.ml import Pipeline
9393from pyspark.sql.functions import (
9494 mean, col, split,
@@ -198,14 +198,14 @@ print(model_builder.create_model(
198198```
199199# Function APIs
200200
201- ## DatabaseApi
201+ ## Database API
202202
203203### read_resume_files
204204
205205``` python
206206read_resume_files(pretty_response = True )
207207```
208- * ` pretty_response ` : return indented string to visualization
208+ * ` pretty_response ` : returns indented ` string ` for visualization (default: ` True ` , returns ` dict ` if ` False ` )
209209(default ` True ` , if ` False ` , return dict)
210210
211211### read_file
@@ -214,156 +214,167 @@ read_resume_files(pretty_response=True)
214214read_file(filename, skip = 0 , limit = 10 , query = {}, pretty_response = True )
215215```
216216
217- * ` filename ` : filename of file
218- * ` skip ` : number of rows amount to skip in pagination (default ` 0 ` )
219- * ` limit ` : number of rows to return in pagination (default ` 10 ` )
220- (max setted in ` 20 ` rows per request)
221- * ` query ` : query to make in mongo (default empty query)
222- * ` pretty_response ` : return indented string to visualization
223- (default ` True ` , if ` False ` , return dict)
217+ * ` filename ` : name of file
218+ * ` skip ` : number of rows to skip in pagination (default: ` 0 ` )
219+ * ` limit ` : number of rows to return in pagination (default: ` 10 ` )
220+ (maximum is set at ` 20 ` rows per request)
221+ * ` query ` : query to run in MongoDB (default: ` {} ` , an empty query)
222+ * ` pretty_response ` : returns indented ` string ` for visualization (default: ` True ` , returns ` dict ` if ` False ` )
224223
225224### create_file
226225
227226``` python
228227create_file(filename, url, pretty_response = True )
229228```
230229
231- * ` filename ` : filename of file to be created
232- * ` url ` : url to csv file
233- * ` pretty_response ` : return indented string to visualization
234- (default ` True ` , if ` False ` , return dict )
230+ * ` filename ` : name of file to be created
231+ * ` url ` : URL of the CSV file
232+ * ` pretty_response ` : returns indented ` string ` for visualization
233+ (default: ` True ` , returns ` dict ` if ` False ` )
235234
236235### delete_file
237236
238237``` python
239238delete_file(filename, pretty_response = True )
240239```
241240
242- * ` filename ` : file filename to be deleted
243- * ` pretty_response ` : return indented string to visualization
244- (default ` True ` , if ` False ` , return dict )
241+ * ` filename ` : name of the file to be deleted
242+ * ` pretty_response ` : returns indented ` string ` for visualization
243+ (default: ` True ` , returns ` dict ` if ` False ` )
245244
246- ## Projection
245+ ## Projection API
247246
248247### create_projection
249248
250249``` python
251250create_projection(filename, projection_filename, fields, pretty_response = True )
252251```
253252
254- * ` filename ` : filename of file to make projection
255- * ` projection_filename ` : filename used to create projection
253+ * ` filename ` : name of the file to make projection
254+ * ` projection_filename ` : name of file used to create projection
256255* ` fields ` : list with fields to make projection
257- * ` pretty_response ` : return indented string to visualization
258- (default ` True ` , if ` False ` , return dict )
256+ * ` pretty_response ` : returns indented ` string ` for visualization
257+ (default: ` True ` , returns ` dict ` if ` False ` )
259258
260- ## DataTypeHandler
259+ ## Data type handler API
261260
262261### change_file_type
263262
264263``` python
265264change_file_type(filename, fields_dict, pretty_response = True )
266265```
267266
268- * ` filename ` : filename of file
267+ * ` filename ` : name of file
269268* ` fields_dict ` : dictionary with ` field ` :` number ` or ` field ` :` string ` keys
270- * ` pretty_response ` : return indented string to visualization
271- (default ` True ` , if ` False ` , return dict)
269+ * ` pretty_response ` : returns indented ` string ` for visualization
270+ (default: ` True ` , returns ` dict ` if ` False ` )
271+
272+ ## Histogram API
272273
273- ## Histogram
274274### create_histogram
275+
275276``` python
276277create_histogram(filename, histogram_filename, fields,
277278 pretty_response = True )
278279```
279280
280- * ` filename ` : filename of file to make histogram
281- * ` histogram_filename ` : filename used to create histogram
281+ * ` filename ` : name of file to make histogram
282+ * ` histogram_filename ` : name of file used to create histogram
282283* ` fields ` : list with fields to make histogram
283- * ` pretty_response ` : return indented string to visualization
284- (default ` True ` , if ` False ` , return dict)
284+ * ` pretty_response ` : returns indented ` string ` for visualization
285+ (default: ` True ` , returns ` dict ` if ` False ` )
286+
287+ ## t-SNE API
285288
286- ## Tsne
287289### create_image_plot
290+
288291``` python
289292create_image_plot(tsne_filename, parent_filename,
290293 label_name = None , pretty_response = True )
291294```
292295
293- * ` parent_filename ` : filename of file to make histogram
294- * ` tsne_filename ` : filename used to create image plot
295- * ` label_name ` : label name to dataset with labeled tuples (default ` None ` , to
296+ * ` parent_filename ` : name of the source file for the image plot
297+ * ` tsne_filename ` : name of file used to create image plot
298+ * ` label_name ` : label name for datasets with labeled tuples (default: ` None ` , for
296299datasets without labeled tuples)
297- * ` pretty_response ` : return indented string to visualization
298- (default ` True ` , if ` False ` , return dict )
300+ * ` pretty_response ` : returns indented ` string ` for visualization
301+ (default: ` True ` , returns ` dict ` if ` False ` )
299302
300303### read_image_plot_filenames
304+
301305``` python
302306read_image_plot_filenames(pretty_response = True )
303307```
304308
305- * ` pretty_response ` : return indented string to visualization
306- (default ` True ` , if ` False ` , return dict )
309+ * ` pretty_response ` : returns indented ` string ` for visualization
310+ (default: ` True ` , returns ` dict ` if ` False ` )
307311
308312### read_image_plot
313+
309314``` python
310315read_image_plot(tsne_filename, pretty_response = True )
311316```
312317
313318* ` tsne_filename ` : filename of a created image plot
314- * ` pretty_response ` : return indented string to visualization
315- (default ` True ` , if ` False ` , return dict )
319+ * ` pretty_response ` : returns indented ` string ` for visualization
320+ (default: ` True ` , returns ` dict ` if ` False ` )
316321
317322### delete_image_plot
323+
318324``` python
319325delete_image_plot(tsne_filename, pretty_response = True )
320326```
321327
322328* ` tsne_filename ` : filename of a created image plot
323- * ` pretty_response ` : return indented string to visualization
324- (default ` True ` , if ` False ` , return dict)
329+ * ` pretty_response ` : returns indented ` string ` for visualization
330+ (default: ` True ` , returns ` dict ` if ` False ` )
331+
332+ ## PCA API
325333
326- ## Pca
327334### create_image_plot
335+
328336``` python
329337create_image_plot(pca_filename, parent_filename,
330338 label_name = None , pretty_response = True )
331339```
332340
333- * ` parent_filename ` : filename of file to make histogram
341+ * ` parent_filename ` : name of the source file for the image plot
334342* ` pca_filename ` : filename used to create image plot
335- * ` label_name ` : label name to dataset with labeled tuples (default ` None ` , to
343+ * ` label_name ` : label name for datasets with labeled tuples (default: ` None ` , for
336344datasets without labeled tuples)
337- * ` pretty_response ` : return indented string to visualization
338- (default ` True ` , if ` False ` , return dict )
345+ * ` pretty_response ` : returns indented ` string ` for visualization
346+ (default: ` True ` , returns ` dict ` if ` False ` )
339347
340348### read_image_plot_filenames
349+
341350``` python
342351read_image_plot_filenames(pretty_response = True )
343352```
344353
345- * ` pretty_response ` : return indented string to visualization
346- (default ` True ` , if ` False ` , return dict )
354+ * ` pretty_response ` : returns indented ` string ` for visualization
355+ (default: ` True ` , returns ` dict ` if ` False ` )
347356
348357### read_image_plot
358+
349359``` python
350360read_image_plot(pca_filename, pretty_response = True )
351361```
352362
353363* ` pca_filename ` : filename of a created image plot
354- * ` pretty_response ` : return indented string to visualization
355- (default ` True ` , if ` False ` , return dict )
364+ * ` pretty_response ` : returns indented ` string ` for visualization
365+ (default: ` True ` , returns ` dict ` if ` False ` )
356366
357367### delete_image_plot
368+
358369``` python
359370delete_image_plot(pca_filename, pretty_response = True )
360371```
361372
362373* ` pca_filename ` : filename of a created image plot
363- * ` pretty_response ` : return indented string to visualization
364- (default ` True ` , if ` False ` , return dict )
374+ * ` pretty_response ` : returns indented ` string ` for visualization
375+ (default: ` True ` , returns ` dict ` if ` False ` )
365376
366- ## ModelBuilder
377+ ## Model builder API
367378
368379### create_model
369380
@@ -372,12 +383,12 @@ create_model(training_filename, test_filename, preprocessor_code,
372383 model_classificator, pretty_response = True )
373384```
374385
375- * ` training_filename ` : filename to be used in training
376- * ` test_filename ` : filename to be used in test
377- * ` preprocessor_code ` : python3 code for pyspark preprocessing model
378- * ` model_classificator ` : list of initial from classificators to be used in model
379- * ` pretty_response ` : return indented string to visualization
380- (default ` True ` , if ` False ` , return dict )
386+ * ` training_filename ` : name of file to be used in training
387+ * ` test_filename ` : name of file to be used in test
388+ * ` preprocessor_code ` : Python 3 code for the PySpark preprocessing model
389+ * ` model_classificator ` : list of initials of the classifiers to be used in the model (for example, ` ["lr", "nb"] ` )
390+ * ` pretty_response ` : returns indented ` string ` for visualization
391+ (default: ` True ` , returns ` dict ` if ` False ` )
381392
382393#### model_classificator
383394
@@ -395,15 +406,15 @@ create_model(training_filename, test_filename, preprocessor_code, ["lr", "nb"])
395406
396407#### preprocessor_code environment
397408
398- The python 3 preprocessing code must use the environment instances in bellow :
409+ The Python 3 preprocessing code must use the environment instances listed below:
399410
400411* ` training_df ` (Instantiated): Spark DataFrame instance for the training file
401412* ` testing_df ` (Instantiated): Spark DataFrame instance for the testing file
402413
403- The preprocessing code must instantiate the variables in below, all instances must be transformed by pyspark VectorAssembler:
414+ The preprocessing code must instantiate the variables below; all instances must be transformed by PySpark's VectorAssembler:
404415
405- * ` features_training ` (Not Instantiated): Spark Dataframe instance for train the model
406- * ` features_evaluation ` (Not Instantiated): Spark Dataframe instance for evaluating trained model accuracy
416+ * ` features_training ` (Not Instantiated): Spark DataFrame instance for training the model
417+ * ` features_evaluation ` (Not Instantiated): Spark DataFrame instance for evaluating the trained model
407418* ` features_testing ` (Not Instantiated): Spark DataFrame instance for testing the model
408419
409420In case you don't want to evaluate the model, set ` features_evaluation ` as ` None ` .
@@ -414,8 +425,7 @@ In case you don't want to evaluate the model, set `features_evaluation` as `None
414425self .fields_from_dataframe(dataframe, is_string)
415426```
416427
417- This method returns string or number fields as a string list from a DataFrame.
428+ This method returns ` string ` or ` number ` fields as a ` string ` list from a DataFrame.
418429
419430* ` dataframe ` : DataFrame instance
420- * ` is_string ` : Boolean parameter, if ` True ` , the method returns the string DataFrame fields, otherwise, returns the numbers DataFrame fields.
421-
431+ * ` is_string ` : Boolean parameter (if ` True ` , the method returns the string DataFrame fields; otherwise, it returns the number DataFrame fields)
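
The behaviour described for `fields_from_dataframe` can be illustrated with a minimal sketch. This is not the project's implementation: it is written as a free function rather than a method, `FakeDataFrame` is a hypothetical stand-in that only mimics the `dtypes` attribute of a Spark DataFrame (a list of `(column_name, type_name)` pairs), and treating every non-`"string"` column as a number field is an assumption.

```python
class FakeDataFrame:
    """Hypothetical stand-in exposing only the `dtypes` attribute of a Spark DataFrame."""

    def __init__(self, dtypes):
        # Same shape as pyspark.sql.DataFrame.dtypes: a list of (column_name, type_name) pairs.
        self.dtypes = dtypes


def fields_from_dataframe(dataframe, is_string):
    """Return the matching column names as a list of strings.

    Assumption: a column is a string field when its type name is "string";
    every other column is treated as a number field.
    """
    return [
        name
        for name, type_name in dataframe.dtypes
        if (type_name == "string") == is_string
    ]


frame = FakeDataFrame([("Name", "string"), ("Age", "int"), ("Fare", "double")])
string_fields = fields_from_dataframe(frame, True)
number_fields = fields_from_dataframe(frame, False)
print(string_fields)  # ['Name']
print(number_fields)  # ['Age', 'Fare']
```

Inside real preprocessing code, the same call would presumably receive `training_df` or `testing_df` instead of the stand-in object.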