Scrape parse configs / Generate
Test or generate configurations.
Route: /api/{tenant:minlength(2)}/v{version:apiVersion}/scrape_parse_configs/generate
Method: POST
Authorisation: Minimum role: ADMIN
Request Arguments
| Name | Type | Source |
|---|---|---|
| input | DTO_scrape_parse_config_generate_input | Body |
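As a minimal sketch of how the route and method above fit together, the following Python fills in the route template and prepares a POST request without sending it. The base URL and the function names are hypothetical, and the `minlength(2)` constraint is read here as "tenant must be at least two characters long". How credentials are passed is not described in this section, so no authentication header is attached.

```python
import json
import urllib.request
from urllib.parse import quote


def build_generate_url(base_url: str, tenant: str, version: str) -> str:
    """Fill in the route template for the generate endpoint."""
    # Mirrors the minlength(2) constraint on the tenant route parameter.
    if len(tenant) < 2:
        raise ValueError("tenant must be at least 2 characters long")
    return (
        f"{base_url.rstrip('/')}/api/{quote(tenant, safe='')}"
        f"/v{quote(version, safe='')}/scrape_parse_configs/generate"
    )


def build_generate_request(url: str, body: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying the input DTO as JSON."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

Sending the prepared request (for example with `urllib.request.urlopen`) is left out on purpose, since it needs a real host and an ADMIN-level credential.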
Request body example
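No example body is given here, so the following is only an illustrative sketch built from the fields of DTO_scrape_parse_config_generate_input. All values are hypothetical. It assumes that enums are sent as their integer values and that `configs` may be left empty in generate mode; neither is confirmed by this reference.

```python
import json

# Integer value of GENERATE in ENUM scraper_parse_generate_mode.
MODE_GENERATE = 2


def build_generate_input(scrape_domain_id, test_urls, explanations):
    """Assemble a DTO_scrape_parse_config_generate_input for generate mode."""
    return {
        "scrape_domain_id": scrape_domain_id,
        # Assumption: no existing configs are passed when generating new ones.
        "configs": [],
        "test_urls": list(test_urls),
        # Assumption: the enum is serialized as its integer value.
        "mode": MODE_GENERATE,
        "human_explained_extractions": list(explanations),
    }


body = build_generate_input(
    scrape_domain_id=1,  # hypothetical id
    test_urls=["https://example.com/articles/1"],  # hypothetical URL
    explanations=["The article title", "The publication date"],
)
print(json.dumps(body, indent=2))
```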
Response object
Response status: 200 (OK)
Response type: DTO_scrape_parse_config_generate_output
Wrapper: DTO_response_wrap
Other response statuses: 400 (BadRequest), 403 (Forbidden)
Response body example
DTO_scrape_parse_config_generate_input
| Name | Type | ReadOnly | Description |
|---|---|---|---|
| scrape_domain_id | Int32 | | Scrape domain id |
| configs | DTO_scrape_parse_config[] | | Configs |
| test_urls | String[] | | Test urls |
| mode | ENUM scraper_parse_generate_mode | | Mode |
| human_explained_extractions | String[] | | For generate mode, explain what should be extracted. |
DTO_scrape_parse_config_generate_output
| Name | Type | ReadOnly | Description |
|---|---|---|---|
| configs | DTO_scrape_parse_config[] | | Configs |
| extractions | DTO_scrape_parse_config_generate_output_extraction[] | | Extractions |
DTO_scrape_parse_config
| Name | Type | ReadOnly | Description |
|---|---|---|---|
| created_by | DTO_reference_user | ReadOnly | Created by |
| last_updated_by | DTO_reference_user | ReadOnly | Last updated by |
| sub_configs | DTO_scrape_parse_config[] | | Sub configs |
| id | Int32 | ReadOnly | Leave empty on input |
| scrape_domain_id | Int32 | | Scrape domain id |
| apply_on_url_pattern | String | | Apply on url pattern |
| unique_key | String | | Unique key |
| xpath | String | | Xpath |
| is_collection | Boolean | | Is collection |
| extraction_mode | ENUM scraper_parse_extraction_mode | | Extraction mode |
| post_process_rule | ENUM scraper_parse_post_process | | Post process rule |
| used_last | DateTime (nullable) | | Time the config was last used. Updated only if the stored value is more than 10 minutes old (or similar), to avoid too many updates. |
| created | DateTime | ReadOnly | Created |
| last_updated | DateTime | ReadOnly | Last updated |
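Since several fields of DTO_scrape_parse_config are marked ReadOnly and `id` should be left empty on input, a client that reuses a config returned by the API may want to drop those fields before sending it back. The helper below is a hedged sketch of that idea; the function name is hypothetical, and whether the server ignores or rejects read-only fields on input is not stated in this reference.

```python
# Fields marked ReadOnly in the DTO_scrape_parse_config table.
READ_ONLY_FIELDS = {"id", "created_by", "last_updated_by", "created", "last_updated"}


def strip_read_only(config: dict) -> dict:
    """Return a copy of a config without read-only fields, recursing into sub_configs."""
    cleaned = {k: v for k, v in config.items() if k not in READ_ONLY_FIELDS}
    cleaned["sub_configs"] = [
        strip_read_only(sub) for sub in config.get("sub_configs") or []
    ]
    return cleaned


# Hypothetical config as it might come back from the API.
existing = {
    "id": 12,
    "scrape_domain_id": 1,
    "unique_key": "article",
    "xpath": "//article",
    "is_collection": True,
    "created": "2024-01-01T00:00:00Z",
    "sub_configs": [
        {"id": 13, "unique_key": "title", "xpath": ".//h1", "extraction_mode": 2},
    ],
}
reusable = strip_read_only(existing)
```

The original dictionary is left unchanged, so the stripped copy can be edited and submitted without losing the server-assigned metadata.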
DTO_reference_user
| Name | Type | ReadOnly | Description |
|---|---|---|---|
| profile_pic | String | ReadOnly | Profile pic |
| | String | ReadOnly | |
| id | Int32 | | Id |
| label | Object | ReadOnly | Label |
DTO_scrape_parse_config_generate_output_extraction
| Name | Type | ReadOnly | Description |
|---|---|---|---|
| key_index | Int32 | | Key index |
| key | String | | Key |
| sub_key | String | | Sub key |
| values | DTO_scrape_parse_config_generate_output_extraction_value[] | | Values |
DTO_scrape_parse_config_generate_output_extraction_value
| Name | Type | ReadOnly | Description |
|---|---|---|---|
| test_url | String | | Test url |
| value | String | | Value |
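The output nests one list of values (one per test URL) under each extraction. As an illustration of how that structure could be read, the sketch below flattens a DTO_scrape_parse_config_generate_output into rows of `(key, sub_key, test_url, value)`. The function name and sample data are hypothetical, and the function expects the output object itself, since the layout of DTO_response_wrap is not described here.

```python
def flatten_extractions(output: dict) -> list:
    """Turn the nested extractions into (key, sub_key, test_url, value) tuples."""
    rows = []
    for extraction in output.get("extractions") or []:
        for item in extraction.get("values") or []:
            rows.append(
                (
                    extraction.get("key"),
                    extraction.get("sub_key"),
                    item.get("test_url"),
                    item.get("value"),
                )
            )
    return rows


# Hypothetical output for two test URLs.
sample_output = {
    "configs": [],
    "extractions": [
        {
            "key_index": 0,
            "key": "article",
            "sub_key": "title",
            "values": [
                {"test_url": "https://example.com/a", "value": "First title"},
                {"test_url": "https://example.com/b", "value": "Second title"},
            ],
        }
    ],
}
rows = flatten_extractions(sample_output)
```

A flat list like this makes it easy to compare, per test URL, what each key extracted.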
ENUM scraper_parse_extraction_mode
| Value | Name | Description |
|---|---|---|
| 0 | NOT_SET | NOT_SET |
| 1 | INNER_HTML | INNER_HTML |
| 2 | INNER_TEXT | INNER_TEXT |
| 3 | ATTRIBUTE_HREF | ATTRIBUTE_HREF |
| 4 | ATTRIBUTE_SRC | ATTRIBUTE_SRC |
| 5 | DATETIME | DATETIME |
| 6 | ARIA_LABEL | ARIA_LABEL |
ENUM scraper_parse_post_process
| Value | Name | Description |
|---|---|---|
| 0 | NOT_SET | NOT_SET |
| 1 | PATH_JOIN | PATH_JOIN |
ENUM scraper_parse_generate_mode
| Value | Name | Description |
|---|---|---|
| 0 | UNKNOWN | UNKNOWN |
| 1 | TEST | TEST |
| 2 | GENERATE | GENERATE |
| 3 | WARM_UP | WARM_UP |
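For client code, the three enums above could be mirrored as Python `IntEnum` classes so that names are used instead of bare numbers. The class names are hypothetical; the member names and values are copied from the tables. This sketch assumes the API accepts the integer values in JSON, which this reference does not state explicitly.

```python
import json
from enum import IntEnum


class ScraperParseExtractionMode(IntEnum):
    NOT_SET = 0
    INNER_HTML = 1
    INNER_TEXT = 2
    ATTRIBUTE_HREF = 3
    ATTRIBUTE_SRC = 4
    DATETIME = 5
    ARIA_LABEL = 6


class ScraperParsePostProcess(IntEnum):
    NOT_SET = 0
    PATH_JOIN = 1


class ScraperParseGenerateMode(IntEnum):
    UNKNOWN = 0
    TEST = 1
    GENERATE = 2
    WARM_UP = 3


# IntEnum members are serialized by json.dumps as plain integers.
payload = json.dumps({"mode": ScraperParseGenerateMode.TEST})
```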