HR records
1. Extract information from HR records with an image URL or PDF URL input
API:
| Method | URL | 
|---|---|
| GET | https://demo.computervision.com.vn/api/v2/ocr/employee_profile | 
Params:
| Key | Value | Description | 
|---|---|---|
| img | https://example.com/image.png | URL of the image or PDF | 
| format_type | url | Type of input data; accepted values: url, file, base64 | 
| get_thumb | true/false | Whether to return an aligned image | 
Demo Python:

    import requests

    api_key = "YOUR_API_KEY"
    api_secret = "YOUR_API_SECRET"
    image_url = "https://example.com/image.png"

    # Pass the query parameters via `params` so that requests URL-encodes them.
    response = requests.get(
        "https://demo.computervision.com.vn/api/v2/ocr/employee_profile",
        params={"img": image_url, "format_type": "url", "get_thumb": "false"},
        auth=(api_key, api_secret),
    )
    print(response.json())

2. Extract information from HR records with an image file or PDF file input
API:
| Method | URL | content-type | 
|---|---|---|
| POST | https://demo.computervision.com.vn/api/v2/ocr/employee_profile | multipart/form-data | 
Params:
| Key | Value | Description | 
|---|---|---|
| format_type | file | Type of input data; accepted values: url, file, base64 | 
| get_thumb | true/false | Whether to return an aligned image | 
Body:
| Key | Type | Value | Description | 
|---|---|---|---|
| img | file | example.jpg | Image file or PDF file | 
Demo Python:

    import requests

    api_key = "YOUR_API_KEY"
    api_secret = "YOUR_API_SECRET"
    image_path = "/path/to/your/image.jpg"

    # Open the file in a context manager so it is closed after the upload.
    with open(image_path, "rb") as image_file:
        response = requests.post(
            "https://demo.computervision.com.vn/api/v2/ocr/employee_profile",
            params={"format_type": "file", "get_thumb": "false"},
            auth=(api_key, api_secret),
            files={"img": image_file},
        )
    print(response.json())

3. Extract information from HR records with JSON input
API:
| Method | URL | content-type | 
|---|---|---|
| POST | https://demo.computervision.com.vn/api/v2/ocr/employee_profile | application/json | 
Params:
| Key | Value | Description | 
|---|---|---|
| format_type | base64 | Type of input data; accepted values: url, file, base64 | 
| get_thumb | true/false | Whether to return an aligned image | 
Body:

    {
      "img": "iVBORw0KGgoAAAANSU..."  // base64 string of the image or PDF to extract
    }

Demo Python:

    import base64
    import io

    import requests
    from PIL import Image


    def get_byte_img(img):
        """Encode a PIL image as a base64 string of its PNG bytes."""
        img_byte_arr = io.BytesIO()
        img.save(img_byte_arr, format="PNG")
        encoded_img = base64.encodebytes(img_byte_arr.getvalue()).decode("ascii")
        return encoded_img


    api_key = "YOUR_API_KEY"
    api_secret = "YOUR_API_SECRET"
    img_name = "path_img"
    encode_cmt = get_byte_img(Image.open(img_name))
    response = requests.post(
        "https://demo.computervision.com.vn/api/v2/ocr/employee_profile",
        params={"format_type": "base64", "get_thumb": "false"},
        auth=(api_key, api_secret),
        json={"img": encode_cmt},
    )
    print(response.json())

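The demo above builds the base64 string by re-saving the image as PNG through PIL, which cannot open a PDF. The following is a minimal sketch of an alternative that encodes the original file bytes directly. It assumes the service accepts the base64 of the unmodified file, which the body description ("image or pdf") suggests but does not confirm. The helper names `encode_file_base64` and `build_json_body` are hypothetical.

```python
import base64
from pathlib import Path


def encode_file_base64(path):
    """Return the raw bytes of the file at `path` as an ASCII base64 string."""
    raw = Path(path).read_bytes()
    # b64encode returns a single line with no embedded newlines.
    return base64.b64encode(raw).decode("ascii")


def build_json_body(path):
    """Build the JSON body for format_type=base64.

    The "img" key comes from the Body description above; sending raw
    file bytes (rather than re-encoded PNG bytes) is an assumption.
    """
    return {"img": encode_file_base64(path)}
```

The resulting dictionary can be passed as the `json=` argument of `requests.post` in place of `{'img': encode_cmt}`.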
4. Response
The response is a JSON object with the following format:

    {
      "data": [xxxx],
      "errorCode": string,
      "errorMessage": string
    }

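A small helper can check this envelope before the payload is used. This is a sketch, not part of the service's client library: `OcrApiError` and `unwrap_response` are hypothetical names, and treating code 0 as success is taken from the error code table in this document. Because the format above declares `errorCode` as a string while the table lists numbers, the code compares string forms to accept both.

```python
class OcrApiError(Exception):
    """Raised when the response envelope reports a non-success errorCode."""

    def __init__(self, code, message):
        super().__init__(f"OCR API error {code}: {message}")
        self.code = code
        self.message = message


def unwrap_response(payload):
    """Return payload["data"] on success, raise OcrApiError otherwise."""
    # Normalise to a string so that both 0 and "0" count as success.
    code = str(payload.get("errorCode", "")).strip()
    if code != "0":
        raise OcrApiError(code, payload.get("errorMessage", ""))
    return payload.get("data")
```

Typical use would be `data = unwrap_response(response.json())` after any of the three requests shown earlier.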
The data field is a JSON object with the following format:

    {
      "id_card": [xxx],
      "registration_book": [xxx],
      "curriculum_vitae": [xxx],
      "academic_degree": [xxx],
      "birth_certificate": [xxx],
      "health_certification": [xxx],
      "confirm_residence": [xxx],
      "image_negative": [xxx],
      "id_negative": [xxx]
    }

Details of the fields are described below:
For the keys id_card, academic_degree, and birth_certificate, the value is an array with one or more elements.
For the keys registration_book and curriculum_vitae, the value is an array with exactly one element.
The key image_negative holds an array of images (in base64 format) from which no information of any document type could be extracted.
Each element in the array has the following format:

    {
      "type": string,
      "info": object
    }

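The structure described so far (a `data` object whose keys map to arrays of `{"type", "info"}` elements) can be walked with a short generator. This is a sketch under the assumption that `data` is a dictionary as shown above; `iter_documents` is a hypothetical name and the sample values are made up for illustration.

```python
# Keys of `data` that hold unrecognised pages rather than document elements.
NEGATIVE_KEYS = {"image_negative", "id_negative"}


def iter_documents(data):
    """Yield (key, type, info) for every extracted document in `data`."""
    for key, elements in data.items():
        if key in NEGATIVE_KEYS:
            continue
        # Tolerate a missing or null array for a document type.
        for element in elements or []:
            yield key, element.get("type"), element.get("info", {})


# Illustrative sample only; real responses contain many more fields.
sample = {
    "id_card": [{"type": "id_doc", "info": {"name": "NGUYEN VAN A"}}],
    "academic_degree": [{"type": "academic_degree", "info": {"major": "Law"}}],
    "image_negative": ["aGVsbG8="],
    "id_negative": [3],
}

for key, doc_type, info in iter_documents(sample):
    print(key, doc_type, sorted(info))
```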
If the type field is id_doc, the info field contains the following fields:
- address
- address_box
- address_confidence
- dob
- dob_box
- dob_confidence
- gender
- gender_box
- gender_confidence
- hometown
- hometown_box
- hometown_confidence
- id
- id_box
- id_confidence
- image_front
- image_back
- issue_date
- issue_date_box
- issue_date_confidence
- issued_at
- issued_at_box
- issued_at_confidence
- name
- name_box
- name_confidence
If the type field is curriculum_vitae, the info field contains the following fields:
- academic_level
- academic_level_box
- academic_level_confidence
- academic_level_id
- current_address
- current_address_box
- current_address_confidence
- current_address_id
- dob
- dob_box
- dob_confidence
- dob_id
- father_dob
- father_dob_box
- father_dob_confidence
- father_dob_id
- father_name
- father_name_box
- father_name_confidence
- father_name_id
- gender
- gender_box
- gender_confidence
- gender_id
- image_0
- image_1
- image_2
- image_3
- mother_dob
- mother_dob_box
- mother_dob_confidence
- mother_dob_id
- mother_name
- mother_name_box
- mother_name_confidence
- mother_name_id
- name
- name_box
- name_confidence
- name_id
- place_of_birth
- place_of_birth_box
- place_of_birth_confidence
- place_of_birth_id
- work_experience
- work_experience_box
- work_experience_confidence
- work_experience_id
If the type field is registration_book, the info field contains the following fields:
- address
- address_box
- address_confidence
- book_number
- book_number_box
- book_number_confidence
- head_name
- head_name_box
- head_name_confidence
- image
- member: This field is a list. Each element contains the following fields:
  - dob
  - dob_box
  - dob_confidence
  - gender
  - gender_box
  - gender_confidence
  - id_card
  - id_card_box
  - id_card_confidence
  - image_member
  - name
  - name_box
  - name_confidence
  - relationship_to_head
  - relationship_to_head_box
  - relationship_to_head_confidence

If the type field is academic_degree, the info field contains the following fields:
- academic_level
- academic_level_box
- academic_level_confidence
- award_classification
- award_classification_box
- award_classification_confidence
- dob
- dob_box
- dob_confidence
- graduation_year
- graduation_year_box
- graduation_year_confidence
- image
- major
- major_box
- major_confidence
- name
- name_box
- name_confidence
- school
- school_box
- school_confidence
If the type field is birth_certificate, the info field contains the following fields:
- dob
- dob_box
- dob_confidence
- father_address
- father_address_box
- father_address_confidence
- father_dob
- father_dob_box
- father_dob_confidence
- father_name
- father_name_box
- father_name_confidence
- gender
- gender_box
- gender_confidence
- image
- mother_address
- mother_address_box
- mother_address_confidence
- mother_dob
- mother_dob_box
- mother_dob_confidence
- mother_name
- mother_name_box
- mother_name_confidence
- number
- number_box
- number_confidence
- number_book
- number_book_box
- number_book_confidence
- place_of_birth
- place_of_birth_box
- place_of_birth_confidence
- regis_place
- regis_place_box
- regis_place_confidence
If the type field is confirm_residence, the info field contains the following fields:
- address
- address_box
- address_confidence
- current_address
- current_address_box
- current_address_confidence
- dob
- dob_box
- dob_confidence
- ethnicity
- ethnicity_box
- ethnicity_confidence
- gender
- gender_box
- gender_confidence
- head_id
- head_id_box
- head_id_confidence
- head_name
- head_name_box
- head_name_confidence
- hometown
- hometown_box
- hometown_confidence
- id
- id_box
- id_confidence
- image
- image_member
- member: This field is a list. Each element contains the following fields:
  - relationship_to_head
  - relationship_to_head_box
  - relationship_to_head_confidence
  - name
  - name_box
  - name_confidence
  - dob
  - dob_box
  - dob_confidence
  - gender
  - gender_box
  - gender_confidence
  - id_card
  - id_card_box
  - id_card_confidence
- name
- name_box
- name_confidence
- nationality
- nationality_box
- nationality_confidence
- registered_address
- registered_address_box
- registered_address_confidence
- relationship_to_head
- relationship_to_head_box
- relationship_to_head_confidence
- religious
- religious_box
- religious_confidence
If the type field is health_certification, the info field contains the following fields:
- name
- name_box
- name_confidence
- name_id
- dob
- dob_box
- dob_confidence
- dob_id
- health_condition
- health_condition_box
- health_condition_confidence
- health_condition_id
- height
- height_box
- height_confidence
- height_id
- weight
- weight_box
- weight_confidence
- weight_id
- image_0
- image_1
- image_2
Pages from which no information could be extracted are reported in two fields of data, image_negative and id_negative:
- image_negative: a list of base64 images of the pages from which no information could be extracted.
- id_negative: a list of indexes giving the position of each such page within the record.
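The two lists can be combined to recover the unrecognised pages as raw image bytes. The sketch below assumes that `image_negative` and `id_negative` are parallel (the n-th index belongs to the n-th image), which the description implies but does not state explicitly. `negative_pages` is a hypothetical helper name.

```python
import base64


def negative_pages(data):
    """Return a list of (page_index, image_bytes) for unrecognised pages."""
    images = data.get("image_negative") or []
    indexes = data.get("id_negative") or []
    # zip stops at the shorter list, so a length mismatch drops the
    # unmatched tail instead of raising.
    return [(idx, base64.b64decode(img)) for idx, img in zip(indexes, images)]
```

Each returned byte string can then be written to disk or opened with an image library for manual review.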
Note: every information field (except image fields) is accompanied by corresponding _box and _confidence fields.
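Because each value arrives alongside its `_box` and `_confidence` companions in one flat object, it can be convenient to regroup them per field. The following sketch treats a key as an information field when a matching `_box` or `_confidence` key exists, and returns everything else (image fields, `_id` fields) separately. `group_fields` is a hypothetical name, and the box layout and confidence scale in the sample are invented for illustration, since this document does not specify them.

```python
def group_fields(info):
    """Split a flat `info` object into grouped fields and leftover keys."""
    suffixes = ("_box", "_confidence")
    grouped = {}
    for key, value in info.items():
        # A key is an information field if it has a companion key.
        if any(f"{key}{suffix}" in info for suffix in suffixes):
            grouped[key] = {
                "value": value,
                "box": info.get(f"{key}_box"),
                "confidence": info.get(f"{key}_confidence"),
            }
    companions = {f"{key}{suffix}" for key in grouped for suffix in suffixes}
    extras = {
        key: value
        for key, value in info.items()
        if key not in grouped and key not in companions
    }
    return grouped, extras


# Illustrative sample only; box format and confidence range are assumptions.
sample_info = {
    "name": "NGUYEN VAN A",
    "name_box": [10, 20, 200, 48],
    "name_confidence": 0.98,
    "name_id": 0,
    "image_0": "aGVsbG8=",
}
fields, extras = group_fields(sample_info)
```

Grouping by companion keys rather than by stripping suffixes avoids misreading fields such as `head_id`, whose name ends in `_id` but which is an information field in its own right.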
Error code table:
| Code | Message | 
|---|---|
| 0 | Success | 
| 1 | The photo does not contain content | 
| 2 | Url is unavailable | 
| 3 | Incorrect image format | 
| 4 | Out of requests | 
| 5 | Incorrect api_key or api_secret | 
| 6 | Incorrect format type |
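For client-side logging, the table above can be mirrored as a lookup. This is a sketch: `ERROR_MESSAGES` and `describe_error` are hypothetical names, the messages are copied from the table, and the conversion through `int` is there because the response format declares `errorCode` as a string.

```python
# Messages copied from the error code table in this document.
ERROR_MESSAGES = {
    0: "Success",
    1: "The photo does not contain content",
    2: "Url is unavailable",
    3: "Incorrect image format",
    4: "Out of requests",
    5: "Incorrect api_key or api_secret",
    6: "Incorrect format type",
}


def describe_error(code):
    """Return the documented message for `code`, given as int or str."""
    try:
        return ERROR_MESSAGES[int(code)]
    except (KeyError, ValueError, TypeError):
        # Codes outside the documented table are reported as unknown.
        return f"Unknown error code: {code!r}"
```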