HR records

1. Extract information from HR records with an image URL or PDF URL as input

API:

| Method | URL |
| --- | --- |
| GET | https://cloud.computervision.com.vn/api/v2/ocr/employee_profile |

Params:

| Key | Value | Description |
| --- | --- | --- |
| img | https://example.com/image.png | URL of the image or PDF |
| format_type | url | Type of input data. Accepted values: url, file, base64 |
| get_thumb | true/false | Whether to return the aligned image |

Demo Python:

import requests

api_key = "YOUR_API_KEY"
api_secret = "YOUR_API_SECRET"
image_url = "https://example.com/image.png"

# Pass the query parameters via `params` so that requests URL-encodes image_url.
response = requests.get(
    "https://cloud.computervision.com.vn/api/v2/ocr/employee_profile",
    params={"img": image_url, "format_type": "url", "get_thumb": "false"},
    auth=(api_key, api_secret),  # HTTP Basic authentication
)
print(response.json())
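
The query parameters in the table above can also be assembled and checked before a request is made. The following is a minimal sketch using only the standard library; `build_query` and `ALLOWED_FORMAT_TYPES` are hypothetical names, not part of the API, and sending get_thumb as the lowercase strings true/false is an assumption based on the values shown in the table.

```python
from urllib.parse import urlencode

BASE_URL = "https://cloud.computervision.com.vn/api/v2/ocr/employee_profile"
ALLOWED_FORMAT_TYPES = ("url", "file", "base64")  # values listed in the Params table


def build_query(format_type, get_thumb=False, img=None):
    """Return the full request URL with a URL-encoded query string."""
    if format_type not in ALLOWED_FORMAT_TYPES:
        raise ValueError(f"format_type must be one of {ALLOWED_FORMAT_TYPES}")
    params = {}
    if img is not None:
        params["img"] = img  # only needed when format_type is "url"
    params["format_type"] = format_type
    # Assumption: the API expects the lowercase strings "true"/"false".
    params["get_thumb"] = "true" if get_thumb else "false"
    return f"{BASE_URL}?{urlencode(params)}"


print(build_query("url", img="https://example.com/image.png"))
```

Rejecting an unknown format_type locally avoids spending a request on a call that the service would presumably answer with error code 6.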

2. Extract information from HR records with an image file or PDF file as input

API:

| Method | URL | content-type |
| --- | --- | --- |
| POST | https://cloud.computervision.com.vn/api/v2/ocr/employee_profile | multipart/form-data |

Params:

| Key | Value | Description |
| --- | --- | --- |
| format_type | file | Type of input data. Accepted values: url, file, base64 |
| get_thumb | true/false | Whether to return the aligned image |

Body:

| Key | Type | Value | Description |
| --- | --- | --- | --- |
| img | file | example.jpg | Image file or PDF file |

Demo Python:

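import requests

api_key = "YOUR_API_KEY"
api_secret = "YOUR_API_SECRET"
image_path = "/path/to/your/image.jpg"

# Open the file in a context manager so the handle is closed after the upload.
with open(image_path, "rb") as image_file:
    response = requests.post(
        "https://cloud.computervision.com.vn/api/v2/ocr/employee_profile",
        params={"format_type": "file", "get_thumb": "false"},
        auth=(api_key, api_secret),
        files={"img": image_file},  # sent as multipart/form-data
    )
print(response.json())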

3. Extract information from HR records with JSON input

API:

| Method | URL | content-type |
| --- | --- | --- |
| POST | https://cloud.computervision.com.vn/api/v2/ocr/employee_profile | application/json |

Params:

| Key | Value | Description |
| --- | --- | --- |
| format_type | base64 | Type of input data. Accepted values: url, file, base64 |
| get_thumb | true/false | Whether to return the aligned image |

Body:

{
  "img": "iVBORw0KGgoAAAANSU..." // base64-encoded string of the image or PDF to extract
}

Demo Python:

import base64
import io

import requests
from PIL import Image


def get_byte_img(img):
    """Serialize a PIL image to PNG and return it as a base64 string."""
    img_byte_arr = io.BytesIO()
    img.save(img_byte_arr, format="PNG")
    encoded_img = base64.encodebytes(img_byte_arr.getvalue()).decode("ascii")
    return encoded_img


api_key = "YOUR_API_KEY"
api_secret = "YOUR_API_SECRET"
img_name = "path_img"  # path to the image file
encode_cmt = get_byte_img(Image.open(img_name))
response = requests.post(
    "https://cloud.computervision.com.vn/api/v2/ocr/employee_profile",
    params={"format_type": "base64", "get_thumb": "false"},
    auth=(api_key, api_secret),
    json={"img": encode_cmt},
)
print(response.json())
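
The demo above loads the file through PIL and re-saves it as PNG, which only applies to images. Since the body may also carry a PDF, the raw bytes of the file can be encoded directly instead. This is a minimal sketch, assuming the service accepts standard base64 without line breaks (the documentation does not state this); `build_json_body` is a hypothetical helper name.

```python
import base64
from pathlib import Path


def build_json_body(path):
    """Read a file (image or PDF) and return the JSON body {"img": <base64>}."""
    raw = Path(path).read_bytes()
    # b64encode emits a single line; encodebytes would insert a newline
    # after every 76 characters of output.
    return {"img": base64.b64encode(raw).decode("ascii")}
```

The returned dictionary can be passed as the json argument of requests.post, in place of the dictionary built in the demo.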

4. Response

The response is a JSON object with the following format:

{
  "data": [xxxx],
  "errorCode": string,
  "errorMessage": string
}
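
Client code usually needs to separate the payload from the error fields of this envelope. The following is a minimal sketch, assuming a successful call reports errorCode 0 (as in the error code table) and that the code may arrive either as a number or as a numeric string; `OcrError` and `unwrap_response` are hypothetical names.

```python
class OcrError(Exception):
    """Raised when the response envelope reports a non-zero errorCode."""

    def __init__(self, code, message):
        super().__init__(f"OCR request failed with code {code}: {message}")
        self.code = code
        self.message = message


def unwrap_response(payload):
    """Return payload["data"] on success, otherwise raise OcrError."""
    # Assumption: errorCode may be the number 0 or the string "0" on success.
    code = int(payload["errorCode"])
    if code != 0:
        raise OcrError(code, payload.get("errorMessage", ""))
    return payload.get("data")
```

A caller would typically write `data = unwrap_response(response.json())` and handle `OcrError` in one place.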

The data field is a JSON object with the following format:

{
  "id_card": [xxx],
  "registration_book": [xxx],
  "curriculum_vitae": [xxx],
  "academic_degree": [xxx],
  "birth_certificate": [xxx],
  "health_certification": [xxx],
  "confirm_residence": [xxx],
  "image_negative": [xxx],
  "id_negative": [xxx]
}

Details of the fields are described below:

For the keys id_card, academic_degree, and birth_certificate, the value is an array with one or more elements.

For the keys registration_book and curriculum_vitae, the value is an array with exactly one element.

The key image_negative holds an array of images (in base64 format) from which no information of any document type could be extracted.

Each element in the array has the following format:

{
  "type": string,
  "info": object
}
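
Because every document key in data maps to an array of such elements, the whole result can be walked with one loop. The following is a minimal sketch, assuming data has the shape described above; `iter_documents` is a hypothetical helper, and the sample values are made up for illustration.

```python
NON_DOCUMENT_KEYS = {"image_negative", "id_negative"}


def iter_documents(data):
    """Yield (key, type, info) for every extracted document in data."""
    for key, elements in data.items():
        if key in NON_DOCUMENT_KEYS:
            continue  # these hold unreadable pages, not {type, info} elements
        for element in elements or []:
            yield key, element.get("type"), element.get("info", {})


# Illustrative sample only, not real API output.
sample = {
    "curriculum_vitae": [{"type": "curriculum_vitae", "info": {"name": "NGUYEN VAN A"}}],
    "image_negative": [],
    "id_negative": [],
}
for key, doc_type, info in iter_documents(sample):
    print(key, doc_type, sorted(info))
```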

If the type field is id_doc, the info field contains the following fields:

  • address
  • address_box
  • address_confidence
  • dob
  • dob_box
  • dob_confidence
  • gender
  • gender_box
  • gender_confidence
  • hometown
  • hometown_box
  • hometown_confidence
  • id
  • id_box
  • id_confidence
  • image_front
  • image_back
  • issue_date
  • issue_date_box
  • issue_date_confidence
  • issued_at
  • issued_at_box
  • issued_at_confidence
  • name
  • name_box
  • name_confidence

If the type field is curriculum_vitae, the info field contains the following fields:

  • academic_level
  • academic_level_box
  • academic_level_confidence
  • academic_level_id
  • current_address
  • current_address_box
  • current_address_confidence
  • current_address_id
  • dob
  • dob_box
  • dob_confidence
  • dob_id
  • father_dob
  • father_dob_box
  • father_dob_confidence
  • father_dob_id
  • father_name
  • father_name_box
  • father_name_confidence
  • father_name_id
  • gender
  • gender_box
  • gender_confidence
  • gender_id
  • image_0
  • image_1
  • image_2
  • image_3
  • mother_dob
  • mother_dob_box
  • mother_dob_confidence
  • mother_dob_id
  • mother_name
  • mother_name_box
  • mother_name_confidence
  • mother_name_id
  • name
  • name_box
  • name_confidence
  • name_id
  • place_of_birth
  • place_of_birth_box
  • place_of_birth_confidence
  • place_of_birth_id
  • work_experience
  • work_experience_box
  • work_experience_confidence
  • work_experience_id

If the type field is registration_book, the info field contains the following fields:

  • address
  • address_box
  • address_confidence
  • book_number
  • book_number_box
  • book_number_confidence
  • head_name
  • head_name_box
  • head_name_confidence
  • image
  • member: This field is a list. Each element contains the following fields:
    • dob
    • dob_box
    • dob_confidence
    • gender
    • gender_box
    • gender_confidence
    • id_card
    • id_card_box
    • id_card_confidence
    • image_member
    • name
    • name_box
    • name_confidence
    • relationship_to_head
    • relationship_to_head_box
    • relationship_to_head_confidence
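
The nested member list can be flattened into simple rows for display or export. The following is a minimal sketch, assuming info follows the field list above; `member_rows` is a hypothetical helper and is not part of the API.

```python
def member_rows(info):
    """Return one (name, relationship_to_head, dob) tuple per household member."""
    rows = []
    for member in info.get("member") or []:
        rows.append((
            member.get("name"),
            member.get("relationship_to_head"),
            member.get("dob"),
        ))
    return rows
```

Fields that are absent from a member come back as None, so the rows always have the same width.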

If the type field is academic_degree, the info field contains the following fields:

  • academic_level
  • academic_level_box
  • academic_level_confidence
  • award_classification
  • award_classification_box
  • award_classification_confidence
  • dob
  • dob_box
  • dob_confidence
  • graduation_year
  • graduation_year_box
  • graduation_year_confidence
  • image
  • major
  • major_box
  • major_confidence
  • name
  • name_box
  • name_confidence
  • school
  • school_box
  • school_confidence

If the type field is birth_certificate, the info field contains the following fields:

  • dob
  • dob_box
  • dob_confidence
  • father_address
  • father_address_box
  • father_address_confidence
  • father_dob
  • father_dob_box
  • father_dob_confidence
  • father_name
  • father_name_box
  • father_name_confidence
  • gender
  • gender_box
  • gender_confidence
  • image
  • mother_address
  • mother_address_box
  • mother_address_confidence
  • mother_dob
  • mother_dob_box
  • mother_dob_confidence
  • mother_name
  • mother_name_box
  • mother_name_confidence
  • number
  • number_box
  • number_confidence
  • number_book
  • number_book_box
  • number_book_confidence
  • place_of_birth
  • place_of_birth_box
  • place_of_birth_confidence
  • regis_place
  • regis_place_box
  • regis_place_confidence

If the type field is confirm_residence, the info field contains the following fields:

  • address
  • address_box
  • address_confidence
  • current_address
  • current_address_box
  • current_address_confidence
  • dob
  • dob_box
  • dob_confidence
  • ethnicity
  • ethnicity_box
  • ethnicity_confidence
  • gender
  • gender_box
  • gender_confidence
  • head_id
  • head_id_box
  • head_id_confidence
  • head_name
  • head_name_box
  • head_name_confidence
  • hometown
  • hometown_box
  • hometown_confidence
  • id
  • id_box
  • id_confidence
  • image
  • image_member
  • member: This field is a list. Each element contains the following fields:
    • relationship_to_head
    • relationship_to_head_box
    • relationship_to_head_confidence
    • name
    • name_box
    • name_confidence
    • dob
    • dob_box
    • dob_confidence
    • gender
    • gender_box
    • gender_confidence
    • id_card
    • id_card_box
    • id_card_confidence
  • name
  • name_box
  • name_confidence
  • nationality
  • nationality_box
  • nationality_confidence
  • registered_address
  • registered_address_box
  • registered_address_confidence
  • relationship_to_head
  • relationship_to_head_box
  • relationship_to_head_confidence
  • religious
  • religious_box
  • religious_confidence

If the type field is health_certification, the info field contains the following fields:

  • name
  • name_box
  • name_confidence
  • name_id
  • dob
  • dob_box
  • dob_confidence
  • dob_id
  • health_condition
  • health_condition_box
  • health_condition_confidence
  • health_condition_id
  • height
  • height_box
  • height_confidence
  • height_id
  • weight
  • weight_box
  • weight_confidence
  • weight_id
  • image_0
  • image_1
  • image_2

Pages from which no information could be extracted are reported in two fields of data, image_negative and id_negative:

  • image_negative: a list of base64 images of the pages from which no information could be extracted.
  • id_negative: a list of indices giving the position of each such page within the records.
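
The two lists can be combined to report which pages failed. The following is a minimal sketch, assuming image_negative and id_negative are parallel lists in the same order (the text does not state this explicitly); `negative_pages` is a hypothetical helper.

```python
def negative_pages(data):
    """Return a list of (page_index, base64_image) pairs for unreadable pages."""
    images = data.get("image_negative") or []
    indices = data.get("id_negative") or []
    if len(images) != len(indices):
        # The parallel-list assumption does not hold for this response.
        raise ValueError("image_negative and id_negative differ in length")
    return list(zip(indices, images))
```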

Note: every information field (except the image fields) is accompanied by corresponding _box and _confidence fields.
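
This naming convention makes it possible to regroup a flat info object by field. The following is a minimal sketch; `group_fields` is a hypothetical helper, and since the format of the _box values and the range of the _confidence values are not documented here, they are passed through unchanged.

```python
def group_fields(info):
    """Group each field with its _box and _confidence companions.

    Keys starting with "image" are skipped; any other key without a
    matching base field is kept as a plain value.
    """
    grouped = {}
    for key, value in info.items():
        if key.startswith("image"):
            continue
        for suffix in ("_box", "_confidence"):
            base = key[: -len(suffix)]
            if key.endswith(suffix) and base in info:
                # Store under "box" or "confidence" next to the base field.
                grouped.setdefault(base, {})[suffix[1:]] = value
                break
        else:
            grouped.setdefault(key, {})["value"] = value
    return grouped
```

For example, the keys name, name_box, and name_confidence end up under a single "name" entry with the sub-keys value, box, and confidence.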

Error code table:

| Code | Message |
| --- | --- |
| 0 | Success |
| 1 | The photo does not contain content |
| 2 | Url is unavailable |
| 3 | Incorrect image format |
| 4 | Out of requests |
| 5 | Incorrect api_key or api_secret |
| 6 | Incorrect format type |
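
The table can be mirrored in client code so that a code is turned into a readable message even when errorMessage is not at hand. The following is a minimal sketch; `ERROR_MESSAGES` and `describe_error` are hypothetical names, and accepting the code as either a number or a numeric string is an assumption.

```python
# Messages copied from the error code table above.
ERROR_MESSAGES = {
    0: "Success",
    1: "The photo does not contain content",
    2: "Url is unavailable",
    3: "Incorrect image format",
    4: "Out of requests",
    5: "Incorrect api_key or api_secret",
    6: "Incorrect format type",
}


def describe_error(code):
    """Return the documented message for an error code, given as int or str."""
    code = int(code)
    return ERROR_MESSAGES.get(code, f"Unknown error code {code}")


print(describe_error("5"))
```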