ThunderJaw committed on
Commit
272cc32
·
verified ·
1 Parent(s): a099259

Create README.md

  1. README.md +91 -0
README.md ADDED
@@ -0,0 +1,91 @@
1
+ ---
2
+ datasets:
3
+ - ganchengguang/resume_seven_class
4
+ language:
5
+ - hu
6
+ base_model:
7
+ - facebook/fasttext-hu-vectors
8
+ pipeline_tag: text-classification
9
+ ---
10
+
11
+ # Model Card for Resume Section Classifier
12
+
13
+ This model classifies sections of Hungarian resumes into categories such as Skills, Education, Experience, and others. It uses the `facebook/fasttext-hu-vectors` embeddings as its base and has been fine-tuned on the `ganchengguang/resume_seven_class` dataset. The dataset is in English, so I translated it into Hungarian. This is not the ideal approach, but it still works.
14
+
15
+ ## Model Details
16
+
17
+ ### Model Description
18
+
19
+ This model uses the `facebook/fasttext-hu-vectors` pre-trained embeddings to classify Hungarian resume sections into predefined categories. It has been fine-tuned on the `ganchengguang/resume_seven_class` dataset, which defines seven categories, including Experience, Education, Knowledge, and Project.
20
+
21
+ - **Model type:** Text Classification
22
+ - **Language(s):** Hungarian
23
+ - **Finetuned from model:** facebook/fasttext-hu-vectors
24
+
25
+ ## Uses
26
+
27
+ ### Direct Use
28
+
29
+ This model can be used directly to classify sections of Hungarian resumes into categories such as Skills, Education, Experience, and others. It is suitable for applications in recruitment and resume analysis.
30
+
31
+ ### Downstream Use
32
+
33
+ The model can be integrated into larger systems for automated resume screening, assisting HR professionals in efficiently processing and categorizing resume information.
34
+
35
+ ### Out-of-Scope Use
36
+
37
+ This model is not intended for use with resumes in languages other than Hungarian. It may not perform accurately on resumes with non-standard formats or those containing significant amounts of non-Hungarian text.
38
+
39
+ ## Bias, Risks, and Limitations
40
+
41
+ The model has been trained on a specific dataset and may not generalize well to resumes with formats or content significantly different from those in the training data. Users should be aware of potential biases in the training data and the model's limitations in handling diverse resume formats.
42
+
43
+ ### Recommendations
44
+
45
+ Users should validate the model's predictions and consider incorporating human oversight, especially when dealing with resumes that deviate from the standard formats present in the training data.
46
+
47
+ ## How to Get Started with the Model
48
+
49
+ - https://github.com/ssobii2/Wozify-CV-Parser
50
+ - See the fastText website (https://fasttext.cc) for general usage.
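
As a rough starting point, the sketch below shows how a fastText classifier could be loaded and applied to resume sections with the `fasttext` Python package. The model file name in the usage comment is hypothetical, the `__label__` prefix is fastText's default and is assumed here, and the helper names `load_classifier` and `classify_sections` are illustrative only; none of this is taken from the linked repository.

```python
def load_classifier(model_path):
    """Load a fastText model from disk (requires `pip install fasttext`)."""
    import fasttext  # imported lazily so classify_sections works without it

    return fasttext.load_model(model_path)


def classify_sections(model, sections, label_prefix="__label__"):
    """Return one (label, probability) pair per resume section.

    `label_prefix` is assumed to be fastText's default; adjust it if the
    model was trained with a different prefix.
    """
    # fastText's predict() rejects embedded newlines, so flatten each section.
    cleaned = [" ".join(text.split()) for text in sections]
    labels, probs = model.predict(cleaned, k=1)

    results = []
    for top_labels, top_probs in zip(labels, probs):
        label = top_labels[0]
        if label.startswith(label_prefix):
            label = label[len(label_prefix):]
        results.append((label, float(top_probs[0])))
    return results


# Usage (hypothetical file name, not confirmed by this model card):
# model = load_classifier("resume_classifier_hu.bin")
# classify_sections(model, ["Tanulmányok\nBudapesti Műszaki Egyetem"])
```

Flattening whitespace before prediction matters because a resume section usually spans several lines, while fastText treats each input as a single line.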
51
+
52
+ ## Training Details
53
+
54
+ ### Training Data
55
+
56
+ The model was fine-tuned on the `ganchengguang/resume_seven_class` dataset, which contains English resume sections labeled with seven categories, including Experience, Education, Knowledge, and Project. I translated the dataset into Hungarian before training.
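
If the classifier was trained with fastText's supervised mode (an assumption; the card does not say so explicitly), the translated rows would need to be written one example per line with a `__label__` prefix. The sketch below shows that conversion; the function names and the sample row are illustrative, not taken from the dataset.

```python
def to_fasttext_line(label, text):
    """Format one example as a fastText supervised training line."""
    # Labels must not contain spaces, and each example must fit on one line.
    normalized_label = "_".join(label.split())
    flat_text = " ".join(text.split())
    return f"__label__{normalized_label} {flat_text}"


def write_training_file(rows, path):
    """Write (label, text) pairs to `path`, one example per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for label, text in rows:
            handle.write(to_fasttext_line(label, text) + "\n")


# Usage with an invented row:
# write_training_file([("Education", "BSc, mérnökinformatikus")], "train.txt")
```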
57
+
58
+ ### Training Procedure
59
+
60
+ The model was fine-tuned using standard text classification procedures, adjusting hyperparameters to optimize performance on the resume classification task.
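
The exact hyperparameters are not listed here, so the following is only a sketch of what a fastText supervised training call initialized from pre-trained vectors might look like. Every value is a placeholder rather than the setting actually used, the file paths are hypothetical, and `dim` is assumed to be 300 to match the published Hungarian vectors.

```python
def build_training_args(train_path, vectors_path):
    """Collect keyword arguments for fasttext.train_supervised.

    All hyperparameter values below are placeholders, not the author's.
    """
    return {
        "input": train_path,                # file of "__label__X text" lines
        "pretrainedVectors": vectors_path,  # text-format .vec file
        "dim": 300,                         # must equal the vector dimension
        "epoch": 25,
        "lr": 0.5,
        "wordNgrams": 2,
    }


def train(train_path, vectors_path):
    """Train a supervised fastText classifier (requires `pip install fasttext`)."""
    import fasttext  # imported lazily so build_training_args works without it

    return fasttext.train_supervised(**build_training_args(train_path, vectors_path))


# Usage (hypothetical paths):
# model = train("train.txt", "cc.hu.300.vec")
# model.save_model("resume_classifier_hu.bin")
```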
61
+
62
+ ## Evaluation
63
+
64
+ ### Testing Data, Factors & Metrics
65
+
66
+ The model's performance was evaluated on a held-out test set from the `ganchengguang/resume_seven_class` dataset, using accuracy and F1-score as evaluation metrics.
67
+
68
+ #### Metrics
69
+
70
+ - **Accuracy:** Measures the proportion of correctly classified sections.
71
+ - **F1-score:** Harmonic mean of precision and recall, providing a balance between the two.
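
For reference, the two metrics above can be computed as in the sketch below. It assumes macro-averaging for the multi-class F1-score, which the card does not specify, and the labels in the usage example are invented.

```python
def accuracy(y_true, y_pred):
    """Fraction of sections whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_per_class(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN), computed separately for each label."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (macro-averaging assumed)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Invented example labels:
y_true = ["Education", "Experience", "Education", "Skills"]
y_pred = ["Education", "Education", "Education", "Skills"]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

Macro-averaging weights every category equally, which makes weak performance on a rare category visible even when overall accuracy is high.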
72
+
73
+ ## Environmental Impact
74
+
75
+ The training of this model was conducted on standard hardware, resulting in minimal carbon emissions. Users should consider the environmental impact of training large models and explore options for model distillation or quantization to reduce energy consumption.
76
+
77
+ ## Technical Specifications
78
+
79
+ ### Model Architecture and Objective
80
+
81
+ The model builds on the `facebook/fasttext-hu-vectors` pre-trained word vectors and is fine-tuned to classify Hungarian resume sections into predefined categories.
82
+
83
+ ### Compute Infrastructure
84
+
85
+ The model was trained on my personal gaming laptop.
86
+
87
+ #### Hardware
88
+
89
+ - **GPU:** RTX 4070 Laptop GPU (8 GB VRAM)
90
+ - **CPU:** Intel Core i7-13620H
91
+ - **RAM:** 16GB