yashcode00 committed on
Commit b9f4988 · 1 Parent(s): b83d721

yashcode00/wav2vec2-large-xlsr-indian-language-classification-featureExtractor
README.md CHANGED
@@ -3,8 +3,6 @@ license: apache-2.0
  base_model: yashcode00/wav2vec2-large-xlsr-indian-language-classification-featureExtractor
  tags:
  - generated_from_trainer
- - This model is fully trained on the sample data of the 11 languages provided
- - not yet trained on the TTS data or the second dataset
  metrics:
  - accuracy
  model-index:
@@ -19,17 +17,16 @@ should probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [yashcode00/wav2vec2-large-xlsr-indian-language-classification-featureExtractor](https://huggingface.co/yashcode00/wav2vec2-large-xlsr-indian-language-classification-featureExtractor) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.1719
- - Accuracy: 0.9554
+ - Loss: 0.4481
+ - Accuracy: 0.8710

  ## Model description

- This is the final fine-tuned version of wav2vec2, trained on sample audio data from 11 Indian languages code-mixed with English.
- So far it has been fine-tuned only on that single sample dataset.
+ More information needed

  ## Intended uses & limitations

- The model does not yet reach good accuracy on other datasets such as the TTS data; it needs to be trained on more data.
+ More information needed

  ## Training and evaluation data

@@ -41,28 +38,22 @@ More information needed

  The following hyperparameters were used during training:
  - learning_rate: 5e-05
- - train_batch_size: 16
- - eval_batch_size: 16
+ - train_batch_size: 8
+ - eval_batch_size: 8
  - seed: 42
- - gradient_accumulation_steps: 16
- - total_train_batch_size: 256
+ - gradient_accumulation_steps: 8
+ - total_train_batch_size: 64
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
- - num_epochs: 90
+ - num_epochs: 60

  ### Training results

- | Training Loss | Epoch | Step | Validation Loss | Accuracy |
- |:-------------:|:-----:|:----:|:---------------:|:--------:|
- | 0.0103        | 21.11 | 1000 | 0.1802          | 0.9501   |
- | 0.009         | 42.22 | 2000 | 0.1717          | 0.9497   |
- | 0.0086        | 63.32 | 3000 | 0.1675          | 0.9546   |
- | 0.0073        | 84.43 | 4000 | 0.1686          | 0.9538   |


  ### Framework versions

- - Transformers 4.33.0
+ - Transformers 4.32.1
  - Pytorch 2.0.0
  - Datasets 2.11.0
  - Tokenizers 0.13.3
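
For reference, the updated hyperparameters above map directly onto the standard transformers `TrainingArguments`. A minimal sketch, assuming single-GPU training (the output path is illustrative and not part of this commit; the listed Adam betas/epsilon are the Trainer defaults, so they need not be set explicitly):

```python
from transformers import TrainingArguments

# Hyperparameters as listed in the updated README.
args = TrainingArguments(
    output_dir="wav2vec2-indian-lid",   # hypothetical output path
    learning_rate=5e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=8,      # effective train batch: 8 * 8 = 64
    num_train_epochs=60,
    lr_scheduler_type="linear",
    seed=42,
)
# adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08 are the defaults,
# matching "Adam with betas=(0.9,0.999) and epsilon=1e-08" above.
```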
all_results.json CHANGED
@@ -1,15 +1,15 @@
  {
-     "epoch": 89.29,
-     "eval_accuracy": 0.9554455280303955,
-     "eval_loss": 0.1719195693731308,
-     "eval_runtime": 50.8715,
-     "eval_samples": 2424,
-     "eval_samples_per_second": 47.65,
-     "eval_steps_per_second": 2.988,
-     "total_flos": 3.2880550437308154e+19,
-     "train_loss": 0.00991302564845863,
-     "train_runtime": 33902.9866,
-     "train_samples": 12120,
-     "train_samples_per_second": 32.174,
-     "train_steps_per_second": 0.125
+     "epoch": 56.95,
+     "eval_accuracy": 0.8709677457809448,
+     "eval_loss": 0.4480999708175659,
+     "eval_runtime": 1.4774,
+     "eval_samples": 93,
+     "eval_samples_per_second": 62.95,
+     "eval_steps_per_second": 8.123,
+     "total_flos": 8.064772262536032e+17,
+     "train_loss": 0.15708597316628412,
+     "train_runtime": 790.3622,
+     "train_samples": 466,
+     "train_samples_per_second": 35.376,
+     "train_steps_per_second": 0.531
  }
config.json CHANGED
@@ -139,7 +139,7 @@
          1
      ],
      "torch_dtype": "float32",
-     "transformers_version": "4.33.0",
+     "transformers_version": "4.32.1",
      "use_weighted_layer_sum": false,
      "vocab_size": 32,
      "xvector_output_dim": 512
eval_results.json CHANGED
@@ -1,9 +1,9 @@
  {
-     "epoch": 89.29,
-     "eval_accuracy": 0.9554455280303955,
-     "eval_loss": 0.1719195693731308,
-     "eval_runtime": 50.8715,
-     "eval_samples": 2424,
-     "eval_samples_per_second": 47.65,
-     "eval_steps_per_second": 2.988
+     "epoch": 56.95,
+     "eval_accuracy": 0.8709677457809448,
+     "eval_loss": 0.4480999708175659,
+     "eval_runtime": 1.4774,
+     "eval_samples": 93,
+     "eval_samples_per_second": 62.95,
+     "eval_steps_per_second": 8.123
  }
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:7ceaacefa2250a52d0d79eeb85b6a0da21680d9b9b79e1d64c35a9ab6bd911c1
+ oid sha256:c5bd079f25010ad6fed494f319bd825fbda1ad3ba0247b15ee3faa0fc2a04cef
  size 1266146037
train_results.json CHANGED
@@ -1,9 +1,9 @@
  {
-     "epoch": 89.29,
-     "total_flos": 3.2880550437308154e+19,
-     "train_loss": 0.00991302564845863,
-     "train_runtime": 33902.9866,
-     "train_samples": 12120,
-     "train_samples_per_second": 32.174,
-     "train_steps_per_second": 0.125
+     "epoch": 56.95,
+     "total_flos": 8.064772262536032e+17,
+     "train_loss": 0.15708597316628412,
+     "train_runtime": 790.3622,
+     "train_samples": 466,
+     "train_samples_per_second": 35.376,
+     "train_steps_per_second": 0.531
  }
trainer_state.json CHANGED
@@ -1,325 +1,61 @@
  {
    "best_metric": null,
    "best_model_checkpoint": null,
-   "epoch": 89.28759894459102,
+   "epoch": 56.94915254237288,
    "eval_steps": 1000,
-   "global_step": 4230,
+   "global_step": 420,
    "is_hyper_param_search": false,
    "is_local_process_zero": true,
    "is_world_process_zero": true,
    "log_history": [
      {
-       "epoch": 2.11,
-       "learning_rate": 4.8817966903073283e-05,
-       "loss": 0.0118,
+       "epoch": 13.56,
+       "learning_rate": 3.821428571428572e-05,
+       "loss": 0.554,
        "step": 100
      },
      {
-       "epoch": 4.22,
-       "learning_rate": 4.763593380614658e-05,
-       "loss": 0.0156,
+       "epoch": 27.12,
+       "learning_rate": 2.6309523809523813e-05,
+       "loss": 0.0396,
        "step": 200
      },
      {
-       "epoch": 6.33,
-       "learning_rate": 4.645390070921986e-05,
-       "loss": 0.0122,
+       "epoch": 40.68,
+       "learning_rate": 1.4404761904761905e-05,
+       "loss": 0.0312,
        "step": 300
      },
      {
-       "epoch": 8.44,
-       "learning_rate": 4.527186761229315e-05,
-       "loss": 0.0148,
+       "epoch": 54.24,
+       "learning_rate": 2.5e-06,
+       "loss": 0.0308,
        "step": 400
      },
      {
-       "epoch": 10.55,
-       "learning_rate": 4.4089834515366435e-05,
-       "loss": 0.0114,
-       "step": 500
+       "epoch": 56.95,
+       "step": 420,
+       "total_flos": 8.064772262536032e+17,
+       "train_loss": 0.15708597316628412,
+       "train_runtime": 790.3622,
+       "train_samples_per_second": 35.376,
+       "train_steps_per_second": 0.531
      },
      {
-       "epoch": 12.66,
-       "learning_rate": 4.2907801418439716e-05,
-       "loss": 0.0143,
-       "step": 600
-     },
-     {
-       "epoch": 14.78,
-       "learning_rate": 4.1725768321513004e-05,
-       "loss": 0.0149,
-       "step": 700
-     },
-     {
-       "epoch": 16.89,
-       "learning_rate": 4.0543735224586285e-05,
-       "loss": 0.0139,
-       "step": 800
-     },
-     {
-       "epoch": 19.0,
-       "learning_rate": 3.936170212765958e-05,
-       "loss": 0.0125,
-       "step": 900
-     },
-     {
-       "epoch": 21.11,
-       "learning_rate": 3.817966903073286e-05,
-       "loss": 0.0103,
-       "step": 1000
-     },
-     {
-       "epoch": 21.11,
-       "eval_accuracy": 0.9500824809074402,
-       "eval_loss": 0.18024244904518127,
-       "eval_runtime": 47.0038,
-       "eval_samples_per_second": 51.57,
-       "eval_steps_per_second": 3.234,
-       "step": 1000
-     },
-     {
-       "epoch": 23.22,
-       "learning_rate": 3.699763593380615e-05,
-       "loss": 0.0111,
-       "step": 1100
-     },
-     {
-       "epoch": 25.33,
-       "learning_rate": 3.5815602836879437e-05,
-       "loss": 0.0093,
-       "step": 1200
-     },
-     {
-       "epoch": 27.44,
-       "learning_rate": 3.463356973995272e-05,
-       "loss": 0.0109,
-       "step": 1300
-     },
-     {
-       "epoch": 29.55,
-       "learning_rate": 3.3451536643026005e-05,
-       "loss": 0.0102,
-       "step": 1400
-     },
-     {
-       "epoch": 31.66,
-       "learning_rate": 3.226950354609929e-05,
-       "loss": 0.012,
-       "step": 1500
-     },
-     {
-       "epoch": 33.77,
-       "learning_rate": 3.108747044917258e-05,
-       "loss": 0.0116,
-       "step": 1600
-     },
-     {
-       "epoch": 35.88,
-       "learning_rate": 2.9905437352245862e-05,
-       "loss": 0.0145,
-       "step": 1700
-     },
-     {
-       "epoch": 37.99,
-       "learning_rate": 2.8723404255319154e-05,
-       "loss": 0.011,
-       "step": 1800
-     },
-     {
-       "epoch": 40.11,
-       "learning_rate": 2.7541371158392438e-05,
-       "loss": 0.0108,
-       "step": 1900
-     },
-     {
-       "epoch": 42.22,
-       "learning_rate": 2.6359338061465723e-05,
-       "loss": 0.009,
-       "step": 2000
-     },
-     {
-       "epoch": 42.22,
-       "eval_accuracy": 0.9496699571609497,
-       "eval_loss": 0.1716560274362564,
-       "eval_runtime": 51.12,
-       "eval_samples_per_second": 47.418,
-       "eval_steps_per_second": 2.973,
-       "step": 2000
-     },
-     {
-       "epoch": 44.33,
-       "learning_rate": 2.5177304964539007e-05,
-       "loss": 0.0101,
-       "step": 2100
-     },
-     {
-       "epoch": 46.44,
-       "learning_rate": 2.3995271867612295e-05,
-       "loss": 0.0087,
-       "step": 2200
-     },
-     {
-       "epoch": 48.55,
-       "learning_rate": 2.281323877068558e-05,
-       "loss": 0.0114,
-       "step": 2300
-     },
-     {
-       "epoch": 50.66,
-       "learning_rate": 2.1631205673758867e-05,
-       "loss": 0.0076,
-       "step": 2400
-     },
-     {
-       "epoch": 52.77,
-       "learning_rate": 2.0449172576832152e-05,
-       "loss": 0.0088,
-       "step": 2500
-     },
-     {
-       "epoch": 54.88,
-       "learning_rate": 1.926713947990544e-05,
-       "loss": 0.0084,
-       "step": 2600
-     },
-     {
-       "epoch": 56.99,
-       "learning_rate": 1.8085106382978724e-05,
-       "loss": 0.0095,
-       "step": 2700
-     },
-     {
-       "epoch": 59.1,
-       "learning_rate": 1.690307328605201e-05,
-       "loss": 0.0075,
-       "step": 2800
-     },
-     {
-       "epoch": 61.21,
-       "learning_rate": 1.5721040189125296e-05,
-       "loss": 0.0097,
-       "step": 2900
-     },
-     {
-       "epoch": 63.32,
-       "learning_rate": 1.4539007092198581e-05,
-       "loss": 0.0086,
-       "step": 3000
-     },
-     {
-       "epoch": 63.32,
-       "eval_accuracy": 0.9546204805374146,
-       "eval_loss": 0.16754871606826782,
-       "eval_runtime": 52.1005,
-       "eval_samples_per_second": 46.525,
-       "eval_steps_per_second": 2.917,
-       "step": 3000
-     },
-     {
-       "epoch": 65.44,
-       "learning_rate": 1.3356973995271869e-05,
-       "loss": 0.0079,
-       "step": 3100
-     },
-     {
-       "epoch": 67.55,
-       "learning_rate": 1.2174940898345153e-05,
-       "loss": 0.0076,
-       "step": 3200
-     },
-     {
-       "epoch": 69.66,
-       "learning_rate": 1.0992907801418441e-05,
-       "loss": 0.0072,
-       "step": 3300
-     },
-     {
-       "epoch": 71.77,
-       "learning_rate": 9.810874704491727e-06,
-       "loss": 0.0074,
-       "step": 3400
-     },
-     {
-       "epoch": 73.88,
-       "learning_rate": 8.628841607565012e-06,
-       "loss": 0.0076,
-       "step": 3500
-     },
-     {
-       "epoch": 75.99,
-       "learning_rate": 7.446808510638298e-06,
-       "loss": 0.0069,
-       "step": 3600
-     },
-     {
-       "epoch": 78.1,
-       "learning_rate": 6.264775413711583e-06,
-       "loss": 0.0068,
-       "step": 3700
-     },
-     {
-       "epoch": 80.21,
-       "learning_rate": 5.08274231678487e-06,
-       "loss": 0.007,
-       "step": 3800
-     },
-     {
-       "epoch": 82.32,
-       "learning_rate": 3.9007092198581565e-06,
-       "loss": 0.0072,
-       "step": 3900
-     },
-     {
-       "epoch": 84.43,
-       "learning_rate": 2.7186761229314422e-06,
-       "loss": 0.0073,
-       "step": 4000
-     },
-     {
-       "epoch": 84.43,
-       "eval_accuracy": 0.9537953734397888,
-       "eval_loss": 0.16863200068473816,
-       "eval_runtime": 51.419,
-       "eval_samples_per_second": 47.142,
-       "eval_steps_per_second": 2.956,
-       "step": 4000
-     },
-     {
-       "epoch": 86.54,
-       "learning_rate": 1.5484633569739953e-06,
-       "loss": 0.0065,
-       "step": 4100
-     },
-     {
-       "epoch": 88.65,
-       "learning_rate": 3.6643026004728135e-07,
-       "loss": 0.0058,
-       "step": 4200
-     },
-     {
-       "epoch": 89.29,
-       "step": 4230,
-       "total_flos": 3.2880550437308154e+19,
-       "train_loss": 0.00991302564845863,
-       "train_runtime": 33902.9866,
-       "train_samples_per_second": 32.174,
-       "train_steps_per_second": 0.125
-     },
-     {
-       "epoch": 89.29,
-       "eval_accuracy": 0.9554455280303955,
-       "eval_loss": 0.1719195693731308,
-       "eval_runtime": 50.8715,
-       "eval_samples_per_second": 47.65,
-       "eval_steps_per_second": 2.988,
-       "step": 4230
+       "epoch": 56.95,
+       "eval_accuracy": 0.8709677457809448,
+       "eval_loss": 0.4480999708175659,
+       "eval_runtime": 1.4774,
+       "eval_samples_per_second": 62.95,
+       "eval_steps_per_second": 8.123,
+       "step": 420
      }
    ],
    "logging_steps": 100,
-   "max_steps": 4230,
-   "num_train_epochs": 90,
+   "max_steps": 420,
+   "num_train_epochs": 60,
    "save_steps": 2000,
-   "total_flos": 3.2880550437308154e+19,
+   "total_flos": 8.064772262536032e+17,
    "trial_name": null,
    "trial_params": null
  }
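
The new trainer_state numbers are internally consistent with the README hyperparameters. A quick back-of-the-envelope check, assuming single-process training and the usual Trainer step accounting (update steps per epoch are the floor of batches per epoch over accumulation steps):

```python
import math

train_samples = 466        # from train_results.json
per_device_batch = 8       # from the README hyperparameters
grad_accum = 8
num_epochs = 60

batches_per_epoch = math.ceil(train_samples / per_device_batch)  # 59
updates_per_epoch = batches_per_epoch // grad_accum              # 7
print(updates_per_epoch * num_epochs)                            # 420 = max_steps
print(420 * grad_accum / batches_per_epoch)                      # 56.949... = final epoch
```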
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:afafbb8d751fc33fbf51d1298497b2fbbc858aa6e7af5c8ee9fc1310c74fcc53
+ oid sha256:a6fff3406fc6d17e7151844526156d27d071a854fa3b738d788067583d864923
  size 4155