This is the first LASER intervention on Pallas-0.5. More will follow, each one using the previous result as its base.

Configs used:

  • lnum: 59
  • lnames: attn (meaning: ["self_attn.k_proj.weight", "self_attn.q_proj.weight", "self_attn.v_proj.weight", "self_attn.o_proj.weight"])
  • rate: 6.0
  • dataset: bigbench (subset: causal_judgement)
  • intervention type: rank-reduction
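
For context, the rank-reduction intervention amounts to replacing each selected weight matrix with a truncated-SVD approximation of itself. Below is a minimal sketch of that step; the `rank_reduce` helper and the mapping from the `rate` value to a retained-rank fraction are illustrative assumptions (the exact conversion is defined in the LASER reference code), not the script used here.

```python
import torch

def rank_reduce(weight: torch.Tensor, keep_fraction: float) -> torch.Tensor:
    """Return a low-rank approximation of `weight` via truncated SVD,
    keeping only the top `keep_fraction` of its singular values."""
    U, S, Vh = torch.linalg.svd(weight.float(), full_matrices=False)
    k = max(1, int(S.numel() * keep_fraction))
    approx = U[:, :k] @ torch.diag(S[:k]) @ Vh[:k, :]
    return approx.to(weight.dtype)

# Toy demonstration on a random matrix; in the actual intervention this is
# applied to self_attn.{q,k,v,o}_proj.weight of the configured layer (lnum=59).
W = torch.randn(512, 512)
W_low = rank_reduce(W, keep_fraction=0.4)  # keep_fraction is a placeholder, not the rate=6.0 mapping
print(torch.linalg.matrix_rank(W_low))     # ~204 instead of 512
```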
| Name | Validation acc (higher is better) | Validation logloss (lower is better) | Test acc (higher is better) | Test logloss (lower is better) |
|---|---|---|---|---|
| Pallas-0.5 | 55.263 | 1.650 | 60.526 | 1.463 |
| Pallas-0.5-LASER-0.1 | 55.263 | 1.639 | 61.184 | 1.451 |
| Pallas-0.5-LASER-0.2 | 55.263 | 1.646 | 61.184 | 1.458 |
| Pallas-0.5-LASER-0.3 | 55.263 | 1.575 | 61.842 | 1.382 |
| Pallas-0.5-LASER-0.4 | 55.263 | 1.525 | 61.842 | 1.326 |
| Pallas-0.5-LASER-0.5 | 55.263 | 1.484 | 61.842 | 1.297 |
| Pallas-0.5-LASER-0.6 | 55.263 | 1.455 | 61.184 | 1.283 |

To replicate this on a single A100, you can use my branch (the original LASER code throws an OOM error for 34B models).
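
Without the branch, a generic memory-conscious loading pattern (bfloat16 plus `device_map="auto"`) keeps a 34B model within a single 80GB A100. The sketch below combines that with the `rank_reduce` helper from above; the module path, `keep_fraction` value, and output directory are assumptions for illustration, not the author's exact script.

```python
import torch
from transformers import AutoModelForCausalLM

# bfloat16 weights + automatic device placement fit a 34B model on one 80GB A100.
model = AutoModelForCausalLM.from_pretrained(
    "Mihaiii/Pallas-0.5",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    low_cpu_mem_usage=True,
)

lnum = 59
attn = model.model.layers[lnum].self_attn  # Llama/Yi-style module layout (assumed)
for name in ("q_proj", "k_proj", "v_proj", "o_proj"):
    proj = getattr(attn, name)
    with torch.no_grad():
        # rank_reduce as defined in the earlier sketch; keep_fraction=0.4 is a
        # placeholder standing in for the rate=6.0 mapping used by the LASER code.
        proj.weight.copy_(rank_reduce(proj.weight, keep_fraction=0.4))

model.save_pretrained("Pallas-0.5-LASER-sketch")  # hypothetical output path
```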
